Abstract
Mitochondria are essential cellular organelles that play critical roles in cancer. Here, as part of the International Cancer Genome Consortium/The Cancer Genome Atlas Pan-Cancer Analysis of Whole Genomes Consortium, which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumor types, we performed a multidimensional, integrated characterization of mitochondrial genomes and related RNA sequencing data. Our analysis presents the most definitive mutational landscape of mitochondrial genomes and identifies several hypermutated cases. Truncating mutations are markedly enriched in kidney, colorectal and thyroid cancers, suggesting oncogenic effects with the activation of signaling pathways. We find frequent somatic nuclear transfers of mitochondrial DNA, some of which disrupt therapeutic target genes. Mitochondrial copy number varies greatly within and across cancers and correlates with clinical variables. Co-expression analysis highlights the function of mitochondrial genes in oxidative phosphorylation, DNA repair and the cell cycle, and shows their connections with clinically actionable genes. Our study lays a foundation for translating mitochondrial biology into clinical applications.
Subject terms: Cancer, Genomics
Analysis of mitochondrial genomes (mtDNA) by using whole-genome sequencing data from 2,658 cancer samples across 38 cancer types identifies hypermutated mtDNA cases, frequent somatic nuclear transfer of mtDNA and high variability of mtDNA copy number in many cancers.
Main
Mitochondria are crucial cellular organelles in eukaryotes, and there can be several hundred mitochondria in a single human cell1. Known as ‘the powerhouses of the cell’, mitochondria play essential roles in generating most of the cell’s energy through oxidative phosphorylation2. Despite its small size (16.6 kilobases (kb)), the circular mitochondrial genome encodes 13 proteins that form respiratory chain complexes with other proteins of nuclear origin3. The involvement of mitochondria in carcinogenesis has long been suspected4,5 because altered energy metabolism is a common feature of cancer6. Furthermore, mitochondria play important roles in other tasks, such as biosynthesis, signaling, cellular differentiation, apoptosis, maintaining control of the cell cycle and cell growth, all of which are intrinsically linked to tumorigenesis5,7.
In several recent studies, molecular characterization of mitochondria was performed in cancer by using next-generation sequencing data8–13, but these studies usually describe one specific dimension of the mitochondrial genome (for example, somatic mutations) based on relatively small sample cohorts. Furthermore, due to the whole-exome sequencing data employed, the relatively low depth of mitochondrial genomes limits the accuracy and scope of these studies. Thus, a comprehensive, multidimensional molecular portrait of mitochondria across a broad range of cancer types has not been achieved. Moreover, previous studies have focused on the patterns of mitochondrial alterations alone, without fully exploring the interplay between the mitochondrial genome and the nuclear genome, as well as the biomedical significance of mitochondrial alterations.
The Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium aggregated whole-genome sequencing (WGS) data from 2,658 cancers across 38 tumor types generated by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) projects. These sequencing data were re-analyzed with standardized, high-accuracy pipelines to align to the human genome (reference build hs37d5) and identify germline variants and somatically acquired mutations, as described14. Meanwhile, TCGA has generated RNA sequencing (RNA-seq) data from a large number of patient samples, which allow for assessment of the transcriptional activities of mitochondrial genes15. These large-scale datasets create a tremendous resource for characterizing cancer mitochondrial genomes at an unprecedented level (Fig. 1a). We first characterized mitochondrial somatic mutations, nuclear transfers and copy numbers, then investigated their interactions with nuclear somatic alterations16,17, and finally examined the expression profiles of mitochondrial genes and their connections with clinically relevant nuclear genes.
Fig. 1. Mutational landscape and process of cancer mitochondrial genomes.
a, Overview of our multidimensional and integrated mitochondrial genome analyses. b, Landscape of mtDNA somatic substitutions. The numbers represent the mitochondrial genome coordinates. The outer (blue) circle shows the density of all variants with VAF > 1%. The inner (red) circle shows the density of variants with VAF > 3%. c, Highly consistent mtDNA mutational spectrum across 21 cancer tissue groups. Average numbers of somatic substitutions per sample are also shown (right). d, Correlation between the highest VAF of mtDNA mutations in a cancer tissue and patient age at the time of diagnosis. The correlation was based on the 2,414 patients with both age and somatic single-nucleotide variant information available. The shaded region represents the 95% confidence interval for the predictions from a linear model with the highest VAF as the response variable and patient age as the explanatory variable. e, Correlations between the numbers of nuclear and mtDNA somatic mutations. The associations were tested among samples with both nuclear and mtDNA somatic mutations available (with sample sizes labeled on plot) using Spearman’s rank correlation. Magenta bars indicate significant positive correlations (P < 0.05). Asterisks indicate that both nuclear and mitochondrial somatic mutations were correlated with patient age in that cancer type (P < 0.05). f, Proportions of tumor samples harboring different categories of somatic alterations: nuclear driver alterations only (red); both nuclear driver alterations and mtDNA mutations (VAF > 10%, green); and mtDNA mutations without known nuclear drivers (blue). 
AML, acute myeloid leukemia; BNHL, B cell non-Hodgkin lymphoma; CA, carcinoma; ChRCC, chromophobe renal cell carcinoma; CLL, chronic lymphocytic leukemia; CNS, central nervous system; eso, esophageal; GBM, glioblastoma; HCC, hepatocellular carcinoma; leiomyo, leiomyosarcoma; liposarc, liposarcoma; medullo, medulloblastoma; MPN, myeloproliferative neoplasm; oligo, oligodendroglioma; osteosarc, osteosarcoma; PiloAstro, pilocytic astrocytoma; RCC, renal cell carcinoma; SCC, squamous cell carcinoma; TCC, transitional cell carcinoma; adenoCA, adenocarcinoma; epith, epithelioid.
Results
Mutational landscape of cancer mitochondrial genomes
To characterize somatic mutations in mitochondrial genomes across cancer types, we extracted the mitochondrial DNA (mtDNA) mapped reads of 2,658 cancer and matched control sample pairs from the PCAWG Consortium. The samples we surveyed covered 21 cancer tissues and 38 specific cancer types (Supplementary Table 1). On average, the sequencing depth for the mitochondrial genome was 9,959×, which was much higher than that obtained from whole-exome sequencing data, allowing for confident detection of somatic mutations at a very low heteroplasmic level (variant allele fraction (VAF) > 1%; Supplementary Fig. 1). By applying a well-designed computational pipeline that carefully considered various potentially confounding factors (for example, sample cross-contamination, mismapping of reads from nuclear mtDNA-like sequence18, and artifactual mutations caused by oxidative DNA damage during library preparation19), we identified a total of 7,611 somatic substitutions and 930 small indels in 2,536 high-quality cancer samples (122 samples were excluded in the mutation analysis for the issues mentioned above; Supplementary Fig. 2 and Methods). The high reliability of the mutations was confirmed by long-range PCR-based validation (Supplementary Table 2) and by inspection of the mutational spectrum of the very low-VAF mutation candidates (Supplementary Fig. 3).
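To illustrate why sequencing depth matters for detecting low-level heteroplasmy, the following minimal sketch computes a VAF from read counts and the number of supporting reads that a 1% cutoff implies at the reported mean depth. The function and constant names are hypothetical, and the calculation ignores base quality, contamination checks and the other safeguards of the actual pipeline.

```python
def variant_allele_fraction(alt_reads: int, depth: int) -> float:
    """Return the fraction of reads covering a site that carry the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


MEAN_MT_DEPTH = 9959   # mean mtDNA sequencing depth reported in the text
MIN_VAF = 0.01         # detection threshold quoted in the text (VAF > 1%)

# Smallest number of variant-supporting reads that exceeds the threshold
# at the mean depth (illustrative only).
min_alt_reads = int(MEAN_MT_DEPTH * MIN_VAF) + 1

print(min_alt_reads, round(variant_allele_fraction(min_alt_reads, MEAN_MT_DEPTH), 4))
```

At a hypothetical depth of 100×, the same threshold would rest on only two variant reads, which is far harder to separate from sequencing error.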
Of the 7,611 substitutions, >85% were clearly heteroplasmic, showing VAFs lower than 0.6 (average: 0.2; median: 0.045). Overall, mtDNA mutations located in the transcribed regions were also found in RNA-seq with similar VAFs, except for a fraction of transfer RNA (tRNA) mutations showing much higher VAFs in transcripts due to the accumulation of unprocessed tRNA precursors during the processing of polycistronic mitochondrial transcripts10 (Supplementary Fig. 4). Across all of the cancer samples, we observed several mutational hotspots in the regulatory D-loop region and the ND4 gene (Fig. 1b). Of the 13 protein-coding genes, ND5 was the most frequently mutated in most cancer types, while ND4 was most frequently mutated in prostate and lung cancers, and COX1 was most frequently mutated in breast, cervical and bladder cancers (Supplementary Fig. 5). We identified that cancer type and gene identity were associated with the mutation status of the 13 coding genes (log-linear model, Pcancer type < 2.2 × 10−16; Pgene < 2.2 × 10−16), but the effect of their interaction was not significant (Pcancer type × gene = 0.12).
In contrast with somatic mutations in nuclear genomes (where cancer type-specific mutational signatures are observed)20, mtDNA mutational signatures were very similar across tumor types, with C:G>T:A (58.3%) and T:A>C:G (34.2%) substitutions being the most and second most frequent mutation types, respectively (Fig. 1c and Supplementary Fig. 6). Indeed, the impact of well-known carcinogens (for example, tobacco smoking (C:G>A:T dominant; signature 4), ultraviolet light (C:G>T:A dominant at dipyrimidine contexts; signature 7) and reactive oxygen species (G:C>T:A dominant)) was minimal (Supplementary Fig. 7), even in lung and skin cancers (the latest mutational signatures of nuclear genomes are available from the Catalogue of Somatic Mutations in Cancer database: https://cancer.sanger.ac.uk/cosmic/signatures). Instead, the vast majority of mtDNA mutations manifested extreme replicational strand bias9,21,22; that is, predominant G>A and T>C substitutions and deficient complementary C>T and A>G substitutions on the light (L) strand of the mtDNA genome sequence (+strand of the revised Cambridge Reference Sequence), despite the relative depletion of guanines and thymines on the L strand (Supplementary Fig. 6). These mutational signatures suggest that mitochondria-specific, replication-coupled mutational processes (such as mtDNA polymerase gamma error9,21,23 or other replication-coupled DNA damage mechanisms) are dominantly responsible for somatic mtDNA mutations in cancer.
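The relationship between the strand-agnostic classes (for example, C:G>T:A) and the L-strand view used to describe strand bias can be sketched as follows. This is an illustration only: the helper names and the toy calls are hypothetical and are not study data.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pair_class(ref: str, alt: str) -> str:
    """Collapse a substitution into its strand-agnostic class, e.g. G>A -> 'C:G>T:A'."""
    if ref in ("A", "G"):  # purine reference: describe the change on the other strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"


def strand_bias(l_strand_subs):
    """Count each L-strand substitution against its complementary substitution."""
    counts = Counter(f"{ref}>{alt}" for ref, alt in l_strand_subs)
    return {
        "C:G>T:A": (counts["G>A"], counts["C>T"]),
        "T:A>C:G": (counts["T>C"], counts["A>G"]),
    }


# Hypothetical L-strand calls, invented for illustration (not study data).
toy_calls = [("G", "A")] * 6 + [("T", "C")] * 4 + [("C", "T")] + [("A", "G")]
class_counts = Counter(pair_class(r, a) for r, a in toy_calls)
bias = strand_bias(toy_calls)
print(class_counts)
print(bias)
```

In the toy data, both G>A and C>T fall into the same C:G>T:A class, but the L-strand counts are unequal; this asymmetry is what the text describes as strand bias.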
In agreement with their endogenous origin, mtDNA mutations showed clock-like properties similar to those of nuclear genome mutations24. The number of mtDNA mutations in our study was largely proportional to the age of the patient at the time of tissue sampling (Supplementary Fig. 8). In addition, the maximum VAF of somatic mutations in a cancer tissue showed positive correlation with age, on average (Fig. 1d; P < 2.2 × 10−16). Collectively, these results suggest that the vast majority of mtDNA somatic mutations were: (1) acquired at an earlier age when the cell lineage was phenotypically normal; and (2) overall shifted towards homoplasmy throughout life in the cellular lineage of the neoplastic cells. The spread to homoplasmy can, in theory, be caused by either physiological advantage (selection) or a series of asymmetric segregations during cell divisions (drift)25, or both.
To further assess the potential impact of mtDNA mutations, we performed integrative analysis by examining alterations from mitochondrial and nuclear genomes simultaneously17. We observed significantly positive correlations between the mutation burdens of mitochondrial and nuclear genomes in several cancer types, with the highest correlations observed in kidney chromophobe and thyroid cancers (magenta bars in Fig. 1e). Some of these correlations may be explained by the age effect, as the mutation numbers in both mitochondrial and nuclear genomes were significantly correlated with patient age in the corresponding cancer types (bars marked with an asterisk in Fig. 1e). In addition, we examined the mtDNA mutation frequency in the context of nuclear drivers. Although nuclear driver alterations exist in the majority of patients in most cancer types, a notable proportion of patients (22.2% with kidney chromophobe cancer and 18.8% with thyroid cancer) bear non-silent mtDNA mutations but no known nuclear drivers, suggesting a potential functional contribution of mtDNA mutations in the absence of nuclear drivers in these cancer types (Fig. 1f).
Hypermutation process in mitochondrial genomes
Hypermutation processes have been well established for a small proportion of cancer nuclear genomes (for example, microsatellite instability)26,27, but have not been reported for mitochondrial genomes. Of the 2,536 cancer samples surveyed, seven cases showed extremely large numbers of mtDNA somatic substitutions (>13 mutations), which were larger than expected from the background distribution (Fig. 2a; around three somatic substitutions per sample on average, with a standard deviation of 2.6). The mutational spectra in these hypermutated samples were sometimes clearly distinguished from the background L-strand G>A and T>C substitution dominant signature (Fig. 2b), suggesting that the massive numbers of mutations are not the consequence of the gradual accumulation of ordinary mtDNA substitutions.
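One way to see how unusual these seven cases are is to ask how many samples would exceed the threshold if substitutions accumulated independently at the cohort-average rate. The sketch below assumes a Poisson null model, which is an assumption made here for illustration and not necessarily the expectation used for Fig. 2a; the reported standard deviation (2.6) exceeds the Poisson value of about 1.7, so the true background is more dispersed than this model allows.

```python
from math import exp, factorial

N_SAMPLES = 2536   # samples in the mutation analysis (from the text)
MEAN_SUBS = 3.0    # approximate mean substitutions per sample (from the text)
THRESHOLD = 13     # hypermutated cases carry more than 13 substitutions


def poisson_tail(lam: float, k: int) -> float:
    """Return P(X > k) for X ~ Poisson(lam)."""
    cdf = sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))
    return 1.0 - cdf


# Expected number of samples above the threshold under the assumed Poisson null.
expected_hyper = N_SAMPLES * poisson_tail(MEAN_SUBS, THRESHOLD)
print(f"expected samples above threshold under Poisson: {expected_hyper:.4f}")
```

Under this simplified null, far fewer than one sample would be expected above the threshold, compared with the seven observed.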
Fig. 2. Characterization of hypermutated cancer mitochondrial genomes.
a, Distribution of mtDNA mutations. The blue curve represents the calculated ratio between observed and expected numbers of samples for each bin (right-hand y axis). b, Mutational spectrum of the seven hypermutated mitochondrial genomes identified. The P values were generated by chi-squared test without multiple comparison adjustment (**P < 0.01; ****P < 0.0001). c, Distribution of the 33 somatic mutations in the breast cancer sample of mtDNA hypermutation (sample ID: SP6730). d, Proposed model of the mtDNA hypermutation process in SP6730. rRNAs, ribosomal RNAs.
The most striking case was a breast cancer sample (sample ID: SP6730) harboring 33 mutations, 30 of which were localized in a 2-kb region (Fig. 2c), resulting in a local hypermutational rate (>75× higher than the background mutational rate). The mutations were neither of germline origin (~70% were novel) nor caused by sequencing errors, as confirmed by independent exome and RNA-seq analyses (Supplementary Fig. 9). Interestingly, most of the localized mutations (n = 28) were T>C substitutions on the L strand (Fig. 2b,c) and were co-clonal with each other, with highly similar VAFs (~7%) and direct physical phasing by Illumina sequencing reads (Supplementary Fig. 9). Collectively, these lines of evidence strongly suggest that the 28 localized mutations (19 missense, four silent and five tRNA mutations) were acquired by a ‘single-hit’ catastrophic mutational mechanism with strand-specific T>C substitutions as a dominant spectrum, reminiscent of the kataegis phenomenon in the nuclear genome28 (Fig. 2c) and/or complex somatic mutations reported in mtDNA29. The mutated mtDNA copy is then likely to shift to appreciable VAF (~7% frequency) by a series of replications throughout the cell lineages, despite the low probability of causation of a defective phenotype.
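The fold enrichment quoted above can be checked with back-of-the-envelope arithmetic. The sketch below takes the cohort-wide substitution count per sample per kilobase as the background rate; this definition is an assumption, and the authors' exact calculation may differ.

```python
GENOME_KB = 16.6                     # mtDNA length in kb (from the text)
TOTAL_SUBS, N_SAMPLES = 7611, 2536   # cohort totals (from the text)

# Background: average substitutions per kb per sample across the cohort.
background_per_kb = TOTAL_SUBS / N_SAMPLES / GENOME_KB
# Local: 30 substitutions within an approximately 2-kb window in SP6730.
local_per_kb = 30 / 2.0

fold = local_per_kb / background_per_kb
print(f"local rate is about {fold:.0f}x the cohort-average rate")
```

With these inputs the ratio comes out above 75, consistent with the bound given in the text.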
Cancer type-specific selective pressures on mtDNA mutations
To investigate the functional consequences of mtDNA genes, we examined the dN/dS ratio (a common measure of selective pressure on missense mutations) with consideration of the unique mtDNA mutational signature9. We found that dN/dS was overall close to 1 for missense mutations at different VAFs across cancer types, suggesting that overall selection for mtDNA missense mutations is nearly neutral (Supplementary Fig. 10). However, it should not be interpreted that all missense mtDNA mutations are passengers.
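As a simplified illustration of the statistic, dN/dS can be viewed as the observed missense-to-silent ratio divided by the ratio expected under neutrality. In the sketch below, the function name and all numbers are hypothetical; in practice the expected counts would come from applying the mtDNA-specific mutational spectrum to every possible coding change, which is not reproduced here.

```python
def dn_ds(n_missense: int, n_silent: int,
          expected_missense: float, expected_silent: float) -> float:
    """Observed missense:silent ratio divided by the ratio expected under neutrality."""
    if n_silent == 0 or expected_silent == 0 or expected_missense == 0:
        raise ValueError("silent count and expectations must be non-zero")
    observed_ratio = n_missense / n_silent
    expected_ratio = expected_missense / expected_silent
    return observed_ratio / expected_ratio


# Hypothetical numbers for illustration only (not study data).
ratio = dn_ds(n_missense=300, n_silent=100,
              expected_missense=3.0, expected_silent=1.0)
print(ratio)
```

A value near 1, as in this toy case, indicates that missense mutations accumulate about as often as expected without selection; values below 1 suggest negative selection and values above 1 suggest positive selection.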
For truncating mutations on the 13 mtDNA genes, we found evidence of negative selection in most cancer types, suggesting the importance of intact mitochondrial function in cancer cells. For example, the VAFs of mtDNA truncating mutations were notably more suppressed than those of missense or silent mutations (Fig. 3a). Interestingly, kidney, colorectal and thyroid cancers showed the opposite trend, where mtDNA truncating mutations exhibited significantly higher VAFs than the background (F-test, P < 2.2 × 10−16; Fig. 3a). The enrichment of nearly homoplasmic (>60% VAF) truncating mutations was very striking in kidney cancers, especially in chromophobe and papillary types, suggesting that the inactivation of the normal mitochondrial function is an important step in tumorigenesis30 (Fig. 3b and Supplementary Fig. 11). The mtDNA truncating mutations were enriched in ND5. Compared with kidney chromophobe and colorectal cancers, kidney papillary cancers harbored ND5 truncation mutations enriched in the amino-terminal region (Fisher’s exact test, P = 0.05; Fig. 3c). Integrating with the mutation data of nuclear genes, we found that the high VAF truncating mutations in the two kidney cancer types were mutually exclusive to the mutations of known cancer genes (Fisher’s exact test, P = 0.01; Fig. 3d). Moreover, samples with mtDNA truncating mutations showed upregulation of gene expression in cancer-related pathways, such as mammalian target of rapamycin signaling, tumor necrosis factor-α signaling, oxidative phosphorylation and protein secretion (false discovery rate (FDR) < 0.05; Fig. 3e). Collectively, these results strongly suggest functional oncogenic impacts of mitochondrial truncating mutations in the initiation and clonal evolution of the specific cancer types.
Fig. 3. mtDNA truncating mutation patterns.
a, Distinct VAF accumulation curves of truncating mutations between kidney/colorectal/thyroid cancers and other cancer types. For comparison, similar curves were generated for silent and missense mutations, which are overall functionally neutral, in other types of cancer after normalization of mutation numbers. Generally, fewer truncating mutations were observed at higher allele-frequency levels (red), except for kidney, colorectal and thyroid cancer types (blue). n = number of samples. b, Kidney chromophobe, kidney papillary, colorectal and thyroid cancers accumulated excessive high-allele-frequency truncating mutations (sample sizes in parentheses). Areas under the curve across cancer types for a VAF interval of 0.6–1.0 (from Supplementary Fig. 11a) were calculated and are shown. Their distribution is shown by a box plot. The boundaries of the box mark the first and third quartile, with the median in the center, and whiskers extending to 1.5× the interquartile range from the boundaries. c, Distribution patterns of truncating mutations in ND5. C, carboxy; N, amino; Proton_antipo_N, NADH-ubiquinone oxidoreductase (complex I), chain 5 N-terminus; Proton_antipo_M, proton-conducting membrane transporter; NADH5_C, NADH dehydrogenase subunit 5 C terminus. d, Heat map of mtDNA truncating mutations with recurrent somatic mutations in cancer genes in kidney chromophobe and kidney papillary cancers. MT_truncating stands for mitochondrial truncating mutations, which include frameshift mutations and stop-gain mutations. Statistical significance of mutual exclusivity between mutations was assessed by Fisher’s exact test. SNV, single-nucleotide variant. e, Heat map of the signaling pathways enriched by nuclear genes upregulated in cancer samples with truncating mutations. A dot indicates FDR < 0.05.
IFN-γ, interferon-γ; IL-6, interleukin-6; JAK, Janus kinase; mTORC1, mammalian target of rapamycin complex 1; NF-κB, nuclear factor κB; STAT3, signal transducers and activators of transcription 3; TNF-α, tumor necrosis factor-α; KRAS, KRAS proto-oncogene, GTPase.
Somatic transfer of mtDNA into the nuclear genome
The migration of mtDNA into the nuclear genome has been assessed using different technologies31–33. Recently, somatic mtDNA nuclear transfers (SMNTs) have been more systematically studied in nucleotide resolution11, mostly in breast cancers. In this study, of the 2,658 cancer cases across 21 tissue types, we found 55 positive cases (2.1% overall positive rate) (Methods). The SMNT rate varied according to the cancer tissue type (Fisher’s exact test, P < 1 × 10−5; Fig. 4a). For example, lung, skin, breast and uterine cancers showed frequencies higher than 5%. In particular, human epidermal growth factor receptor 2-positive (HER2+) breast cancers and squamous cell lung cancers showed positive rates of 16.0% (four out of 25 cases) and 14.6% (seven out of 48 cases), respectively, which were significantly higher than the average (Fisher’s exact test, P < 0.003 and P < 0.001, respectively). In contrast, we did not find any positive cases from blood, kidney, esophagogastric, liver, prostate and colorectal cancers. The samples with SMNTs showed a much higher number of global and local structural variations in the nuclear genome than the control samples16 (P = 1 × 10−4; Fig. 4b and Supplementary Fig. 12). SMNT integration sites (breakpoints) were spatially closer to inversion and translocation breakpoints than expected (Fig. 4c). These results suggest that the integration of mtDNA segments into nuclear DNA is often mechanistically combined with some specific processes underlying structural variations in the nuclear genome.
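The subtype comparisons above can be approximated with a one-sided Fisher's exact test built from the counts in the text. The sketch below assumes that each subtype is compared against all remaining samples and that the subtype's positive cases are included in the 55 total; both points are assumptions about the test design, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of seeing `a` or more events in
    the first row, given fixed row and column totals.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    numerator = sum(
        comb(col1, x) * comb(total - col1, row1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return numerator / comb(total, row1)


# Rows: subtype vs. all other samples; columns: SMNT-positive vs. negative.
# Totals (2,658 samples, 55 positive) are taken from the text.
p_her2 = fisher_exact_greater(4, 21, 51, 2582)   # HER2+ breast: 4 of 25
p_lusc = fisher_exact_greater(7, 41, 48, 2562)   # squamous lung: 7 of 48
print(f"HER2+ breast: P = {p_her2:.2g}; squamous lung: P = {p_lusc:.2g}")
```

Under these assumptions, both P values fall below the bounds quoted in the text (0.003 and 0.001, respectively).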
Fig. 4. Somatic transfer of mtDNA into the cancer nuclear genome.
a, Frequency of SMNTs in different cancer tissues. Circle size indicates the sample size of a given cancer type. ER, estrogen receptor. b, Numbers of structural variant breakpoints in samples with and without SMNTs. Sample sizes are labeled below the boxes. The P values (***P < 0.001; ****P < 0.0001) were generated by two-sided t-test without multiple comparison adjustment. c, Distances from SMNT breakpoints to the nearest structural variant breakpoints are shorter than random expectation for all and each type of structural variant. Sample sizes are labeled in the centers of the boxes. The P values (**P < 0.01; ***P < 0.001) were generated by two-sided t-test without multiple comparison adjustment. bp, base pair; exp, expected; Mb, megabase; NS, not significant; obs, observed. d, Circos plot of three independent SMNT events in a bladder cancer genome (sample ID: SP953), showing 23 human chromosomes in the outer layer, as well as copy numbers of nuclear cancer genomes (inner layer; black dots); chromosomal rearrangements (gray curves) and SMNTs (red curves). A summary of three SMNTs with genomic coordinates in numbers is depicted below the Circos plot with breakpoints. e, An SMNT event found in a HER2+ breast cancer genome (sample ID: SP10563), leading to a tandem duplication process of ERBB2 exons 10–23 and their subsequent expression. The novel exon junction is supported by the RNA reads from the corresponding RNA-seq data. In all boxplots, the boundaries of the boxes mark the first and third quartiles, with the median in the center, and whiskers extending to 1.5× the interquartile range from the boundaries.
Despite the overall low SMNT frequency (~2%), some cancer samples showed up to three independent SMNT events (Fig. 4d and Supplementary Fig. 13). Sometimes, somatically transferred mtDNA segments were extensively rearranged (Supplementary Fig. 13b), implying extreme genomic instability at the time of the SMNT events. In 35 tumor cases, we observed 42 SMNT events that were integrated within genes, mostly in introns (n = 37), with a few events in the protein-coding regions (n = 3) and in the untranslated regions (n = 2) (Supplementary Table 3). Among these, open reading frames of at least 23 genes (23/42 = 55%), including cancer genes such as ERBB2, FOLH1 and ULK2, were predicted to be altered by these SMNTs and their combined structural variant events in the vicinity (Supplementary Fig. 14). Of particular interest, one SMNT was involved in transforming focal amplification of the ERBB2 gene in a HER2+ breast cancer genome (Fig. 4e).
Copy-number and structural variations of mtDNA
Although previous studies have examined mtDNA copy numbers in individual cancer types34–36 or from a collection of whole-exome sequencing data12, we performed a systematic and accurate analysis of mtDNA copy numbers per cell over the largest sample cohort with WGS data so far, accounting for confounding factors such as normal-cell contamination and the genome ploidy of tumor cells (Supplementary Fig. 15 and Methods).
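A commonly used estimator scales the ratio of mitochondrial to nuclear sequencing depth by the average number of nuclear genome copies per cell, which depends on tumor purity and ploidy. The sketch below shows this estimator as an assumption about the general approach; the exact formula and filters are given in the Methods, and the function name, nuclear depth, purity and ploidy values used here are hypothetical.

```python
def mtdna_copies_per_cell(mt_depth: float, nuclear_depth: float,
                          purity: float, ploidy: float) -> float:
    """Estimate average mtDNA copies per cell from depth, purity and ploidy.

    Assumes normal cells are diploid and that nuclear_depth is the mean
    depth over the nuclear genome of the mixed tumor/normal sample.
    """
    if nuclear_depth <= 0:
        raise ValueError("nuclear_depth must be positive")
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    # Average nuclear genome copies per cell in the tumor/normal mixture.
    nuclear_copies = purity * ploidy + (1 - purity) * 2
    return mt_depth / nuclear_depth * nuclear_copies


# Mean mtDNA depth from the text; the other inputs are hypothetical.
diploid = mtdna_copies_per_cell(9959, 40, purity=0.8, ploidy=2.0)
triploid = mtdna_copies_per_cell(9959, 40, purity=0.8, ploidy=3.0)
print(round(diploid), round(triploid))
```

With identical read depths, the triploid tumor yields a higher estimate than the diploid one, which is why ignoring ploidy can bias comparisons between samples.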
Based on the 2,157 cancer samples that passed the purity filter, we observed great variation in mtDNA copy numbers across and within cancer types: mtDNAs were most abundant in samples of ovarian cancer (median: 644 copies per cell) and least abundant in myeloid cancer (median: 90 copies per cell) (Fig. 5a). Different cancer subtypes originating from the same tissue sometimes showed distinct mtDNA copy-number distributions (Fig. 5b and Supplementary Fig. 16). For example, the mtDNA copy numbers for kidney chromophobe were significantly higher than those for kidney clear cell and kidney papillary (analysis of variance (ANOVA), P < 7.8 × 10−6; Fig. 5b). This may be interlinked with the general inadequacy of mitochondrial quality control and resultant increase in the steady-state mtDNA copy number, as seen in renal oncocytoma37. Indeed, we found that the mtDNA copy number was significantly higher in the samples with high-allele-frequency truncating mutations (ANOVA with consideration of confounders, P < 1.7 × 10−4; Fig. 5c), suggesting that the dosage effect of mtDNAs was selected to compensate for the deleterious effect of truncating mutations. For the cancer samples with WGS data from matched normal tissues (n = 507), we observed increased mtDNA copy numbers in cancer samples in patients with chronic lymphocytic leukemia, lung squamous cell carcinoma and pancreatic adenocarcinoma, but decreased copy numbers in cancer samples in patients with kidney clear cell carcinoma, hepatocellular carcinoma and myeloproliferative neoplasm (Fig. 5d). At face value, the distinct patterns in different cancer types may be due to cancer-specific oncogenic stimulation, metabolic activity and mitochondrial malfunctions. 
For example, a recent study12 suggested that significantly decreased mtDNA copy number in kidney clear cell cancer may be due to downregulation of peroxisome proliferator-activated receptor-γ coactivator 1α (a central regulator of mitochondrial biogenesis) by hyperactivated hypoxia-inducible factor 1α, which is most frequently mutated and activated in this disease38. However, since the available mtDNA copy numbers in normal tissues are average values from mixtures of many heterogeneous cell types with unknown relative contributions, a direct comparison between tumor and adjacent normal tissues should be interpreted cautiously.
Fig. 5. Pan-cancer view of mtDNA copy number.
a, Distributions of mtDNA copy number by cancer tissue type. Sample numbers with mtDNA copy-number information available are labeled on the top, with the median mtDNA copy numbers marked as red bars. b, Distinct mtDNA copy-number distributions for cancer types derived from the kidney (top) and brain (bottom). n = number of samples with mtDNA copy number. P values were generated by ANOVA. c, mtDNA copy numbers with and without truncating mutations in mtDNA genes. n = number of samples with both mtDNA copy number and somatic mutation information available. P values were based on ANOVA, adjusting for cancer types. d, Paired copy-number comparison of tumor and matching normal tissue samples. n = number of matching normal tissue and cancer sample pairs. Raw P values were determined by two-sided Wilcoxon signed-rank test, then adjusted for FDR (**FDR < 0.01; ***FDR < 0.001). e, Correlation of mtDNA copy number with patient age in prostate cancer. n = number of samples with cancer mtDNA copy number and patient age information available. Correlations and P values are based on Spearman’s rank correlation. f, Correlation of mtDNA copy number with cancer stage in chronic lymphocytic leukemia. n = number of samples with mtDNA copy number and stage information. g, Focal copy gain and loss caused by structural variations in three cancer samples. Green lines represent focal loss, whereas red lines represent tandem duplication. In all boxplots, the boundaries of the boxes mark the first and third quartile, with the median in the center, and whiskers extending to 1.5× the interquartile range from the boundaries.
To assess the potential biomedical significance of mtDNA copy numbers, we examined their correlations with key clinical variables. We found significant positive correlations between the mtDNA copy number and the patient’s age at diagnosis in prostate (Spearman’s rank, Rs = 0.31; P < 1.7 × 10−4; Fig. 5e), colorectal and skin cancers (Supplementary Fig. 17). In contrast, we observed negative correlations of normal blood mtDNA copy number with patient age in most cases (Supplementary Fig. 18). We further observed correlations between mtDNA copy number and tumor stage in multiple cancer types (Fig. 5f and Supplementary Fig. 19).
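Spearman's rank correlation, used above for the age association, is the Pearson correlation computed on ranks. A minimal standard-library sketch follows; the function names and the toy ages and copy numbers are invented for illustration and are not study data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ages and mtDNA copy numbers (not study data).
ages = [45, 52, 60, 63, 70, 74]
copies = [180, 150, 260, 240, 310, 400]
rs = spearman(ages, copies)
print(round(rs, 3))
```

Because only ranks enter the calculation, the statistic is insensitive to the skewed distribution of copy numbers across samples.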
Using WGS data, we examined the focal copy gain and loss in the mitochondrial genomes that were known to be present in prostate cancers and aged tissues39. Of the 2,658 cancer samples, three (0.11%) showed notable structural variants in the mtDNA (Fig. 5g). For example, a pancreatic cancer case (sample ID: SP76017) harbored a ~3.4-kb-long mtDNA loss that truncated ribosomal RNA and ND1 genes. The VAF of this mutant mtDNA was estimated at 63%. Similarly, a melanoma case (sample ID: SP127680) showed tandem duplication of an mtDNA segment of ~4 kb, with 100% VAF. Thus, our analysis identified structural variants in mtDNA genomes based on WGS.
Co-expression network analysis of mitochondrial genes
To understand the functional impact of 13 mtDNA genes in cancers, we quantified the gene expression levels using RNA-seq data profiled from 4,689 TCGA tumor samples of 13 cancer types (Supplementary Table 4). The correlation between the gene expression levels and the mtDNA copy number varied by cancer type (Supplementary Fig. 20). Among the cancer types, the mtDNA genes were highly expressed in the three types of kidney cancer (chromophobe, papillary and clear cell) but weakly expressed in the three types of squamous cell carcinoma (cervical, lung and head and neck) (Fig. 6a). This observation was partially due to the relative abundance of mtDNA copy number across cancer types and is consistent with a study of normal tissues40.
Fig. 6. Co-expression patterns of mtDNA genes across different cancer types.
a, Left: heat map of the expression levels of 13 mtDNA genes of 13 cancer types. Right: bar plot showing the sample sizes for each cancer type. b, Commonly enriched pathways identified by co-expression with mtDNA genes in different cancer types. Borders of cells with FDR < 0.05 are highlighted in yellow. c, mtDNA gene-centric pan-cancer co-expression network. The pie chart colors at each node indicate occurrence of the node in cancer types of the corresponding colors. Green borders, nuclear genes; blue borders, mitochondrial genes. Node size is proportional to the number of direct neighbors (connectivity) of the node. Thickness of the edge is proportional to the frequency of this edge being observed across all cancer types. Edges are colored according to the connection type (gray: mtDNA gene–mtDNA gene connection; magenta: mtDNA gene–nuclear gene connection). BLCA, bladder urothelial carcinoma; BRCA, breast invasive carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; HNSC, head and neck squamous cell carcinoma; KICH, kidney chromophobe; KIRC, kidney renal clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LGG, brain lower-grade glioma; LIHC, liver hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; PRAD, prostate adenocarcinoma; THCA, thyroid carcinoma.
To gain more insight into the functions of mtDNA genes and their related nuclear genes and pathways, for each cancer type, we used the weighted gene co-expression network analysis (WGCNA) package41 to build a weighted gene co-expression network that consisted of both nuclear genes and mitochondrial genes (Methods). We then performed gene set enrichment analysis (GSEA)42 based on the rank of all nuclear genes by measuring their edge strength to a mitochondrial gene in the co-expression network. We found oxidative phosphorylation to be the top-ranked enriched pathway, and to be enriched in eight out of the 13 cancer types examined (FDR < 0.05), highlighting the essential role of mitochondrial genes in energy generation (Fig. 6b). Pathways related to the cell cycle (MYC targets, mitotic spindle, G2/M checkpoint and E2F targets) and DNA repair were also enriched in multiple cancer types (Fig. 6b), consistent with the established notion that mtDNA plays an important role in these pathways37,43.
We also examined the mtDNA-centric co-expression networks (Fig. 6c and Methods). Across cancer types, the mtDNA genes were almost always strongly interconnected, which is expected since they are transcribed as long polycistronic precursor transcripts44. Interestingly, several clinically actionable genes were among the neighboring genes that showed strong co-expression patterns with mtDNA genes (Fig. 6c and Supplementary Fig. 21). For example, AR, EGFR, DDR2 and MAP2K2 were connected with mtDNA genes in prostate cancer, and TMPRSS2, NF1, PIK3CA, BRCA1 and TOP1 were the top neighbors of mtDNA genes in multiple cancer types. This correlation-based analysis does not necessarily suggest causality, and further efforts are needed to investigate these relationships. Elucidating the underlying mechanisms may lay a foundation for developing mtDNA-related cancer therapy.
An open-access Cancer Mitochondrial Atlas data portal
To facilitate mitochondria-related biological discoveries and clinical applications, we developed an open-access, user-friendly data portal, The Cancer Mitochondrial Atlas (TCMA), for fluent exploration of the various types of molecular data characterized in this study (Supplementary Fig. 22). The data portal can be accessed at http://bioinformatics.mdanderson.org/main/TCMA:Overview. There are four modules in TCMA: somatic mutations, nuclear transfer, copy number and gene expression. The first three modules are based on the ICGC WGS data and provide detailed annotations for the corresponding features of each cancer sample. The last module is based on TCGA RNA-seq data and provides an interactive interface through which users can visualize the co-expression network with convenient navigation and zoom features. Not only can users browse and query the molecular data by cancer type, they can also download all of the data for their own analysis.
Discussion
This work characterizes the cancer mitochondrial genome in a comprehensive manner, including somatic mutations, nuclear transfer, copy number, structural variants and mtDNA gene expression. Because of the ultra-high coverage of mtDNA from the WGS data and the large number of patient samples surveyed, our study provides a definitive landscape of mtDNA somatic mutations and identifies several unique features. First, we report hypermutated mitochondrial cases, highlighting the dynamic mutational processes in this tiny genome. Second, our systematic analysis of mitochondrial genomes has firmly shown that several cancer types are enriched for high-allele-frequency truncating mutations, including previously reported kidney chromophobe30,45 as well as newly identified kidney papillary, thyroid and colorectal cancers. Interestingly, the thyroid and kidney are the most frequent sites of oncocytomas, which are rare, benign tumors characterized by frequent nuclear chromosomal aneuploidy as well as vast accumulation of defective mitochondria45,46, further supporting the functional association between mitochondrial inactivation and the pathogenesis of these cancer types. Third, in contrast with the diversified mutational signatures observed in the nuclear genomes of different cancers20, mtDNAs show very similar mutational signatures regardless of cancer tissue origins: predominantly G>A and T>C substitutions on the L strand. This monotonous pattern may partially stem from different mutational generators and DNA repair processes between the nucleus and mitochondria9,47,48. Due to their large numbers of copies per cell, mitochondria may simply remove mtDNA damaged by external mutagens (for example, ultraviolet radiation, tobacco smoking and reactive oxygen species) through autophagy and other mitochondrial dynamic mechanisms49, rather than employing a complex array of repair proteins as in the nucleus.
One unique aspect of our study is the integrative analysis of mitochondrial molecular alterations with those in the nuclear genome that are characterized by the PCAWG Consortium. We found that: (1) high-allele-frequency truncating mtDNA mutations are mutually exclusive with mutated cancer genes in kidney cancer; (2) mtDNA nuclear transfers are associated with increased numbers of structural variants in the nuclear genome; and (3) mtDNA co-expressed nuclear genes are enriched in several processes critical for tumor development. These results indicate that the mitochondrial genome is an essential component in understanding the complex molecular patterns observed in cancer genomes and in helping to pinpoint potential cancer driver events. Our results, such as the nuclear transfer of mtDNA into a therapeutic target gene, correlations of mtDNA copy numbers with clinical variables, and the co-expression of mtDNA and clinically actionable genes, underscore the clinical importance of mitochondria.
Taken together, this study has untangled and characterized the full spectrum of molecular alterations of mitochondria in human cancers. Our analyses have provided essentially complete catalogs of somatic mtDNA alterations in cancers, including substitutions, indels, copy-number alterations and structural variants. Furthermore, we have developed a user-friendly web resource to enable the broader biomedical community to capitalize on our results. These efforts lay a foundation for translating mitochondrial biology into clinical investigations.
Methods
Data generation and collection
We extracted BAM files of mtDNA sequencing reads from the whole-genome alignment files of 2,658 cancer samples and their matched normal tissue samples generated by the PCAWG Consortium. BWA was used to align the reads to the human reference genome (hs37d5). From the CGHub, we obtained TCGA RNA-seq BAM files of 13 cancer types, all of which employed paired-end sequencing strategies. We used Cufflinks to quantify the messenger RNA expression levels (in fragments per kilobase per million mapped fragments) of the 13 mitochondrial protein-coding genes. We obtained the nuclear somatic mutations and annotated driver mutations of corresponding samples as described17.
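For readers unfamiliar with the expression unit mentioned above, the following is a minimal sketch of the textbook definition of fragments per kilobase per million mapped fragments (FPKM). It is not a reimplementation of Cufflinks, which estimates abundances with a statistical model; the function and argument names are illustrative assumptions.

```python
def fpkm(fragments: float, gene_length_bp: float, total_mapped_fragments: float) -> float:
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments.

    Illustrative only; Cufflinks itself uses a more elaborate abundance model.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return 1e9 * fragments / (gene_length_bp * total_mapped_fragments)
```

For example, 500 fragments on a 1,000-bp gene in a library of 10 million mapped fragments gives `fpkm(500, 1000, 10_000_000)`, that is, 50 FPKM.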
Somatic mutation calling
The nuclear genome mutations were called using the Sanger pipeline, provided by the PCAWG. The mitochondrial variants were initially called using VarScan2 (ref. 50) and the same parameter setting as previously reported9: --strand-filter 1 (mismatches should be reported by both forward and reverse reads), --min-var-freq 0.01 (minimum VAF 1%), --min-avg-qual 20 (minimum base quality 20), --min-coverage × and --min-reads2 ×. We applied a series of downstream bioinformatic filters to further remove false positives as follows (Supplementary Fig. 2a).
First, we filtered germline polymorphisms and false positive calls (for example, frequent mapping errors due to known mtDNA homopolymers, candidates with substantial mapping strand bias and candidates with substantial mutant alleles in the matched normal sample). For analytic simplicity, we removed multi-allelic mtDNA mutations and back mutations from the non-reference to the reference allele. After this filtration step, we obtained 10,083 somatic substitution candidates.
Second, we examined DNA cross-contamination because even minor DNA cross-contamination (that is, contamination level < 3%) would generate many low-VAF false positive calls that are in fact germline polymorphisms from the contaminating sample. We tested whether mtDNA somatic mutations detected from a cancer sample show greater overlap with known mtDNA polymorphisms than expected from the overall average rate (73.5%; 3,922/5,337 substitutions) using the binomial test with a cutoff P < 0.01. From this step, we removed 96 samples with evidence of DNA cross-contamination (harboring 935 known mutations out of 1,131 known mutation candidates).
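The contamination screen described above can be sketched as a per-sample binomial test. This is a minimal illustration, assuming a one-sided (upper-tail) test against the overall rate of 3,922/5,337; the sidedness and the function names are assumptions rather than details stated in the text.

```python
from math import comb

# Overall fraction of somatic candidates that coincide with known mtDNA
# polymorphisms (73.5%; 3,922/5,337), as given in the text.
BACKGROUND_RATE = 3922 / 5337


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def looks_contaminated(n_known: int, n_total: int,
                       rate: float = BACKGROUND_RATE,
                       alpha: float = 0.01) -> bool:
    """Flag a sample whose calls overlap known polymorphisms more than expected.

    n_known: candidates in the sample that match known polymorphisms.
    n_total: all somatic candidates in the sample.
    Assumption: one-sided test with cutoff P < alpha.
    """
    if n_total == 0:
        return False
    return binomial_upper_tail(n_known, n_total, rate) < alpha
```

Under these assumptions, a sample in which all 20 of 20 candidates are known polymorphisms would be flagged, whereas 10 of 14 would not.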
Third, we examined the overall mtDNA substitution signatures in the 96 possible mutation classes. We removed four samples with extremely high proportions of C>G substitutions with strong sequence context bias (at CpCpN>CpApN; most frequently at CpCpG>CpApG; Supplementary Fig. 2b). This spectrum is known to arise from artificial guanine oxidation during sequencing library preparation steps19 with low VAF (1–2%). We explicitly removed these samples from further analyses.
Then, we examined the possibility of false positive calls due to mismapping of reads from inherited nuclear mtDNA-like sequences (known as numts) not represented in the human reference genome18, especially when the specific numts regions were amplified in the cancer nuclear genome. These mutation candidates showed some specific features: (1) they appeared as highly recurrent mtDNA somatic mutations among multiple samples; (2) VAFs in mitochondria were only slightly higher than our 1% cutoff criteria; and (3) the matched normal samples also had small but substantial numbers of mutation allele counts. To remove these false positive calls, we applied two statistical tests of: (1) whether the VAF of a mutation candidate in the matched normal sequences was within the normal range (<0.0024; the cutoff is determined by the median VAF of all mutation candidates +2× the interquartile range); and (2) whether:
was within the normal range (<0.0357; the cutoff is determined by the median VAF of all mutation candidates +2× the interquartile range), where Nmut is the mutation allele count, RD is the average read depth for the nuclear genome, and nor and tum are normal and matched tumor tissues, respectively. When a mutation appeared to be an outlier according to both criteria, we removed the candidate from our downstream analyses.
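The cutoff rule stated above (median plus twice the interquartile range, with removal only when both criteria are exceeded) can be sketched as follows. The quartile convention is an assumption, since the text does not specify one, and `second_stat` is a hypothetical placeholder for the second test statistic, which this sketch does not attempt to compute.

```python
from statistics import median, quantiles


def outlier_cutoff(values, k=2.0):
    """Return median + k * IQR of the values.

    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values) + k * (q3 - q1)


def is_numt_artifact(normal_vaf, second_stat, vaf_cutoff, stat_cutoff):
    """Flag a candidate only when it is an outlier on both criteria.

    normal_vaf: VAF of the candidate in the matched normal sample.
    second_stat: value of the second test statistic (hypothetical placeholder).
    """
    return normal_vaf >= vaf_cutoff and second_stat >= stat_cutoff
```

With the cutoffs quoted in the text, a call such as `is_numt_artifact(0.01, 0.1, 0.0024, 0.0357)` would mark the candidate for removal, while a candidate exceeding only one cutoff would be kept.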
In our previous study9, we could not detect mutations under a 3% VAF cutoff because mtDNA was sequenced with a read depth of ~100× from the majority of samples surveyed. Taking advantage of the ultra-high depth (>8,000×) in this study, we used a 1% VAF cutoff to obtain better sensitivity. We found 2,133 more substitutions when the VAF was between 1 and 3%. Because of the ultra-high depth, even 1% VAF mutations were considered to be specific, and were supported by a high number (n = ~80) of mutation alleles. We confirmed the high specificity of these mutations using the unique mtDNA mutational signatures robustly observed even from these low-VAF mutations: (1) the mutational spectrum is generally consistent with those from higher heteroplasmic levels of mutations (that is, VAFs from 3–10% and 10–100%); (2) we observed the absolute dominance of C>T and T>C substitutions in the expected trinucleotide contexts (NpCpG for C>T and NpTpC for T>C substitutions); and (3) we also observed extreme replication strand bias (Supplementary Fig. 3). These features would not be observed if contaminations resulted in many false positive calls. To assess the factors affecting the mutation frequency of the 13 coding genes, we performed the sample-level analysis using log-linear modeling: we assigned the binary mutation indicator (1: with mutation; 0: without mutation) to each sample for each gene and then fit this binary response variable to a logistic regression model, including cancer type, gene identity and their interaction as explanatory variables, which were later summarized using ANOVA. In addition, within each cancer type, we used Spearman’s rank correlation to assess the association between the numbers of nuclear and mtDNA somatic mutations, as well as their individual association with patient age.
Truncating mutation analysis
Taking into account the mtDNA-specific mutational signature, we examined the dN/dS ratio for mtDNA missense substitutions as reported previously9. We defined truncating mutations as those that lead to truncated protein products (that is, nonsense mutations and frameshift indels), and accordingly categorized the samples into the truncating group (bearing at least one truncating mutation with VAF ≥ 60%). The ND5 protein domain information was obtained from Pfam (http://pfam.xfam.org/protein/P03915). The cancer gene census list was obtained from http://cancer.sanger.ac.uk/cosmic/download. Cancer census genes with recurrent somatic mutations in kidney chromophobe and kidney papillary cancers were selected for analysis of mutual exclusivity and heat-map representation. One sample with a nuclear DNA hypermutator phenotype was excluded from this analysis. To examine the functional consequences of mtDNA truncating mutations, we performed GSEA based on the ranks of differentially expressed genes between samples with and samples without mtDNA truncating mutations for kidney chromophobe, kidney papillary, colorectal and thyroid cancers and their combination, and identified significantly enriched pathways at FDR = 0.05.
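The sample grouping rule above can be expressed compactly. This is a minimal sketch, assuming each mutation is annotated with a consequence label and a VAF; the label strings are illustrative assumptions, not identifiers from the study's pipeline.

```python
# Consequence labels treated as truncating (label names are assumptions).
TRUNCATING_CLASSES = {"nonsense", "frameshift_indel"}
VAF_THRESHOLD = 0.60  # VAF >= 60%, as stated in the text


def in_truncating_group(mutations):
    """Return True if the sample has at least one high-VAF truncating mutation.

    mutations: iterable of (consequence, vaf) tuples for one sample.
    """
    return any(
        consequence in TRUNCATING_CLASSES and vaf >= VAF_THRESHOLD
        for consequence, vaf in mutations
    )
```

A sample carrying only a low-VAF nonsense mutation, or only high-VAF missense mutations, falls outside the truncating group under this rule.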
SMNT analysis
We examined the WGS data from the cancer and matched control tissue samples using a pipeline for the identification of mtDNA translocation to the nuclear genome, as reported previously11. The specificity was shown to be 100% in the previous study11. Briefly, we extracted and clustered discordant reads from cancer genomes, where one end aligned to nuclear DNA and the other aligned to mtDNA. Then, to determine the nucleotide-resolution breakpoints, we searched for split reads near putative breakpoint junctions (1,000 base pairs upstream and downstream), where a fraction of a single read aligned to genomic DNA near the junctions and the rest aligned to mtDNA. All filtering criteria were the same as previously reported, except that we did not use BLAT51 for split-read detection because the BWA-MEM alignment tool used to map all pan-cancer samples fundamentally enables split-read mapping. We removed candidate mitochondrion–nuclear DNA junctions that overlapped with clusters from matched and unmatched normal samples and/or known human SMNTs, a combined set from the human reference genome (hg19; n = 123) and a published study52 (n = 766), because the source of the mtDNA sequence fused to the nuclear genome might be SMNTs rather than real mitochondria in the cytoplasm of cells. We obtained the structural variant calls from the PCAWG Structural Variation Working Group16 and compared the numbers of structural variants between samples with and without SMNTs using the t-test. To study the relationship of SMNTs and structural variant breakpoints, we randomly chose the same number of structural variant breakpoints from each sample 100 times to estimate the random expectation.
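The first step of this procedure, collecting read pairs with one end on nuclear DNA and the other on mtDNA and then clustering them, can be sketched as below. This is a simplified illustration and not the published pipeline: the input tuple layout, the mitochondrial contig names and the 1,000-bp clustering window are all assumptions.

```python
MITO_NAMES = {"MT", "chrM"}  # assumed contig names for mtDNA


def nuclear_mito_pairs(pairs):
    """Keep read pairs with exactly one end aligned to mtDNA.

    pairs: iterable of (chrom1, pos1, chrom2, pos2).
    Returns a list of (nuclear_chrom, nuclear_pos, mito_pos).
    """
    hits = []
    for c1, p1, c2, p2 in pairs:
        m1, m2 = c1 in MITO_NAMES, c2 in MITO_NAMES
        if m1 != m2:  # exactly one end is mitochondrial
            hits.append((c2, p2, p1) if m1 else (c1, p1, p2))
    return hits


def cluster_by_nuclear_position(hits, window=1000):
    """Group hits on the same chromosome lying within `window` bp of the previous hit.

    The window size is an assumption chosen for illustration.
    """
    clusters = []
    for hit in sorted(hits):
        last = clusters[-1][-1] if clusters else None
        if last and last[0] == hit[0] and hit[1] - last[1] <= window:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters
```

Each resulting cluster marks a putative junction near which split reads would then be sought; filtering against normal samples and known insertions is not covered by this sketch.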
MtDNA copy-number analysis
To better estimate the mtDNA copy number for cancer samples, we employed the following formula, which incorporates both tumor purity and ploidy information:
where f is the tumor purity (ranging from 0 to 1, where 1 stands for pure cancer cells and 0 stands for pure normal cells), CN is the mtDNA copy number, coverage_depthmtDNA and coverage_depthgDNA are the mean coverage depths for mtDNA and the nuclear genome in individual WGS BAM files, respectively, and ploidycancer is the number of sets of chromosomes in tumor cells, while ploidy in the normal cells is 2. Both f and ploidycancer were obtained using allele-specific copy-number analysis of tumors estimation53, provided by the PCAWG Consortium. Donors with multiple samples were preselected so that each donor came with one representative primary cancer sample. We excluded cancer samples with low purity (<0.4, estimated by allele-specific copy-number analysis of tumors) for further downstream analyses. We used ANOVA (if there were more than two cancer types) or t-test to compare the mtDNA copy number of cancer types derived from the same tissue. Since many of the normal samples were from blood, we focused on the cancer types with at least ten samples from the normal tissue adjacent to the tumor in order to compare the mtDNA copy number of the paired cancer and normal samples. We used the Wilcoxon signed-rank test to compare the mtDNA copy number for each selected cancer type and further adjusted the raw P values based on the FDR. To assess the correlation of mtDNA copy number with truncating mutations, we employed ANOVA (with the cancer type included in the model, to account for its potential effect). We assessed the correlations of the mtDNA copy number with the patient’s age, overall survival time and cancer stage using Spearman’s rank correlation, Cox model/log-rank test and ANOVA, respectively. We log2-transformed the mtDNA copy-number values when using ANOVA and the t-test, to conform to the normality assumption.
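As an illustration of how purity and ploidy enter the estimate, the sketch below assumes the adjustment takes the commonly used form CN = (coverage_depthmtDNA / coverage_depthgDNA) × (f × ploidycancer + (1 − f) × 2). This form is an assumption that is consistent with the symbol definitions above, and the function and argument names are illustrative.

```python
def mtdna_copy_number(depth_mt, depth_nuclear, purity, ploidy_cancer, ploidy_normal=2.0):
    """Estimate mtDNA copies per cell from WGS coverage (assumed formula).

    depth_mt, depth_nuclear: mean coverage depths of mtDNA and the nuclear genome.
    purity: tumor purity f in [0, 1].
    ploidy_cancer: ploidy of tumor cells; normal cells are taken as diploid.
    """
    if depth_nuclear <= 0:
        raise ValueError("nuclear coverage depth must be positive")
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie in [0, 1]")
    # Average number of nuclear genome copies per cell in the tumor/normal mixture.
    mean_nuclear_copies = purity * ploidy_cancer + (1.0 - purity) * ploidy_normal
    return depth_mt / depth_nuclear * mean_nuclear_copies
```

Under this assumed form, a pure diploid sample with 8,000× mtDNA depth and 40× nuclear depth yields 400 copies per cell, and the same depths at 50% purity with a tetraploid tumor yield 600.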
mtDNA structural variation analysis
To investigate large deletions or duplications in the mtDNA genome, we looked for read-depth changes in tumor mtDNA sequences, using normal mtDNA sequences as a reference. To this end, we calculated the normalized depth of mtDNA loci in 100-base-pair bins from all of the normal samples. Then, we calculated the deviation of mtDNA read depth in each tumor sample. When the relative depth of ten consecutive bins was sufficiently increased or decreased (z score > 3), we considered the region a structural variation candidate. For all of the candidates, we then looked for discordant paired-end reads or breakpoint-spanning reads, which strongly support structural variations11.
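The bin-level screen described above can be sketched as follows. This is a minimal illustration, assuming that z scores are computed per bin against the mean and standard deviation across normal samples and that decreases are detected symmetrically (z below −3); both choices, and all names, are assumptions.

```python
from statistics import mean, stdev


def bin_z_scores(tumor_depth, normal_depths):
    """Return one z score per bin for a tumor sample.

    tumor_depth: normalized depth per bin for the tumor.
    normal_depths: list of per-sample lists of normalized depths (same bins).
    """
    z = []
    for i, depth in enumerate(tumor_depth):
        ref = [sample[i] for sample in normal_depths]
        mu, sd = mean(ref), stdev(ref)
        z.append((depth - mu) / sd if sd > 0 else 0.0)
    return z


def candidate_regions(z, threshold=3.0, min_bins=10, bin_size=100):
    """Return (start_bp, end_bp, direction) for runs of >= min_bins consecutive
    bins that all exceed the threshold in the same direction."""
    regions = []
    start, sign = None, 0
    for i, value in enumerate(list(z) + [0.0]):  # trailing sentinel closes the last run
        s = 1 if value > threshold else -1 if value < -threshold else 0
        if s != sign:
            if sign != 0 and i - start >= min_bins:
                regions.append((start * bin_size, i * bin_size,
                                "gain" if sign > 0 else "loss"))
            start, sign = i, s
    return regions
```

Regions returned by such a screen are only candidates; as the text notes, supporting discordant or breakpoint-spanning reads are still required.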
Co-expression analysis
For each cancer type, we used the WGCNA package41 to build a weighted gene co-expression network that contains ~20,000 nodes (including both nuclear genes and mitochondrial genes). The key parameter, β, for a weighted network construction was optimized to maintain both the scale-free topology and sufficient node connectivity, as recommended in the manual. In such a network, any two genes were connected and the edge weight was determined by the topology overlap measure provided in WGCNA. This measure considered not only the expression correlation between two partner genes, but also how many ‘friends’ the two genes shared. The weights ranged from 0 to 1, which reflected the strength of the connection between the two genes. To identify mitochondria-related pathways, we performed GSEA42 on the basis of the full set of nuclear protein-coding genes, ranked on the basis of the weights of the edge connecting the mitochondrial genes, and detected significant pathways at FDR = 0.05. To construct the mitochondria-centric network, we focused on the top 500 neighboring genes that showed the strongest connections with the mitochondrial genes, with a minimum weight of 0.05. Among these neighboring genes, we detected the clinically actionable genes (defined as FDA-approved therapeutic targets and their relevant predictive markers54) in at least one of the cancer types we surveyed. We examined the correlations of mtDNA gene expression levels with mtDNA copy numbers using Spearman’s rank correlations.
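The neighbor-selection step for the mitochondria-centric network can be sketched as below. This is an illustration only: it assumes edge weights are available as a dictionary keyed by gene pairs and that a nuclear gene is ranked by its strongest edge to any mitochondrial gene, which is one plausible reading of "strongest connections" and not a detail confirmed by the text.

```python
def top_mito_neighbors(edge_weights, mito_genes, top_n=500, min_weight=0.05):
    """Rank nuclear genes by their strongest edge to any mitochondrial gene.

    edge_weights: dict mapping (gene_a, gene_b) -> weight in [0, 1].
    mito_genes: set of mitochondrial gene names.
    Returns up to top_n (nuclear_gene, weight) pairs with weight >= min_weight,
    sorted by decreasing weight (ties broken by gene name).
    """
    best = {}
    for (a, b), weight in edge_weights.items():
        if (a in mito_genes) == (b in mito_genes):
            continue  # skip mito-mito and nuclear-nuclear edges
        nuclear = b if a in mito_genes else a
        if weight >= min_weight and weight > best.get(nuclear, 0.0):
            best[nuclear] = weight
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

The gene names used when calling this function (for example, "MT-ND5") are placeholders; any consistent identifier scheme would work.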
TCMA data portal construction
We stored the precalculated mtDNA molecular data (including mtDNA mutation, nuclear transfer, copy number and expression) in a CouchDB database. The Web interface was implemented in JavaScript, tables were rendered with DataTables, and the co-expression network visualization was implemented with Cytoscape Web.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41588-019-0557-x.
Supplementary information
Supplementary Figs. 1–22 and Tables 1–4
Acknowledgements
This study was partially supported by an MD Anderson Cancer Center Faculty Scholar Award (to H.L.), the Lorraine Dell Program in Bioinformatics for Personalization of Cancer Medicine (to J.N.W.), an Institute for Information and Communications Technology Promotion grant funded by the Korean government (Ministry of Science, ICT and Future Planning) (B0101-15-0104; the development of a supercomputing system for genome analysis), the Korea Health Technology Research and Development Project (through the Korea Health Industry Development Institute, funded by the Ministry of Health and Welfare, Republic of Korea) (HI14C0072 to H.-L.K. and HI17C1836 to Y.S.J.) and the Korean National Research Foundation (NRF-2016R1D1A1B03934110 and NRF-2017R1A2B2012796). We also thank the Electronics and Telecommunications Research Institute in Korea for its commitment to the ICGC PCAWG projects, the MD Anderson Cancer Center High-Performance Computing Core Facility for computing, and L. Chastain for editorial assistance. We acknowledge the contributions of the members of the many clinical networks across ICGC and TCGA who provided samples and data to the PCAWG Consortium, and the contributions of the Technical Working Group and the Germline Working Group of the PCAWG Consortium for collation, realignment and harmonized variant calling of the cancer genomes used in this study. We thank the patients and their families for participation in the individual ICGC and TCGA projects.
Author contributions
K.P., P.J.C. and H.L. conceived and designed the project. Y. Yuan, Y.S.J., Y.K., J.L., Y.W., C.J.Y., Y. Yang, I.M., C.J.C., J.N.W., Y.X., L.H., H.-L.K., H.N., K.P., P.J.C. and H.L. contributed to the data analysis. Y. Yuan, Y.S.J., Y.K. and H.L. wrote the manuscript with input from all other authors. H.L. supervised the entire project.
Data availability
Somatic and germline variant calls, mutational signatures, subclonal reconstructions, transcript abundance, splice calls and other core data generated by the ICGC/TCGA PCAWG Consortium are described here14 and are available for download at https://dcc.icgc.org/releases/PCAWG. Additional information on accessing the data, including raw read files, can be found at https://docs.icgc.org/pcawg/data/. In accordance with the data access policies of the ICGC and TCGA projects, most molecular, clinical and specimen data are in an open tier that does not require access approval. To access information with the potential to identify individuals, such as germline alleles and underlying sequencing data, researchers will need to apply to the TCGA Data Access Committee via the Database of Genotypes and Phenotypes (https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login) for access to the TCGA portion of the dataset, and to the ICGC Data Access Compliance Office (http://icgc.org/daco) for the ICGC portion. In addition, to access somatic single-nucleotide variants derived from TCGA donors, researchers will also need to obtain Database of Genotypes and Phenotypes authorization. Derived datasets described specifically in this study are available from the TCMA data portal (http://bioinformatics.mdanderson.org/main/TCMA:Overview).
Code availability
The core computational pipelines used by the PCAWG Consortium for alignment, quality control and variant calling are available to the public at https://dockstore.org/search?search=pcawg under GNU General Public License version 3.0, which allows for reuse and distribution.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Yuan Yuan, Young Seok Ju, Youngwook Kim.
A list of authors and their affiliations appears online.
Change history
2/11/2020
A Correction to this paper has been published: https://doi.org/10.1038/s41588-020-0587-4
Change history
4/27/2020
A Correction to this paper has been published: https://doi.org/10.1038/s41588-020-0629-y
Change history
3/21/2023
A Correction to this paper has been published: 10.1038/s41588-023-01317-x
Contributor Information
Keunchil Park, Email: kpark@skku.edu.
Peter J. Campbell, Email: pc8@sanger.ac.uk.
Han Liang, Email: hliang1@mdanderson.org.
PCAWG Consortium:
Lauri A. Aaltonen, Federico Abascal, Adam Abeshouse, Hiroyuki Aburatani, David J. Adams, Nishant Agrawal, Keun Soo Ahn, Sung-Min Ahn, Hiroshi Aikata, Rehan Akbani, Kadir C. Akdemir, Hikmat Al-Ahmadie, Sultan T. Al-Sedairy, Fatima Al-Shahrour, Malik Alawi, Monique Albert, Kenneth Aldape, Ludmil B. Alexandrov, Adrian Ally, Kathryn Alsop, Eva G. Alvarez, Fernanda Amary, Samirkumar B. Amin, Brice Aminou, Ole Ammerpohl, Matthew J. Anderson, Yeng Ang, Davide Antonello, Pavana Anur, Samuel Aparicio, Elizabeth L. Appelbaum, Yasuhito Arai, Axel Aretz, Koji Arihiro, Shun-ichi Ariizumi, Joshua Armenia, Laurent Arnould, Sylvia Asa, Yassen Assenov, Gurnit Atwal, Sietse Aukema, J. Todd Auman, Miriam R. R. Aure, Philip Awadalla, Marta Aymerich, Gary D. Bader, Adrian Baez-Ortega, Matthew H. Bailey, Peter J. Bailey, Miruna Balasundaram, Saianand Balu, Pratiti Bandopadhayay, Rosamonde E. Banks, Stefano Barbi, Andrew P. Barbour, Jonathan Barenboim, Jill Barnholtz-Sloan, Hugh Barr, Elisabet Barrera, John Bartlett, Javier Bartolome, Claudio Bassi, Oliver F. Bathe, Daniel Baumhoer, Prashant Bavi, Stephen B. Baylin, Wojciech Bazant, Duncan Beardsmore, Timothy A. Beck, Sam Behjati, Andreas Behren, Beifang Niu, Cindy Bell, Sergi Beltran, Christopher Benz, Andrew Berchuck, Anke K. Bergmann, Erik N. Bergstrom, Benjamin P. Berman, Daniel M. Berney, Stephan H. Bernhart, Rameen Beroukhim, Mario Berrios, Samantha Bersani, Johanna Bertl, Miguel Betancourt, Vinayak Bhandari, Shriram G. Bhosle, Andrew V. Biankin, Matthias Bieg, Darell Bigner, Hans Binder, Ewan Birney, Michael Birrer, Nidhan K. Biswas, Bodil Bjerkehagen, Tom Bodenheimer, Lori Boice, Giada Bonizzato, Johann S. De Bono, Arnoud Boot, Moiz S. Bootwalla, Ake Borg, Arndt Borkhardt, Keith A. Boroevich, Ivan Borozan, Christoph Borst, Marcus Bosenberg, Mattia Bosio, Jacqueline Boultwood, Guillaume Bourque, Paul C. Boutros, G. Steven Bova, David T. Bowen, Reanne Bowlby, David D. L. 
Bowtell, Sandrine Boyault, Rich Boyce, Jeffrey Boyd, Alvis Brazma, Paul Brennan, Daniel S. Brewer, Arie B. Brinkman, Robert G. Bristow, Russell R. Broaddus, Jane E. Brock, Malcolm Brock, Annegien Broeks, Angela N. Brooks, Denise Brooks, Benedikt Brors, Søren Brunak, Timothy J. C. Bruxner, Alicia L. Bruzos, Alex Buchanan, Ivo Buchhalter, Christiane Buchholz, Susan Bullman, Hazel Burke, Birgit Burkhardt, Kathleen H. Burns, John Busanovich, Carlos D. Bustamante, Adam P. Butler, Atul J. Butte, Niall J. Byrne, Anne-Lise Børresen-Dale, Samantha J. Caesar-Johnson, Andy Cafferkey, Declan Cahill, Claudia Calabrese, Carlos Caldas, Fabien Calvo, Niedzica Camacho, Peter J. Campbell, Elias Campo, Cinzia Cantù, Shaolong Cao, Thomas E. Carey, Joana Carlevaro-Fita, Rebecca Carlsen, Ivana Cataldo, Mario Cazzola, Jonathan Cebon, Robert Cerfolio, Dianne E. Chadwick, Dimple Chakravarty, Don Chalmers, Calvin Wing Yiu Chan, Kin Chan, Michelle Chan-Seng-Yue, Vishal S. Chandan, David K. Chang, Stephen J. Chanock, Lorraine A. Chantrill, Aurélien Chateigner, Nilanjan Chatterjee, Kazuaki Chayama, Hsiao-Wei Chen, Jieming Chen, Ken Chen, Yiwen Chen, Zhaohong Chen, Andrew D. Cherniack, Jeremy Chien, Yoke-Eng Chiew, Suet-Feung Chin, Juok Cho, Sunghoon Cho, Jung Kyoon Choi, Wan Choi, Christine Chomienne, Zechen Chong, Su Pin Choo, Angela Chou, Angelika N. Christ, Elizabeth L. Christie, Eric Chuah, Carrie Cibulskis, Kristian Cibulskis, Sara Cingarlini, Peter Clapham, Alexander Claviez, Sean Cleary, Nicole Cloonan, Marek Cmero, Colin C. Collins, Ashton A. Connor, Susanna L. Cooke, Colin S. Cooper, Leslie Cope, Vincenzo Corbo, Matthew G. Cordes, Stephen M. Cordner, Isidro Cortés-Ciriano, Kyle Covington, Prue A. Cowin, Brian Craft, David Craft, Chad J. Creighton, Yupeng Cun, Erin Curley, Ioana Cutcutache, Karolina Czajka, Bogdan Czerniak, Rebecca A. Dagg, Ludmila Danilova, Maria Vittoria Davi, Natalie R. Davidson, Helen Davies, Ian J. Davis, Brandi N. Davis-Dusenbery, Kevin J. Dawson, Francisco M. 
De La Vega, Ricardo De Paoli-Iseppi, Timothy Defreitas, Angelo P. Dei Tos, Olivier Delaneau, John A. Demchok, Jonas Demeulemeester, German M. Demidov, Deniz Demircioğlu, Nening M. Dennis, Robert E. Denroche, Stefan C. Dentro, Nikita Desai, Vikram Deshpande, Amit G. Deshwar, Christine Desmedt, Jordi Deu-Pons, Noreen Dhalla, Neesha C. Dhani, Priyanka Dhingra, Rajiv Dhir, Anthony DiBiase, Klev Diamanti, Li Ding, Shuai Ding, Huy Q. Dinh, Luc Dirix, HarshaVardhan Doddapaneni, Nilgun Donmez, Michelle T. Dow, Ronny Drapkin, Oliver Drechsel, Ruben M. Drews, Serge Serge, Tim Dudderidge, Ana Dueso-Barroso, Andrew J. Dunford, Michael Dunn, Lewis Jonathan Dursi, Fraser R. Duthie, Ken Dutton-Regester, Jenna Eagles, Douglas F. Easton, Stuart Edmonds, Paul A. Edwards, Sandra E. Edwards, Rosalind A. Eeles, Anna Ehinger, Juergen Eils, Roland Eils, Adel El-Naggar, Matthew Eldridge, Kyle Ellrott, Serap Erkek, Georgia Escaramis, Shadrielle M. G. Espiritu, Xavier Estivill, Dariush Etemadmoghadam, Jorunn E. Eyfjord, Bishoy M. Faltas, Daiming Fan, Yu Fan, William C. Faquin, Claudiu Farcas, Matteo Fassan, Aquila Fatima, Francesco Favero, Nodirjon Fayzullaev, Ina Felau, Sian Fereday, Martin L. Ferguson, Vincent Ferretti, Lars Feuerbach, Matthew A. Field, J. Lynn Fink, Gaetano Finocchiaro, Cyril Fisher, Matthew W. Fittall, Anna Fitzgerald, Rebecca C. Fitzgerald, Adrienne M. Flanagan, Neil E. Fleshner, Paul Flicek, John A. Foekens, Kwun M. Fong, Nuno A. Fonseca, Christopher S. Foster, Natalie S. Fox, Michael Fraser, Scott Frazer, Milana Frenkel-Morgenstern, William Friedman, Joan Frigola, Catrina C. Fronick, Akihiro Fujimoto, Masashi Fujita, Masashi Fukayama, Lucinda A. Fulton, Robert S. Fulton, Mayuko Furuta, P. Andrew Futreal, Anja Füllgrabe, Stacey B. Gabriel, Steven Gallinger, Carlo Gambacorti-Passerini, Jianjiong Gao, Shengjie Gao, Levi Garraway, Øystein Garred, Erik Garrison, Dale W. Garsed, Nils Gehlenborg, Josep L. L. Gelpi, Joshy George, Daniela S. 
Gerhard, Clarissa Gerhauser, Jeffrey E. Gershenwald, Mark Gerstein, Moritz Gerstung, Gad Getz, Mohammed Ghori, Ronald Ghossein, Nasra H. Giama, Richard A. Gibbs, Bob Gibson, Anthony J. Gill, Pelvender Gill, Dilip D. Giri, Dominik Glodzik, Vincent J. Gnanapragasam, Maria Elisabeth Goebler, Mary J. Goldman, Carmen Gomez, Santiago Gonzalez, Abel Gonzalez-Perez, Dmitry A. Gordenin, James Gossage, Kunihito Gotoh, Ramaswamy Govindan, Dorthe Grabau, Janet S. Graham, Robert C. Grant, Anthony R. Green, Eric Green, Liliana Greger, Nicola Grehan, Sonia Grimaldi, Sean M. Grimmond, Robert L. Grossman, Adam Grundhoff, Gunes Gundem, Qianyun Guo, Manaswi Gupta, Shailja Gupta, Ivo G. Gut, Marta Gut, Jonathan Göke, Gavin Ha, Andrea Haake, David Haan, Siegfried Haas, Kerstin Haase, James E. Haber, Nina Habermann, Faraz Hach, Syed Haider, Natsuko Hama, Freddie C. Hamdy, Anne Hamilton, Mark P. Hamilton, Leng Han, George B. Hanna, Martin Hansmann, Nicholas J. Haradhvala, Olivier Harismendy, Ivon Harliwong, Arif O. Harmanci, Eoghan Harrington, Takanori Hasegawa, David Haussler, Steve Hawkins, Shinya Hayami, Shuto Hayashi, D. Neil Hayes, Stephen J. Hayes, Nicholas K. Hayward, Steven Hazell, Yao He, Allison P. Heath, Simon C. Heath, David Hedley, Apurva M. Hegde, David I. Heiman, Michael C. Heinold, Zachary Heins, Lawrence E. Heisler, Eva Hellstrom-Lindberg, Mohamed Helmy, Seong Gu Heo, Austin J. Hepperla, José María Heredia-Genestar, Carl Herrmann, Peter Hersey, Julian M. Hess, Holmfridur Hilmarsdottir, Jonathan Hinton, Satoshi Hirano, Nobuyoshi Hiraoka, Katherine A. Hoadley, Asger Hobolth, Ermin Hodzic, Jessica I. Hoell, Steve Hoffmann, Oliver Hofmann, Andrea Holbrook, Aliaksei Z. Holik, Michael A. Hollingsworth, Oliver Holmes, Robert A. Holt, Chen Hong, Eun Pyo Hong, Jongwhi H. Hong, Gerrit K. Hooijer, Henrik Hornshøj, Fumie Hosoda, Yong Hou, Volker Hovestadt, William Howat, Alan P. Hoyle, Ralph H. 
Hruban, Jianhong Hu, Taobo Hu, Xing Hua, Kuan-lin Huang, Mei Huang, Mi Ni Huang, Vincent Huang, Yi Huang, Wolfgang Huber, Thomas J. Hudson, Michael Hummel, Jillian A. Hung, David Huntsman, Ted R. Hupp, Jason Huse, Matthew R. Huska, Barbara Hutter, Carolyn M. Hutter, Daniel Hübschmann, Christine A. Iacobuzio-Donahue, Charles David Imbusch, Marcin Imielinski, Seiya Imoto, William B. Isaacs, Keren Isaev, Shumpei Ishikawa, Murat Iskar, S. M. Ashiqul Islam, Michael Ittmann, Sinisa Ivkovic, Jose M. G. Izarzugaza, Jocelyne Jacquemier, Valerie Jakrot, Nigel B. Jamieson, Gun Ho Jang, Se Jin Jang, Joy C. Jayaseelan, Reyka Jayasinghe, Stuart R. Jefferys, Karine Jegalian, Jennifer L. Jennings, Seung-Hyup Jeon, Lara Jerman, Yuan Ji, Wei Jiao, Peter A. Johansson, Amber L. Johns, Jeremy Johns, Rory Johnson, Todd A. Johnson, Clemency Jolly, Yann Joly, Jon G. Jonasson, Corbin D. Jones, David R. Jones, David T. W. Jones, Nic Jones, Steven J. M. Jones, Jos Jonkers, Young Seok Ju, Hartmut Juhl, Jongsun Jung, Malene Juul, Randi Istrup Juul, Sissel Juul, Natalie Jäger, Rolf Kabbe, Andre Kahles, Abdullah Kahraman, Vera B. Kaiser, Hojabr Kakavand, Sangeetha Kalimuthu, Christof von Kalle, Koo Jeong Kang, Katalin Karaszi, Beth Karlan, Rosa Karlić, Dennis Karsch, Katayoon Kasaian, Karin S. Kassahn, Hitoshi Katai, Mamoru Kato, Hiroto Katoh, Yoshiiku Kawakami, Jonathan D. Kay, Stephen H. Kazakoff, Marat D. Kazanov, Maria Keays, Electron Kebebew, Richard F. Kefford, Manolis Kellis, James G. Kench, Catherine J. Kennedy, Jules N. A. Kerssemakers, David Khoo, Vincent Khoo, Narong Khuntikeo, Ekta Khurana, Helena Kilpinen, Hark Kyun Kim, Hyung-Lae Kim, Hyung-Yong Kim, Hyunghwan Kim, Jaegil Kim, Jihoon Kim, Jong K. Kim, Youngwook Kim, Tari A. King, Wolfram Klapper, Kortine Kleinheinz, Leszek J. Klimczak, Stian Knappskog, Michael Kneba, Bartha M. Knoppers, Youngil Koh, Jan Komorowski, Daisuke Komura, Mitsuhiro Komura, Gu Kong, Marcel Kool, Jan O. 
Korbel, Viktoriya Korchina, Andrey Korshunov, Michael Koscher, Roelof Koster, Zsofia Kote-Jarai, Antonios Koures, Milena Kovacevic, Barbara Kremeyer, Helene Kretzmer, Markus Kreuz, Savitri Krishnamurthy, Dieter Kube, Kiran Kumar, Pardeep Kumar, Sushant Kumar, Yogesh Kumar, Ritika Kundra, Kirsten Kübler, Ralf Küppers, Jesper Lagergren, Phillip H. Lai, Peter W. Laird, Sunil R. Lakhani, Christopher M. Lalansingh, Emilie Lalonde, Fabien C. Lamaze, Adam Lambert, Eric Lander, Pablo Landgraf, Luca Landoni, Anita Langerød, Andrés Lanzós, Denis Larsimont, Erik Larsson, Mark Lathrop, Loretta M. S. Lau, Chris Lawerenz, Rita T. Lawlor, Michael S. Lawrence, Alexander J. Lazar, Ana Mijalkovic Lazic, Xuan Le, Darlene Lee, Donghoon Lee, Eunjung Alice Lee, Hee Jin Lee, Jake June-Koo Lee, Jeong-Yeon Lee, Juhee Lee, Ming Ta Michael Lee, Henry Lee-Six, Kjong-Van Lehmann, Hans Lehrach, Dido Lenze, Conrad R. Leonard, Daniel A. Leongamornlert, Ignaty Leshchiner, Louis Letourneau, Ivica Letunic, Douglas A. Levine, Lora Lewis, Tim Ley, Chang Li, Constance H. Li, Haiyan Irene Li, Jun Li, Lin Li, Shantao Li, Siliang Li, Xiaobo Li, Xiaotong Li, Xinyue Li, Yilong Li, Han Liang, Sheng-Ben Liang, Peter Lichter, Pei Lin, Ziao Lin, W. M. Linehan, Ole Christian Lingjærde, Dongbing Liu, Eric Minwei Liu, Fei-Fei Fei Liu, Fenglin Liu, Jia Liu, Xingmin Liu, Julie Livingstone, Dimitri Livitz, Naomi Livni, Lucas Lochovsky, Markus Loeffler, Georgina V. Long, Armando Lopez-Guillermo, Shaoke Lou, David N. Louis, Laurence B. Lovat, Yiling Lu, Yong-Jie Lu, Youyong Lu, Claudio Luchini, Ilinca Lungu, Xuemei Luo, Hayley J. Luxton, Andy G. Lynch, Lisa Lype, Cristina López, Carlos López-Otín, Eric Z. Ma, Yussanne Ma, Gaetan MacGrogan, Shona MacRae, Geoff Macintyre, Tobias Madsen, Kazuhiro Maejima, Andrea Mafficini, Dennis T. Maglinte, Arindam Maitra, Partha P. Majumder, Luca Malcovati, Salem Malikic, Giuseppe Malleo, Graham J. Mann, Luisa Mantovani-Löffler, Kathleen Marchal, Giovanni Marchegiani, Elaine R. 
Mardis, Adam A. Margolin, Maximillian G. Marin, Florian Markowetz, Julia Markowski, Jeffrey Marks, Tomas Marques-Bonet, Marco A. Marra, Luke Marsden, John W. M. Martens, Sancha Martin, Jose I. Martin-Subero, Iñigo Martincorena, Alexander Martinez-Fundichely, Yosef E. Maruvka, R. Jay Mashl, Charlie E. Massie, Thomas J. Matthew, Lucy Matthews, Erik Mayer, Simon Mayes, Michael Mayo, Faridah Mbabaali, Karen McCune, Ultan McDermott, Patrick D. McGillivray, Michael D. McLellan, John D. McPherson, John R. McPherson, Treasa A. McPherson, Samuel R. Meier, Alice Meng, Shaowu Meng, Andrew Menzies, Neil D. Merrett, Sue Merson, Matthew Meyerson, William Meyerson, Piotr A. Mieczkowski, George L. Mihaiescu, Sanja Mijalkovic, Tom Mikkelsen, Michele Milella, Linda Mileshkin, Christopher A. Miller, David K. Miller, Jessica K. Miller, Gordon B. Mills, Ana Milovanovic, Sarah Minner, Marco Miotto, Gisela Mir Arnau, Lisa Mirabello, Chris Mitchell, Thomas J. Mitchell, Satoru Miyano, Naoki Miyoshi, Shinichi Mizuno, Fruzsina Molnár-Gábor, Malcolm J. Moore, Richard A. Moore, Sandro Morganella, Quaid D. Morris, Carl Morrison, Lisle E. Mose, Catherine D. Moser, Ferran Muiños, Loris Mularoni, Andrew J. Mungall, Karen Mungall, Elizabeth A. Musgrove, Ville Mustonen, David Mutch, Francesc Muyas, Donna M. Muzny, Alfonso Muñoz, Jerome Myers, Ola Myklebost, Peter Möller, Genta Nagae, Adnan M. Nagrial, Hardeep K. Nahal-Bose, Hitoshi Nakagama, Hidewaki Nakagawa, Hiromi Nakamura, Toru Nakamura, Kaoru Nakano, Tannistha Nandi, Jyoti Nangalia, Mia Nastic, Arcadi Navarro, Fabio C. P. Navarro, David E. Neal, Gerd Nettekoven, Felicity Newell, Steven J. Newhouse, Yulia Newton, Alvin Wei Tian Ng, Anthony Ng, Jonathan Nicholson, David Nicol, Yongzhan Nie, G. Petur Nielsen, Morten Muhlig Nielsen, Serena Nik-Zainal, Michael S. Noble, Katia Nones, Paul A. Northcott, Faiyaz Notta, Brian D. O’Connor, Peter O’Donnell, Maria O’Donovan, Sarah O’Meara, Brian Patrick O’Neill, J. 
Robert O’Neill, David Ocana, Angelica Ochoa, Layla Oesper, Christopher Ogden, Hideki Ohdan, Kazuhiro Ohi, Lucila Ohno-Machado, Karin A. Oien, Akinyemi I. Ojesina, Hidenori Ojima, Takuji Okusaka, Larsson Omberg, Choon Kiat Ong, Stephan Ossowski, German Ott, B. F. Francis Ouellette, Christine P’ng, Marta Paczkowska, Salvatore Paiella, Chawalit Pairojkul, Marina Pajic, Qiang Pan-Hammarström, Elli Papaemmanuil, Irene Papatheodorou, Nagarajan Paramasivam, Ji Wan Park, Joong-Won Park, Keunchil Park, Kiejung Park, Peter J. Park, Joel S. Parker, Simon L. Parsons, Harvey Pass, Danielle Pasternack, Alessandro Pastore, Ann-Marie Patch, Iris Pauporté, Antonio Pea, John V. Pearson, Chandra Sekhar Pedamallu, Jakob Skou Pedersen, Paolo Pederzoli, Martin Peifer, Nathan A. Pennell, Charles M. Perou, Marc D. Perry, Gloria M. Petersen, Myron Peto, Nicholas Petrelli, Robert Petryszak, Stefan M. Pfister, Mark Phillips, Oriol Pich, Hilda A. Pickett, Todd D. Pihl, Nischalan Pillay, Sarah Pinder, Mark Pinese, Andreia V. Pinho, Esa Pitkänen, Xavier Pivot, Elena Piñeiro-Yáñez, Laura Planko, Christoph Plass, Paz Polak, Tirso Pons, Irinel Popescu, Olga Potapova, Aparna Prasad, Shaun R. Preston, Manuel Prinz, Antonia L. Pritchard, Stephenie D. Prokopec, Elena Provenzano, Xose S. Puente, Sonia Puig, Montserrat Puiggròs, Sergio Pulido-Tamayo, Gulietta M. Pupo, Colin A. Purdie, Michael C. Quinn, Raquel Rabionet, Janet S. Rader, Bernhard Radlwimmer, Petar Radovic, Benjamin Raeder, Keiran M. Raine, Manasa Ramakrishna, Kamna Ramakrishnan, Suresh Ramalingam, Benjamin J. Raphael, W. Kimryn Rathmell, Tobias Rausch, Guido Reifenberger, Jüri Reimand, Jorge Reis-Filho, Victor Reuter, Iker Reyes-Salazar, Matthew A. Reyna, Sheila M. Reynolds, Esther Rheinbay, Yasser Riazalhosseini, Andrea L. Richardson, Julia Richter, Matthew Ringel, Markus Ringnér, Yasushi Rino, Karsten Rippe, Jeffrey Roach, Lewis R. Roberts, Nicola D. Roberts, Steven A. Roberts, A. Gordon Robertson, Alan J. 
Robertson, Javier Bartolomé Rodriguez, Bernardo Rodriguez-Martin, F. Germán Rodríguez-González, Michael H. A. Roehrl, Marius Rohde, Hirofumi Rokutan, Gilles Romieu, Ilse Rooman, Tom Roques, Daniel Rosebrock, Mara Rosenberg, Philip C. Rosenstiel, Andreas Rosenwald, Edward W. Rowe, Romina Royo, Steven G. Rozen, Yulia Rubanova, Mark A. Rubin, Carlota Rubio-Perez, Vasilisa A. Rudneva, Borislav C. Rusev, Andrea Ruzzenente, Gunnar Rätsch, Radhakrishnan Sabarinathan, Veronica Y. Sabelnykova, Sara Sadeghi, S. Cenk Sahinalp, Natalie Saini, Mihoko Saito-Adachi, Gordon Saksena, Adriana Salcedo, Roberto Salgado, Leonidas Salichos, Richard Sallari, Charles Saller, Roberto Salvia, Michelle Sam, Jaswinder S. Samra, Francisco Sanchez-Vega, Chris Sander, Grant Sanders, Rajiv Sarin, Iman Sarrafi, Aya Sasaki-Oku, Torill Sauer, Guido Sauter, Robyn P. M. Saw, Maria Scardoni, Christopher J. Scarlett, Aldo Scarpa, Ghislaine Scelo, Dirk Schadendorf, Jacqueline E. Schein, Markus B. Schilhabel, Matthias Schlesner, Thorsten Schlomm, Heather K. Schmidt, Sarah-Jane Schramm, Stefan Schreiber, Nikolaus Schultz, Steven E. Schumacher, Roland F. Schwarz, Richard A. Scolyer, David Scott, Ralph Scully, Raja Seethala, Ayellet V. Segre, Iris Selander, Colin A. Semple, Yasin Senbabaoglu, Subhajit Sengupta, Elisabetta Sereni, Stefano Serra, Dennis C. Sgroi, Mark Shackleton, Nimish C. Shah, Sagedeh Shahabi, Catherine A. Shang, Ping Shang, Ofer Shapira, Troy Shelton, Ciyue Shen, Hui Shen, Rebecca Shepherd, Ruian Shi, Yan Shi, Yu-Jia Shiah, Tatsuhiro Shibata, Juliann Shih, Eigo Shimizu, Kiyo Shimizu, Seung Jun Shin, Yuichi Shiraishi, Tal Shmaya, Ilya Shmulevich, Solomon I. Shorser, Charles Short, Raunak Shrestha, Suyash S. Shringarpure, Craig Shriver, Shimin Shuai, Nikos Sidiropoulos, Reiner Siebert, Anieta M. Sieuwerts, Lina Sieverling, Sabina Signoretti, Katarzyna O. Sikora, Michele Simbolo, Ronald Simon, Janae V. Simons, Jared T. Simpson, Peter T. 
Simpson, Samuel Singer, Nasa Sinnott-Armstrong, Payal Sipahimalani, Tara J. Skelly, Marcel Smid, Jaclyn Smith, Karen Smith-McCune, Nicholas D. Socci, Heidi J. Sofia, Matthew G. Soloway, Lei Song, Anil K. Sood, Sharmila Sothi, Christos Sotiriou, Cameron M. Soulette, Paul N. Span, Paul T. Spellman, Nicola Sperandio, Andrew J. Spillane, Oliver Spiro, Jonathan Spring, Johan Staaf, Peter F. Stadler, Peter Staib, Stefan G. Stark, Lucy Stebbings, Ólafur Andri Stefánsson, Oliver Stegle, Lincoln D. Stein, Alasdair Stenhouse, Chip Stewart, Stephan Stilgenbauer, Miranda D. Stobbe, Michael R. Stratton, Jonathan R. Stretch, Adam J. Struck, Joshua M. Stuart, Henk G. Stunnenberg, Hong Su, Xiaoping Su, Ren X. Sun, Stephanie Sungalee, Hana Susak, Akihiro Suzuki, Fred Sweep, Monika Szczepanowski, Holger Sültmann, Takashi Yugawa, Angela Tam, David Tamborero, Benita Kiat Tee Tan, Donghui Tan, Patrick Tan, Hiroko Tanaka, Hirokazu Taniguchi, Tomas J. Tanskanen, Maxime Tarabichi, Roy Tarnuzzer, Patrick Tarpey, Morgan L. Taschuk, Kenji Tatsuno, Simon Tavaré, Darrin F. Taylor, Amaro Taylor-Weiner, Jon W. Teague, Bin Tean Teh, Varsha Tembe, Javier Temes, Kevin Thai, Sarah P. Thayer, Nina Thiessen, Gilles Thomas, Sarah Thomas, Alan Thompson, Alastair M. Thompson, John F. F. Thompson, R. Houston Thompson, Heather Thorne, Leigh B. Thorne, Adrian Thorogood, Grace Tiao, Nebojsa Tijanic, Lee E. Timms, Roberto Tirabosco, Marta Tojo, Stefania Tommasi, Christopher W. Toon, Umut H. Toprak, David Torrents, Giampaolo Tortora, Jörg Tost, Yasushi Totoki, David Townend, Nadia Traficante, Isabelle Treilleux, Jean-Rémi Trotta, Lorenz H. P. Trümper, Ming Tsao, Tatsuhiko Tsunoda, Jose M. C. Tubio, Olga Tucker, Richard Turkington, Daniel J. Turner, Andrew Tutt, Masaki Ueno, Naoto T. Ueno, Christopher Umbricht, Husen M. Umer, Timothy J. Underwood, Lara Urban, Tomoko Urushidate, Tetsuo Ushiku, Liis Uusküla-Reimand, Alfonso Valencia, David J. Van Den Berg, Steven Van Laere, Peter Van Loo, Erwin G. 
Van Meir, Gert G. Van den Eynden, Theodorus Van der Kwast, Naveen Vasudev, Miguel Vazquez, Ravikiran Vedururu, Umadevi Veluvolu, Shankar Vembu, Lieven P. C. Verbeke, Peter Vermeulen, Clare Verrill, Alain Viari, David Vicente, Caterina Vicentini, K. VijayRaghavan, Juris Viksna, Ricardo E. Vilain, Izar Villasante, Anne Vincent-Salomon, Tapio Visakorpi, Douglas Voet, Paresh Vyas, Ignacio Vázquez-García, Nick M. Waddell, Nicola Waddell, Claes Wadelius, Lina Wadi, Rabea Wagener, Jeremiah A. Wala, Jian Wang, Jiayin Wang, Linghua Wang, Qi Wang, Wenyi Wang, Yumeng Wang, Zhining Wang, Paul M. Waring, Hans-Jörg Warnatz, Jonathan Warrell, Anne Y. Warren, Sebastian M. Waszak, David C. Wedge, Dieter Weichenhan, Paul Weinberger, John N. Weinstein, Joachim Weischenfeldt, Daniel J. Weisenberger, Ian Welch, Michael C. Wendl, Johannes Werner, Justin P. Whalley, David A. Wheeler, Hayley C. Whitaker, Dennis Wigle, Matthew D. Wilkerson, Ashley Williams, James S. Wilmott, Gavin W. Wilson, Julie M. Wilson, Richard K. Wilson, Boris Winterhoff, Jeffrey A. Wintersinger, Maciej Wiznerowicz, Stephan Wolf, Bernice H. Wong, Tina Wong, Winghing Wong, Youngchoon Woo, Scott Wood, Bradly G. Wouters, Adam J. Wright, Derek W. Wright, Mark H. Wright, Chin-Lee Wu, Dai-Ying Wu, Guanming Wu, Jianmin Wu, Kui Wu, Yang Wu, Zhenggang Wu, Liu Xi, Tian Xia, Qian Xiang, Xiao Xiao, Rui Xing, Heng Xiong, Qinying Xu, Yanxun Xu, Hong Xue, Shinichi Yachida, Sergei Yakneen, Rui Yamaguchi, Takafumi N. Yamaguchi, Masakazu Yamamoto, Shogo Yamamoto, Hiroki Yamaue, Fan Yang, Huanming Yang, Jean Y. Yang, Liming Yang, Lixing Yang, Shanlin Yang, Tsun-Po Yang, Yang Yang, Xiaotong Yao, Marie-Laure Yaspo, Lucy Yates, Christina Yau, Chen Ye, Kai Ye, Venkata D. Yellapantula, Christopher J. Yoon, Sung-Soo Yoon, Fouad Yousif, Jun Yu, Kaixian Yu, Willie Yu, Yingyan Yu, Ke Yuan, Yuan Yuan, Denis Yuen, Christina K. Yung, Olga Zaikova, Jorge Zamora, Marc Zapatka, Jean C. 
Zenklusen, Thorsten Zenz, Nikolajs Zeps, Cheng-Zhong Zhang, Fan Zhang, Hailei Zhang, Hongwei Zhang, Hongxin Zhang, Jiashan Zhang, Jing Zhang, Junjun Zhang, Xiuqing Zhang, Xuanping Zhang, Yan Zhang, Zemin Zhang, Zhongming Zhao, Liangtao Zheng, Xiuqing Zheng, Wanding Zhou, Yong Zhou, Bin Zhu, Hongtu Zhu, Jingchun Zhu, Shida Zhu, Lihua Zou, Xueqing Zou, Anna deFazio, Nicholas van As, Carolien H. M. van Deurzen, Marc J. van de Vijver, L. van’t Veer, and Christian von Mering
Supplementary information is available for this paper at 10.1038/s41588-019-0557-x.
References
- 1. Schon EA, DiMauro S, Hirano M. Human mitochondrial DNA: roles of inherited and somatic mutations. Nat. Rev. Genet. 2012;13:878–890. doi: 10.1038/nrg3275.
- 2. Smeitink J, van den Heuvel L, DiMauro S. The genetics and pathology of oxidative phosphorylation. Nat. Rev. Genet. 2001;2:342–352. doi: 10.1038/35072063.
- 3. Anderson S, et al. Sequence and organization of the human mitochondrial genome. Nature. 1981;290:457–465. doi: 10.1038/290457a0.
- 4. Brandon M, Baldi P, Wallace DC. Mitochondrial mutations in cancer. Oncogene. 2006;25:4647–4662. doi: 10.1038/sj.onc.1209607.
- 5. Zong WX, Rabinowitz JD, White E. Mitochondria and cancer. Mol. Cell. 2016;61:667–676. doi: 10.1016/j.molcel.2016.02.011.
- 6. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144:646–674. doi: 10.1016/j.cell.2011.02.013.
- 7. Hengartner MO. The biochemistry of apoptosis. Nature. 2000;407:770–776. doi: 10.1038/35037710.
- 8. Larman TC, et al. Spectrum of somatic mitochondrial mutations in five cancers. Proc. Natl Acad. Sci. USA. 2012;109:14087–14091. doi: 10.1073/pnas.1211502109.
- 9. Ju YS, et al. Origins and functional consequences of somatic mitochondrial DNA mutations in human cancer. eLife. 2014;3:e02935. doi: 10.7554/eLife.02935.
- 10. Stewart JB, et al. Simultaneous DNA and RNA mapping of somatic mitochondrial mutations across diverse human cancers. PLoS Genet. 2015;11:e1005333. doi: 10.1371/journal.pgen.1005333.
- 11. Ju YS, et al. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells. Genome Res. 2015;25:814–824. doi: 10.1101/gr.190470.115.
- 12. Reznik E, et al. Mitochondrial DNA copy number variation across human cancers. eLife. 2016;5:e10769. doi: 10.7554/eLife.10769.
- 13. Hopkins JF, et al. Mitochondrial mutations drive prostate cancer aggression. Nat. Commun. 2017;8:656. doi: 10.1038/s41467-017-00377-y.
- 14. The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature. 2020. doi: 10.1038/s41586-020-1969-6.
- 15. The Cancer Genome Atlas Research Network, et al. The Cancer Genome Atlas pan-cancer analysis project. Nat. Genet. 2013;45:1113–1120.
- 16. Li Y, et al. Patterns of somatic structural variation in human cancer genomes. Nature. 2020. doi: 10.1038/s41586-019-1913-9.
- 17. Rheinbay E, et al. Analyses of non-coding somatic drivers in 2,693 cancer whole genomes. Nature. 2020. doi: 10.1038/s41586-020-1965-x.
- 18. Dayama G, Emery SB, Kidd JM, Mills RE. The genomic landscape of polymorphic human nuclear mitochondrial insertions. Nucleic Acids Res. 2014;42:12640–12649. doi: 10.1093/nar/gku1038.
- 19. Costello M, et al. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res. 2013;41:e67. doi: 10.1093/nar/gks1443.
- 20. Lawrence MS, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature. 2013;499:214–218. doi: 10.1038/nature12213.
- 21. Kennedy SR, Salk JJ, Schmitt MW, Loeb LA. Ultra-sensitive sequencing reveals an age-related increase in somatic mitochondrial mutations that are inconsistent with oxidative damage. PLoS Genet. 2013;9:e1003794. doi: 10.1371/journal.pgen.1003794.
- 22. Tanaka M, Ozawa T. Strand asymmetry in human mitochondrial DNA mutations. Genomics. 1994;22:327–335. doi: 10.1006/geno.1994.1391.
- 23. Zheng W, Khrapko K, Coller HA, Thilly WG, Copeland WC. Origins of human mitochondrial point mutations as DNA polymerase γ-mediated errors. Mutat. Res. 2006;599:11–20. doi: 10.1016/j.mrfmmm.2005.12.012.
- 24. Alexandrov LB, et al. Clock-like mutational processes in human somatic cells. Nat. Genet. 2015;47:1402–1407. doi: 10.1038/ng.3441.
- 25. Coller HA, et al. High frequency of homoplasmic mitochondrial DNA mutations in human tumors can be explained without selection. Nat. Genet. 2001;28:147–150. doi: 10.1038/88859.
- 26. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487:330–337. doi: 10.1038/nature11252.
- 27. The Cancer Genome Atlas Research Network, et al. Integrated genomic characterization of endometrial carcinoma. Nature. 2013;497:67–73.
- 28. Nik-Zainal S, et al. Mutational processes molding the genomes of 21 breast cancers. Cell. 2012;149:979–993. doi: 10.1016/j.cell.2012.04.024.
- 29. Pak JW, Vang F, Johnson C, McKenzie D, Aiken JM. MtDNA point mutations are associated with deletion mutations in aged rat. Exp. Gerontol. 2005;40:209–218. doi: 10.1016/j.exger.2004.12.005.
- 30. Davis CF, et al. The somatic genomic landscape of chromophobe renal cell carcinoma. Cancer Cell. 2014;26:319–330. doi: 10.1016/j.ccr.2014.07.014.
- 31. Caro P, et al. Mitochondrial DNA sequences are present inside nuclear DNA in rat tissues and increase with age. Mitochondrion. 2010;10:479–486. doi: 10.1016/j.mito.2010.05.004.
- 32. Chen D, Xue W, Xiang J. The intra-nucleus integration of mitochondrial DNA (mtDNA) in cervical mucosa cells and its relation with c-myc expression. J. Exp. Clin. Cancer Res. 2008;27:36. doi: 10.1186/1756-9966-27-36.
- 33. Srinivasainagendra V, et al. Migration of mitochondrial DNA in the nuclear genome of colorectal adenocarcinoma. Genome Med. 2017;9:31. doi: 10.1186/s13073-017-0420-6.
- 34. Cui H, et al. Association of decreased mitochondrial DNA content with the progression of colorectal cancer. BMC Cancer. 2013;13:110. doi: 10.1186/1471-2407-13-110.
- 35. Dickinson A, et al. The regulation of mitochondrial DNA copy number in glioblastoma cells. Cell Death Differ. 2013;20:1644–1653. doi: 10.1038/cdd.2013.115.
- 36. Van Osch FH, et al. Mitochondrial DNA copy number in colorectal cancer: between tissue comparisons, clinicopathological characteristics and survival. Carcinogenesis. 2015;36:1502–1510. doi: 10.1093/carcin/bgv151.
- 37. McBride HM, Neuspiel M, Wasiak S. Mitochondria: more than just a powerhouse. Curr. Biol. 2006;16:R551–R560. doi: 10.1016/j.cub.2006.06.054.
- 38. Vyas S, Zaganjor E, Haigis MC. Mitochondria and cancer. Cell. 2016;166:555–566. doi: 10.1016/j.cell.2016.07.002.
- 39. Lott MT, et al. mtDNA variation and analysis using mitomap and mitomaster. Curr. Protoc. Bioinformatics. 2013;44:1.23.1–1.23.26. doi: 10.1002/0471250953.bi0123s44.
- 40. Mercer TR, et al. The human mitochondrial transcriptome. Cell. 2011;146:645–658. doi: 10.1016/j.cell.2011.06.051.
- 41. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559.
- 42. Subramanian A, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
- 43. Koczor CA, et al. Mitochondrial DNA damage initiates a cell cycle arrest by a Chk2-associated mechanism in mammalian cells. J. Biol. Chem. 2009;284:36191–36201. doi: 10.1074/jbc.M109.036020.
- 44. Ojala D, Montoya J, Attardi G. tRNA punctuation model of RNA processing in human mitochondria. Nature. 1981;290:470–474. doi: 10.1038/290470a0.
- 45. Joshi S, et al. The genomic landscape of renal oncocytoma identifies a metabolic barrier to tumorigenesis. Cell Rep. 2015;13:1895–1908. doi: 10.1016/j.celrep.2015.10.059.
- 46. Gasparre G, Romeo G, Rugolo M, Porcelli AM. Learning from oncocytic tumors: why choose inefficient mitochondria? Biochim. Biophys. Acta. 2011;1807:633–642. doi: 10.1016/j.bbabio.2010.08.006.
- 47. Clayton DA, Doda JN, Friedberg EC. The absence of a pyrimidine dimer repair mechanism in mammalian mitochondria. Proc. Natl Acad. Sci. USA. 1974;71:2777–2781. doi: 10.1073/pnas.71.7.2777.
- 48. Haradhvala NJ, et al. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell. 2016;164:538–549. doi: 10.1016/j.cell.2015.12.050.
- 49. Bess AS, Crocker TL, Ryde IT, Meyer JN. Mitochondrial dynamics and autophagy aid in removal of persistent mitochondrial DNA damage in Caenorhabditis elegans. Nucleic Acids Res. 2012;40:7916–7931. doi: 10.1093/nar/gks532.
- 50. Koboldt DC, et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 2012;22:568–576. doi: 10.1101/gr.129684.111.
- 51. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res. 2002;12:656–664. doi: 10.1101/gr.229202.
- 52. Simone D, Calabrese FM, Lang M, Gasparre G, Attimonelli M. The reference human nuclear mitochondrial sequences compilation validated and implemented on the UCSC genome browser. BMC Genomics. 2011;12:517. doi: 10.1186/1471-2164-12-517.
- 53. Van Loo P, et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA. 2010;107:16910–16915. doi: 10.1073/pnas.1009843107.
- 54. Van Allen EM, et al. Whole-exome sequencing and clinical interpretation of formalin-fixed, paraffin-embedded tumor samples to guide precision cancer medicine. Nat. Med. 2014;20:682–688. doi: 10.1038/nm.3559.
Supplementary Materials
Supplementary Figs. 1–22 and Tables 1–4
Data Availability Statement
Somatic and germline variant calls, mutational signatures, subclonal reconstructions, transcript abundance, splice calls and other core data generated by the ICGC/TCGA PCAWG Consortium are described here14 and are available for download at https://dcc.icgc.org/releases/PCAWG. Additional information on accessing the data, including raw read files, can be found at https://docs.icgc.org/pcawg/data/. In accordance with the data access policies of the ICGC and TCGA projects, most molecular, clinical and specimen data are in an open tier that does not require access approval. To access information with the potential to identify individuals, such as germline alleles and underlying sequencing data, researchers will need to apply to the TCGA Data Access Committee via the Database of Genotypes and Phenotypes (https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login) for access to the TCGA portion of the dataset, and to the ICGC Data Access Compliance Office (http://icgc.org/daco) for the ICGC portion. In addition, to access somatic single-nucleotide variants derived from TCGA donors, researchers will also need to obtain Database of Genotypes and Phenotypes authorization. Derived datasets described specifically in this study are available from the TCMA data portal (http://bioinformatics.mdanderson.org/main/TCMA:Overview).
The core computational pipelines used by the PCAWG Consortium for alignment, quality control and variant calling are available to the public at https://dockstore.org/search?search=pcawg under GNU General Public License version 3.0, which allows for reuse and distribution.






