Abstract
Circular extrachromosomal DNA (ecDNA) in patient tumors is an important driver of oncogenic gene expression, evolution of drug resistance and poor patient outcomes. Applying computational methods for the detection and reconstruction of ecDNA across a retrospective cohort of 481 medulloblastoma tumors from 468 patients, we identify circular ecDNA in 82 patients (18%). Patients with ecDNA-positive medulloblastoma were more than twice as likely to relapse and three times as likely to die within 5 years of diagnosis. A subset of tumors harbored multiple ecDNA lineages, each containing distinct amplified oncogenes. Multimodal sequencing, imaging and CRISPR inhibition experiments in medulloblastoma models reveal intratumoral heterogeneity of ecDNA copy number per cell and frequent putative ‘enhancer rewiring’ events on ecDNA. This study reveals the frequency and diversity of ecDNA in medulloblastoma, stratified into molecular subgroups, and suggests copy number heterogeneity and enhancer rewiring as oncogenic features of ecDNA.
Subject terms: CNS cancer, Gene regulation, Population genetics, Epigenomics
Circular extrachromosomal DNA in high-risk medulloblastoma contributes to tumor heterogeneity and associates with relapse and survival. Enhancer rewiring events involving known oncogenes are frequent and affect transcription and proliferation.
Main
Circular ecDNA molecules, also known as double minutes, have been described in isolated tumor and tumor-derived cells since the 1960s (ref. 1). Recent results have shown ecDNA to be far more common in human cancer than previously assumed2,3. Commonly defined as circular, acentric chromatin bodies tens of kilobases to tens of megabase pairs (Mbp) in length, circular ecDNA is now understood to be a major contributor to intratumoral heterogeneity and is implicated in oncogenesis, tumor evolution and the evolution of drug resistance4–7. Circular ecDNA is a frequent form of high-copy oncogene amplification3 and a prognostic biomarker in many tumor types8–10, and it allows amplified oncogenes to ‘hijack’ noncoding regulatory enhancers that would be inaccessible under normal karyotypic topology11–13.
Medulloblastomas were represented among the first patient case reports describing ecDNA1. Few effective targeted molecular treatments exist for medulloblastoma, and the current standard of care carries a substantial risk of cognitive disorders, neurological damage and secondary malignancy14. There are four major molecular subgroups of medulloblastoma: WNT, Sonic hedgehog (SHH), Group 3 and Group 4 (ref. 15). Prognosis is especially poor for a subset of MYC-activated Group 3 tumors and for TP53-mutant SHH subgroup tumors16–19. The mutational landscape of medulloblastoma subgroups has recently been characterized18; however, the frequency of ecDNA in the different molecular medulloblastoma subgroups, the amplified genomic regions and their impact on patient outcomes are not well understood. Furthermore, the contribution of ecDNA to intertumoral and intratumoral heterogeneity as well as the potential role for enhancer hijacking by ecDNA in medulloblastoma remain open questions. Here, we resolve ecDNA content and structure using next-generation sequencing, optical mapping, CRISPR-CATCH and microscopy of ecDNA in medulloblastoma cells. We estimate intratumoral heterogeneity using computational approaches applied to microscopy and single-cell sequencing data. We perform epigenetic profiling to examine the transcriptional regulatory circuitry of ecDNA sequences and interrogate functional transcriptional enhancers on an ecDNA using CRISPRi. Our results demonstrate that ecDNA confers shorter survival for a subset of patients with medulloblastoma and illuminate molecular roles for ecDNA in medulloblastoma pathogenesis.
ecDNA amplifies medulloblastoma oncogenes
To examine the landscape of ecDNA in medulloblastoma, we accessed whole-genome sequencing (WGS) data of tumors available in three cloud cancer genomics platforms20–22. In addition, we included 43 tumors from a previous proteomic analysis23 and 8 tumors diagnosed at the Rady Children’s Hospital, San Diego. In total, our retrospective cohort comprised 481 tumor biopsies from 468 patients. Using DNA fingerprint analysis, we ensured that the combined cohort contained no duplicates. Clinical metadata were available for most patients and included age at diagnosis, sex, medulloblastoma molecular subgroup and survival (Supplementary Tables 1–4). Using the AmpliconArchitect algorithm24, we detected 102 putative ecDNA sequences in tumor samples from 82 out of 468 (18%) patients. By molecular subgroup, patients with ecDNA-positive (ecDNA+) tumors were distributed as follows: WNT, 0 out of 22; SHH, 30 out of 112 (27%); Group 3, 19 out of 107 (18%); and Group 4, 26 out of 181 (14%) (Fig. 1a). SHH subgroup tumors were significantly more likely to contain ecDNA than tumors from the other medulloblastoma subgroups (χ2 = 7.66, P = 0.006). Among the ecDNA-amplified genes occurring in two or more samples in this cohort were known or suspected medulloblastoma oncogenes MYC, MYCN, MYCL, TERT, GLI2, CCND2 (ref. 25), PPM1D (WIP1) (ref. 26) and ACVR2B (ref. 27); genes encoding DNA repair machinery (RAD51AP1 and RAD21); and genes encoding TP53 pathway inhibitors (PPM1D (ref. 28) and CDK6 (ref. 29)) (Fig. 1b). Of MYC oncogene family amplifications, 19 out of 23 MYCN, 11 out of 18 MYC and 3 out of 3 MYCL1 were on ecDNA, as were all amplifications of CCND2, GLI2 and TERT.
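The subgroup comparison above can be checked by hand. The following is a minimal standard-library sketch, assuming the test compares SHH against the other three subgroups pooled and applies Yates' continuity correction (the default behavior of `scipy.stats.chi2_contingency` for a 2 × 2 table); the function name is illustrative and is not taken from the study's code.

```python
import math

def chi2_2x2_yates(a, b, c, d):
    """Chi-squared test of independence for [[a, b], [c, d]] with Yates' correction (1 d.o.f.)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator; clamp at zero so the correction never flips the sign.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-squared variable with 1 d.o.f.: P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from the text: SHH, 30 of 112 ecDNA+; WNT, Group 3 and Group 4 pooled.
shh_pos, shh_total = 30, 112
other_pos, other_total = 0 + 19 + 26, 22 + 107 + 181
chi2, p = chi2_2x2_yates(shh_pos, shh_total - shh_pos,
                         other_pos, other_total - other_pos)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")  # chi2 = 7.66, P = 0.006
```

Under these assumptions the sketch reproduces the reported statistic from the subgroup counts alone.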
ecDNA predicts poor prognosis in medulloblastoma
To evaluate ecDNA as a potential prognostic marker in medulloblastoma, we performed survival analyses across patients for whom clinical metadata were available. Patients with ecDNA+ tumors had significantly worse overall and progression-free five-year survival compared to patients with ecDNA-negative (ecDNA−) tumors (log-rank test, P < 0.005; Fig. 1c and Extended Data Fig. 1a). Stratified by molecular subgroup, patients with ecDNA+ tumors had worse overall survival in the SHH, Group 3 and Group 4 subgroups (P < 0.05; Fig. 1d–f and Extended Data Fig. 1b–d). Survival of patients in the WNT subgroup was not analyzed because no WNT tumors in our patient cohort were ecDNA+. To determine whether patients with ecDNA+ tumors had worse outcomes than patients with tumors harboring other types of focal somatic copy number amplification, we stratified patients by the topology of the amplification(s) present in the tumor genomes3. As expected, patients with ecDNA+ tumors had the poorest outcomes, significantly (P < 0.005) worse than patients without focal somatic copy number amplification or with linear amplifications (Extended Data Fig. 2).
To further estimate the prognostic value of ecDNA, we conducted Cox proportional hazards regressions, controlling for sex, age and molecular subgroup. Patients with ecDNA+ tumors were at greater estimated risk for progression (hazard ratio, 2.36; P < 0.005) and mortality (hazard ratio, 2.99; P < 0.005) than patients with ecDNA− tumors (Fig. 1g, Extended Data Fig. 1e and Supplementary Tables 5 and 6).
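For orientation, the model behind these hazard ratios can be written in its standard form. The covariate symbols below are notation assumed here for illustration; the fitted coefficients are reported in Supplementary Tables 5 and 6.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,
  \exp\!\bigl(\beta_{\mathrm{ecDNA}}\,x_{\mathrm{ecDNA}}
            + \beta_{\mathrm{sex}}\,x_{\mathrm{sex}}
            + \beta_{\mathrm{age}}\,x_{\mathrm{age}}
            + \boldsymbol{\beta}_{\mathrm{sub}}^{\top}\mathbf{x}_{\mathrm{sub}}\bigr),
\qquad
\mathrm{HR}_{\mathrm{ecDNA}}
  = \frac{h(t \mid x_{\mathrm{ecDNA}} = 1)}{h(t \mid x_{\mathrm{ecDNA}} = 0)}
  = e^{\beta_{\mathrm{ecDNA}}}
```

Here h_0(t) is the unspecified baseline hazard and x_sub is an indicator vector for molecular subgroup. Read this way, the reported hazard ratios of 2.36 and 2.99 correspond to coefficients of ln 2.36 ≈ 0.86 and ln 2.99 ≈ 1.10 for ecDNA status, holding the other covariates fixed.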
TP53 alterations are associated with ecDNA in SHH medulloblastoma tumors
The tumor suppressor protein p53 (encoded by TP53) regulates DNA damage sensing, cell cycle arrest and apoptosis, and is frequently affected by somatic mutations and pathogenic germline variants in SHH medulloblastoma19,30,31. Moreover, SHH medulloblastomas with inactivating TP53 mutations are known to be associated with chromothripsis17, the catastrophic shattering of a chromosome that precedes ecDNA formation in some cell line models32,33. To test whether TP53 mutations were associated with the presence of ecDNA, we accessed somatic and germline TP53 mutation status of 92 SHH medulloblastomas. TP53 alterations were enriched in ecDNA+ SHH subgroup tumors (12 out of 23, 52%) compared to the ecDNA− SHH subgroup (2 out of 69, 3%; Fisher exact test, P = 1.3 × 10^−7). We did not find a significant association between TP53 alterations and ecDNA in the other subgroups or across the entire cohort, suggesting that in medulloblastoma, a possible functional relationship between TP53 alterations and ecDNA is restricted to the SHH subgroup. We reasoned that the established effect of TP53 mutation on the survival of patients with medulloblastoma34 may be mediated, at least partially, by ecDNA (Extended Data Fig. 3). To test this hypothesis, we conducted mediation analysis using the Baron–Kenny approach35. Accelerated failure time (AFT) regressions of progression-free survival on TP53 mutation and ecDNA status suggest that much of the effect of TP53 mutation on prognosis can be explained by an effect of ecDNA and by the frequent co-occurrence of ecDNA in TP53-mutant tumors (Supplementary Tables 7 and 8 and Supplementary Note 1).
To evaluate whether there is a TP53-independent effect of ecDNA on survival, we performed Cox regression, including TP53 alteration as a covariate and controlling for collinearity. The effect of ecDNA on survival remains significant but diminished when we include TP53 alteration as a covariate in our Cox models (hazard ratio for progression-free survival, 1.87, P = 0.01; hazard ratio for overall survival, 2.32, P < 0.005; Extended Data Fig. 4 and Supplementary Tables 9 and 10), indicating that there is an effect of ecDNA on survival that cannot be explained by TP53 mutation alone. Such an effect may be explainable by a TP53-independent mechanism of ecDNA formation or by inactivation of the TP53 pathway by other means, such as CDKN2A deletion or PPM1D, CDK6, MDM4 or MDM2 amplification36. In our patient cohort, we observe nine such amplifications on ecDNA across all subgroups (Fig. 1b). Although causality cannot be inferred from these data alone, these survival analyses identify TP53 alteration and ecDNA as clinically relevant biomarkers for a subset of highly aggressive SHH medulloblastoma tumors.
Multiple ecDNA lineages coexist in some medulloblastomas
Our patient cohort included 16 medulloblastoma tumors with multiple distinct ecDNA sequences (Supplementary Table 11). This set included a SHH medulloblastoma primary tumor with heterozygous somatic TP53 mutation37 (RCMB56-ht), which we established as an orthotopic patient-derived xenograft mouse model (RCMB56-pdx). Analysis of WGS data from RCMB56-ht predicted two distinct focal amplifications: a circular ecDNA of length 3.2 Mbp comprising three regions of chromosome 1 (amp1; Supplementary Fig. 1) and a complex, possibly chromothriptic, 4.5 Mbp amplicon comprising 20 segments from chromosome 7 and one segment from chromosome 17, with ends mapping to pericentromeric and peritelomeric regions (amp2; Supplementary Fig. 2). Similar analysis of RCMB56-pdx confirmed that both focal amplifications were unchanged compared to the original primary human tumor. Sequencing depth of the WGS data also indicated low-copy gain (gain1) of unknown architecture composed of other segments of chromosome 7 (35 Mbp) and chromosome 17 (800 kbp).
To assemble high-confidence sequences for the two amplicons, we performed optical genome mapping (OGM) of RCMB56-pdx. Genome assembly from deep WGS and OGM validated the circular amp1, composed of three DNA segments from chromosome 1 (Fig. 2a). This analysis also validated the contiguous chromothriptic amp2, comprising 21 segments of chromosome 7 and chromosome 17; however, a circular structure could not be conclusively established from OGM and WGS data (Fig. 2b). Copy number of amp1 and amp2 was estimated from WGS data at 20 and 10, respectively in RCMB56-ht, and 30 and 25, respectively in RCMB56-pdx. DNA fluorescence in situ hybridization (FISH) imaging of metaphase cells for marker gene loci DNTTIP2 (amp1), KMT2E (amp2) and ETV1 (gain1) indicated that amp1 and amp2 are amplified extrachromosomally (Fig. 2c). To confirm co-occurrence in the same cells, we performed multi-channel FISH imaging for the same markers in interphase cells. We observed distinct fluorescence spots for each gene within the same nucleus, indicating that copies of each amplified gene are located on distinct chromatin bodies (Fig. 2d). To further validate the predicted circular amp1 assembly, we used a recent method for targeted profiling of ecDNA, CRISPR-CATCH38. As expected, cutting amp1 in DNA from RCMB56-pdx produced a single fraction of linear DNA matching the length of the amp1 assembly (Fig. 2e). Short read sequencing maps this DNA to the amp1 sequence identified from bulk sequencing, confirming its circular structure (Fig. 2f).
Medulloblastomas have heterogeneous ecDNA copy number
Substantial intratumoral copy number heterogeneity is expected in ecDNA+ tumors owing to random segregation of ecDNA during mitosis, driving tumor evolution and treatment resistance39. To quantify copy number heterogeneity of ecDNA in medulloblastoma, we established an automated image analysis pipeline to estimate the distributions of copy number per cell in interphase FISH microscopy imaging and applied it to four primary medulloblastoma tumors harboring ecDNA: MB036 (MYCN), MB177 (MYCN), MB268 (MDM4) and RCMB56 (DNTTIP2, KMT2E, ETV1). The estimated copy number per cell of all ecDNA-amplified marker genes had significantly greater mean (Wilcoxon test) and variance (Levene’s test) than the ecDNA− cell line COLO320-HSR (Fig. 3a,b and Supplementary Fig. 3), which includes the MYC locus on a chromosomal amplification40. These results from human medulloblastoma tumors are consistent with the high copy number heterogeneity observed in human cancer cell lines with ecDNA39. In each primary tumor analyzed, ecDNA was amplified (copy number greater than five) in only a subset of cells (22–41%; Supplementary Tables 12–18).
To determine whether copy number heterogeneity of an ecDNA+ tumor is accompanied by transcriptional heterogeneity, we analyzed 2,762 single nuclei from frozen tissue of RCMB56-ht using a single-nucleus multiome assay (10x Genomics) that combines RNA sequencing (snRNA-seq) with the assay for transposase-accessible chromatin with sequencing (snATAC-seq) to profile transcriptomes and accessible chromatin of the same individual cells. Consistent with previous findings in bulk samples2,11, RCMB56-ht snATAC-seq coverage was enriched at the amp1 and amp2 loci at the aggregate level and in individual cells (Fig. 3c). To detect focal amplifications in single nuclei, we performed Monte Carlo permutation tests comparing snATAC-seq read density at the amplicon locus to that at random locations elsewhere in the genome. Z-score normalized read density at the amp1 and amp2 loci had greater mean and variance than at gain1 (Fig. 3d), consistent with our observations of the interphase FISH data. We conservatively estimate that at least 224 out of 2,762 (8%, false discovery rate q < 0.10) cells contained amp1 or amp2 (ecDNA+ cells). Of these, both amp1 and amp2 were detected together in only a minority of cells (72 out of 224, 32%) (Fig. 3e). Thus, evidence from quantitative FISH microscopy and multiome single-cell sequencing shows that only a fraction of tumor cells in ecDNA+ medulloblastoma tumors harbor high-copy ecDNA and that these have highly variable copy numbers of single or multiple different extrachromosomal amplifications.
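The per-cell permutation test can be illustrated with a minimal sketch. It assumes that each cell contributes one read density at the amplicon locus and a set of densities at size-matched random windows; the function name, the add-one correction and the simulated values are illustrative assumptions, not the study's implementation.

```python
import random
import statistics

def amplicon_enrichment(locus_density, background):
    """Compare one cell's read density at the amplicon with size-matched random windows.

    Returns a z-score against the background and an empirical one-sided P value.
    """
    exceed = sum(1 for d in background if d >= locus_density)
    # Add-one correction so the empirical P value is never exactly zero.
    p_emp = (exceed + 1) / (len(background) + 1)
    z = (locus_density - statistics.mean(background)) / statistics.stdev(background)
    return z, p_emp

# Simulated background: 999 random windows for one hypothetical cell.
rng = random.Random(0)
background = [rng.gauss(1.0, 0.2) for _ in range(999)]

z_amp, p_amp = amplicon_enrichment(5.0, background)    # strongly amplified locus
z_flat, p_flat = amplicon_enrichment(1.0, background)  # locus at background level
print(f"amplified: z = {z_amp:.1f}, P = {p_amp:.3f}")
print(f"background-level: z = {z_flat:.1f}, P = {p_flat:.3f}")
```

With 999 random windows the smallest attainable P value is 0.001, so the number of permutations bounds the resolution of the test before multiple-testing correction across cells.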
ecDNA+ cells have distinct transcriptional profiles
Clustering single cells using the weighted nearest neighbors algorithm41 placed the majority of ecDNA+ cells in a single cluster with distinct transcriptional and epigenetic features (Fig. 3f and Extended Data Fig. 5a). As expected, cells in the ecDNA+ cluster overexpressed DNTTIP2 (Wilcoxon rank sum test, q < 0.001) and KMT2E (q < 0.001), the marker genes for amp1 and amp2. Compared with other tumor and normal cells, the ecDNA+ cell cluster also overexpressed GLI2 (q < 0.001), a mediator of SHH-mediated transcription and marker for SHH medulloblastoma, despite GLI2 not being affected by copy number alteration in this tumor (Fig. 3g and Supplementary Table 19). To further investigate the relationship between ecDNA copy number and transcription, we first estimated ecDNA copy number in single cells (z-scores) and then the transcriptional activity of genes amplified on ecDNA in each cell (ssGSEA42 scores, see Methods). As expected, ssGSEA scores were positively correlated with z-scores, indicating greater transcription of ecDNA-amplified genes with increasing ecDNA copy number (Extended Data Fig. 5b–e). In addition to the ecDNA+ tumor cells, we identified two other clusters of tumor cells that were not enriched for ecDNA and with low expression of the marker genes, one of which strongly expressed mitochondrial genes (labeled ‘ecDNA−’ and ‘ecDNA− MT high’), as well as normal cells such as astrocytes, oligodendrocytes and hematopoietic cells (Fig. 3f,g). Normal cell types were annotated by cluster-specific expression of known marker genes. Genomic copy number estimation from snRNA-seq confirmed that normal cells had stable genomes whereas tumor cell clusters harbored various copy-number alterations (Extended Data Fig. 5f).
ecDNA places oncogenes in ectopic gene regulatory contexts
It has been shown that some medulloblastoma tumors are driven by ‘enhancer hijacking’ events, whereby somatic structural variants cause a noncoding regulatory enhancer to be rewired to amplify oncogenic transcription18,43. Given the extensive genomic rearrangement associated with some medulloblastoma ecDNA, we investigated whether aberrant DNA interactions emerge on circular ecDNA between co-amplified oncogenes and enhancers. To test this hypothesis, we profiled the accessible chromatin of 25 medulloblastoma tumors (11 ecDNA+, 14 ecDNA−) using ATAC-seq44, as well as chromatin interactions of 17 medulloblastoma tumors (eight ecDNA+, nine ecDNA−) using chromatin conformation capture (Hi-C)45. Consistent with previous reports11,46, bulk ATAC-seq read density was markedly enriched across entire ecDNA regions, even for ecDNA with only low-level amplification as estimated by bulk WGS. Hi-C sequencing reads exhibited similar patterns of enrichment at ecDNA regions (Fig. 4a).
In half of the analyzed ecDNA+ tumors (D458, MB106, MB268 and RCMB56), we observed clear evidence of aberrant chromatin interactions on ecDNA that spanned structural variant breakpoints to juxtapose accessible loci and co-amplified genes from distal genomic regions. For example, in the ecDNA+ Group 3 primary tumor MB106, DNA interactions occurred between the MYC locus and two co-amplified accessible regions 13 Mbp away on the reference genome, but less than 1 Mbp away on the ecDNA (Fig. 4b,c). These chromatin interactions were specific to the MB106 ecDNA compared to the interactome of the ecDNA− Group 3 primary tumor MB288 (Fig. 4c).
In the SHH subgroup primary tumor MB268, we identified a 10.2 Mbp ecDNA amplification including the p53 regulator MDM4 (ref. 47) (Extended Data Fig. 6). MDM4 is recurrently amplified on glioblastoma ecDNA24 and is a putative driver event in MB268. On the same ecDNA, we also observed aberrant DNA interactions with the promoter of CFH, a regulator of the immune complement system. However, the functional significance of these co-amplified genes and DNA interactions remains unclear.
In two instances, the SHH subgroup tumor RCMB56-pdx and the Group 3 cell line D458, we identified rewired interactions between genomic loci originating from different chromosomes but co-amplified on the same ecDNA. As described above, RCMB56 harbored an ecDNA comprising segments of chromosome 1 and a complex extrachromosomal amplification comprising segments of chromosome 7 and chromosome 17. Hi-C data indicated frequent chromatin interaction across breakpoints in each of the two amplicons (Extended Data Fig. 7). Aberrant chromatin interactions mapping to amp1 targeted accessible regions at the DNTTIP2, SH3GLB1 and SELENOF gene loci (Extended Data Fig. 7a–c). Aberrant interactions on amp2 included intrachromosomal interactions mapping to RPA3, HERPUD2, KLF14 and others; and trans-chromosomal interactions between the SP2 locus and the brain-specific long noncoding RNA LINC03013 (ref. 48), and from the PRR15L promoter to an intragenic region upstream of SRI (Extended Data Fig. 7d–f).
D458 harbored an ecDNA amplification containing oncogenes MYC and OTX2 from chromosomes 8 and 14, respectively. Co-amplification of MYC and OTX2 on the same ecDNA was validated by confocal FISH (Fig. 5a) and by assembly of the D458 ecDNA from WGS and OGM data (Extended Data Fig. 8a). OTX2 is a known regulator of MYC transcription49 and both genes are highly expressed in D458 (Fig. 5b). Hi-C data revealed several interactions of the MYC promoter with co-amplified regulatory elements of chromosome 8 (Fig. 5c) and chromosome 14 (Extended Data Fig. 8b). In summary, these results show that aberrant enhancer–promoter interactions resulting from structural rearrangements on ecDNA are common in medulloblastoma tumors.
ecDNA-amplified enhancers modulate oncogene transcription
To test whether co-amplified enhancers on ecDNA have functional roles in tumor cell proliferation, we performed a pooled CRISPRi proliferation screen in the Group 3 medulloblastoma cell line D458, targeting all 645 accessible loci on the ecDNA using 32,530 small guide RNA sequences (sgRNAs). These loci included ten highly accessible regions from chromosome 14, each overlapping ENCODE candidate cis-regulatory elements50. Given that enhancer usage is highly conserved in Group 3 tumors51, we performed the same screen in the Group 3 cell line D283, in which MYC (but not OTX2) is tandem amplified on a 55 Mbp homogeneously staining region of chromosome 8q (Fig. 5d). Although the MYC promoter was essential in both cell lines, our screen identified six functional elements that, upon CRISPRi inhibition, specifically reduced D458 proliferation compared to D283 after 21 days (MAGeCK MLE, q < 0.05; Fig. 5e)52. On chromosome 8, these loci included two accessible regions of a known MYC super-enhancer51 and the PVT1 promoter. In D458, much of the super-enhancer is duplicated internally on the ecDNA, and PVT1 is amplified in D458 but not in D283. Conversely, we observed that other accessible regions of the same MYC super-enhancer were specifically essential for D283 but not for D458. The D458 interactome included interchromosomal interactions between MYC on chromosome 8 and regulatory elements of chromosome 14 co-amplified on the same ecDNA, two of which were essential for D458 proliferation (Extended Data Fig. 8b): a cluster of elements at the OTX2 locus as well as a distal enhancer53 located 80 kbp downstream of OTX2 on the reference genome but inverted on the ecDNA. D283-specific elements on chromosome 14 included peaks at the amino-terminal exon of OTX2 and another distal enhancer53 55 kbp from OTX2 on the reference but also inverted on the ecDNA.
To further test the influence on transcription of regulatory regions essential in D458 but not in D283, we performed additional CRISPRi inhibition experiments targeting the PVT1 promoter and an accessible region within the internal duplication of the MYC super-enhancer. Consistent with the result of the CRISPRi proliferation screen, silencing of the MYC super-enhancer reduced MYC expression for two out of three sgRNAs in D458 but not in D283 (Extended Data Fig. 9a,b). No significant difference was observed in OTX2 transcription in either cell line (Extended Data Fig. 9c,d). Silencing of the PVT1 promoter abrogated PVT1 transcription but not MYC or OTX2, in D458 but not in D283 (Supplementary Fig. 4). Thus, although proliferation in both Group 3 medulloblastoma cell lines is driven by MYC amplification, the relative importance of co-amplified genes and cis-regulatory elements is specific to the genomic architecture of the amplicon.
Discussion
A long-standing problem in the clinical management of medulloblastoma tumors has been the paucity of effective targeted molecular treatments for the disease, especially in relapsed cases. For example, the SMO inhibitor vismodegib, one of few targeted drugs approved for SHH medulloblastoma, is ineffective against TP53-mutant, MYCN-amplified or GLI2-amplified tumors54, each of which was a recurrent feature of ecDNA+ medulloblastoma in our patient cohort. By retrospective analysis of WGS and clinical outcome data from a large cohort of medulloblastomas, we demonstrate that ecDNA associates with poor outcome across the entire cohort and within individual disease subgroups. Survival analysis indicates that relative to patients with ecDNA− medulloblastomas, patients with ecDNA+ medulloblastomas are more than twice as likely to relapse and three times as likely to die during the follow-up interval. Identification of ecDNA in medulloblastoma tumors is therefore crucial to pave the way for precision medicine approaches targeting ecDNA.
As in other cancers2,3,55, ecDNA frequently amplifies known medulloblastoma oncogenes. ecDNA is a frequent feature of MYC-amplified Group 3 and TP53-mutant SHH tumors, which share exceptionally poor prognoses16,17 but few other recurrent driver mutations. Recent longitudinal analysis of Barrett’s esophagus suggests that TP53 alteration is an early event in ecDNA-driven malignant transformation55. However, the absence of detectable ecDNA in TP53-mutant WNT subgroup tumors and the frequent occurrence of ecDNA in Group 3 tumors with wild-type TP53 suggest that the mechanisms for the generation and selection of ecDNA may be modulated by subgroup-specific cellular contexts of medulloblastoma progenitor cells.
Close examinations of medulloblastoma tumors using FISH microscopy and single-cell sequencing reveal broad intratumoral distributions of ecDNA copy number per cell. In the illustrative example of RCMB56, a pediatric SHH medulloblastoma tumor with somatic TP53 mutation, we reconstructed two extrachromosomal amplifications and conclusively elucidated the circular structure of amp1. FISH and single-cell sequencing analyses concur that only a minority of RCMB56 primary tumor cells harbored high-copy amplification, and clustering on single-cell data suggests that these cells express a distinct transcriptional and epigenetic profile, including a canonical marker of SHH signaling. Based on these findings, it is imperative to investigate how the heterogeneous cell populations in medulloblastoma tumors respond to therapeutic pressure and contribute to treatment resistance and relapse.
By mapping accessible chromatin and chromosome conformation in medulloblastoma tumors and models, we find frequent gene regulatory rewiring as a consequence of ecDNA sequence rearrangement, suggesting that an altered gene regulatory landscape may contribute to transcriptional activation of ecDNA-amplified oncogenes. Consistent with previous findings in glioblastoma13, a functional inhibition screen in two Group 3 MYC-amplified medulloblastoma cell lines shows that co-amplified enhancer function differs depending on the architecture of the amplification. However, the relative importance to oncogenic gene expression of native co-amplified enhancers versus aberrant regulatory rewiring on ecDNA remains an open question.
Recent studies have revealed intermolecular enhancer–promoter interactions between ecDNA molecules40 or between the chromosomes and ecDNA56. To test for such intermolecular chromatin interactions in medulloblastoma, we computationally identified interchromosomal loops from Hi-C of the SHH medulloblastoma tumor RCMB56-pdx, in which one loop anchor mapped to the circular ecDNA amp1. This analysis revealed a nexus of interactions mapping from the ARHGAP29 locus on amp1 to loci elsewhere in the genome with plausible tumorigenic roles, including MECOM, RAD51AP2, POU4F1 and IGF1R (Extended Data Fig. 10). However, the functional significance of these intermolecular chromatin interactions in medulloblastoma remains untested.
ecDNA has been implicated in intratumoral heterogeneity2,57,58, modulation of oncogene copy number in response to therapy32,39,59 and evolution of targeted therapy resistance6,7,58,60. In this context, we have shown that ecDNA is a strong predictor for the outcome of patients with medulloblastoma tumors and is associated with other known molecular prognostic indicators, oncogene amplification, intratumoral copy number and transcriptional heterogeneity, and transcriptional regulatory rewiring. Further analysis of the mechanistic relationships between DNA repair pathway mutation, ecDNA formation and maintenance, and chemotherapy resistance may uncover new combinatorial therapies for patients with high-risk medulloblastoma who have exceptionally poor prognoses.
Methods
Statistical methods
Statistical tests, test statistics and P values are indicated where appropriate in the main text. Categorical associations were established using the chi-squared test of independence if n > 5 for all categories and Fisher’s exact test otherwise. For both tests, the Python package scipy.stats v1.5.3 implementation was used64. Multiple hypothesis corrections were performed using the Benjamini–Hochberg correction65 implemented in statsmodels v0.12.0 (ref. 66). All statistical tests described herein were two-sided unless otherwise specified.
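The test-selection rule and the multiple-testing correction can be sketched with the standard library alone. This is an illustration rather than the code used in the study, which relied on scipy.stats and statsmodels; the function names are hypothetical, and reading ‘n > 5 for all categories’ as ‘every observed cell count exceeds five’ is an assumption.

```python
from math import comb

def use_chi_squared(table):
    """Selection rule, read here as: every observed cell count exceeds 5."""
    return all(count > 5 for row in table for count in row)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of k counts in the top-left cell.
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

print(use_chi_squared([[30, 82], [45, 265]]))  # SHH versus other subgroups
print(use_chi_squared([[12, 11], [2, 67]]))    # TP53 alteration versus ecDNA in SHH
print(fisher_exact_two_sided(3, 1, 1, 3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under this reading, the subgroup comparison qualifies for the chi-squared test, whereas the TP53 table, which contains a cell count of two, falls back to Fisher's exact test, in line with the tests named in the main text.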
Patient consent
Details on informed consent from patients for the collection of samples, and previously published data (Children’s Brain Tumor Network (CBTN), St. Jude, International Cancer Genome Consortium (ICGC) and Archer datasets) are described in Supplementary Note 2. Patients that were diagnosed at Rady Children’s Hospital–San Diego provided consent under the protocol Molecular Tumor Profiling Platform for Oncology Patients (IRB 190055), approved by the University of California San Diego (UCSD) Institutional Review Board (Supplementary Table 1). Patients were not compensated for their participation.
Medulloblastoma WGS
Paired-end WGS data were acquired from different sources as described in Supplementary Note 2. In total, the WGS cohort comprised 468 patients (161 female, 277 male, 30 N/A; aged 0–36 years; see Supplementary Table 1). Unless otherwise specified, WGS was acquired for one tumor biosample per patient. Details on WGS data processing pipelines are described in Supplementary Note 2.
ecDNA detection and classification from bulk WGS
To detect ecDNA, all samples in the WGS cohort were analyzed using AmpliconArchitect24 v1.2 and AmpliconClassifier3 v0.4.4. In brief, copy number segmentation and estimation were performed using CNVkit v0.9.6 (ref. 67). Segments with copy number ≥ 4 were extracted using AmpliconSuite-pipeline (April 2020 update) as ‘seed’ regions. For each seed, AmpliconArchitect searches the region and nearby loci for discordant read pairs indicative of genomic structural rearrangement. Genomic segments are defined based on boundaries formed by genomic breakpoint locations and by modulations in genomic copy number. A breakpoint graph of the amplicon region is constructed using the copy-number-aware segments and the genomic breakpoints, and cyclic paths are extracted from the graph. Amplicons are classified as ecDNA, breakage–fusion–bridge, complex, linear or no focal amplification by the heuristic-based companion script, AmpliconClassifier. Biosamples with one or more classifications of ‘ecDNA’ were considered potentially ecDNA+; all others were considered ecDNA− (Supplementary Table 3). We manually curated all potential ecDNA+ assembly graphs and reclassified those with inconclusive ecDNA status, which we defined as any of the following: low-copy amplification (<5) AND no copy number change at discordant read breakpoints; and/or cycles consisting of the repetitive region at chr5:820000 (GRCh37).
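Two of the rule-based steps above, seed selection and the manual curation criterion, can be expressed compactly. The sketch below is not AmpliconArchitect or AmpliconSuite-pipeline code: the `Segment` type, the function names, the merging of abutting segments and the toy coordinates are all assumptions made for illustration, and half-open coordinates are assumed.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    copy_number: float

def select_seeds(segments, min_cn=4.0):
    """Keep segments with copy number >= min_cn and merge abutting or overlapping ones."""
    seeds = []
    for seg in sorted(s for s in segments if s.copy_number >= min_cn):
        prev = seeds[-1] if seeds else None
        if prev and prev.chrom == seg.chrom and seg.start <= prev.end:
            # Extend the previous seed; report the highest copy number seen.
            seeds[-1] = Segment(prev.chrom, prev.start, max(prev.end, seg.end),
                                max(prev.copy_number, seg.copy_number))
        else:
            seeds.append(seg)
    return seeds

def is_inconclusive(max_copy_number, cn_change_at_breakpoints, is_chr5_repeat_cycle):
    """Curation rule from the text: low-copy without a copy number change at
    discordant-read breakpoints, and/or a cycle of the chr5 repetitive region."""
    low_copy_unsupported = max_copy_number < 5 and not cn_change_at_breakpoints
    return low_copy_unsupported or is_chr5_repeat_cycle

# Hypothetical copy number segments, for illustration only.
segments = [
    Segment("chr8", 100, 200, 12.0),
    Segment("chr8", 200, 300, 6.0),
    Segment("chr8", 300, 400, 2.0),
    Segment("chr1", 50, 80, 4.0),
]
print(select_seeds(segments))
print(is_inconclusive(4.2, cn_change_at_breakpoints=False, is_chr5_repeat_cycle=False))
```

Amplicons flagged by the second function would be candidates for reclassification during manual review rather than being counted as ecDNA+.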
The ecDNA− status of the D283 cell line was not determined computationally by WGS, but by copy number analysis of DNA methylation, FISH (see Methods) and analysis of OGM data.
Fingerprinting analysis
To uniquely identify WGS from each patient, we counted reference and alternate allele frequencies at 1,000 variable non-pathogenic single-nucleotide polymorphism locations in the human genome according to the 1000 Genomes project68 and performed pairwise Pearson correlation between all WGS samples. Biospecimens originating from the same patient tumor (for example, primary–relapse or human tumor–PDX pairs) were distinguishable by high correlation across these sites (r > 0.80). We identified one case in which two tumor biosamples had highly correlated fingerprints: MDT-AP-1217.bam and ICGC_MB127.bam. We arbitrarily removed ICGC_MB127 from the patient cohort.
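The pairwise comparison can be sketched as follows, assuming each biosample is summarized as a vector of alternate-allele fractions at the same ordered set of polymorphic sites. The sample names and the six-site vectors are hypothetical; the analysis described above used 1,000 sites.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def matching_pairs(fingerprints, threshold=0.80):
    """Return sample pairs whose fingerprints correlate above the threshold."""
    names = sorted(fingerprints)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if pearson_r(fingerprints[a], fingerprints[b]) > threshold]

# Hypothetical alternate-allele fractions at six sites.
fingerprints = {
    "tumor_A": [0.0, 0.5, 1.0, 0.5, 0.0, 1.0],
    "tumor_A_relapse": [0.05, 0.45, 0.95, 0.55, 0.0, 1.0],
    "tumor_B": [1.0, 0.0, 0.5, 0.0, 1.0, 0.5],
}
print(matching_pairs(fingerprints))
```

Pairs returned by this step would then be checked against the sample annotations to separate expected matches, such as primary and relapse biosamples, from unintended duplicates.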
Patient metadata, survival and subgroup annotations
Where available, patient samples and models were assigned metadata annotations including age, sex, survival and medulloblastoma subgroup based on previously published annotations of the same tumor or model18,23,31,37,69–71. Sample metadata are also available in some cases from the respective cloud genomics data platforms: https://dcc.icgc.org (ICGC), https://pedcbioportal.kidsfirstdrc.org and https://portal.kidsfirstdrc.org (CBTN), and https://pecan.stjude.cloud (St. Jude). Patient tumors from CBTN were assigned molecular subgroups based on a consensus of two molecular classifiers, using RSEM-normalized FPKM data: MM2S (ref. 72) and the D3b medulloblastoma classifier at the Children’s Hospital of Philadelphia (https://github.com/d3b-center/medullo-classifier-package). Where primary sources disagreed on a metadata value, that value was reassigned to N/A.
TP53 mutation annotations
Somatic mutations
Somatic TP53 mutation information for the ICGC and CBTN cohorts was acquired from a previous publication31 and from the ICGC and CBTN data portals. Somatic TP53 mutation information for the St. Jude cohort was extracted from the standard internal St. Jude variant calling pipeline20. We only considered somatic mutations that were protein-coding and missense, nonsense, insertion or deletion, or that affected a splice site junction.
Germline variants
Germline variant GVCF files were downloaded from the ICGC, KidsFirst and St. Jude Pediatric Cancer Genome Project (PCGP) data portals. GVCF files were merged with GLnexus73 and converted to PLINK format. PCGP genotypes were converted to hg19 coordinates using liftover. Variants from the TP53 genomic locus (hg19:chromosome 17:7571739–759080) were extracted and annotated with REVEL (https://sites.google.com/site/revelgenomics)74, CADD v1.6 (https://cadd.gs.washington.edu/info)75, ClinVar (https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37) and Variant Effect Predictor (VEP) r104 (http://grch37.ensembl.org/index.html)76. VEP variants that were considered pathogenic included ‘frameshift’ and ‘splice’ variants. ClinVar annotations that were considered pathogenic included ‘frameshift’, ‘stop’, ‘splice’ and ‘deletion’, and for which the clinical significance was ‘pathogenic’ or ‘likely pathogenic’. CADD pathogenic variants had a CADD score of at least ten. REVEL pathogenic variants had a REVEL score of at least 0.5. Only variants with a minor allele frequency of less than 5% according to the gnomAD r2.1.1 database were analyzed77.
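The numeric cutoffs above can be written as a small Python filter. This is a sketch under stated assumptions: the function name and arguments are hypothetical, the per-tool flags are reported separately because the text does not say how the annotation sources were combined, and the consequence-based VEP and ClinVar rules are not modeled.

```python
from typing import Dict, Optional

MAX_GNOMAD_AF = 0.05  # variants must have minor allele frequency below 5%
MIN_CADD = 10.0       # CADD score of at least ten
MIN_REVEL = 0.5       # REVEL score of at least 0.5


def score_flags(
    cadd: Optional[float], revel: Optional[float], gnomad_af: float
) -> Optional[Dict[str, bool]]:
    """Return per-tool pathogenicity flags, or None if the variant is too common."""
    if gnomad_af >= MAX_GNOMAD_AF:
        return None
    return {
        "CADD": cadd is not None and cadd >= MIN_CADD,
        "REVEL": revel is not None and revel >= MIN_REVEL,
    }
```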
Survival analyses
Kaplan–Meier, Cox proportional hazards and accelerated failure time (AFT) analyses were performed with Lifelines v0.26.5 (ref. 78). For all analyses, the sample set contained data from all patients annotated with the included covariates; no imputation was performed.
Kaplan–Meier analysis
For Kaplan–Meier analysis, the sample size was n = 362 (65 ecDNA+; 297 ecDNA−). Differential survival was determined by a log-rank test. For Kaplan–Meier analyses by class of structural variant, samples were assigned a label if at least one amplicon was classified by AmpliconClassifier with that label, in order of priority: ecDNA, breakage–fusion–bridge, complex non-cyclic, linear, no focal somatic copy number amplification3. Our sample of tumors with breakage–fusion–bridge amplification but no ecDNA was too small to test (n = 2).
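The priority rule for assigning one structural-variant label per sample can be sketched as below. The label strings and function name are hypothetical placeholders, not the exact strings emitted by AmpliconClassifier.

```python
from typing import Iterable

# Hypothetical label strings, listed from highest to lowest priority.
PRIORITY = ("ecDNA", "BFB", "complex non-cyclic", "linear")
NO_AMPLIFICATION = "no fSCNA"


def sample_label(amplicon_labels: Iterable[str]) -> str:
    """Return the highest-priority label found among a sample's amplicons."""
    present = set(amplicon_labels)
    for label in PRIORITY:
        if label in present:
            return label
    return NO_AMPLIFICATION
```

Under this rule a sample carrying both an ecDNA and a linear amplicon is grouped with the ecDNA samples.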
Cox proportional hazards on age, sex, molecular subgroup and ecDNA
For the Cox proportional hazards analysis, the sample size was n = 352 observations. The model was fitted by maximum likelihood estimation.
Cox proportional hazards on age, sex, molecular subgroup, p53 mutation and ecDNA
For the proportional hazards analysis that included p53 mutation, the sample size was n = 322 observations. Collinearity, which is strong correlation between predictive variables in a regression model, can result in model instability and unreliable estimation of the collinear coefficients79. To address collinearity between ecDNA and p53 status in our model, we performed ridge estimation of model coefficients80,81, determining the ridge penalty parameter λ by grid search on fivefold cross-validation of model likelihood on the withheld set.
AFT models and mediation analysis
Mediation analysis was performed using the Baron–Kenny framework35, following recent best practices82. Owing to the non-collapsibility of hazard ratios, the proportional hazards assumption and Cox proportional hazards model may not be suitable for mediation analysis in which we need to compare the coefficients with and without the mediator. Therefore, we fitted parametric log-normal AFT regression models as a reasonable alternative to Cox regression. Percentage change values were calculated as:
$$\Delta_k = \left(e^{\hat{\beta}_k} - 1\right) \times 100\%$$
where $\hat{\beta}_k$ is the maximum likelihood estimation regression coefficient for random variable k.
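In a log-linear AFT model, a coefficient β multiplies the expected survival time by exp(β), so the percentage change is conventionally 100 × (exp(β) − 1). The following Python sketch assumes that conventional definition; the function name is hypothetical.

```python
import math


def percent_change(beta_hat: float) -> float:
    """Percentage change in survival time implied by an AFT coefficient."""
    return 100.0 * (math.exp(beta_hat) - 1.0)
```

For example, a coefficient of ln 2 corresponds to a 100% increase (a doubling) of survival time, and any negative coefficient gives a negative percentage change.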
OGM data collection and processing
Ultra-high molecular weight (UHMW) DNA was extracted from frozen cells preserved in dimethylsulfoxide (DMSO) following the manufacturer’s protocols (Bionano Genomics). Cells were digested with Proteinase K and RNase A. DNA was precipitated with isopropanol and bound with nanobind magnetic disks. Bound UHMW DNA was resuspended in the elution buffer and quantified with Qubit dsDNA assay kits (ThermoFisher Scientific).
DNA labeling was performed following the manufacturer’s protocols (Bionano Genomics). Standard Direct Labeling Enzyme 1 reactions were performed using 750 ng of purified UHMW DNA. Fluorescently labeled DNA molecules were imaged sequentially across nanochannels on a Saphyr instrument (Bionano). At least 400× genome coverage was achieved for all samples.
De novo assemblies of the samples were performed with Bionano’s De Novo Assembly Pipeline (DNP) using standard haplotype-aware arguments (Bionano Solve v3.6). With the Overlap-Layout-Consensus paradigm, pairwise comparison of DNA molecules was used to create a layout overlap graph, which was then used to generate initial consensus genome maps. By realigning molecules to the genome maps (P < 10⁻¹²) and using only the best-matched molecules, a refinement step was done to refine the label positions on the genome maps and to remove chimeric joins. Next, an extension step aligned molecules to genome maps (P < 10⁻¹²) and extended the maps based on molecules aligning past the map ends. Overlapping genome maps were then merged (P < 10⁻¹⁶). These extension and merge steps were repeated five times before a final refinement (P < 10⁻¹²) was applied to ‘finish’ all genome maps.
ecDNA reconstruction with OGM data
The ecDNA reconstruction strategy incorporated the copy-number-aware breakpoint graph generated by AmpliconArchitect24 with OGM contigs generated by the Bionano DNP. For RCMB56 assemblies, we used contigs from the DNP as well as the Rare Variant Pipeline.
We used AmpliconReconstructor83 v1.01 to scaffold individual breakpoint graph segments from OGM contigs, with the ‘–noConnect’ flag set and otherwise default settings. A subset of informative contigs with alignments to multiple graph segments as well as a breakpoint junction were then selected for subsequent scaffolding, using the ‘–contig_subset’ argument of AmpliconReconstructor’s OMPathFinder.py script. For the exploration of unaligned regions of OGM contigs used in the reconstructions, we used the OGM alignment tool FaNDOM84 v0.2 (default settings). FaNDOM was used to identify the loose ends of the RCMB56 amp2.
RCMB56 amp1 and D458 were fully reconstructed as described above; however, RCMB56 amp2 required manual intervention. Owing to the fractured nature of the breakpoint graphs in RCMB56 amp2, we searched for copy-number-aware paths in the AmpliconArchitect breakpoint graph, using the plausible_paths.py script from the AmpliconSuite-pipeline, then converted these to in silico OGM sequences and aligned paths to OGM contigs directly using AmpliconReconstructor’s SegAligner.
Animals
NOD-SCID IL2Rγ null (NSG) mice (Jackson Laboratory, strain no. 005557) were housed in an aseptic barrier research animal facility at the Sanford Consortium for Regenerative Medicine, with a 12 h light–dark cycle, ambient temperature of 19–24 °C and 40–60% humidity. All experiments were performed in accordance with national guidelines and regulations, and according to protocols approved by the Animal Care and Use Committees at the Sanford Burnham Prebys Medical Discovery Institute and UCSD (San Diego, CA, USA) and the UCSD Institutional Review Board (Project no. 171361XF). In compliance with humane endpoint protocols, tumor-bearing mice displaying signs of moribundity (dysmorphic head, hunched posture, ataxia, excessive weight loss) were euthanized and processed without exceeding tumor burden limitations.
Establishment and maintenance of PDX RCMB56
RCMB56-pdx was originally derived with consent from a TP53-mutant SHH subgroup medulloblastoma of an eight-year-old male patient who was diagnosed at Rady Children’s Hospital–San Diego, under the protocol Molecular Tumor Profiling Platform for Oncology Patients (IRB 190055). Primary surgical tumor tissue was dissociated via Liberase (Sigma-Aldrich, 05401020001) and suspended in Neurocult media (Stem Cell Technologies, 05750). Cells (0.5–1 × 10⁶) were orthotopically implanted into NSG mouse cerebella for expansion. Initial xenograft tumor latency was six months post-implant, whereupon tumor tissue was dissected from moribund mice, dissociated and reimplanted into new recipient NSG mice or cryopreserved without in vitro passaging. Ex vivo experiments were performed with PDX RCMB56 cells from in vivo passage 1 (x1).
Metaphase spreads
Cell lines were enriched for metaphases by the addition of KaryoMAX (Gibco) at 0.1 µg ml⁻¹ for 2 h to overnight (0.02 µg ml⁻¹ overnight for dissociated PDX cells). Single-cell suspensions were then incubated with 75 mM KCl for 8–15 min at 37 °C. Cells were washed in Carnoy’s fixative (3:1 methanol:acetic acid) three times. Cells were then dropped onto humidified slides.
FISH
Slides containing fixed cells were briefly equilibrated in 2× SSC buffer, followed by dehydration in 70%, 85% and 100% ethyl alcohol for 2 min each. FISH probes (Supplementary Table 20) diluted in hybridization buffer were applied to slides and covered with a coverslip. Slides were denatured at 72 °C for 1–2 min and hybridized overnight at 37 °C. The slide was then washed with 0.4× SSC, then 2× SSC-0.1% Tween 20. DAPI was added before washing again and mounting with Prolong Gold.
Microscopy
Conventional fluorescence microscopy was performed using either the Olympus BX43 microscope equipped with a QiClick cooled camera, or the Leica DMi8 widefield fluorescence microscope followed by Thunder deconvolution using a ×63 oil objective. Confocal microscopy was performed using a Leica SP8 microscope with lightning deconvolution and white light laser (UCSD School of Medicine Microscopy Core). Excitation wavelengths for multiple color FISH images were set manually based on the optimal wavelength for the individual probes, with care taken to minimize crosstalk between channels. ImageJ 1.53 was used to uniformly edit and crop images.
Automated FISH analysis
Cell segmentation
We applied NuSeT85 to perform cell segmentation. The parameters were min_score 0.95, nms threshold of 0.01, a nuclei size threshold of 500 and a scale ratio of 0.3.
Number of FISH blobs
To annotate pixels with high local intensity, we convolved the original image with a sampled Gaussian kernel, with a standard deviation of three pixels and a size of seven by seven pixels. After convolving, we applied a threshold of 15 / 255 pixel brightness. Then, to filter out low brightness noise, we set a binary threshold that the brightness of these peaks must exceed one standard deviation above the average FISH brightness and added an additional minimum area requirement.
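The smoothing and first thresholding step can be illustrated with a dependency-free Python sketch. Several details are assumptions rather than statements from the text: the kernel is normalized to sum to one, image borders are zero-padded, pixel values are on an 8-bit scale so that 15 / 255 corresponds to the value 15, and the comparison is strict. The second filter (one standard deviation above mean FISH brightness, minimum area) is not implemented.

```python
import math
from typing import List

Image = List[List[float]]


def gaussian_kernel(size: int = 7, sigma: float = 3.0) -> Image:
    """Sampled 2D Gaussian kernel, normalized to sum to one (assumption)."""
    half = size // 2
    raw = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
         for dx in range(-half, half + 1)]
        for dy in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


def smooth(image: Image, kernel: Image) -> Image:
    """Convolve with zero padding; the output has the same shape as the input."""
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky, kernel_row in enumerate(kernel):
                yy = y + ky - half
                if not 0 <= yy < height:
                    continue
                for kx, weight in enumerate(kernel_row):
                    xx = x + kx - half
                    if 0 <= xx < width:
                        acc += weight * image[yy][xx]
            out[y][x] = acc
    return out


def bright_mask(image: Image, threshold: float = 15.0) -> List[List[bool]]:
    """Mark pixels whose smoothed brightness exceeds the threshold."""
    smoothed = smooth(image, gaussian_kernel())
    return [[value > threshold for value in row] for row in smoothed]
```

Because the kernel is symmetric, convolution and correlation give the same result here. Connected regions of the resulting mask would then be the candidate FISH blobs passed to the second filter.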
Amplification mechanism
We ran ecSeg-i86 on each segmented cell to determine the amplification mechanism. ecSeg-i produces three probability scores representing the likelihood of the cell having no amplification, ecDNA amplification or homogeneously staining region amplification. We assigned the amplification mechanism with the highest likelihood.
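The per-cell assignment is an argmax over the three scores. A minimal Python sketch, with hypothetical label strings and function name (this is not the ecSeg-i interface):

```python
def call_mechanism(p_none: float, p_ecdna: float, p_hsr: float) -> str:
    """Return the amplification class with the highest probability score."""
    scores = {"no amplification": p_none, "ecDNA": p_ecdna, "HSR": p_hsr}
    # On ties, max() returns the first key in insertion order.
    return max(scores, key=scores.get)
```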
Single Cell Multiome ATAC + Gene Expression sequencing
From the RCMB56 primary patient tumor (RCMB56-ht), disassociated cryopreserved cells stored in 10% DMSO/FBS were used. At least 50 mg of tissue (1 M cells) was used for both samples. Disassociated cells were prepared for Single Cell Multiome ATAC + Gene Expression sequencing (10× Genomics) according to the manufacturer’s instructions87. Sequencing was performed on an Illumina NovaSeq S4 200 to a depth of at least 250 M reads for snATAC-seq and 200 M reads for snRNA-seq.
Single-cell data processing and clustering
Sequencing data were uniformly processed using CellRanger ARC v2.0.0 with default parameters, followed by Seurat v4.0.4 (ref. 41). Cell barcodes that passed the following quality thresholds were retained: ATAC mitochondrial fraction less than 0.1; ATAC read count between 1,000 and 70,000; and RNA read count between 500 and 25,000. Doublets were identified and removed using DoubletFinder v2.0 (ref. 88) using default parameters. Single-cell transcription data were normalized using regularized negative binomial regression, implemented in the sctransform package89 (SCT) included with Seurat.
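The barcode quality thresholds can be expressed as a single predicate. This Python sketch treats the read-count bounds as inclusive, which is an assumption, and uses hypothetical argument names rather than Seurat metadata column names.

```python
def passes_qc(atac_mito_fraction: float, atac_reads: int, rna_reads: int) -> bool:
    """Apply the ATAC and RNA quality thresholds stated in the text."""
    return (
        atac_mito_fraction < 0.1
        and 1_000 <= atac_reads <= 70_000
        and 500 <= rna_reads <= 25_000
    )
```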
Clustering was performed using the weighted nearest neighbors algorithm41 with a resolution of 0.1 and the other parameters set at default. To label cell clusters with cell type identities, differentially expressed genes were found for each cluster using Seurat’s FindAllMarkers function with default parameters (Supplementary Table 19) and cross-referenced against known cell type marker genes90.
Copy number estimation from scRNAseq was performed using InferCNV v1.3.3 (ref. 91). Normal reference cells were defined as ecDNA− cells belonging to cell clusters labeled as normal cell types. All parameters were set at default.
Sequencing coverage of single cells (Fig. 4c) was visualized in IGV desktop v2.9.2 (ref. 92). Bulk WGS coverage (bigwig format) was generated from deduplicated sequencing reads using deeptools v3.5.1 (ref. 93) bamCoverage at 50 bp resolution with default parameters. Single-cell coverage tracks were parsed from the CellRanger ARC atac_fragments.tsv.gz output format to .bed format using a custom script, then converted to bigwig format using bedtools v2.27.1 (ref. 94) genomecov and UCSC browser tools95 bedGraphToBigWig v4.
Identification of ecDNA-containing cells
ecDNA-containing cells were identified by permutation tests comparing snATAC-seq read coverage at the ecDNA regions to read coverage of random regions elsewhere in the genome. In brief, deduplicated snATAC-seq reads from the fragments.tsv output of CellRanger ARC were sorted by barcode. For Monte Carlo permutation testing, 1,000 random contiguous regions of the genome, excluding centromeres, telomeres, known ecDNA and low-mappability regions, were generated using bedtools v2.27.1 (ref. 94). Read coverage was counted using PyRanges v0.0.112 (ref. 96) and scaled to region length. For each cell, empirical P values were estimated as $P = (r + 1)/(n + 1)$, where r is the rank of the test value out of n permutations97. Multiple hypothesis correction was performed using a Benjamini–Hochberg-corrected P value (P < 0.10). Z-scores were calculated using the standard formula, comparing the average read coverage at the ecDNA-amplified region to the mean and variance of the Monte Carlo permutations.
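The per-cell statistics can be sketched in pure Python as follows. The sketch assumes the common Monte Carlo estimator (r + 1)/(n + 1), with r taken as the number of random regions whose coverage is at least the coverage at the ecDNA region, and uses the sample variance for the z-score; function names are hypothetical.

```python
import math
from typing import List, Sequence


def empirical_p(test_value: float, null_values: Sequence[float]) -> float:
    """Monte Carlo P value (r + 1) / (n + 1); r counts null values >= test value."""
    r = sum(1 for value in null_values if value >= test_value)
    return (r + 1) / (len(null_values) + 1)


def z_score(test_value: float, null_values: Sequence[float]) -> float:
    """Standardize the test value against the permutation mean and variance."""
    n = len(null_values)
    mean = sum(null_values) / n
    variance = sum((value - mean) ** 2 for value in null_values) / (n - 1)
    if variance == 0:
        raise ValueError("z-score is undefined when the null variance is zero")
    return (test_value - mean) / math.sqrt(variance)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Cells whose adjusted P value falls below 0.10 would be called ecDNA-containing under this scheme. With 1,000 permutations the smallest attainable unadjusted P value is 1/1,001.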
Single-sample gene set enrichment analysis
Single-sample gene set enrichment analysis (ssGSEA) is a variation of gene set enrichment analysis for quantifying the aggregate expression of a gene set across the transcriptome of one sample42. To quantify the transcriptional activity of ecDNA in single cells, we performed ssGSEA of two gene sets comprising every gene amplified on RCMB56 amp1 or amp2, treating each cell as a single sample. The population sample consisted of n = 247 ecDNA+ cells from the RCMB56-ht sample. Gene expression values were the SCT-normalized transcription matrix, generated as described above using Seurat v4.0.4. ssGSEA was run using ssGSEA v10.0.11 implemented at https://cloud.genepattern.org (ref. 98). Association with z-score ecDNA copy number estimates was performed using Pearson’s R, implemented in scipy.stats v1.7.3 and visualized using Seaborn v0.9.0 (ref. 99) histplot.
ATAC-seq
ATAC-seq was performed at the Massachusetts Institute of Technology (Cambridge, MA) or ActiveMotif (San Diego, CA). Center-specific detail is included in Supplementary Note 3. Reads were aligned to the hg38 reference, deduplicated and preprocessed according to ENCODE best practices. Accessible chromatin regions were identified using MACS2 v2.1.2 (ref. 100) using a Benjamini–Hochberg-corrected P value threshold (P < 0.05).
Chromosome conformation capture (Hi-C)
Hi-C was performed at the Salk Institute (La Jolla, CA) or Arima Genomics (San Diego, CA). Center-specific details are included in Supplementary Note 4.
Hi-C data processing
Hi-C reads were trimmed using Trimmomatic 0.39 (ref. 101) and aligned to the hg38 human genome reference using HiC-Pro v2.11.3-beta and bowtie 2.3.5 (ref. 102) with default parameters103. Visualization and contact normalization were performed with JuiceBox v1.11.08 (ref. 104) and the Knight–Ruiz algorithm105. Intrachromosomal chromatin interactions were called using Juicer Tools GPU HiCCUPS v1.22.01 (ref. 106) using a false discovery rate threshold of 0.2 and default recommended parameters45. Visual inspection indicated that HiCCUPS correctly annotated interactions mapping to ecDNA, except for locus pairs mapping within ~50 kb of a structural rearrangement. Owing to these technical challenges, chromatin interactions described herein were manually curated based on HiCCUPS interaction calls. Ectopic chromatin interactions spanning breakpoints on the D458, MB268 and RCMB56 ecDNA, including interchromosomal interactions, could not be accurately called by any software tools known to us because of technical limitations in this emerging field. These interactions were manually annotated from the interaction matrices shown in Extended Data Figs. 6–8.
Identification of intermolecular chromatin interactions
To screen for putative intermolecular chromatin interactions originating from possible mobile enhancers56 on ecDNA, we performed loop detection on Hi-C data of RCMB56-pdx using FitHiC v2.0.8 (ref. 107) interchromosomal mode, at a resolution of 50 kbp and setting no bias upper bound, as recommended by the tool’s authors for this task. Interactions with corrected q-values less than 0.05 were selected and then further filtered for loops with one anchor mapping to RCMB56 amp1. To reduce false-positive loop calls originating from copy number variation, loops mapping to amp2 or to within 100 kbp of a breakpoint on amp1 were also removed. After filtering, 46 high-confidence loops remained that mapped from amp1 to elsewhere in the reference genome. Genes were associated with a loop if the gene locus overlapped the 50 kbp loop anchor. Panel S11a was generated using circos v0.69-8 (ref. 108).
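The loop filtering steps can be sketched in Python as below. This illustration makes several assumptions that the text does not settle: a loop is kept only when exactly one anchor overlaps amp1, intervals are half-open, and the 100 kbp breakpoint exclusion is measured from the edges of the 50 kbp anchor bin. The record layout and names are hypothetical and do not reflect the FitHiC output format.

```python
from typing import NamedTuple, Sequence, Tuple

BIN = 50_000              # loop anchor size (resolution stated in the text)
Q_MAX = 0.05              # q-value cutoff stated in the text
BREAKPOINT_PAD = 100_000  # exclusion distance around amp1 breakpoints

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open
Breakpoint = Tuple[str, int]     # (chromosome, position)


class Loop(NamedTuple):
    chrom_a: str
    start_a: int
    chrom_b: str
    start_b: int
    q_value: float


def overlaps(chrom: str, start: int, end: int,
             intervals: Sequence[Interval]) -> bool:
    """True if [start, end) on chrom overlaps any listed interval."""
    return any(c == chrom and start < e and s < end for c, s, e in intervals)


def near_breakpoint(chrom: str, start: int, end: int,
                    breakpoints: Sequence[Breakpoint]) -> bool:
    """True if any breakpoint lies within 100 kbp of the anchor bin."""
    return any(
        c == chrom and start - BREAKPOINT_PAD <= pos <= end + BREAKPOINT_PAD
        for c, pos in breakpoints
    )


def keep_loop(loop: Loop, amp1: Sequence[Interval], amp2: Sequence[Interval],
              amp1_breakpoints: Sequence[Breakpoint]) -> bool:
    """Apply the q-value, amp1-anchor, amp2 and breakpoint-distance filters."""
    if loop.q_value >= Q_MAX:
        return False
    anchors = [
        (loop.chrom_a, loop.start_a, loop.start_a + BIN),
        (loop.chrom_b, loop.start_b, loop.start_b + BIN),
    ]
    on_amp1 = [overlaps(*anchor, amp1) for anchor in anchors]
    if sum(on_amp1) != 1:
        return False
    for anchor, is_amp1 in zip(anchors, on_amp1):
        if overlaps(*anchor, amp2):
            return False
        if is_amp1 and near_breakpoint(*anchor, amp1_breakpoints):
            return False
    return True
```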
Pooled CRISPRi proliferation screen
The pooled CRISPRi proliferation screen was designed after a similar screen in glioblastoma cell lines13. In brief, this screen targeted all 645 accessible regions of the D458 ecDNA with 32,530 sgRNAs. Cultures of D458 (ecDNA+) and D283 (ecDNA−) cells were grown for 21 days and then sequenced to determine overrepresented and underrepresented sgRNAs. Further details are provided in Supplementary Note 5.
Targeted CRISPRi experiments
For CRISPRi experiments, D283 and D458 cells were lentivirally transduced with dCas9-KRAB-mCherry plasmid109 (Addgene, 60954) to express dCas9. Cells stably expressing dCas9 were FACS-sorted based on mCherry expression and transduced with sgRNA vectors. sgRNAs were cloned into the lentiGuide-puro plasmid (Addgene, 52963) (ref. 110). sgRNAs are listed in Supplementary Table 21. All plasmids were verified by Sanger sequencing. HEK293T cells (ATCC, CRL-3216) were used to generate lentiviral particles by cotransfecting the packaging vectors psPAX2 and pMD2.G using LipoD293 transfection reagent (SignaGen, SL100668).
Quantitative RT–PCR
Five days after sgRNA transduction, total cellular RNA was isolated from cell pellets using a Qiagen RNeasy Kit. iScript cDNA Synthesis Kit (Bio-Rad, 1708890) was used for reverse transcription into cDNA. Quantitative RT–PCR was performed in technical triplicate for two bioreplicates of each experimental condition on a Bio-Rad CFX384 Real-Time System using SYBR Green PCR Master Mix (Bio-Rad, 1725270). qPCR primers are listed in Supplementary Table 22.
Gene transcription was estimated using the delta delta Ct method (Exp, 2^(−ΔΔCt)) relative to actin. Testing for change in gene expression was performed using one-sided nested ANOVA with Dunnett’s multiple comparisons test, implemented in GraphPad Prism v9.5.2.
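The delta delta Ct calculation can be written out as a short Python function. The sketch assumes ideal amplification efficiency (a doubling per cycle), as the formula itself does; the function and argument names are hypothetical.

```python
def relative_expression(
    ct_target_treated: float,
    ct_reference_treated: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Relative expression 2^(-ddCt) of a target gene versus a reference gene."""
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** (-(delta_treated - delta_control))
```

Here the reference gene would be actin and the control condition a non-targeting sgRNA, so a value of 0.5 indicates a twofold reduction in target transcript relative to control.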
Biological material availability
PDX and cell line materials used in this study are available upon request. Patient tumor material used in this study is depleted and therefore not available.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41588-023-01551-3.
Acknowledgements
This work was delivered as part of the eDyNAmiC team supported by the Cancer Grand Challenges partnership funded by Cancer Research UK (CRUK) (P.S.M. and H.Y.C., CGCATF-2021/100012; V.B. and J.L., CGCATF-2021/100025) and the National Cancer Institute (P.S.M. and H.Y.C., OT2CA278688; V.B. and J.L., OT2CA278635). This work is supported by a generous endowment by the Clayes Foundation to the Research Center for Neuro-Oncology and Genomics within the Rady Children’s Institute for Genomic Medicine, a Hannah’s Heroes St. Baldrick’s Scholar Award (L.C.), a grant from The National Brain Tumor Society (P.S.M.), funding from the National Institutes of Health (NIH) National Institute of Neurological Disorders and Stroke Institute R35 NS122339 (R.J.W.R.), R01 NS132780 (L.C.), R21 NS130137 (L.C.), R21 NS116455 (L.C.) and R21 NS120075 (L.C.), the NIH National Cancer Institute R01 CA159859 (R.J.W.R.), R01 CA238249 (P.S.M.), R01-CA238379 (P.S.M.), U24 CA258406 (J.P.M. and J.T.R.), U24 CA210004 (J.P.M. and J.T.R.), U24 CA220341 (J.P.M.), U24 CA264379 (V.B., P.S.M. and J.P.M.), U01 CA184898 (J.P.M., E.F., S.L.P.), U01 CA253547 (J.P.M. and E.F.), F99 CA274692 (K.L.H.) and F31 CA271777 (O.S.C.), the NIH National Institute of General Medical Sciences R01 GM074024 (J.P.M.) and R01 GM114362 (V.B.), the NIH National Library of Medicine T15 LM011271 (O.S.C.), a Moores Cancer Center Pilot Grant (L.C., V.B., J.P.M. and P.S.M.), Hyundai Hope on Wheels (J.E.A.M., Y.J.C.) and a Stanford Graduate Fellowship (K.L.H.). Microscopy work was supported by funding from the NIH National Institute of Neurological Disorders and Stroke P30 NS047101 (University of California San Diego Microscopy Core). This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. DNA methylation array analysis was conducted at the IGM Genomics Center, University of California, San Diego, La Jolla, CA (P30 CA023100). 
This research was conducted using data made available by The Children’s Brain Tumor Network (formerly the Children’s Brain Tumor Tissue Consortium). H.Y.C. is an Investigator of the Howard Hughes Medical Institute. In addition, we thank I. A. Reyes, J. H. Zhang, C. McLeod and A. Resnick for facilitating data access; M. Reich and M. Tatineni for computational support; J. Olson, J. Weissman, R. Vibhakar and X.-N. Li for biosamples and materials; A. Pang (Bionano Genomics) for assistance running the Bionano Assembly pipeline; C.-C. Yang, C.-T. Huang, R. Murad, A. Morton and P. Scacheri for CRISPRi services and guidance; M. Kazachkova for exploratory analyses on the data; and A. Wenzel and M. Chapman for helpful scientific discussions.
Author contributions
O.S.C., J.L., J.P.M. and L.C. prepared the manuscript and figures. S. Wani, A.T., S.C., M.A., L.M., D.D., C.H., J.R.C., N.G.C., M.L.L., D.M.M., A.K., S.L.P., J.R.D., A.B., A.J.D., R.H.S. and E.F. performed sample preparation and experimental analysis. J.D.L., Y.Y.L. and R.J.W.R. contributed PDX mouse models. J.T.L., I.T.L.W. and P.S.M. performed all microscopy. K.L.H., B.J.H. and H.Y.C. performed CRISPR-CATCH. S. Wani, A.T., D.D., L.M., J.N.R., J.E.A.M., Y.J.C., A.B. and A.J.D. performed CRISPRi sample preparation, experiments and analysis. O.S.C., J.L., M.S.P., S. Wang, Y.L., A.D., C.G., S.S., J.T.R., S.R.D., G.P., U.R., E.J., X.Z., H.C. and V.B. contributed computational data analyses. O.S.C., J.P.M. and L.C. designed the study and J.P.M. and L.C. co-supervised the project.
Peer review
Peer review information
Nature Genetics thanks Anton Henssen, Jinghui Zhang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Data availability
WGS data from the ICGC, CBTN and St. Jude datasets are under controlled access as implemented by the respective organizations, but are available from the following sources upon reasonable request. ICGC and Archer patient cohorts: International Cancer Genome Consortium (https://dcc.icgc.org). Inclusion criteria were all medulloblastomas from datasets PEME-CA and PBCA-DE. CBTN patient cohort: Kids First Data Resource Center (https://kidsfirstdrc.org). Inclusion criteria were all medulloblastomas from dataset PBTA-CBTN as of March 2020. St. Jude patient cohort: St. Jude Cloud (https://www.stjude.cloud). Inclusion criteria were all medulloblastomas from the Pediatric Cancer Genome Project (PCGP, SJC-DS-1001) and Real-Time Clinical Genomics (RTCG, SJC-DS-1007) datasets as of March 2020. Rady Children’s Hospital patient cohort, medulloblastoma cell line and PDX models: SRA PRJNA1011359. OGM contigs: SRA PRJNA1011359. Other datasets referenced in this work: 1000 Genomes Common SNPs (that is, dbSNP b141; https://ftp.ncbi.nih.gov/snp); DepMap 21Q2 (https://depmap.org/portal/download/all); ENCODE Registry of cCREs v3 (https://screen.encodeproject.org). ATAC-seq, Hi-C, single-cell sequencing and pooled CRISPRi screen data are available at the NCBI Gene Expression Omnibus (GEO) under accession GSE240985. FISH images are available at 10.6084/m9.figshare.c.6759093. Source data are provided with this paper.
Code availability
Code for the AmpliconArchitect family of software tools is available from the following repositories: PrepareAA (https://github.com/jluebeck/PrepareAA); AmpliconArchitect (https://github.com/jluebeck/AmpliconArchitect); AmpliconClassifier (https://github.com/jluebeck/AmpliconClassifier). Code for the analysis and generation of the figures is available from the following repositories: analyses on clinical and bulk sequencing data (https://github.com/auberginekenobi/medullo-ecdna); and detection and quantification of ecDNA in single-cell ATAC-seq data (https://github.com/auberginekenobi/ecdna-quant). Other single-cell analyses: https://github.com/auberginekenobi/rcmb56-single-cell.
Competing interests
The authors declare the following competing interests: H.Y.C. is a co-founder of Accent Therapeutics, Boundless Bio, Cartography Biosciences, Orbital Therapeutics and is an advisor of 10x Genomics, Arsenal Biosciences, Chroma Medicine and Spring Discovery. P.S.M. is a co-founder, chairs the scientific advisory board (SAB) and has equity interest in Boundless Bio. P.S.M. is also an advisor with equity for Asteroid Therapeutics and is an advisor to Sage Therapeutics. V.B. is a co-founder, consultant, SAB member and has equity interest in Boundless Bio and Abterra. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. J.L. is a part-time consultant for Boundless Bio. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. J.T.L. is an employee of Boundless Bio. His employment began after his contributions to the manuscript. The remaining authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors jointly supervised this work: Jill P Mesirov, Lukas Chavez.
Extended data
is available for this paper at 10.1038/s41588-023-01551-3.
Supplementary information
The online version contains supplementary material available at 10.1038/s41588-023-01551-3.
References
- 1.Cox D, Yuncken C, Spriggs A. Minute chromatin bodies in malignant tumours of childhood. Lancet. 1965;286:55–58. doi: 10.1016/S0140-6736(65)90131-5. [DOI] [PubMed] [Google Scholar]
- 2.Turner KM, et al. Extrachromosomal oncogene amplification drives tumour evolution and genetic heterogeneity. Nature. 2017;543:122–125. doi: 10.1038/nature21356. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Kim H, et al. Extrachromosomal DNA is associated with oncogene amplification and poor outcome across multiple cancers. Nat. Genet. 2020;52:891–897. doi: 10.1038/s41588-020-0678-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Verhaak RGW, Bafna V, Mischel PS. Extrachromosomal oncogene amplification in tumour pathogenesis and evolution. Nat. Rev. Cancer. 2019;19:283–288. doi: 10.1038/s41568-019-0128-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Ståhl F, Wettergren Y, Levan G. Amplicon structure in multidrug-resistant murine cells: a nonrearranged region of genomic DNA corresponding to large circular DNA. Mol. Cell. Biol. 1992;12:1179–1187. doi: 10.1128/mcb.12.3.1179. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Morales C, et al. Dihydrofolate reductase amplification and sensitization to methotrexate of methotrexate-resistant colon cancer cells. Mol. Cancer Therapeutics. 2009;8:424–432. doi: 10.1158/1535-7163.MCT-08-0759. [DOI] [PubMed] [Google Scholar]
- 7.Nathanson DA, et al. Targeted therapy resistance mediated by dynamic regulation of extrachromosomal mutant EGFR DNA. Science. 2014;343:72–76. doi: 10.1126/science.1241328. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Nikolaev S, et al. Extrachromosomal driver mutations in glioblastoma and low-grade glioma. Nat. Commun. 2014;5:5690. doi: 10.1038/ncomms6690. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Koche RP, et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma. Nat. Genet. 2020;52:29–34. doi: 10.1038/s41588-019-0547-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Storlazzi CT, et al. MYC-containing double minutes in hematologic malignancies: evidence in favor of the episome model and exclusion of MYC as the target gene. Hum. Mol. Genet. 2006;15:933–942. doi: 10.1093/hmg/ddl010. [DOI] [PubMed] [Google Scholar]
- 11.Wu S, et al. Circular ecDNA promotes accessible chromatin and high oncogene expression. Nature. 2019;575:699–703. doi: 10.1038/s41586-019-1763-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Helmsauer K, et al. Enhancer hijacking determines extrachromosomal circular MYCN amplicon architecture in neuroblastoma. Nat. Commun. 2020;11:5823. doi: 10.1038/s41467-020-19452-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Morton AR, et al. Functional enhancers shape extrachromosomal oncogene amplifications. Cell. 2019;179:1330–1341.e1313. doi: 10.1016/j.cell.2019.10.039. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Salloum R, et al. Late morbidity and mortality among medulloblastoma survivors diagnosed across three decades: a report from the childhood cancer survivor study. J. Clin. Oncol. 2019;37:731–740. doi: 10.1200/JCO.18.00969.
- 15.Juraschka K, Taylor MD. Medulloblastoma in the age of molecular subgroups: a review. J. Neurosurg. Pediatr. 2019;24:353–363. doi: 10.3171/2019.5.PEDS18381.
- 16.Ryan SL, et al. MYC family amplification and clinical risk-factors interact to predict an extremely poor prognosis in childhood medulloblastoma. Acta Neuropathol. 2012;123:501–513. doi: 10.1007/s00401-011-0923-y.
- 17.Rausch T, et al. Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell. 2012;148:59–71. doi: 10.1016/j.cell.2011.12.013.
- 18.Northcott PA, et al. The whole-genome landscape of medulloblastoma subtypes. Nature. 2017;547:311–317. doi: 10.1038/nature22973.
- 19.Ramaswamy V, et al. Risk stratification of childhood medulloblastoma in the molecular era: the current consensus. Acta Neuropathol. 2016;131:821–831. doi: 10.1007/s00401-016-1569-6.
- 20.McLeod C, et al. St. Jude Cloud: a pediatric cancer genomic data-sharing ecosystem. Cancer Discov. 2021;11:1082–1099. doi: 10.1158/2159-8290.CD-20-1230.
- 21.Lilly JV, et al. The children’s brain tumor network (CBTN)—accelerating research in pediatric central nervous system tumors through collaboration and open science. Neoplasia. 2023;35:100846. doi: 10.1016/j.neo.2022.100846.
- 22.Campbell PJ, et al. Pan-cancer analysis of whole genomes. Nature. 2020;578:82–93. doi: 10.1038/s41586-020-1969-6.
- 23.Archer TC, et al. Proteomics, post-translational modifications, and integrative analyses reveal molecular heterogeneity within medulloblastoma subgroups. Cancer Cell. 2018;34:396–410.e8. doi: 10.1016/j.ccell.2018.08.004.
- 24.Deshpande V, et al. Exploring the landscape of focal amplifications in cancer using AmpliconArchitect. Nat. Commun. 2019;10:392. doi: 10.1038/s41467-018-08200-y.
- 25.Garancher A, et al. NRL and CRX define photoreceptor identity and reveal subgroup-specific dependencies in medulloblastoma. Cancer Cell. 2018;33:435–449.e6. doi: 10.1016/j.ccell.2018.02.006.
- 26.Wen J, et al. WIP1 modulates responsiveness to Sonic Hedgehog signaling in neuronal precursor cells and medulloblastoma. Oncogene. 2016;35:5552–5564. doi: 10.1038/onc.2016.96.
- 27.Morabito M, et al. An autocrine ActivinB mechanism drives TGFβ/Activin signaling in Group 3 medulloblastoma. EMBO Mol. Med. 2019;11:e9830. doi: 10.15252/emmm.201809830.
- 28.Lu X, et al. The type 2C phosphatase Wip1: an oncogenic regulator of tumor suppressor and DNA damage response pathways. Cancer Metastasis Rev. 2008;27:123–135. doi: 10.1007/s10555-008-9127-x.
- 29.Bellutti F, et al. CDK6 antagonizes p53-induced responses during tumorigenesis. Cancer Discov. 2018;8:884. doi: 10.1158/2159-8290.CD-17-0912.
- 30.Li FP, et al. A cancer family syndrome in twenty-four kindreds. Cancer Res. 1988;48:5358–5362.
- 31.Waszak SM, et al. Spectrum and prevalence of genetic predisposition in medulloblastoma: a retrospective genetic study and prospective validation in a clinical trial cohort. Lancet Oncol. 2018;19:785–798. doi: 10.1016/S1470-2045(18)30242-0.
- 32.Shoshani O, et al. Chromothripsis drives the evolution of gene amplification in cancer. Nature. 2021;591:137–141. doi: 10.1038/s41586-020-03064-z.
- 33.Umbreit NT, et al. Mechanisms generating cancer genome complexity from a single cell division error. Science. 2020;368:eaba0712. doi: 10.1126/science.aba0712.
- 34.Zhukova N, et al. Subgroup-specific prognostic implications of TP53 mutation in medulloblastoma. J. Clin. Oncol. 2013;31:2927–2935. doi: 10.1200/JCO.2012.48.5052.
- 35.Baron RM, Kenny DA. The moderator–mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J. Personal. Soc. Psychol. 1986;51:1173–1182. doi: 10.1037/0022-3514.51.6.1173.
- 36.McLendon R, et al. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008;455:1061–1068. doi: 10.1038/nature07385.
- 37.Rusert JM, et al. Functional precision medicine identifies new therapeutic candidates for medulloblastoma. Cancer Res. 2020;80:5393–5407. doi: 10.1158/0008-5472.CAN-20-1655.
- 38.Hung KL, et al. Targeted profiling of human extrachromosomal DNA by CRISPR-CATCH. Nat. Genet. 2022;54:1746–1754. doi: 10.1038/s41588-022-01190-0.
- 39.Lange JT, et al. The evolutionary dynamics of extrachromosomal DNA in human cancers. Nat. Genet. 2022;54:1527–1533. doi: 10.1038/s41588-022-01177-x.
- 40.Hung KL, et al. ecDNA hubs drive cooperative intermolecular oncogene expression. Nature. 2021;600:731–736. doi: 10.1038/s41586-021-04116-8.
- 41.Hao Y, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. doi: 10.1016/j.cell.2021.04.048.
- 42.Barbie DA, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009;462:108–112. doi: 10.1038/nature08460.
- 43.Northcott PA, et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature. 2014;511:428–434. doi: 10.1038/nature13379.
- 44.Buenrostro JD, et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature. 2015;523:486–490. doi: 10.1038/nature14590.
- 45.Rao SSP, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021.
- 46.Kumar P, et al. ATAC-seq identifies thousands of extrachromosomal circular DNA in cancer and cell lines. Sci. Adv. 2020;6:eaba2489. doi: 10.1126/sciadv.aba2489.
- 47.Toledo F, Wahl GM. MDM2 and MDM4: p53 regulators as targets in anticancer therapy. Int. J. Biochem. Cell Biol. 2007;39:1476–1482. doi: 10.1016/j.biocel.2007.03.022.
- 48.Fagerberg L, et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol. Cell. Proteom. 2014;13:397–406. doi: 10.1074/mcp.M113.035600.
- 49.Adamson DC, et al. OTX2 is critical for the maintenance and progression of Shh-independent medulloblastomas. Cancer Res. 2010. doi: 10.1158/0008-5472.CAN-09-2331.
- 50.Moore JE, et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature. 2020;583:699–710. doi: 10.1038/s41586-020-2493-4.
- 51.Lin CY, et al. Active medulloblastoma enhancers reveal subgroup-specific cellular origins. Nature. 2016;530:57–62. doi: 10.1038/nature16546.
- 52.Li W, et al. Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR. Genome Biol. 2015;16:281. doi: 10.1186/s13059-015-0843-6.
- 53.Wortham M, et al. Chromatin accessibility mapping identifies mediators of basal transcription and retinoid-induced repression of OTX2 in medulloblastoma. PLoS One. 2014;9:e107156. doi: 10.1371/journal.pone.0107156.
- 54.Robinson GW, et al. Vismodegib exerts targeted efficacy against recurrent Sonic Hedgehog-subgroup medulloblastoma: results from phase II pediatric brain tumor consortium studies PBTC-025B and PBTC-032. J. Clin. Oncol. 2015;33:2646–2654. doi: 10.1200/JCO.2014.60.1591.
- 55.Luebeck J, et al. Extrachromosomal DNA in the cancerous transformation of Barrett’s oesophagus. Nature. 2023;616:789–805. doi: 10.1038/s41586-023-05937-5.
- 56.Zhu Y, et al. Oncogenic extrachromosomal DNA functions as mobile enhancers to globally amplify chromosomal transcription. Cancer Cell. 2021;39:694–707.e7. doi: 10.1016/j.ccell.2021.03.006.
- 57.deCarvalho AC, et al. Discordant inheritance of chromosomal and extrachromosomal DNA elements contributes to dynamic disease evolution in glioblastoma. Nat. Genet. 2018;50:708–717. doi: 10.1038/s41588-018-0105-0.
- 58.Xu K, et al. Structure and evolution of double minutes in diagnosis and relapse brain tumors. Acta Neuropathol. 2019;137:123–137. doi: 10.1007/s00401-018-1912-1.
- 59.Kaufman RJ, Brown PC, Schimke RT. Amplified dihydrofolate reductase genes in unstably methotrexate-resistant cells are associated with double minute chromosomes. Proc. Natl Acad. Sci. USA. 1979;76:5669–5673. doi: 10.1073/pnas.76.11.5669.
- 60.Meng X, et al. Novel role for non-homologous end joining in the formation of double minutes in methotrexate-resistant colon cancer cells. J. Med. Genet. 2015;52:135. doi: 10.1136/jmedgenet-2014-102703.
- 61.Sondka Z, et al. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer. 2018;18:696–705. doi: 10.1038/s41568-018-0060-1.
- 62.Ghandi M, et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature. 2019;569:503–508. doi: 10.1038/s41586-019-1186-3.
- 63.Hsu JY, et al. CRISPR-SURF: discovering regulatory elements by deconvolution of CRISPR tiling screen data. Nat. Methods. 2018;15:992–993. doi: 10.1038/s41592-018-0225-6.
- 64.Virtanen P, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods. 2020;17:261–272. doi: 10.1038/s41592-019-0686-2.
- 65.Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Methodol. 1995;57:289–300.
- 66.Seabold S, Perktold J. Statsmodels: econometric and statistical modeling with Python. In Proc. 9th Python in Science Conference (SCIPY 2010) (eds van der Walt S, Millman J). 2010. p. 92–96.
- 67.Talevich E, Shain AH, Botton T, Bastian BC. CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLoS Comput. Biol. 2016;12:e1004873. doi: 10.1371/journal.pcbi.1004873.
- 68.Fairley S, Lowy-Gallego E, Perry E, Flicek P. The International Genome Sample Resource (IGSR) collection of open human genomic variation resources. Nucleic Acids Res. 2020;48:D941–D947. doi: 10.1093/nar/gkz836.
- 69.Ivanov DP, Coyle B, Walker DA, Grabowska AM. In vitro models of medulloblastoma: choosing the right tool for the job. J. Biotechnol. 2016;236:10–25. doi: 10.1016/j.jbiotec.2016.07.028.
- 70.Robinson G, et al. Novel mutations target distinct subgroups of medulloblastoma. Nature. 2012;488:43–48. doi: 10.1038/nature11213.
- 71.Northcott PA, et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature. 2012;488:49–56. doi: 10.1038/nature11327.
- 72.Gendoo DMA, Haibe-Kains B. MM2S: personalized diagnosis of medulloblastoma patients and model systems. Source Code Biol. Med. 2016;11:6. doi: 10.1186/s13029-016-0053-y.
- 73.Lin MF, et al. GLnexus: joint variant calling for large cohort sequencing. Preprint at bioRxiv. 2018. doi: 10.1101/343970.
- 74.Ioannidis NM, et al. REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 2016;99:877–885. doi: 10.1016/j.ajhg.2016.08.016.
- 75.Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019;47:D886–D894. doi: 10.1093/nar/gky1016.
- 76.McLaren W, et al. The Ensembl Variant Effect Predictor. Genome Biol. 2016;17:122. doi: 10.1186/s13059-016-0974-4.
- 77.Karczewski KJ, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–443. doi: 10.1038/s41586-020-2308-7.
- 78.Davidson-Pilon C. lifelines: survival analysis in Python. J. Open Source Softw. 2019;4:1317. doi: 10.21105/joss.01317.
- 79.Liu G, Piantadosi S. Ridge estimation in generalized linear models and proportional hazards regressions. Commun. Stat. Theory Methods. 2017;46:11466–11479. doi: 10.1080/03610926.2016.1267767.
- 80.Verweij PJM, Van Houwelingen HC. Penalized likelihood in Cox regression. Stat. Med. 1994;13:2427–2436. doi: 10.1002/sim.4780132307.
- 81.Xue X, Kim MY, Shore RE. Cox regression analysis in presence of collinearity: an application to assessment of health risks associated with occupational radiation exposure. Lifetime Data Anal. 2007;13:333–350. doi: 10.1007/s10985-007-9045-1.
- 82.Lapointe-Shaw L, et al. Mediation analysis with a time-to-event outcome: a review of use and reporting in healthcare research. BMC Med. Res. Method. 2018;18:118. doi: 10.1186/s12874-018-0578-7.
- 83.Luebeck J, et al. AmpliconReconstructor integrates NGS and optical mapping to resolve the complex structures of focal amplifications. Nat. Commun. 2020;11:4374. doi: 10.1038/s41467-020-18099-z.
- 84.Raeisi Dehkordi S, Luebeck J, Bafna V. FaNDOM: fast nested distance-based seeding of optical maps. Patterns. 2021;2:100248. doi: 10.1016/j.patter.2021.100248.
- 85.Yang L, et al. NuSeT: a deep learning tool for reliably separating and analyzing crowded cells. PLoS Comput. Biol. 2020;16:e1008193. doi: 10.1371/journal.pcbi.1008193.
- 86.Rajkumar U, et al. EcSeg: semantic segmentation of metaphase images containing extrachromosomal DNA. iScience. 2019;21:428–435. doi: 10.1016/j.isci.2019.10.035.
- 87.Nuclei isolation from complex tissues for single cell multiome ATAC + gene expression sequencing (10x Genomics, 2021).
- 88.McGinnis CS, Murrow LM, Gartner ZJ. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 2019;8:329–337.e4. doi: 10.1016/j.cels.2019.03.003.
- 89.Hafemeister C, Satija R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 2019;20:296. doi: 10.1186/s13059-019-1874-1.
- 90.Karlsson M, et al. A single–cell type transcriptomics map of human tissues. Sci. Adv. 2021;7:eabh2169. doi: 10.1126/sciadv.abh2169.
- 91.Tickle T, Tirosh I, Georgescu C, Brown M, Haas B. inferCNV of the Trinity CTAT project. 2019. https://github.com/broadinstitute/inferCNV.
- 92.Robinson JT, et al. Integrative Genomics Viewer. Nat. Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754.
- 93.Ramírez F, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–W165. doi: 10.1093/nar/gkw257.
- 94.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 95.Kent WJ, Zweig AS, Barber G, Hinrichs AS, Karolchik D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics. 2010;26:2204–2207. doi: 10.1093/bioinformatics/btq351.
- 96.Stovner EB, Sætrom P. PyRanges: efficient comparison of genomic intervals in Python. Bioinformatics. 2020;36:918–919. doi: 10.1093/bioinformatics/btz615.
- 97.North BV, Curtis D, Sham PC. A note on the calculation of empirical P values from Monte Carlo procedures. Am. J. Hum. Genet. 2002;71:439–441. doi: 10.1086/341527.
- 98.Reich M, et al. GenePattern 2.0. Nat. Genet. 2006;38:500–501. doi: 10.1038/ng0506-500.
- 99.Waskom M. Seaborn: statistical data visualization. J. Open Source Softw. 2021;6:3021. doi: 10.21105/joss.03021.
- 100.Zhang Y, et al. Model-based analysis of ChIP-seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
- 101.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
- 102.Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
- 103.Servant N, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16:259. doi: 10.1186/s13059-015-0831-x.
- 104.Durand NC, et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 2016;3:99–101. doi: 10.1016/j.cels.2015.07.012.
- 105.Knight PA, Ruiz D. A fast algorithm for matrix balancing. IMA J. Numer. Anal. 2013;33:1029–1047. doi: 10.1093/imanum/drs019.
- 106.Durand NC, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–98. doi: 10.1016/j.cels.2016.07.002.
- 107.Kaul A, Bhattacharyya S, Ay F. Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2. Nat. Protoc. 2020;15:991–1012. doi: 10.1038/s41596-019-0273-0.
- 108.Krzywinski M, et al. An information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645. doi: 10.1101/gr.092759.109.
- 109.Gilbert LA, et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell. 2014;159:647–661. doi: 10.1016/j.cell.2014.09.029.
- 110.Dixit D, et al. The RNA m6A reader YTHDF2 maintains oncogene expression and is a targetable dependency in glioblastoma stem cells. Cancer Discov. 2021;11:480–499. doi: 10.1158/2159-8290.CD-20-0331.
Data Availability Statement
WGS data from the ICGC, CBTN and St. Jude datasets are under controlled access as implemented by the respective organizations, but are available from the following sources upon reasonable request. ICGC and Archer patient cohorts: International Cancer Genome Consortium (https://dcc.icgc.org). Inclusion criteria were all medulloblastomas from datasets PEME-CA and PBCA-DE. CBTN patient cohort: Kids First Data Resource Center (https://kidsfirstdrc.org). Inclusion criteria were all medulloblastomas from dataset PBTA-CBTN as of March 2020. St. Jude patient cohort: St. Jude Cloud (https://www.stjude.cloud). Inclusion criteria were all medulloblastomas from the Pediatric Cancer Genome Project (PCGP, SJC-DS-1001) and Real-Time Clinical Genomics (RTCG, SJC-DS-1007) datasets as of March 2020. Rady Children’s Hospital patient cohort, medulloblastoma cell line and PDX models, and OGM contigs: SRA PRJNA1011359. Other datasets referenced in this work: 1000 Genomes Common SNPs (that is, dbSNP b141; https://ftp.ncbi.nih.gov/snp); DepMap 21Q2 (https://depmap.org/portal/download/all); ENCODE Registry of cCREs v3 (https://screen.encodeproject.org). ATAC-seq, Hi-C, single-cell sequencing and pooled CRISPRi screen data are available at the NCBI Gene Expression Omnibus (GEO) under accession GSE240985. FISH images are available at https://doi.org/10.6084/m9.figshare.c.6759093. Source data are provided with this paper.
Code for the AmpliconArchitect family of software tools is available from the following repositories: PrepareAA (https://github.com/jluebeck/PrepareAA); AmpliconArchitect (https://github.com/jluebeck/AmpliconArchitect); AmpliconClassifier (https://github.com/jluebeck/AmpliconClassifier). Code for the analysis and generation of the figures is available from the following repositories: analyses on clinical and bulk sequencing data (https://github.com/auberginekenobi/medullo-ecdna); and detection and quantification of ecDNA in single-cell ATAC-seq data (https://github.com/auberginekenobi/ecdna-quant). Other single-cell analyses: https://github.com/auberginekenobi/rcmb56-single-cell.