eLife. 2022 Dec 7;11:e80207. doi: 10.7554/eLife.80207

Oncogene expression from extrachromosomal DNA is driven by copy number amplification and does not require spatial clustering in glioblastoma stem cells

Karin Purshouse 1,2, Elias T Friman 1, Shelagh Boyle 1, Pooran Singh Dewari 2, Vivien Grant 2, Alhafidz Hamdan 2, Gillian M Morrison 2, Paul M Brennan 2,3, Sjoerd V Beentjes 1,4, Steven M Pollard 2, Wendy A Bickmore 1,†
Editors: Jessica K Tyler5, Jessica K Tyler6
PMCID: PMC9728993  PMID: 36476408

Abstract

Extrachromosomal DNA (ecDNA) are frequently observed in human cancers and are responsible for high levels of oncogene expression. In glioblastoma (GBM), ecDNA copy number correlates with poor prognosis. It is hypothesized that their copy number, size, and chromatin accessibility facilitate clustering of ecDNA and colocalization with transcriptional hubs, and that this underpins their elevated transcriptional activity. Here, we use super-resolution imaging and quantitative image analysis to evaluate GBM stem cells harbouring distinct ecDNA species (EGFR, CDK4, PDGFRA). We find no evidence that ecDNA routinely cluster with one another or closely interact with transcriptional hubs. Cells with EGFR-containing ecDNA have increased EGFR transcriptional output, but transcription per gene copy is similar in ecDNA compared to the endogenous chromosomal locus. These data suggest that it is the increased copy number of oncogene-harbouring ecDNA that primarily drives high levels of oncogene transcription, rather than specific interactions of ecDNA with each other or with high concentrations of the transcriptional machinery.

Research organism: Human

Introduction

Glioblastoma (GBM) is characterized by intra-tumoural heterogeneity and stem cell-like properties that underpin treatment resistance and poor prognosis (Bulstrode et al., 2017; Suvà et al., 2014). GBM is divided into distinct transcriptional subtypes that span a continuum of stem cell/developmental and injury response/immune evasion cell states (Richards et al., 2021; Verhaak et al., 2010; Wang et al., 2021). Genetically, EGFR (chr7) is activated or amplified in almost two-thirds of GBM (Brennan et al., 2013). Other commonly amplified genes include PDGFRA (chr4), CDK4, MDM2 (chr12), MET, and CDK6 (chr7), with multicopy extrachromosomal DNA (ecDNA) considered a major mechanism for oncogene amplification (Brennan et al., 2013; Kim et al., 2020; Snuderl et al., 2011; Szerlip et al., 2012).

Although a long-recognized feature of cancer (Cox et al., 1965), ecDNA are particularly common in GBM, with 90% of patient-derived GBM tumour models harbouring ecDNA (Turner et al., 2017). However, there is much broader interest in mechanisms of ecDNA function across many solid tumours, as ecDNA enable rapid oncogene amplification in response to selective pressures, and have been shown to correlate with poor prognosis and treatment resistance (Kim et al., 2020; Nathanson et al., 2014; Vicario et al., 2015). EcDNA are centromere-free DNA circles of around 1–3 Mb in size that frequently exist as doublets (double minutes), but also as single elements (Hamkalo et al., 1985; Verhaak et al., 2019; Vogt et al., 2004). EcDNA can be composed of multiple genetic fragments generated as a result of chromothripsis (Gibaud et al., 2010; Shoshani et al., 2021; Rosswog et al., 2021). Although ecDNA were previously identified in 1.4% of cancers, more recent studies have shown their prevalence to be significantly higher (Fan et al., 2011; Kim et al., 2020; Turner et al., 2017). EcDNA can lead to oncogene copy number being amplified to >100 in any given cell, with significant copy number heterogeneity between cells (Lange et al., 2022; Turner et al., 2017). Freed from the constraints imposed by being embedded within a chromosome, ecDNA have spatial freedom and can adapt to targeted therapeutics (Lange et al., 2022; Nathanson et al., 2014). For example, the EGFR variant EGFRvIII (exon 2–7 deletion) is found on ecDNA, and is associated with an aggressive disease course and resistance mechanisms against EGFR inhibitors (Brennan et al., 2013; Inda et al., 2010; Nathanson et al., 2014; Turner et al., 2017).

As well as their resident oncogenes, ecDNA also harbour regulatory elements (enhancers) required to drive oncogene expression (Morton et al., 2019; Zhu et al., 2021). Consistent with this, ecDNA have been found to have regions of largely accessible chromatin (assayed by ATAC-seq), indicative of nucleosome displacement by bound transcription factors, and to be decorated with histone modifications associated with active chromatin (Wu et al., 2019). Transcription factors densely co-bound at enhancers have been suggested to nucleate condensates or ‘hubs’ (Cho et al., 2018; Rai et al., 2018; Strom and Brangwynne, 2019), enriched with key transcriptional components such as mediator and RNA polymerase II (PolII) to drive high levels of gene expression (Cho et al., 2018; Chong et al., 2018; Sabari et al., 2018). Given the colocation of enhancers and driver oncogenes on ecDNA, it has therefore been suggested that ecDNA cluster together in the nucleus, driving the recruitment of a high concentration of RNA PolII and creating ecDNA-driven nuclear hubs that in turn enhance the transcriptional output from ecDNA (Adelman and Martin, 2021; Hung et al., 2021; Yi et al., 2021; Zhu et al., 2021).

Here, using super-resolution imaging of primary GBM cell lines, we find that ecDNA are widely dispersed throughout the nucleus and we find neither evidence of ecDNA clustering together nor any significant spatial overlap between ecDNA and large PolII hubs. As expected, we show that expression from genes on ecDNA, both at mRNA and protein level, correlates with ecDNA copy number in the tumour cell lines. However, transcription of genes present on each individual ecDNA molecule appears to occur at a similar efficiency (transcripts per copy number) to that of the equivalent endogenous chromosomally located gene. These data suggest that it is primarily the increased copy number of ecDNA in GBM stem cells, and not a specific property of nuclear colocalization, that drives the increased transcriptional capacity of their resident oncogenes.

Results

EcDNA are more frequently located centrally in the nucleus in GBM stem cells

We characterized two GBM-derived glioma stem cell (GSC) primary cell lines containing multiple EGFR-harbouring ecDNA (ecEGFR) populations (GCGR-E26 and GCGR-E28, referred to here as E26 and E28). Whole genome sequencing (WGS) analysis using AmpliconArchitect (Deshpande et al., 2019) indicated that E26 ecDNA harbour an EGFRvIII (exon 2–7 deletion), and E28 have a subpopulation of ecDNA with EGFR exon 7–14 deleted (Figure 1A). The presence of EGFR on ecDNA was confirmed by DNA FISH on metaphase spreads (Figure 1B and C). E26 harboured more ecDNA per cell than E28 (Figure 1D), with approximately 10% of metaphases also indicating the presence of a chromosomal homogeneously staining region (HSR) (Figure 1B; arrow). Endogenous EGFR is located on human chromosome 7, and metaphase spreads of the two tumour lines showed 3–6 copies of chromosome 7 in E26 and frequently 3 copies in E28 (Figure 1E).

Figure 1. The nuclear localization of extrachromosomal DNA (ecDNA) in glioblastoma (GBM) cell lines.

(A) Whole genome sequencing (WGS) and AmpliconArchitect analysis for ecDNA regions for E26 and E28 cell lines showing an EGFR exon 2–7 deletion in all ecDNA in E26 cells (seen in WGS and AmpliconArchitect regions a and b), and a subpopulation of ecDNA in E28 with a deletion across EGFR exons 7–14 (seen in WGS and AmpliconArchitect region a – no deletion in E28 AmpliconArchitect region b). Genome coordinates (bp) are from the hg38 assembly of the human genome. (B) DNA FISH on metaphase spread of the E26 cell line showing EGFR (green) present on ecDNA, and on a homogeneously staining region (HSR) (arrowed) detected in ~10% of metaphases. Scale bar = 10 μm. (C) As for (B) but for the E28 cell line. (D) Violin plot of the number of EGFR DNA FISH signals per metaphase spread of E26 and E28 cells. Median and quartiles are shown. ** p=0.008 (Mann-Whitney test). Median values are 51 (E26) and 12 (E28), n=25 (E26) and 24 (E28) spreads. (E and F) Representative DNA FISH images of metaphase spread (E) and 2D nuclei (F) for neural stem cell (NSC), E26, and E28 cells showing signals for chromosome 7 (red) and EGFR (green). Blue = DNA (DAPI). Scale bar = 10 μm. The five erosion bins from the periphery to the centre of the nucleus are shown in F. (G) EGFR FISH signal intensity normalized to that for chromosome 7 (EGFR:Chr7 Mean Intensity) across five bins of equal area eroded from the periphery (Bin 1) to the centre (Bin 5) of the nucleus for NSC, E26, and E28 cell lines. Median and quartiles shown. **** p<0.0001, * p<0.05. Kruskal-Wallis test. EGFR and chr7 signal normalized to DAPI shown in Figure 1—figure supplement 1. n=66 (NSC), 59 (E26), 64 (E28) nuclei. Statistical data relevant for this figure are in Figure 1—source data 1.

Figure 1—source data 1. Statistical data for Figure 1 and Figure 1—figure supplement 1.
Median values for number of EGFR DNA FISH signals per metaphase spread for E26 and E28 cell lines. Data are for Figure 1D. Mean Chr7 (Texas Red) and EGFR (FITC) DNA FISH signal intensity in bins eroded from the periphery (1) to the centre (5) of the nucleus of neural stem cell (NSC), E26 and E28 cells. p-Values from Kruskal-Wallis test. Data are for Figure 1G and Figure 1—figure supplement 1.

Figure 1—figure supplement 1. Additional EGFR and chromosome 7 signal intensity data.

Radial distribution, normalized to DAPI, across bins of equal area eroded from the edge (1) to the centre (5) of the nucleus for (A) chromosome 7 (TxR mean normalized intensity per nucleus) or (B) EGFR (FITC mean normalized intensity per nucleus) hybridization signals.
Median and quartiles are shown. Statistical significance was examined by Kruskal-Wallis test. **** p<0.0001. Statistical data relevant for this figure are in Figure 1—source data 1.

Human chromosomes have non-random nuclear organization, with active regions preferentially located towards the central regions of the nucleus (Boyle et al., 2001; Croft et al., 1999). We sought to determine the nuclear localization of ecDNA in GBM cell lines as compared with the endogenous chromosomal EGFR. DNA FISH for chromosome 7 and EGFR in nuclei from human fetal neural stem cells (NSCs) confirmed the trend for human chromosome 7 to be generally found towards the periphery of the nucleus (Boyle et al., 2001; Figure 1F and G, Figure 1—figure supplement 1, Figure 1—source data 1). Signal intensity analysis for equally sized bins eroded from the edge to the centre of each nucleus indicated that chromosome 7 and EGFR signal intensity were preferentially located towards the nuclear periphery in each cell line (Figure 1—figure supplement 1, Figure 1—source data 1). Even once chromosome 7 signal was accounted for, EGFR DNA FISH signal was still highest at the periphery of NSC nuclei and lowest in the central regions (p<0.0001) (Figure 1G), likely reflecting the centromere proximal localization of endogenous EGFR on chromosome 7 (Carvalho et al., 2001). This radial organization was still significant (p=0.012), but much less marked, in E28 cells, which have on average a modest number of EGFR ecDNA compared to endogenous copies (Figure 1D). In E26 cells, which have a very high copy number of ecDNA, this preference for a more peripheral localization is lost (p=0.06). These data suggest that, freed of the constraints on nuclear localization imposed by human chromosome 7, EGFR genes located on ecDNA can access more central regions of the nucleus.
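
As a rough illustration of the radial analysis described above, the following sketch assigns pixels to five equal-area bins and computes an EGFR:chr7 intensity ratio per bin. It assumes an idealised circular nucleus, so that the bins become concentric rings, whereas the analysis reported here erodes bins from the edge of each segmented nucleus. The function names and pixel values are hypothetical.

```python
import math

def equal_area_bin(r, nucleus_radius, n_bins=5):
    """Return the bin of a pixel at distance r from the nuclear centre.

    Bin 1 is the outermost ring (periphery), bin n_bins is the centre.
    Assumes an idealised circular nucleus (illustration only).
    """
    # Fraction of the nuclear area lying closer to the centre than r.
    frac_area_inside = (r / nucleus_radius) ** 2
    # 0 = innermost ring; clamp so that r == nucleus_radius stays in range.
    k = min(int(frac_area_inside * n_bins), n_bins - 1)
    return n_bins - k

def ratio_by_bin(pixels, nucleus_radius, n_bins=5):
    """pixels: iterable of (r, egfr_intensity, chr7_intensity).

    Returns {bin: summed EGFR / summed chr7}; NaN where a bin has no chr7 signal.
    """
    sums = {b: [0.0, 0.0] for b in range(1, n_bins + 1)}
    for r, egfr, chr7 in pixels:
        b = equal_area_bin(r, nucleus_radius, n_bins)
        sums[b][0] += egfr
        sums[b][1] += chr7
    return {b: (e / c if c > 0 else float("nan")) for b, (e, c) in sums.items()}

# Hypothetical pixels: (distance from centre in um, EGFR signal, chr7 signal)
pixels = [(9.5, 2.0, 4.0), (9.0, 1.0, 2.0), (1.0, 6.0, 2.0)]
ratios = ratio_by_bin(pixels, nucleus_radius=10.0)
```

In this toy input the ratio is lower in the peripheral bin than in the central bin, which is the direction of change one would expect when ecDNA-borne copies are not tethered to the chromosome 7 territory.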

EGFR-containing ecDNA in GBM stem cells do not cluster in the nucleus

It has been suggested that ecDNA cluster into ‘ecDNA hubs’ within nuclei of cancer cells, including for EGFRvIII-containing ecDNA in other GBM cell lines (HK359 and GBM39) (Hung et al., 2021; Yi et al., 2021). We sought to quantify this using our E26 and E28 GBM cells with a single oncogene-harbouring ecDNA population (EGFR variant amplicons). Previous studies exploring genomic loci proximity and contact domains (Williamson et al., 2016; Williamson et al., 2019; Hansen et al., 2021), and the proximity of super-enhancers to BRD4/MED1 puncta (Sabari et al., 2018), would suggest that ecDNAs clustering together at a transcriptional hub should be located within ~200 nm or less of each other. We used 3D image-based analysis of the EGFR DNA FISH signals (Figure 2A) to determine if there is clustering of ecDNA. The relative frequency of all shortest EGFR-EGFR distances per nucleus did not suggest frequent ecDNA-ecDNA interactions at ≤200 nm in either cell line (Figure 2B, Figure 2—figure supplement 1A). The mean shortest interprobe distances per nucleus were also not suggestive of close interactions, with no values <500 nm (Figure 2—figure supplement 1B, C; Figure 2—source data 1). The single shortest interprobe distance per nucleus was also larger (0.24 μm, E26; 0.25 μm, E28) than would be expected if there were clustering of ecDNA in the close proximity required for coordinated transcription in hubs (Figure 2—figure supplement 1D, E; Figure 2—source data 1).
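
The distance summaries used above (all shortest distances, the mean shortest distance per nucleus, and the single shortest distance per nucleus) can be sketched as follows. This is a minimal illustration that assumes foci have already been reduced to 3D centroid coordinates in micrometres; the function names and coordinates are hypothetical and do not reproduce the image analysis pipeline used here.

```python
import math

def shortest_distances(points):
    """For each focus, the Euclidean distance to its nearest other focus."""
    out = []
    for i, p in enumerate(points):
        nearest = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(nearest)
    return out

def summarize_nucleus(points, threshold_um=0.2):
    """Per-nucleus summaries; threshold_um mirrors the ~200 nm hub scale."""
    d = shortest_distances(points)
    return {
        "mean_shortest": sum(d) / len(d),
        "single_shortest": min(d),
        "fraction_within_threshold": sum(x <= threshold_um for x in d) / len(d),
    }

# Hypothetical focus centroids (x, y, z) in micrometres for one nucleus
foci = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.3, 0.4, 0.0), (2.0, 2.0, 1.0)]
summary = summarize_nucleus(foci)
```

The brute-force pairwise search is quadratic in the number of foci, which is adequate for the tens to low hundreds of foci per nucleus considered here.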

Figure 2. EGFR-containing extrachromosomal DNA (ecDNA) do not cluster in the nucleus.

(A) Representative images shown as maximum intensity projection of DNA FISH for EGFR (red) in the nuclei of E26 (top) and E28 (bottom) glioblastoma (GBM) cell lines, scale bar = 1 μm. (B) Cumulative frequency distribution of shortest EGFR-EGFR distances between all foci in each nucleus across all E26 (n=37) and E28 (n=36) nuclei. Dotted line = 200 nm. (C) (Top) Representative maximum intensity projection images of EGFR DNA FISH (red) in nuclei of E26 and E28 cells (blue=DNA). Scale bar = 5 μm. (Bottom) Associated 3D Ripley’s K function for these nuclei showing observed K function (red), max/min/median (black) of 10,000 null samples with p=0.05 significance cut-off shown (empty black circle). (D) Ripley’s K function for EGFR DNA FISH signals showing number of E26 (n=12) and E28 (n=8) nuclei with significant and non-significant clustering at each given radius. p-values were calculated using the Neyman-Pearson lemma with optimistic estimate p-value where required (see Materials and methods), and the Benjamini-Hochberg procedure (BHP, FDR = 0.05).
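
The Benjamini-Hochberg step named in (D) can be sketched as follows. This is a generic implementation of the procedure at FDR = 0.05 and assumes, for illustration only, one p-value per tested radius; how p-values are grouped in the actual analysis is described in Materials and methods and is not reproduced here. The p-values are invented.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for radii 0.1, 0.2, ..., 1.0 um in one nucleus
p_by_radius = [0.9, 0.6, 0.4, 0.03, 0.02, 0.004, 0.003, 0.002, 0.001, 0.0005]
significant = benjamini_hochberg(p_by_radius)
```

With these invented values the three smallest radii remain non-significant after correction, while the larger radii are flagged.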

Figure 2—source data 1. Statistical data for Figure 2—figure supplement 1.
Statistical analysis of data for Figure 2—figure supplement 1B-E. Mean shortest interprobe distance and shortest interprobe distance in E26 and E28 cell lines. EGFR-EGFR interprobe distance (μm) = median values shown. The statistical significance of the data distributions between E26 and E28 were assessed with a Mann-Whitney test. n = number of nuclei.

Figure 2—figure supplement 1. Additional analysis of EGFR-EGFR distances in E26 and E28 cell lines.

(A) Cumulative frequency distribution of shortest EGFR-EGFR distances between all foci in each nucleus across all E26 and E28 nuclei with >10 EGFR foci. (B) Violin plots showing the distribution of mean shortest interprobe distance between EGFR foci per nucleus in E26 and E28 cell lines. Dotted line denotes y=200 nm. Number of nuclei (n): E26 = 37, E28 = 36. (C) As for (B) but only showing nuclei with >10 EGFR foci. Number of nuclei (n): E26 = 18, E28 = 9. (D) As for (B) but for shortest single distance between two EGFR foci in any nucleus. (E) As for (D) but only showing nuclei with >10 EGFR foci. Statistical significance examined by Mann-Whitney test. ns = not significant, * p<0.05, ** p<0.01. Statistical data are detailed in Figure 2—source data 1. (F) Ripley’s K function for E26 and E28 nuclei showing number of nuclei with significant and non-significant clustering at each given radius, with spot diameter = 150 nm.

The analysis above quantified distances between FISH hybridization signals but does not determine whether there is a non-random distribution of foci in the nuclei at distances in keeping with transcription hubs. We therefore used 3D Ripley’s K function to determine the observed spatial pattern of the foci in each nucleus and compared this with a random null distribution of 10,000 simulations of the same number of foci in the same volume. We powered this to identify any significant clustering at each radius in 0.1 μm increments between 0.1 and 1 μm (examples of E26 and E28 nuclei and their corresponding Ripley’s K function in Figure 2C). The E26 cell line had some nuclei with significant non-random distribution of ecDNA, but only at ≥400 nm radial distances, and E28 only had occasional nuclei with significant non-random distribution of ecDNA at ≥700 nm (Figure 2D). We repeated this analysis, reducing the focus spot size from 300 to 150 nm diameter to ensure no small FISH foci were omitted that might skew our analysis. No significant clustering was observed at <300 nm (Figure 2—figure supplement 1F).
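
The logic of comparing an observed 3D Ripley's K value against simulated random point patterns can be sketched as below. This is a simplified illustration under several stated assumptions: the nucleus is approximated by a cube of hypothetical side length, no edge correction is applied, only 200 simulations are drawn rather than 10,000, and the p-value is a plain rank-based Monte Carlo estimate rather than the Neyman-Pearson-based calculation with Benjamini-Hochberg correction used in this study. All names and coordinates are hypothetical.

```python
import math
import random

def ripley_k(points, radius, volume):
    """Naive 3D Ripley's K estimate at one radius (no edge correction)."""
    n = len(points)
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                pairs += 1
    return volume * pairs / (n * (n - 1))

def random_points(n, side, rng):
    """n points drawn uniformly in a cube of the given side (assumed volume)."""
    return [(rng.uniform(0, side), rng.uniform(0, side), rng.uniform(0, side))
            for _ in range(n)]

def monte_carlo_p(points, radius, side, n_sim=200, seed=0):
    """Observed K and a one-sided rank-based p-value for clustering."""
    rng = random.Random(seed)
    volume = side ** 3
    observed = ripley_k(points, radius, volume)
    exceed = sum(
        ripley_k(random_points(len(points), side, rng), radius, volume) >= observed
        for _ in range(n_sim)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_sim + 1)

# Hypothetical tightly clustered foci (all within 0.09 um of each other)
clustered = [(2.5 + 0.01 * i, 2.5, 2.5) for i in range(10)]
# Hypothetical dispersed foci on a 1 um grid
dispersed = [(x, y, 1.0) for x in (1.0, 2.0, 3.0) for y in (1.0, 2.0, 3.0)]

k_clustered, p_clustered = monte_carlo_p(clustered, radius=0.2, side=5.0)
k_dispersed, p_dispersed = monte_carlo_p(dispersed, radius=0.2, side=5.0)
```

Repeating the call for each radius from 0.1 to 1 μm gives one p-value per radius per nucleus, which is the form of result summarized in Figure 2D.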

Different ecDNA populations do not cluster in the nucleus of GBM stem cells

To ensure that multiple ecDNAs are not so tightly clustered that they cannot be resolved by FISH, we analysed another primary GBM cell line (E25) which has two different oncogenes carried on separate ecDNA populations: CDK4 and PDGFRA (Figure 3—figure supplement 1A, B). There was no obvious clustering of the two ecDNA populations in the nuclei of E25 cells (Figure 3A). The relative frequency of CDK4-CDK4, PDGFRA-PDGFRA, and CDK4-PDGFRA distances of ≤200 nm was low (Figure 3B). Indeed, the mean shortest interprobe distances per nucleus were overwhelmingly >1 μm, suggesting ecDNA were generally not in close proximity (Figure 3—figure supplement 1C). The shortest interprobe distances for CDK4-CDK4 and CDK4-PDGFRA were shorter than for PDGFRA-PDGFRA foci, as expected given the higher copy number of CDK4 ecDNA (Figure 3—figure supplement 1B); however, there was no significant difference in the shortest distance between CDK4-CDK4 and CDK4-PDGFRA foci (Figure 3—figure supplement 1D). No two CDK4 or two PDGFRA foci were <200 nm apart, and only four CDK4-PDGFRA distances were <200 nm (4/1011 [0.39%] CDK4 foci, 4/518 [0.77%] PDGFRA foci) (Figure 3—figure supplement 1D). These data suggest that clustering is not a significant feature of two separate populations of ecDNA.

Figure 3. Two separate extrachromosomal DNA (ecDNA) populations do not cluster in the nucleus.

(A) Representative maximum intensity projection images of DNA FISH for CDK4 (green) and PDGFRA (red) in an E25 nucleus. Blue=DNA (DAPI). Scale bar = 1 μm. (B) Cumulative frequency distribution of shortest interprobe distances (CDK4-CDK4, PDGFRA-PDGFRA, CDK4-PDGFRA, and PDGFRA-CDK4) between all foci in each nucleus across all E25 nuclei (n=26). (C) (Left) Representative maximum intensity projection image shown of E25 nucleus hybridized with probes for CDK4 (green) and PDGFRA (red). Blue=DNA (DAPI). Scale bar = 5 μm. (Right) Ripley’s K function for this nucleus showing observed K function (red), max/min/median (black) of 10,000 null samples with p=0.05 significance cut-off shown (empty black circle) for CDK4, PDGFRA, and CDK4 and PDGFRA spots combined. (D) Ripley’s K function for E25 nuclei showing number of nuclei with significant and non-significant clustering at each given radius for CDK4 spots (n=13 nuclei), PDGFRA spots (n=9 nuclei), and CDK4 and PDGFRA spots combined (n=9 nuclei). p-values were calculated using the Neyman-Pearson lemma with optimistic estimate p-value where required (see Materials and methods), and the Benjamini-Hochberg procedure (BHP, FDR = 0.05). Metaphase analysis of E25 cells and Ripley’s K analysis with smaller foci are in Figure 3—figure supplement 1. (E) Representative maximum intensity projection image of E20 interphase nuclei hybridized with probes for CDK4 (green) and PDGFRA (red). Scale bar = 5 μm. (F) As in (E) but for a nucleus where the close association of CDK4 and PDGFRA signal in doublets is indicative of ecDNAs harbouring both oncogenes. Scale bar = 1 μm in main panel. (G) As in (E) but showing an E20 nucleus with doublets of CDK4 foci. Metaphase analysis of E20 cells with CDK4 and PDGFRA probes is in Figure 3—figure supplement 2. (H) As in (B) but for E20 nuclei (n=24) (noting all nuclei shown here harboured >20 foci of each oncogene). (I) As in (D) but for E20 nuclei.

Figure 3—source data 1. Statistical data for Figure 3—figure supplement 1.
Median number of CDK4 and PDGFRA DNA FISH foci in E25 cell line (n=26) nuclei. Data are for Figure 3—figure supplement 1B. Mean shortest interprobe distance and shortest interprobe distance between CDK4 and PDGFRA DNA FISH foci in E25 cell line. Statistical analysis of data for Figure 3—figure supplement 1C and D. Interprobe distance (μm) between fosmids indicated = median values shown. Value in brackets indicates adjusted p-value (adj) = Bonferroni. n=26 nuclei.

Figure 3—figure supplement 1. Additional analysis of the distribution of CDK4 and PDGFRA ecDNAs in the E25 cell line.

(A) DNA FISH for CDK4 and PDGFRA on metaphase spreads from the E25 cell line, showing CDK4 and PDGFRA on separate extrachromosomal DNA (ecDNA), scale bar = 5 μm. (B) Number of ecDNA per nucleus in E25 cell line, CDK4 (green) and PDGFRA (red). (C) Violin plots showing distribution of mean shortest distance between CDK4 and PDGFRA foci per E25 nucleus. (D) As for (C) but showing the shortest single interprobe distance measured in any nucleus. Dotted line denotes y=200 nm. Statistical significance examined by Mann-Whitney test (hooked line, ns = not significant) and Kruskal-Wallis test (straight line, *** p<0.001). Statistical data are detailed in Figure 3—source data 1. (E) Ripley’s K function for E25 nuclei showing number of nuclei with significant and non-significant clustering at each given radius for CDK4 spots, PDGFRA spots, and CDK4 and PDGFRA spots combined, with spot diameter = 150 nm. All p-values for Ripley’s K function calculated using the Neyman-Pearson lemma with optimistic estimate p-value where required, and the Benjamini-Hochberg procedure (BHP, FDR = 0.05).
Figure 3—figure supplement 2. DNA FISH on metaphase spreads of the E20 cell line showing hybridization signal for PDGFRA (red) and CDK4 (green).

Scale bar = 10 μm. (A) Metaphase spread representative of most cells with PDGFRA and CDK4 clearly on separate extrachromosomal DNA (ecDNAs). (B) Metaphase spread showing an example, representative of approximately 10% of cells, where PDGFRA and CDK4 signals are juxtaposed, suggesting that both oncogenes are located on the same ecDNA. Insets a and b are shown zoomed in below (scale bar = 1 μm). Note these are representative images of metaphase spreads where both oncogenes were located on ecDNA – we observed many metaphases with primarily CDK4 ecDNA and few/no PDGFRA ecDNA. Subsequent interphase nuclei were therefore only imaged if they contained >20 CDK4 and >20 PDGFRA foci, to ensure ecDNA of both were present and to allow Ripley’s K analysis.

We used 3D Ripley’s K function to evaluate point patterns in the E25 dual ecDNA oncogene cell line (Figure 3C). Some nuclei had a significant non-random distribution of PDGFRA ecDNA at ≥400 nm, and most nuclei had non-random distribution of CDK4 ecDNA at >400 nm (Figure 3D). When both foci were combined, there was no significant clustering at <300 nm in any nucleus, and the number of nuclei with a significant non-random distribution at a given radius rose with increasing radial distance (Figure 3D). As previously, a repeat analysis with a smaller (150 nm diameter) spot size identified no instances of significant clustering at <300 nm (Figure 3—figure supplement 1E).

To further validate this, we repeated 3D Ripley’s K function analysis in a second GBM cell line (E20) harbouring CDK4 and PDGFRA ecDNAs. Whilst in the majority of metaphase spreads these two oncogenes were on clearly separate ecDNAs, in approximately 10% of metaphase spreads we noted colocalization of CDK4 and PDGFRA hybridization signals indicating a subset of ecDNA harbouring both oncogenes (Figure 3—figure supplement 2A, B). This colocalization could be observed in a similar proportion of interphase nuclei (Figure 3E and F). However, as observed in E25 cells, the relative frequency of CDK4-CDK4, PDGFRA-PDGFRA, and CDK4-PDGFRA distances of ≤200 nm was low in the nucleus of E20 cells (Figure 3H). Ripley’s K function analysis of hybridization signals in most E20 nuclei (22/24) showed no evidence for significant clustering of CDK4 or PDGFRA at <300 nm (Figure 3I). We noted 2/24 (8.3%) of interphase nuclei (e.g. Figure 3F, see inset) where Ripley’s K function indicated clustering of CDK4 and PDGFRA foci at 100–200 nm and we suggest that these represent cells, as seen at metaphase, where the two oncogenes are located on the same ecDNA molecule. Doublets of CDK4 foci (200 nm) were detected in 4/24 (16.7%) nuclei (Figure 3G, see inset).

Our analysis of two independent GBM cell lines harbouring different ecDNA populations (CDK4 and PDGFRA) provides no evidence for systematic clustering of ecDNA molecules in the nucleus at distances <200 nm.

ecDNA do not colocalize with large RNA PolII hubs in GBM stem cells

DNA FISH detects all ecDNA, so it might be that only transcriptionally active elements cluster. Therefore, we used RNA FISH to detect nascent EGFR transcripts in the nuclei of GBM cells. As expected, nascent RNA FISH foci were more frequent in the EGFR ecDNA-harbouring cell lines than in NSCs and were more frequent in the E26 GBM cell line than in E28 (Figure 4—figure supplement 1A and B). As for DNA FISH, we found no evidence of clustering of sites of EGFR nascent transcription at <400 nm in E26 cells (Figure 4A and B). These data suggest that ecDNA actively transcribing a driver oncogene do not colocalize in the nucleus of GBM cells more than expected by chance.

Figure 4. Extrachromosomal DNA (ecDNA) do not colocalize with large foci of the transcriptional machinery.

(A) Representative maximum intensity projection image of nascent EGFR RNA FISH (red) in E26 cell nucleus (blue=DNA). Scale bar = 5 μm. Associated Ripley’s K function for this nucleus showing observed K function (red), max/min/median (black) of 10,000 null samples with p=0.05 significance cut-off shown (empty black circle). (B) Ripley’s K function for E26 nuclei (n=11) after EGFR nascent RNA FISH showing number of nuclei with significant and non-significant clustering at each given radius. All p-values for Ripley’s K function calculated using the Neyman-Pearson lemma with optimistic estimate p-value where required, and the Benjamini-Hochberg procedure (BHP, FDR = 0.05). (C) Representative maximum intensity projection images of immunoFISH in neural stem cell (NSC), E26 and E28 cell lines: Immunofluorescence for RPB1 (green) and EGFR DNA FISH (red). Scale bar = 5 μm. (D) Spearman’s correlation between number of EGFR foci and number of RPB1 foci, p = 0.13, E26 and E28 cell line data combined. (E) Violin plot of distribution of mean shortest interprobe distance per nucleus between EGFR foci and PolII foci in NSC (n=7), E26 (n=8) and E28 (n=7) cell lines. (F) As for (E) but for shortest single distance in each nucleus. ns, not significant. Kruskal-Wallis test. Statistical data relevant for this figure are in Figure 4—source data 1.

Figure 4—source data 1. Statistical data for Figure 4 and Figure 4—figure supplement 1.
EcDNA-large RPB1 foci distances for neural stem cell (NSC), E26 and E28 cell lines. Statistical analysis of data for DNA-ImmunoFISH (i – Figure 4E and F), RNA-ImmunoFISH (ii – Figure 4—figure supplement 1E and F). EcDNA-large RPB1 foci distance (μm) indicated = median values shown. (iii) Median number of EGFR RNA FISH signals for NSC, E26 and E28 cell lines. (iv) Median ecDNA-large POLR2G foci distances for E28 mCherry-POLR2G cell line (Figure 4—figure supplement 1I, J). n = number of nuclei. Kruskal-Wallis and Mann-Whitney tests performed with comparisons as indicated.

Figure 4—figure supplement 1. Analysis of sites of EGFR nascent transcription relative to RNA polymerase II in GBM cell lines.

(A) Number of nascent EGFR RNA foci per cell line, at least 25 nuclei of each cell line imaged. Statistical significance examined by Mann-Whitney test. ns = not significant, ** p<0.01, **** p<0.0001; statistical data are detailed in Figure 4—source data 1. (B) Representative images of nascent EGFR RNA FISH (shown in greyscale) in neural stem cell (NSC), E26 and E28 cell lines. MIP, scale bar = 5 μm. (C) Representative images of RNA polymerase II (RPB1) foci (arrow heads) detected by immunofluorescence. Scale bar = 5 μm. (D) Representative images of E26 and E28 nascent RNA immunoFISH for EGFR (red) and RNA polymerase II (PolII) (RPB1 – green), MIP, scale bar = 5 μm. (E) Mean shortest interprobe distance between EGFR RNA and PolII foci. (F) As for (E) but for shortest distance. Median and quartiles plotted. Dotted line denotes y = 200 nm. Statistical significance examined by Mann-Whitney test. ns, not significant. Statistical data relevant for this figure are in Figure 4—source data 1. (G) Representative images of immunoFISH in the E28 mCherry-POLR2G cell line: Immunofluorescence for mCherry and EGFR DNA FISH. Scale bar = 5 μm. (H) Violin plot of mean shortest distance per nucleus between EGFR foci and foci detected by the mCherry-POLR2G fusion. Dotted line denotes y=200 nm. (I) As for (H) but for shortest single distance in each nucleus. n=14 nuclei.

We next assessed whether ecDNA foci, albeit not clustered with each other, colocalize with high focal concentrations of the transcriptional machinery to create ecDNA/large PolII transcription hubs. First, we examined the presence of such hubs by immunofluorescence for RPB1 (POLR2A), the largest subunit of RNA PolII. The large RPB1 foci we detected were sparse with only a few clearly visible per nucleus (Figure 4—figure supplement 1C).

We used 3D analysis of immunoFISH in NSCs and compared this to E26 and E28 GBM cells to establish whether ecDNA and large RPB1 foci colocalized. There was no obvious overlap between foci of RPB1 and EGFR (Figure 4C) and no correlation between the number of large RPB1 foci and the number of EGFR foci (Figure 4D). Indeed, the mean shortest distance between EGFR foci and large RPB1 foci per nucleus was routinely >1 μm in all cell lines, despite the greater number of EGFR foci in the GBM cell lines (Figure 4E). The single shortest distance per nucleus between an EGFR locus and a large RPB1 locus was not significantly different across NSC and tumour lines (Figure 4F). There were no instances where the distance between EGFR and large RPB1 foci was <200 nm. To test if this was also the case for the nascent EGFR RNA transcript, we repeated this analysis using nascent RNA FISH, with the same result (Figure 4—figure supplement 1D–F). As the distance distributions to large RPB1 foci were similar for DNA and RNA FISH, this suggests that proximity to large PolII hubs does not alter the probability that ecDNA are transcribed.

To ensure this result was not specific to this PolII antibody, we repeated this analysis using E28 cells in which mCherry was fused by knock-in to endogenous POLR2G, a key subunit of RNA PolII (Cramer et al., 2000). The mean distance between EGFR foci and large POLR2G foci and the shortest minimum distance in any given nucleus (Figure 4—figure supplement 1G–I) further support that there is no close spatial relationship apparent between ecDNA and large PolII hubs.

Levels of EGFR transcription from ecDNA reflect copy number, not enhanced transcriptional efficiency

Having shown a lack of colocalization of ecDNA, either with each other or with large PolII foci, we proceeded to characterize the levels of EGFR expression from ecDNA. Flow cytometry using a fluorophore-conjugated EGFR ligand (EGF-647) revealed consistently higher levels of EGFR in the GBM cells than NSC, with highest signal in E26 (Figure 5—figure supplement 1A, B), consistent with their higher ecDNA copy number compared with E28 (Figure 1D). To confirm this link between ecDNA number and levels of EGFR, E26 and E28 cells were sorted by fluorescence activated cell sorting (FACS) into EGFR-high and EGFR-low populations. In both tumour cell lines, DNA FISH demonstrated that EGFR-high cells had a significantly higher number of EGFR DNA foci than EGFR-low cells (Figure 5—figure supplement 1C–E).

Previous studies have reported that ecDNA have greater transcript production per oncogene than chromosomal loci (Wu et al., 2019). We therefore sought to characterize the transcriptional efficiency (per copy number) of chromosomal and ecDNA-located EGFR genes in our GBM cell lines, by assaying the RNA:DNA EGFR FISH foci ratio. We performed nascent EGFR RNA FISH using a probe targeting the first intron of EGFR and EGFR DNA FISH to test this hypothesis (Figure 5A).

Figure 5. Levels of transcription from extrachromosomal DNA (ecDNA) reflect copy number but not enhanced transcriptional efficiency.

(A) Representative maximum intensity projection (MIP) images of nascent EGFR RNA, EGFR, and centromere 7 (CEN7) DNA FISH in neural stem cell (NSC), E26 and E28 cell lines (scale bar = 5 µm). (B) Ratio of RNA:DNA foci per nucleus in NSC, E26 and E28 cell lines. * p<0.05, n.s. not significant. Flat line – one-way ANOVA, hooked lines – unpaired t-test. Mean and standard error of the mean (SEM) plotted, with 3 biological replicates for NSC (total n=67), E26 (98) and E28 (95) nuclei. (C) Representative Spearman r correlation (ρ) and p-values shown for E26 (n=29) and E28 (n=39) cells. RNA:DNA ratio = number of RNA foci/number of DNA foci. EcDNA proportion = (number of EGFR DNA foci – number of CEN7 foci)/number of EGFR DNA foci. Three biological replicates performed, data from replicate 1 shown here. (D) UCSC genome browser tracks showing E26 and GBM39 RNA-seq and WGS aligned sequences in the region of chromosome 7 where EGFR is located, EGFR exons (GENCODE) and the exon deletion predicted by AmpliconArchitect. Note that RNA-seq counts in some ecDNA regions go above the maximum value. Genome coordinates (Mb) are from the hg38 assembly of the human genome. (E) RNA-seq/WGS allele frequency ratio for SNPs overlapping with expressed exons in the amplicon. Lines denote median values. (F) EGFR RNA-seq counts normalized by WGS read count per EGFR exon in E26, with exons defined as extrachromosomal (exons 1,8-28) or chromosomal (exons 2-7). Statistical significance examined by Mann-Whitney test. ns, not significant. (G) As for (F) but for GBM39. Statistical data relevant for this figure are in Figure 5—source data 1.

Figure 5—source data 1. Statistical data for Figure 5 and Figure 5—figure supplement 1.
(i) RNA:DNA FISH EGFR foci ratios. Statistical analysis of data for Figure 5B, RNA:DNA FISH EGFR foci ratio = mean values shown. n = number of nuclei, total across three biological replicates. Values in brackets indicate adjusted p-value (adj) = Bonferroni. (ii) Correlation of RNA:DNA ratio and ecDNA/total foci ratio (Figure 5C), Spearman r (p-value) shown for three biological replicates. N = number of nuclei. Rep1 data shown in figure. (iii) RNA-seq/whole genome sequencing (WGS) allele frequency ratio, for Figure 5E. Median and number of SNPs per gene per cell line. (iv) EcDNA versus chromosomal EGFR exons (Figure 5F and G), Mann-Whitney test of normalized RNA counts between chromosomal and predominantly EGFR ecDNA exons. (v) Mann-Whitney test of EGFR RNA FISH foci in FACS-sorted E26 and E28 cells (Figure 5—figure supplement 1E).

Figure 5—figure supplement 1. EGFR levels, ecDNA number, and ecDNA SNP allele frequency in E26 and E28 cell lines.

(A) Histogram of flow cytometry with EGF-647 showing signal in neural stem cell (NSC), E28 and E26 cell lines from live cells, normalized to peak count per cell line. Median EGF-647 – NSC = 172.2; E28 = 985.64; E26 = 7191.81. (B) Flow cytometry with EGF-647; gates showing negative, normal (NSC), and elevated (glioma stem cell [GSC]) EGF-647 signal in NSC, E28 and E26 cell lines. (C) Fluorescence activated cell sorting (FACS) into EGF-647 high and low populations from E26 and E28 cell lines. The percentage of the total live cell population in each sorted population is shown. (D) Representative EGFR DNA FISH images (shown in greyscale) of E26 and E28 cells sorted via flow cytometry with EGF-647 into EGFR high and low cells. MIP, scale bar = 5 μm. (E) Number of EGFR DNA FISH foci per nucleus in sorted E26 and E28 cells. Statistical significance examined by Mann-Whitney. **** p<0.0001. Statistical data relevant for this figure are in Figure 5—source data 1. (F) SNP allele frequencies in E26 and E28 cell lines plotted in blood (blue) and glioblastoma (GBM; orange) whole genome sequencing (WGS) samples. Dotted lines denote EGFR gene start and end.

When comparing the RNA:DNA ratio of all nuclei, only E26 had a higher ratio than NSCs (Figure 5B). To explore whether EGFR transcription in these cell lines could be due to ecEGFR-driven increased transcriptional efficiency, we used chromosome 7 copy number (evaluated by CEN7 probe) to account for chromosomal EGFR copy number. We correlated the RNA:DNA FISH ratio with the proportion of ecEGFR (number of EGFR foci minus number of CEN7 foci, divided by the total number of EGFR foci). We observed no correlation in either cell line (Figure 5C), suggesting that EGFR transcription from ecDNA and chromosomes occurs at similar levels when normalized to chromosome 7 copy number. There is no increased transcriptional efficiency from ecDNA compared to chromosomal DNA based on these analyses.
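The two per-nucleus quantities defined above, and their rank correlation, can be sketched as follows. This is an illustrative sketch only: the foci counts are hypothetical, the function names are not taken from the study's code, and the Spearman coefficient is computed here as the Pearson correlation of average ranks rather than with the statistics software used for Figure 5C.

```python
def per_nucleus_metrics(n_rna, n_dna, n_cen7):
    """RNA:DNA foci ratio and ecDNA proportion for one nucleus."""
    rna_dna_ratio = n_rna / n_dna
    ecdna_proportion = (n_dna - n_cen7) / n_dna  # (EGFR foci - CEN7 foci) / EGFR foci
    return rna_dna_ratio, ecdna_proportion

def rank(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    return pearson(rank(x), rank(y))

# Hypothetical counts per nucleus: (RNA foci, EGFR DNA foci, CEN7 foci)
nuclei = [(10, 40, 4), (12, 30, 3), (6, 25, 5), (20, 60, 3)]
ratios, proportions = zip(*(per_nucleus_metrics(*n) for n in nuclei))
rho = spearman(list(ratios), list(proportions))
```

If ecDNA copies were transcribed more efficiently than chromosomal copies, nuclei with a higher ecDNA proportion would be expected to show a higher RNA:DNA ratio, giving a positive rho.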

To test this using an independent method, we took advantage of WGS and RNA-seq data (Figure 5D) and called SNPs present in the amplicon region at 40% to 60% allele frequencies in patient control blood WGS (control) samples. Most of the allele frequencies of these SNPs were >80% in GBM samples in the main part of the amplicons, in line with the amplification being derived from one parental allele (Figure 5—figure supplement 1F). We then selected those SNPs located in expressed exons of the amplicon, including several in EGFR. The WGS allele frequencies of these were all >88%, that is, predominantly from amplicons. If genes on the ecDNA are more highly transcribed than chromosomal counterparts, we expect the ratio of RNA-seq to WGS allele frequency of the amplicon-derived SNP to be above 1. Consistent with genes on ecDNA and on chromosomes being transcribed with similar efficiencies, these values were very close to 1, the highest being 1.05 (Figure 5D, E). The lower values for LANCL2, 3’ of EGFR, are likely because only part of this gene is present on the amplicon such that the transcript is truncated. As an additional approach, we utilized the large exon 2–7 deletion present on E26 EGFR ecDNA to compare the copy number-normalized RNA expression of exons present only on the endogenous chromosomal EGFR locus (exons 2–7) with those predominantly on ecDNA (exons 1, 8–28) (Figure 5E, D). Copy number-normalized EGFR RNA counts were not significantly different between exons 2–7 and those located predominantly on ecDNA (Figure 5F). EcDNA with EGFR in another established GBM cell line, GBM39, also contain a deletion spanning exons 2–7. We therefore repeated this analysis using previously published WGS and RNA-seq data from this cell line (Wu et al., 2019). The normalized RNA read count of primarily ecEGFR exons was not significantly different from that of chromosomal EGFR exons (Figure 5G).
Altogether, RNA:DNA FISH and sequencing analyses suggest that EGFR on each ecDNA is transcribed at a similar level to that of the corresponding endogenous chromosomal EGFR locus. Increased output of oncogenes in GBM stem cells with ecDNA appears to be primarily driven by increased copy number, rather than inherent features of their chromatin state, transcriptional control, or spatial localization.
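The sequencing-based comparisons reduce to two simple ratios, sketched below in Python. The read counts are hypothetical and the function names are illustrative; the actual analysis code is the one referenced in the Key resources table.

```python
def allele_fraction(amplicon_allele_reads, total_reads):
    """Fraction of reads at a SNP that carry the amplicon-derived allele."""
    return amplicon_allele_reads / total_reads

def rna_to_wgs_ratio(rna_amplicon, rna_total, wgs_amplicon, wgs_total):
    """RNA-seq allele frequency divided by WGS allele frequency (as in Figure 5E).

    A value near 1 means the amplicon allele is represented in RNA in
    proportion to its share of DNA copies.
    """
    return (allele_fraction(rna_amplicon, rna_total)
            / allele_fraction(wgs_amplicon, wgs_total))

def copy_normalised_expression(rna_count, wgs_count):
    """RNA-seq count for an exon divided by its WGS read count (as in Figure 5F, G)."""
    return rna_count / wgs_count

# Hypothetical read counts at one exonic SNP
example_ratio = rna_to_wgs_ratio(rna_amplicon=460, rna_total=500,
                                 wgs_amplicon=180, wgs_total=200)
```

With these made-up counts the RNA allele frequency is 0.92 and the WGS allele frequency is 0.90, so the ratio is only slightly above 1.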

Discussion

Understanding the importance of ecDNA in the etiology of cancer, and whether this poses an interesting target for therapeutic interventions, depends on deeper analysis of ecDNA activity (Nathanson et al., 2014; Kim et al., 2020). Clustering of ecDNA into ‘ecDNA hubs’ based on imaging and chromosome conformation capture data has been reported in a range of established cancer cell lines, and has been suggested to underlie the ability of ecDNA to drive very high levels of transcription (Hung et al., 2021; Yi et al., 2021; Zhu et al., 2021). However, in multiple primary human GBM cells studied here, we observe no significant colocalization at distances (~200 nm) thought to be functionally important in driving transcription. We reach this conclusion for both cells with single ecDNA species, as well as with heterogeneous ecDNA harbouring different oncogenes. EcDNA were not colocalized with, or notably close to, large PolII foci. Moreover, taking advantage of the unique transcripts from ecDNA, and the presence of SNPs in these transcripts, to compare ecDNA-derived and chromosomal transcripts, we demonstrate that increased copy number primarily drives increased transcription of ecDNA-located genes rather than increased transcriptional efficiency of ecDNA in GBM stem cells.

Our data support a regional, rather than clustered, spatial organization of ecDNA in GBM stem cells. We observe that oncogenes on ecDNA are distributed more towards the centre of the nucleus than the corresponding endogenous gene loci. This is consistent with an actively transcribing state (Boyle et al., 2001; Croft et al., 1999) and independence from the constraints of chromosome territories (Kalhor et al., 2011; Mahy et al., 2002).

We sought to maximize our opportunity of observing ecDNA clustering at close distances by performing 3D spot analysis, using Ripley’s K to call instances of significant clustering at given distances using ecDNA x,y,z coordinates, and utilizing cells with two distinct ecDNA species to ensure we were not under-scoring colocalization. 3D analysis avoids the false positive clustering that can arise when 3D images are flattened via tools such as maximum intensity projection (MIP). Other tools to assess clustering have noted the possibility of the 2D Ripley’s K function resulting in over-counting, leading to the development of alternative auto-correlation tools, but this was not observed in this 3D Ripley’s K analysis (Veatch et al., 2012). It is possible that multiple clustered DNA/RNA foci appear as a single DNA/RNA FISH signal that we cannot resolve. We controlled for this by repeating cluster analysis with smaller spot sizes, analyzing cell lines with two ecDNA populations and using super-resolution imaging (optical resolution ~120 nm). We did observe ecDNA clustering at close distances (≤200 nm) in a small proportion of E20 dual-ecDNA cells, but in the case of CDK4-PDGFRA colocalization this was at a similar proportion to that observed in metaphase spreads, indicative of ecDNA molecules harbouring both CDK4 and PDGFRA. The incidence of CDK4 doublets (which appeared in keeping with double minutes) was also low. Overall, this suggests that close clustering is not a major contributor to increased ecDNA transcriptional output in GBM stem cells.
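The point about projection artefacts can be made concrete with a small Python sketch, using two hypothetical foci (coordinates in µm) that sit almost directly above one another along the z axis. The function names are illustrative only.

```python
import math

def distance_3d(p, q):
    """Euclidean distance using x, y and z."""
    return math.dist(p, q)

def distance_in_projection(p, q):
    """Distance after a z-projection such as a MIP, which discards the z coordinate."""
    return math.dist(p[:2], q[:2])

# Two hypothetical foci: 50 nm apart in x, but 1.5 µm apart in z
focus_a = (1.00, 1.00, 0.5)
focus_b = (1.05, 1.00, 2.0)

apparent = distance_in_projection(focus_a, focus_b)  # looks colocalized (<200 nm)
actual = distance_3d(focus_a, focus_b)               # well separated in 3D
```

In the projection the pair would score as colocalized at the 200 nm threshold, whereas the 3D distance shows that the foci are more than 1 µm apart.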

Our findings may reflect fundamentally different functional characteristics of the ecDNA in the patient-derived primary GBM cell cultures used in our experiments versus those in previously published studies (Hung et al., 2021; Yi et al., 2021). These might include the size of the ecDNA, or the number of oncogene loci per ecDNA (which was singular in our cell lines, with the exception of ~10% E20 CDK4/PDGFRA colocalized ecDNA). For example, the COLO320-DM cell line, used in a recent study of ecDNA hubs, harbours three copies of MYC on each of its ecDNA, resulting in large (4.328 Mb, approx. 1.75 μm diameter) ecDNA (Hung et al., 2021; Wu et al., 2019). The HK359 GBM cell line, previously noted to have clustered ecDNA hubs, has a 42 kb insertion at the site of EGFRvIII (exon 2–7 deletion), again suggesting a large ecDNA quite different in character to those described here (Hung et al., 2021; Koga et al., 2018). More quantitative analysis across a larger set of primary cancer cells will be needed to determine if long-term established cell lines have unusual ecDNA features and are unrepresentative of primary GBM cells.

Recent work proposing that ecDNA act as mobile super-enhancers for chromosomal targets has raised the possibility that ecDNA can actively recruit RNA PolII to drive ‘ecDNA-associated phase separation’ (Zhu et al., 2021). A live-cell ecDNA-labelling strategy reported colocalization of ecDNA and RNA PolII (Yi et al., 2021). We did not detect evidence of a close relationship between ecDNA, or their nascent transcript, with large PolII foci, but cannot exclude that there are smaller, sub-diffraction limit sized transcriptional hubs associated with our ecDNA.

We observe that while the copy number of EGFR ecDNAs positively correlates with greater transcriptional output, this is likely due to copy number increases, rather than increased transcriptional activity on individual ecDNA. It has been proposed that ecDNA increase transcription of their resident oncogenes partly due to their increased DNA copy number, but also due to their more accessible chromatin structure, and that gene transcription from circular amplicons is greater than that of linear amplicons once copy number normalized (Kim et al., 2020; Wu et al., 2019). An analysis of RNA-seq and WGS data from a cohort of 36 independent clinical samples found that only 3 out of 11 ecDNA-encoded genes produced significantly more transcripts when normalized to gene copy number, only one of which is a key oncogene (Wu et al., 2019). In agreement with this, our analysis of both oncogene and amplicon-resident polymorphisms suggests that copy number is the dominant driver of ecDNA gene transcription.

Overall, our data suggest that in primary GBM stem cells, ecDNA can succeed at driving oncogene expression without requiring close colocalization with each other, or with transcriptional hubs. It is the increased copy number that is primarily responsible for higher levels of oncogene expression, rather than ecDNA-intrinsic features or nuclear sub-localization.

Materials and methods

Key resources table.

Reagent type (species) or resource Designation Source or reference Identifiers Additional information
Antibody mCherry (Rabbit poly-clonal) abcam ab167453 IF (1 in 500)
Antibody Rpb1 NTD (D8L4Y) (Rabbit mono-clonal) Cell Signaling Technology #14958 IF (1 in 1000)
Antibody Anti-Digoxigenin (Sheep poly-clonal) Roche Ref 11333089001 DNA FISH (1 in 10)
Antibody Secondary Antibody – Alexa Fluor 647 (Donkey anti-Sheep IgG poly-clonal) Thermo Fisher Scientific A-21448 DNA FISH (1 in 10)
Antibody Secondary Antibody – Alexa Fluor 568 (Donkey anti-Rabbit IgG poly-clonal) Thermo Fisher Scientific A-10042 IF (1 in 1000)
Antibody Secondary Antibody – Alexa Fluor 488 (Donkey anti-Rabbit IgG poly-clonal) Thermo Fisher Scientific A-21206 IF (1 in 1000)
Antibody Secondary Antibody – Alexa Fluor 488 (Donkey anti-Rat IgG poly-clonal) Thermo Fisher Scientific A-21208 IF (1 in 1000)
Genetic reagent (human) Fosmid FISH probe (Human) BACPAC resource https://bacpacresources.org/library.php?id=275 See Materials and methods - Supplementary file 1
Cell line (Homo sapiens) E20, E25, E26, E28, NSC – GCGR Human Glioma Stem Cells This paper, Glioma Cellular Genetics Resource, CRUK, UK http://gcgr.org.uk; pending publication
Other DMEM/HAMS-F12 Sigma-Aldrich Cat#: D8437 Cell culture, media
Chemical compound, drug Pen/Strep GIBCO Cat#: 15140–122 Cell culture, media supplement
Other BSA Solution GIBCO Cat#: 15260–037 Cell culture, media supplement
Other B27 Supplement (×50) LifeTech/GIBCO Cat#: 17504–044 Cell culture, media supplement
Other N2 Supplement (×100) LifeTech/GIBCO Cat#: 17502–048 Cell culture, media supplement
Other Laminin Cultrex Cat#: 3446-005-01 Cell culture, media supplement, and pre-lamination of culture vessels
Peptide, recombinant protein EGF Peprotech Cat: 315–09 Cell culture, media supplement
Peptide, recombinant protein FGF-2 Peprotech 100-18B Cell culture, media supplement
Other Accutase Sigma-Aldrich Cat#: A6964 Cell culture, cell dissociation agent
Other DMSO Sigma-Aldrich Cat#: 276855 Cell culture, freeze media, and drug diluent
Other Triton X-100 Merck Life Sciences Cat#: X-100 Cell permeabilization agent following cell fixation
Other Paraformaldehyde Powder 95% Sigma Cat#: 158127 Cell fixation agent
Other Tween 20 Cambridge Bioscience Cat#: TW0020 DNA FISH (hybridization mix)
Other PBS Tablets Sigma-Aldrich Cat#: P4417 Diluent and washing agent
Other Ethanol VWR Cat#: 20821–330 DNA FISH
Other Methanol Fisher Chemical M/4000/17 Used 3:1 with acetic acid for metaphase spreads
Other Acetic acid Honeywell Research Chemicals 33209-1L See above
Peptide, recombinant protein  Alexa Fluor 647 EGF complex Thermo Fisher Scientific E35351 Flow cytometry
Other Green496-dUTP ENZO Life Sciences ENZ-42831L Direct labelling of Fosmid DNA FISH probes via nick translation
Other ChromaTide Alexa Fluor 594–5-dUTP Thermo Fisher Scientific C11400 Direct labelling of Fosmid DNA FISH probes via nick translation
Peptide, recombinant protein DNA Polymerase 1 Invitrogen 18010–017
Peptide, recombinant protein DNase I recombinant, RNase-free Roche 04716728001
Genetic reagent (human) Human Cot-1 DNA Thermo Fisher Scientific 15279011
Genetic reagent (salmon) Salmon Sperm DNA Invitrogen 15632011
Chemical compound, drug Paclitaxel Cambridge Bioscience CAY10461 10–100 nM
Chemical compound, drug Nocodazole Sigma-Aldrich SML1665 50–100 ng/ml
Other XCP 7 Orange Chromosome Paint MetaSystems Probes D-0307-100-OR DNA FISH (see Figure 1 and Materials and methods referring to this)
Commercial assay or kit Stellaris RNA-FISH probes (Custom Assay with Quasar 570 Dye) LGC Biosearch Technologies SMF-1063–5 RNA FISH
Commercial assay or kit Stellaris RNA FISH Hybridization Buffer LGC Biosearch Technologies SMF-HB1-10 RNA FISH
Genetic reagent (human) Alt-R CRISPR-Cas9 crRNA IDT-Technologies Alt-R CRISPR-Cas9 crRNA
Genetic reagent (human) Alt-R CRISPR-Cas9 tracrRNA IDT-Technologies 1072532
Commercial assay or kit SG Cell Line 4D-Nucleofector™ X Kit S Lonza Bioscience V4XC-3032
Genetic reagent (human) Chromosome 7 Control Probe Pisces Scientific CHR07-10-DIG Probe and hybridization mix
Other DAPI (4',6-Diamidino-2-Phenylindole, Dihydrochloride) Thermo Fisher Scientific D1306 Nuclear staining; 50 ng/ml and 5 ng/ml (as indicated in Materials and methods)
Sequence-based reagent mCherry_PolR2G crRNA and dsDNA (donor) Twist Bioscience See Materials and methods and Supplementary file 1
Other WGS and RNAseq This paper; Glioma Cellular Genetics Resource, CRUK, UK GEO: GSE215420 See also: https://gcgr.org.uk; see Materials and methods
Other Erosion Territories analysis This paper Code available at: https://github.com/IGC-Advanced-Imaging-Resource/Purshouse2022_paper
Other Cluster analysis This paper Code available at: https://github.com/SjoerdVBeentjes/ripleyk
Other RNA-seq/WGS analysis This paper Code available at: https://github.com/kpurshouse/ecDNAcluster
Software, algorithm GraphPad Prism 9.0  GraphPad Software, Inc https://www.graphpad.com/
Software, algorithm FCS Express  FCS Express 7 https://denovosoftware.com/
Software, algorithm Fiji/ImageJ  Open Source https://imagej.net/Fiji
Software, algorithm BioRender  BioRender https://biorender.com/
Software, algorithm Python v3.9  Open Source https://www.python.org
Software, algorithm Algorithm - RipleyK package  Python Package Index https://pypi.org/project/ripleyk/
Software, algorithm Imaris x64 v9.4.0  Imaris Microscopy Image Analysis Software https://imaris.oxinst.com/
Software, algorithm UCSC Genome Browser Kent et al., 2002 https://genome.cshlp.org/content/12/6/996
Software, algorithm STAR 2.7.1a Dobin et al., 2013 https://github.com/alexdobin/STAR; Dobin et al., 2013
Software, algorithm Picard  Broad Institute https://broadinstitute.github.io/picard/ RRID:SCR_006525, Version 2.23.2
Software, algorithm AmpliconArchitect Deshpande et al., 2019 https://github.com/virajbdeshpande/AmpliconArchitect; Deshpande et al., 2019 (with Python v2.7)
Software, algorithm AmpliconClassifier Kim et al., 2020 https://github.com/jluebeck/AmpliconClassifier (with Python v2.7)
Software, algorithm deepTools v3.4 Ramírez et al., 2016 https://deeptools.readthedocs.io/en/develop/
Software, algorithm HOMER2 4.10 Heinz et al., 2010 http://homer.ucsd.edu/homer/
Software, algorithm SAMtools v1.10 Li et al., 2009 http://www.htslib.org
Software, algorithm BEDTools v2.3 Quinlan and Hall, 2010 http://code.google.com/p/bedtools
Software, algorithm bcftools Danecek et al., 2021 https://doi.org/10.1093/gigascience/giab008
Software, algorithm strelka v2.9.10 Kim et al., 2018 https://doi.org/10.1038/s41592-018-0051-x

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contacts, Wendy Bickmore (wendy.bickmore@ed.ac.uk) and Steven Pollard (steven.pollard@ed.ac.uk).

Materials availability

This study generated a new CRISPR engineered knock-in reporter cell line – E28 mCherry_POLR2G.

Experimental model and subject details

GSC and NSC lines from the Glioma Cellular Genetics Resource (GCGR) (https://gcgr.org.uk) were cultured in serum-free basal DMEM/F12 medium (Sigma) supplemented with N2 and B27 (Life Technologies), 2 μg/ml laminin (Cultrex), and 10 ng/ml growth factors EGF and FGF-2 (Peprotech) (Pollard et al., 2009). Cells were split approximately weekly using Accutase solution (Sigma) and centrifugation, as previously reported. All GBM cell lines were derived from treatment-naive patients, and the NSC cell line GCGR-NS9FB_B was derived from forebrain at 9 weeks of gestation. GSC cell lines were selected on the basis of predominantly (E26) or entirely (E28, E25, and E20) harbouring oncogenes on ecDNAs (rather than HSRs) via metaphase spread analysis (see Method details below). Human GBM tissue was obtained with informed consent and ethical approval (East of Scotland Research Ethics service, REC reference 15/ES/0094). Human embryonic brain tissue was obtained with informed consent and ethical approval (South East Scotland Research Ethics Committee, REC reference 08/S1101/1). Cell lines were regularly tested for mycoplasma.

Method details

Metaphase spreads and interphase nuclei

Conditions for generating metaphase spreads were optimized for each cell line. Briefly, cells at near confluence in a T75 flask were incubated for between 4 and 16 hr in the presence of 10–100 nM paclitaxel (Cambridge BioScience) with or without 50–100 ng/ml nocodazole (Sigma-Aldrich). Along with the media, cells dissociated with Accutase were centrifuged, washed in PBS, and resuspended in 10 ml potassium chloride (KCl) 0.56%, with sodium citrate dihydrate (0.9%) if required, for 20 min. After further centrifugation, cells were resuspended in methanol:acetic acid 3:1 and dropped onto humidified slides.

For all other fixed cell experiments described below, cells were seeded overnight onto glass cover-slips or poly-L-lysine coated glass slides (Sigma-Aldrich). Cells were fixed with 4% paraformaldehyde (PFA – 10 min) and permeabilized with 0.5% Triton X-100 (15 min) with thorough PBS washes in-between. Where cells were dried (see FISH methods), this only occurred following PFA fixation in order to preserve 3D structures and minimize cell and nuclear flattening.

DNA FISH

A detailed method for DNA FISH has been described elsewhere (Jubb and Boyle, 2020). Briefly, DNA stocks of fosmid clones targeting EGFR (WI2-2910M03), CDK4 (WI2-0793J08), and PDGFRA (WI2-2022O22) (Supplementary file 1) were prepared via an alkaline lysis miniprep protocol (Jubb and Boyle, 2020). Each fosmid DNA probe was labelled via Nick Translation directly to a fluorescent dUTP (Green496-dUTP, ENZO Life Sciences; ChromaTide Alexa Fluor 594-5-dUTP, Thermo Fisher Scientific) and incubated with unlabelled dATP, dCTP, and dGTP, ice-cold DNase and DNA PolI for 90 min at 16°C. The reaction was quenched with EDTA and 20% SDS, TE buffer added, and the reaction mix run through a Quick Spin Sephadex G50 column.

Cells on slides or cover-slips were prepared by incubating for 1 hr in ×2 trisodium citrate and sodium chloride (SSC)/RNaseA 100 μg/ml at 37°C, then dehydrated in 70%, 90%, and 100% ethanol. Slides were warmed at 70°C prior to immersion in a denaturing solution (×2 SSC/70% formamide, pH 7.5) heated to 70°C (methanol:acetic acid-fixed cells) or 80°C (PFA-fixed cells), the duration of which was optimized to each cell line. After denaturing, slides were immersed in ice-cold 70% ethanol, then 90% and 100% ethanol at room temperature before air drying.

FISH probes were prepared by combining 100 ng of each directly labelled fosmid probe (per slide), 6 μg Human Cot-1 DNA (per probe), 5 μg sonicated salmon sperm (per slide), and 100% ethanol. Once completely dried, the resulting pellet was suspended in hybridization mix (50% deionized formamide [DF], ×2 SSC, 10% dextran sulfate, 1% Tween 20) for 1 hr at room temperature, denatured for 5 min at >70°C and annealed at 37°C for 15 min. Where relevant, FISH probes were instead hybridized in Chromosome 7 paint (XCP 7 Orange, Metasystems). The probes were incubated overnight at 37°C. The following day, the slides were washed in ×2 SSC (45°C), 0.1% SSC (60°C) and finally in ×4 SSC/0.1% Tween 20 with 50 ng/ml 4′,6-diamidino-2-phenylindole (DAPI). Slides were mounted with Vectashield.

RNA FISH

RNA FISH probes (Custom Assay with Quasar 570 Dye) targeting the first intron (pool of 48 22-mer probes) of EGFR were designed and ordered via the Stellaris probe designer (Biosearch Technologies, Inc, Petaluma, CA) (https://www.biosearchtech.com/support/tools/design-software/stellaris-probe-designer, version 4.2). Cells were seeded, fixed, and permeabilized as above. Slides were immersed in ×2 SSC, 10% DF in DEPC-treated water for 2–5 min before applying the hybridization mix (Stellaris RNA FISH hyb buffer, 10% DF, 125 nM RNA FISH probe) for incubation at 37°C. After overnight incubation, slides were incubated in ×2 SSC, 10% DF in DEPC-treated water for 30 min, and then stained with DAPI (5 ng/ml). Slides were washed with PBS before mounting with Vectashield.

Combined RNA:DNA FISH

Nascent EGFR RNA FISH was performed as above, and nuclei imaged as described below. The x,y,z coordinates for each image were recorded via NIS software at the time of imaging. After removing the cover-slips and washing the slides in PBS, EGFR DNA FISH was performed whereby the probe preparation was as above. Centromere 7 (CEN7 – CHR07-Dig Control) FISH probe (Pisces Scientific) was prepared, denatured for 5 min at 80°C and snap-frozen on crushed ice. Slides were transferred from PBS wash to denaturing solution at 80°C for 15–30 min, washed in ×2 SSC, and incubated overnight with the probe(s) at 37°C. The subsequent stringency washes were as described above. Slides were then incubated in blocking buffer (×4 SSC/5% Marvel) for 5 min, followed by anti-digoxigenin antibody (Roche; 1 in 10; 1 hr at humidified 37°C) and anti-sheep Alexa Fluor 647 secondary antibody (Thermo Fisher Scientific; 1 in 10; 1 hr at humidified 37°C) with ×4 SSC/0.1% Tween 20 washes in between. After the final washes, slides were stained with DAPI and mounted as described above. The stored x,y,z coordinates were used to relocate and image each nucleus. Owing to the irregularity of the tumour nuclei, it was possible to be confident in re-imaging the correct nucleus – nuclei were excluded where this was not the case, or where nuclei were lost between RNA and DNA FISH. Spot counting was subsequently performed as described below with RNA and DNA foci being defined and counted separately to avoid influencing the outcome. For CEN7, nuclei were excluded if the number of foci could not be clearly identified.

Immunofluorescence and immuno-FISH

Slides were blocked in 1% BSA/PBS/0.1% Triton X-100 for 30 min at 37°C before overnight incubation with the primary antibody at 4°C (Rpb1 NTD [D8L4Y] #14958, Cell Signaling Technology, 1 in 1000; mCherry [ab167453], abcam, 1 in 500). The following day, slides were washed in PBS before incubation with an appropriate secondary antibody (1 in 1000 Alexa Fluor) for 1 hr at 37°C. After further PBS washes and DAPI staining, slides were mounted with Vectashield.

For immuno-FISH (DNA), the IF signal was fixed via incubation with 4% PFA for 30 min. Following thorough PBS washes, the DNA FISH protocol was then followed as above.

For immuno-FISH (RNA), the antibodies were added at the same concentration as described above to the hybridization mix (primary antibody) and ×2 SSC/10% DF washes (secondary antibody).

Flow cytometry and FACS

Cells were prepared by adding EGF-free media for 30 min before lifting and suspending cells in 0.1% BSA/PBS. Cells were incubated for 25 min in 100 ng/ml EGF-647 (E35351, Thermo Fisher Scientific) in 0.1% BSA/PBS, with cells incubated in 0.1% BSA/PBS alone as a negative control. Cells were washed three times in 0.1% BSA/PBS before being analysed on the BD FACSAria III FUSION. Where indicated, cells were sorted by EGF-647 signal into high and low groups, and a sort check was performed to verify these were true populations prior to expanding these cells onto 22 × 22 mm cover-slips. Fifteen days after the cells were sorted, the cells on cover-slips were fixed, permeabilized, and DNA FISH performed as above.

mCherry_POLR2G knock-in cell line

crRNA and donor DNA were designed using the previously reported TAG-IN tool (Dewari et al., 2018), with the corresponding fluorescent reporter gene sequences for mCherry implemented into the existing tool (Supplementary file 1). Output sequences from the TAG-IN tool were manufactured by Twist Bioscience. Gene-specific crRNA (100 pmoles – IDT Technologies, Coralville, IA, USA) and universal tracrRNA (100 pmoles, IDT Technologies, Coralville, IA, USA) were assembled to a cr:tracrRNA complex by annealing at the following settings on a PCR block: 95°C for 5 min, step down cooling from 95°C to 85°C at 0.5°C/s, step down cooling from 85°C to 20°C at 0.1°C/s, store at 4°C. Recombinant Cas9 protein (10 μg, purified in house – see Dewari et al., 2018) was added to form the ribonucleoprotein (RNP) complex at room temperature for 10 min, then stored on ice; 300 ng of donor dsDNA were denatured in 30% DMSO by incubating at 95°C for 5 min followed by immediate immersion in ice. The donor dsDNA and RNPs were electroporated into E28 cells using the 4D Amaxa X Unit (programme DN-100). After 2 weeks of serial expansion of cells in 2D culture, knock-in efficiency was assessed by suspending 5–7 × 10⁵ cells in 0.2% BSA/PBS and analysing them on a BD LSRFortessa Cell Analyzer, with cells electroporated with tracrRNA:Cas9 only as a negative control. Cells were then further sorted into a pure KI population, and mCherry KI was verified by immunofluorescence for mCherry and Rpb1.

Imaging

Slides were imaged on epifluorescence microscopes (Zeiss AxioImager 2 and Zeiss AxioImager.A1) and the SoRa spinning disk confocal microscope (Nikon CSU-W1 SoRa). For 3D image analysis, images were taken with the SoRa microscope and a 3 μm section across each nucleus was imaged in 0.1 μm steps. Images were denoised and deconvolved using NIS deconvolution software (blind preset or Lucy-Richardson) (Nikon). 3D images are shown in the figures as MIP prepared using ImageJ.

Quantification and statistical analysis

Image analysis of nuclear localization

Images were analysed using Imaris v9.7 and Fiji. The scripts used to perform nuclear territory analysis have been described elsewhere (Boyle et al., 2001; Croft et al., 1999; see also Data availability). Briefly, single-slice images were taken with a ×20 lens using the Zeiss AxioImager 2, imaging at least 50 nuclei per cell line. The images were segmented first to individual nuclei, and subsequently the area of the DAPI signal was segmented to define the nuclear area. This area was segmented into concentric shells of equal area from the periphery to the centre of each nucleus. The signal intensity of each FISH probe or chromosome paint signal was calculated, with normalization for the DAPI signal in each shell.
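The equal-area shell idea can be illustrated with a simplified Python sketch. It makes a strong assumption that the published scripts do not: the nucleus is treated as a perfect circle, so that shell membership follows from radial position alone, whereas the actual analysis works on the segmented DAPI area. The default of five shells, the function names, and the input format (radial position, FISH intensity, DAPI intensity per pixel) are all hypothetical.

```python
def shell_index(r, nuclear_radius, n_shells=5):
    """Equal-area shell for a pixel at radial distance r in a circular nucleus.

    Shell 0 is the most peripheral; shell n_shells - 1 is the most central.
    """
    frac_area_inside = (r / nuclear_radius) ** 2      # share of nuclear area within r
    from_centre = min(int(frac_area_inside * n_shells), n_shells - 1)
    return n_shells - 1 - from_centre

def normalised_shell_signal(pixels, nuclear_radius, n_shells=5):
    """Per-shell share of FISH signal divided by per-shell share of DAPI signal.

    pixels: iterable of (r, fish_intensity, dapi_intensity).
    Values above 1 indicate enrichment of the probe in that shell.
    Assumes the total FISH and DAPI signals are both non-zero.
    """
    fish = [0.0] * n_shells
    dapi = [0.0] * n_shells
    for r, f, d in pixels:
        s = shell_index(r, nuclear_radius, n_shells)
        fish[s] += f
        dapi[s] += d
    total_fish, total_dapi = sum(fish), sum(dapi)
    return [
        (fish[s] / total_fish) / (dapi[s] / total_dapi) if dapi[s] > 0 else float("nan")
        for s in range(n_shells)
    ]

# Toy example with two shells: all FISH signal is in the central shell
toy = normalised_shell_signal([(1.0, 4.0, 1.0), (9.0, 0.0, 1.0)],
                              nuclear_radius=10.0, n_shells=2)
```

A probe distributed more towards the centre of the nucleus gives higher normalized values in the high-index (central) shells, as in the toy example.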

Image analysis of ecDNA and large PolII foci

For 3D analysis, deconvolved images were analysed using Imaris (v9.7) and all analysis was performed on the full 3D image. RNA and DNA FISH foci, and where relevant, large PolII foci, were defined, counted and distances between them calculated, using the Spots function within Imaris. Imaris spot size diameter was selected by single plane measurement of representative foci and this defined diameter was applied to all nuclei of a given experiment for 3D analysis. For DNA FISH analysis, E26, E28, and E25 spot size was 300 nm diameter, and where indicated in the text, reanalysed with 150 nm spot diameter. For E20 and all RNA FISH experiments, a spot size diameter of 200 nm was used. For RPB1 and POLR2G foci (IF), large foci were defined as those ≥500 nm diameter (Cho et al., 2018; Sabari et al., 2018).
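As an illustration of the distance measurements between spot classes, the sketch below computes, for each ecDNA spot, the distance to the nearest PolII focus. It assumes spot centres have been exported as (x, y, z) tuples in µm; the function names and the distance threshold are hypothetical, and it does not reproduce the Imaris Spots implementation.

```python
import math


def nearest_distances(spots_a, spots_b):
    """For each spot in spots_a, return the Euclidean distance to the
    closest spot in spots_b (coordinates as (x, y, z) tuples)."""
    return [min(math.dist(a, b) for b in spots_b) for a in spots_a]


def fraction_within(spots_a, spots_b, threshold_um):
    """Fraction of spots_a whose nearest neighbour in spots_b lies within
    threshold_um. The threshold is a free parameter of this sketch."""
    distances = nearest_distances(spots_a, spots_b)
    return sum(d <= threshold_um for d in distances) / len(distances)
```

For example, `fraction_within(ecdna_spots, polii_foci, 0.5)` would report the share of ecDNA spots lying within 0.5 µm of a large PolII focus, where both arguments are lists of exported spot centres.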

For 3D cluster analysis of FISH spots, Ripley’s K function was computed from the x,y,z coordinates of each FISH spot, obtained with the Imaris Spots function, to determine observed and null distribution values.

$$K(r) = \frac{1}{\lambda} \sum_{i \neq j} \frac{I\{d(i,j) \le r\}}{n}$$

Ripley’s K function compares the number of points at a distance smaller than a given radius r, relative to the average number of points in the volume. This average is the density λ, in this case the number of foci, n, divided by the volume. In the above equation, I is the indicator function which equals 1 if the distance between points i and j is no larger than r, and 0 otherwise. A high value of Ripley’s K function represents clustering at the given radius r, whereas a low value represents dispersion. By comparing the observed value of Ripley’s K function at a given radius with that computed on the same number of foci and with the same volume but drawn from a uniform null distribution, the presence of significant clustering in the given cluster at the given radius can be detected.
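The definition above translates directly into a few lines of code. The following is a minimal sketch rather than the published implementation: it counts ordered pairs of foci no further apart than r and divides by λn with λ = n / volume, applies no edge correction, and uses a hypothetical function name.

```python
import math


def ripley_k(points, radius, volume):
    """K(r) = (1 / lambda) * sum over i != j of I{d(i, j) <= r} / n,
    where lambda = n / volume is the density of foci."""
    n = len(points)
    density = n / volume
    close_pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= radius
    )
    return close_pairs / (density * n)
```

For points scattered uniformly at random in 3D and ignoring boundary effects, K(r) is expected to be close to the volume of a sphere of radius r, (4/3)πr³, so observed values well above this indicate clustering at that radius.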

This analysis was performed using a script written in Python (v3.9), which has been made available on GitHub (see Data availability). Ripley’s K function was determined across a radius of 0.1–1 µm in 0.1 µm increments. After calculating the observed Ripley’s K function value, a null distribution of no clustering, estimated on uniformly distributed samples with the same number of spots, was generated using the coordinates for each given nucleus to calculate 10,000 Ripley’s K function values at each radial increment. We tested a sample of nuclei with 50,000 values and confirmed that 10,000 values would provide sufficient accuracy. Having confirmed on a sample of nuclei that nucleus shape and size did not affect the significance of a result at each increment in the given range of radii, we used a bounding radius of 5 for all samples. Only nuclei with greater than 20 EGFR foci were included, both to ensure that the majority of foci were ecEGFR and to allow adequate granularity and minimize the risk of a false negative result due to lack of foci. The p-value for each observed K function was established against the expected values using the Neyman-Pearson lemma. Where the observed and expected K function at p=0.05 were the same, a randomized binomial test was performed to determine if p<0.05 for the observed value, weighting the probability of success as the ratio of the number of values p<0.05 and the total number of equal values. Having determined this, we took the most optimistic estimate of the p-value, that is, the one that would favour identification of a significant result and so bias the analysis in favour of significant clustering. A Benjamini-Hochberg procedure was performed to control for the false discovery rate (FDR = 0.05).
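The overall testing logic can be sketched as follows. This is a simplified stand-in, not the deposited code: it draws null samples uniformly from a sphere of the bounding radius, uses a plain empirical one-sided p-value with a +1 correction in place of the Neyman-Pearson construction and randomized tie handling described above, and all function names are hypothetical. The Benjamini-Hochberg step follows the standard step-up procedure.

```python
import math
import random


def ripley_k(points, radius, volume):
    """Ripley's K without edge correction (ordered pairs within radius)."""
    n = len(points)
    pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= radius
    )
    return pairs * volume / (n * n)


def uniform_in_sphere(n, bounding_radius, rng):
    """Draw n points uniformly from a sphere by rejection sampling from
    its bounding cube."""
    points = []
    while len(points) < n:
        p = tuple(rng.uniform(-bounding_radius, bounding_radius) for _ in range(3))
        if math.hypot(*p) <= bounding_radius:
            points.append(p)
    return points


def monte_carlo_p(points, radius, bounding_radius, n_null=10_000, seed=0):
    """Empirical p-value: share of null K values at least as large as the
    observed K (simplified; ties are counted against significance)."""
    rng = random.Random(seed)
    volume = 4.0 / 3.0 * math.pi * bounding_radius ** 3
    observed = ripley_k(points, radius, volume)
    at_least = sum(
        ripley_k(uniform_in_sphere(len(points), bounding_radius, rng), radius, volume)
        >= observed
        for _ in range(n_null)
    )
    return (at_least + 1) / (n_null + 1)


def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    while controlling the false discovery rate at the given level."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

In use, `monte_carlo_p` would be called once per nucleus and radius (0.1 to 1 µm in 0.1 µm steps), and the resulting p-values passed together to `benjamini_hochberg`.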

All other statistical analysis was performed with GraphPad Prism v9.0. The statistical details for each experiment can be found in the relevant figure legends and in the Source Data. For figures, p-values are represented as follows: *<0.05, **<0.01, ***<0.001, ****<0.0001. Where appropriate, Bonferroni correction for multiple hypothesis testing was performed, and, where relevant, corrected p-values are those plotted in the figures and are given in the Source Data in brackets next to the uncorrected p-value.
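For readers reproducing the figure annotations outside GraphPad Prism, the star convention and the Bonferroni correction can be written as a short sketch. The function names are hypothetical, and the "ns" label for non-significant values is an assumption, since the text defines only the starred thresholds.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significance_stars(p):
    """Map a p-value to the star notation used in the figures."""
    for threshold, stars in (
        (0.0001, "****"),
        (0.001, "***"),
        (0.01, "**"),
        (0.05, "*"),
    ):
        if p < threshold:
            return stars
    return "ns"  # assumed label; not defined in the source text
```

Applying `significance_stars` to the output of `bonferroni` mirrors the statement that corrected p-values are the ones plotted.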

RNA and WGS sequencing sample preparation, analysis, and processing

The preparation of these cell lines for RNA-seq has been described in detail elsewhere (Gangoso et al., 2021). WGS was undertaken by BGI Tech Solutions with PE100 and normal library construction. WGS, RNA-seq, and AmpliconArchitect data for GBM39 were taken from a previous publication and the NCBI Sequence Read Archive (BioProject: PRJNA506071) (Wu et al., 2019).

Sequences were aligned to hg38 with STAR 2.7.1a with settings ‘--outFilterMultimapNmax 1’ used for WGS and RNA-seq data and settings ‘--alignMatesGapMax 2000 --alignIntronMax 1 --alignEndsType EndToEnd’ used only for WGS data (Dobin et al., 2013). Duplicate reads were removed using Picard (Broad Institute). AmpliconArchitect (Deshpande et al., 2019) and AmpliconClassifier (Kim et al., 2020) were used to predict the ecDNA regions and classify circular amplicons for E26 and E28, and to classify EGFR exons as being located primarily on ecDNA or only on chromosomal DNA in E26 and E28. Exon coordinates were extracted from Ensembl (isoform: EGFR-201, Ensembl Transcript ID: ENST00000275493.7). Alignments were converted to bigWig files using deepTools bamCoverage with setting ‘--normalizeUsingRPKM’ (Ramírez et al., 2016) and visualized using the UCSC genome browser (Kent et al., 2002). HOMER2 (Heinz et al., 2010) makeTagDirectory and annotatePeaks.pl (settings ‘-len 0 -size given’) were used to count WGS and RNA-seq reads in EGFR exons. Analysis of RNA-seq counts per copy number was performed using scripts written in Python (v3.9). We normalized the RNA-seq read counts to the WGS read count in each EGFR exon, and analysed the results in GraphPad Prism v9.0. SNP calling was done using strelka v2.9.10 (Kim et al., 2018) using the configureStrelkaGermlineWorkflow.py command on all samples (WGS blood, WGS tumour, and RNA-seq tumour) for each cell line (E26 and E28) separately. SNPs that passed all filters were extracted using bcftools (Danecek et al., 2021) and selected for those that had an allele frequency in the WGS blood between 40% and 60%. The ratio of allele frequencies between the RNA-seq and WGS tumour samples was determined for those SNPs overlapping expressed exons with at least 20 reads in the RNA-seq samples. See Data availability.
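The per-copy normalization and SNP filtering steps can be illustrated with a small sketch. It is not the deposited analysis code: the dictionary keys (`af_blood`, `af_tumour`, `af_rna`, `rna_depth`) and function names are hypothetical, and the inputs are assumed to be per-exon read counts and per-SNP allele frequencies already extracted from the HOMER and bcftools outputs.

```python
def expression_per_copy(rna_counts, wgs_counts):
    """Divide the RNA-seq read count by the WGS read count for each exon,
    skipping exons without WGS coverage."""
    return {
        exon: rna_counts[exon] / wgs_counts[exon]
        for exon in rna_counts
        if wgs_counts.get(exon, 0) > 0
    }


def informative_snps(snps, low=0.40, high=0.60, min_rna_reads=20):
    """Keep SNPs that look heterozygous in blood WGS (allele frequency
    between low and high) and have enough RNA-seq reads."""
    return [
        s
        for s in snps
        if low <= s["af_blood"] <= high and s["rna_depth"] >= min_rna_reads
    ]


def allele_ratio(snp):
    """RNA-seq allele frequency relative to tumour WGS allele frequency."""
    return snp["af_rna"] / snp["af_tumour"]
```

Under these assumptions, an allele ratio close to 1 would mean that an allele's share of transcripts tracks its share of DNA copies in the tumour.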

Source data

Source data regarding the statistical tests applied, the exact sample number, p-values of tests (and adjustments for multiple hypothesis testing), and details of replicates are included where indicated in the article. N=number of nuclei.

Acknowledgements

SVB would like to thank Dr Tim Cannings for helpful suggestions on statistical analysis.

We acknowledge the Advanced Imaging Resource at the Institute of Genetics and Cancer and the Edinburgh Super-Resolution Imaging Consortium (ESRIC), and the Flow Cytometry team at the Centre for Regenerative Medicine, University of Edinburgh, for their technical support. This work has made use of the resources provided by the Edinburgh Compute and Data Facility (ECDF) (http://www.ecdf.ed.ac.uk/). KP was supported by a Wellcome PhD Training Fellowship (220399/Z/20/Z). ETF was supported by the Swiss National Science Foundation (P2ELP3_191695). GMM and the Glioma Cellular Genetics Resource (https://www.gcgr.org.uk/) were supported by the Cancer Research UK (CRUK) Centre Accelerator Award (A21922). AH was supported by a CRUK PhD Fellowship (C157/A29279). SMP is a Cancer Research UK Senior Research Fellow (A17368). Work in the group of WAB is supported by MRC University Unit grant MC_UU_00007/2.

Funding sources were not involved in study design, data collection, data interpretation, or the decision to submit the work for publication.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Contributor Information

Steven M Pollard, Email: steven.pollard@ed.ac.uk.

Wendy A Bickmore, Email: wendy.bickmore@ed.ac.uk.

Jessica K Tyler, Weill Cornell Medicine, United States.

Funding Information

This paper was supported by the following grants:

  • Wellcome Trust 220399/Z/20/Z to Karin Purshouse.

  • Swiss National Science Foundation P2ELP3_191695 to Elias T Friman.

  • Cancer Research UK DRCNPG-Nov21\100002 to Steven M Pollard.

  • Cancer Research UK C157/A29279 to Alhafidz Hamdan.

  • Cancer Research UK A17368 to Karin Purshouse.

  • Medical Research Foundation MC_UU_00007/2 to Wendy A Bickmore.

Additional information

Competing interests

No competing interests declared.

Author contributions

Investigation, Methodology, Writing – original draft, Writing – review and editing.

Formal analysis, Methodology, Writing – review and editing.

Investigation.

Investigation.

Investigation.

Formal analysis, Investigation.

Investigation.

Resources.

Formal analysis, Methodology, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Writing – original draft, Project administration, Writing – review and editing.

Additional files

Supplementary file 1. Genomic information for FISH probes and CRISPR knockin.

(A) Fosmid probes for DNA FISH related to STAR methods. Genome coordinates (Mb) are from the hg38 assembly of the human genome. (B) crRNA sequence and dsDNA sequence for mCherry_PolR2G CRISPR knock-in.

elife-80207-supp1.docx (14.4KB, docx)
MDAR checklist

Data availability

WGS and RNA-seq data have been deposited on NCBI GEO under study accession number GSE215420 and are publicly available as of the date of publication. As indicated in the Key Resources, all original code has been deposited at: https://github.com/IGC-Advanced-Imaging-Resource/Purshouse2022_paper (copy archived at swh:1:rev:5b1a3920afa8e85132c94bcc6dfce94575f939ce); https://github.com/SjoerdVBeentjes/ripleyk (copy archived at swh:1:rev:1303af539403303786b6460fabef355e345ea6c9); and https://github.com/kpurshouse/ecDNAcluster (copy archived at swh:1:rev:9162a39f3c8e19e973eaedc50ad4e1d3dc570e90).

The following dataset was generated:

Purshouse et al 2022. WGS and RNA-seq data E26,E28. NCBI Gene Expression Omnibus. GSE215420

References

  1. Adelman K, Martin BJE. EcDNA Party bus: bringing the enhancer to you. Molecular Cell. 2021;81:1866–1867. doi: 10.1016/j.molcel.2021.04.017.
  2. Boyle S, Gilchrist S, Bridger JM, Mahy NL, Ellis JA, Bickmore WA. The spatial organization of human chromosomes within the nuclei of normal and emerin-mutant cells. Human Molecular Genetics. 2001;10:211–219. doi: 10.1093/hmg/10.3.211.
  3. Brennan CW, Verhaak RGW, McKenna A, Campos B, Noushmehr H, Salama SR, Zheng S, Chakravarty D, Sanborn JZ, Berman SH, Beroukhim R, Bernard B, Wu CJ, Genovese G, Shmulevich I, Barnholtz-Sloan J, Zou L, Vegesna R, Shukla SA, Ciriello G, Yung WK, Zhang W, Sougnez C, Mikkelsen T, Aldape K, Bigner DD, Van Meir EG, Prados M, Sloan A, Black KL, Eschbacher J, Finocchiaro G, Friedman W, Andrews DW, Guha A, Iacocca M, O’Neill BP, Foltz G, Myers J, Weisenberger DJ, Penny R, Kucherlapati R, Perou CM, Hayes DN, Gibbs R, Marra M, Mills GB, Lander E, Spellman P, Wilson R, Sander C, Weinstein J, Meyerson M, Gabriel S, Laird PW, Haussler D, Getz G, Chin L. The somatic genomic landscape of glioblastoma. Cell. 2013;157:753. doi: 10.1016/j.cell.2014.04.004.
  4. Bulstrode H, Johnstone E, Marques-Torrejon MA, Ferguson KM, Bressan RB, Blin C, Grant V, Gogolok S, Gangoso E, Gagrica S, Ender C, Fotaki V, Sproul D, Bertone P, Pollard SM. Elevated FOXG1 and Sox2 in glioblastoma enforces neural stem cell identity through transcriptional control of cell cycle and epigenetic regulators. Genes & Development. 2017;31:757–773. doi: 10.1101/gad.293027.116.
  5. Carvalho C, Pereira HM, Ferreira J, Pina C, Mendonça D, Rosa AC, Carmo-Fonseca M. Chromosomal G-dark bands determine the spatial organization of centromeric heterochromatin in the nucleus. Molecular Biology of the Cell. 2001;12:3563–3572. doi: 10.1091/mbc.12.11.3563.
  6. Cho WK, Spille JH, Hecht M, Lee C, Li C, Grube V, Cisse II. Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science. 2018;361:412–415. doi: 10.1126/science.aar4199.
  7. Chong S, Dugast-Darzacq C, Liu Z, Dong P, Dailey GM, Cattoglio C, Heckert A, Banala S, Lavis L, Darzacq X, Tjian R. Imaging dynamic and selective low-complexity domain interactions that control gene transcription. Science. 2018;361:eaar2555. doi: 10.1126/science.aar2555.
  8. Cox D, Yuncken C, Spriggs AI. Minute chromatin bodies in malignant tumours of childhood. Lancet. 1965;1:55–58. doi: 10.1016/s0140-6736(65)90131-5.
  9. Cramer P, Bushnell DA, Fu J, Gnatt AL, Maier-Davis B, Thompson NE, Burgess RR, Edwards AM, David PR, Kornberg RD. Architecture of RNA polymerase II and implications for the transcription mechanism. Science. 2000;288:640–649. doi: 10.1126/science.288.5466.640.
  10. Croft JA, Bridger JM, Boyle S, Perry P, Teague P, Bickmore WA. Differences in the localization and morphology of chromosomes in the human nucleus. The Journal of Cell Biology. 1999;145:1119–1131. doi: 10.1083/jcb.145.6.1119.
  11. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. Twelve years of samtools and bcftools. GigaScience. 2021;10:giab008. doi: 10.1093/gigascience/giab008.
  12. Deshpande V, Luebeck J, Nguyen NPD, Bakhtiari M, Turner KM, Schwab R, Carter H, Mischel PS, Bafna V. Exploring the landscape of focal amplifications in cancer using ampliconarchitect. Nature Communications. 2019;10:392. doi: 10.1038/s41467-018-08200-y.
  13. Dewari PS, Southgate B, Mccarten K, Monogarov G, O’Duibhir E, Quinn N, Tyrer A, Leitner M-C, Plumb C, Kalantzaki M, Blin C, Finch R, Bressan RB, Morrison G, Jacobi AM, Behlke MA, von Kriegsheim A, Tomlinson S, Krijgsveld J, Pollard SM. An efficient and scalable pipeline for epitope tagging in mammalian stem cells using Cas9 ribonucleoprotein. eLife. 2018;7:e35069. doi: 10.7554/eLife.35069.
  14. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
  15. Fan Y, Mao R, Lv H, Xu J, Yan L, Liu Y, Shi M, Ji G, Yu Y, Bai J, Jin Y, Fu S. Frequency of double minute chromosomes and combined cytogenetic abnormalities and their characteristics. Journal of Applied Genetics. 2011;52:53–59. doi: 10.1007/s13353-010-0007-z.
  16. Gangoso E, Southgate B, Bradley L, Rus S, Galvez-Cancino F, McGivern N, Güç E, Kapourani CA, Byron A, Ferguson KM, Alfazema N, Morrison G, Grant V, Blin C, Sou I, Marques-Torrejon MA, Conde L, Parrinello S, Herrero J, Beck S, Brandner S, Brennan PM, Bertone P, Pollard JW, Quezada SA, Sproul D, Frame MC, Serrels A, Pollard SM. Glioblastomas acquire myeloid-affiliated transcriptional programs via epigenetic immunoediting to elicit immune evasion. Cell. 2021;184:2454–2470. doi: 10.1016/j.cell.2021.03.023.
  17. Gibaud A, Vogt N, Hadj-Hamou NS, Meyniel JP, Hupé P, Debatisse M, Malfoy B. Extrachromosomal amplification mechanisms in a glioma with amplified sequences from multiple chromosome loci. Human Molecular Genetics. 2010;19:1276–1285. doi: 10.1093/hmg/ddq004.
  18. Hamkalo BA, Farnham PJ, Johnston R, Schimke RT. Ultrastructural features of minute chromosomes in a methotrexate-resistant mouse 3T3 cell line. PNAS. 1985;82:1126–1130. doi: 10.1073/pnas.82.4.1126.
  19. Hansen JC, Maeshima K, Hendzel MJ. The solid and liquid states of chromatin. Epigenetics & Chromatin. 2021;14:50. doi: 10.1186/s13072-021-00424-5.
  20. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Molecular Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004.
  21. Hung KL, Yost KE, Xie L, Shi Q, Helmsauer K, Luebeck J, Schöpflin R, Lange JT, Chamorro González R, Weiser NE, Chen C, Valieva ME, Wong IT-L, Wu S, Dehkordi SR, Duffy CV, Kraft K, Tang J, Belk JA, Rose JC, Corces MR, Granja JM, Li R, Rajkumar U, Friedlein J, Bagchi A, Satpathy AT, Tjian R, Mundlos S, Bafna V, Henssen AG, Mischel PS, Liu Z, Chang HY. EcDNA hubs drive cooperative intermolecular oncogene expression. Nature. 2021;600:731–736. doi: 10.1038/s41586-021-04116-8.
  22. Inda MM, Bonavia R, Mukasa A, Narita Y, Sah DWY, Vandenberg S, Brennan C, Johns TG, Bachoo R, Hadwiger P, Tan P, Depinho RA, Cavenee W, Furnari F. Tumor heterogeneity is an active process maintained by a mutant EGFR-induced cytokine circuit in glioblastoma. Genes & Development. 2010;24:1731–1745. doi: 10.1101/gad.1890510.
  23. Jubb A, Boyle S. In: In Situ Hybridization Protocols. Nielsen BS, Jones J, editors. New York, NY: Springer; 2020. Visualizing genome reorganization using 3D DNA FISH; pp. 85–95.
  24. Kalhor R, Tjong H, Jayathilaka N, Alber F, Chen L. Genome architectures revealed by tethered chromosome conformation capture and population-based modeling. Nature Biotechnology. 2011;30:90–98. doi: 10.1038/nbt.2057.
  25. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at UCSC. Genome Research. 2002;12:996–1006. doi: 10.1101/gr.229102.
  26. Kim S, Scheffler K, Halpern AL, Bekritsky MA, Noh E, Källberg M, Chen X, Kim Y, Beyter D, Krusche P, Saunders CT. Strelka2: fast and accurate calling of germline and somatic variants. Nature Methods. 2018;15:591–594. doi: 10.1038/s41592-018-0051-x.
  27. Kim H, Nguyen NP, Turner K, Wu S, Gujar AD, Luebeck J, Liu J, Deshpande V, Rajkumar U, Namburi S, Amin SB, Yi E, Menghi F, Schulte JH, Henssen AG, Chang HY, Beck CR, Mischel PS, Bafna V, Verhaak RGW. Extrachromosomal DNA is associated with oncogene amplification and poor outcome across multiple cancers. Nature Genetics. 2020;52:891–897. doi: 10.1038/s41588-020-0678-2.
  28. Koga T, Li B, Figueroa JM, Ren B, Chen CC, Carter BS, Furnari FB. Mapping of genomic EGFRvIII deletions in glioblastoma: insight into rearrangement mechanisms and biomarker development. Neuro-Oncology. 2018;20:1310–1320. doi: 10.1093/neuonc/noy058.
  29. Lange JT, Rose JC, Chen CY, Pichugin Y, Xie L, Tang J, Hung KL, Yost KE, Shi Q, Erb ML, Rajkumar U, Wu S, Taschner-Mandl S, Bernkopf M, Swanton C, Liu Z, Huang W, Chang HY, Bafna V, Henssen AG, Werner B, Mischel PS. The evolutionary dynamics of extrachromosomal DNA in human cancers. Nature Genetics. 2022;54:1527–1533. doi: 10.1038/s41588-022-01177-x.
  30. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and samtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  31. Mahy NL, Perry PE, Bickmore WA. Gene density and transcription influence the localization of chromatin outside of chromosome territories detectable by fish. The Journal of Cell Biology. 2002;159:753–763. doi: 10.1083/jcb.200207115.
  32. Morton AR, Dogan-Artun N, Faber ZJ, MacLeod G, Bartels CF, Piazza MS, Allan KC, Mack SC, Wang X, Gimple RC, Wu Q, Rubin BP, Shetty S, Angers S, Dirks PB, Sallari RC, Lupien M, Rich JN, Scacheri PC. Functional enhancers shape extrachromosomal oncogene amplifications. Cell. 2019;179:1330–1341. doi: 10.1016/j.cell.2019.10.039.
  33. Nathanson DA, Gini B, Mottahedeh J, Visnyei K, Koga T, Gomez G, Eskin A, Hwang K, Wang J, Masui K, Paucar A, Yang H, Ohashi M, Zhu S, Wykosky J, Reed R, Nelson SF, Cloughesy TF, James CD, Rao PN, Kornblum HI, Heath JR, Cavenee WK, Furnari FB, Mischel PS. Targeted therapy resistance mediated by dynamic regulation of extrachromosomal mutant EGFR DNA. Science. 2014;343:72–76. doi: 10.1126/science.1241328.
  34. Pollard SM, Yoshikawa K, Clarke ID, Danovi D, Stricker S, Russell R, Bayani J, Head R, Lee M, Bernstein M, Squire JA, Smith A, Dirks P. Glioma stem cell lines expanded in adherent culture have tumor-specific phenotypes and are suitable for chemical and genetic screens. Stem Cell. 2009;4:568–580. doi: 10.1016/j.stem.2009.03.014.
  35. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
  36. Rai AK, Chen JX, Selbach M, Pelkmans L. Kinase-Controlled phase transition of membraneless organelles in mitosis. Nature. 2018;559:211–216. doi: 10.1038/s41586-018-0279-8.
  37. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dündar F, Manke T. DeepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Research. 2016;44:W160–W165. doi: 10.1093/nar/gkw257.
  38. Richards LM, Whitley OKN, MacLeod G, Cavalli FMG, Coutinho FJ, Jaramillo JE, Svergun N, Riverin M, Croucher DC, Kushida M, Yu K, Guilhamon P, Rastegar N, Ahmadi M, Bhatti JK, Bozek DA, Li N, Lee L, Che C, Luis E, Park NI, Xu Z, Ketela T, Moore RA, Marra MA, Spears J, Cusimano MD, Das S, Bernstein M, Haibe-Kains B, Lupien M, Luchman HA, Weiss S, Angers S, Dirks PB, Bader GD, Pugh TJ. Gradient of developmental and injury response transcriptional states defines functional vulnerabilities underpinning glioblastoma heterogeneity. Nature Cancer. 2021;2:157–173. doi: 10.1038/s43018-020-00154-9.
  39. Rosswog C, Bartenhagen C, Welte A, Kahlert Y, Hemstedt N, Lorenz W, Cartolano M, Ackermann S, Perner S, Vogel W, Altmüller J, Nürnberg P, Hertwig F, Göhring G, Lilienweiss E, Stütz AM, Korbel JO, Thomas RK, Peifer M, Fischer M. Chromothripsis followed by circular recombination drives oncogene amplification in human cancer. Nature Genetics. 2021;53:1673–1685. doi: 10.1038/s41588-021-00951-7.
  40. Sabari BR, Dall’Agnese A, Boija A, Klein IA, Coffey EL, Shrinivas K, Abraham BJ, Hannett NM, Zamudio AV, Manteiga JC, Li CH, Guo YE, Day DS, Schuijers J, Vasile E, Malik S, Hnisz D, Lee TI, Cisse II, Roeder RG, Sharp PA, Chakraborty AK, Young RA. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018;361:eaar3958. doi: 10.1126/science.aar3958.
  41. Shoshani O, Brunner SF, Yaeger R, Ly P, Nechemia-Arbely Y, Kim DH, Fang R, Castillon GA, Yu M, Li JSZ, Sun Y, Ellisman MH, Ren B, Campbell PJ, Cleveland DW. Chromothripsis drives the evolution of gene amplification in cancer. Nature. 2021;591:137–141. doi: 10.1038/s41586-020-03064-z.
  42. Snuderl M, Fazlollahi L, Le LP, Nitta M, Zhelyazkova BH, Davidson CJ, Akhavanfard S, Cahill DP, Aldape KD, Betensky RA, Louis DN, Iafrate AJ. Mosaic amplification of multiple receptor tyrosine kinase genes in glioblastoma. Cancer Cell. 2011;20:810–817. doi: 10.1016/j.ccr.2011.11.005.
  43. Strom AR, Brangwynne CP. The liquid nucleome-phase transitions in the nucleus at a glance. Journal of Cell Science. 2019;132:jcs235093. doi: 10.1242/jcs.235093.
  44. Suvà ML, Rheinbay E, Gillespie SM, Patel AP, Wakimoto H, Rabkin SD, Riggi N, Chi AS, Cahill DP, Nahed BV, Curry WT, Martuza RL, Rivera MN, Rossetti N, Kasif S, Beik S, Kadri S, Tirosh I, Wortman I, Shalek AK, Rozenblatt-Rosen O, Regev A, Louis DN, Bernstein BE. Reconstructing and reprogramming the tumor-propagating potential of glioblastoma stem-like cells. Cell. 2014;157:580–594. doi: 10.1016/j.cell.2014.02.030.
  45. Szerlip NJ, Pedraza A, Chakravarty D, Azim M, McGuire J, Fang Y, Ozawa T, Holland EC, Huse JT, Jhanwar S, Leversha MA, Mikkelsen T, Brennan CW. Intratumoral heterogeneity of receptor tyrosine kinases EGFR and PDGFRA amplification in glioblastoma defines subpopulations with distinct growth factor response. PNAS. 2012;109:3041–3046. doi: 10.1073/pnas.1114033109.
  46. Turner KM, Deshpande V, Beyter D, Koga T, Rusert J, Lee C, Li B, Arden K, Ren B, Nathanson DA, Kornblum HI, Taylor MD, Kaushal S, Cavenee WK, Wechsler-Reya R, Furnari FB, Vandenberg SR, Rao PN, Wahl GM, Bafna V, Mischel PS. Extrachromosomal oncogene amplification drives tumour evolution and genetic heterogeneity. Nature. 2017;543:122–125. doi: 10.1038/nature21356.
  47. Veatch SL, Machta BB, Shelby SA, Chiang EN, Holowka DA, Baird BA. Correlation functions quantify super-resolution images and estimate apparent clustering due to over-counting. PLOS ONE. 2012;7:e31457. doi: 10.1371/journal.pone.0031457.
  48. Verhaak RGW, Hoadley KA, Purdom E, Wang V, Qi Y, Wilkerson MD, Miller CR, Ding L, Golub T, Mesirov JP, Alexe G, Lawrence M, O’Kelly M, Tamayo P, Weir BA, Gabriel S, Winckler W, Gupta S, Jakkula L, Feiler HS, Hodgson JG, James CD, Sarkaria JN, Brennan C, Kahn A, Spellman PT, Wilson RK, Speed TP, Gray JW, Meyerson M, Getz G, Perou CM, Hayes DN, Cancer Genome Atlas Research Network. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell. 2010;17:98–110. doi: 10.1016/j.ccr.2009.12.020.
  49. Verhaak RGW, Bafna V, Mischel PS. Extrachromosomal oncogene amplification in tumour pathogenesis and evolution. Nature Reviews Cancer. 2019;19:283–288. doi: 10.1038/s41568-019-0128-6.
  50. Vicario R, Peg V, Morancho B, Zacarias-Fluck M, Zhang J, Martínez-Barriocanal Á, Navarro Jiménez A, Aura C, Burgues O, Lluch A, Cortés J, Nuciforo P, Rubio IT, Marangoni E, Deeds J, Boehm M, Schlegel R, Tabernero J, Mosher R, Arribas J. Patterns of HER2 gene amplification and response to anti-HER2 therapies. PLOS ONE. 2015;10:e0129876. doi: 10.1371/journal.pone.0129876.
  51. Vogt N, Lefèvre SH, Apiou F, Dutrillaux AM, Cör A, Leuraud P, Poupon MF, Dutrillaux B, Debatisse M, Malfoy B. Molecular structure of double-minute chromosomes bearing amplified copies of the epidermal growth factor receptor gene in gliomas. PNAS. 2004;101:11368–11373. doi: 10.1073/pnas.0402979101.
  52. Wang L-B, Karpova A, Gritsenko MA, Kyle JE, Cao S, Li Y, Rykunov D, Colaprico A, Rothstein JH, Hong R, Stathias V, Cornwell M, Petralia F, Wu Y, Reva B, Krug K, Pugliese P, Kawaler E, Olsen LK, Liang W-W, Song X, Dou Y, Wendl MC, Caravan W, Liu W, Cui Zhou D, Ji J, Tsai C-F, Petyuk VA, Moon J, Ma W, Chu RK, Weitz KK, Moore RJ, Monroe ME, Zhao R, Yang X, Yoo S, Krek A, Demopoulos A, Zhu H, Wyczalkowski MA, McMichael JF, Henderson BL, Lindgren CM, Boekweg H, Lu S, Baral J, Yao L, Stratton KG, Bramer LM, Zink E, Couvillion SP, Bloodsworth KJ, Satpathy S, Sieh W, Boca SM, Schürer S, Chen F, Wiznerowicz M, Ketchum KA, Boja ES, Kinsinger CR, Robles AI, Hiltke T, Thiagarajan M, Nesvizhskii AI, Zhang B, Mani DR, Ceccarelli M, Chen XS, Cottingham SL, Li QK, Kim AH, Fenyö D, Ruggles KV, Rodriguez H, Mesri M, Payne SH, Resnick AC, Wang P, Smith RD, Iavarone A, Chheda MG, Barnholtz-Sloan JS, Rodland KD, Liu T, Ding L, Clinical Proteomic Tumor Analysis Consortium. Proteogenomic and metabolomic characterization of human glioblastoma. Cancer Cell. 2021;39:509–528. doi: 10.1016/j.ccell.2021.01.006.
  53. Williamson I, Lettice LA, Hill RE, Bickmore WA. Shh and ZRS enhancer colocalisation is specific to the zone of polarising activity. Development. 2016;143:2994–3001. doi: 10.1242/dev.139188.
  54. Williamson I, Kane L, Devenney PS, Flyamer IM, Anderson E, Kilanowski F, Hill RE, Bickmore WA, Lettice LA. Developmentally regulated shh expression is robust to TAD perturbations. Development. 2019;146:dev179523. doi: 10.1242/dev.179523.
  55. Wu S, Turner KM, Nguyen N, Raviram R, Erb M, Santini J, Luebeck J, Rajkumar U, Diao Y, Li B, Zhang W, Jameson N, Corces MR, Granja JM, Chen X, Coruh C, Abnousi A, Houston J, Ye Z, Hu R, Yu M, Kim H, Law JA, Verhaak RGW, Hu M, Furnari FB, Chang HY, Ren B, Bafna V, Mischel PS. Circular ecdna promotes accessible chromatin and high oncogene expression. Nature. 2019;575:699–703. doi: 10.1038/s41586-019-1763-5.
  56. Yi E, Gujar AD, Guthrie M, Kim H, Zhao D, Johnson KC, Amin SB, Costa ML, Yu Q, Das S, Jillette N, Clow PA, Cheng AW, Verhaak RGW. Live-cell imaging shows uneven segregation of extrachromosomal DNA elements and transcriptionally active extrachromosomal DNA hubs in cancer. Cancer Discovery. 2021;12:468–483. doi: 10.1158/2159-8290.CD-21-1376.
  57. Zhu Y, Gujar AD, Wong CH, Tjong H, Ngan CY, Gong L, Chen YA, Kim H, Liu J, Li M, Mil-Homens A, Maurya R, Kuhlberg C, Sun F, Yi E, AC. Ruan Y, Verhaak RGW, Wei CL. Oncogenic extrachromosomal DNA functions as mobile enhancers to globally amplify chromosomal transcription. Cancer Cell. 2021;39:694–707. doi: 10.1016/j.ccell.2021.03.006.

Editor's evaluation

Jessica K Tyler

This study convincingly shows that, in contrast to recent reports, the transcriptional output of oncogenes carried on extrachromosomal DNA (ecDNA) in glioblastoma cell lines is driven by the copy number of the ecDNA, rather than their spatial localization into transcriptional hubs. This study is relevant to researchers interested in nuclear function, particularly transcriptional organization within malignant cells.

Decision letter

Editor: Jessica K Tyler

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

[Editors' note: this paper was reviewed by Review Commons.]

eLife. 2022 Dec 7;11:e80207. doi: 10.7554/eLife.80207.sa2

Author response


General Statements

We thank the reviewers for the detailed and considered comments. We believe they have significantly improved and enhanced the manuscript.

Point-by-point description of the revisions

Reviewer #1 (Evidence, reproducibility and clarity (Required)):

The manuscript by Purshouse et al. is focused on the question of whether oncogene expression from amplified extrachromosomal DNA (ecDNA) is driven by their intranuclear positioning and special transcriptional control. Using microscopy after FISH or immuno-FISH followed by quantitative analysis, as well as RNA-seq and WGS, the authors show that the transcriptional output of oncogenes in the three studied glioblastoma cell lines depends primarily on the ecDNA copy number and not on spatial localization or the formation of so-called transcriptional hubs, as has been claimed in recent publications.

Overall, the study is very well conceived and presented, experimental approaches are adequate and carefully controlled. The presentation of the results is clear, the images are of a high quality, the data are comprehensively discussed and the paper is agreeable to read.

I have only a few small comments for the authors:

(1) Figure 1D,F and Figure S1C:

Classification of signals into peripheral and internal is extremely difficult in adherently growing flat cells, such as glioblastoma cells. This difficulty is aggravated by performing a 2D analysis of signal distribution within concentric shells. Therefore I wonder how the authors can exclude the possibility that signals in the central shells are sitting at the top or bottom of the nuclei, i.e., are in fact peripheral?

Analysis of 2D images to infer 3D organisation is a well-established approach. Though indeed, in a single nucleus a signal in the central shells might be on the top or bottom surface, averaging over many nuclei (assuming random orientation with respect to the slide surface) allows identification of a truly central preferential localisation (references included in the methods).

  1. Boyle, S., Gilchrist, S., Bridger, J.M., Mahy, N.L., Ellis, J.A., and Bickmore, W.A. (2001). The spatial organization of human chromosomes within the nuclei of normal and emerin-mutant cells. Hum. Mol. Genet. 10, 211–219.

  2. Croft, J.A., Bridger, J.M., Boyle, S., Perry, P., Teague, P., and Bickmore, W.A. (1999). Differences in the localization and morphology of chromosomes in the human nucleus. J. Cell Biol. 145, 1119–1131

Please note: Images were taken as a single plane image through a nucleus, and not as a stacked 3D image. Although GBM cells are relatively flat when grown as adherent cells, they still represent a 3D structure and foci in the top or bottom of the nuclei would not have been in focus in a 2D single plane image. In addition, our finding of ecDNA occupying a more central territory is consistent with previous publications, which we have cited.

(2) Although the authors note a significant reduction in peripheral localization of EGFR signal in nuclei of glioblastoma cells in comparison to control NSCs, the signal remains to a great degree peripheral – e.g. Figure S1C, Figure 2D, Figure 3D, Figure 4D (also see the previous comment). I am wondering whether it could be explained by sticking of ecDNA to the periphery of early/late anaphase chromatin during mitotic segregation? Have the authors observed ecDNA positioning at these cell cycle stages?

We considered exploring ecDNA positioning at different cell cycle stages, particularly whether quiescent GBM cells would have altered ecDNA regulation. However, we felt this represented a separate important research question that is beyond the scope of the current study. Moreover, cell cycle control would not alter our main conclusions. The patient-derived GBM models we have used here are slow growing and so very few cells will be in anaphase. In a culture of GSCs we anticipate that only ~1% are in mitosis, so this would have a limited contribution to the current data.

(3) Figure 1A shows an example of an HSR, the formation of which is the next step in the evolution of ecDNA elements in tumor cells. Have the authors observed HSRs in interphase nuclei? And if yes, what is their location – peripheral or internal?

There is no means of confirming that a group of EGFR DNA FISH foci represent an HSR rather than a group of clustered ecDNA (or double minutes) in an interphase nucleus. Consequently, as our study sought to evaluate the presence or absence of clustering of ecDNA, we were mindful of the possibility of HSRs in the E26 cells resulting in a ‘false positive’ clustering effect in our later cluster analysis. We decided for this reason to include cell lines, E28 and E25, where no HSRs were observed in metaphase spreads. This is a good point, however, and we therefore make this more explicit in the current results.

(4) First paragraph of the chapter "EcDNA do not cluster in the nucleus":

"We used 3D image-based analysis of the EGFR DNA FISH signals…"

What was the thickness of the studied nuclei after hybridization? As far as I understood from the Methods, nuclei were air-dried during the FISH procedure, a step which significantly flattens cells including nuclei. My concern is whether the analysis the authors performed is a real 3D analysis. I do not see a problem if it is 2D, but the limitations of the method have to be mentioned in any case.

Cells were only air dried after PFA fixation, so we believe the 3D structure will be largely preserved (see Methods). As noted in the Methods section, nuclear sections were imaged in 3D across 3 µm in 100 nm increments in the z plane. Even if flattening during processing did occur, this would have increased the likelihood of observing close clustering across the cross-section of each nucleus imaged.

We have added a note to the Methods ‘Metaphase spreads and interphase nuclei’ to make this methodology clear, and direct the reviewer to the Methods section ‘Imaging’ where the 3D imaging strategy is outlined.

(5) The same chapter: "…if there were clustering of ecDNA in the close proximity required for coordinated transcription in condensates or hubs; this should be ~200nm or less." I think it is important to explain why the authors have chosen this threshold of 200 nm.

We have now added a sentence to this paragraph (Results, ‘EcDNA in GBM stem cells do not cluster in the nucleus’) to provide a referenced justification for this threshold.

We chose this threshold based on optical resolution for conventional wide-field microscopy being ~200nm. The resolution of the SoRa super-resolution microscope used in our study is approximately 120nm. We have added this to the Discussion so this is clearer for the reader.

200nm has been used as a colocalisation threshold in published studies (see cited) exploring genomic loci proximity and contact domains, and previous work indicated that interactions between super-enhancers and BRD4/Med1 puncta appeared to occur within this distance. We therefore believe this threshold is appropriate to enable cross-comparison with existing literature.

In a similar spirit to this point, we felt greater explanation around choice of PolII hub size and how foci were measured using the Spots function in Imaris would be beneficial to the reader. We have added this to the ‘Image analysis of ecDNA and large PolII foci’ section of the methods.

(6) The same chapter: the authors consider a single FISH signal (a spot) as a single ecDNA element. Why are the authors sure that these spots are not small clusters themselves? From the figures throughout the paper, I can see that FISH spots do vary in size.

The use of a heterotypic ecDNA-harbouring cell line (E25) is central to address this concern. We have analysed another heterotypic ecDNA-harbouring cell line (E20) also in response to Reviewer 2; Point 1b.

Due to the variable nature of ecDNA structure, we agree that some FISH foci are smaller due to different ecDNA sizes and possibly different break points in individual ecDNA. To address these points, we have added the following to the manuscript (Discussion, para 3).

We have performed reanalysis of the raw data. The largest ecDNA EGFR FISH foci were similar in size to those on the native chromosomal signal in NSC cells, which suggests these are not multiple clusters of smaller foci. It should be noted that the images in figure 1 were taken with an epifluorescence microscope with a lower optical resolution appropriate to the erosion territory analysis, and this may overstate the presence of large ecDNA loci. To ensure we were minimising the risk of missing two closely localised elements, we performed all subsequent analysis in 3D with images taken on the SoRa microscope (optical resolution ~120nm).

We use two-colour FISH to visualise two ecDNA populations in two cell lines. This is an important control for the possibility of small clustering, as spots from the two different populations would be expected to cluster in this scenario. In the E25 cell line we repeated the analysis by using a 150nm size threshold in Imaris (i.e. half the size) to ensure there was no omission of such small foci in small clusters, and observed the same result (Figure 3 Supplemental Figure 1).

We repeated the analysis in greater numbers in another dual-ecDNA population cell line, E20, where metaphase spread showed that a small proportion of ecDNA (~10%) had both oncogenes on the same ecDNA (Figure 3 Supplemental Figure 2). A similar proportion were identified as colocalising via Ripley’s K. This gave us confidence both in our ability to capture true colocalisation, and that the lack of close clustering in all other interphase nuclei in this cell line was true.
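
To illustrate the kind of two-colour colocalisation test described above, the following is a minimal sketch of a cross-type Ripley's K estimate for two sets of 3D foci coordinates. It is not the analysis code used in the study: the estimator applies no edge correction, the nuclear volume is supplied by the user, and the function names (`cross_k`, `csr_expectation`) and coordinates are hypothetical. Under complete spatial randomness the expected value at radius r is (4/3)πr³, so an observed value well above that suggests the two populations sit closer together than chance would predict.

```python
import math


def cross_k(points_a, points_b, radius, volume):
    """Naive cross-type Ripley's K for two sets of 3D points.

    Counts A-B pairs no further apart than `radius` and scales by the
    observation volume. No edge correction is applied (a simplification
    assumed for this sketch).
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    pairs_within = sum(
        1 for a in points_a for b in points_b if math.dist(a, b) <= radius
    )
    return volume * pairs_within / (len(points_a) * len(points_b))


def csr_expectation(radius):
    """Expected K under complete spatial randomness in 3D: (4/3) * pi * r^3."""
    return 4.0 / 3.0 * math.pi * radius ** 3


# Hypothetical foci coordinates in micrometres (not data from the study).
cdk4_foci = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
pdgfra_foci = [(0.1, 0.0, 0.0), (9.0, 9.0, 9.0)]
nuclear_volume = 1000.0  # cubic micrometres, assumed for illustration

observed = cross_k(cdk4_foci, pdgfra_foci, radius=0.2, volume=nuclear_volume)
expected = csr_expectation(0.2)
print(observed, expected)
```

In practice, significance would usually be judged against simulated random point patterns within the same nuclear volume rather than against the analytical expectation alone.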

(7) Figure S5D: on the RGB images DNA signals are hardly visible or not visible at all. I suggest the authors convert the signal channel into an enhanced greyscale image and, instead of showing counterstain with DAPI, outline nuclear borders.

Similar comment is for Figure S5G and Figure 5B: either use grayscale images for the signals or enhance the RGB signal channels so that it is clearly visible on the DAPI background.

Good suggestion. Images for these figures are now presented in greyscale for easier visualisation. For figure 5A we have supplied the images in full colour (as these now include new data (CEN7 FISH) – we felt black and white images in this instance were not as clear). We are happy to supply this version if preferred on review.

(8) There are several textual inaccuracies:

"…was confirmed by DNA FISH on metaphase chromosomes…" – the authors probably meant "metaphase spreads"

"…10% formamide in DEPC-treated water for 30 min, and then repeated with DAPI…" – the authors probably meant "stained with DAPI"

For clarity, the ordinate axis in Figure 1F has to be relabeled as "Normalised EGFR:chr 7 mean signal intensity"

Agree. The text and figure legend have been amended.

Reviewer #1 (Significance (Required)):

This paper is a good example of a report about negative results contradicting recently published data that became a common wisdom. The present work is important for our understanding of nuclear functioning and, in particular, of transcription organization within malignant cell nuclei. I fully support publication of this work.

Reviewer #2 (Evidence, reproducibility and clarity (Required)):

In this study, Purshouse et al. performed super-resolution microscopy to (1) investigate the cytogenetic features of amplified oncogenes; and (2) assess the quantitative relationship between oncogene expression and DNA copy number. The authors found that extrachromosomally amplified DNA in cancer cells does not form transcriptional condensates and the transcriptional output of amplified DNA is largely proportional to the DNA copy number. These findings suggest that high-level gene amplification is sufficient to drive transcriptional amplification.

I. Regarding the 1st conclusion that ecDNAs (also referred to as double-minute chromosomes or DMs) do not form condensates, the authors provide two lines of evidence. First, individual ecDNA circles containing the same amplified DNA do not cluster together (Figure 2). Second, ecDNA circles containing different amplified DNA also do not cluster together (Figure 3). Both observations were quantified by the average distance between ecDNA circles detected by DNA-FISH. If the authors had only shown that ecDNA circles containing the same DNA do not cluster together, this could not rule out the possibility that some FISH spots represent tightly clustered DMs/ecDNAs within 200nm that cannot be resolved by microscopy. However, the observation that different ecDNA circles (which can be resolved using different FISH probes) do not co-localize with each other provides compelling evidence for the lack of apparent ecDNA/DM clustering. Overall I consider these data convincingly support the conclusion that ecDNAs/DMs do not form condensates. I have one question about Figure 2 and one suggestion.

Ia. How did the authors distinguish between large FISH spots and two clustered FISH spots?

Please see also response to Reviewer 1, Question 6. We feel the two-colour FISH was essential in controlling for this possibility, and refer you further to our response to your comments to 1b below.

Can the authors distinguish between large FISH spots representing ecDNA condensates and those reflecting HSRs (homogeneously staining regions)?

Please also refer to answer to Reviewer 1, Point 3. Due to this concern, we made sure to include cell lines that harboured entirely (E28, E25) or mostly (E26) ecDNA, and the additional cell line, E20, also appeared to harbour only ecDNA. Had we observed close clustering, our first question would have been whether these were truly clustered ecDNA or HSRs. In addition, a criterion of our Ripley’s K function was that we only included nuclei with >20 loci (See methods), which would have likely excluded any nuclei in which only an HSR was present.

Ib. To further strengthen the conclusion, the authors may perform the same analysis on at least another cell line with multiple ecDNA species. (Currently this was only done on the E25 cell line.) In my opinion, this experiment is not essential but will significantly strengthen the conclusion.

We thank the reviewer for this suggestion, and have performed the same analysis on another cell line, E20, that harbours CDK4 and PDGFRA ecDNA. Interestingly, we observed that ~10% of ecDNA had colocalised CDK4 and PDGFRA on metaphase spreads, suggesting the two genes are on the same ecDNA molecule. We observed a similar proportion of interphase nuclei with this pattern, and Ripley’s K confirmed these nuclei had colocalised CDK4/PDGFRA foci. This gives confidence that 3D Ripley’s K is able to identify true colocalisation. We note that no other nuclei (22/24) had significant clustering at <300nm when considering CDK4 and PDGFRA together.

We note in addition that 4/24 nuclei had clustering of CDK4 foci at 200nm. We reviewed these nuclei, and believe that these represent double minutes, and have included a representative image in the figures.

We have added these data to Figure 3 and Supplementary Figure 3, and to the text in the Results and Discussion.

The authors performed additional experiments showing that ecDNAs/DMs do not co-localize with transcriptional condensates (Figure 4). Given the somewhat ambiguous criteria for the identification of transcriptional condensates and the promiscuity of their biological nature (see for example, the review of McSwiggen et al., Genes & Dev. 2019), I consider these data to be informative though not as convincing as the direct analysis of ecDNA/DM clustering.

We acknowledge the reviewer’s point and have amended the Results subheading from ‘condensates’ to ‘hubs’, i.e. ‘Transcriptional hubs are few and do not colocalize with ecDNA in GBM stem cells’. We have altered ‘condensates’ to ‘large PolII foci/hubs’ as appropriate throughout the manuscript to be more specific.

We have clarified our description of the literature around ecDNA and colocalisation with RNAPolII (Discussion, para 5). It is inherently challenging to control for varying ecDNA number and chance colocalisation of ecDNA and PolII foci, hence our focus on large PolII foci. We have endeavoured to clearly outline our methods (see last bullet point of response to reviewer 1, point 5) so this is clear to the reader and have added text to highlight the limitations in the discussion.

II. Regarding the 2nd conclusion that the transcriptional output of amplified DNA is largely proportional to the DNA copy number, the authors assessed both the transcriptional efficiency (nascent RNA FISH) and the transcriptional yield of amplified ecDNA (RNA-Seq). To assess transcriptional efficiency (Figure 5), the authors performed dual DNA/RNA-FISH to (1) measure the frequency of co-localization of amplified oncogene (EGFR) and nascent RNA; (2) compare the frequency of active transcription of extrachromosomal and intrachromosomal copies of the same oncogene. I have a few comments about these data and their interpretation.

IIa. Figure 5B and 5C suggest that the transcriptional efficiency of ecDNA circles is dependent on the sequence of amplified DNA (as it varies between different glioblastoma lines) but not on the number of ecDNA circles (demonstrated by the linear relationship between actively transcribing and total ecDNA). This observation provides a great example showing that the genetic sequence of amplified DNA plays a big role in the transcriptional output. This example demonstrates that one cannot naively attribute differential gene expression in different cancer cell lines (even after normalization of gene copy number) to putative epigenetic changes and ignore the genetic variation.

IIb. The linear relationship between DNA and RNA foci in Figure 5C does not exclude the possibility that tightly clustered ecDNAs/DMs, which only produce single DNA foci, may be transcribed more frequently. The authors should comment on this.

There is the possibility that we are unable to resolve multiple clustered DNA and RNA foci. We refer to our comment to Reviewer 1, Question 6, and note our addition to the text highlighting this caveat (Discussion, para 3). We have increased the number of biological replicates (n=3 – see new Figure 5B) and performed further RNAseq/WGS analysis (see response to IId. below) of SNPs which we feel significantly adds to this section of the study.

IIc. In Figure 5D, the authors analyzed the transcriptional efficiency of intra-chromosomal and extra-chromosomal EGFR gene copies. The heuristic classification of intra- versus extra-chromosomal EGFR gene copies (based on the number of total EGFR foci in a single cell) is less than satisfactory. I wonder whether the authors could redo the experiment with additional FISH probes against either Chr7-centromeric DNA or another gene on Chr7p next to the endogenous EGFR locus to determine the endogenous EGFR loci with better certainty. This will enable a direct measurement of the transcriptional efficiency of endogenous EGFR and extrachromosomal EGFR in the same cell.

We agree, and have repeated this experiment (with 3 biological replicates) with a Centromere 7 control probe (See figure 5A for representative images), and included these data in evaluating overall RNA:DNA FISH ratios (Figure 5B). We then used these data to correlate RNA:DNA FISH ratio against the proportion of ecDNA (RNA:DNA ratio = number of RNA foci / number of DNA foci. EcDNA proportion = (number of EGFR DNA foci – number of CEN7 foci) / number of EGFR DNA foci) by Spearman’s correlation. The data from these technical replicates is included in Figure 5 – Supplementary file 1, and data from replicate 1 is shown in figure 5C. We agree this is a more accurate measure of ecDNA transcriptional efficiency, and hope this is clearer for the reader. We have removed the previous figures and supplemental figure, and amended the methods, results and discussion to reflect this.
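
The two per-nucleus quantities defined above, and their rank correlation, can be sketched as follows. This is an illustration only, not the study's analysis code: the function names are hypothetical, the foci counts are invented, and Spearman's rho is computed here in plain Python as the Pearson correlation of average ranks.

```python
import math


def rna_dna_ratio(n_rna_foci, n_egfr_dna_foci):
    """RNA:DNA ratio = number of RNA foci / number of EGFR DNA foci."""
    return n_rna_foci / n_egfr_dna_foci


def ecdna_proportion(n_egfr_dna_foci, n_cen7_foci):
    """EcDNA proportion = (EGFR DNA foci - CEN7 foci) / EGFR DNA foci."""
    return (n_egfr_dna_foci - n_cen7_foci) / n_egfr_dna_foci


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-nucleus counts: (RNA foci, EGFR DNA foci, CEN7 foci).
nuclei = [(4, 20, 2), (10, 40, 2), (1, 4, 2), (6, 30, 3)]
ratios = [rna_dna_ratio(rna, dna) for rna, dna, _ in nuclei]
proportions = [ecdna_proportion(dna, cen7) for _, dna, cen7 in nuclei]
print(spearman_rho(proportions, ratios))
```

A rho near zero across nuclei would be consistent with transcription per gene copy not depending on the fraction of copies carried on ecDNA.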

IId. In Figure 5E-G, the authors further analyzed the transcriptional yield of endogenous EGFR and amplified EGFR derived from RNA-Seq data. The authors used a clever trick to separate the transcriptional output of wild-type EGFR from amplified EGFR vIII (with exon 2-7 deletion). The result supports the conclusion that the normalized transcriptional yield of the deleted exons is similar to that of amplified exons. But the large variation in the RNA:DNA ratio of non-deleted exons in the ecDNA group is puzzling. I suspect this may be due to the size variation of individual exons.

The authors described the analysis as follows in the Method section:

WGS and RNA-seq read counts were normalised to the size of the ecDNA/chromosome blocks and EGFR exons, respectively. Normalized RNA-seq read counts of each exon were divided by the normalised WGS read counts of the corresponding ecDNA/chromosome region to give a normalized RNA-seq count for each exon, and analysed in Graphpad Prism v9.0.

Why is it necessary to normalize WGS/RNA-Seq counts to the size of ecDNA blocks or exonic size? (1) One can simply normalize the RNA read counts to the DNA read counts in each exon and compare the ratio across different exons. The ratio automatically controls for the exonic size, sequencing depth, etc. (2) A potentially more specific analysis is to calculate the ratio of RNA reads joining exon 1 to exon 8 (from EGFR vIII) and RNA reads joining exon 1 to exon 2 (from wildtype EGFR), and compare the RNA ratio to DNA ratio calculated from the average read depth in EGFR minus the average read depth in exons 2-7 (EGFR vIII) and the average depth in exons 2-7 (wildtype EGFR).

We agree that counting reads in individual exons is a better way to perform the intended analysis. We have performed this analysis in the E26 and GBM39 cell lines directly comparing WGS and RNAseq counts in exons 1-28 (with exons 2-7 = chromosomal, exons 1, 8-28 = predominantly ecDNA). As expected, the result is comparable to that of our previous method. We have replaced the AmpliconArchitect block analysis with this exon-based analysis in the manuscript and methods.
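
A minimal sketch of the exon-based comparison described above is given below. It assumes that read counts per exon have already been obtained for both assays; the function names and the counts are hypothetical, and any normalisation for library size is assumed to have been handled upstream. Exons 2-7 are treated as chromosomal and the remaining exons as predominantly ecDNA, following the grouping stated in the response.

```python
from statistics import mean

# Exons 2-7 are absent from the ecDNA-borne EGFRvIII, so reads mapping to
# them are treated as chromosomal; all other exons are predominantly ecDNA.
CHROMOSOMAL_EXONS = set(range(2, 8))


def per_exon_ratio(rna_counts, dna_counts):
    """Return the RNA:DNA read-count ratio for each exon with DNA coverage."""
    return {
        exon: rna_counts[exon] / dna_counts[exon]
        for exon in rna_counts
        if dna_counts.get(exon, 0) > 0
    }


def group_means(ratios):
    """Average the per-exon ratios within the chromosomal and ecDNA groups."""
    chromosomal = [r for exon, r in ratios.items() if exon in CHROMOSOMAL_EXONS]
    ecdna = [r for exon, r in ratios.items() if exon not in CHROMOSOMAL_EXONS]
    return {"chromosomal": mean(chromosomal), "ecDNA": mean(ecdna)}


# Hypothetical read counts keyed by exon number (not data from the study).
rna = {1: 100, 2: 10, 3: 12, 8: 200, 9: 180}
dna = {1: 50, 2: 5, 3: 6, 8: 100, 9: 90}

summary = group_means(per_exon_ratio(rna, dna))
print(summary)
```

Similar group means would indicate that transcription per gene copy is comparable between the chromosomal and ecDNA-borne exons.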

To confirm these new data were comparable to our previous analysis, we performed correlation analysis on the normalized RNA counts for the original Amplicon Architect (AA) vs revised exons approach. All correlations were positive (Pearson correlation, p<0.05 in all cases – plots shown in Author response image 1).

Author response image 1.

Regarding using exon 1-2 and 1-8 spanning reads as a specific measurement of extrachromosomal and chromosomal DNA, we were interested to perform this analysis. However, we noted technical issues that limit this. Notably, exon 1 of EGFR sits at the TSS which contains a CpG island. We noticed that there is a drop in sequencing read coverage here and genome-wide across CpG islands in the WGS data due to GC-bias in the sequencing or sample preparation. Furthermore, the polyA-enriched RNA sequencing data is heavily biased toward the 3’ end of the gene. See Author response image 2. These two technical biases mean that the comparison of ratios of spliced reads to genome coverage at exons will not be meaningful. We note that these biases could also influence the RNA/DNA ratio at individual exons (Author response image 1), which likely explains some of the variability in normalised RNA-seq counts between exons. However, the chromosomal exons 2-7 are toward the 5’ end of the gene, so if anything their RNA-seq reads would be underestimated compared to the mostly extrachromosomal exons >7. Importantly, the allelic ratio of SNPs would not suffer from these biases, as ratios of RNA and WGS reads across individual SNPs are compared directly to each other and single basepair changes are unlikely to have a technical effect on sequencing coverage. See Author response image 2 for this analysis.

Author response image 2.

I suggest two additional calculations that can further strengthen this analysis. (1) If there are polymorphic sites in the amplified sequence, the authors can calculate the fraction of allele-specific transcripts and the fraction of allele-specific genomic DNA from the sequencing data, and then calculate the RNA:DNA ratio. As the amplified DNA is derived from one parental homolog, this analysis can be done even in GBM lines with amplified wtEGFR. (2) If there are other co-amplified genes, the authors can perform allele-specific RNA:DNA analysis on those genes. This analysis will be informative as the co-amplified gene(s) may not be under positive selection.

We thank the reviewer for this suggestion which we believe strengthens our conclusions. We called germline variants in patient control (blood) samples using strelka2 and selected expressed exon-overlapping polymorphisms in the amplified region to calculate the allele frequencies in the WGS and RNA-seq samples (added as Figure 5 Supplementary Figure 5F). In the E26 cell line, only EGFR and LANCL2 (200kb 3’ of EGFR) are sufficiently expressed and have overlapping polymorphisms present on ecDNA (note only a proportion of LANCL2 polymorphisms are present on a subset of ecDNA). Nevertheless, the allelic ratio in the DNA/RNA is close to one for all polymorphisms. We also performed the same analysis on E28, which harbours expressed SNPs in both LANCL2 and VSTM2A, also showing allelic ratios close to 1, in line with our previous analysis. These data have now been added to the results and Figure 5D.
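
The allele-specific comparison can be sketched as below. This is an illustrative calculation rather than the pipeline used in the study: it assumes reference and alternate read counts at each polymorphic site are already available from the WGS and RNA-seq data, the function names and counts are hypothetical, and the ratio is defined here as the RNA allele fraction divided by the DNA allele fraction (the direction of the ratio is an assumption).

```python
def allele_fraction(alt_reads, ref_reads):
    """Fraction of reads at a site that carry the alternate allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def rna_to_dna_allelic_ratio(rna_alt, rna_ref, dna_alt, dna_ref):
    """RNA allele fraction divided by DNA allele fraction at one site.

    A value near 1 means the amplified allele is transcribed in proportion
    to its share of the DNA copies.
    """
    dna_fraction = allele_fraction(dna_alt, dna_ref)
    if dna_fraction == 0:
        raise ValueError("alternate allele absent from DNA reads")
    return allele_fraction(rna_alt, rna_ref) / dna_fraction


# Hypothetical counts at one polymorphic site (not data from the study):
# the amplified allele makes up 90% of both DNA and RNA reads.
ratio = rna_to_dna_allelic_ratio(rna_alt=90, rna_ref=10, dna_alt=45, dna_ref=5)
print(ratio)
```

Because both fractions are measured at the same base, coverage biases that affect a whole exon or gene end largely cancel out of this ratio.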

The lack of non-tumour patient controls prevents us from calling such polymorphisms in the GBM39 cell line.

Additional comments:

IIe. There is apparent variation in the intensities of both DNA-FISH and RNA-FISH (see e.g., Figure 5A). Does this reflect copy-number variation of amplified DNA in single ecDNA/DM? Can the authors quantify such variation as well as its correlation with the efficiency of transcription?

We have subjectively not noted a correlation between RNA FISH signal intensity and its associated DNA FISH locus. We have now undertaken a pilot comparison of RNA/DNA FISH foci signal intensity in E26 and E28 cell lines using the raw data from Figure 5A. We observed no correlation in this small sample. Based on this pilot analysis, we do not believe further assessment of this would be useful or alter our main conclusions. We have also commented on the variation in the text (See Reviewer 1 Point 6).

There are always slight variations in intensity due to differences in how a FISH probe accesses the nucleus. In addition, differences in these images may reflect how these representative images were prepared for the manuscript (i.e. differences between cell lines). Raw imaging data can be made available and may further reassure the reviewer.

III. I suggest the authors cite more original research papers related to ecDNA/double-minute amplification.

Original report of double-minute chromosomes

Cox, Yuncken, Spriggs, Lancet, 1965.

EM analysis of double-minutes

Hamkalo et al., PNAS 1985.

First cytogenetic analysis of extrachromosomally amplified EGFR in primary glioma:

Vogt et al., PNAS 2004

First study showing complex double minutes composed of multiple non-syntenic segments in a single tumor

Gibaud et al., Hum Mol Genet 2010.

Dynamic evolution of multiple oncogenic ecDNA/DMs in glioblastomas:

Snuderl et al., Cancer Cell 2011.

Szerlip et al., PNAS 2012.

First single-cell genomic analysis of amplified DNA in glioblastomas:

Francis et al., Cancer Discovery 2014

We appreciate these recommendations. We have incorporated the following references where appropriate into the introduction (as they appear chronologically in the text):

Snuderl et al., Cancer Cell 2011.

Szerlip et al., PNAS 2012.

Cox et al., Lancet, 1965

Hamkalo et al., PNAS, 1985

Vogt et al., PNAS, 2004

We have added a sentence associated with these references: ‘EcDNA can be composed of multiple genetic fragments generated as a result of chromothripsis’

Gibaud et al., Hum Mol Genet, 2010

In association with this we felt it was important to add two further key references to ecDNA formation/structure:

Shoshani et al., Nature, 2021

Rosswog et al., Nature Genetics, 2021

We have not included the Francis et al. reference. While this is an important original research paper, it focuses more on EGFRvII, which is not of central relevance to this paper.

Cross-commenting:

I agree with most comments from the other two reviewers and am happy to discuss my comments if needed.

Reviewer #2 (Significance (Required)):

Gene amplification and double-minute chromosomes were both discovered more than 50 years ago. But the molecular and genetic features of amplified DNA, including their origin, remain incompletely understood. There is a resurgence of interest in extrachromosomally amplified DNA (originally termed double-minute chromosomes) in cancer cells due to their connection to tumor evolution and therapy resistance. Several recent studies showed that ecDNA/DMs have unique chromatin organization and epigenetic states that promote gene transcription. Some of these studies suggested that epigenetic alterations, including enhancer rewiring, may be more important than putative gene copy-number amplification. Contrary to these studies, the current study suggests a more classical (genetic) and more intuitive model of amplified DNA, highlighting the determining contribution of gene copy number to transcriptional output. In comparison to other ecDNA papers, the current study does not have as many "novelty" factors, such as the usage of novel sequencing techniques or the proposed novel mechanisms of transcriptional upregulation (e.g., transcriptional condensates/hubs). However, I have found the data and analyses presented in the current study to be more convincing. This is because of two reasons. First, amplified DNA is genetically unstable and often displays cell-to-cell variation. Such heterogeneity cannot be resolved by bulk measurements, including DNA, RNA, epigenome, or Hi-C sequencing. By contrast, single-cell imaging can directly capture such variation. Second, the unique epigenetic/transcriptional features of ecDNA were often inferred from data generated using complementary assays (including different bulk sequencing measurements and imaging) on different cells. These experimental assays are often subject to different technical or biological variations that are difficult to control. 
By contrast, the current study performs simultaneous measurements of the number and transcriptional efficiency of ecDNA/DMs using a single experimental assay (super-resolution FISH), which is a more robust strategy.

As my expertise is in genomic analysis, I know very well the limitations of sequencing assays and bioinformatic analysis, especially in the resolution of genetic/epigenetic features of repetitive DNA. For example, one cannot distinguish between extra-chromosomal (ecDNA/DM) or intra-chromosomal (homogeneous staining regions) amplified DNA based on bulk sequencing data, despite claims made in some recent studies (such as Wu et al. 2019). This is both because of the dynamic conversion between DMs and HSRs and because rearrangement junctions detected from sequencing data may be either within a single copy of amplified DNA (as in DMs) or between different amplified DNA copies as in HSRs. As my research focuses on chromosomal instability and cancer genomic rearrangements, I also have good knowledge of cancer genomics, cytogenetics, and mechanisms of chromosomal instability. Overall I have found the authors' analysis to be solid, although the conclusion could be strengthened with the analysis of more samples and with a comparison between ecDNA/DMs and HSRs.

We thank the reviewer for their detailed comments. We agree with the reviewer’s comment regarding more samples and have outlined above the specific additional experiments, which we agree significantly strengthen the analysis.

Regarding ecDNA vs HSRs, we refer the reviewer to our comments to Reviewer 1, Question 3. The data presented here cannot, and do not seek to, make a comparison between HSRs and ecDNA, and indeed have sought to ensure all analysis focuses on ecDNA characteristics. We make this clear in the updated manuscript. Exploration of HSRs and ecDNA transitions, as well as cell cycle impact, are areas ripe for future studies.

Reviewer #3 (Evidence, reproducibility and clarity (Required)):

Recent work highlights the importance of ecDNA in cancer, including its role in enhanced oncogene transcription. Two independent studies recently demonstrated that ecDNA can form hubs or clusters and, in doing so, lead to further enhanced transcription. The authors are addressing an important problem. Understanding when hubs do, and do not, form, and examining the context of cell lineage, copy number, and other factors, including cell cycle, is an important area of study. Further, these entities appear to be dynamic, as opposed to static. Therefore, there is considerable merit in studying this topic.

The authors have characterized a set of glioblastoma neurosphere models, using spatial, transcriptomic, and computational methods to conclude that ecDNA may not require hubs to drive oncogenic transcription, which can result purely from elevated copy number driven by non-chromosomal inheritance. There is no doubt that the elevated copy number plays a key role, however, understanding this complex behavior related to hubs is worthy of study.

A number of major concerns are raised:

1) The title doesn't seem to cover the scope of work. It really is an analysis of GBM stem cell cultures, not a broad analysis of cancer. The title should be more reflective of the work.

We provide a revised title: ‘Oncogene expression from extrachromosomal DNA is driven by copy number amplification and does not require spatial clustering in glioblastoma stem cells’

2) The models used are described as being characterized by the GCGR, but a trip to the website to try to learn more about the models indicates: "COMING SOON". This needs to be rectified.

We agree, and the GEO Accession numbers will be added to the Key Resources so all key data relating to this manuscript are available at the point of publication. We note that the GCGR are characterizing these cell lines as part of an upcoming resource, so have also referenced this in the Key Resources.

3) If the authors do not believe that hubs play any role, then the right experiment is to analyze cancer cell line models that have been studied by others and shown to have hubs. If the authors do not want to do that, they need to modify their claims and restrict them to the models they studied.

We propose the following to make our claims in clear connection to the models studied (in addition to the title change noted above), and have amended the Results sections headers as follows:

  1. EcDNA do not cluster in the nucleus -> EcDNA in GBM stem cells do not cluster in the nucleus

  2. Transcriptional condensates are few and do not colocalize with ecDNA -> Transcriptional condensates are few and do not colocalize with ecDNA in GBM stem cells

  3. Levels of transcription from ecDNA reflect copy number but not enhanced transcriptional efficiency -> Levels of EGFR transcription from ecDNA reflect copy number but not enhanced transcriptional efficiency

We have checked the manuscript for other opportunities to make clear the models used, e.g. ‘in GBM stem cells’; see Discussion para 1 and para 5. We have added a point of clarification to the Results paragraph ‘Transcriptional hubs….’ – ‘We next assessed whether ecDNA foci, which, whilst not clustered, share regional localization, …..’ (see response to Reviewer 3, Question 5 below).

We carefully considered our approach to addressing the question of ecDNA transcriptional hubs, namely a 3D approach with mathematical modelling to control for ecDNA number and nuclear size. We did not feel that the methods used in previously cited papers (2D correlation analysis) were suitable for evaluating 3D clustering at transcriptionally relevant distances measured in nanometres rather than pixels. We have clearly outlined the aims and models used, made the code openly available and made sure our claims are clearly stated for the reader to consider accordingly. Indeed, the addition of another cell line (E20), in which some ecDNA carry colocalised CDK4 and PDGFRA (in similar proportions in metaphase spreads and interphase nuclei), and in which this colocalisation was captured by 3D Ripley’s K, gives us confidence in our tool. We hope this will be a useful resource and set of analytic tools for the field.

We have outlined in the Discussion why our results may differ, and why our approach cannot necessarily be used in other cell lines. For example, COLO320 harbours 3 copies of MYC on each ecDNA, and the ecDNA are very large (Discussion, para 4).

4) In previous studies, ecDNA hub formation was found to occur in about half of the cells over 48 hours of live-cell imaging and typically lasted five to six hours. Are the single-time point interphase snapshots shown here able to capture events at this frequency?

We believe the reviewer is referring to Yi et al., Cancer Discovery, 2021 (Figure 3F). This is an important study of live-cell labelling of ecDNA with which comparison is necessarily undertaken throughout the manuscript. We would still expect to observe clustering (<200nm) in the data presented within our manuscript if transcriptional-range clustering was occurring in interphase nuclei. We have added a sentence to clarify the method selected here (Discussion, para 3): ‘…using Ripley’s K to objectively call instances of significant clustering at given distances from ecDNA x,y,z coordinates.’

In addition, we note the fixed-cell analysis of ecDNA hub formation in a range of cell lines in Hung et al., Nature, 2021 (Figure 1), which we have included as a central reference point throughout the text, highlighting the cell lines of particular relevance to this study (see Discussion). This gives us further confidence that fixed-cell imaging is an appropriate modality to study whether ecDNA clustering is a key phenomenon in ecDNA gene regulation.

Further, it may be difficult to generalise the timeline of ecDNA hub formation. The PC3 cell line (live-cell imaging in Yi et al., 2021), along with other established cell lines studied in the context of ecDNA, may have inherent differences, such as cell cycle length, mitotic index, physiology, growth characteristics etc in comparison to primary GBM cells which might affect these frequencies. We chose to use patient-derived primary GBM cultures that mirror human disease, and have focused on quantitative spatial analysis. We have explored this in the Discussion, para 4.

5) ecDNA hubs show preferential colocalization with RNAPII and therefore, as these results do not find ecDNA hubs, it is not surprising that there was no colocalization between ecDNA molecules and RNAPII.

We have amended the first sentence of Results section three to make this clearer: ‘We next assessed whether ecDNA foci, which, whilst not clustered, share regional localization…’. It remained important to evaluate if this regional localisation was associated with RNAPol2, as this might have suggested that, while not in ecDNA hubs, ecDNA were instead associated with large RNAPol2 foci of the nucleus in ecDNA/RNAPII hubs.

6) The E26 EGFR line (Figure 2D) as well as the E25 CDK4/PDGFRA ecDNA line shows evidence of clustering of molecules (Figure 3D and 3E). The measuring method of averaging distances would undervalue those as the signal is drained in a nucleus with many ecDNA molecules.

We have supplied a sample of single slice images of E25, E26 and E28 nuclei, and of the E20 nuclei shown without (A) and with (B) colocalisation for review. We selected maximum intensity projection (MIP) images for the main figures for ease of visualisation of these small foci and the figure legend notes that the images provided are maximum intensity projections of the analysed 3D Z-stacks. As a result, this may be seen as clustering if two foci align closely in the z plane.

In addition to this, two or more foci may be closely located by chance, particularly when there are many foci in a single nucleus. Accounting for this chance through mathematical modelling is the basis of the Ripley’s K tool, as this allows number of foci and size of volume to be accounted for. While we did not observe clustering at distances in keeping with transcriptional contacts, we did observe regular non-random distribution of loci within nuclei at >300nm distances. We propose an alternative hypothesis for ecDNA distribution that does not relate to coordinated transcription.
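
The idea of accounting for chance proximity can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repositories listed in the Data Availability Statement); it assumes an idealised spherical nucleus, applies no edge correction, and all function names and parameters are hypothetical. It computes a naive 3D Ripley's K at a single radius and compares it with the values obtained for the same number of foci placed uniformly at random in the same volume.

```python
import numpy as np


def ripley_k(points, radius, volume):
    """Naive 3D Ripley's K at one radius (no edge correction)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all foci.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Ordered pairs (i != j) within `radius`; subtract the n self-pairs.
    close_pairs = int((dist <= radius).sum()) - n
    return volume * close_pairs / (n * (n - 1))


def random_points_in_sphere(n, sphere_radius, rng):
    """Draw n points uniformly from a ball (assumed, idealised nucleus)."""
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = sphere_radius * rng.random(n) ** (1.0 / 3.0)
    return directions * radii[:, None]


def clustering_p_value(points, radius, sphere_radius, n_sim=999, seed=0):
    """Monte Carlo test: is K(radius) larger than expected by chance?"""
    rng = np.random.default_rng(seed)
    volume = 4.0 / 3.0 * np.pi * sphere_radius ** 3
    observed = ripley_k(points, radius, volume)
    # Null distribution: same number of foci, same volume, random positions.
    simulated = np.array([
        ripley_k(random_points_in_sphere(len(points), sphere_radius, rng),
                 radius, volume)
        for _ in range(n_sim)
    ])
    p = (1 + int((simulated >= observed).sum())) / (n_sim + 1)
    return observed, p
```

With coordinates in μm, `clustering_p_value(foci_xyz, radius=0.2, sphere_radius=5.0)` would ask whether more pairs of foci lie within 200 nm than random placement predicts. Because the simulated nuclei share the same boundary as the observed one, edge effects bias both equally, which is why this sketch can omit an explicit edge correction.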

We acknowledge the concerns expressed about averaging distances. To give a more complete picture, we have therefore included Cumulative Frequency Distribution (CFD) graphs, which include the shortest distance between all the indicated foci, measured across all nuclei measured, to show clearly the distribution of all available data. These are in Figure 2 and Figure 3 for all cell lines analysed.

We agree that the averaging data can only give an overall impression. Indeed, for a nucleus with many foci, the mean shortest interprobe distances would in fact be smaller. This is intended to give an overall picture of ecDNA spatial relationships, which is then quantified in detail using the Ripley’s K function analysis. This is clearly explained for the reader. We have therefore retained the mean shortest and single shortest foci data in the Figure 1 – supplemental figure 1 – and amended the results text to give greater clarity.
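
The dependence of shortest interprobe distance on the number of foci, and the cumulative frequency view of those distances, can be sketched as follows. This is an illustration under stated assumptions (a spherical nucleus, foci placed uniformly at random, hypothetical function names) and not the analysis code used in the study.

```python
import numpy as np


def nearest_neighbour_distances(points):
    """Shortest distance from each focus to any other focus."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore each focus's distance to itself
    return dist.min(axis=1)


def ecdf(values):
    """Empirical cumulative frequency distribution of a set of distances."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


def random_points_in_sphere(n, sphere_radius, rng):
    """Draw n points uniformly from a ball (assumed, idealised nucleus)."""
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = sphere_radius * rng.random(n) ** (1.0 / 3.0)
    return directions * radii[:, None]


def mean_nn_distance_random(n_foci, sphere_radius, n_nuclei=50, seed=0):
    """Average mean shortest distance over simulated random nuclei."""
    rng = np.random.default_rng(seed)
    means = [
        nearest_neighbour_distances(
            random_points_in_sphere(n_foci, sphere_radius, rng)
        ).mean()
        for _ in range(n_nuclei)
    ]
    return float(np.mean(means))
```

Comparing `mean_nn_distance_random(200, 5.0)` with `mean_nn_distance_random(20, 5.0)` shows the mean shortest distance falling as foci become more numerous, even with no clustering at all, which is why a per-nucleus null model is needed before short distances are read as clustering.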

A general comment – debate on data and interpretation is a critical part of science. The paper has value in stimulating debate. However, the oppositional tone is distracting. The paper would be more effective and impactful, at least in this reviewer's view, if it were written from the standpoint of trying to understand the complex dynamics of ecDNA biology.

We agree an oppositional tone is unhelpful. Our intention was very much to present our findings in an objective manner with reference to key previous studies. We have made adjustments throughout the text to stimulate debate as to why discrepant conclusions have been reached. We did not feel these were unfairly critical or partisan – but merely our attempt to explain why different results have been obtained in our study.

Reviewer #3 (Significance (Required)):

Please see above.

Author note:

Further amendments:

Recent work proposing that ecDNA act as mobile super-enhancers for chromosomal targets has raised the possibility that ecDNA can actively recruit RNA PolII to drive ‘ecDNA-associated phase separation’ (Zhu et al. 2021). – This previously included a reference to Adleman and Martin, 2021, but on further review we have edited this to cite the primary reference only, as shown here.

We noted an error in para 4 of the discussion:

‘…results in large (approx. 1.75Mb) ecDNA (Hung et al., 2021; Wu et al., 2019)’.

This should have read ‘1.75 μm’, and so we have updated this as follows:

‘…results in large (4.328Mb, approx. 1.75μm diameter) ecDNA (Hung et al., 2021; Wu et al., 2019)’.

Associated Data


    Data Citations

    1. Purshouse et al 2022. WGS and RNA-seq data E26,E28. NCBI Gene Expression Omnibus. GSE215420

    Supplementary Materials

    Figure 1—source data 1. Statistical data for Figure 1 and Figure 1—figure supplement 1.

    Median values for number of EGFR DNA FISH signals per metaphase spread for E26 and E28 cell lines. Data are for Figure 1D. Mean Chr7 (Texas Red) and EGFR (FITC) DNA FISH signal intensity in bins eroded from the periphery (1) to the centre (5) of the nucleus of neural stem cell (NSC), E26 and E28 cells. p-Values from Kruskal-Wallis test. Data are for Figure 1G and Figure 1—figure supplement 1.

    Figure 2—source data 1. Statistical data for Figure 2—figure supplement 1.

    Statistical analysis of data for Figure 2—figure supplement 1B-E. Mean shortest interprobe distance and shortest interprobe distance in E26 and E28 cell lines. EGFR-EGFR interprobe distance (μm) = median values shown. The statistical significance of the data distributions between E26 and E28 were assessed with a Mann-Whitney test. n = number of nuclei.

    Figure 3—source data 1. Statistical data for Figure 3—figure supplement 1.

    Median number of CDK4 and PDGFRA DNA FISH foci in E25 cell line (n=26) nuclei. Data are for Figure 3—figure supplement 1B. Mean shortest interprobe distance and shortest interprobe distance between CDK4 and PDGFRA DNA FISH foci in E25 cell line. Statistical analysis of data for Figure 3—figure supplement 1C and D, Interprobe distance (μm) between fosmids indicated = median values shown. Value in brackets indicates adjusted p-value (adj) = Bonferroni. n=26 nuclei.

    Figure 4—source data 1. Statistical data for Figure 4 and Figure 4—figure supplement 1.

    EcDNA-large RPB1 foci distances for neural stem cell (NSC), E26 and E28 cell lines. Statistical analysis of data for DNA-ImmunoFISH (i – Figure 4E and F) and RNA-ImmunoFISH (ii – Figure 4—figure supplement 1E and F); EcDNA-large RPB1 foci distance (μm) indicated = median values shown. (iii) Median number of EGFR RNA FISH signals for NSC, E26 and E28 cell lines. (iv) Median ecDNA-large POLR2G foci distances for E28 mCherry-POLR2G cell line (Figure 4—figure supplement 1I, J). n = number of nuclei. Kruskal-Wallis and Mann-Whitney tests performed with comparisons as indicated.

    Figure 5—source data 1. Statistical data for Figure 5 and Figure 5—figure supplement 1.

    (i) RNA:DNA FISH EGFR foci ratios. Statistical analysis of data for Figure 5B, RNA:DNA FISH EGFR foci ratio = mean values shown. n = number of nuclei, total across three biological replicates. Values in brackets indicate adjusted p-value (adj) = Bonferroni. (ii) Correlation of RNA:DNA ratio and ecDNA/total foci ratio (Figure 5C), Spearman r (p-value) shown for three biological replicates. N = number of nuclei. Rep1 data shown in figure. (iii) RNA-seq/whole genome sequencing (WGS) allele frequency ratio, for Figure 5E. Median and number of SNPs per gene per cell line. (iv) EcDNA versus chromosomal EGFR exons (Figure 5F and G), Mann-Whitney test of normalized RNA counts between chromosomal and predominantly EGFR ecDNA exons. (v) Mann-Whitney test of EGFR RNA FISH foci in FACS-sorted E26 and E28 cells (Figure 5—figure supplement 1E).

    Supplementary file 1. Genomic information for FISH probes and CRISPR knockin.

    (A) Fosmid probes for DNA FISH related to STAR methods. Genome coordinates (Mb) are from the hg38 assembly of the human genome. (B) CrRNA sequence and dsDNA sequence for mCherry_PolR2G CRISPR knock-in.

    elife-80207-supp1.docx (14.4KB, docx)
    MDAR checklist

    Data Availability Statement

    WGS and RNA-seq data have been deposited on NCBI GEO under study accession number GSE215420 and are publicly available as of the date of publication. As indicated in the Key Resources, all original code has been deposited at: https://github.com/IGC-Advanced-Imaging-Resource/Purshouse2022_paper (copy archived at swh:1:rev:5b1a3920afa8e85132c94bcc6dfce94575f939ce); https://github.com/SjoerdVBeentjes/ripleyk (copy archived at swh:1:rev:1303af539403303786b6460fabef355e345ea6c9); and https://github.com/kpurshouse/ecDNAcluster (copy archived at swh:1:rev:9162a39f3c8e19e973eaedc50ad4e1d3dc570e90).



    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
