Abstract
Chromosomal instability is a major driver of intratumoral heterogeneity (ITH), promoting tumor progression. In the present study, we combined structural variant discovery and nucleosome occupancy profiling with transcriptomic and immunophenotypic changes in single cells to study ITH in complex karyotype acute myeloid leukemia (CK-AML). We observed complex structural variant landscapes within individual cells of patients with CK-AML characterized by linear and circular breakage–fusion–bridge cycles and chromothripsis. We identified three clonal evolution patterns in diagnosis or salvage CK-AML (monoclonal, linear and branched polyclonal), with 75% harboring multiple subclones that frequently displayed ongoing karyotype remodeling. Using patient-derived xenografts, we demonstrated varied clonal evolution of leukemic stem cells (LSCs) and further dissected subclone-specific drug–response profiles to identify LSC-targeting therapies, including BCL-xL inhibition. In paired longitudinal patient samples, we further revealed genetic evolution and cell-type plasticity as mechanisms of disease progression. By dissecting dynamic genomic, phenotypic and functional complexity of CK-AML, our findings offer clinically relevant avenues for characterizing and targeting disease-driving LSCs.
Subject terms: Acute myeloid leukaemia, Genomics, Cancer stem cells
An integrated single-cell multiomic analysis of complex karyotype acute myeloid leukemia characterizes intratumoral heterogeneity and highlights links to therapeutic sensitivities.
Main
Acute myeloid leukemia with complex karyotype (CK-AML) is typically characterized by three or more chromosomal aberrations and comprises 10–12% of patients with AML. The disease is associated with complex chromosomal rearrangements1, ITH, therapy resistance and poor overall survival2–4. The molecular and cellular mechanisms underlying poor response to standard induction chemotherapy are poorly understood, although frequent TP53 loss and extensive ITH as a result of genomic instability are believed to contribute to therapeutic failure2,5. Despite major clinical need, CK-AML has remained understudied at the genomic, molecular and cellular levels, largely because of technological limitations in analyzing ITH alongside widespread chromosomal complexity6.
Single-cell genomic sequencing has emerged as a promising technique to investigate ITH through somatic copy-number profiling7–10. However, copy-number profiles do not capture the full karyotypic heterogeneity in malignancies with complex structural variant patterns, such as CK-AML, because copy-balanced and complex rearrangement structures typically remain unresolved in these malignancies6,10,11. In addition, the connections between cell genotype, epigenotype, phenotype and function remain underexplored in malignancies that exhibit extensive karyotypic complexity and genetic heterogeneity, such as CK-AML. Thus, the prevalence of genetic and nongenetic mechanisms driving disease progression and resistance remains unclear12.
In the present study, we extended the understanding of patterns of ITH during CK-AML evolution and exemplified the translational relevance of single-cell clonal evolution analyses. We harnessed two single-cell multiomics frameworks: single-cell nucleosome occupancy and genetic variation analysis (scNOVA13), based on single-cell template strand sequencing (Strand-seq)14, and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq)15, which couples single-cell transcriptomics with cell-surface, protein-level measurements. Together, these frameworks linked genotype and phenotype in eight patients with primary CK-AML and two longitudinally collected samples. We combined this single-cell characterization with functional xenotransplantation assays and ex vivo drug-sensitivity profiling.
Results
Genetic complexity drives karyotype heterogeneity in CK-AML
To gain insight into the evolution of genomic rearrangements and the resulting phenotypic complexity in CK-AML, we established a single-cell multiomics framework to study heterogeneity of structural variants together with nongenetic properties at single-cell resolution. We coupled scNOVA13 with droplet-based CITE-seq15 to yield the scNOVA–CITE framework outlined in Fig. 1a (Supplementary Fig. 1a–c). To allow comprehensive insight into CK-AML genetic complexity, we generated Strand-seq libraries from bone marrow or peripheral blood cells of eight patients with primary CK-AML from diagnosis or salvage samples, five matched patient-derived xenografts (PDXs) and two matched relapse or refractory samples, with 855 single-cell genomes sequenced overall (Fig. 1a and Supplementary Table 1). Each single-cell library was sequenced to a mean of 365,436 mapped nonduplicate read-pairs, amounting to ~0.017× coverage per cell (Supplementary Table 2 and Supplementary Fig. 1a).
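As a back-of-envelope illustration of how the per-cell coverage figure relates to the read-pair count, the following sketch converts read pairs to fold coverage. The read length (75 bp per mate) and the genome size (3.2 Gb) are assumptions chosen for illustration; neither value is stated in this section, and the function and constant names are hypothetical.

```python
def fold_coverage(read_pairs, read_length_bp, genome_size_bp):
    """Approximate fold coverage as sequenced bases divided by genome size."""
    sequenced_bases = read_pairs * 2 * read_length_bp  # two mates per read pair
    return sequenced_bases / genome_size_bp


# Mean mapped nonduplicate read pairs per cell, as reported in the text.
MEAN_READ_PAIRS = 365_436

# Assumptions (not reported in this section).
ASSUMED_READ_LENGTH_BP = 75
ASSUMED_GENOME_SIZE_BP = 3.2e9

coverage = fold_coverage(MEAN_READ_PAIRS, ASSUMED_READ_LENGTH_BP, ASSUMED_GENOME_SIZE_BP)
print(f"~{coverage:.3f}x per cell")
```

Under these assumptions the estimate comes to roughly 0.017×, in line with the reported value; other read lengths or genome-size conventions would shift it slightly.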
Capitalizing on the Strand-seq data generated, we first focused on the eight diagnosis or salvage CK-AML samples. Performing structural variant detection with the single-cell tri-channel processing (scTRIP) method16, we identified an average of 18.9 (±2.9 s.d.) chromosomal alterations per cell, including interstitial structural variants, terminal gains and losses, whole-chromosome aneuploidies, balanced structural variants and complex chromosomal rearrangements (Fig. 1b and Supplementary Table 3). In each patient with CK-AML, 3–12 chromosomes harbored at least one chromosomal alteration present at high cell fraction (>80%) (Fig. 1b and Supplementary Table 3), with CK282 exhibiting the highest number of alterations (n = 50.3, mean per single cell). Although chromosomes 5 and 12 were most frequently mutated at a high cell fraction (present in 5 out of 8 patients), chromosomes 10, 13, 19 and 22 did not show detectable high cell fraction aberrations in any patient (Fig. 1b and Supplementary Table 3). These data underscore the extensive karyotypic complexity of CK-AML.
Analysis of clonal structural variants present at high cell fractions revealed several instances of complex structural variant formation, highlighting considerable chromosomal instability of CK-AML. In patient CK282, the copy-number profiles of chromosomes 12 and 17 oscillated between three states and displayed islands of deletions (dels), inversions (invs) and inverted duplications with at least 15 and 6 detected breakpoints, respectively (Fig. 1c,d). For both chromosomes, resolving the structural variants by chromosome-length haplotype revealed only a single rearranged homolog (Fig. 1c,d and Supplementary Fig. 2a), suggesting that the respective structural variant profiles resulted from chromothripsis1,17,18. By quantifying the co-segregation footprints of the directional reads using scTRIP16, we identified 15 high-confidence translocations (Supplementary Table 4) that fused fragments of these complex rearrangements into both derivative and marker chromosomes—an observation verified by multiplex fluorescence in situ hybridization (M-FISH) and ultra-long DNA molecule optical genome mapping (OGM) (Fig. 1e and Supplementary Fig. 2b).
We also detected complex, clonal rearrangements affecting chromosomes commonly rearranged in AML. In patients HIAML85 and CK397, fragments from one 3q haplotype (H1) contained intrachromosomal rearrangements spanning the 3q arm in all cells. HIAML85 cells contained one large inversion, whereas CK397 cells harbored a complex intrachromosomal rearrangement involving at least three large inversions (Fig. 1f and Extended Data Fig. 1a). Reconstruction of the 3q arm using OGM confirmed both rearrangements, validating the Strand-seq-based data (Fig. 1f and Supplementary Fig. 3a). In patient HIAML85, the single inv(3)(q21.3q26.2) generated the oncogenic RPN1–MECOM fusion (Fig. 1f), commonly seen in 3q-rearranged AML19,20. In patient CK397, the kilobase-scale resolution provided by OGM identified 11 intrachromosomal fusions spanning the 3q arm with inv(3)(q21.3q26.2) and inv(3)(q26.2q29) also generating a RPN1–MECOM fusion (Fig. 1f and Supplementary Fig. 3b), further verified by RNA sequencing (RNA-seq; Supplementary Fig. 3c). In both patients the 3q rearrangement resulted in overexpression of MECOM (Extended Data Fig. 1b) and an H1-specific reduction in nucleosome occupancy in CK397 (Extended Data Fig. 1c). Hence, by leveraging the ability of Strand-seq to characterize structural variants in a haplotype-aware manner along each homolog, our data revealed balanced as well as complex intrachromosomal 3q rearrangements as driver events, resulting in overexpression of the poor prognosis oncogene MECOM21.
To further quantify ITH using Strand-seq, we calculated the structural variant burden per CK-AML cell (ranging between 0 SV- and 63 SV-altered segments per cell as identified by scTRIP; Fig. 1g and Supplementary Table 3) and applied the standard deviation of the structural variant burden as a measure of intrapatient karyotype heterogeneity. CK282 had both the highest structural variant burden (n = 50.3, mean per single cell) and intrapatient karyotype heterogeneity (s.d. 9.3) followed by CK349 (s.d. 6.3) (Fig. 1g). By contrast, the two MECOM-overexpressing samples, CK397 and HIAML85, did not show extensive intrapatient karyotype heterogeneity (s.d. 0.5 and 0.3, respectively) despite CK397 exhibiting the third highest structural variant burden (n = 22.0, mean per single cell) (Fig. 1g). These data underscore that, although intrapatient karyotype heterogeneity is widespread in CK-AML, this is not necessarily linked to the overall structural variant burden in a patient, but instead reflects individual subclonal diversity levels.
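The two summary statistics used here (mean structural variant burden and its standard deviation across cells) can be sketched as follows. The per-cell counts are hypothetical toy values, not study data, and the use of the sample standard deviation (rather than the population form) is an assumption.

```python
from statistics import mean, stdev


def burden_summary(sv_counts_per_cell):
    """Return (mean SV burden, s.d. of SV burden) across one patient's cells."""
    return mean(sv_counts_per_cell), stdev(sv_counts_per_cell)


# Hypothetical per-cell counts of SV-altered segments (illustrative only).
toy_patients = {
    "high_burden_homogeneous": [22, 22, 21, 22, 23, 22],
    "high_burden_heterogeneous": [50, 38, 61, 45, 59, 47],
}

for name, counts in toy_patients.items():
    avg, sd = burden_summary(counts)
    print(f"{name}: mean burden = {avg:.1f}, heterogeneity (s.d.) = {sd:.1f}")
```

The toy example mirrors the point made in the text: a sample can carry a high mean burden while showing little cell-to-cell spread, so the two quantities need to be reported separately.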
Different modes of clonal dynamics in CK-AML
To gain further insights into CK-AML subclonal evolution, we carried out a comprehensive analysis of structural variant subclonality for each diagnosis or salvage sample. We observed three distinct subclonal growth patterns: (1) monoclonal growth, (2) linear growth and (3) branched polyclonal growth (Fig. 2a). Two of eight cases exhibited monoclonal growth, whereby a single subclone was dominant at the time of sampling and only individual cells deviated from the main clone (Fig. 2b and Supplementary Table 3). In the remaining six cases, we identified oligo- or polyclonal growth, whereby multiple subclones were present. Of these, three showed linear and three branched growth patterns (Fig. 2a). As expected, the two samples with the highest intrapatient karyotype heterogeneity showed branched growth patterns (Fig. 1g).
In the two patients with monoclonal growth, structural variants were shared between all cells (excluding singleton events) affecting 3 chromosomes in patient HIAML85 and 12 chromosomes in patient CK397 (Fig. 2b and Supplementary Table 3). Both patients harbored inversions at 3q, generating the recurrent oncogenic RPN1–MECOM fusion described above (Fig. 1f and Supplementary Fig. 4a). By contrast, the three patients with linear growth (CK295, P9D and HIAML47) were characterized by a step-wise acquisition of structural variants (Fig. 2c). In each of the three patients a set of structural variants was present in virtually all cells (Fig. 2c) and thus probably originated from a common precursor AML cell. In all cases, we identified additional structural variants acquired in a step-wise manner, generating the dominant clone at the time of sampling (Fig. 2c and Supplementary Fig. 4b). Structural variants acquired later in disease evolution generally overlapped regions with known oncogenes and tumor suppressors, such as MYC at 8q, CDKN1B at 12p and TP53 at 17p. Notably, one cell (1 out of 91 cells, 1.1%) in patient HIAML47 lacked detectable structural variants (Fig. 2c,d). As this patient progressed from a JAK2-mutant myeloproliferative neoplasm (MPN) or chronic myelomonocytic leukemia (CMML) to AML (Supplementary Table 1), this cell hints at the presence of residual MPN- or CMML-related blood cells at the time of CK-AML diagnosis. Collectively, these findings underscore the selective growth advantage gained by the acquisition of additional structural variants in a linear step-wise process, probably leading to a successively more aggressive malignancy.
The branched polyclonal growth cases (D1922, CK282 and CK349) harbored multiple subclones displaying differences in their karyotypical complexities (Fig. 2e and Supplementary Table 3). Similar to the linear growth samples, we identified a set of structural variants that were present in virtually all cells, indicative of a common precursor cell. In patient D1922, all cells harbored a polyploid chromosome 8 together with translocation signatures at 1p and 6q, whereas, in patients CK349 and CK282, seven and ten chromosomes, respectively, carried both simple gains and losses as well as complex rearrangements (Fig. 2e). Among the branched polyclonal growth cases, patient D1922 had the lowest structural variant burden (n = 4.5, mean per single cell) and largely lacked complex rearrangements (Fig. 2e). We detected five subclones, referred to as SC1–SC5, that were characterized by distinct sets of whole-chromosome duplications, affecting five chromosomes (chromosomes 5, 16, 19, 20 and 21; Fig. 2e). In patient CK349, we classified cells into three main subclones, referred to as SC1, SC2 and SC3, each with distinct structural variant burdens (Fig. 2e). SC1 (81 out of 91 cells, 89%) represented the largest clone and harbored uniquely a chromosome 8 trisomy (Fig. 2e). By contrast, SC2 (5 out of 91 cells, 5.5%) and SC3 (5 out of 91 cells, 5.5%) carried a distinct set of rearrangements affecting chromosome 13 (Fig. 2e and Extended Data Fig. 2a,b). SC3 had additionally acquired a set of structural variants at chromosome 11, resulting in wave-like, copy-number profiles (discussed further below). Finally, patient CK282 showed the most abundant subclone diversity, represented by five distinct subclones and characterized by 6–59 structural variant-altered segments in each cell (Fig. 2e and Extended Data Fig. 3a). 
Three subclones, referred to as SC1, SC2 and SC3, showed a high level of genetic similarity, with the exception of structural variants identified on chromosomes 8 and 20 (Fig. 2e and Extended Data Fig. 3a,b). SC4 (19 out of 76 cells, 25.0%) lacked rearrangements on chromosome 20 but displayed several unique structural variants, including three duplications on chromosomes 9, 12 and 18, and one inversion on chromosome 17 (Fig. 2e and Extended Data Fig. 3a). By comparison, SC5 (3 out of 76 cells, 3.95%) differed markedly from all other subclones and harbored a distinct and much smaller structural variant set, which almost entirely lacked complex rearrangements abundant in SC1–SC4 (Fig. 2e and Extended Data Fig. 3a), suggesting parallel evolution from a common precursor stem cell harboring an inversion at 3q.
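The subclone sizes quoted for the branched cases are simple cell fractions. As a minimal sketch, the percentages for patient CK349 can be recomputed from the cell counts given in the text; the function name is hypothetical.

```python
def subclone_fractions(cell_counts):
    """Map each subclone to its percentage of all profiled cells."""
    total = sum(cell_counts.values())
    return {name: 100 * n / total for name, n in cell_counts.items()}


# Cell counts for patient CK349 as reported in the text (91 cells in total).
ck349 = {"SC1": 81, "SC2": 5, "SC3": 5}

for name, pct in subclone_fractions(ck349).items():
    print(f"{name}: {pct:.1f}%")
```

Rounded to one decimal place, this reproduces the reported 89% for SC1 and 5.5% for each of SC2 and SC3.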
Together, our single-cell assessment of subclonal growth patterns adds new insight into the clonal dynamics in diagnosis or salvage CK-AML, and showcases that multiple clones can exist and expand simultaneously in CK-AML. A detailed description of the structural variants in all samples and subclones can be found in Supplementary Note 1.
Single cells with excessive chromosomal instability
Beyond the assessment of subclonal growth patterns, our analysis of structural variants restricted to an individual cell revealed evidence for genomic regions subject to extensive chromosomal instability. As an example, we noted that chromosome 20 in CK282 subclones SC1, SC2 and SC3 displayed a classic breakage–fusion–bridge (BFB) event16,22 with the typical inverted duplication and adjacent terminal deletion signature arising on the same haplotype, but with the length of the terminal deletion varying from cell to cell (Extended Data Fig. 4a,b and Supplementary Note 1). Likewise, all CK349 cells displayed deletions on chromosome 17, with these events partially overlapping and presenting 15 unique, nonoverlapping breakpoints, pointing to persistent chromosomal instability involving this chromosome (Extended Data Fig. 4c,d and Supplementary Note 1).
We also detected subclone-specific chromosomal instability. The five cells comprising SC3 of patient CK349 exhibited the highest degree of karyotype heterogeneity across all cells (Figs. 1g and 2e). These cells exhibited a diversity of complex rearrangements affecting chromosome 11, comprising amplifications at different genomic positions and reaching distinct copy-number levels, interrupted by nonamplified disomic and/or deleted segments (Fig. 2f and Extended Data Fig. 5a,b). Closer inspection of the amplified regions showed highly variable and oscillating copy-number states, which differed from one-off chromothripsis events that typically yield only two (or occasionally three) oscillating copy-number states (Fig. 1c,d and Supplementary Fig. 5a)17,18. These wave-like, copy-number events also differed from other amplification events that contained distinct structural variant breakpoints demarcating a single copy-number state (Extended Data Figs. 2a and 3b and Supplementary Fig. 5b). Instead, these rearrangement patterns are indicative of the occurrence of seismic amplifications, a class of complex structural variants recently described in solid tumors from bulk whole-genome analysis23,24. Given the multistep rearrangement process involved in seismic amplifications23,24, the unique breakpoints and amplification states observed in each cell with a high structural variant burden in CK349 may result from successive circular recombination events initiated on chromosome 11 (Fig. 2g). Indeed, M-FISH analysis of a PDX sample generated from CK349 revealed a large ring chromosome containing several copies of segments from 11p and 11q (Fig. 2h), confirming the presence of a circular DNA structure. This is likely to promote chromosomal instability and acquisition of intrapatient karyotype heterogeneity in patient CK349. Linearized marker chromosomes containing segments from chromosome 11 were likewise present (Extended Data Fig. 5c), suggesting stabilization of the seismic amplification process in a subset of cells. Our findings are notably consistent with, and hence validate, the previously proposed model of circular recombination23,24, which our data reveal can act as a source of cell-to-cell DNA rearrangements fostering ITH in CK-AML. A detailed characterization of the chromosome 11 events can be found in Supplementary Note 2.
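The distinction drawn above, between chromothripsis (typically two or three oscillating copy-number states) and wave-like profiles with many states, can be illustrated with a crude heuristic. This is only a sketch under stated assumptions: it is not the classification used in this study or in the seismic amplification literature, the threshold of three states is taken from the wording above, and the segment profiles are hypothetical.

```python
def oscillation_summary(segment_copy_numbers):
    """Summarise ordered segment copy numbers along one chromosome.

    Returns (number of distinct copy-number states,
             number of state switches between adjacent segments).
    """
    states = set(segment_copy_numbers)
    switches = sum(
        1
        for left, right in zip(segment_copy_numbers, segment_copy_numbers[1:])
        if left != right
    )
    return len(states), switches


def looks_wave_like(segment_copy_numbers, max_chromothripsis_states=3):
    """Crude flag: more distinct states than expected for one-off chromothripsis."""
    n_states, _ = oscillation_summary(segment_copy_numbers)
    return n_states > max_chromothripsis_states


# Hypothetical segment-level copy-number profiles (illustrative values only).
chromothripsis_like = [2, 1, 2, 1, 2, 3, 2, 1]
seismic_like = [2, 4, 7, 5, 9, 6, 3, 2, 1]

print(oscillation_summary(chromothripsis_like), looks_wave_like(chromothripsis_like))
print(oscillation_summary(seismic_like), looks_wave_like(seismic_like))
```

A real analysis would also need breakpoint positions, haplotype information and amplitude thresholds, which this sketch deliberately omits.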
Epigenetic and transcriptomic insight into patient subclones
The impacts of larger structural variants on the cell epigenome, transcriptome and cell-surface proteome in AML remain unexplored as a result of the current lack of appropriate genomic technologies. To address this gap, we harnessed the distinct multimodal, single-cell readouts accessible through scNOVA and CITE-seq. Capitalizing on the high-resolution structural variant breakpoint coordinates obtained from Strand-seq, we utilized the CONICSmat25 computational method to pursue targeted somatic copy-number alteration (SCNA) recalling in the CITE-seq data to integrate the single-cell readouts, thereby expanding the number of assessed single cells to 35,577 (Extended Data Fig. 6a and Methods). In five of six patients exhibiting polyclonal growth, we confidently assigned cells from the CITE-seq data to the corresponding subclones defined by scNOVA13 (Fig. 3a, Extended Data Fig. 6b and Supplementary Fig. 6). We observed a marked correlation between Strand-seq and CITE-seq subclone detection (Spearman’s R = 0.7, P = 0.0003; Extended Data Fig. 6b,c and Supplementary Fig. 6), suggesting that both single-cell techniques provide a similar representation of subclonal frequencies. Within each patient, integration of the CITE-seq data showed clustering of the cells mostly by genetic subclone, with each subclone exhibiting distinct transcriptomic and immunophenotypic profiles (Fig. 3a). This effect was most evident in patients with branched growth, suggesting stronger phenotypic differences between competing subclones.
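The agreement between the two modalities is reported as a Spearman rank correlation of subclone frequencies. A minimal, dependency-free sketch of that statistic is shown here; the paired fractions are hypothetical and the helper names are illustrative, so the resulting coefficient is not expected to match the reported value.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical subclone fractions measured by each modality (illustrative only).
strand_seq_fraction = [0.89, 0.055, 0.055, 0.25, 0.04, 0.60]
cite_seq_fraction = [0.80, 0.10, 0.02, 0.30, 0.08, 0.55]

rho = spearman(strand_seq_fraction, cite_seq_fraction)
print(f"Spearman's R = {rho:.2f}")
```

A rank-based measure suits this comparison because the two assays sample very different numbers of cells, so only the ordering of subclone sizes is expected to be preserved.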
Leveraging the SCNA recalling in the CITE-seq data, we were able to obtain further insight into each subclone identified using scTRIP. For example, in patient HIAML47, we rediscovered the presence of primitive myeloid cells lacking structural variants (n = 77 cells) (Fig. 3a and Extended Data Fig. 7a), confirming the presence of pre-LSCs also identified using Strand-seq (Fig. 2c). These pre-LSCs (SC1) showed upregulation of multiple interferon (IFN) response genes (for example, IFITM1, IFITM2 and IFITM3) (Extended Data Fig. 7b,c and Supplementary Table 5), commonly upregulated in MPNs26. This was further recapitulated by pathway analysis whereby IFNγ and IFNα response gene sets, as well as the JAK–STAT signaling pathway, showed strongly enriched activity (Extended Data Fig. 7d), providing additional support for our hypothesis that the pre-LSCs represent residual persister cells of the preceding MPN or CMML disease rather than healthy hematopoietic stem or progenitor cells (HSPCs). By contrast, the dominating subclone harboring the most structural variants (SC3) in HIAML47 showed the lowest IFN and JAK–STAT signaling, but increased expression of cell cycle-associated genes (for example, E2F3, EIF4E, EIF3H and EIF3J) (Extended Data Fig. 7b,c and Supplementary Table 5). This was further reflected in the upregulation of the G2M checkpoint and mitotic spindle-associated gene signatures (Extended Data Fig. 7d). These findings are consistent with the selective growth advantage observed for this subclone.
We also gained insight into the molecular expression networks of patients displaying branched growth. Subclones from the same evolutionary branch typically expressed similar transcriptomic programs (Supplementary Note 3 and Supplementary Fig. 7). For example, in patient CK282, cells from SC1, SC2 and SC3 showed upregulation of genes involved in mitochondrial complex V (ATP5MF, ATP5MG and ATP5MD) (Fig. 3b and Supplementary Table 5) and enrichment of oxidative phosphorylation (Fig. 3c). In patient CK349, the transcriptomic data also reflected the extensive chromosomal instability observed at the genetic level, caused by the seismic amplification in SC3. We observed subclone-specific increased expression of several genes involved in cellular stress and DNA-damage response (for example, LDHA, SESN1, PRDX1, PRDX2, PRDX4, ATM, ALDH2 and ALDH1A1), many of which also showed reduced nucleosome occupancy (Fig. 3d, Extended Data Fig. 7e and Supplementary Tables 5 and 6), suggesting that these may be deregulated as a consequence of ongoing recombinatorial rearrangements of the respective circular DNA. It is interesting that these SC3 cells also upregulated classic cell proliferation-associated pathways, including the G2M checkpoint, MYC targets and mitotic spindle-associated gene signatures (Fig. 3e and Extended Data Fig. 7f), arguing that they may have a relatively higher proliferative activity compared with the other subclones in the same sample, which might contribute to the rapid mutation acquisition of this subclone.
In summary, our integrated framework enabled us to capture phenotypic intrapatient heterogeneity of genetically related yet distinct leukemic subclones. This revealed both shared and subclone-specific pathway dysregulation and cell-type biases (Supplementary Note 4, Supplementary Fig. 8 and Supplementary Table 7), driving distinct molecular programs that are simultaneously present within the same patient.
CK-AML clonal evolution patterns in mice
We hypothesized that the observed phenotypic diversity may also result in differences in functional disease-propagating capacity. To explore this, we established PDX models for five patients (Supplementary Table 1) and analyzed the engrafting cells using scNOVA (Fig. 4a). This revealed two engraftment patterns in PDX: (1) engraftment of the dominant clone (HIAML85 or HIAML47) or (2) engraftment of a minor subclone (CK282, CK349 or CK397) (Fig. 4b,c). Detailed characterization of patient-specific clonal dynamics in the PDX can be found in Supplementary Note 5. Transcriptomically, the engraftment-driving cells shared programs involved in cell growth, proliferation and oxidative phosphorylation, whereas downregulated gene sets were associated with inflammation (Supplementary Note 5, Supplementary Fig. 9a,b and Supplementary Table 8). Overall, the engrafted CK-AMLs in the PDXs showed increased structural variant burden but reduced karyotype heterogeneity compared with the corresponding primary patient samples (Fig. 4d, Extended Data Fig. 8a and Supplementary Table 3), consistent with expansion of a single or a few engrafted LSCs that may continue to undergo genomic evolution. Indeed, we also found unstable chromosomes in two of five PDXs already present in the primary samples and singleton structural variants in individual cells in four of five PDXs (Extended Data Fig. 8b–e). Thus, engraftment of LSCs in mice can be accompanied by spontaneous generation of de novo karyotype diversity.
To exemplify the clinical relevance of engraftment-driving LSCs, we analyzed karyograms from patient CK349 at relapse after chemotherapy treatment (Fig. 4e and Supplementary Fig. 10). At relapse, 88% (22 out of 25 cells) of chemotherapy-resistant cells lacked the trisomy 8 present at diagnosis, but harbored a large marker chromosome instead (Fig. 4f). The remaining 12% (3 out of 25) had a normal female karyotype and thus originated from the allogeneic HSC transplantation donor (Fig. 4f). Similarly, engraftment in CK349 PDX was driven by cells lacking trisomy 8 but harboring the complex seismic amplification at chromosome 11 (SC3; Fig. 4c,g) with the relative size of the engraftment-driving subclone increasing from 5.5% (5 out of 91 cells) at diagnosis in the patient to 97.5% (39 out of 40 cells) in the PDX (Fig. 4c,g). M-FISH analysis of the PDX cells confirmed that the amplifications on chromosome 11 resulted in a large ring chromosome or linearized marker chromosome (Fig. 4g and Extended Data Fig. 8b), consistent with the karyotype of the relapse-driving clone. These data strongly indicate that LSCs from the most genetically unstable subclone (SC3) at the time of diagnosis not only engrafted the leukemia in the PDX, but also drove clonal relapse in patient CK349. In summary, we identified different clonal evolution fates and patterns during CK-AML reconstitution in mice. Our data further indicate that PDX engraftment-driving subclones may also drive relapse outgrowth in patients with CK-AML, as in the case of CK349 (refs. 27,28).
Single-cell multiomics to dissect drug–response profiles
We next leveraged our single-cell multiomics data to study drug–response profiles of different genetic subclones ex vivo and examine the possible clinical relevance of functional LSCs. Based on the availability of primary material for follow-up studies, we included three patient samples that showed linear or branched polyclonal growth patterns at diagnosis (HIAML47, CK349 and CK282). We used our CITE-seq data to design antibody panels specific to the distinct subclones in each sample and assessed the drug–response profiles of each subclone by flow cytometry (Fig. 5a,b, Supplementary Fig. 11, Supplementary Table 9 and Supplementary Note 6).
In line with the known poor clinical therapy response of patients with CK-AML, all samples showed different levels of resistance to most of the tested drugs ex vivo (Fig. 5c). However, in HIAML47 and CK349, the LSC-enriched CD34−GPR56+ and CD45RA+CD49F+ cells, respectively, showed considerable response to the hypomethylating agent azacitidine (Extended Data Fig. 9a,b), supporting the favorable clinical trends for azacitidine in patients with AML and poor-risk cytogenetics29. It is interesting that HIAML47 cells exhibited no marked response to venetoclax monotherapy (Fig. 5d–f), even though the engraftment-driving LSCs demonstrated a notable response to high concentrations of venetoclax when combined with azacitidine (Extended Data Fig. 9a). Reflecting the ex vivo findings, patient HIAML47 exhibited an initial response to venetoclax and azacitidine treatment, but the leukemia re-emerged rapidly with an immunophenotype matching the engraftment-driving LSCs (Supplementary Figs. 10 and 11a and Supplementary Table 1). In CK349, we observed a distinct resistance exclusively in the engraftment-driving LSCs to cytarabine and daunorubicin, the same chemotherapy regimen that the patient received as first-line treatment (Fig. 5g–i, Extended Data Fig. 9b–d and Supplementary Fig. 10). Yet, the engrafted cells from CK349 showed considerable response to elesclomol (Extended Data Fig. 9e), a drug inducing apoptosis by oxidative stress30.
CK-AML cells of CK282 showed a striking response to the BCL-xL inhibitor A-1331852. Although this was the case for all CK282 subpopulations, CD90highCD45RA− LSC-enriched cells showed the strongest response in the primary sample (Fig. 5j–m and Extended Data Fig. 9f,g) and the PDX cells continued to be sensitive to this treatment (Supplementary Fig. 12a). In line with these results, BCL-xL protein expression levels were the highest in the engraftment-driving LSCs (Fig. 5n and Extended Data Fig. 9h). As the CD90high-expressing cells showed resistance to all other tested drugs, including standard chemotherapy (Fig. 5m and Supplementary Fig. 12b,c), BCL-xL inhibition may provide a valid alternative to standard chemotherapy regimens in a subset of CK-AML31. Beyond identifying alternative therapeutic options to explore further, the observed drug responses of functional LSCs largely reflected the clinical responses of the patients, providing a proof-of-concept method for larger screening efforts.
Longitudinal evolution of CK-AML in response to therapy stress
To further exemplify the biological and clinical relevance of single-cell clonal evolution analysis, we performed longitudinal scNOVA–CITE analysis on two patients (P9 and P5) where paired diagnosis and post-treatment samples were available (Supplementary Note 7). Patient P5 achieved complete remission after induction chemotherapy but relapsed 167 days later (Fig. 6a). At diagnosis the patient harbored five distinct subclones (SC1–SC5), whereas, at relapse, only SC1 cells were detected (Fig. 6b). Of the relapse cells, 98% (53 out of 54 cells) had additionally acquired a new complex rearrangement on chromosome 6, reminiscent of chromothripsis and manifesting as a marker chromosome (Fig. 6b,c, Extended Data Fig. 10a and Supplementary Table 1). Relapse cells also showed enrichment of immature HSC-like cells as evident by nucleosome occupancy-based cell typing (P = 0.047, Fisher’s exact test; Fig. 6d), which was accompanied by increased stemness scores (Extended Data Fig. 10b). Compared with treatment-naive cells, genes involved in translation (for example, EIF5A, EIF3F and EIF3L) were upregulated in relapse cells, which was consistent with upregulation of MYC targets and oxidative phosphorylation gene signatures (Fig. 6e–g, Extended Data Fig. 10c,d and Supplementary Table 10). Collectively, the relapse in patient P5 was probably driven by a chromothripsis event on chromosome 6 in SC1. This generated CK-AML cells with increased stemness as well as a steady increase in cell growth and oxidative phosphorylation, driving clonal disease progression.
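The enrichment of HSC-like cells at relapse is assessed with Fisher's exact test on a 2×2 table of cell-type counts. The sketch below implements the two-sided test from the hypergeometric distribution. The counts are hypothetical because the underlying table is not given in this section, and summing all tables no more probable than the observed one is one common two-sided convention, assumed here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table whose probability does not exceed that of the observed
    # table; the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = relapse / diagnosis, columns = HSC-like / other.
p_value = fisher_exact_two_sided(12, 42, 3, 45)
print(f"P = {p_value:.3f}")
```

An exact test is the natural choice here because some cell-type categories contain only a handful of cells, where chi-squared approximations become unreliable.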
Unlike patient P5, patient P9 received first-line treatment with the BCL-2 inhibitor venetoclax in combination with azacitidine, but was clinically refractory (Fig. 7a and Supplementary Fig. 10). At diagnosis, P9 cells consisted of three subclones, with two persisting after 12 days of treatment (Fig. 7b). In the refractory sample, 14.3% of cells (3 out of 21 cells) resembled diagnosis subclone SC1 and 85.7% (18 out of 21 cells) resembled SC3 (Fig. 7b,c and Extended Data Fig. 10e). Post-treatment, SC3-derived cells showed an increase in megakaryocyte–erythroid progenitor (MEP)-like cells (17.9% versus 27.7%), but a decrease in lymphoid-primed multipotent progenitor (LMPP)-like cells (20.5% versus 11.1%) (Fig. 7d). Meanwhile, SC1-derived cells acquired a new 5-Mb focal deletion on chromosome 17q (Fig. 7c). This includes the NF1 tumor-suppressor gene, which showed reduced expression specifically in the SC1-derived refractory cells (Fig. 7e,f and Supplementary Table 10). In addition, refractory cells upregulated inflammation-associated gene signatures, including tumor necrosis factor via nuclear factor κ-light-chain enhancer of activated B cells (NF-κB) signaling (Extended Data Fig. 10f,g). Finally, ex vivo drug–response profiling revealed that both SC1- and SC3-enriched populations were resistant to venetoclax monotherapy and azacitidine combination therapy already at diagnosis (Fig. 7g–i and Extended Data Fig. 10h), mimicking the clinical response. Strikingly, these venetoclax-resistant subclones showed sensitivity to elesclomol (Fig. 7j), a drug previously observed to induce cell death in venetoclax-resistant cells32. Collectively, patient P9 exhibited persistence of two distinct subclones post-treatment, with each having acquired subclone-specific mechanisms to further resistance: a shift toward MEP-like cells and NF1 loss leading to increased RAS signaling. Notably, both subclones were susceptible to the oxidative stress inducer elesclomol, a finding deserving of further preclinical and clinical investigation in the future.
Discussion
We dissected the intrapatient heterogeneity of ten samples from patients with CK-AML at unprecedented single-cell multiomics resolution, including structural variant mapping and functional assays. This approach provided intriguing insights into CK-AML heterogeneity and revealed key resistance mechanisms. Single-cell structural variant mapping identified three modes of clonal growth in CK-AML: monoclonal, linear and branched polyclonal growth. Although previous studies using bulk whole-genome and single-cell DNA sequencing in AML have identified similar clonal evolution patterns based on single nucleotide variants33,34, inferring evolutionary history of structural variants is highly challenging in CK-AML as a result of an extensive number of alterations (up to 63 structural variant-altered segments in individual cells) and spontaneous karyotype diversity35,36. Despite known limitations14,16,37, our findings emphasize the need for single-cell resolution technologies (Supplementary Notes 8 and 9).
Strand-seq data, compared with single-cell RNA-sequencing (scRNA-seq) data, offer superior resolution for detecting structural variants and studying subclonal dynamics, often not fully captured by scRNA-seq data alone because of limited resolution13,38. Yet, our integrative framework coupling high-resolution genomic data based on Strand-seq and scNOVA with CITE-seq provided deeper insights into the transcriptomic states of subclones than Strand-seq alone. Using scNOVA, we identified cells with extreme chromosomal instability as well as rare pre-LSCs lacking structural variants, consistent with recent findings in secondary AML39. Using CITE-seq, we showed that pre-LSCs displayed reduced cell proliferation compared with the CK-AML cells in the same sample, whereas extreme chromosomal instability was reflected in the upregulation of cellular stress and DNA-damage response, together with increased proliferation. In the context of venetoclax resistance, our integrative analysis revealed subclone-specific mechanisms to further resistance such as de novo structural variant acquisition and lineage plasticity, insights that would have probably remained obscured by either single-cell method alone.
Although ex vivo drug testing provides a predictive assay for new treatments, sensitivity of the results is significantly influenced by the method used40,41. Bulk assays yield lower sensitivity compared with flow cytometry-based assays that enable blast and LSC-specific readouts40,41. In the present study, utilizing distinct cell-surface phenotypes of different subclones identified by our framework, we recapitulated clinical responses in three patients using ex vivo drug testing, effectively targeting leukemia-regenerating cells in one patient with adverse genetics using BCL-xL inhibition. Although we were not able to identify inhibitors with strong efficacy toward LSCs in all patients, our platform shows promise for discovering alternative treatments in CK-AML, which may be particularly relevant for personalized cancer therapy42,43. One such drug was elesclomol, which showed efficacy in both venetoclax resistance-driving subclones of patient P9. This underscores the need for expanded screening to identify patient-specific, LSC-targeting options through ex vivo drug testing with subclonal readouts.
Methods
Samples from patients with primary AML
All samples were obtained from patients who provided written informed consent for the research use of their specimens in agreement with the Declaration of Helsinki. The project was approved by the Ethics Committee/Institutional Review Board of the Medical Faculty of Heidelberg and Cancer and Leukemia Group B (CALGB) (NCT-MASTER platforms S-206/2011 and S-169/2017, and CALGB studies CALGB 8461, CALGB 9665 and CALGB 20202). The protocols involved collection of bone marrow aspirates and peripheral blood samples. Part of the cohort was provided by the NCT (National Center for Tumor Diseases) Liquid and Cell Biobank, a member of the BioMaterialBank Heidelberg (BMBH). Bone marrow and peripheral blood mononuclear cells were isolated by density gradient centrifugation and stored in liquid nitrogen until further use. Patient characteristics are listed in Supplementary Table 1.
Processing of primary AML cells for single-cell sequencing
Viably cryopreserved AML bone marrow and/or peripheral blood samples were thawed at 37 °C in Iscove’s modified Dulbecco’s medium (IMDM) containing 10% fetal bovine serum and treated with DNase I for 15 min (100 μg ml−1).
Strand-seq in leukemia cells
For Strand-seq analysis, recovered cells were cultured using previously established protocols44,45 with IMDM, 15% BIT (bovine serum albumin, insulin, transferrin; STEMCELL Technologies, 09500), 20 ng ml−1 of granulocyte colony-stimulating factor (G-CSF; PeproTech, 300-23), 50 ng ml−1 of FLT3-L (PeproTech, 300-19), 100 ng ml−1 of stem cell factor (SCF; PeproTech, 300-07), 20 ng ml−1 of interleukin-3 (IL-3) (PeproTech, 200-03), 100 μM β-mercaptoethanol (Thermo Fisher Scientific, 31350010), 500 nM SR1 (StemRegenin 1, STEMCELL Technologies, 72342), 500 nM UM729 (STEMCELL Technologies, 72332) and 1% penicillin–streptomycin (Sigma-Aldrich, P4458-100ML). Bromodeoxyuridine (BrdU; 40 μM) was incorporated for the duration of one cell division (52–62 h) to perform nontemplate strand labeling. Single nuclei from the appropriate time point were sorted into 96-well plates using a BD FACSMelody cell sorter, followed by Strand-seq library preparation, as described previously14,46. Libraries were sequenced on an Illumina NextSeq 500 sequencing platform (75-bp, paired-end sequencing protocol).
CITE-seq in leukemia cells
For combined scRNA-seq and antibody-derived tag sequencing (CITE-seq) analysis, recovered cells were stained with a total of 38 or 149 antibody-derived tags (ADTs) and in some cases also with a hashtag oligo (HTO; Supplementary Table 11), and sorted for live CD45+ cells using a BD FACSAria II or III cell sorter. CITE-seq library preparation was performed as previously reported15 using the Chromium Single Cell 3′ Library and Gel Bead Kit (10x Genomics, 1000128). Then, 5,000–10,000 cells were targeted for each sample and processed according to the manufacturer’s instructions (10x Genomics) and 0.2 mM ADT additive oligonucleotides or 3′ feature complementary DNA Primers2 (10x Genomics) were spiked into the cDNA amplification PCR (13 cycles). After PCR, a large cDNA fraction was separated from ADTs or HTOs using 0.6× solid-phase reversible immobilization (SPRI). The cDNA fraction was processed using the 10x Genomics Single Cell 3′ v.3.1 protocol to generate the transcriptome libraries. To generate the ADT libraries, ADTs were indexed with Truseq Small RNA RPIx primers by PCR for ten cycles, followed by library purification and reamplification for five additional cycles with P5 or P7 generic primers. To generate the ADT or HTO libraries, ADTs/HTOs were indexed with Dual Index NT primers by PCR for 12 cycles, followed by library purification. ADTs or HTOs and scRNA-seq libraries were either pooled in a ratio of 25% ADT and 75% RNA or sequenced separately on an Illumina NovaSeq 6000 S1 (300 pM with 1% PhiX loading concentration, 28 + 94-bp read configuration).
Strand-seq-based structural variant discovery
Paired-end sequencing reads were aligned to the human reference genome (GRCh38) using the Burrows–Wheeler alignment algorithm47 and duplicated reads were marked using biobambam48 as described previously for the Strand-seq data analysis16. Good quality (mapping quality MAPQ ≥ 10) and nonduplicated reads were used in the downstream analysis. Reads aligned to the Watson and Crick strands were counted separately in 100-kb genomic bins. We used reads mapping to the Watson and Crick strands to resolve the Strand-seq data by chromosome-length haplotype49. Based on the read depth, strand orientation and haplotype information, structural variant calling was performed using the scTRIP method16. In brief, the scTRIP framework infers structural variants in the segmented data by employing a Bayesian model that estimates the genotype likelihoods for each segment and each cell. Using this Bayesian model, the most probable structural variant type was assigned to each segment, followed by manual inspection of each structural variant. Cells were assigned to subclones based on the presence of shared structural variants, whereby a subclone was defined by three or more cells sharing a set of structural variants. Where cells were clear progeny of a larger subclone, groups of fewer than three cells were also considered subclones (see linear growth samples in Fig. 2c).
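The subclone-definition rule described above (three or more cells sharing a set of structural variants) can be illustrated with a minimal sketch. It assumes each cell has already been reduced to a set of structural-variant labels and treats "sharing a set" as having an identical set, which is a simplification. The function name `group_subclones` and the labels are hypothetical; the sketch does not reproduce scTRIP's Bayesian genotyping, the manual inspection step or the exception for small progeny groups.

```python
from collections import defaultdict


def group_subclones(cell_svs, min_cells=3):
    """Group cells by identical SV sets (simplifying assumption).

    cell_svs: dict mapping a cell ID to an iterable of SV labels.
    Returns (subclones, unassigned), where subclones maps a frozenset of
    SV labels to the sorted cell IDs carrying exactly that set.
    """
    groups = defaultdict(list)
    for cell, svs in cell_svs.items():
        groups[frozenset(svs)].append(cell)

    subclones = {}
    unassigned = []
    for sv_set, cells in groups.items():
        if len(cells) >= min_cells:
            subclones[sv_set] = sorted(cells)
        else:
            # Too few cells to call a subclone under the three-cell rule
            unassigned.extend(cells)
    return subclones, sorted(unassigned)
```

In practice, cells left unassigned by such a rule would still need manual review, for example to recognize progeny of a larger subclone.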
Structural variant burden and intrapatient karyotype heterogeneity
Using the structural variant calls from scTRIP, individual structural variant-altered segments were annotated and counted for each cell. Structural variant burden was calculated as the sum of all identified structural variant-altered segments per cell. The s.d. of the structural variant burdens per patient was used as a measure of patient-level, intrapatient karyotype heterogeneity. For subclone-level, intrapatient karyotype heterogeneity, the s.d. of the structural variant burden per subclone was used.
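As a worked illustration of the two measures defined above, the following sketch computes the per-cell structural variant burden and the s.d. of burdens across a group of cells. The input layout and function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not state which estimator was used.

```python
from statistics import stdev


def sv_burden(segments_by_cell):
    """Burden = number of SV-altered segments identified in each cell."""
    return {cell: len(segments) for cell, segments in segments_by_cell.items()}


def karyotype_heterogeneity(burdens):
    """s.d. of SV burdens across a group of cells (patient or subclone).

    Assumes the sample standard deviation; returns 0.0 for fewer than
    two cells, where the s.d. is undefined.
    """
    values = list(burdens)
    return stdev(values) if len(values) > 1 else 0.0
```

The same function serves both levels of the analysis: pass all cells of a patient for patient-level heterogeneity, or only the cells of one subclone for subclone-level heterogeneity.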
Nucleosome occupancy-based cell-type classification of CK-AML cells
Using single-cell Strand-seq libraries of CK-AML, scNOVA analysis was performed to obtain nucleosome occupancy at gene bodies for each single cell as previously described13. As genetic SCNA can confound the nucleosome occupancy measurement at gene bodies, copy-number normalization of nucleosome occupancy, based on the ploidy status inferred by PloidyAssignR using 1-Mb bins and a 500-kb sliding window (https://github.com/lysfyg/PloidyAssignR), was performed. The copy-number-normalized nucleosome occupancy matrix was used as input for the nucleosome occupancy-based cell-type classifier of HSPCs38 to predict the most likely cell type for each single-cell Strand-seq library.
Differentially occupied genes in subclones based on scNOVA
Using the copy-number-normalized nucleosome occupancy measurement at gene bodies, as described above, differential gene activity analysis of scNOVA13 was performed for samples with linear or branched growth. To infer differentially active genes for each subclone, the single cells in a subclone were compared with all other single cells in the same sample using an alternative mode of scNOVA based on partial least squares-discriminant analysis. The inferred cell type was considered as a confounding factor in the differential analysis.
Haplotype-specific nucleosome occupancy analysis
First, the chromosome-wide haplotype of nucleosome occupancy at gene bodies was resolved. The nucleosome occupancy of the two haplotypes was compared for each gene using a two-tailed Wilcoxon’s test followed by Benjamini–Hochberg multiple-testing correction. Using a 10% FDR cutoff, genes showing haplotype-specific nucleosome occupancy were identified.
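The multiple-testing step above can be sketched as follows. This is a plain-Python illustration of the standard Benjamini–Hochberg adjustment applied to a list of per-gene P values, followed by the 10% FDR cutoff; the Wilcoxon test itself is not implemented here, and the function names are hypothetical rather than taken from scNOVA.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def haplotype_specific_genes(gene_pvalues, fdr=0.10):
    """Genes whose BH-adjusted P value is at or below the FDR cutoff."""
    genes = list(gene_pvalues)
    adjusted = benjamini_hochberg([gene_pvalues[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q <= fdr]
```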
CITE-seq data pre-processing and integration
Cell Ranger v.6.0 (10x Genomics) was used to align the sequencing reads to the GRCh38 human reference genome build, distinguish cells from the background and generate a unified feature-barcode matrix that contains gene expression counts, alongside cell-surface protein feature counts for each cell barcode.
Quality control of CITE-seq data
The R package Seurat v.4.0.4 was used to calculate the quality control metrics50. Cells were removed from the analysis if <200 or >8,000 distinct genes, <1,000 counts or >15% of reads mapping to mitochondrial genes were detected.
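The filtering thresholds stated above translate directly into a per-cell predicate. The sketch below is illustrative only: the function and argument names are hypothetical, and the actual analysis was performed with Seurat in R.

```python
def passes_qc(n_genes, n_counts, pct_mito):
    """Return True if a cell survives the stated quality-control filters.

    A cell is removed if it has <200 or >8,000 detected genes,
    <1,000 counts, or >15% of reads mapping to mitochondrial genes.
    """
    if n_genes < 200 or n_genes > 8000:
        return False
    if n_counts < 1000:
        return False
    if pct_mito > 15:
        return False
    return True
```

Note that the boundaries are exclusive for removal, so a cell with exactly 200 genes, 1,000 counts or 15% mitochondrial reads is retained.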
Pre-processing and dimensional reduction of CITE-seq data
Pre-processing and dimensional reduction of CITE-seq data were performed independently on both RNA and ADT assays. Gene counts were normalized by applying regularized negative binomial regression using the Seurat sctransform function51, followed by principal component analysis (PCA) with highly variable genes as input. Cell-surface protein counts were centered log-ratio transformed across cells using the Seurat NormalizeData function with ‘CLR’ method, followed by scaling and PCA.
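To make the protein normalization concrete, the following sketch applies a generic centered log-ratio (CLR) transform to one vector of counts, for example the counts of one antibody across cells. It uses a pseudocount of 1 (via `log1p`) to handle zeros, which is an assumption; Seurat's own CLR implementation may differ in detail, and the function name `clr` is hypothetical.

```python
import math


def clr(counts):
    """Generic CLR: log(count + 1) minus the mean of log(count + 1).

    The pseudocount of 1 is an assumption made to handle zero counts.
    """
    if not counts:
        return []
    log_values = [math.log1p(c) for c in counts]
    mean_log = sum(log_values) / len(log_values)
    return [v - mean_log for v in log_values]
```

By construction, the transformed values of each vector sum to zero, so they express each count relative to the geometric mean of the vector.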
Weighted nearest neighbor analysis of CITE-seq data
For each cell, its closest neighbors in the dataset were calculated based on a weighted combination of RNA and protein similarities, using the Seurat FindMultiModalNeighbors function52. For the RNA modality, 30 dimensions were used and, for the protein modality, 18 dimensions. Downstream analysis including uniform manifold approximation and projection (UMAP) visualization and t-distributed stochastic neighbor embedding visualization of the data, as well as clustering, was performed based on a weighted combination of RNA and protein data. Clustering of the cells was done using the FindClusters function.
scNOVA–CITE workflow
Targeted SCNA recalling
SCNA calling from the gene expression counts from CITE-seq data was done using the CONICSmat R package. In brief, to determine the copy-number status of each cell, CONICSmat fits a two-component Gaussian mixture model for each provided chromosomal region. The mixture model is fit to the average gene expression of genes within a region and cells with a deletion of the region will show an on-average lower expression from the region than cells without the deletion. The posterior probabilities for each cell belonging to one of the components can then be used to decipher the copy-number status of each cell25.
In the present study, the structural variant discovery from scTRIP was used to construct a list of chromosomal regions containing SCNAs. These were used to infer the copy-number status of each cell for each chromosomal region using the log2((counts per million)/10 + 1) normalized gene counts from CITE-seq data. To be able to detect SCNAs affecting smaller regions, posterior probabilities were computed for regions with more than ten expressed genes (modified VisualizePosterior.R script; line 107 if(length(chr_genes)>10)). After obtaining the mixture model results, uninformative noisy regions were filtered based on the likelihood ratio test, adjusted P (Padj) < 0.01 and Bayes information criterion >200. A posterior probability cutoff of 0.8 was used for a confident SCNA assignment.
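The normalization and the posterior cutoff described above can be sketched in a few lines. This is a minimal illustration, not the CONICSmat code: it assumes raw counts for one cell and a precomputed posterior probability that the cell belongs to the altered mixture component. The function names and the three status labels are hypothetical.

```python
import math


def normalize_counts(counts):
    """log2(CPM/10 + 1) normalization of one cell's raw gene counts."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log2((c * 1e6 / total) / 10 + 1) for c in counts]


def call_scna(posterior_altered, cutoff=0.8):
    """Turn a posterior probability into a confident call or 'unassigned'."""
    if posterior_altered >= cutoff:
        return "altered"
    if 1 - posterior_altered >= cutoff:
        return "unaltered"
    return "unassigned"
```

With a two-component mixture, the posterior for the unaltered component is one minus the posterior for the altered component, which is why a single probability suffices here.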
Assignment of CITE-seq cells to genetic subclones
SCNA regions from CONICSmat passing filtering were used as ‘marker structural variants’ matching subclone-specific structural variants identified using scTRIP. These marker structural variants were used to assign each cell to its corresponding genetic subclone. Cells not reaching confidence cutoff of 0.8 were termed ‘unassigned’ and excluded from downstream subclone-level analyses. For pre-LSCs in HIAML47, cells annotated as HSPCs and reaching confidence cutoff for the absence of marker structural variants were considered.
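One way to picture the assignment step is as a lookup of each cell's marker calls against the expected marker pattern of every subclone. The sketch below assumes per-region calls of "altered", "unaltered" or "unassigned" and a hypothetical table of expected marker statuses per subclone; the exact matching logic used in the study is not specified in the text, so requiring a unique, fully confident match is an assumption.

```python
def assign_subclone(cell_calls, subclone_markers):
    """Assign a cell to the single subclone whose markers it matches.

    cell_calls: dict mapping region -> 'altered' | 'unaltered' | 'unassigned'.
    subclone_markers: dict mapping subclone name -> {region: expected status}.
    Returns the subclone name, or 'unassigned' if zero or several match.
    """
    matches = []
    for name, markers in subclone_markers.items():
        # A region missing from the cell's calls is treated as not confident
        if all(cell_calls.get(region, "unassigned") == status
               for region, status in markers.items()):
            matches.append(name)
    return matches[0] if len(matches) == 1 else "unassigned"
```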
‘Reference-based’ annotation of leukemic cells
Single leukemic cells were assigned to their corresponding healthy counterparts using automatic cell-type annotation with SingleR53 by determining similarity to reference bone marrow cells based on Spearman’s correlation. A previously published CITE-seq dataset, which consists of 30,672 scRNA-seq profiles measured alongside a panel of 25 antibodies from bone marrow, was used as the reference bone marrow atlas54.
Finding differentially expressed features between subclones
Marker genes that defined each structural variant group by differential expression were identified using the scran findMarkers function with two-sided Welch’s t-test as the pairwise test. To account for the biases driven by different cell types in the structural variant groups, the cell-type variable and the structural variant group variable were both used as predictors in the linear model via the design argument of findMarkers. Only upregulated marker genes were considered. Genes with an FDR-corrected P ≤ 0.05 and a log(fold-change) in expression of at least 0.1 (logFC ≥ 0.1) were considered differentially expressed unless otherwise stated.
Molecular phenotype analysis in gene sets
AUCell55 was used for signature score calculations between subclones with default parameters, using Hallmark modules from MSigDB56. LSC stemness scores were calculated for each cell as the mean expression of the normalized gene counts of the signature genes obtained by Ng and colleagues57. Gene-set over-representation analysis using the enricher function from clusterProfiler was performed to model gene expression changes across the Hallmark modules from MSigDB56. For each gene set, the significance of overlap between the target gene set and genes exhibiting differential gene expression between subclones was computed using hypergeometric tests, followed by controlling the FDR at 0.05.
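The hypergeometric over-representation test mentioned above can be written out explicitly. The sketch below computes the upper-tail probability of observing at least k gene-set members among n differentially expressed genes, given a background of N genes of which K belong to the gene set. It is a generic illustration with hypothetical names, not the clusterProfiler implementation, and it omits the subsequent FDR control.

```python
from math import comb


def overrepresentation_pvalue(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    k: overlap between the gene set and the differentially expressed genes
    N: number of background genes
    K: number of background genes in the gene set
    n: number of differentially expressed genes
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)
```

For example, with 10 background genes, a gene set of 5 and 5 differentially expressed genes that all fall in the set, the P value is 1/252.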
Mouse experiments
NOD.Prkdcscid.Il2rgnull (NSG) mice were bred and housed under specific pathogen-free conditions in individually ventilated cages with controlled temperature (approximately 22 °C) and humidity (50%) under 12 h:12 h light:dark cycle at the central animal facility of the German Cancer Research Center (DKFZ). Animal experiments were conducted in compliance with all relevant ethical regulations. We obtained written, informed consent for all experiments and they were approved by the Regierungspräsidium Karlsruhe under Tierversuchsantrag nos. G42/18 and G-140-21.
Xenotransplantations
Female mice aged 8–12 weeks were sublethally irradiated (175 cGy) 24 h before xenotransplantation assays. AML samples were stained with human CD3 MicroBeads (Miltenyi Biotec, 130-050-101) for depletion of CD3+ T cells. Magnetic-activated cell sorting (MACS) was performed according to manufacturer’s instructions and unlabeled cells run through the MACS column were collected. Then, 1 × 10^6–2 × 10^6 bulk, CD3-depleted AML cells were injected into the femoral bone marrow cavity of sublethally irradiated mice. Human leukemic engraftment in mouse bone marrow was evaluated by flow cytometry at 10 weeks, 16 weeks and endpoint (maximum 30 weeks unless endpoint criteria were reached earlier), using anti-human-CD45-AF700 (clone HI30; BD Biosciences, 560566), anti-human-CD34-BUV395 (clone 581; BD Biosciences, 563778), anti-human-CD38-BUV496 (clone HIT2; BD Biosciences, 612946), anti-human-GPR56-PE (clone CG4; BioLegend, 358204), anti-human-CD19-APC (clone HIB19; eBioscience, 17-0199-42), anti-human-CD33-APC (clone WM53; BioLegend, 740974) and anti-mouse-CD45-FITC (clone 30-F11; eBioscience, 11-0451-82). Mice were considered ‘engrafted’ if human cells represented >1% of the bone marrow cell population and ‘leukemic/myeloid’ if the human cells showed >80% CD33 positivity. At the endpoint, bone marrow cells were harvested from tibiae, femurs, iliac crests and spine by bone crushing. Spleen cells were harvested by mincing the spleen with a plunger. After red blood cell lysis, cells were resuspended in Cryostore (Sigma-Aldrich, C2874-100) and stored in liquid nitrogen until further use.
Optical genome mapping
OGM was performed on the primary HIAML85 sample and on xenotransplantation samples from CK282 and CK397. Ultra-high molecular mass DNA was extracted from AML cells recovered from bone marrow or spleen following the manufacturer’s protocols (Bionano Genomics). Briefly, the cells were digested followed by DNA precipitation and binding with a nanobind magnetic disk. Labeling of the ultra-high molecular mass DNA was performed following the manufacturer’s instructions (Bionano Genomics), with 750 ng of DNA labeled using the standard direct labeling enzyme 1. The fluorescently labeled DNA molecules were imaged sequentially across nanochannels on a Saphyr instrument. A coverage of approximately 300× was achieved for all samples.
Somatic structural variants were analyzed using the Rare Variant Analyses software (Bionano Solve software) provided by Bionano Genomics. Molecules were aligned against the GRCh38 human reference genome build, without ploidy assumption. Consensus genome maps (*.cmaps) were assembled from clustered sets of molecules identifying the same variant, then realigned to GRCh38. Fractional SCNA analysis was performed from the alignment of molecules and labels against GRCh38 (alignmolvrefsv). A sample’s raw label coverage was normalized against relative coverage from normal human controls, segmented and baseline copy-number state estimated by calculating the mode of coverage of all labels. Significant deviations from the baseline were used to assess the copy-number states, with high-variance regions masked.
Multiplex FISH
M-FISH analysis was performed on xenotransplantation samples from CK282 and CK349. Cells were cultured the same as for Strand-seq analysis (see above) using previously established protocols44,45. M-FISH was performed as described previously58. In brief, seven pools of flow-sorted, whole-chromosome painting probes were amplified and combinatorially labeled by degenerate oligonucleotide-primed PCR using DEAC-, FITC-, Cy3-, TexasRed- and Cy5-conjugated nucleotides and biotin-dUTP and digoxigenin-dUTP, respectively. Metaphase spreads were digested with pepsin (0.5 mg ml−1; Sigma-Aldrich) in 0.2 N HCl (Roth), post-fixed in 1% formaldehyde, dehydrated with a graded ethanol series and air dried, followed by denaturation of slides. Hybridization mixture was hybridized to the denatured metaphase preparations and incubated for 48 h at 37 °C. Three layers of antibodies were used to visualize biotinylated probes: streptavidin Alexa Fluor-750 conjugate (Invitrogen, S21384), biotinylated goat anti-avidin (Vector, BA-0300), followed by a second streptavidin Alexa Fluor-750 conjugate (Invitrogen, S21384). Two layers of antibodies were used to visualize digoxigenin-labeled probes: rabbit anti-digoxin (Sigma-Aldrich, D7782) followed by goat anti-rabbit immunoglobulin G Cy5.5 (Linaris, PAK0027). Slides were counterstained with DAPI and covered with antifade solution. A DMRXA epifluorescence microscope (Leica Microsystems) equipped with a Sensys CCD camera (Photometrics) was used to capture images of metaphase spreads for each fluorochrome using highly specific filter sets (Chroma Technology). Leica Q-FISH software was used to control the camera and microscope. Leica MCK software was used to process the images that were presented as multicolor karyograms (Leica Microsystems Imaging solutions).
Fusion transcript detection from bulk RNA-seq
STAR-aligner-based Arriba fusion detection tool59 was used to detect fusion transcripts from bulk RNA-seq data. First, reads were demultiplexed and STAR aligner 2.5.3a was used to align FASTQ files containing reads for individual samples by two-pass alignment60. Reads were aligned to a STAR index generated using the GRCh38 genome build. Detection of chimeric reads was enabled. Next, the Arriba fusion detection tool was used to extract the Chimeric.out.sam and Aligned.out.bam files and to create a list of fusion predictions passing Arriba’s filters.
Ex vivo drug screening
Ex vivo drug screening was performed on thawed cells from four diagnosis samples and human CD45+ cells from two PDX samples (Supplementary Table 1). Cells were cultured the same as for Strand-seq analysis (see above) using previously established protocols44,45. Then, 0.5 × 10^5 AML cells per well were seeded in flat-bottomed, 96-well plates and cells were treated with up to 12 treatment conditions consisting of standard chemotherapy regimens as well as new compounds for 24 h, and for selected conditions for another 72 h (Supplementary Table 9). After 24 or 72 h, the cells were stained with cell-surface antibodies (Supplementary Table 12). The same amount of CountBright Absolute Counting Beads (Thermo Fisher Scientific, C36950) together with 7-aminoactinomycin D (BD Biosciences, 559925) was added to each sample before analysis with a BD LSRFortessa Cell Analyzer.
Intracellular staining for BCL-2 family members
Intracellular staining was performed on thawed cells from four diagnosis samples as previously described41 (Supplementary Table 12). Thawed cells were stained with Zombie NIR Fixable Viability stain in phosphate-buffered saline (BioLegend, 423105), followed by cell-surface antibody staining (Supplementary Table 12). Stained cells were fixed and permeabilized using the Fixation/Permeabilization Solution Kit (BD Biosciences, 554714) according to the manufacturer’s instructions. To enhance intracellular staining, a secondary permeabilization step using Permeabilization Buffer Plus (BD Biosciences, 561651) was performed. Fixed and permeabilized cells were stained with anti-human-BCL-2-AF647 (clone 124; Cell Signaling, 82655), anti-human-MCL-1-AF488 (clone D2W9E; Cell Signaling, 58326) and anti-human-BCL-xL-PE-Cy7 (clone 54H6; Cell Signaling, 81965) (Supplementary Table 12). Samples were analyzed using a BD LSRFortessa Cell Analyzer.
Quantification and statistical analysis
Methods used for statistical analyses are detailed in the figure legends. All statistical analyses were done using R 4.0.0. Flow cytometry data analysis was done using FlowJo v.10.5.3.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41588-024-01999-x.
Acknowledgements
We thank all technicians of A.T.’s laboratory for technical assistance and K. Stumpf, A. Narr and other lab members for constructive discussions. We are grateful to J.-P. Mallm and K. Bauer from the DKFZ Single-cell Open Lab, S. Schmitt, M. Eich, K. Hexel, T. Rubner and F. Blum from the DKFZ Flow Cytometry Core Facility for their assistance, and K. Reifenberg, P. Prückl, M. Durst, A. Rathgeb and all animal caretakers of the DKFZ Central Animal Laboratory for excellent animal welfare and husbandry. We also thank the DKFZ Genomics and Proteomics Core Facility for their assistance, as well as the DKFZ ODCF System Administration, and the European Molecular Biology Laboratory (EMBL) Flow Cytometry Core Facility for assistance in cell sorting and the EMBL Genomics Core Facility for assisting in Strand-seq single-cell automation. This work was partly supported by: the SPP2036, FOR2674 and SFB873 funded by the Deutsche Forschungsgemeinschaft (DFG); the DKTK joint funding project ‘RiskY-AML’; the ‘Integrate-TN’ Consortium funded by the Deutsche Krebshilfe; the European Research Council (ERC) Advanced Grant SHATTER-AML (grant AdG-101055270); the ERC Consolidator Grant MOSAIC (grant CoG-773026); the Dietmar Hopp Foundation; and the National Institutes of Health (grants R01CA262496, R01CA284595-01 and R01CA283574-01). A.M.L. was supported by the Ida Montin Foundation. A.W. was supported by the European Molecular Biology Organization Postdoctoral Fellowship and the Marie Curie Individual Fellowship. B.R.M. was supported by a Bridging Excellence Fellowship provided by the Life Science Alliance. Graphic illustrations were created with BioRender.com.
Author contributions
A.M.L., K.G., H.J., A.D.S., J.O.K. and A.T. conceptualized the study. A.M.L., K.G., F.Y.H., E.V.B. and P.H. performed Strand-seq experiments. K.G., A.M.L., H.J., F.Y.H., A.D.S. and J.O.K. performed structural variant analysis. K.G., A.M.L., F.Y.H. and J.O.K. performed subclonal reconstruction as well as measurement of intrapatient karyotype heterogeneity using Strand-seq data. H.J. and A.A. performed haplotype-specific nucleosome occupancy analysis and cell-type classification. A.M.L. and F.Y.H. performed CITE-seq experiments. A.M.L., F.Y.H. and F.G. performed alignment of CITE-seq data. A.M.L. carried out the analysis of CITE-seq data. A.M.L., A.W. and M.S. performed in vivo transplantation experiments. A.M.L., F.Y.H. and A.W. performed ex vivo drug-screening experiments. A.M.L. and A.J. performed M-FISH experiments. A.M.L. carried out OGM experiments. B.R.M. contributed to data analysis. A.W., T.B., D.K., V.T., A.D. and L.B. contributed to patient sample and PDX processing. A.K., S.R., P.S., C.M.T., A.K.E. and K.M. provided samples and clinical information. A.M.L., J.O.K. and A.T. wrote the manuscript with support from K.G. and A.D.S. and additional contributions from all authors.
Peer review
Peer review information
Nature Genetics thanks Jonas Demeulemeester and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Funding
Open access funding provided by Deutsches Krebsforschungszentrum (DKFZ).
Data availability
Sequencing data from this study can be retrieved from the European Genome-phenome Archive (EGA) and ArrayExpress. Data from primary CK-AML cells and PDXs are available under the following accessions: Strand-seq and CITE-seq (EGA, EGAS00001007436); bulk RNA-seq (ArrayExpress, E-MTAB-14420). Human patient data stored at the EGA are managed by the EGA Data Access Committee, following their most current standards for patient-derived omics data. This ensures that the data remain nonidentifiable while being accessible to researchers, typically within 2 weeks of submitting a reasonable request to the committee. We also used publicly available databases as follows: human GRCh38 reference database (Ensembl: http://ftp.ensembl.org) and Molecular Signature Database (MSigDB: https://www.gsea-msigdb.org/gsea/msigdb).
Code availability
The computational software used in the present study includes scNOVA (https://github.com/jeongdo801/scNOVA), Mosaicatcher (https://github.com/friendsofstrandseq/mosaicatcher-pipeline), StrandPhaseR (https://github.com/daewoooo/StrandPhaseR), CONICSmat (https://github.com/diazlab/CONICS), Delly2 (https://github.com/dellytools/delly), NO_based_HSPC_classifier (https://github.com/jeongdo801/NO_based_HSPC_classifier), PloidyAssignR (https://github.com/lysfyg/PloidyAssignR), BWA47 (v.0.7.15), STAR60 (v.2.7.9a and v.2.5.3a), SAMtools61 (v.1.3.1), biobambam2 (ref. 48) (v.2.0.76), Sambamba62 (v.0.6.5), R63 (v.4.0.0), DESeq2 (ref. 64), Cell Ranger65 (v.6.0), Seurat66 (v.4.3.0.1), scran67 (v.1.28.2), AUCell55 (v.1.2.2.0), SingleR53 (v.2.2.0), Arriba59 (v.1.2.0), FlowJo (v.10.5.3), GraphPad Prism (v.9.3.1), Bionano Solve (v.3.7), Bionano Access (v.1.7.1) and BD FACSDiva. Analysis notebooks for the figures are available at https://github.com/amleppa/scNOVA-CITE_paper.
Competing interests
A.D.S. and J.O.K. have previously disclosed a patent application (no. EP19169090) that is relevant to this manuscript. A.K.E. received an honorarium from AstraZeneca for serving on their diversity, equity and inclusion advisory board, and her spouse has ownership interest and is employed by Karyopharm Therapeutics. The remaining authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Aino-Maija Leppä, Karen Grimes, Hyobin Jeong.
These authors jointly supervised this work: Ashley D. Sanders, Jan O. Korbel and Andreas Trumpp.
Contributor Information
Jan O. Korbel, Email: jan.korbel@embl.org
Andreas Trumpp, Email: a.trumpp@dkfz-heidelberg.de
Extended data
Extended data is available for this paper at 10.1038/s41588-024-01999-x.
Supplementary information
The online version contains supplementary material available at 10.1038/s41588-024-01999-x.
References
- 1. Rausch, T. et al. Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59–71 (2012).
- 2. Bochtler, T. et al. Clonal heterogeneity as detected by metaphase karyotyping is an indicator of poor prognosis in acute myeloid leukemia. J. Clin. Oncol. 31, 3898–3905 (2013).
- 3. Papaemmanuil, E. et al. Genomic classification and prognosis in acute myeloid leukemia. N. Engl. J. Med. 374, 2209–2221 (2016).
- 4. Mrózek, K. et al. Complex karyotype in de novo acute myeloid leukemia: typical and atypical subtypes differ molecularly and clinically. Leukemia 33, 1620–1634 (2019).
- 5. Rücker, F. G. et al. TP53 alterations in acute myeloid leukemia with complex karyotype correlate with specific copy number alterations, monosomal karyotype, and dismal outcome. Blood 119, 2114–2121 (2012).
- 6. Cosenza, M. R., Rodriguez-Martin, B. & Korbel, J. O. Structural variation in cancer: role, prevalence, and mechanisms. Annu. Rev. Genom. Hum. Genet. 23, 123–152 (2022).
- 7. Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90–94 (2011).
- 8. Wang, Y. et al. Clonal evolution in breast cancer revealed by single nucleus genome sequencing. Nature 512, 155–160 (2014).
- 9. Laks, E. et al. Clonal decomposition and DNA replication states defined by scaled single-cell genome sequencing. Cell 179, 1207–1221.e22 (2019).
- 10. Gawad, C., Koh, W. & Quake, S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).
- 11. Nam, A. S., Chaligne, R. & Landau, D. A. Integrating genetic and non-genetic determinants of cancer evolution by single-cell multi-omics. Nat. Rev. Genet. 22, 3–18 (2021).
- 12. Marine, J. C., Dawson, S. J. & Dawson, M. A. Non-genetic mechanisms of therapeutic resistance in cancer. Nat. Rev. Cancer 20, 743–756 (2020).
- 13. Jeong, H. et al. Functional analysis of structural variants in single cells using Strand-seq. Nat. Biotechnol. 41, 832–844 (2023).
- 14. Sanders, A. D., Falconer, E., Hills, M., Spierings, D. C. J. & Lansdorp, P. M. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs. Nat. Protoc. 12, 1151–1176 (2017).
- 15. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
- 16. Sanders, A. D. et al. Single-cell analysis of structural variations and complex rearrangements with tri-channel processing. Nat. Biotechnol. 38, 343–354 (2020).
- 17. Korbel, J. O. & Campbell, P. J. Criteria for inference of chromothripsis in cancer genomes. Cell 152, 1226–1236 (2013).
- 18. Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40 (2011).
- 19. Ottema, S. et al. Atypical 3q26/MECOM rearrangements genocopy inv(3)/t(3;3) in acute myeloid leukemia. Blood 136, 224–234 (2020).
- 20. Yamazaki, H. et al. A remote GATA2 hematopoietic enhancer drives leukemogenesis in inv(3)(q21;q26) by activating EVI1 expression. Cancer Cell 25, 415–427 (2014).
- 21. Lugthart, S. et al. Clinical, molecular, and prognostic significance of WHO type inv(3)(q21q26.2)/t(3;3)(q21;q26.2) and various other 3q abnormalities in acute myeloid leukemia. J. Clin. Oncol. 28, 3890–3898 (2010).
- 22. McClintock, B. The stability of broken ends of chromosomes in Zea mays. Genetics 26, 234–282 (1941).
- 23. Rosswog, C. et al. Chromothripsis followed by circular recombination drives oncogene amplification in human cancer. Nat. Genet. 53, 1673–1685 (2021).
- 24. Garsed, D. W. et al. The architecture and evolution of cancer neochromosomes. Cancer Cell 26, 653–667 (2014).
- 25. Müller, S., Cho, A., Liu, S. J., Lim, D. A. & Diaz, A. CONICS integrates scRNA-seq with DNA sequencing to map gene expression to tumor sub-clones. Bioinformatics 34, 3217–3219 (2018).
- 26. Fish, E. N. & Platanias, L. C. Interferon receptor signaling in malignancy: a network of cellular pathways defining biological outcomes. Mol. Cancer Res. 12, 1691–1703 (2014).
- 27. Shlush, L. I. et al. Tracing the origins of relapse in acute myeloid leukaemia to stem cells. Nature 547, 104–108 (2017).
- 28. Kawashima, N. et al. Comparison of clonal architecture between primary and immunodeficient mouse-engrafted acute myeloid leukemia cells. Nat. Commun. 13, 1624 (2022).
- 29. Dombret, H. et al. International phase 3 study of azacitidine vs conventional care regimens in older patients with newly diagnosed AML with >30% blasts. Blood 126, 291–299 (2015).
- 30. Kirshner, J. R. et al. Elesclomol induces cancer cell apoptosis through oxidative stress. Mol. Cancer Ther. 7, 2319–2327 (2008).
- 31. Kuusanmäki, H. et al. Erythroid/megakaryocytic differentiation confers BCL-XL dependency and venetoclax resistance in acute myeloid leukemia. Blood 141, 1610–1625 (2023).
- 32. Nechiporuk, T. et al. The TP53 apoptotic network is a primary mediator of resistance to BCL2 inhibition in AML cells. Cancer Discov. 9, 910–925 (2019).
- 33. Ding, L. et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481, 506–510 (2012).
- 34. Morita, K. et al. Clonal evolution of acute myeloid leukemia revealed by high-throughput single-cell genomics. Nat. Commun. 11, 5327 (2020).
- 35. Griffith, M. et al. Optimizing cancer genome sequencing and analysis. Cell Syst. 1, 210–223 (2015).
- 36. Layer, R. M., Chiang, C., Quinlan, A. R. & Hall, I. M. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 15, R84 (2014).
- 37. Griffiths, J. A., Scialdone, A. & Marioni, J. C. Using single-cell genomics to understand developmental processes and cell fate decisions. Mol. Syst. Biol. 14, e8046 (2018).
- 38. Grimes, K. et al. Cell-type-specific consequences of mosaic structural variants in hematopoietic stem and progenitor cells. Nat. Genet. 56, 1134–1146 (2024).
- 39. Rodriguez-Meira, A. et al. Single-cell multi-omics identifies chronic inflammation as a driver of TP53-mutant leukemic evolution. Nat. Genet. 55, 1531–1541 (2023).
- 40. Kuusanmäki, H. et al. Ex vivo venetoclax sensitivity testing predicts treatment response in acute myeloid leukemia. Haematologica 108, 1768–1781 (2023).
- 41. Waclawiczek, A. et al. Combinatorial BCL2 family expression in acute myeloid leukemia stem cells predicts clinical response to azacitidine/venetoclax. Cancer Discov. 13, 1408–1427 (2023).
- 42. Kornauth, C. et al. Functional precision medicine provides clinical benefit in advanced aggressive hematologic cancers and identifies exceptional responders. Cancer Discov. 12, 372–387 (2022).
- 43. Malani, D. et al. Implementing a functional precision medicine tumor board for acute myeloid leukemia. Cancer Discov. 12, 388–401 (2022).
- 44. Pabst, C. et al. GPR56 identifies primary human acute myeloid leukemia cells with high repopulating potential in vivo. Blood 127, 2018–2027 (2016).
- 45. Pabst, C. et al. Identification of small molecules that support human leukemia stem cell activity ex vivo. Nat. Methods 11, 436–442 (2014).
- 46. Falconer, E. et al. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat. Methods 9, 1107–1112 (2012).
- 47. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
- 48. Tischler, G. & Leonard, S. biobambam: tools for read pair collation based algorithms on BAM files. Source Code Biol. Med. 9, 1–18 (2014).
- 49. Porubsky, D. et al. Direct chromosome-length haplotyping by single-cell sequencing. Genome Res. 26, 1565–1574 (2016).
- 50. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
- 51. Hafemeister, C. & Satija, R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 20, 296 (2019).
- 52. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).
- 53. Aran, D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172 (2019).
- 54. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
- 55. Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
- 56. Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
- 57. Ng, S. W. et al. A 17-gene stemness score for rapid determination of risk in acute leukaemia. Nature 540, 433–437 (2016).
- 58. Geigl, J. B., Uhrig, S. & Speicher, M. R. Multiplex-fluorescence in situ hybridization for chromosome karyotyping. Nat. Protoc. 1, 1172–1184 (2006).
- 59. Uhrig, S. et al. Accurate and efficient detection of gene fusions from RNA sequencing data. Genome Res. 31, 448–460 (2021).
- 60. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
- 61. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
- 62. Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. & Prins, P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 31, 2032–2034 (2015).
- 63. R Core Team. R: A Language and Environment for Statistical Computing (Foundation for Statistical Computing, 2013).
- 64. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
- 65. Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
- 66. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
- 67. Lun, A. T., McCarthy, D. J. & Marioni, J. C. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 5, 2122 (2016).
Data Availability Statement
Sequencing data from this study can be retrieved from the European Genome-phenome Archive (EGA) and ArrayExpress. Data from primary CK-AML cells and PDXs are available under the following accessions: Strand-seq and CITE-seq (EGA, EGAS00001007436); bulk RNA-seq (ArrayExpress, E-MTAB-14420). Human patient data stored at the EGA are managed by the EGA Data Access Committee, following their most current standards for patient-derived omics data. This ensures that the data remain nonidentifiable while being accessible to researchers, typically within 2 weeks of submitting a reasonable request to the committee. We also used publicly available databases as follows: human GRCh38 reference database (Ensembl: http://ftp.ensembl.org) and Molecular Signature Database (MSigDB: https://www.gsea-msigdb.org/gsea/msigdb).