Abstract
Studies have revealed key genomic aberrations in pediatric acute myeloid leukemia (AML) based on Western populations. It is unknown to what extent the current genomic findings represent populations with different ethnic backgrounds. Here we present the genomic landscape of driver alterations of Chinese pediatric AML and discover previously undescribed genomic aberrations, including the XPO1-TNRC18 fusion. A comprehensive comparison between the Chinese and Western AML cohorts reveals a substantially distinct genomic alteration profile. For example, Chinese AML patients more commonly exhibit mutations in KIT and CSF3R, and less frequently exhibit mutations in genes of the RAS signaling pathway. These differences in mutation frequencies lead to the detection of previously uncharacterized co-occurring mutation pairs. Importantly, the distinct driver profile is clinically relevant. We propose a refined prognostic risk classification model that better reflects the adverse event risk for Chinese AML patients. These results emphasize the importance of genetic background in precision medicine.
Subject terms: Cancer genomics, Acute myeloid leukaemia, Acute lymphocytic leukaemia, Paediatric cancer
The genomic landscape of pediatric acute myeloid leukemia (AML) has mostly been characterised for Western populations. Here, the authors identify potential driver alterations in Chinese pediatric AML, which differ from Western populations, and propose a prognostic risk classification model.
Introduction
Among childhood cancers, leukemia is the most common malignancy1, and 15–20% of childhood leukemias are acute myeloid leukemia (AML). Although less common than acute lymphoblastic leukemia (ALL), patients with AML have inferior outcomes2,3. Genomic profiling based on next-generation sequencing (NGS) has elucidated the genomic landscape of pediatric AML and detected key molecular alterations4–8. Such alterations include fusions, like RUNX1-RUNX1T1 and CBFB-MYH11; KMT2A rearrangements; NUP98 rearrangements; and sequence mutations, like FLT3, KIT, WT1, CEBPA, and NPM1, which are among the most commonly mutated genes in pediatric AML. Further studies have been conducted to establish the association between key driver alterations and patient outcome. For example, KMT2A rearrangements or FLT3 ITD variants are associated with adverse outcome9–11, while RUNX1-RUNX1T1 fusion or mutations in NPM1 likely indicate a favorable prognosis12,13. These findings have substantially improved our understanding of the genetics and molecular complexity underlying AML and have also promoted the development of clinical methods for precision diagnosis and patient management.
Recent studies have revealed that genomic profiles significantly differ between pediatric and adult cancer, including AML4,5,14–16. Compared to adult AML, pediatric AML tends to exhibit higher mutation frequencies in MYC ITD and WT1, and less frequent mutations in DNMT3A and TP53. Meanwhile, studies in adult solid tumors have also shown that different ethnic backgrounds may have a profound impact on molecular drivers of disease development and progression17,18. Following this evidence, it is notable that although comprehensive genomic studies have revealed key genomic aberrations in pediatric AML, these observations have been primarily based on genomic profiling of patients from Western populations. There remains a lack of genomic profiling in Chinese AML patients, such that precision medicine in Chinese AML is largely biased.
In the present study, we report a comprehensive genomic and transcriptomic study of Chinese pediatric AML. We perform transcriptome sequencing to analyze the driver alterations in 292 pediatric AML cases and their correlations. We further identify a distinct driver profile by comparing our results to the mutation profile characterized in the Children’s Oncology Group-National Cancer Institute TARGET AML initiative, representing the Western pediatric AML cohort4. Finally, we demonstrate that the different driver profile identified in Chinese AML is clinically relevant, and we propose a refined risk classification model based on these results.
Results
Clinical characteristics of Chinese pediatric AML
We analyzed genomic alterations by studying 292 Chinese pediatric AML patients who were diagnosed and treated at Shanghai Children’s Medical Center (SCMC) from 2001–2018, and for whom adequate material from tumor cells was available (Supplementary Data 1). The clinical characteristics of the Chinese cohort were comparable to those of Western populations4. However, Chinese AML patients were younger, with a median age of 5.3 years compared to 10.6 years in the Western cohort (Supplementary Data 2). Patients enrolled in this study were treated on the AML-SCMC-2009-A protocol (n = 196, 67.1%), AML-SCMC-2009-B (n = 78, 26.7%), SCMC-AML-XH99 (n = 10, 3.4%), and other protocols (n = 6, 2.1%), while information was missing for two patients (0.7%). No significant difference in patient outcome was observed between AML-SCMC-2009-A and AML-SCMC-2009-B (P = 0.207) or between patients treated in different periods (2009–2013 versus 2014–2018, P = 0.655) (Methods section and Supplementary Fig. 1). Transcriptome sequencing (RNA-seq) was applied to all tumor samples (Supplementary Data 3) and analyzed for both sequence mutations and gene rearrangements (Methods section).
Chimeric fusions identified in Chinese pediatric AML
Analysis of RNA-seq data revealed 224 rearrangements involving 97 genes, in 200 out of 292 patients (68.5%) (Fig. 1a and Supplementary Data 4). In concordance with previous reports, the most prevalent fusions detected in Chinese AML patients included RUNX1-RUNX1T1 (n = 82, 28.1%), KMT2A rearrangements (n = 45, 15.4%), and NUP98 rearrangements (n = 17, 5.8%). Additionally, we identified a recurrent in-frame fusion involving the XPO1 gene in three patients in our cohort (Fig. 1b), including the XPO1-TNRC18 fusion (n = 2) and XPO1-MLLT10 fusion (n = 1). Notably, the XPO1-TNRC18 fusion was also detected in an additional two AML patients who were recently diagnosed at SCMC. Interestingly, all four cases carrying XPO1-TNRC18 belonged to the M7 group and did not harbor any known AML-driving fusion, supporting that this fusion was the potential driver of AML in these cases, and might define a previously unclassified molecular subtype within this FAB group. The XPO1 gene is mainly involved in the nuclear export of proteins and RNAs. Hotspot mutations in XPO1 have been described in several tumor types, including B-cell malignancies, and are reportedly associated with tumorigenesis19–22. On the other hand, the functions of TNRC18 involve chromatin and DNA binding, and have not been associated with tumors. The XPO1-TNRC18 chimeric fusion protein conserved both the exportin 1-like domain of XPO1 and the bromo-adjacent homology domain of TNRC18. Further experimental investigation is required to elucidate the function of this fusion protein. Overall, the cases with XPO1-TNRC18 accounted for 5.0% of the AML M7 subtype (2 of 40 in our cohort). We also identified other gene rearrangements in our current analysis that have not been observed previously, including PTPRA-FUS, ZEB2-ATIC, and MSI2-UBE3C (n = 1 each).
Distinct profile of sequence mutations in Chinese pediatric AML
We optimized the variant calling and processing pipeline for analyzing potential somatic and cancer-associated sequence mutations from tumor-only RNA-seq data (Methods section). To first evaluate the performance of this approach, we applied it to RNA-seq data collected from 10 previously published pediatric ALL cases23, for which we also had matched whole-genome sequencing (WGS) data for both tumor and remission samples from each case. Results demonstrated that our analysis of tumor-only RNA-seq data successfully identified 85.7% of the driver mutations discovered by WGS (18 of 21) (Supplementary Data 5 and Supplementary Fig. 2a–d). Meanwhile, RNA-seq analysis detected an additional nine potential driver mutations, including NRAS G13D and KRAS G13D, among others. Of these 9 mutations, 8 had been included in the capture validation experiment using genomic DNA in our previous study. Among these 8 mutations, 7 (87.5%) (Supplementary Data 5 and Supplementary Fig. 2e) were successfully validated. Notably, all nine mutations were subclonal (vaf < 0.3) and were missed in WGS analysis due to insufficient coverage. These results demonstrate the power of analyzing sequence mutations from RNA-seq, especially for detection of subclonal variants.
Using this method, we next identified a total of 975 nonsynonymous sequence mutations affecting 305 genes (Supplementary Data 6). These mutations included 707 single-nucleotide variants (SNVs) and 268 insertions/deletions (indels), with a median of four mutations detected per case (range 0–10). We further analyzed potential driver mutations by applying a combined strategy, integrating mutation pathogenicity analysis using PeCanPIE24 with mutation cluster analysis using MutClan (Methods section and Supplementary Data 6). A total of 572 potential driver variations were identified affecting 73 genes. Moreover, 24 internal tandem duplications (ITD, Supplementary Data 7) were detected with CICERO25, affecting FLT3 and MYC. Overall, driver sequence mutations were detected in 81.8% of cases (n = 239), with a median of three driver mutations detected per case (range, 0–8). Our subsequent analysis focused only on driver mutations. Among the drivers, 10 genes were recurrently mutated in over 5% of patients (Fig. 2a), including FLT3 (n = 56 patients, 19.2%), KIT (n = 54, 18.5%), NRAS (n = 52, 17.8%), KRAS (n = 27, 9.2%), CEBPA (n = 24, 8.2%), ASXL2 (n = 21, 7.2%), PTPN11 (n = 18, 6.2%), CSF3R (n = 15, 5.1%), GATA2 (n = 15, 5.1%), and JAK2 (n = 15, 5.1%). Notably, we uncovered driver genes in pediatric AML, including mutations in LZTR1 (n = 2) and SPOP (n = 1), which have not been previously associated with pediatric AML, as well as loss-of-function mutations in ARID2 (n = 2) and SH2B3 (n = 2), which have been reported as pathogenic in other pediatric cancers, such as ALL, but not in AML.
Although genomic mutations in most of these genes have been previously reported in pediatric AML, we observed a dramatically different profile of mutation occurrence in the Chinese cohort compared to the TARGET AML study representing the Western population. Among the 21 genes or hotspots within a driver gene recurrently mutated in >4% of patients in either the SCMC or TARGET cohort, 10 (47.6%) showed a significantly different mutation frequency (P < 0.05, two-sided Fisher’s exact test) (Fig. 2b, c and Supplementary Fig. 3 and Supplementary Data 8a). Among these, four showed a higher mutation frequency in Chinese patients, including ASXL2 (7.3% vs. 3.4% in the Chinese and Western cohorts, respectively, P = 0.025), JAK2 (5.2% vs. 0.0%, P < 0.001), CSF3R cytoplasmic domain (CSF3R-CY, 4.5% vs. 0.6%, P < 0.001), and KIT exon 17 (KIT-E17, 13.9% vs. 7.9%, P = 0.020). On the other hand, the Chinese patients showed less frequent mutations in FLT3 (19.1% vs. 31.7%, P = 0.003) and FLT3 ITD (7.3% vs. 16.1%, P = 0.002), NRAS (17.7% vs. 31.5%, P < 0.001), WT1 (2.4% vs. 13.7%, P < 0.001), NPM1 (2.1% vs. 8.6%, P < 0.001), and TET2 (1.4% vs. 4.4%, P = 0.036). Notably, the differences in mutation frequency between Chinese and Western patients were mostly contributed by patients aged 3–14 years (Supplementary Fig. 4a and Supplementary Data 8a). Furthermore, we found that FLT3 and NRAS mutations were more frequently detected in younger patients in the Chinese cohort (FLT3, 14 out of 55, 25.5%; NRAS, 20 out of 51, 39.2%) than in TARGET (FLT3, 20 out of 158, 12.7%, P = 0.033; NRAS, 37 out of 157, 23.6%, P = 0.046, Supplementary Fig. 4b and Supplementary Data 8b).
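The cohort comparison above relies on a two-sided Fisher’s exact test applied to a 2 × 2 table of mutated versus wild-type patients in each cohort. The following is a minimal sketch of such a test using only the Python standard library; it is not the authors’ code, and the function name fisher_exact_two_sided is an assumption made for illustration. The example counts (14 of 55 versus 20 of 158 younger patients with FLT3 mutations) are taken from the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are cohorts; columns are mutated and wild-type patient counts.
    The p-value is the summed probability of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutated patients in the first cohort.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# FLT3 in younger patients, counts from the text: 14/55 (SCMC) vs 20/158 (TARGET).
p_flt3_young = fisher_exact_two_sided(14, 55 - 14, 20, 158 - 20)
print(f"FLT3, younger patients: P = {p_flt3_young:.3f}")
```

An exact test is a natural choice here because several of the compared categories contain few mutated patients, where a chi-squared approximation would be unreliable.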
Landscape of driver genomic alterations unveiled a different pattern of co-mutations in Chinese pediatric AML
We further integrated different variant types to analyze the genomic landscape within Chinese pediatric AML. In addition to the above-described fusions and sequence mutations, we also analyzed the CBL exon 8/9 deletion identified in RNA-seq analysis (Methods), which was recurrently deleted in AML5,26. This focal deletion was identified in 12 cases in our cohort (4.1%) (Supplementary Data 9 and Supplementary Fig. 5), comparable to the TARGET cohort (4.2% vs 3.6% for patients <15 years, P = 0.704). Overall, driver mutations identified in 93 genes were grouped into six pathways (Fig. 3a and Supplementary Data 10). The most frequently mutated pathways in AML were transcription regulation, epigenetics, and RAS signaling, which were mutated in 69.2%, 36.6%, and 32.2% of patients, respectively. Mutations activating other signaling pathways, including JAK-STAT and others, were detected in a total of 50.7% of patients. Notably, the RAS signaling pathway was significantly less frequently mutated in Chinese patients (32.2% vs. 49.0%, P < 0.001), consistent with the low mutation frequency observed for individual genes in the RAS pathway.
We next investigated the pairwise relationships of driver mutations detected in Chinese AML patients. A total of 150 pairs of genes, mutation hotspots, or FAB groups (143 after excluding pairs between fusion partners or different domains within a single gene) were found to be significantly concurrently or exclusively mutated (P < 0.05, two-sided Fisher’s exact test) (Fig. 3b and Supplementary Data 11). Through this analysis, we established several associations in AML, including the co-mutation of CSF3R and KIT, ASXL2 and KIT, DHX15 and KIT, and DNM2 and JAK3, among others. The observation of these previously undescribed associations could be only partly explained by the higher mutation frequency of these genes among Chinese AML patients, as completely different mutation associations were also discovered in different populations. For example, CSF3R was found to be significantly co-mutated with KIT in the Chinese cohort, but reportedly co-occurs with CEBPA in the Western population4. Similarly, NF1 mutations were found to be significantly associated with the FAB M5 group in Chinese patients (P < 0.001) but were previously found to be co-mutated with CBFB-MYH11 fusion in the M4 group6.
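The pairwise analysis above starts from a 2 × 2 contingency table for each pair of alterations, which is then passed to the significance test. The sketch below shows one way to build that table and label the direction of an association by its odds ratio. The patient IDs, the alteration sets, and the function names are hypothetical, and the 0.5 continuity correction (Haldane-Anscombe) is an assumption added to avoid division by zero, not a step described in the text.

```python
from itertools import combinations


def cooccurrence_table(patients, alt_a, alt_b):
    """Count patients carrying both, only alt_a, only alt_b, or neither.

    `patients` maps a patient ID to the set of driver alterations it carries.
    """
    both = only_a = only_b = neither = 0
    for alterations in patients.values():
        has_a, has_b = alt_a in alterations, alt_b in alterations
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def direction(both, only_a, only_b, neither):
    """Label a pair by its odds ratio, adding 0.5 to every cell."""
    odds_ratio = ((both + 0.5) * (neither + 0.5)) / ((only_a + 0.5) * (only_b + 0.5))
    if odds_ratio > 1:
        return "co-occurring"
    if odds_ratio < 1:
        return "mutually exclusive"
    return "neutral"


# Hypothetical toy cohort, for illustration only.
toy_patients = {
    "P1": {"KIT", "CSF3R"},
    "P2": {"KIT", "CSF3R"},
    "P3": {"NRAS"},
    "P4": {"FLT3"},
    "P5": {"KIT"},
}

for a, b in combinations(["KIT", "CSF3R", "NRAS"], 2):
    table = cooccurrence_table(toy_patients, a, b)
    print(a, b, table, direction(*table))
```

The direction label only describes the tendency in the table; whether a pair is reported depends on the P value from the exact test applied to the same four counts.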
Impact of driver genomic alterations on clinical outcomes
Survival analysis revealed driver genomic aberrations associated with patient prognosis (Supplementary Data 12 and Fig. 4a). Consistent with previous reports, we found that the CBFB-MYH11 fusion was associated with favorable outcome, while NUP98-KDM5A/NSD1, FUS-ERG, and CBFA2T3-GLIS2 were associated with unfavorable prognosis7,11,12,27–30. On the other hand, we noticed that patients with RUNX1-RUNX1T1 fusion and KMT2A rearrangements exhibited similar and intermediate 5-year event-free survival (EFS) rates of 55.3% (CI 44.8–68.2%) and 56.1% (CI 42.9–73.3%), respectively. Regarding the mutations, we found that mutations in CEBPA, NPM1, and GATA2 were associated with favorable prognosis, while mutations in RUNX1 and FLT3 ITD were associated with worse prognosis, which is consistent with previous findings in Western cohorts31–38. Patients carrying the driver genomic aberrations above showed comparable prognosis in the Chinese and Western cohorts (Supplementary Results and Supplementary Fig. 6). Besides these previously established associations, we also found that patients carrying UBTF mutations showed worse prognosis compared to wild-type patients (Supplementary Fig. 7a). Furthermore, the impact of these driver variants on patient prognosis was influenced by co-occurrence relationships. Patients carrying FLT3 variants together with UBTF mutation, RUNX1 mutation, or NUP98 rearrangements exhibited worse prognosis compared to patients carrying FLT3 variants alone (Supplementary Fig. 7b–d). On the other hand, patients with FLT3 ITD and NPM1 mutation together showed good outcomes in our current analysis (Supplementary Fig. 7e). We also applied univariate and multivariate Cox regression analysis in this study (Supplementary Data 13a, b).
While univariate Cox regression revealed results consistent with those above, multivariate Cox regression showed that CBFA2T3-GLIS2, FUS-ERG, NUP98 rearrangements, FLT3 ITD and RUNX1 mutation were independently associated with adverse prognosis, while GATA2 was independently associated with favorable outcome, with P < 0.05. UBTF and CEBPA mutations were not included in the multivariate Cox regression model due to their significant co-mutation with FLT3 ITD and GATA2, respectively. Furthermore, we combined the above five genomic factors associated with worse outcome into a high-risk genotype and carried out the multivariate Cox regression analysis again, together with CR1 status and GATA2. Results showed that the combined high-risk genotype was an independent risk factor significantly associated with inferior prognosis (P < 0.001).
Within our cohort, 78 patients carried at least one of the above-described alterations associated with favorable (n = 28) or poor outcome (n = 50). We next examined whether the remaining patients showed any other clinically relevant alterations. To this end, we focused on the patients who lacked any aberrations having well-established associations with prognosis. In addition to the 78 patients above, 24 were further excluded because they carried mutations in TP53 (n = 3) or ASXL1 (n = 1), DEK-NUP214 fusion (n = 1), or confirmed chromosomal abnormality (n = 19, including complex karyotype, monosomal karyotype, −7, −17, del(5q)). We found that for the remaining patients, treatment response after the first cycle of induction therapy was among the factors most significantly associated with patient prognosis (P < 0.001, Supplementary Fig. 8a). Patients who did not achieve complete remission after one cycle of induction (CR1) showed adverse outcomes, similar to patients carrying genomic variants associated with poor outcome (P = 0.155, Supplementary Fig. 8b). On the other hand, while patients with CR1 showed relatively good outcomes, these patients had a worse prognosis compared to patients carrying variants related to favorable prognosis (P = 0.081) (Supplementary Fig. 8c), indicating a mixture of patients within this CR1 group.
We next analyzed the association between genomic aberrations and prognosis within each fusion subtype of the 129 CR1 patients, including RUNX1-RUNX1T1 (n = 53), KMT2A rearrangement (n = 30), cases with other fusions (n = 12), and fusion-negative cases (n = 34). We found that CSF3R (P = 0.038) and KIT-E17 (P = 0.064) mutations in RUNX1-RUNX1T1 patients were associated with adverse prognosis (Supplementary Fig. 9). Notably, CSF3R and KIT exon 17 mutations significantly co-occurred within the RUNX1-RUNX1T1 fusion subtype. Indeed, CR1 patients within the RUNX1-RUNX1T1 fusion subtype who carried both CSF3R and KIT exon 17 mutations exhibited a significantly worse prognosis compared to patients carrying either one or none of these two mutations (P = 0.012) (Fig. 4b).
Revised risk classification for Chinese pediatric AML patients
We next revised the European LeukemiaNet (ELN) genetic risk classification32 model based on the clinical relevance established in this study. The proposed SCMC-pAML model (Fig. 5a and Supplementary Data 14) featured adjusted risk classification of several genetic aberrations. For example, FUS-ERG, CBFA2T3-GLIS2, NUP98-KDM5A and NUP98-NSD1 were grouped into the high-risk category (HR), while KMT2A rearrangements were classified into an intermediate-risk group (IR). Importantly, patients with the RUNX1-RUNX1T1 fusion, which has been associated with favorable prognosis in previous models, were further subdivided based on CR1 status and subsequently acquired mutations, including CSF3R and KIT-E17 mutations. Compared to the ELN model, the SCMC-pAML model identified more HR patients and fewer low-risk (LR) patients, along with a similar number of IR patients (Fig. 5b): with the SCMC-pAML model vs. the ELN model, the percentages of LR, IR, and HR, respectively, were 19.9% vs. 35.4%, 36.2% vs. 35.8%, and 43.9% vs. 28.8% (two-sided Fisher’s exact test, P < 0.001). Although the risk groups classified by both models revealed significant differences in patients’ prognoses, the LR and IR groups classified with SCMC-pAML exhibited more favorable outcomes (Fig. 5c), with 5-year EFS rates of 84.9% (95% CI 75.8–95.1%) and 74.5% (95% CI 66.2–83.9%), respectively. These rates were significantly higher than in the LR (64.2%, 95% CI 54.8–75.4%, log-rank test, P = 0.02) and IR (57.3%, 95% CI 48.1–68.4%, log-rank test, P = 0.01) groups stratified with the ELN model. On the other hand, the SCMC-pAML HR group showed a trend of worse prognosis compared to the ELN HR group, with 5-year EFS rates of 18.2% (95% CI 12.0–27.7%) vs. 30.9% (95% CI 21.8–43.7%), without a statistically significant difference (log-rank test, P = 0.40).
Furthermore, being stratified into the HR group with the SCMC-pAML model was identified as an independent risk factor in a multivariable Cox model, with a significantly increased risk of adverse events in SCMC-pAML HR patients (hazard ratio 7.7, 95% CI 3.5–17.1, P < 0.001) (Supplementary Data 15).
Discussion
Here we present the comprehensive genomic landscape of driver variants in a large cohort of Chinese pediatric AML. Through comparison to the driver landscape defined in the Western population, we first delineated a landscape of driver variants in Chinese AML, recapitulating previous findings. Furthermore, our analyses revealed substantial distinctions between the two cohorts, including less frequent mutations in the RAS signaling pathway and higher mutation rates in KIT and CSF3R, among others. These differences between Chinese and Western patients could be replicated by analyzing genomic aberrations reported by two independent studies representing Japanese pediatric AML39 and French40 pediatric core binding factor AML (Supplementary Fig. 10 and Supplementary Data 16). Moreover, by examining the co-mutations between pairs of genes, we demonstrated that the cohorts also differed in the co-occurrence of gene pairs. These results show an unexpected difference in genomic aberrations between different populations with the same clinically defined disease. Although the cause has yet to be determined, the observed difference might be related to genetic or environmental factors. As is increasingly appreciated, mutagenesis can be affected by local structures within the genome, including DNA accessibility and gene transcription, among others41. Genetic polymorphisms associated with these genomic features might contribute to the different mutation frequencies observed among populations.
Importantly, the distinct driver variant profile of Chinese AML was relevant to patient outcomes. For example, CSF3R mutation reportedly co-occurs with CEBPA mutation in a Western cohort. However, in our Chinese cohort, we found significant co-occurrence between CSF3R and KIT. Moreover, this co-occurrence, specifically between CSF3R and mutation in exon 17 of KIT, was associated with worse event-free survival among CR1 patients of the RUNX1-RUNX1T1 fusion subtype. Besides the five patients reported in the current study, we identified two additional RUNX1-RUNX1T1-positive patients diagnosed and treated at SCMC who reached CR1 and carried both CSF3R and KIT-E17 mutations. The first patient relapsed 8 months after diagnosis and was found to carry CSF3R L780fs (VAF = 0.306), KIT N822K (VAF = 0.165), and KIT D820G (VAF = 0.012) mutations. The second patient was enrolled into RNA-seq analysis during relapse. We detected CSF3R T618I (VAF = 0.459) and KIT D816V (VAF = 0.118) in the relapsed sample. Both mutations could be detected at diagnosis with targeted sequencing, each with a low VAF (CSF3R T618I VAF = 0.007 (30/4008) and KIT D816V VAF = 0.012 (75/6001)). Both patients relapsed within one year from diagnosis, consistent with the poor prognosis observed in the current cohort. All 7 patients (including the two described above) in this category reached CR1 during the treatment, suggesting that the CSF3R and KIT-E17 mutations did not confer resistance to current chemotherapy but indicated a higher risk of relapse. The mechanism remains to be investigated.
Consistent with the EFS analysis, similar effects of the driver genomic aberrations on patient prognosis were observed when analyzing overall survival (OS) (Supplementary Fig. 11). Based on these observations, we proposed the SCMC-pAML model to more precisely reflect genomic aberration-based risk classification in Chinese pediatric AML patients. The refined model significantly improved the risk stratification, as shown in the multivariable Cox model. In our current analysis, the SCMC-pAML model could influence and refine the clinical risk stratification of 35.6% of patients (104 out of 292), including a total of 60 patients re-stratified into the HR group. The performance of this revised model requires further validation with multi-center clinical studies. However, its power to precisely classify HR patients can already potentially guide the development of specific treatments for this patient group, as a next step of precision medicine.
In addition to the distinct profile of mutation frequencies, our analyses also identified several potential drivers, including an in-frame fusion involving XPO1. This fusion was previously reported in one pediatric B-ALL from the Children’s Oncology Group42 and one pediatric AML from a Swedish group43. In the current study, we showed that the XPO1-TNRC18 fusion was recurrent and enriched in pediatric AML-M7. Patients carrying the XPO1 fusion exhibited no other oncogenic fusion, indicating that this fusion was likely the driver in these tumors. Moreover, we have observed more cases carrying this in-frame fusion through real-time genomic analysis of AML patients diagnosed at SCMC, and all presently discovered cases belong to the M7 subtype. Notably, an inhibitor targeting XPO1 is available, suggesting a potential treatment target in pediatric AML. The E571K mutation was identified as a hotspot mutation in XPO1 in multiple types of cancer19–22. This mutation is located in the nuclear export signal (NES)-binding groove of the XPO1 protein and was suggested to affect the binding preferences of XPO1 for nuclear export proteins and RNA19,22,44,45, and to play an oncogenic role in cancer. On the other hand, the role of TNRC18 in cancer remains unclear. The XPO1-TNRC18 fusion protein contains both the wild-type NES-binding groove of XPO1 and the BAH (Bromo Adjacent Homology) domain of the TNRC18 protein, which is associated with transcriptional silencing and chromatin remodeling through recognizing histone modification or protein-protein interaction46,47. The fusion of these domains might result in mislocalization of chromatin modification proteins in the cell. The molecular mechanism of this fusion in leukemia remains to be investigated.
By analyzing transcriptome sequencing data of AML tumor cells, we systematically analyzed both fusions and sequence mutations in a single experiment. We demonstrated that the mutation analysis from RNA-seq was highly consistent in driver mutation detection, as compared to DNA-based analysis, including WES and WGS. Furthermore, the RNA-seq-based mutation analysis could achieve even higher coverage (>1000×) than WES for oncogenes. This reflects the high transcription levels of these genes, which further increase the sensitivity for detecting subclonal mutations in these genes, which are common in leukemias48–50. On the other hand, this approach would have limitations for analyzing genes that were not transcribed or were down-regulated in transcription, as well as in analyzing copy number aberrations. With these limitations, our analysis might have underestimated the occurrence of alterations in tumor suppressor genes.
In summary, here we characterized the comprehensive genomic landscape of Chinese pediatric AML. Our results unveiled a clinically relevant mutation profile that was distinct from that of the Western cohort, in terms of both mutation frequency and patterns of mutation co-occurrence. These findings further elucidate the complexity of pediatric AML and highlight the importance of considering ethnic background when establishing risk stratification for clinical management in the era of precision medicine.
Methods
Patient samples
Bone marrow samples were obtained from 292 patients diagnosed with AML between 2001 and 2018 in the Department of Hematology and Oncology, Shanghai Children’s Medical Center (SCMC). The research was approved by the Ethics Committee at SCMC. Written informed consent was obtained from the parents of all patients.
Treatment
Patients enrolled in the AML-SCMC-2009 protocol were treated with ten courses of chemotherapy in three treatment phases. During induction treatment, a standard 3 + 7 regimen was given in the first course, including daunorubicin (DNR) 40 mg/m² per day on days 1–3, cytarabine (Ara-C) 100 mg/m² every 12 h on days 1–7, and etoposide (VP-16) 100 mg/m² per day on days 5–7. After evaluation at the end of the first induction, patients were treated with (Group A) or without (Group B) anthracycline in the subsequent treatment. During the second induction, patients received mitoxantrone 10 mg/m² on days 1–3 (Group A) or homoharringtonine (HHT) (Group B) instead of the DNR used in the first induction. In the consolidation phase, four courses of high-dose Ara-C (3.0 g/m² for six doses at 12 h intervals) were given, combined with DNR 40 mg/m² (hAD), mitoxantrone 10 mg/m² (hAM), or VP-16 100 mg/m² (hAE) for 2 days in Group A, or with HHT 3 mg/m² for 5 consecutive days (hAH) substituting for DNR and mitoxantrone in Group B. During maintenance therapy, patients were given 4 cycles of mercaptopurine 75 mg/m² every day and Ara-C 75 mg/m² every 12 h for 7 days in Group A, or 4 cycles of HHT 3 mg/m² every day for 9 days and Ara-C 75 mg/m² every 12 h for 7 days in Group B. Triple intrathecal therapy with methotrexate, cytarabine, and hydrocortisone at an age-adjusted dose was given ten times (for FAB M4 and M5) or four times as central nervous system prophylaxis. Failure to achieve complete remission after the first induction therapy was an indication for hematopoietic stem cell transplantation.
Transcriptome sequencing (RNA-seq) and analysis
Total RNA was extracted from frozen bone marrow cells of AML samples with TRIzol, and RNA integrity was checked (RIN > 6). RNA-seq experiments were performed according to the manufacturer’s instructions with a total RNA-seq or mRNA-seq protocol (Supplementary Data 1). For strand-specific sequencing, cDNA libraries were prepared with the Illumina TruSeq Stranded Total RNA Kit. The purified cDNA libraries were sequenced on the Illumina HiSeq 2000, HiSeq X Ten or NovaSeq 6000 system (Illumina, United States) with 150 bp paired-end reads. RNA-seq data were aligned to human genome version GRCh37-lite with STAR51, and duplicates were marked with Picard. Single nucleotide variants (SNVs) and indels were detected with MuTect252 and Rnaindel53. Variants were annotated with VEP54 and filtered with in-house code, as described below, to retain variants that were likely somatic and/or associated with cancer. Fusions were analyzed with CICERO25 and FusionCatcher55. Internal tandem duplications (ITDs) were analyzed with CICERO5,56.
SNV and Indel analysis
The following steps were applied to identify potential somatic and/or cancer-associated variants. First, variants were excluded if: (1) in Ig/TCR region; (2) mutant reads <3 or depth <8; (3) frequency in 1000 Genomes database >0.001; (4) mutation type did not match missense variant, stop gained, inframe insertion/deletion, frameshift, splice acceptor/donor, stop lost, start lost, transcript amplification, protein altering variant, splice region variant or coding sequence variant; (5) in the variant artifact list constructed with the 292 cases analyzed in this study. The variants were further grouped into 3 levels. L1 variants met any of the following criteria: (1) Recurrent (reported in ≥3 tumors in the COSMIC database57) and variant allele frequency (vaf) ≥0.1; (2) Previously reported in pediatric cancer genomic projects including PCGP and TARGET; (3) Loss of function variants (frameshift, splicing and stop gained) in known tumor suppressor genes; (4) “Pathogenic” or “Likely Pathogenic” in ClinVar; (5) vaf > 0.1, ‘deleterious’ and ‘possibly_damaging’ in SIFT and PolyPhen analysis, in COSMIC cancer gene census and absent from 1000 Genomes database. Variants that did not meet the above criteria were grouped into L2 if: Recurrent (reported in ≥3 tumors in the COSMIC database) and vaf ≤ 0.1 or not in 1000 Genomes database. All other variants were grouped into L3. L1 and L2 mutations were considered to be putative somatic and/or cancer associated. Variants were manually curated to exclude artifacts. Furthermore, nonsynonymous variants detected in the genes known to be recurrently mutated in AML (including KIT, CEBPA, WT1 among others) were manually curated.
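The exclusion rules and the three-level grouping above can be sketched as two small functions. This is an illustrative reading of the criteria, not the in-house code: the Variant record and its field names are hypothetical, the consequence names follow VEP’s vocabulary, and the L2 rule is interpreted as “(recurrent and vaf ≤ 0.1) or absent from 1000 Genomes”, which is an assumption because the text leaves the grouping of “and”/“or” open.

```python
from dataclasses import dataclass
from typing import Optional

# VEP consequence terms retained by exclusion rule (4).
KEPT_CONSEQUENCES = {
    "missense_variant", "stop_gained", "inframe_insertion", "inframe_deletion",
    "frameshift_variant", "splice_acceptor_variant", "splice_donor_variant",
    "stop_lost", "start_lost", "transcript_amplification",
    "protein_altering_variant", "splice_region_variant", "coding_sequence_variant",
}


@dataclass
class Variant:  # hypothetical record; field names are illustrative only
    consequence: str
    mutant_reads: int
    depth: int
    vaf: float
    af_1000g: Optional[float] = None  # None means absent from 1000 Genomes
    cosmic_tumors: int = 0
    in_ig_tcr: bool = False
    in_artifact_list: bool = False
    in_pcgp_or_target: bool = False
    lof_in_tumor_suppressor: bool = False
    clinvar: str = ""
    sift: str = ""
    polyphen: str = ""
    in_cancer_gene_census: bool = False


def passes_filters(v: Variant) -> bool:
    """Apply exclusion rules (1) to (5)."""
    too_common = v.af_1000g is not None and v.af_1000g > 0.001
    return not (v.in_ig_tcr or v.mutant_reads < 3 or v.depth < 8 or too_common
                or v.consequence not in KEPT_CONSEQUENCES or v.in_artifact_list)


def tier(v: Variant) -> str:
    """Assign L1, L2 or L3 to a variant that passed the filters."""
    absent_1000g = v.af_1000g is None
    recurrent = v.cosmic_tumors >= 3
    is_l1 = (
        (recurrent and v.vaf >= 0.1)
        or v.in_pcgp_or_target
        or v.lof_in_tumor_suppressor
        or v.clinvar in {"Pathogenic", "Likely Pathogenic"}
        or (v.vaf > 0.1 and v.sift == "deleterious"
            and v.polyphen == "possibly_damaging"
            and v.in_cancer_gene_census and absent_1000g)
    )
    if is_l1:
        return "L1"
    # Assumed reading of the L2 rule (see the lead-in).
    if (recurrent and v.vaf <= 0.1) or absent_1000g:
        return "L2"
    return "L3"


# Hypothetical variant: recurrent in COSMIC with a clonal allele frequency.
example = Variant("missense_variant", mutant_reads=10, depth=50, vaf=0.3, cosmic_tumors=5)
print(passes_filters(example), tier(example))
```

Because L1 is checked first, a recurrent variant with vaf exactly 0.1 is assigned to L1 rather than L2, which resolves the overlap between the two thresholds.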
Driver mutation analysis
PeCanPIE24 and MutClan (Cui B., Sun H., Rao J., Zhao S., Wang H., Liu T., Wang R., Shen S., Liu Y., manuscript in preparation) analyses were applied to identify driver mutations. MutClan was designed to identify mutations that fall within a cluster of previously reported somatic mutations. A total of 6,975,733 published somatic mutations in 983 pediatric tumors collected from St. Jude Cloud58 were used to construct the mutation cluster background. A mutation was considered a potential driver if it was classified as gold in the PeCanPIE analysis or was significantly located in a mutation cluster (FDR q-value < 0.05). For mutations in the TARGET cohort, we included the multi-platform verified somatic genomic variants reported by Bolouri et al.4, including both the discovery and validation datasets. PeCanPIE analysis was applied to the TARGET mutations, and only mutations classified with a medal were included in the analysis. Results were manually reviewed for each mutation and visualized using ProteinPaint59.
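The driver rule above combines a categorical call (PeCanPIE gold) with an FDR threshold on the cluster test. As a minimal sketch, the Python example below computes q-values with the Benjamini–Hochberg procedure and applies the "gold or q < 0.05" rule. The choice of Benjamini–Hochberg is an assumption, since the FDR method used by MutClan is not stated here, and the function names `benjamini_hochberg` and `is_potential_driver` are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the q-values
    # stay monotone: q_(k) = min over j >= k of p_(j) * m / j.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def is_potential_driver(pecanpie_medal: str, cluster_q: float,
                        q_cutoff: float = 0.05) -> bool:
    """Gold in PeCanPIE, or significantly located in a mutation cluster."""
    return pecanpie_medal == "gold" or cluster_q < q_cutoff
```

For instance, a mutation without a gold medal would still be retained under this rule if its cluster q-value were 0.02, whereas the same mutation with a q-value of 0.2 would not.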
Evaluation of driver mutation detection with RNA-seq
Ten ALL diagnosis samples with matched RNA-seq and WGS data published previously were collected23. RNA-seq data were analyzed for SNVs and indels as described above. SNVs and indels detected with WGS were used as the benchmark. PeCanPIE was applied to all mutations, and only mutations classified as gold or silver were included in this analysis. For mutations detected specifically in RNA-seq, capture sequencing data from the previously published study23 were curated whenever available.
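A comparison of this kind reduces to set operations on variant keys. The sketch below is illustrative only: the function `compare_calls`, the key layout `(sample, chromosome, position, ref, alt)` and the toy values in the usage note are hypothetical and are not data from this study.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (sample, chromosome, position, ref, alt)
VariantKey = Tuple[str, str, int, str, str]


def compare_calls(rna_calls: Set[VariantKey],
                  wgs_calls: Set[VariantKey]) -> Dict[str, float]:
    """Summarize RNA-seq calls against a WGS benchmark set."""
    shared = rna_calls & wgs_calls
    return {
        "shared": len(shared),
        "rna_only": len(rna_calls - wgs_calls),   # candidates for capture-seq review
        "wgs_only": len(wgs_calls - rna_calls),   # benchmark variants missed by RNA-seq
        "sensitivity": len(shared) / len(wgs_calls) if wgs_calls else float("nan"),
    }
```

With a toy benchmark of four WGS variants, of which three are also called from RNA-seq, plus one RNA-seq-only call, the summary would report a sensitivity of 0.75 and one RNA-only variant to check against capture sequencing.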
Statistical analysis
Associations between mutations were examined using Fisher’s exact test. Kaplan–Meier estimation, the log-rank test and Cox regression analysis were applied in the survival analysis. EFS was defined as the time from diagnosis to the first major adverse event, including relapse after remission, failure to achieve remission, death due to any cause, abandonment, development of a second malignancy, or transfer to another hospital. OS was defined as the time from diagnosis to death from any cause. Eighty-two patients (28.1%) abandoned treatment by parental decision because of relapse (n = 41, 50%), poor treatment response (n = 24, 29.3%), drug resistance (n = 14, 17.1%), second tumor (n = 2, 2.4%), or an unknown reason (n = 1, 1.2%). These patients were counted as events in the OS analysis, with the last follow-up date as the date of event. Patients lost to follow-up were censored at the date of last known contact.
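The event coding described above determines how each patient enters the Kaplan–Meier estimate. The Python sketch below is a minimal illustration rather than the analysis code of this study: the status labels, `os_event` and `kaplan_meier` are hypothetical names, and the follow-up times in the usage note are toy values. It codes death and abandonment as OS events, treats every other status (including loss to follow-up) as censored, and computes the product-limit estimate.

```python
from typing import List, Tuple

# Statuses counted as events in the OS analysis (labels are assumptions).
OS_EVENT_STATUSES = {"death", "abandonment"}


def os_event(status: str) -> int:
    """Return 1 for an OS event, 0 for a censored observation."""
    return 1 if status in OS_EVENT_STATUSES else 0


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all observations that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve
```

As a usage example, five toy patients with follow-up times 12, 20, 8, 30 and 15 months and statuses death, alive, abandonment, lost to follow-up and death give events at 8, 12 and 15 months, with estimated survival falling to about 0.8, 0.6 and 0.4 at those times.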
Analyses were performed using RStudio v4.1.0.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (30670883 to S.S. and 31970627 to Y.L.); Innovation Program of Shanghai Municipal Education Commission (2021-01-07-00-E00091 to S.S.), Clinical Research and Innovation Project of Shanghai Shenkang Hospital Development Center (SHDC12019121 to S.S.), and Shanghai Key Laboratory of Clinical Molecular Diagnostics for Pediatrics (20dz2260900 to Y.L.). The computations in this paper were run on the π2.0 cluster supported by the Center for High Performance Computing at Shanghai Jiao Tong University.
Author contributions
T.L., J.R., B.C., H.S., H.W., and R.W. analyzed genomic data. J.Ca. and T.L. performed statistical analysis. Y.L. supervised genomic analysis. W.H., Yuhan. L., X.C., Y.T., J.C., and X.W. collected clinical samples and information. W.Q., B.M., and S.G. provided sequencing data. Y.L. and S.S. designed and supervised the study. T.L., Y.L., and S.S. wrote the manuscript.
Peer review
Peer review information
Nature Communications thanks Rachel Rau and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Data availability
The RNA-seq data generated in this study have been deposited in the Genome Sequence Archive (GSA) for Human of the National Genomics Data Center of China under accession number HRA000789. The data are available for academic use under controlled access, in compliance with the regulations of the Ministry of Science and Technology (MOST) of China for the deposit and use of human genomic data. Access can be obtained by contacting members of the Data Access Committee (DAC), Shuhong Shen at shenshuhong@scmc.com.cn or Yu Liu at liuyu@scmc.com.cn, and following the application procedure in GSA. For detailed guidance, see GSA-Human_Request_Guide_for_Users [https://ngdc.cncb.ac.cn/gsa-human/document/GSA-Human_Request_Guide_for_Users_us.pdf]. Data will be available immediately once the application is approved. Access to the controlled data will be valid for one year from the date of approval. The processed genomic aberrations from this dataset are available within the Supplementary Information files. The publicly available genomic data for TARGET AML are available in the database of Genotypes and Phenotypes (dbGaP) under accession number phs000465. The clinical annotations of the TARGET AML cohort and the verified somatic genomic variants from both the discovery and validation datasets were downloaded from Bolouri et al.4 and the TARGET Data Matrix (https://ocg.cancer.gov/programs/target/data-matrix). The 10 previously published RNA-seq datasets re-analyzed in this study are available in GSA for Human under accession number HRA000119. The mutations from the Japanese and French pediatric AML cohorts were obtained from Shiba et al.39 and Duployez et al.40, respectively. The remaining data are available within the Article or Supplementary Information.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Ting Liu, Jianan Rao, Wenting Hu.
Contributor Information
Yu Liu, Email: liuyu@scmc.com.cn.
Shuhong Shen, Email: shenshuhong@scmc.com.cn.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-022-29336-y.
References
- 1. Steliarova-Foucher E, et al. International incidence of childhood cancer, 2001-10: a population-based registry study. Lancet Oncol. 2017;18:719–731. doi: 10.1016/S1470-2045(17)30186-9.
- 2. Pui CH, et al. Treatment outcomes in black and white children with cancer: results from the SEER database and St Jude Children’s Research Hospital, 1992 through 2007. J. Clin. Oncol. 2012;30:2005–2012. doi: 10.1200/JCO.2011.40.8617.
- 3. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J. Clin. 2020;70:7–30. doi: 10.3322/caac.21590.
- 4. Bolouri H, et al. The molecular landscape of pediatric acute myeloid leukemia reveals recurrent structural alterations and age-specific mutational interactions. Nat. Med. 2018;24:103–112. doi: 10.1038/nm.4439.
- 5. Ma X, et al. Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours. Nature. 2018;555:371–376. doi: 10.1038/nature25795.
- 6. Faber ZJ, et al. The genomic landscape of core-binding factor acute myeloid leukemias. Nat. Genet. 2016;48:1551–1556. doi: 10.1038/ng.3709.
- 7. Gruber TA, Downing JR. The biology of pediatric acute megakaryoblastic leukemia. Blood. 2015;126:943–949. doi: 10.1182/blood-2015-05-567859.
- 8. Downing JR, et al. The Pediatric Cancer Genome Project. Nat. Genet. 2012;44:619–622. doi: 10.1038/ng.2287.
- 9. Zwaan CM, et al. FLT3 internal tandem duplication in 234 children with acute myeloid leukemia: prognostic significance and relation to cellular drug resistance. Blood. 2003;102:2387–2394. doi: 10.1182/blood-2002-12-3627.
- 10. Meshinchi S, et al. Prevalence and prognostic significance of Flt3 internal tandem duplication in pediatric acute myeloid leukemia. Blood. 2001;97:89–94. doi: 10.1182/blood.v97.1.89.
- 11. de Rooij JD, et al. Recurrent abnormalities can be used for risk group stratification in pediatric AMKL: a retrospective intergroup study. Blood. 2016;127:3424–3430. doi: 10.1182/blood-2016-01-695551.
- 12. Opatz S, et al. The clinical mutatome of core binding factor leukemia. Leukemia. 2020;34:1553–1562. doi: 10.1038/s41375-019-0697-0.
- 13. Hollink IH, et al. Favorable prognostic impact of NPM1 gene mutations in childhood acute myeloid leukemia, with emphasis on cytogenetically normal AML. Leukemia. 2009;23:262–270. doi: 10.1038/leu.2008.313.
- 14. Grobner SN, et al. The landscape of genomic alterations across childhood cancers. Nature. 2018;555:321–327. doi: 10.1038/nature25480.
- 15. Papaemmanuil E, et al. Genomic Classification and Prognosis in Acute Myeloid Leukemia. New Engl. J. Med. 2016;374:2209–2221. doi: 10.1056/NEJMoa1516192.
- 16. Cancer Genome Atlas Research Network, et al. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. New Engl. J. Med. 2013;368:2059–2074. doi: 10.1056/NEJMoa1301689.
- 17. Chen J, et al. Genomic landscape of lung adenocarcinoma in East Asians. Nat. Genet. 2020;52:177–186. doi: 10.1038/s41588-019-0569-6.
- 18. Li J, et al. A genomic and epigenomic atlas of prostate cancer in Asian populations. Nature. 2020;580:93–99. doi: 10.1038/s41586-020-2135-x.
- 19. Taylor J, et al. Altered nuclear export signal recognition as a driver of oncogenesis. Cancer Discov. 2019;9:1452–1467. doi: 10.1158/2159-8290.CD-19-0298.
- 20. Puente XS, et al. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature. 2011;475:101–105. doi: 10.1038/nature10113.
- 21. Damm F, et al. Acquired initiating mutations in early hematopoietic cells of CLL patients. Cancer Discov. 2014;4:1088–1101. doi: 10.1158/2159-8290.CD-14-0104.
- 22. Azmi AS, Uddin MH, Mohammad RM. The nuclear export protein XPO1 - from biology to targeted therapy. Nat. Rev. Clin. Oncol. 2021;18:152–169. doi: 10.1038/s41571-020-00442-4.
- 23. Li B, et al. Therapy-induced mutations drive the genomic landscape of relapsed acute lymphoblastic leukemia. Blood. 2020;135:41–55. doi: 10.1182/blood.2019002220.
- 24. Edmonson MN, et al. Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE): a cloud-based platform for curating and classifying germline variants. Genome Res. 2019;29:1555–1565. doi: 10.1101/gr.250357.119.
- 25. Tian L, et al. CICERO: a versatile method for detecting complex and diverse driver fusions using cancer RNA sequencing data. Genome Biol. 2020;21:126. doi: 10.1186/s13059-020-02043-x.
- 26. Reindl C, et al. CBL exon 8/9 mutants activate the FLT3 pathway and cluster in core binding factor/11q deletion acute myeloid leukemia/myelodysplastic syndrome subtypes. Clin. Cancer Res. 2009;15:2238–2247. doi: 10.1158/1078-0432.CCR-08-1325.
- 27. de Rooij JD, et al. Pediatric non-Down syndrome acute megakaryoblastic leukemia is characterized by distinct genomic subsets with varying outcomes. Nat. Genet. 2017;49:451–456. doi: 10.1038/ng.3772.
- 28. Zwaan CM, et al. Collaborative efforts driving progress in pediatric acute myeloid leukemia. J. Clin. Oncol. 2015;33:2949–2962. doi: 10.1200/JCO.2015.62.8289.
- 29. Michmerhuizen NL, Klco JM, Mullighan CG. Mechanistic insights and potential therapeutic approaches for NUP98-rearranged hematologic malignancies. Blood. 2020;136:2275–2289. doi: 10.1182/blood.2020007093.
- 30. Ostronoff F, et al. NUP98/NSD1 and FLT3/ITD coexpression is more prevalent in younger AML patients and leads to induction failure: a COG and SWOG report. Blood. 2014;124:2400–2407. doi: 10.1182/blood-2014-04-570929.
- 31. Tallman MS, et al. Acute Myeloid Leukemia, Version 3.2019, NCCN Clinical Practice Guidelines in Oncology. J. Natl Compr. Canc. Netw. 2019;17:721–749. doi: 10.6004/jnccn.2019.0028.
- 32. Dohner H, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood. 2017;129:424–447. doi: 10.1182/blood-2016-08-733196.
- 33. Heath EM, et al. Biological and clinical consequences of NPM1 mutations in AML. Leukemia. 2017;31:798–807. doi: 10.1038/leu.2017.30.
- 34. Gaidzik VI, et al. RUNX1 mutations in acute myeloid leukemia are associated with distinct clinico-pathologic and genetic features. Leukemia. 2016;30:2160–2168. doi: 10.1038/leu.2016.126.
- 35. Dufour A, et al. Acute myeloid leukemia with biallelic CEBPA gene mutations and normal karyotype represents a distinct genetic entity associated with a favorable clinical outcome. J. Clin. Oncol. 2010;28:570–577. doi: 10.1200/JCO.2008.21.6010.
- 36. Taskesen E, et al. Prognostic impact, concurrent genetic mutations, and gene expression features of AML with CEBPA mutations in a cohort of 1182 cytogenetically normal AML patients: further evidence for CEBPA double mutant AML as a distinctive disease entity. Blood. 2011;117:2469–2475. doi: 10.1182/blood-2010-09-307280.
- 37. Fasan A, et al. GATA2 mutations are frequent in intermediate-risk karyotype AML with biallelic CEBPA mutations and are associated with favorable prognosis. Leukemia. 2013;27:482–485. doi: 10.1038/leu.2012.174.
- 38. Gaidzik VI, et al. RUNX1 mutations in acute myeloid leukemia: results from a comprehensive genetic and clinical analysis from the AML study group. J. Clin. Oncol. 2011;29:1364–1372. doi: 10.1200/JCO.2010.30.7926.
- 39. Shiba N, et al. Transcriptome analysis offers a comprehensive illustration of the genetic background of pediatric acute myeloid leukemia. Blood Adv. 2019;3:3157–3169. doi: 10.1182/bloodadvances.2019000404.
- 40. Duployez N, et al. Comprehensive mutational profiling of core binding factor acute myeloid leukemia. Blood. 2016;127:2451–2459. doi: 10.1182/blood-2015-12-688705.
- 41. Gonzalez-Perez A, Sabarinathan R, Lopez-Bigas N. Local determinants of the mutational landscape of the human genome. Cell. 2019;177:101–114. doi: 10.1016/j.cell.2019.02.051.
- 42. Gu Z, et al. Genomic analyses identify recurrent MEF2D fusions in acute lymphoblastic leukaemia. Nat. Commun. 2016;7:13331. doi: 10.1038/ncomms13331.
- 43. Gunnarsson R, et al. Single base substitution mutational signatures in pediatric acute myeloid leukemia based on whole genome sequencing. Leukemia. 2021;35:1485–1489. doi: 10.1038/s41375-021-01242-0.
- 44. Baumhardt JM, et al. Recognition of nuclear export signals by CRM1 carrying the oncogenic E571K mutation. Mol. Biol. Cell. 2020;31:1879–1891. doi: 10.1091/mbc.E20-04-0233.
- 45. Garcia-Santisteban I, et al. A cellular reporter to evaluate CRM1 nuclear export activity: functional analysis of the cancer-related mutant E571K. Cell. Mol. Life Sci. 2016;73:4685–4699. doi: 10.1007/s00018-016-2292-0.
- 46. Fan H, et al. BAHCC1 binds H3K27me3 via a conserved BAH module to mediate gene silencing and oncogenesis. Nat. Genet. 2020;52:1384–1396. doi: 10.1038/s41588-020-00729-3.
- 47. Goodwin GH, Nicolas RH. The BAH domain, polybromo and the RSC chromatin remodelling complex. Gene. 2001;268:1–7. doi: 10.1016/s0378-1119(01)00428-0.
- 48. Ding L, et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature. 2012;481:506–510. doi: 10.1038/nature10738.
- 49. Ma X, et al. Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia. Nat. Commun. 2015;6:6604. doi: 10.1038/ncomms7604.
- 50. Gawad C, Koh W, Quake SR. Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics. Proc. Natl Acad. Sci. USA. 2014;111:17947–17952. doi: 10.1073/pnas.1420822111.
- 51. Dobin A, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
- 52. Benjamin D, et al. Calling somatic SNVs and indels with Mutect2. bioRxiv. 2019. doi: 10.1101/861054.
- 53. Hagiwara K, et al. RNAIndel: discovering somatic coding indels from tumor RNA-Seq data. Bioinformatics. 2020;36:1382–1390. doi: 10.1093/bioinformatics/btz753.
- 54. McLaren W, et al. The Ensembl Variant Effect Predictor. Genome Biol. 2016;17:122. doi: 10.1186/s13059-016-0974-4.
- 55. Nicorici D, et al. FusionCatcher – a tool for finding somatic fusion genes in paired-end RNA-sequencing data. bioRxiv. 2014. doi: 10.1101/011650.
- 56. Liu Y, et al. The genomic landscape of pediatric and young adult T-lineage acute lymphoblastic leukemia. Nat. Genet. 2017;49:1211–1218. doi: 10.1038/ng.3909.
- 57. Tate JG, et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2019;47:D941–D947. doi: 10.1093/nar/gky1015.
- 58. McLeod C, et al. St. Jude Cloud: a pediatric cancer genomic data-sharing ecosystem. Cancer Discov. 2021;11:1082–1099. doi: 10.1158/2159-8290.CD-20-1230.
- 59. Zhou X, et al. Exploring genomic alteration in pediatric cancer using ProteinPaint. Nat. Genet. 2016;48:4–6. doi: 10.1038/ng.3466.