2024 Jul 24;30(18):4100–4114. doi: 10.1158/1078-0432.CCR-24-1063

SJPedPanel: A Pan-Cancer Gene Panel for Childhood Malignancies to Enhance Cancer Monitoring and Early Detection

Pandurang Kolekar 1,#, Vidya Balagopal 2,#, Li Dong 1,#, Yanling Liu 1, Scott Foy 1, Quang Tran 1, Heather Mulder 1, Anna LW Huskey 2, Emily Plyler 1, Zhikai Liang 1, Jingqun Ma 2, Joy Nakitandwe 3, Jiali Gu 2, Maria Namwanje 2, Jamie Maciaszek 2, Debbie Payne-Turner 2, Saradhi Mallampati 2, Lu Wang 2, John Easton 1,*, Jeffery M Klco 2,*, Xiaotu Ma 1,*
PMCID: PMC11393547  PMID: 39047169

Abstract

Purpose:

The purpose of the study was to design a pan-cancer gene panel for childhood malignancies and validate it using clinically characterized patient samples.

Experimental Design:

In addition to 5,275 coding exons, SJPedPanel also covers 297 introns for fusions/structural variations and 7,590 polymorphic sites for copy-number alterations. Capture uniformity and limit of detection are determined by targeted sequencing of cell lines using a dilution experiment. We validate its coverage by in silico analysis of an established real-time clinical genomics (RTCG) cohort of 253 patients. We further validate its performance by targeted resequencing of 113 patient samples from the RTCG cohort. We demonstrate its power in analyzing low tumor burden specimens using morphologic remission and monitoring samples.

Results:

Among the 485 pathogenic variants reported in the RTCG cohort, SJPedPanel covered 86% of variants, including 82% of 90 rearrangements responsible for fusion oncoproteins. In our targeted resequencing cohort, 91% of 389 pathogenic variants were detected. The gene panel enabled us to detect ∼95% of variants at an allele fraction (AF) of 0.5%, whereas the detection rate was ∼80% at an AF of 0.2%. The panel detected low-frequency driver alterations from morphologic leukemia remission samples and relapse-enriched alterations from monitoring samples, demonstrating its power for cancer monitoring and early detection.

Conclusions:

SJPedPanel enables the cost-effective detection of clinically relevant genetic alterations, including rearrangements responsible for subtype-defining fusions, by targeted sequencing of ∼0.15% of the human genome for childhood malignancies. It will enhance the analysis of specimens with low tumor burdens for cancer monitoring and early detection.


Translational Relevance.

Here, we present the validation of a pan-cancer gene panel for targeted sequencing and identification of clinically relevant genomic alterations specifically designed for childhood malignancies. In addition to coding variants, this panel can identify copy-number alterations, promoter/enhancer alterations such as for TAL1 and TERT, and rearrangements responsible for fusion oncoproteins that are uniquely recurrent in pediatric cancers. The panel also enables ultradeep sequencing so that specimens with low tumor content are thoroughly analyzed to inform risk stratification at diagnosis and remission, as well as adjustment of treatment strategies through disease monitoring.

Introduction

Extensive insights on the genetic underpinnings (i.e., driver alterations) of childhood cancers (1–3) have been uncovered in the past decade using next-generation sequencing. To date, diagnostic sequencing has become a part of clinical service in some institutions (4–6). Although whole-genome sequencing, and to a lesser extent whole-exome sequencing, is preferable to maximize the detection of cancer-associated variants in the clinical setting, these modalities have notable resource and infrastructure requirements that most clinical labs cannot meet. Further, the broad coverage of whole-genome and whole-exome sequencing makes it challenging to achieve the ultradeep sequencing that is essential for the analysis of specimens with low tumor purity, such as for detecting minimal residual disease and for disease monitoring. Thus, targeted gene panel–based sequencing is helpful to address some of these challenges (7, 8).

Systematic surveys of the genetic changes revealed that childhood cancers are shaped by developmental origins with distinct properties (9) that may not be targeted by strategies used for adult cancers. Although multiple gene panels designed for adult cancers exist, such as MSK-IMPACT (6), comprehensive gene panels for pediatric cancers are currently limited. This is important considering the recent pan-cancer study of 1,699 childhood cancers that demonstrated a dramatic difference between adult and childhood cancers, in which 55% of the 142 driver genes in pediatric cancers are not found in adult pan-cancer studies (3). This finding is not surprising, as the major cancer types diagnosed in children are leukemia, lymphoma, and brain tumors compared with lung, breast, and colon cancers in adults (1). Even within a given tumor type, the prevalent subtypes of these cancers are different across the age spectrum. For example, ETV6::RUNX1 constitutes a major subtype in pediatric B-cell lymphoblastic leukemia compared with BCR::ABL1 in adults (1). Likewise, pediatric high-grade gliomas commonly harbor somatic histone H3.3 or H3.1 driver mutations, which are nearly absent in corresponding adult gliomas (10). In addition to the different affected genes, the types of genetic alterations also differ between adult and pediatric cancers. Our previous observations support this, showing that 62% of driver alterations in childhood cancers are copy-number alterations (CNV) or structural variations (SV), with boundaries that typically do not fall into protein-coding regions (3). Our recent study of oncogenic fusions (11) indicated that 55.7%, 22.5%, and 18.5% of pediatric leukemia, brain, and solid tumors demonstrate subtypes defined by oncogenic fusions, for which the DNA breakpoints typically fall into intronic regions. 
These facts render base pair (bp) level ascertainment of driver alterations in childhood cancers challenging by using conventional capture-sequencing kits such as exome sequencing and call for a dedicated gene panel for pediatric cancers that includes coding and noncoding targets to maximize the detection of key alterations in pediatric cancers.

Here, we highlight the prominent features of our pan-cancer gene panel (termed SJPedPanel) for childhood cancers by comparing with six existing cancer gene panels. We validate its superior coverage of genes relevant to childhood cancers using a well-described real-time clinical cohort via in silico analysis, followed by resequencing a subset of these cases for experimental validation. We also demonstrate the power of our gene panel in detecting rare variants using ultradeep sequencing via serial dilution experiments, as well as disease monitoring in remission samples from patients with acute myeloid leukemia (AML).

Materials and Methods

Panel design

Based on extensive research and literature review of pan-cancer genome profiling studies, a list of exonic and/or intronic regions (n = 5,009 regions, 2.82 Mbp) from 357 genes that are frequently implicated in pediatric cancers was compiled to detect single-nucleotide variants (SNV), small insertions and deletions (indels), gene fusions, SV, and internal tandem duplications (ITD). Also, we curated a list of 7,590 SNP from the Genome Aggregation Database (gnomAD v2.1.1 and v4.1.0; RRID:SCR_014964) that were evenly spread across human chromosomes to detect large genomic structural rearrangements such as CNV and loss of heterozygosity (LOH). The details of all the genomic regions and SNP used to assemble the pediatric pan-cancer panel, termed SJPedPanel, are available in Supplementary Tables S1 and S2.

Capture efficiency of the SJPedPanel

We generated one high-depth (∼2,000X) and one low-depth (∼200X) targeted sequencing library (two replicates) using the SJPedPanel on the COLO829BL cell line (ATCC No. CRL1980, RRID), which is a gold-standard cell line for clinical validation and other established studies (refs. 12, 13). These libraries were sequenced on Illumina NovaSeq and NextSeq platforms, respectively. The data generated were used to evaluate the capture performance of the probes and uniformity of coverage across regions and loci of the SJPedPanel.

Dilution experiment

A dilution experiment using six cancer cell lines [ME1 (DSMZ: ACC 537, RRID:CVCL_2110), 697 (DSMZ: ACC 42, RRID:CVCL_0079), Rh30 (ATCC No. CRL2061, RRID: CVCL_0041), EW8 (courtesy of Elizabeth Stewart, Department of Oncology, St. Jude Children’s Research Hospital, RRID:CVCL_1658), K562 (ATCC No. CCL243, RRID: CVCL_0004), Molm13 (DSMZ: ACC 554, RRID:CVCL_2119)], and a noncancer cell line (GM12878; Coriell Institute, RRID:CVCL_7526) was designed to achieve seven tumor concentrations with two replicates each. The seven dilutions were divided in three groups: (i) ultralow (0.1%, 0.2%), (ii) low (0.5%, 1%), and (iii) medium (2.5%, 5%, and 10%), which were sequenced at depths of 10,000X, 5,000X, and 2,500X, respectively. The cell lines were also sequenced independently in undiluted forms at 250,00X to estimate the original allele fractions (AF) of 26 cell line–specific markers (14 SNV, four indels, eight SV; all these markers are confirmed to be exclusively detected from one of the six cell lines) given in Supplementary Table S3. Recall rate of these known markers across different dilutions was used to assess the limit of detection of the SJPedPanel.

The limit of detection (LoD) is determined by two critical factors: (i) the sequencing depth (also known as power) and (ii) the noise level. For example, if the true AF is 1%, a sequencing depth of 913X will ensure a 95% chance of detecting this variant with ≥5 mutant alleles (14). In consideration of the wide range of dilution concentrations, we aimed for 2,500X depth for dilution ladders >1%, 5,000X for ladders 0.5% and 1%, and 10,000X for ladders 0.1% and 0.2%. The noise level, in turn, is typically regarded as the background error rate. Mutations with higher background error rates are more difficult to detect because the true signal can be overwhelmed by the background noise. Previously, we developed computational error suppression methods to achieve an error rate of ∼10⁻⁶ to 10⁻⁴ for substitutions (15), and similar methods and results have been achieved for indels and SV (manuscript under review).
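The depth-versus-power relationship described above can be illustrated with a short calculation. The following is a minimal sketch, assuming that mutant read counts follow a simple binomial model with no sequencing error; the function names `detection_power` and `min_depth` are hypothetical and are not taken from the study's software.

```python
from math import comb


def detection_power(depth: int, af: float, min_reads: int = 5) -> float:
    """Probability of seeing at least `min_reads` mutant reads at a locus.

    Assumes mutant reads ~ Binomial(depth, af), i.e. no sequencing error.
    """
    p_below = sum(
        comb(depth, k) * af**k * (1.0 - af) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_below


def min_depth(af: float, min_reads: int = 5, power: float = 0.95) -> int:
    """Smallest depth whose detection power reaches the requested level."""
    depth = min_reads  # fewer reads than min_reads can never succeed
    while detection_power(depth, af, min_reads) < power:
        depth += 1
    return depth


if __name__ == "__main__":
    # Depth needed for a 1% allele fraction under the stated criterion.
    print(min_depth(0.01))
```

Under this model, the depth required for a given power scales roughly inversely with the allele fraction, which is in line with the deeper sequencing targets chosen for the lower dilution ladders.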

All the cell lines used in this study tested negative for mycoplasma contamination using the Lonza MycoAlert Mycoplasma Detection Kit (catalog No. LT07-318) and were authenticated by STR profiling before use.

In silico downsampling experiment

We performed in silico downsampling of data from a set of cancer cell line dilution samples to determine the trade-off between the recall rate, the downsampled depth of sequencing, and the associated cost estimates. The samples originally sequenced at 2,500X were further downsampled to simulate sequencing depths of 1,000X, 1,500X, and 2,000X, whereas the samples sequenced at 5,000X and 10,000X were downsampled to simulate sequencing depths of 1,000X, 1,500X, 2,000X, and 3,000X. For each of the desired downsampling depths, 10 samples were simulated, each consisting of randomly sampled reads at the loci of 14 SNV. These simulated samples were used to determine the trade-off between recall rate and depth of sequencing.
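The downsampling idea can be sketched for a single locus as follows. This is an illustrative simulation only: it assumes reads are drawn without replacement from the observed pileup, and it uses a simple read-count threshold as the detection rule, which stands in for the statistical test used in the study. The function names and the example counts are hypothetical.

```python
import random


def downsample_counts(mutant_reads, total_reads, target_depth, n_sim=10, seed=0):
    """Simulate mutant read counts after downsampling one locus.

    Reads are drawn without replacement; reads with index < mutant_reads
    are treated as carrying the mutant allele.
    """
    if target_depth > total_reads:
        raise ValueError("target depth exceeds the observed depth")
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sim):
        kept = rng.sample(range(total_reads), target_depth)
        counts.append(sum(1 for read in kept if read < mutant_reads))
    return counts


def recall(counts, min_reads=1):
    """Fraction of simulations in which the variant would still be called.

    The `min_reads` threshold is an assumed, simplified detection rule.
    """
    return sum(c >= min_reads for c in counts) / len(counts)


if __name__ == "__main__":
    # Hypothetical locus: 25 mutant reads out of 5,000 (AF = 0.5%).
    for depth in (1000, 1500, 2000, 3000):
        sims = downsample_counts(25, 5000, depth)
        print(depth, recall(sims, min_reads=5))
```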

Investigating diagnostic yield using clinical sample resequencing

This study was approved by the St. Jude institutional review board, and informed written consent was obtained for sample collection from the patients, parents, or guardians. This study was conducted in accordance with the U.S. Common Rule. Subjects were not compensated for participation. All patient samples are deidentified. Based on sample availability, we selected 113 specimens from previously sequenced pediatric cancer cases treated at St. Jude Children’s Research Hospital to represent a wide range of cancer subtypes common in pediatrics. Samples were chosen primarily from the pilot study cohort (n = 40; ref. 4) and Genomes for Kids (G4K) studies (n = 73; ref. 16) and had previously reported clinically relevant markers identified by a triple-platform approach of whole-genome, whole-exome, and transcriptome sequencing. A list of cases and their cancer subtypes used for these purposes is provided in Supplementary Table S4A, and their demographic summary is available in Supplementary Table S4B. The recall rate of clinically relevant markers from these cases was used to establish the diagnostic yield of the SJPedPanel. Here, seven markers from three hypermutator cases (SJHGG030335, SJHGG030336, and SJST030211) were downgraded to variants of unknown significance for this analysis per communication with the corresponding author of the G4K study (“Comment” column in Supplementary Table S4C), resulting in 140 SNV/indels, 55 fusions/SV, 184 CNV/LOH, and 10 ITD (total = 389; Supplementary Tables S4C–S4G).

Library preparation, capture, and sequencing

DNA samples were obtained and subjected to DNA-seq library preparation and target enrichment followed by sequencing in the clinical genomics laboratory as described below. An input of 100 ng of DNA was used to construct libraries using the Twist Library Preparation Enzymatic Fragmentation Kit 2.0 (Twist Bioscience) following the manufacturer’s instructions. Capture oligos were designed to detect putative SNV, indels, SV, ITD, and CNV in 357 genes of clinical interest. SJPedPanel was synthesized at Twist Bioscience and is described in detail in the section on panel design. Eight libraries were pooled at a time, and target enrichment for the SJPedPanel baits was carried out using the Twist hybrid capture protocol following the manufacturer’s instructions. Paired-end 150-cycle sequencing was performed on NovaSeq or NextSeq instruments (Illumina Inc) as appropriate. Where necessary, additional sequencing (“top off”) was performed to ensure that a sequencing depth of at least 1,000X was achieved in all cases.

Early detection of relapsed AML cases

To test the panel’s capability for disease monitoring, two pediatric AML cases (SJAML016582 and SJAML016551) with material available at diagnosis, relapse, and remission timepoints were analyzed. Both cases provided multiple trackable somatic markers, including structural variants and SNV. Samples were sequenced to depths of >5,000X after targeted capture to ensure detection of low-level variants at <1%. The average variant allele fraction (VAF) of detected somatic variants was used to estimate tumor burden at the corresponding time points. Subclonal variation was visualized using the fishplot R package (17).

Coverage comparison between SJPedPanel and whole exome sequencing

The content of SJPedPanel was compared with that of whole exome sequencing (WES) manifest to highlight the differences in coverage of hg19 genomic regions. An Illumina Exome 2.0 Plus hg19 BED file (18) padded with 10 bp was used for region intersection analyses. We utilized recently reported somatic variants from the Genomes for Kids (G4K; ref. 16), a real-time three-platform sequencing study of 309 pediatric patients with cancer, to benchmark the coverage of reported pathogenic and likely pathogenic variants between SJPedPanel and WES.

Comparison of SJPedPanel with other panels

We compared the content of the SJPedPanel with the content of six other available DNA panels, including (i) FoundationOne Heme (19, 20) and (ii) FoundationOne CDx (21) by Foundation Medicine Inc., (iii) MSK-IMPACT (6, 22) by Memorial Sloan Kettering Cancer Center, (iv) OncoKids (5) by Children’s Hospital Los Angeles, (v) combined Comprehensive Hematological Malignancy (CHMP) and Comprehensive Solid Tumor (CSTP) panels (23) by Children’s Hospital of Philadelphia (CHOP), and (vi) Oncomine Comprehensive Assay v3 by Thermo Fisher Scientific Inc. (24, 25). These panels collectively represent the breadth and diversity of clinical gene panels. Because most of the providers do not provide the exact coordinates of the regions in the panel, we compared the content of these panels using standardized gene names with the help of official gene symbols and synonyms from the NCBI gene database (RRID:SCR_006472; ref. 26). UpSet plots were generated to compare common and unique genes among the seven panels using the R package UpSetR (27).
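A gene-name comparison of this kind reduces to normalizing symbols and then taking set intersections and differences. The sketch below is illustrative only: the synonym table and the two toy panels are hypothetical stand-ins for the NCBI gene records and the real panel manifests, and `normalize_symbols` is not a function from the study.

```python
def normalize_symbols(genes, synonyms):
    """Map each gene name to an official symbol (upper-cased for matching)."""
    return {synonyms.get(g.upper(), g.upper()) for g in genes}


# Toy synonym table (old name -> official symbol), for illustration only.
SYNONYMS = {"C11ORF95": "ZFTA", "MLL": "KMT2A"}

# Toy panel contents; real manifests would be read from the providers' lists.
panel_a = normalize_symbols(["ZFTA", "KMT2A", "TP53"], SYNONYMS)
panel_b = normalize_symbols(["C11orf95", "MLL", "EGFR"], SYNONYMS)

shared = panel_a & panel_b   # genes present in both panels
only_a = panel_a - panel_b   # genes unique to panel A
only_b = panel_b - panel_a   # genes unique to panel B

if __name__ == "__main__":
    print(sorted(shared), sorted(only_a), sorted(only_b))
```

Normalizing before comparing matters because the same gene listed under an older alias in one manifest would otherwise be counted as unique to both panels.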

Bioinformatics analyses

The adapter-trimmed paired-end FASTQ files generated on the Illumina NovaSeq/NextSeq platforms were assessed for sequence and instrument quality using FastQC v0.11.9 (RRID: SCR_014583; ref. 28) and SequencErr v2.0.9 (29). The reads were mapped against GRCh37 build using BWA aln v0.7.12-r1039 (RRID: SCR_010910; ref. 30). The utility commands in SAMtools v1.7 (RRID: SCR_002105; ref. 31) and BEDTools v2.25.0 (RRID: SCR_006646; ref. 32) were used to perform simple operations using Binary Alignment Map (BAM) and BED files. The count files obtained from BAM files using SequencErr were further passed as an input to DeepSeqCoverageQC v0.3.1 (33) to compute depth-of-coverage QC metrics of the sequenced samples over loci/regions of the SJPedPanel. The genotyping of SV and indels to compute the AF was carried out using SVindelGenotyper (34). The CNV were detected using CNVkit v0.9.10 (RRID: SCR_021917; ref. 35). The BAM files of 30 germline samples were used to create a pooled reference of per-bin copy-number estimates. The segment and bin-level call files along with CNV diagrams generated by CNVkit batch command were used to review the CNV calls in tumor samples. To determine the LOH in sequenced samples, the minor allele frequencies of 7,590 SNP were computed using count files generated by SequencErr and subsequently used to generate allelic imbalance plots over chromosomes. The output files and diagrams generated by CNVkit v0.9.10 and the allelic imbalance figures used to review CNV and LOH events are available from Zenodo repository (RRID: SCR_004129; ref. 36).
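The allelic imbalance step can be illustrated with a small calculation on per-SNP allele counts. This is a minimal sketch, assuming the input is already restricted to SNP that are heterozygous in the germline and that counts are given as (chromosome, reference count, alternate count) tuples; this input layout and the function names are assumptions and do not reflect the SequencErr output format.

```python
from collections import defaultdict
from statistics import median


def minor_allele_fraction(ref_count, alt_count):
    """Fraction of reads supporting the less abundant allele (None if no reads)."""
    total = ref_count + alt_count
    if total == 0:
        return None
    return min(ref_count, alt_count) / total


def median_maf_by_chromosome(snp_counts):
    """Summarize minor allele fractions per chromosome.

    `snp_counts` is an iterable of (chrom, ref_count, alt_count) tuples for
    germline-heterozygous SNP. Values near 0.5 suggest allelic balance;
    lower values suggest imbalance such as LOH or copy-number change.
    """
    per_chrom = defaultdict(list)
    for chrom, ref_count, alt_count in snp_counts:
        maf = minor_allele_fraction(ref_count, alt_count)
        if maf is not None:
            per_chrom[chrom].append(maf)
    return {chrom: median(values) for chrom, values in per_chrom.items()}


if __name__ == "__main__":
    toy = [("chr1", 500, 500), ("chr1", 480, 520), ("chr9", 900, 100)]
    print(median_maf_by_chromosome(toy))
```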

Statistical analysis

A previously developed “rotation control” method (37) was used to obtain the background count of the variants for binomial testing, and Q-values (38) were used to assess the statistical significance of detection based on binomial testing. The procedure is further explained as follows. For a given variant, samples expected to be wild-type (i.e., no mutation detected at diagnosis in the original analysis) were used to calculate the background error rate (Vb). Combining reads from all such wild-type samples at a given variant locus provides sufficient depth of coverage to estimate the background error rate as Vb = mb/tb (capped at 0.001), where mb denotes the background mutant allele count, and tb denotes the background coverage. Similarly, in mutation-positive samples, the foreground frequency (Vf) can be calculated as Vf = mf/tf, where mf and tf denote the foreground mutant allele count and foreground coverage, respectively. Then, we calculated the probability of observing ≥mf mutant reads out of tf reads by random chance using the binomial distribution with error rate Vb. A sample is called mutation-positive if the false discovery rate–controlled Q-value is <0.05. This procedure was repeated across all sample/variant combinations. All the statistical analyses were performed using R v4.0.3 (39).
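The testing procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the study's implementation: the cap on Vb is read here as an upper bound, the Benjamini-Hochberg adjustment is swapped in as a stand-in for the Q-value method cited above, and all function names are hypothetical.

```python
from math import exp, lgamma, log, log1p


def background_rate(mutant_bg, total_bg, cap=0.001):
    """Vb = mb / tb, with the cap interpreted as an upper bound (assumption)."""
    return min(mutant_bg / total_bg, cap)


def binom_upper_tail(m_f, t_f, rate):
    """P(X >= m_f) for X ~ Binomial(t_f, rate), summed in log space."""
    if m_f <= 0 or rate >= 1.0:
        return 1.0
    if rate <= 0.0:
        return 0.0

    def log_pmf(k):
        return (lgamma(t_f + 1) - lgamma(k + 1) - lgamma(t_f - k + 1)
                + k * log(rate) + (t_f - k) * log1p(-rate))

    return min(1.0, sum(exp(log_pmf(k)) for k in range(m_f, t_f + 1)))


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    vb = background_rate(mutant_bg=3, total_bg=60000)
    # Hypothetical (mf, tf) pairs for three sample/variant combinations.
    pvals = [binom_upper_tail(mf, tf, vb) for mf, tf in [(12, 5000), (1, 5000), (0, 5000)]]
    print([q < 0.05 for q in bh_qvalues(pvals)])
```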

Data availability

The cell line data generated for this study have been deposited in the European Nucleotide Archive at EMBL-EBI under accession number PRJEB64356 (https://www.ebi.ac.uk/ena/browser/view/PRJEB64356). The accession numbers of the samples are listed in Supplementary Tables S5 and S6. Other data generated in this study are available from the corresponding authors upon request.

Results

Panel design

We designed our SJPedPanel by integrating findings from 44 published tumor-normal paired genomics studies of childhood cancers that span leukemia, brain tumors, and solid tumors (2–4, 10, 40–79). SJPedPanel includes 1.069 million exonic bp from 5,275 coding exons for detecting protein-coding mutations in 357 known driver genes for childhood cancers (Fig. 1A; Supplementary Table S1; Supplementary Fig. S1 for plots of chromosomal distribution of genes generated using an R/Bioconductor package chromPlot; ref. 80). To account for the SV that result in subtype-defining oncogenic fusions for which DNA breakpoints typically fall in intronic regions (11), 1.438 million bp from 297 introns of 94 genes (Supplementary Table S1) were included. Moreover, 0.209 million bases from promoter regions were targeted for detecting promoter alterations including rearrangements and point mutations such as TAL1 in T-ALL (Fig. 1B for Sankey diagram showing representative cancers, genes, variant types and genomic features targeted by panel; refs. 81, 82). Highly recurrent oncogenes (such as MYCN in neuroblastoma; ref. 3) and tumor suppressor genes (such as CDKN2A, PAX5, and SMARCB1) were targeted by probes tiling the entire gene region for detecting CNV. To account for the fact that breakpoints of structural alterations can fall outside gene regions, we extended the target regions to frequent DNA breakpoints by using patient data from ProteinPaint (83) and GenomePaint (84). Collectively, 2.82 million bp were designed for potential SNV, indel, SV, and CNV/LOH driver alterations. Notably, a few known childhood cancer drivers are intentionally excluded due to genomic space considerations. For example, MECOM (85) and GFI1B (86) are known to be involved in enhancer-hijacking alterations and were excluded due to the large space needed to cover the many possible breakpoints.

Figure 1.

Design of pediatric cancer gene panel. A, This study includes panel content, investigation of ultrasensitive detection, capture performance, diagnostic yield, and clinical applications. B, A Sankey diagram (82) showing spectrum of childhood cancers (Heme, hematologic malignancies; ST, solid tumors; Brain, brain tumors), cancer subtypes, genes, variant types, and genomic features targeted by SJPedPanel. Stacked bar plot at the right end shows space distribution of different genomic features covered by SJPedPanel.

In addition, 7,590 SNP were selected for detecting copy-number variations and loss of heterozygosity (CNV/LOH) across the genome (Supplementary Table S2).

The median distance between these SNP is 332 Kb, with the 25th and 75th quantiles being 60 Kb and 593 Kb, respectively (Supplementary Fig. S2A). Notably, >80% of these SNP exhibit population frequency (gnomAD v4.1.0) between 40% and 60% (Supplementary Fig. S2B), which ensures that nearly 50% of patients are heterozygous at each SNP site. Therefore, around 3,000 (=7,590 × 0.5 × 0.8) heterozygous SNP are expected for each patient, which leads to a theoretical resolution of ∼1 Mb for CNV/LOH detection. The number of SNP chosen per chromosome is roughly proportional to the lengths of chromosomes (Supplementary Figs. S2C, S2D and S3 for plots of chromosomal distribution of SNP generated using an R/Bioconductor package IdeoViz; ref. 87). Considering the read length and the insert length (for target capture and sequencing), these 7,590 SNP actually occupy ∼250 × 7,590 = 1.8975 million bp. Thus, our panel consisted of ∼4.7 million bp, or ∼0.15% of the human genome. The compact size of our panel enables us to reach 30,000X at the equivalent sequencing quantity of a standard WGS (30X) per sample, thus enabling cost-effective cancer monitoring and/or early detection (Fig. 1A). The gene panel was manufactured by Twist Bioscience.
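The arithmetic in this paragraph can be checked with a short worked calculation. The values below are taken from the text, except for the haploid genome size of about 3.1 billion bp, which is an assumption introduced here for illustration.

```python
# Values quoted in the text.
N_SNP = 7590                 # SNP included in the panel
HET_FRACTION = 0.5           # approximate chance a patient is heterozygous at a SNP
COMMON_FRACTION = 0.8        # share of SNP with population frequency of 40%-60%
GENE_REGION_BP = 2.82e6      # bp designed for SNV/indel/SV/CNV targets
SNP_FOOTPRINT_BP = 250       # approximate bp occupied per captured SNP

# Assumption (not stated in the text): approximate human genome size in bp.
GENOME_BP = 3.1e9

expected_het_snp = N_SNP * HET_FRACTION * COMMON_FRACTION  # about 3,000
resolution_bp = GENOME_BP / expected_het_snp               # about 1 Mb
snp_total_bp = SNP_FOOTPRINT_BP * N_SNP                    # 1,897,500 bp
panel_bp = GENE_REGION_BP + snp_total_bp                   # about 4.7 million bp
panel_fraction = panel_bp / GENOME_BP                      # about 0.15%

if __name__ == "__main__":
    print(round(expected_het_snp), round(resolution_bp), snp_total_bp)
    print(round(panel_bp), f"{panel_fraction:.2%}")
```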

Comparison of gene content with other panels

First, we compared the gene content between our panel and six other commonly used commercial DNA panels for childhood cancers, including FoundationOne Heme, FoundationOne CDx (19, 21), MSK-IMPACT (6, 22), OncoKids (5), CHOP CHMP/CSTP (23), and Oncomine Comprehensive Assay v3 (OCAv3; Supplementary Table S7A and S7B; ref. 24). We used the list of 183 driver genes reported in two recent childhood pan-cancer studies (2, 3) involving 2,578 cases. As seen in Table 1, SJPedPanel covers 159 (87%) genes, whereas all other panels cover <60% of the reported pediatric cancer driver genes (Fig. 2A; Supplementary Table S8).

Table 1.

Overview of the DNA panels selected for comparison.

Columns: Panel name; Focus area; No. of genes; Genes screened for (coding exons; selected introns; promoter regions); SNP; % pediatric cancer driver genes covered out of reported 183 genes (a); Publication
SJPedPanel Pediatric cancers 357 357 94 10 7,590 87% This study
FoundationOne Heme Adult blood cancers 418 408 31 57% He and colleagues (20)
CHOP CHMP/CSTP Pediatric cancers 273 (b) 273 1 1,042 55% Surrey and colleagues (23)
MSK-IMPACT Adult solid tumors 468 (c) 468 13 1 862 52% Cheng and colleagues (6)
OncoKids Pediatric cancers 181 137 44 1 49% Hiemenz and colleagues (5)
FoundationOne CDx Adult solid tumors 324 309 34 1 44% Whitepaper by Company (d) (21)
Oncomine Comprehensive Assay v3 Adult cancers 161 146 15 1 29% Hovelson and colleagues (25)
(a) 183 pediatric cancer genes are reported in two pediatric pan-cancer studies by Ma and colleagues (3) and Grobner and colleagues (2).

(b) Considered unique genes from CHOP CHMP and CSTP panels reported in Surrey and colleagues (23).

(c) The MSK-IMPACT panel is reported to include 468 genes, as it considers two different transcript isoforms for the CDKN2A gene. However, when counting only unique gene names, the panel consists of 467 genes.

(d) Sources of content for all the panels are available in Supplementary Table S7A.

Figure 2.

Comparison of gene content between SJPedPanel and other panels. A, Pediatric cancer relevance of seven panels based on coverage of 183 driver genes from childhood pan-cancer studies (2, 3). The horizontal bars at the bottom indicate numbers of genes designed in each panel coded by corresponding color. B, Analysis of common and unique genes among seven panels using UpSet plot (27). Venn diagrams indicate comparison of genes in SJPedPanel with C, the other six panels combined, and D, the other two pediatric cancer panels (OncoKids and CHOP CHMP/CSTP).

A comparison of gene names among the panels indicated that SJPedPanel exhibits unique coverage of 105 genes (Fig. 2B and C; Supplementary Table S7B) when compared with the other panels combined, such as DGCR8 and SIX1 for Wilms tumor (88), SHH for medulloblastoma (89), ZFTA for ependymoma (53, 90), and UBTF (91) and PICALM (91, 92) for AML/ALL. Among all the panels, SJPedPanel provides the largest intronic regions (297 introns from 94 genes; Table 1) responsible for rearrangements that generate fusion oncoproteins. Conversely, among the 468 genes specific to other panels, only three genes were reported in two recent pediatric pan-cancer studies with low patient frequencies (ZNF217: 0.59%, PCBP1: 0.31%, and CARD11: 0.21%; refs. 2, 3). The MSK-IMPACT panel has the largest number of genes (467), of which 135 are not found in any other panel (6, 22). Most of these genes are relevant to adult cancers with the highest concentration in adult solid tumors (6, 22). Similarly, the FoundationOne Heme panel consists of 418 genes with a focus on adult hematologic malignancies (19, 20). CHOP CHMP/CSTP and OncoKids are the other pediatric cancer panels under comparison and cover 118 genes that are not included in our panel (Fig. 2D; Supplementary Table S7B), whereas SJPedPanel contains 146 genes that are absent from both panels. Among the 118 genes absent in SJPedPanel, MECOM would require large space for the diverse promoter-hijacking events (85), whereas CALR, RARA, and SS18 were not included due to an overall paucity of alterations in these genes in pediatric cohorts. The remaining 114 genes not included in SJPedPanel were for adult cancers and demonstrate low patient frequency in pediatric cancers (2, 3). In addition to CHOP CHMP/CSTP DNA panels, we also compared SJPedPanel with additional genes in the fusion RNA panel by CHOP as described in Surrey and colleagues (Supplementary Table S7C). SJPedPanel contains 159 genes that are absent from all the CHOP panels (CHMP, CSTP, fusion panel; Supplementary Table S7C). Furthermore, out of 183 (2, 3) pediatric cancer driver genes (as noted in Supplementary Table S8), the SJPedPanel includes 56 genes that are exclusively associated with pediatric cancer and are not found in the CHOP DNA/RNA panels. In contrast, the CHOP panels have only one exclusive gene, CARD11. This gene demonstrates low patient frequency in childhood cancers (0.21%; Supplementary Table S8) and is mostly known to be implicated in adult B-cell lymphoma (93). Some of the important driver genes that are exclusive to the SJPedPanel and not found in the CHOP panels include MEF2D (94) and ZNF384 (95) fusions for B-ALL/AML, SIX1 and SIX2 for Wilms tumor (88), and DDX3X for medulloblastoma (96).

Apart from comparison with the previously mentioned panels, we also compared the gene content of the SJPedPanel with publicly available cancer gene list(s). The OncoKB knowledge base (97, 98) provides a curated list of 1,148 cancer genes based on their inclusion in various sequencing panels, the Sanger Cancer Gene Census, or Vogelstein and colleagues (OncoKB: https://www.oncokb.org/cancer-genes, last update May 01, 2024; last access date May 29, 2024; ref. 99). Except for one (RIPK2 from FoundationOne Heme), all the genes in FoundationOne Heme, FoundationOne CDx, and MSK-IMPACT are covered in the list of OncoKB cancer genes. Conversely, we found that a subset of 67 genes from the SJPedPanel are missing in OncoKB (Supplementary Table S7B, column “Status in OncoKB”). These include 24 known driver genes (2, 3; Supplementary Table S8) such as ZEB2 for AML/ALL (72), DHX15 for AML (64), and SIX2 and DGCR8 for Wilms tumor (100). Apart from these 24, the remaining exclusive subset also includes genes such as UBTF, which was recently reported to define a distinct subtype of pediatric AML (91). Collectively, the SJPedPanel offers by far the most comprehensive and current coverage of genetic alterations for the study of childhood malignancies.

Capture performance of the panel

A critical consideration in genomic sequencing (especially in panel sequencing) is the coverage uniformity. To study this question, we sequenced four targeted sequencing libraries (C1–C4) prepared using COLO829BL (ATCC No. CRL1980), a noncancer cell line that has been extensively used in the literature for clinical proficiency testing or benchmarking (12, 13). To ensure reproducibility, technical replicates were generated to achieve high (C1, C2, ∼2,000X) and low depth (C3, C4, ∼200X) of sequencing. Libraries in each set were sequenced with either the Illumina NovaSeq 6000 (C1, C3) or the NextSeq 500 (C2, C4).

As expected, the average depth was highly correlated (r² = 0.98) with the number of raw reads sequenced (Supplementary Fig. S4A). With these data, we investigated the capture uniformity at the bp level (Fig. 3A and B) and at the region level (Fig. 3C and D). Because highly uniform capture would give most bases/regions a similar depth (and therefore a histogram with a very small standard deviation), we chose the coefficient of variation (CV, defined as σ/μ of the histogram) to measure sequencing uniformity. Here, σ and μ are the estimates of the standard deviation and mean, respectively, obtained after trimming 2.5% of extreme values from both ends of the histograms (Fig. 3). At the bp level, we observed that the CV is close to 0.35 for libraries sequenced by NovaSeq and between 0.37 and 0.38 for libraries sequenced by NextSeq. At the region level, NovaSeq data demonstrate a CV close to 0.25, whereas NextSeq data demonstrate CVs ranging from 0.22 to 0.28.

Figure 3.

Capture uniformity per base (A and B) and per region (C and D) of the panel. Uniformity of coverage across the SJPedPanel for high-depth samples sequenced on Illumina NovaSeq 6000 (C1) and NextSeq 500 (C2), respectively, and low-depth samples sequenced on Illumina NovaSeq 6000 (C3) and NextSeq 500 (C4), respectively. The histograms are made at bp level (2.82 Mbp; A and B) and region level (n = 5,009; C and D). The vertical dotted lines indicate (μ − 2σ) of the respective distributions. The statistical parameters (μ: average depth, σ: standard deviation, Cv: coefficient of variation) were calculated by trimming observations in the top and bottom 2.5 percentiles. All the sample and region level QC parameters are available in Supplementary Tables S5 and S9.

Overall, the standard deviation is less than or around one-third of the mean, which ensures that most of the target bases/regions are sufficiently covered. Using the two-sigma rule (that approximates the 95% confidence interval), we also measured the percentage of bases/regions with depth higher than (μ − 2σ), as denoted by vertical dotted lines in Fig. 3. We found that 97% and 95% of bases exhibit depth higher than this threshold for NovaSeq data and NextSeq data, respectively (Supplementary Table S5; Supplementary Fig. S4B). Similar trends were observed from the region level analyses (Fig. 3C and D; Supplementary Table S9). These data indicated that the SJPedPanel demonstrates satisfactory capture efficiency that is reproducible over different sequencing platforms.
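As a hedged illustration of the two-sigma check, the sketch below computes the (μ − 2σ) threshold from trimmed statistics and reports the fraction of all positions above it. The function name, the toy depths, and the use of the population standard deviation are assumptions made for the example only.

```python
import statistics

def fraction_above_two_sigma(depths, trim_fraction=0.025):
    """Return (threshold, fraction of all depths strictly above mu - 2*sigma).

    mu and sigma are estimated after trimming trim_fraction from each tail;
    the fraction is then evaluated over the untrimmed data.
    """
    ordered = sorted(depths)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    mu = statistics.fmean(kept)
    sigma = statistics.pstdev(kept)     # assumption: population estimator
    threshold = mu - 2 * sigma
    above = sum(1 for d in depths if d > threshold)
    return threshold, above / len(depths)

# Hypothetical depths: 2 uncovered positions, 96 near 2,000X, 2 very deep.
toy_depths = [0, 0] + [1900, 2000, 2100] * 32 + [9000, 9000]
threshold, fraction = fraction_above_two_sigma(toy_depths)
print(f"threshold={threshold:.0f}X, fraction above={fraction:.2f}")
```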

Next, we analyzed regions that are consistently poorly covered (i.e., less than μ − 3σ) in the COLO829BL data. We identified 27 regions (26 regions are small exons), of which 10 regions consistently showed no coverage across all four samples (column “Remark” in Supplementary Table S9). These 27 regions occupy 8,091 bp (∼0.3%) of the panel, and more than half of these bases (4,972 bp) belong to only two regions, NUTM2A (3,152 bp including intron 1 with 2,050 bp) and DUX4 (1,820 bp including exon 1; Supplementary Table S10; Supplementary Fig. S5). For STAG2, the coverage is slightly below the predefined cutoffs for three regions (351 bp; e.g., ∼500X for the C1 sample, Supplementary Table S10). While looking for the potential reasons for poor coverage of these regions, we observed that flanking regions (±50 bp) of most of these poorly covered regions comprise either high GC content, such as exon 1 of MLLT1 (94% GC), or homopolymer runs, such as T-runs around three regions of the STAG2 gene (Supplementary Table S10), which can be informative for future optimization.

Out of the six panels compared, only the MSK-IMPACT panel was reported to demonstrate 31 consistently poorly covered regions (22), which all happened to be targeted by SJPedPanel as well. Interestingly, SJPedPanel demonstrated sufficient depth of coverage [> (μ − 2σ) of respective COLO829BL sample level cutoffs] for 29 out of the 31 regions (Supplementary Table S11). The remaining two regions, exon 2 of NOTCH2 and exon 15 of PMS2, consistently showed poor coverage as in MSK-IMPACT panel and were also part of the 27 poorly covered regions of SJPedPanel discussed previously (Supplementary Table S10).

Similarly, we analyzed the depth of coverage at designed SNP. Notably, 99.5% of all the 7,590 SNP demonstrate depth more than (μ − 2σ) of the respective sample level cutoffs (Supplementary Table S12; Supplementary Fig. S6A for median and minimum depth of coverages for SNP). As expected, the VAF of all the SNP in control COLO829BL samples were clustered around either 0, 0.5, or 1 (Supplementary Fig. S6B). A total of 3,300 heterozygous SNP (0.3 ≤ VAF ≤ 0.7) are observed in COLO829BL, supporting the informativeness of our designed SNP as mentioned above (∼3,000 heterozygous SNP expected from any donors).
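The heterozygosity criterion stated above (0.3 ≤ VAF ≤ 0.7) can be written as a short sketch. The function name and the example VAF values are hypothetical; only the interval bounds come from the text.

```python
def count_heterozygous(vafs, low=0.3, high=0.7):
    """Count SNPs whose variant allele fraction lies in [low, high]."""
    return sum(1 for vaf in vafs if low <= vaf <= high)

# Hypothetical VAFs clustered around 0, 0.5, and 1, as expected for a
# noncancer control sample.
example_vafs = [0.0, 0.01, 0.48, 0.52, 0.55, 0.99, 1.0]
print(count_heterozygous(example_vafs))
```

Only SNPs counted this way are informative for allelic-imbalance-based CNV/LOH inference, which is why the number of heterozygous sites per donor matters for the design.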

In silico comparison of SJPedPanel and WES in a real-time clinical genomics cohort

Next, we compared the coverage of reported pediatric cancer alterations using WES, as it is an effective capture-sequencing method that targets the coding exons of all genes rather than a panel of genes (thus an upper bound of all coding region–based gene panels). First, we asked whether our panel could offer comparable coverage of driver alterations (with a focus on coding SNV and indels) in pediatric cancers to WES. The recently published “Genomes for Kids” (G4K) study (16) reported pathogenic and likely pathogenic (P/LP) variants (called driver alterations hereafter) using three-platform sequencing (WGS, WES, and RNAseq) from 253 pediatric cases that encompassed 20 cancer subtypes in a real-time clinical genomics (RTCG) setting, thus enabling us to assess the potential of SJPedPanel to cover driver alterations from diverse childhood cancer types. Here, we performed in silico analysis of regions targeted by SJPedPanel and WES using the curated positions of 485 driver alterations (including SNV/indel/SV/ITD; Supplementary Fig. S7A) from the G4K study (Supplementary Table S13A; and demographic information in Supplementary Tables S13B and S13C). SJPedPanel covered 86% of the 485 reported driver alterations as compared with 76% by WES (Fig. 4A, last pair of bars for “All” variants with gray background). Next, we classified the variants into SNV, indel, fusion/SV (structural rearrangements that result in fusion oncoproteins), other SV (structural rearrangements that do not result in fusion oncoproteins but affect cancer driver genes, such as by disrupting tumor suppressor genes), and ITD, by using the class labels in the G4K study (16). As expected, WES only covered 12% of the fusions/SV, namely those that present with either of the breakpoints in exonic regions, whereas SJPedPanel covered 82% of these events. Conversely, although WES covered all the reported driver SNV and indels, SJPedPanel did not cover 7% and 5% of SNV and indels, respectively. Interestingly, the host genes of these uncovered SNV and indels are rarely mutated (<0.1%) in childhood cancers (Fig. 4A; Supplementary Table S13A, “Comment” column; refs. 2, 3). Of note, SJPedPanel covered 100% of the ITDs, whereas WES does not cover 13% of these. In fact, the ITDs missed by WES demonstrate DNA breakpoints that fall in introns and result in duplication of the involved exons, such as tandem duplications in PAX5 (101) and KMT2A (102), for which selected intronic regions were designed in SJPedPanel. Of note, SJPedPanel successfully captured (Supplementary Fig. S8) the recently described UBTF exonic tandem duplications (91).
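An in silico coverage check of this kind reduces to asking whether each curated variant position falls inside a targeted interval. The following is a minimal sketch under stated assumptions: targets are given as 0-based, half-open `(chrom, start, end)` tuples in BED style, an SV counts as covered when any of its breakpoints is targeted, and all coordinates and function names are hypothetical rather than taken from the panel design files.

```python
from bisect import bisect_right

def build_index(targets):
    """Merge overlapping targets per chromosome; return {chrom: (starts, ends)}."""
    by_chrom = {}
    for chrom, start, end in targets:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index

def is_covered(index, chrom, pos):
    """True if pos lies in a merged target interval [start, end)."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]

def event_covered(index, breakpoints):
    """Assumption: an SV is 'covered' if at least one breakpoint is targeted."""
    return any(is_covered(index, chrom, pos) for chrom, pos in breakpoints)

def percent_covered(index, positions):
    """Percentage of single-position variants that fall in targeted regions."""
    hits = sum(is_covered(index, chrom, pos) for chrom, pos in positions)
    return 100.0 * hits / len(positions)

# Hypothetical targets and variant positions.
panel = build_index([("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)])
variants = [("chr1", 100), ("chr1", 300), ("chr2", 15), ("chr3", 1)]
print(percent_covered(panel, variants))
```

Running the same positions against two indexes, one built from the panel targets and one from the exome targets, would give the paired percentages of the kind compared in Fig. 4A.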

Figure 4.

Diagnostic yield of SJPedPanel. A, In silico coverage comparison between SJPedPanel and WES using percent coverage of variants (SNV, indels, fusion, SV, and ITD) reported in the “Genomes for Kids” study (16). The last pair of bars with the gray background shows the combined percent coverage over “All” 485 variants. B, Diagnostic yield of SJPedPanel by sequencing of previously reported 113 cases. The y-axis shows the percentage of covered and detected variants by SJPedPanel over each variant type. The last pair of bars with gray background for “All” variants shows the combined detection rate. Numbers of reported driver alterations are indicated at the bottom of bars for corresponding variant types.

Also, we asked what percentage of patients could benefit from SJPedPanel versus WES. As it turned out, at least one variant per case would have been covered by SJPedPanel in 93% of 208 cases (median, 1; range, 1–13), in contrast to 75% of the cases using WES (median, 2; range, 1–13; Supplementary Fig. S7B). Most of this gain is due to the capture of intronic breakpoints that result in oncogenic fusions (Fig. 4A). These data demonstrated that SJPedPanel exhibits superior potential for detection and reporting of driver alterations in pediatric tumors compared to WES.

Next, we compared the coverage for a subset of alterations (n = 271 out of 485) from the G4K cohort reported to demonstrate diagnostic, prognostic, or targetable value (Supplementary Table S13A—Columns: Diagnostic, Prognostic and Targetable; Supplementary Fig. S9A). Concordant with previous observations, the SJPedPanel overall showed a better coverage of these actionable variants compared with WES. The panel covered 86%, 85%, and 91% cases with diagnostic, prognostic, and targetable potential compared with 47%, 38%, and 78% for WES in respective categories (Supplementary Fig. S9B–D, last pair of bars for “All” variants). Especially for fusion/SV type of variants, the panel showed considerably higher values over these three actionable categories.

Further analysis of the same subset of actionable variants (grouped over gene/fusions) showed that some driver alterations tend to exhibit gender preference in the studied cohort (Supplementary Table S13D). For example, point mutations in SH2B3 were found to be more prevalent in male children with B-/T-ALL compared to female children. It will be interesting to see if these patterns can be validated using a larger patient cohort.

Comparison of diagnostic yield between SJPedPanel and WES

To validate the in silico findings, we resequenced 113 clinical cases with available specimens from previous clinical studies (4, 16) using SJPedPanel. These samples reflect the broad tumor types and subtypes common in childhood cancers, including 27 hematologic malignancies, 43 solid tumors, and 43 brain tumors (Supplementary Table S4A for cohort and demographic description). The demographic summary of these 113 cases is available in Supplementary Table S4B. Common subtypes, such as AML (n = 14), ALL (n = 10), rhabdomyosarcoma (n = 5), neuroblastoma (n = 5), osteosarcoma (n = 3), Wilms tumor (n = 3), high-grade glioma (n = 8), and medulloblastoma (n = 14) are represented in addition to rare entities, such as melanoma (n = 1) and desmoplastic small round cell tumor (n = 2; Supplementary Fig. S10). Among these cases, 389 driver alterations (“Methods”) are reported via three-platform (WGS, WES, RNAseq) sequencing. These include 100 SNV, 40 indels, 55 SV, 184 CNV/LOH, and 10 ITD (Supplementary Tables S4C–S4G). Of these, 361 (92.8%) variants were targeted by the panel, including 94 SNV (94%), 38 indels (95%), 36 fusion/SV (76.59%), three other SV (37.5%), 180 CNV/LOH (97.82%), and 10 ITD (100%; Fig. 4B; Supplementary Table S4H). In total, 28 P/LP variants were not covered by our panel (Supplementary Table S4H). Of these, six were SNV, two were indels, 11 were fusion/SV (not designed), five were other SV, and four were focal CNV. The uncovered variants belonged to 26 genes, which are not mutated in published pediatric pan-cancer cohorts (2, 3), except for COL1A1 that demonstrates a low mutation frequency of 0.2% (Supplementary Fig. S11; Supplementary Tables S4H and S8), further supporting their omission from our panel design.

For all the samples tested, we achieved a mean depth of ∼2,500X (Supplementary Table S14A), which ensures 95% confidence of detection of variants with ≥1% AF (14). As expected, we found an average of 3,300 heterozygous SNP (median, 3,405; range, 2,227–3,844) in germline samples (n = 30). These samples were pooled as a reference to interrogate CNV/LOH in tumor samples (“Method”; Supplementary Table S14B). By using a “rotation control” method (37) coupled with a recently developed indel/SV genotyping tool (“Method”), we detected 98% (354 of the 361 covered variants; Supplementary Table S4H) of reported driver alterations. SNV, indels, and ITDs showed a detection rate of 100% (Fig. 4B). The fusion/SV showed an overall recall rate of 97% (35 out of 36) among the covered alterations. One fusion/SV marker (RUNX1::RUNX1T1 from the case SJCBF100) that was covered in the panel design with a single breakpoint in RUNX1 was missed due to insufficient depth (21X) of coverage (Supplementary Tables S4D and S14A); therefore, poor capture efficiency in certain genomic regions warrants future study. By contrast, we could detect only 38% (three out of eight) of other SV, and the missed SV exhibited their breakpoints either in intergenic or noncovered regions, which is consistent with a much larger genomic space for breakpoints in tumor suppressor genes. In addition, SJPedPanel detected 94.6% of the reported CNV/LOH (Supplementary Table S4H). Collectively, SJPedPanel detected 91% of the 389 reported driver alterations from these 113 cases (Fig. 4B, last pair of bars with gray background). Of note, at least one variant was detected for 96.5% of cases (n = 113), with a median of three variants per case (range, 0–16; Supplementary Table S15). Consistent with the in silico analysis (Fig. 4A), a comparison of SNV, indel, SV, and ITD variants (n = 205 out of 389) discovered using three-platform sequencing (WGS, WES, RNAseq; refs. 4, 16) indicated that SJPedPanel covers 88% of these variants, whereas WES covers 78% (Supplementary Fig. S12). These data further confirm the superior performance achieved using SJPedPanel for childhood cancers with a panel size approximately 10% the size of WES.
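The headline percentages in this section follow directly from the counts quoted in the text, and the arithmetic can be rechecked with a few lines. The sketch below uses only numbers stated above (covered counts per class, 354 detected, 389 reported); the variable names are illustrative.

```python
# Covered variants per class, as reported in the text.
covered = {
    "SNV": 94,
    "indel": 38,
    "fusion/SV": 36,
    "other SV": 3,
    "CNV/LOH": 180,
    "ITD": 10,
}
reported_total = 389   # driver alterations reported by three-platform sequencing
detected_total = 354   # covered variants detected by the panel

covered_total = sum(covered.values())
pct_covered = 100 * covered_total / reported_total
pct_detected_of_covered = 100 * detected_total / covered_total
pct_detected_of_reported = 100 * detected_total / reported_total

print(covered_total)                         # 361
print(round(pct_covered, 1))                 # 92.8
print(round(pct_detected_of_covered, 1))     # 98.1
print(round(pct_detected_of_reported, 1))    # 91.0
```

The two detection rates differ only in their denominators: 98% is relative to variants the panel was designed to cover, whereas 91% is relative to everything reported for the cohort.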

We also highlight the successes and challenges in panel design by using structural variants as examples. First, due to the large genomic size of intronic regions that can be involved in oncogenic fusions, inclusion/exclusion of intronic regions involves a difficult balance between panel size and effective coverage of the patient population. In our 2023 study of fusion gene pairs involved in 5,190 childhood cancers (11), 72 representative genes were selected for 274 fusion gene pairs, and SJPedPanel included 53 of these 72 genes. The maximum mutation frequency of the 19 genes not included in our panel (Supplementary Table S7D) is 0.1% (11). Furthermore, inclusion of all relevant introns of all genes involved in oncogenic fusions would need ∼8 Mbp (described in the Source data file of Fig. 2K and L in Liu and colleagues; ref. 11). With the observation that some genes can present with multiple fusion partners (e.g., 40 of the 53 included representative genes exhibit between two and 32 partner genes), we intentionally left out some partner genes by relying on common representative genes, which reduced the space from ∼8 Mbp (280 introns) to ∼1 Mbp (156 introns). For example, including only the intronic regions of KMT2A was sufficient to detect the KMT2A::MLLT1 fusion in a pediatric patient with T-ALL (SJMLL002) even though we did not include MLLT1 introns in the panel content (Supplementary Fig. S13). Despite this success, SV can be challenging to detect. For example, SJPedPanel missed RUNX1::RUNX1T1 in SJCBF100 because RUNX1T1 introns were not included (∼100 kb needed). As expected, the panel also missed an inversion involving the RB1 gene in case SJRB0051 (Supplementary Fig. S14) because the DNA breakpoints fall into an intronic region of RB1, and RB1 introns are not covered (∼180 kb are needed to cover all RB1 introns).

Apart from SV/fusions, SJPedPanel covers well-known ITD, such as those in FLT3, NOTCH1, and BRAF (Supplementary Table S4G). Also, SJPedPanel provides exclusive coverage of ITD in the UBTF gene, which were recently described in pediatric AML (91). Although UBTF ITD are typically mistaken for small indels in the literature (3, 91, 103, 104), our panel successfully detected UBTF tandem duplications in two pediatric AML cases (SJAML015373 and SJAML016569; Supplementary Fig. S8; ref. 91).

Determining limit of detection

One of the important applications of panel sequencing is disease monitoring, in which the tumor burden is typically less than 1%, and thus, variants are rare and challenging to detect. To investigate the applicability of our panel, we first performed dilution experiments (with seven tumor concentrations of 10%, 5%, 2.5%, 1%, 0.5%, 0.2%, and 0.1%, in addition to pure normal of 0% and pure cancer of 100%) using six pediatric cancer cell lines (697, EW8, K562, ME1, MOLM13, and Rh30) and one noncancer cell line (GM12878) as a normal control. These six lines collectively contain 26 unique P/LP variants (14 SNV, four indels, and eight SV; Supplementary Table S3; refs. 105, 106). The lack of shared driver alterations allowed for pooling of dilutions to reduce the experimental complexity while keeping the diversity of mutation types. For example, at a ladder concentration of 0.5%, we mixed cell equivalents in a ratio of 5:5:5:5:5:5:970 from the six cancer lines and the normal line, respectively. To ensure sufficient power of detecting variants with low AF, we achieved an average depth of 5,000X for the 0.5% dilution concentration and >7,000X for the 0.2% and 0.1% dilution concentrations (“Method,” Supplementary Table S6).
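The pooled mixing ratio quoted for the 0.5% ladder generalizes to the other concentrations. The sketch below is an illustration only: the function name and the choice of 1,000 total parts are assumptions, and it simply allots each of the six cancer lines the stated percentage and assigns the remainder to the normal line.

```python
def pooled_dilution_parts(tumor_percent, n_tumor_lines=6, total_parts=1000):
    """Parts of each cancer line and of the normal line for one pooled dilution.

    Each cancer line contributes tumor_percent of the pool; the normal line
    fills the remainder. Returns a list of n_tumor_lines + 1 values.
    """
    per_line = tumor_percent / 100 * total_parts
    normal = total_parts - n_tumor_lines * per_line
    if normal < 0:
        raise ValueError("tumor lines exceed the total pool")
    return [per_line] * n_tumor_lines + [normal]

# The 0.5% ladder reproduces the 5:5:5:5:5:5:970 ratio given in the text.
for pct in (10, 5, 2.5, 1, 0.5, 0.2, 0.1):
    print(pct, [round(p, 1) for p in pooled_dilution_parts(pct)])
```

Because the six lines share no driver alterations, each variant in the pool is diluted independently to the per-line concentration, which is what makes a single pooled library per ladder step sufficient.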

We used SequencErr (29) and a newly developed SVindelGenotyper (manuscript under review; ref. 34) to perform allele counting followed by variant calling using binomial models with false discovery rate control (“Method”), in which the pure normal of 0% was used to estimate background error rates as no cancer-driving somatic alterations are expected in noncancer cell line GM12878. As seen in Fig. 5A, the observed AF closely represent corresponding dilution ladders, with R-squared values of 0.7 and 0.72, for biologic replicates A and B, respectively (Supplementary Table S16). Notably, although we achieved a >90% detection rate when the dilution concentration is above 0.5%, it diminishes quickly at lower dilutions. At a dilution concentration of 0.1%, the recall rate was 69% and 42% in replicates A and B, respectively.
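To make the phrase "binomial models with false discovery rate control" concrete, the following is a generic sketch, not the SequencErr or SVindelGenotyper implementation. It assumes a one-sided exact binomial test of the variant read count against a background error rate estimated from the pure normal sample, followed by Benjamini-Hochberg adjustment and a Q < 0.05 call threshold; the error rate, read counts, and function names are hypothetical.

```python
import math

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summing the upper tail in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical sites: (variant reads, depth); error rate is an assumed value
# standing in for the estimate from the 0% (pure normal) sample.
background_error = 1e-4
sites = [(25, 5000), (1, 5000)]
pvals = [binom_upper_tail(alt, depth, background_error) for alt, depth in sites]
qvals = benjamini_hochberg(pvals)
calls = [q < 0.05 for q in qvals]
print(calls)
```

A site supported by many more variant reads than the background error rate predicts receives a very small Q value, whereas a single supporting read at 5,000X is indistinguishable from sequencing error under this model.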

Figure 5.

Ultrasensitive detection using SJPedPanel. A, Determining the LoD with SJPedPanel. The observed AF of 26 driver alterations are shown on the y-axis as a function of corresponding dilution concentration shown in the x-axis. The detection rate for each dilution concentration is shown on top as magenta text. The observed AF of variants from normal and pure tumor cell lines are also shown using green and black points, respectively. The variants detected (Q < 0.05) are shown in blue, whereas those missed (Q > 0.05) are shown in red. Shown are results from replicate A and replicate B. B, Real-time tumor tracking in case SJAML016582. The estimated cellular fractions of subclones at four timepoints from diagnosis (day 0) to relapse (day 314) are shown as a river-plot. Subclones with very low cellularity [e.g., NRAS Q61R (SNV) at diagnosis] were adjusted for visualization purposes using an R package fishplot (17). Actual values are available in Supplementary Table S18—Case SJAML016582.

We also determined detection rates for each class of variant. As it turned out, indels and SV exhibit a better detection rate at lower dilution concentrations (<1%) than SNV (Supplementary Table S16), which is consistent with our observations in a study of error rates of SV and indels (under review).

To further investigate the effect of sequencing depth on detection rate, we performed an in silico downsampling experiment (“Method”; Supplementary Table S17). For all the dilutions with concentrations ≥1%, which were initially sequenced at 5,000X and 2,500X (Supplementary Fig. S15A and S15B), the recall rate was close to 100% even after downsampling their depths to 1,000X. However, the recall rate declined with downsampling depth for dilution concentrations <1%. For samples with a dilution concentration of 0.5% (Supplementary Fig. S15B, top panel), the recall rate dropped from 97% at 3,000X to 75% at 1,000X. Thus, markers with AF of 0.5% can be reliably detected at 2,500X–3,000X, which is concordant with our theoretical binomial calculation of 2,100X (“Method,” Fig. 5A). However, for markers with AF of 0.2% and 0.1%, the recall rate was below 50% even with initial data at 10,000X coverage (Supplementary Fig. S15C). This finding suggests that the current LoD is between 0.1% and 0.5% and is consistent with a recent report using cfDNA data (107).

Our data on the capture efficiency and dilution experiments together revealed insights on the designing of sequencing depth for a desired power. For example, at a predetermined sensitivity level of 0.5% AF, 2,100X depth is needed to ensure 95% chance of recall. The standard deviation of Fig. 3 indicates that we should aim for an average depth of 3 × 2,100 (=6,300X), to ensure 95% of targeted regions achieve >2,100X depth. Because the total space of this panel is ∼0.15% of a human genome, 6,300X corresponds to whole genome sequencing at 9X coverage. Similar estimates can be derived for the other sensitivity/recall rates.
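The depth-planning reasoning above can be sketched in three steps: a binomial power calculation for the depth needed at a given AF, an inflation of that depth to account for capture nonuniformity, and a conversion to a genome-wide equivalent. The sketch rests on assumptions that are not stated in the text: the minimum number of supporting reads required for a call (`min_alt_reads`) is a free parameter here, and the factor of 3 is read as following from σ ≈ μ/3, so that μ − 2σ ≈ μ/3. All function names are illustrative.

```python
import math

def detection_probability(depth, af, min_alt_reads):
    """P(at least min_alt_reads variant reads) with reads ~ Binomial(depth, af)."""
    log_p, log_q = math.log(af), math.log1p(-af)
    below = 0.0
    for i in range(min_alt_reads):
        below += math.exp(math.lgamma(depth + 1) - math.lgamma(i + 1)
                          - math.lgamma(depth - i + 1)
                          + i * log_p + (depth - i) * log_q)
    return 1.0 - below

def min_depth(af, min_alt_reads, power=0.95, step=10, max_depth=200_000):
    """Smallest depth (on a grid of `step`) reaching the requested power."""
    for depth in range(step, max_depth + 1, step):
        if detection_probability(depth, af, min_alt_reads) >= power:
            return depth
    return None

def mean_depth_for_floor(required_depth, cv):
    """Mean depth such that (mu - 2*sigma) equals required_depth, sigma = cv*mu."""
    return required_depth / (1 - 2 * cv)

def genome_equivalent(mean_panel_depth, panel_fraction):
    """Genome-wide coverage that costs the same number of sequenced bases."""
    return mean_panel_depth * panel_fraction

# With CV = 1/3 the required 2,100X floor implies a mean of about 6,300X,
# and 6,300X over ~0.15% of the genome equals about 9.45X genome-wide.
print(round(mean_depth_for_floor(2100, 1 / 3)))
print(round(genome_equivalent(6300, 0.0015), 2))
# The read-support threshold below is a hypothetical choice, not the study's.
print(min_depth(af=0.005, min_alt_reads=5))
```

Changing `af`, `power`, or the assumed read-support threshold yields the analogous estimates for other sensitivity and recall targets.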

Case study: real-time tracking of relapsed AML using deep sequencing

To test the suitability of the panel for disease monitoring in a real-world setting, we chose two AML cases (SJAML016582 and SJAML016551) that demonstrated remission samples with low tumor purity. These cases exhibit multiple pathogenic and likely pathogenic variants (P/LP) reported by clinical sequencing of diagnosis and relapse samples (Supplementary Table S18) and are ideal for panel sequencing. The diagnosis, remission, and relapse samples were sequenced to an average depth of 5,000X using SJPedPanel per the previously mentioned power calculations.

In case SJAML016582, apart from subtype-defining NUP98::NSD1 fusion, four pathogenic variants were detected at diagnosis (day 0), and six pathogenic variants were detected at relapse (day 315), with two variants shared between diagnosis and relapse. With the ultradeep panel sequencing data, we recovered all variants initially detected by whole genome sequencing, including four at diagnosis and six at relapse. Interestingly, panel sequencing detected an SNV encoding NRAS Q61R in the diagnostic tumor with an allelic fraction of 0.12%, and an MNV encoding NRAS Q61R in the relapse tumor with an allelic fraction of 0.07%. Both of these variants were beyond the detection limits of whole genome sequencing (Supplementary Table S18—Case SJAML016582; Fig. 5B). Further, in the day 26 remission data, we detected a high tumor burden (6.56%, Method). Notably, the tumor burden continued to decrease down to AF of 2.88% at day 97, as reflected by the SV responsible for NUP98::NSD1 fusion.

In addition to disease monitoring, we also applied ultradeep sequencing for detecting measurable residual disease (MRD) to investigate patient response to chemotherapy. Generally, flow cytometry is a method of choice for such applications (108) in addition to real-time PCR (109) and droplet digital PCR (110). However, these approaches are not as scalable as ultradeep sequencing. In order to evaluate the efficacy of SJPedPanel to detect MRD, we sequenced three samples from diagnosis (day 0), MRD (day 23), and relapse (day 344) for AML patient SJAML016551. Here, the diagnosis and relapse samples are used to define ancestral genetic alterations that are obligated to be present in the MRD sample. For this case, five pathogenic variants, including a KMT2A::MLLT10 structural variant, were identified to be shared between diagnosis and relapse and are expected to be detected in the MRD sample. Although flow-based MRD detection was negative for this case, our deep sequencing detected all five pathogenic variants with AF range between 0.7% and 1% (Supplementary Table S18—Case SJAML016551; Supplementary Fig. S16). Together, our data demonstrate the potential ability of SJPedPanel in measuring MRD and monitoring disease progression that could aid in early detection of relapse.

Discussion

We developed SJPedPanel, a hybridization capture-based assay targeting 357 pediatric relevant genes, including specific oncogenes and tumor suppressors implicated in 44 pediatric cancer genomic studies (2–4, 11, 16, 57, 64, 66, 88), many through the extensive efforts of the Pediatric Cancer Genome Project (1) and NCI TARGET project (3), as well as real-time clinical sequencing efforts at St. Jude Children’s Research Hospital employing WGS, WES, and RNAseq platforms (4, 16). SJPedPanel includes 297 introns that contain structural variants in 94 genes, accounting for 1.438 Mbp of genomic space, as well as 0.209 Mbp for detecting promoter-hijacking SV. Furthermore, 7,590 common SNP are covered, allowing for the detection of copy-number changes and LOH at a resolution of ∼1 Mb. Due to the large differences in the genomic landscape of cancers in children and adults (3), SJPedPanel exhibits unique coverage of 105 genes frequently implicated in pediatric cancers when compared with other commonly used panels (FoundationOne Heme, FoundationOne CDx, MSK-IMPACT, CHOP CHMP/CSTP, OncoKids, and Oncomine Comprehensive Assay v3). One of the limitations of the current study is that the SJPedPanel was compared with other panels using in silico methods due to limited availability of specimens, which can be addressed in future prospective studies.

We used in silico and experimental strategies to evaluate the performance of SJPedPanel for detecting clinically relevant somatic mutations. Using the previously published G4K study, SJPedPanel was found to cover 86% of the reported somatic markers, including SNV, indel, and SV. Similar findings were obtained by using real-time clinical sequencing samples (4, 16) in which 91% of the reported clinically relevant variants were detected, including all SNV, indels, and ITD. At the patient level, at least one variant was detected in 96% of cases (median 3). These findings establish the ability of SJPedPanel to detect clinically relevant somatic mutations in a wide range of samples from an RTCG setting for childhood cancers.

Although tumor-normal paired whole genome sequencing remains the gold standard for cancer diagnostics, the overall cost and the required bioinformatic pipelines and infrastructure currently limit its broad application. Conversely, gene panel–based genomics testing can enable many centers to perform NGS-based cancer diagnostics at an overall lower cost and faster turnaround time. The content of SJPedPanel allows for more comprehensive detection of the alterations common in pediatric cancer, compared with WES and other panels. An inherent limitation of DNA sequencing panels is the lack of coverage at all critical loci or newly discovered recurrent alterations; however, panel content can readily be updated. For example, the current version of SJPedPanel lacks sufficient coverage to identify the recurrent ASPSCR1::TFE3 fusion characteristic of alveolar soft part sarcoma or SSX1/SSX2::SS18 in synovial sarcoma. Such genes will be incorporated in future versions. To maximize the utility of this panel, the targeted genomic locations are included in Supplementary Tables S1 and S2, and future updates will be made readily available to the public.

As a proof of principle, we demonstrate the application of SJPedPanel for MRD detection and posttreatment disease monitoring. Further investigation of SJPedPanel in detecting residual disease and in cancer monitoring using larger cohort sizes is warranted. Because this is a pan-cancer gene panel for childhood malignancies, it will be relatively straightforward to develop subpanels dedicated to certain cancer subtypes to further reduce the size of the panel, in turn enabling much higher depth at a similar cost. We anticipate that SJPedPanel will significantly enhance the management and diagnosis of pediatric cancer. Due to technological advancements and continuous research in pediatric cancers, we expect the gene content of SJPedPanel will keep evolving, and revisions will be made public. We look forward to driving future development and dissemination of this panel through close engagement with academic and industrial partners.

Supplementary Material

Supplementary Tables 1

Supplementary Tables S1-S18

Supplementary Figures 1

Supplementary Figures S1-S16

Acknowledgments

X. Ma and J.M. Klco thank Elizabeth Stewart for providing cell lines for our dilution experiments. The authors thank the anonymous reviewers for their valuable insights in improving the presentation. This work was supported in part by the National Cancer Institute of the National Institutes of Health under Award Number R01CA273326 (to X. Ma) and T32CA236748 (to A.L.W. Huskey), the Fund for Innovation in Cancer Informatics (www.the-ici-fund.org, to X. Ma and J.M. Klco), Cancer Center Support Grant P30CA021765 (Developmental Fund to J.M. Klco and X. Ma) from the National Institutes of Health, and American Lebanese Syrian Associated Charities (ALSAC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other funding agencies.

Footnotes

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

Authors’ Disclosures

X. Ma reports grants from National Institute of Health, Fund for Innovation in Cancer Informatics, and National Institute of Health during the conduct of the study. No disclosures were reported by the other authors.

Authors’ Contributions

P. Kolekar: Data curation, methodology, investigation, validation, formal analysis, visualization, software, writing–original draft. V. Balagopal: Data curation, investigation, formal analysis, writing–original draft. L. Dong: Data curation, investigation, formal analysis. Y. Liu: Software, data curation. S. Foy: Formal analysis. Q. Tran: Software. H. Mulder: Methodology. A.L.W. Huskey: Data curation. E. Plyler: Data curation. Z. Liang: Data curation. J. Ma: Data curation. J. Nakitandwe: Data curation. J. Gu: Data curation. M. Namwanje: Data curation. J. Maciaszek: Data curation. D. Payne-Turner: Data curation. S. Mallampati: Methodology, data curation. L. Wang: Data curation, writing–review and editing. J. Easton: Methodology, resources. J.M. Klco: Conceptualization, funding acquisition, resources, writing–original draft, writing–review and editing. X. Ma: Conceptualization, funding acquisition, writing–original draft, writing–review and editing. All authors read and approved the final manuscript.

References

  • 1. Downing JR, Wilson RK, Zhang J, Mardis ER, Pui C-H, Ding L, et al. The pediatric cancer genome project. Nat Genet 2012;44:619–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Gröbner SN, Worst BC, Weischenfeldt J, Buchhalter I, Kleinheinz K, Rudneva VA, et al. The landscape of genomic alterations across childhood cancers. Nature 2018;555:321–27. [DOI] [PubMed] [Google Scholar]
  • 3. Ma X, Liu Y, Liu Y, Alexandrov LB, Edmonson MN, Gawad C, et al. Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours. Nature 2018;555:371–76. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Rusch M, Nakitandwe J, Shurtleff S, Newman S, Zhang Z, Edmonson MN, et al. Clinical cancer genomic profiling by three-platform sequencing of whole genome, whole exome and transcriptome. Nat Commun 2018;9:3962. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Hiemenz MC, Ostrow DG, Busse TM, Buckley J, Maglinte DT, Bootwalla M, et al. OncoKids: a comprehensive next-generation sequencing panel for pediatric malignancies. J Mol Diagn 2018;20:765–76. [DOI] [PubMed] [Google Scholar]
  • 6. Cheng DT, Mitchell TN, Zehir A, Shah RH, Benayed R, Syed A, et al. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): a hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J Mol Diagn 2015;17:251–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Singh RR. Next-generation sequencing in high-sensitive detection of mutations in tumors: challenges, advances, and applications. J Mol Diagn 2020;22:994–1007. [DOI] [PubMed] [Google Scholar]
  • 8. Rehm HL. Disease-targeted sequencing: a cornerstone in the clinic. Nat Rev Genet 2013;14:295–300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Chen X, Yang W, Roberts CWM, Zhang J. Developmental origins shape the paediatric cancer genome. Nat Rev Cancer 2024;24:382–98. [DOI] [PubMed] [Google Scholar]
  • 10. Wu G, Broniscer A, McEachron TA, Lu C, Paugh BS, Becksfort J, et al. Somatic histone H3 alterations in pediatric diffuse intrinsic pontine gliomas and non-brainstem glioblastomas. Nat Genet 2012;44:251–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Liu Y, Klein J, Bajpai R, Dong L, Tran Q, Kolekar P, et al. Etiology of oncogenic fusions in 5,190 childhood cancers and its clinical and therapeutic implication. Nat Commun 2023;14:1739. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Craig DW, Nasser S, Corbett R, Chan SK, Murray L, Legendre C, et al. A somatic reference standard for cancer genome sequencing. Sci Rep 2016;6:24607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Pleasance ED, Cheetham RK, Stephens PJ, McBride DJ, Humphray SJ, Greenman CD, et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 2010;463:191–96. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Ma X, Arunachalam S, Liu Y. Applications of probability and statistics in cancer. Quantitative Biol 2020;8:15–108. [Google Scholar]
  • 15. Ma X, Shao Y, Tian L, Flasch DA, Mulder HL, Edmonson MN, et al. Analysis of error profiles in deep next-generation sequencing data. Genome Biol 2019;20:50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Newman S, Nakitandwe J, Kesserwan CA, Azzato EM, Wheeler DA, Rusch M, et al. Genomes for Kids: the scope of pathogenic mutations in pediatric cancer revealed by comprehensive DNA and RNA sequencing. Cancer Discov 2021;11:3008–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Miller CA, McMichael J, Dang HX, Maher CA, Ding L, Ley TJ, et al. Visualizing tumor evolution with the fishplot package for R. BMC Genomics 2016;17:880. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Illumina. DNA prep with exome 2.0 BED files. 2023 [cited 2023 June 22]. Available from: https://support.illumina.com/downloads/Illumina-dna-prep-exome-20-bed-files.html.
  • 19. FoundationOne Heme. 2019 [cited 2023 June 22]. Available from: https://assets.ctfassets.net/w98cd481qyp0/42r1cTE8VR4137CaHrsaen/baf91080cb3d78a52ada10c6358fa130/FoundationOne_Heme_Technical_Specifications.pdf.
  • 20. He J, Abdel-Wahab O, Nahas MK, Wang K, Rampal RK, Intlekofer AM, et al. Integrated genomic DNA/RNA profiling of hematologic malignancies in the clinical setting. Blood 2016;127:3004–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. FoundationOne CDx. 2022 [cited 2023 June 22]. Available from: https://assets.ctfassets.net/w98cd481qyp0/YqqKHaqQmFeqc5ueQk48w/d12f19680205941ea3fee417f08e9524/F1CDx_Technical_Specifications.pdf.
  • 22. MSK-IMPACT Panel. 2017 [cited 2023 June 22]. Available from: https://www.accessdata.fda.gov/cdrh_docs/reviews/den170058.pdf.
  • 23. Surrey LF, MacFarland SP, Chang F, Cao K, Rathi KS, Akgumus GT, et al. Clinical utility of custom-designed NGS panel testing in pediatric tumors. Genome Med 2019;11:32. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Thermo Fisher. Oncomine Comprehensive Assay v3. 2022 [cited 2023 June 22]. Available from: https://assets.thermofisher.com/TFS-Assets/LSG/brochures/oncomine-comprehensive-assay-v3-flyer.pdf.
  • 25. Hovelson DH, McDaniel AS, Cani AK, Johnson B, Rhodes K, Williams PD, et al. Development and validation of a scalable next-generation sequencing system for assessing relevant somatic variants in solid tumors. Neoplasia 2015;17:385–99. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Brown GR, Hem V, Katz KS, Ovetsky M, Wallin C, Ermolaeva O, et al. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res 2015;43:D36–42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Conway JR, Lex A, Gehlenborg N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 2017;33:2938–40. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Andrews S. FastQC: a quality control tool for high throughput sequence data. [cited 2023 June 22]. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
  • 29. Davis EM, Sun Y, Liu Y, Kolekar P, Shao Y, Szlachta K, et al. SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data. Genome Biol 2021;22:37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010;26:589–95. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics 2009;25:2078–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 2010;26:841–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. DeepSeqCoverageQC. 2023 [cited 2023 June 22]. Available from: https://github.com/pandurang-kolekar/DeepSeqCoverageQC.
  • 34. SVindelGenotyper. 2023 [cited 2023 June 22]. Available from: https://github.com/stjude/SVindelGenotyper.
  • 35. Talevich E, Shain AH, Botton T, Bastian BC. CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLoS Comput Biol 2016;12:e1004873. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Kolekar P. SJPedPanel: Supplementary Data - Output files and diagrams generated by CNVkit v0.9.10 and the allelic imbalance figures used to review CNV and LOH events. Zenodo. 2023 [cited 2023 June 22]. Available from: https://zenodo.org/doi/10.5281/zenodo.8173838. [Google Scholar]
  • 37. Li B, Brady SW, Ma X, Shen S, Zhang Y, Li Y, et al. Therapy-induced mutations drive the genomic landscape of relapsed acute lymphoblastic leukemia. Blood 2020;135:41–55. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Storey JD, Bass AJ, Dabney A, Robinson D. qvalue: Q-value estimation for false discovery rate control. R package version 2.34.0. 2023 [cited 2023 June 22]. Available from: https://bioconductor.org/packages/qvalue. [Google Scholar]
  • 39. R Core Team. R: a language and environment for statistical computing. 2023 [cited 2023 June 22]. Available from: https://www.R-project.org/.
  • 40. Cheung N-KV, Zhang J, Lu C, Parker M, Bahrami A, Tickoo SK, et al. Association of age at diagnosis and genetic mutations in patients with neuroblastoma. JAMA 2012;307:1062–71. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Gruber TA, Larson Gedman A, Zhang J, Koss CS, Marada S, Ta HQ, et al. An inv(16)(p13.3q24.3)-encoded CBFA2T3-GLIS2 fusion protein defines an aggressive subtype of pediatric acute megakaryoblastic leukemia. Cancer Cell 2012;22:683–97. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Roberts KG, Morin RD, Zhang J, Hirst M, Zhao Y, Su X, et al. Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia. Cancer Cell 2012;22:153–66. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Robinson G, Parker M, Kranenburg TA, Lu C, Chen X, Ding L, et al. Novel mutations target distinct subgroups of medulloblastoma. Nature 2012;488:43–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Zhang J, Benavente CA, McEvoy J, Flores-Otero J, Ding L, Chen X, et al. A novel retinoblastoma therapy from genomic and epigenetic analyses. Nature 2012;481:329–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Zhang J, Ding L, Holmfeldt L, Wu G, Heatley SL, Payne-Turner D, et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 2012;481:157–63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Chen X, Stewart E, Shelat AA, Qu C, Bahrami A, Hatley M, et al. Targeting oxidative stress in embryonal rhabdomyosarcoma. Cancer Cell 2013;24:710–24. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Holmfeldt L, Wei L, Diaz-Flores E, Walsh M, Zhang J, Ding L, et al. The genomic landscape of hypodiploid acute lymphoblastic leukemia. Nat Genet 2013;45:242–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Jaffe JD, Wang Y, Chan HM, Zhang J, Huether R, Kryukov GV, et al. Global chromatin profiling reveals NSD2 mutations in pediatric acute lymphoblastic leukemia. Nat Genet 2013;45:1386–91. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Paugh BS, Zhu X, Qu C, Endersby R, Diaz AK, Zhang J, et al. Novel oncogenic PDGFRA mutations in pediatric high-grade gliomas. Cancer Res 2013;73:6219–29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Shah S, Schrader KA, Waanders E, Timms AE, Vijai J, Miething C, et al. A recurrent germline PAX5 mutation confers susceptibility to pre-B cell acute lymphoblastic leukemia. Nat Genet 2013;45:1226–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Zhang J, Wu G, Miller CP, Tatevossian RG, Dalton JD, Tang B, et al. Whole-genome sequencing identifies genetic alterations in pediatric low-grade gliomas. Nat Genet 2013;45:602–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Chen X, Bahrami A, Pappo A, Easton J, Dalton J, Hedlund E, et al. Recurrent somatic structural variations contribute to tumorigenesis in pediatric osteosarcoma. Cell Rep 2014;7:104–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Parker M, Mohankumar KM, Punchihewa C, Weinlich R, Dalton JD, Li Y, et al. C11orf95-RELA fusions drive oncogenic NF-κB signalling in ependymoma. Nature 2014;506:451–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Roberts KG, Li Y, Payne-Turner D, Harvey RC, Yang Y-L, Pei D, et al. Targetable kinase-activating lesions in Ph-like acute lymphoblastic leukemia. N Engl J Med 2014;371:1005–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Tirode F, Surdez D, Ma X, Parker M, Le Deley MC, Bahrami A, et al. Genomic landscape of Ewing sarcoma defines an aggressive subtype with co-association of STAG2 and TP53 mutations. Cancer Discov 2014;4:1342–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Wu G, Diaz AK, Paugh BS, Rankin SL, Ju B, Li Y, et al. The genomic landscape of diffuse intrinsic pontine glioma and pediatric non-brainstem high-grade glioma. Nat Genet 2014;46:444–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Andersson AK, Ma J, Wang J, Chen X, Gedman AL, Dang J, et al. The landscape of somatic mutations in infant MLL-rearranged acute lymphoblastic leukemias. Nat Genet 2015;47:330–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Li B, Li H, Bai Y, Kirschner-Schwabe R, Yang JJ, Chen Y, et al. Negative feedback-defective PRPS1 mutants drive thiopurine resistance in relapsed childhood ALL. Nat Med 2015;21:563–71. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Lu C, Zhang J, Nagahawatte P, Easton J, Lee S, Liu Z, et al. The genomic landscape of childhood and adolescent melanoma. J Invest Dermatol 2015;135:816–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Ma X, Edmonson M, Yergeau D, Muzny DM, Hampton OA, Rusch M, et al. Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia. Nat Commun 2015;6:6604. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Pinto EM, Chen X, Easton J, Finkelstein D, Liu Z, Pounds S, et al. Genomic landscape of paediatric adrenocortical tumours. Nat Commun 2015;6:6302. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Tong Y, Merino D, Nimmervoll B, Gupta K, Wang Y-D, Finkelstein D, et al. Cross-species genomics identifies TAF12, NFYC, and RAD54L as choroid plexus carcinoma oncogenes. Cancer Cell 2015;27:712–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Zhang J, Walsh MF, Wu G, Edmonson MN, Gruber TA, Easton J, et al. Germline mutations in predisposition genes in pediatric cancer. N Engl J Med 2015;373:2336–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Faber ZJ, Chen X, Gedman AL, Boggs K, Cheng J, Ma J, et al. The genomic landscape of core-binding factor acute myeloid leukemias. Nat Genet 2016;48:1551–56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Iacobucci I, Li Y, Roberts KG, Dobson SM, Kim JC, Payne-Turner D, et al. Truncating erythropoietin receptor rearrangements in acute lymphoblastic leukemia. Cancer Cell 2016;29:186–200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Liu Y-F, Wang B-Y, Zhang W-N, Huang J-Y, Li B-S, Zhang M, et al. Genomic profiling of adult and pediatric B-cell acute lymphoblastic leukemia. EBioMedicine 2016;8:173–83. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Zhang J, McCastlain K, Yoshihara H, Xu B, Chang Y, Churchman ML, et al. Deregulation of DUX4 and ERG in acute lymphoblastic leukemia. Nat Genet 2016;48:1481–89. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. de Rooij JDE, Branstetter C, Ma J, Li Y, Walsh MP, Cheng J, et al. Pediatric non-Down syndrome acute megakaryoblastic leukemia is characterized by distinct genomic subsets with varying outcomes. Nat Genet 2017;49:451–56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Liu Y, Easton J, Shao Y, Maciaszek J, Wang Z, Wilkinson MR, et al. The genomic landscape of pediatric and young adult T-lineage acute lymphoblastic leukemia. Nat Genet 2017;49:1211–18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Northcott PA, Buchhalter I, Morrissy AS, Hovestadt V, Weischenfeldt J, Ehrenberger T, et al. The whole-genome landscape of medulloblastoma subtypes. Nature 2017;547:311–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Alexander TB, Gu Z, Iacobucci I, Dickerson K, Choi JK, Xu B, et al. The genetic basis and cell of origin of mixed phenotype acute leukaemia. Nature 2018;562:373–79. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Bolouri H, Farrar JE, Triche T Jr, Ries RE, Lim EL, Alonzo TA, et al. The molecular landscape of pediatric acute myeloid leukemia reveals recurrent structural alterations and age-specific mutational interactions. Nat Med 2018;24:103–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Pajtler KW, Wen J, Sill M, Lin T, Orisme W, Tang B, et al. Molecular heterogeneity and CXorf67 alterations in posterior fossa group A (PFA) ependymomas. Acta Neuropathol 2018;136:211–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Stewart E, McEvoy J, Wang H, Chen X, Honnell V, Ocarz M, et al. Identification of therapeutic targets in rhabdomyosarcoma through integrated genomic, epigenomic, and proteomic analyses. Cancer Cell 2018;34:411–26.e19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Brady SW, Ma X, Bahrami A, Satas G, Wu G, Newman S, et al. The clonal evolution of metastatic osteosarcoma as shaped by cisplatin treatment. Mol Cancer Res 2019;17:895–906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Newman S, Fan L, Pribnow A, Silkov A, Rice SV, Lee S, et al. Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas. Nat Med 2019;25:597–602. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77. Brady SW, Liu Y, Ma X, Gout AM, Hagiwara K, Zhou X, et al. Pan-neuroblastoma analysis reveals age- and signature-associated driver alterations. Nat Commun 2020;11:5183. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Brady SW, Ma X, Zhou B-BS, Pui C-H, Yang JJ, Zhang J. Therapy-induced mutagenesis in relapsed ALL is supported by mutational signature analysis. Blood 2020;136:2235–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Waanders E, Gu Z, Dobson SM, Antić Ž, Crawford JC, Ma X, et al. Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia. Blood Cancer Discov 2020;1:96–111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Oróstica KY, Verdugo RA. chromPlot: visualization of genomic data in chromosomal context. Bioinformatics 2016;32:2366–68. [DOI] [PubMed] [Google Scholar]
  • 81. Mansour MR, Abraham BJ, Anders L, Berezovskaya A, Gutierrez A, Durbin AD, et al. Oncogene regulation. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science 2014;346:1373–77. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82. Sjoberg D. ggsankey: an R package. 2021 [cited 2023 June 22]. Available from: https://github.com/davidsjoberg/ggsankey.
  • 83. Zhou X, Edmonson MN, Wilkinson MR, Patel A, Wu G, Liu Y, et al. Exploring genomic alteration in pediatric cancer using ProteinPaint. Nat Genet 2016;48:4–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Zhou X, Wang J, Patel J, Valentine M, Shao Y, Newman S, et al. Exploration of coding and non-coding variants in cancer using GenomePaint. Cancer Cell 2021;39:83–95.e4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85. Ottema S, Mulet-Lazaro R, Erpelinck-Verschueren C, van Herk S, Havermans M, Arricibita Varea A, et al. The leukemic oncogene EVI1 hijacks a MYC super-enhancer by CTCF-facilitated loops. Nat Commun 2021;12:5679. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86. Northcott PA, Lee C, Zichner T, Stütz AM, Erkek S, Kawauchi D, et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 2014;511:428–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87. Pai S, Ren J. IdeoViz: plots data (continuous/discrete) along chromosomal ideogram. R package version 1.37.0. 2023 [cited 2023 June 22]. Available from: https://bioconductor.org/packages/IdeoViz. [Google Scholar]
  • 88. Gadd S, Huff V, Walz AL, Ooms A, Armstrong AE, Gerhard DS, et al. A Children's Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor. Nat Genet 2017;49:1487–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89. DeSouza R-M, Jones BRT, Lowis SP, Kurian KM. Pediatric medulloblastoma-update on molecular classification driving targeted therapies. Front Oncol 2014;4:176. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Arabzade A, Zhao Y, Varadharajan S, Chen H-C, Jessa S, Rivas B, et al. ZFTA-RELA dictates oncogenic transcriptional programs to drive aggressive supratentorial ependymoma. Cancer Discov 2021;11:2200–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91. Umeda M, Ma J, Huang BJ, Hagiwara K, Westover T, Abdelhamed S, et al. Integrated genomic analysis identifies UBTF tandem duplications as a recurrent lesion in pediatric acute myeloid leukemia. Blood Cancer Discov 2022;3:194–207. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Jeha S, Choi J, Roberts KG, Pei D, Coustan-Smith E, Inaba H, et al. Clinical significance of novel subtypes of acute lymphoblastic leukemia in the context of minimal residual disease-directed therapy. Blood Cancer Discov 2021;2:326–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93. Lenz G, Davis RE, Ngo VN, Lam L, George TC, Wright GW, et al. Oncogenic CARD11 mutations in human diffuse large B cell lymphoma. Science 2008;319:1676–79. [DOI] [PubMed] [Google Scholar]
  • 94. Gu Z, Churchman M, Roberts K, Li Y, Liu Y, Harvey RC, et al. Genomic analyses identify recurrent MEF2D fusions in acute lymphoblastic leukaemia. Nat Commun 2016;7:13331. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95. Dickerson KM, Qu C, Gao Q, Iacobucci I, Gu Z, Yoshihara H, et al. ZNF384 fusion oncoproteins drive lineage aberrancy in acute leukemia. Blood Cancer Discov 2022;3:240–63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96. Patmore DM, Jassim A, Nathan E, Gilbertson RJ, Tahan D, Hoffmann N, et al. DDX3X suppresses the susceptibility of hindbrain lineages to medulloblastoma. Dev Cell 2020;54:455–70.e5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Chakravarty D, Gao J, Phillips SM, Kundra R, Zhang H, Wang J, et al. OncoKB: a precision oncology knowledge base. JCO Precis Oncol 2017;1:PO.17.00011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98. Suehnholz SP, Nissan MH, Zhang H, Kundra R, Nandakumar S, Lu C, et al. Quantifying the expanding landscape of clinical actionability for patients with cancer. Cancer Discov 2024;14:49–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA Jr., Kinzler KW. Cancer genome landscapes. Science 2013;339:1546–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100. Walz AL, Ooms A, Gadd S, Gerhard DS, Smith MA, Guidry Auvil JM, et al. Recurrent DGCR8, DROSHA, and SIX homeodomain mutations in favorable histology Wilms tumors. Cancer Cell 2015;27:286–97. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. Gu Z, Churchman ML, Roberts KG, Moore I, Zhou X, Nakitandwe J, et al. PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nat Genet 2019;51:296–307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102. Choi SM, Dewar R, Burke PW, Shao L. Partial tandem duplication of KMT2A (MLL) may predict a subset of myelodysplastic syndrome with unique characteristics and poor outcome. Haematologica 2018;103:e131–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103. Patel JP, Gönen M, Figueroa ME, Fernandez H, Sun Z, Racevskis J, et al. Prognostic relevance of integrated genetic profiling in acute myeloid leukemia. N Engl J Med 2012;366:1079–89. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104. Tyner JW, Tognon CE, Bottomly D, Wilmot B, Kurtz SE, Savage SL, et al. Functional genomic landscape of acute myeloid leukaemia. Nature 2018;562:526–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105. Bairoch A. The cellosaurus, a cell-line knowledge resource. J Biomol Tech 2018;29:25–38. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106. Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, et al. Defining a cancer dependency map. Cell 2017;170:564–76.e16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107. Deveson IW, Gong B, Lai K, LoCoco JS, Richmond TA, Schageman J, et al. Evaluating the analytical validity of circulating tumor DNA sequencing assays for precision oncology. Nat Biotechnol 2021;39:1115–28. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108. Loken MR, Alonzo TA, Pardo L, Gerbing RB, Raimondi SC, Hirsch BA, et al. Residual disease detected by multidimensional flow cytometry signifies high relapse risk in patients with de novo acute myeloid leukemia: a report from Children’s Oncology Group. Blood 2012;120:1581–88. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Gabert J, Beillard E, van der Velden VHJ, Bi W, Grimwade D, Pallisgaard N, et al. Standardization and quality control studies of ‘real-time’ quantitative reverse transcriptase polymerase chain reaction of fusion gene transcripts for residual disease detection in leukemia - a Europe against Cancer program. Leukemia 2003;17:2318–57. [DOI] [PubMed] [Google Scholar]
  • 110. Mencia-Trinchant N, Hu Y, Alas MA, Ali F, Wouters BJ, Lee S, et al. Minimal residual disease monitoring of acute myeloid leukemia by massively multiplex digital PCR in patients with NPM1 mutations. J Mol Diagn 2017;19:537–48. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary Tables 1

Supplementary Tables S1-S18

Supplementary Figures 1

Supplementary Figures S1-S16

Data Availability Statement

The cell line data generated for this study have been deposited in the European Nucleotide Archive at EMBL-EBI under accession number PRJEB64356 (https://www.ebi.ac.uk/ena/browser/view/PRJEB64356). The accession numbers of the samples are listed in Supplementary Tables S5 and S6. Other data generated in this study are available from the corresponding authors upon request.


Articles from Clinical Cancer Research are provided here courtesy of American Association for Cancer Research