Transcriptome analysis of pediatric acute myeloid leukemia (AML). Schematic diagram of our experimental design and the possible outcome trajectories of pediatric patients with AML. The data set consists of primary samples obtained at the time of diagnosis (blue), relapse samples (orange), and induction failure (IF)/refractory samples (red). (A) Sequence data (miRNA sequencing [miRNA-seq] and mRNA sequencing [mRNA-seq]) generated in our study. Samples were obtained in two batches: the discovery cohort consisted primarily of diagnostic samples from the AAML0531 trial (n = 528) but also included samples from the AAML03P1 (n = 71) and CCG-2961 (n = 38) trials; the validation cohort consisted of patients from the more recent AAML1031 trial (n = 666). (B) Analyses performed for each sample type and sequence data type. The bulk of the analyses were performed on primary (diagnostic) samples from the discovery cohort. (C) Study design for the training and validation of AMLmiR36. The discovery cohort (gray box) was randomly divided into a training cohort (two thirds; n = 425) and a test cohort (one third; n = 212). AMLmiR36 (filled blue box) was trained on data from the training cohort (blue box) and validated on independent data from the test cohort and the AAML1031 validation cohort (gold boxes). EFS, event-free survival; NMF, non-negative matrix factorization.
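
The random division described in panel C can be illustrated with a minimal sketch. The caption gives only the proportions and resulting cohort sizes; the shuffling procedure, the seed, the absence of stratification, and the sample identifiers below are all assumptions made for illustration and are not taken from the study.

```python
import random


def split_cohort(sample_ids, train_fraction=2 / 3, seed=0):
    """Randomly divide sample IDs into training and test cohorts.

    Assumption: a simple unstratified shuffle with a fixed seed.
    The study's actual randomization procedure is not described here.
    """
    ids = list(sample_ids)
    rng = random.Random(seed)  # local generator so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical placeholder IDs for the 637 discovery-cohort samples
# (528 + 71 + 38 from the three trials listed in panel A).
discovery = [f"S{i:04d}" for i in range(1, 638)]

train, test = split_cohort(discovery)
print(len(train), len(test))  # 425 212
```

With 637 samples, rounding two thirds of the cohort gives 425 training and 212 test samples, which matches the sizes reported in the caption.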