Author manuscript; available in PMC: 2023 Apr 1.
Published in final edited form as: Leuk Res. 2022 Mar 9;115:106822. doi: 10.1016/j.leukres.2022.106822

Accurate Detection of Subclonal Variants in Paired Diagnosis-Relapse Acute Myeloid Leukemia Samples by Next Generation Duplex Sequencing

Ashwini S Kamath-Loeb a,d, Jiang-Cheng Shen a, Michael W Schmitt b, Brendan F Kohrn a, Keith R Loeb a,c, Elihu H Estey b,c, Jin Dai b,1, Sylvia Chien b, Lawrence A Loeb a, Pamela S Becker b,c,d,2
PMCID: PMC9014797  NIHMSID: NIHMS1791486  PMID: 35303493

Abstract

Mutations characterize diverse human cancers; there is a positive correlation between elevated mutation frequency and tumor progression. One exception is acute myeloid leukemia (AML), which has few clonal single nucleotide mutations. Using highly sensitive and accurate Duplex Sequencing (DS), we now show that AML additionally harbors an extensive repertoire of variants with low allele frequencies (<1%), below the accurate detection limit of most other sequencing methodologies. The subclonal variants are unique to each individual and change in composition, frequency, and sequence context from diagnosis to relapse. Their functional significance is apparent from the observation that many are known variants and cluster within functionally important protein domains. Subclones provide a reservoir of variants that could expand and contribute to the development of drug resistance and relapse. In accord, we accurately identified subclonal variants in AML driver genes NRAS and RUNX1 at allele frequencies between 0.1–0.3% at diagnosis, which expanded to comprise a major fraction (14–53%) of the blast population at relapse. Early and accurate detection of subclonal variants with low allele frequency thus offers the opportunity for early intervention, prior to detection of clinical relapse, to improve disease outcome and enhance patient survival.

Keywords: Acute myeloid leukemia, Duplex sequencing, subclonal variants, diagnosis, relapse, early detection

Introduction

Acute myeloid leukemia (AML) is a hematologic disorder resulting from the expansion of poorly differentiated myeloid progenitor cells (blasts). Patients with AML have poor prognosis, with 70% succumbing to the disease within 5 years of being diagnosed [1]. The AML genome was sequenced to determine whether a high frequency of clonal mutations, associated with tumor initiation and progression [2], accounts for these statistics. Instead, it was shown that AML has one of the lowest clonal mutation burdens of all sequenced cancer genomes [3]. In these studies, generally only mutations present in >5% of cells are scored. We hypothesized that in addition to the few clonal single nucleotide variants (SNV), there may be many more subclonal SNVs in AML, especially those with allele fractions below the accurate detection limit (<1%) of the implemented methodologies. These rare variants could serve as a reservoir for selection of drug resistance during disease progression and determine patient outcome.

To detect low frequency subclonal single nucleotide DNA substitutions in AML patients, we used the most accurate next generation sequencing (NGS) methodology, Duplex Sequencing (DS), developed in our laboratory [4, 5]. DS involves sequencing both strands of a DNA duplex and scoring mutations only if the nucleotide substitutions are complementary to each other and present at the same position on both strands. Artifactual errors from PCR amplification and DNA sequencing that occur randomly on only one or the other strand of the DNA duplex (false positives) are not scored, enabling highly accurate calling of mutations. The sensitivity of DS is 10⁻⁷, i.e., one mutation can be accurately detected from amongst 10⁷ nucleotides, which is at least three orders of magnitude higher than most sequencing methods [6]. DS can simultaneously capture SNVs with allele frequencies ranging from as low as 0.03% to as high as 100% in the same DNA sample. Unlike other sensitive assays such as droplet digital PCR (ddPCR), DS does not require prior knowledge of the mutation(s) to be detected. Further, variant allele frequencies quantified by DS and ddPCR have been shown to be equivalent [7]. As a result, DS has been used extensively in many studies: for example, to detect subclonal TP53 mutations in uterine lavage of ovarian cancer patients [7], provide a high-resolution subclonal mutational landscape in super-enhancer regions of normal human B-cells [8], reveal subclonal mutation heterogeneity in colon cancer [9], assess ABL1 kinase domain subclonal mutations in Ph+ ALL pretreatment [10] and quantify in vivo mutagenesis and carcinogenesis [11].

In this study, we used Duplex Sequencing to reveal an accurate landscape of clonal and subclonal single nucleotide variants in AML patients and track changes thereof from diagnosis to relapse. We show that, consistent with our hypothesis, there are many more subclonal SNVs in the AML genome than previously reported. Subclonal variants are unique to each individual and change in composition and frequency from diagnosis to relapse; many are known variants that cluster within short DNA segments and are located in functionally important protein domains. Importantly, even within our small cohort of 8 relapsed cases, DS accurately identified, in 3 cases, patient-specific subclonal mutations in known AML genes at allele frequencies of 0.1–0.3% at diagnosis, which expanded to comprise a major fraction (14–53%) of the blast population at relapse. These variants could be monitored over time and potentially targeted for therapy prior to detection of clinical relapse.

Materials and Methods

Patient Samples

Patient samples were collected under protocols approved by the Institutional Review Board of the University of Washington/Fred Hutchinson Cancer Research Center Cancer Consortium. Informed consent was obtained from patients prior to sample collection. There was no selection bias in the cases investigated in this study. Only samples for which diagnosis and relapse pairs were available from the same patient were sequenced. In total, we had 11 paired samples: 8 diagnosis-relapse and 3 diagnosis-refractory/non-responsive pairs (Table 1). Further, there was an ~1:1 ratio of male and female patients. We did not have access to normal tissue or longitudinal samples from these patients, as they were not part of the approved protocols for archived samples at that time.

Table 1:

Clinical Data

| Sample | Age | Gender | Treatment | Response | Duration CR | Survival from diagnosis (days) | NPM1 | FLT3 | Karyotype |
|---|---|---|---|---|---|---|---|---|---|
| AML_026 | 66 | M | 7 + 3*, G-CSF, clofarabine, cytarabine | No | 0 | > 3831§ | Neg | Neg | Complex |
| AML_036 | 56 | M | 7 + 3 | CR | 1091 | 1364 | Pos | Neg | Normal |
| AML_039 | 52 | M | 7 + 3 (idarubicin) | CRi | 106 | 382 | Neg | Neg | Normal |
| AML_072 | 50 | F | Anti-CXCR4, cytarabine, mitoxantrone, etoposide | No | 0 | 182 | Nd | Neg | Complex |
| AML_077 | 22 | M | G-CSF, clofarabine, cytarabine | CR | 120 | 745 | Neg | Neg | Complex |
| AML_112 | 60 | F | Decitabine + tosedostat; idarubicin, cytarabine, pravastatin | CR | 171 | 406 | Pos | Pos | Normal |
| AML_115 | 57 | F | 7 + 3 | CRi | 334 | 414 | Neg | Neg | Normal |
| AML_156 | 58 | M | Cytarabine + liposomal daunorubicin | CRi | 236 | 344 | Nd | Neg | Normal |
| AML_166 | 40 | F | Idarubicin, cytarabine, sorafenib, CNS depocyt | CR | 120 | 221 | Pos | Pos | Normal |
| AML_251 | 66 | F | G-CSF, cladribine, cytarabine, mitoxantrone | No | 0 | 100 | Neg | Neg | Complex |
| AML_259 | 73 | F | G-CSF, cladribine, cytarabine, mitoxantrone | CR | 202 | 329 | Pos | Pos | Normal |

* 7 (cytarabine) + 3 (daunorubicin); G-CSF: granulocyte colony stimulating factor; CRi: complete remission with incomplete blast count recovery; § refractory at time of sampling; NPM1: 4 bp insertion; FLT3: internal tandem duplication; Nd: not done.

Sample Preparation and DNA Extraction

Cryopreserved leukemia cells (blast count >80%) from blood or bone marrow were thawed and washed twice in Iscove modified Dulbecco medium containing 20% each, of horse serum and fetal bovine serum (Corning, Manassas, VA, USA). Genomic DNA was extracted using the MasterPure Complete DNA and RNA Purification Kit (Epicenter/Illumina, Madison, WI). Briefly, cells were lysed with 2X T&C lysis solution containing proteinase K at 65 °C for 15 min. RNA was digested with RNase A at 37 °C for 30 min, and proteins were precipitated by the addition of MPC protein precipitation reagent. Following centrifugation to remove protein precipitates, DNA was precipitated with isopropanol. DNA concentrations were quantified by a Qubit 3.0 Fluorometer (Invitrogen, Carlsbad, CA, USA).

Duplex sequencing

Duplex sequencing of blinded samples was carried out using published protocols [4–6]. Briefly, genomic DNA (250 ng) was sonicated to generate fragments of ~300 bp, end-repaired, and dA-tailed as described. Duplex sequencing adaptors containing unique molecular barcodes (12 random nucleotides) and 3’ dT-overhangs were ligated onto purified DNA fragments. The ligated DNA was PCR amplified, and purified amplicons were hybridized to synthetic 5’-biotinylated oligonucleotide DNA probes (120 bp; purchased from Integrated DNA Technologies) complementary to coding sequences of genes frequently mutated in AML (Fig S1 and Table SII). Two sequential rounds of hybridization were performed to increase the efficiency of target DNA capture [12]. The captured DNA sequences were tagged with distinct index sequences and sequenced on an Illumina HiSeq2500 DNA Sequencer. The positions at which subclonal single nucleotide variants were detected had duplex sequencing depths between 1000 and 4000.

Data processing

DNA sequencing reads were processed using a bioinformatics program (UnifiedConsensusMaker.py) available at https://github.com/Kennedy-Lab-UW/Duplex-Sequencing, followed by alignment to the hg19 reference genome using the BWA-MEM algorithm (http://bio-bwa.sourceforge.net/), read clipping and local realignment using GATK v3.8–1, overlap clipping using FGBio (http://fulcrumgenomics.github.io/fgbio/), and variant calling using samtools mpileup [13,14] and a script essentially identical to the variant calling method described in Kennedy et al [4]. In short, the UnifiedConsensusMaker.py script moved molecular barcodes from the main read into the header, sorted reads by molecular barcode and barcode order, combined reads with the same molecular barcode and barcode order into a consensus sequence corresponding to each strand of the DNA molecule (single stranded consensus sequence; SSCS), and combined SSCSs from the same molecular barcode but opposite barcode order into Duplex Consensus Sequence (DCS).
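The consensus steps described above can be illustrated with a minimal Python sketch. It is not the UnifiedConsensusMaker.py implementation: the function names, the `min_fraction` and `min_family_size` parameters, and their values are assumptions chosen for illustration, and reads are assumed to be equal-length strings already expressed in the same (reference) orientation.

```python
from collections import Counter, defaultdict


def single_strand_consensus(reads, min_fraction=0.7):
    """Column-wise majority vote over reads from one strand family (SSCS).

    A column without a sufficiently dominant base is reported as 'N'.
    min_fraction is an illustrative cutoff, not a value from the paper.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(sscs_ab, sscs_ba):
    """Keep a base only where both strand consensuses agree; otherwise 'N'."""
    return "".join(a if a == b else "N" for a, b in zip(sscs_ab, sscs_ba))


def build_duplex_consensus(tagged_reads, min_family_size=3):
    """tagged_reads: iterable of (barcode, barcode_order, sequence) tuples,
    with barcode_order 'ab' or 'ba' marking the strand of origin."""
    families = defaultdict(list)
    for barcode, order, sequence in tagged_reads:
        families[(barcode, order)].append(sequence)

    # One SSCS per (barcode, strand) family that has enough reads.
    sscs = {
        key: single_strand_consensus(seqs)
        for key, seqs in families.items()
        if len(seqs) >= min_family_size
    }

    # A DCS exists only when both strands of the same molecule were recovered.
    dcs = {}
    for (barcode, order), sequence in sscs.items():
        if order == "ab" and (barcode, "ba") in sscs:
            dcs[barcode] = duplex_consensus(sequence, sscs[(barcode, "ba")])
    return dcs
```

In this sketch, a substitution carried by every read of one strand family but absent from the complementary family (as expected for an early PCR error) yields an `N` at that position rather than a variant call.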

SSCS is analogous to the tag-based or error-corrected sequencing methodology introduced by Kinde et al [15]. While tag-based sequencing improves sensitivity to detect low frequency mutations, artifactual errors still prevail as only one DNA strand is sequenced [6, 7, 12]. DCS removes these errors by only making a consensus sequence of bases that match in complementary SSCS; other bases are marked as “N”. Errors generated during PCR amplification and DNA sequencing, which typically do not result in complementary substitutions, are largely eliminated to increase sensitivity and significantly improve the accuracy of mutation calling. DS is thus at least 3 orders of magnitude more sensitive than routine NGS, enabling accurate detection of 1 mutation in 10⁷ sequenced nucleotides [4–6]. Variants are classified as clonal or subclonal if they are present in greater or less than 1% of cells, respectively. Heterozygous and homozygous clonal variants were excluded from analysis if they were known single nucleotide polymorphisms in the population. DS is unable to detect large insertion-deletion (indel) variants, but can detect small indels, such as the frequent NPM1 c.860_863dupTCTG insertion mutation. Raw data can be accessed at NCBI BioProject PRJNA593020 [16].
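The classification rule can be sketched as follows, assuming the 1% cut-off is applied to the variant allele fraction computed from duplex consensus reads (clonal at a VAF of 1% or above, subclonal below). The function and parameter names are hypothetical and are not taken from the published variant-calling script.

```python
def classify_variant(alt_duplex_reads, duplex_depth, clonal_threshold=0.01):
    """Return (VAF, label) for one position.

    alt_duplex_reads: duplex consensus reads supporting the variant.
    duplex_depth: total duplex consensus reads covering the position.
    clonal_threshold: assumed here to be a VAF of 1%.
    """
    if duplex_depth <= 0:
        raise ValueError("duplex_depth must be positive")
    vaf = alt_duplex_reads / duplex_depth
    label = "clonal" if vaf >= clonal_threshold else "subclonal"
    return vaf, label


# A variant seen in a single duplex read at the depths reported in the
# Methods (1000 to 4000) corresponds to a VAF of 0.1% down to 0.025%.
for depth in (1000, 4000):
    vaf, label = classify_variant(1, depth)
    print(f"depth {depth}: VAF {vaf:.3%} ({label})")
```

The arithmetic in the loop simply shows why duplex depth bounds the lowest allele fraction that can be observed at a given position.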

Results

Clonal Single Nucleotide Variants in Paired AML Samples

Targeted regions in genomic DNA from leukemia cells from the same AML patient at diagnosis and relapse, or after treatment, were sequenced by DS. The DNA oligonucleotide panel (Table S1 and Fig S1) used for capturing these regions was designed based on the TCGA AML mutation database and spans ~20 kb of DNA encoding exonic regions of the top twenty frequently mutated genes. It was expected to capture at least one clonal single nucleotide variant in 90% of AML samples. In accord, we detected clonal SNVs (with variant allele fractions (VAF) of ≥1%; see Methods) in one or more of the captured gene segments in each of the 11 paired samples (Fig 1 and Table S2). Consistent with published reports, only a few clonal nucleotide substitutions were detected both at diagnosis and at relapse (or after treatment); the average number using our limited capture panel was ~2. In agreement with numerous publications [17–20], some clonal mutations were lost (e.g., FLT3) or gained (e.g., RUNX1) at relapse, while others, such as those in DNMT3A, TET2 and JAK2, remained unchanged from diagnosis (Table S2 and Fig S2) and are thus likely not predictive of relapse. Three patients who were refractory or had not responded to treatment at the time of sample collection showed no change in their clonal mutation profiles following treatment (Table S2). All three patients also had complex DNA karyotypes (Table 1). Of these, 2 patients (AML_072 and AML_251) had clonal TP53 mutations, one of which, R175H, was reported to function in a dominant-negative capacity [21]. Additionally, consistent with reports that TP53 mutations are sufficient to define patients with adverse risk by the current ELN classification [22], AML_072 and AML_251 had the shortest survival times of the patients in our study (Table 1 and [23]).

Fig 1: Subclonal variant heterogeneity in AML.

Variant allele frequencies (VAF) of clonal and subclonal single base substitutions detected by Duplex Sequencing in 11 paired diagnosis (D)–relapse/post-treatment (R) AML samples. Each sample had an average of ~2 clonal mutations with allele frequencies ranging from 1%–97%. In contrast, each showed large numbers of subclonal single nucleotide variants (5–25–fold increase), most with allele frequencies well below 1%. The dashed line corresponds to an allele frequency of 5%, the threshold at which variations are accurately scored by most NGS methodologies. VAF is plotted on a log scale.

Extensive Heterogeneity of Subclonal Variants Detected by DS

In addition to a few clonal variants, we detected large numbers of variants with allele frequencies below 1% (which we define as subclonal) both at diagnosis and at relapse; the increase was 5 to 25-fold above that of clonal variants (Fig 1). The allele fractions of most subclonal SNVs were well below 1%, some even as low as 0.03%. Notably, unlike clonal variants, the frequency of subclonal SNVs (number of unique variants in each sample/total number of nucleotides sequenced, which is read depth at each nucleotide times the number of sequenced bases) was higher at relapse compared to diagnosis, especially in AML_026, _036, _112, _251 and _259 (Table 2), with AML_259 exhibiting a 4-fold increase. Likewise, the number of subclonal variants in select genes (ASXL1, CEBPA, DNMT3A, and TP53) was ~2-fold higher at relapse/after treatment compared to diagnosis (Fig 2). However, when subclonal variants from all samples at diagnosis were combined and compared to the combined number at relapse, the difference between the two groups (diagnosis and relapse-refractory) was not statistically significant (p=0.17, one-tail t-test; Fig. S3). Larger cohort sizes would more definitively inform whether the burden of subclonal variants across all sequenced targets or that in specific genes is consistently higher at relapse, whether it correlates with duration of remission or disease progression, and/or whether it can serve as a biomarker of relapse.

Table 2:

Frequencies of Single Nucleotide Variants at Diagnosis and Relapse

Sample Diagnosis Relapse R/D
AML_026 3.2E-07 6.3E-07 2.0
AML_036 1.3E-06 3.1E-06 2.4
AML_039 7.8E-07 9.7E-07 1.2
AML_072 5.6E-07 7.8E-07 1.4
AML_077 1.9E-06 9.6E-07 0.5
AML_112 2.8E-07 5.1E-07 1.8
AML_115 1.3E-06 1.9E-06 1.5
AML_156 1.0E-06 2.7E-07 0.3
AML_166 1.9E-06 2.7E-06 1.4
AML_251 4.7E-07 8.9E-07 1.9
AML_259 2.3E-07 9.8E-07 4.3
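The R/D column can be recomputed from the diagnosis and relapse frequencies in Table 2. The Python sketch below does this and also encodes the frequency definition used in the text (unique variants divided by depth times sequenced bases); the function name and the numbers in its docstring example are hypothetical and are not patient data.

```python
# Subclonal SNV frequencies copied from Table 2: (diagnosis, relapse).
TABLE_2 = {
    "AML_026": (3.2e-07, 6.3e-07),
    "AML_036": (1.3e-06, 3.1e-06),
    "AML_039": (7.8e-07, 9.7e-07),
    "AML_072": (5.6e-07, 7.8e-07),
    "AML_077": (1.9e-06, 9.6e-07),
    "AML_112": (2.8e-07, 5.1e-07),
    "AML_115": (1.3e-06, 1.9e-06),
    "AML_156": (1.0e-06, 2.7e-07),
    "AML_166": (1.9e-06, 2.7e-06),
    "AML_251": (4.7e-07, 8.9e-07),
    "AML_259": (2.3e-07, 9.8e-07),
}


def snv_frequency(n_unique_variants, mean_duplex_depth, n_target_bases):
    """Unique variants per sequenced nucleotide.

    Hypothetical example: 40 unique variants at a mean duplex depth of
    2000 over 20,000 target bases gives 40 / (2000 * 20000) = 1e-06.
    """
    return n_unique_variants / (mean_duplex_depth * n_target_bases)


# Relapse/diagnosis ratio, rounded to one decimal as in the table.
ratios = {
    sample: round(relapse / diagnosis, 1)
    for sample, (diagnosis, relapse) in TABLE_2.items()
}
higher_at_relapse = sorted(s for s, (d, r) in TABLE_2.items() if r > d)

for sample, ratio in ratios.items():
    print(f"{sample}\t{ratio}")
print(f"{len(higher_at_relapse)} of {len(TABLE_2)} samples are higher at relapse")
```

Run on the tabulated values, the recomputed ratios agree with the R/D column, and the frequency is higher at relapse in 9 of the 11 pairs (all except AML_077 and AML_156).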

Fig 2: Subclonal variants are elevated at relapse compared to diagnosis.

Subclonal variants in ASXL1, CEBPA, DNMT3A and TP53 were ~2-fold higher at relapse compared to that at diagnosis. Numbers above each bar indicate the number of unique subclonal single base substitutions in indicated genes.

Characteristics of Subclonal Variants

The subclonal variants differed both in number and distribution amongst patients (Fig 1), with each patient showing a unique profile (Fig 3A). Even within a patient, there were dynamic changes in the composition of subclonal variants, with different variants being present at diagnosis compared to those at relapse. For example, several RUNX1 subclones in AML_077 and TP53 subclones in AML_156 were present at diagnosis but were dramatically reduced in number and absent, respectively, in the corresponding samples at relapse (Fig 3A), perhaps due to response to therapy. On the other hand, subclonal variants in RUNX1 in AML_036, in TP53 in AML_036 and AML_077, and in DNMT3A in AML_112 were detected only at relapse (Fig 3A); these presumably arose concomitant with treatment.

Fig 3:

A) Profile of clonal and subclonal variants. The number of unique clonal and subclonal variants in each of the captured targets in each paired patient sample is presented. The first box under each gene in the column shows the number of clonal mutations while the second box shows the number of subclonal variants in that gene. Clonal mutations, 1 or 2 in a given gene, are denoted by peach or red boxes, respectively, while subclonal variants are shown in shades of purple; the color intensity correlates with the number of variants (1 to >30 as indicated in the color key). Grey boxes reflect absence of variants. B) Signatures of subclonal variants. The frequencies of single nucleotide substitutions (C>A, C>G, C>T, T>A, T>C, and T>G) within the 16 possible trinucleotide sequence contexts (shown below each panel) were determined for subclonal variants from 11 diagnosis and 8 relapse samples as indicated. The observed frequency of each substitution type is presented. Triplets with differences in nucleotide substitution frequencies described in the text are highlighted by asterisks. C) Clustered subclonal variants in TP53. Subclonal TP53 variants detected in AML patients at diagnosis or at relapse/after treatment map within the functionally important DNA-binding domain of the protein. Many correspond to clonal AML TP53 mutations reported in the TCGA database. Amino acid positions in the TP53 protein are shown on the X-axis.

Further, the spectrum of subclonal single base variants was dominated by C>T/G>A and C>A/G>T substitutions both at diagnosis and relapse (Fig S4). Unlike published reports with clonal mutations [24, 25], we did not observe an increase in transversion substitutions (A>C/T>G, C>A/G>T, C>G/G>C) amongst the subclonal variants at relapse. However, we did observe differences in the trinucleotide sequence context in which the substitutions occurred at diagnosis versus at relapse. For example, C>A substitutions within ACG and CCT triplets were elevated at relapse but absent at diagnosis. In contrast, C>G substitutions in the trinucleotide CCA and T>C substitutions within the CTC triplet were present at diagnosis but absent at relapse. In addition, there were differences in the frequencies (higher or lower) of some substitutions between diagnosis and relapse (e.g., T>C substitutions in CTG were higher at diagnosis than at relapse, and vice versa for C>A alterations in GCT triplets). These differences were also reflected in the extracted mutation signatures [26, 27]. Only two mutation signatures, SBS39 and SBS54, were shared between the 2 groups (Fig S5). Although the etiology of these signatures is currently not known, larger cohorts may potentially allow us to stratify patients with different AML subtypes and/or monitor relapse based on subclonal mutation signatures.
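To make the trinucleotide-context analysis concrete, the Python sketch below assigns a substitution to one of the six pyrimidine-referenced classes and its flanking triplet. It illustrates the general convention only; the function names, the 0-based coordinate, and the toy sequences are assumptions, not the authors' code or data.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def substitution_context(reference, pos, alt):
    """Return (substitution, trinucleotide) for a single base change.

    reference: reference sequence (string of A/C/G/T).
    pos: 0-based index of the mutated base; must have a neighbour on each side.
    alt: variant base on the reference strand.
    Purine reference bases are converted to the complementary strand so
    that the mutated base is always reported as C or T.
    """
    if not 0 < pos < len(reference) - 1:
        raise ValueError("position needs a flanking base on both sides")
    triplet = reference[pos - 1:pos + 2]
    ref = triplet[1]
    if ref in "AG":
        triplet = reverse_complement(triplet)
        ref = triplet[1]
        alt = alt.translate(COMPLEMENT)
    return f"{ref}>{alt}", triplet


# Toy example: the same event observed on either strand is tallied once.
calls = [("TACGA", 2, "A"), ("TCGTA", 2, "T")]
counts = Counter(substitution_context(*call) for call in calls)
print(counts)
```

Tallying such (substitution, triplet) pairs separately for diagnosis and relapse samples gives the kind of context profile summarized in Fig 3B.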

Significance of Subclonal Variants

The functional significance of the subclonal variants detected by DS is apparent in the following observations. 1) Approximately 50% of the subclonal variants are previously reported variants (Supporting Data Set; Column M, “Existing Variants”). 2) Subclonal and clonal SNVs were present in the same gene in the same sample, for example, in RUNX1 in AML_036R and in DNMT3A in AML_112R (Fig 3A). Subclones may provide a reservoir of variants for selection and clonal expansion. 3) Amongst the five genes that were sequenced in their entirety (all coding exons), many of the subclonal SNVs clustered within small regions and in functionally important protein domains including DNA binding, transactivation, and methyltransferase (Fig 3C and Fig S6). In TP53, for example, subclones in the DNA-binding domain included previously reported AML clonal mutations: S94*, F113C, S127T, P151R, I195T, S241Y, C242S, L265P, and R283H. Thus, within the pool of identified subclones, there are variants with the potential to be selected, expand, and influence disease outcome.

Subclonal variants influence response to therapy and disease outcome

One of the most striking results we obtained from Duplex Sequencing was the early detection of low allele frequency subclones in AML driver genes. More than 60% of relapsed patients (5 of 8) had gained new SNVs at relapse (Figs 4, S2 and S7, and Table S2). In three patients, DS accurately detected these variants as subclones at diagnosis, at allele fractions as much as 10-fold lower than those reported using most other sequencing methodologies. For example, four pathogenic NRAS clones (G12C, G12D, G13D, and G13R) were detected by DS in patient AML_039 at allele fractions of 16%, 0.2%, 2% and 0.3%, respectively, at diagnosis. The G12C, G12D, and G13D clones decreased to frequencies of 0.1–0.3% at relapse; however, the G13R subclone expanded from 0.3% at diagnosis to comprise 14% of the blast population at relapse, i.e., a 70-fold expansion (Fig 4). Likewise, we detected a clonal pathogenic RUNX1 driver mutation at an allele frequency of 53% in AML_156 at relapse. With DS, we accurately detected this mutation at diagnosis in multiple sequence reads at a frequency of 0.1%, representing a 530-fold expansion at relapse (Fig 4). In a third patient, AML_115, we detected two known population variants, TET2 (I1762V) and WT1 (R369R), at subclonal levels; they expanded 6-fold (1.6% to 10%) and 40-fold (0.3% to 12%), respectively, from diagnosis to relapse (Fig 4). These variants are likely passengers in a relapse-associated driver clone absent from our capture set; they could nevertheless serve as markers of relapse.
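Fold expansion here is simply the ratio of the relapse allele frequency to the diagnosis allele frequency. The short Python sketch below applies that ratio to two of the examples above, using the rounded percentages quoted in the text; the function name is hypothetical.

```python
def fold_expansion(vaf_diagnosis_pct, vaf_relapse_pct):
    """Ratio of relapse to diagnosis allele frequency (both in percent)."""
    if vaf_diagnosis_pct <= 0:
        raise ValueError("variant must be detected at diagnosis")
    return vaf_relapse_pct / vaf_diagnosis_pct


# Rounded percentages quoted in the text: (diagnosis, relapse).
examples = {
    "RUNX1 (AML_156)": (0.1, 53),
    "TET2 I1762V (AML_115)": (1.6, 10),
}
for name, (diagnosis, relapse) in examples.items():
    print(f"{name}: {fold_expansion(diagnosis, relapse):.1f}-fold")
```

Because the quoted percentages are rounded, a ratio recomputed this way need not reproduce every reported fold value exactly; the reported values were presumably derived from unrounded allele fractions.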

Fig 4: Relapse-associated clonal mutations pre-exist at diagnosis.

DNA from 11 paired AML diagnosis–relapse or post-treatment leukemic cell populations was sequenced by DS. Relapse-associated clonal mutations with allele frequencies ranging from 3% to 53% were found in five patient samples. Duplex sequencing accurately detected these mutations at frequencies between 0.1–0.3% in the corresponding samples at diagnosis in three patients, AML_039, AML_115 and AML_156 (red lines). TET2 I1762V, WT1 R369R, NRAS G13R, and RUNX1 D198N showed 6-, 33-, 70-, and 530-fold expansion, respectively, from diagnosis (open circles) to relapse (closed circles). Stable mutations are indicated by yellow lines while mutations lost at relapse are highlighted by blue lines.

DS also revealed the opposite phenomenon in some patients i.e., mutations at diagnosis, which would be considered by traditional NGS to have been eliminated by therapy, were still present as minor clones at relapse. For example, the NRAS G12C clone in AML_039, which was present at a frequency of 16% at diagnosis, was still present at relapse at a frequency of 0.3% (Fig 4). Likewise, the clone expressing the KIT D816Y mutation in AML_077 was present at a frequency of 26% at diagnosis and decreased to a frequency of 0.5% at relapse but was not completely cleared [16]. Thus, DS can also be used to accurately monitor the persistence of subclonal mutant clones during the course of the disease. Longitudinal sampling would reveal whether these subclones were cleared by therapy and redeveloped later, or whether they persisted at low levels during treatment. The fact that relapse occurred even when the allele fraction of dominant clones had decreased shows that monitoring the decline of clones, which are dominant at diagnosis, may be a marker for remission but is not sufficient for detecting relapse.

Discussion

We present proof-of-principle studies with a small cohort of AML patients to demonstrate the power of Duplex Sequencing to accurately and sensitively detect subclonal single nucleotide substitutions at allele fractions well below 1%. Targeted capture of exons, along with the most accurate sequencing methodology available, allowed us to reveal an accurate landscape of single nucleotide variants in frequently mutated genes in AML patients. In contrast to a few high allele frequency clonal mutations, we observed a much larger number of variants with very low allele frequencies in every patient. We show that the profile of subclonal variants is unique to each individual and dynamic, changing in composition, frequency and sequence context from diagnosis to relapse. Characteristics of subclonal variants including many being previously identified clonal mutations, clustering and presence in functional domains, as well as their expansion from diagnosis to relapse implicate active involvement of many in the disease process. While we captured only a 20 kb region of the genome encoding the top twenty mutated AML genes, in principle, one can expand the study to include additional capture sets encoding other genes mutated in AML. Duplex Sequencing could thus reveal the expansion and persistence of many other rare subclones besides those presented in this report that could also have biological implications.

Non-sequencing-based tests such as droplet digital PCR (ddPCR) are used commonly in clinical laboratories [28]. Although ddPCR is a sensitive assay that doesn’t require bioinformatics for data analysis, one of its main limitations is that prior knowledge of the variant in question is required to design variant-specific probes for amplification. Further, it is not a high-throughput assay if many variants need to be detected. Next-generation whole genome, whole exome, and even targeted DNA sequencing, which are currently in use, offer many advantages, foremost being that one doesn’t need to know the identity of the variant(s). Both known and previously unreported variants encoded in the DNA sequence can be detected simultaneously. In addition, NGS is high throughput since, by virtue of using unique identifying tags, multiple samples can be sequenced in parallel. Next-generation DNA sequencing is being used to follow the dynamics of mutant clones during disease progression [18, 24, 29, 30]; however, these studies have generally focused only on clonal mutations, present in >5% of cells, because of the inability to detect and/or correctly score subclonal mutations above the high background of artifactual (sequencing and PCR) errors. Our study is distinct since it is the first use of Duplex Sequencing (which largely eliminates technical errors by generating a consensus sequence based on the sequence of both DNA strands) to provide a more sensitive and accurate landscape of both clonal and subclonal single nucleotide substitutions in AML. Accurate identification of subclones (with allele frequencies below 1%) offers many advantages, including early detection of disease as well as of residual disease following treatment. Subclones in driver genes, in particular, warrant close scrutiny since they can undergo clonal expansion, as we have shown, and likely contribute to relapse.
Detection of rare subclones at diagnosis, or persistence of subclones during disease, in addition offers the opportunity for early intervention and/or personalized targeted therapy to improve patient outcome. Of note, therapies targeting mutations in a number of frequently mutated AML genes, including RUNX1 [31] and NRAS [32, 33], identified at subclonal frequencies in this study, are being investigated in clinical trials. In addition, our initial analyses of trinucleotide sequence context and mutation signatures of subclonal variants show that relapse is distinct from diagnosis, raising the possibility of monitoring changes in mutation signatures, in addition to the expansion of specific subclones, as early markers/predictors of relapse. Ultimately, the strategy of targeting subclones prior to overt clinical relapse would need to be addressed in future clinical studies with longitudinal sampling, including sampling of patients in remission.

We are cognizant that the studies in this report were carried out with a small group of patients due to limited availability of paired samples. Nonetheless, even in these initial studies, DS accurately detected subclones in driver genes with low allele frequencies at diagnosis that could have been monitored longitudinally and targeted prior to the detection of clinical relapse. Our studies have, in addition, revealed many features of subclonal variants with potential to lend statistical significance and stratify patients in a larger study. Some interesting questions include the following: 1. Does the frequency of subclonal substitutions and/or subclonal mutation heterogeneity correlate with duration of remission, onset of relapse, and/or survival? 2. Do the number, types, and composition of subclonal variants correlate with clonal mutations in specific genes? 3. Can signatures of subclonal variants serve as biomarkers of relapse and disease outcome? 4. Can inclusion of subclonal variants in our machine learning approach [34] better predict drug sensitivity in AML patients? Clearly, an expanded DS study with a larger patient cohort, deeper sequencing, additional target genes, and mechanistic studies should enable a more in-depth analysis of these questions.

Conclusions:

AML offers an amenable model to study the evolution of tumors since pure populations of malignant cells can be easily obtained from peripheral blood samples. We now provide a more sensitive and accurate methodology, Duplex Sequencing, to quantitatively detect the landscape of very low allele-frequency subclonal variants in a small cohort of AML patients. Subclones are significant as they can provide an extensive and dynamic reservoir of variants for clonal evolution, drug resistance, and relapse. Since many dominant mutations (high VAF) detected at diagnosis persist during remission or are replaced by new dominant mutations at relapse, current clinical studies focusing only on the dominant diagnostic mutations have limitations. Detecting subclonal variants accurately and early, as well as monitoring changes in their landscape during the course of disease is recommended as it offers the potential for early intervention (prior to manifestation of overt clinical relapse) to improve patient outcome. Data presented in this manuscript speak to the importance of subclonal variants in shaping the trajectory of AML. However, the clinical utility of subclonal variants in predicting relapse and/or survival needs to be validated in future studies with a larger patient cohort and with longitudinal samples.


Highlights.

  • Carried out Duplex sequencing of paired diagnosis-relapse AML samples

  • Observed extensive heterogeneity of subclonal variants with allele frequencies <1%

  • Subclonal variants include tumor mutations and cluster in functional domains

  • Subclonal variant frequency was higher, and mutation signature differed, at relapse

  • Accurately detected low VAF subclones at diagnosis, which became clonal at relapse

Acknowledgements:

We thank Theo Knijnenburg and Guangrong Qin from the Shmulevich laboratory (Institute for Systems Biology) for help in accessing TCGA AML DNA sequencing data, and Tom Walsh and Ming Lee for DNA sequencing support. We are grateful to the UW Hematology Clinical Research Team, including manager Cody Hammer, research coordinator Niall Curley, and their colleagues, for their contributions to the enrollment of study subjects and the procurement of their samples. We acknowledge the acquisition of some cell samples from the Leukemia Repository at the Fred Hutchinson Cancer Research Center (Director, Dr. Derek Stirewalt; Manager, Era Pogosova). We sadly report the untimely demise of Dr. Elihu H. Estey during the preparation of our manuscript.

Funding:

Research reported in this publication was supported by the National Institutes of Health under award numbers P01 CA077852 (LAL and PSB), R01 CA193649 (LAL), P30 DK56465 (KRL), and in part through the Cancer Center Support Grant P30 CA015704 to the Fred Hutchinson Cancer Research Center/University of Washington Cancer Consortium. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. KRL was also supported by a pilot grant from the Seattle Translational Tumor Research program (FHCRC).

Declaration of interest:

Lawrence A. Loeb: shareholder of TwinStrand Biosciences.

Michael W. Schmitt: employee and shareholder of SeaGen, Inc.; co-founder, consultant, and shareholder of TwinStrand Biosciences; royalty sharing agreement with University of Washington for licensed patents regarding Duplex Sequencing technology.

Pamela S. Becker: PI on institutional research grants from Abbvie, BMS, Cardiff Oncology, Glycomimetics, JW Pharmaceutical, Pfizer, Novartis, SecuraBio and Tolero; consultant for Accordant Health Services (Caremark).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Cancer Stat Facts: Leukemia – Acute Myeloid Leukemia (AML). Surveillance, Epidemiology, and End Results Program 18, 2011–2017, https://seer.cancer.gov/statfacts/html/amyl.html.
  • 2. Loeb LA. Human cancers express mutator phenotypes: origin, consequences and targeting. Nat Rev Cancer 2011, 11: 450–457.
  • 3. The Cancer Genome Atlas Research Network. Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia. New England Journal of Medicine 2013, 368: 2059–2074.
  • 4. Schmitt MW, Kennedy SR, Salk JJ, Fox EJ, Hiatt JB, Loeb LA. Detection of ultra-rare mutations by next-generation sequencing. Proceedings of the National Academy of Sciences of the United States of America 2012, 109: 14508–14513.
  • 5. Kennedy SR, Schmitt MW, Fox EJ, Kohrn BF, Salk JJ, Ahn EH, Prindle MJ, Kuong KJ, Shen JC, Risques RA, Loeb LA. Detecting ultralow-frequency mutations by Duplex Sequencing. Nature Protocols 2014, 9: 2586–2606.
  • 6. Salk JJ, Schmitt MW, Loeb LA. Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations. Nature Reviews Genetics 2018, 19: 269–285.
  • 7. Salk JJ, Loubet-Senear K, Maritschnegg E, Valentine CC, Williams LN, Higgins JE, Horvat R, Vanderstichele A, Nachmanson D, Baker KT, Emond MJ, Loter E, Tretiakova M, Soussi T, Loeb LA, Zeillinger R, Speiser P, Risques RA. Ultra-sensitive TP53 sequencing for cancer detection reveals progressive clonal selection in normal tissue over a century of human lifespan. Cell Reports 2019, 28: 132–144.
  • 8. Shen JC, Kamath-Loeb AS, Kohrn BF, Loeb KR, Preston BD, Loeb LA. A high-resolution landscape of mutations in the BCL6 super-enhancer in normal human B cells. Proceedings of the National Academy of Sciences of the United States of America 2019, 116: 24779–24785.
  • 9. Loeb LA, Kohrn BF, Loubet-Senear KJ, Dunn YJ, Ahn EH, O’Sullivan JN, Salk JJ, Bronner MP, Beckman RA. Extensive subclonal mutational diversity in human colorectal cancer and its significance. Proceedings of the National Academy of Sciences of the United States of America 2019, 116: 26863–26872.
  • 10. Short NJ, Kantarjian H, Kanagal-Shamanna R, Sasaki K, Ravandi F, Cortes J, Konopleva M, Issa GC, Kornblau SM, Garcia-Manero G, Garris R, Higgins J, Pratt G, Williams LN, Valentine CC, Rivera VM, Pritchard J, Salk JJ, Radich J, Jabbour E. Ultra-accurate Duplex Sequencing for the assessment of pretreatment ABL1 kinase domain mutations in Ph+ ALL. Blood Cancer Journal 2020, 10: 61.
  • 11. Valentine CC, Young RR, Fielden MR, Kulkarni R, Williams LN, Li T, Minocherhomji S, Salk JJ. Proceedings of the National Academy of Sciences of the United States of America 2020, 117: 33414–33425.
  • 12. Schmitt MW, Fox EJ, Prindle MP, Reid-Bayliss KS, True LD, Radich JP, Loeb LA. Sequencing small genomic targets with high efficiency and extreme accuracy. Nature Methods 2015, 12: 423–425.
  • 13. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, and 1000 Genome Project Data Processing Subgroup. The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 2009, 25: 2078–2079.
  • 14. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 2011, 27: 2987–2993.
  • 15. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing. Proceedings of the National Academy of Sciences of the United States of America 2011, 108: 9530–9535.
  • 16. Kamath-Loeb AS, Loeb LA. Subclonal Mutations in AML. BioProject. http://www.ncbi.nlm.nih.gov/bioproject/593020; Deposited 02 December 2019.
  • 17. Getta BM, Devlin SM, Levine RL, Arcila ME, Mohanty AS, Zehir A, Tallman MS, Giralt SA, Roshal M. Multicolor flow cytometry and multi-gene next-generation sequencing are complementary and highly predictive for relapse in acute myeloid leukemia after allogeneic transplantation. Biology of Blood and Marrow Transplantation 2017, 23: 1064–1071.
  • 18. Gaksch L, Kashofer K, Heitzer E, Quehenberger F, Daga S, Hofer S, Halbwedl I, Graf R, Krisper N, Hoefler G, Zebisch A, Sill H, Wölfler A. Residual disease detection using targeted parallel sequencing predicts relapse in cytogenetically acute myeloid leukemia. American Journal of Hematology 2018, 93: 23–30.
  • 19. Greif PA, Hartmann L, Vosberg S, Stief S, Mattes R, Hellmann I, Metzeler KH, Herold T, Bamopoulos SA, Kerbs P, Jurinovic V, Schumacher D, Pastore F, Bräundl K, Zellmeier E, Ksienzyk B, Konstandin NP, Schneider S, Graf A, Krebs S, Blum H, Neumann M, Baldus CD, Bohlander SK, Wolf S, Görlich D, Berdel WE, Wörmann B, Hiddemann W, Spiekermann K. Evolution of cytogenetically normal acute myeloid leukemia during therapy and relapse: An exome sequencing study of 50 patients. Clinical Cancer Research 2018, 24: 1716–1726.
  • 20. Jongen-Lavrencic M, Grob T, Hanekamp D, Kavelaars FG, al Hinai A, Zeilemaker A, Erpelinck-Verschueren CAJ, Gradowska PL, Meijer R, Cloos J, Biemond BJ, Graux C, van Marwijk Kooy M, Manz MG, Pabst T, Passweg JR, Havelange V, Ossenkoppele GJ, Sanders MA, Schuurhuis GJ, Löwenberg B, Valk PJM. Molecular minimal residual disease in acute myeloid leukemia. New England Journal of Medicine 2018, 378: 1189–1199.
  • 21. Boettcher S, Miller PG, Sharma R, McConkey M, Leventhal M, Krivtsov AV, Giacomelli AO, Wong W, Kim J, Chao S, Kurppa KJ, Yang X, Milenkowic K, Piccioni F, Root DE, Rücker FG, Flamand Y, Neuberg D, Lindsley RC, Jänne PA, Hahn WC, Jacks T, Döhner H, Armstrong SA, Ebert BL. A dominant-negative effect drives selection of TP53 missense mutations in myeloid malignancies. Science 2019, 365: 599–604.
  • 22. Döhner H, Estey E, Grimwade D, Amadori S, Appelbaum FR, Büchner T, Dombret H, Ebert BL, Fenaux P, Larson RA, Levine RL, Lo-Coco F, Naoe T, Niederwieser D, Ossenkoppele GJ, Sanz M, Sierra J, Tallman MS, Tien HF, Wei AH, Löwenberg B, Bloomfield CD. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 2017, 129: 424–447.
  • 23. Rücker FG, Schlenk RF, Bullinger L, Kayser S, Teleanu V, Kett H, Habdank M, Kugler CM, Holzmann K, Gaidzik VI, Paschka P, Held G, von Lilienfeld-Toal M, Lübbert M, Fröhling S, Zenz T, Krauter J, Schlegelberger B, Ganser A, Lichter P, Döhner K, Döhner H. TP53 alterations in acute myeloid leukemia with complex karyotype correlate with specific copy number alterations, monosomal karyotype, and dismal outcome. Blood 2012, 119: 2114–2121.
  • 24. Ding L, Ley TJ, Larson DE, Miller CA, Koboldt C, Welch JS, Ritchey JK, Young MA, Lamprecht T, McLellan M, McMichael JF, Wallis JW, Lu C, Shen D, Harris CC, Dooling DJ, Fulton RS, Fulton LL, Chen K, Schmidt H, Kalicki-Veizer J, Magrini VJ, Cook L, McGrath SD, Vickery TL, Wendl MC, Heath S, Watson MA, Link C, Tomasson MH, Shannon WD, Payton JE, Kulkarni S, Westervelt P, Walter MJ, Graubert TA, Mardis ER, Wilson RK, DiPersio JF. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 2012, 481: 506–509.
  • 25. Garg M, Nagata Y, Kanojia D, Mayakonda A, Yoshida Y, Keloth H, et al. Profiling of somatic mutations in acute myeloid leukemia with FLT3-ITD at diagnosis and relapse. Blood 2015, 126: 2491–2501.
  • 26. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, et al. Signatures of mutational processes in human cancer. Nature 2013, 500: 415–421.
  • 27. Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, et al. The repertoire of mutational signatures in human cancer. Nature 2020, 578: 94–101.
  • 28. Coccaro N, Tota G, Anelli L, Zagaria A, Specchia G, Albano F. Digital PCR: A reliable tool for analyzing and monitoring hematologic malignancies. International Journal of Molecular Sciences 2020, 21: 3141.
  • 29. Cocciardi S, Dolnik A, Kapp-Schwoerer S, Rücker FG, Lux S, Blätte TJ, Skambraks S, Krönke J, Heidel FH, Schnöder TM, Corbacioglu A, Gaidzik VI, Paschka P, Teleanu V, Göhring G, Thol F, Heuser M, Ganser A, Weber D, Sträng E, Kestler HA, Döhner H, Bullinger L, Döhner K. Clonal evolution patterns in acute myeloid leukemia with NPM1 mutation. Nature Communications 2019, 10: 2031–2042.
  • 30. Christen F, Hoyer K, Yoshida K, Hou H-A, Waldhueter N, Heuser M, Hills RK, Chan W, Hablesreiter R, Blau O, Ochi Y, Klement P, Chou W-C, Blau I-W, Tang J-L, Zemojtel T, Shiraishi Y, Shiozawa Y, Thol F, Ganser A, Lowenberg B, Linch DC, Bullinger L, Valk PJM, Tien H-F, Gale E, Ogawa S, Damm F. Genomic landscape and clonal evolution of acute myeloid leukemia with t(8;21): an international study on 331 patients. Blood 2019, 133: 1140–1151.
  • 31. Mill CP, Fiskus W, DiNardo CD, Qian Y, Raina K, Rajapakshe K, Perera D, Coarfa C, Kadia TM, Khoury JD, Saenz DT, Saenz DN, Illendula A, Takahashi K, Kornblau SM, Green MR, Futreal AP, Bushweller JH, Crews CM, Bhalla KN. RUNX1-targeted therapy for AML expressing somatic or germline mutation in RUNX1. Blood 2019, 134: 59–73.
  • 32. Borthakur G, Popplewell L, Boyiadzis M, Foran J, Platzbecker U, Vey N, Walter RB, Olin R, Raza A, Giagounidis A, Al-Kali A, Jabbour E, Kadia T, Garcia-Manero G, Bauman JW, Wu Y, Liu Y, Schramek D, Cox DS, Wissel P, Kantarjian H. Activity of the oral mitogen-activated protein kinase inhibitor trametinib in RAS-mutant relapsed or refractory myeloid malignancies. Cancer 2016, 122: 1871–1879.
  • 33. Ragon BK, Odenike O, Baer MR, Stock W, Borthakur, Patel K, Han L, Chen H, Ma H, Joseph L, Zhao Y, Baggerly K, Konopleva M, Jain N. Oral MEK 1/2 inhibitor trametinib in combination with AKT inhibitor GSK141795 in patients with acute myeloid leukemia with RAS mutations: A phase II study. Clinical Lymphoma, Myeloma and Leukemia 2019, 19: 431–440.
  • 34. Lee S-I, Celik S, Logsdon BA, Lundberg SM, Martins TJ, Oehler VG, Estey EH, Miller CP, Chien S, Dai J, Saxena A, Blau CA, Becker PS. A machine learning approach to integrate big data for precision medicine in acute myeloid leukemia. Nature Communications 2018, 9: 42–53.
