Abstract
Tumor protein p53 (TP53) is the most frequently mutated gene in cancer1,2. In patients with myelodysplastic syndromes (MDS), TP53 mutations are associated with high-risk disease3,4, rapid transformation to acute myeloid leukemia (AML)5, resistance to conventional therapies6–8 and dismal outcomes9. Consistent with the tumor-suppressive role of TP53, patients harbor both mono- and biallelic mutations10. However, the biological and clinical implications of TP53 allelic state have not been fully investigated in MDS or any other cancer type. We analyzed 3,324 patients with MDS for TP53 mutations and allelic imbalances and delineated two subsets of patients with distinct phenotypes and outcomes. One-third of TP53-mutated patients had monoallelic mutations whereas two-thirds had multiple hits (multi-hit) consistent with biallelic targeting. Established associations with complex karyotype, few co-occurring mutations, high-risk presentation and poor outcomes were specific to multi-hit patients only. TP53 multi-hit state predicted risk of death and leukemic transformation independently of the Revised International Prognostic Scoring System (IPSS-R)11. Surprisingly, monoallelic patients did not differ from TP53 wild-type patients in outcomes and response to therapy. This study shows that consideration of TP53 allelic state is critical for diagnostic and prognostic precision in MDS as well as in future correlative studies of treatment response.
In collaboration with the International Working Group for Prognosis in MDS (Supplementary Table 1), we assembled a cohort of 3,324 peridiagnostic and treatment-naive patients with MDS or closely related myeloid neoplasms (Extended Data Fig. 1 and Supplementary Fig. 1). Genetic profiling included conventional G-banding analyses (CBA) and tumor-only, capture-based, next-generation sequencing (NGS) of a panel of genes recurrently mutated in MDS, as well as genome-wide copy number probes. Allele-specific copy number profiles were generated from NGS data using the CNACS algorithm7 (see Methods and Code availability). An additional 1,120 samples derived from the Japanese MDS consortium (Extended Data Fig. 2) were used as a validation cohort.
To study the effect of TP53 allelic state on genome stability, clinical presentation, outcome and response to therapy, we performed a detailed characterization of alterations at the TP53 locus. First, we assessed genome-wide allelic imbalances in the cohort of 3,324 patients, to include arm-level or focal (∼3 Mb) ploidy alterations and regions of copy-neutral loss of heterozygosity (cnLOH) (Extended Data Fig. 3, Supplementary Figs. 2–4 and Methods). Collectively, 360 (11%) patients had at least one cnLOH region and 1,571 (47%) had at least one chromosomal aberration. Among these, 329 karyotypes were complex12 and 177 were monosomal13 (Supplementary Table 2).
Mutation analysis identified 486 putative oncogenic mutations in TP53 at variant allele frequency (VAF) ≥ 2% across 378 individuals (Supplementary Figs. 5–7 and Methods). Among TP53-mutated patients, 274 (72.5%) had a single TP53 mutation, 100 had two (26.5%) and four (1%) had three. Allelic imbalances overlapping the TP53 locus were found in 177 cases. Of these, 98 were focal deletions or regions of cnLOH detected by NGS only (Supplementary Table 3). Approximately half (54%, n = 149) of patients with one TP53 mutation had loss of the wild-type allele by deletion or cnLOH. In contrast, only 13% (n = 14) of patients with more than one TP53 mutation had a concomitant allelic imbalance at the TP53 locus (odds ratio (OR) = 7.6, 95% confidence interval (CI): 4.1–15.2) (Fig. 1a). By consideration of mutations and allelic imbalances, we defined four TP53-mutant subgroups (Fig. 1b): (1) monoallelic mutation (n = 125, 33% of TP53-mutated patients); (2) multiple mutations without deletion or cnLOH affecting the TP53 locus (n = 90, 24%); (3) mutation(s) and concomitant deletion (n = 85, 22%); and (4) mutation(s) and concomitant cnLOH (n = 78, 21%). Additionally, in 24 patients the TP53 locus was affected by deletion (n = 12), cnLOH (n = 2) or isochromosome 17q rearrangement (n = 10) without evidence of TP53 mutations (Fig. 1a).
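The odds ratio quoted above can be approximated directly from the counts in the text. The sketch below is illustrative only: the function name is hypothetical, and it uses the simple sample odds ratio with a Wald interval on the log scale. It gives an estimate of about 7.7, close to the reported 7.6; the reported estimate and interval were presumably obtained with a different estimator, so the bounds are not expected to match exactly.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Sample odds ratio (a*d)/(b*c) of a 2x2 table with a Wald interval.

    a, b: group 1 with / without the feature
    c, d: group 2 with / without the feature
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Counts taken from the text: 274 patients with one TP53 mutation, of whom
# 149 had loss of the wild-type allele; 104 patients with more than one
# mutation, of whom 14 had an allelic imbalance at the TP53 locus.
or_, lo, hi = odds_ratio_wald(149, 274 - 149, 14, 104 - 14)
print(f"OR = {or_:.2f}, Wald 95% CI = {lo:.2f}-{hi:.2f}")
```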
Fig. 1 |. Integration of TP53 mutations and allelic imbalances at the TP53 locus identifies TP53 states with evidence of mono- or biallelic targeting.
a, Number of patients (from patients with any hit at the TP53 locus) with 0, 1, 2 or 3 TP53 mutations. Colors represent the status of chromosome 17 at the TP53 locus, to include cnLOH, deletion (del), isochromosome 17q rearrangement (iso17q), gain or no detected aberration (normal). Unbalanced translocations leading to 17p deletion are encoded as ‘del’. b, Frequency of TP53 subgroups within TP53-mutated patients. TP53 subgroups are defined as cases with (1) single gene mutation (1mut); (2) several mutations with normal status of chromosome 17 at the TP53 locus (>1mut); (3) mutation(s) and chromosomal deletion at the TP53 locus (mut + del); and (4) mutation(s) and cnLOH at the TP53 locus (mut + cnLOH). c, Density estimation of VAF of TP53 mutations across TP53 subgroups (from top to bottom, 1mut, >1mut, mut + del, mut + cnLOH). d, Distribution of TP53 mutations along the gene body. Mutations from patients with monoallelic TP53 are depicted at the top and those from patients with multiple TP53 hits at the bottom. Missense mutations are shown as green circles. Truncated mutations corresponding to nonsense or nonstop mutations, frameshift deletions or insertions and splice site variants are shown as pink circles. Other types of mutations to include in-frame deletions or insertions are shown as orange circles. TAD, transactivation domain; OD, oligomerization domain.
In subgroups 2–4, clonality estimates of co-occurring mutations or allelic imbalances supported biallelic targeting of TP53 (Extended Data Fig. 4). In a subset of cases, biallelic targeting was validated by phasing analysis or sequential sampling (Supplementary Fig. 8). Thus, the TP53 mutant subgroups were organized into two states: (1) monoallelic TP53 state representing subgroup 1, with one residual wild-type copy of TP53, and (2) multi-hit TP53 state encompassing subgroups 2–4, with at least two TP53 hits in each patient and probably no residual TP53. While most multi-hit samples were confidently assigned as biallelic, we maintained a conservative ‘multi-hit’ notation.
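The subgroup and state definitions above amount to a small decision rule, sketched here under stated assumptions. The function name and the status labels ('normal', 'del', 'cnLOH') are hypothetical, and only deletion and cnLOH are treated as a second hit; this is not the code used in the study.

```python
def tp53_subgroup(n_mutations, locus_status):
    """Return (subgroup, state) for a patient.

    n_mutations: number of putative oncogenic TP53 mutations.
    locus_status: status of chromosome 17 at the TP53 locus; only 'del'
        and 'cnLOH' count as an allelic imbalance in this sketch.
    """
    if n_mutations < 1:
        # Patients without a TP53 mutation are outside the four subgroups.
        return None, None
    if locus_status == "del":
        return "mut + del", "multi-hit"
    if locus_status == "cnLOH":
        return "mut + cnLOH", "multi-hit"
    if n_mutations > 1:
        return ">1mut", "multi-hit"
    return "1mut", "monoallelic"


# Hypothetical example patients, one per subgroup.
for n_mut, status in [(1, "normal"), (2, "normal"), (1, "del"), (3, "cnLOH")]:
    print(n_mut, status, tp53_subgroup(n_mut, status))
```

With the subgroup sizes reported above, this rule places 125 patients in the monoallelic state and 90 + 85 + 78 = 253 in the multi-hit state.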
Accurate determination of allelic state requires LOH mapping, as can be achieved by NGS-based analysis of sequencing panels7 or more comprehensive sequencing methods (Supplementary Fig. 4). VAF estimates were not sufficient for precise assessment of TP53 allelic state (Fig. 1c). For example, 19 cnLOH-positive patients had TP53 VAF ≤ 50% (median 29%, range 3–49%), showing that one-quarter of cnLOH patients would be misassigned as monoallelic on the basis of VAF.
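A simplified calculation illustrates why VAF alone is ambiguous. The sketch assumes a single mutated clone, a locus that is otherwise diploid and no other copy number changes; the function and configuration names are hypothetical. Under these assumptions, a mutation with cnLOH present in 29% of cells yields the same expected VAF as a heterozygous mutation present in 58% of cells.

```python
def expected_vaf(cell_fraction, config):
    """Expected VAF of a mutation carried by a fraction of cells in the sample.

    cell_fraction: fraction of all sampled cells carrying the mutation
        (this already folds in sample purity).
    config: 'het'   - one mutated copy, one wild-type copy retained
            'cnLOH' - two mutated copies, total copy number unchanged
            'del'   - one mutated copy, wild-type copy deleted
    """
    if not 0.0 <= cell_fraction <= 1.0:
        raise ValueError("cell_fraction must be between 0 and 1")
    c = cell_fraction
    if config == "het":
        return c / 2          # c mutated copies out of 2 per cell
    if config == "cnLOH":
        return c              # 2c mutated copies out of 2 per cell
    if config == "del":
        return c / (2 - c)    # c mutated copies out of 2(1 - c) + c
    raise ValueError(f"unknown config: {config}")


for config, fraction in [("het", 0.58), ("cnLOH", 0.29), ("del", 0.45)]:
    print(config, fraction, round(expected_vaf(fraction, config), 3))
```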
In monoallelic cases, TP53 mutations were enriched for subclonal presentation (median VAF = 13%, median sample purity = 86%) as compared to TP53 mutations from patients with multiple mutations, which were predominantly clonal (median VAF = 32%, median sample purity = 85%) (Fig. 1c). Thus TP53 allelic state—and, by extension, whether a wild-type TP53 allele is retained—points toward different evolutionary trajectories or potential for clonal dominance. Overall, the spectrum of TP53 mutations was shared among the two allelic states (Fig. 1d and Supplementary Fig. 9). Of note, truncating mutations were enriched in the multi-hit state (28 versus 14%, OR = 2.3, 95% CI: 1.3–4.2) while hotspot mutations accounted for 25% of mutations in the monoallelic state and 20% in the multi-hit state.
We next assessed profiles of genome stability and patterns of co-mutations for each TP53 state. The correlation between TP53 mutations and chromosomal aneuploidies is well established3,7,14–16. Overall, 67% (n = 252) of TP53-mutated cases had at least two chromosomal deletions as compared to 5% (n = 158) of wild-type cases (OR = 35, 95% CI: 27–46). Excluding chr17 (which is linked to state definition), there was a significantly higher number of chromosomal aberrations per patient in all multi-hit TP53 subgroups compared to the monoallelic group (Fig. 2a and Extended Data Fig. 5), and this enrichment was most pronounced for deletions (median four in multi-hit versus one in monoallelic state). In particular, deletion of 5q was observed in 85% of multi-hit patients as opposed to 34% of monoallelic patients (OR = 10, 95% CI: 6.1–18; Supplementary Fig. 10). Taken together, we found a median of six unique chromosomes with aberrations in the multi-hit state and one in the monoallelic state (two-sided Wilcoxon rank-sum test W statistic = 2,395, P = 1.2 × 10^−41; Fig. 2b). Our data suggest that residual wild-type TP53 is critical to the maintenance of genome stability, and that the association between TP53 and complex karyotype is specific to the multi-hit state (91 versus 13% complex karyotype patients within multi-hit or monoallelic states, OR = 70, 95% CI: 34–150; Fig. 2c).
Fig. 2 |. TP53 allelic state correlates with contrasting levels of genome stability and patterns of co-mutation.
a, Number of chromosomal aberrations per patient on chromosomes other than 17 across TP53 subgroups (1mut, >1mut, mut + del and mut + cnLOH, with 125, 90, 85 and 78 patients, respectively) and types of aberrations: rearrangement (rearr), gain or deletion (del). In all boxplots, the median is indicated by the horizontal line and the first and third quartiles by the box edges. The lower and upper whiskers extend from the hinges to the smallest and largest values, respectively, no further than 1.5× interquartile range from the hinges. ****P < 0.0001, two-sided Wilcoxon rank-sum test, each compared to the same aberration within the 1mut group. b, Number of unique chromosomes other than 17 affected by a chromosomal aberration (rearr, gain or del) per TP53 subgroup for 1mut (n = 125), >1mut (n = 90), mut + del (n = 85) and mut + cnLOH (n = 78). Dots represent the median across patients and lines extend from first to third quartiles. ****P < 0.0001, two-sided Wilcoxon rank-sum test, compared to the 1mut group. Wilcoxon W statistic = 9,950, 10,040 and 9,239 and P = 2 × 10^−22, 2 × 10^−28 and 1 × 10^−27 for >1mut, mut + del and mut + cnLOH, respectively. c, Interaction between TP53 allelic state and complex karyotype; 13% (16/125) of monoallelic TP53 patients (1mut) had a complex karyotype and 91% (231/253) of multi-hit TP53 patients (multi) had a complex karyotype. d, Number of driver mutations on genes other than TP53 per TP53 subgroup of 1mut (n = 125), >1mut (n = 90), mut + del (n = 85) and mut + cnLOH (n = 78). Dots represent the median across patients and lines extend from first to third quartiles. ****P < 0.0001, two-sided Wilcoxon rank-sum test compared to the 1mut group. W = 8,515, 8,499 and 7,785 and P = 6 × 10^−1, 6 × 10^−14 and 3 × 10^−13 for >1mut, mut + del and mut + cnLOH, respectively. e, Proportion of cases per TP53 allelic state with driver mutations in genes most frequently co-mutated with TP53. 
Genes mutated in at least 5% of monoallelic (n = 125) or multi-hit (n = 253) patients are represented. ***P < 0.001, **P < 0.01, *P < 0.05, two-sided Fisher’s exact test with Benjamini–Hochberg multiple testing correction.
The total number of oncogenic gene mutations and the pattern of co-mutations were also different among the allelic states. Excluding TP53, the number of driver mutations was higher in the monoallelic state compared to the multi-hit TP53 subgroups (Fig. 2d). Overall, 40% (n = 102) of multi-hit patients did not have any identifiable driver mutations other than TP53, while 90% (n = 112) of monoallelic patients had at least one other driver mutation and 50% (n = 62) had at least three. Differences in the pattern of co-mutations were also identified, whereby monoallelic patients were significantly enriched for mutations in TET2, SF3B1, ASXL1, RUNX1, SRSF2, JAK2, BCOR and CBL (Fig. 2e).
Previous studies have recurrently linked TP53 mutations to high-risk presentation (complex karyotype, elevated blasts, severe thrombocytopenia) and adverse outcomes3,4. These correlations were recapitulated in our study (Supplementary Fig. 11). However, the clinical implications of the allelic state have not been investigated. In our cohort, monoallelic TP53 patients were less cytopenic (Fig. 3a–c) and had lower percentages of bone marrow blasts compared to multi-hit patients (median 4 versus 9%; Fig. 3d). There was a higher prevalence of lower-risk MDS in monoallelic patients, while the multi-hit state was enriched for higher-risk WHO (World Health Organization) subtypes and poor/very poor IPSS-R categories (Extended Data Fig. 6a,b). Overall survival (OS) and AML transformation were significantly different between the TP53 allelic states. In the multi-hit state, the median OS was 8.7 months (95% CI: 7.7–10.3) whereas it was 2.5 years (95% CI: 2.2–4.9) for monoallelic patients (hazard ratio (HR) = 3.7, 95% CI: 2.7–5.0, P = 2 × 10^−16, Wald test). In comparison, wild-type patients had a median OS of 3.5 years (95% CI: 3.4–3.9) (Fig. 3e). The effect of monoallelic TP53 on OS was not confounded by del(5q) (Supplementary Fig. 12). The 5-year cumulative incidence of AML transformation in the multi-hit and monoallelic states was, respectively, 44 and 21% (HR = 5.5, 95% CI: 3.1–9.6, P = 5 × 10^−9, Wald test) (Fig. 3f). Of note, all subgroups (more than one gene mutation, mutation and deletion, mutation and cnLOH) in the multi-hit state had equally dismal outcomes (Extended Data Fig. 7a,b). The OS distinction of the two states was significant across WHO classes and IPSS-R risk groups (Extended Data Fig. 6c,d and Supplementary Fig. 13), and multi-hit TP53 identified patients with poor survival across IPSS-R strata. 
Because 10% of multi-hit patients were classified as IPSS-R risk very good to intermediate, this shows that assessment of TP53 allelic state is critical to identification of patients with high-risk disease. In fact, multivariable Cox proportional hazards models that included TP53 state alongside age at diagnosis, cytogenetic risk score12 and established predictive features identified multi-hit TP53 as an independent predictor for the risk of death and AML transformation (HR_OS = 2.04, 95% CI: 1.6–2.6, P = 5 × 10^−8; HR_AML = 2.9, 95% CI: 1.8–4.7, P = 7 × 10^−6, Wald test), whereas monoallelic TP53 state was not different compared to wild-type TP53 (Fig. 3g,h). We also found that multi-hit TP53 and complex karyotype, but not monoallelic TP53, are independent predictors of adverse outcome (Supplementary Fig. 14), emphasizing the importance of mapping TP53 state alongside complex karyotype for accurate risk estimation.
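For readers less familiar with the survival estimates quoted above, the following is a minimal sketch of the Kaplan–Meier product-limit estimator and of reading off a median survival time. The function names are hypothetical and the input is invented toy data, not data from this study; it does not reproduce the analysis behind Fig. 3.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each death time.

    times: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is 50% or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented toy follow-up times (months) and event indicators.
toy_times = [2, 3, 3, 5, 8, 8, 12]
toy_events = [1, 1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print("median:", median_survival(toy_curve))
```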
Fig. 3 |. TP53 allelic state associates with distinct clinical phenotypes and shapes patient outcomes.
a–d, Boxplots indicating the levels of cytopenia, that is, hemoglobin (a), platelets (b), absolute neutrophil count (ANC) (c) and percentage of bone marrow blasts (d) per TP53 allelic state of wild-type TP53 (WT, n = 2,922), monoallelic TP53 (1mut, n = 125) or multiple TP53 hits (multi, n = 253). In all boxplots, the median is indicated by the horizontal line and the first and third quartiles by the box edges. The lower and upper whiskers extend from the hinges to the smallest and largest values, respectively, no further than 1.5× interquartile range from the hinges. The y-axis values are square-rooted. ****P < 0.0001, ***P < 0.001, two-sided Wilcoxon rank-sum test. e,f, Kaplan–Meier probability estimates of overall survival (e) and cumulative incidence of AML transformation (AMLt) (f) per TP53 allelic state. The numbers of cases with outcome data per allelic state are indicated in parentheses. P values are derived from two-sided log-rank and Gray’s tests. g, Results of Cox proportional hazards regression for overall survival (OS) performed on 2,719 patients with complete data for OS and with 1,290 observed deaths. Explanatory variables are age, hemoglobin, platelets, ANC, bone marrow blasts, cytogenetic IPSS-R risk scores (very good, good, intermediate (the reference), poor and very poor) and TP53 allelic state (monoallelic and multi-hit, with wild-type as the reference). Hemoglobin, platelets, ANC and bone marrow blasts are scaled by their sample mean; age is scaled by a factor of 10; the x-axis is log10 scaled. Dots and lines represent the estimated hazard ratios and 95% confidence intervals (CI), respectively. ****P < 0.0001, ***P < 0.001, **P < 0.01, NS, not significant, P > 0.05, Wald test. h, Results of cause-specific Cox proportional hazards regression for AMLt performed on 2,464 patients with complete data for AMLt and with 411 observed transformations. Covariates are as in g. Dots and lines represent estimated hazard ratios and 95% CI, respectively. 
****P < 0.0001, **P < 0.01, NS, not significant, P > 0.05, Wald test.
Outcomes of monoallelic patients significantly differed with the number of co-occurring driver mutations (Fig. 2d,e and Supplementary Fig. 15). For example, the 5-year survival rate of monoallelic patients with no other identifiable mutations was 81% while it was 36% for patients with one or two other mutations, 26% for patients with three or four other mutations and 8% for patients with more than five other mutations. Contrastingly, the outcome of multi-hit patients was not dependent on the number of additional mutations, and the 5-year survival rate was uniformly <6%. Taken together, multi-hit TP53 patients had few co-mutations and very poor survival irrespective of genetic context. Patients with monoallelic TP53 mutations frequently had several co-occurring mutations that shaped disease pathogenesis and outcomes. These data further showcase that monoallelic TP53 mutations are not independently predictive of adverse risk.
In addition to TP53 mutations, TP53 VAF has also been reported to be of prognostic significance in MDS17–19. This is probably explained by the strong correlation between high VAF and biallelic targeting. Optimal cut-point analysis20 identified that patients with monoallelic TP53 mutations and VAF > 22% (n = 38) had increased risk of death compared to wild-type patients (HR = 2.2, 95% CI: 1.5–3.2, P = 0.0001, Wald test), whereas patients with monoallelic TP53 mutations and VAF ≤ 22% (n = 87) had OS similar to wild-type patients (Extended Data Fig. 7c). This highlights that patients with monoallelic mutations and high VAF should be closely monitored. It is possible that we missed a second TP53 hit in the small subset of monoallelic cases with VAF > 22%. Conversely, multi-hit patients had poor outcomes across ranges of VAF. Multi-hit patients with VAF ≤ 11% (n = 20) had very dismal outcomes, for both OS and AML transformation (Extended Data Fig. 7c,d). Importantly, the genomic and clinical associations established for multi-hit cases held true irrespective of VAF. Patients with multi-hit TP53 had higher genome instability, fewer cooperating mutations and more pronounced thrombocytopenia and elevated blast counts compared to monoallelic patients in both clonal and subclonal ranges (Supplementary Fig. 16). This indicates that, once established, a clone with biallelic TP53 targeting exerts its pervasive effects on clinical phenotypes and outcomes regardless of its size. The determination of TP53 allelic state requires assessment of both multiple mutations and subclonal allelic imbalances, and multi-hit TP53 state identified very-high-risk patients independently of the VAF of TP53 mutations.
The emergence of data in support of dominant negative effect (DNE)21,22 and gain of function (GOF)23–25 led us to test whether outcomes differed based on the nature of the underlying lesion—that is, missense, truncated or hotspot TP53 mutations. In the multi-hit state, no differences were observed for genome instability and outcomes across mutation types (Extended Data Fig. 8 and Supplementary Fig. 17a,b), indicating that it is the loss of both wild-type copies of TP53 that drives the dismal outcomes of TP53-mutated MDS patients rather than the underlying mutation types. In the monoallelic state, missense mutations in the DNA binding domain (DBD) had no effect on patient outcomes compared to wild-type TP53. However, there was an increased risk of death in monoallelic patients with hotspot mutation at amino acid positions R175 and R248 (but not R273) compared to wild-type patients (HR = 2.3, 95% CI: 1.2–4.7, P = 0.02 for R248 and HR = 3.0, 95% CI: 0.96–9.3, P = 0.06, Wald test for R175; Supplementary Fig. 17c,d), consistent with either DNE21 or GOF25 of the hotspot mutant proteins. This suggests that DNE21 may not be applicable to all DBD mutations, especially in the setting of MDS where exposure to genotoxic therapy is not common. Larger datasets and functional studies are warranted to further investigate the operative mechanisms of DBD mutations in MDS.
Beyond primary MDS, TP53 mutations are enriched in therapy-related MDS (t-MDS)6,26 and are associated with a high risk of progression to AML5. In t-MDS and at progression, TP53-mutated patients demarcate an extremely adverse prognostic group with a chemorefractory disease and <2% 5-year survival15,16. Our cohort included 229 t-MDS cases, with a higher proportion of TP53-mutated patients relative to de novo MDS (18 versus 6%, OR = 3.3, 95% CI: 2.4–4.6). TP53-mutated t-MDS patients more frequently had multiple hits compared to TP53-mutated de novo patients (84 versus 65%, OR = 2.8, 95% CI: 1.4–6.6). Comparison of genome profiles (Supplementary Fig. 18) and clinical outcomes (Fig. 4a) between allelic states reiterated observations from de novo MDS. TP53-mutant t-MDS is considered one of the most lethal malignancies with limited treatment options27, yet monoallelic patients had lower risk of death compared to multi-hit patients (HR = 0.39, 95% CI: 0.15–1.0, P = 0.05, Wald test).
Fig. 4 |. TP53 allelic state demarcates outcomes in therapy-related MDS and on different therapies.
a, Kaplan–Meier probability estimates of overall survival per TP53 allelic state of wild-type TP53 (WT), monoallelic TP53 (1mut) and multiple TP53 hits (multi), and across types of MDS, that is, de novo MDS (solid lines) or therapy-related MDS (dashed lines). Among de novo cases, 101 had a monoallelic TP53 mutation (solid orange line), 184 were multi-hit TP53 (solid blue line) and 2,552 were TP53 wild-type (solid gray line). Among therapy-related cases, ten had a monoallelic TP53 mutation (dashed orange line), 52 were multi-hit TP53 (dashed blue line) and 162 were TP53 wild-type (dashed gray line). Annotated P values are from two-sided log-rank tests. b–d, Kaplan–Meier probability estimates of overall survival (OS) following commencement of treatment with HMA (b) or lenalidomide for patients with del(5q) (c) or HSCT (d) per TP53 allelic state. OS was measured from the start of treatment or HSCT to the time of death from any cause. Patients alive at the last follow-up date were censored at that time. The number of cases with OS data per TP53 state is indicated in parentheses. Annotated P values are from two-sided log-rank tests.
To evaluate the effect of TP53 state in disease progression, we analyzed serial data from an independent cohort of 12 patients with MDS28,29 (St James’s University Hospital, United Kingdom) who progressed to AML with a TP53 mutation (Supplementary Fig. 19). In 7/12 patients, multiple hits were observed at the time of MDS diagnosis, with a median of 4 months to AML progression (Supplementary Fig. 19a–g). In three patients, biallelic targeting occurred during disease progression with interclonal competition and attainment of clonal dominance for the TP53 clone (Supplementary Fig. 19h–j). The remaining two cases that progressed with a monoallelic TP53 mutation had other high-risk mutations in either RUNX1 and KRAS or CBL (Supplementary Fig. 19k,l), consistent with the observation from our discovery cohort that monoallelic TP53 mutations tend to occur with several and diverse cooperating mutations (Fig. 2d,e). These data provide further evidence that biallelic alteration of TP53 is a potent driver of disease progression, and underscore the importance of assessing TP53 allelic state at diagnosis and for disease surveillance.
We validated the representation of TP53 allelic states (Supplementary Fig. 20), genome stability profiles (Supplementary Fig. 21) and differences in clinical phenotypes (Supplementary Fig. 22) in a cohort of 1,120 patients with MDS (Extended Data Fig. 2).
Last, we evaluated the implication of TP53 allelic state in response to therapy. Recent studies have reported that TP53-mutated patients have poor responses to lenalidomide8 and hematopoietic stem cell transplantation (HSCT)6,7, as well as marked but transient responses to hypomethylating agents (HMA)30. We conducted an exploratory survival analysis by allelic state for patients who received HMA or lenalidomide (the latter restricted to the subset with deletion of 5q) and for patients who underwent HSCT (Extended Data Fig. 9). For HMA and lenalidomide, patients with monoallelic TP53 mutations had evidence of longer survival compared to multi-hit patients (Fig. 4b,c). The analysis of our HSCT cohort was limited due to its size, yet we observed a trend for improved survival of monoallelic patients compared to multi-hit patients following HSCT (Fig. 4d). These observations highlight the importance of mapping TP53 allelic states in future correlative studies of response to therapy.
In summary, we have provided a detailed characterization of TP53 allelic state in 3,324 patients with MDS, and assessed its implication for disease biology, clinical presentation and outcomes. Two-thirds of TP53-mutated patients had multiple hits (more than one gene mutation, mutation and deletion, mutation and cnLOH), consistent with biallelic targeting. The remaining one-third had monoallelic mutations with one residual wild-type allele.
We have demonstrated that the multi-hit TP53 state in MDS, not the bare presence of any TP53 mutation, underlies established associations with genome instability, treatment resistance, disease progression and dismal outcomes. Multi-hit TP53 identified very-high-risk patients independently of IPSS-R, co-occurring mutations and clonal representation. Surprisingly, monoallelic TP53 patients did not differ from TP53 wild-type patients with regard to response to therapy, overall survival and AML progression. The shift in survival for monoallelic patients with the number of co-mutations indicates diversity of disease pathogenesis and highlights the need for future prognostic models that consider a large spectrum of gene mutations.
Different evolutionary trajectories between multi-hit and monoallelic patients emerged from our data. In the multi-hit state, TP53 mutations were predominantly in the dominant clone with complex karyotypes and few other mutations, reflecting early truncal events in MDS pathogenesis. In contrast, monoallelic TP53 mutations were frequently subclonal and co-occurred with mutations from a broad range of genes, including genes associated with either a favorable31 (SF3B1) or a poor32 (ASXL1, RUNX1, CBL) prognosis. A limitation of our study is that we may have missed a second hit for a small subset of cases, such as balanced rearrangement or aberrant methylation. However, the systematic differences between monoallelic and multi-hit patients across genomic and clinical metrics indicate that our definition of TP53 allelic state delineates two biologically and clinically relevant groups. In Extended Data Fig. 10, we propose a workflow to map TP53 allelic state in routine diagnostic practice.
Our findings imply that diagnostic and prognostic precision in MDS requires the assessment of TP53 allelic state. We propose that biallelic TP53 should be distinguished from monoallelic TP53 mutations in future revisions of IPSS-R and in correlative studies of treatment response. As the most frequently mutated gene in cancer, the representation and effect of TP53 allelic state warrant investigation across cancer indications.
Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41591-020-1008-z.
Methods
Patient samples.
The International Working Group for Prognosis in MDS (IWG-PM) cohort originated from 24 MDS centers (Supplementary Table 1) that contributed peridiagnosis MDS, myelodysplastic/myeloproliferative neoplasms (MDS/MPN) and AML/AML with myelodysplasia-related changes (AML-MRC) patient samples to the study. Following quality control (Supplementary Fig. 1), 3,324 samples were included in the study (Extended Data Fig. 1). The source for genomic DNA was either bone marrow or peripheral blood mononuclear cells. The median time from diagnosis to sampling was 0 d (first quartile, 0 d; third quartile, 113 d). The validation cohort consisted of 1,120 samples from the Japanese MDS consortium (Extended Data Fig. 2). Samples were obtained with informed consent in accordance with the Declaration of Helsinki and appropriate Ethics Committee approval from each IWG-PM partner institution.
Clinical data.
Diagnostic clinical variables were provided by the contributing centers and curated to ensure uniformity of metrics across centers and countries. Clinical variables included (1) sex; (2) age at diagnosis; (3) WHO disease subtype; (4) MDS type (de novo, secondary or therapy-related); (5) differential blood counts to include hemoglobin, platelets, white blood cells, neutrophils and monocytes; (6) percentage of bone marrow and peripheral blood blasts; (7) cytogenetic data; and (8) risk score as per IPSS-R11. Clinical outcomes included the time of death from any cause or last follow-up from sample collection, and the time of AML transformation or last follow-up from sample collection.
Cytogenetic data.
Conventional G-banding analysis (CBA) data were available for 2,931 patients, and karyotypes were described in accordance with the International System for Human Cytogenetic Nomenclature33. CBA data were risk stratified according to the IPSS-R guidelines12 using both algorithmic and manual classification by an expert panel of cytogeneticists.
WHO subtypes.
Contributing centers provided the vast majority of disease classification as per WHO 2008. A pathology review was performed uniformly on the entire cohort, to ensure concordance between disease classification and diagnostic variables and to update the classification as per WHO 2016. The cohort was representative of all MDS WHO subtypes and included 563 (17%) MDS/MPN and 167 (5%) AML/AML-MRC samples (Extended Data Fig. 1).
IPSS-R risk scores.
IPSS-R risk scores were uniformly calculated based on both IPSS-R cytogenetic risk scores and values for hemoglobin, platelets, absolute neutrophil count (ANC) and percentage of bone marrow blasts. All IPSS-R risk groups were represented (Extended Data Fig. 1).
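As an illustration of how these five variables combine, the sketch below computes an IPSS-R score and risk category. The point values and cut-offs are written as commonly tabulated for the published IPSS-R (ref. 11) and should be checked against that reference before any use; the function name and argument names are hypothetical, and this is not the code used in the study.

```python
def ipss_r(cyto_risk, blasts_pct, hb_g_dl, platelets_e9_l, anc_e9_l):
    """Return (score, category) from the five IPSS-R variables.

    Assumed units: blasts in %, hemoglobin in g/dL, platelets and ANC
    in 10^9 cells per liter. Cut-offs are assumptions to be verified.
    """
    cyto_points = {"very good": 0, "good": 1, "intermediate": 2,
                   "poor": 3, "very poor": 4}
    score = cyto_points[cyto_risk]

    # Bone marrow blasts: <=2%, >2 to <5%, 5-10%, >10%.
    if blasts_pct > 10:
        score += 3
    elif blasts_pct >= 5:
        score += 2
    elif blasts_pct > 2:
        score += 1

    # Hemoglobin: >=10, 8 to <10, <8 g/dL.
    if hb_g_dl < 8:
        score += 1.5
    elif hb_g_dl < 10:
        score += 1

    # Platelets: >=100, 50 to <100, <50.
    if platelets_e9_l < 50:
        score += 1
    elif platelets_e9_l < 100:
        score += 0.5

    # Absolute neutrophil count: >=0.8 or <0.8.
    if anc_e9_l < 0.8:
        score += 0.5

    if score <= 1.5:
        category = "very low"
    elif score <= 3:
        category = "low"
    elif score <= 4.5:
        category = "intermediate"
    elif score <= 6:
        category = "high"
    else:
        category = "very high"
    return score, category


# Hypothetical patient: poor-risk cytogenetics, 7% blasts, Hb 9.2 g/dL,
# platelets 60, ANC 1.1.
print(ipss_r("poor", 7, 9.2, 60, 1.1))
```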
Targeted sequencing.
Panel design.
The panel used for targeted sequencing included genes recurrently mutated in MDS, as well as 1,118 genome-wide single-nucleotide polymorphism (SNP) probes for copy number analysis, with on average one SNP probe every 3 Mb. Bait tiling was conducted at 2×. Baits were designed to span all exonic regions of TP53 across all transcripts, as described in RefSeq (NM_001276761, NM_001276695, NM_001126114, NM_00112611), and included 20-base pair (bp) intronic flanking regions.
Library preparation and sequencing.
For library construction, 11–800 ng of genomic DNA was used with the KAPA Hyper Prep Kit (Kapa Biosystems, no. KK8504) with 7–12 cycles of PCR. After sample barcoding, 10–1,610 ng of each library was pooled and captured by hybridization. Captured pools were sequenced with paired-end Illumina HiSeq at a median coverage of 730× per sample (range, 127–2,480×). Read length was either 100 or 125 bp.
We also sequenced 48 blood samples from young individuals without hematological disease, on the same panel and under the same sequencing conditions as the tumor samples, to help filter out sequencing artifacts and germline SNPs.
Sequencing was performed in an unmatched setting—that is, without a matched normal tissue control per patient—so that variants had to be curated accordingly (see Variant calling and filtering for artifacts and germline variants).
Alignment.
Raw sequence data were aligned to the human genome (NCBI build 37) using BWA34 v.0.7.17. PCR duplicate reads were marked with Picard tools (https://broadinstitute.github.io/picard/) v.2.18.2. For alignment, we used the pcap-core dockerized pipeline v.4.2.1 available at https://github.com/cancerit/PCAP-core.
Sample quality control.
Quality control (QC) of fastq and bam data was performed with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) v.0.11.5 and Picard tools, respectively.
In addition, a number of downstream QC steps were performed, including:
Fingerprinting—that is, evaluation of the similarity between all pairs of samples, based on the respective genotype on 1,118 SNPs. Duplicate samples were excluded from the study.
Evaluation of concordance between patient sex from the clinical data and coverage on the sex chromosomes. Discordant cases were discussed with the contributing centers to rule out patients with Klinefelter syndrome and filter out erroneous samples appropriately.
Evaluation of concordance between CBA data and NGS-derived copy number profiles (see Copy number and LOH analysis). A typical discordant case is one where CBA reports a given deletion or gain in a high number of metaphases and the NGS profile clearly shows other abnormalities not reported by CBA. All discordant cases were reviewed by a panel of experts through the IWG cytogenetic committee.
Finally, samples that passed QC but were found not to be treatment naive—that is, the patient received disease-modifying treatment before sample collection—were excluded from the study. Supplementary Fig. 1 summarizes the QC workflow.
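The fingerprinting step listed above can be illustrated as a pairwise genotype-concordance check. This is a minimal sketch rather than the pipeline used in the study: genotypes are assumed to be coded as 0, 1 or 2 alternate alleles per SNP (None when uncalled), and the function names and the 0.9 concordance threshold are hypothetical.

```python
from itertools import combinations


def concordance(geno_a, geno_b):
    """Fraction of SNPs with identical genotypes, ignoring uncalled sites."""
    shared = [(a, b) for a, b in zip(geno_a, geno_b)
              if a is not None and b is not None]
    if not shared:
        return 0.0
    return sum(a == b for a, b in shared) / len(shared)


def candidate_duplicates(genotypes, threshold=0.9):
    """Return (sample, sample, score) for pairs reaching the threshold.

    genotypes: dict mapping sample ID to a list of genotype codes,
    in the same SNP order for every sample.
    threshold: hypothetical cut-off, not taken from the study.
    """
    pairs = []
    for s1, s2 in combinations(sorted(genotypes), 2):
        score = concordance(genotypes[s1], genotypes[s2])
        if score >= threshold:
            pairs.append((s1, s2, score))
    return pairs


# Toy example with 5 SNPs instead of the 1,118 on the panel.
toy = {
    "S1": [0, 1, 2, 1, 0],
    "S2": [0, 1, 2, 1, 0],      # same genotypes as S1
    "S3": [2, 0, 1, None, 1],
}
print(candidate_duplicates(toy))
```

Pairs flagged this way would still need review before exclusion, since related individuals or low-coverage samples can also show high concordance.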
Variant calling and filtering for artifacts and germline variants.
Variants were derived from a combination of variant callers. For single nucleotide variants (SNVs), we used CaVEMan (http://cancerit.github.io/CaVEMan/) v.1.7.4, Mutect35 v.4.0.1.2 and Strelka36 v.2.9.1. For small insertions and deletions (indels), we used Pindel37 v.1.5.4, Mutect v.4.0.1.2 and Strelka v.2.9.1. VAFs were uniformly reported across all called variants using the vafCorrect realignment procedure available at https://github.com/cancerit/vafCorrect. All called variants were annotated with VAGrENT (https://github.com/cancerit/VAGrENT) v.3.3.0 and Ensembl-VEP (https://github.com/Ensembl/ensembl-vep) with Ensembl v.91 and VEP release 94.5.
Artifact variants were filtered out based on:
Off-target variants—that is, variants called outside of the panel target regions were excluded
Variants with VAF < 2%, <20 total reads or <5 mutant supporting reads were excluded
The number of callers calling a given variant and the combination of filters (flags) from the three callers; more specifically:
- For SNVs, variants called by CaVEMan with >2 CaVEMan flags (from the DTH, RP, MN, PT, MQ, SR, TI, SRP, VUM, SE list) were excluded. Variants called only by Strelka and Mutect (but not CaVEMan) were filtered out if they had >0 flags or if the dirprop metric (ratio of number of reads on each strand) was <0.44. Variants called only by Mutect (but not by CaVEMan or Strelka) were filtered out if they had >0 flags or if dirprop was <0.44 or VAF < 5%.
- For indels, variants called by all three callers (Pindel, Mutect and Strelka) were excluded if they had >3 flags. Variants called by only two callers were excluded if they had >2 flags. Variants called only by Pindel were filtered out if they had >1 flag or <2 mutant reads on one strand. Variants called only by Mutect were filtered out if they had >0 flags or <2 mutant reads on one strand.
Recurrence and VAF distribution of the called variants on a panel of 48 normal samples
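The read-support thresholds and the SNV caller rules listed above can be sketched as two small predicates. This is an illustration under stated assumptions, not the study code: the function names, the lower-case caller labels and the single n_flags argument are hypothetical, and the text does not say how caller combinations other than those listed (for example, Strelka only) were handled, so the sketch excludes them.

```python
def passes_basic_thresholds(vaf, depth, alt_reads):
    """Keep variants with VAF >= 2%, >= 20 total reads and >= 5 mutant reads."""
    return vaf >= 0.02 and depth >= 20 and alt_reads >= 5


def snv_passes_caller_rules(callers, n_flags, dirprop, vaf):
    """Apply the SNV caller and flag rules quoted in the text.

    callers: set of caller labels that reported the variant.
    n_flags: number of filter flags raised (CaVEMan flags if CaVEMan called it).
    dirprop: strand ratio metric; vaf: variant allele fraction between 0 and 1.
    """
    callers = frozenset(callers)
    if "caveman" in callers:
        # Excluded only when more than two CaVEMan flags are raised.
        return n_flags <= 2
    if callers == {"strelka", "mutect"}:
        return n_flags == 0 and dirprop >= 0.44
    if callers == {"mutect"}:
        return n_flags == 0 and dirprop >= 0.44 and vaf >= 0.05
    # Combination not described in the text: excluding it is an
    # assumption of this sketch.
    return False


# A Mutect-only call at 4% VAF fails the 5% requirement.
print(snv_passes_caller_rules({"mutect"}, n_flags=0, dirprop=0.5, vaf=0.04))
```

The indel rules could be written the same way, with the flag limits depending on how many of Pindel, Mutect and Strelka reported the variant.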
After prefiltering of artifactual variants, germline SNPs were filtered out by consideration of:
VAF density of variants consistent with germline SNP
Presence in the Genome Aggregation Database (gnomAD)38. More specifically, variants with a population-based allele frequency (VEP_gnomAD_AF) >0.001 were excluded (with the exception of a few variants in SH2B3 involved in familial thrombocythemia). Variants with a maximum allele frequency across the gnomAD populations (VEP_MAX_AF) >0.01 were excluded (with the exception of ASXL1 amino acid position G646, which requires specific rescue).
Recurrence in a panel of normals
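The population-frequency criteria above can be sketched as follows. The function name, argument names and the sh2b3_rescue set are hypothetical; the text does not list which SH2B3 variants were rescued, so they are passed in by the caller. The sketch follows the wording literally, attaching each exception to the threshold it is mentioned with.

```python
def excluded_as_germline(gnomad_af, max_af, gene, aa_position=None,
                         variant_id=None, sh2b3_rescue=frozenset()):
    """Return True when a variant is removed by the gnomAD frequency rules.

    gnomad_af: overall gnomAD allele frequency (VEP_gnomAD_AF), or None.
    max_af: maximum allele frequency across populations (VEP_MAX_AF), or None.
    sh2b3_rescue: hypothetical set of SH2B3 variant IDs to keep.
    """
    if gnomad_af is not None and gnomad_af > 0.001:
        rescued = gene == "SH2B3" and variant_id in sh2b3_rescue
        if not rescued:
            return True
    if max_af is not None and max_af > 0.01:
        # ASXL1 G646 is kept despite exceeding the threshold.
        rescued = gene == "ASXL1" and aa_position == 646
        if not rescued:
            return True
    return False


# An ASXL1 variant at position 646 survives the VEP_MAX_AF rule.
print(excluded_as_germline(0.0005, 0.02, "ASXL1", aa_position=646))
```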
All remaining probable somatic variants after the above-mentioned filtering were manually inspected with Integrative Genomics Viewer39 to rule out residual artifacts.
Variant annotation for putative oncogenicity.
From the list of probable somatic variants, putative oncogenic variants were distinguished from variants of unknown significance based on:
Recurrence in the Catalog Of Somatic Mutations in Cancer (COSMIC)40, in myeloid disease samples registered in cBioPortal40,41 and in the study dataset
Annotation in the human variation database ClinVar44
Annotation in the precision oncology knowledge database OncoKB45
Recurrence with somatic presentation in a set of in-house data derived from >6,000 myeloid neoplasms16,32,46
The inferred consequence of a mutation where nonsense mutations, splice site mutations and frameshift indels were considered oncogenic in tumor suppressor genes (from COSMIC Cancer Census Genes or OncoKB Cancer Gene List)
For annotation of oncogenicity of TP53 variants we additionally considered:
Functional annotation in the International Agency for Research on Cancer (IARC) TP53 database47
Functional classification prediction scores for TP53 variants using PHANTM48
Supplementary Fig. 5 illustrates the rationale and results of the annotation of TP53 variants for putative oncogenicity.
Copy number and LOH analysis.
In addition to CBA, we assessed chromosomal alterations based on NGS data using CNACS7. CNACS enables the detection of arm-level and focal copy number changes as well as regions of cnLOH. CNACS has been optimized to run in the unmatched setting and uses a panel of normals for calibration.
Supplementary Fig. 2 provides examples of characterization of allelic imbalances (gains, deletions and regions of cnLOH) using CNACS, with concordant copy number change findings between CBA and CNACS, focal deletions exclusively detected with CNACS and, as expected, regions of cnLOH detected only by CNACS. For genome-wide analysis, we considered CNACS segments >3 Mb with minor allele frequency <45% (when 50% represents no allelic imbalance). Supplementary Fig. 4 provides examples of characterization of allelic imbalances by CNACS and SNP arrays on 21 selected samples, with highly concordant findings between the two assays.
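The genome-wide segment criteria quoted above (length >3 Mb and minor allele frequency <45%) amount to a simple filter. The sketch below is illustrative only: the tuple layout, the coordinates and the function name are hypothetical and do not reflect the CNACS output format.

```python
def keep_segment(start, end, minor_allele_freq,
                 min_length=3_000_000, max_maf=0.45):
    """True when a segment meets both genome-wide criteria from the text."""
    return (end - start) > min_length and minor_allele_freq < max_maf


# Hypothetical segments: (label, start, end, minor allele frequency).
segments = [
    ("17p", 0, 22_000_000, 0.20),             # long and imbalanced: kept
    ("5q", 100_000_000, 101_500_000, 0.30),   # shorter than 3 Mb: dropped
    ("7q", 60_000_000, 150_000_000, 0.48),    # near-balanced: dropped
]
kept = [s for s in segments if keep_segment(s[1], s[2], s[3])]
print([s[0] for s in kept])
```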
In addition to CNACS, we also ran CNVkit49 v.0.9.6 on the study cohort. Because CNVkit does not infer allele-specific copy numbers, it does not allow marking regions of cnLOH but it estimates copy number changes. The integration of two copy number tools increased the specificity and sensitivity of copy number calling.
For 2,931 patients with CBA data we performed a detailed comparison of CBA- and NGS-derived copy number results (Supplementary Fig. 3), which showed highly concordant findings. In addition to annotating regions of cnLOH, we added copy number changes that were clear in the NGS results but missed by CBA (for example, focal deletions). In 393 patients with missing CBA data, we used the NGS results to fully annotate copy number changes. Because our NGS assay did not allow the detection of translocations, inversions, whole-genome amplification or the presence of marker or ring chromosomes, those specific alterations were statistically imputed from other molecular markers in these 393 patients.
Complex karyotype.
Among 2,931 patients with CBA data, 310 had a complex karyotype identified by CBA, where complex karyotype was defined as three or more independent chromosomal abnormalities. Among those 2,931 patients, NGS results helped to identify complex karyotypes in an additional six patients. Among the 393 cases with missing CBA data, 13 had a complex karyotype according to NGS copy number profiles (Supplementary Fig. 3c). Overall, 329 patients had a complex karyotype, representing 10% of the study cohort.
Statistics.
All statistical analyses were conducted using the R statistical platform (https://www.r-project.org/) v.3.6.1. Fisher’s exact test and Wilcoxon rank-sum test were used to compare categorical and continuous variables, respectively. All statistical tests were two-sided. Benjamini–Hochberg multiple testing correction was applied when appropriate.
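For readers unfamiliar with the Benjamini–Hochberg procedure mentioned above, the adjustment can be sketched in a few lines. The study used R; the Python function below is an illustrative reimplementation with a hypothetical name, not the code used for the analyses.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw P value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone and capped at 1.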
Overall survival.
Overall survival was measured from the time of sample collection to the time of death from any cause. Patients alive at the last follow-up date were censored at that time. Survival probabilities over time were estimated using Kaplan–Meier methodology, and comparisons of survival across subgroups were conducted using the two-sided log-rank test. Kaplan–Meier estimates were computed using the R package survival.
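The Kaplan–Meier estimate described above multiplies, at each observed death time, the running survival probability by the fraction of at-risk patients who survive that time. The study used the R package survival; the sketch below is a minimal Python illustration with a hypothetical function name and toy data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times: follow-up times; events: 1 if death was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Toy data: five patients, two of them censored.
print(kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]))
```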
Multivariable models of overall survival were fitted with Cox proportional hazards regression, using the coxph function from the R package survival. Hazard ratios and 95% CIs were reported for covariates, along with P values from the Wald test. Covariates included in the multivariable model of overall survival were age, hemoglobin, platelets, ANC, bone marrow blasts, cytogenetic risk group and TP53 allelic state. Hemoglobin, platelets, ANC and bone marrow blasts were treated as continuous variables and were scaled by their sample mean. Age was treated as a continuous variable and was scaled by a factor of ten. Cytogenetic risk group was treated as a categorical variable, with the intermediate risk group as the reference group. TP53 allelic state was treated as a categorical variable, with the wild-type state as the reference group relative to monoallelic and multi-hit groups. These covariates correspond to all covariates included in the age-adjusted IPSS-R model, along with the TP53 allelic state.
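The covariate preparation described above can be sketched as follows. This is an illustration under assumptions: "scaled by their sample mean" is read here as division by the mean, "scaled by a factor of ten" as division by ten (so that the hazard ratio is per decade of age), and the function names and toy values are hypothetical. The model fitting itself is not shown.

```python
def scale_by_mean(values):
    """Divide each value by the sample mean (assumed meaning of the scaling)."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]


def scale_age(ages):
    """Express age in decades."""
    return [a / 10 for a in ages]


def dummy_code(levels, reference):
    """One indicator column per non-reference level of a categorical covariate."""
    categories = sorted(set(levels) - {reference})
    return [{f"{c}_vs_{reference}": int(level == c) for c in categories}
            for level in levels]


# Toy values for three patients.
hemoglobin = scale_by_mean([8, 10, 12])
tp53_state = dummy_code(["wt", "mono", "multi"], reference="wt")
print(hemoglobin, scale_age([70, 65, 80]), tp53_state)
```

With this coding, each hazard ratio for TP53 allelic state compares one mutated group with the wild-type reference.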
AML transformation (AMLt).
In univariate analysis of AMLt, time to AMLt was measured from the time of sample collection to the time of transformation, with death without transformation treated as a competing risk. Patients alive without AMLt at the last contact date were censored at that time. Cumulative incidence functions were used to estimate the incidence of AMLt using the R package cmprsk, and comparisons of cumulative incidence function across subgroups were conducted using the two-sided Gray’s test.
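The cumulative incidence function with a competing risk can be illustrated with the standard nonparametric estimator: at each transformation time, the probability of being event-free just before that time is multiplied by the fraction of at-risk patients who transform. The study used the R package cmprsk; the sketch below is a Python illustration with a hypothetical function name and event coding, and it does not implement Gray's test.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Return [(time, cumulative incidence)] for one cause under competing risks.

    causes: 0 = censored, 1 = AML transformation, 2 = death without
    transformation (coding chosen for this sketch).
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    event_free = 1.0   # probability of no event of any type before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_interest = n_any = removed = 0
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            n_interest += cause == cause_of_interest
            n_any += cause != 0
            removed += 1
            i += 1
        if n_interest:
            cif += event_free * n_interest / at_risk
            curve.append((t, cif))
        # Any event, of either type, lowers the event-free probability.
        event_free *= 1 - n_any / at_risk
        at_risk -= removed
    return curve


# Toy data: two transformations, two deaths without transformation, one censored.
print(cumulative_incidence([1, 2, 3, 4, 5], [1, 2, 0, 1, 2]))
```

Unlike one minus a Kaplan–Meier estimate that censors competing deaths, this estimator does not overstate the incidence of transformation.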
Multivariable models of AMLt were performed using cause-specific Cox proportional hazards regressions, where patients who did not transform but died were censored at the time of death. Hazard ratios and 95% CIs were reported for the covariates, along with P values from the Wald test. Covariates included in the multivariable model of AMLt were the same as those in the model of overall survival described above.
Reporting Summary.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
Clinical, copy number and mutation data are available at https://github.com/papaemmelab/MDS-TP53-state. The data underlying Figs. 1–4 are provided as Source Data.
Databases used in the study are gnomAD (https://gnomad.broadinstitute.org), COSMIC (https://cancer.sanger.ac.uk/cosmic), cBioPortal for Cancer Genomics (https://www.cbioportal.org), OncoKB Precision Oncology Knowledge Base (https://www.oncokb.org), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar) and the IARC TP53 Database (https://p53.iarc.fr).
Code availability
The NGS-based, allele-specific copy number algorithm CNACS7 is available as a Python workflow built on the Toil workflow engine at https://github.com/papaemmelab/toil_cnacs, where release v.0.2.0 was used in this study. Source code to reproduce figures from the manuscript is available at https://github.com/papaemmelab/MDS-TP53-state.
Extended Data
Extended Data Fig. 1 |. Study cohort characteristics.
Table describing the baseline characteristics of the study cohort. 1Q: first quartile; 3Q: third quartile; OS: overall survival; #: AML classification per WHO 2016 and previously RAEB-T cases. $: Median follow-up time is calculated for censored patients.
Extended Data Fig. 2 |. Validation cohort characteristics.
Table describing the baseline characteristics of the validation cohort. 1Q: first quartile; 3Q: third quartile; OS: overall survival; $: Median follow-up time is calculated for censored patients.
Extended Data Fig. 3 |. Landscape of chromosomal aberrations in MDS.
a, Landscape of chromosomal arm-level aberrations across 3,324 patients. Aberrations include copy-neutral loss of heterozygosity (cnloh), deletion (del) and gain. Chromosomes or chromosome arms with more than 5 aberrations are depicted on the x-axis. Aberrations were assessed using the integration of conventional G-banding analysis (CBA) data and NGS-derived allele-specific copy number profiles (see Methods). NGS aberrant segments were restricted to segments larger than 3 megabases. b, Frequency distribution of chromosomal aberrations ordered by type of aberration. The top three plots represent arm-level copy-neutral loss of heterozygosity (cnloh), deletion (del) and gain. The fourth (bottom) plot represents other types of aberrations, including the presence of marker chromosome (mar), rearrangements where r_i_j denotes a rearrangement between chromosome i and j, isochromosome 17q (iso17q), whole genome amplification (WGA) and presence of ring chromosome (ring). All aberrations observed in more than 3 patients are depicted. Of note, cnloh is detectable with NGS but not with CBA. Conversely, rearrangements, presence of marker or ring chromosome and WGA were only assessed from CBA data. In 393 cases with missing CBA data, those specific aberrations were imputed from other molecular markers.
Extended Data Fig. 4 |. Evidence of biallelic TP53 targeting in the cases with multiple TP53 hits.
a, Scatter plot of the two maximum TP53 variant allele frequency (VAF) values from cases with multiple TP53 mutations and no copy-neutral LOH or deletion at TP53 locus (n=90). Points are annotated according to the level of information of the mutation pairs. In 67% (n=60) of pairs the sum of the two VAFs exceeded 50% so that the mutations were considered to be in the same cells as per the pigeonhole principle (triangle and diamond points). In 18 cases, the genomic distance between two mutations was within sequencing read length and it was therefore possible to phase the mutations. In all those cases the mutations were observed to be unphased, that is, in trans (square and diamond points). Within those 18 pairs of unphased mutations, 10 pairs had a sum of VAFs above 50%, that is, mutations were necessarily on different alleles and in the same cells, implying biallelic targeting (diamond points). b, c, Scatter plots of the VAF of TP53 mutations and minor allele frequency of 17p heterozygous SNPs from cases with one TP53 mutation and 17p deletion (b., n=69) or 17p copy-neutral LOH (c., n=61). The high correlations in (a.), (b.) and (c.) (R2 of 0.77, 0.94 and 0.97, respectively) are indicative of biallelic targeting of TP53. d, Table of pairs of TP53 mutations from the same patients that could be phased. All pairs were in trans, that is, mutations were supported by different alleles. e, Representative IGV example of unphased mutations (patient p12 from table (d.)).
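The pigeonhole argument in panel a can be made explicit with a short calculation. The sketch assumes that each mutation is heterozygous in a diploid, copy-neutral region, so that the fraction of cells carrying it is twice its VAF; under that assumption, two mutations whose VAFs sum to more than 50% must share at least some cells. The function name is hypothetical.

```python
def min_shared_cell_fraction(vaf1, vaf2):
    """Lower bound on the fraction of cells carrying both mutations.

    Assumes heterozygous mutations in a diploid, copy-neutral region,
    so the mutated cell fraction is 2 x VAF (capped at 1).
    """
    f1 = min(1.0, 2 * vaf1)
    f2 = min(1.0, 2 * vaf2)
    # If the two cell fractions sum to more than 1, they must overlap.
    return max(0.0, f1 + f2 - 1.0)


# VAFs of 30% and 25% sum to 55%: the overlap is necessarily positive.
print(min_shared_cell_fraction(0.30, 0.25) > 0)
# VAFs of 20% and 20% sum to 40%: no overlap is implied.
print(min_shared_cell_fraction(0.20, 0.20))
```

A bound of zero does not show that the mutations are in different cells, only that co-occurrence cannot be inferred from the VAFs alone.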
Extended Data Fig. 5 |. Heatmap of chromosomal aberrations per TP53 allelic state.
Each column represents a patient from the TP53 subgroups of monoallelic mutation (top orange band, 1mut), multiple mutations (top light blue band, >1mut), mutation(s) and deletion (top blue band, mut+del) and mutation(s) and copy-neutral loss of heterozygosity (top dark blue band, mut+cnloh). Aberrations observed at a frequency higher than 2% in either monoallelic or multi-hit TP53 state are depicted on the y-axis. Aberrations include from top to bottom the annotation of complex karyotype (complex), the presence of marker chromosome (mar), deletion (del), gain (plus), rearrangement (with r_i_j rearrangement between chromosome i and j), copy-neutral loss of heterozygosity (cnloh), whole genome amplification (WGA) and the presence of ring chromosome (ring). The deletions of 17p of two cases from the 1mut TP53 subgroup did not affect the TP53 locus.
Extended Data Fig. 6 |. TP53 allelic state segregates patient outcomes across WHO subtypes and IPSS-R risk groups.
a, Proportion of WHO subtypes per TP53 allelic state of monoallelic mutation (1mut) and multiple hits (multi). t-MDS: therapy-related MDS; SLD: single lineage dysplasia; RS: ring sideroblast; MLD: multiple lineage dysplasia; EB: excess blasts; AML-MRC: AML with myelodysplasia-related changes; U: unclassified. Multi-hit TP53 is enriched for t-MDS compared to monoallelic TP53 state (21% vs. 8%, OR=2.9, p=0.002 two-sided Fisher exact test) and for MDS-EB2 (31% vs. 13%, OR=3.1, p=5×10⁻⁵ two-sided Fisher exact test). Conversely, monoallelic TP53 is enriched for MDS-del5q (15% vs. 2%, OR=8.4, p=6×10⁻⁶ two-sided Fisher exact test). b, Proportion of IPSS-R risk groups per TP53 allelic state. Multi-hit TP53 is strongly enriched for the very-poor category compared to monoallelic TP53 state (74% vs. 9%, OR=28, p=2×10⁻³⁵ two-sided Fisher exact test). c, Kaplan-Meier probability estimates of overall survival (OS) across main WHO subtypes per TP53 allelic state of wild-type TP53 (WT), monoallelic TP53 (1mut) and multiple TP53 hits (multi). WHO subtypes MDS-SLD and MDS-MLD are merged together as MDS-SLD/MLD and WHO subtypes MDS-EB1 and MDS-EB2 are merged together as MDS-EB1/2. d, Kaplan-Meier probability estimates of overall survival across IPSS-R risk groups per TP53 allelic state. IPSS-R very-good and good risk groups are merged together (leftmost panel), and IPSS-R very-poor and poor risk groups are merged together as well (rightmost panel). In (c.) and (d.), annotated p-values are from two-sided log-rank tests and numbers indicate cases with OS data per allelic state.
Extended Data Fig. 7 |. Outcomes across TP53 subgroups and VAF strata.
a, b, Kaplan-Meier probability estimates of overall survival (a.) and cumulative incidence of AML transformation (AMLt) (b.) across TP53 subgroups of wild-type TP53 (WT), single TP53 mutation (1mut), multiple TP53 mutations (>1mut), TP53 mutation(s) and deletion (mut+del), TP53 mutation(s) and copy-neutral loss of heterozygosity (mut+cnloh). c-d, Kaplan-Meier probability estimates of overall survival (c.) and cumulative incidence of AMLt (d.) per TP53 allelic state and range of variant allele frequency (VAF) of TP53 mutations. Annotated p-values are from two-sided log-rank tests in (a.) and (c.) and from two-sided Gray’s tests in (b.) and (d.). The number of cases with outcome data per group is indicated in parentheses.
Extended Data Fig. 8 |. Maintained differences in genome instability levels and outcomes between TP53 states per mutation type.
a, Proportion of different types of mutation per TP53 subgroup. Truncated mutations (pink) include frameshift indels, nonsense or nonstop mutations and splice-site variants. Mutations annotated as hotspot (purple) are missense mutations at amino acid positions 273, 248, 220 and 175. Mutations annotated as other-missense (green) are additional missense mutations or inframe indels. Odds ratio and two-sided Fisher’s test p-values for the proportion of truncated versus non-truncated mutations between the multi-hit TP53 subgroups and the monoallelic TP53 subgroup (1mut) are indicated on the right side. b, Number per patient of unique chromosomes other than 17 with aberrations per TP53 subgroup of single gene mutation (1mut), mutation and deletion (mut+del) and mutation and copy-neutral loss of heterozygosity (mut+cnloh) and across mutation types. Note that 5 patients with both several mutations and deletion or cnloh with ambiguity between the mutation type categories were excluded from this analysis. The number of patients within each category is indicated in parentheses. In boxplots, the median is indicated by the thick horizontal line, and the first and third quartiles by the box edges. The lower and upper whiskers extend from the hinges to the smallest and largest values, respectively, no further than 1.5x the interquartile range from the hinges. Data beyond the whiskers are plotted individually as dots. The annotated p-values are derived from the two-sided Wilcoxon rank-sum test, each compared to the 1mut group within the same mutation type. c, Kaplan-Meier probability estimates of overall survival (OS) per TP53 subgroup across mutation types. Annotated p-values are from two-sided log-rank tests. The number of cases per subgroup with OS data is indicated in parentheses.
Extended Data Fig. 9 |. Characteristics of treated cohort subsets.
Table describing the baseline characteristics of the subset of patients that i) received hypomethylating agent (HMA), ii) received lenalidomide in the context of del(5q) or iii) underwent hematopoietic stem cell transplantation (HSCT).
Extended Data Fig. 10 |. Clinical workflow for the assessment of TP53 allelic state.
Schematic of a simple clinical workflow based on the number of TP53 mutations, the presence or absence of deletion 17p per cytogenetic analysis, and the presence or absence of cnLOH or focal deletion at 17p per NGS based assay or SNP array. Mutations were considered if VAF≥2%. VAF: variant allele frequency; CK: complex karyotype; OS: overall survival; AML: transformation to acute myeloid leukemia.
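The decision logic summarized in this caption can be sketched as a small function. This is a hedged reading of the caption and of the subgroup definitions used in the other figures (1mut, >1mut, mut+del, mut+cnloh), not a clinical tool: the function name, argument names and return labels are hypothetical, both flags are assumed to refer to events affecting the TP53 locus, and the caption does not say how cases with a deletion or cnLOH but no mutation are labeled, so the sketch returns a placeholder for them.

```python
def tp53_allelic_state(mutation_vafs, del17p_cba=False,
                       cnloh_or_focal_del_ngs=False, min_vaf=0.02):
    """Classify TP53 allelic state from mutation count and 17p status.

    mutation_vafs: VAFs (0-1) of the TP53 mutations found in one patient.
    del17p_cba: deletion 17p affecting TP53 reported by cytogenetics.
    cnloh_or_focal_del_ngs: cnLOH or focal deletion at 17p seen by an
    NGS-based assay or SNP array.
    """
    # Only mutations with VAF >= 2% are counted, as stated in the caption.
    n_mut = sum(v >= min_vaf for v in mutation_vafs)
    second_hit = del17p_cba or cnloh_or_focal_del_ngs
    if n_mut == 0:
        # Placeholder label for cases the caption does not cover.
        return "allelic imbalance only" if second_hit else "wild-type"
    if n_mut >= 2 or second_hit:
        return "multi-hit"
    return "monoallelic"


print(tp53_allelic_state([0.10]))
print(tp53_allelic_state([0.10], del17p_cba=True))
```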
Supplementary Material
Acknowledgements
This work was supported in part by grants from the Celgene Corporation through the MDS Foundation. It was also supported by grants-in-aid from the Japan Agency for Medical Research and Development (AMED) (JP19cm0106501, JP19ck0106250 and 15H05909 (S.O.) and JP18ck0106353 (Y.N.)), from the Japan Society for the Promotion of Science (JSPS) (KAKEN JP26221308, JP19H05656 (S.O.)) and from the Ministry of Education, Culture, Sports, Science and Technology (hp160219 (S.O.)). J.B. and A.P. acknowledge funding from Blood Cancer UK (grant 13042). P.V. was supported by the Austrian Science Fund (grant F4704-B20). M.Y.F. was supported by Italian MIUR-PRIN grants. L.M. was supported by the Associazione Italiana per la Ricerca sul Cancro (AIRC, Milan, Italy) 5 per Mille project (21267 and IG 20125). M.T.V. was supported by AIRC 5 per Mille project (21267). M.T.V. recruited patients through the GROM-L clinical network. E.B. was supported by the Francois Wallace Monahan Fellowship and an EvansMDS Young Investigator award. E.P. is a Josie Robertson Investigator and is supported by the European Hematology Association, the American Society of Hematology, Gabrielle’s Angels Foundation, V Foundation and The Geoffrey Beene Foundation. We thank T. Iraca for logistical support.
U.G. has received honoraria from Celgene, Novartis, Amgen, Janssen, Roche and Jazz and research funding from Celgene and Novartis. C.A.C. has received research funding from Celgene. A.A.L. is on the advisory boards of Celgene, Amgen, Roche, Novartis and Alexion and has received research funding from Celgene. F.T. is on the advisory boards of Jazz, Pfizer and Abbvie and has received research funding from Celgene. I.K. is on the advisory board of Genesis Pharma and has received research funding from Celgene and Janssen Hellas. F.P.S.S. has received honoraria from Janssen-Cilag, Bristol-Myers-Squibb, Novartis, Amgen, Abbvie and Pfizer, is on the advisory boards of Novartis, Amgen and Abbvie and has received research funding from Novartis. A.T.-K. has received honoraria from Novartis, Bristol-Myers-Squibb and MSD and has received research funding from Celgene, Ono Pharmaceutical and Cognano. T.K. has received research funding from Bristol-Myers-Squibb, Otsuka Pharmaceutical, Kyowa Kirin, MSD, Astellas Pharmaceutical, Nippon Shinyaku, Novartis Pharmaceutical, Sumitomo Dainippon Pharmaceutical, Janssen Pharmaceutical, Celgene, SymBio Pharmaceutical, Taiho Pharmaceutical, Teijin, Sanofi K.K. and Celltrion. M.R.S. is on the advisory boards of Abbvie, Astex, Celgene, Karyopharm, Selvita and TG Therapeutics, has equity in Karyopharm and has received research funding from Astex, Incyte, Sunesis, Takeda and TG Therapeutics. G.S. is on the advisory boards of AbbVie, Amgen, Astellas, Boehringer-Ingelheim, Celgene, Helsinn Healthcare, Hoffmann-La Roche, Janssen-Cilag, Novartis and Onconova and has received research funding from Celgene, Hoffmann-La Roche, Janssen-Cilag and Novartis. L.A. is on the advisory boards of Abbvie, Astex, Celgene and Novartis and has received research funding from Celgene. D.S.N. has equity in Madrigal Pharmaceuticals and has received research funding from Celgene and Pharmacyclics. K.L.B. has received research funding from GRAIL. M.H. has received honoraria from Novartis, Pfizer and PriME Oncology, is on the advisory boards of Abbvie, Bayer Pharma, Daiichi Sankyo, Novartis and Pfizer and has received institutional research funding from Astellas, Bayer Pharma, BergenBio, Daiichi Sankyo, Karyopharm, Novartis, Pfizer and Roche. P.V. has received honoraria and research funding from Celgene. S.C. has received research funding from Kyowa Kirin, Chugai Pharmaceutical, Takeda Pharmaceutical, Astellas Pharmaceutical, Sanofi KK and Ono Pharmaceutical. Y.M. has received honoraria from Otsuka, Novartis, Nippon Shinyaku, Dainippon-Sumitomo and Kyowa Kirin and research funding from Chugai. C.F. is on the advisory boards of, and has received honoraria from, Celgene, Novartis and Janssen and has received research funding from Celgene. M.T.V. is on the advisory board of Celgene, has received honoraria from Celgene and Novartis and has received research funding from Celgene. Y.A. has received honoraria from Mochida, Meiji, Chugai and Kyowa Kirin. N.G. is on the advisory board of, and has received honoraria from, Novartis and has received research funding from Alexion. B.L.E. has received research funding from Celgene and Deerfield. R.B. is on the advisory boards of Celgene, AbbVie, Astex, NeoGenomics and Daiichi Sankyo and has received research funding from Celgene and Takeda. E.H.-L. has received research funding from Celgene. E.B. has received research funding from Celgene. E.P. has received research funding from Celgene and has served on scientific advisory boards for Novartis. E.P. is the founder and CEO of Isabl, a company offering analytics for cancer whole-genome sequencing data.
Footnotes
Competing interests
The authors declare the following competing interests.
Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41591-020-1008-z.
Supplementary information is available for this paper at https://doi.org/10.1038/s41591-020-1008-z.
Peer review information Javier Carmona was the primary editor on this article, and managed its editorial process and peer review in collaboration with the rest of the editorial team.
References
- 1. Kandoth C et al. Mutational landscape and significance across 12 major cancer types. Nature 502, 333–339 (2013).
- 2. Zehir A et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat. Med. 23, 703–713 (2017).
- 3. Haase D et al. TP53 mutation status divides myelodysplastic syndromes with complex karyotypes into distinct prognostic subgroups. Leukemia 33, 1747–1758 (2019).
- 4. Bejar R et al. Clinical effect of point mutations in myelodysplastic syndromes. N. Engl. J. Med. 364, 2496–2506 (2011).
- 5. Kitagawa M, Yoshida S, Kuwata T, Tanizawa T & Kamiyama R p53 expression in myeloid cells of myelodysplastic syndromes. Association with evolution of overt leukemia. Am. J. Pathol. 145, 338–344 (1994).
- 6. Lindsley RC et al. Prognostic mutations in myelodysplastic syndrome after stem-cell transplantation. N. Engl. J. Med. 376, 536–547 (2017).
- 7. Yoshizato T et al. Genetic abnormalities in myelodysplasia and secondary acute myeloid leukemia: impact on outcome of stem cell transplantation. Blood 129, 2347–2358 (2017).
- 8. Jädersten M et al. TP53 mutations in low-risk myelodysplastic syndromes with del(5q) predict disease progression. J. Clin. Oncol. 29, 1971–1979 (2011).
- 9. Haferlach T et al. Landscape of genetic lesions in 944 patients with myelodysplastic syndromes. Leukemia 28, 241–247 (2014).
- 10. Kastenhuber ER & Lowe SW Putting p53 in context. Cell 170, 1062–1078 (2017).
- 11. Greenberg PL et al. Revised international prognostic scoring system for myelodysplastic syndromes. Blood 120, 2454–2465 (2012).
- 12. Schanz J et al. New comprehensive cytogenetic scoring system for primary myelodysplastic syndromes (MDS) and oligoblastic acute myeloid leukemia after MDS derived from an international database merge. J. Clin. Oncol. 30, 820–829 (2012).
- 13. Breems DA et al. Monosomal karyotype in acute myeloid leukemia: a better indicator of poor prognosis than a complex karyotype. J. Clin. Oncol. 26, 4791–4797 (2008).
- 14. Donehower LA et al. Integrated analysis of TP53 gene and pathway alterations in the cancer genome atlas. Cell Rep. 28, 1370–1384 (2019).
- 15. Rucker FG et al. TP53 alterations in acute myeloid leukemia with complex karyotype correlate with specific copy number alterations, monosomal karyotype, and dismal outcome. Blood 119, 2114–2121 (2012).
- 16. Papaemmanuil E et al. Genomic classification and prognosis in acute myeloid leukemia. N. Engl. J. Med. 374, 2209–2221 (2016).
- 17. Sallman DA et al. Impact of TP53 mutation variant allele frequency on phenotype and outcomes in myelodysplastic syndromes. Leukemia 30, 666–673 (2016).
- 18. Goel S et al. High prevalence and allele burden-independent prognostic importance of p53 mutations in an inner-city MDS/AML cohort. Leukemia 30, 1793–1795 (2016).
- 19. Montalban-Bravo G et al. Genomic context and TP53 allele frequency define clinical outcomes in TP53-mutated myelodysplastic syndromes. Blood Adv. 4, 482–495 (2020).
- 20. Lausen B & Schumacher M Maximally selected rank statistics. Biometrics 48, 73–85 (1992).
- 21. Boettcher S et al. A dominant-negative effect drives selection of TP53 missense mutations in myeloid malignancies. Science 365, 599–604 (2019).
- 22. Levine AJ The many faces of p53: something for everyone. J. Mol. Cell Biol. 11, 524–530 (2019).
- 23. Lang GA et al. Gain of function of a p53 hot spot mutation in a mouse model of Li-Fraumeni syndrome. Cell 119, 861–872 (2004).
- 24. Olive KP et al. Mutant p53 gain of function in two mouse models of Li-Fraumeni syndrome. Cell 119, 847–860 (2004).
- 25. Loizou E et al. A gain-of-function p53-mutant oncogene promotes cell fate plasticity and myeloid leukemia through the pluripotency factor FOXH1. Cancer Discov. 9, 962–979 (2019).
- 26. Wong TN et al. Role of TP53 mutations in the origin and evolution of therapy-related acute myeloid leukaemia. Nature 518, 552–555 (2015).
- 27. Platzbecker U Treatment of MDS. Blood 133, 1096–1107 (2019).
- 28. Roman E et al. Myeloid malignancies in the real-world: occurrence, progression and survival in the UK’s population-based Haematological Malignancy Research Network 2004–15. Cancer Epidemiol. 42, 186–198 (2016).
- 29. Smith A et al. Cohort profile: the Haematological Malignancy Research Network (HMRN); a UK population-based patient cohort. Int. J. Epidemiol. 47, 700–700g (2018).
- 30. Welch JS et al. TP53 and decitabine in acute myeloid leukemia and myelodysplastic syndromes. N. Engl. J. Med. 375, 2023–2036 (2016).
- 31. Malcovati L et al. Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/myeloproliferative neoplasms. Blood 118, 6239–6246 (2011).
- 32. Papaemmanuil E et al. Clinical and biological implications of driver mutations in myelodysplastic syndromes. Blood 122, 3616–3627 (2013). quiz 3699.
References
- 33. International Standing Committee on Human Cytogenetic Nomenclature. ISCN 2013: An International System for Human Cytogenetic Nomenclature (Karger, 2013).
- 34. Li H & Durbin R Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
- 35. Cibulskis K et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
- 36. Saunders CT et al. Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics 28, 1811–1817 (2012).
- 37. Ye K, Schulz MH, Long Q, Apweiler R & Ning Z Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 2865–2871 (2009).
- 38. Karczewski KJ et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. Nature 581, 434–443 (2020).
- 39. Thorvaldsdóttir H, Robinson JT & Mesirov JP Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinformatics 14, 178–192 (2013).
- 40. Tate JG et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).
- 41. Cerami E et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2, 401–404 (2012).
- 42. Chang MT et al. Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity. Nat. Biotechnol. 34, 155–163 (2016).
- 43. Chang MT et al. Accelerating discovery of functional mutant alleles in cancer. Cancer Discov. 8, 174–183 (2018).
- 44. Landrum MJ et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014).
- 45. Chakravarty D et al. OncoKB: a precision oncology knowledge base. JCO Precis. Oncol. 2017, 10.1200 (2017).
- 46. Grinfeld J et al. Classification and personalized prognosis in myeloproliferative neoplasms. N. Engl. J. Med. 379, 1416–1430 (2018).
- 47. Bouaoun L et al. TP53 variations in human cancers: new lessons from the IARC TP53 Database and genomics data. Hum. Mutat. 37, 865–876 (2016).
- 48. Giacomelli AO et al. Mutational processes shape the landscape of TP53 mutations in human cancer. Nat. Genet. 50, 1381–1387 (2018).
- 49. Talevich E, Shain AH, Botton T & Bastian BC CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLoS Comput. Biol. 12, e1004873 (2016).
Data Availability Statement
Clinical, copy number and mutation data are available at https://github.com/papaemmelab/MDS-TP53-state. The data underlying Figs. 1–4 are provided as Source Data.
Databases used in the study are gnomAD (https://gnomad.broadinstitute.org), COSMIC (https://cancer.sanger.ac.uk/cosmic), cBioPortal for Cancer Genomics (https://www.cbioportal.org), OncoKB Precision Oncology Knowledge Base (https://www.oncokb.org), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar) and the IARC TP53 Database (https://p53.iarc.fr).