Author manuscript; available in PMC: 2022 Jun 13.
Published in final edited form as: Lancet Haematol. 2021 Mar;8(3):e205–e215. doi: 10.1016/S2352-3026(20)30394-X

Development and validation of a disease risk stratification system for patients with haematological malignancies: a retrospective cohort study of the European Society for Blood and Marrow Transplantation registry

Roni Shouval 1, Joshua A Fein 2, Myriam Labopin 3, Christina Cho 4, Ali Bazarbachi 5, Frédéric Baron 6, Gesine Bug 7, Fabio Ciceri 8, Selim Corbacioglu 9, Jacques-Emmanuel Galimard 3, Sebastian Giebel 10, Maria H Gilleece 11, Sergio Giralt 4, Ann Jakubowski 4, Silvia Montoto 12, Richard J O’Reilly 13, Esperanza B Papadopoulos 4, Zinaida Peric 14, Annalisa Ruggeri 8, Jaime Sanz 15, Craig S Sauter 4, Bipin N Savani 16, Christoph Schmid 17, Alexandros Spyridonidis 18, Roni Tamari 4, Jurjen Versluis 19, Ibrahim Yakoub-Agha 20, Miguel Angel Perales 4, Mohamad Mohty 21, Arnon Nagler 22
PMCID: PMC9190021  NIHMSID: NIHMS1803853  PMID: 33636142

Abstract

Background:

Diagnosis and remission status at the time of allogeneic haematopoietic stem cell transplantation are the principal determinants of survival following transplantation. We sought to develop a contemporary Disease-Risk Stratification System (DRSS) that accounts for heterogeneous transplant indications.

Methods:

We studied 55 histology and remission status combinations across haematological malignancies, including acute leukaemia, lymphoma, multiple myeloma, and myeloproliferative and myelodysplastic disorders. A total of 47,265 adult patients (age ≥18 years) transplanted between 01-01-2012 and 12-31-2016 and reported to the European Society for Blood and Marrow Transplantation were divided into derivation (n=25,534), tuning (n=18,365), and geographical validation (n=3,366) cohorts. Disease combinations were ranked in a multivariable Cox regression for overall survival (OS) in the derivation cohort, cut-offs for risk groups were evaluated on the tuning cohort, and the selected system was tested on the geographical validation cohort and an independent single-centre US cohort (n=660).

Findings:

The median follow-up on the derivation cohort was 2·1 years (IQR 1·0, 3·2). Over the derivation cohort, patients were stratified into five risk groups with increasing mortality risk (low: reference; intermediate-1: HR 1·26 [95% CI 1·17, 1·36]; intermediate-2: 1·53 [1·42, 1·66]; high: 2·03 [1·86, 2·22]; very high: 2·87 [2·63, 3·13]). DRSS levels were also associated with a stepwise increase in risk across the tuning and validation cohorts. Nearly 70% of patients were categorised as having intermediate-risk disease by a previous prognostic system (Disease-Risk Index), with a projected 2-year OS of 62·1%. The DRSS reclassified these patients into finer prognostic groups with 2-year OS ranging from 45·7% to 73·1%.

Interpretation:

DRSS is a risk stratification tool including disease features related to histology, genetic profile, and treatment response. The model was developed and validated in approximately 47,000 patients and should serve as a benchmark for the field. It facilitates the interpretation and analysis of studies with heterogeneous cohorts, promoting trial-design with more inclusive populations.

Funding:

The Dotan Research Center in Haemato-Oncology, Tel Aviv University

INTRODUCTION

Relapse remains a stubborn barrier to allogeneic hematopoietic stem cell transplantation (HSCT) success, occurring in nearly one-third of transplantations.1 Diagnosis and remission status at the time of transplantation are among the strongest predictors of relapse and death.1–4 When contemplating transplantation, designing a clinical trial, or analysing outcome data, accounting for these factors is imperative.

To standardise the process of pre-transplantation risk-assessment, prognostic systems have categorised risk based on the combination of disease and remission status.2,3,5–7 The disease-risk index (DRI)2,3 has proven valuable and is considered the standard for prognostication in cohorts with heterogeneous diagnoses.8,9 Nevertheless, it was developed on patients transplanted over a decade ago and assigns the bulk of recipients to the intermediate-disease-risk category.2,3 Ideally, a prognostic model would reflect more recent practice and provide finer, actionable categories. Therefore, we sought to develop and validate a more contemporary disease-risk stratification system for patients with haematological malignancies undergoing allogeneic HSCT. Such a system could promote the design of non-disease-specific trials by accounting for the population’s heterogeneity, increasing power and generalizability. Furthermore, it could contribute to the analysis and interpretation of prospective and retrospective studies.

METHODS

Study Design and Data Sources

The European Society for Blood and Marrow Transplantation (EBMT) maintains an audited registry of HSCT conducted by member-institutions. Over 600 participating centres submit anonymised data following patient informed consent. For model development and internal validation, we included adult allogeneic HSCT recipients (age ≥ 18 years) with haematological malignancies reported to the registry. Patients underwent transplantation between 1/1/2012 and 12/31/2016, using all stem cell sources. Out of 54,076 patients in the registry, cases with missing survival status (n = 528) or donor relationship (n = 941) or with insufficient information to establish diagnosis and disease status (n = 4,727) were excluded (Figure 1). An additional 522 cases were dismissed due to disease-specific missing information and 93 due to the rarity of cases (sample size <50) in that diagnosis/status category (Burkitt’s Lymphoma, relapsed biphenotypic leukaemia). The final study population comprised 47,265 patients. For external validation, we included 660 patients transplanted at the Memorial Sloan Kettering Cancer Center (MSKCC) between 01/01/2010 and 12/31/2015. Patients receiving cord blood grafts or cells from HLA-mismatched related donors were not included in the MSKCC cohort.
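The exclusion counts above can be checked with simple arithmetic. The short Python sketch below is only an illustration: the counts are those reported in the text and abstract, while the variable names and dictionary labels are our own.

```python
# Counts as reported in the text; names and labels are illustrative only.
registry_total = 54_076
exclusions = {
    "missing survival status": 528,
    "missing donor relationship": 941,
    "insufficient diagnosis/disease-status information": 4_727,
    "disease-specific missing information": 522,
    "rare diagnosis/status category (<50 cases)": 93,
}

# Patients remaining after all exclusions.
final_population = registry_total - sum(exclusions.values())

# Cohort sizes as reported for the EBMT split.
cohorts = {
    "derivation": 25_534,
    "tuning": 18_365,
    "geographical validation": 3_366,
}

print(final_population)
print(sum(cohorts.values()) == final_population)
```

Both the exclusion cascade and the three-way cohort split should arrive at the same total study population.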

Figure 1. Flow diagram.


The EBMT cohort was split into derivation, tuning, and geographical validation cohorts. A cohort from Memorial Sloan Kettering Cancer Center served for external validation.

EBMT - European Society for Blood and Marrow Transplantation

The Acute Leukemia Working Party, the EBMT Scientific Council, and MSKCC’s Institutional Review Board approved this study in accordance with the Declaration of Helsinki.

Donor types and conditioning intensities were defined as previously described.1 Fifteen haematological malignancies were considered and further stratified by disease status at the time of transplantation. Additional genetic markers were studied for AML, acute lymphoblastic leukaemia (ALL), and MDS. AML was first classified as de-novo or secondary based on standard criteria.10,11 Cytogenetic risk definition in AML was based on the European Leukemia Net definitions (appendix p. 2),12 and patients with de-novo AML in the intermediate-risk cytogenetic group were stratified by FLT3-ITD (FMS-like tyrosine kinase-3 internal tandem duplication) and NPM1 (Nucleophosmin-1) mutation status. MDS was considered to have adverse cytogenetic risk features in cases with complex karyotype (≥ 3 chromosomal abnormalities) or the deletion or monosomy of chromosome 7.2 For ALL, categorisation by the Philadelphia chromosome (Ph, t(9;22)(q34;q11)/BCR-ABL1) and by t(4;11)/KMT2A-AFF1 was evaluated. Since transplantation studies often have high rates of missing cytogenetic information, we kept missing cytogenetics in AML, MDS, and ALL as separate levels.

Outcomes

The studied outcome was overall survival (OS), measured from the time of stem cell infusion to censoring or death from any cause. Relapse and non-relapse mortality were also assessed as competing events.

Statistical analysis

We divided EBMT patients into derivation, tuning, and geographical validation cohorts.13 Briefly, combinations of disease and remission status were studied and ranked according to mortality risk on the derivation cohort, which included patients transplanted between 01-01-2014 and 12-31-2016. Since there is no “ground truth” for defining risk groups, we generated several potential risk-group schemes on the derivation cohort and evaluated them on the tuning cohort, which included EBMT patients transplanted between 01-01-2012 and 12-31-2013. The selected scheme was then tested on the geographical validation and external validation cohorts, which were held out throughout the training and tuning process. The former cohort comprised patients transplanted between 01-01-2014 and 12-31-2016 in Italian centres reporting to the EBMT, and the latter was a cohort transplanted at MSKCC between 01-01-2010 and 12-31-2015. The appendix (p. 16) provides a schematic overview of the analytic plan, and we describe the stages in detail below.
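The temporal and geographical split of the EBMT registry can be sketched as a small assignment rule. The Python below is a hedged illustration, not the authors' code: the function name `assign_cohort` and the `country` argument are hypothetical, the tuning window is taken as 2012 to 2013 as given in Table 1, and the handling of Italian patients transplanted before 2014 (placed in the tuning cohort here) is an assumption that the text does not state.

```python
from datetime import date

STUDY_START = date(2012, 1, 1)
LATE_PERIOD_START = date(2014, 1, 1)   # derivation / geographical validation window
STUDY_END = date(2016, 12, 31)


def assign_cohort(transplant_date: date, country: str) -> str:
    """Assign an EBMT patient to a cohort (hypothetical helper).

    Assumption: Italian patients transplanted in 2012-2013 go to the
    tuning cohort; the source text does not specify this case.
    """
    if not (STUDY_START <= transplant_date <= STUDY_END):
        raise ValueError("transplant date outside the 2012-2016 study window")
    if transplant_date >= LATE_PERIOD_START:
        # Italian centres in 2014-2016 were held out for geographical validation.
        return "geographical validation" if country == "Italy" else "derivation"
    return "tuning"


print(assign_cohort(date(2015, 6, 1), "Germany"))
print(assign_cohort(date(2015, 6, 1), "Italy"))
print(assign_cohort(date(2012, 3, 15), "France"))
```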

In the first stage, disease and disease status pairs were constructed following the example outlined in the DRI.2,3 New sub-categories of clinical interest were added where sub-histologies had distinct survival outcome (p <0.05), as assessed by hazard ratio [HR] adjusted for recipient age, Karnofsky performance status, conditioning intensity, donor and cell type, donor/recipient sex mismatch, and CMV serostatus pair with a random effect for centre in the derivation set. These changes include the division of AML into de-novo and secondary AML; separation of de-novo AML into first CR (CR1) and subsequent CR (CR2+); stratification of de-novo AML in CR1 with intermediate cytogenetics based on FLT3-ITD/NPM1 status (to FLT3-ITDpos/NPM1wt vs. all other combinations); stratification of de-novo AML in CR2+ based on adverse cytogenetics and FLT3-ITD status; stratification of secondary AML in CR based on adverse cytogenetics; stratification of ALL in CR1 based on t(9;22) status (appendix p. 17). We explored and rejected the inclusion of t(4;11) in ALL in CR1, as we did not find an association with differential survival (HR 1·32 [95% CI 0·83, 2·10]; p = 0·24). Within AML, MDS, and ALL in CR1, patients with unknown cytogenetics were categorised separately. Finally, we added biphenotypic acute leukaemia in CR and the myelodysplastic/myeloproliferative neoplasm overlap syndrome (MDS/MPN) as new diagnoses.

We constructed a mixed-effects multivariable Cox regression model for OS using the derivation set, with disease/disease-status pair adjusted using the same adjustment parameters described above for evaluating new sub-categories. The β-coefficients of the disease/status pairs were ranked. To create a risk-stratification system that would be easily applied, we sought β-coefficient cut-offs that would produce groups constituting between 10% and 40% of patients and incrementally predictive of mortality. We generated several different sets of cut-points over the derivation cohort, which were selected by serially searching for optimal cut-points using the maximally selected log-rank statistic,14 a method typically applied for identifying potential cut-points in continuous covariates. Grouping schemes fitting our initial criteria were evaluated on the tuning cohort. The one scheme that resulted in clinically rational, homogeneous groups was then studied using a multivariable Cox regression model, adjusting for the same covariates as described above (aside from centre effect on the single-centre validation cohort), on two datasets: the geographical validation and external validation cohorts (appendix p. 16).
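The size constraint on candidate schemes can be sketched in a few lines. The Python below is an illustration only and not the authors' code (the analyses were run in SPSS and R): the β values, patient counts, and cut-offs are invented, and the names `group_index` and `scheme_is_admissible` are hypothetical. It does not implement the maximally selected log-rank search; it only checks whether a given set of cut-offs yields groups that each hold between 10% and 40% of patients.

```python
from bisect import bisect_right
from collections import Counter


def group_index(beta, cutoffs):
    """Return the risk-group index (0 = lowest risk) for a beta-coefficient.

    `cutoffs` must be sorted ascending. A beta equal to a cut-off falls in
    the higher group (an assumption made here for definiteness).
    """
    return bisect_right(cutoffs, beta)


def scheme_is_admissible(pairs, cutoffs, lo=0.10, hi=0.40):
    """Check that every one of the len(cutoffs) + 1 groups holds lo..hi of patients.

    `pairs` is a list of (beta, n_patients), one entry per disease/status pair.
    """
    total = sum(n for _, n in pairs)
    sizes = Counter()
    for beta, n in pairs:
        sizes[group_index(beta, cutoffs)] += n
    # An empty group has share 0 and therefore fails the lower bound.
    return all(lo <= sizes[g] / total <= hi for g in range(len(cutoffs) + 1))


# Invented example: four disease/status pairs, 1,000 patients in total.
pairs = [(0.00, 300), (0.20, 350), (0.45, 200), (0.70, 150)]
print(scheme_is_admissible(pairs, [0.10, 0.40, 0.60]))  # four groups
print(scheme_is_admissible(pairs, [0.60]))              # two groups
```

In this toy example the three-cut-off scheme passes the constraint, whereas the single cut-off places 85% of patients in one group and is rejected.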

OS was calculated using the Kaplan-Meier method and compared by the log-rank statistic. All p-values were 2-sided and values less than 0·05 were considered significant without adjustment for multiple testing. Discrimination of the new system was compared to the revised DRI3 using time-dependent area under the receiver operating characteristic curve (AUC) statistic.15 Analyses were conducted using SPSS (version 25·0) and R (v.3·5·3).
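For readers less familiar with the Kaplan-Meier method, a minimal product-limit estimator is sketched below. This is a textbook illustration in plain Python with invented toy data, not the software used in the study; the function name `kaplan_meier` is our own.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy data: five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Survival drops only at observed death times; censored patients shrink the risk set without changing the estimate at their censoring time.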

Role of the funding source

The funders had no role in the study design, data collection and analysis, or writing the report. Authors RS, JF and ML had full access to all data. All authors shared the responsibility for the final decision to submit the report for publication.

RESULTS

The median age of patients in the derivation cohort was 53 years (IQR: 41, 62; Table 1). AML was the primary indication for transplantation, accounting for 11,881 (46·5%) of 25,534 patients, followed by ALL (n = 3,474; 13·6%), and MDS (n = 2,977; 11·7%). Bone marrow grafts, haploidentical donors, and myeloablative conditioning were more prevalent in the geographical-validation cohort (1,128 [33·5%], 897 [26·6%] and 2,494 [74·1%] out of 3,366 patients, respectively) when compared to the derivation cohort (2,126 [8·3%], 1,762 [6·9%] and 12,476 [48·9%]). Median follow-up and completeness of follow-up16 at 2 years was 2·1 years (interquartile range [IQR]: 1·0, 3·2) and 72·7% for the derivation cohort, 4·6 years (IQR: 1·9, 5·6) and 83·8% for the tuning cohort, and 3·0 years (IQR: 2·3, 3·9) and 94·8% for the geographical validation cohort.

Table 1.

EBMT Population Characteristics

| Characteristic | Derivation cohort (2014–2016) | Tuning cohort (2012–2013) | Geographic validation cohort~ (2014–2016) |
|---|---|---|---|
| N | 25,534 | 18,365 | 3,366 |
| Age, years (median [IQR]) | 53 [41, 62] | 51 [39, 60] | 52 [40, 60] |
| Sex (%) | | | |
| Male | 15,019 (58·8%) | 10,842 (59·0%) | 1,911 (56·8%) |
| Female | 10,459 (41·0%) | 7,488 (40·8%) | 1,453 (43·2%) |
| Sex unknown | 56 (0·2%) | 35 (0·2%) | 2 (0·1%) |
| Karnofsky Performance Status | | | |
| ≥ 90 | 17,465 (68·4%) | 13,139 (71·5%) | 2,585 (76·8%) |
| < 90 | 6,172 (24·2%) | 4,017 (21·9%) | 741 (22·0%) |
| Unknown | 1,897 (7·4%) | 1,209 (6·6%) | 40 (1·2%) |
| Diagnosis (%) | | | |
| Aggressive Lymphoma* | 1,457 (5·7%) | 1,062 (5·8%) | 211 (6·3%) |
| Acute Lymphoblastic Leukemia^ | 3,474 (13·6%) | 2,699 (14·7%) | 561 (16·7%) |
| Acute Myeloid Leukemia | 11,881 (46·5%) | 8,248 (44·9%) | 1,578 (46·9%) |
| Chronic Lymphocytic Leukemia | 564 (2·2%) | 768 (4·2%) | 54 (1·6%) |
| Hodgkin’s Disease | 741 (2·9%) | 599 (3·3%) | 189 (5·6%) |
| Indolent Lymphoma | 827 (3·2%) | 751 (4·1%) | 98 (2·9%) |
| Multiple Myeloma | 852 (3·3%) | 783 (4·3%) | 90 (2·7%) |
| Myelodysplastic Syndrome | 2,977 (11·7%) | 1,700 (9·3%) | 303 (9·0%) |
| Myeloproliferative Neoplasms$ | 2,761 (10·8%) | 1,755 (9·6%) | 282 (8·4%) |
| Cell source (%) | | | |
| Bone marrow | 2,126 (8·3%) | 2,261 (12·3%) | 1,128 (33·5%) |
| Peripheral blood | 22,943 (89·9%) | 15,514 (84·5%) | 2,198 (65·3%) |
| Cord blood | 465 (1·8%) | 590 (3·2%) | 40 (1·2%) |
| Donor (%) | | | |
| Matched related | 8,492 (33·3%) | 6,718 (36·6%) | 979 (29·1%) |
| Haploidentical relative | 1,762 (6·9%) | 989 (5·4%) | 897 (26·6%) |
| Matched unrelated (10/10) | 7,689 (30·1%) | 5,679 (30·9%) | 747 (22·2%) |
| Mismatched unrelated (< 10/10) | 2,071 (8·1%) | 1,988 (10·8%) | 446 (13·3%) |
| Unknown match unrelated | 5,055 (19·8%) | 2,401 (13·1%) | 257 (7·6%) |
| Unrelated cord blood | 465 (1·8%) | 590 (3·2%) | 40 (1·2%) |
| Sex-match (%) | | | |
| Not female-to-male | 20,432 (80·0%) | 14,359 (78·2%) | 2,577 (76·6%) |
| Female-to-male | 4,565 (17·9%) | 3,586 (19·5%) | 707 (21·0%) |
| Unknown sex-match | 537 (2·1%) | 420 (2·3%) | 82 (2·4%) |
| CMV serostatus pair (%) | | | |
| Donor − / Recipient − | 6,777 (26·5%) | 4,408 (24·0%) | 346 (10·3%) |
| Donor − / Recipient + | 5,330 (20·9%) | 4,234 (23·1%) | 843 (25·0%) |
| Donor + / Recipient − | 2,098 (8·2%) | 1,537 (8·4%) | 233 (6·9%) |
| Donor + / Recipient + | 10,289 (40·3%) | 7,421 (40·4%) | 1,782 (52·9%) |
| CMV unknown | 1,040 (4·1%) | 765 (4·2%) | 162 (4·8%) |
| Conditioning intensity (%) | | | |
| Myeloablative | 12,476 (48·9%) | 9,964 (54·3%) | 2,494 (74·1%) |
| Reduced-intensity | 12,400 (48·6%) | 8,050 (43·8%) | 856 (25·4%) |
| Unknown | 658 (2·6%) | 351 (1·9%) | 16 (0·5%) |
~ Italian centers

* Includes B-cell and T-cell non-Hodgkin’s lymphomas

^ Includes patients with biphenotypic acute leukaemia

Includes patients with mantle cell lymphoma

$ Includes patients with MDS/MPN Overlap

IQR – interquartile range; CMV – cytomegalovirus

On the derivation set, 55 possible disease/disease status-based combinations were studied in a multivariable Cox model. De-novo AML in first CR with intermediate-risk cytogenetics and without the FLT3-ITDpos/NPM1wt mutations was selected as the reference. The proportional increase of the adjusted hazard, relative to this reference, is shown in Figure 2 (appendix p. 3). To generate a prognostic scheme, which we refer to as the “Disease-Risk Stratification System” (DRSS), disease/disease-status pairs were grouped by β-coefficient and thresholds were selected as described above. Several possible risk grouping schemes were generated on the derivation cohort and evaluated on the tuning cohort. Based on the criteria described in the methods, the selected five-level scheme classified 3,298 (12·9%), 9,528 (37·3%), 7,072 (27·7%), 2,546 (10·0%), and 3,090 (12·1%) of the total 25,534 patients in the derivation cohort as low, intermediate-1, intermediate-2, high, and very-high risk, respectively. Over the derivation set, risk-levels were associated with a monotonic independent increase in the HR for overall mortality (Table 2, appendix p. 5). This risk corresponded to unadjusted 2-year OS rates of 72·4% (95% CI: 70·7, 74·1), 64·1% (62·9, 65·2), 57·6% (56·3, 58·9), 47·6% (45·4, 49·8), and 36·2% (34·3, 38·2; Figure 3A, appendix p. 6). The increasing risk of transplantation failure was driven primarily by relapse (appendix p. 6–8).

Figure 2. The Disease-Risk Stratification System (DRSS) scheme.


The hazard ratios for overall-survival of multiple diseases and disease-status combinations over the derivation set are plotted. Combinations were divided into five risk groups.

AML – acute myeloid leukaemia; CR – complete remission; CLL – chronic lymphocytic leukaemia; ALL – acute lymphoblastic leukaemia; Ph – Philadelphia chromosome; CML – chronic myelogenous leukaemia; FLT3 - FMS-like tyrosine kinase 3; ITD - internal tandem duplication; NPM1 – Nucleophosmin 1; Int – intermediate; cyto – cytogenetics; AL – acute leukaemia; MDS – myelodysplastic syndrome; B-NHL – b-cell Non-Hodgkin’s Lymphoma; PD – progressive disease; SD – stable disease; T-NHL – t-cell non-Hodgkin’s lymphoma; MM – multiple myeloma; VGPR – very good partial response; PR – partial response; MPN – myeloproliferative neoplasm

Table 2.

Hazard ratios for Overall Survival by DRSS* stratum

| DRSS stratum | Derivation cohort: HR (95% CI) | p-value | Tuning cohort: HR (95% CI) | p-value | Geographic validation cohort: HR (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Low | reference | | reference | | reference | |
| Intermediate-1 | 1·26 (1·17, 1·36) | < 0·0001 | 1·25 (1·14, 1·37) | < 0·0001 | 1·48 (1·17, 1·88) | 0·0011 |
| Intermediate-2 | 1·53 (1·42, 1·66) | < 0·0001 | 1·52 (1·38, 1·67) | < 0·0001 | 1·62 (1·27, 2·08) | < 0·0001 |
| High | 2·03 (1·86, 2·22) | < 0·0001 | 2·04 (1·83, 2·28) | < 0·0001 | 2·61 (2·01, 3·39) | < 0·0001 |
| Very High | 2·87 (2·63, 3·13) | < 0·0001 | 2·90 (2·62, 3·22) | < 0·0001 | 3·70 (2·88, 4·74) | < 0·0001 |

* The full multivariable results for OS, relapse incidence and non-relapse mortality are presented in appendix p. 5, 7–8

OS – overall survival; CI – confidence interval; HR – hazard ratio

Figure 3. Overall survival by DRSS over the derivation, tuning, and validation cohorts, and reclassification from the DRI to DRSS.


Overall survival by DRSS risk stratum is presented in the derivation (A), tuning (B), internal geographic validation (C), and external validation (D) cohorts. Panel (E) shows a mosaic plot comparing the distribution of risk assignments in the DRSS (y-axis) and revised DRI (x-axis) over the tuning and geographical validation cohorts. Column width represents the proportion of DRI categories. DRSS risk categories are colour-coded; their height represents assignment proportion. The majority of patients are categorised as intermediate risk by DRI and can be further stratified by the DRSS. Panel (F) shows overall survival among patients classified by the revised DRI as intermediate risk, in their respective DRSS risk categories.

DRSS – Disease-Risk Stratification System; DRI – Disease Risk Index (Armand et al, Blood 2014); L – low; I – intermediate; H – high; VH – very high

An online interface for calculating the DRSS category and providing disease/disease status specific unadjusted OS curves (appendix p. 18) is available (https://joshuafein.shinyapps.io/drss_calculator/).

The distribution of the DRSS over the tuning and geographical validation cohorts was similar to the derivation cohort. In the tuning cohort, low, intermediate-1, intermediate-2, high, and very high risk categories consisted of 1,653 (9·0%), 7,019 (38·2%), 5,655 (30·8%), 1,568 (8·5%), and 2,470 (13·4%) of 18,365 patients, compared to 328 (9·7%), 1,314 (39·0%), 894 (26·6%), 342 (10·2%), and 488 (14·5%) of 3,366 patients in the validation cohort. Risk of death and relapse also followed the same pattern as in the derivation cohort, with DRSS being the strongest predictor of survival (Figure 3BC, Table 2, appendix p. 5, 7–8). Over the tuning and geographical validation cohorts, the hazard ratios for overall mortality of intermediate-1, intermediate-2, high, and very high, relative to low-risk disease, were 1·25 (95% CI: 1·14, 1·37) and 1·48 (1·17, 1·88, p = 0·0011), 1·52 (1·38, 1·67) and 1·62 (1·27, 2·08), 2·04 (1·83, 2·28) and 2·61 (2·01, 3·39), and 2·90 (2·62, 3·22) and 3·70 (2·88, 4·72; all p-values < 0·0001 except as noted), respectively. In a sensitivity analysis including in-vivo T-cell depletion as a covariate in the Cox model (appendix p. 9), the DRSS remained an independent predictor of survival in the geographical validation cohort.

We studied an independent cohort of 660 patients transplanted at MSKCC. Population characteristics are provided in appendix p. 10. The majority of patients received peripheral blood CD34-selected allografts (361/660, 54·7%), all of whom received myeloablative conditioning.17 AML and MDS were the leading indications for transplantation, accounting for 207 (31·4%) and 115 (17·4%) of 660 patients. The median follow-up and completeness of 2-year follow-up were 5·7 years (IQR: 4·5, 7·1) and 98·8%, respectively. In this smaller cohort, there was no survival segregation between low and intermediate-1 and these groups were therefore merged. The condensed DRSS scheme separated patients into 4 distinct risk groups associated with increasing risk of mortality for intermediate-2 (HR 1·34 [95% CI 1·04, 1·74], p = 0·025), high (HR 2·03 [95% CI 1·39, 2·95], p = 0·00023) and very-high (HR 2·26 [95% CI 1·62, 3·15], p < 0·0001) when compared to the low/intermediate-1 group (appendix p. 11–12). The condensed scheme corresponded with 66·6% (95% CI: 61·6, 71·6), 55·4% (48·3, 62·5), 40·4% (27·0, 53·7), and 34·8% (23·4, 46·3) 2-year OS (Figure 3D). The DRSS remained an independent predictor of mortality after adjusting for ex vivo CD34+ cell depletion in a sensitivity analysis (appendix p. 13).
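The condensed four-level scheme amounts to a simple relabelling of the five DRSS levels. The Python sketch below illustrates the merge; the function names are hypothetical, and the counts used in the example are the tuning-cohort counts reported above, chosen purely for illustration because per-level counts for the MSKCC cohort are not given in the text.

```python
DRSS_LEVELS = ("low", "intermediate-1", "intermediate-2", "high", "very high")


def condense(level: str) -> str:
    """Map a five-level DRSS label to the condensed four-level scheme."""
    if level not in DRSS_LEVELS:
        raise ValueError(f"unknown DRSS level: {level!r}")
    if level in ("low", "intermediate-1"):
        return "low/intermediate-1"
    return level


def condense_counts(counts: dict) -> dict:
    """Sum patient counts per condensed level, preserving risk order."""
    merged = {}
    for level in DRSS_LEVELS:
        key = condense(level)
        merged[key] = merged.get(key, 0) + counts.get(level, 0)
    return merged


# Tuning-cohort counts from the text, used here only as an example.
tuning = {
    "low": 1_653,
    "intermediate-1": 7_019,
    "intermediate-2": 5_655,
    "high": 1_568,
    "very high": 2_470,
}
print(condense_counts(tuning))
```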

The DRI has a total of 44 categories vs. 55 in the DRSS (appendix p. 14). Including only patients who could be classified according to both systems, the AUC for 2-year OS with DRSS and DRI on the derivation, tuning, geographic, and the external cohorts was 61·0 vs. 58·9, 61·6 vs. 59·27, 63·3 vs. 62·0, and 61·6 vs. 60·9. In the derivation set, 709 (2·8%) of 25,534 patients were unclassifiable by the DRI but were captured in the DRSS. Across the derivation, tuning, and geographical validation cohorts, the DRI classified 16,510 (68%) of 25,534, 11,889 (67%) of 18,365, and 2,152 (69%) of 3,366 patients as intermediate risk, respectively. Corresponding estimates of 2-year OS were 62·5% (95% CI: 61·6, 63·3), 61·5% (60·6, 62·5), and 64·8% (62·8, 66·9). The DRSS reclassified intermediate-risk DRI patients, with 855 (6·0%) low, 7,111 (50·6%) intermediate-1, 5,700 (40·6%) intermediate-2, and 375 (2·7%) high/very high of 14,041 patients in a sub-analysis combining the tuning and internal geographic validation cohorts (Figure 3E). In the external-validation cohort, of 408 intermediate-risk DRI patients, 268 (65·7%) were classified as low/intermediate-1 and 140 (34·3%) as intermediate-2 (appendix p. 38). Across cohorts, the reclassified DRSS tiers within the DRI intermediate-risk group were associated with distinct survival trajectories (Figure 3F, appendix p. 38). For instance, in the tuning and geographical validation cohort, patients categorised as intermediate-risk DRI had an estimated 2-year OS probability of 62·1% (95% CI: 61·2, 62·9); by DRSS, however, the same group was segregated into the five risk-groups with 2-year OS of 73·1% (70·1, 76·2), 64·4% (63·2, 65·6), 58·5% (57·1, 59·9), 45·5% (38·7, 52·4) and 45·7% (37·4, 54·0) for low, intermediate-1, intermediate-2, high and very high strata, respectively.

DISCUSSION

Survival following allogeneic HSCT is heavily dependent on the histological diagnosis and remission status at the time of transplantation.2,3 Based on these two features and additional molecular and cytogenetic data, we have constructed the DRSS, a novel risk-stratification system. The DRSS was developed and optimised on an international cohort of 43,899 patients transplanted between 2012 and 2016. It includes 15 diagnoses with a total of 55 levels grouped into five risk strata. It was validated in two hold-out datasets, one internal from the EBMT and one external from a single-centre US cohort. Across all populations, DRSS was the most important determinant of survival. Increasing risk of death with each level was primarily driven by relapse, suggesting that disease biology drives classification.

AML is the leading indication for allogeneic HSCT.1 In the DRI,2,3 AML patients in complete remission were grouped without respect to CR order. De-novo and secondary AML were considered as a single entity and FLT3-ITD and NPM1 mutation status were not included. AML categories have been refined in the DRSS. In agreement with Granfeldt et al., patients with de-novo AML had superior survival when compared to those with secondary AML (appendix p. 17).11 We did not account for secondary AML aetiology (therapy vs. antecedent haematological disorder-related), which may have further segregated patients.18 Nevertheless, a distinction between de-novo and secondary AML is informative and was therefore included. De-novo AML was further sub-classified based on the numerical order of remission; transplantation in CR1 was associated with better survival than in later CR. Notably, among patients with intermediate-risk cytogenetics who were in either CR1 or later CR, molecular markers had a major prognostic role. In CR1, FLT3-ITDpos/NPM1wt defined a distinct group with inferior survival, while any of the three remaining possible combinations of FLT3-ITD and NPM1 mutation status resulted in overlapping survival and were aggregated. The lack of separation between molecular subtypes may reflect selection bias since current guidelines suggest that AML FLT3-ITDneg/NPM1mut should be treated with consolidative chemotherapy.12 Latent covariates, such as initial induction failure and measurable residual disease (MRD), which are not captured in the registry, could result in an increased risk of recurrence and referral for transplantation in CR1. In subsequent CR, the sample size did not allow exploration of NPM1 role. FLT3-ITD status strongly stratified patients, to the extent that patients with FLT3-ITD AML had poor outcome similar to that observed with adverse-risk cytogenetics (appendix p. 17). Therefore, we grouped them into one category. 
FLT3-ITD and NPM1 mutational status are central determinants of AML therapeutic strategy.12 Their incorporation into transplantation risk schemes has lagged because standardised reporting to registries began only recently. Including these markers in the DRSS reflects a contemporary strategy to assess risk in AML patients undergoing allogeneic HSCT.

Prognostication in ALL is a moving target as practice is evolving. Philadelphia-positive ALL, which did not have a distinct prognosis from Ph- ALL in the revised-DRI,2,3 is among the best performing entities in the DRSS. This improvement likely reflects the routine clinical use of pre- and post-transplantation TKIs,19 which is only partially captured in the registry. Risk-estimation in ALL will continue to change as more data on MRD and prior therapies accumulate.20 Progress in the care of other haematological malignancies will change the profile of patients coming into transplantation. Outcomes of transplantation, as an advanced treatment line, may ultimately be worse than historical controls. In patients with aggressive B-cell or T-cell NHL, our findings suggest that achieving a CR before transplantation is imperative, as patients with less-optimal disease control were at high or very-high risk of mortality. Novel targeted therapies offer new hope in these populations.21,22 Overall, care in lymphoid malignancies is evolving. Therefore, adjustment of transplant indications and risk-assessment will be required in the coming decade.

Risk-grouping is inherently linked to loss of prognostic information.23 Acknowledging this limitation, risk-categorisation is clinically useful and facilitates comparative studies across heterogeneous populations. On the derivation set, the large sample size allowed for stratification of patients into five risk-groups, which was maintained in the tuning and geographical validation cohorts. The difference in risk between the low and intermediate-1 risk groups was small, albeit statistically significant. In the single-centre validation cohort, the first two levels had an overlapping risk for overall mortality. The difference may be related to the smaller sample size or major differences between the cohorts’ features, namely, the common application of CD34-selected grafts in the external cohort, as well as its lack of cord-blood or haploidentical transplants. Importantly, the prognostic utility of DRSS held after adjusting for T-cell depletion in the internal and external validation cohorts (appendix p. 9, 13), suggesting it is platform-independent. Since transplantation studies are often limited by sample size, as is the case in the external validation cohort, a four-level rather than five-level scheme, merging low and intermediate-1, may be appropriate in such scenarios.

Prognostic classification is not fixed and should account for emerging data in the field. The DRSS builds upon the scaffolds of the DRI,2,3 which has facilitated analyses of heterogeneous cohorts. However, at least 60% of patients in our cohort and others3 fall under the intermediate-risk group with DRI, limiting its utility.24 The new scheme has more balanced grouping even when considered as four, rather than five, levels. Patients at the extremes may be candidates for strategies aimed at improving disease control (high risk) or decreasing transplant toxicity (low risk). The DRSS has a slightly higher AUC than the DRI. However, similar to the DRI, it aims to stratify risk rather than provide an individualised estimate of survival. Therefore, AUC is not an optimal metric to evaluate either model. Overall, prognostic tools such as the DRI and DRSS provide an estimate of the relative risk and not outcome probabilities. Therefore, they should be used for risk stratification. To provide accurate probabilistic estimates of post-transplant events, prediction models should be developed on disease-specific cohorts and include granular information regarding patient, disease, and treatment features.

Diagnosis and disease status are among the primary drivers of treatment success.1–4 As a result, it is a challenge to investigate allogeneic HSCT outcomes, because studies often include cohorts with a wide range of indications.1,25–27 To account for this heterogeneity, we propose the DRSS for estimating the risk of disease-associated mortality. The system was developed on one of the largest cohorts ever used in a transplantation study and was rigorously validated, demonstrating its applicability in a wide range of settings. Nevertheless, it would benefit from independent validation in additional cohorts with differing practices, disease distribution, and transplant centre experience. Importantly, HSCT prognostication is continuously changing as care of haematological malignancies improves.1 Therefore, the DRSS should be updated over time as new markers are introduced to registries. Future versions will optimally include a more comprehensive set of molecular and cytogenetic markers and MRD, which has not been routinely captured in registries until recently.20,28–30 A similarly-constructed system for paediatric transplantation would also be valuable and would necessitate including non-malignant transplant indications. We see several applications for the DRSS. First, it can be used to facilitate interpretation and analysis of prospective and retrospective studies, including cohorts with mixed transplant indications. Second, DRSS can serve as a benchmark for transplantation studies and informed consent discussions, as it was developed and tested in nearly 50,000 patients. We provide an interactive interface for clinicians and investigators to further explore our findings. Finally, the DRSS promotes the design of non-disease-specific trials (e.g., studies of conditioning regimens or GVHD prophylaxis), opening the door to broader, more inclusive populations.

Supplementary Material

Supplement

Figure 4. Reclassification of the DRI intermediate risk group by the DRSS.

A mosaic plot comparing the distribution of risk assignments in the DRSS and revised DRI over the tuning and geographical validation cohorts (A). Column width represents the proportion of DRI categories. Column height represents the proportion of DRSS categories. Most patients were categorised as intermediate risk by the DRI and were further stratified by the DRSS. Overall survival in patients classified by the revised DRI as intermediate risk, in their respective DRSS risk categories (B). DRI – disease-risk index; DRSS – disease-risk stratification system.
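
The quantities encoded in a mosaic plot of this kind can be sketched as a simple cross-tabulation. The example below is a minimal illustration, not the authors' code: the function name `mosaic_proportions`, the category labels, and the toy assignments are all hypothetical and do not reflect study data.

```python
from collections import Counter


def mosaic_proportions(pairs):
    """Compute mosaic-plot geometry from (dri_category, drss_category) pairs.

    Returns two dicts:
      widths  - share of all patients in each DRI category (column width)
      heights - share of each DRI column falling in each DRSS category
    """
    pairs = list(pairs)
    total = len(pairs)
    dri_counts = Counter(dri for dri, _ in pairs)
    joint_counts = Counter(pairs)
    widths = {dri: n / total for dri, n in dri_counts.items()}
    heights = {
        (dri, drss): n / dri_counts[dri]
        for (dri, drss), n in joint_counts.items()
    }
    return widths, heights


# Hypothetical toy assignments (10 patients); labels are placeholders.
toy = (
    [("low", "tier-1")] * 2
    + [("intermediate", "tier-1")] * 1
    + [("intermediate", "tier-2")] * 2
    + [("intermediate", "tier-3")] * 2
    + [("intermediate", "tier-4")] * 1
    + [("high", "tier-5")] * 2
)

widths, heights = mosaic_proportions(toy)
for dri, width in widths.items():
    print(dri, round(width, 2))
```

In this toy input the intermediate DRI column is the widest and is split across four placeholder tiers, mirroring the kind of reclassification the figure describes.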

Research in context.

Evidence before this study

The success of allogeneic haematopoietic stem cell transplantation depends heavily on disease histology, genetic and cytogenetic features, and the disease status at the time of transplantation. Harmonisation of transplantation indications, with regard to disease-related risk of mortality, is essential for risk stratification and analyses in studies including populations with mixed diagnoses. Furthermore, it provides a means to estimate the clinical benefit of the transplant. We searched PubMed for the terms ((“disease risk”) OR (“disease status”)) AND (allogeneic stem cell transplantation) in reports published in any language from inception up to August 9, 2020 to identify relevant published clinical data. We found several prognostic systems that incorporate disease and disease status to stratify patients by the risk of overall mortality. The revised Disease Risk Index (DRI; Armand et al., Blood 2014) was the most comprehensive and widely used. The DRI has been successfully applied in retrospective and prospective studies. However, it was developed on transplants performed between 2008 and 2010 and does not reflect subsequent changes in patient and disease profiles. In addition, it does not capture informative features in acute myeloid leukaemia such as disease origin, molecular features such as FLT3 and NPM1, and specific cytogenetic aberrations that are known to impact outcome. Finally, across studies implementing the DRI, approximately 70% of patients are classified as intermediate-risk. This limits the system’s discrimination between truly high-risk and low-risk patients, who may benefit from targeted interventions.

Added value of this study

In this large international registry-based study, we developed and internally and externally validated a disease-risk stratification system for overall mortality, grouping 55 combinations of disease and disease status into five risk tiers. An increasing tendency to relapse drives the incremental risk of mortality between tiers. To our knowledge, this is the first global prognostic system that subdivides acute myeloid leukaemia, the leading allogeneic transplant indication, by ontogeny (de-novo vs. secondary), cytogenetics, and FLT3 and NPM1 mutational status. The new system reclassifies patients previously considered to have intermediate-risk disease by the DRI into finer, potentially actionable, prognostic categories. Finally, this is the most comprehensive and recent prognostic system for patients with haematological malignancies undergoing allogeneic transplantation. It was developed, optimised, and validated on 47,925 patients, highlighting its robustness and generalisability.

Implications of all the available evidence.

Our system reflects an up-to-date approach for risk-stratification in patients with haematological malignancies undergoing allogeneic haematopoietic stem cell transplantation. It facilitates the interpretation and analysis of prospective and retrospective studies with heterogeneous cohorts, promoting the design of non-disease-specific trials with broader, more inclusive populations. The system should also serve the medical community as a benchmark for transplantation outcomes in the coming years. Our approach lays the foundation for further iterations of this prognostic system which will incorporate more detailed molecular information and data on measurable residual disease, both of which are increasingly being captured in transplantation registries.

Acknowledgments

The study was supported by grants from The Varda and Boaz Dotan Research Center in Haemato-Oncology affiliated with the Cancer Biology Research Center of Tel Aviv University.

We would like to thank Ms. Emmanuelle Polge, Study Office Operations Manager for the Acute Leukemia Working Party of the EBMT, as well as all the site investigators and patients who were included in this study. Dr. Shouval is a member of the Dr. Pinchas Bornstein Talpiot Medical Leadership Program, Chaim Sheba Medical Center, Ramat-Gan, Israel.

Declaration of interests

Author disclosures are provided below; none represents a potential conflict of interest with the results presented in the manuscript.

FB has received travel grants from Celgene, AbbVie, Novartis, Pfizer and Sanofi and speaker honoraria from AbbVie.

GB received personal fees from Jazz, Celgene, Hexal, Novartis, Pfizer, Eurocept, Gilead, and Sanofi. She received travel grants from Celgene, Gilead, Sanofi, and Neovii.

SM serves on a data monitoring committee of Bayer. She received a travel grant from Gilead and a personal speaker fee from Janssen.

RO has received royalties following licensure of the EBV-specific T-cell bank by Atara Biotherapeutics and has subsequently received research support and consultant fees from Atara Biotherapeutics.

SGr received research funding from Amgen, Actinium, Celgene, Johnson & Johnson, Miltenyi, Takeda, and Omeros. He served on the advisory boards of Amgen, Actinium, Celgene, Johnson & Johnson, Janssen, Jazz Pharmaceuticals, Takeda, Novartis, Kite, and Spectrum Pharma.

CSS served as a paid consultant on advisory boards for: Juno Therapeutics, Sanofi-Genzyme, Spectrum Pharmaceuticals, Novartis, Genmab, Precision Biosciences, Kite/Gilead Company, Celgene/BMS, Gamida Cell, Karyopharm Therapeutics and GSK. He has received research funds for clinical trials from: Juno Therapeutics, Celgene/BMS, Bristol-Myers Squibb, Precision Biosciences and Sanofi-Genzyme.

SGr received research funding from, and serves on the scientific advisory boards of, Celgene, Janssen, BMS, Sanofi, Actinium, Amgen, Pfizer, GSK, and Jazz.

MAP received personal fees from AbbVie, Bellicum, Bristol-Myers Squibb, Celgene, Cidara Therapeutics, Incyte, Kite/Gilead, Medigene, Miltenyi, MolMed, Nektar Therapeutics, NexImmune, Novartis, Omeros, Merck, Servier, and Takeda. He serves on the Data Safety and Monitoring Boards of Cidara Therapeutics, Medigene, and Servier. He received clinical trial support from Incyte, Kite/Gilead, and Miltenyi.

MM received personal fees from Sanofi, Jazz, Amgen, Takeda, Novartis, Janssen, Celgene, Adaptive Biotechnologies, Astellas, Pfizer, Stemline, and GSK. He received grant support from Sanofi, Jazz, and Janssen.

Data sharing statement

Data may be made available through the senior author arnon.nagler@sheba.health.gov.il.

REFERENCES

  • 1. Shouval R, Fein JA, Labopin M, et al. Outcomes of allogeneic haematopoietic stem cell transplantation from HLA-matched and alternative donors: a European Society for Blood and Marrow Transplantation registry retrospective analysis. The Lancet Haematology 2019; 6(11): e573–e84.
  • 2. Armand P, Gibson CJ, Cutler C, et al. A disease risk index for patients undergoing allogeneic stem cell transplantation. Blood 2012; 120(4): 905–13.
  • 3. Armand P, Kim HT, Logan BR, et al. Validation and refinement of the Disease Risk Index for allogeneic stem cell transplantation. Blood 2014; 123(23): 3664–71.
  • 4. Shouval R, Labopin M, Bondi O, et al. Prediction of Allogeneic Hematopoietic Stem-Cell Transplantation Mortality 100 Days After Transplantation Using a Machine Learning Algorithm: A European Group for Blood and Marrow Transplantation Acute Leukemia Working Party Retrospective Data Mining Study. J Clin Oncol 2015; 33(28): 3144–51.
  • 5. Gratwohl A, Stern M, Brand R, et al. Risk score for outcome after allogeneic hematopoietic stem cell transplantation: a retrospective analysis. Cancer 2009; 115(20): 4715–26.
  • 6. Parimon T, Au DH, Martin PJ, Chien JW. A risk score for mortality after allogeneic hematopoietic cell transplantation. Annals of internal medicine 2006; 144(6): 407–14.
  • 7. Shouval R, Fein JA, Shouval A, et al. External validation and comparison of multiple prognostic scores in allogeneic hematopoietic stem cell transplantation. Blood Advances 2019; 3(12): 1881–90.
  • 8. Milano F, Gooley T, Wood B, et al. Cord-blood transplantation in patients with minimal residual disease. The New England journal of medicine 2016; 375: 944–53.
  • 9. Kasamon YL, Bolanos-Meade J, Prince GT, et al. Outcomes of Nonmyeloablative HLA-Haploidentical Blood or Marrow Transplantation With High-Dose Post-Transplantation Cyclophosphamide in Older Adults. J Clin Oncol 2015; 33(28): 3152–61.
  • 10. Li Z, Labopin M, Ciceri F, et al. Haploidentical transplantation outcomes for secondary acute myeloid leukemia: Acute Leukemia Working Party (ALWP) of the European Society for Blood and Marrow Transplantation (EBMT) study. American Journal of Hematology 2018; 93(6): 769–77.
  • 11. Granfeldt Østgård LS, Medeiros BC, Sengeløv H, et al. Epidemiology and clinical significance of secondary and therapy-related acute myeloid leukemia: a national population-based cohort study. Journal of Clinical Oncology 2015; 33(31): 3641–9.
  • 12. Döhner H, Estey E, Grimwade D, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 2017; 129(4): 424–47.
  • 13. Moons KG, Altman DG, Reitsma JB, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): explanation and elaboration. Annals of internal medicine 2015; 162(1): W1–W73.
  • 14. Hothorn T, Zeileis A. Generalized Maximally Selected Statistics. Biometrics 2008; 64(4): 1263–9.
  • 15. Blanche P, Dartigues J-F, Jacqmin-Gadda H. Estimating and comparing time-dependent areas under receiver operating characteristic curves for censored event times with competing risks. Stat Med 2013; 32(30): 5381–97.
  • 16. Clark TG, Altman DG, De Stavola BL. Quantification of the completeness of follow-up. The Lancet 2002; 359(9314): 1309–10.
  • 17. Tamari R, Oran B, Hilden P, et al. Allogeneic Stem Cell Transplantation for Advanced Myelodysplastic Syndrome: Comparison of Outcomes between CD34(+) Selected and Unmodified Hematopoietic Stem Cell Transplantation. Biol Blood Marrow Transplant 2018; 24(5): 1079–87.
  • 18. Boddu P, Kantarjian HM, Garcia-Manero G, et al. Treated secondary acute myeloid leukemia: a distinct high-risk subset of AML with adverse prognosis. Blood Adv 2017; 1(17): 1312–23.
  • 19. Giebel S, Czyz A, Ottmann O, et al. Use of tyrosine kinase inhibitors to prevent relapse after allogeneic hematopoietic stem cell transplantation for patients with Philadelphia chromosome–positive acute lymphoblastic leukemia: a position statement of the Acute Leukemia Working Party of the European Society for Blood and Marrow Transplantation. Cancer 2016; 122(19): 2941–51.
  • 20. Berry DA, Zhou S, Higley H, et al. Association of Minimal Residual Disease With Clinical Outcome in Pediatric and Adult Acute Lymphoblastic Leukemia: A Meta-analysis. JAMA Oncology 2017; 3(7): e170580.
  • 21. Gauthier J, Hirayama AV, Purushe J, et al. Feasibility and efficacy of CD19-targeted CAR T cells with concurrent ibrutinib for CLL after ibrutinib failure. Blood 2020; 135(19): 1650–60.
  • 22. Locke FL, Ghobadi A, Jacobson CA, et al. Long-term safety and activity of axicabtagene ciloleucel in refractory large B-cell lymphoma (ZUMA-1): a single-arm, multicentre, phase 1–2 trial. Lancet Oncol 2019; 20(1): 31–42.
  • 23. Royston P, Altman DG. External validation of a Cox prognostic model: principles and methods. BMC medical research methodology 2013; 13(1): 33.
  • 24. Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Statistics in medicine 2008; 27(2): 157–72.
  • 25. Zeiser R, von Bubnoff N, Butler J, et al. Ruxolitinib for Glucocorticoid-Refractory Acute Graft-versus-Host Disease. The New England journal of medicine 2020; 382(19): 1800–10.
  • 26. Jagasia M, Perales MA, Schroeder MA, et al. Ruxolitinib for the treatment of steroid-refractory acute GVHD (REACH1): a multicenter, open-label phase 2 trial. Blood 2020; 135(20): 1739–49.
  • 27. Bolanos-Meade J, Reshef R, Fraser R, et al. Three prophylaxis regimens (tacrolimus, mycophenolate mofetil, and cyclophosphamide; tacrolimus, methotrexate, and bortezomib; or tacrolimus, methotrexate, and maraviroc) versus tacrolimus and methotrexate for prevention of graft-versus-host disease with haemopoietic cell transplantation with reduced-intensity conditioning: a randomised phase 2 trial with a non-randomised contemporaneous control group (BMT CTN 1203). The Lancet Haematology 2019; 6(3): e132–e43.
  • 28. Hourigan CS, Dillon LW, Gui G, et al. Impact of Conditioning Intensity of Allogeneic Transplantation for Acute Myeloid Leukemia With Genomic Evidence of Residual Disease. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2020; 38(12): 1273–83.
  • 29. Buckley SA, Wood BL, Othus M, et al. Minimal residual disease prior to allogeneic hematopoietic cell transplantation in acute myeloid leukemia: a meta-analysis. Haematologica 2017; 102(5): 865–73.
  • 30. Nagler A, Baron F, Labopin M, et al. Measurable residual disease (MRD) testing for acute leukemia in EBMT transplant centers: a survey on behalf of the ALWP of the EBMT. Bone marrow transplantation 2020.
