Abstract
Graft-versus-host disease (GVHD)-free, relapse-free survival (GRFS) represents complete, ideal recovery after allogeneic hematopoietic cell transplantation (HCT). However, as originally proposed, this composite endpoint does not account for the possibility that HCT complications may improve after treatment. To more accurately estimate survival with response to GVHD and relapse after HCT, we developed a dynamic multistate GRFS (dGRFS) model with outcomes data from 949 patients undergoing their first allogeneic HCT for hematologic malignancy at the University of Minnesota. Because some patients were successfully treated for GVHD and relapse, dGRFS was higher than the originally defined time-to-event GRFS at 1 year (37.0% versus 27.6%) through 4 years (37.4% versus 22.2%). Mean survival without failure events was .52 years (95% confidence interval, .45 to .58 years) greater in dGRFS compared with the originally defined GRFS. Patient age (P < .001), disease risk (P < .001), conditioning intensity (P = .007), and donor type (P = .003) all significantly influenced dGRFS. The multistate model of dGRFS closely estimates the continuing and prevalent severe morbidity and mortality of allogeneic HCT. To serve the greater HCT community in more accurately modeling recovery from transplantation, we provide our R code for determination of dGRFS with annotations in Supplementary Materials.
Keywords: Allogeneic hematopoietic cell transplantation, Graft-versus-host disease, Relapse, GRFS, Multistate modeling
INTRODUCTION
Describing the outcomes of allogeneic hematopoietic cell transplantation (HCT) with a binary endpoint (survival versus death) is inadequate to capture the complex patient experience. The most successful HCT procedures are characterized by tolerable conditioning regimens that do not result in severe organ damage, permit rapid engraftment and immune reconstitution so patients do not succumb to infections or bleeding, and control malignant disease without excess alloreactivity in the form of acute or chronic graft-versus-host disease (GVHD). Many patients survive the HCT procedure but have ongoing morbidity. Capturing these complexities accurately is critical to understanding the impact of new modalities of HCT that may benefit one aspect of the procedure but worsen another; for example, stronger GVHD prophylaxis may impede immune reconstitution, leading to an elevated risk of fatal infection or relapse.
GVHD-free, relapse-free survival (GRFS) is a composite endpoint developed by the Blood and Marrow Transplantation Clinical Trials Network (BMT CTN) designed to capture the complex outcomes of HCT. GRFS represents recovery following HCT without severe morbidity (ie, alive, without relapse, and without having experienced grade III-IV acute GVHD or chronic GVHD requiring systemic immunosuppression) [1]. GRFS can compare clinically important outcomes that accompany different HCT techniques better than a binary endpoint. However, GRFS only recognizes the first of several possible events and does not consider events that may occur but then resolve with appropriate treatment (eg, relapse controlled with chemotherapy or donor lymphocyte infusion, GVHD that responds completely to steroids). Therefore, models that more closely follow the trajectory of a population after resolution of relevant clinical complications (eg, alive with resolved acute or chronic GVHD) may be more informative of actual and durable recovery.
Multistate models are able to capture transitions between clinical states, modeling stochastic transitions between progression to, and improvement from, a finite number of conditions over time [2]. To more closely estimate the overall morbidity of allogeneic HCT for hematologic malignancies, we developed a multistate model that captures improvement of up to 2 occurrences of major complications of allogeneic HCT (Figure 1) and reflects the prevalence of failure events over time. We show that our dynamic GRFS (dGRFS) model, developed with data from both adult and pediatric HCT, is improved by approximately 10% compared with the conventional definition of GRFS owing to resolution of some important complications. Using this index, we demonstrate that nearly two-thirds of recipients of HCT experience severe complications within the first year post-HCT, many of which persist over time, highlighting the need for continued efforts to mitigate the most common modes of failure: relapse and GVHD.
METHODS
Study Design
The objective of this study was to assess the clinical benefit of allogeneic HCT using a multistate composite endpoint of dGRFS and to determine the clinical factors associated with dGRFS success or failure. GRFS events were defined as grade III-IV acute GVHD (as a maximum, not onset grade), chronic GVHD requiring systemic immunosuppressive treatment, disease relapse, or death from any cause during the first 4 years after allogeneic HCT. Relapse was defined as morphological evidence of hematologic malignancy. Improvement of the event to remove a patient from a disease state was defined as follows: the date of response to immunosuppressive therapy for grade III-IV acute GVHD (both complete response [CR] and partial response [PR] were included based upon previous data showing no negative impact of PR on non-relapse mortality [3]); the date that clinical manifestations resolved to the point where systemic immunosuppression was not required for chronic GVHD; and for patients with recurrent malignancy, the date of achievement of a complete response to treated relapse. We also compared dGRFS with recently described current GRFS, which omitted acute GVHD as an event and considered chronic GVHD a dynamic event [4].
The study sample included 949 consecutive pediatric and adult allogeneic HCT recipients from the University of Minnesota who underwent HCT for malignant disease between 2000 and 2013 and includes the outcomes of 907 patients described in our original GRFS report [1]. This period was selected to maximize capture of 4-year outcomes for the surviving patient population. Only first allogeneic HCT procedures were included in this analysis. Eligible patients included recipients of grafts from an 8/8 HLA allele-matched sibling, an 8/8 HLA-matched unrelated donor (URD), or a single or double umbilical cord blood (UCB) graft. Patients were excluded if they were recipients of grafts from HLA-mismatched siblings, HLA-mismatched URD, or haploidentical donors because of their infrequency in our population during this study period.
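The conventional time-to-first-event definition described above can be sketched in a few lines. This is a minimal illustration only, not the authors' R code: the function name, argument names, and the handling of the 4-year horizon are assumptions, and all times are taken to be years since HCT.

```python
from typing import Optional, Tuple

HORIZON_YEARS = 4.0  # follow-up window stated in the study design


def conventional_grfs(
    followup: float,
    agvhd_3_4: Optional[float] = None,
    cgvhd_systemic: Optional[float] = None,
    relapse: Optional[float] = None,
    death: Optional[float] = None,
) -> Tuple[float, int]:
    """Return (time, event) for time-to-first-event GRFS.

    All arguments are years since HCT; None means the event was not observed.
    event is 1 if any GRFS event occurred within the horizon, 0 if censored.
    """
    observed = [t for t in (agvhd_3_4, cgvhd_systemic, relapse, death) if t is not None]
    first_event = min(observed) if observed else None
    censor_time = min(followup, HORIZON_YEARS)
    if first_event is not None and first_event <= censor_time:
        return first_event, 1
    return censor_time, 0


# Hypothetical patient: grade III-IV acute GVHD at 0.2 years, relapse at 1.5 years.
time, event = conventional_grfs(3.0, agvhd_3_4=0.2, relapse=1.5)
```

Under this definition the hypothetical patient fails at 0.2 years even if the acute GVHD later responds to therapy, which is the limitation the dynamic model is intended to address.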
Patient and Treatment Characteristics
Clinical factors examined included year of transplantation (2000–2007 or 2008–2013), previous autologous HCT, age (<21 versus 21+ years), sex, diagnosis (acute lymphoblastic leukemia, acute myelogenous leukemia [AML], myelodysplastic syndrome/myeloproliferative neoplasms/chronic myelogenous leukemia, non-Hodgkin lymphoma/Hodgkin lymphoma/chronic lymphocytic leukemia, or other malignancy), cytomegalovirus serostatus, conditioning (myeloablative conditioning with or without total body irradiation, reduced-intensity conditioning [RIC] with or without antithymocyte globulin), GVHD prophylaxis (cyclosporine or tacrolimus with either methotrexate [MTX] or mycophenolate mofetil [MMF], sirolimus, or T cell depletion), donor type (matched sibling donor [MSD], matched URD, UCB), stem cell source (bone marrow [BM], peripheral blood stem cells [PBSCs], or cord blood), and disease risk (standard or high risk). Disease risk was classified as standard risk or high risk based on the American Society for Blood and Marrow Transplantation’s Request for Information 2006 risk scoring schema. Disease-free survival was defined as the time from transplantation to relapse of the underlying malignancy for which the transplantation was performed or death. Overall survival (OS) was defined as the time from transplantation to death. All patients or their parents/guardians signed written informed consent allowing the use of their medical data in clinical research. All HCT and data collection protocols were reviewed and approved by the University of Minnesota Institutional Review Board.
Statistical Analysis
OS, disease-free survival, and conventional GRFS were estimated by the Kaplan-Meier method. Estimates of the current GRFS curve were obtained following the method of Solomon et al [4]. The model for calculating the dGRFS curve is based on 13 possible health states after transplantation (Figure 1), which accommodate 2 possible episodes of relapse, acute GVHD, and chronic GVHD, and their improvement:
State 0: Alive and in first post-transplantation remission
State 1: Dead
State 2: Alive, in first episode of relapse or acute/chronic GVHD
State 3: Dead, in first episode of relapse or acute/chronic GVHD
State 4: Alive, in first episode of relapse with concurrent acute/chronic GVHD
State 5: Dead, in first episode of relapse with concurrent acute/chronic GVHD
State 6: Alive, back in remission without acute/chronic GVHD
State 7: Dead, back in remission without acute/chronic GVHD
State 8: Alive, in second episode of relapse or acute/chronic GVHD
State 9: Dead, in second episode of relapse or acute/chronic GVHD
State 10: Alive, in second episode of relapse and acute/chronic GVHD
State 11: Dead, in second episode of relapse and acute/chronic GVHD
State 12: Alive, in remission and without acute/chronic GVHD
The dGRFS model treats acute GVHD, chronic GVHD, and relapse as dynamic events. dGRFS is the probability that a patient is in state 0, 6, or 12, corresponding to survival in remission and without ongoing acute or chronic GVHD. The calculation of the dGRFS curve follows that of current GRFS described by Solomon et al [4], which is the linear combination of 5 Kaplan-Meier estimates, represented by the formula C(t) = S_1(t) + [S_2(t) - S_3(t)] + [S_4(t) - S_5(t)]. Here S_1(t) represents the survival function corresponding to the event of death, first relapse, or initial acute or chronic GVHD, which is the chance that a patient stays in state 0. For S_2(t), the event is the second episode of relapse or acute/chronic GVHD, or death before the second episode of relapse or acute/chronic GVHD. For S_3(t), the event is the end of the first episode of relapse and/or acute/chronic GVHD, or death before the end of the first episode of relapse and/or acute/chronic GVHD. The difference between S_2(t) and S_3(t), [S_2(t) - S_3(t)], is the probability of being in state 6. S_5(t) is OS with death as the event. For S_4(t), the event is the end of the second episode of relapse and/or acute/chronic GVHD, or death before the end of the second episode of relapse and/or acute/chronic GVHD. The difference between S_4(t) and S_5(t), [S_4(t) - S_5(t)], is the probability of being in state 12. In this dynamic event model, we use loops between states 2 and 4 and between states 8 and 10 to model the changes of multiple dynamic events within a continuous period. A patient could have developed and recovered from GVHD multiple times in each loop. The R code for the dGRFS model with annotations is included in Supplementary Materials.
We used resampling-based permutation tests to test the significance of the dGRFS difference at year 1 between different strata defined by clinical factors [5]. In each of the permutation tests, 10,000 permuted datasets were generated, which have the same survival data of the subjects but with the values of the clinical factor of interest randomly allocated to subjects. The 10,000 realizations of the difference in dGRFS between subgroups defined by the clinical factor approximate its distribution under the null hypothesis of no difference in dGRFS between subgroups of the factor. P values were obtained by finding the proportion of the 10,000 dGRFS differences greater than the observed value in the original data.
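To make the idea of combining Kaplan-Meier curves concrete, the following minimal sketch estimates the probability of being in state 6 as S_2(t) - S_3(t). It is written in Python rather than R and is not the authors' supplementary code; the function names and the four toy patients are hypothetical and chosen only so that the result can be checked by hand.

```python
from typing import List, Sequence, Tuple

Curve = List[Tuple[float, float]]


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> Curve:
    """Return [(event_time, survival just after that time)] in time order."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve: Curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone whose observed time equals t (events and censorings).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve


def survival_at(curve: Curve, t: float) -> float:
    """Evaluate the right-continuous step function at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def state6_probability(s2: Curve, s3: Curve, t: float) -> float:
    """P(first episode has ended, second not yet begun, alive) = S_2(t) - S_3(t)."""
    return survival_at(s2, t) - survival_at(s3, t)


# Toy data (years since HCT) for four hypothetical patients A-D.
# T3: end of first episode, or death before it ends. T2: onset of second episode, or death before it.
t3_times, t3_events = [4.0, 0.8, 0.9, 0.6], [0, 1, 1, 1]
t2_times, t2_events = [4.0, 4.0, 0.9, 1.5], [0, 0, 1, 1]

s3_curve = kaplan_meier(t3_times, t3_events)
s2_curve = kaplan_meier(t2_times, t2_events)
p_state6_year1 = state6_probability(s2_curve, s3_curve, 1.0)
```

In this toy example two of the four patients (B and D) have recovered from a first episode and not yet entered a second at 1 year, so the estimate is 0.5 up to floating-point error; by 2 years only patient B remains in that state.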
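The permutation procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the statistic here is a plain difference in proportions of patients in a dGRFS state at 1 year (ignoring censoring), the comparison is two-sided on the absolute difference, and all function and variable names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def difference_in_proportions(in_state: Sequence[int], groups: Sequence[int]) -> float:
    """Proportion in the dGRFS state for group 1 minus that for group 0."""
    a = [v for v, g in zip(in_state, groups) if g == 1]
    b = [v for v, g in zip(in_state, groups) if g == 0]
    return sum(a) / len(a) - sum(b) / len(b)


def permutation_p_value(
    in_state: Sequence[int],
    groups: Sequence[int],
    statistic: Callable[[Sequence[int], Sequence[int]], float],
    n_perm: int = 10_000,
    seed: int = 0,
) -> float:
    """Share of label permutations whose |statistic| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(statistic(in_state, groups))
    shuffled: List[int] = list(groups)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reallocate the clinical factor, keep outcomes fixed
        if abs(statistic(in_state, shuffled)) >= observed:
            n_extreme += 1
    return n_extreme / n_perm


# Toy data: 1 = alive in a dGRFS state at 1 year, 0 = not; groups are a binary factor.
toy_in_state = [1] * 10 + [0] * 10
toy_groups = [1] * 10 + [0] * 10
toy_p = permutation_p_value(toy_in_state, toy_groups, difference_in_proportions, n_perm=2000, seed=1)
```

Shuffling only the factor labels keeps each subject's outcome intact, which matches the description of permuted datasets that share the same survival data. A full analysis would replace the simple proportion with the dGRFS estimate at 1 year within each permuted subgroup.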
RESULTS
Patient Characteristics
Nine hundred and forty-nine HCT recipients were included in this study (Table 1). The majority of HCT procedures in this cohort were performed for acute leukemia (66%), and nearly one-half of the patients (46%) received RIC. The median patient age was 43 years (range, 1 to 75 years), and 24% were age <21 years. Donor sources included MSD BM (6%) and PBSCs (29%), matched URD BM (7%) and PBSCs (2%), and UCB (56%). The median duration of follow-up was 8.1 years (range, 1 to 16.4 years).
Table 1.
Characteristic | Study Group
---|---
Number of patients | 949
Age, yr |
<21, n (%) | 227 (24)
≥21, n (%) | 722 (76)
Median (range) | 43.8 (1–75)
Recipient gender, n (%) |
Male | 561 (59)
Female | 388 (41)
Donor-recipient sex match, n (%) |
Match | 383 (40)
Mismatch | 566 (60)
Diagnosis, n (%) |
ALL | 238 (25)
AML | 385 (41)
MDS/MPN/CML | 194 (20)
NHL/Hodgkin lymphoma/CLL | 115 (12)
Other malignancy | 17 (2)
Disease risk group: high risk, n (%) | 304 (32)
Previous autologous HCT, n (%) | 43 (5)
Positive recipient CMV serostatus, n (%) | 531 (56)
Conditioning, n (%) |
MA | 516 (54)
RIC | 433 (46)
GVHD prophylaxis, n (%) |
CsA | 39 (4)
CsA/MMF | 628 (66)
CsA/MTX | 235 (25)
Other | 47 (5)
Donor type, n (%) |
Matched marrow sibling | 57 (6)
Matched PBSC sibling | 273 (29)
Matched marrow URD | 67 (7)
Matched PBSC URD | 16 (2)
UCB (single + double) | 536 (56)
Year of HCT, n (%) |
2000–2007 | 537 (57)
2008–2013 | 412 (43)
ALL, acute lymphoblastic leukemia; CLL, chronic lymphocytic leukemia; CML, chronic myelogenous leukemia; CMV, cytomegalovirus; CsA, cyclosporine; MA, myeloablative conditioning; MDS, myelodysplastic syndrome; MPN, myeloproliferative neoplasm; NHL, non-Hodgkin lymphoma.
Comparison of Clinical Endpoints
Because some patients were successfully treated for GVHD and relapse, dGRFS was higher than conventional GRFS at 1 year (37.0% versus 27.6%), and this continued throughout the 4 years (37.4% versus 22.2%). The mean survival time without failure events was .52 years greater (95% confidence interval [CI], .45 to .58 years) in dGRFS compared with the original GRFS definition. Conventional GRFS treats GVHD as a nondynamic event and as such shows a persistent downward trend over years 1 to 4 (Figure 2). In contrast, there was little change in dGRFS after year 1. This lack of difference in dGRFS over time is due predominantly to patients with the acute GVHD grade III-IV endpoint transitioning out of that disease state and surviving. More than one-half (57%; 89 of 156) of patients with an acute GVHD event had a CR or PR to therapy and remained alive without transitioning to other states. Therefore, the population of patients who remained in a disease state with persisting acute GVHD was small (67 of 949; 7%). In addition, 40% (165 of 411) of patients who went into a relapse state at least temporarily improved into a remission state.
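One way to read a difference in mean survival time without failure events is as the area between two curves over the follow-up window. The sketch below computes that area for step functions. The text does not state how the mean difference or its CI was obtained, so treating it as a restricted mean over 4 years is an assumption; the function name and the toy curves are hypothetical and are not study data.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # (time in years, probability from that time onward)


def restricted_mean(curve: Curve, tau: float) -> float:
    """Area under a right-continuous step curve on [0, tau], starting at probability 1."""
    area = 0.0
    prev_t, prev_p = 0.0, 1.0
    for t, p in curve:
        if t >= tau:
            break
        area += prev_p * (t - prev_t)
        prev_t, prev_p = t, p
    area += prev_p * (tau - prev_t)
    return area


# Toy curves: the conventional curve can only fall, the dynamic curve may rise again.
toy_conventional: Curve = [(0.5, 0.5), (2.0, 0.25)]
toy_dynamic: Curve = [(0.5, 0.5), (1.0, 0.75)]

gain_years = restricted_mean(toy_dynamic, 4.0) - restricted_mean(toy_conventional, 4.0)
```

Because the dynamic curve is a state occupation probability rather than a survival function, it need not be monotone; the step-function area is still well defined in that case.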
Clinical Factors Associated with dGRFS
Factors associated with dGRFS were similar to those influencing GRFS from our prior report [6]. Patient age <21 years was associated with a 28.7 percentage point higher dGRFS at 1 year compared with adults age ≥21 years (58.7% versus 30.0%; P < .001) (Figure 3A). High-risk disease was associated with a 15.4 percentage point lower dGRFS at 1 year compared with standard-risk disease (26.5% versus 41.9%; P < .001) (Figure 3B). Myeloablative conditioning was associated with a higher dGRFS than RIC (41.5% versus 31.6%; P = .007) (Figure 3C). Donor type also significantly influenced dGRFS (P = .003), with recipients of MSD BM experiencing the highest dGRFS at 1 year (64.9%), followed by UCB (41.7%), URD BM (35.8%), MSD PBSCs (23.4%), and URD PBSCs (12.5%) (Figure 3D).
DISCUSSION
The experience of allogeneic HCT recipients is complex to model and describe. We have developed a novel composite endpoint, dGRFS, which reflects the prevalence of major complications of HCT over time. In our cohort of 949 pediatric and adult HCT recipients, 37% experienced dGRFS (ie, were alive without active GVHD or relapse, or having experienced those disease states and resolved them once or twice) at 1 year post-HCT. The overall prevalence of patients in a dGRFS state remains relatively stable after year 1, varying by <1% through 4 years post-HCT, suggesting that most of the persisting events have happened by 1 year and later resolution is uncommon. Our multistate model shows a 10% higher dGRFS at 1 year than the original time-to-event GRFS definition. Nonetheless, the majority of HCT recipients have experienced, or continue to experience, major complications of HCT at 1 year and beyond. Few patients have a smooth clinical course.
As with our previous report, factors that influence dGRFS in our cohort are not easily modifiable. Patient age is fixed, but timely referrals for allogeneic HCT may allow for treatment earlier in a patient’s disease course, thereby avoiding HCT when a patient has transitioned to a high-risk hematologic malignancy. Myeloablative conditioning was also associated with a higher dGRFS. Although it is not possible for all patients of advanced age to receive myeloablative conditioning, optimizing the conditioning intensity for the patient and disease state (eg, myeloablative conditioning for AML up to age 65 if few comorbidities) may result in better overall outcomes [7,8].
Donor and graft source are important contributors to dGRFS and may be modifiable for some patients. Recipients of marrow grafts from an MSD demonstrated the best dGRFS. Although the majority of recipients receiving BM from an MSD in this cohort were pediatric, a larger cohort enriched for adult recipients of BM from MSD demonstrated improved GRFS with this graft source compared with PBSCs [9]. The optimal donor for a patient without an MSD remains a matter of considerable debate. Long-term morbidity/mortality by dGRFS was similar for URD BM and UCB in this series. The results of this study and a recent Center for International Blood and Marrow Transplantation Research analysis suggest that URD PBSC results in inferior GRFS compared with URD BM, UCB, and haploidentical grafts [10], and we observed inferior dGRFS for URD or MSD PBSC transplants.
We could not assess the impact of GVHD prophylaxis regimens on GRFS in this series, because graft source, disease status at HCT, and GVHD prophylaxis regimens were highly correlated. Within other cohorts, it may be possible to discern an impact of various regimens on dGRFS. Retrospective studies have yielded insights regarding manipulating T cell dose and GRFS. A recent report by Simonetta et al [11] showed a higher GRFS with partial in vitro T cell depletion using alemtuzumab in the graft (54% versus 37%; P < .01). The European Society for Blood and Marrow Transplantation reported similar improvement in GRFS (60% versus 40%; P < .01) with in vivo T cell depletion with antithymocyte globulin after myeloablative fludarabine/busulfan conditioning for AML in CR1 [12]. The first prospective acute GVHD prophylaxis study with GRFS as its primary endpoint was recently reported. BMT CTN 1203 was a randomized, phase II RIC study of 3 novel GVHD prophylaxis regimens (tacrolimus [TAC]/MTX/maraviroc, TAC/MTX/bortezomib, and post-transplantation cyclophosphamide [PTCy]/TAC/MMF) compared with contemporaneous TAC/MTX. In BMT CTN 1203, only the PTCy/TAC/MMF arm was superior to TAC/MTX controls in the 1-year GRFS endpoint (46% versus 32%; one-sided P = .04). Both acute GVHD grade III-IV (2% versus 13%; P = .006) and chronic GVHD requiring immunosuppression (19% versus 32%; P = .04) were significantly less frequent in PTCy versus controls, with no difference in relapse/progression (27% versus 25%; P = .34) between these arms [13]. A randomized phase III study (BMT CTN 1703) is planned to formally compare TAC/MTX versus PTCy/TAC/MMF in RIC HCT.
Since the first publication of the GRFS endpoint, it has been modified by clinical investigators to more closely model the cumulative morbidity of allogeneic HCT. The original BMT CTN definition included chronic GVHD requiring systemic immunosuppression and was later modified to moderate to severe chronic GVHD by National Institutes of Health (NIH) criteria [14]. This may help reduce subjectivity and physician preference that could influence the endpoint [15]. Solomon et al reported on current GRFS, omitting acute GVHD as an endpoint, and considering only NIH moderate to severe chronic GVHD as a dynamic event that can resolve [4].
This new construct, dGRFS, closely models the morbidity and mortality of allogeneic HCT, recognizing that whereas almost all patients experience clinically significant complications, some of those complications, both acute GVHD and relapse, can improve with treatment in some patients. We suggest that dGRFS should be considered as a potential clinical trial endpoint. Although aGVHD is contributing less to overall mortality in recent years and has been eliminated from some models, the impact of aGVHD on quality of life is not trivial. Thus, dGRFS could possibly be more closely associated with patient-reported outcomes than endpoints that do not include aGVHD, although this requires formal study. Interventions aimed at minimizing GVHD while sparing the risk of fatal infections and relapse continue to be a high priority for development, especially in the setting of matched PBSC transplantation, where dGRFS is the lowest.
ACKNOWLEDGMENTS
The authors thank Michael Franklin, MS, for assistance in editing the manuscript.
Financial disclosure: This project was supported in part by the National Institutes of Health (Grants P30 CA77598 and P01 CA111412), using the Biostatistics and Bioinformatics Shared Resource of the Masonic Cancer Center, University of Minnesota, and by the National Cancer Institute (Grant P01 CA65493, to B.R.B. and C.G.B.).
Footnotes
Conflict of interest statement: The authors have no conflicts of interest to disclose.
SUPPLEMENTARY MATERIALS
Supplementary material associated with this article can be found in the online version at doi:10.1016/j.bbmt.2019.05.015.
REFERENCES
- 1. Holtan SG, DeFor TE, Lazaryan A, et al. Composite end point of graft-versus-host disease-free, relapse-free survival after allogeneic hematopoietic cell transplantation. Blood. 2015;125:1333–1338.
- 2. Meira-Machado L, de Uña-Alvarez J, Cadarso-Suárez C, Andersen PK. Multi-state models for the analysis of time-to-event data. Stat Methods Med Res. 2009;18:195–222.
- 3. MacMillan ML, DeFor TE, Weisdorf DJ. The best endpoint for acute GVHD treatment trials. Blood. 2010;115:5412–5417.
- 4. Solomon SR, Sizemore C, Zhang X, et al. Current graft-versus-host disease-free, relapse-free survival: a dynamic endpoint to better define efficacy after allogenic transplant. Biol Blood Marrow Transplant. 2017;23:1208–1214.
- 5. Lunneborg CE. Data Analysis by Resampling: Concepts and Applications. North Scituate, MA: Duxbury Press; 2000.
- 6. Enninga EA, Nevala WK, Creedon DJ, Markovic SN, Holtan SG. Fetal sex-based differences in maternal hormones, angiogenic factors, and immune mediators during pregnancy and the postpartum period. Am J Reprod Immunol. 2015;73:251–262.
- 7. Scott BL, Pasquini MC, Logan BR, et al. Myeloablative versus reduced-intensity hematopoietic cell transplantation for acute myeloid leukemia and myelodysplastic syndromes. J Clin Oncol. 2017;35:1154–1161.
- 8. Weisdorf DJ. Reduced-intensity versus myeloablative allogeneic transplantation. Hematol Oncol Stem Cell Ther. 2017;10:321–326.
- 9. Mehta RS, Peffault de Latour R, DeFor TE, et al. Improved graft-versus-host disease-free, relapse-free survival associated with bone marrow as the stem cell source in adults. Haematologica. 2016;101:764–772.
- 10. Mehta RS, Holtan SG, Wang T, et al. Graft-versus-host disease (GVHD)-free relapse-free survival (GRFS) and chronic GVHD (CRFS) in alternative donor hematopoietic cell transplantation (HCT) in adults. Blood. 2017;130(suppl 1):517.
- 11. Simonetta F, Masouridi-Levrat S, Beauverd Y, et al. Partial T-cell depletion improves the composite endpoint graft-versus-host disease-free, relapse-free survival after allogeneic hematopoietic stem cell transplantation. Leuk Lymphoma. 2018;59:590–600.
- 12. Rubio MT, D'Aveni-Piney M, Labopin M, et al. Impact of in vivo T cell depletion in HLA-identical allogeneic stem cell transplantation for acute myeloid leukemia in first complete remission conditioned with a fludarabine IV-busulfan myeloablative regimen: a report from the EBMT Acute Leukemia Working Party. J Hematol Oncol. 2017;10:31.
- 13. Bolaños-Meade J, Reshef R, Fraser R, et al. Three prophylaxis regimens (tacrolimus, mycophenolate mofetil, and cyclophosphamide; tacrolimus, methotrexate, and bortezomib; or tacrolimus, methotrexate, and maraviroc) versus tacrolimus and methotrexate for prevention of graft-versus-host disease with haemopoietic cell transplantation with reduced-intensity conditioning: a randomised phase 2 trial with a non-randomised contemporaneous control group (BMT CTN 1203). Lancet Haematol. 2019;6:e132–e143.
- 14. Filipovich AH, Weisdorf D, Pavletic S, et al. National Institutes of Health consensus development project on criteria for clinical trials in chronic graft-versus-host disease, I: Diagnosis and Staging Working Group report. Biol Blood Marrow Transplant. 2005;11:945–956.
- 15. Solh M, Zhang X, Connor K, et al. Donor type and disease risk predict the success of allogeneic hematopoietic cell transplantation: a single-center analysis of 613 adult hematopoietic cell transplantation recipients using a modified composite endpoint. Biol Blood Marrow Transplant. 2017;23:2192–2198.