Abstract
Epstein-Barr virus (EBV) reactivation is one of the most important infections after hematopoietic stem cell transplantation (HSCT) using haplo-identical related donors (HID). We aimed to establish a comprehensive machine learning model to predict EBV reactivation after HID HSCT with anti-thymocyte globulin (ATG) for graft-versus-host disease (GVHD) prophylaxis. We enrolled 470 consecutive acute leukemia patients; 60% (n = 282) were randomly selected as a training cohort, and the remaining 40% (n = 188) served as a validation cohort. The equation was as follows: Probability (EBV reactivation) = $1/(1 + e^{-Y})$, where Y = 0.0250 × (age) – 0.3614 × (gender) + 0.0668 × (underlying disease) – 0.6297 × (disease status before HSCT) – 0.0726 × (disease risk index) – 0.0118 × (hematopoietic cell transplantation-specific comorbidity index [HCT-CI] score) + 1.2037 × (human leukocyte antigen disparity) + 0.5347 × (EBV serostatus) + 0.1605 × (conditioning regimen) – 0.2270 × (donor/recipient gender matched) + 0.2304 × (donor/recipient relation) – 0.0170 × (mononuclear cell counts in graft) + 0.0395 × (CD34+ cell count in graft) – 2.4510. The threshold of probability was 0.4623, which separated patients into low- and high-risk groups. The 1-year cumulative incidence of EBV reactivation in the low- and high-risk groups was 11.0% versus 24.5% (P < .001) in the total cohort, 10.7% versus 19.3% (P = .046) in the training cohort, and 11.4% versus 31.6% (P = .001) in the validation cohort. The model could also predict non-relapse mortality and survival after HID HSCT. We established a comprehensive model that could predict EBV reactivation in HID HSCT recipients using ATG for GVHD prophylaxis.
Keywords: Anti-thymocyte globulin, Epstein-Barr virus, Haplo-identical hematopoietic stem cell transplant, Machine learning, Predictive model
1. INTRODUCTION
Allogeneic hematopoietic stem cell transplantation (allo-HSCT) significantly improves the survival of patients with acute leukemia (AL).1 With recent advances in transplant techniques, human leukocyte antigen (HLA) haplo-identical donors (HIDs) have become the most common donor type, accounting for 60% of allo-HSCT in China.2
Although patients receiving HID HSCT can achieve long-term survival,3,4 infection is still the most important cause of transplant-related mortality.5 Epstein-Barr virus (EBV) reactivation is one of the most common of these infections. It has been found to be the most important risk factor for EBV-related post-transplant lymphoproliferative disorders (PTLD)6–8 and can increase the risk of mortality.9,10 Considering that the seroprevalence of EBV in the Chinese population is as high as 90% in children 8 years old or more,11 it is important to predict the EBV reactivation after HID HSCT.
Several variables could increase the risk of EBV reactivation after allo-HSCT. Anti-thymocyte globulin (ATG) is the most critical risk factor,6,12,13 and approximately 60%–70% of patients receiving ATG during conditioning would experience EBV reactivation after allo-HSCT.12,14 In addition, HLA mismatch is another important risk factor for post-transplant EBV reactivation.6,13 Thus, patients receiving HID HSCT with an ATG-based regimen would have a high risk of EBV reactivation; however, no accepted risk factors for EBV reactivation have been reported in this population, and there is no comprehensive model to predict EBV reactivation after HID HSCT.
The objective of this article is to establish a comprehensive model, with machine learning, which could predict EBV reactivation after HID HSCT with ATG to counter graft-versus-host disease (GVHD).
2. MATERIALS AND METHODS
2.1. Study design
This study was conducted on the basis of the transplant database of Peking University, Institute of Hematology; it consisted of 470 consecutive AL patients receiving HID HSCT between January 21, 2020, and May 31, 2021. The information on acute GVHD (aGVHD) has been reported in detail,15 and in the present study, the survivors were further followed up to March 1, 2022. The study was conducted in accordance with the Declaration of Helsinki.
2.2. Transplant regimens
All patients were treated according to the registered protocol, NCT03756675. The major regimen included cytarabine, busulfan, cyclophosphamide, and semustine for conditioning,3,16 using granulocyte colony-stimulating factor-primed peripheral blood (PB) harvests as grafts.17 The protocol for GVHD prophylaxis included ATG, cyclosporine A, mycophenolate mofetil, and short-term methotrexate (SDC, Methods, http://links.lww.com/BS/A53).18–25
2.3. Protocols for EBV monitoring and prevention
Plasma EBV copies were monitored at least weekly until day +100 with quantitative polymerase chain reaction (Q-PCR) analysis. For patients who received systemic immunosuppressive treatments, EBV monitoring was conducted regularly after day +100. If symptoms of suspected virus infection were present, additional testing was performed. EBV reactivation was defined as more than 1 × 10³ copies/mL EBV-DNA in plasma by Q-PCR in a single test.26 The protocols for infection prophylaxis other than EBV and the pre-emptive intervention for EBV reactivation are shown in the SDC, Methods (http://links.lww.com/BS/A53).
2.4. Building machine learning models
Our method consisted of 2 steps: building the logistic regression model and ascertaining the optimal threshold (Fig. 1; SDC, Methods, http://links.lww.com/BS/A53; and SDC, Table S1, http://links.lww.com/BS/A53).27–31
Figure 1.
Flow diagram of building the machine learning model. EBV = Epstein-Barr virus.
Of the entire study population, 60% were randomly selected (ie, n = 282) as the training cohort; the remaining 40% were used as the validation cohort (n = 188). For the primary outcome (EBV reactivation), we performed the model-building steps in the training cohort and verified the model in the validation cohort. We also identified the sensitivity, specificity, area under the curve (AUC) score and accuracy score in both data cohorts.
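The random 60%/40% split can be illustrated with a minimal sketch. The function name `split_cohort`, the fixed seed, and the use of integer patient IDs are assumptions made here for illustration; the source does not state how the randomization was implemented.

```python
import random

def split_cohort(patient_ids, train_fraction=0.6, seed=0):
    """Randomly split patient IDs into training and validation cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# 470 hypothetical patient IDs reproduce the cohort sizes reported in the text.
train_ids, validation_ids = split_cohort(range(470))
print(len(train_ids), len(validation_ids))  # 282 188
```

With 470 patients and a training fraction of 0.6, the sketch yields the cohort sizes given above (282 and 188), and no patient appears in both cohorts.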
2.4.1. Building models
We utilized logistic regression models with L2 regularization for the prediction. The model is illustrated in equation (1):
In equation (1), w is the coefficient to be trained, which requires the following objective function to be minimized:
During the optimization procedure, a marked imbalance between the numbers of positive and negative samples was found. We therefore assigned a weight to each sample during optimization, as in equation (3):
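For orientation, the standard forms of a class-weighted, L2-regularized logistic regression can be written as follows. This is a sketch that assumes the scikit-learn formulation; the symbols $x_i$, $y_i$, $b$, $C$, $s_i$, $n$, $n_{+}$, and $n_{-}$ are notation chosen here and may differ from the authors' equations (1) to (3), which are detailed in the SDC.

```latex
% Model: predicted probability of the positive class
P(y = 1 \mid x) = \frac{1}{1 + e^{-(w^{\top} x + b)}}

% Objective: L2 penalty plus weighted log-loss, with p_i = P(y_i = 1 \mid x_i)
\min_{w,\, b} \;\; \frac{1}{2} \lVert w \rVert_2^2
  + C \sum_{i=1}^{n} s_i \left[ -y_i \log p_i - (1 - y_i) \log (1 - p_i) \right]

% Balanced sample weights for two classes (n_+ positives, n_- negatives, n = n_+ + n_-)
s_i = \frac{n}{2\, n_{+}} \ \text{if } y_i = 1, \qquad
s_i = \frac{n}{2\, n_{-}} \ \text{if } y_i = 0
```

Under this weighting, the rarer class receives the larger per-sample weight, so both classes contribute equally to the summed loss.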
We used scikit-learn (sklearn) v1.0.2 with Python 3.9 on the anaconda3 development platform to build the models. The model parameters “class_weight” and “max_iter” were set to “balanced” and 1000, respectively.32,33
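To make the effect of the “balanced” setting concrete, the sketch below recomputes the class weights with the formula scikit-learn documents for that option, n_samples / (n_classes × class count), and evaluates a weighted, L2-penalized log-loss in plain Python. The function names, the toy data, and the value C = 1.0 are illustrative assumptions; this is not the authors' code, and it does not call scikit-learn.

```python
import math
from collections import Counter

def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * count of that class)."""
    n = len(labels)
    counts = Counter(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}

def weighted_l2_log_loss(w, b, X, y, C=1.0):
    """0.5 * ||w||^2 + C * sum_i s_i * log-loss_i, with balanced weights s_i."""
    weights = balanced_class_weights(y)
    penalty = 0.5 * sum(wj * wj for wj in w)
    data_term = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        p = 1.0 / (1.0 + math.exp(-z))  # logistic function
        data_term += weights[yi] * -(yi * math.log(p) + (1 - yi) * math.log(1 - p))
    return penalty + C * data_term

# Toy data (hypothetical): three negatives and one positive.
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [0, 0, 0, 1]
toy_weights = balanced_class_weights(y_toy)
toy_loss = weighted_l2_log_loss([0.0], 0.0, X_toy, y_toy)
```

In the toy data the single positive sample receives weight 2.0 and each negative 2/3, so each class carries the same total weight in the loss.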
2.4.2. Finding the optimal threshold
According to equation (1), the output of the logistic regression model lies between 0 and 1. A threshold is therefore needed to convert this probability into a negative or positive prediction. In this article, we drew receiver operating characteristic (ROC) curves30 and calculated the g-mean for each threshold.31 We chose the threshold with the largest g-mean as the optimal threshold.
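A minimal sketch of this threshold search follows. It assumes the usual definition of the g-mean, the square root of sensitivity × specificity, and treats a score at or above the threshold as a positive prediction; the function name and the toy scores are hypothetical.

```python
import math

def best_gmean_threshold(y_true, scores):
    """Scan every observed score as a candidate threshold and keep the one
    with the largest g-mean = sqrt(sensitivity * specificity)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_t, best_g = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        g = math.sqrt((tp / n_pos) * (tn / n_neg))
        if g > best_g:  # ties keep the lowest threshold
            best_t, best_g = t, g
    return best_t, best_g

# Toy example (hypothetical labels and predicted probabilities).
labels = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.45, 0.5, 0.6, 0.8]
threshold, gmean = best_gmean_threshold(labels, probs)
```

On the toy data the search selects 0.45, where sensitivity is 3/3 and specificity is 4/5. The g-mean rewards thresholds that keep both rates high, which is why it is a common choice when positive cases are much rarer than negative ones.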
2.4.3. Evaluation for model
ROC-AUC was defined as the area under the curve of the true-positive rate (TPR) plotted against the false-positive rate (FPR) as the threshold ranges from 0 to 1. The confusion matrix was a 2 × 2 table summarizing the prediction results. In addition, we normalized the counts by either the number of cases with each True Label (Outcome) or the number of cases with each Predicted Label (Prediction).
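The two normalizations of the confusion matrix can be sketched as follows. The function names and toy labels are hypothetical; rows index the true label and columns the predicted label, which is an assumed layout.

```python
def confusion_matrix(y_true, y_pred):
    """2 x 2 counts: rows are the true label (0, 1), columns the predicted label (0, 1)."""
    m = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def normalize_by_true(m):
    """Divide each row by its total: row 0 gives specificity, row 1 sensitivity."""
    return [[v / sum(row) for v in row] for row in m]

def normalize_by_pred(m):
    """Divide each column by its total: the diagonal gives NPV and PPV."""
    col = [m[0][j] + m[1][j] for j in range(2)]
    return [[m[i][j] / col[j] for j in range(2)] for i in range(2)]

# Toy example (hypothetical): 5 negatives, 3 positives.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred)
by_true = normalize_by_true(cm)
by_pred = normalize_by_pred(cm)
accuracy = (cm[0][0] + cm[1][1]) / len(y_true)
```

Normalizing by true label reads off specificity and sensitivity; normalizing by predicted label reads off the negative and positive predictive values. Both helpers assume every row and column contains at least one case.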
The detailed information for the setting of equations is shown in the SDC, Methods (http://links.lww.com/BS/A53).
2.5. Definitions
The definitions for hematopoietic cell transplantation-specific comorbidity index (HCT-CI) and disease risk index (DRI) were as in previous studies.34,35 The definitions for engraftment, nonrelapse mortality (NRM), relapse, overall survival (OS), and leukemia-free survival (LFS) are shown in the SDC, Methods (http://links.lww.com/BS/A53).
2.6. Statistical methods
The primary outcome was EBV reactivation. The secondary outcomes included relapse, NRM, OS, and LFS.
We used the Mann–Whitney U test to compare continuous variables and the χ2 and Fisher exact tests for categorical variables. The Kaplan–Meier method was used to estimate the probability of OS and LFS. We used competing risk analyses to calculate the cumulative incidence of EBV reactivation, NRM, and relapse.36 Testing was 2-sided at the P < .05 level. Statistical analysis was performed on R software (version 4.2.0) (http://www.r-project.org) and SPSS 26.0 software (SPSS, Chicago, Illinois).
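As an illustration of the Kaplan–Meier estimate used for OS and LFS, the sketch below computes the product-limit survival curve in plain Python. It is not the R or SPSS routine used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. events: 1 = event observed, 0 = censored.
    Returns [(t, S(t))] at each distinct event time."""
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve

# Toy follow-up times in days (hypothetical); the 0s mark censored patients.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients leave the risk set without lowering the curve, so the toy estimate steps to 0.8, 0.6, and 0.3 at days 2, 3, and 5. Cumulative incidence with competing risks, as used for EBV reactivation, NRM, and relapse, needs a different estimator and is not covered by this sketch.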
3. RESULTS
3.1. Characteristics of patients
Table 1 shows the characteristics of the training and validation cohorts. Detailed information on engraftment and aGVHD has been previously reported by Shen et al.15 A total of 438 patients (92.9%) survived until the last follow-up. The median duration of follow-up was 483 days (range, 39–770 d). The probabilities of NRM, relapse, OS, and LFS at 1 year after HID HSCT were 3.9% (95% CI, 2.1%–5.7%), 8.7% (95% CI, 6.1%–11.2%), 93.9% (95% CI, 91.8%–96.1%), and 87.4% (95% CI, 84.4%–90.5%), respectively.
Table 1.
Patient characteristics.
| Characteristics | Training cohort (n = 282) | Validation cohort (n = 188) | P |
|---|---|---|---|
| Median age at allo-HSCT, y (range) | 27.5 (1–65) | 30.0 (1–66) | .514 |
| Gender, n (%) | | | .618 |
| Male | 166 (58.9) | 115 (61.2) | |
| Female | 116 (41.1) | 73 (38.8) | |
| Underlying disease, n (%) | | | .495 |
| Acute myeloid leukemia | 162 (57.4) | 102 (54.3) | |
| Acute lymphoblastic leukemia | 120 (42.6) | 86 (45.7) | |
| Disease status before allo-HSCT, n (%) | | | .922 |
| CR1 | 271 (96.1) | 181 (96.3) | |
| >CR1 | 11 (3.9) | 7 (3.7) | |
| Disease risk index before allo-HSCT, n (%) | | | .395 |
| Low risk | 14 (5.0) | 10 (5.3) | |
| Intermediate risk | 209 (74.1) | 145 (77.1) | |
| High risk | 59 (20.9) | 33 (17.6) | |
| HCT-CI scores before allo-HSCT, n (%) | | | .514 |
| 0 (low risk) | 204 (72.3) | 138 (73.4) | |
| 1–2 (intermediate risk) | 54 (19.1) | 43 (22.9) | |
| ≥3 (high risk) | 24 (8.5) | 7 (3.7) | |
| Number of HLA-A, HLA-B, HLA-DR mismatches, n (%) | | | .575 |
| 1 locus | 8 (2.8) | 3 (1.6) | |
| ≥2 loci | 274 (97.2) | 185 (98.4) | |
| EBV serostatus before HSCT, n (%) | | | .730 |
| Donor+/recipient– | 9 (3.2) | 4 (2.1) | |
| Donor+/recipient+ | 259 (91.8) | 173 (92.0) | |
| Donor–/recipient+ | 14 (5.0) | 11 (5.8) | |
| Conditioning regimen, n (%) | | | .049 |
| Chemotherapy-based regimen | 271 (96.1) | 187 (99.5) | |
| TBI-based regimen | 11 (3.9) | 1 (0.5) | |
| Donor/recipient gender matched, n (%) | | | .408 |
| Female donor/male recipient combination | 55 (19.5) | 31 (16.5) | |
| Others | 227 (80.5) | 157 (83.5) | |
| Donor/recipient relation, n (%) | | | .031 |
| Maternal donor | 30 (10.6) | 8 (4.3) | |
| Collateral donor | 8 (2.8) | 4 (2.1) | |
| Others | 244 (86.5) | 176 (93.6) | |
| MNC counts in graft, median (range), ×10⁸/kg | 9.35 (4.15–27.52) | 9.03 (5.76–15.94) | .335 |
| CD34+ cell counts in graft, median (range), ×10⁶/kg | 3.79 (0.67–29.35) | 3.84 (1.15–15.91) | .595 |
| Median follow-up of survivors, d (range) | 468.5 (39–770) | 497 (66–768) | .437 |
allo-HSCT = allogeneic hematopoietic stem cell transplantation, CR = complete remission, EBV = Epstein-Barr virus, HCT-CI = hematopoietic cell transplantation-specific comorbidity index, HLA = human leukocyte antigen, MNC = mononuclear cell, TBI = total body irradiation.
3.2. EBV characteristics
A total of 80 patients (17.0%) showed EBV reactivation. The median time from HSCT to EBV reactivation was 52 days (range, 20–579 d). The initial and highest plasma levels of EBV-DNAemia were 1.62 × 10³ copies/mL (range, 1.00–25.10 × 10³ copies/mL) and 2.96 × 10³ copies/mL (range, 1.02–56.00 × 10³ copies/mL), respectively. The cumulative incidence of EBV reactivation at 1 year after HID HSCT was 16.6% (95% CI, 13.3%–20.0%). Twelve patients developed PTLD after EBV reactivation.
3.3. Predictive model for EBV reactivation
Our equation was as follows:
Probability (EBV reactivation) = $\dfrac{1}{1 + e^{-Y}}$,
where
Y = 0.0250 × (patient age) – 0.3614 × (patient gender) + 0.0668 × (underlying disease) – 0.6297 × (disease status before HSCT) – 0.0726 × (DRI) – 0.0118 × (HCT-CI score) + 1.2037 × (HLA disparity) + 0.5347 × (EBV serostatus) + 0.1605 × (conditioning regimen) – 0.2270 × (donor/recipient gender matched) + 0.2304 × (donor/recipient relation) – 0.0170 × (mononuclear cell count in graft) + 0.0395 × (CD34+ cell count in graft) – 2.4510 (Table 2). The threshold of probability was set as 0.4623, and the g-mean was 0.648; thus, the patients could be separated into low- and high-risk groups by the threshold value.
Table 2.
Variables for building machine learning models.
| Variables | Assignment for variables |
|---|---|
| Age (y) | Numerical value |
| Gender | Male = 0; female = 1 |
| Underlying disease | Acute myeloid leukemia = 0; acute lymphoblastic leukemia = 1 |
| Disease status before HSCT | CR1 = 0; >CR1 = 1 |
| DRI | Low risk = 0; intermediate risk = 1; high risk = 2 |
| HCT-CI score | Numerical value |
| HLA disparity | 1 locus = 0; ≥2 loci = 1 |
| EBV serostatus | D+/R– = 0; D+/R+ = 1; D–/R+ = 2 |
| Conditioning regimen | TBI-based = 0; chemotherapy-based = 1 |
| Donor/recipient gender matched | Others = 0; female donor/male recipient = 1 |
| Donor/recipient relation | Immediate related donors, others = 0; immediate related donors, maternal donors = 1; collateral related donors = 2 |
| Mononuclear cell counts in graft (×10⁸/kg) | Numerical value |
| CD34+ cell counts in graft (×10⁶/kg) | Numerical value |
CR = complete remission, D = donor, DRI = disease risk index, EBV = Epstein-Barr virus, HCT-CI = hematopoietic cell transplantation-specific comorbidity index, HLA = human leukocyte antigen, HSCT = hematopoietic stem cell transplant, R = recipient, TBI = total body irradiation.
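The equation and the coding in Table 2 can be combined into a small calculator, sketched below. The coefficients, intercept, and threshold are those reported above; the dictionary keys, the function names, the rule that a probability at or above the threshold counts as high risk, the assumption that raw (unscaled) values are entered, and the example patient are all illustrative assumptions rather than the authors' implementation.

```python
import math

# Coefficients, intercept, and threshold as reported in the text.
COEFFICIENTS = {
    "age": 0.0250,                # years
    "gender": -0.3614,            # male = 0, female = 1
    "underlying_disease": 0.0668, # AML = 0, ALL = 1
    "disease_status": -0.6297,    # CR1 = 0, >CR1 = 1
    "dri": -0.0726,               # low = 0, intermediate = 1, high = 2
    "hct_ci": -0.0118,            # numerical score
    "hla_disparity": 1.2037,      # 1 locus = 0, >=2 loci = 1
    "ebv_serostatus": 0.5347,     # D+/R- = 0, D+/R+ = 1, D-/R+ = 2
    "conditioning": 0.1605,       # TBI-based = 0, chemotherapy-based = 1
    "gender_matched": -0.2270,    # others = 0, female donor/male recipient = 1
    "donor_relation": 0.2304,     # others = 0, maternal = 1, collateral = 2
    "mnc": -0.0170,               # x10^8/kg
    "cd34": 0.0395,               # x10^6/kg
}
INTERCEPT = -2.4510
THRESHOLD = 0.4623

def ebv_probability(patient):
    """Logistic transform of the linear score Y."""
    y = INTERCEPT + sum(COEFFICIENTS[k] * patient[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-y))

def risk_group(patient):
    # Assumption: a probability at or above the threshold is labeled high risk.
    return "high" if ebv_probability(patient) >= THRESHOLD else "low"

# Hypothetical patient: 30-year-old man with AML in CR1, intermediate DRI,
# HCT-CI 0, >=2 mismatched loci, D+/R+, chemotherapy-based conditioning.
example = {
    "age": 30, "gender": 0, "underlying_disease": 0, "disease_status": 0,
    "dri": 1, "hct_ci": 0, "hla_disparity": 1, "ebv_serostatus": 1,
    "conditioning": 1, "gender_matched": 0, "donor_relation": 0,
    "mnc": 9.0, "cd34": 4.0,
}
probability = ebv_probability(example)
group = risk_group(example)
```

For this hypothetical patient, Y is about 0.13 and the probability about 0.53, which falls above 0.4623 and so into the high-risk group under the stated assumption.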
In the training cohort, the sensitivity, specificity, AUC score and accuracy score were 0.7593, 0.5395, 0.6804, and 0.5816, respectively (Fig. 2A and SDC, Table S2, http://links.lww.com/BS/A53). In the validation cohort, the sensitivity, specificity, AUC score, and accuracy score were 0.8846, 0.4938, 0.6598, and 0.5479, respectively (Fig. 2B and SDC, Table S3, http://links.lww.com/BS/A53).
Figure 2.
ROC curve and confusion matrix for the EBV reactivation model in (A) the training cohort and (B) the validation cohort. EBV, Epstein-Barr virus.
3.4. Predicted value of our comprehensive model in the total cohort
The 1-year cumulative incidence of EBV reactivation after HID HSCT was 24.5% (95% CI, 18.4%–30.5%) and 11.0% (95% CI, 7.3%–14.7%), respectively, in the high- and low-risk groups (P < .001; Fig. 3A), in the total cohort.
Figure 3.
The 1-year cumulative incidence of EBV reactivation in the low- and high-risk groups in (A) the total cohort, (B) the training cohort, and (C) the validation cohort. EBV, Epstein-Barr virus.
In the training cohort, the cumulative incidence of EBV reactivation at 1 year after HID HSCT was 19.3% (95% CI, 12.0%–26.6%) and 10.7% (95% CI, 6.0%–15.4%), respectively, in the high- and low-risk groups (P = .046; Fig. 3B). In the validation cohort, the cumulative incidence of EBV reactivation at 1 year after HID HSCT was 31.6% (95% CI, 21.4%–41.7%) and 11.4% (95% CI, 5.3%–17.5%), respectively, in the high- and low-risk groups (P = .001; Fig. 3C).
The cumulative incidence of EBV reactivation at 1 year after HID HSCT was significantly higher in the high-risk group than in the low-risk group in the patients with HCT-CI scores of 0 (SDC, Figure S1, http://links.lww.com/BS/A53). The low-risk group showed a trend to lower incidence of EBV reactivation compared with the high-risk group for patients with HCT-CI scores of ≥1 (SDC, Figure S2, http://links.lww.com/BS/A53).
The cumulative incidence of EBV reactivation with clinical meaning at 1 year after HID HSCT was 4.4% (95% CI, 2.0%–6.8%) and 9.1% (95% CI, 5.1%–13.2%) (P = .039), respectively, in the low- and high-risk groups.
The cumulative incidence of PTLD at 1 year after HID HSCT was 1.5% (95% CI, 0.0%–2.9%) and 4.1% (95% CI, 1.3%–6.8%) (P = .078), respectively, in the low- and high-risk groups.
3.5. Predicted value of our comprehensive model in patients without or with cytomegalovirus
The cumulative incidence of EBV reactivation at 1 year after HID HSCT was 9.1% (95% CI, 2.6%–15.6%) and 25.6% (95% CI, 11.7%–39.6%), respectively, in the low- and high-risk groups (P = .019) in those without cytomegalovirus (CMV)-DNAemia (n = 116) (Fig. 4A).
Figure 4.
The 1-year cumulative incidence of EBV reactivation. (A) in patients without CMV, (B) in patients with CMV, (C) in patients with grade II to IV aGVHD, and (D) in patients without aGVHD or with grade I aGVHD. EBV, Epstein-Barr virus; aGVHD, acute graft-versus-host disease.
In patients with CMV-DNAemia (n = 354), the cumulative incidence of EBV reactivation at 1 year after HID HSCT was 11.7% (95% CI, 7.2%–16.3%) and 24.2% (95% CI, 17.4%–30.9%), respectively, in the low- and high-risk groups (P = .003; Fig. 4B).
3.6. Predicted value of our comprehensive model in patients without or with severe aGVHD
In patients with grade II to IV aGVHD (n = 126), the cumulative incidence of EBV reactivation at 1 year after HID HSCT was 28.6% (95% CI, 13.2%–44.1%) and 11.1% (95% CI, 4.6%–17.6%), respectively, in the high- and low-risk groups (P = .029; Fig. 4C).
In patients without aGVHD or with grade I aGVHD (n = 344), the cumulative incidence of EBV reactivation at 1 year after HID HSCT was 23.6% (95% CI, 17.0%–30.2%) and 10.9% (95% CI, 6.4%–15.5%) (P = .002), respectively, in the high- and low-risk groups (Fig. 4D).
3.7. Secondary outcomes after HID HSCT
At 1 year after HID HSCT, patients in the high-risk group had a significantly higher cumulative incidence of NRM and significantly lower probabilities of OS and LFS than those in the low-risk group. The incidence of relapse was comparable between the groups (Fig. 5).
Figure 5.
The 1-year cumulative incidence of secondary outcomes after HID HSCT in the low- and high-risk groups. (A) relapse, (B) NRM, (C) LFS, and (D) OS. NRM, non-relapse mortality; LFS, leukemia-free survival; OS, overall survival.
4. DISCUSSION
In the present study, we propose a machine learning-based predictive model for EBV reactivation after HID HSCT. It categorizes patients into low- and high-risk groups for EBV reactivation. This is the first model to integrate multiple variables into a comprehensive predictor of EBV reactivation in HID HSCT recipients receiving ATG for GVHD prophylaxis.
Several studies have already identified the risk factors for EBV reactivation after allo-HSCT.9,10,12,13,26 However, besides ATG and HLA-mismatched donors, only male gender and intensified conditioning regimens emerged as potential risk factors, and using 1 or 2 variables to predict EBV reactivation is clearly insufficient for HID HSCT recipients on ATG-based regimens. According to machine learning theory, adding more variables can increase the capacity of a predictive model and raise the upper bound of its performance.37,38 Thus, our comprehensive model included 13 demographic, disease, and transplant characteristics. However, a large number of variables may induce overfitting in the training set.39 Our strategy was to add an L2 regularization term to the objective function (equation 2). The regularization term penalizes large coefficients, so the weights become more balanced across variables, thereby reducing the risk of overfitting.40 In addition, an imbalance was found between the numbers of positive and negative samples. We adopted adjusted weights (equation 3) during the optimization procedure,41 which increased the weights of the positive samples to alleviate the adverse effects of this imbalance. Both methods contributed to a more generalizable and robust model. Our strategy therefore proceeded in a step-wise manner and ensured the stability of the feature-selection process.
Considering that not all patients experience particular post-transplant complications (eg, aGVHD), we included only baseline transplant characteristics common to all patients, so the model can also be used in patients without post-transplant complications. For example, we observed that this model could predict EBV reactivation in patients without CMV-DNAemia or without severe aGVHD. This may help to increase the generalizability of our model.
In the present study, we observed that high-risk patients showed a higher incidence of NRM and a lower probability of survival compared with low-risk patients. Some studies have also reported that EBV reactivation could increase the risk of mortality after allo-HSCT.6,9,10 This supports the clinical significance of our predictive model.
Several studies reported that prophylactic rituximab treatment42 or EBV-specific T-cell infusions43 could decrease the risk of EBV reactivation. Considering that the median time from HSCT to EBV reactivation was nearly 2 months, there is a sufficient window in which to conduct risk stratification-directed EBV prophylaxis after HID HSCT in high-risk patients on the basis of our predictive model, while low-risk patients can avoid unnecessary treatment-related toxicities.
Regarding the limitations of our study, although we confirmed the model in the validation cohort successfully, this cohort was relatively small. Also, it did not enroll patients receiving unrelated-donor or identical-sibling-donor allo-HSCT with ATG for prophylaxis. Our cohort did not enroll HID HSCT with post-transplant cyclophosphamide either, although EBV reactivation is relatively rare in these patients.44 Thus, the model should be further evaluated by independent cohorts in multicenter studies with other donor types and transplant regimens. In the present study, only 1 patient died of PTLD. It is still premature for our model to predict EBV-related deaths, and this should be investigated further. Finally, only 3 patients showed new-onset EBV reactivation after chronic GVHD (cGVHD). We could not further identify the efficacy of our model in patients with cGVHD, and this requires additional study.
5. CONCLUSIONS
Using machine learning, we have established a comprehensive model that could predict EBV reactivation in HID HSCT recipients receiving ATG for GVHD prophylaxis. This is the first predictive model for these patients, who have a high risk of EBV reactivation, and it can be readily adopted in clinical practice. In the future, prospective, multicenter studies can further confirm the efficacy of our predictive model. It can also help to guide risk stratification-directed EBV prophylaxis after HID HSCT.
ACKNOWLEDGMENTS
This work was supported by the National key research and development plan of China (2022YFC2502606), the Program of the National Natural Science Foundation of China (grant number 82170208), the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (grant number 81621001), the CAMS Innovation Fund for Medical Sciences (CIFMS) (grant number 2019-I2M-5-034), the Key Program of the National Natural Science Foundation of China (grant number 81930004), and the Fundamental Research Funds for the Central Universities, National Natural Science Foundation of China (No. 62102008).
Footnotes
S.F. and H.-Y.H. contributed equally to this article.
Conflict of interest: The authors declare that they have no conflict of interest.
S.F. was involved in investigation; software; visualization; writing—original draft; and writing—review and editing. H.-Y.H. was involved in software; visualization; and writing—original draft. X.-Y.D. was involved in data curation; resources; and formal analysis. L.-P.X., X.-H.Z., Y.W., C.-H.Y., H.C., Y.-H.C., W.H., F.-R.W., J.-Z.W., K.-Y.L., M.-Z.S. were involved in data curation and resources. X.-J.H. was involved in funding acquisition; investigation; project administration; data curation; resources; and validation. S.-D.H. and X.-D.M. were involved in conceptualization; data curation; formal analysis; investigation; methodology; project administration; resources; software; supervision; validation; visualization; writing—original draft; and writing—review and editing.
Informed consent was obtained from all individual participants or their guardians included in the study.
The datasets generated during the analysis of the current study are available from the corresponding author.
REFERENCES
- [1].Zhang XH, Chen J, Han MZ, et al. The consensus from The Chinese Society of Hematology on indications, conditioning regimens and donor selection for allogeneic hematopoietic stem cell transplantation: 2021 update. J Hematol Oncol 2021;14:145. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [2].Xu LP, Lu PH, Wu DP, et al. ; Chinese Blood and Marrow Transplantation Registry Group. Hematopoietic stem cell transplantation activity in China 2019: a report from the Chinese Blood and Marrow Transplantation Registry Group. Bone Marrow Transplant 2021;56:2940–2947. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [3].Wang Y, Liu QF, Xu LP, et al. Haploidentical vs identical-sibling transplant for AML in remission: a multicenter, prospective study. Blood 2015;125:3956–3962. [DOI] [PubMed] [Google Scholar]
- [4].Xiao-Jun H, Lan-Ping X, Kai-Yan L, et al. Partially matched related donor transplantation can achieve outcomes comparable with unrelated donor transplantation for patients with hematologic malignancies. Clin Cancer Res 2009;15:4777–4783. [DOI] [PubMed] [Google Scholar]
- [5].Yan CH, Xu LP, Wang FR, et al. Causes of mortality after haploidentical hematopoietic stem cell transplantation and the comparison with HLA-identical sibling hematopoietic stem cell transplantation. Bone Marrow Transplant 2016;51:391–397. [DOI] [PubMed] [Google Scholar]
- [6].Styczynski J, van der Velden W, Fox CP, et al. ; Sixth European Conference on Infections in Leukemia, a joint venture of the Infectious Diseases Working Party of the European Society of Blood and Marrow Transplantation (EBMT-IDWP), the Infectious Diseases Group of the European Organization for Research and Treatment of Cancer (EORTC-IDG), the International Immunocompromised Host Society (ICHS) and the European Leukemia Net (ELN). Management of Epstein-Barr virus infections and post-transplant lymphoproliferative disorders in patients after allogeneic hematopoietic stem cell transplantation: Sixth European Conference on Infections in Leukemia (ECIL-6) guidelines. Haematologica 2016;101:803–811. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].Liu L, Liu Q, Feng S. Management of Epstein-Barr virus-related post-transplant lymphoproliferative disorder after allogeneic hematopoietic stem cell transplantation. Ther Adv Hematol 2020;11:2040620720910964. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [8].Liu L, Zhang X, Feng S. Epstein-Barr virus-related post-transplantation lymphoproliferative disorders after allogeneic hematopoietic stem cell transplantation. Biol Blood Marrow Transplant 2018;24:1341–1349. [DOI] [PubMed] [Google Scholar]
- [9].Zhou L, Gao Z-Y, Lu D-P. Incidence, risk factors, and clinical outcomes associated with Epstein-Barr virus-DNAemia and Epstein-Barr virus-associated disease in patients after haploidentical allogeneic stem cell transplantation: a single-center study. Clin Transplant 2020;34:e13856. [DOI] [PubMed] [Google Scholar]
- [10].Xuan L, Huang F, Fan Z, et al. Effects of intensified conditioning on Epstein-Barr virus and cytomegalovirus infections in allogeneic hematopoietic stem cell transplantation for hematological malignancies. J Hematol Oncol 2012;5:46. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [11].Xiong G, Zhang B, Huang M-y, et al. Epstein-Barr virus (EBV) infection in Chinese children: a retrospective study of age-specific prevalence. PLoS One 2014;9:e99857. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [12].Gao X-N, Lin J, Wang L-J, et al. Risk factors and clinical outcomes of Epstein-Barr virus DNAemia and post-transplant lymphoproliferative disorders after haploidentical and matched-sibling PBSCT in patients with hematologic malignancies. Ann Hematol 2019;98:2163–2177. [DOI] [PubMed] [Google Scholar]
- [13].Ru Y, Zhang X, Song T, et al. Epstein-Barr virus reactivation after allogeneic hematopoietic stem cell transplantation: multifactorial impact on transplant outcomes. Bone Marrow Transplant 2020;55:1754–1762. [DOI] [PubMed] [Google Scholar]
- [14].van Esser JW, van der Holt B, Meijer E, et al. Epstein-Barr virus (EBV) reactivation is a frequent event after allogeneic stem cell transplantation (SCT) and quantitatively predicts EBV-lymphoproliferative disease following T-cell--depleted SCT. Blood 2001;98:972–978. [DOI] [PubMed] [Google Scholar]
- [15].Shen M-Z, Hong S-D, Lou R, et al. A comprehensive model to predict severe acute graft-versus-host disease in acute leukemia patients after haploidentical hematopoietic stem cell transplantation. Exp Hematol Oncol 2022;11:25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [16].Wang Y, Liu QF, Lin R, et al. Optimizing antithymocyte globulin dosing in haploidentical hematopoietic cell transplantation: long-term follow-up of a multicenter, randomized controlled trial. Sci Bull 2021;66:2498–2505. [DOI] [PubMed] [Google Scholar]
- [17].Ma YR, Zhang X, Xu L, et al. G-CSF-primed peripheral blood stem cell haploidentical transplantation could achieve satisfactory clinical outcomes for acute leukemia patients in the first complete remission: a registered study. Front Oncol 2021;11:631625.
- [18].Mo XD, Hong SD, Zhao YL, et al. Basiliximab for steroid-refractory acute graft-versus-host disease: a real-world analysis. Am J Hematol 2022;97:458–469.
- [19].Shen MZ, Liu XX, Qiu ZY, et al. Efficacy and safety of mesenchymal stem cells treatment for multidrug-resistant graft-versus-host disease after haploidentical allogeneic hematopoietic stem cell transplantation. Ther Adv Hematol 2022;13:20406207211072838.
- [20].Zhao JY, Liu SN, Xu LP, et al. Ruxolitinib is an effective salvage treatment for multidrug-resistant graft-versus-host disease after haploidentical allogeneic hematopoietic stem cell transplantation without posttransplant cyclophosphamide. Ann Hematol 2021;100:169–180.
- [21].Liu SN, Zhang XH, Xu LP, et al. Prognostic factors and long-term follow-up of basiliximab for steroid-refractory acute graft-versus-host disease: updated experience from a large-scale study. Am J Hematol 2020;95:927–936.
- [22].Fan S, Shen MZ, Zhang XH, et al. Preemptive immunotherapy for minimal residual disease in patients with t(8;21) acute myeloid leukemia after allogeneic hematopoietic stem cell transplantation. Front Oncol 2021;11:773394.
- [23].Shen MZ, Li JX, Zhang XH, et al. Meta-analysis of interleukin-2 receptor antagonists as the treatment for steroid-refractory acute graft-versus-host disease. Front Immunol 2021;12:749266.
- [24].Shen M-Z, Zhang X-H, Xu L-P, et al. Preemptive interferon-α therapy could protect against relapse and improve survival of acute myeloid leukemia patients after allogeneic hematopoietic stem cell transplantation: long-term results of two registry studies. Front Immunol 2022;13:757002.
- [25].Wang Y, Wu DP, Liu QF, et al. Low-dose post-transplant cyclophosphamide and anti-thymocyte globulin as an effective strategy for GVHD prevention in haploidentical patients. J Hematol Oncol 2019;12:88.
- [26].Ru Y, Zhu J, Song T, et al. Features of Epstein–Barr virus and cytomegalovirus reactivation in acute leukemia patients after Haplo-HCT with myeloablative ATG-containing conditioning regimen. Front Cell Infect Microbiol 2022;12:865170.
- [27].Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 2003;3:1157–1182.
- [28].Seabold S, Perktold J. Statsmodels: econometric and statistical modeling with python. In Proceedings of the 9th Python in Science Conference. 2010:61:10–25080. Austin, Texas.
- [29].Hastie T. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2009:1–758. New York: Springer.
- [30].Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem 1993;39:561–577.
- [31].Guo H, Liu H, Wu C, Zhi W, Xiao Y, She W. Logistic discrimination based on G-mean and F-measure for imbalanced problem. J Intell Fuzzy Syst 2016;31:1155–1166.
- [32].Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J. LIBLINEAR: a library for large linear classification. J Mach Learn Res 2008;9:1871–1874.
- [33].Zhu C, Byrd R, Lu P, Nocedal J. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw 1997;23:550–560.
- [34].Armand P, Kim HT, Logan BR, et al. Validation and refinement of the disease risk index for allogeneic stem cell transplantation. Blood 2014;123:3664–3671.
- [35].Mo X-D, Zhang X-H, Xu L-P, et al. Disease risk comorbidity index for patients receiving haploidentical allogeneic hematopoietic transplantation. Engineering (Beijing, China) 2021;7:162–169.
- [36].Gooley TA, Leisenring W, Crowley J, Storer BE. Estimation of failure probabilities in the presence of competing risks: new representations of old estimators. Stat Med 1999;18:695–706.
- [37].Blumer A, Ehrenfeucht A, Haussler D, Warmuth M. Learnability and the Vapnik-Chervonenkis dimension. J ACM 1989;36:929–965.
- [38].Abu-Mostafa YS. The Vapnik-Chervonenkis dimension: information versus complexity in learning. Neural Comput 1989;1:312–317.
- [39].Ying X. An overview of overfitting and its solutions. J Phys Conf Ser 2019;1168:22022.
- [40].Ng AY. Feature selection, L1 vs. L2 regularization, and rotational invariance. Proceedings of the Twenty-First International Conference on Machine Learning. 2004, July:615–622. New York, USA.
- [41].King G, Zeng L. Logistic regression in rare events data. Political Anal 2001;9:137–163.
- [42].Dominietto A, Tedone E, Soracco M, et al. In vivo B-cell depletion with rituximab for alternative donor hemopoietic SCT. Bone Marrow Transplant 2012;47:101–106.
- [43].Heslop HE, Slobod KS, Pule MA, et al. Long-term outcome of EBV-specific T-cell infusions to prevent or treat EBV-related lymphoproliferative disease in transplant recipients. Blood 2010;115:925–935.
- [44].Kanakry JA, Kasamon YL, Bolaños-Meade J, et al. Absence of post-transplantation lymphoproliferative disorder after allogeneic blood or marrow transplantation using post-transplantation cyclophosphamide as graft-versus-host disease prophylaxis. Biol Blood Marrow Transplant 2013;19:1514–1517.





