To the Editor
Myelodysplastic syndromes (MDS) are a heterogeneous class of diseases that impair the ability of hematopoietic stem cells to mature into healthy blood cells. The prognosis for MDS is variable, but well-established risk scoring systems can effectively delineate the risks of death and progression to acute myelogenous leukemia (AML). Allogeneic hematopoietic stem cell transplantation (allo-HCT) is the only curative therapy for MDS, and approximately 30% of MDS patients eventually undergo allo-HCT.1 The National Marrow Donor Program (NMDP)/Be The Match maintains a large stem cell donor registry that can be searched for potential matches for allo-HCT. For each patient who requires an allo-HCT, a related donor match is sought first. For patients without an available related donor, the NMDP registry is searched for unrelated donors based on a variety of characteristics including HLA type, age, sex, and blood group.2 For matched unrelated donors, previous studies have demonstrated the importance of finding the youngest possible donor.3 Even with these donor selection criteria in place, the 5-year survival after allo-HCT for patients with MDS is 40–50%. Thus, there remains a critical need to find additional donor characteristics that may optimize matching criteria and improve outcomes for patients with MDS who undergo allo-HCT.
A previous study implicated four donor genetic loci that were associated with disease-related mortality in a cohort of MDS, AML, and acute lymphoblastic leukemia patients.4 We hypothesized that additional genetic factors in donors may be predictive of outcomes after allo-HCT in patients with MDS. To investigate this hypothesis, we performed whole-genome sequencing (WGS) of whole blood DNA from a cohort of 494 donors who were matched to MDS patients who underwent allo-HCT. With these data, we conducted a genome-wide scan of the donor DNA to identify genetic regions that are associated with overall survival (OS) in the matched MDS patients. Our results implicate novel genetic regions in donors that may be important in determining outcomes after allo-HCT.
Data generation
The data source, sample preparation, and WGS procedures are described elsewhere.5 Briefly, the donor samples were obtained from the research repository (NCT04920474) of the Center for International Blood and Marrow Transplant Research (CIBMTR), a research collaboration between the NMDP/Be The Match and the Medical College of Wisconsin (MCW). This study was approved by the NMDP Institutional Review Board. Table S1 displays the characteristics of the donor cohort included in our study along with the matched MDS patient cohort. Procedures for DNA extraction, library preparation, sequencing, quality control, and principal component analysis are described in the Supplementary Methods.
Statistical association analysis
We tested each single nucleotide polymorphism (SNP) for association with OS of the matched MDS patient using a Cox proportional hazards model that included donor age, donor sex, donor type, and the first 10 genetically inferred principal components as covariates. We used the genome-wide threshold of p-value < 5×10⁻⁸ to declare statistical significance. For variants with statistically significant associations, we further tested for association with relapse (REL) and treatment-related mortality (TRM). Since REL and TRM are competing risks, we tested these outcomes with cause-specific hazard models. To investigate the impact of rare genetic variants in donors, we conducted rare-variant burden tests of nonsynonymous protein-coding variants (see Supplementary Methods for details). To separate the potential genetic effects of related donors from those of unrelated donors, we repeated all of the above analyses with only the 397 unrelated donors. Cox models in the unrelated donors included donor age, donor sex, and the first 10 genetically inferred principal components as covariates (Figures S1, S2).
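To make the modeling steps above concrete, the following is a minimal sketch of a single-SNP Cox fit and of the recoding used for a cause-specific hazard. It is an illustration only and not the analysis code used in this study: it omits the covariates (donor age, sex, type, and principal components), and the function names, outcome codes, and toy data are all hypothetical.

```python
import math
from statistics import NormalDist


def cause_specific_events(status, cause):
    """Recode outcomes for a cause-specific hazard: only `cause` counts as an
    event; censoring and competing events are both treated as censored."""
    return [1 if s == cause else 0 for s in status]


def _score_and_information(beta, times, events, x):
    """First derivative and negative second derivative of the Cox log partial
    likelihood for one covariate (Breslow handling of tied event times)."""
    score, info = 0.0, 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean_x = s1 / s0
        score += x_i - mean_x
        info += s2 / s0 - mean_x ** 2
    return score, info


def fit_cox_single_snp(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of h(t | x) = h0(t) * exp(beta * x), no covariates."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = _score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Re-evaluate at the final estimate for the Wald standard error.
    score, info = _score_and_information(beta, times, events, x)
    se = 1.0 / math.sqrt(info)
    p_value = 2.0 * NormalDist().cdf(-abs(beta / se))
    return {"beta": beta, "hr": math.exp(beta), "se": se,
            "p": p_value, "score": score}


# Hypothetical toy data (not from the study): follow-up time, outcome code
# (0 = censored, 1 = relapse, 2 = TRM), and donor minor-allele count.
times = [2, 3, 4, 6, 7, 9, 12, 15]
status = [1, 2, 2, 1, 1, 0, 2, 0]
genotype = [1, 0, 1, 0, 1, 0, 0, 0]

any_event = [1 if s != 0 else 0 for s in status]
fit_any = fit_cox_single_snp(times, any_event, genotype)

relapse_only = cause_specific_events(status, cause=1)
fit_relapse = fit_cox_single_snp(times, relapse_only, genotype)

print(f"any event: HR = {fit_any['hr']:.2f}, p = {fit_any['p']:.3f}")
print(f"relapse:   HR = {fit_relapse['hr']:.2f}, p = {fit_relapse['p']:.3f}")
```

In a genome-wide scan this fit would be repeated for every SNP, with each p-value compared against the 5×10⁻⁸ threshold. An established survival analysis package would normally be used in place of this hand-written optimizer.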
Common variant associations
There were 1,174,155 SNPs that passed quality control with a minor allele frequency (MAF) ≥ 1% and were available for statistical association testing. In the analysis of the full donor cohort, we identified one SNP (rs111224634) that was significantly associated with OS (MAF=0.012, HR=5.96, p-value=2.39×10⁻⁸, Figure 1). Across the genome, the p-values were well calibrated (Figures S3, S4), with a genomic control (GC) lambda of 1.071. Notably, the association between rs111224634 and OS appears to be driven by associations with both REL and TRM (Table S2). To determine whether this association was independent of patient and disease factors, we ran a model adjusting for 11 additional patient and disease factors (see Supplementary Methods for details). The associations with rs111224634 remained statistically significant after adjusting for these factors (Table S3). In our analyses of the unrelated donors (Figures S5, S6), the association between rs111224634 and OS was even stronger (HR=6.67, p-value=4.96×10⁻⁹), suggesting that the inclusion of related donors attenuates the signal at this locus. In the GWAS analysis of OS in the unrelated donor cohort, we identified two additional loci that reached genome-wide significance. At the sentinel SNPs for these two loci, the signals appear to be primarily attributable to an association with TRM (Table S2).
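The GC lambda reported above is a calibration summary. One common definition, assumed here because the exact computation is not given in this letter, is the median of the 1-degree-of-freedom chi-square statistics implied by the p-values divided by the median of that distribution under the null (about 0.455). The sketch below uses hypothetical p-values and a hypothetical function name.

```python
from statistics import NormalDist, median


def gc_lambda(p_values):
    """Genomic control lambda from two-sided p-values (each in (0, 1]).

    Each p-value is converted to the equivalent 1-df chi-square statistic,
    and the median statistic is divided by the null median.
    """
    norm = NormalDist()
    # For a two-sided p-value, chi2 = z**2 with z = inv_cdf(p / 2); using the
    # lower tail keeps precision for very small p-values.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    null_median = norm.inv_cdf(0.75) ** 2  # median of chi-square with 1 df
    return median(chi2) / null_median


# Hypothetical p-values: an evenly spaced grid mimics a well-calibrated scan,
# and squaring each value mimics systematic inflation.
calibrated = [(i + 0.5) / 1000 for i in range(1000)]
inflated = [p * p for p in calibrated]

lambda_calibrated = gc_lambda(calibrated)
lambda_inflated = gc_lambda(inflated)
print(round(lambda_calibrated, 3), round(lambda_inflated, 3))
```

Values near 1, such as the 1.071 reported here, indicate little genome-wide inflation of the test statistics.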
Figure 1: Kaplan-Meier curve for overall survival in the MDS patients by donor genotype for rs111224634, adjusted for donor age, donor sex, donor group, and PC1-PC10. The survival curve for patients matched with donors that have the CC genotype is shown in red and the survival curve for patients matched with donors that have the CT genotype is shown in blue. The table above the plot shows the distribution of donor genotypes for rs111224634 by the matched patient transplant outcomes.
Rare variant associations
In addition to our association analyses of common variants (MAF ≥ 1%), we ran rare-variant burden tests for all protein-coding genes that had at least two missense, nonsense, or loss-of-function variants with MAF < 1% and a cumulative minor allele count greater than 9. We observed one statistically significant gene-level result, in both the full donor cohort and the unrelated donor cohort (Table S4). Rare variants in the GFAP gene were associated with overall survival in all donors (p=4.89×10⁻⁶) and in the unrelated donors (p=4.69×10⁻⁶). In both cohorts, the associations with overall survival were attributable to associations with both relapse and TRM (Table S4).
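The gene-level inclusion criteria described above can be sketched as a simple collapsing step. This is an assumption-laden illustration rather than the procedure in the Supplementary Methods: genotypes are assumed to be coded as minor-allele counts (0, 1, or 2), functional annotation filtering is assumed to have been done upstream, and the function names, variant names, and data are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF for one variant, given per-donor minor-allele counts (0, 1, or 2)."""
    return sum(genotypes) / (2 * len(genotypes))


def burden_scores(variants, max_maf=0.01, min_variants=2, min_mac=10):
    """Collapse a gene's qualifying rare variants into one score per donor.

    `variants` maps a variant ID to its per-donor minor-allele counts.
    Returns None when the gene fails the inclusion criteria: at least
    `min_variants` variants with MAF below `max_maf`, and a cumulative minor
    allele count of at least `min_mac` (that is, greater than 9).
    """
    rare = [g for g in variants.values()
            if 0 < minor_allele_frequency(g) < max_maf]
    if len(rare) < min_variants:
        return None
    if sum(sum(g) for g in rare) < min_mac:
        return None
    # Per-donor burden: total rare minor alleles carried in this gene.
    return [sum(per_donor) for per_donor in zip(*rare)]


def make_variant(n_donors, carriers):
    """Hypothetical helper: heterozygous carriers at the given donor indices."""
    genotypes = [0] * n_donors
    for i in carriers:
        genotypes[i] = 1
    return genotypes


n_donors = 500
gene = {
    "var_a": make_variant(n_donors, range(0, 6)),         # 6 alleles, MAF 0.006
    "var_b": make_variant(n_donors, range(4, 9)),         # 5 alleles, MAF 0.005
    "var_common": make_variant(n_donors, range(20, 40)),  # MAF 0.02, excluded
}
scores = burden_scores(gene)
single_variant_gene = burden_scores({"var_a": gene["var_a"]})
```

The resulting per-donor score would then take the place of a single SNP genotype in the survival model.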
In this study, we generated whole-genome sequencing (30X) data from 494 donors for allo-HCT of patients with MDS. We tested for association of both common (MAF ≥ 1%) and rare (MAF < 1%) germ-line genetic variants with OS in the matched MDS patients and identified three novel loci that harbor common variants significantly associated with OS. Of the three novel loci, one at TRBV6 harbors a common variant (rs111224634) that is associated with both REL and TRM, suggesting that its association with OS reflects effects on both outcomes. Interestingly, the association between rs111224634 and OS was stronger in the smaller subset of unrelated donors than in the full donor cohort. The TRBV6 gene encodes a variable domain of the T cell receptor beta chain that participates in antigen recognition; beta T cell receptors are present on the cell surface of T lymphocytes.6 Given the well-known influence of HLA type on donor-patient matching and subsequent transplant outcomes, this novel association at TRBV6 suggests that additional genomic regions related to adaptive immunity may also play a critical role, though further experimental studies are needed to reveal the underlying biological function of this SNP. We also show that approximately 1% of donors carry the allele at rs111224634 that increases the risk of relapse and TRM and decreases overall survival (Figure 1). If this association is validated in future studies, clinicians may consider avoiding transplant donors who carry the risk allele (T) at rs111224634. For patients whose only possible donor carries the risk allele at rs111224634, clinicians may consider non-transplant treatments, or more intensive post-transplant monitoring and/or additional maintenance therapies.
The two other novel loci associated with OS are located in intergenic regions distal from any gene. The interpretation of these two association signals is therefore complicated by the relative lack of information on the two sentinel SNPs. We also observed an association between a burden of rare variants in the GFAP gene and overall survival that was attributable to associations with both relapse and TRM. The GFAP gene product is responsible for the cytoskeleton structure of glial cells and for supporting neighboring neurons and the blood-brain barrier. The connection between this gene and post-transplant outcomes in patients with MDS remains unclear. In any case, replication of these associations in an independent cohort of donor-MDS patient pairs post allo-HCT would help to clarify the role of these loci in determining allo-HCT outcomes and help provide robust estimates of their effect sizes. Because our study was relatively small for a standard GWAS analysis, the reported HRs (Table S2, Table S4) are almost certainly inflated due to the “winner’s curse,” whereby effect estimates for loci that pass a stringent significance threshold are biased upward in finite samples.
It will be important for future work to extend our findings to indications beyond MDS. We focused on MDS on the rationale that the genetic factors influencing post-transplant outcomes would be more apparent in a disease with similar rates of relapse and TRM, and that inclusion of other diseases (such as AML) would introduce heterogeneity and background noise. Future studies of the effect of germ-line genetic variation in donors on the success of allo-HCT in MDS patients may be able to leverage the results of our study to build and test polygenic risk scores. In this way, genome-wide genetic variation may prove to be a useful source of information in building prediction models, as with other complex human traits.
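The winner's curse mentioned in the discussion above can be illustrated with a small simulation. The sketch assumes a normally distributed estimate of the log hazard ratio and uses hypothetical values (a true HR of 2 and a standard error of 0.3) that are not estimates from this study. It keeps only replicates that reach genome-wide significance and compares their average with the truth.

```python
import math
import random
from statistics import NormalDist, mean


def simulate_significant_estimates(true_log_hr, se, alpha=5e-8,
                                   n_sim=100_000, seed=0):
    """Draw log-HR estimates and keep those passing a two-sided Wald test."""
    rng = random.Random(seed)
    z_crit = -NormalDist().inv_cdf(alpha / 2.0)  # critical |z| for alpha
    kept = []
    for _ in range(n_sim):
        estimate = rng.gauss(true_log_hr, se)
        if abs(estimate) / se > z_crit:
            kept.append(estimate)
    return kept, z_crit


true_log_hr = math.log(2.0)  # hypothetical true effect, HR = 2
se = 0.3                     # hypothetical standard error of the log HR
significant, z_crit = simulate_significant_estimates(true_log_hr, se)

mean_significant_hr = math.exp(mean(significant))
print(f"{len(significant)} significant replicates; "
      f"average HR among them = {mean_significant_hr:.2f} (true HR = 2.00)")
```

Under these assumed values, an estimate can reach significance only if its log HR exceeds roughly 1.6, so the average HR among significant replicates necessarily sits well above the true value of 2. This is why replication in an independent cohort is needed for unbiased effect sizes.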
Acknowledgements
The authors would like to thank all participants in the CIBMTR. WGS was supported through N00014-17-1-2850 from the Office of Naval Research to the NMDP. The CIBMTR is supported primarily by Public Health Service U24CA076518 from the National Cancer Institute (NCI), the National Heart, Lung and Blood Institute (NHLBI) and the National Institute of Allergy and Infectious Diseases (NIAID); HHSH250201700006C from the Health Resources and Services Administration (HRSA); and N00014-21-1-2954 and N00014-23-1-2057 from the Office of Naval Research. Support is also provided by Be the Match Foundation, the Medical College of Wisconsin, the National Marrow Donor Program, and from the following commercial entities: AbbVie; Actinium Pharmaceuticals, Inc.; Adaptimmune; Adaptive Biotechnologies Corporation; ADC Therapeutics; Adienne SA; Allogene; Allovir, Inc.; Amgen, Inc.; Angiocrine; Anthem; Astellas Pharma US; AstraZeneca; Atara Biotherapeutics; BeiGene; bluebird bio, inc.; Bristol Myers Squibb Co.; CareDx Inc.; CRISPR; CSL Behring; CytoSen Therapeutics, Inc.; Eurofins Viracor, DBA Eurofins Transplant Diagnostics; Gamida-Cell, Ltd.; Gilead; GlaxoSmithKline; HistoGenetics; Incyte Corporation; Iovance; Janssen Research & Development, LLC; Janssen/Johnson & Johnson; Jasper Therapeutics; Jazz Pharmaceuticals, Inc.; Kadmon; Karius; Kiadis Pharma; Kite, a Gilead Company; Kyowa Kirin; Legend Biotech; Magenta Therapeutics; Mallinckrodt Pharmaceuticals; Medexus Pharma; Merck & Co.; Mesoblast; Millennium, the Takeda Oncology Co.; Miltenyi Biotec, Inc.; MorphoSys; Novartis Pharmaceuticals Corporation; Omeros Corporation; OptumHealth; Orca Biosystems, Inc.; Ossium Health, Inc.; Pfizer, Inc.; Pharmacyclics, LLC, An AbbVie Company; Pluristem; PPD Development, LP; Sanofi; Sanofi-Aventis U.S. Inc.; Sobi, Inc.; Stemcyte; Takeda Pharmaceuticals; Talaris Therapeutics; Terumo Blood and Cell Technologies; TG Therapeutics; Vertex Pharmaceuticals; Vor Biopharma Inc.; Xenikos BV.
The views expressed in this article do not reflect the official policy or position of the National Institutes of Health, the Department of the Navy, the Department of Defense, the Health Resources and Services Administration (HRSA), or any other agency of the U.S. Government. JD was supported by NIH K01 HL164972.
Footnotes
Competing Interests
None.
Data Availability Statement
CIBMTR supports accessibility of research in accord with the National Institutes of Health (NIH) Data Sharing Policy and the National Cancer Institute (NCI) Cancer Moonshot Public Access and Data Sharing Policy. The CIBMTR only releases de-identified datasets that comply with all relevant global regulations regarding privacy and confidentiality.
References
1. Getta BM, Kishtagari A, Hilden P, et al. Allogeneic Hematopoietic Stem Cell Transplantation Is Underutilized in Older Patients with Myelodysplastic Syndromes. Biol Blood Marrow Transplant. Jul 2017;23(7):1078–1086. doi: 10.1016/j.bbmt.2017.03.020
2. Dehn J, Spellman S, Hurley CK, et al. Selection of unrelated donors and cord blood units for hematopoietic cell transplantation: guidelines from the NMDP/CIBMTR. Blood. Sep 19 2019;134(12):924–934. doi: 10.1182/blood.2019001212
3. Logan BR, Maiers MJ, Sparapani RA, et al. Optimal Donor Selection for Hematopoietic Cell Transplantation Using Bayesian Machine Learning. JCO Clin Cancer Inform. May 2021;5:494–507. doi: 10.1200/CCI.20.00185
4. Hahn T, Wang J, Preus LM, et al. Novel genetic variants associated with mortality after unrelated donor allogeneic hematopoietic cell transplantation. EClinicalMedicine. Oct 2021;40:101093. doi: 10.1016/j.eclinm.2021.101093
5. Zhang T, Auer P, Dong J, et al. Whole-genome sequencing identifies novel predictors for hematopoietic cell transplant outcomes for patients with myelodysplastic syndrome: a CIBMTR study. J Hematol Oncol. Apr 11 2023;16(1):37. doi: 10.1186/s13045-023-01431-7
6. Lefranc MP. Immunoglobulin and T Cell Receptor Genes: IMGT® and the Birth and Rise of Immunoinformatics. Front Immunol. 2014;5:22. doi: 10.3389/fimmu.2014.00022
