To the Editor:
Although the somatic landscapes of myeloid malignancies have been extensively characterized, germline contributions to the disease are not well understood. The limited number of germline variants that have been implicated in myeloid malignancy susceptibility are mostly rare and have largely been identified in families [1, 2]. Newfound predisposition alleles are also likely to be rare, given the dearth of published common susceptibility alleles and the nature of known susceptibility variants.
To investigate the roles of rare variants in myeloid malignancy susceptibility, we analyzed whole-exome sequence data from the germlines of 690 patients (Supplementary Tables 1–3; Supplementary Figs. 1A and 2). We assessed the presence and impact of rare variants in genes with previously implicated susceptibility alleles, as well as in genes with known or putative roles in myeloid neoplasms or cancer in general. This focused our analysis on 657 genes (Supplementary Table 4). After applying a stringent variant-calling and annotation pipeline (Supplementary Figs. 3 and 4), rare variants were defined as those with population allele frequencies below 1% that alter protein composition (missense/nonsense substitutions, in-frame/frameshift indels, and splice site variants) (Supplementary Fig. 1B, C; Supplementary Table 5).
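The rare-variant definition above can be illustrated with a minimal sketch. It is not the authors' pipeline (which is described in the supplementary figures); the `Variant` record, the consequence labels, and the function names are hypothetical, and every example record except the ETV6 frequency quoted later in this letter is made up.

```python
from dataclasses import dataclass

# Hypothetical consequence labels for the protein-altering classes named in the
# text; real annotation tools use their own vocabularies.
PROTEIN_ALTERING = {
    "missense", "nonsense", "inframe_indel", "frameshift_indel", "splice_site",
}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    population_af: float  # population allele frequency as a fraction (0.01 = 1%)


def is_rare_protein_altering(v, max_af=0.01):
    """True if the variant is below the frequency cutoff and alters the protein."""
    return v.population_af < max_af and v.consequence in PROTEIN_ALTERING


def filter_rare(variants, genes_of_interest, max_af=0.01):
    """Keep rare protein-altering variants that fall in the gene panel."""
    return [
        v for v in variants
        if v.gene in genes_of_interest and is_rare_protein_altering(v, max_af)
    ]


variants = [
    Variant("ETV6", "missense", 2.83e-5),  # frequency the letter quotes for p.R353Q
    Variant("ETV6", "synonymous", 1e-4),   # made-up record: not protein-altering
    Variant("MPO", "missense", 0.05),      # made-up record: too common
    Variant("TTN", "nonsense", 1e-5),      # made-up record: gene outside the panel
]
panel = {"ETV6", "MPO"}  # stand-in for the 657-gene list
kept = filter_rare(variants, panel)
print([(v.gene, v.consequence) for v in kept])
```

Only the first record passes all three conditions (panel gene, frequency below 1%, protein-altering consequence).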
In 2016, the World Health Organization revised its classification of myeloid neoplasms and acute leukemia to include a new category, termed “myeloid neoplasms with germline predisposition” [3]. We identified 11 patients in our cohort who would fit the criteria owing to mutations in DDX41 (Supplementary Fig. 5), four of whom harbored previously described [4–6] DDX41 variants with strong evidence for disease roles. In addition, five patients were found to have ETV6 mutations (Supplementary Fig. 5). One such patient carried a p.R353Q mutation affecting the highly conserved ETS domain, a well-known hotspot for both germline and somatic mutations in ETV6 among myeloid malignancy patients [7]. The p.R353Q variant is exceedingly rare, present at an allele frequency of 2.83 × 10⁻⁵ in the gnomAD database (https://gnomad.broadinstitute.org/). We also found three patients with GATA2 mutations and six with RUNX1 mutations (Supplementary Fig. 5). The RUNX1 variants in four of the six patients were novel (neither present in any of the population databases nor assigned rsIDs), and included two alleles (p.R166Q and p.Y287X) that were previously found [8, 9] through pedigree analyses of families with familial platelet disorder with predisposition to acute myeloid leukemia (FPD/AML). In our cohort, both individuals carrying these latter alleles were pediatric cases, with ages of onset of 1 and 12 years. Overall, therefore, our cohort included up to (depending on stringency when considering whether a variant is likely predisposing) 25 patients (3.6%) with variants that would qualify them as having germline predisposition, including six (0.9%) that harbored previously identified, likely causal variants (Supplementary Table 6).
A total of 146 variants were classified as pathogenic or likely pathogenic according to American College of Medical Genetics criteria, spread across 65 genes and 137 patients (19.9%). The gene with the largest pathogenic/likely pathogenic variant burden is MPO, which harbors such a variant in 19 of the patients (Fig. 1a). MPO (myeloperoxidase) codes for a heme protein produced during myeloid differentiation. The pathogenic MPO variants found in our cohort are dominated by two substitutions: a splice site variant (c.2031–2A>C; 11 patients), and a missense variant (c.C1705T, p.R569W; seven patients) (Fig. 1b). Multispecies alignment shows that both of the corresponding wild-type alleles are highly conserved (Fig. 1b). Frequencies of the variants are significantly higher in our cohort (0.8% and 0.51% for c.2031–2A>C and p.R569W, respectively) than in the gnomAD control database (0.38% and 0.13%; P = 0.012 and 4.0 × 10⁻⁴, respectively). The allele frequencies in the recently published Beat AML study [10] of AML patients are similar (0.8% and 0.6%) to those in our cohort (Fig. 1c).
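As a rough check on the arithmetic behind these frequencies, the sketch below recomputes the cohort allele frequencies from the carrier counts and runs a one-sided exact binomial test against the gnomAD frequencies. Two assumptions are made that the text does not state: each carrier is heterozygous (one variant allele among 2 × 690 chromosomes), and the gnomAD frequency is treated as a fixed null value. The statistical test the authors used is not named in this passage, so the resulting P values need not match the reported ones.

```python
from math import comb

N_PATIENTS = 690
N_CHROMOSOMES = 2 * N_PATIENTS  # diploid germline exomes

# (carriers in cohort, gnomAD allele frequency as a fraction), from the text
variants = {
    "c.2031-2A>C": (11, 0.0038),
    "p.R569W": (7, 0.0013),
}


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1).

    Summing the short lower tail keeps every term well inside float range.
    """
    lower = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - lower


for name, (carriers, control_af) in variants.items():
    cohort_af = carriers / N_CHROMOSOMES  # assumes one variant allele per carrier
    p_value = binomial_upper_tail(carriers, N_CHROMOSOMES, control_af)
    print(f"{name}: cohort AF = {100 * cohort_af:.2f}%, one-sided P = {p_value:.2g}")
```

Under the heterozygosity assumption, 11 and 7 alleles among 1380 chromosomes round to the 0.8% and 0.51% quoted above.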
Fig. 1. Myeloperoxidase (MPO) as a candidate susceptibility gene in myeloid malignancy.
a Genes ranked by pathogenic/likely pathogenic variant burden. For clarity, only genes with more than one such variant are shown. b Rare protein-altering variants in MPO. Pathogenic/likely pathogenic variants are indicated with a dagger (†). Patient diagnosis is indicated by color and effect on amino acid sequence is indicated by shape. Also shown is multispecies alignment in regions surrounding the amino acid residue altered by the missense variant p.R569W (top) and the nucleotide residue altered by the splice variant c.2031–2A>C (bottom). c Allele frequencies of case and control cohorts for the two most recurrent pathogenic alleles in MPO. For clarity, significance is only shown for comparisons between our cohort and gnomAD. *P < 0.05; ***P < 0.0005. d, e Associations between the presence of cytogenetic lesions and the presence of a rare MPO variant (left) or pathogenic rare MPO variant (right). OR odds ratio. Filled circles indicate OR estimate, with whiskers indicating a 95% confidence interval for the OR.
MPO is an appealing candidate as a myeloid malignancy susceptibility gene for a number of reasons. It is expressed, at both the mRNA and protein levels, most prominently in blood cells [11, 12]. High MPO expression is characteristic of acute promyelocytic leukemia (APL), though none of the carriers of MPO rare variants here have the t(15;17) translocation causing APL. An earlier report associated the c.2031–2A>C variant with myeloperoxidase deficiency, and transfection of a construct harboring the variant resulted in mRNA with 109 nucleotides inserted between exons 11 and 12, yielding a protein that should lack enzymatic activity [13]. The variant was also identified as being associated with protein abundance, both in cis (MPO) and in trans (RAB26), in human plasma [14]. Germline variants in MPO (including the splice site variant discussed here) have been associated with counts of leukocytes, monocytes, neutrophils, and other blood cell types [15]. In our cohort, patients with rare variants in MPO are more likely to have cytogenetic lesions, particularly complex karyotype and deletion of chromosome 5 (Fig. 1d, e).
Examination of rare variant burden, irrespective of pathogenicity, across all 657 genes under consideration (Supplementary Fig. 6) shows that genes associated with Fanconi anemia (FA) and, in general, those associated with autosomal recessive (AR) disorders, have higher rare variant burdens than do the other genes under consideration (Wilcoxon test P = 0.002 and 0.027, respectively; Fig. 2a). Although only 3.2% of genes under consideration are FA genes, 6.6% of all rare variants and 8.9% of truncating variants (nonsense, frameshift, and splice site) are found in FA genes (P = 4.3 × 10⁻⁴ and 7.7 × 10⁻⁴, respectively; Fig. 2b). Similarly, although only 25% of genes are AR, 29% of rare variants and 47.3% of truncating variants are in AR genes (P = 0.017 and 1.0 × 10⁻⁹, respectively; Fig. 2b). Moreover, the proportion of rare variants that are truncating is significantly higher in AR genes (5.7%) and FA genes (4.6%) than in other genes (2.5%; P = 1.5 × 10⁻⁹ and 0.013, respectively; Fig. 2c).
Fig. 2. Autosomal recessive genes and Fanconi anemia genes have higher overall and truncating rare germline variant burden.
a Rare germline variant burden for genes, stratified by FA and autosomal recessive status. b White bars indicate proportions of all genes under consideration that are AR/FA genes. Gray bars indicate proportions of all rare germline variants that are in AR/FA genes. Black bars indicate proportions of truncating rare germline variants that are in AR/FA genes. c Proportions of rare variants that are truncating, classified by FA and AR status. d Presence of rare germline BRCA2 variants is associated with poor overall survival. Shading indicates a 95% confidence band. e Counts for instances of RAH/CH in autosomal recessive genes. These combined categories are enriched for Fanconi anemia genes. f Observed number of instances of RAH/CH in FA genes is at the 99.8th percentile of the permutation-based distribution (see Supplementary Materials) of number of instances expected under the null hypothesis of random chromosomal assortment. ***P < 10⁻⁷; **P < 0.003; *P < 0.03.
We also sought to examine genes where both parental alleles in a patient were impacted by rare variants (Supplementary Fig. 7). The only AR gene harboring a rare-allele homozygote (RAH) in more than one patient was BRCA2 (Supplementary Table 7). Of the three patients carrying BRCA2 RAHs, one was diagnosed at 15 years of age and another was a 57-year-old who died 8 months after diagnosis (age and survival information were not available for the third patient). In general, patients with rare germline BRCA2 variants have significantly worse outcomes with regard to overall survival (Fig. 2d; age-adjusted hazard ratio 2.62, 95% CI 1.26–5.44, P = 0.0072). We also observed a total of 95 putative (see Supplementary Material) compound heterozygotes (CHs) (Supplementary Table 8), across 85 different patients and 71 unique genes. Among the 95 CHs, 30 were in AR genes (Fig. 2e), with two in BRCA2 (Supplementary Fig. 8). A larger proportion of CH and RAH are in FA genes, compared with the proportion of all rare variants in FA genes (10.7% vs. 6.6%; P = 0.06) and compared with the proportion of FA genes among all genes under consideration (10.7% vs. 3.2%; P = 6.4 × 10⁻⁵).
A higher-than-expected number of RAH/CH events in a gene set is suggestive of a disease predisposition for individuals with biparental inheritance of rare variants in the genes. Using this principle (Supplementary Materials), we tested for enrichment of RAH/CH across all FA genes given the total number of rare alleles present in these genes across patient chromosomes. Assuming random assortment, we would expect 4.6 (95% confidence interval 1–9) instances of RAH/CH in FA genes, but we observe 12 (P = 0.002; Fig. 2f).
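The idea behind this enrichment test can be sketched as a small Monte Carlo simulation. This is an illustration only: the authors' actual procedure is in the Supplementary Materials and is not reproduced here. The sketch assumes a simplified null model in which each rare allele observed in a gene lands on a uniformly chosen patient, and a (gene, patient) pair that receives two or more alleles counts as one RAH/CH instance. The gene names and per-gene allele counts are made-up placeholders, not data from the study.

```python
import random


def count_biparental(allele_counts, n_patients, rng):
    """One null draw: scatter each gene's rare alleles over patients at random
    and count (gene, patient) pairs that receive two or more alleles."""
    instances = 0
    for n_alleles in allele_counts.values():
        hits = {}
        for _ in range(n_alleles):
            patient = rng.randrange(n_patients)
            hits[patient] = hits.get(patient, 0) + 1
        instances += sum(1 for c in hits.values() if c >= 2)
    return instances


def enrichment_p_value(allele_counts, n_patients, observed, n_draws=10_000, seed=0):
    """Empirical upper-tail P value and null mean for the observed instance count."""
    rng = random.Random(seed)
    null = [count_biparental(allele_counts, n_patients, rng) for _ in range(n_draws)]
    exceed = sum(1 for x in null if x >= observed)
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_draws + 1), sum(null) / n_draws


# Hypothetical per-gene rare allele counts (placeholders, not study data).
toy_counts = {"GENE_A": 40, "GENE_B": 25, "GENE_C": 10}
p_value, null_mean = enrichment_p_value(toy_counts, n_patients=690, observed=5, n_draws=2000)
print(f"null mean = {null_mean:.2f}, empirical P = {p_value:.4f}")
```

With the study's real per-gene counts in place of the placeholders, the null mean would play the role of the expected number of instances and the empirical P value the role of the reported P, subject to the simplifications named above.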
Somatic mutation data were available for 563 of the patients in our cohort. Genes’ cohort-wide burdens of rare germline variants do not show correlation with their somatic mutation frequencies (Supplementary Fig. 9A). Only TET2 had more than two patients harboring both rare variants and somatic mutations in the gene (Supplementary Fig. 9B), though not enough to attain statistical significance. However, it should be noted that we were likely underpowered to detect significant germline-somatic concordance.
Here we have presented results from the analysis of the largest collection, to our knowledge, of myeloid malignancy germline exomes reported to date. Interestingly, although the study cohort comprised a variety of diagnoses and a range of patient ages, we found no statistical effect of age or diagnosis in any of the results reported here (Supplementary Figs. 10–12). Our study sheds light on some mechanisms by which inherited alleles may confer risk of myeloid malignancy. The identification of such alleles has several practical benefits. Individuals carrying susceptibility alleles may benefit from increased surveillance, facilitating early preventative treatment. These individuals may also benefit from genetic counseling. Asymptomatic family members who are carriers of deleterious alleles may be excluded as candidate bone marrow donors. Finally, understanding the mechanisms-of-action of these alleles will help elucidate disease biology, potentially leading to improved patient treatment.
Supplementary Material
Acknowledgements
The authors wish to acknowledge the individuals and families that were part of the study presented here, as well as the researchers who made the exomes and metadata available to us. This study makes use of data generated by the Cancer Genomics Project at the University of Tokyo, the International Cancer Genome Consortium, the St. Jude Children’s Research Hospital Genomes for Kids Study, St. Jude Children’s Research Hospital - Washington University Pediatric Cancer Genome Project, the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative managed by the NCI, and The Cancer Genome Atlas. This work was supported in part by American Cancer Society grant 123436-RSG-12-159-01-DMC (TL), US National Institutes of Health (NIH) grant R35 HL135795 (JPM), the Instituto de Salud Carlos III, Ministerio de Economia y Competitividad, Spain (PI/17/0575), 2017 SGR288 (GRC) Generalitat de Catalunya, and CERCA Program/Generalitat de Catalunya, Fundació Internacional Josep Carreras and from Celgene International (FS). The research leading to this work has received funding from “la Caixa” Foundation (FS). This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University.
Footnotes
Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Supplementary information The online version of this article (https://doi.org/10.1038/s41375-019-0701-8) contains supplementary material, which is available to authorized users.
References
- 1. Churpek JE, Pyrtel K, Kanchi KL, Shao J, Koboldt D, Miller CA, et al. Genomic analysis of germ line and somatic variants in familial myelodysplasia/acute myeloid leukemia. Blood. 2015;126:2484–90.
- 2. Godley LA, Shimamura A. Genetic predisposition to hematologic malignancies: management and surveillance. Blood. 2017;130:424–32.
- 3. Arber DA, Orazi A, Hasserjian R, Thiele J, Borowitz MJ, Le Beau MM, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127:2391–405.
- 4. Polprasert C, Schulze I, Sekeres MA, Makishima H, Przychodzen B, Hosono N, et al. Inherited and somatic defects in DDX41 in myeloid neoplasms. Cancer Cell. 2015;27:658–70.
- 5. Guidugli L, Johnson AK, Alkorta-Aranburu G, Nelakuditi V, Arndt K, Churpek JE, et al. Clinical utility of gene panel-based testing for hereditary myelodysplastic syndrome/acute leukemia predisposition syndromes. Leukemia. 2017;31:1226–9.
- 6. Cheah JJC, Hahn CN, Hiwase DK, Scott HS, Brown AL. Myeloid neoplasms with germline DDX41 mutation. Int J Hematol. 2017;106:163–74.
- 7. Zhang MY, Churpek JE, Keel SB, Walsh T, Lee MK, Loeb KR, et al. Germline ETV6 mutations in familial thrombocytopenia and hematologic malignancy. Nat Genet. 2015;47:180–5.
- 8. Song WJ, Sullivan MG, Legare RD, Hutchings S, Tan X, Kufrin D, et al. Haploinsufficiency of CBFA2 causes familial thrombocytopenia with propensity to develop acute myelogenous leukaemia. Nat Genet. 1999;23:166–75.
- 9. Michaud J, Wu F, Osato M, Cottles GM, Yanagida M, Asou N, et al. In vitro analyses of known and novel RUNX1/AML1 mutations in dominant familial platelet disorder with predisposition to acute myelogenous leukemia: implications for mechanisms of pathogenesis. Blood. 2002;99:1364–72.
- 10. Tyner JW, Tognon CE, Bottomly D, Wilmot B, Kurtz SE, Savage SL, et al. Functional genomic landscape of acute myeloid leukaemia. Nature. 2018;562:526–31.
- 11. Schmidt T, Samaras P, Frejno M, Gessulat S, Barnert M, Kienegger H, et al. ProteomicsDB. Nucleic Acids Res. 2018;46:D1271–D81.
- 12. GTEx Consortium. Human genomics. The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science. 2015;348:648–60.
- 13. Marchetti C, Patriarca P, Solero GP, Baralle FE, Romano M. Genetic characterization of myeloperoxidase deficiency in Italy. Hum Mutat. 2004;23:496–505.
- 14. Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, et al. Genomic atlas of the human plasma proteome. Nature. 2018;558:73–9.
- 15. Astle WJ, Elding H, Jiang T, Allen D, Ruklisa D, Mann AL, et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell. 2016;167:1415–29.e19.