Despite an ever more granular elucidation of genetic mutations in subtypes of acute myeloid leukemia (AML), the precise events linking gene changes to distinct phenotypes remain elusive. Mulet-Lazaro et al report that epigenetic dysregulation in a particular AML subtype (good prognosis CEBPA double mutant) leads to allele-specific expression of the key stem cell-regulating transcription factor GATA2. This could have implications for our understanding of leukemogenesis.
Key Points
GATA2 ASE is a somatic event strongly associated with CEBPA DMs in AML.
GATA2 ASE results from silencing of 1 allele by promoter methylation and overactivation of a superenhancer in the other allele.
Abstract
Transcriptional deregulation is a central event in the development of acute myeloid leukemia (AML). To identify potential disturbances in gene regulation, we conducted an unbiased screen of allele-specific expression (ASE) in 209 AML cases. The gene encoding GATA binding protein 2 (GATA2) displayed ASE more often than any other myeloid- or cancer-related gene. GATA2 ASE was strongly associated with CEBPA double mutations (DMs), with 95% of cases presenting GATA2 ASE. In CEBPA DM AML with GATA2 mutations, the mutated allele was preferentially expressed. We found that GATA2 ASE was a somatic event lost in complete remission, supporting the notion that it plays a role in CEBPA DM AML. Acquisition of GATA2 ASE involved silencing of 1 allele via promoter methylation and concurrent overactivation of the other allele, thereby preserving expression levels. Notably, promoter methylation was also lost in remission along with GATA2 ASE. In summary, we propose that GATA2 ASE is acquired by epigenetic mechanisms and is a prerequisite for the development of AML with CEBPA DMs. This finding constitutes a novel example of an epigenetic hit cooperating with a genetic hit in the pathogenesis of AML.
Introduction
Transcriptional deregulation is a central event in cancer development.1 In acute myeloid leukemia (AML), most driver mutations occur in genes related to transcription, RNA splicing, chromatin regulation, and/or DNA methylation.2 In addition to mutations in protein-coding genes, alterations involving cis-regulatory elements play a critical role in aberrant gene expression in AML.3 Examples include aberrant expression of EVI1 through translocation of the distal GATA2 superenhancer in AML with 3q26 aberrations4 or focal amplification of distal MYC enhancers in AML with copy-number changes in 8q24.5 Other mechanisms identified in other malignancies include DNA alterations in cis-regulatory regions6 and changes in binding sites for CTCF and cohesin.7 Finally, in the absence of sequence variation, DNA methylation can modify gene expression, either directly by inducing promoter silencing8 or by preventing CTCF binding.9
Alterations in cis-regulatory regions usually affect a single DNA copy, leading to unbalanced expression of each allele controlled by these regulatory regions. For example, the gain of a superenhancer selectively increases gene expression only in the allele in which the new superenhancer is created.10 This phenomenon, termed allele-specific expression (ASE), can therefore serve as a telltale marker for cis-regulatory variation.11 In addition to acting as a surrogate marker, ASE can directly play a pathogenic role (eg, by haploinsufficiency or preferential expression of a mutated protein).12 Moreover, ASE of specific genes may be associated with increased risk of cancer development13 or progression,14 as has been shown for colon,15 breast, and ovarian cancers.16
Extensive data focusing on the occurrence and relevance of ASE in AML are lacking. Here, we carried out a systematic study of ASE in AML to uncover genes whose aberrant expression is caused by abnormalities in cis-regulatory elements. To this end, we generated whole-exome sequencing (WES) and RNA sequencing (RNA-seq) data in a large representative cohort of patients with AML and identified genes that recurrently exhibited ASE. Among those, GATA2 stood out prominently and exhibited a strong association with CEBPA double mutations (DMs). A multiomic analysis of the GATA2 regulatory region showed that ASE is a result of concomitant promoter methylation on 1 allele and compensatory enhancer activation on the other allele.
Methods
ASE
To discriminate expression from different alleles, WES and RNA-seq data were integrated using an in-house Python script. First, single-nucleotide variants (SNVs) were detected in the WES data; second, allele-specific read counts at every SNV were computed in both the WES and RNA-seq data. SNVs with <9 WES reads or <5 RNA-seq reads were excluded. Information was aggregated over all the SNVs in a gene, and ASE was called when a χ² test yielded a false discovery rate <0.05 and the RNA variant allele frequency (VAF) was <0.35 (Figure 1). An RNA VAF <0.1 was defined as the threshold for monoallelic expression. After the initial exploratory screen, a targeted, manually curated analysis was conducted on GATA2 to identify cases missed by the automated pipeline; here, ASE was defined solely by an RNA minor allele frequency <0.35 for SNVs with >20 reads.
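The calling rule described above can be illustrated with a minimal Python sketch. It is not the in-house script: the `SnvCounts` structure, the function names, the way SNVs are oriented and pooled (by the less expressed RNA allele, without phasing), and the exact form of the χ² test (RNA versus DNA allele counts in a 2 × 2 table, 1 degree of freedom) are all assumptions made for illustration. Only the read-depth filters and the 0.05, 0.35, and 0.1 thresholds come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class SnvCounts:
    """Allele-specific read counts at one heterozygous SNV (hypothetical structure)."""
    dna_ref: int
    dna_alt: int
    rna_ref: int
    rna_alt: int


def chi2_p_value(a, b, c, d):
    """Chi-square test (1 df, no continuity correction) on the table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 1.0
    stat = n * (a * d - b * c) ** 2 / margins
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(stat / 2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def summarize_gene(snvs, min_dna=9, min_rna=5):
    """Pool SNVs of one gene; return (RNA minor allele frequency, P value) or None."""
    rna_minor = rna_major = dna_minor = dna_major = 0
    for s in snvs:
        # Depth filters from the Methods: drop SNVs with <9 WES or <5 RNA-seq reads.
        if s.dna_ref + s.dna_alt < min_dna or s.rna_ref + s.rna_alt < min_rna:
            continue
        # Assumption: orient each SNV by its less expressed RNA allele.
        if s.rna_alt <= s.rna_ref:
            rna_minor, rna_major = rna_minor + s.rna_alt, rna_major + s.rna_ref
            dna_minor, dna_major = dna_minor + s.dna_alt, dna_major + s.dna_ref
        else:
            rna_minor, rna_major = rna_minor + s.rna_ref, rna_major + s.rna_alt
            dna_minor, dna_major = dna_minor + s.dna_ref, dna_major + s.dna_alt
    if rna_minor + rna_major == 0:
        return None
    freq = rna_minor / (rna_minor + rna_major)
    return freq, chi2_p_value(rna_minor, rna_major, dna_minor, dna_major)


def classify(minor_freq, q_value, fdr=0.05, skew=0.35, mono=0.1):
    """Apply the thresholds quoted in the Methods to one gene."""
    if q_value >= fdr or minor_freq >= skew:
        return "balanced"
    return "monoallelic" if minor_freq < mono else "skewed"
```

In a cohort-wide screen, the per-gene P values would first be passed through `benjamini_hochberg` and the adjusted values handed to `classify`.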
Statistical association between mutations and genes with ASE
We calculated the statistical association between every possible pair of mutated genes and genes with ASE based on the cooccurrence of these 2 events in the patient cohort using Fisher’s exact test. For descriptive statistics and hypothesis tests involving clinical variables, the R package Atable17 was used with customized settings and functions.
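As a hedged illustration of the cooccurrence test, the sketch below implements a two-sided Fisher's exact test with the Python standard library only. The function name is hypothetical, and the two-sided convention (summing all tables no more probable than the observed one) is an assumption about how the test was configured. The example counts are the CEBPA DM rows of Table 2.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as probable as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Rows: CEBPA DM positive / negative; columns: GATA2 ASE / non-ASE (Table 2).
p_cebpa_dm = fisher_exact_two_sided(21, 0, 82, 67)
print(f"P = {p_cebpa_dm:.2e}")
```

With these counts the result should be on the order of 10⁻⁵, in the same range as the value reported for the curated GATA2 analysis. Small differences from published values are possible depending on the software and the two-sided convention used.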
Methylation analyses
Methylation analyses of the GATA2 locus were conducted using enhanced reduced representation bisulfite sequencing (ERRBS) data previously published by our group18 and bisulfite amplicon sequencing. Raw aligned reads and methylated base calls for CpGs were imported, filtered, and normalized with the package methylKit19 (version 1.13.1). Comparisons across groups of interest (CEBPA DMs and AML with and without GATA2 ASE) were performed with methylKit, and average methylation levels were plotted along the GATA2 gene with Gviz20 (version 1.28.3).
Allele-specific methylation of GATA2 promoters was studied with CRISPR/Cas9-targeted enrichment followed by amplification-free long-read sequencing by Oxford Nanopore.21 Methylation likelihood ratios were estimated with Nanopolish22 and plotted separately for each allele using Gviz.
ChIP-seq and ATAC-seq analyses
Chromatin immunoprecipitation with sequencing (ChIP-seq) and assay for transposase-accessible chromatin with sequencing (ATAC-seq) data were generated for a number of selected patients to investigate changes in enhancer and promoter regions. ChIP-seq and ATAC-seq were performed as described previously with slight modifications.23,24 ChIP-seq reads were aligned to the human reference genome build hg19 with bowtie, and bigwig files were generated for visualization with bedtools genomecov25 (version 2.27.1) and bedGraphToBigWig.26 ATAC-seq reads were aligned to the human reference genome build hg19 with bowtie227 (version 2.3.4.1), which is recommended for longer reads, and mitochondrial and duplicate reads were excluded. Bigwig files were generated as described.
Enhancer regions were defined for quantification of enhancer RNA from RNA-seq, as well as H3K27ac, H3K27me3, and ATAC-seq reads. Read counts in enhancer regions were computed with featureCounts28 (version 1.5.0-p3), and differential analysis was conducted with DESeq229 (version 1.24.0). The results of this analysis were plotted in the GATA2 region with Gviz20 (version 1.28.3).
An extended description of the methods is provided in the data supplement. Quality metrics for the sequencing data generated in this study are available in supplemental Table 1.
Results
GATA2 is the most recurrent gene with ASE in AML
To identify instances of epigenetic dysregulation in AML, we performed WES and RNA-seq on leukemic blasts from 209 AML patients, representing all major subtypes of the disease. Combining both data sets, we assessed ASE in every gene with informative (nonhomozygous) SNVs (Figure 1). Patients had a median of 36 genes with ASE, several of which were recurrently detected across multiple patients (525 in ≥5 patients). The number of genes with ASE was quite stable across patients and was comparable to findings in healthy donors (data not shown), making it unlikely that global mechanisms dictate ASE in AML. No association between genes with ASE in neighboring loci was detected across patients, indicating that causes of ASE were specific to each gene. The degree of ASE, measured by VAF in the RNA, varied widely across genes and patients; 22% of the ASE events were classified as monoallelic (VAF <0.1).
To increase the likelihood of disease-relevant observations, we subsequently selected genes previously reported to be involved in either cancer (COSMIC database30) or myeloid development (Gene Ontology database: 0030099). Of the genes with ASE complying with these criteria, the 40 most recurrent across the patients of our cohort are listed in Table 1 (supplemental Table 2 provides the complete filtered list). The gene most commonly found to show ASE (37% of cases with informative SNVs) was GATA2, which encodes a transcription factor crucial for the proliferation and maintenance of hematopoietic stem cells.31
Table 1.
Gene | No. of patients* | Evaluated cases† | Samples, %‡ | COSMIC§ | Myeloid differentiation§ |
---|---|---|---|---|---|
GATA2 | 66 | 178 | 37 | Yes | Yes |
THBS1 | 36 | 124 | 29 | No | Yes |
MYH11 | 20 | 199 | 10 | Yes | No |
CA2 | 13 | 126 | 10 | No | Yes |
MECOM | 13 | 186 | 7 | Yes | No |
SH3PXD2A | 13 | 195 | 7 | No | Yes |
CDKN2A | 11 | 102 | 11 | Yes | No |
JAG1 | 11 | 201 | 5 | No | Yes |
L3MBTL3 | 11 | 156 | 7 | No | Yes |
TRIM58 | 11 | 183 | 6 | No | Yes |
CIB1 | 10 | 160 | 6 | No | Yes |
FLT3 | 10 | 175 | 6 | Yes | No |
HIP1 | 9 | 198 | 5 | Yes | No |
PDE4DIP | 9 | 198 | 5 | Yes | No |
HSP90AB1 | 8 | 147 | 5 | Yes | No |
L3MBTL1 | 8 | 155 | 5 | No | Yes |
MGMT | 8 | 150 | 5 | Yes | No |
RUNX1 | 7 | 188 | 4 | Yes | Yes |
USP6 | 7 | 149 | 5 | Yes | No |
CD101 | 6 | 174 | 3 | No | Yes |
FAT1 | 6 | 202 | 3 | Yes | No |
IRF8 | 6 | 165 | 4 | No | Yes |
MEIS1 | 6 | 156 | 4 | No | Yes |
NPM1 | 6 | 134 | 4 | Yes | No |
ABL1 | 5 | 153 | 3 | Yes | No |
CIITA | 5 | 196 | 3 | Yes | No |
DNMT3A | 5 | 188 | 3 | Yes | No |
FAM20C | 5 | 184 | 3 | No | Yes |
LTF | 5 | 188 | 3 | No | Yes |
MYB | 5 | 161 | 3 | Yes | No |
PML | 5 | 181 | 3 | Yes | Yes |
PRDM2 | 5 | 151 | 3 | Yes | No |
RMI2 | 5 | 148 | 3 | Yes | No |
RPN1 | 5 | 159 | 3 | Yes | No |
ZFHX3 | 5 | 201 | 2 | Yes | No |
AKT1 | 4 | 178 | 2 | Yes | No |
BAX | 4 | 110 | 4 | Yes | No |
BRCA1 | 4 | 170 | 2 | Yes | No |
KMT2C | 4 | 162 | 2 | Yes | Yes |
KNSTRN | 4 | 158 | 3 | Yes | No |
*No. of patients presenting with ASE for that gene.
†No. of patients with SNVs that could be evaluated in that gene.
‡Determined by dividing no. of patients by no. of evaluated cases.
§Indicates whether the gene is found in COSMIC database or is involved in myeloid differentiation (Gene Ontology database: 0030099). Note that reportedly imprinted genes (according to GeneImprint) were filtered out.
Molecular lesions in AML exhibit preferential association with gene-specific ASE
Our next question was whether there are preferential associations between genes with ASE and AML-specific mutations. To this end, we selected mutations likely to be somatic (based on their known involvement in AML, presence in COSMIC, and pathogenicity predictions) from the variants identified in the WES data (supplemental Table 3) and calculated the statistical association between every possible pair of mutated genes and genes with ASE (Figure 2).
Unsurprisingly, we found strong associations between driver chromosomal translocations and ASE of their constituent genes: t(11q23) and KMT2A, t(8;21) and RUNX1/RUNX1T1, t(15;17) and PML, t(3;3) and MECOM, and inv(16) and MYH11. Upon translocation to a different genomic region, genes previously under the control of another promoter (gene fusions) or enhancer (MECOM) acquired monoallelic expression. In addition, the analysis uncovered novel associations between ASE events and mutations, such as THBS1 with inv(16) (P = .0008), MYB with ETV6 (P = .0008), or LOX with SF3B1 (P = .0028). Among those, the association of GATA2 ASE with CEBPA DMs (P = 2.18 × 10⁻⁵) and with GATA2 mutations (P = .0004) was the strongest.
GATA2 ASE is strongly associated with CEBPA DM AML
Given the recurrence of GATA2 ASE and the prominent role of this gene in leukemogenesis, we further focused on GATA2. Therefore, using RNA-seq data, we manually inspected the GATA2 locus on the Integrative Genomics Viewer for all cases to ensure that no case had been excluded by the stringent filtering of our automated pipeline. This second analysis detected GATA2 ASE in 60% of patients with informative SNVs, a substantial increase that was due to the inclusion of untranslated regions (absent in the exome sequencing data) and the absence of P-value filtering (supplemental Figure 2). All subsequent calculations were based on this second analysis of the data.
Notably, GATA2 ASE was detected in all evaluable patients with CEBPA DMs (n = 21; Fisher’s exact test P = 1.57 × 10⁻⁵). A statistical analysis of clinically relevant variables revealed other positive associations, although weaker, of GATA2 ASE with normal karyotype, NPM1 mutations, and FLT3 internal tandem duplication mutations. There was no association with white blood cell count, age, sex, or ELN 2017 classification (Table 2). Although GATA2 ASE is widespread in AML, the t(8;21) and t(11q23) subgroups, both involving fusion proteins, were negatively associated with GATA2 ASE.
Table 2.
Variable | GATA2 ASE (n = 103), % (n) | GATA2 non-ASE (n = 67), % (n) | P* | Effect size (CI)*† |
---|---|---|---|---|
Sex | | | .34 | 0.72 (0.36-1.4) |
Female | 48 (49) | 39 (26) | | |
Male | 49 (50) | 55 (37) | | |
Missing | 3.9 (4) | 6 (4) | | |
Age, y | | | .79 | −0.19 (−0.51 to 0.13) |
Median | 48.00 | 47.00 | | |
MAD | 17.79 | 19.27 | | |
Mean | 48.70 | 45.57 | | |
SD | 16.82 | 16.30 | | |
Range | 15-86 | 17-77 | | |
Missing | 3.9 (4) | 6.0 (4) | | |
ELN classification | | | .22 | 0.14 (0-0.28) |
Adverse | 20 (21) | 30 (20) | | |
Favorable | 50 (52) | 37 (25) | | |
Intermediate | 28 (29) | 27 (18) | | |
Missing | 0.97 (1) | 6 (4) | | |
WBC count | | | .28 | 0.29 (−0.065 to 0.64) |
Median | 43.00 | 62.00 | | |
MAD | 35.88 | 52.19 | | |
Mean | 60.14 | 78.29 | | |
SD | 50.10 | 80.29 | | |
Range | 1-215 | 0-510 | | |
Missing | 15.5 (16) | 26.9 (18) | | |
NPM1 | | | .005 | 2.7 (1.3-6) |
Negative | 58 (60) | 79 (53) | | |
Positive | 42 (43) | 21 (14) | | |
FLT3-ITD | | | .0068 | 2.7 (1.3-6.2) |
Negative | 60 (62) | 81 (54) | | |
Positive | 40 (41) | 19 (13) | | |
CEBPA DMs | | | <.001 | NA (4 to NA) |
Negative | 80 (82) | 100 (67) | | |
Positive | 20 (21) | 0 (0) | | |
CEBPA SMs | | | 1 | 0.86 (0.14-6.1) |
Negative | 96 (99) | 96 (64) | | |
Positive | 3.9 (4) | 4.5 (3) | | |
CEBPA silenced | | | .75 | 0.77 (0.19-3.3) |
Negative | 94 (97) | 93 (62) | | |
Positive | 5.8 (6) | 7.5 (5) | | |
t(15;17) | | | .079 | 0.16 (0.0031-1.6) |
Negative | 99 (102) | 94 (63) | | |
Positive | 0.97 (1) | 6 (4) | | |
t(8;21) | | | .036 | 0.12 (0.0026-1.1) |
Negative | 99 (102) | 93 (62) | | |
Positive | 0.97 (1) | 7.5 (5) | | |
inv(16) | | | .1 | 0.4 (0.11-1.3) |
Negative | 94 (97) | 87 (58) | | |
Positive | 5.8 (6) | 13 (9) | | |
Normal karyotype | | | <.001 | 4.4 (2.1-9.7) |
Negative | 36 (37) | 67 (45) | | |
Positive | 57 (59) | 24 (16) | | |
Missing | 6.8 (7) | 9 (6) | | |
Complex karyotype | | | .73 | 0.75 (0.15-4) |
Negative | 70 (72) | 64 (43) | | |
Positive | 4.9 (5) | 6 (4) | | |
Missing | 25 (26) | 30 (20) | | |
Descriptive statistics and hypotheses tests were computed for patients with AML with or without GATA2 ASE using Atable.
CI, confidence interval; ELN, European LeukemiaNet; ITD, internal tandem duplication; MAD, median absolute deviation; NA, not available; SD, standard deviation; SM, single mutation; WBC, white blood cell.
*Reflects evaluation of the association between groups with or without GATA2 ASE and clinical variables.
†Effect size measured as odds ratio for categorical variables and Cohen’s d for numerical variables.
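The two effect-size measures named in the table notes can be sketched as follows. This is a minimal illustration, not the Atable implementation. The function names are hypothetical, the sample (cross-product) odds ratio is used rather than any conditional estimate, and the direction of the difference (non-ASE minus ASE) and the exclusion of patients with missing values are assumptions chosen because they reproduce the sign and approximate size of the tabulated values.

```python
import math


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d from summary statistics, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# NPM1 (Table 2): positive = 43 ASE / 14 non-ASE; negative = 60 ASE / 53 non-ASE.
npm1_or = odds_ratio(43, 14, 60, 53)

# Age (Table 2): non-ASE minus ASE, excluding the 4 missing values in each group.
age_d = cohens_d(45.57, 16.30, 67 - 4, 48.70, 16.82, 103 - 4)

print(round(npm1_or, 2), round(age_d, 2))
```

The sample odds ratio will not always match the tabulated value exactly (for normal karyotype it gives about 4.5 versus the reported 4.4), which would be expected if the reported estimates come from a conditional maximum-likelihood calculation.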
GATA2 ASE was not significantly present in other AML subtypes known to be associated with CEBPA abnormalities, such as t(8;21)32 and CEBPA-silenced leukemias, both characterized by reduced CEBPA expression33,34 (Figure 3A). Moreover, single CEBPA mutations were not associated with GATA2 ASE (P = .708). Therefore, GATA2 ASE in CEBPA DMs does not seem to be a general result of abnormalities in CEBPA function or expression.
The expressed GATA2 allele is frequently mutated in AML with CEBPA DMs
The mutated gene with the second strongest cooccurrence with GATA2 ASE was GATA2 itself (P = .0165). Interestingly, GATA2 was also mutated in 48% of the CEBPA DM cases in our cohort, and 19% carried a second subclonal GATA2 mutation (Table 3). This is in line with previous findings reporting that 40% of CEBPA DM cases cooccur with GATA2 mutations.35 In cases with a GATA2 mutation, the mutant allele was always preferentially expressed. This suggests a functional connection between GATA2 and CEBPA DMs, where ASE may play a cooperative role with GATA2 mutations.
Table 3.
Patient ID | RNA frequency* | GATA2 ASE† | GATA2 expression, TPM | GATA2 mutations, n‡ | GATA2 mutations, type (VAF)‡ | GATA2 allele expressed§ | CEBPA mutations‖ | CEBPA expression, TPM | CEBPA mutation VAF, Mut1 | CEBPA mutation VAF, Mut2 |
---|---|---|---|---|---|---|---|---|---|---|
1316 | 0.233 | Skewed | 106.2 | 0 | — | — | N/C | 483.9 | 0.462 | 0.448 |
2192 | 0.023 | Monoallelic | 456.2 | 2 | ZF1 (0.39), ZF2 (0.59) | Mut (indel), mut (0.97) | N/C | 390.3 | 0.526 | 0.486 |
2218 | 0.263 | Skewed | 67.8 | 0 | — | — | C/C | 308.9 | 0.923 | HMZ |
2234 | 0.144 | Skewed | 28.5 | 1 | ZF1 (0.03) | Mut (0.07) | N/C | 380.5 | 0.498 | 0.475 |
2240 | 0.223 | Skewed | 41.0 | 1 | ZF1 (0.02) | Mut (0.03) | N/C | 328.0 | 0.486 | 0.461 |
2242 | | Unknown | 55.5 | 0 | — | — | N/C | 162.0 | 0.472 | 0.447 |
2253 | 0.269 | Skewed | 106.2 | 2 | ZF1 (0.47), ZF2 (0.07) | Mut (0.71), mut (0.49) | N/C | 168.1 | 0.490 | 0.418 |
2273 | 0.0993 | Monoallelic | 61.0 | 1 | ZF1 (0.47) | Mut (0.92) | N/C | 161.4 | 0.488 | 0.423 |
2545 | 0.037 | Monoallelic | 106.5 | 1 | ZF2 (0.39) | Mut (0.96) | N/C | 274.7 | 0.497 | 0.484 |
2753 | 0.106 | Skewed | 40.9 | 1 | ZF1 (0.45) | Mut (0.93) | N/C | 233.7 | 0.448 | 0.441 |
3101¶ | 0.126 | Skewed | 50.9 | 0 | — | — | N/N | 194.4 | NA | NA |
3327 | 0.071 | Monoallelic | 94.1 | 0 | — | — | C/C | 86.2 | 0.918 | HMZ |
4336 | 0.285 | Skewed | 36.7 | 0 | — | — | N/C | 143.7 | 0.442 | 0.470 |
5352 | 0.174 | Skewed | 24.3 | 0 | — | — | N/C | 417.6 | 0.472 | 0.412 |
5362 | 0.064 | Monoallelic | 60.2 | 2 | ZF1 (0.03), ZF2 (0.49) | Mut (0.12), mut (0.93) | N/C | 238.8 | 0.497 | 0.464 |
5364 | 0.097 | Monoallelic | 113.9 | 0 | — | — | N/N | 427.4 | 0.283 | 0.277 |
6376 | 0.024 | Monoallelic | 43.4 | 0 | — | — | C/C | 258.7 | 0.899 | HMZ |
7142 | 0.208 | Skewed | 29.7 | 0 | — | — | N/C | 141.2 | 0.482 | 0.473 |
AML0104 | 0.107 | Monoallelic | 66.6 | 0 | — | — | C/C | 264.1 | 0.422 | HMZ |
AML0129# | 0.018 | Monoallelic | 10.1 | 0 | — | — | N/N | 169.5 | 0.035 | 0.334 |
AML0135 | 0.097 | Monoallelic | 60.3 | 2 | ZF1 (0.19), ZF2 (0.37) | Mut (0.46), mut (0.87) | N/C | 125.0 | 0.399 | 0.173 |
UKR169 | 0.051 | Monoallelic | 13.9 | 1 | ZF1 (0.45) | Mut (0.96) | N/C | 318.8 | 0.847 | HMZ |
HMZ, homozygous; mut, mutated allele; NA, not available; TPM, transcripts per million; ZF, zinc finger.
*Indicates the proportion of reads that come from the minor allele for all the single-nucleotide polymorphisms considered in the gene.
†Categorized as monoallelic for RNA frequency ≤0.10 or skewed for RNA frequency ≤0.35. The expression of GATA2 and CEBPA is presented in TPM as reported by Salmon.
‡Contains the no., type (ZF1/2), and VAF of the mutations identified in GATA2.
§Includes the VAF of these GATA2 mutations measured in the RNA.
‖VAF of the 2 CEBPA mutations, based on deep amplicon sequencing, is indicated in N- to C-terminal order.
¶Amplicon sequencing was not conducted for 3101, and CEBPA VAF was unavailable.
#AML0129 had a CEBPA mutation in only 1 allele, but the other allele was not expressed; therefore, it acted like a CEBPA HMZ mutation at the transcriptional level.
We did not observe a difference in magnitude of GATA2 ASE (measured as VAF at RNA level) between patients with CEBPA DM with or without GATA2 mutations (supplemental Figure 3C). Therefore, GATA2 ASE in CEBPA DMs occurs independently of the number of GATA2 mutations.
Our findings were further validated in the TCGA-LAML36 and Beat AML37 data sets, where all 10 patients with CEBPA DMs and informative SNVs presented GATA2 ASE (supplemental Tables 4 and 5). Of these, 3 patients carried GATA2 mutations with preferential expression of the mutated allele (supplemental Figure 4A-B).
GATA2 ASE is a somatic event in CEBPA DM AML
Our observations suggest a role of GATA2 ASE in the pathogenesis of CEBPA DM AML, which would imply that GATA2 ASE should be leukemia specific and not present in healthy controls. An analysis of bone marrow– (n = 8) or cord blood–derived (n = 3) hematopoietic stem cells from healthy individuals did not show any GATA2 ASE, indicating that GATA2 ASE is not commonly found in the general population (Figure 4A).
To examine whether GATA2 ASE is indeed present at the time of leukemia development and lost upon achieving remission after treatment, we sequenced a second series of CEBPA DM cases (n = 12) for which both diagnostic and complete remission material was available (Table 4). In these cases, targeted GATA2 DNA and complementary DNA amplicon sequencing was applied, having previously confirmed that this technique recapitulates the RNA-seq results (supplemental Figure 5). In the diagnostic samples, we again observed frequent GATA2 ASE, although slightly less frequent than in the previous series (10 [83%] of 12 cases).
Table 4.
Patient ID | RNA frequency at diagnosis, %* | Skewing† | RNA frequency at remission, %* | GATA2 mutations, n‡ | GATA2 mutations, type (VAF)‡ | GATA2 allele expressed§ | CEBPA mutations‖ | CEBPA VAF at diagnosis, Mut1 | CEBPA VAF at diagnosis, Mut2 | CEBPA VAF at remission, Mut1 | CEBPA VAF at remission, Mut2 |
---|---|---|---|---|---|---|---|---|---|---|---|
13975 | 41.21 | Not skewed | 44.48 | 0 | — | — | N/C | 0.396 | 0.459 | 0.000 | 0.000 |
14006 | 17.81 | Skewed | 47.66 | 0 | — | — | N/N | 0.882 | HMZ | 0.000 | HMZ |
14347 | 16.96 | Skewed | 46.79 | 1 | ZF1 (0.49) | Mut (0.82) | N/C | 0.457 | 0.420 | 0.000 | 0.183 |
17658 | 25.65 | Skewed | 49.15 | 0 | — | — | N/N | 0.457 | 0.460 | 0.001 | 0.000 |
18522 | 23.43 | Skewed | 46.39 | 0 | — | — | C/C | 0.781 | HMZ | 0.001 | HMZ |
24783 | 37.22 | Skewed | 47.73 | 0 | — | — | N/C | 0.446 | 0.436 | 0.000 | 0.000 |
24819 | 28.66 | Skewed | 42.23 | 1 | ZF1 (0.06) | Mut (0.10) | N/C | 0.401 | 0.316 | 0.000 | 0.000 |
27899 | 24.73 | Skewed | 42.33 | 0 | — | — | N/C | 0.470 | 0.460 | 0.000 | 0.000 |
27977 | 0.01 | Monoallelic | 0.56 | 0 | — | — | N/C | 0.503 | 0.434 | 0.000 | 0.501 |
36832 | 41.03 | Not skewed | 40.58 | 0 | — | — | N/C | 0.438 | 0.389 | 0.000 | 0.000 |
40126 | 12.85 | Skewed | 47.57 | 2 | ZF1 (0.12), ZF1 (0.07) | Mut (0.16), mut (0.11) | N/C | 0.469 | 0.469 | 0.000 | 0.000 |
47293 | 14.71 | Skewed | 41.46 | 0 | — | — | N/C | 0.435 | 0.459 | 0.000 | 0.001 |
HMZ, homozygous; mut, mutated allele.
*Indicates the proportion of reads from the minor allele for each single-nucleotide polymorphism considered, determined at diagnosis or remission.
†Categorized as monoallelic for RNA frequency ≤0.10 or skewed for RNA frequency ≤0.35.
‡Reports the VAF of the mutation at the DNA level.
§Includes the VAF measured in the RNA.
‖VAF of the 2 CEBPA mutations is indicated in N- to C-terminal order, at diagnosis and remission.
At remission, biallelic expression of GATA2 was restored in 9 of 10 CEBPA DM samples that showed GATA2 ASE at diagnosis (Figure 4B; supplemental Figure 6A). The exception, case 27977, displayed completely monoallelic expression of GATA2 at both time points, potentially indicating that GATA2 ASE preceded leukemia development in that particular patient. Interestingly, that same patient exhibited 1 N-terminal CEBPA mutation in 50% of the cells in remission, suggesting that the patient carried a germ line CEBPA mutation accompanied by germ line GATA2 ASE. In a control group of AML cases with NPM1 mutations and GATA2 ASE at diagnosis, we similarly observed GATA2 biallelic expression at remission (Figure 4C; supplemental Figure 6B).
Overall, these data indicate that GATA2 ASE is a leukemia-specific event, because it is absent in healthy cells and is lost in complete remission.
GATA2 promoters are differentially methylated in CEBPA DM AML
Methylation of CpG islands proximal to a transcriptional start site (TSS) may block transcription initiation and is correlated with loss of gene expression.38 To explore this in the context of GATA2 ASE, we analyzed ERRBS data generated in a subset (n = 35) of our AML cohort.18
The GATA2 gene encodes multiple isoforms with different TSSs, all of which overlap with a long CpG island. We defined promoters as the 1000-bp regions upstream of the TSS of isoforms expressed in AML: a short (Prom-S) and a long (Prom-L) isoform (supplemental Figure 7A). We compared methylation levels in these promoters for the following 3 groups: (1) CEBPA DM AML with GATA2 ASE (CEBPA_DM; n = 10), (2) AML without CEBPA DMs but with GATA2 ASE (Control_ASE; n = 20), and (3) AML without CEBPA DMs and without GATA2 ASE (Control_BE; n = 5; Figure 5A; supplemental Figure 7B). We identified significant hypermethylation in CEBPA DMs in the promoter of the long GATA2 form with respect to Control_ASE (P < .0001) and Control_BE (P = .0016). No significant differences were observed in Prom-S.
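The promoter definition above (1000 bp upstream of a TSS) depends on strand, which is easy to get wrong. The sketch below shows one way to derive such a window and average CpG methylation inside it. The function names, the 1-based inclusive coordinate convention, and all coordinates in the example are hypothetical and are not taken from the GATA2 annotation.

```python
def promoter_interval(tss, strand, length=1000):
    """Return the (start, end) window, 1-based inclusive, upstream of a TSS."""
    if strand == "+":
        # Upstream of a plus-strand gene lies at lower coordinates.
        return (tss - length, tss - 1)
    if strand == "-":
        # Upstream of a minus-strand gene lies at higher coordinates.
        return (tss + 1, tss + length)
    raise ValueError("strand must be '+' or '-'")


def mean_methylation(cpgs, interval):
    """Average methylation fraction of CpGs (position, fraction) inside interval."""
    start, end = interval
    values = [fraction for position, fraction in cpgs if start <= position <= end]
    return sum(values) / len(values) if values else None


# Hypothetical minus-strand TSS and CpG calls, for illustration only.
window = promoter_interval(5000, "-")
cpg_calls = [(5100, 0.8), (5900, 0.6), (7000, 0.1)]
print(window, mean_methylation(cpg_calls, window))
```

Group comparisons such as CEBPA_DM versus Control_ASE would then be run on these per-sample promoter averages or, as in the study, on per-CpG values with a dedicated package.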
For further validation, we conducted bisulfite treatment followed by amplicon sequencing of GATA2 promoters in additional samples from the original cohort: CEBPA_DM (n = 9), Control_ASE (n = 7), and Control_BE (n = 2). Here, the regions were more narrowly defined but were sequenced with a higher resolution than that achieved by ERRBS. The results confirmed the previous observations (Figure 5B; supplemental Figure 8A); the CEBPA_DM group exhibited hypermethylation in the promoter of the long GATA2 form when compared with Control_ASE (P < .0001) and Control_BE (P = .0571). Moreover, we conducted bisulfite sequencing on 4 paired diagnosis-remission samples of CEBPA DMs in which we had previously detected GATA2 ASE (Figure 4). In all cases, we observed a strong decline of methylation levels in Prom-L at remission, consistent with the notion that hypermethylation associated with GATA2 ASE is a leukemia-specific event (Figure 5C; supplemental Figure 8B).
Methylation of GATA2 promoters is allele specific and correlates with expression
To confirm that the less transcriptionally active GATA2 allele is repressed via methylation, we carried out CRISPR/Cas9-targeted enrichment of the GATA2 locus followed by amplification-free long-read sequencing in 4 CEBPA DM patients by Oxford Nanopore, which allows direct detection of methylation.39 We estimated CpG methylation likelihood in each allele separately, based on a heterozygous single-nucleotide polymorphism that also enabled ASE detection.
In general, the individual methylation patterns recapitulated the ERRBS data (supplemental Figure 9A). The results were also consistent across different methylation callers (supplemental Figure 9B). Interestingly, there were no differences in Prom-L between the 2 alleles, both of which were strongly methylated (Figure 6; supplemental Figure 9C). Although ERRBS data revealed that patients with CEBPA DM are uniquely methylated in this region, certain positions exhibited 100% methylation in the selected patients (supplemental Figure 8A). This is incompatible with allele-specific methylation and thus in line with the Nanopore results. In contrast, 3 of 4 patients presented allele-specific methylation of the less expressed allele in Prom-S. This further supports the notion that the less transcriptionally active GATA2 allele is repressed via methylation in CEBPA DMs.
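The allele-resolved comparison can be illustrated with a small sketch that groups long reads by the base they carry at a heterozygous SNP and summarizes CpG calls per allele. The read representation, the function name, and the log-likelihood ratio cutoff of 2.0 are assumptions for illustration. They do not reproduce the Nanopolish output format or the settings used in the study.

```python
def per_allele_methylation(reads, alleles, llr_cutoff=2.0):
    """Fraction of confidently methylated CpG calls per allele.

    reads: iterable of (base_at_snp, [log-likelihood ratios, one per CpG call]).
    Calls with |LLR| below the cutoff are treated as ambiguous and skipped.
    """
    counts = {allele: [0, 0] for allele in alleles}  # allele -> [methylated, total]
    for base, llrs in reads:
        if base not in counts:
            continue  # read does not carry either expected allele
        for llr in llrs:
            if llr >= llr_cutoff:
                counts[base][0] += 1
                counts[base][1] += 1
            elif llr <= -llr_cutoff:
                counts[base][1] += 1
    return {allele: (m / t if t else None) for allele, (m, t) in counts.items()}


# Hypothetical reads spanning a promoter and a heterozygous A/G SNP.
example_reads = [
    ("A", [3.0, 2.5, -0.5]),
    ("A", [4.0]),
    ("G", [-3.0, -2.2]),
    ("G", [2.1, -5.0]),
    ("T", [9.0]),
]
print(per_allele_methylation(example_reads, ("A", "G")))
```

A large gap between the two fractions in a promoter, as in this toy example, is the pattern described for Prom-S, whereas similar high values on both alleles would correspond to the Prom-L observation.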
GATA2 levels seem to be preserved by a compensatory mechanism involving its −110-kb enhancer
A comparison of expression levels across the abovementioned groups showed no loss of GATA2 transcript levels in AML patients with CEBPA DMs (Figure 7A). We hypothesized that changes in the activity of a GATA2 enhancer in cis may compensate for the absence of transcription from the other allele. The promoters of GATA2 interact with a variety of cis-regulatory elements that dictate tissue-specific expression, including the +9.5-kb intronic enhancer and the −110-kb distant superenhancer.40 The −110-kb enhancer (−77 kb in mice) is essential for embryogenesis and controls differentiation of common myeloid progenitors and granulocyte-macrophage progenitors,41 and its loss is involved in the development of AML with inv(3)/t(3;3).4 Therefore, we examined changes in the activity of this enhancer.
Differential expression analysis revealed that CEBPA DM cases exhibited increased transcription in all the elements contained within the GATA2 superenhancer (P < .05; DESeq2) when compared with other AML cases, regardless of whether they exhibited GATA2 ASE (Figure 7B). Increased transcription in enhancer regions was shown to be allele specific for 4 of 6 CEBPA DM samples for which DNA sequencing information was available in that region (Figure 7C). Likewise, levels of both H3K27ac (Figure 7D) and ATAC-seq (supplemental Figure 10A) were higher for CEBPA DM cases than any other group in the GATA2 superenhancer region. Interestingly, the patterns of allele specificity sometimes differed between enhancer RNA and H3K27ac data (Figure 7C,E).
There were no significant differences in superenhancer methylation, although it should be noted that the resolution of ERRBS in this area was low (supplemental Figure 10B). There were also no differences in H3K27me3 (supplemental Figure 10C), a mark for poised enhancers.42 H3K27me3, which is mediated by the polycomb complex PRC2, is also present in the promoters of silenced genes and might prevent transcription.43 However, we did not observe significant differences in any of the GATA2 locus regions examined, which argues against PRC2-mediated repression (supplemental Figure 10D).
Altogether, these results support the notion that inactivation of 1 GATA2 allele by methylation is compensated for by increased enhancer activity in the other allele, leading to maintenance of GATA2 levels.
Discussion
We detected GATA2 ASE in 60% of the AML cases, with a very strong association with CEBPA DMs. Analysis of additional cohorts revealed that GATA2 ASE was found in 41 (95%) of 43 CEBPA DM AML cases and was a somatic, leukemia-specific event that was lost upon remission. In cases with GATA2 mutations, the mutated allele was preferentially expressed, but ASE was also present in the absence of GATA2 mutations. We show that our findings can be explained by simultaneous silencing of 1 allele by methylation and overactivation of the other allele via the −110-kb superenhancer, resulting in unchanged, or even slightly increased, GATA2 levels. Collectively, these data suggest that GATA2 ASE is an important event in the development of AML with CEBPA DMs.
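As a worked check, the pooled figure appears to combine the three CEBPA DM series described in the Results. This is an inference from the reported counts rather than a stated derivation: 21 of 21 evaluable cases in the discovery cohort, 10 of 12 in the diagnosis-remission series, and 10 of 10 in the TCGA-LAML and Beat AML data sets.

```latex
\[
\frac{21 + 10 + 10}{21 + 12 + 10} = \frac{41}{43} \approx 0.953
\]
```

Under that assumption, the 2 CEBPA DM cases without GATA2 ASE are the 2 patients in the diagnosis-remission series classified as not skewed (Table 4).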
GATA2 encodes a transcription factor crucial for proliferation and maintenance of hematopoietic stem cells.31 Balanced expression of functional GATA2 is critical for normal hematopoiesis, with alterations in either its expression or activity having been linked to leukemogenesis.44 For instance, gain-of-function GATA2 mutations mediate acute myeloid transformation of chronic myeloid leukemia,45 whereas loss-of-function germ line mutations leading to GATA2 deficiency predispose carriers to familial myelodysplastic syndrome (MDS)/AML.46 These patients present a wide range of other phenotypic manifestations, including immunodeficiency, pulmonary disease, and lymphatic dysfunction.47 In addition to resulting from mutations in coding regions of the gene, these symptoms can be caused by mutations in an internal enhancer of GATA2, leading to reduced expression of the gene product.48 On the other hand, GATA2 overexpression has been suggested to be a poor prognostic marker in both pediatric49 and adult50 AML. Not only do our findings demonstrate that GATA2 defects may be caused by mutations in the gene or its regulatory elements, but they also underscore the importance of epigenetic changes or epimutations in this gene in a subset of leukemias.
These observations highlight the importance of fine-tuned regulation of GATA2 expression and point to a role of GATA2 ASE in the pathogenesis of AML. Accordingly, Celton et al51 also reported frequent GATA2 ASE in a smaller cohort of 49 patients with normal karyotype AML, although it should be noted that other genes were not considered in that study. In a much larger group of patients, we conclusively demonstrate that GATA2 displays ASE more often than any other known myeloid- or cancer-related gene. Moreover, although GATA2 ASE is widespread in AML, we show it is distinctly associated with CEBPA DMs; both events co-occurred in 95% of the 43 cases analyzed.
CEBPA DMs define an AML subtype with a distinct gene expression profile and favorable clinical outcome.52,53 These patients typically exhibit a combination of N- and C-terminal mutations in the CEBPA protein, disrupting its dimerization and DNA-binding activities.54 We did not find an association between GATA2 ASE and the type of CEBPA mutations present in each patient (supplemental Figure 3D).
The specific association between GATA2 ASE and CEBPA DMs suggests cooperativity between these 2 genes in the context of leukemogenesis. This is in keeping with the previously reported observation that GATA2 mutations are present in ∼40% of CEBPA DM cases. Somatic GATA2 mutations mainly cluster in the 2 ZF domains of the protein, each with different functional implications.55 The ZF1 domain (N-terminal) of GATA2 contributes to the stabilization and specificity of DNA binding and mediates the interaction with FOG1, whereas ZF2 interacts with CEBPA.35 The role of these mutations in AML is a subject of ongoing research, with effects described on proliferation and differentiation (Leubolt et al55 provide a recent review). ZF1 mutations are strongly associated with CEBPA DMs, where they may play a cooperative role; the mutations lead to reduced transcription of CEBPA targets.35 All cases in our cohort with GATA2 mutations exhibited at least 1 amino acid change in ZF1, and those with 2 mutations had a second hit in ZF2. Strikingly, both GATA2 mutations were always in the same allele, which was preferentially expressed. In a recent study of recurrently mutated genes in AML, Batcha et al56 also identified an allelic imbalance toward mutant GATA2, although their effort was limited to 11 genes harboring recurring mutations. Similarly, Al Seraihi et al57 reported GATA2 ASE favoring the mutated allele in a family with inherited GATA2-mutated MDS/AML. In contrast, Kozyra et al58 recently described synonymous GATA2 mutations in patients with MDS that decrease transcript stability, resulting in ASE favoring the wild-type allele. In patients with CEBPA DM AML who have GATA2 mutations, the presence of GATA2 ASE can be explained because it leads to dominance of the mutated allele. However, because GATA2 ASE was also observed in a vast majority of CEBPA DM cases without GATA2 mutations, we hypothesize that GATA2 ASE precedes the acquisition of mutations.
The average expression of GATA2 in CEBPA DM AML was comparable to that in other AMLs, even in cases with monoallelic GATA2 expression. We show that this is due to DNA methylation–mediated gene silencing of the repressed allele, compensated for by overactivation of the long-distance −110-kb GATA2 superenhancer on the other allele (supplemental Figure 11). Interestingly, this is the same regulatory element involved in AML with t(3;3)/inv(3),4 as well as many other atypical 3q26 translocations.59 However, in these leukemias, loss of the GATA2 superenhancer results in GATA2 haploinsufficiency, which accelerates EVI1-driven leukemogenesis.60 Given the very strong association between GATA2 ASE and CEBPA DMs, we hypothesize that GATA2 ASE also contributes to CEBPA-mediated leukemogenesis, although the exact mechanisms remain unclear. One possibility is that silencing of 1 allele and enhancer activation of the other allele do not originate at the same time. Instead, high levels of GATA2 driven by the −110-kb enhancer may contribute to leukemia initiation in preleukemic cells, whereas loss of expression may be favored in later stages. This hypothesis is consistent with the findings by Saida et al61 in inv(16) AML models, where Gata2 expression was upregulated in the preleukemic phase, but monoallelic Gata2 deletions led to a more aggressive phenotype in the leukemic stage. Studies using Cebpa DM mouse leukemias in vivo62 could further clarify the order of acquisition of Gata2 ASE in those leukemias.
The acquisition of methylation and acetylation marks in the absence of changes in the DNA constitutes an example of epimutation.63 Such epigenetic modifications have been extensively detected in cancer, often affecting the expression levels of tumor suppressor genes.64 Here, we show that epimutations leading to GATA2 ASE are mostly somatic and lost at remission, which further supports the notion that they play a role in leukemia development. Although hyperactivation of the −110-kb superenhancer was not reported, other studies had previously detected hypermethylation of the GATA2 promoter in non-CEBPA DM cases.51,57 Why GATA2 is prone to acquisition of these epimutations and exactly how or when they are acquired remain to be elucidated. One intriguing possibility is that GATA2 ASE is acquired at a certain differentiation stage that becomes the leukemia cell of origin. Given that other subgroups with CEBPA abnormalities (other than mutations) do not show a similar pattern, we propose that ASE of GATA2 is not a consequence of CEBPA mutations. Intriguingly, GATA2 promoter methylation levels in other AML cases with GATA2 ASE are low, suggesting that another mechanism might be at play in those cases.
In a single patient with CEBPA DMs, we observed GATA2 ASE at diagnosis as well as in remission, which poses several questions for future research. First, GATA2 ASE in remission marrow should be analyzed in a much larger cohort to determine how frequently it persists. Second, it would be interesting to determine whether GATA2 ASE was already present in bone marrow progenitors before leukemic transformation and, if so, whether it was somatically acquired or present in the germ line. Importantly, the latter would suggest that an SNV in a regulatory domain of GATA2 is responsible for such an effect.
In summary, GATA2 ASE is a somatic event that is epigenetically acquired in almost all CEBPA DM AML cases, suggesting it plays a key role in the development and/or progression of this leukemia subtype, a notion further supported by the association between GATA2 mutations and CEBPA mutations. The specific mechanisms remain unclear, but the importance of fine-tuned GATA2 regulation points to GATA2 expression levels as a key factor. Therefore, we propose that increased levels of GATA2 mediated by overactivation of the superenhancer, in collaboration with CEBPA mutations, might be an early event in leukemic transformation. Later, allele-specific silencing would result in stabilization of GATA2 levels in leukemic blasts.
Supplementary Material
The online version of this article contains a data supplement.
Acknowledgments
The authors thank their colleagues from the bone marrow transplantation group and the molecular diagnostic laboratory of the Department of Hematology at the Erasmus University Medical Center for storage of samples and molecular analysis of the leukemia cells. The authors also thank their colleagues from the Department of Hematology for their input, especially Remco Hoogenboezem for bioinformatic support and algorithm implementation. The authors acknowledge the research technicians involved in this work: Margit Nützel, Hanna Stanewsky, Johanna Raithel, and Ute Ackermann. Finally, the authors thank Roberto Avellino for critically reading the manuscript and Timothy Ley for discussing the findings.
This work was supported by grants and fellowships from the Dutch Cancer Society (R.D., B.J.W., R.M.-L., and S.v.H.) and a Leukemia & Lymphoma Society (LLS) Special Fellowship Award (B.J.W.). A.M.M. is supported by National Institutes of Health, National Cancer Institute grants UG1 CA233332 and R01 CA198089, and LLS Specialized Center of Research grant 7013-17.
Footnotes
Sequence data have been deposited in the European Genome-phenome Archive (EGA; http://www.ebi.ac.uk/ega/), which is hosted by the European Bioinformatics Institute, under accession number EGAS00001004684.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Authorship
Contribution: R.M.-L., B.J.W., and R.D. designed the study. S.v.H., C.E., C.G., E.B., and I.R. carried out experiments; R.M.-L., M.A.S., C.V., J.d.R., and P.V. analyzed data; and P.V., A.M.M., and M.R. provided samples and/or data. R.M.-L., B.J.W., and R.D. wrote the manuscript.
Conflict-of-interest disclosure: A.M.M. receives research funding from Janssen, Daiichi Sankyo, and Sanofi; has consulted for Epizyme, Constellation, BMI, and Exo-Therapeutics; and is a scientific advisor to KDAC. J.d.R. is cofounder of Cyclomics BV. The remaining authors declare no competing financial interests.
Correspondence: Bas J. Wouters, Department of Hematology, Erasmus University Medical Center, Wytemaweg 80, 3015CN Rotterdam, The Netherlands; e-mail: b.wouters@erasmusmc.nl.
REFERENCES
- 1. Bradner JE, Hnisz D, Young RA. Transcriptional addiction in cancer. Cell. 2017;168(4):629-643.
- 2. Papaemmanuil E, Gerstung M, Bullinger L, et al. Genomic classification and prognosis in acute myeloid leukemia. N Engl J Med. 2016;374(23):2209-2221.
- 3. Bhagwat AS, Lu B, Vakoc CR. Enhancer dysfunction in leukemia. Blood. 2018;131(16):1795-1804.
- 4. Gröschel S, Sanders MA, Hoogenboezem R, et al. A single oncogenic enhancer rearrangement causes concomitant EVI1 and GATA2 deregulation in leukemia. Cell. 2014;157(2):369-381.
- 5. Shi J, Whyte WA, Zepeda-Mendoza CJ, et al. Role of SWI/SNF in acute leukemia maintenance and enhancer-mediated Myc regulation. Genes Dev. 2013;27(24):2648-2662.
- 6. Mansour MR, Abraham BJ, Anders L, et al. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science. 2014;346(6215):1373-1377.
- 7. Guo YA, Chang MM, Huang W, et al. Mutation hotspots at CTCF binding sites coupled to chromosomal instability in gastrointestinal cancers. Nat Commun. 2018;9(1):1520.
- 8. Kulis M, Esteller M. DNA methylation and cancer. Adv Genet. 2010;70:27-56.
- 9. Flavahan WA, Drier Y, Liau BB, et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature. 2016;529(7584):110-114.
- 10. Sur I, Taipale J. The role of enhancers in cancer. Nat Rev Cancer. 2016;16(8):483-493.
- 11. Pastinen T. Genome-wide allele-specific analysis: insights into regulatory variation. Nat Rev Genet. 2010;11(8):533-538.
- 12. Clayton EA, Khalid S, Ban D, Wang L, Jordan IK, McDonald JF. Tumor suppressor genes and allele-specific expression: mechanisms and significance. Oncotarget. 2020;11(4):462-479.
- 13. Valle L, Serena-Acedo T, Liyanarachchi S, et al. Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer. Science. 2008;321(5894):1361-1365.
- 14. van Driel WJ, Tjiong MY, Hilders CGJM, Trimbos BJ, Fleuren GJ. Association of allele-specific HLA expression and histopathologic progression of cervical carcinoma. Gynecol Oncol. 1996;62(1):33-41.
- 15. Liu Z, Dong X, Li Y. A genome-wide study of allele-specific expression in colorectal cancer. Front Genet. 2018;9:570.
- 16. Lee S, Kim J, Lee S. A comparative study on gene-set analysis methods for assessing differential expression associated with the survival phenotype. BMC Bioinformatics. 2011;12:377.
- 17. Ströbel A. atable: create tables for clinical trial reports. R Journal. 2019;11(1):137-148.
- 18. Glass JL, Hassane D, Wouters BJ, et al. Epigenetic identity in AML depends on disruption of nonpromoter regulatory elements and is affected by antagonistic effects of mutations in epigenetic modifiers. Cancer Discov. 2017;7(8):868-883.
- 19. Akalin A, Kormaksson M, Li S, et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 2012;13(10):R87.
- 20. Hahne F, Ivanek R, et al. Visualizing genomic data using Gviz and Bioconductor. Methods Mol Biol. 2016;1418:335-351.
- 21. Stangl C, de Blank S, Renkens I, et al. Partner independent fusion gene detection by multiplexed CRISPR-Cas9 enrichment and long read nanopore sequencing. Nat Commun. 2020;11(1):2861.
- 22. Simpson JT, Workman RE, Zuzarte PC, David M, Dursi LJ, Timp W. Detecting DNA cytosine methylation using nanopore sequencing. Nat Methods. 2017;14(4):407-410.
- 23. Pham TH, Benner C, Lichtinger M, et al. Dynamic epigenetic enhancer signatures reveal key transcription factors associated with monocytic differentiation states. Blood. 2012;119(24):e161-e171.
- 24. Corces MR, Trevino AE, Hamilton EG, et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods. 2017;14(10):959-962.
- 25. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841-842.
- 26. Speir ML, Zweig AS, Rosenbloom KR, et al. The UCSC Genome Browser database: 2016 update. Nucleic Acids Res. 2016;44(D1):D717-D725.
- 27. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357-359.
- 28. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30(7):923-930.
- 29. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
- 30. Tate JG, Bamford S, Jubb HC, et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Res. 2019;47(D1):D941-D947.
- 31. Tsai F-Y, Orkin SH. Transcription factor GATA-2 is required for proliferation/survival of early hematopoietic cells and mast cell formation, but not for erythroid and myeloid terminal differentiation. Blood. 1997;89(10):3636-3643.
- 32. Pabst T, Mueller BU, Harakawa N, et al. AML1-ETO downregulates the granulocytic differentiation factor C/EBPalpha in t(8;21) myeloid leukemia. Nat Med. 2001;7(4):444-451.
- 33. Wouters BJ, Jordà MA, Keeshan K, et al. Distinct gene expression profiles of acute myeloid/T-lymphoid leukemia with silenced CEBPA and mutations in NOTCH1. Blood. 2007;110(10):3706-3714.
- 34. Figueroa ME, Wouters BJ, Skrabanek L, et al. Genome-wide epigenetic analysis delineates a biologically distinct immature acute leukemia with myeloid/T-lymphoid features. Blood. 2009;113(12):2795-2804.
- 35. Greif PA, Dufour A, Konstandin NP, et al. GATA2 zinc finger 1 mutations associated with biallelic CEBPA mutations define a unique genetic entity of acute myeloid leukemia. Blood. 2012;120(2):395-403.
- 36. Ley TJ, Miller C, Ding L, et al; Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia [published correction appears in N Engl J Med. 2013;369(1):98]. N Engl J Med. 2013;368(22):2059-2074.
- 37. Tyner JW, Tognon CE, Bottomly D, et al. Functional genomic landscape of acute myeloid leukaemia. Nature. 2018;562(7728):526-531.
- 38. Jones PA. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet. 2012;13(7):484-492.
- 39. Wescoe ZL, Schreiber J, Akeson M. Nanopores discriminate among five C5-cytosine variants in DNA. J Am Chem Soc. 2014;136(47):16582-16587.
- 40. Wlodarski MW, Collin M, Horwitz MS. GATA2 deficiency and related myeloid neoplasms. Semin Hematol. 2017;54(2):81-86.
- 41. Johnson KD, Kong G, Gao X, et al. Cis-regulatory mechanisms governing stem and progenitor cell transitions. Sci Adv. 2015;1(8):e1500503.
- 42. Zhu Y, Sun L, Chen Z, Whitaker JW, Wang T, Wang W. Predicting enhancer transcription and activity from chromatin modifications. Nucleic Acids Res. 2013;41(22):10032-10043.
- 43. Margueron R, Reinberg D. The Polycomb complex PRC2 and its mark in life. Nature. 2011;469(7330):343-349.
- 44. Menendez-Gonzalez JB, Vukovic M, Abdelfattah A, et al. Gata2 as a crucial regulator of stem cells in adult hematopoiesis and acute myeloid leukemia. Stem Cell Reports. 2019;13(2):291-306.
- 45. Zhang SJ, Ma LY, Huang QH, et al. Gain-of-function mutation of GATA-2 in acute myeloid transformation of chronic myeloid leukemia. Proc Natl Acad Sci USA. 2008;105(6):2076-2081.
- 46. Kazenwadel J, Secker GA, Liu YJ, et al. Loss-of-function germline GATA2 mutations in patients with MDS/AML or MonoMAC syndrome and primary lymphedema reveal a key role for GATA2 in the lymphatic vasculature. Blood. 2012;119(5):1283-1291.
- 47. Spinner MA, Sanchez LA, Hsu AP, et al. GATA2 deficiency: a protean disorder of hematopoiesis, lymphatics, and immunity. Blood. 2014;123(6):809-821.
- 48. Hsu AP, Johnson KD, Falcone EL, et al. GATA2 haploinsufficiency caused by mutations in a conserved intronic element leads to MonoMAC syndrome. Blood. 2013;121(19):3830-3837, S1-S7.
- 49. Luesink M, Hollink IHIM, van der Velden VHJ, et al. High GATA2 expression is a poor prognostic marker in pediatric acute myeloid leukemia. Blood. 2012;120(10):2064-2075.
- 50. Vicente C, Vazquez I, Conchillo A, et al. Overexpression of GATA2 predicts an adverse prognosis for patients with acute myeloid leukemia and it is associated with distinct molecular abnormalities. Leukemia. 2012;26(3):550-554.
- 51. Celton M, Forest A, Gosse G, et al. Epigenetic regulation of GATA2 and its impact on normal karyotype acute myeloid leukemia. Leukemia. 2014;28(8):1617-1626.
- 52. Wouters BJ, Löwenberg B, Erpelinck-Verschueren CAJ, van Putten WL, Valk PJ, Delwel R. Double CEBPA mutations, but not single CEBPA mutations, define a subgroup of acute myeloid leukemia with a distinctive gene expression profile that is uniquely associated with a favorable outcome. Blood. 2009;113(13):3088-3091.
- 53. Dufour A, Schneider F, Metzeler KH, et al. Acute myeloid leukemia with biallelic CEBPA gene mutations and normal karyotype represents a distinct genetic entity associated with a favorable clinical outcome. J Clin Oncol. 2010;28(4):570-577.
- 54. Fasan A, Haferlach C, Alpermann T, et al. The role of different genetic subtypes of CEBPA mutated AML. Leukemia. 2014;28(4):794-803.
- 55. Leubolt G, Redondo Monte E, Greif PA. GATA2 mutations in myeloid malignancies: two zinc fingers in many pies. IUBMB Life. 2020;72(1):151-158.
- 56. Batcha AMN, Bamopoulos SA, Kerbs P, et al. Allelic imbalance of recurrently mutated genes in acute myeloid leukaemia. Sci Rep. 2019;9(1):11796.
- 57. Al Seraihi AF, Rio-Machin A, Tawana K, et al. GATA2 monoallelic expression underlies reduced penetrance in inherited GATA2-mutated MDS/AML. Leukemia. 2018;32(11):2502-2507.
- 58. Kozyra EJ, Pastor VB, Lefkopoulos S, et al; European Working Group of MDS in Childhood (EWOG-MDS). Synonymous GATA2 mutations result in selective loss of mutated RNA and are common in patients with GATA2 deficiency. Leukemia. 2020;34(10):2673-2687.
- 59. Ottema S, Mulet-Lazaro R, Beverloo HB, et al. Atypical 3q26/MECOM rearrangements genocopy inv(3)/t(3;3) in acute myeloid leukemia. Blood. 2020;136(2):224-234.
- 60. Suzuki M, Katayama S, Yamamoto M. Two effects of GATA2 enhancer repositioning by 3q chromosomal rearrangements. IUBMB Life. 2020;72(1):159-169.
- 61. Saida S, Zhen T, Kim E, et al. Gata2 deficiency delays leukemogenesis while contributing to aggressive leukemia phenotype in Cbfb-MYH11 knockin mice. Leukemia. 2020;34(3):759-770.
- 62. Di Genua C, Valletta S, Buono M, et al. C/EBPα and GATA-2 mutations induce bilineage acute erythroid leukemia through transformation of a neomorphic neutrophil-erythroid progenitor. Cancer Cell. 2020;37(5):690-704.e8.
- 63. Horsthemke B. Epimutations in human disease. Curr Top Microbiol Immunol. 2006;310:45-59.
- 64. Plass C, Pfister SM, Lindroth AM, Bogatyrova O, Claus R, Lichter P. Mutations in regulators of the epigenome and their connections to global chromatin patterns in cancer. Nat Rev Genet. 2013;14(11):765-780.