SUMMARY
Epigenetic reprogramming underlies specification of immune cell lineages, but patterns that uniquely define immune cell types and the mechanisms by which they are established remain unclear. Here we identified lineage-specific DNA methylation signatures of six immune cell types from human peripheral blood and determined their relationship to other epigenetic and transcriptomic patterns. Sites of lineage-specific hypomethylation were associated with distinct combinations of transcription factors in each cell type. By contrast, sites of lineage-specific hypermethylation were restricted mostly to adaptive immune cells. PU.1 binding sites were associated with lineage-specific hypo- and hypermethylation in different cell types, suggesting that it regulates DNA methylation in a context-dependent manner. These observations indicate that innate and adaptive immune lineages are specified by distinct epigenetic mechanisms via combinatorial and context-dependent use of key transcription factors. The cell-specific epigenomic and transcriptional patterns identified serve as a foundation for future studies on immune dysregulation in diseases and aging.
INTRODUCTION
Blood-borne immune cell types are generated from hematopoietic stem cells (HSCs) resident in the bone marrow (Dzierzak and Speck, 2008; Mercier et al., 2011). B and T lymphocytes that mediate adaptive immunity develop in the bone marrow and the thymus, respectively. HSCs also give rise to innate immune cells such as monocytes, granulocytes, and natural killer (NK) cells. Functional heterogeneity of immune cells results from selective expression of their identical genetic contents via epigenetic reprogramming. DNA methylation at cytosine residues is one of the earliest forms of epigenetic modification identified in mammals and is regulated by the action of DNA methyltransferases (DNMTs) and ten-eleven translocation enzymes (TETs) (Bestor et al., 2015; Ginno et al., 2020; Lio and Rao, 2019). As an epigenetic mark, the chemical stability of the methylcytosine modification and its bimodal distribution make it an attractive candidate for comparing cell states, including developmental stages or changes that accompany diseases or aging. However, methylation signatures that uniquely define each immune cell type and the mechanisms by which they are established remain to be uncovered.
Relationships between DNA methylation and establishment of hematopoietic cell identity have been explored most comprehensively in murine B lymphocytes. Genetic studies in mice implicate three key transcription factors, E2A, Ebf1 and Pax5, in establishing B cell identity (Hagman and Lukin, 2006; O'Riordan and Grosschedl, 1999). DNA methylation changes are evident as multi-potential progenitor cells advance to the pre-pro-B cell stage under the influence of E2A (Benner et al., 2015), and further changes accrue as Ebf1 directs these cells to initiate antigen receptor gene rearrangements (Li et al., 2018). Binding motifs for these transcription factors are enriched at differentially methylated regions (DMRs) induced by Tet2 and Tet3 proteins during B cell differentiation (Orlanski et al., 2016). However, the mechanisms by which specific transcription factors induce changes in methylation remain unclear.
DMRs have been identified in human cells and tissues using methylation microarrays and whole genome bisulfite sequencing (Ziller et al., 2013). Comparison with transcription factor binding sites compiled by the ENCODE consortium showed that DMRs for B and T lymphocytes share transcription factor profiles, including factors such as Batf, Bcl11a, Pax5 and Mef2a/c. The BLUEPRINT Epigenome Consortium has also generated comprehensive methylome profiles of immune cell subsets and their progenitors (Farlik et al., 2016; Kulis et al., 2015; Schuyler et al., 2016). Hypomethylated DMRs in myeloid progenitors were enriched for GATA1 and TAL1 binding sites; however, no specific sites were enriched in lymphoid progenitors. Differentiation of lymphoid progenitors to pre-B cells generated DMRs that were enriched for many factors associated with pre-B cell differentiation in mice. Thus, while reinforcing the idea that epigenomic changes accompany cellular differentiation, these studies did not establish the unique epigenetic identity of specific immune cell types. These studies were also limited by small sample sizes, which made it difficult to separate immune cell-specific epigenetic differences from inter-individual variability, including age. In addition, individuals selected for these studies were not strictly screened for clinical and sub-clinical conditions.
To circumvent inter-individual variability and focus on physiologically relevant cell-type specific patterns that were independent of the effects of pathology, we mapped the methylomes of major immune cell populations present in human peripheral blood from 55 healthy individuals recruited on the basis of stringent inclusion criteria. We identified and validated lineage-specific methylation signatures of 6 cell types and queried these patterns against other epigenetic features, gene expression and transcription factor binding. We found that cell lineage-specific hypomethylated sites and DMRs were associated with distinct combinations of transcription factor binding sites in each immune cell type. By contrast, cell-specific DNA hypermethylation was prominent only in adaptive immune cells. The differential use of DNA methylation in adaptive and innate immune cell types was further corroborated by RNA-Seq. Finally, PU.1 and RUNX1 were associated with both hypo- and hypermethylation in a context-dependent manner in different cell types. Our studies reveal cell-specific epigenetic and gene expression signatures that are established during hematopoiesis and implicate a restricted set of transcription factors in defining human immune cell identity.
RESULTS
Differential DNA methylation distinguishes immune cell types
To identify immune cell-specific patterns of DNA methylation we fractionated human blood into 11 subsets using a combination of magnetic beads and flow cytometry (Figure 1A, Figure S1A and SI Table 1). Sodium bisulfite-treated genomic DNA from each sample was hybridized to Illumina Infinium MethylationEPIC arrays followed by processing and quantitation of methylation (β ranging from 0 to 1) for each probe (Figure 1B). We then determined the average β value for each probe in each cell type. The normalized average β values followed a characteristic bimodal distribution (Figure 1C). In principal component analysis, the first principal component (PC1, 41%) distinguished innate (monocytes, granulocytes, and natural killer cells) from adaptive immune subsets (B, CD4+, CD8+ lymphocytes). Additional resolution between bone marrow-derived myeloid and B lymphoid cells was evident in PC2 (15%) (Figure 1C). Naïve CD4+ and CD8+ T cells clustered closely and were treated as one compartment for further lineage-specific analyses.
To identify methylation patterns that specifically defined each of 6 immune cell types generated by hematopoiesis we first compared the β value at each CpG site in a cell type in pairwise fashion with all other cell types within each individual for 24 donors (age range 23-83 years, Figure S1B). Because of their similarity, CD4+ and CD8+ T cells were compared to non-T lineage cells only. By this procedure we identified several thousand differentially methylated sites (Benjamini-Hochberg (BH) adjusted p ≤ 0.05, SI Table 1). Further restricting the analyses to sites that were differentially methylated by at least 30% (average Δβ ≤ −0.3 or Δβ ≥ 0.3) revealed 2000-4000 cell-specific hypomethylated sites in each cell type; cell-specific hypermethylation varied greatly between cell types (Figures 1D-E, S1F and SI Table 2). We will refer to these as sites of cell-specific hypo- or hypermethylation. Only 1-5 of these sites were shared with CpGs that constitute methylation clocks (Hannum et al., 2013; Horvath, 2013) (data not shown). Methylation patterns in PBMC or whole blood did not match any specific lineage. We applied the same criteria to a second cohort of 31 individuals (age range 21-81 years) and observed 80-94% concordance of cell-specific hypomethylated sites and 76-86% concordance of cell-specific hypermethylated sites between the two datasets (SI Table 2). Thus, these sites comprise a methylation signature of the 6 major blood-borne human immune cell lineages. All further analyses were performed using the probe set identified from the first cohort.
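The selection rule described in this paragraph can be illustrated with a short sketch. It is not the pipeline used in the study: it assumes that, for a given CpG site in a given cell type, an average Δβ and a BH-adjusted p-value are already available for each comparison against another cell type, and it assumes that a site counts as cell-specific only when every comparison passes both thresholds. The function names `bh_adjust` and `classify_site` and the input layout are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjustment of a list of raw p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_site(delta_betas, adj_pvalues, min_delta=0.3, alpha=0.05):
    """Label one CpG site as 'hypo', 'hyper' or None for one cell type.

    delta_betas: average delta-beta versus each other cell type.
    adj_pvalues: BH-adjusted p-value for each of those comparisons.
    Assumption: every comparison must pass both thresholds.
    """
    if not delta_betas or any(p > alpha for p in adj_pvalues):
        return None
    if all(d <= -min_delta for d in delta_betas):
        return "hypo"
    if all(d >= min_delta for d in delta_betas):
        return "hyper"
    return None


# Toy values, not data from the study.
print(classify_site([-0.45, -0.31, -0.60], [0.01, 0.02, 0.001]))
print(classify_site([-0.45, -0.20], [0.01, 0.01]))
```

In this sketch a site that is strongly hypomethylated against some cell types but only weakly against another is not called cell-specific, which mirrors the requirement that the signature distinguish a lineage from all others.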
To understand the basis for the low numbers of innate lineage-specific sites of hypermethylation, we compared methylation in monocytes and granulocytes to each of the adaptive immune cell types. We identified 7000-12,000 selectively hypermethylated sites in either cell type compared to B or T or NK cells (Table 1, top). However, the numbers fell in 3- and 4-way comparisons, such that only 7-11% of hypermethylated sites remained when compared to all four cell types taken together (Table 1, top). By contrast, 50-74% of hypomethylated sites identified pairwise remained myeloid-specific in 4-way comparison. These trends were also evident when granulocytes and monocytes were compared as a group (Table 1, top).
Table 1.
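Study | Group | Cell type | Compared to | Hypomethylated sites (≥30%) | Hypermethylated sites (≥30%) |
---|---|---|---|---|---|
This Study (GESTALT) | Pairwise | granulocytes | B | 26931 | 8819 |
This Study (GESTALT) | Pairwise | granulocytes | T | 42375 | 8968 |
This Study (GESTALT) | Pairwise | granulocytes | NK | 28329 | 12434 |
This Study (GESTALT) | Pairwise | granulocytes | monocytes | 5218 | 4635 |
This Study (GESTALT) | Combined | granulocytes | B+T | 24327 | 1642 |
This Study (GESTALT) | Combined | granulocytes | B+T+NK | 19963 | 1021 |
This Study (GESTALT) | Pairwise | monocytes | B | 23308 | 7836 |
This Study (GESTALT) | Pairwise | monocytes | T | 38623 | 9337 |
This Study (GESTALT) | Pairwise | monocytes | NK | 25848 | 12251 |
This Study (GESTALT) | Pairwise | monocytes | granulocytes | 4635 | 5218 |
This Study (GESTALT) | Combined | monocytes | B+T | 21102 | 1512 |
This Study (GESTALT) | Combined | monocytes | B+T+NK | 17230 | 894 |
This Study (GESTALT) | Combinatorial | granulocytes+monocytes | B | 17824 | 7232 |
This Study (GESTALT) | Combinatorial | granulocytes+monocytes | T | 32224 | 8208 |
This Study (GESTALT) | Combinatorial | granulocytes+monocytes | NK | 20038 | 10958 |
This Study (GESTALT) | Combinatorial | granulocytes+monocytes | B+T | 15946 | 1312 |
This Study (GESTALT) | Combinatorial | granulocytes+monocytes | B+T+NK | 12705 | 770 |
Hachiya et al. | all 23M CpG sites with >5 reads | CD4 | monocytes | 275028 | 469014 |
Hachiya et al. | all 23M CpG sites with >5 reads | CD4 | neutrophils | 321872 | 548940 |
Hachiya et al. | all 23M CpG sites with >5 reads | CD4 | monocytes+neutrophils | 250305 | 388181 |
Hachiya et al. | all 23M CpG sites with >5 reads | monocytes | CD4 | 469014 | 275028 |
Hachiya et al. | all 23M CpG sites with >5 reads | monocytes | neutrophils | 64351 | 112329 |
Hachiya et al. | all 23M CpG sites with >5 reads | monocytes | CD4+neutrophils | 53793 | 3096 |
Hachiya et al. | all 23M CpG sites with >5 reads | neutrophils | CD4 | 548940 | 321872 |
Hachiya et al. | all 23M CpG sites with >5 reads | neutrophils | monocytes | 112329 | 64351 |
Hachiya et al. | all 23M CpG sites with >5 reads | neutrophils | CD4+monocytes | 105126 | 5899 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | CD4 | monocytes | 15461 | 37190 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | CD4 | neutrophils | 17063 | 40099 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | CD4 | monocytes+neutrophils | 14203 | 31262 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | monocytes | CD4 | 37190 | 15461 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | monocytes | neutrophils | 5053 | 5947 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | monocytes | CD4+neutrophils | 4028 | 303 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | neutrophils | CD4 | 40099 | 17063 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | neutrophils | monocytes | 5947 | 5053 |
Hachiya et al. | 814602 sites matching with Illumina 850K EPIC array | neutrophils | CD4+monocytes | 5402 | 675 |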
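The loss of myeloid hypermethylated sites in multi-way comparisons can be checked directly from the counts in Table 1. The sketch below divides the number of sites that remain in the B+T+NK comparison by each pairwise count. Which pairwise count the study used as the denominator is not stated, so reporting the ratio against each reference cell type is an assumption, and `retained_percent` is a hypothetical helper name.

```python
# Hypermethylated-site counts copied from Table 1 (this study).
pairwise_hyper = {
    "granulocytes": {"B": 8819, "T": 8968, "NK": 12434},
    "monocytes": {"B": 7836, "T": 9337, "NK": 12251},
}
# Sites still hypermethylated when compared with B+T+NK together.
combined_hyper = {"granulocytes": 1021, "monocytes": 894}


def retained_percent(combined, pairwise):
    """Percent of each pairwise count that survives the combined comparison."""
    return {ref: 100.0 * combined / count for ref, count in pairwise.items()}


for cell_type, counts in pairwise_hyper.items():
    percents = retained_percent(combined_hyper[cell_type], counts)
    summary = ", ".join(f"vs {ref}: {pct:.1f}%" for ref, pct in percents.items())
    print(f"{cell_type}: {summary}")
```

The same function applied to the hypomethylated counts shows that a much larger share of sites survives the combined comparison.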
Genes that contained cell-specific hypomethylated sites (based on MethylationEPIC array annotation) were enriched for biological functions attributable to individual cell types (Figure 1F). Cell-specific trends were less evident for genes associated with hypermethylated sites, presumably because very few such sites were identified in innate immune cell types. The majority of cell-specific differentially methylated sites were located in ‘intergenic’ and ‘intragenic’ genomic regions (Figure 2A). We also identified cell-specific DMRs containing ≥ 2 hypo- or hypermethylated sites within 300 bp of each other. The number of hypomethylated DMRs was comparable between cell types, but T cells had a distinctly higher number of hypermethylated DMRs, which also spanned longer stretches (Figures 2B-C and SI Table 3). We conclude that cell-specific hypermethylation is a feature of adaptive immune cell differentiation.
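The DMR definition given above (at least 2 hypo- or hypermethylated sites within 300 bp of each other) can be sketched as a simple clustering of sorted positions. This is an illustration under stated assumptions rather than the procedure used in the study: the input is taken to be the positions of sites of one direction (all hypo or all hyper) on a single chromosome, neighbouring sites are chained whenever consecutive positions are at most 300 bp apart, and the name `call_dmrs` is hypothetical.

```python
def call_dmrs(positions, max_gap=300, min_sites=2):
    """Cluster CpG positions from one chromosome into candidate DMRs.

    Returns a list of (start, end, n_sites) tuples. Consecutive sites
    are chained when they lie within max_gap bp of each other
    (an assumed reading of "within 300 bp of each other").
    """
    dmrs = []
    cluster = []
    for pos in sorted(positions):
        if cluster and pos - cluster[-1] > max_gap:
            # Gap too large: close the current cluster.
            if len(cluster) >= min_sites:
                dmrs.append((cluster[0], cluster[-1], len(cluster)))
            cluster = []
        cluster.append(pos)
    if len(cluster) >= min_sites:
        dmrs.append((cluster[0], cluster[-1], len(cluster)))
    return dmrs


# Toy coordinates, not data from the study.
print(call_dmrs([1000, 1200, 1450, 5000, 9000, 9300]))
```

Chaining means that a DMR can extend well beyond 300 bp when it contains many closely spaced sites, which is one way longer stretches could arise.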
We also considered the extent to which selection of CpG sites in the EPIC array may have skewed our observations. For this we compared our results with those of Hachiya et al. (2017), who carried out whole genome bisulfite sequencing (WGBS) studies on CD4+ cells, monocytes and neutrophils. We applied our criterion of 30% change in methylation to their average methylation frequencies obtained from 102 individuals. We found that CD4+ cells had higher numbers of differentially hypermethylated residues compared to either monocytes or neutrophils in 3-way comparisons (Table 1, bottom). Additionally, CD4+-specific hypermethylation exceeded hypomethylation whereas the reverse was true for monocytes and neutrophils. These trends corroborate those identified in the current study and emphasize the importance of hypermethylation in establishing lymphoid cell identity. However, analysis of the Hachiya data using only sites present in the EPIC array accentuated differences between CD4+ hyper- and hypomethylation (Table 1, bottom), suggesting that lymphoid-specific hypermethylation occurs preferentially at enhancer regions. Further studies are needed to clarify functional and developmental implications of these observations.
Cell-specific DNA methylation associates with enhancer-related epigenetic marks
We determined the relationship of cell-specific DNA methylation to other epigenetic features using DNase I hypersensitive sites (DHS) and histone modification maps generated by ENCODE (Figures 2D-E, Figures S2A-C and SI Table 4). Lack of granulocyte information in the database restricted this analysis to only 5 cell types. B cell-specific hypomethylated sites coincided with DHS, H3K4me1 and H3K27ac modification in B cells, but not in other cell types (Figure 2D). Conversely, B cell-specific sites of hypermethylation had these same markers in the four other cell types but not in B cells. Taken together with the absence of transcription-associated H3K4me3 (Figure 2D), we inferred that B cell-specific hypomethylation occurred at cell-specific enhancers, whereas B cell-specific hypermethylation coincided with regulatory regions active in other cell types. Similar patterns were noted for monocyte-specific differentially methylated sites (Figure S2A).
By contrast, T cell-specific hypomethylated sites coincided with DHS in both T cells and NK cells (Figure 2E). However, these regions were marked with H3K4me1 and H3K27ac, reflecting active enhancers, only in T lineage cells. Thus, proximity of CpG sites to DHSs (as in the NK cells) is not sufficient to induce lineage-specific DNA hypomethylation. As described below, enrichment of the motif for LEF/TCF factors close to sites of T lineage-specific hypomethylation suggests that they may act upon cell non-specific (NK+T) DHS to specify hypomethylation only in T cells (Figure S2A). By contrast, T cell-specific hypermethylated sites coincided with DHS and H3K27ac prominently in monocytes and to a lesser extent in B cells and NK cells (Figure 2E). Thus, the T cell-specific methylation signature is imposed by selectively hypomethylating sequences with potential for NK cell activity and hypermethylating sequences with potential for activity in the other immune cell types assayed.
Association of cell-specific hypomethylated sites with enhancers was also evident from analysis of chromatin state using chromHMM, an algorithm that annotates the genome on the basis of selected histone modifications (Ernst and Kellis, 2012) (Figure 3A, Figure S3). Cell-specificity of these enhancers was also evident from mapping the same sites to chromHMM profiles of other cell types (data not shown). Browser tracks of two genes with B cell-specific DMRs are shown in Figure 3B (corresponding patterns for other cell-specific DMRs are shown in Figure S4).
Combinatorial patterns of transcription factor recognition motifs underlie regions around lineage-specific differentially methylated sites
To explore the basis for lineage-specific methylation patterns, we searched for transcription factor binding motifs in the 200 bp surrounding sites of lineage-specific hypo- or hypermethylation. The top hits for motifs associated with hypomethylated sites corresponded with key immune cell development factors identified in mice (Mayran et al., 2019; Zhong and Zhu, 2017). For example, the top three motifs in B cells included those of EBF1, TCF3 and PAX8 (Figure 4A). The most enriched transcription factor motif associated with hypomethylated sites in both CD4+ and CD8+ T cells was that of the HMG domain factor LEF. The closely related factors Tcf1 and Lef1 are essential for committing murine thymic precursors to the T lineage (Hosokawa and Rothenberg, 2020; Weber et al., 2011), but their role in regulating DNA methylation is not known. The TCF3 motif was also found amongst the top 5 in T cells, corroborating murine studies that show essential roles for the bHLH factors E2A and Heb in T cell development (Bain et al., 1999; Barndt et al., 1999). Our studies also highlighted CEBP and ETS proteins in granulocytes, PU.1 and ATF factors in monocytes, and RUNX and TBET in NK cells in orchestrating lineage-specific hypomethylation.
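As an illustration of how sequence windows for such a motif search might be defined, the sketch below builds a 200 bp interval around a CpG position. It assumes that "200 bp surrounding" means a window centred on the site (100 bp on each side) and uses 0-based, half-open coordinates; neither detail is stated in the text, and `motif_window` is a hypothetical helper rather than part of any motif-finding tool.

```python
def motif_window(chrom, cpg_pos, size=200, chrom_len=None):
    """Return (chrom, start, end) for a window centred on a CpG site.

    Assumptions: the window is symmetric around the site, coordinates
    are 0-based and half-open, and the window is clipped at the
    chromosome ends when chrom_len is given.
    """
    half = size // 2
    start = max(0, cpg_pos - half)
    end = cpg_pos + half
    if chrom_len is not None:
        end = min(end, chrom_len)
    return (chrom, start, end)


# Toy coordinates, not data from the study.
print(motif_window("chr1", 1000))
print(motif_window("chr1", 50))
```

Intervals of this kind would then be passed to a motif enrichment tool together with a suitable background set.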
A pattern of combinatorial use of transcription factors to specify lineage-specific hypo- and hypermethylation emerged from our analyses (Figure 4B). For example, PU.1 and EBF motifs marked B cell-specific hypomethylated sites whereas PU.1 and ATF motifs marked such sites in monocytes. Similarly, E2A and LEF/TCF motifs marked T cell-specific hypomethylated sites whereas E2A, EBF and PU.1 motifs marked those in B lymphocytes. Distinct patterns also emerged near sites of lineage-specific hypermethylation (Figure 4C). The RUNX motif was identified in most cell types. However, when it was not associated with hypermethylation, such as in NK cells, it was instead associated with lineage-specific hypomethylation. A similar pattern was seen for the PU.1 motif, which was associated with hypomethylated sites in B cells and monocytes and hypermethylated sites in T cells and granulocytes. These observations are consistent with the idea that combinations of transcription factors establish immune cell identity by regulating cell-specific hypo- and hypermethylation.
EBF1 binds to hypomethylated sites in human B cells
We further explored the relationship between lineage-specific hypomethylation and transcription factors in one cell type. We chose naïve B cells, in which the motif for EBF1 was associated most prominently with sites of cell-specific hypomethylation. We carried out ChIP-Seq with anti-EBF1 antibodies in naïve B cells isolated from 3 donors (Figure S5A), identifying 3058 peaks with high confidence (SI Table 5). Genes associated with EBF1 binding were enriched for ‘B cell differentiation and activation’ and ‘immune responses’ pathways (Figure S5B). To investigate the relationship between EBF1 binding and DNA methylation, we determined the methylation status of MethylationEPIC array probes within 1 kb of the peak summit, accounting for approximately 70% of EBF1-bound sites identified by ChIP-Seq (Figure 5A). We found that EBF1 peaks contained 327 sites of B cell-specific hypomethylation out of a predicted 858 sites based on HOMER analysis (Figure 5A, blue). An additional 348 EBF1-bound regions coincided with sites of significant B cell-specific hypomethylation (Figure 5A, orange). However, the majority of EBF1-bound regions in naïve B cells had low β in all immune cell types (Figure 5A, grey). Very few EBF1-bound regions coincided with sites of hypermethylation in B cells; instead, sites of EBF1 binding in B cells were enriched for lineage-specific hypermethylation in other cell types (Figure 5B).
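The step of linking array probes to ChIP-Seq peaks (probes within 1 kb of a peak summit) can be sketched with a sorted list and binary search. The sketch assumes that summits and probes are given as positions on a single chromosome and that a peak is "assayable" when at least one probe falls inside the window; the function names `probes_near_summits` and `assayable_fraction` are hypothetical and do not come from the study.

```python
from bisect import bisect_left, bisect_right


def probes_near_summits(summits, probe_positions, max_dist=1000):
    """Map each peak summit to the probes within max_dist bp of it.

    All positions are assumed to lie on the same chromosome.
    """
    probes = sorted(probe_positions)
    hits = {}
    for summit in summits:
        lo = bisect_left(probes, summit - max_dist)
        hi = bisect_right(probes, summit + max_dist)
        hits[summit] = probes[lo:hi]
    return hits


def assayable_fraction(hits):
    """Fraction of summits with at least one nearby probe."""
    if not hits:
        return 0.0
    return sum(1 for found in hits.values() if found) / len(hits)


# Toy coordinates, not data from the study.
hits = probes_near_summits(
    summits=[5000, 20000, 40000],
    probe_positions=[4100, 5900, 6001, 19000, 21000, 50000],
)
print(hits)
print(assayable_fraction(hits))
```

In a genome-wide setting the same lookup would be run per chromosome, and the methylation status of the returned probes would then be compared across cell types.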
Most EBF1-bound sites coincided with DHS in B cells, though many were not B cell-specific, especially at gene promoters (Figure 5C left). EBF1 binding at B cell-specific DHS was restricted to intra- and intergenic regions that also coincided with sites of B cell-specific hypomethylation (Figure 5C right), concordant with the idea that lineage-specific hypomethylation occurred at active, lineage-specific enhancers. EBF1 binding in B cells (Figure 5C left, last column) was comparable at all sites regardless of DHS and methylation state in other hematopoietic lineages.
Lastly, we determined the relationship of EBF1 binding and DNA methylation to B cell-selective gene expression. For this, we annotated EBF1 peaks to genes using HOMER (Figure 5A, red font) and then evaluated their average expression in all six cell types using RNA-Seq data from the same individuals (see also Figure 6). We found that EBF1-bound genes that were selectively hypomethylated in B cells had higher expression in B cells compared to other cell types (Figure 5D). Additionally, genes with 4-fold or higher expression in B cells compared to all other cell types were proportionately over-represented in this category (Figure 5E, blue and orange bars compared with grey bar). We conclude that EBF1-directed B cell-specific hypomethylation and gene expression occur via binding of this factor to lineage-specific enhancer sequences.
Relationship of cell-specific methylation to gene expression differs in innate and adaptive immune cell types
To investigate relationships between cell-specific DNA methylation and gene expression we carried out RNA-Seq from 26 donors (Figure S6A-B). We used DESeq2 to identify genes that were differentially expressed by more than 4-fold in each cell type compared to all others in pairwise fashion within individuals. We then compiled a list of genes that were up- or downregulated by 4-fold (BH adjusted p < 0.05) in each cell type across individuals and will refer to these RNAs as being ‘cell-selective’ (SI Table 6). Data from CD4+ and CD8+ cells were merged for this analysis to identify T cell-selective RNAs. We identified several hundred such cell-selective genes in each of the 5 cell types (Figure 6A). Gene Ontology analyses identified functional pathways consistent with each cell type (Figure S6C).
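The definition of ‘cell-selective’ genes can be sketched as a filter on pairwise differential expression results. The sketch assumes that, for each gene, a log2 fold change and a BH-adjusted p-value are available for the comparison of one cell type against each other cell type (differential expression tools such as DESeq2 report both quantities), and that a gene must pass in every comparison. The function name `cell_selective`, the input layout, and the gene labels are hypothetical.

```python
import math


def cell_selective(results, min_fold=4.0, alpha=0.05):
    """Split genes into selectively up- and downregulated lists.

    results: {gene: [(log2_fold_change, adjusted_p), ...]} with one
    tuple per comparison against another cell type.
    Assumption: every comparison must reach min_fold and p < alpha.
    """
    threshold = math.log2(min_fold)  # 4-fold corresponds to log2FC of 2
    up, down = [], []
    for gene, comparisons in results.items():
        if not comparisons or any(p >= alpha for _, p in comparisons):
            continue
        if all(lfc >= threshold for lfc, _ in comparisons):
            up.append(gene)
        elif all(lfc <= -threshold for lfc, _ in comparisons):
            down.append(gene)
    return up, down


# Toy values, not data from the study.
toy = {
    "GENE_A": [(3.1, 0.001), (2.4, 0.01)],
    "GENE_B": [(-2.2, 0.02), (-4.0, 0.001)],
    "GENE_C": [(3.0, 0.001), (1.2, 0.01)],
    "GENE_D": [(2.5, 0.2), (2.5, 0.001)],
}
print(cell_selective(toy))
```

Only GENE_A and GENE_B pass in this toy input; GENE_C fails the fold-change threshold in one comparison and GENE_D fails the significance threshold.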
We found that 11-35% of cell-selective upregulated genes were also hypomethylated specifically in that cell type (Figure 6A, brown pie charts). Though association between cell-specific hypomethylation and gene activity was evident in all cell types, DNA hypermethylation correlated with reduced gene expression only in adaptive immune cells (Figure 6A, blue pie charts). This view was further strengthened by evaluating methylation status of cell-selective genes without setting a threshold for differential methylation (no minimum Δβ between cell types). We found that hypomethylation (negative Δβ) dominated the profile of differentially expressed genes in the three innate immune lineages regardless of whether the genes were up- or downregulated (Figure 6B, rows 2 and 4). By contrast, DNA hypermethylation dominated the epigenetic landscape of differentially expressed genes in CD4+ and CD8+ T cells (Figure 6B, rows 1 and 3). The pattern for B cell-selective genes fell in-between innate and T lineages. These observations indicate that distinct mechanisms of epigenetic programming direct cell-selective gene expression in adaptive and innate immune cells.
To investigate expression profiles of genes that carried cell-specific differentially methylated CpGs, we constrained the analyses only to those genes where all differentially methylated probes changed in the same direction. This procedure identified 1627 genes (from 4091 probes) with B cell-specific hypomethylation and 268 genes (from 581 probes) with B cell-specific hypermethylation. Approximately 20% of the 1627 hypomethylated genes were selectively expressed in B cells compared to two or more cell types (Figure 6C). Similar patterns were noted for hypomethylated genes in other immune cell types examined (Figure S7A-D). The relationship of cell-specific hypermethylation to gene expression could only be evaluated in adaptive immune cells. Approximately 15% of B cell-specific hypermethylated genes were selectively under-expressed in B cells compared to two or more other cell types (Figure 6C, red box). T cell-specific hypermethylated genes accounted for about 10% of those that were selectively under-expressed in T cells (Figure S7A). We conclude that cell-specific hypomethylation confers selective gene expression across multiple immune cell types whereas DNA hypermethylation features predominantly in adaptive immune cells.
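The restriction to genes whose differentially methylated probes all change in the same direction can be sketched as a grouping step. The input layout, the probe and gene labels, and the function name `consistent_genes` are hypothetical; the sketch only illustrates the filtering logic described above.

```python
from collections import defaultdict


def consistent_genes(probe_calls):
    """Keep genes whose differentially methylated probes agree.

    probe_calls: iterable of (probe_id, gene, direction) tuples, where
    direction is 'hypo' or 'hyper'. Returns {gene: direction} for genes
    with a single direction across all of their probes.
    """
    directions = defaultdict(set)
    for _probe_id, gene, direction in probe_calls:
        directions[gene].add(direction)
    return {
        gene: next(iter(found))
        for gene, found in directions.items()
        if len(found) == 1
    }


# Toy calls, not data from the study.
calls = [
    ("probe_1", "GENE_A", "hypo"),
    ("probe_2", "GENE_A", "hypo"),
    ("probe_3", "GENE_B", "hypo"),
    ("probe_4", "GENE_B", "hyper"),
    ("probe_5", "GENE_C", "hyper"),
]
print(consistent_genes(calls))
```

Genes with mixed calls, such as GENE_B in the toy input, are dropped, so each retained gene can be assigned unambiguously to the hypo- or hypermethylated group before comparison with expression.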
DISCUSSION
We have determined DNA methylation signatures that distinguish six major human immune cell types. Our analyses were carried out by pairwise comparison between cell types within an individual followed by consolidating differentially methylated sites that were shared between 24 individuals regardless of age and sex. We observed 76-94% concordance of results with a second independent cohort of 31 individuals (80-94% for hypomethylated and 76-86% for hypermethylated sites). We propose that the identified sites constitute the characteristic epigenetic state of each cell lineage. These datasets provide a rigorous foundation for future analyses of dysregulated methylation associated with human diseases. Several features emerged from this analysis.
First, we observed that the extent of lineage-specific hypomethylation was comparable between immune cell types, ranging from 2,000 to 4,000 sites per cell type. By contrast, lineage-specific hypermethylation was largely restricted to adaptive immune cells, more specifically to T cells.
Second, we found binding motifs of distinct transcription factors enriched in sequences surrounding lineage-specific hypomethylated sites, which highlighted combinatorial regulation of DNA methylation. Of these factors only Ebf1 and E47 have been previously shown to regulate DNA methylation in murine bone marrow precursor cells. The evidence presented here is the first indication of roles for the other factors in developmental DNA methylation and implicates them in programming human hematopoiesis.
Third, we established relationships between lineage-specific DNA methylation and gene expression. Approximately 11-35% of genes that were selectively expressed in a specific cell type contained one or more lineage-specific hypomethylated sites. Of the genes that were selectively under-expressed in B and T lymphocytes, 80-90% were hypermethylated in their promoters or intragenic regions compared to other cell types, whereas only 20-30% of genes that were selectively repressed in innate cell lineages were similarly marked. These observations indicate that DNA hypermethylation is less important in establishing innate cell-specific gene silencing. Alternatively, gene repression in innate cells may be sensitive to small changes in methylation.
The lack of greater correlation between DNA methylation and gene expression has been seen in other studies as well (Bonin et al., 2014; Calvanese et al., 2012; Kulis et al., 2015) and could be due to several reasons. One possibility is that lineage-specific sites of hypomethylation mark genes that were expressed at earlier stages of differentiation and are no longer expressed in the fully mature lineages examined here. Alternatively, lineage-specific hypomethylation may poise the genome for future patterns of gene expression that are not evident in resting blood-borne cells, such as genes expressed in response to immune activity or due to cell residence in specific tissues. Furthermore, DNA methylation may be essential but insufficient to induce RNA synthesis. The complex interplay between layers of epigenetic alterations is evident in our comparison of T and NK cells. We found that T lineage-specific hypomethylated sites mapped to DHS in both T and NK cells, but were H3K27ac+ only in T cells. Conversely, NK-specific hypomethylated sites had DHS peaks in NK and CD8+ T cells but were H3K27ac+ only in NK cells. Finally, we note that our gene expression studies did not measure gene transcription per se, reflecting instead the abundance of RNA.
Sites of low DNA methylation have been shown to co-localize with DNase I hypersensitive sites (Thurman et al., 2012). This general trend was also evident in our analysis, as seen by enrichment of DHS in the 3000 most lowly methylated sites in all cell types examined but not in the 3000 most highly methylated sites. Ubiquitously hypomethylated sites were also H3K27ac+ and H3K4me3+ and enriched for H3K4me1 in all cell types. By contrast, lineage-specific hypomethylated sites coincided with lineage-specific DHS, marked with H3K4me1 and H3K27ac, but only sporadically with H3K4me3. These epigenetic features, their genomic locations, and their chromHMM annotations indicate that lineage-specific hypomethylated sites correspond to lineage-specific enhancers.
We observed one exception to the close coincidence between tissue-specific hypomethylated sites and DHS. T cell-specific hypomethylated sites were marked by DHS not only in CD4+ and CD8+ T cells, but also in NK cells. However, these regions were not H3K27ac+ in NK cells, indicating that they were not active enhancers. We speculate that the shared DHS patterns between T and NK cells reflect similarities in their early development via Notch signaling (Geiger and Sun, 2016; Rolink et al., 2006). Thereafter, their developmental pathways diverge, with TCF1/LEF1 carrying forward the T cell program and associated lineage-specific demethylation and RUNX3/CBF inducing NK-specific demethylation (see next section). DHS associated with NK-specific hypomethylated sites were present only in CD8+ T cells but marked with H3K27ac selectively in NK cells. This pattern provides an epigenetic mechanism underlying the current view that NK cells represent the innate counterpart of CD8+ T cells (Kurioka et al., 2018; Zook and Kee, 2016). The epigenetic relationship between NK cells and T cells, especially CD8+ T cells, was previously noted in ATAC-Seq comparisons of murine innate and adaptive lymphoid cells (Shih et al., 2016). By imposing the additional constraint of lineage-specific hypomethylation, we uncovered an epigenetic hierarchy whereby only a subset of chromatin-accessible sites in NK cells were hypomethylated as well. This may explain why enrichment of Runx motifs was highest in NK cells in our studies.
Sites of lineage-specific methylation were associated with motifs of a limited number of transcription factors. Factors identified at hypomethylated sites were relatively unique to each cell type. For B and T lymphocytes they corresponded closely with factors known to be involved in lineage commitment in mice (Rothenberg, 2014). Our interpretation is that lineage-specific DNA hypomethylation coincides with irreversible loss of pluripotent differentiation potential. Extending this line of reasoning, we propose that demethylation-related transcription factors provide a window into commitment mechanisms in other cell types where these paradigms are less well established. For example, the top motif identified close to NK-specific hypomethylated sites is that of RUNX family proteins, followed closely by TBX21. Developmental studies in mice have identified several transcription factors to be important for NK cell differentiation, including Nfil3, PU.1, Stat5, Tcf1, Runx3 and Ets-1 (Brillantes and Beaulieu, 2019). Our observations suggest that commitment to NK cell development is mediated by RUNX proteins, perhaps prior to generation of NK precursor (NKp) cells. Thereafter, TBX21-regulated demethylation may drive the conversion from NKp to mature NK cells that are competent to produce IFNγ. Another interesting distinction is the differential enrichment of PU.1(IRF) and C/EBPα motifs near hypomethylated sites in monocytes and granulocytes, respectively. Both factors are required for both lineages, yet each factor appears to impose its own epigenetic imprint to commit cells to one or the other lineage.
We surmise that identification of lineage-specific transcription factor motifs in this study was aided by features of the experimental design, including a focus on closely-related mature cell types from many healthy individuals. Additionally, because enhancer/silencer activation underlies many developmental transitions, the presence in the EPIC array of a large proportion of enhancer-associated CpG sites collected from the FANTOM database (Moran et al., 2016; Pidsley et al., 2016) may have further helped to target the analyses to developmentally pertinent transcription factors.
Identification of distinct transcription factor recognition motifs near sites of lineage-specific differential methylation strongly suggests a role for these factors in directing methylation status via DNA binding. To move beyond in silico predictions, we evaluated the relationship of EBF1 binding to B cell-specific hypomethylation. Approximately 25% of assayable EBF1-bound sites in naïve human B lymphocytes occurred near sites of B cell-specific hypomethylation. It is possible that EBF1 also regulates methylation at other sites where we no longer detect its binding in mature B lymphocytes. Such a hit-and-run mode of action of Ebf1 has been demonstrated at early stages of murine B cell differentiation (Li et al., 2018).
Of note, a substantial fraction of EBF1 binding in mature B cells, especially at promoters, coincided with epigenetically active chromatin in diverse cell types. Because EBF1 is only expressed in B cells, it cannot regulate methylation or DHS at these locations in the other cell types. Instead, EBF1 binding at these sites may reflect easy accessibility of the factor to pre-formed hypersensitive regions of the genome. On average, EBF1-bound genes that were uniformly hypomethylated in all lineages were also comparably expressed in all cell types. Whether EBF1 contributes to expression of these genes in B cells remains to be determined. By contrast, EBF1-bound genes that were specifically hypomethylated in B cells were also expressed more highly in B cells than in other cell types. We propose that EBF1 regulates B cell-selective gene expression via sites in enhancers where its binding coincides with B cell-specific DNA hypomethylation.
We observed that lineage-specific DNA hypermethylation was largely restricted to adaptive immune cell types and was especially low in innate immune cells. To address the possibility that comparison between closely related myeloid cells eliminated hypermethylation patterns selective for these cells, we compared granulocytes or monocytes individually, or combined, to other cell types. Each myeloid cell type was distinguished from each of the other cell types by thousands of differentially methylated sites. However, comparisons with more than one cell type selectively reduced the numbers of myeloid cell-specific hypermethylated sites. These observations support the notion that hypo- and hyper-methylation are used in distinct ways to specify immune cell identity.
Lineage-specific hypermethylation can be achieved by lack of demethylation at sites that lose meCpG in other lineages or by de novo methylation during lymphopoiesis. Based on a comparison with methylation patterns in fetal liver-derived hematopoietic stem cells (Tejedor et al., 2018), we propose that T lineage-specific hypermethylation is largely established by de novo methylation. It is noteworthy that both our study and that of Tejedor et al. identified PU.1 binding motifs near hypermethylated sites in T cells. PU.1 was also the top motif identified near sites of NK-specific hypermethylation in our study. The latter observation corresponds closely with PU.1 sites noted within chromatin accessible regions that were lost in transition of common lymphoid precursors to NKp cells in mice (Shih et al., 2016). We propose that PU.1 is one factor that directs de novo methylation.
Lineage-specific hypermethylated sites shared transcription factor motifs across many cell types. For example, the RUNX motif was associated with hypermethylated sites in 5 out of 6 cell types examined in this study. Similarly, PU.1 and ETS motifs were associated with hypermethylation in 4 out of the 6 cell types. Because these factors were also associated with hypomethylation in limited cell types, one possibility is that they can recruit either TET or DNMT proteins depending on context. For example, there is evidence that PU.1 can interact with both TET2 and DNMT3B (Izzo et al., 2020). However, circumstances under which PU.1 recruits one or the other enzyme have not been identified. We suggest an alternate possibility. Perhaps the major function of PU.1 and RUNX proteins is to recruit DNMTs to specify lineage identity by DNA hypermethylation. Certain contexts may neutralize their methylation potential, for example by recruiting other DNA binding proteins that promote TET-dependent hypomethylation. One such context is suggested by our observation of PU/IRF composite elements near sites of lineage-specific hypomethylation, but PU.1-only motifs near sites of lineage-specific hypermethylation.
In conclusion, our study defines epigenetic identities of normal human immune cell types, thereby providing the baseline from which deviations that occur with further differentiation, disease or age can be scored with confidence. We also identified key lineage-specific transcription factors that likely regulate these cell-specific signatures via combinatorial and context-dependent mechanisms. Most of these factors have not been previously linked to regulation of methylation. Finally, these studies revealed differential use of methylation to define epigenetic states and gene expression profiles of innate and adaptive immune cell types that may underpin functional differences in developmentally distinct cell types.
Limitations of the study
Limitations of the present study have been noted in appropriate sections throughout the text. First, identity of tissue-specific differentially methylated sites determined here is limited by the selection of CpG residues in the EPIC array. It remains to be determined whether analyses of all CpG sites will show similar patterns. Second, relationships between tissue-specific differentially methylated sites and combinations of transcription factors were inferred based on enrichment of sequence motifs rather than experimental assessment of transcription factor binding in each cell type. As proof-of-principle we related EBF1 binding to hypomethylated sites in B cells by ChIP-Seq. Finally, we note that our gene expression studies were based on the abundance of mRNA rather than measuring gene transcription specifically. Future studies using techniques such as Gro-Seq or chromatin-associated RNA-Seq will permit better correlation between differential methylation and transcription.
STAR METHODS
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Ranjan Sen (senranja@grc.nia.nih.gov).
Materials Availability
This study did not generate new unique reagents.
Data and Code Availability
Raw and normalized DNA methylation and RNASeq data and EBF1 ChIP-Seq data have been deposited in GEO and are publicly available as of the date of publication. Accession numbers are listed in the key resources table.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
PE/Cy7 anti-human CD62L | Biolegend | Clone: DREG-56; Cat# 304822 |
APC anti-human CD45RA | Biolegend | Clone: HI100; Cat# 304112 |
FITC anti-human CD8a | Biolegend | Clone: RPA-T8; Cat# 301050 |
CD19-PerCP/Cy5.5 | Biolegend | Clone: HIB19; Cat# 302230 |
PE anti-human CD38 | Biolegend | Clone: HB-7; Cat# 356604 |
PE/Cy7 anti-human CD27 | Biolegend | Clone: O323; Cat# 302838 |
FITC anti-human CD3 | Biolegend | Clone: OKT3; Cat# 317306 |
FITC anti-human CD4 | Biolegend | Clone: RPA-T4; Cat# 300538 |
PE anti-human CD4 | Biolegend | Clone: RPA-T4; Cat# 300508 |
CD14-PerCP/Cy5.5 | Biolegend | Clone: HCD14; Cat# 325622 |
CD16-PE | Biolegend | Clone: 3G8; Cat# 302008 |
CD8-APC | Biolegend | Clone: HIT8a; Cat# 300912 |
APC anti-human CD56 | Biolegend | Clone: NCAM; Cat# 318310 |
FITC CD91 | BD Bioscience | Cat# 550496 |
Biological samples | ||
PBMC, whole blood and immune subpopulations | Healthy human donors in Genetic and Epigenetic Signatures of Translational Aging Laboratory Testing study (GESTALT) | https://www.clinicaltrials.gov/ct2/show/NCT02339012 |
Chemicals | ||
HBSS buffer | Mediatech | Part # 21-022-CM |
Critical commercial assays and kits | ||
Ficoll-Paque PLUS | GE Healthcare | Cat# GE17-1440-03 |
1X DPBS (Dulbecco's Phosphate Buffered Saline, DPBS) w/o Ca & Mg | Quality Biological | SKU# 114-057-131 |
EasySep™ Human Monocyte Enrichment Kit without CD16 Depletion | STEMCELL Technologies | Cat# 19058 |
EasySep™ Human Whole Blood CD66b Positive Selection Kit | STEMCELL Technologies | Cat# 18682 |
EasySep Negative Human B Cell, Kit | STEMCELL Technologies | Cat# 19054 |
EasySep Negative Human CD4 Kit | STEMCELL Technologies | Cat# 19052 |
EasySep Negative Human CD8, Kit | STEMCELL Technologies | Cat# 19053 |
Zymo EZ-96 DNA Methylation Kit | Zymo | Cat# D5003 |
Infinium® MethylationEPIC BeadChip Kit | Illumina | Cat# WG-317-1003 |
Single Read Flowcell - Big (v4 SR Cluster Kit) | Illumina | Cat# GD-401-4001 |
Sequencing Reagent for Big Flowcell (v4 SBS HiSeq) (50 cycles) | Illumina | Cat# FC-401-4002 |
TruSeq ChIPSeq 48 samples (Set A) | Illumina | Cat# IP-202-1012 |
TruSeq ChIPSeq 48 samples (Set B) | Illumina | Cat# IP-202-1024 |
RNeasy Mini Kit (250) | Qiagen | Cat# 74106 |
miRNeasy mini kit | Qiagen | Cat# 217004 |
Low Input RiboMinus™ Eukaryote System v2 | Life Technologies Corporation | Cat# A15027 |
Bioanalyzer RNA 6000 Nano kit | Agilent | Part No.# 5067-1511 |
Ovation RNASeq System Kits V2 | Nugen | Part No.# 7102-A01 |
Deposited data | ||
Methylation data | This paper | GEO: GSE184269 |
RNASeq data | This paper | GEO: GSE184264 |
EBF1 ChIP-Seq data | This paper | GEO: GSE183537 |
DNase-seq, H3K4me1, H3K4me3 and H3K27ac data | ENCODE project | https://www.encodeproject.org/ |
18 state model of immune cells | chromHMM | https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html#exp_18state |
Tejedor 850K methylation data | (Tejedor et al., 2018) | ArrayExpress Accession # E-MTAB-6315 |
Software and algorithms | ||
GenomeStudio 2011.1 | Illumina | https://www.illumina.com/techniques/microarrays/array-data-analysis-experimental-design/genomestudio.html |
FlowJo v10 | FlowJo | https://www.flowjo.com |
GraphPad Prism v8 | GraphPad Software | https://www.graphpad.com |
R Studio 3.6.0 | The R Foundation | https://www.r-project.org |
DESeq2 v1.30.1 | (Love et al., 2014) | DOI: 10.18129/B9.bioc.DESeq2 |
minfi 1.38.0 | (Aryee et al., 2014) | DOI: 10.18129/B9.bioc.minfi |
Methylation plotter | (Mallona et al., 2014) | https://gattaca.imppc.org/methylation_plotter/ |
Cutadapt 3.4 | (Martin, 2011) | DOI:10.14806/ej.17.1.200 |
FastQC 0.11.9 | (Andrews, 2010) | https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
STAR Aligner 2.7.9a | (Dobin et al., 2013) | DOI: 10.1093/bioinformatics/bts635 |
featureCounts v1.5.2 | (Liao et al., 2019) | DOI: 10.1093/bioinformatics/btt656 |
ReactomePA 1.36.0 | (Yu and He, 2016) | DOI: 10.18129/B9.bioc.ReactomePA |
ClusterProfiler 4.0.5 | (Yu et al., 2012) | DOI: 10.18129/B9.bioc.clusterProfiler |
Deeptools 3.5.1 | (Ramirez et al., 2016) | DOI:10.1093/nar/gkw257 |
Bedops 2.4.40 | (Neph et al., 2012) | DOI: 10.1093/bioinformatics/bts277 |
HOMER 4.11.1 | (Heinz et al., 2010) | DOI: 10.1016/j.molcel.2010.05.004 |
Other | ||
DNA Isolation, DNAQuik DNA Extraction protocol | Reprocell | N/A |
DNA Isolation, Qiagen DNeasy kit (for all cell types) | Genetic Resources Core Facility, Johns Hopkins University | N/A |
DNA Isolation, Chemagic Chemistry on MSM I workstation (for WB) | Genetic Resources Core Facility, Johns Hopkins University | N/A |
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Human Cohort details
Whole blood, peripheral blood mononuclear cells (PBMC) and other immune subpopulations for this study were collected from healthy donors enrolled in the Genetic and Epigenetic Signatures of Translational Aging Laboratory Testing study (GESTALT) (Protocol# 15-AG-0063). GESTALT participants (age range 22-83 years) were selected to be free of major diseases, except for controlled hypertension or a history of cancer that had been clinically silent for at least 10 years, were not chronically on medications (except one antihypertensive drug), had no physical or cognitive impairments, weighed more than 110 lbs and had BMI less than 30 kg/m², and were not professional athletes. Inclusion and exclusion criteria were assessed in a pre-screening telephone call and a subsequent screening visit, during which information from a medical history, physical exam and blood tests was interpreted by a trained nurse practitioner; criteria were confirmed during the initial study visit. Participants unable to undergo cytapheresis or magnetic resonance imaging (MRI), or who were current smokers, were excluded. GESTALT study visits were conducted at the Clinical Research Unit of the National Institute on Aging (Ubaida-Mohien et al., 2019). A total of 55 healthy donors (22-83 years of age) were recruited in two phases of 24 and 31 individuals. The mean age was 54.1±18.6 and 53.4±19.9 years, respectively, with 17 males and 7 females in phase 1 and 17 males and 14 females in phase 2 (Figure S1A). All GESTALT study protocols are approved by the institutional review board of the National Institutes of Health Intramural Research Program. All participants provided written, informed consent at every visit.
METHOD DETAILS
Isolation of peripheral blood mononuclear cells and immune cell populations
Peripheral blood mononuclear cells (PBMCs) were isolated from cytapheresis packs by density gradient centrifugation using Ficoll-Paque Plus (GE Healthcare, NJ, USA). For isolation of specific immune cells, total B, CD4 and CD8 cells were enriched by negative selection using EasySep Negative Human kits specific for each cell type; monocytes were negatively enriched using “EasySep Human Monocyte Enrichment Kit w/o CD16” (StemCell Technologies, Vancouver, Canada) following the manufacturer’s instructions. Natural killer cells were negatively enriched by depleting PBMCs with antibodies directed against CD3, CD4, CD14, CD19 and Glycophorin-A in HBSS buffer. Enriched cell populations were further purified by flow cytometry following Human Immunophenotyping Consortium (HIPC) phenotyping panels (Maecker et al., 2012). Gating strategies and post-sort purity were analyzed with FlowJo software (FlowJo LLC, Ashland, OR) (Figure S1B). Granulocytes were positively selected from whole blood using EasySep™ Human Whole Blood CD66b Positive Selection Kit (StemCell Technologies, Vancouver, Canada) according to the manufacturer’s instructions. All purified cells and PBMC were washed with PBS, snap frozen in liquid nitrogen and stored at −80 °C for subsequent DNA isolation. All enriched and sorted cells were >95% pure as determined by flow cytometry (Figure S1B).
DNA bisulfite conversion and quantification by Illumina Infinium MethylationEPIC microarray
DNA was isolated from 1 – 2 million cells using DNAQuik DNA Extraction protocol (Reprocell, Beltsville, MD) for GESTALT phase 1 and using Chemagic Chemistry on the automated MSM I workstation (Genetic Resources Core Facility, JHU, Baltimore, MD) for GESTALT phase 2. 300 ng of DNA was treated with sodium bisulfite using Zymo EZ-96 DNA Methylation Kit per manufacturer's protocol (Zymo Research, CA). The methylation status of approximately 850,000 CpG sites was determined using Illumina Human MethylationEPIC BeadChip on the Illumina iScan System as per manufacturer’s protocol. Initial data analysis was performed using GenomeStudio 2011.1 (Model M Version 1.9.0 Illumina, Inc. CA). (https://www.illumina.com/products/by-type/microarray-kits/infinium-methylation-epic.html).
Data processing and functional annotation of CpG sites
Data from the Illumina Infinium MethylationEPIC array were analyzed in R using the minfi package from the Bioconductor open source software (Aryee et al., 2014; Fortin et al., 2017). To remove technical and biological biases, probes that failed detection (detection p value cutoff 0.01) and probes containing SNPs were filtered out (Moran et al., 2016), and the data were normalized using the noob and BMIQ methods (Liu and Siegmund, 2016). The noob-BMIQ normalized data, in the form of β values, were then used for differential methylation analysis. The MethylationEPIC probe annotation was obtained from the R-based annotation package provided by Kasper Hansen (IlluminaHumanMethylationEPICanno.ilm10b2.hg19). The package annotates probes to UCSC RefSeq genes (hg19). Based on the location of a probe with respect to a nearby gene, 3 categories were defined: 1) promoter: TSS1500 (from 201 to 1500 bp upstream of the TSS), TSS200 (up to 200 bp upstream of the TSS), 5’UTR and first exon; 2) intragenic: exons (all exons except exon 1), exon-intron boundary, intron and 3’UTR; and 3) intergenic probes, which were treated as a separate group. For probes that mapped to multiple genes or transcripts, the gene suggested by the annotation package and its associated gene location were considered. Locations of probes with respect to CpG islands (CGI) were divided into 4 groups: within CGI, CpG shore (0-2kb from CGI), CpG shelf (2-4kb from CGI) and open sea (>4kb from CGI).
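The probe grouping described above can be illustrated with a minimal Python sketch. It is not the code used in this study, which relied on the R annotation package. The feature labels (`TSS1500`, `Body`, `ExonBnd` and so on) are assumed to follow the Illumina manifest convention, the function names are hypothetical, and a distance of 0 is taken here to mean that the probe lies inside a CpG island.

```python
# Illustrative sketch only; labels and function names are assumptions.
PROMOTER = {"TSS1500", "TSS200", "5'UTR", "1stExon"}
INTRAGENIC = {"Body", "ExonBnd", "3'UTR"}


def gene_context(feature):
    """Map one gene-feature label to promoter / intragenic / intergenic."""
    if feature == "":
        # No gene annotation: treated as an intergenic probe.
        return "intergenic"
    if feature in PROMOTER:
        return "promoter"
    if feature in INTRAGENIC:
        return "intragenic"
    raise ValueError(f"unknown feature label: {feature!r}")


def cgi_context(distance_bp):
    """Classify a probe by its distance (bp) to the nearest CpG island."""
    if distance_bp < 0:
        raise ValueError("distance must be non-negative")
    if distance_bp == 0:
        return "island"      # within CGI
    if distance_bp <= 2000:
        return "shore"       # 0-2 kb from CGI
    if distance_bp <= 4000:
        return "shelf"       # 2-4 kb from CGI
    return "open_sea"        # >4 kb from CGI
```

For example, a probe annotated `TSS200` that lies 1.5 kb from the nearest island would be labeled as a promoter probe in a CpG shore.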
Definition of cell type-specific differentially methylated probes (DMP) and regions (DMR)
β values for each probe in a cell type were compared pairwise to every other cell type within an individual (paired t-test). The resulting p-values were adjusted for multiple testing (Benjamini-Hochberg (BH) adjusted p ≤ 0.05) and the average Δβ was calculated for every pair of cell types (average β_cell1 − average β_cell2) to identify hypomethylated (average Δβ < 0, BH p ≤ 0.05) or hypermethylated (average Δβ > 0, BH p ≤ 0.05) probes in a cell type. Because of their very similar methylation states, naïve CD4+ and CD8+ T cells were compared to the non-T cells. Probes for which |Δβ| ≥ 0.3 in every pairwise comparison were considered cell-specific.
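The selection rule above can be sketched in a few lines of Python. This is a hedged illustration rather than the analysis code used in this study: it assumes that paired t-test p-values have already been computed, and the function names `bh_adjust` and `classify_probe` are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_probe(comparisons, min_abs_delta=0.3, alpha=0.05):
    """Label one probe in one cell type.

    comparisons: list of (delta_beta, bh_adjusted_p), one tuple per
    comparison against another cell type, with
    delta_beta = mean beta(this cell type) - mean beta(other cell type).
    Returns "hypo", "hyper", or None if the probe is not cell-specific.
    """
    if not comparisons:
        return None
    if all(d <= -min_abs_delta and p <= alpha for d, p in comparisons):
        return "hypo"
    if all(d >= min_abs_delta and p <= alpha for d, p in comparisons):
        return "hyper"
    return None
```

A probe is kept only if it passes both thresholds in the same direction against every other cell type, so a single weak comparison is enough to exclude it.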
Cell-specific DMRs were defined as regions where 2 or more consecutive probes within 300bp were hypo- or hypermethylated in one cell type compared to all other cell types. Cell-specific DMRs were defined using the bumphunter package from Bioconductor (Aryee et al., 2014).
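As a simplified illustration of the region definition above, the following Python sketch groups probe coordinates into candidate regions. It is a stand-in, not the bumphunter algorithm. It assumes that the input holds only probes on one chromosome that are already cell-specific in the same direction, and it interprets "within 300bp" as a maximum gap between neighboring probes. Unlike the definition above, it does not check whether non-specific array probes lie between them. The function name `call_regions` is hypothetical.

```python
def call_regions(positions, max_gap=300, min_probes=2):
    """Group probe coordinates (bp) into regions of nearby probes.

    Returns a list of (start, end, n_probes) tuples for every cluster
    that contains at least `min_probes` probes.
    """
    regions = []
    current = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current cluster.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_probes:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Flush the last cluster.
    if len(current) >= min_probes:
        regions.append((current[0], current[-1], len(current)))
    return regions
```

With the illustrative coordinates 100, 350, 600, 2000, 5000 and 5100, the sketch reports two regions (100 to 600 and 5000 to 5100) and drops the isolated probe at 2000.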
Visualization of cell-specific DMR
For selected cell-specific DMRs, the average β value for each cell type was obtained. The plot was generated using Methylation plotter software (Mallona et al., 2014).
RNA extraction and RNA-seq library preparation
Total RNA was prepared from 2 × 10⁶ cells using the miRNeasy mini kit (Qiagen Inc, CA) according to the manufacturer’s recommendations. RNA quality and quantity were checked using RNA-6000 nano kits on the Agilent 2100-Bioanalyzer, and 500ng of the total RNA was used for ribosomal RNA (rRNA) depletion using GeneRead rRNA Depletion Nano Kit (Qiagen Inc, CA). 50ng of the rRNA-depleted RNA was used for cDNA synthesis followed by single primer isothermal amplification (SPIA) using Ovation RNA–Seq System V2 kits according to the manufacturer’s protocol (Nugen Technologies Inc, CA). 375ng of amplified cDNA, sheared to an average size of 150–250 bases, was used to prepare the libraries using Illumina ChIP-Seq kits (Illumina Inc, CA) according to the manufacturer’s protocol, followed by sequencing on an Illumina Hiseq2500 sequencer with V4 reagents. Single-read sequencing was performed for 138 cycles and Real-Time Analysis (RTA) v1.18.66.3 generated the base-call files (BCL files). BCL files were demultiplexed and converted to standard FASTQ files using the bcl2fastq program (v2.17.1.14).
RNA-Seq data analysis
After adapter removal and end trimming of raw FASTQ files using the cutadapt program (Martin, 2011), the quality of reads was checked using FastQC software (Andrews, 2010). A quality threshold of Q30 was used in cutadapt for trimming low quality reads. The reads were aligned to the hg19 genome using STAR aligner (Dobin et al., 2013). STAR was used with default parameters (10 mismatches in total). The output BAM files were then processed into read-count files using the featureCounts tool from the Rsubread package in Bioconductor (Liao et al., 2019). Annotation of the reads was done with ENSEMBL hg19 (v82). Genes with low read counts (<10 reads in 4 or more samples) were filtered out and pseudogenes were further excluded from the analysis. Differentially expressed transcripts in each cell type were identified using the DESeq2 software package (Love et al., 2014). Genes were considered cell-selective if they were over- or under-expressed in a cell type with respect to every other cell type by ≥ 4-fold at a BH adjusted p<0.05. For association between gene expression and methylation, an average β was separately calculated for probes in promoters and intragenic regions for every cell-selective gene from all the phase 1 donors. For each probe, the corresponding Δβ was obtained by subtracting the mean β of that cell type from the mean β of all the other 5 cell types combined.
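The count filter and the cell-selectivity criterion described above can be sketched as follows. This is an illustrative Python sketch, not the DESeq2 workflow used in this study. It assumes that log2 fold changes and BH-adjusted p-values for each pairwise comparison are already available, and the function names `keep_gene` and `is_cell_selective` are hypothetical.

```python
import math


def keep_gene(counts, min_reads=10, max_low_samples=3):
    """Keep a gene unless it has <10 reads in 4 or more samples."""
    low = sum(1 for c in counts if c < min_reads)
    return low <= max_low_samples


def is_cell_selective(results, min_fold=4.0, alpha=0.05):
    """Label one gene in one cell type.

    results: list of (log2_fold_change, bh_adjusted_p), one tuple per
    comparison of this cell type against another cell type.
    Returns "over", "under", or None.
    """
    if not results:
        return None
    threshold = math.log2(min_fold)  # 4-fold corresponds to log2FC of 2
    if not all(p < alpha for _, p in results):
        return None
    if all(lfc >= threshold for lfc, _ in results):
        return "over"
    if all(lfc <= -threshold for lfc, _ in results):
        return "under"
    return None
```

As with the methylation criterion, a gene must pass the fold-change and significance thresholds against every other cell type to be called cell-selective.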
Pathway analysis using ReactomePA
The pathway analyses were done using the ReactomePA package available through Bioconductor, which is based on the REACTOME pathway database (Yu and He, 2016). Briefly, the gene names were converted to Entrez gene IDs using bitr. Enrichment analysis was then performed to identify enriched pathways (FDR p <0.05). ClusterProfiler was used to visualize all the cell types in one graph.
Visualization of histone peaks and DHS peaks from ENCODE project using deepTools
Primary cell DHS and chromatin ChIP-Seq bigwig files were downloaded from ENCODE (https://www.encodeproject.org/) using identifiers provided in SI Table 3 (Sloan et al., 2016). DeepTools was used to visualize the pattern of DHS and histone peaks in ±3kb region surrounding methylated sites. For plotting the pattern in each cell type, the order of methylated probes was determined based on descending score of DHS peaks and this order was followed for all histone marks (H3K4me1, H3K4me3 and H3K27ac).
Annotation of cell-specific methylated probes using chromHMM based 18-state model
The 18-state chromHMM models (based on 6 chromatin marks: H3K4me3, H3K4me1, H3K36me3, H3K27me3, H3K9me3 and H3K27ac) for various immune cells (E032- primary B cell, E038- primary CD4N, E047- primary CD8N, E029- monocyte, E046- NK cell) were downloaded from the Roadmap Epigenomics project (https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html). The Bedops tool was used to determine the overlap between the cell-specific methylated sites and the respective chromHMM profiles. As a control, all probes from the Illumina Infinium MethylationEPIC array were also partitioned using each of the immune cell chromHMM profiles.
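The overlap step can be illustrated with a small Python sketch that assigns each site to the chromatin-state segment that contains it. It is a stand-in for the Bedops operation and not the procedure used in this study. It assumes sorted, non-overlapping, half-open segments on a single chromosome, and the function name `annotate_sites`, the coordinates and the shortened state labels are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def annotate_sites(sites, segments):
    """Count sites per chromatin state.

    sites: 0-based positions on one chromosome.
    segments: (start, end, state) tuples, half-open [start, end),
    sorted by start and non-overlapping (BED-like).
    """
    starts = [start for start, _, _ in segments]
    counts = Counter()
    for pos in sites:
        # Rightmost segment whose start is <= pos.
        idx = bisect_right(starts, pos) - 1
        if idx >= 0 and pos < segments[idx][1]:
            counts[segments[idx][2]] += 1
        else:
            counts["unannotated"] += 1
    return counts


# Hypothetical segments and sites, for illustration only.
segments = [(0, 1000, "TssA"), (1000, 3000, "EnhA1"), (5000, 9000, "Quies")]
state_counts = annotate_sites([10, 999, 1000, 2500, 4000, 8999, 9000], segments)
```

Running the same function on all array probes gives the background distribution against which the cell-specific sites can be compared.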
Prediction of de novo transcription factor binding motifs by HOMER
200bp surrounding each differentially methylated site was provided as input for analysis in HOMER using de novo setting (Heinz et al., 2010). The top 5 motifs based on p-value were selected from each analysis. Bubble plots were prepared using ggplot2 package in R.
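Preparing the input windows can be sketched as follows. This Python sketch is illustrative and not the code used in this study. It assumes that "200bp surrounding" means a 200 bp window centered on the CpG (100 bp on each side), that CpG positions are 1-based, and that the output is a 6-column BED-style line. The function names `site_window` and `bed_line` are hypothetical.

```python
def site_window(chrom, pos, width=200):
    """Return a 0-based, half-open (chrom, start, end) window around a
    1-based CpG position, clipped at the start of the chromosome."""
    half = width // 2
    center = pos - 1               # convert to 0-based
    start = max(0, center - half)
    end = center + half
    return (chrom, start, end)


def bed_line(chrom, pos, name, width=200):
    """Format one site as a tab-separated BED-style line (strand '+')."""
    chrom, start, end = site_window(chrom, pos, width)
    return f"{chrom}\t{start}\t{end}\t{name}\t.\t+"
```

One line per differentially methylated site, written to a file, gives a region list in a format that motif discovery tools commonly accept.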
EBF1 ChIP-Seq in human primary B-cells
Human CD19+ primary B-cells were obtained from three healthy individuals. EBF1 ChIP was performed as previously described (Li et al., 2018). ChIP-Seq library preparation and deep sequencing were performed at the deep sequencing facility of the Max Planck Institute of Immunology and Epigenetics (Freiburg, Germany). Paired-end reads were mapped to hg19 using bowtie2 (version 2.2.3). Duplicate reads, reads with mapping quality below 30 and reads mapping to blacklisted regions were filtered out using Picard, Samtools and bedtools, and paired reads from all three replicates were combined for peak calling by MACS2 (2.1.1) with a q-value cutoff of 0.01 and input DNA as a control. The peaks identified were further normalized and visualized in all three replicates using deepTools (Ramirez et al., 2016).
Comparison to published data by Tejedor et al.
Lists of hypo- and hypermethylated probes in human ESCs (hESC), common myeloid progenitors (CMP), B-cell progenitors (BCP) and six major mature white blood cell subsets (neutrophils, monocytes, CD4+ T lymphocytes, CD8+ T lymphocytes, natural killer (NK) cells and B lymphocytes), identified with respect to hematopoietic stem cells (HSCs), were extracted from supplementary table 3 (Tejedor et al., 2018). The raw IDAT files of the samples were obtained from ArrayExpress under the accession number E-MTAB-6315. The raw IDAT files were noob-BMIQ normalized like our dataset before the data were visualized for cell-specific probes identified in our study.
Use of R software
R packages were used for most plots including ggplot2, pheatmap, prcomp, reshape2.
Acknowledgements
This work was supported by the Intramural Research Program of the National Institute on Aging and funds from the Max Planck Society (S.R., S.B., R.G.). We are grateful to the GESTALT participants and the GESTALT Study Team at Harbor Hospital and NIA. We thank Drs. Anjana Rao and Yehudit Bergman for their valuable comments and feedback on the paper.
Footnotes
Declaration of Interest
A.B. holds stock in Google, Inc. A.B. is a consultant for Third Rock Ventures, LLC.
REFERENCES
- Andrews S (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data.
- Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, and Irizarry RA (2014). Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369.
- Bain G, Quong MW, Soloff RS, Hedrick SM, and Murre C (1999). Thymocyte maturation is regulated by the activity of the helix-loop-helix protein, E47. J Exp Med 190, 1605–1616.
- Barndt R, Dai MF, and Zhuang Y (1999). A novel role for HEB downstream or parallel to the pre-TCR signaling pathway during alpha beta thymopoiesis. J Immunol 163, 3331–3343.
- Benner C, Isoda T, and Murre C (2015). New roles for DNA cytosine modification, eRNA, anchors, and superanchors in developing B cell progenitors. Proc Natl Acad Sci U S A 112, 12776–12781.
- Bestor TH, Edwards JR, and Boulard M (2015). Notes on the role of dynamic DNA methylation in mammalian development. Proc Natl Acad Sci U S A 112, 6796–6799.
- Bonin M, Weidel L, Schendel P, Mans K, Flemming S, Grützkau A, Smiljanovic B, Sörensen T, Günther S, and Häupl T (2014). A8.20 Bioconpages - comparison of DNA methylation and gene expression in different immune cells. Annals of the Rheumatic Diseases.
- Brillantes M, and Beaulieu AM (2019). Transcriptional control of natural killer cell differentiation. Immunology 156, 111–119.
- Calvanese V, Fernandez AF, Urdinguio RG, Suarez-Alvarez B, Mangas C, Perez-Garcia V, Bueno C, Montes R, Ramos-Mejia V, Martinez-Camblor P, et al. (2012). A promoter DNA demethylation landscape of human hematopoietic differentiation. Nucleic Acids Res 40, 116–131.
- Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, and Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
- Dzierzak E, and Speck NA (2008). Of lineage and legacy: the development of mammalian hematopoietic stem cells. Nat Immunol 9, 129–136.
- Ernst J, and Kellis M (2012). ChromHMM: automating chromatin-state discovery and characterization. Nat Methods 9, 215–216.
- Farlik M, Halbritter F, Muller F, Choudry FA, Ebert P, Klughammer J, Farrow S, Santoro A, Ciaurro V, Mathur A, et al. (2016). DNA Methylation Dynamics of Human Hematopoietic Stem Cell Differentiation. Cell Stem Cell 19, 808–822.
- Fortin JP, Triche TJ Jr., and Hansen KD (2017). Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics 33, 558–560.
- Geiger TL, and Sun JC (2016). Development and maturation of natural killer cells. Curr Opin Immunol 39, 82–89.
- Ginno PA, Gaidatzis D, Feldmann A, Hoerner L, Imanci D, Burger L, Zilbermann F, Peters A, Edenhofer F, Smallwood SA, et al. (2020). A genome-scale map of DNA methylation turnover identifies site-specific dependencies of DNMT and TET activity. Nat Commun 11, 2680.
- Hachiya T, Furukawa R, Shiwa Y, Ohmomo H, Ono K, Katsuoka F, Nagasaki M, Yasuda J, Fuse N, Kinoshita K, et al. (2017). Genome-wide identification of inter-individually variable DNA methylation sites improves the efficacy of epigenetic association studies. NPJ Genom Med 2, 11.
- Hagman J, and Lukin K (2006). Transcription factors drive B cell development. Curr Opin Immunol 18, 127–134.
- Hannum G, Guinney J, Zhao L, Zhang L, Hughes G, Sadda S, Klotzle B, Bibikova M, Fan JB, Gao Y, et al. (2013). Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol Cell 49, 359–367.
- Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, and Glass CK (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 576–589.
- Horvath S (2013). DNA methylation age of human tissues and cell types. Genome Biol 14, R115.
- Hosokawa H, and Rothenberg EV (2020). How transcription factors drive choice of the T cell fate. Nat Rev Immunol.
- Izzo F, Lee SC, Poran A, Chaligne R, Gaiti F, Gross B, Murali RR, Deochand SD, Ang C, Jones PW, et al. (2020). DNA methylation disruption reshapes the hematopoietic differentiation landscape. Nat Genet 52, 378–387.
- Kulis M, Merkel A, Heath S, Queiros AC, Schuyler RP, Castellano G, Beekman R, Raineri E, Esteve A, Clot G, et al. (2015). Whole-genome fingerprint of the DNA methylome during human B cell differentiation. Nat Genet 47, 746–756.
- Kurioka A, Klenerman P, and Willberg CB (2018). Innate-like CD8+ T-cells and NK cells: converging functions and phenotypes. Immunology.
- Li R, Cauchy P, Ramamoorthy S, Boller S, Chavez L, and Grosschedl R (2018). Dynamic EBF1 occupancy directs sequential epigenetic and transcriptional events in B-cell programming. Genes Dev 32, 96–111.
- Liao Y, Smyth GK, and Shi W (2019). The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res 47, e47.
- Lio CJ, and Rao A (2019). TET Enzymes and 5hmC in Adaptive and Innate Immune Systems. Front Immunol 10, 210.
- Liu J, and Siegmund KD (2016). An evaluation of processing methods for HumanMethylation450 BeadChip data. BMC Genomics 17, 469.
- Love MI, Huber W, and Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550.
- Maecker HT, McCoy JP, and Nussenblatt R (2012). Standardizing immunophenotyping for the Human Immunology Project. Nat Rev Immunol 12, 191–200.
- Mallona I, Diez-Villanueva A, and Peinado MA (2014). Methylation plotter: a web tool for dynamic visualization of DNA methylation data. Source Code Biol Med 9, 11.
- Martin M (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12.
- Mayran A, Sochodolsky K, Khetchoumian K, Harris J, Gauthier Y, Bemmo A, Balsalobre A, and Drouin J (2019). Pioneer and nonpioneer factor cooperation drives lineage specific chromatin opening. Nat Commun 10, 3807.
- Mercier FE, Ragu C, and Scadden DT (2011). The bone marrow at the crossroads of blood and immunity. Nat Rev Immunol 12, 49–60.
- Moran S, Arribas C, and Esteller M (2016). Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics 8, 389–399.
- Neph S, Kuehn MS, Reynolds AP, Haugen E, Thurman RE, Johnson AK, Rynes E, Maurano MT, Vierstra J, Thomas S, et al. (2012). BEDOPS: high-performance genomic feature operations. Bioinformatics 28, 1919–1920.
- O'Riordan M, and Grosschedl R (1999). Coordinate regulation of B cell differentiation by the transcription factors EBF and E2A. Immunity 11, 21–31.
- Orlanski S, Labi V, Reizel Y, Spiro A, Lichtenstein M, Levin-Klein R, Koralov SB, Skversky Y, Rajewsky K, Cedar H, et al. (2016). Tissue-specific DNA demethylation is required for proper B-cell differentiation and function. Proc Natl Acad Sci U S A 113, 5018–5023.
- Pidsley R, Zotenko E, Peters TJ, Lawrence MG, Risbridger GP, Molloy P, Van Djik S, Muhlhausler B, Stirzaker C, and Clark SJ (2016). Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol 17, 208.
- Ramirez F, Ryan DP, Gruning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dundar F, and Manke T (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160–165.
- Rolink AG, Balciunaite G, Demoliere C, and Ceredig R (2006). The potential involvement of Notch signaling in NK cell development. Immunol Lett 107, 50–57.
- Rothenberg EV (2014). Transcriptional control of early T and B cell developmental choices. Annu Rev Immunol 32, 283–321.
- Schuyler RP, Merkel A, Raineri E, Altucci L, Vellenga E, Martens JHA, Pourfarzad F, Kuijpers TW, Burden F, Farrow S, et al. (2016). Distinct Trends of DNA Methylation Patterning in the Innate and Adaptive Immune Systems. Cell Rep 17, 2101–2111.
- Shih HY, Sciume G, Mikami Y, Guo L, Sun HW, Brooks SR, Urban JF Jr., Davis FP, Kanno Y, and O'Shea JJ (2016). Developmental Acquisition of Regulomes Underlies Innate Lymphoid Cell Functionality. Cell 165, 1120–1133.
- Sloan CA, Chan ET, Davidson JM, Malladi VS, Strattan JS, Hitz BC, Gabdank I, Narayanan AK, Ho M, Lee BT, et al. (2016). ENCODE data at the ENCODE portal. Nucleic Acids Res 44, D726–732.
- Tejedor JR, Bueno C, Cobo I, Bayon GF, Prieto C, Mangas C, Perez RF, Santamarina P, Urdinguio RG, Menendez P, et al. (2018). Epigenome-wide analysis reveals specific DNA hypermethylation of T cells during human hematopoietic differentiation. Epigenomics 10, 903–923.
- Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, Haugen E, Sheffield NC, Stergachis AB, Wang H, Vernot B, et al. (2012). The accessible chromatin landscape of the human genome. Nature 489, 75–82.
- Ubaida-Mohien C, Lyashkov A, Gonzalez-Freire M, Tharakan R, Shardell M, Moaddel R, Semba RD, Chia CW, Gorospe M, Sen R, et al. (2019). Discovery proteomics in aging human skeletal muscle finds change in spliceosome, immunity, proteostasis and mitochondria. Elife 8.
- Weber BN, Chi AW, Chavez A, Yashiro-Ohtani Y, Yang Q, Shestova O, and Bhandoola A (2011). A critical role for TCF-1 in T-lineage specification and differentiation. Nature 476, 63–68.
- Yu G, and He QY (2016). ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization. Mol Biosyst 12, 477–479.
- Yu G, Wang LG, Han Y, and He QY (2012). clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287.
- Zhong C, and Zhu J (2017). Transcriptional regulators dictate innate lymphoid cell fates. Protein Cell 8, 242–254.
- Ziller MJ, Gu H, Muller F, Donaghey J, Tsai LT, Kohlbacher O, De Jager PL, Rosen ED, Bennett DA, Bernstein BE, et al. (2013). Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481.
- Zook EC, and Kee BL (2016). Development of innate lymphoid cells. Nat Immunol 17, 775–782.
Data Availability Statement
Raw and normalized DNA methylation, RNASeq, and EBF1 ChIP-Seq data have been deposited in GEO and are publicly available as of the date of publication. Accession numbers are listed in the key resources table.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
PE/Cy7 anti-human CD62L | Biolegend | Clone: DREG-56; Cat# 304822 |
APC anti-human CD45RA | Biolegend | Clone: HI100; Cat# 304112 |
FITC anti-human CD8a | Biolegend | Clone: RPA-T8; Cat# 301050 |
CD19-PerCP/Cy5.5 | Biolegend | Clone: HIB19; Cat# 302230 |
PE anti-human CD38 | Biolegend | Clone: HB-7; Cat# 356604 |
PE/Cy7 anti-human CD27 | Biolegend | Clone: O323; Cat# 302838 |
FITC anti-human CD3 | Biolegend | Clone: OKT3; Cat# 317306 |
FITC anti-human CD4 | Biolegend | Clone: RPA-T4; Cat# 300538 |
PE anti-human CD4 | Biolegend | Clone: RPA-T4; Cat# 300508 |
CD14-PerCP/Cy5.5 | Biolegend | Clone: HCD14; Cat# 325622 |
CD16-PE | Biolegend | Clone: 3G8; Cat# 302008 |
CD8-APC | Biolegend | Clone: HIT8a; Cat# 300912 |
APC anti-human CD56 | Biolegend | Clone: NCAM; Cat# 318310 |
FITC CD91 | BD Biosciences | Cat# 550496 |
Biological samples | ||
PBMC, whole blood and immune subpopulations | Healthy human donors in Genetic and Epigenetic Signatures of Translational Aging Laboratory Testing study (GESTALT) | https://www.clinicaltrials.gov/ct2/show/NCT02339012 |
Chemicals | ||
HBSS buffer | Mediatech | Part # 21-022-CM |
Critical commercial assays and kits | ||
Ficoll-Paque PLUS | GE Healthcare | Cat# GE17-1440-03 |
1X DPBS (Dulbecco's Phosphate Buffered Saline, DPBS) w/o Ca & Mg | Quality Biological | SKU# 114-057-131 |
EasySep™ Human Monocyte Enrichment Kit without CD16 Depletion | STEMCELL Technologies | Cat# 19058 |
EasySep™ Human Whole Blood CD66b Positive Selection Kit | STEMCELL Technologies | Cat# 18682 |
EasySep Negative Human B Cell Kit | STEMCELL Technologies | Cat# 19054 |
EasySep Negative Human CD4 Kit | STEMCELL Technologies | Cat# 19052 |
EasySep Negative Human CD8 Kit | STEMCELL Technologies | Cat# 19053 |
Zymo EZ-96 DNA Methylation Kit | Zymo | Cat# D5003 |
Infinium® MethylationEPIC BeadChip Kit | Illumina | Cat# WG-317-1003 |
Single Read Flowcell - Big (v4 SR Cluster Kit) | Illumina | Cat# GD-401-4001 |
Sequencing Reagent for Big Flowcell (v4 SBS HiSeq) (50 cycles) | Illumina | Cat# FC-401-4002 |
TruSeq ChIPSeq 48 samples (Set A) | Illumina | Cat# IP-202-1012 |
TruSeq ChIPSeq 48 samples (Set B) | Illumina | Cat# IP-202-1024 |
RNeasy Mini Kit (250) | Qiagen | Cat# 74106 |
miRNeasy mini kit | Qiagen | Cat# 217004 |
Low Input RiboMinus™ Eukaryote System v2 | Life Technologies Corporation | Cat# A15027 |
Bioanalyzer RNA 6000 Nano kit | Agilent | Part No.# 5067-1511 |
Ovation RNASeq System Kits V2 | Nugen | Part No.# 7102-A01 |
Deposited data | ||
Methylation data | This paper | GEO: GSE184269 |
RNASeq data | This paper | GEO: GSE184264 |
EBF1 ChIP-Seq data | This paper | GEO: GSE183537 |
DNase-seq, H3K4me1, H3K4me3 and H3K27ac data | ENCODE project | https://www.encodeproject.org/ |
18 state model of immune cells | chromHMM | https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html#exp_18state |
Tejedor 850K methylation data | (Tejedor et al., 2018) | ArrayExpress Accession # E-MTAB-6315 |
Software and algorithms | ||
GenomeStudio 2011.1 | Illumina | https://www.illumina.com/techniques/microarrays/array-data-analysis-experimental-design/genomestudio.html |
FlowJo v10 | FlowJo | https://www.flowjo.com |
GraphPad Prism v8 | GraphPad Software | https://www.graphpad.com |
R Studio 3.6.0 | The R Foundation | https://www.r-project.org |
DESeq2 v1.30.1 | (Love et al., 2014) | DOI: 10.18129/B9.bioc.DESeq2 |
minfi 1.38.0 | (Aryee et al., 2014) | DOI: 10.18129/B9.bioc.minfi |
Methylation plotter | (Mallona et al., 2014) | https://gattaca.imppc.org/methylation_plotter/ |
Cutadapt 3.4 | (Martin, 2011) | DOI:10.14806/ej.17.1.200 |
FastQC 0.11.9 | (Andrews, 2010) | https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
STAR Aligner 2.7.9a | (Dobin et al., 2013) | DOI: 10.1093/bioinformatics/bts635 |
featureCounts v1.5.2 | (Liao et al., 2019) | DOI: 10.1093/bioinformatics/btt656 |
ReactomePA 1.36.0 | (Yu and He, 2016) | DOI: 10.18129/B9.bioc.ReactomePA |
ClusterProfiler 4.0.5 | (Yu et al., 2012) | DOI: 10.18129/B9.bioc.clusterProfiler |
Deeptools 3.5.1 | (Ramirez et al., 2016) | DOI:10.1093/nar/gkw257 |
Bedops 2.4.40 | (Neph et al., 2012) | DOI: 10.1093/bioinformatics/bts277 |
HOMER 4.11.1 | (Heinz et al., 2010) | DOI: 10.1016/j.molcel.2010.05.004 |
Other | ||
DNA Isolation, DNAQuik DNA Extraction protocol | Reprocell | N/A |
DNA Isolation, Qiagen DNeasy kit (for all cell types) | Genetic Resources Core Facility, Johns Hopkins University | N/A |
DNA Isolation, Chemagic Chemistry on MSM I workstation (for WB) | Genetic Resources Core Facility, Johns Hopkins University | N/A |