Summary
Epigenome-wide association studies (EWAS) provide an alternative approach for studying human disease through consideration of non-genetic variants such as altered DNA methylation. To advance the complex interpretation of EWAS, we developed eFORGE (http://eforge.cs.ucl.ac.uk/), a new standalone and web-based tool for the analysis and interpretation of EWAS data. eFORGE determines the cell type-specific regulatory component of a set of EWAS-identified differentially methylated positions. This is achieved by detecting enrichment of overlap with DNase I hypersensitive sites across 454 samples (tissues, primary cell types, and cell lines) from the ENCODE, Roadmap Epigenomics, and BLUEPRINT projects. Application of eFORGE to 20 publicly available EWAS datasets identified disease-relevant cell types for several common diseases, a stem cell-like signature in cancer, and demonstrated the ability to detect cell-composition effects for EWAS performed on heterogeneous tissues. Our approach bridges the gap between large-scale epigenomics data and EWAS-derived target selection to yield insight into disease etiology.
Keywords: bioinformatics, epigenetics, epigenome-wide association study, histone marks, DNase I hypersensitive sites
Graphical Abstract
Highlights
-
•
Development of a tool for the analysis and interpretation of EWAS data
-
•
Identification of cell type-specific signals in heterogeneous EWAS data
-
•
Identification of cell-composition effects in EWAS
-
•
Compilation of eFORGE catalog of 20 published EWAS
Breeze et al. develop a tool for the analysis and interpretation of EWAS data. The eFORGE tool identifies cell type-specific, disease-relevant signals in heterogeneous EWAS data and can also identify cell-composition effects. Explore consortium data at the Cell Press IHEC webportal at http://www.cell.com/consortium/IHEC.
Introduction
Common complex diseases, such as autoimmune, metabolic, and neurodegenerative diseases as well as cancer, typically involve multiple genetic and non-genetic factors. The search for genetic factors through genome-wide association studies (GWAS) has identified thousands of replicated SNPs associated with many diseases and other complex phenotypes (Welter et al., 2014). Notably, most of the identified variants were found to map at non-protein coding regions where their molecular consequences are difficult to evaluate (Paul et al., 2014). The search for non-genetic factors is more complicated due to the many confounding factors, in particular, genetic heterogeneity between individuals and cell-type heterogeneity in accessible sample material (e.g., whole blood). Nonetheless, significant progress has been made following the introduction of systematic epigenome-wide association studies (EWAS) in 2011 (Rakyan et al., 2011), including correction of cell-composition effects (Houseman et al., 2012, Zou et al., 2014, McGregor et al., 2016, Houseman et al., 2016). Since then, over 250 EWAS have been conducted (Michels et al., 2013). A large proportion of these EWAS have correlated DNA methylation (DNAm) profiles with disease status and environmental exposure in cross-sectional case-control cohorts. A subset of these studies reported differentially methylated positions (DMPs) that replicated in independent sample cohorts, e.g., for smoking behavior (Philibert et al., 2015). Fine mapping of the disease-associated genetic variants and integrative analysis with cell type-specific epigenomic datasets suggest a possible causal involvement of affected regulatory sequences, especially enhancers (Farh et al., 2015, Kundaje et al., 2015).
Several large-scale initiatives such as the ENCODE (ENCODE Project Consortium, 2012), Epigenomics Roadmap (Kundaje et al., 2015), and BLUEPRINT (Adams et al., 2012) projects have mapped gene regulatory elements across a wide range of tissues and cell types using a variety of assays. Active regulatory elements have been shown to concentrate at DNase I hypersensitive sites (DHSs) (Dorschner et al., 2004). The intersection of these DHSs with common disease-associated variants has proved to be a powerful approach to identify potential regulatory variants implicated in the disease (Maurano et al., 2012). This approach has been implemented as a publicly available web tool, termed FORGE, and systematically applied to analyze SNP sets for 260 phenotypes from the GWAS catalog (Dunham et al., 2015). The functional interpretation of the resulting SNP-DHS overlaps is that highly associated GWAS SNPs are enriched for DHSs of specific cell types that are relevant to the etiology of the disease under investigation. This is because such SNPs are thought to exert functional changes in affected regulatory elements (marked by DHSs), which, in turn, can then affect gene expression and may ultimately result in phenotypic changes.
While genotypes are generally invariable, epigenotypes are ontogenetically, spatially, and temporally variable. This can impede the meaningful interpretation of DMPs identified through EWAS (Michels et al., 2013, Birney et al., 2016, Paul and Beck, 2014). For example, EWAS are often performed on heterogeneous tissues (e.g., whole blood) where cell-composition effects, or only a specific cell type within that tissue, may be driving the observed epigenetic signal. While significant progress has been made to reduce major sources of confounding (reviewed in Liang and Cookson, 2014, Wilhelm-Benartzi et al., 2013, Rakyan et al., 2011, Michels et al., 2013, Birney et al., 2016), additional tools are required to improve EWAS analysis and elucidate disease mechanisms. To this end, we have developed eFORGE (experimentally derived Functional element Overlap analysis of ReGions from EWAS), a bioinformatics tool that informs which EWAS DMPs are likely to be functional and in which tissue or cell type. This is achieved by overlap analysis between a DMP set of choice and reference DHS maps, improving on previous overlap analyses (Stadler et al., 2011, Slieker et al., 2013). eFORGE has been implemented both as standalone software and an interactive web tool, and rigorously tested for performance criteria, including reproducibility, false-positive rates, and code execution speed. To demonstrate its utility, we systematically analyzed publicly available EWAS datasets to explore the suitability of eFORGE for analyzing surrogate tissues and detecting cell-composition effects, with the ultimate aim of providing insights into disease mechanisms.
Results
eFORGE Method and Workflow
The main objective of eFORGE is the prediction of disease-relevant cell type(s) from EWAS data generated in heterogeneous tissues, as illustrated in Figure 1A. eFORGE analyses such EWAS data generated using Illumina Infinium BeadChips (i.e., 27k, 450k, and EPIC arrays). For a given set of high-scoring probes on the array platform (indicated by large black dots in upper left panel), eFORGE generates a background of 1,000 random probe sets with matching properties regarding their location within genes and CpG islands (Tables S1 and S2). It then overlaps these sets with DHSs for up to 454 selectable reference samples (tissues, primary cell types, and cell lines) from the ENCODE, Epigenomics Roadmap, and BLUEPRINT projects (Data S1). Finally, eFORGE compares the number of DMPs overlapping DHSs with those obtained by random probes to calculate enrichment scores for each of the selected cell types. For a given eFORGE plot, as illustrated in the bottom right panel of Figure 1A, cell types are shown on the horizontal axis and the significance (e.g., –log10 binomial p value) is shown on the vertical axis. Each dot represents an enrichment p value for a given cell type. Enrichments with multiple testing-corrected q values <0.05 and <0.01 are shown as pink and red dots, respectively. Samples that do not reach statistical significance (q value >0.05) are shown in blue.
Assessment of eFORGE Performance
First, we assessed the reproducibility of eFORGE. Each time eFORGE is run on a dataset, it selects matched background probes at random. Because of this, the background is different for each eFORGE analysis. To assess the level of reproducibility between individual analyses, we compared 1,000 independent eFORGE runs using the same dataset, which also represents the default test set made available as part of the stand-alone and web-based versions of eFORGE (see also Experimental Procedures). Using this test set, we obtained a highly consistent pattern of enrichment for the same target cell types between individual runs (Figure 1B), demonstrating that eFORGE results are highly reproducible.
Second, we assessed two methods for the correction of multiple testing: Benjamini-Hochberg (BH), a false discovery rate (FDR)-based correction method for independent tests (Benjamini and Hochberg, 1995), and Benjamini-Yekutieli (BY), an FDR-based correction for multiple tests that are not independent (Benjamini and Yekutieli, 2001). We reasoned that the latter would be more appropriate in our case as DHSs across different cell types have been found to be cell lineage dependent (Stergachis et al., 2013). Benjamini-Hochberg correction resulted in one false-positive (q value <0.01), after running >2.3 million tests (8,000 runs of 299 samples each), compared to no false-positives with BY correction (q value <0.01). Based on this result, we implemented BY correction for eFORGE analyses (see also Tables S3 and S4).
Third, we assessed eFORGE computational speed using different approaches to manage the large decimal p value numbers resulting from the many statistical tests performed. The Perl BigFloat and BigInt functions used by FORGE (Dunham et al., 2015) were slow and, in some cases, even impeded eFORGE code execution. Switching to using the logarithms of the values for the binomial test not only reduced the amount of digits needed in calculation, but also dramatically increased code execution speed as shown in Figure 1C. We anticipate that this advance will become particularly noticeable in the online version of eFORGE when encountering high user demand.
Inter-consortium Correlation Analysis
One possible caveat when interpreting eFORGE results could be consortium-specific differences in the generation of reference datasets used by eFORGE. These include data-analysis pipelines, experimental protocols (including read depth specifications), and materials used (i.e., ENCODE predominantly analyzed transformed cell lines, whereas Roadmap and BLUEPRINT used primary tissue samples). In order to quantify any consortium bias on data generation, we applied the Genomic Efficient Correlator (GeEC) tool (see Experimental Procedures). This tool can measure the correlation of data from different consortia and identify drivers of clustering for these data. This analysis did not detect any consortium bias on data generation (Figure 1D; Table S5).
Application to tDMPs
As a positive control, we assessed the ability of eFORGE to identify the correct tissues and cell type(s) when tested with probe sets of established tissue and cell-type specificity. In this regard, we analyzed three sets of previously reported tissue-specific DMPs (tDMPs) (Lowe et al., 2015) and three sets of cell type-specific DMPs (cDMPs) (Jaffe and Irizarry, 2014) using consolidated Roadmap DHS data. Figure 2 shows the resulting heatmap, which demonstrates the ability of eFORGE to unambiguously identify the relevant tissues and cell types for each tDMP and cDMP set (i.e., blood, kidney, lung, monocytes, natural killer cells, and T cells). To quantify the level at which eFORGE can detect mixed tissue- and cell type-specific enrichment, we next assessed its performance on mixed tDMP and cDMP probe lists. Figure S1A shows the result for a mixture of tDMPs from lung and kidney tissues (Lowe et al., 2015). Although both tissues were predicted correctly in eFORGE, the tissue-specific signal was lower, due to a lack of specific enrichment for the mixed sets in each of the cell types. Figure S1B shows the corresponding results for mixed cDMPs. Here, sets of 148 B cell-specific and 148 monocyte-specific cDMPs (Jaffe and Irizarry, 2014) were mixed, and again the corresponding cell types were correctly predicted. Taken together, we provided evidence that eFORGE can identify the correct target tissues and cell types from individual and mixed probe sets.
Application to EWAS Data
Next, we applied eFORGE to analyze published EWAS data. First, we considered all EWAS compiled in a review article (Michels et al., 2013) that analyzed at least 100 samples using Illumina Infinium BeadChips. This qualified 44 datasets for eFORGE analysis, of which 20 showed eFORGE signal (q value <0.05). 14 studies showed an enrichment pattern specific to particular tissues. For instance, we observed blood-specific patterns for six blood-based EWAS, and stem cell-specific patterns for five cancer and aging EWAS (Figure 3). The genome-wide distribution of hits from these studies is shown in Figure 4. In addition, we found a larger average sample size for studies that present eFORGE signal (average n = 527) compared to those studies that did not (average n = 191). Taken together, these results suggest that tissue-specific enrichment patterns are widespread among EWAS and that eFORGE demonstrates the capacity to detect these patterns.
Second, to provide specific examples of eFORGE analysis, we assessed the ability of eFORGE to predict disease-relevant cell types from EWAS conducted on immune blood cells for three autoimmune diseases: rheumatoid arthritis (RA) (Liu et al., 2013), systemic lupus erythematosus (SLE) (Coit et al., 2013), and Sjögren’s syndrome (Altorok et al., 2014). For these diseases, it is assumed that blood is the main affected tissue. We performed eFORGE analyses for each of these diseases using the reported top 100, 86, and 753 probes and consistently found tissue-specific enrichment for immune effector cells and thymus (Figure 5). Specifically, eFORGE results for the EWAS on RA pointed to CD14+ cells as the most highly enriched cell type (q value = 5.53e-04), indicating a tendency for cell-composition-corrected RA-associated DNAm changes to co-locate with CD14+ DHSs. Indeed, the accelerated maturation of CD14+ cells is a hallmark in RA (Hirohata et al., 1996). For the EWAS on SLE, we observed confirmatory enrichment in DHSs specific to T cells (q value = 2.56e-05). T cells, in particular CD4+ T cells, play an essential role in the development of SLE (Yin et al., 2015) and have previously been proposed as targets for the treatment of Sjögren’s syndrome (Singh and Cohen, 2012). Consistent with these findings, eFORGE also pointed to an independent T cell signal for DMPs identified in the Sjögren’s syndrome EWAS (q value = 1.31e-49).
Third, we assessed whether eFORGE can uncover patterns in published EWAS data that would inform the functional interpretation of the statistical findings. Using the top 1,000 hypomethylated regions for an EWAS on multiple sclerosis (MS) (Huynh et al., 2014), we generated eFORGE plots for several DHS reference sets (Figure 6A). We observed a tissue-specific enrichment in immune cells, which is unexpected for a study performed on pathology-free brain tissue. To support this observation, we carried out additional eFORGE analysis using histone marks (Kundaje et al., 2015) that showed an enhancer-specific signature (H3K4me1) underlying this DHS enrichment (Figure 6B). We then intersected the top 1,235 hypomethylated regions from the study that gave rise to the observed immune signal with the locations of active enhancers (n = 1,158) previously identified in microglial cells (Lavin et al., 2014). These immune cells constitute up to 15% of all cells in the mammalian CNS (Xavier et al., 2014). A Fisher’s exact test confirmed significant co-localization of the microglial-specific active enhancers (p value: 2.70e-07, odds ratio [OR]: 5.88, 95% confidence interval [CI]: 3.19–9.96), suggesting that microglial enhancers may be potential drivers of the MS EWAS signal. In conclusion, eFORGE analysis of published MS EWAS data uncovered tissue-specific patterns, suggesting potential molecular mechanisms relevant to the etiology of the disease.
Fourth, we examined whether eFORGE can be used for the interpretation of EWAS that use surrogate tissues. In these studies, DNAm changes are measured in easily accessible tissues such as whole blood or buccal cells rather than the target tissue that is most relevant to the disease. It has been suggested that DNAm changes in surrogate tissues reflect epigenomic perturbations found in the target tissue (Lowe and Rakyan, 2014). An alternative possibility is that the observed epigenetic signature does not mimic methylation changes in the target tissue but is specific to the surrogate tissue. We performed eFORGE analysis on the top 110 regions from an EWAS on ovarian cancer, which was performed on whole blood using a pre-treatment discovery cohort. We found enrichment for CD14+ cells (q value = 1.37e-12, Figure 6C), but not for ovary (q value = 1) or solid cancer tissues (q value = 1). In addition, the observed myeloid/lymphoid enrichment patterns suggested cell-composition effects, as only myeloid regions showed enrichment, raising the possibility that these DNAm differences were caused by an increase in myeloid cell numbers in one of the groups. While a similar immune signature had been identified before, in addition to cell-composition effects (Teschendorff et al., 2009, Houseman et al., 2012, Li et al., 2014), we used an alternative approach to support this finding. Specifically, we were able to exclude ovarian signal and provide evidence for which myeloid cell types may drive the EWAS signal. For example, DHS enrichment was not observed for megakaryocytes (Figure 6C). In summary, eFORGE analysis of the top hits from an EWAS on ovarian cancer pointed to cell-composition effects driven by a myeloid-specific immune response to this cancer type, rather than an epigenetic change in peripheral blood that mimics a change in the methylome of the ovarian cancer tissue.
Finally, we showed that eFORGE detects tissue-specific patterns in cancer EWAS data. We analyzed five cancer EWAS: breast cancer (Fang et al., 2011), colorectal cancer (Kibriya et al., 2011), sporadic colorectal cancer (Laczmanska et al., 2013), clear cell renal cell carcinoma (Arai et al., 2012), and adrenocortical carcinoma (Barreau et al., 2013). Using the top 330, 450, 240, 801, and 362 EWAS hits, respectively, we performed eFORGE analysis for each DMP set, and identified enrichment in stem cells but not in breast, intestine, or renal tissues across the five studies (Figure 7). This suggests that many regulatory elements affected by cancer epigenetic reprogramming may be stem cell like. This is consistent with previous findings that DNAm changes in cancer tissue aid the emergence of a possible stem cell phenotype (Widschwendter et al., 2007). In conclusion, application of eFORGE to cancer EWAS data provided evidence for a stem cell-like enrichment across all five studies, which warrants further investigation.
Discussion
We have developed eFORGE, a tool that highlights DMPs identified through EWAS that are likely to be functional in a cell type- and tissue-specific context. Its development represents an addition to the currently limited toolbox available for comprehensive analysis and interpretation of EWAS data. Both the standalone and web-based versions of eFORGE have been subjected to rigorous performance assessment with regards to false-positive rates, reproducibility, and speed to ensure ability to cope with high user demand.
SNPs identified through GWAS have been systematically probed for enrichment at regulatory elements marked by DHSs (Maurano et al., 2012). This analytical approach has recently been implemented in the FORGE tool, which enables an automated analysis workflow (Dunham et al., 2015). Complementing FORGE for epigenetic analysis, we have developed eFORGE to provide large-scale, tissue-specific DHS enrichment analysis for DMPs identified through EWAS. While using parts of the FORGE framework (Dunham et al., 2015), eFORGE uses a different, EWAS-specific background and contains several features not included in FORGE, such as a faster scaling of the binomial test and histone marks as a dataset additional to DHSs.
The main applications of eFORGE are the analysis of EWAS data to predict disease-relevant cell types and potential cell-composition effects, as well as quality-control analysis for studies on tissue-specific DNA methylation. We have provided evidence that eFORGE analysis of tDMPs correctly predicts the relevant tissue through tissue-specific DHS enrichment patterns (Figure 2). In this way, sets of tDMPs can be linked to the corresponding tissue in an independent manner. This link can also be used in the inverse sense to detect regions with a potential tissue-specific regulatory function, by using algorithms to detect probe sets with a high tissue-specific eFORGE score. eFORGE is also designed to aid downstream functional follow-up of DMPs by providing tissue-specific DHS overlap results in the form of tables, with data for specific genomic regions. DHSs are markers of cis-regulatory elements (Dorschner et al., 2004), and DNAm changes in these regions may be associated with functional consequences (Schübeler, 2015). Such functional links could be confirmed through experimental assays, including chromatin conformation capture techniques and epigenome editing using CRISPR/Cas9 (Köeferle et al., 2015).
eFORGE can also be used to assess cell-composition effects in EWAS, providing a complementary approach to methylation-based tools. When heterogeneous tissues are analyzed in EWAS, a proportion of the observed differential DNAm signal can be due to cell-composition effects (Houseman et al., 2012, Houseman et al., 2016, Paul and Beck, 2014). eFORGE can identify these cell-composition effects by detecting tissue-specific DHS enrichment based on genomic location, not DNAm values. Unlike alternative methods for detecting cell-composition effects (Zou et al., 2014, Houseman et al., 2012, Houseman et al., 2016), eFORGE provides the user-friendly option of web-based analysis.
eFORGE analysis is, however, not without limitations. Despite the fact that eFORGE can detect cell-composition effects, the user must interpret whether a given enrichment is driven by confounding cell-composition effects or by true cell type-specific effects. Based on the data presented here, strong bias toward one cell type for an EWAS performed on heterogeneous tissue is more likely an indication of cell-composition than cell type-specific effects. The need for a more accurate interpretation requires complementary methods for cell-composition detection and correction to be used and also highlights the requirement of experimental validation. In addition, eFORGE can only analyze cell types for which DHS data are available. Although the current eFORGE database contains 454 samples, many cell types are still not represented. To consider these, we have to recur to alternative analyses, as shown in our analysis of microglial enhancer enrichment outside of eFORGE.
In conclusion, we anticipate eFORGE to contribute to the challenging task ahead of translating the increasing number of DMPs identified through EWAS into relevant molecular mechanisms.
Experimental Procedures
DNase I Data
DNase I data from the ENCODE, Roadmap Epigenomics, and BLUEPRINT projects were downloaded and, if necessary, processed using the Hotspot method (see Supplemental Experimental Procedures). Fastq and BAM files for BLUEPRINT samples (listed in Table S6) are available at the European Genome-phenome Archive under accession number EGAD00001002713.
EWAS DMP Data
A list of 44 450k- and 27k-based EWAS was analyzed with eFORGE. The list of 20 studies with eFORGE signal is contained in Table S7. Studies were selected from Table S1 of a review on EWAS (Michels et al., 2013), taking only studies with a number of samples equal to or above 100. eFORGE analysis of DMP sets was performed for DNase I hotspots from ENCODE, BLUEPRINT, and Roadmap Epigenome projects, with background DMP sets from the Illumina 450k or 27k array where appropriate. For the 27k array background, only probes shared with the 450k array were used. Notably, 13 of the EWAS selected from the aforementioned review did not report top probes. This lack of reporting top probe IDs is an important and not previously reported finding, constituting a major limitation for EWAS reproducibility, in addition to hindering eFORGE analysis of published studies. We urge the community to embrace the open and clear reporting of EWAS results, including top study hits.
Preparation of eFORGE Overlaps
An SQLite database (http://www.sqlite.org) containing the overlaps for the Infinium Illumina 450k array (Bibikova et al., 2011) cg probes with the BLUEPRINT, ENCODE, and Roadmap Epigenome DNase I hotspots was incorporated into the eFORGE tool. The HumanMethylation450 v.1.2 Manifest File with data from all the cg probes on the 450k array was used to prepare this database (https://support.illumina.com/array/array_kits/infinium_humanmethylation450_beadchip_kit/downloads.html).
We compared the 450k array cg probe data to the indexed DNase I hotspots using the bedtools tool (Quinlan and Hall, 2010). The overlaps for each cg probe were stored in an SQLite DB, organized by datasets. For each cg probe, a binary string was generated, representing the presence or absence of an overlap with a hotspot in each dataset (either Roadmap, BLUEPRINT, or ENCODE).
Background Probe Parameters
eFORGE evaluated the input DMP set by comparing it to 1,000 background DMP sets. The probes in the background sets were matched to the probes in the input set using annotation (i.e., if one probe from the input set was in a promoter, then its matching probe in each of the 1,000 background sets was also in a promoter). The annotation categories used for this matching process were “Gene annotation” (i.e., 1stExon, 3′ UTR, 5′ UTR, Body, IGR, TSS1500, TSS200) and CpG Island annotation (i.e., Island, Shore_Shelf, and NA or “open sea”). All possible combinations of these two levels resulted in 21 different background bin categories. The CpG island annotation categories “Shelf” and “Shore” were merged to produce enough probes in each background bin category; i.e., more than 1,000. The 450k array Illumina annotation (Bibikova et al., 2011) was used to create these data categories (Tables S1 and S2).
eFORGE Analysis
Input of DMPs into eFORGE can be in any of two forms: as Illumina 450k/27k probe IDs or as BED format (BED format should be zero based and the chromosome should be given as chrN, as genomic location on human genome assembly GRCh37). Genome coordinates are sufficient to identify probe IDs if these are not provided in BED format. We suggest a minimum of 20 and a maximum of 1,000 probes. If a DMP is not present on the 450k array (or the 27k array probes shared with the 450k array), it is excluded from the analysis. We added a 1-kb proximity filter in order to avoid the biases of testing groups of proximal probes in eFORGE: methylation correlation among closely located CpGs could mean we would be testing the same change more than once. Probes from input are selected at random by the filter, and any probe within 1 kb of any already selected probe is excluded. The choice of selecting 1 kb as a limit for filtering was based on previous data showing strong correlation of DNA methylation levels between CpGs fewer than 1 kb apart (Eckhardt et al., 2006).
Overlaps are retrieved from the eFORGE database for each analyzable probe in the input set. The tool records a count of total hotspot overlaps for each DNase I sample (cell) for the test probe set. eFORGE selects 1,000 matching background probe sets that contain an equal number of probes to the test probe set, matching for gene annotation and CpG island annotation as described above. Retrieval of overlaps from the database for each of the probes in each of the background probe sets then occurs. The tool records an overlap count for each background set in each DNase I sample. For each test probe set, eFORGE obtains the binomial p value for the test set overlap count. This binomial p value is calculated for the test set overlap count relative to the total number of tested probe sets. The binomial test was chosen over the hypergeometric test due to the important computational speed advantages it offers, which are further highlighted considering the high number of tests performed by eFORGE.
eFORGE Outputs
Tabular and graphic descriptions of the enrichment of overlap for the test DMPs are generated by eFORGE for each DNase I hotspot sample. The tool outputs a tab-separated values (TSVs) file, which includes columns for Z score, binomial p value, cell, tissue, datatype, filename of the sample hotspots, DMPs that contribute to the enrichment, the GEO accession for each sample, and the BY adjusted q value. An interactive table containing these data is generated using the Datatables (https://datatables.net/) plug-in for the jQuery JavaScript library accessed through the rCharts package (http://ramnathv.github.io/rCharts/).
In terms of graphical visualization of eFORGE output, FORGE scripts (Dunham et al., 2015) were adapted with minor modifications. Briefly, the –log 10 (binomial p value) is presented by cell sample in each of the graphic outputs. Base R graphics (http://www.r-project.org) are used to generate a pdf graphic. eFORGE generates the interactive JavaScript graphic using the rCharts package (http://ramnathv.github.io/rCharts/) to interface with the dimple d3 libraries (http://dimplejs.org). In the pdf and the interactive graphic cells are grouped alphabetically within each tissue (tissues, in turn, also follow alphabetical order). The coloring of results by BY-corrected q value in each of the graphics is consistent, blue (q value >0.05), pink (q value <0.05), and red (q value <0.01).
eFORGE Reproducibility
A default probe set of 11 monocyte-specific DMPs was run 1,000 times on eFORGE, and we plotted a resulting longitudinal set of CD14+ category q values for these runs (Figure 1B).
False-Positive Testing
We tested eFORGE BY correction with sets of 5, 10, 15, 20, 30, 40, 50, and 100 randomly selected probes, performing 5,000 analyses for both Roadmap Epigenomics and ENCODE data. We did not obtain any false-positives (at a q value <0.01) for any of these analyses. We did, however, obtain false-positives at a q value <0.05 (0.36%), but this is only indicated to be an intermediate level of significance in eFORGE. The false-positive rates at a q value <0.05 do not show a clear tendency with the size of the probe sets (see Tables S3 and S4). Most of the q values below 0.05 come from a few probe sets, signaling that most FP come from random sets that are borderline for many tissues/cell lines. This supports the notion that it is highly improbable to obtain eFORGE tissue-specific enrichment with random probe lists.
Comparison of BH and BY
8,000 sets ranging between five and 100 DMPs were run on eFORGE using both BH and BY multiple testing correction methods. As 299 samples were analyzed for each of the 8,000 DMP sets, there is a total number of 2,392,000 sample tests. Out of these random tests, one false-positive was recorded for BH, and zero were recorded for BY, at a significance level of 0.01.
Addition of New Data to eFORGE Database
All the code for generating the eFORGE database from scratch is openly available on GitHub (see https://github.com/charlesbreeze/eFORGE/blob/master/database/README.txt and examples in the same directory). Data from a new project can be added as a table to the modular SQLite database. In addition, indications are provided for the production of bitstring tables from raw data.
GeEC Analysis
Datasets used for GeEC correlation analysis were taken from the IHEC Data Portal (http://epigenomesportal.ca/ihec/) as of March 2016. They correspond to 508 DNase sequencing (DNase-seq) datasets generated by ENCODE, Roadmap, and BLUEPRINT, as well as 1,277 chromatin immunoprecipitation sequencing (ChIP-seq) datasets for five core histone marks (H3K4me1, H3K4me3, H3K36me3, H3K27me3, and H3K9me3) generated by ENCODE and Roadmap. These 1,785 datasets were processed with the Genomic Efficient Correlator (GeEC) tool to first average the signal of each dataset in non-overlapping bins of 10 kb (excluding the ENCODE blacklisted regions) and then to calculate a matrix of the pairwise Pearson product-moment correlation coefficients (r). Hierarchical clustering (using the 1-r distance metric and the average linkage method) was then conducted, and the principal clusters were annotated.
Adjusted Rand Index (ARI) results were also used to validate the accuracy of the clustering by using the IHEC-provided metadata describing every dataset for three categories of labels (assayType, cellType, consortium). The tree generated through hierarchical clustering is cut at the appropriate height to obtain a number of clusters equal to the number of distinct labels in each category and then used to calculate the ARI scores. ARI values were calculated using the “average” linkage method.
tDMP Analysis
We extracted normalized samples from Marmal-aid (Lowe and Rakyan, 2013) for tissues that contained a minimum of 50 samples. We used the dmpFinder function in minfi (Aryee et al., 2014) to make an initial call of tissue-specific methylation differences. Two categories were defined to perform this for each tissue: group 1 contained the tissue of interest, and other tissues were included in group 2. We used the top two probes for each of the tissue-specific calls to visually inspect the data. Any samples that were found to be closer to the mean of group 2 than group 1 were removed. The final calls were then produced using the remaining samples. We randomly selected 50 samples for each tissue, and we then followed a similar procedure to the above, calling differences using dmpFinder. Data S2 contains the IDs of the samples used in the analysis. Data S3, S4, and S5 contain the calls for three of the tissues for all probes.
cDMP Analysis
We obtained data from Table S2 of a cell type-specific methylation paper by Jaffe and Irizarry (2014). cDMP selection was based on a given probe having the lowest methylation value for a given cell type when compared to the other cell types, and this low methylation value being lower than the next closest methylation value by 0.4.
Source Code
The source code for eFORGE is available on GitHub at https://github.com/charlesbreeze/eForge. It includes the code for the standalone tool, the webserver, and the scripts to build or extend the database, although a pre-compiled eForge.db sqlite database and background selection hash tables are readily available at http://eforge.cs.ucl.ac.uk/?download. Code variants (e.g., probe weighting) are discussed in the Supplemental Experimental Procedures. eFORGE has been successfully installed and run on Red Hat Linux and OS X 10.9.5.
Author Contributions
C.E.B., D.S.P., I.D., E.B., J.H., and S.B. designed the project. J.v.D., L.M.B., R.L., V.K.R., M.F., K.D., W.H.O., E.V., S.S., F.M., A.K.B., R.S., J.L., P.-E.J., GB, J.H.A.M., and H.G.S. provided data. C.E.B., I.D., and J.H. designed and developed the software package with support from A.E.T., J.C.A., J.E.B., and V.I. C.E.B., S.B., and D.S.P. wrote the paper with contributions from all authors.
Acknowledgments
C.E.B. was supported by a PhD fellowship from the EU-FP7 project EpiTrain (316758). J.H. was supported by the UCL Cancer Institute Research Trust. V.K.R. was supported by BLUEPRINT (282510). K.D. was funded as a HSST trainee by NHS Health Education England. M.F. was supported by the BHF Cambridge Centre of Excellence (RE/13/6/30180). Research in W.H.O.’s laboratory was supported by EU-FP7 project BLUEPRINT (282510) and by program grants from the National Institute for Health Research (NIHR, http://www.nihr.ac.uk) and the British Heart Foundation under numbers RP-PG-0310-1002 and RG/09/12/28096 (https://www.bhf.org.uk/). W.H.O.’s laboratory receives funding from NHS Blood and Transplant for facilities. We gratefully acknowledge the participation of all NIHR Cambridge BioResource volunteers. We thank the Cambridge BioResource staff for their help with volunteer recruitment. We thank members of the Cambridge BioResource SAB and Management Committee for their support of our study and the National Institute for Health Research Cambridge Biomedical Research Centre for funding. R.S. and his group were supported by the European Union in the framework of the BLUEPRINT Project (HEALTH-F5-2011-282510) and the German Ministry of Science and Education (BMBF) in the framework of the MMML-MYC-SYS project (036166B). We thank Deborah Winter (Weizmann Institute) for supplying a set of microglial enhancers from Lavin et al. (2014). Research in S.B.’s laboratory was supported by the Wellcome Trust (99148), Royal Society Wolfson Research Merit Award (WM100023), and EU-FP7 projects EpiTrain (316758), EPIGENESYS (257082), and BLUEPRINT (282510).
Published: November 15, 2016
Footnotes
Supplemental Information includes Supplemental Experimental Procedures, seven figures, seven tables, and five data files and can be found with this article online at http://dx.doi.org/10.1016/j.celrep.2016.10.059.
Contributor Information
Charles E. Breeze, Email: c.breeze@ucl.ac.uk.
Stephan Beck, Email: s.beck@ucl.ac.uk.
Accession Numbers
The accession number for the BLUEPRINT data reported in this paper is EGA: EGAD00001002713.
Supplemental Information
References
- Adams D., Altucci L., Antonarakis S.E., Ballesteros J., Beck S., Bird A., Bock C., Boehm B., Campo E., Caricasole A. BLUEPRINT to decode the epigenetic signature written in blood. Nat. Biotechnol. 2012;30:224–226. doi: 10.1038/nbt.2153. [DOI] [PubMed] [Google Scholar]
- Altorok N., Coit P., Hughes T., Koelsch K.A., Stone D.U., Rasmussen A., Radfar L., Scofield R.H., Sivils K.L., Farris A.D., Sawalha A.H. Genome-wide DNA methylation patterns in naive CD4+ T cells from patients with primary Sjögren’s syndrome. Arthritis Rheumatol. 2014;66:731–739. doi: 10.1002/art.38264. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Arai E., Chiku S., Mori T., Gotoh M., Nakagawa T., Fujimoto H., Kanai Y. Single-CpG-resolution methylome analysis identifies clinicopathologically aggressive CpG island methylator phenotype clear cell renal cell carcinomas. Carcinogenesis. 2012;33:1487–1493. doi: 10.1093/carcin/bgs177. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Aryee M.J., Jaffe A.E., Corrada-Bravo H., Ladd-Acosta C., Feinberg A.P., Hansen K.D., Irizarry R.A. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363–1369. doi: 10.1093/bioinformatics/btu049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Barreau O., Assié G., Wilmot-Roussel H., Ragazzon B., Baudry C., Perlemoine K., René-Corail F., Bertagna X., Dousset B., Hamzaoui N. Identification of a CpG island methylator phenotype in adrenocortical carcinomas. J. Clin. Endocrinol. Metab. 2013;98:E174–E184. doi: 10.1210/jc.2012-2993. [DOI] [PubMed] [Google Scholar]
- Benjamini Y., Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 1995;57:289–300. [Google Scholar]
- Benjamini Y., Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 2001;29:1165–1188. [Google Scholar]
- Bibikova M., Barnes B., Tsan C., Ho V., Klotzle B., Le J.M., Delano D., Zhang L., Schroth G.P., Gunderson K.L. High density DNA methylation array with single CpG site resolution. Genomics. 2011;98:288–295. doi: 10.1016/j.ygeno.2011.07.007. [DOI] [PubMed] [Google Scholar]
- Birney E., Smith G.D., Greally J.M. Epigenome-wide Association Studies and the Interpretation of Disease -Omics. PLoS Genet. 2016;12:e1006105. doi: 10.1371/journal.pgen.1006105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Coit P., Jeffries M., Altorok N., Dozmorov M.G., Koelsch K.A., Wren J.D., Merrill J.T., McCune W.J., Sawalha A.H. Genome-wide DNA methylation study suggests epigenetic accessibility and transcriptional poising of interferon-regulated genes in naïve CD4+ T cells from lupus patients. J. Autoimmun. 2013;43:78–84. doi: 10.1016/j.jaut.2013.04.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dorschner M.O., Hawrylycz M., Humbert R., Wallace J.C., Shafer A., Kawamoto J., Mack J., Hall R., Goldy J., Sabo P.J. High-throughput localization of functional elements by quantitative chromatin profiling. Nat. Methods. 2004;1:219–225. doi: 10.1038/nmeth721. [DOI] [PubMed] [Google Scholar]
- Dunham I., Kulesha E., Iotchkova V., Morganella S., Birney E. FORGE: A tool to discover cell specific enrichments of GWAS associated SNPs in regulatory regions. F1000Res. 2015 [Google Scholar]
- Eckhardt F., Lewin J., Cortese R., Rakyan V.K., Attwood J., Burger M., Burton J., Cox T.V., Davies R., Down T.A. DNA methylation profiling of human chromosomes 6, 20 and 22. Nat. Genet. 2006;38:1378–1385. doi: 10.1038/ng1909. [DOI] [PMC free article] [PubMed] [Google Scholar]
- ENCODE Project Consortium An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fang F., Turcan S., Rimner A., Kaufman A., Giri D., Morris L.G.T., Shen R., Seshan V., Mo Q., Heguy A. Breast cancer methylomes establish an epigenomic foundation for metastasis. Sci. Transl. Med. 2011;3:75ra25. doi: 10.1126/scitranslmed.3001875. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Farh K.K.-H., Marson A., Zhu J., Kleinewietfeld M., Housley W.J., Beik S., Shoresh N., Whitton H., Ryan R.J.H., Shishkin A.A. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature. 2015;518:337–343. doi: 10.1038/nature13835. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hirohata S., Yanagida T., Itoh K., Nakamura H., Yoshino S., Tomita T., Ochi T. Accelerated generation of CD14+ monocyte-lineage cells from the bone marrow of rheumatoid arthritis patients. Arthritis Rheum. 1996;39:836–843. doi: 10.1002/art.1780390517. [DOI] [PubMed] [Google Scholar]
- Houseman E.A., Accomando W.P., Koestler D.C., Christensen B.C., Marsit C.J., Nelson H.H., Wiencke J.K., Kelsey K.T. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13:86. doi: 10.1186/1471-2105-13-86. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Houseman E.A., Kile M.L., Christiani D.C., Ince T.A., Kelsey K.T., Marsit C.J. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC Bioinformatics. 2016;17:259. doi: 10.1186/s12859-016-1140-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huynh J.L., Garg P., Thin T.H., Yoo S., Dutta R., Trapp B.D., Haroutunian V., Zhu J., Donovan M.J., Sharp A.J., Casaccia P. Epigenome-wide differences in pathology-free regions of multiple sclerosis-affected brains. Nat. Neurosci. 2014;17:121–130. doi: 10.1038/nn.3588. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jaffe A.E., Irizarry R.A. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014;15:R31. doi: 10.1186/gb-2014-15-2-r31. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kibriya M.G., Raza M., Jasmine F., Roy S., Paul-Brutus R., Rahaman R., Dodsworth C., Rakibuz-Zaman M., Kamal M., Ahsan H. A genome-wide DNA methylation study in colorectal carcinoma. BMC Med. Genomics. 2011;4:50. doi: 10.1186/1755-8794-4-50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Köeferle A., Stricker S.H., Beck S. Brave new epigenomes: the dawn of epigenetic engineering. Genome Med. 2015;7:59. doi: 10.1186/s13073-015-0185-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kundaje A., Meuleman W., Ernst J., Bilenky M., Yen A., Heravi-Moussavi A., Kheradpour P., Zhang Z., Wang J., Ziller M.J., Roadmap Epigenomics Consortium Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Laczmanska I., Karpinski P., Bebenek M., Sedziak T., Ramsey D., Szmida E., Sasiadek M.M. Protein tyrosine phosphatase receptor-like genes are frequently hypermethylated in sporadic colorectal cancer. J. Hum. Genet. 2013;58:11–15. doi: 10.1038/jhg.2012.119. [DOI] [PubMed] [Google Scholar]
- Lavin Y., Winter D., Blecher-Gonen R., David E., Keren-Shaul H., Merad M., Jung S., Amit I. Tissue-resident macrophage enhancer landscapes are shaped by the local microenvironment. Cell. 2014;159:1312–1326. doi: 10.1016/j.cell.2014.11.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., Zheng T., Chen B., Hong G., Zhang W., Shi T., Li S., Ao L., Wang C., Guo Z. Similar blood-borne DNA methylation alterations in cancer and inflammatory diseases determined by subpopulation shifts in peripheral leukocytes. Br. J. Cancer. 2014;111:525–531. doi: 10.1038/bjc.2014.347. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liang L., Cookson W.O.C. Grasping nettles: cellular heterogeneity and other confounders in epigenome-wide association studies. Hum. Mol. Genet. 2014;23(R1):R83–R88. doi: 10.1093/hmg/ddu284. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu Y., Aryee M.J., Padyukov L., Fallin M.D., Hesselberg E., Runarsson A., Reinius L., Acevedo N., Taub M., Ronninger M. Epigenome-wide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis. Nat. Biotechnol. 2013;31:142–147. doi: 10.1038/nbt.2487. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lowe R., Rakyan V.K. Marmal-aid--a database for Infinium HumanMethylation450. BMC Bioinformatics. 2013;14:359. doi: 10.1186/1471-2105-14-359. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lowe R., Rakyan V.K. Correcting for cell-type composition bias in epigenome-wide association studies. Genome Med. 2014;6:23. doi: 10.1186/gm540. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lowe R., Slodkowicz G., Goldman N., Rakyan V.K. The human blood DNA methylome displays a highly distinctive profile compared with other somatic tissues. Epigenetics. 2015;10:274–281. doi: 10.1080/15592294.2014.1003744. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Maurano M.T., Humbert R., Rynes E., Thurman R.E., Haugen E., Wang H., Reynolds A.P., Sandstrom R., Qu H., Brody J. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337:1190–1195. doi: 10.1126/science.1222794. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McGregor K., Bernatsky S., Colmegna I., Hudson M., Pastinen T., Labbe A., Greenwood C.M.T. An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies. Genome Biol. 2016;17:84. doi: 10.1186/s13059-016-0935-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Michels K.B., Binder A.M., Dedeurwaerder S., Epstein C.B., Greally J.M., Gut I., Houseman E.A., Izzi B., Kelsey K.T., Meissner A. Recommendations for the design and analysis of epigenome-wide association studies. Nat. Methods. 2013;10:949–955. doi: 10.1038/nmeth.2632. [DOI] [PubMed] [Google Scholar]
- Paul D.S., Beck S. Advances in epigenome-wide association studies for common diseases. Trends Mol. Med. 2014;20:541–543. doi: 10.1016/j.molmed.2014.07.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Paul D.S., Soranzo N., Beck S. Functional interpretation of non-coding sequence variation: concepts and challenges. BioEssays. 2014;36:191–199. doi: 10.1002/bies.201300126. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Philibert R., Hollenbeck N., Andersen E., Osborn T., Gerrard M., Gibbons F.X., Wang K. A quantitative epigenetic approach for the assessment of cigarette consumption. Front. Psychol. 2015;6:656. doi: 10.3389/fpsyg.2015.00656. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rakyan V.K., Down T.A., Balding D.J., Beck S. Epigenome-wide association studies for common human diseases. Nat. Rev. Genet. 2011;12:529–541. doi: 10.1038/nrg3000. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schübeler D. Function and information content of DNA methylation. Nature. 2015;517:321–326. doi: 10.1038/nature14192. [DOI] [PubMed] [Google Scholar]
- Singh N., Cohen P.L. The T cell in Sjogren’s syndrome: force majeure, not spectateur. J. Autoimmun. 2012;39:229–233. doi: 10.1016/j.jaut.2012.05.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Slieker R.C., Bos S.D., Goeman J.J., Bovée J.V., Talens R.P., van der Breggen R., Suchiman H.E.D., Lameijer E.-W., Putter H., van den Akker E.B. Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array. Epigenetics Chromatin. 2013;6:26. doi: 10.1186/1756-8935-6-26. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stadler M.B., Murr R., Burger L., Ivanek R., Lienert F., Schöler A., van Nimwegen E., Wirbelauer C., Oakeley E.J., Gaidatzis D. DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature. 2011;480:490–495. doi: 10.1038/nature10716. [DOI] [PubMed] [Google Scholar]
- Stergachis A.B., Neph S., Reynolds A., Humbert R., Miller B., Paige S.L., Vernot B., Cheng J.B., Thurman R.E., Sandstrom R. Developmental fate and cellular maturity encoded in human regulatory DNA landscapes. Cell. 2013;154:888–903. doi: 10.1016/j.cell.2013.07.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Teschendorff A.E., Menon U., Gentry-Maharaj A., Ramus S.J., Gayther S.A., Apostolidou S., Jones A., Lechner M., Beck S., Jacobs I.J., Widschwendter M. An epigenetic signature in peripheral blood predicts active ovarian cancer. PLoS ONE. 2009;4:e8274. doi: 10.1371/journal.pone.0008274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Welter D., MacArthur J., Morales J., Burdett T., Hall P., Junkins H., Klemm A., Flicek P., Manolio T., Hindorff L., Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–D1006. doi: 10.1093/nar/gkt1229. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Widschwendter M., Fiegl H., Egle D., Mueller-Holzner E., Spizzo G., Marth C., Weisenberger D.J., Campan M., Young J., Jacobs I., Laird P.W. Epigenetic stem cell signature in cancer. Nat. Genet. 2007;39:157–158. doi: 10.1038/ng1941. [DOI] [PubMed] [Google Scholar]
- Wilhelm-Benartzi C.S., Koestler D.C., Karagas M.R., Flanagan J.M., Christensen B.C., Kelsey K.T., Marsit C.J., Houseman E.A., Brown R. Review of processing and analysis methods for DNA methylation array data. Br. J. Cancer. 2013;109:1394–1402. doi: 10.1038/bjc.2013.496. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xavier A.L., Menezes J.R.L., Goldman S.A., Nedergaard M. Fine-tuning the central nervous system: microglial modelling of cells and synapses. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2014;369:20130593. doi: 10.1098/rstb.2013.0593. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yin Y., Choi S.-C., Xu Z., Perry D.J., Seay H., Croker B.P., Sobel E.S., Brusko T.M., Morel L. Normalization of CD4+ T cell metabolism reverses lupus. Sci. Transl. Med. 2015;7:274ra18. doi: 10.1126/scitranslmed.aaa0835. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zou J., Lippert C., Heckerman D., Aryee M., Listgarten J. Epigenome-wide association studies without the need for cell-type composition. Nat. Methods. 2014;11:309–311. doi: 10.1038/nmeth.2815. [DOI] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.