Skip to main content
Briefings in Bioinformatics logoLink to Briefings in Bioinformatics
. 2023 Jun 27;24(4):bbad241. doi: 10.1093/bib/bbad241

E-value: a superior alternative to P-value and its adjustments in DNA methylation studies

Yifan Yang 1,2,#, Haoyuan Liu 3,#, Yi Liu 4,#, Liyuan Zhou 5, Xiaoqi Zheng 6, Rongxian Yue 7, David L Mattson 8, Srividya Kidambi 9, Mingyu Liang 10, Pengyuan Liu 11,12,, Xiaoqing Pan 13,
PMCID: PMC10359086  PMID: 37369639

Abstract

DNA methylation plays a crucial role in transcriptional regulation. Reduced representation bisulfite sequencing (RRBS) is a technique of increasing use for analyzing genome-wide methylation profiles. Many computational tools such as Metilene, MethylKit, BiSeq and DMRfinder have been developed to use RRBS data for the detection of the differentially methylated regions (DMRs) potentially involved in epigenetic regulations of gene expression. For DMR detection tools, as for countless other medical applications, P-values and their adjustments are among the most standard reporting statistics used to assess the statistical significance of biological findings. However, P-values are coming under increasing criticism relating to their questionable accuracy and relatively high levels of false positive or negative indications. Here, we propose a method to calculate E-values, as likelihood ratios falling into the null hypothesis over the entire parameter space, for DMR detection in RRBS data. We also provide the R package ‘metevalue’ as a user-friendly interface to implement E-value calculations into various DMR detection tools. To evaluate the performance of E-values, we generated various RRBS benchmarking datasets using our simulator ‘RRBSsim’ with eight samples in each experimental group. Our comprehensive benchmarking analyses showed that using E-values not only significantly improved accuracy, area under ROC curve and power, over that of P-values or adjusted P-values, but also reduced false discovery rates and type I errors. In applications using real RRBS data of CRL rats and a clinical trial on low-salt diet, the use of E-values detected biologically more relevant DMRs and also improved the negative association between DNA methylation and gene expression.

Keywords: E-value, P-value, Multiple testing correction, DNA methylation

INTRODUCTION

The P-value is a probability that measures the observed result of the statistical test falls into the acceptance or rejection category under a null hypothesis. In doing so, it provides a judgment level on statistical significance, the smaller the P-value the stronger the evidence against the null hypothesis. In biomedical research, a P-value of less than 0.05 is usually considered statistically significant. In high-dimension omics studies, P-values are usually corrected through P-value adjustment methods to control ‘false positive’ findings [1–5]. Up until now, P-values and adjusted P-values have become one of the most standard reporting statistics occurring in published medical literature [6–11].

However, debates over the use of P-values and their adjustments have continued over the past few decades. This has been largely due to examples and cases where the misuse and/or misinterpretation of the results of statistical tests and P-values has led to negative outcomes or potentially questionable over-confidence in results [6, 12–18]. In 2016, the American Statistical Association (ASA) issued a P-value guideline to attempt to supervise the proper use and interpretation of P-values [19]. Despite this, criticisms of P-values have continued to be widely reported, as highlighted in the 2019 Nature paper ‘Retire statistical significance’, and endorsed by more than 800 scholars worldwide [20]. This clearly indicates that the P-value controversy is far from resolved. In another 2019 example, the ASA published a further special issue of 43 papers on ‘Statistical Inference in the 21st Century: A World Beyond P < 0.05’, again suggesting that the scientific community needs to move beyond its addiction to the P-values [21, 22].

To overcome the limitation of P-values and adjusted P-values, a theoretical framework for E-values has been recently proposed to address the multiple hypothesis testing problem [23]. E-values can be directly interpreted as ‘betting scores’ or ‘likelihood ratios’, or can be interpreted as ‘how likely is the rejection to be true?’. Some evidence is being put forward that E-values have advantages for multiple testing corrections under certain theoretical settings [23]. However, the application of E-values to real-world data still requires further investigation. First of all, the definition of E-values is too broad to be applied in real-world data as E-values could be derived from any non-negative function the expectation of which is no more than 1. Second, due to the fundamental limitations of using P-values [23], transforming P-values to E-values via a calibrator function is not always valid. This is partly because the computing of P-values requires the full distribution of test statistics, often requiring asymptotic analysis. Without a large sample size, computing P-values by asymptotic analysis is far from robust. Therefore, direct E-value calculation, rather than transforming of P-values to E-values via a calibration function, is necessary when moving toward real data applications.

DNA methylation is a well-known aspect of epigenetic modification that plays a vital role in gene regulation. Reduced representation bisulfite sequencing (RRBS) is becoming increasingly popular for its use in analyzing genome-wide methylation profiles at a single-base resolution [24–27]. One of the main goals of RRBS studies is to identify differentially methylated regions (DMRs) between different biological conditions. Such detected DMRs have revealed important mechanisms of epigenetic regulation as mediated by alterations in DMR methylation. However, up to the present, most DMR detection tools have utilized P-values and adjusted P-values in their genome-wide DNA methylation studies. As common in other fields and studies, in these DMR detection studies a DMR is generally considered statistically significant when the adjusted P-value is less than 0.05. However, little clear analysis has been conducted to provide any comparisons between, or to analyze the relative merits of, the behaviors of P-values and adjusted P-values versus E-values in the identification of DMRs using RRBS.

In this study, we have proposed a method to calculate E-values for DMR detection in RRBS data. We also provided the R package ‘metevalue’ to facilitate user-based calculation of E-values. To assess the performance of E-values in identifying DMRs, we further generated various RRBS benchmarking datasets using our simulator ‘RRBSsim’. The performance of the detection of DMRs using P-values and their adjusted P-values or E-values is compared here in terms of accuracy (ACC), area under ROC curve (AUC), false discovery rate (FDR), type I error and power. Four commonly used DMR detection tools (Metilene, MethylKit, BiSeq and DMRfinder) were then tested, revealing the comparably excellent performance of E-values when applied to RRBS data to identify DMRs. Finally, to further demonstrate their utility in real RRBS data, we applied E-values to identify DMRs by Metilene in Dahl salt-sensitive (SS/JrHsdMcwiCrl, referred to as CRL) rats [30, 31] and a clinical trial on low-salt diet [32].

METHODS

Simulated RRBS data

RRBS sequence reads on the Illumina platform were generated using our simulator, RRBSsim, with default parameters (sequencing depth of 30× and read length of 100 base pairs), as described in our recent studies [28, 29]. Briefly, the reference genome (hg19) was randomly digested into variable sized regions as Msp1 fragments. The reference sequence of these Msp1 fragments in FASTA format was then inputted to RRBSsim. Once the baseline methylation level of a genomic region had been specified, the methylation value of CpG sites within the region was fitted with a Beta distribution by modeling real RRBS data from our recent lung cancer study [30]. The real RRBS data contained 18 tumors and adjacent normal tissues of non-small cell lung cancer, which can be accessed on the Sequence Read Archive (SRA) database (SRP125064). The RRBS data generated from the RRBSsim simulator also accounted for the strong dependence upon methylation levels of adjacent CpG sites within the genomic region [29].

Since RRBS experiments are often performed with a small number of biological replicates (ranging from a few to dozen subjects), in our simulation two experimental groups (treatment versus control groups) were set, each with eight RRBS samples. Most of the CpG islands (CGIs) are located in gene promoter regions which are often differentially methylated under certain disease conditions. We randomly chose approximately 50% of these CGIs and 25% of CGI shores (with 2 kb to their neighboring CGIs) as hyper- or hypo-methylated DMRs. The predefined DMRs contained CpG sites varying from a few to a thousand, with an average of 80 CpG sites per DMR. There were different degrees of methylation differences between the two groups on these predefined DMRs (<0.2, 0.2–0.4 and >0.4). A total of 100 repeats were carried out in our study. Details in generating the RRBS simulation data were described previously [28, 29]. Simulation codes and a set of simulated RRBS data as example data were deposited in the GitHub repository (https://github.com/yfyang86/metevalue-demos/tree/main/simulation). The simulated RRBS data in FASTQ format were analyzed using our in-house pipeline as described in the next section.

Real RRBS data and mRNA-seq data in applications

The raw RRBS and mRNA-seq data of CRL rats and a clinical trial on low-salt diet came from our previous studies [31–33]. Briefly, the CRL rat experiment involved rats on 0.4% NaCl (low-salt) diet since weaning or switched to a 4% NaCl (high-salt) diet for 3 weeks. Both RRBS and mRNA-seq data were obtained from renal T lymphocytes of CRL rats in low-salt and high-salt diet groups. In each group, there were four pooled samples with three rats in each pool [31, 32].

We recently conducted a clinical trial on 2 week low-salt diet (1200 mg/day Na diet) [32]. Both RRBS and mRNA-seq data of biopsied arterioles were obtained from the same subjects before and after low-salt diet (referred to as pre-diet and post-diet, respectively). There were 10 subjects in each group with each subject having both RRBS and mRNA-seq data. The detailed research protocol for this can be found in our previous study [33]. The RRBS and mRNA-seq data (Supplementary Table S1) have been deposited in the National Genomics Data Center under the accession numbers OMIX004080 and OMIX004085 (https://ngdc.cncb.ac.cn/omix).

Analysis of mRNA-seq data

Our in-house pipeline described previously [31, 33] was used to reanalyze the mRNA-seq data from CRL rats and a clinical trial on low-salt diet. The reference genome was rn6 for rats and hg19 for humans.

Analysis of RRBS data

Our in-house pipeline [32–36] was used to analyze both the simulated RRBS data and real RRBS data from CRL rats and a clinical trial on low-salt diet. Briefly, adapters were first removed from the output reads. The low-quality sequences at both ends of the reads (quality score < 20) were then further trimmed. These trimmed reads were then mapped to the reference genome (hg19 for humans and rn6 for rats) using Bismark (v0.16.1) [37]. Finally, the methylation call at each CpG site was extracted from the sequence alignment files (BAM). At each CpG site, the methylation rate was calculated as the percentage of unconverted cytosines in each library. RRBS data from different libraries were merged based on CpG coordinates. Transcription start site (TSS) regions were defined as 1000 bp upstream and downstream of the TSS. The genomic positions of CGIs on the reference genomes (hg19 for humans and rn6 for rats) were downloaded from the UCSC Genome Browser (http://genome.ucsc.edu/). CGI shores were defined as ±2 kb flanking the islands.

DMR detection tools

Four commonly used tools, Metilene, MethylKit, BiSeq and DMRfinder, have shown excellent performance in the identification of DMRs, and these were used in the evaluations of this study [28]. Metilene identifies genomic regions dynamically to maximize the absolute methylation difference between groups. P-values of each region are then calculated using the Wilcoxon rank-sum test [38]. MethylKit clusters CpG sites on predefined regions or tiling windows, and then calculates the P-value of each region using logistic regression or Fisher’s exact test [39]. BiSeq first clusters CpG sites, prunes the rejected CpG clusters and then merges the remaining CpG clusters into DMRs [40]. DMRfinder extracts CpG methylation counts from the sequence alignment files, clusters valid CpG sites into genomic regions and applies the Wald test to calculate the P-value for each region [28]. In all analyses, these DMR detection tools were run on default settings to identify differentially methylated regions between the two experimental groups.

P-values and adjusted P-values

Each computational tool was performed separately to identify DMRs between the two experimental groups. We first filtered out any identified regions with an absolute methylation difference of less than 0.05 between the two groups. This pre-filtering procedure is routinely done in detecting DMRs from RRBS data due to the potential for small methylation changes linked to potential bisulfite conversion errors during library preparation. P-values with the remaining regions were then adjusted using the Bonferroni or Benjamini–Hochberg (BH) methods [41, 42], implemented in the R statistical package ‘stats’ (https://www.r-project.org/).

A computational approach to estimate E-values

To facilitate E-value calculation toward real data applications, we provided a computational approach as follows. Specifically, two experimental groups (treatment and control groups) were considered. Suppose that there are Inline graphic samples in the treatment group, Inline graphic samples in the control group and Inline graphic CpG sites that are located in a given DMR. The methylation rate in the given DMR was first averaged across Inline graphic CpG sites for each sample and the averaged methylation rate denoted Inline graphic. We assumed that the averaged methylation rates in treatment and control groups were normally distributed. Considering different mean and variance parameters of the averaged methylation rate between treatment and control groups, we denoted Inline graphic. Under the null hypothesis, there is no difference in the average methylation rate between treatment and control groups. Therefore, we denoted Inline graphic. The E-value was defined as a ratio of the observed likelihood falling into the null hypothesis Inline graphic to the likelihood on the entire parameter space Inline graphic

graphic file with name DmEquation10.gif (1)
  • where unknown parameters Inline graphic were estimated from RRBS data using maximum likelihood estimation. It is worth noting that E-value tends to be small if the distributions of the averaged methylation rates in the treatment and control groups are identical. Instead of computing the limiting distribution of Inline graphic to generate P-values, we directly considered Inline graphic to be statistically significant. Based on the recommended calibration function Inline graphic for transforming E-values to P-values [23], the threshold of 20 for E-value is equivalent to our usual statistical significance level of a P-value or an adjusted P-value of 0.05.

We developed an R package named ‘metevalue’ to implement the estimation of E-value to identify DMRs in RRBS studies. The package ‘metevalue’ is a user-friendly interface that allows researchers to apply various tools to identify DMRs using E-values and their adjustments in epigenetic studies. It can be accessed directly through The Comprehensive R Archive Network (CRAN) (https://cran.r-project.org). Given a data matrix containing methylation levels in each CpG site for each sample and an input file containing methylation regions, ‘metevalue’ generates an E-value for any given methylation region. Examples of calculating E-values in RRBS and RNA-seq data are also provided in ‘metevalue’. These example data sets and the source codes for ‘metevalue’ are also available in the GitHub repository (https://github.com/yfyang86/metevalue).

Criteria for evaluating the performance of various tools

In our simulation, we already knew the DMRs. As defined in our previous study [28], if a region detected by the tool covers more than 50% of CpG sites in the predefined DMR, the tool detects a true DMR. True negative (TN) or false positive (FP) referred to the situation where a predefined non-DMR was identified as a non-DMR or DMR, respectively. Similarly, a false negative (FN) or true positive (TP) referred to the situation where a predefined DMR was identified as a non-DMR or DMR, respectively. This is summarized in the confusion matrix (Supplementary Table S2). The performance of these tools was then evaluated using multiple criteria following the confusion matrix. This included ACC, AUC, FDR, type I errors and power. The performance metrics were defined as follows:

graphic file with name DmEquation11.gif
graphic file with name DmEquation12.gif
graphic file with name DmEquation13.gif
graphic file with name DmEquation14.gif

Statistical analysis

Pearson correlation analysis was performed to quantify the linear correlation between log-transformed gene expression level and DNA methylation level. Wilcoxon rank-sum tests were used to compare the differences of each performance metric between using P-value, adjusted P-value (BH or Bonferroni) and E-value. The following comparisons between two groups were considered: E-value versus P-value, E-value versus BH, and E-value versus Bonferroni.

RESULTS

A workflow of evaluating E-values in identifying DMRs

Figure 1 illustrates a comprehensive comparative workflow for detecting DMRs using E-values, P-values or adjusted P-values. Briefly, the four commonly used tools (Metilene, MethylKit, BiSeq and DMRfinder) were applied to detect DMRs. Each tool detected a list of DMRs with P-values. Using methylation rates and DMRs from each tool as input data, the R package ‘metevalue’ provided E-values as an additional column to the output from each DMR detection tool. Any identified regions of an absolute methylation with a difference of less than 0.05 between the two groups was filtered out. Then, P-values with the remaining regions were adjusted using the Bonferroni or BH methods, as implemented in the R statistical package ‘stats’. Regions with either a P-value <0.05, adjusted P-value <0.05 or E-value >20 were considered to be identified as statistically significant DMRs.

Figure 1.

Figure 1

A workflow for comprehensive evaluations of E-values for detecting DMR.

RRBS data were generated using our simulator ‘RRBSsim’ to comprehensively evaluate the performance of P-values, adjusted P-values and E-values in identifying DMRs. A region identified as a DMR covering more than 50% of CpG sites in the predefined DMR was considered as correctly identified. True negatives (TN) or false positives (FP) were then assigned to refer to predefined non-DMRs that had been identified as non-DMRs or DMRs, respectively. False negatives (FN) or true positives (TP) were similarly assigned to refer to predefined DMRs that had been identified as non-DMRs or DMRs, respectively (Supplementary Table S2).

Finally, to evaluate the performance of E-values and adjusted P-values using the BH method in a real RRBS data application, the DMR detection tool ‘Metilene’ was further applied to our recent RRBS studies of CRL rats and our low-salt clinical trial.

E-values improved the concordance of DMRs detected by different tools

To evaluate the similarity between DMRs from different detection tools in simulated RRBS data, the overlap of DMRs predicted by different tools using P-values, BH, Bonferroni or E-values was analyzed (Figure 2). In general, BiSeq showed the largest proportion of DMRs that overlapped with other detection tools, while MethylKit showed the lowest proportion of DMRs that overlapped with other detection tools, with Metilene and DMRfinder lying in between. We concluded that using E-values could further improve the consistency of DMRs detected by different tools, as compared to using P-values, BH or Bonferroni. Ultimately, the percentage of DMRs detected by all four tools increased by 17.5% when using E-values.

Figure 2.

Figure 2

The concordance of DMRs detected by different tools using P-values, adjusted P-values or E-values in simulated RRBS data. In each panel, the DMRs identified by the four tools are visualized using a Venn diagram.

E-values improved accuracy and reduced false discovery rates

ACC was used to evaluate the accuracy of identifying true DMRs. All DMR detection tools yielded the highest ACC when using E-values rather than P-values, BH or Bonferroni (Figure 3). The average ACC increased by up to 7.9% when using E-values compared to using P-values, BH or Bonferroni.

Figure 3.

Figure 3

FDR and ACC of identifying DMRs using P-values, adjusted P-values and E-values in simulated RRBS data. FDR: false discovery rate; ACC: accuracy. In each panel, the y-axis represents FDR and the x-axis represents ACC. The average FDR and ACC in 100 repeats are presented.

In the identification of DMRs, the FDR was highest when using P-values and decreased when using adjusted P-values. However, the FDR could be further reduced (up to 27.2%) by using E-values. Surprisingly, BiSeq could control FDR regardless of using P-values, adjusted P-values or E-values, while in Metilene, MethyKit and DMRfinder, the FDRs were largely out of control, even with BH-adjusted P-values. Taken together, using E-values not only improved accuracy but also reduced the false discovery rate of these tools in detecting DMRs compared to using P-values or adjusted P-values.

E-values increased AUC

Among all DMR detection tools, using adjusted P-values with the Bonferroni method generally yielded the lowest AUC, while using E-values yielded the highest AUC. AUC was significantly increased by up to 14.4% for all DMR detection tools when using E-values compared to P-values or adjusted P-values (Figure 4).

Figure 4.

Figure 4

AUC of identifying DMRs using P-values, adjusted P-values and E-values in simulated RRBS data. AUC: area under receiver operating characteristic curve. The white dots in the box represent the average value of AUC. ****P ≤ 0.0001.

E-values improved power and reduced type I error

In general, DMR detection tools using P-values yielded higher power than those using adjusted P-values (BH or Bonferroni), while DMR detection tools using E-values, especially for BiSeq and Metilene, were more powerful than those using P-values (Figure 5).

Figure 5.

Figure 5

Type I error and power of identifying DMRs using P-values, adjusted P-values and E-values in simulated RRBS data. In each panel, the y-axis represents type I error and the x-axis represents power. The average type I error and power in 100 repeats are presented.

Among all DMR detection tools, using P-values yielded a higher number of type I errors than using adjusted P-values or E-values. Conversely, using E-values could reduce type I errors compared to using P-values or BH-adjusted P-values. With the exception of MethyKit, all other DMR detection tools could control type I errors below 0.05, regardless of whether P-values, adjusted P-values or E-values were used. In MethylKit, the type I error exceeded 0.12 and was not well controlled when using P-value or adjusted P-value, while the type I error was well controlled to less than 0.05 when using E-values. These results suggested that the use of E-values in these DMR detection tools provided good control over type I errors and improved the ability to identify DMRs.

E-values identified DMRs of more biological relevance

To illustrate the utility of E-values, we applied them to identify DMRs in real RRBS datasets. It is well known that high sodium intake increases the risk of cardiovascular and renal diseases. Therefore, a better identification of DMRs from CRL rats or our clinical trial on low-salt diet should be able to identify more genes related with cardiovascular and renal diseases. In addition, DNA methylation in gene promoters plays a crucial role in transcriptional regulation. A better identification of DMRs should be able to reflect the strong negative correlation between the methylation levels of DMR detected in the gene promoters and their corresponding gene expressions.

The first real RRBS dataset came from our previous studies [31, 32] that analyzed genome-wide DNA methylome and transcriptome of renal T lymphocytes from CRL rats treated with high- or low-salt diets. Metilene was previously used to identify DMRs between high-salt and low-salt diet CRL rats using BH-adjusted P-values. This resulted in the identification of 193 DMRs located at TSS regions with BH-adjusted P-values <0.05. When using E-values >20, Metilene identified 233 DMRs located at TSS regions (Figure 6A). The methylation levels of these identified DMRs were concentrated around 0.1, which was consistent to the expected low methylation level in TSS regions (Figure 6B). Among the identified DMRs in TSS regions, 141 overlapped DMRs were detected by both BH-adjusted P-values and E-values. Genes with their TSS located within these overlapped DMRs were related to diseases such as Type 2 Diabetes. 92 DMRs were identified only when using E-values rather than BH-adjusted P-values. Genes with their TSS located within these unique DMRs were enriched in several diseases including blood pressure, hypertension and kidney diseases (Figure 6C). We then calculated Pearson correlation between the methylation level of these DMRs and the corresponding gene expression. Interestingly, DMRs detected using E-values were more negatively correlated with their corresponding gene expressions than those DMRs detected using BH-adjusted P-values (Figure 6D–E).

Figure 6.

Figure 6

Comparison of DMRs detected in CRL rats by using E-values or BH-adjusted P-values. (A) Venn diagram of DMRs detected by E-values and BH-adjusted P-values. (B) Methylation levels in these detected DMRs. (C) Number of disease-associated genes with TSS located within the detected DMRs. (DE) The negative association between gene expression and methylation level in the DMRs as detected by E-values or BH-adjusted P-values. Metilene was used to detect DMRs in the real RRBS data analysis. DMRs carrying TSS regions were used for gene ontology analysis.

The second real RRBS dataset came from our clinical trial in which 10 subjects were placed on 2 week low-salt diet (1200 mg sodium/day) [33]. Both RRBS and mRNA-seq data of biopsied arterioles were obtained from the same subjects before and after low-salt diet (referred to as pre-diet and post-diet, respectively). The original Metilene had implemented the Wilcoxon rank-sum test for independent experimental groups. Here, we replaced the Wilcoxon rank-sum test in Metilene by Wilcoxon signed-rank test for pre- and post-diet paired samples. Then, the revised Metilene using BH-adjusted P-values or E-values was applied to identify DMRs between pre-diet and post-diet groups. DMRs detected using E-values showed stronger negative correlations with their corresponding gene expressions than DMRs detected using BH-adjusted P-values (Supplementary Figure S1). These real RRBS data analyses revealed that the use of E-values leads to detection of DMRs of increased biological relevancy.

DISCUSSION

In this study, we have proposed a method to calculate E-values for DMR detection in RRBS data and provided an R package ‘metevalue’ as a user-friendly interface to implement the calculation of E-values that is compatible with various DMR detection tools. We then generated various RRBS benchmarking datasets using our simulator ‘RRBSsim’ to comprehensively evaluate the performance of P-values, adjusted P-values and E-values in identifying DMRs. Such benchmarking analyses showed that using E-values, rather than P-values or adjusted P-values, not only significantly improved ACC, AUC and power, but also reduced FDR and type I errors in identifying DMRs in RRBS studies (Table 1). The real data applications from CRL rats and our clinical trial on low-salt diet demonstrated that the use of E-values detected more disease-related genes containing these DMRs and improved the negative correlation between DNA methylation and gene expression. This suggests that E-values detect biologically more relevant DMRs than P-values.

Table 1.

Summary of the performance of E-values, P-values and their adjustments

Software Criteria ACC AUC FDR Type I error Power
Metilene E-values 1 1 2 2 1
BH 2 3 3 3 3
Bonferroni 3 4 1 1 4
P-values 4 2 4 4 2
MethylKit E-values 1 1 1 1 4
BH 3 2 3 3 1
Bonferroni 2 2 2 2 2
P-values 3 2 3 4 1
BiSeq E-values 1 1 2 2 1
BH 3 2 3 3 3
Bonferroni 4 4 1 1 4
P-values 2 2 4 4 2
DMRfinder E-values 1 1 2 2 2
BH 2 2 3 3 3
Bonferroni 3 4 1 1 4
P-values 4 3 4 4 1

Note: Performance is ranked numerically from 1 (best) to 4 (worst).

Improving accuracy and reducing errors are major considerations in the development of any computational tools, especially when it comes to identifying DMRs. To our knowledge, our study is the first to provide an E-value computation tool and to evaluate its performance in various DMR detection tools for RRBS data. Our findings indicate that multiple testing correction may greatly influence the performance of DMR detection tools and highlight the applicability of E-values in identifying DMRs in RRBS studies. This provides valuable information to guide other researchers toward the advancement of sequence-based DMR analysis and may also provide an example for E-value incorporation that could lead to similar improvements for other areas of biological data analysis.

In addition to their superior performance, E-values are easy to interpret and compute. They can be directly interpreted as ‘betting scores’ or ‘likelihood ratios’, or can be interpreted as ‘how likely is the rejection to be true?’. Neither asymptotic analysis or full distribution of test statistics is required to derive E-values. On the contrary, it is the use of P-values and adjusted P-values that have fundamental limitations in these areas [23]. Neither P-values nor adjusted P-values can answer the question ‘how likely is the rejection to be true?’. Furthermore, the computing of P-values and adjusted P-values requires a known full distribution of test statistics and often requires asymptotic analysis. Without a large sample size, the computing P-values by asymptotic analysis is far from robust [23]. Therefore, it is not always easy to accurately compute P-values and adjusted P-values. In omics data analysis, it remains challenging to control the FDR using statistical methods that generate incorrect raw P-values [42, 43].

There have been different terms in literature that refer to the ‘E-value’, such as ‘betting score’, ‘Skeptic’s capital’ and ‘S-value’. They have been used interchangeably to express the same definition that represents a measure of the ratio of the observed likelihood falling into the null hypothesis to the likelihood on the entire parameter space. The E-value can also be interpreted as a Bayes factor. In the P-value guideline issued in 2016, the ASA also recommended the use of Bayes Factors as an alternative approach to prevent misuse and misconception of P-values [15]. Over recent years, researchers have adopted Bayes factor analysis to improve the accuracy of the identification of causal gene variants in genome-wide association studies [44]. In the present study, we have extended E-values to the detection of DMRs in RRBS data and provided an R package to calculate E-values for DMR detection tools including Metilene, MethylKit, BiSeq and DMRfinder. Our study suggests that the use of E-values may be a promising approach to address the multiple testing correction problems when using RRBS data to identify differentially methylated regions.

Several caveats should be acknowledged in our study. First, we considered just eight samples in each experimental group. Despite the small sample size in this study, RRBS experiments including animal studies and the clinical trial in our previous studies have often been performed with small number of biological replicates (ranging from a few to a dozen subjects), and this is considered sufficient to reveal the importance of epigenetic regulation involving DMR methylation changes [45–48]. Second, we used the default setting of each DMR detection tool, focusing mainly on the performance of P-values, adjusted P-values and E-values in each DMR detection tool. Third, we evaluated the performance of the E-value in the detection of DMR using RRBS data. This benchmark workflow would also be potentially applicable to identify DMRs in methylation array data and whole genome bisulfate sequencing data and is also be suitable for identifying differentially methylated CpG sites due to these DMR detection tools containing optional analysis of CpG sites. Overall, our benchmark workflow sheds light on potential general applications to other types of genomic data besides DNA methylation. However, the applicability toward the construction of E-values in other specific types of genomic data still requires further investigation.

CONCLUSIONS

In summary, we have proposed a method to calculate E-values for the detection of DMRs in RRBS data and provided the R package ‘metevalue’ as a user-friendly interface to implement E-value calculations into various DMR detection tools. Our comprehensive benchmarking analyses demonstrated that using E-values not only significantly improved accuracy, AUC and power, over that of P-values or adjusted P-values, but also reduced false discovery rates and type I errors. In applications upon real RRBS data, the use of E-values detected biologically more relevant DMRs and improved the negative association between DNA methylation levels and their associated gene expressions. Taken together, E-value is a superior alternative to P-value and its use in DNA methylation studies is likely to receive increasing attention related to genome-wide epigenetic analysis.

Key Points

  • An E-value computation tool is provided as an alternative framework to P-value and its adjustments in DNA methylation studies.

  • Using E-values not only significantly improved accuracy, area under ROC curve and power, but also reduced false discovery rates and type I errors compared to P-values or adjusted P-values.

  • Using E-values detected biologically more relevant differentially methylated regions (DMRs) in our real reduced representation bisulfite sequencing data.

  • Our findings highlight the applicability of E-values in the detection of DMRs and provide valuable information that may guide researchers in advancing the analysis of other types of genomic data.

Supplementary Material

Supplementary-Data-R1_bbad241

ACKNOWLEDGEMENTS

We thank Ms Chao Bi at the Bioinformatics Core Facility of Zhejiang University School of Medicine, and Mathematics and Science College of Shanghai Normal University for providing computing capacity, Drs Tiantian Mao and Taizhong Hu at University of Science and Technology for hosting lectures introducing E-value theory, Dr Ruodu Wang at University of Waterloo for comments and suggestions on applying E-value theory, Christopher Wood of Zhejiang University for English grammar polishing, and the anonymous reviewers and editors for reading and commenting on the manuscript.

Author Biographies

Yifan Yang is the R&D director for the Transwarp Technology (Shanghai) Co., Ltd, and a part-time graduate advisor of Shanghai Normal University. His research interests are survival analysis, bioinformatics, knowledge graphs and big data.

Haoyuan Liu is a master’s student at the Shanghai Center for Mathematical Science of Fudan University, which he joined after his undergraduate studies in the Department of Mathematics at Shanghai Normal University. His research interests are statistics and bioinformatics.

Yi Liu is a PhD student at Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine. Her research interests are statistics and bioinformatics.

Liyuan Zhou is a research scientist at Sir Run Run Shaw Hospital, Zhejiang University School of Medicine. She received her PhD in bioinformatics from Zhejiang University. Her research interests are complex trait analysis, bioinformatics and statistical genetics.

Xiaoqi Zheng is a professor in the Department of Mathematics at Shanghai Normal University. His research interests are bioinformatics and computational biology.

Rongxian Yue is a professor in the Department of Mathematics at Shanghai Normal University. His research interests are statistical design of experiments and statistical computing.

David L. Mattson is the department chair and professor in the Department of Physiology, Medical College of Georgia, Augusta University. His research interests include normal and pathophysiological regulation of renal function and arterial blood pressure.

Srividya Kidambi is a professor and chief in the Division of Endocrinology, Metabolism, and Clinical Nutrition at the Department of Medicine, Medical College of Wisconsin/Froedtert Hospital. Her research interests are treatment of conditions associated with severe uncontrolled blood pressure, blood pressure in young people and blood pressure secondary to hormonal disorders.

Mingyu Liang is a professor and vice chair for Interdisciplinary and Translational Research in the Department of Physiology, Kohler Co. Chair and Director of Center of Systems Molecular Medicine at the Medical College of Wisconsin. His research interests are (epi)genomics and precision medicine, regulatory RNA and cellular metabolism, as related to hypertension and cardiovascular and kidney diseases.

Pengyuan Liu is a professor at Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, and an Adjunct Professor at the Department of Physiology, Medical College of Wisconsin, USA. His research interests are non-coding RNA, bioinformatics and computational biology.

Xiaoqing Pan is a professor in the Department of Mathematics at Shanghai Normal University. Her research interests are statistical genetics and epigenomics in complex traits.

Contributor Information

Yifan Yang, Department of Mathematics, Shanghai Normal University, Shanghai, China; Transwarp Technology Co., LTD, Shanghai, China.

Haoyuan Liu, Department of Mathematics, Shanghai Normal University, Shanghai, China.

Yi Liu, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Department of Respiratory Medicine, Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University, Hangzhou, China.

Liyuan Zhou, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Department of Respiratory Medicine, Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University, Hangzhou, China.

Xiaoqi Zheng, Department of Mathematics, Shanghai Normal University, Shanghai, China.

Rongxian Yue, Department of Mathematics, Shanghai Normal University, Shanghai, China.

David L Mattson, Department of Physiology, Medical College of Georgia, Augusta University, Augusta, GA, USA.

Srividya Kidambi, Department of Medicine, Medical College of Wisconsin, Milwaukee, WI, USA.

Mingyu Liang, Center of Systems Molecular Medicine, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA.

Pengyuan Liu, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Department of Respiratory Medicine, Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University, Hangzhou, China; Center of Systems Molecular Medicine, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA.

Xiaoqing Pan, Department of Mathematics, Shanghai Normal University, Shanghai, China.

FUNDING

The Natural Science Foundation of Shanghai (20JC1413800), the Medical Health Science and Technology Key Project of Zhejiang Provincial Health Commission (WKJ-ZJ-2007), the Key Research & Development Program of Zhejiang Province of China (2021C03126), the National Natural Science Foundation of China (82072857), the National Institutes of Health (HL149620) and the American Heart Association (15SFRN23910002).

DATA AVAILABILITY

The real RRBS data and gene expression data from Dahl salt-sensitive (SS/JrHsdMcwiCrl, referred to as CRL) rats and the clinical trial on low-salt diet have been deposited in the National Genomics Data Center under the accession numbers OMIX004080 and OMIX004085, respectively (https://ngdc.cncb.ac.cn/omix). A set of simulated RRBS data has also been deposited as example data in the GitHub repository at https://github.com/yfyang86/metevalue-demos/tree/main/simulation.

CODE AVAILABILITY

The R package ‘metevalue’ can be accessed directly through The Comprehensive R Archive Network (CRAN) (https://cran.r-project.org). Its source code is also available in the GitHub repository at https://github.com/yfyang86/metevalue. Source codes for generating simulated RRBS data used in our study can be also found in the GitHub repository at https://github.com/yfyang86/metevalue-demos/tree/main/simulation.

AUTHORS’ CONTRIBUTION

P.L. and X.P. designed the study. Y.L. generated the RRBS simulation data from RRBSsim and prepared the algorithms for DMR detection tools. H.L. performed the data analysis. H.L., Y.Y. and X.P. developed the algorithm and its associated R package to calculate E-values. L.Z. and Y.Y. provided technical support for high performance computation. D.L.M., S.K. and M.L. generated real RRBS and RNA-seq data from CRL rats and from the clinical trial on low-salt diet. X.Z., R.Y., D.L.M., S.K. and M.L. discussed and commented upon the study. Y.Y., P.L. and X.P. wrote the manuscript. All authors commented and approved the manuscript.

REFERENCES

  • 1. Korthauer  K, Kimes  PK, Duvallet  C, et al.  A practical guide to methods controlling false discoveries in computational biology. Genome Biol  2019;20(1):118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Jafari  M, Ansari-Pour  N. Why, when and how to adjust your P values?  Cell J  2019;20(4):604–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Knight  R, Vrbanac  A, Taylor  BC, et al.  Best practices for analysing microbiomes. Nat Rev Microbiol  2018;16(7):410–22. [DOI] [PubMed] [Google Scholar]
  • 4. Meijer  RJ, Goeman  JJ. Multiple testing of gene sets from gene ontology: possibilities and pitfalls. Brief Bioinform  2016;17(5):808–18. [DOI] [PubMed] [Google Scholar]
  • 5. van den  Oord  EJ. Controlling false discoveries in genetic studies. Am J Med Genet B Neuropsychiatr Genet  2008;147b(5):637–44. [DOI] [PubMed] [Google Scholar]
  • 6. Gelman  A, Robert  CP. Revised evidence for statistical standards. Proc Natl Acad Sci U S A  2014;111(19):E1933. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Ziliak  ST. The Validus Medicus and a new gold standard. Lancet  2010;376(9738):324–5. [DOI] [PubMed] [Google Scholar]
  • 8. Tyanova  S, Temu  T, Cox  J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat Protoc  2016;11(12):2301–19. [DOI] [PubMed] [Google Scholar]
  • 9. Rothman  KJ. A show of confidence. N Engl J Med  1978;299(24):1362–3. [DOI] [PubMed] [Google Scholar]
  • 10. Greenland  S, Senn  SJ, Rothman  KJ, et al.  Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol  2016;31(4):337–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Davis  ML, Huang  Y, Wang  K. Rank normalization empowers a t-test for microbiome differential abundance analysis while controlling for false discoveries. Brief Bioinform  2021;22(5):bbab059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Huber  W. A clash of cultures in discussions of the P value. Nat Methods  2016;13(8):607. [DOI] [PubMed] [Google Scholar]
  • 13. Halsey  LG, Curran-Everett  D, Vowler  SL, Drummond  GB. The fickle P value generates irreproducible results. Nat Methods  2015;12(3):179–85. [DOI] [PubMed] [Google Scholar]
  • 14. Sterne  JA, Davey Smith  G. Sifting the evidence-what’s wrong with significance tests?  BMJ  2001;322(7280):226–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Nuzzo  R. Scientific method: statistical errors. Nature  2014;506(7487):150–2. [DOI] [PubMed] [Google Scholar]
  • 16. Lew  MJ. Bad statistical practice in pharmacology (and other basic biomedical disciplines): you probably don’t know P. Br J Pharmacol  2012;166(5):1559–67. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Ioannidis  JP. Why most discovered true associations are inflated. Epidemiology  2008;19(5):640–8. [DOI] [PubMed] [Google Scholar]
  • 18. Ioannidis  JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA  2005;294(2):218–28. [DOI] [PubMed] [Google Scholar]
  • 19. Wasserstein  RL, Lazar  NA. The ASA statement on P-values: context, process, and purpose. The American Statistician  2016;70(2):129–33. [Google Scholar]
  • 20. Amrhein  V, Greenland  S, McShane  B. Scientists rise up against statistical significance. Nature  2019;567(7748):305–7. [DOI] [PubMed] [Google Scholar]
  • 21. Murtaugh  PA. In defense of P values. Ecology  2014;95(3):611–7. [DOI] [PubMed] [Google Scholar]
  • 22. Wasserstein  RL, Schirm  AL, Lazar  NA. Moving to a world beyond “p < 0.05”. The American Statistician  2019;73(sup1):1–19. [Google Scholar]
  • 23. Vovk  V, Wang  R. E-values: calibration, combination and applications. The Annals of Statistics  2021;49(3):1736, 1719–54. [Google Scholar]
  • 24. Yin  L, Dai  Y, Jiang  X, et al.  Role of DNA methylation in bisphenol a exposed mouse spermatocyte. Environ Toxicol Pharmacol  2016;48:265–71. [DOI] [PubMed] [Google Scholar]
  • 25. Plongthongkum  N, Diep  DH, Zhang  K. Advances in the profiling of DNA modifications: cytosine methylation and beyond. Nat Rev Genet  2014;15(10):647–61. [DOI] [PubMed] [Google Scholar]
  • 26. Meissner  A, Gnirke  A, Bell  GW, et al.  Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis. Nucleic Acids Res  2005;33(18):5868–77. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Gu  H, Bock  C, Mikkelsen  TS, et al.  Genome-scale DNA methylation mapping of clinical samples at single-nucleotide resolution. Nat Methods  2010;7(2):133–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Liu  Y, Han  Y, Zhou  L, et al.  A comprehensive evaluation of computational tools to identify differential methylation regions using RRBS data. Genomics  2020;112(6):4567–76. [DOI] [PubMed] [Google Scholar]
  • 29. Sun  X, Han  Y, Zhou  L, et al.  A comprehensive evaluation of alignment software for reduced representation bisulfite sequencing data. Bioinformatics  2018;34(16):2715–23. [DOI] [PubMed] [Google Scholar]
  • 30. Sun  X, Yi  J, Yang  J, et al.  An integrated epigenomic-transcriptomic landscape of lung cancer reveals novel methylation driver genes of diagnostic and therapeutic relevance. Theranostics  2021;11(11):5346–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Abais-Battad  JM, Alsheikh  AJ, Pan  X, et al.  Dietary effects on Dahl salt-sensitive hypertension, renal damage, and the T lymphocyte transcriptome. Hypertension  2019;74(4):854–63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Dasinger  JH, Alsheikh  AJ, Abais-Battad  JM, et al.  Epigenetic modifications in T cells: the role of DNA methylation in salt-sensitive hypertension. Hypertension  2020;75(2):372–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Kidambi  S, Pan  X, Yang  C, et al.  Dietary sodium restriction results in tissue-specific changes in DNA methylation in humans. Hypertension  2021;78(2):434–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Roberts  ML, Kotchen  TA, Pan  X, et al.  Unique associations of DNA methylation regions with 24-hour blood pressure phenotypes in black participants. Hypertension  2022;79(4):761–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Liu  P, Liu  Y, Liu  H, et al.  Role of DNA de novo (de)methylation in the kidney in salt-induced hypertension. Hypertension  2018;72(5):1160–71. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Liu  Y, Liu  P, Yang  C, et al.  Base-resolution maps of 5-methylcytosine and 5-hydroxymethylcytosine in Dahl S rats: effect of salt and genomic sequence. Hypertension  2014;63(4):827–38. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Krueger  F, Andrews  SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics  2011;27(11):1571–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Jühling  F, Kretzmer  H, Bernhart  SH, et al.  Metilene: fast and sensitive calling of differentially methylated regions from bisulfite sequencing data. Genome Res  2016;26(2):256–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Akalin  A, Kormaksson  M, Li  S, et al.  methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol  2012;13(10):R87–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Hebestreit  K, Dugas  M, Klein  H-U. Detection of significantly differentially methylated regions in targeted bisulfite sequencing data. Bioinformatics  2013;29(13):1647–53. [DOI] [PubMed] [Google Scholar]
  • 41. Hochberg  Y. A sharper Bonferroni procedure for multiple tests of significance. Biometrika  1988;75(4):800–2. [Google Scholar]
  • 42. Benjamini  Y, Hochberg  Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol  1995;57(1):289–300. [Google Scholar]
  • 43. Sham  PC, Purcell  SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet  2014;15(5):335–46. [DOI] [PubMed] [Google Scholar]
  • 44. Morisawa  J, Otani  T, Nishino  J, et al.  Semi-parametric empirical Bayes factor for genome-wide association studies. Eur J Hum Genet  2021;29(5):800–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Sharp  AJ, Migliavacca  E, Dupre  Y, et al.  Methylation profiling in individuals with uniparental disomy identifies novel differentially methylated regions on chromosome 15. Genome Res  2010;20(9):1271–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Lister  R, Pelizzola  M, Dowen  RH, et al.  Human DNA methylomes at base resolution show widespread epigenomic differences. Nature  2009;462(7271):315–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Laporte  M, le Luyer  J, Rougeux  C, et al.  DNA methylation reprogramming, TE derepression, and postzygotic isolation of nascent animal species. Sci Adv  2019;5(10):eaaw1644. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Grimm  SA, Shimbo  T, Takaku  M, et al.  DNA methylation in mice is influenced by genetics as well as sex and life experience. Nat Commun  2019;10(1):305. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary-Data-R1_bbad241

Data Availability Statement

The real RRBS data and gene expression data from Dahl salt-sensitive (SS/JrHsdMcwiCrl, referred to as CRL) rats and the clinical trial on low-salt diet have been deposited in the National Genomics Data Center under the accession numbers OMIX004080 and OMIX004085, respectively (https://ngdc.cncb.ac.cn/omix). A set of simulated RRBS data has also been deposited as example data in the GitHub repository at https://github.com/yfyang86/metevalue-demos/tree/main/simulation.


Articles from Briefings in Bioinformatics are provided here courtesy of Oxford University Press

RESOURCES