Computational and Structural Biotechnology Journal. 2025 Aug 26;27:3742–3752. doi: 10.1016/j.csbj.2025.08.030

Statistical modeling of immunoprecipitation efficiency of MeRIP-seq data enabled accurate detection and quantification of epitranscriptome

Haozhe Wang a,b,e, Kunqi Chen g, Zhen Wei b, Bowen Song d, Manli Zhu i, Jionglong Su f, Anh Nguyen e, Jia Meng b,c,h, Yue Wang a,
PMCID: PMC12446549  PMID: 40977901

Abstract

Background

Recent advances in epitranscriptomics highlight reversible RNA modifications as crucial regulators, with N6-methyladenosine (m6A) being abundant in eukaryotic mRNAs. Immunoprecipitation (IP) with specific antibodies is one of the most prevalent methods for m6A profiling, enabling the isolation of modified RNA for downstream analysis of its functional roles. However, no computational method has been developed to report IP efficiency explicitly as a specific value, which may hinder the identification of novel modified RNA sites, particularly those that are of low abundance or less well characterized.

Results

We develop a comprehensive analytical tool, AEEIP, for estimating IP efficiency and directly correcting antibody bias in epitranscriptomics. AEEIP employs a mixture model to estimate the proportion of modification-containing RNA fragments in IP data. Validation with both simulated and real data shows that AEEIP successfully estimates antibody bias across different replicates and experimental conditions, and reveals that this bias may obscure the accurate identification of m6A sites, leading to false negatives in the quantification of m6A-seq data. The proposed method provides reproducible IP efficiency analysis and more robust results for quantifying epitranscriptomics, and is available at: https://github.com/whz991026/AEEIP.

Keywords: m6A, Epitranscriptome, Immunoprecipitation efficiency, MeRIP-seq, Mixture model, Antibody bias


Highlights

  • A novel computational tool (AEEIP) for estimating and correcting immunoprecipitation (IP) efficiency in MeRIP-seq data.

  • Accurate detection of m6A modifications by addressing antibody bias and IP inefficiencies.

  • Antibody bias varies significantly across tissues and cell lines.

  • Enhanced m6A site detection with more accurate clustering near functionally relevant mRNA regions.

1. Introduction

Over the past decade, extensive research in RNA epigenetics [1] and epitranscriptomics [2] has established reversible RNA modification as a fundamental mechanism of epigenetic regulation [3]. To date, more than 170 distinct types of RNA modification have been identified across all three domains of life [4], with many of these modifications linked to various biological processes and diseases [5], [6]. Among these, the N6-methyladenosine (m6A) modification has emerged as the most prevalent internal modification found in eukaryotic mRNAs [7], [8]. m6A modification is regulated by a group of associated proteins, often referred to as 'writers', 'erasers', and 'readers', which play critical roles in a range of biological functions, including miRNA processing [9], RNA stability [10], circadian clock [11], translation [12], [13], RNA-protein interaction [14], response to environmental exposures, cell differentiation and mechanistic toxicology [15], [16], [17], [18], [19], [20], [21]. Through its regulatory roles in gene expression and cellular processes, m6A may emerge as a promising therapeutic target for cancer, facilitating the development of novel treatments [22], [23], [24]. Accurate identification of RNA modification sites is essential for unraveling their regulatory mechanisms and exploring their roles in the context of human diseases [25], [26].

Breakthroughs in next-generation sequencing (NGS) technologies have greatly advanced the field of RNA epigenetics, enabling transcriptome-wide profiling of RNA modifications [27], [28], [29]. The most widely used technique for profiling the distribution of m6A sites is MeRIP-seq (or m6A-seq) [30], [31]. In addition to m6A-seq, several methods have been developed to profile m6A sites, including antibody-based approaches such as miCLIP [32], m6A-CLIP [33], m6ACE-seq [34], and PA-m6A-seq [35], as well as enzyme-assisted methods such as MAZTER-Seq [36], m6A-REF-seq [37], DART-seq [38], scDART-seq [39], eTAM-seq [40] and m6A-SAC-seq [41]. Additionally, third-generation sequencing from Oxford Nanopore Technologies (ONT) enables the direct observation of RNA molecules and offers significant advantages by distinguishing different RNA modification types without prior amplification [42], [43], [44], [45], [46]. Despite recent advances in novel high-resolution techniques such as direct RNA sequencing, antibody-based methods like MeRIP-seq remain indispensable in epitranscriptomics research due to their high sensitivity, well-established analytical frameworks, cost-effectiveness, and compatibility with low RNA inputs [47], [48], [49].

Among these techniques, MeRIP-seq and miCLIP are the most widely used; both depend on antibodies that recognize m6A to selectively pull down methylated RNAs [50]. Enrichment fold is usually calculated as log2(IP/Input) [51], [52] or analyzed with statistical methods such as DESeq2 [53] and MACS2 [54]. Moreover, several specialized computational tools have been created for their analysis, including exomePeak [55] and TRES [56] for peak calling, RADAR [57] and TRESS [58] for differential methylation analysis, trumpet [59] for MeRIP-seq data quality assessment, and m6ACali for reducing the impact of non-specific antibody enrichment in MeRIP-seq [50].

Despite the specificity of antibodies, inefficiencies in the IP process can result in the unintended capture of non-modified RNA fragments, introducing bias into the dataset. While existing methods typically include noise variables in their statistical models for epitranscriptome analysis, no computational method has been developed to directly quantify, as a specific value, the IP efficiency caused by antibody bias in different experiments. The lack of a clear interpretation of antibody behavior may lead to inaccurate analysis of downstream biological phenomena, such as erroneous conclusions about methylation changes [24] and about differential expression between two conditions (e.g., disease vs. healthy) [60]. Biased data mislead biological interpretations by attributing observed effects to specific RNA fragments when in fact the bias introduced by antibodies plays a significant role [61]. More importantly, antibody bias may obscure the identification of novel modified RNA sites, particularly those that are of low abundance or less well characterized [62].

Experimental methods like SCARLET [63] use site-specific cleavage and radioactive labeling to precisely quantify antibody bias but are labor-intensive and unsuitable for large-scale MeRIP-seq studies. To address this, we introduce a Bayesian modeling framework for estimating IP efficiency, inspired by prior work [64] that applied Bayesian inference to deconvolve cell-type proportions in heterogeneous tissues. The variability in IP efficiency across repeated replicates or different experimental conditions complicates efforts to develop standardized methods for estimating bias [65]. The lack of comprehensive datasets characterizing antibody efficiency exacerbates this challenge [65]. No universally applicable computational tool has been proposed to explicitly estimate the IP efficiency of various antibodies so far [66]. Therefore, in this research, we develop a comprehensive analytical tool, Antibody Efficiency Estimation from Immunoprecipitation data (AEEIP), designed to estimate the IP efficiency and correct antibody bias in epitranscriptomics directly. AEEIP will be used to determine the mixing proportion of modification-containing RNA fragments and modification-free RNA fragments in IP data, and revise this bias from the very beginning of the analysis. Its validity will be assessed using both simulated data and real m6A data. We demonstrate that such bias may obscure true m6A sites in the quantification of m6A-seq data. The proposed method offers reproducible IP efficiency analysis and provides more robust results for quantifying epitranscriptomics, which is available at: https://github.com/whz991026/AEEIP.

2. Methods

2.1. Overview of AEEIP method

AEEIP defines the relationships between various RNA fragments and applies a Bayesian framework to estimate the proportions of modified and unmodified RNA in experimental samples, building on a previous study [64] that used Bayesian inference to estimate cell-type composition in complex tissue samples. The Bayesian approach is particularly suitable for handling uncertainty in sparse data, which is common in MeRIP-seq datasets. Importantly, it enables us to treat IP efficiency as a latent variable within the system, providing a means to estimate antibody bias. Bayesian models have proven effective at resolving latent variables from noisy biological data, as demonstrated by their successful application in RNA-seq quantification [67], ChIP-seq data analysis [68] and single-cell methylome modeling [69]. Below is a detailed breakdown of how our proposed Bayesian method, AEEIP, is used to estimate bias in epitranscriptomic IP efficiency.

We define the total number of RNA sites to be estimated as $T$, with sites indexed by $t = 1, 2, \ldots, T$. Let $\tilde{l}_t$ represent the effective length of site $t$ [61]. We distinguish two types of RNA fragments: type $a$ represents the modification-free RNA fragment, and type $b$ represents the modification-containing RNA fragment. Let $\tau_a$ and $\tau_b$ represent the proportions of RNA fragments $a$ and $b$ in the IP sample, which satisfy $\tau_a + \tau_b = 1$. Similarly, let $\sigma_a$ and $\sigma_b$ represent the proportions of RNA fragments $a$ and $b$ in the Input sample, satisfying $\sigma_a + \sigma_b = 1$. Denote the relative abundance of type $a$ and type $b$ RNA fragments that align to site $t$ by $\rho_t^a$ and $\rho_t^b$, respectively. They satisfy $\sum_{t=1}^{T} \rho_t^a = 1$ and $\sum_{t=1}^{T} \rho_t^b = 1$.

Let $\rho_t^p$ and $\rho_t^m$ denote the relative RNA fragment abundance of site $t$ in the Input sample and IP sample, respectively. They can be expressed as follows:

$$\rho_t^p = \sigma_a \rho_t^a + \sigma_b \rho_t^b. \tag{1}$$

$$\rho_t^m = \tau_a \rho_t^a + \tau_b \rho_t^b. \tag{2}$$

Since the Input sample contains only a very small proportion of type $b$ RNA fragments (i.e., $\sigma_b$ tends to 0), Eq. (1) can be simplified to

$$\rho_t^p = \rho_t^a. \tag{3}$$

We will express $\rho_t^p$ and $\rho_t^m$ in terms of $\rho_t^a$ and $\rho_t^b$ according to Eqs. (2) and (3) in the following sections.
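As a small numerical illustration of Eqs. (1)-(3), the sketch below mixes two abundance profiles and measures how far the Input profile departs from the type a profile when σb is small. All numbers and the helper name `mix` are hypothetical and chosen only for illustration; they are not taken from the paper or from the AEEIP package.

```python
def mix(weight_a, rho_a, rho_b):
    """Return weight_a * rho_a + (1 - weight_a) * rho_b, site by site."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(rho_a, rho_b)]


# Hypothetical relative abundances over T = 4 sites (each profile sums to 1).
rho_a = [0.40, 0.30, 0.20, 0.10]  # modification-free fragments (type a)
rho_b = [0.05, 0.05, 0.30, 0.60]  # modification-containing fragments (type b)

sigma_a = 0.99  # Input: almost entirely type a, so sigma_b = 0.01
tau_a = 0.20    # IP: 20% non-specific (type a) fragments, so tau_b = 0.80

rho_p = mix(sigma_a, rho_a, rho_b)  # Eq. (1)
rho_m = mix(tau_a, rho_a, rho_b)    # Eq. (2)

# Largest deviation of the Input profile from rho_a, i.e. the error made by Eq. (3).
max_gap = max(abs(p - a) for p, a in zip(rho_p, rho_a))
```

In this toy setting the approximation error of Eq. (3) is bounded by σb times the largest site-wise difference between the two profiles, which is why a small σb makes the simplification reasonable.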

2.2. Alignment representation

The experimental data are recorded as a set $R^p = \{ r_t^p \mid t = 1, 2, \ldots, T \}$, where $r_t^p$ represents the number of RNA fragments in the Input sample that align to site $t$. The set $R^m = \{ r_t^m \mid t = 1, 2, \ldots, T \}$ is similarly defined for the IP sample.

Let $N^p = \sum_{t=1}^{T} r_t^p$ and $N^m = \sum_{t=1}^{T} r_t^m$ be the total number of RNA fragments in the Input sample and IP sample, respectively. The alignment representation $\Gamma^p$ is generated from the experimental data $R^p$, where $\Gamma^p = \{ \gamma_{i,t}^p \mid i = 1, 2, \ldots, N^p;\ t = 1, 2, \ldots, T \}$.

To explain this intuitively, the set $\Gamma^p$ can be written as an $N^p \times T$ matrix. In the $t$-th column of the matrix, $r_t^p$ elements are set to 1 and all other elements to 0, which means that $r_t^p$ RNA fragments align to site $t$. $\gamma_{i,t}^p = 1$ means that the $i$-th RNA fragment from the Input sample aligns to site $t$, and $\gamma_{i,t}^p = 0$ otherwise. The alignment representation $\Gamma^m$ is similarly generated from the set $R^m$.
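The construction of the alignment matrix from per-site counts can be sketched as follows. This is a minimal illustration with made-up counts; the function name `alignment_matrix` is hypothetical and the ordering of rows (fragments) is arbitrary.

```python
def alignment_matrix(counts):
    """Expand per-site counts r_t into an N x T 0/1 matrix, one row per fragment."""
    n_sites = len(counts)
    rows = []
    for t, r_t in enumerate(counts):
        for _ in range(r_t):
            row = [0] * n_sites
            row[t] = 1  # this fragment aligns to site t
            rows.append(row)
    return rows


counts_p = [2, 0, 3]  # hypothetical Input counts r_t^p for T = 3 sites
gamma_p = alignment_matrix(counts_p)

# Summing each column recovers the original counts.
column_sums = [sum(col) for col in zip(*gamma_p)]
```

Because every row contains exactly one 1, the matrix carries no more information than the count vector, which is why the likelihood can later be evaluated from counts alone.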

2.3. Generative model

The sequencing process is modeled as a sampling process. The probability of an RNA fragment originating from site $t$ is given by:

$$\alpha_t^s = \frac{\rho_t^s \tilde{l}_t}{\sum_{k=1}^{T} \rho_k^s \tilde{l}_k} \tag{4}$$

with $s$ being $p$ for the Input sample or $m$ for the IP sample.

Next, the probability of observing RNA fragment $i$ is expressed by

$$\sum_{t=1}^{T} \gamma_{i,t}^s \frac{\alpha_t^s}{\tilde{l}_t} \quad \text{for } s = p \text{ or } m. \tag{5}$$
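A minimal sketch of the length-weighted normalization in Eq. (4) is given below, read here as α_t = ρ_t·l_t divided by the sum of ρ_k·l_k over all sites. The abundances and effective lengths are hypothetical values, and `fragment_probabilities` is an illustrative name rather than a function of the AEEIP package.

```python
def fragment_probabilities(rho, eff_len):
    """Eq. (4): alpha_t = rho_t * l_t / sum_k (rho_k * l_k)."""
    weighted = [r * l for r, l in zip(rho, eff_len)]
    total = sum(weighted)
    return [w / total for w in weighted]


rho = [0.5, 0.3, 0.2]           # hypothetical relative abundances
eff_len = [100.0, 200.0, 50.0]  # hypothetical effective lengths

alpha = fragment_probabilities(rho, eff_len)

# With equal effective lengths the fragment probabilities equal the abundances.
alpha_equal = fragment_probabilities(rho, [1.0, 1.0, 1.0])
```

Longer sites attract proportionally more fragments, so the second site here receives half of the probability mass despite having a lower abundance than the first.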

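It is assumed that each fragment is produced independently in both samples. Therefore, the likelihood of observing the RNA fragments $\Gamma^p$ from the Input sample and $\Gamma^m$ from the IP sample is:

$$P\left(\Gamma^p, \Gamma^m \mid \alpha_t^p, \alpha_t^m \text{ for } t = 1, 2, \ldots, T\right) = \prod_{i=1}^{N^p} \sum_{t=1}^{T} \gamma_{i,t}^p \frac{\alpha_t^p}{\tilde{l}_t} \prod_{i=1}^{N^m} \sum_{t=1}^{T} \gamma_{i,t}^m \frac{\alpha_t^m}{\tilde{l}_t}. \tag{6}$$

In Eq. (6), we express $\alpha_t^p$ and $\alpha_t^m$ in terms of $\alpha_t^a$ and $\alpha_t^b$. According to their biological meaning [64], their relationship is described as $\alpha_t^p = \alpha_t^a$ and $\alpha_t^m \approx \tau_a \alpha_t^a + \tau_b \alpha_t^b$. Therefore, the likelihood function (6) can be expressed as:

$$P\left(\Gamma^p, \Gamma^m \mid \Theta\right) = \prod_{i=1}^{N^p} \sum_{t=1}^{T} \gamma_{i,t}^p \frac{\alpha_t^a}{\tilde{l}_t} \prod_{i=1}^{N^m} \sum_{t=1}^{T} \gamma_{i,t}^m \frac{\tau_a \alpha_t^a + \tau_b \alpha_t^b}{\tilde{l}_t} \tag{7}$$

where $\Theta$ is defined as $\left\{ \{\alpha_t^a\}_{t=1}^{T}, \{\alpha_t^b\}_{t=1}^{T}, \tau_a, \tau_b \right\}$.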

We seek a $\Theta$ that satisfies $\alpha_t^m \approx \tau_a \alpha_t^a + \tau_b \alpha_t^b$. However, $\tau_a = 0$, $\alpha_t^b = \alpha_t^m$ is always one of the optimal solutions of the likelihood (7), so maximum likelihood estimation returns this same uninformative solution for any given Input. To better account for uncertainty and biological variability, we introduce Bayesian prior information [64], [67], [68], [69].
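The identifiability problem can be checked numerically. The sketch below evaluates the logarithm of the likelihood (7) from per-site counts (each fragment aligns to exactly one site, so the product over fragments collapses to a sum over sites) and compares a plausible decomposition with the degenerate one. The counts, profiles and the function name `log_likelihood` are hypothetical.

```python
import math


def log_likelihood(r_p, r_m, alpha_a, alpha_b, tau_a, eff_len):
    """Log of Eq. (7), with the product over fragments collapsed to site counts."""
    tau_b = 1.0 - tau_a
    total = 0.0
    for t in range(len(r_p)):
        mixed = tau_a * alpha_a[t] + tau_b * alpha_b[t]
        if r_p[t] > 0:
            total += r_p[t] * math.log(alpha_a[t] / eff_len[t])
        if r_m[t] > 0:
            total += r_m[t] * math.log(mixed / eff_len[t])
    return total


r_p = [40, 30, 20, 10]  # hypothetical Input counts
r_m = [11, 10, 28, 51]  # hypothetical IP counts
eff_len = [1.0] * 4     # equal effective lengths, for simplicity
alpha_a = [r / sum(r_p) for r in r_p]

# A decomposition with 20% non-specific fragments in the IP sample ...
alpha_b_guess = [0.05, 0.05, 0.30, 0.60]
ll_guess = log_likelihood(r_p, r_m, alpha_a, alpha_b_guess, 0.2, eff_len)

# ... versus the degenerate one: tau_a = 0 and alpha_b equal to the IP profile.
alpha_m = [r / sum(r_m) for r in r_m]
ll_degenerate = log_likelihood(r_p, r_m, alpha_a, alpha_m, 0.0, eff_len)
```

The degenerate solution fits the IP counts perfectly, so its likelihood is never lower than that of any other decomposition; this is the behaviour that the prior on τa and τb is meant to counteract.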

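A $\mathrm{Beta}(\beta_a, \beta_b)$ distribution is used as the prior for the proportions of type $a$ and type $b$ RNA fragments. To obtain the posterior probability from Eq. (7), Bayes' theorem is applied [70]:

$$P\left(\Theta \mid \Gamma^p, \Gamma^m\right) \propto \prod_{i=1}^{N^p} \sum_{t=1}^{T} \gamma_{i,t}^p \frac{\alpha_t^a}{\tilde{l}_t} \prod_{i=1}^{N^m} \sum_{t=1}^{T} \gamma_{i,t}^m \frac{\tau_a \alpha_t^a + \tau_b \alpha_t^b}{\tilde{l}_t} \, \tau_a^{\beta_a - 1} \tau_b^{\beta_b - 1} \tag{8}$$

Utilizing the concept of a normalizing constant [70], our posterior probability can be expressed as:

$$P\left(\Theta \mid \Gamma^p, \Gamma^m\right) = \frac{1}{C} \prod_{i=1}^{N^p} \sum_{t=1}^{T} \gamma_{i,t}^p \frac{\alpha_t^a}{\tilde{l}_t} \prod_{i=1}^{N^m} \sum_{t=1}^{T} \gamma_{i,t}^m \frac{\tau_a \alpha_t^a + \tau_b \alpha_t^b}{\tilde{l}_t} \, \tau_a^{\beta_a - 1} \tau_b^{\beta_b - 1} \tag{9}$$

where $C$ is the normalizing constant.

The estimated parameters are therefore described by

$$\hat{\Theta} = \arg\max_{\Theta} \log P\left(\Theta \mid \Gamma^p, \Gamma^m\right) \tag{10}$$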
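For reference, the objective in Eq. (10) can be written out in terms of site counts. This is a restatement under the reading of Eq. (9) as a Beta-weighted version of the likelihood (7), using the fact that every fragment aligns to exactly one site, so a product over fragments becomes a product over sites raised to the read counts $r_t^p$ and $r_t^m$:

$$\log P\left(\Theta \mid \Gamma^p, \Gamma^m\right) = \sum_{t=1}^{T} r_t^p \log \frac{\alpha_t^a}{\tilde{l}_t} + \sum_{t=1}^{T} r_t^m \log \frac{\tau_a \alpha_t^a + \tau_b \alpha_t^b}{\tilde{l}_t} + (\beta_a - 1) \log \tau_a + (\beta_b - 1) \log \tau_b - \log C.$$

The first two terms are the log-likelihood of the Input and IP counts, and the last two parameter-dependent terms are the contribution of the Beta prior, which pulls $\tau_a$ toward $(\beta_a - 1)/(\beta_a + \beta_b - 2)$ more strongly as the hyper-parameters grow relative to $N^m$.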

2.4. EM algorithm

The maximum a posteriori (MAP) problem is solved by the Expectation-Maximization (EM) approach [71], which requires latent variables to be defined. Let $Z^p = \{ z_{i,t}^p \mid t = 1, \ldots, T;\ i = 1, \ldots, N^p \}$ represent the fragment alignment representation in the Input sample, where $z_{i,t}^p = 1$ if RNA fragment $i$ aligns to site $t$ within the Input sample. The latent variables of the IP sample are denoted as $Z^m = \{ z_{i,t}^{ma}, z_{i,t}^{mb} \mid t = 1, \ldots, T;\ i = 1, \ldots, N^m \}$, where $z_{i,t}^{ma} = 1$ if RNA fragment $i$ aligns to site $t$ and comes from a type $a$ RNA fragment within the IP sample, and $z_{i,t}^{ma} = 0$ otherwise. The variable $z_{i,t}^{mb}$ is defined similarly for type $b$. The details of the proof are provided in the Appendix.

2.4.1. E-step

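Next, the auxiliary variables, i.e., the conditional probabilities of the latent variables accumulated over the $r_t^m$ fragments aligned to site $t$, are defined as below:

$$q_t^{ma(n+1)} = \sum_{i=1}^{N^m} P\left(z_{i,t}^{ma} = 1 \mid \Theta^{(n)}, \Gamma^p, \Gamma^m\right) = r_t^m \frac{\tau_a^{(n)} \alpha_t^{a(n)}}{\tau_a^{(n)} \alpha_t^{a(n)} + \tau_b^{(n)} \alpha_t^{b(n)}} \tag{11}$$

$$q_t^{mb(n+1)} = \sum_{i=1}^{N^m} P\left(z_{i,t}^{mb} = 1 \mid \Theta^{(n)}, \Gamma^p, \Gamma^m\right) = r_t^m \frac{\tau_b^{(n)} \alpha_t^{b(n)}}{\tau_a^{(n)} \alpha_t^{a(n)} + \tau_b^{(n)} \alpha_t^{b(n)}} \tag{12}$$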

2.4.2. M-step

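$$\tau_a^{(n+1)} = \frac{\sum_{t=1}^{T} q_t^{ma(n+1)} + \beta_a^{(n)} - 1}{N^m + \beta_a^{(n)} + \beta_b^{(n)} - 2} \tag{13}$$

$$\tau_b^{(n+1)} = \frac{\sum_{t=1}^{T} q_t^{mb(n+1)} + \beta_b^{(n)} - 1}{N^m + \beta_a^{(n)} + \beta_b^{(n)} - 2} \tag{14}$$

$$\alpha_t^{a(n+1)} = \frac{r_t^p + q_t^{ma(n+1)}}{N^p + \tau_a^{(n+1)} N^m} \tag{15}$$

$$\alpha_t^{b(n+1)} = \frac{q_t^{mb(n+1)}}{\tau_b^{(n+1)} N^m} \tag{16}$$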

Iteratively perform the E-step and M-step to update both the model parameters and conditional probabilities until convergence, in order to maximize the posteriori objective in Eq. (10).
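The E-step and M-step can be sketched in a few lines. The version below is an illustrative reading of Eqs. (11)-(16), not the AEEIP implementation: it works on per-site counts, takes the numerator count in Eq. (15) to be the Input count at site t, uses a fixed number of iterations instead of a convergence test, and assumes hyper-parameters of at least 1. The function name `em_map`, the initial values and the example counts are all hypothetical.

```python
def em_map(r_p, r_m, beta_a, beta_b, n_iter=200):
    """MAP-EM sketch following Eqs. (11)-(16); returns tau_a, alpha_a, alpha_b."""
    n_sites = len(r_p)
    n_p, n_m = sum(r_p), sum(r_m)
    tau_a = 0.5                       # assumed starting value
    alpha_a = [r / n_p for r in r_p]  # start from the Input profile
    alpha_b = [r / n_m for r in r_m]  # start from the IP profile
    for _ in range(n_iter):
        # E-step: expected numbers of type a and type b reads at each site.
        q_a, q_b = [], []
        for t in range(n_sites):
            w_a = tau_a * alpha_a[t]
            w_b = (1.0 - tau_a) * alpha_b[t]
            share_a = w_a / (w_a + w_b) if (w_a + w_b) > 0 else 0.0
            q_a.append(r_m[t] * share_a)          # Eq. (11)
            q_b.append(r_m[t] * (1.0 - share_a))  # Eq. (12)
        # M-step: Eq. (13); Eq. (14) gives tau_b = 1 - tau_a because q_a + q_b = r_m.
        tau_a = (sum(q_a) + beta_a - 1.0) / (n_m + beta_a + beta_b - 2.0)
        tau_b = 1.0 - tau_a
        alpha_a = [(r_p[t] + q_a[t]) / (n_p + tau_a * n_m) for t in range(n_sites)]  # Eq. (15)
        alpha_b = [q_b[t] / (tau_b * n_m) for t in range(n_sites)]                   # Eq. (16)
    return tau_a, alpha_a, alpha_b


r_p = [40, 30, 20, 10]  # hypothetical Input counts
r_m = [11, 10, 28, 51]  # hypothetical IP counts
n_m = sum(r_m)

# Flat prior (beta_a = beta_b = 1): the updates reduce to plain maximum likelihood.
tau_flat, a_flat, b_flat = em_map(r_p, r_m, 1.0, 1.0)

# Strong prior of total size 10 * N^m split 0.2 / 0.8 between type a and type b.
tau_prior, a_prior, b_prior = em_map(r_p, r_m, 0.2 * 10 * n_m, 0.8 * 10 * n_m)
```

With a flat prior the estimated profiles sum to one after every M-step, whereas under a strong prior they generally do not, since the prior shifts τa away from the share implied by the expected counts.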

However, the final $\alpha_t^{a(n+1)}$ and $\alpha_t^{b(n+1)}$ do not always satisfy $\sum_{t=1}^{T} \alpha_t^{a(n+1)} = 1$ and $\sum_{t=1}^{T} \alpha_t^{b(n+1)} = 1$, except under the following conditions:

$$\sum_{t=1}^{T} q_t^{ma(n+1)} = \tau_a^{(n+1)} N^m, \quad \text{and} \quad \sum_{t=1}^{T} q_t^{mb(n+1)} = \tau_b^{(n+1)} N^m \tag{17}$$

Therefore, we replace $\tau_a^{(n+1)} N^m$ and $\tau_b^{(n+1)} N^m$ according to Eq. (17). The corresponding variables are then updated as follows:

$$\tau_a^{(n+1)} = \frac{\sum_{t=1}^{T} q_t^{ma(n+1)}}{N^m}. \tag{18}$$

$$\tau_b^{(n+1)} = \frac{\sum_{t=1}^{T} q_t^{mb(n+1)}}{N^m}. \tag{19}$$

$$\alpha_t^{a(n+1)} = \frac{r_t^p + q_t^{ma(n+1)}}{N^p + \sum_{t=1}^{T} q_t^{ma(n+1)}}. \tag{20}$$

$$\alpha_t^{b(n+1)} = \frac{q_t^{mb(n+1)}}{\sum_{t=1}^{T} q_t^{mb(n+1)}}. \tag{21}$$

These equations can also be seen as the non-Bayesian version, in which the Beta prior information is not considered. Therefore, we obtain the parameters $\tau_a^f, \tau_b^f, \alpha_t^{af}, \alpha_t^{bf}$ from the final step of the Bayesian version of the EM algorithm, and a corresponding set of final parameters from the non-Bayesian version. The non-Bayesian version will be used in the optimization step to find the optimal bias rate.

2.5. Optimization

In the EM algorithm, we maximize the posterior in Eq. (9), which depends on the choice of the hyper-parameters of the prior. The larger the hyper-parameters, the more the prior dominates the posterior [72]. Therefore, choosing an appropriate size for the hyper-parameters is important. According to Eqs. (13) and (14), in which the hyper-parameters are weighed only against $N^m$, the total number of RNA fragments in the IP sample, both $\beta_a$ and $\beta_b$ can be set to $10N^m$ as recommended by [64].

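Furthermore, changing the proportion between the hyper-parameters of the prior gives different results. The primary goal is to maximize Eq. (7), which is fundamental to satisfying $\alpha_t^m \approx \tau_a \alpha_t^a + \tau_b \alpha_t^b$. Therefore, among the results obtained under different priors, we need to find the largest $\tau_a^f$ that also satisfies $\alpha_t^m \approx \tau_a^f \alpha_t^{af} + \tau_b^f \alpha_t^{bf}$. Substituting $\tau_b^f = 1 - \tau_a^f$ turns this relation into $\alpha_t^m - \alpha_t^{bf} \approx \tau_a^f (\alpha_t^{af} - \alpha_t^{bf})$, so we use linear regression to estimate the coefficient $\lambda$ in $\alpha_t^m - \alpha_t^{bf} = \lambda (\alpha_t^{af} - \alpha_t^{bf})$. Because the equations $\sum_{t=1}^{T} \alpha_t^{af} = 1$ and $\sum_{t=1}^{T} \alpha_t^{bf} = 1$ are not always satisfied, $\lambda$ need not equal $\tau_a^f$; we therefore choose the largest $\tau_a^f$ such that $\tau_a^f \le \lambda$.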
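The selection rule can be illustrated as follows. The sketch assumes that the regression is an ordinary least-squares fit without intercept and that the admissible candidates are those whose final τa does not exceed λ; both are readings of the text rather than details confirmed by the AEEIP code, and the function names and numbers are hypothetical.

```python
def estimate_lambda(alpha_m, alpha_af, alpha_bf):
    """Least-squares slope (no intercept) of (alpha_m - alpha_bf) on (alpha_af - alpha_bf)."""
    x = [a - b for a, b in zip(alpha_af, alpha_bf)]
    y = [m - b for m, b in zip(alpha_m, alpha_bf)]
    return sum(u * v for u, v in zip(x, y)) / sum(u * u for u in x)


def pick_tau(candidates):
    """candidates: (tau_af, lam) pairs, one per prior setting; largest tau_af with tau_af <= lam."""
    feasible = [tau for tau, lam in candidates if tau <= lam]
    return max(feasible) if feasible else None


# Hypothetical profiles in which the IP profile is an exact 0.2 / 0.8 mixture.
alpha_af = [0.40, 0.30, 0.20, 0.10]
alpha_bf = [0.05, 0.05, 0.30, 0.60]
alpha_m = [0.2 * a + 0.8 * b for a, b in zip(alpha_af, alpha_bf)]

lam = estimate_lambda(alpha_m, alpha_af, alpha_bf)

# Hypothetical (tau_af, lambda) pairs from three prior settings.
chosen = pick_tau([(0.10, 0.20), (0.20, 0.21), (0.30, 0.22)])
```

When the mixture relation holds exactly, the slope recovers the mixing proportion; the scan over prior settings then keeps the largest final τa that the data, through λ, still support.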

2.6. Bias correction in epitranscriptome quantification

We apply the learned bias correction to adjust the IP data by estimating reads originating from non-specific binding events. This corrected IP data is then used to identify potential modification sites and improve the quantification accuracy. The IP data bias correction process is outlined below.

Let $R^p = \{ r_t^p \mid t = 1, 2, \ldots, T \}$ represent the set of RNA fragment counts in the Input sample, where $r_t^p$ is the count of RNA fragments at site $t$. Similarly, let $R^m = \{ r_t^m \mid t = 1, 2, \ldots, T \}$ represent the set of RNA fragment counts in the IP sample. The learned antibody bias is denoted as $\tau_a^{fc}$. To correct for non-specific binding, we extract $\tau_a^{fc} N^m$ reads from the IP data, where $N^m = \sum_{t=1}^{T} r_t^m$ is the total read count in the IP sample.

Next, we calculate the frequency of the Input sample at each site as:

$$g_t^p = \frac{r_t^p}{\sum_{k=1}^{T} r_k^p} \tag{22}$$

We assume that the non-specific binding reads in the IP sample follow a multinomial distribution with probabilities given by the frequencies $g_t^p$ from the Input sample. This assumption treats each read as an independent draw from discrete sites with probabilities $g_t^p$, making the multinomial distribution a natural model for the count variability. A similar multinomial modeling approach is used in [73]. The corrected IP sample, denoted as $R^{mc}$, is obtained by subtracting the non-specific binding reads $R^{ps}$, which consist of $\tau_a^{fc} N^m$ reads sampled from the multinomial distribution $P_N(N; g_t^p)$. To enhance robustness, we perform multiple samplings of $R^{ps}$ and select the median for each site.

We adjust the size factor of the IP sample according to the corrected bias rate. Since the correction step removes a fraction $\tau_a^{fc}$ of the reads, the corrected size factor is set to $1 - \tau_a^{fc}$ times the original size factor of the IP sample. After this adjustment, the corrected IP sample and size factor can be used as input for differential methylation analysis methods, such as DESeq2 [53], to identify potential modification sites.
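The correction procedure can be sketched with the Python standard library. This is an illustrative version only: the number of multinomial draws, the random seed, and the clipping of corrected counts at zero are assumptions not stated in the text, and the function names are hypothetical.

```python
import random
from statistics import median


def correct_ip_counts(r_p, r_m, tau, n_draws=25, seed=0):
    """Subtract the per-site median of multinomial draws of non-specific reads."""
    rng = random.Random(seed)
    n_sites = len(r_p)
    n_p, n_m = sum(r_p), sum(r_m)
    g = [r / n_p for r in r_p]         # Input frequencies, Eq. (22)
    n_nonspecific = round(tau * n_m)   # number of reads attributed to non-specific binding
    draws = []
    for _ in range(n_draws):
        counts = [0] * n_sites
        for t in rng.choices(range(n_sites), weights=g, k=n_nonspecific):
            counts[t] += 1
        draws.append(counts)
    r_ps = [median(d[t] for d in draws) for t in range(n_sites)]
    # Assumption: corrected counts are clipped at zero.
    return [max(0, r_m[t] - r_ps[t]) for t in range(n_sites)]


def corrected_size_factor(size_factor, tau):
    """Scale the IP size factor by the fraction of reads that is retained."""
    return (1.0 - tau) * size_factor


r_p = [40, 30, 20, 10]  # hypothetical Input counts
r_m = [11, 10, 28, 51]  # hypothetical IP counts
corrected = correct_ip_counts(r_p, r_m, tau=0.2)
new_size_factor = corrected_size_factor(1.0, 0.2)
```

Sites that are strongly represented in the Input lose the most reads, so sites whose IP signal mainly reflects background abundance are pushed toward zero while genuinely enriched sites are barely affected.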

2.7. AEEIP package

The AEEIP R package, accessible at https://github.com/whz991026/AEEIP, implements the AEEIP model designed to evaluate IP efficiency. This tool provides MAP estimates of antibody efficiency, allowing the quantification of the proportion of Input-like fragments mixed into the IP sample to improve accuracy. The package can search for an optimized result by adjusting the hyper-parameters of the Beta priors. AEEIP offers a holistic approach to assessing antibody quality, supporting robust and dependable outcomes.

3. Results

To evaluate the performance of the proposed method, we tested it on both simulated and real datasets.

3.1. Test on simulated dataset

The simulated data mimic the read counts of 10,000 sites in three IP samples and three Input control samples. Specifically, to simulate the read counts in IP and Input, we generate the relative RNA fragment abundance for each site $t$, considering two types of RNA fragments: modification-free RNA fragments and modification-containing RNA fragments. We assume that the read counts of these two fragment types follow a negative binomial distribution, a common model for sequencing count data that accounts for overdispersion, as implemented in widely used tools such as DESeq2 [53] and edgeR [74]. Since both Input and IP contain both fragment types, we mix them in a certain proportion based on their relative abundance. Notably, we assume that the Input sample contains a sufficiently small proportion of modification-containing RNA fragments (≤ 0.05). This threshold is conservative because RNA modifications are inherently rare in biological samples (e.g., m6A typically occurs in < 0.1 % of total RNA bases [30], [31]). We test different proportions of the two types of RNA fragments in both the Input and IP samples.
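A scaled-down version of such a simulation can be sketched with the standard library alone. The abundance profiles, sequencing depth, dispersion parameter and function names below are assumptions made for illustration and do not reproduce the settings of the paper; the negative binomial draw is built as a sum of geometric variables, which requires an integer dispersion parameter.

```python
import math
import random


def neg_binomial(rng, mean, size):
    """Negative binomial draw with the given mean and integer dispersion parameter 'size'."""
    if mean <= 0:
        return 0
    q = mean / (size + mean)  # failure probability of each underlying geometric draw
    if q <= 0.0:
        return 0
    log_q = math.log(q)
    return sum(int(math.log(1.0 - rng.random()) / log_q) for _ in range(size))


def simulate_counts(n_sites, depth, sigma_b, tau_a, size=5, seed=1):
    """Simulate Input and IP counts as sums of type a and type b negative binomial reads."""
    rng = random.Random(seed)
    raw_a = [rng.random() for _ in range(n_sites)]
    raw_b = [rng.random() ** 4 for _ in range(n_sites)]  # assumed more peaked profile for type b
    sum_a, sum_b = sum(raw_a), sum(raw_b)
    rho_a = [v / sum_a for v in raw_a]
    rho_b = [v / sum_b for v in raw_b]

    def sample(weight_a):
        counts = []
        for a, b in zip(rho_a, rho_b):
            n_a = neg_binomial(rng, depth * weight_a * a, size)
            n_b = neg_binomial(rng, depth * (1.0 - weight_a) * b, size)
            counts.append(n_a + n_b)
        return counts

    r_p = sample(1.0 - sigma_b)  # Input: proportion sigma_b of type b fragments
    r_m = sample(tau_a)          # IP: proportion tau_a of type a fragments
    return r_p, r_m


r_p, r_m = simulate_counts(n_sites=200, depth=20000, sigma_b=0.01, tau_a=0.2)
```

Varying `sigma_b` and `tau_a` over a small grid gives the kind of design described above, in which the known mixing proportions can be compared with the estimates.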

We determine the optimal $\tau_a^f$ as the maximum value satisfying $\tau_a^f \le \lambda$ by systematically varying the hyper-parameter proportion of the prior information from 0.01 to 0.99 in 0.01 increments. To visualize the difference between $\tau_a^f$ and $\lambda$, we plot the estimates of $\tau_a^f - \lambda$ against their $\tau_a^f$ from one of the replicates in the simulation data, with different mix rates of the two types of RNA fragments in both the Input and IP samples. To simulate biologically plausible conditions, we set the modification-containing proportions in the Input sample (0.002, 0.01, 0.05) based on reported m6A densities in mammalian transcriptomes [47], [48], and used modification-free proportions in the IP sample (0.1, 0.2, 0.4) to reflect the antibody non-specific binding observed in MeRIP-seq experiments [32]. From Fig. 1, we can see that the change point is around the mix rate of the IP sample that we set. To determine the optimal value of $\tau_a^f$, we select the smallest $\tau_a^{f(n+1)}$ that satisfies two criteria simultaneously, where $n$ indexes successive settings of the prior proportion. First, we require that $(\tau_a^{f(n+1)} - \tau_a^{f(n)}) / (\lambda^{(n+1)} - \lambda^{(n)}) > 0.01$, indicating that the gap between $\tau_a^f$ and $\lambda$ is starting to increase. Second, we impose the constraint $\tau_a^{f(n+1)} - \lambda^{(n+1)} \le 0.01$, ensuring that $\tau_a^f$ remains sufficiently close to $\lambda$. The 0.01 threshold was selected because it matches the step size (0.01) of our systematic parameter variation in the prior information, ensuring consistent sensitivity in our detection of change points while maintaining a resolution commensurate with our experimental design.

Fig. 1.

Estimates of $\tau_a^f - \lambda$ based on different mix rates of the Input sample and IP sample. The left three plots depict the IP sample containing 0.1 modification-free RNA fragments. The middle plots represent 0.2, and the rightmost plots represent 0.4. The top three plots show the Input sample containing 0.002 modification-containing RNA fragments. The middle plots represent 0.01, and the bottom plots represent 0.05.

Fig. 2 illustrates the learned $\tau_a^{f(n+1)}$ of all three replicates with the mix rate of the modification-free RNA fragments in the IP sample set to 0.1, 0.2, and 0.4. We also change the mix rate of the Input sample to see the influence of assuming that the Input sample contains no modification-containing RNA fragments. We set the proportion of modification-containing RNA fragments to 0.002, 0.01, and 0.05. From Fig. 2, we can observe that all the learned mix rates of the IP sample are around the mix rate we set, even with 5 % modification-containing RNA fragments in the Input sample. Therefore, the model can estimate a reasonable mix rate of the IP sample even when the Input sample contains 5 % modification-containing RNA fragments.

Fig. 2.

Estimates of learned mix rate for different mix rates with different settings of simulated Input sample and IP sample. The left three plots depict the IP sample containing 0.1 modification-free RNA fragments. The middle plots represent 0.2, and the rightmost plots represent 0.4. The top three plots show the Input sample containing 0.002 modification-containing RNA fragments. The middle plots represent 0.01, and the bottom plots represent 0.05. We can observe that the estimated mix rate is around the real proportion of the modification-free RNA fragment in the IP sample.

After determining the mix rate of the IP sample, we use the DESeq2 model [53] to identify differentially methylated sites based on the counts of modification-free and modification-containing RNA fragments. We compare the results from DESeq2 with and without bias correction. The details of the bias correction process are described in Section 2.6. This comparison helps to evaluate the impact of the bias correction on the identification of methylated sites.

In the simulated tests, Fig. 3 shows that more true positive modified sites are identified after bias correction than without correction. The bias-corrected model also reports fewer false negative sites. In Fig. 4, the results indicate that the bias-corrected model achieves a lower False Discovery Rate (FDR) in different simulated settings. Results without bias correction get worse as the proportion of simulated modification-free RNA fragments in the IP sample increases, suggesting that a higher proportion of biased fragments leads to poorer results. Furthermore, bias correction has a more noticeable impact in reducing errors (as indicated by the lower FDR) when the sample contains a higher proportion of modification-free RNA fragments. This implies that bias correction improves the accuracy of the identification of simulated modified fragments, and addresses issues caused by the presence of unmodified fragments in the sample more effectively when the bias is more evident.

Fig. 3.

Identification of simulated modification sites with different bias settings. The top three plots depict the IP sample containing a proportion of 0.1 modification-free RNA fragments. The middle plots represent 0.2, and the bottom plots represent 0.4. The left three plots depict the Input sample containing a proportion of 0.002 modification-containing RNA fragments. The middle plots represent 0.01, and the rightmost plots represent 0.05. We demonstrate the identification of modified sites using DESeq2 (p-value < 0.05) with data corrected for bias using AEEIP, as well as without correction.

Fig. 4.

FDR of simulated modification site identification with and without AEEIP correction. The left plot depicts the Input sample containing a proportion of 0.002 modification-containing RNA fragments. The middle plot represents 0.01, and the right plot represents 0.05. We observe that the data without correction yield worse results as the rate of modification-free RNA fragments increases in the IP sample. Accordingly, the difference in FDR between the bias-corrected and uncorrected data becomes more pronounced when the proportion of modification-free RNA fragments is higher, suggesting that bias correction is more beneficial in scenarios where the IP sample contains a larger fraction of modification-free RNA fragments.

3.2. Test on real datasets

We design a pipeline to validate the AEEIP model's efficacy on real data. We use the AEEIP model to determine the antibody bias, which represents the proportion of modification-free RNA fragments in the IP sample, based on both the Input sample and the IP sample. Subsequently, we combine a proportion $\delta$ of the Input sample with a proportion $1 - \delta$ of the IP sample to create a new IP sample. The AEEIP model is then employed to ascertain the antibody bias based on the Input sample and the new IP sample. The newly learned antibody bias is a known function of the real antibody bias: using the newly learned antibody bias $\tau_a^{\delta}$ and the mix rate $\delta$, we can calculate the real antibody bias as $\tau_a = (\tau_a^{\delta} - \delta)/(1 - \delta)$. The proof is detailed in the Appendix.
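The relation between the bias learned from the spiked sample and the real bias can be checked with a short calculation. The sketch assumes that the added Input reads behave as pure modification-free fragments, so that the spiked sample has bias δ + (1 − δ)·τa; the function names and numbers are hypothetical.

```python
def spiked_bias(tau_a, delta):
    """Expected bias after mixing a proportion delta of Input into the IP sample."""
    return delta + (1.0 - delta) * tau_a


def recover_bias(tau_delta, delta):
    """Invert the mixing: tau_a = (tau_delta - delta) / (1 - delta)."""
    return (tau_delta - delta) / (1.0 - delta)


tau_true = 0.3  # hypothetical real antibody bias
delta = 0.2     # hypothetical proportion of Input mixed in

tau_delta = spiked_bias(tau_true, delta)      # 0.2 + 0.8 * 0.3 = 0.44
tau_back = recover_bias(tau_delta, delta)
```

If the model is well calibrated, the value recovered from each spiked sample should agree with the bias estimated from the original IP sample, which is the comparison carried out on the real datasets.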

To investigate the IP efficiency of m6A data in different conditions, we utilize several groups of MeRIP-seq datasets. The first two datasets are from heart and liver tissues from the study [51] (GEO accession: GSE122744). The third dataset is the m6A profiling of U2OS cells treated with deazaadenosine (DAA) from the study [11] (GEO accession: GSE48037). The fourth dataset is from HEL cells in the study [52] (GEO accession: GSE106124). The fifth dataset is from K180 in the study [75] (GEO accession: GSE154555). To further benchmark a dataset of single-base sites for more specific analysis, a consistent set of m6A sites is collected to evaluate their m6A levels across different tissues and cell lines, thereby ensuring data integration for comparison.

A total of 108,740 single-base sites profiled with 12 independent high-resolution techniques [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [76] are integrated. Quality control metrics are provided in our latest m6A-Atlas 2.0 [76]. These data from diverse platforms are consistently quantified against a unified genome reference UCSC hg38/GRCh38. Common sites reported by at least two detection techniques are retained to guarantee reproducibility. We retrieve the read counts of these sites from paired IP and input control samples of different groups, respectively.

We first follow the same pipeline used for analyzing the simulated datasets, this time applied to real datasets. We mix the IP sample with proportions of 0, 0.1, and 0.2 of the Input sample. We determine the largest $\tau_a^f$ with $\tau_a^f \le \lambda$ across the different proportions of the hyper-parameters of the prior information. As shown in Fig. 5, the curve shifts to the right with a higher mix rate, indicating a higher proportion of modification-free RNA fragments in the IP sample. This observation aligns with the fact that the Input sample contains a high proportion of modification-free RNA fragments. To validate the AEEIP model more precisely, we compare the estimates with the corrected $\tau_a = (\tau_a^{\delta} - \delta)/(1 - \delta)$.

Fig. 5.

Estimates of $\tau_a^f - \lambda$ under varying mix rates of the Input and IP samples across the four real dataset groups. The red curve represents the non-mixed IP sample, the green curve depicts the IP sample mixed with 0.1 Input sample, and the blue curve depicts the IP sample mixed with 0.2 Input sample. We observe that the curves shift to the right with a higher mix rate, indicating a higher proportion of modification-free RNA fragments in the IP sample.

As shown in Fig. 6, while some variation exists within each group, the corrected $\tau_a^f$ estimates are more consistent among replicates from the same condition than between distinct conditions or cell types. This indicates that the AEEIP model can robustly estimate antibody bias in IP samples, which reflects the proportion of modification-free RNA fragments present. Although some variation in the learned antibody bias exists among technical replicates (within the same donor/sample), the most notable differences are observed between biological conditions, i.e., cell lines and tissue types from different donors. This observation highlights the robustness of the AEEIP model, as samples from the same biological condition tend to exhibit similar bias patterns [32]. It is worth noting that the estimated antibody bias represents a composite measure influenced by multiple factors, including antibody properties, sample quality, and experimental protocols, which may contribute to the observed variations between cases. Future studies incorporating direct experimental measurements of antibody efficiency will help further validate and refine AEEIP's bias estimation framework. Additionally, analysis of the K180 METTL3 knockdown dataset revealed a higher estimated antibody bias in knockdown (KD) samples than in controls, likely due to reduced m6A levels lowering the IP signal and amplifying the effect of non-specific binding [75].

Fig. 6.

Estimates of corrected $\tau_a^f$ under different mix rates of the Input and IP samples across the four real dataset groups. The red curve represents the non-mixed IP sample, the green curve represents the IP sample mixed with 0.1 Input sample, and the blue curve represents the IP sample mixed with 0.2 Input sample. We observe that the corrected antibody bias values from the mixed IP samples are similar to the values learned from the original data across different replicates. The estimated antibody biases are similar within replicates of the same group but vary significantly across different groups.

We identify m6A sites from the IP and Input files of the four experimental conditions mentioned above. Our results indicate that applying AEEIP for antibody bias correction leads to the detection of more positive m6A sites, particularly in samples where AEEIP estimates a pronounced antibody bias, i.e., a substantial deviation between the directly detected and estimated modification signals in the IP versus Input samples, indicating clear non-specific binding or pulldown inefficiency. AEEIP consistently identified novel m6A methylation sites that might be overlooked in standard analyses. Using the same statistical threshold with and without correction (DESeq2, p-value < 0.05), the AEEIP antibody bias correction identified 23,841 m6A sites in the heart sample, compared to 22,864 sites without correction. In the liver sample, 16,459 m6A sites were detected after bias correction, versus 14,009 without correction. In the HEL cell line, AEEIP identified 50,591 sites, compared to 46,076 without correction. In the U2OS sample, 16,990 sites were recognized as m6A-modified with correction, compared to 16,564 without. The sites gained after correction are not merely additional m6A candidates; they exhibit high evolutionary conservation (PhastCons score = 1) and reside in functionally critical genes [77], [78], [79], [80], as exemplified in Supplementary Table 1.
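The site counts reported above can be turned into absolute and relative gains with a few lines of arithmetic. The sketch below uses only the eight counts quoted in the text; the dictionary layout and the function name `site_gain` are illustrative choices, not part of AEEIP.

```python
# (sites with AEEIP correction, sites without correction), as reported in the text
site_counts = {
    "heart": (23841, 22864),
    "liver": (16459, 14009),
    "HEL": (50591, 46076),
    "U2OS": (16990, 16564),
}


def site_gain(corrected: int, uncorrected: int) -> tuple[int, float]:
    """Return the extra sites and the relative gain (percent of the uncorrected count)."""
    extra = corrected - uncorrected
    return extra, 100.0 * extra / uncorrected


# Map each sample to its (extra sites, percent gain)
gains = {name: site_gain(c, u) for name, (c, u) in site_counts.items()}

if __name__ == "__main__":
    for name, (extra, pct) in gains.items():
        print(f"{name}: +{extra} sites ({pct:.2f}% more than without correction)")
```

On these numbers the liver sample shows the largest relative gain (roughly 17%) and U2OS the smallest (under 3%), so the effect of the correction clearly differs between samples.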

We use MetaTX [81] to plot the relative site distribution across different regions of mRNA (see Fig. 7) and compare the single-base m6A sites identified with and without AEEIP correction. The site distribution with bias correction exhibits higher peak values than the distribution without correction. This indicates that the sites identified after bias correction are more likely to be m6A-active or functionally relevant, as supported by previous studies [47], [48].

Fig. 7.

The distribution of identified m6A sites in four conditions with and without AEEIP correction. Applying AEEIP for IP-efficiency bias correction reports more positive m6A sites, but fewer sites fall into the promoter and tail regions. We compare these identified m6A sites by visualizing their distribution along the mRNA. The peak for the sites identified with bias correction is higher and more concentrated around the stop codon compared to the sites without bias correction.

While AEEIP was optimized for canonical mRNA m6A profiling, we also evaluated its performance on region-specific sites (5′ UTR, CDS, 3′ UTR) and short transcripts (≤ 250 bp) (Fig. S1). For short RNAs, MeRIP-seq is inherently limited by fragmentation and enrichment challenges. Accordingly, our bias estimates from short transcripts diverged substantially from those of full-length mRNA, indicating that AEEIP is not suitable for small RNA m6A analysis.

For transcript regions, all median bias estimates from the 5′ UTR, CDS, and 3′ UTR were near zero. Among these, the estimates from the 3′ UTR most closely resemble those from the full-length regions, followed by the CDS, while the 5′ UTR shows the largest discrepancy. This aligns with known m6A distribution patterns, where methylation is enriched in 3′ UTRs and CDS but sparse in 5′ UTRs. These results highlight that while AEEIP performs robustly for canonical mRNA regions, its accuracy may vary in weakly methylated or short RNA contexts.

4. Discussion and conclusion

RNA modifications such as m6A play crucial roles in regulating various aspects of RNA biology, including splicing, stability, transport, and translation. Recent advancements in the field of RNA epigenetics have facilitated the development of numerous techniques for transcriptome-wide profiling of the epitranscriptome. Among these, antibody-based methods such as MeRIP-seq and miCLIP are the most widely used for profiling the distribution of m6A sites. In these protocols, the immunoprecipitation step is crucial for isolating RNA fragments that are specifically bound by m6A antibodies. Despite the specificity of antibodies, inefficiencies in the IP process can lead to the unintentional capture of non-modified RNA fragments, which may introduce bias into the dataset. While existing methods consider noise variables when developing statistical models for epitranscriptome analysis, none of them directly quantifies the impact of antibody bias on the efficiency of the IP process across different experiments.

This bias may prevent the identification of novel modified RNA sites, particularly those with low abundance or poor characterization, resulting in false negatives. Therefore, we propose AEEIP, a method designed to explicitly quantify IP efficiency as specific values and correct the bias in the IP data from the outset of the analysis. The performance of the proposed method is evaluated using both simulated and real datasets. In the simulated tests, the bias-corrected model outperforms the uncorrected model in terms of true positives, false negative rates, and overall accuracy. In real data validation, the bias-corrected model exhibits consistency across replicates and provides accurate estimates for samples from different cell types and tissues. These findings validate the efficacy of the AEEIP model in interpreting antibody bias in IP-seq data. This is further supported by the reduced FDR after AEEIP correction, especially in samples with more modification-free fragments. Finally, in four experimental conditions, using a consistent statistical threshold, the AEEIP-assisted model detects more positive m6A sites. These sites are not just more numerous; they also have stronger signals and are more clustered near the stop codons of mRNA.

Several limitations should be noted. A key limitation is the lack of comparison with other IP efficiency quantification methods: we found few approaches in the literature that directly quantify IP efficiency in epitranscriptome data, the exception being a statistical method developed specifically for ChIP-seq data [68]. Another limitation is that we do not explore the possible differences between replicates from the same case. We analyze each replicate separately, which may not fully capture the variability within the same set of experiments. Additionally, while our focus in this study is on m6A, future applications of AEEIP to other RNA modifications should consider their distinct modification densities and biochemical properties, which may affect antibody efficiency and data characteristics. Addressing these challenges requires further exploration and refinement, including improvements in computational efficiency and deeper theoretical work to ensure optimal solutions and broader applicability of the proposed framework.

CRediT authorship contribution statement

Yue Wang: Supervision, Resources, Data curation, Conceptualization. Kunqi Chen: Supervision, Resources, Data curation. Haozhe Wang: Writing – original draft, Software, Methodology, Investigation, Formal analysis, Conceptualization. Manli Zhu: Supervision, Resources. Jionglong Su: Supervision, Resources. Zhen Wei: Supervision, Resources, Data curation. Bowen Song: Supervision, Resources. Anh Nguyen: Supervision, Resources. Jia Meng: Supervision, Resources, Conceptualization.

Declaration of Competing Interest

The authors declare no competing interests.

Acknowledgments

National Natural Science Foundation of China [32400527, 32500580]; Natural Science Foundation of Jiangsu Province [BK20240723]; Nanjing University of Chinese Medicine [013038019029 and 013038030001]; XJTLU Key Program Special Fund [KSF-E-51 and KSF-P-02].

Availability of Data and Materials

Footnotes

1

AEEIP - Antibody Efficiency Estimation from Immunoprecipitation data.

Appendix A

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.csbj.2025.08.030.

Appendix A. Supplementary material

Supplementary material

mmc1.docx (224.5KB, docx)

References

  • 1. He C. Grand challenge commentary: RNA epigenetics? Nat Chem Biol. 2010;6:863–865. doi: 10.1038/nchembio.482.
  • 2. Saletore Y., Meyer K., Korlach J., Vilfan I.D., Jaffrey S., Mason C.E. The birth of the epitranscriptome: deciphering the function of RNA modifications. Genome Biol. 2012;13:175. doi: 10.1186/gb-2012-13-10-175.
  • 3. Jia G., Fu Y., Zhao X., Dai Q., Zheng G., Yang Y., et al. N6-methyladenosine in nuclear RNA is a major substrate of the obesity-associated FTO. Nat Chem Biol. 2011;7:885–887. doi: 10.1038/nchembio.687.
  • 4. Cappannini A., Ray A., Purta E., Mukherjee S., Boccaletto P., Moafinejad S.N., et al. MODOMICS: a database of RNA modifications and related information. 2023 update. Nucleic Acids Res. 2023:gkad1083. doi: 10.1093/nar/gkad1083.
  • 5. Motorin Y., Helm M. RNA nucleotide methylation: 2021 update. Wiley Interdiscip Rev RNA. 2022;13 doi: 10.1002/wrna.1691.
  • 6. Garcias Morales D., Reyes J.L. A birds'-eye view of the activity and specificity of the mRNA m(6)A methyltransferase complex. Wiley Interdiscip Rev RNA. 2021;12 doi: 10.1002/wrna.1618.
  • 7. Boulias K., Greer E.L. Biological roles of adenine methylation in RNA. Nat Rev Genet. 2023;24:143–160. doi: 10.1038/s41576-022-00534-0.
  • 8. Sendinc E., Shi Y. RNA m6A methylation across the transcriptome. Mol Cell. 2023;83:428–441. doi: 10.1016/j.molcel.2023.01.006.
  • 9. Alarcon C.R., Lee H., Goodarzi H., Halberg N., Tavazoie S.F. N6-methyladenosine marks primary microRNAs for processing. Nature. 2015;519:482–485. doi: 10.1038/nature14281.
  • 10. Wang X., Lu Z., Gomez A., Hon G.C., Yue Y., Han D., et al. N6-methyladenosine-dependent regulation of messenger RNA stability. Nature. 2014;505:117–120. doi: 10.1038/nature12730.
  • 11. Fustin J.M., Doi M., Yamaguchi Y., Hida H., Nishimura S., Yoshida M., et al. RNA-methylation-dependent RNA processing controls the speed of the circadian clock. Cell. 2013;155:793–806. doi: 10.1016/j.cell.2013.10.026.
  • 12. Meyer K.D., Patil D.P., Zhou J., Zinoviev A., Skabkin M.A., Elemento O., et al. 5' UTR m(6)A promotes Cap-Independent translation. Cell. 2015;163:999–1010. doi: 10.1016/j.cell.2015.10.012.
  • 13. Wang X., Zhao B.S., Roundtree I.A., Lu Z., Han D., Ma H., et al. N(6)-methyladenosine modulates messenger RNA translation efficiency. Cell. 2015;161:1388–1399. doi: 10.1016/j.cell.2015.05.014.
  • 14. Liu N., Dai Q., Zheng G., He C., Parisien M., Pan T. N(6)-methyladenosine-dependent RNA structural switches regulate RNA-protein interactions. Nature. 2015;518:560–564. doi: 10.1038/nature14234.
  • 15. Batista Pedro J., Molinie B., Wang J., Qu K., Zhang J., Li L., et al. m6A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell. 2014;15:707–719. doi: 10.1016/j.stem.2014.09.019.
  • 16. Delaunay S., Frye M. RNA modifications regulating cell fate in cancer. Nat Cell Biol. 2019;21:552–559. doi: 10.1038/s41556-019-0319-0.
  • 17. Yang C. ToxPoint: dissecting functional RNA modifications in responses to environmental exposure—mechanistic toxicology research enters a new era. Toxicol Sci. 2020;174:1–2. doi: 10.1093/toxsci/kfz252.
  • 18. Pendleton K.E., Chen B., Liu K., Hunter O.V., Xie Y., Tu B.P., et al. The U6 snRNA m6A methyltransferase METTL16 regulates SAM synthetase intron retention. Cell. 2017;169:824–835.e814. doi: 10.1016/j.cell.2017.05.003.
  • 19. Liu N., Dai Q., Zheng G., He C., Parisien M., Pan T. N6-methyladenosine-dependent RNA structural switches regulate RNA–protein interactions. Nature. 2015;518:560–564. doi: 10.1038/nature14234.
  • 20. Huang J., Wang X., Xia R., Yang D., Liu J., Lv Q., et al. Domain-knowledge enabled ensemble learning of 5-formylcytosine (f5C) modification sites. Comput Struct Biotechnol J. 2024;23:3175–3185. doi: 10.1016/j.csbj.2024.08.004.
  • 21. Geula S., Moshitch-Moshkovitz S., Dominissini D., Mansour A.A., Kol N., Salmon-Divon M., et al. m6A mRNA methylation facilitates resolution of naïve pluripotency toward differentiation. Science. 2015;347:1002–1006. doi: 10.1126/science.1261417.
  • 22. Yankova E., Blackaby W., Albertella M., Rak J., De Braekeleer E., Tsagkogeorga G., et al. Small-molecule inhibition of METTL3 as a strategy against myeloid leukaemia. Nature. 2021;593:597–601. doi: 10.1038/s41586-021-03536-w.
  • 23. Cayir A. RNA modifications as emerging therapeutic targets. Wiley Interdiscip Rev RNA. 2022;13 doi: 10.1002/wrna.1702.
  • 24. He P.C., He C. m6A RNA methylation: from mechanisms to therapeutic potential. EMBO J. 2021;40 doi: 10.15252/embj.2020105977.
  • 25. Song B., Wang X., Liang Z., Ma J., Huang D., Wang Y., et al. RMDisease V2.0: an updated database of genetic variants that affect RNA modifications with disease and trait implication. Nucleic Acids Res. 2023;51:D1388–D1396. doi: 10.1093/nar/gkac750.
  • 26. Song Y., Song B., Huang D., Nguyen A., Hu L., Meng J., et al. Multimodal zero-shot learning of previously unseen epitranscriptomes from RNA-seq data. Brief Bioinform. 2025;26 doi: 10.1093/bib/bbaf332.
  • 27. Satam H., Joshi K., Mangrolia U., Waghoo S., Zaidi G., Rawool S., et al. Next-generation sequencing technology: current trends and advancements. Biology. 2023;12:997. doi: 10.3390/biology12070997.
  • 28. Wang Y., Wei Z., Su J., Coenen F., Meng J. RgnTX: colocalization analysis of transcriptome elements in the presence of isoform heterogeneity and ambiguity. Comput Struct Biotechnol J. 2023;21:4110–4117. doi: 10.1016/j.csbj.2023.08.021.
  • 29. Wang H., Wang Y., Zhou J., Song B., Tu G., Nguyen A., et al. Statistical modeling of single-cell epitranscriptomics enabled trajectory and regulatory inference of RNA methylation. Cell Genom. 2025;5 doi: 10.1016/j.xgen.2024.100702.
  • 30. Meyer K.D., Saletore Y., Zumbo P., Elemento O., Mason C.E., Jaffrey S.R. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell. 2012;149:1635–1646. doi: 10.1016/j.cell.2012.05.003.
  • 31. Dominissini D., Moshitch-Moshkovitz S., Schwartz S., Salmon-Divon M., Ungar L., Osenberg S., et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012;485:201–206. doi: 10.1038/nature11112.
  • 32. Linder B., Grozhik A.V., Olarerin-George A.O., Meydan C., Mason C.E., Jaffrey S.R. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat Methods. 2015;12:767–772. doi: 10.1038/nmeth.3453.
  • 33. Ke S., Alemu E.A., Mertens C., Gantman E.C., Fak J.J., Mele A., et al. A majority of m6A residues are in the last exons, allowing the potential for 3′ UTR regulation. Genes Dev. 2015;29:2037–2053. doi: 10.1101/gad.269415.115.
  • 34. Goh Y.T., Koh Casslynn W.Q., Sim D.Y., Roca X., Goh W.S.S. METTL4 catalyzes m6Am methylation in U2 snRNA to regulate pre-mRNA splicing. Nucleic Acids Res. 2020;48:9250–9261. doi: 10.1093/nar/gkaa684.
  • 35. Chen K., Lu Z., Wang X., Fu Y., Luo G.Z., Liu N., et al. High-resolution N(6)-methyladenosine (m(6)A) map using photo-crosslinking-assisted m(6)A sequencing. Angew Chem Int Ed Engl. 2015;54:1587–1590. doi: 10.1002/anie.201410647.
  • 36. Garcia-Campos M.A., Edelheit S., Toth U., Safra M., Shachar R., Viukov S., et al. Deciphering the "m(6)A code" via antibody-independent quantitative profiling. Cell. 2019;178:731–747. doi: 10.1016/j.cell.2019.06.013.
  • 37. Xu H.-Y., Zhang Y.-Q., Liu Z.-M., Chen T., Lv C.-Y., Tang S.-H., et al. ETCM: an encyclopaedia of traditional Chinese medicine. Nucleic Acids Res. 2019;47:D976–D982. doi: 10.1093/nar/gky987.
  • 38. Meyer K.D. DART-seq: an antibody-free method for global m6A detection. Nat Methods. 2019;16:1275–1280. doi: 10.1038/s41592-019-0570-0.
  • 39. Tegowski M., Flamand M.N., Meyer K.D. scDART-seq reveals distinct m6A signatures and mRNA methylation heterogeneity in single cells. Mol Cell. 2022;82:868–878.e810. doi: 10.1016/j.molcel.2021.12.038.
  • 40. Xiao Y.-L., Liu S., Ge R., Wu Y., He C., Chen M., et al. Transcriptome-wide profiling and quantification of N6-methyladenosine by enzyme-assisted adenosine deamination. Nat Biotechnol. 2023;41:993–1003. doi: 10.1038/s41587-022-01587-6.
  • 41. Ge R., Ye C., Peng Y., Dai Q., Zhao Y., Liu S., et al. m6A-SAC-seq for quantitative whole transcriptome m6A profiling. Nat Protoc. 2023;18:626–657. doi: 10.1038/s41596-022-00765-9.
  • 42. Garalde D., Snell E., Jachimowicz D. Highly parallel direct RNA sequencing on an array of nanopores. Nat Methods. 2018;15:201–206. doi: 10.1038/nmeth.4577.
  • 43. Thomas N.K., Poodari V.C., Jain M., Olsen H.E., Akeson M., Abu-Shumays R.L. Direct nanopore sequencing of individual full length tRNA strands. ACS Nano. 2021;15:16642–16653. doi: 10.1021/acsnano.1c06488.
  • 44. Anreiter I., Mir Q., Simpson J.T., Janga S.C., Soller M. New twists in detecting mRNA modification dynamics. Trends Biotechnol. 2021;39:72–89. doi: 10.1016/j.tibtech.2020.06.002.
  • 45. Liu H., Begik O., Lucas M.C. Accurate detection of m6A RNA modifications in native RNA sequences. Nat Commun. 2019;10:4079. doi: 10.1038/s41467-019-11713-9.
  • 46. Zhang Y., Jiang J., Ma J., Wei Z., Wang Y., Song B., et al. DirectRMDB: a database of post-transcriptional RNA modifications unveiled from direct RNA sequencing technology. Nucleic Acids Res. 2023;51:D106–D116. doi: 10.1093/nar/gkac1061.
  • 47. Meyer K.D., Saletore Y., Zumbo P., Elemento O., Mason C.E., Jaffrey S.R. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell. 2012;149:1635–1646. doi: 10.1016/j.cell.2012.05.003.
  • 48. Dominissini D., Moshitch-Moshkovitz S., Schwartz S., Salmon-Divon M., Ungar L., Osenberg S., et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012;485:201–206. doi: 10.1038/nature11112.
  • 49. Garcia-Campos M.A., Edelheit S., Toth U., Safra M., Shachar R., Viukov S., et al. Deciphering the “m6A code” via antibody-independent quantitative profiling. Cell. 2019;178:731–747.e716. doi: 10.1016/j.cell.2019.06.013.
  • 50. Ye H., Li T., Rigden D.J., Wei Z. m6ACali: machine learning-powered calibration for accurate m6A detection in MeRIP-Seq. Nucleic Acids Res. 2024;52:4830–4842. doi: 10.1093/nar/gkae280.
  • 51. Zhang H., Shi X., Huang T., Zhao X., Chen W., Gu N., et al. Dynamic landscape and evolution of m6A methylation in human. Nucleic Acids Res. 2020;48:6251–6264. doi: 10.1093/nar/gkaa347.
  • 52. Kuppers D.A., Arora S., Lim Y., Lim A.R., Carter L.M., Corrin P.D., et al. N6-methyladenosine mRNA marking promotes selective translation of regulons required for human erythropoiesis. Nat Commun. 2019;10:4596. doi: 10.1038/s41467-019-12518-6.
  • 53. Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:1–21. doi: 10.1186/s13059-014-0550-8.
  • 54. Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9 doi: 10.1186/gb-2008-9-9-r137.
  • 55. Meng J., Cui X., Rao M.K., Chen Y., Huang Y. Exome-based analysis for RNA epigenome sequencing data. Bioinformatics. 2013;29:1565–1567. doi: 10.1093/bioinformatics/btt171.
  • 56. Guo Z., Shafik A.M., Jin P., Wu Z., Wu H. Detecting m6A methylation regions from methylated RNA immunoprecipitation sequencing. Bioinformatics. 2021;37:2818–2824. doi: 10.1093/bioinformatics/btab181.
  • 57. Zhang Z., Zhan Q., Eckert M., Zhu A., Chryplewicz A., De Jesus D.F., et al. RADAR: differential analysis of MeRIP-seq data with a random effect model. Genome Biol. 2019;20:294. doi: 10.1186/s13059-019-1915-9.
  • 58. Guo Z., Shafik A.M., Jin P., Wu H. Differential RNA methylation analysis for MeRIP-seq data under general experimental design. Bioinformatics. 2022;38:4705–4712. doi: 10.1093/bioinformatics/btac601.
  • 59. Zhang T., Zhang S.W., Zhang L., Meng J. Trumpet: transcriptome-guided quality assessment of m(6)A-seq data. BMC Bioinforma. 2018;19:260. doi: 10.1186/s12859-018-2266-3.
  • 60. Mohan D., Wansley D.L., Sie B.M., Noon M.S., Baer A.N., Laserson U., et al. PhIP-Seq characterization of serum antibodies using oligonucleotide-encoded peptidomes. Nat Protoc. 2018;13:1958–1978. doi: 10.1038/s41596-018-0025-6.
  • 61. Roberts A., Trapnell C., Donaghey J., Rinn J.L., Pachter L. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 2011;12:1–14. doi: 10.1186/gb-2011-12-3-r22.
  • 62. Zhu Y., Zhu L., Wang X., Jin H. RNA-based therapeutics: an overview and prospectus. Cell Death Dis. 2022;13:644. doi: 10.1038/s41419-022-05075-2.
  • 63. Mirza A.H., Bram Y., Schwartz R.E., Jaffrey S.R. SCARPET: site-specific quantification of methylated and nonmethylated adenosines reveals m6A stoichiometry. RNA. 2024;30:308–324. doi: 10.1261/rna.079776.123.
  • 64. Li Y., Xie X. BMC bioinformatics. Springer; 2013. A mixture model for expression deconvolution from RNA-seq in heterogeneous tissues; pp. 1–11.
  • 65. Zhang H., Zhang L., Lin A., Xu C., Li Z., Liu K., et al. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature. 2023;621:396–403. doi: 10.1038/s41586-023-06127-z.
  • 66. Slama K., Galliot A., Weichmann F., Hertler J., Feederle R., Meister G., et al. Determination of enrichment factors for modified RNA in MeRIP experiments. Methods. 2019;156:102–109. doi: 10.1016/j.ymeth.2018.10.020.
  • 67. Salzman J., Jiang H., Wong W.H. Statistical modeling of RNA-Seq data. Stat Sci: Rev J Inst Math Stat. 2011;26 doi: 10.1214/1210-STS1343.
  • 68. Bao Y., Vinciotti V., Wit E., ’t Hoen P.A. Accounting for immunoprecipitation efficiencies in the statistical analysis of ChIP-seq data. BMC Bioinform. 2013;14:1–16. doi: 10.1186/1471-2105-14-169.
  • 69. Kapourani C.-A., Sanguinetti G. Melissa: Bayesian clustering and imputation of single-cell methylomes. Genome Biol. 2019;20:61. doi: 10.1186/s13059-019-1665-8.
  • 70. Feller W. Vol. 2. John Wiley & Sons; 1991. (An introduction to probability theory and its applications).
  • 71. Dempster A.P., Laird N.M., Rubin D.B. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 1977;39:1–22.
  • 72. Bernardo J.M., Smith A.F. John Wiley & Sons; 2009. Bayesian theory.
  • 73. Wu S.H., Schwartz R.S., Winter D.J., Conrad D.F., Cartwright R.A. Estimating error models for whole genome sequencing using mixtures of Dirichlet-multinomial distributions. Bioinformatics. 2017;33:2322–2329. doi: 10.1093/bioinformatics/btx133.
  • 74. Robinson M.D., McCarthy D.J., Smyth G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
  • 75. Wang W., Shao F., Yang X., Wang J., Zhu R., Yang Y., et al. METTL3 promotes tumour development by decreasing APC expression mediated by APC mRNA N6-methyladenosine-dependent YTHDF binding. Nat Commun. 2021;12:3803. doi: 10.1038/s41467-021-23501-5.
  • 76. Liang Z., Ye H., Ma J., Wei Z., Wang Y., Zhang Y., et al. m6A-Atlas v2.0: updated resources for unraveling the N6-methyladenosine (m6A) epitranscriptome among multiple species. Nucleic Acids Res. 2024;52:D194–D202. doi: 10.1093/nar/gkad691.
  • 77. Jaouadi H., Chabrak S., Lahbib S., Abdelhak S., Zaffran S. Identification of two variants in AGRN and RPL3L genes in a patient with catecholaminergic polymorphic ventricular tachycardia suggesting new candidate disease genes and digenic inheritance. Clin Case Rep. 2022;10 doi: 10.1002/ccr3.5339.
  • 78. Nissim S., Weeks O., Talbot J.C., Hedgepeth J.W., Wucherpfennig J., Schatzman-Bone S., et al. Iterative use of nuclear receptor Nr5a2 regulates multiple stages of liver and pancreas development. Dev Biol. 2016;418:108–123. doi: 10.1016/j.ydbio.2016.07.019.
  • 79. Khoubai F.Z., Grosset C.F. DUSP9, a dual-specificity phosphatase with a key role in cell biology and human diseases. Int J Mol Sci. 2021;22:11538. doi: 10.3390/ijms222111538.
  • 80. Sterneck E., Poria D.K., Balamurugan K. Slug and E-cadherin: stealth accomplices? Front Mol Biosci. 2020;7 doi: 10.3389/fmolb.2020.00138.
  • 81. Wang Y., Chen K., Wei Z., Coenen F., Su J., Meng J. MetaTX: deciphering the distribution of mRNA-related features in the presence of isoform ambiguity, with applications in epitranscriptome analysis. Bioinformatics. 2021;37:1285–1291. doi: 10.1093/bioinformatics/btaa938.

Articles from Computational and Structural Biotechnology Journal are provided here courtesy of Research Network of Computational and Structural Biotechnology
