Predict Epitranscriptome Targets and Regulatory Functions of N6-Methyladenosine (m6A) Writers and Erasers

Yiyou Song; Qingru Xu; Zhen Wei; Di Zhen; Jionglong Su; Kunqi Chen; Jia Meng

doi:10.1177/1176934319871290

Evolutionary Bioinformatics. 2019 Sep 5;15:1176934319871290.

PMCID: PMC6728658 PMID: 31523126

Abstract

Although many successful bioinformatics efforts have been reported in the epitranscriptomics field for N⁶-methyladenosine (m⁶A) site identification, none has focused on the substrate specificity of different m⁶A-related enzymes, ie, the methyltransferases (writers) and demethylases (erasers). In this work, to untangle the target specificity and the regulatory functions of different RNA m⁶A writers (METTL3-METTL14 and METTL16) and erasers (ALKBH5 and FTO), we extracted 49 genomic features along with the conventional sequence features and used the machine learning approach of random forest to predict their epitranscriptome substrates. Our method achieved reasonable performance on both the writer target prediction (as high as 0.918) and the eraser target prediction (as high as 0.888) in a 5-fold cross-validation, and results of the gene ontology analysis of their preferential targets further revealed the functional relevance of different RNA methylation writers and erasers.

Keywords: N⁶-methyladenosine (m⁶A), target prediction, epitranscriptome, random forest, RNA methylation

Introduction

Posttranscriptional RNA modifications are important mechanisms that act on all kinds of RNAs, leading to their increased structural and functional diversity.¹ There are at least 100 kinds of RNA modifications,² among which N⁶-methyladenosine (m⁶A) is currently the most prevalent and intensively studied due to its wide impacts.³ It regulates many essential biological processes including neuronal differentiation, obesity, and messenger RNA (mRNA) stability.^4-6 The m⁶A RNA methylation is a reversible mark, which is deposited by methyltransferases (or the writers), including METTL3 (methyltransferase-like 3), METTL14 (methyltransferase-like 14), METTL16 (methyltransferase-like 16), and so on, and is removed by demethylases (or the erasers), including FTO (fat mass and obesity–associated protein) and ALKBH5 (ALKB homolog 5).

The writers of RNA m⁶A modification are protein complexes containing the catalytic components METTL3, METTL14, and METTL16, all of which have the class I methyltransferase domain. METTL3 was the first identified m⁶A-relevant methyltransferase with S-adenosylmethionine (SAM)-binding activity.⁷ Afterward, METTL14 was discovered as the second methyltransferase, with a methyltransferase domain sharing 22% sequence identity with METTL3.⁸ While METTL3 or METTL14 alone exhibits comparably weak catalytic activity in vitro, the METTL3-METTL14 complex has higher catalytic capacity.^9,10 In addition, the crystal structure of the METTL3-METTL14 complex suggested that only METTL3 binds SAM and that METTL14 plays a structural role in substrate recognition.^8,11 Thus, the heterodimeric METTL3-METTL14 complex was considered a catalytic domain of the m⁶A methyltransferase. Recently, METTL16 has been identified as another catalytically active m⁶A mRNA methyltransferase.¹² METTL16 is similar to METTL3 in structure but has some unique elements, such as a unique αB helix in the Rossmann fold.¹³ In addition, these 2 catalytically active m⁶A mRNA methyltransferases play different roles in biological processes. For example, METTL14 and METTL3 modulate cell cycle progression of cortical neural progenitor cells,¹⁴ and depletion of METTL3 or METTL14 promotes tumor progression by enhancing the growth of glioblastoma stem cells.¹⁵ METTL16 can recognize hairpin and methylated adenosine in the U6 snRNA, which regulates the expression of MAT2A.¹⁶

FTO and ALKBH5 are the 2 currently identified m⁶A-specific RNA demethylases (erasers).^4,17 Although FTO is able to act as a demethylase on another substrate, N³-methylthymidine (m³T), its efficiency there is much lower than on m⁶A substrates.^18,19 Although both FTO and ALKBH5 can specifically target RNA m⁶A,^19,20 the 2 differ significantly on many levels. For example, at the molecular level, FTO has an amino-terminal AlkB-like domain, a carboxy-terminal domain with a novel fold, and an extra loop that covers one side of the conserved jelly-roll motif.²¹ ALKBH5 is a member of the 2-oxoglutarate (2OG) and ferrous iron-dependent nucleic acid oxygenase (NAOX) families; it has a double-stranded β-helix core fold, and the active metal site is coordinated by an HXD…H motif along with 3 water molecules.²² In addition, their reaction pathways seem to be different: m⁶A is directly converted by ALKBH5 to adenosine, whereas 2 intermediates, N⁶-hydroxymethyladenosine (hm⁶A) and N⁶-formyladenosine (fm⁶A), are observed during demethylation of m⁶A sites by FTO.^23,24 FTO and ALKBH5 also play different physiological roles: FTO is associated with obesity,⁴ and ALKBH5 is thought to participate in the formation of sperm.²⁵ Moreover, FTO is mainly expressed in the brain,²⁶ in contrast to ALKBH5, which is found in most tissues, particularly in the testes.¹⁷

Therefore, according to these studies, 2 kinds of catalytically active m⁶A mRNA methyltransferases and 2 demethylases exist that have distinct structures and participate in different biological functions. It would be very interesting to know the preferential target sites of the METTL3-METTL14 complex, METTL16, FTO, and ALKBH5, as well as their downstream biological processes. Experimental approaches are effective for testing functional relevance under a specific experimental condition, such as a particular cell line or treatment. However, because of limited detectability, target sites on very lowly expressed genes cannot be detected, which is an intrinsic limitation of wet lab–based approaches such as PAR-CLIP. To comprehensively unveil the epitranscriptome-wide targets of RNA m⁶A enzymes, we considered using computational approaches.

The field of bioinformatics has seen the rapid development of new methods and their wide application in RNA epigenetics. The mammalian m⁶A short consensus motif RRACH (where R = A or G; H = A, C, or U) was not characterized until 2012, when the next-generation sequencing technique called m⁶A-seq or MeRIP-seq (methylated RNA immunoprecipitation sequencing)^27-29 emerged. Since then, RMBase and MeT-DB have been developed into v2.0 and now provide millions of m⁶A sites in many different species, such as human, mouse, yeast, and fly.^30,31

Meanwhile, many successful computational studies have been devised on m⁶A site prediction, such as SRAMP, MethyRNA, and RNAMethPre.^32-34 Although there are many good precedents in m⁶A site prediction and deposition (database development), there has been no effort made in the substrate prediction of m⁶A enzymes. We, therefore, devised a computational tool to study the target specificity of m⁶A enzymes. In this study, the predictors were built using the random forest (RF) approach to distinguish the target specificity of the m⁶A writers (METTL3-METTL14 complex and METTL16) and the erasers (FTO and ALKBH5), respectively. Although the sequence-derived features were widely used in m⁶A site prediction^33,35 and generated reasonably good results, we included additional genome-derived features and achieved substantial improvement in performance.

Materials and Methods

The m⁶A sites

The transcriptome-wide m⁶A sites were extracted from the WHISTLE web server,³⁶ which used multiple genomic and sequence features to predict the entire epitranscriptome and achieved substantial improvement compared with existing approaches. Please note that all these m⁶A sites were originally collected from wet lab experiments³⁰ and are simultaneously supported by the WHISTLE prediction with high confidence. We considered the m⁶A sites with probability greater than .6, .7, .8, and .9, corresponding to 4 data sets of 98 095, 75 720, 52 687, and 27 646 RNA methylation sites, respectively. Four data sets were extracted because they offer different trade-offs between coverage and reliability: a larger set has better coverage of the m⁶A epitranscriptome but may also contain more false m⁶A sites that can affect prediction performance. The training data are provided in Supplementary Table 1.

Target sites of the enzymes

The ground truth targets were identified using perturbation experiments, eg, the hypomethylated sites identified from MeRIP-seq data after the knockdown of a methyltransferase. Specifically, the raw data were retrieved from GEO (Gene Expression Omnibus; see Table 1), and the FASTQ files were aligned to the reference genome hg19 using hisat2⁴² with default settings. The resulting SAM files were then converted to BAM files using samtools with the quality filter -q 30 and the FLAG filter -F 2820. Following that, the number of reads aligned to each individual RNA methylation site was counted as fragments in R using the GenomicAlignments package.⁴³ For each experiment with regulator perturbation, differential methylation analysis was conducted by DESeq2⁴⁴ using the generalized linear model (GLM) design with interaction ~ IP*Treatment, where IP is the indicator vector for the samples being IP, and Treatment is the indicator vector for the samples treated with regulator perturbation. The m⁶A sites with the Wald test FDR < 0.05 and the interaction coefficient <0 (>0 for the sample gsc11-ALKBH5-) are treated as the target sites of the regulator. The shared target sites of multiple enzymes, ie, (FTO and ALKBH5) or (M3/M14 vs M16), are considered to have an ambiguous association and are thus excluded from our analysis.
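The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' R/DESeq2 pipeline: the `SiteResult` record, the function names, and the `expected_sign` parameter are hypothetical, and the per-site interaction coefficients and FDR values are assumed to have been computed beforehand. The flag table simply decomposes the value 2820 into the standard SAM FLAG bits it excludes.

```python
from dataclasses import dataclass

# SAM FLAG bits that sum to 2820, i.e. the alignments dropped by `-F 2820`.
EXCLUDED_FLAG_BITS = {
    4: "read unmapped",
    256: "secondary alignment",
    512: "read fails quality checks",
    2048: "supplementary alignment",
}


@dataclass(frozen=True)
class SiteResult:
    """Hypothetical per-site summary of a differential methylation test."""
    site_id: str        # illustrative identifier of an m6A site
    interaction: float  # estimated IP:Treatment interaction coefficient
    fdr: float          # adjusted P value of the Wald test


def is_target(result, expected_sign=-1, fdr_cutoff=0.05):
    """FDR below the cutoff and an interaction coefficient of the expected sign.

    expected_sign=-1 corresponds to the default rule (coefficient < 0);
    expected_sign=+1 corresponds to the exception stated for gsc11-ALKBH5.
    """
    return result.fdr < fdr_cutoff and result.interaction * expected_sign > 0


def select_targets(results, expected_sign=-1, fdr_cutoff=0.05):
    """Return the set of site identifiers that pass the selection rule."""
    return {r.site_id for r in results if is_target(r, expected_sign, fdr_cutoff)}


def drop_shared(targets_a, targets_b):
    """Remove sites claimed by both enzymes, since their association is ambiguous."""
    shared = targets_a & targets_b
    return targets_a - shared, targets_b - shared
```

Keeping the sign as a parameter makes the per-experiment exception explicit instead of hiding it in a special case.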

Table 1.

GEO data sets used to identify ground truth target sites.

ID	Regulator	Cell type	GEO SRA study	Publication
1	METTL14	A549	SRP039397	Schwartz et al³⁷
2	METTL14	Hela	SRP022152	Liu et al¹⁰
3	METTL14	MonoMac6	SRP103072	Weng et al³⁸
4	METTL14	NB4	SRP103072	Weng et al³⁸
5	METTL3	A549	SRP039397	Schwartz et al³⁷
6	METTL3	AML	SRP099081	Barbieri et al³⁹
7	METTL3	Hek293T	SRP039397	Schwartz et al³⁷
8	METTL3	Hela	SRP022152	Liu et al¹⁰
9	METTL16	HEK293A	SRP094637	Pendleton et al¹²
10	ALKBH5	gsc11	SRP067910	Zhang et al⁴⁰
11	FTO	AML	SRP067910	Li et al⁴¹

Abbreviations: ALKBH5, ALKB homolog 5; FTO, fat mass and obesity–associated protein; GEO, Gene Expression Omnibus; METTL3, methyltransferase-like 3; METTL14, methyltransferase-like 14; METTL16, methyltransferase-like 16; SRA, Sequence Read Archive.

Feature encoding scheme and selection

Sequence-derived features

The nucleotide encoding method according to chemical properties was suggested by Bari et al.⁴⁵ In MethyRNA³² and M6Apred,⁴⁶ this encoding method was applied to generate sequence-derived features and achieved good accuracy in m⁶A site prediction. In this project, we followed the same chemical encoding idea to generate sequence-derived features. Specifically, 3 chemical properties of the nucleotides were used to classify adenine (A), cytosine (C), guanine (G), and uracil (U). The first property is the ring structure: A and G have 2 rings, whereas C and U have only 1 ring. The second property is the functional group: A and C contain the amino group, whereas G and U contain the keto group. The third property is the number of hydrogen bonds formed: A and U form 2 hydrogen bonds during hybridization, whereas G and C form 3 hydrogen bonds. Based on the 3 structural chemical properties defined above, the ith nucleotide $s_{i}$ of a sequence can be encoded by a vector $(x_{i}, y_{i}, z_{i})$:

\begin{array}{l} x_{i} = \begin{cases} 1 & \text{if } s_{i} \in \{A, G\} \\ 0 & \text{if } s_{i} \in \{C, U\} \end{cases}, \quad y_{i} = \begin{cases} 1 & \text{if } s_{i} \in \{A, C\} \\ 0 & \text{if } s_{i} \in \{G, U\} \end{cases}, \\ z_{i} = \begin{cases} 1 & \text{if } s_{i} \in \{A, U\} \\ 0 & \text{if } s_{i} \in \{C, G\} \end{cases} \end{array}

(1)

Thus, A can be marked as (1, 1, 1), C can be marked as (0, 1, 0), G can be marked as (1, 0, 0), and U can be marked as (0, 0, 1). In addition, the accumulative nucleotide frequency is calculated for each position in the sequence. The density $d_{i}$ of the nucleotide at position $i$ is defined as the number of occurrences of that nucleotide in positions 1 through $i$. The nucleotide frequency (kept to 2 decimal places) is then defined as $f_{i} = d_{i} / i$. Using the sequence “AUGGACACU” as an example, the accumulative frequencies for adenine are 1.00 (1/1), 0.40 (2/5), and 0.43 (3/7) at the first, fifth, and seventh sequence positions, respectively, whereas the frequencies for uracil are 0.50 (1/2) and 0.22 (2/9) at the second and ninth sequence positions, respectively. The sequence extending 20 bp (base pair) to each side of each m⁶A site was encoded by the above method. Each of the resulting 41 positions contributes 3 chemical property values and 1 frequency value, giving a 164-dimensional sequence-derived feature vector for each m⁶A site.
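The encoding can be illustrated with a short Python sketch. It is not the authors' implementation; the function names are hypothetical, and the choice to place the 3 chemical bits and the frequency value of each position next to each other in the output vector is an assumption, since the text does not specify the feature order.

```python
# Chemical-property code per nucleotide:
# (ring structure, functional group, hydrogen bonds), as in equation (1).
CHEMICAL_CODE = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def cumulative_frequency(seq):
    """Return f_i = d_i / i for each 1-based position i, rounded to 2 decimals.

    d_i counts how often the nucleotide found at position i has occurred
    in positions 1..i.
    """
    counts = {}
    freqs = []
    for i, nt in enumerate(seq, start=1):
        counts[nt] = counts.get(nt, 0) + 1
        freqs.append(round(counts[nt] / i, 2))
    return freqs


def encode_sequence(seq):
    """Encode an RNA sequence as 4 values per position (3 chemical bits + frequency)."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for nt, freq in zip(seq, cumulative_frequency(seq)):
        features.extend(CHEMICAL_CODE[nt])
        features.append(freq)
    return features
```

For a 41-nt window centered on an m⁶A site, `encode_sequence` returns 41 × 4 = 164 values, matching the dimensionality stated above.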

Genome-derived features

Although the sequence-based features were widely used in the prediction of RNA modification sites, there are potentially other features that can be used.⁴⁷ The genomic features have been shown in the WHISTLE project to be effective in m⁶A site prediction. A total of 49 genome-derived features were considered for this project (see Table 2). Specifically, genomic features 1 to 17 specify the locations of adenosine sites within the transcript region and their topological properties as dummy variables. To generate features in this category, we used the transcript annotations of the hg19 human genome assembly and the GenomicFeatures R/Bioconductor package.⁴³ Genomic features 18 to 21 define the relative positions of adenosine sites within the transcript region, calculated as the distance from the methylated adenine to the 5′ end divided by the total width of the region; the position features are set to 0 if the adenosine sites do not belong to the region. The values of features 22 to 26 are the lengths of the transcript region containing the methylated site; if the sites do not belong to the region, these features are also set to 0. The evolutionary conservation of the methylated adenosine sites and their flanking regions was measured in features 27 to 30 with 2 metrics of nucleotide conservation: the PhastCons score and the fitness consequence (fitCons) score. Features 31 and 32 represent the RNA secondary structures of the transcript region containing the methylated adenine, predicted using RNAfold in the Vienna RNA package. Features 33 to 35 represent the attributes of the genes or transcripts containing methylated sites. Features 36 to 40 indicate whether the adenosine sites interact with small noncoding RNA, long noncoding RNA, and microRNA, respectively. Finally, features 41 to 49 indicate whether the methylated sites are located within RNA protein-binding regions.
For the above features, to avoid ambiguity caused by transcript isoforms, only the primary (longest) transcript of each gene was kept for the extraction of the transcript subregions. The details for genomic features are summarized in Table 2.
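As a rough illustration of the relative position and region length features (features 18 to 26), the Python sketch below treats a region as a single contiguous interval with 1-based inclusive coordinates. Both are simplifying assumptions: real transcript regions can span several exons, and the function names are hypothetical. The distance to the 5′ end is counted inclusively here so that a site at the very first base of a region is not confused with the value 0 reserved for sites outside the region.

```python
def relative_position(site, region_start, region_end, strand="+"):
    """Relative position of a site within a region, or 0.0 if it lies outside.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    On the minus strand the 5' end of the region is its largest coordinate.
    """
    if not (region_start <= site <= region_end):
        return 0.0
    width = region_end - region_start + 1
    if strand == "+":
        distance_to_5p = site - region_start + 1
    else:
        distance_to_5p = region_end - site + 1
    return distance_to_5p / width


def region_length(site, region_start, region_end):
    """Length of the region containing the site, or 0 if the site lies outside."""
    if not (region_start <= site <= region_end):
        return 0
    return region_end - region_start + 1
```

For example, a site at coordinate 150 in a plus-strand region spanning 101 to 200 has a relative position of 0.5 and a region length of 100.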

Table 2.

Genomic features used in the analysis.

ID	Name	Description	Note
1	UTR5	5′ UTR	Dummy variables indicating whether the site overlaps the topological region on the major RNA transcript
2	UTR3	3′ UTR
3	cds	CDS
4	Stop_codons	Stop codons flanked by 100 bp
5	Start_codons	Start codons flanked by 100 bp
6	TSS	Downstream 100 bp of TSS
7	TSS_A	Downstream 100 bp of TSS on A
8	Stop_codons	Stop codons
9	exon_stop	Exons containing stop codons
10	alternative_exon	Alternative exons
11	constitutive_exon	Constitutive exons
12	internal_exon	Internal exons
13	long_exon	Long exons (exon length ⩾ 400 bp)
14	last_exon	Last exons⁴⁸
15	last_exon_400bp	5′ 400 bp of the last exons⁴⁸
16	last_exon_sc400	5′ 400 bp of the last exons containing stop codons⁴⁸
17	intron	Introns
18	pos_UTR5	Relative position on 5′ UTR	Relative position on the region
19	pos_UTR3	Relative position on 3′ UTR
20	pos_CDS	Relative position on CDS
21	pos_exons	Relative position on exon
22	length_UTR5	5′ UTR length	The region length in base pairs
23	length_UTR3	3′ UTR length
24	length_cds	CDS length
25	length_gene_ex	Mature transcript length
26	length_gene_full	Full transcript length
27	PC_1bp	PhastCons scores of the nucleotide⁴⁹	Scores related to evolutionary conservation
28	PC_101bp	Average phastCons scores within the flanking 50 bp region⁴⁹
29	FC_1bp	fitCons scores of the nucleotide⁵⁰
30	FC_101bp	Average fitCons scores within the flanking 50 bp region⁵⁰
31	struc_hybridize	Predicted RNA hybridized region⁵¹	RNA secondary structure
32	struc_loop	Predicted RNA loop region⁵¹	RNA secondary structure
33	isoform_num	Isoform number	Attributes of the genes or transcripts
34	exon_num	Exon number
35	HK_genes	Housekeeping genes⁵²
36	sncRNA	sncRNA	RNA annotations related to m6A biology
37	lncRNA	lncRNA
38	miR_targeted_genes	miRNA-targeted genes⁵³
39	Verified_miRtargets	miRNA-targeted sites verified by experiment⁸
40	TargetScan	Predicted miRNA targeted sites by TargetScan⁹
41	HNRNPC_eCLIP	eCLIP data of HNRNPC RNA binding sites⁷	RNA-binding protein annotation from MeTDB database³¹
42	METTL3_TREW	METTL3-binding region³¹
43	METTL14_TREW	METTL14-binding region³¹
44	WTAP_TREW	WTAP-binding region³¹
45	YTHDC1_TREW	YTHDC1-binding region³¹
46	YTHDF1_TREW	YTHDF1-binding region³¹
47	YTHDF2_TREW	YTHDF2-binding region³¹
48	ALKBH5_TREW	ALKBH5-binding region³¹
49	FTO_TREW	FTO-binding region³¹

Features that are directly related to the prediction target were not used, to avoid overfitting. For example, features 42 and 43 were not used for writer target prediction, whereas features 48 and 49 were not used for eraser target prediction.

Machine learning approach

Machine learning algorithms have been widely used in the field of computational biology. In RNA epigenetics studies, support vector machine (SVM) and RF have been used previously in RNA m⁶A site prediction,^32,33,46 and both achieved good performance. In this project, the RF algorithm from the R randomForest package was used to build predictor models.

Performance evaluation

A 5-fold cross-validation was used for assessing the reliability of the method. In the performance evaluation, the sensitivity $(\mathrm{Sn})$ and specificity $(\mathrm{Sp})$ are defined as follows:

\mathrm{Sn} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}

(2)

\mathrm{Sp} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}

(3)

where TP, TN, FP, and FN represent the numbers of true positives, true negatives, false positives, and false negatives, respectively. In addition, prediction performance under different decision thresholds was summarized by the receiver operating characteristic (ROC) curve, whose y-axis is sensitivity and x-axis is 1 − specificity; the area under the ROC curve (AUC) was used as the main performance metric. The ACC (overall accuracy) and MCC (Matthews correlation coefficient) were also calculated as additional indicators of the reliability of the model.
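The metrics above can be computed as in the following Python sketch. Sensitivity and specificity follow equations (2) and (3); the ACC and MCC formulas are not spelled out in the text, so the standard definitions are assumed here. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half), which equals the area under the ROC curve. Function names and the 0/1 label convention are illustrative.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Return Sn, Sp, ACC, and MCC; assumes both classes occur in y_true."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sn": tp / (tp + fn),
        "Sp": tn / (tn + fp),
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        # MCC is conventionally set to 0 when the denominator vanishes.
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of sites, which is fine for a small illustration; a rank-based computation would be preferable for data sets of the size used in this study.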

Results and Discussion

Feature selection

Although extensive research in m⁶A site prediction has demonstrated the effectiveness and reliability of sequence-derived features^32,33,54 and genome-derived features,³⁶ we seek to, for the first time, use these features to predict the target specificity of m⁶A enzymes. Due to the abundance of the genomic features, we first performed feature selection to identify the genomic features most relevant to our purpose, which is to improve the reliability of the features and the prediction performance, as well as to save computation time and memory. The feature selection was performed on the set of RNA methylation sites with probability greater than .6.

At the beginning, the Perturb method⁵⁵ was used to estimate the relative importance of each genomic feature in the target specificity prediction of eraser targets using the R caret package. To illustrate the relative importance of different features clearly, the measurement results are rescaled and ranked (Figure 1A). According to this rank, relevant AUROC (area under the receiver operating characteristics) figures are generated based on the top N features. We can observe that the best performance was achieved with the top 20 features. Thus, only the top 20 features were used in our prediction model for erasers. Similarly, the same treatment was done on the writer target prediction (Figure 1B), where the best performance was achieved with the top 15 features.
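The top-N selection step can be sketched as follows. This is a generic illustration rather than the caret-based procedure used in the study: `score_fn` is a hypothetical callback that is assumed to train a model on the given feature subset and return its cross-validated AUROC, and breaking ties in favor of the smaller feature set is a design choice of this sketch.

```python
def select_top_n(ranked_features, score_fn, candidate_sizes=None):
    """Pick the number of top-ranked features that maximizes a score.

    ranked_features: feature names ordered from most to least important.
    score_fn: callable taking a list of feature names and returning a score
              (assumed here to be a cross-validated AUROC).
    Returns (best_n, best_score, scores_by_n).
    """
    if candidate_sizes is None:
        candidate_sizes = range(1, len(ranked_features) + 1)
    scores = {n: score_fn(ranked_features[:n]) for n in candidate_sizes}
    # Highest score wins; on ties, prefer the smaller feature set.
    best_n = max(scores, key=lambda n: (scores[n], -n))
    return best_n, scores[best_n], scores
```

With the importance ranking of the genomic features and a scoring function wrapping the 5-fold cross-validation, `select_top_n` returns the N at which the AUROC peaks.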

Figure 1. — Feature Selection for Predictors. (A) The top 20 genomic features were used for prediction of the targets of erasers, including conservation score, METTL3 targets, etc. (B) The top 15 genomic features were used for prediction of the targets of writers, with the distance to known m⁶A site as the most important predictive feature, followed by gene length and conservation score.

Predictors based on different set of features

Existing computational models overwhelmingly relied on sequence features. Because our prediction model also incorporates features derived from other genomic annotations (see Table 2), it is important to test whether these features contribute to the prediction performance. For this purpose, a 5-fold cross-validation was conducted on the data set of RNA methylation sites with probability greater than .6 using different types of features. As shown in Table 3, sequence features were more effective than genomic features in the target prediction for erasers, but not for writers. In both cases, however, the best performance was achieved when sequence and genomic features were combined.

Table 3.

Performance of predictors based on different features.

Feature type	Erasers (FTO vs ALKBH5)			Writers (M3/M14 vs M16)
Feature type	Sensitivity	Specificity	AUROC	Sensitivity	Specificity	AUROC
Sequence	0.789	0.781	0.849	0.656	0.746	0.772
Genome	0.762	0.736	0.827	0.802	0.795	0.886
Both	0.814	0.813	0.887	0.802	0.795	0.889

Abbreviations: ALKBH5, ALKB homolog 5; AUROC, area under the receiver operating characteristics; FTO, fat mass and obesity–associated protein.

This result was achieved on RNA methylation sites with probability greater than .6 with a 5-fold cross-validation.

Performance on different data sets

In the next step, we tested the model on all 4 data sets. As shown in Table 4, the 4 data sets have different coverage of the m⁶A epitranscriptome; for erasers, the best target prediction performance was achieved on data set 4 (RNA methylation sites with probability greater than .9), whereas for writers, the best performance was achieved on data set 3 (RNA methylation sites with probability greater than .8). For comparison with other machine learning approaches, SVM, Naïve Bayes, decision tree, and GLM models were also built. The performance of each method is summarized in Table S2 and was evaluated by sensitivity, specificity, AUROC, ACC, and MCC.

Table 4.

Prediction performance on different data sets (AUROC).

Enzyme type	Data set
Enzyme type	Data set 1 (P* > .6) 98 095 sites	Data set 2 (P > .7) 75 720 sites	Data set 3 (P > .8) 52 687 sites	Data set 4 (P > .9) 27 646 sites
Erasers (FTO vs ALKBH5)	0.873	0.873	0.872	0.888
Writers (M3/M14 vs M16)	0.889	0.888	0.911	0.877

Abbreviations: ALKBH5, ALKB homolog 5; AUROC, area under the receiver operating characteristics; FTO, fat mass and obesity–associated protein.

Four data sets were considered, corresponding to the experiment-validated RNA methylation sites from RMBase and also supported by WHISTLE prediction with probability greater than .6, .7, .8, and .9, respectively. The detailed performance of 5 different classification predictors (RF, SVM, GLM, Naïve Bayes, and decision tree) is presented in Supplementary Table S2.

Biological functions regulated by different enzymes

There are 3585, 4623, 4742, and 4734 sites identified in data set 1 (see Table 4) under the regulation of the METTL3-METTL14 complex, METTL16, FTO, and ALKBH5, respectively, which are located on 2149, 2178, 2375, and 2635 genes. The biological functions of these target sites were then annotated with gene ontology enrichment analysis using the DAVID website.⁵⁶ Figure 2 shows the top 10 most enriched biological processes. We can see that different biological processes were enriched for different enzymes. For example, FTO targets are associated with cell-cell adhesion (5.473E−13), mRNA splicing (2.212E−12), and viral process (4.31E−09), whereas the target sites of ALKBH5 are more related to Golgi organization (4.13E−09) and DNA-templated transcription (4.14E−14). METTL16 targets are enriched with genes related to endoplasmic reticulum–associated misfolded protein catabolic process (9.07E−06), regulation of cell cycle (1.15E−04), apoptotic process (6.70E−04), and protein ubiquitination (9.99E−05), whereas the METTL3-METTL14 complex preferentially targets genes associated with cell-cell adhesion (2.86E−07), cell division (4.08E−07), and G2/M transition of mitotic cell cycle (3.03E−06). Please see Table S3 for the complete gene ontology enrichment analysis result.

Figure 2. — Biological processes enriched in targets of m⁶A enzymes. Distinct biological processes are enriched in the predicted target sites of different enzymes. Figure shows the top 10 most statistically enriched biological processes associated with the targets of different m⁶A enzymes.

Discussion and Conclusions

Recent progress in RNA modification bioinformatics has enabled the precise detection, accurate quantification, differential analysis, and functional annotation of m⁶A RNA methylation sites at base resolution. RMBase and MeT-DB collected experimentally validated m⁶A sites in multiple species and revealed their potential regulatory functions.^30,31 The exomePeak^57,58 method was developed based on Przyborowski and Wilenski’s^59,60 method for m⁶A site detection and differential methylation analysis from MeRIP-seq data. The computational prediction of m⁶A modification sites in different species was performed in iRNA(m6A)-PseDNC,⁶¹ iRNA-Methyl,⁶² m6Apred,⁴⁶ RFAthM6A,⁶³ and BERMP⁶⁴ based on machine learning or deep learning approaches. The potential disease relevance and single-nucleotide polymorphism association of m⁶A modification were revealed by m6Avar⁶⁵ and m6ASNP⁶⁶ by examining whether a disease mutation can alter the RNA methylation status. Meanwhile, complex network methods were used in m6Acomet,⁶⁷ m6A-Driver,⁶⁸ Deepm6A,⁶⁹ DRUM,⁷⁰ and FunDMDeep-m6A⁷¹ to study the regulatory functions and predict the disease association of m⁶A RNA modification.

Here, we have proposed a computational approach for the prediction of the target sites of m⁶A enzymes. The computational model proposed relies on 49 genomic features as well as the conventional sequence features. With a model selection step, we showed with a 5-fold cross-validation that the proposed approach achieved relatively good performance in the target prediction for the writers (AUC: 0.918) and erasers (AUC: 0.888). The following gene ontology analysis unveiled the epitranscriptome functional relevance of these enzymes.

The proposed approach suffers from the following limitations. (1) The ground truth target sites were identified from perturbation experiments, in which a target site of a methyltransferase is defined as a site whose methylation level decreases when the methyltransferase is knocked down. The decrease in methylation level may not reflect direct targeting but rather a secondary effect. For this reason, the ground truth data can be further improved. (2) The features incorporated in the prediction model can be further extended. Although a total of 49 genomic features have been incorporated in our prediction model, the set can be expanded by including, eg, features related to lncRNA and repeat regions. An enlarged feature set can often lead to improved performance. (3) We considered here only a binary classification, which emphasizes the target specificity of different enzymes. However, in practice, it is possible that a large number of RNA methylation sites are simultaneously targeted by both m⁶A writers (or both m⁶A erasers) considered in this work. In addition, there are likely to be unknown methyltransferases or demethylases yet to be discovered, which are therefore not considered in the prediction models. This would be a difficult question to solve. (4) A better computational method may be used. We used here RF, which is a classic method. Recent developments in artificial intelligence, especially deep learning–related approaches, may achieve better performance.

Supplemental Material

Supplemental material for Predict Epitranscriptome Targets and Regulatory Functions of N6-Methyladenosine (m6A) Writers and Erasers by Yiyou Song, Qingru Xu, Zhen Wei, Di Zhen, Jionglong Su, Kunqi Chen and Jia Meng in Evolutionary Bioinformatics:

Supplement_Table_1 (xlsx, 6.6MB)
Supplement_Table_2 (pdf, 210.4KB)
Supplement_Table_3 (xlsx, 875.8KB)

Footnotes

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work has been supported by National Natural Science Foundation of China (31671373), Jiangsu University Natural Science Program (16KJB180027), XJTLU Key Program Special Fund (KSF-T-01), and Jiangsu Six Talent Peak Program (XYDXX-118).

Declaration of conflicting interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Author Contributions: JM and KC conceived the idea and designed the research; ZW processed the raw data; YS, QX, and DZ performed the prediction analysis; YS and QX drafted the manuscript first. All authors read, critically revised, and approved the final manuscript.

ORCID iD: Kunqi Chen https://orcid.org/0000-0002-6025-8957

Supplemental Material: Supplemental material for this article is available online.

References

1. Grosjean H. Fine-Tuning of RNA Functions by Modification and Editing. Berlin, Germany: Springer; 2005. [Google Scholar]
2. Boccaletto P, Machnicka MA, Purta E, et al. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic Acids Res. 2017;46:D303-D307. [DOI] [PMC free article] [PubMed] [Google Scholar]
3. Meyer KD, Jaffrey SR. The dynamic epitranscriptome: N⁶-methyladenosine and gene expression control. Nat Rev Mol Cell Biol. 2014;15:313-326. [DOI] [PMC free article] [PubMed] [Google Scholar]
4. Jia G, Fu Y, Zhao X, et al. N⁶-methyladenosine in nuclear RNA is a major substrate of the obesity-associated FTO. Nat Chem Biol. 2011;7:885-887. [DOI] [PMC free article] [PubMed] [Google Scholar]
5. Mukobata S, Hibino T, Sugiyama A, et al. M6a acts as a nerve growth factor-gated Ca²⁺ channel in neuronal differentiation. Biochem Biophys Res Commun. 2002;297:722-728.
6. Wang X, Lu Z, Gomez A, et al. N⁶-methyladenosine-dependent regulation of messenger RNA stability. Nature. 2014;505:117-120.
7. Bokar JA, Rath-Shambaugh ME, Ludwiczak R, Narayan P, Rottman F. Characterization and partial purification of mRNA N⁶-adenosine methyltransferase from HeLa cell nuclei. Internal mRNA methylation requires a multisubunit complex. J Biol Chem. 1994;269:17697-17704.
8. Wang X, Feng J, Xue Y, et al. Structural basis of N⁶-adenosine methylation by the METTL3-METTL14 complex. Nature. 2016;534:575-578.
9. Wang Y, Li Y, Toth JI, Petroski MD, Zhang Z, Zhao JC. N⁶-methyladenosine modification destabilizes developmental regulators in embryonic stem cells. Nat Cell Biol. 2014;16:191-198.
10. Liu J, Yue Y, Han D, et al. A METTL3-METTL14 complex mediates mammalian nuclear RNA N⁶-adenosine methylation. Nat Chem Biol. 2014;10:93-95.
11. Sledz P, Jinek M. Structural insights into the molecular mechanism of the m⁶A writer complex. Elife. 2016;5:e18434.
12. Pendleton KE, Chen B, Liu K, et al. The U6 snRNA m⁶A methyltransferase METTL16 regulates SAM synthetase intron retention. Cell. 2017;169:824-835.e14.
13. Ruszkowska A, Ruszkowski M, Dauter Z, Brown JA. Structural insights into the RNA methyltransferase domain of METTL16. Sci Rep. 2018;8:5311.
14. Yoon KJ, Ringeling FR, Vissers C, et al. Temporal control of mammalian cortical neurogenesis by m⁶A methylation. Cell. 2017;171:877-889.e17.
15. Cui Q, Shi H, Ye P, et al. m⁶A RNA methylation regulates the self-renewal and tumorigenesis of glioblastoma stem cells. Cell Rep. 2017;18:2622-2634.
16. Shima H, Matsumoto M, Ishigami Y, et al. S-adenosylmethionine synthesis is regulated by selective N⁶-adenosine methylation and mRNA degradation involving METTL16 and YTHDC1. Cell Rep. 2017;21:3354-3363.
17. Zheng G, Dahl JA, Niu Y, et al. ALKBH5 is a mammalian RNA demethylase that impacts RNA metabolism and mouse fertility. Mol Cell. 2013;49:18-29.
18. Jia G, Yang CG, Yang S, et al. Oxidative demethylation of 3-methylthymine and 3-methyluracil in single-stranded DNA and RNA by mouse and human FTO. FEBS Lett. 2008;582:3313-3319.
19. Toh JDW, Sun L, Lau LZM, et al. A strategy based on nucleotide specificity leads to a subfamily-selective and cell-active inhibitor of N⁶-methyladenosine demethylase FTO. Chem Sci. 2015;6:112-122.
20. Yang T, Cheong A, Mai X, Zou S, Woon EC. A methylation-switchable conformational probe for the sensitive and selective detection of RNA demethylase activity. Chem Commun (Camb). 2016;52:6181-6184.
21. Han Z, Niu T, Chang J, et al. Crystal structure of the FTO protein reveals basis for its substrate specificity. Nature. 2010;464:1205-1209.
22. Aik W, Scotti JS, Choi H, et al. Structure of human RNA N⁶-methyladenine demethylase ALKBH5 provides insights into its mechanisms of nucleic acid recognition and demethylation. Nucleic Acids Res. 2014;42:4741-4754.
23. Chen W, Zhang L, Zheng G, et al. Crystal structure of the RNA demethylase ALKBH5 from zebrafish. FEBS Lett. 2014;588:892-898.
24. Fu Y, Jia G, Pang X, et al. FTO-mediated formation of N⁶-hydroxymethyladenosine and N⁶-formyladenosine in mammalian RNA. Nat Commun. 2013;4:1798.
25. Tang C, Klukovich R, Peng H, et al. ALKBH5-dependent m⁶A demethylation controls splicing and stability of long 3′-UTR mRNAs in male germ cells. Proc Natl Acad Sci U S A. 2018;115:E325-E333.
26. Gerken T, Girard CA, Tung YC, et al. The obesity-associated FTO gene encodes a 2-oxoglutarate-dependent nucleic acid demethylase. Science. 2007;318:1469-1472.
27. Chen K, Lu Z, Wang X, et al. High-resolution N⁶-methyladenosine (m⁶A) map using photo-crosslinking-assisted m⁶A sequencing. Angew Chem. 2015;127:1607-1610.
28. Dominissini D, Moshitch-Moshkovitz S, Schwartz S, et al. Topology of the human and mouse m⁶A RNA methylomes revealed by m⁶A-seq. Nature. 2012;485:201-206.
29. Meyer KD, Saletore Y, Zumbo P, Elemento O, Mason CE, Jaffrey SR. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell. 2012;149:1635-1646.
30. Xuan J-J, Sun WJ, Lin PH, et al. RMBase v2.0: deciphering the map of RNA modifications from epitranscriptome sequencing data. Nucleic Acids Res. 2017;46:D327-D334.
31. Liu H, Wang H, Wei Z, et al. MeT-DB V2.0: elucidating context-specific functions of N⁶-methyladenosine methyltranscriptome. Nucleic Acids Res. 2018;46:D281-D287.
32. Chen W, Tang H, Lin H. MethyRNA: a web server for identification of N⁶-methyladenosine sites. J Biomol Struct Dyn. 2017;35:683-687.
33. Zhou Y, Zeng P, Li Y-H, Zhang Z, Cui Q. SRAMP: prediction of mammalian N⁶-methyladenosine (m⁶A) sites based on sequence-derived features. Nucleic Acids Res. 2016;44:e91.
34. Xiang S, Liu K, Yan Z, Zhang Y, Sun Z. RNAMethPre: a web server for the prediction and query of mRNA m⁶A sites. PLoS One. 2016;11:e0162707.
35. Chen W, Xing P, Zou Q. Detecting N⁶-methyladenosine sites from RNA transcriptomes using ensemble support vector machines. Sci Rep. 2017;7:40242.
36. Chen K, Wei Z, Zhang Q, et al. WHISTLE: a high-accuracy map of the human N⁶-methyladenosine (m⁶A) epitranscriptome predicted using a machine learning approach. Nucleic Acids Res. 2019;47:e41.
37. Schwartz S, Mumbach MR, Jovanovic M, et al. Perturbation of m⁶A writers reveals two distinct classes of mRNA methylation at internal and 5′ sites. Cell Rep. 2014;8:284-296.
38. Weng H, Huang H, Wu H, et al. METTL14 inhibits hematopoietic stem/progenitor differentiation and promotes leukemogenesis via mRNA m⁶A modification. Cell Stem Cell. 2018;22:191-205.e9.
39. Barbieri I, Tzelepis K, Pandolfini L, et al. Promoter-bound METTL3 maintains myeloid leukaemia by m⁶A-dependent translation control. Nature. 2017;552:126-131.
40. Zhang S, Zhao BS, Zhou A, et al. m⁶A demethylase ALKBH5 maintains tumorigenicity of glioblastoma stem-like cells by sustaining FOXM1 expression and cell proliferation program. Cancer Cell. 2017;31:591-606.e6.
41. Li Z, Weng H, Su R, et al. FTO plays an oncogenic role in acute myeloid leukemia as a N⁶-methyladenosine RNA demethylase. Cancer Cell. 2017;31:127-141.
42. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12:357-360.
43. Lawrence M, Huber W, Pages H, et al. Software for computing and annotating genomic ranges. PLoS Comput Biol. 2013;9:e1003118.
44. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550.
45. Bari ATMG, Reaz MR, Choi HJ, Jeong BS. DNA Encoding for Splice Site Prediction in Large DNA Sequence. Berlin, Germany: Springer; 2013:46-58.
46. Chen W, Tran H, Liang Z, Lin H, Zhang L. Identification and analysis of the N⁶-methyladenosine in the Saccharomyces cerevisiae transcriptome. Sci Rep. 2015;5:13859.
47. Xu T, Zheng X, Li B, Jin P, Qin Z, Wu H. A comprehensive review of computational prediction of genome-wide features [published online ahead of print November 16, 2018]. Brief Bioinform. doi: 10.1093/bib/bby110.
48. Ke S, Pandya-Jones A, Saito Y, et al. m⁶A mRNA modifications are deposited in nascent pre-mRNA and are not required for splicing but do specify cytoplasmic turnover. Genes Dev. 2017;31:990-1006.
49. Siepel A, Haussler D. Phylogenetic hidden Markov models. In: Nielsen R. ed. Statistical Methods in Molecular Evolution. New York, NY: Springer; 2005:325-351.
50. Gulko B, Gronau I, Hubisz MJ, Siepel A. Probabilities of fitness consequences for point mutations across the human genome. bioRxiv 006825, 2014.
51. Lorenz R, Bernhart SH, Höner zu Siederdissen C, et al. ViennaRNA package 2.0. Algorithms Mol Biol. 2011;6:26.
52. Eisenberg E, Levanon EY. Human housekeeping genes, revisited. Trends Genet. 2013;29:569-574.
53. Chou C-H, Shrestha S, Yang CD, et al. miRTarBase update 2018: a resource for experimentally validated microRNA-target interactions. Nucleic Acids Res. 2017;46:D296-D302.
54. Li Y-H, Zhang G, Cui Q. PPUS: a web server to predict PUS-specific pseudouridine sites. Bioinformatics. 2015;31:3362-3364.
55. Gevrey M, Dimopoulos I, Lek S. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecol Model. 2003;160:249-264.
56. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2008;4:44-57.
57. Meng J, Cui X, Rao MK, Chen Y, Huang Y. Exome-based analysis for RNA epigenome sequencing data. Bioinformatics. 2013;29:1565-1567.
58. Meng J, Lu Z, Liu H, et al. A protocol for RNA methylation differential analysis with MeRIP-Seq data and exomePeak R/Bioconductor package. Methods. 2014;69:274-281.
59. Przyborowski J, Wilenski H. Homogeneity of results in testing samples from Poisson series: with an application to testing clover seed for dodder. Biometrika. 1940;31:313-323.
60. Krishnamoorthy K, Thomson J. A more powerful test for comparing two Poisson means. J Stat Plan Infer. 2004;119:23-35.
61. Chen W, Ding H, Zhou X, Lin H, Chou KC. iRNA(m⁶A)-PseDNC: identifying N(6)-methyladenosine sites using pseudo dinucleotide composition. Anal Biochem. 2018;561-562:59-65.
62. Chen W, Feng P, Ding H, Lin H, Chou KC. iRNA-Methyl: identifying N(6)-methyladenosine sites using pseudo nucleotide composition. Anal Biochem. 2015;490:26-33.
63. Wang X, Yan R. RFAthm⁶A: a new tool for predicting m⁶A sites in Arabidopsis thaliana. Plant Mol Biol. 2018;96:327-337.
64. Huang Y, He N, Chen Y, Chen Z, Li L. BERMP: a cross-species classifier for predicting m⁶A sites by integrating a deep learning algorithm and a random forest approach. Int J Biol Sci. 2018;14:1669-1677.
65. Zheng Y, Nie P, Peng D, et al. m⁶AVar: a database of functional variants involved in m⁶A modification. Nucleic Acids Res. 2018;46:D139-D145.
66. Jiang S, Xie Y, He Z, et al. m⁶ASNP: a tool for annotating genetic variants by m⁶A function. Gigascience. 2018;7:1-11.
67. Wu X, Wei Z, Chen K, et al. m⁶Acomet: large-scale functional prediction of individual m⁶A RNA methylation sites from an RNA co-methylation network. BMC Bioinformatics. 2019;20:223.
68. Zhang S, Zhang S, Liu L, Meng J, Huang Y. m⁶A-Driver: identifying context-specific mRNA m⁶A methylation-driven gene interaction networks. PLoS Comput Biol. 2016;12:e1005287.
69. Zhang S-Y, Zhang SW, Fan XN, et al. Global analysis of N⁶-methyladenosine functions and its disease association using deep learning and network-based methods. PLoS Comput Biol. 2019;15:e1006663.
70. Tang Y, Chen K, Wu X, et al. DRUM: inference of disease-associated m⁶A RNA methylation sites from a multi-layer heterogeneous network. Front Genet. 2019;10:266.
71. Zhang S-Y, Zhang S-W, Fan X-N, Zhang T, Meng J, Huang Y. FunDMDeep-m⁶A: identification and prioritization of functional differential m⁶A methylation genes. Bioinformatics. 2019;35:i90-i98.
