Genomics, Proteomics & Bioinformatics. 2018 May 3;16(2):127–135. doi: 10.1016/j.gpb.2018.01.001

Comparative Analysis of Human Genes Frequently and Occasionally Regulated by m6A Modification

Yuan Zhou 1,⁎,a, Qinghua Cui 1,⁎,b
PMCID: PMC6112303  PMID: 29730206

Abstract

The m6A modification has been implicated as an important epitranscriptomic marker, which plays extensive roles in the regulation of transcript stability, splicing, translation, and localization. Nevertheless, only some genes are repeatedly modified across various conditions and the principle of m6A regulation remains elusive. In this study, we performed a systems-level analysis of human genes frequently regulated by m6A modification (m6Afreq genes) and those occasionally regulated by m6A modification (m6Aocca genes). Compared to the m6Aocca genes, the m6Afreq genes exhibit gene importance-related features, such as lower dN/dS ratio, higher protein–protein interaction network degree, and reduced tissue expression specificity. Signaling network analysis indicates that the m6Afreq genes are associated with downstream components of signaling cascades, high-linked signaling adaptors, and specific network motifs like incoherent feed forward loops. Moreover, functional enrichment analysis indicates significant overlaps between the m6Afreq genes and genes involved in various layers of gene expression, such as being the microRNA targets and the regulators of RNA processing. Therefore, our findings suggest the potential interplay between m6A epitranscriptomic regulation and other gene expression regulatory machineries.

Keywords: m6A, Epitranscriptome, Signaling network, Gene expression regulation, Gene importance

Introduction

Various types of RNA modifications can change the chemical or structural properties of the nucleotide residues and thus constitute the core mechanism of the epitranscriptomic regulation [1], [2]. N6-methyladenosine (m6A), which is one of the most important and widespread RNA modifications [3], can be recognized as the molecular tag by its reader proteins. Accumulating evidence has shown that m6A is associated with several key biological processes. For example, m6A modification can be specifically recognized by the YTH domain family reader proteins YTHDF2 and YTHDF1 to regulate the degradation [4] and translation of RNA transcripts [5] respectively. And such regulatory processes can be facilitated by YTHDF3 [6], [7]. Besides, YTH domain containing reader protein YTHDC1 is involved in the regulation of alternative splicing [8], while YTHDC2 enhances translational efficiency [9]. Other regulatory factors like the eukaryotic translational initiation factor 3 (eIF3) could also read m6A modification to trigger the translation initiation [10]. As the modification could change the chemical properties of nucleotide residues, m6A may also perturb the local structure of RNA, and the altered structures have been shown to facilitate the binding of other proteins like heterogeneous nuclear ribonucleoprotein C (HNRNPC) to their target RNAs [11], [12]. Notably, besides the coding transcriptome, m6A has also been suggested to regulate the biogenesis of non-coding RNAs (ncRNAs) like microRNAs (miRNAs) [13].

Establishment of immunoprecipitation-based high-throughput sequencing techniques like MeRIP-seq or m6A-seq greatly facilitates the transcriptome-wide identification of m6A modification sites [14], [15]. Moreover, the transcriptome-wide m6A mapping studies also benefit from the recently developed high-resolution m6A mapping techniques, like miCLIP [18], and computational m6A site prediction tools, like the yeast m6A predictor m6Apred [19] and the mammalian m6A predictor SRAMP [20]. Currently, most of the m6A modification profiles have been collected in the RMBase database [21] and the MeT-DB database [16], [17]. Indeed, m6A sites constitute the vast majority of the RNA methylation sites in both databases. Although m6A profiles from various conditions have been included in these databases, the distribution of m6A modified genes across these conditions remains unclear. Interestingly, in our initial efforts to compile a comprehensive m6A dataset (see details in Table S1) from the MeT-DB V2.0 [17], we noted that only a few genes (18 genes) are always modified across all 38 conditions covered in this dataset. Why are some genes regulated by m6A modification more extensively than others? To address this question, we analyzed differences in the conservation, network, regulation, and functional features between genes frequently regulated by m6A (m6Afreq genes) and those occasionally regulated by m6A (m6Aocca genes).

Results and discussion

m6Afreq genes show gene importance-related features

The overall distribution of the m6A modified conditions in our comprehensive m6A dataset is shown in Figure 1. Many genes (5854 genes) are m6A-modified under ≤19 condition(s) and only some genes (1551 genes) are m6A-modified under >35 conditions (Figure 1A). Considering that not all genes are expressed under the 38 conditions covered in our dataset, we then corrected the number of m6A modified conditions by dividing it by the number of tissue/cell types in which the gene is expressed. As a result, a similar gene distribution was observed (Figure 1B). Among these genes, 4268 genes are found to be m6Afreq genes (modified under >3.5 corrected number of conditions), whereas 3711 genes are found to be m6Aocca genes (modified under ≤1.5 corrected number of conditions). To probe the biological characteristics related to such distribution, we performed comprehensive analyses to compare the features of m6Afreq genes and m6Aocca genes.

Figure 1.


The overall distribution of the number of m6A regulated conditions

A. The raw count of the number of m6A regulated conditions in our comprehensive m6A dataset. Intuitively, an m6A regulated condition is counted if there is any m6A peak identified in a particular gene under a specified condition. B. The corrected number of m6A regulated conditions in the comprehensive m6A dataset. The corrected number of m6A regulated conditions was obtained by dividing the number of m6A regulated conditions by the number of cell types (covered by m6A profiles) where the gene shows baseline expression. A gene is considered to show baseline expression in a cell type if its TPM is greater than 0.5 in the corresponding cell type according to the Human Protein Atlas database. TPM, transcripts per million.

Genes expressed across many conditions and cell types tend to be essential genes. Therefore, it is interesting to check whether m6Afreq genes possess the essential gene-related features. Although essential genes are often defined in a context-dependent manner, several gene features, including higher conservation, higher protein–protein interaction (PPI) network degree, and broader gene expression spectrum, have been repeatedly shown to be correlated with gene importance [22], [23]. Compared to the m6Aocca genes, the m6Afreq genes are more conserved as indicated by the significantly lower sequence divergence rate (i.e., lower dN/dS ratio; 0.116 ± 0.00182 vs. 0.157 ± 0.00275, Wilcoxon’s test P = 7.63E−36), although there are fewer orthologous genes across various species for m6Afreq genes (102 ± 2.80 vs. 127 ± 4.36, Wilcoxon’s test P = 0.0389). Moreover, the m6Afreq genes have higher PPI network degree (44.1 ± 1.23 vs. 28.0 ± 0.921, Wilcoxon’s test P = 8.12E−64), indicating that they tend to interact with more genes and show higher importance in the PPI network. Genes that are constantly expressed across various tissues, i.e., housekeeping genes, likely play essential roles. Compared to the m6Aocca genes, the m6Afreq genes show significantly lower tissue expression specificity (0.250 ± 0.00156 vs. 0.297 ± 0.00236, Wilcoxon’s test P = 1.95E−68), indicating that m6Afreq genes tend to be more widely expressed across different tissues.

The classification of m6Afreq genes and m6Aocca genes depends on the threshold used. To avoid bias induced by the arbitrary threshold, we then calculated the Spearman’s correlation coefficients between the corrected number of m6A regulated conditions and the aforementioned gene importance-related features. As shown in Figure 2, our results are in line with the m6Afreq genes vs. m6Aocca genes comparisons shown above for most features, with the exception that no significant correlation is observed for the number of orthologous genes. The corrected number of m6A regulated conditions shows positive correlations with PPI network degree, but negative correlations with dN/dS ratio and the tissue expression specificity. Given the corrected number of m6A regulated conditions is in accordance with most of the aforementioned gene importance-related features (except the number of orthologous genes), genes frequently regulated by m6A modification are more likely to be important to the cell.
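The threshold-free comparison described above rests on rank correlation. As a minimal sketch only (the analyses in this study were performed in R; the function names and toy values below are hypothetical), Spearman’s coefficient can be computed as the Pearson correlation of tie-averaged ranks:

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: corrected condition counts vs. a gene feature.
corrected_counts = [0.5, 1.2, 2.0, 3.8, 5.1]
feature_values = [0.30, 0.28, 0.21, 0.15, 0.16]
rho = spearman(corrected_counts, feature_values)
```

Because only ranks enter the calculation, the coefficient does not depend on where the m6Afreq and m6Aocca cut-offs are placed.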

Figure 2.


The correlation between the corrected number of m6A regulated conditions and various gene features

The correlation curves between the corrected number of m6A regulated conditions and various gene features are plotted by using the LOESS smoothing technique. The line indicates the local average estimated by LOESS smoothing and the shade indicates the confidence interval. Outlier genes (0.5%) with extremely high corrected number of m6A regulated conditions are omitted due to their high variation in gene feature values, which could result in badly skewed regression lines. A. Correlation of corrected number of m6A regulated conditions with dN/dS ratio. B. Correlation of corrected number of m6A regulated conditions with number of orthologous genes. C. Correlation of corrected number of m6A regulated conditions with PPI network degree. D. Correlation of corrected number of m6A regulated conditions with tissue expression specificity. E. Correlation of corrected number of m6A regulated conditions with number of targeting microRNAs. F. The summary of Spearman’s correlation coefficient and P values for panels A–E. PPI, protein–protein interaction.

Signaling network properties of the m6Afreq genes

As shown in the previous section, the m6Afreq genes have higher PPI network degree. However, the in vivo relationships between genes are more complicated than what is described by the binary PPI network. We thus performed the comprehensive signaling network analysis for the detailed network topology properties of m6Afreq genes. Besides PPIs, directed activating (positive) interactions and repressing (negative) interactions between genes are also included in the signaling network.

As a result, 1530 m6Afreq genes and 1194 m6Aocca genes were mapped onto the signaling network, respectively. No significant difference was observed in the network degree with respect to the directed edges when comparing the m6Afreq genes and m6Aocca genes (Wilcoxon’s test P = 0.611). We then classified the edges into activating and repressing edges, and compared the degree by considering activating edges or repressing edges alone. We found that compared to m6Aocca genes, m6Afreq genes have higher network degree when considering repressing edges alone (Wilcoxon’s test P = 3.28E−4). More specifically, m6Afreq genes have higher negative out-degree (i.e., the number of signal receivers repressed by this gene) than m6Aocca genes (1.63 ± 0.116 vs. 0.965 ± 0.0729, Wilcoxon’s test P = 1.05E−7), indicating that m6Afreq genes tend to repress other genes in the signaling network. We also tested other node centrality properties, including betweenness centrality, closeness centrality, eigenvector centrality, and transitivity centrality. Most of these properties do not significantly differ between m6Afreq genes and m6Aocca genes (Wilcoxon’s test P > 0.05), except that the m6Afreq genes show marginally higher betweenness centrality (2.52E−4 ± 3.18E−5 vs. 1.78E−4 ± 2.22E−5, Wilcoxon’s test P = 0.0203) and closeness centrality (5.64E−3 ± 2.17E−5 vs. 5.62E−3 ± 2.41E−5, Wilcoxon’s test P = 0.0285) than m6Aocca genes. These results suggest that m6Afreq genes and m6Aocca genes are of largely comparable importance to the signaling network.

The difference in betweenness centrality and closeness centrality between m6Afreq genes and m6Aocca genes also implies that the localization of m6Afreq genes and m6Aocca genes in the signaling network would differ. To test this hypothesis, for each node, we calculated its shortest distance to the upstream receptors and that to the downstream effectors, and deduced its relative level in the signaling network by comparing these two distances. The relative level of a gene ranges from 0 to 1, with larger values indicative of a downstream location (i.e., closer to the downstream effectors than to the upstream receptors) of the gene. While the upstream receptors could be clearly defined by the Gene Ontology (GO) term ‘receptor activity’, the identification of downstream effectors was not straightforward. We adopted two alternative definitions of downstream effectors. First, the downstream effectors could be identified as the nodes with zero out-degree after removing feedback loops. Since no signal would be sent from such nodes, these nodes are intuitively downstream effectors at the bottom ends of signaling cascades. However, this topology-based definition could be misled by the incomplete signaling network topology when the edges in the signaling network are limited. Therefore, as the second definition, we also assigned all transcription factors, which are often the outputting nodes in signaling pathways, as the downstream effectors. When applying the topology-based definition of downstream effectors, no significant difference in the relative level in the signaling network could be observed between m6Afreq genes and m6Aocca genes (Wilcoxon’s test P = 0.289). By contrast, a prominent difference was noticed between m6Afreq genes and m6Aocca genes when we assigned the transcription factors as the downstream effectors (Figure 3A; 0.660 ± 0.00943 vs. 0.553 ± 0.0110; Wilcoxon’s test P = 2.19E−23). This result indicates that the m6Afreq genes, especially transcription factors, tend to act as the downstream effectors along the signaling cascades.
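The relative level can be illustrated with a breadth-first search over the directed network. This is a minimal sketch under stated assumptions, not the igraph-based implementation used in the study: the function names, the toy network, and the choice to skip nodes that cannot be reached from a receptor or cannot reach an effector are all hypothetical.

```python
from collections import deque


def multi_source_distances(adjacency, sources):
    """Shortest hop count from the nearest source to every reachable node."""
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def relative_levels(edges, receptors, effectors):
    """Relative level = d(receptor -> node) / (d(receptor -> node) + d(node -> effector))."""
    forward, reverse = {}, {}
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
        reverse.setdefault(dst, []).append(src)
    # Distances follow edge direction: receptors send signal downstream.
    from_receptor = multi_source_distances(forward, receptors)
    # Searching the reversed graph from the effectors gives node -> effector distances.
    to_effector = multi_source_distances(reverse, effectors)
    levels = {}
    for node in from_receptor.keys() & to_effector.keys():
        total = from_receptor[node] + to_effector[node]
        if total > 0:  # assumption: skip nodes that are both receptor and effector
            levels[node] = from_receptor[node] / total
    return levels


# Hypothetical linear cascade: receptor R -> A -> B -> transcription factor TF.
toy_edges = [("R", "A"), ("A", "B"), ("B", "TF")]
levels = relative_levels(toy_edges, receptors=["R"], effectors=["TF"])
```

In this toy cascade the receptor sits at level 0, the transcription factor at level 1, and the intermediate nodes fall in between in proportion to their position.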

Figure 3.


The network feature of the m6Afreq genes

A. Boxplot comparing the distributions of relative level in the signaling network between the m6Afreq genes and m6Aocca genes. The relative level in the signaling network shown here was calculated as the shortest distance to any upstream receptor divided by the sum of the shortest distance to any upstream receptor and the shortest distance to any downstream transcription factor. B. Cumulative distribution plot comparing the PPI-only degree distribution of the m6Afreq genes and that of m6Aocca genes. The PPI-only degree only considers PPI edges in the signaling network but omits the activating and repressing edges. C. Cumulative distribution plot comparing the PPI-only degree distribution of the interacting partners of m6Afreq genes and that of the interacting partners of m6Aocca genes. D. The overrepresented network motifs of m6Afreq genes. A network motif is considered overrepresented if the occurrence of m6Afreq genes in it is higher than that of randomly picked genes in more than 9500 out of 10,000 random gene sets (corresponding to an empirical P value < 0.05). The respective motifs are explicitly depicted by the schemas on top. The activating and repressing edges are indicated using lines with arrowhead and circle, respectively. The name of the motif is composed of the motif type and the code describing the edge topology in the motif. For example, IFF3a1i2abbc indicates an incoherent feedforward loop with three nodes (a, b, and c) that form one activating edge and two repressing edges. Among the three edges, the major class of the edges (in this motif, the major class is repressing edge) comprises the edge between a and b, and the edge between b and c. IFF, incoherent feedforward loop; CFF, coherent feedforward loop; NFB, negative feedback loop; PFB, positive feedback loop.

Besides the activating/repressing edges, there are considerable numbers of PPI edges present in the signaling network. Nodes with many PPI partners in the signaling network often act as the adaptors, which can recruit other signaling components for efficient signaling [24]. We checked the PPI-only degree (the degree after ignoring the activating and repressing edges) of m6Afreq genes and m6Aocca genes. As a result, we found that m6Afreq genes have higher PPI-only degree than m6Aocca genes (Figure 3B; 2.64 ± 0.161 vs. 2.38 ± 0.177, Wilcoxon’s test P = 7.27E−5). In addition, the interacting partners of m6Afreq genes also exhibited higher PPI-only degree than the partners of m6Aocca genes (Figure 3C; 10.2 ± 0.441 vs. 8.16 ± 0.443, Wilcoxon’s test P = 3.99E−5). Therefore, the m6Afreq genes are inclined to be the recruited partners of high-linked signaling adaptors, or they themselves can act as high-linked signaling adaptors.

Signaling cascades are not always linear, and signaling network motifs like feedback loops and feedforward loops are prevalent and enable fine-tuned cellular signaling [25], [26], [27]. We thus tested whether the m6Afreq genes were overrepresented in some specific network motifs in comparison with random expectation (see Materials and methods section for details). All overrepresented network motifs are shown in Figure 3D and we found that the m6Afreq genes are most overrepresented in various types of incoherent feedforward loops. Unlike the negative feedback loops and coherent feedforward loops, which often work for cellular homeostasis, adaptation, and desensitization, the incoherent feedforward loops are often associated with ultra-sensitivity and non-monotonic response [25], [26], [27], [28]. The m6Afreq genes are also overrepresented in specific types of coherent feedforward loops that are unlikely to achieve adaptation [26]. Taken together, these results indicate that the m6Afreq genes are more likely to be involved in regulating the signal sensitivity than cellular homeostasis.

m6Afreq genes overlap with microRNA targets and development-related genes

Interestingly, a previous study shows that miRNA targets tend to be the downstream components in the signaling networks, interact with high-linked adaptors, and participate in the positively-linked network motifs [27]. Considering that m6Afreq genes show similar network properties, it is interesting to see whether genes extensively regulated by the m6A modification are also intensively targeted by miRNAs. We calculated the number of targeting miRNAs on each gene and found that the m6Afreq genes are more intensively regulated by miRNAs than the m6Aocca genes (number of targeting miRNAs 2.40 ± 0.113 vs. 1.18 ± 0.0858; Wilcoxon’s test P = 1.23E−20). Moreover, we also observed a positive correlation between the corrected number of m6A regulated conditions and the number of targeting miRNAs (Spearman’s correlation = 0.085, P = 3.47E−22; Figure 2E). A similar positive correlation persists when the positively co-expressed miRNA–target pairs (which were derived from the mirCoX database [29], see also Materials and methods) (Figure S1A; Spearman’s correlation = 0.0562, P = 1.43E−10) or the negatively co-expressed miRNA–target pairs were considered alone (Figure S1B; Spearman’s correlation = 0.0514, P = 4.42E−9). Together, these results indicate potential crosstalk between m6A regulation and miRNA regulation. Recently, Molinie et al. have reported that transcript isoforms heavily modified by m6A tend to have shorter 3′-UTR and therefore fewer miRNA binding sites [30]. Nevertheless, the conclusions of the two studies are not necessarily conflicting: while Molinie et al. focused on the intensively modified RNAs and performed the comparison between transcript isoforms (i.e., modified isoforms vs. non-modified isoforms), in this study we focused on the extensively modified RNAs and performed the comparison between different genes (i.e., genes widely modified across various conditions vs. genes occasionally modified). It is possible that some genes are surveilled by both multiple miRNAs and frequent m6A methylation. When heavily methylated, the isoforms of such genes lacking miRNA binding sites could be expressed to escape the regulation of miRNAs; conversely, the isoforms with multiple miRNA binding sites could be expressed when the m6A regulation is not present. How the miRNAs and m6A cooperate to regulate the gene expression in a sophisticated way deserves further experimental investigation.

miRNAs have been shown to be associated with cell proliferation and apoptosis [31]. We speculate that the m6Afreq genes could have similar enriched functions. We thus performed the GO functional enrichment analysis for m6Afreq genes. As a result, we found that the m6Afreq genes are significantly enriched for the terms like “embryo development”, “mitotic cell cycle”, “growth”, and “apoptotic signaling pathway” (Table S2). It is of note that these terms are not significantly enriched in m6Aocca genes (Table S3). This result again indicates potential functional crosstalk between m6A modification and miRNA targeting. In addition, m6A modification has also been implicated in the regulation of transcript translation, localization, stability, and splicing [4], [5], [8]. Interestingly, m6Afreq genes are also significantly associated with the functional terms like “negative regulation of transcription from RNA polymerase II promoter” and “RNA processing” (Table S2), which are not significantly enriched in m6Aocca genes (Table S3). Therefore, in addition to directly participating in the RNA metabolism process, it is plausible that m6A could also regulate RNA metabolism indirectly via extensively targeting the RNA metabolism-related genes, ultimately achieving more sophisticated regulation of the gene expression.

Preliminary validation on the quantitative m6A dataset and non-methylated genes

In the aforementioned analyses, we focused on the genes that are m6A regulated across various conditions. Given that these analyses were based only on the binary methylation profiles (i.e., m6A modified or not), the m6A methylation level was not taken into consideration. Therefore, we also took advantage of the quantitative m6A methylation profiles in the MeT-DB V2.0 database [17] for preliminary validation of the main results shown above. These m6A methylation profiles were collected using the standardized pipeline, and a quantitative enrichment score was provided for each m6A site peak. For each gene, a normalized m6A regulation breadth score was calculated in a way similar to the calculation of tissue expression specificity [32], [33] (see also Materials and methods section). The normalized m6A regulation breadth ranges from 0 to 1, with higher scores indicating genes more frequently regulated by m6A.

We checked the correlations between the normalized m6A regulation breadth and several gene features that have been shown to be associated with m6Afreq genes in the analyses above. In line with the results from binary methylation profiles, the normalized m6A regulation breadth shows positive correlation with the PPI network degree, the relative level in signaling network, and the number of targeting miRNAs, while a negative correlation of the normalized m6A regulation breadth with dN/dS ratio and tissue expression specificity was observed (Figure S2). These results further support our findings from the binary methylation profile analyses.

Another issue of our analyses is that we did not take into consideration the genes that are not methylated. Due to the limited coverage of currently available m6A profiles, it is hard to identify bona fide non-regulated genes (i.e., m6Anone genes) without significant bias. To perform a preliminary test, we defined genes that have baseline expression in at least one cell type covered by the m6A profiles but have no known m6A sites as the m6Anone genes. Consequently, we identified 2779 m6Anone genes for comparison of the gene features that have been shown to be associated with m6Afreq genes. Generally, the gene features of m6Anone genes are much more similar to those of m6Aocca genes than to those of m6Afreq genes (Figure S3). For example, m6Afreq genes have the highest PPI network degree, followed by m6Aocca genes, and then m6Anone genes. These results are in line with intuitive expectation. We anticipate that with the accumulation of m6A profiles in public databases, a less biased comparison between m6Afreq genes, m6Aocca genes, and m6Anone genes will be performed in the future.

Although our analyses suggest largely consistent results about the difference between m6Afreq genes and m6Aocca genes, substantial limitations exist in this study. First, the current human m6A methylation profiles were largely derived from cell lines, especially cancer cell lines like HeLa and A549 [16], [17], [21]. Therefore, these profiles could not fully recapitulate the in vivo m6A methylation patterns in normal human tissues. We hope that more tissue-derived m6A profiles can be generated in the future so that a dataset more representative of human biology can be compiled. Second, although we are able to compile a quantitative m6A dataset according to the enrichment score of m6A methylation peaks, the actual stoichiometry of m6A methylation is still hard to measure using current MeRIP-seq technologies [30], [34]. A novel m6A methylation quantification method is crucial to generate less biased methylation profiles for more reliable comparative analyses. Third, it is known that the topology of m6A sites along the genes could convey biological functions [14], [15]. However, we did not perform analysis at the m6A site level in the current study. The recent progress in single-nucleotide m6A site mapping techniques and m6A site prediction methods [18], [20] could enable a comprehensive comparison of m6A methylation sites across different conditions. Finally, to study the (functional) conservation of m6A modifications, it would also be interesting to evaluate our findings in other species.

In summary, our results indicate that the m6A modification tends to regulate important genes. Besides, the miRNA targets and regulators of gene expression like transcriptional factors and RNA processing factors are also suggested to be preferred targets of m6A modification. Therefore, extensive functional crosstalk between m6A epitranscriptomic regulation and other regulatory machineries of gene expression is implied.

Materials and methods

Definition of gene groups based on the number of m6A modification conditions

The human m6A modification profiles, which cover 38 different m6A modification conditions (Table S1), were downloaded from the recently-updated 2.0 version of the MeT-DB database (http://compgenomics.utsa.edu/MeTDB/ and http://www.xjtlu.edu.cn/metdb2) [17]. We first discarded the m6A profiles in which the expression of any m6A methylation core component (including METTL3, METTL14, WTAP, ALKBH5, and FTO) was perturbed (knockout, knockdown, or over-expression), and combined the modification sites from the biological replicates. Then, the modification sites were mapped to Entrez genes, and the number of conditions in which the gene was modified on at least one m6A site was counted. To reduce bias, we corrected the number of m6A regulated conditions by dividing it by the number of cell types with baseline expression. For each gene, the number of cell types or tissues covered by m6A studies and showing baseline expression (i.e., transcripts per million, TPM >0.5) of this gene was derived from the Human Protein Atlas database (https://www.proteinatlas.org/) [35]. The genes with corrected number of m6A regulated conditions >3.5 (roughly corresponding to the top 25% in the distribution of corrected number of m6A regulated conditions) were defined as the m6Afreq genes, while those with corrected number of m6A regulated conditions ≤1.5 (roughly corresponding to the bottom 25% in the distribution) were defined as the m6Aocca genes.
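The correction and grouping rule described in this subsection can be sketched as follows. The function names, the handling of genes without baseline expression, and the toy gene counts are hypothetical and only illustrate the arithmetic; the cut-offs 3.5 and 1.5 are those stated in the text.

```python
def corrected_condition_count(n_modified_conditions, n_expressed_cell_types):
    """Divide the number of m6A regulated conditions by the number of
    profiled cell types in which the gene shows baseline expression."""
    if n_expressed_cell_types <= 0:
        # Assumption: genes without baseline expression are left uncorrected.
        return None
    return n_modified_conditions / n_expressed_cell_types


def classify_gene(corrected, freq_cutoff=3.5, occa_cutoff=1.5):
    """Assign a gene group using the cut-offs quoted in the text."""
    if corrected is None:
        return "unclassified"
    if corrected > freq_cutoff:
        return "m6Afreq"
    if corrected <= occa_cutoff:
        return "m6Aocca"
    return "intermediate"


# Hypothetical toy input: gene -> (modified conditions, expressed cell types).
toy_genes = {"GENE_A": (30, 6), "GENE_B": (4, 5), "GENE_C": (10, 4)}
groups = {
    gene: classify_gene(corrected_condition_count(modified, expressed))
    for gene, (modified, expressed) in toy_genes.items()
}
```

Here GENE_A has a corrected count of 5.0 and falls into the m6Afreq group, GENE_B (0.8) into the m6Aocca group, and GENE_C (2.5) into neither.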

We also compiled a quantitative m6A dataset (m6A-quantitative dataset) based on the quantitative methylation profiles from MeT-DB V2.0. The m6A peaks were mapped onto the Entrez genes, and the total enrichment score along each transcript was calculated. For genes with multiple transcripts, only the maximum of the total enrichment scores was retained. The total enrichment scores of the genes from different technical replicates were averaged and log10-transformed to reduce the bias from the extremely high enrichment scores. Consequently, for each gene in the m6A-quantitative dataset, 38 total enrichment scores, which correspond to 38 different conditions, were obtained. Based on these 38 total enrichment scores, a specificity score τ was calculated in the same way as the calculation of tissue expression specificity [32], [33]. Finally, the normalized m6A regulation breadth was defined as 1 − τ. By definition, the normalized m6A regulation breadth ranges from 0 to 1, where a higher score indicates a more frequently regulated gene.
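For illustration, the breadth score can be sketched as below, assuming that τ takes its commonly used form, τ = Σ(1 − x_i / x_max) / (n − 1), and that the per-condition scores are non-negative; both are assumptions, and the function names are hypothetical.

```python
def tau_specificity(scores):
    """Tau index: sum(1 - x_i / x_max) / (n - 1) over n conditions.

    Assumes non-negative scores with at least one positive value.
    """
    n = len(scores)
    x_max = max(scores)
    if n < 2 or x_max <= 0 or min(scores) < 0:
        raise ValueError("need >= 2 non-negative scores with a positive maximum")
    return sum(1 - x / x_max for x in scores) / (n - 1)


def regulation_breadth(scores):
    """Normalized m6A regulation breadth, defined in the text as 1 - tau."""
    return 1 - tau_specificity(scores)


# Hypothetical per-condition enrichment scores for two genes.
broad_gene = [2.0, 2.0, 2.0, 2.0]   # equally enriched in every condition
narrow_gene = [5.0, 0.0, 0.0, 0.0]  # enriched in a single condition
broad_score = regulation_breadth(broad_gene)
narrow_score = regulation_breadth(narrow_gene)
```

A gene enriched equally in every condition gets a breadth of 1, whereas a gene enriched in a single condition gets a breadth of 0.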

Statistical analysis of the gene importance-related gene features

The human-to-mouse dN/dS ratios were downloaded from the Ensembl database (http://www.ensembl.org/) [36]. The numbers of orthologous genes were retrieved from the orthologous matrix (OMA) database (http://omabrowser.org/oma/) [37]. The PPI data were obtained from the BioGRID database (http://thebiogrid.org/) [38]. After removing genetic interactions and protein–RNA interactions, the degree of each protein was calculated by counting the total number of its interacting partners [39]. As for the tissue expression specificity, we first obtained the gene expression atlas across 79 human tissues measured by Su et al. [40] (GEO accession number: GDS590). For each gene, the tissue expression specificity was measured according to the state-of-the-art τ method which was described in the previous studies [32], [33]. The conversion of gene symbols and RefSeq IDs to Entrez gene ID was performed according to the ID mapping file retrieved from the Ensembl database. All statistical analysis was performed in R (https://www.r-project.org/).
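The degree calculation described above amounts to counting the distinct interacting partners of each protein. A minimal sketch follows; the function name, the exclusion of self-interactions, and the toy interaction list are assumptions rather than details taken from the study.

```python
def ppi_degrees(interactions):
    """Count the distinct interacting partners of each protein."""
    partners = {}
    for a, b in interactions:
        if a == b:
            continue  # assumption: self-interactions do not add a partner
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(found) for protein, found in partners.items()}


# Hypothetical interaction list; the repeated pair is counted once.
toy_interactions = [("P1", "P2"), ("P2", "P1"), ("P1", "P3"), ("P4", "P4")]
degrees = ppi_degrees(toy_interactions)
```

Using sets means that an interaction reported several times, or in both orientations, contributes a single partner.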

Signaling network analysis

The most recent human signaling network was downloaded from the Wang lab database (http://www.cancer-systemsbiology.org/) [27]. The node centrality analysis was performed using the igraph package in R. The relative level in the signaling network was calculated as the shortest distance to any upstream receptor divided by the sum of the shortest distance to any upstream receptor and the shortest distance to any downstream effector (e.g., transcription factors). Therefore, a higher relative level indicates that the gene is located downstream in the signaling network. The shortest distance between two genes was also calculated using the igraph package with the edge direction constraint. The common network motifs in the signaling network were defined in previous work [26]. The total occurrence of one gene in a specific network motif was summarized using an in-house Perl script. We also randomly re-sampled from the signaling network the same number of genes as that of the m6Afreq genes or m6Aocca genes. This random re-sampling procedure was repeated 10,000 times, which enables us to evaluate whether the enrichment of m6Afreq or m6Aocca genes for a specific motif can also be observed in randomly picked genes (thus randomly expected) or not. If the observed real occurrence is higher than the random occurrence in more than 9500 out of 10,000 re-sampling trials, the observed over-representation is considered as non-random (i.e., re-sampling test P < 0.05).
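The re-sampling test can be sketched as follows. This is not the in-house Perl script: the function name, the representation of motif occurrences as a per-gene count dictionary, the fixed random seed, and the toy network are all hypothetical.

```python
import random


def resampling_p_value(motif_counts, gene_set, n_trials=10_000, seed=0):
    """Empirical P value for over-representation of gene_set in one motif.

    motif_counts maps every gene in the network to its number of occurrences
    in the motif; gene_set must be a subset of those genes.
    """
    rng = random.Random(seed)
    network_genes = sorted(motif_counts)
    observed = sum(motif_counts[g] for g in gene_set)
    exceeded = 0
    for _ in range(n_trials):
        # Draw an equally sized random gene set from the whole network.
        sample = rng.sample(network_genes, len(gene_set))
        if observed > sum(motif_counts[g] for g in sample):
            exceeded += 1
    # More than 9500 of 10,000 exceedances corresponds to P < 0.05.
    return 1 - exceeded / n_trials


# Hypothetical network of 100 genes in which only 5 genes occur in the motif.
toy_counts = {f"G{i}": (10 if i < 5 else 0) for i in range(100)}
motif_rich_genes = [f"G{i}" for i in range(5)]
motif_poor_genes = [f"G{i}" for i in range(50, 55)]
p_rich = resampling_p_value(toy_counts, motif_rich_genes, n_trials=1000)
p_poor = resampling_p_value(toy_counts, motif_poor_genes, n_trials=1000)
```

A strict inequality is used so that ties with the random sets do not count in favor of over-representation, which keeps the test conservative.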

Comparison of microRNA targets and functional association

The experimentally identified miRNA–target interactions were obtained from miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/) [41]. To reduce false positives, only miRNA–target interactions supported by at least one strong evidence record or by at least three weak evidence records were retained. We also examined miRNA–target co-expression according to the mirCoX database [29]. For each miRNA–gene pair, the mirCoX database calculates the percentiles of correlation coefficients on both the miRNA side and the gene side. Therefore, the geometric mean of these two percentiles, also known as the mutual rank [42], could serve as a reasonable measurement of miRNA–gene co-expression to filter the miRTarBase miRNA–target pairs. We assigned miRNA–target pairs that have a positive correlation coefficient and mutual rank <0.5 as the positively co-expressed pairs, and those having a negative correlation coefficient and mutual rank >0.5 as the negatively co-expressed pairs.
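The filtering rules in this paragraph can be written down compactly. The Python sketch below is only an illustration: the function names are hypothetical, the percentiles are assumed to lie on a 0–1 scale (suggested by the 0.5 cut-off), and the thresholds are copied as stated in the text.

```python
import math


def keep_interaction(n_strong, n_weak):
    """Retain a miRNA-target pair with >=1 strong or >=3 weak evidence records."""
    return n_strong >= 1 or n_weak >= 3


def mutual_rank(pct_mirna_side, pct_gene_side):
    """Geometric mean of the two correlation percentiles (assumed 0-1 scale)."""
    return math.sqrt(pct_mirna_side * pct_gene_side)


def coexpression_class(correlation, pct_mirna_side, pct_gene_side):
    """Classify a pair using the cut-offs given in the text.

    Returns "positive", "negative", or None when neither rule applies.
    """
    mr = mutual_rank(pct_mirna_side, pct_gene_side)
    if correlation > 0 and mr < 0.5:
        return "positive"
    if correlation < 0 and mr > 0.5:
        return "negative"
    return None
```

For example, a pair with a positive correlation coefficient and percentiles of 0.1 and 0.4 has a mutual rank of 0.2 and is therefore classified as positively co-expressed.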

The functional enrichment (GO biological process) analysis was performed using the gProfileR online tool (http://biit.cs.ut.ee/gprofiler) with default parameters and thresholds, except that unspecific terms associated with more than 1000 genes were excluded before analysis [43]. To reduce redundant terms, we applied the “best per parent group” filtration provided by the gProfileR tool to the significantly enriched terms.

Authors’ contributions

YZ and QC conceived and designed the analysis. QC supervised the study. YZ performed the analysis. YZ wrote the manuscript and QC edited the manuscript. Both authors read and approved the final manuscript.

Competing interests

The authors have declared no competing interests.

Acknowledgments

We thank Yan Huang and Xinhua Liu for assistance in data downloading. This study was supported by the National Natural Science Foundation of China (Grant Nos. 81670462 and 81422006 to QC) and China Postdoctoral Science Foundation (Grant No. 2016M591024 to YZ).

Handled by Yi Xing

Footnotes

Peer review under responsibility of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China.

Supplementary material associated with this article can be found, in the online version, at https://doi.org/10.1016/j.gpb.2018.01.001.

Contributor Information

Yuan Zhou, Email: zhouyuanbioinfo@bjmu.edu.cn.

Qinghua Cui, Email: cuiqinghua@hsc.pku.edu.cn.

Supplementary material

Supplementary Figure S1

The correlation between the corrected number of m6A regulated conditions and the number of co-expressed targeting microRNAs. The correlation curves between the corrected number of m6A regulated conditions and the number of co-expressed targeting microRNAs are plotted using the LOESS smoothing technique. The line indicates the local average estimated by LOESS smoothing and the shade indicates the confidence interval. Outlier genes (0.5%) with extremely high corrected number of m6A regulated conditions are omitted due to their high variation in gene feature values, which could result in badly skewed regression lines. A. Correlation with the number of positively co-expressed targeting microRNAs. B. Correlation with the number of negatively co-expressed targeting microRNAs.

mmc1.pptx (196.3KB, pptx)
Supplementary Figure S2

The correlation between the normalized m6A regulation breadth and various gene features in the quantitative m6A dataset. The correlation curves are plotted using the LOESS smoothing technique. The line indicates the local average estimated by LOESS and the shade indicates the confidence interval. A. Correlation of normalized m6A regulation breadth score with dN/dS ratio. The normalized m6A regulation breadth score of one gene summarizes the m6A peak scores of the gene across 38 conditions, calculated in the same way as the tissue expression specificity (see Materials & Methods for details). B. Correlation of normalized m6A regulation breadth score with tissue expression specificity. C. Correlation of normalized m6A regulation breadth score with PPI network degree. D. Correlation of normalized m6A regulation breadth score with relative level in signaling network. E. Correlation of normalized m6A regulation breadth score with number of targeting microRNAs. F. The summary of Spearman’s correlation coefficients and P values corresponding to the panels A−E. PPI, protein–protein interaction.

mmc2.pptx (417.1KB, pptx)
Supplementary Figure S3

Boxplots for the comparison of various gene features between m6Anone genes, m6Aocca genes, and m6Afreq genes. A. Comparison of dN/dS ratio. B. Comparison of tissue expression specificity. C. Comparison of PPI network degree. Eight proteins with extremely high degree (>800) are considered outliers and thus not shown in the plot. D. Comparison of number of targeting microRNAs. E. Comparison of the relative level in the signaling network. F. The summary of Wilcoxon’s test P values corresponding to the panels A−E.

mmc3.pptx (142.6KB, pptx)
Supplementary Table S1

Information about the 38 conditions covered by the comprehensive m6A dataset.

mmc4.docx (20.3KB, docx)
Supplementary Table S2

Top 20 enriched functional terms for m6Afreq genes.

mmc5.docx (15KB, docx)
Supplementary Table S3

Top 20 enriched functional terms for m6Aocca genes.

mmc6.docx (14.7KB, docx)

References

  • 1. Saletore Y., Meyer K., Korlach J., Vilfan I.D., Jaffrey S., Mason C.E. The birth of the epitranscriptome: deciphering the function of RNA modifications. Genome Biol. 2012;13:175. doi: 10.1186/gb-2012-13-10-175.
  • 2. Meyer K.D., Jaffrey S.R. The dynamic epitranscriptome: N6-methyladenosine and gene expression control. Nat Rev Mol Cell Biol. 2014;15:313–326. doi: 10.1038/nrm3785.
  • 3. Niu Y., Zhao X., Wu Y.S., Li M.M., Wang X.J., Yang Y.G. N6-methyl-adenosine (m6A) in RNA: an old modification with a novel epigenetic function. Genomics Proteomics Bioinformatics. 2013;11:8–17. doi: 10.1016/j.gpb.2012.12.002.
  • 4. Wang X., Lu Z., Gomez A., Hon G.C., Yue Y., Han D. N6-methyladenosine-dependent regulation of messenger RNA stability. Nature. 2014;505:117–120. doi: 10.1038/nature12730.
  • 5. Wang X., Zhao B.S., Roundtree I.A., Lu Z., Han D., Ma H. N6-methyladenosine modulates messenger RNA translation efficiency. Cell. 2015;161:1388–1399. doi: 10.1016/j.cell.2015.05.014.
  • 6. Shi H., Wang X., Lu Z., Zhao B.S., Ma H., Hsu P.J. YTHDF3 facilitates translation and decay of N6-methyladenosine-modified RNA. Cell Res. 2017;27:315–328. doi: 10.1038/cr.2017.15.
  • 7. Li A., Chen Y.S., Ping X.L., Yang X., Xiao W., Yang Y. Cytoplasmic m6A reader YTHDF3 promotes mRNA translation. Cell Res. 2017;27:444–447. doi: 10.1038/cr.2017.10.
  • 8. Xiao W., Adhikari S., Dahal U., Chen Y.S., Hao Y.J., Sun B.F. Nuclear m6A reader YTHDC1 regulates mRNA splicing. Mol Cell. 2016;61:507–519. doi: 10.1016/j.molcel.2016.01.012.
  • 9. Hsu P.J., Zhu Y., Ma H., Guo Y., Shi X., Liu Y. Ythdc2 is an N6-methyladenosine binding protein that regulates mammalian spermatogenesis. Cell Res. 2017;27:1115–1127. doi: 10.1038/cr.2017.99.
  • 10. Meyer K.D., Patil D.P., Zhou J., Zinoviev A., Skabkin M.A., Elemento O. 5' UTR m6A promotes cap-independent translation. Cell. 2015;163:999–1010. doi: 10.1016/j.cell.2015.10.012.
  • 11. Liu N., Dai Q., Zheng G., He C., Parisien M., Pan T. N6-methyladenosine-dependent RNA structural switches regulate RNA-protein interactions. Nature. 2015;518:560–564. doi: 10.1038/nature14234.
  • 12. Roost C., Lynch S.R., Batista P.J., Qu K., Chang H.Y., Kool E.T. Structure and thermodynamics of N6-methyladenosine in RNA: a spring-loaded base modification. J Am Chem Soc. 2015;137:2107–2115. doi: 10.1021/ja513080v.
  • 13. Alarcon C.R., Lee H., Goodarzi H., Halberg N., Tavazoie S.F. N6-methyladenosine marks primary microRNAs for processing. Nature. 2015;519:482–485. doi: 10.1038/nature14281.
  • 14. Meyer K.D., Saletore Y., Zumbo P., Elemento O., Mason C.E., Jaffrey S.R. Comprehensive analysis of mRNA methylation reveals enrichment in 3' UTRs and near stop codons. Cell. 2012;149:1635–1646. doi: 10.1016/j.cell.2012.05.003.
  • 15. Dominissini D., Moshitch-Moshkovitz S., Schwartz S., Salmon-Divon M., Ungar L., Osenberg S. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012;485:201–206. doi: 10.1038/nature11112.
  • 16. Liu H., Flores M.A., Meng J., Zhang L., Zhao X., Rao M.K. MeT-DB: a database of transcriptome methylation in mammalian cells. Nucleic Acids Res. 2015;43:D197–D203. doi: 10.1093/nar/gku1024.
  • 17. Liu H., Wang H., Wei Z., Zhang S., Hua G., Zhang S.W. MeT-DB V2.0: elucidating context-specific functions of N6-methyl-adenosine methyltranscriptome. Nucleic Acids Res. 2018;46:D281–D287. doi: 10.1093/nar/gkx1080.
  • 18. Linder B., Grozhik A.V., Olarerin-George A.O., Meydan C., Mason C.E., Jaffrey S.R. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat Methods. 2015;12:767–772. doi: 10.1038/nmeth.3453.
  • 19. Chen W., Tran H., Liang Z., Lin H., Zhang L. Identification and analysis of the N6-methyladenosine in the Saccharomyces cerevisiae transcriptome. Sci Rep. 2015;5:13859. doi: 10.1038/srep13859.
  • 20. Zhou Y., Zeng P., Li Y.H., Zhang Z., Cui Q. SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res. 2016;44 doi: 10.1093/nar/gkw104.
  • 21. Sun W.J., Li J.H., Liu S., Wu J., Zhou H., Qu L.H. RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data. Nucleic Acids Res. 2016;44:D259–D265. doi: 10.1093/nar/gkv1036.
  • 22. Liang H., Li W.H. Gene essentiality, gene duplicability and protein connectivity in human and mouse. Trends Genet. 2007;23:375–378. doi: 10.1016/j.tig.2007.04.005.
  • 23. Blomen V.A., Májek P., Jae L.T., Bigenzahn J.W., Nieuwenhuis J., Staring J. Gene essentiality and synthetic lethality in haploid human cells. Science. 2015;350:1092–1096. doi: 10.1126/science.aac7557.
  • 24. Logue J.S., Morrison D.K. Complexity in the signaling network: insights from the use of targeted inhibitors in cancer therapy. Genes Dev. 2012;26:641–650. doi: 10.1101/gad.186965.112.
  • 25. Balazsi G., Barabasi A.L., Oltvai Z.N. Topological units of environmental signal processing in the transcriptional regulatory network of Escherichia coli. Proc Natl Acad Sci U S A. 2005;102:7841–7846. doi: 10.1073/pnas.0500365102.
  • 26. Ma W., Trusina A., El-Samad H., Lim W.A., Tang C. Defining network topologies that can achieve biochemical adaptation. Cell. 2009;138:760–773. doi: 10.1016/j.cell.2009.06.013.
  • 27. Cui Q., Yu Z., Purisima E.O., Wang E. Principles of microRNA regulation of a human cellular signaling network. Mol Syst Biol. 2006;2:46. doi: 10.1038/msb4100089.
  • 28. Goentoro L., Shoval O., Kirschner M.W., Alon U. The incoherent feedforward loop can provide fold-change detection in gene regulation. Mol Cell. 2009;36:894–899. doi: 10.1016/j.molcel.2009.11.018.
  • 29. Giles C.B., Girija-Devi R., Dozmorov M.G., Wren J.D. mirCoX: a database of miRNA-mRNA expression correlations derived from RNA-seq meta-analysis. BMC Bioinformatics. 2013;14:S17. doi: 10.1186/1471-2105-14-S14-S17.
  • 30. Molinie B., Wang J., Lim K.S., Hillebrand R., Lu Z.X., Van Wittenberghe N. m6A-LAIC-seq reveals the census and complexity of the m6A epitranscriptome. Nat Methods. 2016;13:692–698. doi: 10.1038/nmeth.3898.
  • 31. Wang Y., Lee C.G. MicroRNA and cancer–focus on apoptosis. J Cell Mol Med. 2009;13:12–23. doi: 10.1111/j.1582-4934.2008.00510.x.
  • 32. Kryuchkova-Mostacci N., Robinson-Rechavi M. A benchmark of gene expression tissue-specificity metrics. Brief Bioinform. 2017;18:205–214. doi: 10.1093/bib/bbw008.
  • 33. Yanai I., Benjamin H., Shmoish M., Chalifa-Caspi V., Shklar M., Ophir R. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics. 2005;21:650–659. doi: 10.1093/bioinformatics/bti042.
  • 34. Fu Y., Dominissini D., Rechavi G., He C. Gene expression regulation mediated through reversible m6A RNA methylation. Nat Rev Genet. 2014;15:293–306. doi: 10.1038/nrg3724.
  • 35. Uhlen M., Oksvold P., Fagerberg L., Lundberg E., Jonasson K., Forsberg M. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 2010;28:1248–1250. doi: 10.1038/nbt1210-1248.
  • 36. Yates A., Akanni W., Amode M.R., Barrell D., Billis K., Carvalho-Silva D. Ensembl 2016. Nucleic Acids Res. 2016;44:D710–D716. doi: 10.1093/nar/gkv1157.
  • 37. Altenhoff A.M., Skunca N., Glover N., Train C.M., Sueki A., Pilizota I. The OMA orthology database in 2015: function predictions, better plant support, synteny view and other improvements. Nucleic Acids Res. 2015;43:D240–D249. doi: 10.1093/nar/gku1158.
  • 38. Chatr-Aryamontri A., Oughtred R., Boucher L., Rust J., Chang C., Kolas N.K. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45:D369–D379. doi: 10.1093/nar/gkw1102.
  • 39. Liu X., Zeng P., Cui Q., Zhou Y. Comparative analysis of genes frequently regulated by drugs based on connectivity map transcriptome data. PLoS One. 2017;12 doi: 10.1371/journal.pone.0179037.
  • 40. Su A.I., Wiltshire T., Batalov S., Lapp H., Ching K.A., Block D. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci U S A. 2004;101:6062–6067. doi: 10.1073/pnas.0400782101.
  • 41. Chou C.H., Chang N.W., Shrestha S., Hsu S.D., Lin Y.L., Lee W.H. miRTarBase 2016: updates to the experimentally validated miRNA-target interactions database. Nucleic Acids Res. 2016;44:D239–D247. doi: 10.1093/nar/gkv1258.
  • 42. Obayashi T., Okamura Y., Ito S., Tadaka S., Motoike I.N., Kinoshita K. COXPRESdb: a database of comparative gene coexpression networks of eleven species for mammals. Nucleic Acids Res. 2013;41:D1014–D1020. doi: 10.1093/nar/gks1014.
  • 43. Reimand J., Arak T., Adler P., Kolberg L., Reisberg S., Peterson H. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res. 2016;44:W83–W89. doi: 10.1093/nar/gkw199.

Articles from Genomics, Proteomics & Bioinformatics are provided here courtesy of Oxford University Press
