Nucleic Acids Research. 2020 May 14;48(11):6251–6264. doi: 10.1093/nar/gkaa347

Dynamic landscape and evolution of m6A methylation in human

Hui Zhang 1,3, Xinrui Shi 1,3, Tao Huang 1,✉,3, Xueni Zhao 1, Wanying Chen 1, Nannan Gu 1, Rui Zhang 1,2,
PMCID: PMC7293016  PMID: 32406913

Abstract

m6A is a prevalent internal modification in mRNAs and has been linked to diverse effects on mRNA fate. To explore the landscape and evolution of human m6A, we generated 27 m6A methylomes across major adult tissues. These data reveal dynamic m6A methylation across tissue types, uncover both broadly and tissue-specifically methylated sites, and identify an unexpected enrichment of m6A methylation at non-canonical cleavage sites. A comparison of fetal and adult m6A methylomes reveals that m6A preferentially occupies CDS regions in fetal tissues. Moreover, the m6A sub-motifs vary between fetal and adult tissues and across tissue types. From the evolutionary perspective, we find that the selection pressure on m6A sites varies and depends on their genic locations. Unexpectedly, we found that ∼40% of the 3′UTR m6A sites are under negative selection, which is higher than the evolutionary constraint on miRNA binding sites, and much higher than that on A-to-I RNA modification. Moreover, the recently gained m6A sites in human populations are clearly under positive selection and associated with traits or diseases. Our work provides a resource of human m6A profiles for future studies of m6A functions, and suggests a role of m6A modification in human evolutionary adaptation and disease susceptibility.

INTRODUCTION

Chemical modifications on RNA have been recently appreciated as an important regulatory feature (1). Recent technological breakthroughs, driven mainly by the sequencing-based approaches, have enabled the genome-wide profiling of such RNA modifications, particularly the RNA deamination and methylation (2–6). However, except for A-to-I (adenosine to inosine) RNA editing, which is the predominant type of RNA deamination in animals (7–12), less is known about the dynamics and evolution of most RNA modifications.

m6A is one of the most prevalent internal modifications in mRNAs (2,13–16). It is present in eukaryotic species ranging from yeast, plants and flies to mammals. m6A RNA methylation is catalyzed by a multicomponent methyltransferase complex, including METTL3, METTL14 and WTAP (17,18). It has a consensus motif RRACH (in which R represents A or G, and H represents A, C or U). m6A methylation regulates the splicing, expression, decay and translation of mRNAs (19–21), and plays crucial roles in various cellular pathways and processes such as cell differentiation, development and metabolism (15). To date, m6A has been identified in several thousand human protein-coding genes. Although m6A profiles of many cultured human cell lines and fetal human tissues have been reported (22,23), we still have limited information about the global landscape and dynamics of m6A in adult human tissues.

It has been hypothesized that gene regulation, ranging from transcriptional processing to post-transcriptional regulation, has a central role in phenotypic evolution (24–26). Therefore, a fundamental question in biology is to understand how natural selection has shaped the evolution of gene regulation (27,28), including RNA modifications. Some studies have shown that m6A peaks, which typically span one to several hundred nt, are conserved between human and mouse (2) and that m6A peak regions have much higher sequence conservation scores than randomly selected regions (13). Others, however, suggest that only 37% of the m6A peaks are conserved between human and rhesus macaque (29), and that the sequence of the m6A RAC central motif is only slightly more conserved than that of control RAC sites (30). In addition, a recent study suggested that most m6A sites in CDS regions are evolutionarily unconserved (31). However, those studies were limited in scope and scale; thus, a systematic investigation of the selection pressure on individual m6A sites is needed.

MATERIALS AND METHODS

Sample procurement

Samples of nine human adult tissues were obtained from the Chinese Brain Bank Center (Wuhan, China). These tissues were collected post-mortem from individuals with no known medical history. Consent for the use of autopsy tissue samples was obtained from the patients’ families. Samples were lysed and homogenized in TRIzol Reagent (Invitrogen) using a Precellys Evolution tissue homogenizer (Bertin). Total RNA was extracted using chloroform and isopropanol following the manufacturer's protocol. The quality of the total RNA was determined by agarose gel electrophoresis, and three biological replicates of RNA samples with thick 28S and 18S ribosomal RNA (rRNA) gel bands at an approximate mass ratio of 2:1 were selected. The tissues came from five donors (N1–N5): frontal cortex (N1–N3), cerebellum (N1–N3), heart (N1–N3), liver (N1–N3), lung (N1, N3, N5), kidney (N1, N2, N5), spleen (N1, N2, N5), muscle (N2–N4) and testis (N1–N3). N1, male, age 39; N2, male, age 44; N3, male, age 47; N4, male, age 57; N5, male, age 44.

m6A-seq library preparation

m6A immunoprecipitation and library construction were performed as described previously with some modifications (2). In brief, samples were lysed and homogenized in TRIzol Reagent (Invitrogen) using a Precellys Evolution tissue homogenizer (Bertin). Total RNA was extracted using chloroform and isopropanol following the manufacturer's protocol. Polyadenylated mRNA was enriched from total RNA using the GenElute mRNA miniprep kit (Sigma-Aldrich). RNA samples were fragmented in 1X Next Magnesium RNA Fragmentation Buffer (NEB) at 94°C for 5 min, and fragmented RNA was then cleaned up using ethanol precipitation. 10 ng of fragmented RNA was used to construct the input control library with the VAHTS Stranded mRNA-seq library prep kit (Vazyme). 15–40 μg of fragmented RNA was further incubated with 5 μg rabbit anti-m6A polyclonal antibody (Synaptic Systems, catalog number 202 003) in IPP buffer (150 mM NaCl, 0.1% Igepal CA-630, 10 mM Tris–HCl, pH 7.4) overnight at 4°C. The m6A-Ab mixture was then immunoprecipitated by incubation with protein-G magnetic beads (Thermo Fisher, pre-blocked with 0.5 mg ml−1 BSA at 4°C for 2 h) at 4°C for another 2 h. After washing with IPP buffer, bound RNA was competitively eluted from the beads with 0.5 mg ml−1 N6-methyladenosine (Sigma-Aldrich), followed by ethanol precipitation. RNA was resuspended in 8 μl water and used for library construction. Libraries were sequenced on HiSeq X (Illumina) to produce paired-end 150 bp reads.

MeRIP-seq of HEK293T cells

MeRIP-seq was performed as described previously (23). In brief, total RNA of HEK293T cells was extracted and fragmented in Next Magnesium RNA Fragmentation Buffer (NEB) at 94°C for 5 min. 10 ng of fragmented RNA was used to construct the input library. 300 μg of fragmented RNA was incubated with 5 μg rabbit anti-m6A polyclonal antibody (Abcam, catalog number ab151230) overnight at 4°C. After stringent washing, bound RNA was eluted by competition with N6-methyladenosine (Sigma-Aldrich), followed by rRNA removal with QIAseq FastSelect RNA Removal Kits (Qiagen). Both the input and IP libraries were constructed using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (NEB). Libraries were sequenced on HiSeq X (Illumina) to produce paired-end 150 bp reads.

METTL3 knockout cell generation

METTL3 knockout cells were generated via CRISPR/Cas9-induced mutagenesis. In brief, an sgRNA sequence (GCAGAAGCGGCGTGCAGAAC) was designed using CRISPR-ERA (http://CRISPR-ERA.stanford.edu). The sgRNA template oligonucleotide was synthesized and cloned into the lentiCRISPR v2 plasmid (Addgene #52961). The plasmid was transfected into HEK293T cells. Transfected cells were selected using puromycin (1 μg/ml). Mutant clones were identified by Sanger sequencing. The loss of METTL3 protein expression was verified by western blot with a METTL3 antibody (Proteintech, catalog number 15073-1-AP). The m6A levels of HEK293T wild-type and METTL3 knockout cells were measured using the EpiQuik m6A RNA Methylation Quantification Kit (Epigentek). In brief, polyadenylated RNA was separated from total RNA using Oligo dT Magnetic Beads (Vazyme). 300 ng of polyadenylated mRNA was used for m6A quantification following the manufacturer's protocol.

m6A-seq data analysis

m6A peak identification was performed as previously described (32). In brief, we trimmed adaptors and low-quality reads using Cutadapt (33) (-e 0.3 --minimum-length 25; --trim-n -q 20,20) and the fastx toolkit (fastx_trimmer -f 6; -t 5 -m 20). rRNA reads were then removed using SortMeRNA (34). Next, clean reads were mapped to the human genome (hg19) using TopHat2 (version 2.1.0) (35). Enriched peaks were identified by scanning each gene using 100-nt sliding windows, and calculating an enrichment score for each sliding window (winscore).

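$$\mathrm{winscore} = \log_2\left(\frac{\mathrm{MeanWinIP}/\mathrm{MedianGeneIP}}{\mathrm{MeanWinControl}/\mathrm{MedianGeneControl}}\right)$$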

MeanWinIP and MeanWinControl are the mean coverage of each window for immunoprecipitation and input control, respectively. MedianGeneIP and MedianGeneControl are the gene median coverages for immunoprecipitation and input control, respectively. Windows with RPKM ≥10 in the IP sample and an enrichment score ≥2, located in genes with RPKM ≥1 in the input sample, were defined as enriched windows. Last, only peaks with winscore ≥2 in at least two samples of a tissue type were considered real m6A peaks.
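The window-scoring and filtering rules above can be sketched in a few lines of Python. This is an illustration only: it assumes the winscore is the log2 of the IP enrichment ratio over the input enrichment ratio, and the function names (`winscore`, `is_enriched_window`, `is_real_peak`) are hypothetical, not taken from the authors' pipeline.

```python
import math


def winscore(mean_win_ip, median_gene_ip, mean_win_control, median_gene_control):
    """Enrichment score of one 100-nt window.

    Assumption: log2 of (window/gene coverage in IP) over
    (window/gene coverage in input control).
    """
    ip_ratio = mean_win_ip / median_gene_ip
    control_ratio = mean_win_control / median_gene_control
    return math.log2(ip_ratio / control_ratio)


def is_enriched_window(score, window_ip_rpkm, gene_input_rpkm):
    """Apply the three thresholds stated in the text."""
    return window_ip_rpkm >= 10 and score >= 2 and gene_input_rpkm >= 1


def is_real_peak(scores_by_sample, min_score=2, min_samples=2):
    """Keep a peak if at least two samples of a tissue reach the cutoff."""
    return sum(s >= min_score for s in scores_by_sample) >= min_samples


# Example: a window with 8x enrichment in IP and 2x in input scores log2(4).
score = winscore(80, 10, 20, 10)
```

Under this assumption, a winscore of 2 corresponds to a fourfold enrichment of the window in the IP sample relative to the input.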

To generate the metagene profile of m6A site distribution across transcripts, we first determined the number of bins that need to be divided for a given gene based on relative lengths between 5′UTR, CDS and 3′UTR of the human transcriptome (GENCODE v26). The relative lengths between 5′UTR, CDS and 3′UTR are 10:50:40, thus for each gene, 10, 50 and 40 bins of equal length were made for 5′UTR, CDS and 3′UTR, respectively. Next, for each m6A site, we assigned it to the longest isoform of the corresponding gene and determined which bin it is located in. Last, the number of sites for each bin was summarized and the curve was fitted with polyfit.
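The bin assignment described above can be sketched as follows. This is a minimal illustration, assuming 0-based positions on the mature transcript of the longest isoform; `metagene_bin` and `metagene_counts` are hypothetical helper names, and the curve-fitting step is not shown.

```python
def metagene_bin(pos, utr5_len, cds_len, utr3_len, bins=(10, 50, 40)):
    """Return the 0-based metagene bin (0..99) of a transcript position.

    pos is a 0-based coordinate counted from the transcript 5' end.
    The 5'UTR, CDS and 3'UTR are split into 10, 50 and 40 equal bins.
    """
    if pos < 0:
        raise ValueError("position must be non-negative")
    offset = 0  # number of bins used by the regions already passed
    start = 0   # transcript coordinate where the current region begins
    for length, n_bins in zip((utr5_len, cds_len, utr3_len), bins):
        if pos < start + length:
            fraction = (pos - start) / length
            return offset + min(int(fraction * n_bins), n_bins - 1)
        start += length
        offset += n_bins
    raise ValueError("position lies outside the transcript")


def metagene_counts(sites, total_bins=100):
    """Count sites per bin; each site is (pos, utr5_len, cds_len, utr3_len)."""
    counts = [0] * total_bins
    for pos, utr5_len, cds_len, utr3_len in sites:
        counts[metagene_bin(pos, utr5_len, cds_len, utr3_len)] += 1
    return counts
```

For a transcript with a 100-nt 5′UTR, 500-nt CDS and 400-nt 3′UTR, position 600 (the first 3′UTR base) falls in bin 60, just after the stop codon.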

The mapping statistics of all datasets were summarized in Supplementary Table S1.

m6A peak call via exomePeak and MeTPeak

IP and input reads were mapped as described above. For each tissue type, the consistent peaks in all replicates were called using exomePeak (36) or MeTPeak (37) with default parameters.

Tissue specificity analysis

For each site, we first calculated its average winscore in each tissue type. Next, we calculated its tissue specificity index tau (38) using the average winscore of each site:

$$\tau = \frac{\sum_{i=1}^{n} \left(1 - \hat{x}_i\right)}{n - 1}, \qquad \hat{x}_i = \frac{x_i}{\max_{1 \le i \le n} x_i}$$

where $x_i$ is the average winscore of a site in tissue i and n is the number of tissues.

Shared m6A sites were defined as sites with tau <0.15. Tissue-specific m6A sites were defined as sites with tau >0.6.
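As a small worked example of the index and the two cutoffs, the sketch below computes tau from a list of per-tissue average winscores, using the standard definition of tau from (38). The function names and the "intermediate" label for sites between the two cutoffs are illustrative choices, not part of the original analysis.

```python
def tau(avg_winscores):
    """Tissue specificity index for one site.

    avg_winscores holds the average winscore of the site in each tissue.
    Returns 0 for uniform methylation and 1 for a single-tissue site.
    """
    n = len(avg_winscores)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    peak = max(avg_winscores)
    if peak <= 0:
        raise ValueError("at least one tissue must have a positive score")
    return sum(1 - x / peak for x in avg_winscores) / (n - 1)


def classify_site(t, shared_cutoff=0.15, specific_cutoff=0.6):
    """Label a site using the cutoffs given in the text."""
    if t < shared_cutoff:
        return "shared"
    if t > specific_cutoff:
        return "tissue-specific"
    return "intermediate"  # hypothetical label for sites in between


uniform = tau([3.0] * 9)          # same score in all nine tissues
single = tau([0.0] * 8 + [4.0])   # signal in one tissue only
```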

m6A-RIP qPCR

RIP was performed as described above. Both the IP and input RNAs were reverse transcribed and the m6A marked mRNAs and NC (GAPDH) mRNA were detected by qPCR. The enrichment fold of IP versus input of each gene was calculated and normalized to NC. Primers were listed as follows:

ATP5C1-F GGGAGCTTCGGCGCAT

ATP5C1-R CGCGCGAGAGAACATGGTAG

DCTN1-F GCACGGTTCCTGACAAGTCTA

DCTN1-R GACACAGAATCCTGCTTGCC

PSMB4-F ATGGAAGCGTTTTTGGGGTC

PSMB4-R GAGTGGACGGAATGCGGTAA

SDHAF2-F GCCTTGCTTCCGGCTTCTTA

SDHAF2-R TGTCCATCACTTGAGGCAGG

GAPDH-F TGCCAAATATGATGACATCAAGAA

GAPDH-R GGAGTGGGTGTCGCTGTTG

APA data analysis

Human APA cleavage sites were downloaded from PolyA_DB version 3.2 (http://exon.umdnj.edu/polya_db/v3/). PolyA_DB version 3.2 catalogs polyA sites using deep sequencing data.

3′ processing efficiency measurement assay

We used a bicistronic luciferase reporter construct, pPASPORT, to measure 3′ processing efficiency (39). The 3′UTR of each selected gene was inserted into pPASPORT, between the Renilla and Firefly luciferase genes. Plasmids were transfected separately into wild-type and METTL3 knockout HEK293T cells using Lipofectamine 3000 (Thermo Fisher Scientific). Renilla and Firefly luminescence was measured 24 h later using the Dual-Glo Luciferase Assay System (Promega) on a GloMax-96 Microplate Luminometer (Promega). All primers used to construct the reporter genes are listed in Supplementary Table S2.

Sub-motif analysis

To obtain the expected numbers of windows with both GGACH and AAACH sub-motifs, the m6A sub-motif sequences were shuffled within all m6A peaks of a given sample. Next, the number of windows with both sub-motifs was calculated. We repeated the shuffling 10 000 times and obtained 10 000 expected numbers. To plot and compare results from different samples, we normalized the values to the mean of the expected numbers. In brief, for a given sample, we first calculated the mean value of the expected window numbers (Mexpected). Next, both the observed number and the 10 000 expected numbers were divided by Mexpected.
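The shuffling procedure can be sketched as a label permutation. The example below assumes each window is represented by the list of sub-motifs of its m6A sites, and that shuffling reassigns sub-motif labels across all sites while keeping the number of sites per window fixed; this reading of "shuffled within all m6A peaks" and all function names are assumptions.

```python
import random


def count_both(windows, a="GGACH", b="AAACH"):
    """Number of windows that contain both sub-motifs."""
    return sum(1 for motifs in windows if a in motifs and b in motifs)


def shuffled_counts(windows, n_iter=10000, seed=0):
    """Expected counts after shuffling sub-motif labels across all sites."""
    rng = random.Random(seed)
    sizes = [len(w) for w in windows]
    pool = [motif for w in windows for motif in w]
    counts = []
    for _ in range(n_iter):
        rng.shuffle(pool)
        labels = iter(pool)
        # Refill each window with as many shuffled labels as it had sites.
        reshuffled = [[next(labels) for _ in range(size)] for size in sizes]
        counts.append(count_both(reshuffled))
    return counts


def normalize_to_mean(observed, expected):
    """Divide the observed and expected counts by the mean expected count."""
    m_expected = sum(expected) / len(expected)
    if m_expected == 0:
        raise ValueError("mean expected count is zero")
    return observed / m_expected, [e / m_expected for e in expected]


windows = [["GGACH", "GGACH"], ["AAACH", "AAACH"], ["GGACH"], ["AAACH"]]
observed = count_both(windows)
expected = shuffled_counts(windows, n_iter=200, seed=1)
obs_norm, exp_norm = normalize_to_mean(observed, expected)
```

After normalization the expected distribution is centered on 1, so an observed ratio well below 1 indicates that the two sub-motifs co-occur in the same window less often than under shuffling.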

Rejected substitution score acquisition

The rejected substitution score data were from Sidow lab (http://mendel.stanford.edu/SidowLab/downloads/gerp/).

Cross-species analysis

To conduct the CDS m6A conservation analysis, we used the method from a previous study (31) with some modifications. First, we required that the control A sites and the m6A sites were in genes with similar dN/dS values (human versus mouse). Second, we required that the control A sites and the m6A sites were with similar distances to the stop codon. The human-mouse pairwise alignment file was downloaded from UCSC genome browser, and human-mouse dN and dS value table was obtained via Ensembl biomart.

The proportion of m6A sites in the third codon that are under evolutionary constraints between human and mouse was calculated as: (the evolutionary rate at m6A sites (1 − 0.644) − the mean evolutionary rate at control sites (1 − 0.621))/(1 − 0.621).
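The arithmetic of this expression is worked through below. The sketch reads 0.644 and 0.621 as the conserved fractions of m6A and control sites, so that one minus each value is the evolutionary rate; that reading, and the function name, are assumptions.

```python
def relative_rate_difference(conserved_m6a, conserved_control):
    """(rate at m6A sites - rate at control sites) / rate at control sites."""
    rate_m6a = 1 - conserved_m6a
    rate_control = 1 - conserved_control
    return (rate_m6a - rate_control) / rate_control


value = relative_rate_difference(0.644, 0.621)
print(round(value, 3))  # -0.061
```

The result is negative because the rate at m6A sites (0.356) is lower than at control sites (0.379); its magnitude, about 6%, is the relative reduction in evolutionary rate at m6A sites.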

To determine the age of individual m6A sites, pairwise alignment files were downloaded from UCSC genome browser.

RNA editing site selection

Human RNA editing sites were downloaded from RADAR database (40). RADAR version 2, which includes 16 464 human RNA editing sites located at 3′UTR region, was used for analysis.

miRNA target site selection

miRNA binding sites were downloaded from TargetScan database (Release 7.1 http://www.targetscan.org/). The TargetScan algorithm predicts biological targets of miRNAs by searching for the presence of 8mer and 7mer sites that match the seed region of each miRNA (41). A total of 669 927 and 601 858 target sites of conserved and broadly conserved miRNA families were analyzed, separately. Conserved miRNA families are conserved across most mammals, but usually not beyond placental mammals. Broadly conserved miRNA families are conserved across most vertebrates, usually to zebrafish.

SNP data

We downloaded SNP and genotype data from the 1000 Genome Project (http://www.internationalgenome.org/). We discarded all insertion and deletion polymorphisms, SNPs with more than two alleles, SNPs monomorphic (that is, having only one allele) in all populations and SNPs that did not map uniquely to the human genome (hg19). Finally, a total of 77 664 537 SNPs were used for analysis.

Derived allele frequency (DAF) analysis

For each SNP, we extracted the ancestral allele information from the downloaded VCF files of the 1000 Genome Project. For an RRAC motif containing an SNP, we defined it as a gain of an m6A site if the derived allele created an RRAC motif.
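The gain criterion can be sketched as a motif check on the two alleles. The example assumes the flanking sequence is given in uppercase on the transcript strand and that a gain means an RRAC window overlapping the SNP matches with the derived allele but not with the ancestral allele; the function names are hypothetical.

```python
def is_rrac(kmer):
    """True if a 4-mer matches RRAC (R = A or G)."""
    return (
        len(kmer) == 4
        and kmer[0] in "AG"
        and kmer[1] in "AG"
        and kmer[2] == "A"
        and kmer[3] == "C"
    )


def with_allele(context, snp_index, allele):
    """Return the context with the given allele placed at the SNP position."""
    return context[:snp_index] + allele + context[snp_index + 1:]


def gains_rrac(context, snp_index, ancestral, derived):
    """True if the derived allele creates an RRAC motif absent in the ancestor."""
    anc = with_allele(context, snp_index, ancestral)
    der = with_allele(context, snp_index, derived)
    first = max(0, snp_index - 3)
    last = min(snp_index, len(context) - 4)
    # Check every 4-mer window that overlaps the SNP position.
    for start in range(first, last + 1):
        if is_rrac(der[start:start + 4]) and not is_rrac(anc[start:start + 4]):
            return True
    return False
```

For example, with context `TTGGNCTT` and the SNP at index 4, a T-to-A change turns GGTC into GGAC and is counted as a gain, whereas the reverse change is not.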

Fst calculation

VCFtools was used to calculate Fst between populations (42).

Haplotype homozygosity-based tests

The VCF files were obtained from the 1000 Genome Project. For each m6A SNP, we extracted SNPs within ±2 Mb to generate individual VCF files. The Perl script vcf2impute_legend_haps.pl from impute2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html#scripts) was used to convert a VCF file into reference panel format: one legend file and one haplotype file. The VCF files of some SNPs could not be converted, so these SNPs were excluded from further analysis. The hap format file and map format file required by the R package ‘rehh’ and selscan were generated from the legend and haplotype files using in-house Perl scripts. iHS was calculated using the ‘ihh2ihs’ function of the rehh package in R (43). XP-EHH was calculated using the ‘ies2xpehh’ function of the rehh package.

Overlap with GTEx eQTL SNPs

GTEx v6p eQTL file was downloaded from GTEx website (https://gtexportal.org/home/). Only significantly associated SNPs were used.

Overlap with GWAS data

The NHGRI-EBI Catalog of Published GWAS (44), which contains 68 741 trait-associated SNPs, was used to query for overlaps with the 102 m6A SNPs.

To find proxy SNPs (nearby SNPs in linkage disequilibrium) of m6A SNPs that are GWAS hits, we identified all SNPs in strong linkage disequilibrium with high Fst (Fst > 0.15) SNPs. This step was implemented via the VCFtools by extracting all SNPs with pairwise r2 > 0.8 based on CEU, YRI, CHB, PUR and GIH populations from the 1000 Genome Project.

Gene ontology (GO) term analysis

GO term enrichment analysis was performed using the R package clusterProfiler (45).

RESULTS

m6A profiles across human adult tissues

To explore the landscape of m6A in adult human tissues, we constructed m6A-seq libraries from nine human adult tissues, including cerebellum, frontal cortex, heart, kidney, liver, lung, muscle, spleen and testis. For each tissue, three individuals were profiled. m6A peaks were called in each sample using a winscore approach as previously described (32). m6A peaks found in at least two samples of a tissue type were considered real m6A peaks. On average, we found 19 100 m6A peaks per tissue (Supplementary Table S3). Samples of the same tissue type clustered well according to either their m6A levels or gene expression levels (Figure 1A, B). Sequence logo analysis of all datasets confirmed that the called peaks are enriched in the m6A consensus motif RRACH (Supplementary Figure S1), consistent with a previous observation (2). We next inferred the m6A sites by searching for RRACH motifs within the peaks (Materials and Methods). As expected, m6A sites preferentially appeared around stop codons (Figure 1C). To evaluate the performance of the winscore peak-calling method, we applied two additional peak-calling algorithms (exomePeak (36) and MeTPeak (37)) and compared the m6A sites called by the different methods. Most sites identified by our method overlapped with those identified by exomePeak or MeTPeak, whereas exomePeak and MeTPeak each called many more unique peaks (Supplementary Figure S2A-B). As expected, sites called by our method had higher winscores than sites called by exomePeak or MeTPeak (Supplementary Figure S2C). Thus, sites called using the winscore approach seemed to represent a more stringent set of m6A sites and were used for all following analyses. Combining all data, we obtained a union of 101 340 m6A sites that span various classes of genic regions (Figure 1D, Supplementary Table S4).

Figure 1.

m6A profile in human adult tissues. (A, B) Heatmap of Pearson correlation on m6A peak winscores (A) or gene expression levels (B) of protein-coding genes. Gene expression levels were quantified as the number of RNA-seq reads per kilobase of transcript per million mapped reads (RPKM). (C) The distribution of m6A sites across the length of mRNA transcripts for nine adult tissues. 5′UTRs, CDSs and 3′UTRs of protein-coding genes were individually grouped into 10, 50 and 40 bins of their total length, and the percentage of m6A sites that fall within each bin was determined. (D) Genic locations of m6A sites of adult tissues. Sites in nine adult tissues were merged for analysis. Sites were annotated using Ensembl gene annotations and ANNOVAR software. ncRNA, noncoding RNA. (E) The distribution of the numbers of m6A sites that are methylated in one or more tissues. (F) The distribution of the numbers of m6A sites that are methylated in one or more tissues. Only m6A sites within the ubiquitously expressed genes (FPKM > 3 in all tissues) were analyzed. (G) The distribution of tissue-specific or shared m6A sites across the length of mRNA transcripts for nine adult tissues. Tissue-specific m6A sites, sites that are within the ubiquitously expressed genes and have a tau >0.6; shared m6A sites, sites that are within the ubiquitously expressed genes and have a tau <0.15. (H) GO terms enriched in the ubiquitously expressed genes with shared m6A sites (tau < 0.15). GO term analysis was performed using clusterProfiler. All ubiquitously expressed genes were used as the background. P values were corrected by Bonferroni adjustment and the top 15 enriched GO terms are shown.

To reveal the methylation landscape across adult tissues, we examined to what extent the m6A sites are shared between tissues. We found that more than 36.7% of the sites were found in one specific tissue type and only 5.5% of the sites were shared in all tissues we studied (Figure 1E); thus, m6A methylation seems to have high tissue-specificity. The profiles of four selected m6A peaks across tissues, as well as the m6A-RIP qPCR validation results, are shown in Supplementary Figure S3. To further investigate the effect of differential gene expression on the tissue-specificity of m6A sites, we examined m6A sites located in genes that are ubiquitously expressed across human tissues. In the 4063 ubiquitously expressed genes, 23 167 sites were identified. Of these sites, 27.4% were methylated in only one tissue type and 12.4% were shared by all tissue types (Figure 1F). These data together suggest that both tissue-specific gene expression and tissue-specific methylation contribute to the observed high tissue-specificity of m6A sites.

To better investigate the tissue-specificity of individual m6A sites, we applied the widely used tissue specificity index tau to measure tissue-specific methylation (Materials and Methods). Tau varies from 0 to 1, where 0 indicates a site that is broadly methylated across tissues and 1 indicates a site that is specific to one tissue. Interestingly, when examining the genic locations of tissue-specific (tau > 0.6) and shared m6A sites (tau < 0.15), we found that shared sites tended to be located around the stop codon, while tissue-specific sites tended to be away from the stop codon (Figure 1G). This result suggests that m6A sites away from the stop codon may perform tissue-specific functions, while m6A sites around the stop codon are more likely to be required for the maintenance of basic cellular function and are thus methylated in all cells of an organism. Consistently, the genes with shared sites were enriched in essential functions such as chromatin organization, cellular catabolic process and histone modification (Figure 1H). This tissue-specific methylation pattern was confirmed using RNA-endoribonuclease–facilitated sequencing data that identify m6A sites in three human tissue types (Supplementary Figure S4). Moreover, to control for potential differences in RIP efficiency between samples, we normalized the winscores based on the top 50 peaks in each sample (Supplementary Figure S5A) and repeated the analysis. We found that the tissue-specific methylation pattern still holds (Supplementary Figure S5B). Notably, an enrichment of tissue-specific sites was found in the 5′UTR (Figure 1G). To ask whether the signal in the 5′UTR is m6Am, we performed the sequence logo analysis for tissue-specific peaks in 5′UTR regions. We found that the called peaks are enriched in the m6A consensus motif RRACH (Supplementary Figure S5C). Moreover, we examined the distance of the tissue-specific m6A sites in the 5′UTR regions to the TSS. We found that <20% of the sites are very close to the TSS (Supplementary Figure S5D). These results together suggest that most tissue-specific m6A peaks in 5′UTRs are m6A, although we cannot exclude the possibility that some of the peaks that are very close to the TSS are m6Am.

We also applied a lower or higher stringency to define m6A sites (i.e. we required that the sites were found in at least one sample or all samples of a tissue type) and repeated the tissue-specificity analysis. In the low stringency condition, 45.6% of the sites were methylated in only one tissue type and 4.3% of the sites were shared in all tissues; in the high stringency condition, 50.9% of the sites were methylated in only one tissue type and 0.9% of the sites were shared in all tissues. In both conditions, we consistently observed that shared sites tended to be located around the stop codon, while tissue-specific sites tended to be away from the stop codon (Supplementary Figure S6A, B). Moreover, the genes with shared sites were enriched in essential functions such as chromatin organization, cellular catabolic process and histone modification (Supplementary Figure S6C, D).

Taken together, these data reveal dynamic m6A methylation across tissue types, uncover both broadly and tissue-specifically methylated sites, and highlight the potentially distinct regulatory effects of m6A sites around and away from the stop codon.

m6A methylation is enriched at non-canonical cleavage sites in 3′UTR

Polyadenylation processing of pre-mRNAs is an essential step in the generation of mature mRNAs. It includes an endonucleolytic cleavage followed by polyadenylation (46). A cleavage site is typically located 15–30 nt downstream of the poly(A) signal (PAS) (Figure 2A). Most eukaryotic genes harbor multiple PASs, leading to expression of alternative polyadenylation (APA) isoforms. m6A is known to be associated with APA selection (30,47,48); however, whether m6A methylation directly regulates polyadenylation is unknown. The generation of the comprehensive list of m6A sites, along with the map of poly(A) cleavage sites in human (49), enables us to systematically examine the relationship between m6A methylation and cleavage. To do so, we first examined the distribution of the distance between a cleavage site and the nearest m6A site. Unexpectedly, we found that m6A sites are highly enriched at the cleavage site position (Figure 2B). This observation was confirmed using miCLIP and RNA-endoribonuclease–facilitated sequencing data that identify m6A sites at single-nucleotide resolution (Supplementary Figure S7A). In addition, compared with 3′UTR m6A sites that were not located at the cleavage position, m6A sites at the cleavage position tended to be enriched in the AAACH sub-motif (Figure 2C).
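The distance analysis can be sketched with a binary search over sorted m6A positions. The example assumes all coordinates lie on one transcript and are given in the 5′ to 3′ direction; the function names and the `max_dist` cutoff are illustrative and not taken from the original analysis.

```python
from bisect import bisect_left
from collections import Counter


def nearest_signed_distance(cleavage_pos, sorted_sites):
    """Signed distance (site minus cleavage) to the closest m6A site.

    Negative values mean the site is upstream of the cleavage position.
    Ties are resolved in favor of the upstream site.
    """
    if not sorted_sites:
        raise ValueError("no m6A sites given")
    i = bisect_left(sorted_sites, cleavage_pos)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = sorted_sites[max(0, i - 1): i + 1]
    best = min(candidates, key=lambda s: abs(s - cleavage_pos))
    return best - cleavage_pos


def distance_histogram(cleavage_sites, m6a_sites, max_dist=100):
    """Count cleavage sites by distance to their nearest m6A site."""
    sites = sorted(m6a_sites)
    hist = Counter()
    for c in cleavage_sites:
        d = nearest_signed_distance(c, sites)
        if abs(d) <= max_dist:
            hist[d] += 1
    return hist


hist = distance_histogram([100, 120, 500], [50, 100, 180])
```

A peak of the histogram at distance 0 would correspond to the enrichment of m6A sites at the cleavage position described above.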

Figure 2.

m6A methylation is enriched at non-canonical cleavage sites in 3′UTR. (A) Schematic representation of a poly(A) site and polyadenylation configuration. PAS, poly(A) signals. (B) The distribution of m6A sites around the cleavage sites. Position 0 means the cleavage position. (C) The proportion of m6A sub-motifs at the cleavage position or 3′UTR. (D) Top: Schematic diagram showing the bicistronic luciferase reporter system to measure 3′ processing efficiency. Fragment of interest was inserted into the MCS position (i.e. between Renilla (R luc) and Firefly luciferase (F luc) genes). Efficient mRNA 3′ processing at the tested PAS leads to high expression of Renilla luciferase gene and low expression of Firefly luciferase gene, while poor mRNA 3′ processing results in the opposite mode of gene expression. Therefore, the Renilla/Firefly ratio provides a quantitative measurement of the processing efficiency at the tested PAS. Bottom: Luciferase reporter assays to determine the relative PAS activity of 5 selected sites in wild-type and METTL3 knockout cells. Four biological replicates were performed and statistical significance was calculated using Student's t-test. The relative PAS activities are represented as mean ± sd. **P < 0.01; ***P < 0.001. (E) The proportion of canonical and noncanonical PAS for m6A and non-m6A cleavage sites. Canonical groups: AAUAAA, AUUAAA and other; noncanonical group, none. (F) Left: nucleotide sequence composition around all 3P-seq–identified canonical and noncanonical poly(A) sites. Position 0 means the cleavage position. Cleavage sites located in 3′UTR or 3′UTR downstream 1kb regions were analyzed. Right: The sequence context 2nt upstream and downstream of the cleavage position.

To examine the effect of m6A on PAS regulation, we used a PAS reporter assay to measure the impact of m6A on poly(A) site processing efficiency (Figure 2D). We first generated METTL3 knockout HEK293T cells (Supplementary Figure S7B, C) and confirmed the reduced m6A level (Supplementary Figure S7D). Next, we selected five m6A sites that were at the cleavage site position and methylated in HEK293T cells (Supplementary Figure S7E). We subcloned a ∼300 bp region around each m6A site into the reporter plasmid and transfected each of the reporters into both wild-type and METTL3 knockout cells. We found that these m6A-containing PAS regions had a higher processing efficiency in the METTL3 knockout cells (Figure 2D). This result suggests that m6A may repress polyadenylation processing.

APA events can lead to the production of noncanonical mRNA isoforms, affecting the fate of the transcript and the nature of the products of translation (50). To ask if m6A may be involved in such regulation, we divided APA sites into canonical and noncanonical groups and examined their association with m6A. Interestingly, we found that m6A sites tended to be located at the noncanonical APA sites (Figure 2E). An examination of the nucleotide compositions around the cleavage sites revealed that, compared with canonical APA sites, noncanonical APA sites had an enrichment of Cs at the immediate downstream of the cleavage position (Figure 2F), thus are more likely to form the RRACH motif required for m6A methylation.

Taken together, we unexpectedly observed an enrichment of m6A sites in the cleavage position, particularly for the noncanonical APA events. This observation raises the possibility that the methylation status of the cleavage position may affect the cleavage efficiency directly, therefore regulating APA selection. Consequently, the dynamic methylation of m6A at cleavage position across tissue types may contribute to the dynamic regulation of APA across tissue types.

Developmental dynamics of m6A methylation

To understand the developmental dynamics of m6A methylation, we compared the m6A profile between fetal and adult tissues. m6A profiles of seven human fetal tissues, including brain, heart, kidney, liver, lung, muscle and stomach, were used for analysis (23). Among these tissues, 5 tissue types were in common between fetal and adult samples. m6A peaks and sites of fetal tissues were called as we did in our adult tissue data. We confirmed that the samples of the same tissue type clustered together according to either their m6A levels or gene expression levels (Supplementary Figure S8A, B). In addition, the called peaks were enriched in the m6A consensus motif RRACH (Supplementary Figure S8C). In total, we obtained a union of 60 440 fetal m6A sites. These fetal m6A sites also preferentially appeared around stop codons (Figure 3A) and spanned various classes of genic regions (Figure 3B). Interestingly, fetal and adult tissues showed distinct distributions of m6A sites along the transcripts. Although m6A sites preferentially appeared around stop codons in both fetal and adult tissues, the CDS regions of fetal tissues showed clearly higher m6A proportions than that of adult tissues (Figure 3C). Because the m6A profile data of fetal tissues were generated using a different RIP procedure and antibody from our method, to exclude the possibility that the observed difference is due to technical issue, we performed two analyses. First, we constructed m6A-seq libraries using the same RNA sample with two different methods. We found that sites called from both methods had the same distribution across the transcripts (Supplementary Figure S9A), suggesting that the use of different library construction methods had no significant impact on m6A distribution analysis. 
Second, we analyzed an independent fetal-tissue m6A-seq data set, which includes three post-conception week 11 (PCW11) fetal human brain samples, three mouse developing brain (E13.5) samples and two human day-47 forebrain organoid samples (51). These m6A-seq data were generated with the same RIP procedure as our data. The enrichment of CDS sites in fetal tissues was confirmed in this data set (Supplementary Figure S9B). VIRMA is known to mediate preferential m6A mRNA methylation in the 3′UTR and near the stop codon (48). An examination of VIRMA expression revealed that fetal tissues had higher VIRMA expression levels than adult tissues (Supplementary Figure S9C, D). Because higher VIRMA expression would be expected to favor 3′UTR and stop-codon methylation rather than CDS methylation, the observed difference between fetal and adult tissues may be due to other unknown regulators.

Figure 3.

Developmental dynamics of m6A profile. (A) The distribution of m6A sites across the length of mRNA transcripts for seven fetal tissues. (B) Genic locations of m6A sites of fetal tissues. Sites in seven fetal tissues were merged for analysis. (C) The ratio between the CDS m6A site number and 3′UTR m6A site number in fetal and adult tissues. (D) Heatmap showing the normalized proportion (the values were centered and scaled in the row direction) of four m6A sub-motifs in adult and fetal tissues. (E) Variance and mean value of the proportion of four sub-motifs across human adult tissues. (F) GGACH and AAACH sub-motifs tend to be located in different windows for all tissue types we studied. Colored dots indicate the observed numbers of m6A peaks with both GGACH and AAACH sub-motifs. The distribution of the expected numbers was generated from the shuffled data. P is the fraction of the distribution on the left side of the dots. The observed numbers of m6A peaks with both GGACH and AAACH sub-motifs are significantly smaller than those from the shuffled data (P < 0.0001). The x-axis is the ratio of the number of windows with both sub-motifs over the mean number of windows with both sub-motifs calculated using the shuffled data.

The regulation of m6A motifs across tissue types or developmental stages

We noted that although the m6A consensus motif RRACH is enriched in all tissue types examined, the detailed motifs vary between tissues (Supplementary Figure S1 and Supplementary Figure S8C). To examine the motif dynamics and regulation between tissues, we divided the RRACH motif into four sub-motifs (GGACH, AGACH, GAACH and AAACH) for analysis. We found that different tissues had different sub-motif preferences (Figure 3D). For example, AAACH was overrepresented in both fetal brain and adult frontal cortex. Interestingly, we found that the proportions of the AAACH and GGACH sub-motifs were the most variable between tissues, while the proportions of the AGACH and GAACH sub-motifs were consistent across tissues (Figure 3E and Supplementary Figure S10A). Moreover, during development, the sub-motif proportions changed in some tissues but not in others (Supplementary Figure S10B).

m6A is installed by a multicomponent methyltransferase complex. Besides the core methyltransferase subunits, the complex contains other proteins that interact with the core subunits to methylate specific positions (52). The combination of core methyltransferase subunits with different interacting proteins may lead to different motif preferences. The observation above suggests that GGACH and AAACH sub-motifs may be installed by core methyltransferase subunits with distinct interacting proteins. If so, we expect that these two sub-motifs would tend to occur in different peak regions. To test this, we shuffled the m6A sub-motif sequences within the peaks and calculated the numbers of peaks with both AAACH and GGACH sub-motifs (Materials and methods). We found that the observed numbers of peaks with both AAACH and GGACH sub-motifs were significantly lower than the expected numbers in all tissues we examined (Figure 3F), consistent with our expectation.
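The logic of this shuffling test can be illustrated with a minimal sketch. It assumes each peak is represented as a list of sub-motif labels and that shuffling permutes labels across all peaks while keeping the number of motifs per peak fixed; the exact shuffling scheme is described in Materials and methods and may differ. The function names and the toy input are hypothetical.

```python
import random

def count_peaks_with_both(peaks, a="AAACH", b="GGACH"):
    """Count peaks whose sub-motif labels include both a and b."""
    return sum(1 for labels in peaks if a in labels and b in labels)

def shuffle_labels(peaks, rng):
    """Permute labels across all peaks, keeping each peak's motif count (assumed scheme)."""
    pool = [label for labels in peaks for label in labels]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for labels in peaks:
        shuffled.append(pool[start:start + len(labels)])
        start += len(labels)
    return shuffled

def cooccurrence_test(peaks, n_shuffles=1000, seed=0):
    """Return (observed count, observed / mean expected, left-tail empirical P)."""
    rng = random.Random(seed)
    observed = count_peaks_with_both(peaks)
    expected = [count_peaks_with_both(shuffle_labels(peaks, rng))
                for _ in range(n_shuffles)]
    # P is the fraction of the shuffled distribution at or below the observed count
    p_left = sum(1 for e in expected if e <= observed) / n_shuffles
    mean_expected = sum(expected) / n_shuffles
    ratio = observed / mean_expected if mean_expected else float("nan")
    return observed, ratio, p_left

# Toy input: the two sub-motifs never share a peak
toy_peaks = [["AAACH", "AAACH"]] * 20 + [["GGACH", "GGACH"]] * 20
print(cooccurrence_test(toy_peaks, n_shuffles=200, seed=1))
```

A ratio below 1 together with a small left-tail P corresponds to the pattern reported in Figure 3F, in which the two sub-motifs co-occur in fewer peaks than expected by chance.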

The evolutionary landscape of human m6A methylation

Having revealed the dynamics of m6A methylation across human tissues, we next investigated its evolution. First, we examined the cross-species conservation of m6A sites to assess the strength of selective pressure on individual m6A sites.

For CDS sites, we examined the selection pressure on m6A sites in different codon positions, as they may be subject to different evolutionary constraints. To estimate the strength of selection pressure, we compared the fraction of m6A sites conserved between human and mouse with that of control A sites, as previously described (31). It is known that different genes are subject to different selection pressures. To control for this effect, for control A sites in non-m6A RRACH motifs, we selected sites in genes with similar selection pressure, i.e. a similar dN/dS ratio, to the genes containing the m6A sites. In addition, m6A sites tended to be located at the 3′ end of the CDS region. An examination of evolutionary constraint across the CDS region revealed that As in different CDS regions are subject to different levels of evolutionary constraint (Supplementary Figure S11). To control for this effect, for each gene, we grouped the CDS region into 20 bins and required that the control A sites were located in the same bin as the m6A sites. We found that m6A sites in different codon positions had different conservation patterns, consistent with a previous study (31). The m6A sites in the first codon position were less conserved than control A sites, while m6A sites in the third codon position were much more conserved than control A sites (Figure 4A). We estimated that 6% of the m6A sites in the third codon position are under evolutionary constraint (Materials and methods) and are therefore likely functional. m6A methylation is known to act as a barrier to tRNA accommodation and translation elongation, and m6A in the first codon position has the strongest effect on delaying tRNA accommodation (53). Therefore, m6A modification in the first codon position may be generally detrimental, which would explain why these sites are less conserved.
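The comparison against matched control sites can be sketched as a resampling test. The sketch assumes each m6A site carries a stratum label (for example a combination of dN/dS class and CDS bin) and a flag for human-mouse conservation, and that control A sites are pooled by the same stratum. The data layout and function names are hypothetical, not the implementation used in this study.

```python
import random

def conserved_fraction(flags):
    """Fraction of sites flagged as conserved."""
    return sum(flags) / len(flags)

def matched_control_test(m6a_sites, controls_by_stratum, n_sets=10000, seed=0):
    """m6a_sites: list of (stratum, is_conserved).
    controls_by_stratum: {stratum: [is_conserved, ...]} for control A sites.
    Returns (observed fraction, right-tail empirical P)."""
    rng = random.Random(seed)
    observed = conserved_fraction([flag for _, flag in m6a_sites])
    null = []
    for _ in range(n_sets):
        # One control site per m6A site, drawn from the same stratum
        draw = [rng.choice(controls_by_stratum[stratum]) for stratum, _ in m6a_sites]
        null.append(conserved_fraction(draw))
    # P is the fraction of random control sets at least as conserved as the m6A sites
    p_right = sum(1 for f in null if f >= observed) / n_sets
    return observed, p_right
```

Under this convention, a P value near 1 indicates that m6A sites are less conserved than the matched controls, and a P value near 0 indicates that they are more conserved.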

Figure 4.

Natural selection on m6A inferred by cross-species analysis. (A) Comparison of evolutionary conservation between m6A and control A sites (non-m6A RRACH). Frequency distributions of the fraction of conserved control sites in 10,000 random sets with the sample size equal to the number of m6A sites at three codon positions were plotted separately. Red dots indicate the fraction of conserved m6A sites. P is the fraction of the distribution on the right side of the dots. First codon position, P = 1; second codon position, P = 0.89; third codon position, P < 0.0001. (B) Estimates of ρ for m6A RRACH motifs, A-to-I RNA editing triplet motif and miRNA binding sites. ρ represents the fraction of sites under selection within functional elements, which is calculated by INSIGHT. m6A sites, m6A RRACH motifs in 3′UTR region; RNA editing sites, nonrepetitive A-to-I RNA editing sites in 3′UTR region; miRNA target sites, 3′UTR miRNA binding sites for conserved (1) or broadly conserved (2) miRNA families. The regions that match the seed region of the miRNAs were used for analysis. (C) A tree representing the schematic phylogeny of the species studied. (D) The age distribution of m6A sites (left) and control non-m6A RRACH sites (right) across the 3′UTR region. m6A sites and control sites were grouped into 10 bins based on their distance to the stop codon. The age of a site was based on the most distantly related species in which the site was conserved. The color codes for the age are the same as in (C). (E) Comparison of methylation levels between m6A sites under strong and weak constraints in CDS and 3′UTR. m6A sites with the top 25% (High score, strong constraint) and bottom 25% (Low score, weak constraint) rejection scores were compared. The m6A peak winscore of a site was used to represent its m6A level. Each dot represents the median methylation level of the CDS or 3′UTR sites; biological replicates of each tissue type were plotted separately.

For 3′UTR sites, we used INSIGHT (Inference of Natural Selection from Interspersed Genomically coHerent elemenTs) (54), a method for measuring the influence of natural selection on short, widely scattered noncoding elements, to estimate the proportion of m6A sites that are under negative selection (in other words, are functional). INSIGHT obtains information about natural selection by contrasting patterns of polymorphism and divergence in m6A motifs (RRACH) with those in flanking neutral regions. We obtained estimates of ρ that ranged from 0.33 to 0.56 for different tissues (Figure 4B). As a comparison, we examined two additional classes of regulatory elements in the 3′UTR region: A-to-I RNA editing sites and miRNA binding sites. For RNA editing sites, we examined the editing site triplet motif (11,55) and estimated ρ = 0.03 (Figure 4B). For miRNA binding sites identified using the TargetScan algorithm (41), we obtained average estimates of ρ = 0.21 and 0.24 for conserved and broadly conserved miRNA families, respectively (Figure 4B). Thus, unexpectedly, we detected a much stronger signature of natural selection in m6A motifs compared with other post-transcriptional regulatory elements, with about half of the nucleotides estimated to be under negative selection. Next, we asked whether the selection pressure on 3′UTR m6A sites is associated with their locations. We grouped m6A sites into 10 bins based on their locations and used phylostratigraphy data (Figure 4C) to examine the age of human m6A sites in each bin. We found that m6A sites were older than the control sites in all bins, suggesting that 3′UTR m6A sites are generally subject to negative selection (Figure 4D).
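The age assignment used for Figure 4D, in which the age of a site is the most distantly related species in which the site is conserved, can be sketched as follows. The species list and its order are placeholders for illustration only and do not reproduce the phylogeny in Figure 4C; the function names are likewise hypothetical.

```python
from collections import Counter, defaultdict

# Placeholder species, ordered from the closest to the most distant relative of human
SPECIES_BY_DISTANCE = ["chimpanzee", "rhesus", "mouse", "dog", "opossum"]

def site_age(conserved_in):
    """Return the most distant species in which the site is conserved.
    Sites conserved in no other species are labeled 'human'."""
    age = "human"
    for species in SPECIES_BY_DISTANCE:
        if species in conserved_in:
            age = species  # later entries are more distant, so keep overwriting
    return age

def age_distribution_by_bin(sites):
    """sites: iterable of (bin_index, set of species in which the site is conserved).
    Returns {bin_index: {age: proportion of sites in that bin}}."""
    counts = defaultdict(Counter)
    for bin_index, conserved_in in sites:
        counts[bin_index][site_age(conserved_in)] += 1
    return {
        b: {age: n / sum(c.values()) for age, n in c.items()}
        for b, c in counts.items()
    }
```

Comparing these per-bin proportions between m6A sites and control sites gives the kind of contrast summarized in Figure 4D.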

Second, we examined the relationship between m6A level and m6A site conservation. We obtained the rejected substitution scores, a score to measure the nucleotide-level constraint (56), of all m6A sites, and compared the methylation levels between m6A sites under stronger constraints and weaker constraints. Interestingly, we found that m6A sites under stronger constraints had higher methylation levels (Figure 4E). This observation suggests that the conserved m6A sites are optimized for m6A writer binding and methylation, thus likely functional.
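As a minimal sketch of this comparison, sites can be ranked by their rejected substitution score and the median methylation level compared between the top and bottom quartiles (the 25% cutoffs follow the Figure 4E legend). The input layout and the function name are assumptions made for illustration.

```python
from statistics import median

def compare_by_constraint(sites):
    """sites: list of (rejected_substitution_score, methylation_level).
    Returns (median level of the top 25% by score, median level of the bottom 25%)."""
    ranked = sorted(sites, key=lambda s: s[0])
    k = max(1, len(ranked) // 4)  # size of each quartile group
    low = [level for _, level in ranked[:k]]
    high = [level for _, level in ranked[-k:]]
    return median(high), median(low)

# Toy input in which methylation level rises with the score
toy_sites = [(i, float(i)) for i in range(8)]
print(compare_by_constraint(toy_sites))
```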

Positive selection of m6A sites inferred from population genomic analysis

To ask whether the m6A sites that were recently gained during human evolution were under positive selection, we analyzed SNP genotype data from the 1000 Genomes Project (Materials and methods). To prevent ascertainment bias between functional classes, the 5′UTRs and 3′UTRs of protein-coding genes and ncRNAs were analyzed. The CDS region was excluded from this analysis, because it is difficult to distinguish whether the selection signals are from m6A methylation or from other factors unrelated to m6A methylation, such as amino acid changes. First, we asked whether SNPs located within the RRACH motifs affect methylation. To do so, we first identified all heterozygous m6A SNPs in the 27 adult tissue samples using the input RNA-seq data. Next, we calculated the m6A allele ratio (m6A allele read number/total read number) using reads covering the selected SNPs for both IP and input samples. Last, we compared the m6A allele ratios between IP and input samples, examining each position of the RRACH motif separately. We found that the m6A allele had higher ratios in IP samples for SNPs located in the RRAC positions (Figure 5A). Thus, SNPs located at the RRAC positions, but not the H position, affect methylation status. The read coverages of two representative heterozygous m6A SNPs in IP and input samples are shown in Figure 5B. Next, we examined the derived allele frequency (DAF) spectra for derived alleles that are located in the RRAC positions and create an m6A motif. Different DAF distributions were directly compared using a Mann-Whitney test, as previously described (27). Interestingly, the DAF distribution for the SNPs whose derived allele creates an m6A motif was significantly skewed toward high-frequency alleles relative to matched control or neutral sites, suggesting that some of these SNPs are subject to positive selection (Figure 5C).
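Two calculations in this paragraph can be sketched in a few lines: the m6A allele ratio at a heterozygous SNP, and a one-sided Mann-Whitney comparison of two DAF distributions. The rank test below is a plain-Python illustration that uses midranks and a normal approximation without tie correction; it is not the software used in this study, and all function names are hypothetical.

```python
import math

def m6a_allele_ratio(m6a_reads, other_reads):
    """m6A allele read number / total read number at one heterozygous SNP."""
    total = m6a_reads + other_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return m6a_reads / total

def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_greater(x, y):
    """One-sided test that x tends to be larger than y.
    Returns (U statistic for x, approximate P from the normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return u1, p

# Toy read counts (m6A allele, other allele) for one SNP
print(m6a_allele_ratio(30, 10), m6a_allele_ratio(20, 20))
```

In this toy case the m6A allele ratio is higher in the first sample than in the second, which is the direction of the IP versus input shift reported in Figure 5A. Calling `mann_whitney_greater(m6a_daf, control_daf)` on two lists of derived allele frequencies tests for the skew toward high-frequency derived alleles described above.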

Figure 5.

Positive selection inferred from SNP genotype data. (A) Comparison of the m6A allele ratio between IP and input samples of heterozygous m6A SNPs. Only heterozygous SNPs with one allele matching the RRACH motif and the other allele not matching the RRACH motif were considered. SNPs located in each position of RRACH motifs were analyzed separately. P values were calculated with the Kolmogorov-Smirnov test. (B) Examples of m6A peaks with SNPs. For chr11@85375601 m6A site, the C allele (GAACT, m6A site is underlined and the SNP is marked in italics) matches the m6A motif and the T allele (GAATT) disrupts the motif. For chr18@5243950 m6A site, the A allele (AAACA) matches the m6A motif and the C allele (ACACA) disrupts the motif. For both sites, compared with the non-m6A allele, the m6A allele is highly enriched in the IP sample. (C) DAF distributions of 1000 Genomes Project SNPs of m6A RRAC motifs for 3′UTR sites. The derived alleles that create m6A motifs were analyzed. Two groups of control sites were selected: (1) ‘RRAC’ from all non-m6A RRACH motifs located in the 3′UTR region; (2) ‘RRAC’ from all RRACH motifs in the intergenic region. P values were calculated with a one-sided Mann-Whitney U test comparing the DAF distribution of SNPs in m6A RRAC motifs with the distribution of SNPs in control sites. 3′UTR control group versus m6A group, P = 0.00037; intergenic group versus m6A group, P = 0.000013.

The excess of high-frequency derived alleles that create m6A motifs prompted us to further characterize the m6A site SNPs that are likely under positive selection. To perform a comprehensive scan of positively selected SNPs related to m6A modification, we examined not only the SNPs whose derived allele is the m6A allele but also those whose ancestral allele is the m6A allele, because it is plausible that selection may sometimes switch to favor an ancestral allele that has been segregating in the population. We first identified m6A SNPs that have been highly differentiated among populations (measured by the Fst parameter) and then determined whether these differentiation events might have been driven by positive selection during human evolution. In total, we identified 102 highly differentiated SNPs (Fst > 0.15) (Supplementary Table S5). Of these SNPs, 37 showed evidence of selection in iHS (integrated Haplotype Score) (57) and/or XP-EHH (Cross Population Extended Haplotype Homozygosity) (58) tests (Supplementary Tables S6–S7).
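To make the Fst threshold concrete, the sketch below computes Wright's fixation index for one biallelic SNP from per-population allele frequencies, with populations weighted equally, and then applies the Fst > 0.15 cutoff. This simplified estimator is used for illustration only and is not necessarily the estimator applied in this study; the SNP identifiers and frequencies are invented.

```python
def wright_fst(allele_freqs):
    """Wright's Fst for one biallelic SNP.
    allele_freqs: frequency of one allele in each population (equal weights assumed)."""
    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled population
    if h_t == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / n  # mean within-population value
    return (h_t - h_s) / h_t

def highly_differentiated(snp_freqs, threshold=0.15):
    """Return the ids of SNPs whose Fst exceeds the threshold."""
    return [snp for snp, freqs in snp_freqs.items() if wright_fst(freqs) > threshold]

# Invented allele frequencies in two populations
toy_snps = {"snp_a": [0.25, 0.75], "snp_b": [0.5, 0.5]}
print(highly_differentiated(toy_snps))
```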

Among these differentiated SNPs, 81 are located in protein-coding genes. The remaining 21 SNPs are located in ncRNA genes, suggesting that m6A sites in ncRNA genes may be another class of targets of recent positive selection. The protein-coding gene list contains a number of genes involved in biological pathways thought to have been recently targeted by positive selection, such as carbohydrate and lipid metabolism and brain development (57). In particular, it includes 25 genes that have previously been characterized as positively selected in the human lineage or across human populations (Supplementary Table S5).

As m6A is known to affect mRNA stability, to further understand the potential regulatory effects of these SNPs, we examined the overlap between these variants and eQTL SNPs from the Genotype-Tissue Expression (GTEx) project. Of the 102 SNPs, 60 are associated with gene expression abundance (Supplementary Table S8), suggesting that these m6A SNPs may contribute to the expression changes. Last, we characterized the potential phenotypic effects of these variants. Using genome-wide association study (GWAS) data, we found that a substantial proportion of the 102 m6A SNPs are disease- or trait-associated SNPs (Supplementary Table S8), such as those associated with HDL cholesterol levels, body mass index and atherosclerosis.

DISCUSSION

The importance of m6A as a post-transcriptional modification has been appreciated, but the evolution, function and regulation of individual m6A sites remain largely unknown, in part because of insufficient data about its prevalence and dynamics. Here, we compiled the m6A methylomes of major adult human tissues, providing a valuable resource for future studies of the regulation and functions of this modification. We reveal that the distribution and motifs of m6A vary across tissue types and during development, suggesting that m6A is widely regulated by trans factors and involved in human development. Notably, it is known that variation in postmortem conditions between samples may affect RNA integrity and gene expression quantification. Since it is unknown whether m6A in postmortem tissues represents the in situ state and whether m6A is more unstable than RNA itself, caution is needed when using m6A maps generated from postmortem tissues.

We used comparative genomics and population genetics approaches to show that significant negative selection has acted on m6A sites, particularly those in the third codon position and the 3′UTR. The current view of m6A modification holds that, despite the functional importance of this modification, any single m6A site seems dispensable, as long as a transcript is methylated and can be recognized by a member of the major m6A reader YTH family (20,59,60). Our data, however, do not support this view and suggest that many single sites are functionally important, most likely because their functions are position-dependent. Furthermore, with 1000 Genomes Project data, we identified a number of m6A SNPs whose patterns of allelic variation are not consistent with neutrality. This implies a functional difference between the alleles that results in a phenotypic effect subject to selection. These SNPs are enriched in genes related to the immune system, dietary fatty acid processing and neuronal functions, consistent with the recently identified functions of m6A modification (16,61–64).

In summary, our work provides a resource of m6A profiles in humans for future studies of m6A regulation and functions. Furthermore, our data provide independent evidence for the functional importance of m6A modification from the evolutionary perspective, and also suggest an unexpected role of m6A modification in recent human evolutionary adaptation and disease susceptibility.

DATA AVAILABILITY

The sequence data have been deposited in the NCBI GEO database under the accession code GSE122744.

ACKNOWLEDGEMENTS

We thank Dr Jinkai Wang for sharing the m6A peak identification codes, Dr Chengguo Yao and Dr Yongsheng Shi for sharing the pPASPORT vector, and SYSU EEB Sequencing Core Facility for sequencing service.

Author contribution: T.H., X.N.Z., W.Y.C., H.Z. and N.N.G. carried out the experiments. H.Z. and X.R.S. performed computational analyses. R.Z., H.Z., X.R.S., T.H. and N.N.G. wrote the paper.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

National Natural Science Foundation of China [91631108, 31571341 to R.Z.]; Guangdong Innovative and Entrepreneurial Research Team Program [2016ZT06S638 to R.Z.]; Thousand Talents Plan-The Recruitment Program for Young Professionals (to R.Z.). Funding for open access charge: National Natural Science Foundation of China [91631108 and 31571341 to R.Z.].

Conflict of interest statement. None declared.

REFERENCES

  • 1. Li S., Mason C.E. The pivotal regulatory landscape of RNA modifications. Annu. Rev. Genomics Hum. Genet. 2014; 15:127–150.
  • 2. Dominissini D., Moshitch-Moshkovitz S., Schwartz S., Salmon-Divon M., Ungar L., Osenberg S., Cesarkas K., Jacob-Hirsch J., Amariglio N., Kupiec M. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012; 485:201–206.
  • 3. Ramaswami G., Zhang R., Piskol R., Keegan L.P., Deng P., O’Connell M.A., Li J.B. Identifying RNA editing sites using RNA sequencing data alone. Nat. Methods. 2013; 10:128–132.
  • 4. Edelheit S., Schwartz S., Mumbach M.R., Wurtzel O., Sorek R. Transcriptome-wide mapping of 5-methylcytidine RNA modifications in bacteria, archaea, and yeast reveals m5C within archaeal mRNAs. PLoS Genet. 2013; 9:e1003602.
  • 5. Li X., Xiong X., Yi C. Epitranscriptome sequencing technologies: decoding RNA modifications. Nat. Methods. 2016; 14:23–31.
  • 6. Huang T., Chen W.Y., Liu J.H., Gu N.N., Zhang R. Genome-wide identification of mRNA 5-methylcytosine in mammals. Nat. Struct. Mol. Biol. 2019; 26:380–388.
  • 7. Nishikura K. Functions and regulation of RNA editing by ADAR deaminases. Annu. Rev. Biochem. 2010; 79:321–349.
  • 8. Xu G., Zhang J. Human coding RNA editing is generally nonadaptive. Proc. Natl. Acad. Sci. U.S.A. 2014; 111:3769–3774.
  • 9. Yu Y., Zhou H., Kong Y., Pan B., Chen L., Wang H., Hao P., Li X. The landscape of A-to-I RNA editome is shaped by both positive and purifying selection. PLoS Genet. 2016; 12:e1006191.
  • 10. Duan Y., Dou S., Luo S., Zhang H., Lu J. Adaptation of A-to-I RNA editing in Drosophila. PLoS Genet. 2017; 13:e1006648.
  • 11. Zhang R., Deng P., Jacobson D., Li J.B. Evolutionary analysis reveals regulatory and functional landscape of coding and non-coding RNA editing. PLoS Genet. 2017; 13:e1006563.
  • 12. Liscovitch-Brauer N., Alon S., Porath H.T., Elstein B., Unger R., Ziv T., Admon A., Levanon E.Y., Rosenthal J.J.C., Eisenberg E. Trade-off between transcriptome plasticity and genome evolution in cephalopods. Cell. 2017; 169:191–202.
  • 13. Meyer K.D., Saletore Y., Zumbo P., Elemento O., Mason C.E., Jaffrey S.R. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell. 2012; 149:1635–1646.
  • 14. Meyer K.D., Jaffrey S.R. The dynamic epitranscriptome: N6-methyladenosine and gene expression control. Nat. Rev. Mol. Cell Biol. 2014; 15:313–326.
  • 15. Zhao B.S., Roundtree I.A., He C. Post-transcriptional gene regulation by mRNA modifications. Nat. Rev. Mol. Cell Biol. 2017; 18:31–42.
  • 16. Fu Y., Dominissini D., Rechavi G., He C. Gene expression regulation mediated through reversible m6A RNA methylation. Nat. Rev. Genet. 2014; 15:293–306.
  • 17. Bokar J.A., Shambaugh M.E., Polayes D., Matera A.G., Rottman F.M. Purification and cDNA cloning of the AdoMet-binding subunit of the human mRNA (N6-adenosine)-methyltransferase. RNA. 1997; 3:1233–1247.
  • 18. Liu J., Yue Y., Han D., Wang X., Fu Y., Zhang L., Jia G., Yu M., Lu Z., Deng X. et al. A METTL3-METTL14 complex mediates mammalian nuclear RNA N6-adenosine methylation. Nat. Chem. Biol. 2014; 10:93–95.
  • 19. Wang X., Lu Z., Gomez A., Hon G.C., Yue Y., Han D., Fu Y., Parisien M., Dai Q., Jia G. et al. N6-methyladenosine-dependent regulation of messenger RNA stability. Nature. 2014; 505:117–120.
  • 20. Wang X., Zhao B.S., Roundtree I.A., Lu Z., Han D., Ma H., Weng X., Chen K., Shi H., He C. N(6)-methyladenosine modulates messenger RNA translation efficiency. Cell. 2015; 161:1388–1399.
  • 21. Xiao W., Adhikari S., Dahal U., Chen Y.-S., Hao Y.-J., Sun B.-F., Sun H.-Y., Li A., Ping X.-L., Lai W.-Y. et al. Nuclear m6A Reader YTHDC1 Regulates mRNA Splicing. Mol. Cell. 2016; 61:507–519.
  • 22. Sun W.-J., Li J.-H., Liu S., Wu J., Zhou H., Qu L.-H., Yang J.-H. RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data. Nucleic Acids Res. 2015; 44:D259–D265.
  • 23. Xiao S., Cao S., Huang Q., Xia L., Deng M., Yang M., Jia G., Liu X., Shi J., Wang W. et al. The RNA N(6)-methyladenosine modification landscape of human fetal tissues. Nat. Cell Biol. 2019; 21:651–661.
  • 24. King M., Wilson A. Evolution at two levels in humans and chimpanzees. Science. 1975; 188:107–116.
  • 25. Barbosa-Morais N.L., Irimia M., Pan Q., Xiong H.Y., Gueroussov S., Lee L.J., Slobodeniuc V., Kutter C., Watt S., Colak R. et al. The evolutionary landscape of alternative splicing in vertebrate species. Science. 2012; 338:1587–1593.
  • 26. Xu J., Zhang R., Shen Y., Liu G., Lu X., Wu C.I. The evolution of evolvability in microRNA target sites in vertebrates. Genome Res. 2013; 23:1810–1816.
  • 27. Chen K., Rajewsky N. Natural selection on human microRNA binding sites inferred from SNP data. Nat. Genet. 2006; 38:1452–1456.
  • 28. Arbiza L., Gronau I., Aksoy B.A., Hubisz M.J., Gulko B., Keinan A., Siepel A. Genome-wide inference of natural selection on human transcription factor binding sites. Nat. Genet. 2013; 45:723–729.
  • 29. Ma L., Zhao B., Chen K., Thomas A., Tuteja J.H., He X., He C., White K.P. Evolution of transcript modification by N6-methyladenosine in primates. Genome Res. 2017; 27:385–392.
  • 30. Ke S., Alemu E.A., Mertens C., Gantman E.C., Fak J.J., Mele A., Haripal B., Zucker-Scharff I., Moore M.J., Park C.Y. et al. A majority of m6A residues are in the last exons, allowing the potential for 3′ UTR regulation. Genes Dev. 2015; 29:2037–2053.
  • 31. Liu Z., Zhang J. Most m6A RNA modifications in protein-coding regions are evolutionarily unconserved and likely nonfunctional. Mol. Biol. Evol. 2018; 35:666–675.
  • 32. Batista P.J., Molinie B., Wang J., Qu K., Zhang J., Li L., Bouley D.M., Lujan E., Haddad B., Daneshvar K. et al. m6A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell. 2014; 15:707–719.
  • 33. Martin M. Cutadapt removes adapter sequences from High-Throughput sequencing reads. EMBnet.journal. 2011; 17:10–12.
  • 34. Kopylova E., Noe L., Touzet H. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. 2012; 28:3211–3217.
  • 35. Kim D., Pertea G., Trapnell C., Pimentel H., Kelley R., Salzberg S.L. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013; 14:R36.
  • 36. Meng J., Cui X., Rao M.K., Chen Y., Huang Y. Exome-based analysis for RNA epigenome sequencing data. Bioinformatics. 2013; 29:1565–1567.
  • 37. Cui X., Meng J., Zhang S., Chen Y., Huang Y. A novel algorithm for calling mRNA m6A peaks by modeling biological variances in MeRIP-seq data. Bioinformatics. 2016; 32:i378–i385.
  • 38. Kryuchkova-Mostacci N., Robinson-Rechavi M. A benchmark of gene expression tissue-specificity metrics. Brief. Bioinform. 2016; 18:205–214.
  • 39. Yao C., Biesinger J., Wan J., Weng L., Xing Y., Xie X., Shi Y. Transcriptome-wide analyses of CstF64-RNA interactions in global regulation of mRNA alternative polyadenylation. Proc. Natl. Acad. Sci. U.S.A. 2012; 109:18773–18778.
  • 40. Ramaswami G., Li J.B. RADAR: a rigorously annotated database of A-to-I RNA editing. Nucleic Acids Res. 2014; 42:D109–D113.
  • 41. Agarwal V., Bell G.W., Nam J.W., Bartel D.P. Predicting effective microRNA target sites in mammalian mRNAs. eLife. 2015; 4:e05005.
  • 42. Danecek P., Auton A., Abecasis G., Albers C.A., Banks E., DePristo M.A., Handsaker R.E., Lunter G., Marth G.T., Sherry S.T. et al. The variant call format and VCFtools. Bioinformatics. 2011; 27:2156–2158.
  • 43. Gautier M., Vitalis R. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics. 2012; 28:1176–1177.
  • 44. Welter D., MacArthur J., Morales J., Burdett T., Hall P., Junkins H., Klemm A., Flicek P., Manolio T., Hindorff L. et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014; 42:D1001–D1006.
  • 45. Yu G., Wang L.-G., Han Y., He Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012; 16:284–287.
  • 46. Colgan D.F., Manley J.L. Mechanism and regulation of mRNA polyadenylation. Genes Dev. 1997; 11:2755–2766.
  • 47. Molinie B., Wang J., Lim K.S., Hillebrand R., Lu Z.-x., Van Wittenberghe N., Howard B.D., Daneshvar K., Mullen A.C., Dedon P. et al. m6A-LAIC-seq reveals the census and complexity of the m6A epitranscriptome. Nat. Methods. 2016; 13:692–698.
  • 48. Yue Y., Liu J., Cui X., Cao J., Luo G., Zhang Z., Cheng T., Gao M., Shu X., Ma H. et al. VIRMA mediates preferential m(6)A mRNA methylation in 3′UTR and near stop codon and associates with alternative polyadenylation. Cell Discov. 2018; 4:10.
  • 49. Wang R., Nambiar R., Zheng D., Tian B. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes. Nucleic Acids Res. 2018; 46:D315–D319.
  • 50. Tian B., Manley J.L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 2016; 18:18.
  • 51. Yoon K.-J., Ringeling F.R., Vissers C., Jacob F., Pokrass M., Jimenez-Cyrus D., Su Y., Kim N.-S., Zhu Y., Zheng L. et al. Temporal control of mammalian cortical neurogenesis by m6A Methylation. Cell. 2017; 171:877–889.
  • 52. Yang Y., Hsu P.J., Chen Y.S., Yang Y.G. Dynamic transcriptomic m(6)A decoration: writers, erasers, readers and functions in RNA metabolism. Cell Res. 2018; 28:616–624.
  • 53. Choi J., Ieong K.W., Demirci H., Chen J., Petrov A., Prabhakar A., O’Leary S.E., Dominissini D., Rechavi G., Soltis S.M. et al. N(6)-methyladenosine in mRNA disrupts tRNA selection and translation-elongation dynamics. Nat. Struct. Mol. Biol. 2016; 23:110–115.
  • 54. Gronau I., Arbiza L., Mohammed J., Siepel A. Inference of natural selection from interspersed genomic elements based on polymorphism and divergence. Mol. Biol. Evol. 2013; 30:1159–1171.
  • 55. Eggington J.M., Greene T., Bass B.L. Predicting sites of ADAR editing in double-stranded RNA. Nat. Commun. 2011; 2:319.
  • 56. Davydov E.V., Goode D.L., Sirota M., Cooper G.M., Sidow A., Batzoglou S. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 2010; 6:e1001025.
  • 57. Voight B.F., Kudaravalli S., Wen X., Pritchard J.K. A map of recent positive selection in the human genome. PLoS Biol. 2006; 4:e72.
  • 58. Sabeti P.C., Varilly P., Fry B., Lohmueller J., Hostetter E., Cotsapas C., Xie X., Byrne E.H., McCarroll S.A., Gaudet R. et al. Genome-wide detection and characterization of positive selection in human populations. Nature. 2007; 449:913–918.
  • 59. Geula S., Moshitch-Moshkovitz S., Dominissini D., Mansour A.A., Kol N., Salmon-Divon M., Hershkovitz V., Peer E., Mor N., Manor Y.S. et al. Stem cells. m6A mRNA methylation facilitates resolution of naive pluripotency toward differentiation. Science. 2015; 347:1002–1006.
  • 60. Zhou J., Wan J., Gao X., Zhang X., Jaffrey S.R., Qian S.-B. Dynamic m6A mRNA methylation directs translational control of heat shock response. Nature. 2015; 526:591–594.
  • 61. Frayling T.M., Timpson N.J., Weedon M.N., Zeggini E., Freathy R.M., Lindgren C.M., Perry J.R.B., Elliott K.S., Lango H., Rayner N.W. et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007; 316:889–894.
  • 62. Jia G., Fu Y., Zhao X., Dai Q., Zheng G., Yang Y., Yi C., Lindahl T., Pan T., Yang Y.-G. et al. N6-Methyladenosine in nuclear RNA is a major substrate of the obesity-associated FTO. Nat. Chem. Biol. 2011; 7:885–887.
  • 63. Zheng Q., Hou J., Zhou Y., Li Z., Cao X. The RNA helicase DDX46 inhibits innate immunity by entrapping m6A-demethylated antiviral transcripts in the nucleus. Nat. Immunol. 2017; 18:1094–1103.
  • 64. Li H.B., Tong J., Zhu S., Batista P.J., Duffy E.E., Zhao J., Bailis W., Cao G., Kroehling L., Chen Y. et al.. m6A mRNA methylation controls T cell homeostasis by targeting the IL-7/STAT5/SOCS pathways. Nature. 2017; 548:338–342. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

gkaa347_Supplemental_Files

Data Availability Statement

The sequence data have been deposited in the NCBI GEO database under the accession code GSE122744.

