Abstract
With advanced technologies to map RNA modifications, our understanding of them has been revolutionized, and they are seen to be far more widespread and important than previously thought. Current next-generation sequencing (NGS)-based modification profiling methods are blind to RNA modifications and thus require selective chemical treatment or antibody immunoprecipitation methods for particular modification types. They also face the problem of short read length, isoform ambiguities, biases and artifacts. Direct RNA sequencing (DRS) technologies, commercialized by Oxford Nanopore Technologies (ONT), enable the direct interrogation of any given modification present in individual transcripts and promise to address the limitations of previous NGS-based methods. Here, we present the first ONT-based database of quantitative RNA modification profiles, DirectRMDB, which includes 16 types of modification and a total of 904,712 modification sites in 25 species identified from 39 independent studies. In addition to standard functions adopted by existing databases, such as gene annotations and post-transcriptional association analysis, we provide a fresh view of RNA modifications, which enables exploration of the epitranscriptome in an isoform-specific manner. The DirectRMDB database is freely available at: http://www.rnamd.org/directRMDB/.
INTRODUCTION
Conceptually similar to DNA modifications, RNA molecules undergo chemical modifications as well. The first RNA chemical modifications were documented in the 1950s in tRNAs and rRNAs (1). To date, >170 different modification types have been described, including N6-methyladenosine (m6A), pseudouridine (Ψ), 5-methylcytosine (m5C), N1-methyladenosine (m1A), methylation of 2’-O in the four nucleotides (i.e. Am, Um, Cm and Gm) and N7-methylguanosine (m7G) (2–7). With recent advanced technologies to map these RNA modifications, our understanding of them has been revolutionized, and they are now understood to be far more widespread and important than previously thought. Systematic studies of this post-transcriptional regulatory layer have revealed its profound roles in shaping cellular processes, modulating disease risks, and governing cellular fate (8–11). For instance, m6A, one of the most prevalent RNA modifications, has been shown to regulate cardiac gene expression, cell growth and stress response, and to stabilize junctional RNA (12–15). Pseudouridine, the first discovered post-transcriptional modification (16), has recently been implicated in tumor development, maintenance, and progression (17).
RNA-seq has become a popular choice for analyzing complex epitranscriptomes. However, next-generation sequencing (NGS) platforms are typically blind to nucleotide modifications and thus need specific protocols to highlight RNA modifications on the molecules. These typically involve three strategies: (i) antibody immunoprecipitation, which specifically recognizes the modified bases with antibodies; (ii) enzyme digestion, where RNAs are digested with modification-sensitive enzymes; (iii) chemical treatment, using chemical compounds that selectively react with the modified ribonucleotide of interest. Examples of immunoprecipitation methods include m6A-seq (3), PA-m6A-seq (18), m6A-CLIP-seq (19), miCLIP (20), m6A-LAIC-Seq (21) and m6ACE-Seq (22). MAZTER-seq (23), m6A-REF-Seq (24), and DART-seq (25) are enzyme digestion-based methods that quantify m6A modification with single-nucleotide resolution. Pseudo-seq (6) and AlkAniline-seq (26) are typical chemical-based detection methods. These methods are similar in that they enrich fragments harboring modified ribonucleotides, followed by high-throughput sequencing and bioinformatics analysis to detect the modifications.
Although these methods provide invaluable information, they are limited by the availability of high-quality antibodies and the lack of practical chemical reactivities towards a particular RNA modification (27). Thus, only a few of the over 170 known modification types can be accurately and effectively profiled. Even when selective antibodies or chemical treatments are available, the RNA modification to be studied must be chosen beforehand, and customized protocols must be designed for the chosen type, limiting our ability to characterize the epitranscriptome in a systematic and flexible manner (28). Also, these methods require multiple ligation and amplification steps during library preparation, introducing undesired biases and artifacts (29,30). Finally, with respect to the sequencing itself, NGS platforms face the problem of short read length. Mapping modifications on highly repetitive splicing isoforms and characterizing the co-occurrence of distant modifications in the same transcripts remain challenges (27). Thus, most existing NGS-based methods suffer from isoform ambiguity and report only genome-based coordinates of RNA modifications.
The continuing discoveries of novel classes of RNA modifications in various organisms call for more sensitive, flexible, and convenient modification profiling methods. A promising alternative to NGS technologies is the direct RNA sequencing (DRS) platform developed by Oxford Nanopore Technologies (ONT) (31). Each nucleotide causes a distinct ionic current signal as it passes through a sensitive protein channel. This platform infers the RNA sequence by deconvoluting the serial electric signal events recorded while the molecule threads through the channel (32). Natural modifications along the molecule can result in characteristic signals that suggest both the position and identity of modifications (33). Theoretically, direct RNA sequencing allows the real-time and simultaneous detection of any given modification in the native RNA molecule. Additionally, nanopore sequencing offers ultra-long reads that can cover the entire length of the RNA molecule, which benefits the study of RNA modifications on splicing isoforms (34).
ONT sequencing platforms have yielded robust data of reasonable quality, and several pilot studies have detected RNA modifications from the data. For example, EpiNano (m6A) (35), ELIGOS (27), DRUMMER (36) and the work of Parker et al. (37) screen RNA modifications by examining the sequencing error profiles. Another body of work, such as xPore (38), m6Anet (m6A) (39), MINES (m6A) (34), nanom6A (m6A) (40), and nanoPsu (pseudouridine) (41) utilized the variation in current signal intensities. These tools were confirmed to have high accuracy for modification detection with single-nucleotide resolution.
To date, various comprehensive databases of RNA modification sites reported by NGS approaches are publicly available, including MODOMICS (42), RMBase (43), REPIC (44), m6A-atlas (45), m5C-atlas (46), MeT-DB (47), RMVar (48) and M6A2Target (49), which have together provided invaluable information to help decipher the complexities of epitranscriptomes. However, due to the low sensitivity and detection chemistry of NGS-based approaches, a huge proportion of modified sites have not been detected, and the landscape of RNA modifications on the transcriptome is yet to be well-studied (50). To address this gap, we have developed DirectRMDB, the first comprehensive database of RNA modification sites derived from direct RNA sequencing data. In this study, a collection of 16 quantitative modification profiles among 25 species and various cell types or tissues under different conditions were integrated from direct RNA sequencing samples. Data from other studies or techniques were also collected to validate the collected sites. A significant advantage of DirectRMDB is that it provides isoform-level information, including isoform-specific distributions of RNA modifications, isoform expression levels and secondary structure. We constructed a user-friendly web interface for the query, visualization, and sharing of the modification profiles and their association with transcriptional and post-transcriptional regulatory machinery (i.e. RNA binding proteins, miRNAs, splicing events), as well as their potential involvement in pathogenesis. As the first DRS-based database, DirectRMDB is expected to provide new insight into the complex epitranscriptome (Figure 1).
Figure 1.
The overall design of DirectRMDB. DirectRMDB is the first comprehensive database that integrates quantitative modification profiles determined by direct RNA sequencing. For quality assurance, eight different software tools for mining RNA modifications were rigorously integrated, and additional next-generation sequencing samples were collected for validation. DirectRMDB provides an isoform-specific view of modification sites, including their distribution on individual transcripts and the secondary structure predicted from the RNA primary sequences. The potential involvement of reported sites in pathogenesis and their potential interactions with post-transcriptional machinery can also be queried.
MATERIALS AND METHODS
Collection of candidate modification sites
125 direct RNA sequencing samples for 25 species, including 44 FAST5 and 81 FASTQ samples, were collected from 39 independent studies in the Gene Expression Omnibus (GEO) database (Supplementary Table S1). Eight modification detection tools, namely nanom6A (40), MINES (34), xPore (38), m6Anet (39), DRUMMER (36), ELIGOS (27), the work of Parker et al. (37), and Nanopsu (41) were used to infer possible modification sites from samples (Table 1). It is worth noting that although direct RNA sequencing allows the detection of RNA modifications with an isoform-level resolution, some tools (e.g. MINES) still rely on genome-level features and thus cannot distinguish between transcripts. Supplementary Figure S1 shows the general workflow for candidate site collection. As the colors indicate, the eight tools can be categorized into three classes in terms of their required input and thus the different pipelines of data pre-processing: (i) Tombo-based (i.e. nanom6A and MINES): the raw data (FAST5) was re-squiggled (i.e. a new assignment from current signal level data to the reference sequences was defined) with either transcriptome or genome reference using Tombo ‘resquiggle’ function. Specifically for MINES, the Tombo ‘de novo modification detection’ function was used to infer non-canonical bases from the re-squiggled current signals. This Tombo output was provided to nanom6A and MINES as input and candidate m6A sites returned. (ii) Nanopolish-based (i.e. xPore and m6Anet): the Nanopolish (51) ‘eventalign’ function was used to map the signal events extracted from the raw FAST5 sample to the reference transcriptome. m6Anet and xPore then analyzed the output TXT files to identify possible modifications. It is notable that xPore is a comparative method, which requires modification-free samples as control. (iii) BAM-based (i.e. DRUMMER, ELIGOS, the work of Parker et al., and NanoPsu): FAST5 samples were base called into FASTQ format with Guppy before alignment. 
Base-called reads, as well as downloaded FASTQ samples, were aligned to either the reference genome or the reference transcriptome using Minimap2 (52) with the -ax map-ont settings. The resulting SAM files were converted into BAM files, then sorted and indexed with Samtools (53). ELIGOS and NanoPsu examine the error distribution profiles from the alignment file directly, while DRUMMER and the work of Parker et al. require control samples to perform the modification detection. Samples were analyzed by some or all of the eight tools depending on their data format (i.e. FASTQ or FAST5) and on the availability of control samples and authentic reference sequences. References used for each species are summarized in Supplementary Table S2.
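The alignment and post-processing steps described above can be sketched as a small command builder. This is an illustrative sketch only, not the pipeline actually used for DirectRMDB: the function name `alignment_commands`, the output file naming, and the choice to write intermediate files are assumptions. Only the `-ax map-ont` preset is taken from the text.

```python
from typing import List


def alignment_commands(reference: str, fastq: str, out_prefix: str) -> List[List[str]]:
    """Build (but do not run) the align / convert / sort / index commands.

    File names derived from `out_prefix` are hypothetical and chosen for illustration.
    """
    sam = f"{out_prefix}.sam"
    bam = f"{out_prefix}.bam"
    sorted_bam = f"{out_prefix}.sorted.bam"
    return [
        # Align nanopore reads: -a emits SAM, -x map-ont selects the ONT preset.
        ["minimap2", "-ax", "map-ont", "-o", sam, reference, fastq],
        # Convert SAM to BAM.
        ["samtools", "view", "-b", "-o", bam, sam],
        # Coordinate-sort the BAM file.
        ["samtools", "sort", "-o", sorted_bam, bam],
        # Index the sorted BAM for random access by downstream tools.
        ["samtools", "index", sorted_bam],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where Minimap2 and Samtools are installed. Building the commands separately from running them keeps the sketch easy to inspect and log.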
Table 1.
Brief description and comparison of modification calling tools
| Tool | Modification | Input | Isoform-level? | Control sample? | Algorithm |
|---|---|---|---|---|---|
| NanoPsu | Ψ | BAM | No | No | Sequencing error |
| ELIGOS | Mixed | BAM | No | No | Sequencing error |
| The work of Parker et al. | / | BAM | No | Yes | Sequencing error |
| DRUMMER | / | BAM | Yes | Yes | Sequencing error |
| xPore | / | Nanopolish output | Yes | Yes | Current signal |
| m6Anet | m6A | Nanopolish output | Yes | No | Current signal |
| MINES | m6A | Tombo output | No | No | Current signal |
| nanom6A | m6A | Tombo output | No | No | Current signal |
Note: ‘/’ means that detected modification types depend on the modification-free samples. For example, if an m6A-free sample is used as a control, reported sites are expected to be m6A methylation.
The landscape of RNA modifications on transcripts
Nanom6A, xPore, m6Anet, and DRUMMER detect bulk-level RNA modifications by examining either error distribution profiles or current signal distributions along transcripts. However, Nanom6A maps detected sites to the reference genome and presents the results in genome coordinates. Therefore, only xPore, m6Anet and DRUMMER were used to predict modifications in individual transcripts. The workflow to run the three tools is shown in Supplementary Figure S1. To compare the isoform-level modification patterns and to simplify the presentation of results, we converted the transcript coordinates to genomic coordinates while keeping the isoform-level information.
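The conversion from transcript to genomic coordinates can be illustrated with a minimal sketch. It assumes 1-based inclusive coordinates and non-overlapping exons taken from a gene annotation; the function name `transcript_to_genome` and the exon representation are hypothetical and are not taken from the DirectRMDB code.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # 1-based inclusive genomic (start, end), with start <= end


def transcript_to_genome(pos: int, exons: List[Exon], strand: str) -> int:
    """Map a 1-based transcript position to a 1-based genomic position."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if pos < 1:
        raise ValueError("transcript positions are 1-based")
    # Walk exons in transcription order: ascending for '+', descending for '-'.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            # Offset from the exon's 5' end, which is `start` on '+' and `end` on '-'.
            return start + remaining - 1 if strand == "+" else end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")
```

Keeping the transcript ID alongside the returned genomic position preserves the isoform-level information while allowing sites from different isoforms to be compared on a common axis.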
Integration of results and validation by NGS methods
The collection of sites reported by each tool could contain a significant proportion of false-positive sites. To ensure reliability, we collated the results from different samples and tools and then applied strict filtration criteria to generate reliable modification profiles for each species. To screen high-confidence m6A sites, we searched the candidate sites for the known m6A consensus motif DRACH (where D denotes A, G or U, R denotes A or G, and H denotes A, C or U), and only sites reported in multiple cases (i.e. reported by more than one tool or under more than one condition) were kept. Candidate uridines suggested by both ELIGOS and NanoPsu (probability > 0.8) were considered pseudouridine sites.
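The filtering rules above can be expressed as a short sketch. The record layout (site ID, 5-mer centered on the candidate adenosine, tool, condition) and the function names are assumptions made for illustration. "Multiple cases" is interpreted here as more than one distinct (tool, condition) pair, which is one possible reading of the criterion.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# D = A/G/U, R = A/G, then the candidate A, then C, H = A/C/U.
# T is accepted in place of U because references are often stored as DNA.
DRACH = re.compile(r"^[AGUT][AG]AC[ACUT]$")


def is_drach(five_mer: str) -> bool:
    """True if the 5-mer centered on the candidate adenosine matches DRACH."""
    return bool(DRACH.match(five_mer.upper()))


def filter_m6a(candidates: Iterable[Tuple[str, str, str, str]]) -> List[str]:
    """Keep site IDs that sit in a DRACH motif and have more than one supporting case.

    Each candidate is (site_id, five_mer, tool, condition).
    """
    support: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    motif: Dict[str, str] = {}
    for site_id, five_mer, tool, condition in candidates:
        support[site_id].add((tool, condition))
        motif[site_id] = five_mer
    return sorted(s for s, cases in support.items()
                  if len(cases) > 1 and is_drach(motif[s]))


def is_pseudouridine(called_by_eligos: bool, nanopsu_probability: float) -> bool:
    """A uridine is kept only if ELIGOS reports it and NanoPsu gives probability > 0.8."""
    return called_by_eligos and nanopsu_probability > 0.8
```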
ELIGOS identified a set of putative modification sites without characterizing their modification type. Therefore, NGS technologies were used to label these unknown types of candidate modification sites. Specifically, 10 modification profiles for human (i.e. m1A, methylation of 2’-O in the four nucleotides, m7G, AtoI, m6Am and m5U, m5C), two modification profiles for mouse (i.e. m1A and m5C), three modification profiles (i.e. f5C, dihydrouridine and ac4c) for yeast, and two modification profiles (i.e. m6A and Y) for Arabidopsis thaliana were collected from public resources, including m6A-atlas, m5C-atlas, RMDisease, MODOMICS, and supplementary data of published works. To ensure reliability and save space, unlabeled ELIGOS results were excluded from the final proposed profiles but can be downloaded from DirectRMDB.
To further validate our results, we collected high-confidence modification sites and modification-enriched peaks derived from next-generation sequencing samples (Supplementary Table S3). Additionally, multiple modification profiles reported by LC-MS techniques were downloaded from MODOMICS and RMBase. Cross-validation was performed between candidate and NGS/LC-MS-derived sites. Sites confirmed by other techniques were clearly labeled. We also compared our results with sites published by other ONT-based modification detection studies (38,54). Overlapping sites were indicated as well.
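As a rough illustration of the cross-validation step, the sketch below labels a candidate site by whether it coincides with a single-base site from another technique or falls inside a modification-enriched peak. The tuple layouts, the label strings, and the strand-aware matching are assumptions for this example and are not taken from the DirectRMDB implementation.

```python
from typing import Iterable, Set, Tuple

Site = Tuple[str, int, str]        # (chromosome, 1-based position, strand)
Peak = Tuple[str, int, int, str]   # (chromosome, start, end, strand), inclusive


def support_label(site: Site, known_sites: Set[Site], peaks: Iterable[Peak]) -> str:
    """Return 'site', 'peak' or 'none' depending on the strongest external support."""
    if site in known_sites:
        return "site"  # exact single-nucleotide agreement
    chrom, pos, strand = site
    for p_chrom, start, end, p_strand in peaks:
        if p_chrom == chrom and p_strand == strand and start <= pos <= end:
            return "peak"  # falls inside an enriched region
    return "none"
```

A linear scan over peaks is used for readability; for genome-scale data an interval index would be a more practical choice.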
Secondary structure prediction
RNA plays a vital role in the cell, not only as an intermediate product for the transmission of genetic information, but also as a functional element. Single-stranded-RNA molecules can fold into specific and stable structures. It is known that there is a strong association between their functions and structures (55). The three-dimensional structure of RNA molecules can only be determined by X-ray crystallography, nuclear magnetic resonance, and other laborious and high-cost methods (56). Therefore, we present the secondary structure of isoforms, which is easier to predict computationally. We use RNAfold (57), a widely used RNA secondary structure prediction software, with default parameter settings, to infer the structure from the RNA primary sequences. The landscapes of RNA modifications on each isoform under different conditions were annotated and highlighted on the predicted structure. For a better view, ultra-long reads (>2001nt) were cut into 2001nt fragments that contained modified bases.
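The fragmenting step can be sketched as follows. The text states only that long sequences were cut into 2001 nt fragments containing modified bases; centering the window on the modified base and shifting it at the sequence ends are assumptions of this sketch, as is the function name `fragment_around`. The returned fragment would then be passed to RNAfold, which is not invoked here.

```python
from typing import Tuple


def fragment_around(seq: str, mod_index: int, width: int = 2001) -> Tuple[str, int]:
    """Return a fragment of at most `width` nt containing the modified base.

    `mod_index` is 0-based. The second return value is the 0-based index of the
    modified base inside the fragment, so it can be highlighted on the structure.
    """
    if not 0 <= mod_index < len(seq):
        raise IndexError("modified base lies outside the sequence")
    if len(seq) <= width:
        return seq, mod_index  # short sequences are kept whole
    half = width // 2
    # Center on the modified base, then clamp so the window stays inside the sequence.
    start = min(max(mod_index - half, 0), len(seq) - width)
    return seq[start:start + width], mod_index - start
```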
Quantitative profiles of putative modification sites
44 FAST5 samples from nine species were collected to quantify the modification status of high-confidence modification sites across different cell lines/tissues and conditions. The Tombo ‘de novo modification detection’ function was used to identify non-canonical (i.e. modified) bases within individual reads, and the fractions of modified reads aligned to each genomic position were output with the Tombo ‘text output’ command. This modification fraction is used to quantify the modification status of reported sites. In addition to modification status, the transcripts’ expression profiles were also estimated from the BAM files aligned to the transcriptome reference using NanoCount (58), an isoform expression quantification tool designed for direct RNA sequencing data.
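The modification fraction described above is the number of reads called as modified at a position divided by the number of reads covering that position. A minimal sketch of this calculation is given below; the input layout (one `(position, is_modified)` pair per read and position) is a hypothetical simplification and does not reproduce Tombo's actual output format.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def modified_fractions(read_calls: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Compute, per position, the fraction of covering reads called as modified."""
    covered: Counter = Counter()
    modified: Counter = Counter()
    for position, is_modified in read_calls:
        covered[position] += 1
        if is_modified:
            modified[position] += 1
    # Every key in `covered` has at least one read, so the division is safe.
    return {p: modified[p] / covered[p] for p in covered}
```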
Basic annotation for modification sites
Gene annotation files were downloaded from GENCODE (human and mouse) (59) and NCBI (60) (other species). The high-confidence sites (genome-wide) were annotated with the collected gene annotations and were classified into different gene types and genomic regions using ChIPseeker (61). In addition to basic gene annotation, the potential interactions between modifications and splicing events, miRNAs and RNA binding proteins (RBPs) were included for human and mouse. miRNA target sites, RBP binding sites, and other event information were obtained from starBase (62), POSTAR (63) and the UCSC genome browser database (64), respectively. Since the nanopore sensor protein reads a k-mer (4–6 nt) at a time, the presence of non-canonical bases could cause misleading signals and thus influence the deconvolution of adjacent bases. Therefore, for each site, we indicated the presence of other modifications within 5 bp upstream and downstream as a warning of potential false positives.
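The neighbor warning can be sketched with a sorted-position lookup. The site representation `(chromosome, strand, position)`, the function name `flag_neighbours`, and the decision to compare only sites on the same strand are assumptions of this example. Identical coordinates are collapsed, so two modification types reported at the same base are not treated as neighbors here.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Site = Tuple[str, str, int]  # (chromosome, strand, 1-based position)


def flag_neighbours(sites: Iterable[Site], window: int = 5) -> Set[Site]:
    """Return the sites that have at least one other site within `window` bp."""
    by_key: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for chrom, strand, pos in set(sites):
        by_key[(chrom, strand)].append(pos)
    flagged: Set[Site] = set()
    for (chrom, strand), positions in by_key.items():
        positions.sort()
        for pos in positions:
            # Count sorted positions inside [pos - window, pos + window].
            lo = bisect_left(positions, pos - window)
            hi = bisect_right(positions, pos + window)
            if hi - lo > 1:  # more than the site itself
                flagged.add((chrom, strand, pos))
    return flagged
```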
Potential involvement of individual modification sites in pathogenesis
It is known that RNA modifications are closely related to the progression of diseases. To investigate the contribution of individual modification sites to disease development, we analyzed their positional relationship with potentially disease-associated genetic mutations. Sites that exactly overlapped with mutations were indicated. A collection of single nucleotide polymorphisms (SNPs), including both common variations and clinical mutations for human and mouse, was downloaded from dbSNP (65).
Database and web interface implementation
MySQL was used to store and manage the metadata. Hypertext Markup Language (HTML), Cascading style sheets (CSS) and Hypertext Preprocessor (PHP) were used to build the web interface. Genome browser JBrowse2 (66) was used to provide an integrated view of reference sequences, modification site information, related RBP binding, splicing, miRNA binding event, and associated SNPs.
RESULTS
The eight modification detection tools, applied to 125 direct RNA sequencing samples, suggested more than 16,000,000 candidate modification sites. By manually integrating, evaluating, and filtering the results, a total of 904 712 sites of 16 chemical modifications, namely m6A, Ψ, m1A, m6Am, 2’-O-Me, m5U, m7G, m5C, D, f5C, Y and ac4c, across 25 species, including Homo sapiens, Mus musculus, Arabidopsis thaliana, Sus scrofa and Escherichia coli, were confidently identified (Table 2 and Supplementary Table S4). Among these proposed sites, 149 353 human sites and 91 910 mouse sites were further confirmed by other techniques (i.e. NGS techniques and LC–MS) (Supplementary Figure S2). The landscapes of RNA modifications on human and mouse transcriptomes were evaluated. 225 041 sites in 26 039 distinct human transcripts and 228 558 sites in 21 413 mouse transcripts, corresponding respectively to 88 230 and 112 820 bases on the human and mouse genome, were found. We also predicted the secondary structure of isoforms and calculated their expression levels under specific cell lines and conditions.
Table 2.
The data statistics for DirectRMDB
| Species | m6A | Ψ | 2′-O-Me | m5C | m1A | Other | Total |
|---|---|---|---|---|---|---|---|
| Human | 195 871 | 134 834 | 1506 | 26 033 | 2979 | 3803 | 365 026 |
| Mouse | 186 175 | 45 397 | / | 970 | 693 | / | 233 235 |
| Yeast | 148 | 59 | / | / | / | 19 | 226 |
| 22 other species | 203 973 | 102 241 | / | / | / | 11 | 306 225 |
Note: The numbers in the table indicate the total count of each modification type. In human, ‘Other’ refers to m7G, m5U, m6Am, and AtoI modifications, while in yeast, ‘Other’ refers to ac4c, D, Y and f5C. Please refer to Supplementary Table S4 for more details.
Quantitative modification profiles (i.e. the fraction of modified reads) for nine species under 44 different cell lines/tissues and conditions were calculated. Gene annotation of 20 species was successfully performed, while annotation of Bipolaris sorokiniana, Candida nivariensis, Chikungunya virus and influenza A virus failed since no suitable annotation file was available for these species. Since non-canonical bases can influence the deconvolution of adjacent nucleotides due to the nanopore sequencing chemistry, we evaluated the interaction between reported modification sites. A total of 105 581 sites were flagged as consecutive modifications (i.e. sites that have other modifications within 5 bp upstream or downstream). For human and mouse, we also investigated the interaction between RNA modifications and RNA binding proteins, splicing sites, and miRNA targets. For human, 171 RNA binding proteins, 826 miRNAs, and 101 587 splicing events are suggested to be associated with 275 956, 54 390 and 108 738 modification sites, respectively. For mouse, we identified 39 RNA binding proteins, 905 miRNAs and 79 010 splicing events that are related to RNA modifications. Also, 80 614 human sites and 5636 mouse modification sites are documented SNP sites, suggesting their potential involvement in disease development.
Comprehensive atlas of various types of RNA modifications
We constructed DirectRMDB, the first database that integrates direct RNA sequencing data to explore post-transcriptional modifications of RNAs. A user-friendly web interface was provided to search, browse, visualize and download the collected high-confidence sites of 16 modification types and their potential relationships with miRNA targets, RBPs, splicing events, and pathogenesis. JBrowse2 was integrated for interactive exploration of individual sites or regions of interest. We also provided isoform-level information, including the landscape of RNA modifications on individual transcripts, annotated secondary structures and transcript expression levels under particular cell lines, tissues, and conditions. The DirectRMDB database is freely available at: http://www.rnamd.org/directRMDB/, and has a mirror at: www.xjtlu.edu.cn/biologicalsciences/directRMDB.
Case study on protein-coding gene RNF138
Ring finger protein 138 (RNF138) is a ubiquitin ligase belonging to the E3 ligase family, which harbors a ring finger protein domain, three zinc-finger-like domains, and a ubiquitin-interacting motif (67,68). It promotes cell survival by counteracting apoptotic signaling or directly influencing genome stability. Emerging evidence has linked the RNF138 protein with tumorigenesis, neurodegenerative disorders, and chronic inflammatory conditions (69,70). Searching the Homo sapiens repository of DirectRMDB with the gene name RNF138 (Figure 2A) returned 101 entries, comprising one AtoI, one m6Am, 45 m6A and 54 Ψ sites, detected by MINES, m6Anet, NanoPsu, nanom6A, and ELIGOS in 7 different cell lines (Figure 2C and D). Among the results, ELIGOS screened the majority of them (i.e. one AtoI, one m6Am, 25 m6A and 54 Ψ sites), suggesting its high sensitivity in screening modified bases. MINES contributed only one m6A site, which may be explained by its strict evaluation criteria. Entries without available RBP binding, miRNA, splicing site, SNP, or transcriptomic information, or not confirmed by NGS methods, can be removed by clicking the corresponding buttons in the top filters box. Users can also retrieve sites of specific modification types from certain cell lines, tools or RNA types (e.g. mRNA, rRNA and tRNA) of interest. Detailed information on individual sites can be acquired by clicking the site ID. Taking ‘directRMDB_HomoSapiens_114258’ as an example, the basic information returned shows that it is an m6A site reported by ELIGOS and nanom6A from four different samples under two cell lines (Figure 2E). Figure 2F shows the fraction of modified reads in different samples. Since no transcript information is available for these samples, the transcript ID column is filled with NAs (for Not Available). In terms of interaction with regulatory events, this m6A site is expected to associate with two RNA binding proteins and two miRNAs.
Also, this modification might play an important role in pathogenesis since it is a reported SNP site, where the adenosine base is mutated to guanosine in certain cases (Figure 2G).
Figure 2.
Case study on protein-coding gene RNF138. (A) Searching by gene name. (B) 101 sites of four modification types on the RNF138 gene. (C) Number of modifications detected by different software. (D) Pie chart of the number of modifications detected in each cell line. (E) Basic information of the example site with ID of ‘directRMDB_HomoSapiens_114258’. (F) Conditions involved in the example site and fraction of modified reads under different conditions. (G) Genome browser view of the example site and its relationships with RNA binding proteins, miRNAs and pathogenesis.
Case study on lncRNA MALAT1
Metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) is a long non-coding RNA (lncRNA) that has been confirmed to influence cancer development and metastasis (71). 73 modification sites from the DirectRMDB Homo sapiens set, including m6A, m5C, pseudouridine, and 2’-O-Me, were found on MALAT1 transcripts. As in the previous case, detailed information for individual modification sites, including interactions with RNA binding proteins, miRNAs, and other sites, can be acquired by clicking the site ID. Taking directRMDB_HomoSapiens_176403 as an example (Figure 3A), it is a pseudouridine site screened by ELIGOS and NanoPsu from the human ENDOC and SEAC cell lines. A previous study (GSE60047) based on Ψ-seq also found pseudouridine modification at this position (Figure 3B). Figure 3C shows that an m1A site is located 4 bp downstream of the example site. Since the nanopore sensor protein interacts with a ∼5 nt region simultaneously, the presence of other non-canonical bases nearby may cause misleading signals and thus influence the analysis. Although the example site is reported in multiple cases and is supported by other techniques, it is possible that the site is a false positive resulting from adjacent misleading signals.
Figure 3.
Case study on lncRNA MALAT1. (A) Basic information of the example site with ID of ‘directRMDB_HomoSapiens_176403’. (B) Information on other techniques or studies that confirmed the example site. (C) Information on sites located within the 11 bp region centered on the example site.
Case study: isoform level exploration of RNA modifications
TXNDC12 (chr1:52020131..52056171, GRCh38.p14 assembly) and KTI12 (chr1:52042103..52033810) are two genes that share common regions on chromosome 1. For an RNA modification site located within the shared regions, it can be difficult with NGS epitranscriptomics profiling methods to decide which gene or transcript it belongs to. Fortunately, direct RNA sequencing technologies offer a solution to this isoform ambiguity problem thanks to their longer reads. By more precisely aligning reads to transcriptome references, modifications can be confidently located in an isoform-specific manner (Figure 4A). Here, the m6A site with ID of ‘directRMDB_HomoSapiens_142249’ is taken as an example (Figure 4B). From a genome-wide view, it is located in the shared region of TXNDC12 and KTI12 and was arbitrarily, and wrongly, assigned to TXNDC12 by ChIPseeker. On the DirectRMDB details page, in contrast, we can see that m6Anet unambiguously assigned this site to ENST00000371614, an isoform of the KTI12 gene, under four different conditions. Also, expression levels of KTI12 isoforms under the four conditions are displayed (Figure 4C). The picture of the predicted RNA secondary structure with highlighted modified bases can be queried by clicking the ‘show’ button on the ‘secondary structure’ column (Figure 4D).
Figure 4.
Case study: isoform level exploration of RNA modifications. (A) Genome and isoform level views of RNA modifications. (B) Basic information of the example site with ID of ‘directRMDB_HomoSapiens_142249’. (C) Screenshot of predicted secondary structure of ENST00000371614. The example modification site is highlighted (red). (D) Transcripts information, including transcriptomic coordinates, expression levels.
Conclusion
Maps of various RNA modifications have been constructed by coupling antibody immunoprecipitation or chemical probing with high-throughput sequencing. However, customized protocols are required for each RNA modification type, thus limiting our ability to characterize the plasticity of the whole epitranscriptome systematically and in an unbiased fashion. Fortunately, the development of direct RNA sequencing platforms enables the simultaneous mapping of diverse RNA modification types and the detection of any given modification present in native RNA molecules. With the rapid accumulation of direct RNA sequencing data and dedicated ONT analysis tools, we constructed DirectRMDB, the first database of multiple RNA modifications unveiled by direct RNA sequencing technologies. By taking advantage of direct RNA sequencing technologies, DirectRMDB offers several novel features compared with existing epitranscriptomics databases: (i) since ONT direct sequencing generates ultra-long reads and is less vulnerable to isoform ambiguity, we confidently present the isoform-specific distribution of RNA modification sites; (ii) we provide transcript expression levels under different conditions; (iii) we integrate novel modification sites that have not been detected by NGS-based methods. Also, a user-friendly graphical interface integrated with a genome browser was constructed to facilitate the query, visualization, and analysis of this novel, fine-grained epitranscriptomics data. Due to the nature of ONT direct sequencing, the results might contain some false-positive sites. Therefore, we clearly indicated the tools, samples, and other techniques (i.e. NGS techniques and LC–MS) that support each site. Users can filter, select and use sites based on their own understanding and knowledge. Overall, DirectRMDB provides a fresh view of the epitranscriptome.
We will continue to update and improve the database by integrating the latest sequencing data and advanced tools to ensure that it remains a valuable resource for the research community.
DATA AVAILABILITY
No new data were generated or analysed in support of this research. The DirectRMDB database is freely available at: http://www.rnamd.org/directRMDB/, and has a mirror at: www.xjtlu.edu.cn/biologicalsciences/directRMDB.
Supplementary Material
Contributor Information
Yuxin Zhang, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350004, China; Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Jie Jiang, Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Jiongming Ma, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350004, China; Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Zhen Wei, Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Life Course and Medical Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Yue Wang, Department of Mathematical Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Department of Computer Science, University of Liverpool, L69 7ZB, Liverpool, UK.
Bowen Song, Department of Mathematical Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Jia Meng, Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; AI University Research Centre, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Guifang Jia, Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
João Pedro de Magalhães, Institute of Life Course and Medical Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Daniel J Rigden, Institute of Systems, Molecular and Integrative Biology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Daiyun Hang, Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Department of Computer Science, University of Liverpool, L69 7ZB, Liverpool, UK.
Kunqi Chen, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350004, China; Fujian Key Laboratory of Tumor Microbiology, Department of Medical Microbiology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350004, China.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
National Natural Science Foundation of China [32100519 and 31671373]; XJTLU Key Program Special Fund [KSF-E-51 and KSF-P-02]; Scientific Research Foundation for Advanced Talents of Fujian Medical University [XRCZX202109]. Funding for open access charge: Scientific Research Foundation for Advanced Talents of Fujian Medical University [XRCZX202109].
Conflict of interest statement. None declared.
REFERENCES
- 1. Kemp J.W., Allen F.W. Ribonucleic acids from pancreas which contain new components. Biochim. Biophys. Acta. 1958; 28:51–58.
- 2. Boccaletto P., Machnicka M.A., Purta E., Piatkowski P., Baginski B., Wirecki T.K., de Crécy-Lagard V., Ross R., Limbach P.A., Kotter A. et al. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic Acids Res. 2018; 46:D303–D307.
- 3. Dominissini D., Moshitch-Moshkovitz S., Schwartz S., Salmon-Divon M., Ungar L., Osenberg S., Cesarkas K., Jacob-Hirsch J., Amariglio N., Kupiec M. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012; 485:201–206.
- 4. Li X., Xiong X., Zhang M., Wang K., Chen Y., Zhou J., Mao Y., Lv J., Yi D., Chen X.W. et al. Base-resolution mapping reveals distinct m(1)A methylome in nuclear- and mitochondrial-encoded transcripts. Mol. Cell. 2017; 68:993–1005.
- 5. Hussain S., Aleksic J., Blanco S., Dietmann S., Frye M. Characterizing 5-methylcytosine in the mammalian epitranscriptome. Genome Biol. 2013; 14:215.
- 6. Carlile T.M., Rojas-Duran M.F., Zinshteyn B., Shin H., Bartoli K.M., Gilbert W.V. Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells. Nature. 2014; 515:143–146.
- 7. Marchand V., Blanloeil-Oillo F., Helm M., Motorin Y. Illumina-based RiboMethSeq approach for mapping of 2'-O-Me residues in RNA. Nucleic Acids Res. 2016; 44:e135.
- 8. Schwartz S., Agarwala S.D., Mumbach M.R., Jovanovic M., Mertins P., Shishkin A., Tabach Y., Mikkelsen T.S., Satija R., Ruvkun G. et al. High-resolution mapping reveals a conserved, widespread, dynamic mRNA methylation program in yeast meiosis. Cell. 2013; 155:1409–1421.
- 9. Torres A.G., Batlle E., Ribas de Pouplana L. Role of tRNA modifications in human diseases. Trends Mol. Med. 2014; 20:306–314.
- 10. Batista P.J., Molinie B., Wang J., Qu K., Zhang J., Li L., Bouley D.M., Lujan E., Haddad B., Daneshvar K. et al. m(6)A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell. 2014; 15:707–719.
- 11. Haussmann I.U., Bodi Z., Sanchez-Moran E., Mongan N.P., Archer N., Fray R.G., Soller M. m(6)A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature. 2016; 540:301–304.
- 12. Chen J., Zhang Y.C., Huang C., Shen H., Sun B., Cheng X., Zhang Y.J., Yang Y.G., Shu Q., Yang Y. et al. m(6)A regulates neurogenesis and neuronal development by modulating histone methyltransferase Ezh2. Genomics Proteomics Bioinformatics. 2019; 17:154–168.
- 13. Engel M., Eggert C., Kaplick P.M., Eder M., Roh S., Tietze L., Namendorf C., Arloth J., Weber P., Rex-Haffner M. et al. The role of m(6)A/m-RNA methylation in stress response regulation. Neuron. 2018; 99:389–403.
- 14. Kmietczyk V., Riechert E., Kalinski L., Boileau E., Malovrh E., Malone B., Gorska A., Hofmann C., Varma E., Jürgensen L. et al. m(6)A-mRNA methylation regulates cardiac gene expression and cellular growth. Life Sci. Alliance. 2019; 2:e201800233.
- 15. Liu B., Merriman D.K., Choi S.H., Schumacher M.A., Plangger R., Kreutz C., Horner S.M., Meyer K.D., Al-Hashimi H.M. A potentially abundant junctional RNA motif stabilized by m(6)A and Mg(2+). Nat. Commun. 2018; 9:2761.
- 16. Cohn W.E. Pseudouridine, a carbon-carbon linked ribonucleoside in ribonucleic acids: isolation, structure, and chemical characteristics. J. Biol. Chem. 1960; 235:1488–1498.
- 17. Nombela P., Miguel-López B., Blanco S. The role of m(6)A, m(5)C and Ψ RNA modifications in cancer: novel therapeutic opportunities. Mol. Cancer. 2021; 20:18.
- 18. Chen K., Lu Z., Wang X., Fu Y., Luo G.Z., Liu N., Han D., Dominissini D., Dai Q., Pan T. et al. High-resolution N(6)-methyladenosine (m(6)A) map using photo-crosslinking-assisted m(6)A sequencing. Angew. Chem. Int. Ed Engl. 2015; 54:1587–1590.
- 19. Ke S., Alemu E.A., Mertens C., Gantman E.C., Fak J.J., Mele A., Haripal B., Zucker-Scharff I., Moore M.J., Park C.Y. et al. A majority of m6A residues are in the last exons, allowing the potential for 3' UTR regulation. Genes Dev. 2015; 29:2037–2053.
- 20. Linder B., Grozhik A.V., Olarerin-George A.O., Meydan C., Mason C.E., Jaffrey S.R. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat. Methods. 2015; 12:767–772.
- 21. Molinie B., Wang J., Lim K.S., Hillebrand R., Lu Z.X., Van Wittenberghe N., Howard B.D., Daneshvar K., Mullen A.C., Dedon P. et al. m(6)A-LAIC-seq reveals the census and complexity of the m(6)A epitranscriptome. Nat. Methods. 2016; 13:692–698.
- 22. Koh C.W.Q., Goh Y.T., Goh W.S.S. Atlas of quantitative single-base-resolution N(6)-methyl-adenine methylomes. Nat. Commun. 2019; 10:5636.
- 23. Garcia-Campos M.A., Edelheit S., Toth U., Safra M., Shachar R., Viukov S., Winkler R., Nir R., Lasman L., Brandis A. et al. Deciphering the “m(6)A code” via antibody-independent quantitative profiling. Cell. 2019; 178:731–747.
- 24. Zhang Z., Chen L.Q., Zhao Y.L., Yang C.G., Roundtree I.A., Zhang Z., Ren J., Xie W., He C., Luo G.Z. Single-base mapping of m(6)A by an antibody-independent method. Sci. Adv. 2019; 5:eaax0250.
- 25. Meyer K.D. DART-seq: an antibody-free method for global m(6)A detection. Nat. Methods. 2019; 16:1275–1280.
- 26. Marchand V., Ayadi L., Ernst F.G.M., Hertler J., Bourguignon-Igel V., Galvanin A., Kotter A., Helm M., Lafontaine D.L.J., Motorin Y. AlkAniline-Seq: profiling of m(7)G and m(3)C RNA modifications at single nucleotide resolution. Angew. Chem. Int. Ed Engl. 2018; 57:16785–16790.
- 27. Jenjaroenpun P., Wongsurawat T., Wadley T.D., Wassenaar T.M., Liu J., Dai Q., Wanchai V., Akel N.S., Jamshidi-Parsian A., Franco A.T. et al. Decoding the epitranscriptional landscape from native RNA sequences. Nucleic Acids Res. 2021; 49:e7.
- 28. Begik O., Lucas M.C., Pryszcz L.P., Ramirez J.M., Medina R., Milenkovic I., Cruciani S., Liu H., Vieira H.G.S., Sas-Chen A. et al. Quantitative profiling of pseudouridylation dynamics in native RNAs with nanopore sequencing. Nat. Biotechnol. 2021; 39:1278–1291.
- 29. Carrara M., Beccuti M., Lazzarato F., Cavallo F., Cordero F., Donatelli S., Calogero R.A. State-of-the-art fusion-finder algorithms sensitivity and specificity. Biomed. Res. Int. 2013; 2013:340620.
- 30. Hansen K.D., Brenner S.E., Dudoit S. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res. 2010; 38:e131.
- 31. Garalde D.R., Snell E.A., Jachimowicz D., Sipos B., Lloyd J.H., Bruce M., Pantic N., Admassu T., James P., Warland A. et al. Highly parallel direct RNA sequencing on an array of nanopores. Nat. Methods. 2018; 15:201–206.
- 32. Lu H., Giordano F., Ning Z. Oxford Nanopore MinION sequencing and genome assembly. Genomics Proteomics Bioinformatics. 2016; 14:265–279.
- 33. McIntyre A.B.R., Alexander N., Grigorev K., Bezdan D., Sichtig H., Chiu C.Y., Mason C.E. Single-molecule sequencing detection of N6-methyladenine in microbial reference materials. Nat. Commun. 2019; 10:579.
- 34. Lorenz D.A., Sathe S., Einstein J.M., Yeo G.W. Direct RNA sequencing enables m(6)A detection in endogenous transcript isoforms at base-specific resolution. RNA. 2020; 26:19–28.
- 35. Liu H., Begik O., Lucas M.C., Ramirez J.M., Mason C.E., Wiener D., Schwartz S., Mattick J.S., Smith M.A., Novoa E.M. Accurate detection of m(6)A RNA modifications in native RNA sequences. Nat. Commun. 2019; 10:4079.
- 36. Price A.M., Hayer K.E., McIntyre A.B.R., Gokhale N.S., Abebe J.S., Della Fera A.N., Mason C.E., Horner S.M., Wilson A.C., Depledge D.P. et al. Direct RNA sequencing reveals m(6)A modifications on adenovirus RNA are necessary for efficient splicing. Nat. Commun. 2020; 11:6016.
- 37. Parker M.T., Knop K., Sherwood A.V., Schurch N.J., Mackinnon K., Gould P.D., Hall A.J., Barton G.J., Simpson G.G. Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m(6)A modification. Elife. 2020; 9:e49658.
- 38. Pratanwanich P.N., Yao F., Chen Y., Koh C.W.Q., Wan Y.K., Hendra C., Poon P., Goh Y.T., Yap P.M.L., Chooi J.Y. et al. Identification of differential RNA modifications from nanopore direct RNA sequencing with xPore. Nat. Biotechnol. 2021; 39:1394–1402.
- 39. Hendra C., Pratanwanich P.N., Wan Y.K., Goh W.S.S., Thiery A., Göke J. Detection of m6A from direct RNA sequencing using a multiple instance learning framework. 2021; bioRxiv doi: 10.1101/2021.09.20.461055, 22 September 2021, preprint: not peer reviewed.
- 40. Gao Y., Liu X., Wu B., Wang H., Xi F., Kohnen M.V., Reddy A.S.N., Gu L. Quantitative profiling of N(6)-methyladenosine at single-base resolution in stem-differentiating xylem of Populus trichocarpa using nanopore direct RNA sequencing. Genome Biol. 2021; 22:22.
- 41. Huang S., Zhang W., Katanski C.D., Dersh D., Dai Q., Lolans K., Yewdell J., Eren A.M., Pan T. Interferon inducible pseudouridine modification in human mRNA by quantitative nanopore profiling. Genome Biol. 2021; 22:330.
- 42. Boccaletto P., Stefaniak F., Ray A., Cappannini A., Mukherjee S., Purta E., Kurkowska M., Shirvanizadeh N., Destefanis E., Groza P. et al. MODOMICS: a database of RNA modification pathways. 2021 update. Nucleic Acids Res. 2022; 50:D231–D235.
- 43. Xuan J.J., Sun W.J., Lin P.H., Zhou K.R., Liu S., Zheng L.L., Qu L.H., Yang J.H. RMBase v2.0: deciphering the map of RNA modifications from epitranscriptome sequencing data. Nucleic Acids Res. 2018; 46:D327–D334.
- 44. Liu S., Zhu A., He C., Chen M. REPIC: a database for exploring the N(6)-methyladenosine methylome. Genome Biol. 2020; 21:100.
- 45. Tang Y., Chen K., Song B., Ma J., Wu X., Xu Q., Wei Z., Su J., Liu G., Rong R. et al. m6A-Atlas: a comprehensive knowledgebase for unraveling the N6-methyladenosine (m6A) epitranscriptome. Nucleic Acids Res. 2021; 49:D134–D143.
- 46. Ma J., Song B., Wei Z., Huang D., Zhang Y., Su J., de Magalhaes J.P., Rigden D.J., Meng J., Chen K. m5C-Atlas: a comprehensive database for decoding and annotating the 5-methylcytosine (m5C) epitranscriptome. Nucleic Acids Res. 2022; 50:D196–D203.
- 47. Liu H., Wang H., Wei Z., Zhang S., Hua G., Zhang S.W., Zhang L., Gao S.J., Meng J., Chen X. et al. MeT-DB V2.0: elucidating context-specific functions of N6-methyl-adenosine methyltranscriptome. Nucleic Acids Res. 2018; 46:D281–D287.
- 48. Luo X., Li H., Liang J., Zhao Q., Xie Y., Ren J., Zuo Z. RMVar: an updated database of functional variants involved in RNA modifications. Nucleic Acids Res. 2021; 49:D1405–D1412.
- 49. Deng S., Zhang H., Zhu K., Li X., Ye Y., Li R., Liu X., Lin D., Zuo Z., Zheng J. M6A2Target: a comprehensive database for targets of m6A writers, erasers and readers. Brief Bioinform. 2021; 22:bbaa055.
- 50. Jonkhout N., Tran J., Smith M.A., Schonrock N., Mattick J.S., Novoa E.M. The RNA modification landscape in human disease. RNA. 2017; 23:1754–1769.
- 51. Simpson J.T., Workman R.E., Zuzarte P.C., David M., Dursi L.J., Timp W. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods. 2017; 14:407–410.
- 52. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018; 34:3094–3100.
- 53. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. The sequence alignment/map format and SAMtools. Bioinformatics. 2009; 25:2078–2079.
- 54. Piechotta M., Naarmann-de Vries I.S., Wang Q., Altmüller J., Dieterich C. RNA modification mapping with JACUSA2. Genome Biol. 2022; 23:115.
- 55. McCaskill J.S. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers. 1990; 29:1105–1119.
- 56. Andronescu M., Bereg V., Hoos H.H., Condon A. RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinf. 2008; 9:340.
- 57. Denman R.B. Using RNAFOLD to predict the activity of small catalytic RNAs. BioTechniques. 1993; 15:1090–1095.
- 58. Gleeson J., Leger A., Prawer Y.D.J., Lane T.A., Harrison P.J., Haerty W., Clark M.B. Accurate expression quantification from nanopore direct RNA sequencing with NanoCount. Nucleic Acids Res. 2022; 50:e19.
- 59. Frankish A., Diekhans M., Jungreis I., Lagarde J., Loveland J.E., Mudge J.M., Sisu C., Wright J.C., Armstrong J., Barnes I. et al. GENCODE 2021. Nucleic Acids Res. 2021; 49:D916–D923.
- 60. Sayers E.W., Beck J., Bolton E.E., Bourexis D., Brister J.R., Canese K., Comeau D.C., Funk K., Kim S., Klimke W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2021; 49:D10–D17.
- 61. Yu G., Wang L.G., He Q.Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015; 31:2382–2383.
- 62. Li J.H., Liu S., Zhou H., Qu L.H., Yang J.H. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 2014; 42:D92–D97.
- 63. Hu B., Yang Y.T., Huang Y., Zhu Y., Lu Z.J. POSTAR: a platform for exploring post-transcriptional regulation coordinated by RNA-binding proteins. Nucleic Acids Res. 2017; 45:D104–D114.
- 64. Lee B.T., Barber G.P., Benet-Pagès A., Casper J., Clawson H., Diekhans M., Fischer C., Gonzalez J.N., Hinrichs A.S., Lee C.M. et al. The UCSC genome browser database: 2022 update. Nucleic Acids Res. 2021; 50:D1115–D1122.
- 65. Sherry S.T., Ward M.H., Kholodov M., Baker J., Phan L., Smigielski E.M., Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001; 29:308–311.
- 66. Skinner M.E., Uzilov A.V., Stein L.D., Mungall C.J., Holmes I.H. JBrowse: a next-generation genome browser. Genome Res. 2009; 19:1630–1638.
- 67. Ismail I.H., Gagné J.P., Genois M.M., Strickfaden H., McDonald D., Xu Z., Poirier G.G., Masson J.Y., Hendzel M.J. The RNF138 E3 ligase displaces Ku to promote DNA end resection and regulate DNA repair pathway choice. Nat. Cell Biol. 2015; 17:1446–1457.
- 68. Schmidt C.K., Galanty Y., Sczaniecka-Clift M., Coates J., Jhujh S., Demir M., Cornwell M., Beli P., Jackson S.P. Systematic E2 screening reveals a UBE2D-RNF138-CtIP axis promoting DNA repair. Nat. Cell Biol. 2015; 17:1458–1470.
- 69. Lee K., Byun K., Hong W., Chuang H.Y., Pack C.G., Bayarsaikhan E., Paek S.H., Kim H., Shin H.Y., Ideker T. et al. Proteome-wide discovery of mislocated proteins in cancer. Genome Res. 2013; 23:1283–1294.
- 70. Long P., Samnakay P., Jenner P., Rose S. A yeast two-hybrid screen reveals that osteopontin associates with MAP1A and MAP1B in addition to other proteins linked to microtubule stability, apoptosis and protein degradation in the human brain. Eur. J. Neurosci. 2012; 36:2733–2742.
- 71. Li Z.X., Zhu Q.N., Zhang H.B., Hu Y., Wang G., Zhu Y.S. MALAT1: a potential biomarker in cancer. Cancer Manag Res. 2018; 10:6757–6768.