Abstract
Resolving the spatial distribution of the transcriptome at a subcellular level can increase our understanding of biology and diseases. To facilitate studies of biological functions and molecular mechanisms in the transcriptome, we updated RNALocate, a resource for RNA subcellular localization analysis that is freely accessible at http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/. Compared to RNALocate v1.0, the new features in version 2.0 include (i) expansion of the data sources and the coverage of species; (ii) incorporation and integration of RNA-seq datasets containing information about subcellular localization; (iii) addition and reorganization of RNA information (RNA subcellular localization conditions and descriptive figures for method, RNA homology information, RNA interaction and ncRNA disease information) and (iv) three additional prediction tools: DM3Loc, iLoc-lncRNA and iLoc-mRNA. Overall, RNALocate v2.0 provides a comprehensive RNA subcellular localization resource for researchers to deconvolute the highly complex architecture of the cell.
INTRODUCTION
The subcellular localization of RNA plays an important role in cell growth and development, cell differentiation and inflammation, cell signal transduction and transcriptional regulation (1,2). At the cellular level, where an RNA is located likely determines whether it will be stored, processed, translated or degraded (3–5). Although the importance of RNA subcellular localization has been widely recognized, the related bioinformatics resources are limited compared to those available for protein localization. For example, a subcellular map of the human proteome, the Human Protein Atlas (HPA), records detailed information about protein subcellular localization (6). Many other resources also provide information about the subcellular localization of proteins, such as UniProt, PSORTdb, SubCellBarCode, MiCroKiTS 4.0 and SUBA4 (7–12). Corresponding protein subcellular localization technologies and prediction tools include FLIRT, SubCons, BUSCA and DeepMito (13–18).
Several databases already collect RNA subcellular localization information at the transcriptome-wide level. For example, lncSLdb (19) stores the subcellular localization of long noncoding RNAs (lncRNAs) from literature mining, LncATLAS (20) collects the subcellular localization of lncRNAs from RNA-seq data, and EVmiRNA (21) provides information on microRNAs (miRNAs) in extracellular vesicles. Several computational prediction tools, including DM3Loc (22), mRNALoc (23), lncLocator (24) and iLoc-lncRNA (25), were developed based on the first version of RNALocate (26). Of note, many experimental techniques for detecting RNA subcellular localization have been developed in recent years, including APEX-Seq (27), proximity RNA-seq (28), MERFISH (29) and subRNA-seq (30), together with extensive new data. In view of these developments, this is the right time to update our database to RNALocate v2.0 (http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/), a collection of RNA subcellular localization data from the literature, other databases and RNA-seq datasets.
RNALocate v2.0 is a repository of integrated, experimentally validated information on the subcellular localization of RNA, obtained through manual curation of the literature and five other resources, along with analyses of 35 datasets from the Gene Expression Omnibus (GEO) (31) under a common framework (Figure 1). It also supports three RNA subcellular localization prediction tools: DM3Loc, iLoc-lncRNA and iLoc-mRNA (32). In total, RNALocate v2.0 integrates more than 213 000 RNA subcellular localization entries at 171 locations across 104 species. It will serve as a valuable resource for better understanding the subcellular localization of the transcriptome.
MATERIALS AND METHODS
Data collection
RNALocate v2.0 integrates RNA subcellular localization data from the literature, five databases and 35 RNA-seq datasets. Publications in PubMed (mainly from 2016 to 2021) were screened with the following keyword combinations: (localization name) AND (RNA molecule). ‘Localization name’ represents the subcellular localization name, and ‘RNA molecule’ represents RNA symbols or RNA category names. In total, we reviewed over 35 000 published studies, which yielded 38 508 RNA subcellular localization entries. The other 174 752 entries were integrated from five other databases: CSCD, EVmiRNA, exoRBase, PomBase and TAIR (21,33–36). RNA-seq datasets from GEO were screened with the following criteria: species (Homo sapiens or Mus musculus), sequencing type (bulk RNA-seq or small RNA-seq), condition (samples with unknown conditions were excluded), replicates (≥2) and publication date (after 2016). RNALocate v2.0 adds over 200 new samples from 35 RNA-seq datasets with subcellular localization information.
To help elucidate the role of RNA localization at the subcellular level, more annotation information was collected, including RNA subcellular localization conditions, methods and corresponding figures from the literature, RNA homology information from NCBI Gene (37), RNA interactions from RNAInter (38), and RNA-related diseases from MNDR v3.0 (39). The transcript sequences from RefSeq (40) and miRBase (41) were also included. For RNA-seq datasets, the GEO accession, localization, sample condition and PubMed ID are provided. In addition, the RNA expression and Gene Ontology (GO) enrichment results of the top 50 RNAs for each sample were also incorporated (42).
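The keyword combination described above can be sketched as a small query builder. This is only an illustration of the (localization name) AND (RNA molecule) pattern: the function name `build_queries` and the example term lists are hypothetical, and the actual term lists used for curation are not given in the text.

```python
from itertools import product


def build_queries(localizations, rna_terms):
    """Return one PubMed-style query string per (localization, RNA term) pair."""
    return [
        f'("{loc}") AND ("{rna}")'
        for loc, rna in product(localizations, rna_terms)
    ]


# Illustrative inputs only (assumed, not taken from the curation protocol).
localizations = ["nucleus", "exosome"]
rna_terms = ["lncRNA", "MALAT1"]

queries = build_queries(localizations, rna_terms)
for query in queries:
    print(query)
```

Each localization name is paired with each RNA symbol or category name, so the number of queries grows as the product of the two list sizes.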
Data processing
Integrating multisource data requires unifying them into common reference databases to annotate various RNAs. Major types of RNA symbols were used: (i) miRNA symbols from the miRBase database, (ii) messenger RNA (mRNA), small nucleolar RNA (snoRNA), small nuclear RNA (snRNA) and lncRNA symbols from the NCBI Gene database, (iii) ribosomal RNA (rRNA) and piwi interacting RNA (piRNA) symbols from the RNAcentral (43) database and (iv) circular RNA (circRNA) symbols from the circBase (44) and exoRBase databases. Then, we reconstructed a hierarchical structure for all of the localizations according to the cellular component annotation curated in Gene Ontology. Additionally, miRBase accession, NCBI Gene ID, Ensembl Gene stable ID, RNAcentral identifier, circBase ID, exoRBase ID and their external links were also provided, which can help to efficiently retrieve a substantial amount of RNA-associated information from external resources. For the convenience of users, the RNA-associated information also contains RNA names from the literature, aliases, and sequences, among others.
In particular, we screened and processed 203 samples from 35 RNA-seq datasets that were labeled with subcellular locations, covering 26 conditions and 13 cell lines. In total, the datasets contained 15 subcellular locations (Figure 2B, Supplementary Table S1). The raw data were downloaded and converted with the NCBI SRA Toolkit v2.10.5, and adaptor contaminants and low-quality bases were then removed using Trimmomatic v0.39 (45). The clean reads were aligned to the human and mouse reference genomes (GRCh38 and GRCm38 from GENCODE) with gene annotations (Release 34 and M25 from GENCODE), and the gene expression of each sample was estimated using HISAT2 v2.1.0, SAMtools v1.4 and featureCounts v2.0.1 (46,47). RNA expression levels were normalized as transcripts per million (TPM). All the data consisted of two independent biological replicates per sample (except for samples from APEX-Seq, which have at least two replicates). For further analysis, we standardized the genes in each dataset using an approach similar to that of LncATLAS. Genes with TPM >0 in all replicates of at least one sample were retained (genes expressed in some replicates but not in others were excluded), and genes with a greater than twofold difference between replicates were removed.
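The TPM normalization and replicate-based gene filter described above can be sketched as follows. This is a minimal illustration, not the pipeline code: the function names `tpm` and `keep_gene` are hypothetical, and the sketch assumes that the twofold criterion is applied per sample as the ratio of the largest to the smallest replicate TPM, which the text does not spell out.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given gene lengths in base pairs."""
    # Reads per kilobase of gene length.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Scale so that the values of one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def keep_gene(tpm_by_sample, max_fold=2.0):
    """Decide whether one gene passes the replicate filter.

    tpm_by_sample maps a sample name to the list of replicate TPM values
    for this gene. The gene is kept if at least one sample has TPM > 0 in
    all replicates and (assumption) a max/min replicate ratio <= max_fold.
    """
    for replicates in tpm_by_sample.values():
        if all(value > 0 for value in replicates):
            if max(replicates) / min(replicates) <= max_fold:
                return True
    return False


# Toy values, chosen only to exercise the functions.
example_tpm = tpm({"geneA": 10, "geneB": 20}, {"geneA": 1000, "geneB": 2000})
print(example_tpm)
print(keep_gene({"nucleus": [4.0, 6.0]}))   # consistent replicates
print(keep_gene({"nucleus": [0.0, 3.0]}))   # missing in one replicate
```

Checking for zero values before taking the ratio avoids a division by zero for genes that are absent from one replicate.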
RESULTS
New data and annotations
To improve the accuracy of our database, we carefully rechecked all of the data in the first release and deleted 6739 entries that represented protein subcellular localizations or unclear localizations. In addition, we merged 2897 entries in which all of the information was the same except for cell lines or tissue types. In summary, RNALocate v2.0 contains 213 260 experimentally validated RNA subcellular localization entries, including 38 508 manually curated entries from the literature and 174 752 entries from databases. These entries involve 112 304 nonredundant RNAs and 16 newly added RNA types (such as circRNA, lincRNA, mtRNA, scRNA, scaRNA and Y RNA). In addition, 129 new subcellular locations (such as chromatin, insoluble cytoplasm, mitochondrial cloud and plasma membrane) were added.
The distribution of the subcellular localizations among different RNA types is shown in Figure 3A and Supplementary Table S2. The number of species in RNALocate v2.0 increased from 65 to 104 compared with the first version. The species cover seven categories: apicomplexa, euglenozoa, fungi, metazoa, rhodophyta, viridiplantae and viruses. The top three species are Homo sapiens, Mus musculus and Saccharomyces cerevisiae, as shown in Figure 3B. Other model species, such as Drosophila melanogaster, Rattus norvegicus and zebrafish (Danio rerio), have also been documented in RNALocate v2.0. Of note, some RNA subcellular localizations that only occur under certain conditions are also recorded in our database.
Features and utilities of RNALocate v2.0
RNALocate v2.0 provides a user-friendly platform for searching, browsing and profiling RNA subcellular localization data. To improve its search capability, RNALocate v2.0 provides separate search functions for data from the literature and from RNA-seq datasets. The literature search page supports optimized queries through new fuzzy and batch search functions. ‘Fuzzy Search’ helps users find entries using nonstandardized RNA names and subcellular localizations. ‘Batch Search’ supports queries by a list of official symbols/IDs or a file upload to obtain the associated entries.
Apart from basic annotations, such as RNA information, localization information, other subcellular localizations and ncRNA disease information, we modified the corresponding homology and interaction data in detail. An ‘RNA-RNA interaction’ is presented only when both RNAs have subcellular localization information. Similarly, ‘homology information’ shows homologous RNA-associated entries instead of homologous genes. All of the information links to the corresponding databases. In addition, we added the methods used to determine RNA subcellular localization in the literature, together with the corresponding conditions and figures.
To illustrate the different subcellular localizations of each RNA from RNA-seq datasets, the detail page of ‘Search From RNA-seq Dataset’ shows the basic information and the subcellular localization in each RNA-seq dataset. The basic information includes the gene symbol, Ensembl gene stable ID, genome location and gene type. The subcellular localization information includes: (i) subcellular localization (chromatin, cytoplasm and nucleoplasm) in different conditions (only in Mus musculus); (ii) different subcellular localizations in individual cell types; (iii) subcellular localizations revealed by APEX-Seq; (iv) a single subcellular localization in different conditions and (v) subcellular localizations of cytoplasm and nucleus in different cell lines (Supplementary Figure S1). Users can switch between the literature search result page and the RNA-seq dataset detail page. ‘Browse By RNA-seq dataset’ shows the sample information, gene expression and gene GO enrichment analysis for each dataset on the ‘Browse’ page. GEO accessions, locations, sampling conditions and other information are included in the detail page. A histogram of the top 20 expressed RNAs and the functional annotation results of the top 50 RNAs are also provided for each sample (Figure 2A). In addition, all of the RNA expression data in each dataset can be downloaded.
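The behaviour of a fuzzy search and a batch search can be sketched as below. The text does not describe how the web server implements either function, so this is purely an illustration under assumptions: it uses the standard-library `difflib` similarity matcher as a stand-in, and the function names, the cutoff value and the toy index are all hypothetical.

```python
import difflib


def fuzzy_search(query, names, n=3, cutoff=0.6):
    """Return up to n stored names that approximately match the query."""
    # Compare case-insensitively, then map back to the stored spelling.
    lowered = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=n, cutoff=cutoff)
    return [lowered[hit] for hit in hits]


def batch_search(symbols, index):
    """Look up each symbol in the index; unknown symbols map to an empty list."""
    return {symbol: index.get(symbol, []) for symbol in symbols}


# Toy index (assumed values, for illustration only).
toy_index = {"MALAT1": ["nucleus"], "NEAT1": ["nucleus"], "XIST": ["nucleus"]}

print(fuzzy_search("malat-1", toy_index.keys()))
print(batch_search(["MALAT1", "FOO"], toy_index))
```

A nonstandardized spelling such as "malat-1" still resolves to the stored symbol, while the batch lookup returns one result list per submitted symbol.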
In response to the diverse needs of users, RNALocate v2.0 incorporates three prediction tools: DM3Loc, iLoc-lncRNA and iLoc-mRNA (all prediction tools were trained on RNALocate v1.0). They are used to predict the subcellular localizations of lncRNAs (iLoc-lncRNA) or mRNAs (DM3Loc and iLoc-mRNA).
CONCLUSION AND FUTURE PERSPECTIVES
Here, we present a resource of RNA subcellular localization information, RNALocate v2.0, generated by information obtained from the literature, databases and RNA-seq datasets. It contains more than 213 000 RNA subcellular localization entries, guiding and helping researchers perform further studies. RNALocate v2.0 integrates RNA-seq data with subcellular localization to quantify the expression of RNAs at the subcellular level. In addition, RNALocate v2.0 also incorporates three prediction tools for the various needs of users.
The biological functions of RNAs are usually influenced by their localizations. The fact that RNAs are located at multiple subcellular localizations also increases the complexity of the cell. Analyses of the protein-protein interaction network at the subcellular level have been shown to provide insights distinct from those at the cellular level, and corresponding methods have also emerged, such as CellWhere and ComPPI (48–50). Because of this, we expect that continuing to expand and improve RNALocate v2.0 can also help explore the RNA-RNA interaction network at the subcellular level in the future. Thus, RNALocate is the most comprehensive map of the subcellular localization of the transcriptome and it can satisfy different requirements.
Contributor Information
Tianyu Cui, Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.
Yiying Dou, Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.
Puwen Tan, Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.
Zhen Ni, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China.
Tianyuan Liu, Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.
DuoLin Wang, Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA.
Yan Huang, Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde Foshan), Foshan 528308, China.
Kaican Cai, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China.
Xiaoyang Zhao, State Key Laboratory of Organ Failure Research, Department of Developmental Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.
Dong Xu, Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA.
Hao Lin, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China.
Dong Wang, Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China; Dermatology Hospital, Southern Medical University, Guangzhou 510091, China.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
National Key Research and Development Project of China [2017YFA0105001, 2019YFA0801800]; National Natural Science Foundation of China [82070109, 81770104, 62002153]; Guangdong Basic and Applied Basic Research Foundation [2019A1515010784, 2019A1515110701]; China Postdoctoral Science Foundation [2020M682623, 2020M682785]; Paul K. and Diane Shumaker Endowment Fund at University of Missouri. Funding for open access charge: National Key Research and Development Project of China.
Conflict of interest statement. None declared.
REFERENCES
- 1. Buxbaum A.R., Haimovich G., Singer R.H. In the right place at the right time: visualizing and understanding mRNA localization. Nat. Rev. Mol. Cell Biol. 2015; 16:95–109.
- 2. Mofatteh M., Bullock S.L. SnapShot: subcellular mRNA localization. Cell. 2017; 169:178–178.
- 3. Berkovits B.D., Mayr C. Alternative 3′ UTRs act as scaffolds to regulate membrane protein localization. Nature. 2015; 522:363–367.
- 4. Lecuyer E., Yoshida H., Parthasarathy N., Alm C., Babak T., Cerovina T., Hughes T.R., Tomancak P., Krause H.M. Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell. 2007; 131:174–187.
- 5. Carlevaro-Fita J., Rahim A., Guigo R., Vardy L.A., Johnson R. Cytoplasmic long noncoding RNAs are frequently bound to and degraded at ribosomes in human cells. RNA. 2016; 22:867–882.
- 6. Thul P.J., Akesson L., Wiking M., Mahdessian D., Geladaki A., Ait Blal H., Alm T., Asplund A., Bjork L., Breckels L.M. et al. A subcellular map of the human proteome. Science. 2017; 356:eaal3321.
- 7. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019; 47:D506–D515.
- 8. Lau W.Y.V., Hoad G.R., Jin V., Winsor G.L., Madyan A., Gray K.L., Laird M.R., Lo R., Brinkman F.S.L. PSORTdb 4.0: expanded and redesigned bacterial and archaeal protein subcellular localization database incorporating new secondary localizations. Nucleic Acids Res. 2021; 49:D803–D808.
- 9. Orre L.M., Vesterlund M., Pan Y., Arslan T., Zhu Y., Fernandez Woodbridge A., Frings O., Fredlund E., Lehtio J. SubCellBarCode: proteome-wide mapping of protein localization and relocalization. Mol. Cell. 2019; 73:166–182.
- 10. Huang Z., Ma L., Wang Y., Pan Z., Ren J., Liu Z., Xue Y. MiCroKiTS 4.0: a database of midbody, centrosome, kinetochore, telomere and spindle. Nucleic Acids Res. 2015; 43:D328–D334.
- 11. Hooper C.M., Castleden I.R., Tanz S.K., Aryamanesh N., Millar A.H. SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations. Nucleic Acids Res. 2017; 45:D1064–D1074.
- 12. Huang Y., Wang J., Zhao Y., Wang H., Liu T., Li Y., Cui T., Li W., Feng Y., Luo J. et al. cncRNAdb: a manually curated resource of experimentally supported RNAs with both protein-coding and noncoding function. Nucleic Acids Res. 2021; 49:D65–D70.
- 13. Hirsch S.M., Sundaramoorthy S., Davies T., Zhuravlev Y., Waters J.C., Shirasu-Hiza M., Dumont J., Canman J.C. FLIRT: fast local infrared thermogenetics for subcellular control of protein function. Nat. Methods. 2018; 15:921–923.
- 14. Salvatore M., Warholm P., Shu N., Basile W., Elofsson A. SubCons: a new ensemble method for improved human subcellular localization predictions. Bioinformatics. 2017; 33:2464–2470.
- 15. Savojardo C., Martelli P.L., Fariselli P., Profiti G., Casadio R. BUSCA: an integrative web server to predict subcellular localization of proteins. Nucleic Acids Res. 2018; 46:W459–W466.
- 16. Savojardo C., Bruciaferri N., Tartari G., Martelli P.L., Casadio R. DeepMito: accurate prediction of protein sub-mitochondrial localization using convolutional neural networks. Bioinformatics. 2020; 36:56–64.
- 17. Marx V. Mapping proteins with spatial proteomics. Nat. Methods. 2015; 12:815–819.
- 18. Stadler C., Rexhepaj E., Singan V.R., Murphy R.F., Pepperkok R., Uhlen M., Simpson J.C., Lundberg E. Immunofluorescence and fluorescent-protein tagging show high correlation for protein localization in mammalian cells. Nat. Methods. 2013; 10:315–323.
- 19. Wen X., Gao L., Guo X., Li X., Huang X., Wang Y., Xu H., He R., Jia C., Liang F. lncSLdb: a resource for long non-coding RNA subcellular localization. Database. 2018; 2018:bay085.
- 20. Mas-Ponte D., Carlevaro-Fita J., Palumbo E., Hermoso Pulido T., Guigo R., Johnson R. LncATLAS database for subcellular localization of long noncoding RNAs. RNA. 2017; 23:1080–1087.
- 21. Liu T., Zhang Q., Zhang J., Li C., Miao Y.R., Lei Q., Li Q., Guo A.Y. EVmiRNA: a database of miRNA profiling in extracellular vesicles. Nucleic Acids Res. 2019; 47:D89–D93.
- 22. Wang D., Zhang Z., Jiang Y., Mao Z., Wang D., Lin H., Xu D. DM3Loc: multi-label mRNA subcellular localization prediction and analysis based on multi-head self-attention mechanism. Nucleic Acids Res. 2021; 49:e46.
- 23. Garg A., Singhal N., Kumar R., Kumar M. mRNALoc: a novel machine-learning based in-silico tool to predict mRNA subcellular localization. Nucleic Acids Res. 2020; 48:W239–W243.
- 24. Cao Z., Pan X., Yang Y., Huang Y., Shen H.B. The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier. Bioinformatics. 2018; 34:2185–2194.
- 25. Su Z.D., Huang Y., Zhang Z.Y., Zhao Y.W., Wang D., Chen W., Chou K.C., Lin H. iLoc-lncRNA: predict the subcellular location of lncRNAs by incorporating octamer composition into general PseKNC. Bioinformatics. 2018; 34:4196–4204.
- 26. Zhang T., Tan P., Wang L., Jin N., Li Y., Zhang L., Yang H., Hu Z., Zhang L., Hu C. et al. RNALocate: a resource for RNA subcellular localizations. Nucleic Acids Res. 2017; 45:D135–D138.
- 27. Fazal F.M., Han S., Parker K.R., Kaewsapsak P., Xu J., Boettiger A.N., Chang H.Y., Ting A.Y. Atlas of subcellular RNA localization revealed by APEX-Seq. Cell. 2019; 178:473–490.
- 28. Morf J., Wingett S.W., Farabella I., Cairns J., Furlan-Magaril M., Jimenez-Garcia L.F., Liu X., Craig F.F., Walker S., Segonds-Pichon A. et al. RNA proximity sequencing reveals the spatial organization of the transcriptome in the nucleus. Nat. Biotechnol. 2019; 37:793–802.
- 29. Xia C., Fan J., Emanuel G., Hao J., Zhuang X. Spatial transcriptome profiling by MERFISH reveals subcellular RNA compartmentalization and cell cycle-dependent gene expression. PNAS. 2019; 116:19490–19499.
- 30. Mayer A., Churchman L.S. A detailed protocol for subcellular RNA sequencing (subRNA-seq). Curr. Protoc. Mol. Biol. 2017; 120:4.29.1–4.29.18.
- 31. Barrett T., Wilhite S.E., Ledoux P., Evangelista C., Kim I.F., Tomashevsky M., Marshall K.A., Phillippy K.H., Sherman P.M., Holko M. et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013; 41:D991–D995.
- 32. Zhang Z.Y., Yang Y.H., Ding H., Wang D., Chen W., Lin H. Design powerful predictor for mRNA subcellular location prediction in Homo sapiens. Brief. Bioinform. 2021; 22:526–535.
- 33. Li S., Li Y., Chen B., Zhao J., Yu S., Tang Y., Zheng Q., Li Y., Wang P., He X. et al. exoRBase: a database of circRNA, lncRNA and mRNA in human blood exosomes. Nucleic Acids Res. 2018; 46:D106–D112.
- 34. Xia S., Feng J., Chen K., Ma Y., Gong J., Cai F., Jin Y., Gao Y., Xia L., Chang H. et al. CSCD: a database for cancer-specific circular RNAs. Nucleic Acids Res. 2018; 46:D925–D929.
- 35. Lock A., Rutherford K., Harris M.A., Hayles J., Oliver S.G., Bahler J., Wood V. PomBase 2018: user-driven reimplementation of the fission yeast database provides rapid and intuitive access to diverse, interconnected information. Nucleic Acids Res. 2019; 47:D821–D827.
- 36. Berardini T.Z., Reiser L., Li D., Mezheritsky Y., Muller R., Strait E., Huala E. The Arabidopsis information resource: making and mining the “gold standard” annotated reference plant genome. Genesis. 2015; 53:474–485.
- 37. Brown G.R., Hem V., Katz K.S., Ovetsky M., Wallin C., Ermolaeva O., Tolstoy I., Tatusova T., Pruitt K.D., Maglott D.R. et al. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res. 2015; 43:D36–D42.
- 38. Lin Y., Liu T., Cui T., Wang Z., Zhang Y., Tan P., Huang Y., Yu J., Wang D. RNAInter in 2020: RNA interactome repository with increased coverage and annotation. Nucleic Acids Res. 2020; 48:D189–D197.
- 39. Ning L., Cui T., Zheng B., Wang N., Luo J., Yang B., Du M., Cheng J., Dou Y., Wang D. MNDR v3.0: mammal ncRNA-disease repository with increased coverage and annotation. Nucleic Acids Res. 2021; 49:D160–D164.
- 40. Haft D.H., DiCuccio M., Badretdin A., Brover V., Chetvernin V., O’Neill K., Li W., Chitsaz F., Derbyshire M.K., Gonzales N.R. et al. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res. 2018; 46:D851–D860.
- 41. Kozomara A., Birgaoanu M., Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Res. 2019; 47:D155–D162.
- 42. The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 2019; 47:D330–D338.
- 43. The RNAcentral Consortium. RNAcentral: a hub of information for non-coding RNA sequences. Nucleic Acids Res. 2019; 47:D221–D229.
- 44. Glazar P., Papavasileiou P., Rajewsky N. circBase: a database for circular RNAs. RNA. 2014; 20:1666–1670.
- 45. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014; 30:2114–2120.
- 46. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009; 25:2078–2079.
- 47. Liao Y., Smyth G.K., Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014; 30:923–930.
- 48. Ma W., Mayr C. A membraneless organelle associated with the endoplasmic reticulum enables 3′UTR-mediated protein-protein interactions. Cell. 2018; 175:1492–1506.
- 49. Zhu L., Malatras A., Thorley M., Aghoghogbe I., Mer A., Duguez S., Butler-Browne G., Voit T., Duddy W. CellWhere: graphical display of interaction networks organized on subcellular localizations. Nucleic Acids Res. 2015; 43:W571–W575.
- 50. Veres D.V., Gyurko D.M., Thaler B., Szalay K.Z., Fazekas D., Korcsmaros T., Csermely P. ComPPI: a cellular compartment-specific database for protein-protein interaction network analysis. Nucleic Acids Res. 2015; 43:D485–D493.