Table 1. Overview of data curated in POSTAR.
| Category | Data type | Human | Mouse | Resource/calculation method<sup>a</sup> |
|---|---|---|---|---|
| RBP binding sites | RBP binding sites from experiments | 1 752 329 | 1 003 984 | All CLIP-seq peaks called by Piranha (human: 65 RBPs; mouse: 30 RBPs)<sup>b</sup> |
| | | 39 201 | 78 922 | HITS-CLIP peaks called by CIMS (human: 17 RBPs; mouse: 23 RBPs)<sup>b</sup> |
| | | 7 731 846 | 96 346 | PAR-CLIP peaks called by PARalyzer (human: 44 RBPs; mouse: 4 RBPs)<sup>b</sup> |
| | | 4 598 307 | 1 013 008 | iCLIP peaks called by CITS (human: 9 RBPs; mouse: 8 RBPs)<sup>b</sup> |
| | | 6 703 559 | NA | eCLIP-seq peaks called by ENCODE (human: 56 RBPs)<sup>c</sup> |
| | | 439 817 | NA | PIP-seq peaks called by PMID24393486 (human: global RBPs) |
| | RBP binding sites from predictions | 25 623 567 | 18 540 386 | Peaks predicted by FIMO (human: 88 RBPs; mouse: 88 RBPs)<sup>d</sup> |
| | | 19 447 967 | 24 621 203 | Peaks predicted by TESS (human: 88 RBPs; mouse: 88 RBPs)<sup>d</sup> |
| | | 16 586 127 | 11 905 150 | Peaks predicted by DeepBind (human: 82 RBPs; mouse: 82 RBPs)<sup>e</sup> |
| Data module I: Gene/RBP annotations | RBPs | 132 | 104 | Ensembl, PMID25365966 |
| | Sequence motifs | 726 | 180 | MEME, HOMER |
| | Structural preferences | 720 | 179 | RNApromo, RNAcontext |
| | Gene Ontologies | 15 677 | 13 849 | GOBP, GOMF, GOCC<sup>f</sup> |
| | Biological pathways | 186 | 105 | KEGG |
| | Gene expression | 34 cell/tissue types | 18 cell/tissue types | TopHat, Cufflinks<sup>g</sup> |
| | Alternative splicing (skipped exon) | 34 cell/tissue types | 18 cell/tissue types | TopHat, MISO<sup>g</sup> |
| Data module II: Molecular annotations | miRNA binding sites from experiments | 3 906 955 | 1 588 861 | AGO CLIP-seq peaks called by Piranha; targeting miRNAs identified by miRanda<sup>h</sup> |
| | miRNA binding sites from predictions | 70 516 087 | 38 336 372 | RNAhybrid, TargetScan, miRanda |
| | RNA modification sites | 177 049 | 91 930 | RMBase, PMID26863196 |
| | RNA editing sites | 2 583 302 | 8846 | RADAR, DARNED |
| | Splicing elements | 1 995 574 | 1 152 186 | Annotations in GENCODE human v19, mouse vM7 |
| | Conserved structural regions | 725 | 691 | EvoFam |
| Data module III: Genomic variants | SNPs | 149 398 310 | 77 785 586 | dbSNP v146 |
| | Tissue-specific eQTL | 19 530 607 | NA | GTEx |
| | GWAS SNPs | 278 473 | NA | GWASdb2, RNAfold<sup>i</sup> |
| | Clinically important SNPs | 131 919 | NA | ClinVar, RNAfold<sup>i</sup> |
| | Cancer TCGA whole-exome SNVs | 828 119 | NA | PMID24390350, RNAfold<sup>i</sup> |
| | Cancer TCGA whole-genome SNVs | 4 745 891 | NA | PMID23945592, RNAfold<sup>i</sup> |
| | Cancer COSMIC SNVs | 2 371 219 | NA | COSMIC v76, RNAfold<sup>i</sup> |
| Data module IV: Gene-Function associations | Tissue-specific genes | 21 549 | NA | TiGER, SpeCond |
| | Gene-Disease associations | 419 906 | NA | OMIM, DisGeNET |
| | Gene-Cancer associations | 4485 | NA | Manually curated from 60 publications<sup>j</sup> |
| | Gene-Drug associations | 35 201 | NA | DGIdb 2.0 |
| Data module V: RNA secondary structures | Predicted local structures | 82 242 543 | 57 095 233 | RNAfold with restraints from experimental structural probing data (human: DMS-seq, PARS; mouse: icSHAPE, Frag-seq, CIRS-seq)<sup>k</sup> |
<sup>a</sup> Results and data first generated by POSTAR are in bold font.
<sup>b</sup> We provide all CLIP-seq peaks called by Piranha with P < 0.01. For CIMS, CITS and PARalyzer, we provide peaks called with the default significance cutoffs.
<sup>c</sup> See Supplementary File 2 for the full list of eCLIP-seq data. The peaks were called by ENCODE.
<sup>d</sup> See Supplementary File 5 for the RBPs and motifs used for prediction.
<sup>e</sup> See Supplementary File 6 for the RBPs in the DeepBind model.
<sup>f</sup> BP, Biological Process; MF, Molecular Function; CC, Cellular Component.
<sup>g</sup> See Supplementary File 4 for the full list of 230 RNA-seq data sets in human and mouse.
<sup>h</sup> We used all AGO CLIP-seq peaks called by Piranha (P < 0.01). The targeting miRNAs of the peaks were identified using miRanda with default parameters.
<sup>i</sup> We used RNAfold to calculate the changes in minimum free energy of local RNA secondary structures that are induced by the mutations.
<sup>j</sup> See Supplementary File 3 for the full list of manually curated cancer genes.
<sup>k</sup> See Supplementary File 7 for the experimental structural probing datasets. We predicted one local structure centered on each RBP binding site (window size: 150 nt).
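
The 150-nt window centered on each RBP binding site, as described in the last footnote, can be illustrated with a minimal Python sketch. It is not the POSTAR pipeline: the function name `centered_window`, the 0-based half-open (BED-style) coordinates, the use of the site midpoint as the center, and the truncation at chromosome ends are all assumptions made for this example. The source states only the window size and that the window is centered on the binding site.

```python
def centered_window(site_start, site_end, chrom_length, size=150):
    """Return a (start, end) window of `size` nt centered on a binding site.

    Assumptions (not stated in the source): coordinates are 0-based and
    half-open, the center is the integer midpoint of the site, and windows
    that run past a chromosome end are truncated rather than shifted.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not (0 <= site_start < site_end <= chrom_length):
        raise ValueError("site must lie within the chromosome")

    center = (site_start + site_end) // 2
    raw_start = center - size // 2

    # Truncate the window so it stays inside [0, chrom_length).
    win_start = max(0, raw_start)
    win_end = min(chrom_length, raw_start + size)
    return win_start, win_end


if __name__ == "__main__":
    # A hypothetical 20-nt binding site on a 10 000-nt sequence.
    print(centered_window(1000, 1020, 10_000))
```

The extracted window sequence would then be passed to a folding tool such as RNAfold; how the experimental probing restraints are supplied is outside the scope of this sketch.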