Nucleic Acids Research. 2015 Dec 3;44(Database issue):D154–D163. doi: 10.1093/nar/gkv1308

RBP-Var: a database of functional variants involved in regulation mediated by RNA-binding proteins

Fengbiao Mao 1,2, Luoyuan Xiao 3, Xianfeng Li 4, Jialong Liang 1,2, Huajing Teng 1,5, Wanshi Cai 1,*, Zhong Sheng Sun 1,6,*
PMCID: PMC4702914  PMID: 26635394

Abstract

Transcription factors bind to the genome by forming specific contacts with the primary DNA sequence; however, RNA-binding proteins (RBPs) have greater scope to achieve binding specificity through the RNA secondary structure. It has been revealed that single nucleotide variants (SNVs) that alter RNA structure, also known as RiboSNitches, exhibit 3-fold greater local structure changes than replicates of the same DNA sequence, demonstrated by the fact that depletion of RiboSNitches could result in the alteration of specific RNA shapes at thousands of sites, including 3′ UTRs, binding sites of microRNAs and RBPs. However, the network between SNVs and post-transcriptional regulation remains unclear. Here, we developed RBP-Var, a database freely available at http://www.rbp-var.biols.ac.cn/, which provides annotation of functional variants involved in post-transcriptional interaction and regulation. RBP-Var provides an easy-to-use web interface that allows users to rapidly find whether SNVs of interest can transform the secondary structure of RNA and identify RBPs whose binding may be subsequently disrupted. RBP-Var integrates DNA and RNA biology to understand how various genetic variants and post-transcriptional mechanisms cooperate to orchestrate gene expression. In summary, RBP-Var is useful in selecting candidate SNVs for further functional studies and exploring causal SNVs underlying human diseases.

INTRODUCTION

A single nucleotide variant (SNV) can result in complex diseases through various molecular mechanisms, including amino-acid changes at the protein level, alteration of regulatory elements at the transcription level and disruption of RNA–protein interaction at the post-transcription level (1–4). Previous studies mainly concentrated on the effect of SNVs on protein structure (5,6) and on the misregulation of transcription (7,8); such variants are known as cSNVs and rSNVs, respectively. However, some SNVs that are located in non-coding regions and have no relationship with transcriptional regulation (8) can still have an observable phenotype or result in a disease. Such SNVs may therefore play crucial roles in post-transcriptional gene regulation by altering micro (mi)RNA targets, RNA secondary structure and RNA–protein interactions (1,4,9–12).

Intensive attempts have been undertaken to better understand RNA-binding proteins (RBPs) and RNA–protein interactions (13). The advent of next-generation sequencing and relevant molecular techniques has allowed the development of systematic experimental protocols for identifying protein-binding sites on RNA using the immunoprecipitation of RBPs, with or without in vivo RNA–protein crosslinking, known as crosslinking immunoprecipitation sequencing (CLIP-seq) and RNA immunoprecipitation sequencing (RIP-seq), respectively (14,15). CLIP-seq, which encompasses high-throughput sequencing (HITS)-CLIP (16), photoactivatable ribonucleoside-enhanced (PAR)-CLIP (17) and individual nucleotide resolution crosslinking immunoprecipitation (iCLIP) (18), is a powerful tool for investigating the in vivo binding sites of RBPs on a transcriptome-wide scale, and it identifies protein-binding sites on RNAs at higher resolution than previous technologies such as RIP-seq.

Identifying changes in RBP binding induced by SNVs for a messenger RNA (mRNA) in vivo remains challenging, because RBPs have much more scope to achieve specificity through the secondary structure of RNA, and SNVs can change the intramolecular bonds that twist RNA molecules into hairpins, stem-loops and various other bumps and bulges (13). The impact of SNVs on RNA structure and gene regulation has been widely studied (1–4). Wan et al. (19) studied the initial landscape and variation of RNA secondary structures in a human family trio (mother, father, offspring) and revealed that ≈15% of all transcribed SNVs are RiboSNitches that alter the local RNA structure. These RiboSNitches were also significantly depleted around predicted miRNA target sites and protein-binding sites on RNA. Moreover, Maticzka et al. (20) demonstrated the importance of RNA secondary structure for conferring binding specificity in a subset of RBPs, using both RNA structure and sequence features to develop a machine learning-based approach for predicting protein-binding sites on RNA from CLIP data. Fukunaga et al. (21) developed the CapR software, which determines the probability of secondary structure throughout an RNA molecule for RBP binding, and found that the human RBP Pumilio-2 has a preference for hairpin loop structures. Overall, RNA secondary structures exert a significant effect on RBP binding, and the RNA interactome network can be disrupted once the RNA structures are transformed.

Recently, great progress has been made in identifying mutations in RBPs associated with disease risk (22) such as mutations in the RBPs FUS and TDP-43 that can cause amyotrophic lateral sclerosis (23). However, it remains unclear what changes occur to RNA substrates in diseases and their effect on RNA structure and RNA–protein interaction induced by SNVs or RNA editing sites (24). Therefore, a comprehensive database is required for the annotation of functional variants involved in post-transcriptional interactions. Here, we developed RBP-Var (Figure 1), an integrated database for the annotation of functional variants involved in RBP-mediated post-transcriptional regulation and interaction. We also determined trait/disease-associations and deciphered gene regulation for SNVs located in potential protein-binding sites on RNA. Furthermore, we developed a heuristic scoring system based on the functional confidence of a variant, including SNV or RNA edit, to assist comparison among annotations.

Figure 1.

Workflow to identify functional rbSNVs by RBP-Var.

DATA COLLECTION AND PROCESSING

Data sources for RBP-Var

RBP-Var provides extensive annotation for the functional variants involved in post-transcriptional interactions. We collected all the available processed data on RBPs from starBase (25) and CLIPdb (26) and extracted sequencing data from Gene Expression Omnibus (GEO) (27). Genomic coordinates of RBP binding sites derived from GEO were converted from human genome assembly hg18 to hg19 using the University of California, Santa Cruz (UCSC) LiftOver tool (http://genome.ucsc.edu). We also retrieved a collection of expression quantitative trait loci (eQTL) from previous studies (28–37), human trait-associated SNVs from the genome-wide association study (GWAS) catalog (38), somatic mutations in cancer from COSMIC (39) and disease-associated mutations from ClinVar (40). In addition, RBP-Var includes a collection of RBP binding SNVs (rbSNVs) that may disrupt the binding of RBPs, derived from 112 CLIP-seq data sets (60 RBPs) and 319 motif matching sites (153 RBPs). These functional RBP binding SNVs include deletions, insertions, single nucleotide polymorphisms (SNPs), multiple nucleotide polymorphisms (MNPs), microsatellites and insertions/deletions from the dbSNP database (version 142), as well as RNA editing sites (A-to-I editing) from RADAR and DARNED (Table 1).

Table 1. Database content of RBP-Var as of 10 October, 2015.

| Data type | Sources | Description | Records | Genomic coverage (bp) |
|---|---|---|---|---|
| CLIP-seq | starBase, CLIPdb, GEO | 112 data sets for 60 RBPs | 52,528,288 | 847,717,483 |
| Predicted binding (PWMs) | CISBP-RNA, RBPDB | 319 PWMs for 153 RBPs | 119,777,778 | 822,156,225 |
| dbSNP | dbSNP v142 | including SNP, InDel and MNP | 113,128,211 | 113,128,211 |
| RNA editing | RADAR, DARNED | A->G editing | 2,576,460 | 2,576,460 |
| miRNA targets | TargetScan, miRanda, miRNASNP | miRNA-SNP pairs | 3,361,915 | 26,958,420 |
| eQTLs | MuTher, SCAN, seeQTL, GTEx, Harvard | SNP-gene pairs | 4,546,890 | 4,546,890 |
| dsQTLs | dsQTL Browser | SNP-gene pairs | 214,522 | 214,522 |
| LD proxies | HapMap, 1000Genome | r² > 0.8 by Plink | 4,600,670 | 4,600,670 |
| 6mA | MeT-DB | 6mA peaks called by MACS | 466,037 | 127,588,574 |
| Splicing sites | H-DBAS | including 7 splicing patterns | 1,669,641 | 158,324,909 |
| trait/disease associations | GWAS, COSMIC, ClinVar, TCGA, DISEASES | SNP-trait or gene-trait pairs | 8,955,125 | 8,955,125 |

Position weight matrix (PWM) motif matching

Many RBPs interact with mRNAs via a limited set of modular RNA-binding domains, including the RNA recognition motif, heterogeneous nuclear ribonucleoprotein K-homology domain and zinc fingers (11,41). The RNA-binding domains of RBPs initially determine the specificity and preferences of RNA binding with specific sequence motifs (42). Therefore, all PWMs from the catalog of inferred sequence binding preferences of RNA binding proteins (CIS-BP-RNA) (42) and the RNA-binding protein database (RBPDB) (43) deposited in the AURA database (44) were used to call motif matches in the transcriptome. The possible k-mers were aligned to the human transcriptome using MAST in the MEME suite (45) to give a final motif mapping with a match score >0 and P-values < 0.0001.
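The motif-matching step can be illustrated with a minimal sketch of sliding a PWM along a transcript and keeping windows with a positive log-odds score. This is not MAST and does not reproduce its P-value calculation; the matrix `TOY_PWM`, the uniform background and the function names are hypothetical and chosen only for illustration.

```python
import math

# Toy 4-position probability matrix (hypothetical values, not taken from CISBP-RNA or RBPDB).
# Each column gives the probability of A, C, G, U at that motif position.
TOY_PWM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "U": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "U": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.05, "U": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "U": 0.05},
]

def log_odds(pwm, background=0.25):
    """Convert column probabilities into log2-odds scores against a uniform background (assumption)."""
    return [{base: math.log2(p / background) for base, p in col.items()} for col in pwm]

def scan(sequence, pwm, min_score=0.0):
    """Return (start, kmer, score) for every window whose log-odds score exceeds min_score."""
    scores = log_odds(pwm)
    width = len(scores)
    seq = sequence.upper().replace("T", "U")
    hits = []
    for start in range(len(seq) - width + 1):
        kmer = seq[start:start + width]
        if any(base not in "ACGU" for base in kmer):
            continue  # skip windows containing ambiguous bases
        score = sum(scores[i][base] for i, base in enumerate(kmer))
        if score > min_score:
            hits.append((start, kmer, round(score, 3)))
    return hits

hits = scan("GGCAUUAGGCC", TOY_PWM)
print(hits)
```

In this toy sequence only the window starting at position 3 (`AUUA`) scores above zero. A production pipeline would additionally attach a P-value to each hit, as MAST does, before applying the P < 0.0001 cut-off described above.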

Impacts of SNVs on RNA structures

Based on human reference transcripts retrieved from UCSC (hg19), we generated an altered transcript for each SNV by replacing the reference allele in the transcript sequence with the alternative allele. The RNAfold program (46) was then run with default parameters to predict the secondary structure and calculate the minimum free energy (MFE, ΔG) of both the wild-type and mutant transcript sequences. Energy changes in RNA structure (ΔΔG) were calculated from the MFE difference, ΔΔG = |ΔG_alt − ΔG_ref|, where ΔG_ref and ΔG_alt are the MFE of the reference transcript and the altered transcript, respectively. Moreover, for SNPs and RNA editing sites, we used RNAsnp (2) with default parameters to estimate the effect of the mutation on local RNA secondary structure in terms of empirical P-values, based on global folding with RNAfold and a correlation measure. For insertions, deletions and other types of SNVs, we evaluated the effects on RNA secondary structure from the changes in MFE, in terms of empirical P-values calculated from cumulative probabilities of the Poisson distribution.
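The bookkeeping around the folding step can be sketched as follows. The sketch does not call RNAfold; the MFE values are made-up inputs standing in for its output. The text does not state how the Poisson rate is chosen or how the energy change is discretized, so rounding ΔΔG to an integer and the background mean `lam` are assumptions made purely for illustration, as are all function names.

```python
import math

def apply_variant(transcript, pos, ref, alt):
    """Return the altered transcript with `ref` at 0-based `pos` replaced by `alt`.

    Covers substitutions (equal lengths) as well as insertions and deletions.
    """
    if transcript[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the transcript")
    return transcript[:pos] + alt + transcript[pos + len(ref):]

def delta_delta_g(mfe_ref, mfe_alt):
    """Absolute MFE change between reference and altered transcript (kcal/mol)."""
    return abs(mfe_alt - mfe_ref)

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), obtained from the cumulative probability P(X < k)."""
    cdf_below_k = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k

# Hypothetical example: an A>G change at position 4 of a toy transcript.
altered = apply_variant("GGGAAACCC", 4, "A", "G")

# Made-up MFE values standing in for RNAfold output on the two sequences.
ddg = delta_delta_g(mfe_ref=-3.2, mfe_alt=-5.4)

# Assumption: discretize ddG by rounding and compare with a background mean of 0.5.
p_value = poisson_upper_tail(round(ddg), lam=0.5)
print(altered, round(ddg, 2), round(p_value, 4))
```

Keeping variant application separate from folding makes it easy to swap in a real folding call later without touching the allele-handling logic.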

Impact of SNPs on miRNA–RNA interactions

Single nucleotide polymorphisms (SNPs) in the target sites of miRNAs may destroy or create miRNA-binding sites on RNAs, leading to loss-of-function and/or gain-of-function miRNA–RNA interactions. To assess the impact of each SNP on miRNA–RNA interaction, TargetScan (47) and miRanda (48) were used to predict target sites in the ±25 bp sequence around each SNP, separately for Ref-transcripts and Alt-transcripts. In addition, we compared the miRNA target score of the reference sequence with the target score induced by the SNV, using empirical P-values calculated from cumulative probabilities of the Poisson distribution. We analyzed miRNA target alteration by SNPs in the 3′-UTR of mRNAs and by SNPs in miRNA seed regions. We found that 888 674 of a total of 1 654 644 miRNA–RNA interaction pairs were altered significantly (P-value < 0.01) by 253 428 SNPs in the 3′-UTR of mRNAs. Meanwhile, 302 723 of a total of 556 008 miRNA–RNA interaction pairs were altered significantly (P-value < 0.01) by 201 SNPs in miRNA seed regions.
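The idea of comparing Ref- and Alt-transcripts around a SNP can be sketched with a deliberately simplified stand-in for TargetScan and miRanda: a perfect match to the miRNA seed (nucleotides 2–8) within the ±25 nt window. Real predictors score many more features; the function names and the toy sequences below are hypothetical.

```python
def flank_window(transcript, pos, flank=25):
    """Return the sub-sequence spanning `flank` nt on either side of 0-based `pos`."""
    start = max(0, pos - flank)
    return transcript[start:pos + flank + 1]

def reverse_complement(rna):
    """Reverse complement of an RNA string (A<->U, C<->G)."""
    pairs = {"A": "U", "U": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(rna))

def has_seed_site(window, mirna):
    """True if the window contains a perfect match to the miRNA seed (nt 2-8)."""
    seed = mirna[1:8]
    return reverse_complement(seed) in window

def site_change(ref_window, alt_window, mirna):
    """Classify the effect of a variant on one miRNA seed site."""
    before = has_seed_site(ref_window, mirna)
    after = has_seed_site(alt_window, mirna)
    if before and not after:
        return "loss"
    if after and not before:
        return "gain"
    return "unchanged"

# Toy example: a C>U change inside a seed-complementary site.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
ref_tx = "AAAAACUACCUCAAAAA"
alt_tx = "AAAAACUAUCUCAAAAA"
snp_pos = 8

effect = site_change(flank_window(ref_tx, snp_pos), flank_window(alt_tx, snp_pos), toy_mirna)
print(effect)
```

Running the same comparison with the two windows swapped reports a gained site, which mirrors the loss-of-function and gain-of-function cases distinguished in the text.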

Analyzing linkage disequilibrium (LD) proxies

Because nearby SNVs are genetically correlated, functional analysis of a single SNV in isolation is unreliable. To extend the post-regulatory features of a single SNV, LD correlations between SNVs were analyzed, and the set of SNVs in strong LD (r² > 0.8) with rbSNVs was defined as LD-proxies of rbSNVs. LD data were derived from both merged HapMap phases I+II+III genotype data (49) and integrated 1000-genome phase III release data (50). Data from all populations deposited in these two projects were used to perform LD analyses in an LD-window of 500 kb using Plink (51).
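As an illustration of the LD-proxy definition, the sketch below computes r² between two biallelic SNVs from phased 0/1 haplotype vectors and keeps SNVs above the threshold within a distance window. It is not Plink's implementation (Plink works from genotype data), and interpreting the 500 kb LD-window as a maximum distance between the two SNVs is an assumption; the identifiers and data are invented.

```python
def ld_r2(hap_a, hap_b):
    """Squared correlation (r^2) between two biallelic SNVs from phased 0/1 haplotypes."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                  # allele-1 frequency at SNV A
    p_b = sum(hap_b) / n                                  # allele-1 frequency at SNV B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic site: r^2 is undefined, reported as 0 here
    d = p_ab - p_a * p_b
    return d * d / denom

def ld_proxies(target, snvs, r2_min=0.8, window=500_000):
    """Return IDs of SNVs within `window` bp of `target` whose r^2 with it exceeds r2_min."""
    t_pos, t_hap = snvs[target]
    return [
        snv_id
        for snv_id, (pos, hap) in snvs.items()
        if snv_id != target and abs(pos - t_pos) <= window and ld_r2(t_hap, hap) > r2_min
    ]

# Invented example data: position and phased haplotypes for four SNVs.
snvs = {
    "rbSNV": (1_000, [0, 0, 1, 1, 0, 1, 0, 1]),
    "snvA": (1_500, [0, 0, 1, 1, 0, 1, 0, 1]),     # identical pattern, r^2 = 1
    "snvB": (2_000, [0, 1, 1, 0, 0, 1, 0, 1]),     # weakly correlated, r^2 = 0.25
    "snvC": (900_000, [0, 0, 1, 1, 0, 1, 0, 1]),   # perfect LD but outside the window
}

proxies = ld_proxies("rbSNV", snvs)
print(proxies)
```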

Trait/disease-associated label

SNVs occurring at the binding sites of RBPs could contribute to cis-modulation of gene expression by changing the affinity of a regulatory protein for the RNA sequence (52). We used trait/disease-associated SNVs (TASs) obtained from GWAS (38), COSMIC (39) and ClinVar (40) to annotate SNVs in potential protein-binding sites identified from 112 CLIP-seq data sets. Expression traits usually differ from other classical complex traits, because the measured mRNA or protein trait is the product of a single gene at a specific chromosomal location. eQTLs are genomic loci that regulate the expression level of mRNAs, and they play a crucial role in deciphering gene regulation and spatio-temporal specificity. RBP-Var provides eQTL labels and spatio-temporal labels for the included SNVs. We collected eQTLs from experimentally supported eQTL databases (35–37) and the eQTL browser (http://eqtl.uchicago.edu/cgi-bin/gbrowse/eqtl/) (28–34) to provide association labels for SNVs. Tissue and developmental stage information was labeled according to the cell type in which the protein-binding sites on RNA were identified.

Classifying functional variants

To better interpret and organize the catalog of rbSNV information, we developed a heuristic scoring system based on the functional confidence of variants. The scoring system expresses graded confidence that a variant lies in a functional location and probably results in a functional consequence (i.e. alteration of RBP binding and a gene regulatory effect) (Table 2). For example, we consider variants that are known eQTLs as significant and label them as Category 1. Within Category 1, subcategories indicate additional annotations ranging from the most informative variants (1a, variant may change the motif for RBP binding) to the least informative variants (1e, variant only has a motif for RBP binding).

Table 2. The variant classification scheme of RBP-Var.

| Category | Description | # of SNV | # of RNA editing |
|---|---|---|---|
| | Likely to affect RBP binding, RNA secondary structure and expression | | |
| 1a | eQTL+RBP binding+matched RBP motif+riboSNitch+miRNA targets | 1 | 0 |
| 1b | eQTL+RBP binding+any RBP motif+riboSNitch+miRNA targets | 1,059 | 0 |
| 1c | eQTL+RBP binding+any RBP motif+riboSNitch | 2,417 | 1 |
| 1d | eQTL+RBP binding+any RBP motif | 334 | 0 |
| 1e | eQTL+RBP binding | 916 | 0 |
| | Likely to affect RBP binding, RNA secondary structure | | |
| 2a | RBP binding+matched RBP motif+riboSNitch+miRNA targets | 23 | 1 |
| 2b | RBP binding+any RBP motif+riboSNitch+miRNA targets | 29,415 | 483 |
| 2c | RBP binding+any RBP motif+riboSNitch | 177,806 | 5,758 |
| 2d | RBP binding+riboSNitch | 616,239 | 46,963 |
| | Less likely to affect RBP binding 4,984 | | |
| 3a | RBP binding+any RBP motif+miRNA targets | 15 | 49 |
| 3b | RBP binding+matched RBP motif | 28,599 | 0 |
| | Minimal possibility to affect RBP binding 11,477 | | |
| 4 | RBP binding+any RBP motif | 844 | 534 |
| 5 | RBP binding+miRNA targets | 85 | 209 |
| 6 | RBP binding or RBP motif hit | 29,415 | 14 |
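The classification scheme in Table 2 can be read as an ordered rule list in which the first fully satisfied row wins. The sketch below encodes that reading. The precedence order, the evidence labels and the assumption that a "matched" motif (taken here to mean a motif for the same RBP that is bound in CLIP-seq) also counts as "any" motif are interpretations for illustration, not the database's documented logic.

```python
# Ordered rules transcribed from Table 2: (category, evidence required).
RULES = [
    ("1a", {"eqtl", "rbp_binding", "matched_motif", "ribosnitch", "mirna_target"}),
    ("1b", {"eqtl", "rbp_binding", "any_motif", "ribosnitch", "mirna_target"}),
    ("1c", {"eqtl", "rbp_binding", "any_motif", "ribosnitch"}),
    ("1d", {"eqtl", "rbp_binding", "any_motif"}),
    ("1e", {"eqtl", "rbp_binding"}),
    ("2a", {"rbp_binding", "matched_motif", "ribosnitch", "mirna_target"}),
    ("2b", {"rbp_binding", "any_motif", "ribosnitch", "mirna_target"}),
    ("2c", {"rbp_binding", "any_motif", "ribosnitch"}),
    ("2d", {"rbp_binding", "ribosnitch"}),
    ("3a", {"rbp_binding", "any_motif", "mirna_target"}),
    ("3b", {"rbp_binding", "matched_motif"}),
    ("4", {"rbp_binding", "any_motif"}),
    ("5", {"rbp_binding", "mirna_target"}),
]

def categorize(evidence):
    """Return the first Table 2 category whose requirements are all present, else None."""
    ev = set(evidence)
    if "matched_motif" in ev:
        ev.add("any_motif")  # assumption: a matched motif is also a motif hit
    for label, required in RULES:
        if required <= ev:
            return label
    if ev & {"rbp_binding", "any_motif"}:
        return "6"  # "RBP binding or RBP motif hit"
    return None

examples = {
    "eQTL in a CLIP peak": categorize({"eqtl", "rbp_binding"}),
    "riboSNitch in a CLIP peak": categorize({"rbp_binding", "ribosnitch"}),
    "motif hit only": categorize({"any_motif"}),
}
print(examples)
```

Expressing the scheme as data rather than nested conditionals keeps it easy to revise as more functional rbSNVs are validated.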

WEB INTERFACE

Data retrieval in RBP-Var can be SNP-centric or gene-centric. SNP-centric retrieval is appropriate for analyzing results from genetic studies, especially high-throughput studies, and provides evidence for further functional studies to identify causal SNPs and shed light on underlying molecular mechanisms. Gene-centric searches are useful for selecting candidate SNPs based on genes of interest. In addition, RBP-Var provides RNA-editing retrieval, which can help decipher the potential function of RNA editing in post-transcriptional regulation and gene expression (24). Furthermore, we used the JBrowse Genome Browser (http://jbrowse.org) to build the ‘Browser’ page, which visualizes genome-wide RBP binding signals for each CLIP-seq data set. Users can browse all binding signals across specific genomic regions of interest. The search result for SNP rs1802295 in RBP-Var is shown as an example of the web interface (Figure 2).

Figure 2.

Web interface of RBP-Var. (A–H) Snapshots of the search result for SNP rs1802295 in the RBP-Var database.

We found that 874 160 (0.77%) SNVs in dbSNP v.142 overlapped with RNA binding sites identified by CLIP-seq. These SNVs included 835 130 SNPs, 26 854 deletions, 11 323 insertions, 166 insertions/deletions, 679 multiple nucleotide polymorphisms (MNPs) and 8 microsatellites. Subsequently, we applied our heuristic scoring system to annotate these SNVs and found that 828 210 SNVs were classified into Category 1/2 (Table 2), comprising 802 765 SNPs, 23 570 deletions, 1052 insertions, 152 insertions/deletions, 663 MNPs and 8 microsatellites (Figure 3A).
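Overlapping variants with binding sites is an interval-intersection problem. The following is a minimal sketch, assuming half-open 0-based coordinates and invented example data; it is not the pipeline used to build the database, where a dedicated interval tool would be the more usual choice at this scale.

```python
from bisect import bisect_right

def build_index(sites):
    """Merge overlapping half-open [start, end) sites per chromosome into sorted lists."""
    by_chrom = {}
    for chrom, start, end in sites:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index

def overlaps(index, chrom, start, end):
    """True if the variant interval [start, end) intersects any binding site."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, end - 1) - 1  # last site starting at or before the variant's last base
    return i >= 0 and ends[i] > start

# Invented example data: CLIP-seq sites and single-base variants.
sites = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 60)]
variants = [("chr1", 120, 121), ("chr1", 300, 301), ("chr2", 59, 60), ("chr3", 10, 11)]

index = build_index(sites)
flags = [overlaps(index, *variant) for variant in variants]
fraction = sum(flags) / len(flags)
print(flags, fraction)
```

The 0.77% quoted above is the same kind of fraction computed over all of dbSNP: 874 160 overlapping SNVs out of the 113 128 211 dbSNP records listed in Table 1.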

Figure 3.

The features of RBP-Var. (A) The composition of rbSNVs derived from dbSNP. (B) Comparison of SNVs with category score 1/2 deposited in RBP-Var and RegulomeDB with damaging SNVs predicted by SIFT and PolyPhen2. (C) Comparison of RNA editing events with category score 1/2 deposited in RBP-Var with damaging RNA editing events predicted by SIFT and PolyPhen2.

APPLICATIONS

To verify the heuristic system, we compared the functional rbSNVs of RBP-Var with regulatory SNVs from RegulomeDB (8) and deleterious SNVs predicted by SIFT (5) and PolyPhen2 (6). We treated SNVs with category score 1/2 as functional rbSNVs in RBP-Var and as regulatory SNVs in RegulomeDB. A total of 1 712 307 SNVs were identified as deleterious or functional by at least one of the four classification methods. Among the 828 210 functional rbSNVs identified by RBP-Var, 40 904 were classified as functional SNVs by RegulomeDB, 166 647 were predicted as deleterious by PolyPhen2 and 109 364 by SIFT (Figure 3B), whereas the remaining 651 784 rbSNVs were identified uniquely by RBP-Var. We then compared the rbSNVs uniquely identified by RBP-Var with GWAS-lead SNVs to search for causal disease variants. As a result, we identified 93 unique rbSNVs that were associated with human diseases (Supplemental Table S1). Similarly, we identified 7108 and 80 928 unique rbSNVs with RBP-Var score 1/2 in the ClinVar and COSMIC databases, respectively (Supplemental Tables S2 and S3). These observations indicate that RBP-Var can identify additional rbSNVs that may be implicated in human disease and biological functions even though they do not affect protein coding.

As one class of non-coding regulatory regions, untranslated regions (UTRs) are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and the stability of mRNAs through interactions with RBPs and other non-coding RNAs such as microRNAs (miRNAs) (4). Here, we found that 38 GWAS-lead SNPs located in UTRs were classified with RBP-Var score 1/2 (Supplemental Table S1). For example, SNP rs1802295 (C/T) is associated with type 2 diabetes in individuals of South Asian ancestry (53), but the mechanism by which it contributes to type 2 diabetes is still unknown (54). Our results revealed that SNP rs1802295 is located in the 3′-UTR of VPS26A, in a region containing a binding motif for the proteins PABPC1L and PABPC3 based on PWM matching. This motif is likely disrupted by the SNP because the core C in the motif is changed to U (Figure 4A). Meanwhile, the experimental binding signal around this SNP on the VPS26A mRNA includes binding by AGO2 and PTBP1 in the HeLa cell line, and by IGF2BP3, WDR33 and ELAVL1 in the HEK293 cell line (Figure 4A). All of these RBP bindings may be affected because the secondary structure of the VPS26A mRNA is changed (Figure 4B), and the target site of miRNA hsa-miR-510 is potentially lost when SNP rs1802295 occurs. Thus, we speculate that loss of targeting by hsa-miR-510 could up-regulate VPS26A and give rise to the cis-eQTL observed in blood according to the Blood eQTL browser (55), through post-transcriptional regulation mediated by these RBPs. Hence, our results obtained with the RBP-Var database provide additional insight into mechanisms underlying SNV-induced structural changes in UTRs, as such changes potentially affect the stability of the mRNA or disrupt its interactions with RBPs and the targeting of microRNAs in the UTRs.

Figure 4.

Application of RBP-Var to an SNV related to type 2 diabetes. (A) The RBP binding signal for SNP rs1802295, a functional rbSNV associated with type 2 diabetes. The binding motifs of the proteins PABPC1L and PABPC3 are shown above rs1802295. The binding intensity around this SNP on the VPS26A mRNA for AGO2 and PTBP1 in the HeLa cell line, and for IGF2BP3, WDR33 and ELAVL1 in the HEK293 cell line, is illustrated in separate tracks. The target site of miRNA hsa-miR-510 is marked by a yellow vertical line. (B) The optimal secondary structure of the RNA sequence spanning 200 bp of flanking region on either side of SNP rs1802295, for wild-type VPS26A and for the VPS26A mutant induced by rs1802295. The green and red nucleotides represent the local sequence in an interval of 50 bp for wild-type and mutant VPS26A, respectively.

RNA editing is a molecular process through which some cells make discrete changes to specific nucleotide sequences within an RNA molecule after it has been generated by RNA polymerase. Adenosine to Inosine (A-to-I) editing is the main form of RNA editing in mammals (56) and occurs in regions of double-stranded RNA. Adenosine deaminases acting on RNA are RNA-editing enzymes involved in the hydrolytic deamination of adenosine to inosine (A-to-I editing). There are many effects of A-to-I editing, arising from the fact that I behaves as if it is guanine in translation and secondary structure formation. These effects include alteration of coding capacity (57), altered miRNA or siRNA target populations, heterochromatin formation, inhibition of miRNA and siRNA processing, and altered splicing. In this study, only a small fraction of RNA editing events deposited in the RADAR/DARNED (58,59) database had deleterious effects on protein structure as predicted by SIFT (5) and PolyPhen2 (6) (Figure 3C). In total, we found that 52,584 unique RNA editing events were classified into category 1/2, which might affect RBP binding as well as RNA secondary structure (Figure 3C).

Take the RNA-editing site chr8:103841636–103841636 as an example. It was predicted to be tolerated by SIFT (score 0.55) and benign by PolyPhen2 (score 0), although this A-to-I RNA editing results in a serine-to-glycine (Ser-to-Gly) amino acid substitution at residue 367 of AZIN1. However, a recent study revealed that the edited form has a stronger affinity for antizyme, and that the resulting higher AZIN1 protein stability promotes cell proliferation and predisposes to human hepatocellular carcinoma compared with wild-type AZIN1 protein (60), although the underlying mechanism remains unknown. Intriguingly, we found that this A-to-I RNA editing may disrupt the RNA secondary structure of transcript ENST00000518697 of the AZIN1 gene (p = 0.0074). The resulting change in RNA secondary structure may influence the binding of RBPs including RBM47, eIF4AIII, AGO2, ELAVL1, FMR1 and PTBP1. Meanwhile, the search result from our database showed that this A-to-I RNA editing site is N6-methylated in the human hepatoma cell line HepG2. Our observation indicates that this A-to-I RNA editing may deplete the adenine N6-methylation (6mA), because the adenine (A) is converted into inosine (I). Hence, RBP-Var broadens the interpretation of RNA editing events by evaluating their effects on post-regulatory capacity.

DISCUSSION

In this study, we included extensive information on annotated and computed post-regulatory elements in the human genome, and we combined a large array of data sources into a single integrated database to quickly generate prioritized hypotheses for the function of variants affecting both coding and non-coding regions of the genome. This novel database has a convenient interface that allows easy query submission and provides an instant classification of significant variants. The SNV summary page allows users to quickly form a hypothesis regarding the RBP-mediated post-transcriptional consequence of a variant.

Guo et al. (61) constructed the rSNPbase database, which provides a similar annotation of SNVs integrated with RBP-mediated post-transcriptional regulation, using data sets of four RBPs derived from RIP-chip or RIP-seq analysis. Our database provides additional information by prioritizing SNVs within general regulatory regions based on specific RBPs, RNA secondary structure and PWM information.

Our simple heuristic scoring system may be improved over time, as more functional rbSNVs will be validated. Our analysis method and database are centered on a likely disruption of a protein-RNA interaction via alteration of RNA secondary structure and regulation of gene expression by rbSNVs if they are eQTLs. It is likely that there are additional sources of data that reinforce each other in a different manner that need to be explored; however, the scoring system provides significant enrichment along with better category scores, as shown when compared to GWAS-lead, ClinVar and COSMIC rbSNVs. Hence, our database and scoring system provide the best current method for annotating and prioritizing variants involved in RBP-mediated post-transcriptional regulation. Overall, we comprehensively integrated eQTLs, protein-binding sites on RNA, RNA secondary structure, miRNA, and SNV information along with GWAS, COSMIC, ClinVar, 6mA, alternative splicing, and miRNA expression profiles to seek potential post-functional SNVs.

In conclusion, this database provides a user-friendly web interface, allowing users to rapidly find whether SNVs of interest can transform the secondary structure of RNA and identify RBPs whose binding may be subsequently disrupted. Moreover, RBP-Var can assess the impact of each SNV on miRNA–RNA interaction as SNVs may destroy or create miRNA-binding sites, which result in loss-of-function and/or gain-of-function miRNA–RNA interactions. RBP-Var is a useful resource for benchmarking the mutations or RNA-editing events that cause disease by changing post-transcriptional interaction and regulation. In the future, RBP-Var will be updated frequently and extended to other species. We are dedicated to maintaining and improving RBP-Var, since it is a useful resource for the research community.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

Acknowledgments

The authors would like to thank Jinyu Wu and Jinchen Li for helping with website debugging; Baozhen Du and Sen Guo for helping with data collection; Qi Liu and Huiqian Chen for all the valuable suggestions on our manuscript.

FUNDING

Funding for open access charge: National High Technology Research and Development Program of China [2012AA02A202]; National Natural Science Foundation of China [31171236/C060503]; Innovation Center of China, AstraZeneca.

Conflict of interest statement. None declared.

REFERENCES

  • 1.Halvorsen M., Martin J.S., Broadaway S., Laederach A. Disease-associated mutations that alter the RNA structural ensemble. PLoS Genet. 2010;6:e1001074. doi: 10.1371/journal.pgen.1001074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Sabarinathan R., Tafer H., Seemann S.E., Hofacker I.L., Stadler P.F., Gorodkin J. The RNAsnp web server: predicting SNP effects on local RNA secondary structure. Nucleic Acids Res. 2013;41:W475–W479. doi: 10.1093/nar/gkt291. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Corley M., Solem A., Qu K., Chang H.Y., Laederach A. Detecting riboSNitches with RNA folding algorithms: a genome-wide benchmark. Nucleic Acids Res. 2015;43:1859–1868. doi: 10.1093/nar/gkv010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Sabarinathan R., Wenzel A., Novotny P., Tang X.J., Kalari K.R., Gorodkin J. Transcriptome-wide analysis of UTRs in Non-small cell lung cancer reveals cancer-related genes with SNV-induced changes on RNA secondary structure and miRNA target sites. PLoS One. 2014;9:e82699. doi: 10.1371/journal.pone.0082699. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Ng P.C., Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31:3812–3814. doi: 10.1093/nar/gkg509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Adzhubei I.A., Schmidt S., Peshkin L., Ramensky V.E., Gerasimova A., Bork P., Kondrashov A.S., Sunyaev S.R. A method and server for predicting damaging missense mutations. Nat. Methods. 2010;7:248–249. doi: 10.1038/nmeth0410-248. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Lee T.I., Young R.A. Transcriptional regulation and its misregulation in disease. Cell. 2013;152:1237–1251. doi: 10.1016/j.cell.2013.02.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Boyle A.P., Hong E.L., Hariharan M., Cheng Y., Schaub M.A., Kasowski M., Karczewski K.J., Park J., Hitz B.C., Weng S., et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012;22:1790–1797. doi: 10.1101/gr.137323.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Keene J.D. RNA regulons: coordination of post-transcriptional events. Nat. Rev. Genet. 2007;8:533–543. doi: 10.1038/nrg2111. [DOI] [PubMed] [Google Scholar]
  • 10.Mitchell S.F., Parker R. Principles and Properties of Eukaryotic mRNPs. Mol. Cell. 2014;54:547–558. doi: 10.1016/j.molcel.2014.04.033. [DOI] [PubMed] [Google Scholar]
  • 11.Lunde B.M., Moore C., Varani G. RNA-binding proteins: modular design for efficient function. Nat. Rev. Mol. Cell Biol. 2007;8:479–490. doi: 10.1038/nrm2178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Ye J., Blelloch R. Regulation of Pluripotency by RNA Binding Proteins. Cell Stem Cell. 2014;15:271–280. doi: 10.1016/j.stem.2014.08.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Attar N. The RBPome: where the brains meet the brawn. Genome Biol. 2014;15:402. doi: 10.1186/gb4153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Konig J., Zarnack K., Luscombe N.M., Ule J. Protein-RNA interactions: new genomic technologies and perspectives. Nat. Rev. Genet. 2012;13:77–83. doi: 10.1038/nrg3141. [DOI] [PubMed] [Google Scholar]
  • 15.Ascano M., Hafner M., Cekan P., Gerstberger S., Tuschl T. Identification of RNA-protein interaction networks using PAR-CLIP. Wires RNA. 2012;3:159–177. doi: 10.1002/wrna.1103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Licatalosi D.D., Mele A., Fak J.J., Ule J., Kayikci M., Chi S.W., Clark T.A., Schweitzer A.C., Blume J.E., Wang X.N., et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature. 2008;456:U464–U422. doi: 10.1038/nature07488. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Hafner M., Landthaler M., Burger L., Khorshid M., Hausser J., Berninger P., Rothballer A., Ascano M., Jungkamp A.C., Munschauer M., et al. Transcriptome-wide identification of RNA-binding protein and MicroRNA target sites by PAR-CLIP. Cell. 2010;141:129–141. doi: 10.1016/j.cell.2010.03.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Konig J., Zarnack K., Rot G., Curk T., Kayikci M., Zupan B., Turner D.J., Luscombe N.M., Ule J. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat. Struct. Mol. Biol. 2010;17:U909–U166. doi: 10.1038/nsmb.1838. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Wan Y., Qu K., Zhang Q.C., Flynn R.A., Manor O., Ouyang Z.Q., Zhang J.J., Spitale R.C., Snyder M.P., Segal E., et al. Landscape and variation of RNA secondary structure across the human transcriptome. Nature. 2014;505:706–715. doi: 10.1038/nature12946.
  • 20.Maticzka D., Lange S.J., Costa F., Backofen R. GraphProt: modeling binding preferences of RNA-binding proteins. Genome Biol. 2014;15:17–34. doi: 10.1186/gb-2014-15-1-r17.
  • 21.Fukunaga T., Ozaki H., Terai G., Asai K., Iwasaki W., Kiryu H. CapR: revealing structural specificities of RNA-binding protein target recognition using CLIP-seq data. Genome Biol. 2014;15:16–30. doi: 10.1186/gb-2014-15-1-r16.
  • 22.Castello A., Fischer B., Hentze M.W., Preiss T. RNA-binding proteins in Mendelian disease. Trends Genet. 2013;29:318–327. doi: 10.1016/j.tig.2013.01.004.
  • 23.Lagier-Tourenne C., Cleveland D.W. Rethinking ALS: The FUS about TDP-43. Cell. 2009;136:1001–1004. doi: 10.1016/j.cell.2009.03.006.
  • 24.Li J.B., Church G.M. Deciphering the functions and regulation of brain-enriched A-to-I RNA editing. Nat. Neurosci. 2013;16:1518–1522. doi: 10.1038/nn.3539.
  • 25.Li J.H., Liu S., Zhou H., Qu L.H., Yang J.H. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 2014;42:D92–D97. doi: 10.1093/nar/gkt1248.
  • 26.Yang Y.C., Di C., Hu B., Zhou M., Liu Y., Song N., Li Y., Umetsu J., Lu Z.J. CLIPdb: a CLIP-seq database for protein-RNA interactions. BMC Genomics. 2015;16:51–58. doi: 10.1186/s12864-015-1273-2.
  • 27.Barrett T., Wilhite S.E., Ledoux P., Evangelista C., Kim I.F., Tomashevsky M., Marshall K.A., Phillippy K.H., Sherman P.M., Holko M., et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013;41:D991–D995. doi: 10.1093/nar/gks1193.
  • 28.Schadt E.E., Molony C., Chudin E., Hao K., Yang X., Lum P.Y., Kasarskis A., Zhang B., Wang S., Suver C., et al. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 2008;6:1020–1032. doi: 10.1371/journal.pbio.0060107.
  • 29.Myers A.J., Gibbs J.R., Webster J.A., Rohrer K., Zhao A., Marlowe L., Kaleem M., Leung D., Bryden L., Nath P., et al. A survey of genetic human cortical gene expression. Nat. Genet. 2007;39:1494–1499. doi: 10.1038/ng.2007.16.
  • 30.Dermitzakis E.T. Population genomics of human gene expression. FEBS J. 2008;275:9–9.
  • 31.Veyrieras J.B., Kudaravalli S., Kim S.Y., Dermitzakis E.T., Gilad Y., Stephens M., Pritchard J.K. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 2008;4:e1000214. doi: 10.1371/journal.pgen.1000214.
  • 32.Pickrell J.K., Marioni J.C., Pai A.A., Degner J.F., Engelhardt B.E., Nkadori E., Veyrieras J.B., Stephens M., Gilad Y., Pritchard J.K. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010;464:768–772. doi: 10.1038/nature08872.
  • 33.Montgomery S.B., Sammeth M., Gutierrez-Arcelus M., Lach R.P., Ingle C., Nisbett J., Guigo R., Dermitzakis E.T. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature. 2010;464:773–777. doi: 10.1038/nature08903.
  • 34.Zeller T., Wild P., Szymczak S., Rotival M., Schillert A., Castagne R., Maouche S., Germain M., Lackner K., Rossmann H., et al. Genetics and beyond—the transcriptome of human monocytes and disease susceptibility. PLoS One. 2010;5:e10693. doi: 10.1371/journal.pone.0010693.
  • 35.Xia K., Shabalin A.A., Huang S.P., Madar V., Zhou Y.H., Wang W., Zou F., Sun W., Sullivan P.F., Wright F.A. seeQTL: a searchable database for human eQTLs. Bioinformatics. 2012;28:451–452. doi: 10.1093/bioinformatics/btr678.
  • 36.Gamazon E.R., Zhang W., Konkashbaev A., Duan S.W., Kistner E.O., Nicolae D.L., Dolan M.E., Cox N.J. SCAN: SNP and copy number annotation. Bioinformatics. 2010;26:259–262. doi: 10.1093/bioinformatics/btp644.
  • 37.Lonsdale J., Thomas J., Salvatore M., Phillips R., Lo E., Shad S., Hasz R., Walters G., Garcia F., Young N., et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 2013;45:580–585. doi: 10.1038/ng.2653.
  • 38.Welter D., MacArthur J., Morales J., Burdett T., Hall P., Junkins H., Klemm A., Flicek P., Manolio T., Hindorff L., et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–D1006. doi: 10.1093/nar/gkt1229.
  • 39.Forbes S.A., Beare D., Gunasekaran P., Leung K., Bindal N., Boutselakis H., Ding M., Bamford S., Cole C., Ward S., et al. COSMIC: exploring the world's knowledge of somatic mutations in human cancer. Nucleic Acids Res. 2015;43:D805–D811. doi: 10.1093/nar/gku1075.
  • 40.Landrum M.J., Lee J.M., Riley G.R., Jang W., Rubinstein W.S., Church D.M., Maglott D.R. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980–D985. doi: 10.1093/nar/gkt1113.
  • 41.Castello A., Fischer B., Eichelbaum K., Horos R., Beckmann B.M., Strein C., Davey N.E., Humphreys D.T., Preiss T., Steinmetz L.M., et al. Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell. 2012;149:1393–1406. doi: 10.1016/j.cell.2012.04.031.
  • 42.Ray D., Kazan H., Cook K.B., Weirauch M.T., Najafabadi H.S., Li X., Gueroussov S., Albu M., Zheng H., Yang A., et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature. 2013;499:172–177. doi: 10.1038/nature12311.
  • 43.Cook K.B., Kazan H., Zuberi K., Morris Q., Hughes T.R. RBPDB: a database of RNA-binding specificities. Nucleic Acids Res. 2011;39:D301–D308. doi: 10.1093/nar/gkq1069.
  • 44.Dassi E., Malossini A., Re A., Mazza T., Tebaldi T., Caputi L., Quattrone A. AURA: Atlas of UTR Regulatory Activity. Bioinformatics. 2012;28:142–144. doi: 10.1093/bioinformatics/btr608.
  • 45.Bailey T.L., Boden M., Buske F.A., Frith M., Grant C.E., Clementi L., Ren J., Li W.W., Noble W.S. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37:W202–W208. doi: 10.1093/nar/gkp335.
  • 46.Gruber A.R., Lorenz R., Bernhart S.H., Neuböck R., Hofacker I.L. The Vienna RNA Websuite. Nucleic Acids Res. 2008;36:W70–W74. doi: 10.1093/nar/gkn188.
  • 47.Friedman R.C., Farh K.K.H., Burge C.B., Bartel D.P. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009;19:92–105. doi: 10.1101/gr.082701.108.
  • 48.Betel D., Wilson M., Gabow A., Marks D.S., Sander C. The microRNA.org resource: targets and expression. Nucleic Acids Res. 2008;36:D149–D153. doi: 10.1093/nar/gkm995.
  • 49.Altshuler D.M., Gibbs R.A., Peltonen L., Dermitzakis E., Schaffner S.F., Yu F.L., Bonnen P.E., de Bakker P.I.W., Deloukas P., Gabriel S.B., et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52–58. doi: 10.1038/nature09298.
  • 50.Altshuler D.M., Durbin R.M., Abecasis G.R., Bentley D.R., Chakravarti A., Clark A.G., Donnelly P., Eichler E.E., Flicek P., Gabriel S.B., et al. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65. doi: 10.1038/nature11632.
  • 51.Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M.A.R., Bender D., Maller J., Sklar P., de Bakker P.I.W., Daly M.J., et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575. doi: 10.1086/519795.
  • 52.Baltz A.G., Munschauer M., Schwanhausser B., Vasile A., Murakawa Y., Schueler M., Youngs N., Penfold-Brown D., Drew K., Milek M., et al. The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol. Cell. 2012;46:674–690. doi: 10.1016/j.molcel.2012.05.021.
  • 53.Kooner J.S., Saleheen D., Sim X., Sehmi J., Zhang W.H., Frossard P., Been L.F., Chia K.S., Dimas A.S., Hassanali N., et al. Genome-wide association study in individuals of South Asian ancestry identifies six new type 2 diabetes susceptibility loci. Nat. Genet. 2011;43:984–989. doi: 10.1038/ng.921.
  • 54.Sun X., Yu W.H., Hu C. Genetics of Type 2 Diabetes: Insights into the Pathogenesis and Its Clinical Application. Biomed. Res. Int. 2014;2014:926713–926727. doi: 10.1155/2014/926713.
  • 55.Westra H.J., Peters M.J., Esko T., Yaghootkar H., Schurmann C., Kettunen J., Christiansen M.W., Fairfax B.P., Schramm K., Powell J.E., et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 2013;45:1238–1243. doi: 10.1038/ng.2756.
  • 56.Danecek P., Nellaker C., McIntyre R.E., Buendia-Buendia J.E., Bumpstead S., Ponting C.P., Flint J., Durbin R., Keane T.M., Adams D.J. High levels of RNA-editing site conservation amongst 15 laboratory mouse strains. Genome Biol. 2012;13:26–37. doi: 10.1186/gb-2012-13-4-r26.
  • 57.Garrett S., Rosenthal J.J.C. RNA editing underlies temperature adaptation in K+ channels from polar octopuses. Science. 2012;335:848–851. doi: 10.1126/science.1212795.
  • 58.Ramaswami G., Li J.B. RADAR: a rigorously annotated database of A-to-I RNA editing. Nucleic Acids Res. 2014;42:D109–D113. doi: 10.1093/nar/gkt996.
  • 59.Kiran A., Baranov P.V. DARNED: a DAtabase of RNa EDiting in humans. Bioinformatics. 2010;26:1772–1776. doi: 10.1093/bioinformatics/btq285.
  • 60.Chen L.L., Li Y., Lin C.H., Chan T.H.M., Chow R.K.K., Song Y.Y., Liu M., Yuan Y.F., Fu L., Kong K.L., et al. Recoding RNA editing of AZIN1 predisposes to hepatocellular carcinoma. Nat. Med. 2013;19:209–216. doi: 10.1038/nm.3043.
  • 61.Guo L.Y., Du Y., Chang S.H., Zhang K.L., Wang J. rSNPBase: a database for curated regulatory SNPs. Nucleic Acids Res. 2014;42:D1033–D1039. doi: 10.1093/nar/gkt1167.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
