Abstract
The secondary structure of a pre-mRNA influences a number of processing steps including alternative splicing. Since most splicing regulatory proteins bind to single-stranded RNA, the sequestration of RNA into double strands could prevent their binding. Here, we analyzed the secondary structure context of experimentally determined splicing enhancer and silencer motifs in their natural pre-mRNA context. We found that these splicing motifs are significantly more single-stranded than controls. These findings were validated by transfection experiments, where the effect of enhancer or silencer motifs on exon skipping was much more pronounced in single-stranded conformation. We also found that the structural context of predicted splicing motifs is under selection, suggesting a general importance of secondary structures on splicing and adding another level of evolutionary constraints on pre-mRNAs. Our results explain the action of mutations that affect splicing and indicate that the structural context of splicing motifs is part of the mRNA splicing code.
Author Summary
Almost all human protein-coding genes contain several exons and introns. Prior to translation, introns have to be removed and exons have to be joined, which happens in a processing step called splicing that generates the mature mRNA. For most genes, certain exons can be either included or excluded from the mature mRNA. It is currently not fully understood which signals are needed to accurately recognize the boundaries of exons in the intron-containing primary transcript. As in transcriptional regulation, enhancer and silencer sequence motifs are crucial for the correct recognition of exons. Splicing regulatory proteins identify these motifs in a sequence-specific manner. In general, these proteins bind to single-stranded RNA. Here, we analyzed local secondary structures of primary transcripts and found that known splicing motifs are preferentially located in a single-stranded context. Experimental tests demonstrated that motifs in single-stranded contexts have a stronger effect on splice site selection than those located in double-stranded regions. These results help to understand the action of human mutations that change the splicing pattern and indicate that local pre-mRNA secondary structures influence exon recognition.
Introduction
RNA molecules can adopt various conformations in solution by base pairings and hydrophobic interactions. For example, transfer RNA (tRNA) adopts a defined secondary and tertiary structure [1], whereas most messenger RNAs (mRNAs) exhibit only local structures [2]. Defined RNA structures can be recognized by other molecules, as exemplified by aminoacyl-tRNA synthetases that contact with the minor groove of the tRNA acceptor stem [3], which illustrates the functional importance of RNA structure. In contrast, most proteins regulating splice site selection recognize single-stranded, not base-paired RNA. For example, PUF (Pumilio/FBF), Zn-binding, KH (hnRNP K homology) domains, and RRMs (RNA recognition domains) bind to 2–10 nucleotides of single-stranded RNA [4]. Frequently, the RNA part that binds to a protein is in a hairpin loop [5,6]. The sequence specificity of binding is achieved by hydrophobic interactions between RNA bases and amino acids on the surface of the protein, which explains why RNA binding proteins can bind with sequence specificity to unstructured RNA [4].
The large majority of human genes is alternatively spliced [7,8], and the expression of more than one transcript from one gene represents an important mechanism to increase the diversity of a transcriptome and proteome from a limited number of genes [9,10]. In higher organisms, the splice sites do not contain all the information that is required for accurate intron recognition [11]. Additional enhancer and silencer signals located in exons and introns (ESE, ESS, ISE, and ISS for short) are essential for the alternative and constitutive splicing [12]. We refer to them collectively as splicing regulatory motifs. These splicing regulatory motifs bind to RNA binding proteins [4] or other RNAs [13]. There is emerging evidence that pre-mRNA secondary structure plays a role in alternative splicing (reviewed in [14]). For example, a deletion in the mouse fibronectin EDA exon leads to a shift of a critical ESE from single- into double-stranded conformation, which causes exon skipping [15]. The skipping of exon 7 of the SMN2 gene is correlated with the stability of a stem structure that sequesters the donor splice site [16]. The splicing of mutually exclusive exons in the rat FGFR2 and Drosophila DSCAM gene is regulated by conserved secondary structures [17,18]. Moreover, it has been proposed that sequences surrounding alternative exons might form structures that loop out the exon and prevent its recognition [19,20].
Here, we analyze whether the structural context of binding sites for splicing regulatory proteins has a general importance. We first compiled a large set of experimentally verified splicing enhancer and silencer motifs with their natural sequence context. Calculating the probability of a motif to be single-stranded, we found that splicing motifs are located in a structural context that favors their sequence to be more single-stranded than expected from various controls. These results were confirmed by transfection experiments comparing the effect of known splicing motifs in single- and double-stranded conformation. Furthermore, we show that the structural context of predicted splicing regulatory motifs is under selection. Our findings suggest that pre-mRNA secondary structures are an integral part of splice site recognition.
Results
Measurement of Single-Strandedness
To investigate the structural context of splicing motifs, we first developed an appropriate measurement of single-strandedness. Since in vivo mRNA secondary structures are mostly unknown [21], we use the standard approach of predicting structures by energy minimization [22]. For measuring single-strandedness, we compute the probability that all bases in the motif are unpaired (denoted as probability unpaired or PU value) using the equilibrium partition function [23]. Higher PU values indicate higher single-strandedness of the motif. As PU values account for all possible structures, they include possible structural fluctuations and circumvent inaccuracies caused by considering only one energetically optimal structure or a limited number of suboptimal structures. PU values allow the direct comparison of the single-strandedness of splicing motifs.
In vivo secondary structures of pre-mRNAs are likely to be local rather than global. Local RNA folding is influenced by the length of the flanking sequence context [24], and several lines of evidence indicate that pre-mRNA folding windows are small. First, pre-mRNA is bound by numerous proteins, which influences their ability to fold freely. Second, the formation of secondary structures occurs cotranscriptionally, which favors short-range over long-range base pairing [24]. This model is consistent with the results of kinetic folding algorithms [25]. Third, experiments suggested that pre-mRNA folding is limited to a region of about 50 nt downstream of the transcribing polymerase [26]. For these reasons, we focused on local base pairing and considered all symmetrical context lengths from 11 up to 30 nt up- and downstream of the splicing motif. Thus, for a motif of length 6 nt, we considered sequences with a total length from 28 nt (for context length 11) to length 66 nt (for context length 30). Smaller context lengths were not considered as the resulting sequences rarely form energetically stable structures. We computed the PU value of the splicing motif for all 20 context lengths. To obtain a single PU value for each splicing motif and to reduce a strong dependency on a single fixed context length, we averaged these 20 PU values. Figure 1A illustrates this approach for a binding site of the polypyrimidine tract binding protein.
Experimentally Determined Splicing Motifs Are Preferentially Located in Single-Stranded Regions
We used the AEdb motif database to obtain an experimentally verified set of high quality splicing regulatory motifs [27]. After filtering (see Materials and Methods), this set comprises 77 exonic and intronic enhancers and silencers with a length up to 9 nt from human, mouse, rat, chicken, Drosophila, and several viruses (Table S1). For each motif, we computed the average PU value using the natural pre-mRNA sequence context as described above. To get an overall measure, we averaged these 77 PU values (Table 1; Table S1).
Table 1.
To assess whether verified splicing motifs have a preference for single strands, we performed a statistical evaluation controlling for the motif length and the guanine-cytosine content (GC) content that influence the PU values (Figures S1 and S2). First, we randomly chose a new motif of the same length in the up- and downstream flanks of the natural sequence context for all 77 motifs (control dataset 1, Figure 1B). We repeated this 100 times to obtain 100 sets each with 77 randomly chosen motifs. The p-value was calculated as the fraction of random sets having a higher average PU value compared to the verified motifs (for example, if all 100 random sets have a lower PU value, the p-value is less than 1/100 = 0.01). This control accounts for possible biases in the selection of genes or exons since it uses the same sequences. We observed a substantial drop in the average PU value from 0.25 to 0.15, indicating that verified motifs are significantly more single-stranded (p < 0.01, Table 1).
To also account for the sequence bias of the verified motifs, we used another control. We repeated the procedure of control 1 but copied the verified motif to a new position (control dataset 2, Figure 1B). All random sets exhibit a lower single-strandedness (p < 0.01). To further verify this effect, we used dinucleotide shuffling [28] to modify the up- and downstream flanks of verified motifs while preserving the motif sequence (control dataset 3, Figure 1B). Again, all 100 randomly generated sets have lower PU values (p < 0.01). As a final test, we randomly selected a motif in 10,000 exons and 10,000 introns (control datasets 4 and 5). Since exons and introns have differences in their nucleotide composition [29], we split the 77 motifs into 50 exonic and 27 intronic ones according to their location in the exon-intron structure. As shown in Table 1, we found a significantly higher single-strandedness for exonic and intronic verified motifs compared to exonic and intronic random motifs.
To exclude the possibility that the maximal context length of 30 nt is inappropriate, we repeated the entire analysis with average values for context lengths 11–20 nt as well as 11–50 nt and found consistent results (Table S2). We also tested two other ways to measure the single-strandedness (Text S1; Table S2). It should be noted that even conservative controls with a lower GC content for control datasets 4 and 5 yield a higher single-strandedness for verified motifs (Tables S2 and S3).
The consistent results observed for all tests led us to conclude that naturally occurring, experimentally verified splicing motifs have a significant preference to be single-stranded. Furthermore, the results from controls 2 and 3, which keep the motif sequences, indicate that the higher single-strandedness is attributed to the flanks of the verified motifs rather than the motifs themselves.
The Activity of Splicing Regulatory Sequences Depends on Their Conformation in the Pre-mRNA
Next, we tested the hypothesis that the localization of a splicing regulatory element in a single- or double-stranded RNA structure influences splice site selection in an experimental system. We used the SXN-minigene [30], which has been widely used to analyze the impact of sequences on pre-mRNA processing [31]. This minigene contains an artificial alternative exon between two constitutively spliced globin exons. We inserted splicing enhancer and silencer motifs located either in a single-stranded loop or in a double-stranded stem into this alternative exon (Figure 2A). The single- and double-stranded motifs are located at similar positions in the exon to avoid positional effects [32]. The splicing pattern was analyzed by transfecting the constructs in HEK293 cells followed by detection of the splicing pattern by reverse transcription (RT)-PCR. As enhancer sequences, we used the experimentally well-characterized enhancer of the CD44 pre-mRNA (CAACCACAA) [33] and a pentamer (CAAGG), which is the core of many computational predicted enhancers [34]. As silencers, we used the experimentally characterized hnRNP A1 (TAGGGT) silencer and a computationally predicted silencer (GTAAGTGA) [35], which was previously experimentally verified in the HTR2C pre-mRNA [13].
As shown in Figure 2B and 2C, we observed that the function of these regulatory motifs strongly depends on their localization within an RNA conformation. Enhancers result in stronger exon inclusion when they are located in a loop compared with the location in the stem structure. Likewise, silencers located in a loop lead to stronger exon skipping compared to the loop structure. These data show that the pre-mRNA conformation influences the action of a splicing regulatory element.
Structural Context of Predicted Splicing Enhancer and Silencer Hexamers
Next, we asked whether we could detect evolutionary selection on the structural context of computationally predicted splicing regulatory motifs. Predicted enhancers, silencers, and “splicing-neutral” hexamers were taken from Stadler et al. [36]. Our strategy to detect differences in single-strandedness of one hexamer was to compare the single-strandedness in enhancer-dependent and silencer-dependent regions. Human exons are dependent on enhancers, while the intronic regions between an authentic and a strong intronic decoy splice site (decoy regions) are dependent on silencers [37]. The single-strandedness of each hexamer was determined as the average PU value for up to 1,000 hexamer occurrences in these datasets. The direct comparison of the PU values of equal hexamers is complicated by a strong negative correlation between the GC content of the hexamer flanking regions and the PU values (r = −0.64, p < 0.0001), as well as large differences in GC content between exons and decoy regions (Figures S2 and S3). To exclude this GC content bias, we focused only on those cases where selection is strong enough to overcome the overall correlation between PU and GC. Specifically, we considered a hexamer as selected to be single-stranded in exons if, despite more GC rich exonic flanks, the exonic PU value is higher compared to the decoy regions. Likewise, we considered a hexamer as selected to be single-stranded in decoy regions if, despite more GC rich flanks in decoy regions, the PU value is higher compared to exons.
Using the fraction of splicing-neutral motifs that are selected to be single-stranded as the background fraction, we found that in exons enhancers are significantly more often selected for single-strandedness, while silencers are close to neutral motifs (Figure 3; Table S4). Strikingly, in decoy regions, silencers are significantly more often while enhancers are less often selected to be single-stranded. These results indicate that the high enhancer frequency in exons is further intensified by a tendency for single-strandedness. Likewise, silencers are abundant in decoy regions [35], and in addition, their structural context has a tendency to be selected for single-strandedness. This is further supported by tests with other silencer-dependent regions: pseudo exons (silent intronic regions bounded by strong splice sites [34]) and intronic regions adjacent to authentic splice sites (Figure 3), as well as additional tests that compare enhancer/silencer/neutral motifs having the same number of GCs within and between datasets (Tables S5 and S6). Noteworthy, these differences are more pronounced for exons with weak splice sites and less pronounced for those with strong splice sites, consistent with the idea that the former have an even higher enhancer-dependency (Figure S4). These results indicate a widespread selection pressure on the structural context of splicing motifs.
Discussion
It is well established that pre-mRNA sequences have enhancing or silencing activities on splice site selection, but the involvement of pre-mRNA secondary structure has not yet been systematically investigated. By analyzing the secondary structure of experimentally verified splicing motifs within their natural pre-mRNA context, we found a higher single-strandedness for these motifs. These results were confirmed by transfection experiments demonstrating that single-stranded splicing motifs exert a stronger effect on the exon usage. Finally, we found that the structural context of predicted splicing motifs is under selection in enhancer- and silencer-dependent regions. Exons, in particular those with weak splice sites, have a higher single-strandedness for enhancers, while silencer-dependent regions have higher single-strandedness for silencers. As most exons are dependent on several splicing regulatory motifs, these findings suggest a general importance of pre-mRNA secondary structures on the splicing outcome. They indicate that secondary structures are a part of the “mRNA splicing code” that determines exon recognition. Selection on secondary structures could also explain the observed constraints on synonymous codon sites [38] and the conservation of large intronic regions adjacent to alternative exons [39]. We propose that a coding exon is subjected to at least three different selection pressures: (i) preserving the coding sequence, (ii) preserving the sequence of splicing motifs, and (iii) preserving an appropriate structural context for these splicing motifs. Selection on the coding sequence is likely to be the strongest pressure. This is consistent with results that splicing motifs are interchangeable [36] and our finding that intronic motifs are more single-stranded than exonic ones (Table 1).
Most splicing regulatory motifs are recognized by sequence-specific RNA binding proteins that make direct contacts with unpaired RNA bases [4]. This provides a mechanistic explanation for the inhibitory role of double strands on RNA–protein and mRNA–microRNA interactions [40,41] and for our observation that splicing regulatory motifs are located in single-stranded regions. In contrast to this general trend, a few individual motifs are located in a somewhat double-stranded conformation (Table S1). It remains to be determined which trans-acting factors bind to these motifs or whether a specific experimental situation influenced the results. It is also possible that proteins binding to the proximity of the double-stranded motifs change their single-strandedness or that RNA helicases unwind them [42].
Our findings can explain previously observed splicing effects of mutations that do not change splicing regulatory elements. For example, a silent mutation in human CFTR exon 12 that reduces exon inclusion from 80%–25% [43] does not create or destroy splicing motifs, but leads to a higher single-strandedness of existing ESSs and a lower single-strandedness of an existing ESE (Figure 4A). Other examples for mutations in human HPRT1 exon 8 are shown in Figures S5 and S6. Likewise, secondary structures can explain why mutations that change splicing motifs sometimes show no splicing effect. Most likely, the affected motifs are highly double-stranded in these cases; exemplified for motifs in rat beta-tropomyosin exon 8 (Figure 4B), human MAPT alternative exon 10 (Figure S7), and human HPRT1 exon 8 (Figures S8 and S9). Thus, the structural context should be taken into account in mutagenesis experiments [44], where observed splicing effects are usually interpreted as changes in regulatory motifs. To facilitate such analyses, we provide a web resource (http://biwww2.informatik.uni-freiburg.de/Software/NIPU) that allows a comparison between PU values and ESE/ESS scores [36]. As the formation of secondary structures depends on the speed of transcription [24], the effect of the promoter on splicing [45] might be partially explained by differential structure formation. Furthermore, secondary structures provide useful information in sequence motif finding [46] and will improve the computational detection of splicing motifs [31,34,47] and gene prediction algorithms [35].
Materials and Methods
Measurement of single-strandedness.
PU values were computed as described in [46]. Briefly, the PU value for the region a to b in an mRNA sequence is defined as
where Eall is the free energy of the ensemble of all structures, Eunpaired is the free energy of the ensemble of all structures that have the complete region a to b unpaired, R is the universal gas constant, and T is the temperature. We computed Eall and Eunpaired using the partition function version of RNAfold [48]. For Eunpaired, we assure that the region a to b is unpaired by applying additional constraints (RNAfold parameter-C). Other measurements of single-strandedness are described in the Text S1.
Experimentally determined splicing motifs.
We carefully examined the literature for all splicing motifs listed in the motif database of AEDB (http://www.ebi.ac.uk/asd/aedb/) [27]. We checked the consistency of the listed genes, species, and motif sequences using the respective publications. Only motifs that were demonstrated to influence splicing in their natural context were considered. We excluded motifs whose annotation was based solely on computational predictions, cases where the exact motif location is not given in the publication, or where we could not find the given motif in the respective gene. Motifs that presumably act by forming secondary structures instead of representing a protein binding site and motifs that were predicted on the basis of the splicing effect of a SNP were also excluded. Since three-dimensional structures of single-stranded RNA binding proteins indicate that they usually contact only a few residues [4], we discarded all motifs with a length of more than 9 nt as these longer motifs are likely to contain a core binding site at a location that was not experimentally determined. We removed redundancy by requiring that each motif has a unique location within a gene.
Control datasets.
For control datasets 1–3, we randomly generated 100 motif sets each with 77 random motifs by repeating the following procedures 100 times for each verified motif. Control dataset 1: We randomly selected a new position within the 150-nt up- and downstream flanking regions of the verified motif. The subsequence starting at this position with the same length as the verified motif was taken as the new motif. Control dataset 2: We used the same procedure as for control 1 but replaced the subsequence starting at the new position by the verified motif. Control dataset 3: Dinucleotide shuffling was done with Dishuffle [28] separately for the 300-nt up- and downstream flanks of the motif. The Dicodonshuffle program, which shuffles dicodons while preserving the coding sequence, could not be used because the output was not variable enough to yield 100 different sets. It should be noted that all control sequences were repeatedly derived from the same set of verified sequences. They are therefore not necessarily independent and the Wilcoxon rank-sum test cannot be used. To evaluate them statistically, we calculated the p-value as the fraction of random sets having a higher average PU value compared to the verified motifs. Control datasets 4 and 5: We downloaded from the University of California Santa Cruz Genome Browser the Human genome assembly (hg17, May 2004) as well as RefSeq transcript annotations (refGene.txt.gz, January 2005). Then, we extracted all internal exons and all 300-nt flanks from all introns. Redundancy was removed by using RSA-tools (http://rsat.ulb.ac.be/rsat/purge-sequence_form.cgi) with a minimal match length of 50 nt and at most three mismatches. We randomly selected 10,000 motifs within the exons (introns) while keeping the length distribution of the exonic (intronic) splicing motifs. All intronic motifs are at least 50 nt away from the splice site. The Wilcoxon rank-sum test was used to test whether verified and random motifs come from the same distribution. In all controls, we used the length distribution of the verified splicing motifs and controlled for the GC content (Table S3). Datasets with a lower and higher GC content for control 4 and 5 were generated by randomly discarding entries with a GC content of smaller and greater than 0.5. Statistical tests (Fisher's exact test, Wilcoxon rank-sum test) were performed using R (http://www.r-project.org/).
Minigene experiments.
Splicing assays and cotransfection were performed as described [49], employing the SXN derivatives. Briefly, 1 μg of indicated minigenes were transfected into HEK293 cells using the calcium phosphate method. RNA was isolated 24 h post-transfection by using the total RNA kit (Qiagen). Reverse transcription was performed in a total volume of 8.5 μl by using 2 μl RNA, 1 μl oligo (dT) (0.5 mg/ml), 1 μl first-strand buffer (Invitrogen), 10 mM DTT, 1 mM dNTP, 5 units SuperScript II RT (Invitrogen) at 42 °C for 60 min. To ensure that only plasmid-derived minigene transcripts were detected, subsequent amplification was performed using vector-specific primers (fwd-CCATTTGACCATTCACCACA, rev-CACTCCTGATGCTGTTATGG). RT-PCR products were resolved on ethidium bromide-stained agarose gels. The ratio of exon inclusion to exon skipping was determined using the Image G program. To detect possible unspliced products, RNA samples were treated with DNAse and after reverse transcription, unspliced RNA was amplified using 4′ extension times. Using these conditions, no unspliced RNA could be observed.
Predicted splicing motifs.
The exon set corresponds to the human exons used for control dataset 4. The definition of pseudo exons was adopted from [34] as intronic region with a length between 50 and 250 nt, flanked by two splice sites having a score of at least 8.5. We used a maximum entropy-based model for the quantification of the splice site strength as described in [50], which we computed using the Web server http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html. All pseudo exons that overlap with spliced ESTs were excluded. To search for “decoy regions,” we scored all potential splice sites in the 100-nt intron flanks for introns longer than 400 nt using the maximum entropy model [37]. A decoy donor region is the region between a real donor site and a downstream decoy donor site (located at least 6 nt downstream), which has a higher score. A decoy acceptor region is the region between a real acceptor site and an upstream decoy acceptor site (located at least 15 nt upstream), which has a higher score. Decoy donor and acceptor regions were merged. The intron flank dataset consists of the regions +10...+70 from a donor site and −80... −20 from the acceptor site of the human exons. Redundancy in each dataset was removed using RSA-tools with a minimal match length of 30 nt and at most two mismatches.
For each dataset, we randomly selected up to 1,000 occurrences for each of the 4,096 hexamers. Then, we determined the single-strandedness for one hexamer by averaging the PU values (using context lengths 11–30 nt) for each of the hexamer occurrences. The average GC content of one motif was computed for the 66-nt region (30 nt up/downstream context + 6 nt of the motif). The 4,096 hexamers were divided into exonic enhancers (neighborhood inference (NI) score > 0.8), silencers (NI score < −0.8), and splicing-neutral motifs (−0.8 ≥ NI score ≤ 0.8), as suggested in Stadler et al. [36].
Supporting Information
Acknowledgments
We thank Sebastian Will and Anke Busch for helpful discussions.
Abbreviations
- GC
guanine-cytosine content
- mRNA
messenger RNA
- NI
neighborhood inference
- PU
probability unpaired
Footnotes
Author contributions. MH, RB, and SS conceived and designed the experiments and wrote the paper. MH and ZZ performed the experiments. MH and SS analyzed the data. RB and SS contributed equally to this study.
Funding. This work was supported by a grant from the European Union (EURASNET) and the Centers of Biomedical Research Excellence (COBRE) grant of the National Institutes of Health (P20RR020171).
Competing interests. The authors have declared that no competing interests exist.
References
- Holbrook SR. RNA structure: the long and the short of it. Curr Opin Struct Biol. 2005;15:302–308. doi: 10.1016/j.sbi.2005.04.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Graveley BR, Hertel KJ, Maniatis T. A systematic analysis of the factors that determine the strength of pre-mRNA splicing enhancers. EMBO J. 1998;17:6747–6756. doi: 10.1093/emboj/17.22.6747. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sankaranarayanan R, Dock-Bregeon AC, Romby P, Caillet J, Springer M, et al. The structure of threonyl-tRNA synthetase-tRNA(Thr) complex enlightens its repressor activity and reveals an essential zinc ion in the active site. Cell. 1999;97:371–381. doi: 10.1016/s0092-8674(00)80746-1. [DOI] [PubMed] [Google Scholar]
- Auweter SD, Oberstrass FC, Allain FH. Sequence-specific binding of single-stranded RNA: is there a code for recognition? Nucleic Acids Res. 2006;34:4943–4959. doi: 10.1093/nar/gkl620. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Buckanovich RJ, Darnell RB. The neuronal RNA binding protein Nova-1 recognizes specific RNA targets in vitro and in vivo. Mol Cell Biol. 1997;17:3194–3201. doi: 10.1128/mcb.17.6.3194. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Skrisovska L, Bourgeois CF, Stefl R, Grellscheid SN, Kister L, et al. The testis-specific human protein RBMY recognizes RNA through a novel mode of interaction. EMBO Rep. 2007;8:372–379. doi: 10.1038/sj.embor.7400910. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, et al. Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science. 2003;302:2141–2144. doi: 10.1126/science.1090100. [DOI] [PubMed] [Google Scholar]
- Tress ML, Martelli PL, Frankish A, Reeves GA, Wesselink JJ, et al. The implications of alternative splicing in the ENCODE protein complement. Proc Natl Acad Sci U S A. 2007;104:5495–5500. doi: 10.1073/pnas.0700800104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Graveley BR. Alternative splicing: increasing diversity in the proteomic world. Trends Genet. 2001;17:100–107. doi: 10.1016/s0168-9525(00)02176-4. [DOI] [PubMed] [Google Scholar]
- Hiller M, Huse K, Szafranski K, Jahn N, Hampe J, et al. Widespread occurrence of alternative splicing at NAGNAG acceptors contributes to proteome plasticity. Nat Genet. 2004;36:1255–1257. doi: 10.1038/ng1469. [DOI] [PubMed] [Google Scholar]
- Lim LP, Burge CB. A computational analysis of sequence features involved in recognition of short introns. Proc Natl Acad Sci U S A. 2001;98:11193–11198. doi: 10.1073/pnas.201407298. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Blencowe BJ. Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends Biochem Sci. 2000;25:106–110. doi: 10.1016/s0968-0004(00)01549-8. [DOI] [PubMed] [Google Scholar]
- Kishore S, Stamm S. The snoRNA HBII-52 regulates alternative splicing of the serotonin receptor 2C. Science. 2006;311:230–232. doi: 10.1126/science.1118265. [DOI] [PubMed] [Google Scholar]
- Buratti E, Baralle FE. Influence of RNA secondary structure on the pre-mRNA splicing process. Mol Cell Biol. 2004;24:10505–10514. doi: 10.1128/MCB.24.24.10505-10514.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Buratti E, Muro AF, Giombi M, Gherbassi D, Iaconcig A, et al. RNA folding affects the recruitment of SR proteins by mouse and human polypurinic enhancer elements in the fibronectin EDA exon. Mol Cell Biol. 2004;24:1387–1400. doi: 10.1128/MCB.24.3.1387-1400.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Singh NN, Singh RN, Androphy EJ. Modulating role of RNA structure in alternative splicing of a critical exon in the spinal muscular atrophy genes. Nucleic Acids Res. 2007;35:371–389. doi: 10.1093/nar/gkl1050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Graveley BR. Mutually exclusive splicing of the insect Dscam pre-mRNA directed by competing intronic RNA secondary structures. Cell. 2005;123:65–73. doi: 10.1016/j.cell.2005.07.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Muh SJ, Hovhannisyan RH, Carstens RP. A Non-sequence-specific double-stranded RNA structural element regulates splicing of two mutually exclusive exons of fibroblast growth factor receptor 2 (FGFR2) J Biol Chem. 2002;277:50143–50154. doi: 10.1074/jbc.M207409200. [DOI] [PubMed] [Google Scholar]
- Miriami E, Margalit H, Sperling R. Conserved sequence elements associated with exon skipping. Nucleic Acids Res. 2003;31:1974–1983. doi: 10.1093/nar/gkg279. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang XH, Leslie CS, Chasin LA. Dichotomous splicing signals in exon flanks. Genome Res. 2005;15:768–779. doi: 10.1101/gr.3217705. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schroeder R, Barta A, Semrad K. Strategies for RNA folding and assembly. Nat Rev Mol Cell Biol. 2004;5:908–919. doi: 10.1038/nrm1497. [DOI] [PubMed] [Google Scholar]
- Zuker M, Stiegler P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 1981;9:133–148. doi: 10.1093/nar/9.1.133. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Muckstein U, Tafer H, Hackermuller J, Bernhart SH, Stadler PF, et al. Thermodynamics of RNA-RNA binding. Bioinformatics. 2006;22:1177–1182. doi: 10.1093/bioinformatics/btl024. [DOI] [PubMed] [Google Scholar]
- Schroeder R, Grossberger R, Pichler A, Waldsich C. RNA folding in vivo. Curr Opin Struct Biol. 2002;12:296–300. doi: 10.1016/s0959-440x(02)00325-1. [DOI] [PubMed] [Google Scholar]
- Flamm C, Fontana W, Hofacker IL, Schuster P. RNA folding at elementary step resolution. RNA. 2000;6:325–338. doi: 10.1017/s1355838200992161. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eperon LP, Graham IR, Griffiths AD, Eperon IC. Effects of RNA secondary structure on alternative splicing of pre-mRNA: is folding limited to a region behind the transcribing RNA polymerase? Cell. 1988;54:393–401. doi: 10.1016/0092-8674(88)90202-4. [DOI] [PubMed] [Google Scholar]
- Stamm S, Riethoven JJ, Le Texier V, Gopalakrishnan C, Kumanduri V, et al. ASD: a bioinformatics resource on alternative splicing. Nucleic Acids Res. 2006;34:D46–D55. doi: 10.1093/nar/gkj031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Katz L, Burge CB. Widespread selection for local RNA secondary structure in coding regions of bacterial genes. Genome Res. 2003;13:2042–2051. doi: 10.1101/gr.1257503. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]
- Coulter LR, Landree MA, Cooper TA. Identification of a new class of exonic splicing enhancers by in vivo selection. Mol Cell Biol. 1997;17:2143–2150. doi: 10.1128/mcb.17.4.2143. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fairbrother WG, Yeh RF, Sharp PA, Burge CB. Predictive identification of exonic splicing enhancers in human genes. Science. 2002;297:1007–1013. doi: 10.1126/science.1073774. [DOI] [PubMed] [Google Scholar]
- Goren A, Ram O, Amit M, Keren H, Lev-Maor G, et al. Comparative analysis identifies exonic splicing regulatory sequences–The complex definition of enhancers and silencers. Mol Cell. 2006;22:769–781. doi: 10.1016/j.molcel.2006.05.008. [DOI] [PubMed] [Google Scholar]
- Stickeler E, Fraser SD, Honig A, Chen AL, Berget SM, et al. The RNA binding protein YB-1 binds A/C-rich exon enhancers and stimulates splicing of the CD44 alternative exon v4. EMBO J. 2001;20:3821–3830. doi: 10.1093/emboj/20.14.3821. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang XH, Chasin LA. Computational definition of sequence motifs governing constitutive exon splicing. Genes Dev. 2004;18:1241–1250. doi: 10.1101/gad.1195304. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang Z, Rolish ME, Yeo G, Tung V, Mawson M, et al. Systematic identification and analysis of exonic splicing silencers. Cell. 2004;119:831–845. doi: 10.1016/j.cell.2004.11.010. [DOI] [PubMed] [Google Scholar]
- Stadler MB, Shomron N, Yeo GW, Schneider A, Xiao X, et al. Inference of splicing regulatory activities by sequence neighborhood analysis. PLoS Genet. 2006;2:e191. doi: 10.1371/journal.pgen.0020191. doi: 10.1371/journal.pgen.0020191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang Z, Xiao X, Van Nostrand E, Burge CB. General and specific functions of exonic splicing silencers in splicing control. Mol Cell. 2006;23:61–70. doi: 10.1016/j.molcel.2006.05.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xing Y, Lee C. Alternative splicing and RNA selection pressure–evolutionary consequences for eukaryotic genomes. Nat Rev Genet. 2006;7:499–509. doi: 10.1038/nrg1896. [DOI] [PubMed] [Google Scholar]
- Sugnet CW, Srinivasan K, Clark TA, O'Brien G, Cline MS, et al. Unusual intron conservation near tissue-regulated exons found by splicing microarrays. PLoS Comput Biol. 2006;2:e4. doi: 10.1371/journal.pcbi.0020004. doi: 10.1371/journal.pcbi.0020004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Long D, Lee R, Williams P, Chan CY, Ambros V, et al. Potent effect of target structure on microRNA function. Nat Struct Mol Biol. 2007;14:287–294. doi: 10.1038/nsmb1226. [DOI] [PubMed] [Google Scholar]
- Meisner NC, Hackermuller J, Uhl V, Aszodi A, Jaritz M, et al. mRNA openers and closers: modulating AU-rich element-controlled mRNA stability by a molecular switch in mRNA secondary structure. Chembiochem. 2004;5:1432–1447. doi: 10.1002/cbic.200400219. [DOI] [PubMed] [Google Scholar]
- Honig A, Auboeuf D, Parker MM, O'Malley BW, Berget SM. Regulation of alternative splicing by the ATP-dependent DEAD-box RNA helicase p72. Mol Cell Biol. 2002;22:5698–5707. doi: 10.1128/MCB.22.16.5698-5707.2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pagani F, Raponi M, Baralle FE. Synonymous mutations in CFTR exon 12 affect splicing and are not neutral in evolution. Proc Natl Acad Sci U S A. 2005;102:6368–6372. doi: 10.1073/pnas.0502288102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pagani F, Baralle FE. Genomic variants in exons and introns: identifying the splicing spoilers. Nat Rev Genet. 2004;5:389–396. doi: 10.1038/nrg1327. [DOI] [PubMed] [Google Scholar]
- Kornblihtt AR, de la Mata M, Fededa JP, Munoz MJ, Nogues G. Multiple links between transcription and splicing. RNA. 2004;10:1489–1498. doi: 10.1261/rna.7100104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hiller M, Pudimat R, Busch A, Backofen R. Using RNA secondary structures to guide sequence motif finding towards single-stranded regions. Nucleic Acids Res. 2006;34:e117. doi: 10.1093/nar/gkl544. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith PJ, Zhang C, Wang J, Chew SL, Zhang MQ, et al. An increased specificity score matrix for the prediction of SF2/ASF-specific exonic splicing enhancers. Hum Mol Genet. 2006;15:2490–2508. doi: 10.1093/hmg/ddl171. [DOI] [PubMed] [Google Scholar]
- Hofacker IL. Vienna RNA secondary structure server. Nucleic Acids Res. 2003;31:3429–3431. doi: 10.1093/nar/gkg599. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stoss O, Stoilov P, Hartmann AM, Nayler O, Stamm S. The in vivo minigene approach to analyze tissue-specific splicing. Brain Res Brain Res Protoc. 1999;4:383–394. doi: 10.1016/s1385-299x(99)00043-4. [DOI] [PubMed] [Google Scholar]
- Yeo G, Burge CB. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol. 2004;11:377–394. doi: 10.1089/1066527041410418. [DOI] [PubMed] [Google Scholar]
- Tu M, Tong W, Perkins R, Valentine CR. Predicted changes in pre-mRNA secondary structure vary in their association with exon skipping for mutations in exons 2, 4, and 8 of the Hprt gene and exon 51 of the fibrillin gene. Mutat Res. 2000;432:15–32. doi: 10.1016/s1383-5726(99)00011-4. [DOI] [PubMed] [Google Scholar]
- Goedert M, Ghetti B, Spillantini MG. Tau gene mutations in frontotemporal dementia and parkinsonism linked to chromosome 17 (FTDP-17). Their relevance for understanding the neurogenerative process. Ann N Y Acad Sci. 2000;920:74–83. doi: 10.1111/j.1749-6632.2000.tb06907.x. [DOI] [PubMed] [Google Scholar]
- Chan RC, Black DL. The polypyrimidine tract binding protein binds upstream of neural cell-specific c-src exon N1 to repress the splicing of the intron downstream. Mol Cell Biol. 1997;17:4667–4676. doi: 10.1128/mcb.17.8.4667. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Selvakumar M, Helfman DM. Exonic splicing enhancers contribute to the use of both 3′ and 5′ splice site usage of rat beta-tropomyosin pre-mRNA. RNA. 1999;5:378–394. doi: 10.1017/s1355838299981050. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.