Nucleic Acids Research. 2001 Aug 1;29(15):3204–3211. doi: 10.1093/nar/29.15.3204

In vitro selection of exonic splicing enhancer sequences: identification of novel CD44 enhancers

Gertrud Woerfel 1, Albrecht Bindereif 1,a
PMCID: PMC55827  PMID: 11470878

Abstract

We have developed an in vitro selection procedure that allows the identification and isolation of functional splicing enhancer sequences from any cDNA. It is based on the enhancement of general splicing activity of a pre-mRNA reporter derived from the Drosophila dsx gene. Short DNase I fragments are cloned into a cassette in the second exon of the reporter construct, replacing the natural dsx enhancer. After splicing and reverse transcription–PCR, fragments are recovered from the mRNA product. Applying this selection to the CD44 gene, which undergoes extensive alternative splicing processes, we have identified several novel exonic enhancers. Two of them, which reside in CD44 variable exon 6, were further characterized by mutational analysis and confirmed to function within their natural CD44 context.

INTRODUCTION

Mammalian protein coding genes are often expressed by alternative splicing pathways, generating from a single primary transcript multiple mRNAs coding for functionally distinct proteins (for recent reviews see 1–3). Alternative splicing is usually regulated in a tissue-specific or developmental manner. However, first insights into its molecular basis have been obtained in only a few instances. Splicing enhancers have been discovered as regulatory sequences, often located in exons but also sometimes in introns, that mediate, through specific binding proteins, interactions between general components of the spliceosome. Splicing regulatory sequences are often composed of both positively and negatively acting elements (enhancers versus silencers; see for example 4,5).

As to the mode of action of exonic enhancers, the following general principles have emerged from many studies: short purine-rich or CA-rich elements located in a regulated exon are responsible for activating a weak upstream 3′ splice site. Intronic enhancers are often required for the inclusion of alternatively spliced, short exons. As shown in several cases, splicing enhancers are recognized by SR (serine/arginine-rich) and hnRNP proteins (reviewed in 6–8). These large protein families include both general splicing factors and gene-specific regulators.

Based on our current knowledge it is clear that splicing enhancers are widespread and play important and diverse roles in the regulation of alternative splicing. Furthermore, mutations affecting enhancer sequences have been implicated in human genetic diseases (reviewed in 9). SELEX (systematic evolution of ligands by exponential enrichment) methods have previously been used to isolate functional RNA species from large pools containing randomized sequences (reviewed in 10). This classical SELEX approach has also been applied to search for novel splicing enhancers, starting from completely degenerate sequences, and has resulted in the identification of several purine- and CA-rich sequence motifs (11–14). Here we report on a novel approach for the identification and isolation of functional enhancer sequences, which is based on naturally occurring sequences rather than completely degenerate ones, thus providing an alternative to the classical SELEX method. This in vitro selection allows the screening of any specific cDNA for exonic splicing enhancer elements. In principle, it can also be extended in vivo to isolate and investigate tissue-specific enhancer sequences (see Discussion). As a test model we have chosen a specific cDNA coding for CD44 protein.

CD44 is a family of membrane glycoproteins expressed on lymphocytes and functioning as a cell adhesion molecule with affinities for extracellular matrix components such as hyaluronic acid, collagen and fibronectin (reviewed in 15). The CD44 genomic structure is complex, comprising 20 exons, which, through alternative splicing of 10 variable exons, give rise to a structurally and functionally diverse range of CD44 protein isoforms differing in ligand binding specificity of their extracellular domain (16,17). In addition, a specific splice variant of CD44 is involved in development of the vertebrate limb (18). Further interest in these splice variants comes from the finding that some are associated with tumor metastasis and may be used in cancer diagnosis (reviewed in 15,19). A mutational analysis of one particular variable exon, v5, revealed a composite structure of enhancer and silencer elements regulated by signal transduction pathways (5).

MATERIALS AND METHODS

DNA oligonucleotides

The following oligonucleotides were used: E1, 5′-GAAAAATTCCGCTATCCTTG-3′; E2, 5′-CACATACGATTTAGGTGACAC-3′; E1-E2 spl, 5′-GGCGAATCGAAGAGGGCC-3′; E2-5′F, 5′-CCAATACGTTGTGAATGAG-3′; E2-5′R, 5′-GAATTCCCCGGGTCTAGAG-3′; T7-dsx, 5′-CGAAATTAATACGACTCACTATAG-3′; CD44-F9, 5′-CCCCTCATTCACCATGAGCATCAT-3′; CD44-R10, 5′-GAATGGGAGTCTTCTCTGGGTGTT-3′; CD44-IVS9, 5′-CAGCTATGGTCTGCTTAGTCC-3′; CD44/3, 5′-CTGGTAAGGAGCCATCAAC-3′; T7, 5′-TAATACGACTCACTATAGGGCG-3′; cst3, 5′-CTAGACCCGGGGAATTCGCTAGCA-3′; cst4, 5′-AGCTTGCTAGCGAATTCCCCGGGT-3′; CD44-E3, 5′-CCAACACCTCCCAGTATGAC-3′; CD44-E16, 5′-CTTGACTCCCATGTGAGTGTC-3′; CD44-F9A, 5′-CCATGAGCATCATGAGGAAG-3′; CD44-R10A, 5′-CGATATCCCTCATGCCATCT-3′.

Constructs

dsx-XH. Plasmid dsx-XH was constructed as a splicing enhancer reporter, starting from dsx-ASLV (20). The exonic ASLV enhancer was replaced by a SmaI–EcoRI–NheI linker (annealed primers cst3 and cst4), which was inserted between the XbaI and HindIII restriction sites (see Fig. 1). In addition, the EcoRI restriction site downstream of the T7 promoter was destroyed to ensure that the EcoRI site in the linker region was unique, resulting in pdsx-XH.

Figure 1.

In vitro selection of exonic splicing enhancer sequences. The cDNA fragments, which were cloned into the polylinker region in the second exon of the dsx-XH reporter construct, are derived either from restriction fragments or from size-selected DNase I fragments (represented by the crosshatched region). Primers used for (RT–)PCR amplifications are indicated by arrows. For details of the procedure, see Results and Materials and Methods.

dsx CD44-A and CD44-B constructs. Synthetic oligonucleotides corresponding to the sequences of the A and B regions were cloned into pdsx-XH between the XbaI and HindIII sites, resulting in pdsx CD44-A and CD44-B (Fig. 4). Similarly, substitution mutations were introduced, giving the pdsx CD44-A sub2 and sub4 mutant derivatives.

Figure 4.

Overview of in vitro selected putative enhancer sequences from a mouse CD44 cDNA containing constitutive exons 1–5 and variable exons v4–v10 (corresponding to exons 9–15). The selected sequences are schematically represented above by thick lines. The sequence of the variable exon 6 region, where most selected sequences concentrate, is shown in detail below. Regions A and B (indicated by boxes) are present in seven and three independent clones, respectively.

hCD44/E9-10. A CD44 minigene construct was made for transfection, which contains 58 bp of exon 9, the entire intron 9 and 112 bp of exon 10 (Fig. 6B). This fragment was generated by PCR from human genomic DNA, using primers CD44-F9 and CD44-R10. The amplified 2894 bp fragment was cloned into pcDNA3.1/V5-His TOPO TA (Invitrogen), downstream of the CMV promoter. The sub2 mutation was introduced by PCR, starting from two fragments: first, a fragment containing 58 nt of exon 9, intron 9, 40 nt of exon 10 and the mutation sub2 at the 3′-end; secondly, a fragment containing the sub2 mutation and the remaining 61 nt of exon 10. Both fragments were hybridized with each other and further amplified by overlap extension PCR with primers CD44-F9 and CD44-R10, followed by gel purification and cloning into pcDNA3.1/V5-His TOPO TA (Invitrogen), downstream of the CMV promoter. The mutation was confirmed by sequencing.
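
The segment lengths quoted for this construct can be cross-checked with simple bookkeeping. The sketch below is only an illustration: it assumes that the 2894 bp amplicon consists of nothing but the three stated segments, and it takes the 11 nt length of the sub2 substitution from the Results section. The function names are hypothetical.

```python
# Lengths (bp/nt) as stated in the construct description; SUB2_NT is taken
# from the Results section (the substituted 11mer in exon 10).
AMPLICON_BP = 2894
EXON9_BP = 58
EXON10_BP = 112
SUB2_NT = 11


def intron_length(amplicon: int, upstream_exon: int, downstream_exon: int) -> int:
    """Intron length, assuming the amplicon is exactly exon + intron + exon."""
    return amplicon - upstream_exon - downstream_exon


def exon_length_from_parts(before: int, substituted: int, after: int) -> int:
    """Exon length rebuilt from the two overlap-extension fragments."""
    return before + substituted + after


if __name__ == "__main__":
    # Intron 9 length implied by the stated amplicon size.
    print(intron_length(AMPLICON_BP, EXON9_BP, EXON10_BP))  # 2724
    # 40 nt upstream of sub2 + 11 nt sub2 + 61 nt downstream.
    print(exon_length_from_parts(40, SUB2_NT, 61))  # 112
```

Under these assumptions the implied intron 9 length of 2724 bp agrees with the intron length of 2.7 kb mentioned in the Results, and the two PCR fragments together account for all 112 bp of exon 10.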

Figure 6.

Splicing requirement of 22mer sequence in the CD44 context. (A) In vitro analysis. The minigene construct T7 hCD44/E9-10 is schematically shown, containing exons 9 and 10 of the human CD44 gene and intron 9 with the 1.95 kb EcoRI fragment deleted. In exon 10 the normal sequence of the putative splicing enhancer (WT) was also substituted (sub2). Splicing was analyzed after 0, 30, 60 and 90 min by RT–PCR, using primers specific for the two flanking exons (CD44-F9 and CD44-R10). The mobilities of the PCR products representing unspliced and spliced mRNAs are indicated on the left. M, 100 bp ladder marker. (B) In vivo analysis. The minigene construct hCD44 E9-E10 is schematically shown, containing exons 9 and 10 of the human CD44 gene and full-length intron 9. The sequence of the putative splicing enhancer in exon 10 was also substituted as in the in vitro construct (WT and sub2). For the splicing analysis wild-type (lanes 3, 6 and 9) and sub2 mutant constructs (lanes 4, 7 and 10) were transiently transfected into HEK 293 cells and RNA was prepared 24 h post-transfection and assayed by RT–PCR. Mock-transfected cells served as controls (lanes 2, 5 and 8). Spliced mRNA was detected by primer combination CD44-F9A and CD44-R10A (lanes 2–4), unspliced pre-mRNA by primer combination CD44-IVS9 and CD44-R10A (lanes 5–7). To test for endogenous CD44 mRNA, additional RT–PCR assays with exon 3- and 16-specific primers were done (right, lanes 8–10). The mobilities of the PCR products representing unspliced and spliced mRNAs are indicated on the right. M, 100 bp ladder marker.

T7 hCD44/E9-10. A CD44 minigene construct was made for in vitro splicing assays, containing the human CD44 exon 9–intron 9–exon 10 fragment, with the 1950 bp EcoRI fragment in intron 9 deleted (nt 507–2457 of intron 9) and cloned into pCR2.1-TOPO (Invitrogen) downstream of the T7 promoter, resulting in T7 hCD44/E9-10. The sub2 mutation was introduced into this construct, using the same procedure as described for hCD44/E9-10, but with T7 hCD44/E9-10 as starting construct. The mutation was also confirmed by sequencing.

CD44 pools

To search by in vitro selection for exonic CD44 sequences with splicing enhancer activity, two different CD44 pools were generated, using pdsx-XH as the reporter construct. First, the mouse cDNA fragment mCD44v in the pCMV-T7 vector (kind gift of P. Herrlich) containing exons E1–E5 and variable exons v4–v10 (corresponding to exons E9–E15) was PCR amplified with primers CD44/3 and T7 and cleaved to completion by AluI, HaeIII and RsaI. The fragments were cloned into the SmaI restriction site of pdsx-XH to generate T7-dsx/CD44 pool I. Secondly, the mCD44v cDNA was partially digested with DNase I in the presence of 10 mM MnCl2 to obtain randomly distributed DNA fragments. After gel purification of the fragments in the size range 50–100 bp and polishing of the ends with T4 DNA polymerase, phosphorylated EcoRI linker was ligated to the fragments, followed by cloning into the unique EcoRI site of pdsx-XH, generating T7-dsx/CD44 pool II. This procedure was repeated, resulting in two independent DNase I pools (II/1 and II/2).
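
To illustrate how pool I arises, the following sketch performs an in silico complete digest with the three blunt-cutting enzymes (AluI, AG^CT; HaeIII, GG^CC; RsaI, GT^AC) and adds a generic size filter of the kind applied to the DNase I fragments. It is a minimal sketch and not the authors' procedure: the input sequence is a made-up toy string, not CD44 sequence, and the function names are hypothetical.

```python
import re

# Recognition sites of the three enzymes used for pool I. Each cuts in the
# middle of its 4 bp palindromic site, leaving blunt ends.
SITES = {"AluI": "AGCT", "HaeIII": "GGCC", "RsaI": "GTAC"}


def digest(seq: str, sites: dict[str, str] = SITES) -> list[str]:
    """Return the fragments of a complete digest of a linear sequence."""
    seq = seq.upper()
    cuts = set()
    for site in sites.values():
        # Zero-width lookahead so that every occurrence is reported.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + len(site) // 2)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments: list[str], min_len: int, max_len: int) -> list[str]:
    """Keep fragments within an inclusive length window (e.g. 50-100 bp)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy = "AAAGCTTTGGCCAAGTACTT"  # hypothetical sequence with one site per enzyme
    print(digest(toy))
```

Because all three enzymes leave blunt ends, the resulting fragments are compatible with the blunt SmaI site of pdsx-XH without further end polishing.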

In vitro selection of splicing enhancers

To generate the template for subsequent transcription reactions the dsx-CD44 pools were PCR amplified with primers T7-dsx and E2 and converted into a pool of capped pre-mRNAs by T7 transcription. Aliquots of 100 ng of the pre-mRNA pool were spliced in HeLa nuclear extract under standard conditions (60 min at 30°C, 125 µl reactions; see 21). Products were analyzed by reverse transcription (AMV-RT) with primer E2, followed by PCR with either primers E1 and E2 (25 cycles at 53.5°C) to visualize both pre-mRNA and spliced mRNA or the splice junction-specific primer E1-E2 spl and E2 for selective amplification of the spliced mRNA. The latter PCR products should be enriched in splicing enhancer sequences and were cloned into pCR2.1-TOPO (Invitrogen). Single clones were used to reconstruct full-length splicing substrates by overlap extension PCR. First, the first exon fragment including the T7 promoter, the intron and the first 37 nt of exon 2 was PCR amplified from template pdsx-XH with primers T7-dsx and E2-5′R. Secondly, the second exon fragments enriched for enhancer sequences were obtained by PCR amplification with E2-5′F and E2, using the cloned RT–PCR products generated with primers E1-E2 spl and E2. Thirdly, both overlapping fragments were combined and further amplified with primers T7-dsx and E2, thereby reconstructing a T7-dsx/CD44 pre-mRNA. Individual constructs were T7 transcribed and characterized for splicing enhancement by in vitro splicing assays (10 ng pre-mRNA per 25 µl reaction under standard conditions; RT–PCR assays as described above). Constructs dsx-XH and dsx-ASLV served as splicing-negative and splicing-positive controls, respectively. Splicing-active constructs were sequenced. For all quantifications Image Quant 3.2 software (Molecular Dynamics) was used. Splicing efficiencies were calculated as the ratio of the intensity of the spliced product band to the sum of the intensities of the unspliced and spliced product bands.
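
The efficiency calculation described at the end of this section can be written out as a short function. This is a minimal sketch; the function name and the example intensities are hypothetical, and background subtraction or other normalization that the quantification software may have applied is not modeled.

```python
def splicing_efficiency(spliced: float, unspliced: float) -> float:
    """Fraction spliced = spliced / (spliced + unspliced) band intensity."""
    if spliced < 0 or unspliced < 0:
        raise ValueError("band intensities must be non-negative")
    total = spliced + unspliced
    if total == 0:
        raise ValueError("no signal in either band")
    return spliced / total


if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary units.
    efficiency = splicing_efficiency(spliced=30.0, unspliced=70.0)
    print(f"{efficiency:.0%}")
```

Note that the ratio is computed from RT–PCR band intensities, so it is a relative measure that is comparable between lanes assayed under the same amplification conditions.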

Splicing in vitro of T7 hCD44/E9-10 constructs

To characterize in vitro the activity of the region A sequence as a putative CD44 exonic enhancer in the CD44 context, minigene constructs T7 hCD44/E9-10 and the sub2 derivative were T7 transcribed and spliced in HeLa nuclear extract under standard conditions, followed by RT–PCR assays with primers CD44-F9 and CD44-R10. The PCR product corresponding to the spliced mRNA was cloned and sequenced to confirm use of the correct splice sites.

Transfection and RT–PCR analysis of splicing

To examine splicing of the minigene construct hCD44/E9-10 and mutant derivative sub2 in vivo, plasmids were transiently transfected into HEK293 cells. Total RNA was purified 24 h post-transfection, reverse transcription was done with primer CD44-R10, followed by PCR with primers CD44-F9A and CD44-R10A (spliced product, 30 cycles, 55°C) or CD44-IVS9 and CD44-R10A (pre-mRNA, 30 cycles, 55°C). To detect endogenous CD44 mRNA, reverse transcription was done with primer CD44-E16, followed by PCR with primers CD44-E3 and CD44-E16.

RESULTS

In vitro selection of exonic splicing enhancer elements from fragmented cDNAs

Here we describe an in vitro selection procedure for the isolation of functional splicing enhancer sequences, using fragmented cDNAs to generate a pool of exonic RNA fragments (Fig. 1). Specifically, we have chosen the mouse CD44 gene, which undergoes extensive alternative splicing, as a model system. Exonic CD44 sequences with splicing enhancer activity were selected from a mouse CD44 cDNA (mCD44v), which carries exons E1–E5 and variable exons v4–v10 (corresponding to E9–E15; see Fig. 4 for a schematic of the cDNA structure). We generated two different pools of short CD44 exon fragments: first (pool I), by digestion with three four-base cutting restriction enzymes (AluI, HaeIII and RsaI); secondly (pool II), by partial digestion with DNase I, followed by gel purification of the 50–100 bp fragments, which should yield a random collection of sequences representing the entire cDNA. Both pools were cloned into a cassette within the second exon of a splicing enhancer reporter construct, pdsx-XH, which is derived from Drosophila dsx exons 3 and 4. This cassette replaces the natural dsx enhancer in exon 4, with the result that the reporter pre-mRNA depends on insertion of a functional splicing enhancer (see Materials and Methods).

To search for novel enhancer elements, based on enhancement of in vitro splicing activity, these pools were T7 transcribed and spliced in vitro in HeLa nuclear extract. Splicing was monitored by RT–PCR analysis, using either exon-specific primers or an exon junction-specific primer that amplifies only spliced mRNAs (Fig. 2, left and right, respectively). This clearly showed that a significant portion of the pre-mRNA pool was spliced in vitro, in contrast to the pdsx-XH vector construct, which by itself was processed very inefficiently (see for example Fig. 3, lanes XH). To rule out any DNA contamination in the nuclear extract, PCR without prior reverse transcription was performed as a control (data not shown).

Figure 2.

Splicing in vitro of the T7-dsx/CD44 pre-mRNA pool. A pre-mRNA pool containing DNase I-generated fragments in the 50–100 bp size range within the second exon was processed in vitro as indicated for 0, 30, 60 and 90 min. Splicing was analyzed by RT–PCR, using exon-specific primers E1 and E2 (left) or a splice junction-specific primer (E1-E2 spl) in combination with exon 2-specific primer E2 (right) (for the location of these primers see Fig. 1). The mobilities of the PCR products representing unspliced and spliced mRNA are indicated. M, 100 bp ladder marker.

Figure 3.

Splicing in vitro of T7-dsx clones carrying putative enhancer sequences. Selected T7-dsx clones isolated from DNase I-generated pools (see Fig. 4 and Table 1 for sequences) were T7 transcribed and spliced in vitro for 0, 30 and 60 min. For comparison, pre-mRNAs T7-dsx-XH (negative control) and T7-dsx-ASLV (positive control) were spliced in vitro. The reactions were analyzed by RT–PCR, using exon-specific primers E1 and E2. The upper and lower bands represent unspliced and spliced mRNAs, respectively. M, 100 bp ladder marker.

The population of spliced mRNAs was expected to be enriched in functional enhancer sequences. To identify these putative enhancers, second exon fragments were cloned out of the mRNA pool after RT–PCR amplification and used individually to reconstruct the original dsx pre-mRNA context (see Fig. 1 and Materials and Methods). After T7 transcription T7-dsx/CD44 clones were then assayed for enhancement of in vitro splicing in comparison to the vector construct (pdsx-XH) and to a known strong enhancer (pdsx-ASLV). Once tested positively for splicing enhancement, clones were sequenced.

Initially we had carried out the reconstruction procedure with the entire pool of enriched mRNAs, resulting in a T7-dsx/CD44 DNA pool which, after T7 transcription, underwent a second and third round of in vitro selection. However, no higher enrichment was observed and we detected an increased number of clones carrying PCR artefacts (data not shown). Therefore, we routinely performed only single round selections.

We have found that after a single round of selection approximately half of the clones analyzed exhibited significant levels of splicing enhancement in comparison to the pdsx-XH control (Fig. 3). One-third of the selected clones contained sequences in the opposite orientation. These sequences were distributed over the entire CD44 sequence without any significant enrichment in specific regions (data not shown). The rest of the selected clones did not stimulate in vitro splicing. Figure 3 shows representative in vitro splicing assays of 10 independent and enhancer-positive T7-dsx clones, which carry CD44 cDNA fragments derived from pool II/2. In each case, in vitro splicing was assayed after 0, 30 and 60 min by RT–PCR with exon-specific primers. The range of splicing efficiencies of individual clones was between 13 (lanes sel2-4A) and 30% (lanes sel2-2D). The T7-dsx vector with non-specific linker sequence (T7-dsx-XH) served as a negative control to detect background splicing activity independent of an enhancer insertion (0% splicing, lanes XH); in contrast, the positive control pre-mRNA with the strong ASLV enhancer (T7-dsx-ASLV) resulted in strong enhancement (20% splicing, lanes ASLV).

Figure 4 gives a schematic overview of where on the CD44 cDNA the in vitro selected sequences map [see Table 1 for a list of all 27 sequences we have obtained from three independent in vitro selections (pools I, II/1 and II/2)]. Except for a single clone, 4B5 (139 bp), fragments were between 31 and 88 bp long, as expected, since DNase I or restriction fragments were size selected in the 50–100 bp range. Significantly, most sequences concentrate in the exon 11 (v6) region: seven overlapping clones out of the 27 map to this region and are derived from each of the three independent pools. Most fragments concentrate in the 5′-terminal half of exon 11 (v6), defining a 22 nt sequence (‘region A’). Furthermore, there is a region of 25 nt in the 3′-half of exon 11 (v6), which is represented in three different clones and which we have called region B (for a further mutational analysis of these regions, see below and Fig. 5). In addition, there is a short segment at the 5′-end of exon 9 (v4) represented by three selected clones, as well as an extended region of exons 10 (v5) and 12 (v7), each occurring in three clones (for a more detailed sequence analysis of the selected regions, see Discussion).

Table 1. Sequences of all selected clones derived from CD44 cDNA pool I (restriction fragment-generated) as well as pools II/1 and II/2 (DNase I-generated).

The purine-rich motifs (GGAGA/GGGGA/AGAGG) are boxed, the CA-rich motifs [(C)CACC(C)] are underlined [motifs according to Schaal and Maniatis (23)].

Figure 5.

Mutational analysis of the splicing enhancer in region A (22mer). Clones CD44-A and CD44-B, which contain region A and B (see Fig. 4), respectively, were cloned into the exon cassette of T7-dsx and spliced in vitro. Control reactions were done in parallel with T7-dsx-XH (negative control) and T7-dsx-ASLV (positive control). In addition, two mutant derivatives of clone CD44-A (sub2 and sub4) were assayed for splicing enhancement (sequences shown below with substituted nucleotides underlined). In each case, time points of 0, 30, 60 and 90 min were analyzed by RT–PCR, using exon-specific primers E1 and E2. The mobilities of the PCR products representing unspliced and spliced mRNAs are indicated on the right.

Mutational analysis of a novel splicing enhancer in CD44

First we determined whether regions A and B are sufficient for splicing enhancer activity (Fig. 5). Both sequences were cloned as synthetic double-stranded oligonucleotides into the cassette of the T7-dsx second exon, giving T7-dsx CD44-A and T7-dsx CD44-B, respectively. Pre-mRNAs were obtained by T7 transcription, spliced in vitro and analyzed by RT–PCR with exon-specific primers. As a negative control pdsx-XH pre-mRNA was tested in parallel and as a strongly positive control pdsx-ASLV (lanes XH, 15% splicing after 90 min; lanes ASLV, 74%). As Figure 5 clearly shows, both regions A and B enhanced splicing activity (lanes CD44-A and CD44-B). Splicing enhancement by the 22mer region A was consistently higher than the effect of the 25mer region B (40 and 31%, respectively, after 90 min). We conclude that both region A and B of CD44 exon 11 (v6) are sufficient to confer enhancer activity in the exonic context of the dsx gene.

Since region A from the 5′-half of exon 11 (v6) had been selected most often, we focused our further analysis on this sequence. To identify which part of the 22mer is important, two substitution derivatives were constructed: CD44-A sub2 and sub4, where the first 11 or 6 nt, respectively, were replaced by their reverse complements (see Fig. 5). As a result, splicing activity was reduced to background levels (lanes CD44-Asub2, 9% after 60 min, and CD44-Asub4, 15%, compared to lane CD44-A, 31%, and lane XH, 12%), demonstrating that the 5′-terminal six positions of the 22mer are necessary for splicing enhancement.

In the experiments described so far, the in vitro selection of splicing enhancer-active sequences and their initial characterization was performed in the context of the dsx gene, which strongly depends on an exonic splicing enhancer. Next we determined whether region A, which had shown highest activity, was also important for determining splicing efficiency in the natural context of the CD44 gene.

Region A in mouse CD44 exon 11 (v6) corresponds to human exon 10 (v6). Exon v6 has been considered a tumor marker. We wanted to determine whether this region also works as an exonic splicing enhancer in the human CD44 context. This could give some insight into the alteration of alternative splicing during tumor progression. Therefore, we constructed a human CD44 minigene under T7 control containing exons 9 and 10, as well as intron 9 lacking a 1.95 kb EcoRI fragment (T7 hCD44/E9-10; see schematic in Fig. 6A). Within human exon 10, the 11 nt sequence 5′-CCAGAAGGAAC-3′ (wild-type, WT), which differs from the mouse sequence at only three positions (see underlined nucleotides), represents the 5′-half of region A and was substituted by 5′-TCTCCTGCTGG-3′ (sub2, corresponding to the mutation introduced in the mouse sequence in the dsx context). The minigene constructs were T7 transcribed into pre-mRNAs of ∼1060 nt and spliced in vitro for 0, 30, 60 and 90 min, followed by RT–PCR analysis with the two exon-specific primers (Fig. 6A, see lanes WT and sub2). Clearly, the wild-type CD44 pre-mRNA was processed very efficiently, whereas the sub2 mutant showed no detectable activity. Use of the correct splice junction was confirmed by sequencing of the RT–PCR product (data not shown). We conclude that the 11mer sequence in exon 10 of human CD44 (5′-CCAGAAGGAAC-3′) is an essential element for splicing in vitro.
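
The relationship between the sub2 substitution and the wild-type sequences can be checked computationally. The sketch below rests on an inference rather than on a sequence given in this section: since sub2 is described as replacing the first 11 nt of the mouse region A by their reverse complement, the reverse complement of the sub2 sequence should give back the mouse 11mer. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatch_positions(a: str, b: str) -> list[int]:
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


HUMAN_WT = "CCAGAAGGAAC"  # human exon 10 wild-type 11mer (from the text)
SUB2 = "TCTCCTGCTGG"      # sub2 substitution (from the text)

if __name__ == "__main__":
    # Inferred mouse 11mer, assuming sub2 is its exact reverse complement.
    inferred_mouse = reverse_complement(SUB2)
    print(inferred_mouse)                                # CCAGCAGGAGA
    print(mismatch_positions(HUMAN_WT, inferred_mouse))  # [5, 10, 11]
```

Under this assumption the inferred mouse 11mer differs from the human sequence at three positions, in agreement with the text, and it contains the purine-rich GGAGA motif that is absent from the human 11mer.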

Next, we determined whether this putative enhancer element is also required in vivo, using construct hCD44/E9-10, which contains exons 9 and 10 with the full-length intron 9 under CMV control. Exon 10 carries either the wild-type (WT) or the mutant (sub2) sequence of the putative element, in analogy to the T7 constructs (see above and Fig. 6B). After transient transfection into HEK 293 cells (see Fig. 6B) or HeLa cells (data not shown), splicing was analyzed 24 h post-transfection by RT–PCR. The absence of DNA contamination was confirmed by performing PCR assays without prior reverse transcription (data not shown). Spliced mRNA (lanes 2–4, F9A/R10A primer pair) and unspliced pre-mRNA (lanes 5–7, IVS9/R10A primer pair) were detected in separate reactions because of the large intron length of 2.7 kb. In cells transfected with the wild-type construct we found correctly spliced mRNA (as confirmed by sequence analysis; data not shown) as well as unspliced pre-mRNA (lanes 3 and 6, respectively). In contrast, the levels of mRNA derived from the mutated minigene were strongly reduced (to ∼20% of wild-type levels) and the corresponding unspliced product was as abundant as wild-type (lanes 4 and 7, respectively). With mock-transfected cells weak bands were produced with either primer pair, indicative of mRNA and pre-mRNA, respectively (lanes 2 and 5). This is most likely due to endogenous CD44 transcripts since in another control, using exon 3 and exon 16 primers, PCR product was observed in mock-transfected, wild-type-transfected and sub2-transfected cells (see lanes 8–10; the size of the mRNA product corresponds to the constitutive CD44 form without variable exons). We conclude that, consistent with the in vitro analysis (see above), in vivo splicing of human CD44 exons 9 and 10 critically depends on an enhancer element in exon 10, which we had initially identified through in vitro selection.

DISCUSSION

Searching for splicing enhancers by in vitro selection

We have described a novel approach to identify splicing enhancers in any natural DNA sequence. Since our functional selection is based on enhancer function in an exonic context, it is best suited to screen cDNAs or cDNA fragments for exonic enhancers. This approach basically differs from the classical SELEX method, which starts with large pools of randomized sequences and which has also been applied to isolate splicing enhancers (see Introduction). Alternatively, small specific fragments have been generated from a certain cDNA sequence by restriction enzymes or from synthetic oligonucleotides (see for example 22).

Our approach starts with natural sequences and generates a random collection of fragments by DNase I digestion. This fragment pool of preselected size range is then inserted into a splicing reporter at an exonic position where it replaces a natural enhancer. Therefore, splicing enhancement in vitro provides the functional basis for selection. As a result, natural sequences that work as general enhancers are enriched and can subsequently be isolated by cloning.

Most known splicing enhancers are composed of purine-rich or CA-rich elements or are composites of the two. Comparing the 22 nt sequence of exon v6 to known elements, we note that it is only moderately purine rich (64%), but also contains a motif, GGAGA, which has been found in many other purine-rich enhancer elements and has been recovered from a random screen (23); in total, five GA or AG dinucleotides occur within the 22mer sequence. Analyzing all 27 candidate sequences from our selections (Table 1), the purine-rich motif appears 20 times (GGAGA/GGGGA/AGAGG) and another, CA-rich consensus motif, (C)CACC(C), appears 17 times. Looking at the distribution of these two motifs, we find that 15 of our 27 clones carry at least one purine-rich motif, 12 carry at least one CA-rich motif, and four of them carry both purine- and CA-rich sequences; only four clones cannot be grouped into either of these two classes and may contain new enhancer types.
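
The motif statistics above can be reproduced in principle with a simple scan. The following is a minimal sketch under stated assumptions: the counting convention (overlapping hits on the given strand only, CA-rich motif counted via its CACC core) is an assumption and may differ from the one used for Table 1, and the function names are hypothetical. The example input is the exon v5 enhancer quoted in the Discussion, since the Table 1 sequences are not reproduced here.

```python
import re

PURINE_MOTIFS = ("GGAGA", "GGGGA", "AGAGG")
CA_CORE = "CACC"  # core of the (C)CACC(C) consensus


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    return len(re.findall(f"(?={motif})", seq.upper()))


def purine_motif_hits(seq: str) -> int:
    """Total hits for the three purine-rich motifs."""
    return sum(count_overlapping(seq, m) for m in PURINE_MOTIFS)


def ca_motif_hits(seq: str) -> int:
    """Hits for the CA-rich motif, counted via its CACC core."""
    return count_overlapping(seq, CA_CORE)


def purine_fraction(seq: str) -> float:
    """Fraction of A and G residues in a non-empty sequence."""
    seq = seq.upper()
    return sum(base in "AG" for base in seq) / len(seq)


def unclassified(total: int, purine: int, ca: int, both: int) -> int:
    """Clones carrying neither motif class (inclusion-exclusion)."""
    return total - (purine + ca - both)


if __name__ == "__main__":
    v5_enhancer = "GAAGAGGAGA"  # exon v5 enhancer quoted in the Discussion
    print(purine_motif_hits(v5_enhancer), ca_motif_hits(v5_enhancer))
    print(unclassified(total=27, purine=15, ca=12, both=4))
```

With the clone counts given above (15 purine-rich, 12 CA-rich, four carrying both), inclusion-exclusion leaves four of the 27 clones unclassified, which matches the number stated in the text.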

Two other studies based on in vitro selection had previously identified consensus sequences for splicing enhancers that depend on specific SR proteins (12,13); our region A sequence contains a potential SF2/ASF enhancer (positions 3–9 of region A), as well as an SRp40 candidate element (positions 11–15).

Regarding CD44 alternative splicing, a purine-rich enhancer (5′-GAAGAGGAGA-3′) has recently been identified and analyzed in the variable exon v5 (5,24,25). We recovered this sequence in two clones, sel2-3H and sel2-2G (see Table 1), one carrying the entire element and the other part of it, confirming the validity of our selection.

The potential of this approach may be expanded in the future by introducing the following modifications. First, to isolate regulated and cell type-dependent enhancers, the in vitro selection could be carried out in extracts derived from different cell lines or specific tissues rather than in standard HeLa cell nuclear extract. To search for enhancers dependent on specific SR proteins, this procedure may also be performed in HeLa S100 extract combined with purified SR proteins (compare, for example, 12). Secondly, our in vitro selection could be modified to an in vivo procedure, by transfecting an enhancer-dependent splicing reporter carrying pools of cDNA fragments into the appropriate cells, which may contain additional necessary protein factors. This would provide another way to detect cell type-specific enhancers. Thirdly, instead of screening a specific cDNA, entire cDNA libraries or cDNA collections may be used to screen on a genome-wide level. Similarly, it might be interesting to use as starting material fragmented genomic DNA, as introduced by Gold and co-workers in their search for functional RNA aptamers (‘genomic SELEX’ strategy; for a review see 26).

In sum, the immediate use of the study presented here is that putative general enhancers can be easily isolated from a specific cDNA or exon sequence, based on an experimental approach. This is particularly relevant considering that not all enhancers can be reliably predicted based only on sequence analysis. In addition, it is likely that more classes of enhancer sequences exist than are currently known.

Implications for CD44 alternative splicing

We have identified a region within CD44 variable exon v6, which was represented most often in individual clones recovered from in vitro selections (see region A in Fig. 4). Subsequent mutational analysis confirmed this 22 nt region not only to be sufficient for general splicing enhancement, but also to function as a splicing regulatory element in the CD44 gene context (Figs 5 and 6).

What is the function of splicing enhancers in variable exon v6? Exon v6 is included only in some cell types or differentiation stages and only in certain pathological circumstances (see for example 27). Therefore, the inclusion of this exon has to be precisely regulated. The association of abnormal isoform patterns of CD44 with tumor progression may reflect the loss of this regulation. In particular, expression of CD44 v6 appears to be a marker for this process. Exonic enhancers as well as SR proteins are likely candidates for mediating regulation of exon v6 recognition. Region A as a strong exonic splicing enhancer element may be involved in recognition by SR proteins, thereby determining inclusion or skipping of v6 during tumor progression.

Alternatively, or in addition, enhancers may here play a role in general splicing enhancement. Consistent with such a requirement, the 3′ splice site preceding human exon 10 (v6) appears to be weak due to a relatively short polypyrimidine tract (accession no. AL133330). Interestingly, there are a number of reports demonstrating that intron 9, which precedes exon v6, can be retained in some cancer types (see for example 28).
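As a rough illustration of what a "relatively short polypyrimidine tract" means, one can measure the pyrimidine content and the longest uninterrupted pyrimidine run in the last nucleotides of an intron. This is a crude heuristic sketch, not the authors' assessment: the 20 nt window is an arbitrary assumption, and both example sequences are invented rather than taken from accession no. AL133330.

```python
def upstream_tract(intron_3prime_end, window=20):
    """Return the last `window` nt of the intron, excluding a terminal AG."""
    seq = intron_3prime_end.upper().replace("U", "T")
    if seq.endswith("AG"):
        seq = seq[:-2]  # drop the AG dinucleotide at the 3' splice site
    tract = seq[-window:]
    if not tract:
        raise ValueError("sequence too short")
    return tract


def pyrimidine_fraction(intron_3prime_end, window=20):
    """Fraction of C/T residues in the window upstream of the 3' splice site."""
    tract = upstream_tract(intron_3prime_end, window)
    return sum(base in "CT" for base in tract) / len(tract)


def longest_pyrimidine_run(intron_3prime_end, window=20):
    """Length of the longest uninterrupted C/T run in the same window."""
    best = run = 0
    for base in upstream_tract(intron_3prime_end, window):
        run = run + 1 if base in "CT" else 0
        best = max(best, run)
    return best


# Invented example sequences (assumptions for illustration only).
strong = "TTTCTTTTCCTTTTTCTCTTCAG"   # long, uninterrupted tract
weak = "AGGATTCAGATTTCTAAGAAACAG"    # short, interrupted tract

print(pyrimidine_fraction(strong), longest_pyrimidine_run(strong))
print(pyrimidine_fraction(weak), longest_pyrimidine_run(weak))
```

Such counts only hint at splice site strength; they do not replace the functional splicing assays described in the text.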

Acknowledgments

We thank Kunio Inoue and Tillmann Achsel for dsx-derived enhancer plasmids and Patrice Wolff for constructing one of the initial dsx reporters. We are grateful to Peter Herrlich for providing the mouse CD44 cDNA and we thank members of our group for discussions. This work was supported by the Bundesministerium für Bildung und Forschung (BMBF German Human Genome Project).

References

  • 1. Black D.L. (2000) Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell, 103, 367–370.
  • 2. Chabot B. (1996) Directing alternative splicing: cast and scenarios. Trends Genet., 12, 472–478.
  • 3. Lopez A.J. (1998) Alternative splicing of pre-mRNA: developmental consequences and mechanisms of regulation. Annu. Rev. Genet., 32, 279–305.
  • 4. Kan J.L. and Green,M.R. (1999) Pre-mRNA splicing of IgM exons M1 and M2 is directed by a juxtaposed splicing enhancer and inhibitor. Genes Dev., 13, 462–471.
  • 5. König H., Ponta,H. and Herrlich,P. (1998) Coupling of signal transduction to alternative pre-mRNA splicing by a composite splice regulator. EMBO J., 17, 2904–2913.
  • 6. Graveley B.R. (2000) Sorting out the complexity of SR protein functions. RNA, 6, 1197–1211.
  • 7. Smith C.W. and Valcarcel,J. (2000) Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem. Sci., 25, 381–388.
  • 8. Tacke R. and Manley,J.L. (1999) Determinants of SR protein specificity. Curr. Opin. Cell Biol., 11, 358–362.
  • 9. Blencowe B.J. (2000) Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends Biochem. Sci., 25, 106–110.
  • 10. Szostak J.W. and Ellington,A.D. (1993) In vitro selection of functional RNA sequences. In Gesteland,R.F. and Atkins,J.F. (eds), The RNA World. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 497–533.
  • 11. Coulter L.R., Landree,M.A. and Cooper,T.A. (1997) Identification of a new class of exonic splicing enhancers by in vivo selection. Mol. Cell. Biol., 17, 2143–2150.
  • 12. Liu H.X., Zhang,M. and Krainer,A.R. (1998) Identification of functional exonic splicing enhancer motifs recognized by individual SR proteins. Genes Dev., 12, 1998–2012.
  • 13. Liu H.X., Chew,S.L., Cartegni,L., Zhang,M.Q. and Krainer,A.R. (2000) Exonic splicing enhancer motif recognized by human SC35 under splicing conditions. Mol. Cell. Biol., 20, 1063–1071.
  • 14. Tian H. and Kole,R. (1995) Selection of novel exon recognition elements from a pool of random sequences. Mol. Cell. Biol., 15, 6291–6298.
  • 15. Ponta H., Wainwright,D. and Herrlich,P. (1998) The CD44 protein family. Int. J. Biochem. Cell Biol., 30, 299–305.
  • 16. Günthert U., Hofmann,M., Rudy,W., Reber,S., Zoller,M., Haussmann,I., Matzku,S., Wenzel,A., Ponta,H. and Herrlich,P. (1991) A new variant of glycoprotein CD44 confers metastatic potential to rat carcinoma cells. Cell, 65, 13–24.
  • 17. Screaton G.R., Bell,M.V., Jackson,D.G., Cornelis,F.B., Gerth,U. and Bell,J.I. (1992) Genomic structure of DNA encoding the lymphocyte homing receptor CD44 reveals at least 12 alternatively spliced exons. Proc. Natl Acad. Sci. USA, 89, 12160–12164.
  • 18. Sherman L., Wainwright,D., Ponta,H. and Herrlich,P. (1998) A splice variant of CD44 expressed in the apical ectodermal ridge presents fibroblast growth factors to limb mesenchyme and is required for limb outgrowth. Genes Dev., 12, 1058–1071.
  • 19. Herrlich P., Zoller,M., Pals,S.T. and Ponta,H. (1993) CD44 splice variants: metastases meet lymphocytes. Immunol. Today, 14, 395–399.
  • 20. Tanaka K., Watakabe,A. and Shimura,Y. (1994) Polypurine sequences within a downstream exon function as a splicing enhancer. Mol. Cell. Biol., 14, 1347–1354.
  • 21. Bindereif A. and Green,M.R. (1987) An ordered pathway of snRNP binding during mammalian pre-mRNA splicing complex assembly. EMBO J., 6, 2415–2424.
  • 22. Schaal T.D. and Maniatis,T. (1999) Multiple distinct splicing enhancers in the protein-coding sequences of a constitutively spliced pre-mRNA. Mol. Cell. Biol., 19, 261–273.
  • 23. Schaal T.D. and Maniatis,T. (1999) Selection and characterization of pre-mRNA splicing enhancers: identification of novel SR protein-specific enhancer sequences. Mol. Cell. Biol., 19, 1705–1719.
  • 24. Matter N., Marx,M., Weg-Remers,S., Ponta,H., Herrlich,P. and König,H. (2000) Heterogeneous ribonucleoprotein A1 is part of an exon-specific splice-silencing complex controlled by oncogenic signaling pathways. J. Biol. Chem., 275, 35353–35360.
  • 25. Stoss O., Olbrich,M., Hartmann,A.M., König,H., Memmott,J., Andreadis,A. and Stamm,S. (2001) The STAR/GSG family protein rSLM-2 regulates the selection of alternative splice sites. J. Biol. Chem., 276, 8665–8673.
  • 26. Gold L., Brown,D., He,Y.-Y., Shtatland,T., Singer,B.S. and Wu,Y. (1997) From oligonucleotide shapes to genomic SELEX: novel biological regulatory loops. Proc. Natl Acad. Sci. USA, 94, 59–64.
  • 27. Legras S., Günthert,U., Stauder,R., Curt,F., Oliferenko,S., Kluin-Nelemans,H.C., Marie,J.P., Proctor,S., Jasmin,C. and Smadja-Joffe,F. (1998) A strong expression of CD44-6v correlates with shorter survival of patients with acute myeloid leukemia. Blood, 91, 3401–3413.
  • 28. Yoshida K., Bolodeoku,J., Sugino,T., Goodison,S., Matsumura,Y., Warren,B.F., Toge,T., Tahara,E. and Tarin,D. (1995) Abnormal retention of intron 9 in CD44 gene transcripts in human gastrointestinal tumors. Cancer Res., 55, 4273–4277.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
