Abstract
In Caenorhabditis elegans, polycistronic pre-mRNAs are processed by cleavage and polyadenylation at the 3′ ends of the upstream genes and trans splicing, generally to the specialized spliced leader SL2, at the 5′ ends of the downstream genes. Previous studies have indicated a relationship between these two events in the processing of a heat shock-induced gpd-2–gpd-3 polycistronic pre-mRNA. Here, we report mutational analysis of the intercistronic region of this operon by linker scan analysis. Surprisingly, no sequences downstream of the 3′ end were important for 3′-end formation. In contrast, a U-rich (Ur) element located 29 bp downstream of the site of 3′-end formation was shown to be important for downstream mRNA biosynthesis. This ∼20-bp element is sufficient for SL2 trans splicing and mRNA accumulation when transplanted to a heterologous context. Furthermore, when the downstream gene was replaced by a gene from another organism, no loss of trans-splicing specificity was observed, suggesting that the Ur element may be the primary signal required for downstream mRNA processing.
Two characteristic features of Caenorhabditis elegans make it a unique model system among eukaryotes for studying RNA processing. First, approximately 70% of the genes in C. elegans undergo trans splicing during processing of the pre-mRNA. trans splicing involves the transfer of a 22-nucleotide (nt) spliced leader (SL) sequence from the SL snRNP to the 5′ ends of the mRNAs (2). The majority of trans splicing utilizes SL1 RNA and most SL1 trans splicing occurs near the 5′ ends of pre-mRNAs that begin with an outron, an AU-rich intron-like sequence containing a functional 3′ splice site (UUUUCAG/R) but lacking a 5′ splice site (5–7). Second, many C. elegans genes are arranged in operons (16, 21). These genes are found in closely linked gene clusters that are cotranscribed to produce polycistronic pre-mRNAs. Processing of these polycistronic precursors into mature monocistronic transcripts involves a combination of cleavage and polyadenylation at the 3′ end of the upstream mRNA and trans splicing at the 5′ end of the downstream mRNA. A second type of SL snRNP, called SL2, is used exclusively for trans splicing to the downstream genes in these polycistronic transcripts (16, 21), although mRNAs from some downstream genes in operons are trans spliced to both SL1 and SL2 (2). Since the discovery of operons and SL2 trans splicing in C. elegans, they have been found in other nematodes, including Caenorhabditis briggsae (13) and Dolichorhabditis (9). We do not know how widespread operons are in eukaryotes, although polycistronic transcripts have been identified in a variety of organisms, including Drosophila melanogaster and mammals (1).
Although the general splicing machinery is conserved in C. elegans (2), the existence of operons and trans splicing suggests there could be some machinery specific for them. Part of the trans splicing machinery, the SL snRNP, has been analyzed in Ascaris, another nematode (8). However, we know little about the unique machinery involved in operon processing and trans splicing in C. elegans. Since the two genes in an operon are separated by only 100 to 400 bp, it is possible that 3′-end formation at the upstream gene and trans splicing at the 5′ end of the downstream gene are functionally coupled. Indeed, our laboratory previously showed that mutation of the AAUAAA, the putative cleavage and polyadenylation specificity factor (CPSF) binding site of the upstream gene required for 3′-end formation, resulted in a reduction of the percentage of SL2 trans splicing to the trans splice site 110 nt downstream (12). In this case, however, even though 3′-end formation was completely prevented, some SL2 trans splicing still occurred. Thus, there must be additional signals that specify the use of SL2.
There are three possible sources that could contain such cis elements: the upstream gene, the intercistronic sequence, and the downstream gene. The only element in the intercistronic region found in all operons is the trans splice site, but this sequence is not different from the general 3′ splice site consensus (2), so it is not a candidate for an SL2-specific signal. Here we show that sequences in the downstream gene also play no obligatory role in SL2 trans splicing. In addition, we have performed a linker scan analysis of the intercistronic region. This analysis revealed a short U-rich element, located more than 50 bp upstream of the trans splice site, that is required for SL2 trans splicing. In contrast, we found no sequences in the intercistronic region required for 3′-end formation of the upstream gene.
MATERIALS AND METHODS
Worm strains and RNA preparation.
Maintenance and growth of worms were as described by Brenner (3) and Sulston and Hodgkin (17). Transgenic worm strains carrying extrachromosomal arrays were generated as described previously (15, 16). Worms were heat shocked at 30°C for 1 to 2 h on floating petri plates spread with bacteria. Total RNA was prepared from heat-shocked or non-heat-shocked mixed-stage populations of transgenic worms (5). The RNA was treated with DNase I prior to analysis.
Plasmid construction.
The plasmid WT, containing the wild-type operon, was constructed by deleting 75 bp of the plasmid HS1496 (16) upstream of the heat shock promoter to make the SalI site in gpd-3 unique. The linker scan mutations (LS1 to LS9) were made from WT by replacing 10-bp sections of the wild-type intercistronic region with linker GCTTAATTAA via recombinant PCR (11). The primers at the ends were gpd2CLus, 5′-CAACAGAGTTGTTGATCTCATCTCG-3′, and vit6pr2, 5′-AACTTGTCGCACTTCTGGTC-3′. The following pairs of primers were used to introduce the mutations: LS1-3 (5′-ATTCATTAATTAAGCTAAATATACAACCTTTATTG-3′) and LS1-5 (5′-ATTTAGCTTAATTAATGAATCTGCCATTTCCTCGT-3′), LS2-3 (5′-GAAATTTAATTAAGCATAAGATGAATAAATATACA-3′) and LS2-5 (5′-CTTATGCTTAATTAAATTTCCTCGTTTTTGCGAGT-3′), LS3-3 (5′-CAAAATTAATTAAGCGGCAGATTCAATAAGATGAA-3′) and LS3-5 (5′-CTGCCGCTTAATTAATTTTGCGAGTTTATATACCT-3′), LS4-3 (5′-TATAATTAATTAAGCACGAGGAAATGGCAGATTCA-3′) and LS4-5 (5′-CTCGTGCTTAATTAATTATATACCTTCCAATTTTC-3′), LS5-3 (5′-TTGGATTAATTAAGCACTCGCAAAAACGAGGAAAT-3′) and LS5-5 (5′-CGAGTGCTTAATTAATCCAATTTTCTTTCTATTGT-3′), LS6-3 (5′-AGAAATTAATTAAGCAGGTATATAAACTCGCAAAAAC-3′) and LS6-5 (5′-TACCTGCTTAATTAATTTCTATTGTATTTTCAACT-3′), LS7-3 (5′-AAAATTTAATTAAGCGAAAATTGGAAGGTATATAAAC-3′) and LS7-5 (5′-TTTTCGCTTAATTAAATTTTCAACTTCTAATTTTAATTC-3′), LS8-3 (5′-TTAGATTAATTAAGCACAATAGAAAGAAAATTGGA-3′) and LS8-5 (5′-ATTGTGCTTAATTAATCTAATTTTAATTCAGGGAA-3′), and LS9-3 (5′-TGAATTTAATTAAGCAGTTGAAAATACAATAGAAAG-3′) and LS9-5 (5′-CAACTGCTTAATTAAATTCAGGGAAACTGCTTCAA-3′). The last eight bases of the linker are the recognition site for PacI. This enzyme was used to facilitate cloning and ensure correct identification of the different LS mutations.
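The linker scan design described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' procedure: the function names (`linker_scan`, `reverse_complement`, `overlap`) and the placeholder region are hypothetical, whereas the linker and the LS1 primer sequences are copied from the text.

```python
LINKER = "GCTTAATTAA"    # 10-bp linker from the text
PACI_SITE = "TTAATTAA"   # PacI recognition site (last eight bases of the linker)

# LS1 mutagenic primers as listed in the text
LS1_3 = "ATTCATTAATTAAGCTAAATATACAACCTTTATTG"
LS1_5 = "ATTTAGCTTAATTAATGAATCTGCCATTTCCTCGT"


def linker_scan(region, linker=LINKER):
    """Return one mutant per consecutive, non-overlapping window of len(linker) bases."""
    width = len(linker)
    return [
        region[:i] + linker + region[i + width:]
        for i in range(0, len(region) - width + 1, width)
    ]


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overlap(a, b):
    """Longest suffix of a that is also a prefix of b ('' if none)."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return b[:n]
    return ""


# Hypothetical 30-bp placeholder region (NOT the real intercistronic sequence)
mutants = linker_scan("N" * 30)

# The two LS1 primers should overlap across the linker for recombinant PCR
shared = overlap(reverse_complement(LS1_3), LS1_5)
```

Each mutant keeps the original length and carries a PacI site, and the overlap shared by the two LS1 primers spans the linker, which is what allows the two half-products to anneal in the second PCR round.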
Mutations LS10, LS11, LS21, LS22, LS23, LS24, LS25, and LS26 were made from WT by recombinant PCR using the following pairs of primers: LS10-3 (5′-CAAAAACAATTAAGCGGCAGATTCAATAAGATGAA-3′) and LS10-5 (5′-CTGCCGCTTAATTGTTTTTGCGAGTTTATATACCT-3′), LS11-3 (5′-TTGGATTAATTATAAACTCGCAAAAACGAGGAAAT-3′) and LS11-5 (5′-CGAGTTTATAATTAATCCAATTTTCTTTCTATTGT-3′), LS21-3 (5′-CGCAAAAAGGAGGAAATGGCAGATTC-3′) and LS21-5 (5′-CATTTCCTCCTTTTTGCGAGTTTATATACC-3′), LS22-3 (5′-TATAAACTCTGTGAAACGAGGAAATGGCAG-3′) and LS22-5 (5′-TCGTTTCACAGAGTTTATATACCTTCCAA-3′), LS23-3 (5′-GTATATAATGTGGCAAAAACGAGGAAATGG-3′) and LS23-5 (5′-GTTTTTGCCACATTATATACCTTCCAATTTTC-3′), LS24-3 (5′-AAGGTATTGTGACTCGCAAAAACGAGGAA-3′) and LS24-5 (5′-TGCGAGTCACAATACCTTCCAATTTTCTTTC-3′), LS25-3 (5′-TTGGAAGTGTGATAAACTCGCAAAAACGAG-3′) and LS25-5 (5′-GAGTTTTACACAACTTCCAATTTTCTTTCTATTG-3′), and LS26-3 (5′-AAATTGGATTGTATATAAACTCGCAAAAC-3′) and LS26-5 (5′-GAGTTTATATACAATCCAATTTTCTTTCTAATTGTA-3′).
Plasmids ΔUr and ΔUr/ΔpA, which were derived from WT and ΔpA (12), respectively, were constructed using an oligonucleotide-directed in vitro mutagenesis kit (Amersham). Plasmids ΔpA/LS4, ΔpA/LS21, and ΔpA/LS26 were derived from ΔpA by recombinant PCR using the corresponding oligonucleotides described above.
The plasmid SUF was constructed as follows: the intercistronic region of WT was deleted by replacing the PacI/NcoI fragment of LS1 with the corresponding fragment from LS9 to make pGPDPacI. To replace the intercistronic region with an unrelated sequence, two oligonucleotides, UrSUFUS (5′-GCTTAATTAATGTTTAAACTTCATCGATGTTTTTGCGAGTTTATATACCTATCG-3′) and UrSUFDS (5′-GAATTAATTAAGAGTTTAAACAAGAGAAAGATCTATAAATCGATAGGTATATAAACTCG-3′) were added to a PCR mixture containing 1× PCR buffer (Sigma), 0.2 mM deoxynucleoside triphosphates, and 2.5 U of Taq polymerase (Gibco BRL). The product was denatured at 92°C for 2 min, followed by PCR as follows: 92°C for 1 min, 45°C for 1 min, and 72°C for 1 min. After 30 cycles, the products were extended at 72°C for 7 min. The PCR product was cloned into pGEM-T easy vector (Promega) to make pTIC according to the instructions from the manufacturer. The structure of the resulting plasmid was confirmed by sequencing, and the intercistronic region was released with PacI and cloned into the PacI site of pGPDPacI to make SUF. SUFR was made by reversing the U-rich (Ur) sequence in SUF by digestion and religation at the two ClaI sites flanking the Ur element.
Plasmids pHSWTGFP and pHSΔUrGFP were constructed using recombinant PCR to replace the HpaI/SalI fragment containing the gpd-3 gene present in the WT plasmid with the gene for the green fluorescent protein (GFP) from pPD94.81 (kindly provided by A. Fire). In pHSWTGFP the intercistronic region contains the wild-type Ur region, whereas in the pHSΔUrGFP plasmid it contains the ΔUr mutation. Plasmids used as templates for amplifying the intercistronic region were either WT or ΔUr. Primers used to amplify the intercistronic region between the gpd-2 and gpd-3 genes included gpd2Clus and ICNLS-Apa I (5′-GTACCCTCCAAGGGCCCTCCTGAATTAAAATTAGAAG-3′). The ICNLS-Apa I primer introduces an ApaI site adjacent to the trans splice site. The gfp gene containing the simian virus 40 nuclear localization signal and five introns was amplified using the plasmid pPD94.81 as template. Primers used to amplify the GFP gene included NLS-Apa I (5′-GAGGGGCCTTGGAGGGTACCGAGC-3′) and IVS-Sal I (5′-CGATTATATGTCGACTGAAAATTTAAATATTAC-3′). Amplified fragments obtained after the recombinant PCR were digested with HpaI and SalI and cloned into WT plasmid digested with HpaI and SalI to create either pHSWTGFP or pHSΔUrGFP. All plasmids were sequenced and checked by restriction enzyme analysis.
RNase protection analysis (RPA) and primer extension analysis.
RNase protection assays were carried out as described previously (12). The RNA and probe were heated for 5 min at 65°C and hybridized at 45°C for 16 h. Primer extension in the presence of dideoxynucleotide was done as described previously (12). The mixtures of RNA and primer were heated at 65°C for 5 min before hybridization, and reverse transcription was carried out in the presence of 5% acetamide. Quantitation was performed on a Molecular Dynamics PhosphorImager using ImageQuant software. All constructs were assayed in at least two independent transgenic strains, most in three or four strains. Multiple assays of each strain were averaged, and error bars were calculated from all assays of each construct. In general, variations between strains carrying the same construct were equivalent to variations from multiple assays of the same strain.
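The averaging scheme described above (per-strain averages, with error bars drawn from all assays of a construct) can be sketched as follows. This is a minimal sketch under stated assumptions: the text does not say whether the error bars are standard deviations or standard errors, so the sample standard deviation is assumed here, and the function name, strain names, and values are hypothetical.

```python
from statistics import mean, stdev


def summarize_construct(assays_by_strain):
    """Summarize quantitated assays for one construct.

    assays_by_strain maps a strain name to a list of measurements
    (for example, ratios of gpd-3 5'-end product to gpd-2 3'-end product).
    Returns (per-strain means, pooled mean, pooled sample standard deviation).
    """
    strain_means = {s: mean(vals) for s, vals in assays_by_strain.items()}
    pooled = [v for vals in assays_by_strain.values() for v in vals]
    # Assumption: error bars are the sample SD over all assays of the construct
    return strain_means, mean(pooled), stdev(pooled)


# Hypothetical measurements for two independent transgenic strains
example = {"strain_A": [0.4, 0.5], "strain_B": [0.3, 0.4]}
strain_means, pooled_mean, pooled_sd = summarize_construct(example)
```

Pooling all assays before computing the spread treats strain-to-strain and assay-to-assay variation alike, in keeping with the observation that the two were generally equivalent.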
RESULTS
Linker scan analysis of the intercistronic region between the gpd-2 and gpd-3 genes.
Spieth et al. (16) described an in vivo system to study operon processing using a construct, HS1496, derived from a three-gene operon (mai-1–gpd-2–gpd-3). This synthetic operon contains the gpd-2 and the gpd-3 genes under the control of the hsp-16-41 heat shock promoter. Using this construct in transgenic worms, Kuersten et al. (12) showed that the gpd-2 AAUAAA required for 3′-end formation is also required for correct SL2-specific trans splicing of gpd-3 but that the gpd-3 trans splice site is not important for gpd-2 3′-end formation. However, the intercistronic sequence, only about 100 bp long, has never been tested for signals required for either 3′-end formation or trans-splicing specificity. Comparison of intercistronic regions of different operons does not reveal the presence of conserved sequences (T. Blumenthal, unpublished observation). On the other hand, comparison of the intercistronic regions from 15 operons that have been sequenced in both C. elegans and C. briggsae does in each case reveal a sequence, present in similar locations, that is conserved (Table 1). In contrast, most of the remaining intercistronic sequence is highly diverged between the two closely related species (unpublished observations). The conserved sequences from these different operons do not resemble each other in any way other than being rich in uridylate.
TABLE 1.
Sequences conserved between C. elegans and C. briggsae intercistronic regionsᵃ
| Gene pair | Position | Sequence |
|---|---|---|
| mai-1, gpd-2 | 46 (127) | AUUCGAUUuuCAuUUCGAUC |
| | 46 (131) | AUUCGAUUcaCAcUUCGAUC |
| gpd-2, gpd-3 | 35 (110) | UUUC-CUCGUUUuU-gCGAGuuUaUaUACCUU |
| | 39 (127) | UUUCaCUCGUUUgUcuCGAGaaUuUuUACCUU |
| cyp-9, pdi-1 | 27 (112) | UUUUUUGUUucguuGAggCUUCUGUG |
| | 29 (100) | UUUUUUGUU-----GAcuCUUCUGUG |
| ppp-1, tra-2 | 28 (125) | UUUucgUUUUCcUCuAUcU |
| | 29 (97) | UUUgauUUUUC-UCaAUuU |
| t26a5.3, t26a5.2 | 47 (111) | UUUUCgC-cAAAUUuUC |
| | 44 (106) | UUUUCcCuuAAAUUcUC |
| lir-2, lir-1 | 36 (119) | UUUUCuuAUUUAAAUAGU |
| | 14 (110) | UUUUC-aAUUUAAAUAGU |
| lir-1, lin-26 | 48 (130) | AAuGAGUUAaUacaACUuGAAUA-CUU |
| | 42 (136) | AAcGAGUUAcUuuuACUCUgGAAUAuCUU |
| rpl-29, rpp-1 | 27 (121) | UGUUGUUGAAUucuaUUcACAUGGUC |
| | 26 (136) | UGUUGUUGAAUaucuUUuACAUGGUC |
| r07h5.2, r07h5.1 | 33 (151) | UUuGAUU---AUUCAUUGaUG |
| | 32 (122) | UUcGAUUacaAUUCAUUGgUG |
| c18a3.3, c18a3.2 | 36 (107) | AUCGuUUUuUAAUCGAUC |
| | 43 (110) | AUCG-UUUcUAAUCGAUC |
| zk632.7, zk632.8 | 22 (130) | gUUuUUAUcuuG |
| | 31 (113) | uUUcUUAUuagG |
| f25g6.9, f25g6.8 | 17 (110) | AUUUUuAUUUaUUUU-GAU-UUUU |
| | 17 (112) | AUUUUaAUUUcUUUUuGAUaUUUU |
| w06d4.1, c31h5.5 | 44 (126) | AGUUuAacggUUUCUUUU |
| | 27 (124) | AGUUaAuacuUUUCUUUU |
| c18a3.2, c18a3.1 | 24 (110) | UUCAGuauGUCUACAG |
| | 38 (98) | UUCAGc--GUCUACAG |
| f18a1.7, f18a1.6 | 138 (290) | AAAACaGCUGAcaagUUCAaUUCAUCGAU |
| | 141 (325) | AAAAC-GCUGA----UUCA-UUCAUCGAU |
ᵃ C. elegans sequences are presented above C. briggsae sequences. Numbers in the “Position” column indicate the distance between the AAUAAA sequence and the putative Ur element, followed in parentheses by the distance between the AAUAAA sequence and the trans splice site. - indicates a gap introduced to facilitate alignment. U's are in boldface; identities are in uppercase letters.
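The uridylate richness and the conservation marked in Table 1 can be checked with a short calculation. The following is an illustrative sketch only: the helper names `u_fraction` and `conserved_columns` are hypothetical, and the two sequences are the gpd-2, gpd-3 rows copied from the table.

```python
def u_fraction(seq):
    """Fraction of non-gap positions that are U (case-insensitive)."""
    bases = [b for b in seq.upper() if b != "-"]
    return sum(b == "U" for b in bases) / len(bases)


def conserved_columns(elegans, briggsae):
    """Number of aligned columns where both rows carry the same base.

    Gap characters never count as conserved. Both rows must have the
    same aligned length, as they do in Table 1.
    """
    if len(elegans) != len(briggsae):
        raise ValueError("aligned rows must have equal length")
    return sum(
        a == b and a != "-"
        for a, b in zip(elegans.upper(), briggsae.upper())
    )


# gpd-2, gpd-3 rows from Table 1 (C. elegans above C. briggsae)
CE_GPD = "UUUC-CUCGUUUuU-gCGAGuuUaUaUACCUU"
CB_GPD = "UUUCaCUCGUUUgUcuCGAGaaUuUuUACCUU"

ce_u = u_fraction(CE_GPD)
identical = conserved_columns(CE_GPD, CB_GPD)
```

For this pair, more than half of the non-gap C. elegans positions are U, and the number of identical columns equals the number of uppercase letters in the C. elegans row, as the table's notation implies.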
Thus, we decided to experimentally mutate all sequences within the gpd-2–gpd-3 intercistronic region to locate sequence elements that play a role in 3′-end formation or trans splicing. A series of linker scan mutations (LS1 to LS9) were constructed to cover the 90 bp of the intercistronic region from the 3′ end of gpd-2 to the trans splice site of gpd-3 (Fig. 1). Each mutation was made by replacing 10 bp of the intercistronic region with the sequence GCTTAATTAA, which maintains the A- and U-rich content of the intercistronic region. The nine constructs were introduced into worms to generate multiple transgenic strains for each. Total RNA was extracted from these strains and analyzed for correct 3′-end formation and trans splicing.
FIG. 1.
Linker scan analysis of the gpd-2–gpd-3 intercistronic region. The C. elegans hsp-16-41 promoter upstream of gpd-2 (triangle) drives expression of gpd-2 and gpd-3–vit-6. Filled bars, coding regions; open bars, noncoding regions of exons; narrow lines, introns; wider lines, intercistronic sequences. The sequence is annotated with arrows indicating RNA processing sites and boxes marking 3′-end formation signals. Numbers indicate base pairs from the gpd-2 3′ end. The mutations, shown below the sequence, were made by replacing the wild-type sequence with the sequence indicated. The Ur element (as defined in this paper) and all mutations that change the Ur element are underlined. The RPA probe spans from within the last exon of gpd-2 to the second exon of gpd-3 as indicated by the horizontal line above the operon diagram.
Effect of linker scan mutations on 3′-end formation of gpd-2.
To determine the effect of the various LS mutations, we examined polycistronic pre-mRNA processing by RPA. Expression and processing of the wild-type operon resulted in protection of a 322-nt fragment, indicative of correct 3′-end formation (Fig. 2, top, lane 1). The downstream gene, gpd-3, was processed by trans splicing to give a protected fragment of 259 nt. The amount of trans-spliced gpd-3 mRNA was approximately 40% of the amount of gpd-2 mRNA. As observed previously (12), polycistronic pre-mRNA processing was complete as judged by the absence of polycistronic pre-mRNA, which would produce a fragment of 679 nt. In all of the linker scan mutants, 3′-end formation of the upstream gene product was also complete (Fig. 2), since we could not detect accumulation of unprocessed RNA and since there was no obvious change in the amount of the upstream gene mRNA. These data indicate that in this operon there are no nonredundant sequences following the gpd-2 3′ end required for 3′-end formation. Although we would expect this region to contain a cleavage stimulatory factor (CstF) binding site, needed for 3′-end formation in higher eukaryotes, we cannot rule out the possibility that there are redundant elements in the intercistronic region that are involved in the processing of the upstream mRNA.
FIG. 2.
Analysis of the linker scan mutations by RNase protection analysis. Transgenic strains carrying the indicated constructs were grown and heat shocked, RNA was isolated, and RPA was performed. (Top) The three major RNA products are indicated: the gpd-2 3′-end product, formed by cleavage and polyadenylation, and the two 5′-end products, each formed by trans splicing and intron removal. Although the probe is identical to gpd-3 sequences, gpd-3 is sufficiently similar in this region to gpd-2 that the 5′ end of gpd-2 also results in a protected product (12). Polycistronic RNA (but with introns removed) would have given a 679-nt product with RNA from strains expressing the wild-type construct. The equivalent polycistronic RNA from the linker scanning mutants would have given two bands due to mismatching of the wild-type sequence probe: 355 and 324 nt for LS1, 365 and 314 nt for LS2, etc. (Bottom) The RPA was quantitated, and the data were plotted as the ratio of gpd-3 5′-end product to gpd-2 3′-end product.
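The band sizes quoted for unprocessed polycistronic RNA follow a simple arithmetic pattern, sketched below. Only the LS1 and LS2 values and the 679-nt total are stated in the text; extending the 10-nt shift to LS3 through LS9 is an assumption made here for illustration, and the names `expected_bands` and `product_ratio` are hypothetical.

```python
FULL_LENGTH = 679        # protected polycistronic fragment, wild type (nt)
LS1_UPSTREAM_BAND = 355  # upstream band stated for LS1 (nt)
WINDOW = 10              # each linker scan mutation replaces 10 bp


def expected_bands(n):
    """Expected (upstream, downstream) band sizes in nt for mutant LSn.

    Assumes the probe mismatch point moves 10 nt downstream with each
    successive linker scan mutation (stated only for LS1 and LS2).
    """
    if not 1 <= n <= 9:
        raise ValueError("linker scan mutants are LS1 to LS9")
    upstream = LS1_UPSTREAM_BAND + WINDOW * (n - 1)
    return upstream, FULL_LENGTH - upstream


def product_ratio(gpd3_5end, gpd2_3end):
    """Ratio plotted in the bottom panel: gpd-3 5'-end / gpd-2 3'-end signal."""
    return gpd3_5end / gpd2_3end


bands = {f"LS{n}": expected_bands(n) for n in range(1, 10)}
```

Under this assumption the two bands of every mutant sum to the wild-type fragment length, which offers a quick consistency check when reading the gel.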
Ur sequence required for downstream mRNA accumulation.
In most of the mutants (LS1, LS2, and LS6 through LS9), accumulation of gpd-3 mRNA is indistinguishable from that of the wild type (Fig. 2). However, in three mutants, LS3, LS4, and LS5, the amount of protected fragment representing the 5′ end of gpd-3 RNA was significantly reduced. In LS3 and LS5, gpd-3 RNA accumulation was about 50% of the wild type, whereas in LS4, only 25% of the wild-type level accumulated. These three mutations cover 30 bp of the intercistronic region that is especially rich in uridylate residues, so we call it the Ur element. It is essentially the same sequence found to be homologous between C. elegans and C. briggsae intercistronic regions (Table 1).
Effect of the linker scan mutations on gpd-3 trans-splicing specificity.
We tested all of the LS mutant strains for alteration of their relative levels of SL2 and SL1 trans splicing by primer extension with ddGTP, which distinguishes between SL1 and SL2 trans-spliced gpd-3 mRNA. Reverse transcription from the splice site stops at the first C in the template, giving products that extend the primer 9 nt with SL1-spliced mRNA, 3 nt with SL2-spliced mRNA, and 2 nt with pre-mRNA or other unspliced intermediates. With the wild-type construct, most trans-spliced product received SL2 (Fig. 3, top, lane 1) (12). Most of the linker scan mutations resulted in no significant change in the percentage of SL2 trans splicing. However, LS4, the mutation that had the most drastic effect on gpd-3 mRNA accumulation, also reduced the level of SL2 and increased the level of SL1 trans splicing (Fig. 3, top, lane 5). In all cases, we confirmed the results by primer extension in the presence of ddCTP, which also gives differences in sizes of the products for SL1 and SL2 trans-spliced and unspliced RNA (data not shown). We conclude that sequences mutated in LS4 play a key role in determining gpd-3 trans-splicing specificity and in the accumulation of trans-spliced gpd-3 mRNA.
FIG. 3.
Effect of LS mutations on trans-splicing specificity. The RNA samples analyzed in the results shown in Fig. 2 were tested by primer extension from a primer that is complementary to RNA −1 to +17 from the gpd-3 trans splice site. The presence of ddGTP results in termination of extension after 2 nt on unspliced RNA and after either 3 or 9 nt from RNA resulting from trans splicing to SL2 or SL1, respectively. Following primer extension, the products were electrophoresed on a 20% polyacrylamide denaturing gel. (Top) The positions of expected products are indicated. (Bottom) The primer extension products were quantitated, and the data were plotted as the percentage of total trans-spliced gpd-3 mRNA that is trans spliced to SL2.
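The logic of the ddGTP primer extension assay can be illustrated with a small sketch. The SL1 and SL2 sequences and the unspliced 3′ splice site below are not given in this paper; they are assumed here (the commonly cited 22-nt leaders and the UUUUCAG consensus), and the function names and band intensities are hypothetical.

```python
# Assumed sequences (5'->3'), each ending at the splice junction
SL1 = "GGTTTAATTACCCAAGTTTGAG"   # assumed SL1 leader
SL2 = "GGTTTTAACCCAGTTACTCAAG"   # assumed SL2 leader
UNSPLICED = "UUUUCAG"            # consensus 3' splice site upstream of the junction


def extension_length(upstream, primer_overlap=1):
    """Nucleotides added to the primer before ddGTP terminates extension.

    The primer covers position -1, so reverse transcription starts at
    position -2 and proceeds 5'-ward on the template. ddGTP is
    incorporated opposite the first template C, which ends the product.
    Returns None if the template contains no C in that stretch.
    """
    template = upstream.upper().replace("U", "T")
    walk = template[::-1][primer_overlap:]
    for added, base in enumerate(walk, start=1):
        if base == "C":
            return added
    return None


def percent_sl2(sl1_signal, sl2_signal):
    """Percentage of trans-spliced product that carries SL2."""
    return 100.0 * sl2_signal / (sl1_signal + sl2_signal)


lengths = {name: extension_length(seq)
           for name, seq in [("SL1", SL1), ("SL2", SL2), ("unspliced", UNSPLICED)]}

# Hypothetical band intensities, for illustration only
example_percent = percent_sl2(sl1_signal=10.0, sl2_signal=30.0)
```

With these assumed sequences the computed extension lengths agree with the 9-, 3-, and 2-nt products described in the legend.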
Establishing the boundaries of the Ur element.
The three linker scan mutations that affected gpd-3 mRNA accumulation span 30 bp. To establish the boundaries more precisely, we created additional mutations. In LS10, only the 5′-most 8 bp of sequence covered by LS3 was mutated (Fig. 1). Unlike LS3 (Fig. 2, top, lane 4), the phenotype was the wild type, indicating that only the 3′-most 2 bp of the sequence covered by LS3 are required for Ur function (Fig. 4, lane 2). In the second mutation, LS11, only the 3′-most 5 bp of the sequence covered by the LS5 mutation was mutated (Fig. 1). This mutation resulted in a mutant phenotype (Fig. 4, top, lane 3) even stronger than that shown by LS5 (Fig. 2, lane 6), indicating that the Ur element includes at least some of the 3′-most 5 bp of the sequence covered by the LS5 mutation. Neither LS10 nor LS11 altered trans-splicing specificity (data not shown), similar to the behavior of the LS3 and LS5 mutations (Fig. 3, top, lanes 4 and 6). These data indicate that the Ur element spans no more than 22 nt, from position 29 to 50 (Fig. 1), and possibly as few as 18 nt.
FIG. 4.
Fine-structure analysis of the Ur element. (Top) LS10 is identical to LS3 except for the 3′-most 2 bp, which are returned to the wild-type sequence. In LS11, only the 3′-most 5 bp of LS5 are mutant (see Fig. 1). In addition, several small mutations within the Ur element, described in the legend to Fig. 1, were tested. (Bottom) RPA was performed, and the data were quantitated as described in the legend to Fig. 2.
Fine-structure analysis of the Ur element.
To further dissect the Ur element and test the idea that the U richness of the element contributes to its function, we constructed several smaller replacement mutations, LS21 to LS26 and ΔUr (Fig. 1). None of these mutations affected 3′-end formation (Fig. 4). However, all of them except LS24 and LS26 resulted in significant reductions in gpd-3 accumulation. Even LS21, in which a single G at position 29 was changed to a C, reduced gpd-3 mRNA accumulation quite substantially (Fig. 4, top, lane 4). These results indicate that most of the sequences between positions 29 and 50 play a role in Ur function. Only the bases from positions 41 to 44 could be replaced without significant reduction in gpd-3 mRNA. Thus, the Ur element has critical nucleotides at positions 29 to 40 and 45 to 50.
Sufficiency of the Ur element.
The above results showed that the Ur element is essential for both maximal accumulation of gpd-3 mRNA and correct trans-splicing specificity. To determine whether it is sufficient, we created the SUF construct in which all of the intercistronic sequences except the Ur element were replaced with unrelated polylinker sequences (Fig. 1). As a control, SUFR was constructed by reversing Ur in the same context as SUF (Fig. 1). In both constructs, 3′-end formation of gpd-2 was normal, again suggesting that the region downstream of the cleavage site plays no role in 3′-end formation (Fig. 5). Remarkably, accumulation of gpd-3 mRNA was nearly normal in SUF, while virtually no gpd-3 mRNA accumulated in the SUFR control. This demonstrates that the Ur element can function in a heterologous context with no sequence from the intercistronic region except the trans splice site. Furthermore, trans-splicing specificity was also normal, demonstrating that the Ur element is sufficient to allow SL2 trans splicing in the absence of other sequences from the gpd-2–gpd-3 intercistronic region (Fig. 5).
FIG. 5.
The Ur element is sufficient for proper operon mRNA processing. The entire intercistronic region except Ur was replaced by unrelated sequence material to create the SUF construct (Fig. 1). In the SUFR construct, the Ur sequence is reversed in the same context. RPA and primer extension analysis were performed as described in the legends to Fig. 2 and 3.
Effect of Ur mutations in the absence of 3′-end formation.
It was shown previously that deletion of the gpd-2 AAUAAA signal for 3′-end formation completely prevents 3′-end formation but still allows gpd-3 trans splicing (12; also see Fig. 6A, top, lane 2). However, this trans splicing is no longer predominantly to SL2; instead about half of the trans splicing is to SL1 (12; also see Fig. 6B, lane 2). In order to determine whether the Ur mutations still had an effect in the absence of 3′-end formation, we created double mutations in which both the AAUAAA signal and the Ur element were mutated. We found that most of these double mutants were indistinguishable from the AAUAAA mutant alone (Fig. 6). As expected, gpd-2 3′-end formation was completely abrogated in all of the double mutants; however, there was a substantial amount of accumulation of gpd-3, much more than seen with the single Ur mutations (compare Fig. 2, top, lane 5, with Fig. 6A, lane 4, for instance). This suggests that the reduction in the levels of gpd-3 accumulation in the single Ur mutant strains is dependent on 3′-end formation. Furthermore, the ratio of SL2 to SL1 trans splicing in most double mutant strains is indistinguishable from that in the AAUAAA mutation alone, so it appears that the AAUAAA mutation is epistatic to the Ur mutations. However, the double mutations with LS4 or ΔUr do show a further increase in SL1 and decrease in SL2 trans splicing (Fig. 6B, lanes 3 and 4), so the Ur mutations may have a direct effect on trans-splicing specificity (also see below).
FIG. 6.
Contribution of the Ur element to the trans-splicing specificity of the downstream gene. The effects of mutations within the Ur element were measured in the absence of gpd-2 3′-end formation. The ΔpA mutation was constructed by mutating both of the gpd-2 AAUAAA elements (12). The unmutated construct is labeled WT. (+) indicates an unmutated Ur region. RPA (A) and primer extension analysis (B) were performed as described in the legends to Fig. 2 and 3.
Replacement of the downstream gene with the gfp gene.
In order to find out if there are any sequences in the downstream gene that play a role in defining its trans-splicing specificity, we replaced gpd-3 with the gfp gene in our transgenic construct and measured mRNA accumulation and trans-splicing specificity (Fig. 7). RPA showed that the gfp gene was expressed and processed properly (Fig. 7B, lane 1). Primer extension in the presence of ddGTP indicated that this gene was trans spliced predominantly to SL2 (Fig. 7C, lane 2). These results suggest that sequences in gpd-3 are not required for SL2-specific trans splicing. We also tested the effect of the ΔUr mutation in this operon. Both RPA and primer extension with and without ddGTP demonstrate clearly that in this novel context the Ur element is important for the accumulation of the downstream mRNA. Interestingly, it appears that, when gfp is the downstream gene, SL2 trans splicing is more dramatically inhibited by the lack of the Ur element than when gpd-3 is the downstream gene (compare Fig. 7C, lanes 3 and 4, with Fig. 3, top, lane 5).
FIG. 7.
The downstream gene can be replaced without loss of SL2 trans splicing. (A) The gpd-3 gene was replaced with the gfp gene and was tested for expression and trans-splicing specificity. These populations of worms were not heat shocked, since the artificial operon containing gpd-2 and the gfp gene was inexplicably expressed in the absence of heat shock. In this case poly(A)+ RNA was tested by RPA (B) and primer extension analysis (C) as described in the legends to Fig. 2 and 3. Constructs contained either the wild-type or ΔUr intercistronic sequence.
DISCUSSION
Accumulation of downstream mRNA.
Since the genes in C. elegans operons are typically separated by approximately 100 to 400 bp of intercistronic sequence, Kuersten et al. (12) and Spieth et al. (16) proposed that 3′-end formation of the upstream mRNA is mechanistically coupled to trans splicing of the downstream mRNA. They demonstrated that a functional AAUAAA does play a role in SL2 trans splicing, but SL2 trans splicing could still occur at reduced levels in the absence of AAUAAA. This suggested that the AAUAAA is not directly involved in SL2 trans splicing to gpd-3 and that other signals must be present that are responsible for SL2 specificity.
Our evidence suggests that the trans splicing of the downstream gene is determined by sequences within the intercistronic region. When the intercistronic sequence was shortened to 30 bp or lengthened to 300 bp with a heterologous sequence, trans splicing of the downstream mRNA was virtually eliminated (J. Spieth, K. Lea, S. Kuersten, M. MacMorris, and T. Blumenthal, unpublished). Based on this observation, we undertook the detailed mutational analysis reported here to screen the entire intercistronic sequence. This resulted in the discovery of what we term the Ur element. When the Ur element was mutated, accumulation of downstream mRNA was severely reduced, enabling us to define the element with some precision. Based on the data reported here, we can conclude that the 5′ end of the element is at position 29 or 30, counting from the site of 3′-end formation, and that its 3′ end is between positions 45 and 50. Furthermore, not all sequences between these positions are important, based on the fact that positions 41 to 44 can be mutated without loss of activity. We suggest that Ur is likely to be a protein binding site, since a single nucleotide change at position 29 eliminated activity.
The linker scan analysis along with the sufficiency experiment makes it clear that only sequences contained within the ∼22 nt of the Ur element are required for accumulation of downstream mRNA. Furthermore, the experiment in which the downstream gene itself was replaced by a gfp gene makes it clear that the sequence of gpd-3 does not play a critical role. What does the Ur element do? Because it appears to also affect the choice between SL2 and SL1 trans splicing, we presume it has its effect by allowing or directly promoting trans splicing. The following model can explain its function. If gpd-2 3′-end formation occurs before gpd-3 trans splicing, then the cleavage reaction will leave a free 5′ phosphate on the downstream portion of the pre-mRNA. Since this RNA is not capped, it will likely be subject to rapid degradation by an exonuclease. We hypothesize that a protein binds to the Ur element to impede the progress of the exonuclease and thereby allow SL2 trans splicing to occur. The SL2 trans-splicing event could be a default mode that occurs whenever sufficient time is allowed by the protein bound to Ur. Alternatively, the protein bound to the Ur element could interact directly or indirectly with the SL2 snRNP to facilitate correct trans splicing. The trans splicing provides a cap that will prevent degradation of the downstream mRNA and hence allow its accumulation.
Involvement of the Ur region in the trans-splicing specificity of the downstream gene.
If failure to accumulate downstream mRNA in the Ur mutants is primarily a defect in trans-splicing efficiency, then following 3′-end formation, the downstream product would be inefficiently processed, and the majority of the gpd-3 RNA would be degraded. An alternative possibility is that the Ur element is more directly involved in SL2 trans splicing. To test this possibility, we mutated the Ur element in the context of the gpd-2 AAUAAA mutation (ΔpA), which eliminates upstream 3′-end formation. Since cleavage or transcription termination could not occur due to the AAUAAA mutation, we predicted that the downstream product would now be able to accumulate. Indeed, both polycistronic RNA (resulting from the failure of 3′-end formation and trans splicing) and trans-spliced gpd-3 mRNA accumulated to the same levels whether or not the Ur element was mutated in the ΔpA background (Fig. 6). Since the ΔpA mutation by itself reduced SL2 trans splicing, it was difficult to determine for certain whether there was an additional reduction due to the Ur mutations. However, it appears that both ΔUr and the LS4 mutation did increase SL1 trans splicing at the expense of SL2.
A much more dramatic effect of the ΔUr mutation was seen in the context of the gfp operon construct (Fig. 7). Here, SL2 trans splicing was eliminated by the ΔUr mutation without any apparent effect on SL1 trans splicing. We conclude that there is a direct effect by the Ur element on SL2 specificity, as opposed to just on trans-splicing efficiency, but that it is more difficult to demonstrate when the AAUAAA is also mutated. We suggest that the Ur element is directly involved in SL2 trans splicing but that this involvement is facilitated by CPSF bound to the AAUAAA of the upstream gene or by 3′-end formation itself. It has previously been shown that inactivation of the gpd-2 CPSF binding site reduces the level of trans splicing to gpd-3 but does not eliminate SL2 trans splicing. Since vertebrate CPSF and CstF are known to interact cooperatively (10, 20), the AAUAAA mutation might exert its effect by lowering the affinity of CstF for its binding site.
Implications for the mechanism of 3′-end formation.
It is clear from earlier work that C. elegans uses the same recognition sequence for CPSF binding as vertebrates do, AAUAAA, although many variants of this sequence are known to function in C. elegans (2). In contrast, no CstF binding site has yet been identified in worms, although all of the CstF subunits are encoded in the C. elegans genome. This paper reports the first mutational analysis of a region downstream of a 3′ end in C. elegans, where the CstF binding site would be expected to be located. Nonetheless, our findings indicate that all of the sequence downstream of the 3′ end can be replaced without any apparent effect on 3′-end formation. The possibility that CstF does bind in this region to facilitate 3′-end formation but that the binding sites are redundant is made unlikely by the experiment showing that 3′-end formation was also complete in both the SUF and SUFR constructs. Since the latter contains no sequences normally present in the gpd-2 intercistronic region, we conclude that no sequences downstream of the 3′ end are required for correct 3′-end formation in this operon. Thus, all of the sequences required for 3′-end formation of gpd-2 are present within the gene, presumably in the 3′ untranslated region. Whether this is generally true of C. elegans 3′-end formation remains to be determined. It is possible that this conclusion is valid only for upstream genes in operons or even only for gpd-2.
Another possibility is that CstF could play an obligatory role but that it could interact with the 3′-end formation machinery by another means such as by forming a tight complex with CPSF (19). CstF may bind to the intercistronic region, but its binding there may not be required for 3′-end formation. Perhaps binding of CstF to a downstream region only facilitates 3′-end formation without being required for it to occur in a CstF-dependent fashion. This idea leaves open the possibility that CstF could be the protein that binds to the Ur region and that binding is required to allow production of downstream mRNA. We suggest this possibility only because the Ur element has a sequence consistent with a CstF binding site and is in a location where CstF binding sites generally reside (4, 14, 18), in this case, 29 nt downstream of the site of 3′-end formation. This idea is also consistent with the observation that each operon listed in Table 1 has a conserved sequence in this approximate location, but these sequences resemble each other only by virtue of the fact that they are all rich in U residues.
ACKNOWLEDGMENTS
We are grateful to Devin Leake for discussions and help with the figures.
This work was supported by research grant GM42432 from the National Institute of General Medical Sciences.
REFERENCES
- 1. Blumenthal T. Gene clusters and polycistronic transcription in eukaryotes. Bioessays. 1998;20:480–487. doi: 10.1002/(SICI)1521-1878(199806)20:6<480::AID-BIES6>3.0.CO;2-Q.
- 2. Blumenthal T, Steward K. RNA processing and gene structure. In: Riddle D, Blumenthal T, Meyer B, Priess J, editors. C. elegans II. Cold Spring Harbor, N.Y: Cold Spring Harbor Laboratory Press; 1997. pp. 117–145.
- 3. Brenner S. The genetics of Caenorhabditis elegans. Genetics. 1974;77:71–94. doi: 10.1093/genetics/77.1.71.
- 4. Chou Z F, Chen F, Wilusz J. Sequence and position requirement for uridylate-rich downstream elements of polyadenylation signals. Nucleic Acids Res. 1994;22:2525–2531. doi: 10.1093/nar/22.13.2525.
- 5. Conrad R, Thomas J, Spieth J, Blumenthal T. Insertion of part of an intron into the 5′ untranslated region of a Caenorhabditis elegans gene converts it into a trans-spliced gene. Mol Cell Biol. 1991;11:1921–1926. doi: 10.1128/mcb.11.4.1921.
- 6. Conrad R, Liou R F, Blumenthal T. Functional analysis of a C. elegans trans-splice acceptor. Nucleic Acids Res. 1993;21:913–919. doi: 10.1093/nar/21.4.913.
- 7. Conrad R, Liou R F, Blumenthal T. Conversion of a trans-spliced C. elegans gene into a conventional gene by introduction of a splice donor site. EMBO J. 1993;12:1249–1255. doi: 10.1002/j.1460-2075.1993.tb05766.x.
- 8. Denker J A, Maroney P A, Yu Y T, Kanost R A, Nilsen T W. Multiple requirements for nematode spliced leader RNP function in trans-splicing. RNA. 1996;2:746–755.
- 9. Evans D, Zorio D, MacMorris M, Winter C E, Lea K, Blumenthal T. Operons and SL2 trans-splicing exist in nematodes outside the genus Caenorhabditis. Proc Natl Acad Sci USA. 1997;94:9751–9756. doi: 10.1073/pnas.94.18.9751.
- 10. Gilmartin G M, Nevins J R. Molecular analyses of two poly(A) site processing factors that determine the recognition and efficiency of cleavage of the pre-mRNA. Mol Cell Biol. 1991;11:2432–2438. doi: 10.1128/mcb.11.5.2432.
- 11. Higuchi R. Recombinant PCR. In: Innis M A, Gelfand D H, Sninsky J J, White T, editors. PCR protocols: a guide for methods and applications. San Diego, Calif: Academic Press, Inc.; 1990. pp. 177–183.
- 12. Kuersten S, Lea K, MacMorris M, Spieth J, Blumenthal T. Relationship between 3′ end formation and SL2-specific trans-splicing in polycistronic Caenorhabditis elegans pre-mRNA processing. RNA. 1997;3:269–278.
- 13. Lee Y H, Huang X Y, Hirsh D, Fox G E, Hecht R M. Conservation of gene organization and trans-splicing in the glyceraldehyde 3-phosphate dehydrogenase genes of Caenorhabditis briggsae. Gene. 1992;121:227–235. doi: 10.1016/0378-1119(92)90126-a.
- 14. MacDonald C C, Wilusz J, Shenk T. The 64-kilodalton subunit of the CstF polyadenylation factor binds to pre-mRNAs downstream of the cleavage site and influences cleavage site location. Mol Cell Biol. 1994;14:6647–6654. doi: 10.1128/mcb.14.10.6647.
- 15. Mello C, Kramer J, Stinchcomb D, Ambros V. Efficient gene transfer in C. elegans: extrachromosomal maintenance and integration of transforming sequences. EMBO J. 1991;10:3959–3970. doi: 10.1002/j.1460-2075.1991.tb04966.x.
- 16. Spieth J, Brooke G, Kuersten S, Lea K, Blumenthal T. Operons in C. elegans: polycistronic mRNA precursors are processed by trans-splicing of SL2 to downstream coding regions. Cell. 1993;73:521–532. doi: 10.1016/0092-8674(93)90139-h.
- 17. Sulston J, Hodgkin J. Methods. In: Wood W B, editor. The nematode Caenorhabditis elegans. Cold Spring Harbor, N.Y: Cold Spring Harbor Laboratory Press; 1988. pp. 587–606.
- 18. Takagaki Y, Manley J L. RNA recognition by the human polyadenylation factor CstF. Mol Cell Biol. 1997;17:3907–3914. doi: 10.1128/mcb.17.7.3907.
- 19. Takagaki Y, Manley J L. Complex protein interactions within the human polyadenylation machinery identify a novel component. Mol Cell Biol. 2000;20:1515–1525. doi: 10.1128/mcb.20.5.1515-1525.2000.
- 20. Wilusz J, Shenk T, Takagaki Y, Manley J L. A multicomponent complex is required for the AAUAAA-dependent cross-linking of a 64-kilodalton protein to polyadenylation substrates. Mol Cell Biol. 1990;10:1244–1248. doi: 10.1128/mcb.10.3.1244.
- 21. Zorio D A R, Cheng N N, Blumenthal T, Spieth J. Operons represent a common form of chromosomal organization in C. elegans. Nature (London) 1994;372:270–272. doi: 10.1038/372270a0.








