Abstract
A fraction of ribosomes engaged in translation will fail to terminate when reaching a stop codon, yielding nascent proteins inappropriately extended on their C-termini. Although such extended proteins can interfere with normal cellular processes, known mechanisms of translational surveillance are insufficient to protect cells from potential dominant consequences. Through a combination of transgenics and CRISPR/Cas9 gene editing in C. elegans, we demonstrate a consistent ability of cells to block accumulation of C-terminal extended proteins that result from failure to terminate at stop codons. 3’UTR-encoded sequences were sufficient to lower protein levels. Measurements of mRNA levels and translation suggested a co- or post-translational mechanism of action for these sequences in C. elegans. Similar mechanisms evidently operate in human cells, where we observed a comparable tendency for translated human 3’UTR sequences to reduce mature protein expression in tissue culture assays, including 3' sequences from the hypomorphic “Constant Spring” hemoglobin stop codon variant. We suggest 3’UTRs may encode peptide sequences that destabilize the attached protein, providing mitigation of unwelcome and varied translation errors.
Failure of translation termination to occur at a stop codon can lead to ribosomes translating into a 3’UTR. In some cases translation may proceed through the 3’UTR and into the poly(A) tail, triggering a process termed “nonstop” decay and destabilizing both the mRNA and nascent protein (reviewed in1). However, for a majority of 3’UTRs a stop codon is encountered prior to the poly(A) tail2,3. Readthrough events that encounter a subsequent termination codon are outside the scope of known translational surveillance pathways including nonstop1. Depending on the 3’UTR and the frame in which the ribosome enters, the late stop codon can be several, tens, or even hundreds of codons into a 3’UTR, producing variant proteins with potentially problematic C-terminal appendages. This issue is highlighted by several pathologies caused by late frameshifts or stop codon mutations where 3'UTR-encoded C-terminal extensions effect protein mislocalization4,5, aggregation6,7, and instability8–12, with calamitous consequences for organisms. Depending on sequence, genetic background, conditions, and organism, estimates of readthrough efficiency vary from <1% to 10% or more, posing a potential problem of non-trivial magnitude10,13.
We set out to determine whether, and to what extent, 3’UTR translation has an effect on gene expression with a fluorescent reporter system in C. elegans. Initially we selected 3’UTRs from three genes: unc-54 (encoding a muscle myosin), tbb-2 (a beta tubulin), and rpl-14 (a ribosomal protein). For each gene, fusion of the 3’UTR to a GFP driven by the myo-3 promoter resulted in robust fluorescence in body wall muscle (Fig 1a). Next, by mutating stop codons, we created GFP reporters for each gene where translation would read past the normal termination point, terminating instead at a stop codon part-way through the 3’UTR (Fig 1b). In each case the “late stop” reporter accumulated substantially less GFP, with differences in signal of at least 10-fold. As a control, a co-injected mCherry marker robustly expressed in the same cells. We conclude that translation into the 3’UTR can confer substantial loss of protein for at least these three 3’UTRs in C. elegans.
To test whether translation into 3’UTRs could confer a loss of protein expression more generally, a two-fluorescent-reporter system with each fluorophore transgene containing an identical 3’UTR was used. Nine genes were chosen arbitrarily to reflect a variety of functions and expression levels: rps-17 (small ribosomal subunit component), r74.6 (dom34/pelota release factor homolog), hlh-1 (muscle transcription factor), eef-1A.1 (also known as eft-3, translation elongation factor), myo-2 (a pharyngeal myosin), mut-16 (involved in gene/transposon silencing), bar-1 (a beta catenin), daf-6 (involved in amphid morphogenesis), and alr-1 (neuronal transcription factor). A criterion in choosing these genes was presence (common for C. elegans genes, Extended Data 1) of an in-frame stop codon in the 3’UTR at least 30 bases beyond the normal stop but upstream of known poly(A) sites. We fused the 3’UTRs of each gene separately to GFP and mCherry, removing the canonical termination codon in the GFP construct. For each of the nine genes tested, observed GFP signals were extremely faint, with raw GFP/mCherry fluorescence ratios of less than 0.1 (Fig 1c, Extended Data 2). As a control, versions of the GFP reporter with the normal termination codon intact provided robust GFP expression, 10-fold or more higher than the corresponding readthrough constructs (GFP/mCherry fluorescence ratios in the range 0.3–0.9).
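The selection criterion described above (an in-frame stop codon at least 30 bases past the normal stop, yet upstream of known poly(A) sites) can be illustrated with a minimal Python sketch. The function names, the convention that the 3'UTR string begins immediately after the normal stop codon, and the way the 30-base offset is measured are assumptions made for illustration; they are not taken from the analysis code used in this study.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_inframe_stop(utr3: str, frame: int = 0) -> Optional[int]:
    """Return the 0-based offset of the first stop codon read in `frame`.

    `utr3` is assumed to start immediately after the normal stop codon.
    Returns None when no stop codon is found in that frame.
    """
    seq = utr3.upper().replace("U", "T")
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def extension_codons(utr3: str, frame: int = 0) -> Optional[int]:
    """Number of 3'UTR sense codons translated before the late stop."""
    offset = first_inframe_stop(utr3, frame)
    return None if offset is None else (offset - frame) // 3


def passes_criterion(utr3: str, polya_site: int, min_offset: int = 30) -> bool:
    """True if the late stop lies >= min_offset bases into the 3'UTR
    and ends at or before the (hypothetical) poly(A) site offset."""
    offset = first_inframe_stop(utr3, frame=0)
    return offset is not None and offset >= min_offset and offset + 3 <= polya_site
```

Calling `extension_codons` with `frame=1` or `frame=2` gives the corresponding lengths for ribosomes that enter the 3'UTR out of frame.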
Several observations provide hints at how translation into 3’UTRs might reduce protein levels:
Experiments with specific mutagenesis support a role for the eventual protein sequence. Shortening readthrough peptides (tested for unc-54 and tbb-2) increased GFP expression (Fig 2a). Extending this analysis, an equal-length non-synonymous substitution in the unc-54 3’UTR restored GFP expression, whereas synonymous substitution with multiple base differences did not.
Mutagenesis analysis of constructs using a constant 3’UTR reinforced the inference of peptide sequence as the primary determinant of GFP loss. We found that the nucleotide sequence between the normal termination codon and the first in-frame termination codon was sufficient to confer GFP loss if inserted at the end of the GFP coding region for unc-54, tbb-2, hlh-1, daf-6, rps-20, or rps-30 (Fig 2b). The rps-30 readthrough region had the weakest effect on GFP, and was the shortest (nine amino acids). We undertook further mechanistic dissection by synonymous variation of readthrough regions from unc-54, tbb-2, and rps-20, with GC contents from 35–60%, in some cases mutating >50% of bases. Each synonymously substituted variant conferred robust loss of GFP.
Decreased expression following translation into the 3'UTR required peptide linkage between the upstream protein and the 3'UTR-encoded segment. To assess the relationship between (i) covalent linkage with the translated C terminal peptide and (ii) the outcome for the larger protein, we took advantage of a picornavirus-derived oligopeptide sequence that causes cleavage and release of the nascent chain, after which ribosomes continue translation of the downstream sequence14,15. Insertion of the T2A peptide (EGRGSLLTCGDVEENPGP) between GFP and the unc-54 3’UTR-encoded sequence rescued GFP expression, whereas an uncleavable T2A* point mutant did not (Fig 2b). Restoration of GFP levels by T2A to the level of no-insert controls also argues against mRNA destabilization as a substantial factor in the protein loss observed upon readthrough.
The above results could be explained if GFP was generally incompatible with C-terminal fusions in our system. To address this, we inserted a variety of sequences downstream of GFP: 3xFLAG, 3xHA, three random sequences created in silico, and six arbitrary fragments of in-frame coding sequence from C. elegans genes, approximately length matched to 3’UTR-encoded sequences (Fig 2b). GFP expression varied between constructs but was generally higher than 3’UTR-encoded sequences: 3xHA, 3xFLAG, 2 of 3 of the random sequences, and 4 of 6 of the coding-derived fragments exhibited GFP:mCherry fluorescence ratios of >0.13, higher than all nine tested 3'UTR-derived C-terminal extensions and significant statistically (p=.004, KS test). Thus the effects of 3’UTR-encoded sequences are not explained by a general intolerance of GFP to C-terminal extensions (see also Methods, Extended Data 3, 4).
It was conceivable that peculiarities of GFP and/or transgene expression systems might underlie the above observations. To establish effects of 3'UTR translation at endogenous genes, we sought loci where: (1) a loss of protein would be detectable phenotypically, (2) C-terminal fusions are known to be functional, (3) the next in-frame stop codon of the endogenous locus is ≥10 amino acids past the annotated stop codon, yet upstream of annotated poly(A) sites16, and (4) there is little-or-no autoregulation/feedback. unc-54 and unc-22 satisfy all the criteria, and pha-4, unc-45, and tra-2 at least the first three points (Methods). For each locus we mutated the stop codon to allow translation into the 3'UTR (Fig 3a)17. In parallel, we analyzed small insertions/deletions generating late frameshifts for unc-22 and unc-54. Additional controls had length-matched sequences and/or GFP tags at the C-terminus (Extended Data 5). For each of unc-22, unc-45, unc-54, and tra-2, translation into the 3'UTR in at least one frame generated a strong hypomorphic (near null) phenotype specific to each locus. Other C-terminal tags for each gene were well behaved (no loss of expression), although one tra-2 C-terminal tag did produce a Tra phenotype. The ability to place alternative tags on the C-terminus without obvious phenotypic consequences argues against a general sensitivity of the C-terminus to tagging. For unc-22 and unc-54, that elongation into the 3'UTR in only some frames elicited a hypomorphic phenotype argues against ribosome elongation into the 3'UTR as being detrimental per se.
To determine the consequences on gene expression upon translation into 3'UTRs, we analyzed the unc-54(cc3389) TAA(Stop)>AAT(Asn) mutation for its effects on RNA, translation, and protein output. We analyzed mRNA expression in unc-54(cc3389/+) heterozygotes (phenotypically wild type to avoid complications from an Unc phenotype). RNA-seq revealed the unc-54(cc3389) and wild type alleles at approximately equal amounts in the mRNA pool, suggesting 3’UTR translation does not appreciably destabilize the unc-54 mRNA (Fig 3c). In parallel, we detected a ~20-fold reduction in UNC-54 protein in immunoblots in unc-54(cc3389) (Fig 3d). To look for possible alterations in translation for unc-54(cc3389), we examined the distribution of RNase-protected mRNA fragments with ribosome footprint profiling18. We observed no significant difference in the loading of ribosomes on unc-54 mRNA (Extended Data 6), nor on the number, distribution, frame, or fragment size of ribosomes in the extended region (Extended Data 7).
A model that arises from these observations is that 3’UTR-encoded peptides mark their resulting products for destruction, either co- or post-translationally. Conceivably this process might operate either in a specific cell/tissue type or in a broad spectrum of different contexts. A broadly-expressed reporter bearing a readthrough extension would be expected to highlight any tissue which failed to destabilize the C-terminal peptide. Using a broadly-expressed promoter (eef-1A.1) driving GFP with and without the unc-54 3'UTR-encoded peptide, we observed no cells where GFP was robustly retained (data not shown).
We likewise considered the possibility that 3’UTR-encoded peptides might act to limit protein levels in human cells, developing a specific assay using a lentiviral dual fluorescence reporter encoding puromycin N-acetyl-transferase tethered to mCherry-T2A, followed by eGFP and a multiple cloning site (Fig 4a). The resulting reporter expresses both fluorophores from the same mRNA, yet as two disjoint polypeptides, allowing consideration of a peptide tag’s effect on eGFP expression independent of effects on mCherry/mRNA expression. We validated the split dual fluorophore approach in K562 cells using tags known to be destabilizing (d1ODC, d4ODC19) or not (3xFLAG, 3xHA) (Fig 4b). We selected 13 genes of varying expression and function, and inserted the region between the annotated termination codon and first-in-frame termination codon downstream of eGFP. For 9 of 13 genes, the readthrough region reduced the eGFP:mCherry fluorescence ratio between 3 and 30-fold, a stronger reduction than the degron d4ODC (Fig 4c). While not universal, the substantial loss of eGFP fluorescence for a majority of readthrough regions opens up the possibility that translation into 3’UTRs may be generally inhibitory to expression across systems.
We hypothesize that a function of 3’UTRs is to minimize the accumulation of extended protein products that could be produced through translational readthrough.
This feature may prove generally significant in the causation of genetic disease. For example, readthrough alleles (e.g. Stop>Gln) of the HBA2 locus in humans produce a fraction (~1%9) of normal HBA2 protein (alpha globin), causing thalassemia. Translation into the HBA2 3’UTR is known to destabilize the HBA2 mRNA20, but it is unclear what effect the appended C-terminal 31 amino acids have on HBA2 protein. We considered the possibility that the HBA2 3'UTR-encoded peptide might prevent protein expression in humans, contributing to the loss of HBA2 protein. When appended to eGFP, the HBA2 3’UTR-encoded peptide decreased the eGFP:mCherry fluorescence ratio in K562 cells (Fig 4d). Furthermore, eGFP fluorescence was rescued by a self-cleaving T2A peptide, but not by an uncleavable mutant.
Several observations from the literature corroborate the notion that 3’UTR-encoded peptides may be detrimental to expression for more genes and organisms than those assayed here. In S. cerevisiae, translation past a point in the his3 3’UTR confers a substantial loss in protein expression, without detectable effects on mRNA levels11. Similarly, readthrough of the cyclic AMP phosphodiesterase pde2 stop codon produces a destabilized protein variant, and this has been suggested to explain elevated cyclic AMP levels in Psi+ yeast10. Differential stability by polymorphisms in the readthrough peptide of sky1 has been postulated to explain [Psi]-induced strain differences in diamide sensitivity21. Particularly intriguing are very recent findings that stop codon mutations at the cFLIP-L locus confer protein instability for this anti-apoptotic factor in mice, leading to embryonic lethality12. The same study also noted several hereditary human disease alleles where 3’UTR-encoded peptides are destabilizing, conferring marked decreases in protein activity and level (e.g.8).
Not every case of stop codon readthrough is destabilizing4–7 (Fig 4c), and some readthrough events are functional and regulated to defined levels (e.g.22–25). Understanding the mechanisms by which some readthrough events are recognized and cleared (while others are not) may prove informative for biological contexts and pathological states where inappropriate readthrough occurs. We do not yet know the determinants of a translated 3’UTR sequence that confer loss of protein, though the ability of numerous sequences (including shuffled and randomized 3’UTR variants, Fig 2b, 4c) to do so suggests that a highly degenerate sequence is sufficient. Consistent with the idea that readthrough peptides’ effects may be mediated via their biophysical characteristics, we observed a significant negative relationship between hydrophobicity and expression for GFP (K562 cells, C. elegans) and endogenous loci (unc-22 and unc-54, C. elegans) (Extended Data 8, 9, 10, Supplementary Information).
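One simple way to score peptide hydrophobicity is the mean per-residue hydropathy. The Python sketch below uses the Kyte-Doolittle scale as an assumption; the text above does not state which hydrophobicity scale or summary statistic was used, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Assumption: the study's own hydrophobicity measure may differ.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues of a peptide."""
    residues = peptide.upper()
    if not residues:
        raise ValueError("peptide must not be empty")
    unknown = set(residues) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unrecognized residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)
```

Scores computed this way for a set of readthrough peptides could then be compared against the corresponding fluorescence ratios.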
Destabilization by 3'UTR-encoded peptides could effectively screen against at least three types of events where a stop codon is inappropriately bypassed: (1) Stop codon misreading (e.g. by suppressor tRNAs). Suppressor tRNAs permit readthrough of a few to upwards of 30% of ribosomes at a stop codon (one of UAA, UAG, or UGA)13. While some suppressor tRNAs can be toxic, other cells tolerate even high levels of readthrough13,26–28. Destabilization of readthrough products by C-terminal appendages may effectively buffer cells from suppressor tRNA-induced proteostatic chaos. (2) A ribosomal frameshift in a coding region which is late enough that no premature termination codon is encountered. In this case, ribosomes would enter the 3'UTR out-of-frame with the coding region. In our manipulations, translation of 3'UTRs in multiple frames was detrimental to expression (Extended Data 5, data not shown), and similar amino acid and hydropathy biases hold for all three 3'UTR frames (Extended Data 9,10). (3) Aberrant RNA processing or ribosome dysfunction could produce a variety of other improperly terminated peptides from which destabilization would provide valuable relief.
Methods
C. elegans Strain Construction and Husbandry
C. elegans were grown at 23C on agar plates with nematode growth medium seeded with E. coli strain OP50 as described29. Some strains were provided by the CGC, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440). A full list of strains used is available in Supplementary Table 1.
Transgenic array-containing strains were generated as follows: PD5102 (pha-1(e2123ts)I; rde-1(ne300)V) young adult hermaphrodites (grown at 16C) were injected with a mix of 90 ng/ul pC1 (containing a rescuing fragment of pha-1), 5 ng/ul of an mCherry-containing vector, and 5 ng/ul of a GFP-containing vector. Unless otherwise indicated, GFP was driven by the myo-3 promoter to drive expression in the body wall muscle30. Injectants were shifted to 23C to select for F1 progeny animals bearing a transgenic array (selecting for pha-1(+) expression31). The rde-1 allele included in this strain avoided a modest degree of secondary siRNA-based silencing observed with many extrachromosomal transgenes32. For transgenic lines generating low levels of GFP, we considered the possibility that the GFP protein was toxic and selected against.
Under this model, one might expect (1) a subset of sick and/or dead GFP-positive F1 animals, (2) muscle defects due to muscle-specific expression of potentially-toxic GFP derivatives, (3) concomitant low levels of mCherry, and/or (4) a decrease in the efficiency with which transgenic lines were obtained32. None of these effects were observed, arguing against any contribution of negative selection to the observed low GFP expression.
For a subset of strains, we deviated from the above protocol to generate pha-1 arrays as follows: (1) While most transgenic lines were generated from independently injected parents, a handful of strains were possibly generated from siblings of an injected parent (PD6480, 6481, 6482, 6483, 6484, 6485, 6486, 6493, 6494, 6495). In these cases, all injectants were pooled together on the same plate, and independent F1 were picked off to generate transgenic lines. Previous work has demonstrated independent F1 from the same injected parent carry distinct transgenic arrays32,33. (2) During the course of our analyses, we found a handful of strains with an mCherry-negative subpopulation. The mCherry-positive subpopulation was isolated and propagated to generate the strains PD6401, 6450, 6452, 6456, 6457, and 6464.
CRISPR/Cas9 genome editing was performed in the VC2010 (PD1074) N2 background as described17. We selected pha-434,35, unc-4536,37, tra-238–40, unc-2229, and unc-5441,42,38,26 based on the criteria in the text (citations indicated). The statement that unc-22 and unc-54 exhibit little-or-no autoregulation/feedback is based on a number of genetic experiments (with heterozygous29, amber-suppressed26, and/or smg-suppressed38 alleles) which express either UNC-54 or UNC-22 at stable intermediate levels (between wild type and null). Alleles of unc-45 were initially generated in the VC2010 background, though the embryonic lethality made unc-45(TerByP) difficult to maintain. We subsequently remade all alleles in a balanced heterozygote background (sC1(s2023) [dpy-1(s2170)] III/+) and considered non-Dpy segregants for phenotypic analyses.
Human Cell Line Construction
K562 cells (obtained from ATCC) were grown at a density of ~0.5–1×10^6 cells/ml in RPMI supplemented with penicillin/streptomycin, L-glutamine, and 10% FBS. All cell lines were maintained in a humidified incubator (37C, 5% CO2), and checked regularly for mycoplasma contamination. As a means of validating K562 cells, we performed RNA-seq on a subset of lines and observed good correlation with published datasets43 (data not shown). Viral particles were produced in HEK293T cells in 6 well dishes, and 1ml of viral supernatant was used to infect ~100,000 K562 cells by spin infection, 10^3 rcf for 2 hours. Polybrene was omitted so as to keep the infection rate low (<10%), ensuring a single incorporation event for most cells. After three days of recovery, cells were selected with puromycin at 0.7 ug/ml for at least 3 days. Fluorescence was examined on a BD Accuri C6 flow cytometer, with appropriate gating for live cell events. For each construct examined via puromycin selection in K562 cells, similar eGFP and mCherry fluorescence levels were also observed in transient transfection in HEK293T cells in the absence of puromycin, arguing against a puromycin-selected skew in mCherry fluorescence.
Plasmids
Plasmids were constructed by restriction digest or Gibson cloning as detailed in Supplementary Table 2. pJA138/L3785 and pJA137/pCFJ104 were used as the basis of all C. elegans GFP or mCherry-containing vectors, respectively. Portions of pMCB306 and pMCB309 were used to construct pJA291, the parental puro::mCherry::T2A::eGFP::MCS::wPRE vector for experiments in human cells. Plasmids were confirmed by both sequencing and restriction digest, and plasmid concentrations determined with the QuBit dsDNA Broad Range kit (Invitrogen). A handful of plasmids that may be useful have been deposited with Addgene: pJA327 (C. elegans superfolder GFP in L3785), pJA291, pJA317 (pJA291 with d1ODC insert) and pJA318 (pJA291 with d4ODC insert).
GFP fusions were done with a GFP variant that corresponds to wild-type (Aequorea) GFP with mutations at position 65 (Ser>Thr for human and Ser>Cys for C. elegans) known to improve folding and acquisition of fluorescence. Even with these mutations, GFP has a known propensity to misfold under some circumstances, so we examined the effect of a subset of the 3'UTR-encoded sequences (hlh-1, daf-6, and unc-54) downstream of a faster and more robust-folding GFP variant, superfolder GFP44. The observed reduction in superfolder GFP:mCherry ratios was quantitatively similar to that observed with normal GFP (Extended Data 3).
Sequences of FLAG45, HA46, d1ODC19, and d4ODC19 were obtained from the indicated publications. For exact sequences, see Supplementary Tables 1 and 2. T2A was previously shown to function in C. elegans14. Translation elongation through a member of the 2A peptide family (consensus D(V/I)EXNPGP) causes ribosomal pause, then release of the N-terminal peptide up to and including the Gly15. Translation elongation resumes, with the C-terminal peptide being produced with an N-terminal Pro.
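The 2A behavior described above can be sketched in Python: the consensus D(V/I)EXNPGP is written as a regular expression, and a fusion protein is split after the Gly so that the downstream product begins with Pro. The function name and the flanking sequences in the usage example are hypothetical; only the T2A peptide (EGRGSLLTCGDVEENPGP) and the consensus come from the text.

```python
import re
from typing import Optional, Tuple

# Consensus D(V/I)EXNPGP, where X is any residue.
TWO_A_CONSENSUS = re.compile(r"D[VI]E.NPGP")

T2A = "EGRGSLLTCGDVEENPGP"


def split_at_2a(protein: str) -> Optional[Tuple[str, str]]:
    """Split a polypeptide at the first 2A consensus match.

    Returns (upstream, downstream): upstream ends with the Gly of the
    motif and downstream starts with its final Pro. Returns None if the
    consensus is absent, as for a mutant that disrupts the motif.
    """
    match = TWO_A_CONSENSUS.search(protein.upper())
    if match is None:
        return None
    cut = match.end() - 1  # index of the final Pro of NPGP
    return protein[:cut], protein[cut:]
```

For example, `split_at_2a("MSK" + T2A + "AAA")` returns an upstream product ending in `...ENPG` and a downstream product `PAAA`, where `MSK` and `AAA` are placeholder flanks.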
Microscopy
Animals were immobilized by placing on a slide with a coverslip in 5mM EDTA, 50mM NaCl, 1mM levamisole and imaged on a Nikon Eclipse E6000 microscope using a Nikon super high pressure mercury lamp power supply. Filter cubes for fluorescence images were GFP (96342, Nikon Corp), mCherry (96321, Nikon Corp), and Broad (GFP+mCherry, 59022, Chroma Technology Corp). Images were collected with a 3CCD Digital Camera C7780 (Hamamatsu Corp) using HCImage (Version 1.0.2.060107, Hamamatsu Corp). Pictures of PD4251 and one of PD3363/3364 were taken for each imaging session and compared to ensure consistency between days.
For quantification of GFP to mCherry relative fluorescence, animals were imaged using a 4× objective with a broad filter and a 200 millisecond exposure. To avoid image over- or underexposure, a handful of exceptionally bright or dim strains were taken with a decreased or increased (respectively) exposure time (PD1798 40msec, PD3294 500msec, PD3299 50msec, PD3395 500msec, PD6327 50msec, PD6375 50msec, PD1786 50msec, PD1789 50msec, PD1790 50msec, PD6460 500msec, PD6469 50msec, PD6471 50msec, PD6472 50msec, PD6473 50msec, PD6485 100msec, PD1787 40msec, PD6450 100msec, PD6455 40msec, PD6477 40msec, PD6479 50msec, PD6498 40msec, PD6499 40msec, PD6500 40msec, PD6501 40msec, PD6502 40msec, PD6503 40msec, PD6504 40msec).
Raw pixel values for the red and green channels were obtained from image files using the tifffile package in python. Pixels below a threshold distance (200) from the median pixel intensity of the entire image were discarded as background. Pixels above a threshold intensity distance (4000 of a possible 4095) from the origin were discarded as saturated. The median pixel intensity for the entire image (essentially the black background, given the relatively low density of C. elegans tissue) was subtracted from the remaining pixels, and the slope of the linear regression line taken as the GFP/mCherry fluorescence ratio. This metric was robust to different exposure times and neutral density filters.
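The quantification steps above can be sketched in plain Python as follows. Several details are assumptions made for illustration: "distance" from the median is taken as Euclidean distance in the (mCherry, GFP) intensity plane, a pixel is treated as saturated if either channel exceeds the cutoff, and the regression is ordinary least squares of GFP on mCherry with an intercept. The function name is hypothetical, and pixel values are assumed to have already been read into flat lists (for example from arrays loaded with tifffile).

```python
from math import hypot
from statistics import median
from typing import Sequence


def gfp_mcherry_ratio(red: Sequence[float], green: Sequence[float],
                      background_radius: float = 200,
                      saturation: float = 4000) -> float:
    """Slope of the GFP-vs-mCherry regression over signal pixels."""
    if len(red) != len(green):
        raise ValueError("channels must contain the same number of pixels")
    med_r, med_g = median(red), median(green)  # approximates the background
    xs, ys = [], []
    for r, g in zip(red, green):
        if hypot(r - med_r, g - med_g) < background_radius:
            continue  # too close to the background: discard
        if r > saturation or g > saturation:
            continue  # saturated: discard
        xs.append(r - med_r)  # background-subtracted mCherry
        ys.append(g - med_g)  # background-subtracted GFP
    if len(xs) < 2:
        raise ValueError("too few signal pixels for a regression")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("mCherry signal has no variance")
    return sxy / sxx
```

Because the slope is taken across many pixels rather than from summed intensities, a metric of this form would be expected to be insensitive to overall brightness, in keeping with the robustness to exposure time noted above.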
Statistics
Statistical tests and p-values are stated throughout the text and figures.
To test statistical significance of C-terminal appendage effects on the GFP/mCherry fluorescence ratio (Fig 2B), we divided the data into two groups: (1) 3'UTR-derived (unc-54(TerByP), tbb-2(TerByP), daf-6(TerByP), hlh-1(TerByP), rps-20(TerByP), rps-30(TerByP), shuffle1–3) and (2) non-3'UTR-derived (rand1–3(A,C,G,T), 3xHA, 3xFLAG, eef-1A.1(CDS63-83), bar-1(CDS452-492), daf-6(CDS756-782), mut-16(CDS89-101), alr-1(CDS93-126), hlh-1(CDS290-320,syn1)). For each construct, we took the average GFP/mCherry fluorescence ratio of all available lines. We compared the distribution of 3'UTR-derived and non-3'UTR-derived GFP/mCherry fluorescence ratio values by Kolmogorov-Smirnov test (p=0.004).
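The comparison above rests on the two-sample Kolmogorov-Smirnov statistic, the largest vertical gap between the empirical cumulative distributions of the two groups. The sketch below computes only that statistic in plain Python; the function name and the example values are hypothetical and are not the data from Fig 2B. A p-value would normally come from a statistics library (SciPy's `scipy.stats.ks_2samp` is one option), and the software actually used for the reported p=0.004 is not stated in the text.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample KS statistic: max |ECDF_a(v) - ECDF_b(v)| over all v."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for v in a + b:  # the maximum gap occurs at an observed value
        ecdf_a = bisect_right(a, v) / len(a)
        ecdf_b = bisect_right(b, v) / len(b)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d


if __name__ == "__main__":
    # Hypothetical per-construct mean GFP/mCherry ratios, for illustration only.
    utr_derived = [0.01, 0.02, 0.03, 0.05, 0.08]
    non_utr_derived = [0.05, 0.15, 0.20, 0.30, 0.45]
    print(ks_statistic(utr_derived, non_utr_derived))
```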
Ribosome Footprint Profiling
Ribo-seq was performed essentially as described18,47, with a few modifications. Briefly, animals were grown to the ~L4 stage, and harvested by centrifugation and flash freezing in liquid nitrogen. Animals were ground with a mortar and pestle in liquid nitrogen, after which the powder was thawed in an excess volume of ice-cold polysome lysis buffer (20mM Tris pH 8.0, 140mM KCl, 1.5mM MgCl2, 1% Triton, with cycloheximide at 100ug/ml). RNase 1 digestion and sucrose gradient centrifugation were as before47. ~2ug of purified, RNase 1-digested monosomal RNA was run on a urea 15% polyacrylamide gel, and the entire region from ~15–30nt was excised for library preparation. At this point the protocol continued with T4 PNK (NEB) treatment as with the RNA-seq1 protocol (next section).
RNA Sequencing
Two RNA sequencing (RNA-seq) protocols were used in this study. The first RNA-seq protocol (RNA-seq1) was performed on homozygote populations of animals (Extended Data 6). 5ug of total RNA was treated with the RiboZero kit (Illumina). RNA was fragmented at 95C for 30' by addition of an equal volume of 100mM sodium carbonate, 0.5mM EDTA pH 9.3 buffer. RNA fragments were gel purified, then treated with T4 PNK (NEB). 3' ligation with AF-JA-34.2 (/5rApp/NNNNNNAGATCGGAAGAGCACACGTCT/3ddC/, Integrated DNA Technologies) and T4 RNA ligase 1 (NEB) was done at room temperature for 4 hours with 20% PEG8000 in 3.3mM DTT, 8.3mM glycerol, 50mM HEPES KOH (pH 8.3), 10mM MgCl2, 10ug/ml acetylated BSA. Unligated AF-JA-34.2 was removed by sequential treatment with 5' deadenylase (M0331S, NEB), then RecJf (M0264S, NEB). Reverse transcription was carried out with AF-JA-126 (/5Phos/AGATCGGAAGAGCGTCGTGT/iSp18/CACTCA/iSp18/GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT, Integrated DNA Technologies) as a primer. Circular ligase treatment and PCR were as previously described47.
A second RNA-seq protocol (RNA-seq2) was used to examine RNA levels with small numbers of heterozygote animals (Fig 3C). ~60 L2–L4 mixed gender animals were picked and flash frozen in 50mM NaCl, and RNA extracted with trizol. RNaseH and 94 oligos complementary to ribosomal RNA were used to deplete rRNA from the sample48. Briefly, ~250ng of a cocktail of DNA oligos complementary to rRNA (Supplementary Table 3, ordered from Integrated DNA Technologies) was mixed with ~100ng total RNA in 125mM Tris pH7.4, 250mM NaCl in 8ul. The sample was heat denatured at 95C for 2', then cooled at −0.1C/sec to 45C. 1ul of digestion buffer was added (500mM Tris pH7.4, 1M NaCl, 200mM MgCl2) with 1ul (5 units) Thermostable RNase H (Epicentre), and the sample was incubated at 45C for one hour. DNA oligos were removed by treatment with TURBO DNase (ThermoFisher) at 37C, and RNA was extracted using an equal volume of phenol/chloroform. An RNA-seq library was prepared using the SMARTer Stranded RNA-Seq kit (Clontech Laboratories, Inc.).
Sequencing
Libraries were sequenced on a MiSeq Genome Analyzer (Illumina, Inc.). Reads were mapped to the C. elegans genome (Ensembl70, WBcel215) using STAR (v2.3.149), with the mutated bases of unc-54(cc3389) and unc-54(e1301) masked. For Ribo-seq and RNA-seq1, reads bearing the same last 6 nucleotides (from NNNNNN, added with AF-JA-34.2) were assumed to be PCR duplicates and collapsed to a single read. For RNA-seq2, multiple reads containing the same start and stop mapping positions were collapsed to a single read count to reduce effects of PCR bias. The removal of PCR duplicates with either protocol only affected ~5–10% of reads and did not adversely impact any of the analyses shown. RNA-seq1 and Ribo-seq were performed once for each strain shown in Extended Data 6,7.
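The two duplicate-collapsing rules described above can be sketched in Python as follows. The record type, field names, and function names are hypothetical. One further assumption is made: for Ribo-seq and RNA-seq1, reads are treated as duplicates only if they share both the 6-nt random barcode and the same mapped coordinates, whereas the text itself specifies only the barcode.

```python
from typing import Callable, Hashable, Iterable, List, NamedTuple


class MappedRead(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str
    barcode: str = ""  # 6-nt random sequence contributed by the 3' adapter


def collapse(reads: Iterable[MappedRead],
             key: Callable[[MappedRead], Hashable]) -> List[MappedRead]:
    """Keep the first read seen for each key; later matches are dropped."""
    seen = set()
    kept = []
    for read in reads:
        k = key(read)
        if k not in seen:
            seen.add(k)
            kept.append(read)
    return kept


def collapse_by_barcode(reads: Iterable[MappedRead]) -> List[MappedRead]:
    """Ribo-seq / RNA-seq1 style: same coordinates and same barcode."""
    return collapse(reads, lambda r: (r.chrom, r.start, r.end, r.strand, r.barcode))


def collapse_by_position(reads: Iterable[MappedRead]) -> List[MappedRead]:
    """RNA-seq2 style: same start and stop mapping positions."""
    return collapse(reads, lambda r: (r.chrom, r.start, r.end, r.strand))
```

The barcode-aware rule retains independent molecules that happen to map to identical coordinates, which the position-only rule would merge.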
Genomes and Annotations
While we sought to use the latest genome versions and annotations, we found it prudent to take advantage of the care and time with which other researchers annotated and analyzed earlier versions of genomes. For whole genome alignment of nematode species, C. elegans UCSC genome ce10/WS220 was used. To examine the length of predicted C-terminal extensions upon readthrough (Extended Data 1), genomes and annotations of each of the indicated species were as follows: E. coli Ensembl genome and annotations from assembly GCA_000967155.1.30, S. cerevisiae genome S288c (R57-1-1_20071212) and annotations50, C. elegans UCSC genome (WS190/ce6) and annotations16, H. sapiens Ensembl genome release 83 and annotations from TargetScan v7.051.
Immunoblotting
Animals were boiled in 1× SDS loading buffer (65mM Tris pH 6.8, 10% glycerol, 2% SDS, 2mM PMSF, 1× Halt Protease Inhibitor (Thermo), 10% 2-mercaptoethanol) and run on a 7.5% Criterion TGX gel (Bio-Rad Laboratories, Inc.). Protein was transferred to a low background fluorescence PVDF membrane (Millipore). The membrane was blocked in 3% nonfat milk in 1× PBST with 250mM NaCl. The 5–6 antibody was used at a 1:5000 dilution to detect myo-3, and 5–8 antibody used at a 1:5000 dilution to detect unc-5452. The 5–6 and 5–8 monoclonal antibodies were produced previously by purification of endogenous myosin proteins. Secondary antibody staining was performed with 1:500 Cy3-conjugated affiniPure goat anti-mouse (Jackson Immunoresearch). Imaging was done on a Typhoon Trio (Amersham Biosciences), and quantification done in ImageJ. For the lower blot of Fig 3D, lysates were made from multiple animals, and serial dilutions done to titrate the number of animals per lane.
Acknowledgments
We thank the Fire Lab for critical reading of the manuscript, Christian Frøkjær-Jensen and Karen Artiles for technical expertise, and Tim Schedl, Toshifumi Inada, Claudio Joazeiro, Lorraine Ling, Andrew Nager, and Noah Spies for discussions. Anne Sapiro and Billy Li were instrumental in developing the RNA-seq2 protocol. This work was supported by grants from NIH R01GM37706, T32HG000044 (GTH), 1DP2HD084069-01 (MCB), 5F32GM112474-02 (JAA), Walter and Idun Berry Foundation (ESC), and NSF DGE-114747 (CHL).
Sequencing data is available through SRA (SRP064516).
Footnotes
Author Contributions
JAA, ESC, and AZF designed C. elegans experiments. JAA and ESC conducted C. elegans experiments. NJ developed the RNA-seq2 protocol. JAA performed computational analyses. JAA conducted experiments in human cell lines, as designed and aided by JAA, GTH, CHL, MCB, and AZF. JAA and AZF wrote the paper with help from all authors.
The authors declare no competing financial interests.
References
- 1. Klauer A, van Hoof A. Degradation of mRNAs that lack a stop codon: a decade of nonstop progress. Wiley Interdiscip. Rev. RNA. 2012;3:649–660. doi: 10.1002/wrna.1124.
- 2. Hamby S, Thomas N, Cooper D, Chuzhanova N. A meta-analysis of single base-pair substitutions in translational termination codons ('nonstop' mutations) that cause human inherited disease. Hum. Genomics. 2011;5:241–264. doi: 10.1186/1479-7364-5-4-241.
- 3. Williams I, Richardson J, Starkey A, Stansfield I. Genome-wide prediction of stop codon readthrough during translation in the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 2004;32:6605–6616. doi: 10.1093/nar/gkh1004.
- 4. Falini B, et al. Cytoplasmic nucleophosmin in acute myelogenous leukemia with a normal karyotype. N. Engl. J. Med. 2005;352:254–266. doi: 10.1056/NEJMoa041974.
- 5. Hollingsworth T, Gross A. The severe autosomal dominant retinitis pigmentosa rhodopsin mutant Ter349Glu mislocalizes and induces rapid rod cell death. J. Biol. Chem. 2013;288:29047–29055. doi: 10.1074/jbc.M113.495184.
- 6. Vidal R, et al. A stop-codon mutation in the BRI gene associated with familial British dementia. Nature. 1999;399:776–781. doi: 10.1038/21637.
- 7. Vidal R, et al. A decamer duplication in the 3’ region of the BRI gene originates an amyloid peptide that is associated with dementia in a Danish kindred. Proc. Natl. Acad. Sci. U. S. A. 2000;97:4920–4925. doi: 10.1073/pnas.080076097.
- 8. Pang S, et al. A novel nonstop mutation in the stop codon and a novel missense mutation in the type II 3beta-hydroxysteroid dehydrogenase (3beta-HSD) gene causing, respectively, nonclassic and classic 3beta-HSD deficiency congenital adrenal hyperplasia. J. Clin. Endocrinol. Metab. 2002;87:2556–2563. doi: 10.1210/jcem.87.6.8559.
- 9. Clegg J, Weatherall D, Milner P. Haemoglobin Constant Spring--a chain termination mutant? Nature. 1971;234:337–340. doi: 10.1038/234337a0.
- 10. Namy O, Duchateau-Nguyen G, Rousset J. Translational readthrough of the PDE2 stop codon modulates cAMP levels in Saccharomyces cerevisiae. Mol. Microbiol. 2002;43:641–652. doi: 10.1046/j.1365-2958.2002.02770.x.
- 11. Inada T, Aiba H. Translation of aberrant mRNAs lacking a termination codon or with a shortened 3’-UTR is repressed after initiation in yeast. EMBO J. 2005;24:1584–1595. doi: 10.1038/sj.emboj.7600636.
- 12. Shibata N, et al. Degradation of stop codon read-through mutant proteins via the ubiquitin-proteasome system causes hereditary disorders. J. Biol. Chem. 2015;290:28428–28437. doi: 10.1074/jbc.M115.670901.
- 13. Capone JP, Sharp PA, RajBhandary UL. Amber, ochre, and opal suppressor tRNA genes derived from a human serine tRNA gene. EMBO J. 1985;4:213–221. doi: 10.1002/j.1460-2075.1985.tb02338.x.
- 14. Ahier A, Jarriault S. Simultaneous expression of multiple proteins under a single promoter in Caenorhabditis elegans via a versatile 2A-based toolkit. Genetics. 2014;196:605–613. doi: 10.1534/genetics.113.160846.
- 15. Doronina VA, et al. Site-Specific Release of Nascent Chains from Ribosomes at a Sense Codon. Mol. Cell. Biol. 2008;28:4227–4239. doi: 10.1128/MCB.00421-08.
- 16. Jan CH, Friedman RC, Ruby JG, Bartel DP. Formation, regulation and evolution of Caenorhabditis elegans 3’UTRs. Nature. 2011;469:97–101. doi: 10.1038/nature09616.
- 17. Arribere JA, et al. Efficient Marker-Free Recovery of Custom Genetic Modifications with CRISPR/Cas9 in Caenorhabditis elegans. Genetics. 2014;198:837–846. doi: 10.1534/genetics.114.169730.
- 18. Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS. Genome-Wide Analysis in Vivo of Translation with Nucleotide Resolution Using Ribosome Profiling. Science. 2009;324:218–223. doi: 10.1126/science.1168978.
- 19. Yen H-CS, Xu Q, Chou DM, Zhao Z, Elledge SJ. Global protein stability profiling in mammalian cells. Science. 2008;322:918–923. doi: 10.1126/science.1160489.
- 20. Liebhaber SA, Kan YW. Differentiation of the mRNA transcripts originating from the alpha 1- and alpha 2-globin loci in normals and alpha-thalassemics. J. Clin. Invest. 1981;68:439–446. doi: 10.1172/JCI110273.
- 21. Torabi N, Kruglyak L. Genetic basis of hidden phenotypic variation revealed by increased translational readthrough in yeast. PLoS Genet. 2012;8. doi: 10.1371/journal.pgen.1002546.
- 22. Steneberg P, Samakovlis C. A novel stop codon readthrough mechanism produces functional headcase protein in Drosophila trachea. EMBO Rep. 2001;2:593–597. doi: 10.1093/embo-reports/kve128.
- 23. Freitag J, Ast J, Bölker M. Cryptic peroxisomal targeting via alternative splicing and stop codon read-through in fungi. Nature. 2012;485:522–525. doi: 10.1038/nature11051.
- 24. Eswarappa SM, et al. Programmed Translational Readthrough Generates Antiangiogenic VEGF-Ax. Cell. 2014;157:1605–1618. doi: 10.1016/j.cell.2014.04.033.
- 25. True HL, Lindquist SL. A yeast prion provides a mechanism for genetic variation and phenotypic diversity. Nature. 2000;407:477–483. doi: 10.1038/35035005.
- 26. Waterston RH. A second informational suppressor, SUP-7 X, in Caenorhabditis elegans. Genetics. 1981;97:307–325. doi: 10.1093/genetics/97.2.307.
- 27. Laski FA, Ganguly S, Sharp PA, RajBhandary UL, Rubin GM. Construction, stable transformation, and function of an amber suppressor tRNA gene in Drosophila melanogaster. Proc. Natl. Acad. Sci. U. S. A. 1989;86:6696–6698. doi: 10.1073/pnas.86.17.6696.
- 28. Hudziak RM, Laski FA, RajBhandary UL, Sharp PA, Capecchi MR. Establishment of mammalian cell lines containing multiple nonsense mutations and functional suppressor tRNA genes. Cell. 1982;31:137–146. doi: 10.1016/0092-8674(82)90413-5.
Extended References
- 29. Brenner S. The genetics of Caenorhabditis elegans. Genetics. 1974;77:71–94. doi: 10.1093/genetics/77.1.71.
- 30. Okkema PG, Harrison SW, Plunger V, Aryana A, Fire A. Sequence Requirements for Myosin Gene Expression and Regulation in Caenorhabditis elegans. Genetics. 1993;135:385–404. doi: 10.1093/genetics/135.2.385.
- 31. Granato M, Schnabel H, Schnabel R. pha-1, a selectable marker for gene transfer in C. elegans. Nucleic Acids Res. 1994;22:1762–1763. doi: 10.1093/nar/22.9.1762.
- 32. Mello CC, Kramer JM, Stinchcomb D, Ambros V. Efficient gene transfer in C.elegans: extrachromosomal maintenance and integration of transforming sequences. EMBO J. 1991;10:3959–3970. doi: 10.1002/j.1460-2075.1991.tb04966.x.
- 33. Stinchcomb DT, Shaw JE, Carr SH, Hirsh D. Extrachromosomal DNA transformation of Caenorhabditis elegans. Mol. Cell. Biol. 1985;5:3484–3496. doi: 10.1128/mcb.5.12.3484.
- 34. Mango SE, Lambie EJ, Kimble J. The pha-4 gene is required to generate the pharyngeal primordium of Caenorhabditis elegans. Development. 1994;120:3019–3031. doi: 10.1242/dev.120.10.3019.
- 35. Zhong M, et al. Genome-wide identification of binding sites defines distinct functions for Caenorhabditis elegans PHA-4/FOXA in development and environmental response. PLoS Genet. 2010;6. doi: 10.1371/journal.pgen.1000848.
- 36. Venolia L, Waterston RH. The unc-45 gene of Caenorhabditis elegans is an essential muscle-affecting gene with maternal expression. Genetics. 1990;126:345–353. doi: 10.1093/genetics/126.2.345.
- 37. Ao W, Pilgrim D. Caenorhabditis elegans UNC-45 is a component of muscle thick filaments and colocalizes with myosin heavy chain B, but not myosin heavy chain A. J. Cell Biol. 2000;148:375–384. doi: 10.1083/jcb.148.2.375.
- 38. Hodgkin J, Papp A, Pulak R, Ambros V, Anderson P. A new kind of informational suppression in the nematode Caenorhabditis elegans. Genetics. 1989;123:301–313. doi: 10.1093/genetics/123.2.301.
- 39. Hodgkin JA, Brenner S. Mutations causing transformation of sexual phenotype in the nematode Caenorhabditis elegans. Genetics. 1977;86:275–287.
- 40. Mapes J, Chen J-T, Yu J-S, Xue D. Somatic sex determination in Caenorhabditis elegans is modulated by SUP-26 repression of tra-2 translation. Proc. Natl. Acad. Sci. U. S. A. 2010;107:18022–18027. doi: 10.1073/pnas.1004513107.
- 41. Anderson P, Brenner S. A selection for myosin heavy chain mutants in the nematode Caenorhabditis elegans. Proc. Natl. Acad. Sci. U. S. A. 1984;81:4470–4474. doi: 10.1073/pnas.81.14.4470.
- 42. Eide D, Anderson P. The gene structures of spontaneous mutations affecting a Caenorhabditis elegans myosin heavy chain gene. Genetics. 1985;109:67–79. doi: 10.1093/genetics/109.1.67.
- 43. Feingold E, et al. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science. 2004;306:636–640. doi: 10.1126/science.1105136.
- 44. Pédelacq J-D, Cabantous S, Tran T, Terwilliger TC, Waldo GS. Engineering and characterization of a superfolder green fluorescent protein. Nat. Biotechnol. 2006;24:79–88. doi: 10.1038/nbt1172.
- 45. Hopp TP, et al. A Short Polypeptide Marker Sequence Useful for Recombinant Protein Identification and Purification. Nat. Biotechnol. 1988;6:1204–1210.
- 46. Field J, et al. Purification of a RAS-responsive adenylyl cyclase complex from Saccharomyces cerevisiae by use of an epitope addition method. Mol. Cell. Biol. 1988;8:2159–2165. doi: 10.1128/mcb.8.5.2159.
- 47. Stadler M, Fire A. Wobble base-pairing slows in vivo translation elongation in metazoans. RNA. 2011;17:2063–2073. doi: 10.1261/rna.02890211.
- 48. Morlan JD, Qu K, Sinicropi DV. Selective Depletion of rRNA Enables Whole Transcriptome Profiling of Archival Fixed Tissue. PLoS One. 2012;7:e42882. doi: 10.1371/journal.pone.0042882.
- 49. Dobin A, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
- 50. Nagalakshmi U, et al. The Transcriptional Landscape of the Yeast Genome Defined by RNA Sequencing. Science. 2008;320:1344–1349. doi: 10.1126/science.1158441.
- 51. Agarwal V, Bell GW, Nam JW, Bartel DP. Predicting effective microRNA target sites in mammalian mRNAs. eLife. 2015;4:1–38. doi: 10.7554/eLife.05005.
- 52. Miller DM, Ortiz I, Berliner GC, Epstein HF. Differential localization of two myosins within nematode thick filaments. Cell. 1983;34:477–490. doi: 10.1016/0092-8674(83)90381-1.
- 53. Ross ED, Baxa U, Wickner RB. Scrambled prion domains form prions and amyloid. Mol. Cell. Biol. 2004;24:7206–7213. doi: 10.1128/MCB.24.16.7206-7213.2004.
- 54. Gur E, Sauer RT. Recognition of misfolded proteins by Lon, a AAA+ protease. Genes Dev. 2008;22:2267–2277. doi: 10.1101/gad.1670908.
- 55. Thompson O, et al. The million mutation project: a new approach to genetics in Caenorhabditis elegans. Genome Res. 2013;23:1749–1762. doi: 10.1101/gr.157651.113.
- 56. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982;157:105–132. doi: 10.1016/0022-2836(82)90515-0.