Plant Physiology. 2019 Jun 27;181(1):367–380. doi: 10.1104/pp.19.00541

The Tomato Translational Landscape Revealed by Transcriptome Assembly and Ribosome Profiling

Hsin-Yen Larry Wu, Gaoyuan Song, Justin W Walley, Polly Yingshan Hsu
PMCID: PMC6716236  PMID: 31248964

Ribosome profiling revealed previously unannotated ORFs, elucidated evolutionarily conserved and unique translational features, and identified regulatory mechanisms hidden in the tomato genome.

Abstract

Recent applications of translational control in Arabidopsis (Arabidopsis thaliana) highlight the potential power of manipulating mRNA translation for crop improvement. However, to what extent translational regulation is conserved between Arabidopsis and other species is largely unknown, and the translatome of most crops remains poorly studied. Here, we combined de novo transcriptome assembly and ribosome profiling to study global mRNA translation in tomato (Solanum lycopersicum) roots. Exploiting features corresponding to active translation, we discovered widespread unannotated translation events, including 1,329 upstream open reading frames (uORFs) within the 5′ untranslated regions of annotated coding genes and 354 small ORFs (sORFs) among unannotated transcripts. uORFs may repress translation of their downstream main ORFs, whereas sORFs may encode signaling peptides. Besides evolutionarily conserved sORFs, we uncovered 96 Solanaceae-specific sORFs, revealing the importance of studying translatomes directly in crops. Proteomic analysis confirmed that some of the unannotated ORFs generate stable proteins in planta. In addition to defining the translatome, our results reveal the global regulation by uORFs and microRNAs. Despite diverging over 100 million years ago, many translational features are well conserved between Arabidopsis and tomato. Thus, our approach provides a high-throughput method to discover unannotated ORFs, elucidates evolutionarily conserved and unique translational features, and identifies regulatory mechanisms hidden in a crop genome.


Besides being an essential step in gene expression, mRNA translation directly shapes the proteome, which contributes to cellular structure, function, and activity in all organisms. The characterization of translational regulation in Arabidopsis (Arabidopsis thaliana) has enabled crop improvement, including increasing tomato (Solanum lycopersicum) sweetness, rice (Oryza sativa) immunity, and lettuce (Lactuca sativa) resistance to oxidative stress (Sagor et al., 2016; Xu et al., 2017b; Zhang et al., 2018). However, not all findings from Arabidopsis are applicable to other plants, and how the Arabidopsis translatome compares with those of other species is largely unknown. Moreover, due to limited genomic resources and methods, translational landscapes and their underlying regulation in crops remain understudied.

Ribosome profiling, or Ribo-seq, has emerged as a high-throughput technique to study global translation (Ingolia et al., 2009; Brar and Weissman, 2015; Andreev et al., 2017). In a Ribo-seq experiment, ribosomes in the sample of interest are immobilized and the lysate is treated with nucleases to obtain ribosome-protected mRNA fragments (i.e. ribosome footprints). Finally, sequencing of the ribosome footprints reveals the quantity and positions of ribosomes on a given transcript. Because ribosomes decipher mRNA every three nucleotides, the periodic feature of ribosome footprints can be used to uncover previously unannotated translation events (Bazzini et al., 2014; Fields et al., 2015; Ji et al., 2015; Calviello et al., 2016; Hsu et al., 2016). For example, upstream open reading frames (uORFs) in the 5′ leader sequence or 5′ untranslated region (UTR) have been shown to be widespread in many protein-coding genes in humans (Homo sapiens), mouse (Mus musculus), zebrafish (Danio rerio), yeast (Saccharomyces cerevisiae), and plants (Brar et al., 2012; Liu et al., 2013; Ji et al., 2015; Lei et al., 2015; Chew et al., 2016; Hsu et al., 2016; Johnstone et al., 2016). Several well-characterized examples and global analyses indicate that uORFs can modulate the translation of their downstream main ORFs (Liu et al., 2013; von Arnim et al., 2014; Lei et al., 2015; Chew et al., 2016; Johnstone et al., 2016; Hsu and Benfey, 2018). Moreover, numerous presumed noncoding RNAs have been found to possess translated small ORFs (sORFs), usually below 100 codons (Bazzini et al., 2014; Hsu et al., 2016; Bazin et al., 2017; Ruiz-Orera and Albà, 2019). The small size of the protein products of sORFs suggests that they may serve as signaling peptides (Hsu and Benfey, 2018; Ruiz-Orera and Albà, 2019). 
Despite their importance, uORFs and sORFs are often missing in annotations because computational predictions often assume that (1) protein-coding sequences encode proteins greater than 100 amino acids and (2) only the longest ORF in a transcript is translated (Basrai et al., 1997; Claverie, 1997). Thus, ribosome profiling provides an unparalleled opportunity to experimentally identify translated ORFs genome wide in an unbiased manner.
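The effect of the two annotation assumptions can be illustrated with a minimal sketch. The function below simply enumerates every AUG-initiated ORF on the forward strand of a transcript; it is not RiboTaper and uses no ribosome footprint data. The function name, the coordinate convention, and the toy sequence in the usage note are assumptions made for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return every ATG-initiated ORF on the forward strand.

    Each ORF is reported as (start, end, n_codons), where start and end are
    0-based, end-exclusive transcript coordinates that include the stop codon,
    and n_codons counts sense codons only (start included, stop excluded).
    ORFs without an in-frame stop codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG until a stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, pos + 3, n_codons))
                break
    return orfs
```

For a toy transcript such as GGATGAAATAACCATGGCTGCTGCTTGAGG, this returns a two-codon ORF followed by a four-codon ORF. A rule that keeps only the longest ORF would report the second and discard the first, which is how a uORF in a 5′ UTR can go unannotated.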

In plant research, ribosome profiling has been used to study translational regulation in diverse aspects of plant development and response to stress, including photomorphogenesis, chloroplast differentiation, cotyledon development, hypoxia, hormone responses, nutrient deprivation, drought, pathogen responses, and biogenesis of small interfering RNAs (Liu et al., 2013; Zoschke et al., 2013; Juntawong et al., 2014; Lei et al., 2015; Merchante et al., 2015; Chotewutmontri and Barkan, 2016; Li et al., 2016; Bazin et al., 2017; Xu et al., 2017a; Shamimuzzaman and Vodkin, 2018). We previously optimized the resolution of this technique to resolve three-nucleotide periodicity, which enabled us to precisely define translated regions within individual transcripts, in Arabidopsis. As a result, we were able to identify previously unannotated translation events, including usage of non-AUG start sites, uORFs in 5′ UTRs, and sORFs in annotated noncoding RNAs (Hsu et al., 2016). To date, systematically identifying translated ORFs in plants has only been attempted in Arabidopsis (Hsu et al., 2016; Bazin et al., 2017).

Tomato is the most widely cultivated vegetable worldwide (Schwarz et al., 2014). It belongs to the Solanaceae, whose members produce important foods, spices, and medicines. Like other crops, tomato has limited genomic resources and lacks optimized methods. For instance, the latest annotation, from the International Tomato Annotation Group (ITAG), ITAG 3.2 for cv Heinz 1706, only contains predicted protein-coding genes, whereas noncoding RNAs and uORFs are not included (Fernandez-Pozo et al., 2015). We chose seedling roots to establish the protocol for translatomic analysis for two reasons: (1) the root plays an essential role in water/nutrient uptake as well as interaction between plants and other organisms or the environment; and (2) the root is composed of diverse cell types, which is beneficial for surveying translation events, as we observed in our previous work in Arabidopsis seedlings (Hsu et al., 2016). Here, we performed ribosome profiling in combination with de novo transcriptome assembly to discover noncoding RNAs, uORFs, and sORFs and chart the translational landscape in tomato roots. The mapping and quantification of ribosome footprints in tomato not only uncovered numerous unannotated translation events but also revealed global features involved in translational regulation.

RESULTS

Establishment of an Experimental and Data Analysis Pipeline to Map the Tomato Translatome

To map actively translated ORFs, we isolated the roots of tomato seedlings (cv Heinz 1706) and performed strand-specific RNA sequencing (RNA-seq) and Ribo-seq in parallel (Fig. 1, A and B). RNA-seq reveals transcript identity and abundance, whereas Ribo-seq maps and quantifies ribosome occupancy on a given transcript (Brar and Weissman, 2015). We adapted our protocol and pipeline for Arabidopsis (Hsu et al., 2016) with two major modifications: (1) we increased the amount of RNase I used in tomato ribosome footprinting to achieve comparable resolution (see “Materials and Methods” for details); and (2) we performed paired-end 100-bp RNA-seq followed by reference-guided de novo transcriptome assembly to capture transcripts missing from the ITAG3.2 reference annotation (Fig. 1C; see “Materials and Methods” for details). This strategy allowed us to map the translated regions in both annotated and previously unannotated transcripts in an unbiased manner using the ORF-finding tool, RiboTaper (Calviello et al., 2016).

Figure 1.

Experimental and data analysis procedures for ribosome profiling in tomato roots. A, Four-day-old tomato seedling roots (∼3 cm from the tip) were used in this study. B, Experimental workflow for RNA-seq and Ribo-seq and schematics of their expected read distributions in the three reading frames. This figure was adapted from Hsu et al. (2016). C, Data analysis workflow for reference-guided de novo transcriptome assembly and ORF discovery using RiboTaper.

As the quality of ribosome footprints is critical for finding ORFs (Hsu et al., 2016), we first systematically evaluated the Ribo-seq results by mapping the reads to the ITAG3.2 annotation. Consistent with observations in nonplant organisms and Arabidopsis (Ingolia et al., 2009; Bazzini et al., 2014; Hsu et al., 2016), the dominant ribosome footprints in tomato were 28 nucleotides long (Fig. 2A). Moreover, in contrast to RNA-seq, the Ribo-seq reads predominantly mapped to the annotated coding sequences (CDSs) and were sparse in the 5′ UTRs and 3′ UTRs (Fig. 2, B and C). The three biological replicates were highly correlated, as indicated by the Pearson correlation, in both Ribo-seq (r = 0.998–1) and RNA-seq (r = 0.998–0.999; Supplemental Fig. S1, A and B). Overall, the RNA-seq and Ribo-seq data sets also showed a strong positive correlation (Pearson correlation after removing two extreme outliers, r = 0.878–0.88; Spearman correlation with all data points, ρ = 0.912–0.915; Supplemental Fig. S1, C–F). Most importantly, the distribution of ribosome footprints within the CDS displayed clear three-nucleotide periodicity, a signature of translating ribosomes that decipher three nucleotides at a time (Fig. 2C; Supplemental Fig. S2). Analyzing the distribution of footprints relative to the annotated translation start/stop sites allowed us to infer that the codon at the P-site within the ribosome is located between nucleotides 13 and 15 for 28-nucleotide footprints, with the corresponding positions inferred separately for the other footprint lengths (Supplemental Figs. S2 and S3). To visualize the position of the codon being translated, hereafter we use the first nucleotides of the P-sites (denoted as P-site signals) to indicate the positions of the footprints on the transcripts (Fig. 2C). The robustness of the three-nucleotide periodicity can be quantified based on the percentage of reads in the expected reading frame (shown in red in Fig. 2C and hereafter). 
At a global level, our 28-nucleotide footprints resulted in 85.5% in-frame reads. Together, these results demonstrate that our tomato Ribo-seq data set is of high quality compared with data sets from plants and other organisms (Bazzini et al., 2014; Guydosh and Green, 2014; Chung et al., 2015; Schafer et al., 2015; Hsu et al., 2016).
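The in-frame percentage described above can be sketched as follows. This is an illustrative calculation, not the RiboTaper implementation: the function names and the 0-based coordinate convention are assumptions, and only the offset for 28-nucleotide footprints (P-site at nucleotides 13 to 15, hence a 0-based offset of 12 from the 5′ end of the read) is taken from the text.

```python
from collections import Counter

# 0-based offset of the first P-site nucleotide from the 5' end of a footprint.
# Only the 28-nt value follows from the text (P-site at nucleotides 13-15);
# offsets for other footprint lengths would have to be inferred from the data.
PSITE_OFFSET = {28: 12}


def psite_position(read_start, read_length):
    """Transcript coordinate (0-based) of the first P-site nucleotide of a read."""
    return read_start + PSITE_OFFSET[read_length]


def frame_fractions(psite_positions, cds_start, cds_end):
    """Fraction of P-site signals in frames 0, 1, and 2 within [cds_start, cds_end).

    Frame 0 is the frame of the annotated start codon (the expected frame).
    """
    frames = Counter(
        (p - cds_start) % 3 for p in psite_positions if cds_start <= p < cds_end
    )
    total = sum(frames.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(frames[f] / total for f in (0, 1, 2))
```

With this convention, a 28-nucleotide read whose P-site sits on the start codon begins 12 nucleotides upstream of it. Applied to all 28-nucleotide footprints in a data set like the one described here, the first element of the returned tuple would correspond to the reported in-frame fraction of 85.5%.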

Figure 2.

Ribosome footprints are enriched in coding sequences and display strong three-nucleotide periodicity. A, Distribution of read length of the ribosome footprints. nt., Nucleotides. B, Distribution of the Ribo-seq and RNA-seq reads in different genomic features annotated in ITAG3.2. C, Meta-gene analysis of the 28-nucleotide ribosome footprints near the annotated translation start and stop sites defined by ITAG3.2. The red, blue, and green bars represent reads mapped to the first (expected), second, and third reading frames, respectively. The majority of footprints were mapped to the CDS in the expected reading frame (85.5% in frame). For each read, only the first nucleotide in the P-site was plotted (for details, see Supplemental Figs. S2 and S3). The A-site (aminoacyl-tRNA entry site), P-site (peptidyl-tRNA formation site), and E-site (uncharged tRNA exit site) within the ribosomes at translation initiation and termination, and the inferred P-site (nucleotides 13–15) and A-site (nucleotides 16–18), are illustrated. The original meta-plots generated by RiboTaper for all footprint lengths are shown in Supplemental Figure S2.

Next, we performed reference-guided de novo transcriptome assembly for the RNA-seq data using stringtie, a transcript assembler (Pertea et al., 2015). Then, the newly assembled transcriptomes from the replicates were merged and compared with the ITAG3.2 annotations using gffcompare software (Pertea et al., 2016; Fig. 1C). In total, we uncovered 2,263 unannotated transcripts that could potentially encode novel proteins. These transcripts could be classified into six groups based on their strands and genomic positions relative to existing gene features, such as intergenic (class u), cis-natural antisense transcripts (class x), intronic (class i), and others (class y and class o; Fig. 3, A and C); the nomenclature and descriptions of these discovered transcripts are adapted based on the gffcompare software (Pertea et al., 2016). Class s is expected to result from mapping errors (Pertea et al., 2016) and was included in our downstream analysis as a negative control. The most abundant classes of uncharacterized transcript in our data were intergenic transcripts (class u; 1,260) and cis-natural antisense transcripts (class x; 568). All six classes of uncharacterized transcripts, along with the annotated genes in ITAG3.2, were used to find translated ORFs.
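Tallying transcripts by class can be sketched as below. The sketch assumes that the comparison step produces a GTF file in which each transcript record carries a class_code attribute, as in the annotated GTF written by gffcompare; the function name and the choice to count only transcript records are assumptions for illustration, not a description of the scripts used in this study.

```python
import re
from collections import Counter

CLASS_CODE = re.compile(r'class_code "([^"]+)"')

# The six classes of unannotated transcripts considered in the text.
CLASSES_OF_INTEREST = {"u", "x", "i", "y", "o", "s"}


def count_class_codes(gtf_lines):
    """Count transcript records per class code in GTF-formatted lines.

    Only records whose feature column is 'transcript' are counted, so that
    exon records do not inflate the totals.
    """
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        match = CLASS_CODE.search(fields[8])
        if match and match.group(1) in CLASSES_OF_INTEREST:
            counts[match.group(1)] += 1
    return counts
```

Passing an open file handle for such a GTF file to count_class_codes would yield per-class totals comparable to those summarized in Figure 3C, for example the 1,260 class u and 568 class x transcripts reported above.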

Figure 3.

The translational landscape of the tomato root. A, Classes of newly assembled transcripts identified by stringtie and gffcompare and used in downstream ORF identification. This figure was adapted from the gffcompare Web site (Pertea et al., 2016). B, Summary of translated ORFs identified by RiboTaper in our data set and peptide support from mass spectrometry (MS) data. The uORFs and annotated ORFs were identified from the 5′ UTRs and expected CDSs of annotated protein-coding genes in ITAG3.2, respectively. The previously unknown ORFs were identified from the newly assembled transcripts. The bottom row indicates the number of proteins in each category supported by MS data sets, either from our own proteomic analysis or searches against publicly available data. C, Summary of newly assembled transcripts and ORFs identified in each class of newly assembled transcripts. The total number of transcripts, number of transcripts identified as translated, and total number of translated ORFs are listed. D, Size distribution of each class of sORFs, uORFs, and annotated ORFs (aORFs). E, Predicted subcellular localization of proteins encoded by the sORFs. The prediction was performed using TargetP (Emanuelsson et al., 2000) with specificity 0.9 as a cutoff. F, Translation efficiency of sORFs compared with annotated ORFs. Only the coding regions were used to compute the TPM and translation efficiency of each transcript. For the x axis, only the range from 0 to 3 (arbitrary units) is shown. A two-sample Kolmogorov-Smirnov test was used to determine statistical significance. 
G to J, RNA-seq coverage and Ribo-seq periodicity in different genes: an intergenic sORF on chromosome 4 (G); an annotated coding gene that has good support from the Ribo-seq data for the predicted gene model (H); a misannotated ORF (I), note that the Ribo-seq reads do not match the CDS in the gene model and a different reading frame is used; a transcript with a potentially overlapping ORF within the annotated ORF (J). In G to J, the x axis indicates the genomic coordinate of the gene and the y axis shows the normalized read count (counts per hundred million reads). Ribo-seq reads are shown by plotting the first nucleotide of their P-sites (denoted as the P-site signals). The black and gray dashed vertical lines mark the predicted translation start and stop sites, respectively. The red, blue, and green lines in the Ribo-seq plot indicate the P-site signals mapped to the first (expected) reading frame and the second and third reading frames, respectively; the gray lines indicate the P-site signals mapped to outside of the annotated or identified coding regions. Hence, a higher ratio of red means better three-nucleotide periodicity. For the gene model beneath the Ribo-seq data, the gray, black, and white areas indicate the 5′ UTR, CDS, and 3′ UTR, respectively. In J, the yellow box above the gene model indicates the region with a potential ORF overlapping with the annotated ORF.

Translational Landscape of Tomato Roots as Defined by Ribosome Profiling

After collecting the transcript information, we used RiboTaper (Calviello et al., 2016) to interrogate both the annotated transcripts in ITAG3.2 and the newly assembled transcripts to search for all possible ORFs in the transcriptome. RiboTaper examines the P-site signals within each possible ORF and tests whether the signals display a statistically significant three-nucleotide periodicity (Calviello et al., 2016). As a quality control, we first examined translated ORFs detected at annotated coding regions. In total, 20,659 annotated ORFs were identified as translated in our data set (Fig. 3B; Supplemental Data Set S1A). Among 20,285 annotated protein-coding transcripts that have reasonable transcript levels (transcripts per million [TPM] > 0.5 in RNA-seq), 18,626 (92%) have translated ORFs identified. This indicates that our approach to identifying translated ORFs is efficient and robust. In addition to annotated ORFs, there were 1,329 unannotated uORFs translated from the 5′ UTR of annotated genes (Fig. 3B; Supplemental Data Sets S1B and S2). Notably, since only approximately half of the transcripts in ITAG3.2 (17,684 out of 35,768) have an annotated 5′ UTR and because RiboTaper can only identify ORFs in defined transcript ranges, the number of uORFs reported here clearly underestimates the total in tomato roots.

Excitingly, we identified 354 unannotated translated ORFs from the newly assembled transcripts (Fig. 3B; Supplemental Data Sets S1C and S3). These unannotated ORFs were found in different classes of transcripts, but none were detected in the negative control, class s (Fig. 3C). As expected, most of the newly discovered ORFs were relatively small; ∼71% of them (250) encode proteins of less than 100 amino acids (Fig. 3D). Due to their relatively small size, hereafter we call them small ORFs (sORFs). The average lengths of the uORFs, sORFs, and annotated ORFs are 31, 95, and 422 amino acids, respectively. Among the 354 sORFs, 87 have a predicted signal peptide and are expected to be secreted proteins/peptides (Fig. 3E; Supplemental Data Set S1D). To test if the sORFs and annotated ORFs have similar translational properties, we compared their translation efficiency (see the definition in “Materials and Methods”) and found that they were statistically indistinguishable (Fig. 3F). This result supports the newly identified sORFs as genuine protein-coding genes in the tomato genome.
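The comparison of translation efficiency can be made concrete with a short sketch. The formal definition is given in "Materials and Methods"; the code below assumes the commonly used form, in which translation efficiency is the Ribo-seq TPM of a coding region divided by its RNA-seq TPM. The function names and the optional expression floor are hypothetical and serve only to avoid dividing by very small values.

```python
def tpm(counts, lengths):
    """Transcripts per million from read counts and feature lengths.

    counts and lengths are dicts keyed by the same feature identifiers;
    lengths are in nucleotides (here, coding regions only).
    """
    rates = {g: counts[g] / lengths[g] for g in counts}
    scale = sum(rates.values())
    return {g: rate / scale * 1e6 for g, rate in rates.items()}


def translation_efficiency(ribo_tpm, rna_tpm, min_rna_tpm=0.0):
    """Ribo-seq TPM divided by RNA-seq TPM (assumed definition).

    Features with RNA-seq TPM at or below min_rna_tpm are skipped; this
    floor is an illustrative guard, not a threshold taken from the study.
    """
    return {
        g: ribo_tpm[g] / rna_tpm[g]
        for g in rna_tpm
        if g in ribo_tpm and rna_tpm[g] > min_rna_tpm
    }
```

Computing these values separately for sORFs and annotated ORFs gives the two distributions that are compared in Figure 3F.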

The majority of the identified ORFs have high fractions of P-site signals mapped to the expected reading frame (Supplemental Fig. S4). Visualizing the profiles of individual transcripts confirmed that both the sORFs and numerous annotated ORFs display strong three-nucleotide periodicity within the identified coding regions (Fig. 3, G and H). Therefore, by combining the high-quality Ribo-seq data with RiboTaper analysis, we not only validated many of the annotated gene models but also discovered new ORFs. These previously unannotated translated regions have been compiled and are ready to be incorporated into the official tomato annotation (Supplemental Data Sets S1, A–C, S2, and S3).

Evolutionarily Conserved and Solanaceae-Specific sORFs

Previously, we identified 27 sORFs in Arabidopsis by applying RiboTaper on Ribo-seq data (Hsu et al., 2016). Eight of the Arabidopsis sORFs have known tomato homologs. Our tomato root data showed that seven of the conserved sORFs were both transcribed and translated (Supplemental Fig. S5, A–D). Since Arabidopsis and tomato diverged approximately 100 million years ago (Ku et al., 2000), our data support the conclusion that some sORFs have been conserved over this evolutionary distance.

If the newly identified tomato sORFs encode proteins for conserved biological processes, we would expect them to be preserved during evolution. We performed TBLASTN using 157 single-exon sORFs that were 16 to 100 amino acids long on 10 diverse plant genomes, including a wild tomato (Solanum pennellii), potato (Solanum tuberosum, which belongs to the same family as tomato, the Solanaceae), four dicots in other families, two monocots, a lycophyte, and a moss (Supplemental Fig. S6). In total, we found 96 Solanaceae-specific sORFs, including 18 sORFs unique to tomato and 78 sORFs shared by tomato and either wild tomato or potato. Out of 157 sORFs analyzed, 139 of them have homologs in at least one other plant genome. Some of the sORFs are highly conserved across these 10 genomes (Supplemental Fig. S6), suggesting the functional significance of these sORFs throughout evolution. Importantly, the conserved patterns among the homologs correlate well with their phylogenic relationships, indicating that these sORF homologs are unlikely to be false positives that randomly occurred in the BLAST search. While some sORFs are widely conserved, 96 sORFs are unique to Solanaceae, highlighting that our approach to study translatomes directly in tomato revealed translational events that were impossible to learn about by studying Arabidopsis alone. Taken together, our results reveal both evolutionarily conserved and Solanaceae-specific sORFs.
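The grouping of sORFs by conservation can be sketched as a simple classification over homology search results. The sketch assumes that the TBLASTN hits have already been filtered and reduced to the set of species with a homolog for each sORF; the species labels, category names, and function names are illustrative assumptions.

```python
# The two Solanaceae genomes searched besides cultivated tomato.
SOLANACEAE_RELATIVES = {"S. pennellii", "S. tuberosum"}


def classify_sorf(species_with_hits):
    """Assign one conservation category to a tomato sORF."""
    hits = set(species_with_hits)
    if not hits:
        return "tomato-only"
    if hits <= SOLANACEAE_RELATIVES:
        return "shared within Solanaceae"
    return "conserved beyond Solanaceae"


def summarize(hit_table):
    """Count sORFs per category; hit_table maps sORF id -> species with hits."""
    summary = {
        "tomato-only": 0,
        "shared within Solanaceae": 0,
        "conserved beyond Solanaceae": 0,
    }
    for hits in hit_table.values():
        summary[classify_sorf(hits)] += 1
    # Solanaceae-specific sORFs are those with no homolog outside the family.
    summary["Solanaceae-specific"] = (
        summary["tomato-only"] + summary["shared within Solanaceae"]
    )
    return summary
```

Under this scheme, the numbers reported above fit together as follows: 18 tomato-only sORFs plus 78 shared with wild tomato or potato give the 96 Solanaceae-specific sORFs, and 157 minus the 18 tomato-only sORFs gives the 139 with a homolog in at least one other genome.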

Some sORFs and uORFs Generate Stable Proteins in Planta

To evaluate whether the previously unknown ORFs, including sORFs and uORFs, accumulate stable proteins in planta and to validate our Ribo-seq results, we performed a proteogenomic analysis (Walley and Briggs, 2015) to identify novel peptides arising from these unannotated ORFs. Because the sORFs and uORFs are quite small, their protein products do not always generate peptides with ideal size and/or mass-to-charge ratios that are suitable for detection by MS. To increase the diversity of peptides for MS analysis, we extracted proteins from the roots and shoots of tomato seedlings and digested the proteins into peptides using trypsin or GluC, independently, prior to two-dimensional liquid chromatography-tandem mass spectrometry (MS/MS). As the sORFs and uORFs are currently missing from the tomato annotation, we created a custom protein database (Supplemental Data Set S4) derived from our Ribo-seq data to assist in identifying these unannotated proteins. In addition, we used our custom protein database to search publicly available proteomic data from the tomato fruit (ProteomeXchange PXD004887) and pericarp (ProteomeXchange PXD004947; Mata et al., 2017; Szymanski et al., 2017). In total, we identified 12,172 proteins, including 29 sORFs and 30 uORFs, with at least one unique peptide from these six proteomic data sets (Fig. 3B; Supplemental Data Sets S1, E and F, and S5, A–C). The MS detection rates (at least one unique peptide) for sORFs below 100 amino acids, 100 to 200 amino acids, and higher than 200 amino acids in our data are 4.8%, 16.3%, and 35.3%, respectively, suggesting that proteins with a larger size have better chances to be detected by MS. Despite the limitations of MS in small protein identification, our results support that some uORFs and sORFs accumulate stable proteins in planta.

Ribo-Seq Fine-Tunes and Improves Genome Annotation

Comparing the RiboTaper output and the annotated gene models, we found cases in which the translated ORFs were dramatically different from the predicted gene models. For example, translation may occur in a different reading frame or at a distinct region on the transcript (Fig. 3I; Supplemental Fig. S7, A–F). Thus, Ribo-seq provides a high-throughput experimental approach to validate and improve genome annotation. Furthermore, in several cases, using visual inspection, we found regions that appear to contain a short ORF that overlaps with the long annotated ORF but uses a different reading frame (Fig. 3J). These overlapping ORFs are similar to nonupstream coding ORFs identified in the human genome (Michel et al., 2012), and their functional importance is still unknown.

The translation start sites in the genome annotation are typically defined computationally, and often the most upstream AUG is predicted to be the start codon. Unexpectedly, in 64 genes, the RiboTaper-defined translation start sites were actually upstream of the annotated start sites (Fig. 4A; Supplemental Data Set S1G). In contrast, some ORFs appeared to use start sites downstream of the annotated start sites (Fig. 4B; Supplemental Fig. S5B). Currently, ITAG3.2 contains only one isoform per gene, and hence only one transcription start site is predicted per gene. It is possible that, in some cases, translation starts downstream of the annotated site, because transcription initiates downstream of the annotated transcription start site. Nonetheless, it appears that the most upstream AUG is not always used as the translation start site.

Figure 4.

Upstream/downstream start sites and non-AUG start sites. A and B, Examples of the usage of an upstream start site (A) or a downstream start site (B). The gene model and data presentation are the same as those described in the legend of Figure 3. The blue triangles mark the locations of the annotated translation start sites. The orange triangles mark the locations of the RiboTaper-identified translation start sites. C, A tomato homolog of an Arabidopsis gene that was predicted to use an upstream CUG start site (orange triangle). Note the abundant in-frame P-site signals upstream of the annotated AUG start (blue triangle) in the 5′ UTR. D, Conservation of potential CUG/non-AUG start sites. The Arabidopsis gene identifier, tomato gene identifier, percentage amino acid identity, and number of in-frame P-site positions with Ribo-seq reads within the first 20 codons upstream of the AUG in our tomato root data are shown.

Non-AUG translation initiation has been discovered in animals and plants (Simpson et al., 2010; Laing et al., 2015; Kearse and Wilusz, 2017; Spealman et al., 2018). Twelve evolutionarily conserved noncanonical translation starts upstream of the most likely AUG have been predicted in Arabidopsis (Simpson et al., 2010), and we previously showed that at least one of them, in AT3G10985, has high Ribo-seq coverage using a CUG codon (Hsu et al., 2016). The profile of the tomato homolog of AT3G10985 confirmed the possible usage of the CUG start site (Fig. 4C). Next, we identified tomato homologs of all 12 predicted noncanonical-start genes and systematically checked their Ribo-seq coverage upstream of the annotated AUG start sites. We selected genes that met the following criteria: (1) the Ribo-seq reads cover at least seven in-frame P-site positions within the first 20 codons upstream of the AUG; and (2) there is no stop codon within the first 20 codons upstream of the AUG. We found that eight tomato genes that met the above criteria contain abundant reads upstream of the annotated AUG, suggesting that they use non-AUG start sites (Fig. 4D). Thus, despite the evolutionary distance between Arabidopsis and tomato, the usage of noncanonical translation initiation remains conserved in these homologs.
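The two selection criteria can be expressed as a short check. This is a sketch under stated assumptions rather than the code used in the study: the function name, the 0-based coordinates, the representation of P-site signals as a position-to-count dictionary, and the decision to reject transcripts whose 5′ UTR is shorter than 20 codons are all illustrative choices.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def supports_upstream_start(transcript, aug_index, psite_counts,
                            n_codons=20, min_positions=7):
    """Apply both criteria to the n_codons in-frame codons upstream of the AUG.

    transcript   : mRNA sequence (DNA or RNA alphabet)
    aug_index    : 0-based index of the A in the annotated AUG
    psite_counts : dict mapping 0-based transcript position -> P-site signal count
    """
    seq = transcript.upper().replace("U", "T")
    window_start = aug_index - 3 * n_codons
    if window_start < 0:
        # Assumption: a 5' UTR shorter than the window cannot be evaluated.
        return False
    # First nucleotide of each upstream codon, in frame with the annotated AUG.
    codon_starts = range(window_start, aug_index, 3)
    has_stop = any(seq[i:i + 3] in STOP_CODONS for i in codon_starts)
    covered = sum(1 for i in codon_starts if psite_counts.get(i, 0) > 0)
    return (not has_stop) and covered >= min_positions
```

Only P-site signals that fall on the first nucleotide of an in-frame upstream codon are counted, so out-of-frame signal in the 5′ UTR, such as that produced by an overlapping uORF in another frame, does not satisfy criterion (1).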

uORFs Regulate Translation Efficiency

Using RiboTaper, we identified 1,329 translated uORFs based on their significant three-nucleotide periodicity (Fig. 3B; Supplemental Data Sets S1B and S2). These uORFs included previously predicted conserved uORFs in the tomato SAC51 homolog (Fig. 5A; Imai et al., 2006) as well as previously unknown uORFs in numerous coding genes (Fig. 5B). Manual inspection of these transcripts suggested that the high stringency of RiboTaper might miss uORFs with lower periodicity, overlapping uORFs, and non-AUG-start uORFs. For example, the second of the three uORFs in the tomato SAC51 transcript (Fig. 5A) was not identified as coding by RiboTaper, presumably due to the imperfect periodicity in this area. Nevertheless, those identified are high-confidence translated uORFs.

Figure 5.

uORFs repress translation efficiency of their downstream main ORFs and contain less-pronounced Kozak sequences. A and B, Profiles of genes containing conserved uORFs (A) or a previously uncharacterized uORF (B). The gene model and data presentation are the same as those described in the legend of Figure 3. The uORFs are labeled with yellow and orange boxes in the gene models. For the uORFs, the orange and green dashed vertical lines mark the translation start and stop sites, respectively. C, The translation efficiency of the main ORFs for transcripts containing a different number of translated uORFs. Only the coding regions were used to compute the TPM and translation efficiency of each transcript. The colored bars before the P values indicate the pairs of data used to determine statistical significance. The P values were determined with two-sample Kolmogorov-Smirnov tests. A.U., Arbitrary units. D, Selected nonredundant GO categories for genes containing one or more uORFs. FDR, False discovery rate. E and F, Kozak sequences of annotated ORFs, uORFs, and uORF-associated main ORFs. The statistical significance in F was determined using χ2 tests.

Global analyses have reported that translated uORFs repress the translation of their downstream main ORFs (Liu et al., 2013; Lei et al., 2015; Chew et al., 2016; Johnstone et al., 2016). Consistent with these reports, we found that globally, transcripts containing uORFs have lower translation efficiency than those without uORFs (Fig. 5C). In addition, more uORFs in a transcript correlate with stronger translational repression (Fig. 5C). To investigate which physiological pathways might be regulated by uORFs, we checked the Gene Ontology (GO) terms of the uORF-containing genes. Intriguingly, uORF-containing genes were enriched for protein kinases and phosphatases as well as signal transduction (Fig. 5D). This is similar to a previous prediction in Arabidopsis (Kim et al., 2007), except that transcription factors are not enriched in our data. Our results imply that substantial portions of the protein phosphorylation/dephosphorylation and signal transduction pathways in tomato are likely translationally regulated through uORFs.

Translation start sites have a well-defined Kozak consensus sequence in different organisms (Kozak, 1987; Lütcke et al., 1987). For example, the conserved nucleotides at positions −3 and +4 of the Kozak sequence in plants are purines (A/G) and G, respectively (Lütcke et al., 1987). As expected, we observed this conserved pattern among the annotated ORFs (Fig. 5E). Next, we examined the Kozak consensus sequences of the translated uORFs and their downstream main ORFs. Whereas the downstream main ORFs also favor the conserved nucleotides at −3 and +4 of the Kozak sequence, this pattern is missing in the uORFs (Fig. 5, E and F). Similar results were observed in the Kozak sequences of the uORFs and downstream main ORFs in Arabidopsis (Liu et al., 2013). The poorly conserved Kozak sequences might allow for more leaky scanning, a phenomenon in which a weak initiation context is sometimes skipped by the ribosome during translation initiation, so the downstream main ORFs could still have some chances to be translated.
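The two Kozak positions discussed above can be read from a transcript with a minimal helper. The sketch uses the standard numbering in which the A of the AUG is +1 and the nucleotide immediately upstream is −1; the function name and return format are assumptions for illustration.

```python
def kozak_context(seq, start_index):
    """Report whether -3 is a purine and +4 is G for a given start codon.

    seq         : transcript sequence (DNA or RNA alphabet)
    start_index : 0-based index of the first nucleotide of the start codon

    Positions follow the usual convention: the first nucleotide of the start
    codon is +1 and the base immediately upstream is -1 (no position 0).
    """
    seq = seq.upper().replace("U", "T")
    if start_index < 3 or start_index + 3 >= len(seq):
        raise ValueError("sequence does not cover positions -3 to +4")
    minus3 = seq[start_index - 3]
    plus4 = seq[start_index + 3]
    return {"minus3_purine": minus3 in "AG", "plus4_G": plus4 == "G"}
```

Tabulating these two values across the start codons of annotated ORFs, uORFs, and uORF-associated main ORFs gives the kind of comparison summarized in Figure 5, E and F.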

Regulation of Gene Expression by MicroRNAs

MicroRNAs regulate gene expression through mRNA cleavage and translational repression (Yu et al., 2017; Li et al., 2018). The roles of microRNAs in tomato are less well understood than in Arabidopsis. We first predicted 6,312 microRNA target genes in tomato (Supplemental Data Set S1, H and I) using psRNATarget (Dai et al., 2018). Next, we compared their RNA-seq and Ribo-seq levels and the translation efficiency of the microRNA targets and other coding genes globally. The transcript levels of the microRNA targets were slightly but significantly reduced, consistent with the possibility that microRNAs regulate gene expression through mRNA cleavage (Fig. 6A). In addition, both the Ribo-seq levels and translation efficiency of the microRNA target genes were reduced (Fig. 6, B and C), consistent with prior observations of translational repression mediated by microRNAs (Faghihi and Wahlestedt, 2009). Thus, our results suggest that globally, microRNAs regulate gene expression at both the transcript and translational levels in tomato.

Figure 6.

Regulation of gene expression by microRNAs (miRNAs). Cumulative distributions are shown for RNA-seq (A), Ribo-seq (B), and translation efficiency (TE; C) of microRNA targets and nonmicroRNA target genes. For the x axis in A and B, only the range from 0 to 50 (TPM) is shown. Only the coding regions were used to compute the TPM and translation efficiency of each transcript. The P values were determined with two-sample Kolmogorov-Smirnov tests. A.U., Arbitrary units.

DISCUSSION

Most of the plant research on mRNA translation was performed in Arabidopsis, and the knowledge has been transferred into several crops to improve crop performance. However, on a genome-wide level, it is unclear how well the Arabidopsis translatome compares with other species. In this study, we combined de novo transcriptome assembly and ribosome profiling to study the tomato translatome. We found that despite Arabidopsis and tomato diverging over 100 million years ago, many translational features are well conserved. Overall, we observed shared features between our Arabidopsis and tomato Ribo-seq data, including the most abundant ribosome footprint size and the inferred P-site within ribosome footprints. We found that previously unannotated translation events, such as uORFs and sORFs, are also widespread in tomato. In addition, we observed that usage of non-AUG translation start sites is shared between Arabidopsis and tomato. Finally, translational regulatory mechanisms, including uORFs on their downstream main ORFs and microRNAs on their target genes, are also well conserved in these two species.

Interestingly, we discovered 96 previously unknown sORFs only present in Solanaceae, including 78 shared by tomato and either wild tomato or potato and 18 sORFs uniquely found in tomato. These family-specific sORFs may provide functions unique to Solanaceae. The idea of family-specific regulatory molecules was proposed based on systemin, the first peptide hormone identified in plants. Systemin is only present in Solaneae, a subtribe of the Solanaceae (Pearce et al., 1991; Constabel et al., 1998). Such family- or subfamily-specific regulatory molecules may have arisen during the evolution of a specific plant lineage. Even species-specific sORFs have been proposed to be important (Andrews and Rothnagel, 2014). The functions of the widely conserved and Solanaceae-specific sORFs require further studies.

Peptide signaling is crucial for cell-cell communication in numerous aspects of plant development and stress responses (Tavormina et al., 2015; Hsu and Benfey, 2018). We found 87 sORFs that encode potential secreted peptides. However, as about 50% of secreted proteins in plants lack a well-defined signal peptide (Agrawal et al., 2010), some sORFs without a predicted signal peptide may still be secreted. In addition, sORF products without a signal peptide have been found to play an important role in a wide range of physiological processes in plants, such as vegetative and reproductive development, small interfering RNA biogenesis, and stress tolerance (Casson et al., 2002; Blanvillain et al., 2011; Ikeuchi et al., 2011; Valdivia et al., 2012; De Coninck et al., 2013). Therefore, the identification of sORFs using ribosome profiling facilitates potential applications of these peptides in improving crop performance.

Several studies have illustrated the power of altering mRNA translation via uORFs to improve agriculture (Sagor et al., 2016; Xu et al., 2017b; Zhang et al., 2018). For example, engineering rice that specifically induces defense proteins when a uORF is repressed by pathogen attack enables immediate plant resistance without compromising plant growth in the absence of pathogens (Xu et al., 2017b). The identification of translated ORFs provides new possibilities to fine-tune the synthesis of proteins involved in diverse physiological pathways. Notably, the number of uORFs in tomato is still an underestimate. Approximately half of the tomato genes still lack annotated 5′ UTRs, and RiboTaper only searches for potential translated ORFs in defined transcript regions. Thus, uORFs could be an even more widespread mechanism to control translation in tomato. Future studies using a combination of cap analysis gene expression sequencing or paired-end analysis of transcription start sites sequencing with the long-read sequencing could facilitate defining the 5′ UTRs associated with specific isoforms (Ozsolak and Milos, 2011) and enable the identification of missing uORFs.

Ribo-seq has been integrated into proteomic research to achieve deeper proteome coverage (Menschaert et al., 2013; Van Damme et al., 2014; Crappé et al., 2015; Calviello et al., 2016). Unlike DNA or RNA molecules, which can be sequenced using genomic technologies, proteins are typically identified by matching MS spectra to theoretical spectra from candidate peptides in a reference protein database. Before ribosome profiling became available, to include potential protein sequences, the conventional proteogenomics approach exploited either three-frame translation using transcriptome data or six-frame translation using genomic sequences (Walley and Briggs, 2015; Ruggles et al., 2017). Integrating Ribo-seq data into the construction of protein databases for proteogenomic studies has two advantages: (1) Ribo-seq discovers unannotated translation events and thus enables the identification of unknown proteins that were previously missed in the annotation; and (2) compared with three-frame or six-frame translation, Ribo-seq reduces the search space and false positives. Therefore, our custom protein database, built based on the Ribo-seq data, may aid in proteomic research in tomato.
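To make the search-space argument concrete, the following sketch shows what three-frame and six-frame translation produce for a single sequence. It is illustrative only: the function names are hypothetical, the standard genetic code is assumed, and no stop-codon splitting or length filtering is applied.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(seq: str) -> str:
    """Translate complete codons from the first base; unknown codons become X."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)

def three_frame(seq: str) -> list:
    """Forward-strand frames only, as used with stranded transcriptome data."""
    seq = seq.upper()
    return [translate(seq[f:]) for f in range(3)]

def six_frame(seq: str) -> list:
    """Three forward frames followed by three reverse-complement frames."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]
    return three_frame(seq) + three_frame(rev)

frames = six_frame("ATGGCTTAA")
```

Every input sequence yields three or six candidate protein strings, whereas a Ribo-seq-guided database keeps only the ORFs with evidence of translation, which is why the latter gives a smaller search space.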

CONCLUSION

In summary, our approach combining transcriptome assembly and ribosome profiling enabled the identification of translated ORFs genome wide in tomato and revealed conserved and unique translational features across evolution. Our results not only provide valuable information to the plant community but also present a practical strategy to study translatomes in other less-well-annotated organisms.

MATERIALS AND METHODS

Plant Materials and Preparation of Lysates for RNA-Seq and Ribo-Seq

Tomato seeds (Solanum lycopersicum ‘Heinz 1706’) were obtained from the C.M. Rick Tomato Genetics Resource Center (accession LA4345) and bulked. For each replicate, ∼300 tomato seeds were surface sterilized in 70% (v/v) ethanol for 5 min followed by bleach solution (2.4% [v/v] NaHClO and 0.3% [v/v] Tween 20) for 30 min with shaking. The seeds were then washed with sterile water five times. Next, the seeds were stratified on 1× Murashige and Skoog medium (4.3 g L−1 Murashige and Skoog salt, 1% [w/v] Suc, 0.5 g L−1 MES, pH 5.7, and 1% [w/v] agar) and kept at 22°C in the dark for 3 d before being grown under 16-h-light/8-h-dark conditions at 22°C for 4 d. Seedlings that germinated at approximately the same time and of similar size were selected for the experiments. Roots (∼3 cm from the tip) from ∼180 plants were harvested at Zeitgeber time 3 (3 h after lights on) in batches and immediately frozen in liquid nitrogen. The frozen tissues were pooled and pulverized in liquid nitrogen using a mortar and pestle. Approximately 0.4 g of tissue powder was resuspended in 1.2 mL of lysis buffer (100 mm Tris-HCl [pH 8], 40 mm KCl, 20 mm MgCl2, 2% [v/v] polyoxyethylene [10] tridecyl ether [Sigma, P2393], 1% [w/v] sodium deoxycholate [Sigma, D6750], 1 mm dithiothreitol, 100 μg mL−1 cycloheximide, and 10 units mL−1 DNase I [Epicenter, D9905K]) as described by Hsu et al. (2016). After incubation on ice with gentle shaking for 10 min, the lysate was spun at 4°C at 20,000g for 10 min. The supernatant was transferred to a new tube and divided into 100-µL aliquots. The aliquoted lysates were flash frozen in liquid nitrogen and stored at −80°C until processing.

RNA Purification and RNA-Seq Library Construction

For RNA-seq samples, 10 µL of 10% (w/v) SDS was added to the 100-µL lysate aliquots described above. RNA greater than 200 nucleotides was extracted using a Zymo RNA Clean & Concentrator kit (Zymo Research, R1017). The obtained RNA was checked with a Bioanalyzer (Agilent) RNA pico chip to assess the RNA integrity, and an RNA integrity number value ranging from 9.2 to 9.4 was obtained for each replicate. Ribosomal RNAs (rRNAs) were depleted using a RiboZero Plant Leaf kit (Illumina, MRZPL1224). Next, 100 ng of the rRNA-depleted RNA was used as the starting material, fragmented to ∼200 nucleotides based on the RNA integrity number reported by the Bioanalyzer, and processed using the NEBNext Ultra Directional RNA Library Prep Kit (New England Biolabs, E7420S) to create strand-specific libraries. The libraries were barcoded and enriched using 11 cycles of PCR amplification. The libraries were brought to equal molarity, pooled, and sequenced on one lane of a Hi-Seq 4000 using PE-100 sequencing.

Ribosome Footprinting and Ribo-Seq Library Construction

The Ribo-seq samples were prepared based on Hsu et al. (2016) with modifications described as follows, which optimize the method for tomato. Briefly, the RNA concentration of each lysate was first determined using a Qubit RNA HS assay (Invitrogen, Q32852) using a 10-fold dilution. Next, 100 µL of the lysate described above was treated with 100 units of nuclease (provided in the TruSeq Mammalian Ribo Profile Kit, Illumina, RPHMR12126) per 40 µg of RNA with gentle shaking at room temperature for 1 h. The nuclease reaction was stopped by immediately transferring to ice and adding 15 µL of SUPERase-IN (Invitrogen, AM2696). The ribosomes were isolated using Illustra MicroSpin S-400 HR columns (GE Healthcare, 27514001). RNA greater than 17 nucleotides was purified first (Zymo Research, R1017), and then RNA smaller than 200 nucleotides was enriched (Zymo Research, R1015). Next, the rRNAs were depleted using a RiboZero Plant Leaf kit (Illumina, MRZPL1224). The rRNA-depleted RNA was then separated via 15% (w/v) Tris-borate-EDTA-urea PAGE (Invitrogen, EC68852BOX), and gel slices ranging from 28 to 30 nucleotides were excised. Ribosome footprints were recovered from the excised gel slices using the overnight elution method, and the sequencing libraries were constructed according to the TruSeq Mammalian Ribo Profile Kit manual. The final libraries were amplified via nine cycles of PCR. The libraries were brought to equal molarity, pooled, and sequenced on two lanes of a Hi-Seq 4000 using SE-50 sequencing.

RNA-Seq and Ribo-Seq Data Analysis

The raw RNA-seq and Ribo-seq data and detailed mapping parameters have been deposited in the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo) under accession number GSE124962. The tomato reference genome sequence and annotation files used in this study were downloaded from the Sol Genomics Network (Fernandez-Pozo et al., 2015). The adaptor sequence AGATCGGAAGAGCACACGTCT was first removed from the Ribo-seq data using FASTX_clipper v0.0.14 (http://hannonlab.cshl.edu/fastx_toolkit/). For both RNA-seq and Ribo-seq, the rRNA, tRNA, small nuclear RNA, small nucleolar RNA, and repeat sequences were removed using Bowtie2 v2.3.4.1 (Langmead and Salzberg, 2012). The rRNA, tRNA, small nuclear RNA, and small nucleolar RNA sequences were extracted from the SL2.5 genome assembly with the ITAG2.4 annotation (Fernandez-Pozo et al., 2015), and the repeat sequences were extracted from the SL3.0 genome assembly with the ITAG3.2 annotation. After these contaminating sequences were removed using Bowtie2, the preprocessed RNA-seq and Ribo-seq files were used to calculate the read distribution in different gene features (Fig. 2B) using the featureCounts function of the Subread package v1.5.3 (Liao et al., 2014).
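The adaptor-removal step can be pictured with a minimal sketch. This is not a reimplementation of FASTX_clipper: it only handles exact matches, and the `min_overlap` parameter and the toy footprint are assumptions introduced for illustration.

```python
ADAPTOR = "AGATCGGAAGAGCACACGTCT"  # 3' adaptor sequence given in the text

def clip_adaptor(read: str, adaptor: str = ADAPTOR, min_overlap: int = 5) -> str:
    """Remove the adaptor and everything after it from a read.

    Exact matching only; `min_overlap` (hypothetical) is the shortest adaptor
    prefix accepted when the adaptor is cut off by the end of the read.
    """
    idx = read.find(adaptor)
    if idx != -1:
        return read[:idx]
    # Adaptor truncated by the read end: try the longest prefix first.
    for k in range(min(len(adaptor) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adaptor[:k]):
            return read[:-k]
    return read

# Toy 28-nucleotide footprint (hypothetical sequence).
footprint = "ACGTACGTACGTACGTACGTACGTACGT"
full = clip_adaptor(footprint + ADAPTOR + "A")
partial = clip_adaptor(footprint + ADAPTOR[:10])
untouched = clip_adaptor(footprint)
```

With 50-nucleotide single-end reads and footprints of about 28 nucleotides, most reads run into the adaptor, so clipping has to happen before any alignment step.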

Next, the preprocessed RNA-seq and Ribo-seq reads were mapped to the tomato reference genome sequence SL3.0 with the ITAG3.2 annotation using the STAR v2.6.0.c (Dobin et al., 2013). The reference-guided de novo assembly of the mapped RNA-seq reads was performed with stringtie v1.3.3b (Pertea et al., 2015), and the newly assembled gtf files were compared with ITAG3.2 using gffcompare v0.10.1 (Pertea et al., 2016). The i, x, y, o, u, and s classes of new transcripts (see Fig. 3A for details) and their descriptions were extracted from the gffcompare output gtf and concatenated with ITAG3.2. This combined gtf (referred to as Tomato_Root_ixyous+ITAG3.2.gtf; submitted to GEO as a processed file within GSE124962) was used to map the RNA-seq and Ribo-seq reads again with STAR. Notably, all six classes of uncharacterized transcripts in Tomato_Root_ixyous+ITAG3.2.gtf were assigned as noncoding RNAs, and this gtf was used for downstream RiboTaper analysis. The three biological replicates of the mapped bam files for RNA-seq were merged into one large bam file with SAMtools v1.8 (Li et al., 2009). The three mapped Ribo-seq bam files were also merged. The two merged bam files above were then used for ORF discovery with RiboTaper v1.3 (Calviello et al., 2016).
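The class-code selection can be sketched as follows. This is a hedged illustration rather than the authors' script: it assumes that the gffcompare output records a `class_code "…"` attribute on transcript rows and a `transcript_id "…"` attribute on every row, and the function name and toy records are hypothetical.

```python
import re

KEEP = set("ixyous")  # the six transcript classes named in the text
CLASS_RE = re.compile(r'class_code "([^"]+)"')
TID_RE = re.compile(r'transcript_id "([^"]+)"')

def select_classes(gtf_lines, keep=KEEP):
    """Return GTF rows (transcripts and their exons) whose class code is in `keep`."""
    rows = [l for l in gtf_lines if l.strip() and not l.startswith("#")]
    wanted = set()
    # First pass: collect transcript IDs whose class code is wanted.
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        code = CLASS_RE.search(fields[8])
        tid = TID_RE.search(fields[8])
        if code and tid and code.group(1) in keep:
            wanted.add(tid.group(1))
    # Second pass: keep every row that belongs to a wanted transcript.
    return [l for l in rows if (t := TID_RE.search(l)) and t.group(1) in wanted]

# Toy records (hypothetical coordinates and IDs).
gtf = [
    'chr1\tStringTie\ttranscript\t100\t500\t.\t+\t.\ttranscript_id "STRG.1.1"; gene_id "STRG.1"; class_code "u";',
    'chr1\tStringTie\texon\t100\t500\t.\t+\t.\ttranscript_id "STRG.1.1"; gene_id "STRG.1";',
    'chr1\tStringTie\ttranscript\t900\t1500\t.\t+\t.\ttranscript_id "STRG.2.1"; gene_id "STRG.2"; class_code "=";',
    'chr1\tStringTie\texon\t900\t1500\t.\t+\t.\ttranscript_id "STRG.2.1"; gene_id "STRG.2";',
]
kept = select_classes(gtf)
```

The selected rows would then be concatenated with the reference annotation to form the combined gtf used for remapping.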

For RiboTaper analysis, the RiboTaper annotation files and the offset parameters (i.e. the inferred P-site position for each footprint length) were first obtained. The RiboTaper annotation files were generated using the create_annotations_files.bash function in the RiboTaper package using SL3.0 assembly and the Tomato_Root_ixyous+ITAG3.2.gtf. To obtain the offset parameters, the create_metaplots.bash and metag.R functions in the RiboTaper package were used to generate meta-gene plots. The offset parameters were identified through the meta-gene plots. For 24-, 25-, 26-, 27-, and 28-nucleotide footprints, the offset values were 8, 9, 10, 11, and 12, respectively (Supplemental Fig. S3). Next, we performed RiboTaper analysis using the RiboTaper annotation, offset parameters, and RNA-seq and Ribo-seq bam files. The coding sequences identified by RiboTaper from the newly assembled transcripts were extracted from the translated_ORFs_filtered_sorted.bed file and integrated with Tomato_Root_ixyous+ITAG3.2.gtf to generate Supplemental Data Sets S2 and S3.
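The offset values listed above can be applied as in the sketch below. It is not RiboTaper's implementation: it assumes the offset is counted in nucleotides from the 5′ end of an unspliced read, uses the leftmost alignment coordinate as input, and the function names are hypothetical.

```python
# Footprint length -> P-site offset, as reported in the text.
OFFSETS = {24: 8, 25: 9, 26: 10, 27: 11, 28: 12}

def p_site(read_start: int, read_length: int, strand: str = "+"):
    """Return the inferred P-site coordinate, or None for unused read lengths.

    `read_start` is the leftmost aligned coordinate; spliced reads are ignored
    in this simplified sketch.
    """
    offset = OFFSETS.get(read_length)
    if offset is None:
        return None
    if strand == "+":
        return read_start + offset
    # Minus strand: the 5' end of the read is its rightmost coordinate.
    return read_start + read_length - 1 - offset

def frame(p_site_pos: int, cds_start: int) -> int:
    """Reading frame (0, 1, or 2) of a plus-strand P-site relative to the CDS start."""
    return (p_site_pos - cds_start) % 3

pos = p_site(100, 28)
```

Counting how many P-sites fall in frame 0 versus frames 1 and 2 along a CDS gives the three-nucleotide periodicity that is used as evidence of active translation.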

We then mapped the Ribo-seq and RNA-seq data again to the CDS ranges with STAR, and the transcripts per million (TPM) for the CDS of each transcript was quantified via RSEM v1.3.0 (Li and Dewey, 2011). The formula to calculate translation efficiency is TE = (TPM_CDS of Ribo-seq)/(TPM_CDS of RNA-seq), where TPM_CDS denotes the TPM of the CDS. To avoid inflation due to a small denominator, only genes with an RNA-seq TPM greater than 0.5 were used in the statistical analysis of translation efficiency. Plots of the three-nucleotide periodicity of Ribo-seq and the coverage of RNA-seq were generated with the plot function in R v3.4.3 (R Core Team, 2013) together with functions from the GenomicRanges v1.30.3, GenomicFeatures v1.30.3, and GenomicAlignments v1.14.2 libraries (Lawrence et al., 2013) to read in the gtf file and RNA-seq bam file. The merged RNA-seq bam file from STAR and the processed P_sites_all file from RiboTaper were used to plot the RNA-seq coverage and P-sites of Ribo-seq, respectively. The Linux command line code to preprocess the P_sites_all file before use for plotting was cut -f 1,3,6 P_sites_all | sort | uniq -c | sed -r 's/^( *[^ ]+) +/\1\t/' > name_output_file. For plotting the CUG/non-AUG start gene, the CDS range of the gene in the gtf file was manually modified before plotting.
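The translation efficiency calculation and the expression filter can be sketched as follows. The function name, dictionary inputs, and gene names are hypothetical; the 0.5 TPM threshold is the one stated in the text.

```python
def translation_efficiency(ribo_tpm: dict, rna_tpm: dict, min_rna_tpm: float = 0.5) -> dict:
    """TE = Ribo-seq CDS TPM / RNA-seq CDS TPM for genes above the RNA-seq cutoff."""
    te = {}
    for gene, rna in rna_tpm.items():
        # Skip lowly expressed genes so that a tiny denominator cannot inflate TE.
        if rna > min_rna_tpm and gene in ribo_tpm:
            te[gene] = ribo_tpm[gene] / rna
    return te

# Toy values (hypothetical).
rna = {"geneA": 10.0, "geneB": 0.2, "geneC": 4.0}
ribo = {"geneA": 5.0, "geneB": 1.0, "geneC": 8.0}
te = translation_efficiency(ribo, rna)
```

Here geneB is excluded: its ratio would be 5, but that value rests on an RNA-seq TPM of only 0.2, which is exactly the kind of inflation the filter is meant to avoid.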

Statistical Analysis

The statistical analysis in this study was performed in R (R Core Team, 2013). The chisq.test and ks.test functions of the stats package in R were used for the χ2 analysis and the Kolmogorov-Smirnov test, respectively. The Pearson and Spearman correlation coefficients were calculated using the cor function. Pairwise comparisons were performed using the corrplot function in the corrplot v0.84 package (Wei, 2013). The empirical cumulative probabilities of translation efficiency were calculated using the ecdf function (in the stats package) and plotted with the base R plot function.
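For readers unfamiliar with these functions, the sketch below shows what an empirical cumulative distribution and the two-sample Kolmogorov-Smirnov statistic compute. It is a plain Python illustration, not the R code used in the study, and it returns only the statistic D, not a P value.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return a function giving the fraction of observations <= x."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n

def ks_statistic(a, b) -> float:
    """Largest vertical distance between the two empirical distributions."""
    fa, fb = ecdf(a), ecdf(b)
    # Both step functions only change at observed values, so checking
    # every observed value is enough to find the maximum distance.
    return max(abs(fa(x) - fb(x)) for x in set(a) | set(b))

# Toy translation efficiency values (hypothetical).
targets = [0.2, 0.4, 0.6, 0.8]
others = [0.5, 0.7, 0.9, 1.1]
d = ks_statistic(targets, others)
```

A shift of one cumulative curve to the left of the other, as in Figure 6, shows up as a large D.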

Protein Extraction and Digestion

Roots (∼3 cm near the tip) and shoots (shoot tip including ∼1 cm of hypocotyl) of 4-d-old tomato seedlings were harvested at Zeitgeber time 3 (3 h after lights on). The proteomics experiments were carried out based on established methods as follows (Castellana et al., 2014; Song et al., 2018a, 2018b). Five volumes (v/w) of Tris-buffered phenol, pH 8, was added to 150 mg of ground tissue, vortexed for 1 min, then mixed with 5 volumes (buffer/tissue, v/w) of extraction buffer (50 mm Tris, pH 7.5, 1 mm EDTA, pH 8, and 0.9 m Suc), and centrifuged at 13,000g for 10 min at 4°C. The phenol phase was transferred to a new tube, and a second phenol extraction was performed on the aqueous phase. The two phenol phase extractions were combined, and 5 volumes of prechilled methanol with 0.1 m ammonium acetate was added. This was mixed well and kept at −80°C for 1 h prior to centrifugation at 4,500g for 10 min at 4°C. Precipitation with 0.1 m ammonium acetate in methanol was performed twice with incubation at −20°C for 30 min. The sample was resuspended in 70% (v/v) methanol and kept at −20°C for 30 min prior to centrifuging at 4,500g for 10 min at 4°C. The supernatant was discarded, and the pellet was placed in a vacuum concentrator until it was nearly dry. Two volumes (buffer/pellet, v/v) of protein digestion buffer [8 m urea, 50 mm Tris, pH 7, and 5 mm Tris(2-carboxyethyl)phosphine hydrochloride] was added to the pellet. The samples were then probe sonicated to aid in resuspension of the pellet. The protein concentration was then determined using the Bradford assay (Thermo Scientific).

The solubilized protein (∼1 mg) was added to an Amicon Ultracel –30K centrifugal filter (catalog no. UFC803008) and centrifuged at 4,000g for 20 to 40 min. This step was repeated once. Then, 4 mL of urea solution with 2 mm Tris (2-carboxyethyl)phosphine hydrochloride was added to the filter unit and centrifuged at 4,000g for 20 to 40 min. Next, 2 mL of iodoacetamide solution (50 mm iodoacetamide in 8 m urea) was added and incubated without mixing at room temperature for 30 min in the dark prior to centrifuging at 4,000g for 20 to 40 min. Two milliliters of urea solution was added to the filter unit, which was then centrifuged at 4,000g for 20 to 40 min. This step was repeated once. Two milliliters of 0.05 m NH4HCO3 was added to the filter unit and centrifuged at 4,000g for 20 to 40 min. This step was repeated once. Then, 2 mL 0.05 m NH4HCO3 with trypsin (enzyme:protein ratio 1:100) or GluC (enzyme:protein ratio 1:20) was added. Samples were incubated at 37°C overnight. Undigested protein was estimated using Bradford assays, and then trypsin (1 μg μL−1) was added to a ratio of 1:100 and an equal volume of Lys-C (0.1 μg μL−1) was added to the trypsin/Lys-C-digested sample and GluC was added at a ratio of 1:20 to the sample digested with GluC. The digests were incubated for an additional 4 h at 37°C. The filter unit was added to a new collection tube and centrifuged at 4,000g for 20 to 40 min. One milliliter of 0.05 m NH4HCO3 was added and centrifuged at 4,000g for 20 to 40 min. The samples were acidified to pH 2 to 3 with 99% (v/v) formic acid and centrifuged at 21,000g for 20 min. Finally, samples were desalted using 50-mg Sep-Pak C18 cartridges (Waters). Eluted peptides were dried using a vacuum centrifuge (Thermo Scientific) and resuspended in 0.1% (v/v) formic acid. Peptide amount was quantified using the Pierce BCA Protein Assay Kit.

Liquid Chromatography-MS/MS

An Agilent 1260 quaternary HPLC device was used to deliver a flow rate of ∼600 nL min−1 via a splitter. All columns were packed in house using a Next Advance pressure cell, and the nanospray tips were fabricated using a fused silica capillary that was pulled to a sharp tip using a laser puller (Sutter, P-2000). Twenty-five micrograms of peptides was loaded onto 20-cm capillary columns packed with 5 μm Zorbax SB-C18 (Agilent), which was connected using a zero dead volume 1-μm filter (Upchurch, M548) to a 5-cm-long strong cation exchange (SCX) column packed with 5 μm polysulfoethyl (PolyLC). The SCX column was then connected to a 20-cm nanospray tip packed with 2.5 μm C18 (Waters). The three sections were joined and mounted on a custom electrospray source for online nested peptide elution. A new set of columns was used for every sample. Peptides were eluted from the loading column onto the SCX column using a 0% to 80% acetonitrile gradient over 60 min. Peptides were then fractionated from the SCX column using a series of ammonium acetate salt steps as follows: 10, 30, 32.5, 35, 37.5, 40, 42.5, 45, 50, 55, 65, 75, 85, 90, 95, 100, 150, and 1,000 mm. For these analyses, buffers A (99.9% water and 0.1% formic acid), B (99.9% acetonitrile and 0.1% formic acid), C (100 mm ammonium acetate and 2% formic acid), and D (1 m ammonium acetate and 2% formic acid) were utilized. For each salt step, a 150-min gradient program comprised a 0 to 5 min increase to the specified ammonium acetate concentration (using buffer C or D), 5 to 10 min hold, 10 to 14 min at 100% buffer A, 15 to 120 min at 5% to 35% buffer B, 120 to 140 min at 35% to 80% buffer B, 140 to 145 min at 80% buffer B, and 145 to 150 min at buffer A.

Eluted peptides were analyzed using a Thermo Scientific Q-Exactive Plus high-resolution quadrupole Orbitrap mass spectrometer, which was directly coupled to the HPLC device. Data-dependent acquisition was obtained using Xcalibur 4.0 software in positive ion mode with a spray voltage of 2 kV, a capillary temperature of 275°C, and a radio frequency of 60. MS1 spectra were measured at a resolution of 70,000, an automatic gain control of 3e6, with a maximum ion time of 100 ms and a mass range of 400 to 2,000 mass-to-charge ratio. Up to 15 MS2 scans were triggered at a resolution of 17,500, an automatic gain control of 1e5 with a maximum ion time of 50 ms, an isolation window of 1.5 mass-to-charge ratio, and a normalized collision energy of 28. Charge exclusion was set to unassigned, 1, 5 to 8, and greater than 8. MS1 that triggered MS2 scans were dynamically excluded for 25 s.

Database Search and False Discovery Rate Filtering

The raw data were analyzed using MaxQuant version 1.6.3.3 (Tyanova et al., 2016). A customized protein database containing 22,513 proteins (Supplemental Data Set S4) was generated from the RiboTaper output file ORFs_max_filt. Spectra were searched against the customized protein database, which was complemented with reverse decoy sequences and common contaminants by MaxQuant. Carbamidomethyl Cys was set as a fixed modification, while Met oxidation and protein N-terminal acetylation were set as variable modifications. Digestion parameters were set to specific and Trypsin/P;LysC or GluC. Up to two missed cleavages were allowed. A false discovery rate less than 0.01 at both the peptide spectral match and protein identification level was required. The second peptide option was used to identify cofragmented peptides. The match between runs feature of MaxQuant was not utilized. Raw data files and MaxQuant Search results have been deposited in the Mass Spectrometry Interactive Virtual Environment repository (https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp) with data set identifier MSV000083363.

Prediction of the Subcellular Localization of sORFs

A fasta file containing the sORF amino acid sequences was uploaded to the TargetP Web site (Emanuelsson et al., 2000). We selected Plant as the organism group and >0.90 as the specificity cutoff and then submitted for analysis.

Evolutionary Analysis

The TBLASTN function for BLAST v2.7.1 (OS Linux_x86_64; Camacho et al., 2009) was used for the homology search. Because several plant genomes still lack exon-intron junction information in their annotations, we only selected single-exon tomato sORFs that encoded 16 to 100 amino acid residues for this analysis, and the reference genomes (Athaliana_167_TAIR9.fa, Atrichopoda_291_v1.0.fa, Csinensis_154_v1.fa, Mtruncatula_285_Mt4.0.fa, Osativa_323_v7.0.fa, Ppatens_318_v3.fa, S_lycopersicum_chromosomes.3.00.fa, Sitalica_312_v2.fa, Smoellendorffii_91_v1.fa, and Stuberosum_448_v4.03.fa) were downloaded from Phytozome v12 (Goodstein et al., 2012). The fa (fasta) files for each genome were used to generate BLAST databases with the following code: makeblastdb -in genome.fa -parse_seqids -dbtype nucl, where genome.fa was replaced with the fasta file for each genome. Next, the code tblastn -query input.fa -db species_database -out species_blast_result.txt -evalue 0.001 -outfmt '6 qseqid sseqid length qlen qstart qend sstart send pident gapopen mismatch evalue bitscore' -num_threads 10 was used to search for sequence homologs in the target genomes. The names of species_database and species_blast_result.txt were changed correspondingly. The final heatmap for amino acid identity was plotted in R using the pheatmap v1.0.10 (Kolde, 2015) and RColorBrewer v1.1.2 (Neuwirth, 2014) libraries.
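The tabular output requested with -outfmt 6 can be read back with a short parser such as the sketch below. The column order follows the tblastn command above; the function names and the toy hits are hypothetical, and taking the best percent identity per sORF is an assumption about how the values might be summarized for a heatmap, not a description of the authors' procedure.

```python
# Column order as given in the -outfmt string of the tblastn command.
COLUMNS = ("qseqid sseqid length qlen qstart qend sstart send "
           "pident gapopen mismatch evalue bitscore").split()
INT_COLS = ("length", "qlen", "qstart", "qend", "sstart", "send", "gapopen", "mismatch")
FLOAT_COLS = ("pident", "evalue", "bitscore")

def parse_hits(lines):
    """Turn tab-delimited BLAST rows into dictionaries with typed values."""
    hits = []
    for line in lines:
        if not line.strip():
            continue
        rec = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
        for key in INT_COLS:
            rec[key] = int(rec[key])
        for key in FLOAT_COLS:
            rec[key] = float(rec[key])
        hits.append(rec)
    return hits

def best_identity(hits) -> dict:
    """Highest percent identity observed for each query sequence."""
    best = {}
    for h in hits:
        best[h["qseqid"]] = max(best.get(h["qseqid"], 0.0), h["pident"])
    return best

# Toy rows (hypothetical hits for one sORF in one genome).
rows = [
    "sORF1\tchr1\t40\t42\t1\t40\t1000\t1119\t85.0\t0\t6\t1e-15\t70.2",
    "sORF1\tchr2\t30\t42\t5\t34\t500\t589\t60.0\t0\t12\t5e-4\t35.1",
]
hits = parse_hits(rows)
best = best_identity(hits)
```

Repeating this for each target genome would give one identity value per sORF and species, which is the shape of matrix a heatmap function expects.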

MicroRNA Target Identification

The tomato microRNA sequences were extracted from Kaur et al. (2017) and Liu et al. (2017). Next, we used psRNATarget (Dai et al., 2018) against ITAG3.2 mRNA sequences to identify potential microRNA targets. We used Schema V2 (2017 release; Dai et al., 2018) and selected calculate target accessibility as the analysis parameter.

GO Term Analysis

agriGO v2.0 (Tian et al., 2017) was used for the GO analysis of uORF-containing genes.

Accession Numbers

The raw RNA-seq and Ribo-seq data have been deposited in the GEO database under accession number GSE124962. Proteomics raw data files and MaxQuant Search results have been deposited at the Mass Spectrometry Interactive Virtual Environment repository with data set identifier MSV000083363.

Supplemental Data

The following supplemental materials are available.

Acknowledgments

We thank Philip N. Benfey at Duke University for generous support to help initiate this project. This work used the Vincent J. Coates Genomics Sequencing Laboratory at the University of California, Berkeley, supported by a National Institutes of Health instrumentation grant (S10 OD018174).

Footnotes

1

This work was supported by the U.S. Department of Agriculture-National Institute of Food and Agriculture (postdoctoral fellowship 2016-67012-24720) and Michigan State University (startup fund) to P.Y.H.; the National Science Foundation (1759023), the U.S. Department of Agriculture-National Institute of Food and Agriculture (Hatch project 3808), and Iowa State University (Plant Sciences Institute Award) to J.W.W.

[OPEN]

Articles can be viewed without a subscription.

References

  1. Agrawal GK, Jwa NS, Lebrun MH, Job D, Rakwal R (2010) Plant secretome: Unlocking secrets of the secreted proteins. Proteomics 10: 799–827 [DOI] [PubMed] [Google Scholar]
  2. Andreev DE, O’Connor PBF, Loughran G, Dmitriev SE, Baranov PV, Shatsky IN (2017) Insights into the mechanisms of eukaryotic translation gained with ribosome profiling. Nucleic Acids Res 45: 513–526 [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Andrews SJ, Rothnagel JA (2014) Emerging evidence for functional peptides encoded by short open reading frames. Nat Rev Genet 15: 193–204 [DOI] [PubMed] [Google Scholar]
  4. Basrai MA, Hieter P, Boeke JD (1997) Small open reading frames: Beautiful needles in the haystack. Genome Res 7: 768–771 [DOI] [PubMed] [Google Scholar]
  5. Bazin J, Baerenfaller K, Gosai SJ, Gregory BD, Crespi M, Bailey-Serres J (2017) Global analysis of ribosome-associated noncoding RNAs unveils new modes of translational regulation. Proc Natl Acad Sci USA 114: E10018–E10027 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Bazzini AA, Johnstone TG, Christiano R, Mackowiak SD, Obermayer B, Fleming ES, Vejnar CE, Lee MT, Rajewsky N, Walther TC, et al. (2014) Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J 33: 981–993 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Blanvillain R, Young B, Cai YM, Hecht V, Varoquaux F, Delorme V, Lancelin JM, Delseny M, Gallois P (2011) The Arabidopsis peptide kiss of death is an inducer of programmed cell death. EMBO J 30: 1173–1183 [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Brar GA, Weissman JS (2015) Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol 16: 651–664 [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Brar GA, Yassour M, Friedman N, Regev A, Ingolia NT, Weissman JS (2012) High-resolution view of the yeast meiotic program revealed by ribosome profiling. Science 335: 552–557 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Calviello L, Mukherjee N, Wyler E, Zauber H, Hirsekorn A, Selbach M, Landthaler M, Obermayer B, Ohler U (2016) Detecting actively translated open reading frames in ribosome profiling data. Nat Methods 13: 165–170 [DOI] [PubMed] [Google Scholar]
  11. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: Architecture and applications. BMC Bioinformatics 10: 421
  12. Casson SA, Chilley PM, Topping JF, Evans IM, Souter MA, Lindsey K (2002) The POLARIS gene of Arabidopsis encodes a predicted peptide required for correct root growth and leaf vascular patterning. Plant Cell 14: 1705–1721
  13. Castellana NE, Shen Z, He Y, Walley JW, Cassidy CJ, Briggs SP, Bafna V (2014) An automated proteogenomic method uses mass spectrometry to reveal novel genes in Zea mays. Mol Cell Proteomics 13: 157–167
  14. Chew GL, Pauli A, Schier AF (2016) Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish. Nat Commun 7: 11663
  15. Chotewutmontri P, Barkan A (2016) Dynamics of chloroplast translation during chloroplast differentiation in maize. PLoS Genet 12: e1006106
  16. Chung BY, Hardcastle TJ, Jones JD, Irigoyen N, Firth AE, Baulcombe DC, Brierley I (2015) The use of duplex-specific nuclease in ribosome profiling and a user-friendly software package for Ribo-seq data analysis. RNA 21: 1731–1745
  17. Claverie JM. (1997) Computational methods for the identification of genes in vertebrate genomic sequences. Hum Mol Genet 6: 1735–1744
  18. Constabel CP, Yip L, Ryan CA (1998) Prosystemin from potato, black nightshade, and bell pepper: Primary structure and biological activity of predicted systemin polypeptides. Plant Mol Biol 36: 55–62
  19. Crappé J, Ndah E, Koch A, Steyaert S, Gawron D, De Keulenaer S, De Meester E, De Meyer T, Van Criekinge W, Van Damme P, et al. (2015) PROTEOFORMER: Deep proteome coverage through ribosome profiling and MS integration. Nucleic Acids Res 43: e29
  20. Dai X, Zhuang Z, Zhao PX (2018) psRNATarget: A plant small RNA target analysis server (2017 release). Nucleic Acids Res 46: W49–W54
  21. De Coninck B, Carron D, Tavormina P, Willem L, Craik DJ, Vos C, Thevissen K, Mathys J, Cammue BPA (2013) Mining the genome of Arabidopsis thaliana as a basis for the identification of novel bioactive peptides involved in oxidative stress tolerance. J Exp Bot 64: 5297–5307
  22. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR (2013) STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29: 15–21
  23. Emanuelsson O, Nielsen H, Brunak S, von Heijne G (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 300: 1005–1016
  24. Faghihi MA, Wahlestedt C (2009) Regulatory roles of natural antisense transcripts. Nat Rev Mol Cell Biol 10: 637–643
  25. Fernandez-Pozo N, Menda N, Edwards JD, Saha S, Tecle IY, Strickler SR, Bombarely A, Fisher-York T, Pujar A, Foerster H, et al. (2015) The Sol Genomics Network (SGN): From genotype to phenotype to breeding. Nucleic Acids Res 43: D1036–D1041
  26. Fields AP, Rodriguez EH, Jovanovic M, Stern-Ginossar N, Haas BJ, Mertins P, Raychowdhury R, Hacohen N, Carr SA, Ingolia NT, et al. (2015) A regression-based analysis of ribosome-profiling data reveals a conserved complexity to mammalian translation. Mol Cell 60: 816–827
  27. Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, et al. (2012) Phytozome: A comparative platform for green plant genomics. Nucleic Acids Res 40: D1178–D1186
  28. Guydosh NR, Green R (2014) Dom34 rescues ribosomes in 3′ untranslated regions. Cell 156: 950–962
  29. Hsu PY, Benfey PN (2018) Small but mighty: Functional peptides encoded by small ORFs in plants. Proteomics 18: e1700038
  30. Hsu PY, Calviello L, Wu HL, Li FW, Rothfels CJ, Ohler U, Benfey PN (2016) Super-resolution ribosome profiling reveals unannotated translation events in Arabidopsis. Proc Natl Acad Sci USA 113: E7126–E7135
  31. Ikeuchi M, Yamaguchi T, Kazama T, Ito T, Horiguchi G, Tsukaya H (2011) ROTUNDIFOLIA4 regulates cell proliferation along the body axis in Arabidopsis shoot. Plant Cell Physiol 52: 59–69
  32. Imai A, Hanzawa Y, Komura M, Yamamoto KT, Komeda Y, Takahashi T (2006) The dwarf phenotype of the Arabidopsis acl5 mutant is suppressed by a mutation in an upstream ORF of a bHLH gene. Development 133: 3575–3585
  33. Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324: 218–223
  34. Ji Z, Song R, Regev A, Struhl K (2015) Many lncRNAs, 5′UTRs, and pseudogenes are translated and some are likely to express functional proteins. eLife 4: e08890
  35. Johnstone TG, Bazzini AA, Giraldez AJ (2016) Upstream ORFs are prevalent translational repressors in vertebrates. EMBO J 35: 706–723
  36. Juntawong P, Girke T, Bazin J, Bailey-Serres J (2014) Translational dynamics revealed by genome-wide profiling of ribosome footprints in Arabidopsis. Proc Natl Acad Sci USA 111: E203–E212
  37. Kaur P, Shukla N, Joshi G, VijayaKumar C, Jagannath A, Agarwal M, Goel S, Kumar A (2017) Genome-wide identification and characterization of miRNAome from tomato (Solanum lycopersicum) roots and root-knot nematode (Meloidogyne incognita) during susceptible interaction. PLoS ONE 12: e0175178
  38. Kearse MG, Wilusz JE (2017) Non-AUG translation: A new start for protein synthesis in eukaryotes. Genes Dev 31: 1717–1731
  39. Kim BH, Cai X, Vaughn JN, von Arnim AG (2007) On the functions of the h subunit of eukaryotic initiation factor 3 in late stages of translation initiation. Genome Biol 8: R60
  40. Kolde R. (2015) pheatmap: Pretty heatmaps. https://cran.r-project.org/web/packages/pheatmap/index.html
  41. Kozak M. (1987) An analysis of 5′-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res 15: 8125–8148
  42. Ku HM, Vision T, Liu J, Tanksley SD (2000) Comparing sequenced segments of the tomato and Arabidopsis genomes: Large-scale duplication followed by selective gene loss creates a network of synteny. Proc Natl Acad Sci USA 97: 9121–9126
  43. Laing WA, Martínez-Sánchez M, Wright MA, Bulley SM, Brewster D, Dare AP, Rassam M, Wang D, Storey R, Macknight RC, et al. (2015) An upstream open reading frame is essential for feedback regulation of ascorbate biosynthesis in Arabidopsis. Plant Cell 27: 772–786
  44. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9: 357–359
  45. Lawrence M, Huber W, Pagès H, Aboyoun P, Carlson M, Gentleman R, Morgan MT, Carey VJ (2013) Software for computing and annotating genomic ranges. PLOS Comput Biol 9: e1003118
  46. Lei L, Shi J, Chen J, Zhang M, Sun S, Xie S, Li X, Zeng B, Peng L, Hauck A, et al. (2015) Ribosome profiling reveals dynamic translational landscape in maize seedlings under drought stress. Plant J 84: 1206–1218
  47. Li B, Dewey CN (2011) RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12: 323
  48. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25: 2078–2079
  49. Li S, Le B, Ma X, Li S, You C, Yu Y, Zhang B, Liu L, Gao L, Shi T, et al. (2016) Biogenesis of phased siRNAs on membrane-bound polysomes in Arabidopsis. eLife 5: e22750
  50. Li Z, Xu R, Li N (2018) MicroRNAs from plants to animals, do they define a new messenger for communication? Nutr Metab (Lond) 15: 68
  51. Liao Y, Smyth GK, Shi W (2014) featureCounts: An efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
  52. Liu MJ, Wu SH, Wu JF, Lin WD, Wu YC, Tsai TY, Tsai HL, Wu SH (2013) Translational landscape of photomorphogenic Arabidopsis. Plant Cell 25: 3699–3710
  53. Liu M, Yu H, Zhao G, Huang Q, Lu Y, Ouyang B (2017) Profiling of drought-responsive microRNA and mRNA in tomato using high-throughput sequencing. BMC Genomics 18: 481
  54. Lütcke HA, Chow KC, Mickel FS, Moss KA, Kern HF, Scheele GA (1987) Selection of AUG initiation codons differs in plants and animals. EMBO J 6: 43–48
  55. Mata CI, Fabre B, Hertog ML, Parsons HT, Deery MJ, Lilley KS, Nicolaï BM (2017) In-depth characterization of the tomato fruit pericarp proteome. Proteomics 17: 1600406
  56. Menschaert G, Van Criekinge W, Notelaers T, Koch A, Crappé J, Gevaert K, Van Damme P (2013) Deep proteome coverage based on ribosome profiling aids mass spectrometry-based protein and peptide discovery and provides evidence of alternative translation products and near-cognate translation initiation events. Mol Cell Proteomics 12: 1780–1790
  57. Merchante C, Brumos J, Yun J, Hu Q, Spencer KR, Enríquez P, Binder BM, Heber S, Stepanova AN, Alonso JM (2015) Gene-specific translation regulation mediated by the hormone-signaling molecule EIN2. Cell 163: 684–697
  58. Michel AM, Choudhury KR, Firth AE, Ingolia NT, Atkins JF, Baranov PV (2012) Observation of dually decoded regions of the human genome using ribosome profiling data. Genome Res 22: 2219–2229
  59. Neuwirth E. (2014) RColorBrewer: ColorBrewer Palettes. https://cran.r-project.org/web/packages/RColorBrewer/index.html
  60. Ozsolak F, Milos PM (2011) RNA sequencing: Advances, challenges and opportunities. Nat Rev Genet 12: 87–98
  61. Pearce G, Strydom D, Johnson S, Ryan CA (1991) A polypeptide from tomato leaves induces wound-inducible proteinase inhibitor proteins. Science 253: 895–897
  62. Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL (2015) StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33: 290–295
  63. Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL (2016) Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc 11: 1650–1667
  64. R Core Team (2013) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna
  65. Ruggles KV, Krug K, Wang X, Clauser KR, Wang J, Payne SH, Fenyö D, Zhang B, Mani DR (2017) Methods, tools and current perspectives in proteogenomics. Mol Cell Proteomics 16: 959–981
  66. Ruiz-Orera J, Albà MM (2019) Translation of small open reading frames: Roles in regulation and evolutionary innovation. Trends Genet 35: 186–198
  67. Sagor GHM, Berberich T, Tanaka S, Nishiyama M, Kanayama Y, Kojima S, Muramoto K, Kusano T (2016) A novel strategy to produce sweeter tomato fruits with high sugar contents by fruit-specific expression of a single bZIP transcription factor gene. Plant Biotechnol J 14: 1116–1126
  68. Schafer S, Adami E, Heinig M, Rodrigues KEC, Kreuchwig F, Silhavy J, van Heesch S, Simaite D, Rajewsky N, Cuppen E, et al. (2015) Translational regulation shapes the molecular landscape of complex disease phenotypes. Nat Commun 6: 7200
  69. Schwarz D, Thompson AJ, Kläring HP (2014) Guidelines to use tomato in experiments with a controlled environment. Front Plant Sci 5: 625
  70. Shamimuzzaman M, Vodkin L (2018) Ribosome profiling reveals changes in translational status of soybean transcripts during immature cotyledon development. PLoS ONE 13: e0194596
  71. Simpson GG, Laurie RE, Dijkwel PP, Quesada V, Stockwell PA, Dean C, Macknight RC (2010) Noncanonical translation initiation of the Arabidopsis flowering time and alternative polyadenylation regulator FCA. Plant Cell 22: 3764–3777
  72. Song G, Brachova L, Nikolau BJ, Jones AM, Walley JW (2018a) Heterotrimeric G-protein-dependent proteome and phosphoproteome in unstimulated Arabidopsis roots. Proteomics 18: e1800323
  73. Song G, Hsu PY, Walley JW (2018b) Assessment and refinement of sample preparation methods for deep and quantitative plant proteome profiling. Proteomics 18: e1800220
  74. Spealman P, Naik AW, May GE, Kuersten S, Freeberg L, Murphy RF, McManus J (2018) Conserved non-AUG uORFs revealed by a novel regression analysis of ribosome profiling data. Genome Res 28: 214–222
  75. Szymanski J, Levin Y, Savidor A, Breitel D, Chappell-Maor L, Heinig U, Töpfer N, Aharoni A (2017) Label-free deep shotgun proteomics reveals protein dynamics during tomato fruit tissues development. Plant J 90: 396–417
  76. Tavormina P, De Coninck B, Nikonorova N, De Smet I, Cammue BPA (2015) The plant peptidome: An expanding repertoire of structural features and biological functions. Plant Cell 27: 2095–2118
  77. Tian T, Liu Y, Yan H, You Q, Yi X, Du Z, Xu W, Su Z (2017) agriGO v2.0: A GO analysis toolkit for the agricultural community, 2017 update. Nucleic Acids Res 45: W122–W129
  78. Tyanova S, Temu T, Cox J (2016) The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat Protoc 11: 2301–2319
  79. Valdivia ER, Chevalier D, Sampedro J, Taylor I, Niederhuth CE, Walker JC (2012) DVL genes play a role in the coordination of socket cell recruitment and differentiation. J Exp Bot 63: 1405–1412
  80. Van Damme P, Gawron D, Van Criekinge W, Menschaert G (2014) N-terminal proteomics and ribosome profiling provide a comprehensive view of the alternative translation initiation landscape in mice and men. Mol Cell Proteomics 13: 1245–1261
  81. von Arnim AG, Jia Q, Vaughn JN (2014) Regulation of plant translation by upstream open reading frames. Plant Sci 214: 1–12
  82. Walley JW, Briggs SP (2015) Dual use of peptide mass spectra: Protein atlas and genome annotation. Curr Plant Biol 2: 21–24
  83. Wei T. (2013) corrplot: Visualization of a correlation matrix. https://cran.r-project.org/web/packages/corrplot/index.html
  84. Xu G, Greene GH, Yoo H, Liu L, Marqués J, Motley J, Dong X (2017a) Global translational reprogramming is a fundamental layer of immune regulation in plants. Nature 545: 487–490
  85. Xu G, Yuan M, Ai C, Liu L, Zhuang E, Karapetyan S, Wang S, Dong X (2017b) uORF-mediated translation allows engineered plant disease resistance without fitness costs. Nature 545: 491–494
  86. Yu Y, Jia T, Chen X (2017) The ‘how’ and ‘where’ of plant microRNAs. New Phytol 216: 1002–1017
  87. Zhang H, Si X, Ji X, Fan R, Liu J, Chen K, Wang D, Gao C (2018) Genome editing of upstream open reading frames enables translational control in plants. Nat Biotechnol 36: 894–898
  88. Zoschke R, Watkins KP, Barkan A (2013) A rapid ribosome profiling method elucidates chloroplast ribosome behavior in vivo. Plant Cell 25: 2265–2275

Articles from Plant Physiology are provided here courtesy of Oxford University Press
