Plant Signaling & Behavior. 2010 Nov 1;5(11):1330–1335. doi: 10.4161/psb.5.11.12837

Does an expressed sequence tag (EST) library of Salsola iberica (tumbleweed) help to understand plant responses to environmental stresses?

Sam R Zwenger 1, Rotana Alsaggaf 1, Chhandak Basu 1
PMCID: PMC3115229  PMID: 20935479

Abstract

Weeds play an important role in agriculture, and molecular techniques are useful for understanding the traits that contribute to weediness and to weeds' interactions with the environment. A total of 377 expressed sequence tags (ESTs) from a modest library were arranged into 227 unique fragments and 61 contigs, each of which consisted of two or more ESTs. From blastx results, we mapped and annotated unigenes using the gene ontology vocabulary according to biological process, cellular component and molecular function. These were then compared to a reference set of Arabidopsis thaliana sequences for statistically significant over- or underrepresented genes. The sequences were also compared against multiple protein databases for similarity of functional domains. Overall, the S. iberica sequences showed high similarity to genes involved in response to stress, including salt-induced proteins, betaine aldehyde dehydrogenase and calcium binding proteins. Only a modest number of transcripts were sequenced; however, the results presented here demonstrate the metabolic versatility of S. iberica in sub-optimal conditions, which is likely to contribute to its cosmopolitan distribution. Here we propose that an EST library of an economically important weed species could be used to understand the weed's interactions with the environment.

Key words: expressed sequence tag, gene ontology, Salsola iberica, weed, weediness

Introduction

Weeds have always been a fundamental aspect of agriculture because they disturb crop growth and limit crop production. Weed control and elimination rely on mechanical, chemical and biological methods. It is now recognized that, in addition to improving crop traits, a greater understanding of weed transcriptional states and weed genomics is needed.1 There is a growing need to implement genomic and transcriptomic techniques to better understand the functional genomics of weeds.2 High-throughput methods, such as genomic analyses, generate large data sets that are often interpreted biologically with computational methods. Larrinua and Belmar3 have provided an in-depth review of the importance of bioinformatics to weed science research.

Salsola spp. are annual plants that grow in a variety of sub-optimal environments including arid and saline coastal soils. They are commonly found in disturbed areas, uncultivated grain fields and roadsides throughout the continental United States.4 Thought to have Eurasian origins, this weed has become naturalized in North America, Australia and elsewhere.5 For example, S. australis is a major weed in Australia.5 In Afghanistan and other parts of the Middle East S. imbricata is considered a serious weed6 and S. komarovi has been found growing on Korean sand dunes.7

Salsola spp. are commonly referred to as saltbush or tumbleweed. This reflects their ability to tolerate saline soils as well as their tumbling nature upon maturity, which facilitates seed dispersal. Mature, tumbling plants cause major disturbances by piling up against fences and houses, disrupting automobile traffic and clogging irrigation canals. In addition, Salsola spp. are major weeds in dryland farming and therefore play an important role in agriculture.8 More importantly, understanding how plants adapt and survive on saline soils is of significance since about one third of arable land is ranked as uncultivatable because of salinity.9

For these reasons we hypothesized that an overrepresentation of GO terms associated with stress-related genes would be found among S. iberica sequences when compared to an A. thaliana reference set. In addition, because of its cosmopolitan distribution, ability to tolerate harsh environments, important role in agriculture and the paucity of current sequence data, we created and annotated a small EST library of S. iberica to better understand some of its physiology, its weediness traits and its interactions with the environment.

Results

Average cDNA insert size in the library was estimated to be 800 bp. Overall complexity of the library was estimated to be 5 × 10^5. For this library, a total of 768 clones were sequenced, of which 377 were of high enough quality to be used. Using an E-value threshold of 1.0E-6, the distribution of sequences in the library showed that 99 (34%) S. iberica unigenes did not match sequences in the blastx database and therefore could not be mapped or annotated (Sup. File 1 and Fig. 1). In addition, five unigenes could not be mapped and nine could not be annotated. The blastx hits from our Salsola library suggest that the 377 EST sequences were sufficient to provide insight into mechanisms that play roles in plant physiological processes. The Blast2Go suite used in our analysis streamlined the process of batch blast, gene mapping and ontology assignment. The process entails three primary steps: a similarity search using the NCBI-BLAST database (blastx), mapping of the BLAST hits to multiple databases (e.g., Gene Ontology database, Protein Information Resource, Annex) and functional annotation according to Gene Ontology terms. The final step applies the ‘annotation rule’, which can only be applied after the first two steps. The rule attempts to determine the best annotation within a given reliability.10 Further analysis using the GOSSIP algorithm17 showed that there is a significant overrepresentation of interesting (i.e., weediness-related) GO terms in the S. iberica sequence data set (Table 1).
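The E-value filtering step described above can be illustrated with a minimal sketch. It assumes the blastx hits have been exported as tab-separated lines in the standard 12-column BLAST tabular layout, with the query ID in column 1 and the E-value in column 11. The function names `best_hits` and `unmatched` are hypothetical and are not part of Blast2Go, and whether the original analysis treated the threshold as inclusive is not stated, so the `<=` comparison is an assumption.

```python
from typing import Dict, Iterable, Set

EVALUE_CUTOFF = 1.0e-6  # threshold stated in the text


def best_hits(lines: Iterable[str], cutoff: float = EVALUE_CUTOFF) -> Dict[str, float]:
    """Return the lowest E-value per query, keeping only hits at or below the cutoff.

    Assumes 12-column tab-separated BLAST output (E-value in column 11).
    """
    best: Dict[str, float] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, evalue = fields[0], float(fields[10])
        if evalue <= cutoff and evalue < best.get(query, float("inf")):
            best[query] = evalue
    return best


def unmatched(all_queries: Iterable[str], hits: Dict[str, float]) -> Set[str]:
    """Unigenes with no hit passing the cutoff; these cannot be mapped or annotated."""
    return set(all_queries) - set(hits)


if __name__ == "__main__":
    # Made-up rows for illustration only; not data from the library.
    demo = [
        "unigene1\tprotA\t90.0\t100\t0\t0\t1\t300\t1\t100\t1e-45\t180",
        "unigene2\tprotB\t40.0\t50\t0\t0\t1\t150\t1\t50\t0.002\t30",
    ]
    hits = best_hits(demo)
    print(hits)
    print(unmatched(["unigene1", "unigene2", "unigene3"], hits))
```

In this sketch a unigene counts as unmatched both when it has no hits at all and when every hit is weaker than the cutoff.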

Figure 1.

Pie diagrams showing the distribution of unigenes within their respective GO annotation categories (Biological Process, Molecular Function and Cellular Component). More than half of the annotated sequences of the biological process category were classified as a cellular or metabolic process while five sequences were annotated as regulating biological processes. The molecular function category classified the majority of sequences as having some type of binding activity. The cellular component category included many annotations related to organelles.

Table 1.

Results for Fisher's exact test showing gene ontology (GO) terms overrepresented in S. iberica when compared to the A. thaliana reference set.

GO term | Name | FDR | FWER | Single test p-value | # in test group | # in reference group | # non annot test | # non annot reference group
GO:0005622 | intracellular | 8.27E-09 | 2.69E-08 | 0 | 136 | 8971 | 38 | 13587
GO:0043231 | intracellular membrane-bounded organelle | 8.27E-09 | 2.69E-08 | 0 | 112 | 7396 | 62 | 15162
GO:0043229 | intracellular organelle | 8.27E-09 | 2.69E-08 | 0 | 117 | 7808 | 57 | 14750
GO:0044424 | intracellular part | 8.27E-09 | 2.69E-08 | 0 | 129 | 8595 | 45 | 13963
GO:0015979 | photosynthesis | 8.27E-09 | 2.69E-08 | 0 | 29 | 155 | 145 | 22403
GO:0009536 | plastid | 8.27E-09 | 2.69E-08 | 0 | 87 | 3195 | 87 | 19363
GO:0043226 | organelle | 8.27E-09 | 2.69E-08 | 0 | 117 | 7809 | 57 | 14749
GO:0009579 | thylakoid | 8.27E-09 | 2.69E-08 | 0 | 45 | 445 | 129 | 22113
GO:0044444 | cytoplasmic part | 8.27E-09 | 2.69E-08 | 0 | 113 | 5919 | 61 | 16639
GO:0043227 | membrane-bounded organelle | 8.27E-09 | 2.69E-08 | 0 | 112 | 7402 | 62 | 15156
GO:0006091 | generation of precursor metabolites and energy | 8.27E-09 | 2.69E-08 | 0 | 29 | 301 | 145 | 22257
GO:0016020 | membrane | 8.27E-09 | 2.69E-08 | 0 | 80 | 4606 | 94 | 17952
GO:0005737 | cytoplasm | 8.27E-09 | 2.69E-08 | 0 | 125 | 6349 | 49 | 16209
GO:0008152 | metabolic process | 1.21E-08 | 4.22E-08 | 8.97E-11 | 102 | 7822 | 72 | 14736
GO:0005198 | structural molecule activity | 4.64E-08 | 1.74E-07 | 7.61E-10 | 21 | 509 | 153 | 22049
GO:0005840 | ribosome | 7.86E-08 | 3.14E-07 | 1.53E-09 | 20 | 476 | 154 | 22082
GO:0030529 | ribonucleoprotein complex | 7.61E-06 | 3.23E-05 | 1.43E-07 | 20 | 631 | 154 | 21927
GO:0005623 | cell | 1.33E-05 | 5.97E-05 | 2.79E-07 | 148 | 15366 | 26 | 7192
GO:0009987 | cellular process | 3.55E-05 | 1.68E-04 | 6.27E-07 | 105 | 9408 | 69 | 13150
GO:0009058 | biosynthetic process | 6.51E-05 | 3.26E-04 | 1.46E-06 | 46 | 2911 | 128 | 19647
GO:0044464 | cell part | 1.04E-04 | 5.46E-04 | 1.83E-06 | 146 | 15366 | 28 | 7192
GO:0043232 | intracellular non-membrane-bounded organelle | 2.55E-04 | 0.00146728 | 5.71E-06 | 23 | 1028 | 151 | 21530
GO:0043228 | non-membrane-bounded organelle | 2.55E-04 | 0.00146728 | 5.71E-06 | 23 | 1028 | 151 | 21530
GO:0009628 | response to abiotic stimulus | 4.00E-04 | 0.00239535 | 9.22E-06 | 25 | 1217 | 149 | 21341
GO:0005739 | mitochondrion | 0.00366101 | 0.0226216 | 5.85E-05 | 22 | 1113 | 152 | 21445
GO:0006412 | translation | 0.00683275 | 0.0434415 | 1.63E-04 | 23 | 1282 | 151 | 21276
GO:0044237 | cellular metabolic process | 0.0113657 | 0.0738504 | 2.28E-04 | 73 | 6577 | 101 | 15981
GO:0006950 | response to stress | 0.0136289 | 0.0909947 | 3.23E-04 | 31 | 2086 | 143 | 20472
GO:0034961 | cellular biopolymer biosynthetic process | 0.0330373 | 0.215949 | 7.95E-04 | 25 | 1623 | 149 | 20935
GO:0043284 | biopolymer biosynthetic process | 0.0330373 | 0.219478 | 8.37E-04 | 25 | 1629 | 149 | 20929
GO:0034645 | cellular macromolecule biosynthetic process | 0.0591195 | 0.367593 | 0.00154437 | 25 | 1703 | 149 | 20855
GO:0009059 | macromolecule biosynthetic process | 0.0662273 | 0.411324 | 0.00168459 | 25 | 1714 | 149 | 20844
GO:0006810 | transport | 0.0832994 | 0.497082 | 0.00248744 | 25 | 1765 | 149 | 20793
GO:0010467 | gene expression | 0.0883478 | 0.528145 | 0.00252466 | 25 | 1767 | 149 | 20791
GO:0051234 | establishment of localization | 0.0884048 | 0.538687 | 0.00267833 | 25 | 1775 | 149 | 20783
GO:0016043 | cellular component organization | 0.0901766 | 0.555916 | 0.00292763 | 15 | 856 | 159 | 21702
GO:0044238 | primary metabolic process | 0.0987281 | 0.598854 | 0.00324473 | 67 | 6456 | 107 | 16102
GO:0051179 | localization | 0.116792 | 0.67038 | 0.00402214 | 25 | 1832 | 149 | 20726

Each GO term and name is provided with the corrected p value (FDR & FWER). # in test group, number of times term was found in S. iberica; # in reference group, number of times term was found in A. thaliana set; # non annot test, number of non-annotated in S. iberica set; # non annot reference group, number of non annotated in A. thaliana. FDR, corrected p-value by False Discovery Rate control; FWER, corrected p-value by Family Wise Error Rate; Single Test p Value: p Value without multiple testing corrections.
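To make the structure of each row in Table 1 concrete, the following is a minimal sketch of a one-sided Fisher's exact test for overrepresentation, computed from the four counts in a row (annotated and non-annotated sequences in the test and reference groups). It uses only the Python standard library and the hypergeometric distribution. It is an illustration, not the GOSSIP or Blast2Go implementation, and whether those tools report a one-sided or two-sided p-value is not stated here, so the values it produces need not match the table exactly. The function names are hypothetical.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_overrepresentation(in_test: int, in_ref: int,
                              not_in_test: int, not_in_ref: int) -> float:
    """One-sided p-value P(X >= in_test) under the hypergeometric null.

    in_test / not_in_test: test-group sequences with / without the GO term.
    in_ref / not_in_ref: reference-group sequences with / without the GO term.
    """
    n_test = in_test + not_in_test          # size of the test group
    n_term = in_test + in_ref               # all sequences carrying the term
    total = n_test + in_ref + not_in_ref    # all sequences in both groups
    log_denominator = log_comb(total, n_test)
    p = 0.0
    # Sum the probability of every table at least as extreme as the observed one.
    for k in range(in_test, min(n_test, n_term) + 1):
        p += exp(log_comb(n_term, k)
                 + log_comb(total - n_term, n_test - k)
                 - log_denominator)
    return min(p, 1.0)


if __name__ == "__main__":
    # Counts taken from the "response to stress" row of Table 1.
    print(fisher_overrepresentation(31, 2086, 143, 20472))
```

Working in log space with `lgamma` avoids handling the very large binomial coefficients that arise with a reference set of more than 20,000 sequences.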

The level of GO annotations, in which the detail of the description of a gene product increases with the GO level, varied across GO categories (Sup. File 1 and Fig. 2). Annotations in the molecular function category reached the highest GO annotation level; however, most annotations were of cellular component. The majority of unigenes with cellular process were categorized at GO levels 4, 5 and 6. A total of 895 annotations were ascribed to all unigenes. In general, a longer sequence length resulted in an increased likelihood of being annotated (Sup. File 1 and Fig. 3). For example, less than half of the unigenes with a length of 200–300 bp were annotated, while nearly all unigenes of 700–1,000 bp were annotated.
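The relationship between sequence length and annotation success can be tabulated with a short sketch like the one below. It assumes each unigene is represented as a (length in bp, annotated or not) pair and groups lengths into 100 bp bins. The function name, the bin width and the demo records are hypothetical and are not taken from the supplementary files.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def annotation_rate_by_length(records: Iterable[Tuple[int, bool]],
                              bin_width: int = 100) -> Dict[int, float]:
    """Fraction of annotated unigenes per length bin, keyed by the bin's lower bound."""
    counts = defaultdict(lambda: [0, 0])  # bin start -> [annotated, total]
    for length, annotated in records:
        start = (length // bin_width) * bin_width
        counts[start][1] += 1
        if annotated:
            counts[start][0] += 1
    return {start: hit / total for start, (hit, total) in sorted(counts.items())}


if __name__ == "__main__":
    # Invented records for illustration only.
    demo = [(250, False), (280, True), (260, False), (750, True), (910, True)]
    for start, rate in annotation_rate_by_length(demo).items():
        print(f"{start}-{start + 99} bp: {rate:.0%} annotated")
```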

A total of 174 (60%) unigenes were mapped and annotated to GO terms. Annotated sequences were assigned to the GO categories of biological process, cellular component and molecular function (Fig. 1). The criteria for placing an annotation under a specific GO parent term (e.g., biological process) are based on a controlled vocabulary, which helps compare sequences across all species. Further, child terms (e.g., metabolic process) associated with each parent term give further description to an annotated sequence. Thus, in our data, a total of 341 annotations were described as playing a role in biological processes. Of these, 102 (30%) were categorized as having a role in metabolic processes and only four (<2%) were assigned to reproductive process. The largest number of assignments was given to cellular process (105). Equal numbers of unigenes within the biological process category played a role in localization (25) or establishment of localization (25).

Within the molecular function category, a total of 109 annotations were assigned. These consisted of 63 (58%) annotations with some type of binding function (e.g., nucleic acid or protein binding). An additional 18 (17%) were categorized as having hydrolase activity and 21 as having transferase activity. Only 6 (<6%) were annotated as having either translation factor or transcription factor activity.

A total of 630 annotations were categorized as cellular component. Of these, 136 (22%) were intracellular and 129 (20%) were categorized as an intracellular part. A total of 112 (18%) annotations were associated with membrane-bound organelles and only five (<1%) annotations played a role in external capsulating structure.

Implementation of the GOSSIP algorithm, which Blast2Go incorporates into its suite, showed an overrepresentation of GO terms among the annotated sequences (Table 1). This was based on the p-value corrected by false discovery rate (FDR) control (an FDR of 0.05 was used here). Importantly, these terms included “response to abiotic stimulus” (4.00E-4), “photosynthesis” (8.27E-9), “generation of precursor metabolites and energy” (8.27E-9) and “biosynthetic process” (6.51E-5). Terms under “response to stress” were also overrepresented (FDR = 0.0136) in the S. iberica sequence set.
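As an illustration of what an FDR-corrected p-value means, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of single-test p-values. This is a commonly used FDR method chosen here as an assumption for illustration; GOSSIP computes its own FDR estimate, so this sketch is not expected to reproduce the FDR column of Table 1. The function name and the demo p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented single-test p-values, one per GO term.
    demo = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(demo, benjamini_hochberg(demo)):
        status = "significant" if q <= 0.05 else "not significant"
        print(f"p={p:g} adjusted={q:.4f} {status} at FDR 0.05")
```

A term is then called overrepresented when its adjusted value falls at or below the chosen FDR level, 0.05 in the analysis described above.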

Discussion

Plant metabolism.

Many of the contigs generated were closely related to important plant metabolic processes. The largest contig consisted of nine ESTs and matched the photosystem I reaction center subunit (E-value = 6.03E-45). The second largest contig consisted of six ESTs and coded for a ferredoxin precursor. These findings are expected since the plant material was obtained from a young S. iberica, which would require photosynthesis for energy and anabolic processes. Another contig that consisted of six ESTs coded for phosphate dikinase, which is involved in carrying out phosphorylation reactions. The third largest contigs consisted of four ESTs each and coded for a lipid transfer protein and a chlorophyll a/b binding protein. Additional matches to inorganic pyrophosphatase, the 23 kDa oxygen evolving complex and S-adenosyl methionine synthetase were found for contigs each consisting of three ESTs. In addition to the many contigs matching photosynthetic and regulatory processes, one transcript encoded cytochrome c oxidase, which is important in cellular respiration.

Stress associations.

Many enzymes that play a role in abiotic stress response were found in S. iberica, including carbonic anhydrase, catalase and glutathione peroxidase. Carbonic anhydrase has previously been shown to increase in drought-stressed wheat leaves.11 Catalase rapidly converts H2O2 molecules to less reactive species, and it is well known that hydrogen peroxide is produced in plants under stressful conditions.

Other sequences of interest were stress-enhanced proteins, temperature-induced lipocalin, 2-cysteine peroxiredoxin and calcium ion binding proteins. Lipocalins are important for moving relatively small hydrophobic molecules across lipid membranes. The enzyme 2-cysteine peroxiredoxin, like catalase and glutathione peroxidase, is also important in reducing peroxide molecules. Calcium ion binding proteins are important in many plant responses since calcium can act as a secondary messenger. Calcium is also gaining increasing attention for its role in plant stress responses.12

Salsola spp. are also commonly called saltwort because of their ability to tolerate saline soils. In addition, they are known for their drought and heat tolerance.6 It has been suggested that proline can help increase tolerance to osmotic stress. Sharifabad and Nodoushan19 studied the effects of salinity across three different Salsola species (S. dendroides, S. richteri and S. orientalis) and found that proline levels increased with increasing salinity. Here, we found at least one gene involved in production of glutamine (glutamine synthetase), a precursor to proline. Based on the annotations, we found at least six unigenes associated with salt stress response, which included a cytochrome c oxidase subunit, glutathione peroxidase and an unknown plant salt-induced protein.

In previous studies, betaine aldehyde dehydrogenase has been shown to function in high salinity and drought stress.13 This enzyme catalyzes the last step of producing glycine betaine from choline in a two-step reaction.14 Glycine betaine is an important metabolite in salt-stressed plants because of its ability to protect enzymes affiliated with photosynthesis.15 In this library we found one transcript coding for betaine aldehyde dehydrogenase. Three unigenes were found to be jasmonate-induced proteins. Jasmonate plays an important part in the abiotic and biotic plant stress response as a signaling hormone.

Proteins classified as heat shock proteins (HSPs), specifically HSP100, HSP90 and HSP80, have been suggested to play a role in both development and stress response.16 Two transcripts from our S. iberica analysis closely matched HSP90 (E-value = 1.16E-116), which is involved in stabilizing cellular proteins under stress conditions. Koning et al.20 performed a northern blot and a GUS assay of a transgenic Arabidopsis line and showed that HSP80 is upregulated in development. These studies provide evidence that germinating seeds and young plants might often encounter abiotic stress factors and that some of these genes might have evolved to help ensure proper development.

Materials and Methods

Library construction.

Plant tissue from a single individual growing in a parking lot on the University of Northern Colorado (Greeley, CO) campus was obtained and snap frozen in liquid nitrogen. Total RNA was obtained using Trizol reagent (Invitrogen, USA) and mRNA was isolated using the polyA purist kit (Ambion, USA). Blunt ended cDNA was made from approximately 3 µg of mRNA using the OrientExpress cDNA synthesis kit with slight modifications (Novagen, USA). For nucleic acid precipitations, in place of ammonium acetate we used sodium acetate (0.3 M, pH 5.2).

To increase phosphorylation of 5′ and 3′ ends, the DNA Terminator End Repair Kit (Lucigen Corp., USA) was used. cDNA was size selected (700 bp-5 kb) using gel electrophoresis and electroelution. Approximately 250 ng of the blunt ended cDNA was ligated into the pSMART vector (Lucigen Corp., USA). Chemically competent E. coli were transformed with 1 µL of the ligation reaction.

To estimate insert size, eight colonies picked from an overnight culture were grown in LB broth overnight and plasmid extraction was performed using a plasmid miniprep kit (Qiagen, USA). A fast digest reaction using EcoRV (Fermentas, USA) and gel electrophoresis was performed to determine average insert size. The library was shipped to Lucigen Corp., USA for sequencing and read in one direction using an ABI 3730xl DNA sequencer.

Sequence analysis.

Sequencher software was used to remove vector and low-quality sequence from a total of 768 sequences, which resulted in 377 high-quality ESTs ranging from 200 bp to 1,111 bp. Contig assembly yielded 227 fragments and 61 contigs consisting of 2 or more ESTs, for a total of 288 unigenes. The resulting sequences were opened with the Blast2Go java application (www.blast2go.org/), freely available online.10 The Blast2Go suite allows for batch blast, functional analysis, assignment of gene ontology terms, statistical analysis and searching for functional domains via InterProScan (Sup. File 2). To increase specificity of gene ontology terms we used the GOSlim mapping function. To determine if our set of S. iberica sequences contained a statistically significant overrepresentation of GO terms, we compared them to a reference set of GO terms for Arabidopsis thaliana.17
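The counts reported in this section and in the Results can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names and the `percent` helper are hypothetical, and rounding to the nearest whole percent is an assumption about how the reported percentages were derived.

```python
# Counts as stated in the text.
total_sequences = 768      # clones sequenced
high_quality_ests = 377    # ESTs retained after vector and quality trimming
fragments = 227            # unique fragments not assembled into a contig
contigs = 61               # contigs of two or more ESTs
no_blast_hit = 99          # unigenes without a blastx match
annotated = 174            # unigenes mapped and annotated to GO terms

unigenes = fragments + contigs


def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100.0 * part / whole)


if __name__ == "__main__":
    print("unigenes:", unigenes)
    print("retained ESTs (%):", percent(high_quality_ests, total_sequences))
    print("unigenes without blastx hit (%):", percent(no_blast_hit, unigenes))
    print("annotated unigenes (%):", percent(annotated, unigenes))
```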

Future Perspectives

The science of weed control could be greatly improved with a molecular understanding of how weeds are able to outcompete crop species. Understanding the protein and genetic components of weeds can play an important role in developing novel ways to help solve weed problems.18

Future research should include sequencing additional ESTs for Salsola to determine the additional diversity and abundance of transcripts. Construction and analysis of EST libraries of other cosmopolitan weed species might also be important to further the understanding of how weeds are able to live in areas most plants would find unbearable. In this paper we have developed an EST library of an important weed species and used the ESTs to understand part of the weed's transcriptome and its interactions with the environment. However, more wet lab data is needed to expand our analysis.

Data Deposition

Sequence data from this article have been deposited into GenBank under accession numbers GW316091-GW316220, GW343231-GW343252, GW343255-GW343298, GW343301-GW343500.

Acknowledgements

The Blast2Go Google group was helpful in using the Blast2Go suite (http://groups.google.com/group/Blast2GO). This work was funded in part by the Colorado BioScience Grant and the GAANN Fellowship.

Supplementary Material
psb0511_1330SD1.pdf (231.7KB, pdf)
psb0511_1330SD2.pdf (150.8KB, pdf)

References

1. Basu C, Halfhill MD, Mueller TC, Stewart CN Jr. Weed genomics: new tools to understand weed biology. Trends Plant Sci. 2004;9:391–408. doi: 10.1016/j.tplants.2004.06.003.
2. Stewart CN, Tranel PJ, Horvath DP, Anderson JV, Rieseberg LH, Westwood JH, et al. Evolution of Weediness and Invasiveness: Charting the Course for Weed Genomics. Weed Sci. 2009;57:451–462.
3. Larrinua IM, Belmar SB. Bioinformatics and its relevance to weed science. Weed Sci. 2008;56:297–305.
4. Ayres D, Ryan FJ, Grotkopp E, Bailey J, Gaskin JF. Tumbleweed (Salsola, section Kali) Species and Speciation in California. Biol Invasions. 2008;11:1175–1187.
5. Borger CPD, Walsh M, Powles SB. Demography of Salsola australis populations in the agricultural region of south-west Australia. Weed Res. 2009;49:391–399.
6. Khan MA, Gul B, Weber DJ. Seed germination in the Great Basin halophyte Salsola iberica. Can J Bot. 2002;80:650–655.
7. Kim KD. Invasive plants on disturbed Korean sand dunes. Estuar Coast Shelf Sci. 2005;62:353–364.
8. Schillinger WF. Ecology and Control of Russian Thistle (Salsola Iberica) after Spring Wheat Harvest. Weed Sci. 2007;55:381–385.
9. Frommer WB, Ludewig U, Rentsch D. Enhanced: Taking Transgenic Plants with a Pinch of Salt. Science. 1999;285:1222–1223. doi: 10.1126/science.285.5431.1222.
10. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610.
11. Kicheva MI, Lazova GN. Response of carbonic anhydrase to polyethylene glycol-mediated water stress in wheat. Photosynthetica. 1997;34:133–135.
12. Song WY, Zhang ZB, Shao HB, Guo XL, Cao HX, Zhao HB, et al. Relationship between calcium decoding elements and plant abiotic-stress resistance. Int J Biol Sci. 2008;4:116–125. doi: 10.7150/ijbs.4.116.
13. Waditee R, Bhuiyan NH, Hirata E, Hibino T, Tanaka Y, Shikata M, et al. Metabolic engineering for betaine accumulation in microbes and plants. J Biol Chem. 2007;282:34185–34193. doi: 10.1074/jbc.M704939200.
14. Hattori T, Mitsuya S, Fujiwara T, Jagendorf AT, Takabe T. Tissue specificity of glycinebetaine synthesis in barley. Plant Sci. 2008;176:112–118.
15. Nomura M, Hibino T, Takabe T, Sugiyama T, Yokota A, Miyake H, et al. Transgenically produced glycinebetaine protects ribulose 1,5-bisphosphate carboxylase/oxygenase from inactivation in Synechococcus sp. PCC7942 under salt stress. Plant Cell Physiol. 1998;39:425–432.
16. Krishna P, Gloor G. The Hsp90 family of proteins in Arabidopsis thaliana. Cell Stress Chaperones. 2001;6:238–246. doi: 10.1379/1466-1268(2001)006<0238:thfopi>2.0.co;2.
17. Blüthgen N, Kielbasa SM, Herzel H. Inferring combinatorial regulation of transcription in silico. Nucleic Acids Res. 2005;33:272–279. doi: 10.1093/nar/gki167.
18. Rector BG. Molecular biology approaches to control of intractable weeds: New strategies and complements to existing biological practices. Plant Sci. 2008;175:437–448.
19. Heidari-Sharifabad H, Mirzaie-Nodoushan H. Salinity-induced growth and some metabolic changes in three Salsola species. J Arid Environ. 2006;67:715–720.
20. Koning AJ, Rose R, Comai L. Developmental expression of tomato heat-shock cognate protein 80. Plant Physiol. 1992;100:801–811. doi: 10.1104/pp.100.2.801.

Articles from Plant Signaling & Behavior are provided here courtesy of Taylor & Francis
