Abstract
The fungal plant pathogens Sclerotinia sclerotiorum and S. trifoliorum are morphologically similar but differ considerably in host range. To elucidate the mechanisms underlying this difference, we compared the transcriptomes of the 2 species at the vegetative growth stage to gain insight into commonality and uniqueness in gene expression and pathogenic mechanisms. A total of 23133 and 21043 unique transcripts were obtained from S. sclerotiorum and S. trifoliorum, respectively. Approximately 43% of the transcripts in each species were genes with known functions. Among 1411 orthologous contigs, about 10% (147) were more highly (>3-fold) expressed in S. trifoliorum than in S. sclerotiorum, and about 12% (173) were more highly (>3-fold) expressed in S. sclerotiorum than in S. trifoliorum. The expression levels of genes on supercontig 30 had the highest correlation coefficient between the 2 species. Twenty-seven contigs were new and unique to S. trifoliorum. Additionally, differences between the 2 species were detected in the expression of genes involved in pathogenesis, such as those for oxalate biosynthesis and endopolygalacturonases. These analyses not only revealed similarities and differences in gene expression between the 2 closely related species and provided additional information for annotating the S. sclerotiorum genome, but also laid a foundation for comparing these transcriptomes with host-infecting transcriptomes.
Keywords: RNA sequencing, Sclerotinia sclerotiorum, S. trifoliorum, transcriptome
The ascomycetous fungi Sclerotinia sclerotiorum (Lib.) de Bary and S. trifoliorum Erikss. cause white mold and stem rot diseases on many economically important crops. S. sclerotiorum has a wide host range encompassing more than 400 plant species (Bolton et al. 2006). Its genome contains about 38 megabases (Mb) distributed in 37 supercontigs (Amselem et al. 2011). A recent comprehensive comparison of the S. sclerotiorum genome with that of Botrytis cinerea presents our current understanding of the genome and of the complex nature of its necrotrophic pathogenesis, including its ability to produce oxalic acid and a suite of carbohydrate-active enzymes, among them a series of polygalacturonases (Amselem et al. 2011).
On the other hand, very little is known about the genetics of the closely related species S. trifoliorum. Sclerotinia trifoliorum is morphologically very similar to S. sclerotiorum. The definitive morphological difference is that ascospores are dimorphic in S. trifoliorum but monomorphic in S. sclerotiorum (Kohn 1979). Sclerotinia trifoliorum has a lower optimal temperature for growth and a much narrower host range of about 40 species, mainly cool season forage and grain legumes (Willis 1971; Farr and Rossman 2012). Comparative studies of S. trifoliorum with S. sclerotiorum will allow us to take advantage of the available genomic information on S. sclerotiorum to advance our understanding of S. trifoliorum and to gain insight into the biology and pathogenic mechanisms of both species.
High-throughput RNA sequencing (RNA-Seq) is becoming a powerful and cost-efficient tool for transcriptome analysis and has greatly facilitated studies of the complexity of gene expression, regulation, and networks (Ozsolak et al. 2009; Wang et al. 2009). The technology has recently been applied to transcriptome analysis in a range of organisms, including humans (Pan et al. 2008; Wang et al. 2008), yeast (Nagalakshmi et al. 2008; Wilhelm et al. 2008), rice (Zhang et al. 2010), tomato (Schilmiller et al. 2010), and sorghum (Paszkiewicz and Studholme 2010). RNA-Seq is very sensitive and can detect a large dynamic range of gene expression levels, in contrast to hybridization-based techniques such as microarrays, which lack sensitivity for genes expressed at very low or very high levels and therefore have a much smaller dynamic range. Furthermore, RNA-Seq can accurately quantify expression levels, particularly for alternatively spliced transcripts, lowly expressed regions, potentially functional noncoding RNAs, newly identified exons, and untranslated regions (UTRs) (Zhang et al. 2010). Given these advantages, RNA-Seq is a revolutionary tool for transcriptome analysis.
Our long-term research goal is to understand the mechanisms of the host range difference between S. sclerotiorum and S. trifoliorum. One approach to that goal is to comprehensively understand the complex transcriptome of S. trifoliorum in relation to that of S. sclerotiorum at a genomic level. As part of this approach, we initiated this project to apply RNA-Seq to these 2 species. Our analysis revealed numerous commonalities in expressed genes in the biological process, cellular component, and molecular function categories. Novel transcripts, newly identified exons, and UTRs were also revealed. Qualitative and quantitative analyses based on the transcriptome data may help reveal the mechanisms responsible for host range differences and provide information for future investigation. Characterization of the transcriptomes of S. sclerotiorum and S. trifoliorum will also contribute information for annotating the S. sclerotiorum genome.
Materials and Methods
Strains and Growth Conditions
Sclerotinia sclerotiorum isolate WM-A1 and S. trifoliorum isolate 06CWM-G47 were used in transcriptome sequencing. Isolate 1980 of S. sclerotiorum, the strain from which the genome was sequenced, was used in detection of specific genes. Colonized agar plugs from the edge of actively growing colonies on potato dextrose agar (PDA, Difco) were transferred to fresh PDA overlaid with a cellophane membrane (Bio-Rad Catalog No. 165–0963) and were allowed to grow for 48h at room temperature (22–24 °C). Mycelia were harvested by peeling the mycelial mat off the membrane and were used immediately for isolation of total RNA.
Messenger RNA Isolation and Quality Assessment
Total RNA was isolated by the TRIzol method (Invitrogen) from freshly harvested mycelia of S. sclerotiorum and S. trifoliorum. Then messenger RNA (mRNA) was isolated using a GenElute™ mRNA Miniprep Kit (Sigma) from total RNA. The mRNA quality was assessed using the Bioanalyzer electrophoresis system and test chips (Caliper Life Sciences). A smear from approximately 7kb down to 200bp indicated high quality.
RNA Sequencing and Sequence Assembling
To prepare the mRNA samples for sequencing, 200ng of mRNA was fragmented with the RNA Fragmentation Solution (Roche Diagnostics Company). Double-stranded cDNA was synthesized from the fragmented RNA using the cDNA Synthesis System Kit (no PCR amplification method, Roche). The double-stranded cDNA was purified, a sequencing adaptor was added, and the quality of the cDNA library was assessed with the Bioanalyzer. Sequencing was performed using the “GS FLX Sequencing Kit Titanium Reagents XLR70” on the Genome Sequencer FLX at the Bioinformatics Core of Washington State University. The raw reads were assembled into contigs using gsAssembler v2.5.3 (Roche). The assembled contigs were manually checked for overlapping sequences, and overlapping contigs were combined (this occurred only in contigs with higher numbers of reads), similar to the situation reported by Franssen et al. (2011).
Sequence Analysis
Unigenes (unique contigs and singletons) were compared with the NCBI nr (nonredundant) protein database (June 2010) using the BLASTX algorithm and with the NCBI nr nucleotide database using the BLASTN algorithm. Orthologs between S. sclerotiorum and S. trifoliorum were identified using the NCBI-Blast2 algorithm (e-value threshold of 1 × 10⁻²⁰). The Blast2GO annotation tool was used to assign the most probable gene ontology (GO) terms to the contigs and singletons (Conesa et al. 2005). Default BLASTX parameters with an e-value threshold of 1 × 10⁻³ were loaded into the program. The ESTs were then mapped (that is, GO terms were assigned to positive hits after the BLAST search) to determine the best possible settings prior to the annotation step. Based on the results of the BLAST search and mapping, an annotation configuration with an e-value hit filter of 1 × 10⁻³, an annotation cutoff of “55”, and a GO weight of “8” was selected and used for all analyses. From these annotations, the unigenes were classified into 3 functional categories: biological process, molecular function, and cellular component. Pathway mapping was performed with KEGG (Kyoto Encyclopedia of Genes and Genomes) (Ogata et al. 1999).
Normalized Gene Locus Expression Level Analysis
To facilitate comparison of relative transcript expression levels between RNA samples of the 2 species, we adopted the normalized gene locus expression level analysis developed for shorter (25bp) reads by Mortazavi et al. (2008). Transcript expression levels were normalized by transcript length and by the total number of reads, that is, expressed as reads per kb per million reads (RPKM), using the formula:
$$\mathrm{RPKM} = \frac{10^{9} \times C}{N \times L}$$
where C is the number of reads of the transcript, N is the total number of reads generated in the mRNA sample, and L is the length of the transcript in bases (Mortazavi et al. 2008). As an alternative measure of expression levels, we used the housekeeping gene actin as a standard, together with the number of reads over total sequence reads. The expression levels of selected genes relative to actin expression were compared with the RPKM values to validate the use of RPKM as a measure of relative expression levels.
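The RPKM normalization described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard form of the measure, 10^9 × C / (N × L), with L in bases. The function names and the example counts are hypothetical and are not taken from the study, and the actin-relative helper reflects only one plausible reading of the alternative measure.

```python
def rpkm(read_count: int, total_reads: int, transcript_length_bp: int) -> float:
    """Reads per kilobase of transcript per million sequenced reads.

    read_count           -- C, reads assigned to the transcript
    total_reads          -- N, all reads generated for the mRNA sample
    transcript_length_bp -- L, transcript length in bases
    """
    if total_reads <= 0 or transcript_length_bp <= 0:
        raise ValueError("total_reads and transcript_length_bp must be positive")
    return 1e9 * read_count / (total_reads * transcript_length_bp)


def relative_to_reference(read_count: int, reference_read_count: int) -> float:
    """Expression relative to a reference gene such as actin (same sample).

    Assumption: the alternative measure is a simple ratio of read counts.
    """
    if reference_read_count <= 0:
        raise ValueError("reference_read_count must be positive")
    return read_count / reference_read_count


# Hypothetical example: 50 reads on a 1 kb transcript in a 100000-read library.
example_rpkm = rpkm(read_count=50, total_reads=100_000, transcript_length_bp=1_000)
```

Because both the library size and the transcript length appear in the denominator, the same read count yields a lower RPKM for a longer transcript or a deeper library, which is what makes values comparable between the 2 samples.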
The RPKM values were used to predict the S. sclerotiorum genomic regions that contain more highly expressed genes during vegetative growth. After the unigenes were assigned to the 37 supercontigs in the S. sclerotiorum genome (accessed in June 2010), the average RPKM value of transcripts on each supercontig was used for comparisons among the supercontigs and between species. Further, the RPKM values of transcripts were mapped onto the supercontigs in linear gene order, and correlation analyses were used to compare relative expression on each supercontig between the 2 species.
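The per-supercontig comparison can be illustrated with the following sketch. It assumes that each unigene has already been assigned to a supercontig and that, for the correlation step, the 2 species supply RPKM values for the same genes in the same linear order. The text does not name the correlation statistic, so the use of the Pearson coefficient here is an assumption, and all function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def mean_rpkm_by_supercontig(records):
    """Average RPKM per supercontig.

    records -- iterable of (supercontig_id, rpkm_value) pairs
    Returns a dict {supercontig_id: mean RPKM}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for supercontig, value in records:
        totals[supercontig] += value
        counts[supercontig] += 1
    return {sc: totals[sc] / counts[sc] for sc in totals}


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences of RPKM values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

A coefficient would be computed once per supercontig, passing the gene-ordered RPKM values of one species as `xs` and those of the other species as `ys`.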
Discovery of Uniquely Expressed Genes and Novel Transcripts
Contigs from one species were searched for in the transcriptome (unigenes) of the other species, and those absent from the transcriptome of the other species were considered uniquely expressed genes. The uniquely expressed genes for both species were coded according to their respective locations in the 37 supercontigs of the S. sclerotiorum genome and were graphically displayed using the software MeV (MultiExperiment Viewer) (Saeed et al. 2003).
Contiguous expression regions of more than 50bp, with each base supported by at least 5 reads, were considered transcriptionally active regions or transcript units (Stanke et al. 2008). Transcripts that did not overlap with any predicted transcripts in the Sclerotinia transcript database (http://www.broadinstitute.org/annotation/genome/sclerotinia_sclerotiorum/ accessed in June 2010) and that were located in genome regions not predicted by gene models were considered novel transcripts.
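The selection of uniquely expressed contigs amounts to a filtered set difference, sketched below. The sketch assumes that the cross-species search has already been run and that its hits are available as a set of contig identifiers. The read-count threshold follows the criterion of more than 5 reads applied in the Results, and the function name and data layout are hypothetical.

```python
def uniquely_expressed(contig_reads, matched_in_other_species, min_reads_exclusive=5):
    """Contigs of one species with no match among the unigenes of the other.

    contig_reads             -- dict {contig_id: number of reads in the contig}
    matched_in_other_species -- set of contig_ids that had a hit in the
                                other species' unigenes (search done upstream)
    min_reads_exclusive      -- keep contigs with MORE than this many reads
    """
    return sorted(
        contig_id
        for contig_id, reads in contig_reads.items()
        if reads > min_reads_exclusive and contig_id not in matched_in_other_species
    )
```

Applying the function once in each direction gives the uniquely expressed set for each species.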
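The definition of a transcriptionally active region given above (more than 50bp, every base supported by at least 5 reads) can be expressed as a single scan over per-base read depth. The sketch below assumes that depth is available as a list with one entry per genomic position. The function name and the half-open interval convention are illustrative choices and not part of the published pipeline.

```python
def active_regions(depth, min_depth=5, min_length_exclusive=50):
    """Find runs of positions whose read depth never drops below min_depth.

    depth -- list of per-base read depths along one sequence
    Returns half-open (start, end) intervals longer than min_length_exclusive.
    """
    regions = []
    start = None  # start of the run currently being extended, if any
    for pos, d in enumerate(depth):
        if d >= min_depth:
            if start is None:
                start = pos
        elif start is not None:
            if pos - start > min_length_exclusive:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start > min_length_exclusive:
        regions.append((start, len(depth)))
    return regions
```

Regions returned by such a scan would then be compared with the predicted gene models, and those overlapping no predicted transcript would be the candidates for novel transcripts.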
PCR Analysis for Specific Genes
Sequences of 7 polygalacturonase (PG) genes, SSPG1, SSPG3, SSPG5, SSPG6, SSXPG1, SSXPG2, and SSPG1d, were obtained from the NCBI database (Hegedus and Rimmer 2005). PCR primers were designed for each of these 7 genes and were used to determine whether the genes were expressed in the EST pools of isolates WM-A1 of S. sclerotiorum and 06CWM-G47 of S. trifoliorum, and also to detect the genes in genomic DNA of these 2 isolates and of isolate 1980. PCR was performed under the following conditions: 94 °C for 5min; 32 cycles of 94 °C for 15s, 52 °C for 30s, and 72 °C for 1min; and a final extension at 72 °C for 7min.
Data Archiving
In fulfillment of data archiving guidelines (Baker 2013), we have deposited the primary data underlying these analyses in the NCBI Transcriptome Shotgun Assembly Sequence Database. The S. sclerotiorum transcriptome data is under the accessions from JP572948 to JP593051 and the S. trifoliorum transcriptome data is under the accessions from JP555362 to JP572947.
Results
The 454 Sequencing and Data Analysis
The 454 Titanium Sequencing generated 102563 and 88341 sequence reads in S. sclerotiorum and S. trifoliorum, respectively. The assembly using gsAssembler v2.5.3 (Roche) produced 3659 contigs (average length 822bp) and 19474 singletons in S. sclerotiorum, and 3018 contigs (average length 815bp) and 18025 singletons in S. trifoliorum (Table 1). Most of the contig lengths are between 500 and 2000bp with the longest being 5000bp for both species (Supplementary Figure S1). The length distributions of contigs were almost the same for both S. sclerotiorum and S. trifoliorum with 2 peaks in 600bp and 2000bp sizes. Likewise, the read distributions of contigs were also the same for both species with the largest number of contigs with 20 reads (Supplementary Figure S1). There were 23133 and 21043 unique sequences (unigenes) in S. sclerotiorum and S. trifoliorum, respectively. The average lengths of the unigenes are 439 and 418bp covering 10.1 and 8.8Mb for S. sclerotiorum and S. trifoliorum, respectively (Table 1).
Table 1.
 | S. sclerotiorum | S. trifoliorum |
---|---|---|
Total sequence reads | 102563 | 88341 |
Total assembled contigs | 3659 | 3018 |
Contigs average length (bp) | 822 (183–6773) | 815 (100–7622) |
Total single reads (singletons) | 19474 | 18025 |
Singleton average length (bp) | 475 (97–583) | 448 (100–650) |
Total unique sequences (unigenes) | 23133 | 21043 |
Unigene average length | 439 | 418 |
Total length of all unigene sequences | 10.1 Mb | 8.8 Mb |
Number of genes with BLAST hits (% of unigenes) | 18420 (79.6%) | 16346 (77.7%) |
Contigs | 3516 (96.1%) | 2887 (95.7%) |
Singletons | 14904 (76.5%) | 13459 (74.7%) |
Number of annotated genes (% of unigenes) | 9919 (42.9%) | 8704 (41.4%) |
Contigs | 3054 (83.4%) | 2531 (83.9%) |
Singletons | 6865 (35.2%) | 6173 (34.2%) |
Number of orthologs (% of unigenes) | 13256 (57.3%) | 10814 (51.4%) |
EST Annotation and Functional Classification
All the unigenes of S. sclerotiorum had corresponding DNA sequences in the S. sclerotiorum genome, indicating the reliability of the transcriptome sequences. However, only about 84% of the unigenes of S. trifoliorum had corresponding sequences in the S. sclerotiorum genome. About 80% (18420) of the unigenes in S. sclerotiorum matched annotated genes in the S. sclerotiorum genome, and about 78% (16346) of the unigenes of S. trifoliorum matched annotated genes in the S. sclerotiorum genome (Table 1). The remaining unigene sequences (about 20%) had no BLAST hits in the NCBI transcript database.
In assigning most probable GO terms, about 43% of the unigenes were genes with known functions for both species (Table 1). About 55% of the unigenes were orthologs (i.e., they were found in both species). For both species, about 17%, 32%, and 39% of the unigenes were proposed to have functions in cellular component, biological process, and molecular function, respectively (Supplementary Table S1). There were 1323 (5.7%) and 1155 (5.5%) unigenes in S. sclerotiorum and S. trifoliorum, respectively, with annotations in all 3 GO categories, and 7486 (32.3%) unigenes in S. sclerotiorum and 6387 (30.3%) unigenes in S. trifoliorum had annotations for 2 of the 3 categories (Supplementary Table S1). Similarly, the percentages of gene annotation in the subcategories were about the same for both species (Supplementary Table S1). The distribution of GO categories is very similar in S. sclerotiorum and S. trifoliorum, with no categories showing significant differences between the 2 species (Supplementary Table S1).
All the unigenes were also analyzed with the KEGG mapping program. As with the GO term annotations, the metabolic pathway mapping gave nearly the same results for both species. About 6% of the unigenes were involved in various known metabolic pathways (Figure 1). About 30% of the unigenes involved in the metabolic pathways are related to carbohydrate metabolism and amino acid metabolism, and about 20% are related to energy metabolism, lipid metabolism, nucleotide metabolism, and metabolism of cofactors and vitamins. About 62% of the metabolism-related genes had annotations for at least 2 subcategories.
Differentially and Uniquely Expressed Transcripts in S. sclerotiorum and S. trifoliorum
The RPKM values and the actual transcript levels of the housekeeping actin gene were compared to assure the accuracy of the RPKM values. Actin gene expression levels (number of reads) relative to the total transcriptome (total reads) were the same for both species as expected, and so were their RPKM values. Similar findings were found for some other housekeeping genes like histone protein H2B, histone protein 2A, 14-3-3 protein, and casein kinase 1 protein (data not shown). Thus, the RPKM values were used in subsequent analyses.
The RPKM values were used to compare orthologous contigs between S. sclerotiorum and S. trifoliorum, and orthologs with more than a 3-fold difference in RPKM values were considered differentially expressed genes. Among 1411 orthologous contigs, 173 were more highly expressed in S. sclerotiorum and 147 were more highly expressed in S. trifoliorum (data not shown).
The unigenes with the top 20 highest RPKM values in each species were chosen for comparison. Surprisingly, only 2 genes (SS1G_05520, elongation factor 1-alpha, and SS1G_03527, a hypothetical protein) were in common between the 2 species, and their expression levels in the 2 species were about the same, with a ratio of 1 for their RPKM values (Tables 2 and 3). More than half of the top 20 highly expressed genes are hypothetical proteins in both species. One (JP573108) of the top 20 S. sclerotiorum contigs had no ortholog in the unigenes of S. trifoliorum. Three (JP555429, JP555402, and JP556642) of the top 20 S. trifoliorum contigs had no orthologs in the unigenes of S. sclerotiorum, and 1 (JP556642) of the 3 had no corresponding sequence in the S. sclerotiorum genome (Table 3). The heat shock 70 kDa protein gene (JP555401) was 368 times more highly expressed in S. trifoliorum than in S. sclerotiorum (Table 3).
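The 3-fold criterion can be written as a small classification function. This is a hedged sketch: it assumes each ortholog pair has a positive RPKM value in both species, it treats a ratio of exactly 3 as not differentially expressed (following the wording "more than 3-fold"), and the function name and labels are hypothetical.

```python
def classify_ortholog(rpkm_a: float, rpkm_b: float, fold: float = 3.0) -> str:
    """Label an ortholog pair by its RPKM ratio between species A and B.

    Returns "higher_in_a", "higher_in_b", or "similar".
    """
    if rpkm_a <= 0 or rpkm_b <= 0:
        raise ValueError("both RPKM values must be positive")
    if rpkm_a / rpkm_b > fold:
        return "higher_in_a"
    if rpkm_b / rpkm_a > fold:
        return "higher_in_b"
    return "similar"
```

With the counts reported above, 173 and 147 of 1411 orthologous contigs correspond to about 12% and 10% of the orthologs, respectively.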
Table 2.
Rank | Transcript | Length | RPKM | Description | Ratio^a |
---|---|---|---|---|---|
1 | JP572951 | 1394 | 16101 | SS1G_04148, hypothetical protein | 435 |
2 | JP573085 | 491 | 15648 | SS1G_13505, hypothetical protein | 18 |
3 | JP573087 | 701 | 12588 | SS1G_01463, hypothetical protein | 4 |
4 | JP572964 | 1028 | 12586 | SS1G_14212, hypothetical protein | 139 |
5 | JP572950 | 3051 | 10837 | SS1G_02378, ATP citrate lyase | 4 |
6 | JP572948 | 551 | 10184 | SS1G_12299, hypothetical protein | 56 |
7 | JP572966 | 1625 | 10164 | SS1G_07059, phosphatidylserine decarboxylase | 406 |
8 | JP573022 | 824 | 8993 | SS1G_00095, hypothetical protein | 14 |
9 | JP574227 | 745 | 8677 | SS1G_10096, hypothetical protein | 8 |
10 | JP573023 | 828 | 8125 | SS1G_05464, hypothetical protein | 13 |
11 | JP573001 | 1519 | 7414 | SS1G_10716, hypothetical protein | 17 |
12 | JP574261 | 1683 | 5087 | SS1G_05520, elongation factor 1-alpha | 1 |
13 | JP574318 | 586 | 3694 | SS1G_03611, predicted protein | 3 |
14 | JP575689 | 1198 | 3516 | SS1G_07798, glyceraldehyde-3-phosphate dehydrogenase | 7 |
15 | JP573081 | 773 | 3254 | SS1G_09040, hypothetical protein | 2 |
16 | JP573084 | 2237 | 3199 | SS1G_03527, hypothetical protein | 1 |
17 | JP573099 | 599 | 3076 | SS1G_00699, superoxide dismutase | 11 |
18 | JP573108 | 330 | 3014 | SS1G_14006, hypothetical protein | N/A |
19 | JP573042 | 556 | 2841 | SS1G_01396, hypothetical protein | 22 |
20 | JP573041 | 899 | 2787 | SS1G_08110, hypothetical protein | 2 |
Average | | 1081 | 7589 | | 61 |
^a The ratio is the RPKM value in S. sclerotiorum divided by that of the ortholog from S. trifoliorum. N/A means no ortholog.
Table 3.
Rank | Transcript | Length | RPKM | Description | Ratio^a |
---|---|---|---|---|---|
1 | JP555362 | 912 | 26338 | SS1G_09707, hypothetical protein | 889 |
2 | JP555364 | 1119 | 22093 | SS1G_11468, hypothetical protein | 225 |
3 | JP555401 | 2585 | 10339 | SS1G_00134, heat shock 70 kDa protein | 368 |
4 | JP556651 | 1008 | 5188 | SS1G_00849, hypothetical protein | 18 |
5 | JP555400 | 1856 | 4995 | SS1G_04403, fatty acid desaturase | 9 |
6 | JP555429 | 1130 | 4889 | SS1G_04857, hypothetical protein | N/A |
7 | JP555402 | 756 | 4881 | SS1G_03138, hypothetical protein | N/A |
8 | JP555422 | 1232 | 4318 | SS1G_09963, hypothetical protein | 216 |
9 | JP555363 | 3159 | 3935 | SS1G_03527, hypothetical protein | 1 |
10 | JP555421 | 1386 | 3855 | SS1G_09038, formate dehydrogenase | 2 |
11 | JP557401 | 1675 | 3832 | SS1G_05520, elongation factor 1-alpha | 1 |
12 | JP555387 | 812 | 3792 | SS1G_02599, hypothetical protein | 4 |
13 | JP555430 | 1680 | 3180 | SS1G_10197, acyl- desaturase | 4 |
14 | JP555441 | 569 | 3084 | SS1G_13462, hypothetical protein | 110 |
15 | JP556642 | 498 | 3069 | Not found in the S. s. genome | N/A |
16 | JP555371 | 2600 | 3052 | SS1G_04923, hypothetical protein | 4 |
17 | JP557814 | 1552 | 2939 | SS1G_13599, hypothetical protein | 6 |
18 | JP555368 | 2429 | 2824 | SS1G_13600, hypothetical protein | 3 |
19 | JP555427 | 635 | 2442 | SS1G_06992, hypothetical protein | 77 |
20 | JP555388 | 2739 | 2414 | SS1G_07425, zinc finger protein | 25 |
Average | | 1517 | 6073 | | 115 |
^a The ratio is the RPKM value in S. trifoliorum divided by that of the ortholog from S. sclerotiorum. N/A means no ortholog.
To identify uniquely expressed genes, we took the contigs with more than 5 reads in 1 species and searched for them among all the unigenes of the other species. The contigs in 1 species that were absent from the unigenes of the other species were considered uniquely expressed genes, and their positions in the supercontigs of the S. sclerotiorum genome were located. There were 612 (17%) contigs uniquely expressed in S. sclerotiorum and 341 (11%) contigs uniquely expressed in S. trifoliorum (Figure 2). Twenty-seven of the 341 uniquely expressed genes in S. trifoliorum had no BLAST hits in the S. sclerotiorum genome database or in the NCBI database (Table 4), indicating that they were unique to S. trifoliorum.
Table 4.
 | S. sclerotiorum | S. trifoliorum |
---|---|---|
Total assembled contigs | 3659 | 3018 |
Uniquely expressed contigs | 612 | 341 |
Absent in the S. sclerotiorum genome | 0 | 27 |
Novel transcripts (contigs with >5 reads) | 143 | 131 |
Orthologs | 15 | 15 |
Absent in the S. sclerotiorum genome | 0 | 22 |
Novel Transcripts
There were 143 and 131 contigs in S. sclerotiorum and S. trifoliorum, respectively, that had no BLAST hits in the S. sclerotiorum transcript database or in the NCBI transcript database. They are considered novel transcripts. Fifteen of the novel transcripts were found in both species and are orthologs (Table 4, Supplementary Table S2). The presence of these novel transcripts in both species not only shows the reliability of the transcript sequences but also provides evidence that the novel transcripts are likely functional. Twenty-two of the 131 S. trifoliorum novel transcripts had no BLAST hits in the S. sclerotiorum genome or in NCBI DNA sequences (Table 4) and therefore are unique to S. trifoliorum. These 22 novel transcripts are included in the 27 uniquely expressed genes with no BLAST hits in the S. sclerotiorum genome database.
Gene Expression Levels Among Supercontigs of the S. sclerotiorum Genome
The average RPKM values of contigs and singletons were 215 and 26, respectively, for S. sclerotiorum, and 231 and 32, respectively, for S. trifoliorum. The average RPKM values of all unigenes were 69 and 75 for S. sclerotiorum and S. trifoliorum, respectively.
When the average RPKM values of transcripts on the 37 supercontigs of the S. sclerotiorum genome were compared, 3 distinct features were found between S. sclerotiorum and S. trifoliorum (Figure 3). First, supercontig 35 had 7-fold higher levels of average RPKM value than any of the other supercontigs for both species. Second, S. sclerotiorum had distinctly higher average RPKM values on supercontigs 20, 21, 28, 30, and 31 than did S. trifoliorum. And third, S. trifoliorum had distinctly higher RPKM values on supercontigs 14 and 18 than did S. sclerotiorum (Figure 3). The numbers of model genes in each supercontig are also shown in Figure 3.
Correlation coefficients between the 2 species were calculated for the RPKM values of unigenes mapped in linear gene order onto each of the 37 supercontigs of the S. sclerotiorum genome (Figure 4). The genes on supercontig 30 had the highest correlation coefficient, which means that gene expression was most consistent between the 2 species on supercontig 30. Nine of the 37 supercontigs showed a negative correlation between S. sclerotiorum and S. trifoliorum, with supercontig 29 having the strongest negative coefficient. Supercontig 31 had the lowest absolute coefficient value between the 2 species. The relative RPKM values of genes on supercontigs 2, 29, and 31 are depicted in Supplementary Figures S2–S4, respectively.
Expression of Oxalate Metabolism Genes and PG Gene Family
Since one of our goals was to compare the 2 species in the expression of genes responsible for pathogenicity, we specifically searched for pathogenicity genes previously known in S. sclerotiorum, namely oxalate metabolism genes (Schmid et al. 2010) and PG genes (Hegedus and Rimmer 2005). All 7 PG genes were detected in the genomic DNA of S. sclerotiorum and S. trifoliorum using PCR (Table 5). SSPG3 and SSPG5 had no transcripts in the transcriptome of either species. SSPG6 had 2 singletons of the same gene in S. sclerotiorum but no transcript in S. trifoliorum. SSXPG2 had 1 singleton in S. trifoliorum but no transcripts in S. sclerotiorum. The expression level of SSPG1d in S. trifoliorum was 23-fold higher than in S. sclerotiorum (Table 5). With respect to oxalate metabolism, Schmid et al. (2010) proposed 12 enzymes potentially involved in fungal oxalate metabolism. Three of the 12 enzymes (glyoxylate oxidase, oxalate oxidase, and glycolate oxidase) had no transcripts in the transcriptome of either species, and the genes for these 3 enzymes were not found in the genome of S. sclerotiorum. The remaining 9 enzymes were found in the unigenes of both species (Table 6). Two contigs (1 in each species) encoding an oxaloacetate hydrolase were found in the transcriptomes. Potentially, 2 different oxalate decarboxylase genes operate in the 2 species: the gene SS1G_08814, with 2 singletons in S. sclerotiorum, and the gene SS1G_10796, with 2 contigs in S. trifoliorum (Table 6). For isocitrate lyase, 2 singletons were found in S. sclerotiorum (one each for genes SS1G_04975 and SS1G_04900), whereas 2 singletons were found in S. trifoliorum, both for the gene SS1G_04975 (Table 6).
Table 5.
Gene | Genomic location | PCR forward (F) and reverse (R) primer sequences (5′ to 3′)b | S. sclerotiorum transcript (RPKM) | S. sclerotiorum genomic DNA | S. trifoliorum transcript (RPKM) | S. trifoliorum genomic DNA |
---|---|---|---|---|---|---|
SSPG1 | SS1G_01407 | F: TTTGGATCTTGGAAATGATG; R: GGTGGACCATTGTACTCAGA | JP575285 (396) | Y | JP557844 (217) | Y |
SSPG1d | SS1G_10167 | F: CTAGTGGAATCCAGTGCTTG; R: AGTACGGTGTTGTCATCGAG | JP592135 (18), JP580764 (21) | Y | JP557846 (460) | Y |
SSPG3 | SS1G_10698 | F: TACAGAGTCGGAGTTGGAAA; R: CGTCTTTGAGCTTTTGAAGA | Not expressed | Y | Not expressed | Y |
SSPG5 | SS1G_04177 | F: TTCTCATTCCTTGGCTCATA; R: AAACGATGGTAGCACAAGAA | Not expressed | Y | Not expressed | Y |
SSPG6 | SS1G_11057 | F: ATCAAGTCCAACGAAGGAAC; R: CCGGTTGATGGATAGTTACA | JP591710 (20), JP584752 (20) | Y | Not expressed | Y |
SSXPG1 | SS1G_04207 | F: TGCAAATAAATATGCGAGGT; R: GCCAGTAAAGGACCTGAAGT | JP586867 (23) | Y | JP567830 (24) | Y |
SSXPG2 | SS1G_02553 | F: CTCCTGTGGTCAACAAACAT; R: ATGTTTGTTGACCACAGGAG | Not expressed | Y | JP560305 (23) | Y |
Y, positive detection by PCR in genomic DNA.
Table 6.
No. | Enzyme | Gene | S. sclerotiorum | S. trifoliorum |
---|---|---|---|---|
1 | Oxaloacetate hydrolase | SS1G_08218 | JP572992 (736) | JP555371 (3052) |
2 | Glyoxylate oxidase | N/A | Not detected | Not detected |
3 | Isocitrate lyase | SS1G_04975 | JP590289 (20) | JP568434 (23), JP561743 (27) |
 | | SS1G_04900 | JP587238 (21) | |
4 | Malate synthase | SS1G_05583 | JP575586 (174) | JP570628 (24) |
5 | Pyruvate carboxylase | SS1G_12839 | JP573066 (922) | JP555467 (957) |
6 | Aspartate aminotransferase | SS1G_14097 | JP573531 (190) | JP555461 (590) |
 | | SS1G_03827 | JP575170 (291) | JP558186 (141) |
7 | Oxalate decarboxylase | SS1G_08814 | JP583239 (20), JP582537 (23) | Not detected |
 | | SS1G_10796 | Not detected | JP556986 (461), JP557181 (315) |
8 | Formate dehydrogenase | SS1G_09038 | JP573100 (2184) | JP555421 (3855) |
9 | Malate dehydrogenase | SS1G_08975 | JP573000 (751) | JP555378 (1698) |
10 | Succinate dehydrogenase | SS1G_07864 | JP575092 (191) | JP557178 (531) |
11 | Oxalate oxidase | N/A | Not detected | Not detected |
12 | Glycolate oxidase | N/A | Not detected | Not detected |
Discussion
Next-generation sequencing technologies are now being exploited to analyze not only static genomes but also dynamic transcriptomes, in an approach termed RNA-Seq that can replace earlier transcriptomics techniques, which relied largely on hybridization-based microarray technologies (Ozsolak and Milos 2011). There are 3 commercially available platforms (Roche 454, Illumina, and ABI SOLiD) applicable to RNA-Seq. Illumina and ABI SOLiD sequencing can generate large amounts of sequence (5–15 Gbp total per run) but have a short read length (30–100bp), and short reads may be more difficult to map accurately onto genomes, especially in regions of repeated sequences (Ansorge 2009). The Roche 454 sequencing platform is based on pyrosequencing in microreactors on a picotiter plate (Margulies et al. 2005). It has the advantage of generating long sequence reads (400bp) and the disadvantage of generating a smaller amount of data, approximately 0.25–1 Gbp of sequence information per plate with the 454 GS FLX and Titanium systems. In this study, the average length of sequence reads was more than 400bp, which is almost 7–15 times longer than the average lengths from Illumina and ABI SOLiD sequencing (Mortazavi et al. 2008; Wang et al. 2008; Maher et al. 2009; Graveley et al. 2010) and facilitated contig assembly and mapping onto the genome sequences.
The genome of S. sclerotiorum contains 38.33Mb with 14522 genes identified by gene models, with exons covering 15.78Mb. In our study, 42.25 and 37.54Mb of raw data were generated for S. sclerotiorum and S. trifoliorum, respectively, which gave almost 3-fold coverage of the genome exon size. Finally, 23133 and 21043 unigenes were produced in S. sclerotiorum and S. trifoliorum, covering 10.1 and 8.8Mb of their genomes. Thus, these 2 transcriptome libraries covered 70–80% of the genome exonic region at a single time point during vegetative growth. Using Illumina sequencing, Wang et al. (2010) identified 11263 genes out of a total of 12074 model genes with reads of 44bp average length and 145-fold coverage of the Aspergillus oryzae genome size. In our study, even the low sequencing depth of Roche 454 still gave good coverage of both the genome exonic region (70–80%) and the transcripts compared with Illumina sequencing. Despite the large number of singletons in the transcriptomes due to low sequencing depth, the distinctly longer reads facilitated read assembly and reliable mapping onto genome sequences, providing an advantage in using Roche 454 sequencing for S. trifoliorum, for which a genome sequence is not available.
Being closely related species of Sclerotinia, S. sclerotiorum and S. trifoliorum have a similar metabolic potpourri: About 60% of their transcripts are orthologs. The 2 species also have almost the same percentages of transcripts in the Biological Process, Cellular Component, and Molecular Function GO categories and in the 11 KEGG metabolic pathways, which indicates a very close relationship at the molecular level. For the transcripts that were mapped onto the supercontigs, genes on supercontig 35 were 7 times more highly expressed on average than genes on any other supercontig for both species. The relative expression levels of genes on 8 supercontigs were correlated between the 2 species. Among the transcripts that were not annotated (novel transcripts), 15 were identical between the 2 species. The 2 species also exhibited similar expression levels of enzymes involved in pectin degradation and in oxalate production.
Despite these commonalities, the 2 species differ in many differentially and uniquely expressed genes. About 10% of the orthologs were more highly expressed in S. trifoliorum than in S. sclerotiorum, and about 12% of the orthologs were more highly expressed in S. sclerotiorum than in S. trifoliorum. In addition, about 17% of the contigs were uniquely expressed in S. sclerotiorum, and about 11% of the contigs were uniquely expressed in S. trifoliorum. Only 2 of the 20 most highly expressed genes are common to the 2 species. Moreover, for 29 of the 37 supercontigs, the correlation coefficient of average RPKM between the linear genes is smaller than 0.4, which means that gene expression on most supercontigs differs considerably between the 2 species. The high expression level of the heat shock protein in S. trifoliorum suggests that the room temperature used in this experiment was too high for optimal growth of the species, since it prefers lower temperatures (Willis 1971).
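The comparisons in this paragraph rest on 3 calculations: RPKM normalization (Mortazavi et al. 2008), a >3-fold threshold for differential expression between orthologs, and a correlation coefficient of expression values per supercontig. The following Python code is a minimal sketch of these calculations, not the pipeline used in this study. The function names and the example values are hypothetical, and the use of Pearson's coefficient is an assumption.

```python
import math


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def classify_ortholog(rpkm_a, rpkm_b, fold=3.0):
    """Label an ortholog pair by which species exceeds the other by more than `fold`.

    The default of 3.0 mirrors the >3-fold threshold; "a" and "b" are
    placeholder labels for the 2 species.
    """
    if rpkm_a > fold * rpkm_b:
        return "higher_in_a"
    if rpkm_b > fold * rpkm_a:
        return "higher_in_b"
    return "similar"


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length value lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical RPKM values for 4 genes on one supercontig in each species.
species_a = [12.0, 40.0, 3.5, 90.0]
species_b = [10.0, 9.0, 14.0, 85.0]

labels = [classify_ortholog(a, b) for a, b in zip(species_a, species_b)]
r = pearson(species_a, species_b)
# Under the criterion reported here, a supercontig with r < 0.4 would be
# counted as differing considerably between the 2 species.
```

Applying the threshold to the ratio of RPKM values rather than to raw read counts keeps the comparison independent of transcript length and library size, which differ between the 2 libraries.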
Pectin is a major constituent of the plant cell wall, and pectinases produced by S. sclerotiorum play a major role in degradation of the plant cell wall and in pathogenesis (Li et al. 2004; Bolton et al. 2006). The PGs are important pectinases that can degrade unesterified pectate polymers. To date, genes encoding 5 types of S. sclerotiorum endo-PGs (SSPG1, SSPG1d, SSPG3, SSPG5, and SSPG6) and 2 exo-PGs (SSXPG1 and SSXPG2) have been isolated (Li et al. 2004). Expression of SSPG1 precedes that of SSPG3, SSPG5, SSPG6, and the exo-PGs SSXPG1 and SSXPG2 during infection (Favaron et al. 2004; Kasza et al. 2004). In our research, SSPG1 was highly expressed under normal culture conditions (without pectin as substrate), which is in accordance with the report that a basal level of SSPG1 expression is always observed (Li et al. 2004). Such constitutive expression is advantageous because SSPG1 activity can serve to induce additional endo- and exo-PGs for the rapid colonization and infection of healthy tissue (Hegedus and Rimmer 2005). Expression of the acidic endo-PGs SSPG3 and SSPG5 was not detected in either species. Low expression levels of the neutral endo-PGs SSPG1d and SSPG6 and the exo-PGs SSXPG1 and SSXPG2 were detected in both species, reflecting the sensitivity of RNA sequencing.
Oxalic acid, working in concert with cell wall degrading enzymes such as PGs, is regarded as an important pathogenicity factor in Sclerotinia species. Although oxalic acid itself is not required for pathogenesis, because oxalate-minus mutants were still capable of infecting plants as long as the ambient pH was low (Xu et al. 2015), oxalic acid, as a strong organic acid, lowers ambient pH and creates optimal conditions for expression of endopolygalacturonases (Rollins and Dickman 2001). In addition, Kim et al. (2008) reported that oxalic acid induces a programmed cell death response in plant tissue that is required for disease development. There were 12 genes potentially involved in oxalate metabolism. In our study, glyoxylate oxidase, oxalate oxidase, and glycolate oxidase had no transcripts in either species. Schmid et al. (2010) also did not find these 3 genes in Sclerotium rolfsii, which suggests that they are not essential enzymes in oxalate metabolism in fungi.
This study extensively compared the transcriptomes of S. sclerotiorum and S. trifoliorum and gave insight into the commonalities and uniqueness in gene expression between the 2 species. The discovery of uniquely expressed genes provides clues for further investigation into intrinsic differences in pathogenic mechanisms between the 2 species. Furthermore, the comparative analysis provides a foundation for comparison with host-infecting transcriptomes in order to elucidate pathogenic mechanisms.
Supplementary Material
Supplementary material can be found at http://www.jhered.oxfordjournals.org/
Funding
USDA ARS National Sclerotinia Initiative (partial support).
References
- Amselem J, Cuomo CA, van Kan JA, Viaud M, Benito EP, Couloux A, Coutinho PM, de Vries RP, Dyer PS, Fillinger S, et al. 2011. Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea. PLoS Genet. 7:e1002230.
- Ansorge WJ. 2009. Next-generation DNA sequencing techniques. N Biotechnol. 25:195–203.
- Baker CS. 2013. Journal of heredity adopts joint data archiving policy. J Hered. 104:1.
- Bolton MD, Thomma BPHJ, Nelson BD. 2006. Sclerotinia sclerotiorum (Lib.) de Bary: biology and molecular traits of a cosmopolitan pathogen. Mol Plant Pathol. 7:1–16.
- Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. 2005. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 21:3674–3676.
- Farr DF, Rossman AY. 2012. Fungal databases. Available from: http://nt.ars-grin.gov/fungaldatabases/index.cfm
- Favaron F, Sella L, D’Ovidio R. 2004. Relationships among endo-polygalacturonase, oxalate, pH, and plant polygalacturonase-inhibiting protein (PGIP) in the interaction between Sclerotinia sclerotiorum and soybean. Mol Plant Microbe Interact. 17:1402–1409.
- Franssen SU, Shrestha RP, Brautigam A, Bornberg-Bauer E, Weber APM. 2011. Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics. 12:227.
- Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van Baren MJ, Boley N, Booth BW, et al. 2010. The developmental transcriptome of Drosophila melanogaster. Nature. 471:473–479.
- Hegedus DD, Rimmer SR. 2005. Sclerotinia sclerotiorum: when “to be or not to be” a pathogen? FEMS Microbiology Lett. 251:177–184.
- Kasza Z, Vagvolgyi C, Fevre M, Cotton P. 2004. Molecular characterization and in planta detection of Sclerotinia sclerotiorum endopolygalacturonase genes. Curr Microbiol. 48:208–213.
- Kim KS, Min JY, Dickman MB. 2008. Oxalic acid is an elicitor of plant programmed cell death during Sclerotinia sclerotiorum disease development. Mol Plant Microbe Interact. 21:605–612.
- Kohn LM. 1979. Delimitation of the economically important plant pathogenic Sclerotinia species. Phytopathology. 69:881–886.
- Li R, Rimmer R, Buchwaldt L, Sharpe AG, Seguin-Swartz G, Coutu C, Hegedus DD. 2004. Interaction of Sclerotinia sclerotiorum with a resistant Brassica napus cultivar: expressed sequence tag analysis identifies genes associated with fungal pathogenesis. Fungal Genet Biol. 41:735–753.
- Maher CA, Kumar-Sinha C, Cao X, Kalyana-Sundaram S, Han B, Jing X, Sam L, Barrette T, Palanisamy N, Chinnaiyan AM. 2009. Transcriptome sequencing to detect gene fusions in cancer. Nature. 458:97–101.
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 437:376–380.
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621–628.
- Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M. 2008. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 320:1344–1349.
- Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. 1999. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 27:29–34.
- Ozsolak F, Milos PM. 2011. RNA sequencing: advances, challenges and opportunities. Nat Rev Genet. 12:87–98.
- Ozsolak F, Platt AR, Jones DR, Reifenberger JG, Sass LE, McInerney P, Thompson JF, Bowers J, Jarosz M, Milos PM. 2009. Direct RNA sequencing. Nature. 461:814–818.
- Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. 2008. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 40:1413–1415.
- Paszkiewicz K, Studholme DJ. 2010. De novo assembly of short sequence reads. Brief Bioinform. 11:457–472.
- Rollins JA, Dickman MB. 2001. pH signaling in Sclerotinia sclerotiorum: identification of a pacC/RIM1 homolog. Appl Environ Microbiol. 67:75–81.
- Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, et al. 2003. TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 34:374–378.
- Schilmiller AL, Miner DP, Larson M, McDowell E, Gang DR, Wilkerson C, Last RL. 2010. Studies of a biochemical factory: tomato trichome deep expressed sequence tag sequencing and proteomics. Plant Physiol. 153:1212–1223.
- Schmid J, Muller-Hagen D, Bekel T, Funk L, Stahl U, Sieber V, Meyer V. 2010. Transcriptome sequencing and comparative transcriptome analysis of the scleroglucan producer Sclerotium rolfsii. BMC Genomics. 11:329.
- Stanke M, Diekhans M, Baertsch R, Haussler D. 2008. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 24:637–644.
- Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008. Alternative isoform regulation in human tissue transcriptomes. Nature. 456:470–476.
- Wang Z, Gerstein M, Snyder M. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 10:57–63.
- Wang B, Guo G, Wang C, Lin Y, Wang X, Zhao M, Guo Y, He M, Zhang Y, Pan L. 2010. Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing. Nucleic Acids Res. 38:5075–5087.
- Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J, Bahler J. 2008. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature. 453:1239–1243.
- Willis CB. 1971. Incubation temperature differentiates S. sclerotiorum, S. trifoliorum and S. minor. Proc Can Phytopathol Soc. 37:21–30.
- Xu L, Xiang M, White D, Chen W. 2015. pH dependency of sclerotial development and pathogenicity revealed by using genetically defined oxalate-minus mutants of Sclerotinia sclerotiorum. Environ Microbiol. 17:2896–2909.
- Zhang GJ, Guo GW, Hu XD, Zhang Y, Li QY, Li RQ, Zhuang RH, Lu ZK, He ZQ, Fang XD, et al. 2010. Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome. Genome Res. 20:646–654.