The Plant Cell. 2009 Aug;21(8):2203–2219. doi: 10.1105/tpc.109.068411

Transcript Profiling Provides Evidence of Functional Divergence and Expression Networks among Ribosomal Protein Gene Paralogs in Brassica napus[W],[OA]

Carrie A Whittle, Joan E Krochko
PMCID: PMC2751962  PMID: 19706795

Abstract

The plant ribosome is composed of 80 distinct ribosomal (r)-proteins. In Arabidopsis thaliana, each r-protein is encoded by two or more highly similar paralogous genes, although only one copy of each r-protein is incorporated into the ribosome. Brassica napus is especially suited to the comparative study of r-protein gene paralogs due to its documented history of genome duplication as well as the recent availability of large EST data sets. We have identified 996 putative r-protein genes spanning 79 distinct r-proteins in B. napus using EST data from 16 tissue collections. A total of 23,408 tissue-specific r-protein ESTs are associated with this gene set. Comparative analysis of the transcript levels for these unigenes reveals that a large fraction of r-protein genes are differentially expressed and that the number of paralogs expressed for each r-protein varies extensively with tissue type in B. napus. In addition, in many cases the paralogous genes for a specific r-protein are not transcribed in concert and have highly contrasting expression patterns among tissues. Thus, each tissue examined has a novel r-protein transcript population. Furthermore, hierarchical clustering reveals that particular paralogs for nonhomologous r-protein genes cluster together, suggesting that r-protein paralog combinations are associated with specific tissues in B. napus and, thus, may contribute to tissue differentiation and/or specialization. Altogether, the data suggest that duplicated r-protein genes undergo functional divergence into highly specialized paralogs and coexpression networks and that, similar to recent reports for yeast, these are likely actively involved in differentiation, development, and/or tissue-specific processes.

INTRODUCTION

The ribosome is a two subunit enzyme consisting of both rRNA and ribosomal proteins (r-proteins) and is essential for catalyzing the peptidyl transferase reaction during polypeptide synthesis (Chang et al., 2005). In Arabidopsis thaliana, the cytoplasmic ribosome, which synthesizes the vast majority of cellular peptides (Bailey-Serres, 1998), is composed of four rRNAs (the large subunit contains 26S, 5.8S, and 5S rRNA, and the small subunit contains 18S rRNA) and ∼80 distinct ribosomal proteins (32 proteins for the small subunit and 48 proteins in the large subunit; Barakat et al., 2001; Chang et al., 2005). Ribosomal proteins are therefore essential to protein synthesis and consequently play an important and integral role in plant cell metabolism, cell division, plant growth, and fitness. For example, r-protein gene mutants in Arabidopsis have been shown to impair translation of photosynthesis proteins (plastid r-proteins S12 and L11; Pesaresi et al., 2001; Morita-Yamamuro et al., 2004) and delay embryo development and reduce embryo viability (S5, S13, and S16; Tsugeki et al., 1996; Ito et al., 2000; Weijers et al., 2001).

Although each ribosome contains only a single polypeptide of each r-protein, evidence indicates that most or all r-proteins are encoded by two or more highly similar gene family members in Arabidopsis. A total of 249 genes encode the 80 r-proteins; thus, a number of paralogous genes encode the same r-protein (Barakat et al., 2001; Chang et al., 2005). These paralogous genes (i.e., gene family members) could be redundant and/or, as indicated for certain r-protein genes, may be involved in specific plant processes, tissues, or developmental stages (Van Lijsebettens et al., 1994; Williams and Sussex, 1995; Barakat et al., 2001; McIntosh and Bonham-Smith, 2005; Degenhardt and Bonham-Smith, 2008). For example, evidence has shown that among the three genes encoding the r-protein S18 in Arabidopsis, only one copy is markedly upregulated in actively dividing meristemic regions (Van Lijsebettens et al., 1994). In addition, it has been found that one of the genes encoding L11 (previously called L16) in Arabidopsis is upregulated within actively dividing regions, while another gene copy is specific to other cell/tissue types (Williams and Sussex, 1995). Knockdowns of RPL23aA (encoding one of the paralogs of r-protein L23a) in Arabidopsis lead to impaired growth and abnormalities, suggesting that this specific gene plays an essential role in fitness traits. By contrast, knockdowns of the closely related gene family member RPL23aB (also encoding r-protein L23a) have little effect on phenotypes (Degenhardt and Bonham-Smith, 2008). Gene paralogs encoding the r-protein S15a (87 to 100% identity at the amino acid level) are differentially expressed in Arabidopsis, with one gene being completely transcriptionally quiescent, while the other three are highly expressed in mitotically active regions (e.g., flowers and buds; Hulm et al., 2005).

The presence of multiple gene paralogs for r-proteins in plants such as Arabidopsis could be due to a high frequency of ancestral polyploidy events. Although Arabidopsis contains one of the smallest plant genomes (C value = 0.16 pg, 157 Mbp; Arabidopsis Genome Initiative, 2000; Bennett et al., 2003), it has been estimated that >65% of its genes are members of gene families (Arabidopsis Genome Initiative, 2000), which is likely largely attributable to the fact that this species underwent at least three ancient whole-genome duplications over the last 300 million years (Vision et al., 2000; Simillion et al., 2002; Blanc et al., 2003; Bowers et al., 2003). By comparison, Brassica napus has a much larger genome that is even more extensively duplicated (C value = 1.5 pg; Bennett and Leitch, 2005; that can be estimated as 1.5 × 0.978 × 10⁹ bp = 1.51 Gbp; Dolezel et al., 2003). B. napus is an allopolyploid (n = 19) derived from the hybridization of Brassica rapa (contains the Brassica A genome; n = 10) and B. oleracea (contains the Brassica C genome; n = 9). Recent data suggest that Arabidopsis and the genus Brassica/Brassiceae lineage diverged ∼24 million years ago (Lysak et al., 2005), B. oleracea and B. rapa diverged ∼4 million years ago, and B. napus developed within the last 10,000 years (Rana et al., 2004). Also, evidence indicates that on average B. napus has six homologous chromosomal regions for each corresponding region in Arabidopsis. These are believed to result from a triplication event generating a putative hexaploid ancestor (shared by multiple species in the tribe Brassiceae) ∼7.9 to 14.6 million years ago (Lagercrantz, 1998; Lukens et al., 2003; Lysak et al., 2005). Such gene duplications facilitate unequal crossing over at meiosis, which can further enhance the number of gene copies (for example, by leading to tandem duplicated genes; Mondragon-Palomino and Gaut, 2005). Accordingly, the evolution of the r-protein gene complement in B. napus may be highly complex, making this species especially well suited to the study of r-protein gene paralogs compared with other plant species.

There are several possible fates for the redundant gene copies resulting from gene and genome duplication events. First, gene copies can be lost from the genome due to the accumulation of deleterious mutations (i.e., become either pseudogenes or become physically lost from the genome); this is believed to be especially prevalent shortly following polyploidy events (Lynch and Conery, 2000; Vision et al., 2000; Blanc et al., 2003). Second, redundant genes can develop new adaptive functions through beneficial mutations and positive Darwinian selection (i.e., neofunctionalization; Ohno, 1970). Data indicate that positive Darwinian selection underlies the functional diversification of Arabidopsis gene families originating from duplication events, including the MADS box genes (Martinez-Castilla and Alvarez-Buylla, 2003), the nucleotide binding site leucine-rich repeat proteins, receptor-like kinases, the receptor-like proteins (Mondragon-Palomino and Gaut, 2005), and the methylthioalkylmalate synthases (Benderoth et al., 2006). Third, gene copies can undergo subfunctionalization, a process in which the ancestral gene functions become subdivided among the duplicated genes (i.e., the gene copies degenerate to perform complementary functions that, in combination, match the functionality of the ancestral gene, and this may involve partitioning at regulatory regions; Force et al., 1999; Lockton and Gaut, 2005). The subfunctionalized genes may be retained in the genome by negative/purifying selection (selection acting to retain the adaptive phenotype and remove extreme and/or deleterious phenotypes/genotypes), and in this manner, duplicated genes can be retained without positive selection (Force et al., 1999). 
Fourth, duplicated genes may be subjected to gene dosage balance (i.e., proper balance in gene dosage is required to maintain functionality of certain molecular complexes as well as regulatory and signaling pathways) (Gene Balance Hypothesis; Birchler et al., 2001; Birchler and Veitia, 2007; Conant and Wolfe, 2008; Veitia et al., 2008). Because whole-genome duplications (WGDs) maintain molecular balance, genes associated with dosage-sensitive molecular structures/pathways tend more often to be retained after these events than those gene copies derived from localized duplication events (tandem, segmental duplications, and aneuploidy) where genes are often lost to retain dosage balance. Evidence consistent with the gene balance theory, namely, the high retention of genes derived from WGDs and/or low retention of localized duplications, has been reported for a range of gene types (transcription factors, kinases, and signaling genes; Blomme et al., 2006; Birchler and Veitia, 2007; Freeling, 2008), including the ribosomal protein genes in Arabidopsis, yeast, and Paramecium (Papp et al., 2003; Maere et al., 2005; Aury et al., 2006; Freeling, 2008). Once balanced dosage pressures have been alleviated, retained genes may develop new or subdivided functions through subfunctionalization and/or neofunctionalization (Freeling and Thomas, 2006; Birchler and Veitia, 2007; Freeling, 2008). Thus, the gene complement and gene functionality within a genome such as that of B. napus, with an extensive history of duplications, is the result of a complex interaction of pressures, including those that lead to new and partitioned gene functions.

The divergence of redundant genes into new or subdivided functions is believed to underlie widespread differential expression among gene paralogs in Arabidopsis, where it has been found that 57% of newly, and 73% of older, duplicated genes have diverged in expression profiles (Blanc and Wolfe, 2004; also supported by additional Arabidopsis data, e.g., Casneuf et al., 2006). In addition, data from ovules in Gossypium hirsutum indicate that for 10 of the 40 pairs of duplicated genes examined, one gene member is either silenced or has unequal expression and/or the gene members are expressed in different tissues (Adams et al., 2003). These observations are consistent with the evolution of complementary and/or divergent functions.

At present, minimal information is available on whether r-protein gene paralogs have diverged in expression among tissues in plants, including B. napus. This is important for understanding whether the differential expression of these genes is associated with plant differentiation, development, and tissue-specific processes. One of the contributing factors for the paucity of information regarding r-protein genes and gene expression in B. napus (and other species) has been the lack of complete and annotated genome sequence data. However, large tissue-specific EST data sets recently have become available for B. napus, and these data can be effectively used for the identification of genes as well as for studies of gene expression, in which the number of ESTs per gene is an effective measure of gene expression level (Duret and Mouchiroud, 1999; Akashi, 2001; Tiffin and Hahn, 2002; Wright et al., 2004; Mitreva et al., 2006; Whittle et al., 2007). In this study, genes and putative gene families encoding cytoplasmic r-proteins in B. napus were identified based on the currently available EST data set. In addition, the expression levels of these genes were compared across a range of tissues to assess functional divergence and specialization of r-protein gene paralogs.

RESULTS

To identify r-protein genes in B. napus, 623,778 B. napus ESTs (available as of Dec 7, 2008) were compiled; this data set included all publicly available ESTs from the National Center for Biotechnology Information (NCBI) as well as National Research Council of Canada-Plant Biotechnology Institute (NRC-PBI) in-house ESTs. In addition, a complete r-protein sequence data set for the 80 known Arabidopsis r-proteins was compiled (see Methods; Table 1; Barakat et al., 2001). The protein sequences for all Arabidopsis paralogs associated with any specific r-protein (r-protein gene families) were included in this Arabidopsis r-protein data set (n = 229 after adjustments to the complete data set; Table 1). Subsequently, we compared the Arabidopsis r-proteins against the translated B. napus ESTs using TBLASTN (http://blast.ncbi.nlm.nih.gov) and identified 36,938 ESTs that matched r-proteins (with a cutoff of e = 10⁻⁶). These ESTs were clustered and assembled using CAP3 (Huang and Madan, 1999), and each translated unigene sequence (hereafter referred to as a gene) was identified by comparison against the Arabidopsis r-proteins using BLASTX. The match having the lowest e-value was used for identification (with a cutoff of e = 10⁻⁶). All B. napus genes matching an Arabidopsis r-protein gene family, which includes matches to any of the paralogs associated with each Arabidopsis r-protein (Table 1), were considered paralogous genes and therefore comprise a putative B. napus r-protein gene family. An example of a putative B. napus r-protein gene family is provided in Supplemental Figure 1 online.
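The identification rule described above (keep the lowest e-value match, subject to the 10⁻⁶ cutoff, and assign the unigene to the family of the matched Arabidopsis paralog) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the hit tuples, the `PARALOG_TO_FAMILY` mapping, and the function name are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
from typing import Dict, Iterable, Tuple

E_VALUE_CUTOFF = 1e-6  # cutoff stated in the text; inclusive comparison is an assumption

# Hypothetical excerpt of a paralog -> r-protein family lookup (names from Table 1).
PARALOG_TO_FAMILY: Dict[str, str] = {
    "RPSaA": "Sa", "RPSaB": "Sa",
    "RPS2A": "S2", "RPS2B": "S2",
}


def assign_families(hits: Iterable[Tuple[str, str, float]]) -> Dict[str, str]:
    """Map each unigene to the family of its lowest e-value hit.

    hits: (unigene_id, arabidopsis_paralog, e_value) tuples, e.g. parsed
    from tabular BLASTX output. Hits above the cutoff are ignored.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for unigene, paralog, e_value in hits:
        if e_value > E_VALUE_CUTOFF:
            continue  # too weak to count as a match
        if unigene not in best or e_value < best[unigene][1]:
            best[unigene] = (paralog, e_value)
    return {u: PARALOG_TO_FAMILY[p] for u, (p, _) in best.items()}


example_hits = [
    ("contig1", "RPSaA", 1e-40),
    ("contig1", "RPS2A", 1e-10),  # weaker hit, ignored in favor of RPSaA
    ("contig2", "RPS2B", 1e-3),   # fails the cutoff
    ("contig3", "RPSaB", 1e-7),
]
families = assign_families(example_hits)
```

Unigenes that share a family label under this rule correspond to what the text calls a putative B. napus r-protein gene family.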

Table 1.

The 80 Ribosomal Proteins in Arabidopsis and Numbers of B. napus Translated Unigenes (Contigs and Singletons, n = 1665) Matching Each Arabidopsis r-Protein

Columns, left to right: r-protein; Arabidopsis gene paralogs encoding the r-protein (a); AGI identification (b); number of B. napus genes matching the Arabidopsis r-protein, given as contigs (C), singletons (S), and total genes (d); ratio of gene number for B. napus versus Arabidopsis (c).
Sa RPSaA RPSaB At1g72370 At3g04770 16 11 27 13.5
S2 RPS2A RPS2B RPS2C RPS2D At1g58380 At1g59359 At2g41840 At3g57490 15 14 29 7.3
S3 RPS3A RPS3B RPS3C At2g31610 At3g53870 At5g35530 8 5 13 4.3
S3a RPS3aA RPS3aB At3g04840 At4g34670 18 9 27 13.5
S4 RPS4A RPS4B RPS4C(Psg) RPS4D At2g17360 At5g07090 N.A. At5g58420 13 10 23 7.6
S5 RPS5A RPS5B At2g37270 At3g11940 13 13 26 13.0
S6 RPS6A RPS6B At4g31700 At5g10360 11 4 15 7.5
S7 RPS7A RPS7B RPS7C At1g48830 At3g02560 At5g16130 12 8 20 6.7
S8 RPS8A RPS8B At5g20290 At5g59240 8 3 11 5.5
S9 RPS9A(Psg) RPS9B RPS9C At4g12160 At5g15200 At5g39850 10 8 18 9.0
S10 RPS10A RPS10B RPS10C At4g25740 At5g41520 At5g52650 11 4 15 5.0
S11 RPS11A RPS11B RPS11C At3g48930 At4g30800 At5g23740 18 4 22 7.3
S12 RPS12A RPS12B(Psg) RPS12C At1g15930 At1g80800 At2g32060 14 4 18 9.0
S13 RPS13A RPS13B At3g60770 At4g00100 19 10 29 14.5
S14 RPS14A RPS14B RPS14C At2g36160 At3g11510 At3g52580 9 2 11 3.7
S15 RPS15A RPS15B RPS15C RPS15D RPS15E RPS15F At1g04270 At5g09490 At5g09500 At5g09510 At5g43640 At5g63070 10 9 19 3.2
S15a RPS15aA RPS15aB RPS15aC RPS15aD RPS15aE RPS15aF At1g07770 At2g19720 At2g39590 At3g46040 At4g29430 At5g59850 19 7 26 4.3
S16 RPS16A RPS16B RPS16C At2g09990 At3g04230 At5g18380 11 9 20 6.7
S17 RPS17A RPS17B RPS17C RPS17D At2g04390 At2g05220 At3g10610 At5g04800 16 5 21 5.3
S18 RPS18A (A) RPS18B (B) RPS18C (C) At1g22780 At1g34030 At4g09800 10 4 14 4.7
S19 RPS19A RPS19B RPS19C At3g02080 At5g15520 At5g61170 17 11 28 9.3
S20 RPS20A RPS20B RPS20C At3g45030 At3g47370 At5g62300 12 1 13 4.3
S21 RPS21A(Psg) RPS21B RPS21C At3g27450 At3g53890 At5g27700 18 9 27 13.5
S23 RPS23A RPS23B At3g09680 At5g02960 12 2 14 7.0
S24 RPS24A RPS24B At3g04920 At5g28060 23 9 32 16.0
S25 RPS25A RPS25B RPS25C(Psg) RPS25D RPS25E At2g16360 At2g21580 At3g30740 At4g34555 At4g39200 19 12 31 7.8
S26 RPS26B RPS26A RPS26C At2g40510 At2g40590 At3g56340 9 2 11 3.7
S27 RPS27A RPS27B RPS27C(Psg) RPS27D At2g45710 At3g61110 N.A. At5g47930 23 15 38 12.7
S27a RPS27aA RPS27aB RPS27aC At1g23410 At2g47110 At3g62250 13 1 14 4.7
S28 RPS28A RPS28B RPS28C At3g10090 At5g03850 At5g64140 14 5 19 6.3
S29 RPS29A RPS29B RPS29C RPS29D(Psg) At3g43980 At3g44010 At4g33865 N.A. 18 6 24 8.0
S30 RPS30A RPS30B RPS30C At2g19750 At4g29390 At5g56670 19 11 30 10.0
P0 RPP0A RPP0B RPP0C At2g40010 At3g09200 At3g11250 12 13 25 8.3
P1 RPP1A RPP1B RPP1C At1g01100 At4g00810 At5g47700 20 12 32 10.7
P2 RPP2A RPP2B RPP2C RPP2D RPP2E At2g27720 At2g27710 At3g28500 At3g44590 At5g40040 20 5 25 5.0
P3 RPP3A RPP3B At4g25890 At5g57290 7 7 14 7.0
L3 RPL3A(1) RPL3B(2) RPL3C(Psg) At1g43170 At1g61580 At5g42445 11 11 22 11.0
L4 RPL4A RPL4B(Psg) RPL4C(Psg) RPL4D At3g09630 At1g35200 At2g24730 At5g02870 15 21 36 18.0
L5 RPL5A RPL5B RPL5C(Psg) At3g25520 At5g39740 At5g40130 10 17 27 13.5
L6 RPL6A RPL6B RPL6C At1g18540 At1g74060 At1g74050 12 10 22 7.3
L7 RPL7A RPL7B RPL7C RPL7D At1g80750 At2g01250 At2g44120 At3g13580 19 6 25 6.3
L7a RPL7aA RPL7aB At2g47610 At3g62870 17 12 29 14.5
L8 RPL8A RPL8B RPL8C At2g18020 At3g51190 At4g36130 15 12 27 9.0
L9 RPL9A RPL9B RPL9C RPL9D N.A. At1g33120 At1g33140 At4g10450 10 2 12 3.0
L10 RPL10A RPL10B RPL10C At1g14320 At1g26910 At1g66580 8 5 13 4.3
L10a RPL10aA RPL10aB RPL10aC At1g08360 At2g27530 At5g22440 12 12 24 8.0
L11 RPL11A(A) RPL11B RPL11C(B) RPL11D At2g42740 At3g58700 At4g18730 At5g45775 18 14 32 8.0
L12 RPL12A RPL12B RPL12C At2g37190 At3g53430 At5g60670 17 14 31 10.3
L13 RPL13A(Psg) RPL13B RPL13C RPL13D At3g48130 At3g49010 At3g48960 At5g23900 15 20 35 11.7
L13a RPL13aA RPL13aB RPL13aC RPL13aD At3g07110 At3g24830 At4g13170 At5g48760 11 4 15 3.8
L14 RPL14A RPL14B At2g20450 At4g27090 10 7 17 8.5
L15 RPL15A RPL15B At4g16720 At4g17390 5 0 5 2.5
L17 RPL17A RPL17B At1g27400 At1g67430 11 9 20 10.0
L18 RPL18A RPL18B RPL18C At2g47570 At3g05590 At5g27850 10 5 15 5.0
L18a RPL18aA RPL18aB RPL18aC At1g29970 At2g34480 At3g14600 22 10 32 10.7
L19 RPL19A RPL19B RPL19C At1g02780 At3g16780 At4g02230 20 14 34 11.3
L21 RPL21A RPL21B(Psg) RPL21C RPL21D(Psg) RPL21E RPL21F At1g09590 At1g09486 At1g09690 At1g31355 At1g57660 At1g57860 12 2 14 3.5
L22 RPL22A RPL22B RPL22C At1g02830 At3g05560 At5g27770 14 9 23 7.7
L23 RPL23A RPL23B RPL23C At1g04480 At2g33370 At3g04400 15 9 24 8.0
L23a RPL23aA(2) RPL23aB(3) At2g39460 At3g55280 14 20 34 17.0
L24 RPL24A RPL24B At2g36620 At3g53020 6 2 8 4.0
L26 RPL26A RPL26B At3g49910 At5g67510 7 6 13 6.5
L27 RPL27A RPL27B RPL27C At2g32220 At3g22230 At4g15000 12 1 13 4.3
L27a RPL27aA RPL27aB RPL27aC At1g12960 At1g23290 At1g70600 18 4 22 7.3
L28 RPL28A RPL28B(Psg) RPL28C At2g19730 N.A. At4g29410 11 4 15 7.5
L29 RPL29A RPL29B At3g06700 At3g06680 9 3 12 6
L30 RPL30A RPL30B RPL30C At1g36240 At1g77940 At3g18740 11 4 15 5
L31 RPL31A RPL31B RPL31C At2g19740 At4g26230 At5g56710 17 2 19 6.3
L32 RPL32A RPL32B At4g18100 At5g46430 14 1 15 7.5
L34 RPL34A RPL34B RPL34C At1g26880 At1g69620 At3g28900 11 6 17 5.7
L35 RPL35A RPL35B RPL35C RPL35D At3g09500 At2g39390 At3g55170 At5g02610 12 1 13 3.3
L35a RPL35aA RPL35aB RPL35aC RPL35aD At1g07070 At1g41880 At1g74270 At3g55750 15 8 23 5.8
L36 RPL36A RPL36B RPL36C At2g37600 At3g53740 At5g02450 13 5 18 5
L36a RPL36aA RPL36aB At3g23390 At4g14320 10 1 11 5.5
L37 RPL37A RPL37B RPL37C At1g15250 At1g52300 At3g16080 12 1 13 4.3
L37a RPL37aA(Psg) RPL37aB RPL37aC N.A. At3g10950 At3g60245 14 9 23 11.5
L38 RPL38A RPL38B At2g43460 At3g59540 24 12 36 18
L39 RPL39A RPL39B RPL39C At2g25210 At3g02190 At4g31985 12 5 17 5.6
L40 RPL40A RPL40B At2g36170 At3g52590 11 2 13 6.5
L41 RPL41A RPL41B(Psg) RPL41C RPL41D RPL41E RPL41F RPL41G N.A. N.A. At2g40205 At3g08520 At3g11120 N.A. At3g56020 0 0 0
Total number matching r-proteins in B. napus 1079 586 1665 Mean fold increase in gene number 7.9

The Arabidopsis r-protein gene list is based on Barakat et al. (2001). The table includes all B. napus genes matching any one of the Arabidopsis paralogs for a particular r-protein. Arabidopsis gene names and AGI identifiers are given in the same order across the two columns above. C, contigs; S, singletons; Psg, pseudogene; N.A., no AGI number available.

(a) The paralog identifications are those provided by Barakat et al. (2001).

(b) Some of the AGI numbers have been updated/adjusted based on data from The Arabidopsis Information Resource (TAIR). Of particular note, the AGI number for RPL35aA used here is At1g07070 (differing from the identification At1g06980 listed in Barakat et al., 2001) and for RPS25D is At4g34555 (differing from At4g34670 listed in Barakat et al., 2001). The AGI number for RPL18aA listed by Barakat et al. (2001) is At1g29970 and thus is the locus ID used here (an additional ID is also available in TAIR). The AGI numbers provided by Barakat et al. (2001) were used as the template for identification of Arabidopsis r-proteins used in our analysis and were only changed when necessary (e.g., when the AGI number listed by Barakat et al. [2001] is no longer available in TAIR or when a different protein is listed for a particular AGI number; r-proteins previously lacking an AGI ID that are currently identified at TAIR were updated).

(c) The number of paralogs in Arabidopsis used in this calculation does not include those genes identified as pseudogenes. A total of 79 of 80 Arabidopsis r-proteins have a match in the B. napus gene data set. Arabidopsis r-protein pseudogenes and genes lacking an AGI number are provided in the table but were not included in our Arabidopsis r-protein data set.

(d) The mean number of genes per r-protein in B. napus is 20.8.

Based on the above EST analysis, we report a total of 1665 r-protein genes for B. napus, where 1079 are contigs and 586 are represented by singleton ESTs; each of these was assigned a unique alphanumeric gene name. A total of 79 of the 80 r-proteins previously identified in Arabidopsis have a match in the translated B. napus gene data set, indicating that there is a high number of homologous r-protein genes between these two species and that the Brassica EST data set is reasonably comprehensive (Table 1). Notably, a markedly higher number of r-protein paralogs were detected for each of the 79 B. napus r-proteins than are found in Arabidopsis (average 7.9-fold higher number of paralogs per r-protein in B. napus; Table 1; Barakat et al., 2001; Chang et al., 2005). For example, the protein Sa is encoded by two genes in Arabidopsis (RPSaA and RPSaB), while 27 translated genes (derived from the r-protein B. napus EST data set) matched this Arabidopsis protein in B. napus (Table 1). The larger number of paralogs for r-proteins in B. napus is consistent with the extensive history of gene/genome duplication events in this species.
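As a worked illustration of the ratio column in Table 1, the short sketch below divides the number of B. napus genes by the number of Arabidopsis paralogs that are not marked as pseudogenes, as stated in the table footnote. The function name and the reliance on the "(Psg)" suffix are assumptions made for this example; the input values are taken from the Sa, S9, and L4 rows.

```python
from typing import Sequence


def fold_increase(n_bnapus_genes: int, arabidopsis_paralogs: Sequence[str]) -> float:
    """Ratio of B. napus gene number to non-pseudogene Arabidopsis paralogs."""
    # Paralogs tagged "(Psg)" in Table 1 are pseudogenes and are not counted.
    functional = [p for p in arabidopsis_paralogs if not p.endswith("(Psg)")]
    if not functional:
        raise ValueError("no non-pseudogene Arabidopsis paralogs to compare against")
    return n_bnapus_genes / len(functional)


ratio_sa = fold_increase(27, ["RPSaA", "RPSaB"])                 # Sa row
ratio_s9 = fold_increase(18, ["RPS9A(Psg)", "RPS9B", "RPS9C"])   # S9 row
ratio_l4 = fold_increase(36, ["RPL4A", "RPL4B(Psg)", "RPL4C(Psg)", "RPL4D"])  # L4 row
```

The three results agree with the values listed in Table 1 for these rows (13.5, 9.0, and 18.0).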

R-Protein Gene Expression Analysis

A highly conservative approach was taken to identify unigenes for the comparative analysis of r-protein gene expression in this study. Only r-protein genes containing overlapping sequences from at least two independent ESTs in the full unigene data set (described above) were used in the comparative analyses of gene expression levels (the 1079 contigs were included and singletons were excluded, thereby removing contaminant, artifactual, and inaccurate EST sequences; Ewing and Green, 2000). B. napus r-protein ESTs were identified from 16 distinct tissue collections (45 cDNA libraries in total; see Supplemental Table 1 online), including anthers, apical meristems, flower buds, endosperm, early-stage embryos, embryos/seeds, leaves, roots, seed coats, seedlings, and stems, as well as from in-house libraries representing microspores, ovules, mature pollen, in vitro pollen, and microspore-derived embryos (MDEs). R-protein ESTs from these tissues were detected as described above. The number of ESTs associated with each r-protein gene for each of the 16 tissue types under study was determined by examination of the EST member profiles for each contig (1079 contigs) generated from CAP3 (described previously) (see Supplemental Table 1 online). The EST members composing a unigene contig normally have >95% similarity, which is rigorous enough to distinguish among genes in conserved gene families (Subramanian and Kumar, 2004); thus, the EST frequency per contig per tissue type is an effective measure of gene expression level (Tiffin and Hahn, 2002; Mitreva et al., 2006; Whittle et al., 2007). Based on this analysis, 996 of the B. napus r-protein genes are represented by at least one EST in at least one of the 16 tissues in the tissue-specific EST data set (the 996 r-protein genes contain 23,408 ESTs derived from the 16 tissues; see Supplemental Table 1 online). The expression data for these 996 genes, which encode a total of 79 r-proteins, were used for our comparative analyses.
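The counting step described above can be sketched as a small tally over contig membership. This is an illustrative reconstruction under stated assumptions: the dictionaries `contig_members` (contig to EST identifiers, as might be parsed from CAP3 output) and `est_tissue` (EST to tissue label) are hypothetical inputs, and the function name is invented for this example.

```python
from collections import Counter
from typing import Dict, List


def est_counts_by_tissue(
    contig_members: Dict[str, List[str]],
    est_tissue: Dict[str, str],
    min_ests: int = 2,
) -> Dict[str, Dict[str, int]]:
    """Count ESTs per contig per tissue, skipping singletons.

    ESTs without a tissue label (i.e., not from the tissue-specific
    libraries) are ignored; contigs left with no labeled EST are dropped.
    """
    counts: Dict[str, Dict[str, int]] = {}
    for contig, ests in contig_members.items():
        if len(ests) < min_ests:
            continue  # singleton: excluded from the expression analysis
        tally = Counter(est_tissue[e] for e in ests if e in est_tissue)
        if tally:
            counts[contig] = dict(tally)
    return counts


members = {
    "contig1": ["est1", "est2", "est3"],
    "contig2": ["est4"],           # singleton
    "contig3": ["est5", "est6"],   # contig with no tissue-specific ESTs
}
tissues = {"est1": "ovule", "est2": "ovule", "est3": "root", "est4": "leaf"}
expression = est_counts_by_tissue(members, tissues)
```

In this toy input only `contig1` survives both filters, mirroring how the 1079 contigs reduce to the 996 genes with at least one tissue-specific EST.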

Each of the 996 contigs used in the gene expression analysis was identified from the full B. napus r-protein unigene data set representing all publicly available and in-house r-protein ESTs (a total of 36,938 r-protein ESTs identified from the 623,778 B. napus EST data set). The ESTs are derived from libraries sequenced in the 5′, 3′, or both directions; thus, contigs are often complete or near complete representations of the entire open reading frame for a given r-protein gene (and often include the untranslated regions; see Methods). This approach greatly limits the potential to obtain subset contigs that match a single r-protein (i.e., two contigs matching different areas of a single r-protein gene) and bias that may result from obtaining contigs based solely on unidirectional EST sequencing (e.g., EST clustering using only 3′ EST libraries might mask the differentiation of paralogs that are distinct only at their 5′ ends; see Supplemental Figure 1 online and Methods). The ESTs associated with each of these 996 genes from the 16 tissue-specific data sets were used for gene expression analysis and make up a subset (23,408 ESTs) of the total r-protein EST data set used in the r-protein contig assembly.

The gene paralogs identified for the B. napus r-proteins in the tissue-specific data set (996 r-protein genes encoding 79 r-proteins) are highly conserved, and the majority of these groupings share between 80 and 100% protein sequence identity. Our estimates from the protein coding region for each gene indicate that 73 of the 79 r-proteins are encoded by paralogs that share >80% sequence identity, while 55 of 79 r-proteins are encoded by paralogs that have >90% sequence identity (see Methods; the open reading frame for each B. napus gene per gene family was identified using the BLASTX alignment to the matching Arabidopsis r-protein). Relatively few r-proteins are encoded by paralogs that are more divergent; three highly divergent gene families include those encoding the r-proteins S15a, P2, and L7, wherein the most divergent paralogs per r-protein have <54% protein sequence identity. Overall, the level of similarity seen in B. napus r-proteins is consistent with data from Arabidopsis, in which the majority of r-protein gene families retain an average of 94% protein sequence identity and in which only relatively few gene families (including those encoding S15a, P2, and L7) have greater divergence levels (Barakat et al., 2001).
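To make the notion of "most divergent paralogs per r-protein" concrete, the sketch below computes pairwise percent identity over pre-aligned protein sequences and reports the minimum within a family. The identity definition used here (matches divided by aligned columns in which neither sequence has a gap) is an assumption for illustration; the exact procedure is described in the Methods, which are not reproduced in this section. The sequences and names are hypothetical.

```python
from itertools import combinations
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def min_pairwise_identity(aligned: Dict[str, str]) -> float:
    """Identity of the most divergent pair of paralogs in one gene family."""
    if len(aligned) < 2:
        raise ValueError("need at least two paralogs")
    return min(percent_identity(a, b) for a, b in combinations(aligned.values(), 2))


# Toy aligned fragments, not real r-protein sequences.
family = {"paralog1": "MKVA", "paralog2": "MKIA", "paralog3": "MRIA"}
lowest = min_pairwise_identity(family)
```

A family would fall into the ">80%" or ">90%" groups quoted above when this minimum exceeds the respective threshold.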

R-Protein EST Frequencies across Tissues

The 10 r-protein gene families with the highest number of ESTs in the B. napus cDNA libraries across all 16 tissues (this includes all paralogs per r-protein) are two small (S20 and S26) and eight large ribosomal subunit r-proteins (P2, L15, L18, L21, L23, L32, L36, and L37) (Table 2). The highest expression level was observed for P2, which has a total of 228.9 ESTs per 10,000 (i.e., [total number of ESTs per r-protein across 16 tissues (includes all paralogs)/total number of ESTs in data set of 16 tissues] × 10,000); transcripts for this protein were detected in 12 of the 16 tissues under study, including both vegetative and reproductive tissues (Table 2). All 10 of the most highly expressed r-proteins have ESTs within the majority of tissues under study, including flower buds (buds), early-stage embryos, embryos/seeds, endosperm, MDEs, ovules, seed coats, and seedlings, all of which are differentiating and complex tissues/cells (Brink and Cooper, 1947; Robinson-Beers et al., 1992; Goldberg et al., 1993; Wan et al., 2002) (Table 2). This suggests that many of the genes/paralogs encoding these r-proteins could play a fundamental and nonspecialized role in plant growth and development. Notably, none of the 10 most highly transcribed r-protein gene families had transcripts within pollen, and only two were represented within the stem transcriptome (L32 and L18); both of these tissues are relatively more established and less actively dividing regions of the plant (Dickinson, 1965; Southworth, 1975; Lyndon, 1990; Zhang et al., 2004).
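The normalization quoted in the paragraph above (total ESTs for an r-protein, summed over all paralogs and tissues, divided by the total number of ESTs in the 16-tissue data set, times 10,000) can be written as a one-line calculation. The function name and the numbers in the example are hypothetical and are not the study's data.

```python
from typing import Dict


def family_ests_per_10k(paralog_est_counts: Dict[str, int], total_ests: int) -> float:
    """ESTs per 10,000 for one r-protein family, summed across its paralogs."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    family_total = sum(paralog_est_counts.values())
    return family_total * 10_000 / total_ests


# Hypothetical counts for three paralogs in a data set of 100,000 ESTs.
toy_counts = {"paralogA": 150, "paralogB": 75, "paralogC": 25}
toy_value = family_ests_per_10k(toy_counts, 100_000)
```

Because the numerator pools all paralogs, a high value for a family does not by itself indicate how many of its paralogs are expressed.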

Table 2.

The 10 r-Protein Gene Families with the Highest Number of ESTs across All 16 Tissues Examined (i.e., Summed across All Paralogs and All Tissues for Each r-Protein)

Columns, left to right: r-protein; number of B. napus paralogs expressed; number of tissues (out of 16) with transcripts detected; ESTs per 10,000 across all tissues; tissue expression (a) for the tissues A, AM, B, EE, ES, E, L, MDE, M, O, P, PIV, R, SC, S, and ST.
P2 18 12 228.9 X X X X X X X X X X X X
L21 11 11 192.2 X X X X X X X X X X X
S20 12 13 181.6 X X X X X X X X X X X X X
L32 14 11 166.2 X X X X X X X X X X X
L36 11 12 162.7 X X X X X X X X X X X X X
L23 15 14 161.9 X X X X X X X X X X X X X X
L37 12 9 161.9 X X X X X X X X X
L18 9 14 160.6 X X X X X X X X X X X X X X
L15 5 10 153.4 X X X X X X X X X X
S26 9 12 151.6 X X X X X X X X X X X X

The EST number and tissue expression pattern are shown for each r-protein. X, ESTs present.

(a) Anther (A), apical meristem (AM), bud (B), early embryo (EE), embryos and seeds (ES), endosperm (E), leaves (L), MDEs, microspores (M), ovules (O), pollen (P), pollen-in vitro (PIV), root (R), seed coats (SC), seedlings (S), and stem (ST).

The percentage of the transcriptome composed of r-protein transcripts varies extensively among tissues (Figure 1A). Ribosomal-protein gene transcripts make up a very high fraction of the transcriptome of several reproductive tissues, including microspores (with >21% of all ESTs corresponding to r-proteins), embryos and seeds (14.1%), as well as ovules (9.6%), all of which are highly differentiating and specialized tissues. By contrast, lower levels of r-protein ESTs are observed in the transcriptomes of anthers, apical meristems, buds, early-stage embryos, roots, seed coats, seedlings, and stems (<6%). The leaves and pollen, which are largely established and have relatively less actively differentiating tissues, have the lowest levels of r-protein ESTs in their transcriptomes (<1%). These data demonstrate that r-protein gene transcript levels vary extensively among tissue types and expression is greatest among highly differentiating reproductive tissues.

Figure 1.

Expression of r-Protein ESTs and Genes across Tissues.

(A) The percentages of the transcriptome composed of r-protein ESTs for each of the 16 tissues examined in B. napus.

(B) The number of r-protein genes expressed per 10,000 ESTs for each tissue type (includes all paralogs per gene family).

Number of r-Protein Genes Transcribed per Tissue Type

The number of r-protein genes (unigenes) detected within an EST data set (i.e., among the 996 r-protein genes used in the gene expression analyses) is expected to be greater for larger EST data sets because poorly expressed genes are more likely to be sampled. Thus, to compare the number of r-protein genes expressed among the tissues examined here (wherein library/data set sizes differ), the number of r-protein genes detected per tissue was standardized relative to EST data set size as follows: number of r-protein genes detected per 10,000 ESTs (per tissue type) = (number of r-protein genes with at least one EST/total number of ESTs) × 10,000. Note that 7 of the 45 libraries were sequenced in two directions rather than one; thus, the library size in these cases was divided by 2 prior to this calculation (see Supplemental Table 1 online; Figures 1B and 2B). For all other analyses in this study, the total EST number per gene per tissue was standardized by the total EST data set size.
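The standardization above, including the halving of libraries sequenced in both directions, can be sketched as follows. The function names and the example library sizes are hypothetical; only the formula and the divide-by-2 rule come from the text.

```python
from typing import Iterable, Tuple


def effective_est_total(libraries: Iterable[Tuple[int, bool]]) -> float:
    """Sum library sizes, halving any library sequenced in both directions.

    libraries: (number_of_ests, sequenced_both_directions) per cDNA library.
    """
    return sum(n / 2 if both_directions else n for n, both_directions in libraries)


def genes_per_10k(n_genes_detected: int, libraries: Iterable[Tuple[int, bool]]) -> float:
    """r-protein genes with at least one EST, per 10,000 ESTs, for one tissue."""
    total = effective_est_total(libraries)
    if total <= 0:
        raise ValueError("tissue has no ESTs")
    return n_genes_detected * 10_000 / total


# Hypothetical tissue with one unidirectional and one bidirectional library.
toy_libraries = [(4000, False), (2000, True)]
toy_total = effective_est_total(toy_libraries)
toy_rate = genes_per_10k(250, toy_libraries)
```

Halving the bidirectional libraries keeps a clone that was read from both ends from counting as two independent samples of the transcriptome.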

Figure 2.


Expression of r-Protein Paralogs Relative to Tissue Type and Library Size.

(A) The number of paralogs expressed per r-protein (i.e., the number of r-protein genes [all paralogs] with transcripts/number of r-proteins with transcripts) for each of 16 tissue collections in B. napus.

(B) The number of paralogs expressed per r-protein versus EST data set size for each tissue (Pearson correlation coefficient, R = 0.82, P = 5 × 10⁻⁵). EST data set size was divided by 2 for libraries sequenced in both directions. Anther (A), apical meristem (AM), bud (B), early embryo (EE), embryos and seeds (ES), endosperm (E), leaves (L), MDEs, microspores (M), ovules (O), pollen (P), pollen-in vitro (PIV), root (R), seed coats (SC), seedlings (S), and stem (ST).

The data indicate that microspores have the highest number of r-protein genes expressed per 10,000 ESTs (includes all paralogs), with >500 r-protein genes per 10,000 ESTs (Figure 1B). A high number of genes was also detected within ovules (274 genes per 10,000 ESTs), while intermediate values were found in anthers and apical meristems and markedly lower numbers were observed for other tissues (including embryos/seeds, buds, early embryos, endosperm, seedlings, and stems). The lowest levels were detected in pollen, seed coats, and leaves. The complexity of r-protein gene expression per 10,000 ESTs was positively correlated with the percentage of r-protein ESTs in each transcriptome across tissues (Pearson correlation coefficient R = 0.62, P = 0.01) (cf. Figures 1A and 1B). For example, microspores and ovules have the highest number of r-protein genes expressed per 10,000 ESTs, and r-protein transcripts are very highly represented in these particular transcriptomes (Figures 1A and 1B). However, there was some variation in this relationship. For instance, relatively few r-protein genes per 10,000 ESTs were detected for embryos/seeds (compared with most other tissues), while a very high percentage of the embryo/seed transcriptome was composed of r-protein ESTs; this suggests that relatively few r-protein genes are expressed in these particular tissues but that these genes are transcribed at very high levels (Figures 1A and 1B).

Number of Paralogs Transcribed per r-Protein

The number of paralogs expressed per r-protein was determined for each tissue type (Figure 2A). Multiple paralogs of r-proteins are expressed across most of the tissues examined (given as the number of r-protein genes expressed in a tissue/the number of distinct r-proteins represented by transcripts in the same tissue; Figure 2A). Specifically, the data indicate that multiple paralogs for an r-protein are expressed in almost every tissue and that more than three paralogs are expressed per r-protein in embryos/seeds, endosperm, early embryos, microspores, MDEs, ovules, and seed coats (Figure 2A). The actual number of r-protein paralogs expressed per r-protein is likely even higher for tissues such as microspores and ovules (which have values of 4.2 and 3.6, respectively) because the EST data sets for these particular tissues are relatively small (i.e., the larger the EST data set, the more r-protein genes/paralogs are detected per r-protein) (Pearson correlation coefficient, R = 0.82, P = 5 × 10⁻⁵; see Figure 2B; note that data set size was adjusted to account for two-directional sequencing of certain libraries; see Supplemental Table 1 online). Thus, more extensive sequencing of these libraries will likely reveal an even more complex transcript population, with more paralogs expressed per r-protein. One exception in the analysis is mature pollen, in which only one EST was detected for each of the four r-protein genes found in the EST data set. Overall, though, the data indicate that many r-protein paralogs are actively expressed across the various tissues and thus likely play a role in cell/tissue physiology.
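
The ratio plotted in Figure 2A can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the EST counts are hypothetical, while the contig-to-protein assignments shown (S2 and S18) follow Table 4.

```python
def paralogs_per_r_protein(est_counts, gene_to_protein):
    """Genes with at least one EST divided by distinct r-proteins with transcripts.

    `est_counts` maps gene (contig) name -> EST count in one tissue;
    `gene_to_protein` maps gene name -> the r-protein it encodes.
    """
    expressed_genes = [gene for gene, n in est_counts.items() if n > 0]
    expressed_proteins = {gene_to_protein[gene] for gene in expressed_genes}
    if not expressed_proteins:
        return 0.0  # no r-protein transcripts detected in this tissue
    return len(expressed_genes) / len(expressed_proteins)


# Contig-to-protein assignments as listed in Table 4; EST counts are made up.
gene_to_protein = {
    "Contig295": "S2", "Contig588": "S2", "Contig611": "S2", "Contig451": "S18",
}
est_counts = {"Contig295": 3, "Contig588": 1, "Contig611": 0, "Contig451": 2}
print(paralogs_per_r_protein(est_counts, gene_to_protein))
```

In this toy case three genes representing two r-proteins are expressed, giving 1.5 paralogs per r-protein. Because the numerator grows as rarer transcripts are sampled, the ratio is expected to rise with EST data set size, which is the dependence examined in Figure 2B.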

Differential Expression of r-Protein Genes

Comparison of the EST frequency per tissue type for each of the r-protein genes (unigenes) using IDEG6 (and the generalized χ² test and Bonferroni correction; http://telethon.bio.unipd.it/bioinfo/IDEG6_form/; Romualdi et al., 2003) reveals that 532 of the 996 transcribed r-protein genes in the tissues under study are differentially expressed across these tissues (P ≤ 0.05). As shown in Table 3, the EST frequencies for these 532 differentially expressed r-protein genes are poorly correlated among many tissues in B. napus, suggesting that the r-protein profiles are largely tissue specific. For example, the apical meristem shows no correlation with any of the other tissues, including buds, reproductive tissues (anthers, microspores, embryos/seeds, and ovules), leaves, and stems, indicating that this tissue, which is essential to development and growth, has a unique r-protein transcript population. Similarly, stems, roots, leaves, and pollen each have unique transcript profiles that show no correlation with any other tissues; such a trend would be expected if differential expression of r-protein gene transcripts were involved in tissue differentiation and/or specialization. The EST levels of anthers are correlated with those of buds but not with those of any other tissue. In contrast with these results, a number of statistically significant correlations were detected among certain reproductive tissues. Specifically, early-stage embryos and late-stage embryos/seeds are correlated with each other, as well as with the endosperm, MDEs, microspores, and seed coats (Table 3). This is likely due to the fact that each of these tissues is associated with seed/embryo development. In particular, this result demonstrates that subcomponents of seeds/embryos (e.g., endosperm, seed coats, and microspores, which can develop into embryos; Malik et al., 2007) share a substantial component of their r-protein transcript profile with the complete seeds/embryos.
The observations that these reproductive tissues do not have any correlations in expression with the other plant materials examined (Table 3, Figure 3) and that most other tissues show no correlations between tissues either suggest that r-protein transcript populations are largely tissue specific.
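
For readers who wish to reproduce a table of this kind, the pairwise Pearson correlations among tissue profiles can be sketched as below. The function names and the example frequencies are hypothetical; the sketch covers only the correlation coefficients and does not reproduce the IDEG6 test or the Bonferroni-corrected significance levels.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def correlation_table(profiles):
    """Lower-triangular table {(row_tissue, column_tissue): r} for every tissue pair."""
    tissues = list(profiles)
    return {
        (tissues[i], tissues[j]): pearson_r(profiles[tissues[i]], profiles[tissues[j]])
        for i in range(1, len(tissues))
        for j in range(i)
    }


# Hypothetical standardized EST frequencies for four genes in three tissues.
profiles = {
    "ES": [4.0, 0.5, 2.0, 0.0],
    "SC": [3.5, 0.8, 2.2, 0.1],
    "L": [0.2, 0.3, 0.1, 0.4],
}
for pair, r in correlation_table(profiles).items():
    print(pair, round(r, 3))
```

Each profile is the vector of standardized EST frequencies for the same ordered set of genes in one tissue, so a high coefficient means that two tissues tend to express the same paralogs at similar relative levels.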

Table 3.

Pearson Correlation Coefficients among 532 Differentially Expressed r-Protein Genes across the 16 Tissue Types

Tissue Type | Anther (A) | Apical Meristem (AM) | Bud (B) | Early Embryos (EE) | Embryos and Seeds (ES) | Endosperm (E) | Leaves (L) | MDE | Microspores (M) | Ovules (O) | Pollen (P) | Pollen in vitro (PIV) | Root (R) | Seed Coat (SC) | Seedlings (S)
AM | 0.010
B | 0.200* | 0.007
EE | −0.012 | −0.140 | −0.014
ES | −0.047 | −0.131 | −0.013 | 0.528**
E | −0.074 | −0.133 | −0.002 | 0.495** | 0.711**
L | 0.073 | 0.067 | 0.077 | −0.001 | 0.042 | 0.024
MDE | −0.015 | −0.131 | 0.040 | 0.327** | 0.595** | 0.486** | 0.028
M | −0.034 | −0.126 | 0.011 | 0.188* | 0.404** | 0.302** | 0.043 | 0.480**
O | −0.099 | −0.138 | 0.006 | 0.117 | 0.217* | 0.228* | −0.034 | 0.194* | 0.183*
P | −0.033 | −0.025 | −0.019 | −0.045 | −0.052 | −0.046 | −0.015 | −0.049 | −0.050 | −0.007
PIV | −0.020 | −0.044 | 0.032 | −0.025 | 0.063 | 0.090 | 0.026 | 0.041 | −0.012 | 0.297 | 0.123
R | 0.103 | −0.055 | 0.135 | −0.041 | −0.023 | 0.000 | 0.045 | 0.030 | −0.020 | −0.051 | −0.028 | −0.020
SC | −0.049 | −0.076 | 0.040 | 0.330** | 0.776** | 0.692** | 0.060 | 0.503** | 0.367** | 0.261* | −0.045 | 0.137 | 0.021
S | 0.052 | −0.006 | 0.166* | −0.083 | −0.081 | −0.065 | 0.054 | −0.052 | −0.059 | −0.017 | −0.029 | −0.046 | 0.142 | −0.023
Stem | 0.031 | 0.018 | −0.023 | −0.110 | −0.114 | −0.075 | 0.037 | −0.072 | −0.107 | −0.062 | −0.015 | −0.054 | 0.081 | −0.094 | 0.114

A Bonferroni correction has been applied across all comparisons. Comparisons that are statistically significant are in bold (** P < 10⁻¹⁰, * 0.05 > P > 10⁻¹⁰).

Figure 3.


Clustering of r-Protein Genes across Tissues Based on EST Frequencies.

Hierarchical clustering of 532 r-protein genes (representing 79 r-proteins) that are differentially expressed among the 16 tissue types examined based on the transcript level per gene per tissue (transcript levels are standardized by EST data set size). Nine distinct gene clusters (A to I) were identified. The percentage of r-proteins represented by a paralog(s) in each cluster that also have paralogs in other clusters is shown. Yellow, high expression; black, low expression. Anther (A), apical meristem (AM), bud (B), early embryo (EE), embryos and seeds (ES), endosperm (E), leaves (L), MDEs, microspores (M), ovules (O), pollen (P), pollen-in vitro (PIV), root (R), seed coats (SC), seedlings (S), and stem (ST).

To further reveal the relationships in r-protein gene expression among tissues, hierarchical clustering of EST frequencies (standardized by tissue-specific EST data set size) was conducted for the 532 differentially expressed r-protein genes (which includes genes encoding 79 r-proteins) using Cluster and TreeView (Eisen et al., 1998). This approach groups genes (in this case, r-protein gene paralogs) with similar gene expression profiles and allows the identification of genes with a high probability of being functionally related (Eisen et al., 1998; Bansal et al., 2007). The results indicate that differentially expressed ribosomal protein genes cluster into nine distinct groups. The largest cluster (Cluster I, Figure 3) comprises a large number of r-protein genes/paralogs that are very highly expressed in microspores. Ribosomal-protein gene transcripts represented in this cluster also include those expressed in embryos/seeds, early-stage embryos, MDEs, ovules, and endosperm, which is consistent with a correlation in the gene expression profiles among these tissues. By contrast, few transcripts from the leaves, pollen, and stems, which are relatively quiescent and nondividing tissues, are detected in this cluster. Another large cluster (Cluster H) consists primarily of reproductive tissues but excludes microspores, suggesting that embryos/seeds, endosperm, and seed coats have their own specific subset of r-protein gene transcripts that does not overlap with microspores (or the other tissues examined). An additional large gene cluster (Cluster G) has high levels of r-protein gene transcripts derived from ovules, but relatively few from other tissues, suggesting that ovules have a novel r-protein transcript population. The other, smaller clusters, including A, B, C, D, E, and F, are largely made up of transcripts derived from specific tissues, namely, apical meristems, seedlings, stems, anthers, roots, and buds, respectively. 
This distribution is consistent with a relatively small (Figure 1A), yet tissue-specific, component in the r-protein transcript population in these tissues.
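
The grouping step can be illustrated with a small agglomerative clustering sketch. The analysis above was run in Cluster and TreeView; the linkage rule and distance used there are not stated in this section, so average linkage with a distance of 1 − Pearson r is an assumption made here for illustration, and the gene names and frequencies are hypothetical.

```python
import math


def pearson_distance(x, y):
    """Distance 1 - r between two expression profiles (0 = identical shape)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0  # assumption: treat a flat profile as uncorrelated
    return 1.0 - sxy / math.sqrt(sxx * syy)


def average_linkage_clusters(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` groups of genes remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    names = list(profiles)
    dist = {(a, b): pearson_distance(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical standardized EST frequencies in three tissues.
profiles = {
    "g1": [10.0, 1.0, 1.0], "g2": [8.0, 2.0, 1.0],
    "g3": [1.0, 1.0, 9.0], "g4": [0.0, 2.0, 12.0],
}
print(average_linkage_clusters(profiles, 2))
```

Genes whose transcripts peak in the same tissue end up in the same group, which is the sense in which the clusters in Figure 3 are associated with particular tissues.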

It is particularly notable that r-protein gene families represented by gene transcripts in each cluster (A to I) also have transcripts of paralogous genes within other clusters. Specifically, 100% of the r-proteins encoded by genes represented in clusters A to H and 98.4% of r-proteins encoded by genes in cluster I also are encoded by paralogs with transcripts within other clusters (Figure 3). Among the 79 r-proteins represented in the differentially expressed gene set, there is only one r-protein (L15) wherein all of the paralogs are limited to a single cluster (cluster I; Figure 3). These expression patterns for r-protein genes suggest that paralogs for r-protein gene families are expressed across many clusters/tissues and that r-protein paralogs (not r-protein gene families) show tissue-specific expression (Figure 3). This is consistent with the notion that paralogs for particular r-proteins have unique functions in the different tissues. In fact, the results suggest that r-protein paralogs for nonhomologous genes cluster into network groups (A, B, C, D, E, F, G, H, or I) and could be diverging as pairs/groups of genes, leading to novel regulatory mechanisms for specifying growth, differentiation, and developmental fates.
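
The percentages reported for each cluster can be computed with simple set operations, as in the sketch below. The function name, the gene labels, and the two-cluster example are hypothetical; only the idea that an r-protein counts as shared when a paralog appears in any other cluster is taken from the text.

```python
def shared_protein_percentages(cluster_to_genes, gene_to_protein):
    """Percentage of r-proteins in each cluster that also have a paralog elsewhere."""
    proteins_by_cluster = {
        cluster: {gene_to_protein[gene] for gene in genes}
        for cluster, genes in cluster_to_genes.items()
    }
    result = {}
    for cluster, proteins in proteins_by_cluster.items():
        # Union of the r-proteins represented in every other cluster.
        elsewhere = set().union(*(p for other, p in proteins_by_cluster.items()
                                  if other != cluster))
        if proteins:
            result[cluster] = 100.0 * len(proteins & elsewhere) / len(proteins)
        else:
            result[cluster] = 0.0
    return result


# Hypothetical example: all L15 paralogs fall in cluster I, S2 spans both clusters.
gene_to_protein = {"g1": "S2", "g2": "L8", "g3": "S2", "g4": "L15", "g5": "L15"}
cluster_to_genes = {"A": ["g1", "g2"], "I": ["g3", "g4", "g5"]}
print(shared_protein_percentages(cluster_to_genes, gene_to_protein))
```

A value below 100% for a cluster therefore indicates that at least one r-protein has all of its differentially expressed paralogs confined to that cluster, as reported for L15 in cluster I.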

Tissue-Specific Expression Comparisons for r-Protein Paralogous Genes

To further ascertain whether r-protein gene paralogs play a role in tissue differentiation and/or tissue-specific physiology, we compared r-protein gene expression between pairs of tissues using the IDEG6 generalized χ² test (P ≤ 0.05) and the 996 r-protein genes previously identified for gene expression analysis. Only genes that have at least one EST among the two compared tissues were used in each analysis. Specifically, we assessed whether paralogous genes encoding the same r-protein exhibit opposite expression patterns between two compared tissues (i.e., are not coexpressed); such a trend would suggest that paralogs for specific r-protein genes have different roles in development. Comparisons were conducted between mature male (mature pollen and anthers) and female reproductive tissues (ovules), microspores and ovules, early embryos and embryos/seeds, endosperm and seed coats, MDEs and in vitro pollen, zygotic embryos and MDEs, meristems and leaves, roots and stems, and embryos/seeds and seedlings. The results show that there is a high level of differential expression of r-protein genes among most of the compared tissues (Table 4; see Supplemental Table 2 online).

Table 4.

The Number of Statistically Significant Differentially Expressed r-Protein Genes among Pairs of Compared Tissues in B. napus (P < 0.05)

Ribosomal Protein Gene U D

Male (Pollen and Anthers) versus Female (Ovules) NG = 443, ND = 206
    S2 Contig295 X
    S2 Contig588 X
    S2 Contig611 X
    S18 Contig451 X
    S18 Contig769 X
    S18 Contig914 X
    L8 Contig289 X
    L8 Contig434 X
    L8 Contig73 X

Microspores versus Ovules NG = 480, ND = 165
    S3a Contig33 X
    S3a Contig836 X
    S3a Contig117 X
    S9 Contig348 X
    S9 Contig482 X
    S9 Contig749 X
    S9 Contig78 X
    S11 Contig36 X
    S11 Contig504 X
    S11 Contig550 X
    S11 Contig640 X
    S12 Contig235 X
    S12 Contig389 X
    S12 Contig473 X
    S12 Contig475 X
    S12 Contig517 X
    S27 Contig284 X
    S27 Contig88 X
    P1 Contig321 X
    P1 Contig581 X
    L6 Contig1 X
    L6 Contig336 X
    L6 Contig535 X
    L17 Contig107 X
    L17 Contig220 X
    L17 Contig560 X
    L17 Contig623 X
    L17 Contig771 X
    L30 Contig791 X
    L30 Contig91 X

MDEs versus Pollen-In Vitro NG = 391, ND = 2
    None

Early Embryos versus Embryos and Seeds NG = 619, ND = 200
    S10 Contig150 X
    S10 Contig229 X
    S10 Contig721 X
    S12 Contig235 X
    S12 Contig395 X
    S21 Contig272 X
    S21 Contig35 X
    S21 Contig412 X
    S21 Contig706 X
    S25 Contig570 X
    S25 Contig720 X
    S27 Contig288 X
    S27 Contig714 X
    S27 Contig88 X
    P1 Contig321 X
    P1 Contig427 X
    P1 Contig712 X
    P1 Contig82 X
    L6 Contig1 X
    L6 Contig462 X
    L6 Contig555 X
    L6 Contig717 X
    L27 Contig162 X
    L27 Contig231 X
    L27 Contig460 X
    L27 Contig52 X

Endosperm versus Seed Coat NG = 503, ND = 94
    S5 Contig101 X
    S5 Contig463 X
    L18a Contig361 X
    L18a Contig442 X
    L18a Contig549 X
    L40 Contig299 X
    L40 Contig342 X

Meristem (Apical Meristem and Bud) versus Leaves NG = 237, ND = 26
    L18a Contig1068 X
    L18a Contig549 X

Roots versus Stems NG = 162, ND = 23
    None

Zygotic Embryos (Early Embryos, Embryos, and Seeds) versus Microspore NG = 638, ND = 162
    S3 Contig407 X
    S3 Contig440 X
    S4 Contig273 X
    S4 Contig627 X
    S13 Contig247 X
    S13 Contig279 X
    S13 Contig633 X
    S20 Contig206 X
    S20 Contig207 X
    S20 Contig263 X
    S21 Contig266 X
    S21 Contig272 X
    S21 Contig35 X
    S21 Contig630 X
    S21 Contig646 X
    S21 Contig74 X
    S26 Contig428 X
    S26 Contig490 X
    S26 Contig572 X
    S29 Contig632 X
    S29 Contig702 X
    L4 Contig503 X
    L4 Contig89 X
    L12 Contig2 X
    L12 Contig309 X
    L12 Contig639 X
    L22 Contig102 X
    L22 Contig419 X
    L22 Contig47 X
    L22 Contig635 X
    L23 Contig441 X
    L23 Contig453 X
    L23 Contig483 X
    L23 Contig648 X
    L29 Contig234 X
    L29 Contig656 X
    L29 Contig84 X
    L30 Contig136 X
    L30 Contig654 X
    L35a Contig368 X
    L35a Contig628 X
    L35a Contig629 X
    L35a Contig631 X
    L37a Contig282 X
    L37a Contig650 X
    L38 Contig124 X
    L38 Contig316 X
    L38 Contig344 X
    L38 Contig590 X
    L38 Contig613 X
    L38 Contig641 X
    L39 Contig596 X
    L39 Contig626 X
    L40 Contig299 X
    L40 Contig342 X
    L40 Contig547 X

Paralogs encoding the same ribosomal protein that have statistically significant opposite expression patterns among the compared tissues are shown (i.e., cases where all paralogs for a single r-protein are similarly up- or downregulated in one of the compared tissues are not shown). NG, number of compared genes (that have at least one EST among the two compared tissues); ND, total number of differentially expressed genes; U, upregulated; D, downregulated. Upregulation and downregulation of each gene is for the first tissue listed relative to the second tissue in each paired comparison.

The results also demonstrate that, in many cases, different paralogs for the same r-protein are not up- or downregulated in parallel in these tissue comparisons. For example, for the comparison of mature male (pollen and anthers) and female (ovule) tissues, where 46.5% of r-protein genes are differentially expressed, several paralogs encoding the S2 protein are statistically significantly differentially expressed between these tissues; however, the paralogs have opposite expression patterns (Table 4). Specifically, one gene (Contig611) encoding the S2 protein is significantly upregulated in male reproductive tissues relative to female tissues, while two other paralogs (Contig295 and Contig588) are significantly downregulated in male tissues. Given that all three of these genes encode the same r-protein (S2), this suggests that the expression pattern of r-protein paralogs may play a role in the differentiation of male and female reproductive tissues in B. napus (similar results were found for genes encoding the proteins S18 and L8 in this tissue comparison) (Table 4). Similar trends were detected in comparisons of microspores and ovules (in which 34.3% of transcribed r-protein genes are differentially expressed and paralogs for nine different r-proteins are expressed in opposite directions), early-stage embryos versus embryos/seeds (32.2% differentially expressed, eight proteins have paralogs expressed in opposite directions), and zygotic embryos versus MDEs (25.3% differentially expressed, and paralogs for 18 different r-proteins are expressed in opposite directions), suggesting that specific r-protein gene paralogs are associated with certain tissues in B. napus.
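
The criterion used to flag an r-protein in Table 4 (at least one paralog significantly upregulated and at least one significantly downregulated in the same tissue comparison) can be sketched as follows. The function name and input layout are hypothetical. The S2 calls follow the male versus female comparison described above; the L99 entries are invented placeholders showing a protein whose paralogs move in the same direction.

```python
from collections import defaultdict


def proteins_with_opposite_paralogs(calls):
    """Return r-proteins that have both up- and downregulated paralogs.

    `calls` is an iterable of (r_protein, contig, direction) tuples for the
    significantly differentially expressed genes of one tissue comparison,
    with direction "U" (upregulated) or "D" (downregulated).
    """
    directions = defaultdict(set)
    for protein, _contig, direction in calls:
        if direction not in ("U", "D"):
            raise ValueError(f"unknown direction: {direction!r}")
        directions[protein].add(direction)
    return sorted(p for p, seen in directions.items() if seen == {"U", "D"})


calls = [
    ("S2", "Contig611", "U"),   # up in male relative to female tissues (from the text)
    ("S2", "Contig295", "D"),   # down in male tissues (from the text)
    ("S2", "Contig588", "D"),   # down in male tissues (from the text)
    ("L99", "ContigA", "U"),    # invented placeholder: both paralogs change in parallel
    ("L99", "ContigB", "U"),
]
print(proteins_with_opposite_paralogs(calls))
```

Proteins whose paralogs all change in the same direction are excluded, which matches the note to Table 4 that such cases are not shown.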

A high level of differential expression was detected between embryos/seeds and seedlings, in which 48.1% of r-protein genes are differentially expressed and 56 r-proteins have paralogous genes expressed in opposite directions (see Supplemental Table 2 online). Thus, more than two-thirds of the 79 B. napus r-proteins are encoded by paralogs that have opposite expression patterns between these tissues, indicating further that paralog transcript combinations may play a major role in the rapid differentiation occurring between the seed and seedling stages of development. Notably, comparisons between roots and stems do not show paralogs with opposite expression profiles, which could be due to the fact that these tissues are mature and not actively differentiating. There are differentially expressed r-protein genes in this latter tissue comparison, but for each r-protein, all paralogs change in the same direction. Similarly, no r-protein paralogs have opposite expression patterns in the comparison of MDEs and in vitro pollen (at this high level of stringency). Altogether, it is apparent that for the majority of tissues examined here, paralogs encoding the same r-protein are often not expressed in parallel and thus r-protein paralog combinations could play a major role in plant development and differentiation.

DISCUSSION

The translated B. napus unigene r-protein data set (1665 genes) matches 79 of the 80 r-proteins previously identified in Arabidopsis, demonstrating both a high level of homology in the r-protein gene sets between these two Brassicaceous species and that the B. napus EST data set is reasonably comprehensive (Table 1). The higher number of r-protein paralogs per r-protein in B. napus (based on translation of unigenes derived from the B. napus r-protein EST data set) compared with Arabidopsis suggests that many of the duplicated r-protein genes originating from polyploidy events following divergence from the Arabidopsis lineage and the interspecific hybridization are retained in the B. napus genome. Previous data from Arabidopsis have suggested that r-protein genes are retained in the genome at higher levels than other types of genes (Blanc and Wolfe, 2004); our data confirm that r-protein paralogous genes are retained often and for a wide range of r-proteins in B. napus. Nonetheless, it is notable that there are marked variations in the numbers of paralogs per r-protein in B. napus, suggesting that there is a complex evolutionary history for these genes. In some cases, high numbers of paralogs encode a single r-protein in B. napus; for example, there is an 18-fold higher gene copy number for r-protein L38 in B. napus than in Arabidopsis, and even after excluding singletons, this value is still 12-fold higher for L38 in B. napus (singletons can be unreliable for gene identification purposes; Ewing and Green, 2000). Based solely on whole-genome duplication events, one might expect approximately a sixfold difference in gene copy number between Arabidopsis and B. napus (Lagercrantz, 1998; Lukens et al., 2003; Rana et al., 2004; Lysak et al., 2005); however, the high copy numbers for some r-proteins in B. napus suggest that r-protein paralogs may originate not only from polyploidy events but also from localized duplications (e.g., tandem or segmental duplications).

Gene retention is an important component of the observed high numbers of transcribed r-protein genes in B. napus (Table 1). The high levels of gene retention could be explained by pressures acting to maintain gene balance among dosage-sensitive r-protein genes arising from WGDs (Birchler et al., 2001; Birchler and Veitia, 2007; Conant and Wolfe, 2008; Veitia et al., 2008), by neofunctionalization and/or subfunctionalization of duplicated genes arising from WGDs and/or localized duplications (Ohno, 1970; Force et al., 1999; Lynch and Conery, 2000; Zhang and Gaut, 2003; Lockton and Gaut, 2005; Freeling, 2008), and/or by other unknown factors. By contrast, cases showing an unusually low number of paralogs for an r-protein (e.g., L9 shows only a threefold higher copy number in B. napus than in Arabidopsis; Table 1) could be attributable to gene losses resulting from accumulation of mutations that impair gene function (e.g., point mutations, deletions [from unequal crossing over events], and/or gene conversion events) and/or from gene balance pressures acting to remove genes derived from localized duplications (Smith, 1973; Lagercrantz, 1998; Lynch and Conery, 2003; Mondragon-Palomino and Gaut, 2005; Freeling, 2008). Also, it has been noted that high levels of gene loss can occur for sparsely linked or peripheral genes (i.e., those not involved in main functions of protein-protein networks) (Li et al., 2006; Dopman and Hartl, 2007). Only genes with transcripts are reported in this analysis; thus, lower gene copy numbers (fewer paralogs) indicate that some genes likely have been silenced (pseudogenes) or have been lost physically from the genome. In sum, these data indicate that the r-protein complement in B. napus is highly complex and that a large number of paralogous r-protein genes, derived from genome and/or tandem gene duplication events, are retained within the genome.
This is in marked contrast to the fate of the vast majority of the duplicated genes in polyploids, which often are lost/silenced and/or for which the divergence of gene copies into new/subdivided gene functions occurs but is a relatively rare event (Lynch and Conery, 2000; Vision et al., 2000; Blanc et al., 2003; Lynch and Conery, 2003; Blanc and Wolfe, 2004).

Tissue-Specific r-Protein Transcript Profiles

Comparative analysis of gene expression profiles for the 996 r-protein contigs defined by two or more ESTs (contig identifications based on the complete B. napus r-protein EST data set; Ewing and Green, 2000) and that are expressed in the tissues under consideration revealed that r-protein transcript populations vary substantially with tissue type. Highly diverse and distinctive r-protein transcript populations (paralog combinations) are observed for the majority of reproductive tissues (i.e., early-stage embryos, late-stage embryos/seeds, endosperm, MDEs, microspores, ovules, and seed coats; Figures 1B, 2A, and 3), suggesting that many duplicated r-protein genes have developed into functionally distinct paralogs associated with, and differentially expressed during, plant reproduction. Although gene balance pressures could explain the high rates of retention of duplicated r-protein genes in the genome (Birchler et al., 2001; Birchler and Veitia, 2007; Conant and Wolfe, 2008; Veitia et al., 2008), the observed variations in gene expression among paralogs (as detected in the reproductive tissues) are best explained by functional divergence (Ohno, 1970; Adams et al., 2003; Blanc and Wolfe, 2004; Casneuf et al., 2006). Previous data have shown that genes/proteins that mediate reproductive processes diverge more rapidly at the sequence level than those expressed in other, nonreproductive, tissues (Swanson and Vacquier, 2002; Torgerson and Singh, 2004; Clark et al., 2006), and this could give rise to highly specialized r-protein paralogs as reported here. Among the reproductive tissues, it is notable that the haploid tissues/cells, namely, microspores and ovules (note that microspores are haploid and ovules contain haploid female cells), have the most diverse r-protein transcript populations (Figures 1A, 2B, and 3). 
Although microspores and ovules have significant overlap in their r-protein gene transcript profiles (Table 3), they also have distinct unshared tissue-specific components in their r-protein transcript populations (Figure 3, Table 4). The former result could indicate that these tissues share a fundamental r-protein component required for translation and/or development, while the latter finding suggests that these tissues also have distinct r-protein transcript/protein populations (i.e., paralog combinations) that support specialization. Earlier findings in Arabidopsis have suggested that the functional divergence of duplicated genes (accompanied by mutations) is accelerated among reproductive haploid tissues/cells because the phenotypes are immediately exposed to selection (i.e., rapid selection among haploid tissues/cells can give rise to highly specialized paralogs in these tissues and may lead to the loss of a given gene's expression in sporophytic tissues) (Honys and Twell, 2003). Overall, the data from reproductive tissues/cells provided here indicate that r-protein genes could be involved in regulation of gene/protein expression at various stages of reproductive differentiation and development in B. napus, including the early stages of gamete development, through to fertilization, and embryo development and establishment. The involvement of r-protein paralogs in gene regulation has been reported recently among yeast strains (lacking/containing different r-protein paralogs; Komili et al., 2007) and is consistent with the ribosome filter hypothesis, which posits that ribosomal subunits, and the properties of the associated r-proteins, regulate which mRNAs are translated in different cell types and, thus, are intricately involved in cell/tissue differentiation (Mauro and Edelman, 2002).

The majority of the tissues examined here have a highly novel component to their r-protein transcript population; therefore, it is evident that r-protein genes may play a regulatory role throughout most stages of plant development. In addition to the tissues described above, apical meristems, seedlings, stems, anthers, roots, and buds each have a highly novel component to the r-protein transcript populations (Figures 1B, 2A, and 3). Among the vegetative tissues, the apical meristems display a relatively large r-protein gene cluster (Figure 3), which could be attributable to the fact that stem cells contained within the apical meristems/vegetative buds are highly differentiating and give rise to the precursor cells that later differentiate into new tissues (Woodrick et al., 2000; Bäurle and Laux, 2003). By contrast, the markedly low numbers of r-protein genes observed for other tissues, including leaves, stems, and pollen (Figures 1B and 2A; and low expression of r-proteins in pollen), suggest that a reduced level of transcripts is still sufficient to support growth/development of these relatively established (i.e., mature) and quiescent tissues. Overall, the totality of evidence indicates that transcript complexity (combinations of gene paralogs from nonhomologous r-proteins) likely decreases in the following order in B. napus: microspores and ovules, embryos/seeds and their subtissues (i.e., endosperm and seed coats), apical meristems, buds, seedlings, and other relatively quiescent tissues, including roots, stems, and pollen, suggesting that some component of the r-protein transcript populations is associated with tissue differentiation and tissue complexity.

These data for B. napus, across a wide range of r-protein genes (i.e., paralogs for 79 r-proteins), are consistent with the available data in Arabidopsis showing that paralogs encoding certain r-proteins, such as S15a, S18, L7, L11, and L23 (Van Lijsebettens et al., 1994; Williams and Sussex, 1995; Hulm et al., 2005; Barakat et al., 2007; Degenhardt and Bonham-Smith, 2008), are associated with specific tissues/processes. Our data demonstrate that sets of r-protein paralogs, representing a wide range of r-proteins, are associated with specific tissues in B. napus and that paralogs encoding the same r-protein (belonging to the same r-protein gene family) often have highly contrasting expression profiles. Furthermore, this differential expression is extended to reveal the coexpression of groups of paralogs for different r-proteins in a highly tissue-specific manner (Figures 3A to 3I).

Given that each tissue has a unique (and sometimes complex) r-protein transcript population, it was considered worthwhile to examine in more detail the individual expression patterns for gene paralogs within r-protein gene families to ascertain whether these were coexpressed, as has been suggested to occur for certain duplicated genes in Arabidopsis (Blanc and Wolfe, 2004). The finding that nearly every r-protein represented by a paralog(s) in one of the gene clusters has another paralog(s) in another gene cluster (i.e., >98% of r-proteins in each cluster have a paralog in another cluster; Figure 3) clearly shows that paralogs from an r-protein gene family can be differentially expressed among tissues. This is also clearly demonstrated in tissue comparisons where the opposite expression patterns for certain paralogous genes support the notion of developmental specificity in expression patterns (Table 4). Thus, r-protein paralog combinations from nonhomologous genes may regulate new gene pathways and networks that support tissue differentiation and development. This is supported by recent data from yeast suggesting that combinations of nonhomologous r-protein paralogs are involved in determining the functionality of r-proteins in specific tissues (Komili et al., 2007). Moreover, additional data from yeast indicate that r-protein paralog combinations are directly involved in the regulation of translation of localized (tissue-specific) mRNAs and, thus, likely play a role in determining the tissue-specific protein composition (Komili et al., 2007).
It is worthwhile to note that posttranslational modifications of r-proteins could also contribute to gene regulation (Mazumder et al., 2003; Bachand et al., 2006; Komili et al., 2007); this could also occur for plants, as recent data have indicated that there are a high number of sites for posttranslational modifications on r-proteins in Arabidopsis, including for initiator Met removal, N-terminal acetylation, N-terminal methylation, Lys N-methylation, and phosphorylation (Giavalisco et al., 2005; Carroll et al., 2008). In totality, the findings suggest that the tissue-specific r-protein paralog combinations observed here may be directly involved in the regulation of gene expression within tissues at the translational level in B. napus (possibly mediated by posttranslational modifications) and, thus, could be a critical factor in the coordination of tissue-specific differentiation and/or specialization.
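The cross-cluster statistic cited in this section (the share of r-proteins in a cluster that also have a paralog in a different cluster) can be illustrated with a minimal sketch. The cluster labels, the memberships, and the function name `shared_fraction` are hypothetical and chosen only for illustration; they are not the clusters of Figure 3.

```python
def shared_fraction(clusters):
    """For each cluster, return the fraction of its r-proteins that also
    have at least one paralog assigned to some other cluster.

    clusters: dict mapping a cluster label to the set of r-protein names
    that have one or more paralogs in that cluster.
    """
    result = {}
    for label, proteins in clusters.items():
        # Union of the r-proteins represented in all other clusters.
        others = set().union(
            *(members for other, members in clusters.items() if other != label)
        )
        result[label] = len(proteins & others) / len(proteins) if proteins else 0.0
    return result


# Hypothetical toy memberships (not data from this study).
clusters = {
    "A": {"L15", "S18", "L7"},
    "B": {"L15", "S18"},
    "C": {"L7", "L11"},
}
fractions = shared_fraction(clusters)
```

In this toy input, every r-protein in clusters A and B recurs elsewhere, whereas only half of those in cluster C do.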

Plausible Mechanisms Underlying Expression Divergence of r-Proteins

Divergence in gene expression profiles, as reported for r-protein gene paralogs examined here, is believed to be a critical feature demonstrating the functional divergence of duplicated genes into new genes (Ohno, 1970; Adams et al., 2003; Blanc and Wolfe, 2004; Casneuf et al., 2006). Thus, our data are highly suggestive of functional divergence of r-protein genes in B. napus. Functional divergence of gene paralogs has been attributed to positive or relaxed purifying selection acting on gene copies following duplication events, which has been shown to give rise to new or fractionalized functions among gene family members for various organisms (Ohno, 1970; Force et al., 1999; Martinez-Castilla and Alvarez-Buylla, 2003; Lynch and Katju, 2004; Mondragon-Palomino and Gaut, 2005). However, analysis of paralogous r-protein sequences in Arabidopsis has provided little evidence for functional divergence of r-proteins at the molecular level (e.g., among paralogs, there is a lack of asymmetrical protein divergence, which is believed to be an indicator of functional divergence) (Blanc and Wolfe, 2004; Barakat et al., 2007). Moreover, the high level of conservation of r-protein gene sequences at the DNA and protein level in Arabidopsis (Barakat et al., 2001, 2007) and B. napus also suggests that the functional divergence of these r-protein genes may not be primarily/solely driven by genetic factors. Thus, it can be speculated that the gene expression divergence could partially result from neofunctionalization and/or subfunctionalization of r-proteins at the epigenetic level (selection acting on epigenetically mediated r-protein gene expression and/or on genes involved in the posttranslational modifications of r-proteins; Lockton and Gaut, 2005) and/or from relatively minor changes in the r-protein DNA/protein sequences. 
Data from Triticum and Aegilops amphiploids have shown that methylation and other epigenetic factors contribute to regulation of gene expression within the highly duplicated genomes and can be associated with shifts in functionality among paralogs (Liu et al., 1998; Shaked et al., 2001; Liu and Wendel, 2002). It is worthwhile to note that even if ribosomal protein genes were up- or downregulated to influence subunit abundance in specific tissues, this would not explain the consistent finding that particular paralogs for distinct r-proteins are expressed in parallel across tissues and that paralogs encoding the same r-protein often show statistically significant and opposite expression patterns across tissues (Figure 3, Table 4). Overall, genetic and epigenetic factors could each contribute substantially to the functional divergence of r-protein gene paralogs.

Given that B. napus has a highly complex genome, derived from a series of ancient genome duplication events as well as an alloploidy event, resolving the precise role of genetic factors will be largely dependent on the availability of the complete genome sequence for this species and for its immediate progenitors (B. rapa and B. oleracea). Such data may reveal the precise number and arrangement of r-protein genes in the genome and their origin (B. rapa or B. oleracea), the complete DNA and protein sequences for all genes, the exact gene homologs in Arabidopsis for every B. napus paralog, as well as specific gene pairs linked to each genome duplication event following the phylogenetic divergence of the Arabidopsis and Brassica lineages (Blanc and Wolfe, 2004). Accordingly, complete DNA/protein sequences can then be used for comparative analysis (e.g., to detect asymmetric protein evolution and positive selection) among specific paralogs and among specific WGD events, wherein the precise ancestry is well established (Blanc and Wolfe, 2004; Scannell and Wolfe, 2008). These data may also reveal whether coevolved r-protein genes in the immediate ancestors of B. napus (B. oleracea and B. rapa) partially contribute to the observed divergence in expression of paralogs. Further data regarding the relationship between epigenetic traits and posttranslational modifications associated with r-protein paralog functionalities will also be important for understanding the factors underlying the functional divergence of r-protein paralogs.

In summary, our findings suggest that r-protein genes are not generic housekeeping genes (Hulm et al., 2005), as they have often been characterized in plants (e.g., Sterky et al., 2004; Nicot et al., 2005), but rather are likely involved in tissue-specific processes. The differential expression of r-protein gene paralogs across tissues has important implications for the widespread use of r-protein genes for the standardization of transcript analysis data (e.g., Sterky et al., 2004; Nicot et al., 2005). Although a component of r-protein gene expression may be necessary to support translation in a manner shared by most tissues, the results presented here demonstrate that r-protein paralogs and r-protein paralog combinations are also associated with specific tissues in B. napus. The functional divergence of r-protein paralogs could be attributable to genetically or epigenetically driven neofunctionalization and/or subfunctionalization, leading to highly specialized paralogs associated with tissue-specific differentiation and/or processes. These findings have important implications for the study of the molecular mechanisms underlying tissue differentiation and tissue-specific gene regulation in plants.

METHODS

The Brassica napus EST data set contained 623,778 ESTs and consisted of 596,312 ESTs downloaded from NCBI (all ESTs available from NCBI as of December 7, 2008) as well as an additional 27,466 NRC-PBI in-house ESTs (note that a substantial number of in-house ESTs had been submitted to GenBank from NRC-PBI prior to this analysis and thus are included in the NCBI data set). The entire in-house EST data set included sequences derived from microspores, ovules, pollen, MDEs, and in vitro pollen. All ESTs are available in GenBank (see Supplemental Table 1 online). The ribosomal-protein sequences for Arabidopsis thaliana were compiled based on the Arabidopsis Genome Initiative (AGI) identification numbers provided by Barakat et al. (2001) (available from TAIR). The gene/protein sequence identifications provided by Barakat et al. (2001) were updated/adjusted as listed in Table 1. The r-protein genes in Arabidopsis identified as pseudogenes and those lacking an AGI number were excluded from the protein sequence data set. After our updates to Barakat et al. (2001), N(r-protein genes) = 229, N(pseudogenes) = 17, and N(no AGI ID) = 3 in Arabidopsis. Systat 12 (Systat Software) was used for the statistical analysis. CAP3 was set to the default parameters in clustering and assembly analyses (Huang and Madan, 1999).

The Arabidopsis r-proteins were compared against the translated B. napus ESTs using TBLASTN (http://blast.ncbi.nlm.nih.gov), and 36,938 ESTs that matched r-proteins were identified (with a cutoff of e = 10^-6). For the identification of r-protein genes in B. napus, the 1665 genes identified from clustering and assembly of all available r-protein ESTs using CAP3 (Huang and Madan, 1999) were compared against the Arabidopsis r-proteins using BLASTX. The match having the lowest e-value was used for identification (with a cutoff of e = 10^-6). All B. napus contigs matching the same Arabidopsis r-protein were considered paralogs, making up a putative gene family (i.e., includes all genes matching any of the Arabidopsis paralogs of a particular r-protein; Table 1). This was done because specific r-protein paralogs in Arabidopsis are often too similar to provide an exact match across species. The mean paralog number per r-protein for B. napus is 7.9-fold higher than for Arabidopsis (Table 1). The estimates of the number of B. napus gene paralogs per r-protein may be reduced slightly as more EST and genomic data become available, as some singletons may merge into a unigene. Genomic data could also lead to modifications in the number of paralogs observed per r-protein because genes currently not represented by ESTs may be identified and the inherent variation in unigene numbers resulting from EST clustering protocols/approaches (e.g., input parameters) can be eliminated.
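The family assignment described above (take the lowest e-value match per contig, apply the 10^-6 cutoff, and pool contigs that hit any Arabidopsis paralog of the same r-protein) can be sketched as follows. This is an illustrative sketch and not the pipeline used in the study: the hit tuples, gene labels, and the function name `assign_families` are hypothetical, and the cutoff is assumed to be inclusive.

```python
from collections import defaultdict

E_CUTOFF = 1e-6  # cutoff stated in the text; treated here as inclusive (assumption)


def assign_families(hits, gene_to_rprotein, cutoff=E_CUTOFF):
    """Group contigs into putative r-protein gene families.

    hits: iterable of (contig, arabidopsis_gene, evalue) tuples, e.g. parsed
          from tabular BLASTX output.
    gene_to_rprotein: dict mapping each Arabidopsis paralog to its r-protein.
    Returns {r-protein: sorted list of contigs}.
    """
    best = {}  # contig -> (evalue, arabidopsis_gene) of its best hit so far
    for contig, gene, evalue in hits:
        if evalue > cutoff:
            continue  # hit too weak to support an identification
        if contig not in best or evalue < best[contig][0]:
            best[contig] = (evalue, gene)

    families = defaultdict(list)
    for contig, (_, gene) in best.items():
        # Contigs matching any paralog of the same r-protein share a family.
        families[gene_to_rprotein[gene]].append(contig)
    return {rprotein: sorted(contigs) for rprotein, contigs in families.items()}


# Hypothetical toy input (made-up identifiers, not data from this study).
gene_to_rprotein = {"At_L15_a": "L15", "At_L15_b": "L15", "At_S18_a": "S18"}
hits = [
    ("contig1", "At_L15_a", 1e-40),
    ("contig1", "At_L15_b", 1e-35),
    ("contig2", "At_L15_b", 1e-20),
    ("contig3", "At_S18_a", 1e-3),   # fails the cutoff, so it is dropped
    ("contig4", "At_S18_a", 1e-10),
]
families = assign_families(hits, gene_to_rprotein)
```

Pooling by r-protein rather than by individual Arabidopsis paralog mirrors the rationale given above: the Arabidopsis paralogs are often too similar to yield an exact one-to-one match across species.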

The clustering program used in our analyses, CAP3, clustered together highly similar ESTs (normally <5% sequence differences) at a level sufficient to group most/all alleles of genes, particularly for r-protein genes that are highly conserved in the coding regions and have minimal or no allelic variation (Berger and Weber, 1974). Although many cDNA libraries were sequenced in only one direction for the 16 tissues used in our gene expression analysis (predominantly 5′ ESTs, with seven libraries sequenced in both directions; see Supplemental Table 1 online), this does not result in a bias against the detection of gene paralogs that differ only at the extreme ends of the genes. This is because the 996 contigs used in our gene expression analyses were initially identified by clustering of all publicly available r-protein ESTs (36,938 r-protein ESTs) that were identified from the 623,778 B. napus ESTs and included libraries sequenced in the 5′, 3′, or both directions. Most unigene contigs for each r-protein gene family have similar lengths to each other and to the matching Arabidopsis r-protein(s). Shorter contigs in a gene family normally overlapped with each other and were contained within a region of the other (full-length) contigs from the same gene family (i.e., they were not subset contigs). Overall, this approach not only prevented bias that could result from obtaining contigs solely from unidirectional or limited EST sequencing but also greatly limited the potential to obtain two distinct contigs matching different parts of a single Arabidopsis r-protein.

To be conservative, we used only those unigenes represented by at least two ESTs for our comparative gene expression analysis, based on the fact that singletons can be poor indicators of a unique gene (Ewing and Green, 2000). Contigs differing in the coding region, the untranslated region, or in both regions were considered distinct in our analysis. The percentage of sequence identity was determined for the aligned sequence regions of all paralogs per r-protein (using the 996 genes having at least one EST in the 16 tissues under study). Only the regions of overlap across most/all contigs per r-protein were used for this analysis (these were aligned using ClustalW; Thompson et al., 1994); thus, the identity values for sets of paralogs are estimates. The open reading frame for each B. napus gene was identified using alignments against the Arabidopsis r-protein data set using local BLASTX. The BLASTX amino acid–based algorithm provides alignments of the six-frame translated EST relative to the protein database and thereby is more sensitive to elements of functionality and homology than DNA alignments and accurately reveals reading frames (Gish and States, 1993). For the hierarchical clustering of the 532 differentially expressed r-protein genes across the selected tissues, the EST level per gene was standardized relative to EST data set size for each tissue type (i.e., data set size per tissue; see Supplemental Table 1 online) using IDEG6 (Romualdi et al., 2003; Figure 3) prior to the clustering. Sequencing protocols and screening for EST quality for in-house ESTs (NRC-PBI) were conducted as described previously (Malik et al., 2007).
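Two steps described above, excluding singletons and standardizing EST levels relative to the EST data set size of each tissue, can be illustrated with a short sketch. It is not the IDEG6 computation: the per-10,000-EST scaling, the function names, and the toy counts are assumptions introduced here for illustration only.

```python
def multi_est_unigenes(est_counts, min_ests=2):
    """Keep unigenes supported by at least `min_ests` ESTs summed over tissues."""
    return {
        gene: counts
        for gene, counts in est_counts.items()
        if sum(counts.values()) >= min_ests
    }


def normalize_counts(counts, library_sizes, scale=10_000):
    """Express raw EST counts per `scale` ESTs sequenced for each tissue.

    The scale factor is an arbitrary illustrative choice.
    """
    return {
        tissue: scale * n / library_sizes[tissue]
        for tissue, n in counts.items()
    }


# Hypothetical toy data (not values from Supplemental Table 1).
library_sizes = {"pollen": 6000, "root": 15000}
est_counts = {
    "unigene_a": {"pollen": 12, "root": 3},
    "unigene_b": {"pollen": 1, "root": 0},  # singleton, excluded below
}

kept = multi_est_unigenes(est_counts)
normalized = {
    gene: normalize_counts(counts, library_sizes)
    for gene, counts in kept.items()
}
```

Scaling by data set size keeps a deeply sequenced tissue from appearing to express a gene more strongly simply because more ESTs were sampled from it.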

There are several notable points with regard to the EST data sets used in this study. Specifically, although the vast majority of contigs per gene family in our analysis (>90%) have substantial DNA sequence divergence (between 3 and 45% divergence within a gene family; see Results), manual examination of the contigs (and comparison to mRNA from Arabidopsis homologs) revealed that sometimes a pair of contigs (with high identity values) differed primarily/solely at the outermost edges. Manual inspections also revealed that in some cases, residual vector sequence could be observed in this proximity (edges of the untranslated region or protein coding region). All such ambiguous regions (putative vector and/or gene regions) were removed from the consensus sequences. After removal, some of the contigs could share >98% identity, and rarely 100% (up to 3 to 4% of contigs), and/or could differ solely by indel regions; such high levels of identity can also be observed for certain Arabidopsis r-protein genes (i.e., >98% and occasionally >99% identity; http://www.Arabidopsis.org). Thus, all such contigs for B. napus were used in our analysis, including the relatively few cases wherein sequence differences occurred only in the ambiguous region, pending further genomic data to completely resolve these genes (we also noted a breadth of expression across tissues for such gene pairs [i.e., they were not from specific EST libraries]). Given that B. napus is an unannotated species, the regions of residual vector sequence in some ESTs were estimated using VecScreen (http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html). Manual inspection of our data sets further revealed that a small number of the publicly available B. napus ESTs from GenBank were composed of sequence from more than one gene (resulting from linkage of two nonhomologous ESTs, presumably arising during the cloning process). 
Safeguards were taken in our study to exclude the potential effects of any EST chimeras, namely, that only ESTs that matched an r-protein were used in our analysis (and thus no EST sets from unrelated genes that could occur in a chimera were included in our study and thus could not be scored in gene expression analysis) and that all such regions (arising from a single chimeric EST containing an r-protein and another unrelated gene) were removed from the contig consensus sequence automatically during the identification of open reading frames for each r-protein gene. Nonetheless, it is notable that such artifacts are present in the public data sets (and likely in most localized data sets), and this should be a consideration for prospective large-scale studies using EST data sets in B. napus (and other organisms). The consensus sequences for the 996 contigs used in this study (as described above; prior to the identification of the open reading frames) are provided at GenBank under the name “Brassica napus r-protein contigs.”
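The identity values quoted in this section (for example, >98% identity between some contig pairs) are computed over aligned regions of overlap. A minimal sketch of such a calculation for one pair of pre-aligned sequences is given below. It assumes that columns containing a gap in either sequence are skipped, which is one possible convention and not necessarily the one used in this study; the function name and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two rows of an alignment of equal length.

    Columns with a gap in either row are ignored (an assumption of this
    sketch), so only the region of overlap contributes to the estimate.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical aligned fragments: 8 comparable columns, 6 of them identical.
identity = percent_identity("ATGGCTTA--", "ATGACTTC-A")
```

Because gapped columns are dropped, two contigs that differ solely by indel regions would score 100% under this convention, which is why such pairs need separate inspection.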

Accession Numbers

The accession numbers for ESTs from each library used in this analysis are available through a search of GenBank using the library name at http://www.ncbi.nlm.nih.gov/Entrez/. The name of each cDNA library is listed in Supplemental Table 1 online.

Supplemental Data

The following materials are available in the online version of this article.

  • Supplemental Figure 1. An Example of the Relationship among Paralogs Composing a Putative Brassica napus r-Protein Gene Family (L15).

  • Supplemental Table 1. Description of the EST Data Sets for the 16 B. napus Tissues Examined in This Study.

  • Supplemental Table 2. The Number of Statistically Significantly Differentially Expressed r-Protein Genes among Embryos and Seeds versus Seedlings in Brassica napus.

Supplementary Material

[Supplemental Data]

Acknowledgments

We are grateful for insightful and valuable suggestions on the manuscript from Sarah P. Otto and for thoughtful suggestions from the two anonymous reviewers. We thank Meghna R. Malik for her role in the establishment of the in-house EST data sets from B. napus reproductive tissues that were used in a component of our analysis. We also thank the PBI DNA Sequencing and Genomics labs for cDNA library construction and EST sequencing. This is National Research Council of Canada publication number 50146.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is Joan E. Krochko (Joan.Krochko@nrc-cnrc.gc.ca).

[W]

Online version contains Web-only data.

[OA]

Open access articles can be viewed online without a subscription.

References

  1. Adams, K.L., Cronn, R., Percifield, R., and Wendel, J.F. (2003). Genes duplicated by polyploidy show unequal contributions to the transcriptome and organ-specific reciprocal silencing. Proc. Natl. Acad. Sci. USA 100 4649–4654.
  2. Akashi, H. (2001). Gene expression and molecular evolution. Curr. Opin. Genet. Dev. 11 660–666.
  3. Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408 796–815.
  4. Aury, J.M., et al. (2006). Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444 171–178.
  5. Bachand, F., Lackner, D.H., Bähler, J., and Silver, P.A. (2006). Autoregulation of ribosome biosynthesis by a translational response in fission yeast. Mol. Cell. Biol. 26 1731–1742.
  6. Bailey-Serres, J. (1998). Cytoplasmic ribosomes of higher plants. In A Look beyond Transcription: Mechanisms Determining mRNA Stability and Translation in Plants. J. Bailey-Serres and D.R. Gallie, eds (Rockville, MD: American Society of Plant Physiology), pp. 125–144.
  7. Bansal, M., Belcastro, V., Ambesi-Impiombato, A., and di Bernardo, D. (2007). How to infer gene networks from expression profiles. Mol. Syst. Biol. 3 78.
  8. Barakat, A., Müller, K.F., and Sáenz-de-Miera, L.E. (2007). Molecular evolutionary analyses of the Arabidopsis L7 ribosomal protein gene family. Gene 403 143–150.
  9. Barakat, A., Szick-Miranda, K., Chang, I.F., Guyot, R., Blanc, G., Cooke, R., Delseny, M., and Bailey-Serres, J. (2001). The organization of cytoplasmic ribosomal protein genes in the Arabidopsis genome. Plant Physiol. 127 398–415.
  10. Bäurle, I., and Laux, T. (2003). Apical meristems: The plant's fountain of youth. Bioessays 25 961–970.
  11. Benderoth, M., Textor, S., Windsor, A.J., Mitchell-Olds, T., Gershenzon, J., and Kroymann, J. (2006). Positive selection driving diversification in plant secondary metabolism. Proc. Natl. Acad. Sci. USA 103 9118–9123.
  12. Bennett, M.D., and Leitch, I.J. (2005). Nuclear DNA amounts in angiosperms - Progress, problems and prospects. Ann. Bot. (Lond.) 95 45–90.
  13. Bennett, M.D., Leitch, I.J., Price, H.J., and Johnston, J.S. (2003). Comparisons with Caenorhabditis (∼100 Mb) and Drosophila (∼175 Mb) using flow cytometry show genome size in Arabidopsis to be ∼157 Mb and thus ∼25% larger than the Arabidopsis Genome Initiative estimate of ∼125 MB. Ann. Bot. (Lond.) 91 547–557.
  14. Berger, E.M., and Weber, L. (1974). The ribosomes of Drosophila. II. Studies on intraspecific variation. Genetics 78 1173–1183.
  15. Birchler, J.A., Bhadra, U., Bhadra, M.P., and Auger, D.L. (2001). Dosage-dependent gene regulation in multicellular eukaryotes: Implications for dosage compensation, aneuploid syndromes, and quantitative traits. Dev. Biol. 234 275–288.
  16. Birchler, J.A., and Veitia, R.A. (2007). The gene balance hypothesis: From classical genetics to modern genomics. Plant Cell 19 395–402.
  17. Blanc, G., Hokamp, K., and Wolfe, K.H. (2003). A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res. 13 137–144.
  18. Blanc, G., and Wolfe, K.H. (2004). Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16 1679–1691.
  19. Blomme, T., Vandepoele, K., De Bodt, S., Simillion, C., Maere, S., and Van de Peer, Y. (2006). The gain and loss of genes during 600 million years of vertebrate evolution. Genome Biol. 7 R43.
  20. Bowers, J.E., Chapman, B.A., Rong, J., and Paterson, A.H. (2003). Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422 433–438.
  21. Brink, R.A., and Cooper, D.C. (1947). The endosperm in seed development. Bot. Rev. 13 479–541.
  22. Carroll, A.J., Heazlewood, J.L., Ito, J., and Millar, A.H. (2008). Analysis of the Arabidopsis cytosolic ribosome proteome provides detailed insights into its components and their post-translational modification. Mol. Cell. Proteomics 7 347–369.
  23. Casneuf, T., De Bodt, S., Raes, J., Maere, S., and Van de Peer, Y. (2006). Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biol. 7 R13.
  24. Chang, I.F., Szick-Miranda, K., Pan, S., and Bailey-Serres, J. (2005). Proteomic characterization of evolutionarily conserved and variable proteins of Arabidopsis cytosolic ribosomes. Plant Physiol. 137 848–862.
  25. Clark, N.L., Aagaard, J.E., and Swanson, W.J. (2006). Evolution of reproductive proteins from animals and plants. Reproduction 131 11–22.
  26. Conant, G.C., and Wolfe, K.H. (2008). Turning a hobby into a job: How duplicated genes find new functions. Nat. Rev. Genet. 9 938–950.
  27. Degenhardt, R.F., and Bonham-Smith, P.C. (2008). Arabidopsis ribosomal proteins RPL23aA and RPL23aB are differentially targeted to the nucleolus and are disparately required for normal development. Plant Physiol. 147 128–142.
  28. Dickinson, D.B. (1965). Germination of lily pollen: Respiration and tube growth. Science 150 1818–1819.
  29. Dolezel, J., Bartoš, J., Voglmayr, H., and Greilhuber, J. (2003). Nuclear DNA content and genome size of trout and human. Cytometry 51 127–128.
  30. Dopman, E.B., and Hartl, D.L. (2007). A portrait of copy-number polymorphism in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 104 19920–19925.
  31. Duret, L., and Mouchiroud, D. (1999). Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc. Natl. Acad. Sci. USA 96 4482–4487.
  32. Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95 14863–14868.
  33. Ewing, B., and Green, P. (2000). Analysis of expressed sequence tags indicates 35,000 human genes. Nat. Genet. 25 232–234.
  34. Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y.L., and Postlethwait, J. (1999). Preservation of duplicate genes by complementary, degenerate mutations. Genetics 151 1531–1545.
  35. Freeling, M. (2008). The evolutionary position of subfunctionalization, downgraded. Genome Dyn. 4 25–40.
  36. Freeling, M., and Thomas, B.C. (2006). Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Res. 16 805–814.
  37. Giavalisco, P., Wilson, D., Kreitler, T., Lehrach, H., Klose, J., Gobom, J., and Fucini, P. (2005). High heterogeneity within the ribosomal proteins of the Arabidopsis thaliana 80S ribosome. Plant Mol. Biol. 57 577–591.
  38. Gish, W., and States, D.J. (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3 266–272.
  39. Goldberg, R.B., Beals, T.P., and Sanders, P.M. (1993). Anther development: Basic principles and practical applications. Plant Cell 5 1217–1229.
  40. Honys, D., and Twell, D. (2003). Comparative analysis of the Arabidopsis pollen transcriptome. Plant Physiol. 132 640–652.
  41. Huang, X., and Madan, A. (1999). CAP3, A DNA sequence assembly program. Genome Res. 9 868–877.
  42. Hulm, J.L., McIntosh, K.B., and Bonham-Smith, P.C. (2005). Variation in transcript abundance among four members of the Arabidopsis thaliana RIBOSOMAL PROTEIN S15a gene family. Plant Sci. 169 267–278.
  43. Ito, T., Kim, G.T., and Shinozaki, K. (2000). Disruption of an Arabidopsis cytoplasmic ribosomal protein S13-homologous gene by transposon-mediated mutagenesis causes aberrant growth and development. Plant J. 22 257–264.
  44. Komili, S., Farny, N.G., Roth, F.P., and Silver, P.A. (2007). Functional specificity among ribosomal proteins regulates gene expression. Cell 131 557–571.
  45. Lagercrantz, U. (1998). Comparative mapping between Arabidopsis thaliana and Brassica nigra indicates that Brassica genomes have evolved through extensive genome replication accompanied by chromosome fusions and frequent rearrangements. Genetics 150 1217–1228.
  46. Li, L., Huang, Y., Xia, X., and Sun, Z. (2006). Preferential duplication in the sparse part of yeast protein interaction network. Mol. Biol. Evol. 23 2467–2473.
  47. Liu, B., Vega, J.M., and Feldman, M. (1998). Rapid genomic changes in newly synthesized amphiploids of Triticum and Aegilops. II. Changes in low-copy coding DNA sequences. Genome 41 535–542.
  48. Liu, B., and Wendel, J.F. (2002). Non-Mendelian phenomena in allopolyploid genome evolution. Curr. Genomics 3 489–505.
  49. Lockton, S., and Gaut, B.S. (2005). Plant conserved non-coding sequences and paralogue evolution. Trends Genet. 21 60–65.
  50. Lukens, L., Zou, F., Lydiate, D., Parkin, I., and Osborn, T. (2003). Comparison of a Brassica oleracea genetic map with the genome of Arabidopsis thaliana. Genetics 164 359–372.
  51. Lynch, M., and Conery, J.S. (2000). The evolutionary fate and consequences of duplicate genes. Science 290 1151–1155.
  52. Lynch, M., and Conery, J.S. (2003). The evolutionary demography of duplicate genes. J. Struct. Funct. Genomics 3 35–44.
  53. Lynch, M., and Katju, V. (2004). The altered evolutionary trajectories of gene duplicates. Trends Genet. 20 544–549.
  54. Lyndon, R.F. (1990). Plant Development: The Cellular Basis. (London: Unwin Hyman).
  55. Lysak, M.A., Koch, M.A., Pecinka, A., and Schubert, I. (2005). Chromosome triplication found across the tribe Brassiceae. Genome Res. 15 516–525.
  56. Maere, S., De Bodt, S., Raes, J., Casneuf, T., Van Montagu, M., Kuiper, M., and Van de Peer, Y. (2005). Modeling gene and genome duplications in eukaryotes. Proc. Natl. Acad. Sci. USA 102 5454–5459.
  57. Malik, M.R., Wang, F., Dirpaul, J.M., Zhou, N., Polowick, P.L., Ferrie, A.M.R., and Krochko, J.E. (2007). Transcript profiling and identification of molecular markers for early microspore embryogenesis in Brassica napus. Plant Physiol. 144 134–154.
  58. Martinez-Castilla, L.P., and Alvarez-Buylla, E.R. (2003). Adaptive evolution in the Arabidopsis MADS-box gene family inferred from its complete resolved phylogeny. Proc. Natl. Acad. Sci. USA 100 13407–13412.
  59. Mauro, V.P., and Edelman, G.M. (2002). The ribosome filter hypothesis. Proc. Natl. Acad. Sci. USA 99 12031–12036.
  60. Mazumder, B., Sampath, P., Seshadri, V., Maitra, R.K., DiCorleto, P.E., and Fox, P.L. (2003). Regulated release of L13a from the 60S ribosomal subunit as a mechanism of transcript-specific translational control. Cell 115 187–198.
  61. McIntosh, K.B., and Bonham-Smith, P.C. (2005). The two ribosomal protein L23A genes are differentially transcribed in Arabidopsis thaliana. Genome 48 443–454.
  62. Mitreva, M., Wendl, M.C., Martin, J., Wylie, T., Yin, Y., Larson, A., Parkinson, J., Waterston, R.H., and McCarter, J.P. (2006). Codon usage patterns in Nematoda: Analysis based on over 25 million codons in thirty-two species. Genome Biol. 7 R75.
  63. Mondragon-Palomino, M., and Gaut, B.S. (2005). Gene conversion and the evolution of three leucine-rich repeat gene families in Arabidopsis thaliana. Mol. Biol. Evol. 22 2444–2456.
  64. Morita-Yamamuro, C., Tsutsui, T., Tanaka, A., and Yamaguchi, J. (2004). Knock-out of the plastid ribosomal protein S21 causes impaired photosynthesis and sugar-response during germination and seedling development in Arabidopsis thaliana. Plant Cell Physiol. 45 781–788.
  65. Nicot, N., Hausman, J.F., Hoffmann, L., and Evers, D. (2005). Housekeeping gene selection for real-time RT-PCR normalization in potato during biotic and abiotic stress. J. Exp. Bot. 56 2907–2914.
  66. Ohno, S. (1970). Evolution by Gene Duplication. (New York: Springer-Verlag).
  67. Papp, B., Pál, C., and Hurst, L.D. (2003). Dosage sensitivity and the evolution of gene families in yeast. Nature 424 194–197. [DOI] [PubMed] [Google Scholar]
  68. Pesaresi, P., Varotto, C., Meurer, J., Jahns, P., Salamini, F., and Leister, D. (2001). Knock-out of the plastid ribosomal protein L11 in Arabidopsis: Effects on mRNA translation and photosynthesis. Plant J. 27 179–189. [DOI] [PubMed] [Google Scholar]
  69. Rana, D., van den Boogaart, T., O'Neill, C.M., Hynes, L., Bent, E., Macpherson, L., Park, J.Y., Lim, Y.P., and Bancroft, I. (2004). Conservation of the microstructure of genome segments in Brassica napus and its diploid relatives. Plant J. 40 725–733. [DOI] [PubMed] [Google Scholar]
  70. Robinson-Beers, K., Pruitt, R.E., and Gasser, C.S. (1992). Ovule development in wild-type Arabidopsis and two female-sterile mutants. Plant Cell 4 1237–1249. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Romualdi, C., Bortoluzzi, S., D'Alessi, F., and Danieli, G.A. (2003). IDEG6: A web tool for detection of differentially expressed genes in multiple tag sampling experiments. Physiol. Genomics 12 159–162.
  72. Scannell, D.R., and Wolfe, K.H. (2008). A burst of protein sequence evolution and a prolonged period of asymmetric evolution follow gene duplication in yeast. Genome Res. 18 137–147.
  73. Shaked, H., Kashkush, K., Ozkan, H., Feldman, M., and Levy, A.A. (2001). Sequence elimination and cytosine methylation are rapid and reproducible responses of the genome to wide hybridization and allopolyploidy in wheat. Plant Cell 13 1749–1759.
  74. Simillion, C., Vandepoele, K., Van Montagu, M.C., Zabeau, M., and Van de Peer, Y. (2002). The hidden duplication past of Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 99 13627–13632.
  75. Smith, G.P. (1973). Unequal crossover and the evolution of multigene families. Cold Spring Harb. Symp. Quant. Biol. 38 507–513.
  76. Southworth, D. (1975). Lectins stimulate pollen germination. Nature 258 600–602.
  77. Sterky, F., et al. (2004). A Populus EST resource for plant functional genomics. Proc. Natl. Acad. Sci. USA 101 13951–13956.
  78. Subramanian, S., and Kumar, S. (2004). Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome. Genetics 168 373–381.
  79. Swanson, W.J., and Vacquier, V.D. (2002). The rapid evolution of reproductive proteins. Nat. Rev. Genet. 3 137–143.
  80. Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994). CLUSTALW: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22 4673–4680.
  81. Tiffin, P., and Hahn, M.W. (2002). Coding sequence divergence between two closely related plant species: Arabidopsis thaliana and Brassica rapa ssp. pekinensis. J. Mol. Evol. 54 746–753.
  82. Torgerson, D.G., and Singh, R.S. (2004). Rapid evolution through gene duplication and subfunctionalization of the testes-specific α4 proteasome subunits in Drosophila. Genetics 168 1421–1432.
  83. Tsugeki, R., Kochieva, E.Z., and Fedoroff, N.V. (1996). A transposon insertion in the Arabidopsis SSR16 gene causes an embryo-defective lethal mutation. Plant J. 10 479–489.
  84. Van Lijsebettens, M., Vanderhaeghen, R., De Block, M., Bauw, G., Villarroel, R., and Van Montagu, M. (1994). An S18 ribosomal protein gene copy at the Arabidopsis PFL locus affects plant development by its specific expression in meristems. EMBO J. 13 3378–3388.
  85. Veitia, R.A., Bottani, S., and Birchler, J.A. (2008). Cellular reactions to gene dosage imbalance: Genomic, transcriptomic and proteomic effects. Trends Genet. 24 390–397.
  86. Vision, T.J., Brown, D.G., and Tanksley, S.D. (2000). The origins of genomic duplications in Arabidopsis. Science 290 2114–2117.
  87. Wan, L., Xia, Q., Qiu, X., and Selvaraj, G. (2002). Early stages of seed development in Brassica napus: A seed coat-specific cysteine proteinase associated with programmed cell death of the inner integument. Plant J. 30 1–10.
  88. Weijers, D., Franke-van Dijk, M., Vencken, R.-J., Quint, A., Hooykaas, P., and Offringa, R. (2001). An Arabidopsis Minute-like phenotype caused by a semi-dominant mutation in a ribosomal protein S5 gene. Development 128 4289–4299.
  89. Whittle, C.A., Malik, M.R., and Krochko, J.E. (2007). Gender-specific selection on codon usage in plant genomes. BMC Genomics 8 169–179.
  90. Williams, M.E., and Sussex, I.M. (1995). Developmental regulation of ribosomal protein L16 genes in Arabidopsis thaliana. Plant J. 8 65–76.
  91. Wright, S.I., Yau, C.B., Looseley, M., and Meyers, B.C. (2004). Effects of gene expression on molecular evolution in Arabidopsis thaliana and Arabidopsis lyrata. Mol. Biol. Evol. 21 1719–1726.
  92. Woodrick, R., Martin, P.R., Birman, I., and Pickett, F.B. (2000). The Arabidopsis embryonic shoot fate map. Development 127 813–820.
  93. Zhang, G., Gifford, D.J., and Cass, D.D. (2004). RNA and protein synthesis in sperm cells isolated from Zea mays L. pollen. Sex. Plant Reprod. 6 239–243.
  94. Zhang, L., and Gaut, B.S. (2003). Does recombination shape the distribution and evolution of tandemly arrayed genes (TAGS) in the Arabidopsis thaliana genome? Genome Res. 13 2533–2540.

Associated Data

Supplementary Materials

[Supplemental Data]
tpc.109.068411_1.pdf (60.6KB, pdf)

Articles from The Plant Cell are provided here courtesy of Oxford University Press