Plant Physiology. 2003 Mar;131(3):1313–1326. doi: 10.1104/pp.102.014928

Whole-Genome Comparison of Leucine-Rich Repeat Extensins in Arabidopsis and Rice. A Conserved Family of Cell Wall Proteins Form a Vegetative and a Reproductive Clade1,[w]

Nicolas Baumberger 1,2, Brigitte Doesseger 1, Romain Guyot 1, Anouck Diet 1, Ronald L Parsons 1, Mark A Clark 1, MP Simmons 1, Patricia Bedinger 1, Stephen A Goff 1, Christoph Ringli 1, Beat Keller 1,*
PMCID: PMC166891  PMID: 12644681

Abstract

We have searched the Arabidopsis and rice (Oryza sativa) genomes for homologs of LRX1, an Arabidopsis gene encoding a novel type of cell wall protein containing a leucine-rich repeat (LRR) and an extensin domain. Eleven and eight LRX (LRR/EXTENSIN) genes have been identified in these two plant species, respectively. The LRX gene family encodes proteins characterized by a short N-terminal domain, a domain with 10 LRRs, a cysteine-rich motif, and a variable C-terminal extensin-like domain. Phylogenetic analysis performed on the conserved domains indicates the existence of two major clades of LRX proteins that arose before the eudicot/monocot divergence and then diversified independently in each lineage. In Arabidopsis, gene expression studies by northern hybridization and promoter::uidA fusions showed that the two phylogenetic clades represent a specialization into “reproductive” and “vegetative” LRXs. The four Arabidopsis genes of the “reproductive” clade are specifically expressed in pollen, whereas the seven “vegetative” genes are predominantly expressed in various sporophytic tissues. This separation into two expression classes is also supported by previous studies on maize (Zea mays) and tomato (Lycopersicon esculentum) LRX homologs and by information on available rice ESTs. The strong conservation of the amino acids responsible for the putative recognition specificity of the LRR domain throughout the family suggests that the LRX proteins interact with similar ligands.


With the completion of the Arabidopsis genome sequence, it became clear that many Arabidopsis genes are members of multigene families. Although this had already been suggested by the analysis of expressed sequence tag (EST) databases and by classical gene searches, the availability of the full gene set of a plant provides the unique opportunity to get a complete inventory of all the members of a gene family. Among the 25,500 genes predicted in the Arabidopsis genome, 65% are members of a multigene family and 37% belong to families of more than five members (Arabidopsis Genome Initiative, 2000). Although the predicted total gene number of Arabidopsis is significantly larger than that of other sequenced multicellular eukaryotes such as Caenorhabditis elegans (19,000; C. elegans Sequencing Consortium, 1998) or Drosophila melanogaster (13,600; Adams et al., 2000), the absolute number of gene families and singletons (11,601 in Arabidopsis) is comparable in all these organisms (Arabidopsis Genome Initiative, 2000). This indicates that frequent gene duplications and consequently large gene families are a distinctive feature of the Arabidopsis genome, and possibly of all plant genomes. With the recent publication of a high-quality draft from two different subspecies, rice (Oryza sativa) is the second plant whose genome can be comprehensively investigated. Depending on the stringency applied in gene prediction, the rice genome contains between 32,277 and 61,668 genes (Goff et al., 2002; Yu et al., 2002). With 77% of the genes distributed in about 15,000 multigene families, it appears that this larger number of genes is at least partly due to an increase in the number of copies per family. Both Arabidopsis and rice contain large duplicated segments, suggesting that they have undergone one or several polyploidization events during their evolutionary history. 
Monocots and eudicots diverged 180 to 240 million years ago (Wolfe et al., 1989; Goremykin et al., 1997; Soltis et al., 2002), and the main polyploidization or large duplication events still detectable in the genome of Arabidopsis and rice are estimated to have occurred around 112 million years and 40 to 50 million years ago, respectively (Vision et al., 2000; Goff et al., 2002). Combined with local tandem duplications, polyploidization is a mechanism that generates multigene families and is a major source of evolutionary novelty. With the availability of the Arabidopsis and rice genomes, it is now possible to study and understand more precisely how and when gene families arose, how they were amplified, and which new biological functions were derived in the monocots and eudicots. In addition to basic knowledge on evolutionary processes, these comparative studies should also help to attribute a function to every identified gene.

Recently, a new type of modular cell wall protein containing a Leu-rich repeat (LRR) and an extensin domain was described in eudicots and monocots (Rubinstein et al., 1995a; Baumberger et al., 2001; Stratford et al., 2001). LRRs are frequently implicated in protein-protein interactions and, in plants, a large subclass of receptor-like kinases has extracellular LRRs in the receptor domain (Shiu and Bleecker, 2001). Of those, several have been shown to work in signal transduction during development or to participate in pathogen recognition and defense (Song et al., 1995; Torii et al., 1996; Clark et al., 1997; Li and Chory, 1997; Jinn et al., 2000). Extensins form an abundant group of cell wall structural proteins belonging to the family of Hyp-rich glycoproteins. They are defined by the presence of the repeated pentapeptide Ser(Hyp)4 (Kieliszewski et al., 1990; Kieliszewski and Lamport, 1994), where most Hyp and Ser residues are glycosylated (Wilson and Fry, 1986). One proposed function of extensins is to reinforce the polysaccharidic structure of the wall by cross-linking to each other and/or other cell wall components. It has also been postulated that extensins might fix the shape of the cell at the end of the expansion phase (Carpita and Gibeaut, 1993). Such a function in cell morphogenesis is supported by the recent finding that an extensin gene is required for correct cell morphogenesis during Arabidopsis embryogenesis (Hall and Cannon, 2002).

The Arabidopsis LRX1 (LRR/EXTENSIN1) gene was shown to be involved in root hair morphogenesis because lrx1 mutants develop root hairs with aberrant shape (Baumberger et al., 2001). The LRX1 protein is specifically targeted to the root hair cell wall, where it is insolubilized. Similarly, PEX1, a pollen-specific maize (Zea mays) LRX, was immunolocalized in the intine layer of the pollen grain and the callosic sheath of the pollen tube (Rubinstein et al., 1995a, 1995b). Together, the properties of LRRs and extensins, the cell wall localization of LRX proteins, and the phenotype of lrx1 mutants suggest that LRXs are potentially involved in the regulation of cell wall expansion in response to signaling.

To better understand the function and the evolutionary history of LRR extensins, we have characterized the LRX family of Arabidopsis and rice. We have compared the protein organization of the identified gene family members and further studied the expression of the Arabidopsis LRX genes. Our results reveal that LRX proteins fall into two different subclasses defined by their phylogenetic relationship and their preferential expression either in pollen or in vegetative tissues. The LRX phylogeny suggests that this family has independently evolved in monocots and eudicots from two common ancestral genes specifically expressed in either vegetative or reproductive tissues.

RESULTS

Genome-Wide Searches Identified 11 LRX Genes in Arabidopsis and Eight in Rice

To identify the complete LRX gene family, we searched the genomes of Arabidopsis (Arabidopsis Genome Initiative, 2000) and rice subsp. japonica (Goff et al., 2002), to which we recently gained access, using the tBLASTn algorithm (Altschul et al., 1997) and the LRR protein sequence of LRX1 as a query. Among all the sequences showing some homology to the LRR domain of LRX1 (over 100), we selected for further examination those with an E value under 10e-22. Sequences encoding an extensin-like domain in the C terminus were retained and reiteratively blasted against the databases. The reiterative search did not provide additional sequences. All the LRX sequences had an E value under 10e-75 and were clustered at the top of the list of hits (depending on the query, 25–60 sequences were retained following the criterion of an E value under 10e-22), indicating that the initial cutoff value of 10e-22 was appropriate. In total, this survey identified 11 Arabidopsis LRX genes, including the already characterized Arabidopsis LRX1 gene, and eight novel rice genes (Table I). The LRX genes are dispersed on the five Arabidopsis chromosomes and on seven rice chromosomes. The nomenclature used to identify the individual genes has been developed based on previous work (Rubinstein et al., 1995a, 1995b; Baumberger et al., 2001; this study). The gene family is referred to as the LRX gene family. The specific members of the “vegetative” clade (see below) are referred to as LRX genes with the addition of genus-species modifiers (i.e. LRX1 is now AtLRX1, and we propose that TOML-4 [Zhou et al., 1992] be renamed LeLRX1). The specific members of the “reproductive” clade (see below) are referred to as PEX genes with a similar modifier (i.e. Pex1 [Rubinstein et al., 1995a] is now ZmPEX1). We believe that this system minimizes confusion while still maintaining the links to previously published work.
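The two-step selection described above (an E-value cutoff on the LRR hit, then a requirement for a C-terminal extensin-like domain) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually run by the authors: the `Hit` record, its field names, the helper `has_extensin_like_tail`, the use of a simple Ser-Pro4 count as a proxy for an extensin-like domain, and the toy sequences are all hypothetical. The cutoff is written exactly as printed in the text (10e-22).

```python
import re
from dataclasses import dataclass

# Cutoff as printed in the text ("10e-22"). Taken literally this is 1e-21;
# whether 1e-22 was meant is not stated, so the value is an assumption.
E_VALUE_CUTOFF = 10e-22


@dataclass
class Hit:
    """One hypothetical tBLASTn hit (field names are illustrative)."""
    name: str
    e_value: float
    translated_seq: str  # conceptual translation of the candidate ORF


def has_extensin_like_tail(protein: str, min_motifs: int = 1) -> bool:
    """Crude proxy: look for Ser-Pro4 motifs in the C-terminal half."""
    tail = protein[len(protein) // 2:]
    return len(re.findall(r"SP{4}", tail)) >= min_motifs


def select_lrx_candidates(hits):
    """Keep hits under the E-value cutoff that also carry an extensin-like tail."""
    return [
        h for h in hits
        if h.e_value < E_VALUE_CUTOFF and has_extensin_like_tail(h.translated_seq)
    ]


# Toy data, invented for illustration only.
hits = [
    Hit("candidate_A", 1e-80, "MKLLGNLSGAIPLDLSNNRFCGKF" + "SPPPPVYSPPPPVH"),
    Hit("candidate_B", 1e-30, "MKLLGNLSGAIPLDLSNNRFCGKFAAGGTTKKLLEE"),  # no extensin tail
    Hit("candidate_C", 1e-5, "MSSSPPPPAAASPPPPVVVSPPPPYY"),  # weak LRR match
]
selected = [h.name for h in select_lrx_candidates(hits)]
print(selected)
```

In a real search the retained sequences would then be used as new queries until no additional hits appear, which corresponds to the reiterative step in the text.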

Table I.

LRX genes in Arabidopsis and rice

Gene Name | Accession No./Contigs [a] | Chromosome No. | No. of Amino Acids | EST Accession No. | EST Tissues
AtLRX1 | AY026364 | I | 744 | AV549666 | Roots
AtLRX2 | AAF70841 | I | 786 | None |
AtLRX3 | CAB40769 | IV | 760 | AA712746 | Mixed
 |  |  |  | Z37611 | 5-d-old seedlings
 |  |  |  | AV549313 | Roots
AtLRX4 | BAB01951 | III | 494 | AI995373 | Inflorescence
 |  |  |  | AV536116l | Liquid-cultured seedlings
AtLRX5 | CAB37452 | IV | 858 | AV832281 | Mixed
 |  |  |  | AV529017 | Mixed
 |  |  |  | AV546936 | Siliques and flowers
 |  |  |  | AV540471 | Roots
 |  |  |  | AV548766 | Roots
 |  |  |  | AV820372 | Roots
 |  |  |  | AV523356 | Aboveground organs
 |  |  |  | AV547191 | Roots
 |  |  |  | AV523812 | Aboveground organs
AtLRX6 | BAB01255 | III | 470 | AV546157 | Roots
 |  |  |  | AV540214 | Roots
AtLRX7 | NP_197937 | V | 433 | None |
AtPEX1 | BAB01698 | III | 956 | AV530441 | Flower buds
 |  |  |  | AI996299 | Inflorescence
 |  |  |  | AV808484 | Mixed
AtPEX2 | AAD43152 | I | 852 | AV535680 | Flower buds
 |  |  |  | AV810460 | Mixed
AtPEX3 | AAD41978 | II | 791 | AV566944 | Green siliques
 |  |  |  | AV440802 | Aboveground organs
 |  |  |  | AV796546 | Mixed
 |  |  |  | AI100573 | Mixed
 |  |  |  | AV804253 | Mixed
 |  |  |  | AV817648 | Mixed
 |  |  |  | AV520402 | Aboveground organs
 |  |  |  | Z17773 | Green siliques
AtPEX4 | CAA19879 | IV | 699 | AI994021 | Mixed
 |  |  |  | AA394664 | Mixed
 |  |  |  | BE521427 | Developing seeds (5–13 d post-flowering)
 |  |  |  | BE521428 | Developing seeds (5–13 d post-flowering)
OsLRX1 | cl003089.267 (46,794) | II (8.9 cM) | 693 | BE04668 | 2-Week-old plants
OsLRX2 | cl006359.13 (32,553) | VI (117–120 cM) | >505 | AU091360 | Panicles < 3 cm
 |  |  |  | AU162801 | Green shoots
OsLRX3 | cl016081.16 (8,601) | V (32–32.5 cM) | 468 | None |
OsLRX4 | cl026400.214 (13,025) | I (19.9–26.8 cM) | 520 | E138_4Z | Panicles (flowering)
OsLRX5 | cl024247.95 (9,724) | VII (61.9–62.4 cM) | 571 | D2478 | Root seedlings
 |  |  |  | D41575 | Etiolated shoots
 |  |  |  | AU174289 | Etiolated shoots
OsPEX1 | cl015625 (79,045) (75,401) (19,994) | XI (106.8 cM) | 1,381 | C717877 | Panicle (flowering)
 |  |  |  | C72813 | Panicle (flowering)
 |  |  |  | C71802 | Panicle (flowering)
 |  |  |  | C98502 | Panicle (flowering)
 |  |  |  | C98526 | Panicle (flowering)
OsPEX2 | cl023884 (97,130) | XII (71.7–78.9 cM) | 780 | C98912 | Panicle (flowering)
 |  |  |  | C98913 | Panicle (flowering)
OsPEX3 | clb9508 (41) | I (71.7 cM) | 503 | C72871 | Panicle (flowering)

[a] Accession nos. derived from the National Center for Biotechnology Information are given for the Arabidopsis genes, and contig nos. from the rice database at TMRI are provided for the rice homologs (Goff et al., 2002). Nos. in parentheses indicate contigs harboring the same gene in rice subsp. indica (Yu et al., 2002).

Because all the AtLRX (Arabidopsis LRX) genes appear to be located on pairs of duplicated chromosomal segments (Blanc et al., 2000; Vision et al., 2000), we used the Dotter program (Sonnhammer and Durbin, 1995) to align the sequences of the bacterial artificial chromosomes (BACs) harboring the LRX genes and to verify the conservation of the genes flanking each candidate. This analysis confirmed that AtLRX1/AtLRX2, AtLRX3/AtLRX4, AtPEX1/AtPEX2, and AtPEX3/AtPEX4 form gene pairs that arose by segmental duplication. Therefore, these gene pairs can be considered paralogs (Fig. 1). The three remaining genes, AtLRX5, AtLRX6, and AtLRX7, are also located on duplicated chromosomal segments. However, the second copy of each gene has been lost by sequence rearrangement, as revealed by the analysis of the corresponding BAC sequences (data not shown).

Figure 1.


Position of the 11 Arabidopsis LRX genes on the five Arabidopsis chromosomes. The large chromosomal duplications harboring the LRX genes are represented by shaded boxes with identical pattern, and paralogs are linked by arrows. AtLRX5, 6, and 7 are single genes, although they are located on duplicated segments. Centromeres are indicated by constrictions in the chromosome schematic representation. The chromosomal duplications are deduced from Blanc et al. (2000) and Vision et al. (2000). Gene positions on the chromosomes are approximated according to the mapping of the BAC clones available at The Arabidopsis Information Resource Web site (http://www.arabidopsis.org).

The prediction of the AtLRX open reading frames (ORFs) was verified by visually checking the three-frame translations of the genomic sequences and by comparing the predicted proteins with the AtLRX1, ZmPEX1, and ZmPEX2 sequences (Rubinstein et al., 1995a; Baumberger et al., 2001; Stratford et al., 2001). The OsLRX ORFs were determined similarly, after correction of the ambiguities and sequencing errors by comparison with ESTs and rice subsp. indica genomic sequences (Yu et al., 2002). At the time of the analysis, the ORF sequences of two OsLRX candidate genes were incomplete. One of them, OsLRX3, was included in the analysis because its ORF sequence lacks only the 5′ end, which is poorly conserved in the LRX family. The second candidate was not considered further because its identification as a putative LRX gene relies only on its homology with the LRR domain of AtLRX1, without any possibility of verifying the rest of the sequence and the presence of an extensin domain.

Because previously reported LRX genes lack introns (Rubinstein et al., 1995a; Baumberger et al., 2001; Stratford et al., 2001), we wished to determine whether any AtLRX genes contained introns. Where EST sequences were available, the presence or absence of introns was verified by comparing the genomic sequences with the corresponding ESTs. Two genes (AtLRX6 and AtLRX7) encode proteins with very short and atypical extensin domains and, therefore, could contain unannotated introns in the extensin domain, leading to a perceived premature truncation. For these genes, the 3′ end of the cDNA was experimentally confirmed by RACE-PCR, verifying that the annotated gene sequences are correct. Two genes, AtPEX3 and AtLRX5, are annotated as containing introns. For AtPEX3, starting the ORF at position 54,949 of the BAC F19G14 (accession no. AC006438) results in a complete ORF that predicts a protein with a better signal sequence compared with the annotated protein. For AtLRX5, the predicted intron results in an unusual sequence motif nested within the extensin domain. To reexamine the nucleotide sequence in this region, we sequenced PCR products covering the predicted intron and flanking sequences. As suspected, the published genomic sequence contains an extra nucleotide at position 35,198 of the BAC F28A21 (accession no. AL035526). Correction of this sequencing error eliminates the requirement for the predicted intron and eliminates the unusual sequence motif from the extensin domain. The complete corrected DNA and protein sequences of the rice LRX are available in the supplemental data (see www.plantphysiol.org). From our results, we conclude that all of the AtLRX genes lack introns.

The LRR Domain of LRX Proteins Is Strongly Conserved between Monocots and Eudicots

To provide a general description of the LRX family, we compared the 11 Arabidopsis LRX protein sequences with the eight rice homologs. We also included in the analysis homologs from maize (ZmPEX1 and ZmPEX2; Rubinstein et al., 1995a; Stratford et al., 2001), tomato (Lycopersicon esculentum; LeLRX1 and LePEX1; Zhou et al., 1992; Baumberger et al., 2001; Stratford et al., 2001), and from Nicotiana tabacum (NtPEX1; Wong, 2001).

The general organization of the LRX proteins after cleavage of the predicted signal peptide (Nielsen et al., 1997) consists of a 54- to 102-amino acid-long N-terminal domain, which is followed by a 236- to 240-amino acid-long domain of LRRs, separated from the C-terminal extensin-like domain by a short Cys-rich region of 39 to 50 amino acids (Fig. 2, A and B). Three regions can be delimited in the N-terminal domain: a hypervariable region of 11 to 20 amino acids and two flanking regions of predicted α-helices. The first α-helix is truncated or entirely missing in eight of the LRX proteins (AtLRX1, 2, 6, 7, LeLRX, and OsLRX1, 3, and 4) but is always present in the PEX proteins. In contrast, the second α-helix is present throughout the family and forms the beginning of a very well-conserved region of the N-terminal domain (Fig. 2, B and C). The hypervariable region is sometimes conserved between proteins of the same plant family (LePEX/NtPEX and ZmPEX1/ZmPEX2/OsPEX1/OsPEX2) and between subsets of the Arabidopsis proteins (AtPEX1/AtPEX3/AtPEX4 and AtLRX4/AtLRX5; Fig. 2C).

Figure 2.


Domain organization of the LRX proteins. A, Schematic representation of the conserved domain organization of the LRX proteins. The different domains are drawn to scale and their identity is indicated (SP, predicted signal peptide; and α, position of an α-helix). The N-terminal domain contains two predicted α-helices (gray oblique stripes) that flank a hypervariable region (light-gray box). The region of the N-terminal domain beginning with the second predicted α-helix is strongly conserved (stippled box). The C terminus (black oblique stripes) is conserved between several of the LRX proteins. B, Consensus amino acid sequence of each conserved domain. Capital letters, residues that are conserved in at least 80% of the Arabidopsis, rice, maize, and tomato LRX proteins; x, non-conserved residues; and a, any aliphatic amino acid. The 10 LRRs are aligned to the plant extracellular LRR consensus sequence, and positions matching the consensus are indicated in bold letters. The predicted β-strand/β-turn structural motif is framed. Subscript numbers indicate the number of non-conserved residues between two consensus amino acids. C, N-terminal domain sequences of the LRX proteins. The α-helix regions and the hypervariable regions are indicated at the top. Conserved residues are in bold, and regions of particular homology are framed. D, C terminus of the LRX proteins. Conserved residues are indicated in bold, and sequences sharing the same consensus are grouped together. The consensus motifs are shown on the right of the sequences. Capital letters, residues conserved in 50% or more of the sequences; lowercase letters, residues common to less than 50% of the sequences.

The LRR domain is composed of nine complete repeats of 23 to 25 amino acids flanked by a 10th, degenerate terminal repeat (Fig. 2B). The LRX LRRs match the plant extracytoplasmic LRR consensus sequence LxxLxxLxLxxNxLxGxIPxxLGx, where L represents a conserved Leu or other aliphatic residue, and x can be any amino acid (Kajava, 1998). The identity of the LRR domain between the different members of the LRX family ranges from 50% to 96.6%. The first three LRRs, as well as the sixth one, show the strongest conservation throughout the family. Protein tertiary structure prediction and crystallographic studies indicate that the protein-protein interaction surface of LRR domains is constituted by the juxtaposition of the xxLxLxx motif of each repeat, in which the variable residues are exposed to the solvent and determine the specificity of the interaction (Fig. 2B; Kobe and Deisenhofer, 1995; Papageorgiou et al., 1997). Therefore, we compared the conservation of the solvent-exposed residues with the conservation of the amino acids at non-consensus positions in the LRRs of the complete Arabidopsis LRX family. For this purpose, the solvent-exposed residues of each LRR were extracted from the sequence and joined together into a separate file for each AtLRX protein. All pair-wise similarities were calculated between these artificial sequences, and the values were averaged. The same calculation was performed on the rest of the LRR domain sequence, excluding the LRR consensus residues, which are likely to be under strong negative selection pressure (Parniske et al., 1997). The average similarity is 72% ± 11% for the solvent-exposed residues (xxLxLxx) against 48% ± 13% for the variable amino acids not directly involved in the interaction. This suggests that purifying selection has tended to preserve the recognition specificity of the LRR domain.
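The comparison of solvent-exposed residues can be illustrated with a short sketch. Several things here are assumptions made for the example: each repeat is taken to be already aligned, without gaps, to the consensus LxxLxxLxLxxNxLxGxIPxxLGx, so that the variable residues of xxLxLxx sit at fixed offsets; plain percent identity stands in for the similarity measure, which the text does not specify; and the repeat sequences are invented, not real AtLRX sequences.

```python
from itertools import combinations

CONSENSUS = "LxxLxxLxLxxNxLxGxIPxxLGx"  # consensus quoted in the text
# 0-based offsets of the variable residues inside the xxLxLxx segment,
# assuming every repeat is aligned to CONSENSUS without gaps.
EXPOSED_OFFSETS = (4, 5, 7, 9, 10)


def exposed_residues(repeats):
    """Concatenate the solvent-exposed residues of every aligned repeat."""
    return "".join(rep[i] for rep in repeats for i in EXPOSED_OFFSETS)


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_identity(seqs):
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Invented two-repeat "proteins" (24 residues per repeat), illustration only.
protein_1 = ["LSNLKSLDLSNNRLSGEIPSSLGN", "LTALEELYLHSNQFTGPIPAEFGK"]
protein_2 = ["LANLKSLDLSNNKLTGAIPTSLGS", "LSGLEELYLHSNRFSGSIPVEFGN"]
protein_3 = ["LSNLRTLDLSNNRLSGEIPSSLGN", "LTALQELFLHSNQFTGPIPAEFGK"]

exposed = [exposed_residues(p) for p in (protein_1, protein_2, protein_3)]
mean_exposed = mean_pairwise_identity(exposed)
print(exposed, mean_exposed)
```

Running the same averaging on the remaining non-consensus positions would give the second figure of the comparison; a substitution matrix could replace `percent_identity` if similarity rather than identity is wanted.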

The Cys-rich region that follows the LRR domain also forms a characteristic signature of LRX proteins. This hinge region consists of five Cys residues regularly spaced by 10 to 18 variable amino acids and two well-conserved Asn and Gln residues (Cx(10–18)Cx(10–14)NCx(6–7)Qx(4)Cx(9–11)C, where x is any amino acid and the numbers in parentheses give the number of x residues). The strong conservation of these Cys residues suggests that disulfide bonds are important for the conformation and/or function of LRX proteins.
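The Cys-rich signature translates directly into a regular expression, which may be convenient for screening candidate sequences. The sketch below is a plain transcription of the spacing pattern given in the text; the function name and the synthetic test sequence are hypothetical, and no claim is made that the authors used such a search.

```python
import re

# Transcription of the Cys-rich signature: C, 10-18 residues, C, 10-14 residues,
# N, C, 6-7 residues, Q, 4 residues, C, 9-11 residues, C ("." is any residue).
CYS_RICH_HINGE = re.compile(r"C.{10,18}C.{10,14}NC.{6,7}Q.{4}C.{9,11}C")


def find_cys_rich_hinge(protein):
    """Return (start, end) of the first match, or None if the motif is absent."""
    match = CYS_RICH_HINGE.search(protein)
    return (match.start(), match.end()) if match else None


# Synthetic sequence built to satisfy the pattern with minimal spacers.
synthetic = (
    "GG" + "C" + "A" * 10 + "C" + "A" * 10 + "NC"
    + "A" * 6 + "Q" + "A" * 4 + "C" + "A" * 9 + "C" + "GG"
)
span = find_cys_rich_hinge(synthetic)
# Negative control: replacing the conserved Gln abolishes the match.
no_match = find_cys_rich_hinge(synthetic.replace("Q", "A"))
print(span, no_match)
```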

The Extensin Domain of the LRX Family Is Highly Variable

In contrast to the high similarity observed between the N-terminal and LRR domains of all the predicted LRX proteins, the extensin domain is extremely variable, both in length and motif organization (Table II). The S(P)4-n motif characteristic of the extensins is present in a high copy number in most of the LRX proteins. However, some of the identified LRX proteins contain a very short extensin domain with only one or two S(P)4-n motifs (AtLRX7 and OsLRX3) or contain only shorter stretches of Pro residues (OsLRX1 and OsLRX2). Most of the LRX extensin domains contain additional subdomains that lack the specific S(P)4-n motifs but are made of other repeats or are enriched in specific amino acids, which are frequently Ser, Val, Lys, Gln, and Thr. The alignment of the extensin domain was difficult, even between duplicated genes, mostly because of the frequent insertion and deletion of particular repeats. As a consequence, we manually screened individual extensin domains for patterns and higher order repeats and compared these patterns between the different LRX proteins. An analysis performed with the motif discovery algorithm MEME (Bailey and Elkan, 1994) essentially confirmed the manual search and was used to refine the motif and consensus definition where necessary. Table II compiles the different repeats identified in each extensin domain and reports motifs that are present in more than one LRX protein. Repeats are frequently clustered together and define subdomains, which have a specific signature on hydropathy plots (data not shown). For example, the repeat SPPPPVH constitutes a conserved subdomain of variable copy number in four of the Arabidopsis LRX proteins (AtPEX1–4). This SPPPPVH subdomain makes up essentially the entire extensin moiety of AtPEX3, whereas in its paralog (AtPEX4), it is preceded by a different domain absent from AtPEX3. 
Similarly, the motif VKSPAPVSPPPP is present in numerous copies, with small variations, in several of the monocot LRXs identified so far (ZmPEX1, ZmPEX2, OsPEX1, and OsPEX2). Interestingly, the most conserved part of the extensin domain is the C terminus. Monocot and dicot PEX proteins have the consensus terminal motif ilPP(i/f)(i/L/m)ghqYaSPPPP(m/q)FqGY, whereas two other related motifs, yegxplPPvigVSYxASPPPPpxx(f/y)Y and KLPFPPVYGVx(y/a)(a/y)SxPPPP(v/s)KPYN, terminate the extensin domain of AtLRX1 to 5, LeLRX, and OsLRX1 and 2, respectively (Fig. 2D; Table II). The remaining LRX proteins have less conserved C termini.
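A first-pass count of extensin motifs in a deduced protein sequence can be sketched as below. The assumptions are stated plainly: Ser followed by four or more Pro is used as one reading of the S(P)4-n motif (in deduced sequences Hyp appears as Pro), exact string matching is used for repeat units even though real repeats vary, and the test domain is a synthetic assembly of the SPPPPVH repeat and a C-terminal motif listed in Table II, not an actual LRX sequence. The function names are hypothetical.

```python
import re

# Ser followed by at least four Pro; an assumed reading of the S(P)4-n motif.
SER_PRO_RUN = re.compile(r"SP{4,}")


def count_extensin_motifs(domain):
    """Number of non-overlapping Ser-Pro(4+) runs in the domain."""
    return len(SER_PRO_RUN.findall(domain))


def count_repeat(domain, repeat):
    """Non-overlapping exact occurrences of a fixed repeat unit."""
    return domain.count(repeat)


# Synthetic domain: three SPPPPVH units plus one C-terminal motif from Table II.
toy_domain = "SPPPPVH" * 3 + "IIPPFIGHQYASPPPPMFQGY"

n_motifs = count_extensin_motifs(toy_domain)
n_repeats = count_repeat(toy_domain, "SPPPPVH")
print(n_motifs, n_repeats)
```

Because insertions, deletions, and substitutions within repeats are frequent, exact matching of this kind only approximates what a manual screen or a motif discovery tool such as MEME would find.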

Table II.

Extensin motifs and repeats

Protein | Total Length [a] | Specific Motifs [b] | No. of Repeats
AtLRX1 | 367 | SPPPPssKMSPsVRay | 4
 |  | SPPPPYVYS | 6
 |  | SPPPPspvYYppvt(p/q) | 10
 |  | PTSYFPPMPSVSYDASPPPPPSYY | 1 (C terminus)
AtLRX2 | 407 | SPPPPssKMSPsfrat | 6
 |  | SPPPPspYIYS | 2
 |  | SPPPpppvYYppvtq | 10
 |  | YEDTPLPPIRGVSYASPPPPSIPYY | 1 (C terminus)
AtLRX3 | 372 | SPPPPpPPPPppVy | 7
 |  | SPPPpvYhy | 9
 |  | YEGPLPPVIGVSYASPPPPPFY | 1 (C terminus)
AtLRX4 | 98 | SPPpppV(h/y) | 4
 |  | FEGPLPPVIGVSYASPPPPPFY | 1 (C terminus)
AtLRX5 | 458 | SPpTTPsPGGSPPS | 6
 |  | YEGPLPPIPGISYASPPPPPFY | 1 (C terminus)
AtLRX6 | 96 | SP29YVY | 1
 |  | No other repeat |
AtLRX7 | 60 | No repeat [two S(P)n motifs] |
LeLRX | 310 | KP | 4
 |  | SyEHPktp | 7
 |  | FYENIPLPPVIGVSYASPPPPVIPYY | 1 (C terminus)
OsLRX1 | 314 | TpSYP | 4
 |  | KLPFPPVYGVSYASPPPPVKPYN | 1 (C terminus)
OsLRX2 | 176 | SPSS | 3
 |  | KLPFPPVyGVAYSSPPPP | 2
 |  | KLPFPPVYGVAYSSPPPPSKPYN | 1 (C terminus)
OsLRX3 | 93 | No repeats |
OsLRX4 | 148 | WPP(v/i)(h/g)VPYGSPPPPPLH | 2
OsLRX5 | 178 | PyYEVSPEDRYL | 2
 |  | SPPPPPAY | 2
AtPEX1 | 559 | PK(q/p)E | 17
 |  | DPYDASPxxxRR | 1 (Similar to AtPEX2)
 |  | PKqEtPKPEESPKPQP | 3
 |  | SPPPPVh | 24
 |  | ILPPNIGHQYASPPPPMFPGY | 1 (C terminus)
AtPEX2 | 463 | PK (11) KP(4) Q(8) | 1 (Similar to AtPEX1)
 |  | DPYDASPxxxRR | 12
 |  | SPPPpv(hyf) | 13
 |  | Q(av)pt | 1 (C terminus)
 |  | VLPPHIGFQYASPPPPMFQGY |
AtPEX3 | 342 | PVxKPQPPKESPQPxDPYxQSPVxxRR | 2 (Similar to AtPEX4)
 |  | SPPPPV(h/y) | 23
 |  | IIPPFIGHQYASPPPPMFQGY | 1 (C terminus)
AtPEX4 | 287 | PVhkPsPVptT | 5
 |  | PVxKPQPPKESPQPxDPYxQSPVxxRR | 1 (Similar to AtPEX3)
 |  | SPPPpxxxV(h/y) | 17
 |  | IIPPFIGHQYASPPPPMFAGY | 1 (C terminus)
LePEX | 322 | PK | 6
 |  | V(h/aSPPPP) | 32
 |  | ALPPTLGSLYASPPPPIFQGY | 1 (C terminus)
OsPEX1 | 983 | SPPTPESKA | 3
 |  | KSPPSHTPESSSPPSkESEPPPTPTPKSSPPSHEEYVPPSPAKSTPP | 2
 |  | vKSpPPpAPVilppp | 11
 |  | VLLPPVMAHQYASPPPPQFQGY | 1 (C terminus)
OsPEX2 | 378 | vksPPPPAPvSSPPPP | 14
 |  | ILPPILSHSYASPPPPQFEGY | 1 (C terminus)
OsPEX3 | 106 | SPPPPpppAPV | 2
 |  | ILPPILSAKYQSPPPPFFEGY | 1 (C terminus)
ZmPEX1 | 782 | vKsppPpapvaSPPPPvKssPPapVSSPPptpksspppapvsspppp | 10
 |  | ILPPIMANKYASPPPPQFQGY | 1 (C terminus)
ZmPEX2 | 907 | vKsPPpPapvaspppp | 8
 |  | SPPPTPKSSPPLAPVSSPPQVEKTSPPPAPVS | 3
 |  | vKssPPpapvSsPPp(t/a/p)pkssppapvsspp | 13
 |  | ILPPIMANKYASPPPPLFQGY | 1 (C terminus)

[a] The total length of the extensin domain is given in amino acids.

[b] Upper and lower case letters indicate 100% or more than 50% conservation between all the repeats, respectively. Lower case letters in parentheses indicate amino acids that are individually present in less than 50% of the repeats but collectively make more than 50% of the residues at this position. Lower case x represents any amino acid or a gap.

The Expression of the Arabidopsis LRX Genes Is Tissue Specific

The identification of several corresponding EST sequences for most of the Arabidopsis LRX genes indicated that they are expressed during normal plant development, with the notable exception of AtLRX2 and AtLRX7, for which no ESTs were found (Table I). We characterized the expression pattern of each AtLRX gene by northern hybridization of total RNA isolated from various organs (Fig. 3). Transcripts of AtPEX1 to 4, which share the highest similarity with the ZmPEX genes and form two pairs of paralogs, were almost exclusively detected in flowers. Transcripts of all AtPEX genes are readily detected in mature anthers, pollen, and pollinated carpels, and AtPEX1 may be expressed at a very low level in unpollinated carpels (Fig. 3). AtLRX2 to 6, which are phylogenetically more related to AtLRX1 (Baumberger et al., 2001), display a more diverse expression pattern. AtLRX2 (N. Baumberger, M. Steiner, U. Ryser, B. Keller, and C. Ringli, unpublished data) and AtLRX6 are specifically expressed in roots, whereas AtLRX5 mRNA is mainly found in flowers, young leaves, and roots and is present in low amounts in the other organs. The two paralogous genes, AtLRX3 and AtLRX4, are both expressed in all organs, with the difference that AtLRX4 has a higher expression in roots and young leaves than AtLRX3.

Figure 3.


Northern-blot analysis of the Arabidopsis LRX transcript levels. Total RNA was extracted from roots (Rt), young developing leaves and cotyledons (yL), rosette mature leaves (rL), cauline leaves (cL), floral stems (St), flower buds and opened flowers (Fl), stamens (S), carpels (Ca), pollen (Po), and pollinated carpels (PCa). Roots and young developing leaves were harvested from 14-d-old Columbia seedlings grown vertically on solidified Murashige and Skoog medium. All the other material was harvested from 35- to 40-d-old Columbia plants. Carpels were harvested from unopened flowers with young stamens to reduce pollen contamination. For northern analysis, 5 μg (2 μg for pollen) of total RNA was hybridized with 32P-labeled gene-specific probes amplified by PCR from genomic DNA. Ribosomal RNAs were used as loading control (lower). No signal was obtained with the LRX7 probe and the result is not shown. AtLRX2 data will be presented elsewhere (N. Baumberger, M. Steiner, U. Ryser, B. Keller, and C. Ringli, unpublished data).

To assess more precisely the spatial and temporal regulation of AtLRX gene expression, the promoter region of each LRX gene was fused to the bacterial β-glucuronidase (GUS) reporter gene (uidA) and transformed into plants. The activity of the uidA reporter gene was tested in at least five independent transgenic T2 lines. pAtPEX2, pAtPEX3, and pAtPEX4::uidA plants showed a clear staining of the mature pollen grains and, upon pollen germination, of the elongating pollen tubes (Fig. 4, A and B). No other floral tissue showed any GUS activity in these transgenic plants. These results confirm the flower-specific signal obtained by northern hybridization and suggest that these genes constitute orthologs of the ZmPEX1, ZmPEX2, and LePEX genes that were also shown to be expressed in pollen (Rubinstein et al., 1995a; Stratford et al., 2001). pAtLRX6::uidA expression was observed in secondary roots emerging from the primary root (Fig. 4C). GUS staining at the apex was still visible during elongation of the lateral roots once they had broken through the cortex of the primary roots but gradually faded with age (data not shown). In older roots (>7 d post-germination), a moderate level of expression was detected in the central cylinder of primary and secondary roots (Fig. 4C). pAtLRX5::uidA showed a very similar pattern of expression in emerging lateral roots at a very early stage of dedifferentiation of the pericycle cells (Fig. 4D). However, in contrast to pAtLRX6::uidA, a high GUS activity was detected in very young emerging leaves and stipules at the center of the pAtLRX5::uidA plant rosette (Fig. 4E). A more diffuse staining persisted later in the expanding leaves, mostly in the petiole (Fig. 4E). During flower development, pAtLRX5::uidA expression was restricted to the carpels, the stamen filament, and the abscission zone of the floral whorls (Fig. 4F). Consistent with the northern data, pAtLRX3::uidA and pAtLRX4::uidA plants showed the same pattern of expression. 
In these plants, GUS activity was concentrated in the root vascular tissues (Fig. 4G) and in the veins of the cotyledons, the developing leaves, and the sepals (Fig. 4, H and I). A moderate level of expression was also observed in the rest of the leaf and sepal tissues. The pAtLRX7::uidA construct induced a strong GUS activity in pollen grains, whereas no signal could be detected in flowers by northern hybridization (data not shown). However, we could amplify AtLRX7 transcripts by RACE-PCR using flower total RNA as template, demonstrating that this gene is expressed at least in flowers, and possibly in other tissues. Therefore, it is possible that a low stability of the AtLRX7 transcript prevents its detection by northern hybridization.

Figure 4.


Histochemical localization of LRX expression in LRX promoter::uidA transgenic plants. A and B, pPEX3::uidA expression. C, pLRX6::uidA expression. D through F, pLRX5::uidA expression. G through I, pLRX4::uidA expression. A, Open flower. Mature pollen grains show a strong expression of pPEX3::uidA. B, Pollen germinating on a stigma. C, Emerging lateral roots, showing strong expression of pLRX6::uidA in the lateral root meristem, as well as in the vascular tissue of the mature primary root. D, Root of a 7-d-old seedling; GUS activity is restricted to the emerging lateral roots. E, Fourteen-day-old seedlings; pLRX5::uidA is expressed in the leaf primordia and stipules (inset image) at the base of the rosette leaves. A moderate activity persisted during further expansion of the leaves, preferentially in petioles. F, Opened flowers and flower buds. GUS activity is detected in the carpels and faintly in the pedicels. Carpel expression seems higher in the style directly under the stigma and at the base of the organ and the junction with the receptacle (abscission zone). G, Primary mature roots; the vascular bundle is strongly stained. H, Fourteen-day-old seedlings; the inset picture shows a leaf stained for a shorter time to reveal the stronger expression in vascular tissue. I, Open flower; GUS activity is detected in sepals, mostly in the veins. Bars = 1 mm (A, F, and I), 50 μm (B–D), 2 mm (E and H), 350 μm (E, inset image), and 200 μm (G).

Monocot and Eudicot LRX Genes Cluster in Two Mixed Phylogenetic Clades

To study the phylogeny of the LRX proteins, the protein sequences were aligned, and trees were generated by the maximum parsimony method. Because of uncertain alignment, the signal peptide, the hypervariable region between the predicted α-helices in the amino terminus, as well as the extensin domain except the last 24 residues of the carboxy terminus, were removed before the phylogenetic analysis. A phylogenetic analysis excluding the conserved C termini of the extensin domain gave a very similar tree topology (data not shown). The aligned regions were composed of 389 amino acid positions. Of these 389 positions, 254 were parsimony informative for the amino acid-based analysis. Seven parsimony-informative gap characters were included in all analyses. Two most parsimonious gene trees of 1,721 steps were found in all 1,000 replicates. The ensemble consistency index (Kluge and Farris, 1969) of these trees was 0.65 (excluding uninformative characters) and the ensemble retention index (Farris, 1989) was 0.64. The strict consensus tree (Schuh and Polhemus, 1980; Sokal and Rohlf, 1981) with jackknife support values mapped is presented in Figure 5. In the jackknife tree, OsPEX3 was resolved as the sister group of the clade that includes ZmPEX1 and OsPEX1 with weak (54%) support.
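For readers less familiar with the two fit statistics quoted above, the following minimal Python sketch shows how an ensemble consistency index and an ensemble retention index are conventionally computed from per-character step counts. The function name and the toy values are hypothetical and are not taken from the LRX data set; phylogenetic packages report these indices directly.

```python
def ensemble_indices(chars):
    """Compute (CI, RI) from per-character step counts.

    chars: list of (m, s, g) tuples, one per character, where
      m = minimum number of steps the character can show on any tree,
      s = number of steps observed on the tree being evaluated,
      g = maximum number of steps the character can show on any tree.
    """
    m_sum = sum(m for m, s, g in chars)
    s_sum = sum(s for m, s, g in chars)
    g_sum = sum(g for m, s, g in chars)
    if s_sum == 0 or g_sum == m_sum:
        raise ValueError("indices are undefined for these step counts")
    ci = m_sum / s_sum                      # ensemble consistency index
    ri = (g_sum - s_sum) / (g_sum - m_sum)  # ensemble retention index
    return ci, ri


# Hypothetical toy matrix of three characters (not the LRX alignment).
toy_chars = [(1, 2, 4), (2, 3, 5), (1, 1, 3)]
ci, ri = ensemble_indices(toy_chars)
print(f"CI = {ci:.2f}, RI = {ri:.2f}")
```

Both indices approach 1 as homoplasy decreases, so the values of 0.65 and 0.64 reported here indicate a moderate amount of homoplasy in the amino acid matrix.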

Figure 5.

Phylogenetic analysis of the LRX family. The strict consensus of the two most parsimonious amino acid based phylogenetic trees is reproduced, and the jackknife support values mapped above the branches. The number of steps was 1,721, the ensemble consistency index of the most parsimonious trees was 0.65 (excluding uninformative characters), and the ensemble retention index was 0.64. Monocot and dicot LRX proteins cluster in a vegetatively expressed LRX and a reproductive-expressed LRX clade.

A phylogenetic analysis was also performed on 778 nucleotide characters corresponding to the first and second codon positions. Among those characters, 442 were parsimony informative. One most parsimonious gene tree of 2,157 steps was found in all 1,000 replicates with an ensemble consistency index of 0.45 (excluding uninformative characters) and an ensemble retention index of 0.58. This tree topology was nearly identical to that inferred by using amino acid characters and only differed with respect to branches that were weakly supported in one or both trees (see supplemental data at www.plantphysiol.org).
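The 778 nucleotide characters correspond to two positions for each of the 389 aligned amino acid positions (389 × 2 = 778). As an illustration only, the sketch below shows one way to extract first and second codon positions from an in-frame coding sequence; the function name and the toy sequence are hypothetical, and the alignment used in this study was assembled manually.

```python
def first_second_positions(cds):
    """Return the nucleotides at codon positions 1 and 2 of an in-frame sequence.

    Gap symbols in an aligned sequence are treated like any other character.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(cds[i:i + 2] for i in range(0, len(cds), 3))


toy_cds = "ATGGCTCCA"  # hypothetical codons: ATG GCT CCA
reduced = first_second_positions(toy_cds)
print(reduced)
```

Dropping the third position removes the characters most likely to be saturated by synonymous substitutions, which is the usual motivation for this kind of reduced nucleotide matrix.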

The phylogenetic analysis shows that the LRX sequences fall in two distinct clades, which comprise both eudicot (Arabidopsis and tomato) and monocot LRXs (rice and maize; Fig. 5). Remarkably, this classification of the LRX genes into two clades almost perfectly overlaps with the LRX expression patterns: the first clade, which clusters the pollen-expressed AtPEX1 to 4, also contains ZmPEX1, ZmPEX2, and LePEX, which were all shown to be specifically expressed in pollen (this manuscript; Rubinstein et al., 1995a; Stratford et al., 2001). The second phylogenetic clade contains all the Arabidopsis LRX genes expressed in vegetative tissues and AtLRX7, for which the expression profile remains unclear. The expression of the OsLRX genes, based on ESTs isolated from different tissues, indicates that this classification also holds true for rice. For OsPEX1 to 3, ESTs were isolated from flowering panicles. ESTs of OsLRX1, 2, and 5 were isolated from roots, shoots, and whole seedlings (Table I) and, thus, fit in the clade of genes expressed in vegetative tissue. For OsLRX3 and 4, a final conclusion cannot be drawn because no or only one EST from flowering panicles, respectively, was isolated. A more detailed analysis will be necessary to conclusively classify these two genes.

DISCUSSION

The LRX phylogeny suggests that the origin and initial expansion of the LRX family predate the divergence of monocots and eudicots 160 to 240 million years ago (Wolfe et al., 1989; Goremykin et al., 1997; Soltis et al., 2002). The first step in the diversification of the LRX gene family possibly occurred by the duplication of a unique ancestral gene resulting in the two clades observed in modern angiosperms. Eudicot and monocot LRX sequences separate within each of the clades derived from this gene duplication, indicating that the LRX family expanded and diversified after the divergence between the two lineages. Although the absence of LRX candidate gene sequences from more ancient taxa in current databases prevents us from reliably rooting the phylogenetic tree, the evolutionary scenario presented above is the most probable because it is the most parsimonious. Placing the origin of the family anywhere else than between the two postulated reproductive and vegetative clades would imply multiple gene extinction events in each angiosperm lineage. The expression of the LRX genes of Arabidopsis, maize, and tomato reveals that they can be classified as vegetatively expressed or pollen-expressed genes, two categories that almost completely overlap with the phylogenetic clades. The AtLRX7 gene seems to be an exception: It is expressed in pollen although it belongs to the vegetative LRX phylogenetic clade. However, its pollen-specific expression was only demonstrated by the promoter::uidA fusion and no mRNA was detectable by northern analysis, suggesting that AtLRX7 mRNA may be unstable. The expression profiles of the OsLRX genes, deduced from EST database mining, so far corroborate the suggested classification. ESTs corresponding to OsLRX1, 2, and 5 of the vegetative clade were isolated from different vegetative tissues and those of OsPEX1 to 3 from flowering panicles. In two cases, EST data are either absent (OsLRX3) or possibly incomplete (OsLRX4). Thus, a final conclusion on the classification of the whole rice LRX gene family will require a more detailed analysis of the gene expression profiles.

The existence of “vegetative” and “reproductive” LRX clades suggests that the two ancestral LRX genes had acquired their tissue specificity before the monocot-eudicot divergence. Such a clear specialization between reproductive and vegetative isovariants was also observed in the actin and profilin gene families (McDowell et al., 1996; Meagher et al., 1999; Kandasamy et al., 2002). It was proposed that the initial separation between vegetative and reproductive actins was contemporary with and perhaps contingent upon the invention of new developmental pathways involved in the formation of vegetative organs from reproductive ones (Meagher et al., 1999). Similar to actins (Ringli et al., 2002), LRXs are possibly involved in cell morphogenesis, as demonstrated for AtLRX1 (Baumberger et al., 2001). Thus, the same evolutionary mechanisms might have driven the specialization of LRX genes into reproductive and vegetative forms. However, in contrast to the LRX family, the vegetative and reproductive actin clades expanded before the radiation of the angiosperms, as indicated by the existence of subclasses comprising monocot and dicot sequences within each clade (Meagher et al., 1999).

The localization of the most closely related AtLRX genes on duplicated chromosomal regions indicates that the last expansion of the LRX family in Arabidopsis occurred during whole or partial genome duplication events, followed, in some cases, by selective gene loss (Blanc et al., 2000; Grant et al., 2000; Ku et al., 2000; Vision et al., 2000). Three, and possibly four, rounds of duplications generated five vegetative subclasses with distinct expression patterns, and two rounds of duplications created two reproductive subclasses with identical expression patterns. The conservation of a very similar expression pattern in each copy of the paired genes suggests that the functional specialization of these genes happened before their duplication. According to the estimation of Vision et al. (2000) and the synteny observed between tomato and Arabidopsis (Ku et al., 2000), most of the duplications in the Arabidopsis genome occurred shortly after the divergence between asterids (tomato) and rosids (Arabidopsis) 112 to 156 million years ago. It is surprising that genes duplicated so long ago have maintained the same expression pattern, whereas the estimated half-life of a duplicated gene copy is only 3.2 million years before it is silenced (Lynch and Conery, 2000). This might happen if the duplicated LRX genes have acquired useful new functions (neofunctionalization) or if two copies are necessary to achieve a sufficient level of expression (sub-functionalization; Force et al., 1999). Complementation of the Arabidopsis lrx1 mutant with different LRX genes and isolation of other lrx mutants should help to resolve this issue.
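To make the contrast between the age of the duplications and the quoted half-life concrete, a rough calculation can be made under the simplifying assumption (ours, for illustration only) that duplicate loss follows a simple exponential decay with a half-life of 3.2 million years (My):

```latex
\begin{align*}
P(t) &= 2^{-t/t_{1/2}}, \qquad t_{1/2} = 3.2\ \text{My} \\
P(112\ \text{My}) &= 2^{-112/3.2} = 2^{-35} \approx 2.9 \times 10^{-11} \\
P(156\ \text{My}) &= 2^{-156/3.2} = 2^{-48.75} \approx 2.1 \times 10^{-15}
\end{align*}
```

Under this assumption, the chance that a given duplicate pair persists by neutral drift alone over such a period is negligible, which is why retention of these pairs points to some form of selective maintenance.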

Interestingly, the seven vegetative Arabidopsis LRX genes form five subclasses with distinct expression patterns. This mirrors the number of rice genes in the vegetative clade. Similarly, the two pairs of Arabidopsis PEX paralogs correspond to two rice PEX genes (the phylogenetic position of the third one, OsPEX3, is not well resolved). Because the LRX genes have evolved independently in monocots and eudicots, this might indicate that a minimum of six to seven differently regulated LRXs is required for the development of modern angiosperms. It will be interesting to learn whether evolution has led to the formation of the same vegetative expression patterns in different LRX subclasses in monocots and in eudicots.

The high variability of the extensin domain throughout the LRX family is in strong contrast to the high conservation of the LRR domain and raises the question of whether or not different extensin domains have been recruited multiple times by a single LRR protein. However, the presence of similar C termini in different LRXs, both within the same species and between species, rather suggests that the formation of a chimeric LRR extensin by domain shuffling probably took place before the first gene duplication. Except for the C terminus, which is particularly conserved between monocot and dicot PEX genes, similarities to some extent are only observed between the extensin domains of genes belonging to the same expression subclass (AtLRX1/AtLRX2, AtLRX3/AtLRX4, and AtPEX1–4) or between putative orthologs (ZmPEXs and OsPEX1/OsPEX2). It is possible that the variability of the extensin domains reflects an adaptation to particular cell walls. For instance, the composition of the pollen tube cell walls is clearly different from the composition of other cell types, being rich in callose and poor in cellulose (Taylor and Hepler, 1997, and refs. therein). Primary cell walls also differ by a number of structural and biochemical specificities between Poaceae (type II cell walls) and other angiosperms (type I cell walls; Carpita and Gibeaut, 1993). One possible consequence of the difference in the primary cell wall might be the presence of the amino acid motif YxY, important for intramolecular cross-linking, exclusively in extensin domains of dicot LRXs. In general, it seems that the extensin domains are under a relatively mild selection pressure, and only some critical structural motifs are conserved. The insertion/deletion of repeats and the insertion of single amino acids have been shown to conserve conformation and function of some Pro-rich proteins (Rabanal et al., 1993; Schmidt et al., 1994). 
It is likely that replication slippage and unequal crossing over occur more frequently in the highly repetitive G/C-rich sequences encoding the extensin-like domain than in the LRR-encoding sequence. These two mechanisms could generate the observed variability in the extensin domain by sequence extension and repeat shuffling. The higher similarities between genes belonging to the same expression subclass then would rather be a consequence of a more recent divergence than the sign of a functional specialization to a specific type of cell wall. Experiments involving complementation of the lrx1 mutant (Baumberger et al., 2001) by chimeric AtLRX constructs containing different extensin domains will address this question. It is interesting to note that none of the LRXs contains the motif VYK, thought to be important for intermolecular cross-linking (Schnabelrauch et al., 1996). Because protein immobilization in the cell wall was demonstrated for AtLRX1 and ZmPEX1 (Baumberger et al., 2001; Rubinstein et al., 1995b), it is likely that this process is mediated by other motifs than those so far identified in extensins.

LRRs frequently participate in protein-protein interactions, and crystallographic studies indicate that they collectively form an open horseshoe structure (Kobe and Deisenhofer, 1994, 1995; Papageorgiou et al., 1997). In each LRR, a region of the consensus sequence (xxLxLxx) is predicted to form a β-strand/β-turn structure in which the variable residues (x) are exposed to the solvent and determine the specificity of the interaction. This region of the LRR is subjected to diversifying selection in plant resistance genes (McDowell et al., 1998; Meyers et al., 1998; Ellis et al., 1999; Van der Hoorn et al., 2001). The observation that among different LRXs, the solvent-exposed amino acids are well conserved or even identical (AtLRX3/4 and AtLRX6/7) suggests that the LRX proteins interact with very similar or identical ligands. However, we observed that overexpression of the LRR domain of AtLRX1, despite its ability to sequester the ligand of the native AtLRX1 protein and induce a dominant negative effect in root hairs, had no influence in the rest of the plant where other LRXs are active (Baumberger et al., 2001). This result indicates that the flanking regions of the LRRs might contribute to the specificity of the interaction with the ligand as observed for the LRRs of the tomato and flax (Linum usitatissimum) resistance genes Cf-4, L6, and L7 (Ellis et al., 1999; Van der Hoorn et al., 2001). The existence of family- or paralog/ortholog-specific regions adjacent to the conserved region of the N-termini of the LRX proteins supports this notion.
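As a rough illustration of how the solvent-exposed positions of the xxLxLxx segment can be pulled out of a protein sequence for comparison between paralogs, the following Python sketch scans for the consensus with a regular expression. The pattern, the function name, and the toy sequence are hypothetical simplifications: only Leu is accepted at the two conserved positions, whereas real LRRs may carry other aliphatic residues there, and this is not the procedure used in the present analysis.

```python
import re

# Overlapping scan for the xxLxLxx segment (a lookahead allows overlaps).
# Assumption: only Leu is accepted at the two conserved positions.
LRR_CORE = re.compile(r"(?=(..L.L..))")


def exposed_residues(protein):
    """Return (start, segment, variable_residues) for each xxLxLxx match."""
    hits = []
    for match in LRR_CORE.finditer(protein):
        segment = match.group(1)
        # Positions 1, 2, 4, 6, and 7 of the segment are the variable "x" residues.
        variable = segment[0:2] + segment[3] + segment[5:7]
        hits.append((match.start(), segment, variable))
    return hits


toy_protein = "MKNLTLSGNAAPELDLSNN"  # hypothetical sequence, not an LRX protein
lrr_hits = exposed_residues(toy_protein)
for start, segment, variable in lrr_hits:
    print(start, segment, variable)
```

Comparing the extracted variable residues repeat by repeat between two proteins gives a simple view of how similar their putative ligand-binding surfaces are.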

Mutations in the Arabidopsis LRX1 gene result in root hairs that swell, branch, or abort, suggesting that AtLRX1 is involved in tip growth, an extreme case of polarized growth restricted to root hairs and pollen tubes (Baumberger et al., 2001). Therefore, it is particularly intriguing to note that at least four Arabidopsis LRX genes are expressed in mature and possibly in germinating pollen as demonstrated for the ZmPEX1 gene (Rubinstein et al., 1995a, 1995b). Thus, the primary function of LRX proteins might be to control cell polarization or to locally regulate cell wall expansion during tip growth. For example, the PEX proteins may regulate the polarized growth of pollen tubes, possibly in response to a signal from the pistil. However, because LRX genes are also expressed in cells that expand by diffuse growth, the LRX proteins might have multiple functions depending on the cellular context. Instead of acting during cell expansion, LRXs might function in the modification of localized cell wall domains during the differentiation process. The preferential expression of AtLRX3 and AtLRX4 in vascular tissue, for instance, might be related to a role of LRX proteins in the maturation of xylem vessels, a process that requires the elimination of the terminal cell walls of the xylem elements. The functional analysis of several Arabidopsis LRX genes currently undertaken in our laboratories should provide additional insight into the function of these intriguing proteins.

MATERIALS AND METHODS

Plant Material and Growth Conditions

Arabidopsis ecotype Columbia was used in all expression studies. Plants were grown either in soil or on one-half-strength Murashige and Skoog medium, supplemented with 2% (w/v) Suc and 0.6% (w/v) Phytagel (Sigma, Buchs, Switzerland), under continuous light, and at 24°C. Transgenic seedlings were selected on one-half-strength Murashige and Skoog plates with 0.8% (w/v) Phytoagar (Invitrogen, Basel) supplemented with 50 μg mL⁻¹ kanamycin. All analyses using transgenic lines were performed in the T2 or T3 generation.

Database Search and Gene Annotation

Arabidopsis and rice (Oryza sativa) LRX sequences were retrieved by tBLASTn searches of the Arabidopsis genome sequence database at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) and the SYD database at the Syngenta Torrey Mesa Research Institute (http://www.tmri.org/index.html), respectively. The preliminary search was performed with the amino acid sequence of AtLRX1 with the exclusion of the extensin domain (AY026364). Retrieved sequences scoring better than a cutoff E value of 1 × 10⁻¹⁰ were translated in the three frames to visually check for the presence of a putative extensin-like domain downstream of the region of homology with LRX1. LRX candidates were all recovered with E values far more significant than 1 × 10⁻²⁰, indicating that the stringency of the search criterion was appropriate. Selected candidates were then used as query in a second round tBLASTn search that did not identify new candidates.
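The three-frame check described above was done visually. Purely as an illustration, the Python sketch below translates a nucleotide sequence in its three forward frames and flags frames that contain repeated Ser-Pro runs. The motif definition (Ser followed by at least three Pro), the threshold of two repeats, and all names are assumptions made for this sketch and are not criteria stated in this study; hits on the reverse strand would first need to be reverse complemented.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: an extensin-like repeat is approximated as Ser + >= 3 Pro.
EXTENSIN_LIKE = re.compile(r"SP{3,}")


def translate_three_frames(dna):
    """Translate the three forward reading frames; unknown codons become 'X'."""
    dna = dna.upper()
    frames = []
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        frames.append("".join(CODON_TABLE.get(codon, "X") for codon in codons))
    return frames


def extensin_like_frames(dna, min_repeats=2):
    """Return the frame indices (0-2) carrying at least min_repeats motifs."""
    return [
        frame
        for frame, protein in enumerate(translate_three_frames(dna))
        if len(EXTENSIN_LIKE.findall(protein)) >= min_repeats
    ]


# Hypothetical toy sequence encoding SPPPP twice in frame 0.
toy_dna = "TCTCCACCTCCGCCA" * 2 + "G"
print(translate_three_frames(toy_dna))
print(extensin_like_frames(toy_dna))
```

A screen of this kind could only pre-sort candidates; the domain boundaries would still have to be confirmed by inspection, as was done here.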

Gene predictions were visually verified using the AtLRX1 and ZmPEX1 (Z34465) protein sequences as predictive models. Ambiguities were solved by comparison of the genomic sequences with EST or cDNA sequences, if available. For rice sequences, which have a lower accuracy than the Arabidopsis sequences, the ORF sequences were also validated by comparison with the genomic sequences of the rice subsp. indica (http://btn.genomics.org.cn/rice). In case of doubt, corrections were made in favor of the highest homology with AtLRX1 and ZmPEX1 protein sequences.

A 466-bp fragment containing the predicted AtLRX7 intron was amplified by PCR (using the oligonucleotide pair 5′-ccattgtaggcccgactccatcgtc-3′ and 5′-gagtgtattggcgttggtggaggtg-3′), subcloned into pGEMT-Easy according to the manufacturer's protocol (Promega, Madison, WI), and sequenced (Macromolecular Resources, Colorado State University, Fort Collins).

Sequence Alignment and Phylogenetic Analysis

Amino acid sequences were aligned by use of the default alignment parameters in ClustalX (Thompson et al., 1997) and then manually adjusted by use of the alignment criterion presented by Zurawski and Clegg (1987), in which gaps are considered as characters and the number of evolutionary events is minimized. Regions of individual sequences that remained ambiguously aligned with this criterion were coded as uncertain (“?”) for the phylogenetic analysis. The corresponding alignment of DNA sequences was created manually. The analysis with the motif-detecting algorithm MEME was performed at http://bioweb.pasteur.fr/seqanal/motif/meme/meme.html.

Aligned sequences were input into MacClade (Maddison and Maddison, 1992) for phylogenetic analysis. Sequence gaps that were unambiguously aligned were scored as additional characters by use of “complex indel coding” (Simmons and Ochoterena, 2000). Parsimony-based tree searches were performed using PAUP* (Swofford, 1998). Two tree searches were conducted: using amino acid characters and using nucleotide characters from first and second codon positions. Each tree search was performed using 1,000 replicates with equal character weighting, random taxon addition, and tree bisection reconnection, which swapped to completion for every search. Relative levels of branch support were determined using jackknife support (Farris, 1989). One thousand replicates were performed with 10 tree bisection reconnection searches held per replicate. Jackknife support values were mapped onto the most parsimonious tree or the strict consensus of the most parsimonious trees, respectively.

Outgroup rooting was attempted using the Leu-rich region from LePRK1 of tomato (Lycopersicon esculentum; Muschietti et al., 1998) and CLV1 of Arabidopsis (Clark et al., 1997). LRRs are comprised of a series of 23 to 25 residue repeats, of which there are 9.5 in the LRX gene family, six in LePRK1, and 21 in CLV1. Unfortunately, which repeats from the outgroup sequences aligned with which repeats from the ingroup sequences could not be determined. Therefore, duplicate gene rooting (Gogarten et al., 1989; Iwabe et al., 1989; Donoghue and Mathews, 1998) was used to root the gene tree between the vegetative- and reproductive-expressed paralogs.

The maize (Zea mays) and tomato LRX sequences used in the analysis are recorded in the databases under the following accession numbers: ZmPEX1, Z34465; ZmPEX2, AF159297; and LePEX1, AF159296.

Northern Analysis and RACE-PCR

RNA was extracted from various organs with the Trizol reagent (Invitrogen) following the manufacturer's instructions. Pollen was harvested and ruptured essentially as described (Huang et al., 1997). Five micrograms of total RNA (2 μg for pollen) was separated by agarose gel electrophoresis, transferred onto a GenescreenPlus nylon membrane (Invitrogen), and hybridized with ³²P-labeled probes specific to each LRX gene. The fragments used as probes covered the sequence coding for the LRR domain and were amplified from Arabidopsis genomic DNA by PCR, cloned into pGEM-T easy vectors (Promega), and sequenced. The actual transcript end of the AtLRX6 and AtLRX7 genes was determined by 3′-RACE-PCR using the GeneRacer Kit (Invitrogen) and gene-specific primers. The resulting PCR products were cloned and sequenced. The specificity of the probes used for the northern experiment was verified by Southern hybridization. Each probe gave a different pattern, indicating that they are gene specific.

Constructs and Plant Transformation

For each LRX promoter::uidA fusion construct (pLRX::uidA), 1.5 kb of the promoter region was amplified by PCR from genomic DNA, sequenced, and fused to the bacterial uidA gene in the vector pGPTV-KAN (Becker et al., 1992). The T-DNA constructs were transformed into Agrobacterium tumefaciens GV3101, and plant transformation was performed following the floral dip method described by Clough and Bent (1998). Transgenic plants were selected on Murashige and Skoog agar plates containing kanamycin, transferred to soil, and allowed to set seed. T2 and T3 generations were analyzed for GUS activity.

GUS Histochemical Analysis

Histochemical staining for GUS activity was performed by incubation in 100 μg mL⁻¹ 5-bromo-4-chloro-3-indolyl glucuronide in 50 mM Na-phosphate buffer (pH 6.8), 10 mM EDTA, 0.5 mM K₃Fe(CN)₆, 0.5 mM K₄Fe(CN)₆, and 0.1% (v/v) Triton X-100 at 37°C for 3 to 16 h. The material was then fixed and the pigments removed by incubation in 70% (v/v) ethanol for several hours. Observations were made on a Leica stereomicroscope LZM125 and a Leica microscope Laborlux equipped with a Wild camera MPS52 (Leica, Glattbrugg, Switzerland).

ACKNOWLEDGMENT

We thank Martin Parniske for insight into the LRR sequence analysis.

Footnotes

1

This work was supported by the Swiss National Science Foundation (grant nos. 31–51055.97 and 41–6 419.00) and by the National Science Foundation (grant no. 0091976 to P.A.B.).

[w]

The online version of this article contains Web-only data. The supplemental material is available at www.plantphysiol.org.

Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.102.014928.

LITERATURE CITED

  1. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195. doi: 10.1126/science.287.5461.2185. [DOI] [PubMed] [Google Scholar]
  2. Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815. doi: 10.1038/35048692. [DOI] [PubMed] [Google Scholar]
  4. Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In: Altman R, Brutlog D, Karp P, Lathrop R, Searls D, editors. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology. Menlo Park, CA: American Association for Artificial Intelligence Press; 1994. pp. 28–36. [PubMed] [Google Scholar]
  5. Baumberger N, Ringli C, Keller B. The chimeric leucine-rich repeat/extensin cell wall protein LRX1 is required for root hair morphogenesis in Arabidopsis thaliana. Genes Dev. 2001;15:1128–1139. doi: 10.1101/gad.200201. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Becker D, Kemper E, Schell J, Masterson R. New plant binary vectors with selectable markers located proximal to the left T-DNA border. Plant Mol Biol. 1992;20:1195–1197. doi: 10.1007/BF00028908. [DOI] [PubMed] [Google Scholar]
  7. Blanc G, Barakat A, Guyot R, Cooke R, Delseny I. Extensive duplication and reshuffling in the Arabidopsisgenome. Plant Cell. 2000;12:1093–1101. doi: 10.1105/tpc.12.7.1093. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Carpita NC, Gibeaut DM. Structural models of primary cell walls in flowering plants: consistency of molecular structure with the physical properties of the walls during growth. Plant J. 1993;3:1–30. doi: 10.1111/j.1365-313x.1993.tb00007.x. [DOI] [PubMed] [Google Scholar]
  9. C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282:2012–2018. doi: 10.1126/science.282.5396.2012. [DOI] [PubMed] [Google Scholar]
  10. Clark SE, Williams RW, Meyerowitz EM. The CLAVATA1 gene encodes a putative receptor kinase that controls shoot and floral meristem size in Arabidopsis. Cell. 1997;89:575–585. doi: 10.1016/s0092-8674(00)80239-1. [DOI] [PubMed] [Google Scholar]
  11. Clough SJ, Bent AF. Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J. 1998;16:735–743. doi: 10.1046/j.1365-313x.1998.00343.x. [DOI] [PubMed] [Google Scholar]
  12. Donoghue MJ, Mathews S. Duplicate genes and the root of angiosperms, with an example using phytochrome sequences. Mol Phylogenet Evol. 1998;9:489–500. doi: 10.1006/mpev.1998.0511. [DOI] [PubMed] [Google Scholar]
  13. Ellis JG, Lawrence GJ, Luck JE, Dodds PN. Identification of regions in alleles of the flax rust resistance gene Lthat determine differences in gene-for-gene specificity. Plant Cell. 1999;11:495–506. doi: 10.1105/tpc.11.3.495. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Farris JS. The retention index and the rescaled consistency index. Cladistics. 1989;5:417–419. doi: 10.1111/j.1096-0031.1989.tb00573.x. [DOI] [PubMed] [Google Scholar]
  15. Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999;151:1531–1545. doi: 10.1093/genetics/151.4.1531. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Goff SA, Ricke D, Lan TH, Presting G, Wang RL, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H et al. A draft sequence of the rice genome (Oryza sativaL. ssp japonica) Science. 2002;296:92–100. doi: 10.1126/science.1068275. [DOI] [PubMed] [Google Scholar]
  17. Gogarten JP, Kibak H, Dittrich P, Taiz L, Bowman EJ, Bowman BJ, Manolson MF, Poole RJ, Date T, Oshima T et al. Evolution of the vacuolar H+-ATPase: implications for the origin of eukaryotes. Proc Natl Acad Sci USA. 1989;86:6661–6665. doi: 10.1073/pnas.86.17.6661. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Goremykin VV, Hansmann S, Martin WF. Evolutionary analysis of 58 proteins encoded in six completely sequenced chloroplast genomes: revised molecular estimates of two seed plant divergence times. Plant Syst Evol. 1997;206:337–351. [Google Scholar]
  19. Grant D, Cregan P, Shoemaker RC. Genome organization in dicots: genome duplication in Arabidopsis and synteny between soybean and Arabidopsis. Proc Natl Acad Sci USA. 2000;97:4168–4173. doi: 10.1073/pnas.070430597. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Hall Q, Cannon MC. The cell wall hydroxyproline-rich glycoprotein RSH is essential for normal embryo development in Arabidopsis. Plant Cell. 2002;14:1161–1172. doi: 10.1105/tpc.010477. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Huang S, An YQ, McDowell JM, McKinney EC, Meagher RB. The Arabidopsis ACT11actin gene is strongly expressed in tissues of the emerging inflorescence, pollen, and developing ovules. Plant Mol Biol. 1997;33:125–139. doi: 10.1023/a:1005741514764. [DOI] [PubMed] [Google Scholar]
  22. Iwabe N, Kuma K, Hasegawa M, Osawa S, Miyata T. Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc Natl Acad Sci USA. 1989;86:9355–9359. doi: 10.1073/pnas.86.23.9355. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Jinn TL, Stone JM, Walker JC. HAESA, an Arabidopsisleucine-rich repeat receptor kinase, controls floral organ abscission. Genes Dev. 2000;14:108–117. [PMC free article] [PubMed] [Google Scholar]
  24. Kajava AV. Structural diversity of leucine-rich repeat proteins. J Mol Biol. 1998;277:519–527. doi: 10.1006/jmbi.1998.1643. [DOI] [PubMed] [Google Scholar]
  25. Kandasamy MK, McKinney EC, Meagher RB. Plant profilin isovariants are distinctly regulated in vegetative and reproductive tissues. Cell Motil Cytoskelet. 2002;52:22–32. doi: 10.1002/cm.10029. [DOI] [PubMed] [Google Scholar]
  26. Kieliszewski MJ, Lamport DT. Extensin: repetitive motifs, functional sites, post-translational codes, and phylogeny. Plant J. 1994;5:157–172. doi: 10.1046/j.1365-313x.1994.05020157.x. [DOI] [PubMed] [Google Scholar]
  27. Kieliszewski MJ, Leykam JF, Lamport DTA. Structure of the threonine-rich extensin from Zea mays. Plant Physiol. 1990;92:316–326. doi: 10.1104/pp.92.2.316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Kluge AG, Farris JS. Quantitative phyletics and evolution of Anurans. Syst Zool. 1969;18:1–32. [Google Scholar]
  29. Kobe B, Deisenhofer J. The leucine-rich repeat: a versatile binding motif. Trends Biol Sci. 1994;19:415–421. doi: 10.1016/0968-0004(94)90090-6. [DOI] [PubMed] [Google Scholar]
  30. Kobe B, Deisenhofer J. A structural basis of the interactions between leucine-rich repeats and protein ligands. Nature. 1995;374:183–186. doi: 10.1038/374183a0.
  31. Ku HM, Vision T, Liu JP, Tanksley SD. Comparing sequenced segments of the tomato and Arabidopsis genomes: large-scale duplication followed by selective gene loss creates a network of synteny. Proc Natl Acad Sci USA. 2000;97:9121–9126. doi: 10.1073/pnas.160271297.
  32. Li JM, Chory J. A putative leucine-rich repeat receptor kinase involved in brassinosteroid signal transduction. Cell. 1997;90:929–938. doi: 10.1016/s0092-8674(00)80357-8.
  33. Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151–1155. doi: 10.1126/science.290.5494.1151.
  34. Maddison WP, Maddison DR. MacClade: Analysis of Phylogeny and Character Evolution. Sunderland, MA: Sinauer; 1992.
  35. McDowell JM, Dhandaydham M, Long TA, Aarts MGM, Goff SA, Holub EB, Dangl JL. Intragenic recombination and diversifying selection contribute to the evolution of downy mildew resistance at the RPP8 locus of Arabidopsis. Plant Cell. 1998;10:1861–1874. doi: 10.1105/tpc.10.11.1861.
  36. McDowell JM, Huang SR, McKinney EC, An HJ, Meagher RB. Structure and evolution of the actin gene family in Arabidopsis thaliana. Genetics. 1996;142:587–602. doi: 10.1093/genetics/142.2.587.
  37. Meagher RB, McKinney EC, Vitale AV. The evolution of new structures: clues from plant cytoskeletal genes. Trends Genet. 1999;15:278–284. doi: 10.1016/s0168-9525(99)01759-x.
  38. Meyers BC, Shen KA, Rohani P, Gaut BS, Michelmore RW. Receptor-like genes in the major resistance locus of lettuce are subject to divergent selection. Plant Cell. 1998;10:1833–1846. doi: 10.1105/tpc.10.11.1833.
  39. Muschietti J, Eyal Y, McCormick S. Pollen tube localization implies a role in pollen-pistil interactions for the tomato receptor-like protein kinases LePRK1 and LePRK2. Plant Cell. 1998;10:319–330. doi: 10.1105/tpc.10.3.319.
  40. Nielsen H, Engelbrecht J, Brunak S, von Heijne G. A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int J Neural Sys. 1997;8:581–599. doi: 10.1142/s0129065797000537.
  41. Papageorgiou AC, Shapiro R, Acharya KR. Molecular recognition of human angiogenin by placental ribonuclease inhibitor: an X-ray crystallographic study at 2.0 angstrom resolution. EMBO J. 1997;16:5162–5177. doi: 10.1093/emboj/16.17.5162.
  42. Parniske M, Hammond-Kosack KE, Golstein C, Thomas CM, Jones DA, Harrison K, Wulff BBH, Jones JDG. Novel disease resistance specificities result from sequence exchange between tandemly repeated genes at the Cf-4/9 locus of tomato. Cell. 1997;91:821–832. doi: 10.1016/s0092-8674(00)80470-5.
  43. Rabanal F, Ludevid MD, Pons M, Giralt E. CD of proline-rich polypeptides: application to the study of the repetitive domain of maize glutelin-2. Biopolymers. 1993;33:1019–1028. doi: 10.1002/bip.360330704.
  44. Ringli C, Baumberger N, Diet A, Frey B, Keller B. ACTIN2 is essential for bulge site selection and tip growth during root hair development of Arabidopsis. Plant Physiol. 2002;129:1464–1472. doi: 10.1104/pp.005777.
  45. Rubinstein AL, Broadwater AH, Lowrey KB, Bedinger PA. PEX1, a pollen-specific gene with an extensin-like domain. Proc Natl Acad Sci USA. 1995a;92:3086–3090. doi: 10.1073/pnas.92.8.3086.
  46. Rubinstein AL, Marquez J, Suarez Cervera M, Bedinger PA. Extensin-like glycoproteins in the maize pollen tube wall. Plant Cell. 1995b;7:2211–2225. doi: 10.1105/tpc.7.12.2211.
  47. Schmidt JS, Lindstrom JT, Vodkin LO. Genetic length polymorphisms create size variation in proline-rich proteins of the cell wall. Plant J. 1994;6:177–186. doi: 10.1046/j.1365-313x.1994.6020177.x.
  48. Schnabelrauch LS, Kieliszewski M, Upham BL, Alizedeh H, Lamport DT. Isolation of pI 4.6 extensin peroxidase from tomato cell suspension cultures and identification of Val-Tyr-Lys as putative intermolecular cross-link site. Plant J. 1996;9:477–489. doi: 10.1046/j.1365-313x.1996.09040477.x.
  49. Schuh RT, Polhemus JT. Analysis of taxonomic congruence among morphological, ecological, and biogeographic data sets for the Leptopodomorpha (Hemiptera). Syst Zool. 1980;29:1–26.
  50. Shiu SH, Bleecker AB. Receptor-like kinases from Arabidopsis form a monophyletic gene family related to animal receptor kinases. Proc Natl Acad Sci USA. 2001;98:10763–10768. doi: 10.1073/pnas.181141598.
  51. Simmons MP, Ochoterena H. Gaps as characters in sequence-based phylogenetic analyses. Syst Biol. 2000;49:369–381.
  52. Sokal RR, Rohlf FJ. Taxonomic congruence in the Leptopodomorpha reexamined. Syst Zool. 1981;30:309–325.
  53. Soltis PS, Soltis DE, Savolainen V, Crane PR, Barraclough TG. Rate heterogeneity among lineages of tracheophytes: integration of molecular and fossil data and evidence for molecular living fossils. Proc Natl Acad Sci USA. 2002;99:4430–4435. doi: 10.1073/pnas.032087199.
  54. Song WY, Wang GL, Chen LL, Kim HS, Pi LY, Holsten T, Gardner J, Wang B, Zhai WX, Zhu LH et al. A receptor kinase-like protein encoded by the rice disease resistance gene, Xa21. Science. 1995;270:1804–1806. doi: 10.1126/science.270.5243.1804.
  55. Sonnhammer ELL, Durbin R. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein-sequence analysis. Gene. 1995;167:1–10. doi: 10.1016/0378-1119(95)00714-8.
  56. Stratford S, Barnes W, Hohorst DL, Sagert JG, Cotter R, Golubiewski A, Showalter AM, McCormick S, Bedinger P. A leucine-rich repeat region is conserved in pollen extensin-like (Pex) proteins in monocots and dicots. Plant Mol Biol. 2001;46:43–56. doi: 10.1023/a:1010659425399.
  57. Swofford DL. PAUP*: Phylogenetic Analysis Using Parsimony (* and Other Methods). Sunderland, MA: Sinauer; 1998.
  58. Taylor LP, Hepler PK. Pollen germination and tube growth. Annu Rev Plant Physiol Plant Mol Biol. 1997;48:461–491. doi: 10.1146/annurev.arplant.48.1.461.
  59. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25:4876–4882. doi: 10.1093/nar/25.24.4876.
  60. Torii KU, Mitsukawa N, Oosumi T, Matsuura Y, Yokoyama R, Whittier RF, Komeda Y. The Arabidopsis ERECTA gene encodes a putative receptor protein kinase with extracellular leucine-rich repeats. Plant Cell. 1996;8:735–746. doi: 10.1105/tpc.8.4.735.
  61. Van der Hoorn RAL, Roth R, De Wit PJG. Identification of distinct specificity determinants in resistance protein Cf-4 allows construction of a Cf-9 mutant that confers recognition of avirulence protein AVR4. Plant Cell. 2001;13:273–285. doi: 10.1105/tpc.13.2.273.
  62. Vision TJ, Brown DG, Tanksley SD. The origins of genomic duplications in Arabidopsis. Science. 2000;290:2114–2117. doi: 10.1126/science.290.5499.2114.
  63. Wilson LG, Fry JC. Extensin: a major cell wall glycoprotein. Plant Cell Environ. 1986;9:239–260.
  64. Wolfe KH, Gouy ML, Yang YW, Sharp PM, Li WH. Date of the monocot dicot divergence estimated from chloroplast DNA-sequence data. Proc Natl Acad Sci USA. 1989;86:6201–6205. doi: 10.1073/pnas.86.16.6201.
  65. Wong EI. In vitro studies of signalling molecules involved in pollen tube growth. MS thesis. Amherst: University of Massachusetts; 2001.
  66. Yu J, Hu SN, Wang J, Wong GKS, Li SG, Liu B, Deng YJ, Dai L, Zhou Y, Zhang XQ et al. A draft sequence of the rice genome (Oryza sativa L. ssp indica). Science. 2002;296:79–92. doi: 10.1126/science.1068037.
  67. Zhou J, Rumeau D, Showalter AM. Isolation and characterization of two wound-regulated tomato extensin genes. Plant Mol Biol. 1992;20:5–17. doi: 10.1007/BF00029144.
  68. Zurawski G, Clegg MT. Evolution of higher-plant chloroplast DNA-encoded genes: implications for structure-function and phylogenetic studies. Annu Rev Plant Physiol Plant Mol Biol. 1987;38:391–418.

