Abstract
Knowledge of the origin and evolution of gene families is critical to our understanding of the evolution of protein function. To gain a detailed understanding of the evolution of the small heat shock proteins (sHSPs) in plants, we have examined the evolutionary history of the chloroplast (CP)-localized sHSPs. Previously, these nuclear-encoded CP proteins had been identified only from angiosperms. This study reveals the presence of the CP sHSPs in a moss, Funaria hygrometrica. Two clones for CP sHSPs were isolated from a F. hygrometrica heat shock cDNA library that represent two distinct CP sHSP genes. Our analysis of the CP sHSPs reveals unexpected evolutionary relationships and patterns of sequence conservation. Phylogenetic analysis of the CP sHSPs with other plant CP sHSPs and eukaryotic, archaeal, and bacterial sHSPs shows that the CP sHSPs are not closely related to the cyanobacterial sHSPs. Thus, they most likely evolved via gene duplication from a nuclear-encoded cytosolic sHSP and not via gene transfer from the CP endosymbiont. Previous sequence analysis had shown that all angiosperm CP sHSPs possess a methionine-rich region in the N-terminal domain. The primary sequence of this region is not highly conserved in the F. hygrometrica CP sHSPs. This lack of sequence conservation indicates that sometime in land plant evolution, after the divergence of mosses from the common ancestor of angiosperms but before the monocot–dicot divergence, there was a change in the selective constraints acting on the CP sHSPs.
To understand the evolution of protein function, it is important to know both the history of a protein family and the patterns of sequence evolution of that family. This information can provide an understanding of when the family originated, how its members are related to other proteins, the extent to which functional constraints are acting on the protein family, and whether these constraints have been consistent across taxa and over long periods of time. Although progress is being made in the study of the evolution of protein function in animals, our understanding of the evolution of protein function in plants has been limited by a number of factors. First, although the vast majority of plant proteins are nuclear-encoded, most plant molecular evolutionary studies have focused on chloroplast (CP)-encoded proteins (1). Furthermore, much of what is known about the evolution of plant nuclear-encoded proteins is limited to events of the last 150 million years because of the paucity of sequence data from non-angiosperm lineages. Clearly, additional studies of nuclear-encoded proteins are needed for a fuller understanding of plant protein evolution.
One very important aspect of the history of a protein family is the timing of origin of members within the family (2). The origin of nuclear-encoded organelle-localized proteins has been the subject of recent study (3–5), and two models of nuclear-encoded organelle-localized protein evolution have been proposed. In the first model, the well established “gene transfer” (or functional specificity) model (6), organelle genes are transferred from the endosymbiont genome to the nuclear genome. Because the products of the transferred gene are required for the organelle function, the transferred gene is maintained in the nuclear genome and acquires a transit sequence for proper trafficking of the protein back into the organelle. This model is well supported by numerous examples of nuclear-encoded CP protein genes that are closely related to genes of cyanobacterial proteins (for examples, see refs. 7–9). In the second model, “functional redundancy,” the organelle and nuclear proteins were functionally equivalent or redundant in the early eukaryote (4). Therefore, it is not necessary to retain the endosymbiont protein. Furthermore, it may not matter which form (nuclear or organelle) is used in the cytosol or sent into the organelles. Recent evidence suggests that the functional redundancy model accurately describes the evolution of at least some organelle-localized proteins (3, 4). However, it is difficult to make definitive statements about the origin and evolution of many plant organelle-localized proteins because of both a lack of data and a lack of detailed analysis of available data. Most studies of the origin of plant organelle proteins have examined sequences from only one major plant lineage, angiosperms. More importantly, many of these studies did not include any cyanobacterial sequences, which are critical to evaluation of hypotheses concerning the origin of nuclear-encoded CP proteins. 
A full understanding of the evolution of protein function in plants requires more detailed studies of nuclear-encoded proteins in a diversity of plant lineages.
Our studies of the small heat shock proteins (sHSPs) in plants focus on protein evolution with the goals of elucidating the history and patterns of sHSP evolution and uncovering the general patterns of protein family evolution (10, 11). The sHSPs are a diverse group of proteins found in archaea, bacteria, and eukaryotes and include the vertebrate lens α-crystallin proteins. All of these proteins share an ≈100-aa C-terminal “heat shock” domain (12, 13). The sHSPs form homooligomers both in vivo and in vitro (14–18) and facilitate the folding and reactivation of misfolded or denatured proteins (14, 16, 19). It is interesting that although most organisms have just one to a few cytosolically localized sHSPs, plants have many diverse sHSPs. There are at least five families of nuclear-encoded plant sHSPs, and they make up a major portion of the protein produced during high-temperature stress in plants (20). The plant sHSP families are clearly distinguishable from one another based on sequence analysis, and different sHSPs localize to different parts of the cell (10, 21). Two of the plant sHSP families are cytosolically localized (I and II), one is localized to the CPs, another to the mitochondria (MT), and the fifth localizes to the endoplasmic reticulum (ER).
There are many unanswered questions concerning the origin of the plant sHSP families and the nature of the selective forces acting on them. For instance, it has previously been suggested that the CP sHSPs may not have originated via gene transfer to the nucleus from the cyanobacterial endosymbiont, but rather from the duplication of the nuclear-encoded cytosolic sHSPs (10, 22). However, this hypothesis has not been explicitly tested. If the CP sHSPs did evolve from the cytosolic sHSPs, then at some point in plant evolution, there must have been differential selection for structure and/or function among the sHSP families, because the angiosperm sHSP families are clearly distinct and display very different patterns and rates of sequence evolution (10). The angiosperm CP sHSPs possess a highly conserved methionine-rich region in the N-terminal domain that is not found in any other plant or nonplant sHSP. This region has a methionine content of at least 20% and is predicted to form an amphipathic α-helix (23). The origin, evolutionary history, and functional significance of this region are unknown, but it clearly illustrates the differing selective constraints acting on the plant sHSP families.
To address questions concerning the evolution of the sHSPs, we have chosen to study the sHSPs of Funaria hygrometrica, a moss. Mosses are one of the most basal land plant lineages, and fossil evidence suggests that moss divergence occurred at least 400–450 million years ago (24, 25). Land plants are monophyletic (24, 26); thus, mosses share a common ancestor with angiosperms. Inclusion of mosses in evolutionary studies can provide needed depth for understanding protein evolution in plants. To define the origin of the CP sHSPs and the selective forces acting on these proteins, we have examined the relationship of the F. hygrometrica sHSPs to angiosperm homologs, the relationship of the plant CP sHSPs to cyanobacterial sHSPs, and the patterns of sequence evolution among CP sHSPs.
The analyses suggest that the CP sHSPs most likely evolved through gene duplication from a nuclear-encoded cytosolic sHSP and not by gene transfer from the endosymbiont. In addition, whereas the methionine-rich region seen in angiosperm CP sHSPs is not conserved at the primary sequence level in F. hygrometrica, the α-helical structure conserved among the angiosperm CP sHSPs is present in the F. hygrometrica CP sHSPs. These changing patterns of sequence evolution may reflect shifts during land plant evolution in the selective constraints acting on the CP sHSPs and the functions of the CP sHSPs. The data suggest that the CP sHSPs fit neither the functional redundancy nor the functional specificity models of organelle protein evolution. Thus, there may be considerable diversity in the origin and evolution of organelle-localized proteins.
Materials and Methods
F. hygrometrica chloronema cells were grown in culture and used to construct a heat shock cDNA library. The details of the library construction and screening process have been published (11). After putative sHSP cDNA phage clones were identified, they were in vivo excised to yield pBluescript plasmids (Stratagene). The expression patterns and length of the transcripts of heat-induced clones were analyzed with Northern blots (see ref. 11). Plasmid DNA was purified and sequenced by using an ABI 377 DNA sequencer at the University of Arizona DNA Sequencing Facility. All clones were sequenced completely in both directions. One clone, HSP B, was a partial clone, and a 5′ rapid amplification of cDNA ends (RACE) (27) system (GIBCO/BRL) was used to obtain the complete cDNA sequence of this gene. Poly(A) RNA used in the RACE procedure was identical to that used to construct the cDNA library.
The DNA sequences of the HSP clones were compared with known sequences in the gene databases by using BLAST (28). Further analysis of the sequences was performed by using GCG (Genetics Computer Group) Version 9.1. Pairwise estimates of sequence similarity and identity were obtained by using the program GAP with the BLOSUM62 matrix. The alignment was constructed by using CLUSTAL X (29) with a gap penalty of 20 and a gap extension penalty of 0.25 and was further refined by hand. For the entire CP sHSP-coding region, secondary-structure predictions and probabilities were obtained by using the PredictProtein program (30–32). Helical wheel (GCG Version 9.1) was used to generate secondary-structure predictions of the α-helical region identified with the PredictProtein program.
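As a rough illustration of what the pairwise identity and similarity figures measure, the following Python sketch computes both from two sequences that are already aligned. It is not the GCG program used in this study: the function name, the conservative-substitution groups (a simple stand-in for positive scores in a substitution matrix), and the toy sequences are all assumptions made for illustration, and other programs may use a different denominator.

```python
# Hypothetical conservative-substitution groups (an assumption, not BLOSUM62 scores).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                  set("NQ"), set("ST"), set("AG")]

def pairwise_identity_similarity(seq_a, seq_b, gap="-"):
    """Return (percent identity, percent similarity) over ungapped aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared

# Invented toy alignment, for illustration only.
identity, similarity = pairwise_identity_similarity("MKLV-DE", "MRLIADQ")
print(f"identity={identity:.1f}% similarity={similarity:.1f}%")
```

Because identical residues are counted as similar, similarity is always at least as high as identity, which is the pattern seen in the values reported for the CP sHSPs.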
Phylogenetic analysis was restricted to an examination of the sHSP C-terminal domain comprising ≈100 aa. The N-terminal domain of the sHSPs is highly variable and cannot be meaningfully aligned across kingdoms. Sequences of the C-terminal domains of 78 sHSPs were aligned by using the methods described above. The length of this region varies among the sHSPs from 100 to over 125 aa. The insertion of gaps in the alignment results in a matrix of 130 characters. The alignment is available as supplementary material on the PNAS web site (www.pnas.org). Phylogenetic analysis was performed by using both distance and parsimony methods. The distance analyses were conducted by using PHYLIP Version 3.572 (33). Amino acid distance matrices were calculated with PROTDIST using the PAM distance matrix. Trees were generated by neighbor-joining in the program nj. Parsimony analysis of the amino acid alignment was conducted by using PAUP* Version 4.0d64 (D. Swofford, personal communication). In the parsimony analysis, searches were conducted with the heuristic algorithm with 500 random addition replicates with TBR (tree bisection-reconnection). Alternative topologies were evaluated in MacClade Version 3.04 (34). Support for the branches in the distance and parsimony trees was assessed by using bootstrap analysis. For each of the two methods of analysis used, 500 bootstrap replicates, with the same search parameters used in each type of sequence analysis, were performed. The results of the distance and parsimony analyses were congruent. Because sequences from bacteria, archaea, and eukaryotes were used in the analysis, the choice of an outgroup was problematic. The analysis was performed with (both archaea and bacteria) and without outgroups. The placement of the plant sHSP sequences within the trees did not change with different methods of analysis. 
Bacterial sequences were used to root the tree presented here because most current analyses indicate a closer relationship between archaea and eukaryotes (35).
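The bootstrap procedure described above can be sketched in a few lines of Python. This is only an illustration of the resampling step, not the PHYLIP workflow used here: the function names and the three toy sequences are invented, a simple proportion of differing sites (p-distance) stands in for the PAM-based distances, and the tree-building and consensus steps are omitted.

```python
import random

def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over columns with no gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original width."""
    names = list(alignment)
    width = len(alignment[names[0]])
    columns = [rng.randrange(width) for _ in range(width)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}

def distance_matrix(alignment):
    """Pairwise p-distances, keyed by alphabetically ordered name pairs."""
    names = sorted(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for i, x in enumerate(names) for y in names[i + 1:]}

# Invented toy alignment, for illustration only.
toy = {"seqA": "MKLVDE", "seqB": "MKLIDE", "seqC": "MRAIDQ"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [distance_matrix(bootstrap_alignment(toy, rng)) for _ in range(500)]
print(len(replicates), "bootstrap distance matrices")
```

In a full analysis, a tree would be built from each replicate matrix, and the bootstrap value of a branch would be the percentage of replicate trees that contain it.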
Results
Isolation of Two Heat-Induced cDNAs Encoding CP sHSPs.
Heat-induced cDNA clones having inserts of <1,500 bp were identified and considered good candidates for sHSP cDNAs. Two clones of over 400 sequenced (11) were found to share significant sequence similarity to angiosperm CP sHSPs (Fig. 1). Northern blot analysis (data not shown) confirmed that the two clones presented here encode heat-induced mRNAs that are not detected in control F. hygrometrica mRNA. Similar expression patterns were reported for F. hygrometrica cytosolic sHSP clones isolated from the same heat-shock cDNA library (11). One clone, HSP 22, encodes a complete 244-aa ORF for a 27.6-kDa protein (AF197942). The second cDNA clone, HSP 21, represented a partial coding region, and RACE was used to obtain the sequence of the complete transcript (AF197941). This second gene contains a reading frame of 230 aa for a 25.9-kDa protein. Pairwise comparisons indicate that the two genes represent distinct CP sHSPs; in the coding regions, the DNA sequences are ≈56% identical, whereas the deduced amino acid sequences are 61% similar and 52% identical.
Sequence Conservation Patterns of the F. hygrometrica CP sHSPs Differ from Those Seen Among Angiosperm CP sHSPs.
The F. hygrometrica HSP 21 and HSP 22 proteins have an obvious N-terminal CP transit sequence that is typical of nuclear-encoded CP proteins and that is required for transport of the protein into the CP. Transit sequences are cleaved on entering the CP and are not part of the mature functional protein. The names of the F. hygrometrica proteins, HSP 22 and HSP 21, reflect the calculated size of the proteins without the transit sequences. The conservation of the amino acids A and Q (at residues 51 and 52) in the alignment is noteworthy. The cleavage of the transit peptide for Pisum sativum HSP 21 occurs just before amino acid Q (18). This glutamine residue is conserved in all known CP sHSPs, and it was suggested (18) that this is the site of the transit peptide cleavage for all CP sHSPs. All subsequent analysis of the CP sHSPs was conducted with the transit sequences removed.
The two mature F. hygrometrica CP sHSPs share 72% similarity and 61% identity. In pairwise comparisons of the F. hygrometrica CP sHSPs with both monocot (Triticum aestivum and Zea mays) and dicot (Arabidopsis thaliana and P. sativum) CP sHSPs, amino acid identity was >42% and similarity was >55%. The highest identity and similarity were between HSP 21 and A. thaliana HSP 21, at 49.4% and 66.0%, respectively. Most of the residues conserved among the angiosperm CP sHSPs are also conserved in the F. hygrometrica CP sHSPs (see consensus alignment in Fig. 1). Consensus regions I and II are conserved among all plant sHSPs (10, 20), and region I is conserved among bacterial and eukaryotic sHSPs and among vertebrate lens α-crystallin proteins (12, 13). In the alignment of the angiosperm and F. hygrometrica CP sHSPs, the same 12 of 26 residues conserved among angiosperm CP sHSPs in region I are conserved among the angiosperm and F. hygrometrica CP sHSPs (Fig. 1). A similar pattern is seen in region II, where 14 of 25 residues are conserved among the angiosperm CP sHSPs and 11 residues are conserved among the angiosperm CP sHSPs and HSP 22 and HSP 21. A full alignment of >70 sHSPs is provided in the supplementary material at www.pnas.org.
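Counts such as "12 of 26 residues conserved" can be obtained by scanning the columns of an alignment. The Python sketch below shows one way to do this, assuming that "conserved" means strictly identical in every sequence compared; the criterion actually used for the consensus in Fig. 1 may differ, and the function name and short sequences are invented for illustration.

```python
def count_conserved(aligned_seqs, start, end):
    """Count columns in [start, end) where all sequences share one non-gap residue."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    conserved = 0
    for i in range(start, end):
        column = {seq[i] for seq in aligned_seqs}
        if len(column) == 1 and "-" not in column:
            conserved += 1
    return conserved

# Invented sequences standing in for a consensus region, for illustration only.
angiosperm_like = ["GMRAMLD", "GMRAMLD", "GMKAMLD"]
moss_like = ["GLRASLD"]

within = count_conserved(angiosperm_like, 0, 7)
across = count_conserved(angiosperm_like + moss_like, 0, 7)
print(f"conserved within group: {within} of 7; with added lineage: {across} of 7")
```

Adding sequences can only keep the count the same or lower it, so the informative comparison is how much each region loses when the moss sequences are added.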
The pattern of sequence conservation in consensus region III, the methionine-rich region, in the N-terminal domain is quite different from that seen in the C-terminal domain. When only angiosperm CP sHSPs are compared, 22 of 26 residues in this region are conserved. However, when the F. hygrometrica CP sHSPs are compared with angiosperm CP sHSPs, only 10 of 26 residues are conserved (Fig. 1). Only one of the four methionine residues conserved among the angiosperm CP sHSPs is present in the F. hygrometrica CP sHSPs (Fig. 1). Despite the limited sequence conservation in this region, there is obvious conservation of secondary structure. Secondary-structure predictions (30–32) for this region in the F. hygrometrica CP sHSPs reveal, with high probability (9 on a scale of 1–9 with 9 as the highest possible probability), the presence of the amphipathic helix seen in the angiosperm CP sHSPs (Fig. 2).
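One common way to quantify how amphipathic a helix is, complementary to a helical-wheel plot, is the mean hydrophobic moment: each residue's hydropathy is treated as a vector pointing outward from the helix axis, with successive residues rotated by 100°. The Python sketch below is not the method used in this study; it assumes an ideal α-helix, uses the Kyte-Doolittle hydropathy scale as one possible choice, and tests two artificial peptides rather than any sHSP sequence.

```python
import math

# Kyte-Doolittle hydropathy values (one possible scale; an assumption here).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

def hydrophobic_moment(seq, degrees_per_residue=100.0):
    """Mean hydrophobic moment of a peptide modeled as an ideal helix."""
    if not seq:
        raise ValueError("empty sequence")
    delta = math.radians(degrees_per_residue)
    sin_sum = sum(KD[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(KD[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)

# Artificial 18-residue peptides (18 residues x 100 degrees = five full turns).
delta = math.radians(100.0)
uniform = "L" * 18  # same residue on every face of the helix
amphipathic = "".join("L" if math.cos(i * delta) >= 0 else "K" for i in range(18))

print(f"uniform:     {hydrophobic_moment(uniform):.3f}")
print(f"amphipathic: {hydrophobic_moment(amphipathic):.3f}")
```

The uniform peptide gives a moment near zero because its hydropathy vectors cancel around the helix, whereas the peptide with leucine on one face and lysine on the other gives a large moment. Two sequences with little primary-sequence identity can therefore both score as amphipathic if their hydrophobic residues fall on the same face.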
HSP 22 and HSP 21 Are Members of the CP sHSP Family and Are Closely Related to Other Plant sHSP Families.
Phylogenetic analysis was carried out to determine the relationship of the F. hygrometrica CP sHSPs to the angiosperm CP sHSPs, and the relationship of the plant sHSP families to the sHSPs from other organisms (nonplant). The sequences used in this study included all of the known plant organelle sHSPs (CP, MT, and ER), representatives of the plant cytosolic I and II sHSP families, archaeal, bacterial, fungal, and animal sHSPs, as well as some vertebrate lens α-crystallin proteins. This analysis of the sHSPs includes sHSPs from the cyanobacteria Synechocystis sp. and Synechococcus vulcanus.
It is clear from the phylogenetic analysis that the two F. hygrometrica CP sHSPs are members of the previously identified higher plant CP sHSP family (Fig. 3). HSP 22 and HSP 21 are found in a basal position relative to the angiosperm CP sHSPs, most likely reflecting the orthologous (organismal) relationships of the CP sHSPs. The F. hygrometrica CP sHSPs are obviously paralogs of the cytosolic F. hygrometrica sHSPs and are not closely related to the cyanobacterial sHSPs. In addition, our analysis shows that the sHSP from Rickettsia prowazekii, the bacterium thought to be most closely related to the ancestor of mitochondria (36), is not closely related to the seed-plant MT sHSPs. Although the bootstrap values for some of the branches are not high, the topology of the tree is robust to different methods of analysis, with the neighbor-joining and parsimony trees having nearly identical topologies. In both methods of analysis, the cytosolic plant sHSPs are more closely related to the plant organelle sHSPs than they are to cytosolic sHSPs from other eukaryotes. The large evolutionary distances and the rapid rate of evolution of the sHSPs are most likely responsible for the low bootstrap values for some of the branches. Previous analysis demonstrated that selective constraint is higher on the secondary structure than on the primary sequence of the sHSPs (12). Many positions in the alignment are not conserved, but the amino acid replacements are conservative in relation to the charge and size of the amino acids. It is also important to note that the topology of the tree is in general agreement with known organismal relationships.
Placement of the CP sHSPs and the two cyanobacterial sHSPs in the phylogenetic tree shown in Fig. 3 indicates that the CP sHSPs are more closely related to cytosolic sHSPs than to cyanobacterial sHSPs. To examine the support for these relationships in more detail, we determined the cost (in numbers of additional steps) of moving the CP sHSP branch to the bacterial clade by using the program MacClade. Moving either just the CP sHSPs or both the CP sHSPs and the MT sHSPs resulted in a gain of over 30 steps on the shortest tree. In conclusion, the CP sHSPs are not closely related to the cyanobacterial sHSPs. The distant relationship of F. hygrometrica CP sHSPs to bacterial sHSPs is also seen in multiple-alignment and pairwise sequence comparisons of plant sHSPs with the sHSP from Synechocystis. There is only a low level of sequence conservation between Synechocystis HSP 16.6 and the land plant CP-sHSPs (Fig. 1). Pairwise sequence identity of the two F. hygrometrica CP sHSPs with Synechocystis HSP 16.6 is <23%, which is lower than the sequence identity these sequences share with other bacterial sHSPs. For instance, the F. hygrometrica sHSPs (HSP 21 and 22) share 27% and 26% sequence identity with the sHSP from Aquifex aeolicus and 29% and 30% sequence identity with HSP 16.5 from Methanococcus jannaschii. Limited conservation in regions I and II and the complete absence of the α-helical region (region III) in the cyanobacterial proteins also indicate the lack of a close relationship between the F. hygrometrica CP sHSPs and the cyanobacterial sHSPs.
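The "cost in additional steps" of an alternative topology is the difference in parsimony tree length between that topology and the shortest tree. The Python sketch below illustrates the idea with Fitch's algorithm for unordered characters on two small rooted trees. It is not the program used in this study: the taxon labels, the three-column alignment, and both topologies are invented, and the function names are hypothetical.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    `tree` is a nested tuple of leaf names; `states` maps leaf name -> state.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_steps = visit(left)
        right_set, right_steps = visit(right)
        common = left_set & right_set
        if common:  # children agree on at least one state: no extra change
            return common, left_steps + right_steps
        return left_set | right_set, left_steps + right_steps + 1
    return visit(tree)[1]

def tree_length(tree, alignment):
    """Sum Fitch step counts over all alignment columns."""
    width = len(next(iter(alignment.values())))
    return sum(fitch_steps(tree, {name: seq[i] for name, seq in alignment.items()})
               for i in range(width))

# Invented three-column alignment, for illustration only.
toy = {"animal": "SAK", "cyt": "TAK", "cp": "TAR", "bac": "SCK", "cyano": "SCR"}

# Topology A groups the CP sequence with the plant cytosolic sequence.
tree_a = (("animal", ("cyt", "cp")), ("bac", "cyano"))
# Topology B moves the CP sequence next to the cyanobacterial sequence.
tree_b = (("animal", "cyt"), ("bac", ("cyano", "cp")))

extra = tree_length(tree_b, toy) - tree_length(tree_a, toy)
print(f"length A={tree_length(tree_a, toy)}, length B={tree_length(tree_b, toy)}, "
      f"extra steps={extra}")
```

In this toy case the alternative placement costs one extra step, even though one column favors it. A cost of more than 30 steps, as found for the sHSP alignment, means that many more characters conflict with the alternative placement than support it.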
Discussion
The CP sHSPs Are Not Closely Related to the Cyanobacterial sHSPs.
Neither the gene-transfer (functional specificity) model nor the functional-redundancy model adequately describes the evolution of the CP sHSPs. From our analysis, it is clear that the CP sHSP genes in land plants did not originate from the CP endosymbiont. The phylogenetic analysis also shows that the endosymbiont sHSPs neither replaced nor were redundant with nuclear sHSPs in any eukaryotic lineage. That the endosymbiont (bacterial) sHSPs were not retained in eukaryotes indicates that these sHSPs were not easily exchangeable with the nuclear sHSPs. This result also suggests that the endosymbiont sHSPs were no longer required within the early eukaryotic cell.
The diversification of the sHSPs into organelle-localized forms appears to have occurred only within the plant lineage. The timing of this diversification must have preceded the divergence of the mosses, but the exact time is unknown. The finding that the plant MT sHSPs are not closely related to the sHSP from R. prowazekii suggests that the MT sHSPs, like the CP sHSPs, evolved within the plant lineage. Further evidence supporting this conclusion comes from a thorough analysis of the complete genomes of Saccharomyces cerevisiae and Caenorhabditis elegans that failed to uncover MT and ER sHSPs in these eukaryotes (E.R.W., unpublished data). The position of HSP 22, the only known algal sHSP from the chlorophyte Chlamydomonas reinhardtii, in the phylogenetic tree suggests that the diversification of the sHSPs occurred after the divergence of the most recent common ancestor of C. reinhardtii and F. hygrometrica but before the divergence of the common ancestor of mosses and seed plants. More thorough sampling from algal and bryophyte lineages will be necessary before we can determine the timing of the origin of the plant sHSP families.
Relationship of Protein Function to the Timing of Organelle Protein Evolution.
The results of this study indicate that one cannot necessarily predict the timing of organelle-protein evolution based on the functions of the protein families under study. The contrasting evolutionary histories of the sHSP and HSP70 families, both of which are molecular chaperones, illustrate this point. The phylogenetic relationships within the HSP70 family reflect the gene-transfer model of organelle evolution. The genes for the MT and CP HSP70s are more closely related to bacterial HSP70s than to eukaryotic cytosolic HSP70s, indicating that they were transferred from the endosymbionts to the nucleus (7). In comparing these two families of proteins, it is important to note that although the sHSPs, like the large HSPs, have been shown to have molecular chaperone properties, there are crucial differences between these families. These differences are important in interpreting their evolutionary histories. Although many HSP70s are heat-induced and are an important part of the heat shock response, HSP70s are also present in nonstressed cells (37) and are essential for normal cellular function. The expression patterns of the sHSPs are very different. In plants, sHSPs are heat-induced and in some cases developmentally regulated, but they are not known to be significant components of most nonstressed cells (20, 37). As a result, the selection pressures acting on the HSP70s and the sHSPs are quite different. The retention in the nuclear genomes of the organelle HSP70s most likely reflects the important role in protein folding that these proteins have under normal cellular conditions (37–40). The lack of a need for sHSPs under normal cellular conditions may have rendered the endosymbiont sHSPs unnecessary. There may have been no selective advantage in early eukaryotes for the transfer of the organelle sHSPs to the nucleus, a gain of a transit sequence, and the energy-costly import of sHSPs into the organelles. 
Later selection pressures, unique to the plant lineage, must have driven the diversification of the sHSPs leading not only to the CP sHSPs, but also to the MT, ER, and the two cytosolic plant sHSP families.
Evidence for a Shift in the Selective Constraints Acting on the CP sHSPs.
Analysis of sequence evolution across time and taxa can provide insight into the stability or variability of evolutionary constraints acting on proteins. Whereas the sHSP C-terminal domain exhibits consistent patterns of sequence evolution across great evolutionary distances, the N-terminal domain has a much more variable pattern of evolution. The C-terminal domain is important in the ability of sHSPs to form large oligomers (the native in vivo unit of the sHSPs) and in the ability of sHSPs to interact with misfolded substrates (14, 19). Secondary-structure predictions of the C-terminal domain from widely divergent sHSPs are very similar, even though the primary sequences are not highly conserved (12). The recently reported crystal structure of the C-terminal domain of HSP 16.5 from M. jannaschii (41) is in agreement with the earlier predicted secondary structures for many sHSPs and confirms that this region is essential to oligomer structure. Thus, there is a consistent pattern of constraint on the structure of this domain from bacteria and archaea to eukaryotes. In contrast, the sHSP N-terminal domain displays a more complex pattern of sequence evolution. It evolves quickly and thus cannot be reliably aligned across kingdoms. In addition, much less is known about the possible structure–function relationships of this domain. The N-terminal domain was disordered in the M. jannaschii HSP 16.5 crystals (41), the only sHSP for which a crystal structure is known. There is also no pattern of conserved secondary structure (based on structure-prediction methods) for this domain across diverse sHSPs. This observation suggests that there is considerable variation in the selective constraints acting on the C- and N-terminal domains.
Although, even among plant sHSPs, the N-terminal domain is quite variable across protein families (CP, MT, ER, and cytosolic I and II), individual plant sHSP families have specific conserved regions within this domain (10). The methionine-rich region unique to the CP sHSPs is the most striking example of conserved N-terminal regions, and among the angiosperms, this domain is even more conserved than regions I and II in the C terminus (10, 23). There is a surprising lack of conservation of the methionine-rich region in the F. hygrometrica CP sHSPs, the first available sequences in a non-flowering plant lineage. If the lack of primary sequence conservation of the methionine-rich region were primarily due to the large evolutionary distances encompassed within these comparisons (angiosperms to F. hygrometrica, ≈400–450 million years), we might expect to see a similar decrease in sequence conservation in the C-terminal consensus regions I and II. However, approximately the same number of residues are conserved in the C-terminal domain among angiosperm CP sHSPs as are conserved between angiosperm and F. hygrometrica CP sHSPs (Fig. 1). In contrast, whereas 22 of 26 residues are conserved in the N-terminal region among angiosperms, when the F. hygrometrica CP sHSPs are added to the comparisons, only 10 of 26 residues are conserved. The difference in conservation between the C-terminal and N-terminal domains is then not due to changes in substitution rate across the entire length of the protein, but to specific changes in substitution rate in the CP N-terminal region. Therefore, we can conclude that sometime in land-plant evolution, after the divergence of mosses but before the monocot–dicot split, there was a shift in the constraints acting on the CP sHSPs, resulting in conservation of methionine residues in this region among angiosperms.
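The comparison underlying this argument can be made explicit with a few lines of arithmetic. The Python sketch below uses only the residue counts reported in the Results; the "retained fraction" is a summary introduced here for illustration and is not a statistic used in the original analysis.

```python
# Counts reported in the text: (region length, residues conserved among angiosperm
# CP sHSPs, residues conserved among angiosperm plus F. hygrometrica CP sHSPs).
regions = {
    "I (C-terminal)": (26, 12, 12),
    "II (C-terminal)": (25, 14, 11),
    "III (N-terminal, methionine-rich)": (26, 22, 10),
}

def retained_fraction(conserved_angiosperm, conserved_with_moss):
    """Share of angiosperm-conserved positions still conserved once moss is added."""
    return conserved_with_moss / conserved_angiosperm

for name, (length, angio, with_moss) in regions.items():
    print(f"region {name}: {angio}/{length} -> {with_moss}/{length}, "
          f"retained {retained_fraction(angio, with_moss):.0%}")
```

Regions I and II retain all and about 79% of their angiosperm-conserved positions, respectively, whereas the methionine-rich region retains only about 45%. Uniform divergence over the same time span would be expected to erode all three regions to a similar degree.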
Conservation of the α-helix in the F. hygrometrica CP sHSP N-terminal domains is intriguing because of its implications for the evolution of CP sHSP function. Secondary structure can be critical for protein function. The role of the methionine residues and the α-helix in the CP sHSPs is not known, but it has previously been hypothesized that this region is important in substrate recognition (23). It has recently been suggested that the CP sHSPs may protect plants from oxidative stress, as well as heat stress, and that the methionine residues, which are easily oxidized, play an important role in this protection (42). Differences in primary- and secondary-structure conservation in the N-terminal domain could reflect functional differences between F. hygrometrica and angiosperm CP sHSPs. It is possible that angiosperm CP sHSPs have a more restricted function or a different substrate specificity than F. hygrometrica CP sHSPs. Thus, in angiosperms, an additional selection pressure is maintaining the methionine residues. Further study is needed to establish the functional significance of the methionine-rich region for the angiosperm CP sHSPs and functional differences among the CP sHSPs from angiosperms and other plant lineages. In addition, representatives of other land-plant and algal lineages should be examined to determine the time of origin of the CP sHSPs.
The biochemical evolution of CPs is not limited to the transfer of genes of bacterial origin to the nucleus but also includes gene duplication and the acquisition of CP function for nuclear-encoded cytosolic proteins. More recently acquired organelle proteins are also not necessarily just organelle forms of cytosolic proteins. The patterns of sequence evolution of the sHSPs indicate that there are unique selective forces acting on the CP sHSPs and that these forces have evolved within the land plant lineage. This result suggests that plant gene families that are highly conserved in angiosperms are shared with other plant lineages and may have evolved through intermediate forms. Additional studies of the evolution of plant gene families will be necessary to determine whether the patterns of evolution seen in the sHSPs reflect broader evolutionary processes.
Acknowledgments
L. Graham, S. Schneider, and two anonymous reviewers provided many useful comments on earlier versions of this manuscript. This research was partially supported by funds to E.W. from a National Science Foundation-funded Research Training Group in the Analysis of Biological Diversity at the University of Arizona. Additional funds were provided to E.W. from the Biology Department of Marquette University. Support to E.V. was from National Institutes of Health Grant R01 GM2762 and the U.S. Department of Agriculture National Research Initiative Competitive Grants Program.
Abbreviations
- CP: chloroplast
- MT: mitochondria
- sHSP: small heat shock protein
- ER: endoplasmic reticulum
References
- 1. Clegg M T, Cummings M P, Durbin M L. Proc Natl Acad Sci USA. 1997;94:7791–7798. doi: 10.1073/pnas.94.15.7791.
- 2. Iwabe N, Kuma K, Miyata T. Mol Biol Evol. 1996;13:483–493. doi: 10.1093/oxfordjournals.molbev.a025609.
- 3. Keeling P J, Doolittle W F. Proc Natl Acad Sci USA. 1997;94:1270–1275. doi: 10.1073/pnas.94.4.1270.
- 4. Martin W, Schnarrenberger C. Curr Genet. 1997;32:1–18. doi: 10.1007/s002940050241.
- 5. Roger A J, Svard S G, Tovar J, Clark C G, Smith M W, Gillin F D, Sogin M L. Proc Natl Acad Sci USA. 1998;95:229–234. doi: 10.1073/pnas.95.1.229.
- 6. Weeden N F. J Mol Evol. 1981;17:133–139. doi: 10.1007/BF01733906.
- 7. Boorstein W R, Ziegelhoffer T, Craig E A. J Mol Evol. 1994;38:1–17. doi: 10.1007/BF00175490.
- 8. Eisen J A. J Mol Evol. 1995;41:1105–1123. doi: 10.1007/BF00173192.
- 9. Martin W, Stoebe B, Goremykin V, Hansmann S, Hasegawa M, Kowallik K V. Nature (London) 1998;393:162–165. doi: 10.1038/30234.
- 10. Waters E R. Genetics. 1995;141:785–795. doi: 10.1093/genetics/141.2.785.
- 11. Waters E R, Vierling E. Mol Biol Evol. 1999;16:127–139. doi: 10.1093/oxfordjournals.molbev.a026033.
- 12. Caspers G-J, Leunissen J, de Jong W W. J Mol Evol. 1995;40:238–248. doi: 10.1007/BF00163229.
- 13. Plesofsky-Vig N, Vig J, Brambl R. J Mol Evol. 1992;35:537–545. doi: 10.1007/BF00160214.
- 14. Ehrnsperger M, Buchner J, Gaestel M. In: Molecular Chaperones in the Life Cycle of Proteins. Fink A L, Goto Y, editors. New York: Dekker; 1998. pp. 533–574.
- 15. Helm K W, Lee G J, Vierling E. Plant Physiol. 1997;114:1477–1485. doi: 10.1104/pp.114.4.1477.
- 16. Kim R, Kim K K, Yokota H, Kim S-H. Proc Natl Acad Sci USA. 1998;95:9129–9133. doi: 10.1073/pnas.95.16.9129.
- 17. Lee G J, Pokala N, Vierling E. J Biol Chem. 1995;270:10432–10438. doi: 10.1074/jbc.270.18.10432.
- 18. Suzuki T, Krawitz D, Vierling E. Plant Physiol. 1998;116:1151–1161. doi: 10.1104/pp.116.3.1151.
- 19. Lee G J, Roseman A M, Saibil H R, Vierling E. EMBO J. 1997;16:659–671. doi: 10.1093/emboj/16.3.659.
- 20. Vierling E. Annu Rev Plant Physiol Plant Mol Biol. 1991;42:579–620.
- 21. Waters E R, Lee G, Vierling E. J Exp Bot. 1996;47:325–338.
- 22. Vierling E, Nagao R T, DeRocher A E, Harris L M. EMBO J. 1988;7:575–581. doi: 10.1002/j.1460-2075.1988.tb02849.x.
- 23. Chen Q, Vierling E. Mol Gen Genet. 1991;226:425–431. doi: 10.1007/BF00260655.
- 24. Kenrick P, Crane P. Nature (London) 1997;389:33–39.
- 25. Kroken S B, Graham L E, Cook M E. Am J Bot. 1996;83:1241–1254.
- 26. Qiu Y-L, Palmer J D. Trends Plant Sci. 1999;4:26–30. doi: 10.1016/s1360-1385(98)01361-2.
- 27. Frohman M A, Dush M K, Martin G R. Proc Natl Acad Sci USA. 1988;85:8998–9002. doi: 10.1073/pnas.85.23.8998.
- 28. Altschul S F, Gish W, Myers E W, Lipman D J. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 29. Higgins D G, Thompson J D, Gibson T J. Methods Enzymol. 1996;266:383–402. doi: 10.1016/s0076-6879(96)66024-8.
- 30. Rost B, Sander C. J Mol Biol. 1993;232:584–599. doi: 10.1006/jmbi.1993.1413.
- 31. Rost B, Sander C. Proteins. 1994;19:55–72. doi: 10.1002/prot.340190108.
- 32. Rost B, Sander C, Schneider R. Comput Appl Biosci. 1994;10:53–60. doi: 10.1093/bioinformatics/10.1.53.
- 33. Felsenstein J. PHYLIP (Phylogeny Inference Package), Version 3.572. Seattle, WA: Univ. of Washington; 1993.
- 34. Maddison W P, Maddison D R. MacClade. Sunderland, MA: Sinauer; 1992.
- 35. Li W-H. Molecular Evolution. Sunderland, MA: Sinauer; 1997. pp. 167–173.
- 36. Andersson S G E, Zomorodipour A, Andersson J O, Sicheritz-Ponten T, Alsmark U C M, Podowski R M, Naslund A K, Eriksson A-S, Winkler H H, Kurland C G. Nature (London) 1998;396:133–140. doi: 10.1038/24094.
- 37. Boston R S, Viitanen P V, Vierling E. Plant Mol Biol. 1996;32:191–222. doi: 10.1007/BF00039383.
- 38. Craig E A, Weissman J S, Horwich A L. Cell. 1994;78:365–372. doi: 10.1016/0092-8674(94)90416-2.
- 39. Hartl F U. Nature (London) 1996;381:571–580. doi: 10.1038/381571a0.
- 40. Rassow J, Ahsen O V, Bomer U, Pfanner N. Trends Cell Biol. 1997;7:129–133. doi: 10.1016/S0962-8924(96)10056-8.
- 41. Kim K K, Kim R, Kim S-H. Nature (London) 1998;394:595–599. doi: 10.1038/29106.
- 42. Harndahl U, Buffoni Hall R, Osteryoung K, Vierling E, Sundby C. Cell Stress Chap. 1999; in press.