Abstract
The elucidation of ribosomal structure has shown that the function of ribosomes is fundamentally confined to dynamic interactions established between the RNA components of the ribosomal ensemble. These findings now enable a detailed analysis of the evolution of ribosomal RNA (rRNA) structure. The origin and diversification of rRNA was studied here using phylogenetic tools directly at the structural level. A rooted universal tree was reconstructed from the combined secondary structures of large (LSU) and small (SSU) subunit rRNA using cladistic methods and considerations in statistical mechanics. The evolution of the complete repertoire of structural ribosomal characters was formally traced lineage-by-lineage in the tree, showing a tendency towards molecular simplification and a homogeneous reduction of ribosomal structural change with time. Character tracing revealed patterns of evolution in inter-subunit bridge contacts and tRNA-binding sites that were consistent with the proposed coupling of tRNA translocation and subunit movement. These patterns support the concerted evolution of tRNA-binding sites in the two subunits and the ancestral nature and common origin of certain structural ribosomal features, such as the peptidyl (P) site, the functional relay of the penultimate stem helix of SSU rRNA, and other structures participating in ribosomal dynamics. Overall results provide a rare insight into the evolution of ribosomal structure.
INTRODUCTION
Ribosomes are macromolecular factories that synthesize proteins from amino acids with remarkable speed and accuracy, their function being fundamentally confined to ribosomal RNA (rRNA) (1). Crystal structures of the 30S and 50S ribosomal subunits have recently been determined at atomic resolution (2–5), and the complete structure of the 70S ribosome (at 5.5 Å) was acquired in the presence of transcript molecules and cognate transfer RNA (tRNA) bound to aminoacyl (A), peptidyl (P) and exit (E) sites (6). These and other studies (7) clarify the many functional interactions established within the rRNA core. Dynamic bridges between rRNA, tRNA and peripheral proteins ratchet the tRNA molecules through the center of the ribosome, orienting subunits and facilitating their movement in the process of peptide elongation (6). Crystallography has also confirmed secondary structural models predicted during the past two decades by the study of compensating substitutions in sequence alignments (8,9) and has helped place the evolutionarily conserved and functional ribosomal regions in the center of the molecule (10). The careful identification of functional components in rRNA now provides a new perspective on translation (11) and enables an evolutionary tracing of these structures at the molecular level.
Phylogenies are branching histories of inheritance and useful tools for testing evolutionary hypotheses. They are usually described by trees, graphical and mathematical representations (containing branches and reticulations) that depict how recent common ancestry is (12). Phylogenetic trees trace the genealogical relationships of molecules and organisms at different taxonomical levels. They have been particularly useful for the comparative analysis of nucleic acid and protein sequences in subjects as diverse as molecular epidemiology, the study of genome organization and evolution, and the origin of life (13,14). In recent years there has been continued recognition that higher order structure is fundamental to establishing important structure–function relationships in biological macromolecules (15). The increasing acquisition of structural information and the inception of structural genomics (16) have reinforced this view. Many have studied the evolutionary diversification of structure at rather large structural scales (17). Evolutionary relationships have also been inferred directly from the structure of nucleic acid (18–21) and protein molecules (22–25), using formal approaches of phylogenetic analysis. These methods compare molecules using secondary structure representations or spatial backbone distributions. This contrasts with approaches of positional covariance in comparative sequence analysis that focus on inferring structure from phylogenetic diversification (8,9). I recently introduced a formal approach that recovers phylogenetic signal from the structure of nucleic acid molecules and can be applied to the study of organisms of highly divergent lineage (20,21). Phylogenetic relationships are inferred on the basis of shared and derived characteristics in RNA structure, using a cladistic approach and considerations in statistical mechanics (see Fig. 1). 
Structural features that describe molecular components or entire molecules are treated as linearly ordered multi-state characters that are polarized by fixing the direction of evolutionary transformation towards molecular order. The assumption that RNA molecules are optimized by a process that increases favorable and decreases non-favorable inter- and intra-molecular interactions is supported by analytical models that reconstruct the structural repertoire of evolving RNA sequences from energetic and kinetic perspectives (26–28), the study of extant and randomized sequences (29–32), experimental verification of a tendency towards stability using thermodynamic principles generalized to account for non-equilibrium conditions (33,34), correlation between thermal stability and the occurrence of structural motifs in natural nucleic acids (35), and concordance between phylogenies inferred from structure, sequence and accepted classification (21). The approach ultimately reconstructs inherently rooted phylogenetic trees directly from evolved structure, allowing the evolutionary study of important functional components of RNA molecules at levels other than sequence. I reconstruct here a universal tree from the combined secondary structures of large (LSU) and small (SSU) subunit rRNA, trace structural characters in the trees, and study the origin and diversification of ribosomal structure.
Figure 1.
Phylogenetic analysis of molecular structure. The structures from extant RNA molecules can be organized hierarchically in nested sets using cladistic principles. This involves three intimately related processes that are illustrated here by the analysis of archaeal 5S rRNA. (A) Discovery and coding of characters. RNA sequences are folded using kinetic, genetic or energy minimization algorithms, or according to a structural model (64). The resulting structures can be characterized by molecular attributes such as nucleotide length of distinguishing features, thermodynamic properties such as minimum Gibbs free energy increments (ΔG°), and statistical parameters that describe the shape, stability and uniqueness of folded conformations (28,65). In cladistics, attributes are known as ‘characters’ and the numerical values (and frequency distributions of values) they display constitute ‘character states’ (8). 5S rRNA structures from archaeal species (exemplified by Desulfurococcus mobilis) were characterized by two different sets of characters, one describing quantitatively the statistical properties of molecules and the other describing their molecular shape. Shannon entropy of the base-pairing probability matrix (Q) and base-pairing propensity (P) are normalized to sequence length. Q measures the number of conflicting interactions (frustration) that occur during folding (perfectly defined structure: Q = 0; no preferred structure: Q = 1). Characters need to be appropriately coded (i.e. converted into a discrete alphanumerical format) to be cladistically useful. For example, statistical character states are continuous variables that need to be gap-recoded into discrete data to provide maximum phylogenetic signal [e.g. Q values range from 0.0235 (D.mobilis; coded as 1) to 0.1930 (Haloferax volcanii; coded as C)]. Finally, useful characters are only those that can be compared in different structures and are homologous (i.e. share common ancestry). 
Homology is tested by topographic correspondence (e.g. identifying the same arrangement of stem tracts S1, S2 and S3 in all archaeal molecules; coaxial stems are highlighted) and empirically by phylogenetic reconstruction (21). (B) Character transformation and evolution. The structural characters described here transform from one state to another through time and do so in linearly ordered and reversible pathways called ‘transformation series’. However, an evolutionary direction of transformation (described by a weathercock’s arrow) can be superimposed on these pathways when the ancestral state is identified. This ‘polarization’ of character transformation is driven by an evolutionary search for structural order [which is well supported by concepts in statistical mechanics (26–32)]. Polarized pathways can be effectively described using ‘step matrices’ that assign a transformation cost (measured in ‘steps’ or probabilities) to every change. These step matrices help optimize the interplay between phylogenetic analysis and the evolutionary model (see Fig. 3). (C) Reconstruction of phylogenetic trees. In this process, hypotheses about character state (models of structural evolution) result in hypotheses about groups of molecules (hierarchical phylogenies). Using discrete methods, the trees chosen are those that either require the fewest evolutionary changes (maximum parsimony) or are most likely to happen (maximum likelihood) (8). In this example, phylogenetic trees are reconstructed using the criterion of parsimony by minimizing the total cost of character change. Data matrices show how archaeal characters and molecules display character states. They were used to reconstruct most-parsimonious rooted trees in exhaustive searches. Characters were equally weighted except for stems that were weighted double to account for nucleotide number. Each analysis produced single optimal trees with similar topologies. BS support values indicate how robust individual groups are. 
I have here chosen the maximum parsimony method because it is one of the few that explicitly seeks the reconstruction of ancestral molecules. CI, consistency index. RI, retention index.
MATERIALS AND METHODS
rRNA sequences and secondary structures derived from comparative sequence analysis were drawn from the Antwerp database (http://rrna.uia.ac.be) and the 5S rRNA Data Bank (http://cammsg3.caos.kun.nl). The SSU structures were corrected to account for pseudoknots in variable area V4 (36). RNA structures were decomposed into sub-structural components and their features characterized and coded using an alphanumerical format suitable for cladistic analysis, as previously described (20,21). Coded characters were based on the length in nucleotides of double-helical stem tracts (S), hairpin (H), bulge and interior loops (B), and unpaired segments such as free ends, connecting joints, and multiloop sequences separating stems (U). Other structural features, such as mean number of stems attached to a loop (D) and number of B loops in S (N), were used to describe 5S rRNA. Character states were represented by numbers 0–9 and letters A–P. Structural features with longer nucleotide lengths were given the maximum state (P), and if missing, the minimum state (0). The maximum number of character states was limited by the phylogenetic analysis programs used here (26 states). Structural alignments listed characters characterizing the structure in the 5′→3′ direction as it is read in the sequence, and for each sequence segment, in the order S, B, H and U. Stem tracts were defined as those separated by multibranched and pseudoknot loops, and were considered as coaxial units even when harboring interior loops. They were defined by two complementary sequence segments and characters [named by a number and its prime according to established nomenclature (10)] to account for the difference in nucleotide number between stem and unpaired tracts. As previously described, topographic correspondence was the main criterion for determining character homology in unambiguously aligned rRNA molecules [see Caetano-Anollés (21) for details]. 
Molecules were also characterized by statistical metrics Q, P and S (31) as described in Figure 1. The equilibrium partition functions and base pair probabilities were calculated using the Vienna package (37). Phylogenetic relationships were inferred using the PAUP* package (38) and character reconstruction implemented in MacClade (39). Characters were polarized by distinguishing ancestral states as those with larger S and lower H, B, U, N and D state values. The assumptions, justification and limitations of the model of character change have been previously described (20,21). Phylogenetic trees were reconstructed using maximum parsimony as the optimality criterion, and were automatically rooted at the point where the hypothetical ancestor connected to the tree. Phylogenetic reliability was evaluated by the non-parametric bootstrap (BS) method (40) (implemented using at least 10³ pseudoreplicates in PAUP*) and by double decay (DD) analysis (41) [using RadCon (42)]. The structure of phylogenetic signal in the data was tested by the skewness (g1) of the length distribution of 10⁴ random trees, and permutation tail probability (PTP) tests of cladistic covariation using 10³ replicates. The homogeneity of partitions was analyzed using a modified Mickevich–Farris index of incongruence among data sets and 10³ heuristic search replicates (43). Topological congruence was measured using several tree comparison metrics and randomization tools implemented in COMPONENT (44). The distribution of character change in a clade was measured using the node index (Ni) metric. Character changes per branch in branches equidistant to the root node (by value n) were normalized to the total number of changes per branch in the clade, weighted (multiplied by n), averaged, and expressed on a 0–1 scale. These values were sometimes gap-recoded and used as discrete characters for maximum parsimony analysis.
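The alphanumeric coding of structural features described above can be illustrated with a minimal sketch. It assumes that a feature of length n nucleotides maps directly onto the n-th state symbol, which is an illustrative assumption rather than a documented detail of the original coding scheme; the function names `code_length` and `code_structure` are hypothetical.

```python
import string

# State alphabet: digits 0-9 followed by letters A-P (26 states in total).
STATES = string.digits + string.ascii_uppercase[:16]


def code_length(length):
    """Map a feature length (in nucleotides) to a single state symbol.

    None marks a feature that is missing from the molecule and receives the
    minimum state (0). Lengths beyond the alphabet are capped at the maximum
    state (P). The direct length-to-index mapping is an assumption.
    """
    if length is None:
        return STATES[0]
    if length < 0:
        raise ValueError("length must be non-negative")
    return STATES[min(length, len(STATES) - 1)]


def code_structure(features):
    """Code an ordered list of feature lengths as one row of a data matrix."""
    return "".join(code_length(n) for n in features)


# Four hypothetical features: 5 nt, missing, 12 nt and 30 nt.
row = code_structure([5, None, 12, 30])  # "50CP"
```

Under this reading, any feature longer than 25 nucleotides collapses into state P, so the coding loses resolution for the longest helices and loops.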
RESULTS
Reconstruction of a universal tree based on rRNA structure
The origin and diversification of rRNA molecules was studied directly at the secondary structure level. The procedure of analysis followed the three steps outlined in Figure 1. First, rRNA sequences that were folded according to accepted structural models [inferred from comparative sequence analysis (10)] were described by a set of molecular attributes that characterize numerically their secondary structure. Two different sets of characters were used for this purpose, one describing directly the shape of the molecules, another describing general statistical properties (molecular statistics described in Fig. 1A). In this paper, I have focused on characters that describe each and every spatial component of secondary rRNA structure, since these characters can trace the actual details of the evolutionary transformation of molecular structure. Presently, statistical characters describe the evolution of complete molecular ensembles and cannot trace spatial features individually in complicated molecules such as rRNA. Their analysis will be described elsewhere. Each rRNA structure was dissected into double-helical stem tracts, hairpin loops, bulge and interior loops, and other single-stranded segments. The length of these components (in nucleotides) was tabulated using a simple alphanumeric format and used to generate a data matrix for cladistic analysis in which homologous characters were aligned in ordered columns following the 5′→3′ direction of the sequence. Table 1 illustrates the data showing an alignment matrix of rRNA structural characters encoding inter-subunit bridge contacts and tRNA-binding sites. This is only a small subset representing only ∼5% of the rRNA data matrix. These matrices are used to characterize nucleic acid molecules and can be viewed as cladistic ensembles composed mostly of helical stem and loop regions that expand and/or contract in the course of evolution. 
Secondly, the structural characters identified were polarized by establishing a direction in their evolutionary transformation. Character polarization and its assumptions are carefully described in Figure 1B (and elsewhere) (21) and were implemented by identifying ancestral character states with the ANCSTATES command in PAUP*. Thirdly, the structural alignment matrices were used to reconstruct universal phylogenetic trees using maximum parsimony as the optimality criterion and established methods. This involved comparing molecules from species encompassing all primary organismal domains of life.
Table 1. Alignment matrix of interacting rRNA structural characters.
Characters in LSU rRNA (L) and SSU rRNA (S) represent stems (S), bulges and interior loops (B), hairpin loops (H) and unpaired sequences (U) and encode rRNA bridge contacts (described by numbers) and peptidyl (P), aminoacyl (A), or exit (E) tRNA-binding sites. Descriptions of universal helix tracts and associated structures harboring these features are given in Table 2.
Only LSU and SSU rRNA molecules were analyzed in this study, as the small structured rRNA (e.g. 5S rRNA; Fig. 1) carried little phylogenetic information and was unsuitable for the reconstruction of a universal phylogeny. Data from the rRNA subunits in 29 selected taxa were combined and resulted in 1540 structural characters (878 and 662 for LSU and SSU rRNA, respectively) that included 384 S, 778 B, 117 H and 261 U features. The complete data set can be retrieved from the TreeBASE repository (http://herbaria.harvard.edu/treebase/) under study and matrix accession nos S730 and M1162, respectively. In order to diminish phylogenetic noise, 510 uninformative characters representing invariant features and autapomorphies were excluded before tree reconstruction. Combined analysis of the rRNA subunits produced a single universal tree (Fig. 2A). The data exhibited strong phylogenetic signal, as indicated by the distribution of cladogram length (p < 0.01) and PTP tests of cladistic co-variation (p = 0.001). However, the cladistic properties of rRNA structure appeared heterogeneous. The null hypothesis of congruence was rejected when combined LSU and SSU molecules, combined helix and unpaired regions, or combined structural domains in both molecules were tested for homogeneity of data partitions (p = 0.001). This suggests that the two subunits exhibit distinct phylogenetic histories, the molecules themselves are heterogeneous, and helix and unpaired regions express variant evolutionary patterns. However, congruence could not be rejected when unpaired regions were subdivided into structural components H, B and U (p = 0.425), or when functional characters defining inter-subunit bridges and tRNA-binding sites were compared with the rest of the molecules (p = 0.079). This was somewhat unanticipated, especially for functional regions that should have had differential histories imprinted on them (see below).
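The exclusion of invariant characters and autapomorphies can be sketched as follows. The sketch uses the standard criterion for unordered characters (at least two states, each shared by at least two taxa), which is a simplification: the criterion applied by a parsimony program to linearly ordered characters may differ. The toy matrix and the names `is_informative` and `drop_uninformative` are hypothetical.

```python
from collections import Counter


def is_informative(column):
    """Return True if at least two states each occur in two or more taxa."""
    counts = Counter(column)
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2


def drop_uninformative(matrix):
    """Remove uninformative columns from a taxon -> state-string matrix.

    Returns the reduced matrix and the indices of the retained characters.
    """
    rows = list(matrix.values())
    n_chars = len(rows[0])
    keep = [i for i in range(n_chars)
            if is_informative([row[i] for row in rows])]
    reduced = {taxon: "".join(row[i] for i in keep)
               for taxon, row in matrix.items()}
    return reduced, keep


# Toy matrix: column 0 is invariant and column 2 is an autapomorphy of D.
toy = {"A": "1123", "B": "1124", "C": "1223", "D": "1254"}
reduced, kept = drop_uninformative(toy)  # kept == [1, 3]
```

Applied to the figures reported here, removing 510 of the 1540 characters leaves the 1030 informative characters (647 LSU and 383 SSU) analyzed in Figure 2A.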
Figure 2.
Phylogenetic reconstruction of a universal tree based on the combined secondary structures of LSU and SSU rRNA. (A) Single most-parsimonious tree (13 233 steps; CI = 0.477, RI = 0.708; g1 = –0.467; PTP test, p = 0.001) retained after a heuristic search with TBR branch swapping and 50 replicates of random addition sequence. A total of 647 LSU and 383 SSU rRNA informative characters representing the structure of both subunits were combined and analyzed. The reduced tree shows branches with <50% BS collapsed into polytomies, nodes described by BS proportions and DD indexes (in italics), and maximum leaf stabilities from BS and decay analyses (for 1244 rooted trees). Decay and cladistic information content (CIC) values describe the global tree and the most supported and comprehensive topologies. (B) Strict consensus tree resulting from combining universal trees obtained from individual rRNA subunit structures. The trees had 8892 and 3690 steps and shared 2337 out of 3654 triplets.
The trees reconstructed from the secondary structures of individual or combined rRNA subunits were mostly congruent (partition distance, PD = 18–32; symmetric difference, SD = 0.12–0.25 for quartet analysis), rejecting a topological match by chance (p < 0.01). However, an approach of taxonomic congruence in partitioned analysis in which rRNA subunits were individually analyzed and trees combined by strict consensus resulted in considerable loss of resolution (Fig. 2B). Therefore, I have combined subunit data to enable character reconstruction in the ribosomal ensemble, reduce ambiguities, and maximize explanatory power (45). Note however that this could compromise the conservativeness of certain phylogenetic statements.
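To make the notion of topological congruence concrete, the following minimal sketch computes a partition distance between two rooted trees as the number of clusters present in one tree but not in the other. This is a simplified, cluster-based reading for rooted trees written as nested tuples; it is not the COMPONENT implementation, and the tree encoding and function names are assumptions.

```python
def clusters(tree):
    """Return the set of non-trivial clusters (leaf sets of internal nodes)."""
    found = set()

    def leaves(node):
        # A leaf is a plain string; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root cluster is shared by every tree
    return found


def partition_distance(tree_a, tree_b):
    """Count the clusters found in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "c"), "b"), ("d", "e"))
distance = partition_distance(t1, t2)  # 2: {a, b} versus {a, c}
```

A distance of zero indicates identical rooted topologies, and larger values indicate more conflicting groups.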
The topology of the universal tree was reasonably supported by BS and decay analysis (Fig. 2A), and was similar to topologies individually obtained from LSU and SSU structures (21). The tree branched in three monophyletic groups corresponding to the major domains of life. Clades defining Archaea and Bacteria were significantly supported (85 and 88% BS). In contrast, clade definition was marginal for Eucarya (72% BS), mostly due to the early branching of amitochondriate Archezoa. However, support for Eucarya was significant (86% BS) when a refined model of character evolution (see below) was used. The universal tree was rooted in the eukaryotic branch, albeit poorly (50% BS). A similar rooting was observed in reconstructions from LSU and SSU rRNA structure (with 97 and 50% BS, respectively) (21). Phylogenetic relationships generally matched traditional classification and revealed the unprecedented diversity of the eukaryotes. Decay and leaf stability values showed support being strongest for most of the relationships within Eucarya and weakest for those in Bacteria. Incongruence between phylogenies reconstructed from sequence and structure was also evident, such as the grouping of fungi and plants and the early branching of Archezoa in Eucarya (21). Note that phylogenetic analysis of structure may suffer from the same problems that affect reconstructions from sequence, such as mutational saturation, variation of evolutionary rates across sites, and covarion structure.
Inference of a model of rRNA character evolution
Given a phylogeny and an initial evolutionary hypothesis, it is possible to reconstruct histories of character change and use this information to infer a refined model of character evolution (39). The final objective is to generate a matrix of transformation costs between character states that assigns a probability to every possible change. Step matrices help understand mechanisms of evolution and can improve phylogenetic reconstruction. For example, substitution rate models utilize ‘rate’ matrices that explain RNA sequence evolution by incorporating assumptions about base-pairing constraints (46). As with sequence, the definition of an initial model of RNA structural evolution requires assumptions about the evolutionary process itself. In the present study, the initial model considers that structural features increase or decrease in length in single steps, each step corresponding to the addition or removal of a nucleotide in a sequence. The process follows a linear and reversible path, but expresses an asymmetry in transformation costs (e.g. the transformation cost from lengths 1 to 2 differs from lengths 2 to 1). This accommodates a universal search for order that is supported by considerations in thermodynamics and statistical mechanics of molecules. Since these assumptions could not be falsified by phylogenetic reconstruction (21), the validity of using character history reconstructions to draw inferences about evolutionary processes related to RNA structure is reinforced.
Based on these considerations, a model of character evolution was inferred from patterns of character change using the ‘State Changes and Stasis’ feature in MacClade, and the resulting step matrix of character transformation was used to reconstruct a refined universal tree. The relative frequencies of change were plotted in a bubble diagram (Fig. 3A) and were converted to a transformation type using functions described by Wheeler (47). The refined model depicted the ‘frustrated’ energetics of base pairing invoked by the original model of character change. While changes occurred most frequently in single steps, there was a differential behavior of helix stacks and unpaired loop regions. In helices, losses were favored over gains for lengths of stacks ≤9–11 bp with a reverse trend for longer segments. In unpaired regions, gains were favored over losses, with frequencies decreasing with nucleotide length. These trends were especially evident when one-step gains and losses were individually charted for each or both subunits (data not shown). The inferred step matrix was used to reconstruct a refined universal tree (Fig. 3B). This tree maintained the overall topology of the original reconstruction (PD = 10, SD = 0.08), but showed a better support of the clade encompassing Eucarya.
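A linear, reversible and asymmetric model of this kind can be written as a step matrix and evaluated on a tree with the Sankoff dynamic-programming algorithm. The sketch below is illustrative only: the gain and loss costs, the toy tree and the function names are hypothetical, and it does not reproduce the PAUP* analysis.

```python
def ordered_step_matrix(n_states, gain_cost=1.0, loss_cost=1.0):
    """Build cost[i][j] for a linearly ordered character.

    Moving from state i to state j takes |i - j| single steps; each gain
    (increase in length) and each loss (decrease) carries its own cost.
    """
    return [[(j - i) * gain_cost if j >= i else (i - j) * loss_cost
             for j in range(n_states)]
            for i in range(n_states)]


def sankoff(tree, states, cost):
    """Return, for every root state, the minimum cost of the subtree.

    tree is a nested tuple of leaf names; states maps leaf name -> state.
    """
    n = len(cost)
    if isinstance(tree, str):
        return [0.0 if s == states[tree] else float("inf") for s in range(n)]
    total = [0.0] * n
    for child in tree:
        child_costs = sankoff(child, states, cost)
        for s in range(n):
            total[s] += min(cost[s][t] + child_costs[t] for t in range(n))
    return total


# Hypothetical costs: gains are twice as expensive as losses.
cost = ordered_step_matrix(4, gain_cost=2, loss_cost=1)
root_costs = sankoff((("a", "b"), "c"), {"a": 3, "b": 3, "c": 0}, cost)
# root_costs == [6.0, 5.0, 4.0, 3.0]
```

In this toy case the cheapest root state is 3, so the asymmetry alone favors a long ancestral feature that is later shortened in lineage c. This shows how an asymmetric step matrix imposes a direction on an otherwise reversible transformation series.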
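The conversion of observed frequencies of change into transformation costs can be sketched as follows. The negative logarithm of the relative frequency is used here as one plausible weighting; this choice, the pseudocount and the toy counts are assumptions, and the functions described by Wheeler (47) may differ.

```python
import math


def costs_from_counts(counts, pseudocount=0.5):
    """Convert reconstructed change counts into a step matrix.

    counts[i][j] is the number of changes from state i to state j. Frequent
    changes receive low costs and rare changes high costs. The pseudocount
    keeps unobserved changes finite (an assumption of this sketch).
    """
    n = len(counts)
    total = sum(counts[i][j] + pseudocount
                for i in range(n) for j in range(n) if i != j)
    cost = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                frequency = (counts[i][j] + pseudocount) / total
                cost[i][j] = -math.log(frequency)
    return cost


# Hypothetical counts for a three-state character.
counts = [[0, 8, 1],
          [4, 0, 6],
          [0, 2, 0]]
step_matrix = costs_from_counts(counts)
```

With these counts the change from state 0 to state 1 is cheaper than the reverse change, so the asymmetry observed in the reconstructed histories carries over into the step matrix.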
Figure 3.
Modeling character evolution. (A) Bubble chart describing the average frequency of changes between states in helix stacks and unpaired segments. (B) Reconstruction of a universal tree using a character state matrix that was inferred from probabilities of character change. A single most-parsimonious tree (10 964 steps; g1 = –0.536; PTP test, p = 0.001) retained after a heuristic search with TBR branch swapping and 10 replicates of random addition sequence was obtained. A total of 302 helix and 721 unpaired characters were included in the analysis. The reduced tree shows branches with <50% BS collapsed into polytomies and BS proportions.
Tracing the evolution of structural change in rRNA
The ability to trace structural features evolving along the lineages of the universal tree using parsimony-based methods offers a unique opportunity to study how evolutionary change is distributed and constrained in rRNA. It also enables the inference of hypothetical ancestral molecules by assigning character states (termed ‘most parsimonious reconstruction sets’) to each node in the tree. The historical reconstructions of the pathways taken by characters from one state to another were analyzed here on a phylogeny that, with debated exceptions (Archezoa), derives congruently from sequence, structural traits and traditional classification. To my knowledge, this is the first time that a complete repertoire of structural characters has been traced lineage-by-lineage on a tree using such a formal treatment.
Figure 4A describes ancestral LSU and SSU rRNA molecules inferred from reconstructions of character history. The ancestral molecules showed structural features absent or reduced in molecules from extant taxa in Eucarya (e.g. 8_1, 23_n, C1_n, D4_1 and E20_n), Archaea and Bacteria (e.g. 43_n, G5_n and H1_n components). These features generally coincided with hypervariable regions, known to have insertion sites toward the less conserved periphery of the ribosomal ensemble, and appearing with no clear evolutionary pattern (10).
Figure 4.
Phylogenetic tracing of rRNA structure. (A) Schematic representations of the secondary structure of inferred ancestral rRNA subunit molecules. The LSU and SSU structures contain 100 and 50 universal helical segments (S; defined as segments separated by multibranched or pseudoknot loops), respectively, and several other S specific to certain taxa (58). Sequences are drawn clockwise from the 5′ to the 3′ terminus and S tracts (depicted by bars sized in base pairs) are numbered in the same order for each structural LSU domain (A–I), and globally, for the 5′ (helices 1–21), central (22–31) and 3′ domains (32–50) in SSU. The location of inter-subunit bridges (gray), and P (red), A (blue) and E sites (cyan) are highlighted in the ancestral structures. (B) The evolution of characters defining total and functional rRNA structures was traced on the dichotomous universal tree described in Figure 2A. Pie charts are sized by area and describe changes in functional characters occurring in basal branches and major clades.
The distribution of character change along the branches of the universal (dichotomous) tree showed that most structural changes in LSU and SSU rRNA molecules were of ancestral origin (Fig. 4B). The frequency of change was highest in Eucarya. Character change in branches (nodes) equidistant to the hypothetical ancestor (the root node) increased significantly towards the base of the universal tree, and this pattern was not significantly different between subunits following a two-way ANOVA for values normalized to the total number of informative characters in each molecule (df: 1, 7, 7, 96; branches: F = 2.68, p = 0.014; subunits: F = 0.76, p = 0.386; interaction: F = 1.53, p = 0.168). Overall, there were 0.095 ± 0.009 (SE) and 0.077 ± 0.010 changes per character occurring in LSU and SSU rRNA, respectively. The distribution of character change was also evaluated using a measure of central tendency, the node index (Ni). Ni is a conservative indicator of how derived the changes in the branches of a clade are. It is unbiased by clade topology, number of characters and levels of change, and ranges from 0 (changes occurring exclusively at the root) to 1 (changes exclusively at the leaves). Ni values for the LSU (0.426 ± 0.003) and SSU (0.388 ± 0.004) molecules were significantly lower (p < 0.001) than an average of shuffled data where changes were randomly distributed. This again indicates that changes are unequally distributed along the branches of the tree and cluster at its base.
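The behavior of the node index can be illustrated with a minimal sketch. It follows one possible reading of the definition given in Materials and Methods: changes per branch are averaged within each level n of distance from the root, normalized to their sum, weighted by n, and rescaled to the 0–1 range. The exact normalization used in the original analysis is not specified in enough detail to reproduce, so the formula below and the name `node_index` are assumptions.

```python
from collections import defaultdict


def node_index(branches):
    """Compute a node index from (depth, changes) pairs, one per branch.

    depth is the number of nodes separating the branch from the root
    (1 for the most basal branches); changes is the number of character
    changes reconstructed on that branch.
    """
    by_depth = defaultdict(list)
    for depth, changes in branches:
        by_depth[depth].append(changes)
    # Mean number of changes per branch at each depth level.
    mean_changes = {d: sum(v) / len(v) for d, v in by_depth.items()}
    total = sum(mean_changes.values())
    if total == 0 or len(mean_changes) < 2:
        raise ValueError("need changes on a clade with at least two levels")
    # Depth-weighted average of the normalized changes per branch.
    weighted = sum(d * m / total for d, m in mean_changes.items())
    lo, hi = min(mean_changes), max(mean_changes)
    return (weighted - lo) / (hi - lo)


basal = node_index([(1, 5), (1, 3), (2, 0), (2, 0), (3, 0)])   # 0.0
derived = node_index([(1, 0), (2, 0), (3, 4)])                 # 1.0
even = node_index([(1, 2), (2, 2), (3, 2)])                    # about 0.5
```

Under this reading, values below 0.5 indicate that changes per branch are concentrated toward the base of the clade, which is the pattern reported for both subunits.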
These observations reveal the following three patterns of evolution in the lineages of the universal tree: (i) molecular simplification, especially in highly variable regions; (ii) reduction of structural change with evolutionary time; and (iii) a concordant distribution of character change in both ribosomal subunits.
Tracing the evolution of structures important for ribosomal function
A number of structures important for ribosome function have been recently identified in the interface of the two rRNA subunits, located in an RNA core devoid of ribosomal proteins (6). The construction of three-dimensional nucleotide variability maps has shown that this core is highly conserved (10). Despite low relative substitution rates, the evolution of tRNA-binding sites in LSU rRNA [defined mostly by chemical footprinting and in vivo and in vitro functional studies (1)] was traced in the lineages of a tree generated from its structure (21). Here I have extended that preliminary study to the entire rRNA ensemble by including features important for ribosomal function that have been carefully resolved by low-resolution cryo-EM (48) and high-resolution structural analysis (2–7). Variable features that were individually traced in the universal evolutionary tree included sequences involved in peptidyl transferase activity, the translational cycle, and interaction between rRNA subunits (Fig. 4A). A total of 92 characters were studied (Table 1), encompassing inter-subunit bridge contacts and tRNA-binding sites (Table 2).
Table 2. Interacting structural characters analyzed in rRNA.
¹Interfaces labeled with an asterisk involve the interaction with proteins (e.g. S12, S13, S15, L2, L5, L14, L16, L19, L33). AC, anticodon; DHU, dihydrouracil.
²Contacts are described by features associated with universal helix tracts following standard nomenclature (see Fig. 4A). Characters are identified with superscript numbers.
³Ni is a topological measure of how derived character changes are in the universal tree. ANOVA and Student–Newman–Keuls post hoc analysis showed significant differences between treatments (df: 3, 200; F = 29.6; p = 0.0001), even when categorizing subunits (df: 7, 196; F = 29.7; p = 0.0001). Different letters show significance at p = 0.05 in groups abcd and xyz.
The pie charts in Figure 4B show a total of 204 character changes distributed along the branches of the universal tree. Over half of these occurred in inter-subunit contacts and were mostly derived. In turn, only 12% of changes in tRNA-binding sites occurred in P sites. These changes appeared ancestral to those in A and E sites, being mostly confined to the tree base and to Eucarya. The visual trends observed in the figure were confirmed by statistical analysis. Patterns of change were significant (p = 0.0001) when measured with the Ni metric (Table 2). Ni increased in the order P sites < E and A sites < bridges, in clear progression from ancestral to derived. This pattern of change may portray an actual origin of tRNA-binding sites, provided structural evolution was gradual at the onset of organismal diversification.
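The degrees of freedom reported with Table 2 (3, 200) are those expected from a one-way ANOVA on four groups and 204 observations, which would fit the 204 character changes if they were grouped into four classes of functional structure (this grouping is an assumption, not stated explicitly in the text). The F statistic for such a comparison can be sketched in Python as follows. The function name and the toy values in the usage note are hypothetical and are not data from this study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Each element of `groups` is one treatment (for example, the Ni values
    of all character changes that fall in one class of structure).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For the toy input `[[1, 2, 3], [2, 3, 4], [5, 6, 7]]` the function returns F = 13 with 2 and 6 degrees of freedom. Any four groups that total 204 observations give the degrees of freedom 3 and 200.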
Ancestral–descendant relationships between functional components
In order to dissect evolutionary patterns further, Ni values were calculated for the major clades that were present in the universal tree and these were used as phylogenetic characters to reconstruct a rooted tree of ribosomal structures (Fig. 5). This approach established ancestral–descendant relationships between the functional ribosomal components themselves. Figure 5A shows the ribosomal outline of an interface view of rRNA subunits revealing the specific location of bridge contacts and tRNA-binding sites considered in this analysis. RNA interfaces included a total of 19 LSU and SSU rRNA mobile bridge contacts and P, A and E tRNA-binding sites in each rRNA subunit. For each subunit, tRNA-binding sites encompassed all interactions established with the cloverleaf tRNA structure described in Table 2.
Figure 5.
Phylogenetic relationships between tRNA-binding sites and inter-subunit bridges. (A) Ribosomal outline of an interface view of rRNA subunits, showing location of proteins (light yellow), tRNA (in LSU) or tRNA-binding sites (in SSU), and bridge contacts. Highlighted in gray are dominant structural components of the subunit interface, SSU helix 49, co-axial LSU helices E25–E28, and the flexible bridge component E26 (harboring contact B2a). (B) Phylogenetic reconstruction of relationships between bridge contacts and tRNA-binding sites in each rRNA subunit. Multi-state characters were defined by Ni values for the universal tree and its major clades (Archaea, Bacteria, Eucarya and the tree base), normalized to leaf number. An exhaustive search resulted in three most-parsimonious trees (57 steps; CI = 0.702, RI = 0.653; g1 = –0.537; PTP test, p = 0.01). The reduced tree shows branches with <50% BS collapsed into polytomies, nodes described by BS proportions and DD indexes (in italics), and maximum leaf stabilities. (C) Detailed phylogenetic relationships between bridge contacts and tRNA-binding sites. Two trees (204 steps; CI = 0.392, RI = 0.660; g1 = –0.443; PTP test, p = 0.01) were retained after a branch-and-bound search. Note that trees shown in (B) and (C) are congruent with the 50% majority rule consensus. The existence of character change was traced as indexed in the legend, and is shown for selected branches.
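The tree statistics quoted in the legend follow the standard definitions of the ensemble consistency index, CI = M/S, and retention index, RI = (G − S)/(G − M), where S is the observed tree length and M and G are the minimum and maximum numbers of steps the characters could require on any tree. A minimal Python sketch is given below. The values M = 40 and G = 89 used in the usage note are hypothetical: they were back-calculated so that the formulas reproduce the reported indexes for the 57-step trees, and they are not taken from the study.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble consistency index, CI = M / S."""
    return min_steps / observed_steps


def retention_index(min_steps, max_steps, observed_steps):
    """Ensemble retention index, RI = (G - S) / (G - M)."""
    return (max_steps - observed_steps) / (max_steps - min_steps)


# Hypothetical M and G (assumed for illustration only), with S = 57 steps.
ci = round(consistency_index(40, 57), 3)
ri = round(retention_index(40, 89, 57), 3)
```

With these assumed inputs, `ci` is 0.702 and `ri` is 0.653. CI falls as homoplasy adds extra steps, whereas RI measures how much of the possible synapomorphy is retained on the tree.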
The phylogenetic relationships between functional ribosomal structures were first reconstructed by pooling bridge components in each subunit together (Fig. 5B). The tree that was generated showed that (i) P sites in both rRNA subunits were ancestral to other functional structures; (ii) P, A and E sites clustered separately in three distinct groups, with E sites being more derived than the rest; and (iii) LSU rRNA bridge contacts had divergent histories and were more derived than their SSU rRNA counterparts. Statistical analysis of Ni values was for the most part consistent with these results (Table 2). The phylogenetic grouping of binding sites for tRNA in each subunit, with P sites appearing earlier than A sites, and E sites being the most derived, suggests that tRNA-binding sites in both rRNA subunits have been the subject of concerted evolution.
The phylogenetic relationships between mobile bridge contacts in individual subunits were also analyzed in relation to binding sites for tRNA (Fig. 5C). The phylogenetic tree showed the evolutionary relationships of individual mobile inter-subunit contacts. SSU rRNA bridge contacts B2a, B3, B5 and B6, all supported by the penultimate stem helix of the molecule (helix 49; see Figs 4A and 5A), were closely related phylogenetically to ancestral P site components. Stem helix 49, also known as helix 44 of 16S rRNA in the prokaryotic model, is the dominant structural component of the SSU rRNA interface and has been proposed as a central ribosomal functional relay (2). LSU rRNA contact B7a and the conserved B2b, both perturbed during tRNA translocation (6) and supported by coaxial helices E26–E28 (helices 69 and 71 of the prokaryotic model), and ‘flexible’ contact B2a, were similarly related to the P site. These contacts form a crucial central LSU rRNA interface element implicated in translational fidelity (6) that interacts directly with SSU rRNA functional relay through B2a and with the ‘switch helix’ 31 through B2b. In contrast, several other bridge contacts were phylogenetic derivatives of the A and E sites, such as B2c, B7b, and LSU contacts B3 and B5. Interestingly, the ‘A site finger’ B1a and protein-mediated B1b contacts between the LSU head and the SSU top appeared phylogenetic derivatives of the P site, although being structurally distant (D14) and quite derived. These flexible contacts are involved in the ratchet-like inter-subunit rotation induced by elongation factor G (49). B1a was particularly variable, accounting for 16% of changes in LSU bridge contacts. Interestingly, these bridges involve protein–protein and protein–helix contacts (6), suggesting the recent role of ribosomal proteins in subunit interactions.
Character evolution was also reconstructed in the structural trees, confirming for example the ancestral nature of bridges B2b, B4, B8 and contacts supported by the penultimate stem helix of SSU rRNA. This approach was especially useful when studying structural components important for ribosomal translocation. The Thermus thermophilus LSU rRNA central interface element harboring the flexible bridge contact B2a exhibits an intriguing disorder in the crystal structure of helix 69 (6). This helix, here labeled E26, also establishes contacts with P-tRNA and A-tRNA. The disorder has been explained by the feeble support of the multi-branched loop structure (embodied in helices E24–E28) that harbors the central interface element and the absence of direct stacking–packing interactions with the rest of the LSU rRNA molecule. The disorder identifies features capable of independent motion that participate in ribosomal dynamics and involve both subunit contact and tRNA movement (6). Because of its central role in translocation, the E24–E28 multi-branched loop structure was taken out of molecular context and was characterized using statistical properties that describe the stability and uniqueness of folded conformations (see Fig. 1). These properties were traced in the universal tree using squared-change parsimony (Fig. 6). Interestingly, Q entropy values, which describe the number of conflicting interactions and measure plasticity in the E24–E28 loop, generally increased in the tree. They were considerably higher in Archaea and Eucarya than in Bacteria. This trend is the opposite of what is generally encountered in the rest of the rRNA molecules (G. Caetano-Anollés, manuscript in preparation), but was shared by the stem helix D14 that harbors B1a, the other flexible rRNA contact (data not shown). Therefore, ribosomal structures bearing flexible contacts appear subjected to strong functional constraints that force them to go against the universal search for molecular order evident in rRNA.
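The tracing of a continuous property such as Q on a tree can be illustrated with a small Python sketch of squared-change parsimony, which assigns to the internal nodes the values that minimize the sum of squared changes along all branches. The sketch assumes unit branch lengths and a root with no ancestral branch, and it reaches the optimum by repeated averaging rather than by the algorithm implemented in MacClade. The function name, tree and tip values are hypothetical.

```python
def squared_change_reconstruction(children, tip_values, sweeps=500):
    """Estimate internal-node states minimizing the sum of squared changes.

    children:   dict mapping each internal node to a list of its child nodes
    tip_values: dict mapping each terminal taxon to its observed value
    Assumes unit branch lengths (an illustrative simplification).
    """
    parent = {c: p for p, kids in children.items() for c in kids}

    # Start every internal node at the mean of the observed tip values.
    start = sum(tip_values.values()) / len(tip_values)
    state = dict(tip_values)
    for node in children:
        state[node] = start

    # At the optimum each internal node equals the mean of its neighbours
    # (its children, plus its parent unless it is the root).
    for _ in range(sweeps):
        for node, kids in children.items():
            neighbours = [state[k] for k in kids]
            if node in parent:
                neighbours.append(state[parent[node]])
            state[node] = sum(neighbours) / len(neighbours)

    return {node: state[node] for node in children}


# Toy tree ((A, B), C) with made-up tip values.
tree = {"root": ["n1", "C"], "n1": ["A", "B"]}
ancestral = squared_change_reconstruction(tree, {"A": 1.0, "B": 3.0, "C": 5.0})
```

For this toy tree the reconstruction converges to 2.6 for the ancestor of A and B and to 3.8 for the root, the values at which each internal node is the average of its neighbours.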
Figure 6.
Tracing the molecular plasticity of the E24–E28 structure that supports ‘flexible’ contact B2a in the universal tree. Representative molecular substructures (124–150 nt in length) in Archaea, Bacteria and Eucarya are given from bottom to top. Minimum Gibbs free energy increments (ΔG°) for the folds ranged from –18.8 (Tetrahymena thermophila) to –65.8 kcal/mol (Thermotoga maritima). Shannon entropy (Q) values that measure the uniqueness of folded conformations, together with other statistical properties, were calculated for the individual substructures. Squared-change parsimony was then used to reconstruct ancestral Q states as continuous characters in the universal tree, using MacClade with the rooted tree option. A mean of corresponding permuted cohorts obtained by sequence randomization is shown at the base of the tree. Since at Q = 1 there is no preferred structure, the relatively low value of randomized sequences demonstrates once more the importance of self-organization in evolution (29–32).
DISCUSSION
I have used cladistic principles to reconstruct phylogenetic history directly from the structure of nucleic acid molecules, focusing on the evolution of structural attributes that characterize how branched, stable and uniquely folded (plastic) these molecules are. Structural attributes (characters) transform from one numerical value (character state) to another in linearly ordered and polarized pathways. These ‘transformation’ [sensu Hennig (50)] pathways are defined here by a model of structural evolution driven by a fundamental and universal evolutionary tendency towards order. This model assumes that molecular evolution is constrained by the mapping of sequence into structure and that ancestral characteristics in extant molecules become fixed when trapped in the local optima of an adaptive landscape. The model does not consider other possible attributes that bear on biological function and are optimized during molecular evolution, given the limited knowledge of the complex selective forces operating at a functional level. Instead, function is regarded here as a first consequence of order and assumptions are only based on principles well grounded in the thermodynamics and statistical mechanics of molecules (26–32). Note that these assumptions could not be falsified by phylogenetic reconstruction, as congruence between phylogenetic trees derived from structure, sequence and traditional classification failed to reject the validity of character transformation hypotheses.
One particularly fundamental phylogeny represents the hierarchical classification of the living world and is based on comparative analysis of sequences encoding rRNA and several proteins (14). This phylogeny is known as the universal tree of life. Because of concerted evolution, there are no paralogous genes that can root the rRNA universal tree and this problem has been intractable. Here I was able to reconstruct a rooted universal tree from the combined structures of LSU and SSU rRNA molecules. The tree branched in three monophyletic groups corresponding to the major domains of life, and was rooted in the eukaryotic branch (Fig. 2). This rooting supports conclusions from a previous analysis of individual rRNA structures (21), but conflicts with the accepted view of a prokaryotic universal ancestor (51). Instead, it intimates an equally parsimonious eukaryotic origin of diversified life. Note that a eukaryotic ancestry of life (52) has been recently supported by phylogenetic analysis of several gene sequences [e.g. SRP (53)]. The existence of three major monophyletic groups parallels evolutionary reconstructions from comparative sequence analysis (51,54,55), and suggests diversity originated in three dramatic evolutionary events. Within major clades, phylogenetic relationships generally matched traditional classification. However, there were instances of incongruence such as the basal placement of amitochondriate Archezoa in the Eucarya, which represents a highly debated topology (56).
Character tracing in the reconstructed tree allowed the inference of a refined model of character evolution, produced a step matrix of transformation costs for parsimony analysis, and resulted in a refined universal tree (Fig. 2). This tree retained the overall topology of the original reconstruction. As expected, the inferred model depicted the ‘frustrated’ energetics of base pairing in RNA structure (26,28). The minimum free energy of a secondary structure can be considered the sum of its loop energies, which have been measured, tabulated, and used in folding algorithms as function of loop size and delimiting base pairs (57–59). Energetically unfavorable loops constrain the formation of energetically favorable stacking regions, resulting in a vast arrangement of possible helix and unpaired segments [i.e. the structural repertoire of an RNA sequence (28)]. The contrasting probabilities of change in helix and loop regions reflect these opposing energetic trends. However, step matrices showed an unanticipated tendency to form rRNA molecules with paired segments of an optimal length of ∼10 bp. These observations suggest there is an evolutionary tendency to optimize molecular size. Interestingly, reconstructions of character history allowed the inference of ancestral rRNA molecules showing there was a clear evolutionary trend towards molecular simplification (Fig. 3). This general trend is consistent with a proposal of simplification of molecular features in prokaryotes by gene loss and non-orthologous displacement (60), and streamlining of the rapidly replicating prokaryotic genomes (17). Interestingly, features that are simplified generally have a polyphyletic distribution in the universal tree and occur in the periphery of the ribosome (10). Probably, they are not constrained by function and can therefore afford to be transient during evolution (17). 
However, there was one striking exception in this study, the ancestral but very compact structure of Encephalitozoon cuniculi (61). This anomaly could be attributed to unusual evolutionary forces, fast evolutionary rates, and the intracellular lifestyle of this amitochondriate microsporidian endoparasite.
An evolutionary model of structural change was recently inferred from the dynamics of RNA populations evolving towards predefined structural targets in a constant environment (28,62). The model predicts a concurrent reduction of variability (genetic canalization) and plasticity during structural evolution that ultimately results in evolutionary lock-in and extreme modularity (i.e. the ability to maintain the structural integrity of autonomous components across a wide range of environments and genetic contexts) (28). This model can be tested in rRNA by inferring the history of character change, if we assume that constraints imposed on structure by the functional landscape (e.g. involving translation) were relatively constant during ribosomal evolution. The distribution of character change along the branches of the universal tree showed structural change clustering at its base (Fig. 4B and Table 2). This pattern is consistent with the proposed evolutionary model, providing direct experimental support to the predicted reduction of structural variability in time. Similarly, tracing of cladistic characters reflecting the uniqueness of folded conformations supported an evolutionary decrease of the plastic repertoire (G. Caetano-Anollés, manuscript in preparation). Reductions in RNA plasticity have also been supported by studies in statistical mechanics of extant RNA molecules (including rRNA) (29–32). Overall results substantiate the concurrent decrease of genetic variability and structural plasticity during evolution. Since cladogenesis (i.e. the branching of lineages in a phylogenetic tree) is a consequence of natural selection (50), I propose here that these evolutionary lock-in mechanisms enhance cladogenesis at the expense of curtailing phenotypic innovation. 
As a consequence, ancestral characteristics in extant molecules result from lineages that retain exploratory properties while those that are derived result from lock-in and express structural modularity as an evolutionary novelty. We are currently tracing modularity characteristics in phylogenetic trees to test some of these assumptions.
One of the most powerful features of structure-based phylogenetic tree reconstruction proposed here is the ability to trace the evolution of functionally important components in nucleic acid molecules. This is particularly meaningful when structural information has been acquired in vitro using methods such as cryo-electron microscopy reconstruction, NMR spectroscopy, crystallography, and genetic and biochemical approaches. Recent progress in the structural determination of the ribosomal ensemble (2–7) has defined mobile inter-subunit bridge contacts and tRNA-binding sites at atomic resolution in rRNA. This has provided an unprecedented view of the mechanism of translation. Here I show that the history of structural change in these ribosomal components can be traced on the universal tree. The distributions of character change along the universal tree were used to establish phylogenetic relationships between the functional structures themselves (Fig. 5). This constitutes a novel approach that enables the phylogenetic reconstruction of ‘structural trees’ and the establishment of ancestral–descendant relationships between the individual structural components of a nucleic acid molecule. The evolutionary patterns observed support several interesting and functionally important features such as the concerted evolution of tRNA-binding sites in the two subunits, and the ancestral nature and common origin of structures that participate in ribosomal dynamics. These structures include the peptidyl (P) site, the functional relay of the penultimate stem helix of SSU rRNA, and the central interface of LSU rRNA. The spatial arrangement of contacts and binding sites and the molecular disorder that characterizes the coupling of tRNA translocation and inter-subunit movement in crystallographic analysis (6) matched the evolutionary patterns observed here. 
Interestingly, mobile contacts centrally implicated in these mechanisms (B2a and B1a) appear supported by structures with atypical evolutionary trends that favor molecular disorder. This observation illustrates how functional constraints can curb the evolution of molecules.
Structures predicted by the study of compensating substitutions in sequence alignments (8–10) have recently been confirmed by crystallographic studies of bacterial and archaeal rRNA molecules (1–7). This has validated the comparative sequence analysis prediction method. The majority of secondary structural components are common to prokaryotic and eukaryotic molecules, and for the most part, structures are well defined (10), despite ongoing refinement of variable and taxon-specific rRNA regions (especially in Eucarya) (36). Therefore, structural inaccuracies are assumed not to be severe and are tolerated as systematic error in this study, pending a detailed crystallographic analysis of the eukaryotic rRNA ensemble. Mature rRNA molecules represent a mosaic of highly conserved domains and rapidly evolving areas always found in identical locations along the molecules (63). Two- and three-dimensional nucleotide variability maps have been constructed recently (10), showing that highly conserved regions are more abundant in single-stranded rRNA and are preferentially located in the center of the ribosome. In contrast, structures specific for Archaea, Bacteria and Eucarya are situated in its periphery. A preliminary analysis of structural character change has shown that change is maximal on the periphery of the rRNA ensemble (G. Caetano-Anollés, unpublished results), confirming inferences from three-dimensional variability maps (10). The inter-subunit bridge contacts and tRNA-binding sites traced in the present study fall within the conserved and centrally located areas. They belong to functional structures involved in tRNA binding, mRNA decoding and peptidyl transfer, which exhibit average nucleotide substitution rates 10–50 times lower than the average rRNA site. These structures are conserved and are present in the three domains of life. Structural inaccuracies in these sites should be considered negligible and inconsequential to the validity of the phylogenetic inferences presented in this study.
SUPPLEMENTARY MATERIAL
Supplementary Material is available at NAR Online.
ACKNOWLEDGEMENTS
I gratefully acknowledge G. E. Caetano-Anollés for help in character coding, V. Knudsen for computer programs, and Vital NRG for financial support.
REFERENCES
1. Green,R. and Noller,H.F. (1997) Ribosomes and translation. Annu. Rev. Biochem., 66, 679–716.
2. Cate,J.H., Yusupov,M.M., Yusupova,G.Z., Earnest,T.N. and Noller,H.F. (1999) X-ray crystal structure of 70S ribosome functional complexes. Science, 285, 2095–2104.
3. Ban,N., Nissen,P., Hansen,J., Moore,P.B. and Steitz,T.A. (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905–920.
4. Wimberly,B.T., Brodersen,D.E., Clemons,W.M.,Jr, Morgan-Warren,R.J., Carter,A.P., Vonrhein,C., Hartsch,T. and Ramakrishnan,V. (2000) Structure of the 30S ribosomal subunit. Nature, 407, 306–307.
5. Schluenzen,F., Tocilj,A., Zarivach,R., Harms,J., Gluehmann,M., Janell,D., Bashan,A., Bartels,H., Agmon,I., Franceschi,F. and Yonath,A. (2000) Structure of functionally activated small ribosomal subunit at 3.3 Å resolution. Cell, 102, 615–623.
6. Yusupov,M.M., Yusupova,G.Z., Baucom,A., Lieberman,K., Earnest,T.N., Cate,J.H.D. and Noller,H.F. (2001) Crystal structure of the ribosome at 5.5 Å resolution. Science, 292, 883–896.
7. Ogle,J.M., Brodersen,D.E., Clemons,W.M.,Jr, Tarry,M.J., Carter,A.P. and Ramakrishnan,V. (2001) Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science, 292, 897–902.
8. James,B.D., Olsen,G.J. and Pace,N.R. (1989) Phylogenetic comparative analysis of RNA secondary structure. Methods Enzymol., 180, 227–239.
9. Gutell,R.R., Larsen,N. and Woese,C.R. (1994) Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiol. Rev., 58, 10–26.
10. Wuyts,J., Van de Peer,Y. and De Wachter,R. (2001) Distribution of substitution rates and location of insertion sites in the tertiary structure of ribosomal RNA. Nucleic Acids Res., 29, 5017–5028.
11. Woese,C.R. (2001) Translation: in retrospect and prospect. RNA, 7, 1055–1067.
12. Page,R.D.M. and Holmes,E.C. (1998) Molecular Evolution: A Phylogenetic Approach. Blackwell Science, Oxford.
13. Charlesworth,D., Charlesworth,B. and McVean,A.T. (2001) Genome sequences and evolutionary biology, a two-way interaction. Trends Ecol. Evol., 16, 235–242.
14. Doolittle,W.F. (1999) Phylogenetic classification and the universal tree. Science, 284, 2124–2128.
15. Vukmirovic,O.G. and Tilghman,S.M. (2000) Exploring genome space. Nature, 405, 820–822.
16. Mittl,P.R.E. and Grütter,M.G. (2001) Structural genomics: opportunities and challenges. Curr. Opin. Chem. Biol., 5, 402–408.
17. Clark,C.G. (1987) On the evolution of ribosomal RNA. J. Mol. Evol., 25, 343–350.
18. Billoud,B., Guerrucci,M.-A., Masselot,M. and Deutsch,J.S. (2000) Cirripede phylogeny using a novel approach: molecular morphometrics. Mol. Biol. Evol., 17, 1435–1445.
19. Collins,L.J., Moulton,V. and Penny,D. (2000) Use of RNA secondary structure for studying the evolution of RNase P and RNase MRP. J. Mol. Evol., 51, 194–204.
20. Caetano-Anollés,G. (2001) Novel strategies to study the role of mutation and nucleic acid structure in evolution. Plant Cell Tissue Org. Culture, 67, 115–132.
21. Caetano-Anollés,G. (2002) Evolved RNA secondary structure and the rooting of the universal tree of life. J. Mol. Evol., 54, 333–345.
22. Eventoff,W. and Rossmann,M.G. (1975) The evolution of dehydrogenases and kinases. CRC Crit. Rev. Biochem., 3, 111–140.
23. Johnson,M.S., Sutcliff,M.J. and Blundell,T.L. (1990) Molecular anatomy: phyletic relationships derived from three-dimensional structures of proteins. J. Mol. Evol., 30, 43–59.
24. Bujnicki,J.M. (2000) Phylogeny of restriction endonuclease-like superfamily inferred from comparison of protein sequences. J. Mol. Evol., 50, 39–44.
25. Breitling,R., Laubner,D. and Adamski,J. (2001) Structure-based phylogenetic analysis of short-chain alcohol dehydrogenases and reclassification of the 17beta-hydroxysteroid dehydrogenase family. Mol. Biol. Evol., 18, 2154–2161.
26. Fontana,W. and Schuster,P. (1998) Shaping space: the possible and the attainable in RNA genotype–phenotype mapping. J. Theor. Biol., 194, 491–515.
27. Wuchty,S., Fontana,W., Hofacker,I.L. and Schuster,P. (1999) Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers, 49, 145–165.
28. Ancel,L.W. and Fontana,W. (2000) Plasticity, evolvability and modularity in RNA. J. Exp. Zool., 288, 242–283.
29. Higgs,P.G. (1993) RNA secondary structure: a comparison of real and random sequences. J. Phys. I France, 3, 43–59.
30. Higgs,P.G. (1995) Thermodynamic properties of transfer RNA: a computational study. J. Chem. Soc. Faraday Trans., 91, 2531–2540.
31. Schultes,E.A., Hraber,P.T. and LaBean,T.H. (1999) Estimating the contributions of selection and self-organization in RNA secondary structure. J. Mol. Evol., 49, 76–83.
32. Gultyaev,P.A., van Batenburg,F.H.D. and Pleij,C.W.A. (2002) Selective pressures on RNA hairpins in vivo and in vitro. J. Mol. Evol., 54, 1–8.
33. Gladyshev,G.P. (1978) On the thermodynamics of biological evolution. J. Theor. Biol., 75, 425–441.
34. Gladyshev,G.P. and Eshov,Y.A. (1982) Principles of the thermodynamics of biological systems. J. Theor. Biol., 94, 301–343.
35. Kierzek,E., Biala,E. and Kierzek,R. (2001) Elements of thermodynamics in RNA evolution. Acta Biochim. Polonica, 48, 485–493.
36. Wuyts,J., De Rijk,P., Van de Peer,Y., Pison,G., Rousseuw,P. and De Wachter,R. (2000) Comparative analysis of more than 3000 sequences reveals the existence of two pseudoknots in area V4 of eukaryotic small subunit ribosomal RNA. Nucleic Acids Res., 28, 4698–4708.
37. Hofacker,I.L., Fontana,W., Stadler,P.F., Bonhoeffer,L.S., Tacker,M. and Schuster,P. (1994) Fast folding and comparison of RNA secondary structures. Monatshefte Chem., 125, 167–188.
38. Swofford,D.L. (1999) Phylogenetic Analysis Using Parsimony and Other Programs (PAUP*), Version 4. Sinauer Assoc., Sunderland, MA.
39. Maddison,W.P. and Maddison,D.R. (1999) MacClade: Analysis of Phylogeny and Character Evolution, Version 3.08. Sinauer Assoc., Sunderland, MA.
40. Felsenstein,J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution, 39, 783–791.
41. Wilkinson,M., Thorley,J.L. and Upchurch,P. (2000) A chain is no longer than its weakest link: double decay analysis of phylogenetic hypotheses. Syst. Biol., 49, 754–776.
42. Thorley,J.L. and Page,R.D.M. (2000) RadCon: phylogenetic tree comparison and consensus. Bioinformatics, 16, 486–487.
43. Farris,J.S., Kållersjö,M., Kluge,A.G. and Bult,C. (1995) Testing significance of incongruence. Cladistics, 10, 315–319.
44. Page,R.D.M. (1993) COMPONENT, Tree Comparison Software for Microsoft Windows, v. 2.0. The Natural History Museum, London.
45. Nixon,K.C. and Carpenter,J.M. (1996) On simultaneous analysis. Cladistics, 12, 221–241.
46. Savill,N.J., Hoyle,D.C. and Higgs,P.G. (2001) RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using maximum-likelihood methods. Genetics, 157, 399–411.
47. Wheeler,W.C. (1990) Combinatorial weights in phylogenetic analysis. A statistical parsimony procedure. Cladistics, 6, 269–275.
48. Gabashvili,I.S., Agrawal,R.K., Spahn,C.M., Grassucci,R.A., Svergun,D.I., Frank,J. and Penczek,P. (2000) Solution structure of the E. coli 70S ribosome at 11.5 Å resolution. Cell, 100, 537–549.
49. Frank,J. and Agrawal,R.K. (2000) A ratchet-like inter-subunit reorganization of the ribosome during translocation. Nature, 406, 318–322.
50. Hennig,W. (1966) Phylogenetic Systematics. University of Illinois Press, Urbana, IL.
51. Woese,C.R., Kandler,O. and Wheelis,M.L. (1990) Towards a natural system of organisms: proposal for the domains Archaea, Bacteria and Eucarya. Proc. Natl Acad. Sci. USA, 87, 4576–4579.
52. Reanney,D.C. (1974) On the origin of prokaryotes. J. Theor. Biol., 48, 243–251.
53. Brinkmann,H. and Philippe,H. (1999) Archaea sister group of bacteria? Indications from tree reconstruction artifacts in ancient phylogenies. Mol. Biol. Evol., 16, 817–825.
54. Gouy,M. and Li,W.-H. (1989) Phylogenetic analysis based on rRNA sequences supports the archaebacterial rather than the eocyte tree. Nature, 339, 145–147.
55. De Rijk,P., Van de Peer,Y., Van den Broeck,I. and De Wachter,R. (1995) Evolution according to the large ribosomal subunit RNA. J. Mol. Evol., 41, 366–375.
56. Philippe,H., Germot,A. and Moreira,D. (2000) The new phylogeny of eukaryotes. Curr. Opin. Genet. Dev., 10, 596–601.
57. Freier,S.M., Kierzek,R., Jaeger,J.A., Sugimoto,N., Caruthers,M.H., Neilson,T. and Turner,D.H. (1986) Improved free-energy parameters for prediction of RNA duplex stability. Proc. Natl Acad. Sci. USA, 83, 9373–9377.
58. Jaeger,J.A., Turner,D.H. and Zuker,M. (1989) Improved predictions of secondary structures for RNA. Proc. Natl Acad. Sci. USA, 86, 7706–7710.
59. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters provides robust prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.
60. Forterre,P. and Philippe,H. (1999) Where is the root of the universal tree of life? Bioessays, 21, 871–879.
61. Peyretaillade,E., Biderre,C., Peyret,P., Duffieux,F., Méténier,G., Gouy,M., Michot,B. and Vivarés,C.P. (1998) Encephalitozoon cuniculi, a unicellular eukaryote with an unusual chromosomal dispersion of ribosomal genes and a LSU rRNA reduced to the universal core. Nucleic Acids Res., 26, 3513–3520.
62. Fontana,W. and Schuster,P. (1998) Continuity in evolution: on the nature of transitions. Science, 280, 1451–1456.
63. Hassouna,N., Michot,B. and Bachellerie,J.P. (1984) The complete nucleotide sequence of mouse 28S rRNA gene. Implications for the process of size increase of the large subunit rRNA in higher eukaryotes. Nucleic Acids Res., 12, 3563–3583.
64. Higgs,P.G. (2000) RNA secondary structure: physical and computational aspects. Q. Rev. Biophys., 33, 199–253.
65. Fontana,W., Konings,D.A., Stadler,P.F. and Schuster,P. (1993) Statistics of RNA secondary structures. Biopolymers, 33, 1389–1404.