Proceedings of the National Academy of Sciences of the United States of America. 1999 Sep 28;96(20):11067–11068. doi: 10.1073/pnas.96.20.11067

Testing ancient RNA–protein interactions

Laura F Landweber
PMCID: PMC34244  PMID: 10500126

Abstract

The past decade in molecular biology has seen remarkable advances in the study of the origin and early evolution of life. The mathematical tools for analyzing DNA and protein sequences, coupled with the availability of complete microbial genome sequences, provide insight almost as far back as the age of the nucleic acids themselves. Experimental evolution in the laboratory and especially in vitro evolution of RNA provide insight into a hypothetical world where RNA, or a close relative, may have debuted as a primary functional and informational molecule. The ability to isolate new functional RNAs from random sequences now ultimately makes the world of possible primitive chemical interactions accessible even when the molecules or reactions are no longer present in modern species. Thus we can at last form direct experimental tests of specific models for the origin of RNA–protein associations, such as those that influenced the genetic code. This marks a turning point for probing the origin and early history of life at the molecular level.


When Hansel and Gretel set out into the forest, they left a trail of bread crumbs to trace their path home, but alas the trail disappeared. Evolutionary biologists face a similarly daunting task, as few crumbs remain that allow us to trace the history of ancient molecules, especially those that arose long before the last common ancestors of all terrestrial life. More recent molecular evolution, on the other hand, generally is inferred from the trail of nucleotide and amino acid substitutions encoded in DNA sequences. So what can we know about the evolutionary relationships between molecules as old as RNA and protein, and most remarkably the events that shaped the genetic code that links them? Each of the presenters in the session on “Chemistry of the Origin of Life” focused on a different approach to this problem, allowing us to contrast the perspectives of modern DNA sequence comparison, synthetic organic chemistry, and finally recapitulating evolution in the laboratory.

Deep Phylogeny

A surprising discovery that emerged from analysis of complete archaeal and bacterial genome sequences was the unconventional appearance of a class I lysyl-tRNA synthetase (LysRS) in several Archaea and even a few representative Bacteria, such as the Lyme disease spirochete Borrelia burgdorferi. Most known organisms charge their tRNALys with a class II synthetase. Indeed, 10 amino acids are invariably charged with a class I synthetase and nine with class II, so this marks lysine as an exception. Lluis Ribas de Pouplana of the Scripps Research Institute (La Jolla, CA) presented phylogenetic evidence that the class I LysRSs form a monophyletic group, meaning that they are more closely related to each other than to any other class I synthetase, despite their noncanonical distribution within Bacteria and Archaea. This analysis also suggests that class I LysRSs are as old as the other class I synthetases, implying redundancy of class I and class II LysRSs in the universal ancestor (1) followed by either multiple losses or, more favorably, a web of ancient lateral transfers (2).

Such extensive gene transfers may have been the dominant mode of inheritance during the subcellular period of genetic annealing proposed by Woese (3). Likened to a kibbutz, a consortium of progenotes at this stage would have shared components freely before systems such as the translational apparatus were fully consolidated. Woese offers the modularity of synthetases as an explanation for their increased mobility: their role is universal, and they are unconstrained by species-specific interactions except with tRNA. Despite the accumulation of differences, which would have been more subtle billions of years ago, the ability of a few synthetases to operate in foreign systems does support this notion, while also underscoring the strong conservation of distinct elements on tRNA. For example, an archaeal class I LysRS from Methanococcus can still charge Escherichia coli tRNALys, which normally is charged by a class II LysRS (4), even though the class I enzyme approaches the tRNA from a different side.

Ribas de Pouplana’s conclusions suggest that the identity of tRNALys, which itself displays a canonical phylogeny for these species, may have been established long before the appearance or at least fixation of protein LysRS (2). This observation should cheer RNA world enthusiasts, as it supports the presence of functional RNA molecules in the absence of proteins, but it is also partly a logical argument, as one cannot use existing molecules to decipher the order of events that preceded them. Given the strong evidence for lateral transfer among synthetases, one could consider a secondary origin of class I LysRS, say by activation of a pseudogene or by formation of a chimera from other class I synthetases. Although not observed for any other synthetase, either scenario would obscure the enzyme’s pedigree yet root it similarly at the base of a tree.

The protein sequence identities between class I LysRSs are low (26–44%), yet the alignments are robust because of the very few gaps and superimposable structures of most class I synthetases (5). We may put our faith in the present phylogeny, but the small sample size and huge genetic distances point to a pressing need for more class I and class II LysRS sequences from Archaea and Bacteria, as well as functional studies of both classes of LysRS and the crystal structure of any class I LysRS. Though the timing may not be knowable for such deep ancestry, structural analysis still should reveal which of the class I synthetases is the closest relative of LysRS.
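As an illustration of what a figure such as "26–44% identity" measures, the following is a minimal sketch of a pairwise identity calculation over two already-aligned sequences. The function name, the toy sequences, and the choice to skip gapped columns in the denominator are all assumptions made for this example; they are not taken from the analysis discussed above, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over ungapped alignment columns.

    Assumption: columns where either sequence has a gap are skipped,
    so the denominator is the number of ungapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy, made-up aligned fragments (hypothetical, not real LysRS sequences).
toy_a = "MKT-LLVA"
toy_b = "MRTALIVA"
identity = percent_identity(toy_a, toy_b)
```

Because few gaps need to be skipped in the class I alignments, this kind of column-by-column comparison remains meaningful even when the identity itself is low.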

Chemical Synthesis

Yu-Fen Zhao of Tsinghua University (Beijing, China) presented her proposal for a role of phosphates in prebiotic organic chemistry. Phosphates are a common currency in modern biochemistry, where their exchange leads to streams of reactions in metabolism and they are the main activating group that drives reactions as fundamental as polymerization of DNA and RNA; hence it is natural to assume they played major roles in prebiotic biochemistry as well. Others caution that phosphate is an unlikely reagent for the prebiotic world (6).

Zhao has either synthesized or discovered a prebiotic sort of phosphotransfer cascade, analogous to the phosphate transfers and relays that control several components in a circuit. Highly evolved phosphorelays are found in all domains of life, implying their existence in the universal ancestor (7).

Zhao’s much simpler phosphotransfer cascade begins with N-phosphorylation of a single naturally occurring α-amino acid, such as threonine, by reaction with dialkylphosphite in a mixture of water and chloroform to form N-(O,O-dialkyl)phosphothreonine. Reaction of phosphothreonine with nucleosides at room temperature for a week in anhydrous organic solvent (pyridine) led to modest, but detectable, formation of nucleotides, short oligonucleotides, and phosphoryl peptides. The presence of the unstable N-phosphoryl group (which is even more unstable in water) promotes intramolecular cyclization and activation of the carboxyl group. Attack of the amino group on the carboxylic group of the amino acids then drives formation of peptides. Moreover, through an interactive loop, pyrimidine (C or U) nucleosides may facilitate peptide condensation. This is a creative link between RNA and peptide synthesis, which at the same time can allow the 5′ hydroxyl of uridine to attack phosphothreonine, leading in a three-step pathway to formation of uridine 5′-monophosphate, one of the four RNA nucleotides (8).

Fundamentally, the N-phosphoamino acid provides a strong activating group, such as diisopropylpyrophosphate, permitting transfer of the phosphate from the amino acid to other intermediates, although these reactions would be visible only in the absence of water. Through a different kind of chemistry, work in my lab (9) and others (reviewed in ref. 10) has demonstrated the high frequency in random sequence pools of novel ribozymes that use pyrophosphate-leaving groups to drive primitive biochemical reactions in aqueous solution. One of these selected RNAs offers another route to the synthesis of uridine 5′-monophosphate. It catalyzes formation of the glycosidic bond that links uracil to the sugar phosphate, using components similar to those that modern metabolism uses to make a nucleotide (11).

Test-Tube Evolution Experiments

As several theories about the origin of the genetic code postulate some form of specific recognition between amino acids and short RNA sequences, in vitro selection of RNA molecules (aptamers) that bind amino acid ligands permits direct and unbiased assessment of the validity of each theory, transforming them into testable hypotheses. Specifically, in favor of Woese et al.’s (12) stereochemical theory, my laboratory found that arginine aptamers bear a significant overrepresentation of arginine codons at nucleotides either known or predicted to lie at the arginine binding sites. Seventy-two percent of these nucleotides are present in arginine codons, whereas 29% (approximately the value expected by chance) of the nucleotides outside of the binding sites are found in arginine codons. The probability of such a specific association occurring by chance is approximately three in a million. No set of codons for any other amino acid shows such a binding association with arginine (Fig. 1), nor does any other possible set of codons obeying the “4 + 2” pattern of all six-codon class amino acids (13). This finding strongly suggests a role of chemical determinism in shaping the codon assignments for this amino acid.
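The comparison described above can be sketched in two steps: flag which nucleotides of an aptamer lie inside an arginine codon, then test whether codon membership is independent of binding-site membership. The sketch below rests on stated assumptions: the function names are hypothetical, a nucleotide is counted as "in a codon" if it falls within a matching triplet in any reading frame, and the counts in the table are invented to reproduce the 72% and 29% proportions. They are not the published data, so the resulting statistic will not equal the reported value.

```python
import math

# The six arginine codons: the CGN box (the "4") plus AGA and AGG (the "2").
ARG_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}


def in_codon_mask(seq, codons=ARG_CODONS):
    """Flag each nucleotide lying inside any listed triplet.

    Assumption: triplets are scanned in every reading frame.
    """
    mask = [False] * len(seq)
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in codons:
            mask[i] = mask[i + 1] = mask[i + 2] = True
    return mask


def g_statistic(table):
    """G statistic for independence in a 2x2 table of counts."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # a zero cell contributes nothing
                expected = row_totals[i] * col_totals[j] / n
                total += observed * math.log(observed / expected)
    return 2.0 * total


# Hypothetical counts (NOT the published data):
# rows = binding-site / other nucleotides,
# columns = inside / outside an arginine codon.
counts = [[18, 7],    # 18 of 25 = 72%
          [29, 71]]   # 29 of 100 = 29%
g = g_statistic(counts)
mask = in_codon_mask("AACGUAA")  # toy sequence containing one CGU
```

Under this layout, a large G means that codon membership and binding-site membership are associated more strongly than independent assignment would produce.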

Figure 1.

Only arginine codons show a significant affinity for arginine. A one-tailed G test of independence, not corrected for multiple comparisons, was used to detect strength of association between each amino acid’s codon class and arginine. Significantly more arginine codons were present at arginine binding sites than expected by chance (G = 20.2; P = 3.4 × 10⁻⁶). Dotted lines and shaded regions indicate P > 0.05 and P > 0.01, respectively. Even the codons for proline are not significant because an association with P > 0.01 is likely to occur by chance in a set of 21 comparisons (13).
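Two pieces of arithmetic behind the caption can be checked with a short sketch. It assumes that the G statistic is referred to a chi-square distribution with 1 degree of freedom, that the one-tailed P is half of the upper-tail probability, and that the 21 comparisons are independent; none of these details is spelled out in the caption, and the function names are hypothetical.

```python
import math


def g_to_p_one_tailed(g):
    """One-tailed P for a G statistic, assuming 1 degree of freedom.

    The chi-square (1 df) upper tail equals erfc(sqrt(g / 2));
    the one-tailed value is taken here as half of that.
    """
    return 0.5 * math.erfc(math.sqrt(g / 2.0))


def chance_of_any_hit(alpha, n_tests):
    """Probability that at least one of n independent tests falls
    below alpha purely by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests


p_arg = g_to_p_one_tailed(20.2)          # reported G for arginine codons
any_at_05 = chance_of_any_hit(0.05, 21)  # 21 uncorrected comparisons
any_at_01 = chance_of_any_hit(0.01, 21)
```

Under these assumptions, p_arg comes out on the order of 3 × 10⁻⁶, in line with the reported value, while a chance result below 0.05 somewhere among 21 comparisons is more likely than not. This is why a weak association such as the one for proline is not treated as significant.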

Yet all test-tube evolution experiments are haunted by the assumption that the protocol provides a suitable proxy for a prebiotic environment. The emergence in the laboratory of precisely the same sequence motifs that nature chose to encode arginine could be called a striking, if not astounding, coincidence (14), but our conclusions were robust to all negative controls, including permutations of arginine codons, randomization of the aptamer sequence data, and comparison against nonarginine aptamers, suggesting that it is not a statistical fluke. We also could reject a role for specific recognition of other signature motifs, like anticodons, in this case, although this does not preclude such a mode of binding between RNA and other amino acids (13).

Arginine is indeed a special amino acid, possibly even a late addition to the genetic code (15). A few specific RNA aptamers have been selected to hydrophobic amino acids (16–18), and several laboratories are actively engaged in similar experiments. Hence any generalizations to the rest of the genetic code must anxiously await isolation and structural analysis of RNA aptamers to amino acids other than arginine.

Acknowledgments

I thank R. Knight, L. Ribas de Pouplana, A.-L. Reysenbach, and Y.-F. Zhao for lively discussion. This research was supported by National Science Foundation Grant MCB-9604377.

References

