The discovery of catalytic activity in RNA (1) has prompted serious attempts to decipher the mechanisms by which RNA molecules assemble into functional native states starting from linear strands (2). The outlines of the folding mechanisms of large RNA molecules are beginning to emerge thanks to several novel experiments (3–7) and theoretical arguments (8) originally advanced in the context of protein folding. It is becoming increasingly clear that the folding kinetics of RNA is complex (3–7), involving parallel pathways (5) and kinetic traps. These complexities arise because the underlying energy landscape governing RNA folding is rugged. The most dramatic finding, common to most of the recent experimental studies (5, 6), is that the slow processes in the transition to the functionally competent state of RNA involve transitions from misfolded structures. These structures, which have many elements in common with the native state, serve as kinetic traps for the majority of the initial population of molecules (5, 8).
Implicit in all of the proposed folding mechanisms is the assumption that there are two major structural changes in RNA en route to the native state, starting from an ensemble of unfolded molecules. It is supposed that stable secondary structures form rapidly, perhaps on microsecond time scales. The subsequent assembly leading to tertiary folding takes place by bringing the secondary structural elements together. The resulting kinetics of the assembly of the secondary structural elements is complex because of inherent topological frustration (8). The evidence for the two-step process (reminiscent of framework-like models in protein folding), namely, fast secondary structure formation followed by a slower acquisition of tertiary interactions, came from the folding studies of tRNA (9). In an illuminating paper in this issue of the Proceedings, Wu and Tinoco (10) show that this commonly held picture of RNA assembly may not always be correct.
The NMR experiments of Wu and Tinoco were performed on a fragment of the Tetrahymena group I self-splicing intron, which has served as the prototypical model for examining the folding kinetics of large RNA (3–5, 7). This intron consists of two large domains (base-paired P regions), namely, P4-P6 and P1-P2/P3-P9 (11). The resolution of the crystal structure of the P4-P6 domain (12, 13) has provided considerable insight into the folding process of the intron. In the stable P4-P6 domain, which can fold independently (14), reside the three helices (P5a, P5b, and P5c) that form the three-helix junction P5abc. By rapid time-resolved mapping of the surface area of RNA accessible to hydroxyl radicals generated by x-rays, Sclavi et al. (7) concluded that the 56-nt P5abc subdomain forms very rapidly. The rest of the P4-P6 domain nucleates around this scaffold. The rapid folding of P5abc, together with the earlier work showing that this subdomain can fold independently (14), makes the study of the folding of P5abc important in understanding the assembly of P4-P6 and ultimately the intron itself. This “bottom-up” strategy is used by Wu and Tinoco (10).
The secondary structure of the P5abc subdomain in the absence of Mg2+ ions, which are needed to initiate tertiary folding of group I introns (11), was determined by NMR (10). This structure is substantially different from the secondary structure seen in the x-ray crystal structure. The calculated free energy of the solution structure is lower than that of the crystal structure. Upon addition of a sufficient amount of Mg2+, the incorrect secondary structure undergoes substantial rearrangement en route to the folded conformation, in which the secondary structure coincides with that of the x-ray structure. Thus, under folding conditions the nucleation of the magnesium ion core that leads to the tertiary folding of the P5abc subdomain also triggers the formation of the native secondary structure. This is the central result in the paper by Wu and Tinoco (10).
The substantial rearrangements in the secondary structure of P5abc, when a sufficient amount of Mg2+ is added to drive folding, are explained by supposing that favorable tertiary interactions compensate for the loss of free energy in the breakup of non-native secondary structure base pairs (10). It follows that such rearrangements are thermodynamically feasible only if $N_t \epsilon_t > N_s \epsilon_s$, where $N_s$ is the number of lost secondary structure base pairs, $\epsilon_s$ is an average free energy for base pair formation, $N_t$ is the net gain in the number of favorable tertiary interactions, and $\epsilon_t$ is the associated free energy gain. For simplicity we have assumed that $\epsilon_t$ and $\epsilon_s$ do not depend on the nature of the tertiary or secondary interactions. The thermodynamic requirement for the structural rearrangement suggests that if $N_s$ is large, as is the case in P5abc (10), then a very compact tertiary fold containing many favorable tertiary interactions is needed to compensate for the lost secondary base-pair interactions. This is unlikely to happen when the number of nucleotides is too small. On the other hand, topological frustration would prevent substantial structural reorganization in large RNA (8). Thus, it is logical to postulate that an optimal length of the sequence and the compactness of the tertiary fold are the factors that determine the order of events in RNA folding.
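The feasibility condition above can be made concrete with a minimal sketch. The function names and the numerical values are hypothetical and chosen only for illustration; they are not taken from Wu and Tinoco (10). Both free energies are treated as positive magnitudes per interaction, in the same (arbitrary) units.

```python
import math


def rearrangement_is_feasible(n_t, eps_t, n_s, eps_s):
    """Return True when the tertiary gain outweighs the secondary loss.

    Implements the strict inequality N_t * eps_t > N_s * eps_s, with
    eps_t and eps_s given as positive magnitudes (an assumption of this
    sketch, not a convention stated in the text).
    """
    return n_t * eps_t > n_s * eps_s


def min_tertiary_contacts(n_s, eps_s, eps_t):
    """Smallest integer N_t that satisfies N_t * eps_t > N_s * eps_s."""
    return math.floor(n_s * eps_s / eps_t) + 1


# Hypothetical illustration: 6 lost base pairs at 2.0 units each,
# compensated by tertiary interactions worth 1.5 units each.
needed = min_tertiary_contacts(n_s=6, eps_s=2.0, eps_t=1.5)
print(needed, rearrangement_is_feasible(needed, 1.5, 6, 2.0))
```

With these illustrative numbers the required number of tertiary interactions exceeds the number of base pairs lost, which is the sense in which a large $N_s$ calls for a very compact tertiary fold.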
There are mechanistic implications of the findings by Wu and Tinoco (10) as well. These become particularly clear if analogies to protein folding, which seem to be emerging from theoretical models (8) and recent experiments (3–6, 15, 16) on large RNA, are made. Wu and Tinoco estimated, based on chemical shifts and line broadening, that the conversion between the unfolded (at zero Mg2+ concentration) and folded (at 30–40 mM Mg2+) three-helix junction takes place in about 33 msec, which is not inconsistent with the inference made by Sclavi et al. (7). A theoretical estimate for this folding time, $\tau_F$, can be made if we assume that a nucleation-collapse mechanism in the presence of Mg2+ drives the tertiary folding of P5abc. It has already been proposed (4) that the consolidation of the magnesium ion core (five discrete Mg2+ bound to various regions in the P5abc subdomain) nucleates the tertiary folding of not only P5abc but of the entire P4-P6 domain. If the nucleation-collapse mechanism is operative, then the tertiary folding time is estimated to be (17) $\tau_F \approx \tau_0 N^{4.2}$, where $N$ is the number of nucleotides and $\tau_0 \approx 10^{-9}$ s (a function of viscosity, persistence length of the polyion, and effective surface tension). For the 56-nt P5abc we find $\tau_F \approx 20$ msec, which is in reasonable agreement with experiments. The associated barrier to nucleation is about 10 kcal/mol. The preceding arguments and previous experiments suggest that, in all likelihood, tertiary folding of P5abc occurs by a nucleation-collapse mechanism. A major implication of this is that both thermodynamically and kinetically the folding of the isolated P5abc subdomain must exhibit two-state behavior. As far as we know this has not been established experimentally.
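The arithmetic behind the folding-time estimate can be checked with a short sketch. The scaling law, the exponent 4.2, the prefactor of 10^-9 s, and N = 56 come from the text. The barrier calculation rests on two assumptions that the text does not state: an Arrhenius-like relation, $\tau_F = \tau_0 \exp(\Delta F^{\ddagger}/RT)$, and a temperature of 298.15 K. The function names are hypothetical.

```python
import math

TAU_0 = 1e-9            # s, prefactor quoted in the text
EXPONENT = 4.2          # scaling exponent quoted in the text (ref. 17)
R_KCAL = 1.987204e-3    # gas constant in kcal/(mol K)


def folding_time(n_nucleotides, tau_0=TAU_0, exponent=EXPONENT):
    """Tertiary folding time in seconds from tau_F ~ tau_0 * N**exponent."""
    return tau_0 * n_nucleotides ** exponent


def implied_barrier(tau_f, tau_0=TAU_0, temperature=298.15):
    """Barrier in kcal/mol implied by tau_f.

    Assumes tau_f = tau_0 * exp(barrier / RT); both this relation and the
    default temperature are assumptions of this sketch, not of the source.
    """
    return R_KCAL * temperature * math.log(tau_f / tau_0)


tau_f = folding_time(56)          # 56-nt P5abc subdomain
barrier = implied_barrier(tau_f)
print(f"tau_F ~ {tau_f * 1e3:.0f} ms, barrier ~ {barrier:.1f} kcal/mol")
```

Under these assumptions the sketch gives a folding time of roughly 22 msec and a barrier close to 10 kcal/mol, consistent with the values quoted above.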
The anticipated nucleation-collapse mechanism in tertiary folding of P5abc invites comparisons to the folding of small proteins. Fersht and coworkers (18) have shown, by using protein engineering methods, that the 64-residue chymotrypsin inhibitor 2 (CI2) folds by a nucleation-collapse mechanism in about 15 msec under optimal folding conditions. The wealth of data on this protein suggests that in such small proteins secondary structure formation, collapse, and tertiary structure formation occur almost synchronously (18). The plausible similarity in the folding mechanism between CI2 and P5abc suggests that native secondary and tertiary structures form on very similar time scales in this subdomain.
The two-state nucleation-collapse mechanism for tertiary folding of P5abc implies that the bottleneck for folding is associated with the search for the critical nucleus or a set of critical nuclei. Once the nuclei are located, the folded state would be reached extremely rapidly with overwhelming probability. In Fig. 1 the interconversion between the unfolded and folded states of the P5abc subdomain is schematically sketched. The structure(s) of the transition state(s) and their free energies would then determine the folding routes. If the structures near the bottleneck are similar, then one might have a relatively unique transition state, as is often the case in small molecule reactions. However, if there is diversity in the transition state structures, then it is more fruitful to think in terms of transition state ensembles (19, 20). Thus, it is of vital interest to probe the degree of heterogeneity of the transition states. In protein folding, where the major driving force arises from the need to form a hydrophobic core, the characterization of transition states for two-state proteins has not provided a clear picture of the structures near the bottleneck. Although the formation of the magnesium ion core may be the logical analogue of the hydrophobic core in RNA folding (4), the transition states in RNA may be easier to decipher than in proteins. The primary reason is the profound difference in the nature of the driving force. Because RNA is highly charged, “polyelectrolyte effects” (condensation of counterions that can drastically alter packing interactions) dominate (21). It has been suggested (22) that if the interactions stabilizing the core of the nucleus are sufficiently strong, as is the case in the electrostatically dominated folding of RNA, then one expects a structurally more homogeneous transition state ensemble.
Because discrete binding of magnesium ions is the major driving force for tertiary folding of the P5abc subdomain, we conjecture that the nucleus in this case is less dispersed than in proteins (21, 22). This does not imply that mutating the nucleotides to which the magnesium ions are coordinated would abolish assembly of the subdomain altogether. In fact, the recent experiments of Treiber et al. (16) show that although certain mutations destabilize the P5abc subdomain, they lead to an overall enhancement of the folding of the entire intron. This suggests that there is diversity in the folding nuclei. However, because of the polyelectrolyte effect in RNA folding there may not be much structural diversity in the transition state structures. In addition, the need to form a tight, fully coordinated magnesium ion core (4) and the complementarity of base-pair interactions further constrain the degree of heterogeneity of the transition state ensemble. These factors may confer greater specificity on the folding nuclei in RNA than in proteins.
The dissection of the structures of the P5abc subdomain by Wu and Tinoco (10) using NMR offers a unique opportunity to answer some of the questions posed in the form of an illustration in Fig. 1. In particular, it would be of great interest to probe the kinetics of the wild-type and mutant P5abc subdomains cloned by Treiber et al. (16). Such RNA engineering experiments would offer us a glimpse of the transition states in tertiary folding of not only the P5abc subdomain but also of the intron itself. Cumulatively, these experiments and a new class of fast folding experiments, similar to those on proteins (23), will go a long way toward providing a fuller understanding of the way large RNA molecules fold. In the process, one may better appreciate the common themes used by nature to fold biomolecules.
Footnotes
A commentary on this article begins on page 11555.
References
- 1. Cech T R. In: The RNA World. Gesteland R F, Atkins J F, editors. Plainview, NY: Cold Spring Harbor Lab. Press; 1993. pp. 239–269.
- 2. Brion P, Westhof E. Annu Rev Biophys Biomol Struct. 1997;26:113–137. doi: 10.1146/annurev.biophys.26.1.113.
- 3. Zarrinkar P P, Williamson J R. Nat Struct Biol. 1996;3:432–438. doi: 10.1038/nsb0596-432.
- 4. Cate J H, Hanna R L, Doudna J A. Nat Struct Biol. 1997;4:553–558. doi: 10.1038/nsb0797-553.
- 5. Pan J, Thirumalai D, Woodson S A. J Mol Biol. 1997;273:7–13. doi: 10.1006/jmbi.1997.1311.
- 6. Pan T, Sosnick T R. Nat Struct Biol. 1997;4:931–938. doi: 10.1038/nsb1197-931.
- 7. Sclavi B, Sullivan M, Chance M R, Brenowitz M, Woodson S A. Science. 1998;279:1940–1943. doi: 10.1126/science.279.5358.1940.
- 8. Thirumalai D, Woodson S A. Acc Chem Res. 1996;29:433–439.
- 9. Saenger W. Principles of Nucleic Acid Structure. New York: Springer; 1984.
- 10. Wu M, Tinoco I Jr. Proc Natl Acad Sci USA. 1998;95:11555–11560. doi: 10.1073/pnas.95.20.11555.
- 11. Celander D W, Cech T R. Science. 1991;251:401–407. doi: 10.1126/science.1989074.
- 12. Cate J H, Gooding A R, Podell E, Zhou K, Golden B L, Kundrot C E, Cech T R, Doudna J A. Science. 1996;273:1678–1685. doi: 10.1126/science.273.5282.1678.
- 13. Cate J H, Gooding A R, Podell E, Zhou K, Golden B L, Szewczak A A, Cech T R, Doudna J A. Science. 1996;273:1696–1699. doi: 10.1126/science.273.5282.1696.
- 14. Murphy F L, Cech T R. Biochemistry. 1993;32:5291–5300. doi: 10.1021/bi00071a003.
- 15. Draper D E. Trends Biochem Sci. 1996;21:145–149.
- 16. Treiber D K, Rook M S, Zarrinkar P P, Williamson J R. Science. 1998;279:1943–1946. doi: 10.1126/science.279.5358.1943.
- 17. Thirumalai D. J Phys (Orsay, France) I. 1995;5:1457–1467.
- 18. Itzhaki L S, Otzen D E, Fersht A R. J Mol Biol. 1995;254:261–288. doi: 10.1006/jmbi.1995.0616.
- 19. Wolynes P G. Proc Natl Acad Sci USA. 1997;94:6170–6175. doi: 10.1073/pnas.94.12.6170.
- 20. Onuchic J N, Socci N D, Luthey-Schulten Z, Wolynes P G. Folding Design. 1996;1:441–450. doi: 10.1016/S1359-0278(96)00060-0.
- 21. Conn G L, Draper D E. Curr Opin Struct Biol. 1998;8:278–285. doi: 10.1016/s0959-440x(98)80059-6.
- 22. Klimov D K, Thirumalai D. J Mol Biol. 1998, in press.
- 23. Eaton W A, Munoz V, Thompson P A, Chan C-K, Hofrichter J. Curr Opin Struct Biol. 1997;7:10–14. doi: 10.1016/s0959-440x(97)80003-6.