PLOS Computational Biology. 2012 May 31;8(5):e1002534. doi: 10.1371/journal.pcbi.1002534

Thermodynamic Basis for the Emergence of Genomes during Prebiotic Evolution

Hyung-June Woo, Ravi Vijaya Satya, Jaques Reifman
Editor: Carl T Bergstrom
PMCID: PMC3364946  PMID: 22693440

Abstract

The RNA world hypothesis views modern organisms as descendants of RNA molecules. The earliest RNA molecules must have been random sequences, from which the first genomes that coded for polymerase ribozymes emerged. The quasispecies theory by Eigen predicts the existence of an error threshold limiting genomic stability during such transitions, but does not address the spontaneity of changes. Following a recent theoretical approach, we applied the quasispecies theory combined with kinetic/thermodynamic descriptions of RNA replication to analyze the collective behavior of RNA replicators based on known experimental kinetics data. We find that, with increasing fidelity (relative rate of base-extension for Watson-Crick versus mismatched base pairs), replications without enzymes, with ribozymes, and with protein-based polymerases are above, near, and below a critical point, respectively. The prebiotic evolution therefore must have crossed this critical region. Over large regions of the phase diagram, fitness increases with increasing fidelity, biasing random drifts in sequence space toward ‘crystallization.’ This region encloses the experimental nonenzymatic fidelity value, favoring evolutions toward polymerase sequences with ever higher fidelity, despite error rates above the error catastrophe threshold. Our work shows that experimentally characterized kinetics and thermodynamics of RNA replication allow us to determine the physicochemical conditions required for the spontaneous crystallization of biological information. Our findings also suggest that among many potential oligomers capable of templated replication, RNAs may have evolved to form prebiotic genomes due to the value of their nonenzymatic fidelity.

Author Summary

A leading hypothesis for the origin of life describes a prebiotic world where RNA molecules started carrying genetic information for catalyzing their own replication. This origin of biological information is akin to the crystallization of ice from water, where ‘order’ emerges from ‘disorder.’ What does the science of such phase transformations tell us about the emergence of genomes? In this paper, we show that such thermodynamic considerations of RNA synthesis, when combined with kinetics and population dynamics, lead to the conclusion that the ‘crystallization’ of genomes from their basic elements would have been spontaneous for RNAs, but not necessarily for other potential building blocks of genomes in the prebiotic soup.

Introduction

All biological organisms are evolutionarily related. The salient characteristics of life (reproduction and selection) must therefore have emerged either gradually or abruptly from inanimate chemical processes some time in the early history of the Earth. Our ever-increasing knowledge of the biochemical and genetic basis of modern life forms should guide the quest to understand this transition, in addition to the chemistry of potential building blocks [1], [2] and geochemical considerations [3], [4]. The lack of fossil evidence forces us to rely on model building, which can often be tested experimentally in the laboratory [5]. One of the simplest and most promising models is the RNA world hypothesis [1], [6], [7], which proposes RNA molecules as precursors to modern life forms consisting of DNAs as carriers of genomes and proteins as molecular machines. Continued progress in experimental studies has yielded a diverse range of evidence supporting this hypothesis. In particular, plausible synthetic routes to nucleotides [2] and oligomers [8] have been demonstrated. RNA ribozymes capable of catalyzing RNA replications have been designed and synthesized via in vitro selection [9], [10]. Extensive studies of RNA folding landscapes further demonstrate the capability of RNAs to function as carriers of both genotypes and phenotypes [11], [12].

Conceptual difficulties with this scenario include the need for the existence of sufficiently concentrated and pure building blocks (chirally selected nucleotides for RNAs) and the necessity to explain the subsequent evolution of multi-chemical autocatalytic systems [13]: the incorporation of proteins and nonreplicative metabolic networks. In this context, Nowak and Ohtsuki recently considered a model describing a pre-evolutionary stage with nonreplicative chemical selection [14]. The undeniable strength of the RNA world hypothesis, nevertheless, is that it has the potential to provide an empirically well-tested pathway for the transition from chemistry to biology, irrespective of its factual historical relevance. The relative simplicity of the model should also allow quantitative descriptions that can complement empirical approaches.

Our focus in this paper, in particular, is the transition from the first RNA molecules formed, which must have been pools of near-random RNA sequences, to the first genomes coding for RNA ribozymes. Crucial in understanding such an emergence of the first RNA genomes is the error threshold predicted by the quasispecies theory [15]–[17]. At this threshold, the structure of a population of RNA sequences shifts from being dominated by a stable genome (‘master sequence’) to becoming random pools, or vice versa. This transition can also be described and understood in the context of more general population dynamics models [18], [19], for which many exact results have now been obtained based on statistical physics approaches [17], [20]–[24]. The error catastrophe transition is in the forward direction, and has thus been likened to ‘melting’ by Eigen [15]. The transition has recently been observed in behaviors of modern RNA viruses exposed to mutagens [25], [26]: a moderate artificial increase in mutation rates of viruses can lead to a complete extinction of virus populations. The error threshold is roughly proportional to the inverse of genome length, which also raised the question of how genomes long enough to encode error correction could have evolved under high error rates (Eigen's paradox) [15], [27], [28]. Notably, Saakian et al. [29] have recently applied analytical treatments of quasispecies theory to consider this question. Higher organisms keep error rates down to levels that are orders of magnitude lower than achievable by polymerases only, using sophisticated error correction mechanisms including mismatch repair complexes. Tannenbaum et al. [30], [31] have studied the quasispecies models of organisms possessing mismatch repair genes, finding transitions analogous to the classic error catastrophe transition in repair-deficient mutator frequencies.
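The inverse-length scaling of the error threshold can be illustrated with the textbook single-peak estimate from quasispecies theory, in which a master sequence of length L with selective advantage sigma persists only while sigma(1 − eps)^L > 1. The sketch below is illustrative only: the function name `error_threshold` and the value of `sigma` are assumptions introduced here, not quantities taken from this paper.

```python
import math


def error_threshold(length, sigma):
    """Critical per-base error rate for a single-peak fitness landscape.

    Solves sigma * (1 - eps)**length = 1 for eps (textbook estimate).
    `sigma` is the assumed selective advantage of the master sequence.
    """
    if sigma <= 1.0:
        raise ValueError("sigma must exceed 1 for a master sequence to persist")
    return 1.0 - sigma ** (-1.0 / length)


# For long genomes the threshold approaches ln(sigma) / length.
for length in (10, 100, 1000):
    eps_c = error_threshold(length, sigma=10.0)  # sigma = 10 is hypothetical
    print(length, eps_c, math.log(10.0) / length)
```

Under this estimate, a tenfold increase in genome length lowers the tolerable error rate by roughly a factor of ten, which is the tension behind Eigen's paradox.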

The prebiotic evolution in the RNA world is in the opposite direction of the error catastrophe transition, and may thus be referred to as ‘crystallization.’ In an equilibrium fluid, whether one observes melting or crystallization is determined by the changes in temperature and pressure. Can we find analogous conditions for the emergence of the first genomes? Addressing this question requires connections to thermodynamics of RNA synthesis. Recent developments in the theory of nucleotide strand replication [32]–[34] provide a promising new direction to bridge the gap between the basic chemical thermodynamics of RNA synthesis and molecular evolution. The mean error rate of replication increases as the reaction condition approaches equilibrium, contributing to entropy production [32]. With a combination of this single-molecule thermodynamics and quasispecies theory, a surprisingly complete analogy to equilibrium fluids was proposed [34], where volume, pressure, and temperature are replaced by replication velocity, thermodynamic force, and inverse fidelity, respectively, with counterparts of condensation, sublimation, critical point, and triple point. Based on the analysis of a model replication kinetics equivalent to the Jukes-Cantor model of DNA evolution [35], it was suggested that the prebiotic evolution of RNA strands may have been biased by a thermodynamic driving force toward increasingly higher fidelity of polymerase ribozymes below a certain threshold [34].

To what extent these theoretical predictions are applicable to the actual prebiotic evolution that occurred in the past must ultimately be judged based on quantitative empirical data from existing and new experiments. Here, we extend our previous work [34] and assess the applicability of this thermodynamic theory of molecular evolution to prebiotic evolution, using experimental data for polymerization kinetics currently available in the literature. Our results based on these empirical data provide strong support for the main conclusion of the theory, that there is a thermodynamic driving force biasing random sequence evolutions in the absence of genomes toward higher fidelity in a certain regime of parameter space. With considerations of the time-dependent evolutionary behavior of RNA populations, we furthermore show that it is possible to estimate the time scales that would have been required for a random sequence pool to crystallize a newly discovered master sequence under a given thermodynamic condition. These results also shed new light on Eigen's paradox. Most importantly, our approach enlarges the scope of both the quasispecies theory-based discussions of the stability of genomes and biochemical approaches to RNA replication by introducing the concept of thermodynamic driving forces and constraints in molecular evolution.

Results/Discussion

RNA replication kinetics and thermodynamics

The thermodynamic theory of molecular evolution [34] combines the kinetics and thermodynamics of RNA replication on a single-molecule level with population-level features. We first consider the molecular level description of RNA synthesis (or elongation): an elementary step of insertion by addition of a nucleotide (Figure 1A) consumes a nucleoside triphosphate (NTP) and produces a pyrophosphate (PPi). Its driving force Inline graphic is given by

[Equation (1)]

where Inline graphic is defined such that Inline graphic at equilibrium (see Methods). We may estimate the equilibrium constant from Inline graphic of DNA phosphodiester bond formation and Inline graphic of Inline graphic (NMP: nucleoside monophosphate), yielding Inline graphic [36], [37]. This value likely overestimates the magnitude of Inline graphic because it ignores the unfavorable entropy change of binding a free NTP monomer, leading to Inline graphic.

Figure 1. Kinetics and thermodynamics of RNA replication.


A: Single-molecule kinetics. B: Population dynamics.

One may seek the origin of the observed high fidelity of polymerization reactions [38] in the thermal instability of mismatched base pairs relative to Watson-Crick pairs. However, the stability differences between correctly and incorrectly inserted nucleotide pairs are small: an experimental estimate based on melting temperature measurements for the difference in free energy between incorrect and correct pairs yielded Inline graphic [39], which we adopted in this work. This value is the average of the relative stabilities of nucleotides G, C, and T (Inline graphic, Inline graphic, and Inline graphic, respectively [39]) with respect to A in a DNA 9-mer duplex terminus against the template base T. The precise value depends on the identity of the base pairs at the terminus and at the neighboring position immediately upstream: for DNAs, duplex stabilities including effects of mismatches can be reliably estimated based on nearest neighbor interactions [40]. Longer-ranged interactions presumably play more important roles for RNAs, which form secondary structures and higher-order folds [41], affecting Inline graphic values. Freier et al. [42] provided values of free energy contributions to the duplex stability from all 16 possible terminal RNA base pairs and mismatches next to 4 distinct base pairs upstream (Table 4 in Ref. [42]). From these data, we calculated Inline graphic, comparable to Inline graphic.
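To see why a small stability difference cannot by itself account for high fidelity, one can convert a free energy difference into the equilibrium population ratio exp(−ΔΔG/RT). The sketch below is a minimal illustration; the penalties in the loop are hypothetical placeholders of about 1 kcal/mol, not the measured values cited above, and `boltzmann_ratio` is a name introduced here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def boltzmann_ratio(delta_g_kcal, temperature_k=298.15):
    """Equilibrium population ratio exp(-dG/RT) for a free energy difference dG."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


# Hypothetical mismatch stability penalties in kcal/mol (illustrative only).
for ddg in (0.5, 1.0, 2.0):
    print(ddg, boltzmann_ratio(ddg))
```

A penalty near 1 kcal/mol suppresses mismatches at equilibrium by less than one order of magnitude, far short of the kinetic selectivity reported for polymerases.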

In quantitative descriptions encompassing both the high kinetic selectivity and this marginal stability difference, it is important to fully take into account the reversibility of the reactions [32]. We adopt the simplest description of the kinetics of polymerization, specified by 16 forward and reverse rates, Inline graphic and Inline graphic, respectively, each corresponding to the insertion and its reverse of a nucleotide (Inline graphic) against a template base (Inline graphic; Figure 1A). In reality, these rates do depend on the identity of base pairs immediately upstream [40], [41], which may lead to stalling after incorrect incorporations [28]. More importantly, however, these rates also depend on Inline graphic and Inline graphic. We estimated the forward rates from the available experimental data of primer extension under the far-from-equilibrium limiting condition [9], [28], [43]–[49]. The backward rates can then be related to the forward rates via equilibrium stability.

In general, the overall elongation reaction of a single nucleotide goes through a transition state, whose activation energy is differentially affected by the action of polymerases. If one ignores the reverse reaction under the condition of Inline graphic, the Michaelis-Menten kinetics applies for the primer extension. In the limit of small Inline graphic, we then have Inline graphic, the latter representing the apparent second-order rate constant with the substrate dissociation constant Inline graphic and the turnover rate of product formation Inline graphic [50]. Measurements of polymerase-catalyzed reactions show the selectivity reflected in differences in Inline graphic for correct and incorrect base pairs to be orders of magnitude larger than equilibrium stability differences [39]. Examples currently found in the literature are shown in Tables 1 and 2, including those for activated nonenzymatic polymerization (DNA replication without enzymes) determined recently by Chen et al. [28]. Table 1, in particular, shows the dramatic increase in the degree of relative stabilization of the transition states for correct base pairs in modern polymerases. The evolution of polymerases has entailed two aspects: the facilitation of the overall elongation rate and the amplification of the preferential attachment of correct versus incorrect nucleotides. As we show below, this latter aspect of selectivity evolution leads to a phase transition-like behavior, profoundly affecting population dynamics of evolving macromolecules.
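The low-substrate limit invoked above follows from the standard Michaelis-Menten rate law. As a brief worked step, with $k_{\rm pol}$ and $K_d$ assumed here to stand for the turnover rate and the substrate dissociation constant named in the text:

```latex
v \;=\; \frac{k_{\rm pol}\,[{\rm NTP}]}{K_d + [{\rm NTP}]}
\;\;\xrightarrow{\;[{\rm NTP}] \,\ll\, K_d\;}\;\;
\frac{k_{\rm pol}}{K_d}\,[{\rm NTP}]
```

The ratio $k_{\rm pol}/K_d$ therefore acts as an apparent second-order rate constant, which is the quantity tabulated for each nucleotide and template base combination.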

Table 1. Reference base incorporation rates Inline graphic of NTPs (rows) against template bases (columns).

A. Nonenzymatic [28]
A T G C
ATP Inline graphic Inline graphic Inline graphic Inline graphic
TTP Inline graphic Inline graphic Inline graphic Inline graphic
GTP Inline graphic Inline graphic Inline graphic Inline graphic
CTP Inline graphic Inline graphic Inline graphic Inline graphic

Rates are defined as the apparent second order rate constant Inline graphic (or the limit of Inline graphic for small [NTP]) in units of Inline graphic.

¹ For poliovirus 3Dpol, the mismatch rate has been reported for only one combination Inline graphic. We assumed that the same ratio Inline graphic applies to all NTPs for each template base. The value for Inline graphic is a harmonic mean of two data (Inline graphic and Inline graphic).

Table 2. Reference base incorporation rates for DNA polymerases.

A. Sulfolobus solfataricus P2 DNAP IV (Dpo4) [44]
A T G C
ATP Inline graphic Inline graphic Inline graphic Inline graphic
TTP Inline graphic Inline graphic Inline graphic Inline graphic
GTP Inline graphic Inline graphic Inline graphic Inline graphic
CTP Inline graphic Inline graphic Inline graphic Inline graphic

Rates are defined similarly in the same units as in Table 1.

¹ For pol Inline graphic, it was assumed that Inline graphic.

To characterize this dual aspect of enzyme-catalyzed polymerization reactions, we adopt a ‘reduced’ description involving two key characteristics of forward rates: the mean base incorporation rate Inline graphic and the relative inverse fidelity Inline graphic (the ratio of incorrect to correct insertion rates). Precise definitions of these quantities in terms of kinetic rates emerge from the mean field theory (see Methods):

[Equation (2a)]
[Equation (2b)]

where Inline graphic is the Watson-Crick complementary base of Inline graphic and the angled brackets denote a harmonic mean over distribution Inline graphic of template bases:

[Equation (3)]
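The reduced description rests on a weighted harmonic mean over template bases. The sketch below shows that average and one plausible way to summarize a 4×4 rate table as a mean correct rate and an incorrect-to-correct ratio. The function names, the dictionary layout `rates[(nucleotide, template)]`, and the per-template averaging of mismatch rates are assumptions made for illustration; they are not a transcription of Eqs. (2a) and (2b).

```python
BASES = "AUGC"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def harmonic_mean(values, weights):
    """Weighted harmonic mean 1 / sum_a (w_a / x_a), with weights normalized to 1."""
    total = sum(weights)
    return 1.0 / sum((w / total) / x for x, w in zip(values, weights))


def reduced_parameters(rates, template_freq):
    """Summarize rates[(nucleotide, template)] by two illustrative numbers.

    Returns the harmonic-mean correct insertion rate and the harmonic mean of
    the per-template ratio (average mismatch rate) / (correct rate).
    """
    correct = [rates[(COMPLEMENT[t], t)] for t in BASES]
    ratios = []
    for t, k_correct in zip(BASES, correct):
        wrong = [rates[(n, t)] for n in BASES if n != COMPLEMENT[t]]
        ratios.append(sum(wrong) / len(wrong) / k_correct)
    weights = [template_freq[t] for t in BASES]
    return harmonic_mean(correct, weights), harmonic_mean(ratios, weights)


# Hypothetical rate table: correct insertions at 1.0, mismatches at 0.01.
example = {(n, t): (1.0 if n == COMPLEMENT[t] else 0.01) for t in BASES for n in BASES}
print(reduced_parameters(example, {b: 0.25 for b in BASES}))
```

A harmonic mean is the natural average for rates of sequential steps, because the total time to copy a template is the sum of the waiting times at each site.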

Figure 2 shows the distribution of these quantities among nine polymerase systems whose polymerization kinetics have been determined experimentally (Tables 1 and 2), in which we observe qualitative trends of the evolutionary changes reflected in the values of Inline graphic and Inline graphic: the Inline graphic values of modern polymerases are Inline graphic times larger than the activated nonenzymatic rate, while the nonenzymatic fidelity (Inline graphic) implies that the Watson-Crick structure in the absence of enzymes already supports a fairly high level of fidelity. The arrows illustrate the direction of evolutionary changes that must have occurred from the nonenzymatic to protein-based polymerases via the polymerase ribozymes in the RNA world.

Figure 2. Variations of inverse fidelity Inline graphic and mean base incorporation rate Inline graphic among polymerases.


See Tables 1 and 2 for the references. Arrows show the likely direction of evolutionary changes.

The nonenzymatic data are for the templated oligomerization of activated nucleotide analogs, the nucleoside 5′-phosphorimidazolide, where PPi is replaced by the imidazole group [28]. Zielinski et al. have compared the kinetics of RNA versus DNA elongation of the activated system [51]. They concluded that RNA elongation is more efficient because its A-form helical structure positions the 3′-OH group towards the incoming monomer, whereas contributions of wobble-pairing appeared to facilitate mismatches. This study suggests that the nonenzymatic kinetic rates for RNAs may have higher Inline graphic and Inline graphic values than for DNAs. We nevertheless expect their orders of magnitude to be similar.

Mean field theory

The kinetic rates and thermodynamic conditions (the value of Inline graphic) allow us to extract, using simulations in general (see Methods), the main stationary properties of RNA elongation: the mean elongation velocity Inline graphic (the average number of nucleotide pairs added per unit time) and error rate Inline graphic (the average fraction of mismatched nucleotide pairs). They differ from their respective microscopic counterparts, Inline graphic and Inline graphic, because of varying contributions of the reverse rates as functions of Inline graphic. Importantly, exact analytic expressions for the stationary properties can be obtained if the kinetic rates have sufficient symmetry: the set of Inline graphic for all Inline graphic is independent of the identity of Inline graphic (‘symmetric template models;’ see Methods and Figure 3).

Figure 3. Numerical tests of mean field theory.


A: Three components of Inline graphic as functions of Inline graphic for a symmetric template model for which the mean field theory is exact. The rates were given by Inline graphic, Inline graphic, Inline graphic, and Inline graphic for Inline graphicA, U, G, C. Lines are from Eq. (40). Symbols are from numerical simulations. B: Test of site-independence for the sequence distribution, Eq. (35), with pol Inline graphic rates (Table 2F). All symbols were calculated from numerical simulations. C–D: Mean velocity (C) and error rate (D) for the pol Inline graphic kinetics, both with full experimental kinetics (Table 2) and Jukes-Cantor version (JC) derived from the full kinetic set. Symbols are from simulations, which verify that for JC kinetics the mean field prediction is exact.

In Ref. [34], an important special case of symmetric template models

[Equation (4)]

equivalent to the Jukes-Cantor model of DNA evolution [35], was considered. The Jukes-Cantor model is a two-parameter model, while general symmetric template models have four parameters.
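The distinction between Jukes-Cantor and general symmetric template models can be made concrete with a small rate-table builder and a symmetry check. This is a sketch under assumptions: the parameterization (one correct rate `k`, with every mismatch at `gamma * k`) and the function names are introduced here for illustration and may differ in detail from Eq. (4).

```python
BASES = "AUGC"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def jukes_cantor_rates(k, gamma):
    """Two-parameter table: rate k for Watson-Crick pairs, gamma * k for mismatches."""
    return {(n, t): (k if n == COMPLEMENT[t] else gamma * k)
            for t in BASES for n in BASES}


def is_symmetric_template(rates, tol=1e-12):
    """True if the set of four insertion rates is the same for every template base."""
    reference = sorted(rates[(n, BASES[0])] for n in BASES)
    for t in BASES[1:]:
        current = sorted(rates[(n, t)] for n in BASES)
        if any(abs(a - b) > tol for a, b in zip(reference, current)):
            return False
    return True


jc = jukes_cantor_rates(k=2.0, gamma=0.05)  # hypothetical parameter values
print(is_symmetric_template(jc))
```

Every Jukes-Cantor table passes the symmetry check, while a general 16-parameter empirical table usually does not, which is why the generalized mean field expressions are needed for experimental kinetics.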

However, to quantitatively assess the applicability of the theory based on empirical data of RNA replication kinetics, it is necessary to allow all 16 values of Inline graphic to be independent empirical parameters. Here, we used a version of the mean field theory that generalizes the analytic results with the following expressions for the elongation velocity Inline graphic and error rate Inline graphic (see Methods):

[Equation (5a)]
[Equation (5b)]
[Equation (5c)]
[Equation (5d)]

where Inline graphic denotes the probability to find Inline graphic base-paired to Inline graphic, and Eq. (5b), the normalization condition for Inline graphic, determines Inline graphic, the mean velocity of nucleotide addition against template base Inline graphic. Equation (5a) is a generalization of the equilibrium Boltzmann distribution, to which it reduces when Inline graphic, and is exact for symmetric template models (Figure 3). Because the complete reproduction of an RNA strand requires a pair of replications, we also considered the net error rate Inline graphic of two consecutive replications:

[Equation (6)]

For connections to thermodynamics, one must calculate the entropy production (in units of Inline graphic) per monomer addition [32]:

[Equation (7)]

where the first term in the square brackets represents the contribution of monomer consumption to dissipation and the second term corresponds to the disorder created by copying errors. The average in Eq. (7) is an arithmetic mean because Inline graphic is not a rate; in stationary states, this average should match the external thermodynamic force given by Eq. (1). The quantity Inline graphic is the free energy change in units of Inline graphic (or the negative of entropy production) for the addition of a nucleotide Inline graphic against Inline graphic, with which the backward rates Inline graphic are expressed in terms of forward rates Inline graphic via

[Equation (8)]
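A relation of this kind is a local detailed-balance condition. As a hedged sketch in assumed notation (forward rate $k$, backward rate $r$, and free energy change $g$ of the addition step in units of $k_B T$, with the nucleotide and template indices suppressed; these symbols are chosen here and need not match the paper's), it typically takes the form:

```latex
\frac{r}{k} \;=\; e^{\,g}
\qquad\Longrightarrow\qquad
r \ll k \;\;\text{for}\;\; g \ll 0 ,
\qquad
r = k \;\;\text{at}\;\; g = 0 .
```

A strongly favorable addition step thus makes the reverse reaction negligible, while a step with no free energy change is exactly balanced by its reverse.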

The dependence of Inline graphic on concentrations of monomers is nontrivial because four NTPs compete for a single site. Relative stabilities of correct versus mismatched base pairs (Inline graphic), in contrast, are expected to be largely insensitive to concentrations. The following form of Inline graphic reflects this expectation [34]:

[Equation (9)]

where Inline graphic, and the parameter Inline graphic accounts for the dependence of Inline graphic on concentrations (with mole fractions of NTPs assumed to be maintained equal during variations of [NTP]/[PPi]). With Inline graphic, we have Inline graphic at Inline graphic. Further physical insights into the free energy parameter Inline graphic can be gained by considering the condition of equilibrium (see Methods).

It can be shown that Inline graphic ranges from a minimum Inline graphic far from equilibrium (Inline graphic, Inline graphic), leading to Eq. (2b), to a maximum Inline graphic at equilibrium (Inline graphic) (see Methods). This variation of Inline graphic with varying thermodynamic force Inline graphic can be interpreted as follows: near equilibrium, both the correct (faster) and incorrect (slower) incorporation steps are balanced by their reverse steps, leading to comparable net incorporation statistics. Far from equilibrium, the reverse rates become negligible and the faster correct incorporation dominates.
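The trend described above can be reproduced with a small stochastic simulation of a growing copy strand. The following is a minimal Gillespie-style sketch for Jukes-Cantor-type kinetics, not the authors' simulation code. It assumes one correct insertion rate `kc`, three mismatches at rate `ki` each, and backward rates set by local detailed balance with a free energy `g_correct` (in units of kT) for a correct addition and an extra penalty `ddg` for a mismatch. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_elongation(kc, ki, g_correct, ddg, n_events, seed=0):
    """Simulate reversible templated elongation; return (velocity, error rate).

    The chain stores True for a correct pair and False for a mismatch.
    Only the terminal monomer can be removed, at a rate fixed by its type.
    """
    rng = random.Random(seed)
    rc = kc * math.exp(g_correct)        # removal rate of a terminal correct pair
    ri = ki * math.exp(g_correct + ddg)  # removal rate of a terminal mismatch
    chain = []
    t = 0.0
    for _ in range(n_events):
        add_correct, add_wrong = kc, 3.0 * ki
        remove = (rc if chain[-1] else ri) if chain else 0.0
        total = add_correct + add_wrong + remove
        t += rng.expovariate(total)      # waiting time to the next event
        u = rng.random() * total         # choose the event by its rate
        if u < add_correct:
            chain.append(True)
        elif u < add_correct + add_wrong:
            chain.append(False)
        else:
            chain.pop()
    length = len(chain)
    error_rate = chain.count(False) / length if length else float("nan")
    return length / t, error_rate


# Far from equilibrium versus closer to equilibrium (hypothetical parameters).
print(simulate_elongation(1.0, 0.01, g_correct=-20.0, ddg=1.0, n_events=50000))
print(simulate_elongation(1.0, 0.01, g_correct=0.5, ddg=1.0, n_events=50000))
```

With negligible reverse rates the error fraction approaches the kinetic limit 3·ki/(kc + 3·ki). As the reverse rates grow, the strand elongates more slowly and retains a larger fraction of mismatches, consistent with the interpretation given in the text.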

In Figure 3A, we show that the mean field theory is exact for arbitrary symmetric template models. Figure 3B supports the site-independence of Inline graphic [Eq. (35) in Methods] for more general 16-parameter cases. Comparisons of the mean field theory predictions for elongation properties of pol Inline graphic kinetics (Table 2) with simulations (Figure 3C–D) show that the theory generally gives reliable results. The Jukes-Cantor reduction of empirical rates [Eq. (4)] based on Eqs. (2) is also seen to give a good approximation over all parameter ranges, showing that the analytical theory developed in Ref. [34] provides accurate descriptions of realistic kinetics. Nevertheless, for the best numerical accuracy of predictions based on experimental kinetics, we based our main results in the following sections on stochastic simulations. Importantly, however, the mean field theory in the current application yields the definitions given by Eq. (2), in addition to the analytical limits of velocity and error rate (see Methods), which we verified exactly from simulations.

Single-molecule properties

We applied this single-molecule description of RNA replication to three experimental systems: nonenzymatic reactions [28], ‘Round-18’ (R18) polymerase ribozyme [9], and poliovirus polymerase (3Dpol) [43] (Figure 3), representing the beginning, intermediate, and late stages of evolution, respectively. As has been previously observed in Ref. [34] for the Jukes-Cantor model, the qualitative trend shown in Figure 4 parallels that of fluids undergoing vapor-liquid transitions with decreasing temperature when pressure, volume, and temperature are replaced by thermodynamic force Inline graphic, velocity Inline graphic, and inverse fidelity Inline graphic, respectively. The correspondence of Inline graphic to temperature in fluids, in particular, is natural because it is a microscopic measure of randomness destroying genomic information. Figure 4A,B shows that for high inverse fidelity Inline graphic (nonenzymatic), the elongation velocity Inline graphic and error rate Inline graphic monotonically increase and decrease, respectively, with increasing Inline graphic. A critical point is crossed (ribozyme) as Inline graphic decreases, and Inline graphic and Inline graphic become nonmonotonic (Inline graphic) with discontinuous jumps for decreasing Inline graphic (‘evaporation’). The error rate Inline graphic exhibits the same qualitative behavior (Figure 4B). These results verify the biological applicability of the theoretical predictions made previously in Ref. [34], based on known experimental kinetic data of systems representing key milestones of evolutionary processes (Figure 2).

Figure 4. Single-molecule elongation properties as functions of Inline graphic.


A–B: Mean RNA sequence elongation velocity Inline graphic in units of Inline graphic (A) and mean error rate (B) with nonenzymatic, R18 ribozyme, and poliovirus 3Dpol kinetics, which show supercritical, near-critical, and subcritical behaviors, respectively. Green arrows indicate discontinuous jumps for poliovirus. The diamonds denote Inline graphic values using the poliovirus sequence (instead of random sequences for others), and the triangles indicate Inline graphic for poliovirus. C–D: Mean elongation velocity (C) and mean error rate (D) with increasing fidelity based on rescaled nonenzymatic kinetics.

The key question then is: how would these changes in the elongation behavior of RNA replication actually have occurred during the prebiotic evolution? To address this question, we modeled the increases in fidelity from the nonenzymatic starting point by uniformly rescaling the incorrect incorporation rates [Inline graphic, Inline graphic] of the set of nonenzymatic kinetics (Table 1) to produce different values of Inline graphic. Simulations identified the critical point suggested in Figure 4A,B at Inline graphic and verified the limits of error rates at and far from equilibrium predicted by the mean field theory exactly (Figure 4C,D).
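The uniform rescaling of mismatch rates can be sketched as a single transformation of a rate table. The dictionary layout `rates[(nucleotide, template)]`, the function name, and the numerical values below are assumptions for illustration; the measured nonenzymatic rates themselves are not reproduced here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def rescale_incorrect(rates, factor):
    """Multiply every mismatch rate by `factor`; Watson-Crick rates are unchanged."""
    return {(n, t): (k if n == COMPLEMENT[t] else k * factor)
            for (n, t), k in rates.items()}


# Hypothetical starting table; a factor below 1 raises fidelity.
start = {(n, t): (1.0 if n == COMPLEMENT[t] else 0.02)
         for t in COMPLEMENT for n in COMPLEMENT}
higher_fidelity = rescale_incorrect(start, 0.1)
print(higher_fidelity[("U", "A")], higher_fidelity[("G", "A")])
```

Because only the mismatch rates change, the relative pattern of the empirical kinetics is preserved while the inverse fidelity is scanned as a single control parameter.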

Phase behavior

We then scanned the variation of this phase behavior for different values of Inline graphic and Inline graphic to generate the phase diagrams shown in Figure 5, which confirms that the qualitative features of the Jukes-Cantor model phase diagram [34] are preserved for empirical RNA replication. However, as opposed to the results in Ref. [34] that represent generic predictions, Figure 5 is based on empirical nonenzymatic kinetics and its uniform rescaling, with no other adjustable parameters.

Figure 5. Thermodynamic phase diagrams of RNA replication.


A–B: The Inline graphic-Inline graphic diagrams with color levels and contours (black dashed lines) representing Inline graphic (A) and Inline graphic (B). The black solid lines show the spinodal terminated by the critical point (filled circles). The red solid lines show the L-C transition for Inline graphic and Inline graphic, which meets the spinodal at the triple point (open circles). The white dashed lines show the boundary of Inline graphic region (smaller Inline graphic side). The green dashed lines show the analogous region of Inline graphic values for starvation processes (Inline graphic). The vertical lines show the location of the nonenzymatic Inline graphic value. C–D: The Inline graphic-Inline graphic and Inline graphic-Inline graphic diagrams. The green dashed lines represent the Inline graphic boundary. The blue dotted lines give the maximum and minimum Inline graphic and Inline graphic, respectively, and the red dotted line in D denotes the maximum error rate at equilibrium. The fitness Inline graphic is in units of Inline graphic.

The discontinuous jumps shown in Figure 4C,D correspond to the limit of stability (‘spinodal’; thick black lines) of the ‘liquid’ or L phase (high Inline graphic-low Inline graphic state) against the ‘gas’ or G phase (low Inline graphic-high Inline graphic state). In equilibrium, the location of a phase transition in the phase diagram is determined by the equality of free energies of the two phases [52], [53]. Here, we adopted the assumption that if multiple stationary states exist for a given Inline graphic, the state with higher Inline graphic (and lower Inline graphic) is chosen. This assumption is based on the relationship

[Equation (10)]

connecting the entropy production rate Inline graphic to Inline graphic and the velocity Inline graphic of RNA replicator Inline graphic present in the system, where Inline graphic is the total number of replicators and Inline graphic is the mean velocity (or ‘fitness’). The analogy to equilibrium phase behavior also excludes the first-order character of liquid-solid transitions, which for the current case is continuous. Equation (10) is a special case of a general relationship between nonequilibrium fluxes and conjugate forces [52]. In this formulation, a state with high Inline graphic contributes more to entropy production. This assumption is consistent with the standard interpretation of the replication rate as a measure of fitness [15], [54]. The multiplicity of stationary states at a single-molecule level is supported by the recent demonstration of a real-time sequencing-by-polymerization technique [55], where it was reported that polymerases interconverted between two distinct velocities during DNA elongation for a given reaction condition (Figures 3C and S3 of Ref. [55]). A complete kinetic characterization of the φ29 polymerase used in this experiment would allow us to make a more quantitative assessment of this interpretation.
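The flux-force relationship described in this paragraph can be sketched in assumed notation, with $\mathcal{A}$ for the thermodynamic force, $v_i$ for the velocity of replicator $i$, $N$ for the number of replicators, and $\bar{v}$ for the mean velocity. The expression below is a form consistent with the verbal description, not a transcription of Eq. (10):

```latex
\dot{S} \;=\; \mathcal{A}\sum_{i=1}^{N} v_i \;=\; N\,\mathcal{A}\,\bar{v},
\qquad
\bar{v} \;=\; \frac{1}{N}\sum_{i=1}^{N} v_i .
```

At a fixed force, the stationary state with the larger mean velocity produces entropy faster, which is the rationale for selecting the higher-velocity state when several stationary states coexist.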

Population dynamics

In considering the thermodynamic interpretation of the population dynamics of RNA sequences, we adopt the following physical model (Figure 1): during evolutionary drifts of a random population in sequence space, a particular sequence that folds and catalyzes the replication of RNAs with the same sequence (and no others) is ‘discovered.’ (In reality, a ribozyme would more likely have had catalytic activities for arbitrary sequences. The selectivity toward its own sequence, instead, would have arisen from the need for spatial diffusion in order to act on other sequences.) This sequence therefore has a higher Inline graphic value [Eq. (2a)] compared to others, leading to the single-peak Eigen landscape [Eq. (18) below]. Our goal in this and the following subsections is to describe the growth and stability of this master sequence. In Ref. [34], the basic quasispecies theory under the single-peak landscape was combined with the theory of a single-molecule elongation. We expanded this treatment by considering different scenarios of how Inline graphic and Inline graphic may have been distributed in RNA populations (Figure 1B).

For the inverse fidelity Inline graphic, one may first assume that it is nearly uniform (or regard it as an average over replicators) in a population, as has been assumed implicitly in Ref. [34]. We also assumed that only the RNA strands with a certain polarity (analogous to the positive or negative-sense polarities of viral genomes [56]) have catalytic activities, such that a pair of replication events is necessary to reproduce a polymerase ribozyme. This feature makes the current treatment more realistic for RNA prebiotic evolution than that in Ref. [34]. The following derivation of the thermodynamic quasispecies theory in this subsection otherwise adopts the approach therein [34].

In a population of self-replicating RNAs with genotypes labeled by index Inline graphic, the genotype Inline graphic catalyzes replications with rates

graphic file with name pcbi.1002534.e346.jpg (11)

where Inline graphic is the equivalent of Eq. (2a) for the genotype Inline graphic specified by a fitness landscape. The relative rate Inline graphic specifies the rate of addition of nucleotide Inline graphic against base Inline graphic, all normalized such that Inline graphic. In this model, therefore, all genotypes have the same set of relative enzymatic rates for nucleotide pairs (and the same value of Inline graphic), while differing in their absolute magnitude of catalysis, Inline graphic. This assumption of uniform inverse fidelity is reasonable for populations with genotypes distributed within a small neighborhood of a master sequence (or a small random subspace in the absence of a master) in the sequence space. The elongation velocity Inline graphic is given by

graphic file with name pcbi.1002534.e356.jpg (12)

where (the relative velocity) Inline graphic is now determined from Eq. (5c) with Inline graphic replaced by Inline graphic, and the replication rate of genotype Inline graphic is

graphic file with name pcbi.1002534.e361.jpg (13)

because a pair of replication events requires the addition of Inline graphic nucleotides, where Inline graphic is the length of the genome.

The mutation rate Inline graphic from genotype Inline graphic to Inline graphic is given by

graphic file with name pcbi.1002534.e367.jpg (14)

where Inline graphic is the Hamming distance (the number of nucleotides that are different) between Inline graphic and Inline graphic. Denoting the number of individuals (of the polarity that has catalytic activity) with genotype Inline graphic as Inline graphic, the evolving population in the Eigen model [15], [16] without constraints on the population size obeys the dynamical equation,

graphic file with name pcbi.1002534.e373.jpg (15)

At any time Inline graphic, the total number of all individuals (population size) Inline graphic is given by Inline graphic, which from Eq. (15) changes via Inline graphic, where Inline graphic is the population growth rate (mean fitness) with the frequency of genotype Inline graphic, Inline graphic. Therefore, for a given population characterized by the set Inline graphic, the corresponding entropy production rate is given by Eq. (10) with

graphic file with name pcbi.1002534.e382.jpg (16)

Similarly, under an idealized condition where replication occurs together with degradation [15], [17], [57], a population can evolve under a constant Inline graphic with a fixed mean population size. In this case, a replication event occurs with the same rate as the random degradation of a replicator. The evolution equation becomes

graphic file with name pcbi.1002534.e384.jpg (17)

such that Inline graphic is constant. For the fitness landscape, we adopted the single-peak Eigen landscape:

graphic file with name pcbi.1002534.e386.jpg (18)

where Inline graphic is a constant with the unit of a rate and Inline graphic is the relative fitness of the master sequence.

Under these simplifying approximations, the standard quasispecies theory becomes applicable directly, with connections to thermodynamics made by Inline graphic and Inline graphic. These fundamental relationships linking elongation properties to thermodynamic and kinetic parameters can be written in implicit but closed analytical forms [34] for the Jukes-Cantor model. In the numerical approach adopted here for arbitrary rates, simulations are first performed for a given set of rate constants and Inline graphic values to obtain averages of Inline graphic, Inline graphic, and Inline graphic values as functions of Inline graphic, as illustrated in Figure 1 of Ref. [34]. The implicit parameter Inline graphic is then eliminated to obtain Inline graphic and Inline graphic as functions of Inline graphic (Figure 3C–D). For the region in which multiple branches of Inline graphic exist for a given Inline graphic, the branch with the largest Inline graphic (L phase) is chosen.

In the infinite population limit, the quasispecies is either dominated by the ‘master species’ with the mean fitness Inline graphic, where Inline graphic is the probability of replicating Inline graphic sites over two consecutive cycles without error [see Eq. (6)], or by the ‘mutant species’ with fitness Inline graphic. From Eqs. (13) and (16), the mean fitness is therefore given by

graphic file with name pcbi.1002534.e407.jpg (19)

where

graphic file with name pcbi.1002534.e408.jpg (20)

denotes the threshold error rate for which Inline graphic becomes the same for the master and mutant species. Equation (19) implies that a constant-Inline graphic contour (red lines in Figure 5) is the ‘melting line’ separating a ‘crystalline’ (C) phase from the L phase [34]. As shown in Figure 5, the L-C transition line meets the L-G line at the triple point, below which the C and G phases meet directly (‘sublimation’). Our results show that this fairly complete analogy to the equilibrium phase behavior of fluids discovered first in Ref. [34] is indeed equally applicable in more realistic considerations of RNA prebiotic evolution.
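The balance between master and mutant fitness that defines the melting line can be illustrated with a short numerical sketch. It assumes the standard single-peak forms: the master replicates without error with probability Q = (1 - eps)^(2L), its effective fitness is A*Q against a mutant fitness of 1 (in units of the baseline rate), and back mutations are neglected. The function names and these closed forms are our illustrative assumptions and are not taken verbatim from Eqs. (19) and (20).

```python
def replication_fidelity(eps: float, length: int) -> float:
    """Probability of copying `length` sites in two consecutive cycles
    without error (assumed form: (1 - eps)^(2L))."""
    return (1.0 - eps) ** (2 * length)


def threshold_error_rate(advantage: float, length: int) -> float:
    """Error rate at which the master fitness A*Q equals the mutant
    fitness 1 (assumed single-peak landscape)."""
    return 1.0 - advantage ** (-1.0 / (2 * length))


def master_frequency(eps: float, advantage: float, length: int) -> float:
    """Stationary master frequency in the infinite-population limit,
    neglecting back mutations; zero above the threshold."""
    q = replication_fidelity(eps, length)
    return max(0.0, (advantage * q - 1.0) / (advantage - 1.0))


if __name__ == "__main__":
    A, L = 10.0, 50  # hypothetical relative fitness and genome length
    eps_c = threshold_error_rate(A, L)
    for eps in (0.0, 0.5 * eps_c, eps_c, 2.0 * eps_c):
        print(f"eps={eps:.4f}  x_master={master_frequency(eps, A, L):.3f}")
```

Under these assumptions the master frequency falls continuously to zero as the error rate approaches the threshold, which is the sense in which the L-C transition is continuous.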

The L-C transition line lies at the heart of the crystallization of genomes that may occur during evolutionary walks [13] in sequence space. The presence of the L phase distinct from the G phase below the critical point has an important consequence for such sequence explorations: despite the absence of a stable genome, analogous to liquid phases with short-range order, RNAs in the L phase with Inline graphic (Figure 5D) would still exhibit sequence correlations for a significantly large number of generations. We may use the Jukes-Cantor relationship between the error rate and the cumulative mean Hamming distance Inline graphic from an ancestral sequence after Inline graphic generations [35],

graphic file with name pcbi.1002534.e414.jpg (21)

A typical sequence in the G phase with Inline graphic (Figure 5D), for instance, would evolve to reach Inline graphic in just Inline graphic generations, on average, whereas in the L phase with Inline graphic, it would do so in Inline graphic generations. Therefore, when a system ‘evaporates’ into the G phase, an ancestral sequence gets lost in a couple of generations. In contrast, conditions in the L phase, with error rates comparable to those in the C phase nearby in the phase diagram, would greatly facilitate crystallizations of viable genomes.
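The contrast between the two phases can be checked numerically with the textbook Jukes-Cantor recursion. The sketch below assumes a per-site error rate eps per generation, with each of the three alternative bases equally likely, which gives a mean fractional Hamming distance d(t) = (3/4)[1 - (1 - 4 eps/3)^t]. We take this standard form as a stand-in for Eq. (21); the function names and the example error rates are hypothetical.

```python
import math


def mean_hamming_fraction(eps: float, generations: float) -> float:
    """Mean fraction of sites differing from the ancestor after the given
    number of generations (standard Jukes-Cantor form, four bases)."""
    return 0.75 * (1.0 - (1.0 - 4.0 * eps / 3.0) ** generations)


def generations_to_reach(eps: float, d: float) -> float:
    """Number of generations needed for the mean distance to reach d."""
    if not 0.0 < d < 0.75:
        raise ValueError("d must lie between 0 and the saturation value 3/4")
    if not 0.0 < eps < 0.75:
        raise ValueError("eps must lie between 0 and 3/4")
    return math.log(1.0 - 4.0 * d / 3.0) / math.log(1.0 - 4.0 * eps / 3.0)


if __name__ == "__main__":
    # Hypothetical error rates standing in for a G-like and an L-like state.
    for eps in (0.5, 0.01):
        print(eps, generations_to_reach(eps, 0.5))
```

Because the required number of generations scales roughly as 1/eps for small eps, a state with a low error rate retains ancestral sequence information far longer than one near the random limit.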

In interpreting the physical distinction between L and G phases, it is useful again to compare them with their analogs in equilibrium fluids, the liquid and gas phases in a container. The pressure of a fluid in equilibrium is controlled by the external force per unit area of the container, which matches the average of microscopic forces per unit area exerted by molecules on the wall interior. At high temperatures (the average kinetic energy of molecules), a given external pressure can be balanced by the mean force of a state (gas), where density is low and molecules rarely interact. The equilibrium density is then roughly proportional to pressure and inversely proportional to temperature. At low enough temperatures, a given external pressure can also be matched by a different phase (liquid) with a much higher density held together by intermolecular attractions. Both gas and liquid phases are characterized by the lack of long-range order. The sharp boundary between them appears when temperature goes below the critical value because the effect of molecular interaction renders a certain range of pressure values unstable.

Analogously, an RNA molecule replicating in a chemical reservoir is driven by the external thermodynamic force given by Eq. (1), which matches the average entropy production per monomer addition. For large Inline graphic values, the replication is nearly random and the second term of Eq. (7), the sequence disorder contribution to the entropy production, is constant (Inline graphic), making the dependence of internal Inline graphic on Inline graphic monotonic [34]. With a sufficiently small Inline graphic, in contrast, the sequence disorder nearly vanishes, reducing the entropy production. This change is compensated by the dominance of faster correct incorporation steps, with the corresponding increase in velocity and decrease in error rates. A given value of external Inline graphic can be matched either by a state with low velocity and high errors (G phase), or by one with high velocity and low errors (L phase), each distinguished by the relative importance of the two terms in the square brackets in Eq. (7). The sharp boundary between them appears because, for intermediate values of Inline graphic, stationary states become unstable against fluctuations. The neighborhood of regimes where the C phase is stable is dominated by the L phase (Figure 5) in which the error rate is comparable to those in the C phase, if Inline graphic is subcritical.

For the population as a whole, random drifts in Inline graphic due to sequence explorations are not isotropic but, rather, are biased toward the direction of increasing Inline graphic. In our previous work [34], a threshold was identified within the phase diagram separating regimes where the direction of this bias shifts. We sought the analog of this threshold in Figure 5 corresponding to the evolution of RNAs, where the region in which Inline graphic in the L phase (white dashed lines in Figure 5A,B) includes the nonenzymatic fidelity value and links it to the C phase. Inside the C phase, Inline graphic is always negative (Figure 6A). Once a population has Inline graphic values to the left of the white dashed line in Figure 5A, random drifts in sequence space would be biased toward increasingly higher fidelity, leading to crystallization and stable genomes.

Figure 6. Dependence of mean fitness on fidelity.

Figure 6

A: Mean fitness as a function of Inline graphic at constant Inline graphic. The slope Inline graphic is negative below a threshold Inline graphic for each Inline graphic (white dashed lines in Figure 5A,B and green dashed lines in Figure 5C,D, respectively). The discontinuous jump for Inline graphic and the cusps at smaller Inline graphic values correspond to G-L and L-C transitions, respectively. B: Mean fitness averaged over starvation processes (Inline graphic) for different initial thermodynamic force Inline graphic (see Figure 10). The slope Inline graphic is negative below a threshold Inline graphic for each Inline graphic (green dashed lines in Figure 5A,B). Vertical lines represent the nonenzymatic fidelity. The fitness Inline graphic is in units of Inline graphic.

Stochastic evolutionary dynamics

We next relaxed the assumption that Inline graphic is uniform within a population (Inline graphic is the value for genotype Inline graphic). Equation (13) is then replaced by

graphic file with name pcbi.1002534.e450.jpg (22)

for which numerical simulations have to be used. An efficient method to extract collective population dynamics of competing molecules is again provided by the Gillespie algorithm [58], which was first applied to the quasispecies dynamics by Nowak and Schuster [59]. The set of possible reactions that a population can undergo, corresponding to Eq. (17), is written as

graphic file with name pcbi.1002534.e451.jpg (23a)
graphic file with name pcbi.1002534.e452.jpg (23b)

where Inline graphic is a replicator of genotype Inline graphic. The mutation matrix is given by

graphic file with name pcbi.1002534.e455.jpg (24)

where Inline graphic is the error rate of reactions catalyzed by the genotype Inline graphic.
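The reaction scheme above can be sketched as a minimal Gillespie simulation. The version below makes several simplifying assumptions that are ours rather than the paper's: a uniform per-site error rate `eps`, a single copying step per replication event (instead of a paired plus/minus-strand cycle), a single-peak landscape with relative fitness `advantage` for the master, and a per-capita degradation rate equal to the instantaneous mean fitness so that the population size is constant only on average. All function and variable names are hypothetical.

```python
import random
from collections import Counter


def mutate(seq, eps, rng):
    """Copy `seq`, replacing each site by a different base with probability eps."""
    return tuple(
        rng.choice([b for b in range(4) if b != s]) if rng.random() < eps else s
        for s in seq
    )


def gillespie_eigen(pop, master, advantage, eps, t_max, rng):
    """Run replication and degradation events until time t_max.

    `pop` is a Counter mapping genotype tuples to copy numbers and is
    modified in place. Returns (final_time, pop).
    """
    t = 0.0
    while pop:
        genotypes = list(pop)
        rep_rates = [(advantage if g == master else 1.0) * pop[g] for g in genotypes]
        total_rep = sum(rep_rates)
        # Degradation at per-capita rate equal to the mean fitness gives a
        # total degradation rate equal to the total replication rate.
        t += rng.expovariate(2.0 * total_rep)
        if t > t_max:
            break
        if rng.random() < 0.5:
            # Replication: choose a parent in proportion to its rate.
            parent = rng.choices(genotypes, weights=rep_rates)[0]
            pop[mutate(parent, eps, rng)] += 1
        else:
            # Degradation: remove a uniformly chosen individual.
            victim = rng.choices(genotypes, weights=[pop[g] for g in genotypes])[0]
            pop[victim] -= 1
            if pop[victim] == 0:
                del pop[victim]
    return min(t, t_max), pop


if __name__ == "__main__":
    rng = random.Random(0)
    master = (0,) * 10
    pop = Counter({master: 200})
    t, pop = gillespie_eigen(pop, master, advantage=5.0, eps=0.02, t_max=0.5, rng=rng)
    print(t, sum(pop.values()), pop[master] / sum(pop.values()))
```

Storing genotypes in a `Counter` keeps the cost of each event proportional to the number of distinct genotypes rather than to the population size, which is adequate for short genomes and small populations but would need a more efficient data structure for production runs.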

We tested this simulation algorithm using the special case of the exponential growth of a population with no degradation [Eq. (23a) only], uniform error rate (Inline graphic), and the initial condition of a single master sequence under Eq. (18) (see Methods and Figure 7). Systems with replication and degradation [Eqs. (23)] using uniform Inline graphic and initial population size of Inline graphic were also simulated, in which the total population size showed moderate diffusional drifts but roughly remained the same over typical trajectories, and Inline graphic decayed to reach the steady state values (Figure 8) predicted by the infinite population result. These results show that the steady state reached in simulations depends neither on the initial conditions (single replicator or a large population) nor on the boundary conditions (no degradation or constant Inline graphic).

Figure 7. Time dependence of master sequence frequency Inline graphic.

Figure 7

Stochastic simulation results of the Eigen model (solid lines, averaged over 1000 trajectories) are compared with Eq. (54) (dotted line), where Inline graphic is the fitness of mutants, for Inline graphic and Inline graphic. The initial condition was Inline graphic.

Figure 8. Stationary frequency of master sequence.

Figure 8

Stochastic simulation results for the Eigen model are compared with Inline graphic. The simulations were performed under the condition of (approximately) constant population size (Inline graphic) using Eqs. (23). With Inline graphic and Inline graphic, the error threshold where Inline graphic is at Inline graphic. Error bars represent one standard deviation.

Crystallization kinetics

We used the constant-Inline graphic stochastic evolutionary dynamics simulations to examine the temporal evolution of quasispecies. The inverse fidelity Inline graphic was assumed to depend on genotype Inline graphic via the same form of single peak landscape as for fitness:

graphic file with name pcbi.1002534.e477.jpg (25)

where we took Inline graphic and Inline graphic (the nonenzymatic value) in Figure 9. The time scale of simulations is set by using Inline graphic from the nonenzymatic replication (Table 1) in Eqs. (18) and (22). We assumed Inline graphic as a representative chemical environment, such that Inline graphic.

Figure 9. Crystallization of a genome.

Figure 9

Stochastic simulations were used with genome length Inline graphic and mean base incorporation rate Inline graphic. The initial population (Inline graphic) contained random sequences and a single master sequence with relative fitness Inline graphic. The inverse fidelity was given by Eq. (25) with Inline graphic and Inline graphic. A: Relative frequency of the master sequence (initially Inline graphic). B: Average of the fractional Hamming distance (HD; Inline graphic initially).

Figure 9 shows two typical trajectories starting from an initial pool of random sequences of length Inline graphic, containing a single replicator designated as the master sequence. This ‘seeding’ of the population by a master sequence mimics the situation where a genotype with a significantly higher fitness is discovered during random drifts. The resulting evolution in Figure 9 is analogous to ‘crystal growths,’ in which the frequency of the master sequence steadily grows to reach a value consistent with the stability of the C phase (Figure 5): a master sequence with Inline graphic and Inline graphic spreads and dominates the population under thermodynamic force Inline graphic. The corresponding growth under Inline graphic, which corresponds to the vicinity of the L-C boundary in Figure 5, is much weaker and slower, suggesting that the phase diagram remains valid for inhomogeneous Inline graphic. The estimated time scales in Figure 9 (based on the activated nonenzymatic rates and Inline graphic) further suggest that the crystallization of a genome can occur within Inline graphic under suitable conditions. However, as in equilibrium fluids, it will never occur if thermodynamics precludes a stable C phase.

Starvation process

We also considered an alternative setup where a population growth occurs in a closed system, which leads to an evolutionary change we refer to as the ‘starvation process.’ Similar situations were also considered in Ref. [60]. During an idealized starvation process, a single genotype is placed inside a medium containing a given amount of NTPs and PPi's with the corresponding initial thermodynamic force Inline graphic. The population growth leads to the gradual depletion of NTPs and accumulation of PPi's, lowering Inline graphic. The error rate therefore increases over time. The growth of the population would come to an end when the condition finally reaches equilibrium (Inline graphic). The resulting collection of RNAs in reality may then disperse into fresh media, restarting new rounds of starvation processes.

We introduce the fractional population size Inline graphic with respect to the asymptotic population reached in the limit of equilibrium (see Methods). A single process starts with Inline graphic, where Inline graphic is maximum and Inline graphic, and may undergo up to two transitions (C-L and L-G) if Inline graphic to reach equilibrium, where Inline graphic, Inline graphic, and Inline graphic (Figure 10). In Figure 6, the mean fitness as a function of Inline graphic averaged over a starvation process (B) is compared with that without the averaging (A). For Inline graphic close to the nonenzymatic value (vertical dotted line), Inline graphic is negative in the L phase for Inline graphic above a threshold. The dependence of the Inline graphic region on Inline graphic (green dashed line in Figure 5A,B) closely resembles that of the Inline graphic region on Inline graphic (white dashed lines in Figure 5A,B). We therefore conclude that an environment that supports repeated starvation processes with an initial Inline graphic above this boundary for a given Inline graphic promotes evolution that lowers Inline graphic. It is worthwhile to note that this conclusion was reached without invoking any significant simplifying assumptions beyond the experimentally characterized kinetics of nonenzymatic replication (Table 1), thermodynamic considerations, and the quasispecies theory; the main remaining uncertainty is in the values of Inline graphic. We verified that the conclusion remains valid for all possible Inline graphic (Figure 11).

Figure 10. Variation of mean fitness during starvation processes.

Figure 10

The mean fitness is shown as a function of fractional population size Inline graphic. The two Inline graphic values (with Inline graphic and Inline graphic) illustrate typical behavior below and above the critical point. The C-L and L-G transitions are indicated for the subcritical case.

Figure 11. Sensitivity of fidelity threshold on equilibrium constant.

Figure 11

The dependence on Inline graphic of the minimum Inline graphic of starvation processes, for which Inline graphic, is shown.

Evolution of longer genomes

The conclusion that there was an underlying driving force biasing fidelity increases in the absence of genomes is particularly powerful because it is independent of the physical mechanisms implementing it. A likely mechanism for such changes is the evolution of error correction with the necessary increases in genome length Inline graphic. Eigen's paradox arises because such an increase would lower Inline graphic (the melting line recedes toward smaller Inline graphic in Figure 5A,B). The melted population, however, would be driven to recrystallize a new, longer genome because Inline graphic in the L phase. Growth in genome length most likely occurred through insertions, which are beyond the scope of our treatment, which considered only base substitution errors. Saakian has studied the evolutionary model of parallel mutation-selection scheme with insertion and deletion [61]. Similar approaches combined with our findings may offer more detailed insights into how genome growth may have been facilitated by thermodynamic driving forces. In addition, we have restricted our study here to a single chemical system (RNAs). It would be of interest to apply similar approaches to more complex systems containing multiple ingredients, including peptides.

Together, our findings in Figure 5 suggest that the initial nonenzymatic fidelity of RNA lies within the threshold favoring fidelity increases. Rather than being coincidental, this feature may explain nature's choice of NTPs as the media for encoding biological information. Many possible alternative oligomers capable of templated replications have been proposed as precursors to RNAs [7]. Their corresponding monomers, however, would have had widely different fidelity values, and one system (NTPs) that happened to lie within the Inline graphic boundary presumably evolved the RNA quasispecies cloud towards smaller Inline graphic, eventually crystallizing the first genomes.

Methods

Thermodynamic force

An elementary elongation reaction can be written as

graphic file with name pcbi.1002534.e536.jpg (26)

where Inline graphic and Inline graphic is the RNA primer of length Inline graphic. The entropy of the system plus reservoir is Inline graphic, where Inline graphic and Inline graphic are the total numbers of monomers Inline graphic and PPi, respectively. The entropy production rate is [34], [52]

graphic file with name pcbi.1002534.e544.jpg (27)

where Inline graphic is the force acting on the growing primer (in length units of, e.g., the base pair rise), Inline graphic, Inline graphic, Inline graphic, and Inline graphic is the consumption rate of nucleotide Inline graphic. The force Inline graphic is analogous to the external tension balancing the entropic force of rubber elasticity [52]. In conditions where the external force is not controlled, it may be replaced by frictional drag on polymerases, which would depend on elongation velocity. The constant Inline graphic is given by

graphic file with name pcbi.1002534.e553.jpg (28)

In Eq. (28), Inline graphic, and in the third equality, we have assumed that Inline graphic. Equation (1) follows with Inline graphic, and Eq. (32) becomes Inline graphic.

Equilibrium condition

A useful physical insight into Inline graphic defined in Eq. (9) can be gained by considering the condition of equilibrium, which can be derived from Eq. (7) as

graphic file with name pcbi.1002534.e559.jpg (29)

or with Eq. (8), Inline graphic, the detailed balance. In equilibrium, on the other hand, we can calculate Inline graphic by considering a two-level system with a ground state and Inline graphic-fold degenerate excited states with energy gap Inline graphic (Inline graphic is the total number of NTP types; Inline graphic in the main text):

graphic file with name pcbi.1002534.e566.jpg (30)

Comparing this with Eqs. (9) and (29), we obtain

graphic file with name pcbi.1002534.e567.jpg (31)

which we also verify in the next subsection directly from the mean field theory. We may interpret Inline graphic and Inline graphic in Eq. (31) as the entropic and energetic factors for mismatches: they are costly individually (by a factor of Inline graphic) but there are many of them (Inline graphic). Equation (29) shows that the free energy parameter Inline graphic is defined with a constant term such that it absorbs the partition function that normalizes Inline graphic in equilibrium. The mean field theory generalizes Eq. (29) into Eqs. (5a) and (8) for nonequilibrium conditions.
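The competition between the entropic and energetic factors can be made concrete with the textbook two-level partition function. The sketch below assumes a single correct pairing (the ground state), `n_types - 1` equivalent mismatches separated from it by an energy gap `gap` measured in units of the thermal energy, and an equilibrium error rate equal to the total Boltzmann weight of the mismatched states. This is our reading of the two-level picture rather than a transcription of Eq. (30), and the function name is hypothetical.

```python
import math


def equilibrium_error_rate(gap: float, n_types: int = 4) -> float:
    """Occupation probability of the (n_types - 1)-fold degenerate excited
    (mismatched) level; `gap` is in units of the thermal energy."""
    weight = (n_types - 1) * math.exp(-gap)  # entropic factor times energetic factor
    return weight / (1.0 + weight)


if __name__ == "__main__":
    for gap in (0.0, 2.0, 5.0, 10.0):
        print(gap, equilibrium_error_rate(gap))
```

With four nucleotide types and a vanishing gap, the error rate in this sketch reduces to 3/4, the value expected for completely random incorporation.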

Mean field theory of RNA replication

Previously, we derived a mean field theory for the templated replication and showed with numerical tests that it was exact for the two-parameter Jukes-Cantor model rates [34]. Here, we reproduce the analytical derivation and expand it to show that the mean field theory becomes exact for symmetric template models (i.e., four-parameter models with the set of Inline graphic for all Inline graphic independent of the identity of Inline graphic), of which the Jukes-Cantor model is a special case. We also test this conclusion by comparing the analytical results with numerical simulations (Figure 3). The particular version of the mean field theory we adopted for general rates in this paper [Eq. (5)] can then be considered as a generalization of these expressions.

Considering the elongation of RNA depicted in Figure 1A, the master equation for the probability Inline graphic to have a chain with length Inline graphic and sequence Inline graphic under a template Inline graphic (assumed infinitely long; may refer to the whole template sequence or a single base) can be written as

graphic file with name pcbi.1002534.e581.jpg (32)

where Inline graphic (Inline graphic and Inline graphic in the main text). We introduce the reduced distribution [32]

graphic file with name pcbi.1002534.e585.jpg (33)

where Inline graphic is the probability to find the chain with length Inline graphic at time Inline graphic under the given template Inline graphic, and Inline graphic is the conditional probability of having the indicated sequence for the given chain of length Inline graphic. In writing Inline graphic as time independent, we have implicitly assumed the stationary limit where Inline graphic represents the asymptotic monomer distributions near the terminus of the growing chain. Our numerical simulations confirm the existence of such distributions for Inline graphic. Conversely, the chain length distribution Inline graphic supports a peak, Inline graphic, moving with a constant velocity, Inline graphic, which depends on the entire template sequence Inline graphic in general.

A special case of particular interest is the symmetric template models, for which the set of rates would be specified completely by at most Inline graphic parameters instead of Inline graphic. The simplest example is the two-parameter model [34] given in Eq. (4). For such models, we may write

graphic file with name pcbi.1002534.e601.jpg (34)

The physical interpretation behind Eq. (34) is that the monomer addition and deletion at the Inline graphic'th site on the template are solely determined by the set of rates Inline graphic, which are independent of the identity of Inline graphic. The probability for chain growth, therefore, should be independent of template sequences. We also assume that

graphic file with name pcbi.1002534.e605.jpg (35)

which is expected because the rates Inline graphic are all local in their dependence on nucleotides. Numerical tests suggest that Eq. (35) is generally valid for arbitrary rate constants (Figure 3B).

Using Eqs. (34), (35) and summing both sides of Eq. (32) over Inline graphic, we have

graphic file with name pcbi.1002534.e608.jpg (36)

where

graphic file with name pcbi.1002534.e609.jpg (37a)
graphic file with name pcbi.1002534.e610.jpg (37b)

are the total fluxes for chain growth and shrinkage. Any expression involving summations over Inline graphic of Inline graphic is independent of Inline graphic for symmetric template models.

Equation (36) is valid for any Inline graphic, which we replace by Inline graphic. We note that Inline graphic, Inline graphic, and Inline graphic. If we sum both sides of Eq. (36) over Inline graphic using Inline graphic, multiply by Inline graphic, and sum over Inline graphic, we get

graphic file with name pcbi.1002534.e623.jpg (38)

If we sum both sides of Eq. (36) with respect to Inline graphic,

graphic file with name pcbi.1002534.e625.jpg (39)

or with Eq. (38),

graphic file with name pcbi.1002534.e626.jpg (40)

Equations (37b), (38), (40), and Inline graphic which was used in the derivation, form a set of self-consistent equations for Inline graphic [Inline graphic equations, Inline graphic copies of Eq. (40), the normalization, and Eq. (37b); for Inline graphic unknown, Inline graphic, Inline graphic, and Inline graphic]. Because Inline graphic is independent of Inline graphic, this set of equations is most conveniently solved by imposing the normalization condition to Eq. (40) to determine Inline graphic:

graphic file with name pcbi.1002534.e638.jpg (41)

We note that Eq. (41) leads to a unique Inline graphic independent of Inline graphic because of the symmetry of rates with respect to Inline graphic. The mean error rate is given by

graphic file with name pcbi.1002534.e642.jpg (42)

again independent of Inline graphic. In Figure 3A, we test Eqs. (40) and (41) with an example set of symmetric template model rates. The comparison with numerical simulations shows that the expressions are exact for symmetric template models. We also tested the factorization assumption, Eq. (35), for the general case (pol Inline graphic kinetics, Table 2) in Figure 3B, which suggests that it is generally valid.

Without the symmetry of kinetic rates with respect to template base identity, the main difficulty in the analytical treatment is that Eq. (34) is no longer valid, and the velocity Inline graphic depends on the entire template sequence. There are a number of ways to generalize the exact expressions, Eqs. (40) and (41), into cases where the kinetic rates do depend on the identity of template bases. One way, demonstrated in Ref. [34], is to introduce averages over template bases to Eq. (36) to symmetrize Inline graphic and Inline graphic over the distribution of Inline graphic. An average over Inline graphic of the right hand side of Eq. (41) determines the mean velocity Inline graphic independent of Inline graphic (Eq. (12) of Ref. [34]). Here, we used a different approach, generalizing Eq. (40) into Eq. (5a), which introduces a template-dependent velocity Inline graphic. We found this mean field theory to give better agreement with simulations for asymmetric rates, especially when combined with averages over Inline graphic defined as the harmonic mean, i.e., Eq. (3). Because harmonic means are not additive, the summation must precede the average in Eq. (5d) within the current approach. Equation (5) becomes exact for symmetric template models.

In applying the mean field theory expressions, all forward rates are first scaled with Inline graphic [Inline graphic], and Eq. (41) is solved for each Inline graphic to find Inline graphic. Although it is a quartic equation with respect to Inline graphic, there was always only one solution for Inline graphic. The mean velocity Inline graphic, error rate Inline graphic, and thermodynamic force Inline graphic then follow from Eqs. (5c), (5d), and (7). The dependences of Inline graphic and Inline graphic on Inline graphic are obtained by treating Inline graphic as a parameter to be eliminated (Figure 3) [62].

Limiting behavior of elongation properties

Equilibrium is reached when Inline graphic and Inline graphic, which occurs when Inline graphic for all Inline graphic: from Eqs. (5a), (8), and (9), we verify Eqs. (29) and (31), and the maximum error rate is

graphic file with name pcbi.1002534.e671.jpg (43)

which would be equal to 3/4 if Inline graphic and Inline graphic. Far from equilibrium, where Inline graphic, Inline graphic and Inline graphic, Eqs. (5a) and (41) give Inline graphic and Inline graphic, where Inline graphic is given by Eq. (2a). The error rate in this limit approaches its lower bound,

graphic file with name pcbi.1002534.e680.jpg (44)

which defines the inverse fidelity parameter Inline graphic via Eq. (2b). The analogous limits of Inline graphic can also be calculated from Eq. (6):

graphic file with name pcbi.1002534.e683.jpg (45)

and

graphic file with name pcbi.1002534.e684.jpg (46)

where in the second equality in Eq. (46), the average over Inline graphic was omitted because Inline graphic is symmetric [Eq. (9)] and the three terms in the square brackets correspond to Inline graphic, Inline graphic, and Inline graphic cases, respectively. For Inline graphic, Inline graphic if Inline graphic, and Inline graphic with Inline graphic (dotted red line in Figure 5D).

Stochastic simulation of single-molecule elongation

Numerical simulations of single-molecule RNA replication kinetics were performed with the Gillespie algorithm [58] applied to Eq. (32) [32]. A sufficiently long template sequence Inline graphic was pre-generated for the simulations using random sequences with equal distributions Inline graphic. The initial condition was chosen as Inline graphic (with rate constants assigned arbitrarily for Inline graphic), and only conditions that lead to positive velocities were considered. For a given set of Inline graphic [or Inline graphic], a value of Inline graphic is chosen, Eq. (8) is used to generate Inline graphic, and simulations are run to obtain Inline graphic, where Inline graphic and Inline graphic are the length of chain grown and time elapsed, respectively. The error rates Inline graphic and Inline graphic, and the thermodynamic force Inline graphic are obtained by first calculating Inline graphic over the chain and using Eqs. (5d) and (7). This procedure is repeated for different values of Inline graphic to yield Inline graphic and Inline graphic as functions of Inline graphic. Typically, simulations were run up to Inline graphic and properties were averaged over the entire chain grown. For poliovirus Inline graphic, simulations were also performed using templates generated by repeating the poliovirus sequence [63] (diamonds in Figure 4B).
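The procedure above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' code: it assumes that rates depend only on whether the terminal pair is Watson-Crick (the model of Eq. (32) is more general), that the first base acts as a fixed primer that is never removed, and that the four rate constants passed in are hypothetical values. The names `simulate_elongation`, `k_add_right`, `k_add_wrong`, `k_rem_right`, and `k_rem_wrong` are introduced here for illustration only.

```python
import math
import random

BASES = "ACGU"
PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}

def simulate_elongation(template, k_add_right, k_add_wrong,
                        k_rem_right, k_rem_wrong, max_len, rng):
    """Gillespie simulation of a single growing copy strand.

    Returns (copy, elapsed_time). Assumption: rates depend only on whether
    the pair being added or removed is Watson-Crick.
    """
    copy = [PAIR[template[0]]]  # fixed, correctly paired primer (assumption)
    t = 0.0
    while len(copy) < max_len:
        pos = len(copy)
        # Candidate events: add one of the four bases at position pos ...
        events = list(BASES)
        rates = [k_add_right if b == PAIR[template[pos]] else k_add_wrong
                 for b in BASES]
        # ... or remove the terminal base (never the primer).
        if pos > 1:
            last_ok = copy[-1] == PAIR[template[pos - 1]]
            events.append(None)
            rates.append(k_rem_right if last_ok else k_rem_wrong)
        total = sum(rates)
        # Exponentially distributed waiting time with mean 1/total.
        t += -math.log(1.0 - rng.random()) / total
        # Pick one event with probability proportional to its rate.
        event = rng.choices(events, weights=rates)[0]
        if event is None:
            copy.pop()
        else:
            copy.append(event)
    return copy, t

rng = random.Random(0)
template = [rng.choice(BASES) for _ in range(2000)]  # equal base frequencies
copy, elapsed = simulate_elongation(template, 1.0, 0.01, 0.05, 0.5, 2000, rng)

velocity = (len(copy) - 1) / elapsed  # chain length grown per unit time
error_rate = sum(c != PAIR[b] for c, b in zip(copy, template)) / len(copy)
print(velocity, error_rate)
```

Repeating the run for different rate sets would give velocity and error rate as functions of the driving conditions, in the spirit of the parameter scan described above.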

Stochastic simulation of evolutionary dynamics

In the stochastic form of the Eigen model given by Eqs. (23), the total rate of transformation at a given time Inline graphic is

graphic file with name pcbi.1002534.e717.jpg (47)

where Inline graphic was used. In a simulation, a random number Inline graphic with a uniform distribution is drawn, and

graphic file with name pcbi.1002534.e720.jpg (48)

determines the time Inline graphic of the next replication/degradation event. A second random number Inline graphic is then drawn, which chooses one (Inline graphic, Inline graphic) of Inline graphic reactions (Inline graphic replications and Inline graphic degradations, where Inline graphic is the total number of distinct genotypes present within the population) from Eqs. (23) following Ref. [58]:

graphic file with name pcbi.1002534.e729.jpg (49)

where

graphic file with name pcbi.1002534.e730.jpg (50)

For a degradation event, the number of replicators is updated as Inline graphic. For a replication, a progeny genotype Inline graphic is produced from Inline graphic by attempting to mutate each nucleotide into 3 different bases with probability Inline graphic, followed by the update Inline graphic. Since the total number of possible genotypes Inline graphic is exponentially large for even moderate values of Inline graphic, exact enumeration of Inline graphic for all possible genotypes was avoided. Instead, the simulation proceeded by first creating from the initial distribution a list of genotypes for which Inline graphic, and adding newly encountered genotypes to the list as mutations occurred.
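One event of this population-level simulation might look as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: each individual of a genotype replicates at a rate given by a user-supplied `fitness` function and degrades at a common `death_rate`, and `mu` is taken to be the total per-site mutation probability, split evenly among the three alternative bases. The function names and all numerical values are hypothetical. Only genotypes actually present are stored, mirroring the list-based bookkeeping described above.

```python
import math
import random
from collections import Counter

BASES = "ACGU"

def mutate(seq, mu, rng):
    """Copy seq, replacing each base by a different one with probability mu."""
    out = []
    for b in seq:
        if rng.random() < mu:
            out.append(rng.choice([x for x in BASES if x != b]))
        else:
            out.append(b)
    return "".join(out)

def gillespie_step(pop, fitness, death_rate, mu, rng):
    """Apply one replication or degradation event; return the waiting time."""
    genotypes = list(pop)                      # only genotypes with n > 0
    rep = [pop[g] * fitness(g) for g in genotypes]
    deg = [pop[g] * death_rate for g in genotypes]
    total = sum(rep) + sum(deg)
    # First random number: exponentially distributed time to the next event.
    tau = -math.log(1.0 - rng.random()) / total
    # Second random number: choose one of the 2M reactions by its rate.
    idx = rng.choices(range(2 * len(genotypes)), weights=rep + deg)[0]
    g = genotypes[idx % len(genotypes)]
    if idx < len(genotypes):                   # replication with mutation
        pop[mutate(g, mu, rng)] += 1
    else:                                      # degradation
        pop[g] -= 1
        if pop[g] == 0:
            del pop[g]
    return tau

MASTER = "A" * 8

def single_peak(g):
    """Hypothetical single-peak landscape: master sequence is twice as fit."""
    return 2.0 if g == MASTER else 1.0

rng = random.Random(1)
pop = Counter({MASTER: 50})
t = 0.0
for _ in range(500):
    if not pop:          # stop if the population has gone extinct
        break
    t += gillespie_step(pop, single_peak, 0.5, 0.02, rng)

print(t, sum(pop.values()), len(pop))
```

Storing counts in a `Counter` keyed by sequence keeps memory proportional to the number of genotypes encountered, not to the size of sequence space.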

Test case for quasispecies dynamics

For testing the Gillespie simulation of RNA population dynamics, we used the single-peak Eigen landscape (18) without back-mutation, for which the quasispecies dynamics (15) can be easily integrated. Although more advanced methods pioneered by Saakian and coworkers [17], [20]–[22], [24] allow exact analyses of the Eigen model, the following simple treatment suffices for our purpose of testing numerical simulations because for moderately large Inline graphic, the effect of back-mutations becomes negligible. Writing a vector Inline graphic, where Inline graphic and Inline graphic are the total numbers of individuals with the master sequence and mutants, respectively, and ignoring back-mutations, Eq. (15) can be written as

graphic file with name pcbi.1002534.e744.jpg (51)

where

graphic file with name pcbi.1002534.e745.jpg (52)

and Inline graphic. Diagonalizing Inline graphic and integrating, we get

graphic file with name pcbi.1002534.e748.jpg (53)

where Inline graphic and Inline graphic are the two eigenvalues of Inline graphic and Inline graphic is the eigenvector matrix. We then obtain the time-dependent master sequence frequency Inline graphic,

graphic file with name pcbi.1002534.e754.jpg (54)

Figures 7 and 8 test numerical simulations with Eq. (54) and its stationary limit, Inline graphic, respectively.
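The diagonalize-and-integrate step can be illustrated numerically. Since the matrix of Eq. (52) is not reproduced in the text here, the sketch below assumes the textbook single-peak form without degradation: the master sequence replicates at rate `A`, mutants at rate 1, and `Q` is the probability of an error-free copy. Under this assumption the stationary master frequency is (AQ − 1)/(A − 1) for AQ > 1. The function name and parameter values are hypothetical.

```python
import numpy as np

def master_frequency(t, A, Q, x0):
    """Master-sequence frequency at time t for dx/dt = W x (no back-mutation).

    Assumed form (not necessarily the paper's Eq. (52)):
        W = [[A*Q,       0],
             [A*(1 - Q), 1]]
    Requires A*Q != 1 so that W is diagonalizable.
    """
    W = np.array([[A * Q, 0.0],
                  [A * (1.0 - Q), 1.0]])
    lam, U = np.linalg.eig(W)                   # W = U diag(lam) U^{-1}
    coeff = np.linalg.solve(U, np.asarray(x0, dtype=float))
    # Subtract the largest eigenvalue before exponentiating; the common
    # factor cancels in the frequency and this avoids overflow at large t.
    x = U @ (np.exp((lam - lam.max()) * t) * coeff)
    return x[0] / x.sum()

A = 10.0
Q = 0.99 ** 20          # e.g. per-base fidelity 0.99, sequence length 20
p_stationary = (A * Q - 1.0) / (A - 1.0)

print(master_frequency(0.0, A, Q, [1.0, 0.0]))   # starts at 1
print(master_frequency(50.0, A, Q, [1.0, 0.0]), p_stationary)
```

Comparing the long-time value with the closed-form stationary limit is the same kind of consistency check that a stochastic simulation would be tested against.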

Starvation process

During an idealized starvation process, a single genotype is placed inside a medium containing Inline graphic NTPs and Inline graphic PPi's with the corresponding initial thermodynamic force Inline graphic. As replication progresses, Inline graphic decreases via

graphic file with name pcbi.1002534.e760.jpg (55)

assuming rapid mixing, and Inline graphic grows via Inline graphic. Equation (55) can be solved for Inline graphic to give

graphic file with name pcbi.1002534.e764.jpg (56)

Introducing the fractional population size Inline graphic with respect to the asymptotic population Inline graphic reached in the limit of equilibrium (Inline graphic),

graphic file with name pcbi.1002534.e768.jpg (57)

or

graphic file with name pcbi.1002534.e769.jpg (58)

Equation (58), together with Inline graphic and Inline graphic, gives the dependence of mean fitness on Inline graphic during a starvation process with the initial condition Inline graphic. This procedure assumes that the early stages of growth with finite Inline graphic for which Inline graphic deviates from Eq. (19) make negligible contributions. The mean fitness averaged over the process is

graphic file with name pcbi.1002534.e776.jpg (59)

This integral was performed using the trapezoidal rule to obtain Figure 10.
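As a generic illustration of this numerical step, the following sketch averages a tabulated function over an interval with the trapezoidal rule. The integrand of Eq. (59) is not reproduced here, so the test functions, the grid size `n`, and the helper names are placeholders chosen for illustration.

```python
def trapezoid(xs, ys):
    """Trapezoidal-rule integral of samples ys taken at increasing points xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

def mean_over_interval(f, a, b, n=1000):
    """Average of f over [a, b] using n equal trapezoidal panels."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    return trapezoid(xs, ys) / (b - a)

# Placeholder integrands: the rule is exact for linear functions and
# converges quadratically in the panel width for smooth ones.
linear_mean = mean_over_interval(lambda x: 2.0 * x + 1.0, 0.0, 1.0)
quadratic_mean = mean_over_interval(lambda x: x * x, 0.0, 1.0)
print(linear_mean, quadratic_mean)
```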

Acknowledgments

The authors thank Joy Hoffman for her help with manuscript preparation. The opinions and assertions contained are the private views of the authors and are not to be construed as official or as reflecting the views of the U.S. Army or the U.S. Department of Defense. This paper has been approved for public release with unlimited distribution.

Footnotes

The authors have declared that no competing interests exist.

This work was funded by the Military Operational Medicine Research Program of the U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland under the U.S. Army's Network Science Initiative. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Orgel LE. Prebiotic chemistry and the origin of the RNA world. Crit Rev Biochem Mol Biol. 2004;39:99–123. doi: 10.1080/10409230490460765.
  • 2. Powner MW, Gerland B, Sutherland JD. Synthesis of activated pyrimidine ribonucleotides in prebiotically plausible conditions. Nature. 2009;459:239–242. doi: 10.1038/nature08013.
  • 3. Hazen RM, Sverjensky DA. Mineral surfaces, geochemical complexities, and the origins of life. Cold Spring Harb Perspect Biol. 2010;2:a002162. doi: 10.1101/cshperspect.a002162.
  • 4. Mulkidjanian AY, Bychkov AY, Dibrova DV, Galperin MY, Koonin EV. Origin of first cells at terrestrial, anoxic geothermal fields. Proc Natl Acad Sci U S A. 2012;109:E821–E830. doi: 10.1073/pnas.1117774109.
  • 5. Lincoln TA, Joyce GF. Self-sustained replication of an RNA enzyme. Science. 2009;323:1229–1232. doi: 10.1126/science.1167856.
  • 6. Cheng LK, Unrau PJ. Closing the circle: replicating RNA with RNA. Cold Spring Harb Perspect Biol. 2010;2:a002204. doi: 10.1101/cshperspect.a002204.
  • 7. Joyce GF. The antiquity of RNA-based evolution. Nature. 2002;418:214–221. doi: 10.1038/418214a.
  • 8. Ferris JP, Hill AR, Jr, Liu R, Orgel LE. Synthesis of long prebiotic oligomers on mineral surfaces. Nature. 1996;381:59–61. doi: 10.1038/381059a0.
  • 9. Johnston WK, Unrau PJ, Lawrence MS, Glasner ME, Bartel DP. RNA-catalyzed RNA polymerization: Accurate and general RNA-templated primer extension. Science. 2001;292:1319–1325. doi: 10.1126/science.1060786.
  • 10. Wochner A, Attwater J, Coulson A, Holliger P. Ribozyme-catalyzed transcription of an active ribozyme. Science. 2011;332:209–212. doi: 10.1126/science.1200752.
  • 11. Fontana W, Schuster P. Continuity in evolution: on the nature of transitions. Science. 1998;280:1451–1455. doi: 10.1126/science.280.5368.1451.
  • 12. Obermayer B, Krammer H, Braun D, Gerland U. Emergence of information transmission in a prebiotic RNA reactor. Phys Rev Lett. 2011;107:018101. doi: 10.1103/PhysRevLett.107.018101.
  • 13. Kauffman SA. The origins of order. New York: Oxford; 1993.
  • 14. Nowak MA, Ohtsuki H. Prevolutionary dynamics and the origin of evolution. Proc Natl Acad Sci U S A. 2008;105:14924–14927. doi: 10.1073/pnas.0806714105.
  • 15. Eigen M. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften. 1971;58:465–523. doi: 10.1007/BF00623322.
  • 16. Swetina J, Schuster P. A model for polynucleotide replication. Biophys Chem. 1982;16:329–345. doi: 10.1016/0301-4622(82)87037-3.
  • 17. Saakian DB, Hu CK. Exact solution of the Eigen model with general fitness functions and degradation rates. Proc Natl Acad Sci U S A. 2006;103:4935–4939. doi: 10.1073/pnas.0504924103.
  • 18. Wagner H, Baake E, Gerisch T. Ising quantum chain and sequence evolution. J Stat Phys. 1998;92:1017–1052.
  • 19. Baake E, Baake M, Wagner H. Ising quantum chain is equivalent to a model of biological evolution. Phys Rev Lett. 1997;78:559–562.
  • 20. Saakian D, Hu CK. Eigen model as a quantum spin chain: Exact dynamics. Phys Rev E. 2004;69:021913. doi: 10.1103/PhysRevE.69.021913.
  • 21. Saakian DB. A new method for the solution of models of biological evolution: Derivation of exact steady-state distributions. J Stat Phys. 2007;128:781–798.
  • 22. Saakian DB, Hu CK, Khachatryan H. Solvable biological evolution models with general fitness functions and multiple mutations in parallel mutation-selection scheme. Phys Rev E. 2004;70:041908. doi: 10.1103/PhysRevE.70.041908.
  • 23. Park JM, Muñoz E, Deem MW. Quasispecies theory for finite populations. Phys Rev E. 2010;81:011902. doi: 10.1103/PhysRevE.81.011902.
  • 24. Saakian DB, Rozanova O, Akmetzhanov A. Dynamics of the Eigen and the Crow-Kimura models for molecular evolution. Phys Rev E. 2008;78:041908. doi: 10.1103/PhysRevE.78.041908.
  • 25. Domingo E, Holland JJ. RNA virus mutations and fitness for survival. Annu Rev Microbiol. 1997;51:151–178. doi: 10.1146/annurev.micro.51.1.151.
  • 26. Anderson JP, Daifuku R, Loeb LA. Viral error catastrophe by mutagenic nucleosides. Annu Rev Microbiol. 2004;58:183–205. doi: 10.1146/annurev.micro.58.030603.123649.
  • 27. Kun A, Santos M, Szathmáry E. Real ribozymes suggest a relaxed error threshold. Nat Genet. 2005;37:1008–1011. doi: 10.1038/ng1621.
  • 28. Rajamani S, Ichida JK, Antal T, Treco DA, Leu K, et al. Effect of stalling after mismatches on the error catastrophe in nonenzymatic nucleic acid replication. J Am Chem Soc. 2010;132:5880–5885. doi: 10.1021/ja100780p.
  • 29. Saakian DB, Biebricher CK, Hu CK. Lethal mutants and truncated selection together solve a paradox of the origin of life. PLoS ONE. 2011;6:e21904. doi: 10.1371/journal.pone.0021904.
  • 30. Tannenbaum E, Shakhnovich EI. Error and repair catastrophes: A two-dimensional phase diagram in the quasispecies model. Phys Rev E. 2004;69:011902. doi: 10.1103/PhysRevE.69.011902.
  • 31. Tannenbaum E, Deeds EJ, Shakhnovich EI. Equilibrium distribution of mutators in the single fitness peak model. Phys Rev Lett. 2003;91:138105. doi: 10.1103/PhysRevLett.91.138105.
  • 32. Andrieux D, Gaspard P. Nonequilibrium generation of information in copolymerization processes. Proc Natl Acad Sci U S A. 2008;105:9516–9521. doi: 10.1073/pnas.0802049105.
  • 33. Andrieux D, Gaspard P. Molecular information processing in nonequilibrium copolymerizations. J Chem Phys. 2009;130:014901. doi: 10.1063/1.3050099.
  • 34. Woo HJ, Wallqvist A. Nonequilibrium phase transitions associated with DNA replication. Phys Rev Lett. 2011;106:060601. doi: 10.1103/PhysRevLett.106.060601.
  • 35. Jukes TH, Cantor CR. Evolution of protein molecules. In: Munro MN, editor. Mammalian protein metabolism. Volume 3. New York: Academic Press; 1969. pp. 21–132.
  • 36. Dickson KS, Burns CM, Richardson JP. Determination of the free-energy change for repair of a DNA phosphodiester bond. J Biol Chem. 2000;275:15828–15831. doi: 10.1074/jbc.M910044199.
  • 37. Minetti CA, Remeta DP, Miller H, Gelfand CA, Plum GE, et al. The thermodynamics of template-directed DNA synthesis: base insertion and extension enthalpies. Proc Natl Acad Sci U S A. 2003;100:14719–14724. doi: 10.1073/pnas.2336142100.
  • 38. Johnson KA. The kinetic and chemical mechanism of high-fidelity DNA polymerases. Biochim Biophys Acta. 2010;1804:1041–1048. doi: 10.1016/j.bbapap.2010.01.006.
  • 39. Petruska J, Goodman MF, Boosalis MS, Sowers LC, Cheong C, et al. Comparison between DNA melting thermodynamics and DNA polymerase fidelity. Proc Natl Acad Sci U S A. 1988;85:6252–6256. doi: 10.1073/pnas.85.17.6252.
  • 40. Allawi HT, SantaLucia J., Jr Thermodynamics and NMR of internal G·T mismatches in DNA. Biochemistry. 1997;36:10581–10594. doi: 10.1021/bi962590c.
  • 41. Turner DH. Thermodynamics of base pairing. Curr Opin Struct Biol. 1996;6:299–304. doi: 10.1016/s0959-440x(96)80047-9.
  • 42. Freier SM, Kierzek R, Jaeger JA, Sugimoto N, Caruthers MH, et al. Improved free-energy parameters for predictions of RNA duplex stability. Proc Natl Acad Sci U S A. 1986;83:9373–9377. doi: 10.1073/pnas.83.24.9373.
  • 43. Arnold JJ, Cameron CE. Poliovirus RNA-dependent RNA polymerase (3Dpol): pre-steadystate kinetic analysis of ribonucleotide incorporation in the presence of Mg2+. Biochemistry. 2004;43:5126–5137. doi: 10.1021/bi035212y.
  • 44. Fiala KA, Suo Z. Pre-steady-state kinetic studies of the fidelity of Sulfolobus solfataricus P2 DNA polymerase IV. Biochemistry. 2004;43:2106–2115. doi: 10.1021/bi0357457.
  • 45. Roettger MP, Fiala KA, Sompalli S, Dong Y, Suo Z. Pre-steady-state kinetic studies of the fidelity of human DNA polymerase μ. Biochemistry. 2004;43:13827–13838. doi: 10.1021/bi048782m.
  • 46. Dieckman LM, Johnson RE, Prakash S, Washington MT. Pre-steady state kinetic studies of the fidelity of nucleotide incorporation by yeast DNA polymerase δ. Biochemistry. 2010;49:7344–7350. doi: 10.1021/bi100556m.
  • 47. Zhang L, Brown JA, Newmister SA, Suo Z. Polymerization fidelity of a replicative DNA polymerase from the hyperthermophilic archaeon Sulfolobus solfataricus P2. Biochemistry. 2009;48:7492–7501. doi: 10.1021/bi900532w.
  • 48. Lee HR, Johnson KA. Fidelity of the human mitochondrial DNA polymerase. J Biol Chem. 2006;281:36236–36240. doi: 10.1074/jbc.M607964200.
  • 49. Ahn J, Werneburg BG, Tsai MD. DNA polymerase β: Structure-fidelity relationship from pre-steady-state kinetic analyses of all possible correct and incorrect base pairs for wild type and R283A mutant. Biochemistry. 1997;36:1100–1107. doi: 10.1021/bi961653o.
  • 50. Bertram JG, Oertell K, Petruska J, Goodman MF. DNA polymerase fidelity: Comparing direct competition of right and wrong dNTP substrates with steady state and pre-steady state kinetics. Biochemistry. 2010;49:20–28. doi: 10.1021/bi901653g.
  • 51. Zielinski M, Kozlov IA, Orgel LE. A comparison of RNA with DNA in template-directed synthesis. Helv Chim Acta. 2000;83:1678–1684. doi: 10.1002/1522-2675(20000809)83:8<1678::AID-HLCA1678>3.0.CO;2-P.
  • 52. Callen HB. Thermodynamics and an introduction to thermostatistics. 2nd edition. New York: Wiley; 1985.
  • 53. Stanley HE. Introduction to phase transitions and critical phenomena. Oxford: Oxford; 1987.
  • 54. Nowak MA. Evolutionary dynamics. Cambridge, MA: Harvard; 2006.
  • 55. Eid J, Fehr A, Gray J, Luong K, Lyle J, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–138. doi: 10.1126/science.1162986.
  • 56. Holmes EC. The evolution and emergence of RNA viruses. New York: Oxford; 2009.
  • 57. Schuster P, Fontana W. Chance and necessity in evolution: Lessons from RNA. Physica D. 1999;133:427–452.
  • 58. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 1977;81:2340–2361.
  • 59. Nowak M, Schuster P. Error thresholds of replication in finite populations mutation frequencies and the onset of Muller's ratchet. J Theor Biol. 1989;137:375–395. doi: 10.1016/s0022-5193(89)80036-0.
  • 60. Hermsen R, Hwa T. Sources and sinks: A stochastic model of evolution in heterogeneous environments. Phys Rev Lett. 2010;105:248104. doi: 10.1103/PhysRevLett.105.248104.
  • 61. Saakian DB. Evolution models with base substitutions, insertions, deletions, and selection. Phys Rev E. 2008;78:061920. doi: 10.1103/PhysRevE.78.061920.
  • 62. Esposito M, Lindenberg K, Van den Broeck C. Extracting chemical energy by growing disorder: efficiency at maximum power. J Stat Mech. 2010:P01008. doi: 10.1088/1742-5468/2010/01/P01008.
  • 63. Kitamura N, Sembler BL, Rothberg PG, Larsen GR, Adler CJ, et al. Primary structure, gene organization and polypeptide expression of poliovirus RNA. Nature. 1981;291:547–553. doi: 10.1038/291547a0.

Articles from PLoS Computational Biology are provided here courtesy of PLOS
