Proceedings of the Royal Society B: Biological Sciences. 2019 Oct 2;286(1912):20191359. doi: 10.1098/rspb.2019.1359

The origin of the central dogma through conflicting multilevel selection

Nobuto Takeuchi 1,3, Kunihiko Kaneko 1,2
PMCID: PMC6790754  PMID: 31575361

Abstract

The central dogma of molecular biology rests on two kinds of asymmetry between genomes and enzymes: informatic asymmetry, where information flows from genomes to enzymes but not from enzymes to genomes; and catalytic asymmetry, where enzymes provide chemical catalysis but genomes do not. How did these asymmetries originate? Here, we show that these asymmetries can spontaneously arise from conflict between selection at the molecular level and selection at the cellular level. We developed a model consisting of a population of protocells, each containing a population of replicating catalytic molecules. The molecules are assumed to face a trade-off between serving as catalysts and serving as templates. This trade-off causes conflicting multilevel selection: serving as catalysts is favoured by selection between protocells, whereas serving as templates is favoured by selection between molecules within protocells. This conflict induces informatic and catalytic symmetry breaking, whereby the molecules differentiate into genomes and enzymes, establishing the central dogma. We show mathematically that the symmetry breaking is caused by a positive feedback between Fisher’s reproductive values and the relative impact of selection at different levels. This feedback induces a division of labour between genomes and enzymes, provided variation at the molecular level is sufficiently large relative to variation at the cellular level, a condition that is expected to hinder the evolution of altruism. Taken together, our results suggest that the central dogma is a logical consequence of conflicting multilevel selection.

Keywords: reproductive division of labour, origin of genetic information, RNA world hypothesis, prebiotic evolution, Price equation

1. Introduction

At the heart of living systems lies a distinction between genomes and enzymes—a division of labour between the transmission of genetic information and the provision of chemical catalysis. This distinction rests on two types of asymmetry between genomes and enzymes: informatic asymmetry, where information flows from genomes to enzymes but not from enzymes to genomes; and catalytic asymmetry, where enzymes provide chemical catalysis but genomes do not. These two asymmetries constitute the essence of the central dogma in functional terms [1].

However, current hypotheses about the origin of life posit that genomes and enzymes were initially undistinguished, both embodied in a single type of molecule, RNA or its analogues [2]. While these hypotheses resolve the chicken-and-egg paradox of whether genomes or enzymes came first, they raise an obvious question: how did the genome-enzyme distinction originate?

Michod hypothesized that a genome-enzyme distinction evolved because the distinction maximized the multiplication rates of replicators by allowing the unconstrained optimization of the replication rate and hydrolytic resistance of replicators that are in a trade-off relation [3].

We consider an alternative possibility that does not depend on the assumption that a genome-enzyme distinction maximized the multiplication rates of replicators. Specifically, we explore the possibility that a genome-enzyme distinction arose from conflict between selection at the level of protocells and selection at the level of molecules within protocells. During the evolutionary transition from replicating molecules to protocells, competition occurred both between protocells and between molecules within protocells [4–7]. Consequently, selection operated at both cellular and molecular levels, and selection at one level was potentially in conflict with selection at the other [8,9]. Previous studies have demonstrated that such conflicting multilevel selection can induce a partial and primitive distinction between genomes and enzymes in replicating molecules [10,11]. Specifically, the molecules undergo catalytic symmetry breaking between their complementary strands, whereby one strand becomes catalytic and the other becomes non-catalytic. However, the molecules do not undergo informatic symmetry breaking—i.e. one-way flow of information from non-catalytic to catalytic molecules—because complementary replication requires both strands to be replicated. Therefore, the previous studies have left the most essential aspect of the central dogma unexplained.

Here, we investigate whether conflicting multilevel selection can induce both informatic and catalytic symmetry breaking in replicating molecules. To this end, we extend the previous model by considering two types of replicating molecules, denoted by P and Q. Although P and Q could be interpreted as RNA and DNA, their chemical identity is unspecified for simplicity and generality. To examine the possibility of spontaneous symmetry breaking, we assume that P and Q initially do not distinguish each other. We then ask whether evolution creates a distinction between P and Q such that information flows irreversibly from one type (either P or Q) that is non-catalytic to the other that is catalytic.

2. Model

Our model is an agent-based model with two types of replicators, P and Q. We assume that both P and Q are initially capable of catalysing four reactions at an equal rate: the replication of P, replication of Q, transcription of P to Q, and transcription of Q to P, where complementarity is ignored (figure 1a; note that this figure does not depict a two-member hypercycle because in our model replicators undergo transcription [12]; see Discussion for more on comparison with hypercycles).

Figure 1.


The agent-based model (see Methods for details). (a) Two types of replicators, P and Q, can serve as templates and catalysts for producing either type. Circular harpoons indicate replication; straight harpoons, transcription (heads indicate products; tails, templates). Dotted arrows indicate catalysis (heads indicate reaction catalysed; tails, replicators providing catalysis). (b) Replicators undergo complex formation, replication, transcription, and decay. Rate constants of complex formation are given by the $k^{c}_{pt}$ values of a replicator serving as a catalyst (whose type, P or Q, is denoted by c). The catalyst can form two distinct complexes with another replicator serving as a template (whose type is denoted by t) depending on whether it replicates ($p = t$) or transcribes ($p \neq t$) the template. (c) Protocells exchange substrate (represented by stars) through rapid diffusion. Protocells divide when the number of internal particles exceeds V. Protocells are removed when they lose all particles. (Online version in colour.)

Replicators compete for a finite supply of substrate denoted by S (hereafter, P, Q, and S are collectively called particles). S is consumed through the replication and transcription of P and Q, and recycled through the decay of P and Q (figure 1b). Thus, the total number of particles, i.e. the sum of the total numbers of P, Q, and S is kept constant (the relative frequencies of P, Q, and S are variable).

All particles are compartmentalized into protocells, across which P and Q do not diffuse at all, but S diffuses rapidly (figure 1c; Methods). This difference in diffusion induces the passive transport of S from protocells in which S is converted into P and Q slowly, to protocells in which this conversion is rapid. Consequently, the latter grow at the expense of the former [13]. If the number of particles in a protocell exceeds threshold V, the protocell is divided with its particles randomly distributed between the two daughter cells; conversely, if this number decreases to zero, the protocell is discarded.
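The division rule above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `divide_if_needed` is hypothetical, and the assumption that each particle is assigned to a daughter independently with probability 1/2 is ours, since the text says only that particles are "randomly distributed".

```python
import random


def divide_if_needed(particles, V, rng=random):
    """Return the protocell(s) that result from applying the division rule.

    particles: list of particles (P, Q, or S) in one protocell.
    V: division threshold; the cell divides when it holds more than V particles.
    Assumption (not from the source): each particle goes to either daughter
    independently with probability 1/2.
    """
    if len(particles) <= V:
        return [list(particles)]          # no division
    daughters = ([], [])
    for particle in particles:
        daughters[rng.randrange(2)].append(particle)
    return [daughters[0], daughters[1]]


def discard_empty(protocells):
    """Remove protocells that have lost all particles."""
    return [cell for cell in protocells if len(cell) > 0]
```

Division conserves the total number of particles, which is consistent with the constant particle count described earlier; only the removal of empty protocells changes the number of cells without changing the number of particles.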

Crucial in our modelling is the incorporation of a trade-off between a replicator’s catalytic activities and templating opportunities. This trade-off arises from the constraint that providing catalysis and serving as a template impose structurally incompatible requirements on replicators [14,15]. Because replication or transcription takes a finite amount of time, serving as a catalyst comes at the cost of spending less time serving as a template, thereby inhibiting the catalyst's own replication. To incorporate this trade-off, the model assumes that replication and transcription entail complex formation between a catalyst and template (figure 1b) [16]. The rate constants of complex formation are given by the catalytic activities (denoted by $k^{c}_{pt}$) of replicators, as described below.

Each replicator is individually assigned eight catalytic values denoted by $k^{c}_{pt} \in [0,1]$, where the indices (c, p, and t) are P or Q (figure 1a). Four of these $k^{c}_{pt}$ values denote the catalytic activities of the replicator itself; the other four, those of its transcripts. For example, if a replicator is of type P, its catalytic activities are given by its $k^{P}_{pt}$ values, whereas those of its transcripts, which are of type Q, are given by its $k^{Q}_{pt}$ values. The indices p and t denote the specific type of reaction catalysed, as depicted in figure 1a. When a new replicator is produced, its $k^{c}_{pt}$ values are inherited from its template with potential mutation of probability m (Methods).
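As a rough illustration of this bookkeeping, the sketch below stores the eight catalytic values of one replicator in a dictionary keyed by (c, p, t). All names (`Replicator`, `new_replicator`, `copy_with_mutation`) are hypothetical. The mutation scheme is an assumption made for illustration only: here each value mutates independently with probability m by a bounded uniform step, whereas the actual scheme is specified in the Methods and may differ.

```python
import random
from dataclasses import dataclass
from itertools import product as cartesian

TYPES = ("P", "Q")


@dataclass
class Replicator:
    kind: str   # replicator type, "P" or "Q"
    k: dict     # maps (c, p, t) to a catalytic value in [0, 1]; eight entries

    def own_activity(self, p, t):
        """Catalytic activity of this replicator itself (c equals its own type)."""
        return self.k[(self.kind, p, t)]


def new_replicator(kind, value=1.0):
    """Create a replicator whose eight catalytic values are all `value`."""
    return Replicator(kind, {idx: value for idx in cartesian(TYPES, repeat=3)})


def copy_with_mutation(template, product_kind, m, step=0.05, rng=random):
    """Produce a new replicator of type `product_kind` from `template`.

    Assumption (hypothetical): each value mutates independently with
    probability m by a uniform step in [-step, step], clipped to [0, 1].
    """
    child_k = {}
    for idx, value in template.k.items():
        if rng.random() < m:
            value = min(1.0, max(0.0, value + rng.uniform(-step, step)))
        child_k[idx] = value
    return Replicator(product_kind, child_k)
```

With `product_kind` equal to the template's type the call models replication; with the other type it models transcription. In both cases all eight values are passed on, which is why a replicator carries the catalytic values of its transcripts as well as its own.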

The $k^{c}_{pt}$ values of a replicator determine the rates at which this replicator forms a complex with another replicator and catalyses replication or transcription of the latter (figure 1b; Methods). The greater the catalytic activities ($k^{c}_{pt}$) of a replicator, the greater the chance that the replicator is sequestered in a complex as a catalyst and thus unable to serve as a template—hence a trade-off. Note that the trade-off is relative: if all replicators in a protocell have identical $k^{c}_{pt}$ values, their multiplication rate increases monotonically with their $k^{c}_{pt}$ values, assuming all else is held constant.

The above trade-off creates a dilemma: providing catalysis brings benefit at the cellular level because it accelerates a protocell’s uptake of substrate; however, providing catalysis brings cost at the molecular level because it decreases the relative opportunity of a replicator to be replicated within a protocell [10]. Therefore, selection between protocells tends to maximize the $k^{c}_{pt}$ values of replicators (i.e. cellular-level selection), whereas selection within protocells tends to minimize the $k^{c}_{pt}$ values of replicators (i.e. molecular-level selection).

3. Results

(a). Computational analysis

Using the agent-based model described above, we examined how $k^{c}_{pt}$ values evolve as a result of conflicting multilevel selection. To this end, we set the initial $k^{c}_{pt}$ values of all replicators to 1, so that P and Q are initially identical in their catalytic activities (the initial frequencies of P and Q are also set to be equal). We then simulated the model for various values of V (the threshold at which protocells divide) and m (mutation rate).

Our main result is that for sufficiently large values of V and m, replicators undergo spontaneous symmetry breaking in three aspects (figure 2a–d; electronic supplementary material, figure S1). First, one type of replicator (either P or Q) evolves high catalytic activity, whereas the other completely loses it (i.e. $k^{c}_{pt} \gg k^{c'}_{pt} \approx 0$ for $c \neq c'$): catalytic symmetry breaking (figure 2b,c). Second, templates are transcribed into catalysts, but catalysts are not reverse-transcribed into templates (i.e. $k^{c}_{ct} \gg k^{c}_{tc} \approx 0$): informatic symmetry breaking (figure 2b,c). Finally, the copy number of templates becomes smaller than that of catalysts: numerical symmetry breaking (figure 2d). This threefold symmetry breaking is robust to various changes in model details (see electronic supplementary material, Text 1.1 and 1.2; figures S2–S4).

Figure 2.


The evolution of the central dogma. (a) Phase diagram: circles indicate no symmetry breaking (electronic supplementary material, figure S1a,b); squares, uncategorized (electronic supplementary material, figure S1c,d); open triangles, incomplete symmetry breaking (electronic supplementary material, figure S1e–h); filled triangles, threefold symmetry breaking as depicted in b, c, and d; diamonds, catalytic and informatic symmetry breaking without numerical symmetry breaking (electronic supplementary material, figure S5a). The initial condition was $k^{c}_{pt} = 1$ for all replicators. (b) Dynamics of $k^{c}_{pt}$ averaged over all replicators. V = 10 000 and m = 0.01. (c) Catalytic activities evolved in b. (d) Per-cell frequency of minority replicator types (P or Q) at equilibrium as a function of V: boxes, quartiles; whiskers, 5th and 95th percentiles. Only protocells containing at least V/2 particles were considered. (e) Frequencies of templates (orange) and catalysts (blue) in the entire population or in the common ancestors. V = 3162 and m = 0.01. (f) Illustration of e. Circles represent replicators; arrows, genealogy. Extinct lineages are grey. Common ancestors are always templates, whereas the majority of replicators are catalysts. (Online version in colour.)

A significant consequence of the catalytic and informatic symmetry breaking is the resolution of the dilemma between providing catalysis and getting replicated. Once symmetry is broken, tracking lineages reveals that the common ancestors of all replicators are almost always templates (figure 2e,f; Methods). That is, information is transmitted almost exclusively through templates, whereas information in catalysts is eventually lost (i.e. catalysts have zero reproductive value). Consequently, evolution operates almost exclusively through competition between templates, rather than between catalysts. How the catalytic activity of catalysts evolves, therefore, depends solely on the cost and benefit to templates. On one hand, this catalytic activity brings benefit to templates for competition across protocells. On the other hand, this activity brings no cost to templates for competition within a protocell (neither does it bring benefit because catalysis is equally shared among templates). Therefore, the catalytic activity of catalysts is maximized by cellular-level selection operating on templates, but not minimized by molecular-level selection operating on templates, hence the resolution of the dilemma between catalysing and templating. Because of this resolution, symmetry breaking leads to the maintenance of high catalytic activities (electronic supplementary material, figures S6 and S7).

(b). Mathematical analysis

To understand the mechanism of the catalytic and informatic symmetry breaking, we simplified the agent-based model into mathematical equations. These equations allow us to consider all the costs and benefits involved in the provision of catalysis by c ∈ {P, Q}: molecular-level cost to c (denoted by $\gamma^{c}_{c}$) and cellular-level benefit to t ∈ {P, Q} (denoted by $\beta^{c}_{t}$). The equations calculate the joint effects of all these costs and benefits on the evolution of the average catalytic activities of c (denoted by $\bar{k}^{c}$). The equations are derived with the help of Price’s theorem [8,9,17] and displayed below (see Methods and electronic supplementary material, Text 1.3 for the derivation):

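\[
\left.
\begin{aligned}
\Delta\bar{k}^{P} &\approx \bar{\omega}^{P}\left(\beta^{P}_{P}\sigma^{2}_{\mathrm{cel}} - \gamma^{P}_{P}\sigma^{2}_{\mathrm{mol}}\right) + \bar{\omega}^{Q}\beta^{P}_{Q}\sigma^{2}_{\mathrm{cel}} \\
\text{and}\quad \Delta\bar{k}^{Q} &\approx \bar{\omega}^{P}\beta^{Q}_{P}\sigma^{2}_{\mathrm{cel}} + \bar{\omega}^{Q}\left(\beta^{Q}_{Q}\sigma^{2}_{\mathrm{cel}} - \gamma^{Q}_{Q}\sigma^{2}_{\mathrm{mol}}\right),
\end{aligned}
\right\}
\tag{3.1}
\]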

where $\Delta$ denotes evolutionary change per generation, $\bar{\omega}^{c}$ is the average normalized reproductive value of c, $\sigma^{2}_{\mathrm{cel}}$ is the variance of catalytic activities among protocells (cellular-level variance), and $\sigma^{2}_{\mathrm{mol}}$ is the variance of catalytic activities within a protocell (molecular-level variance).

The first and second terms on the right-hand side of equations (3.1) represent evolution arising through the replication of P and Q, respectively, weighted by the reproductive values, $\bar{\omega}^{P}$ and $\bar{\omega}^{Q}$. The terms multiplied by $\beta^{c}_{t}\sigma^{2}_{\mathrm{cel}}$ represent evolution driven by cellular-level selection; those by $\gamma^{c}_{c}\sigma^{2}_{\mathrm{mol}}$, evolution driven by molecular-level selection.

The derivation of equations (3.1) involves various simplifications that are not made in the agent-based model, among which the three most important are noted below (see Methods and electronic supplementary material, Text 1.3 for details). First, equations (3.1) simplify evolutionary dynamics by restricting the number of evolvable parameters to a minimum required for catalytic and informatic symmetry breaking. More specifically, equations (3.1) assume that $k^{c}_{pt}$ is independent of p and t (denoted by $k^{c}$), i.e. catalysts do not distinguish the replicator types of templates and products. Despite this simplification, catalytic symmetry breaking can still occur (e.g. $k^{P} > k^{Q}$), as can informatic symmetry breaking: the trade-off between catalysing and templating causes information to flow preferentially from less catalytic to more catalytic replicator types. However, numerical symmetry breaking is excluded as it requires $k^{c}_{pt}$ to depend on p; consequently, the frequencies of P and Q are fixed and equal in equations (3.1) (this is not the case in the agent-based model described in the previous section). Therefore, while equations (3.1) are useful for identifying the mechanism of catalytic and informatic symmetry breaking, they are not useful for identifying the mechanism of numerical symmetry breaking. In the electronic supplementary material, we use different equations to identify the mechanism of numerical symmetry breaking (see electronic supplementary material, Text 1.4 and figure S5).

The second simplification involved in equations (3.1) is that variances $\sigma^{2}_{\mathrm{mol}}$ and $\sigma^{2}_{\mathrm{cel}}$ are treated as parameters although they are actually dynamic variables dependent on m and V in the agent-based model (in electronic supplementary material, we examine this assumption; see electronic supplementary material, Text 1.5 and figure S8). In addition, these variances are assumed to be identical between $\bar{k}^{P}$ and $\bar{k}^{Q}$ because no difference is a priori assumed between P and Q.

The third simplification involved in equations (3.1) is that the terms of order greater than $\sigma^{2}_{\mathrm{cel}}$ and $\sigma^{2}_{\mathrm{mol}}$ are ignored under the assumption of weak selection [17].

Using equations (3.1), we can now elucidate the mechanism of the symmetry breaking. Consider a symmetric situation where P and Q are equally catalytic: $\bar{k}^{P} = \bar{k}^{Q}$. Since P and Q are identical, the catalytic activities of P and Q evolve identically: $\Delta\bar{k}^{P} = \Delta\bar{k}^{Q}$. Next, suppose that P becomes slightly more catalytic than Q for whatever reason, e.g. by genetic drift: $\bar{k}^{P} > \bar{k}^{Q}$ (catalytic asymmetry). The trade-off between catalysing and templating then causes P to be replicated less frequently than Q, so that $\bar{\omega}^{P} < \bar{\omega}^{Q}$ (informatic asymmetry). Consequently, the second terms of equations (3.1) increase relative to the first terms. That is, for catalysis provided by P (i.e. $\bar{k}^{P}$), the impact of cellular-level selection through Q (i.e. $\bar{\omega}^{Q}\beta^{P}_{Q}\sigma^{2}_{\mathrm{cel}}$) increases relative to those of molecular-level and cellular-level selection through P (i.e. $\bar{\omega}^{P}\gamma^{P}_{P}\sigma^{2}_{\mathrm{mol}}$ and $\bar{\omega}^{P}\beta^{P}_{P}\sigma^{2}_{\mathrm{cel}}$, respectively), resulting in the relative strengthening of cellular-level selection. By contrast, for catalysis provided by Q (i.e. $\bar{k}^{Q}$), the impacts of molecular-level and cellular-level selection through Q (i.e. $\bar{\omega}^{Q}\gamma^{Q}_{Q}\sigma^{2}_{\mathrm{mol}}$ and $\bar{\omega}^{Q}\beta^{Q}_{Q}\sigma^{2}_{\mathrm{cel}}$, respectively) increase relative to that of cellular-level selection through P (i.e. $\bar{\omega}^{P}\beta^{Q}_{P}\sigma^{2}_{\mathrm{cel}}$), resulting in the relative strengthening of molecular-level selection. Consequently, a small difference between $\bar{k}^{P}$ and $\bar{k}^{Q}$ leads to $\Delta\bar{k}^{P} > \Delta\bar{k}^{Q}$, the amplification of the initial difference—hence, symmetry breaking. The above mechanism can be summarized as a positive feedback between reproductive values and the relative impact of selection at different levels.
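The amplification can be made explicit in a simplified special case. The following is a sketch under the settings that the Figure 3 caption uses for the phase-plane analysis, and these are assumptions of the sketch rather than properties of the agent-based model: all benefit and cost coefficients equal 1, and the reproductive values are given by a normalized exponential of the catalytic activities with a negative exponent (so that the more catalytic type has the lower reproductive value) and sum to 1.

```latex
% Assumptions: \beta^{c}_{t} = \gamma^{c}_{c} = 1,
% \bar{\omega}^{c} = e^{-\bar{k}^{c}} / (e^{-\bar{k}^{P}} + e^{-\bar{k}^{Q}}),
% hence \bar{\omega}^{P} + \bar{\omega}^{Q} = 1.
\begin{aligned}
\Delta\bar{k}^{P} &\approx \bar{\omega}^{P}\left(\sigma^{2}_{\mathrm{cel}} - \sigma^{2}_{\mathrm{mol}}\right) + \bar{\omega}^{Q}\sigma^{2}_{\mathrm{cel}}
  = \sigma^{2}_{\mathrm{cel}} - \bar{\omega}^{P}\sigma^{2}_{\mathrm{mol}}, \\
\Delta\bar{k}^{Q} &\approx \bar{\omega}^{P}\sigma^{2}_{\mathrm{cel}} + \bar{\omega}^{Q}\left(\sigma^{2}_{\mathrm{cel}} - \sigma^{2}_{\mathrm{mol}}\right)
  = \sigma^{2}_{\mathrm{cel}} - \bar{\omega}^{Q}\sigma^{2}_{\mathrm{mol}}, \\
\Delta\bar{k}^{P} - \Delta\bar{k}^{Q} &\approx \left(\bar{\omega}^{Q} - \bar{\omega}^{P}\right)\sigma^{2}_{\mathrm{mol}}
  = \sigma^{2}_{\mathrm{mol}}\,\tanh\!\left(\frac{\bar{k}^{P} - \bar{k}^{Q}}{2}\right).
\end{aligned}
```

In this special case the difference in evolutionary change has the same sign as the difference in catalytic activity, vanishes only in the symmetric state, and scales with the molecular-level variance. Any small asymmetry is therefore amplified, and more strongly so when molecular-level variance is large.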

We next asked whether, and under what conditions, the above feedback leads to symmetry breaking such that either P or Q completely loses catalytic activity. To address this question, we performed a phase-plane analysis of equations (3.1) as described in figure 3 (see Methods and electronic supplementary material, Text 1.6 for details). Figure 3 shows that $\bar{k}^{P}$ and $\bar{k}^{Q}$ diverge from symmetric states (i.e. $\Delta\bar{k}^{P} \neq \Delta\bar{k}^{Q}$), confirming the positive feedback described above. However, symmetry breaking occurs only if molecular-level variance $\sigma^{2}_{\mathrm{mol}}$ is sufficiently large relative to cellular-level variance $\sigma^{2}_{\mathrm{cel}}$ (i.e. if genetic relatedness between replicators, $\sigma^{2}_{\mathrm{cel}}/(\sigma^{2}_{\mathrm{mol}}+\sigma^{2}_{\mathrm{cel}})$, is sufficiently low; see Methods). Large $\sigma^{2}_{\mathrm{mol}}/\sigma^{2}_{\mathrm{cel}}$ is required because if $\sigma^{2}_{\mathrm{mol}}/\sigma^{2}_{\mathrm{cel}}$ is too small, cellular-level selection completely dominates over molecular-level selection, maximizing both $\bar{k}^{P}$ and $\bar{k}^{Q}$ (figure 3a). The requirement of large $\sigma^{2}_{\mathrm{mol}}/\sigma^{2}_{\mathrm{cel}}$ is consistent with the fact that the agent-based model displays symmetry breaking for sufficiently large V: the law of large numbers implies that $\sigma^{2}_{\mathrm{mol}}/\sigma^{2}_{\mathrm{cel}}$ increases with V [10,18]. This consistency with the agent-based model suggests that equations (3.1) correctly describe the mechanism of symmetry breaking in the agent-based model (see electronic supplementary material, Text 1.5 and figure S8 for an additional consistency check in terms of both m and V).
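To make the two variances and the relatedness ratio concrete, the sketch below computes them from a snapshot of catalytic values grouped by protocell. It is only an illustration under stated assumptions: cellular-level variance is taken as the unweighted population variance of protocell means, and molecular-level variance as the unweighted mean of within-protocell population variances. The weighting used in the paper's derivation is given in the Methods and may differ; the function names are hypothetical.

```python
from statistics import fmean, pvariance


def variance_components(cells):
    """Split variation in catalytic values into two levels.

    cells: list of non-empty lists, one inner list of k values per protocell.
    Returns (sigma2_cel, sigma2_mol).
    Assumption: unweighted averages over protocells (cell size is ignored).
    """
    cell_means = [fmean(values) for values in cells]
    sigma2_cel = pvariance(cell_means)                          # among protocells
    sigma2_mol = fmean([pvariance(values) for values in cells])  # within protocells
    return sigma2_cel, sigma2_mol


def relatedness(sigma2_cel, sigma2_mol):
    """Genetic relatedness as the ratio sigma2_cel / (sigma2_mol + sigma2_cel)."""
    total = sigma2_mol + sigma2_cel
    if total == 0:
        raise ValueError("relatedness is undefined when there is no variation")
    return sigma2_cel / total
```

For example, `variance_components([[0.0, 0.0], [1.0, 1.0]])` puts all variation among protocells (relatedness 1), whereas `variance_components([[0.0, 1.0], [0.0, 1.0]])` puts all variation within protocells (relatedness 0), the regime in which the text predicts symmetry breaking.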

Figure 3.


Phase-plane analysis. For this analysis, equations (3.1) were adapted as follows: $\beta^{c}_{t}$ and $\gamma^{c}_{c}$ were set to 1; $\bar{\omega}^{c}$ was calculated as $e^{-\bar{k}^{c}}/(e^{-\bar{k}^{P}}+e^{-\bar{k}^{Q}})$; $\Delta$ was replaced with time derivative ($d/d\tau$); and $(d/d\tau)\bar{k}^{c}$ was set to 0 if $\bar{k}^{c}=0$ or $\bar{k}^{c}=1$ to ensure that $k^{c}$ is bounded within [0, 1] as in the agent-based model. Solid lines indicate nullclines: $(d/d\tau)\bar{k}^{P}=0$ (red) and $(d/d\tau)\bar{k}^{Q}=0$ (blue). The nullclines at $\bar{k}^{c}=0$ and $\bar{k}^{c}=1$ are not depicted for visibility. Filled circles indicate symmetric (grey) and asymmetric (black) stable equilibria; open circles, unstable equilibria; arrows, short-duration flows ($\Delta\tau = 0.15$) leading to symmetric (grey) or asymmetric (black) equilibria. Dashed lines (orange) demarcate basins of attraction. $\sigma^{2}_{\mathrm{cel}}=1$. (a) Molecular-level variance is so small that cellular-level selection completely dominates; consequently, $k^{c}$ is always maximized. (b) Molecular-level variance is large enough to create asymmetric equilibria; however, cellular-level variance is still large enough to make $\bar{k}^{P}=\bar{k}^{Q}=1$ stable. (c) A tipping point; the nullclines overlap. (d) Molecular-level variance is so large that $\bar{k}^{P}=\bar{k}^{Q}=1$ is unstable; the asymmetric equilibria can be reached if $\bar{k}^{P}\approx\bar{k}^{Q}\approx 1$. (Online version in colour.)
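The adapted equations in this caption can be explored numerically. The sketch below is a minimal illustration, not the authors' code: it uses forward Euler integration, takes the reproductive values as a normalized exponential with a negative exponent, and keeps each activity in [0, 1] by clipping after every step. The clipping is an assumed simplification of the boundary rule stated in the caption, and the function names and step size are hypothetical choices.

```python
import math


def reproductive_values(kP, kQ):
    """Normalized reproductive values; the more catalytic type gets the lower value."""
    wP = math.exp(-kP) / (math.exp(-kP) + math.exp(-kQ))
    return wP, 1.0 - wP


def derivatives(kP, kQ, s2_cel, s2_mol, beta=1.0, gamma=1.0):
    """Right-hand sides of equations (3.1) with Delta read as a time derivative."""
    wP, wQ = reproductive_values(kP, kQ)
    dkP = wP * (beta * s2_cel - gamma * s2_mol) + wQ * beta * s2_cel
    dkQ = wP * beta * s2_cel + wQ * (beta * s2_cel - gamma * s2_mol)
    return dkP, dkQ


def integrate(kP, kQ, s2_cel, s2_mol, dt=0.01, steps=1000):
    """Forward Euler integration with clipping to [0, 1] (assumed boundary rule)."""
    for _ in range(steps):
        dkP, dkQ = derivatives(kP, kQ, s2_cel, s2_mol)
        kP = min(1.0, max(0.0, kP + dt * dkP))
        kQ = min(1.0, max(0.0, kQ + dt * dkQ))
    return kP, kQ


if __name__ == "__main__":
    # Small molecular-level variance: both activities are maximized.
    print(integrate(0.2, 0.1, s2_cel=1.0, s2_mol=1.0))
    # Larger molecular-level variance and an asymmetric start: P stays catalytic, Q does not.
    print(integrate(0.9, 0.1, s2_cel=1.0, s2_mol=2.5))
```

Under these assumptions, the first call ends at the symmetric state (1.0, 1.0) and the second at the asymmetric state (1.0, 0.0), matching the qualitative distinction between cellular-level dominance and symmetry breaking described for the panels.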

4. Discussion

Our results show that conflicting multilevel selection can induce informatic and catalytic symmetry breaking in replicating molecules. The symmetry breaking is induced because molecular-level selection minimizes the catalytic activity of one type of molecule (either P or Q), whereas cellular-level selection maximizes that of the other. The significance of the symmetry breaking is that it results in the one-way flow of information from non-catalytic to catalytic molecules—the central dogma. The symmetry breaking thereby establishes a division of labour between the transmission of genetic information and the provision of chemical catalysis. This division of labour resolves a dilemma between templating and catalysing, the very source of conflict between levels of selection. Below, we discuss our results in relation to four subjects, namely, chemistry, hypercycle theory, kin selection theory, and Michod’s 1983 paper [3].

Our theory does not specify the chemical details of replicating molecules, and this abstraction carries two implications. First, our theory suggests that the central dogma, if formulated in functional terms, is a general feature of living systems that is independent of protein chemistry. When the central dogma was originally proposed, it was formulated in chemical terms as the irreversible flow of information from nucleic acids to proteins [1]. Accordingly, the chemical properties of proteins have been considered integral to the central dogma [19]. By contrast, the present study formulates the central dogma in functional terms, as the irreversible flow of information from non-catalytic to catalytic molecules. Our theory shows that the central dogma, formulated as such, is a logical consequence of conflicting multilevel selection. Therefore, the central dogma might be a general feature of life that is independent of the chemical specifics of material in which life is embodied.

The second implication of the chemical abstraction is that our theory could be tested by experiments with existing materials. Our theory assumes that a replicator faces a trade-off between providing ‘catalysis’ and getting replicated. However, it does not restrict catalysis to being replicase activity: although our agent-based model assumes that catalysts are replicases, our mathematical analysis does not. Therefore, existing RNA and DNA molecules could be used to test our theory [20]. For example, one could compare two systems, one where RNA serves as both templates and catalysts, and one where RNA serves as catalysts and DNA serves as templates. According to our theory, the latter is expected to maintain higher catalytic activity through evolution, provided the mutation rate and the number of molecules per cell are sufficiently large (see also [21]). In addition, using RNA and DNA is potentially relevant to the historical origin of the central dogma, given the possibility that DNA might have emerged before the advent of protein translation [22–25].

While our theory is similar to hypercycle theory in that both are concerned with the evolution of complexity in replicator systems, our theory proposes a distinct mechanism for evolving such complexity. Whereas hypercycle theory proposes symbiosis between multiple lineages of replicators [12], our theory proposes symmetry breaking (i.e. differentiation) in a single lineage of replicators—a fundamental distinction that is drawn between ‘egalitarian’ and ‘fraternal’ major evolutionary transitions as defined by Queller [26] (egalitarianism implies equality, which is involved in the evolution of complexity through symbiosis, whereas fraternalism implies kinship, which is involved in the evolution of complexity through differentiation; these terms are taken from a French Revolutionary slogan, Liberté, Egalité, Fraternité).

Moreover, our theory differs from hypercycle theory in terms of the roles played by non-catalytic templates. In hypercycle theory, the evolution of non-catalytic templates jeopardizes hypercycles because such templates (called parasites) can replicate faster than catalytic templates constituting the hypercycles [16,27]. In our theory, the evolution of non-catalytic templates is one of the essential factors leading to the division of labour between genomes and enzymes.

While our theory differs from hypercycle theory in the above aspects, it does not contradict the latter. In fact, there is a potential synergy between the evolution of complexity through symmetry breaking and that through symbiosis. Our theory posits that a distinction between genomes and enzymes resolves the dilemma between templating and catalysing, thereby increasing the evolutionary stability of catalytic activities in replicators. Likewise, this distinction might also contribute to the evolutionary stability of symbiosis between replicators, hence the potential synergy (however, we should add that the specific mechanism of symbiosis proposed by hypercycle theory is not unique [28–34]).

While our theory is consistent with kin selection theory, it makes a novel prediction for evolution under a condition of low genetic relatedness. Kin selection theory posits that altruism can evolve if genetic relatedness is sufficiently high [35]. Consistent with this, our theory posits that for sufficiently high genetic relatedness (i.e. for sufficiently high $\sigma^{2}_{\mathrm{cel}}/(\sigma^{2}_{\mathrm{cel}}+\sigma^{2}_{\mathrm{mol}})$, or sufficiently small m and V), cellular-level selection maximizes the provision of catalysis by all molecules, establishing full altruism (providing catalysis can be viewed as altruism [36]: providing catalysis brings no direct benefit to a catalyst because a catalyst cannot catalyse the replication of itself in our model). However, the two theories diverge for sufficiently low genetic relatedness. In this case, kin selection theory predicts that evolution cannot lead to altruism; by contrast, our theory predicts that evolution can lead to a division of labour between the transmission of genetic information and the provision of chemical catalysis. Whether this reproductive division of labour should be called altruism is up for debate.

Michod hypothesized that a genome-enzyme distinction evolved because the distinction maximized the multiplication rates of replicators by allowing the unconstrained optimization of the replication rate and hydrolytic resistance of replicators that are in a trade-off relation [3]. While our present work is similar to Michod’s in emphasizing the trade-off faced by replicators, it differs from the latter in two respects. First, our work provides a model that explicitly demonstrates the evolution of a genome-enzyme distinction, whereas Michod’s work does not (the latter instead describes mathematical modelling that examines a condition required for the invasion of a hypercycle; however, the invasion of a hypercycle does not necessarily imply the evolution of a genome-enzyme distinction).

Second, Michod’s hypothesis assumes that a genome-enzyme distinction maximizes the multiplication rates of replicators, whereas our model does not involve this assumption. In our model, the multiplication rates of replicators increase monotonically with their catalytic activities if replicators have identical catalytic activities, assuming all else is held constant. Therefore, the multiplication rates of replicators are maximized if all replicators are maximally catalytic, a state that involves no genome-enzyme distinction. This state, in fact, evolves for sufficiently small V and m values, i.e. for sufficiently high relatedness (see also the discussion of kin selection theory above).

One might wonder how our model could display the evolution of a genome-enzyme distinction without the assumption that a genome-enzyme distinction maximizes the multiplication rates of replicators. The answer is conflicting multilevel selection. In our model, a genome-enzyme distinction evolves because it is a stable equilibrium between evolution driven by molecular-level selection and evolution driven by cellular-level selection. The symmetry breaking that creates this distinction cannot be induced by selection at a single level, molecular or cellular, because selection at a single level either maximizes or minimizes all catalytic activities of all replicators. Rather, the symmetry breaking is induced by conflict between molecular-level selection and cellular-level selection, the interaction that creates a positive feedback between reproductive values and the relative impact of selection at different levels. Similar results have been obtained from previous studies, where interactions between conflicting levels of selection are shown to evolve various states that are not directly selected for at any single level [10,21,37]. Taken together, these results suggest the possibility that biological complexity evolves as emergent outcomes of conflicting multilevel selection.

Finally, we note that the division of labour between the transmission of genetic information and other functions is a recurrent pattern throughout biological hierarchy. For example, multicellular organisms display differentiation between germline and soma, as do eusocial animal colonies between queens and workers (table 1) [4–7]. Given that all these systems potentially involve conflicting multilevel selection and tend to display reproductive division of labour as their sizes increase [7], our theory might provide a basis on which to pursue a universal principle of life that transcends the levels of biological hierarchy.

Table 1.

Division of labour between information transmission and other functions transcends the levels of biological hierarchy.

| hierarchy: whole | hierarchy: parts | differentiation: information | differentiation: other |
| --- | --- | --- | --- |
| cell | molecules | genome | enzyme |
| symbiont population* | prokaryotic cells | transmitted | non-transmitted |
| ciliate | organelles | micronucleus | macronucleus |
| multicellular organism | eukaryotic cells | germline | soma |
| eusocial colony | animals | queen | worker |

*Bacterial endosymbionts of ungulate lice (Haematopinus) and planthoppers (Fulgoroidea) [38].

5. Methods

(a). Details of the model

The model treats each molecule as a distinct individual with uniquely assigned $k^c_{pt}$ values. One time step of the model consists of three substeps: reaction, diffusion, and cell division.

In the reaction step, the reactions depicted in figure 1b are simulated with the algorithm described previously [10]. The rate constants of complex formation are given by the $k^c_{pt}$ values of a replicator serving as a catalyst. For example, if two replicators, denoted by X and Y, serve as a catalyst and template, respectively, the rate constant of complex formation is the $k^x_{py}$ value of X, where x, y, and p are the replicator types (i.e. P or Q) of X, Y, and product, respectively. If X and Y switch the roles (i.e. X serves as a template, and Y serves as a catalyst), the rate constant of complex formation is the $k^y_{px}$ value of Y. Complexes are distinguished not only by the roles of X and Y, but also by the replicator type of product p. Therefore, X and Y can form four distinct complexes depending on which replicator serves as a catalyst and which type of replicator is being produced.

The above rule about complex formation implies that whether a template is replicated ($p = t$) or transcribed ($p \neq t$) depends entirely on the $k^c_{pt}$ values of a catalyst. In other words, a template cannot control how its information is used by a catalyst. This rule excludes the possibility that a template maximizes its fitness by biasing catalysts towards replication rather than transcription. Excluding this possibility is legitimate if the backbone of a template does not directly determine the backbone of a product as in nucleic acid polymerization.

In addition, the above rule about complex formation implies that replicators multiply fastest if their $k^c_{pt}$ values are maximized for all combinations of c, p, and t (this is because X and Y form a complex at a rate proportional to $\sum_p (k^x_{py} + k^y_{px})$ if all possible complexes are considered). Consequently, cellular-level selection tends to maximize $k^c_{pt}$ values for all combinations of c, p, and t (because cellular-level selection tends to maximize the multiplication rate of replicators within protocells). If $k^c_{pt}$ values are maximized for all combinations of c, p, and t, P and Q coexist. Therefore, coexistence between P and Q is favoured by cellular-level selection, a situation that might not always be the case in reality. We ascertained that the above specific rule about complex formation does not critically affect results by examining an alternative model in which cellular-level selection does not necessarily favour coexistence between P and Q (see electronic supplementary material, Text 1.1).
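The complex-formation rule can be illustrated with a minimal Python sketch. The dictionary layout (a replicator's rate constants keyed by product type and template type), the function name `complex_rates`, and all numerical values are assumptions made for illustration; they are not taken from the authors' C++ code.

```python
TYPES = ("P", "Q")

def complex_rates(x_type, x_k, y_type, y_k):
    """Return the rate constants of the four complexes that X and Y can form.

    x_k and y_k map (product type p, template type t) to the k value that the
    replicator uses when it acts as the catalyst (hypothetical layout).
    Keys of the result are (which replicator catalyses, product type).
    """
    rates = {}
    for p in TYPES:
        rates[("X", p)] = x_k[(p, y_type)]  # X catalyses, Y is the template
        rates[("Y", p)] = y_k[(p, x_type)]  # Y catalyses, X is the template
    return rates

# Hypothetical values: X is of type P, Y is of type Q.
x_k = {("P", "P"): 0.9, ("P", "Q"): 0.2, ("Q", "P"): 0.7, ("Q", "Q"): 0.4}
y_k = {("P", "P"): 0.1, ("P", "Q"): 0.3, ("Q", "P"): 0.5, ("Q", "Q"): 0.6}

rates = complex_rates("P", x_k, "Q", y_k)
total_rate = sum(rates.values())  # sum over both roles and both product types
print(rates)
print(total_rate)
```

In this sketch the template has no say in the outcome: whether Y is replicated (product of type Q) or transcribed (product of type P) is decided only by the entries of `x_k`.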

In the diffusion step, all substrate molecules are randomly re-distributed among protocells with probabilities proportional to the number of replicators in protocells. In other words, the model assumes that substrate diffuses extremely rapidly.
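As a rough sketch of the diffusion step, the redistribution can be written as weighted sampling. The function name, the protocell contents, and the use of Python's `random.choices` are assumptions for illustration; the authors' implementation may differ.

```python
import random

def redistribute_substrate(total_substrate, replicator_counts, rng):
    """Assign each substrate molecule to a protocell with probability
    proportional to the number of replicators in that protocell."""
    cells = list(range(len(replicator_counts)))
    substrate = [0] * len(replicator_counts)
    for cell in rng.choices(cells, weights=replicator_counts, k=total_substrate):
        substrate[cell] += 1
    return substrate

rng = random.Random(0)                # fixed seed for reproducibility
replicator_counts = [10, 30, 0, 60]   # hypothetical protocell contents
substrate = redistribute_substrate(1000, replicator_counts, rng)
print(substrate)
```

Under this reading, a protocell without replicators receives no substrate, and the total amount of substrate is conserved.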

In the cell-division step, every protocell containing more than V particles (i.e. P, Q, and S together) is divided as described in Model.

The mutation of $k^c_{pt}$ is modelled as unbiased random walks. With a probability m per replication or transcription, each $k^c_{pt}$ value of a replicator is mutated by adding a number randomly drawn from a uniform distribution on the interval $(-\delta_{mut}, \delta_{mut})$ ($\delta_{mut} = 0.05$ unless otherwise stated). The values of $k^c_{pt}$ are bounded above by $k_{max}$ with a reflecting boundary ($k_{max} = 1$ unless otherwise stated), but are not bounded below to remove the boundary effect at $k^c_{pt} = 0$. However, if $k^c_{pt} < 0$, the respective rate constant of complex formation is regarded as zero.
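The mutation rule can be sketched as follows. This is a minimal illustration, assuming that the probability m is applied independently to each value (the text leaves this open) and that values are stored in a dictionary keyed by product and template type; the function names are hypothetical.

```python
import random

K_MAX = 1.0       # upper bound with a reflecting boundary
DELTA_MUT = 0.05  # half-width of the uniform mutation step

def mutate(k, m, rng, k_max=K_MAX, delta=DELTA_MUT):
    """Return a mutated copy of the dict k of rate constants.

    Each value takes one unbiased random-walk step with probability m
    (assumed independent per value). Steps are reflected at k_max;
    there is no lower bound.
    """
    mutated = {}
    for key, value in k.items():
        if rng.random() < m:
            value += rng.uniform(-delta, delta)
            if value > k_max:
                value = 2.0 * k_max - value  # reflect at the upper boundary
        mutated[key] = value
    return mutated

def rate_constant(value):
    """A negative k value is regarded as a zero rate constant."""
    return max(0.0, value)

rng = random.Random(1)
k = {(p, t): K_MAX for p in "PQ" for t in "PQ"}
for _ in range(1000):
    k = mutate(k, 1.0, rng)  # m = 1 here only to exercise the boundary
print(k)
print(rate_constant(-0.3))
```

Because a single step is smaller than the distance to twice the bound, one reflection is enough to keep every value at or below `K_MAX`.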

We ascertained that the above specific model of mutation does not critically affect results by testing two alternative models of mutation. One model is nearly the same as the above, except that the boundary condition at $k^c_{pt} = 0$ was set to reflecting. The other model implements mutation as unbiased random walks on a logarithmic scale. The details are described in electronic supplementary material, Text 1.2.

Each simulation was run for at least $5 \times 10^7$ time steps (denoted by $t_{min}$) unless otherwise stated, where the unit of time is defined as that in which one replicator decays with probability d (thus, the average lifetime of replicators is 1/d time steps). The value of d was set to 0.02. The total number of particles in the model $N_{tot}$ was set to 50V so that the number of protocells was approximately 100 irrespective of the value of V. At the beginning of each simulation, 50 protocells of equal size were generated. The initial values of $k^c_{pt}$ were set to $k_{max}$ for every replicator unless otherwise stated. The initial frequencies of P and Q were equal, and that of S was zero.

(b). Ancestor tracking

Common ancestors of replicators were obtained in two steps. First, ancestor tracking was done at the cellular level to obtain the common ancestors of all surviving protocells. Second, ancestor tracking was done at the molecular level for the replicators contained by the common ancestors of protocells obtained in the first step. The results shown in figure 2e were obtained from the data between $2.1 \times 10^7$ and $2.17 \times 10^7$ time steps, so that the ancestor distribution was from after the completion of symmetry breaking.
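One generic way to find common ancestors is to intersect the lineages of all survivors. The sketch below assumes that each protocell (or replicator) records a single parent in a dictionary; this data structure, the function names, and the example genealogy are hypothetical and are not described in the text.

```python
def lineage(node, parent):
    """Return node followed by all its ancestors, following parent links."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parent.get(node)
    return chain

def common_ancestors(survivors, parent):
    """Return the set of individuals from which every survivor descends."""
    shared = None
    for node in survivors:
        ancestors = set(lineage(node, parent))
        shared = ancestors if shared is None else shared & ancestors
    return shared or set()

# Hypothetical protocell genealogy: child -> parent (None marks the founder).
parent = {"c1": None, "c2": "c1", "c3": "c1", "c4": "c2", "c5": "c2", "c6": "c3"}
survivors = ["c4", "c5"]          # c6 and its lineage went extinct
ancestors = common_ancestors(survivors, parent)
print(sorted(ancestors))
```

Under the same assumption, the second step would apply the same function to the replicators contained in the protocells returned by the first step.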

(c). Outline of the derivation of equations (3.1)

To derive equations (3.1), we simplified the agent-based model in two ways. First, we assumed that $k^c_{pt}$ is independent of p and t. Under this assumption, a catalyst does not distinguish the replicator types of templates (i.e. $k^c_{pt} = k^c_{pt'}$ for $t \neq t'$) and products (i.e. $k^c_{pt} = k^c_{p't}$ for $p \neq p'$). This assumption excludes the possibility of numerical symmetry breaking, but still allows catalytic and informatic symmetry breaking as described in Results.

Second, we abstracted away chemical reactions by defining $\omega^t_{ij}$ as the probability that replicator j of type t in protocell i is replicated or transcribed per unit time. Let $n^t_{ij}(\tau)$ be the population size of this replicator at time $\tau$. Then, $n^t_{ij}(\tau)$ is expected to satisfy

$$\begin{bmatrix} n^P_{ij}(\tau+1) \\ n^Q_{ij}(\tau+1) \end{bmatrix} = \begin{bmatrix} \omega^P_{ij} & \omega^Q_{ij} \\ \omega^P_{ij} & \omega^Q_{ij} \end{bmatrix} \begin{bmatrix} n^P_{ij}(\tau) \\ n^Q_{ij}(\tau) \end{bmatrix}. \qquad (5.1)$$

The fitness of the replicator can be defined as the dominant eigenvalue $\lambda_{ij}$ of the 2 × 2 matrix on the right-hand side of equation (5.1): $\lambda_{ij} = \omega^P_{ij} + \omega^Q_{ij}$. Fisher’s reproductive values of P and Q are given by the corresponding left eigenvector $u_{ij} = [\omega^P_{ij}, \omega^Q_{ij}]$.
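The eigenvalue and left eigenvector can be checked numerically. The sketch below reads the matrix in equation (5.1) row by row, so that both rows equal $[\omega^P_{ij}, \omega^Q_{ij}]$; this reading and the two numerical values are assumptions for illustration.

```python
def left_multiply(u, matrix):
    """Return the row vector u multiplied by a 2 x 2 matrix."""
    return [sum(u[r] * matrix[r][c] for r in range(2)) for c in range(2)]

omega_p, omega_q = 0.3, 0.8  # hypothetical replication/transcription probabilities
matrix = [[omega_p, omega_q],
          [omega_p, omega_q]]

lam = omega_p + omega_q      # claimed dominant eigenvalue
u = [omega_p, omega_q]       # claimed left eigenvector (reproductive values)

lhs = left_multiply(u, matrix)
rhs = [lam * x for x in u]

# The eigenvalues are the roots of x^2 - trace*x + det = 0.
trace = matrix[0][0] + matrix[1][1]
det = matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]

print(lhs, rhs)
print(trace, det)
```

Because the determinant vanishes and the trace equals $\omega^P_{ij} + \omega^Q_{ij}$, the two eigenvalues are $\lambda_{ij}$ and zero, so $\lambda_{ij}$ is dominant whenever it is positive.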

The evolutionary dynamics of the average catalytic activity of replicators can be described with Price’s equation [8,9]. Let $\kappa^c_{ij}$ be the catalytic activity of replicator j of type c in protocell i (we use $\kappa$ instead of k to distinguish $\kappa^c_{ij}$ from $k^c_{pt}$). Price’s equation states that

$$\lambda_{\tilde{i}\tilde{j}}\,\Delta\kappa^c_{\tilde{i}\tilde{j}} = \sigma^2_{\tilde{i}}[\lambda_{i\tilde{j}}, \kappa^c_{i\tilde{j}}] + E_{\tilde{i}}[\sigma^2_{i\tilde{j}}[\lambda_{ij}, \kappa^c_{ij}]], \qquad (5.2)$$

where $x_{i\tilde{j}}$, $x_{\tilde{i}\tilde{j}}$, and $E_{\tilde{i}}[x]$ are x averaged over the indices marked with tildes, $\sigma^2_{\tilde{i}}[x,y]$ is the covariance between x and y over protocells, and $\sigma^2_{i\tilde{j}}[x,y]$ is the covariance between x and y over the replicators in protocell i. One replicator is always counted as one sample in calculating all moments.
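The two-level partition in equation (5.2) can be illustrated on toy data. In the sketch below, the fitness and catalytic-activity values are invented, one replicator counts as one sample in every moment, and offspring are assumed to inherit their parent's catalytic activity unchanged; none of this comes from the authors' code.

```python
def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    """Population covariance: every replicator counts as one sample."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: each protocell is a list of (fitness, kappa) pairs.
cells = [
    [(1.0, 0.2), (0.8, 0.5), (0.6, 0.9)],
    [(1.5, 0.4), (1.1, 0.8)],
    [(0.9, 0.1), (0.7, 0.3), (0.5, 0.6), (0.4, 0.7)],
]

lam = [l for cell in cells for l, _ in cell]
kap = [k for cell in cells for _, k in cell]

# Cellular level: replace each replicator's values by its protocell's means.
lam_cell = [mean([l for l, _ in cell]) for cell in cells for _ in cell]
kap_cell = [mean([k for _, k in cell]) for cell in cells for _ in cell]
between_cells = cov(lam_cell, kap_cell)

# Molecular level: within-protocell covariance, averaged over replicators.
within_cells = sum(
    len(cell) * cov([l for l, _ in cell], [k for _, k in cell]) for cell in cells
) / len(lam)

total = cov(lam, kap)

# Change in mean kappa when offspring numbers are proportional to fitness
# and kappa is inherited unchanged (an assumption of this sketch).
selected_mean = sum(l * k for l, k in zip(lam, kap)) / sum(lam)
delta_kappa = selected_mean - mean(kap)

print(between_cells, within_cells, total)
print(mean(lam) * delta_kappa)
```

In this toy data set the two levels pull in opposite directions by construction: within each protocell, the more catalytic replicators have lower fitness.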

To approximate equation (5.2), we assumed that covariances between $\kappa^P_{ij}$ and $\kappa^Q_{ij}$ and between $\kappa^P_{i\tilde{j}}$ and $\kappa^Q_{i\tilde{j}}$ are negligible because the mutation of $\kappa^P_{ij}$ and that of $\kappa^Q_{ij}$ are uncorrelated in the agent-based model (see electronic supplementary material, Text 1.6 for an alternative justification of this assumption). Under this assumption, equation (5.2) is approximated by equations (3.1) up to the second central moments of $\kappa^c_{ij}$ and $\kappa^c_{i\tilde{j}}$, with the following notation (see electronic supplementary material, Text 1.3 for the derivation):

$$\bar{\omega}^t = \frac{\omega^t_{\tilde{i}\tilde{j}}}{\lambda_{\tilde{i}\tilde{j}}}, \quad \sigma^2_{cel} = \sigma^2_{\tilde{i}}[\kappa^c_{i\tilde{j}}, \kappa^c_{i\tilde{j}}], \quad \sigma^2_{mol} = E_{\tilde{i}}[\sigma^2_{i\tilde{j}}[\kappa^c_{ij}, \kappa^c_{ij}]], \quad \bar{k}^c = \kappa^c_{\tilde{i}\tilde{j}}, \quad \gamma^c_c = -E_{\tilde{i}}\left[\frac{\partial \ln \omega^c_{ij}}{\partial \kappa^c_{ij}}\right], \quad \beta^t_c = \frac{\partial \ln \omega^t_{i\tilde{j}}}{\partial \kappa^c_{i\tilde{j}}},$$

where $\bar{\omega}^t$ is the normalized average reproductive value of type-t replicators, $\sigma^2_{cel}$, $\sigma^2_{mol}$, and $\bar{k}^c$ are notational shorthands, $\gamma^c_c$ is an average decrease in the replication rate of a type-c replicator due to an increase in its own catalytic activity, and $\beta^t_c$ is an increase in the average replication rate of type-t replicators in a protocell due to an increase in the average catalytic activity of type-c replicators in that protocell. We assumed that $\sigma^2_{cel}$ and $\sigma^2_{mol}$ do not depend on c because no difference is a priori assumed between P and Q.

The values of $\gamma^c_c$ and $\beta^t_c$ can be interpreted as the cost and benefit of providing catalysis. Let us assume that V is so large that $\kappa^c_{i\tilde{j}}$ and $\kappa^c_{ij}$ can be regarded as mathematically independent of each other if i and j are fixed (if i and j are varied, $\kappa^c_{i\tilde{j}}$ and $\kappa^c_{ij}$ may be statistically correlated). Under this assumption, increasing $\kappa^c_{ij}$ does not increase $\kappa^c_{i\tilde{j}}$, so that $\gamma^c_c$ reflects only the cost of providing catalysis at the molecular level. Likewise, increasing $\kappa^c_{i\tilde{j}}$ does not increase $\kappa^c_{ij}$, so that $\beta^t_c$ reflects only the benefit of receiving catalysis at the cellular level. Moreover, the independence of $\kappa^c_{i\tilde{j}}$ from $\kappa^c_{ij}$ implies that $\partial \omega^{c'}_{ij} / \partial \kappa^c_{ij} = 0$ for $c \neq c'$, which permits the following interpretation: if a replicator of type c provides more catalysis, its transcripts, which are of type c′, pay no extra cost (i.e. $\gamma^{c'}_c = 0$).

(d). Outline of the phase-plane analysis

To perform the phase-plane analysis depicted in figure 3, we defined $\omega^t_{ij}$ as a specific function of $\kappa^t_{ij}$ (see above for the meaning of $\omega^t_{ij}$ and $\kappa^t_{ij}$):

$$\omega^t_{ij} = e^{\kappa^P_{i\tilde{j}} + \kappa^Q_{i\tilde{j}}}\, e^{-s\kappa^t_{ij}} \left[ e^{-s\kappa^P_{i\tilde{j}}} + e^{-s\kappa^Q_{i\tilde{j}}} \right]^{-1}, \qquad (5.3)$$

where the first factor $e^{\kappa^P_{i\tilde{j}} + \kappa^Q_{i\tilde{j}}}$ represents the cellular-level benefit of catalysis provided by the replicators in protocell i, the second factor $e^{-s\kappa^t_{ij}}$ represents the molecular-level cost of catalysis provided by the focal replicator, the last factor normalizes the cost, and s is the cost-benefit ratio. The above definition of $\omega^t_{ij}$ was chosen to satisfy the requirement that a replicator faces the trade-off between providing catalysis and serving as a template, i.e. $\gamma^t_t$ and $\beta^t_c$ are positive. Apart from this requirement, the definition was arbitrarily chosen for simplicity.
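The trade-off encoded in equation (5.3) can be checked with finite differences. The sketch below assumes that the exponents of the cost factor and of the normalizing factor carry a minus sign, as the description of a cost requires; the function name and the parameter values are hypothetical.

```python
import math

def omega(kappa_focal, cell_mean_p, cell_mean_q, s):
    """Replication/transcription probability of a focal replicator.

    kappa_focal is the focal replicator's own catalytic activity;
    cell_mean_p and cell_mean_q are the protocell averages for P and Q.
    """
    benefit = math.exp(cell_mean_p + cell_mean_q)   # cellular-level benefit
    cost = math.exp(-s * kappa_focal)               # molecular-level cost
    norm = math.exp(-s * cell_mean_p) + math.exp(-s * cell_mean_q)
    return benefit * cost / norm

s, h = 1.0, 1e-6
kappa, mean_p, mean_q = 0.5, 0.6, 0.4  # hypothetical values

# Effect of the focal replicator's own activity (protocell averages fixed).
own_slope = (math.log(omega(kappa + h, mean_p, mean_q, s))
             - math.log(omega(kappa - h, mean_p, mean_q, s))) / (2 * h)

# Effect of the protocell's average P activity (focal activity fixed).
cell_slope = (math.log(omega(kappa, mean_p + h, mean_q, s))
              - math.log(omega(kappa, mean_p - h, mean_q, s))) / (2 * h)

print(own_slope, cell_slope)
```

Under these assumptions, raising a replicator's own activity lowers its log replication rate with slope close to minus s, whereas raising the protocell average raises it.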

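Under the definition in equation (5.3), we again approximated equation (5.2) up to the second central moments of $\kappa^c_{ij}$ and $\kappa^c_{i\tilde{j}}$, obtaining the following (see electronic supplementary material, Text 1.6 for the derivation):

$$\bar{\omega}^t = \frac{e^{-s\bar{k}^t}}{e^{-s\bar{k}^P} + e^{-s\bar{k}^Q}}, \quad \gamma^c_c = s \quad \text{and} \quad \beta^t_c = 1. \qquad (5.4)$$

Equations (3.1) and (5.4) can be expressed in a compact form as

$$\begin{bmatrix} \Delta\bar{k}^P \\ \Delta\bar{k}^Q \end{bmatrix} \approx \sigma^2_{tot} \nabla [RB - (1-R)C],$$

where $\nabla = [\partial/\partial\bar{k}^P, \partial/\partial\bar{k}^Q]^T$ (T denotes transpose), $\sigma^2_{tot} = \sigma^2_{mol} + \sigma^2_{cel}$, $R = \sigma^2_{cel}/\sigma^2_{tot}$, $B = \bar{k}^P + \bar{k}^Q$ and $C = -\ln(e^{-s\bar{k}^P} + e^{-s\bar{k}^Q})$. R can be interpreted as the regression coefficient of $\kappa^c_{i\tilde{j}}$ on $\kappa^c_{ij}$ [39] and, therefore, the coefficient of genetic relatedness [40]. The potential RB − (1 − R)C can be interpreted as inclusive fitness.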
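The compact form can be explored numerically. The sketch below assumes that the cost potential is $C = -\ln(e^{-s\bar{k}^P} + e^{-s\bar{k}^Q})$, whose gradient gives the component-wise change $\sigma^2_{cel} - s\,\sigma^2_{mol}\,\bar{\omega}^c$; the parameter values and function names are invented for illustration.

```python
import math

def omega_bar(k_p, k_q, s):
    """Normalized average reproductive values of P and Q (equation 5.4)."""
    w_p, w_q = math.exp(-s * k_p), math.exp(-s * k_q)
    return w_p / (w_p + w_q), w_q / (w_p + w_q)

def potential(k_p, k_q, s, r):
    """R*B - (1 - R)*C with B = k_p + k_q and C = -ln(e^{-s k_p} + e^{-s k_q})."""
    b = k_p + k_q
    c = -math.log(math.exp(-s * k_p) + math.exp(-s * k_q))
    return r * b - (1.0 - r) * c

def delta_k(k_p, k_q, s, var_mol, var_cel):
    """Closed-form gradient of the potential, scaled by the total variance."""
    w_p, w_q = omega_bar(k_p, k_q, s)
    return var_cel - s * var_mol * w_p, var_cel - s * var_mol * w_q

def numerical_delta_k(k_p, k_q, s, var_mol, var_cel, h=1e-6):
    """Central-difference gradient of the potential, scaled the same way."""
    var_tot = var_mol + var_cel
    r = var_cel / var_tot
    d_p = (potential(k_p + h, k_q, s, r) - potential(k_p - h, k_q, s, r)) / (2 * h)
    d_q = (potential(k_p, k_q + h, s, r) - potential(k_p, k_q - h, s, r)) / (2 * h)
    return var_tot * d_p, var_tot * d_q

# Hypothetical parameters with P slightly more catalytic than Q.
s, var_mol, var_cel = 1.0, 0.01, 0.003
closed = delta_k(0.6, 0.4, s, var_mol, var_cel)
numeric = numerical_delta_k(0.6, 0.4, s, var_mol, var_cel)
print(closed)
print(numeric)
```

In this example the change in the P component exceeds that in the Q component, so an initial difference between the two catalytic activities is amplified, in line with the positive feedback described in the text.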

Supplementary Material

Supplementary Texts and Figures
rspb20191359supp1.pdf (2.4MB, pdf)
Reviewer comments

Acknowledgements

The authors thank Stuart A. West and his group, and Ulrich F Müller for discussion, Daniel J. van der Post, Austen R. D. Ganley and Anthony M. Poole for help with the manuscript and Paulien Hogeweg for inspiration. The authors wish to acknowledge the contribution of NeSI to the results of this research. New Zealand’s national compute and analytics services and team are supported by the New Zealand eScience Infrastructure (NeSI) and funded jointly by NeSI’s collaborator institutions and through the Ministry of Business, Innovation and Employment. URL http://www.nesi.org.nz

Data accessibility

C++ source code implementing the agent-based model is available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.mn257gm [41].

Authors' contributions

N.T. conceived the study, designed, implemented, and analysed the models, and wrote the paper. K.K. discussed the design, results, and implications of the study, and commented on the manuscript at all stages.

Competing interests

We declare we have no competing interests.

Funding

The authors have been supported by JSPS KAKENHI (grant nos JP17K17657 and JP17H06386). N.T. has been supported by grants from the University of Tokyo and the School of Biological Sciences, the University of Auckland.

References

  • 1. Crick F. 1970. Central dogma of molecular biology. Nature 227, 561–563. (doi:10.1038/227561a0)
  • 2. Gesteland RF, Cech T, Atkins JF (eds). 2006. The RNA World, 3rd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
  • 3. Michod RE. 1983. Population biology of the first replicators: on the origin of the genotype, phenotype and organism. Am. Zool. 23, 5–14. (doi:10.1093/icb/23.1.5)
  • 4. Buss LW. 1987. The evolution of individuality. Princeton, NJ: Princeton University Press.
  • 5. Maynard Smith J, Szathmáry E. 1995. The major transitions in evolution. Oxford, UK: W. H. Freeman/Spektrum.
  • 6. Michod RE. 1999. Darwinian dynamics: evolutionary transitions in fitness and individuality. Princeton, NJ: Princeton University Press.
  • 7. Bourke AFG. 2011. Principles of social evolution. Oxford, UK: Oxford University Press.
  • 8. Price GR. 1972. Extension of covariance selection mathematics. Ann. Hum. Genet. 35, 485–490. (doi:10.1111/ahg.1972.35.issue-4)
  • 9. Hamilton WD. 1975. Innate social aptitudes of man: an approach from evolutionary genetics. In: Biosocial anthropology (ed. R Fox), pp. 133–153. London, UK: Malaby Press.
  • 10. Takeuchi N, Hogeweg P, Kaneko K. 2017. The origin of a primordial genome through spontaneous symmetry breaking. Nat. Commun. 8, 250. (doi:10.1038/s41467-017-00243-x)
  • 11. von der Dunk S, Colizzi E, Hogeweg P. 2017. Evolutionary conflict leads to innovation: symmetry breaking in a spatial model of RNA-like replicators. Life 7, 43. (doi:10.3390/life7040043)
  • 12. Eigen M, Schuster P. 1979. The hypercycle: a principle of natural self organization. Berlin, Germany: Springer.
  • 13. Chen IA, Roberts RW, Szostak JW. 2004. The emergence of competition between model protocells. Science 305, 1474–1476. (doi:10.1126/science.1100757)
  • 14. Durand PM, Michod RE. 2010. Genomics in the light of evolutionary transitions. Evolution 64, 1533–1540. (doi:10.1111/j.1558-5646.2009.00907.x)
  • 15. Ivica NA, Obermayer B, Campbell GW, Rajamani S, Gerland U, Chen IA. 2013. The paradox of dual roles in the RNA world: resolving the conflict between stable folding and templating ability. J. Mol. Evol. 77, 55–63. (doi:10.1007/s00239-013-9584-x)
  • 16. Takeuchi N, Hogeweg P. 2007. The role of complex formation and deleterious mutations for the stability of RNA-like replicator systems. J. Mol. Evol. 65, 668–686. (doi:10.1007/s00239-007-9044-6)
  • 17. Iwasa Y, Pomiankowski A, Nee S. 1991. The evolution of costly mate preferences II. The ‘handicap’ principle. Evolution 45, 1431–1442. (doi:10.1111/j.1558-5646.1991.tb02646.x)
  • 18. Takeuchi N, Kaneko K, Hogeweg P. 2016. Evolutionarily stable disequilibrium: endless dynamics of evolution in a stationary population. Proc. R. Soc. B 283, 20153109. (doi:10.1098/rspb.2015.3109)
  • 19. Koonin EV. 2015. Why the central dogma: on the nature of the great biological exclusion principle. Biol. Direct 10, 52. (doi:10.1186/s13062-015-0084-3)
  • 20. Müller S, Appel B, Balke D, Hieronymus R, Nübel C. 2016. Thirty-five years of research into ribozymes and nucleic acid catalysis: where do we stand today? F1000Research 5, 1511. (doi:10.12688/f1000research)
  • 21. Takeuchi N, Hogeweg P, Koonin EV. 2011. On the origin of DNA genomes: evolution of the division of labor between template and catalyst in model replicator systems. PLoS Comput. Biol. 7, e1002024. (doi:10.1371/journal.pcbi.1002024)
  • 22. Powner MW, Zheng SL, Szostak JW. 2012. Multicomponent assembly of proposed DNA precursors in water. J. Am. Chem. Soc. 134, 13 889–13 895. (doi:10.1021/ja306176n)
  • 23. Poole AM, Horinouchi N, Catchpole RJ, Si D, Hibi M, Tanaka K, Ogawa J. 2014. The case for an early biological origin of DNA. J. Mol. Evol. 79, 204–212. (doi:10.1007/s00239-014-9656-6)
  • 24. Ritson DJ, Sutherland JD. 2014. Conversion of biosynthetic precursors of RNA to those of DNA by photoredox chemistry. J. Mol. Evol. 78, 245–250. (doi:10.1007/s00239-014-9617-0)
  • 25. Gavette JV, Stoop M, Hud NV, Krishnamurthy R. 2016. RNA-DNA chimeras in the context of an RNA world transition to an RNA/DNA world. Angew. Chem. Int. Ed. 55, 13 204–13 209. (doi:10.1002/anie.201607919)
  • 26. Queller DC. 1997. Cooperators since life began. Q. Rev. Biol. 72, 184–188. (doi:10.1086/419766)
  • 27. Maynard Smith J. 1979. Hypercycles and the origin of life. Nature 280, 445–446. (doi:10.1038/280445a0)
  • 28. Szathmáry E, Demeter L. 1987. Group selection of early replicators and the origin of life. J. Theor. Biol. 128, 463–486. (doi:10.1016/S0022-5193(87)80191-1)
  • 29. Maynard Smith J, Szathmáry E. 1993. The origin of chromosomes I. Selection for linkage. J. Theor. Biol. 164, 437–446. (doi:10.1006/jtbi.1993.1165)
  • 30. Czárán T, Szathmáry E. 2000. Coexistence of replicators in prebiotic evolution. In: The geometry of ecological interactions: simplifying spatial complexity (eds U Dieckman, R Law, JAJ Metz), pp. 116–134. Cambridge, UK: Cambridge University Press.
  • 31. Hogeweg P, Takeuchi N. 2003. Multilevel selection in models of prebiotic evolution: compartments and spatial self-organization. Orig. Life Evol. Biosph. 33, 375–403. (doi:10.1023/A:1025754907141)
  • 32. Takeuchi N, Hogeweg P. 2008. Evolution of complexity in RNA-like replicator systems. Biol. Direct 3, 11. (doi:10.1186/1745-6150-3-11)
  • 33. Takeuchi N, Hogeweg P. 2012. Evolutionary dynamics of RNA-like replicator systems: a bioinformatic approach to the origin of life. Phys. Life Rev. 9, 219–263. (doi:10.1016/j.plrev.2012.06.001)
  • 34. Kim YE, Higgs PG. 2016. Co-operation between polymerases and nucleotide synthetases in the RNA world. PLoS Comput. Biol. 12, e1005161. (doi:10.1371/journal.pcbi.1005161)
  • 35. Hamilton WD. 1964. The genetical evolution of social behaviour. I. J. Theor. Biol. 7, 1–16. (doi:10.1016/0022-5193(64)90038-4)
  • 36. West SA, Griffin AS, Gardner A. 2007. Social semantics: altruism, cooperation, mutualism, strong reciprocity and group selection. J. Evol. Biol. 20, 415–432. (doi:10.1111/jeb.2007.20.issue-2)
  • 37. Takeuchi N, Hogeweg P. 2009. Multilevel selection in models of prebiotic evolution II: a direct comparison of compartmentalization and spatial self-organization. PLoS Comput. Biol. 5, e1000542. (doi:10.1371/journal.pcbi.1000542)
  • 38. Frank SA. 1996. Host control of symbiont transmission: the separation of symbionts into germ and soma. Am. Nat. 148, 1113–1124. (doi:10.1086/285974)
  • 39. Rice SH. 2004. Evolutionary theory: mathematical and conceptual foundations. Sunderland, MA: Sinauer Associates.
  • 40. Hamilton WD. 1970. Selfish and spiteful behaviour in an evolutionary model. Nature 228, 1218–1220. (doi:10.1038/2281218a0)
  • 41. Takeuchi N, Kaneko K. 2019. The origin of the central dogma through conflicting multilevel selection. Dryad Digital Repository. (doi:10.5061/dryad.mn257gm)


Articles from Proceedings of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
