Scientific Reports. 2025 Apr 14;15:12806. doi: 10.1038/s41598-025-97319-2

Chiral symmetry breaking and information accumulation in pre-biological protocell evolution

Konstantin K Konstantinov 1, Alisa F Konstantinova 1
PMCID: PMC11997073  PMID: 40229319

Abstract

We study a linear evolutionary model, without any mutual inhibition, based on the two-dimensional distribution of protocells by total enantiomeric excess and by the amount of stored information that they can pass from generation to generation. We show that the evolution of such systems occurs in four distinct stages. The first stage is an exponential growth of the concentration of protocells near the point Inline graphic, which should take negligible time on a geological scale. The second stage is a diffusion-like process in both dimensions. This process can also be accompanied by a drift in the direction of increased information passed from generation to generation, provided that the appropriate linear coefficient in the information storage subspace is large enough. The third stage is a rapid symmetry breaking and formation of the species near Inline graphic value of enantiomeric excess (assuming a small positive global enantiomeric asymmetry factor). The fourth stage is a relaxation toward a global stationary point, which is a narrow peak located near Inline graphic value of enantiomeric excess and some optimal value of the amount of stored information.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-97319-2.

Keywords: Molecular evolution, Chiral symmetry breaking, Biological information storage

Subject terms: Origin of life, Evolutionary theory

Introduction

The emergence of life on Earth is a question that has captivated scientists, philosophers, and thinkers for centuries. Among the critical stages in the origins of life is the formation and evolution of protocells, the rudimentary biological compartments that preceded modern cells. While theories surrounding abiogenesis and the RNA world have garnered significant attention, the focus on protocells provides a lens through which to understand the specific transitions that must have occurred for inanimate molecules to give rise to living, self-replicating systems. This article aims to consider two highly significant facets of protocell evolution: chiral symmetry breaking and the increase in informational complexity passed down through generations. Both aspects are crucially influenced by thermodynamics, specifically by the availability of low entropy energy inflow (e.g., energy from the Sun) and the ability to dissipate extra entropy.

Chiral molecules are those that exist in non-superimposable mirror image forms, known as enantiomers. In modern biology, only one type of organic enantiomer, either the “left-handed” (L) or the “right-handed” (D) version, is overwhelmingly favored for each class of biomolecules such as amino acids and sugars. This phenomenon, known as “homochirality”, presents an intriguing puzzle: how did the early Earth, presumably awash in a racemic mixture of both L- and D-enantiomers, give rise to life that prefers one type over the other? The breaking of chiral symmetry, wherein one enantiomer is favored over its mirror image, is thought to be a crucial milestone in the transition from chemistry to biology1–13.

Related to the question of chiral symmetry breaking is the requirement for an increase in informational complexity. For life to evolve, each successive generation of protocells must not only replicate but also adapt to its environment and optimize its performance to persist. Thus, the process would involve an increased amount of information being passed from one generation to the next, allowing for the emergence of more complex structures and functions14–16. This increase in inherited complexity serves as a cornerstone for Darwinian evolution and eventually leads to the complex web of life as we know it today17.

However, neither chiral symmetry breaking nor the increase in inherited complexity can occur in a thermodynamic vacuum. Life is an inherently non-equilibrium process, requiring a constant influx of low entropy energy to maintain its ordered state14,18,19. This energy source, typically in the form of sunlight for modern biology, is crucial for driving the endothermic reactions required for life. Moreover, the ability of a system to dissipate the extra entropy it generates – by exporting it to its surroundings – is a defining characteristic of living systems and likely played a vital role in the emergence and evolution of protocells14,20–22.

This article aims to expand upon the foundational work in protocell evolution, particularly focusing on chiral symmetry breaking and increasing informational complexity as pivotal milestones. Drawing inspiration from the one-dimensional model of ref. 23, which considered the distribution of protocells by enantiomeric excess, we extend this model into a two-dimensional framework. The extended model not only accounts for distribution by enantiomeric excess but also incorporates another crucial dimension: the amount of information inherited from one generation to the next.

The move to a two-dimensional model aims to encapsulate a more comprehensive picture of the evolutionary dynamics at play in early life forms. By considering both the chiral composition and inherited information, we hope to explore the interplay between these two essential facets and their combined influence on protocell survival and evolution. This dual consideration is consistent with the broader understanding that these processes are governed by thermodynamic factors, specifically the need for low entropy energy sources and mechanisms for effective dissipation of extra entropy21,22.

While previous models have often focused on either the chiral aspects7,13 or information complexity15,16, rarely have both dimensions been considered simultaneously. This article aims to bridge that gap and offer a more synthesized view of how these critical factors co-evolved in shaping the first rudimentary biological systems on Earth.

It is also known that modern life utilizes various error correction mechanisms24, which allow replication errors to be corrected. In this work we do not take error correction into account, because it requires advanced cellular machinery that could not have existed before life.

Evolution of protocells in two-dimensional space

The model considered here is essentially the model of ref. 23, extended into a two-dimensional space of chiral composition (total enantiomeric excess of the protocell) and inherited information (the amount of information passed from generation to generation). Apart from finding the stationary points, we also wanted to solve the dynamic evolution problem because it is expected to be much more complex and richer than in one-dimensional space.

As the number of organic molecules on prebiotic Earth was very large, we start from a continuous representation of chemical systems, where the evolution of the system can be described using differential (or in this case integrodifferential) equations. This allows us to use differential and integral machinery, which, in turn, allows us to draw some interesting conclusions without solving the equations. However, once we get to time evolution, the continuous representation breaks down and produces spurious artefacts. We consider that in detail below when we discuss the solution method.

The exact chemical reactions that occurred in protocells on prebiotic Earth are not yet known, and therefore we concentrate on the mathematical aspects of protocell evolution rather than speculate on what chemical reactions could have been a driving force at that time. One of the converging ideas is that the transition from non-living matter to living matter is largely due to so-called pre-biological compartmentalization25, of which the transition from zero to just one compartment is the first crucial step. It is those, as they are called in ref. 25, “self-propagating, chemically simple compartmentalized mainly organic systems” that we are after in this research, without entering a discussion about the unknown chemical reactions of that time. From that point of view, the protocell considered here is a simple, single-compartment unit with at least some reactions inside the compartment driven by an external energy source. This single-compartment protocell requires neither symmetry breaking to occur first nor advanced replication machinery of the kind found in modern life. Rather, a set of coupled reactions that, under the energy inflow, could increase the amounts of its reagents is sufficient to start the process. A relatively simple experiment showcasing such compartmentalization was performed in 1963 (ref. 26) and repeated recently (ref. 27).

Subsequently, we treat a protocell as a black box that consumes some “food” and energy from the environment and creates another protocell. As protocells are not alive yet, we ignore any consumption of food needed to sustain a protocell between replications. A protocell is a relatively complex structure consisting of many molecules, so it needs more than one molecule of food to replicate. Nevertheless, it is convenient to express the models in protocells rather than in the number of food molecules. The rate of collision between a protocell and a “molecule” of food is proportional to the concentration of protocells and the concentration of food, and food molecules are consumed one by one, not all at once. Therefore, the reaction rate should be proportional to the first power of the concentration of food and the first power of the concentration of protocells. The fact that many molecules of food are needed to replicate a protocell simply results in a replication time proportional to the number of food molecules needed. The new protocell may have some small random mutations in comparison to the original protocell. This can be expressed as:

graphic file with name d33e279.gif 1

where Inline graphic is an amount of food “molecules”, Inline graphic is original protocell, Inline graphic is a new protocell. We shall stress that Inline graphic here is a concentration of food molecules expressed in protocells, that is the actual concentration of food molecules divided by a number of food molecules needed to replicate a protocell, say Inline graphic. We will not use this parameter Inline graphic in any of the calculations. It also seems interesting to consider a nonlinear scenario where some Inline graphic molecules of food are simultaneously needed to create a protocell:

graphic file with name d33e332.gif 2

We have run a few simulations using this equation (with the relevant changes to all other equations), and the only noticeable changes were to the first stage of evolution of the system: values of Inline graphic larger than one made that stage shorter. We will talk about that phenomenon in more detail below. Subsequently, we have used Inline graphic for all further calculations because of the better numerical stability and shorter calculation time in comparison to larger values of Inline graphic.

We further consider that the protocells can “die off” without any mutual inhibition:

graphic file with name d33e360.gif 3

where Inline graphic is some waste “molecule” and by waste “molecule” we mean that we also express waste in protocells, the same as food.

Finally, we need to close the model, and this is achieved by recycling the waste.

graphic file with name d33e375.gif 4

The alternative is to introduce a pass-through model, where the food flows in at some constant rate and the waste is discarded. It can be shown that these models are dynamically nearly equivalent up to some time-dependent transformations. However, pass-through models do not have a well-defined integral of motion, which makes them harder to simulate and makes numerical errors harder to track. In addition, the total amount of matter in a pass-through model increases linearly over time, which makes such models less stable than models with recycling. In other words, a pass-through model is substantially more likely to blow up numerically if some coefficients of the model are not chosen correctly.

Given the equations above, the changes in protocell concentration Inline graphic can be written as:

graphic file with name d33e391.gif 5

where Inline graphic is the concentration of protocells with total enantioselectivity Inline graphic and the amount of stored information Inline graphic at time Inline graphic, Inline graphic is the concentration of food molecules at time Inline graphic, Inline graphic is the rate at which protocells Inline graphic can produce protocells Inline graphic, Inline graphic is a normalization constant, so that Inline graphic, and Inline graphic is the decay rate of protocells Inline graphic. The range of enantioselectivity is naturally the interval Inline graphic with the initial value located near Inline graphic (nearly racemic mixture). The range in the information space depends on how we define the amount of information passed from generation to generation and what we mean by that information. However, we can always change the variables so that the domain in Inline graphic starts from 0 and we keep the upper boundary as some value Inline graphic. We can further rescale Inline graphic to make Inline graphic any value we want, e.g., 1, but we find it more convenient to keep it unscaled in case we want to compare models with different values of Inline graphic in the future. Consequently, the initial state of protocells is a small narrow peak near Inline graphic in Inline graphic space.

The changes in Inline graphic and Inline graphic can be written as:

graphic file with name d33e548.gif 6

where Inline graphic is the concentration of waste “molecules” at time Inline graphic and

graphic file with name d33e568.gif 7

where Inline graphic is a recycling rate.
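To make the closed cycle of food, protocells, and waste concrete, the following is a minimal sketch that drops the mutation kernel entirely and tracks a single protocell type. The names `gamma`, `delta`, and `r`, the initial values, and the explicit Euler step are assumptions chosen for illustration, not the parameters or the solver used in this work. The sketch shows that replication proportional to food times protocells, linear decay, and linear recycling together conserve the total amount of matter.

```python
def simulate_closed_cycle(food, cells, waste, gamma, delta, r, dt, steps):
    """Explicit Euler integration of a single-species closed cycle.

    food + cell -> 2 cells   (rate gamma * food * cells)
    cell -> waste            (rate delta * cells)
    waste -> food            (rate r * waste)
    All quantities are expressed in protocell units.
    """
    history = [(food, cells, waste)]
    for _ in range(steps):
        birth = gamma * food * cells * dt
        death = delta * cells * dt
        recycle = r * waste * dt
        food += recycle - birth
        cells += birth - death
        waste += death - recycle
        history.append((food, cells, waste))
    return history


# Hypothetical parameters: abundant food and a tiny initial population.
history = simulate_closed_cycle(
    food=1.0, cells=1e-6, waste=0.0,
    gamma=0.1, delta=0.01, r=0.05, dt=0.1, steps=20000,
)
total_start = sum(history[0])
total_end = sum(history[-1])
food_end, cells_end, waste_end = history[-1]
```

In this single-species sketch the food level settles near `delta / gamma`, the value at which replication exactly balances decay, while `total_start` and `total_end` agree up to rounding.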

We can first look at the stationary point of these equations. Equation (5) leads to the following condition:

graphic file with name d33e587.gif 8

where Inline graphic, Inline graphic and Inline graphic is Inline graphic normalized so that:

graphic file with name d33e619.gif 9

This is a two-dimensional Fredholm integral operator of the first kind for the kernel Inline graphic. If Inline graphic and Inline graphic subspaces are separable: Inline graphic then we can express Inline graphic and then Eq. (8) also splits into one-dimensional equations.

Kernel normalization

Before we proceed further, it is convenient to perform some transformations of the original kernel Inline graphic. First, we can extract a normalization constant Inline graphic:

graphic file with name d33e677.gif 10

which is a total production rate at a point Inline graphic, and a total normalized replication rate Inline graphic:

graphic file with name d33e697.gif 11

Then we can rewrite the kernel as:

graphic file with name d33e705.gif 12

where Inline graphic is defined as:

graphic file with name d33e718.gif 13

from which it follows that:

graphic file with name d33e725.gif 14

which means that Inline graphic is the probability that species with enantiomeric excess Inline graphic and amount of stored information Inline graphic would produce species with total enantiomeric excess Inline graphic and amount of stored information Inline graphic. As mutations are small, that probability should be a narrow peak centered near the point Inline graphic.

Assuming that mutations in Inline graphic and Inline graphic spaces are independent, and taking into account that the number of molecules and protocells was very large, we can invoke the central limit theorem. That means that we can use normal distributions to model Inline graphic as a narrow peak near Inline graphic:

graphic file with name d33e795.gif 15

where Inline graphic, Inline graphic are some small parameters (in general functions of Inline graphic and Inline graphic) defining mutation rates, Inline graphic is the error function, and the normalization coefficient follows from normalization condition Eq. (14). This probability is nearly identical to the two-dimensional normal probability distribution inside the domain: Inline graphic and has some bias at the edges of the full domain Inline graphic. This bias is irrelevant because the starting point Inline graphic is in the middle of the domain in Inline graphic space and as we are interested in the system moving away from Inline graphic some irregularities near that point at the beginning of the time evolution are insignificant.
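The role of the error function in the normalization can be illustrated with a one-dimensional sketch: a Gaussian centered at the parent value, restricted to the interval from -1 to 1 and renormalized by the probability mass that falls inside it. The function names, the width `sigma`, and the parent positions are hypothetical, and the sketch is not a reproduction of Eq. (15).

```python
import math


def truncated_gaussian(eta, eta_parent, sigma, lo=-1.0, hi=1.0):
    """Gaussian density centered at eta_parent, renormalized on [lo, hi]."""
    if eta < lo or eta > hi:
        return 0.0
    s = sigma * math.sqrt(2.0)
    # Probability mass of the untruncated Gaussian that falls inside [lo, hi].
    mass = 0.5 * (math.erf((hi - eta_parent) / s) - math.erf((lo - eta_parent) / s))
    density = math.exp(-((eta - eta_parent) ** 2) / (2.0 * sigma ** 2))
    density /= sigma * math.sqrt(2.0 * math.pi)
    return density / mass


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on n equal intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


sigma = 0.02  # hypothetical mutation width
center_norm = integrate(lambda x: truncated_gaussian(x, 0.0, sigma), -1.0, 1.0)
edge_norm = integrate(lambda x: truncated_gaussian(x, 0.99, sigma), -1.0, 1.0)
edge_mean = integrate(lambda x: x * truncated_gaussian(x, 0.99, sigma), -1.0, 1.0)
```

Both `center_norm` and `edge_norm` come out close to 1, while `edge_mean` falls below the parent value 0.99. This is the kind of edge bias mentioned above: near the boundary the offspring distribution is pushed toward the interior of the domain.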

After all these transformations, Eq. (8) becomes:

graphic file with name d33e872.gif 16

where Inline graphic. We can estimate the largest eigenvalue and the approximate location of its eigenvector under the reasonable assumption that the mutations are small. Suppose that the first eigenvector is a narrow “bump” function of some height Inline graphic with widths (standard deviations) Inline graphic and Inline graphic, centered near the point Inline graphic. We can then integrate Eq. (16) over Inline graphic and approximate the result:

graphic file with name d33e919.gif 17

where Inline graphic is some numerical factor, which depends on the actual shape of Inline graphic and:

graphic file with name d33e939.gif 18

where the first approximate transition can be made because Inline graphic is non-zero when Inline graphic is close to Inline graphic and Inline graphic is close to Inline graphic (the mutations are considered small) and Inline graphic is a slow function in the range where Inline graphic is non-zero (the decay rate for new protocells is not very different from the one of the original protocells), and the second approximation follows from the fact that we considered Inline graphic as a narrow peak near Inline graphic. This leads to an estimate:

graphic file with name d33e1001.gif 19

And since we are interested in the largest eigenvalue, that means that the point Inline graphic is where the function:

graphic file with name d33e1015.gif 20

has a global maximum on the domain: Inline graphic.

Diffusion

As the mutations are considered small, then Inline graphic is a narrow peak centered around the point Inline graphic. In this case we can perform a Taylor expansion of Inline graphic near Inline graphic when Inline graphic is substantially wider than Inline graphic:

graphic file with name d33e1070.gif 21

and substituting into Eq. (5) we can obtain:

graphic file with name d33e1080.gif 22

where:

graphic file with name d33e1087.gif 23

This means that Eq. (5) can be interpreted as time dependent convection-diffusion equation with a sink in the scenarios when the mutations are small and shape of Inline graphic is substantially wider than the width of mutation probability, though the form is slightly different from what is usually considered as convection-diffusion equation. It is important to note that both conditions: small mutation rate and abundance of species (which is a manifestation of the fact that Inline graphic is substantially wider than mutation probability) seem to hold in life and in the considered model for all values of time except some small amount of time at the beginning of the evolution of the system. The term Inline graphic is responsible for creation Inline graphic or destruction Inline graphic of protocells, vector Inline graphic is the “speed” at which the species drift away, and matrix Inline graphic determines the “diffusion”. The drift is the most interesting factor here. We can perform a Taylor expansion of Inline graphic near the point Inline graphic:

graphic file with name d33e1153.gif 24

and then consider that Inline graphic is a nearly symmetric function near that point Inline graphic inside Inline graphic. Then:

graphic file with name d33e1179.gif 25

and where:

graphic file with name d33e1186.gif 26

and we ignored higher order terms by Inline graphic and Inline graphic. That means that the drift is determined by a vector Inline graphic, which is a local gradient of Inline graphic up to some coefficient of proportionality. We also note that Inline graphic and Inline graphic with a very high precision within Inline graphic.
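The core of the diffusion approximation can be checked numerically in one dimension. Averaging a wide, smooth profile over a narrow symmetric Gaussian gives the profile itself plus half the mutation variance times its second derivative, which is a diffusion term. The sketch below uses a hypothetical Gaussian population profile and hypothetical widths, and it omits the weighting by the replication rate, which is the part that produces the drift.

```python
import math


def gaussian(x, mu, sigma):
    """Normal probability density."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def smeared(profile, x, sigma, n=4001, span=8.0):
    """Integral of profile(x') * gaussian(x - x') over x' (trapezoidal rule)."""
    lo, hi = x - span * sigma, x + span * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        xp = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * profile(xp) * gaussian(x, xp, sigma)
    return total * h


width = 0.2        # hypothetical width of the population profile
sigma_mut = 0.01   # hypothetical mutation width, much smaller than `width`
profile = lambda x: gaussian(x, 0.0, width)

x0 = 0.1
exact = smeared(profile, x0, sigma_mut)
# Second derivative of the Gaussian profile, written analytically.
second = profile(x0) * ((x0 ** 2) / width ** 4 - 1.0 / width ** 2)
diffusion_estimate = profile(x0) + 0.5 * sigma_mut ** 2 * second
```

With these values `exact` and `diffusion_estimate` agree far more closely than `exact` and `profile(x0)` do, which is the sense in which the integral operator acts as a diffusion operator when the population is much wider than the mutation probability.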

Stages of evolution with time

We note that Inline graphic must be an even function of Inline graphic due to Inline graphic symmetry and therefore all odd derivatives of Inline graphic by Inline graphic at the point Inline graphic must be zeros. Therefore, second order Taylor expansion of Inline graphic near point Inline graphic should not have terms linear in Inline graphic:

graphic file with name d33e1295.gif 27

or, if we want to keep Inline graphic in a separable form then:

graphic file with name d33e1309.gif 28

The latter form has some extra terms of order higher than two in comparison to the Taylor expansion. This only affects the evolution in diagonal directions and only when cross terms become substantial. If both quadratic terms are positive (which is the case in life, except probably near some catastrophic events, which we are not considering here), then the value of Inline graphic in separable form is larger in diagonal directions than when using the Taylor expansion. However, we have not seen the system going in diagonal directions even though we used a separable version of Inline graphic in most of our calculations.

Assuming that the mutations are small, we can consider various stages of evolution of the system. The initial state of the system is a very narrow peak near point Inline graphic in Inline graphic space. If we replace Inline graphic where Inline graphic is the Dirac delta function, then we can perform integrations in Eqs. (5–7) to obtain:

graphic file with name d33e1362.gif 29
graphic file with name d33e1368.gif 30
graphic file with name d33e1374.gif 31

Provided that initially the food is abundant, and the initial number of protocells is very small, the first stage of evolution is an exponential growth of the total number of protocells with the initial growth rate: Inline graphic. The time interval of that period is when the initial number of protocells: Inline graphic would exponentially grow to the order of Inline graphic:

graphic file with name d33e1400.gif 32

from which it follows that:

graphic file with name d33e1407.gif 33

When the food is abundant (as it is in the initial stage), then Inline graphic and so we can ignore Inline graphic in the denominator. Taking just one initial protocell and the total number of organic molecules or even all atoms on Earth as the hard limit gives an estimate that the logarithm in the numerator is naturally limited to somewhere between Inline graphic. The actual value is irrelevant as the process is very quick. The value Inline graphic is essentially a doubling time (to be more precise, the time needed to increase the population by a factor of Inline graphic), and so using even the upper (unrealistic) bound of the numerator, Inline graphic, and an unrealistically slow doubling time of protocells of, say, one year gives Inline graphic years, which is negligible on a geological time scale. This time period, Inline graphic, is the amount of time that it would take the most inefficient protocells (located near point Inline graphic in Inline graphic space) to consume nearly all food. If we look at Eq. (22), then Inline graphic can be interpreted as the period of time during which the value Inline graphic reaches 0 at point Inline graphic:

graphic file with name d33e1498.gif 34
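The order of magnitude of the first-stage duration can be reproduced with a few lines of arithmetic. The figures below are assumptions for illustration only: a single initial protocell, roughly 10^50 as the order of magnitude of the number of atoms on Earth, and a deliberately slow characteristic growth time of one year. They are not values taken from this work.

```python
import math

initial_count = 1.0          # a single initial protocell
hard_limit = 1e50            # assumed order of magnitude for the number of atoms on Earth
growth_time_years = 1.0      # assumed time for the population to grow by a factor of e

log_factor = math.log(hard_limit / initial_count)
stage_one_years = log_factor * growth_time_years
print(f"ln(N/N0) = {log_factor:.1f}, stage one lasts about {stage_one_years:.0f} years")
```

Even with these pessimistic inputs the result is on the order of a hundred years, because the logarithm compresses fifty orders of magnitude into a factor of about 115.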

Once this point in time is reached, then further evolution of the system is only possible by increasing the efficiency of protocells. The less efficient protocells then will die off whereas the more efficient protocells will continue to evolve thus making it possible to sustain their replication at smaller and smaller concentrations of food. That means that the boundary:

graphic file with name d33e1506.gif 35

should move away from point Inline graphic as the system evolves.

The second stage of evolution substantially depends on the values of the drift vector, Inline graphic. The value Inline graphic due to symmetry; however, Inline graphic does not have to be zero. If it is zero or close to zero, then the second stage of evolution is a diffusion in Inline graphic space until the time when quadratic coefficients in Inline graphic start to play a role. However, if Inline graphic is large enough, then the initial bump near Inline graphic will move in Inline graphic space faster than it diffuses. Biologically this means that in the first case the system produces a variety of species that are diverse both in total enantiomeric excess and in the amount of stored information, whereas in the second case the species form a compact “bump” that is not yet separated into left or right species and that quickly advances in the ability to pass information from generation to generation.

The third stage of evolution starts when quadratic coefficients in Eq. (22) become substantial. There is an extensive discussion in ref. 23 regarding how replication with errors helps to transition replicated structures toward homochirality. And while it is intuitively clear that passing more information from generation to generation is beneficial for the evolution of life, we will not dwell on theoretical considerations of this matter in the current work. Rather, we note that if any of the quadratic coefficients were negative, then the diffusion in that direction would have been suppressed and subsequently life, as we know it, would not exist. Therefore, both quadratic coefficients must be positive. The time evolution in that stage is complicated enough that it is not possible to deduce how the system should evolve without actually solving the dynamical problem. Before we get to that, we note the following. The drift vector, Inline graphic, is proportional to the gradient of Inline graphic at a given point. Therefore, the farther the point is from Inline graphic in any direction, the larger that gradient is and the faster the protocells at that point move in the direction of that gradient, because Inline graphic has positive quadratic coefficients in the Taylor expansion in both directions. If we note that the evolution equation resembles the convection-diffusion equation, then we can talk about the speed at which the width of the initial spike increases. It is a well-known fact that a standard diffusion equation with a constant diffusion coefficient has a Gaussian solution with the width Inline graphic where Inline graphic is the diffusion coefficient and Inline graphic is evolution time. Therefore the “speed” at which the edge of the bump moves is Inline graphic (the derivative of the width with respect to time). This can be estimated in the region where quadratic coefficients give a substantial contribution to Inline graphic (for example for large enough values of Inline graphic) as Inline graphic whereas Inline graphic. Another critical fact comes from the stochasticity of chemical evolution. In the current model, stochasticity means that there is a probability that several favorable mutations happen in a row, which would place some protocells in a region with higher amplification and, subsequently, higher drift speed. When that happens, and because the speed of the edge of the bump decreases with time, it could be possible for the protocells with favorable mutations to get separated from the rest of the pack. That is the scenario that we are most interested in when solving the dynamical evolution problem.
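For reference, the scaling invoked in this argument follows from the textbook one-dimensional diffusion equation with a constant coefficient D. The normalization of the width below is the standard convention for an initial delta peak and is an assumption about the form intended in the text.

```latex
\frac{\partial \rho}{\partial t} = D\,\frac{\partial^{2}\rho}{\partial x^{2}}, \qquad
\rho(x,t) = \frac{1}{\sqrt{4\pi D t}}\,\exp\!\left(-\frac{x^{2}}{4Dt}\right), \qquad
\sigma(t) = \sqrt{2Dt}, \qquad
\frac{d\sigma}{dt} = \sqrt{\frac{D}{2t}} .
```

The edge speed therefore decays as the inverse square root of time, whereas a drift speed that grows with distance from the starting point does not decay. This is why protocells that reach the high-amplification region can outrun the diffusive front.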

Solution of dynamical problems

The continuous representation is well suited for finding the stationary points of the considered system and for some simplified analysis. However, it breaks down when we want to consider the dynamical evolution, that is, how the system gets from the initial state to that stationary point. This is mostly due to machine noise. As the number of calculations on each step is very large, the machine noise produces some very small yet non-zero values of Inline graphic far from the initial state near Inline graphic. That machine noise is also likely to appear in the regions of Inline graphic with large values and subsequently with larger “amplification”. As a result, the machine noise will experience exponential growth, which produces artefacts where they are not supposed to be.

As chemical systems evolve in integer numbers rather than in real numbers, we could try using some version of the chemical master equation to calculate the evolution of the system in the number of molecules. Then very small real numbers cannot appear in the regions where they are not supposed to be, and so the issue of unrealistic artefacts is removed. Since we want to use a large number of molecules and protocells in our calculations, the Gillespie tau-leaping algorithm28 is well suited to such a problem. We have used a custom version of this algorithm to ensure stability near zero values. It is interesting to note that the Gillespie tau-leaping algorithm is essentially a stochastic version of a simple Euler algorithm for solving differential equations. In fact, it is straightforward to show that as the number of molecules goes to infinity, the Gillespie tau-leaping algorithm converges to the Euler algorithm. This means that the Gillespie tau-leaping algorithm should be plagued by the same issues as the Euler algorithm. In particular, it means that the Gillespie tau-leaping algorithm should undershoot for exponential growth and overshoot for exponential decay, of which the exponential decay is the most dangerous. The algorithm can easily overshoot below zero, after which all the calculations break down. This is why chemical equations are considered stiff. When solving differential equations this is resolved either by substantially reducing the time step, thus drastically increasing the solution time, or by using higher-order direct methods, or by using indirect methods, which require inverting matrices. Apart from the need to adapt indirect methods for the Gillespie tau-leaping algorithm, this is simply not suitable for the current problem because, for example, a 500 × 500 grid has 250,000 variables and using an indirect method would require inverting 250,000 × 250,000 matrices, where machine noise, calculation time, and memory requirements would make it prohibitive. Using higher-order direct methods with the tau-leaping algorithm would also require adaptation.
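The undershoot and overshoot of the Euler algorithm can be seen on the simplest possible test problem, a single exponential. The rates and step size below are hypothetical and chosen only to make the effect visible.

```python
import math


def euler(rate, y0, dt, steps):
    """Explicit Euler for dy/dt = rate * y."""
    y = y0
    for _ in range(steps):
        y += rate * y * dt
    return y


dt, steps = 0.5, 10
t = dt * steps

growth_euler = euler(+1.0, 1.0, dt, steps)   # equals (1 + 0.5)**10
growth_exact = math.exp(+1.0 * t)

decay_euler = euler(-1.0, 1.0, dt, steps)    # equals (1 - 0.5)**10
decay_exact = math.exp(-1.0 * t)

# When rate * dt is below -1, a single step already crosses zero.
one_big_step = euler(-3.0, 1.0, dt, 1)       # 1 - 1.5 = -0.5
```

Here `growth_euler` stays below `growth_exact`, `decay_euler` falls below `decay_exact`, and `one_big_step` is negative, which is the failure mode that matters for concentrations.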

The solution to this problem in the case of chemical equations is simple. We let the system overshoot into negative values but treat all negative values as exact zeros in subsequent calculations. This is the approach that we used in some previous works, and it showed results consistent with the ones obtained using more advanced but slower algorithms. Out of all the variables in the system considered in the current work, it is food that could easily become negative at some steps, especially if some of the parameters are large enough. Even though the approach described above works even in such extreme cases, we fine-tune the algorithm further by adjusting the step so as to consume all food but no more than is available at a given step. The solution is stable, and since the system has an invariant of motion, we can monitor that it is conserved exactly at each step.
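A tau-leaping step in integer counts that never goes negative and conserves the invariant exactly can be sketched as follows for a single protocell type. This is a simplified stand-in, not the custom algorithm used in this work: it caps each event count at the available supply instead of adjusting the step, it uses a normal approximation for large Poisson means, and all names and parameter values are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Poisson sample: Knuth's method for small means, rounded normal otherwise."""
    if lam <= 0.0:
        return 0
    if lam < 30.0:
        threshold, k, p = math.exp(-lam), 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        return k
    # Normal approximation, an illustrative shortcut for large means.
    return max(0, round(rng.gauss(lam, math.sqrt(lam))))


def tau_leap_step(food, cells, waste, gamma, delta, r, tau, rng):
    """One tau-leaping step with event counts capped by the available supply."""
    births = min(poisson(rng, gamma * food * cells * tau), food)
    deaths = min(poisson(rng, delta * cells * tau), cells)
    recycled = min(poisson(rng, r * waste * tau), waste)
    return (food - births + recycled,
            cells + births - deaths,
            waste + deaths - recycled)


rng = random.Random(0)
total = 10**6                    # hypothetical total count, in protocell units
state = (total - 10, 10, 0)      # food, cells, waste
conserved = True
for _ in range(2000):
    state = tau_leap_step(*state, gamma=0.1 / total, delta=0.01, r=0.05, tau=1.0, rng=rng)
    # The invariant of motion: the total count never changes and nothing is negative.
    conserved = conserved and sum(state) == total and min(state) >= 0
```

Because every event moves one unit between compartments, the total is conserved exactly in integer arithmetic, so any deviation would signal a bug and not a numerical error. Note that `gamma` is divided by `total` when switching from concentrations to counts, since the replication term is quadratic in the populations.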

Results

We used the following separable approximations of Inline graphic and Inline graphic in the current work:

graphic file with name d33e1699.gif 36
graphic file with name d33e1705.gif 37
graphic file with name d33e1711.gif 38

Parameter Inline graphic is a small linear global asymmetry factor, and our view is that it should be introduced in the decay rate Inline graphic rather than in the replication rate Inline graphic because the global asymmetry factor influences the stability of the left vs. right amino acids and sugars, which subsequently affects the lifespan of protocells. Parameter Inline graphic is an artificially introduced limiting factor to keep the stationary point off the boundary Inline graphic. If this factor is not introduced, then the stationary point in Inline graphic space lies near the point Inline graphic. Another view is that any model should have limitations, and we wanted to limit how far our model could evolve. Parameter Inline graphic is a scaling coefficient, which we introduced for convenience.

With all other parameters fixed, the step size is one of the most important parameters in the Gillespie tau-leaping algorithm. However, we can always rescale the parameters so as to set the step size to some desired value. We used a fixed step size Inline graphic and then set the other parameters as necessary. This allows us to treat Inline graphic as evolution epochs and makes it easier to interpret the results; e.g., if Inline graphic day, then Inline graphic means that the average protocell lifespan is Inline graphic days.
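The rescaling argument can be made concrete with a short sketch. It assumes that only the product of a rate and the step size enters the dynamics and that the mean lifespan under a constant decay rate is its reciprocal; the function names are hypothetical, and the decay rate 0.01 is the value listed in the parameter table below.

```python
def rescale_rates(rates, old_step, new_step=1.0):
    """Rescale rate constants so that a step of new_step reproduces
    the same per-step dynamics as old_step did with the original rates."""
    factor = old_step / new_step
    return {name: value * factor for name, value in rates.items()}


def mean_lifespan_epochs(decay_rate_per_epoch):
    """Mean lifespan, in epochs, for a constant decay rate per epoch."""
    return 1.0 / decay_rate_per_epoch


# Hypothetical usage: a decay rate of 0.1 per unit time with step 0.1
# is equivalent to a decay rate of 0.01 per epoch with unit step.
rates = rescale_rates({"decay": 0.1}, old_step=0.1)
lifespan = mean_lifespan_epochs(rates["decay"])  # about 100 epochs
```

If one epoch is interpreted as one day, the lifespan returned here is read directly in days.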

We used the following parameter values in calculations:

Parameter Value Description
Inline graphic 500 Size of the grid used in calculations. We used a Inline graphic grid.
Inline graphic 25 Upper boundary of the domain in Inline graphic space. Currently the choice is arbitrary as we can always rescale the Inline graphic variable. However, if we consider it as the natural logarithm of bits of information passed from generation to generation, then that value translates into Inline graphic bytes, which by far exceeds reasonable estimates of how much information the protocells could have passed from generation to generation.
Inline graphic Inline graphic Scaling coefficient to make it more convenient to set values of parameters in Inline graphic dimension. This choice of Inline graphic was determined by our desire to scale Inline graphic into 1 in scaled coordinate Inline graphic.
Inline graphic 0.1 The parameter Inline graphic, as determined by Eq. (10), can be used as is when using ODE evolution. The value of Inline graphic is used instead when using Gillespie tau-leaping evolution in integer numbers. This adjustment is necessary because the term with Inline graphic in the original differential Eq. (5) is Inline graphic and therefore Inline graphic has a dimension Inline graphic. Therefore, we need to account for that when transitioning to the equations expressed in the number of molecules.
Inline graphic 0.01 Decay rate of protocells at point Inline graphic.
Inline graphic 1 Recycling rate. The value of that parameter determines how much residual waste is near the stationary point.
Inline graphic 1 The value of Inline graphic at point Inline graphic.
Inline graphic Inline graphic or Inline graphic The value of Inline graphic at point Inline graphic.
Inline graphic 1 The value of Inline graphic at point Inline graphic.
Inline graphic 0.005 or 0.01 Mutation rate in Inline graphic space.
Inline graphic 0.005 or 0.01 Mutation rate in Inline graphic space.
Inline graphic Inline graphic Threshold parameter that controls which values of the probability in Eq. (15) are treated as exact zero. When Inline graphic we treat it as exact zero and remove it from the kernel function (but adjust the total probability to sum up to 1). This allows us to treat the four-dimensional kernel as a sparse array. The typical number of such non-zero points in Inline graphic is somewhere between 10 and 100 instead of Inline graphic, which results in a 1,000- to 10,000-fold speed increase of the algorithm.
Inline graphic Inline graphic or Inline graphic Global asymmetry factor. Values smaller than Inline graphic are beyond the resolution of the grid used in our calculations.
Inline graphic 1000 This is an artificial parameter that keeps the stationary point off the upper boundary in Inline graphic space and models the scenario in which a further increase of stored information results in a faster decay rate. If this parameter is not introduced, the system will run toward the Inline graphic edge of the domain, which produces results that are harder to visualize.
Inline graphic Inline graphic Total number of molecules in the system. This is an invariant, and this number does not change over the course of evolution of the system. The larger Inline graphic is, the more closely the dynamical evolution resembles the evolution of the system in real numbers.
Inline graphic Inline graphic Initial number of protocells located at point Inline graphic at a time Inline graphic.
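The thresholding of the mutation probability described in the table can be sketched as follows. This is a minimal illustration that assumes a separable Gaussian mutation probability with widths given in grid cells; the function name, the widths and the threshold value used in the usage lines are hypothetical and are not taken from Eq. (15).

```python
import math


def sparse_kernel_row(src, n, sigma_a, sigma_b, threshold):
    """Sparse transition probabilities out of one grid cell.

    src      : (a0, b0) indices of the parent cell on an n x n grid
    sigma_a,
    sigma_b  : Gaussian widths in grid cells (assumed shape of the kernel)
    threshold: probabilities below this value are treated as exact zero
    Returns a dict {(a, b): probability} that sums to 1.
    """
    a0, b0 = src

    def normalized(center, sigma):
        w = [math.exp(-0.5 * ((k - center) / sigma) ** 2) for k in range(n)]
        s = sum(w)
        return [x / s for x in w]

    pa = normalized(a0, sigma_a)
    pb = normalized(b0, sigma_b)

    kept = {}
    for a in range(n):
        if pa[a] < threshold:  # pa[a] * pb[b] <= pa[a], so the whole row is dropped
            continue
        for b in range(n):
            p = pa[a] * pb[b]
            if p >= threshold:
                kept[(a, b)] = p

    # Adjust the remaining probabilities so that they sum to 1 again.
    s = sum(kept.values())
    return {cell: p / s for cell, p in kept.items()}


# Hypothetical usage on a 500 x 500 grid.
row = sparse_kernel_row((250, 250), 500, 1.25, 1.25, 1e-3)
```

Storing only the retained entries per parent cell replaces a sum over all grid cells by a sum over a few dozen neighbors, which is the source of the speed increase quoted in the table.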

The following scenarios were chosen for this article: Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic or 1, Inline graphic or Inline graphic, and we summarized them in Table 1.

Table 1.

Parameters of the models considered in the current work.

No Model parameters Model code
1 (a) Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic d500k1e005g01a002f1E
2 (b) Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic d500k1e01g01a002f1E
3 (c) Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic d500k1e005g01a002i10f1E
4 (d) Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic d500k1e01g01a002i10f1E

The colors matching the numbers in the table above are used in the multi-line 2D figures, while the 3D figures use the (a) – (d) markings. The 2D charts show the means and widths (standard deviations) of Inline graphic over time, of which the mean in Inline graphic space, Inline graphic (Fig. 1), and the standard deviation Inline graphic (Fig. 2) are the most important. The 3D figures (distributions of Inline graphic) are shown for some chosen values of time to illustrate the most interesting effects. All values of Inline graphic in the 3D figures are in percent.

Fig. 1.

Dependence of Inline graphic on Inline graphic. Produced using Wolfram Mathematica 13.

Fig. 2.

Dependence of Inline graphic on Inline graphic. Produced using Wolfram Mathematica 13.

The first two stages of evolution (very quick exponential growth of the initial narrow bump followed by a diffusion-like spreading) were consistently observed among all runs that we performed. Figure 3 shows Inline graphic for some chosen values of time when the second stage of evolution has not completed yet. The time is Inline graphic for the case when Inline graphic and Inline graphic when Inline graphic. The width of the Gaussian solution of the diffusion equation with a constant diffusion coefficient grows as Inline graphic (Inline graphic is either Inline graphic or Inline graphic). The widths of the bumps for Inline graphic vs. Inline graphic are nearly the same in Fig. 3 even though the evolution times differ by a factor of 4, which is consistent with the estimate. There is a drift toward higher information content when Inline graphic. The widths of the bumps are approximately the same as in the case of no drift, though the drift is more significant in the case when Inline graphic than when Inline graphic. This is expected as the run time is 4x longer in the first case.
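The width estimate can be checked with a small random-walk sketch. It assumes that each epoch adds an independent Gaussian displacement with standard deviation equal to the mutation rate, so that the width after a number of epochs is the mutation rate times the square root of that number; the function name and the epoch counts are hypothetical.

```python
import random
import statistics


def walk_width(eps, steps, walkers, seed):
    """Standard deviation of the final positions of unbiased random walkers.

    eps     : standard deviation of a single mutation step (assumed Gaussian)
    steps   : number of epochs
    walkers : number of independent lineages
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(walkers):
        x = 0.0
        for _ in range(steps):
            x += rng.gauss(0.0, eps)
        finals.append(x)
    return statistics.pstdev(finals)


# Halving the mutation rate and running 4x longer gives nearly the same width.
slow = walk_width(eps=0.005, steps=400, walkers=1000, seed=1)
fast = walk_width(eps=0.01, steps=100, walkers=1000, seed=2)
```

Under this assumption both widths are close to 0.1, which mirrors the observation that the bumps for the two mutation rates have nearly the same width at times that differ by a factor of 4.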

Fig. 3.

Dependence of Inline graphic on Inline graphic and Inline graphic when the second stage of evolution (diffusion) has not completed yet. Produced using Wolfram Mathematica 13.

Figure 4 shows some points in time for the four considered scenarios where each of the corresponding diffusion-like evolution runs breaks down and experiences a rapid transformation. We will talk about that in more detail when we discuss Fig. 2. Here we would like to note that the process is essentially random, and using different random seed values will produce different intermediate shapes until the process settles down near stationary points.

Fig. 4.

Dependence of Inline graphic on Inline graphic and Inline graphic for some values of time when the system experiences rapid transformations. Produced using Wolfram Mathematica 13.

Figure 5 shows the point in time (Inline graphic) when the stationary point has been reached. The stationary point looks nearly the same for all considered variants: it is a narrow peak near Inline graphic and Inline graphic, which is a point in Inline graphic space where Inline graphic experiences a rapid increase due to a limiting factor Inline graphic. Without that limiting factor the system would just run toward Inline graphic. The stationary point for Inline graphic is a narrower peak than for Inline graphic, as expected, and the value of Inline graphic does not affect the position and the width of the peak.

Fig. 5.

Dependence of Inline graphic on Inline graphic and Inline graphic near stationary point (Inline graphic). Produced using Wolfram Mathematica 13.

We present animations of the evolution of the system for these four scenarios in the supplementary materials.

We also want to look at some cumulative characteristics of the evolution because it is hard to present changes to a 3D chart over time in a compact form. Means and standard deviations of Inline graphic in Inline graphic and Inline graphic spaces are a convenient choice. In integral representation they can be expressed as follows (note that the normalization here is different in comparison to Eq. (9)):

graphic file with name d33e2926.gif 39
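On a grid, such means and standard deviations reduce to weighted sums. The following is a minimal sketch that assumes normalization by the total number of protocells on the grid; the function and argument names are hypothetical, and the exact form of Eq. (39) is not reproduced here.

```python
import math


def grid_moments(u, eta_values, i_values):
    """Means and standard deviations of a 2D distribution on a grid.

    u[a][b]    : number (or concentration) of protocells in cell (a, b)
    eta_values : coordinate of each row along the enantiomeric-excess axis
    i_values   : coordinate of each column along the information axis
    Returns (mean_eta, std_eta, mean_i, std_i).
    """
    total = sum(sum(row) for row in u)
    m_eta = m_i = q_eta = q_i = 0.0
    for a, row in enumerate(u):
        for b, weight in enumerate(row):
            m_eta += eta_values[a] * weight
            q_eta += eta_values[a] ** 2 * weight
            m_i += i_values[b] * weight
            q_i += i_values[b] ** 2 * weight
    m_eta, m_i = m_eta / total, m_i / total
    # max(0, ...) guards against tiny negative values from rounding.
    std_eta = math.sqrt(max(0.0, q_eta / total - m_eta ** 2))
    std_i = math.sqrt(max(0.0, q_i / total - m_i ** 2))
    return m_eta, std_eta, m_i, std_i
```

For example, two equally populated cells at opposite corners of a 2 x 2 grid with coordinates (-1, 1) and (0, 1) give a zero mean and unit standard deviation along the first axis.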

Figures 1, 2, 6, and 7 show Inline graphic, Inline graphic, Inline graphic, Inline graphic for the four scenarios considered above. All figures have lines 1 through 4 corresponding to the parameters shown in Table 1 above.

Fig. 6.

Dependence of Inline graphic on Inline graphic. Produced using Wolfram Mathematica 13.

Fig. 7.

Dependence of Inline graphic on Inline graphic. Produced using Wolfram Mathematica 13.

The results presented above show that a two-dimensional model is much more complex and richer than a one-dimensional one [23]. Two differences stand out. First, as food becomes scarce, the boundary between efficient and inefficient protocells consists of just two disconnected points in the one-dimensional model, whereas it becomes a connected curve in the two-dimensional model. Those protocells that are inside the boundary “die off” from starvation because they are not efficient enough to sustain their population at a given concentration of food. The boundary extends from Inline graphic over time due to diffusion and the appearance of more efficient species. The most interesting part here is that many species are possible. Second, random fluctuations may create several favorable mutations in a row. A favorable mutation is a mutation that makes some species of protocells more efficient. That slight increase in efficiency places a small number of protocells at a location in Inline graphic space where they have a positive exponential growth factor in comparison to the boundary where the bulk of the protocells exist at that moment in time. Subsequently, these more efficient protocells can temporarily experience exponential growth. This may result in splitting of the original “bump” distribution of protocells into several non-connected groups, which we can call species. The greater the total number of molecules in the system, the more likely this is to occur.

The easiest way to illustrate the four stages of evolution of the system is to look at the concentration of food molecules. As the first stage is much faster than the other three stages, we need two charts to showcase all four stages.

Figure 8 illustrates the first stage of evolution. It shows an exponential or, more correctly, a hyperbolic-tangent-like shape. Models 1 and 2 are indistinguishable from each other during this stage (which is why the curve for model 1 is not visible in the figure). Model 3 is slightly faster, and model 4 is even faster at consuming most of the food. Models 3 and 4 have a drift coefficient, which moves protocells into the region of higher efficiency (resulting in faster food consumption). Model 4 also has a larger diffusion coefficient (larger values of Inline graphic and Inline graphic), which further accelerates some protocells into the region of higher efficiency.

Fig. 8.

Dependence of the amount of food in percent on Inline graphic during stage 1. Produced using Wolfram Mathematica 13.
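The hyperbolic-tangent-like shape of the stage 1 curve can be motivated with a simplified sketch. It assumes a single species in a closed system with no decay or recycling, so that the population follows logistic growth and the remaining food fraction is a shifted hyperbolic tangent; the function names and the numbers used in the usage lines are hypothetical.

```python
import math


def food_fraction_logistic(t, k, n0, total):
    """Remaining food fraction for dN/dt = k * N * F / total with N + F = total."""
    n = total / (1.0 + (total / n0 - 1.0) * math.exp(-k * t))
    return 1.0 - n / total


def food_fraction_tanh(t, k, n0, total):
    """The same curve written as a shifted hyperbolic tangent."""
    t0 = math.log(total / n0 - 1.0) / k  # time at which half of the food is consumed
    return 0.5 * (1.0 - math.tanh(0.5 * k * (t - t0)))


# Hypothetical usage: 1000 initial protocells among 10**9 molecules.
curve = [food_fraction_logistic(t, 1.0, 1e3, 1e9) for t in range(0, 31)]
```

The early part of this curve is indistinguishable from exponential consumption, and the saturation sets in only when a substantial fraction of the food has been used, which is why either description fits the beginning of stage 1.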

Figure 9 illustrates stages 2–4 of the evolution. Of all the models considered here, only model 1 shows four clear stages of evolution, where stages 2 and 3 appear as temporary plateaus, each followed by a sharp transition to the next stage. Model 2, which has a zero drift coefficient like model 1 but a larger diffusion coefficient, also exhibits a clear stage 2, while stages 3 and 4 are much faster and not clearly visible. Model 3 has a drift coefficient but a smaller diffusion coefficient. The drift coefficient moves the protocells into the region of higher efficiency, making the plateaus almost disappear. Finally, model 4 has both a drift coefficient and a larger diffusion coefficient, combining stages 2–4 into a single continuous slide.

Fig. 9.

Dependence of the amount of food in percent on Inline graphic during stages 2 – 4. Produced using Wolfram Mathematica 13.

Conclusions

We consider the evolution of protocells without competition in a two-dimensional space: total enantioselectivity, Inline graphic, and the amount of information passed from generation to generation, Inline graphic, and we present a simple two-dimensional model that describes such an evolution. We show that such a model is described by a system of integrodifferential equations, which is driven by a protocell replication kernel, Inline graphic, and a mortality rate, Inline graphic. We show that it is convenient to split the replication kernel into a product of several factors: a total replication rate at a point Inline graphic in Inline graphic space, Inline graphic; a total normalized replication rate, Inline graphic, which is the normalized rate at which the species with total enantioselectivity Inline graphic and the amount of stored information Inline graphic can produce any species; and the probability that species with total enantiomeric excess Inline graphic and amount of stored information Inline graphic would produce species with total enantiomeric excess Inline graphic and amount of stored information Inline graphic, Inline graphic.

We show that observation of life, as it exists, leads to a conclusion that a Taylor expansion of total normalized replication rate Inline graphic near point Inline graphic should be a function with zero linear coefficient in Inline graphic space, non-negative linear coefficient in Inline graphic space, and positive quadratic coefficients both in Inline graphic and Inline graphic spaces, provided that we limit the considerations to the terms not higher than quadratic ones.

We show that, under the reasonable assumption that the mutations from generation to generation of protocells are small (the probability Inline graphic is a narrow peak centered near Inline graphic and Inline graphic), the evolution of the system can be described in four stages.

The first stage is a very quick exponential growth of the population of the nearly racemic species near the point Inline graphic in Inline graphic space. Since this is an exponential process, it should happen in a negligible amount of time on a geological scale, and we show that the time estimate for that process is on the order of Inline graphic years for the closed model considered here.

The second stage is a relatively slow diffusion-like process, when the system spreads out in both Inline graphic and Inline graphic directions. The system at that stage can be very well approximated by a convection-diffusion equation with a drift. If the linear coefficient in the Taylor expansion of the total normalized replication rate Inline graphic near point Inline graphic in Inline graphic space is zero or small enough, then the system experiences just a diffusion (spreading out of the initial delta-function-like peak). If that linear coefficient is large enough, then the peak travels in Inline graphic space while spreading out, thus gaining information storage capacity that can be passed from generation to generation. The duration of that stage depends on a combination of factors: larger mutation rates in Inline graphic and larger quadratic coefficients in Inline graphic make it shorter, and smaller ones make it longer.
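For reference, the generic one-dimensional form of such an equation and its solution for a delta-function initial condition are written out below. This is a sketch under the assumption of constant coefficients on an unbounded domain; the symbols $\rho$, $v$, $D$, $N$ and $i_0$ are notation introduced here for illustration and are not taken from the equations of this article.

```latex
% Assumed notation: \rho(i,t) is the protocell density along the information axis,
% v is an effective drift velocity, D an effective diffusion coefficient,
% N the total population and i_0 the initial position of the peak.
\frac{\partial \rho}{\partial t}
  = -v\,\frac{\partial \rho}{\partial i}
  + D\,\frac{\partial^{2} \rho}{\partial i^{2}},
\qquad
\rho(i,t) = \frac{N}{\sqrt{4\pi D t}}
  \exp\!\left[-\frac{\left(i - i_0 - v t\right)^{2}}{4 D t}\right].
```

In this form the center of the peak moves as $i_0 + v t$ while its standard deviation grows as $\sqrt{2 D t}$, which corresponds to a peak that travels while spreading out when $v$ is large enough and to pure spreading when $v$ is zero.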

The third stage starts when the diffusion-powered widening of the initial narrow peak slows down enough in comparison to the local drift at the edges of the peak. That is the point in time when the process becomes stochastic in nature. One or several “species” could be formed, and they run away in the direction of increased efficiency. This process is quick as it is also exponential in nature, though not as quick as the first stage. It is exponential because a small number of more efficient protocells find the food abundant whereas the bulk of the protocells balance between replication and extinction. When the grid step is sufficiently small to resolve a global asymmetry factor, the system consistently produces more efficient species. As we used a grid with Inline graphic points in both Inline graphic directions, that corresponds to a global asymmetry factor Inline graphic. We performed multiple runs, and the results consistently show directed symmetry breaking in the system. Smaller values of Inline graphic may result in two species forming near Inline graphic, and the less efficient species (Inline graphic in our case) may dominate in the system for some prolonged period, after which the more efficient species may still overcome the less efficient ones. If the total amount of food in the system Inline graphic is larger, then such an event is more likely to happen. This is because for this event to occur there must be at least some protocells near Inline graphic and some large enough value of Inline graphic. However, if Inline graphic is small, then this may not statistically happen, and symmetry breaking becomes random when the grid cannot resolve the global asymmetry factor.

The fourth stage is a relaxation toward a stationary point, which is a narrow peak near Inline graphic, Inline graphic in the case of the models considered here. When Inline graphic, symmetry breaking occurs first: the system transitions toward a peak Inline graphic and some small value of Inline graphic, and then slides toward Inline graphic either as a whole bump or by forming some runaway species. When Inline graphic is large enough (we considered Inline graphic, which is large enough for the model considered here), the system first accumulates a substantial ability to pass information from generation to generation without experiencing any symmetry breaking, and symmetry breaking occurs only when the system already possesses that ability.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 5 (24.1KB, docx)
Supplementary Material 6 (26.5KB, docx)

Author contributions

A.K. and K.K. developed the model. K.K. provided coding of the model. A.K. provided structure of the manuscript and charts. A.K. and K.K. wrote the main manuscript and K.K. generated charts and animations. All authors reviewed the manuscript.

Data availability

The datasets generated during the current study are available in a “frozen” GitHub repository branch: https://github.com/kkkmail/CoreClm/tree/clm700-Fredholm-frozen-V2.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Frank, F. C. On spontaneous asymmetric synthesis. Biochim. Biophys. Acta. 11, 459–463 (1953).
  • 2. Kondepudi, D. K. & Nelson, G. W. Chiral symmetry breaking in nonequilibrium systems. Phys. Rev. Lett. 50 (14), 1023–1026 (1983).
  • 3. Steel, M. The emergence of a Self-Catalysing structure in abstract Origin-Of-Life models. Appl. Math. Lett. 13, 91–95 (2000).
  • 4. Sandars, P. G. H. A toy model for the generation of homochirality during polymerization. Orig Life Evol. Biosph. 33, 575–587 (2003).
  • 5. Plasson, R., Bersini, H. & Commeyras, A. Recycling Frank: spontaneous emergence of homochirality in noncatalytic systems. PNAS 101 (48), 16733–16738 (2004).
  • 6. Pizzarello, S. & Weber, A. L. Prebiotic amino acids as asymmetric catalysts. Science 303 (5661), 1151 (2004).
  • 7. Joyce, G. F. Directed evolution of nucleic acid enzymes. Annu. Rev. Biochem. 73, 791–836 (2004).
  • 8. Wattis, J. A. D. & Coveney, P. V. Symmetry-breaking in chiral polymerisation. Orig Life Evol. Biosph. 35, 243 (2005).
  • 9. Weissbuch, I., Leiserowitz, L. & Lahav, M. Stochastic mirror symmetry breaking via Self-Assembly, reactivity and amplification of chirality: relevance to abiotic conditions. Top. Curr. Chem. 259, 123–165 (2005).
  • 10. Brandenburg, A., Lehto, H. J. & Lehto, K. M. Homochirality in an early peptide world. Astrobiology 7 (5), 725–732 (2007).
  • 11. Gleiser, M., Thorarinson, J. & Walker, S. I. Punctuated chirality. Orig Life Evol. Biosph. 38, 499–508 (2008).
  • 12. Kafri, R., Markovitch, O. & Lancet, D. Spontaneous chiral symmetry breaking in early molecular networks. Biol. Direct. 5, 38 (2010).
  • 13. Blackmond, D. G. The origin of biological homochirality. Cold Spring Harb. Perspect. Biol. 2 (5), a002147 (2010).
  • 14. Eigen, M. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58, 465–523 (1971).
  • 15. Kauffman, S. A. Autocatalytic sets of proteins. J. Theor. Biol. 119 (1), 1–24 (1986).
  • 16. Szostak, J. W. An optimal degree of physical and chemical heterogeneity for the origin of life? Proc. of the National Academy of Science. 108 (44), 17676–17682 (2011).
  • 17. Smith, E. & Morowitz, H. J. Universality in intermediary metabolism. PNAS 101 (36), 13168–13173 (2004).
  • 18. Schrödinger, E. What Is Life? The Physical Aspect of the Living Cell (Cambridge University Press, 1944).
  • 19. England, J. L. Dissipative adaptation in driven self-assembly. Nat. Nanotechnol. 10 (11), 919–923 (2015).
  • 20. Lane, N. & Martin, W. The energetics of genome complexity. Nature 467 (7318), 929–934 (2010).
  • 21. Pross, A. What Is Life? How Chemistry Becomes Biology (Oxford University Press, 2016).
  • 22. Lanier, K. A. & Williams, L. D. The origin of life: models and data. J. Mol. Evol. 86 (4–5), 249–260 (2018).
  • 23. Konstantinov, K. K. & Konstantinova, A. F. Evolutionary approach to biological homochirality. Origins Life Evol. Biospheres. 52, 205–232 (2022).
  • 24. Robertson, M. P. & Joyce, G. F. The origins of the RNA world. Cold Spring Harb Perspect. Biol. 4 (5), a003608 (2012).
  • 25. Monnard, P.-A. & Walde, P. Current ideas about prebiological compartmentalization. Life 5, 1239–1263 (2015).
  • 26. Bahadur, K. & Ranganayaki, S. Synthesis of Jeewanu, the units capable of growth, multiplication and metabolic activity. I. Preparation of units capable of growth and division and having metabolic activity. Zentr Bakteriol Parasitenk. 117 (11), 567–574 (1964).
  • 27. Gupta, V. K. & Rai, R. K. Histochemical localisation of RNA-like material in photochemically formed self-sustaining, abiogenic supramolecular assemblies ‘jeewanu’. Int. Res. J. Sci. Eng. 1 (1), 1–4 (2013).
  • 28. Gillespie, D. T. Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 115, 1716–1733 (2001).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
