Proceedings of the National Academy of Sciences of the United States of America. 2024 Aug 12;121(34):e2315000121. doi: 10.1073/pnas.2315000121

Origins of Life: The Protein Folding Problem all over again?

Charles D Kocher a,b, Ken A Dill a,b,c,1
PMCID: PMC11348307  PMID: 39133848

Abstract

How did specific useful protein sequences arise from simpler molecules at the origin of life? This seemingly needle-in-a-haystack problem has remarkably close resemblance to the old Protein Folding Problem, for which the solution is now known from statistical physics. Based on the logic that Origins must have come only after there was an operative evolution mechanism—which selects on phenotype, not genotype—we give a perspective that proteins and their folding processes are likely to have been the primary driver of the early stages of the origin of life.

Keywords: Protein Folding Problem, origin of life, foldcats, disorder-to-order transition


No one knows how life originated, presumably on earth, about 3.5 to 3.8 billion years ago (1–3). No experiment has rediscovered it. In that breach, modeling can give some guidance. For instance, there are insightful speculations on “What type of biomolecule came first?” (4–11). This question can be narrowed further to a focus on nucleic acids vs. proteins, because those are uniquely the two types of sequence → structure → function molecules at the beating heart of biology. A popular view is that RNA came first because it can serve the dual purposes of both information storage and catalysis, leading to self-replication (5, 12, 13). Here, we summarize a recent alternative perspective that protein folding and function were among the first steps.

Why proteins, rather than RNA? In short, it follows from a different logic. The primacy of RNA has followed from supposing that self-replication is the central issue. In contrast, the primacy of proteins follows from reasoning that the central question is instead: What is the driving force toward biology? Why was there any force at all? What molecular process starts from disordered states and, through some sort of needle-in-a-haystack search through a huge sequence space (haystack), finds a few functional biomolecular sequences (needles)? We describe how the needle-in-a-haystack nature of the origin of life (OOL, or Origins) closely resembles, and can be resolved by, what we now know to be essentially the same physics that underpinned the solution to another apparent needle-in-a-haystack conundrum, the Protein Folding Problem (PFP) (14–21).

Basic Questions of Origins

How Did Evolution Begin?

According to NASA’s definition, “life is a self-sustaining chemical system capable of Darwinian Evolution” (22). The emphasis is ours; it highlights the implication that, because life cannot be defined in the absence of its adaptation dynamics, some form of that dynamics must have been operating at or before the OOL. In short, life cannot originate until it can propagate. Evolution must have had a beginning. Darwinian processes among molecules were apparently not acting before Origins approximately 3.5 billion years ago, when earth’s processes were governed purely by physics and chemistry. Before living systems could pick and choose molecules, there must have been a sustainable process for doing so. What was it? Here are some of the key questions:

How Did Polymer Sequences Come to Encode Molecular Functions?

The heart of biology is sequence-based heteropolymers (RNA, DNA, proteins) that are the cell’s catalysts, machines, and memory. Origins is sometimes expressed as finding a needle in a haystack, or as a Blind Watchmaker (23), or as monkeys on typewriters that create Shakespearean plays, because the specific sequences that give biopolymers their functions must be found from the huge space of mostly useless alternative sequences. How did order arise from disorder in polymer sequences?

What Was the Tipping Point from Degradation and Hydrolysis to Long-Term Persistence?

Prebiotic chemical reactions tend toward hydrolysis, degradation, and dilution, as expressed by the second law of thermodynamics for the approach to equilibrium. But living systems do not simply tend toward equilibrium; they are driven by resource intake. How did prebiotic molecules rise up to persistent dynamics that overcame decay forces?

What Was Fitness Before There Were Cells?

Biology drives toward self-servingness, by winning and losing in competitions for finite resources. There is no evident equivalent of simple prebiotic molecules being self-serving. What selection principle among molecules preceded the climbing of fitness landscapes observed in cells?

Two Stories of Needles-in-Haystacks

There is a remarkable similarity between the two puzzles of origins of life and the old PFP—both have apparent needle-in-a-haystack unlikelihood combinatorics and both pertain to protein sequences. The Origins problem can be regarded as a search of a large sequence space to pick out particular sequences. What is the probability that a present-day amino acid sequence, say of lysozyme, arose from random selection out of the vast sea of all possible sequences? That probability is often taken to be infinitesimal. Needle-in-haystack problems can be expressed as a landscape shaped like a golf course (Right side of Fig. 1) that is randomly searched by a ball rolling on it. Finding the hole by rolling a ball randomly would be impossibly slow.

Fig. 1.

Searches using a golf course landscape are random and slow. (Left, Top) Protein folding searches conformations to find the native structure. (Left, Bottom) Origins searches sequence space to find given functional proteins. (Right) A golf course landscape representing a completely random search process. It is shown with a single minimum; however, sometimes protein folding funnels can have multiple minima, and we expect that sequence funnels will too.

Likewise, the PFP was previously seen as a needle-in-a-haystack search (16, 17, 19–21): How does a protein molecule search its vast conformational space to find its unique native structure (Top Left of Fig. 1)? This, too, was expressed in terms of golf-course-shaped landscapes. But the solution to the physical folding problem, of the driving forces and kinetic routes, is now well known* (14–18), and it gives useful insights for the OOL. In short, needles-in-haystacks and golf courses are now seen as just incorrect conceptualizations. The problem is not one of random independent steps; the problem is to find what types of physical cooperativity cause the snowballing, or bootstrapping, of one state of small probability to another state of higher probability. Fig. 2 shows three lessons from the PFP: 1) That a physical code, based on hydrophobic (H) and polar (P) patterning, reduces the haystack by more than 100 orders of magnitude, as confirmed by experiments (26–30). 2) That the landscape is relatively funnel-shaped, not golf-course-like, because of physical cooperativities in secondary and tertiary drivers (16, 17, 31). 3) That the kinetics is local first, global later, with helices and turns forming early and tertiary structure later (32), which reduces an NP-completeness challenge to an often very fast process. In the Foldon Funnel Model, the early fast steps (helices and turns) are not stable (i.e., not downhill); they are just less unstable, continuing up a landscape of diminishing steepness until reaching a tipping point at which full stability is achieved by the native structure (33).

Fig. 2.

Three lessons from the PFP (16, 17, 31). (Left) Hydrophobic/polar patterning, not the exact amino acid sequence, determines the conformation space, making the search space smaller (26–29). (Center) Folding landscapes are funneled, representing a driven search, not a random one. (Right) When proteins fold, local structures collapse first, which then allows the global structure to form. Populations of chains with varying degrees of collapse are shown, with darker curves indicating more folded structure. Figure from ref. 33.

Finding the sequence of a particular protein, say lysozyme, by randomly searching the space of all $20^N$ sequences of length $N$ entails the infinitesimal probability of $20^{-N}$. But searching the space of particular folds and functions requires only a search of $2^N$ sequences because of the binary HP folding code (26, 27); see Fig. 2, Left. This space is vastly smaller, by a factor of $10^N$.
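As a rough worked illustration of this arithmetic, the sketch below compares the two search spaces in log10 units. The chain length N = 129 (that of hen egg-white lysozyme) and the function name are choices made here for illustration only; they are not values taken from the analysis above.

```python
import math


def log10_space_sizes(n: int) -> dict:
    """Return log10 of the search-space sizes for a chain of length n."""
    full = n * math.log10(20)  # log10(20^n): all 20-letter amino acid sequences
    hp = n * math.log10(2)     # log10(2^n): binary hydrophobic/polar patterns
    return {"full": full, "hp": hp, "reduction": full - hp}


if __name__ == "__main__":
    # N = 129 is an assumed, illustrative chain length.
    sizes = log10_space_sizes(129)
    for name, value in sizes.items():
        print(f"log10({name}) = {value:.2f}")
```

Because 20/2 = 10, the reduction is N orders of magnitude for any N, which is consistent with the "more than 100 orders of magnitude" quoted for proteins longer than about 100 residues.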

Learning from the Logic of Evolution

Insights into the beginnings of evolutionary dynamics can come from knowing how cellular Darwinian evolution (DE) currently operates (34, 35). DE is a process of mutational search, competition, and fitness ratcheting. What does today’s process tell us about the singularity 3.5 billion years ago at the beginnings of that process?

Evolution’s Grist Is Molecules Making Molecules.

To be self-sustaining and persistent, evolution is rooted in the autocatalysis (positive feedback) of replication. We call it “moms making moms.” Evolution selects on phenotypes, not genotypes. Selection cannot see a gene if it is not expressed. Selection in simple organisms is based on differential growth rates, and the main mass production that constitutes growth in cells is protein. Moreover, the makers and catalysts that produce that growth are also mainly proteins. For prebiotic origins, the equivalent autocatalytic process was “molecules making molecules.” Of course, some RNA molecules can be functional, in self-catalysis like splicing of mRNAs, in ribosomes, and others, but today’s functionality is mostly proteins.

Evolution Is Implemented by Sequence → Function Molecules.

On the whole, proteins are good at function and catalysis, while RNA is a good vehicle for information. This difference can be rationalized by their chain physics. 1) Proteins are versatile catalysts over a broad range of reactions, in part because of their 20 side-chain moieties covering the chemical spectrum, vs. the smaller repertoire of interaction types in nucleic acids (36, 37). 2) Proteins make good catalysts because they fold into single solid-like miniature stable surfaces. RNAs are stringier: they are less compact and more structurally heterogeneous because RNAs are dominated by secondary structure forces while proteins are dominated by tertiary forces (38, 39). And even those RNAs that are structured are assisted, in ribosomes, by protein–RNA interactions (40); or in tRNA molecules, by modified nucleotides (41). 3) Proteins give a more direct and unique mapping of sequence to structure because of funnel-shaped energy landscapes. While some RNA molecules do have unique folds, when taken over RNA sequence space as a whole, landscapes are generally rugged with multiple minima (42), because of multiple hydrogen bonds per base pair and because of the greater degrees of freedom around the backbone [eight in RNA vs. two in proteins (43)].

Evolution Is a “Feynman Ratchet,” Not a Copy Machine.

Evolutionary replication is not perfect autocatalysis: It is heritable variation, i.e., descent with modification. If moms made identical copies of themselves, evolution would have died out, having been too brittle in the face of environmental variations and unruliness, particularly in life’s fragile early stages (34). The winners that take all would die out when the environmental “winds” shift. Rather, evolution’s replication, through a process of search/compete/select, is of autocatalytic sets, wherein one element of the set can produce others (44–47). Evolution is a Feynman Ratchet, like a Brownian ratchet, where a random noisy input drives a directed output. Mutational searching generates much random junk that becomes selected through competition.

Evolution Is Driven by an Out-of-Equilibrium Environment.

Cells are driven by intake of resources: food, energy, and water. They are open systems, not driven by a second law tendency to equilibrium. Open systems are of two types: 1) “equilibrium,” in which the system exchanges with a bath but has no net flow either way, only fluctuations, so that it tends to equilibrium; and 2) “nonequilibrium,” where there is a net unbalanced flow between system and environment. In biology, only death is explained by (1); life must be explained by (2). Think about a TV set. Its action is not a tendency toward equilibrium until it is unplugged; as long as a TV set is plugged in, its action is persistent complex flows of electrical currents that enact its functionality. Origins too must have entailed some form of persistent environmental driver that coupled to molecule-making. What type of molecule-making could have innovated through positive feedback cooperativity?

Hypothesis: Evolution Started with Proteins

The HP Foldcat Mechanism.

Now, we discuss the link between prebiotic evolution and the protein folding process, which is encapsulated by the HP Foldcat (HPF) mechanism. The HP Foldcat mechanism is presented in detail elsewhere (35, 48); here is a summary. The “HP” stands for hydrophobic and polar, the two types of (amino acid) monomers that can get polymerized into a chain. Peptides are assumed to be continuously synthesized from these monomers through catalysis on some prebiotic catalyst, which we call the Founding Rock†; see Fig. 3. At first, the peptides are short and random. A small fraction of those chains are longer, collapsing (folding) into compact conformations with a hydrophobic core. Some folders have stable surfaces. Indicated here as having hydrophobic sticky “landing pads,” these foldcat sequences are catalysts that accelerate elongation of a client chain by bringing the next monomer to be added into juxtaposition. What follows below is first the big-picture conclusions from the model, followed by a more quantitative description of it.

Fig. 3.

Bootstrapping foldcats from stationary catalysis. The Founding Rock (green) assembles polymers from hydrophobic (red) and polar (blue) monomers. Some of these polymers fold, and some of the folders catalyze elongation of other chains on hydrophobic patches of their surface. The two foldcat structures at the bottom are from a computation showing that they are the unique native conformations of those two sequences in the two-dimensional HP lattice model. The elongated client chain shown could be a piece of any foldcat containing the sequence PHHH, contributing to the foldcat autocatalytic set.
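The two-dimensional HP lattice model mentioned in the caption can be sketched by brute-force enumeration. The snippet below is a minimal illustration, assuming the standard convention of an energy of −1 for each contact between hydrophobic beads that are lattice neighbors but not chain neighbors. The function names and the example sequence are hypothetical, and the computation behind Fig. 3 may differ in detail.

```python
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def conformations(n):
    """Yield self-avoiding walks of n >= 2 beads on the square lattice.

    The first bond is fixed to point right, which removes rotated copies.
    Mirror images are still generated.
    """
    def extend(path):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in path:  # self-avoidance
                path.append(nxt)
                yield from extend(path)
                path.pop()

    yield from extend([(0, 0), (1, 0)])


def energy(seq, path):
    """Return -1 per H-H contact between beads that are not chain neighbors."""
    index_at = {site: i for i, site in enumerate(path)}
    total = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only right and up so that each lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                total -= 1
    return total


def ground_states(seq):
    """Return (lowest energy, list of conformations with that energy)."""
    best, states = 0, []
    for path in conformations(len(seq)):
        e = energy(seq, path)
        if e < best:
            best, states = e, [path]
        elif e == best:
            states.append(path)
    return best, states


if __name__ == "__main__":
    # Hypothetical 10-mer, chosen only to exercise the code.
    best, states = ground_states("HPHPPHHPHH")
    print("lowest energy:", best, "| conformations at that energy:", len(states))
```

In this sketch a sequence would count as having a unique native conformation when its lowest-energy list holds a single shape, or two shapes that are mirror images of each other.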

This Mechanism Has Emergent Properties.

The foldcat process has emergent properties—i.e., properties that are not anticipated from simple noncooperative random short-chain synthesis alone. a) Chains grow longer. b) There are autocatalytic sets, where some sequences become preferentially populated, explaining the beginnings of sequence–structure relationships (48, 49). c) A fitness property emerges—namely the folding stability (and ultimately the catalytic effectiveness)—through which some sequences survive and win while other sequences degrade and recycle. And whereas fitness starts as simple folding stability, once this form of privileging takes hold persistently, any other factors that can stabilize proteins or their communities can also support further evolutionary change. d) Accordingly, an evolution-like process emerges, namely of sequence searching and fitness selection via competition for monomers; see also ref. 6. The emergence of autocats and function here is not dependent on a nucleic acid template.

e) Moreover, this mechanism has long-term persistence for the following reasons. 1) It is an open system, not driven by a second law tendency to equilibrium. It is driven by a nonequilibrium input of monomers and a Founding Rock that initially facilitates the otherwise nonspontaneous polymerization of these monomers. 2) At some point in the process, when chains become sufficiently good foldcats, there would be an untethering transformation, a point in time at which catalysis is mostly carried out by foldcat proteins and is no longer reliant on the Founding Rock. This would be a major evolutionary event, because no longer is Origins localized, i.e., stuck in some “small pond in Nebraska.” Now catalysts are mobile and can go anywhere; now catalysts are programmable (different proteins can catalyze different reactions or work in different conditions); and now catalysis becomes miniaturized and capturable within cells (50).

These emergent properties come from two physical cooperativities: i) Chains that are long enough have folded cores with enhanced protection against degradation, and ii) the collection of chains that help elongate other chains forms an autocatalytic set. Like a snowflake that accretes more snow, that turns into a snowball and ultimately into an avalanche, cooperativities are key to explaining how the improbable first steps rise up to dominate the macroscale. The HP Foldcat mechanism snowballs toward longer, more folded, and more catalytic molecules.

This Mechanism Is Prebiotically Plausible.

While it has not been observed directly, the HP Foldcat mechanism could plausibly have arisen on the early earth. First, amino acids and peptides have been produced in prebiotically plausible ways through terrestrial processes (9, 51–54) or from space (55–58).‡ Founding-Rock-like catalysis of peptide elongation through dehydration reactions has been demonstrated on mineral surfaces (60–62) and at air–water interfaces (sea spray) (63–65), or even by unknown extraterrestrial processes (66). Plausible nonequilibrium drivers of peptide bond formation could have been wet-dry cycles (67, 68) or hot-cold cycles (69).

Second, polymers of hydrophobic and polar monomers can fold and catalyze, even if the chains are random and/or short. Proteins are known to be driven by a binary HP code (16, 29, 30, 70, 71). Because today’s 20 amino acids are found in roughly equal hydrophobic and polar proportions in the PDB, most sequences of any sufficient chain length will collapse to compactness in water (26, 27, 72–76) and thus have cores that are protected from access to the external solvent (77, 78). While speculations suggest that early alphabets may have had fewer than 20 amino acids (79, 80), all that matters here is just a binary code. Also, short proteins are ubiquitous in biology: Humans have thousands of microproteins (81), which are proteins that are less than 100 amino acids long. Although modern microproteins may have a distinct, later evolutionary origin, they demonstrate that short proteins can be interesting and functional as required for the HP Foldcat mechanism. Many microproteins perform biological functions including catalyzing reactions (82–90).

Third, folding and catalysis are simple physical properties of HP chains. They are found in prebiotic mixtures of amino acids (79, 91–96), possibly assisted by available small molecules (97). Moreover, even just cysteine alone, a single hydrophobic amino acid, has been suggested to be both prebiotically available and capable of performing peptide ligation reactions under plausible prebiotic conditions (98). In addition, catalysis has also been observed in amyloids (99, 100). And, while today’s enzyme catalysts often utilize high levels of atomically detailed chemical and spatial specificity, simpler spatial proximity effects, as envisioned here, are capable of giving orders of magnitude speed-ups to reactions (101–103). The HP foldcat mechanism predicts that persistent generation of HP chains could lead to some longer folded chains, a fraction of which could catalyze other actions.

The Origins Problem Resembles the Folding Problem

Now, compare the origins of life problem to the PFP. In the Foldcat conception, both problems are centered on proteins—one in conformational space and the other in sequence space; one driven by equilibrium forces and the other by nonequilibrium forces. Nevertheless, both have funnel-shaped landscapes; both have dynamical epochs for how order arises from disorder; both entail cooperative interactions through which random steps lead from local to global order; and both of them solve apparent needle-in-a-haystack combinatorics problems through the protein folding code.

Both Have Funnel Landscapes.

See Fig. 4. In the end, protein folding turned out not to be a needle-in-a-haystack problem. Although it entails a search through a huge space to find the single native structure, it is not random. Energetic preferences favor compact hydrogen-bonded states with hydrophobic contacts. Steps downhill in free energy lead to more stability and facilitate additional steps. Even though individual steps are stochastic, the net result is directed. It matters not if a state is highly unlikely based on the count of other options; it only matters if one advance can lead to another, the way one snowflake can start a snowball, and an avalanche, downhill. The reason for the large width at the top (high entropy) and smooth reduction to the low-entropy native state is excluded volume (104). The denser the chain gets as it grows more compact and native-like, the fewer configurations remain available to it. Protein folding entropies are huge: $T\Delta S \approx 100$ kcal/mol for 100-mer sized proteins, about half of which is due to the backbone and half to the sidechains (105, 106). In short, the PFP was not about aimless searching, but about the accumulation of local advantages and the cooperativities of one step leading to the next (33).

Fig. 4.

Foldcats cause a funneling-like exploration toward a particular region of sequence space by leveraging local advantages. (Left) The Founding Rock makes random chains, from which stable and catalytically active ones are selected. The few discovered foldcats untether the protein synthesis process from the Founding Rock and preferentially make more of themselves. (Right) A simplified model of the foldcat mechanism, introduced in ref. 107, shows how monomers (light green) initially form random, useless chains (dark green), slowly develop folding (orange), then catalysis (light blue), which enables longer, more functional chains to be created (darker blue curves).

Understanding the basis for cooperativity is important for the origins problem, as it was for the folding problem. It is funnel-like. The Left side of Fig. 4 has a large space of random short sequences, leading to a later stage of a much smaller space of longer folders and foldcats.§ The Right side of Fig. 4 gives an example of the time-course of this funneling in terms of populations of different types of HP chains (nonfolding, folding, foldcat, of various lengths) (107). Proteins fold because conformational space is shaped like a funnel. The HPF model shows how origins, too, may result from funneling, in this case in sequence space.

Briefly, here is the model more quantitatively (107). As noted above, any origins model must explain cooperativities, i.e., a physical basis for nonrandom “snowballing.” For the two types of cooperativity embodied here, the present model is minimal insofar as it lumps together microstates into mesostates in a way that requires only 3 rate parameters after simplification. Let $M$ be the total amount of monomer, $r$ the total number of nonfoldcat chains, and $A$ the total number of foldcats. Monomer is supplied to the system at a rate $\alpha_M$ and decays at a rate $d_M M$; nonfoldcat chains are created (from monomers, by both the foldcats and the Founding Rock) with rate $\alpha_r(A, M)$ and decay at a rate $d_r r$; and foldcats decay at a rate $DA$. The specific way in which foldcats cooperatively speed up their communal formation by making their precursors from monomers, through the function $\alpha_r(A, M)$, is the feature that we want to study; it is importantly still present in this simplified model even though other details of the foldcat mechanism have not been explicitly tracked. Finally, the elongation reactions, which are catalyzed both by the Founding Rock and by the foldcats $A$, are as follows: 1) $r + M \to A$ (nonfoldcat is elongated into a foldcat), 2) $r + M \to r$ (nonfoldcat is elongated and still is not a foldcat), 3) $A + M \to A$ (foldcat is elongated and is still a foldcat), and 4) $A + M \to r$ (foldcat is elongated and is no longer a foldcat). Elongation reaction $i$ has mass-action rate constant $K_i(A)$, to which the Founding Rock and foldcats both contribute. These reactions and rates give a set of ODEs for foldcat cooperativity:

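$$
\begin{aligned}
\frac{dr}{dt} &= \alpha_r - d_r r + K_4(A)\,A M - K_1(A)\,r M, \\
\frac{dM}{dt} &= \alpha_M - \bigl[\,d_M + r\,\bigl(K_1(A) + K_2(A)\bigr) + A\,\bigl(K_3(A) + K_4(A)\bigr)\bigr]\,M, \\
\frac{dA}{dt} &= K_1(A)\,r M - K_4(A)\,A M - D A.
\end{aligned}
\tag{1}
$$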

We now reduce these to a single rate equation. By eliminating M and r through steady-state arguments, and by switching to dimensionless variables (for details, see ref. 107), we get

$$
\frac{dA}{dt} = \frac{k_1 A}{1 + k_1 A} + \frac{k_1 k_2 A^2}{(1 + k_1 A)(1 + A/A_s)} - A,
\tag{2}
$$

where $k_1$ characterizes the rate at which foldcats make new foldcats (related to $K_1(A)$ above), $k_2$ characterizes the rate at which foldcats make their precursors (related to $\alpha_r(A, M)$ above), and $A_s$ is the number of foldcats at which the latter reaction starts to saturate. The rate $k_2$ provides the cooperativity: The ability of foldcats to make their precursors allows them to accelerate their collective production nonlinearly. Specifically, when the number of foldcats is small, i.e., in the prebiotic stage, $A \to 0$ and we find

$$
\frac{dA}{dt} \approx k_1\,(1 + k_2 A - k_1 A)\,A - A.
\tag{3}
$$
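The small-$A$ limit can be checked with a short expansion. This is a sketch that assumes $k_1 A \ll 1$ and $A \ll A_s$ and keeps terms through second order in $A$; the fuller argument is in ref. 107.

```latex
\begin{aligned}
\frac{k_1 A}{1 + k_1 A}
  &\approx k_1 A\,(1 - k_1 A) = k_1 A - k_1^2 A^2, \\
\frac{k_1 k_2 A^2}{(1 + k_1 A)(1 + A/A_s)}
  &\approx k_1 k_2 A^2, \\
\frac{dA}{dt}
  &\approx k_1 A + k_1 k_2 A^2 - k_1^2 A^2 - A
   = k_1\,(1 + k_2 A - k_1 A)\,A - A .
\end{aligned}
```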

The two terms in Eq. 3 give growth and decay rates, respectively. For a noncooperative autocatalyst, the growth rate is $gA$, where $g$ is a constant. In this case, the net rate is $(g - 1)A$, and $g - 1$ is either positive or negative for all values of $A$, meaning that the foldcats either grow or decay depending only on the constant value of $g$. Cooperativity occurs when $g$ is itself a function of $A$, such as $g = k_1(1 + k_2 A - k_1 A)$. When $k_2 < k_1$, the cooperativity is negative and inhibits further growth. However, when $k_2 > k_1$, the cooperativity is positive and can encourage further growth. Now, the sign of $dA/dt$ also depends on the value of $A$, because $g$ itself depends on the value of $A$. Instead of having only-growth or only-decay behavior as in the noncooperative case, the population of foldcats can have a bistable behavior: When $A$ is small, $dA/dt < 0$, but when $A$ increases, eventually, $dA/dt > 0$. Positive cooperativity allows for an initially unfavorable environment (small $A$, where $dA/dt < 0$) to be overcome by fluctuating to a higher population where the growth rate is positive (107).
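The sign change described here can be seen numerically. The sketch below evaluates the right-hand side of Eq. 2, $dA/dt = k_1 A/(1 + k_1 A) + k_1 k_2 A^2/[(1 + k_1 A)(1 + A/A_s)] - A$, and integrates it with a simple forward-Euler step from two starting populations. The parameter values ($k_1 = 0.5$, $k_2 = 10$, $A_s = 10$), the step size, and the function names are arbitrary illustrative assumptions, not values from ref. 107.

```python
def dA_dt(A, k1, k2, A_s):
    """Right-hand side of Eq. 2 in dimensionless variables."""
    direct = k1 * A / (1.0 + k1 * A)  # foldcats making foldcats
    cooperative = k1 * k2 * A**2 / ((1.0 + k1 * A) * (1.0 + A / A_s))
    return direct + cooperative - A   # last term is decay


def integrate(A0, k1, k2, A_s, t_end=100.0, dt=0.01):
    """Forward-Euler integration of Eq. 2 from A(0) = A0; returns A(t_end)."""
    A = A0
    for _ in range(int(round(t_end / dt))):
        A = max(A + dt * dA_dt(A, k1, k2, A_s), 0.0)  # populations stay >= 0
    return A


if __name__ == "__main__":
    # Assumed, illustrative parameters with k2 > k1 (positive cooperativity).
    params = dict(k1=0.5, k2=10.0, A_s=10.0)
    for A0 in (0.05, 0.2):
        rate = dA_dt(A0, **params)
        final = integrate(A0, **params)
        print(f"A0={A0}: initial dA/dt={rate:+.4f}, A(t_end)={final:.4f}")
```

With these assumed values, the population started at 0.05 decays toward zero, while the one started at 0.2 lies above the threshold and grows until it settles near a finite steady state. That is the bistable behavior described in the text.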

Disorder to Order: From Many to One.

Because foldcats can exhibit positive cooperativity, they can transition from a world where degradation dominates, with only random short chains, to a world of persistent growth, with longer-chain folders and foldcats that propagate stably with evolution-like dynamics. Other studies explore similar disorder-to-order transitions of other Origins models (108–111).

Fig. 5 shows the kinetic phase diagram of this model Eq. 2. In the red region, foldcats just die off and cannot survive. The environment is too unfavorable; it produces only short random chains, the disordered state. In the green region, foldcats grow deterministically into a persistent population because both growth rate parameters are favorable. In the yellow region, the environment is unfavorable, according to the mass-action dynamics of Eq. 2, but stochastic fluctuations can drive the system over the barrier from disorder to order. This stochastic behavior is because of the nonlinearity (cooperativity) in Eq. 3: there are two stable steady-states, one with A=0 (disordered) and one with A>0 (ordered), which are separated by a “kinetic barrier” that must be hopped over.
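One way to locate the three regions is from the steady states of Eq. 2. Setting $dA/dt = 0$ and dividing out the root $A = 0$ leaves a quadratic in $A$, and $A = 0$ is linearly stable when $k_1 < 1$. The sketch below classifies a parameter set on that basis. It is only our reading of how the regions arise from Eq. 2; the function names, labels, and test values are assumptions, and the boundaries drawn in Fig. 5 may be defined differently.

```python
import math


def positive_fixed_points(k1, k2, A_s):
    """Positive roots of dA/dt = 0 for Eq. 2 (A = 0 is always a root).

    Clearing denominators in
        k1/(1 + k1*A) + k1*k2*A/((1 + k1*A)*(1 + A/A_s)) = 1
    gives a*A**2 + b*A + c = 0 with the coefficients below.
    """
    a = k1 / A_s
    b = 1.0 / A_s + k1 - k1 / A_s - k1 * k2
    c = 1.0 - k1
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    candidates = ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))
    return sorted(x for x in candidates if x > 0)


def regime(k1, k2, A_s):
    """Classify a parameter set; exact boundary cases are not treated specially."""
    if k1 > 1.0:
        return "growth"    # A = 0 is unstable: deterministic growth (green)
    if len(positive_fixed_points(k1, k2, A_s)) == 2:
        return "bistable"  # threshold plus ordered state (yellow)
    return "decay"         # only A = 0 survives (red)


if __name__ == "__main__":
    # Assumed, illustrative parameter sets.
    for k1, k2 in ((0.5, 0.2), (0.5, 10.0), (2.0, 0.2)):
        print(k1, k2, regime(k1, k2, A_s=10.0))
```

In the bistable case, the smaller positive root plays the role of the “kinetic barrier”: populations below it decay, and populations above it grow toward the larger root.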

Fig. 5.

The dynamical phase diagram of foldcat origins, from Eq. 2. Increasing the cooperative reproduction rate $k_2$ above the noncooperative reproduction rate $k_1$ allows for a bistable region where a stochastic jump can take foldcats from a disordered state (mostly small useless chains) to an ordered state (folders, foldcats, long chains enriched).

The goal of this modeling is to ask whether foldcat cooperativities admit of any possible window of viable parameters that could lead to a persistent evolutionary process toward further complexity and biology. It is only a minimal model, and of necessity neglects many things, including repurposing, other forms of noncatalytic functionalities, and/or protein assemblies. The main conclusion here is that the yellow region is a viable tipping point route from prebiotic degradation to kinetic persistence of foldcats in an unfavorable initial environment. As described below, it leads to an evolutionary funnel in sequence space.

From Disorder to Order.

Models of protein folding show epochs of steps that follow a hierarchy of local first, global later; see the Right side of Fig. 2. Microscopically, the steps are stochastic. But “mindless” local advantages add up to globally optimal and ordered structures. First to appear are local interactions in helices and turns; later are nonlocal helix–helix interactions. In folding, most early steps are undirected and unproductive, but ultimately the native state (ordered) arises from the denatured states (disordered). The foldcat mechanism of origins reflects similar hierarchical epochs, as shown on the Right side of Fig. 4. In the foldcat model, first come short nonfunctional molecules, followed by systematically longer and functional molecules. In both evolution and folding processes, small incremental advantages are found among a sea of options, and then further advantages accumulate, leading ultimately to a greater global advantage.

The Rest of the Story: From Evolution to Origins

Our foldcat mechanism is not a full story of Origins. It is missing major components of life, including cell encapsulation, nucleic acids, lineages and inheritance through a genetic code, and the complex biochemical pathways needed to implement them. What is the fitness ratchet that is preserving value among prebiotic molecules? Here, it is simply persistence, i.e., the folding stability of a chain—longer chains are more stably folded, so they persist longer in a fluctuating environment. Once this simple evolutionary dynamics is stable, this machinery can then further discover other forms of persistence and fitness. Various such discoveries have been proposed: selection of amino acid type (79, 91, 95, 112); use of an energy currency such as ATP (109); a better protein chain elongator [ribosome, (113)]; or other features (114).

On the one hand, the present model assigns primacy to proteins insofar as proteins alone are the minimal system that can explain arguably the first step—which is an evolution-like process toward origins—without requiring any other biomolecules. On the other hand, that does not have temporal implications about other components. It does not mean that other molecules, including RNA, were not present concurrently or even undergoing interesting dynamics themselves.

Fig. 6 shows a potential-energy-like diagram with two steps to Origins. First are evolutionary dynamics such as the foldcat mechanism; second are the further ingredients we just listed. Before the advent of lineages—i.e., nucleic acids and cells—persistence is only on the time scales of molecule processes. The advent of lineages gives extraordinary extension of the time scale of persistence, all the better for handling larger environmental unruliness. This model brings the perspective that prebiotic chemistry was not “aiming to become biology,” but that it ratcheted up chemical persistence and biomolecules were the best way to achieve it.

Fig. 6.

The potential-energy-like valleys (stable states) on the path to cellular life, starting with foldcats. Foldcats represent a stable, persistent state, from which evolutionary dynamics can jump to the next minimum closer to biological life.

Conclusions

The origins of life must have been preceded by a stable evolution-like propagation mechanism. We review how evolution could arise from a random generator of peptide sequences that could ultimately function and catalyze reactions. Two types of cooperativity in proteins, one in their assembly and one in the reduction of their degradation rates, lead to the emergence of longer chains, autocatalytic sets, increasing persistence and function through a narrowing sequence space funnel, and a tipping point from disorder to order. This origins process in sequence space resembles, and originates partly from, the folding process in conformational space.

Acknowledgments

We are grateful to the Laufer Center for Physical and Quantitative Biology at Stony Brook and to the John Templeton Foundation for financial support (Grant ID 62564).

Author contributions

C.D.K. and K.A.D. designed research; performed research; and wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission. U.S. is a guest editor invited by the Editorial Board.

*The physical folding problem is different from the computational protein-structure-prediction problem, for which machine learning methods are now used (24, 25).

†Simply as a shorthand notation for some possibly macroscopic mineral surface, volcanic vent, or air–water interface, constrained to be located at particular spatial location(s).

‡In contrast, the production of RNA under prebiotic conditions has been more challenging (9, 59).

§In simplest approximation, the walls of this funnel are linear on a log scale because funneling follows $20^{N-m}$, where N is the total target chain length and m is the particular sequence length.
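
The funnel footnote above can be checked numerically. The following is a minimal sketch, assuming the funnel expression is read as 20^(N−m), i.e., the number of sequences still open once m of the N positions are specified over a 20-letter alphabet. The function name, the variable names, and the value N = 50 are illustrative assumptions, not taken from the text.

```python
import math


def log10_remaining_sequences(total_length: int, fixed_length: int,
                              alphabet: int = 20) -> float:
    """Return log10 of alphabet**(total_length - fixed_length).

    Assumed reading of the footnote: with fixed_length of the
    total_length positions already specified, this many sequences remain.
    """
    if not 0 <= fixed_length <= total_length:
        raise ValueError("fixed_length must lie between 0 and total_length")
    return (total_length - fixed_length) * math.log10(alphabet)


N = 50  # illustrative target chain length (an assumption, not from the text)

# Funnel width on a log scale for every partial sequence length m = 0..N.
widths = [log10_remaining_sequences(N, m) for m in range(N + 1)]

# Change in log-width for each added residue; a constant step means the
# funnel wall is a straight line on a log scale.
steps = [widths[m + 1] - widths[m] for m in range(N)]

print(f"log10 width at m = 0: {widths[0]:.2f}")
print(f"log10 width at m = N: {widths[-1]:.2f}")
print(f"step per added residue: {steps[0]:.4f}")
```

Under this reading, each added residue narrows the funnel by the same factor of 20, which is a constant step of log10(20), about 1.3 decades, on a log scale. That is what makes the walls linear.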

Data, Materials, and Software Availability

There are no data underlying this work.

References

  • 1.Schopf J. W., Kudryavtsev A. B., Czaja A. D., Tripathi A. B., Evidence of Archean life: Stromatolites and microfossils. Precamb. Res. 158, 141–155 (2007). [Google Scholar]
  • 2.Ohtomo Y., Kakegawa T., Ishida A., Nagase T., Rosing M. T., Evidence for biogenic graphite in early Archaean Isua metasedimentary rocks. Nat. Geosci. 7, 25–28 (2014). [Google Scholar]
  • 3.Dodd M. S., et al., Evidence for early life in earth’s oldest hydrothermal vent precipitates. Nature 543, 60–64 (2017). [DOI] [PubMed] [Google Scholar]
  • 4.Wächtershäuser G., Before enzymes and templates: Theory of surface metabolism. Microbiol. Rev. 52, 452–484 (1988). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Gilbert W., Origin of life: The RNA world. Nature 319, 618 (1986). [Google Scholar]
  • 6.Joyce G. F., Szostak J. W., Protocells and RNA self-replication. Cold Spring Harb. Perspect. Biol. 10, a034801 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.de Duve C., The beginnings of life on earth. Am. Sci. 83, 428–437 (1995). [Google Scholar]
  • 8.Segré D., Ben-Eli D., Deamer D. W., Lancet D., The lipid world. Origins Life Evol. Biosph. 31, 119–145 (2001). [DOI] [PubMed] [Google Scholar]
  • 9.Fried S. D., Fujishima K., Makarov M., Cherepashuk I., Hlouchova K., Peptides before and during the nucleotide world: An origins story emphasizing cooperation between proteins and nucleic acids. J. R. Soc. Interface 19, 20210641 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Maury C. P. J., Self-propagating β-sheet polypeptide structures as prebiotic informational molecular entities: The amyloid world. Origins Life Evol. Biosph. 39, 141–150 (2009). [DOI] [PubMed] [Google Scholar]
  • 11.Maury C. P. J., Origin of life. Primordial genetics: Information transfer in a pre-RNA world based on self-replicating beta-sheet amyloid conformers. J. Theor. Biol. 382, 292–297 (2015). [DOI] [PubMed] [Google Scholar]
  • 12.Robertson M. P., Joyce G. F., The origins of the RNA world. Cold Spring Harb. Perspect. Biol. 4, a003608 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Atkins J. F., Gesteland R. F., Cech T., RNA Worlds: From Life’s Origins to Diversity in Gene Regulation (Cold Spring Harbor Laboratory Press, 2011). [Google Scholar]
  • 14.Dill K. A., Ozkan S. B., Weikl T. R., Chodera J. D., Voelz V. A., The protein folding problem: When will it be solved? Curr. Opin. Struct. Biol. 17, 342–346 (2007). [DOI] [PubMed] [Google Scholar]
  • 15.Dill K. A., Ozkan S. B., Shell M. S., Weikl T. R., The protein folding problem. Annu. Rev. Biophys. 37, 289–316 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Dill K. A., MacCallum J. L., The protein-folding problem, 50 years on. Science 338, 1042–1046 (2012). [DOI] [PubMed] [Google Scholar]
  • 17.Nassar R., Dignon G. L., Razban R. M., Dill K. A., The protein folding problem: The role of theory. J. Mol. Biol. 433, 167126 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Onuchic J. N., Luthey-Schulten Z., Wolynes P. G., Theory of protein folding: The energy landscape perspective. Annu. Rev. Phys. Chem. 48, 545–600 (1997). [DOI] [PubMed] [Google Scholar]
  • 19.Wolynes P. G., Onuchic J. N., Thirumalai D., Navigating the folding routes. Science 267, 1619–1620 (1995). [DOI] [PubMed] [Google Scholar]
  • 20.Finkelstein A. V., Bogatyreva N. S., Ivankov D. N., Garbuzynskiy S. O., Protein folding problem: Enigma, paradox, solution. Biophys. Rev. 14, 1255–1272 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Thirumalai D., O’Brien E. P., Morrison G., Hyeon C., Theoretical perspectives on protein folding. Annu. Rev. Biophys. 39, 159–183 (2010). [DOI] [PubMed] [Google Scholar]
  • 22.Joyce G. F., Deamer D. W., Fleischaker G., Foreword to Origins of Life: The Central Concepts in Origins of Life: The Central Concepts (Jones and Bartlett Publishers, 1994). [Google Scholar]
  • 23.Dawkins R., The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe Without Design (W. W. Norton & Company, 1996). [Google Scholar]
  • 24.Jumper J., et al., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Madani A., et al., Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1–8 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Lau K. F., Dill K. A., A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules 22, 3986–3997 (1989). [Google Scholar]
  • 27.Lau K. F., Dill K. A., Theory for protein mutability and biogenesis. Proc. Natl. Acad. Sci. U.S.A. 87, 638–642 (1990). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Bowie J. U., Reidhaar-Olson J. F., Lim W. A., Sauer R. T., Deciphering the message in protein sequences: Tolerance to amino acid substitutions. Science 247, 1306–1310 (1990). [DOI] [PubMed] [Google Scholar]
  • 29.Kamtekar S., Schiffer J. M., Xiong H., Babik J. M., Hecht M. H., Protein design by binary patterning of polar and nonpolar amino acids. Science 262, 1680–1685 (1993). [DOI] [PubMed] [Google Scholar]
  • 30.Lim W. A., Sauer R. T., Alternative packing arrangements in the hydrophobic core of λ repressor. Nature 339, 31–36 (1989). [DOI] [PubMed] [Google Scholar]
  • 31.Dill K., Jernigan R., Bahar I., Protein Actions: Principles and Modeling (CRC Press, 2017). [Google Scholar]
  • 32.Englander S. W., Mayne L., Kan Z. Y., Hu W., Protein folding-how and why: By hydrogen exchange, fragment separation, and mass spectrometry. Annu. Rev. Biophys. 45, 135–152 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Rollins G. C., Dill K. A., General mechanism of two-state protein folding kinetics. J. Am. Chem. Soc. 136, 11420–11427 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Kocher C. D., Dill K. A., Darwinian evolution as a dynamical principle. Proc. Natl. Acad. Sci. U.S.A. 120, e2218390120 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Kocher C., Dill K. A., Origins of life: First came evolutionary dynamics. QRB Discov. 4, e4 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Plaxco K. W., Gross M., Astrobiology: An Introduction (JHU Press, 2021). [Google Scholar]
  • 37.Narlikar G. J., Herschlag D., Mechanistic aspects of enzymatic catalysis: Lessons from comparison of RNA and protein enzymes. Annu. Rev. Biochem. 66, 19–59 (1997). [DOI] [PubMed] [Google Scholar]
  • 38.Cruz J. A., Westhof E., The dynamic landscapes of RNA architecture. Cell 136, 604–609 (2009). [DOI] [PubMed] [Google Scholar]
  • 39.Butcher S. E., Pyle A. M., The molecular interactions that stabilize RNA tertiary structure: RNA motifs, patterns, and networks. Acc. Chem. Res. 44, 1302–1311 (2011). [DOI] [PubMed] [Google Scholar]
  • 40.Klein D., Moore P., Steitz T., The roles of ribosomal proteins in the structure assembly, and evolution of the large ribosomal subunit. J. Mol. Biol. 340, 141–177 (2004). [DOI] [PubMed] [Google Scholar]
  • 41.Biedenbänder T., et al., RNA modifications stabilize the tertiary structure of tRNAfMet by locally increasing conformational dynamics. Nucl. Acids Res. 50, 2334–2349 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Chen S. J., Dill K. A., RNA folding energy landscapes. Proc. Natl. Acad. Sci. U.S.A. 97, 646–651 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Keating K. S., Humphris E. L., Pyle A. M., A new way to see RNA. Q. Rev. Biophys. 44, 433–466 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Hordijk W., A history of autocatalytic sets. Biol. Theory 14, 224–246 (2019). [Google Scholar]
  • 45.Kauffman S. A., Cellular homeostasis, epigenesis and replication in randomly aggregated macromolecular systems. J. Cybernet. 1, 71–96 (1971). [Google Scholar]
  • 46.Kauffman S. A., The Origins of Order: Self-organization and Selection in Evolution (Oxford University Press, 1993). [Google Scholar]
  • 47.Hordijk W., Steel M., Conditions for evolvability of autocatalytic sets: A formal example and analysis. Origins Life Evol. Biosph. 44, 111–124 (2014). [DOI] [PubMed] [Google Scholar]
  • 48.Guseva E., Zuckermann R. N., Dill K. A., Foldamer hypothesis for the growth and sequence differentiation of prebiotic polymers. Proc. Natl. Acad. Sci. U.S.A. 114, E7460–E7468 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Farquharson T., Agozzino L., Dill K., The bootstrap model of prebiotic networks of proteins and nucleic acids. Life 12, 724 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Kocher C., Agozzino L., Dill K., Nanoscale catalyst chemotaxis can drive the assembly of functional pathways. J. Phys. Chem. B 125, 8781–8786 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Miller S. L., A production of amino acids under possible primitive earth conditions. Science 117, 528–529 (1953). [DOI] [PubMed] [Google Scholar]
  • 52.Miller S. L., Urey H. C., Organic compound synthesis on the primitive earth. Science 130, 245–251 (1959). [DOI] [PubMed] [Google Scholar]
  • 53.Parker E. T., et al., Primordial synthesis of amines and amino acids in a 1958 Miller H2S-rich spark discharge experiment. Proc. Natl. Acad. Sci. U.S.A. 108, 5526–5531 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Johnson A. P., et al., The Miller volcanic spark discharge experiment. Science 322, 404 (2008). [DOI] [PubMed] [Google Scholar]
  • 55.Cronin J. R., Pizzarello S., Amino acids in meteorites. Adv. Space Res. 3, 5–18 (1983). [DOI] [PubMed] [Google Scholar]
  • 56.Glavin D. P., et al., Extraterrestrial amino acids and L-enantiomeric excesses in the CM2 carbonaceous chondrites Aguas Zarcas and Murchison. Meteorit. Planet. Sci. 56, 148–173 (2021). [Google Scholar]
  • 57.Kebukawa Y., Asano S., Tani A., Yoda I., Kobayashi K., Gamma-ray-induced amino acid formation in aqueous small bodies in the early solar system. ACS Cent. Sci. 8, 1664–1671 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Botta O., Bada J. L., Extraterrestrial organic compounds in meteorites. Surv. Geophys. 23, 411–467 (2002). [Google Scholar]
  • 59.Kitadai N., Maruyama S., Origins of building blocks of life: A review. Geosci. Front. 9, 1117–1153 (2018). [Google Scholar]
  • 60.Furukawa Y., Otake T., Ishiguro T., Nakazawa H., Kakegawa T., Abiotic formation of valine peptides under conditions of high temperature and high pressure. Origins Life Evol. Biosph. 42, 519–531 (2012). [DOI] [PubMed] [Google Scholar]
  • 61.Lambert J. F., Adsorption and polymerization of amino acids on mineral surfaces: A review. Origins Life Evol. Biosph. 38, 211–242 (2008). [DOI] [PubMed] [Google Scholar]
  • 62.Takahagi W., et al., Peptide synthesis under the alkaline hydrothermal conditions on Enceladus. ACS Earth Space Chem. 3, 2559–2568 (2019). [Google Scholar]
  • 63.Holden D. T., Morato N. M., Cooks R. G., Aqueous microdroplets enable abiotic synthesis and chain extension of unique peptide isomers from free amino acids. Proc. Natl. Acad. Sci. U.S.A. 119, e2212642119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Griffith E. C., Vaida V., In situ observation of peptide bond formation at the water–air interface. Proc. Natl. Acad. Sci. U.S.A. 109, 15697–15701 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Deal A. M., Rapf R. J., Vaida V., Water–air interfaces as environments to address the water paradox in prebiotic chemistry: A physical chemistry perspective. J. Phys. Chem. A 125, 4929–4942 (2021). [DOI] [PubMed] [Google Scholar]
  • 66.Krasnokutski S. A., Chuang K. J., Jäger C., Ueberschaar N., Henning T., A pathway to peptides in space through the condensation of atomic carbon. Nat. Astron. 6, 381–386 (2022). [Google Scholar]
  • 67.Forsythe J. G., et al., Ester-mediated amide bond formation driven by wet–dry cycles: A possible path to polypeptides on the prebiotic earth. Angew. Chem. Int. Ed. 54, 9871–9875 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Rodriguez-Garcia M., et al., Formation of oligopeptides in high yield under simple programmable conditions. Nat. Commun. 6, 8385 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Imai Ei., Honda H., Hatori K., Brack A., Matsuno K., Elongation of oligopeptides in a simulated submarine hydrothermal system. Science 283, 831–833 (1999). [DOI] [PubMed] [Google Scholar]
  • 70.Dill K. A., et al., Principles of protein folding—A perspective from simple exact models. Prot. Sci. 4, 561–602 (1995). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Koga R., et al., Robust folding of a de novo designed ideal protein even with most of the core mutated to valine. Proc. Natl. Acad. Sci. U.S.A. 117, 31149–31156 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Yoo B., Kirshenbaum K., Peptoid architectures: Elaboration, actuation, and application. Curr. Opin. Chem. Biol. 12, 714–721 (2008). [DOI] [PubMed] [Google Scholar]
  • 73.Davidson A. R., Sauer R. T., Folded proteins occur frequently in libraries of random amino acid sequences. Proc. Natl. Acad. Sci. U.S.A. 91, 2146–2150 (1994). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Chiarabelli C., et al., Investigation of de novo totally random biosequences. Part II: On the folding frequency in a totally random library of de novo proteins obtained by phage display. Chem. Biodiver. 3, 840–859 (2006). [DOI] [PubMed] [Google Scholar]
  • 75.Keefe A. D., Szostak J. W., Functional proteins from a random-sequence library. Nature 410, 715–718 (2001). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Gorske B. C., Stringer J. R., Bastian B. L., Fowler S. A., Blackwell H. E., New strategies for the design of folded peptoids revealed by a survey of noncovalent interactions in model systems. J. Am. Chem. Soc. 131, 16555–16567 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Krishna M. M., Hoang L., Lin Y., Englander S. W., Hydrogen exchange methods to study protein folding. Methods 34, 51–64 (2004). [DOI] [PubMed] [Google Scholar]
  • 78.Li R., Woodward C., The hydrogen exchange core and protein folding. Prot. Sci. 8, 1571–1590 (1999). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Makarov M., et al., Early selection of the amino acid alphabet was adaptively shaped by biophysical constraints of foldability. J. Am. Chem. Soc. 145, 5320–5329 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Tretyachenko V., et al., Modern and prebiotic amino acids support distinct structural profiles in proteins. Open Biol. 12, 220040 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Rathore A., Martinez T. F., Chu Q., Saghatelian A., Small, but mighty? Searching for human microproteins and their potential for understanding health and disease. Expert Rev. Proteom. 15, 963–965 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Anderson D. M., et al., Widespread control of calcium signaling by a family of SERCA-inhibiting micropeptides. Sci. Sig. 9, ra119 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Martinez T. F., et al., Profiling mouse brown and white adipocytes to identify metabolically relevant small ORFs and functional microproteins. Cell Metab. 35, 166–183 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Cao X., et al., Nascent alt-protein chemoproteomics reveals a pre-60s assembly checkpoint inhibitor. Nat. Chem. Biol. 18, 643–651 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Durrant M. G., Bhatt A. S., Automated prediction and annotation of small open reading frames in microbial genomes. Cell Host Microbe 29, 121–131 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.D’Lima N. G., et al., A human microprotein that interacts with the mRNA decapping complex. Nat. Chem. Biol. 13, 174–180 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Huang J. Z., et al., A peptide encoded by a putative lncRNA HOXB-AS3 suppresses colon cancer growth. Mol. Cell 68, 171–184 (2017). [DOI] [PubMed] [Google Scholar]
  • 88.Nieto-Torres J. L., Verdiá-Báguena C., Castaño-Rodriguez C., Aguilella V. M., Enjuanes L., Relevance of viroporin ion channel activity on viral replication and pathogenesis. Viruses 7, 3552–3573 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Ray S., et al., The mlpt/Ubr3/Svb module comprises an ancient developmental switch for embryonic patterning. eLife 8, e39748 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Slavoff S. A., Heo J., Budnik B. A., Hanakahi L. A., Saghatelian A., A human short open reading frame (SORF)-encoded polypeptide that stimulates DNA end joining. J. Biol. Chem. 289, 10950–10957 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Shibue R., et al., Comprehensive reduction of amino acid set in a protein suggests the importance of prebiotic amino acids for stable proteins. Sci. Rep. 8, 1227 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Makarov M., et al., Enzyme catalysis prior to aromatic residues: Reverse engineering of a dephospho-CoA kinase. Prot. Sci. 30, 1022–1034 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Riddle D. S., et al., Functional rapidly folding proteins from simplified amino acid sequences. Nat. Struct. Biol. 4, 805–809 (1997). [DOI] [PubMed] [Google Scholar]
  • 94.Kimura M., Akanuma S., Reconstruction and characterization of thermally stable and catalytically active proteins comprising an alphabet of 13 amino acids. J. Mol. Evol. 88, 372–381 (2020). [DOI] [PubMed] [Google Scholar]
  • 95.Longo L. M., Tenorio C. A., Kumru O. S., Middaugh C. R., Blaber M., A single aromatic core mutation converts a designed “primitive” protein from halophile to mesophile folding. Prot. Sci. 24, 27–37 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Yagi S., et al., Seven amino acid types suffice to create the core fold of RNA polymerase. J. Am. Chem. Soc. 143, 15998–16006 (2021). [DOI] [PubMed] [Google Scholar]
  • 97.Despotović D., et al., Polyamines mediate folding of primordial hyperacidic helical proteins. Biochemistry 59, 4456–4462 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Foden C. S., et al., Prebiotic synthesis of cysteine peptides that catalyze peptide ligation in neutral water. Science 370, 865–869 (2020). [DOI] [PubMed] [Google Scholar]
  • 99.Takahashi Y., Mihara H., Construction of a chemically and conformationally self-replicating system of amyloid-like fibrils. Bioorg. Med. Chem. 12, 693–699 (2004). [DOI] [PubMed] [Google Scholar]
  • 100.Rout S. K., Friedmann M. P., Riek R., Greenwald J., A prebiotic template-directed peptide synthesis based on amyloids. Nat. Commun. 9, 234 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Menger F. M., Nome F., Interaction vs preorganization in enzyme catalysis. A dispute that calls for resolution. ACS Chem. Biol. 14, 1386–1392 (2019). [DOI] [PubMed] [Google Scholar]
  • 102.Page M. I., Jencks W. P., Entropic contributions to rate accelerations in enzymic and intramolecular reactions and the chelate effect. Proc. Natl. Acad. Sci. U.S.A. 68, 1678–1683 (1971). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Lau E. Y., Kahn K., Bash P. A., Bruice T. C., The importance of reactant positioning in enzyme catalysis: A hybrid quantum mechanics/molecular mechanics study of a haloalkane dehalogenase. Proc. Natl. Acad. Sci. U.S.A. 97, 9937–9942 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Dill K. A., Theory for the folding and stability of globular proteins. Biochemistry 24, 1501–1509 (1985). [DOI] [PubMed] [Google Scholar]
  • 105.Ghosh K., Dill K. A., Computing protein stabilities from their chain lengths. Proc. Natl. Acad. Sci. U.S.A. 106, 10649–10654 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Bromberg S., Dill K. A., Side-chain entropy and packing in proteins. Prot. Sci. 3, 997–1009 (1994). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Kocher C. D., Dill K. A., The prebiotic emergence of biological evolution. arXiv [Preprint] (2023). 10.48550/arXiv.2311.13650 (Accessed 22 November 2023). [DOI]
  • 108.Dyson F. J., A model for the origin of life. J. Mol. Evol. 18, 344–350 (1982). [DOI] [PubMed] [Google Scholar]
  • 109.Dyson F., Origins of Life (Cambridge University Press, 1999). [Google Scholar]
  • 110.Wu M., Higgs P. G., The origin of life is a spatially localized stochastic transition. Biol. Dir. 7, 42 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111.Shay J. A., Huynh C., Higgs P. G., The origin and spread of a cooperative replicase in a prebiotic chemical system. J. Theor. Biol. 364, 249–259 (2015). [DOI] [PubMed] [Google Scholar]
  • 112.Higgs P. G., Pudritz R. E., A thermodynamic basis for prebiotic amino acid synthesis and the nature of the first genetic code. Astrobiology 9, 483–490 (2009). [DOI] [PubMed] [Google Scholar]
  • 113.Bose T., et al., Origin of life: Protoribosome forms peptide bonds and links RNA and protein dominated worlds. Nucl. Acids Res. 50, 1815–1828 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.Pross A., The evolutionary origin of biological function and complexity. J. Mol. Evol. 76, 185–191 (2013). [DOI] [PubMed] [Google Scholar]

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
