2022 Jul 28;28(55):e202201419. doi: 10.1002/chem.202201419

On the Evolutionary History of the Twenty Encoded Amino Acids

Andreas Kirschning
PMCID: PMC9796705  PMID: 35726786

Abstract

α‐Amino acids are essential molecular constituents of life, twenty of which are privileged because they are encoded by the ribosomal machinery. The question remains open as to why this number and why these 20 in particular, an almost philosophical question that cannot be conclusively resolved. It is closely related to the evolution of the genetic code and to whether nucleic acids, amino acids, and peptides appeared simultaneously and were available under prebiotic conditions when the first self‐sufficient complex molecular system emerged on Earth. This report focuses on prebiotic and metabolic aspects of amino acids and proteins starting with meteorites, followed by their formation, including peptides, under plausible prebiotic conditions, and the major biosynthetic pathways in the various kingdoms of life. Coenzymes play a key role in the present analysis in that amino acid metabolism is linked to glycolysis and different variants of the tricarboxylic acid cycle (TCA, rTCA, and the incomplete horseshoe version) as well as the biosynthesis of the most important coenzymes. Thus, the report opens additional perspectives and facets on the molecular evolution of primary metabolism.

Keywords: amino acids, coenzymes, genetic code, metabolism, origin of life


Amino acids and proteins! “Unde venistis”? The answer to the question of where these (macro)molecules come from is told as a journey that starts with meteorites and ends with the origin of the genetic code. On this journey, which incorporates the citric acid cycle and glycolysis, molecular companions and fellow travelers appear on the scene, in particular coenzymes and cofactors.

1. Introduction

Reflections on the origin of life are speculative per se. Fundamental questions, such as a) how the 20 proteinogenic amino acids were selected and b) in what temporal sequence they are likely to have appeared on the world stage, open up the possibility of deductively delving far into the darkness of the past. [1] With respect to proteins, there is a growing body of evidence that the modern genetically encoded amino acid alphabet is evolutionarily highly optimized. This evidence includes the structural chemical diversity of the side chains [2] and the number twenty, which is found in all kingdoms of life and also in viruses, so that their roots must lie far back in the past. Indeed, there has been substantial progress in understanding amino acid evolution. [3] One scenario argues for an intensive selection process and the reduction to twenty amino acids, while the alternative suggests that new amino acids appeared successively over time and were fitted into the repertoire of protein synthesis.

The Hadean eon represents the time when the Earth first formed and abiotic synthesis prevailed (physical‐chemical evolution). The following eon, the Archean (about 3,500 million years ago), is described as the age of bacteria and archaea. By this time, template‐based peptide and protein synthesis as well as biosynthetic pathways to individual amino acids must have been established (biological evolution) (Figure 1).

Figure 1.

The principal phases of evolution and the two theories including the stages from abiotically formed amino acids to their metabolism as discussed in this report. [4]

Thinking about how life may have arisen is now essentially shaped by two hypotheses. The first (“RNA world”) [5] holds that the first living molecules were RNA molecules capable of both catalysis and carrying heritable information. The second hypothesis, which has gained strong support, especially recently, is called “metabolism first” (Figure 1). [6] It states that metabolic reactions preceded genetic information carriers. The reactions proceeded in self‐sustaining cycles, each growing into an ever larger cycle. Occasionally, these reactions split into two independent routes, so that evolution to new compounds or more efficient pathways could have developed from these mixtures of reactions.

The present report does not aim at advocating one of these two theories. Rather, it follows up on an earlier article that highlighted the relationship of coenzymes and cofactors to the RNA world theory. [7] This report now addresses the evolutionary interplay of metabolism and the emergence of coenzymes and cofactors by specifically focusing on the metabolism of amino acids, which is fed primarily through sugar metabolism and the citric acid cycle and ultimately leads to the question of the emergence of the genetic code. Thus, this report is not in opposition to the RNA world theory but complements the earlier article and expands the evolutionary view of coenzymes and cofactors within the prevailing theory of coevolution of peptides and nucleic acids.

2. Abiogenesis of Amino Acids and Peptides

2.1. The astrochemical scenario

Meteorites, especially the Murchison meteorite and the Martian meteorite ALH84001, as well as the coma of comet 67P/Churyumov‐Gerasimenko as measured by the ROSINA instrument,[ 8 , 9 ] are fruitful sources of amino acids. Based on their mineral constituents and rock material, they are considered to be similar in geological characteristics to the planetesimals that have survived since the formation of the solar nebula about 4.5 billion years ago.[ 10 , 11 ] In fact, more than 80 amino acids, both proteinogenic and non‐proteinogenic, have been found in meteorites to date in concentrations of up to about 60 ppm.[ 12 , 13 , 14 ] Twelve of them belong to the proteinogenic group (see Table 1), [13c] which gives rise to hypotheses about the minimum number of amino acids for subsequent steps in molecular evolution. [15] In particular, the presence of uracil and xanthine helped to identify the collected molecular samples as non‐terrestrial in origin based on carbon isotope ratio measurements. [16] There is only one reference to the existence of dipeptides and diketopiperazines of abiotic origin in a sample of the Murchison meteorite[ 17 , 18 ] and in the meteorite Y‐791198. [18] Recently, it was confirmed that glycine and other amino acids form in dense interstellar clouds well before these transform into new stars and planets. [19] Chemically, the combination of the formose reaction and the Strecker reaction is thought likely to have played a role, but UV photolysis of interstellar ice analogs and electrons mimicking cosmic rays have also been held responsible for the interstellar formation of amino acids. [20]

Table 1.

Summary of four major experimental set‐ups or protometabolic approaches, their chemical conditions, and a listing of the amino acids formed in these scenarios as well as the amino acids found on carbonaceous meteorites.[ 13 , 14 ]

[a] Status 2018 according to Ref. [13c]; [b] these amino acids are decomposed by ultraviolet radiation, which prevailed in space and on the early Earth; [c] Powner and co‐workers reported a cysteine synthesis based on a thio‐Michael addition of H2S to an acrylonitrile derivative (ref. [22b]); [d] only imidazole has been detected in carbonaceous meteorites.

2.2. Prebiotic amino acid and peptide synthesis

It seems rather unlikely that the evolution of life was fed solely, or at all, by the arrival of extraterrestrial organic components, including amino acids, through meteoritic carriers. Several different environments have been proposed as plausible locations for the starting point of the origin of life on planet Earth, and among these, hydrothermal vents and hydrothermal fields [40] have been favoured. It is accepted that the origin of life did not occur in a single location, as various conditions, some very specific, including heat, light, catalytic surfaces, and reductive environments must have been necessary to establish premetabolic networks. [21]

Conceivably, products and reactants were transported between individual, compartmentally distinct regions with different milieus. Classically, the Strecker synthesis using aldehydes, HCN and ammonia yields α‐aminonitriles and eventually α‐amino acids. In one variant, aldehydes and HCN are combined in water at neutral pH in the presence of diamidophosphate (DAP) to give N‐phosphoraminonitriles, which upon treatment with H2S provide α‐aminothioamides. [22a]
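The stoichiometry of the classical Strecker route can be checked with a short atom‐balance calculation. The following is a minimal sketch, assuming acetaldehyde as the aldehyde (which leads to alanine); the helper names `atoms`, `side` and `balanced` are hypothetical and serve only this illustration.

```python
import re
from collections import Counter


def atoms(formula: str) -> Counter:
    """Count atoms in a simple molecular formula such as 'C2H4O' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side(species: dict) -> Counter:
    """Sum the atoms on one side of a reaction given as {formula: coefficient}."""
    total = Counter()
    for formula, coefficient in species.items():
        for element, n in atoms(formula).items():
            total[element] += coefficient * n
    return total


def balanced(reactants: dict, products: dict) -> bool:
    """Return True if both sides contain the same atoms."""
    return side(reactants) == side(products)


# Step 1: acetaldehyde + HCN + NH3 -> 2-aminopropanenitrile + H2O
step1 = balanced({"C2H4O": 1, "CHN": 1, "H3N": 1}, {"C3H6N2": 1, "H2O": 1})

# Step 2: 2-aminopropanenitrile + 2 H2O -> alanine + NH3
step2 = balanced({"C3H6N2": 1, "H2O": 2}, {"C3H7NO2": 1, "H3N": 1})

print(step1, step2)  # True True
```

The sketch only verifies mass balance; it says nothing about rates, yields, or the plausibility of the required concentrations.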

Miller's spark discharge experiments, three of which are listed in Table 1, represented the first attempts to conduct abiotic chemistry under simulated conditions of the primordial Earth. [23] In a variation of the classic experiment (H2, H2O, CH4, and NH3), H2S and CO2 were added; the outcome was recently reanalyzed (Table 1). [24] The reanalysis revealed the presence of other amino acids, particularly methionine and cysteine, although the latter could only be confirmed by analysis of the degradation products.

Oró pointed out that the nucleotide bases are formally oligomers of hydrogen cyanide, and indeed adenine is formed from concentrated aqueous ammonium cyanide under refluxing conditions (Table 1). [25] In addition, some amino acids were reported to have formed from HCN and ammonia. Closely related to HCN is its formal hydrate formamide. In a series of studies under high‐energy conditions, [26] formamide, generated from ammonia and CO or formic acid at high temperature, led to various carboxylic acids and amino acids as well as urea and carbodiimide. In one case, various meteorite‐derived additives served as heterogeneous catalysts. [27]

The iron−sulfur hypothesis states that deposits of iron sulfide minerals near deep‐sea hydrothermal vents [28] are capable of providing the reductive medium and energy to catalyze complex reaction sequences from simple precursors such as CO and CO2, HCN and H2S. The formation of small molecules was observed under simulated conditions in an autoclave; for example, ammonia is produced from nitrate under these reducing conditions (FeS, H2S). [29]

Methanethiol was also a product from which S‐methyl ethanethioate (H3C−CO−SCH3) is generated in the presence of nickel and iron sulfide.[ 30 , 31 ] In the presence of HCN, α‐hydroxy and α‐amino acids such as glycolate/glycine, lactate/alanine, and glycerate/serine, as well as pyruvic acid, are generated. In addition, 2‐oxocarboxylic acids including pyruvic acid react with ammonia in the presence of iron hydroxide or iron sulfide and H2S to form α‐amino acids including alanine. [32] The reaction of α‐amino acids with carbonyl sulfide (COS), another important product, resulted in dipeptides and tripeptides.[ 33 , 34 ] Furthermore, it was reported that short phenylalanine oligopeptides are generated from the monomer when heated to 100 °C in the presence of (Ni,Fe)S and CO. [35] An attractive aspect of this theory is related to the reductive properties of H2S, which also plays a key role in Sutherland's cyanosulfidic protometabolism. It is based on the so‐called cyanosulfidic chemical homologation process, [36] in which the key building blocks of RNA, proteins, and lipids are formed only from hydrogen cyanide as the sole source of carbon and nitrogen. Hydrogen sulfide serves as a reducing agent under UV irradiation conditions in the presence of Cu(I)/Cu(II) catalysts, and this system effectively acts as a photoredox system. [37] An important aspect in the light of this article is the progression of cyanosulfidic protometabolism to amino acids. Without going into details here,[ 38 , 39 ] Sutherland and coworkers found multistep chemical routes leading to the amino acids glycine, alanine, serine, threonine, and proline, but also to the more polar representatives asparagine, aspartate, glutamine, and glutamate, and finally to the nonpolar amino acids valine and leucine (Table 1). [36]

None of these four approaches can claim to be universally valid; each has some shortcomings, be it the high concentrations required (HCN, DMF), long synthetic pathways (cyanosulfidic protometabolism), or dilution problems and questions about the influence of saline media on the downstream formation of complex molecular systems leading to protometabolic networks (deep‐sea hydrothermal vents). The inherent problem of increasing dilution with each reaction step, which arises simply because the individual yields can be far from quantitative, is overcome by so‐called wet‐dry cycles. They are particularly well realized in hydrothermal fields. These cycles, consisting of repetitive dilution by rain and short‐lived, small pond‐filling streams, followed by concentration through evaporation, are also considered essential for the sites where chemical steps for the prebiotic formation of peptides may have occurred.[ 40 , 41 , 42 ]
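The dilution argument can be made concrete with a small calculation: in a linear sequence the overall yield is the product of the individual step yields. The sketch below uses purely hypothetical numbers (five steps at 50 % each) that are not taken from the cited literature; the function names are likewise illustrative.

```python
from math import prod


def overall_yield(step_yields):
    """Fraction of starting material that survives a linear sequence of steps."""
    return prod(step_yields)


def concentration_factor_needed(step_yields):
    """Factor by which the product must be concentrated (e.g. by evaporation)
    to restore the molar concentration of the starting material."""
    return 1.0 / overall_yield(step_yields)


# Hypothetical example: five consecutive steps, each with 50 % yield
hypothetical_steps = [0.5] * 5

print(f"{overall_yield(hypothetical_steps):.3%}")       # 3.125%
print(concentration_factor_needed(hypothetical_steps))  # 32.0
```

Under these assumed yields, a 32‐fold concentration would be needed to compensate for the losses, which illustrates why evaporation in wet‐dry cycles is an attractive remedy.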

It is apparent that in different prebiotic milieus several amino acids, such as glycine and alanine, seem to form inevitably. For other amino acids, such as the two basic amino acids histidine and lysine but also the aromatic amino acids, especially tryptophan, the opposite is true. However, model experiments showed that histidine is accessible from imidazole‐4‐acetaldehyde, which in turn is formed from formamidine and erythrose. [43]

Prebiotic pathways to phenylalanine and tyrosine under primitive early Earth conditions have also been explored, with phenylacetylene playing a central role. [44] This is thought to form from various hydrocarbons including acetylene at high temperatures, by electrical discharges, or by ultraviolet light. H2S addition and hydrolysis would produce phenylacetaldehyde, which yields phenylalanine by Strecker chemistry. Experimentally, the authors also reported trace formation of tyrosine. Notably, some alkynes such as diacetylene are said to be present on Saturn's moon Titan. [45] The suggestion that tryptophan can form from indole and pyruvate under Friedel‐Crafts conditions with iron‐rich saponite followed by transamination under hydrothermal conditions must be considered speculative, also because little is known about whether indole was present in the prebiotic world. [46] For some amino acids, no convincing synthetic routes under prebiotic conditions have been found so far; e.g., the one proposed for arginine has been questioned by some authors.[ 47 , 48 ]

Among the five scenarios compiled in Table 1, the studies on meteorites stand out in that they aimed to reveal the full spectrum of products, including non‐proteinogenic amino acids. Remarkably, some are found in greater amounts than proteinogenic amino acids, and yet were not later linked to the genetic code (Figure 2).

Figure 2.

Selected non‐coded amino acids found on meteorites and in experiments with electrical discharges mimicking astrochemical conditions (the most abundant amino acids related to the major product glycine are highlighted in grey); additional extraterrestrial organic molecules are listed in the cited literature. [7]

At this point, it should be noted that several scenarios for the chemical activation of the carboxyl group in amino acids under plausible prebiotic conditions have been described, which would have enabled the prebiotic formation of peptides. [49] The reader is referred to several review articles.[ 50 , 51 , 52 ] In principle, it is accepted that a prebiotic chemistry for the formation of peptides must have existed on planet Earth, and it is reasonable to consider wet‐dry cycle conditions[ 53 , 54 , 55 ] such as could have existed in hydrothermal fields. [40]

Since amino acid metabolism is closely connected with the TCA cycle (tricarboxylic acid; see next section 3.1), it is important to note that experiments on the formation of intermediates of the TCA cycle have also been performed under putative prebiotic conditions, notably by Waddell[ 56 , 57 , 58 ] and Krishnamurthy [59] and coworkers. For example, citric acid can be generated from oxaloacetate under photolytic conditions, while 2‐oxoglutarate and glutamate yield succinic acid as the major photolysis product.

3. The Biosynthetic Scenario

3.1. Overview on amino acid biosyntheses

The twenty proteinogenic amino acids are biosynthesized from carbohydrate‐derived building blocks, and these are recruited from glycolysis and the TCA cycle. Histidine differs from the other amino acids in that portions of its backbone originate from ATP; thus, it is the only amino acid directly linked with nucleotide and purine metabolism.[ 60 , 61 ] Amino acids are typically grouped either according to these building blocks (pyruvate) or according to the amino acids that serve as precursors for other amino acids (aspartate, glutamate, serine). The aromatic amino acids are grouped according to their common chemical structural feature (Scheme 1). The biosynthesis of all proteinogenic amino acids is closely intertwined with the need for coenzymes, which are required both to provide building blocks and for the subsequent individual biosyntheses.

Scheme 1.

Overview of amino acid biosynthesis linked with glycolysis and the TCA cycle. The coenzymes involved (outlined in dashed lines) are assigned to these two metabolic elements, while the coenzymes required for amino acid biosynthesis are listed below Table 2.

At this point the reductive tricarboxylic acid cycle (rTCA cycle; sometimes also referred to as reverse TCA cycle) is worth mentioning.[ 62 , 63 ] In this variant, typically found in bacteria and archaea, the TCA cycle basically runs in reverse (Scheme 2). It provides organic building blocks from CO2[ 64 , 65 ] under highly reductive conditions with the electron donors H2S or thiosulfate. The rTCA cycle is not only considered ancient, but is also discussed as a plausible candidate for the first autotrophic metabolism with respect to the origin of life.[ 66 , 67 ]

Scheme 2.

Overview of the reductive TCA cycle with the coenzymes and cofactors involved (the numbering of intermediates is taken from the TCA cycle in Scheme 1).

The two TCA cycles differ in several aspects, for example 2‐oxoglutarate:ferredoxin oxidoreductase is a thiamine pyrophosphate (TPP)‐dependent enzyme with an additional [4Fe−4S] cluster 18 that transfers CO2 to succinyl CoA to yield 2‐oxoglutarate and CoA (6→5, Scheme 2). [68] The second step of CO2 fixation (5→4) is promoted by 2‐oxoglutarate carboxylase that uses the mixed anhydride after ATP activation. Interestingly, some facultative chemo‐lithoautotrophic thermophiles can switch between both TCA cycles depending on available carbon sources. [69]

Based on the assumption that the rTCA cycle has prebiotic significance for carbon fixation and that the intermediates oxaloacetate and 2‐oxoglutarate have served as precursors for amino acids, a chemical approach to mimic this cycle using metal salts was recently carried out which was coupled with a reductive amination step (see Scheme 10).[ 70 , 71 ]

3.2. Coenzymes

Coenzymes are small organic non‐protein compounds that specifically bind to proteins and actively participate in biotransformations.[ 72 , 73 , 74 , 75 ] They have been called vestiges from a prebiotic (RNA) world for several reasons:[ 7 , 76 ] (1) some of them are structurally closely related to RNA building blocks, as exemplified by the adenosine monophosphate handle (AMP, 11), (2) they have, unlike other biomolecules, not undergone structural changes, at least since the emergence of the last universal common ancestor (LUCA), and (3) they have more or less stuck to their primary role, which is to support catalysis. Coenzymes and cofactors can clearly serve as “checkpoints” to validate the plausibility of theories at any stage of molecular evolution, and as discussed here, this includes the evolution of amino acids and proteins.

3.2.1. No amino acid biosynthesis without coenzymes

In previously published hypotheses on molecular and biotic evolution, coenzymes have been more or less disregarded. For glycolysis, ATP/ADP and nicotinamide (NAD(P)+ 12) are required,[ 77 , 78 ] while the TCA cycle relies on NAD(P)+ 12, flavins (FAD 13) and guanosine diphosphate (GDP). Another important coenzyme is thiamine pyrophosphate (TPP, 16) which is necessary for the generation of erythrose‐4‐phosphate (E‐4‐P) from glyceraldehyde‐3‐phosphate (GAP 24).

This coenzyme is also found in α‐oxoglutarate dehydrogenase and pyruvate dehydrogenase; the latter operates as an entry into the TCA cycle. In anaerobic microorganisms, lipoic acid is replaced by ferredoxin 18 as found in pyruvate:ferredoxin oxidoreductase (PFOR), an enzyme that also occurs in the rTCA cycle (Scheme 2). [79] From an evolutionary point of view, it is an interesting fact that the pyruvate dehydrogenase complex also shows activity toward 2‐oxoglutarate, the substrate of α‐oxoglutarate dehydrogenase. [80] The number of steps as well as the coenzymes/cofactors needed for each amino acid biosynthesis are listed in Table 2. [81] Nature makes use of only a limited number of coenzymes, while with the exception of methionine, pterin‐based coenzymes such as flavins 13 and tetrahydrofolic acid (THF 20), as well as biotin and S‐adenosylmethionine, are not part of the list. It comes as a surprise that no metal‐based cofactors are involved in the formation of amino acids, as metals and metal‐based cofactors are thought to have played a key role very early in evolution (see Scheme 5).[ 82 , 83 , 84 ]

Table 2.

Top: Biosyntheses of the 20 coded amino acids with respect to the number of steps and the coenzymes required (the glutamate family does not require PLP* but 2‐oxoglutarate undergoes reductive amination with NAD(P)H 12 instead). Bottom: Structures of the AMP “handle” 11 and of coenzymes 12–20 involved in glycolysis, the TCA cycle and amino acid biosynthesis. The list includes the ferredoxin 18 that can exchange lipoic acid 17. [7]

[a] Ile and Lys can also be biosynthesized within the aspartate and glutamate families.

The redox coenzyme NAD(P)+/NAD(P)H 12 occurs in each of the known amino acid biosyntheses, with the exception of alanine, aspartate, and asparagine, suggesting that these nicotinamide‐based coenzymes probably appeared around the time when the evolution of amino acid biosyntheses took place. A similar assumption can be made for the coenzyme pyridoxal phosphate (PLP, 19) and its amino derivative pyridoxamine phosphate (PLP*). The best documented [7] and accomplished prebiotic syntheses exist for PLP 19 and nicotinamide 12 with a structurally simplified nucleotide element.[ 85 , 86 , 87 ] Pentoses and glyceraldehyde‐3‐phosphate serve as building blocks for the synthesis of PLP derivatives, while two routes are established to generate nicotinamides, with the biomimetic variant using dihydroxyacetone phosphate and aspartate as precursors. [88] The latter can react with ribose‐1,2‐cyclic phosphate to give ribosylated pyridinium salts derived from NAD+ 12. [89]

Analysis of the biosynthesis of nicotinamides or PLP supports their ancient nature. Two different biosynthetic pathways are known for each of the two coenzymes. One of each is very simple with respect to the number of coenzymes or cofactors required for their formation. In prokaryotes, NAD is usually biosynthesized from dihydroxyacetone phosphate and oxaloacetate or aspartate, with their imine derivative serving as the reactive intermediate, so that only ATP and the iron−sulfur cofactor are involved (Scheme 3, case I).[ 90 , 91 , 92 ] Remarkably, the [4Fe−4S] cluster 18 does not act as a redox cofactor here but rather as a Lewis acid. The simpler of the two biosynthetic routes towards PLP 19 starts from GAP 24 and ribose‐5‐phosphate, and ATP 14 is the only coenzyme needed (Scheme 3, case II). [93]

Scheme 3.

Summary of NAD+ 12, PLP 19 and TPP 16 biosyntheses (for TPP three routes to the two key intermediates 21 and 22: A1B1, A2B2 and A1B2). [103]

The biosynthesis of methionine requires THF 20 as a methyl‐transfer coenzyme.[ 94 , 95 ] In recent years, evidence has accumulated that in anaerobic archaea and bacteria the iron−sulfur corrinoid protein functions as a methyl group transfer system (see Scheme 5) in methionine biosynthesis, [96] indicating that THF and its variants likely appeared later.

3.2.2. Why is TPP a likely “latecomer”?

In the present context, TPP 16 is a particularly interesting coenzyme. It is directly involved in the biosynthesis of the three hydrophobic amino acids valine, leucine and isoleucine, the maintenance of the TCA cycle (dehydrogenase complexes; see above) and the formation of erythrose‐4‐phosphate (E‐4‐P), a precursor of aromatic amino acids.[ 97 , 98 , 99 ]

Nature has evolved three different biosynthetic pathways to TPP 16, all of which are terminated by a substitution reaction between hydroxymethyl−pyrimidine phosphate (HMP−P, 21) and hydroxyethylthiazole phosphate (HET−P, 22) and phosphorylation (Scheme 3, case III). [100] Two independent biosyntheses were found for each of the two fragments, with either ribose 5‐phosphate (R5P, 25) and glycine (A1) or alternatively protein‐bound histidine and PLP 19 serving as precursors to access HMP−P 21 (A2). The components for the biosynthetic generation of HET−P 22 are either glycine, cysteine, pyruvate 23 and GAP 24 (B1) or alternatively glycine and the five C atoms of ribose found in NAD+ 12 (B2).

This analysis is eye‐opening. Overall, three amino acids function as building blocks in the three TPP biosynthetic routes, namely glycine, cysteine and histidine. More significant, however, is the selection and function of the coenzymes required for the biosynthesis of fragments 21 and 22. First, bacterial biosynthesis B1 to 22 reveals that this route requires TPP for its own generation (pyruvate 23+GAP 24 yields 1‐deoxy‐xylulose‐5‐phosphate), making it an evolutionarily irrelevant one. [101]

Particularly surprising is the fact that the A2 and B2 biosynthetic subroutes use the coenzymes PLP and NAD+ as building blocks, a little‐known role for these coenzymes. This would imply that these two sub‐routes to TPP are evolutionarily younger than those of PLP 19 or NAD+ 12, so that biosynthetically TPP can be regarded to be a “latecomer”. [102] Consequently, this would mean that the known biosynthetic pathways to branched aliphatic α‐amino acids and aromatic amino acids are also more recent.
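The reasoning of this section can be summarized as a small dependency check. The sketch below encodes only what is stated in the text above: sub‐route A2 consumes PLP, B2 consumes NAD+, and B1 needs TPP itself. An empty set for A1 merely means that the text names no coenzyme for it, not that none is involved. All identifiers are hypothetical and chosen for this illustration.

```python
# Coenzymes on which each sub-route depends, as stated in the text.
# An empty set means "no coenzyme named in the text", not "none required".
HMP_ROUTES = {"A1": set(), "A2": {"PLP"}}
HET_ROUTES = {"B1": {"TPP"}, "B2": {"NAD+"}}

# The three combinations reported for TPP biosynthesis
KNOWN_COMBINATIONS = [("A1", "B1"), ("A2", "B2"), ("A1", "B2")]


def prerequisites(hmp_route: str, het_route: str) -> set:
    """Coenzymes that must already exist for this TPP route to operate."""
    return HMP_ROUTES[hmp_route] | HET_ROUTES[het_route]


def classify(hmp_route: str, het_route: str) -> str:
    """Describe the evolutionary implication of a TPP route."""
    needed = prerequisites(hmp_route, het_route)
    if "TPP" in needed:
        return "circular"  # the route presupposes its own product
    if needed:
        return "younger than " + ", ".join(sorted(needed))
    return "no coenzyme prerequisite recorded"


for hmp, het in KNOWN_COMBINATIONS:
    print(hmp + het, "->", classify(hmp, het))
# A1B1 -> circular
# A2B2 -> younger than NAD+, PLP
# A1B2 -> younger than NAD+
```

In this simplified model, every known route is either circular or presupposes another coenzyme, which is the basis for regarding TPP as a “latecomer”.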

3.3. Could amino acids be biosynthesized without the coenzyme TPP 16?

In view of the above discussion, it seems reasonable to perform a “Gedanken experiment” aimed at a TPP‐free metabolism for the TCA cycle and the biosynthesis of aliphatic and aromatic amino acids. This “experiment” must ensure that the altered metabolism closely follows existing biosynthetic pathways, in the spirit of François Jacob, who pointed out that evolution behaves like a tinkerer and does not constantly invent new pathways. [104] The execution of this “experiment” finds justification in today's world, as it is suspected that Borrelia burgdorferi does not rely on TPP. [105] This organism does not have the TCA cycle, does not perform oxidative phosphorylation, and has no known pathways for de novo biosynthesis of carbohydrates and amino acids. However, the exact metabolism that can fully explain the lack of thiamine‐dependent enzymes is still unknown.

3.3.1. Glycolysis and TCA cycle

First, we focus on the biotransformation of 2‐oxoglutarate to succinyl−CoA (Scheme 1). Are there any older variants of the TCA cycle that do not use TPP 16? A non‐cyclic putative evolutionary precursor of the TCA cycle, also known as the incomplete “horseshoe” TCA, has been found in the strictly anaerobic bacterium Elusimicrobium minutum [106] and the hyperthermophilic archaeon Ignicoccus hospitalis (Scheme 4). [107]

Scheme 4.

Amino acid biosynthesis without the coenzyme TPP 16: The incomplete (“horseshoe”) TCA cycle contains a reductive (via oxaloacetate) and an oxidative branch (via citrate). The Wood‐Ljungdahl C1 fixation pathway provides acetyl−CoA without the need of TPP (DKFP=6‐deoxy‐5‐ketofructose‐1‐phosphate, amino acids are listed in Scheme 1). [a] The list of coenzymes and cofactors was established in the acetogen Moorella thermoacetica; [b] for a prebiotic version of acetogenesis see Ref. [70, 71].

Here, the link between 2‐oxoglutarate and fumarate is interrupted by the absence of α‐oxoglutarate dehydrogenase and succinate dehydrogenase, so that an oxidative half‐cycle to α‐oxoglutarate and a reductive half‐cycle to fumarate remain intact. The reductive branch of the incomplete TCA cycle is initiated by the conversion of oxaloacetate to malate and then fumarate.

As a consequence, the horseshoe TCA does not require TPP 16. In many anaerobes such as E. minutum it serves for the biosynthesis of various, but not all, proteinogenic amino acids. [106] The organism is able to form glutamate, glutamine and proline (glutamate family), aspartate, cysteine and threonine (aspartate family), glycine, alanine and serine (pyruvate family), as well as lysine and histidine. However, it relies on the external influx of arginine, asparagine and methionine, as well as the aromatic and alkyl‐branched amino acids, which, notably, also depend on TPP 16 for their biosynthesis.
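The two lists given for E. minutum can be checked for completeness with simple set arithmetic. The sketch below transcribes them in three‐letter code; the assignment of the aromatic (Phe, Tyr, Trp) and alkyl‐branched (Val, Leu, Ile) amino acids follows the usage in this report, and the variable names are hypothetical.

```python
PROTEINOGENIC = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# Amino acids E. minutum is reported to form itself
SYNTHESIZED = {
    "Glu", "Gln", "Pro",   # glutamate family
    "Asp", "Cys", "Thr",   # aspartate family
    "Gly", "Ala", "Ser",   # pyruvate family
    "Lys", "His",
}

# Amino acids that must be taken up from the environment
AROMATIC = {"Phe", "Tyr", "Trp"}
BRANCHED = {"Val", "Leu", "Ile"}
IMPORTED = {"Arg", "Asn", "Met"} | AROMATIC | BRANCHED

# The two lists should partition the twenty encoded amino acids
is_partition = (
    SYNTHESIZED | IMPORTED == PROTEINOGENIC
    and not (SYNTHESIZED & IMPORTED)
)
print(is_partition)                            # True
print(len(SYNTHESIZED), len(IMPORTED))         # 11 9

# Imported amino acids whose biosynthesis is not tied to TPP in the text
print(sorted(IMPORTED - AROMATIC - BRANCHED))  # ['Arg', 'Asn', 'Met']
```

Six of the nine imported amino acids are thus the TPP‐dependent aromatic and branched representatives, which is consistent with the argument made above.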

Pyruvate dehydrogenase is the second enzyme to be considered in more detail. Acetogens use the Wood‐Ljungdahl or reductive acetyl−CoA pathway, a blueprint for a primal form of C1 fixation. [107] It proceeds under strictly anaerobic conditions in which two equivalents of CO2 are fixed and eventually converted to acetyl−CoA. It consists of two branches in which CO2 is first reductively converted to formate (methyl branch) or to CO (carbonyl branch). [108]

In the former, CO2 reduction is brought about by the pterin‐based molybdenum cofactor (Moco 28), supposedly an ancient redox system. [109] In bacteria, the resulting formic acid is next activated by ATP 14 and transferred to the second pterin‐derived coenzyme THF 20 (in archaea, it is the variant tetrahydromethanopterin (THMPT)), where the methyl group is formed via a series of reduction steps. From there it is finally transferred to the corrinoid cobamide. The other CO2 reduction in the CO branch is catalyzed by carbon monoxide dehydrogenase (CODH), which contains Ni,Fe centers 26. Subsequently, the metal‐bound carbon monoxide is condensed with coenzyme A and the methyl group donated by the cobamide, which eventually leads to acetyl−CoA in a reaction catalyzed by acetyl−CoA synthase (ACS; 27) (Scheme 5). [110]

Scheme 5.

Structures of cofactors and their role in acetogenesis with CODH and ACS.

Considering that the “experiment” aims to circumvent the use of TPP, the Wood‐Ljungdahl pathway reveals some problems: the need for the pterin‐derived coenzymes Moco, THF, and THMPT, respectively, creates a new roadblock because the pterin scaffolds likely appeared on the stage at a later point in time. Could it be, then, that a prebiotically generated activated acetic acid was recruited instead?

In one case, experiments to test the iron−sulfur hypothesis yielded S−methyl ethanethioate in the presence of nickel and iron sulfide.[ 30 , 31 ] Similarly, acetate and pyruvate 23 are formed from H2 and CO2 by the minerals greigite (Fe3S4), magnetite (Fe3O4) and awaruite (Ni3Fe). [111] This would imply that the prebiotic provision of building blocks fed the first self‐sustaining metabolic systems, and thus pre‐metabolism and later metabolism were based on chemical continuity.[ 112 , 113 ]

3.3.2. Aromatic and branched aliphatic amino acids

Next, we turn to the branched aliphatic as well as the aromatic amino acids. Erythrose‐4‐phosphate (E‐4‐P) and phosphoenolpyruvate (PEP) act as entries into the shikimate pathway, which is relevant for aromatic amino acids, and E‐4‐P is formed by a TPP‐dependent transketolase. [114]

An alternative, TPP‐free entrance into the shikimate pathway was found in some archaea such as Methanocaldococcus jannaschii, where 6‐deoxy‐5‐ketofructose‐1‐phosphate 30 (from pyruvate 23 via methylglyoxal 29) and L‐aspartate semialdehyde 31 serve as gateway building blocks (Scheme 6).[ 115 , 116 , 117 ] Their aldolase‐mediated coupling followed by oxidative deamination leads to diketocarboxylic acid 32, which is converted to 3‐dehydroquinic acid 33 by an intramolecular aldol reaction. Our knowledge of this biosynthetic pathway, however, is still too incomplete to make definite statements about its role in evolution. [118]

Scheme 6.

Alternative biosynthetic pathway to DHQ 33 in the archaeon Methanocaldococcus jannaschii.

The biosyntheses of the branched aliphatic proteinogenic amino acids valine, leucine, and isoleucine also depend on the acyl anion transfer reagent TPP, starting from 2‐oxocarboxylic acids: in the case of valine and leucine this is pyruvate 23, and for isoleucine it is 2‐oxobutyrate 40 (Scheme 7). The first important products of this acyl transfer process are 34 and 41, respectively, and the Breslow intermediate 38 is the key intermediate of this process. [119] It can be theoretically deduced that the coenzyme pyridoxamine phosphate (19, PLP*) should also be able to mediate such an acyl “Umpolung” if one follows the mechanistically known PLP‐mediated decarboxylation of amino acids to amines 37 (Scheme 7). Condensation of a 2‐oxocarboxylic acid with PLP* produces an imine that can be in tautomeric equilibrium with a second imine. If, instead of a proton, pyruvate 23 acts as an electrophile, followed by tautomerization of the resulting imine and subsequent hydrolysis of PLP*, coupling products 34 and 41, respectively, would be formed by an alternative pathway while making PLP* available for transaminations occurring later in the biosynthesis. The glycine cleavage system (GCS) provides an argument that a type 39 intermediate can be captured by electrophiles other than the proton; in the GCS case it is electrophilic sulfur in the oxidized form of lipoic acid. [120]

Scheme 7.

A. TPP‐mediated biosynthesis of valine and leucine and the hypothetical PLP‐mediated analogous conversion (in orange frame) to intermediate 34; B. Analogous hypothetical reaction sequence to isoleucine from 2‐oxobutyric acid 40 via intermediate 41 (orange frame) and 2‐oxocarboxylate 42 (PLP*=pyridoxamine phosphate).

Why has nature not realized this alternative, although PLP/PLP* 19 seems to be a simpler carrier of acyl anions compared to TPP 16 according to this mechanism? Is it an ancient, evolutionarily buried role of PLP and PLP*? Or has this role just not been found yet? Are there chemically or biosynthetically simpler approaches to TPP 16 or to simplified analogs unknown to us to date? Or are the three branched aliphatic amino acids perhaps indeed evolutionary “latecomers”? Chemically, the TPP‐mediated conversion to 34 or 41 is expected to be more efficient than the conversion with PLP* proposed here, since the latter proceeds via several tautomeric imine intermediates, each prone to hydrolysis under aqueous conditions. So there may have been a quest for a more “robust” acyl anion methodology. This “experiment” shows that it is worthwhile to think about hidden chemical roles of coenzymes. [121]

3.4. Do amino acids exist that could have been preferred over the coded ones?

The ancient TCA cycle is closely linked to amino acid biosynthesis; [122] could it have been the starting point for other α‐amino acids based on recurring patterns of chemical sequences? Such a sequence is found for 2‐oxoglutarate, formed from oxaloacetate, which consists of an initial aldol‐like addition of acetyl−CoA 36 followed by water elimination and re‐addition of water with opposite regiocontrol. Next, NAD+‐mediated oxidation leads to the (often spontaneous) loss of the original carboxyl group. In essence, a 2‐oxocarboxylate is transformed into a homologous 2‐oxocarboxylate in which a methylene group is formally inserted between the keto function and the substituent R (Scheme 8). [123] This sequence is also found in the biosyntheses of glutamate, arginine and the non‐proteinogenic amino acid ornithine (all from oxaloacetate), lysine (from 2‐oxoglutarate) and leucine (from 3‐methyl‐2‐oxobutanoic acid 35). In a remarkable iterative process, 2‐oxoglutarate is extended three times (via 2‐oxoadipate and 2‐oxopimelate) to 2‐oxosuberate, en route to coenzyme B as well as in the biosynthesis of the coenzyme biotin (for coenzyme B see Scheme 8). [124] Further theoretical examples, not realized by Nature, could be formulated for the proteinogenic amino acids threonine, isoleucine, and methionine, which are in principle accessible from homoserine (42), which in turn could form from serine or 3‐hydroxypyruvate. However, homoserine is biosynthesized from aspartate via the semialdehyde 31. The naturally occurring, non‐proteinogenic amino acids ornithine and homoalanine are also formed from the aspartate/oxaloacetate and alanine/pyruvate couples, [125] via this homologation sequence. The last precursor en route to homoalanine is α‐oxobutyrate 40, for which, however, two other biosynthetic alternatives from aspartate and threonine, respectively, are known. [126]

Scheme 8.

Nature's homologation concept of 2‐oxocarboxylates (C atoms marked in grey highlight the homologation process).
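The net effect of the homologation sequence can be followed with a small bookkeeping sketch. It assumes the linear 2‐oxodicarboxylic acids HOOC–CO–(CH2)n–COOH named in the text and models each pass through the four‐step sequence simply as n → n+1; the names `NAMES`, `formula`, and `homologate` are illustrative and not taken from the cited literature.

```python
# Linear 2-oxodicarboxylic acids HOOC-CO-(CH2)n-COOH, keyed by the number n
# of methylene groups between the keto function and the distal carboxyl group.
NAMES = {
    1: "oxaloacetate",
    2: "2-oxoglutarate",
    3: "2-oxoadipate",
    4: "2-oxopimelate",
    5: "2-oxosuberate",
}


def formula(n: int) -> str:
    """Molecular formula of the free acid HOOC-CO-(CH2)n-COOH."""
    carbons = n + 3        # two carboxyl carbons, one keto carbon, n methylene carbons
    hydrogens = 2 * n + 2  # two acidic protons plus two hydrogens per methylene
    return f"C{carbons}H{hydrogens}O5"


def homologate(n: int, rounds: int = 1) -> int:
    """Model each homologation round as the formal insertion of one CH2 group."""
    return n + rounds


if __name__ == "__main__":
    start = 2  # 2-oxoglutarate, the starting point on the way to coenzyme B
    for rounds in range(4):
        n = homologate(start, rounds)
        print(rounds, NAMES[n], formula(n))
```

Three rounds starting from 2‐oxoglutarate give 2‐oxosuberate, in line with the iterative extension described for coenzyme B; the sketch tracks only the carbon skeleton and says nothing about the enzymes or cofactors involved.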

This four‐step homologation sequence is likely very ancient because a) only the coenzyme nicotinamide is required, b) it is found in the TCA cycle and its incomplete “horseshoe” variant and c) in principle it should be reproducible under prebiotic reaction conditions. This suggests the existence of ancient protoenzymes with lower substrate specificity (see Figure 3C).

Figure 3.

Hypotheses A–C of metabolic evolution.

Evidence for this hypothesis is still pending, as this would imply that norvaline and norleucine could theoretically have been included in the inventory of encoded hydrophobic amino acids. Both amino acids occur intracellularly, and the enzymes involved, such as isopropylmalate synthase, are promiscuous and accept different 2‐oxoacids (Scheme 7).[ 127 , 128 ] In addition, some degree of promiscuity of leucyl−tRNA synthetase results in a mischarged norvalyl−tRNALeu that evades translational proofreading activity and leads to norvaline‐containing proteins. [129] These observations suggest that these chemical ambiguities may have played a role in determining the genetic code (see section 5.4), but norvaline and norleucine were ultimately omitted for whatever reason.

Interestingly, the noncoding amino acids N‐methylalanine, homoalanine, homoserine, and norvaline are on the list of non‐proteinogenic amino acids found on meteorites. The Miller‐Urey experiments also yielded non‐proteinogenic amino acids, including norvaline and norleucine, suggesting that they must have been present during the transition to a coded world. [130] The absence of these n‐alkyl amino acids was justified by their structural similarity to the versatile proteinogenic amino acid methionine. From a biosynthetic point of view, however, this comparison makes little sense, since the biosynthesis for methionine is among the most complex (see Table 2) and probably appeared late on the stage. Surprisingly, homoalanine and ornithine were also not included into Nature's portfolio of encoded amino acids, although in terms of polarity, homoalanine can be placed between alanine and the branched aliphatic amino acids.

The evolutionary role of ornithine is quite peculiar. Since ornithine is the biosynthetic precursor of arginine, it was suspected that ornithine was initially encoded in the early phase of the genetic code and was later replaced by arginine. However, the chemical reactivity of ornithine leaves it out of the privileged list, since in its activated form (e. g., when bound to tRNA) it readily cyclizes to the corresponding δ‐lactam. Moreover, the arginine content of proteins is much lower than expected, considering that 6 of the 61 codons for amino acids in the genetic code are for arginine. [131] Presumably, when the code was expanded, lysine was added with an amino side chain that filled the gap left by the omission of ornithine (see Section 5.3). This is also indicated by the fact that lysine is more abundant than one would expect based on its two codons.
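The codon argument can be made explicit with a short sketch. It assumes a naive null model in which every sense codon of the standard genetic code is used equally often, so that the expected share of an amino acid is simply its codon count divided by 61; this is an illustration, not the analysis of the cited work, and the name `codon_share` is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ... GGG ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of sense codons per amino acid (stop codons excluded).
SENSE_COUNTS = Counter(aa for aa in CODE.values() if aa != "*")


def codon_share(aa: str) -> float:
    """Fraction of the sense codons assigned to one amino acid (one-letter code)."""
    return SENSE_COUNTS[aa] / sum(SENSE_COUNTS.values())


if __name__ == "__main__":
    for aa in ("R", "K"):
        print(aa, SENSE_COUNTS[aa], f"{codon_share(aa):.3f}")
```

Under this assumption arginine (6 of 61 codons) would make up about 9.8 % of all residues and lysine (2 of 61) about 3.3 %; these are the reference values against which the observed abundances are judged.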

3.5. Three “chicken and egg” problems need to be resolved

Considerations on the biosynthesis of amino acids, their possible timing in evolution, and the special role of coenzymes have obscured some fundamental dilemmas. The generation of the 20 encoded amino acids is catalyzed by enzymes, supported by the presence of coenzymes. Enzymes themselves consist of α‐amino acids whose formation they catalyze, so we are faced with a case of causal circularity or, in more general parlance, a “chicken and egg” problem. This also applies to the pairs amino acids/coenzymes and proteins/coenzymes, since the latter are also biosynthesized by enzymes but are themselves essential for various steps in the biosynthesis of amino acids (see Scheme 3). [132] Clearly, the emergence of a biotic world is inconceivable without the presence of abiotic molecules such as amino acids, small peptides and coenzymes/cofactors, even if they should only be considered in the broader context of a replicable RNA world‐[ 5 , 133 ] or viroid‐like system [134] or as part of a systems chemistry approach. [135]

4. Hypotheses on Linking Metabolic Pathways with Prebiotic Chemism

The emergence and refinement of basic biosynthetic pathways enabled primitive organisms to become increasingly independent of exogenous sources. With the in‐depth coverage of both abiotic and biotic amino acid syntheses, the assumption can be made that incidental reactivities, both enzymatic and inherently chemical, provide the background against which the recruitment of individual enzymes greatly enhances the function of a slow but pre‐existing multistep sequence. [136] One such entire block of analogous enzyme sequences was described in Scheme 8.

Why and in which way could a metabolism finally evolve? To this end, several elementary, forward‐in‐time hypotheses have been formulated in an attempt to rationalize the evolutionary dynamics that transitioned the prebiotic world into the biotic one. [131] Drivers of metabolic evolution are thought to be duplication and divergence of genes and enzymes, kinetic optimization of established pathways through fusion of enzymes, and minimization of ATP unit cost, thereby improving thermodynamic efficiency.

4.1. The “retrograde” hypothesis

The “retrograde” theory of evolution, put forward by Horowitz, [137] states that the first living species was a completely heterotrophic organism (Figure 3A). It proliferated at the expense of prebiotically formed organic molecules, for example amino acids or molecules produced under prebiotic conditions with the properties of modern coenzymes and cofactors. The organism would then use up the environmental reserves of such a molecule A and deplete it to a point where growth is limited. In such an environment, any organism that evolves an enzyme or catalytic system capable of synthesizing A from a precursor B would have a clear selection advantage and would rapidly expand in the environment. This selection process could be repeated for subsequent “generations” until the completion of the biosynthetic pathway known today.

The theory additionally states that further evolution is likely to be based on a random combination of genes. Thus, the simultaneous unavailability of two intermediates (e. g. B and C) would favor a symbiotic association between two mutants, one capable of synthesizing B and the other capable of synthesizing C from other precursors in the environment, leading to the evolution of short reaction chains. Thus, the theory also incorporates the idea of parasitism and symbiosis as driving forces of evolution.[ 138 , 139 ]

The theory contains aspects that have been critically commented upon. The evolution of metabolic pathways in the reverse direction requires particular environmental conditions in which prebiotically formed organic compounds and potential precursors accumulated, but their presence became depleted over time. In addition, the origin of many other anabolic metabolic pathways cannot be inferred from their backward evolution because they involve many unstable intermediates. How could these have accumulated in prebiotic and present‐day environments?

4.2. The “forward” hypothesis

A lesser‐known proposal negates the importance of prebiotic compounds in biological evolution. [140] Here, biosynthesis of end products occurs by forward evolution from simpler precursor molecules (Figure 3B) and enzymes catalyzing earlier steps in a metabolic pathway are older than those acting later. Each intermediate in a biosynthetic pathway must therefore be useful to the organism, since simultaneous evolution of multiple genes in a sequence is too unlikely. The hypothesis seems to be questionable for complex, linear biosynthetic pathways such as those for the purines and the branched‐chain amino acids, where the intermediates have no obvious benefit to the organism. [141]

4.3. The “patchwork” hypothesis

The “patchwork” hypothesis emphasizes the role of primitive enzymes in the evolution of metabolic pathways.[ 136 , 142 ] These early enzymes were able to react with a broad range of chemically related substrates (Figure 3C). [143] The catalytic capabilities would still have been low in terms of “turnover”, but allowed metabolism for primitive cells with still small genomes. With the arrival of next‐generation enzymes (E2), in which the amino acid sequence is slightly different, substrate specificity and catalytic activity increased. As defined, the “patchwork” concept could not become effective until after the appearance of protein biosynthesis and the rise of enzymes. There is some evidence that the “patchwork” hypothesis had some validity in evolutionary amino acid biosynthesis, particularly for threonine, tryptophan, isoleucine, and methionine (three of them belong to the aspartate family). [60] An example of the “patchwork” hypothesis is the homologation strategy discussed in Scheme 8. Comparative structural and functional analyses revealed that a small number of amino acid substitutions in the active site lead to paralogous proteins. They can recognize substrates with different aliphatic chain lengths, as found in the biosynthesis of coenzyme B (CoB): in the methanogen Methanocaldococcus jannaschii, the homocitrate synthase (HCS) is able to accept 2‐oxoglutarate, 2‐oxoadipate, and 2‐oxopimelate. [124]

The “patchwork” hypothesis can also be applied to the biosynthesis of TPP 16 and of the imidazole‐containing 5,6‐dimethylbenzimidazole (DMB, 46), the lower ligand in cobamide and vitamin B12 (Scheme 9).[ 144 , 145 ] Comparative genomics on bacterial thiamine‐pyrimidine synthase (ThiC) revealed the existence of a paralog of thiC, the HBI synthase (BzaF), which is clustered with anaerobic genes for vitamin B12 biosynthesis.

Scheme 9.

TPP and DMB biosynthesis, which can be linked to the “patchwork” hypothesis (Figure 3C) (PRPP=5‐phosphoribosyl‐1‐pyrophosphate).

Both enzymes use the same substrate phosphoribosyl‐aminoimidazole (AIR, 43) and promote quite different radical chain reactions induced by the coenzyme SAM and cofactor ferredoxin 18. [146] In the case of TPP biosynthesis, the product is hydroxymethyl‐pyrimidine phosphate (HMP−P, 21) [101] while in the DMB branch, 5‐hydroxybenzimidazole (HBI, 45) is formed first. [147] The sequence and structural similarities of the two enzymes are reflected in the proposed mechanisms. The key chemical driver for the formation of the subsequent intermediates is the radical cation 44, from where the two routes separate. Subtle changes in the amino acid sequence of the protein determine which of the two completely different products with distinct biological functions is formed. The progenitor protein E1 was still likely to have generated several products B–D from A (Figure 3C) as it was unable to control the high reactivity of the intermediate radicals. In the further course, this protein evolved into the two related enzymes E2 and E3, which selectively enabled the formation of the two most important products B and D.

4.4. Mixed origin of metabolic pathways

Later the “patchwork” hypothesis was extended by including prebiotic chemistry and combining it with the appearance of the first enzymes.[ 113 , 148 ] It was assumed that prebiotically generated molecules should be chemically quite stable. These were complemented by molecules derived from existing metabolic pathways in cells for which stability was not a mandatory requirement. The expansion of the metabolic repertoire should have occurred by gene duplication and should have produced non‐specific catalysts. Supposedly, these early proteins were formed by non‐enzymatic reactions.[ 21c , 149 ]

The so‐called mixed‐origin approach can be exemplified for the biomimetic trans‐ or reductive amination of 2‐oxo‐carboxylic acids (Scheme 10). The non‐enzymatic transamination of glyoxylate with glutamine as amino donor is the earliest example of such a prebiotic study, and the reversal of transamination, namely from glycine to glyoxylate in the presence of formaldehyde, has also been reported.[ 150 , 151 ] However, this reaction exhibits an unfavorable equilibrium for the formation of imines in the aqueous medium. It can be circumvented by using a stoichiometric amount of hydrazine (or hydroxylamine) and added metallic iron to create a reductive environment.[ 70 , 71 ]

Scheme 10.

Merging prebiotic chemistry with concepts of metabolic evolution exemplified for transamination of 2‐oxocarboxylic acids (E1, E2=enzymes).

En route to the biotic world, a possible prebiotic (biomimetic) [152] generation of PLP 19 pushed the system, because it is known that coenzymes alone can drive their chemistry (Scheme 10, top). [7] The transamination chemistry became more efficient with the emergence of the first enzyme E1, a transaminase capable of binding PLP, but with low substrate specificity. According to the “patchwork” hypothesis, the promiscuous character gradually disappeared after the formation of next‐generation enzymes, promoted by gene duplication (e. g. E2). In fact, PLP‐dependent transaminases are known for their broad specificity. [153]

4.5. “Shell” hypothesis

As a final example, the so‐called “shell” hypothesis is discussed, which specifically addresses the reductive citric acid (rTCA) cycle. According to this hypothesis, it should have led to an “energy amphiphile” core, being the starting point for the formation of new molecules. These propagate in a network‐like manner and constitute molecular shells over previously formed metabolic nuclei (Figure 4, top). [154a] This hypothesis assumes that the prebiotic chemical processes are “imprinted” on modern metabolism as relics. [155] Accordingly, metabolic biogenesis manifested itself in a hierarchy of nested reaction networks of increasing complexity. [154] It starts with shell A, which includes glycolysis and fatty acid biosynthesis. [156] This was followed by the introduction of nitrogen, originating from amino acids, in shell B and eventually sulfur in shell C. Consequently, purines, pyrimidines, and many other cofactors or coenzymes formed as evolutionary “latecomers”. In parallel, the rTCA cycle underwent a change via the bidirectional variant to, finally, the TCA cycle. In this scenario, current enzymes are replaced by naturally occurring minerals or metal ions – a concept that has also been proposed for other metabolic cycles and networks.[ 29 , 157 , 158 ]

Figure 4.

Morowitz's “shell” hypothesis (top) and its adaptation when the incomplete horseshoe variant is included (bottom). Coenzymes labeled with numbers I and II appeared early in evolution, for example NAD+ and PLP.

However, one can consider a modified account by taking into account the close evolutionary relationship between amino acid metabolism and the availability of selected coenzymes (Table 2) [131] and combining this with the horseshoe TCA cycle, the ancient form of the TCA cycle (see Scheme 4), which would substantially modify Morowitz's hypothesis (Figure 4, bottom). NADH‐dependent reductions and PLP‐mediated transaminations are central to the metabolism of most amino acids. The simple amino acids glycine, [159] alanine and aspartate are synthesized from the corresponding 2‐oxocarboxylic acids in one step by transamination, first chemically (see Scheme 10, top) and later, after its arrival, by the coenzyme PLP/PLP* 19 (Scheme 10, bottom). Elements of the original “shell” hypothesis found support when the rTCA cycle was recently reactivated under putative prebiotic conditions using metal salt promoters.[ 70 , 71 ]

The idea of primordial metabolic cycles has generated controversial debates. [160] Orgel pointed out that abiotic reactions proceed at low yields, i. e., the more reaction steps that occur in linear succession, the more catastrophically the overall yield decreases. This is a particularly fatal problem in metabolic cycles because the substrate concentration for the first step depends entirely on the yields of the following steps.
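Orgel's yield argument is simple arithmetic and can be sketched as follows. The per‐step yields used here are purely hypothetical numbers chosen for illustration, and the function names are not taken from the cited work; the only assumption is that the overall yield of a sequence is the product of its step yields and that nothing replenishes the losses of a cycle.

```python
import math
from typing import Sequence


def overall_yield(step_yields: Sequence[float]) -> float:
    """Overall yield of a linear sequence: the product of the individual step yields."""
    return math.prod(step_yields)


def remaining_after_turns(step_yields: Sequence[float], turns: int) -> float:
    """Fraction of a cycle intermediate left after several full turns,
    assuming the losses of each turn are not replenished from outside."""
    return overall_yield(step_yields) ** turns


if __name__ == "__main__":
    steps = [0.8] * 10  # hypothetical: ten steps, each proceeding in 80 % yield
    print(f"one pass:   {overall_yield(steps):.3f}")
    print(f"five turns: {remaining_after_turns(steps, 5):.2e}")
```

With ten steps of 80 % each, only about 11 % of the material survives a single pass, and the loss compounds with every further turn of a cycle unless the intermediates are resupplied.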

5. Retro‐Bioanalytical Approaches to the Origin of Amino Acid Metabolism

5.1. The last universal common ancestor (LUCA)

The transition from the abiotic to the biotic world can also be analyzed in the inverse direction, starting from today’s amino acid metabolism and its coding at the genetic level. Among other early “life forms”, [134] LUCA, a central theoretical model from which the three kingdoms of Life (bacteria, archaea, and eucarya) evolved, needs to be covered. Using the mapped genomic diversity of the biotic world, [161] a comprehensive phylogenetic reconstruction of the metabolic abilities of LUCA was reported by Martin et al. [161b] The analysis suggests that LUCA was an autotrophic, thermophilic, anaerobic prokaryote, living in hydrothermal vents, relying on the Wood‐Ljungdahl pathway (see Scheme 5).

Consequently, LUCA must have already used the entire alphabet of the 20 encoded amino acids as well as most of the coenzymes and cofactors. Thus, LUCA cannot resolve the problems of causal circularity mentioned in Section 3.5 but must have arisen long after the appearance of the first life forms.[ 162 , 163 ]

5.2. The chemical space and physicochemical properties of proteinogenic amino acids

An evaluative approach employs the analysis of the chemical space filled by the various functionalized side chains of proteinogenic amino acids as well as their ability to form polar interactions with the surrounding medium. [164] Recently published in silico studies on hypothetical peptide sets were performed with amino acid compositions comprising 3–19 amino acids, and a total of 1913 structurally distinct α‐amino acids were included in the repertoire. The adaptive value of their combined physicochemical properties, compared to those of the modern set consisting of the known twenty amino acids, was determined. As a major result, it was found that such hypothetical sets, which included the encoded amino acids, are particularly adaptive.

It was concluded that each time a coded amino acid appeared on the scene, it provided adaptive value. Each selection step could have helped to expand the evolving set of amino acids, leading to an increase in the number of encoded amino acids in the emerging alphabet. Why then did life stop exploring chemical space further with the 20 encoded amino acids? It was suggested that property space was already well‐explored at this stage. As soon as the evolving organisms acquired one or more amino acids from the modern alphabet, their abilities improved, so that natural selection becomes visible already at this molecular level.

Supposed “latecomers” such as the aromatic amino acids phenylalanine, tyrosine, and tryptophan, but also cysteine, exert a strong influence on the structural rigidity of peptides and proteins. The starting point of another study was therefore the replacement of the aromatic amino acids in an enzyme, specifically dephospho‐coenzyme A kinase, by lysine. [165] One finding was that the two‐step process catalyzed by the enzyme was not completely deactivated, particularly the second step. This presumed decoupling of activity and structure was taken as an indication that early life could get along for a time without highly structured proteins. Only in the further course of evolution did the repertoire of amino acids complete itself with more strongly structuring amino acids. [166]

The hypothesis that active proteins with a much smaller repertoire of amino acids are conceivable has also been tackled by Akanuma et al. [167] A folded, soluble and stable, supposedly ancient nucleoside kinase with reduced catalytic activity can arise from an alphabet of only 13 amino acids that lacks, for example, Cys, Phe, Ile, Met, Gln, Thr, and Trp. [168] The authors selected the remaining thirteen amino acids based on the assumption that LUCA was likely a thermophilic organism that required thermostable primordial proteins. [169] These conceptual de novo strategies form the basis for the emergence of an early protein world and provide thoughts on the origin of the genetic code.

5.3. Histidine and cationic amino acids

Among the 20 encoded amino acids, histidine stands out in many respects. It has been most often discussed in terms of a retro‐bioanalytical rationale to explain the selection of the twenty encoded amino acids and the emergence of the genetic code, despite the fact that so far, no convincing conditions for the abiotic synthesis of this amino acid have been found.[ 43 , 170 ] Moreover, the presence of imidazole‐containing purines in meteorites is well documented, but histidine is conspicuously absent in carbonaceous chondrites (see Table 1).

Amino acids with positively charged side chains, particularly lysine, arginine, and histidine, are considered key players in the co‐evolution between early proteins and early nucleic acids because of their ability to stabilize RNA by acting as an early chaperone and their role in expanding the catalytic repertoire of the molecular world at that time (RNA world theory). [5] It was argued that in the biotic world, the imidazole side chain often acts as a catalytic site in enzyme reactions by mediating general acid‐base catalysis in evolutionarily highly conserved catalytic triads. [171]

The biosynthesis of histidine is unusual in that it is the only one that uses a nucleotide as the starting building block. If the currently favored RNA‐world theory is correct, then this would consequently induce a direct connection of histidine to the purines, as is also the case for so many coenzymes.[ 7 , 76 ] Despite the fact that the biosynthesis is quite long and consists of ten linear steps, many of which are hydrolytic or condensation reactions, only the simple coenzymes pyridoxamine (PLP*, 19) and NAD+ 12 are actually needed. In all histidine‐synthesizing organisms, the pathway is unbranched consisting of nine intermediates and of eight distinct proteins, encoded by eight genes. Analysis of the structure and organization as well as the phylogenetic analyses of the his genes, which included the role of gene fusions, [172] suggest that histidine biosynthesis must be very ancient and that it may have existed already well before the emergence of LUCA. [173]

These analyses focused primarily on the hisA and hisF genes because they provide partial evidence for the “retrograde” hypothesis on the origin and evolution of metabolic pathways (Figure 5). The genes are paralogous and are arranged in tandem in the same operon and the two associated proteins catalyze two sequential steps in the same biosynthetic pathway. Histidine may thus play a key role in elucidating the transition from abiotic to biotic evolution and how the RNA world was fused with the emerging world of amino acids, proteins and coenzymes.

Figure 5.

A. The genetic code table (adapted and slightly modified from Ref. [15, 9a]). Thick lines divide the table into quadrants between which a transversion mutation is required to change the encoded amino acid; (orange N=3rd nucleotide exchanged from G via A, C and U). B. Evolution of the amino acid repertoire in protein biosynthesis according to references [188] and [193] (“GC first”).

Finally, it has been speculated that, if cationic amino acids were indeed absent in a primitive RNA world, complexes of negatively charged amino acid residues, such as those found in aspartate and glutamate, with divalent metal cations (e. g. Mg2+, Fe2+) may have taken over the role of basic amino acids in primitive prebiotic peptides. [174] It was also found that removal of divalent Mg and its replacement by the prebiotic metal iron(II) leads to an expansion of the catalytic repertoire of RNA.

An interesting approach to probing the currently strongly favored theory of coevolution of peptides and nucleic acids makes use of cationic protopeptides. [175] In particular, depsipeptides and polyesters can interact directly with RNA, leading to mutual stabilization. Interestingly, RNA is able to increase the lifetime of such protopeptides.

5.4. The genetic code as guideline

5.4.1. Crick's “frozen accident” theory

Francis Crick published the first thoughts on the selection of 20 encoded amino acids, called the “frozen accident theory”. It states that it would have been just as possible for another group of (originally abiotically formed) 20 amino acids to cross the “finish line” first, but after the first representatives had been admitted, the door to the exclusive club was closed to the other amino acids present. [176] According to this theory, the process of freezing took place when life had developed a certain degree of complexity. Due to its high evolutionary potential, this system was successful in competition with all other systems and has therefore survived as the only universal system until today. Before that time, changes in protein sequences resulting from changes in the code could thus be tolerated.

However, Crick's model does not explain why some codons can be grouped according to the physicochemical properties of their respective amino acids. [177] It seems likely that amino acids were not chosen primarily for their ability to support catalysis in the form of their oligomers, since (metal) complexes with highly effective cofactors for carrying out diverse reactions have likely already existed. [5]

Would more rational ties between amino acids and the formation of the genetic code be conceivable instead? [178] Just as the discussions about the origin of life have yielded a considerable variety of different proposals and hypotheses (besides the RNA world hypothesis, [5] also the thioester world proposed by De Duve, [179] the peptide model of Commeyras, [180] the sugar model of Weber [181] and others [133] ), the picture of the formation of the genetic code discussed until today appears equally confusing, as is shown below. [182]

5.4.2. The “UA first” model

Based on the assumption that RNA must have preceded DNA, a primordial genetic code beginning with U and A was proposed. [183] Accordingly, codons for Lys, Met, Ile, Asn, Tyr, Leu, Phe, and a stop codon arose in the first evolutionary phase. However, most of these amino acids are based on many biosynthetic steps requiring many coenzymes, and therefore can be considered rather evolutionary latecomers from a metabolic point of view. For several valid reasons, the alternative starting point with G and C as the initial letters of the primordial code has found greater popularity. [184] In this case, it is not so much the biosyntheses of the nucleic acid building blocks that are used for primary analysis, but those of the amino acids.
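Which assignments a two‐letter code can carry is easy to enumerate. The sketch below assumes the present‐day standard genetic code as a stand‐in for the primordial assignments, which is a simplification and not necessarily what the cited model assumes; the helper `codons_from` is a hypothetical name.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ... GGG ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons_from(alphabet: str) -> dict:
    """All codons built only from the given nucleobases, with their assignments."""
    allowed = set(alphabet)
    return {codon: aa for codon, aa in CODE.items() if set(codon) <= allowed}


# The eight codons that can be written with U and A alone.
UA_CODONS = codons_from("UA")

if __name__ == "__main__":
    for codon, aa in sorted(UA_CODONS.items()):
        print(codon, aa)
```

Under the standard table the eight U/A‐only codons read as Phe, Leu, Tyr, Ile (twice), Asn, Lys and one stop. Met does not appear in this enumeration because its only standard codon, AUG, contains G.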

5.4.3. The “glycine first” model

Among the 20 canonical amino acids of the biological coding system, the amino acid glycine is one of the most abundant in prebiotic experiments (see Table 1). Bernhardt, Patrick as well as Tamura assumed that coding triplets are point mutated, and glycine (codon: GG N) served as the starting point. [159] This was followed by serine, aspartic acid and/or glutamic acid – small hydrophilic amino acids. At this stage, this would have given rise to short, water‐soluble peptides. Evolution of the code is thought to have occurred by duplication and mutation of tRNA sequences, resulting in a radiation of codon assignment from the top left corner outward (Figure 5 A). In this way, small hydrophobic peptides or mixed peptides would have been added. Gradually, longer polypeptides would have formed that contained a hydrophobic core for folding and stability.

Francis and others [185] also took the genetic code table as a starting point but came to a modified conclusion. Based on quantitative analysis of abiotically produced amino acids, they concluded that the original genetic code consisted of only the four GNC triplets encoding Gly, Asp/Glu, Ala, and Val (top row, Figure 5 A).
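The four GNC triplets can be listed directly from the standard genetic code. This short Python sketch is an illustration under the assumption that today's codon assignments apply; the variable names are hypothetical. In the present-day table GAC encodes Asp, whereas Glu is encoded by GAA and GAG and therefore does not appear in the output.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols ('*' = stop)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# G in the first position, any base (N) in the second, C in the third
gnc_code = {f"G{n}C": CODON_TABLE[f"G{n}C"] for n in BASES}
print(gnc_code)
```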

5.4.4. The “GC first” model

Hartman and Smith proposed that the first set of amino acids should be linked to the GC code, so that the coding stretch consisted only of the nucleobases guanine and cytosine (Figure 5 B). [186] The “GC first” model, in which not an amino acid but these two nucleotides act as the conceptual starting point, overlaps strongly with the “glycine first” model, since glycine is encoded by the four triplets GGN. As a consequence, glycine, alanine, proline, and an amino acid with a positively charged side chain were the first for template‐controlled peptide biosynthesis. [186] Budisa and co‐workers proposed ornithine (see section 3.3), which was later exchanged for arginine (Figure 5 B).

This hypothesis follows biochemical assumptions, and the initial selection therefore focuses on the interaction with, and eventual stabilization of, polyanionic RNA by the resulting early peptides. This enabled the later co‐evolution of peptide sequences and the translation apparatus. A crucial aspect that the authors included in their considerations is the necessity of protein folding in the context of the “RNA world theory”. It could have started with elongated and rigid peptides, with proline serving as the key amino acid. The first repertoire had to include positively charged amino acids that could dock to negatively charged RNA bodies. In addition, alanine as the simplest chiral amino acid supplemented the initial list of coded amino acids. The next evolutionary step in the code was the addition of the purine base adenine, which established a GCA code and gave access to additional, now polar, amino acids, namely aspartate and glutamate, asparagine and glutamine, threonine, serine, and histidine.

Finally, uracil and consequently the hydrophobic amino acids methionine, leucine, isoleucine, valine, phenylalanine, tyrosine, tryptophan, cysteine and lysine came into play, many of which are formed via complex biosynthetic pathways. Based on these considerations, it was possible to develop a stepwise model correlating the gradual recruitment of encoded amino acids with the hierarchy of protein folding, which eventually yielded α‐helical structures and the establishment of protein tertiary structures. [187]
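The stepwise expansion of the nucleotide alphabet (GC, then GCA, then GCAU) can be traced against the standard genetic code. The Python sketch below is an illustration, not the cited authors' procedure; it assumes present-day codon assignments, and the names `STAGES` and `amino_acids_over` are hypothetical. Under these assumptions lysine (AAA/AAG) is formally reachable as soon as adenine is added, whereas the model described above places it in the last phase, and arginine stands in for the ornithine proposed for the first phase.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols ('*' = stop)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

STAGES = ["GC", "GCA", "GCAU"]  # alphabet available at each stage


def amino_acids_over(alphabet):
    """Amino acids (and stop) with at least one codon built only from `alphabet`."""
    allowed = set(alphabet)
    return {aa for codon, aa in CODON_TABLE.items() if set(codon) <= allowed}


known = set()
newly_added = {}
for alphabet in STAGES:
    reachable = amino_acids_over(alphabet)
    newly_added[alphabet] = sorted(reachable - known)  # what this stage contributes
    known = reachable
    print(alphabet, newly_added[alphabet])
```

The first stage yields Gly, Ala, Pro, and Arg; the second adds the polar residues; stop codons and the hydrophobic residues only become available once uracil is included.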

Recently, Westhof et al. presented an alternative circular representation of the genetic code table, choosing an asymmetric distribution of codons (Figure 5A, bottom). [188] In it, there is a clear separation between GC‐rich 4‐codon boxes and AU‐rich 2 : 2‐codon and 3 : 1‐codon boxes. Within this presentation, the multiplicity and complexity of nucleotide modifications, particularly in the anticodon loop, are usefully separated. They correlate well with the need to stabilize AU‐rich codon‐anticodon pairs. Westhof also regards the GC pair as the starting point, which is gradually extended by A/U together with tRNA modifications and the modification of enzymes (see below: chapter 5.4.7 and Scheme 11).

Scheme 11.

tRNA‐linked amino acid biosynthesis of glutamine and 5‐aminolevulinic acid in the thermophilic archaeon Methanopyrus kandleri (Mka) (top); tRNA‐linked amino acid biosynthesis of cysteine found in Methanobacteriales species (bottom).

5.4.5. The RNY hypothesis

Another concept, primarily proposed by Shepherd, states that primitive tRNA‐based translation relied on repeating RNY coding triplets, where R stands for a purine, Y for a pyrimidine, and N for either. According to this view, RNY triplets are remnants of an ancient code that is still detectable today, although it has been largely replaced by the present universal code. [189] In fact, these triplets outnumber RNR triplets fourfold in genes for extant proteins. The excess of RNY codons is, however, likely due to a preference for the corresponding tRNAs rather than to the remnants of an ancestral genetic code. RNY triplets encode Asn, Ser, Thr, Ile, Asp, Gly, Ala, and Val, with no basic amino acids among them. It has been postulated that once the primordial genetic code reached the RNY stage, the elimination of any amino acid was strongly inhibited and the genetic code was therefore already frozen to some extent. [190] But the need for basic amino acids to stabilize RNA in a prebiotic world, none of which are encoded by RNY triplets, poses a problem for the validity of the hypothesis. [191]
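The amino acids covered by RNY triplets can be reproduced from the standard genetic code. The Python sketch below is an illustration under the assumption that present-day codon assignments apply; the variable names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols ('*' = stop)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = "AG"      # R
PYRIMIDINES = "CU"  # Y

# All triplets of the form purine, any base, pyrimidine
rny_code = {
    r + n + y: CODON_TABLE[r + n + y]
    for r in PURINES
    for n in BASES
    for y in PYRIMIDINES
}
rny_amino_acids = sorted(set(rny_code.values()))
basic_in_rny = set("KRH") & set(rny_amino_acids)  # Lys, Arg, His

print(len(rny_code), rny_amino_acids, basic_in_rny)
```

The 16 RNY triplets map to exactly the eight amino acids listed above, and the intersection with the basic residues Lys, Arg, and His is empty.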

5.4.6. The theory of co‐evolution of the genetic code

The theory of co‐evolution of the genetic code postulates that prebiotic synthesis did not provide all twenty amino acids and therefore some of them had to come from the co‐evolving amino acid biosynthetic pathways. [192] The addition of new amino acids increased protein versatility, improving the trajectory for perfecting catalytic capabilities in some enzymes and reducing combined transcription and translation errors to below 0.0003. [193] Finally, against a background of decreasing errors, the noise introduced by the insertion of an additional amino acid with complete codon assignment would result in a selective disadvantage that far outweighed the advantage of a new amino acid side chain. Accordingly, the expansion of the code to include additional amino acids with full codon assignment had to come to a halt. In contrast, the argument has been advanced that the number of encoded amino acids may have been greater due to the plethora of amino acids available from the chemical environment. [194] The number of amino acids may also have varied at different times until the selection process was complete, in line with Crick's “frozen accident” theory. This argument is supported by the analysis of some contemporary proteins composed of fewer than 20 different amino acids and by the experiments described in section 5.2. [195]

5.4.7. Biosynthetic steps on tRNA‐linked amino acids

Another feature that may have come into play at a later stage in the evolution of the genetic code is tRNA‐linked amino acid biosynthesis. This concerns the hydrolytically sensitive amino acids asparagine and glutamine. Most bacteria and all archaea arrive at Gln−tRNAGln by the transamidation pathway (Scheme 11), [196] which could then have led to the expansion of the genetic code (for glutamine: CAA and CAG after replacing the first base G with C). [197]
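The codon relationship mentioned for glutamine can be checked in a few lines. The Python sketch below is an illustration that assumes the standard genetic code; the variable names are hypothetical. It takes the glutamate codons and replaces the first base G with C.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols ('*' = stop)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Glutamate (E) codons in the standard table
glu_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "E")

# Replace the first base (G) with C and look up the new assignment
shifted_codons = ["C" + c[1:] for c in glu_codons]
shifted_amino_acids = [CODON_TABLE[c] for c in shifted_codons]

print(glu_codons, "->", shifted_codons, shifted_amino_acids)
```

Both shifted codons, CAA and CAG, are assigned to glutamine (Q) in the standard table.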

Biosynthesis with tRNA‐bound glutamate is still found today, for example in the thermophilic archaeon Methanopyrus kandleri (Mka). Glu−tRNA is thereby converted to the semialdehyde by Glu−tRNA reductase. [198] From there a PLP‐dependent biotransformation leads to 5‐aminolevulinic acid and hence to tetrapyrroles and urogen III, a biosynthetic precursor of heme. [199]

Additional examples of tRNA‐linked amino acid biosynthesis are found exclusively in methanogenic archaea of the Methanobacteriales species, there especially for the formation of cysteine. Starting from D‐3‐phosphoglycerate, O‐phospho‐L‐serine is first biosynthesized, which is loaded onto tRNACys by ATP activation and then converted to tRNA‐bound cysteine at this site. [200]

With this perspective in mind, it was postulated that some early biosynthetic steps on tRNA‐linked amino acids existed in the pre‐LUCA era and that these tRNA‐associated biotransformations were then replaced by today's tRNA‐independent biosyntheses. Consequently, these disappeared from organisms living today. This hypothesis does not take into consideration that coenzymes such as NADPH 12 and PLP 19, which are involved in the biosynthesis of the cofactor heme, had to be present. It is therefore suggested that the timing of coenzyme appearance should also be included in models of genetic code evolution. [201]

5.4.8. Plausibility check exemplified by the “GC first” model

Even though the number of hypotheses and theories about the origin of the genetic code is even larger than the ideas presented here, there is still no reliable solution to the riddle. Are the criteria used to determine which code is a suitable candidate for the primitive genetic code inadequate? The RNY code, the AU code, and the GC code model are mainly based on the codon pattern of the genes existing today, the RNA world theory of the origin of life and nucleotide metabolism, the stability of the RNA secondary structure, and the simplicity of the code compared to the universal genetic code, respectively. Prebiotic amino acid chemistry, the evolution of early amino acid metabolism, the transition or “hybridization” between the two, and the occurrence and role of coenzymes such as PLP 19 or other chemical mediators and catalysts have received only marginal attention. Therefore, the different scenarios show some weaknesses. This can be illustrated by the choice of the first four amino acids in the “GC first” model (section 5.4.4 and Figure 5 B). Serine appears in the second phase, while glycine in this scenario is at the very beginning of the evolution of the genetic code (see also “glycine first” model). This contradicts the biosynthesis known today, which starts from glycerate‐3‐phosphate and proceeds via serine, which in turn serves as the precursor of glycine: the reverse of the order in this model. It can therefore be argued that glycine had to be recruited from the prebiotic environment. The second proposed early amino acid, alanine, which is formed in one step by transamination from pyruvate 23, is a metabolically reasonable choice. The third amino acid is proline, which is biosynthetically formed from glutamate in four steps (Table 2). However, glutamate comes into play later, in phase two of the “GC first” model, again a chronology that does not quite match the biosynthetic order. One may argue that in Methanococcus jannaschii ornithine serves as precursor in proline biosynthesis, with NAD+ being required as coenzyme. [202] Although ornithine is named as one of the first amino acids encoded in the “GC first” theory (see Figure 5 B), the fundamental problem persists because ornithine is also biosynthesized from glutamate.
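The plausibility check described above amounts to comparing two orderings: the phase in which an amino acid enters the code and the order in which it arises biosynthetically. The Python sketch below encodes only the relationships named in this section; the dictionaries and the function `chronology_conflicts` are hypothetical illustrations, not an established method.

```python
# Phase in which each amino acid enters the code in the "GC first" model,
# as summarized in the text (1 = initial GC stage, 2 = after adenine is added).
CODE_PHASE = {"Gly": 1, "Ala": 1, "Pro": 1, "Orn": 1, "Ser": 2, "Glu": 2}

# Present-day biosynthetic precursors named in the text (product -> precursor).
# Alanine is omitted because it derives directly from pyruvate.
PRECURSOR = {"Gly": "Ser", "Pro": "Glu", "Orn": "Glu"}


def chronology_conflicts(code_phase, precursor):
    """Return (product, precursor) pairs where the precursor is encoded later."""
    return [
        (product, source)
        for product, source in precursor.items()
        if code_phase[source] > code_phase[product]
    ]


conflicts = chronology_conflicts(CODE_PHASE, PRECURSOR)
for product, source in conflicts:
    print(f"{product} is encoded before its precursor {source}")
```

All three precursor relationships are flagged, which mirrors the argument that glycine, proline, and ornithine would have had to come from the prebiotic environment or from a different early metabolism.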

6. Summary and Outlook

The present article reveals that the molecular evolution of amino acids up to the genetic code cannot be understood in isolation, without also considering that of nucleic acids (although not discussed in detail here) and of coenzymes and cofactors. These three lineages must have arisen at an early stage of evolution, not independently of each other but in direct interplay and mutual influence, as outlined in Scheme 12. [203] In particular, the two transitional phases from the prebiotic world to the first premetabolic systems and protocells, as well as the transition to LUCA, both marked with large arrows, appear shrouded in fog. LUCA, regarded as an autotrophic, thermophilic, anaerobic prokaryote that lived in hydrothermal vents and used the Wood‐Ljungdahl pathway (section 5.1), [161] likely evolved from a bewildering variety of different premetabolic forms of protocells that are very difficult to grasp or describe.

Scheme 12.

Three parallel lineages of molecular and metabolic evolution: nucleic acids (I), coenzymes/cofactors (II) and proteins and metabolic networks (III) – a hypothetical model as well as structural and functional relationships of coenzymes/cofactors to the “genetics first” and “metabolism first” theories.

Highly parasitic life forms, such as the marine hyperthermophilic and chemolithoautotrophic archaeon Nanoarchaeum equitans, [204] which still survive with reduced genomes and an incomplete metabolism but recruit all metabolic building blocks and nutrients from outside, are interesting models for pre‐LUCA forms of life and merit in‐depth study. The difference from such parasitic organisms would be that these ancient forms would have had to take up the building blocks from their abiotic environment.

Thus, the journey recounted in this report ends with a take‐home message: the evolutionary development of metabolism, specifically that of amino acids and proteins, can and should be considered only in the context of the origin of nucleotides, coenzymes, and cofactors, and not in isolation. In a nutshell, metabolism (“metabolism first”) and the evolutionary relation to replication (“genetics first”) should not be considered alone (Figure 1). Rather, coenzymes and cofactors are evidently an important link between the two, in simplified terms structurally with nucleic acids [7] and functionally with metabolism (Scheme 12).

Conflict of interest

The author declares no conflict of interest.

Biographical Information

Andreas Kirschning studied chemistry at the University of Hamburg and at Southampton University (UK). In Hamburg he joined the group of Prof. Ernst Schaumann and received his PhD in 1989 in the field of organosilicon chemistry. After a postdoctoral stay at the University of Washington (Seattle, USA) with Prof. Heinz G. Floss, he moved to the Technical University of Clausthal in 1991. In 2000 he became full professor at Leibniz University of Hannover. His research interests include all aspects of natural products including mutasynthesis and the use of terpene synthases. Another important aspect of his research addresses the development and combination of enabling technologies in organic synthesis, specifically flow chemistry in combination with inductive heating techniques. Recently, he has devoted himself to questions concerning the origin of life with special reference to the role of coenzymes and cofactors in evolution.

Acknowledgements

I thank my students and Dr. Carsten Zeilinger (Leibniz University Hannover) who provided important suggestions and ideas. Open Access funding enabled and organized by Projekt DEAL.

A. Kirschning, Chem. Eur. J. 2022, 28, e202201419.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

References

  • 1. Considerations on the origin of life are typically approached using the hypothetico-deductive method:
  • 1a. Świeżyński A., Int. J. Astrobiol. 2016, 15, 291–299; [Google Scholar]
  • 1b. Wächtershäuser G., J. Theor. Biol. 1997, 187, 483–494; [DOI] [PubMed] [Google Scholar]
  • 1c. Kirschning A., Reydon T. A. C., Beilstein J. Org. Chem. 2015, 11, 893–896. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. In rare cases, the codon UGA stands for selenocysteine, and the codon UAG for pyrrolysine; neither of these is covered here.
  • 3. 
  • 3a. Ruiz-Mirazo K., Briones C., de la Escosura A., Chem. Rev. 2014, 114, 285–366; [DOI] [PubMed] [Google Scholar]
  • 3b. Fried S. D., Fujishima K., Makarov M., Cherepashuk I., Hlouchova K., J. R. Soc. Interface 2022, 19: 20210641. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.The RNA world theory as well as the emergence of homochirality are not in the focus of this review.
  • 5. 
  • 5a. Gilbert W., Nature 1986, 319, 618; [Google Scholar]
  • 5b. Joyce G. F., Nature 2002, 418, 214–221; [DOI] [PubMed] [Google Scholar]
  • 5c. Robertson M. P., Joyce G. F., Cold Spring Harbor Perspect. Biol. 2012, 4, a003608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Anet F. A., Curr. Opin. Chem. Biol. 2004, 8, 654–659. [DOI] [PubMed] [Google Scholar]
  • 7. Kirschning A., Angew. Chem. Int. Ed. 2020, 60, 6242–6269. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. 
  • 8a. Sephton M. A., Nat. Prod. Rep. 2002, 19, 292–311; [DOI] [PubMed] [Google Scholar]
  • 8b. Öberg K. I., Chem. Rev. 2016, 116, 9631–9663; [DOI] [PubMed] [Google Scholar]
  • 8c. Altwegg K., Balsiger H., Bar-Nun A., Berthelier J.-J., Bieler A., Bochsler P., Briois C., Calmonte U., Combi M. R., Cottin H., De Keyser J., Dhooghe F., Fiethe B., Fuselier S. A., Gasc S., Gombosi T. I., Hansen K. C., Haessig M., Jäckel A., Kopp E., Korth A., Le Roy L., Mall U., Marty B., Mousis O., Owen T., Rème H., Rubin M., Sémon T., Tzou C.-Y., Waite J. H., Wurz P., Sci. Adv. 2016, 2, e1600285. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Bada J. L., Glavin D. P., McDonald G. D., Becker L., Science 1998, 279, 362–365. [DOI] [PubMed] [Google Scholar]
  • 10. Shaw A. M., Astrochemistry: From Astronomy to Astrobiology; John Wiley & Sons: New York, NY, USA, 2007. [Google Scholar]
  • 11. Lauretta D. S., McSween H. Y., Meteorites and the Early Solar System II; University of Arizona Press: Tucson, AZ, USA, 2006. [Google Scholar]
  • 12. 
  • 12a. Sephton M. A., Astron. Geophys. 2004, 45, 2–8; [Google Scholar]
  • 12b. Koga T., Naraoka H., Sci. Rep. 2017, 7: 636. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. 
  • 13a. Cronin J., Cooper G., Pizzarello S., Adv. Space Res. 1995, 15, 91–97; [DOI] [PubMed] [Google Scholar]
  • 13b. Elsila J. E., Aponte J. C., Blackmond D. G., Burton A. S., Dworkin J. P., Glavin D. P., ACS Cent. Sci. 2016, 2, 370–379; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13c. Burton A. S., Berger E. L., Life 2018, 8, 14. [Google Scholar]
  • 14. Elsila J. E., Johnson N. M., Glavin D. P., Aponte J. C., Dworkin J. P., Meteorit. Planet. Sci. 2021, 56, 586–600. [Google Scholar]
  • 15. Ruf A., d'Hendecourt L. L. S., Schmitt-Kopplin P., Life 2018, 8, 18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. 
  • 16a. Oba Y., Takano Y., Naraoka H., Watanabe N., Kouchi A., Nat. Commun. 2019, 10, 4413; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16b. Naraoka H., Yamashita Y., Yamaguchi M., Orthous-Daunay F.-R., ACS Earth Space Chem. 2017, 1, 540–550. [Google Scholar]
  • 17. Oba Y., Naraoka H., Meteorit. Planet. Sci. 2006, 41, 1175–1181. [Google Scholar]
  • 18. Shimoyama A., Ogasawara R., Origins Life Evol. Biospheres 2002, 32, 165–179. [DOI] [PubMed]
  • 19. Hadraoui K., Cottin H., Ivanovski S. L., Zapf P., Altwegg K., Benilan Y., Biver N., Della Corte V., Fray N., Lasue J., Merouane S., Rotundi A., Zakharov V., A&A 2019, 630, A32. [Google Scholar]
  • 20. 
  • 20a. Bernstein M. P., Dworkin J. P., Sandford S. A., Cooper G. W., Allamandola L. J., Nature 2002, 416, 401–403; [DOI] [PubMed] [Google Scholar]
  • 20b. Portugal W., Pilling S., Boduch P., Rothard H., Andrade D. P. P., MNRAS 2014, 441, 3209–3225. [Google Scholar]
  • 21. 
  • 21a. Semenov S. N., Kraft L. J., Ainla A., Zhao M., Baghbanzadeh M., Campbell V. E., Kang K., Fox J. M., Whitesides G. M., Nature 2016, 537, 656–660; [DOI] [PubMed] [Google Scholar]
  • 21b. Islam S., Powner M. W., Chem. 2017, 2, 470–501; [Google Scholar]
  • 21c. Kitadai N., Maruyama S., Geoscience Front. 2018, 9, 1117–1153. [Google Scholar]
  • 22. 
  • 22a. Ashe K., Fernandez-García C., Corpinot M. K., Coggins A. J., Bučar D.-K., Powner M. W., Commun. Chem. 2019, 2, 23; [Google Scholar]
  • 22b. Foden C. S., Islam S., Fernández-García C., Maugeri L., Sheppard T. D., Powner M. W., Science 2020, 370, 865–869. [DOI] [PubMed] [Google Scholar]
  • 23. 
  • 23a. Miller S. L., Science 1953, 117, 528–529; [DOI] [PubMed] [Google Scholar]
  • 23b. Miller S. L., Urey H. C., Science 1959, 130, 245–251. [DOI] [PubMed] [Google Scholar]
  • 24. 
  • 24a. Parker E. T., Cleaves H. J., Dworkin J. P., Glavin D. P., Callahan M., Aubrey A., Lazcano A., Bada J. L., Proc. Natl. Acad. Sci. USA 2011, 108, 5526–5531; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24b. Parker E. T., Cleaves H. J., Callahan M. P., Dworkin J. P., Glavin D. P., Lazcano A., Bada J. L., Origins Life Evol. Biospheres 2011, 41, 201–212; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24c. Bada J. L., Chem. Soc. Rev. 2013, 7, 2186–2196. [DOI] [PubMed] [Google Scholar]
  • 25. 
  • 25a. Oró J., Biochem. Biophys. Res. Commun. 1960, 2, 407–412; [Google Scholar]
  • 25b. Oró J., Kimball A. P., Arch. Biochem. Biophys. 1961, 94, 217–227; [DOI] [PubMed] [Google Scholar]
  • 25c. Oró J., Kamat S. S., Nature 1961, 190, 442–443. [DOI] [PubMed] [Google Scholar]
  • 26. Yamada H., Hirobe M., Okamoto T., Yakugaku Zasshi. 1980, 100, 489–492. [Google Scholar]
  • 27. Saladino R., Botta G., Delfino M., Di Mauro E., Chem. Eur. J. 2013, 19, 16916–16922. [DOI] [PubMed] [Google Scholar]
  • 28. Martin W., Russell M. J., Phil. Trans. R. Soc. B 2007, 362, 1887–1925. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Blöchl E., Keller M., Wächtershäuser G., Stetter K. O., Proc. Natl. Acad. Sci. USA 1992, 89, 8117–8120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Huber C., Wächtershäuser G., Science 1997, 276, 245–247. [DOI] [PubMed] [Google Scholar]
  • 31.G. Wächtershäuser, M. W. W. Adams; In J. Wiegel (ed.). Thermophiles: The Keys to Molecular Evolution and the Origin of Life. 1998, 47–57.
  • 32. Huber C., Wächtershäuser G., Science 2006, 314, 630–632. [DOI] [PubMed] [Google Scholar]
  • 33. Huber C., Wächtershäuser G., Tetrahedron Lett. 2003, 44, 1695–1697. [Google Scholar]
  • 34. 
  • 34a. Huber C., Wächtershäuser G., Science 1998, 281, 670–672; [DOI] [PubMed] [Google Scholar]
  • 34b. Leman L., Orgel L., Ghadiri M. R., Science 2004, 306, 283–286. [DOI] [PubMed] [Google Scholar]
  • 35. Huber C., Eisenreich W., Hecht S., Wächtershäuser G., Science 2003, 301, 938–940. [DOI] [PubMed] [Google Scholar]
  • 36. Sutherland J. D., Angew. Chem. Int. Ed. 2016, 55, 104–121; [DOI] [PubMed] [Google Scholar]; Angew. Chem. 2016, 128, 108–126. [Google Scholar]
  • 37. 
  • 37a. Sagan C., Khare B. N., Science 1971, 173, 417–420; [DOI] [PubMed] [Google Scholar]
  • 37b. Khare B. N., Sagan C., Nature 1971, 232, 577–579. [DOI] [PubMed] [Google Scholar]
  • 38. Patel B. H., Percivalle C., Ritson D. J., Duffy C. D., Sutherland J. D., Nat. Chem. 2015, 7, 301–307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Sasselov D. D., Grotzinger J. P., Sutherland J. D., Sci. Adv. 2020, 6: eaax3419. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Deamer D., Assembling Life: How Can Life Begin on Earth and Other Habitable Planets? Oxford University Press; 2019. [Google Scholar]
  • 41. 
  • 41a. Fares H. M., Marras A. E., Ting J. M., Tirrell M. V., Keating C. D., Nature Commun. 2020, 11, 5423; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41b. Ross D. S., Deamer D., Life 2016, 6, 28. [Google Scholar]
  • 42. Mamajanov I., MacDonald P. J., Ying J., Duncanson D. M., Dowdy G. R., Walker C. A., Engelhart A. E., Fernández F. M., Grover M. A., Hud N. V., Schork F. J., Macromolecules 2014, 47, 1334–1343. [Google Scholar]
  • 43. 
  • 43a. Shen C., Yang L., Miller S., Oró J., Origins Life Evol. Biospheres 1987, 17, 295–305; [DOI] [PubMed] [Google Scholar]
  • 43b. Shen C., Yang L., Miller S. L., Oró J., J. Mol. Evol. 1990, 31, 167–174. [DOI] [PubMed] [Google Scholar]
  • 44. Friedmann N., Miller S. L., Science 1969, 166, 766–767. [DOI] [PubMed] [Google Scholar]
  • 45. Guennoun Z., Coupeaud A., Couturier-Tamburelli I., Pietri N., Coussan S., Aycard J. P., Chem. Phys. 2004, 300, 143–151. [Google Scholar]
  • 46. Ménez B., Pisapia C., Andreani M., Jamme F., Vanbellingen Q. P., Brunelle A., Richard L., Dumas P., Réfrégiers M., Nature 2018, 564, 59–63. [DOI] [PubMed] [Google Scholar]
  • 47. Yoshino D., Hayatsu K., Anders E., Cosmochim. Acta 1971, 35, 927–938. [Google Scholar]
  • 48. Cleaves H. J., Aubrey A. D., Bada J. L., Origins Life Evol. Biospheres 2009, 39, 109–126. [DOI] [PubMed] [Google Scholar]
  • 49. Imai E.-i., Honda H., Hatori K., Brack A., Matsuno K., Science 1999, 283, 831–834. [DOI] [PubMed] [Google Scholar]
  • 50. Rode B. M., Peptides 1999, 20, 773–786. [DOI] [PubMed] [Google Scholar]
  • 51. Danger G., Plasson R., Pascal R., Chem. Soc. Rev. 2012, 41, 5416–5429. [DOI] [PubMed] [Google Scholar]
  • 52. Frenkel-Pinter M., Samanta M., Ashkenasy G., Leman L. J., Chem. Rev. 2020, 120, 4707–4765. [DOI] [PubMed] [Google Scholar]
  • 53. Rohlfing D. L., Science 1976, 193, 68–70. [DOI] [PubMed] [Google Scholar]
  • 54. Brack A., Chem. Biodiversity 2007, 4, 665–679. [DOI] [PubMed] [Google Scholar]
  • 55. Lambert J.-F., Origins Life Evol. Biospheres 2008, 38, 211–242. [DOI] [PubMed] [Google Scholar]
  • 56. 
  • 56a. Waddell T. G., Henderson B. S., Morris R. T., Lewis C. M., Zimmermann A. G., Origins Life 1987, 17, 149–153. [DOI] [PubMed] [Google Scholar]
  • 57. Waddell T. G., Geevarghese S. K., Henderson B. S., Pagni R. M., Newton J. S., Origins Life 1989, 9, 603–607. [Google Scholar]
  • 58. Waddell T. G., Miller T. J., Origins Life 1992, 21, 219–223. [DOI] [PubMed] [Google Scholar]
  • 59. Yadav M., Pulletikurti S., Yerabolu J. R., Krishnamurthy R., Nat. Chem. 2022, 14, 170–178. [DOI] [PubMed] [Google Scholar]
  • 60. Bromke M. A., Metabolites 2013, 3, 294–311. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Winkler M. E., Ramos-Montañez S., EcoSal 2009, 3, 10.1128/ecosalplus.3.6.1.9. [Google Scholar]
  • 62. Evans M. C., Buchanan B. B., Arnon D. I., Proc. Natl. Acad. Sci. USA 1966, 55, 928–934. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Buchanan B. B., Arnon D. I., Photosynth. Res. 1990, 24, 47–53. [PubMed] [Google Scholar]
  • 64. Hügler M., Wirsen C. O., Fuchs G., Taylor C. D., Sievert S. M., J. Bacteriol. 2005, 187, 3020–3027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Fuchs G., Annu. Rev. Microbiol. 2011, 65, 631–658. [DOI] [PubMed] [Google Scholar]
  • 66. Becerra A., Rivas M., García-Ferris C., Lazcano A., Peretó J., Int. Microbiol. 2014, 17, 91–97. [DOI] [PubMed] [Google Scholar]
  • 67. Kitadai N., Kameya M., Fujishima K., Life 2017, 7, 39. [Google Scholar]
  • 68. Chen P. Y.-T., Li B., Drennan C. L., Elliott S. J., Joule 2019, 3, 595–611. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Nunoura T., Chikaraishi Y., Izaki R., Suwa T., Sato T., Harada T., Mori K., Kato Y., Miyazaki M., Shimamura S., Yanagawa K., Shuto A., Ohkouchi N., Fujita N., Takaki Y., Atomi H., Takai K., Science 2018, 359, 559–563. [DOI] [PubMed] [Google Scholar]
  • 70. Muchowska K. B., Varma S. J., Chevallot-Beroux E., Lethuillier-Karl L., Li G., Moran J., Nat. Ecol. Evol. 2017, 1, 1716–1721. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Muchowska K. B., Varma S. J., Moran J., Nature 2019, 569, 104–107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Richter M., Nat. Prod. Rep. 2013, 30, 1324–1345. [DOI] [PubMed] [Google Scholar]
  • 73. Fischer J. D., Holliday G. L., Rahman S. A., Thornton J. M., J. Mol. Biol. 2010, 403, 803–824. [DOI] [PubMed] [Google Scholar]
  • 74. Webb M. E., Marquet A., Mendel R. R., Rébeillé F., Smith A. G., Nat. Prod. Rep. 2007, 24, 988–1008. [DOI] [PubMed] [Google Scholar]
  • 75. Richter M., Nat. Prod. Rep. 2013, 30, 1324–1345. [DOI] [PubMed] [Google Scholar]
  • 76. White H. B. III, J. Mol. Evol. 1976, 7, 101–104. [DOI] [PubMed] [Google Scholar]
  • 77. Conway T., FEMS Microbiol. Rev. 1992, 9, 1–27. [DOI] [PubMed] [Google Scholar]
  • 78. Daniel R. M., Danson M. J., J. Mol. Evol. 1995, 40, 559–563. [Google Scholar]
  • 79. Katsyv A., Schoelmerich M. C., Basen M., Müller V., FEBS Open Bio 2021, 11, 1332–1342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Beatty J. T., Gest H., J. Bacteriol. 1981, 148, 584–593. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Umbarger H. E., Annu. Rev. Biochem. 1978, 47, 533–606. [DOI] [PubMed] [Google Scholar]
  • 82. Li Y., Kitadai N., Nakamura R., Life 2018, 8, 46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83. Belmonte L., Mansy S. S., Elements 2016, 12, 413–418. [Google Scholar]
  • 84. Kitadai N., Nakamura R., Yamamoto M., Takai K., Yoshida N., Oono Y., Sci. Adv. 2019, 5, eaav7848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85. Dowler M. J., Fuller W. D., Orgel L. E., Sanchez R. A., Science 1970, 169, 1320–1321. [DOI] [PubMed] [Google Scholar]
  • 86. Friedmann N., Miller S. L., Sanchez R. A., Science 1971, 171, 1026–1027. [DOI] [PubMed] [Google Scholar]
  • 87. Cleaves H. J., Miller S. L., J. Mol. Evol. 2001, 52, 73–77. [DOI] [PubMed] [Google Scholar]
  • 88. Civis S., Juha L., Babankova D., Cvacka J., Frank O., Jehlicka J., Kralikova B., Krasa J., Kubat P., Muck A., Pfeifer M., Skala J., Ullschmied J., Chem. Phys. Lett. 2004, 386, 169–173. [Google Scholar]
  • 89. Kim H., Benner S. A., Chem. Eur. J. 2018, 24, 581–584. [DOI] [PubMed] [Google Scholar]
  • 90. Magni G., Amici A., Emanuelli M., Raffaelli N., Ruggieri S., Adv. Enzymol. Relat. Mol. Biol. 1999, 73, 135–182. [DOI] [PubMed] [Google Scholar]
  • 91. de Choudens S. O., Loiseau L., Sanakis Y., Barras F., Fontecave M., FEBS Lett. 2005, 79, 3737–3743. [DOI] [PubMed] [Google Scholar]
  • 92. Begley T., Kinsland C., Mehl R., Osterman A., Dorrestein P., Vitam. Horm. 2001, 61, 103–119. [DOI] [PubMed] [Google Scholar]
  • 93. Richts B., Rosenberg J., Commichau F. M., Front. Mol. Biosci. 2019, 6, 32. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94. Welch A. D., Perspect. Biol. Med. 1983, 27, 64–75. [DOI] [PubMed] [Google Scholar]
  • 95. Gorelova V., Bastien O., De Clerk O., Lespinats S., Rébeillé F., van Der Straeten D., Sci. Rep. 2019, 9, 5731. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96. Price M. N., Deutschbauer A. M., Arkin A. P., PLoS Genet. 2021, 17, e1009342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Tittmann K., FEBS J. 2009, 276, 2454–2468. [DOI] [PubMed] [Google Scholar]
  • 98. Bunik V. I., Tylicki A., Lukashev N. V., FEBS J. 2013, 280, 6412–6442. [DOI] [PubMed] [Google Scholar]
  • 99. Palmer L. D., Downs D. M., J. Biol. Chem. 2013, 288, 30693–30699. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.In E. coli glycine is substituted by L-tyrosine: Begley T. P., Downs D. M., Ealick S. E., McLafferty F. W., Van Loon A. P., Taylor S., Campobasso N., Chiu H. J., Kinsland C., Reddick J. J., Xi J., Arch. Microbiol. 1999, 171, 293–300. [DOI] [PubMed] [Google Scholar]
  • 101. 
  • 101a. Palmer L. D., Downs D. M., J. Biol. Chem. 2013, 288, 30693–30699; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101b. Chatterjee A., Hazra A. B., Abdelwahed S., Hilmey D. G., Begley T. P., Angew. Chem. Int. Ed. 2010, 49, 8653–8656; [DOI] [PMC free article] [PubMed] [Google Scholar]; Angew. Chem. 2010, 122, 8835–8838. [Google Scholar]
  • 102.A similar discussion of the possible timing of their arrival can be made for other coenzymes, namely biotin and the flavins.
  • 103. Dorrestein P. C., Zhai H., McLafferty F. W., Begley T. P., Chem. Biol. 2004, 11, 1373–1381. [DOI] [PubMed] [Google Scholar]
  • 104. Jacob F., Science 1977, 196, 1161–1166. [DOI] [PubMed] [Google Scholar]
  • 105. 
  • 105a. Zhang K., Bian J., Deng Y., Smith A., Nunez R. E., Li M. B., Pal U., Yu A.-M., Qiu W., Ealick S. E., Li C., Nature Microbiol. 2016, 2, 16213; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105b. Downs D., Nat. Microbiol. 2016, 2, 16252; [DOI] [PubMed] [Google Scholar]
  • 105c. Kerstholt M., Netea M. G., Joosten L. A. B., Ticks Tick Borne Dis 2020, 11, 101386. [DOI] [PubMed] [Google Scholar]
  • 106. Herlemann D. P. R., Geissinger O., Ikeda-Ohtsubo W., Kunin V., Sun H., Lapidus A., Hugenholtz P., Brune A., Appl. Environ. Microbiol. 2009, 75, 2841–2849. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107. 
  • 107a. Varma S. J., Muchowska K. B., Chatelain P., Moran J., Nat. Ecol. Evol. 2018, 2, 1019–1024; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107b. Sobotta J., Geisberger T., Moosmann C., Scheidler C. M., Eisenreich W., Wächtershäuser G., Huber C., Life 2020, 10, 35; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107c. Sousa F. L., Thiergart T., Landan G., Nelson-Sathi S., Pereira I. A. C., Allen J. F., Lane N., Martin W. F., Phil Trans R Soc B 2013, 368, 0088; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107d. Nitschke W., Russell M. J., Phil. Trans. R. Soc. B 2013, 368, 0258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108. Maden B., Biochem. J. 2000, 350, 609–629. [PMC free article] [PubMed] [Google Scholar]
  • 109. 
  • 109a. Mendel R. R., J. Biol. Chem. 2013, 288, 13165–13172; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109b. Iobbi-Nivol C., Leimkühler S., Biochim. Biophys. Acta Bioenerg. 2013, 1827, 1086–1101. [DOI] [PubMed] [Google Scholar]
  • 110. White R. H., Vitam. Horm. 2001, 61, 299–337. [DOI] [PubMed] [Google Scholar]
  • 111. Preiner M., Igarashi K., Muchowska K. B., Yu M., Varma S. J., Kleinermanns K., Nobu M. K., Kamagata Y., Tüysüz H., Moran J., Martin W. F., Nat. Ecol. Evol. 2020, 4, 534–542. [DOI] [PubMed] [Google Scholar]
  • 112. Martin W. F., Front. Microbiol. 2020, 11, 817. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113. Lazcano A., Miller S. L., J. Mol. Evol. 1999, 49, 424–431. [PubMed] [Google Scholar]
  • 114. Knaggs A. R., Nat. Prod. Rep. 2003, 20, 119–136. [DOI] [PubMed] [Google Scholar]
  • 115. 
  • 115a. White R. H., Biochemistry 2004, 43, 7618–7627; [DOI] [PubMed] [Google Scholar]
  • 115b. White R. H., Xu H., Biochemistry 2006, 45, 12366–12379; [DOI] [PubMed] [Google Scholar]
  • 115c. White R. H., Biochemistry 2008, 47, 5037–5046. [DOI] [PubMed] [Google Scholar]
  • 116. Miller D. V., Ruhlin M., Ray W. K., Xu H., White R. H., FEBS Lett. 2017, 591, 2269–2278. [DOI] [PubMed] [Google Scholar]
  • 117. Soderberg T., Archaea 2005, 1, 347–352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118. Theoretically, TPP-free access to 1-deoxy-xylulose-5-phosphate, typically formed by a ketolase from pyruvate 23 + GAP 24, is conceivable if, alternatively, 6-deoxy-5-ketofructose-1-phosphate 30 reacts with glycolaldehyde phosphate in an aldolase reaction. Lichtenthaler H. K., Annu. Rev. Plant Physiol. Plant Mol. Biol. 1999, 50, 47–65. This reaction has not yet been found. [DOI] [PubMed] [Google Scholar]
  • 119. 
  • 119a. Breslow R., J. Am. Chem. Soc. 1958, 80, 3719–3726; [Google Scholar]
  • 119b. Pareek M., Reddi Y., Sunoj R. B., Chem. Sci. 2021, 12, 7973–7992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120. 
  • 120a. Braakman R., PLoS Comp. Biol. 2012, 8, e1002455; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120b. Kikuchi G., Motokawa Y., Yoshida T., Hiraga K., Proc. Jpn. Acad. Ser. B 2008, 84, 246–263; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120c. Du Y.-L., Ryan K. S., Nat. Prod. Rep. 2019, 36, 430–457. [DOI] [PubMed] [Google Scholar]
  • 121. PLP is also capable of promoting O2-dependent oxidations, but this property may have evolved only after O2 had appeared on Earth: Hoffarth E. R., Haatveit K. C., Kuatsjah E., MacNeil G. A., Saroya S., Walsby C. J., Eltis L. D., Houk K. N., Garcia-Borràs M., Ryan K. S., Proc. Natl. Acad. Sci. USA 2021, 118, 40, e2012591118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122. Meléndez-Hevia E., Waddell T. G., Cascante M., J. Mol. Evol. 1996, 43, 293–303. [DOI] [PubMed] [Google Scholar]
  • 123. Jensen R. A., Annu. Rev. Microbiol. 1976, 30, 409–425. [DOI] [PubMed] [Google Scholar]
  • 124. 
  • 124a. Drevland R. M., Jia Y., Palmer D. R. J., Graham D. E., J. Biol. Chem. 2008, 283, 28888–28896; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124b. Howell D. M., Harich K., Xu H., White R. H., Biochemistry 1998, 37, 10108–10117. [DOI] [PubMed] [Google Scholar]
  • 125. Xu H., Zhang Y., Guo X., Ren S., Staempi A. A., Chiao J., Jiang W., Zhao G., J. Bacteriol. 2004, 186, 5400–5409. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126. Homoalanine is a known amino acid formed by transamination from α-ketobutyrate 40. Umbarger H. E., Brown B., J. Bacteriol. 1957, 73, 105. [Google Scholar]
  • 127. α-Isopropylmalate synthetase is able to catalyze the condensations of acetyl-CoA with pyruvate, α-ketobutyrate, α-ketovalerate, or α-keto-β-methylvalerate as well as α-ketoisovalerate: Kisumi M., Sugiura M., Chibata I., J. Biochem. 1976, 80, 333–339. [Google Scholar]
  • 128. 
  • 128a. Kohlhaw G., Leary T. R., Umbarger H. E., J. Biol. Chem. 1969, 244, 2218–2225; [PubMed] [Google Scholar]
  • 128b. Strassman M., Ceci L. N., Arch. Biochem. Biophys. 1967, 119, 420–428; [DOI] [PubMed] [Google Scholar]
  • 128c. Sai T., Aida K., Uemura T., J. Gen. Appl. Microbiol. 1969, 15, 345–363. [Google Scholar]
  • 129. Apostol I., Levine J., Lippincott J., Leach J., Hess E., Glascock C. B., Weickert M. J., Blackmore R., J. Biol. Chem. 1997, 272, 28980–28988. [DOI] [PubMed] [Google Scholar]
  • 130. Alvarez-Carreño C., Becerra A., Lazcano A., Origins Life Evol. Biospheres 2013, 43, 363–375. [DOI] [PubMed] [Google Scholar]
  • 131. Jukes T. H., Biochem. Biophys. Res. Commun. 1973, 53, 709–714. [DOI] [PubMed] [Google Scholar]
  • 132. Kirschning A., Nat. Prod. Rep. 2021, 38, 993–1010. [DOI] [PubMed] [Google Scholar]
  • 133. Hypotheses other than the RNA world have been proposed: “protein first”: Andras P., Andras C., Med. Hypotheses 2005, 64, 678–688; “metabolism first”: see Ref. [6]; “lipid first”: Segré D., Ben-Eli D., Deamer D. W., Lancet D., Origins Life Evol. Biospheres 2001, 31, 119–145. [Google Scholar]
  • 134. 
  • 134a. Chela-Flores J., J. Theor. Biol. 1994, 166, 163–166; [Google Scholar]
  • 134b. Diener T. O., Biol. Direct 2016, 11, 15; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 134c. Forterre P., Prangishvili D., Ann. N. Y. Acad. Sci. 2009, 1178, 65–77; [DOI] [PubMed] [Google Scholar]
  • 134d. Broecker F., Moelling K., Ann. N. Y. Acad. Sci. 2019, 1447, 53–68. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135. Krishnamurthy R., Acc. Chem. Res. 2017, 50, 455–459. [DOI] [PubMed] [Google Scholar]
  • 136. Jensen R. A., Annu. Rev. Microbiol. 1976, 30, 409–425 and references cited therein. [DOI] [PubMed] [Google Scholar]
  • 137. Horowitz N. H., Proc. Natl. Acad. Sci. USA 1945, 31, 153–157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138. King A. G. M., Origins Life 1977, 8, 39–53. [DOI] [PubMed] [Google Scholar]
  • 139. Kurup R., Kurup P. A., The Origin of Life – Abiogenesis and Symbiosis. Lap Lambert Academic Publishing; 2019, ISBN 620010007. [Google Scholar]
  • 140. Granick S., Ann. N. Y. Acad. Sci. 1957, 69, 292–308. [DOI] [PubMed] [Google Scholar]
  • 141. Fani R., Fondi M., Phys. Life Rev. 2009, 6, 23–52. [DOI] [PubMed] [Google Scholar]
  • 142. Ycas M., J. Theor. Biol. 1974, 44, 145–160. [DOI] [PubMed] [Google Scholar]
  • 143. Fondi M., Fani R., Res. Microbiol. 2009, 160, 502–512. [DOI] [PubMed] [Google Scholar]
  • 144. Mehta A. P., Abdelwahed S. H., Fenwick M. K., Hazra A. B., Taga M. E., Zhang Y., Ealick S. E., Begley T. P., J. Am. Chem. Soc. 2015, 137, 10444–10447. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145. Hazra A. B., Han A. W., Mehta A. P., Mok K. C., Osadchiy V., Begley T. P., Taga M. E., Proc. Natl. Acad. Sci. USA 2015, 112, 10792–10797. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 146. Yokoyama K., Lilla E. A., Nat. Prod. Rep. 2018, 35, 660–694. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147. Gagnon D. M., Stich T. A., Mehta A. P., Abdelwahed S. H., Begley T. P., Britt R. D., J. Am. Chem. Soc. 2018, 140, 12798–12807. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148. Delaye L., Lazcano A., Phys. Life Rev. 2005, 2, 47–64. [DOI] [PubMed] [Google Scholar]
  • 149. Muchowska K. B., Varma S. J., Moran J., Chem. Rev. 2020, 120, 7708–7744. [DOI] [PubMed] [Google Scholar]
  • 150. 
  • 150a. Nakada H. I., Weinhouse S., J. Biol. Chem. 1953, 204, 831–836; later the reductive amination of α-ketoglutarate was reported [PubMed] [Google Scholar]
  • 150b. Morowitz H. J., Paterson E., Chang S., Origins Life Evol. Biospheres 1995, 25, 395–399. [DOI] [PubMed] [Google Scholar]
  • 151. 
  • 151a. Barge L. M., Flores E., Baum M. M., VanderVelde D. G., Russell M. J., Proc. Natl. Acad. Sci. USA 2019, 116, 4828–4833; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151b. Mohammed F. S., Chen K., Mojica M., Conley M., Napoline J. W., Butch C., Pollet P., Krishnamurthy R., Liotta C. L., Synlett 2015, 28, 93–97. [Google Scholar]
  • 152. Burns B. E., Xiang Y., Kinsland C. L., McLafferty F. W., Begley T. P., J. Am. Chem. Soc. 2005, 127, 3682–3683. [DOI] [PubMed] [Google Scholar]
  • 153. Steffen-Munsberg F., Vickers C., Thontowi A., Schätzle S., Meinhardt T., Humble M. S., Land H., Berglund P., Bornscheuer U. T., Höhne M., ChemCatChem 2013, 5, 154–157. [Google Scholar]
  • 154. 
  • 154a. Morowitz H. J., Complexity 1999, 4, 39–53; [Google Scholar]
  • 154b. Caetano-Anolles G., Yafremava L. S., Gee H., Caetano-Anolles D., Kim H. S., Mittenthal J. E., Int. J. Biochem. Cell Biol. 2009, 41, 285–297. [DOI] [PubMed] [Google Scholar]
  • 155. 
  • 155a. Eschenmoser A., Chem. Biodiversity 2007, 4, 554–573; [DOI] [PubMed] [Google Scholar]
  • 155b. Benner S. A., Kim H.-J., Biondi E., Life 2019, 9, 84; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155c. Harrison S., Lane N., Nat. Commun. 2018, 9, 5176; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155d. Wołos A., Roszak R., Żądło-Dobrowolska A., Beker W., Mikulak-Klucznik B., Spólnik G., Dygas M., Szymkuć S., Grzybowski B. A., Science 2020, 369, eaaw1955. [DOI] [PubMed] [Google Scholar]
  • 156. The evolution of fatty acids and the fatty acid synthase type II is not discussed here.
  • 157. 
  • 157a. Smith E., Morowitz H. J., The Origin and Nature of Life on Earth: the Emergence of the Fourth Geosphere; 1st edn.; Cambridge University Press: New York, NY, 2016; [Google Scholar]
  • 157b. Morowitz H. J., Kostelnik J. D., Yang J., Cody G. D., Proc. Natl. Acad. Sci. USA 2000, 97, 7704–7708; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157c. Smith E., Morowitz H. J., Proc. Natl. Acad. Sci. USA 2004, 101, 13168–13173; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157d. Stubbs R. T., Yadav M., Krishnamurthy R., Springsteen G., Nat. Chem. 2020, 12, 1016–1022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158. 
  • 158a. Ralser M., Biochem. J. 2018, 475, 2577–2592; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158b. Muchowska K. B., Chevallot-Beroux E., Moran J., Bioorg. Med. Chem. 2019, 27, 2292–2297. [DOI] [PubMed] [Google Scholar]
  • 159. Arguments positioning glycine as one of the early coded amino acids during the emergence of the genetic code:
  • 159a. Bernhardt H. S., Patrick W. M., J. Mol. Evol. 2014, 78, 307–309; [DOI] [PubMed] [Google Scholar]
  • 159b. Bernhardt H. S., Tate W. P., Biol. Direct 2008, 3, 53; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159c. Tamura K., J. Mol. Evol. 2015, 81, 69–71. [DOI] [PubMed] [Google Scholar]
  • 160. Orgel L. E., PLoS Biol. 2008, 6, 5–13. [Google Scholar]
  • 161. 
  • 161a. Weiss M. C., Sousa F. L., Mrnjavac N., Neukirchen S., Roettger M., Nelson-Sathi S., Martin W. F., Nat. Microbiol. 2016, 1, 16116; [DOI] [PubMed] [Google Scholar]
  • 161b. Martin W. F., Sousa F. L., Cold Spring Harbor Perspect. Biol. 2016, 8, a01812. [Google Scholar]
  • 162. Nasir A., Romero-Severson E., Claverie J. M., Trends Microbiol. 2020, 28, 959–967. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 163. Cornish-Bowden A., Cárdenas M. L., J. Theor. Biol. 2017, 434, 68–74. [DOI] [PubMed] [Google Scholar]
  • 164. 
  • 164a. Ilardo M., Meringer M., Freeland S., Rasulev B., Cleaves H. J. II, Sci. Rep. 2015, 5, 9414; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 164b. Kubyshkin V., Budisa N., Biotechnol. J. 2017, 12, 1600097; [DOI] [PubMed] [Google Scholar]
  • 164c. Ilardo M., Bose R., Meringer M., Rasulev B., Grefenstette N., Stephenson J., Freeland S., Gillams R. J., Butch C. J., Cleaves H. J. II, Sci. Rep. 2019, 9, 12468. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 165. Makarov M., Meng J., Tretyachenko V., Srb P., Březinová A., Giacobelli V. G., Bednárová L., Vondrášek J., Dunker A. K., Hlouchová K., Protein Sci. 2021, 30, 1022–1034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 166. Yan J., Cheng J., Kurgan L., Uversky V. N., Cell. Mol. Life Sci. 2020, 77, 2423–2440. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 167. 
  • 167a. Walter K. U., Vamvaca K., Hilvert D., J. Biol. Chem. 2005, 280, 37742–37746; [DOI] [PubMed] [Google Scholar]
  • 167b. Müller M. M., Allison J. R., Hongdilokkul N., Gaillon L., Kast P., van Gunsteren W. F., Marlière P., Hilvert D., PLoS Genet. 2013, 9, e1003187. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168. Shibue R., Sasamoto T., Shimada M., Zhang B., Yamagishi A., Akanuma S., Sci. Rep. 2018, 8, 1227. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169. Akanuma S., Yokobori S., Nakajima Y., Bessho M., Yamagishi A., Evolution 2015, 69, 2954–2962. [DOI] [PubMed] [Google Scholar]
  • 170. 
  • 170a. Maurel M. C., Ninio J., Biochimie 1987, 69, 551–553; [DOI] [PubMed] [Google Scholar]
  • 170b. Shen C., Mills T., Oró J., J. Mol. Evol. 1990, 31, 175–179; [DOI] [PubMed] [Google Scholar]
  • 170c. White D. H., Erickson J. C., J. Mol. Evol. 1980, 16, 279–290; [DOI] [PubMed] [Google Scholar]
  • 170d. Shen C., Lazcano A., Oró J., J. Mol. Evol. 1990, 31, 445–452. [DOI] [PubMed] [Google Scholar]
  • 171. Liao S.-M., Du Q.-S., Meng J.-Z., Pang Z.-W., Huang R.-B., Chem. Cent. J. 2013, 7, 44. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 172. Yanai I., Wolf Y. I., Koonin E. V., Genome Biol. 2002, 3, 0024.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173. 
  • 173a. Fani R., Brilli M., Fondi M., Liò P., BMC Evol. Biol. 2007, 7 Suppl 2, S4; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173b. Fani R., Liò P., Lazcano A., J. Mol. Evol. 1995, 41, 760–774. [DOI] [PubMed] [Google Scholar]
  • 174. Hsiao C., Chou I.-C., Okafor C. D., Bowman J. C., O'Neill E. B., Athavale S. S., Petrov A. S., Hud N. V., Wartell R. M., Harvey S. C., Williams L. D., Nat. Chem. 2013, 5, 525–528. [DOI] [PubMed] [Google Scholar]
  • 175. Frenkel-Pinter M., Haynes J. W., Mohyeldin A. M., Martin C., Sargon A. B., Petrov A. S., Krishnamurthy R., Hud N. V., Williams L. D., Leman L. J., Nat. Commun. 2020, 11, 3137. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 176. 
  • 176a. Crick F. H., J. Mol. Biol. 1968, 38, 367–379; [DOI] [PubMed] [Google Scholar]
  • 176b. Orgel L. E., J. Mol. Biol. 1968, 38, 381–393; [DOI] [PubMed] [Google Scholar]
  • 176c. de Pouplana L. R., Torres A. G., Rafels-Ybern À., Life 2017, 7, 14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 177. Gospodinov A., Kunnev D., Life 2020, 10, 81. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 178. 
  • 178a. Scossa F., Fernie A. R., Comput. Struct. Biotechnol. J. 2020, 18, 482–500; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 178b. Becerra A., J. Mol. Evol. 2021, 89, 183–188. [DOI] [PubMed] [Google Scholar]
  • 179. De Duve C., ‘Blueprint for a Cell: The Nature and Origin of Life’, Patterson, Burlington, N. C., 1991. [Google Scholar]
  • 180. Pascal R., Boiteau L., Commeyras A., Top. Curr. Chem. 2005, 259, 69. [Google Scholar]
  • 181. Weber A. L., Origins Life Evol. Biospheres 2001, 31, 71. [DOI] [PubMed] [Google Scholar]
  • 182. 
  • 182a. Barbieri M., BioSystems 2019, 185, 104024; [DOI] [PubMed] [Google Scholar]
  • 182b. Kubyshkin V., Acevedo-Rocha C. G., Budisa N., BioSystems 2018, 164, 16–25; [DOI] [PubMed] [Google Scholar]
  • 182c. Seligmann H., Warthi G., Comput. Struct. Biotechnol. J. 2017, 15, 412–424. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 183. Jiménez-Sánchez A., J. Mol. Evol. 1995, 41, 712–716. [DOI] [PubMed] [Google Scholar]
  • 184. Hyperthermophilic archaea such as P. fumarii exhibit higher GC content, particularly at the third nucleotide position within the codon, likely due to selection pressure for more hydrogen bonding to increase DNA thermal stability.
  • 185. 
  • 185a. Francis B. R., J. Mol. Evol. 2013, 77, 134–158; [DOI] [PubMed] [Google Scholar]
  • 185b. Eigen M., Winkler-Oswatitsch R., Naturwissenschaften 1981, 68, 217–228; [DOI] [PubMed] [Google Scholar]
  • 185c. Ikehara K., Omori Y., Arai R., Hirose A., J. Mol. Evol. 2002, 54, 530–538. [DOI] [PubMed] [Google Scholar]
  • 186. 
  • 186a. Hartman H., Smith T., Life 2014, 4, 227–249; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 186b. Smith T., Hartman H., FEBS Lett. 2015, 589, 3499–3507. [DOI] [PubMed] [Google Scholar]
  • 187. 
  • 187a. Kubyshkin V., Budisa N., Int. J. Mol. Sci. 2019, 20, 5507; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 187b. Kubyshkin V., Budisa N., Curr. Opin. Biotechnol. 2019, 60, 242–249. [DOI] [PubMed] [Google Scholar]
  • 188. Grosjean H., Westhof E., Nucl. Acids Res. 2016, 44, 8020–8040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 189. Shepherd J. C. W., Trends Biochem. Sci. 1984, 9, 8–10. [Google Scholar]
  • 190. José M. V., Zamudio G. S., Palacios-Pérez M., Bobadilla J. R., Farías S. T., Origins Life Evol. Biospheres 2015, 45, 77–83. [DOI] [PubMed] [Google Scholar]
  • 191. Jukes T. H., J. Mol. Evol. 1996, 42, 377–381. [DOI] [PubMed] [Google Scholar]
  • 192. 
  • 192a. Wong J. T.-F., BioEssays 2005, 27, 416–425; [DOI] [PubMed] [Google Scholar]
  • 192b. Philip G. K., Freeland S. J., Astrobiology 2011, 11, 235–240. [DOI] [PubMed] [Google Scholar]
  • 193. Wong J. T.-F., Bronskill P. M., J. Mol. Evol. 1979, 13, 115–125. [DOI] [PubMed] [Google Scholar]
  • 194. Weber A. L., Miller S. L., J. Mol. Evol. 1981, 17, 273–284. [DOI] [PubMed] [Google Scholar]
  • 195. Lu Y., Freeland S. J., J. Theor. Biol. 2008, 250, 349–361. [DOI] [PubMed] [Google Scholar]
  • 196. 
  • 196a. Ibba M., Söll D., Annu. Rev. Biochem. 2000, 69, 617–650; [DOI] [PubMed] [Google Scholar]
  • 196b. Sheppard K., Yuan J., Hohn M. J., Jester B., Devine K. M., Söll D., Nucl. Acids Res. 2008, 36, 1813–1825. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 197. Di Giulio M., J. Mol. Evol. 2002, 55, 616–622. [DOI] [PubMed] [Google Scholar]
  • 198. Moser J., Lorenz S., Hubschwerlen C., Rompf A., Jahn D., J. Biol. Chem. 1999, 274, 30679–30685. [DOI] [PubMed] [Google Scholar]
  • 199. Moser J., Schubert W.-D., Beier V., Bringemeier I., Jahn D., Heinz D. W., EMBO J. 2001, 20, 6583–6590. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 200. Sauerwald A., Zhu W., Major T. A., Roy H., Palioura S., Jahn D., Whitman W. B., Yates J. R. III, Ibba M., Söll D., Science 2005, 307, 1969–1972. [DOI] [PubMed] [Google Scholar]
  • 201. Long X., Xue H., Wong J. T.-F., Evol. Bioinformatics 2020, 16, 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 202. 
  • 202a. Muth W. L., Costilow R. N., J. Biol. Chem. 1974, 249, 7463–7467; [PubMed] [Google Scholar]
  • 202b. Raupner M., White R. H., J. Bacteriol. 2001, 183, 5203–5205. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 203. For recent publications on the coevolution of RNA and peptides as well as on RNA self-replication, see:
  • 203a. Müller F., Escobar L., Xu F., Węgrzyn E., Nainytė M., Amatov T., Chan C.-Y., Pichler A., Carell T., Nature 2022, 605, 279–284; [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 203b. Mizuuchi R., Furubayashi T., Ichihashi N., Nat. Commun. 2022, 13, 1460. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 204. Huber H., Burggraf S., Mayer T., Wyschkony I., Rachel R., Stetter K. O., Int. J. Syst. Evol. Microbiol. 2000, 50, 2093–2100. [DOI] [PubMed] [Google Scholar]

Associated Data


Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.


Articles from Chemistry (Weinheim an Der Bergstrasse, Germany) are provided here courtesy of Wiley
