Philosophical Transactions of the Royal Society B: Biological Sciences. 2023 Jan 11;378(1871):20220027. doi: 10.1098/rstb.2022.0027

Rethinking nucleic acids from their origins to their applications

Steven A Benner
PMCID: PMC9835595  PMID: 36633284

Abstract

Reviewed are three decades of synthetic biology research in our laboratory that has generated alternatives to standard DNA and RNA as possible informational systems to support Darwinian evolution, and therefore life, and to understand their natural history, on Earth and throughout the cosmos. From this, we have learned that:

  • the core structure of nucleic acids appears to be a natural outcome of non-biological chemical processes probably in constrained, intermittently irrigated, sub-aerial aquifers on the surfaces of rocky planets like Earth and/or Mars approximately 4.36 ± 0.05 billion years ago;

  • however, this core is not unique. Synthetic biology has generated many different molecular systems able to support the evolution of molecular information;

  • these alternatives to standard DNA and RNA support biotechnology, including DNA synthesis, human diagnostics, biomedical research and medicine;

  • in particular, they support laboratory in vitro evolution (LIVE) with performance to generate catalysts at least 10⁴–10⁵ fold better than standard DNA libraries, enhancing access to receptors and catalysts on demand. Coupling nanostructures to the products of LIVE with expanded DNA offers new approaches for disease therapy; and

  • nevertheless, a polyelectrolyte structure and size regular building blocks are required for any informational polymer to support Darwinian evolution. These features serve as universal and agnostic biosignatures, useful for seeking life throughout the Solar System.

This article is part of the theme issue ‘Reactivity and mechanism in chemical and synthetic biology’.

Keywords: nucleic acids, origin of life, laboratory in vitro evolution, search for life in the cosmos

1. Introduction

The double helix is rapidly approaching its 70th birthday [1]. Familiar even to middle school students, the image is used in jewellery, book covers and courtyard sculptures. It is part of … well … ‘our DNA’.

That familiarity may be sufficient excuse for many having overlooked its chemical perplexities for these seven decades. Today, however, those perplexities are the focus of my lecture and of this manuscript.

For example, in canonical duplexes of DNA or RNA (collectively NA), a polyanion binds to another polyanion. This is associated with coulombic repulsion between the backbones of the two bound strands. That repulsion might be expected to destabilize the double helix.

Indeed, in the 1980s, it was expected. That led to the design of uncharged forms of DNA. The reasoning was that if one simply were to synthesize an NA analogue in which the negatively charged phosphate linking groups were replaced by linkers without a charge, that analogue would continue to bind complementary strands of DNA via Watson-Crick rules (big purines pair with small pyrimidines, hydrogen bond donors pair with hydrogen bond acceptors, A pairs with T and G pairs with C). Further, since the two strands would no longer suffer coulombic repulsion, binding should be tighter. Uncharged analogues might even pass passively through cell membranes into the cell, where they would bind to intracellular RNA and prevent that RNA from directing translation.

As another perplexity, information critical to our survival, reproduction and evolution is, according to the Watson-Crick model, transmitted by hydrogen bonding in water. However, hydrogen bonding in water seems foolish as a molecular recognition principle, as water itself offers multiple opportunities for competing hydrogen bonds that might disrupt recognition. Further, heterocycles (including purines and pyrimidines) are notorious for adopting tautomeric forms that move hydrogen bonding groups around [2].

Following this logic, Eric Kool [3], Floyd Romesberg [4–6], Ichiro Hirao [7,8] and others sought to replace inter-nucleobase hydrogen bonding by hydrophobic interactions driven by steric complementarity. They are presenting some of their work in this session.

As yet another perplexity, the single strands are quite flexible. Standard theories of molecular recognition hold that rigid molecules are best able to give specificity and tight binding. Indeed, Leumann [9], Wengel [10], Imanishi [11] and others attempted to rigidify those strands by replacing the ribose by various bicyclic and ‘locked’ analogues. Here, they expected Watson-Crick pairing rules to be retained, but with enhanced stability of the duplex to reflect less loss of conformational entropy once it forms.

My laboratory is now in its third decade [12] of work seeking to understand these and other features of this fascinating and important molecular system. Here, we started by using organic synthesis to answer ‘What if?’ and ‘Why not?’ questions. As a result of this work, we have come to understandings of these peculiarities and much more. As it turns out, these peculiarities are all important for the functioning of this molecular system as the informational part of an evolvable system.

For example, reducing flexibility of the single-strands by making DNA entirely out of locked sugar analogues does indeed improve affinity. However, such DNA analogues hybridize only slowly to their complementary strands [13]. Evidently, flexibility is important in the dynamic behaviour required to go from two single strands to one duplex.

Further, hydrogen bonds are critical to achieving directionality in a nucleobase pair. In particular, they prevent hydrophobic nucleobases from slipping on top of each other to form an intercalated structure [14]. Polymerases also prevent this, but such intercalation disrupts the aperiodic crystal structure that gives strand-strand pairing its fidelity. This means that the sequence space which can be explored with nucleobase pairing without hydrogen bonding is quite limited.

These are all topics for another session. Today, we will follow a different direction that synthetic biology has provided. In particular, synthesis has given us good ideas for how the structural features of NAs are connected to their natural histories, how they originated on Earth (and possibly Mars), and how they might be productively changed to support biotechnology, commerce and medicine.

2. Synthesis and the features of nucleic acids that are universal

Theory holds that Darwinian evolution is the only mechanism by which matter can organize itself to give properties that we value in life. These properties can be seen as ‘information’, but information that is manifested in the substance of an evolvable molecular system. This is a special kind of molecular information.

To support Darwinian evolution, the information in a molecular system must be replicable with a high degree of fidelity, but not absolute fidelity. Further, when the information content of a genetic molecule changes through mistakes made during its replication, those mistakes must themselves be replicable. This is required to ensure the propagation of mistakes that improve ‘fitness’. Here, ‘fitness’ and ‘function’ are related features that make a host organism better able to survive, select a mate and reproduce.

The familiarity of DNA and RNA makes most laypeople, and many molecular biologists, think that such behaviours are easy to come by. In fact, they are not. In general, changing the structure within most molecular systems changes dramatically their molecular behaviour, their chemical reactivity, and their physical properties; or, using a biological word, their molecular ‘phenotypes’.

This ‘rugged’ phenotype-sequence landscape can be quite useful when evolving proteins. Here, evolution is facilitated if a very small number of amino acid replacements survey very large amounts of phenotypic space. This means that one can convert an enzyme that, for example, catalyses the addition of water to fumarate to give malate, to an enzyme that catalyses the addition of ammonia to fumarate to give aspartate, with relatively few amino acid replacements. This allows the evolution of protein phenotype to be speedy.

However, a rugged phenotype-sequence landscape is not desirable for an informational molecule, especially if the phenotype is delivered by a molecular system that is translated from the information it contains. Here, changing information content must not disrupt the molecular processes by which replication occurs.

Some of the peculiarities of DNA and RNA are in fact necessities if this behaviour in an informational molecule is to be achieved.

3. The aperiodic crystal structure

A decade before Watson and Crick proposed the double helical structure for DNA, Erwin Schrödinger explained why an informational genetic molecule must have certain general structural features [15]. The logic is complex, but worth the effort to understand. After you do, you can count yourself among perhaps a dozen souls living on this planet today who understand what Schrödinger was saying, and why it must influence studies on the nature, origin, and distribution of life in the cosmos.

First, Schrödinger knew that simple binding could not possibly guarantee adequate fidelity in the transfer of thousands and thousands of bits of information needed for life. That is, in cartoon form, because the sigmoidal binding curve as a function of concentrations is broad; this means that over broad concentration ranges, the molecular system is partly bound.

Schrödinger recognized that biology needed concentration-sharp binding. On or off, not half on or half off. For that, Schrödinger recognized that the information in living systems must be in molecules that have access to the accuracy that comes from physics of phase transitions.
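The breadth of a simple binding curve can be put in numbers. The following is a minimal sketch, assuming one-site binding described by a dissociation constant `kd` (a hypothetical value) and using the Hill equation with exponent `hill_n` as a stand-in for increasingly cooperative, transition-like behaviour. These function names and the choice of the Hill form are illustrative assumptions, not part of Schrödinger's argument.

```python
def fraction_bound(conc, kd=1.0, hill_n=1.0):
    """Fraction of sites occupied at concentration `conc` (Hill form).

    hill_n = 1 is simple one-site binding; larger values mimic a
    progressively sharper, more cooperative transition.
    """
    x = (conc / kd) ** hill_n
    return x / (1.0 + x)


def fold_range_10_to_90(hill_n=1.0):
    """Fold change in concentration needed to go from 10% to 90% bound.

    Solving x / (1 + x) = f gives x = f / (1 - f), so the two
    concentrations differ by a factor of (9 / (1/9)) ** (1/n) = 81 ** (1/n).
    """
    return 81.0 ** (1.0 / hill_n)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        width = fold_range_10_to_90(n)
        print(f"Hill exponent {n:>3}: 10% -> 90% bound over a {width:.2f}-fold range")
```

For simple binding, the system is 'partly bound' across an 81-fold range of concentration. Only as the exponent grows large does the curve approach the on-or-off behaviour associated with a phase transition.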

Work with me here, as this is important. Phase transitions are events that occur when matter changes … well … its phase. Most important for Schrödinger's proposal, phase transitions are exemplified by the sharp melting of crystals. Melted or not. A pure crystal does not have a range of temperatures where it is half melted.

Thus, when water freezes, disordered molecules of the liquid arrange themselves into a specific lattice; this is a new phase. That lattice is strongly ‘self purifying’; it excludes non-water molecules because they do not fit into the three-dimensional lattice. For example, dissolved salts are concentrated in a residual brine as ocean water crystallizes.

Crystallization is well known in organic molecules as well. Thus, your grade in your laboratory organic chemistry course (if you took such a course) was probably influenced by how high and sharp the melting point was of the crystal of caffeine that you isolated from tea leaves. Here again, the crystal should exclude impurities that do not fit into the lattice, if you did the crystallization correctly.

Schrödinger asked how a system of informational molecules might exploit the fidelity of phase transition to support Darwinian evolution.

Again, some background. Crystals contain defects, even after crystallization has excluded most of them. Those defects can be said to hold ‘information’. Indeed, the defects in a milligram cubic crystal of sodium chloride require far more information to enumerate than the information in a human genome. Nevertheless, because of the physics of phase transitions, those defects are few.

Further, crystals can be powdered and used to seed the growth of other crystals. Thus, the crystal can be said to be ‘replicable’. Further, the ‘children’ crystals also hold information in the placement of their defects. Again, because of the physics of phase transitions, those defects are few, but more than enough to support biology.

However, the information in the child crystals has no relation to the information in the parent crystal. Further, the information in the child crystals cannot be transferred to their descendants.

Thus, the system cannot evolve. It cannot improve its information content to capture more resources from the environment, compete with competitors, and evolve the properties that we value in life.

Schrödinger, therefore, turned to a different concept, a crystal built from more than one component. The arrangement of different components in the crystal, he proposed, would hold the biological information. That arrangement could therefore not be regular; it must be, in Schrödinger's word, ‘aperiodic’. The information was held by the order, or sequence, of the building blocks in space.

Schrödinger understood that this molecular system was not easily a three-dimensional crystal. It could be a two-dimensional crystal, later advocated by Cairns-Smith [16]. Also, it could be a ‘one-dimensional crystal’, later the double helical structure proposed by Watson and Crick.

However, Schrödinger recognized that the building blocks of an aperiodic crystal could not have just any structure. To allow replication to enjoy the fidelity of phase transitions, the exchangeable informational building blocks must all have the same size and shape. They must all fit into the lattice of an aperiodic crystal.

Schrödinger did not actually know anything about the structure of DNA. However, his theory did not need to know the structure of DNA. It was general. In linear form, biological information storage requires a polymer built from a defined number of building blocks all having the same size and shape. The information is stored in their sequence.

Watson and Crick also did not know much about the structure of DNA [17]. They were unfamiliar with Erwin Chargaff's studies of DNA that indicated A:T and G:C pairing. Chargaff himself, visiting Watson and Crick in Cambridge, recorded his annoyance at their ignorance of his work, referring to them as ‘two pitch men in search of a helix’.

Watson and Crick also did not know the correct tautomeric structure of guanine. Jerry Donohue, who was visiting Cambridge University from the laboratory of Linus Pauling at Cal Tech, had to set them straight.

However, Crick, being a physicist, knew about Schrödinger, and so when he and Watson drew out the structures of the four base pairs (A:T, T:A, G:C and C:G) once they had the correct tautomeric structure for guanine, their first reaction was to note their similar sizes and shapes. They remarked immediately that their double helix fitted Schrödinger's aperiodic crystal structure theory. This gave them confidence in their proposed structure, as much as its correspondence with crystallographic data taken from Rosalind Franklin.

It is important to note that G and A are big purines, while T and C are small pyrimidines. The Watson:Crick pair exploits size complementarity, where big purines pair with small pyrimidines. This gives the informational units, the A:T, T:A, G:C and C:G pairs, all the same size. A mis-pair disrupts Schrödinger's size/shape uniformity, disrupts the linear aperiodic crystal and is excluded with a fidelity that exceeds that possible from simple binding energies.
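The two pairing rules can be written down compactly. The following is a minimal sketch that labels each of the 16 possible pairings of the four standard bases by the rule, if any, that it violates. The function name and category labels are illustrative assumptions, and the sketch says nothing about energetics.

```python
from itertools import product

PURINES = {"A", "G"}        # the "big" two-ring bases
PYRIMIDINES = {"T", "C"}    # the "small" one-ring bases
# Hydrogen-bond complements in standard DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def classify_pair(x, y):
    """Label a candidate pair by which Watson-Crick rule, if any, it breaks."""
    if (x in PURINES) == (y in PURINES):
        # big:big or small:small, so the pair does not have the standard width
        return "size mismatch"
    if COMPLEMENT[x] != y:
        # big:small, but donor and acceptor patterns do not match
        return "hydrogen-bond mismatch"
    return "Watson-Crick"


if __name__ == "__main__":
    for x, y in product("ATGC", repeat=2):
        print(f"{x}:{y}  {classify_pair(x, y)}")
```

Of the 16 combinations, only the four Watson-Crick pairs satisfy both rules. Eight fail on size alone, which is the class of error that Schrödinger's size/shape uniformity excludes.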

4. Synthesis generalizes the aperiodic crystal structure

Using synthesis, we showed that Schrödinger's aperiodic crystal structure concept can be generalized. For example, Shuichi Hoshika led a team in my laboratory which showed that a different aperiodic crystal structure can have six pairs as its informational units (the T:K, S:V and C:Z pairs, and the K:T, V:S and Z:C pairs; figure 1, top) [18]. Each of these is formed by pairing a small pyrimidine analogue forming three hydrogen bonds with a small complement. These pairs are ‘skinny’.

Figure 1.

As long as the backbone has a repeating charge and hydrogen bonding is complementary, many pairing geometries support information-specific duplex formation. Shown here are skinny (top) and fat (bottom) pairs. Crystal structures of these were determined by Millie Georgiadis.

These lack the size complementarity of the Watson-Crick pair, but they retain its hydrogen bonding complementarity. Millie Georgiadis, who has been our crystallographic collaborator throughout this work, and who will be speaking in this session, showed that these formed duplexes with a different aperiodic crystal structure. However, the duplexes are just as stable as Watson-Crick duplexes, as measured by melting temperatures. Indeed, to make Watson-Crick big-small size-complementary pairs competitive, we had to complete the A:T pair by adding the amino group missing from adenine. Only with the diaminopurine:T pair (joined by three hydrogen bonds) replacing the A:T pair (joined by two), did natural DNA compete with skinny DNA.

A different aperiodic crystal structure is formed when purines pair with purines via complementary hydrogen bonds to give ‘fat’ pairs (figure 1, bottom). Millie's crystallography showed that these fit a different aperiodic crystal structure. The duplexes held together by fat pairs are more stable than those with Watson-Crick pairs. Indeed, if one wants a very stable aperiodic crystal in water, fat pairs are the way to go.

Thus, Schrödinger's aperiodic crystal structure concept can be generalized. However, we could take our analysis of this concept one step further. Having three different synthetic versions of a general double helix structure with different shapes and sizes allowed us to examine Schrödinger's central premise: that disruption of size/shape regularity destroys the phase transition. Experiments show that when the size regularity is broken by mixing these, one does not get an average stability. Rather, breaking the size/shape regularity destabilizes all structures. Mixing skinny, Watson-Crick and fat pairs into one structure destroys molecular recognition fidelity.

5. The polyelectrolyte backbone is also universal

Schrödinger's universal is based on structure, the consistency in the sizes and shapes of building blocks of informational polymers needed to ensure largely faithful replication. A quarter of a century ago [19,20], we pointed out that for Darwinian evolution to succeed, the physico-chemical properties of the informational biopolymer must also remain constant as their information content changed. Seeking to ensure that the National Aeronautics and Space Administration (NASA) community would incorporate this into their activities to search for alien life, we termed this property ‘Capable of Suffering Mutation Independent of Concern over Loss of Properties Essential for Replication’. This cumbersome phrase has a convenient acronym: COSMIC-LOPER.

What are those ‘properties essential for replication’? Let us take just one: solubility. As many have noted, dissolution is a key for metabolism, information transfer and (again) evolution. If molecules are not dissolved in something, they cannot move to encounter other things, including the ‘things’ that are needed for their replication; and while not often explicitly stated, NASA's decades old mantra to ‘follow the water’ reflects a need for life to have a liquid matrix that appears to be best suited for life, at least as we conceive it.

In many polymer systems, solubility changes dramatically with small changes in molecular structure. The archetypal examples, explored by the aforementioned Linus Pauling, are proteins. For example, sickle cell anaemia is caused by the replacement of a single amino acid by another in the protein haemoglobin. This, in turn, causes the haemoglobin to start to precipitate. Indeed, if one moves amino acids around in the sequence of a soluble protein without changing the overall number of amino acids of each type, one generally converts that soluble protein into an isomer that does not dissolve in water.

Thus, as a general statement, all proteins are metastable with respect to their precipitated states. This was exemplified by the proteins in your egg in your skillet this morning. Precipitation disrupts reproduction. One simply cannot have one's informational biopolymer precipitate should it acquire a mutation that confers improved fitness. This means that proteins would not be useful as an informational polymeric system suitable for supporting evolution.

What molecular features in a biopolymer might allow the free interchange of informational units without risking precipitation, or the loss of other properties essential for replication in water? We return to the electronic structure of molecules, where both theory and experiment note that the most important feature that determines the properties of a molecule is its charge.

If a molecule lacks a charge (a monopole), then the most important feature that determines its properties is its dipole. Absent a dipole, the most important feature is its quadrupole, and so on.

As an illustrative example that organic chemists will appreciate, the benzene molecule does not have a charge, nor does it have a dipole. However, it does have a quadrupole, and this determines much of its behaviour. This includes the temperature at which it crystallizes, what form its crystals take (no, the benzene rings do not stack), and how it interacts (the ‘pi anion’) with other molecules.

The backbones of proteins also do not have a charge. However, they do have a (large) dipole moment. This determines how they interact, including with themselves and with other proteins. The alpha helix and beta strand, also proposed by Linus Pauling and his team, are formed by dipole-dipole interactions. Further, the most prominent physical property of proteins, their propensity to precipitate, comes via inter-protein interaction between the repeating dipoles in the backbone of one protein and those in the other.

However, a charge dominates them all, and DNA, RNA, their fat and skinny analogues described above, indeed every rule-based molecular recognition system that has emerged from synthesis over the past 30 years, has a repeating backbone charge.

How does this allow the polyelectrolyte to support evolution? Because nothing much happens to physical properties when you swap size-shape similar building blocks in a polymer with a repeating backbone charge. The derived molecule still dissolves in water. If you replace another building block, or another dozen building blocks, or shuffle the building blocks, the derived molecule still dissolves in water. It still forms Watson-Crick pairs. It still carries information following rules. It still interacts with enzymes needed to transmit that information.
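A rough numerical illustration of this point is given below. It is only a sketch under loudly stated assumptions: one −1 charge per phosphodiester linkage at neutral pH with no terminal phosphates, and, for peptides, a crude side-chain count in which K and R are +1, D and E are −1, and histidine and the chain termini are ignored. Charge is used here only as a simple proxy; the sketch does not model solubility itself, and both function names are hypothetical.

```python
import random


def dna_backbone_charge(seq):
    """Approximate backbone charge of a DNA strand at neutral pH.

    Assumes one -1 charge per phosphodiester linkage and no terminal
    phosphates, so the result depends only on length, never on sequence.
    """
    return -(len(seq) - 1) if seq else 0


def peptide_side_chain_charge(seq):
    """Crude net side-chain charge of a peptide at neutral pH.

    Assumes K, R = +1 and D, E = -1; histidine and the termini are ignored.
    The peptide backbone itself is treated as uncharged.
    """
    return sum((aa in "KR") - (aa in "DE") for aa in seq)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    amino_acids = "ACDEFGHIKLMNPQRSTVWY"
    dna = ["".join(rng.choice("ACGT") for _ in range(20)) for _ in range(5)]
    peptides = ["".join(rng.choice(amino_acids) for _ in range(20)) for _ in range(5)]
    for s in dna:
        print(f"DNA     {s}  backbone charge {dna_backbone_charge(s):+d}")
    for p in peptides:
        print(f"peptide {p}  side-chain charge {peptide_side_chain_charge(p):+d}")
```

Every 20-mer of DNA carries the same backbone charge whatever its sequence, whereas the charge of a peptide moves with each replacement. This is the sense in which the repeating charge lets the information change while the dominant physical property stays put.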

Interestingly, RNA provides an ‘exception that proves the rule’. If the sequence of an RNA system happens to evolve into a guanine-rich region of sequence space, the rule breaks down. Over in this corner of the landscape, even RNA with its repeating backbone charge starts to behave like a real molecule, with physical properties sensitive to its actual chemical structure. Poly-G RNA aggregates; even the repeating polyanion backbone cannot stop it.

One of the outcomes of our synthetic biology with DNA analogues was an understanding of the extent to which the repeating backbone charges determine rule-based molecular recognition. Yes, the coulombic repulsion between strands does destabilize the association of two strands in DNA, RNA, their fat and skinny analogues, and every other rule-based molecular recognition system that has emerged from synthesis over the past 30 years. However, the backbone–backbone coulombic repulsion forces strand–strand interactions to the edges of the interacting heterocycles. These are where hydrogen bonds form in the strand–strand interactions between the linear systems. Thus, the repeating backbone charge is what allows these systems to display rule-based molecular recognition, Watson-Crick, fat and skinny.

This was poorly appreciated in the 1980s. Indeed, in that decade, an entire industry was based on the assumption that removing the repeating backbone charges from DNA or RNA would leave a molecule still able to do Watson-Crickery. This led to a profusion of non-ionic backbones that were hoped to generate ‘antisense drugs’.

Paul Miller and Paul Tso, for example, replaced the negative charge on the phosphate diester with a methyl group to give a methylphosphonate (figure 2) [21]. Henk Buck and his team attempted to replace the linking phosphates by non-ionic linking tri-esters [22]. Peter Nielsen and Michael Egholm reported an uncharged peptide-like backbone with a repeating dipole, which they called PNA [23]. James Summerton created uncharged morpholino backbone nucleic acid analogues [24].

Figure 2.

In the 1980s, the repeating backbone charge was thought to be dispensable without loss of the ability of a strand to bind to a complementary strand following Watson-Crick rules. Accordingly, these and many other linking groups that lacked a charge were proposed by many groups to replace the phosphate linking groups (top left). Uncounted millions of dollars were spent in pursuit of this (now known to be incorrect) model. In our laboratory, we chose to synthesize the dimethylene sulfone non-stereogenic linkers. However, we now know from synthetic biology that the repeating backbone charge is essential to guide Watson-Crick pairing, and necessary for an informational biopolymer to support the evolution of a living system.

My laboratory also became excited by the possibility of charge neutral analogues of DNA able to cross cell membranes to bind with Watson-Crick specificity to disease causing nucleic acids inside cells. We generated a series of DNA analogues where the anionic phosphate linking groups and the backbone were replaced by dimethylenesulfone units (figure 2) [25–29]. These were not charged but still had a strong dipole.

What happened? We discovered that replacing the repeating monopole by a repeating dipole in these dimethylenesulfone DNA and RNA analogues created molecules that behave like proteins: they folded. Further, detailed studies showed that, above a certain length, they had lost the ability to do rule-based molecular recognition. The same was seen with PNA, morpholine-linked analogues and, indeed, all other uncharged analogues once they became longer than a certain length characteristic of the backbone.

This led to the Polyelectrolyte Theory of the Gene. This theory posits that all genetic informational molecules in water, regardless of their origin and detailed structure, must have a backbone with a repeating charge. That charge can be negative, as in DNA or RNA (figure 3). That charge can be positive. However, it must be repeating, so that the charge dominates the physical properties of the molecular system to the point that a molecule can change its information content without changing its properties (much).

Figure 3.

The Polyelectrolyte Theory of the Gene holds that all life in water must have an informational polymer with a repeating charge, either negative or positive. In both, the polyelectrolyte is a ‘handle’ that lets an agnostic life finder (ALF) concentrate scarce genetic molecules from bulk water, agnostic to the alphabet used to write biological information. However, ‘letters’ in the alphabet must have similar size/shape, to fit an aperiodic crystal [30]. Both structural universals are essential for Darwinian evolution. However, beyond these, little else is universal [31].

6. Combining a polyelectrolyte backbone with size/shape building blocks creates a universal biosignature

Twenty years ago, we pointed out that Schrödinger's aperiodic crystal structure could be combined with the Polyelectrolyte Theory of the Gene to create an agnostic biosignature, a molecule that must be present in any life form living in water for Darwinian evolution to be informationally supported. The idea was simple: all life will have such polyelectrolytes [15].

Further, since such systems cannot be sustained in water in the absence of Darwinian evolution, all non-life will lack them. They are, we proposed, a chemical proxy for the Darwinian evolution that, we premised, all life will have and all non-life will lack.

These propositions concerning agnostic biosignatures were not obtained by simple generalization from the one example of life that natural history has delivered for us to inspect on Earth. Rather, this proposition for universal agnostic biosignatures, molecular proxies for Darwinian evolution universally, came from hundreds of examples that human synthesis delivered for us to inspect. Given this, the molecular structures of Terran DNA and RNA are examples of these general rules, rather than sources of them.

One of these two universal molecular features provides an unexpected gift to those searching for alien life. The polyelectrolyte structure is a handle by which sparse alien genetic biopolymers might be concentrated from ultra-dilute solution in water where they might be living.

Why? Because polyelectrolytes move easily in electric fields. The electrophoresis used in modern molecular biology laboratories is one example. Electrodialysis, which is widely used in desalination, food processing and other technologies, is another. With the appropriate filters and membranes, electrodialysis can concentrate polyelectrolytes from large, ultra-dilute samples of Martian water, segregating them from salts and delivering them for analysis.

This sets the stage for a universal agnostic life finder (ALF) [32]. ALFs do not work by detecting polyelectrolytes. Rather, ALFs exploit the universal polyelectrolyte feature of informational polymers to concentrate Darwinian polymers. After concentration, detection can be done by pretty much any method.

Best, the detection step should examine the concentrated polyelectrolyte polymers to see if they are composed of Schrödinger's size/shape regular building blocks. Mass spectrometry with fragmentation is one way to do this. If we find a polyelectrolyte with size/shape regular building blocks in a sample of alien water, we have found a molecule necessary for Darwinian evolution, and a molecular system that is sustained only by Darwinian evolution. To the extent that life is a ‘self-sustaining chemical system capable of Darwinian evolution’, we have found ‘life’.
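One way such a check might look in software is sketched below. It is an illustration only: it assumes that fragmentation yields a 'ladder' of fragment masses, each one building block heavier than the last, and it uses mass as a crude proxy for size and shape. The helper names, the 15% tolerance and the irregular example ladder are all hypothetical. The DNA-like step masses are the approximate nucleotide residue masses commonly used in oligonucleotide molecular weight calculations.

```python
from itertools import accumulate
from statistics import mean


def ladder_from_steps(steps):
    """Build a fragment-mass ladder (starting at 0) from building-block masses."""
    return [0.0] + list(accumulate(steps))


def building_block_masses(ladder):
    """Mass differences between consecutive fragments in a sorted ladder."""
    ordered = sorted(ladder)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def is_size_regular(ladder, tolerance=0.15):
    """True if every building-block mass lies within `tolerance` of the mean.

    `tolerance` is a hypothetical threshold chosen for illustration only.
    """
    steps = building_block_masses(ladder)
    if len(steps) < 2:
        return False  # too few building blocks to judge regularity
    centre = mean(steps)
    return all(abs(s - centre) / centre <= tolerance for s in steps)


if __name__ == "__main__":
    # Approximate residue masses (Da) for G, A, T, T, A, C, A in a DNA strand.
    dna_like = ladder_from_steps([329.2, 313.2, 304.2, 304.2, 313.2, 289.2, 313.2])
    # Made-up ladder whose building blocks differ widely in mass.
    irregular = ladder_from_steps([57.0, 186.0, 71.0, 156.0])
    print("DNA-like ladder regular: ", is_size_regular(dna_like))
    print("irregular ladder regular:", is_size_regular(irregular))
```

A real instrument pipeline would also need to handle noise, charge states and incomplete ladders. The sketch only shows that 'size/shape regular' can be turned into a concrete, alphabet-agnostic test on the concentrated polymers.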

7. The urgency of the search for alien life

So how well did the acronym COSMIC-LOPER do in getting the NASA community to go looking for alien life, now that they knew how to recognize it, at least on a rocky planet like Mars? Unfortunately, for reasons deeply rooted in the culture, the detection of alien life has not been a part of any space mission since 1976, and that did not change as a result of these insights from synthetic biology, now available for a quarter of a century.

This culture is, in part, the result of an incorrectly interpreted set of gas chromatography-mass spectrometry (GC-MS) experiments done on Mars in the 1976 Viking missions. A quarter-century passed before we [33] and others recognized that the Viking GC-MS data had been misinterpreted to indicate that the Martian surface lacked organic molecules. Still worse, many misinterpreted the 1976 Viking data as evidence that Martian soil is self sterilizing. This, in turn, caused the collapse of the life-detection missions to Mars.

Another quarter century has gone by to bring us to the present day. Even though we know how to look for life on Mars, where to look for life on Mars [34–36], and that there is a high probability that life originated on Mars [37], only in 2019 did NASA hold, in Carlsbad, for the first time in decades, a workshop on detecting extant life on Mars. The science supporting the urgent feasibility for the detection of life on Mars was laid out in a recent report just as COVID broke [38]. However, still, the community in large part thinks that life is not possible there [39].

8. Origins

In addition to constraining our search for life in the Solar System, synthesis provides us with constraints for work seeking to better understand the natural history of nucleic acids and, in particular, the origin of life. The strategy is profoundly different, however. Synthetic biology presumes a synthetic biologist, an ‘intelligent designer’ who can guide the experiments. No credible model for the origin of the informational biopolymer that originated Darwinian evolution in natural history can enjoy such an advantage.

Indeed, the principal line of criticism against much of modern research into the origins of life focuses exactly on this problem. Discussed extensively by Robert Shapiro [40,41], the criticism is that most synthetic chemistry approaches the origins of life problem by (in cartoon form):

  • purchasing some chemicals, generally in high-purity;

  • mixing those chemicals in high concentrations in a specific order under carefully devised conditions;

  • obtaining a product mixture of compounds that includes some found in terran life; and

  • publishing assertions about the origin of life from these mixtures.

Shapiro is persuasive precisely because much of prebiotic chemistry actually does follow this format [42,43].

Accordingly, some time ago, we suggested that problem selection and research approaches could facilitate progress if they moved away from re-visiting ‘difficult problems’ left over from research a half century earlier. Here, we retained the hypothesis that the first informational biopolymer to support Darwinian evolution was the polyelectrolyte RNA. With this constraint, we suggested these as guiding rules to play the ‘origins game’:

  • (i)

    problem selection in origins research should be driven by a search to resolve paradoxes, rather than simply seeking to solve problems or test hypotheses, even ‘difficult problems’ [44]. As is well known among physicists, success in a research programme that addresses paradoxes moves towards community acceptance of a model, even among those who are inclined to disbelieve. As Bohr once remarked: ‘How wonderful that we have met with a paradox. Now we have some hope of making progress.’ [45, p. 196]. Paradoxes are propositions and conclusions laid out in the form of an Aristotelian syllogism, where the propositions are generally accepted, but the deductive logic ends with the conclusion that life could not possibly have originated, at least not in a way proposed. Such paradoxes are easily defined for the ‘RNA first model’;

  • (ii)

    geological information should be introduced not as post hoc constraints, or even by ‘back informing’ chemistry in what has been called a ‘new modus operandi’ for origins research [46], but in the form of specific environmental scenarios under which all chemistry must take place without human intervention. That is, if one proposes (as we do) that the environment in which RNA 100–200 nucleotides in length (sufficient for a replicator) arose was the surface of the Earth 4.36 ± 0.05 billion years ago, all chemistry to get that RNA must occur autonomously with the oxidized minerals on the surface lying beneath an atmosphere that has been transiently reduced by modest-sized impacts; and

  • (iii)

    while geological information has long been suggested to be a source of catalysts for desirable prebiotic reactions [47], a possibly better use for mineral–organic interactions may be to prevent reactions. This is intended to address the so-called ‘tar paradox’, an empirical and theoretical statement that organic matter, when given energy without having access to Darwinian processes, devolves to complex tar. Here we search for organic minerals that are scarce in the modern Terran biosphere (as they are eaten), but likely to be abundant on early Earth.

As part of the ‘game’, we proposed that the first informational polymer to support Darwinian evolution was exactly the RNA known in nature, not in one of the forms that synthetic biology has delivered. RNA conforms to the two criteria required universally for informational polymers, discussed previously. Of course, modern Terran biology itself offers us more than enough opportunities to show that RNA can support Darwinian evolution. Further, both natural RNA and laboratory-evolved RNA have been shown to have catalytic activity, supporting an RNA World hypothesis which holds that the only genetically encoded component of biological catalysis during one episode of life on Earth was RNA itself, perhaps with modified nucleotides [48,49].

The RNA World hypothesis is well supported. Indeed, recent work from the Carell laboratory shows how it might have invented translation [50]. However, it does not compel an RNA first hypothesis for the origins of life. A number of investigators [51] have argued that an alternative polymer preceded the formation of the RNA World, feeling that RNA itself is too complex to have arisen without the guiding hand of a synthetic chemist or a divinity.

This leads to the most aggressive arguments against the RNA First Hypothesis for the abiological formation of Schrödingerian polyelectrolytes to spark Darwinian evolution. Specifically, RNA is seen by many to be too complex to have emerged spontaneously from prebiotic environments.

At this point, we have come to disagree. RNA is indeed complex, but experiments set within a single coherent geological context now suggest that it is likely to be the natural consequence of organic chemistry operating on a Hadean Earth.

9. The explicit planetary environment where RNA forms autonomously

The environmental model consists of these components [52,53]:

  • Earth formed 4.53 billion years ago, with the formation of the Moon following at 4.51 Ga;

  • metallic iron, the predominant component of the enstatite meteorites that formed most of Earth's accretion mass, sank rapidly to the core (+ 0.01 Ga), removing most reducing Fe0;

  • Fe2+ in the mantle disproportionates to Fe0 (which sinks) and Fe3+, leading to further oxidation of the mantle;

  • by 4.36 ± 0.05 Ga, the minerals delivered from the mantle to the surface were oxidized, with the oxygen fugacity being near the fayalite-magnetite-quartz (FeO-Fe3O4; FMQ) redox buffer. This time corresponds approximately to the last bombardment event likely to sterilize the entire planet;

  • accordingly, in the rocks delivered to the crust from the mantle, phosphorus was present as phosphate (not phosphide), boron was present as borate, not boride, silicon was present as silicate, and molybdenum was present as Mo4+ and Mo6+;

  • atmospheric carbon was emitted from the mantle as CO2 (not CO or CH4), nitrogen was emitted as N2 (not NH3), sulphur was emitted as SO2 (not H2S), and hydrogen was emitted as H2O (not H2);

  • over this period, Earth suffered impacts from smaller bodies, such as analogues of Ceres (10²⁴ g, which is planet-sterilizing) and Vesta (3 × 10²³ g, which is too small to be planet sterilizing, but still has a differentiated iron core). Such impactors have their own iron core. Fragmenting these cores put large amounts of the impactor Fe0 into the atmosphere, reducing the atmosphere so that it contained CO, CH4, NH3, H2S and H2. Such atmospheres have long been known to be productive for the formation of hydrogen cyanide (HCN), cyanoacetylene (HCCCN), cyanamide (H2NCN) and cyanogen (NCCN). From these, both standard and nonstandard nucleobases are known to be produced [54];

  • a reducing atmosphere productive for RNA bases is also hazy [55]. Accordingly, high-energy ultraviolet light does not reach the surface and is not available to do prebiotic photochemistry there. Further, it does not destroy organic molecules that might have been synthesized on the surface post-impact, at least until the haze clears as the atmosphere returns to its pre-impact redox state;

  • according to the model, RNA is formed in intermittently irrigated sub-aerial constrained aquifers, to not allow it to be diluted unproductively into a global ocean. To the extent that it appeared above the ocean level, land was dominated by basalt that was reworked by impact. Upon rapid cooling under air or water, basaltic glass is formed. This is illustrated today by basalt on the sub-aerial surface of Iceland. As principal differences, Icelandic basalt has access to an oxidizing atmosphere, and is formed primarily by volcanism, not impact; and

  • depending on specific conditions, the basaltic glass is reworked by water to give zeolites, starting with mordenite. Alternatively, under dryer conditions, the silica is reworked to give opal.

Abundant evidence now documents the likelihood of this specific environment just after the last sterilizing impact, estimated from studies of solar system dynamics to have been 4.36 ± 0.05 billion years ago. For example, Dustin Trail and his colleagues examined zircons that survived from near this time. They noted that the Ce4+/Ce3+ ratio of these ancient zircons indicates that the melts from which they crystallized had a redox state of (fO2 = FMQ–0.5 ± 2.3). This is consistent with the volcanic gas composition that our model requires.

We now turn to describe the chemistry that could not not have happened in this geological scenario. The scheme that we propose to operate in this environment is captured in figure 4.

Figure 4.

A model supported by experiments suggests that RNA is a molecular outcome of intrinsic chemical reactivity operating in a single geological environment, here in an intermittently hydrated constrained aquifer beneath a Hadean atmosphere 4.36 ± 0.05 billion years ago. Here all the rock species shown (phosphate, borate, lüneburgite, impact basalt glass) were present beneath an atmosphere transiently reduced by impactors large enough to have their own iron cores. Organic minerals were formed from HCHO generated high above the hazy atmosphere by photochemistry, after it was trapped by volcanic SO2 emerging from a fayalite-magnetite-quartz (FMQ) redox mantle. For more details, see the cited literature.

It is useful to analyse figure 4 in reverse, as the most delicate of its molecules is the final one: RNA between 100 and 200 nucleotides in length. This length is adequate to support RNA catalysis, including RNA molecules that catalyse template directed synthesis of RNA [61,62]. If life is to begin under an RNA first scenario, these are the molecules that must have arisen under our geological scenario, and been metastable there.

Some time ago, Elisa Biondi showed that RNA can be stabilized by binding to silica species such as opal or zeolites [63]. She is now publishing the step just before that, having shown that RNA can be made in the Hadean simply by exposing nucleoside triphosphates to basaltic glass. Elisa will discuss these results in another session.

To take the next step backwards, we must then ask where the triphosphates came from. In basalt and basaltic glass, phosphate is in the form of phosphate anhydrides, including cyclic trimetaphosphate and polyphosphate [64]. Hyo-Joong Kim recently showed that in the presence of nickel, itself derived from the cores of impactors, nucleosides react in the presence of borate to give triphosphates [65]. Here, borate coordinates to the 2'- and 3'-hydroxyl groups, driving the triphosphate to the 5'-position, the position required for basalt glass-catalysed synthesis of RNA 100–200 nucleotides in length. Borate was almost certainly available as well; it is observed in the eroding basalts of the Martian surface.

Taking the next step backwards, we must ask where the nucleosides came from. Here, reaction of ribose 1,2-cyclic phosphate with the standard RNA nucleobases in the presence of evaporating urea gives nucleosides, some in quite high yields [66]. As noted in our geological model, the bases are formed in the transiently reduced atmosphere, as is urea, from the hydrolysis of cyanamide. These also rain to the surface as organic minerals.

But where did the ribose 1,2-cyclic phosphate come from? Here, Krishnamurthy et al. [58,67] showed that phosphate anhydrides, again from impact basaltic glass, react with ammonia, again from the transiently reduced atmosphere, to give phosphoramides. These react with ribose to give ribose 1,2-cyclic phosphate.

Where does the ribose come from? A decade ago, a team led by the aforementioned Hyo-Joong Kim showed how borate minerals could moderate the conversion of carbohydrates to give a five-carbon carbohydrate as the first metastable intermediate in a series of aldol reactions [55]. This can be transformed into ribose among other linear pentoses, again under control by borate. In each case, the reacting electrophile is formaldehyde (HCHO) with seeding glycolaldehyde.

So where did the formaldehyde (HCHO) and seeding glycolaldehyde come from? These are formed high in the atmosphere by photochemical reaction of CO2 and water [68,69]. With SO2 emerging by volcanism from a FMQ mantle, the HCHO cannot help but be trapped as the hydroxymethylsulfonate addition product. This must have rained from the atmosphere into the constrained aquifer.

Thus, all of the steps in figure 4 can occur autonomously in a single environment, a constrained sub-aerial aquifer on the Hadean Earth suffering large impacts 4.36 ± 0.05 billion years ago. Similar environments were also present at the same time on Mars. We are now in the process of developing kits that we can distribute to high school and college students, with which they may themselves reproduce steps in the prebiotic synthesis of RNA.

10. Missing pieces

This model is far from complete. In particular, it does not account for the need for Schrödinger homochirality, even though it does manage to capture the essentials of the Polyelectrolyte Theory of the Gene. I myself consider this to be a serious gap in the model, even though I recognize that many people are working to solve the homochirality problem in general [70].

Further, detailed analysis of the RNA formed on impact basaltic glass shows that it contains a mixture of 2′,5′- and 3′,5′-links. The seriousness of this problem is still not clear. Some think that this mixture of linkages can be cured [71–73]. Others do not.

Nor do we understand adequately the maturation of carbohydrates in the presence of borate in this specific environment. Phosphoramides can also react with maturing carbohydrates before they reach the five-carbon pentose stage. Whether these unproductively divert carbohydrate material, or whether these serve as productive organic mineral reservoirs, is unknown at the moment.

11. Synthetic Darwinism

With RNA now largely paradox-free as a prebiotic species, we ask whether RNA could support the origin of life, and, if so, with what probability.

Here, we turn to laboratory in vitro evolution (LIVE), a tool proposed three decades ago [74–76] as a way to obtain ligands and catalysts without needing to command chemical theory at a level adequate for direct design, and without the trial and error that characterizes most ‘design’ in organic chemistry [77]. In one form, it premises that a library of >10¹² DNA or RNA (NA) molecules contains one or more ligands with a selectable affinity for every receptor.

If this were so, it should be possible to get a ligand for any receptor, or a catalyst for any reaction. All that would be necessary is to select it away from the uninteresting NA molecules by following a simple experimental procedure:

  • (i)

    synthesize a library of NA molecules;

  • (ii)

    extract from that library the desired ligand/catalyst by a procedure that exploits the desired binding/catalysis as a separation principle in an extraction step; and

  • (iii)

    use the power of polymerase chain reaction to amplify the functional surviving molecules.

If necessary, one might repeat the process, perhaps with some sequence evolution.
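The three-step procedure above can be caricatured in a few lines of code. The sketch below is only a toy, not the laboratory protocol: the function names, the pool size, the arbitrary target motif used as a stand-in for 'activity', and the ranking used as a stand-in for physical separation are all hypothetical choices made for illustration.

```python
import random

ALPHABET = "ACGT"  # standard four-letter library; purely illustrative


def make_library(size, length, rng):
    """Step (i): synthesize a library of random sequences."""
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(size)]


def toy_activity(seq, motif="GATTACA"):
    """Hypothetical stand-in for binding/catalysis: best match to an arbitrary motif."""
    best = 0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        best = max(best, sum(a == b for a, b in zip(window, motif)))
    return best / len(motif)


def select(pool, keep_fraction):
    """Step (ii): extract the most active molecules (ranking replaces a real separation)."""
    ranked = sorted(pool, key=toy_activity, reverse=True)
    return ranked[: max(1, int(len(pool) * keep_fraction))]


def amplify(survivors, size, mutation_rate, rng):
    """Step (iii): copy survivors back up to full pool size, with optional copying errors."""
    pool = []
    for i in range(size):
        parent = survivors[i % len(survivors)]
        child = "".join(
            rng.choice(ALPHABET) if rng.random() < mutation_rate else base
            for base in parent
        )
        pool.append(child)
    return pool


def run_live(rounds=5, size=500, length=25, keep_fraction=0.1, mutation_rate=0.01, seed=0):
    """Repeat select/amplify and record the best toy activity in the pool each round."""
    rng = random.Random(seed)
    pool = make_library(size, length, rng)
    history = [max(map(toy_activity, pool))]
    for _ in range(rounds):
        survivors = select(pool, keep_fraction)
        pool = amplify(survivors, size, mutation_rate, rng)
        history.append(max(map(toy_activity, pool)))
    return history
```

With `mutation_rate=0.0` the loop can only enrich what the starting pool already contains; a non-zero value allows the 'sequence evolution' mentioned above, in which repeated rounds can reach sequences absent from the original library.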

In this vision, the tedium of medicinal chemistry might no longer be required. In its grandest vision, the library would contain a drug for any disease where a target was accessible to an NA therapeutic. For those interested in catalysis, it would contain a catalyst for every reaction.

Supporting this vision was, of course, the RNA World model for an episode of life on early Earth where RNA was the only encoded component of biological catalysis. Indeed, according to this ‘RNA World’ model, evolved RNA had a wide range of catalytic activity [78]. This model was also supported by details of modern biology, such as the catalysis of peptide synthesis by ribosomal RNA [79], RNAse P-catalysed processes that matured transfer RNA [80], and RNA mediated splicing of introns [81]. These are all working examples of natural RNA catalysis.

Pursuit of this vision has continued for three decades [82]. In this time, LIVE has delivered artificial NA binding molecules (aptamers) for many targets. Further, LIVE has delivered catalytic RNAzymes and DNAzymes [83], including RNA ligases and replicases [84–86].

Unfortunately, aptamers produced from standard DNA and RNA sequences have never had particularly useful affinities. Aptazymes produced by LIVE have generally had only modest catalytic power. Analyses of ‘landscapes’ of RNA molecules have questioned whether this platform is capable of supporting the evolution of the metabolic reactions required, not only for biotechnology but even in an RNA World [87]. In biotechnology, these disappointing outcomes have limited aptamers and aptazymes to niche markets, not delivering the transformative technology that was envisioned, although many continue to try (https://aptamergroup.com/).

Further, for catalysis, NA does not appear to be a particularly effective matrix for generating catalysts. This may come because NA as we know it in the modern world has too few side-chains, too little functionality, too low sequence density and consequently too little catalytic potential. Its functionality repertoire is certainly inferior relative to better proteins.

Thus, it was perhaps naive to ever expect aptamers to ‘rival antibodies’, or aptazymes to be useful catalysts. Supporting this view, when natural RNA performs non-genetic roles, it often has appended functional groups [88]. This suggests that the RNA World may have enhanced the limited power of its catalytic biopolymer by adding functionality to the four standard bases [89].

These observations prompted synthetic biologists to attach functional groups to one or more of the four nucleobases in the NA libraries prepared for LIVE. This has generally seen success [90–93]. Thus, Ichiro Hirao, speaking later in this session, found that adding just one hydrophobic side chain to an aptamer can give it picomolar affinities, which generally elude standard LIVE [94]. SomaLogic has obtained slow off-rate modified aptamers (SOMAmers) by appending hydrophobic side chains (benzyl, naphthyl, tryptamino and isobutyl) to nucleobases, and made a business of it [95,96].

Improvements are also seen with catalysis. For example, Perrin and his group modified all four standard nucleotides to obtain improved NA catalysts [97,98]. Substituting polymerase-based amplification with ligation, Liu recently functionalized nucleic acid polymers with up to 32 different species [99].

Detailed studies of some examples of catalytic NAs found that their catalytic power was limited by misfolded forms [100]. Misfolding is difficult to avoid given the low information density presented by just four building blocks and just two nucleobase pairs. Further, ‘over-decorating’ an NA sequence by adding functional groups to each of its components, or (for hydrophobic appendages) to just 25% of the components, causes the NA to no longer ‘behave like nucleic acids’.

Thus, we premised that gaining better control over folding and limiting the functionalization of library sequences were also important goals when seeking to improve the performance of LIVE. To improve LIVE under this hypothesis, we set out to re-design evolvable NA platforms by creating artificially expanded genetic information systems (AEGIS) that might support LIVE, but with additional information-containing building blocks that form independently folding pairs, some with added functionality [101,102].

This approach has been rewarding. Considering just binding, we and others [103] have generated a range of AEGISbodies, analogues of antibodies, that bind to specific protein targets [104] as well as complete cells [105–107]. Some of these have been used in architectures to deliver drugs to specific cancer cells [108]. Further, the expanded genetic alphabets have delivered new non-canonical interactions that improve the range of folded structures [109].

One interesting example of a potential application of an AEGISbody as a therapeutic agent was published just as the pandemic was beginning to divert us towards using AEGIS and other alien genetic systems for diagnostic purposes. Here, Liqin Zhang took an AEGISbody that had been selected to bind to a line of liver cancer cells, and counterselected so as to not bind to untransformed normal liver cells, and appended to it a short single-stranded unit that serves as the trigger for the assembly of a nanotrain structure. After the nanotrain had been assembled, he bound doxorubicin drug molecules to the nanotrain, approximately 100 for each (figure 5).

Figure 5.

An AEGISbody (Apt) that binds selectively to liver cancer cells, but not to normal liver cells, can be appended to an AEGIS nanotrain (NTr) built from a six-letter genetic alphabet. This intercalates the drug doxorubicin (Dox). After it binds to the cancer cell, the AEGISbody with the attached nanotrain is internalized, driving the drug into the cell. Therefore, this AEGISbody-drug conjugate provides a general strategy to do targeted drug delivery to cancer cells.

The result was a molecule that resembles an antibody-drug conjugate, but with the antibody replaced by an AEGISbody. In vitro studies with cells in culture showed that the AEGISbody dragging behind it a train of conjugated doxorubicin molecules bound selectively to the cancer cells, but not to the normal cells. The AEGISbody with the attached drug molecules was then internalized into the cancer cell, and killed the cancer cell.

The advantages of AEGISbody-nanotrains over standard aptamers conjugated to drugs include, of course, the improved performance of AEGISbodies over standard aptamers. AEGISbodies have additional advantages over standard aptamers. In particular, because AEGIS nucleotides do not pair with standard nucleotides, the nanostructures cannot be invaded by any standard DNA or RNA molecules. As these standard DNA and RNA molecules are abundant in an actual living system, this gives AEGIS nano-constructs increased robustness in complicated biological environments, including the human body.

12. Catalysis

The advantages of AEGIS-LIVE over standard LIVE are especially evident when catalysts are sought. In another set of experiments, Elisa Biondi did parallel in vitro selection experiments to evolve RNA-cleaving DNAzymes with libraries containing either standard DNA, or AEGIS DNA built from six nucleotides (ACTGZP) [110]. The starting libraries contained 25 nucleotides in the random regions composed of either ACTG, or ACTGZP nucleotides, surrounded by standard DNA primer binding sites.

The length (25 nt) of the random region was chosen for its being, for standard DNA, the longest where an essentially complete sequence space can be practically investigated. A standard library with 25 random nucleotides has 4²⁵, or approximately 10¹⁵, possible sequences. This sequence space can be covered by approximately 2 nmol of library, about 20 μg, which contains on average one exemplar of every sequence. Thus, if the presumption of classical in vitro selection holds (that every library contains a selectable catalyst for every reaction), simple selection without subsequent evolution should find it. Indeed, subsequent evolution is not possible; it simply revisits sequences already present in the original library.

This is different for six-letter AEGIS libraries. Here, 25 nt-long random regions have 6²⁵ (2.8 × 10¹⁹) different sequences. This would require 47 μmol of library (approx. half a gram) to cover the full sequence space. This is not practical in most laboratory environments. However, evolution is possible here, and may even be desirable.
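The library sizes quoted above follow from simple counting. As a minimal sketch (the function names are ours, introduced only for illustration), the number of sequences in a fully randomized region and the amount of library that holds, on average, one molecule of each can be computed as:

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def sequence_space(alphabet_size, random_length):
    """Number of distinct sequences in a fully randomized region."""
    return alphabet_size ** random_length


def moles_for_one_copy_each(alphabet_size, random_length):
    """Moles of library holding, on average, one molecule per possible sequence."""
    return sequence_space(alphabet_size, random_length) / AVOGADRO


standard = sequence_space(4, 25)  # ACTG library
expanded = sequence_space(6, 25)  # ACTGZP library

print(f"4^25 = {standard:.2e} sequences, {moles_for_one_copy_each(4, 25) * 1e9:.1f} nmol")
print(f"6^25 = {expanded:.2e} sequences, {moles_for_one_copy_each(6, 25) * 1e6:.1f} umol")
print(f"size ratio of six-letter to four-letter space = {expanded / standard:.0f}")
```

This reproduces the figures in the text (roughly 2 nmol for the four-letter library and 47 μmol for the six-letter library) and shows that the six-letter space is about 25 000 times larger. The conversion to mass is not attempted here, since it depends on the full length of the library molecules including primer binding sites.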

The results of this head-to-head comparison between a standard library and an expanded library are striking. Even after 16 rounds of selection, the standard library gave no species with ribonuclease activity (figure 6). This contrasts with the modest activity achieved by Santoro, Joyce, and others with longer libraries. It indicates that selectable catalytic activity is very sparsely distributed within the shorter standard libraries.

Figure 6.

(a) Progress in a selection to obtain DNA molecules that cleave RNA ‘in cis’ shows that active RNA cleavers appear after eight rounds of selection with the expanded GACTZP library (red), but never with the standard GACT library (black). (b) Kinetics of RNA cleavage catalysed by one of the survivors, present from the first round. (c) Predicted secondary structure of one of the RNA-cleaving AEGISzymes selected from the GACTZP library.

By contrast, ribonuclease activity was observed after just eight rounds of selection with the six-letter GACTZP libraries. Sequence analysis shows that several of these are present after the first round of selection; several more appeared in rounds four through to eight, perhaps by true evolution.

Assuming that the sampling is random with respect to cleavage activity, this implies that the total GACTZP library contains approximately 100 000 ribonucleases with comparable activity. Since the standard library, which sampled a quarter of the sequence space, did not provide even a single ribonuclease with this activity, we can infer that the density of RNA cleavage activity at this level is at least 25 000 times higher in the AEGIS GACTZP functionalized library than in the standard library.

Remarkably, we can propose a mechanism explaining why the GACTZP functionalized library is such a better reservoir for functional molecules than the standard library. Analysis of the kinetics of cleavage as a function of pH showed a bell-shaped curve, with a maximum at about pH 7.2. Deep sequencing of survivors revealed that all contain two or four AEGIS Z nucleotides.

The pKa of the Z heterocycle alone is 7.8. Thus, Z provides to DNA libraries an acid-base functionality absent in standard nucleotides, but present in proteins, by histidine.

In the well-studied protein ribonuclease A from ox pancreas, two histidine residues act in concert in a ‘pull-push’ general acid-general base mechanism (figure 7, top). Here, one of the histidines must be protonated; the other must be deprotonated.

Figure 7.

(Top) The ribonuclease A protein uses a deprotonated histidine as a general base to remove a proton from the 2'-hydroxyl group of the RNA substrate, facilitating its attack on the phosphate. A protonated histidine in the active site facilitates this attack by protonating the oxygen of the phosphate linking group. (Bottom) The AEGISbodies catalysing RNA cleavage all have Z nucleotides in pairs. We propose a mechanism where a deprotonated Z acts as a general base to remove a proton from the 2'-hydroxyl group of the RNA substrate, facilitating its attack on the phosphate. A protonated Z in the active site then facilitates this attack by protonating the oxygen of the phosphate linking group. The bell shaped pH-rate profile with a maximum at a perturbed pKa value for Z is consistent with this. This illustrates how expanded genetic alphabets can make laboratory in vitro evolution be a general route to productive catalysts.

We propose that the AEGISzyme effects RNA cleavage via an analogous mechanism, but with the histidines replaced by Z's. Here, one of the Z's must be protonated; the other must be deprotonated. The pH-rate profile showing a maximum rate with a pH near the pKa of the Z heterocycle is consistent with this. With one Z protonated and one Z deprotonated around the phosphodiester linkage, acid-base catalysis of RNA cleavage would follow. This hypothesis is also consistent with the observation that the selected ribonucleases contain Z pairs.
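Why such a mechanism gives a bell-shaped pH-rate profile can be seen from a minimal sketch. It assumes, as a simplification not taken from the experiments, that the two Z groups titrate independently, that the rate is proportional to the fraction of catalyst with the general base deprotonated and the general acid protonated, and that both groups share an illustrative perturbed pKa of 7.2. The function names are ours.

```python
def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch fraction of a group in its deprotonated (base) form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def relative_rate(pH, pKa_base, pKa_acid):
    """Fraction of catalyst with the general base deprotonated AND the general acid protonated."""
    base_ready = fraction_deprotonated(pH, pKa_base)
    acid_ready = 1.0 - fraction_deprotonated(pH, pKa_acid)
    return base_ready * acid_ready


def ph_optimum(pKa_base, pKa_acid, lo=4.0, hi=10.0, steps=600):
    """Locate the pH of maximum relative rate on a grid with 0.01 pH unit spacing."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda pH: relative_rate(pH, pKa_base, pKa_acid))


# Assumed, illustrative pKa values; the text gives 7.8 for the free Z heterocycle
# and a rate maximum near pH 7.2.
optimum = ph_optimum(pKa_base=7.2, pKa_acid=7.2)
```

In this simple model the optimum falls midway between the two pKa values, so two Z's with similar pKa values give a maximum near that shared pKa, and the rate falls off on either side as one or the other group ends up in the wrong protonation state.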

Adding a nucleobase pair having distinctive reactivity increases the value of a library as a reservoir for RNA cleaving AEGISzymes by 4 or 5 orders of magnitude. However, could this expansion have played a role in natural history at or near the origin of life?

Abiological processes can attach functionalized side chains to the standard pyrimidines in RNA, especially uracil [111,112]. This strikes us as a more reasonable way to get general acids and general bases into a prebiotic RNA library than a prebiotic synthesis of Z, notwithstanding the fact that this might lead to ‘overdecoration’ of the evolving platform.

Of course, just as translation probably emerged well after Darwinian evolution was secure in Terran natural history, the biosynthesis of Z or other AEGIS bases might emerge in alien organisms that have access to Darwinian processes but have ‘chosen’ to increase the catalytic power of their informational biopolymer rather than invent a new catalytic biopolymer.

What is clear, however, is the biotechnological import of the ability of AEGIS components to make LIVE work. Even with advanced selection strategies, in vitro selection, SELEX and LIVE still are not meeting the promise of laboratory evolution. AEGIS-LIVE appears to be able to.

13. Conclusion

We have covered a lot, possibly too much, in this lecture. So let us recap:

  • the core structure of nucleic acids appears to be a natural outcome of non-biological chemical processes occurring in a constrained sub-aerial aquifer on the surface of a rocky planet like Earth or Mars approximately 4.36 ± 0.05 billion years ago. The important elements of that surface are basalt converted to basaltic glass through impacts, which became non-sterilizing starting near this time;

  • however, this core structure is not functionally unique. Synthesis generates informational molecular systems able to support evolution, despite having different structures and different modes of information transfer. In particular, the DNA and RNA alphabets can be expanded by adding nucleotides; these can be replicated with rule-based fidelity if they fit the Schrödinger aperiodic crystal structures;

  • these alternative nucleic acid analogues support biotechnology, including large scale DNA synthesis, diagnostics, including highly multiplexed genetic assays, and medicine, including new approaches for the therapy of cancer. In particular, adding replicable nucleotides allows functionality to be introduced into a library without functional group ‘overdecoration’. For some reactions, like RNA cleavage, the library built from expanded nucleotides is 10⁴–10⁵-fold richer as a source of catalytic activity;

  • nevertheless, certain features, including a polyelectrolyte structure and size regular building blocks, are required for any informational polymer that must support Darwinian evolution. Size regularity is essential to allow the physics of phase transitions to ensure replication fidelity. A polyelectrolyte backbone in water ensures stability of physical properties in an informational polymer, ensuring that information that confers fitness can be propagated; and

  • these features serve as universal and agnostic biosignatures that support the search for life throughout the Solar System, based on universal chemical and physical laws rather than extrapolations from the molecular biology found in the one example of life that we know of here on Earth.

Data accessibility

No data are associated with this article.

Authors' contributions

S.A.B.: conceptualization.

Conflict of interest declaration

Many of the compounds and processes described here are covered by US patents, are licensed for commercial use, and are available through Firebird to any interested party for their own research. The Guest Editor is on the Board of Directors of the Foundation for Applied Molecular Evolution.

Funding

This material is based on work supported in part by the National Science Foundation under grant no. EAR-2213438. Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award no. R01GM141391. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.



Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
