Philosophical transactions. Series A, Mathematical, physical, and engineering sciences. 2017 Nov 13;375(2109):20160347. doi: 10.1098/rsta.2016.0347

Bulk measurements of messy chemistries are needed for a theory of the origins of life

Nicholas Guttenberg 1,2, Nathaniel Virgo 1, Kuhan Chandru 1, Caleb Scharf 3, Irena Mamajanov 1
PMCID: PMC5686404  PMID: 29133446

Abstract

A feature of many of the chemical systems plausibly involved in the origins of terrestrial life is that they are complex and messy—producing a wide range of compounds via a wide range of mechanisms. However, the fundamental behaviour of such systems is currently not well understood; we do not have the tools to make statistical predictions about such complex chemical networks. This is, in part, due to a lack of quantitative data from which such a theory could be built; specifically, functional measurements of messy chemical systems. Here, we propose that the pantheon of experimental approaches to the origins of life should be expanded to include the study of ‘functional measurements’—the direct study of bulk properties of chemical systems and their interactions with other compounds, the formation of structures and other behaviours, even in cases where the precise composition and mechanisms are unknown.

This article is part of the themed issue ‘Reconceptualizing the origins of life’.

Keywords: bulk measurements, prebiotic chemistry, messy chemistry, synthetic chemistry

1. Introduction: chemistry and the origins of life

To explain the origins of life (OoL), we must understand what types of chemical systems could support the emergence of life’s key processes. As it stands, many chemical systems with promising hints of life-like processes have been discovered. One set of such discoveries involves the ability to produce biologically relevant or precursor molecules, in a wide variety of environments. This can happen from prebiotically available small molecules and energy sources, for instance, under Miller–Urey conditions [1] or at high pressure and temperature in geothermal vent conditions [2]. Some biologically relevant molecules may also be generated by hydrogen cyanide (HCN) chemistry [3] and in Fischer–Tropsch-type reactions [4]. Beyond the question of just producing particular precursor molecules, we are beginning to see the first ‘life-like properties’ in examples of functioning autocatalysis in the formose reaction [5] and in more exotic chemical systems such as self-assembling molybdenum complexes [6] and stacked macrocycle replicators [7,8].

There is a commonality to many of these systems—they are qualitatively messy. The systems produce large assortments of compounds through a wide variety of mechanisms. For example, in prebiotically plausible settings self-polymerization of small molecules, such as formaldehyde [9] or HCN [10,11], yields a diverse mixture of products that is difficult to resolve, and has been referred to as ‘tar’ or ‘asphalt’ [12]. This is strikingly different from the controlled, high-yield world of synthetic chemistry, and even different from the enzyme-controlled world of biochemistry.

There is an open question about whether the OoL began with a sparse chemical network and remained sparse as biology developed [13], or whether it began messy and became sparse later, either gradually or through a series of discrete transitions, as hypothesized, for example, by Oparin [14] and others [15–17].

In this short paper, we propose that in order to resolve this question there is a pressing need to deal with complex, messy chemical systems as a whole, and in their own terms—as a network of interacting chemical mechanisms and processes—rather than only focusing on finding specific molecules or mechanisms from modern biology. Furthermore, a purely theoretical approach is not sufficient. At least initially, the specific measurements that offer the greatest scientific leverage will have to be discovered via exploration and well-honed experimental intuition. The kind of information that will be needed to construct a better theory of messy chemistry is not specific molecular syntheses, but rather bulk measurements that directly address interactions and function at the systems level. In the following sections, we provide a number of suggestions for directions to take.

What do we mean by messy? By a ‘messy’ chemistry, we mean one that has a high diversity of products, intermediates and reaction pathways, such that they cannot all be unequivocally identified. The same chemistry might be considered messy in some contexts and clean in others; this depends on the questions being asked and the techniques available for analysis. A ‘clean’ approach is one in which one aims to understand in detail every component of the system. We suggest that, for many chemical systems that are relevant for the OoL, such an approach is not practical or even desirable, and we propose instead to focus on the development of new, rigorous methodologies for the study of fundamentally messy systems.

Messiness seems to be a frequent ‘default state’ encountered in prebiotic chemistry experiments, and it seems worthwhile to take seriously the possibility that the OoL might have begun in such a system. Investigating this possibility involves finding ways to grapple directly with the explosions of complexity and interactions that occur in such chemical systems, rather than by mapping out subsets of the system in a locally ‘clean’ way.

Why do we focus on ‘clean’ versus ‘messy’ approaches? In chemistry as a whole, different standards of evidence are applied for studying natural systems such as biochemistry or geochemistry, versus the task of elucidating synthetic processes. In synthetic chemistry, the goal is usually to produce a high yield of specific products and to elucidate in detail the reaction mechanisms involved. In contrast, in observational branches of chemistry, the typical goal is to characterize what is there in the system. These different approaches can colour the way that the OoL problem is approached.

In work on the OoL from within synthetic chemistry, a clean synthesis of biological or prebiotic molecules is often an explicit or implicit goal. Other possibilities, such as the production of a diverse range of polymer products (a.k.a. ‘asphalt’ or ‘tar’) [18–20] or non-stereospecific mechanisms, are often taken to be problems to be avoided. By contrast, by suggesting a ‘messy’ approach, we are proposing that these things are important objects of study in their own right. By taking them seriously for what they are and what they do, we might shed important light on the problem of origins. It might be that the initial ‘tar’ is not a problem to be avoided, but an important stage in the emergence of life.

Examples. This kind of approach is already taken in branches of bio- and geochemistry, where challenges and constraints in characterizing the complexity of the natural world have led to the development of analytical techniques suitable for complex mixtures. Metabolomics, for example, treats individual molecules as fingerprints associated with cellular processes, and requires techniques capable of identifying and quantifying broad subsets of small molecules in biological samples and then understanding their connection to system-level effects in order to diagnose and treat diseases.

Kuhnert et al. [21] have introduced an alternative, more ‘messy’ approach. When studying coffee and tea extracts, the authors analysed the samples by mass spectrometry without any attempt at pre-analysis sample separation or treatment. The resulting mass spectra, of course, contain unresolved peaks or ‘humps’. The authors applied a statistical analysis routine aimed at identifying unique features and tendencies of the hump rather than attempting to resolve every single mass feature. This relatively fast analysis allowed the recognition of spectral features unique to teas from different origins. Furthermore, while lacking an unequivocal identification and quantification step, the method was able to discover classes of compounds beneficially and adversely affecting health in processed foods.
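As a minimal sketch of what treating an unresolved hump as a single statistical object could look like, the Python below reduces a spectrum to two bulk descriptors: an intensity-weighted centroid and an intensity-weighted width. This is not the routine used in [21]; the function name `hump_descriptors` and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def hump_descriptors(mz, intensity):
    """Return (centroid, width) of a spectrum treated as one distribution.

    mz and intensity are equal-length sequences. Hypothetical helper for
    illustration only; it is not the analysis routine of ref. [21].
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    total = sum(intensity)
    if total <= 0:
        raise ValueError("spectrum has no signal")
    # Intensity-weighted mean m/z of the whole hump.
    centroid = sum(m * i for m, i in zip(mz, intensity)) / total
    # Intensity-weighted spread around that mean.
    variance = sum(i * (m - centroid) ** 2 for m, i in zip(mz, intensity)) / total
    return centroid, math.sqrt(variance)


# Made-up intensities for two samples measured on a shared m/z grid.
mz_grid = [100.0, 200.0, 300.0]
sample_a = [1.0, 2.0, 1.0]
sample_b = [0.5, 1.0, 2.5]

centroid_a, width_a = hump_descriptors(mz_grid, sample_a)
centroid_b, width_b = hump_descriptors(mz_grid, sample_b)
print(centroid_a, width_a)
print(centroid_b, width_b)
```

Descriptors of this kind discard molecular identity entirely, which is the point: two samples can be compared or classified by where their humps sit and how broad they are, without resolving a single peak.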

Another example is the routine application of bulk measurements to geochemical organic samples. The petroleum industry has established many quickly measured parameters to classify crude oils. These specifications include American Petroleum Institute (API) gravity, an arbitrary parameter based on the specific gravity of the fluid. An oil can be referred to as light/heavy based on the relative composition of light aliphatic and asphaltic fractions, or sweet/sour as a measure of bulk sulfur content.
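To make the idea of a quickly measured bulk parameter concrete, the sketch below computes API gravity from specific gravity using the standard definition, API = 141.5/SG − 131.5 (SG at 60 °F), and then attaches coarse labels. The cutoff values (31.1 and 22.3 degrees API, 0.5 wt% sulfur) are commonly quoted industry conventions supplied here as assumptions, not values taken from this article, and the function names are hypothetical.

```python
def api_gravity(specific_gravity):
    """API gravity in degrees from specific gravity at 60 degrees F."""
    if specific_gravity <= 0:
        raise ValueError("specific gravity must be positive")
    return 141.5 / specific_gravity - 131.5


def classify_crude(specific_gravity, sulfur_wt_percent,
                   light_cutoff=31.1, heavy_cutoff=22.3, sweet_cutoff=0.5):
    """Label a crude oil from two bulk numbers.

    The default cutoffs are assumed, commonly quoted conventions and can be
    overridden; they are not taken from the article.
    """
    api = api_gravity(specific_gravity)
    if api > light_cutoff:
        density_class = "light"
    elif api < heavy_cutoff:
        density_class = "heavy"
    else:
        density_class = "medium"
    sulfur_class = "sweet" if sulfur_wt_percent < sweet_cutoff else "sour"
    return api, density_class, sulfur_class


# Hypothetical samples: (specific gravity, sulfur wt%).
print(classify_crude(0.85, 0.3))
print(classify_crude(0.95, 2.0))
```

Note that water (specific gravity 1.0) sits at exactly 10 degrees API, which is why the scale is described as arbitrary: it is a convenient relabelling of density, useful because it is fast to measure and sorts oils without any compositional analysis.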

In the context of the OoL, solid-state nuclear magnetic resonance (NMR) has been used to crudely compare HCN polymers and polymers derived from Miller–Urey experiments [22], just to confirm their different nature. Furthermore, bulk solid-state NMR measurements were used to ascertain the polymerization mechanism of HCN. In this experiment, the condensed phase HCN polymerization was initiated by a base and by many radical catalysts. The broad, unresolved spectra of the resulting polymers looked remarkably similar, suggesting the possibility of in situ radical formation during neat HCN polymerization.

2. Towards a messy theory of the origins of life

The ultimate goal of OoL research is to produce, and test, a theory about how life arose from chemistry. Such a theory must ultimately consist of more than a list of compounds and the prebiotic reactions that synthesized them; it must also explain how a diverse set of reactions become organized into a coherent systemic whole with the capacity for evolution, metabolism etc.

To resolve such fundamental questions as ‘clean versus messy’, we seek a theory that is as general as possible. This generality is necessary not only because we might want to ask the question of how likely life is to form in other chemical scenarios (for example, on other planets), but also because the ultimate test of an explanatory theory is for it to make predictions beyond a single case.

In practice, the development of theories is almost always preceded by observational results. Much current OoL research tries to do this in the opposite order, first proposing a theory of how life began, and then performing experiments to test or demonstrate it. Our proposal is that the field should include work that focuses on generating observational results about the behaviour of messy, complex chemistries, regardless of any immediate interpretation of how this could lead to biology.

This would require a certain amount of change in the conventional requirements for a paper to be publishable in this field. The goal of an observational paper should not be to present or justify a certain theory about the OoL, and nor should it necessarily be to identify the mechanism of a particular reaction. Rather, what we need for the development of theory is the publication of interesting unexplained results about the behaviour of messy chemical systems. Journals and conferences on the OoL should explicitly lighten the requirement that a result needs to be explainable in order to be published. High-quality unexplained results are crucial for the future of the field, and their publication should be encouraged in every way possible.

3. Functional measurements

What kinds of observational experiments would best support the construction of a general theory of the OoL from chemistry?

A theory about life must be slightly different from a mere collection of complex chemical reactions. It must also have some explanatory power regarding the system-level attributes of life such as evolution and metabolism.

The types of measurements which would be relevant for a theory of the OoL should focus on things which do stuff: measurements of functional properties. Those functions can then be assembled as the building blocks of larger-scale functions connected to the things that life must do to sustain and propagate itself.

In addition, the functions measured should be the ones which enable follow-up predictions of downstream functional consequences. If the functional measurements are first-order objects of a theory of life, then it must be possible to understand how they relate to each other if those building blocks are to be assembled into higher-level patterns and inferences. That is to say, the same self-reciprocating structure that exists at the microscopic scale in chemistry must be reflected in the macro-scale constructed by the choice of bulk measurements.

As a guide, we refer here to examples of where this kind of approach has been successful elsewhere in chemistry.

We start with an example from recent OoL chemistry. It was observed that complexes formed from reacted amino acids (Gly, Ala, Asp and Val) can break down animal protein [23]. The specific mechanisms for that functionality are analytically difficult to fully characterize, but that is not relevant for the consequences: these results indicate that the system is able to take large, heterogeneous compounds and render them into smaller sub-components, which, in turn, implies that the complex mixture could feed on other chemical reservoirs. One could then ask about the dependency of this functionality: if the tar is allowed to ‘eat’ a particular distribution of proteins, does its activity towards those proteins change over time? How does the activity change when the tar is heated or exposed to ultraviolet radiation, or when the pH is changed? Does the complex/tar change its general structure over time?

Fundamentally, the questions we want to ask require the measurements of robust properties of messy chemistries that do not necessarily require a detailed chemical analysis.

In a further example, Hanczyc [18] has demonstrated that HCN polymer, a heterogeneous messy mixture, can drive the motion of oil droplets in an aqueous environment. This is another example of function being provided by a messy system, and a similar set of questions can be asked about it.

Finally, kerogen maturation provides perhaps the clearest example of where functional measurements can be self-predictive. As kerogen ages, the ratios of hydrogen to carbon and oxygen to carbon systematically decrease. Based on just these two numbers, it is possible to classify kerogen into four branches that correspond to the type of biomass which originally produced the kerogen, provide estimates of its age, and determine whether it will produce oil or natural gas when heated at depth. The O/C ratio and the H/C ratio (i.e. van Krevelen plots) are highly predictive measurements of a very complex soup of compounds, and furthermore their dynamics with time and heat are themselves robustly predictable [24].
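The two coordinates of a van Krevelen plot are simple to obtain from a bulk elemental analysis. The sketch below, offered as an illustration rather than a method from [24], converts weight percentages of C, H and O into atomic H/C and O/C ratios and checks whether a series of samples moves towards lower values of both. The function names and the sample values are hypothetical; the atomic masses are standard values.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol


def van_krevelen_point(wt_c, wt_h, wt_o):
    """Convert elemental weight percentages into atomic (H/C, O/C) ratios."""
    if wt_c <= 0:
        raise ValueError("carbon content must be positive")
    mol_c = wt_c / ATOMIC_MASS["C"]
    mol_h = wt_h / ATOMIC_MASS["H"]
    mol_o = wt_o / ATOMIC_MASS["O"]
    return mol_h / mol_c, mol_o / mol_c


def ratios_decrease(points):
    """True if neither H/C nor O/C ever increases along the sequence."""
    return all(
        later[0] <= earlier[0] and later[1] <= earlier[1]
        for earlier, later in zip(points, points[1:])
    )


# Hypothetical elemental analyses (wt% C, H, O), ordered from least to
# most mature. These numbers are invented for illustration only.
series = [(70.0, 8.0, 20.0), (78.0, 7.0, 12.0), (86.0, 5.0, 5.0)]
trajectory = [van_krevelen_point(*sample) for sample in series]

for h_c, o_c in trajectory:
    print(round(h_c, 3), round(o_c, 3))
print(ratios_decrease(trajectory))
```

Nothing in this calculation requires knowing which molecules are present, yet the resulting pair of numbers is what the classification and maturation estimates described above are built on.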

4. Conclusion

We have argued that, in order to produce a generalizable theory of the OoL, a necessary step may be to understand the properties and dynamical behaviour of ‘messy’ chemical systems, in which the products and processes are diverse and not easily enumerable. To achieve this, we have argued for a combination of empirical and theoretical approaches, each feeding off and informing the other. We have presented examples where the ‘messy’ approach has worked in other fields.

Achieving this in OoL research requires expanding the field’s emphasis to include a new kind of empirical study, where the focus is not on testing hypotheses about a specific reaction mechanism, but on measuring and understanding the overall behaviour of a complex, messy system. Publications of this kind are needed if a new theory, one that conceptualizes these systems as applicable to the OoL, is to be developed.

Acknowledgements

The authors thank members of ELSI and attendees of the workshop ‘Re-conceptualizing the origins of life’ for many discussions that stimulated this work.

Data accessibility

This article has no supporting data.

Authors' contributions

All authors contributed to the development of the ideas and drafting of the manuscript. All authors read and approved the manuscript.

Competing interests

The authors declare that they have no competing interests.

Funding

This work was supported by the World Premier International (WPI) Research Center Initiative, MEXT, Japan, and the ELSI Origins Network (EON), which is supported by a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.

References

  • 1. Miller SL. 1953. A production of amino acids under possible primitive Earth conditions. Science 117, 528–529. (doi:10.1126/science.117.3046.528)
  • 2. Wächtershäuser G. 2000. Life as we don’t know it. Science 289, 1307–1308. (doi:10.1126/science.289.5483.1307)
  • 3. Moser R, Claggett A, Matthews C. 1968. Peptide formation from aminomalononitrile (HCN trimer). Tetrahedron Lett. 9, 1605–1608. (doi:10.1016/S0040-4039(01)99012-4)
  • 4. McCollom TM, Ritter G, Simoneit BR. 1999. Lipid synthesis under hydrothermal conditions by Fischer-Tropsch-type reactions. Orig. Life Evol. Biosph. 29, 153–166. (doi:10.1023/A:1006592502746)
  • 5. Breslow R. 1959. On the mechanism of the formose reaction. Tetrahedron Lett. 1, 22–26. (doi:10.1016/S0040-4039(01)99487-0)
  • 6. Long DL, Burkholder E, Cronin L. 2007. Polyoxometalate clusters, nanostructures and materials: from self assembly to designer materials and devices. Chem. Soc. Rev. 36, 105–121. (doi:10.1039/B502666K)
  • 7. Otto S, Furlan RL, Sanders JK. 2000. Dynamic combinatorial libraries of macrocyclic disulfides in water. J. Am. Chem. Soc. 122, 12 063–12 064. (doi:10.1021/ja005507o)
  • 8. Otto S, Furlan RL, Sanders JK. 2002. Selection and amplification of hosts from dynamic combinatorial libraries of macrocyclic disulfides. Science 297, 590–593. (doi:10.1126/science.1072361)
  • 9. Cody GD, Heying E, Alexander CM, Nittler LR, Kilcoyne AD, Sandford SA, Stroud RM. 2011. Establishing a molecular relationship between chondritic and cometary organic solids. Proc. Natl Acad. Sci. USA 108, 19 171–19 176. (doi:10.1073/pnas.1015913108)
  • 10. Mamajanov I, Herzfeld J. 2009. HCN polymers characterized by solid state NMR: chains and sheets formed in the neat liquid. J. Chem. Phys. 130, 134503. (doi:10.1063/1.3092908)
  • 11. Minard RD, Hatcher PG, Gourley RC, Matthews CN. 1998. Structural investigations of hydrogen cyanide polymers: new insights using TMAH thermochemolysis/GC-MS. Orig. Life Evol. Biosph. 28, 461–473. (doi:10.1023/A:1006566125815)
  • 12. Benner SA. 2014. Paradoxes in the origin of life. Orig. Life Evol. Biosph. 44, 339–343. (doi:10.1007/s11084-014-9379-0)
  • 13. Copley SD, Smith E, Morowitz HJ. 2010. How began life: the emergence of sparse metabolic networks. J. Cosmol. 10, 3345–3361.
  • 14. Oparin A. 1924. Proiskhodenie Zhisni. Moscow, Russia: Moscovky Rabotchii. [In Russian.]
  • 15. Kauffman SA. 1986. Autocatalytic sets of proteins. J. Theor. Biol. 119, 1–24. (doi:10.1016/S0022-5193(86)80047-9)
  • 16. Pereto J. 2012. Out of fuzzy chemistry: from prebiotic chemistry to metabolic networks. Chem. Soc. Rev. 41, 5394–5403. (doi:10.1039/c2cs35054h)
  • 17. Virgo ND. 2016. Thresholds in messy chemistries. In Proc. of the Artificial Life Conf. 2016, Cancun, Mexico, 4–8 July 2016 (eds C Gershenson, T Froese, JM Siqueiros, W Aguilar, EJ Izquierdo, H Sayama), pp. 598–299. Cambridge, MA: MIT Press.
  • 18. Hanczyc MM. 2011. Metabolism and motility in prebiotic structures. Phil. Trans. R. Soc. B 366, 2885–2893. (doi:10.1098/rstb.2011.0141)
  • 19. Chandru K, Obayashi Y, Kaneko T, Kobayashi K. 2014. Formation of amino acid condensates partly having peptide bonds in a simulated hydrothermal environment. Viva Origino 41, 24–28.
  • 20. Benner SA, Kim HJ, Carrigan MA. 2012. Asphalt, water, and the prebiotic synthesis of ribose, ribonucleosides, and RNA. Acc. Chem. Res. 45, 2025–2034. (doi:10.1021/ar200332w)
  • 21. Kuhnert N, Dairpoosh F, Yassin G, Golon A, Jaiswal R. 2013. What is under the hump? Mass spectrometry based analysis of complex mixtures in processed food – lessons from the characterisation of black tea thearubigins, coffee melanoidines and caramel. Food Funct. 4, 1130–1147. (doi:10.1039/c3fo30385c)
  • 22. Garbow JR, Schaefer J, Ludicky R, Matthews C. 1987. Detection of secondary amides in hydrogen cyanide polymers by dipolar rotational spin-echo nitrogen-15 NMR. Macromolecules 20, 305–309. (doi:10.1021/ma00168a012)
  • 23. Oba T, Fukushima J, Maruyama M, Iwamoto R, Ikehara K. 2005. Catalytic activities of [GADV]-peptides. Orig. Life Evol. Biosph. 35, 447–460. (doi:10.1007/s11084-005-3519-5)
  • 24. Seewald JS. 2003. Organic-inorganic interactions in petroleum-producing sedimentary basins. Nature 426, 327–333. (doi:10.1038/nature02132)

Articles from Philosophical transactions. Series A, Mathematical, physical, and engineering sciences are provided here courtesy of The Royal Society
