Proceedings of the National Academy of Sciences of the United States of America. 2022 Jul 22;119(30):e2119734119. doi: 10.1073/pnas.2119734119

The emergence of interstellar molecular complexity explained by interacting networks

Miguel García-Sánchez a,b,c, Izaskun Jiménez-Serra a, Fernando Puente-Sánchez d, Jacobo Aguirre a,c,1
PMCID: PMC9335321  PMID: 35867830

Significance

The road to life is punctuated by transitions toward complexity, from astrochemistry to biomolecules and eventually, to living organisms. Disentangling the origin of such transitions is a challenge where the application of complexity and network theory has not been fully exploited. We introduce a computational framework in which simple networks simulate the most basic building bricks of life and interact to form complex structures, leading to an explosion of diversity when the parameter representing the environment reaches a critical value. While this model is abstract and unrelated to chemical theory, its predictions reliably mimic the molecular evolution in the interstellar medium during the transition toward chemical complexity, suggesting that the rules leading to the emergence of complexity may be universal.

Keywords: complex networks, complexity, astrochemistry, astrobiology, origin of life

Abstract

Recent years have witnessed the detection of an increasing number of complex organic molecules in interstellar space, some of them being of prebiotic interest. Disentangling the origin of interstellar prebiotic chemistry and its connection to biochemistry and ultimately, to biology is an enormously challenging scientific goal where the application of complexity theory and network science has not been fully exploited. Encouraged by this idea, we present a theoretical and computational framework to model the evolution of simple networked structures toward complexity. In our environment, complex networks represent simplified chemical compounds and interact optimizing the dynamical importance of their nodes. We describe the emergence of a transition from simple networks toward complexity when the parameter representing the environment reaches a critical value. Notably, although our system does not attempt to model the rules of real chemistry nor is dependent on external input data, the results describe the emergence of complexity in the evolution of chemical diversity in the interstellar medium. Furthermore, they reveal an as yet unknown relationship between the abundances of molecules in dark clouds and the potential number of chemical reactions that yield them as products, supporting the ability of the conceptual framework presented here to shed light on real scenarios. Our work reinforces the notion that some of the properties that condition the extremely complex journey from the chemistry in space to prebiotic chemistry and finally, to life could show relatively simple and universal patterns.


The origin of life on Earth is far from being unveiled. Life could have appeared spontaneously on our early planet about 4 billion y ago, or it could have existed previously in outer space and been brought by dust and meteoroids, as panspermia states. An intermediate hypothesis is molecular panspermia, which proposes that the original building blocks of life could have been produced in the interstellar medium (ISM) and introduced to the early Earth by asteroids and meteorites during the Late Heavy Bombardment that took place between 3.8 and 4.1 billion y ago, substantially enriching prebiotic chemistry. Over 200 molecules have been detected in the ISM, some of them prebiotically relevant, such as glycolaldehyde, urea, or ethanolamine (1–4). Prebiotic molecules, such as glycine or ribose, have indeed been found in meteorites and comets (5, 6), which supports the idea that prebiotic species could initially form in interstellar space and be transferred later to planetesimals and to Earth during the formation of the solar system.

In parallel to the efforts to understand the origin of life from a biochemical perspective, the development and advance of computation in the last 50 y fostered the study of life modeled as a cellular automaton (7, 8). Despite the simplicity of the rules, the different systems under study, in particular Conway’s Game of Life (9), presented an unexpected variety of spatiotemporal patterns and cast light on how complexity, emergence, and self-organization arise from a simple system. More complex systems were introduced in the eighties to model Darwinian evolution with the use of a new type of artificial life, where organisms described as computer programs could self-replicate, adapt, and mutate by natural selection, mostly competing for the control of the memory of the computer [e.g., CoreWar (10) and Avida (11, 12)]. The introduction of these digital organisms to address fundamental biological questions was supported by two main arguments. First, they provide a way to generalize life beyond the organisms detected so far in our biosphere. Second, they allow us to perform, enlarge, and repeat experiments on a scale that is unachievable with real entities (11).

In order to assess whether artificial life and its connection with complex network theory can shed light on the study of the origin of life, in this work we present a computational framework, NetWorld, where networks interact following very simple local rules inspired by network science and game theory, leading to a chemistry of networks. Our objective is to test, using real astrochemical data, whether a basic digital framework can reproduce certain general properties of the difficult transition from chemistry to biology and therefore describe at an abstract level the creation of the basic building blocks of life.

1. Results

In NetWorld, every chemical compound is represented by a network, and this allows us to apply the strength and tools of complex network theory. From simply isolated nodes that simulate an initial state of total lack of complexity, this artificial chemistry shows the emergence of a transition beyond which the environment permits the appearance of a rich variety of networks with different spectral, topological, and dynamical properties, mimicking in a very simplified manner the first steps of prebiotic chemistry and its natural evolution toward complexity. We will pay special attention to the descriptive and predictive ability of this framework, showing that the results throw light on the chemical evolution toward complexity of molecular clouds in the ISM and reveal a so far unknown relationship between the abundances of the molecules present in dark clouds and the number of chemical reactions that have them as products.

1.A. Description and Rules of the Model.

An artificial chemistry model is defined by three components: a set of all possible structures, a set of rules that govern the interaction among structures, and an algorithm that describes the reaction domain (13). Furthermore, depending on their level of abstraction, models can be classified as analogous, when they try to be faithful to natural chemistry, and abstract otherwise. According to these definitions, NetWorld could be understood as an extremely abstract artificial chemistry model, where nodes stand for indistinguishable basic entities (they are unweighted and have no different properties to represent the chemical valency or size of atoms, for example) and the bonds between nodes are represented by undirected and unweighted links. In a simple case, these entities could be molecules where nodes stand for atoms and links for their interactions. However, as its internal rules are so different from those that govern real chemistry, our approach differs drastically from the myriad of artificial chemistry models that quantitatively represent real processes through detailed descriptions of the physicochemical interaction between atoms (13–15).

Fig. 1 presents a visual description of the evolutionary dynamics described by NetWorld, and Fig. 1A shows a toy example with four initial nodes. A rigorous explanation of the algorithm is presented in SI Appendix, Supporting Information Texts S1–S3; an analysis of the dependence of the computation time on NetWorld’s parameters is introduced in SI Appendix, Supporting Information Text S4; and information on the public availability of the code is found in SI Appendix, Supporting Information Text S5. Each process is started with an initial number n(0) = N of isolated nodes. Nodes are neither created nor destroyed during the process. In each time step t, the population consists of n(t) networks made of the N nodes available to the total ensemble. At the beginning of each time step t, two networks A and B of the total population are chosen randomly. They interact following a simple set of rules explained below and in SI Appendix, Supporting Information Text S1.A. They either form a new network C, simulating the reaction A + B → C, or fail to join and remain as A and B. In the latter case, a different pair of networks A and B is chosen until one pair succeeds in forming a new network. At the end of the time step, every network i in the population has a partition probability

$$P_i = \frac{2}{1 + \exp(\mu_i \beta)}$$ [1]

of being divided into smaller pieces, simulating the reaction C → ∑ Dj (SI Appendix, Supporting Information Text S1.B has a full description of the partition algorithm). The stability parameter μi is the second smallest eigenvalue of the Laplacian matrix of network i. It is also known as the algebraic connectivity or the Fiedler eigenvalue and represents the resistance of a network to being split into different communities (16). The environment parameter β condenses all the physicochemical properties of the environment, such as the temperature or the radiation. β is constant during the whole process, and it is the only relevant global parameter of our model. When a new time step starts, the same process is repeated with the new collection of n(t+1) networks. Note that n(t+1) ≥ n(t) − 1 and n(t) ∈ [1, N] for all t. The process finishes when 1) all the nodes collapse into a single structure [i.e., n(t) = 1], 2) n(t) > 1 but no networks will accept any new connections, or 3) t reaches a limit value of 10^4 steps. A typical realization of the total process for an environment parameter β=2.5 and N = 20 initial nodes is plotted in Fig. 1C.
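As a minimal sketch of Eq. 1, the partition probability of a small network can be evaluated from its adjacency matrix. The function names and the two example networks are illustrative assumptions, not part of the NetWorld code, and the sketch assumes a connected network with at least two nodes.

```python
import numpy as np

def algebraic_connectivity(adjacency):
    """Second-smallest eigenvalue (Fiedler eigenvalue) of the Laplacian L = D - A."""
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    eigenvalues = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    return float(eigenvalues[1])

def partition_probability(adjacency, beta):
    """Eq. 1: P_i = 2 / (1 + exp(mu_i * beta))."""
    mu = algebraic_connectivity(adjacency)
    return float(2.0 / (1.0 + np.exp(mu * beta)))

# Illustrative networks (assumed for this sketch): two linked nodes and a 4-node path.
dimer = [[0, 1], [1, 0]]
path4 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

for beta in (0.0, 1.55, 2.5):
    print(beta, partition_probability(dimer, beta), partition_probability(path4, beta))
```

For β = 0 the probability is 1 for any network, and for a fixed β > 0 the less robust path (smaller μ) is more likely to split than the dimer.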

Fig. 1.


Description of the evolutionary dynamics created by NetWorld. (A) Sketch of the potential evolution of a simple system formed by four nodes. (B) In every time step of a simulation, two networks get in touch and interact until the system reaches a final network (defined by a Nash equilibrium) or a cycle of several alternating structures. The connector links (blue) join the original networks (red and green). (C) Example of a realization of the whole process with environment parameter β=2.5 and N = 20 initial isolated nodes. The process reaches an end when a single network of size 20 is created. Insets show the state of the system at specific times.

The interaction between the two networks A and B chosen randomly at time t to give rise to network C requires special attention (Fig. 1B). It is inspired by a connecting method already applied to socioeconomic networks (17, 18), and it is grounded in the extensive work devoted to the description of the competition/cooperation between networks developed during the last decade for ordinary interactions (19–22) or more recently, for higher-order interactions (23–25). We randomly choose one node a in A and one node b in B (connector nodes from now on) and connect them through an undirected link (connector link from now on). This new link is accepted only if both connector nodes a and b increase their dynamical importance in the network, measured as $I = \lambda_1 u_l$. Here, λ1 is the largest eigenvalue of the adjacency matrix of the new network formed by A, B, and the connector link just added, and u represents its associated eigenvector, L1 normalized such that $\sum_k u_k = 1$. The component $u_l$ is the eigenvector centrality of connector node l, and it measures the importance of a node based on how well connected it is and how important its neighbors are. Both λ1 and u are important measurements of the dynamical properties associated with a complex network. If there were a dynamical process of the type m(t+1) = G m(t) evolving on a network of adjacency matrix G, it is known that m(t) → u when t → ∞, and the population growth rate would be λ1 (17). The process of randomly choosing a new pair of nodes a in A and b in B (already connected through a connector link from a to b) and checking whether they accept a link or not is repeated, taking into account that, for simplicity and without loss of generality, only one connector link per node is allowed (18), and therefore, any preexisting connector links associated with nodes a and b are erased.
In general, this algorithm leads to the connection of A and B through a cascade and rewiring of connector links until the total network C=A+B reaches a Nash equilibrium or a cycle between several final configurations, and in the latter case, one of the final cyclic configurations is chosen randomly as C. If no links are accepted between A and B when all possible connections between both networks have been tried, we suppose that they do not react, and A and B remain unchanged. In this case, the time is not increased and the interaction between two other networks starts.
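The acceptance test for a single connector link can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the baseline against which the increase is measured (each connector node's importance inside its own network before joining) is an assumption of this sketch. The full rewiring cascade toward a Nash equilibrium is described in SI Appendix, Supporting Information Text S1.A and is not reproduced here.

```python
import numpy as np

def dynamical_importance(adjacency):
    """Return I_l = lambda_1 * u_l for every node l, with u L1 normalized."""
    a = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(a)  # ascending eigenvalues
    lam1 = eigenvalues[-1]
    u = np.abs(eigenvectors[:, -1])  # leading eigenvector, sign fixed to be positive
    return lam1 * u / u.sum()

def link_is_accepted(adj_a, adj_b, node_a, node_b):
    """Accept the connector link only if both connector nodes gain importance.

    Assumption of this sketch: the gain is measured against the importance
    each node had inside its own, still separate, network.
    """
    adj_a = np.asarray(adj_a, dtype=float)
    adj_b = np.asarray(adj_b, dtype=float)
    n_a, n_b = len(adj_a), len(adj_b)
    joined = np.zeros((n_a + n_b, n_a + n_b))
    joined[:n_a, :n_a] = adj_a
    joined[n_a:, n_a:] = adj_b
    joined[node_a, n_a + node_b] = joined[n_a + node_b, node_a] = 1.0
    before_a = dynamical_importance(adj_a)[node_a]
    before_b = dynamical_importance(adj_b)[node_b]
    after = dynamical_importance(joined)
    return bool(after[node_a] > before_a and after[n_a + node_b] > before_b)

# Two isolated nodes: each has importance 0 alone and 0.5 once linked.
print(link_is_accepted([[0]], [[0]], 0, 0))
```

Because u sums to one, the importances of all nodes in a network add up to λ1, which makes the measure easy to sanity check on small examples.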

In summary, our framework is grounded in the simulation of abstract networked entities that evolve following rules inherited from network science and game theory, and there is a total absence of machine learning techniques or any kind of fitting with real data in the different steps of the model.

1.B. Description of the Artificial Chemosphere Created at NetWorld: Transition toward Complexity.

Every simulation of the whole process throughout this work started from N = 40 initial isolated nodes, lasted a maximum of 10^4 time steps, and was repeated 25 times for each value of the environment parameter β in the range β ∈ [0, 2.7]. The limit case β → ∞ does not permit any network partition (P = 0 in Eq. 1) and is also studied. All structures detected at the end of any time step for all realizations of each β, even if they were destroyed later, represent its diversity. The relative abundance of each configuration is given by the probability of finding it in the set of networks accumulated during all times and realizations (SI Appendix, Supporting Information Text S2 has a detailed explanation of how to compute the number of different configurations and the relative abundance in a simulation).

1.B.1. Diversity vs. environment.

We start by focusing on how the environment shapes the networks emerging in the system. Fig. 2 shows the topological and structural description of the diversity created for a wide range of values of the environment parameter. For β = 0, every network of two nodes (i.e., the first to be created from isolated nodes) immediately breaks up, and only isolated nodes are found for all times. When β grows, the partition probability P of a network decreases (Eq. 1), and diverse structures emerge; however, it is not until large values of β are reached (β=2.5 for N = 40) that networks of the maximum possible size are created, as shown in Fig. 2B.* The size, the mean degree k, the largest eigenvalue of the adjacency matrix λ1, and the degree entropy of the created structures H (plotted in Fig. 2B–E) positively correlate with β, as shown in the first component of the principal components analysis (PCA) calculated for the whole ensemble of existing structures and plotted in Fig. 2G (26). On the contrary, the average stability of the networks μ̄ correlates negatively with β (Fig. 2F and the second component of the PCA in Fig. 2G), as more extensive and heterogeneous structures (i.e., with large entropy H) are in general easier to divide, but the low partition probability of networks for large β permits their appearance and survival.

Fig. 2.


Description of the diversity of structures created by NetWorld for different values of the environment parameter β and N = 40 initial nodes. (A) Histogram of network sizes (number of nodes in a network) for β=1.8. A log-normal fitting is plotted with a bold line (goodness of fit of r = 0.993). Dependence on β of (B) the size of the networks, (C) the average degree k, (D) the maximum eigenvalue λ1, (E) the degree entropy H [calculated as $H = -\sum_{i=1}^{K} p_i \log_2(p_i)$, where $p_i$ is the fraction of nodes of degree i and K is the maximum degree], and (F) the stability parameter μ. (G) Principal components analysis (PCA) of the system: first (blue; 54% of variability) and second (yellow; 21% of variability) components. (H) Number of different configurations Nconf as a function of β. Exponential approximations are plotted as dashed lines [Nconf = 2.56 exp(1.43β), r = 0.971 for β < βcrit and Nconf = 0.03 exp(4.25β), r = 0.998 for β > βcrit]. The critical value of β (βcrit ≈ 1.55) is marked. In B–F, every single network is represented by a blue dot, and the average value and the 10 to 90 percentile range for each β are plotted in red.
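The degree entropy H defined in the caption of Fig. 2E can be illustrated with a short sketch. The function name and the two example degree sequences are assumptions made for illustration only.

```python
import math
from collections import Counter

def degree_entropy(degrees):
    """H = -sum_i p_i * log2(p_i), where p_i is the fraction of nodes of degree i."""
    counts = Counter(degrees)
    n = len(degrees)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

# A 4-node ring is perfectly regular; a 4-node star mixes two degrees.
ring_degrees = [2, 2, 2, 2]
star_degrees = [3, 1, 1, 1]
print(degree_entropy(ring_degrees), degree_entropy(star_degrees))
```

Regular structures give H = 0, whereas heterogeneous structures give larger values, which matches the use of H as a measure of heterogeneity in the text.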

Importantly, in Fig. 2H, the growth of the number of different configurations with β undergoes a transition around βcrit=1.55 such that beyond that critical point, the diversity grows exponentially faster than before it. While the 10 to 90 percentile range plotted in red in Fig. 2B–F shows that the population is formed by a large number of networks with similar topological properties, for β>βcrit there is also a rare chemosphere of structures whose topologies differ greatly from that of the majority, introducing a large amount of variability. Note that the existence of long tails of low-abundance entities is a typical property of real complex systems, notably ecological communities (27, 28). In summary, for low values of β, the main bricks of future complexity are formed, but only when a critical state of the environment is reached may these motifs be sufficiently abundant and interact successfully to enrich the system with a large number of new structures. This new population is made of very diverse entities, where both 1) regular/robust structures and 2) heterogeneous/less stable networks can be detected. The former are made out of very similar blocks that, from the perspective of information theory, would not code complex information but show redundancy, a fundamental property to prevent attacks and failures when basic tasks must be developed. The latter consist of complex networks where more information could be coded, but they are more sensitive to divisions, external perturbations, or losses of nodes.

1.B.2. Abundance vs. environment.

Fig. 3 describes the relative abundance of the networked structures obtained for the different environments: that is, the probability of finding them in the set of networks accumulated during all times and realizations. In Fig. 3A, we plot the relative abundance of each configuration as a function of β, and it is clear that the simplest structures are created at low values of β and are especially abundant. A gradual appearance of larger and/or more complex configurations takes place for moderate β, and beyond β=βcrit, a cascade of new configurations leads to the exponential emergence of diversity already measured in Fig. 2H. The sharpness of the transition toward complexity is especially visible in Fig. 3B, where the normalized entropy of the total ensemble of networks Hnorm (29) shows a transition from a linear growth with β to a constant value. For β<βcrit, the abundances of the few structures that exist tend to be more uniformly distributed when β grows (Fig. 3A), increasing the entropy of the system (which reaches the maximum value Hnorm = 1 when all abundances are equal; the mathematical expression for Hnorm is in Fig. 3). However, once the critical environment is surpassed, the growth in complexity due to the tendency toward a uniform distribution of relative abundances is balanced by the exponential emergence of diversity. Finally, in Fig. 3C, we plot the abundance ranks of the population for different values of β. For low β, the curves are very skewed and show exponential decays, typical behavior of ecological environments with little diversity (30). When β grows and crosses the critical value (brown line), the curves gradually lose skewness and become power laws, showing long tails as a consequence of a large diversity of rare configurations, and tend toward the limit case of the process in which β → ∞ and the structures cannot divide.

Fig. 3.


Analysis of the relative abundance of structures created by NetWorld for different values of the environment parameter β and N = 40 initial nodes. (A) Abundance of each structure as a function of β. Each color represents a different structure (some of them shown in the insets), but all structures that emerged for β>βcrit are plotted in dark blue for clarity. (B) Normalized entropy Hnorm as a function of β. $H_{norm} = -\sum_{i=1}^{N_{conf}} p_i \log_2 p_i / \log_2(N_{conf})$, where $p_i$ is the relative abundance of configuration i and Nconf is the number of different configurations. Linear approximations are plotted as dashed lines: Hnorm = (0.006 ± 0.004) + (0.121 ± 0.005)β for β < βcrit and Hnorm = (0.20 ± 0.01) − (0.006 ± 0.005)β for β > βcrit. The critical value βcrit ≈ 1.55 is marked. (C) Abundance rank for different values of β (excluding isolated nodes). β ∈ [0.1, 2.7]; β grows from left to right in intervals of Δβ=0.2. The abundance rank for βcrit=1.55 is plotted as the brown line, and that for β → ∞ (i.e., when the environment does not permit any network partition) is plotted as black circles.
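The normalized entropy of Fig. 3B can be sketched as below. The function name and the example counts are illustrative assumptions, and returning 0 when only one configuration exists is a convention adopted for this sketch, since the expression is otherwise undefined for Nconf = 1.

```python
import math

def normalized_entropy(abundances):
    """H_norm = -sum_i p_i * log2(p_i) / log2(N_conf) over observed configurations."""
    total = sum(abundances)
    probs = [a / total for a in abundances if a > 0]
    if len(probs) < 2:
        return 0.0  # assumption of this sketch: a single configuration has no entropy
    entropy = sum(-p * math.log2(p) for p in probs)
    return entropy / math.log2(len(probs))

# Hypothetical configuration counts: equal abundances vs. one dominant configuration.
uniform_counts = [5, 5, 5, 5]
skewed_counts = [97, 1, 1, 1]
print(normalized_entropy(uniform_counts), normalized_entropy(skewed_counts))
```

Equal abundances give the maximum value Hnorm = 1, while a population dominated by one configuration gives a value close to 0.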

1.C. Application to a Real Scenario: Chemical Complexity in the Interstellar Medium.

Molecules are an important component of the ISM since they regulate its ionization state and energy dissipation. Molecules are typically found in interstellar clouds where the amount of interstellar dust and thus, of visual extinction Av is large enough to prevent the photodissociation of molecular species by the external interstellar ultraviolet (UV) radiation field, which enables their formation and survival. The level of chemical complexity in interstellar clouds is, however, very different depending on their level of extinction and on the available amount of molecular hydrogen (H2) and carbon monoxide (CO) within them (31). In this way, interstellar clouds can be classified as diffuse atomic [with a fraction f of H2 with respect to the total amount of atomic H of f(H2) <10%], diffuse molecular [f(H2) >10%], translucent [f(H2) >10% and with a fraction of CO with respect to the total amount of atomic C of f(CO) <90%], or dense clouds [f(H2) >10% and f(CO) >90%]. The chemistry in diffuse atomic clouds is very limited (31), while the chemistry in dense clouds presents a very high level of chemical complexity (e.g., refs. 32 and 33). Therefore, we use here the molecular abundances measured toward diffuse molecular, translucent, and dense clouds as test cases for the applicability of the digital environment NetWorld to real scenarios.

In Fig. 4A, we show the abundances of the chemical compounds detected toward four interstellar clouds, ranked in order of decreasing magnitude: 1) the interstellar cloud ζ Ophiuchi, a diffuse molecular cloud where only a few molecules have been found (31) (dust extinction is so weak [Av=1.06 mag] that UV radiation destroys most of the molecular material); 2) the translucent cloud located in the direction of the ultracompact HII region K4 in the Sagittarius B2 massive star-forming region (34, 35) (this cloud has an extinction of Av = 2.0 mag, just enough to enable the formation of new molecular species and to protect the molecular content recently formed within the cloud, playing the role of the critical transition in the model); and 3) L134N (Serpens) and 4) TMC-1 (Taurus), two dense clouds with Av>10 mag where extensive numbers of both simple and complex molecules have been synthesized thanks not only to the protection of the high Av but also to the high fraction of CO present (36). We refer to SI Appendix, Supporting Information Text S6 for an explanation of how we have obtained the molecular abundances toward the different clouds used in our analysis. Fig. 4B shows the abundance rank obtained with NetWorld for four different environments that are qualitatively compatible with these four sets of real data: a low value of β representing a harsh environment where most created compounds are rapidly destroyed, a value close to the critical βcrit beyond which complexity expands, and two values of β slightly over this transition point. We include the critical environment βcrit=1.55 for comparison. In Fig. 4C and D, we focus on the potential quantitative agreement between real data and the results of the artificial framework. We compare the abundance ranks for TMC-1 and L134N with those of NetWorld’s β=1.8 and 1.7, respectively.
The abundances are provided excluding the most frequent elements of each ensemble, H2 in the molecular abundances and the isolated nodes in NetWorld. The real and numerical curves show a quantitative agreement between the number of molecules present in the cloud and the number of configurations in NetWorld and also, in the relative abundance for a large set of molecules and configurations. The framework does not reproduce, however, the truncation shown in the real curves for the two to three lowest abundances, but this behavior would disappear if new real data were introduced. Note that the astrochemical datasets of molecular species and their measured abundances remain largely incomplete even for the most observed clouds, such as TMC-1, and especially for low-abundance species or very large molecules that are more difficult to detect (e.g., refs. 32 and 37). It is also remarkable that the values of β that best fit the astrochemical data of L134N and TMC-1 (β=1.7 and 1.8) are slightly beyond βcrit=1.55 and thus, belong to a regime in which expansion of their chemical diversity is expected. Indeed, recent observational works toward the TMC-1 dense cloud have revealed the presence of small polycyclic aromatic hydrocarbons, demonstrating the ability of these environments to generate complex molecular structures (32, 33, 37).
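The construction of an abundance rank curve, as compared in Fig. 4, can be sketched as follows. The function name and the numerical values are hypothetical and are not measured abundances; the order of operations (dropping the most frequent element first and then applying the L1 normalization) is an assumption of this sketch.

```python
def abundance_rank(abundances):
    """Drop the most abundant entry, L1 normalize the rest, and sort descending."""
    ranked = sorted(abundances.items(), key=lambda item: item[1], reverse=True)
    rest = ranked[1:]  # exclude the most frequent element (H2, or isolated nodes)
    total = sum(value for _, value in rest)
    return [(name, value / total) for name, value in rest]

# Hypothetical values chosen only to illustrate the procedure.
example = {"H2": 1000.0, "CO": 80.0, "X": 20.0}
print(abundance_rank(example))
```

The same function applies to NetWorld output if the keys identify configurations and the most frequent entry corresponds to the isolated nodes.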

Fig. 4.


Comparison between the results of the digital environment NetWorld (×) and the evolution of chemical complexity of interstellar clouds, the astrochemical environment where the most basic bricks of life are created (°). All abundance sets shown were L1 normalized (the sum is one). (A) Abundance rank of the molecular compounds detected in four interstellar clouds: the diffuse molecular cloud ζ Ophiuchi (31), the translucent cloud located in the direction of the ultracompact HII region K4 in the SgrB2 molecular complex (a cloud in the transition from the diffuse regime to the dense regime) (34, 35), and the dense clouds L134N (Serpens) and TMC-1 (Taurus) (36). (B) Abundance rank obtained in NetWorld for N = 40 initial nodes and environments that are qualitatively compatible with the four sets of real data plotted in A. The abundance rank for βcrit=1.55 is plotted as the brown line. (C and D) Comparison between the abundances in TMC-1 and L134N dense clouds with NetWorld simulations for β=1.8 and 1.7, respectively. (E) Dependence of abundances of the molecules detected in L134N and TMC-1 on the number of astrophysical reactions that have them as products. (F) Dependence of abundances of networks of size 10 created by NetWorld (in the limit β → ∞ for simplicity) on the number of paths used to create them. Curve fits are shown in E and F as dashed lines. Note that CO abundances are out of range in E for clarity but were considered in the fits.

Finally, a relevant pattern obtained in NetWorld is that the relative abundances of the different structures correlate with the number of paths identified to create them, following a very simple functional dependence of the type $y \propto x^{\alpha}$ (Fig. 4F) (α = 1.2 ± 0.2 for networks of size 10, N = 40, and β → ∞) (SI Appendix, Supporting Information Text S3 has details on how to compute the number of paths to reach a configuration). Note that this is a highly nontrivial result, as different paths have in general very different occurrence probabilities, and this expected diversity could in principle spoil the correlation. In order to check whether a similar relationship might also emerge in astrochemical environments, we plotted in Fig. 4E the dependence of the molecular abundances measured toward the dense clouds L134N and TMC-1 on the number of chemical reactions that have them as products as a simple proxy for the chemical paths. The number of chemical reactions is extracted from the astrochemical reaction dataset KIDA (KInetic Database for Astrochemistry) (38). Surprisingly, the abundance data correlate with the number of reactions following the same functional dependence as NetWorld’s simulations ($y \propto x^{\alpha}$, where $\alpha_{TMC-1} = 1.0 \pm 0.2$, r = 0.57, and $p = 2 \cdot 10^{-6}$ and $\alpha_{L134N} = 1.2 \pm 0.3$, r = 0.54, and $p = 6 \cdot 10^{-4}$) (SI Appendix, Supporting Information Text S7). In addition, the α-coefficient of the power law that relates these two magnitudes is close to one, indicating that the abundance of a certain molecule would be expected to be approximately proportional to the number of reactions that create it.
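One common way to estimate the exponent α of a dependence of the type y ∝ x^α is a linear least-squares fit in log-log space. The sketch below follows that approach as an assumption; the fitting procedure actually used is described in SI Appendix, Supporting Information Text S7 and may differ. The function name and the data are synthetic and chosen only for illustration.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of log10(y) = log10(c) + alpha * log10(x).

    Returns (alpha, c, r), where r is the Pearson correlation in log-log space.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    alpha = sxy / sxx
    c = 10 ** (mean_y - alpha * mean_x)
    r = sxy / math.sqrt(sxx * syy)
    return alpha, c, r

# Synthetic data following y = 3 * x**1.2 exactly (not taken from KIDA or NetWorld).
paths = [1, 2, 4, 8, 16, 32]
abundance = [3 * p ** 1.2 for p in paths]
alpha, c, r = fit_power_law(paths, abundance)
print(alpha, c, r)
```

With noisy data, r drops below one and the uncertainty of α has to be estimated separately, for example from the standard error of the slope.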

We caution, however, that the latter analysis could suffer from important biases. First, the molecular inventory of the ISM is far from complete. Small molecules that do not have dipole moment, such as N2 or CH4, cannot be observed at radio wavelengths. Furthermore, large molecules not only show low abundances but also their partition function is so large that it spreads the emission across many energy levels, making the molecular line intensities very weak. Recent spectroscopic surveys carried out toward chemically rich sources, such as SgrB2 N2 (39), IRAS16293-2422 (40), TMC-1 (32, 37), or G+0.693 (e.g., refs. 3 and 4) with ALMA (Atacama large millimeter/submillimeter array), the GBT (Green Bank telescope), or the Yebes 40m telescope, have boosted the detection rate of new molecular species in the ISM in the past decade (reviewed in ref. 41), which is starting to alleviate this potential issue. Second, the astrochemical reaction databases are strongly biased to small molecules. All these chemical networks are based on the pioneering work in ref. 42, which had the goal of investigating the interstellar chemistry of small molecular species. In addition, experimental and theoretical works of gas-phase reactions are extremely challenging especially at the low temperatures typical of interstellar conditions, in particular of radical–radical reactions (43, 44), and not all reactions yielding the same product are equally efficient due to the existence of, for example, high energy barriers. Because of all these reasons, the number of detected molecules and the datasets of reactions that can have them as products are inaccurate.

To address these limitations, we have statistically verified that the correlations between the molecular abundances and the number of reactions shown in Fig. 4E for the interstellar clouds TMC-1 and L134N still hold when removing the effect of a controlling variable, such as the molecular size, understood as the number of atoms contained within a molecule (partial correlation coefficient r = 0.49 and $p = 7 \cdot 10^{-5}$ for TMC-1 and r = 0.42 and p = 0.007 for L134N) (SI Appendix, Supporting Information Text S7 has more details). This result indicates that the detected correlation is independent of potential biases due to the molecular size. A thorough statistical analysis, considering other astrochemical magnitudes related to the molecules found in different astrophysical environments, will be carried out in the future.
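A first-order partial correlation, which removes the linear effect of one controlling variable z from the correlation between x and y, can be sketched with the standard formula r_xy·z = (r_xy − r_xz r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)). Whether the analysis in SI Appendix uses exactly this estimator or works on logarithmic values is not stated here, so this is an assumption; the function names and the data are synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Synthetic example in which z is uncorrelated with both x and y,
# so controlling for z leaves the correlation between x and y unchanged.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
z = [1, -1, -1, 1]
print(pearson(x, y), partial_correlation(x, y, z))
```

In the setting of the text, x would play the role of the number of reactions, y the molecular abundance, and z the number of atoms in the molecule.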

2. Discussion

In this work, we have introduced a conceptual and computational framework called NetWorld describing the evolution of networked structures, where nodes interact following exclusively the optimization of their own dynamical importance. Our results show that, although there is no causal relationship between NetWorld’s framework and real astrochemical phenomena, a simple model grounded in network science and game theory captures key properties of the process toward chemical complexity and the creation of the building blocks of life. In particular, while our approach does not try to mimic real astrochemistry, it succeeds in explaining the emergence of interstellar molecular complexity and points out as yet unknown astrochemical relationships that could be of importance in our understanding of the formation of prebiotic species in interstellar space.

Multiple methods have been proposed to map real chemistry to artificial chemistry models. Graphs, binary strings, and character strings or numbers, among others, have been used to represent molecules (45–48). Our networks describe molecules where atoms are nodes and their interactions are links. Beyond this mapping, one could add properties and impose more complex rules of interaction in order to get closer to real chemistry. Node labels and properties might be introduced to represent the atom type, hybridization type, charge, valency, or radicals. Moreover, an energy function could be used to choose the structure resulting from a reaction on the basis of the change of energy (14). Based on the current state of our framework, adding these features in NetWorld is possible and a priori, computationally feasible. However, we stress that fundamental features of NetWorld are its simplicity and abstraction, as we aim to present a framework that transcends chemistry to describe the interaction between complex structures of different nature and in diverse environments from nodes representing atoms to biomolecules or even species.
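The mapping described above, and the kind of optional extension it mentions, can be sketched in a few lines of Python. This is a hypothetical illustration and not the NetWorld implementation (which is written in MATLAB): the class name, its methods, and the valency table are assumptions introduced here, and links are treated as single bonds only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

import numpy as np

# Usual maximum valencies of a few elements, used only for this illustration.
MAX_VALENCY = {"H": 1, "O": 2, "N": 3, "C": 4}


@dataclass
class Network:
    """Undirected network: nodes stand for atoms, links for their interactions."""
    n_nodes: int
    edges: List[Tuple[int, int]]
    labels: Optional[List[str]] = None  # None = indistinguishable nodes

    def adjacency(self) -> np.ndarray:
        a = np.zeros((self.n_nodes, self.n_nodes), dtype=int)
        for i, j in self.edges:
            a[i, j] = a[j, i] = 1  # symmetric: links have no direction
        return a

    def degrees(self) -> np.ndarray:
        return self.adjacency().sum(axis=1)

    def respects_valency(self) -> bool:
        """Hypothetical extra rule: no labeled node exceeds its valency."""
        if self.labels is None:
            return True  # unlabeled nodes carry no chemical constraint
        return all(d <= MAX_VALENCY[lab]
                   for d, lab in zip(self.degrees(), self.labels))


star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
unlabeled = Network(5, star_edges)                              # abstract star
methane = Network(5, star_edges, ["C", "H", "H", "H", "H"])     # same topology
overbonded = Network(5, star_edges, ["O", "H", "H", "H", "H"])  # O with 4 links

print(unlabeled.degrees(), methane.respects_valency(), overbonded.respects_valency())
```

The same star topology is a valid abstract network, a valid labeled molecule, or an invalid one depending only on the labels. This shows how chemical constraints could be layered on top of the unlabeled representation without changing it.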

In the same line of thought, the natural transition toward complexity that emerged from our computational environment has also been observed in complex ecosystems. In particular, Fisher and Mehta (30) reported a strikingly similar pattern in which the skewness of biodiversity rank–abundance curves decreased and the overall diversity increased with carrying capacity, highlighting the sharp transition between stochastic neutral regimes and selection-dominated niche regimes. The environment parameter β introduced here and the dust extinction Av in interstellar chemistry, two quantities directly related since the rate constants of interstellar UV photodestruction reactions depend exponentially on Av (49), could be understood as loose proxies for the carrying capacity in the ecological context, in the sense that low β/low Av results in a “harsher” environment that limits network/molecules/species richness. Continuing the analogy, in the low-β/low-Av regime, our network communities and the interstellar molecular abundances show a highly skewed abundance distribution (Fig. 4A and B) and are indeed dominated by stochasticity (the partition probability P introduced in Eq. 1 and the interaction with UV radiation, respectively). On the other hand, in the high-β/high-Av regime, they present low skewness and high diversity and are dominated by the selection of structures with a higher number of paths/reactions leading to them (as shown in Fig. 4E and F). All in all, we believe that the similarities between 1) the results in ref. 30 based on models that are firmly rooted in classical ecological theory and checked with real data, 2) those obtained from molecular abundances in interstellar clouds, and 3) the ones introduced by our computational environment, derived from a simple framework with no a priori ecological or chemical assumptions, are not coincidental. They instead hint that the long path from the creation of the basic prebiotic compounds in the ISM to the origin of life and its evolution on the early Earth could show universal patterns and common phenomenologies at all scales and across all stages.
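The contrast drawn above between a highly skewed, low-diversity regime and a low-skewness, high-diversity regime can be made concrete with two simple summary statistics. The Python sketch below uses the Shannon entropy as the diversity measure and the third standardized moment as the skewness; both choices are assumptions made here for illustration and are not necessarily the measures used in this work or in ref. 30. The two abundance vectors are invented.

```python
import numpy as np


def shannon_diversity(abundances):
    """Shannon entropy (natural log) of the relative abundances."""
    counts = np.asarray(abundances, dtype=float)
    p = counts[counts > 0] / counts.sum()  # drop absent types, then normalize
    return float(-(p * np.log(p)).sum())


def skewness(abundances):
    """Third standardized moment (population form) of the abundances."""
    x = np.asarray(abundances, dtype=float)
    centered = x - x.mean()
    return float((centered**3).mean() / (centered**2).mean() ** 1.5)


# Invented communities with the same number of types (illustration only).
harsh = [1000, 10, 5, 2, 1, 1, 1, 1]            # one dominant type
mild = [130, 128, 127, 126, 125, 125, 124, 124]  # nearly even abundances

for name, community in (("harsh", harsh), ("mild", mild)):
    print(f"{name}: diversity = {shannon_diversity(community):.2f}, "
          f"skewness = {skewness(community):.2f}")
```

In this toy comparison the community dominated by a single type has the lower diversity and the higher skewness, the qualitative signature attributed above to the low-β/low-Av regime.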

Finally, while we have exclusively used sets of isolated and indistinguishable nodes as initial conditions and have focused on the description of the emerging diversity created, the framework introduced here could be of use in many other contexts. Advancing the subject of the origin and evolution of early life, potentially fruitful lines of future work could be the analysis of the interaction of small motifs to simulate the polymerization of simple chemical compounds, resembling the phenomenology present in the RNA World, or the search for autocatalytic reactions that could help us advance toward a protoreplication of networks, where the concepts of mutation and fitness could be analyzed as emerging properties of the system instead of introducing them ad hoc as has been done so far in the literature.

Supplementary Material

Supplementary File
pnas.2119734119.sapp.pdf (923.1KB, pdf)

Acknowledgments

We acknowledge conversations with A. Aguirre-Tamaral, S. Avero, C. Briones, M. Castro, M. Fernández-Ruz, J. García de la Concepción, J. A. García-Martín, R. Guantes, D. Hochberg, J. Iranzo, A. Lucía-Sanz, S. Manrubia, and M. Ruiz-Bermejo and technical support from N. Aguirre and J. Aguirre. I.J.-S. and J.A. received support from Projects PID2019-105552RB-C41, PID2021-122936NB-I00, and MDM-2017-0737 Unidad de Excelencia “María de Maeztu”-Centro de Astrobiología (CSIC-INTA) by the Spanish Ministry of Science and Innovation/State Agency of Research MCIN/AEI/10.13039/501100011033 and by “Fondo Europeo de Desarrollo Regional (FEDER) Una manera de hacer Europa,” through Project ESP2017-86582-C4-1-R. F.P.-S. was supported by the European Union’s Horizon 2020 Research and Innovation Programme under Marie Skłodowska-Curie Grant 892961.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission. S.W.W. is a guest editor invited by the Editorial Board.

*Note that, in opposition to real chemistry, in our model all nodes are equal, and therefore, there are very few potential configurations of small size (e.g., there is only one for size 2, while there is a large number of molecules of two atoms in nature).

We calculated the number of paths of each configuration for β= (with no loss of generality), as it is the case where there is no network partition and the computation is simpler.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2119734119/-/DCSupplemental.

Data Availability

MATLAB implementation of the codes used has been deposited in GitHub (https://github.com/MiguelGarciaSanchez/NetWorld) (50). Previously published data were used for this work; they were taken from tables of abundances of molecules in different molecular clouds in refs. 31 and 34–36 (SI Appendix, Supporting Information Text S6 has more details).

References

  • 1. Hollis J. M., Jewell P. R., Lovas F. J., Remijan A., Green Bank Telescope observations of interstellar glycolaldehyde: Low-temperature sugar. Astrophys. J. 613, L45–L48 (2004).
  • 2. Belloche A., et al., Re-exploring molecular complexity with ALMA (ReMoCA): Interstellar detection of urea. Astron. Astrophys. 628, A10 (2019).
  • 3. Jiménez-Serra I., et al., Toward the RNA-world in the interstellar medium—Detection of urea and search of 2-amino-oxazole and simple sugars. Astrobiology 20, 1048–1066 (2020).
  • 4. Rivilla V. M., et al., Discovery in space of ethanolamine, the simplest phospholipid head group. Proc. Natl. Acad. Sci. U.S.A. 118, e2101314118 (2021).
  • 5. Altwegg K., et al., Organics in comet 67P: A first comparative analysis of mass spectra from ROSINA–DFMS, COSAC and Ptolemy. Mon. Not. R. Astron. Soc. 469 (suppl. 2), S130–S141 (2017).
  • 6. Furukawa Y., et al., Extraterrestrial ribose and other sugars in primitive meteorites. Proc. Natl. Acad. Sci. U.S.A. 116, 24440–24445 (2019).
  • 7. Neumann J. V., Burks A. W., Theory of Self-Reproducing Automata (University of Illinois Press, 1966).
  • 8. Wolfram S., Statistical mechanics of cellular automata. Rev. Mod. Phys. 55, 601–644 (1983).
  • 9. Gardner M., The fantastic combinations of John Conway’s new solitaire game “life”. Sci. Am. 223, 120–123 (1970).
  • 10. Rasmussen S., Knudsen C., Feldberg R., Hindsholm M., The coreworld: Emergence and evolution of cooperative structures in a computational chemistry. Physica D 42, 111–134 (1990).
  • 11. Lenski R. E., Ofria C., Collier T. C., Adami C., Genome complexity, robustness and genetic interactions in digital organisms. Nature 400, 661–664 (1999).
  • 12. Adami C., Ofria C., Collier T. C., Evolution of biological complexity. Proc. Natl. Acad. Sci. U.S.A. 97, 4463–4468 (2000).
  • 13. Dittrich P., Ziegler J., Banzhaf W., Artificial chemistries: A review. Artif. Life 7, 225–275 (2001).
  • 14. Benkö G., Flamm C., Stadler P. F., A graph-based toy model of chemistry. J. Chem. Inf. Comput. Sci. 43, 1085–1093 (2003).
  • 15. Banzhaf W., Yamamoto L., Artificial Chemistries (MIT Press, Cambridge, MA, 2010).
  • 16. Newman M. E. J., Networks: An Introduction (Oxford University Press, Inc., New York, NY, 2010).
  • 17. Iranzo J., Buldú J. M., Aguirre J., Competition among networks highlights the power of the weak. Nat. Commun. 7, 13273 (2016).
  • 18. Iranzo J., Pablo-Martí F., Aguirre J., Emergence of complex socioeconomic networks driven by individual and collective interests. Phys. Rev. Res. 2, 043352 (2020).
  • 19. Gómez-Gardeñes J., Reinares I., Arenas A., Floría L. M., Evolution of cooperation in multiplex networks. Sci. Rep. 2, 620 (2012).
  • 20. Aguirre J., Papo D., Buldú J. M., Successful strategies for competing networks. Nat. Phys. 9, 230–234 (2013).
  • 21. Wang Z., Szolnoki A., Perc M., Self-organization towards optimally interdependent networks by means of coevolution. New J. Phys. 16, 033041 (2014).
  • 22. Wang Z., Wang L., Szolnoki A., Perc M., Evolutionary games on multilayer networks: A colloquium. Eur. Phys. J. B 88, 124 (2015).
  • 23. Alvarez-Rodriguez U., et al., Evolutionary dynamics of higher-order interactions in social networks. Nat. Hum. Behav. 5, 586–595 (2021).
  • 24. Kumar A., Chowdhary S., Capraro V., Perc M., Evolution of honesty in higher-order social networks. Phys. Rev. E 104, 054308 (2021).
  • 25. Bianconi G., Higher-Order Networks (Cambridge University Press, 2021).
  • 26. Jolliffe I. T., Cadima J., Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 374, 20150202 (2016).
  • 27. McGill B. J., et al., Species abundance distributions: Moving beyond single prediction theories to integration within an ecological framework. Ecol. Lett. 10, 995–1015 (2007).
  • 28. Pedrós-Alió C., The rare bacterial biosphere. Annu. Rev. Mar. Sci. 4, 449–466 (2012).
  • 29. Gregori J., et al., Viral quasispecies complexity measures. Virology 493, 227–237 (2016).
  • 30. Fisher C. K., Mehta P., The transition between the niche and neutral regimes in ecology. Proc. Natl. Acad. Sci. U.S.A. 111, 13111–13116 (2014).
  • 31. Snow T. P., McCall B. J., Diffuse atomic and molecular clouds. Annu. Rev. Astron. Astrophys. 44, 367–414 (2006).
  • 32. Cernicharo J., et al., Pure hydrocarbon cycles in TMC-1: Discovery of ethynyl cyclopropenylidene, cyclopentadiene and indene. Astron. Astrophys. 649, L15 (2021).
  • 33. McGuire B. A., et al., Detection of two interstellar polycyclic aromatic hydrocarbons via spectral matched filtering. Science 371, 1265–1269 (2021).
  • 34. Thiel V., Belloche A., Menten K. M., Garrod R. T., Müller H. S. P., Complex organic molecules in diffuse clouds along the line of sight to Sagittarius B2. Astron. Astrophys. 605, L6 (2017).
  • 35. Corby J. F., McGuire B. A., Herbst E., Remijan A. J., The molecular chemistry of diffuse and translucent clouds in the line-of-sight to SGR B2: Absorption by simple organic and inorganic molecules in the GBT PRIMOS survey. Astron. Astrophys. 610, A10 (2018).
  • 36. Agúndez M., Wakelam V., Chemistry of dark clouds: Databases, networks, and models. Chem. Rev. 113, 8710–8737 (2013).
  • 37. McGuire B. A., 2018 Census of interstellar, circumstellar, extragalactic, protoplanetary disk, and exoplanetary molecules. Astrophys. J. Suppl. Ser. 239, 17 (2018).
  • 38. Wakelam V., et al., The 2014 KIDA network for interstellar chemistry. Astrophys. J. Suppl. Ser. 217, 20 (2015).
  • 39. Belloche A., Müller H. S. P., Garrod R. T., Menten K. M., Exploring molecular complexity with ALMA (EMoCA): Deuterated complex organic molecules in Sagittarius B2(N2). Astron. Astrophys. 587, A91 (2016).
  • 40. Jørgensen J. K., et al., The ALMA protostellar interferometric line survey (PILS). First results from an unbiased submillimeter wavelength line survey of the Class 0 protostellar binary IRAS 16293-2422 with ALMA. Astron. Astrophys. 595, A117 (2016).
  • 41. McGuire B. A., 2021 Census of interstellar, circumstellar, extragalactic, protoplanetary disk, and exoplanetary molecules. Astrophys. J. Suppl. Ser. 259, 30 (2022).
  • 42. Prasad S. S., Huntress W. T. J., A model for gas phase chemistry in interstellar clouds. I. The basic model, library of chemical reactions, and chemistry among C, N, and O compounds. Astrophys. J. Suppl. Ser. 43, 1–35 (1980).
  • 43. Shannon R., Blitz M., Goddard A., Heard D., Accelerated chemistry in the reaction between the hydroxyl radical and methanol at interstellar temperatures facilitated by tunnelling. Nat. Chem. 5, 745–749 (2013).
  • 44. de la Concepción J. G., Puzzarini C., Barone V., Jiménez-Serra I., Roncero O., Formation of phosphorus monoxide (PO) in the interstellar medium: Insights from quantum-chemical and kinetic calculations. Astrophys. J. 922, 169 (2021).
  • 45. Farmer J., Kauffman S. A., Packard N. H., Autocatalytic replication of polymers. Physica D 22, 50–67 (1986).
  • 46. Banzhaf W., Self-replicating sequences of binary numbers. Foundations II: Strings of length N=4. Biol. Cybern. 69, 275–281 (1993).
  • 47. Dittrich P., Banzhaf W., Self-evolution in a constructive binary string system. Artif. Life 4, 203–220 (1998).
  • 48. Dittrich P., Liljeros F., Soulier A., Banzhaf W., Spontaneous group formation in the seceder model. Phys. Rev. Lett. 84, 3205–3208 (2000).
  • 49. Holdship J., Viti S., Jiménez-Serra I., Makrymallis A., Priestley F., UCLCHEM: A gas-grain chemical code for clouds, cores, and c-shocks. Astron. J. 154, 38 (2017).
  • 50. García-Sánchez M., NetWorld [Computer software and implementation details]. GitHub. https://github.com/MiguelGarciaSanchez/NetWorld. Deposited 2 March 2022.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
