Journal of Cheminformatics. 2022 Sep 19;14:63. doi: 10.1186/s13321-022-00621-8

What makes a reaction network “chemical”?

Stefan Müller, Christoph Flamm, Peter F Stadler
PMCID: PMC9484159  PMID: 36123755

Abstract

Background

Reaction networks (RNs) comprise a set $X$ of species and a set $R$ of reactions $Y \to Y'$, each converting a multiset of educts $Y \subseteq X$ into a multiset $Y' \subseteq X$ of products. RNs are equivalent to directed hypergraphs. However, not all RNs necessarily admit a chemical interpretation. Instead, they might contradict fundamental principles of physics such as the conservation of energy and mass or the reversibility of chemical reactions. The consequences of these necessary conditions for the stoichiometric matrix $S \in \mathbb{R}^{X \times R}$ have been discussed extensively in the chemical literature. Here, we provide sufficient conditions for $S$ that guarantee the interpretation of RNs in terms of balanced sum formulas and structural formulas, respectively.

Results

Chemically plausible RNs allow neither a perpetuum mobile, i.e., a “futile cycle” of reactions with non-vanishing energy production, nor the creation or annihilation of mass. Such RNs are said to be thermodynamically sound and conservative. For finite RNs, both conditions can be expressed equivalently as properties of the stoichiometric matrix $S$. The first condition is vacuous for reversible networks, but it excludes irreversible futile cycles and, in a stricter sense, futile cycles that even contain an irreversible reaction. The second condition is equivalent to the existence of a strictly positive reaction invariant. It is also sufficient for the existence of a realization in terms of sum formulas, obeying conservation of “atoms”. In particular, these realizations can be chosen such that any two species have distinct sum formulas, unless $S$ implies that they are “obligatory isomers”. In terms of structural formulas, every compound is a labeled multigraph, in essence a Lewis formula, and reactions comprise only a rearrangement of bonds such that the total bond order is preserved. In particular, for every conservative RN, there exists a Lewis realization, in which any two compounds are realized by pairwise distinct multigraphs. Finally, we show that, in general, there are infinitely many realizations for a given conservative RN.

Conclusions

“Chemical” RNs are directed hypergraphs with a stoichiometric matrix S whose left kernel contains a strictly positive vector and whose right kernel does not contain a futile cycle involving an irreversible reaction. This simple characterization also provides a concise specification of random models for chemical RNs that additionally constrain S by rank, sparsity, or distribution of the non-zero entries. Furthermore, it suggests several interesting avenues for future research, in particular, concerning alternative representations of reaction networks and infinite chemical universes.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13321-022-00621-8.

Keywords: Chemical reaction network, Directed hypergraph, Stoichiometric matrix, Futile cycle, Perpetuum mobile, Energy conservation, Mass conservation, Reaction invariants, Null spaces, Sum formula, Multigraph, Lewis formula

Background

Most authors will agree that a chemical reaction network consists of a set $X$ of chemical species or compounds and a set $R$ of chemical reactions, each describing the transformation of some (multi)set of educts into a (multi)set of products. Depending on the application, this basic construction may be augmented by assigning properties such as mass, energy, sum formulas, or structural formulas to the compounds. Similarly, reactions may be associated with rate constants, equilibrium constants, and so on. A formal theory of reaction networks (RN) describes a reaction on a given set of compounds $X$ as a stoichiometric relation, i.e., as a pair of formal sums of chemical species $x \in X$:

$$\sum_{x \in X} s^-_{xr}\, x \;\to\; \sum_{x \in X} s^+_{xr}\, x. \tag{1}$$

The left-hand side in Eq. (1) lists the educts and the right-hand side gives the products of the reaction. The stoichiometric coefficients $s^-_{xr} \in \mathbb{N}_0$ and $s^+_{xr} \in \mathbb{N}_0$ denote the number of species $x \in X$ that are consumed (on the left-hand side) or produced (on the right-hand side) by the reaction $r$, respectively. A species $x \in X$ is an educt in reaction $r$ if $s^-_{xr} > 0$ and a product if $s^+_{xr} > 0$. If $s^+_{xr} = s^-_{xr} = 0$, then species $x$ does not take part in reaction $r$ and is suppressed in the conventional chemical notation. The formal sums $\sum_{x \in X} s^-_{xr}\, x$ and $\sum_{x \in X} s^+_{xr}\, x$ form the complexes of educts $r^-$ and products $r^+$ of the reaction $r$. We denote the set of reactions under consideration by $R$ and call the pair $(X,R)$ a reaction network (RN). Throughout this contribution we will assume that both $X$ and $R$ are non-empty and finite. Excluding explicit catalysis, that is, forbidding $s^-_{xr} s^+_{xr} > 0$, it suffices to consider the stoichiometric matrix $S \in \mathbb{Z}^{X \times R}$. Its entries $S_{xr} = s^+_{xr} - s^-_{xr}$ describe the net production (positive entries) or consumption (negative entries) of species $x$ in reaction $r$. In many practical applications, e.g. in the context of metabolic networks, RNs are embedded in an open system. In that manner, the consumption of nutrients and the production of waste can be modeled. We will return to this point only after discussing chemical RNs in isolation, i.e., as closed systems.
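
As a minimal sketch of how the stoichiometric matrix follows from Eq. (1), the snippet below builds $S$ from a list of reactions. The function name `stoichiometric_matrix`, the dictionary-based encoding of complexes, and the two toy reactions are illustrative assumptions and not part of the original text. The second reaction uses an explicit catalyst, whose row in $S$ is zero; this illustrates why $S$ alone suffices only when explicit catalysis is excluded.

```python
def stoichiometric_matrix(species, reactions):
    """Build S with entries S[x][r] = s_plus[x][r] - s_minus[x][r].

    `reactions` is a list of (educts, products) pairs; each side is a dict
    mapping a species name to its non-negative integer coefficient.
    """
    return [
        [products.get(x, 0) - educts.get(x, 0) for educts, products in reactions]
        for x in species
    ]


# Hypothetical toy network (an assumption for illustration only)
species = ["A", "B", "C"]
reactions = [
    ({"A": 2}, {"B": 1}),                  # 2 A -> B
    ({"B": 1, "C": 1}, {"A": 1, "C": 1}),  # B + C -> A + C, C acts as a catalyst
]
S = stoichiometric_matrix(species, reactions)
for name, row in zip(species, S):
    print(name, row)
```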

Several graph representations have been considered as (simplified) models of a RN, see [1] for a recent summary. In contrast to the pair (X,R), they do not always completely represent the RN.

The S-graph (species graph, compound graph, or substrate network in the context of metabolic networks) has the species as its vertices. A (directed) edge connects $x$ to $y$ if the RN contains a reaction that has $x$ as an educt and $y$ as a product [2, 3]. The corresponding construction in the kinetic setting is the interaction graph with undirected edges whenever $\partial [\dot{x}] / \partial [y] \neq 0$, which are usually annotated by the sign of the derivative [4]. S-graphs have also proved to be useful in approximation algorithms for the minimal seed set problem [5], which asks for the smallest set of substrates that can generate all metabolites. Complementarily, reaction graphs model reactions as nodes, while edges denote shared molecules [6].

The complex-reaction graph simply has the complexes $C$ (the left- and right-hand sides of the reactions) as its vertex set and the reactions $R$ as its edge set. That is, two complexes $r^-$ and $r^+$ are connected by a directed edge if there is a reaction $r = (r^-, r^+) \in R$. Its incidence matrix $Z \in \mathbb{R}^{C \times R}$ (with entries $Z_{cr} = -1$ if $c = r^-$, $Z_{cr} = 1$ if $c = r^+$, and $Z_{cr} = 0$ otherwise) is linked to the stoichiometric matrix via $S = YZ$, where the entries of the (stoichiometric) complex matrix $Y \in \mathbb{R}^{X \times C}$ are the corresponding stoichiometric coefficients. The complex-reaction graph plays a key role in the analysis of chemical reaction networks with mass-action kinetics and arbitrary rate constants, as studied in classical “chemical reaction network theory” (CRNT) [7–9]. It gives rise to notions such as “complex balancing” and “deficiency”, which allow the formulation of strong (global) stability results, see e.g. [10, 11].
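
The factorization $S = YZ$ can be checked on a small example. The sketch below uses a hypothetical two-reaction network (an assumption made for illustration) with complexes $2A$, $B$, and $A + C$; the helper `matmul` is likewise only illustrative.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical reactions: r1: 2 A -> B, r2: B -> A + C
# Complexes (columns of Y, rows of Z): 2A, B, A+C
Y = [
    [2, 0, 1],  # species A
    [0, 1, 0],  # species B
    [0, 0, 1],  # species C
]
Z = [
    [-1, 0],  # complex 2A is the educt complex of r1
    [1, -1],  # complex B is the product of r1 and the educt of r2
    [0, 1],   # complex A+C is the product complex of r2
]
S = matmul(Y, Z)
for row in S:
    print(row)
```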

SR-graphs (species-reaction graphs) are bipartite graphs with different types of nodes for chemical species and reactions, respectively [12, 13]. As such, they can be endowed with additional annotations or extended with multiple edges to represent stoichiometric coefficients. In this extended form, they are faithful representations of chemical RNs. Alternatively, the edges are often annotated with the multiplicities of molecules, i.e., the stoichiometric coefficients; in this case, they completely specify the RN $(X,R)$. Undirected SR-graphs have a close relationship to classical deficiency theory [7, 9] and form the starting point for a qualitative theory of chemical RN kinetics [14]. More detailed information on qualitative kinetic behavior can be extracted from directed SR-graphs [15]. Both the S- and the R-graph can be extracted unambiguously from an SR-graph.

The bipartite SR-graphs can be interpreted as the König representation [16] of directed hypergraphs. The connection between hypergraph and graph representations is discussed in some more detail in [17]. While SR-graphs and directed hypergraphs can be transformed into each other, they carry very different semantics. For instance, the notions of path and connectivity are very different for bipartite graphs and directed hypergraphs [18]. It has been argued, therefore, that any graph representation of chemical networks necessarily treats edges as independent entities and thus fails to correctly capture the nature of chemical reactions [19, 20]. In a similar vein, [21] adopts the hypergraph representation and models (bio)chemical pathways as integer hyperflows to ensure mass balance at each vertex. Not every pair of an S- and R-graph implies an SR-graph, and if they do, the result need not be unique [6].

Over the last decade, many authors, including one of us, have investigated metabolic networks from a statistical perspective and reached the conclusion that they are distinctly “non-random”, presumably as the consequence of four billion years of evolution. This conclusion is typically reached by first converting a RN into one of the graph representations mentioned above. The choice of graphs is largely motivated by a desire to place metabolic or other chemical RNs within the scheme of small world and scale free networks and to analyze the RNs with the well-established tools of network science [19, 22]. Thus one concludes that graph-theoretical properties of metabolic networks are significantly different from the properties of randomly generated or randomized background models for chemical reaction networks [3, 23–25]. The insights gained from this “non-randomness” of metabolism, however, critically depend on what exactly the authors meant by “random”, that is, how the background models are defined. In particular, it is important to understand whether differences between chemical networks and the background are caused by the implementation of universal properties (that any “chemistry-like” RN must satisfy) or whether they arise from the intrinsic structure of particular chemical networks.

To this end, however, we first need a comprehensive conception of what constitutes a chemistry-like reaction network. The different representations used in the literature highlight the fact that it is far from obvious which graphs or hypergraphs properly describe chemical RNs among a possibly much larger set of network models. There is a significant body of work in the literature that describes necessary conditions on the stoichiometric matrix $S$ that derive from key properties of chemical RNs, such as the conservation of mass or atoms in each reaction [8, 26–30]. In contrast, we are interested here in sufficient conditions with the aim of providing a concise characterization of RNs $(X,R)$ and their stoichiometric matrices $S$ that describe reaction systems that can reasonably be considered as “chemistry-like”. This is of practical relevance in particular for the construction of artificial chemistry models [31–34] and random “chemistries”: It is still an open problem how random RNs can be constructed that can serve as fair, chemistry-like background models. We therefore start with a brief survey of random artificial chemistries and randomized RNs. As we shall see in the following section, oftentimes no explicit provisions are made to include “chemical” constraints such as the conservation of matter and energy into the background models.

Beyond the practical importance for the generation of random chemistries, it is also of interest to ask whether and to what extent the stoichiometry of a RN constrains the underlying chemistry, i.e., the composition of compounds and the type of reactions. Chemical reaction networks have been studied as a paradigm of computation that is quite different from, but theoretically as powerful as, Turing machines [35–38]. In the case of DNA-based computing [39], the field has matured to the point that a compiler for translating chemical reaction networks into nucleic acid strand displacement systems has become available [40]. If chemical reaction networks are to be used as computing devices, a necessary intermediate step is to design reaction systems that implement a given stoichiometric matrix. Constraints on the chemistry imposed by the desired network stoichiometry itself thus become an issue in the design process, prompting us to ask whether there are chemical limitations to the realizability of RNs also beyond the constraints imposed by thermodynamics.

The main part of this contribution is the characterization of chemistry-like RNs. Starting from the principles of energy conservation and conservation of matter, we derive equivalent conditions on the stoichiometric matrix S. We then introduce realizability of RNs by sum formulas and structural formulas as a first step towards a formalization of chemistry-like networks, and show that conservation of matter is already sufficient to guarantee the existence of such chemistry-like representations. Finally we discuss the consequences of the mathematical results for the construction of random RNs and address some open research questions.

A brief survey of random and randomized chemical RNs

Chemical reaction networks are specified either as a set of chemical reactions or as a system of differential equations describing its kinetics. Graphical models have been extracted from both.

Simple graph models of RNs

S-graphs have been used to explore statistical properties of large RNs. In this line of research, empirical S-graphs are compared to the “usual” random network models such as Erdős–Rényi (ER) random graphs, Small World networks in the sense of Watts and Strogatz [41], or the Barabási–Albert model of preferential attachment. Generative models for random graphs with given degree distributions were introduced in [42]. Not surprisingly, chemical reaction networks do not conform well to any of them. As noted early on, however, R-graphs of metabolic networks at least qualitatively fit the small world paradigm [22]. More sophisticated analyses detected evidence for modularity and hierarchical organization in metabolic networks [43], using random graph models with the same degree distributions as contrasts. Arita noted, however, that S-graphs are poor representations of biochemical pathways and proposed an analysis in terms of atom traces, concluding that “the metabolic world [of E. coli] is not small in terms of biosynthesis and degradation” [44]. The motivation to focus on atom maps comes from the insight that two compounds that are linked by reactions are only related by the chemical transformation if they share at least one atom.

A versatile generator for bipartite graphs that can handle joint degree distributions is described in [45]. Surprisingly, bipartite random graph models apparently have not been used to model chemistry. Instead of generative models such as the ER graph or the preferential attachment model, null models are often specified in terms of rewiring, that is, edit operations on the graph. Rewiring rules define a Markov process on a set of graphs that can produce samples of randomized networks. The key idea is to specify the rewiring procedure in such a way that it preserves graph properties that are perceived to be important [46, 47]. For example, the degrees of all vertices in a digraph are preserved when a pair of directed edges $x_1 y_1$ and $x_2 y_2$ is replaced by $x_1 y_2$ and $x_2 y_1$ as long as $x_1$ and $x_2$ have the same out-degree while $y_1$ and $y_2$ have the same in-degree. Randomization procedures for bipartite graphs have become available in the context of ecological networks [48] or trade networks [49]. To our knowledge they have not been used for SR-graphs.
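
The edge-exchange move described above can be sketched as follows. This is a generic double-edge swap on a simple digraph, not the specific procedure of any of the cited works; the function names, the rejection of self-loops and parallel edges, and the toy edge list are assumptions made for illustration.

```python
import random
from collections import Counter


def degrees(edges):
    """Return (out-degree, in-degree) counters of a directed edge list."""
    out_deg, in_deg = Counter(), Counter()
    for x, y in edges:
        out_deg[x] += 1
        in_deg[y] += 1
    return out_deg, in_deg


def degree_preserving_swap(edges, rng, max_tries=100):
    """Try one swap (x1,y1),(x2,y2) -> (x1,y2),(x2,y1) on a simple digraph.

    Swaps that would create a self-loop or a parallel edge are rejected.
    If no admissible swap is found, a copy of the input is returned.
    """
    edge_set = set(edges)
    for _ in range(max_tries):
        (x1, y1), (x2, y2) = rng.sample(edges, 2)
        new1, new2 = (x1, y2), (x2, y1)
        if x1 == y2 or x2 == y1:
            continue  # would create a self-loop
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a parallel edge
        kept = [e for e in edges if e not in ((x1, y1), (x2, y2))]
        return kept + [new1, new2]
    return list(edges)


# Hypothetical 4-cycle used only as a demonstration
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
rewired = degree_preserving_swap(edges, random.Random(0))
print(sorted(rewired))
```

Repeating such moves defines a Markov chain on the set of digraphs with a fixed degree sequence; how many moves are needed for adequate mixing is not addressed by this sketch.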

Random (directed) hypergraphs

In [50] a hypergraph is defined as a multiset of hyperedges, each of which in turn is a multiset of vertices. In this setting, a random hypergraph is specified by the probabilities $p_k$ to include a hyperedge $e$ with cardinality $|e| = k$. Similar models for undirected hypergraphs are used e.g. in [51]. In a directed hypergraph, every hyperedge is defined as the pair $(e^-, e^+)$ consisting of the multisets $e^-$ and $e^+$. The construction of [50] thus naturally generalizes to directed hypergraphs specified by picking $e$ with probability $p_{|e^-|,|e^+|}$. In the context of chemistry this amounts to picking educt and product sets for reactions with probabilities depending on their cardinality. These random (directed) hypergraph models are the obvious generalizations of the Erdős–Rényi (di)graphs. A certain class of random directed hypergraphs with $|e^-| = 2$ and $|e^+| = 1$ for all hyperedges $e$ is considered in [52].
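
A minimal sketch of such a cardinality-based model is given below. It enumerates all pairs of multisets with the requested cardinalities and includes each pair independently with probability $p_{|e^-|,|e^+|}$. The function name, the dictionary `prob`, and the exhaustive enumeration (feasible only for small vertex sets) are assumptions for illustration and are not taken from [50].

```python
import itertools
import random


def random_directed_hypergraph(vertices, prob, rng):
    """Sample directed hyperedges (e_minus, e_plus) as pairs of multisets.

    `prob` maps a cardinality pair (k_minus, k_plus) to the probability of
    including each candidate hyperedge with these cardinalities.
    Multisets are represented as sorted tuples with repetition.
    """
    edges = []
    for (k_minus, k_plus), p in prob.items():
        for e_minus in itertools.combinations_with_replacement(vertices, k_minus):
            for e_plus in itertools.combinations_with_replacement(vertices, k_plus):
                if rng.random() < p:
                    edges.append((e_minus, e_plus))
    return edges


# Hypothetical parameters: only hyperedges with two educts and one product
rng = random.Random(1)
sample = random_directed_hypergraph(["A", "B", "C"], {(2, 1): 0.2}, rng)
for e_minus, e_plus in sample:
    print(" + ".join(e_minus), "->", " + ".join(e_plus))
```

Note that nothing in this construction enforces conservation of mass or energy, which is precisely the issue raised in this survey.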

Hypergraphs are also amenable to rewiring procedures that ensure the preservation of certain local or global properties. For instance, [17] proposes a scheme that preserves the number and cardinality of the hyperedges (replacing a randomly selected $(e^-, e^+)$ with a randomly selected pair of disjoint subsets $(e'^-, e'^+)$ with $|e'^-| = |e^-|$ and $|e'^+| = |e^+|$). On this basis, the authors conclude that the hierarchical structure hypothesis proposed in [43] is not supported for metabolic networks when a clustering coefficient is defined for directed hypergraphs. [17] also compares S- and R-graphs of metabolic networks with ensembles of S- and R-graphs derived from randomized directed hypergraphs and casts further doubt on previously reported scaling results. Randomization procedures for hypergraphs that preserve local clustering are described in [53]. An approach that uses a chemical graph rewriting model to ensure soundness of reactions is described in the MSc thesis [54].

In [25] networks are constructed in a stepwise procedure starting with directed graphs whose arcs are then re-interpreted as directed hyperarcs by combining multiple arcs. This process is guided by matching the degree distribution of the implied S-graph.

Reaction universes: random subhypergraphs

Instead of generating a random RN directly from a statistical model or rewiring a given one, one can also start from a reaction universe RU, that is, a RN that contains all species of interest and all known or inferred reactions between them. Without losing generality we can think of the RU as a directed hypergraph in the sense of [50], where the multi-set formalism accounts for the stoichiometric coefficients. In contrast to the generative and rewiring approaches the a priori specification of an RU ensures a high level of chemical realism and RNs can now be sampled by randomly selecting subsets of directed hyperedges, that is, chemical reactions. If the RU already ensures conservation of matter or energy, these properties are inherited by the sub-networks. In order to generate random metabolic networks, reactions can be drawn from databases such as KEGG or EcoCyc [55, 56]. Such selections of reactions are sometimes called “metabolic genotypes” since the available reactions are associated with enzymes, whose presence or absence is determined by an organism’s genome [55]. In some studies, additional constraints such as the production of biomass are exploited and networks are sampled e.g. by combining Flux-Balance Analysis (FBA) and a Markov Chain Monte Carlo (MCMC) approach [55, 57].

A characterization of chemistry-like reaction networks

In this section, we start from reaction networks that are specified as abstract stoichiometric relations, Eq. (1), and identify minimal constraints necessary to avoid blatantly unphysical behavior.

Notation and preliminaries

Let $X$ be a finite set and let $R$ be a finite set of pairs of formal sums of elements of $X$ with non-negative integer coefficients according to Eq. (1). Then we call the pair $(X,R)$ a reaction network (RN). Equivalently, a RN is a directed, integer-weighted hypergraph with directed edges $(r^-, r^+)$ such that $x \in r^-$ with weight $s^-_{xr} > 0$ and $x \in r^+$ with weight $s^+_{xr} > 0$. The weights $s^-_{xr}$ and $s^+_{xr}$ are usually called the stoichiometric coefficients. We set $s^-_{xr} = 0$ and $s^+_{xr} = 0$ if $x \notin r^-$ and $x \notin r^+$, respectively. We deliberately dropped the qualifier chemical here since, as we shall see, not every RN $(X,R)$ makes sense as a model of a chemical system. In fact, the aim of this contribution is to characterize the set of RNs that make sense as models of chemistry.

Such directed hypergraphs are most conveniently drawn as (bipartite) König multigraphs, with distinct types of vertices representing compounds $x \in X$ and reactions $r \in R$, respectively. Stoichiometric coefficients larger than one appear as multiple edges. See the example in Fig. 1.

Fig. 1

Representation of a RN as König multigraph of the corresponding directed hypergraph. Round vertices (with chemical structures shown inside) designate compounds $x \in X$, while reactions $r \in R$ are shown as square vertices. Stoichiometric coefficients are indicated by the number of edges from $x$ to $r$ for $s^-_{xr} > 0$ and from $r$ to $x$ for $s^+_{xr} > 0$, respectively. A flow (an overall reaction) is given by non-negative integer multiples of individual reactions. Here the coefficients $v_r$ are indicated in the square nodes for each reaction $r$. The flow shown here defines Oró’s [58] route from HCN to adenine (marked by red triangles) and corresponds to the net reaction $5\,\mathrm{HCN} \to \mathrm{H_5C_5N_5}$. Figure adapted from [59]

For each reaction $r \in R$, we define its support as $\operatorname{supp}(r) = \{x \in X \mid s^-_{xr} + s^+_{xr} > 0\}$; that is, $x \in \operatorname{supp}(r)$ if it appears as an educt, a product, or a catalyst in $r$. The stoichiometric matrix of $(X,R)$ is $S \in \mathbb{Z}^{X \times R}$ with entries $S_{xr} = s^+_{xr} - s^-_{xr}$.

We distinguish proper reactions $r$, for which there is both $x \in X$ with $S_{xr} < 0$ and $y \in X$ with $S_{yr} > 0$, import reactions for which $S_{xr} \ge 0$ for all $x \in X$, and export reactions for which $S_{xr} \le 0$ for all $x \in X$. We write $\varnothing$ for the empty formula, hence $\varnothing \to A$ and $B \to \varnothing$ designate the import of $A$ and the export of $B$, respectively. Note that this definition also allows catalyzed import and export reactions, e.g., $C \to C + A$ or $B + C \to C$.

In thermodynamics, a system is closed if it does not exchange matter with its environment, but may exchange energy in the form of work or heat [60]. For a RN, this rules out import and export reactions.

Definition 1

A RN $(X,R)$ is closed if all reactions $r \in R$ are proper.

Given an arbitrary RN $(X,R)$, there is a unique inclusion-maximal closed RN contained in $(X,R)$, namely $(X,R_p)$ with

$$R_p = \{ r \in R \mid r \text{ is proper} \}. \tag{2}$$

We will refer to $(X,R_p)$ as the proper part of $(X,R)$.
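
The classification into proper, import, and export reactions and the extraction of the proper part can be sketched directly on the columns of $S$. The function names and the toy matrix below are illustrative assumptions; a zero column satisfies both the import and the export condition and is reported separately here.

```python
def classify_reaction(column):
    """Classify one column of S as 'proper', 'import', 'export', or 'zero'."""
    has_negative = any(c < 0 for c in column)
    has_positive = any(c > 0 for c in column)
    if has_negative and has_positive:
        return "proper"
    if has_positive:
        return "import"  # S_xr >= 0 for all x, net production only
    if has_negative:
        return "export"  # S_xr <= 0 for all x, net consumption only
    return "zero"        # meets both the import and the export condition


def proper_part(S):
    """Return the column indices of the proper reactions, i.e. R_p of Eq. (2)."""
    columns = list(zip(*S))
    return [r for r, col in enumerate(columns) if classify_reaction(col) == "proper"]


# Hypothetical network: A -> B, (empty) -> A, B -> (empty)
S = [
    [-1, 1, 0],   # species A
    [1, 0, -1],   # species B
]
labels = [classify_reaction(col) for col in zip(*S)]
print(labels, proper_part(S))
```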

For every reaction $r$, one can define a reverse reaction $\bar{r}$ that is obtained from $r$ by exchanging the role of products and educts. That is, $\bar{r}$ is the reverse of $r$ iff, for all $x \in X$, it holds that

$$s^-_{x\bar{r}} = s^+_{xr} \quad\text{and}\quad s^+_{x\bar{r}} = s^-_{xr}. \tag{3}$$

While thermodynamics dictates that every reaction is reversible in principle (albeit possibly with an extremely low reaction rate), it is a matter of modeling whether sufficiently slow reactions are included in the reaction set R.

Chemical reactions can be composed and aggregated into “overall reactions”. In the literature on metabolic networks, pathways are of this form. An overall reaction consists of multiple reactions that collectively convert a set of educts into a set of products. It can be represented as a formal sum of reactions $\sum_{r \in R} v_r\, r$, where the vector of multiplicities $v \in \mathbb{N}_0^R$ has non-negative integer entries. Thereby, $[Sv]_x$ determines the net consumption or production of compound $x$ in the overall reaction specified by $v$.

A vector $v \in \mathbb{N}_0^R$ can be interpreted as an integer hyperflow in the following sense: If $x$ is neither an educt nor a product of the overall reaction specified by $v$, then $[Sv]_x = \sum_{r} (s^+_{xr} - s^-_{xr}) v_r = 0$, i.e., every unit of $x$ that is produced by some reaction $r$ with $v_r > 0$ is consumed by another reaction $r'$ with $v_{r'} > 0$.

The effect of an overall reaction can be represented via formal sums of species in two ways: as composite reactions,

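$$\sum_{x \in X} \sum_{r \in R} s^-_{xr} v_r\, x \;\to\; \sum_{x \in X} \sum_{r \in R} s^+_{xr} v_r\, x, \tag{4}$$

or as net reactions,

$$\sum_{x \in X} \Big[ \sum_{r \in R} (s^-_{xr} - s^+_{xr}) v_r \Big]_+ x \;\to\; \sum_{x \in X} \Big[ \sum_{r \in R} (s^+_{xr} - s^-_{xr}) v_r \Big]_+ x. \tag{5}$$

Here we use the notation $[c]_+ = c$ if $c > 0$ and $[c]_+ = 0$ for $c \le 0$. In Eq. (5), intermediates, i.e., formal catalysts, are cancelled. Hence, the net consumption (or production) of a species $x$ is $\big[\sum_{r \in R} (s^-_{xr} - s^+_{xr}) v_r\big]_+ = -[Sv]_x$ if $[Sv]_x < 0$ (or $\big[\sum_{r \in R} (s^+_{xr} - s^-_{xr}) v_r\big]_+ = [Sv]_x$ if $[Sv]_x > 0$).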
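
The difference between the composite reaction, Eq. (4), and the net reaction, Eq. (5), can be illustrated with a short sketch. The function `overall_reaction`, the dictionary encoding of complexes, and the two-step toy pathway are assumptions made for illustration only.

```python
def overall_reaction(species, reactions, v):
    """Return (composite, net), each a pair (educts, products) of dicts.

    `reactions` is a list of (educts, products) dict pairs and `v` the vector
    of non-negative multiplicities, one entry per reaction.
    """
    consumed = {x: sum(ed.get(x, 0) * vr for (ed, _), vr in zip(reactions, v))
                for x in species}
    produced = {x: sum(pr.get(x, 0) * vr for (_, pr), vr in zip(reactions, v))
                for x in species}
    # Eq. (4): keep everything that is consumed or produced by some reaction
    composite = ({x: c for x, c in consumed.items() if c > 0},
                 {x: c for x, c in produced.items() if c > 0})
    # Eq. (5): keep only the positive part of the difference on each side
    net = ({x: consumed[x] - produced[x] for x in species if consumed[x] > produced[x]},
           {x: produced[x] - consumed[x] for x in species if produced[x] > consumed[x]})
    return composite, net


# Hypothetical pathway A -> B followed by B -> C, each used once
species = ["A", "B", "C"]
reactions = [({"A": 1}, {"B": 1}), ({"B": 1}, {"C": 1})]
composite, net = overall_reaction(species, reactions, [1, 1])
print("composite:", composite)
print("net:", net)
```

In this example the intermediate B appears on both sides of the composite reaction and cancels in the net reaction.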

Fig. 1 shows the RN of Oró’s prebiotic adenine synthesis from HCN and the integer hyperflow $v$ corresponding to the net reaction “5 HCN → adenine” as an example.

While a restriction to integer hyperflows $v \in \mathbb{N}_0^R$ is necessary in many applications, see e.g. [21] for a detailed discussion, it appears mathematically more convenient to use the more general setting of fluxes $v \in \mathbb{R}^R$ as in the analysis of metabolic pathways. To emphasize the connection with the body of literature on network (hyper)flows we will uniformly speak of flows.

For any vector $v \in \mathbb{R}^R$, we write $v \ge 0$ if $v$ is non-negative, $v > 0$ if $v$ is non-negative and non-zero, that is, at least one entry is positive, and $v \gg 0$ if all entries of $v$ are positive. Analogously, we write $v \le 0$, $v < 0$, and $v \ll 0$. In particular, a vector $v \in \mathbb{R}^R$ is called a flow if $v \ge 0$.

A non-trivial flow satisfies $v > 0$, i.e., $v \neq 0$. Two flows $v^1$ and $v^2$ are called parallel if they describe the same net reaction. In particular, we therefore have $Sv^1 = Sv^2$ for parallel flows.

Futile cycles in a RN are non-trivial flows for which educts and products coincide and thus the net reaction is empty.

Definition 2

A flow $v > 0$ is a futile cycle if $Sv = 0$.

We use the term futile cycle in the strict sense to describe the concurrent activity of multiple reactions (or pathways) having no net effect other than the dissipation of energy. In the literature on metabolic networks often a less restrictive concept is used that allows certain compounds (usually co-factors, ATP/ADP, redox equivalents, or solvents) to differ between products and educts, see e.g. [61–64]. In this setting, the net reaction of concurrent glycolysis and gluconeogenesis, namely the hydrolysis of ATP, is viewed as energy dissipation rather than a chemical reaction. In our setting, $\mathrm{ATP} + \mathrm{H_2O} \to \mathrm{ADP} + \mathrm{P_i^-} + \mathrm{H^+}$ is a net reaction like any other, and hence a futile cycle would only arise if recycling of ATP, i.e., $\mathrm{ADP} + \mathrm{P_i^-} + \mathrm{H^+} \to \mathrm{ATP} + \mathrm{H_2O}$, was included as well.

If a RN has a futile cycle, it also has an integer futile cycle $v \in \mathbb{N}_0^R$, since $S$ has integer entries and thus its kernel has a rational basis, which can be scaled with the least common denominator to have integer entries.
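
The scaling argument can be made concrete with exact rational arithmetic. The sketch below checks Definition 2 for a given flow and scales a rational futile cycle to an integer one; the function names and the toy network are assumptions for illustration. Finding a futile cycle in the first place (a feasibility problem over the non-negative part of the kernel of $S$) is not covered here.

```python
from fractions import Fraction
from math import gcd


def is_futile_cycle(S, v):
    """Check Definition 2: v is non-negative, non-zero, and S v = 0."""
    nonnegative = all(vr >= 0 for vr in v)
    nonzero = any(vr != 0 for vr in v)
    balanced = all(sum(s * vr for s, vr in zip(row, v)) == 0 for row in S)
    return nonnegative and nonzero and balanced


def scale_to_integers(v):
    """Multiply a rational vector by the least common multiple of its denominators."""
    rational = [Fraction(vr) for vr in v]
    lcm = 1
    for f in rational:
        lcm = lcm * f.denominator // gcd(lcm, f.denominator)
    return [int(f * lcm) for f in rational]


# Hypothetical network: r1: A -> B, r2: 2 B -> 2 A
S = [[-1, 2], [1, -2]]
v = [Fraction(1), Fraction(1, 2)]
v_int = scale_to_integers(v)
print(is_futile_cycle(S, v), v_int, is_futile_cycle(S, v_int))
```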

A pair $(X',R')$ is a subnetwork of $(X,R)$ if $X' \subseteq X$, $R' \subseteq R$, and $\operatorname{supp}(r) \subseteq X'$ implies $r \in R'$. We say that a property $P$ of a RN is hereditary if “$(X,R)$ has $P$” implies “$(X',R')$ has $P$” for every subnetwork $(X',R')$.

Chemical reactions are subject to thermodynamic constraints that are a direct consequence of the conservation of energy, the conservation of mass, and the reversibility of chemical reactions. In the context of chemistry, conservation of mass is of course a consequence of the conservation of atoms throughout a chemical reaction. In the following sections, we investigate how these physical principles constrain RNs. Since we have introduced RNs in terms of abstract molecules and reactions, Eq. (1), we express the necessary conditions in terms of the stoichiometric matrix S, which fully captures only the proper part of the RN. Throughout this work, therefore, we assume that (X,R) is a closed RN, unless explicitly stated otherwise.

Thermodynamic constraints

Reaction energies and perpetuum mobiles

Every chemical reaction $r$ is associated with a change in the Gibbs free energy of educts and products. We therefore introduce a vector of reaction (Gibbs free) energies $g \in \mathbb{R}^R$ and write $(X,R,g)$ for a RN endowed with reaction energies. The reaction energy for an overall reaction is the total energy of the individual reactions involved. In terms of $v \in \mathbb{R}^R$, it can be expressed as

$$\sum_{r \in R} g_r v_r = g^\top v = \langle g, v \rangle, \tag{6}$$

where $\langle \cdot, \cdot \rangle$ denotes the scalar product on $\mathbb{R}^R$.

Futile cycles may act as a chemical version of a perpetuum mobile. This is the case whenever a flow $v > 0$ with zero formal net reaction, $Sv = 0$, increases or decreases energy, i.e., if $\langle g, v \rangle \neq 0$.

Definition 3

Let $(X,R,g)$ be a RN with reaction energies. A flow $v > 0$ is a perpetuum mobile if $Sv = 0$ and $\langle g, v \rangle \neq 0$.

The classical concept of a perpetuum mobile decreases its energy, $\langle g, v \rangle < 0$, thereby “creating” energy for its environment. An “anti” perpetuum mobile with $\langle g, v \rangle > 0$ would “annihilate” energy. Either situation violates energy conservation and thus cannot be allowed in a chemical RN. Obviously, there is no perpetuum mobile if $(X,R)$ does not admit a futile cycle.

In fact, thermodynamics dictates that Gibbs free energy is a state function. Two parallel flows $v^1$ and $v^2$ therefore must have the same associated net reaction energies. That is, $Sv^1 = Sv^2$ implies $\langle g, v^1 \rangle = \langle g, v^2 \rangle$. Equivalently, any vector $v = v^1 - v^2 \in \mathbb{R}^R$ with $Sv = 0$ must satisfy $\langle g, v \rangle = 0$. That is, $g \in (\ker S)^\perp$.

Definition 4

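Let $(X,R,g)$ be a RN with reaction energies. Then $(X,R,g)$ is thermodynamic if $v \in \mathbb{R}^R$ and $Sv = 0$ imply $\langle g, v \rangle = 0$, that is, if $g \in (\ker S)^\perp$.

Let $(X,R,g)$ be thermodynamic, $(X',R')$ be a subnetwork of $(X,R)$ with stoichiometric matrix $S'$, and $g'$ be the restriction of $g$ to $R'$. Then $v' \in \mathbb{R}^{R'}$ corresponds to $v \in \mathbb{R}^R$ with $\operatorname{supp}(v) \subseteq R'$, and thus $v' \in \mathbb{R}^{R'}$ and $S'v' = 0$ imply $Sv = 0$ and further $\langle g', v' \rangle = \langle g, v \rangle = 0$. Hence $(X',R',g')$ is again thermodynamic.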

We note that the reaction energies of a reaction $r$ and its reverse $\bar{r}$ necessarily cancel:

Lemma 5

If $r$ and $\bar{r}$ are reverse reactions in a thermodynamic network $(X,R,g)$, then $g_{\bar{r}} = -g_r$.

Proof

If $r$ and $\bar{r}$ are reverse reactions, then $v$ with $v_r = v_{\bar{r}} = 1$ (and $v_{r'} = 0$ otherwise) satisfies $Sv = 0$. Thus $\langle g, v \rangle = g_r + g_{\bar{r}} = 0$.

Digression: molecular energies and Hess’ Law

Every molecular species $x \in X$ has an associated Gibbs free energy of formation. For notational simplicity, we write $G_x$ instead of the commonly used symbol $\Delta G_f(x)$. The corresponding vector of molecular energies is denoted by $G \in \mathbb{R}^X$. Molecular energies and reaction energies $g \in \mathbb{R}^R$ are related by Hess’ law: For every reaction $r \in R$, it holds that

$$g_r = \sum_{x \in X} G_x (s^+_{xr} - s^-_{xr}) = \sum_{x \in X} G_x S_{xr}.$$

In matrix form, the relationship between reaction energies $g$ and molecular energies $G$ amounts to

$$g = S^\top G. \tag{7}$$

Proposition 6

Let $(X,R)$ be a RN and $g \in \mathbb{R}^R$ be a vector of reaction energies. Then $(X,R,g)$ is thermodynamic if and only if there is a vector of molecular energies $G \in \mathbb{R}^X$ satisfying Hess’ law, Eq. (7).

Proof

By Definition 4, $(X,R,g)$ is thermodynamic if $g \in (\ker S)^\perp = \operatorname{im} S^\top$, that is, if there is $G$ such that $g = S^\top G$, satisfying Hess’ law.

Note that the vector of molecular energies $G$ is not uniquely determined by $g$ in general.

Reversible and irreversible networks

To begin with, we consider purely reversible or irreversible RNs.

Definition 7

A RN $(X,R)$ is reversible if $r \in R$ implies $\bar{r} \in R$ and irreversible if $r \in R$ implies $\bar{r} \notin R$.

In reversible networks, general vectors $v \in \mathbb{R}^R$ have corresponding flows $\tilde{v} \ge 0$ with the same net reactions and, in the case of thermodynamic networks, with the same energies.

Lemma 8

Let $(X,R,g)$ be a reversible RN (with reaction energies), and let $v \in \mathbb{R}^R$ be a vector. Then there is a flow $\tilde{v} \ge 0$ such that $S\tilde{v} = Sv$. If $(X,R,g)$ is thermodynamic, then further $\langle g, \tilde{v} \rangle = \langle g, v \rangle$.

Proof

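If $v \ge 0$, there is nothing to show. Otherwise, there are two flows $v^1 \ge 0$ and $v^2 > 0$ such that $v = v^1 - v^2$. Since $(X,R)$ is reversible, each reaction $r \in R$ has a reverse $\bar{r}$, and we define the reverse flow $\bar{v}^2 > 0$ such that $\bar{v}^2_r = v^2_{\bar{r}}$. By construction, it satisfies $S\bar{v}^2 = -Sv^2$.

Now consider the flow $\tilde{v} = v^1 + \bar{v}^2 > 0$. It satisfies

$$S\tilde{v} = S(v^1 + \bar{v}^2) = S(v^1 - v^2) = Sv.$$

If the network is thermodynamic, then the reverse flow satisfies $\langle g, \bar{v}^2 \rangle = -\langle g, v^2 \rangle$, by Lemma 5. Hence,

$$\langle g, \tilde{v} \rangle = \langle g, v^1 + \bar{v}^2 \rangle = \langle g, v^1 - v^2 \rangle = \langle g, v \rangle.$$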
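
The construction used in this proof can be sketched in a few lines: negative entries of a vector are moved to the corresponding reverse reactions. The function names, the index-based `reverse` map, and the toy network with reactions A → B and B → A are assumptions for illustration.

```python
def to_nonnegative_flow(v, reverse):
    """Return the flow v1 + reversed(v2) for v = v1 - v2 with v1, v2 >= 0.

    `reverse[r]` is the index of the reverse reaction of reaction r.
    """
    v1 = [max(vr, 0) for vr in v]    # positive part of v
    v2 = [max(-vr, 0) for vr in v]   # negative part of v
    v2_bar = [v2[reverse[r]] for r in range(len(v))]
    return [a + b for a, b in zip(v1, v2_bar)]


def apply(S, v):
    """Compute the matrix-vector product S v."""
    return [sum(s * vr for s, vr in zip(row, v)) for row in S]


# Hypothetical reversible network: r0: A -> B, r1: B -> A
S = [[-1, 1], [1, -1]]
reverse = [1, 0]
v = [2, -1]
v_tilde = to_nonnegative_flow(v, reverse)
print(v_tilde, apply(S, v), apply(S, v_tilde))
```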

By definition, a thermodynamic network cannot contain a perpetuum mobile. Conversely, by the result below, if a reversible network is not thermodynamic, then it contains a perpetuum mobile.

Proposition 9

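Let $(X,R,g)$ be a reversible RN with reaction energies. Then, the following two statements are equivalent:

  • (i) $(X,R,g)$ is thermodynamic.

  • (ii) $(X,R,g)$ contains no perpetuum mobile.

Proof

Statement (i) implies (ii) by definition. For the converse, suppose $(X,R,g)$ is not thermodynamic. That is, there is $v \in \mathbb{R}^R$ with $Sv = 0$ and $\langle v, g \rangle \neq 0$. By Lemma 8, there is $\tilde{v} \ge 0$ with $S\tilde{v} = 0$ and $\langle \tilde{v}, g \rangle \neq 0$ (so in particular $\tilde{v} \neq 0$), that is, a perpetuum mobile.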

The exclusion of a perpetuum mobile is not sufficient in non-reversible systems.

Example 10

Consider the following RN (with reaction energies g):

[Eq. (8): reaction network with reaction energies $g$; given as an image in the original.] (8)

It contains one futile cycle,

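$A \xrightarrow{1} B \xrightarrow{\bar 1} A$, $v = (1,1,0,0)$ with $\langle g,v\rangle = 0$,

but no perpetuum mobile. However, it contains two parallel flows with different energies,

$A \xrightarrow{1} B \xrightarrow{2} C$, $v = (1,0,1,0)$ with $\langle g,v\rangle = -2$,

$A \xrightarrow{3} C$, $v = (0,0,0,1)$ with $\langle g,v\rangle = -1$.

Hence, the RN (with reaction energies $g$) is not thermodynamic: the difference $(1,0,1,-1) \in \ker S$ of the two flows has energy $-1 \neq 0$. By setting $g_3 = -2$, it can be made thermodynamic.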

Many RN models are non-reversible, i.e., they contain irreversible reactions whose reverse reactions are so slow that they are neglected. From a thermodynamic perspective, irreversible reactions $r$ must be exergonic, i.e., $g_r < 0$. We first consider the extreme case that all reactions $r \in R$ are irreversible.

Proposition 11

Let (X,R,g) be an irreversible RN with reaction energies. Then, every futile cycle is a perpetuum mobile. Hence, if (X,R,g) is thermodynamic, then there are no futile cycles.

Proof

Consider a futile cycle, that is, a flow $v > 0$ with $Sv = 0$. Since all reactions are exergonic, $v_r > 0$ implies $g_r < 0$ and further $\langle g,v\rangle < 0$, that is, $v$ is a perpetuum mobile. Now, if there is a futile cycle and hence a perpetuum mobile, then the network is not thermodynamic.

Thermodynamic soundness

We next ask whether a RN (X,R) can always be endowed with a vector of reaction energies g such that (X,R,g) is thermodynamic.

Definition 12

A RN (X,R) is thermodynamically sound if there is a vector of reaction energies g such that (X,R,g) is a thermodynamic network.

We note that thermodynamic soundness is a hereditary property of RNs, since we have seen above that if $(X,R,g)$ is a thermodynamic network so are all its subnetworks $(X,R',g')$.

Again, we first consider purely reversible or irreversible RNs.

Proposition 13

Every reversible RN is thermodynamically sound.

Proof

Since $S \neq 0$ (the zero matrix), obviously $(\ker S)^\perp = \operatorname{im} S^\top \neq \{0\}$ (the zero vector), and hence there is a non-zero $g \in (\ker S)^\perp$.

Theorem 14

An irreversible RN is thermodynamically sound if and only if there are no futile cycles.

Proof

By Gordan's Theorem (which is in turn a special case of Minty's Lemma [65], see Appendix B in [66]): Either there is a negative $g \in (\ker S)^\perp$ or there is a non-zero, non-positive $v \in \ker S$. That is, either there is $g \ll 0$ with $g \in (\ker S)^\perp$ (the network is thermodynamically sound) or there is $v < 0$ with $v \in \ker S$; equivalently, there is a futile cycle $v > 0$.

It is not always obvious from the specification of an artificial chemistry model whether or not it is thermodynamically sound. As an example, we consider the artificial chemistry proposed in [67]. It considers only binary reactions (two educts) that produce two products, aiming to ensure conservation of particle numbers. In one variant, the network contains only irreversible and thus exergonic reactions. It may produce, for instance, the following set of reactions:

$$A + B \to C + D, \quad A + C \to E + B, \quad B + D \to F + A, \quad E + F \to A + B. \tag{9}$$

Their sum corresponds to the flow $v = (1,1,1,1) \gg 0$ and yields the exergonic composite reaction

$$2A + 2B + C + D + E + F \to 2A + 2B + C + D + E + F,$$

that is, $Sv = 0$. Thus the model admits a futile cycle composed entirely of exergonic reactions and hence a perpetuum mobile. Thus it is not thermodynamically sound.
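The futile cycle in Eq. (9) can be checked mechanically. The following minimal sketch builds the stoichiometric matrix from the four reactions and verifies $Sv = 0$ for $v = (1,1,1,1)$; the helper name `stoichiometric_matrix` and the list-of-pairs encoding of reactions are illustrative assumptions.

```python
def stoichiometric_matrix(species, reactions):
    """Build S (rows: species, columns: reactions) from (educts, products) pairs."""
    S = [[0] * len(reactions) for _ in species]
    for j, (educts, products) in enumerate(reactions):
        for x in educts:
            S[species.index(x)][j] -= 1
        for x in products:
            S[species.index(x)][j] += 1
    return S


species = ["A", "B", "C", "D", "E", "F"]
reactions = [                      # the four reactions of Eq. (9)
    (["A", "B"], ["C", "D"]),
    (["A", "C"], ["E", "B"]),
    (["B", "D"], ["F", "A"]),
    (["E", "F"], ["A", "B"]),
]
S = stoichiometric_matrix(species, reactions)

v = [1, 1, 1, 1]
net = [sum(s * x for s, x in zip(row, v)) for row in S]

# A futile cycle is a non-zero, non-negative flow with vanishing net reaction.
is_futile_cycle = all(x >= 0 for x in v) and any(v) and not any(net)
print(net, is_futile_cycle)
```

Since every reaction of this variant is irreversible, Theorem 14 then implies that the network is not thermodynamically sound.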

Mixed networks

In many applications, RNs contain both reversible and irreversible reactions, $R = R_{\mathrm{rev}} \cup R_{\mathrm{irr}}$. There are two interpretations of such models:

  1. In the (lax) sense used above, reversible reactions can be associated with arbitrary energies, while irreversible reactions are considered exergonic. That is, the reaction energies must satisfy $g_r < 0$ for $r \in R_{\mathrm{irr}}$.

  2. In a strict sense, the reaction energies assigned to irreversible reactions are much more negative than the reaction energies of the reversible ones. After scaling, one requires $|g_r| \le 1$ (that is, $-1 \le g_r \le 1$) for $r \in R_{\mathrm{rev}}$ and $|g_r| \ge \gamma$ (that is, $g_r \le -\gamma$) for $r \in R_{\mathrm{irr}}$ and (large) $\gamma > 1$. The intuition is that reactions $r$ with $g_r \ge \gamma$ can be neglected.

The following example shows that thermodynamic soundness differs in the lax and strict senses.

Example 15

Consider the following RN (with reaction energies g):

[Eq. (10): reaction network with reaction energies $g$; given as an image in the original.] (10)

for some g>0. It contains two futile cycles:

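$A \xrightarrow{1} B \xrightarrow{\bar 1} A$, $v = (1,1,0,0)$ with $\langle g,v\rangle = 0$,

$A \xrightarrow{1} B \xrightarrow{2} C \xrightarrow{3} A$, $v = (1,0,1,1)$ with $\langle g,v\rangle = 1 - 2g$.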

By setting g=1/2, the RN can be made thermodynamic. (Then the second futile cycle is not a perpetuum mobile.)

However, the RN in (10) cannot be seen as the limit of a thermodynamic, reversible network ($A \rightleftharpoons B \rightleftharpoons C \rightleftharpoons A$) for large $g$. Thereby, one considers small $g_1, g_{\bar 1}$ and large negative $g_2, g_3$ (and hence large positive $g_{\bar 2}, g_{\bar 3}$, that is, negligible reverse reactions $\bar 2, \bar 3$). Any such (limit of a) reversible RN contains a perpetuum mobile (the second futile cycle); equivalently, it is not thermodynamic.

Definition 16

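A mixed network $(X, R_{\mathrm{rev}} \cup R_{\mathrm{irr}})$ is thermodynamically sound if there are reaction energies $g$ such that $(X,R,g)$ is thermodynamic and $g_r < 0$ for $r \in R_{\mathrm{irr}}$.

$(X, R_{\mathrm{rev}} \cup R_{\mathrm{irr}})$ is strictly thermodynamically sound if, for all $\gamma > 1$, there are reaction energies $g$ such that $(X,R,g)$ is thermodynamic, $|g_r| \le 1$ for $r \in R_{\mathrm{rev}}$, and $g_r < 0$ with $|g_r| \ge \gamma$ for $r \in R_{\mathrm{irr}}$.

The scaling condition can also be written in the form

$$\min_{r \in R_{\mathrm{irr}}} |g_r| \ge \gamma \max_{r \in R_{\mathrm{rev}}} |g_r| \quad \text{for all } \gamma > 1. \tag{11}$$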

A more detailed justification for strict thermodynamic soundness in mixed networks will be given in the next subsection when considering open RNs. Here, we focus on the relationship of thermodynamic soundness and futile cycles.

Theorem 17

A mixed RN $(X, R_{\mathrm{rev}} \cup R_{\mathrm{irr}})$ is thermodynamically sound if and only if there is no irreversible futile cycle.

Proof

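By a “sign vector version” of Minty's Lemma: Either there is $g \in (\ker S)^\perp$ with $g_r < 0$ for $r \in R_{\mathrm{irr}}$ (the network is thermodynamically sound) or there is a non-zero $v \in \ker S$ with $v_r \ge 0$ for $r \in R_{\mathrm{irr}}$ and $v_r = 0$ for $r \in R_{\mathrm{rev}}$; equivalently, there is a futile cycle $v > 0$ with $\operatorname{supp}(v) \subseteq R_{\mathrm{irr}}$.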

Theorem 18

A mixed RN $(X, R_{\mathrm{rev}} \cup R_{\mathrm{irr}})$ is strictly thermodynamically sound if and only if no futile cycle contains an irreversible reaction.

Proof

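By Minty's Lemma: Let $\gamma > 1$. Either there is $g \in (\ker S)^\perp$ with $g_r \in [-1,1]$ for $r \in R_{\mathrm{rev}}$ and $g_r \in (-\infty,-\gamma]$ for $r \in R_{\mathrm{irr}}$ or there is $v \in \ker S$ with

$$\sum_{r \in R_{\mathrm{rev}}} v_r\,[-1,1] + \sum_{r \in R_{\mathrm{irr}}} v_r\,(-\infty,-\gamma] > 0. \tag{12}$$

Thereby, a sum of intervals is defined in the obvious way, yielding an interval which is positive if each of its elements is positive. Via $v \to -v$, the interval condition (12) is equivalent to: there is $v \in \ker S$ with

$$\sum_{r \in R_{\mathrm{rev}}} v_r\,[-1,1] + \sum_{r \in R_{\mathrm{irr}}} v_r\,[\gamma,\infty) > 0. \tag{13}$$

As necessary conditions, we find (i) $v_r > 0$ for some $r \in R_{\mathrm{irr}}$ and (ii) $v_r \ge 0$ for all $r \in R_{\mathrm{irr}}$. By Lemma 8, (iii) there is an equivalent flow with $v_r \ge 0$ for $r \in R_{\mathrm{rev}}$. That is, there is a futile cycle $v > 0$ involving an irreversible reaction. For $\gamma$ large enough, the necessary conditions are also sufficient for the interval condition (13).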

We may characterize strict thermodynamic soundness for mixed networks also in geometric terms:

Corollary 19

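Let $(X,R)$ be a mixed RN with $R = R_{\mathrm{rev}} \cup R_{\mathrm{irr}}$, $L_{\mathrm{rev}} = \operatorname{im} S_{\mathrm{rev}}$, and $C_{\mathrm{irr}} = \operatorname{cone} S_{\mathrm{irr}}$. Then, $(X,R)$ is strictly thermodynamically sound if and only if it is thermodynamically sound and $L_{\mathrm{rev}} \cap C_{\mathrm{irr}} = \{0\}$.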

Figure 2 illustrates the concepts of futile cycles and (strict) thermodynamic soundness in a metabolically relevant example.

Fig. 2.


Substrate cycle. Reaction network (top) as a complex-reaction graph, involving substrate S, product P, enzymes E, F, and complexes ES, FP, and stoichiometric matrix $S$ (middle). In addition to the futile cycles $(1,1,0,0,0,0)$ and $(0,0,0,1,1,0)$, corresponding to the two (pairs of) reversible reactions, there is a non-trivial futile cycle $v = (1,0,1,1,0,1)$, involving both reversible and irreversible reactions. (Note that this futile cycle is not an actual cycle of the graph.) As a result, the network is thermodynamically sound, but not strictly thermodynamically sound. In a metabolically relevant example from glycolysis/gluconeogenesis, the compounds are S = fructose-6-phosphate, P = fructose-1,6-bisphosphate, E = phosphofructokinase 1, and F = fructose-1,6-bisphosphatase, and reactions 2 and 4 involve additional compounds (bottom). As a consequence, there is no non-trivial futile cycle (in the strict sense of this work). In fact, the vector $v$ above then represents the net reaction $\mathrm{ATP} + \mathrm{H_2O} \to \mathrm{ADP} + \mathrm{P_i}$. Still, it is called a futile cycle or substrate cycle in the literature on metabolic networks. (In our approach, reactions producing/consuming the additional compounds (such as $\mathrm{P_i}$) must be added to the network to obtain a futile cycle. Such a futile cycle involves the active reactions in $v$, and hence the extended network cannot be strictly thermodynamically sound.)

Open (mixed) networks

Opening the RN, i.e., adding transport reactions, alters the representation of reaction energies. We now have to consider chemical potentials involving concentrations, i.e., we replace the (Gibbs free) energies $G_x$ by $G_x + RT\ln[x]$, where $[x]$ is the activity of $x$, which approximately coincides with its concentration. A reaction $r$ then proceeds in the forward direction whenever the chemical potential of the products is smaller than the chemical potential of the educts, i.e., if

$$\sum_x s^+_{xr}\,(G_x + RT\ln[x]) < \sum_x s^-_{xr}\,(G_x + RT\ln[x]). \tag{14}$$

This condition can be rewritten in terms of the reaction (Gibbs free) energies and (the logarithm of) the “reaction quotient”, see e.g. [68]:

$$g_r < -RT \sum_{x \in X} s_{xr} \ln[x] \tag{15}$$
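The step from Eq. (14) to Eq. (15) can be spelled out as follows. The derivation assumes that Hess' law, Eq. (7), has the form $g_r = \sum_x G_x (s^+_{xr} - s^-_{xr})$, as used later in the proof of Lemma 21, and that $s_{xr} = s^+_{xr} - s^-_{xr}$ denotes the entries of $S$.

```latex
\begin{align*}
\sum_x s^+_{xr}\,\bigl(G_x + RT\ln[x]\bigr)
  &< \sum_x s^-_{xr}\,\bigl(G_x + RT\ln[x]\bigr) \\
\sum_x \bigl(s^+_{xr} - s^-_{xr}\bigr)\,G_x
  &< -RT \sum_x \bigl(s^+_{xr} - s^-_{xr}\bigr)\ln[x] \\
g_r &< -RT \sum_{x \in X} s_{xr}\,\ln[x]
\end{align*}
```

The right-hand side is $-RT$ times the logarithm of the reaction quotient $\prod_x [x]^{s_{xr}}$.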

The activities $[x]$ for $x \in X$ therefore define an upper bound on the reaction energy $g_r$. In an open system, (internal) concentrations may be buffered at fixed values or are implicitly determined by given influxes or external concentrations [69]. Given a specification of the environment, i.e., of the transport fluxes and/or buffered concentrations, the upper bound in Eq. (15) can have an arbitrary value. Thus, if an irreversible reaction in $R$ is meant to proceed forward for all conditions, it must be possible to choose $g_r < 0$ arbitrarily small, i.e., $|g_r|$ arbitrarily large. This amounts to requiring that $(X,R_p)$ is strictly thermodynamically sound. In many studies of reaction networks, one requires that a reaction proceeds forward in a given situation, but allows that it proceeds backward in other situations. In this (lax) interpretation of irreversibility it is sufficient to require that $(X,R_p)$ is thermodynamically sound, but not necessarily strictly thermodynamically sound.

In Def. 16, we introduce (strict) thermodynamic soundness in terms of reaction energies, and in Thms. 17 and 18, we characterize it in terms of futile cycles. In a corresponding approach [70, 71], “extended” detailed balance is required for (closed) RNs with irreversible reactions at thermodynamic equilibrium. Thereby, activities $[x]$, rate constants $k_+, k_-$ and equilibrium constants $K$ are explicitly used to formulate Wegscheider conditions for non-reversible RNs that are limits of reversible systems. The characterization of such systems in [70] is equivalent to our results.

Reversible completion

As models of chemistry, non-reversible networks are abstractions that are obtained from reversible thermodynamic networks by omitting the reverse of reactions that mostly proceed in one direction.

Definition 20

Let $(X,R,g)$ be a thermodynamic RN with $R = R_{\mathrm{rev}} \cup R_{\mathrm{irr}}$. The reversible completion of $(X,R,g)$ is the RN $(X,R^*,g^*)$ with $R^* = R \cup \{\bar r : r \in R_{\mathrm{irr}}\}$ and $g^*_r = g_r$ for $r \in R$ and $g^*_{\bar r} = -g_r$ for $r \in R_{\mathrm{irr}}$.

Lemma 21

If $(X,R,g)$ is a thermodynamic RN, then its reversible completion $(X,R^*,g^*)$ is also a thermodynamic RN.

Proof

Let $\bar r \in R^*$ be the reverse reaction of $r \in R_{\mathrm{irr}}$. By Prop. 6, there is a vector $G \in \mathbb{R}^X$ satisfying Hess' law for every $r \in R$. It suffices to show that $G$ still satisfies Hess' law for $(X,R^*)$. By the definition of $\bar r$, its reaction energy is $g^*_{\bar r} = \sum_{x \in X} G_x (s^+_{x\bar r} - s^-_{x\bar r}) = \sum_{x \in X} G_x (s^-_{xr} - s^+_{xr}) = -g_r$, as required by Def. 20. Thus $(X,R^*,g^*)$ is also thermodynamic.

The following result is an immediate consequence of Lemma 21.

Proposition 22

If the RN $(X,R)$ is thermodynamically sound, then its reversible completion is also thermodynamically sound, and the reaction energies $g$ can be chosen such that $g_r < 0 < g_{\bar r}$ for all $r \in R_{\mathrm{irr}}$.

Mass conservation and cornucopias/abysses

Thermodynamic soundness is not sufficient to ensure chemical realism. As an example, consider the random kinetics model introduced in [72]. It assigns (a randomly chosen) energy $G(x)$ to each $x \in X$. Each reaction $r$ is defined by randomly picking a set of educts $e_r^-$ and products $e_r^+$. A possible instance of this model comprises four compounds with molecular energies $G(A) = -5$, $G(B) = -5$, $G(C) = -10$, and $G(X) = -2$, and two reactions

$$A + B \to C + X, \quad C \to A + B. \tag{16}$$

The first reaction is exergonic with $g_1 = -2$ and the second has reaction energy $g_2 = 0$. The composite reaction, obtained as their sum, is $A + B \to A + B + X$. Ignoring the effective catalysts $A$ and $B$, the corresponding net reaction is $\varnothing \to X$. In this universe, therefore, it is possible to spontaneously create mass in a sequence of exergonic reactions. Reversing the signs of the energies reverses the two reactions and thus yields an exergonic reaction that makes $X$ disappear.
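The numbers in this example can be reproduced with a short script. It is a sketch only: it assumes Hess' law in the form "energy of products minus energy of educts", and the dictionary encoding of reactions is an illustrative choice.

```python
species = ["A", "B", "C", "X"]
G = {"A": -5, "B": -5, "C": -10, "X": -2}   # molecular energies from the text

# Reactions of Eq. (16) as (educts, products) with stoichiometric coefficients.
reactions = [
    ({"A": 1, "B": 1}, {"C": 1, "X": 1}),
    ({"C": 1}, {"A": 1, "B": 1}),
]


def reaction_energy(educts, products):
    """Hess' law: energy of the products minus energy of the educts."""
    return (sum(n * G[x] for x, n in products.items())
            - sum(n * G[x] for x, n in educts.items()))


g = [reaction_energy(e, p) for e, p in reactions]

# Net production [Sv]_x for the flow v = (1, 1), i.e. the sum of both reactions.
net = {x: sum(p.get(x, 0) - e.get(x, 0) for e, p in reactions) for x in species}

print(g)     # reaction energies g_1, g_2
print(net)   # only X is produced: the flow creates mass
```

The flow $v = (1,1)$ has a non-zero, non-negative net production, so the network creates $X$ from nothing even though its energies obey Hess' law.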

We can again describe this situation in terms of flows. Recall that $[Sv]_x$ is the net production or consumption of species $x$. The spontaneous creation or annihilation of mass thus corresponds to flows $v > 0$ with $Sv > 0$ or $Sv < 0$, respectively.

Definition 23

Let (X,R) be a RN. A flow v>0 is a cornucopia if Sv>0 and an abyss if Sv<0.

Systems with cornucopias or abysses cannot be considered as closed systems. The proper part of chemical reaction networks therefore must be free of cornucopias and abysses.

Since in a reversible network any vector $v \in \mathbb{R}^R$ can be transformed into an equivalent flow $\tilde v \ge 0$ (with $S\tilde v = Sv$), cf. Lemma 8, we have the following characterization.

Proposition 24

A reversible RN is free of cornucopias and abysses if and only if there is no vector $v \in \mathbb{R}^R$ such that $Sv > 0$.

In fact, mass conservation rules out cornucopias and abysses. More generally, a reaction invariant is a property that does not change over the course of a chemical reaction [8, 27, 29]. Here, we are only interested in linear reaction invariants, also called conservation laws [73], that is, quantitative properties of molecules (such as mass) whose sum is the same for educts and products.

Definition 25

A linear reaction invariant or conservation law is a non-zero vector $m \in \mathbb{R}^X$ that satisfies $\sum_{x \in X} m_x s^+_{xr} = \sum_{x \in X} m_x s^-_{xr}$ for all reactions $r \in R$, that is, $m^\top S = 0$.

Definition 26

A RN is conservative if it has a positive conservation law, that is, if there is $m \in \mathbb{R}^X$ such that $m \gg 0$ and $m^\top S = 0$.

By definition, a conservative network is free of cornucopias and abysses. Conversely, by the result below, if a reversible network is not conservative, then it contains a cornucopia (and an abyss).

Theorem 27

A reversible RN (X,R) is free of cornucopias and abysses if and only if it is conservative.

Proof

By Stiemke's Theorem (which is in turn a special case of Minty's Lemma): Either there is a non-zero, non-negative $n \in \operatorname{im} S$ or there is a positive $m \in (\operatorname{im} S)^\perp = \ker S^\top$. That is, either there is $v \in \mathbb{R}^R$ with $n = Sv > 0$ (corresponding to a cornucopia $\tilde v > 0$) or there is $m \gg 0$ with $S^\top m = 0$ (as claimed).

We therefore conclude that every closed chemical RN must have a positive reaction invariant. This is no longer true if the RN is embedded in an open system and mass exchange with the environment is allowed. By construction, each transport reaction violates at least one of the conservation laws of the closed system, since $[m^\top S]_r > 0$ if $r$ is an import reaction and $[m^\top S]_r < 0$ if it is an export reaction. As discussed e.g. in [73], opening a RN by adding import or export reactions can only reduce the number of conservation laws and cannot introduce additional constraints. Nevertheless, a RN must be chemically meaningful when the import and export reactions are turned off. That is, its proper part $(X,R_p)$ must be conservative to ensure that it has a chemical realization.

Realizations of reaction networks

Conservation of atoms and moieties

Molecules are composed of atoms, which are, by definition, preserved in every chemical reaction. For each atom type $a$, there is a conservation law that accounts for the number of atoms of type $a$ in each compound $x$. More precisely, denote by $A_{ax} \in \mathbb{N}_0$ the number of atoms of type $a$ in molecule $x$, i.e., the coefficients in the chemical sum formula $\sum_a A_{ax}\,a$ for compound $x$. (Alternatively, we may think of sum formulas as multisets of atoms.) Conservation of atom $a$ in reaction $r$ therefore becomes

$$\sum_x A_{ax} S_{xr} = 0. \tag{17}$$

For all atoms and reactions and in matrix form, this condition reads AS=0. Each row of the matrix A thus is a non-negative linear reaction invariant, i.e., a non-negative conservation law.

Conserved moieties are groups of atoms that remain intact in all reactions in which they appear [26, 28, 30]. Like atoms, they lead to non-negative integer conservation laws.

However, (the vectors representing) conserved atoms or moieties need not span the left kernel of the stoichiometric matrix $S$ and need not be linearly independent. To see this, consider the following two RNs comprising a single reaction. For

$$\mathrm{MgCO_3} \to \mathrm{MgO} + \mathrm{CO_2} \tag{18}$$

with $S = (-1,1,1)^\top$, there are only two linearly independent conservation laws, e.g. $(1,1,0)$ and $(1,0,1)$, corresponding to the moieties $\mathrm{MgO}$ and $\mathrm{CO_2}$, while the three vectors for the atomic composition $A_{\mathrm{Mg}} = (1,1,0)$, $A_{\mathrm{C}} = (1,0,1)$, and $A_{\mathrm{O}} = (3,1,2)$ are linearly dependent. On the other hand, as noted in [26],

$$\mathrm{C_6H_5CH_3} + \mathrm{H_2} \to \mathrm{C_6H_6} + \mathrm{CH_4} \tag{19}$$

with $S = (-1,-1,1,1)^\top$ has three conservation laws but only two atom types, which correspond to the conservation laws $A_{\mathrm{C}} = (7,0,6,1)$ and $A_{\mathrm{H}} = (8,2,6,4)$. E.g. the phenyl-moiety $M_{\mathrm{ph}} = (1,0,1,0)$ or the methyl-moiety $M_{\mathrm{CH_4}} = (1,0,0,1)$ form the missing third, linearly independent conservation law. The latter example also shows that atom conservation relations are not necessarily support-minimal among the non-negative integer left-kernel vectors of $S$. In fact, also $(0,1,1,0)$ and $(0,1,0,1)$ are left-kernel vectors of $S$, the chemical interpretation of which is less obvious.

These examples show that key chemical properties such as atom conservation or conservation of moieties are not encoded in the stoichiometric matrix $S$. In other words, two RNs can be isomorphic as hypergraphs but describe reactions between sets of compounds that are not isomorphic in terms of their sum formulas. For example, $S = (-1,-1,1,1)^\top$ is realized by the hydrodealkylation of toluene in Eq. (19), but also by the inorganic reaction

$$\mathrm{MgO} + \mathrm{H_2SO_4} \to \mathrm{MgSO_4} + \mathrm{H_2O}, \tag{20}$$

having four atom conservation laws, $A_{\mathrm{Mg}} = (1,0,1,0)$, $A_{\mathrm{O}} = (1,4,4,1)$, $A_{\mathrm{H}} = (0,2,0,2)$, $A_{\mathrm{S}} = (0,1,1,0)$, and three moiety conservation laws, e.g. $M_{\mathrm{MgO}} = (1,0,1,0)$, $M_{\mathrm{H_2O}} = (0,1,0,1)$, and $M_{\mathrm{SO_3}} = (0,1,1,0)$.
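The statements about Eqs. (18) and (19) amount to checking that each vector is orthogonal to the single column of $S$ and to computing the rank of the collected vectors. A minimal sketch using exact rational arithmetic is given below; the helper names `rank` and `conserved` are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def conserved(y, s):
    """True if y is a conservation law of the single reaction with column s."""
    return sum(a * b for a, b in zip(y, s)) == 0


# Eq. (18), species order: MgCO3, MgO, CO2
s18 = [-1, 1, 1]
atoms18 = [[1, 1, 0], [1, 0, 1], [3, 1, 2]]   # Mg, C, O

# Eq. (19), species order: toluene, H2, benzene, CH4
s19 = [-1, -1, 1, 1]
atoms19 = [[7, 0, 6, 1], [8, 2, 6, 4]]        # C, H
phenyl = [1, 0, 1, 0]

print(all(conserved(y, s18) for y in atoms18), rank(atoms18))
print(all(conserved(y, s19) for y in atoms19 + [phenyl]),
      rank(atoms19), rank(atoms19 + [phenyl]))
```

For Eq. (18) the three atom vectors have rank 2, which equals the dimension of the left kernel; for Eq. (19) the two atom vectors have rank 2 and only reach the left-kernel dimension 3 once a moiety vector is added.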

“Semi-positive” conservation laws [26, 74] of a RN are the non-zero elements of the polyhedral cone

$$K(S) = \{\, y \in \mathbb{R}^X \mid y^\top S = 0,\ y \ge 0 \,\}, \tag{21}$$

the non-negative left-kernel of $S$. Thereby, $K(S)$ is an s-cone as defined in [75], given by a subspace (here: $\ker S^\top$) and non-negativity conditions. Since the s-cone $K(S)$ is contained in the non-negative orthant, its extreme (non-decomposable) vectors agree with its support-minimal vectors. Further, since $S$ is an integer matrix, all extreme vectors of $K(S)$ are positive real multiples of integer vectors.

All potential moiety conservation laws (MCLs) [76] for a given stoichiometric matrix $S$ (but unknown atomic composition) are non-zero, integer elements of $K(S)$, i.e., elements of the set

$$\mathcal{K}(S) = \{\, y \in \mathbb{N}_0^X \mid y^\top S = 0 \,\} \setminus \{0\}. \tag{22}$$

Clearly, $\mathcal{K}(S)$ contains the integer extreme vectors of $K(S)$. Ultimately, one is interested in minimal MCLs, i.e., minimal elements of $\mathcal{K}(S)$, cf. [77]. (Minimal vectors are called maximal in [74].)

Definition 28

A vector $y \in \mathcal{K}(S)$ is minimal if there is no $y' \in \mathcal{K}(S)$ such that $y' < y$.

In fact, integer minimality and integer non-decomposability are equivalent.

Proposition 29

Let $y \in \mathcal{K}(S)$. The following statements are equivalent:

  1. $y$ is minimal.

  2. There are no two $y', y'' \in \mathcal{K}(S)$ such that $y = c'y' + c''y''$ with $c', c'' \in \mathbb{N}$.

Proof

Suppose $y' < y$ for some $y' \in \mathcal{K}(S)$. Then $y = 1 \cdot (y - y') + 1 \cdot y'$ with $y - y' \in \mathcal{K}(S)$. Conversely, suppose $y = c'y' + c''y''$. Then $y', y'' < y$.

Most importantly, the minimal MCLs generate all MCLs.

Theorem 30

Every element of $\mathcal{K}(S)$ is a finite integer linear combination of minimal elements of $\mathcal{K}(S)$.

Proof

By Noetherian induction on the partial order $<$ on $\mathbb{N}_0^X$ and Proposition 29.

Knowing all minimal MCLs allows one to represent the compounds $X$ of a RN $(X,R)$ in a minimal (most coarse-grained) way.

Definition 31

The minimal moiety representation (short: mm-representation) of a conservative RN $(X,R)$ is the matrix $M \in \mathbb{N}_0^{\mathcal{M} \times X}$, where the rows of $M$ are the minimal MCLs, and $\mathcal{M}$ is the corresponding set of abstract moieties.

For example, consider the abstract chemical reaction

$$A + B \to 2C \tag{23}$$

with $S = (-1,-1,2)^\top$. There are three minimal MCLs denoted by the abstract moieties $\mathcal{M} = \{X,Y,Z\}$: on the one hand, $M_X = (2,0,1)$ and $M_Y = (0,2,1)$, which are (minimal) extreme vectors of $K(S)$, on the other hand, $M_Z = (1,1,1)$, which is minimal, but not extreme. Hence, the mm-representation is given by

$$M = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \tag{24}$$

and the reaction (23) can be represented as

$$X_2Z + Y_2Z \to 2\,XYZ. \tag{25}$$

By definition, $\operatorname{im} M^\top \subseteq \ker S^\top$. In fact, $\operatorname{im} M^\top = \ker S^\top$, and hence there is an obvious lower bound for the number of minimal MCLs.
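For small networks, the minimal MCLs can be found by brute force. The sketch below enumerates all non-zero vectors $y \in \mathbb{N}_0^X$ with $y^\top S = 0$ inside a box and keeps those with no strictly smaller solution. The box size `bound` is an assumption of this illustration: minimal MCLs with an entry above the bound would be missed, so this is not a general algorithm.

```python
from itertools import product


def minimal_mcls(S, bound):
    """Minimal non-zero y in N_0^X with y S = 0 and all entries <= bound."""
    n_species, n_reactions = len(S), len(S[0])
    candidates = [
        y for y in product(range(bound + 1), repeat=n_species)
        if any(y) and all(
            sum(y[i] * S[i][j] for i in range(n_species)) == 0
            for j in range(n_reactions)
        )
    ]

    def strictly_below(a, b):
        """a < b in the componentwise partial order."""
        return a != b and all(p <= q for p, q in zip(a, b))

    return sorted(
        y for y in candidates
        if not any(strictly_below(z, y) for z in candidates)
    )


# Reaction (23), A + B -> 2C, with species order A, B, C.
S = [[-1], [-1], [2]]
print(minimal_mcls(S, bound=4))
```

For reaction (23) this yields $(0,2,1)$, $(1,1,1)$, and $(2,0,1)$, i.e., the rows $M_Y$, $M_Z$, and $M_X$ of Eq. (24).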

Lemma 32

Let $M \in \mathbb{N}_0^{\mathcal{M} \times X}$ be the mm-representation of a conservative RN $(X,R)$ with stoichiometric matrix $S$. Then, $\operatorname{im} M^\top = \ker S^\top$ and hence $|\mathcal{M}| \ge \dim \ker S^\top$.

Proof

Since the left kernel of $S$ and hence $K(S)$ contain a positive vector, we have $\dim K(S) = \dim \ker S^\top =: d$. Hence, (the extreme vectors of) $K(S)$ and therefore also (the corresponding minimal integer vectors of) $\mathcal{K}(S)$ generate $\ker S^\top$, that is, $\operatorname{im} M^\top = \ker S^\top$. Hence, the number of minimal MCLs is at least $d$, that is, $|\mathcal{M}| \ge \dim \ker S^\top$.

By instantiating the abstract moieties {X,Y,Z} with sum formulas (multisets of atoms), every chemical realization of the reaction can be obtained. In general, we define an instance as follows.

Definition 33

A sum formula instance (short: sf-instance) of a RN $(X,R)$ with stoichiometric matrix $S$ is a matrix $A \in \mathbb{N}_0^{\mathcal{A} \times X}$ for some non-empty, finite set $\mathcal{A}$ of “atoms” such that

  • (i)

    each column of A is non-zero, and

  • (ii)

    AS=0.

Def. 33 in particular allows that $A$ comprises a single row. By condition (i), this row vector is a strictly positive conservation law, which, as a linear combination of MCLs, may be chosen to be integer valued. Conversely, if $(X,R)$ admits an sf-instance, then the column-sum $m^\top = 1^\top A$ with $m \in \ker S^\top$ is a strictly positive integer conservation law and thus in particular an sf-instance with $|\mathcal{A}| = 1$. Taken together, we have shown the following existence result.

Proposition 34

A RN (X,R) admits an sf-instance if and only if it is conservative.

The entry $m_x$ of $m$ can be interpreted as the total number of atoms in compound $x \in X$. In [78], a RN is called primitive atomic if each reaction preserves the total number of atoms. Thus a RN is primitive atomic if and only if it is conservative, cf. [78].

Isomers and sum formula realizations

In order to gain a better understanding of sf-instances for a RN $(X,R)$, we consider net reactions of the form $X \to Y$ in the reversible completion of $(X,R)$. That is, we ask whether it is possible, in principle, to convert $X$ into $Y$, irrespective of whether the conversion is thermodynamically favorable. From a chemical perspective, if such a net isomerization reaction exists, then $X$ and $Y$ must be compositional isomers. These will play a key role in our discussion of realizations of $(X,R)$ in terms of sf-instances.

Before we proceed, we first give a more formal account of net isomerization reactions. Recall that a net reaction derives from an overall reaction, which in turn is specified by an integer hyperflow. Instead of working explicitly in the reversible completion, we may instead consider vectors $v \in \mathbb{Z}^R$ with negative entries $v_r < 0$, representing the reverse of irreversible reactions $r \in R$.

Definition 35

Let $(X,R)$ be a RN with stoichiometric matrix $S$. A vector $v \in \mathbb{Z}^R$, satisfying $k := -[Sv]_x = [Sv]_y \in \mathbb{N}$ for some $x,y \in X$ and $[Sv]_z = 0$ for all $z \in X \setminus \{x,y\}$, specifies a net isomerization reaction $k\,x \to k\,y$. Two (distinct) compounds $x,y \in X$ are obligatory isomers if $(X,R)$ admits a net isomerization reaction $k\,x \to k\,y$. We write $x \rightleftharpoons y$ if $x = y$ or $x$ and $y$ are obligatory isomers.

Proposition 36

The binary relation $x \rightleftharpoons y$ introduced in Def. 35 is an equivalence relation.

Proof

By definition, $\rightleftharpoons$ is reflexive. If $v$ specifies the net isomerization reaction $k\,x \to k\,y$, then $-v$ specifies $k\,y \to k\,x$, and thus $\rightleftharpoons$ is symmetric. To verify transitivity, suppose $x \rightleftharpoons y$ and $y \rightleftharpoons z$, i.e., there are vectors $v^1$ and $v^2$ that specify the net isomerization reactions $p\,x \to p\,y$ and $q\,y \to q\,z$. Then $v = q v^1 + p v^2$ satisfies $[Sv]_x = -pq$, $[Sv]_z = pq$, $[Sv]_y = 0$, and $[Sv]_u = 0$ for all $u \in X \setminus \{x,y,z\}$, and thus specifies the net isomerization reaction $(pq)\,x \to (pq)\,z$. Thus, $\rightleftharpoons$ is transitive.

The intuition is to define a sum formula realization of a RN as a matrix $A$ that (i) is an sf-instance of the RN and (ii) assigns different atomic compositions to $x$ and $y$ whenever $x \not\rightleftharpoons y$, that is, whenever $x$ and $y$ are not isomers. In the following, we will see that such a definition both ensures chemical realism and leads to a useful mathematical description. The next result relates net isomerization reactions to the structure of $\ker S^\top$ (and ultimately to compositional isomers as given by MCLs and sf-instances).

Theorem 37

Let $(X,R)$ be a RN with stoichiometric matrix $S$. Then $x \rightleftharpoons y$ if and only if $m_x = m_y$ for all $m \in \ker S^\top$.

Proof

First suppose $x \rightleftharpoons y$. Then either $x = y$ (in which case the assertion is trivially true) or there is a net isomerization reaction $k\,x \to k\,y$ specified by the vector $v$. Let $m \in \ker S^\top$. By the definition of $v$, we have $0 = m^\top S v = \sum_{z \in X} m_z [Sv]_z = m_x [Sv]_x + m_y [Sv]_y = (m_x - m_y)[Sv]_x$ and $[Sv]_x \neq 0$. Hence, $m_x = m_y$.

Now suppose $m_x = m_y$ for all $m \in \ker S^\top$ and consider the vector $w \in \mathbb{Z}^X$ with $w_x = -1$, $w_y = 1$, and $w_z = 0$ for all $z \in X \setminus \{x,y\}$. Clearly, $\langle m,w\rangle = 0$ for all $m \in \ker S^\top$, that is, $w \in (\ker S^\top)^\perp = \operatorname{im} S$. Thus there is $v \in \mathbb{R}^R$ such that $w = Sv$. Since $S \in \mathbb{Z}^{X \times R}$, the solution $v$ of this linear equation can be chosen rational. Writing $\operatorname{lcd}(v)$ for the least common denominator of the entries in $v$, we obtain the integer vector $\operatorname{lcd}(v)\,v \in \mathbb{Z}^R$, specifying the net isomerization reaction $\operatorname{lcd}(v)\,x \to \operatorname{lcd}(v)\,y$. By definition, $x \rightleftharpoons y$.

The proof of Thm. 37 also provides a simple algorithm to compute integer hyperflows $v$ that specify net isomerization reactions and to identify the obligatory isomers: For each pair $x,y \in X$, construct $w$ with $w_x = -1$ and $w_y = 1$ being the only non-zero entries and solve the linear equation $Sv = w$. We have $x \rightleftharpoons y$ if and only if a solution exists, in which case the desired integer hyperflow is $\operatorname{lcd}(v)\,v$.
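The algorithm described above can be sketched with exact rational arithmetic from the Python standard library. The function names, the species order U, V, W, X, Y, Z, and the reaction order (Eq. (26) followed by Eq. (27), both introduced later in this section) are assumptions of this illustration; the solver returns one particular rational solution, which need not be the one shown in Fig. 3.

```python
from fractions import Fraction
from math import gcd


def solve(S, w):
    """Return one rational solution v of S v = w, or None if none exists."""
    rows, cols = len(S), len(S[0])
    m = [[Fraction(x) for x in S[i]] + [Fraction(w[i])] for i in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [a / m[r][c] for a in m[r]]          # normalise pivot row
        for i in range(rows):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] != 0 for row in m[r:]):
        return None                                  # inconsistent system
    v = [Fraction(0)] * cols                         # free variables set to 0
    for i, c in enumerate(pivots):
        v[c] = m[i][-1]
    return v


def net_isomerization(S, x, y):
    """Return (k, v) with integer v and S v = k (e_y - e_x), or None."""
    w = [0] * len(S)
    w[x], w[y] = -1, 1
    v = solve(S, w)
    if v is None:
        return None
    k = 1
    for f in v:                                      # least common denominator
        k = k * f.denominator // gcd(k, f.denominator)
    return k, [int(k * f) for f in v]


species = ["U", "V", "W", "X", "Y", "Z"]
# Columns: U+V->X, U+W->Y, X+W->Z, Y+V->Z, then U->V, V->W, U->W.
S_ext = [
    [-1, -1, 0, 0, -1, 0, -1],   # U
    [-1, 0, 0, -1, 1, -1, 0],    # V
    [0, -1, -1, 0, 0, 1, 1],     # W
    [1, 0, -1, 0, 0, 0, 0],      # X
    [0, 1, 0, -1, 0, 0, 0],      # Y
    [0, 0, 1, 1, 0, 0, 0],       # Z
]
S_basic = [row[:4] for row in S_ext]

x, y = species.index("X"), species.index("Y")
print(net_isomerization(S_basic, x, y))   # no solution without Eq. (27)
print(net_isomerization(S_ext, x, y))     # some integer hyperflow for k X -> k Y
```

Looping `net_isomerization` over all pairs of species yields the equivalence classes of obligatory isomers.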

We next show that obligatory isomers cannot be distinguished by sf-instances, and conversely, compounds that are not obligatory isomers are distinguished by certain sf-instances.

Theorem 38

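Let $(X,R)$ be a RN with stoichiometric matrix $S$ and $A \in \mathbb{N}_0^{\mathcal{A} \times X}$ be an sf-instance. If $\operatorname{im} A^\top = \ker S^\top$, then the following statements are equivalent:

  • (i)

    $x,y \in X$ are obligatory isomers;

  • (ii)

    $A_{ax} = A_{ay}$ for all $a \in \mathcal{A}$.

If $\operatorname{im} A^\top \subseteq \ker S^\top$, then (i) implies (ii).

Proof

Let $x,y \in X$ be distinct. On the one hand, by Theorem 37, statement (i) is equivalent to $m_x = m_y$ for all $m \in \ker S^\top$. On the other hand, statement (ii) is equivalent to $m_x = m_y$ for all $m \in \operatorname{im} A^\top$. If $\operatorname{im} A^\top = \ker S^\top$, then (i) and (ii) are equivalent. If $\operatorname{im} A^\top \subseteq \ker S^\top$, that is, if the rows of $A$ are elements of $\ker S^\top$, then (i) implies (ii).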

Any sf-instance $A$ whose rows span $\ker S^\top$ not only identifies obligatory isomers, but also assigns distinct sum formulas to any distinct compounds $x,y \in X$ that are not obligatory isomers. In this case, there is at least one row (corresponding to atom $a$) for which $A_{ax} \neq A_{ay}$. This provides the formal justification for a mathematical definition of sum formula realizations.

Definition 39

A sum formula realization (short: sf-realization) of a RN $(X,R)$ with stoichiometric matrix $S$ is a matrix $A \in \mathbb{N}_0^{\mathcal{A} \times X}$ for some non-empty, finite set $\mathcal{A}$ of “atoms” such that

  • (i)

    each column of $A$ is non-zero and

  • (ii)

    $\operatorname{im} A^\top = \ker S^\top$.

As an illustration, consider the RN

$$U + V \to X, \quad U + W \to Y, \quad X + W \to Z, \quad Y + V \to Z, \tag{26}$$

depicted on the left side of Fig. 3. The RN can be instantiated by the sum formulas $U = A$, $V = B$, $W = C$, $X = AB$, $Y = AC$, $Z = ABC$. The corresponding matrix $A$ (middle right in Fig. 3) is not only an sf-instance, its rows also span $\ker S^\top$, and hence it is an sf-realization. (In fact, it is also the mm-representation.) A “reduced representation” can be obtained by assuming that $U$, $V$, and $W$ are compositional isomers corresponding to the same moiety $D$, that is, $U = V = W = D$. As a consequence, $X$ and $Y$ are also compositional isomers, $X = Y = D_2$, and further $Z = D_3$. The corresponding matrix still defines an sf-instance, but its rows do not span $\ker S^\top$. Now consider an extension of the RN in Eq. (26), by adding three isomerization reactions,

$$U \to V, \quad V \to W, \quad U \to W. \tag{27}$$

In the extended network given by Eq. (26) and Eq. (27), we have $\dim \ker S^\top = 1$, and thus there is a unique MCL. The reactions in Eq. (27) now enforce that $U$, $V$, and $W$ are compositional isomers and thus correspond to the same moiety $D$. This coincides with the “reduced representation” for the RN in Eq. (26). The distinction is that, for the RN of Eq. (26), we may (but do not have to) assume that $U$, $V$, and $W$ are isomers, whereas in the extended network, no other interpretation is possible.
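The difference between an sf-instance and an sf-realization for Eq. (26) can be checked numerically. The sketch below relies on the following observation: the rows of an sf-instance lie in $\ker S^\top$, so they span it exactly when $\operatorname{rank} A = |X| - \operatorname{rank} S$. The function names and the species order U, V, W, X, Y, Z are illustrative assumptions.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_sf_instance(A, S):
    """Def. 33: every column of A is non-zero and A S = 0."""
    n_species, n_reactions = len(S), len(S[0])
    columns_nonzero = all(any(row[x] for row in A) for x in range(n_species))
    balanced = all(
        sum(row[x] * S[x][r] for x in range(n_species)) == 0
        for row in A for r in range(n_reactions)
    )
    return columns_nonzero and balanced


def is_sf_realization(A, S):
    """Def. 39: an sf-instance whose rows span the left kernel of S."""
    return is_sf_instance(A, S) and rank(A) == len(S) - rank(S)


# Eq. (26); rows U, V, W, X, Y, Z; columns U+V->X, U+W->Y, X+W->Z, Y+V->Z.
S = [
    [-1, -1, 0, 0],
    [-1, 0, 0, -1],
    [0, -1, -1, 0],
    [1, 0, -1, 0],
    [0, 1, 0, -1],
    [0, 0, 1, 1],
]
A_full = [               # U=A, V=B, W=C, X=AB, Y=AC, Z=ABC
    [1, 0, 0, 1, 1, 1],  # atom A
    [0, 1, 0, 1, 0, 1],  # atom B
    [0, 0, 1, 0, 1, 1],  # atom C
]
A_reduced = [[1, 1, 1, 2, 2, 3]]   # U=V=W=D, X=Y=D2, Z=D3

print(is_sf_instance(A_full, S), is_sf_realization(A_full, S))
print(is_sf_instance(A_reduced, S), is_sf_realization(A_reduced, S))
```

Both matrices balance all four reactions, but only the three-atom assignment has rank 3 and therefore qualifies as an sf-realization of Eq. (26).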

Fig. 3.


Reaction network (left) and stoichiometric matrix $S$ (top right) showing reactions $r_1$ to $r_4$, Eq. (26), in gray and the isomerization reactions $r_5$ to $r_7$, Eq. (27), in light red. For the basic system (gray) we have $\dim \ker S^\top = 3$. The three MCLs are shown below $S$. In the full system, $r_1$ to $r_7$, we have $\dim \ker S^\top = 1$ with the unique MCL shown at the bottom right. In the full system $U$, $V$, and $W$ form obligatory isomers of the monomer $D$. Similarly, $X$ and $Y$ are also obligatory isomers composed of two $D$ units, while $Z$ is a trimer of $D$ units. The vector $v = (-1,0,1,0,0,1,0)$ is represented by the composite reaction $X + (U + W) + V \to (U + V) + Y + W$ and specifies the net isomerization reaction $X \to Y$

Finally, we characterize RNs that admit an sf-realization.

Proposition 40

A RN (X,R) admits an sf-realization if and only if it is conservative.

Proof

Suppose (X,R) admits an sf-realization, which, in particular, is an sf-instance. By Prop. 34, (X,R) is conservative. Conversely, suppose (X,R) is conservative. By definition, the mm-representation is an sf-instance, and by Lemma 32, it is an sf-realization.

Obligatory isomers put some restriction on sf-instances. Still, there is surprising freedom for sf-realizations. We say that two sf-realizations A and A are equivalent, AA, if there are integers p,qN such that pA=qA. One easily checks that is an equivalence relation. If dimkerS=1, then all mdimkerS are multiples of the unique minimal MCL. All sum formulas are then of the form Dk, and thus we can think of compounds simply as integers kN. Every reaction thus can be written in the form kskr-Dkkskr+Dk with k(skr+-skr-)k=0. An example of practical interest is the rearrangement chemistry of carbohydrates, found in metabolic networks such as the pentose phosphate pathway (PPP) or the non-oxidative part of the Calvin-Benson-Bassham (CBB) cycle in the dark phase of photosynthesis. Carbohydrates may be seen as a “polymers” of formaldehyd units and can therefore be written as Dk=(CH2O)k. The PPP interconverts pentoses (e.g. ribose) and hexoses (such as glucose), in an atom-economic (no waste) rearrangement network possessing the overall reaction 6(CH2O)55(CH2O)6. In a similar fashion five 3-phosphoglycerates are reconfigured via carbohydrate chemistry into three ribulose-5-phosphate which results in the overall reaction of 5(CH2O)33(CH2O)5 if focusing on the sugar component. Carbohydrate reaction chemistry is particularly well-suited for the implementation of isomerization networks, and the logic and structure of the design space of alternative networks implementing the same overall reaction has been explored using mathematical and computational models [21, 80]. Fig. 4 shows the RN of the prebiotic carbohydrate formation according to [79]. The analysis of the corresponding stoichiometric matrix, available as Additional file 1, shows that all Cn compounds are obligatory isomers. Furthermore, their sum formulas are necessarily multiples of the C1 unit, which corresponds to formaldehyd in the formose reaction.

Fig. 4.

Fig. 4

Reaction network of the formose reaction describing pre-biotic carbohydrate formation [79]. The RN is drawn here in a simplified form showing aldol and retro-aldol reactions (those with 1 educt and 2 products, and vice versa) without their reverse reactions. The stoichiometric matrix of the full network comprising all 38 reaction connecting the 29 compounds is provided as Addition file 1. Compounds are labeled by the number of carbon atoms. C1a (in the center) designates formaldehyd, C3c is dihydroxy acetone. The network was drawn an analyzed with MØD [21]. All compounds with the same number of carbons are obligatory isomers. Moreover, all sum formula representations are of the from An, with A denoting the moiety corresponding to the formaldehyd unit

For $\dim\ker S^\top>1$, there is an infinite set of sf-realizations that are pairwise inequivalent. To see this, construct matrices $A_t=(t_1y_1,t_2y_2,\dots,t_ky_k)^\top$ from $k=\dim\ker S^\top>1$ linearly independent (minimal) MCLs $y_i$ and with $t\in\mathbb{N}^k$. Clearly, every such matrix $A_t$ is an sf-realization. Furthermore, $A_t\sim A_{t'}$ if and only if there are $p,q\in\mathbb{N}$ such that $pt=qt'$. Hence $A_t\not\sim A_{t'}$ if there are two distinct indices $1\le i<j\le k$ such that $t_i/t'_i\neq t_j/t'_j$. Clearly, there is an infinite set $T\subseteq\mathbb{N}^k$ of integer vectors such that this inequality is satisfied for all distinct $t,t'\in T$. For instance, one may choose distinct primes for all entries of $t\in T$. Thus there are infinitely many pairwise inequivalent sf-realizations. Furthermore, the choice of the (minimal) MCLs is not unique, in general, allowing additional freedom for sf-realizations. Finally, one may produce more complex sf-realizations by appending additional rows to $A$ that are linear combinations of the basis vectors. Therefore we have the following result.

Proposition 41

Let (X,R) be a conservative RN with stoichiometric matrix $S$. If $\dim\ker S^\top>1$, then there are infinitely many inequivalent sf-realizations of (X,R).
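The equivalence test behind this result can be illustrated with exact integer arithmetic. The sketch below assumes that an sf-realization is stored as a list of rows (one per moiety) and uses a toy network X1 + X2 → X3 with the two MCLs (1,0,1) and (0,1,1); the helper names `scaled_realization` and `equivalent` are hypothetical.

```python
from fractions import Fraction


def scaled_realization(mcls, t):
    """Rows t_i * y_i for MCLs y_i and positive integers t_i."""
    return [[ti * entry for entry in y] for ti, y in zip(t, mcls)]


def equivalent(A, B):
    """True iff p*A == q*B for some positive integers p and q.

    Assumes non-negative integer matrices of equal shape.
    """
    ratio = None
    for row_a, row_b in zip(A, B):
        for a, b in zip(row_a, row_b):
            if a == 0 and b == 0:
                continue
            if a == 0 or b == 0:
                return False  # zero patterns differ
            r = Fraction(a, b)
            if ratio is None:
                ratio = r  # candidate value of q/p
            elif r != ratio:
                return False
    return True


# MCLs of the toy network X1 + X2 -> X3 (assumed example).
mcls = [[1, 0, 1], [0, 1, 1]]
A_23 = scaled_realization(mcls, (2, 3))
A_46 = scaled_realization(mcls, (4, 6))   # same ratio t_1 : t_2
A_57 = scaled_realization(mcls, (5, 7))   # different ratio
```

Here `A_23` and `A_46` are equivalent (take p = 2, q = 1), whereas `A_23` and `A_57` are not, since 2/5 differs from 3/7.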

Structural formula realizations

A structural formula represents a chemical species as a (connected) molecular graph, whose vertices are labeled by atom types and whose edges refer to chemical bonds. Lewis structures [81] are equivalent to vertex-labeled multigraphs in which each bonding electron pair is represented as an individual edge, and each non-bonding electron pair as a loop. In particular, double or triple bonds are shown as two or three parallel edges. The educt and product complexes $r^-$ and $r^+$ of a reaction $r$ can then be represented as the disjoint unions of the educt and product graphs, respectively. A chemical reaction is a graph transformation that converts the educt graph into the product graph such that vertices and their labels are preserved [33, 82]. Only the bonds are rearranged. Since electrons are conserved, and each edge or loop accounts for two electrons, any reaction must preserve the sum of vertex degrees and thus the number of edges. Fig. 5 shows an example.

Fig. 5.

Multigraph representation for the reaction $\mathrm{H_2SO_4\to SO_3+H_2O}$. Atoms shown in color: H, black; O, red; S, yellow. Non-bonding electron pairs are represented by loops, double bonds by two parallel edges

This idea can be generalized to sf-realizations in which “atoms” are viewed as moieties. We may then interpret the vertices of a multigraph as “fragments” of species that are endowed with a certain number of “valencies” or “half bonds”. These must be “saturated” by binding to free valencies of other moieties or they must be used to form internal bonds within a moiety. In graph theory, the degree of a vertex is simply the number of incident edges. In chemistry, a related notion is the valency of an atom, i.e., the number of bonds (counting bond order) that can be formed by an atom. Each type of atom/moiety therefore has a fixed degree that we can think of as the number of half bonds. Each of these may bind to other moieties or form a “loop”, i.e., match up with another half bond of the same vertex. Correspondingly, the degree $d(u)$ of a vertex $u$ in a multigraph is defined as the number of edges that connect $u$ with other vertices plus twice the number of loops. A reaction thus preserves electrons if and only if its only effect is to rearrange the bonds in the multigraph. The valency $\mathrm{val}(a)$ of an atom of type $a$ is most naturally interpreted as the number of electrons in the outer shell. Loops then correspond to non-bonding electron pairs. This notion of valency matches Frankland’s “atomicity” and conforms to the IUPAC terminology [83]. Much of the chemical literature, however, uses the term valency loosely for the number of bonds; it is then not an unambiguous property of an element or atom and changes with the oxidation state.

Definition 42

Let $\mathcal{A}$ be a non-empty, finite set, $\mathrm{val}\colon\mathcal{A}\to\mathbb{N}$ be an arbitrary function, and $\sum_{a\in\mathcal{A}}n_a\,a$ be a sum formula. A multigraph $\Gamma=(V,E,\alpha)$ with loops and vertex coloring $\alpha\colon V\to\mathcal{A}$ is a corresponding structural formula if it satisfies the following conditions:

  • (i)

    Each vertex $u\in V$ corresponds to a moiety $\alpha(u)$, in particular, $|\{u\in V:\alpha(u)=a\}|=n_a$.

  • (ii)

    $d(u)=\mathrm{val}(\alpha(u))$ for all $u\in V$, i.e., the vertex degree of $u$ is given by the corresponding moiety.

  • (iii)

    Γ is connected.
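The three conditions can be tested directly on a concrete multigraph. The following is a minimal sketch, assuming vertices are numbered 0, 1, ..., that `colors[u]` gives the moiety of vertex u, and that edges are listed as pairs with a loop written as `(u, u)`; all function names are hypothetical. Water is used as the example, with val(H) = 1 and val(O) = 6 in line with the outer-shell interpretation of valency given above.

```python
from collections import Counter


def degrees(n_vertices, edges):
    """Vertex degrees; a loop (u, u) contributes 2 to the degree of u."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_connected(n_vertices, edges):
    """Depth-first search from vertex 0 over the underlying graph."""
    if n_vertices == 0:
        return False
    neighbours = {u: set() for u in range(n_vertices)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in neighbours[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n_vertices


def is_structural_formula(colors, edges, val, sum_formula):
    """Check conditions (i)-(iii) for a vertex-colored multigraph."""
    wanted = Counter({a: n for a, n in sum_formula.items() if n > 0})
    counts_ok = Counter(colors) == wanted                      # (i)
    deg = degrees(len(colors), edges)
    degrees_ok = all(deg[u] == val[colors[u]]                  # (ii)
                     for u in range(len(colors)))
    return counts_ok and degrees_ok and is_connected(len(colors), edges)  # (iii)


val = {"H": 1, "O": 6}
water_colors = ["O", "H", "H"]
# Two O-H bonds plus two loops (non-bonding electron pairs) at oxygen.
water_edges = [(0, 1), (0, 2), (0, 0), (0, 0)]
water_ok = is_structural_formula(water_colors, water_edges, val, {"H": 2, "O": 1})
```

Dropping the two loops leaves oxygen with degree 2 instead of 6, so condition (ii) fails for that graph.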

The structural formulas specified in Def. 42 do not cover all Lewis structures. In particular, neither explicit charges nor unpaired electrons are covered. While these are important from a chemical perspective, we shall see below that such extensions are not needed for our purposes since the straightforward multigraphs in Def. 42 already provide sufficient freedom to obtain representations for all conservative RNs. Extensions to radicals and charges will be briefly considered in the Discussion section.

Definition 43

Let (X,R) be a RN, $\mathcal{A}$ be a non-empty, finite set, and $\mathrm{val}\colon\mathcal{A}\to\mathbb{N}$ be an arbitrary function. A Lewis instance is an assignment of vertex-colored multigraphs $\Gamma_x=(V_x,E_x,\alpha_x)$ to all $x\in X$ such that

  • (i)

    vertex degrees satisfy $d(u)=\mathrm{val}(\alpha_x(u))$, for all $u\in V_x$ and $x\in X$, and

  • (ii)

    the corresponding matrix $A\in\mathbb{N}_0^{\mathcal{A}\times X}$ defined by $A_{ax}=|\{u\in V_x:\alpha_x(u)=a\}|$ is an sf-instance.

Furthermore, $x\mapsto\Gamma_x$ is a Lewis realization if $A$ is an sf-realization.

Clearly, every Lewis realization has a corresponding sf-realization. Given an sf-realization, we therefore ask when there is a corresponding Lewis realization. By Def. 42 and 43, we have the following result.

Lemma 44

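A RN (X,R) has a Lewis realization with corresponding sf-realization $A\in\mathbb{N}_0^{\mathcal{A}\times X}$ for some non-empty, finite set $\mathcal{A}$, if and only if there is a function $\mathrm{val}\colon\mathcal{A}\to\mathbb{N}$ such that for the sum formula $\sum_{a\in\mathcal{A}}A_{ax}\,a$ (for $x\in X$) there is a corresponding structural formula $\Gamma_x$.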

Proof

For the ’if’ part, let $\sum_{a\in\mathcal{A}}A_{ax}\,a$ be the sum formula for $x\in X$. By assumption, there exists a vertex-colored multigraph $\Gamma_x=(V_x,E_x,\alpha_x)$ for $x$ such that (i) vertex degrees satisfy $d(u)=\mathrm{val}(\alpha_x(u))$ and (ii) the corresponding matrix equals the sf-realization $A$. Analogously, for the ’only if’ part.

The appeal of this characterization is that it does not use any properties of the RN (X,R), at all. In fact, it is easy to see that such a representation always exists.

Lemma 45

Let $\mathcal{A}$ be a nonempty, finite set and $\sum_{a\in\mathcal{A}}n_a\,a$ be a sum formula. Then, there exists a corresponding structural formula with $\mathrm{val}(a)=2$ for all $a\in\mathcal{A}$.

Proof

If the sum formula is given by $n_a=1$ and $n_{a'}=0$ for all $a'\in\mathcal{A}\setminus\{a\}$, i.e., if it is a single moiety, then the corresponding structural formula is a single vertex with color $a$ and a loop. Otherwise, arrange the $|V|=\sum_a n_a$ vertices, of which exactly $n_a$ are colored by $a$, in a cycle and connect the vertices along the cycle. Then every vertex $u$ satisfies $d(u)=\mathrm{val}(\alpha(u))=2$ and the graph is connected.
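The construction in this proof is easy to carry out explicitly. The sketch below assumes a sum formula given as a dictionary from moiety names to multiplicities and returns vertex colors together with an edge list in which a loop is written as `(u, u)`; the function names are hypothetical. For two vertices the "cycle" consists of two parallel edges, which is admissible in a multigraph.

```python
def cycle_structural_formula(sum_formula):
    """Arrange all moieties on a cycle so that every vertex has degree 2."""
    colors = [a for a, n in sorted(sum_formula.items()) for _ in range(n)]
    n = len(colors)
    if n == 0:
        raise ValueError("empty sum formula")
    if n == 1:
        edges = [(0, 0)]  # single moiety: one vertex carrying one loop
    else:
        # Consecutive vertices are joined; for n == 2 this yields a double edge.
        edges = [(u, (u + 1) % n) for u in range(n)]
    return colors, edges


def degrees(n_vertices, edges):
    """Vertex degrees; a loop (u, u) contributes 2."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# One formaldehyde-like unit written with moieties C, H, O (illustrative only).
colors, edges = cycle_structural_formula({"C": 1, "H": 2, "O": 1})
all_degree_two = all(d == 2 for d in degrees(len(colors), edges))
```

The resulting graphs are chemically unrealistic, as all moieties are forced to valency 2, but they suffice for the existence statement.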

The result extends to any constant function $\mathrm{val}(a)=2k$ (with $k\in\mathbb{N}$) by adding $k-1$ loops to each vertex. As an immediate consequence of Lem. 44 and 45, we have the following result.

Proposition 46

(X,R) has a Lewis realization if and only if it has an sf-realization.

Using Prop. 40, we can characterize RNs that admit a Lewis realization.

Proposition 47

A RN (X,R) admits a Lewis realization if and only if it is conservative.

Interestingly, the simple multigraphs in Def. 42 are sufficient to represent all conservative RNs and thus (the proper part of) all chemical networks. Radicals and other chemical species whose structures cannot be expressed in terms of electron pairs therefore do not add to the universe of chemically realistic RNs. For more details, see the Discussion section.

Like an sf-realization, a Lewis realization does not necessarily assign distinct multigraphs $\Gamma_x$ and $\Gamma_y$ to distinct compounds $x$ and $y$. In the case of sf-realizations, obligatory isomers must have the same sum formula. In Lewis realizations, however, they need not have the same multigraph.

Proposition 48

For every conservative RN (X,R) there exists an injective Lewis realization $x\mapsto\Gamma_x$.

Proof

Sf-representations can be constructed to have an arbitrary number of atoms or moieties for each $x\in X$, that is, the vertex sets $V_x$ of the corresponding multigraphs $\Gamma_x$ can be chosen arbitrarily large. Set $\mathrm{val}(a)=4$ for all $a\in\mathcal{A}$ and construct an initial Lewis representation of compounds as cycles, as in the proof of Lemma 45, but with an additional loop at each vertex. Consider two obligatory isomers $x$ and $y$, and let the (adjacent) vertices $u,v\in V_x$ be connected (by a single edge). Now replace the two loops at the corresponding vertices $u,v\in V_y$ by two additional edges between $u$ and $v$. If the equivalence class of obligatory isomers contains more than two compounds, choose sets of pairs of disjoint positions along the cycles and replace pairs of loops by double edges. This yields circular matchings, familiar e.g. from the theory of RNA secondary structures [85, 86]. Setting $n=|V_x|-5$, one can construct crossing-free circular matchings on $n$ vertices, whose number grows faster than $2.6^n$, see also Fig. 6. Thus, if $V_x$ is chosen large enough, an arbitrarily large set of obligatory isomers can be represented by non-isomorphic multigraphs. Note, finally, that the construction of non-isomorphic graphs does not depend on (the cardinality of) the atom set $\mathcal{A}$, and thus the construction is also applicable in the case $|\mathcal{A}|=1$, i.e., $\dim\ker S^\top=1$.

Fig. 6.

Construction of non-isomorphic multigraphs with valency 4 in the proof of Prop. 48. The first three isomers are a cycle (with loops), a cycle with a single triple bond indicating an “origin”, and a graph with an additional double bond. In the third graph, the asymmetric arrangement of the double and triple bonds implies an unambiguous ordering of the remaining vertices (numbered from 1 to $n$). Non-isomorphic graphs are obtained by converting a pair of loops into a double bond. Since each vertex has at most one bond in addition to the cycle, the resulting graphs correspond to Kleitman’s “irreducible diagrams” [84]. If crossings of bonds are excluded, the resulting induced subgraphs with vertex set $\{1,\dots,n\}$ are isomorphic to RNA secondary structures on sequences of $n$ monomers. The number $S_n$ of secondary structures grows asymptotically as $2.6^n$ [85]

The proof in particular shows that the number of vertices required to accommodate the obligatory isomers grows only logarithmically in the size of the equivalence classes of obligatory isomers.
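The logarithmic growth can be made concrete by counting secondary structures. The sketch below assumes the standard recursion for the number $S_n$ of secondary structures on $n$ positions in which paired positions are never adjacent, $S_{n+1}=S_n+\sum_{k=1}^{n-1}S_kS_{n-1-k}$ with $S_0=S_1=S_2=1$; whether this matches the counting convention of [85] in every detail is an assumption, and the function names are hypothetical.

```python
def secondary_structure_counts(n_max):
    """Return [S_0, ..., S_n_max] using the recursion stated above."""
    s = [1, 1, 1]
    for n in range(2, n_max):
        # s[n + 1] = s[n] + sum_{k=1}^{n-1} s[k] * s[n - 1 - k]
        s.append(s[n] + sum(s[k] * s[n - 1 - k] for k in range(1, n)))
    return s[: n_max + 1]


def positions_needed(num_isomers, n_limit=200):
    """Smallest n with S_n >= num_isomers (searching up to n_limit)."""
    for n, count in enumerate(secondary_structure_counts(n_limit)):
        if count >= num_isomers:
            return n
    raise ValueError("n_limit too small")


first_counts = secondary_structure_counts(10)
n_for_thousand = positions_needed(1000)
n_for_million = positions_needed(10**6)
```

Under these assumptions, a thousand obligatory isomers already fit on a dozen numbered positions, and multiplying the class size by a constant factor adds only a constant number of vertices.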

Discussion

Characterization of chemistry-like reaction networks

In this contribution, we have characterized reaction networks that are chemistry-like in the sense that they are consistent with the conservation of energy and mass and allow an interpretation as transformations of chemical molecules. It is worth noting that we arrive at our results without invoking mass-action kinetics, which has been the focus of interest in chemical reaction network theory since the 1970s [79]. Instead, we found that basic arguments from thermodynamics (without kinetic considerations) are sufficient. The main results of this contribution can be summarized as follows:

  • (i)

    A closed RN (X,R) is thermodynamically sound if and only if it does not contain an irreversible futile cycle. In particular, every reversible network is thermodynamically sound. If irreversible reactions are meant to proceed in a given direction for all external conditions (after opening the RN by adding transport reactions), then (X,R) must be strictly thermodynamically sound. Equivalently, a futile cycle must not contain an irreversible reaction. An analogous result was obtained by [70] assuming mass-action kinetics.

  • (ii)

    A RN (X,R) is free of cornucopias and abysses if and only if it is conservative.

  • (iii)

    Both thermodynamic soundness and conservativity are completely determined by the stoichiometric matrix S, i.e., they are unaffected by catalysts.

  • (iv)

    A RN (X,R) admits an sf-realization if and only if it is conservative. That is, conservative RNs admit assignments of sum formulas such that (i) atoms (or moieties) are conserved and (ii) two compounds are assigned the same sum formula if and only if they are obligatory isomers. Obligatory isomers, in turn, are completely determined by S.

  • (v)

    For every sf-realization of a RN (X,R) there is also a Lewis-realization, i.e., an assignment of multigraphs to each compound such that reactions are exclusively rearrangements of edges.

Such chemistry-like realizations, however, are by no means unique. In general, the same RN has infinitely many chemical realizations corresponding to different atomic compositions. The structure of the stoichiometric matrix S of a closed RN therefore implies surprisingly little about the underlying chemistry.

Nevertheless, there is interesting information that is independent of the concrete realization. For example, Thm. 37 can be reformulated as follows: The reversible completion of (X,R) admits a net reaction of the form $p\,x\to q\,y$ with $x,y\in X$ and $p,q\in\mathbb{N}$ if and only if $p\,m_x=q\,m_y$ for every $m\in\ker S^\top$. This identifies “obligatory oligomers”, necessarily composed of multiples of the same monomer.

Computational considerations

Somewhat surprisingly, the computational problems associated with recognizing “chemistry-like” RNs are not particularly difficult and can be solved by well-established methods. To see this, recall that (X,R) is conservative iff there is a vector $m\gg 0$ such that $S^\top m=0$, and not thermodynamically sound iff there is a vector $v>0$ such that $Sv=0$ and $v_r>0$ for some $r\in R_{\mathrm{irr}}$. These linear programming problems can be solved in $O((|X|+|R|)^{2.37})$ time [87].
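Finding such vectors requires a linear programming solver, but verifying a proposed vector needs only integer arithmetic. The following sketch checks certificates for the two conditions exactly as stated in the preceding paragraph; it assumes that S is stored as a list of rows (species) with one column per reaction and that reversible reactions are represented by separate forward and backward columns. The function names and the three-species toy cycle are hypothetical.

```python
def left_product(S, m):
    """Entries of S^T m, one per reaction."""
    return [sum(S[x][r] * m[x] for x in range(len(S))) for r in range(len(S[0]))]


def right_product(S, v):
    """Entries of S v, one per species."""
    return [sum(S[x][r] * v[r] for r in range(len(v))) for x in range(len(S))]


def certifies_conservative(S, m):
    """m is strictly positive and satisfies S^T m = 0."""
    return all(mx > 0 for mx in m) and all(e == 0 for e in left_product(S, m))


def certifies_unsound(S, v, irreversible):
    """v >= 0, S v = 0, and v_r > 0 for some irreversible reaction r."""
    return (all(vr >= 0 for vr in v)
            and any(v[r] > 0 for r in irreversible)
            and all(e == 0 for e in right_product(S, v)))


# Toy cycle A -> B -> C -> A with all three reactions irreversible.
S_cycle = [[-1, 0, 1],
           [1, -1, 0],
           [0, 1, -1]]
conservative = certifies_conservative(S_cycle, [1, 1, 1])
futile = certifies_unsound(S_cycle, [1, 1, 1], irreversible={0, 1, 2})
```

The toy cycle is conservative (each species has mass 1) but not thermodynamically sound, since running all three irreversible reactions once returns the system to its initial composition.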

An integer (not necessarily non-negative) basis of $\ker S^\top$ can be computed exactly in polynomial time, e.g. using the Smith normal form, see [88]. Chubanov’s algorithm finds exact rational solutions to systems of linear equations with a strict positivity constraint. Thus it can be employed to compute a strictly positive integer solution $m\gg 0$ to $S^\top m=0$ in polynomial time [89, 90]. As a consequence, an sf-realization can also be computed explicitly in polynomial time. Each sum formula in turn can be converted into a graph with total effort bounded by $\bigl(\max_{x\in X}\sum_{a}A_{ax}\bigr)\cdot|X|$, the maximal number of atoms that appear in a sum formula times the number of molecules.
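For small networks, an exact integer spanning set of the kernel can be obtained without specialized software. The sketch below uses Gauss-Jordan elimination over the rationals followed by clearing denominators, which is a simpler substitute for the Smith normal form mentioned above: the vectors returned span the kernel over the rationals but are not guaranteed to form a lattice basis, and they need not be non-negative. The function names and the toy network are hypothetical.

```python
from fractions import Fraction
from math import gcd


def integer_kernel_basis(M):
    """Integer vectors spanning {x : M x = 0}, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(e) for e in row] for row in M]
    n_cols = len(M[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue  # no pivot here: column c is a free variable
        rows[r], rows[piv] = rows[piv], rows[r]
        lead = rows[r][c]
        rows[r] = [e / lead for e in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for i, p in enumerate(pivots):
            vec[p] = -rows[i][f]
        scale = 1  # least common multiple of all denominators
        for e in vec:
            scale = scale * e.denominator // gcd(scale, e.denominator)
        basis.append([int(e * scale) for e in vec])
    return basis


def transpose(S):
    return [list(col) for col in zip(*S)]


# Toy network X1 + X2 -> X3 (rows: species, columns: reactions).
S_toy = [[-1], [-1], [1]]
left_kernel = integer_kernel_basis(transpose(S_toy))
```

For the toy network the left kernel is two-dimensional; a strictly positive element such as (1, 1, 2) is a positive combination of suitable kernel vectors, which is the part that requires linear programming in general.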

The equivalence relation for obligatory isomers is determined by the existence of solutions to a linear equation of the form $Sv=w$ and thus can also be computed in polynomial time, again bounded by the effort for matrix multiplication for each pair $x,y\in X$. A much more efficient approach, however, is to compute a basis of $\ker S^\top$, from which the relation can be read off directly. This approach easily extends to “obligatory oligomers.”
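Reading the classes off a kernel basis amounts to grouping species with identical columns. The sketch below assumes that two species x and y are obligatory isomers exactly when $m_x=m_y$ for every $m\in\ker S^\top$, which is the case p = q = 1 of the reformulation of Thm. 37 given earlier; the function name, the species labels and the hard-coded basis are hypothetical.

```python
from collections import defaultdict


def obligatory_isomer_classes(species, kernel_basis):
    """Group species whose entries agree in every basis vector of ker S^T."""
    classes = defaultdict(list)
    for j, x in enumerate(species):
        column = tuple(m[j] for m in kernel_basis)
        classes[column].append(x)
    return list(classes.values())


# Toy network: 2 C1 -> C2a, C2a -> C2b, C1 + C2b -> C3.
# Its left kernel is spanned by the single vector (1, 2, 2, 3).
species = ["C1", "C2a", "C2b", "C3"]
kernel_basis = [[1, 2, 2, 3]]
classes = obligatory_isomer_classes(species, kernel_basis)
```

Because agreement on a basis implies agreement on the whole kernel, one pass over the columns suffices, instead of solving a linear system for each pair of species.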

Treating RNs as closed systems is too restrictive to describe metabolic networks. There, RNs are considered as open systems that allow the inflow of nutrients and the outflow of waste products. Models of metabolism often impose a condition of viability. Traditionally, this is modeled as a single export “reaction” $r_{bm}$ of the form $\sum_i\alpha_iC_i\to\emptyset$, known as the biomass function [91]. It comprises all relevant precursor metabolites $C_i$ (forming all relevant macromolecules) in their empirically determined proportions $\alpha_i$. Viability is then defined as the existence of a flow $v>0$ with $Sv=0$ and $v_{bm}>0$. This linear programming problem can be solved efficiently by means of flux balance analysis (FBA) [92]. In contrast to (X,R) being conservative and thermodynamically sound, however, viability is a property of the metabolic model, not of the underlying representation of the chemistry.

Outlook to open problems

Construction of random chemistry-like networks

The formal characterization of chemistry-like RNs developed here suggests several interesting questions for further research. In particular, our results define rather clearly how random chemistry-like RNs should be defined and thus pose the question whether there are efficient algorithms for their construction. Let us consider the task of generating a random chemistry-like RN in a bit more detail. We first note that it suffices to generate a stoichiometric matrix $S\in\mathbb{Z}^{X\times R}$ that is thermodynamically sound and conservative. If explicit catalysts are desired, they can be added to a reaction without further restrictions. More precisely, given $S$, we obtain a network with the same stoichiometric matrix plus catalysts by setting

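$$s^-_{xr}=c_{xr},\quad s^+_{xr}=c_{xr}+s_{xr}\quad\text{if } s_{xr}\ge 0,\qquad s^-_{xr}=c_{xr}-s_{xr},\quad s^+_{xr}=c_{xr}\quad\text{if } s_{xr}\le 0. \tag{28}$$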

The “catalyst matrix” $C$ may contain arbitrary integers $c_{xr}\ge 0$. For the generation of a RN (X,R), therefore, it can be drawn independently of $S$.
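Eq. (28) translates directly into code. The following sketch assumes that S and C are given as lists of rows of equal shape, with C non-negative; the function name `add_catalysts` is hypothetical.

```python
def add_catalysts(S, C):
    """Educt and product matrices (s_minus, s_plus) according to Eq. (28)."""
    # If s >= 0: s_minus = c,     s_plus = c + s.
    # If s <= 0: s_minus = c - s, s_plus = c.
    s_minus = [[c + max(-s, 0) for s, c in zip(row_s, row_c)]
               for row_s, row_c in zip(S, C)]
    s_plus = [[c + max(s, 0) for s, c in zip(row_s, row_c)]
              for row_s, row_c in zip(S, C)]
    return s_minus, s_plus


# Reaction A -> B catalyzed by one copy of E (rows: A, B, E).
S = [[-1], [1], [0]]
C = [[0], [0], [1]]
s_minus, s_plus = add_catalysts(S, C)
# The net stoichiometry s_plus - s_minus is unchanged by the catalyst.
net = [[p - m for p, m in zip(rp, rm)] for rp, rm in zip(s_plus, s_minus)]
```

The example describes A + E → B + E, whose net stoichiometric matrix coincides with that of A → B.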

The key task of generating (X,R) is therefore the construction of an $|X|\times|R|$ integer matrix $S$ that is conservative and thermodynamically sound. Both conditions amount to the (non)existence of vectors with certain sign patterns in $\ker S^\top$ and $\ker S$, respectively. In order to obtain a background model for a given chemical RN, one might also ask for a random integer matrix that has a given left nullspace and is thermodynamically sound. In addition, one would probably like to (approximately) preserve the fraction of zero entries per row and column and the mean of the non-zero entries. To our knowledge, no efficient exact algorithms for this problem are known.

A potentially promising alternative is the independent generation of the complex matrix Y and the incidence matrix Z of the complex-reaction graph. Given a fixed conservative and thermodynamically sound RN, furthermore, one can make use of the heredity of thermodynamic soundness and conservativity and consider random subnetworks. This approach has been explored in particular for metabolic networks: The ensemble of viable metabolic networks in a given chemical RN can then be sampled by a random walk on the set of reactions [57] or a more sophisticated Markov-Chain-Monte-Carlo procedure [55, 93].

Chemistry-like realizations

The structural formulas constructed in Lemma 45 are not very “realistic” from a chemical perspective. It is of interest, therefore, whether one can construct chemically more appealing (multi-)graphs. As noted in the Introduction, the problem of designing a “molecular implementation” of a prescribed stoichiometric matrix $S$ is a key problem in utilizing chemical reaction networks as computing devices. From a mathematical point of view there seem to be only a few constraints: (i) If a moiety $a$ appears in isolation, i.e., as a molecule $x=1\,a$, then $\mathrm{val}(a)$ must be even, since it contains $\mathrm{val}(a)/2$ loops. (ii) The case $\mathrm{val}(a)=1$ is only possible if there is no compound composed exclusively of three or more copies of $a$ or composed of more than two moieties with valency 1. (iii) It is well known that the sum of degrees must be even for every multigraph, and connectedness implies $\sum_u\mathrm{val}(u)\ge 2(|V|-1)$ [94].
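Constraint (iii) can be screened for before any graph is constructed. The sketch below tests only these necessary conditions for a given sum formula and valency assignment; it makes no claim of sufficiency, and the function name is hypothetical. Constraint (i) and the simplest instances of constraint (ii) are special cases of the same two inequalities.

```python
def passes_necessary_conditions(sum_formula, val):
    """Even degree sum and degree sum >= 2(|V| - 1) for a connected multigraph."""
    n_vertices = sum(sum_formula.values())
    degree_sum = sum(n * val[a] for a, n in sum_formula.items())
    return (n_vertices >= 1
            and degree_sum % 2 == 0
            and degree_sum >= 2 * (n_vertices - 1))


val = {"H": 1, "O": 6}
water = passes_necessary_conditions({"H": 2, "O": 1}, val)   # degree sum 8
h2 = passes_necessary_conditions({"H": 2}, val)              # degree sum 2
h3 = passes_necessary_conditions({"H": 3}, val)              # odd degree sum
h4 = passes_necessary_conditions({"H": 4}, val)              # too few half bonds
```

With val(H) = 1, a lone H atom and the compounds H3 and H4 are rejected, in agreement with constraints (i) and (ii).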

The problem of finding multigraph realizations is closely related to, but not the same as, the problem of determining the realizability of degree sequences in graphs [95] or multigraphs [96]. As in graph theory, it seems to be of particular interest to study realizability by structural formulas in the presence of additional constraints on admissible graphs. Complementary to constraints on the multigraphs that render them plausible chemical graphs, the “chemical implementation” of a given $S$ also involves constraints on the admissible (types of) reactions, i.e., the allowed rearrangements of edges in the multigraphs. It is much less clear how to formalize this aspect, although there seems to be a connection to graph grammar models of chemical reactions [97].

An advantage of considering the multigraphs specified in Def. 42 instead of the full range of Lewis structures is that a well-established mathematical theory is available. However, “multigraphs with semi-edges”, which are essentially equivalent to Lewis structures of radicals, have been studied occasionally in recent years [98, 99] and may be an appealing framework, in particular, when restricted realizations are considered. The example of nitrogen oxides in Fig. 7 shows, however, that unpaired electrons (as in the Lewis structure of NO) are not the only issue. A complete implementation of Lewis structures also requires local net charges $\mathrm{val}(\alpha(v))-\deg(v)$ at vertices $v$, as a semi-edge-like annotation distinct from unpaired electrons, see e.g. [100].

Fig. 7.

A Lewis structure-like presentation of $\mathrm{NO_2+NO\to N_2O_3}$ highlights that multigraphs with atom-type dependent degrees are not sufficient to represent all molecules of interest. To represent $\mathrm{NO_2}$, an unpaired electron (shown as a semi-edge ending in a small black ball), an N atom with vertex degree $4<\mathrm{val}(\mathrm{N})=5$, and an oxygen atom with vertex degree $7>\mathrm{val}(\mathrm{O})=6$ are all required. Similarly, NO is a neutral stable radical, with an unpaired electron at N. The product $\mathrm{N_2O_3}$ has no unpaired electrons, but exhibits an O and an N atom with a deviant vertex degree and thus a net charge. Differences between nominal valency and actual vertex degree are indicated by the charge symbols $\oplus$ and $\ominus$. In general, the net charge at a vertex $v$ is given by $\mathrm{val}(\alpha(v))-\deg(v)$

Infinite RNs

Throughout this contribution, we have assumed that (X,R) is finite. In general, however, chemical universes are infinite, at least in principle. The simplest examples of infinite families are polymers. It is of interest, therefore, to develop a theory of infinite reaction networks. To this end, one could follow e.g. [101], where infinite directed hypergraphs are also considered, and further extend the literature on countably infinite undirected hypergraphs, see e.g. [102, 103] and the references therein. Most previous work pre-supposed k-uniformity, i.e., hyper-edges of (small) finite cardinality, matching well with the situation in chemical RNs. Every sub-RN of an infinite RN induced by a finite vertex set $Y\subseteq X$ can be assumed to support only a finite number of reactions (directed hyperedges) $R_Y\subseteq R$. This amounts to assuming that a sub-RN induced by a finite set of compounds $Y$ is a finite RN. Every finite sub-RN of a “chemistry-like” infinite RN, furthermore, needs to be conservative and thermodynamically sound. Infinite RNs will not be locally finite, in general, since every compound $x\in X$ may have infinitely many reaction partners, e.g., all members of a polymer family. Thus $x$ may appear in an infinite number of reactions. These simple observations suggest that infinite “chemistry-like” RNs are non-trivial structures whose study may turn out to be a worthwhile mathematical endeavor.

Supplementary Information

13321_2022_621_MOESM1_ESM.txt (5.2KB, txt)

Additional file 1. Stoichiometric matrix of the complete formose RN, Fig. 4, in machine-readable form.

Acknowledgements

Not applicable.

Appendix: Mathematical notation

We consider matrices and vectors indexed by chemical species $x\in X$ or chemical reactions $r\in R$. Hence, both species and reactions can be thought of as endowed with an arbitrary, but fixed order determining the order of rows and columns. Standard mathematical notation is used without further explanation in the main text. Notation that may be less familiar is summarized here:

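$\mathbb{N}$  positive integers
$\mathbb{R}$  real numbers
$A^\top$  transpose of matrix $A$
$\ker A$  kernel of matrix $A$, i.e., $\ker A=\{x\mid Ax=0\}$
$\operatorname{im}A$  image of matrix $A$, i.e., $\operatorname{im}A=\{y\mid y=Ax \text{ for some } x\}$
$\operatorname{cone}A$  polyhedral cone induced by matrix $A$, i.e., $\operatorname{cone}A=\{y\mid y=Ax \text{ for some } x\ge 0\}$
$x^\top$  row vector of column vector $x$
$x_i$  component of vector $x\in\mathbb{R}^I$ (with $i\in I$)
$\operatorname{supp}x$  support of vector $x\in\mathbb{R}^I$, i.e., $\operatorname{supp}x=\{i\in I\mid x_i\neq 0\}$
$\dim V$  dimension of vector space $V$
$V^\perp$  orthogonal complement of vector space $V$, i.e., $V^\perp=\{y\mid x^\top y=0 \text{ for all } x\in V\}$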

Author contributions

CF and PFS designed the study; SM and PFS proved the mathematical results; all authors contributed to the interpretation of the results and the writing of the manuscript. All authors read and approved the final manuscript.

Funding

Open Access funding enabled and organized by Projekt DEAL. This research was funded in part by the German Federal Ministry of Education and Research within the project Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI.) Dresden/Leipzig (BMBF 01IS18026B), and the German Research Foundation DFG, grant no. STA 850/58-1. SM was supported by the Austrian Science Fund (FWF), project P33218.

Availability of data and materials

The stoichiometric matrix of the complete formose RN, Fig. 4, is available as machine-readable Additional file 1.

Declarations

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Stefan Müller, Email: st.mueller@univie.ac.at.

Christoph Flamm, Email: xtof@tbi.univie.ac.at.

Peter F. Stadler, Email: studla@bioinf.uni-leipzig.de

References

  • 1.Sandefur CI, Mincheva M, Schnell S. Network representations and methods for the analysis of chemical and biochemical pathways. Mol Biosyst. 2013;9:2189–2200. doi: 10.1039/c3mb70052f. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Alon U. Network motifs: theory and experimental approaches. Nat Rev Genet. 2007;8:450–461. doi: 10.1038/nrg2102. [DOI] [PubMed] [Google Scholar]
  • 3.Shellman ER, Burant CF, Schnell S. Network motifs provide signatures that characterize metabolism. Mol Biosyst. 2013;9:352–360. doi: 10.1039/c2mb25346a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Soulé C. Graphic requirements for multistationarity. ComplexUs. 2003;1:123–133. doi: 10.1159/000076100. [DOI] [Google Scholar]
  • 5.Borenstein E, Kupiec M, Feldman MW, Ruppin E. Large-scale reconstruction and phylogenetic analysis of metabolic environments. Proc Natl Acad Sci USA. 2008;105:14482–14487. doi: 10.1073/pnas.0806162105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Fagerberg R, Flamm C, Merkle D, Peters P, Stadler PF. On the complexity of reconstructing chemical reaction networks. Math Comp Sci. 2013;7:275–292. doi: 10.1007/s11786-013-0160-y. [DOI] [Google Scholar]
  • 7.Horn FJM. Necessary and sufficient conditions for complex balancing in chemical kinetics. Arch Rational Mech Anal. 1972;49:172–186. doi: 10.1007/BF00255664. [DOI] [Google Scholar]
  • 8.Horn F, Jackson R. General mass action kinetics. Arch Rational Mech Anal. 1972;47:81–116. doi: 10.1007/BF00251225. [DOI] [Google Scholar]
  • 9.Feinberg M. Complex balancing in general kinetic systems. Arch Rational Mech Anal. 1972;49:187–194. doi: 10.1007/BF00255665. [DOI] [Google Scholar]
  • 10.Craciun G, Dickenstein A, Shiu A, Sturmfels B. Toric dynamical systems. J Symb Comput. 2009;44:1551–1565. doi: 10.1016/j.jsc.2008.08.006. [DOI] [Google Scholar]
  • 11.Angeli D. A tutorial on chemical reaction network dynamics. Eur J Control. 2009;15:398–406. doi: 10.3166/ejc.15.398-406. [DOI] [Google Scholar]
  • 12.Craciun G, Feinberg M (2016) Multiple Equilibria in Complex Chemical Reaction Networks: II. The Species-Reaction Graph. SIAM J Appl Math. 66:1321–1338
  • 13.Kaltenbach HM. A unified view on bipartite species-reaction and interaction graphs for chemical reaction networks. Electronic Notes Theor Comp Sci. 2020;350:79–90. [Google Scholar]
  • 14.Shinar G, Feinberg M. Concordant chemical reaction networks and the Species-Reaction graph. Math Biosci. 2013;241:1–23. doi: 10.1016/j.mbs.2012.08.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Mincheva M, Roussel MR. A graph-theoretic method for detecting potential Turing bifurcations. J Chem Phys. 2006;125:204102. doi: 10.1063/1.2397073. [DOI] [PubMed] [Google Scholar]
  • 16.Zykov AA. Hypergraphs. Usp Math Nauk. 1974;6:89–154. [Google Scholar]
  • 17.Zhou W, Nakhleh L. Properties of metabolic graphs: biological organization or representation artifacts? BMC Bioinform. 2011;12:132. doi: 10.1186/1471-2105-12-132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Santiago Arguello A, Stadler PF (2021) Whitney’s Connectivity Inequalities for Directed Hypergraphs. Art Discr Appl Math. 5:P1.01
  • 19.Klamt S, Haus UU, Theis F. Hypergraphs and cellular networks. PLoS Comput Biol. 2009;5:e1000385. doi: 10.1371/journal.pcbi.1000385. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Montañez R, Medina MA, Solé RV, Rodríguez-Caso C. When metabolism meets topology: reconciling metabolite and reaction networks. BioEssays. 2010;32:246–256. doi: 10.1002/bies.200900145. [DOI] [PubMed] [Google Scholar]
  • 21.Andersen JL, Flamm C, Merkle D, Stadler PF. Chemical transformation motifs—modelling pathways as integer hyperflows. IEEE/ACM Trans Comp Biol. 2019;16:510–523. doi: 10.1109/TCBB.2017.2781724. [DOI] [PubMed] [Google Scholar]
  • 22.Wagner A, Fell DA. The small world inside large metabolic networks. Proc R Soc Lond B. 2001;268:1803–1810. doi: 10.1098/rspb.2001.1711. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Jeong H, Tombor B, Albert R, Oltvai ZN, Barabási AL. The large-scale organization of metabolic networks. Nature. 2000;407:651–654. doi: 10.1038/35036627. [DOI] [PubMed] [Google Scholar]
  • 24.Gleiss PM, Stadler PF, Wagner A, Fell DA. Relevant cycles in chemical reaction network. Adv Complex Syst. 2001;4:207–226. doi: 10.1142/S0219525901000140. [DOI] [Google Scholar]
  • 25.Fischer J, Kleidon A, Dittrich P. Thermodynamics of random reaction networks. PLoS ONE. 2015;10:e0117312. doi: 10.1371/journal.pone.0117312. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Schuster S, Höfer T. Determining all extreme semi-positive conservation relations in chemical reaction systems: a test criterion for conservativity. J Chem Soc Faraday Trans. 1991;87:2561–2566. doi: 10.1039/FT9918702561. [DOI] [Google Scholar]
  • 27.Gadewar SB, Doherty MF, Malone MF. A systematic method for reaction invariants and mole balances for complex chemistries. Comput Chem Eng. 2001;25:1199–1217. doi: 10.1016/S0098-1354(01)00695-0. [DOI] [Google Scholar]
  • 28.Famili I, Palsson BØ. The convex basis of the left null space of the stoichiometric matrix leads to the definition of metabolically meaningful pools. Biophys J. 2003;85:16–26. doi: 10.1016/S0006-3495(03)74450-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Flockerzi D, Bohmann A, Kienle A. On the existence and computation of reaction invariants. Chem Eng Sci. 2007;62:4811–4816. doi: 10.1016/j.ces.2007.05.003. [DOI] [Google Scholar]
  • 30.Haraldsdóttir HS, Fleming RMT. Identification of conserved moieties in metabolic networks by graph theoretical analysis of atom transition networks. PLoS Comput Biol. 2016;12:e1004999. doi: 10.1371/journal.pcbi.1004999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Fontana W (1991) Algorithmic chemistry. In: Langton CG, Taylor C, Farmer JD, Rasmussen S (eds) Artificial Life II. Addison-Wesley, pp 159–210
  • 32.Dittrich P, Ziegler J, Banzhaf W. Artificial chemistries—a review. Artificial life. 2001;7:225–275. doi: 10.1162/106454601753238636. [DOI] [PubMed] [Google Scholar]
  • 33.Benkö G, Flamm C, Stadler PF. A graph-based toy model of chemistry. J Chem Inf Comput Sci. 2003;43:1085–1093. doi: 10.1021/ci0200570. [DOI] [PubMed] [Google Scholar]
  • 34.Banzhaf W, Yamamoto L. Artificial chemistries. Cambridge: MIT Press; 2015. [Google Scholar]
  • 35.Berry G, Boudol G. The chemical abstract machine. Theor Comp Sci. 1992;96:217–248. doi: 10.1016/0304-3975(92)90185-I. [DOI] [Google Scholar]
  • 36.Liekens AML, Fernando CT (2007) Turing complete catalytic particle computers. In: Almeida e Costa F, Rocha LM, Costa E, Harvey I, Coutinho A, editors. Proceedings of the 9th European Conference on Artificial Life. vol. 4648 of Lect. Notes Comp. Sci. Berlin: Springer, p. 1202–1211
  • 37.Soloveichik D, Cook M, Winfree E, Bruck J. Computation with finite stochastic chemical reaction networks. Natural Comput. 2008;7:615–633. doi: 10.1007/s11047-008-9067-y. [DOI] [Google Scholar]
  • 38.Dueñas-Díez M, Pérez-Mercader J. Native chemical computation. A generic application of oscillating chemistry illustrated with the Belousov-Zhabotinsky reaction. A review. Front Chem. 2021;9:611120. doi: 10.3389/fchem.2021.611120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Soloveichik D, Seelig G, Winfree E. DNA as a universal substrate for chemical kinetics. Proc Natl Acad Sci USA. 2010;107:5393–5398. doi: 10.1073/pnas.0909380107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Badelt S, Shin SW, Johnson RFJ, Dong Q, Thachuk C, Winfree E (2017) A General-Purpose CRN-to-DSD Compiler with Formal Verification, Optimization, and Simulation Capabilities. In: Brijder R, Qian L, editors. DNA Computing and Molecular Programming. vol. 10467 of Lect. Notes Comp. Sci. Cham: Springer. p. 232–248
  • 41.Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998;393:440–442. doi: 10.1038/30918. [DOI] [PubMed] [Google Scholar]
  • 42.Newman MEJ, Strogatz SH, Watts DJ. Random graphs with arbitrary degree distributions and their applications. Phys Rev E. 2001;64:026118. doi: 10.1103/PhysRevE.64.026118. [DOI] [PubMed] [Google Scholar]
  • 43.Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási AL. Hierarchical organization of modularity in metabolic networks. Science. 2002;297:1551–1555. doi: 10.1126/science.1073374. [DOI] [PubMed] [Google Scholar]
  • 44.Arita M. The metabolic world of Escherichia coli is not small. Proc Natl Acad Sci USA. 2004;101:1543–1547. doi: 10.1073/pnas.0306458101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Azizi A, Dewar J, Wu T, Hyman JM. Generating bipartite networks with a prescribed joint degree distribution. J Complex Netw. 2017;5:839–857. doi: 10.1093/comnet/cnx014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Rao AR, Jana R, Bandyopadhyay S. A Markov chain Monte Carlo method for generating random (0,1)-matrices with given marginals. Indian J Statistics Ser A. 1996;58:225–242. [Google Scholar]
  • 47.Hanhijärvi S, Garriga GC, Puolamäki K (2009) Randomization Techniques for Graphs. In: Proceedings of the 2009 SIAM International Conference on Data Mining. SIAM. p. 780–791
  • 48.Strona G, Nappo D, Boccacci F, Fattorini S, San-Miguel-Ayanz J. A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals. Nat Comm. 2014;5:4114. doi: 10.1038/ncomms5114. [DOI] [PubMed] [Google Scholar]
  • 49.Saracco F, Di Clemente R, Gabrielli A, Squartini T. Randomizing bipartite networks: the case of the World Trade Web. Sci Rep. 2015;5:10595. doi: 10.1038/srep10595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.de Panafieu É. Phase transition of random non-uniform hypergraphs. J Discrete Alg. 2015;31:26–39. doi: 10.1016/j.jda.2015.01.009. [DOI] [Google Scholar]
  • 51.Ghoshal G, Zlatić V, Caldarelli G, Newman MEJ. Random hypergraphs and their applications. Phys Rev E. 2009;79:066118. doi: 10.1103/PhysRevE.79.066118. [DOI] [PubMed] [Google Scholar]
  • 52.Sloan RH, Stasi D, Turán G. Random horn formulas and propagation connectivity for directed hypergraphs. Discrete Math Theor Comp Sci. 2012;14:29–36. [Google Scholar]
  • 53.Nakajima K, Shudo K, Masuda N (2021) Randomizing hypergraphs preserving degree correlation and local clustering. IEEE Trans Network Sci Eng
  • 54.Braun P (2019) Randomization of chemical reaction networks based on a graph-language model [MSc thesis]. Universität Wien, Fakultät für Physik. https://othes.univie.ac.at/58106/
  • 55.Samal A, Matias Rodrigues JF, Jost J, Martin OC, Wagner A. Genotype networks in metabolic reaction spaces. BMC Syst Biol. 2010;4:30. doi: 10.1186/1752-0509-4-30. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Kim H, Smith HB, Mathis C, Raymond J, Walker SI. Universal scaling across biochemical networks on Earth. Sci Adv. 2019;5:eaau0149. doi: 10.1126/sciadv.aau0149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Matias Rodrigues JF, Wagner A. Evolutionary plasticity and innovations in complex metabolic reaction networks. PLoS Comput Biol. 2009;5:e1000613. doi: 10.1371/journal.pcbi.1000613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Oró J, Kimball AP. Synthesis of purines under possible primitive earth conditions. I. Adenine from hydrogen cyanide. Arch Biochem Biophys. 1961;94:217–227. doi: 10.1016/0003-9861(61)90033-9. [DOI] [PubMed] [Google Scholar]
  • 59.Andersen JL, Andersen T, Flamm C, Hanczyc M, Merkle D, Stadler PF. Navigating the chemical space of HCN polymerization and hydrolysis: guiding graph grammars by mass spectrometry data. Entropy. 2013;15:4066–4083. doi: 10.3390/e15104066. [DOI] [Google Scholar]
  • 60.Tschoegl NW. Fundamentals of equilibrium and steady-state thermodynamics. Amsterdam: Elsevier; 2000. [Google Scholar]
  • 61.Schilling CH, Letscher D, Palsson BØ. Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. J Theor Biol. 2000;203(3):229–248. doi: 10.1006/jtbi.2000.1073. [DOI] [PubMed] [Google Scholar]
  • 62.Beard DA, Liang S, Qian H. Energy balance for analysis of complex metabolic networks. Biophys J. 2002;83:79–86. doi: 10.1016/S0006-3495(02)75150-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Schwender J, Ohlrogge J, Shachar-Hill Y. Understanding flux in plant metabolic networks. Curr Opin Plant Biol. 2004;7:309–317. doi: 10.1016/j.pbi.2004.03.016. [DOI] [PubMed] [Google Scholar]
  • 64.Qian H, Beard DA. Metabolic futile cycles and their functions: a systems analysis of energy and control. IEE Proc Systems Biology. 2006;153:192–200. doi: 10.1049/ip-syb:20050086. [DOI] [PubMed] [Google Scholar]
  • 65.Minty GJ. A “from scratch” proof of a theorem of Rockafellar and Fulkerson. Mathematical Programming. 1974;7:368–375. doi: 10.1007/BF01585531. [DOI] [Google Scholar]
  • 66.Müller S, Hofbauer J, Regensburger G. On the bijectivity of families of exponential/generalized polynomial maps. SIAM J Appl Algebra Geom. 2019;3(3):412–438. doi: 10.1137/18M1178153. [DOI] [Google Scholar]
  • 67.Dondi D, Merli D, Albini A, Zeffiro A, Serpone N. Chemical reaction networks as a model to describe UVC- and radiolytically-induced reactions of simple compounds. Photochem Photobiol Sci. 2012;11:835–842. doi: 10.1039/c2pp00005a. [DOI] [PubMed] [Google Scholar]
  • 68.Pekař M. Thermodynamics and foundations of mass-action kinetics. Prog React Kinet Mech. 2005;30:3–113. doi: 10.3184/007967405777874868. [DOI] [Google Scholar]
  • 69.Polettini M, Esposito M. Irreversible thermodynamics of open chemical networks. I. Emergent cycles and broken conservation laws. J Chem Phys. 2014;141:024117. doi: 10.1063/1.4886396. [DOI] [PubMed] [Google Scholar]
  • 70.Gorban AN, Yablonsky GS. Extended detailed balance for systems with irreversible reactions. Chem Eng Sci. 2011;66(21):5388–5399. doi: 10.1016/j.ces.2011.07.054. [DOI] [Google Scholar]
  • 71.Gorban AN, Mirkes EM, Yablonsky GS. Thermodynamics in the limit of irreversible reactions. Physica A: Stat Mech Appl. 2013;392(6):1318–1335. doi: 10.1016/j.physa.2012.10.009. [DOI] [Google Scholar]
  • 72.Bigan E, Steyaert JM, Douady S (2013) Properties of Random Complex Chemical Reaction Networks and Their Relevance to Biological Toy Models. arXiv. 1303.7439
  • 73.Rao R, Esposito M. Conservation laws and work fluctuation relations in chemical reaction networks. J Chem Phys. 2018;149:245101. doi: 10.1063/1.5042253. [DOI] [PubMed] [Google Scholar]
  • 74.Schuster S, Hilgetag C. What information about the conserved-moiety structure of chemical reaction systems can be derived from their stoichiometry? J Phys Chem. 1995;99:8017–8023. doi: 10.1021/j100020a026. [DOI] [Google Scholar]
  • 75.Müller S, Regensburger G. Elementary vectors and conformal sums in polyhedral geometry and their relevance for metabolic pathway analysis. Front Genet. 2016;7:1–11. doi: 10.3389/fgene.2016.00090. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.De Martino A, De Martino D, Mulet R, Pagnani A. Identifying all moiety conservation laws in genome-scale metabolic networks. PLoS ONE. 2014;9:e100750. doi: 10.1371/journal.pone.0100750. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Graver JE. On the foundations of linear and integer linear programming. I. Math Program. 1975;9:207–226. doi: 10.1007/BF01681344. [DOI] [Google Scholar]
  • 78.Doty D, Zhu S. Computational complexity of atomic chemical reaction networks. Natural Comput. 2018;17:677–691. doi: 10.1007/s11047-018-9687-9. [DOI] [Google Scholar]
  • 79.Benner SA, Kim HJ, Kim MJ, Ricardo A. Planetary organic chemistry and the origins of biomolecules. Cold Spring Harb Perspect Biol. 2010;2:a003467. doi: 10.1101/cshperspect.a003467. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Meléndez-Hevia E, Isidoro A. The game of the pentose phosphate cycle. J Theor Biol. 1985;117(2):251–263. doi: 10.1016/S0022-5193(85)80220-4. [DOI] [PubMed] [Google Scholar]
  • 81.Lewis GN. The atom and the molecule. J Am Chem Soc. 1916;38:762–785. doi: 10.1021/ja02261a002. [DOI] [Google Scholar]
  • 82.Rossello F, Valiente G. Chemical graphs, chemical reaction graphs, and chemical graph transformation. Electr Notes Theor Comp Sci. 2005;127:157–166. doi: 10.1016/j.entcs.2004.12.033. [DOI] [Google Scholar]
  • 83.Muller P. Glossary of terms used in physical organic chemistry (IUPAC Recommendations 1994) Pure Appl Chem. 1994;66:1077–1184. doi: 10.1351/pac199466051077. [DOI] [Google Scholar]
  • 84.Kleitman DJ. Proportions of irreducible diagrams. Studies Appl Math. 1970;49:297–299. doi: 10.1002/sapm1970493297. [DOI] [Google Scholar]
  • 85.Stein PR, Waterman MS. On some new sequences generalizing the Catalan and Motzkin numbers. Discr Math. 1979;26:261–272. doi: 10.1016/0012-365X(79)90033-5. [DOI] [Google Scholar]
  • 86.Waterman MS, Smith TF. RNA secondary structure: a complete mathematical analysis. Math Biosci. 1978;42:257–266. doi: 10.1016/0025-5564(78)90099-8. [DOI] [Google Scholar]
  • 87.Cohen MB, Lee YT, Song Z. Solving linear programs in the current matrix multiplication time. J ACM. 2021;68:31–39. doi: 10.1145/3424305. [DOI] [Google Scholar]
  • 88.Newman M. The Smith normal form. Lin Alg Appl. 1997;254:367–381. doi: 10.1016/S0024-3795(96)00163-2. [DOI] [Google Scholar]
  • 89.Chubanov S. A polynomial projection algorithm for linear feasibility problems. Math Program. 2015;153:687–713. doi: 10.1007/s10107-014-0823-8. [DOI] [Google Scholar]
  • 90.Root K. An improved version of Chubanov’s method for solving a homogeneous feasibility problem. Opt Methods Softw. 2018;33:26–44. doi: 10.1080/10556788.2017.1368509. [DOI] [Google Scholar]
  • 91.Feist AM, Palsson BØ. The biomass objective function. Curr Opin Microbiol. 2010;13:344–349. doi: 10.1016/j.mib.2010.03.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Orth JD, Thiele I, Palsson BØ. What is flux balance analysis? Nature Biotech. 2010;28:245–248. doi: 10.1038/nbt.1614. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Barve A, Matias Rodrigues J, Wagner A. Superessential reactions in metabolic networks. Proc Natl Acad Sci. 2012;109:E1121–E1130. doi: 10.1073/pnas.1113065109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Edmonds J. Existence of k-edge connected ordinary graphs with prescribed degrees. J Res Nat Bur Standards Sect B. 1964;68:73–74. doi: 10.6028/jres.068B.013. [DOI] [Google Scholar]
  • 95.Meierling D, Volkmann L. A remark on degree sequences of multigraphs. Math Methods Oper Res. 2009;69:369–374. doi: 10.1007/s00186-008-0265-2. [DOI] [Google Scholar]
  • 96.Sierksma G, Hoogeveen H. Seven criteria for integer sequences being graphic. J Graph Th. 1991;15:223–231. doi: 10.1002/jgt.3190150209. [DOI] [Google Scholar]
  • 97.Andersen JL, Flamm C, Merkle D, Stadler PF. Inferring chemical reaction patterns using graph grammar rule composition. J Syst Chem. 2013;4:4. doi: 10.1186/1759-2208-4-4. [DOI] [Google Scholar]
  • 98.Getzler E, Kapranov MM. Modular operads. Compositio Mathematica. 1998;110:65–125. doi: 10.1023/A:1000245600345. [DOI] [Google Scholar]
  • 99.Mednykh AD, Nedela R. Harmonic morphisms of graphs: Part I: graph coverings. Banska Bystrica: Vydavatelstvo Univerzity Mateja Bela; 2015. [Google Scholar]
  • 100.Karen P, McArdle P, Takats J (2014) Toward a comprehensive definition of oxidation state. J Pure Appl Chem. 86:1017–1081. IUPAC Report
  • 101.Ostermeier L, Hellmuth M, Stadler PF. The Cartesian product of hypergraphs. J Graph Th. 2012;70:180–196. doi: 10.1002/jgt.20609. [DOI] [Google Scholar]
  • 102.Banakh T, van der Zypen D. Minimal covers of infinite hypergraphs. Discr Math. 2019;342:3043–3046. doi: 10.1016/j.disc.2019.06.014. [DOI] [Google Scholar]
  • 103.Bustamante S, Corsten J, Frankl N. Partitioning infinite hypergraphs into few monochromatic Berge-Paths. Graphs Combinatorics. 2020;36:437–444. doi: 10.1007/s00373-019-02113-3. [DOI] [Google Scholar]

Associated Data

Supplementary Materials

13321_2022_621_MOESM1_ESM.txt (5.2KB, txt)

Additional file 1. Stoichiometric matrix of the complete formose RN, Fig. 4, in machine-readable form.

Data Availability Statement

The stoichiometric matrix of the complete formose RN, Fig. 4, is available in machine-readable form as Additional file 1.


Articles from Journal of Cheminformatics are provided here courtesy of BMC
