2020 Apr 4;82(4):48. doi: 10.1007/s11538-020-00724-z

Circular Tessera Codes in the Evolution of the Genetic Code

Elena Fimmel, Martin Starman, Lutz Strüngmann
PMCID: PMC7128014  PMID: 32248310

Abstract

The origin of the modern genetic code and the mechanisms that have contributed to its present form raise many questions. The main goal of this work is to test two hypotheses concerning the development of the genetic code for their compatibility and complementarity and to see whether they could benefit from each other. On the one hand, Gonzalez, Giannerini and Rosa developed a theory based on four-base codons, which they called tesserae. This theory can explain the degeneracy of the modern vertebrate mitochondrial code. On the other hand, in the 1990s, so-called circular codes were discovered in nature, which seem to ensure the maintenance of a correct reading-frame during the translation process. It turns out that the two concepts do not contradict each other; on the contrary, they complement and enrich each other.

Keywords: Genetic code, Degeneracy, Circular code, Tessera

Introduction

In 1986, John Maynard Smith stated: “We understand biological phenomena only when we have invented machines with similar properties” (Smith 1986, pp 99–100). This quotation explains the motivation of this work quite well: the paper was written in order to better understand the origin of the genetic code using such a machine. One possible machine, or rather a model, which gives a feasible explanation for an important aspect of the evolutionary processes of the genetic code was found by Gonzalez, Giannerini and Rosa. In their work “On the origin of degeneration in the genetic code” (Gonzalez et al. 2019) they focus on the degeneracy of amino acid coding and especially on symmetry as an essential cause and consequence of the natural phenomenon of degeneracy (compare also Fimmel and Strüngmann 2016). A famous example which shows the importance of including symmetry considerations when studying natural phenomena can be found in quantum mechanics. Here, symmetry describes more than just the patterns that matter takes; it is used to classify the nature of quantum states. This is far from the only example of its kind. Noether’s theorem even establishes a one-to-one connection between fundamental laws of nature (so-called conservation laws) and corresponding symmetries in nature.

Taking these general considerations into account, Gonzalez, Giannerini, and Rosa argue that none of the theories regarding the origin of the genetic code pays the necessary attention to the idea of symmetry (Gonzalez et al. 2019). As a consequence, the concept of tessera codes was developed. The tesserae form a subset of all tetranucleotides, chosen in such a way that the degeneracy of the vertebrate mitochondrial genetic code can be explained from the symmetries of the tesserae (Gonzalez et al. 2012).

The other line of thought addressed by the current work is the theory of circular codes. This theory is intended to explain the noise immunity of the genetic code and is based on a proposal by Crick et al. (1957). They argue that the coding of amino acids requires only a subset of codons in which the correct reading-frame is automatically and immediately recognizable, the so-called comma-free property. While Crick’s theory was refuted experimentally (Nirenberg and Matthaei 1961), 40 years later so-called circular codes were discovered in nature (Arqués and Michel 1996). More specifically, it has been noticed that the set of codons which, together with their frame-shifts in the three potential reading-frames, are the most commonly used across all species has very remarkable properties in terms of detecting the correct reading-frame (Fimmel and Strüngmann 2018; Fimmel et al. 2016; Michel 2017). The comma-free codes proposed by Crick belong to the family of circular codes and, within this family, have the strongest error-detecting properties (see, for instance, Fimmel et al. 2018, 2017, 2016, xxxx). The natural circular codes have even more interesting structural properties, which makes it very doubtful that these structures play no role in biological processes (Arqués and Michel 1996; Fimmel and Strüngmann 2018).

The primary goal of this work is to combine the two concepts, tesserae and circular codes, and see if they could benefit from each other. In this work we specify among other things a construction algorithm for circular tessera codes of maximal length. Furthermore, self-complementary tessera codes are characterized and criteria for their self-complementarity are formulated and proved in the language of graph theory. The growth tables for circular and comma-free tessera codes are also presented for the first time. In summary, one result of the work is that the two concepts under scrutiny—that of tessera codes and circularity—have proved to be mutually compatible and complementary.

Thus, with this work we hope to bring more clarity into the possible role of tesserae in the evolutionary process of the genetic code and the mechanisms behind it.

Definitions and Notations

The genetic code is written with words of three letters called codons, built over an alphabet

$$\mathcal{B} := \{U(T), C, A, G\}$$

of four letters which are called nucleotide bases: Uracil (Thymine), Cytosine, Adenine, and Guanine, in short U (T), C, A, and G. Clearly, the number of codons is $4^3 = 64$, and by $|\mathcal{B}^3|$ we will denote the cardinality of the set $\mathcal{B}^3$. Accordingly, $\mathcal{B}^2$ denotes the set of 16 dinucleotides and $\mathcal{B}^4$ contains the 256 tetranucleotides. It is hypothesized that during evolution the genetic code had several ancestors that might have consisted not only of trinucleotides but of dinucleotides or tetranucleotides or even combinations of these (see Baranov et al. 2009; Gonzalez et al. 2012; Seligmann 2014; Patel 2005; Wilhelm and Nikolajewa 2004; Wu et al. 2005). In particular, in Gonzalez et al. (2012) the tessera code was suggested as an ancestral code that might have been the origin of the mitochondrial code (see also Gonzalez et al. 2019). In order to define the tessera code we have to introduce some group theory and explain how it can be applied in the genetic setting.

Klein Four-Group and Equivalence Classes of Dinucleotides

The symmetric group on a set of elements is usually known as the group of permutations of these elements. Applying this to our genetic alphabet $\mathcal{B}$ we define the symmetric group $S_{\mathcal{B}}$ as

$$S_{\mathcal{B}} = \{\pi : \mathcal{B} \to \mathcal{B} \mid \pi \text{ is bijective}\}$$

with the usual group operation given by composition of functions. Recall that a group $(H, \circ)$ is a set $H$ together with an operation $\circ : H \times H \to H$ such that $\circ$ is associative and $H$ contains a neutral element $e$ as well as inverses $h^{-1}$ for all $h \in H$ (see Rotman 1995 for more details on groups). The group $S_{\mathcal{B}}$ has $4! = 24$ elements and is trivially isomorphic to the symmetric group $S_4$ on four elements. We will use standard notation as can be found in Rotman (1995), e.g. we will either write $\pi = (A, G, C, U)$ or $\pi : (A, U, C, G) \to (G, A, U, C)$ if $\pi$ satisfies $\pi(A) = G$, $\pi(U) = A$, $\pi(C) = U$, and $\pi(G) = C$. Naturally, any permutation $\pi : \mathcal{B} \to \mathcal{B}$ can be applied to $n$-nucleotides of any length componentwise, i.e. if $x = b_1 \cdots b_n \in \mathcal{B}^n$, then $\pi(x) = \pi(b_1) \cdots \pi(b_n)$. There is no danger of confusion when denoting the induced bijective map $\mathcal{B}^n \to \mathcal{B}^n$ by $\pi$ again for any natural number $n$.

In Fimmel et al. (2014, 2015) a subgroup $L$ of $S_{\mathcal{B}}$ was identified that seems to play an important role in error-detection and error-correction mechanisms during the translation process. This group consists of all permutations from $S_{\mathcal{B}}$ that preserve the codon-anticodon relation and can be geometrically interpreted as the symmetry group of a square. In particular, it contains 4 bijective transformations of nucleotide bases that are invariant with respect to the chemical characters of the nucleotides (we will use the notations from Fimmel et al. 2014, 2015). These are the

Identity:

$I$ (or $id$): $(A, U, C, G) \to (A, U, C, G)$;

Strong/Weak (SW) or complementary transformation:

$SW$ (or $c$): $(A, U, C, G) \to (U, A, G, C)$;

Pyrimidine/Purine (YR) transformation:

$YR$ (or $p$): $(A, U, C, G) \to (G, C, U, A)$;

and Keto/Amino (KM) transformation:

$KM$ (or $r$): $(A, U, C, G) \to (C, G, A, U)$.

In particular, the complementary map $c$ is biologically important since it mirrors the hydrogen bonds A-T and C-G of the DNA double helix. Moreover, the transformation $r$ from above carries codons of degeneracy class 4 to codons of degeneracy class less than 4 and vice versa, a symmetry property of the genetic code that was already observed by Rumer (see Fimmel et al. 2014, 2015 for more details). In the sequel we will denote the set of these four transformations as $V = \{I, SW, YR, KM\}$ (Fig. 1).

Fig. 1. Graphical representation of the primeval base symmetries. KM is represented by red, YR by green and SW by blue colored lines (Color figure online)

Equipped with the usual group operation of $S_{\mathcal{B}}$, the set $V$ forms a subgroup of the symmetric group $S_{\mathcal{B}}$ which is isomorphic to the so-called Klein four-group. It can be easily verified that the group $V$ is commutative, i.e. $\alpha \circ \beta = \beta \circ \alpha$ for all $\alpha, \beta \in V$, and that every permutation in $V$ is its own inverse, i.e. applying it twice yields the identity: $\alpha \circ \alpha = id$ for every $\alpha \in V$.

As we will see in the next section, the group $V$ is used to define the class of tesserae in mathematical terms. If we consider $V$ acting on the set of dinucleotides $\mathcal{B}^2$ we obtain four orbits of size four. Recall that the orbit of an element $x$ (here a dinucleotide) under some group $H$ (here $V$) is defined as $[x] = \{h(x) : h \in H\}$. Each orbit represents an equivalence class under the natural equivalence relation $d_1 d_2 \sim d_1' d_2'$ if and only if there is $\pi \in V$ such that $\pi(d_1 d_2) = d_1' d_2'$. An easy observation shows that for each such equivalence class there is a unique transformation $\pi \in V$ that maps the first nucleotide of every dinucleotide in that class to the second nucleotide, e.g. the map SW for the class $[AU] = \{AU, UA, CG, GC\}$. Table 1 below shows the four equivalence classes and the corresponding permutations.
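The group properties claimed for $V$ can be checked mechanically. The following is a minimal sketch, assuming the four transformations are encoded as Python dictionaries with the images listed above; the helper names `compose`, `name_of` and `cayley` are illustrative and not taken from the cited works.

```python
from itertools import product

BASES = "AUCG"

# The four transformations as base -> image dictionaries, using the
# images of (A, U, C, G) stated in the text.
V = {
    "I":  dict(zip(BASES, "AUCG")),
    "SW": dict(zip(BASES, "UAGC")),
    "YR": dict(zip(BASES, "GCUA")),
    "KM": dict(zip(BASES, "CGAU")),
}

def compose(p, q):
    """Return the permutation that first applies q and then p."""
    return {b: p[q[b]] for b in BASES}

def name_of(perm):
    """Name of a permutation in V (raises StopIteration if it is not in V)."""
    return next(name for name, p in V.items() if p == perm)

# Multiplication table of V; building it without an exception shows closure.
cayley = {(a, b): name_of(compose(V[a], V[b])) for a, b in product(V, repeat=2)}

is_commutative = all(cayley[a, b] == cayley[b, a] for a, b in product(V, repeat=2))
all_involutions = all(cayley[a, a] == "I" for a in V)

print("commutative:", is_commutative)
print("every element squares to I:", all_involutions)
print("SW after YR =", cayley["SW", "YR"])
```

A closed, commutative group of four elements in which every element squares to the identity is the Klein four-group, which is what the two flags confirm.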

Table 1.

Each column is one of the four equivalence classes of dinucleotides under the action of $V$ on $\mathcal{B}^2$: $\Sigma_I = [AA]$, $\Sigma_{SW} = [AU]$, $\Sigma_{KM} = [AC]$, $\Sigma_{YR} = [AG]$

V Σ_I Σ_SW Σ_KM Σ_YR
I AA AU AC AG
SW UU UA UG UC
KM CC CG CA CU
YR GG GC GU GA

The leftmost column shows the transformation that sends the first dinucleotide in the class to the second, third and fourth, respectively, e.g. $KM(AA) = CC$. The column headers are the names of the equivalence classes. The header index is the unique transformation that maps the first nucleotide of each dinucleotide in the class to the second
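The orbit structure can be reproduced with a few lines of code. This is a sketch under the assumption that the transformations act exactly as defined for $I$, $SW$, $YR$ and $KM$ in the text; `orbit` and `linking_transformation` are hypothetical helper names introduced here.

```python
from itertools import product

BASES = "AUCG"
V = {
    "I":  dict(zip(BASES, "AUCG")),
    "SW": dict(zip(BASES, "UAGC")),
    "YR": dict(zip(BASES, "GCUA")),
    "KM": dict(zip(BASES, "CGAU")),
}

def apply(perm, word):
    """Apply a base permutation letter by letter."""
    return "".join(perm[b] for b in word)

def orbit(word):
    """Orbit of a word under the action of V."""
    return frozenset(apply(p, word) for p in V.values())

def linking_transformation(dinucleotide):
    """Name of the element of V that maps the first base to the second."""
    first, second = dinucleotide
    return next(name for name, p in V.items() if p[first] == second)

dinucleotides = ["".join(d) for d in product(BASES, repeat=2)]

# Map each orbit to the set of linking transformations of its members.
classes = {}
for d in dinucleotides:
    classes.setdefault(orbit(d), set()).add(linking_transformation(d))

for members, links in classes.items():
    print(sorted(members), "linked by", sorted(links))
```

Each orbit is reported with a single linking transformation, which is the uniqueness statement made in the text.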

We are now almost in a position to define the set of tesserae as introduced in Gonzalez et al. (2012), but first we need some more technicalities. Besides the group $S_{\mathcal{B}}$ acting as a group of exchanges of bases, there is a second important group, which consists of transformations that permute the positions of single bases in a nucleotide sequence. Together with the usual composition of maps these permutations again form a group, which can once more be seen as a symmetric group $S_n$. For the convenience of the reader we recall here only the biologically relevant permutations that will be of importance for us: the so-called reversing permutation $x \mapsto \overleftarrow{x}$ and the $n-1$ shift operations $\alpha_1, \ldots, \alpha_{n-1}$. Given an $n$-nucleotide $x = N_1 \cdots N_n$ we define $\overleftarrow{x}$ and $\alpha_k$ for $k \le n-1$ as

$$\overleftarrow{N_1 \cdots N_n} = N_n \cdots N_1, \qquad \alpha_k(x) = N_{k+1} \cdots N_n N_1 \cdots N_k,$$

which are the $n$-nucleotides obtained from $x$ by reversing or by a shift of $k$ positions, respectively. Explicitly, for $n = 4$ we have

$$\overleftarrow{N_1 N_2 N_3 N_4} = N_4 N_3 N_2 N_1$$

and

$$\alpha_1(N_1 N_2 N_3 N_4) = N_2 N_3 N_4 N_1, \quad \alpha_2(N_1 N_2 N_3 N_4) = N_3 N_4 N_1 N_2, \quad \alpha_3(N_1 N_2 N_3 N_4) = N_4 N_1 N_2 N_3.$$

It is now obvious that the anti-$n$-nucleotide of some $n$-nucleotide $x$ can be described as $\overleftarrow{SW(x)}$ with the complementary map SW from $V$. For trinucleotides (codons) it is well-known that the anti-codon is always different from the codon. However, if $n$ is even it might happen that $\overleftarrow{SW(x)} = x$ for some $n$-nucleotide $x$. These nucleotide sequences are called self-complementary. For example, if $n = 4$, then the tetranucleotide ACGU is self-complementary since $\overleftarrow{SW(ACGU)} = \overleftarrow{UGCA} = ACGU$.
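A short sketch of the shift, reversal and anti-nucleotide operations follows. It assumes that the anti-$n$-nucleotide is the reversed SW-image, which is the reading that makes the worked example ACGU come out as stated; the function names are illustrative only.

```python
from itertools import product

# Translation table for the complementary map SW: A<->U, C<->G.
SW = str.maketrans("AUCG", "UAGC")

def shift(x, k):
    """Cyclic shift alpha_k: move the first k letters to the end."""
    k %= len(x)
    return x[k:] + x[:k]

def reverse(x):
    """Reversing permutation: read the word backwards."""
    return x[::-1]

def anti(x):
    """Anti-n-nucleotide: complement every base, then reverse."""
    return reverse(x.translate(SW))

def is_self_complementary(x):
    return anti(x) == x

tetranucleotides = ["".join(w) for w in product("AUCG", repeat=4)]
self_complementary = [w for w in tetranucleotides if is_self_complementary(w)]
codons = ["".join(w) for w in product("AUCG", repeat=3)]

print([shift("ACGU", k) for k in (1, 2, 3)])
print("anti(ACGU) =", anti("ACGU"))
print("self-complementary tetranucleotides:", len(self_complementary))
print("self-complementary codons:", sum(is_self_complementary(c) for c in codons))
```

For odd length the middle base would have to equal its own complement, which is why no codon passes the check.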

Tesserae: Definition and Structure

Tesserae were motivated biologically in an evolutionary context in Gonzalez et al. (2012). Each tessera is a tetranucleotide that has a particular form that comes from the symmetries induced by the group V. Let us give a definition of a tessera in mathematical terms (see also Gonzalez et al. 2012 and Fimmel and Strüngmann 2019):

Definition 2.1

A tessera is a tetranucleotide (four-letter word) $t \in \mathcal{B}^4$ of the form

$$t = N_1 N_2 \alpha(N_1) \alpha(N_2)$$

where $N_1, N_2 \in \mathcal{B}$ and $\alpha \in V$. The set of all valid tesserae is denoted by TESS.

The set TESS is also called the tessera code since it is a subset of $\mathcal{B}^4$ and hence a code in the sense that every concatenation of words from TESS has a unique decomposition over TESS. Clearly, the size of TESS is 64 (four choices each for $N_1$, $N_2$ and $\alpha$) and so we have $|TESS| = |\mathcal{B}^3|$. Table 2 shows the set of all tesserae together with their generating transformation.

Table 2.

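All tesserae together with their generating transformation. Each row starts with the dinucleotide $N_1 N_2$; the columns id, c, p and r contain the tessera generated by I, SW, YR and KM, respectively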

Dinucleotide id c p r
AA AAAA AAUU AAGG AACC
CC CCCC CCGG CCUU CCAA
GG GGGG GGCC GGAA GGUU
UU UUUU UUAA UUCC UUGG
AC ACAC ACUG ACGU ACCA
AG AGAG AGUC AGGA AGCU
AU AUAU AUUA AUGC AUCG
CA CACA CAGU CAUG CAAC
CG CGCG CGGC CGUA CGAU
CU CUCU CUGA CUUC CUAG
GA GAGA GACU GAAG GAUC
GC GCGC GCCG GCAU GCUA
GU GUGU GUCA GUAC GUUG
UA UAUA UAAU UACG UAGC
UC UCUC UCAG UCCA UCGA
UG UGUG UGAC UGCA UGGU

It is easy to see that a codon $N_1 N_2 N_3 \in \mathcal{B}^3$ can be uniquely extended to a valid tessera $tess(N_1 N_2 N_3) = N_1 N_2 N_3 N_4$ by determining the unique permutation $\alpha \in V$ such that $\alpha(N_1) = N_3$ and letting $N_4 = \alpha(N_2)$. This shows that the tessera code TESS is 1-error-correcting, and it was shown in Fimmel and Strüngmann (2019) that TESS can be obtained as a linear code from $\mathcal{B}^3$ and by the so-called Plotkin construction from $\mathcal{B}^2$; for more details see Fimmel and Strüngmann (2019).

In Gonzalez et al. (2012) the idea of symmetric primeval adaptor molecules that could recognize the normal reading frame in the coding strand in the 3′–5′ direction, in the complementary strand in the 3′–5′ direction, in the coding strand in the reverse 5′–3′ direction and in the complementary strand in the reverse 5′–3′ direction was utilized to propose an ancient model of tRNA adaptors that explains the reading mechanism and degeneracy distribution of the tesserae. In particular, since there exist self-complementary tesserae, e.g. ACGU, the tessera code allows degeneracy 2 and 4 only. Maintaining the degeneracy, an algorithm was suggested in Gonzalez et al. (2019) for passing from the tessera code back to the (mitochondrial) genetic code in the following way: we assign to each of the transformations from $V$ a letter in the genetic alphabet via $I \mapsto A$, $SW \mapsto U$, $KM \mapsto C$ and $YR \mapsto G$ and then perform the algorithm displayed in Fig. 2.
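The definition of TESS and the extension map $tess(\cdot)$ translate directly into code. The sketch below takes $N_4 = \alpha(N_2)$, as Definition 2.1 requires for a word of the form $N_1 N_2 \alpha(N_1) \alpha(N_2)$; the Python names are illustrative.

```python
from itertools import product

BASES = "AUCG"
V = {
    "I":  dict(zip(BASES, "AUCG")),
    "SW": dict(zip(BASES, "UAGC")),
    "YR": dict(zip(BASES, "GCUA")),
    "KM": dict(zip(BASES, "CGAU")),
}

def tessera(n1, n2, alpha):
    """Build the tessera N1 N2 alpha(N1) alpha(N2)."""
    return n1 + n2 + alpha[n1] + alpha[n2]

# All tesserae: every dinucleotide combined with every element of V.
TESS = {tessera(n1, n2, a) for n1, n2 in product(BASES, repeat=2) for a in V.values()}

def tess(codon):
    """Extend a codon N1 N2 N3 to the unique tessera that starts with it."""
    n1, n2, n3 = codon
    # V acts regularly on the bases, so exactly one alpha sends N1 to N3.
    alpha = next(p for p in V.values() if p[n1] == n3)
    return codon + alpha[n2]

codons = ["".join(c) for c in product(BASES, repeat=3)]

print(len(TESS), "tesserae")
print("tess(ACG) =", tess("ACG"))
print("tess is a bijection onto TESS:", {tess(c) for c in codons} == TESS)
```

The last line confirms that every tessera arises from exactly one codon, which is the content of $|TESS| = |\mathcal{B}^3|$.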

Fig. 2. Schematic representation of the mapping between the tessera $b_1 b_2 b_3 b_4$ onto the codon $x_1 x_2 x_3$. (Color figure online)

For instance, the tessera ACGU will be mapped to the codon CUU since $KM(A) = C$ and $SW(C) = G$. In the sequel we will denote by $cod(N_1 N_2 N_3 N_4)$ the corresponding codon under this algorithm. However, note that the two mappings $tess(\cdot)$ and $cod(\cdot)$ are not inverses of each other.

We now aim for a better description of tesserae. Let us assume that $N_1 N_2 N_3 N_4$ is a tessera. By definition there is an element $\alpha \in V$ such that

$$N_3 N_4 = \alpha(N_1 N_2).$$

This implies that $N_1 N_2$ and $N_3 N_4$ have to be in the same equivalence class $\Sigma_i$ displayed in Table 1. Thus, the tessera code can be split into four disjoint subsets

$$TESS = TESS_I \cup TESS_{SW} \cup TESS_{YR} \cup TESS_{KM}$$

where

$$TESS_i = \{d_1 d_2 \in TESS \mid d_1, d_2 \in \Sigma_i\} \quad \text{for } i \in \{I, SW, YR, KM\}.$$

Clearly, any subset $X \subseteq TESS$ has a similar induced decomposition where the components could be empty.
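The decomposition can be illustrated by sorting the tesserae according to the class of their first dinucleotide. In this sketch a class is labelled by the element of $V$ that maps the first base of a dinucleotide to the second, using the definitions of $I$, $SW$, $YR$ and $KM$ given in the text; `linking` and `parts` are hypothetical names.

```python
from itertools import product

BASES = "AUCG"
V = {
    "I":  dict(zip(BASES, "AUCG")),
    "SW": dict(zip(BASES, "UAGC")),
    "YR": dict(zip(BASES, "GCUA")),
    "KM": dict(zip(BASES, "CGAU")),
}

TESS = {n1 + n2 + a[n1] + a[n2]
        for n1, n2 in product(BASES, repeat=2) for a in V.values()}

def linking(dinucleotide):
    """Name of the element of V that maps the first base to the second."""
    first, second = dinucleotide
    return next(name for name, p in V.items() if p[first] == second)

# Both halves of a tessera lie in the same class, so the first half decides.
same_class = all(linking(t[:2]) == linking(t[2:]) for t in TESS)

parts = {name: set() for name in V}
for t in TESS:
    parts[linking(t[:2])].add(t)

print("halves always in the same class:", same_class)
for name, members in parts.items():
    print(name, len(members), sorted(members)[:4], "...")
```

Each of the four parts contains 16 tesserae, since a class has four dinucleotides and any ordered pair of them (including equal ones) forms a tessera.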

Definition 2.2

Let $X \subseteq \mathcal{B}^4$ be a tessera code. Then

$$X = X_I \cup X_{SW} \cup X_{YR} \cup X_{KM}$$

where

$$X_i = X \cap TESS_i \quad \text{for } i \in \{I, SW, YR, KM\}.$$

The above decomposition will be used in Sect. 4 for constructing all maximal circular tessera codes.

Graph Theoretical Approach

In this section we recall a graph theory approach from Fimmel et al. (2016) that turned out to be very useful for determining properties of circular codes (see Sect. 3 for the definition of circularity) and extend it to our setting of tesserae. To each subset $X \subseteq \mathcal{B}^n$ a directed graph $G(X)$ will be associated as the union of disjoint components $C_j(X)$ where $1 \le j \le \lfloor n/2 \rfloor$. The vertices of such a component $C_j(X)$ will be initial segments and end segments of $n$-tuples from $X$ of length $j$ and $n-j$.

Definition 2.3

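Let $n \in \mathbb{N}$ and $X \subseteq \mathcal{B}^n$. For $1 \le j \le \lfloor n/2 \rfloor$ we define a graph component $C_j(X) = (V_j(X), E_j(X))$ with set of vertices $V_j(X)$ and set of arcs $E_j(X)$ as follows:

  • $V_j(X) := \{N_1 \cdots N_j,\ N_{j+1} \cdots N_n,\ N_1 \cdots N_{n-j},\ N_{n-j+1} \cdots N_n : N_1 N_2 N_3 \cdots N_n \in X\}$

  • $E_j(X) := \{[N_1 \cdots N_j \to N_{j+1} \cdots N_n],\ [N_1 \cdots N_{n-j} \to N_{n-j+1} \cdots N_n] : N_1 N_2 N_3 \cdots N_n \in X\}$

The graph $G(X)$ associated to $X$ is the union $G(X) = \bigcup_{1 \le j \le \lfloor n/2 \rfloor} C_j(X)$ of the graphs $C_j(X)$ for all $1 \le j \le \lfloor n/2 \rfloor$. The graph $G(X)$ is called the representing graph of $X$.

It is easy to see that the graph components $C_j(X)$ of a representing graph $G(X)$ are pairwise disjoint since their labels have different lengths. However, the components need not be connected. For the convenience of the reader and for a better illustration we give some examples for $n = 2$, 3 and 4 (Figs. 3, 4 and 5).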

Fig. 3. Graphical representation $G(X)$ of the dinucleotide code X = {UC, CG, GU, AC, AA}, which has only one component $C_1(X)$. (Color figure online)

Fig. 4. Graphical representation $G(X)$ of the trinucleotide code X = {UCA, UAC, CAU, ACA, ACG}, which has only one component $C_1(X)$ that is not connected. (Color figure online)

Fig. 5. Graphical representation $G(X)$ of the tetranucleotide code X = {AAUC, ACUA, ACUU, CUCU, CUUU}, which has two components $C_1(X)$ and $C_2(X)$; neither is connected, and each consists of two connected components. (Color figure online)

Since the tesserae are tetranucleotides it follows that any set of tesserae has two (maybe empty) graph components in their representing graph, one with labels of length 1 and 3 and the other with labels of length 2.

In Fimmel et al. (2016) the graph approach was used to characterize circularity of codes in terms of graph theory. We will consider circular tessera codes in the next section but it seems reasonable to state the corresponding theorem in this section. For the technical definition of circularity see Definition 3.1.

Theorem 2.4

Let $X \subseteq \mathcal{B}^n$. Then the following are equivalent:

  1. X is a circular code;

  2. the representing graph G(X) is acyclic, i.e. does not contain any cycle.
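Theorem 2.4 turns circularity into a graph test that is easy to run. The following sketch builds the arcs of the representing graph from Definition 2.3 and checks acyclicity with Kahn's topological-sort algorithm; the choice of Kahn's algorithm and all function names are assumptions of this example, not part of the cited construction.

```python
def representing_graph(code):
    """Arcs of G(X): every word is cut after j and after n-j letters."""
    edges = set()
    for w in code:
        n = len(w)
        for j in range(1, n // 2 + 1):
            edges.add((w[:j], w[j:]))          # N1..Nj -> Nj+1..Nn
            edges.add((w[:n - j], w[n - j:]))  # N1..Nn-j -> Nn-j+1..Nn
    return edges

def is_acyclic(edges):
    """Kahn's algorithm: a digraph is acyclic iff all vertices can be removed."""
    successors, indegree = {}, {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        successors.setdefault(v, set())
        indegree.setdefault(u, 0)
        indegree[v] = indegree.get(v, 0) + 1
    ready = [v for v, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(indegree)

def is_circular(code):
    """Circularity test via Theorem 2.4."""
    return is_acyclic(representing_graph(code))

print(sorted(representing_graph({"AAUC"})))
print(is_circular({"AAUU"}))                          # single tessera, no cycle
print(is_circular({"ACAC"}))                          # loop AC -> AC
print(is_circular({"AU", "AC", "AG", "UC", "UG", "CG"}))
```

Because vertex labels of different lengths never coincide, storing the labels as plain strings keeps the components $C_j(X)$ disjoint automatically.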

In the particular case of tesserae we will use a second graph associated to a set that we shall utilize later on in order to construct maximal circular tessera codes.

Definition 2.5

Let $X \subseteq TESS$. The di-cut-graphs $T_{1,3}(X)$ and $T_{2,4}(X)$ associated to $X$ are defined as the representing graphs $G(X_{1,3})$ and $G(X_{2,4})$ of the sets

$$X_{1,3} = \{N_1 N_3 \mid N_1 N_2 N_3 N_4 \in X\}$$

and

$$X_{2,4} = \{N_2 N_4 \mid N_1 N_2 N_3 N_4 \in X\}.$$

To conclude this section we give an example of a di-cut-graph $T_{1,3}(X)$ of some tessera code $X$ (Fig. 6).

Fig. 6. Graphical representation of the di-cut-graph $T_{1,3}(X)$ of the tessera code X = {UCUC, AUGC, CUAG, GCCG}. (Color figure online)

Circular Tessera Codes

In this section we consider circular tessera codes. Simply speaking circularity means that a frame-shift in any concatenation of tesserae from that code will be detected. In the biological setting of the genetic code, a circular set of trinucleotides was first observed in Arqués and Michel (1996) and is supposed to play an important role in error-detection mechanisms during the translational process. We start with the definition of circularity for tesserae.

Definition 3.1

Let $n \in \mathbb{N}$. A tessera code $X \subseteq \mathcal{B}^4$ is called $n$-circular if for any $m \le n$ and any tesserae $t_1, \ldots, t_m \in X$ the concatenation $t_1 \cdots t_m$ has a unique decomposition into tesserae from the code $X$ when considered on the circle. We will call a tessera code $X \subseteq \mathcal{B}^4$ circular if it is $n$-circular for all $n \in \mathbb{N}$.

As noted in Theorem 2.4, a tessera code $X$ is circular if and only if its representing graph $G(X)$ is acyclic. Moreover, it is easy to see that the code $X$ is $n$-circular if and only if for any concatenation $t_1 \cdots t_m$ of tesserae $t_1, \ldots, t_m$ from $X$ with $m \le n$ the shifted sequences $\alpha_i(t_1 \cdots t_m)$ for $1 \le i \le 3$ do not yield a valid sequence in $X^m$, i.e.

$$\alpha_i(t_1 \cdots t_m) \notin X^m.$$

In particular, a tessera code $X$ is 1-circular if it does not contain the cyclically shifted tesserae of its members, i.e.

$$\alpha_i(t) \notin X$$

for all $1 \le i \le 3$ and $t \in X$. Therefore, a circular code cannot contain any tessera that equals one of its shifts, e.g. $ACAC = \alpha_2(ACAC)$, and it makes sense to consider the equivalence classes that are formed by tesserae and their circular shifts. If all shifts are different, then this class is called complete. There are 12 such complete equivalence classes, each containing 4 elements. Four other classes each contain one element, namely $\{AAAA\}$, $\{CCCC\}$, $\{GGGG\}$, $\{UUUU\}$, and six classes each contain two elements, like $\{ACAC, CACA\}$. Table 3 displays all the complete equivalence classes of tesserae.

Table 5.

Numbers of self-complementary circular codes of different code lengths

Code length 1 2 3 4 5 6 7 8 9 10 11 12
Number 12 72 304 996 2580 5408 9264 12708 13696 11232 6144 1584

Table 3.

List of complete equivalence classes

Tessera Shift 1 Shift 2 Shift 3 Class number
AAUU* AUUA UUAA* UAAU D1
AAGG AGGA GGAA GAAG D2
AACC ACCA CCAA CAAC D3
CCGG* CGGC GGCC* GCCG D4
CCUU CUUC UUCC UCCU D5
UUGG UGGU GGUU GUUG D6
AGCU* GCUA CUAG* UAGC D7
UGCA* GCAU CAUG* AUGC D8
GUAC* UACG ACGU* CGUA D9
AGUC GUCA UCAG CAGU D10
UCGA* CGAU GAUC* AUCG D11
ACUG CUGA UGAC GACU D12

Self-complementary tesserae are marked with an asterisk
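The class counts quoted in the text (12 complete classes, 6 classes of size two and 4 of size one) can be recomputed from the definition of TESS. The sketch below also counts the self-complementary tesserae inside complete classes, reading self-complementarity as invariance under the reversed SW-image; all names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "AUCG"
V = {
    "I":  dict(zip(BASES, "AUCG")),
    "SW": dict(zip(BASES, "UAGC")),
    "YR": dict(zip(BASES, "GCUA")),
    "KM": dict(zip(BASES, "CGAU")),
}
TESS = {n1 + n2 + a[n1] + a[n2]
        for n1, n2 in product(BASES, repeat=2) for a in V.values()}

def shift_class(t):
    """All cyclic shifts of a tessera, as a set."""
    return frozenset(t[k:] + t[:k] for k in range(4))

def anti(t):
    """Reversed complement of a word."""
    return t.translate(str.maketrans("AUCG", "UAGC"))[::-1]

classes = {shift_class(t) for t in TESS}
sizes = Counter(len(c) for c in classes)
complete = [c for c in classes if len(c) == 4]
closed_under_shifts = all(c <= TESS for c in classes)
self_complementary = sorted(t for c in complete for t in c if anti(t) == t)

print("class sizes:", dict(sizes))
print("TESS closed under shifts:", closed_under_shifts)
print("self-complementary tesserae in complete classes:", self_complementary)
```

The sizes add up as $12 \cdot 4 + 6 \cdot 2 + 4 \cdot 1 = 64$, so the shift classes partition TESS.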

Since any circular code is also 1-circular and there are only 12 complete equivalence classes, it is obvious that a circular tessera code can contain at most 12 elements.

Definition 3.2

A circular tessera code is called maximal if it contains exactly 12 elements.

We will show in Sect. 4 how to construct all maximal circular tessera codes and now give an example of a 1-circular tessera code that is not 2-circular.

Example 3.3

Let $X = \{ACGU, CAUG, GUCA, UGAC\}$. Then $X$ is a 1-circular tessera code but the word ACGUCAUG has two decompositions on a circle

$$ACGU|CAUG \quad \text{and} \quad GUCA|UGAC = \alpha_2(ACGUCAUG).$$

Thus $X$ is not 2-circular. In particular, the graph component $C_2(X)$ of the representing graph $G(X)$ of $X$ contains a cycle.

Moreover, the example below shows that also the classes of 2- and 3-circular tessera codes are different:

Example 3.4

Let $X = \{CAGU, UGCA, GUUG\}$. Then $X$ is a 2-circular tessera code (by means of easy computations) but not a 3-circular one, since the word CAGUUGCAGUUG has two decompositions on a circle

$$CAGU|UGCA|GUUG \quad \text{and} \quad GUUG|CAGU|UGCA.$$

We show next that the graph component $C_2(X)$ being not acyclic is not an accident but in fact the only possibility for 1-circular codes not to be circular. In order to do so recall that a cycle in a graph $G$ is a sequence $e_1 \to \cdots \to e_n \to e_1$ of distinct vertices $e_i$ ($i \le n$) in $G$. The length of this cycle is then defined to be $n$. Note that for $n = 1$ a cycle of length 1 is a loop.
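Both examples can be confirmed by brute force. The sketch below implements the shift criterion for $n$-circularity stated after Definition 3.1 (no shift by 1, 2 or 3 letters of a concatenation of at most $n$ tesserae may decompose over the code again); `in_power` and `is_n_circular` are hypothetical helper names.

```python
from itertools import product

def in_power(word, code):
    """True if word splits into consecutive 4-letter blocks that all lie in code."""
    return all(word[i:i + 4] in code for i in range(0, len(word), 4))

def is_n_circular(code, n):
    """Shift criterion: no proper frame shift of t1...tm (m <= n) lies in X^m."""
    for m in range(1, n + 1):
        for tesserae in product(sorted(code), repeat=m):
            word = "".join(tesserae)
            for i in (1, 2, 3):
                if in_power(word[i:] + word[:i], code):
                    return False
    return True

example_3_3 = {"ACGU", "CAUG", "GUCA", "UGAC"}
example_3_4 = {"CAGU", "UGCA", "GUUG"}

print("Example 3.3:", is_n_circular(example_3_3, 1), is_n_circular(example_3_3, 2))
print("Example 3.4:", is_n_circular(example_3_4, 2), is_n_circular(example_3_4, 3))
```

The exhaustive loop grows like $|X|^n$, which is harmless for codes of this size but is the reason the graph criterion of Theorem 2.4 is preferable in general.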

Proposition 3.5

Let $X$ be a tessera code. Then the following hold:

  • (i) The maximal length of a cycle in $C_1(X)$ is 2; in particular, the maximal length of a path that does not contain a cycle is 1;

  • (ii) The maximal length of a cycle in $C_2(X)$ is 4; in particular, the maximal length of a path that does not contain a cycle is 3.

Proof

Let $X$ be a tessera code. We first prove (i) by showing that any path in $C_1(X)$ of length 2 must contain a cycle. Hence assume that $C_1(X)$ contains a path of length 2. Without loss of generality we may assume that it starts with a nucleotide, e.g.

$$N_1 \to N_2 N_3 N_4 \to N_5.$$

Then $N_1 N_2 N_3 N_4$ and $N_2 N_3 N_4 N_5$ are valid tesserae from $X$. By definition of tesserae the former tells us that there is a transformation $\alpha \in V$ such that $\alpha(N_2) = N_4$ and $\alpha(N_3) = N_1$. The latter, however, then implies that also $\alpha(N_3) = N_5$ and so $N_1 = N_5$, which shows that $\alpha_1(N_1 N_2 N_3 N_4) = N_2 N_3 N_4 N_1 \in X$ and $N_1 \to N_2 N_3 N_4 \to N_1$ is a cycle.

We now prove (ii) by showing that any path of length 4 in $C_2(X)$ contains a cycle. Assume that $C_2(X)$ contains a path of length 4, e.g.

$$N_1 N_2 \to N_3 N_4 \to N_5 N_6 \to N_7 N_8 \to N_9 N_{10}.$$

By definition of $G(X)$ there are permutations $\pi_1, \pi_2, \pi_3, \pi_4 \in V$ such that

$$\pi_1(N_1 N_2) = N_3 N_4, \quad \pi_2(N_3 N_4) = N_5 N_6, \quad \pi_3(N_5 N_6) = N_7 N_8, \quad \pi_4(N_7 N_8) = N_9 N_{10}.$$

If one of the $\pi_i$ is the identity we obtain a cycle of length 1 (a loop). Thus we may assume that all $\pi_i$ are different from the identity. If $\pi_1 = \pi_2$, then $N_1 N_2 = N_5 N_6$ since $\pi_1^2 = I$. This gives a cycle of length 2. Thus we may assume $\pi_1 \ne \pi_2$ and similarly $\pi_2 \ne \pi_3$, $\pi_3 \ne \pi_4$. If $\pi_1 \ne \pi_3$, then the group structure of $V$ implies that $\pi_1 \circ \pi_2 = \pi_3$ and so $N_7 N_8 = N_1 N_2$, hence we obtain a cycle of length 3. Finally, if $\pi_1 = \pi_3$, then similar arguments as above show that we get a cycle of length 3 or $\pi_2 = \pi_4$ holds. Now

$$\pi_4(\pi_3(\pi_2(\pi_1(N_1 N_2)))) = \pi_2(\pi_1(\pi_2(\pi_1(N_1 N_2)))) = N_9 N_{10}$$

but $V$ is commutative and all elements in $V$ are of order 2, hence

$$N_9 N_{10} = \pi_2(\pi_1(\pi_2(\pi_1(N_1 N_2)))) = N_1 N_2.$$

Consequently, the path itself is a cycle of length 4.

As a corollary we obtain an important theorem. Note that part (ii) was also obtained in a bachelor-thesis (Cisowski 2015) with a much more technical proof.

Theorem 3.6

Let $X$ be a tessera code. Then the following hold:

  • (i) If $X$ is 1-circular, then $C_1(X)$ is acyclic;

  • (ii) The following two conditions are equivalent:
    1. $X$ is circular;
    2. $X$ is 3-circular.

Proof

We first prove (i). By Proposition 3.5 we know that the maximal length of a cycle in $C_1(X)$ is 2, hence a cycle would be of the form $N_1 \to N_2 N_3 N_4 \to N_1$, which contradicts 1-circularity since then $\alpha_1(N_1 N_2 N_3 N_4) = N_2 N_3 N_4 N_1 \in X$.

In order to prove (ii) note that by Proposition 3.5 the maximal length of a cycle in $G(X)$ is 4. However, a cycle of length 2 is excluded by 1-circularity and a cycle of length 4 by 2-circularity since

$$N_1 N_2 \to N_3 N_4 \to N_5 N_6 \to N_7 N_8 \to N_1 N_2$$

implies that $N_1 N_2 N_3 N_4 | N_5 N_6 N_7 N_8$ has two decompositions, a contradiction. Hence $G(X)$ does not contain any cycle of even length and the maximal length of an odd cycle is 3. By Theorem 2.3 from [13] we conclude that $X$ is circular if and only if it is 3-circular.

We conclude this section with a result that gives a handy criterion for constructing circular tessera codes and some application.

Theorem 3.7

Let $X \subseteq TESS$ be a tessera code. Then $X$ is circular if

  • $X$ is 1-circular

  • One of the di-cut graphs $T_{1,3}(X)$ and $T_{2,4}(X)$ is acyclic.

Proof

Assume that $X$ is 1-circular and one of the di-cut graphs $T_{1,3}(X)$ and $T_{2,4}(X)$ is acyclic. Without loss of generality we assume that $T_{1,3}(X)$ is acyclic. Assume that $X$ is not circular. Then Proposition 3.5 and Theorem 3.6 imply that the component $C_1(X)$ is acyclic and the maximal length of a cycle in $C_2(X)$ is 4. Assume without loss of generality that

$$N_1 N_2 \to N_3 N_4 \to N_5 N_6 \to N_7 N_8 \to N_1 N_2$$

is a cycle in $G(X)$. Thus the tesserae $N_1 N_2 N_3 N_4$, $N_3 N_4 N_5 N_6$, $N_5 N_6 N_7 N_8$ and $N_7 N_8 N_1 N_2$ are in $X$. By definition of $T_{1,3}(X)$ it follows that $N_1 N_3$, $N_3 N_5$, $N_5 N_7$ and $N_7 N_1$ are dinucleotides in the set $X_{1,3}$ and hence $N_1$, $N_3$, $N_5$ and $N_7$ are vertices of $T_{1,3}(X)$. Moreover,

$$N_1 \to N_3 \to N_5 \to N_7 \to N_1$$

is a cycle in $T_{1,3}(X)$, a contradiction to the fact that $T_{1,3}(X)$ is acyclic.

The converse of Theorem 3.7 does not hold as the following example shows. Note, however, that the code $X_{1,3}$ (respectively $X_{2,4}$) can never contain dinucleotides of the form $NN$ since they would imply that there is a tessera of the form $NKNK$ in $X$, which contradicts 1-circularity.

Example 3.8

Let

$$X = \{AGUC, GAAG, CAAC, GGCC, AGCU, UGCA, GUAC, UUAA, CGAU, GACU, CUUC, GUUG\},$$

then $X$ is a maximal circular tessera code but neither $T_{1,3}(X)$ nor $T_{2,4}(X)$ is acyclic.
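Example 3.8 can be verified directly. The sketch below compares the full test of Theorem 2.4 with the sufficient criterion of Theorem 3.7; it uses Kahn's algorithm for the acyclicity check, and the function names are assumptions of this example.

```python
def representing_graph(code):
    """Arcs of G(X) for words of equal length n (cuts after j and n-j letters)."""
    edges = set()
    for w in code:
        n = len(w)
        for j in range(1, n // 2 + 1):
            edges.add((w[:j], w[j:]))
            edges.add((w[:n - j], w[n - j:]))
    return edges

def is_acyclic(edges):
    """Kahn's algorithm: acyclic iff every vertex can be removed in turn."""
    successors, indegree = {}, {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        successors.setdefault(v, set())
        indegree.setdefault(u, 0)
        indegree[v] = indegree.get(v, 0) + 1
    ready = [v for v, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(indegree)

def is_circular(code):
    return is_acyclic(representing_graph(code))

def di_cut(code, i, j):
    """Dinucleotide set X_{i,j}: letters i and j (1-based) of every tessera."""
    return {t[i - 1] + t[j - 1] for t in code}

def is_one_circular(code):
    return all(t[k:] + t[:k] not in code for t in code for k in (1, 2, 3))

def sufficient_criterion(code):
    """Hypotheses of Theorem 3.7: 1-circular and an acyclic di-cut graph."""
    return is_one_circular(code) and (
        is_acyclic(representing_graph(di_cut(code, 1, 3)))
        or is_acyclic(representing_graph(di_cut(code, 2, 4))))

example_3_8 = {"AGUC", "GAAG", "CAAC", "GGCC", "AGCU", "UGCA",
               "GUAC", "UUAA", "CGAU", "GACU", "CUUC", "GUUG"}

print("circular:", is_circular(example_3_8))
print("Theorem 3.7 applies:", sufficient_criterion(example_3_8))
```

The code is circular although the criterion fails, so the criterion is sufficient but not necessary.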

We now state an application of the above results in order to construct maximal circular tessera codes from circular dinucleotide codes. In fact, the constructed codes will even have stronger properties:

Definition 3.9

A circular tessera code $X \subseteq TESS$ is called a $C^4$-code if also the three shifted codes $\alpha_1(X)$, $\alpha_2(X)$ and $\alpha_3(X)$ are circular.

Recall from Fimmel et al. (2015) that a maximal circular dinucleotide code $D \subseteq \mathcal{B}^2$ must be of the form $D = \{N_1 N_2, N_1 N_3, N_1 N_4, N_2 N_3, N_2 N_4, N_3 N_4\}$ where $N_1 > N_2 > N_3 > N_4$ is any linear ordering of the genetic alphabet $\mathcal{B}$.

Proposition 3.10

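Let $D = \{N_1 N_2, N_1 N_3, N_1 N_4, N_2 N_3, N_2 N_4, N_3 N_4\}$ be a maximal circular dinucleotide code. Then

$$X = \{N_1 N_1 N_2 N_2, N_1 N_1 N_3 N_3, N_1 N_1 N_4 N_4\} \cup \{N_1 N_3 N_2 N_4, N_1 N_4 N_2 N_3\} \cup \{N_1 N_2 N_3 N_4, N_1 N_4 N_3 N_2\} \cup \{N_1 N_3 N_4 N_2, N_1 N_2 N_4 N_3\} \cup \{N_2 N_2 N_3 N_3, N_2 N_2 N_4 N_4, N_3 N_3 N_4 N_4\}$$

is a maximal tessera $C^4$-code such that $T_{1,3}(X) = G(D)$.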

Proof

We first prove circularity of the code $X$. Clearly, $T_{1,3}(X) = G(D)$. Since $D$ is circular its graph $G(D)$ is acyclic by Theorem 2.4 and thus, by Theorem 3.7, we only need to verify that $X$ is 1-circular. But this is clear since the code contains exactly one tessera from each of the twelve complete equivalence classes from Table 3.

Now let $X^{(n)} = \alpha_n(X)$ be the $n$th shift of $X$ for $n \le 3$, and for a set $Y$ of words let $\overleftarrow{Y}$ denote the set of its reversed words. Then we have

$$X^{(1)}_{1,3} = \{N_1 N_2, N_1 N_3, N_1 N_4, N_3 N_4, N_4 N_3, N_2 N_3, N_4 N_2, N_3 N_2, N_2 N_4\}$$

$$X^{(2)}_{1,3} = \{N_2 N_1, N_3 N_1, N_4 N_1, N_3 N_2, N_4 N_2, N_4 N_3\} = \overleftarrow{X_{1,3}} = \overleftarrow{D}$$

$$X^{(3)}_{1,3} = \{N_2 N_1, N_3 N_1, N_4 N_1, N_4 N_3, N_3 N_4, N_3 N_2, N_2 N_4, N_4 N_2, N_2 N_3\} = \overleftarrow{X^{(1)}_{1,3}}$$

Clearly, $X^{(2)}_{1,3}$ is a circular dinucleotide code since it is the reversal of $D$, hence its representing graph $G(X^{(2)}_{1,3}) = T_{1,3}(X^{(2)})$ is acyclic and as above $X^{(2)}$ is 1-circular. By Theorem 3.7 we conclude that $X^{(2)}$ is a circular code.

It remains to show that also $X^{(1)}$ and $X^{(3)}$ are circular. However, in this case

$$X^{(1)}_{2,4} = \overleftarrow{X_{1,3}} = \overleftarrow{D}$$

which is circular, and likewise $X^{(3)}_{2,4} = X_{1,3} = D$, so Theorem 3.7 implies that $X^{(1)}$ and $X^{(3)}$ are circular as well. Hence $X$ is a $C^4$-code.

We would like to remark that the construction in the above proposition has some flexibility, e.g. the tesserae of the form $N_i N_i N_j N_j$ can be substituted by tesserae from the same equivalence class. However, it is not obvious how to construct all maximal circular tessera codes using this method. Nevertheless, in the next section we will give a way to obtain all such codes.
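The construction of Proposition 3.10 can be tested for all 24 linear orderings of the alphabet. In the sketch below each tessera is encoded by an index pattern (for instance `1324` stands for $N_1 N_3 N_2 N_4$); the sixth pattern is taken to be $N_1 N_2 N_3 N_4$, the only choice of last letter that yields a tessera. The acyclicity test uses Kahn's algorithm, and all names are assumptions of this example.

```python
from itertools import permutations

V_IMAGES = ["AUCG", "UAGC", "GCUA", "CGAU"]          # I, SW, YR, KM on (A,U,C,G)
V = [dict(zip("AUCG", image)) for image in V_IMAGES]

PATTERNS = ["1122", "1133", "1144", "1324", "1423", "1234",
            "1432", "1342", "1243", "2233", "2244", "3344"]

def proposition_code(order):
    """Instantiate the twelve patterns for the ordering N1 > N2 > N3 > N4."""
    letter = dict(zip("1234", order))
    return {"".join(letter[i] for i in p) for p in PATTERNS}

def is_tessera(t):
    return any(p[t[0]] == t[2] and p[t[1]] == t[3] for p in V)

def is_acyclic(edges):
    """Kahn's algorithm on a set of arcs."""
    successors, indegree = {}, {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        successors.setdefault(v, set())
        indegree.setdefault(u, 0)
        indegree[v] = indegree.get(v, 0) + 1
    ready = [v for v, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(indegree)

def is_circular(code):
    """Theorem 2.4 for tetranucleotides: arcs of C1(X) and C2(X) together."""
    edges = set()
    for t in code:
        edges |= {(t[:1], t[1:]), (t[:3], t[3:]), (t[:2], t[2:])}
    return is_acyclic(edges)

def is_c4(code):
    """The code and its three shifted codes must all be circular."""
    return all(is_circular({t[k:] + t[:k] for t in code}) for k in range(4))

results = {"".join(o): is_c4(proposition_code(o)) for o in permutations("AUCG")}
print(sum(results.values()), "of", len(results), "orderings give a C4-code")
```

For the ordering A > U > C > G, for example, the first and third letters of the twelve tesserae give back the dinucleotide code {AU, AC, AG, UC, UG, CG}.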

Construction of All Maximal Circular Tessera Codes

This section introduces one possibility to construct all maximal circular tessera codes. Recall that a circular tessera code is maximal if it contains exactly 12 elements. The construction will be accomplished in two major steps. Firstly, for each of the four equivalence classes from Table 1 we define a tournament on four vertices which are representing the single dinucleotides. Finally, we combine the four tournaments constructed in the previous step to construct maximal circular tessera codes. Recall that a tournament is a complete oriented graph (see e.g. Clark and Holton 1991). Figure 7 shows an example of a tournament.

Fig. 7. An acyclic tournament on four nodes. (Color figure online)

As already shown in Theorem 3.6, for a tessera code $X$ either the graph component $C_1(X)$ contains no path of length greater than 1, or $X$ is not circular. More precisely, if $C_1(X)$ contains a cycle, then $X$ is not even 1-circular. In view of this, the construction of a maximal circular tessera code can almost be reduced to the problem of constructing a valid and acyclic $C_2$ which represents a correct tessera code $X$.

  • Step 1:
    In this step we construct four acyclic tournaments which together represent a tessera code $X$ of length 24 so that $C_2(X)$ is acyclic. Note that a tournament on 4 vertices has exactly 6 edges and in order to be acyclic it has to be isomorphic to the tournament given in Fig. 7. Below we will show how to construct tournaments on four vertices that represent a correct (circular) tessera code, i.e. the tournaments will be acyclic. Together they form the desired code $X$ as
    $$X = X_I \cup X_{SW} \cup X_{YR} \cup X_{KM} \qquad (1)$$
    with
    $$|X_I| = |X_{SW}| = |X_{YR}| = |X_{KM}| = 6 \quad \text{and, thus,} \quad |X| = 24. \qquad (2)$$
    As can be seen from the construction, $C_2(X)$ is acyclic as it is the union of acyclic tournaments, while $C_1(X)$ is not. Yet, for this initial step we can ignore this fact. Since $C_2(X_I)$, $C_2(X_{SW})$, $C_2(X_{YR})$ and $C_2(X_{KM})$ are disjoint it is sufficient that these subgraphs are acyclic to ensure the acyclicity of $C_2(X)$. As mentioned above, each of these subgraphs has to be isomorphic to the graph in Fig. 7.

    Let us choose one of the equivalence classes $\Sigma_i$, $i \in \{I, SW, YR, KM\}$, and assign the numbers 1, 2, 3, 4 to the dinucleotides of $\Sigma_i$. Now we draw directed edges from each node to the nodes with a higher number. This way we will obtain four acyclic tournaments, each of which represents a circular tessera code of size 6. This gives $4!$ possible assignments per subgraph. Hence, there are altogether $(4!)^4 = 331776$ tessera codes of size 24 with an acyclic $C_2$-component.

  • Step 2:

    In this step, we use the 331776 tessera codes constructed in Step 1 to construct all possible maximal circular tessera codes. Since $C_2(X)$ is already acyclic, it is sufficient to focus on $C_1(X)$.
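Step 1 can be sketched as follows for one particular numbering, namely the alphabetical order of the dinucleotides inside each class (an arbitrary choice made only for this illustration). The acyclicity check uses Kahn's algorithm, and the helper names are hypothetical.

```python
import math
from itertools import product

BASES = "AUCG"
V = [dict(zip(BASES, image)) for image in ("AUCG", "UAGC", "GCUA", "CGAU")]

def apply(perm, word):
    return "".join(perm[b] for b in word)

# The four equivalence classes of dinucleotides, each as a sorted tuple.
orbits = sorted({tuple(sorted(apply(p, "".join(d)) for p in V))
                 for d in product(BASES, repeat=2)})

def tournament_code(orderings):
    """One arc (tessera) from each dinucleotide to every later one in its class."""
    return {a + b for order in orderings
            for i, a in enumerate(order) for b in order[i + 1:]}

def is_acyclic(edges):
    """Kahn's algorithm on a set of arcs."""
    successors, indegree = {}, {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        successors.setdefault(v, set())
        indegree.setdefault(u, 0)
        indegree[v] = indegree.get(v, 0) + 1
    ready = [v for v, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(indegree)

def c2_edges(code):
    return {(t[:2], t[2:]) for t in code}

def c1_edges(code):
    return {(t[:1], t[1:]) for t in code} | {(t[:3], t[3:]) for t in code}

X = tournament_code(orbits)
number_of_assignments = math.factorial(4) ** 4

print(len(X), "tesserae:", sorted(X))
print("C2 acyclic:", is_acyclic(c2_edges(X)))
print("C1 acyclic:", is_acyclic(c1_edges(X)))
print("assignments:", number_of_assignments)
```

As the text states, the $C_2$ component of such a code is acyclic while $C_1$ still contains cycles, which is what Step 2 repairs.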

Lemma 4.1

Let X be a tessera code constructed as above and

$t = N_1N_2N_3N_4 = N_1N_2\,\gamma(N_1N_2) \in X$

for some $\gamma \in V$. Then the following hold:

  1. $\alpha_2(t) \notin X$

  2. $\alpha_3(t) = N_4N_1N_2N_3 \in X$ or $\alpha_1(t) = N_2N_3N_4N_1 \in X$.

Proof

First we prove (1). Obviously, $t$ is represented by the arrow $N_1N_2 \to N_3N_4$ in the corresponding tournament. Clearly, $\gamma \neq id$. Let us consider $\alpha_2(t) = N_3N_4N_1N_2$. It follows that $\alpha_2(t) \notin X$, since it would be represented in the same tournament by the oppositely directed arrow $N_3N_4 \to N_1N_2$, a contradiction. Now we claim that one of the remaining shifts of $t$

$\alpha_3(t) = N_4N_1N_2N_3$ or $\alpha_1(t) = N_2N_3N_4N_1$

is necessarily in the code $X$. Let us first observe that the dinucleotides $N_4N_1$ and $N_2N_3$ cannot be in the same equivalence class as $N_1N_2$ and $N_3N_4$, since in this case $N_4 = N_2$ would hold and, thus, $\gamma = id$. Consequently, one of the arrows $N_4N_1 \to N_2N_3$ or $N_2N_3 \to N_4N_1$ is drawn in the corresponding tournament and it follows that $\alpha_3(t) \in X$ or $\alpha_1(t) \in X$. This proves (2).

The above lemma shows that $X$ consists of 12 pairs of cyclically equivalent tesserae. To ensure that the code is circular, one tessera of each cyclically equivalent pair must be removed. This has to be done for all 12 pairs in such a code $X$. It follows that each of the 331776 codes can be used to construct $2^{12}$ circular codes, with possible repetitions. It remains to prove that all maximal circular tessera codes can be obtained this way. Let $X$ be such a maximal code. As shown above, the $C_2$ component of each $X_i$, $i \in \{I, KM, SW, YR\}$, is a simple directed acyclic graph with a maximum of four nodes. According to Theorem 3.1 (Fimmel et al. 2017), such a graph can be embedded in an acyclic tournament. In Step 1, all possible acyclic tournaments are constructed. Step 2 takes all possible subgraphs of each tournament and combines them. This ensures that all possible maximal circular tessera codes are represented in the construction.

Hence, the total of $2^{12} \times (4!)^4 = 1358954496$ constructed maximal circular tessera codes includes all maximal circular tessera codes.
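The pairing described by Lemma 4.1 and the resulting count can be checked on one concrete code of size 24. The following is a minimal sketch under stated assumptions: the dinucleotide classes are written out as they can be read off from this section (Table 1 is authoritative), the numbering 1, 2, 3, 4 simply follows the order in which the dinucleotides are listed, and `rot` is a hypothetical helper for the cyclic shifts $\alpha_k$.

```python
from itertools import combinations

# Dinucleotide equivalence classes (assumed from the examples in the text).
SIGMA = {
    "I":  ("AA", "CC", "GG", "UU"),
    "SW": ("AU", "UA", "CG", "GC"),
    "YR": ("AC", "CA", "GU", "UG"),
    "KM": ("AG", "GA", "CU", "UC"),
}

def rot(t, k):
    """Cyclic shift alpha_k: move the first k letters to the end."""
    return t[k:] + t[:k]

# One code of size 24 from Step 1: edges d_i -> d_j for i < j in every class.
X = {d[i] + d[j] for d in SIGMA.values() for i, j in combinations(range(4), 2)}

pairs = set()
for t in X:
    assert rot(t, 2) not in X                    # Lemma 4.1 (1)
    partners = [s for s in (rot(t, 1), rot(t, 3)) if s in X]
    assert len(partners) == 1                    # Lemma 4.1 (2)
    pairs.add(frozenset((t, partners[0])))

n_pairs = len(pairs)
n_constructed = 2 ** n_pairs * 24 ** 4           # 2^12 * (4!)^4

print(len(X), n_pairs, n_constructed)  # 24 12 1358954496
```

Removing one tessera from each of the 12 pairs gives $2^{12}$ subcodes per code of size 24, which is where the factor $2^{12}$ in the total comes from.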

The table below gives the exact numbers of circular and even C4-codes (compare 3.9) for all cardinalities from 1 to the maximum of 12. Moreover, it also shows the number of comma-free codes. Recall that comma-free codes form a subclass of circular codes.

Definition 4.2

A code $X \subseteq B^\ell$ is called comma-free if any concatenation $x_1x_2$ of words $x_1, x_2 \in X$ does not contain any $x \in X$ as a substring, except for $x_1$ (as initial segment) and $x_2$ (as end segment) themselves.

Clearly, a comma-free code is circular, and $X$ is comma-free if and only if its associated graph has no path of length more than 2 (see Fimmel et al. 2016) (Table 4).
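Definition 4.2 translates directly into a brute-force test for codes whose words all have the same length. The sketch below is illustrative only; the function name `is_comma_free` is hypothetical. It is applied to the first code listed in Table 6 and to the four tesserae of Example 4.6.

```python
def is_comma_free(code):
    """Check Definition 4.2 for a code whose words all have the same length."""
    words = set(code)
    n = len(next(iter(words)))
    for x1 in words:
        for x2 in words:
            concat = x1 + x2
            # Offsets 1 .. n-1 are the windows that overlap both x1 and x2.
            for k in range(1, n):
                if concat[k:k + n] in words:
                    return False
    return True

# First row of Table 6.
table6_row1 = ["UUAA", "CCAA", "AGGA", "UCCU", "UUGG", "CCGG",
               "UCGA", "CAUG", "ACGU", "AGCU", "ACUG", "CAGU"]
# Tesserae of Example 4.6: CCGG + AAUU contains GGAA at offset 2.
example_4_6 = ["AAUU", "CCGG", "GGAA", "UUCC"]

print(is_comma_free(table6_row1))  # True
print(is_comma_free(example_4_6))  # False
```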

Table 4.

Numbers of circular, comma-free and C4-tessera codes of different code lengths

Code length | # 1-circular codes | # Circular codes | # C4-codes | # Comma-free codes
1 | 48 | 48 | 48 | 48
2 | 1056 | 1056 | 1056 | 1056
3 | 14080 | 14048 | 14016 | 13952
4 | 126720 | 125544 | 124368 | 122376
5 | 811008 | 791952 | 773088 | 745584
6 | 3784704 | 3606048 | 3433584 | 3214272
7 | 12976128 | 11908800 | 10922112 | 9816960
8 | 32440320 | 28230456 | 24577404 | 20952504
9 | 57671680 | 46720800 | 37987120 | 30297824
10 | 69206016 | 51111024 | 38129856 | 28015728
11 | 50331648 | 33113472 | 22240992 | 14790144
12 | 16777216 | 9592512 | 5685408 | 3351232
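The first column of Table 4 can be cross-checked against a closed form. The sketch assumes (this formula is not stated in the text) that a 1-circular tessera code of size $k$ is obtained by choosing $k$ of the 12 classes of cyclically equivalent tesserae and one of the 4 tesserae in each chosen class, which gives $\binom{12}{k} 4^k$ codes.

```python
from math import comb

# Column "# 1-circular codes" of Table 4, code lengths 1 to 12.
ONE_CIRCULAR = [48, 1056, 14080, 126720, 811008, 3784704, 12976128,
                32440320, 57671680, 69206016, 50331648, 16777216]

# Assumed closed form: choose k of the 12 classes, then 1 of 4 tesserae in each.
predicted = [comb(12, k) * 4 ** k for k in range(1, 13)]

print(predicted == ONE_CIRCULAR)  # True
```

The other three columns depend on the graph structure of each code and are not covered by this simple count.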

Self-Complementary Circular Tessera Codes

In this section we will discuss some properties of self-complementary tessera codes. In particular, we will determine all maximal self-complementary comma-free tessera codes and give a graph-theoretical characterization of self-complementarity for tessera codes.

Let us first recall the definition of self-complementarity of a code.

Definition 4.3

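Let $X \subseteq B^\ell$ be an $\ell$-nucleotide code. We will call $X$ self-complementary if for each $\ell$-nucleotide $x \in X$ its anti-$\ell$-nucleotide $\overleftarrow{SW(x)}$ is also in $X$:

$x \in X \Rightarrow \overleftarrow{SW(x)} \in X$.

We will also use the notation

$X = \overleftarrow{SW(X)}$.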

According to the above, a circular tessera code can contain a maximum of 12 tesserae. Such a code can even be self-complementary, as the next example shows.

Example 4.4

The following code $X \subseteq TESS$ is a self-complementary maximal circular code:

X={AAUU,CCGG,AGCU,UGCA,GUAC,UCGA,AAGG,CCUU,AACC,GGUU,AGUC,GACU}.
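Self-complementarity in the sense of Definition 4.3 is easy to test mechanically. The following sketch is not taken from the paper; the helper names are hypothetical. It computes the anti-tessera (reversed complement with A and U, C and G exchanged) and checks the code of Example 4.4. The 1-circularity test assumes the usual reading for codes of fixed word length: no nontrivial cyclic shift of a code word lies in the code.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")  # A<->U, C<->G

def anti(word):
    """Reversed complement of a word (the anti-tessera for length 4)."""
    return word.translate(COMPLEMENT)[::-1]

def is_self_complementary(code):
    code = set(code)
    return all(anti(w) in code for w in code)

def is_one_circular(code):
    """Assumed test: no nontrivial cyclic shift of a code word is in the code."""
    code = set(code)
    return not any(w[k:] + w[:k] in code for w in code for k in range(1, len(w)))

X = {"AAUU", "CCGG", "AGCU", "UGCA", "GUAC", "UCGA",
     "AAGG", "CCUU", "AACC", "GGUU", "AGUC", "GACU"}

print(len(X), is_self_complementary(X), is_one_circular(X))  # 12 True True
```

This only confirms size, self-complementarity and 1-circularity; full circularity of the code is the statement of the example itself and is not re-derived here.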

The next lemma gives the exact number of self-complementary 1-circular tessera codes.

Lemma 4.5

The maximal size of a self-complementary 1-circular tessera code is 12, and the number of such codes of maximal size is 4096.

Proof

Firstly, Example 4.4 shows that there are self-complementary circular codes of size 12, which is maximal. Secondly, in order to calculate the exact number of maximal self-complementary 1-circular codes, we first note that for 6 conjugacy classes the antitessera of a tessera from that class is found in another conjugacy class: the antitesserae of tesserae from class $D_2$ are all in class $D_5$, those from class $D_3$ in class $D_6$ and those from class $D_{10}$ in class $D_{12}$, and, of course, vice versa. Thus, we have $4^3$ possibilities to choose 6 tesserae from these conjugacy classes for a 1-circular self-complementary tessera code. As for the classes $D_1, D_4, D_7, D_8, D_9, D_{11}$, only the two self-complementary tesserae can be chosen from each of them, since the other two form a tessera-antitessera pair and are cyclically equivalent. This gives a further $2^6$ possibilities. Altogether we have $2^6 \cdot 4^3 = 4096$ maximal self-complementary 1-circular codes.
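The count in the proof can be re-derived by enumerating the classes of cyclically equivalent tesserae. The sketch below makes one assumption that should be checked against the definitions earlier in the paper: a tessera is taken to be $N_1N_2\gamma(N_1N_2)$, where $\gamma$ ranges over the three non-identity base exchanges of the Klein four group (A↔U/C↔G, A↔G/C↔U, A↔C/G↔U). All names in the code are hypothetical.

```python
from itertools import product

BASES = "ACGU"
# Non-identity elements of the Klein four group acting on bases (assumed).
TRANSFORMS = [
    str.maketrans("ACGU", "UGCA"),  # A<->U, C<->G (complement)
    str.maketrans("ACGU", "GUAC"),  # A<->G, C<->U
    str.maketrans("ACGU", "CAUG"),  # A<->C, G<->U
]
COMPLEMENT = TRANSFORMS[0]

def anti(w):
    """Reversed complement (anti-tessera)."""
    return w.translate(COMPLEMENT)[::-1]

def cyclic_class(t):
    return frozenset(t[k:] + t[:k] for k in range(4))

tesserae = {a + b + (a + b).translate(g)
            for a, b in product(BASES, repeat=2) for g in TRANSFORMS}
classes = {cyclic_class(t) for t in tesserae}

count = 1
seen = set()
for cls in classes:
    if cls in seen:
        continue
    image = frozenset(anti(t) for t in cls)
    seen.update({cls, image})
    if image == cls:
        # Class closed under anti: only its self-complementary tesserae qualify.
        count *= sum(1 for t in cls if anti(t) == t)
    else:
        # Two classes exchanged by anti: free choice in one, forced in the other.
        count *= len(cls)

print(len(tesserae), len(classes), count)  # 48 12 4096
```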

The following example shows that not every 1-circular self-complementary tessera code is also circular (or even 2-circular).

Example 4.6

Let us take (self-complementary) tesserae AAUU from the class D1 and CCGG from the class D4, as well as GGAA (from D2) and UUCC (from D5) which are complementary to each other. Then we have that the word CCGGAAUU has two different decompositions on a circle:

CCGG|AAUU and CC|GGAA|UU.
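The two decompositions can be found mechanically by trying every reading frame of the word written on a circle. This is a small illustrative sketch; the function name `circular_decompositions` is hypothetical, and it relies on all code words having the same length $n$, so that only the offsets $0, \dots, n-1$ need to be tested.

```python
def circular_decompositions(word, code, n=4):
    """Offsets at which the circular word splits into words of the code."""
    code = set(code)
    doubled = word + word  # lets blocks wrap around the end of the circle
    offsets = []
    for start in range(n):
        blocks = [doubled[start + i:start + i + n]
                  for i in range(0, len(word), n)]
        if all(b in code for b in blocks):
            offsets.append(start)
    return offsets

X = {"AAUU", "CCGG", "GGAA", "UUCC"}
print(circular_decompositions("CCGGAAUU", X))  # [0, 2]
```

Offset 0 corresponds to CCGG|AAUU and offset 2 to the frame GGAA|UUCC, so the code is not 2-circular.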

An extensive computer calculation yields the exact numbers of self-complementary circular and comma-free codes of maximal length:

Lemma 4.7

There are

  • 1584 self-complementary circular (Table 5) and

  • 16 self-complementary comma-free (Table 6)

tessera codes of maximal length.

Table 6.

The list of all self-complementary comma-free tessera codes of maximal length

UUAA CCAA AGGA UCCU UUGG CCGG UCGA CAUG ACGU AGCU ACUG CAGU
AAUU AACC AGGA UCCU GGUU GGCC UCGA CAUG ACGU AGCU ACUG CAGU
UUAA CCAA GAAG CUUC UUGG CCGG GAUC CAUG ACGU CUAG ACUG CAGU
AAUU AACC GAAG CUUC GGUU GGCC GAUC CAUG ACGU CUAG ACUG CAGU
UUAA CCAA AGGA UCCU UUGG CCGG UCGA UGCA GUAC AGCU UGAC GUCA
AAUU AACC AGGA UCCU GGUU GGCC UCGA UGCA GUAC AGCU UGAC GUCA
UUAA CCAA GAAG CUUC UUGG CCGG GAUC UGCA GUAC CUAG UGAC GUCA
AAUU AACC GAAG CUUC GGUU GGCC GAUC UGCA GUAC CUAG UGAC GUCA
AAUU ACCA AAGG CCUU UGGU CCGG GAUC UGCA ACGU AGCU GACU AGUC
UUAA ACCA GGAA UUCC UGGU GGCC GAUC UGCA ACGU AGCU GACU AGUC
AAUU CAAC AAGG CCUU GUUG CCGG GAUC CAUG GUAC AGCU GACU AGUC
UUAA CAAC GGAA UUCC GUUG GGCC GAUC CAUG GUAC AGCU GACU AGUC
AAUU ACCA AAGG CCUU UGGU CCGG UCGA UGCA ACGU CUAG CUGA UCAG
UUAA ACCA GGAA UUCC UGGU GGCC UCGA UGCA ACGU CUAG CUGA UCAG
AAUU CAAC AAGG CCUU GUUG CCGG UCGA CAUG GUAC CUAG CUGA UCAG
UUAA CAAC GGAA UUCC GUUG GGCC UCGA CAUG GUAC CUAG CUGA UCAG

We now aim for a graph-theoretical characterization of self-complementarity for tessera codes. Let us start with some observations on self-complementary 1-circular tessera codes:

Lemma 4.8

Let $X \subseteq TESS$ be a self-complementary 1-circular tessera code. Then

$X_{SW} = \emptyset$.

Proof

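Let $X$ be a self-complementary 1-circular tessera code. Then for all $t = d_1d_2 \in X_{SW}$

$c(t) = c(d_2)c(d_1) = d_2d_1 = \alpha_2(t)$

where $d_1, d_2 \in \Sigma_{SW}$. By self-complementarity, $c(t)$ would also be in $X$. However, cyclically equivalent tesserae cannot be in the same 1-circular code, so $X_{SW}$ must be empty.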
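The identity used in this proof can be verified for all candidate tesserae at once. The sketch assumes that $\Sigma_{SW}$ consists of the four self-complementary dinucleotides AU, UA, CG, GC mentioned in the proof of Lemma 4.9; the variable names are hypothetical.

```python
from itertools import permutations

COMPLEMENT = str.maketrans("ACGU", "UGCA")  # A<->U, C<->G

def anti(w):
    """Reversed complement, written c(.) in the proof."""
    return w.translate(COMPLEMENT)[::-1]

SIGMA_SW = ("AU", "UA", "CG", "GC")  # assumed content of the class

# All tesserae d1 d2 with distinct d1, d2 from the class.
candidates = [d1 + d2 for d1, d2 in permutations(SIGMA_SW, 2)]

# c(t) coincides with the cyclic shift by two positions, alpha_2(t).
identity_holds = all(anti(t) == t[2:] + t[:2] for t in candidates)
print(len(candidates), identity_holds)  # 12 True
```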

The next property was discovered by examining maximal circular codes of codons (RNA triplets) (Fimmel et al. 2018). Assume that $Y \subseteq B^3$ is a self-complementary trinucleotide code and $G(Y) = (V, E)$ is the graph associated to $Y$. Then the following conditions hold:

  1. $V = c(V)$

  2. $d^+(v) = d^-(c(v))$ for all vertices $v \in V$

where $d^+(v)$ denotes the number of outgoing edges of a vertex $v$ (directed edges that start in $v$) and $d^-(v)$ denotes the number of ingoing edges. It was also shown in Fimmel et al. (2018) that the above conditions are in general not sufficient to ensure self-complementarity, but that they are sufficient for circular codes of size at least 18.

We will show next that in the case of tesserae or dinucleotides, the size of the code does not matter and that one can obtain a similar result. Let us first prove the claim for dinucleotides:

Lemma 4.9

Let $X \subseteq B^2$ be a 1-circular dinucleotide code and $G(X) = (V, E)$ its associated graph. Then $X$ is self-complementary if and only if

  1. $V = c(V)$

  2. $d^+(v) = d^-(c(v))$ for all vertices $v \in V$

Proof

Let $X$ be a self-complementary dinucleotide code and $l_1l_2 \in X$ for some $l_1, l_2 \in B$. Due to the self-complementarity of $X$ we have $c(l_2)c(l_1) \in X$, which implies both conditions (1) and (2). Conversely, assume that $X$ is a 1-circular code. Then its associated graph $G(X)$ can be embedded into a tournament on the four vertices $A, C, G, U \in B$ (compare Fimmel et al. 2017). Assume that $G(X)$ satisfies the conditions (1) and (2). The presence or absence of the self-complementary dinucleotides $AU$, $UA$, $CG$ or $GC$ in $X$ affects neither the self-complementarity of $X$ nor the conditions (1) and (2). Let us therefore focus on the non-self-complementary dinucleotides of $X$. Suppose without loss of generality that the dinucleotide $AC$ is in the code. For conditions (1) and (2) to be met, a dinucleotide $N_1U$ and a dinucleotide $GN_2$ must be in the code. This can be achieved in three ways:

  • $N_1 = G$, $N_2 = U$: in this case $AC = c(GU)$ holds, or

  • $N_1 = C$, $N_2 = U$: condition (2) can now only be met if the dinucleotide $AG \in X$, and the code is self-complementary, or

  • $N_1 = C$, $N_2 = A$: condition (2) can now only be met if the dinucleotide $UG \in X$, and the code is self-complementary.

This proves that X is self-complementary.

In the case of tesserae we additionally have to consider the condition from Lemma 4.8, and we obtain a handy characterization of self-complementarity.

Theorem 4.10

Let $X \subseteq TESS$ be a 1-circular tessera code and $C_2(X) = (V_2, E_2)$. Then $X$ is self-complementary if and only if

  1. $X_{SW} = \emptyset$

  2. $V_2 = c(V_2)$

  3. $d^+(v) = d^-(c(v))$ for all vertices $v \in V_2$

Proof

One implication is analogous to the proof of Proposition 3.1 in Fimmel et al. (2018), taking Lemma 4.8 into account. Conversely, assume that $X \subseteq TESS$ is a 1-circular tessera code that satisfies all three conditions (1), (2), (3). It is immediately clear by direct verification that for all equivalence classes $\Sigma_i$ with $i \in \{I, SW, YR, KM\}$

$c(\Sigma_i) = \Sigma_i$

holds, i.e. the dinucleotide codes $\Sigma_i$ are self-complementary. So we can restrict ourselves to the consideration of $C_2(X_i)$ for $i \in \{I, SW, YR, KM\}$. Since $X$ is a 1-circular code, each $C_2(X_i)$ is embedded into a tournament on four nodes.

Secondly, as we can see from Table 1, two of the six tesserae represented in each tournament, except for the one corresponding to $\Sigma_{SW}$, are self-complementary:

  • For $\Sigma_I$ these are AAUU (or UUAA) and CCGG (or GGCC)

  • For $\Sigma_{YR}$ these are ACGU (or GUAC) and UGCA (or CAUG)

  • For $\Sigma_{KM}$ these are AGCU (or CUAG) and UCGA (or GAUC)

and for each non-self-complementary tessera $T = d_1d_2 \in X_i$, where $i \in \{I, SW, YR, KM\}$, its anti-tessera must lie in the same component $X_i$ due to the fact that

$c(T) = c(d_2)c(d_1)$.

The rest of the proof can now be done analogously to the proof of Lemma 4.9.

In the theorem above, the condition of 1-circularity cannot be omitted, as the following example shows:

Example 4.11

Let us consider the following tessera code

X={CUGA,GACU,AGAG,UCUC}.

The code is obviously neither 1-circular nor self-complementary since, for instance, $c(AGAG) = CUCU \notin X$. Nevertheless, all three conditions from Theorem 4.10 are fulfilled. In the picture below, the round and square nodes represent pairs of reverse-complementary dinucleotides.

[Figure: graph of the code X from Example 4.11; round and square nodes mark pairs of reverse-complementary dinucleotides]
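The claims of Example 4.11 can be checked by building the graph on dinucleotides explicitly. The sketch below is an illustration under assumptions: $C_2(X)$ is taken to have an edge $N_1N_2 \to N_3N_4$ for every tessera $N_1N_2N_3N_4 \in X$, $X_{SW}$ is taken to consist of the tesserae whose dinucleotides lie in {AU, UA, CG, GC}, and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGU", "UGCA")  # A<->U, C<->G
SIGMA_SW = {"AU", "UA", "CG", "GC"}         # assumed content of the class

def anti(w):
    """Reversed complement, written c(.) in the text."""
    return w.translate(COMPLEMENT)[::-1]

def is_self_complementary(code):
    return all(anti(t) in code for t in code)

def c2_conditions(code):
    """Conditions (1), (2), (3) of Theorem 4.10 for the assumed graph C2(X)."""
    edges = {(t[:2], t[2:]) for t in code}
    vertices = {v for e in edges for v in e}
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    cond1 = not any(t[:2] in SIGMA_SW for t in code)
    cond2 = {anti(v) for v in vertices} == vertices
    cond3 = all(out_deg[v] == in_deg[anti(v)] for v in vertices)
    return cond1, cond2, cond3

X = {"CUGA", "GACU", "AGAG", "UCUC"}
print(c2_conditions(X))          # (True, True, True)
print(is_self_complementary(X))  # False
```

The code fails to be 1-circular because CUGA and GACU are cyclic shifts of each other and AGAG equals its own shift by two positions.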

We conclude this section with a second theorem that gives a graph-theoretical characterization for tessera codes that are not necessarily 1-circular, using the graph component $C_1(X)$ of a code $X$.

Theorem 4.12

Let $X \subseteq TESS$ be a tessera code and $C_1(X) = (V_1, E_1)$. Then $X$ is self-complementary if and only if

  1. $V_1 = c(V_1)$

  2. $d^+(v) = d^-(c(v))$ for all vertices $v \in V_1$

Proof

Let us assume that $X \subseteq TESS$ satisfies properties (1) and (2) from Theorem 4.12. Then for any tessera $N_1N_2N_3N_4 \in X$ we have $N_2N_3N_4 \in V_1$ and, by property (1), also $c(N_4N_3N_2) \in V_1$. Property (2) then implies that $c(N_4N_3N_2)N_5 \in X$ for some base $N_5$. It is clear that $N_5$ has to be the complement of $N_1$ by the definition of tesserae. More precisely, let $\pi \in V$ be such that $N_2 = \pi(N_4)$, which implies $c(N_2) = \pi(c(N_4))$ and thus $c(N_3) = \pi(N_5)$. Hence $N_5 = c(N_1)$. Therefore $c(N_4N_3N_2)N_5 = c(N_1N_2N_3N_4) \in X$ and $X$ is self-complementary.
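The two conditions of Theorem 4.12 can be evaluated on the examples of this section. The sketch assumes that $C_1(X)$ contains, for every tessera $N_1N_2N_3N_4 \in X$, the edges $N_1 \to N_2N_3N_4$ and $N_1N_2N_3 \to N_4$, and that $c$ acts on vertices as the reversed complement; `c1_conditions` is a hypothetical name.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGU", "UGCA")  # A<->U, C<->G

def anti(w):
    """Reversed complement of a word of any length."""
    return w.translate(COMPLEMENT)[::-1]

def c1_conditions(code):
    """Conditions (1) and (2) of Theorem 4.12 for the assumed graph C1(X)."""
    edges = set()
    for t in code:
        edges.add((t[:1], t[1:]))  # N1 -> N2N3N4
        edges.add((t[:3], t[3:]))  # N1N2N3 -> N4
    vertices = {v for e in edges for v in e}
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    cond1 = {anti(v) for v in vertices} == vertices
    cond2 = all(out_deg[v] == in_deg[anti(v)] for v in vertices)
    return cond1, cond2

example_4_4 = {"AAUU", "CCGG", "AGCU", "UGCA", "GUAC", "UCGA",
               "AAGG", "CCUU", "AACC", "GGUU", "AGUC", "GACU"}
example_4_11 = {"CUGA", "GACU", "AGAG", "UCUC"}

print(c1_conditions(example_4_4))      # (True, True)
print(c1_conditions(example_4_11)[0])  # False: UGA is a vertex, UCA is not
```

In contrast to the conditions on $C_2(X)$, the graph $C_1(X)$ already detects that the code of Example 4.11 is not self-complementary.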

Let us make a final remark: A 1-circular tessera code $X$ represented by a tournament which is built on the four dinucleotides of one of the equivalence classes (see Table 1) is self-complementary if and only if the numbers 1, 2, 3, 4 (see paragraph Construct a Tournament) are assigned to the dinucleotides so that 1 is complementary to 4 and 2 is complementary to 3, i.e. $d_1 = c(d_4)$, $d_2 = c(d_3)$. In order to see this, let the order on dinucleotides be defined as described above, $d_id_j \in X$, $i < j$, $i, j \in \{1, 2, 3, 4\}$ and

$c(d_id_j) = c(d_j)c(d_i) = d_kd_l$.

If $i = 1$ or $j = 4$ then it is obvious that $k < l$, since $k = 1$ or $l = 4$, and hence $d_kd_l \in X$. The only remaining case is $i = 2$, $j = 3$. But in this case $k = 2$, $l = 3$ holds by definition of the order on dinucleotides, and $d_2d_3 \in X$. For the opposite direction, let $d_1 = c(d_2)$ and, correspondingly, $d_3 = c(d_4)$. Then $c(d_1d_3) = d_4d_2 \notin X$. The case $d_1 = c(d_3)$ is analogous. In both cases $X$ is not a self-complementary code. Here is an example.

Example 4.13

For example, let us consider the class $\Sigma_{KM}$. Then one possible self-complementary assignment would be: $1 \mapsto CU$, $4 \mapsto AG$, $2 \mapsto UC$ and $3 \mapsto GA$. The represented code $X_{KM}$ = {CUAG, CUUC, CUGA, UCAG, UCGA, GAAG} is self-complementary.
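The rule from the final remark can be compared with a direct test over all $4!$ numberings of one class. This is a minimal sketch with hypothetical helper names; the class $\Sigma_{KM}$ is written out as in Example 4.13.

```python
from itertools import permutations

COMPLEMENT = str.maketrans("ACGU", "UGCA")  # A<->U, C<->G

def anti(w):
    """Reversed complement, written c(.) in the text."""
    return w.translate(COMPLEMENT)[::-1]

def tournament_code(order):
    """Tesserae d_i d_j (i < j) for a numbering d_1, ..., d_4."""
    return {order[i] + order[j] for i in range(4) for j in range(i + 1, 4)}

def is_self_complementary(code):
    return all(anti(t) in code for t in code)

SIGMA_KM = ("AG", "GA", "CU", "UC")

# Numbering of Example 4.13: 1 -> CU, 2 -> UC, 3 -> GA, 4 -> AG.
example = tournament_code(("CU", "UC", "GA", "AG"))

# Direct test versus the rule d_1 = c(d_4), d_2 = c(d_3).
by_test = [p for p in permutations(SIGMA_KM)
           if is_self_complementary(tournament_code(p))]
by_rule = [p for p in permutations(SIGMA_KM)
           if anti(p[0]) == p[3] and anti(p[1]) == p[2]]

print(sorted(example))
print(by_test == by_rule, len(by_test))  # True 8
```

For this class, 8 of the 24 numberings give a self-complementary code of size 6.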

This shows that in the construction of all maximal circular tessera codes one can also identify and construct all maximal self-complementary circular codes.

Conclusions

In this work we have identified and characterized circular tessera codes and their properties. In Gonzalez et al. (2012) and Gonzalez et al. (2019) Gonzalez, Giannerini and Rosa had proposed an ancestor code of the universal genetic code that is based on 64 tetranucleotides built from dinucleotides by using the Klein four symmetry group. It was hypothesized that this tessera code existed before LUCA and even before the early genetic code that coded for 20 amino acids using all 64 codons. Possible primeval adaptor molecules that could decode the tessera were also modelled and it was shown that the tessera code mirrors exactly the degeneracy distribution of the mitochondrial genetic code.

We have combined the theory of tesserae with the theory of circular codes that have been studied extensively during the last decades. Circular codes were found by an extensive statistical investigation in Arqués and Michel (1996) and seem to play an important role in the detection and correction mechanisms of the ribosome during translation. Moreover, it was hypothesized in [13] that ancestor codes of the universal genetic code might have used codons from a circular code only. Thus it was reasonable to investigate circular tessera codes which could have existed between a primitive genetic code and the tessera code.

Our results show that circular tessera codes can be of size 12 at most, and we have given construction methods for all circular tessera codes of this size. Moreover, the numbers of circular (comma-free, self-complementary) tessera codes of every size between 1 and 12 have been calculated.

Acknowledgements

Open Access funding provided by Projekt DEAL.

Contributor Information

Elena Fimmel, Email: e.fimmel@hs-mannheim.de.

Martin Starman, Email: m.starman@live.com.

Lutz Strüngmann, Email: l.struengmann@hs-mannheim.de.

References

  1. Arqués DG, Michel CJ. A complementary circular code in the protein coding genes. J Theor Biol. 1996;182:45–58. doi: 10.1006/jtbi.1996.0142.
  2. Baranov PV, Venin M, Provan G. Codon size reduction as the origin of the triplet genetic code. PLoS ONE. 2009;4(5):e5708. doi: 10.1371/journal.pone.0005708.
  3. Clark J, Holton DA. A first look at graph theory. Newark: World Scientific; 1991.
  4. Cisowski D (2015) Tessera-based encoding of the mitochondrial genome. Bachelor-Thesis, Mannheim
  5. Crick F, Griffith JS, Orgel LE. Codes without commas. Proc Natl Acad Sci USA. 1957;43(5):416–21. doi: 10.1073/pnas.43.5.416.
  6. Fimmel E, Michel ChJ, Starman M, Strüngmann L. Self-complementary circular codes in coding theory. Theory Biosci. 2018;37(1):51–65. doi: 10.1007/s12064-018-0259-4.
  7. Fimmel E, Michel ChJ, Strüngmann L. Diletter circular codes over finite alphabets. Math Biosci. 2017;294:120–129. doi: 10.1016/j.mbs.2017.10.001.
  8. Fimmel E, Strüngmann L. Mathematical fundamentals for the noise immunity of the Genetic Code. BioSystems. 2018;164:186–198. doi: 10.1016/j.biosystems.2017.09.007.
  9. Fimmel E, Strüngmann L. Linear codes and the mitochondrial genetic code. BioSystems. 2019;184:103990. doi: 10.1016/j.biosystems.2019.103990.
  10. Fimmel E, Michel CJ, Strüngmann L. n-nucleotide circular codes in graph theory. Phil Trans A. 2016;374:20150058. doi: 10.1098/rsta.2015.0058.
  11. Fimmel E, Giannerini S, Gonzalez D, Strüngmann L. Circular codes, symmetries and transformations. J Math Biol. 2014;70(7):1623–44. doi: 10.1007/s00285-014-0806-7.
  12. Fimmel E, Giannerini S, Gonzalez D, Strüngmann L. Dinucleotide circular codes and bijective transformations. J Theor Biol. 2015;386:159–165. doi: 10.1016/j.jtbi.2015.08.034.
  13. Fimmel E, Michel Ch. J, Pirot F, Sereni JS, Starman M, Strüngmann L (2020) The relation between k-circularity and circularity of codes, submitted
  14. Fimmel E, Strüngmann L. Yury Borisovich Rumer and his biological papers on the genetic code. Phil Trans R Soc A. 2016;374:20150228. doi: 10.1098/rsta.2015.0228.
  15. Gonzalez DL, Giannerini S, Rosa R (2012) On the origin of the mitochondrial genetic code: towards a unified mathematical framework for the management of genetic information. In: Nature precedings. 10.1038/npre.2012.7136
  16. Gonzalez DL, Giannerini S, Rosa R (2019) On the origin of degeneracy in the genetic code. In: Interface Focus 9: 20190038. 10.1098/rsfs.2019.0038
  17. Michel CJ. The maximal C3 self-complementary trinucleotide circular code X in genes of bacteria, archaea, eukaryotes, plasmids and viruses. Life. 2017;7(20):1–16. doi: 10.3390/life7020020.
  18. Nirenberg MW, Matthaei JH. The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc Natl Acad Sci USA. 1961;47:1588–1602. doi: 10.1073/pnas.47.10.1588.
  19. Patel A. The triplet genetic code had a doublet predecessor. J Theor Biol. 2005;233:527–532. doi: 10.1016/j.jtbi.2004.10.029.
  20. Rotman JJ. An introduction to the theory of groups. Berlin: Springer; 1995.
  21. Seligmann H. Putative anticodons in mitochondrial tRNA sidearm loops: Pocketknife tRNAs? J Theor Biol. 2014;7(340):155–63. doi: 10.1016/j.jtbi.2013.08.030.
  22. Smith JM. The problems of biology. Oxford: Oxford University Press; 1986.
  23. Wilhelm T, Nikolajewa S. A new classification scheme of the genetic code. J Mol Evol. 2004;59(5):598–605. doi: 10.1007/s00239-004-2650-7.
  24. Wu HL, Bagby S, van den Elsen JM. Evolution of the genetic triplet code via two types of doublet codons. J Mol Evol. 2005;61(1):54–64. doi: 10.1007/s00239-004-0224-3.

Articles from Bulletin of Mathematical Biology are provided here courtesy of Springer
