Abstract
A Galois field GF(p^m), with p a prime number and m a positive integer, is a mathematical structure widely used in cryptography and error-correcting code theory. In this paper, we propose a novel DNA-based model for arithmetic over GF(p^m). Our model has three main advantages over previously described models. First, it has a flexible laboratory implementation that allows arithmetic calculations to be carried out in parallel for any prime p, whereas the tile assembly and sticker models are limited to the binary case (p = 2). Second, the proposed model is less prone to error, because it relies on conventional Polymerase Chain Reaction (PCR) amplification and gel electrophoresis. Hence, it avoids the problems that arise in models such as tile assembly and stickers, which depend on more complex molecular techniques such as hybridization and denaturation. Third, it is simple to implement and requires 50 ng/μL per double-stranded DNA fragment used in the calculations, since the only feature of interest is the size of the fragments. The model has execution times of order O(1) and O(m) for addition and multiplication over GF(p^m), respectively. Furthermore, this paper provides one of the few pieces of experimental evidence of arithmetic calculations in molecular computing and validates the technical applicability of the proposed model to perform arithmetic operations over GF(p^m).
Keywords: Bioinformatics, Applied mathematics, Molecular computing technologies, Galois fields, Finite fields, DNA computing, Gel electrophoresis, Polymerase chain reaction (PCR)
1. Introduction
Fast-paced technological development keeps pushing computer science toward new boundaries. The field of DNA computing was born to address hard computational problems. The strategy of most algorithms developed within this novel area of study is brute force, relying on the huge capacity of DNA computing for parallel processing. The interest in designing a molecular computer is not limited to difficult search problems: if such a computer were able to carry out addition and multiplication, a wider range of problems could be addressed. However, most of the work done in the field of DNA computing is theoretical. Researchers tend to take for granted the feasibility of the biomolecular techniques proposed in the works of Adleman (Adleman, 1994, 1996; Roweis et al., 1998), Lipton (Lipton, 1995), Winfree (Winfree et al., 1998; LaBean et al., 1999; Rothemund et al., 2004) and Rothemund (Rothemund, 2006). Most algorithms that have appeared in the literature are based on a small number of DNA computing models and were introduced with no experimental work to back up their actual implementation.
We propose a new DNA-based model specifically designed to do arithmetic over Galois fields, which was successfully implemented in the laboratory. Galois fields, GF(p^m), are mathematical structures widely used in cryptography and in error-correcting code theory. In cryptography, the Diffie-Hellman key exchange scheme is implemented on elliptic and hyperelliptic curves defined over Galois fields (Menezes et al., 1996; Koblitz, 1998; Cohen et al., 2006). In error-correcting code theory, algebraic geometric codes use algebraic curves, such as Hermitian curves, defined over Galois fields (Sklar, 2001; Guajardo, 2004; Carrasco and Johnston, 2008). Our model has two main properties. First, the molecular techniques employed, Polymerase Chain Reaction (PCR) and electrophoresis, are standard, widely used, easily implemented and inexpensive, and only a few designed components are needed to carry out the experiments. Second, our model allows calculations over GF(p^m) for a prime number p and an integer m. In contrast, all works on DNA molecular computation over finite fields found in the literature are restricted to the binary fields GF(2) and GF(2^m).
This paper is organized as follows. Section 2 introduces DNA computing models. Section 3 presents basic mathematical concepts about Galois fields. Section 4 presents the proposed DNA-based model for arithmetic over Galois fields. Section 5 describes the physical molecular implementation of the proposed DNA-based model. Section 6 summarizes the experimental results obtained for as a case study. Section 7 presents a simulation of the proposed DNA-based model using Field Programmable Gate Array (FPGA) technology. Section 8 contains the discussion of the experimental results, the analysis of the advantages and the description of a possible DNA-based computer that implements arithmetic over using the proposed model. Section 9 presents conclusions and future work.
2. DNA computing models
The tendency in computer technology is to produce devices with greater memory and speed than the previous generation, but much smaller. The idea of building a tiny computer is not new. In the late 1950s, Richard Feynman suggested the possibility of sub-microscopic computers in his famous talk “There's Plenty of Room at the Bottom”. However, only about two decades ago Leonard Adleman made a breakthrough when he used the tools of molecular biology to address an NP-complete problem (Adleman, 1994). He succeeded in solving an instance of the Hamiltonian path problem by manipulating DNA. This event marked the birth of the field known as DNA computing (Kari, 1997).
The speed of any computing device, biomolecular or not, depends on how many processes it runs in parallel and how many steps each process can perform per unit of time. Electronic computers can execute millions of instructions per second, a rate a biological system cannot match. However, a DNA computer has a huge advantage in parallel processing and memory (Goldman et al., 2013), which compensates for the much slower execution of a single instruction (Lipton, 1995; Guarnieri et al., 1996). The immense capacity for parallelization of DNA computing appeared to be the key to outperforming electronic computers. The advent of the new discipline at first augured the end of silicon-based computers; however, scientists in the field soon acknowledged that there were obstacles to realizing a competitive DNA-based computer (Gibbons et al., 1996; Regalado, 2000).
The models developed for DNA computing can be classified into two types: those that require human intervention during the calculation and those that can be programmed to function autonomously. Early research, following the works of Adleman (Adleman, 1994) and Lipton (Lipton, 1995), provided a variety of non-autonomous models, known as filtering models, for solving complex computational problems. Filtering models use large DNA combinatorial libraries as search spaces for parallel filtering algorithms (Ignatova et al., 2008). Most of these works were theoretical (Adleman, 1996; Gibbons et al., 1996; Reif, 1995; Rozenberg and Spaink, 2003); however, a few specific problems were actually solved in the laboratory: 3-SAT problems with 3 (Liu et al., 2000), 6 (Braich et al., 2000) and 20 (Braich et al., 2002) variables, and a variation of the SAT problem known as the knight problem (Faulhammer et al., 1999).
To solve a wider range of problems, a computer should be able to carry out addition and multiplication. However, carrying out binary operations poses other challenges. Guarnieri and colleagues (Guarnieri et al., 1996) presented a general algorithm to perform the addition of two nonnegative binary numbers. In the same year, Roweis and colleagues introduced the sticker model, a complete and universal system (Roweis et al., 1998), which has been considered for arithmetic over finite fields (Chang et al., 2005; Guo and Zhang, 2009; Li et al., 2013a). The sticker system is also a filtering model, which uses two types of single-stranded DNA molecules, named memory strands and sticker strands. A memory strand and a number of sticker strands hybridize to form a partial duplex (memory complex), which represents a bit string of zeros and ones. The main issues with this model are the limited length of a memory complex (it might fragment if it is longer than 15,000 bases) and time-consuming operations that are prone to error (stickers may bind to the wrong sites, or unbind when they are not supposed to). In 1998, Erik Winfree (Winfree, 1998) provided a remarkable new approach in the emerging field when he proposed that DNA self-assembly could be used to do computation in an autonomous manner. Winfree explored algorithmic self-assembly, which combines Wang's tiling theory (Wang, 1961) with DNA nanotechnology, introduced by Seeman (Seeman, 1982). Winfree showed that DNA computation is Turing-universal and proposed that DNA self-assembly can be used to compute functions or assemble shapes (Winfree, 1996; Winfree et al., 1998; Rothemund et al., 2004). The model introduced by Winfree and colleagues, known as the tile assembly model (TAM), has been considered for implementing arithmetic over finite fields (Barua and Das, 2003; Li et al., 2013b, 2016; Li and Xiao, 2014).
TAM is based on the self-assembly of double-crossover DNA molecules (known as tiles) into a rectangular lattice, a pseudo-crystalline growth that occurs in the presence of an infinite supply of a finite number of tile types (Rothemund and Winfree, 2000; Jonoska et al., 2011). Whether tiles glue together depends on the binding domains on their sides. To carry out a computation, one must start with an arrangement of tiles, called the seed configuration, and a set of unattached tiles of different types. The calculation proceeds by annealing, ligation and melting, which occur in a controlled manner. A final configuration containing the result is obtained (Brun, 2007). The disadvantages of this model are the high error rate, the large number of components that a single calculation requires, and the fact that the seed configuration cannot be recycled (Brun, 2008; Brun and Medvidovic, 2007).
Despite the progress achieved in the field of DNA computing, major drawbacks remain unresolved in all the models mentioned: time-consuming operations with a high error rate, outputs that follow statistical laws, and an amount of DNA that grows exponentially with problem size (Kari et al., 2012). Recently, Woods and colleagues presented a reprogrammable model of self-assembly (Woods et al., 2019). In addition, Currin and colleagues presented a non-deterministic Turing-universal model intended to overcome the problems posed by previous models (Currin et al., 2017). However, the drawbacks associated with the complexity of the experiments are still an issue.
3. Basic concepts about Galois fields
In this section, basic concepts about Galois fields are presented (Guajardo, 2004; Hungerford, 2012; Koblitz, 1998; Menezes et al., 1996; Sklar, 2001).
A Galois field GF(p) is a finite set with addition and multiplication modulo p, defined in Tables 1(a) and (b), respectively. Here, p is a prime number.
Table 1.
Definitions for addition and multiplication in .
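(a) Addition: for operands a and b in {0, 1, …, p − 1}, the table entry is (a + b) mod p, where p is the prime modulus.
(b) Multiplication: for the same operands, the table entry is (a · b) mod p.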
Next, we briefly explain the method for constructing an extension field , with and , using as the underlying field.
First, an irreducible polynomial of degree over is selected,
| (1) |
where for . The polynomial is called a primitive polynomial. Let be a root of , that is, ; then
| (2) |
where is the additive inverse of according to Table 1a.
Next, is constructed recursively as,
and, the element is replaced using Eq. (2),
Then,
where , , , and .
Thus, the nonzero elements of are generated as linear combinations of in the following manner,
| (3) |
with , , . We should note that , and the null element does not have a representation as a power of . Hence, the field has elements, which are stored in a look-up table according to the powers of each element.
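The recursive construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field is taken to be GF(3^5), and the polynomial x^5 + 2x + 1 is a hypothetical choice of primitive polynomial over GF(3), not necessarily the one used in this paper. Elements are stored as coefficient arrays [a_0, ..., a_{m-1}], and the names `times_alpha` and `power_table` are ours.

```python
# Assumed parameters for illustration only: p = 3, m = 5, and the
# polynomial x^5 + 2x + 1 (a hypothetical choice, not taken from the text).
P = 3
M = 5
POLY = [1, 2, 0, 0, 0]  # p_0 .. p_{m-1} of x^m + p_{m-1} x^{m-1} + ... + p_0


def times_alpha(elem, poly=POLY, p=P):
    """Multiply the coefficient array [a_0..a_{m-1}] by alpha and reduce."""
    top = elem[-1]                 # coefficient that would land on alpha^m
    shifted = [0] + elem[:-1]      # multiplying by alpha shifts powers up by one
    # alpha^m = -(p_0 + p_1 alpha + ...), so subtract top * p_j at each position
    return [(s - top * pj) % p for s, pj in zip(shifted, poly)]


def power_table(m=M, poly=POLY, p=P):
    """Return {k: coefficients of alpha^k} until the powers cycle back to 1."""
    table = {}
    elem = [1] + [0] * (m - 1)     # alpha^0 = 1
    k = 0
    while True:
        table[k] = elem
        elem = times_alpha(elem, poly, p)
        k += 1
        if elem == table[0]:       # back at 1: every power has been listed
            return table


table = power_table()
print(len(table))   # number of nonzero elements generated
print(table[5])     # alpha^5 as [a_0, a_1, a_2, a_3, a_4]
```

With these assumed parameters the table has p^m − 1 = 242 entries, one per nonzero element, which is what makes the polynomial primitive; the zero element has no entry because it is not a power of the root.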
Next, we explain how addition and multiplication of the elements of the field are carried out.
Let , where
Their addition is calculated as follows
| (4) |
where is calculated in using Table 1(a), for . There is no carry or borrow, because are independent of each other.
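As a minimal sketch of the coefficient-wise addition just described, the following Python function adds two elements stored as coefficient arrays. The operand values and the function name `gf_add` are hypothetical and chosen only for illustration, with p = 3 and m = 5 assumed.

```python
def gf_add(a, b, p):
    """Add two field elements given as coefficient arrays [a_0..a_{m-1}].

    Each position is added modulo p on its own, so no carry or borrow
    propagates between positions.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of coefficients")
    return [(x + y) % p for x, y in zip(a, b)]


# Hypothetical operands, not taken from the examples in the text.
A = [1, 2, 0, 1, 2]
B = [2, 2, 1, 0, 2]
print(gf_add(A, B, 3))  # [0, 1, 1, 1, 1]
```

Because every position is handled independently, the m additions can be evaluated at the same time, which is the property the gel-based implementation relies on.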
On the other hand, the multiplication, , is calculated using Algorithm 1 (Guajardo, 2004).
In the seventh step of the algorithm, we must set , when .
In the following sections, we refer to steps 2 to 9 as the external cycle, and to the two internal for cycles, that is, the first cycle (steps 3 to 5) and the second cycle (steps 6 to 8), as cycles IF-A and IF-B, respectively. Cycles IF-A and IF-B can be executed in parallel, since they are independent of each other.
Algorithm 1
Multiplication for .
Input: where . Output: where . 1. 2. for to do 3. for to do 4. 5. end_for 6. for to do 7. 8. end_for 9. end_for 10. Return
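For readers who want something executable, the sketch below multiplies two elements with a generic shift-and-reduce (interleaved) scheme. It is not a transcription of Algorithm 1, whose individual steps are not reproduced here; it only shares the overall shape of one external loop over the coefficients of one operand with position-wise work inside. The polynomial x^5 + 2x + 1 over GF(3) and the name `gf_mul` are assumptions made for illustration.

```python
def gf_mul(a, b, poly, p):
    """Multiply coefficient arrays a and b (lowest power first).

    poly holds p_0..p_{m-1} of the monic polynomial
    x^m + p_{m-1} x^{m-1} + ... + p_0 used for reduction.
    """
    m = len(poly)
    c = [0] * m
    for i in range(m - 1, -1, -1):          # external loop, highest power of b first
        top = c[-1]
        # First inner pass: c <- c * alpha, reduced with alpha^m = -(p_0 + ...).
        c = [((c[j - 1] if j > 0 else 0) - top * poly[j]) % p for j in range(m)]
        # Second inner pass: c <- c + b_i * a, position by position.
        c = [(c[j] + b[i] * a[j]) % p for j in range(m)]
    return c


POLY = [1, 2, 0, 0, 0]                       # assumed: x^5 + 2x + 1 over GF(3)
alpha = [0, 1, 0, 0, 0]
alpha4 = [0, 0, 0, 0, 1]
print(gf_mul(alpha, alpha4, POLY, 3))        # alpha^5 = alpha + 2 -> [2, 1, 0, 0, 0]
```

Within each inner pass, every position j depends only on values from the previous pass, so the m positions of a pass can be evaluated simultaneously.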
Example 1
For the field , the addition and multiplication are defined in Tables 2(a) and (b), respectively.
Table 2.
Definitions for addition and multiplication in .
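(a) Addition

| + | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0 | 1 | 2 |
| 1 | 1 | 2 | 0 |
| 2 | 2 | 0 | 1 |

(b) Multiplication

| × | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 2 |
| 2 | 0 | 2 | 1 |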
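The addition and multiplication tables of a prime field can be generated mechanically. The short Python sketch below builds both tables for any prime p and prints them for p = 3; the function name `op_tables` is ours and the code is only an illustration of arithmetic modulo p.

```python
def op_tables(p):
    """Return (addition, multiplication) tables of GF(p) as nested lists.

    Entry [a][b] holds (a + b) mod p or (a * b) mod p, respectively.
    """
    add = [[(a + b) % p for b in range(p)] for a in range(p)]
    mul = [[(a * b) % p for b in range(p)] for a in range(p)]
    return add, mul


add3, mul3 = op_tables(3)
for row in add3:
    print(row)
for row in mul3:
    print(row)
```

Both tables are symmetric, since addition and multiplication in a field are commutative.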
The extension field is constructed using the primitive polynomial . If is a root of , then
Now, we construct the non-null elements of recursively as follows,
Equivalently, we can represent the elements of as arrays of elements in .
Next we build a look-up table that contains all the elements of . In particular, this field has elements and Table 3 shows some of its elements.
Table 3.
Look-up table with some non-null elements of .
We use Algorithm 1 to calculate the multiplication in , where and . Initially, the array is initialized with
Then, the input values for and are
The coefficients of the primitive polynomial are
In Tables 4, 5, 6, 7, and 8, we detail the iterations of Algorithm 1 to calculate .
Table 4.
Iteration for the external cycle.
| Cycle IF-A | ||
| Cycle IF-B | ||
Table 5.
Iteration for the external cycle.
| Cycle IF-A | ||
| Cycle IF-B | ||
Table 6.
Iteration for the external cycle.
| Cycle IF-A | ||
| Cycle IF-B | ||
Table 7.
Iteration for the external cycle.
| Cycle IF-A | ||
| Cycle IF-B | ||
Table 8.
Iteration for the external cycle.
| Cycle IF-A | ||
| Cycle IF-B | ||
Finally,
4. Proposed DNA-based model for arithmetic over and
We have developed a simple DNA-based model to perform addition and multiplication over the fields and , . It is based on the differential migration of dsDNA fragments of different sizes in a gel electrophoresis, which is a standard technique for the separation of double-stranded DNA (dsDNA) fragments of different sizes that are previously obtained by PCR. Here the size of a dsDNA fragment corresponds to the number of base pairs that are contained in the fragment.
Each element is represented by a dsDNA fragment whose size is unique to the element . Therefore, only dsDNA fragments are necessary to represent all the elements of . Table 9 shows this representation using dsDNA of different sizes, where the smallest size is and the largest size is .
Table 9.
DNA representation for elements .
| 0 | 1 | 2 | |||
| Size of DNA fragment |
Gel electrophoresis is used to visualize the DNA molecular representation of a nonzero element , which encodes the coefficients of the polynomial expression given by Eq. (3). The dsDNA fragments for each coefficient are loaded into different slots of the agarose gel matrix. The slots and their respective columns are numbered as according to the order of powers from left to right. Then, an electric field is applied so that the molecules migrate through the gel and separate by size. Figure 1 shows the dsDNA fragment representation of . For this purpose, chains with size were loaded in the slot , chains with size were loaded in the slot , and from slot to slot , chains with size were loaded. Finally, chains with size were loaded in the slot , and chains with size were loaded in the slots and . Thus, our model defines a unique DNA-based representation for each element of .
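The representation described above amounts to a mapping from coefficients to fragment sizes, one fragment per slot. The Python sketch below illustrates the idea for p = 3. The sizes 77, 110 and 639 bp are the product lengths listed in Table 10, but which field element each size stands for, and the left-to-right slot order (highest power first), are assumptions made only for this illustration.

```python
# Hypothetical assignment of GF(3) elements to fragment sizes in base pairs.
FRAGMENT_BP = {0: 77, 1: 110, 2: 639}


def encode(coeffs, sizes=FRAGMENT_BP):
    """Map [a_0..a_{m-1}] to fragment sizes per slot, highest power leftmost."""
    return [sizes[c] for c in reversed(coeffs)]


def decode(lanes, sizes=FRAGMENT_BP):
    """Read fragment sizes per slot back into [a_0..a_{m-1}]."""
    by_size = {bp: elem for elem, bp in sizes.items()}
    return [by_size[bp] for bp in reversed(lanes)]


element = [2, 1, 0, 0, 0]            # 2 + alpha, lowest power first
lanes = encode(element)
print(lanes)                         # [77, 77, 77, 110, 639]
print(decode(lanes) == element)      # True
```

Since each element of the base field has its own fragment size, the mapping is invertible, which is what makes the band pattern a unique representation of a field element.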
Example 2
For and the extension field , the dsDNA fragments and are required to represent the elements of . Figure 2 shows the DNA-based representation of and . Such representations are obtained at the end of the gel electrophoresis process.
To calculate addition and multiplication in , it is first necessary to establish a key configuration to interpret addition and multiplication in , according to Tables 1(a) and (b), respectively. In the key configuration, the band pattern in any column (depicting dsDNA fragments on a gel matrix) represents the addition or multiplication of two elements of the field . This is illustrated in the following example.
Example 3
The DNA-based implementation of addition in requires dsDNA fragments of 3 different sizes. For example, to carry out the addition , dsDNA fragments and are loaded into a slot in the agarose gel matrix. Then, the electrophoresis is run, and the resulting band pattern is interpreted according to the key configuration for addition over , shown in Figure 3. In this figure, a pattern like the one shown in column is interpreted as , the result of or . Similarly, in Figure 4, to calculate , the dsDNA fragments and are loaded into the gel matrix and the electrophoresis is run. The resulting configuration, identical to the one shown in column of the key configuration for multiplication, is read as , the result of or .
Using these key configurations, we can carry out addition and multiplication of any two elements of by gel electrophoresis. Addition is calculated by adding the coefficients of corresponding powers of each operand, as explained in Eq. (4) of Section 3. Each of these additions is independent of the others, since there is no carry or borrow. This is best explained by the following example of addition over .
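One way to think about a key configuration is as a look-up from the set of bands observed in a column to a result. The Python sketch below builds such a key for addition and for multiplication with p = 3. It is a simplified model, not the laboratory procedure: fragment sizes are those of Table 10, their assignment to the elements 0, 1 and 2 is hypothetical, and a column in which both operands are equal is modeled as showing a single band.

```python
from itertools import combinations_with_replacement

# Hypothetical assignment of GF(3) elements to fragment sizes in base pairs.
FRAGMENT_BP = {0: 77, 1: 110, 2: 639}


def key_configuration(op, p=3, sizes=FRAGMENT_BP):
    """Map each band pattern (set of sizes in one column) to op(a, b) mod p."""
    key = {}
    for a, b in combinations_with_replacement(range(p), 2):
        pattern = frozenset((sizes[a], sizes[b]))   # unordered: a,b and b,a coincide
        key[pattern] = op(a, b) % p
    return key


add_key = key_configuration(lambda a, b: a + b)
mul_key = key_configuration(lambda a, b: a * b)

column = frozenset((110, 639))           # fragments for 1 and 2 in the same slot
print(add_key[column], mul_key[column])  # 0 2
```

The look-up is well defined because both operations are commutative, so the same band pattern is read the same way regardless of which operand contributed which fragment.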
Example 4
The addition of and is calculated as
Next, we use Table 3, the previously constructed look-up table of , to find the representation of as a power of a root of the primitive polynomial . Then, the linear combination or, equivalently, the array corresponds to .
For the implementation of addition by gel electrophoresis, dsDNA fragments representing corresponding coefficients of and are loaded into five slots in the agarose matrix. The slots (columns) are numbered from left to right. The dsDNA fragments and representing and are loaded into slot 4, and representing and are loaded into slot 3. This procedure is repeated for the rest of the coefficients. Next, the electrophoresis is run. Figure 5 shows the resulting band pattern from calculating . The configurations in columns 4 and 3 match the configurations in columns and of Figure 3, respectively. Thus, the band patterns in these columns are read as 1 and 2. The columns are independent of each other, because there is no carry or borrow; therefore, they are interpreted separately. The complete band pattern is interpreted as the array , which corresponds to element , according to Table 3.
Below, we explain the DNA-based implementation for the multiplication of two elements of .
Example 5
We use Algorithm 1 to calculate the multiplication in , where and . Initially, the agarose gel matrix is empty. Then, the array is initialized according to step 1 of Algorithm 1, with
Next, dsDNA fragments representing corresponding coefficients of both elements are loaded into five slots of the agarose gel matrix. Then, the electrophoresis is run according to Algorithm 1, where the input values and their representation as dsDNA fragments for and are
and
respectively. The coefficients of the primitive polynomial and their representation as dsDNA fragments are
Figure 1.
dsDNA fragments representation of performed by agarose gel electrophoresis.
Figure 2.
DNA-based representation for .
Figure 3.
Key configuration for addition over , used to interpret band patterns in gel electrophoresis.
Figure 4.
Key configuration for multiplication over , used to interpret band patterns in gel electrophoresis.
Figure 5.
Gel electrophoresis implementation of in .
Tables 4 and 8 show the iterations of the algorithm for cycles IF-A and IF-B for and , respectively. Then, the array is searched in the rows of Table 3 for , and it is determined that .
For Table 4, the theoretical scheme of gel electrophoresis for cycles IF-A (steps 3 to 5) and IF-B (steps 6 to 8) in Algorithm 1 is shown in Figure 6. We use three different sizes for the dsDNA fragments, in order to implement the addition and the multiplication in steps 4 and 7. Furthermore, as explained in Section 3, the cycles IF-A and IF-B are independent of each other. This allows executing them in parallel, using gel electrophoresis. Figure 8 in Section 6 shows this condition of parallelism empirically in the lab.
Figure 6.
Interpretation of gel electrophoresis: cycles IF-A and IF-B for and .
Figure 8.
Practical implementation for Figure 7 by DNA gel electrophoresis.
For internal cycles IF-A and IF-B, the per column configuration in the lower half of the gel matrix ( or ), is interpreted according to the key configuration for multiplication shown in Figure 4. The obtained result of multiplication ( or ) is added to the operand ( or ), on the upper half of the gel matrix, which is in the same column. The addition is calculated according to Table 2a. The first iteration is shown in Figure 6. At the last iteration , the final configuration is interpreted as the array using the look-up table described in Table 3 for , concluding that is the result of .
Therefore, our DNA-based model has a flexible implementation in the laboratory, because we only need to change:
- The number p, which defines how many dsDNA fragments of different sizes must be obtained beforehand by PCR.
- The number m, which defines how many slots of the gel matrix are used in an electrophoresis.
This allows us to calculate additions and multiplications in different fields and with . For example, if we want to change from the field to the field , we would only have to change from to , and from to . Naturally, for each field we must build the respective tables and DNA-based representations for addition and multiplication.
Finally, we analyze the efficiency of our model for calculating addition and multiplication in a field GF(p^m). For this, we assume that every electrophoresis run takes constant time, for both addition and multiplication. Thus,
- Addition has an execution time of order O(1), since it is calculated in a single electrophoresis, as shown in Examples 3 and 4 and Figures 3 and 5.
- Multiplication has an execution time of order O(m), since the internal cycles (IF-A and IF-B) of Algorithm 1 are calculated in parallel, in a single electrophoresis for each iteration of the external cycle (steps 2 to 9), as shown in Example 5 and Figure 6.
5. Physical implementation of the proposed DNA-based model
Before performing the agarose gel electrophoresis experiments, a dsDNA template is required from which to generate the different dsDNA fragments of known size by the PCR technique. For that purpose, the bacterial strain Sulfobacillus sp. CBAR-13, whose genomic DNA sequence was already known, was grown by microbial culture in the laboratory. Detailed information about the culture methodology is described below. Then, the genomic DNA was purified.
5.1. Cell growth and DNA template preparation
Bacterial strain CBAR-13 of Sulfobacillus sp. was grown in a shaking incubator at 59 °C in Single Strength medium [0.2 g/liter (NH4)2SO4, 0.4 g/liter MgSO4·7H2O, and 0.1 g/liter K2HPO4 (initial pH 1.7)] with 50 mM ferrous sulfate (membrane filtered) and 0.02% yeast extract. At the mid-exponential-growth phase, the bacterial cells were harvested, and the total genomic DNA was extracted with High Pure PCR Template Preparation Kit (Roche Product No. 11796828001) following the protocol prescribed by the manufacturer and then used for Polymerase chain reaction (PCR). The DNA fragments were obtained by PCR using a set of DNA primers designed specifically and the genomic DNA of CBAR-13 as a PCR template. The details are explained below.
5.2. Primer design
Primers were designed using Primer-BLAST software (Ye et al., 2012). The genomic sequence of S. sp. CBAR-13 (accession numbers NZ_LGRO01000001 and NZ_LGRO01000002) was used as the template for primer design. Table 10 shows all primers designed, synthesized and used in this study.
Table 10.
The three pairs of PCR primers used in this study and the expected size of each PCR product. Fw and Rv are forward and reverse primers, respectively.
| Primer | Sequence | Product length (bp) |
|---|---|---|
| 1-Fw | GACAGACCTGCTCGCTTCTT | 639 |
| 1-Rv | TGGTAAACGCGGGCAACTTA | |
| 4-Fw | TACTCCATCCGCCAGTCAGA | 110 |
| 4-Rv | GTTGACGTGCTGTGACAACC | |
| 5-Fw | GTTGTCACAGCACGTCAACC | 77 |
| 5-Rv | AAGTACAAGAGCGCCAACGA | |
5.3. PCR protocol for generation of dsDNA fragments
PCR amplification was carried out in a 50 μl reaction volume containing 50–100 ng of template DNA, primers (1 μM each Fw and Rv), dNTPs (10 μM each), MgCl2 (2 mM), 5X Green GoTaq® Flexi Buffer (1X final concentration) and 1.25 U GoTaq® DNA Polymerase (Promega catalog M7801).
The conditions for the PCR reactions were: 98 °C for 3 min, followed by 30 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 45 s, extension at 72 °C for 30 s, and a final extension at 72 °C for 5 min.
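As a quick check on the cycling program above, the nominal time spent at the programmed temperatures can be added up. The Python sketch below does this arithmetic; it ignores the ramping time between temperatures, which depends on the thermocycler, so the real run takes longer. The function name `pcr_minutes` is ours.

```python
def pcr_minutes(initial_s, cycles, step_seconds, final_s):
    """Nominal PCR hold time in minutes, excluding temperature ramping."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60


# 3 min initial step, 30 cycles of 30 s + 45 s + 30 s, 5 min final extension.
print(pcr_minutes(3 * 60, 30, (30, 45, 30), 5 * 60))  # 60.5
```

The 30 cycles account for 52.5 min of the 60.5 min total, so the number of cycles is the dominant factor in the duration of the amplification.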
5.4. Agarose gel electrophoresis
The products of the PCR reactions were separated as follows: 4 μl of each reaction was resolved by agarose gel electrophoresis at 90 V for 1.5 h on 1% or 3% agarose in Tris-acetate-EDTA buffer (40 mM Tris, 20 mM acetic acid, and 1 mM EDTA) and 3 μl GelRed® 10000X (Biotium catalog 41002). The agarose gels were visualized on a transilluminator and then documented. The same electrophoretic procedure was applied to perform the arithmetic calculations with the obtained DNA fragments.
6. Experimental results
The dsDNA fragments of specific size were generated for the experimental development of the proposed model. Figure 7 shows the size and quality of the generated fragments , that represent the elements of the field , verified by electrophoresis of the PCR products.
Figure 7.
Practical implementation for Table 8 by DNA gel electrophoresis. Mk = Molecular weight marker.
To test the validity of the proposed model, we performed the calculation described in Section 4 using the fragments generated previously.
Figure 8 shows the actual laboratory implementation by gel electrophoresis of the multiplication described in Figure 6 of Section 4, which considers the iterations , where , for internal cycles IF-A and IF-B of Algorithm 1.
Once the results of and for (cycle IF-A and IF-B) are interpreted and obtained, then and are replaced for calculation of , and so on. The calculation of the multiplication of and in ends when all iterations have been completed.
7. FPGA simulation of the proposed model
In this section, we present the simulation of our proposed model using Field Programmable Gate Array (FPGA) technology. The simulation consists of the design and testing of arithmetic circuits optimized for , achieving shorter times than sequential computers would take. For this we use the Xilinx ZYNQ7000 FPGA, which is incorporated into a TE0729-02 SoM from Trenz Electronic GmbH.
For the case study of the field , the operations (base 3) were designed as a virtual layer on the components of the architecture of the ZYNQ7000 FPGA (base 2). Thus, virtual minimum logical units of 3 states are considered to establish a correspondence between the virtual layer and the physical layer. Figure 9 shows the logical mapping using an FPGA for the addition and multiplication of , according to Tables 2(a) and (b) with . These operations are used to develop addition and multiplication of elements , as explained in Sections 3 and 4.
Figure 9.
The logical mapping for the addition and multiplication of using an FPGA.
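To illustrate what a base-3 layer on top of binary hardware can look like, the Python sketch below encodes each element of GF(3) in two bits and realizes addition and multiplication as 16-entry look-up tables indexed by the two codes side by side. This is an assumed encoding written for illustration only; it is not the circuit of Figure 9, and the names `build_lut` and `lut_apply` are ours.

```python
UNUSED = None  # marks table entries whose inputs use the unused 2-bit code 0b11


def build_lut(op, p=3):
    """Build a 16-entry table indexed by (a << 2) | b for 2-bit codes a and b."""
    lut = [UNUSED] * 16
    for a in range(p):
        for b in range(p):
            lut[(a << 2) | b] = op(a, b) % p
    return lut


ADD_LUT = build_lut(lambda a, b: a + b)
MUL_LUT = build_lut(lambda a, b: a * b)


def lut_apply(lut, a, b):
    """Look up the result for two GF(3) elements given as 2-bit codes."""
    return lut[(a << 2) | b]


print(lut_apply(ADD_LUT, 2, 2))  # 1
print(lut_apply(MUL_LUT, 2, 2))  # 1
```

Only 9 of the 16 entries are used, because the code 0b11 does not correspond to any element; this unused state is part of the cost of mapping three-valued coefficients onto binary logic.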
Figure 10 shows the simulation using an FPGA for , which was described in Example 1. The multiplication is done in 5 iterations. These iterations appear in red in the row corresponding to the result (R). It should be noted that in each operation performed with the coefficients of and , the logical mapping described in Figure 9 is used. Although is performed in one clock cycle, there is an additional computational cost of converting non-binary coefficients in to their respective binary representation in order to operate with the FPGA.
Figure 10.
Simulation using an FPGA for .
8. Discussion
This paper is the first to introduce a DNA model in the area of molecular computing that performs arithmetic over Galois fields based on the differential migration of dsDNA fragments of different sizes. The proposed model presents several advantages over other widely studied, previously published models, such as the Tile Assembly Model (TAM) (Winfree et al., 1998) and the sticker model (Roweis et al., 1998). All the arithmetic calculations covered by the TAM have been performed only in theory (e.g., Brun, 2007; Li et al., 2013a, 2013b; Li and Xiao, 2014; Li and Xiao, 2016; Li et al., 2016; Li, 2018; Li and Zhang, 2018). Li et al. (2013b) designed a tile assembly system that, in theory, could compute a square over , on the condition that all DNA operations are perfect. In practice, however, DNA operations are known to be error-prone. In the same way, Jonoska et al. (2011) assumed, in their flexible-TAM study, that the assembly process happens under ideal conditions. In more recent studies (Li et al., 2016; Li, 2018), the authors only cite Rothemund (2012) to justify the technical feasibility of their methodology for DNA computation of modular multiplication and modular squaring over . However, Rothemund (2012) only uses DNA self-assembly for the fabrication of nanostructures (DNA origami) at laboratory scale. There is still no empirical evidence of the application of this molecular technique to the calculation of the mentioned problems. This may be due to the recognized complexity associated with implementing DNA self-assembly in the laboratory (Rothemund, 2006; Jonoska et al., 2011). Woods et al. (2019) present a reprogrammable DNA self-assembly system based on tiles, which can copy, sort, recognize palindromes and find multiples of 3, among other functions detailed in that article. However, this reprogrammable DNA self-assembly system is limited to the binary case, since it uses iterated Boolean circuits. This hinders its application to calculations over GF(p^m) with p > 2.
With respect to the molecular process associated to the TAM, Rothemund (Rothemund, 2006) and Jonoska (Jonoska et al., 2011) described and showed some typical experimental deviations that can occur during the practical work in the laboratory. First, the known difficulty in determining the stoichiometry for complex test tubes with different types of molecules, could result in annealing or thermodynamical problems and, in consequence, in hybridization mismatches and low performance of the reaction. Second, the low proportion of well-formed structures resulting from self-assembling (only 53%), evidenced by Rothemund (Rothemund, 2006), could predict an important percentage of error in the tile assembly process and, consequently, in the molecular calculations. Third, the presence of large dislocations at unbridged seams, where two halves of one assembled structure get completely separated, are also common. It is highly probable that all these technical issues hinder an accurate and successful computation by the TAM. In contrast, our model is based on conventional PCR reactions and agarose gel electrophoresis, both highly stable, reproducible and relatively low-cost molecular techniques. Another consideration for the technical application of the TAM on Galois fields calculations is that all technical work performed for one calculation (for example a multiplication of two elements in ) cannot be recycled for a different calculation, which would have to be completely performed from the start (Rothemund et al., 2004; Rothemund, 2012). Furthermore, it has been reported that time required for a calculation by TAM increases proportionally with the increase of in a , just because the complexity of the DNA assembly increase with (Brun, 2008). 
If the time required to design the sequences necessary for structure formation is one week, in addition to one week for sequence synthesis and 2 h for mixing and annealing reactions (Rothemund, 2006), then the TAM is an expensive technique for algorithmic calculation in terms of both time and money. Conversely, most of the technical work of our model can be reused for different arithmetic calculations, which makes it a very attractive model in terms of cost and time. For a new arithmetic calculation, only the electrophoresis must be repeated, while the DNA fragments can be reused or, at most, re-amplified by PCR (90 min) with the previously designed primers. The application of the TAM to arithmetical problems has been studied for more than ten years. However, there are still no records of its successful implementation in the laboratory to do arithmetic over with . This supports our hypothesis that the TAM works well at a theoretical level, but that there is high uncertainty about the feasibility of its practical implementation and application in the short or medium term.
On the other hand, the only methodological complexity of our DNA-based model is that the number of different dsDNA fragments for the input depends on parameter , and the number of slots in the gel matrix depends on parameter for any . For example, for the addition and multiplication over we only need dsDNA fragments of three different sizes and a gel matrix with capacity for five slots. For a much larger field, such as , we only require dsDNA fragments of two different sizes and a gel matrix with capacity for 163 slots to perform addition and multiplication over that field. This technical component of our model also makes it simpler than the sticker model. The sticker model uses the hybridization of complementary DNA fragments to represent bit strings and do binary arithmetic, in theory (Zimmermann, 2002). However, the implementation of such calculations requires a careful adjustment of the hybridization conditions to ensure reproducibility and the correct assembly of all fragments. Consequently, any modification in the sequence of stickers, or an increase in the number of stickers required, will need a resetting of the hybridization conditions. In (Li et al., 2013a), the authors present a stickers-based algorithm for parallel reduction in a field , and they use the field as a theoretical example. To represent all the elements of it is then necessary to design and handle 163 different stickers to represent a bit 1 in the different positions of the memory strand. Moreover, it would not be possible to effectively manage the melting and annealing temperatures to bind and unbind selectively up to 163 different stickers. This makes the biological operations of merge, separate, set and clear difficult to implement in the laboratory for addition and multiplication over .
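The relation between fragment sizes, gel slots and field elements can be illustrated with a minimal software sketch. It assumes, for illustration only, a field with characteristic 3 and extension degree 5 (matching the three fragment sizes and five slots of the first example above), and it assumes that each slot carries one coefficient of the element. The fragment lengths in `FRAGMENT_BP` and the function names are hypothetical and are not taken from the experimental protocol.

```python
from typing import Dict, List, Tuple

P = 3  # assumed characteristic: one fragment size per coefficient value
M = 5  # assumed extension degree: one gel slot per coefficient

# Hypothetical fragment lengths (base pairs) for coefficient values 0, 1, 2.
FRAGMENT_BP: Dict[int, int] = {0: 100, 1: 200, 2: 300}

Element = Tuple[int, ...]


def encode(element: Element) -> List[int]:
    """Return the fragment size loaded in each slot for a field element."""
    if len(element) != M or any(not 0 <= c < P for c in element):
        raise ValueError("element must have M coefficients in range(P)")
    return [FRAGMENT_BP[c] for c in element]


def decode(lanes: List[int]) -> Element:
    """Read the coefficients back from the fragment size seen in each slot."""
    inverse = {bp: c for c, bp in FRAGMENT_BP.items()}
    return tuple(inverse[bp] for bp in lanes)


def add(a: Element, b: Element) -> Element:
    """Add two elements slot by slot, modulo the characteristic."""
    return tuple((x + y) % P for x, y in zip(a, b))


if __name__ == "__main__":
    a = (1, 2, 0, 2, 1)
    b = (2, 2, 1, 0, 1)
    print(encode(a))   # fragment sizes per slot for a
    print(add(a, b))   # (0, 1, 1, 2, 2)
```

In this sketch only `P` distinct fragment sizes and `M` slots are ever needed, whatever the operands, which is the property the paragraph above contrasts with the 163 stickers of the sticker model.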
Parallel computing is being widely studied and has been implemented using different approaches in recent years, and it is expected that the threshold of ExaFLOP (10^18 floating-point operations per second) will be reached by the year 2020. Thus, parallel computers could replace the current computers, which mostly have a sequential architecture (Li, 2018; Wright, 2019). However, silicon and molecular computers continue to use binary logic, and this forces the translation of non-binary operations into binary atomic operations using Boolean algebra (Zhang et al., 2019; Eshra et al., 2019). In particular, to calculate addition and multiplication in a non-binary field , , there is an additional computational cost associated with using atomic operations in Boolean algebra for the implementation of these operations. In this context, our model can also operate in parallel, and it has a flexible implementation in the laboratory that avoids the translation to Boolean operations when implementing addition and multiplication over a non-binary field, which gives it a big advantage over current silicon and molecular systems. Even in the simulation with FPGA ZYNQ7000 for the case study of , in which we programmed the logical mapping using the circuits to simulate the addition and the multiplication over any Galois field, it was necessary to translate the non-binary operations performed by our model into binary ones.
Therefore, our model has a flexible implementation in the laboratory, which allows arithmetic operations over binary and non-binary fields without conversion cost. It is also easy to implement, economical, less prone to error and more tolerant to changes without altering the result.
On the other hand, we analyze the efficiency of our model to calculate the addition and the multiplication in a field with . For this, we assume that each electrophoresis is executed in a constant time for the addition and the multiplication. Thus, we obtain that the addition has an execution time of the order , while the multiplication has an execution time of the order .
We use multiplication as the worst case to compare the efficiency of our model with the efficiency of TAM-based models, since multiplication is more expensive than addition in computational terms (memory, processor and time).
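To make the difference between the two operations concrete, the following minimal sketch implements conventional software arithmetic in an extension field, with elements stored as coefficient lists (lowest degree first). It is a standard schoolbook method, not the DNA procedure of the proposed model, and it makes no claim about the execution-time orders stated above. The function names and the example field, built from the polynomial x^2 + 1 over the integers modulo 3, are assumptions chosen for illustration.

```python
from typing import List


def gf_add(a: List[int], b: List[int], p: int) -> List[int]:
    """Coefficient-wise addition: one modular sum per coefficient."""
    return [(x + y) % p for x, y in zip(a, b)]


def gf_mul(a: List[int], b: List[int], modulus: List[int], p: int) -> List[int]:
    """Schoolbook product followed by reduction modulo a monic polynomial.

    `modulus` lists the coefficients of the monic reduction polynomial,
    lowest degree first, so its length is m + 1 for a field of degree m.
    """
    m = len(modulus) - 1
    if len(a) != m or len(b) != m:
        raise ValueError("operands must have exactly m coefficients")
    # Step 1: polynomial product (m * m coefficient multiplications).
    prod = [0] * (2 * m - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    # Step 2: cancel every term of degree >= m using the modulus.
    for k in range(len(prod) - 1, m - 1, -1):
        coef = prod[k]
        if coef:
            for t in range(m + 1):
                prod[k - m + t] = (prod[k - m + t] - coef * modulus[t]) % p
    return prod[:m]


if __name__ == "__main__":
    MODULUS = [1, 0, 1]  # x^2 + 1, irreducible over the integers modulo 3
    print(gf_add([1, 1], [1, 2], 3))           # [2, 0]
    print(gf_mul([1, 1], [1, 2], MODULUS, 3))  # [2, 0]
```

In this sketch, addition touches each coefficient once, while multiplication needs a nested loop over both operands plus a reduction pass, which is why multiplication is the natural worst case for a comparison.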
The authors of (Li, 2018; Li and Xiao, 2016; Li et al., 2016) state that the execution time for the multiplication over a field is using the TAM system, which is equal to the execution time of our model. However, the TAM model needs 7746 different tiles to do the calculations and, as explained above, this makes its implementation in the laboratory highly complex. Furthermore, we reiterate that the TAM model can only be used in the binary case, that is, for .
Finally, with a plausible engineering intervention, our model can be fully automated. This is discussed in more detail below.
Figure 11 shows a system diagram of a possible DNA-based computer for our model that performs addition and multiplication over for and in an autonomous way.
Figure 11.
System diagram for a DNA-based molecular computer for the proposed model.
The system comprises Robots, Interpreters, Electrophoresis boxes and a Display. A Robot loads the previously generated dsDNA fragments into the respective slots of the gel matrix for the electrophoresis. Thus, our model avoids any human error in the pipetting and loading process. An Interpreter contains the look-up table of all the elements of the field , and the key configurations for addition and multiplication. This device interprets the images obtained by electrophoresis and instructs the next Robot to load the dsDNA fragments of the next iteration or Electrophoresis box. The Electrophoresis boxes represent the DNA electrophoresis processes for each addition or multiplication calculation, as explained and shown in Sections 4, 5 and 6. At the end of the system there is a Display, a device that receives information from the last Interpreter and shows the final result of the completed arithmetic calculation.
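The chain of Robot, Electrophoresis box, Interpreter and Display can be mimicked in software to show how the stages hand results to one another. The sketch below is a toy simulation under stated assumptions: a characteristic of 3, hypothetical fragment lengths in `FRAGMENT_BP`, a gel "image" reduced to the pair of band sizes seen in each slot, and a key configuration (`ADD_KEY`) that covers addition only. None of these names or values come from the laboratory protocol.

```python
from functools import reduce
from typing import Dict, List, Tuple

P = 3  # assumed characteristic
# Hypothetical fragment lengths (base pairs) for coefficient values 0, 1, 2.
FRAGMENT_BP: Dict[int, int] = {0: 100, 1: 200, 2: 300}

Element = Tuple[int, ...]
Image = List[Tuple[int, int]]

# Interpreter look-up table: pair of band sizes in a slot -> sum coefficient.
ADD_KEY: Dict[Tuple[int, int], int] = {
    (FRAGMENT_BP[a], FRAGMENT_BP[b]): (a + b) % P
    for a in range(P)
    for b in range(P)
}


def robot_load(element: Element) -> List[int]:
    """Robot: choose the fragment to load in each slot."""
    return [FRAGMENT_BP[c] for c in element]


def electrophoresis(lanes_a: List[int], lanes_b: List[int]) -> Image:
    """Electrophoresis box: report the two band sizes seen in each slot."""
    return list(zip(lanes_a, lanes_b))


def interpreter(image: Image) -> Element:
    """Interpreter: translate the band pattern with the look-up table."""
    return tuple(ADD_KEY[bands] for bands in image)


def display(element: Element) -> str:
    """Display: format the final result."""
    return "result: " + " ".join(str(c) for c in element)


def run_sum(operands: List[Element]) -> Element:
    """Chain one Robot/Electrophoresis/Interpreter stage per addition."""
    def stage(acc: Element, nxt: Element) -> Element:
        return interpreter(electrophoresis(robot_load(acc), robot_load(nxt)))
    return reduce(stage, operands)


if __name__ == "__main__":
    total = run_sum([(1, 2), (2, 2), (1, 0)])
    print(display(total))  # result: 1 1
```

Each stage consumes only the output of the previous Interpreter, which mirrors the way the diagram passes instructions from one Interpreter to the next Robot.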
Many of the possible improvements mentioned for the proposed model are already in use. The simplicity of the technique makes it feasible to adapt it to devices currently available on the market, such as automated and miniaturized systems. Microfabricated capillary array electrophoresis is a microfluidic device system that allows the separation of molecules, in this case DNA fragments of different sizes, and can be used for DNA sequencing (Paegel et al., 2002). DNA sequencing technologies are relevant for our model because they bring together most of the advances towards increasing analysis throughput and fragment resolution, and decreasing the consumption of reactants and samples. For comparison, a slab gel requires 0.5–1.0 μL of DNA sample and 6–8 h of analysis time, whereas DNA separation in a microfluidic device could require 0.0001–0.0005 μL of DNA sample and 0.1–0.5 h of analysis time (Sinville and Soper, 2007). The handling of samples and the distribution of liquid solutions is a field where multiple solutions have been developed through robotics, e.g. the QIAsymphony SP/AS instruments from Qiagen. These devices can handle the mixing and loading of samples into the analytic instrument, so the entire wet procedure is free of human intervention. This offers the advantages of uniform sample loading, time savings and avoidance of the error-prone process of handling and loading large numbers of samples by hand. Another area where improvements can be incorporated is the reading and interpretation of the results through image analysis by simple software, for example (Intarapanich et al., 2015; Abeykoon et al., 2015). This software can be of great help, as it can quickly and accurately read the image results and translate them into a user-friendly format.
9. Conclusions and future work
This work is the first to introduce a DNA model to implement arithmetic operations over a field , with and , based on the differential migration of dsDNA fragments of different sizes. It has three major advantages over the TAM and sticker models. First, because of its flexible implementation in the laboratory, it allows arithmetic operations over binary and non-binary Galois fields without the translation to Boolean operations, while finite field arithmetic using the TAM model or the sticker model is limited to . The second asset of our model is that it is less prone to error than other systems. It is based on conventional PCR amplification and electrophoresis, which are highly stable, reproducible and low-cost molecular techniques. Hence, the problems associated with other models, which arise when using more complex molecular techniques such as hybridization and denaturation, are avoided. The third advantage is that it is simple to implement and, when fully developed, it will use 50 ng/μL per DNA fragment to carry out the calculations. There is no need to design complex DNA structures, since the only feature of interest is the size of the dsDNA fragments used. Also, to do arithmetic over only fragments of different sizes and a gel matrix with capacity for slots are necessary. This contrasts with the TAM and sticker models, where the design of the DNA strands is of major importance and the concentration of reactants increases greatly with the size of the problem.
Furthermore, the flexible implementation of our model in the laboratory allows us to perform arithmetic calculations in parallel over , without the cost of translating non-binary operations into binary atomic operations using Boolean algebra. It is therefore easy to implement, economical, less prone to error and more tolerant to changes without altering the result.
In terms of efficiency, our model has execution times of order and , for the addition and multiplication over a field with , respectively. For this, we assume that each electrophoresis is executed in constant time for the addition and the multiplication.
This paper provides one of the few pieces of experimental evidence of arithmetic calculations for molecular computing and validates the technical applicability of the proposed model to perform arithmetic operations over with .
Finally, our future work will focus on speeding up the interpretation of the DNA patterns produced in each electrophoresis, and thereby achieving a cheaper and faster implementation in the laboratory of the addition and multiplication on a field with .
Declarations
Author contribution statement
Ivan Jiron, Susana Soto, Sabrina Marin, Mauricio Acosta, Ismael Soto: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Funding statement
This work was supported by VRIDT/UCN N° 171/17, VRIDT/UCN N° 084/2018 and Project FONDEF IT17M10012.
Competing interest statement
The authors declare no conflict of interest.
Additional information
Supplementary content related to this article has been published online at https://doi.org/10.1016/j.heliyon.2019.e02901.
References
- Abeykoon A.M.K.W.K. An automated system for analyzing agarose and Polyacrylamide Gel images. Ceylon J. Sci. (Bio. Sci.) 2015;44(1):45–54.
- Adleman L.M. Molecular computation of solutions to combinatorial problems. Science. 1994;266:1021–1024. doi: 10.1126/science.7973651.
- Adleman L.M. On constructing a molecular computer. DNA Based Comp. 1996;27 DIMACS - Series in Discrete Mathematics and Theoretical Computer Science.
- Barua R., Das S. IEEE the Congress on Evolutionary Computation. 2003. Finite field arithmetic using self-assembly of DNA tilings. CEC '03.
- Braich R.S. Solution of a satisfiability problem on a gel-based DNA computer. In: Condon A., Rozenberg G., editors. Vol. 2054. Springer-Verlag; New York: 2000. pp. 27–42. (Lecture Notes in Computer Science).
- Braich R.S. Solution of a 20-variable 3-SAT problem on a DNA computer. Sciencexpress. 2002 doi: 10.1126/science.1069528.
- Brun Y. Arithmetic computation in the tile assembly model: addition and multiplication. Theor. Comput. Sci. 2007;378:17–31.
- Brun Y., Medvidovic N. Technical Report USC-CSSE-2007-714. Center for Software Engineering, University of Southern California; 2007. Discreetly distributing computation via self-assembly.
- Brun Y. Non-deterministic polynomial time factoring in the tile assembly model. Theor. Comput. Sci. 2008;395:3–23.
- Carrasco R.A., Johnston M. John Wiley & Sons, Ltd.; 2008. Non-Binary Error Control Coding for Wireless Communication and Data Storage.
- Chang W.-L. Fast parallel molecular algorithms for DNA-based computation: factoring integers. IEEE Trans. NanoBioscience. 2005;4(No. 2) doi: 10.1109/tnb.2005.850474.
- Cohen H. Taylor & Francis Group, LLC; 2006. Handbook of Elliptic and Hyperelliptic Curve Cryptography.
- Currin A. Computing exponentially faster: implementing a non-deterministic universal Turing machine using DNA. J. R. Soc. Interface. 2017;14 doi: 10.1098/rsif.2016.0990.
- Eshra A. Renewable DNA hairpin-based logic circuits. IEEE Trans. Nanotechnol. 2019;18:252–259.
- Faulhammer D. Molecular computation: RNA solutions to chess problems. Proc. Natl. Acad. Sci. 1999;97:1385–1389. doi: 10.1073/pnas.97.4.1385.
- Gibbons A. Models of DNA computation. LNCS 1113, 21st Int. Sym. Math. Foundations Comp. Sci. 1996;18–36.
- Goldman N. Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature. 2013;494:77–80. doi: 10.1038/nature11875.
- Guajardo J. Ruhr-Universität; Bochum: 2004. Arithmetic architectures for finite fields GF(p^m) with cryptographic applications.
- Guarnieri F. Making DNA add. Science. 1996;273:220–223. doi: 10.1126/science.273.5272.220.
- Guo P., Zhang H. Fifth International Conference on Natural Computation. 2009. DNA implementation of arithmetic operations.
- Hungerford T.W. 3 ed. CENGAGE Learning Custom Publishing; 2012. Abstract Algebra: an Introduction.
- Ignatova Z. Springer; 2008. DNA Computing Models.
- Intarapanich A. Automatic DNA diagnosis for 1D gel electrophoresis images using bio-image processing technique. BMC Genom. 2015;16(Suppl 12):S15. doi: 10.1186/1471-2164-16-S12-S15. http://www.biomedcentral.com/1471-2164/16/S12/S15
- Jonoska N. On stoichiometry for the assembly of flexible tile DNA complexes. Nat. Comput. 2011;10:1121–1141.
- Kari L. DNA computing: the arrival of biological mathematics. Math. Intell. 1997;19(No. 2):9–22.
- Kari L. DNA computing: foundations and implications. In: Back T., Kok J.N., editors. Handbook of Natural Computing. Springer; 2012. pp. 1073–1127.
- Koblitz N. Vol. 3. Springer; Berlin: 1998. (Algebraic Aspect of Cryptography. Algorithms and Computation in Mathematics).
- LaBean T.H. 5th International Meeting on DNA Based Computers (DNA5) MIT; Cambridge, M.A.: 1999. Experimental progress in computation by self-assembly of DNA tilings.
- Li Y. A DNA sticker algorithm for parallel reduction over finite field GF(2^n). Int. J. Grid Distributed Comput. 2013;6(No. 5):17–28.
- Li Y. Square over finite field GF(2^n) using self-assembly of DNA tiles. Int. J. Hosp. Inf. Technol. 2013;6(No. 4).
- Li Y., Xiao L. Arithmetic computation using self-assembly of DNA tiles: integer power over finite field GF(2^n). Int. Conf. Bioinfo. Biomed. 2014:471–475.
- Li Y. 22nd International Conference on Parallel and Distributed Systems. 2016. A molecular computation model to compute inversion over finite field GF(2^n). pp. 1151–1156.
- Li Y., Xiao L. IEEE 14th International Conference on Smart City. 2016. Molecular computation based on tile assembly model: modular-multiplication and modular-square over finite field GF(2^n). pp. 1007–1014.
- Li Y. IEEE Intl. Conf. On Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications. 2018. A revised DNA computing model of inversion and division over finite field GF(2^n). pp. 477–484.
- Li Y., Zhang Z. IEEE 5th International Conference on Information Science and Control Engineering 2018. 2018. Parallel computing: review and perspective; pp. 365–369.
- Lipton R.J. DNA solution of hard computational problems. Science. 1995;268:542–545. doi: 10.1126/science.7725098.
- Liu Q. DNA computing on surfaces. Nature. 2000;403:175. doi: 10.1038/35003155.
- Menezes A., van Oorschot P., Vanstone S. CRC Press; 1996. Handbook of Applied Cryptography.
- Paegel B.M., Emrich C.A. High throughput DNA sequencing with a microfabricated 96-lane capillary array electrophoresis bioprocessor. Proc. Natl. Acad. Sci. U.S.A. 2002;99(2):574–579. doi: 10.1073/pnas.012608699.
- Regalado A. DNA computing. MIT Technol. Rev. 2000;103(3):80.
- Reif J.H. 1995. Parallel Molecular Computation. SPAA'95 Santa Barbara CA USA@ 1995 ACM 0-89791-717-0/95/07.
- Rothemund P.W.K., Winfree E. STOC ’00 Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing. 2000. The program size complexity of self assembled squares; pp. 459–468.
- Rothemund P., Papadakis N., Winfree E. Algorithmic self-assembly of DNA Sierpinski triangles. PLoS Biol. 2004;2(12):2041–2053. doi: 10.1371/journal.pbio.0020424. e424.
- Rothemund P.W.K. Folding DNA to create nanoscale shapes and patterns. Nature. 2006;440:297–302. doi: 10.1038/nature04586.
- Rothemund P.W.K. IEEE 7th International Conference on Nano/Micro Engineered and Molecular Systems (NEMS) 2012. Beyond Watson and Crick: programming DNA self-assembly for nanofabrication.
- Roweis S. A sticker-based model for DNA computation. J. Comput. Biol., Winter. 1998;5(4):615–629. doi: 10.1089/cmb.1998.5.615.
- Rozenberg G., Spaink H. DNA computing by blocking. Theor. Comput. Sci. 2003;292:653–665.
- Seeman N.C. Nucleic acid junctions and lattices. J. Theor. Biol. 1982;99:237–247. doi: 10.1016/0022-5193(82)90002-9.
- Sinville R., Soper S.A. High resolution DNA separations using microchip electrophoresis. J. Sep. Sci. 2007;30(11):1714–1728. doi: 10.1002/jssc.200700150.
- Sklar B. second ed. Prentice-Hall, Inc.; 2001. Digital Communications. Fundamentals and Applications.
- Wang H. Proving theorems by pattern recognition, II. Bell Syst. Tech. J. 1961;40:1–41.
- Winfree E. Ph.D. Thesis. Caltech; Pasadena, CA: June 1998. Algorithmic self-assembly of DNA.
- Winfree E. On the computational power of DNA annealing and ligation. DNA Based Comp. 1996:199–221.
- Winfree E., Liu F.R., Wenzler L.A., Seeman N.C. Design and self-assembly of two-dimensional DNA crystals. Nature. 1998;394:539–544. doi: 10.1038/28998.
- Wright A.S. Performance modeling, benchmarking and simulation of high performance computing systems. Future Gener. Comput. Syst. 2019;92:900–902.
- Woods D. Diverse and robust molecular algorithms using reprogrammable DNA self-assembly. Nature. 2019;567:366–372. doi: 10.1038/s41586-019-1014-9.
- Ye J. Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction. BMC Bioinf. 2012;13:134. doi: 10.1186/1471-2105-13-134.
- Zhang C. DNA computing for combinational logic. Sci. China Inf. Sci. 2019;62:61301.
- Zimmermann K.H. Efficient DNA sticker algorithms for NP-complete graph problems. Comput. Phys. Commun. 2002;114:297–309.