Entropy. 2022 Feb 5;24(2):242. doi: 10.3390/e24020242

An Efficient Parallel Reverse Conversion of Residue Code to Mixed-Radix Representation Based on the Chinese Remainder Theorem

Mikhail Selianinau 1, Yuriy Povstenko 1,*
Editors: Sergio Cruces, Rubén Martín-Clemente, Andrzej Cichocki, Iván Durán-Díaz
PMCID: PMC8871111  PMID: 35205536

Abstract

In this paper, we deal with critical problems in residue arithmetic. The reverse conversion from a Residue Number System (RNS) to positional notation is a main non-modular operation, and it constitutes a basis of other non-modular procedures used to implement various computational algorithms. We present a novel approach to the parallel reverse conversion from the residue code into a weighted number representation in the Mixed-Radix System (MRS). In our proposed method, the calculation of mixed-radix digits reduces to a parallel summation of the small word-length residues in the independent modular channels corresponding to the primary RNS moduli. The computational complexity of the developed method concerning both required modular addition operations and one-input lookup tables is estimated as $O(k^2/2)$, where $k$ equals the number of used moduli. The time complexity is $O(\log_2 k)$ modular clock cycles. In pipeline mode, the throughput rate of the proposed algorithm is one reverse conversion in one modular clock cycle.

Keywords: Residue Number System, modular arithmetic, residue-to-binary conversion, Chinese Remainder Theorem, mixed-radix representation

1. Introduction

Along with the improvement of computer technology, the development and implementation of new effective approaches to the organization and realization of computational tasks are some of the main ways to increase the data processing speed. At present, high-performance computing is developing extremely rapidly. These reasons lead to qualitatively new requirements imposed on number-theoretic methods and computational algorithms. Practically, all well-known approaches to high-performance computing use certain parallel forms of data representation and processing. In recent decades, special consideration has been given to the so-called modular computational structures. Their arithmetic foundation is the Residue Number System (RNS), whose ideological roots go back to the classic topics of number theory and abstract algebra. The RNS is a non-positional number system with inherent parallelism and occupies a place of particular importance due to its carry-free properties, which provide a high potential for accelerating arithmetic operations.

As is well known, the RNS has some advantages over a conventional Weighted Number System (WNS) in the design and implementation of high-performance computing applications, devices, and systems. From its appearance in the mid-1950s to the present, RNS arithmetic has attracted the constant attention of researchers in computer technology [1,2], number-theoretic methods [3,4,5], digital signal and image processing [2,5,6,7,8], communications systems [5,9], cryptography [2,8,10,11], and other fields [10].

The main advantage of RNS is its unique ability to decompose the large word-length numbers into a set of smaller word-length residues, which are processed in parallel in the independent modular channels. The inherent parallelism of RNS enables avoiding the carry-overs obtained in addition, subtraction, and multiplication, which are usually time-consuming in the WNS. In this regard, the modularity and carry-free properties make computation fast and efficient. Therefore, the RNS presents one of the most efficient means for increasing data processing speed.

Due to its carry-free property, the residue arithmetic is exceptionally suitable for a broad class of applications in which addition and multiplication are the dominant arithmetic operations. In any case, it has excellent potential for many substantial applications in such areas as digital signal processing, cryptography, distributed information and communication systems, information security systems, fault tolerance, cloud computing, and others. Moreover, these RNS applications may be effectively embedded in processor platforms functioning according to the conventional information-processing approach [2,5,8]. For the reasons mentioned above, residue arithmetic represents an efficient mathematical tool for the high-speed implementation of various computational tasks.

Reverse conversion and base extension are the most critical topics in residue arithmetic. In contrast to a conventional WNS, these operations, like other central non-modular procedures such as magnitude comparison, sign determination, overflow detection, general division, and scaling, are relatively hard to implement. They are time consuming and costly because their structure is more complicated than that of modular operations.

As is known, to perform non-modular operations, it is necessary to carry out the binary reconstruction of the integer by its residue code, which in general is hampered by the non-weighted nature of the RNS. This circumstance negates to a substantial extent the main advantages of residue arithmetic.

Therefore, the development of novel approaches and methods for fast reconstruction of a number from its residue code is of significant importance in high-performance computing based on parallel algorithmic structures of RNS, especially for the high-speed implementation of digital signal processing applications and public-key cryptosystems. That should enable the extensive use of residue arithmetic in many priority areas of science and technology.

In this paper, we present a novel approach to the parallel reverse conversion from the residue code into the mixed-radix representation. In the proposed method, the calculation of mixed-radix digits reduces to a parallel summation of the small word-length residues in the independent modular channels corresponding to the primary RNS moduli.

The paper is structured as follows. Section 2 and Section 3 discuss the basic theoretical concepts of the research. Section 4 describes the mathematical background of the proposed reverse conversion method. Section 5 and Section 6 present a numerical example and an analysis of the computational cost, respectively. Section 7 provides discussion, and Section 8 concludes the paper.

2. The Basic Concepts of the Residue Arithmetic

Abstract algebra and number theory form the theoretical basis of residue arithmetic [12,13].

An RNS is defined by an ordered set $\{m_1, m_2, \ldots, m_k\}$ of $k$ pairwise relatively prime moduli, where each modulus $m_i \geq 2$ $(i = 1, 2, \ldots, k)$, and the greatest common divisor of $m_i$ and $m_j$ equals 1, i.e., $\gcd(m_i, m_j) = 1$ for $i \neq j$. For convenience, we assume that the default order of moduli is ascending, i.e., $m_1 < m_2 < \cdots < m_k$.

In the given RNS, it is possible to represent $M_k$ integer numbers, where $M_k$ is the product of all moduli, $M_k = \prod_{i=1}^{k} m_i$. Therefore, the set $\mathbb{Z}_{M_k} = \{0, 1, \ldots, M_k - 1\}$ is usually used as an RNS dynamic range.

Every number $X \in \mathbb{Z}_{M_k}$ has a unique representation in the form of a $k$-tuple of small integers $(\chi_1, \chi_2, \ldots, \chi_k)$, which is called a residue code, where $\chi_i$ is the least non-negative remainder of the division of $X$ by $m_i$ $(i = 1, 2, \ldots, k)$. We write this relation as $\chi_i = |X|_{m_i}$, where $\chi_i \in \mathbb{Z}_{m_i} = \{0, 1, \ldots, m_i - 1\}$.

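The main advantage of residue arithmetic over conventional binary arithmetic is that addition, subtraction, and multiplication are carried out in parallel at the level of small word-length residues. The modular operations $\circ \in \{+, -, \times\}$ on integers $A = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ and $B = (\beta_1, \beta_2, \ldots, \beta_k)$ are performed independently in each modular channel in compliance with the computational rule:

$$A \circ B = (\alpha_1, \alpha_2, \ldots, \alpha_k) \circ (\beta_1, \beta_2, \ldots, \beta_k) = \left( |\alpha_1 \circ \beta_1|_{m_1}, |\alpha_2 \circ \beta_2|_{m_2}, \ldots, |\alpha_k \circ \beta_k|_{m_k} \right), \tag{1}$$

where $\alpha_i = |A|_{m_i}$ and $\beta_i = |B|_{m_i}$, $i = 1, 2, \ldots, k$.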

In other words, the arithmetic operations on long-word operands are decomposed into modular channels with operands that are no larger than the corresponding modulus. Moreover, all the modular channels are entirely independent of each other. The carry-free nature of modular operations (1) is one of the most attractive features of residue arithmetic [1,3,8].
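As a brief illustration of rule (1), the following Python sketch converts two integers to their residue codes and applies an operation channel by channel. The moduli (5, 7, 9, 11), the operands 13 and 20, and the helper names `to_rns` and `rns_op` are assumptions chosen for this illustration only.

```python
MODULI = (5, 7, 9, 11)  # assumed pairwise coprime moduli; dynamic range is 3465

def to_rns(x, moduli=MODULI):
    """Residue code (chi_1, ..., chi_k) of the integer x."""
    return tuple(x % m for m in moduli)

def rns_op(a, b, op, moduli=MODULI):
    """Apply op independently in every modular channel, as in rule (1)."""
    return tuple(op(ai, bi) % m for ai, bi, m in zip(a, b, moduli))

A, B = 13, 20
a, b = to_rns(A), to_rns(B)
rns_sum = rns_op(a, b, lambda u, v: u + v)   # residue code of A + B
rns_prod = rns_op(a, b, lambda u, v: u * v)  # residue code of A * B
```

No information is exchanged between channels, so each component could in principle be evaluated by a separate small processing element. The results are valid as long as the true sum or product stays inside the dynamic range.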

Therefore, compared with the conventional WNS, the RNS simplifies and speeds up the addition and multiplication operations. This fundamental advantage of the residue arithmetic strongly appears in the case of implementing computational procedures, which mainly contain long segments consisting of only sequences of modular arithmetic operations. In this case, the primary moduli set is chosen so that the final results of the computational procedure always belong to the used dynamic range for any allowed values of input operands. At the same time, the intermediate results can even exceed the boundaries of the dynamic range.

Along with the carry-free modular operations, there are also the so-called non-modular operations such as residue-to-binary conversion, base extension, magnitude comparison, sign determination, overflow detection, general division, scaling, etc. These operations are complicated and quite time consuming, and their significant computational complexity limits the applications of the residue arithmetic and restricts its widespread usage for high-speed computing.

To perform the non-modular operations, it is required to consider all residues in the $k$-tuple $(\chi_1, \chi_2, \ldots, \chi_k)$. Furthermore, it is necessary to determine the integer value of the number by its residue code, which in general is hampered by the non-positional nature of the RNS. The crucial problem of efficient implementation of non-modular operations is constantly receiving considerable attention from researchers [2,5,8].

The applicability of residue arithmetic is mainly determined by the computational complexity and feasibility of non-modular operations, which are used as a basis for implementing more complex computational algorithms in RNS. At the same time, the fundamental problem of residue arithmetic, which unfortunately remains unresolved, consists in reducing the computational complexity of non-modular operations. Due to a lack of efficient methods and algorithms for implementing non-modular operations, residue arithmetic is mainly suitable when modular additions and multiplications make up the bulk of the required computations. In this case, the number of non-modular operations used is relatively small. This circumstance restricts the use of the RNS to a narrow class of specific tasks.

3. Reverse Conversion of the Residue Code to Conventional Representation

The root problem of residue arithmetic is that the weighted value of the integer $X$ depends on all the residues $\chi_1, \chi_2, \ldots, \chi_k$. The reconstruction of an integer by its residue code, i.e., the reverse conversion, is one of the most difficult non-modular operations in residue arithmetic. Moreover, this operation underlies all the other non-modular procedures.

Despite the extensive current studies on residue arithmetic and its applications, there is a need to develop novel efficient approaches and methods for reconstructing an integer from its residue code. This should enable the extensive use of residue arithmetic for high-speed computing in many priority fields, first of all, in various digital signal processing and cryptographic applications.

There are two canonical techniques of reverse conversion: the canonical method based on the Chinese Remainder Theorem (CRT) and the residue code conversion to a weighted representation in the Mixed-Radix System (MRS) [1,2,5,8,14,15,16,17,18]. In general, all other conversion methods represent different variants of these two methods.

Below, we describe the mathematical background of these methods.

3.1. CRT-Base Conversion Method

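When the moduli $m_1, m_2, \ldots, m_k$ are pairwise relatively prime, the integer number $X$ and its residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ are related by the equation:

$$X = \left| \sum_{i=1}^{k} M_{i,k}\, \chi_{i,k} \right|_{M_k}, \tag{2}$$

where $M_{i,k} = M_k / m_i$, $\chi_{i,k} = \left| M_{i,k}^{-1} \chi_i \right|_{m_i}$ is a normalized residue modulo $m_i$ $(i = 1, 2, \ldots, k)$, and $\left| Y^{-1} \right|_m$ denotes the multiplicative inverse of an integer $Y$ modulo $m$.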

In essence, Equation (2) represents the CRT [10,19,20].

In recent decades, considerable efforts have been directed to reducing the complexity of the CRT implementation and to enabling its application in high-speed computing [2,5,8,21,22,23]. The main idea of these methods is to replace the inner multiplications and additions modulo $M_k$ with simpler operations (see (2)).

Consider the CRT-number

$$X_k = \sum_{i=1}^{k} M_{i,k}\, \chi_{i,k}. \tag{3}$$

As follows from (2), the difference $X_k - X$ is a multiple of $M_k$. Therefore, the following exact integer equality holds

$$X = X_k - \rho_k(X)\, M_k. \tag{4}$$

The unique integer number $\rho_k(X)$ is a normalized rank (or, briefly, rank) of the number $X$ [3,4,7].

Equation (4) is called a rank form of the integer $X$. In essence, the rank $\rho_k(X)$ is a reconstruction coefficient that indicates how many times the dynamic range $M_k$ is exceeded when converting the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ to the integer $X$.

In contrast to (2), Equation (4) does not contain a very time-consuming reduction modulo $M_k$. Therefore, when an efficient method for computing the rank $\rho_k(X)$ is available, the reverse conversion algorithm constructed on the basis of (4) has a substantial lead over the canonical CRT implementation (2).
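The relation between Equations (2), (3), and (4) can be checked numerically with a short Python sketch. The function names `crt_number` and `crt_reconstruct` are hypothetical, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later. Note that the rank is obtained here by a plain integer division, which is only a reference computation and not an efficient rank algorithm.

```python
from math import prod

def crt_number(residues, moduli):
    """Return (X_k, M_k): the unreduced CRT sum of Equation (3) and the range M_k."""
    M = prod(moduli)
    total = 0
    for chi, m in zip(residues, moduli):
        Mi = M // m                             # M_{i,k}
        chi_norm = (pow(Mi, -1, m) * chi) % m   # normalized residue chi_{i,k}
        total += Mi * chi_norm
    return total, M

def crt_reconstruct(residues, moduli):
    """Return (X, rank) such that X = X_k - rank * M_k, as in Equation (4)."""
    Xk, M = crt_number(residues, moduli)
    rank, X = divmod(Xk, M)
    return X, rank

X, rank = crt_reconstruct((3, 6, 4, 2), (5, 7, 9, 11))
```

For the residue code (3, 6, 4, 2) in the basis (5, 7, 9, 11), the CRT-number is 6943, the rank is 2, and the reconstructed value is 13.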

3.2. MRS-Base Conversion Method

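In the MRS defined by a set $\{m_1, m_2, \ldots, m_k\}$ of pairwise relatively prime moduli, the integer $X \in \mathbb{Z}_{M_k}$ is represented by the $k$-tuple $(x_k, x_{k-1}, \ldots, x_1)$ of mixed-radix digits, resulting in

$$X = x_1 + x_2 M_1 + x_3 M_2 + \cdots + x_k M_{k-1} = \sum_{i=1}^{k} x_i M_{i-1}, \tag{5}$$

where $x_i \in \mathbb{Z}_{m_i}$ $(i = 1, 2, \ldots, k)$ [1,2,8].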

It is well known that the MRS surpasses the RNS when performing non-modular operations such as magnitude comparison, sign determination, and overflow detection. Therefore, the mixed-radix representation is the most widely applied tool for the implementation of non-modular procedures, along with the other generally accepted integral characteristics of the residue code such as the rank of a number, core function, interval index, parity function, diagonal, and quotient functions [3,4,7,24,25,26,27,28,29,30,31,32,33].

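The RNS-to-MRS reverse conversion establishes an association between the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ of the number $X$ and its mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$. The mixed-radix digits $x_i$ $(i = 1, 2, \ldots, k)$ in (5) are computed according to the following calculation relations [1]:

$$\begin{aligned}
x_1 &= \chi_1, \\
x_2 &= \left| (\chi_2 - x_1) \left| m_1^{-1} \right|_{m_2} \right|_{m_2}, \\
x_3 &= \left| \left( (\chi_3 - x_1) \left| m_1^{-1} \right|_{m_3} - x_2 \right) \left| m_2^{-1} \right|_{m_3} \right|_{m_3}, \\
&\;\;\vdots \\
x_k &= \left| \left( \cdots \left( (\chi_k - x_1) \left| m_1^{-1} \right|_{m_k} - x_2 \right) \left| m_2^{-1} \right|_{m_k} - \cdots - x_{k-1} \right) \left| m_{k-1}^{-1} \right|_{m_k} \right|_{m_k}.
\end{aligned}$$

This sequential calculation procedure, called a chained algorithm, can be written in the general form

$$x_i = \left| X^{(i)} \right|_{m_i}, \tag{6}$$

where

$$X^{(i)} = \begin{cases} X, & \text{if } i = 1, \\ \left( X^{(i-1)} - x_{i-1} \right) m_{i-1}^{-1}, & \text{if } i = 2, 3, \ldots, k. \end{cases} \tag{7}$$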

From (6) and (7), it follows that the considered computational process requires two modular operations: subtraction and multiplication by the multiplicative inverse. Thus, the most crucial advantage of this algorithm is its high modularity. However, its strictly sequential nature prevents general use for the construction of appropriate high-performance parallel computing procedures.
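The chained algorithm (6), (7) can be sketched in Python as follows. This is a minimal illustration of the classical sequential procedure, not of the parallel method proposed in this paper. The function names are hypothetical, and `pow(m, -1, n)` assumes Python 3.8 or later.

```python
def mixed_radix_chained(residues, moduli):
    """Sequential RNS-to-MRS conversion; returns the digits [x_1, ..., x_k]."""
    work = list(residues)   # work[j] holds X^(i) reduced modulo m_j for the current i
    digits = []
    for i, m_i in enumerate(moduli):
        x_i = work[i]       # Equation (6): the digit is the residue in channel i
        digits.append(x_i)
        # Equation (7) in the remaining channels: subtract the digit, then
        # multiply by the inverse of m_i modulo m_j.
        for j in range(i + 1, len(moduli)):
            m_j = moduli[j]
            work[j] = ((work[j] - x_i) * pow(m_i, -1, m_j)) % m_j
    return digits

def from_mixed_radix(digits, moduli):
    """Evaluate Equation (5): X = sum of x_i * M_{i-1}, with M_0 = 1."""
    value, weight = 0, 1
    for x, m in zip(digits, moduli):
        value += x * weight
        weight *= m
    return value

digits = mixed_radix_chained((3, 6, 4, 2), (5, 7, 9, 11))
```

Each digit depends on all previously computed digits, which is why the outer loop cannot be parallelized. Only the inner loop over the remaining channels runs independently.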

4. A Novel CRT-Base RNS-to-MRS Reverse Conversion Method

Now, we describe the proposed new method for calculating the mixed-radix digits $x_1, x_2, \ldots, x_k$ of the number $X$ from its residue code $(\chi_1, \chi_2, \ldots, \chi_k)$.

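Consider the CRT-number $X_k$. According to (3), we have

$$X_k = \sum_{i=1}^{k-1} M_{i,k-1}\, m_k\, \chi_{i,k} + M_{k-1}\, \chi_{k,k}. \tag{8}$$

By Euclid's Division Lemma, the integer $m_k \chi_{i,k}$ can be written as

$$m_k \chi_{i,k} = \chi_{i,k-1} + \left\lfloor \frac{m_k \chi_{i,k}}{m_i} \right\rfloor m_i, \tag{9}$$

where

$$\chi_{i,k-1} = \left| m_k \chi_{i,k} \right|_{m_i} = \left| m_k \left| M_{i,k}^{-1} \chi_i \right|_{m_i} \right|_{m_i} = \left| m_k M_{i,k}^{-1} \chi_i \right|_{m_i} = \left| M_{i,k-1}^{-1} \chi_i \right|_{m_i},$$

and $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$.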

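Substituting (9) into (8), we obtain

$$X_k = X_{k-1} + M_{k-1} S_k(X), \tag{10}$$

where

$$X_{k-1} = \sum_{i=1}^{k-1} M_{i,k-1}\, \chi_{i,k-1}, \tag{11}$$

$$S_k(X) = \sum_{i=1}^{k} R_{i,k}(\chi_i), \tag{12}$$

$$R_{i,k}(\chi_i) = \left\lfloor \frac{m_k \chi_{i,k}}{m_i} \right\rfloor \quad (i = 1, 2, \ldots, k). \tag{13}$$

Taking into account (9), we have

$$R_{i,k}(\chi_i) = \frac{m_k \chi_{i,k} - \chi_{i,k-1}}{m_i}.$$

Since $R_{i,k}(\chi_i) \in \mathbb{Z}_{m_k}$, we can reduce the right side of the equality modulo $m_k$.

Hence, the residue $R_{i,k}(\chi_i)$ can be calculated as

$$R_{i,k}(\chi_i) = \left| -\frac{\chi_{i,k-1}}{m_i} \right|_{m_k} = \left| -\frac{\left| M_{i,k-1}^{-1} \chi_i \right|_{m_i}}{m_i} \right|_{m_k} \quad (i = 1, 2, \ldots, k-1), \tag{14}$$

where the division by $m_i$ modulo $m_k$ is understood as multiplication by the inverse $\left| m_i^{-1} \right|_{m_k}$.

At the same time, from (13) it follows that

$$R_{k,k}(\chi_k) = \chi_{k,k} = \left| M_{k,k}^{-1} \chi_k \right|_{m_k} = \left| M_{k-1}^{-1} \chi_k \right|_{m_k}. \tag{15}$$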

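Similarly, taking into account Equations (10)–(13), the numbers $X_i$ $(i = k-1, k-2, \ldots, 1)$ can be written in turn as

$$\begin{aligned}
X_{k-1} &= X_{k-2} + M_{k-2} S_{k-1}(X), \\
X_{k-2} &= X_{k-3} + M_{k-3} S_{k-2}(X), \\
&\;\;\vdots \\
X_2 &= X_1 + M_1 S_2(X), \\
X_1 &= M_0 S_1(X),
\end{aligned}$$

where $M_0 = 1$, $S_1(X) = \chi_1$, and the integers $S_l(X)$ $(l = 2, 3, \ldots, k)$ are calculated according to (12)–(15) with the index $k$ replaced by $l$.

Finally, substituting the above equations for $X_l$ $(l = k-1, k-2, \ldots, 1)$ in turn into (10), we obtain

$$X_k = \sum_{l=1}^{k} M_{l-1} S_l(X). \tag{16}$$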
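Expansion (16) can be verified numerically. The Python sketch below computes the CRT-number directly from (3) and again from the sums $S_l(X)$, using (14) and (15) with $k$ replaced by $l$. The function names `R_term`, `S_sum`, and `expansion` are hypothetical, the moduli and residue code are an assumed test case, and `pow(a, -1, m)` assumes Python 3.8 or later.

```python
from math import prod

def crt_number(residues, moduli):
    """Unreduced CRT sum X_k of Equation (3)."""
    M = prod(moduli)
    return sum((M // m) * ((pow(M // m, -1, m) * chi) % m)
               for chi, m in zip(residues, moduli))

def R_term(i, l, chi, moduli):
    """R_{i,l}(chi_i) from (14)-(15) with k replaced by l; i, l are 1-based, i <= l."""
    m_l = moduli[l - 1]
    M_prev = prod(moduli[:l - 1])                        # M_{l-1}, with M_0 = 1
    if i == l:
        return (pow(M_prev, -1, m_l) * chi) % m_l        # Equation (15)
    m_i = moduli[i - 1]
    inner = (pow(M_prev // m_i, -1, m_i) * chi) % m_i    # chi_{i,l-1}
    return (-inner * pow(m_i, -1, m_l)) % m_l            # Equation (14)

def S_sum(l, residues, moduli):
    """S_l(X) from Equation (12) with k replaced by l (plain integer sum)."""
    return sum(R_term(i, l, residues[i - 1], moduli) for i in range(1, l + 1))

def expansion(residues, moduli):
    """Right-hand side of Equation (16)."""
    return sum(prod(moduli[:l - 1]) * S_sum(l, residues, moduli)
               for l in range(1, len(moduli) + 1))

residues, moduli = (3, 6, 4, 2), (5, 7, 9, 11)
sums = [S_sum(l, residues, moduli) for l in range(1, 5)]
```

For this test case both `crt_number` and `expansion` give 6943, and the individual sums are 3, 9, 8, and 21.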

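At the same time, according to Euclid's Division Lemma, we have

$$S_l(X) = R_l(X) + m_l Q_l(X), \tag{17}$$

where $R_l(X) = \left| S_l(X) \right|_{m_l}$ and $Q_l(X) = \left\lfloor S_l(X) / m_l \right\rfloor$ are the remainder and quotient of the division of $S_l(X)$ by the modulus $m_l$, respectively.

Therefore, taking into account (12) with the index $k$ replaced by $l$, the integers $R_l(X)$ and $Q_l(X)$ can be computed as

$$R_l(X) = \left| \sum_{i=1}^{l} R_{i,l}(\chi_i) \right|_{m_l}, \tag{18}$$

$$Q_l(X) = \left\lfloor \frac{1}{m_l} \sum_{i=1}^{l} R_{i,l}(\chi_i) \right\rfloor. \tag{19}$$

From (19), it follows that $Q_l(X)$ equals the number of overflows that occur when calculating the sum $R_l(X)$ of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ $(l = 2, 3, \ldots, k)$.

Note that $R_1(X) = \chi_1$ and $Q_1(X) = 0$ since $S_1(X) = \chi_1$.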

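Substituting (17) into (16), we obtain

$$X_k = X_k^{R} + X_{k-1}^{Q} + M_k Q_k(X), \tag{20}$$

where

$$X_k^{R} = \sum_{l=1}^{k} M_{l-1} R_l(X), \tag{21}$$

$$X_{k-1}^{Q} = \sum_{l=1}^{k-1} M_l Q_l(X). \tag{22}$$

Let us draw attention to Equations (21) and (22). It is evident that the number $X_k^{R}$ is represented by the $k$-tuple $(x_k^{R}, x_{k-1}^{R}, \ldots, x_1^{R})$ of mixed-radix digits, where $x_l^{R} = R_l(X)$, $l = 1, 2, \ldots, k$ (see Equation (5)). At the same time, $x_l^{R} \in \mathbb{Z}_{m_l}$ and $X_k^{R} \leq M_k - 1$.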

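Bearing in mind that $Q_1(X) = 0$, the number $X_{k-1}^{Q}$ can be written as

$$X_{k-1}^{Q} = \sum_{l=1}^{k} M_{l-1} \hat{Q}_l(X), \tag{23}$$

where $\hat{Q}_1(X) = 0$, $\hat{Q}_2(X) = Q_1(X) = 0$, and $\hat{Q}_l(X) = Q_{l-1}(X)$ for $l \geq 3$. Therefore, taking into account (19), the integer $\hat{Q}_l(X)$ can be calculated as

$$\hat{Q}_l(X) = \left\lfloor \frac{1}{m_{l-1}} \sum_{i=1}^{l-1} R_{i,l-1}(\chi_i) \right\rfloor \quad (l = 3, 4, \ldots, k). \tag{24}$$

Hence, $\hat{Q}_l(X) < l - 1$ since $R_{i,l-1}(\chi_i) \leq m_{l-1} - 1$.

Thus, the integer $X_{k-1}^{Q}$ (see Equations (23) and (5)) can be represented by a $k$-tuple $(x_k^{Q}, x_{k-1}^{Q}, \ldots, x_1^{Q})$ of mixed-radix digits under the condition that $x_l^{Q} \in \mathbb{Z}_{m_l}$ $(l = 1, 2, \ldots, k)$, where $x_1^{Q} = x_2^{Q} = 0$ and $x_l^{Q} = \hat{Q}_l(X)$ for $l > 2$. Consequently, this requires that $\mathbb{Z}_{l-1} \subseteq \mathbb{Z}_{m_l}$, which leads to the inequality

$$m_l \geq l - 1 \quad (l = 1, 2, \ldots, k). \tag{25}$$

Thus, when the moduli set $\{m_1, m_2, \ldots, m_k\}$ meets the conditions (25), we have $X_{k-1}^{Q} < M_k$.

Note that the integer $X_{k-1}^{Q}$ is a multiple of the number $M_2 = m_1 m_2$ because $x_1^{Q} = x_2^{Q} = 0$ (see Equation (5)).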

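Now, let us return to Equation (20). According to Euclid's Division Lemma, the sum of two mixed-radix numbers $X_k^{R}$ and $X_{k-1}^{Q}$ results in

$$X_k^{R} + X_{k-1}^{Q} = \left| X_k^{R} + X_{k-1}^{Q} \right|_{M_k} + M_k \left\lfloor \frac{X_k^{R} + X_{k-1}^{Q}}{M_k} \right\rfloor. \tag{26}$$

Hence, substituting (26) into (20), we obtain

$$X_k = \left| X_k^{R} + X_{k-1}^{Q} \right|_{M_k} + M_k \left( Q_k(X) + \left\lfloor \frac{X_k^{R} + X_{k-1}^{Q}}{M_k} \right\rfloor \right). \tag{27}$$

Taking into account the rank form (4) of the number $X$, from (27) we have

$$X = \left| X_k^{R} + X_{k-1}^{Q} \right|_{M_k}. \tag{28}$$

From (28), it follows that the mixed-radix representation of the number $X$, i.e., the $k$-tuple $(x_k, x_{k-1}, \ldots, x_1)$, can be calculated as a result of the addition of two mixed-radix numbers $X_k^{R} = (x_k^{R}, x_{k-1}^{R}, \ldots, x_1^{R})$ and $X_{k-1}^{Q} = (x_k^{Q}, x_{k-1}^{Q}, \ldots, x_1^{Q})$ (see (21) and (23)) in the basis $\{m_1, m_2, \ldots, m_k\}$. Note that $x_1^{R} = \chi_1$ and $x_1^{Q} = x_2^{Q} = 0$. At the same time, the digits $x_2^{R}, x_3^{R}, \ldots, x_k^{R}$ and $x_3^{Q}, x_4^{Q}, \ldots, x_k^{Q}$ are calculated as the sum of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$, along with the count of the overflows that occur, according to (18) and (24) $(l = 2, 3, \ldots, k)$.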

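Therefore, the mixed-radix digits $x_l^{R}$ and $x_l^{Q}$ are computed as

$$x_1^{R} = \chi_1, \quad x_l^{R} = \left| \sum_{i=1}^{l} R_{i,l}(\chi_i) \right|_{m_l} \quad (l = 2, 3, \ldots, k), \tag{29}$$

$$x_1^{Q} = x_2^{Q} = 0, \quad x_l^{Q} = \left\lfloor \frac{1}{m_{l-1}} \sum_{i=1}^{l-1} R_{i,l-1}(\chi_i) \right\rfloor \quad (l = 3, 4, \ldots, k), \tag{30}$$

where

$$R_{i,l}(\chi_i) = \left| -\frac{\left| M_{i,l-1}^{-1} \chi_i \right|_{m_i}}{m_i} \right|_{m_l} \quad (i \neq l), \tag{31}$$

$$R_{l,l}(\chi_l) = \left| M_{l-1}^{-1} \chi_l \right|_{m_l} \quad (l = 2, 3, \ldots, k). \tag{32}$$

Furthermore, in the MRS with the bases $m_1, m_2, \ldots, m_k$, we calculate the sum of two numbers $X_k^{R}$ and $X_{k-1}^{Q}$. As a result, we obtain the mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$ of the number $X$.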

Table 1 given below presents the pre-calculation components (see Equations (31) and (32)). Recall that $R_{1,1}(\chi_1) = \chi_1$. The abbreviation LUT means lookup table. The bit-length of residues is $b_l = \lceil \log_2 m_l \rceil$ $(l = 1, 2, \ldots, k)$. Here and below, $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$.

Table 1.

The pre-calculation components.

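| Input Residue | Number and Scope of LUTs | Output Residue Set |
|---|---|---|
| $\chi_1$ | $k-1$, $2^{b_1} \times b_l$ $(l = 2, 3, \ldots, k)$ | $R_{1,2}(\chi_1), R_{1,3}(\chi_1), \ldots, R_{1,k}(\chi_1)$ |
| $\chi_2$ | $k-1$, $2^{b_2} \times b_l$ $(l = 2, 3, \ldots, k)$ | $R_{2,2}(\chi_2), R_{2,3}(\chi_2), \ldots, R_{2,k}(\chi_2)$ |
| $\vdots$ | $\vdots$ | $\vdots$ |
| $\chi_{k-1}$ | $2$, $2^{b_{k-1}} \times b_l$ $(l = k-1, k)$ | $R_{k-1,k-1}(\chi_{k-1}), R_{k-1,k}(\chi_{k-1})$ |
| $\chi_k$ | $1$, $2^{b_k} \times b_k$ | $R_{k,k}(\chi_k)$ |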
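The pre-calculation stage summarized in Table 1 can be sketched as a set of one-input lookup tables, one per pair $(i, l)$ with $i \leq l$ and $l \geq 2$. The following Python fragment is a software model only, with the hypothetical name `build_luts` and a dictionary standing in for the hardware tables. `pow(a, -1, m)` assumes Python 3.8 or later.

```python
from math import prod

def build_luts(moduli):
    """Return {(i, l): table} with table[chi] = R_{i,l}(chi); i, l are 1-based."""
    luts = {}
    k = len(moduli)
    for l in range(2, k + 1):
        m_l = moduli[l - 1]
        M_prev = prod(moduli[:l - 1])            # M_{l-1}
        for i in range(1, l + 1):
            m_i = moduli[i - 1]
            if i == l:
                # Equation (32): one entry per residue value modulo m_l
                inv = pow(M_prev, -1, m_l)
                table = [(inv * chi) % m_l for chi in range(m_l)]
            else:
                # Equation (31): inner reduction modulo m_i, outer modulo m_l
                inv_M = pow(M_prev // m_i, -1, m_i)
                inv_m = pow(m_i, -1, m_l)
                table = [(-((inv_M * chi) % m_i) * inv_m) % m_l
                         for chi in range(m_i)]
            luts[(i, l)] = table
    return luts

luts = build_luts((5, 7, 9, 11))
```

For four moduli the model holds nine tables, and a conversion then needs only table reads followed by modular additions. For example, `luts[(1, 2)][3]` gives 5 and `luts[(3, 4)][4]` gives 8 for the basis (5, 7, 9, 11).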

Table 2 presents the results of calculations in the modular channels according to Equations (29) and (30). Recall that in the first modular channel, corresponding to the modulus $m_1$, no calculations are carried out, so $x_1^{R} = \chi_1$ and $x_2^{Q} = 0$.

Table 2.

The results of calculations in the modular channels.

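| Modular Channel | Input Data | Output Data |
|---|---|---|
| $m_2$ | $R_{1,2}(\chi_1), R_{2,2}(\chi_2)$ | $x_2^{R}$, $x_3^{Q}$ |
| $m_3$ | $R_{1,3}(\chi_1), R_{2,3}(\chi_2), R_{3,3}(\chi_3)$ | $x_3^{R}$, $x_4^{Q}$ |
| $\vdots$ | $\vdots$ | $\vdots$ |
| $m_{k-1}$ | $R_{1,k-1}(\chi_1), R_{2,k-1}(\chi_2), \ldots, R_{k-1,k-1}(\chi_{k-1})$ | $x_{k-1}^{R}$, $x_k^{Q}$ |
| $m_k$ | $R_{1,k}(\chi_1), R_{2,k}(\chi_2), \ldots, R_{k,k}(\chi_k)$ | $x_k^{R}$ |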
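The channel computations of Table 2 can be modelled in Python as below. The sketch evaluates (31) and (32) directly instead of reading lookup tables, and it runs the channels in a sequential loop, whereas in hardware they would operate in parallel. The names `residue_term` and `channel_digits` are hypothetical, and `pow(a, -1, m)` assumes Python 3.8 or later.

```python
from math import prod

def residue_term(i, l, chi, moduli):
    """R_{i,l}(chi_i) from Equations (31)-(32); i and l are 1-based, i <= l."""
    m_l = moduli[l - 1]
    M_prev = prod(moduli[:l - 1])                        # M_{l-1}
    if i == l:
        return (pow(M_prev, -1, m_l) * chi) % m_l        # Equation (32)
    m_i = moduli[i - 1]
    inner = (pow(M_prev // m_i, -1, m_i) * chi) % m_i
    return (-inner * pow(m_i, -1, m_l)) % m_l            # Equation (31)

def channel_digits(residues, moduli):
    """Return (xR, xQ) as digit lists ordered [x_1, ..., x_k]."""
    k = len(moduli)
    if any(m < l - 1 for l, m in enumerate(moduli, start=1)):
        raise ValueError("condition (25), m_l >= l - 1, is violated")
    xR = [residues[0]]            # x_1^R = chi_1
    xQ = [0, 0][:k]               # x_1^Q = x_2^Q = 0
    for l in range(2, k + 1):     # each pass models one independent channel
        m_l = moduli[l - 1]
        s = sum(residue_term(i, l, residues[i - 1], moduli)
                for i in range(1, l + 1))
        xR.append(s % m_l)        # Equation (29)
        if l < k:
            xQ.append(s // m_l)   # Equation (30): overflow count feeds digit l + 1
    return xR, xQ

xR, xQ = channel_digits((3, 6, 4, 2), (5, 7, 9, 11))
```

The two digit lists still have to be added in the mixed-radix basis, with inter-digit carries, to obtain the digits of $X$ as stated by (28).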

The above allows us to formulate the following theorem.

Theorem 1.

(About RNS-to-MRS reverse conversion).

Let an arbitrary RNS be defined by an ascending-ordered set of $k$ pairwise relatively prime moduli $\{m_1, m_2, \ldots, m_k\}$ $(m_l \geq l - 1,\ l = 1, 2, \ldots, k,\ k \geq 2)$, and let the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ of the number $X \in \mathbb{Z}_{M_k}$ be given. Then, the mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$ of the number $X$ can be computed as a result of the summation of two mixed-radix numbers, namely, the appropriate number $X_k^{R} = (x_k^{R}, x_{k-1}^{R}, \ldots, x_1^{R})$ and the correction number $X_{k-1}^{Q} = (x_k^{Q}, x_{k-1}^{Q}, \ldots, x_1^{Q})$, where the digits $x_l^{R}$ and $x_l^{Q}$ $(l = 1, 2, \ldots, k)$ are calculated according to (29) and (30), respectively, taking into account (31) and (32).

5. A Numerical Example of the Proposed Conversion Method

The main idea of the proposed approach to reverse conversion is illustrated below by a simple numerical example. For convenience, we consider a four-moduli RNS.

Example 1.

Let the RNS moduli set be $\{m_1, m_2, m_3, m_4\} = \{5, 7, 9, 11\}$. Suppose that we wish to calculate the digits of the mixed-radix representation $(x_4, x_3, x_2, x_1)$ of the given number $X$ from its residue code $(\chi_1, \chi_2, \chi_3, \chi_4) = (3, 6, 4, 2)$.

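Step 1. The calculation of the primitive constants in a given RNS.

$$M_4 = 3465,\quad M_3 = 315,\quad M_2 = 35,\quad M_1 = 5,\quad M_0 = 1,$$

$$M_{1,4} = 693,\quad M_{2,4} = 495,\quad M_{3,4} = 385,\quad M_{4,4} = 315,$$

$$\left| M_{1,4}^{-1} \right|_{m_1} = 2,\quad \left| M_{2,4}^{-1} \right|_{m_2} = 3,\quad \left| M_{3,4}^{-1} \right|_{m_3} = 4,\quad \left| M_{4,4}^{-1} \right|_{m_4} = 8,$$

$$\left| m_1^{-1} \right|_{m_4} = 9,\quad \left| m_2^{-1} \right|_{m_4} = 8,\quad \left| m_3^{-1} \right|_{m_4} = 5,\quad \left| M_3^{-1} \right|_{m_4} = 8,$$

$$M_{1,3} = 63,\quad M_{2,3} = 45,\quad M_{3,3} = 35,$$

$$\left| M_{1,3}^{-1} \right|_{m_1} = 2,\quad \left| M_{2,3}^{-1} \right|_{m_2} = 5,\quad \left| M_{3,3}^{-1} \right|_{m_3} = 8,$$

$$\left| m_1^{-1} \right|_{m_3} = 2,\quad \left| m_2^{-1} \right|_{m_3} = 4,\quad \left| M_2^{-1} \right|_{m_3} = 8,$$

$$M_{1,2} = 7,\quad M_{2,2} = 5,$$

$$\left| M_{1,2}^{-1} \right|_{m_1} = 3,\quad \left| M_{2,2}^{-1} \right|_{m_2} = 3,$$

$$\left| m_1^{-1} \right|_{m_2} = 3,\quad \left| M_1^{-1} \right|_{m_2} = 3.$$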

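Step 2. The calculation of the residue sets $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ according to (31) and (32) $(l = 1, 2, 3, 4)$.

We obtain

$$R_{1,1}(\chi_1) = \chi_1 = 3,$$

$$R_{1,2}(\chi_1) = \left| -\left| 1 \cdot 3 \right|_5 \cdot 3 \right|_7 = 5,$$

$$R_{2,2}(\chi_2) = \left| 3 \cdot 6 \right|_7 = 4,$$

$$R_{1,3}(\chi_1) = \left| -\left| 3 \cdot 3 \right|_5 \cdot 2 \right|_9 = 1,$$

$$R_{2,3}(\chi_2) = \left| -\left| 3 \cdot 6 \right|_7 \cdot 4 \right|_9 = 2,$$

$$R_{3,3}(\chi_3) = \left| 8 \cdot 4 \right|_9 = 5,$$

$$R_{1,4}(\chi_1) = \left| -\left| 2 \cdot 3 \right|_5 \cdot 9 \right|_{11} = 2,$$

$$R_{2,4}(\chi_2) = \left| -\left| 5 \cdot 6 \right|_7 \cdot 8 \right|_{11} = 6,$$

$$R_{3,4}(\chi_3) = \left| -\left| 8 \cdot 4 \right|_9 \cdot 5 \right|_{11} = 8,$$

$$R_{4,4}(\chi_4) = \left| 8 \cdot 2 \right|_{11} = 5.$$

As a result, the following sets of residues occur

$$R_{1,1}(\chi_1) = 3,$$

$$\left( R_{1,2}(\chi_1), R_{2,2}(\chi_2) \right) = (5, 4),$$

$$\left( R_{1,3}(\chi_1), R_{2,3}(\chi_2), R_{3,3}(\chi_3) \right) = (1, 2, 5),$$

$$\left( R_{1,4}(\chi_1), R_{2,4}(\chi_2), R_{3,4}(\chi_3), R_{4,4}(\chi_4) \right) = (2, 6, 8, 5).$$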

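Step 3. The summation of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$, along with the counting of the overflows that occur, according to (18) and (19), respectively $(l = 2, 3, 4)$.

Recall that $R_1(X) = R_{1,1}(\chi_1) = 3$ and $Q_1(X) = 0$. We have

$$R_2(X) = \left| 5 + 4 \right|_7 = \left| 9 \right|_7 = 2,$$

$$R_3(X) = \left| 1 + 2 + 5 \right|_9 = \left| 8 \right|_9 = 8,$$

$$R_4(X) = \left| 2 + 6 + 8 + 5 \right|_{11} = \left| 21 \right|_{11} = 10,$$

$$Q_2(X) = \left\lfloor (5 + 4)/7 \right\rfloor = \left\lfloor 9/7 \right\rfloor = 1,$$

$$Q_3(X) = \left\lfloor (1 + 2 + 5)/9 \right\rfloor = \left\lfloor 8/9 \right\rfloor = 0,$$

$$Q_4(X) = \left\lfloor (2 + 6 + 8 + 5)/11 \right\rfloor = \left\lfloor 21/11 \right\rfloor = 1.$$

Therefore, the mixed-radix representations of the numbers $X_4^{R}$ and $X_3^{Q}$ (see (21) and (23)) are computed:

$$\left( x_4^{R}, x_3^{R}, x_2^{R}, x_1^{R} \right) = \left( R_4(X), R_3(X), R_2(X), R_1(X) \right) = (10, 8, 2, 3),$$

$$\left( x_4^{Q}, x_3^{Q}, x_2^{Q}, x_1^{Q} \right) = \left( Q_3(X), Q_2(X), 0, 0 \right) = (0, 1, 0, 0).$$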

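Step 4. The calculation of the mixed-radix digits $(x_4, x_3, x_2, x_1)$.

The addition of two numbers $X_4^{R} = (10, 8, 2, 3)$ and $X_3^{Q} = (0, 1, 0, 0)$ according to (28) gives the mixed-radix representation $(0, 0, 2, 3)$ of the number $X$. Indeed, in the third position $8 + 1 = 9$ yields the digit 0 and a carry, in the fourth position $10 + 0 + 1 = 11$ again yields the digit 0, and the carry out of the most significant digit is discarded by the reduction modulo $M_4$.

Let us now verify the obtained result. According to (5), we have

$$X = (0, 0, 2, 3) = 0 \cdot 315 + 0 \cdot 35 + 2 \cdot 5 + 3 = 13.$$

This result holds because the residue code of the integer number $X = 13$ is $(3, 6, 4, 2)$, since $|13|_5 = 3$, $|13|_7 = 6$, $|13|_9 = 4$, $|13|_{11} = 2$. Thus, this result coincides with the condition of the example.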
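The digit-wise addition of Step 4 can be sketched in Python as follows. The function name `mixed_radix_add` is hypothetical, and digits are stored least significant first, i.e., in the order $[x_1, \ldots, x_k]$, which is the reverse of the tuple notation used in the example.

```python
def mixed_radix_add(a, b, moduli):
    """Add two mixed-radix numbers given as digit lists [x_1, ..., x_k].

    The carry out of the most significant digit is dropped, which
    corresponds to the reduction modulo M_k in Equation (28).
    """
    digits, carry = [], 0
    for x, y, m in zip(a, b, moduli):
        carry, d = divmod(x + y + carry, m)   # digit in radix m plus carry to the next position
        digits.append(d)
    return digits

x = mixed_radix_add([3, 2, 8, 10], [0, 0, 1, 0], (5, 7, 9, 11))
value = x[0] + x[1] * 5 + x[2] * 35 + x[3] * 315   # Equation (5)
```

With the digit lists from Step 3, `x` evaluates to `[3, 2, 0, 0]` and `value` to 13, in agreement with the verification above.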

6. The Computational Cost of the Reverse Conversion Method

As follows from the results above, the calculation of the mixed-radix digits $x_1, x_2, \ldots, x_k$ reduces to the independent and parallel summation of small residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ in the $l$th modular channel $(l = 1, 2, \ldots, k)$, taking into account the number of overflows occurring during the modular addition operations (see (29)–(32)).

Let us evaluate the time required to perform the parallel reverse conversion.

First, we consider the calculation of mixed-radix digits of the numbers $X_k^{R} = (x_k^{R}, x_{k-1}^{R}, \ldots, x_1^{R})$ and $X_{k-1}^{Q} = (x_k^{Q}, x_{k-1}^{Q}, \ldots, x_1^{Q})$ (see (29) and (30)). As can be seen, there are no modular addition operations in the first modular channel corresponding to the modulus $m_1$. In the second channel, we have only one addition operation modulo $m_2$. Furthermore, two additions modulo $m_3$ are performed in the third channel, and so on. Thus, in the $l$th modular channel, we have $l - 1$ additions modulo $m_l$ $(l = 2, 3, \ldots, k)$. These calculations are easily parallelized and pipelined. Therefore, the required computation time for calculating the digits $x_l^{R}$ and $x_l^{Q}$ is $T_l = \lceil \log_2 l \rceil$ modular clock cycles.

Thus, the time for obtaining the mixed-radix representations of the numbers $X_k^{R}$ and $X_{k-1}^{Q}$ is determined by the time in the $k$th modular channel and equals $T_k = \lceil \log_2 k \rceil$ modular clock cycles.
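The logarithmic depth can be illustrated by a pairwise (adder-tree) summation in which every modular adder also reports whether it wrapped around. The Python sketch below is an assumed software model of such a tree, with the hypothetical name `tree_mod_sum`. It treats one tree level as one modular clock cycle, which is an assumption of this illustration.

```python
def tree_mod_sum(values, m):
    """Pairwise summation of residues modulo m.

    Returns (residue, overflow_count, levels), where levels is the number
    of tree levels traversed. values must be non-empty, each in [0, m).
    """
    level, overflows, depth = list(values), 0, 0
    while len(level) > 1:
        nxt = []
        for j in range(0, len(level) - 1, 2):
            s = level[j] + level[j + 1]
            if s >= m:              # one conditional subtraction is enough
                s -= m
                overflows += 1
            nxt.append(s)
        if len(level) % 2:          # an unpaired element passes through unchanged
            nxt.append(level[-1])
        level, depth = nxt, depth + 1
    return level[0], overflows, depth

result = tree_mod_sum([2, 6, 8, 5], 11)
```

For the residues (2, 6, 8, 5) modulo 11 the tree returns the residue 10, one overflow, and two levels. The overflow count equals the integer quotient of the plain sum by the modulus regardless of the order of additions, since every wrap-around subtracts the modulus exactly once.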

The summation of $X_k^{R}$ and $X_{k-1}^{Q}$ on the bases $m_1, m_2, \ldots, m_k$ involves two additional modular clock cycles, taking into account the inter-digit carries. Therefore, the execution time of the reverse conversion equals $T_{\mathrm{conv}} = T_k + 2$ modular clock cycles. Thus, the overall time is $t_{\mathrm{conv}} = T_{\mathrm{conv}}\, t_{\mathrm{mod}}$, where $t_{\mathrm{mod}}$ denotes the modular clock cycle time. At the same time, when pipelined, the throughput rate of the proposed conversion method is one conversion in one modular clock cycle.

Consider now the evaluation of the required computational cost. Due to the small word-length of residues in the $k$-tuple $(\chi_1, \chi_2, \ldots, \chi_k)$, the pre-computation and lookup table techniques are suitable for reverse conversion implementation. So, we can use one-input lookup tables depending on the residue word-length in each modular channel.

At the beginning stage of the reverse conversion, in the $l$th channel corresponding to the modulus $m_l$, the number of lookup tables required to store the residue set $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ equals $N_{\mathrm{lut}}^{(l)} = l$. At the same time, the word length of recorded residues is $b_l = \lceil \log_2 m_l \rceil$ bits $(l = 2, 3, \ldots, k)$. In the first modular channel, $N_{\mathrm{lut}}^{(1)} = 0$ since $S_1(X) = \chi_1$.

Then, the overall number of one-input lookup tables in all modular channels is equal to

$$N_{\mathrm{lut}} = \sum_{l=2}^{k} N_{\mathrm{lut}}^{(l)} = \frac{k^2 + k - 2}{2}.$$

The summation of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ requires $N_{\mathrm{add}}^{(l)} = l - 1$ modular addition operations $(l = 2, 3, \ldots, k)$. At the same time, all independent calculations are realized in parallel in the corresponding modular channels.

Taking into account that $x_1^{Q} = x_2^{Q} = 0$, the summation of two numbers $X_k^{R} = (x_k^{R}, x_{k-1}^{R}, \ldots, x_1^{R})$ and $X_{k-1}^{Q} = (x_k^{Q}, x_{k-1}^{Q}, \ldots, x_1^{Q})$ at the final stage of the reverse conversion requires $2(k - 2)$ modular addition operations.

Hence, the overall number of modular addition operations in all modular channels is equal to

$$N_{\mathrm{add}} = \sum_{l=2}^{k} N_{\mathrm{add}}^{(l)} + 2(k - 2) = \frac{k^2 + 3k - 8}{2}.$$

When pipelined, the throughput rate of the proposed method is one reverse conversion in one modular clock cycle.
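The cost figures of this section can be cross-checked by counting per channel. The Python sketch below uses the hypothetical name `conversion_cost` and assumes that the channel time is rounded up to a whole number of cycles, i.e., $T_k = \lceil \log_2 k \rceil$, computed here as `(k - 1).bit_length()`.

```python
def conversion_cost(k):
    """Return (N_lut, N_add, T_conv) for a basis of k >= 2 moduli."""
    n_lut = sum(l for l in range(2, k + 1))        # l tables in channel l
    n_add = sum(l - 1 for l in range(2, k + 1))    # l - 1 additions in channel l
    n_add += 2 * (k - 2)                           # final mixed-radix addition
    t_conv = (k - 1).bit_length() + 2              # ceil(log2 k) + 2 modular clock cycles
    return n_lut, n_add, t_conv

cost4 = conversion_cost(4)
```

For the four-moduli basis of Example 1, this gives 9 lookup tables, 10 modular additions, and 4 modular clock cycles, matching the closed-form expressions $(k^2 + k - 2)/2$ and $(k^2 + 3k - 8)/2$.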

7. Discussion

As follows from [1], the calculation of the mixed-radix digits $x_1, x_2, \ldots, x_k$ (see (6) and (7)) requires $k - 1$ addition and $k - 1$ multiplication operations; in this case, the overall conversion time is $\frac{k(k-1)}{2}\left(t_{\mathrm{add}} + t_{\mathrm{mul}}\right)$, where $t_{\mathrm{add}}$ and $t_{\mathrm{mul}}$ denote the execution time of addition/subtraction and multiplication, respectively. The computational cost of the pipelined implementation of this algorithm is $k(k-1)/2$ multiplication and $k(k-1)/2$ addition operations, while the conversion time is $(k - 1)\left(t_{\mathrm{add}} + t_{\mathrm{mul}}\right)$. The main drawback of this method is its strictly sequential nature.

The parallel conversion method described in [16] uses additional lookup tables. It requires $k(k+1)/2$ lookup tables and $k(k+1)/2$ adders. The conversion time is $t_{\mathrm{lut}} + (k - 1)\, t_{\mathrm{add}}$ due to the need to generate the inter-digit carries when performing addition operations. As noted in [34], the method proposed in [16] does not achieve the claimed depth of $O(\log_2 k)$ in terms of RNS processing elements. In this regard, an improved method was proposed by adding an extra $k(k+1)/2$ multipliers to the hardware resources used in [16]. The implementation time is $t_{\mathrm{lut}} + t_{\mathrm{mul}} + \left(2 \lceil \log_2 k \rceil + 1\right) t_{\mathrm{add}}$. Hence, the time complexity of this conversion algorithm is $O(\log_2 k)$.

In [15], the mixed-radix conversion is realized by a cascaded scheme of lookup tables and adders. The computational cost for the sequential implementation is $k(k-2)/4$ double-size lookup tables and $k(k-2)/4$ adders, while the conversion time equals $\frac{k}{2}\left(t_{\mathrm{lut}} + t_{\mathrm{add}}\right)$. When pipelined, the throughput rate is determined by the time $t_{\mathrm{lut}} + t_{\mathrm{add}}$. This method works well when the moduli used do not have a very large word-length, since the size of the lookup tables increases significantly as the word-length grows.

The paper [17] presents a parallel reverse conversion method that uses the lookup table technique and requires no arithmetic or logical units. As reported, this algorithm is better than the ones presented in [15,16]. It is based on solving $k(k-1)/2$ linear Diophantine equations and requires $k(k-1)/2$ lookup tables of size $m_i \times m_j$, while the conversion time is $(k - 1)\, t_{\mathrm{lut}}$. When pipelined, its effective conversion rate is one conversion per $t_{\mathrm{lut}}$. So, this method is attractive for DSP implementation. However, it is not suitable for implementing cryptographic applications because of the enormous size of the required lookup tables, especially when processing large numbers.

In the paper [9], the reverse conversion method is based on modular reduction by a modified canonic CRT algorithm. This enables minimizing the bit-width of intermediate data processing. The lookup tables translate the bi-bit input residues i=1,2,,k into bout-bit output integers, where bi=log2mi, bout=12log2i=1kbi, and k is the number of RNS moduli. As a result, the modular reduction of the modified k-tuple of bout-bits integers is carried out over a ring of size 2bout such that only the bout least significant bits of the binary representation are maintained. In this case, all the bout-bit outputs in the modified k-tuple are added together by adder tree without regard to overflow, propagating the bout least significant bits to the output. The reverse conversion requires k lookup tables and k1 adders. The scope of used lookup tables is 2b×2bout, b{b1,b2,,bk}. The overall conversion time is tlut+log2ktadd.

Some reverse conversion methods use special moduli sets with a limited number of moduli, such as $m = 2^n + d$, $d \in \{-1, 0, 1\}$ [2,8,35,36,37,38,39,40]. Their main drawback is the small number of selected moduli, typically from three to five. These moduli sets are suitable for efficient implementations of DSP algorithms but are not applicable to the processing of large numbers widely used in cryptography. For example, to represent 1024-bit cryptographic numbers using four RNS moduli, each modular channel must have residues of 256-bit length, which is unsuitable for high-performance computing.

Table 3 compares the results across multiple techniques of the reverse conversion. Here, we use the following abbreviations: LUT, lookup table; ADD, adder; MUL, multiplier. The bit length is $b \in \{b_1, b_2, \ldots, b_k\}$, $b_l = \lceil \log_2 m_l \rceil$ $(l = 1, 2, \ldots, k)$.

Table 3.

RNS-to-MRS reverse conversion methods.

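| Method | Number and Scope of LUTs | ADD | MUL | Conversion Time |
|---|---|---|---|---|
| [1], sequential | | $k-1$ | $k-1$ | $\frac{k(k-1)}{2}\left(t_{\mathrm{mul}} + t_{\mathrm{add}}\right)$ |
| [1], sequential, pipelined | $\frac{k(k-1)}{2}$, $2^{b+1} \times b$ | $\frac{k(k-1)}{2}$ | | $(k-1)\left(t_{\mathrm{lut}} + t_{\mathrm{add}}\right)$ |
| [16], parallel | $\frac{k(k+1)}{2}$, $2^{b} \times b$ | $\frac{k(k+1)}{2}$ | | $t_{\mathrm{lut}} + (k-1)\, t_{\mathrm{add}}$ |
| [34], parallel | $\frac{k(k+1)}{2}$, $2^{b} \times b$ | $\frac{k(k+1)}{2}$ | $\frac{k(k+1)}{2}$ | $t_{\mathrm{lut}} + t_{\mathrm{mul}} + \left(2\lceil \log_2 k \rceil + 1\right) t_{\mathrm{add}}$ |
| [15], sequential | $\frac{k(k-2)}{4}$, $2^{2b} \times 2b$ | $\frac{k(k-1)}{4}$ | | $\frac{k}{2}\left(t_{\mathrm{lut}} + t_{\mathrm{add}}\right)$ |
| [15], parallel | $\frac{k(k-2)}{4} + k - 1$, $2^{2b} \times 2b$ | $\frac{k(k+2)}{4} - 3$ | | $t_{\mathrm{lut}} + \frac{k}{2}\, t_{\mathrm{add}}$ |
| [17], parallel | $\frac{k(k-1)}{2}$, $2^{2b} \times b$ | | | $(k-1)\, t_{\mathrm{lut}}$ |
| [9] | $k$, $2^{b} \times 2^{b_{\mathrm{out}}}$ | $k-1$ | | $t_{\mathrm{lut}} + \lceil \log_2 k \rceil\, t_{\mathrm{add}}$ |
| Our method, parallel | $\frac{k^2+k-2}{2}$, $2^{b} \times b$ | $\frac{k^2+3k-8}{2}$ | | $\left(\lceil \log_2 k \rceil + 2\right) t_{\mathrm{mod}}$ |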

As seen from above, the proposed parallel reverse conversion method has time complexity of the order $O(\log_2 k)$. In pipelined mode, it provides a high throughput rate of one reverse conversion in one modular clock cycle. At the same time, the computational complexity is of the order of $O(k^2/2)$ in terms of the number of both required arithmetic operations and one-input lookup tables.

8. Conclusions

In this paper, a novel approach to parallel reverse conversion of the residue code χ1,χ2,,χk of the number X to mixed-radix representation xk,xk1,,x1 is described.

The calculation of the mixed-radix digits xk,xk1,,x1 is reduced to a parallel summation of the small word-length residues R1,lχ1, R2,lχ2, ⋯, Rl,lχl modulo ml in lth modular channel l=1,2,,k, taking into account the number of the overflows occuring during the modular addition operations. These modular operations are performed fast and independently in each modular channel and easily pipelined.

The computational cost of the proposed reverse conversion method is presented. Over all modular channels, the total number of modular addition operations is $N_{add} = (k^2 + 3k - 8)/2$. At the same time, the total number of required one-input lookup tables is $N_{lut} = (k^2 + k - 2)/2$.

The execution time of the reverse conversion equals $T_{conv} = \lceil \log_2 k \rceil + 2$ modular clock cycles. At the same time, when pipelined, the throughput rate of the proposed conversion method is one conversion in one modular clock cycle.
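The cost figures stated in these conclusions can be tabulated for concrete values of $k$ with the short sketch below. The function name `conversion_cost` is illustrative, and the sketch assumes that the logarithm in the execution time is rounded up to a whole number of clock cycles and that $k \geq 2$.

```python
def conversion_cost(k: int) -> dict:
    """Evaluate the stated cost formulas for k moduli (k >= 2).

    n_add  = (k^2 + 3k - 8) / 2   modular additions
    n_lut  = (k^2 + k - 2) / 2    one-input lookup tables
    t_conv = ceil(log2 k) + 2     modular clock cycles (rounding assumed)
    """
    if k < 2:
        raise ValueError("at least two moduli are required")
    return {
        "n_add": (k * k + 3 * k - 8) // 2,   # numerator is always even
        "n_lut": (k * k + k - 2) // 2,       # numerator is always even
        "t_conv": (k - 1).bit_length() + 2,  # exact ceil(log2 k) for integers
    }


for k in (4, 8, 16, 64):
    print(k, conversion_cost(k))
```

For example, $k = 8$ gives 40 modular additions, 35 lookup tables, and 5 clock cycles, showing the quadratic growth in hardware against the logarithmic growth in latency.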

The proposed parallel reverse conversion method is in line with the direction in which modern high-performance computing based on residue arithmetic is developing. It can find wide application in a broad class of tasks in various areas of science and technology, first of all in digital signal processing and cryptography.

Author Contributions

Conceptualization, M.S.; investigation, Y.P.; methodology, M.S.; writing—original draft preparation, M.S.; writing—review and editing, Y.P. All authors have read and improved the final version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Szabo N.S., Tanaka R.I. Residue Arithmetic and Its Application to Computer Technology. McGraw-Hill; New York, NY, USA: 1967.
  • 2.Molahosseini A.S., de Sousa L.S., Chang C.H., editors. Embedded Systems Design with Special Arithmetic and Number Systems. Springer; Cham, Switzerland: 2017.
  • 3.Akushskii I.Y., Juditskii D.I. Machine Arithmetic in Residue Classes. Soviet Radio; Moscow, Russia: 1968. (In Russian)
  • 4.Amerbayev V.M. Theoretical Foundations of Machine Arithmetic. Nauka; Alma-Ata, Kazakhstan: 1976. (In Russian)
  • 5.Omondi A.R., Premkumar B. Residue Number Systems: Theory and Implementation. Imperial College Press; London, UK: 2007.
  • 6.Soderstrand M.A., Jenkins W.K., Jullien G.A., Taylor F.J., editors. Residue Number System Arithmetic: Modern Applications in Digital Signal Processing. IEEE Press; New York, NY, USA: 1986.
  • 7.Chernyavsky A.F., Danilevich V.V., Kolyada A.A., Selyaninov M.Y. High-Speed Methods, and Systems of Digital Information Processing. Belarusian State University; Minsk, Belarus: 1996. (In Russian)
  • 8.Ananda Mohan P.V. Residue Number Systems. Theory and Applications. Springer; Cham, Switzerland: 2016.
  • 9.Michaels A.J. A maximal entropy digital chaotic circuit; Proceedings of the 2011 IEEE International Symposium of Circuits and Systems (ISCAS); Rio de Janeiro, Brazil. 15–18 May 2011; pp. 717–720.
  • 10.Ding C., Pei D., Salomaa A. Chinese Remainder Theorem: Applications in Computing, Coding, Cryptography. World Scientific; Singapore: 1996.
  • 11.Omondi A.R. Cryptography Arithmetic: Algorithms and Hardware Architectures. Springer; Cham, Switzerland: 2020.
  • 12.Burton D.M. Elementary Number Theory. 7th ed. McGraw-Hill; New York, NY, USA: 2011.
  • 13.Hardy G.H., Wright E.M. An Introduction to the Theory of Numbers. 6th ed. Oxford University Press; London, UK: 2008.
  • 14.Akkal M., Siy P. A new mixed radix conversion algorithm MRC-II. J. Syst. Archit. 2007;53:577–586. doi: 10.1016/j.sysarc.2006.12.006.
  • 15.Chakraborti N.B., Soundararajan J.S., Reddy A.L.N. An implementation of mixed-radix conversion for residue number applications. IEEE Trans. Comput. 1986;35:762–764. doi: 10.1109/TC.1986.1676829.
  • 16.Huang C.H. Fully parallel mixed-radix conversion algorithm for residue number applications. IEEE Trans. Comput. 1983;32:398–402. doi: 10.1109/TC.1983.1676242.
  • 17.Miller D.F., McCormick W.S. An arithmetic free parallel mixed-radix conversion algorithm. IEEE Trans. Circuits Syst. II. 1998;45:158–162. doi: 10.1109/82.659469.
  • 18.Yassine H.M., Moore W.R. Improved mixed-radix conversion for residue number architectures. IEE Proc. G - Circuits Devices Syst. 1991;138:120–124. doi: 10.1049/ip-g-2.1991.0022.
  • 19.Knuth D.E. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. 3rd ed. Addison-Wesley; Boston, MA, USA: 1998.
  • 20.Shoup V. A Computational Introduction to Number Theory and Algebra. 2nd ed. Cambridge University Press; Cambridge, UK: 2005.
  • 21.Phatak D.S., Houston S.D. New distributed algorithms for fast sign detection in residue number systems (RNS) J. Parallel Distrib. Comput. 2016;97:78–95. doi: 10.1016/j.jpdc.2016.06.005.
  • 22.Shenoy M.A.P., Kumaresan R. A fast and accurate RNS scaling technique for high speed signal processing. IEEE Trans. Acoust. Speech Signal Process. 1989;37:929–937. doi: 10.1109/ASSP.1989.28063.
  • 23.Vu T.V. Efficient implementations of the Chinese Remainder Theorem for sign detection and residue decoding. IEEE Trans. Comput. 1985;34:646–651.
  • 24.Miller D.D., Altschul R.E., King J.R., Polky J.N. Residue Number System Arithmetic: Modern Applications in Digital Signal Processing. IEEE Press; Piscataway, NJ, USA: 1986. Analysis of the residue class core function of Akushskii, Burcev, and Pak; pp. 390–401.
  • 25.Gonnella J. The application of core functions to residue number system. IEEE Trans. Signal Process. 1991;39:69–75. doi: 10.1109/78.80766.
  • 26.Abtahi M. Core function of an RNS number with no ambiguity. Comput. Math. Appl. 2005;50:459–470. doi: 10.1016/j.camwa.2005.03.008.
  • 27.Kong Y., Asif S., Khan M.A.U. Modular multiplication using the core function in the residue number system. Appl. Algebra Eng. Commun. Comput. 2016;27:1–16. doi: 10.1007/s00200-015-0268-1.
  • 28.Kolyada A.A., Selyaninov M.Y. Generation of integral characteristics of symmetric-range residue codes. Cybern. Syst. Anal. 1986;22:431–437. doi: 10.1007/BF01075072.
  • 29.Selianinau M. An efficient implementation of the CRT algorithm based on an interval-index characteristic and minimum-redundancy residue code. Int. J. Comput. Meth. 2020;17:2050004. doi: 10.1142/S0219876220500048.
  • 30.Lu M., Chiang J.-S. A novel division algorithm for the residue number system. IEEE Trans. Comput. 1992;41:1026–1032. doi: 10.1109/12.156545.
  • 31.Dimauro G., Impedovo S., Modugno R., Pirlo G., Stefanelli R. Residue-to-binary conversion by the “quotient function”. IEEE Trans. Circuits Syst. II Analog Digital Signal Process. 2003;50:488–493. doi: 10.1109/TCSII.2003.814808.
  • 32.Dimauro G., Impedovo S., Pirlo G., Salzo A. RNS architectures for the implementation of the ’diagonal function’. Inf. Process. Lett. 2000;73:189–198. doi: 10.1016/S0020-0190(00)00003-X.
  • 33.Pirlo G., Impedovo D. A new class of monotone functions of the residue number system. Int. J. Math. Models Meth. Appl. Sci. 2013;7:802–809.
  • 34.Hitz M.A., Kaltofen E. Integer division in residue number systems. IEEE Trans. Comput. 1995;44:983–989. doi: 10.1109/12.403714.
  • 35.Bergerman M.V., Lyakhov P.A., Voznesensky A.S., Bogaevskiy D.V., Kaplun D.I. Designing reverse converter for data transmission systems from two-level RNS to BNS. J. Phys. Conf. Ser. 2020;1658:012005. doi: 10.1088/1742-6596/1658/1/012005.
  • 36.Daphni S., Vijula Grace K.S. A review analysis of reverse converter based on RNS in signal processing. Int. J. Sci. Technol. Res. 2020;9:1686–1689.
  • 37.Sousa L., Paludo R., Martins P., Pettenghi H. Towards the integration of reverse converters into the RNS channels. IEEE Trans. Comput. 2020;69:342–348. doi: 10.1109/TC.2019.2948335.
  • 38.Mojahed M., Molahosseini A.S., Zarandi A.A.E. A multifunctional unit for reverse conversion and sign detection based on the 5-moduli set. Comp. Sci. 2021;22:101–121. doi: 10.7494/csci.2021.22.1.3823.
  • 39.Salifu A. New reverse conversion for four-moduli set and five-moduli set. J. Comp. Commun. 2021;9:57–66. doi: 10.4236/jcc.2021.94004.
  • 40.Taghizadeghankalantari M., TaghipourEivazi S. Design of efficient reverse converters for Residue Number System. J. Circuits Syst. Comp. 2021;30:2150141. doi: 10.1142/S0218126621501413.



Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
