PLOS One. 2025 May 28;20(5):e0322726. doi: 10.1371/journal.pone.0322726

A simple and efficient attack on the Merkle-Hellman knapsack cryptosystem

Jingguo Bi 1,2,*, Lei Su 1, Haipeng Peng 1, Lin Wang 3,*
Editor: Je Sen Teh4
PMCID: PMC12118879  PMID: 40435178

Abstract

The Merkle-Hellman knapsack cryptosystem, invented by Merkle and Hellman in 1978, was one of the two earliest public key cryptosystems. One can recover equivalent keys by using Shamir's method. The most time-consuming part of Shamir's attack is recovering the critical intermediate parameters by solving an integer programming problem with a fixed number of variables, whose worst-case complexity is exponential in the number of variables. In this paper, we present an improved algorithm to analyze the basic Merkle-Hellman public key cryptosystem. The main idea, and the main difference from Shamir's attack, is to recover a partial super-increasing sequence as the equivalent private key. More precisely, we first obtain a super-increasing sequence by invoking the LLL algorithm on a special lattice of small dimension. We can then recover most of the plaintext from the tail by solving the resulting super-increasing knapsack problem. Finally, we get the first part of the plaintext by solving a size-reduced knapsack problem. Experimental data show that one can recover the whole plaintext in less than 1 second on a laptop for the typical parameters of the Merkle-Hellman cryptosystem; the time complexity is reduced by a polynomial factor compared with Shamir's algorithm.

1 Introduction

The Merkle-Hellman cryptosystem, a public key cryptosystem, was proposed by Merkle and Hellman in 1978 [1]. Its security was based on the hardness of the knapsack problem. The scheme used super-increasing sequences and modular multiplication to construct a trapdoor, making decryption easy with the private key. Attacks on this cryptosystem fall into two main categories: directly solving the knapsack problem, and finding equivalent keys.

Many algorithms have been proposed to solve the knapsack problem. In 1985, Lagarias and Odlyzko proposed a direct way to use lattices to solve the subset sum problem [2]. This method does not rely on any properties of the subset sum instance and only works when the density is sufficiently small. In 1991, Coster, Joux, LaMacchia, Odlyzko, Schnorr and Stern [3] made a small adjustment to the original method, raising the upper limit of applicable density from 0.6463 to 0.9408. However, there are no efficient lattice-based methods for solving the knapsack problem with density close to 1. For this case, the state-of-the-art algorithm was proposed by Schroeppel and Shamir [4]; it runs in O(n · 2^{n/2}) time and uses O(n · 2^{n/4}) bits of memory. This algorithm has the same running time as the basic birthday-based algorithm for the knapsack problem introduced by Horowitz and Sahni [5], but much lower memory requirements. In 2010, Howgrave-Graham and Joux proposed two new algorithms that improve this bound [6], reducing the running time further under reasonable heuristics. Besides, intelligent algorithms such as genetic algorithms [7, 8] and ant colony algorithms [9] have also been used to attack the Merkle-Hellman cryptosystem.

Another way to break the Merkle-Hellman cryptosystem is to recover the private key. In 1982, Shamir proposed an algorithm to break it in polynomial time [10]. Shamir noticed an "unusual" relationship between the parameters, whose essence is that modular multiplication cannot perfectly hide the information of the private key. In 2019, Liu and Bi proposed a lattice-based attack [11]. They used the orthogonal lattice technique as the main tool and obtained an O(n^7) speed-up compared with Shamir's algorithm.

Currently, the National Institute of Standards and Technology (NIST) is collecting post-quantum algorithms from all over the world, and knapsack ciphers are a promising class of candidate algorithms. The knapsack problem is NP-complete, and knapsack-based schemes are conjectured to resist attacks by quantum computers. Some improved knapsack schemes have been proposed in recent years [12-15]. In this paper, we propose an improved method based on Shamir's attack for breaking the Merkle-Hellman cryptosystem, which is helpful for the cryptanalysis of knapsack-based schemes.

1.1 Contributions

In this paper, we contribute to the body of knowledge on lattice-based attacks on the Merkle-Hellman cryptosystem by presenting an improved algorithm. Our approach differs from previous works in that we focus on recovering a partial super-increasing sequence as the equivalent private key, which is a novel way of utilizing the lattice reduction results. By invoking the LLL algorithm on a specially constructed small-dimensional lattice, we are able to bypass the most time-consuming integer programming step in Shamir’s attack, thereby achieving a significant reduction in time complexity.

Let n denote the number of public key elements of the Merkle-Hellman cryptosystem, L the length of the input, and l the number of variables we use. In our improved method, the most time-consuming part is finding the equivalent M and U by invoking the LLL algorithm [16]. Its complexity is dominated by the LLL algorithm, which is O(l^6 L^3) for classical LLL and can be further reduced using the L^2 algorithm [17]. In our attack, we choose l = 20. Noting that L = O(n) and l = O(1), the time complexity of our attack is O(n^3), while the complexity of Shamir's attack is O(n^{15} · n log n); thus we obtain roughly an O(n^{13}) speed-up compared with Shamir's algorithm.

Our work not only advances the understanding of how lattice theory can be applied to break the Merkle-Hellman cryptosystem but also provides new insights into the potential weaknesses of similar knapsack-based cryptographic schemes. The experimental results that demonstrate the efficiency of our attack further highlight the importance of considering lattice-based attacks when evaluating the security of cryptographic algorithms, especially in the context of post-quantum cryptography where the relevance of such attacks is expected to grow.

1.2 Organization

We organize the paper as follows. In Section 2, we recall the background on lattice theory, the Merkle-Hellman cryptosystem and Shamir’s attack algorithm. Section 3 shows our improved method that recovers the equivalent keys. Section 4 provides the experimental data of this method. Finally, we conclude the paper in Section 5.

2 Preliminary

2.1 Lattice

Let B = {\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n} \subset \mathbb{R}^m be n linearly independent vectors. The lattice generated by B is the set of all integer linear combinations of the basis vectors:

\mathcal{L}(B) = \left\{ \sum_{i=1}^{n} x_i \mathbf{b}_i : x_i \in \mathbb{Z} \right\}

A classic problem on lattices is the shortest vector problem (SVP): find a non-zero lattice vector with the shortest Euclidean norm. The length of the shortest non-zero vector of a lattice \mathcal{L} is denoted by \lambda_1(\mathcal{L}).

Definition 1 (SVP). Given a basis of a lattice \mathcal{L}, find a non-zero vector \mathbf{u} \in \mathcal{L} such that \|\mathbf{u}\| \le \|\mathbf{v}\| for any vector \mathbf{v} \in \mathcal{L} \setminus \{0\}.

The Gaussian heuristic can be used to estimate \lambda_1(\mathcal{L}): in a random lattice, the length of the shortest vector is approximately

\lambda_1(\mathcal{L}) \approx \frac{\Gamma(n/2+1)^{1/n}}{\sqrt{\pi}} \cdot (\det(\mathcal{L}))^{1/n} \approx \sqrt{\frac{n}{2\pi e}} \cdot (\det(\mathcal{L}))^{1/n}

where Γ(x) is the gamma function.
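As a quick numerical check (ours, not from the paper) that the two forms of the heuristic agree, one can compare the exact Γ-based constant with its Stirling simplification:

```python
import math

def gh_constant_exact(n):
    # Gamma(n/2 + 1)^(1/n) / sqrt(pi), computed via lgamma for stability
    return math.exp(math.lgamma(n / 2 + 1) / n) / math.sqrt(math.pi)

def gh_constant_stirling(n):
    # sqrt(n / (2*pi*e)), the simplified constant used in the text
    return math.sqrt(n / (2 * math.pi * math.e))

n = 100
exact, approx = gh_constant_exact(n), gh_constant_stirling(n)
print(exact, approx)  # the two constants differ by only a few percent
```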

Given a basis B of a lattice \mathcal{L}, the LLL algorithm [16] outputs a set of short vectors in polynomial time, called an LLL-reduced basis. Let (\mathbf{b}_1, \ldots, \mathbf{b}_n) be an LLL-reduced basis of a lattice \mathcal{L}. Then:

  1. \|\mathbf{b}_1\| \le 2^{\frac{n-1}{4}} (\det(\mathcal{L}))^{\frac{1}{n}}.

  2. \|\mathbf{b}_1\| \le 2^{\frac{n-1}{2}} \lambda_1(\mathcal{L}).

2.2 Knapsack problem

Definition 2 (Knapsack Problem). Given a set of non-negative integers a_1, a_2, \ldots, a_n and a value s that is the sum of some subset of the set, determine that subset.

Formally, the knapsack problem is: given non-negative integers a_1, a_2, \ldots, a_n and s, find x_1, x_2, \ldots, x_n \in \{0,1\} such that \sum_{i=1}^{n} a_i x_i = s.

The knapsack problem is a well-known NP-complete problem. In recent years, new algorithms have been proposed to solve it, whose time and space complexities are all exponential in n [6, 18, 19]. However, there is one kind of easy knapsack problem, defined below.

Definition 3 (Super-increasing Knapsack Problem). Given a set of non-negative integers a_1, a_2, \ldots, a_n with a_i > \sum_{j=1}^{i-1} a_j, and a value s that is the sum of some subset of the set, determine that subset.

It is easy to solve a super-increasing knapsack. Simply take the total weight of the knapsack s and compare it with the largest weight a_n in the sequence. If s < a_n, then a_n is not in the knapsack, i.e. x_n = 0; otherwise x_n = 1, and we subtract a_n from s. Compare the remainder with the next largest weight, and keep working this way until the total reaches zero.
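The greedy procedure above can be sketched in a few lines of Python (a minimal illustration; the function name and weights are ours):

```python
def solve_superincreasing(weights, s):
    """Greedy solver for a super-increasing knapsack.

    Scan from the largest weight down: take a weight exactly when the
    remaining total is at least that weight.
    """
    x = [0] * len(weights)
    for i in range(len(weights) - 1, -1, -1):
        if s >= weights[i]:
            x[i] = 1
            s -= weights[i]
    return x if s == 0 else None  # None: s is not a subset sum

# Example: each weight exceeds the sum of all weights before it
b = [2, 7, 11, 21, 42, 89]
print(solve_superincreasing(b, 2 + 11 + 42))  # -> [1, 0, 1, 0, 1, 0]
```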

2.3 Basic Merkle-Hellman cryptosystem

Here we give a brief introduction to the basic Merkle-Hellman cryptosystem. Suppose Alice wants to send a message m to Bob using this cryptosystem.

Key generation:

Bob selects a super-increasing sequence b = (b_1, b_2, \ldots, b_n), which satisfies b_i > \sum_{j=1}^{i-1} b_j for any i \in (1, n]. He chooses integers M, W such that M > \sum_{i=1}^{n} b_i and \gcd(M, W) = 1, and calculates

a_i \equiv b_i W \pmod{M}.

Then Bob's public key is a = (a_1, a_2, \ldots, a_n), and his private key is (b, W, M).

Encryption:

Alice wants to send the plaintext message m = (x_1, x_2, \ldots, x_n) \in \{0,1\}^n to Bob. She looks up Bob's public key and encrypts the message as

c = \sum_{i=1}^{n} x_i a_i.

The ciphertext is c.

Decryption:

Bob receives the ciphertext c and calculates c' = c W^{-1} \bmod M.

Note that

c' \equiv c W^{-1} \equiv \sum_{i=1}^{n} x_i a_i W^{-1} \equiv \sum_{i=1}^{n} x_i b_i \pmod{M}.

Because M > \sum_{i=1}^{n} b_i, we have c' = \sum_{i=1}^{n} x_i b_i over the integers.

Bob can then recover the plaintext message m by solving the super-increasing subset sum problem, which is easy.
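The whole scheme fits in a few lines of Python. The following toy instance uses classic textbook-sized parameters (n = 8; real deployments used n around 100), so the numbers are illustrative only:

```python
# Toy Merkle-Hellman instance (textbook-sized; the paper's parameters use n = 100).
b = [2, 7, 11, 21, 42, 89, 180, 354]   # super-increasing private sequence
M, W = 881, 588                        # M > sum(b) = 706, gcd(W, M) = 1
U = pow(W, -1, M)                      # W^{-1} mod M, used for decryption

a = [(bi * W) % M for bi in b]         # public key

x = [0, 1, 1, 0, 0, 0, 0, 1]           # plaintext bits
c = sum(ai for ai, xi in zip(a, x) if xi)   # ciphertext

# Decryption: c' = c * W^{-1} mod M equals sum x_i * b_i, a super-increasing
# knapsack instance solved by the greedy tail-first scan.
cp = (c * U) % M
recovered, s = [0] * len(b), cp
for i in range(len(b) - 1, -1, -1):
    if s >= b[i]:
        recovered[i] = 1
        s -= b[i]
print(recovered == x)  # True
```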

2.4 Shamir’s attack

The core idea of the algorithm is an "unusual" relationship between the parameters. Let us first consider the size of each parameter in the cryptosystem. a_i is a dn-bit number; typical values of d and n are 2 and 100, respectively, so a_i is generally 200 bits. M is also 200 bits, and b_i is chosen of size about 2^{n+i-1}, i.e. n + i - 1 bits in general.

Note that U = W^{-1} \bmod M, so the size of U is also 200 bits. From b_i \equiv a_i U \pmod{M}, there is a positive integer k_i such that

b_i = a_i U - k_i M \qquad (1)

Compared with a_i U, the left side of the equation is relatively small, indicating that a_i U and k_i M are of the same order of magnitude, so k_i is also about 200 bits. From Eq 1 we obtain:

\left| \frac{U}{M} - \frac{k_i}{a_i} \right| = \frac{b_i}{M a_i} \le \frac{2^{n+i-1}}{2^{4n}} = 2^{-3n+i-1}

We denote the number of the a_i we use by l. For 1 < i \le l, we have

\left| \frac{k_i}{a_i} - \frac{k_1}{a_1} \right| \le 2^{-3n+i-1} + 2^{-3n} \le 2^{-3n+l}

This indicates that

|k_i a_1 - k_1 a_i| \le 2^{n+l} \qquad (2)

We can observe from Eq 2 that the difference of two 4n-bit numbers is a number of fewer than 2n bits, which is very unusual. Shamir pointed out that when l > d + 1, integer programming can be used to find several sets of candidate solutions in polynomial time. Once we get the value of k_1, we can try to construct pairs (W', M') such that the numbers

b_i' = a_i (W')^{-1} \bmod M', \quad 0 < b_i' < M', \quad 1 \le i \le n

form a super-increasing sequence, thereby breaking the Merkle-Hellman cryptosystem. The time complexity of this attack is O(n^{l+10} \cdot L \log L) [20], where L is the length of the input.

3 Cryptanalysis

3.1 Observation of short vectors

Note that a_i = W b_i \bmod M, i = 1, \ldots, n; hence b_i = W^{-1} a_i \bmod M, i = 1, \ldots, n. Let k_i be the quotient such that b_i = W^{-1} a_i - k_i M, as in Eq 1. These equations imply that

\frac{b_2 a_1 - b_1 a_2}{M} = k_1 a_2 - k_2 a_1 \qquad (3)

\frac{b_i a_1 - b_1 a_i}{M} = k_1 a_i - k_i a_1, \quad 3 \le i \le n. \qquad (4)

Take n = 100. The bit length of the first few integers of the sequence S_i is less than 99 + i, far less than the bit length of the a_i, which is about 200. We try to recover the first l - 1 integers \frac{b_i a_1 - b_1 a_i}{M}, i = 2, \ldots, l. Guided by the right-hand sides of the above equations, we define \mathbf{h}_1 = (a_1, 0, \ldots, 0), \mathbf{h}_2 = (0, a_1, \ldots, 0), \ldots, \mathbf{h}_{l-1} = (0, \ldots, 0, a_1), \mathbf{h}_l = (a_2, a_3, \ldots, a_l). Then we apply the LLL algorithm to the lattice \mathcal{L}(\mathbf{h}_1, \ldots, \mathbf{h}_l).

Define the following matrix:

H = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_1 \\ a_2 & a_3 & \cdots & a_l \end{bmatrix}

Obviously, the rank of the lattice \mathcal{L} is l - 1, and \mathbf{h}_1, \ldots, \mathbf{h}_l is not a basis of the lattice. Therefore, the first vector of the LLL output is the zero vector. Besides, by Eq 2 and Eq 4, the norm of the vector

\left( \frac{b_2 a_1 - b_1 a_2}{M}, \frac{b_3 a_1 - b_1 a_3}{M}, \ldots, \frac{b_l a_1 - b_1 a_l}{M} \right)

in the lattice is unusually small. It is the second shortest vector in the output, and the other l - 2 vectors are much longer than it. The experimental data confirm this.

Define

S_i = a_i k_1 - k_i a_1 \qquad (5)

From Eq 3 and Eq 4, we have

(-k_2, -k_3, \ldots, -k_l, k_1) \, H = (S_2, S_3, \ldots, S_l)

so the vector (S_2, S_3, \ldots, S_l) lies in the lattice \mathcal{L} and is its shortest non-zero vector. Because l is chosen small (sometimes l = 10) and the rank of the lattice is l - 1, one can recover S_i, 2 \le i \le l, very quickly.
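The smallness of the target vector is easy to check numerically. The following sketch uses a textbook-sized toy key (our own illustrative numbers) and computes the k_i directly from the private key, so the S_i can be inspected without running LLL:

```python
# Verify numerically that S_i = a_i*k_1 - k_i*a_1 is unusually small.
# Toy parameters (illustrative only; the paper uses n = 100, 200-bit a_i).
b = [2, 7, 11, 21, 42, 89, 180, 354]       # super-increasing private key
M, W = 881, 588
U = pow(W, -1, M)
a = [(bi * W) % M for bi in b]             # public key

# b_i = a_i*U - k_i*M (Eq 1), so k_i is the quotient of a_i*U by M
k = [(ai * U - bi) // M for ai, bi in zip(a, b)]

S = [a[i] * k[0] - k[i] * a[0] for i in range(len(a))]
print(S)

# Each S_i also equals (b_i*a_1 - b_1*a_i)/M, hence |S_i| < b_1 + b_i,
# far below the roughly M-sized lattice entries.
for i in range(len(a)):
    assert S[i] * M == b[i] * a[0] - b[0] * a[i]
    assert abs(S[i]) < b[0] + b[i]
```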

Lemma 1. Let \mathcal{L} be the lattice generated by the basis matrix H. The target vector \mathbf{S} = (S_2, \ldots, S_l) defined above, where S_i = a_i k_1 - k_i a_1, is the shortest vector in the lattice \mathcal{L}, provided l > 3.

Proof: The target vector \mathbf{S} = (S_2, \ldots, S_l) is defined by S_i = a_i k_1 - k_i a_1 for 2 \le i \le l, where |S_i| \le 2^{n+l} by Eq (2). Its Euclidean norm satisfies:

\|\mathbf{S}\| \le \sqrt{l-1} \cdot 2^{n+l} = O(2^{n+l})

According to the lattice basis H, the determinant of the lattice \mathcal{L} is:

\det(\mathcal{L}) = a_1^{l-2} \approx (2^{2n})^{l-2} = 2^{2n(l-2)}

By the Gaussian heuristic, the expected shortest vector length in the lattice \mathcal{L} is:

\lambda_1(\mathcal{L}) \approx \sqrt{\frac{l-1}{2\pi e}} \cdot \det(\mathcal{L})^{1/(l-1)} = O\left( 2^{\frac{2n(l-2)}{l-1}} \right)

Obviously, when l > 3 we have 2^{n+l} < 2^{\frac{2n(l-2)}{l-1}}, and for the small values of l used in the attack, the gap between the Gaussian-heuristic length and the target vector length widens as l grows.

Therefore, the target vector is the shortest vector in the lattice provided l > 3.
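The exponent comparison in the proof can be checked numerically for n = 100 (a small sanity check of ours, not part of the paper):

```python
# Sanity check of the exponent inequality behind Lemma 1 for n = 100:
# the target vector has about n + l bits, while the Gaussian heuristic
# predicts a shortest vector of about 2n(l-2)/(l-1) bits.
n = 100
for l in range(4, 51):
    target_exp = n + l                    # bit size of the target vector (Eq 2)
    gh_exp = 2 * n * (l - 2) / (l - 1)    # bit size from the Gaussian heuristic
    assert target_exp < gh_exp
print("gap in bits at l = 10:", 2 * n * 8 / 9 - (n + 10))
print("gap in bits at l = 20:", 2 * n * 18 / 19 - (n + 20))
```

The gap is already tens of bits at l = 10, which is why plain LLL suffices to pull out the target vector.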

Remark 1. Note that the LLL algorithm guarantees that the first reduced basis vector \mathbf{b}_1 satisfies \|\mathbf{b}_1\| \le 2^{(l-2)/4} \cdot \det(\mathcal{L})^{1/(l-1)} = O(2^{2n}). It is well known that the LLL algorithm performs well when the dimension of the lattice is smaller than 50. By Lemma 1, one can recover the target vector \mathbf{S} with the LLL algorithm when l is not large. As the gap between the length of the target vector and the expected length from the Gaussian heuristic grows, the LLL algorithm recovers the target vector ever more easily. This is validated by the experimental data.

Our approach shares similarities with the one presented in [21], yet it differs in two aspects. Firstly, our lattice has a lower dimension. Secondly, we offer a more rigorous theoretical analysis ensuring the lattice’s ability to find the target vector.

3.2 Recover the equivalent key

It is well known that one can solve a super-increasing knapsack problem with O(n) arithmetic operations. Shamir's method aims to recover the whole group of equivalent keys and then the plaintext. In this paper, we propose a method that recovers the whole group of equivalent keys except for a small leading part, and then recovers most of the plaintext from the tail. After that, one can easily search for the remaining part of the plaintext, because the size of the residual knapsack problem is quite small.

From the last subsection, one can obtain S_2, \ldots, S_l and k_1', k_2', \ldots, k_l' by using the LLL lattice reduction algorithm. Note that k_i' may or may not equal the original k_i; as long as it satisfies the relationship S_i = a_i k_1' - k_i' a_1, this does not affect the attack. For convenience, we will not distinguish k_i' from k_i in the following, and write k_i for both.

Interestingly, we observe that there exists a fixed integer i_0 such that the sequence S_{i_0}, S_{i_0+1}, \ldots, S_n forms a super-increasing sequence, which is exactly the property of the equivalent key we are looking for. Therefore, we regard S_i as the private key element b_i. The relationship between b_i and a_i is Eq 1; for S_i and a_i, Eq (5) has the same structure. Therefore, we obtain the equivalent key:

b = (S_1, S_2, \ldots, S_{i_0}, \ldots, S_n)

U = k_1

M = a_1

Lemma 2. With the parameters defined above, there exists an integer i_0 such that S_{i_0}, S_{i_0+1}, \ldots, S_n forms a super-increasing sequence.

Proof: It can be derived from Eqs (1) and (5) that

M S_i = b_i a_1 - b_1 a_i

As long as it is proved that M S_i is a super-increasing sequence, we get that S_i is also a super-increasing sequence. We calculate

M S_{i+1} - \sum_{k=1}^{i} M S_k = b_{i+1} a_1 - b_1 a_{i+1} - \sum_{k=1}^{i} (b_k a_1 - b_1 a_k) = a_1 \left( b_{i+1} - \sum_{k=1}^{i} b_k \right) + b_1 \left( \sum_{k=1}^{i} a_k - a_{i+1} \right) \qquad (6)

Note that b_i forms a super-increasing sequence, so a_1 (b_{i+1} - \sum_{k=1}^{i} b_k) > 0. For the second term, there exists an integer i_0 such that for any i > i_0,

\sum_{k=1}^{i} a_k > a_{i+1}. \qquad (7)

In the following we prove that when i \ge 10, Eq 7 holds with probability close to 1.

Consider a sequence of i independent random variables a_1, a_2, \ldots, a_i, where each a_k (k = 1, 2, \ldots, i) is a random integer uniformly chosen from the discrete interval [0, M-1]. Let R_i = a_1 + a_2 + \cdots + a_i denote the sum of these i random integers. Each a_k is a discrete random variable whose expectation and variance follow from the properties of the uniform distribution. Specifically, the expectation of a_k is:

E(a_k) = \frac{M-1}{2},

which is the mean of the uniform distribution over [0, M-1]. The variance of a_k is:

\mathrm{Var}(a_k) = \frac{M^2 - 1}{12}.

Since R_i is the sum of i independent and identically distributed random variables, the expectation and variance of R_i are:

E(R_i) = i \cdot E(a_k) = i \cdot \frac{M-1}{2},

and

\mathrm{Var}(R_i) = i \cdot \mathrm{Var}(a_k) = i \cdot \frac{M^2 - 1}{12}.

To compute P(R_i > M), we use the Central Limit Theorem (CLT), which states that the sum of a sufficiently large number of independent and identically distributed random variables converges to a normal distribution. Consequently, R_i can be approximated as:

R_i \sim \mathcal{N}(E(R_i), \mathrm{Var}(R_i)).

By standardizing R_i into the standard normal variable Z, we write:

Z = \frac{R_i - E(R_i)}{\sqrt{\mathrm{Var}(R_i)}}.

The probability P(R_i > M) is then expressed as:

P(R_i > M) = P\left( Z > \frac{M - E(R_i)}{\sqrt{\mathrm{Var}(R_i)}} \right).

Substituting E(R_i) = i \cdot \frac{M-1}{2} and \mathrm{Var}(R_i) = i \cdot \frac{M^2-1}{12}, the argument of the standard normal cumulative distribution function (CDF) becomes:

\frac{M - i \cdot \frac{M-1}{2}}{\sqrt{i \cdot \frac{M^2-1}{12}}}.

Finally, the probability P(R_i > M) is expressed as:

P(R_i > M) = 1 - \Phi\left( \frac{M - i \cdot \frac{M-1}{2}}{\sqrt{i \cdot \frac{M^2-1}{12}}} \right),

where \Phi(x) denotes the CDF of the standard normal distribution. This formulation gives the probability that the sum of the i random integers exceeds M, based on their uniform distribution over [0, M-1].

For the specific case where i = 10 and M is a 200-bit integer, substituting these values into the formula gives the standardized value:

Z = \frac{M - 10 \cdot \frac{M-1}{2}}{\sqrt{10 \cdot \frac{M^2-1}{12}}}.

Numerically approximating this for M = 2^{200}, we find:

P\left( \sum_{k=1}^{i} a_k > a_{i+1} \right) \ge P(R_{10} > M) \approx 0.999994.

It should be noted that the Central Limit Theorem is most accurate when the number of summands i is sufficiently large. For i = 10, the approximation is reasonable but not exact. Nonetheless, for practical purposes the CLT-based approximation remains highly effective, particularly for large M, where the relative error becomes negligible.

In summary, the above analysis shows that when i \ge 10, \sum_{k=1}^{i} a_k - a_{i+1} > 0 holds with probability close to 1.
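The CLT estimate is easy to reproduce numerically (an illustration of the computation above, using Python's erfc for the normal tail):

```python
import math

# Evaluate the CLT estimate of P(R_i > M) for i = 10 draws
# uniform on [0, M-1], with M a 200-bit modulus.
i, M = 10, 2 ** 200
mean = i * (M - 1) / 2
std = math.sqrt(i * (M ** 2 - 1) / 12)
z = (M - mean) / std                    # standardized threshold, about -4.38
p = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z > z) = 1 - Phi(z)
print(round(p, 6))  # ~0.999994
```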

Remark 2. Here we use another method to give a lower bound on the probability that Eq 7 holds. If \sum_{k=1}^{i} a_k > M, then \sum_{k=1}^{i} a_k > a_{i+1} must be true. A sufficient condition for \sum_{k=1}^{i} a_k > M is that at least two of a_1, a_2, \ldots, a_i are greater than M/2. Since a_i \equiv b_i W \pmod{M} and M is very large, the a_i can be approximately regarded as uniformly randomly selected from [0, M-1]. Then it is easy to calculate

P\left( \sum_{k=1}^{i} a_k > a_{i+1} \right) \ge 1 - \frac{i+1}{2^i}.

According to the above analysis, this probability bound rises as i increases, and when i = 10, the probability that Eq 7 holds is not less than 98.9%.
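A quick Monte Carlo simulation (our own sanity check, with the a_k modeled as uniform on [0, M-1] as in the remark) confirms that the empirical frequency comfortably exceeds this lower bound:

```python
import random

# Monte Carlo check of Remark 2's lower bound
# P(sum_{k<=i} a_k > a_{i+1}) >= 1 - (i+1)/2^i.
random.seed(0)
i, M, trials = 10, 2 ** 200, 20_000
hits = sum(
    sum(random.randrange(M) for _ in range(i)) > random.randrange(M)
    for _ in range(trials)
)
rate = hits / trials
bound = 1 - (i + 1) / 2 ** i          # = 0.98926... for i = 10
print(rate, bound)
```

In practice the empirical rate is close to the CLT estimate, i.e. far above the 98.9% lower bound.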

Based on this parameter setting, one can recover the plaintext bits (m_{i_0}, m_{i_0+1}, \ldots, m_n). For the remaining unknown part m_1, m_2, \ldots, m_{i_0-1}, we solve a knapsack problem of size i_0 - 1, i.e.

\sum_{i=1}^{i_0-1} a_i m_i = c - \sum_{i=i_0}^{n} a_i m_i

This size-reduced problem is easily solved, because in practical cryptanalysis i_0 is usually not larger than 5, with probability exceeding 96%. See the experimental section for details.

It can be seen from Remark 2 that the probability lower bound corresponding to a constant value of i_0 is independent of n and is O(1), so the complexity of exhaustively enumerating m_1, m_2, \ldots, m_{i_0-1} is also O(1). According to Lemma 2 and Remark 2, the theoretical probability of i_0 \le 10 is 99.9994%, and the probability lower bounds for i_0 \le 10 and i_0 \le 20 are 98.9258% and 99.9938%, respectively. In our experiments, appropriate parameter settings keep i_0 within 10, so 10 can be regarded as the practical upper bound of i_0. By Remark 2, the failure probability corresponding to i_0 decreases exponentially as i_0 increases. Therefore, the theoretical upper bound of i_0 is a small integer, and an exhaustive search of this scale is easy to perform.

Finally, we recover the whole plaintext of the basic Merkle-Hellman cryptosystem. The whole description of our attack is given in Algorithm 1 below.

Algorithm 1. Our algorithm for breaking basic Merkle-Hellman cryptosystem.

Require: Public key a and Ciphertext c.

Ensure: Message m.

1: Construct the lattice basis matrix H described in Sect. 3.1.

2: Invoke the LLL algorithm to obtain a reduced basis.

3: Select the last coefficient corresponding to the shortest non-zero vector as k_1.

4: Let n be the length of the public key.

5: U ← k_1, M ← a_1.

6: for i = 1 to n do

7:   b_i ← a_i · U mod M

8: end for

9: Determine i_0 such that the sequence b_{i_0}, b_{i_0+1}, ..., b_n forms a super-increasing sequence.

10: c′ ← c · U mod M.

11: Obtain m_{i_0}, m_{i_0+1}, ..., m_n by solving the super-increasing knapsack problem with weights b_{i_0}, ..., b_n and target c′.

12: Solve the size-reduced knapsack problem \sum_{i=1}^{i_0-1} a_i m_i = c - \sum_{i=i_0}^{n} a_i m_i by enumerating m_i, 1 ≤ i ≤ i_0 − 1.

The most time-consuming part of our method is invoking the LLL algorithm to recover k_1. The lattice basis matrix H is l × (l-1), and the complexity of the classical LLL algorithm on H is O(l^6 L^3). Since l is a small constant and L = O(n), the time complexity of the attack is O(n^3).
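To make the pipeline concrete, here is an end-to-end sketch of Algorithm 1 on a toy instance (our own textbook-sized numbers). Since the toy lattice step is trivial, the LLL call is replaced by computing k_1 directly from the private key, standing in for the value that the lattice reduction would recover:

```python
from itertools import product

# Toy instance (textbook-sized; real parameters use n = 100)
b_priv = [2, 7, 11, 21, 42, 89, 180, 354]   # private super-increasing key
M_priv, W = 881, 588
a = [(bi * W) % M_priv for bi in b_priv]    # public key
x = [0, 1, 1, 0, 0, 0, 0, 1]                # plaintext (unknown to attacker)
c = sum(ai for ai, xi in zip(a, x) if xi)   # ciphertext

# Steps 1-3 stand-in: k_1 would come out of the LLL step; here we derive
# it from the private key via Eq 1 (b_1 = a_1*U - k_1*M).
U_priv = pow(W, -1, M_priv)
k1 = (a[0] * U_priv - b_priv[0]) // M_priv

# Steps 4-8: equivalent key (U, M) = (k_1, a_1), b_i = a_i*U mod M
U, M = k1, a[0]
b = [(ai * U) % M for ai in a]

# Step 9: smallest t with b[t:] super-increasing (t = i_0 - 1, 0-indexed)
def is_superincreasing(seq):
    total = 0
    for v in seq:
        if v <= total:
            return False
        total += v
    return True

t = next(t for t in range(len(b)) if is_superincreasing(b[t:]))

# Steps 10-11: greedy-solve the super-increasing tail of c' = c*U mod M
m, s = [0] * len(b), (c * U) % M
for i in range(len(b) - 1, t - 1, -1):
    if s >= b[i]:
        m[i], s = 1, s - b[i]

# Step 12: enumerate the first t bits until the full sum matches c
for head in product([0, 1], repeat=t):
    cand = list(head) + m[t:]
    if sum(ai * mi for ai, mi in zip(a, cand)) == c:
        m = cand
        break

print(m == x)  # True
```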

The BKZ algorithm [22] has stronger reduction capabilities than the LLL algorithm in practice. However, the dimension of our lattice is a small constant, and the LLL algorithm easily finds the target vector; BKZ would be overkill here and brings no improvement. Therefore, we use the LLL algorithm instead of the BKZ algorithm in our method.

4 Experiments

According to Lemma 2, there exists an integer i_0 such that the sequence S_{i_0}, S_{i_0+1}, \ldots, S_n forms a super-increasing sequence. The experimental data validate Lemma 2. We find that the first few elements S_i, i < i_0, are not always super-increasing. This leads to incorrect decryption of the first few bits when solving this partially super-increasing knapsack problem directly. Fortunately, the number of erroneous bits is bounded. When the plaintext is 100 bits, errors are usually concentrated in the first 5 bits and rarely extend to around 10 bits; these can be obtained exhaustively by solving the size-reduced knapsack problem. Note that the S_1 given by our algorithm is always 0, so the ciphertext is the same whether the coefficient x_1 is 0 or 1; the first bit of the plaintext can therefore only be obtained by exhaustion.

In the experiment, we choose n = 100, l = 10, 20, and a message size of 100 bits. We define T as the number of message bits the algorithm successfully recovers. To check the effect of our method, we generate 1000 sets of public keys and ciphertexts of the Merkle-Hellman knapsack cryptosystem for each group of parameters. prop(T ≥ c) denotes the proportion of instances in which at least c bits are recovered.

In Case I, we do not perform the final step of Algorithm 1, i.e. solving the size-reduced knapsack problem. The experimental data are shown in Table 1. When n = 100 and l = 10, the success rate of recovering 99 bits is stable at 84.0%, and the success rate of recovering 90 bits reaches 98.9%. When l = 20, the success rate of recovering 99 bits is stable above 85%, and the success rate of recovering 90 bits reaches 99.5%. Note that when T = 99 we consider the whole plaintext recovered, because the first bit can only be recovered by enumeration.

Table 1. Success rate of Case I.

n              100            200            300
l              10     20      10     20      10     20
prop(T ≥ 90)   98.9%  99.5%   99.4%  100%    99.7%  99.8%
prop(T ≥ 99)   84.0%  85.0%   82.4%  86.8%   83.7%  81.3%

Moreover, Table 1 also reports the performance of the algorithm for n = 200 and n = 300. When the scale of the system becomes larger, our method still achieves excellent results: the probability of recovering at least 90 bits of the plaintext remains over 99%. As our theoretical analysis predicts, the dimension of the lattice H does not change as n increases, so the probability of successfully recovering the private key does not change significantly.

The conclusion we can draw is that the more public key elements we use, the more plaintext bits T are correctly recovered; however, l = 20 is enough. At that point, the probability of failing to recover 90% of the plaintext bits is negligible, and the remaining fewer than 10 plaintext bits can be obtained through exhaustive search.

In Case II, we run the whole Algorithm 1. For the parameters n = 100, l = 10, 20, and a 100-bit message, the success rate is 100%: we recover the full plaintext for each set of parameters. The experimental data show that i_0 is mostly concentrated in 1~5 for l = 10, 20. The distribution of i_0 is shown in Table 2.

Table 2. Distribution of i0.

i0 1 2 3 4 5 6 7 8 9 10 11~20 20+
l = 10 720 0 155 51 36 11 6 7 6 0 0 8
l = 20 741 0 153 49 30 16 6 5 0 0 0 0

We can see that in the 1000 sets of data, more than 96% of the sequences S form a super-increasing sequence after removing at most the first five entries, whether l is 10 or 20. The case 20+ indicates that decryption failed; this occurs 8 times out of 1000 when l = 10 and never when l = 20. Since the first element of S is 0, the case i_0 = 2 does not occur. The experimental data also validate Lemma 2.

We ran Shamir's algorithm and our algorithm with the same computing power; the running times are shown in Table 3. As we can see, the running time of our algorithm is very short, much more efficient than Shamir's. Moreover, as n increases, the running time of our algorithm is barely affected: the dimension of the lattice used to recover the intermediate variables does not grow with n, so increasing n has very little impact on the cost of the lattice basis reduction. Our algorithm therefore still performs well when n is large.

Table 3. Comparison of running time.

n     Running time of Shamir's algorithm (h)   Running time of our algorithm (s)
100   5.84×10^6                                0.13
200   1.01×10^9                                0.16
300   1.72×10^10                               0.21

In summary, the more information from the public key we use, the higher the success rate. From Table 2, one can conservatively choose l = 20 for the typical parameters of the Merkle-Hellman cryptosystem. In this case, i_0 is smaller than 10 with overwhelming probability, so one can recover the whole plaintext by solving a reduced knapsack problem of size smaller than 10.

5 Conclusion

Our paper presents a novel approach to cryptanalyzing the Merkle-Hellman Knapsack Cryptosystem, one of the earliest public-key cryptosystems. Our improvement lies in an enhanced algorithm that significantly reduces the time complexity for key recovery compared to Shamir’s method. By leveraging the LLL lattice reduction algorithm on a specially constructed lattice, we efficiently identify a partial super-increasing sequence, which serves as the equivalent private key. This approach allows for the rapid decryption of the plaintext message, with experimental data indicating a recovery time of less than 1 second on a standard laptop for typical parameters of the Merkle-Hellman cryptosystem.

The significance of this work mainly lies in two aspects: it not only provides a more efficient method for analyzing the security of Merkle-Hellman cryptosystem but also contributes to the broader field of cryptanalysis by offering insights into the potential vulnerabilities of knapsack-based cryptographic schemes. This could have implications for the ongoing development of post-quantum cryptographic algorithms, as the knapsack problem’s resistance to quantum attacks makes it a promising candidate for future secure communication protocols.

For future research, we propose several directions. First, it would be beneficial to explore the application of our method to other knapsack-based schemes and assess its generalizability. Besides, further optimization of the LLL algorithm for our specific use case could lead to even more significant performance gains. Lastly, the development of new cryptographic schemes that build upon the lessons learned from attacks like ours is essential to staying ahead in the ever-evolving field of cryptography.

Supporting information

S1 File. Main code.

The experimental code of the algorithm in this paper and the environment configuration required to run the code.

(zip)

pone.0322726.s001.zip (18.6KB, zip)

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This work was supported by the National Key Research and Development Program of China (No. 2024YFB4504700, 2022YFB2902202), the National Key Laboratory of Security Communication Foundation (No. 2024,6142103042408), the Youth Science and Technology Innovation Talent Support Program Project of Beijing University of Posts and Telecommunications (No. 2023ZCJH10).

References

  • 1. Merkle R, Hellman M. Hiding information and signatures in trapdoor knapsacks. IEEE Trans Inform Theory. 1978;24(5):525–30. doi: 10.1109/tit.1978.1055927
  • 2. Lagarias JC, Odlyzko AM. Solving low-density subset sum problems. J ACM. 1985;32(1):229–46. doi: 10.1145/2455.2461
  • 3. Coster MJ, LaMacchia BA, Odlyzko AM, Schnorr CP. An improved low-density subset sum algorithm. In: Workshop on the Theory and Application of Cryptographic Techniques. Springer; 1991. pp. 54–67.
  • 4. Schroeppel R, Shamir A. A T = O(2^{n/2}), S = O(2^{n/4}) algorithm for certain NP-complete problems. SIAM J Comput. 1981;10(3):456–64. doi: 10.1137/0210033
  • 5. Horowitz E, Sahni S. Computing partitions with applications to the knapsack problem. J ACM. 1974;21(2):277–92. doi: 10.1145/321812.321823
  • 6. Howgrave-Graham N, Joux A. New generic algorithms for hard knapsacks. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer; 2010. pp. 235–56.
  • 7. Kantour N, Bouroubi S. Cryptanalysis of Merkle-Hellman cipher using parallel genetic algorithm. Mobile Netw Appl. 2019;25(1):211–22. doi: 10.1007/s11036-019-01216-8
  • 8. Kochladze Z, Beselia L. Cracking of the Merkle–Hellman cryptosystem using genetic algorithm. Trans Sci Technol. 2016;3(1-2):291–6.
  • 9. Grari H, Lamzabi S, Azouaoui A, Zine-Dine K. Cryptanalysis of Merkle-Hellman cipher using ant colony optimization. Int J Artif Intell (ISSN 2252-8938). 2021.
  • 10. Shamir A. A polynomial time algorithm for breaking the basic Merkle-Hellman cryptosystem. In: 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982). 1982. pp. 145–52. doi: 10.1109/sfcs.1982.5
  • 11. Liu J, Bi J, Xu S. An improved attack on the basic Merkle–Hellman knapsack cryptosystem. IEEE Access. 2019;7:59388–93. doi: 10.1109/access.2019.2913678
  • 12. Kobayashi K, Tadaki K, Kasahara M, Tsujii S. A knapsack cryptosystem based on multiple knapsacks. In: 2010 International Symposium on Information Theory & Its Applications. 2010. pp. 428–32. doi: 10.1109/isita.2010.5649307
  • 13. Murakami Y, Hamasho S, Kasahara M. A public-key cryptosystem based on decision version of subset sum problem. In: 2012 International Symposium on Information Theory and its Applications. IEEE; 2012. pp. 735–9.
  • 14. Thangavel M, Varalakshmi P. A novel public key cryptosystem based on Merkle-Hellman knapsack cryptosystem. In: 2016 Eighth International Conference on Advanced Computing (ICoAC). IEEE; 2017. pp. 117–22.
  • 15. Zhang W, Wang B, Hu Y. A new knapsack public-key cryptosystem. In: 2009 Fifth International Conference on Information Assurance and Security. 2009. pp. 53–6. doi: 10.1109/ias.2009.300
  • 16. Lenstra AK, Lenstra HW Jr, Lovász L. Factoring polynomials with rational coefficients. Math Ann. 1982;261(4):515–34. doi: 10.1007/bf01457454
  • 17. Nguyen PQ, Stehlé D. An LLL algorithm with quadratic complexity. SIAM J Comput. 2009;39(3):874–903. doi: 10.1137/070705702
  • 18. Bansal N, Garg S, Nederlof J, Vyas N. Faster space-efficient algorithms for subset sum and k-sum. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing. 2017. pp. 198–209. doi: 10.1145/3055399.3055467
  • 19. Becker A, Coron JS, Joux A. Improved generic algorithms for hard knapsacks. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer; 2011. pp. 364–85.
  • 20. Lagarias JC. Performance analysis of Shamir's attack on the basic Merkle-Hellman knapsack cryptosystem. In: International Colloquium on Automata, Languages, and Programming. Springer; 1984. pp. 312–23.
  • 21. Galbraith SD. Mathematics of public key cryptography. Cambridge University Press; 2012.
  • 22. Schnorr CP, Euchner M. Lattice basis reduction: improved practical algorithms and solving subset sum problems. Math Program. 1994;66(1–3):181–99. doi: 10.1007/bf01581144
