Author manuscript; available in PMC: 2023 Apr 4.
Published in final edited form as: IEEE Trans Dependable Secure Comput. 2020 Apr 2;19(1):579–590. doi: 10.1109/tdsc.2020.2984219

Efficient and Precise Secure Generalized Edit Distance and Beyond

Ruiyu Zhu 1, Yan Huang 2
PMCID: PMC10072857  NIHMSID: NIHMS1772631  PMID: 37020740

Abstract

Secure string comparison under non-linear metrics such as edit distance and its variations is an important building block of many applications, including patient genome matching and text-based intrusion detection. Despite the significance of these string metrics, computing them in a provably secure manner is very expensive. In this paper, we improve the performance of secure computation of these string metrics without sacrificing security, generality, composability, or accuracy. We explore a new design methodology that allows us to reduce the asymptotic cost by a factor of O(log n) (where n denotes the input string length). In our experiments, we observe up to an order-of-magnitude savings in time and bandwidth compared to the best prior results. We also extend our semi-honest protocols to work in the malicious model, yielding by far the most efficient actively-secure protocols for computing these string metrics.

1. Introduction

String comparison is a useful primitive that finds applications in many real-world scenarios. Among the metrics for comparing strings, non-linear metrics such as edit distance are the most interesting thanks to their versatility in adapting their cost models to field applications. For example, when genomes are denoted by strings, it is customary to use non-linear metrics such as weighted edit distance and Needleman-Wunsch distance [1] to help diagnose genetic diseases [2], [3], [4]. In many other scenarios where the strings may represent file segments, sequences of system calls, or snippets of network traffic, these non-linear metrics are important enabling techniques of computer immunology [5] and intrusion detection [6], [7].

Often, the input strings in these applications carry highly sensitive information and are thus intended to stay encrypted throughout the computation. However, securely computing these non-linear metrics is a highly challenging research task. Researchers have intensively studied secure protocols to match strings based on edit distance, an epitome metric of its kind. When designing these protocols, several properties are vitally important.

First, one would prefer the protocols to be generic. This implies a number of desirable features: 1) The resulting protocol is ready to be used as a subroutine in another secure protocol using standard composition methods; 2) It is easy to modify the protocol to also work with other variants of string metrics; 3) The security guarantees can be upgraded, e.g., from semi-honest to covert or malicious threat models, using well-known cryptographic techniques.

Second, it is desirable for the protocols to produce accurate results. Imprecise results can cause false decisions that will undermine the value of some security-critical systems. Secure protocols that can always provide accurate results irrespective of the secret input data can be used in many very different scenarios.

Third, the protocols are expected to be rigorously proven secure and free of leakage, which is necessary for safe use of such protocols in real-world applications.

Finally, we surely wish to have protocols as efficient as possible, such that they can be adopted in more performance-critical settings.

Unfortunately, existing protocols cannot yet meet all the design expectations above. The heuristics-based protocols [8], [9] are efficient, but miss the first three design requirements entirely. On the other hand, while protocols using state-of-the-art generic garbled circuits [10] or ABY [11] are generic, accurate, and proven secure, they are very expensive.

1.1. Methodology and Threat Model

Motivated by the limitation of existing protocols, we ask:

Can we design secure string comparison protocols that are as secure, accurate and generic as required by the standard definition of secure computation, while being significantly more efficient than the best existing generic solutions?

In this work, we answer this question with a new methodology. We adapt the garbling scheme itself to the public properties of target computations. In the context of computing string-comparison metrics, for example, we made two key observations and exploited them in our protocol design: (a) There are useful public patterns in such computations that correlate the secret values on intermediate wires. E.g., in edit distance, the two input numbers to the min circuit will differ by at most 1. (b) Many parts of string-comparison computations can be realized more efficiently using arithmetic (instead of binary) circuits. By exploiting these insights, we are able to securely compute a number of representative string-metrics significantly more efficiently than the best previous secure protocols.

Threat Models.

In this work, we consider both semi-honest and malicious adversaries. We will discuss the semi-honest protocols first and then show how to upgrade them to thwart full-malicious attacks in Section 5.

1.2. Contributions

We propose a new design methodology for building efficient privacy-preserving computations. We customize the garbling to exploit public properties of the target computations. We apply this methodology in developing secure protocols for several representative string-comparison metrics. Like the protocols of [10], [14], [15], our protocols work in the Random Oracle Model. Unlike prior works, our approach leverages low-cost bounded-input comparison, minimum, and table-lookup, while keeping arithmetic addition free. The overall complexity of our secure string-comparison protocols is $O(n^2)$ (with n being the length of each input string), in contrast to $O(n^2 \log n)$ for prior protocols using the best previous garbling schemes [10], [14]. We formally prove the security of our scheme (Section 3.2) and present ways to extend our garbling schemes to handle arbitrary functions by tethering them to binary garbled circuits (Section 4).

We have strengthened the semi-honest protocols into efficient actively-secure string-metrics computation protocols (Section 5), which are by far the best of their kind. Our protocols are equipped with state-of-the-art cut-and-choose strategies; for improved performance, their cut-and-choose parameters can be selected based on the actual cost ratio between checking and evaluating a GC. Security of our protocols is guaranteed as long as one correct GC is evaluated.

We have experimentally evaluated our approach on a range of string-comparison metrics including edit distance, weighted edit distance, Needleman-Wunsch distance, LCS, and HCS. In the semi-honest model, our protocols run up to 16 times faster and use significantly less bandwidth than the best existing GC-based protocols (see Table 1). In the malicious model, our protocols achieve $2^{-40}$ statistical and $2^{-127}$ computational security with only about 20x (or 10x) the time and about 15x (or 8.5x) the bandwidth of their semi-honest versions in the LAN (or WAN) setting (see Table 2). Unlike the heuristics-based protocols [8], [9], our approach is generic, accurate, and proven secure, and does not use any public reference. As a first step in this direction of research, our findings may shed some light on designing other application-specific MPC protocols in the future.

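TABLE 1:

Performance Highlights (semi-honest model)

| Metric | Protocol | Time, LAN (s) | Time, WAN (s) | B/W (GB) |
|---|---|---|---|---|
| Edit Distance | Best Prior | 286 | 1776 | 39.4 |
| Edit Distance | This Work | 23.7 | 178 | 4.09 |
| Weighted ED | Best Prior | 360 | 2257 | 50.1 |
| Weighted ED | This Work | 83.3 | 625 | 14.3 |
| Needleman-Wunsch | Best Prior | 1030 | 6747 | 155 |
| Needleman-Wunsch | This Work | 142 | 1073 | 25.6 |
| LCS | Best Prior | 202 | 1224 | 27.1 |
| LCS | This Work | 18.9 | 135 | 3.07 |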

Tested with 127-bit computational security. Times are in seconds and B/W in GB. Computation inputs are two 4000-nucleotide genomes. The weight tables used in Weighted ED and Needleman-Wunsch are given in Figure 1. “Best Prior” results are measured on efficient implementations based on the ideas of Huang et al. [12] and emp-toolkit [13], an updated framework integrating Free-XOR, AESNI, and Half-Gates. Detailed experiment setup is given in Section 6.

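TABLE 2:

Performance of Actively-Secure Protocols

| Setting | Quantity | Edit Dist. | Wgtd. ED | N. Wunsch | LCS |
|---|---|---|---|---|---|
| LAN | $c_{eval}/c_{chk}$ | 3.31 | 4.82 | 3.25 | 2.98 |
| LAN | n | 45 | 49 | 45 | 45 |
| LAN | E(k) | 15.00 | 13.47 | 15.01 | 14.99 |
| LAN | Time (min) | 9.41 | 25.14 | 57.23 | 7.71 |
| LAN | B/W (GB) | 61.5 | 193.2 | 369.0 | 46.1 |
| WAN | $c_{eval}/c_{chk}$ | 23.97 | 39.06 | 23.57 | 20.70 |
| WAN | n | 93 | 123 | 93 | 93 |
| WAN | E(k) | 8.91 | 7.93 | 8.91 | 8.91 |
| WAN | Time (min) | 35.38 | 106.62 | 213.52 | 27.92 |
| WAN | B/W (GB) | 36.6 | 113.8 | 219.1 | 27.5 |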

$c_{eval}/c_{chk}$ denotes the observed cost ratio between Evaluate (Step 5) and Check (Step 4), which can vary with application and hardware/network conditions. n and k are cut-and-choose parameters picked based on this cost ratio. As k is chosen probabilistically, E(k) denotes the expected value of k. Time is in minutes and B/W in GB.

1.3. Related Work

1.3.1. Heuristics-based private string matching

Researchers have proposed some interesting heuristics to approximate best matches of low-entropy strings by their edit distances. Two seminal works of this kind are by Wang et al. [9] and Asharov et al. [8]. Wang et al. estimated the edit distance of human genomes by solving set-difference-size problems that were efficiently sketched using a public reference genome. Asharov et al. divided genome strings into short segments and then approximated edit-distance-based match (footnote 1). Although these protocols offered high efficiency, they also suffered from leakage, accuracy, and generality issues: (a) They assume a weaker threat model that leaks more than what is allowed by the standard definition of secure computation, and it is hard to argue that the leaked information is not what an attacker wants. (b) They produce input-data-specific errors in their results, making them inapplicable in scenarios where errors are less tolerable. (c) It is hard to use these protocols as generic building blocks in other secure protocols or on inputs other than low-entropy genomes. It is also unclear how to modify them to compute other variant string metrics such as Needleman-Wunsch or LCS, or to work against stronger adversaries. Moreover, both protocols rely on a “good” public reference string, which may not be available in many use cases. We note that the quality of the reference string can severely affect the accuracy and cost of these protocols (see experiments in Section 6.1), while methods of picking “good” references are yet to be studied.

1.3.2. Garbled-circuit-based approach

Generic protocols using optimized garbled circuits (GC) were also used to compute edit distance [12], [16]. This type of protocol always produces accurate results, offers strong security guarantees satisfying the standard definition of security for MPC, and is generally applicable to other string-comparison metrics. In addition, these GC-based protocols can be used as black-box components in larger secure computations. There are standard practical transformations to upgrade these protocols to work in the presence of active adversaries.

On the flip side, the costs of such protocols are prohibitive, partly because of the large constant factor blowup from translating the computation into binary circuits. We have implemented GC-based protocols to securely compute edit distance, weighted edit distance, Needleman-Wunsch, LCS and HCS. In the baseline implementation, we used all applicable state-of-the-art optimizations including fixed-key hardware AES [17], [18], Half-Gate garbling [10], and free-XOR technique [19]. The performance of these protocols is reported as “Best Prior” row in Table 1 and the performance charts of Figure 3, Figure 4 in Section 6.1. These baseline performance numbers are already significantly better than any generic protocols found in the literature, since we used all possible state-of-the-art optimizations. Still, their performance wouldn’t be satisfactory in many practical settings.

Fig. 3: Edit Distance, Weighted Edit Distance, and Needleman-Wunsch. (κ = 127)

Fig. 4: LCS and HCS. (κ = 127)

1.3.3. Comparison with Ball-Malkin-Rosulek [14]

Ball, Malkin and Rosulek proposed a garbling scheme where plaintext signals are encoded in their CRT (Chinese Remainder Theorem) representations. The CRT representation encodes a plaintext value as elements in field $GF(p_1 \times \cdots \times p_n)$ where $p_1, \ldots, p_n$ are a number of distinct small primes. Their scheme can be considered as an extension of Half-Gate’s wire-label encoding, which is over the field $GF(2^{\kappa})$, with general projection gates. They show with calculation that this garbling scheme can be of theoretical interest in saving bandwidth for certain computations that consist of many high fan-in threshold and modular addition gates. However, they did not consider any practical end-to-end secure computation protocols or practical time efficiency. In contrast, we focus on a class of important string-comparison metric computations, discover some key properties of these computations, and advocate customizing the protocol to leverage these public properties for efficiency improvements. Our work considers both semi-honest and malicious adversaries. We show with experiments that our protocols are up to an order of magnitude better in both time and bandwidth than the best existing generic protocols.

In fact, an important technical distinction between the two works is that their garbling schemes rely on the point-and-permute technique to evaluate the garbled gates, while we use zero-tags to allow the evaluator to identify failed trial-decryptions. The point-and-permute technique is not compatible with the bounded-value projection technique which has significantly boosted the performance of our protocols. This is because if any garbled rows skip the transmission, the point-and-permute mechanism allows the evaluator to learn something about the secret permutation from observing how the (publicly-known) omitted entries are moved before and after the permutation.

1.3.4. Comparison with ABY [11] and ABY3 [20]

The ABY framework enables generic secure computations using one or a mixture of GMW-based Arithmetic/Binary circuits and Yao’s binary GC. However, it won’t outperform our protocols since it does not support bounded-value projection. Nor does it offer convenient upgrade to malicious model security as our protocols do. In fact, ABY cannot even outperform our GC-based baseline (see Section 6.3).

In comparison to ABY’s secret-share conversion techniques, we stress that our GC-based arithmetic encoding is very different from ABY’s GMW-based arithmetic encoding. Hence, the conversion methods we present in Section 4.1 differ from those of ABY in essential ways.

Mohassel and Rindal extended ABY to ABY3 for machine learning computations. However, ABY3 only works for a completely different threat model (3PC with honest majority) which is out of the scope of this paper.

1.3.5. Comparison with DKS+ [21]

The work by Dessouky et al. suggests that some custom-built circuits could benefit from efficient OT extension specific for short messages by a constant (and application-dependent) factor (2–4x for AES and PSI) of savings. However, they didn’t consider the string metrics that we study here and their idea doesn’t support bounded-value projection.

2. Background

Notations.

We let κ be the computational security parameter; let $a := b$ denote assigning the value of b to a; and let $x \leftarrow S$ denote assigning to x a uniformly random element of the set S.

2.1. Secure Garbling

First proposed by Yao [22], garbled circuits were later formalized by Bellare et al. [18] as a cryptographic primitive of independent interest. Following the notations of Bellare et al., a garbling scheme G is a 5-tuple (Gb, En, Ev, De, f) of algorithms, where Gb is an efficient randomized garbler that, on input $(1^k, f)$, outputs (F, e, d); En is an encoder that, on input (e, x), outputs X; Ev is an evaluator that, on input (F, X), outputs Y; De is a decoder that, on input (d, Y), outputs y. The correctness of G requires that for every $(F, e, d) \leftarrow Gb(1^k, f)$ and every x,

$$De(d, Ev(F, En(e, x))) = f(x).$$

Bellare et al. have proposed three security notions for garbling: privacy, obliviousness, and authenticity, which we summarize as below.

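  • Privacy: There exists an efficient $S_{prv}$ such that for all x,
    $$\{(F, X, d) : (F, e, d) \leftarrow Gb(1^k, f),\ X \leftarrow En(e, x)\} \approx \{S_{prv}(1^k, f, f(x))\},$$
    where “≈” symbolizes computational indistinguishability.

  • Obliviousness: There exists an efficient $S_{oblvs}$ such that ∀x,
    $$\{(F, X) : (F, e, d) \leftarrow Gb(1^k, f),\ X \leftarrow En(e, x)\} \approx \{S_{oblvs}(1^k, f)\}.$$
  • ϵ-Authenticity: For all efficient $A = (A_1, A_2)$,
    $$\Pr\left[\, Y \ne Ev(F, X) \text{ and } De(d, Y) \ne \bot \;:\; (f, x) \leftarrow A_1(1^k),\ (F, e, d) \leftarrow Gb(1^k, f),\ X \leftarrow En(e, x),\ Y \leftarrow A_2(1^k, F, X) \,\right] \le \epsilon.$$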

Optimizations have been proposed that improve garbling in many aspects, such as bandwidth [10], [14], [23], evaluator’s computation [23], memory consumption [12], and use of dedicated hardware [10], [17], [24]. State-of-the-art implementations of garbling schemes using AESNI can typically produce one garbled row of a garbled truth table roughly every 25 ns [13], [17], [24].

2.2. Edit Distance and Other Metric Variants

The edit distance (also known as Levenshtein distance) between any two strings s and t is the minimum number of edits needed to transform s into t, where an edit is typically one of three basic operations: insert, delete, and substitute. Algorithm 1 is a standard dynamic programming approach to compute the edit distance between two strings. The invariant is that $D_{i,j}$ always represents the edit distance between s[1..i] and t[1..j]. Lines 1–2 initialize the first column of the matrix D (the entries $D_{i,0}$) while lines 3–4 initialize the first row (the entries $D_{0,j}$). Within the main nested loops (lines 5–7), $D_{i,j}$ is set at line 7 to the smallest of $D_{i-1,j} + c_{ins}$, $D_{i,j-1} + c_{del}$, and $D_{i-1,j-1} + c_{sub}$, where $c_{ins}$, $c_{del}$, and $c_{sub}$ correspond to the costs of inserting, deleting, and substituting a single character (at any position). For basic edit distance, $c_{ins} := 1$, $c_{del} := 1$, and $c_{sub} := (s[i] = t[j])\ ?\ 0 : 1$, i.e., each single-character insert, delete, and substitute incurs one unit of cost while matching characters cost zero. Once the minimal edit distance is computed, it is easy to backtrack (from $D_{i,j}$) a sequence of edits that transform s[1..i] to t[1..j], e.g., for the purpose of deriving an optimal alignment.

Algorithm 1.

EditDistance(s, t)

1: for i := 0 to length(s) do
2:   $D_{i,0} := i \cdot c_{ins}$;
3: for j := 0 to length(t) do
4:   $D_{0,j} := j \cdot c_{del}$;
5: for i := 1 to length(s) do
6:   for j := 1 to length(t) do
7:     $D_{i,j} := \min(D_{i-1,j} + c_{ins},\ D_{i,j-1} + c_{del},\ D_{i-1,j-1} + c_{sub})$;
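As a plaintext reference (no cryptography involved), the recurrence of Algorithm 1 can be sketched in Python. The function name `edit_distance` and the callable cost parameters are illustrative choices, not part of the original presentation; passing constant or per-character cost functions lets the same loop cover weighted variants as well.

```python
def edit_distance(s, t,
                  c_ins=lambda ch: 1,
                  c_del=lambda ch: 1,
                  c_sub=lambda x, y: 0 if x == y else 1):
    """Plaintext dynamic program following Algorithm 1.

    D[i][j] holds the distance between s[:i] and t[:j].
    The cost callables are hypothetical hooks for weighted variants.
    """
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    # First column: D[i][0] accumulates c_ins (equals i * c_ins for unit cost).
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + c_ins(s[i - 1])
    # First row: D[0][j] accumulates c_del.
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + c_del(t[j - 1])
    # Main recurrence (line 7 of Algorithm 1).
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i - 1][j] + c_ins(s[i - 1]),
                          D[i][j - 1] + c_del(t[j - 1]),
                          D[i - 1][j - 1] + c_sub(s[i - 1], t[j - 1]))
    return D[n][m]
```

For example, `edit_distance("kitten", "sitting")` evaluates to 3 under the default unit costs.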

Weighted Edit Distance.

More generally, the $c_{ins}$, $c_{del}$, and $c_{sub}$ above can be adjusted to fit the goals of specific applications. For example, in diagnosing certain genetic diseases [2], [4], it is customary to set $c_{ins}$ and $c_{del}$ to integers between 5 and 10 while setting the substitution cost to 1. The rationale behind the cost gap is that insertions and deletions (called indels) occur much more rarely than substitutions in some application domains, so one adjusts the costs so that the changes are better captured by the editing model. For example, during DNA replication, indels are much rarer than substitutions, so we would expect a good alignment to contain proportionally fewer indels to reflect the natural cloning of DNA.

Needleman-Wunsch.

As the statistical models of various operations were refined with respect to the symbols involved in the mutations, researchers [25], [26], [27], [28] have found many good reasons to also adjust the costs cins, cdel, csub according to the specific characters to be inserted, deleted, or substituted. In this case, cins, cdel and csub can be viewed as functions over the alphabet of all possible characters. For example, for genomes, they can be encoded as one- and two-dimensional tables (Fig. 1). Note that although the weight tables are publicly known, lookups over the arrays have to be obliviously computed because the indices used to lookup are secret.

Fig. 1: Example weight tables of genomic Needleman-Wunsch

Longest Common Subsequence (LCS).

Unlike edit distance, the length of the longest common subsequence measures the similarity of two strings. Given strings s and t, the length of the longest common subsequence between them can be computed using dynamic programming similar to that for edit distance (Algorithm 2). Compared to Algorithm 1, the only two changes are the initialization values in lines 2 and 4, and the logic to derive $D_{i,j}$ (line 7). The invariant now is that $D_{i,j}$ always represents the length of LCS(s[1..i], t[1..j]).

Algorithm 2.

Longest common subsequence(s, t)

1: for i := 0 to length(s) do
2:   $D_{i,0} := 0$;
3: for j := 0 to length(t) do
4:   $D_{0,j} := 0$;
5: for i := 1 to length(s) do
6:   for j := 1 to length(t) do
7:     $D_{i,j} := \max(D_{i-1,j},\ D_{i,j-1},\ D_{i-1,j-1} + w_{i,j})$;

With basic LCS, the matching reward, $w_{i,j}$, is set to

$$w_{i,j} = \begin{cases} 1, & \text{if } s[i] = t[j] \\ 0, & \text{otherwise.} \end{cases}$$

Heaviest Common Subsequence (HCS).

As a generalization of LCS, researchers [29] have introduced the concept of heaviest common subsequence, just as Needleman-Wunsch generalizes edit distance. The idea is to let different characters reward differently when they match. Therefore, $w_{i,j}$ can be viewed as a matrix (to be indexed by s[i] and t[j]) where only the diagonal entries are positive while the rest of the matrix is filled with 0s.
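A plaintext sketch of Algorithm 2 that covers both LCS and HCS is given below. The function name and the `weight` callable are hypothetical; `weight(ch)` plays the role of the diagonal entry of the reward matrix for character `ch`, and the default of 1 recovers basic LCS.

```python
def heaviest_common_subsequence(s, t, weight=lambda ch: 1):
    """Plaintext dynamic program following Algorithm 2.

    With the default weight (1 per match) this returns the LCS length;
    with a per-character weight it returns the HCS weight.
    """
    n, m = len(s), len(t)
    # Row 0 and column 0 are initialized to 0 (lines 1-4 of Algorithm 2).
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Reward is positive only on a match (diagonal of the matrix).
            w = weight(s[i - 1]) if s[i - 1] == t[j - 1] else 0
            D[i][j] = max(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1] + w)
    return D[n][m]
```

For instance, `heaviest_common_subsequence("AGGTAB", "GXTXAYB")` evaluates to 4, corresponding to the common subsequence GTAB.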

3. The Semi-Honest Model

Next, we give our semi-honest string-comparison protocols.

3.1. Insights and Intuitions

First, we illustrate two important observations behind the design of our new garbling scheme.

Dominant Costs.

A dominant cost of solving the general edit distance problem lies in the oblivious computation of addition, equality (or table-lookup in general), and minimum. This is evident from the dynamic programming Algorithm 1. Therefore, it should be our foremost priority to make these oblivious computations efficient in our new garbling scheme.

Bounded Difference Values.

The edit distance computation makes a number of calls to the three-minimum function, which can be instantiated as two nested calls to the two-minimum function, i.e., $\min(a, b, c) = \min(\min(a, b), c)$. A key observation is that edit distances can be calculated such that all two-minimum gates are computed on inputs (a, b) for which $a - b$ is bounded by constants independent of the absolute values of a and b. This observation opens up an opportunity to speed up private edit distance computation. We exploit this opportunity by designing special two-minimum gadgets that only need to work for inputs of bounded difference, but run significantly more efficiently than generic minimum gates (which need to process all possible inputs).

Take basic edit distance as an example. We can show that every call to min(a, b) can be arranged so that $a - b \in \{-1, 0, 1, 2\}$. We can prove this fact as follows. First, because

$$\min(D_{i-1,j}+1,\ D_{i,j-1}+1,\ D_{i-1,j-1}+c_{sub}) = \min(\min(D_{i-1,j}+1,\ D_{i-1,j-1}+c_{sub}),\ D_{i,j-1}+1),$$

let $m_{i,j} = \min(D_{i-1,j}+1,\ D_{i-1,j-1}+c_{sub})$; our goal is then to show

$$(D_{i-1,j}+1) - (D_{i-1,j-1}+c_{sub}) \in \{-1, 0, 1, 2\}, \qquad (D_{i,j-1}+1) - m_{i,j} \in \{-1, 0, 1, 2\}.$$

Since all the quantities involved are integers, it suffices to show

$$-1 \le (D_{i-1,j}+1) - (D_{i-1,j-1}+c_{sub}) \le 2, \quad \text{and} \tag{1}$$
$$-1 \le (D_{i,j-1}+1) - m_{i,j} \le 2. \tag{2}$$

The triangle inequality of basic edit distance ensures

$$|D_{i-1,j} - D_{i-1,j-1}| \le 1, \tag{3}$$
$$|D_{i,j-1} - D_{i-1,j-1}| \le 1. \tag{4}$$

Thus,

$$|D_{i-1,j} - D_{i,j-1}| = |D_{i-1,j} - D_{i-1,j-1} - (D_{i,j-1} - D_{i-1,j-1})| \le |D_{i-1,j} - D_{i-1,j-1}| + |D_{i,j-1} - D_{i-1,j-1}| \le 2.$$

Also because of (3), (4), and $0 \le c_{sub} \le 1$, we know

$$-1 \le (D_{i-1,j}+1) - (D_{i-1,j-1}+c_{sub}) \le 2, \quad \text{and}$$
$$-1 \le (D_{i,j-1}+1) - (D_{i-1,j-1}+c_{sub}) \le 2.$$

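Since

$$(D_{i,j-1}+1) - (D_{i-1,j}+1) \le |D_{i,j-1} - D_{i-1,j}| \le 2,$$
$$(D_{i,j-1}+1) - (D_{i-1,j-1}+c_{sub}) \le 2,$$

thus,

$$(D_{i,j-1}+1) - m_{i,j} = (D_{i,j-1}+1) - \min(D_{i-1,j}+1,\ D_{i-1,j-1}+c_{sub}) \le 2.$$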

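Finally, since the minimum is at most its second argument, i.e., $\min(D_{i-1,j}+1,\ D_{i-1,j-1}+c_{sub}) \le D_{i-1,j-1}+c_{sub}$, we have

$$(D_{i,j-1}+1) - m_{i,j} = (D_{i,j-1}+1) - \min(D_{i-1,j}+1,\ D_{i-1,j-1}+c_{sub}) \ge (D_{i,j-1}+1) - (D_{i-1,j-1}+c_{sub}) \ge -1.$$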

Therefore, both constraints (1) and (2) must hold.
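The bound can also be checked empirically. The sketch below (function names are illustrative) runs the basic edit distance recurrence with the nesting min(min(·,·),·) used in the argument above, and records the difference of the two inputs at every two-minimum call, exhaustively over all short strings from a small alphabet. It is a sanity check under these assumptions, not a substitute for the proof.

```python
from itertools import product

def min_input_differences(s, t):
    """Differences a - b observed at every two-minimum call for inputs s, t."""
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    seen = set()
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c_sub = 0 if s[i - 1] == t[j - 1] else 1
            a = D[i - 1][j] + 1
            b = D[i - 1][j - 1] + c_sub
            m_ij = min(a, b)              # inner two-minimum
            c = D[i][j - 1] + 1
            seen.add(a - b)               # constraint (1)
            seen.add(c - m_ij)            # constraint (2)
            D[i][j] = min(c, m_ij)        # outer two-minimum
    return seen

def observed_differences(alphabet="ab", max_len=4):
    """Union of observed differences over all string pairs up to max_len."""
    seen = set()
    for n in range(1, max_len + 1):
        for m in range(1, max_len + 1):
            for s in product(alphabet, repeat=n):
                for t in product(alphabet, repeat=m):
                    seen |= min_input_differences(s, t)
    return seen
```

Every value returned by `observed_differences` should lie in {-1, 0, 1, 2}, in line with constraints (1) and (2).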

Generally, observations like the one above can also be shown for many other string-comparison metrics. Next, we state our general proposition of this insight which is formally proven in Appendix B. We note that, unlike the example above, our proof for the general case does not rely on the triangle inequality property of the metrics.

Proposition 1.

Let s, t, $D_{i,j}$, $c_{ins}$, $c_{del}$, $c_{sub}$ be defined as in Section 2.2, where $c_{ins}$, $c_{del}$ are generalized to one-dimensional tables and $c_{sub}$ is generalized to a two-dimensional table. Let

$$m_{i,j} = \min(D_{i,j-1} + c_{del}[t[j]],\ D_{i-1,j-1} + c_{sub}[s[i], t[j]])$$
$$u_{i,j} = (D_{i,j-1} + c_{del}[t[j]]) - (D_{i-1,j-1} + c_{sub}[s[i], t[j]])$$
$$v_{i,j} = (D_{i-1,j} + c_{ins}[s[i]]) - m_{i,j}$$

Then, there exist public constants $C_1$, $C_2$, $C_3$, $C_4$, which are independent of $D_{i,j}$, such that for all valid indices i, j,

$$C_1 \le u_{i,j} \le C_2, \qquad C_3 \le v_{i,j} \le C_4.$$
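To get a feel for the constants in Proposition 1, one can measure the observed range of $u_{i,j}$ and $v_{i,j}$ on small inputs. The sketch below is a hypothetical helper (`uv_bounds` and its cost callables are not from the paper); it enumerates all string pairs up to a given length and reports the extremes it sees, which only gives empirical evidence for a particular cost table rather than the proven constants.

```python
from itertools import product

def uv_bounds(alphabet, max_len, c_ins, c_del, c_sub):
    """Return (min u, max u, min v, max v) over all string pairs up to max_len."""
    us, vs = [], []
    for n, m in product(range(1, max_len + 1), repeat=2):
        for s in product(alphabet, repeat=n):
            for t in product(alphabet, repeat=m):
                D = [[0] * (m + 1) for _ in range(n + 1)]
                for i in range(1, n + 1):
                    D[i][0] = D[i - 1][0] + c_ins(s[i - 1])
                for j in range(1, m + 1):
                    D[0][j] = D[0][j - 1] + c_del(t[j - 1])
                for i in range(1, n + 1):
                    for j in range(1, m + 1):
                        x = D[i][j - 1] + c_del(t[j - 1])
                        y = D[i - 1][j - 1] + c_sub(s[i - 1], t[j - 1])
                        m_ij = min(x, y)
                        z = D[i - 1][j] + c_ins(s[i - 1])
                        us.append(x - y)        # u_{i,j}
                        vs.append(z - m_ij)     # v_{i,j}
                        D[i][j] = min(m_ij, z)
    return min(us), max(us), min(vs), max(vs)
```

A call such as `uv_bounds("ACGT", 3, lambda c: 5, lambda c: 5, lambda x, y: 0 if x == y else 1)` reports the observed ranges for an indel cost of 5 and a substitution cost of 1; the ranges stay small even as the absolute distances grow.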

3.2. The Garbling Scheme

Basic Idea.

Since these computations only deal with integers, we generalize the idea of garbling binary signals to work directly on arithmetic signals. Recall that when garbling binary circuits, the garbler picks, for every wire in the circuit, a secret string $w_0 \in \{0,1\}^{128}$ to encode 0 and sets $w_1 := w_0 \oplus \Delta$ to encode 1 (where Δ is a circuit-global secret uniformly sampled from $\{0,1\}^{127}$). To generalize this idea, we replace “⊕”, the adder on the binary field, with “$+_p$”, the adder on the prime field $\mathbb{Z}_p$ (where p is public and sufficiently large, e.g., $p > 2^{87}$). In our scheme, the garbler will first pick a uniform global secret Δ from $\mathbb{Z}_p$. Then, for every wire in the arithmetic circuit, the garbler picks a uniform $k_0$ (called wire-key) from $\mathbb{Z}_p$ to denote 0, and encodes every integer $a \in \mathbb{Z}_p$ as $k_a = k_0 +_p a \times_p \Delta$, where “$+_p$” and “$\times_p$” denote mod-p addition and multiplication, respectively.

To garble a gate, the garbler uses the encoding of each possible input signal of the gate as a key to encrypt the encoding of the corresponding output signal; to evaluate the gate, the evaluator tries to decrypt every garbled row of the gate. To allow the evaluator to tell which row decrypts successfully, we add a constant all-zero tag (a zero-tag) of sufficient length to every wire-key $k_a$ to form a wire-label. Thus, it is the output wire-labels (rather than wire-keys) that are actually encrypted.

If the zero-tags are short (e.g., 40-bits), one might worry that a wire-label could happen to successfully decrypt more than one garbled row in the same gate due to collision, which violates the correctness property of garbling. However, to semi-honest attackers, who cannot leverage side-computation to affect protocol execution, the length of the zero-tags is actually a statistical security parameter. To malicious attackers, the issue can be addressed, either by increasing the length of zero-tags (Section 4.2), or by fixing the random-tape of the Gb function to a collaboratively coin-tossed bit-string (so the garbler cannot precompute and cherry-pick a particular random-tape to produce a problematic garbled gate).

Notation for Wire-labels.

In the rest of the paper, we always use upper-case letters (e.g., A) to name wires. If $w_a^A$ denotes a wire-label, the superscript (A) indicates the id of the wire with which this wire-label is associated and the subscript (a) indicates the plaintext signal that the wire-label encodes. When the wire name is irrelevant to a discussion, the superscript can be omitted. In our terminology, generating (or sampling) a fresh wire-label, say $w_a^A$, for a plaintext value a means first picking $k_0^A \leftarrow \mathbb{Z}_p$ (unless $k_0^A$ is already known), then setting $k_a^A := k_0^A +_p a \times_p \Delta$ and $w_a^A := 0^{40} \| k_a^A$. We require $w_a^A \in \{0,1\}^{128}$, so if $k_a^A < p$, leading zeros are padded in front to ensure $w_a^A$ has exactly 128 bits.
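The label format can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the modulus `P = 2**61 - 1` is a small, well-known Mersenne prime standing in for the suggested $p > 2^{87}$, and the helper names are invented. Serializing the wire-key into 16 big-endian bytes automatically yields the 40-bit zero-tag followed by the zero-padded key.

```python
import secrets

P = 2**61 - 1          # toy prime modulus (assumption); the text suggests p > 2^87
LABEL_BITS = 128       # total wire-label length
TAG_BITS = 40          # zero-tag length
KEY_BITS = LABEL_BITS - TAG_BITS

def sample_delta():
    """Circuit-global secret, uniform in Z_p."""
    return secrets.randbelow(P)

def wire_key(k0, a, delta):
    """k_a = k_0 + a * delta (mod p)."""
    return (k0 + a * delta) % P

def wire_label(k0, a, delta):
    """w_a = 0^40 || k_a, serialized as 16 big-endian bytes.

    Because k_a < p < 2^88, the top 40 bits are zero and the key is
    left-padded with zeros to fill the remaining 88 bits.
    """
    return wire_key(k0, a, delta).to_bytes(LABEL_BITS // 8, "big")

def has_zero_tag(label):
    """Check that a 128-bit string carries a valid zero-tag."""
    return len(label) == LABEL_BITS // 8 and int.from_bytes(label, "big") >> KEY_BITS == 0
```

A label produced by `wire_label` always passes `has_zero_tag`, whereas a uniformly random 128-bit string passes only with probability $2^{-40}$, which is what lets the evaluator recognize a successful trial decryption.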

Next, we show how every gadget needed in the private edit distance computation can be efficiently instantiated.

Addition.

To securely add two plaintext signals $a, b \in \mathbb{Z}_p$ on two wires A and B, which are represented by wire-labels

$$w_a^A = 0^{40} \| (k_0^A +_p a \times_p \Delta) \quad \text{and}$$
$$w_b^B = 0^{40} \| (k_0^B +_p b \times_p \Delta), \quad \text{respectively},$$

it suffices for the garbler to set

$$w_0^C = w_0^A +_p w_0^B$$

while the evaluator locally computes

$$w_c^C := w_a^A +_p w_b^B.$$

Assuming there is no overflow (footnote 2), it is easy to verify that $w_c^C = (w_0^C +_p (a +_p b) \times_p \Delta)$, which is indeed the expected encoding of $a +_p b$ on wire C. Moreover, recall that if $a + b < p$, then $a + b = a +_p b$. Therefore, this essentially realizes addition over $\mathbb{Z}$ when $a + b < p$.

As a natural extension of secure addition, multiplying a secret value a of a wire A, encoded by wire-label

$$w_a^A = 0^{40} \| (k_0^A +_p a \times_p \Delta)$$

with a public constant c to obtain $z = c \times a$ on an output wire Z can simply be realized as:

  1. the garbler sets $w_0^Z = c \times_p w_0^A$; and

  2. the evaluator locally derives the wire-label $w_z^Z = c \times_p w_a^A$.

Again, note that if $c \times a < p$ then $c \times a = c \times_p a$. Hence, it realizes constant multiplication over $\mathbb{Z}$ if $c \times a < p$.

Obviously, addition (or public-constant multiplication) is also free—no expensive cryptographic computation nor network traffic is used—but only a mod-p addition (or mod-p multiplication, respectively) on each side of the protocol.
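The two free gadgets above can be sketched in a few lines of Python operating on wire-keys (the zero-tag is omitted since it is unaffected by mod-p arithmetic on keys below p). The modulus `P = 2**61 - 1` is a toy stand-in for the suggested $p > 2^{87}$, and all function names are illustrative assumptions.

```python
import secrets

P = 2**61 - 1   # toy prime modulus (assumption); the text suggests p > 2^87

def encode(k0, a, delta):
    """Wire-key for plaintext a on a wire with zero-key k0."""
    return (k0 + a * delta) % P

def free_add(x, y):
    """Mod-p addition: the garbler applies it to zero-keys,
    the evaluator applies it to the active keys it holds."""
    return (x + y) % P

def free_scale(c, x):
    """Mod-p multiplication by a public constant c, used the same way."""
    return (c * x) % P

def demo(a, b, c):
    delta = secrets.randbelow(P)
    k0_A, k0_B = secrets.randbelow(P), secrets.randbelow(P)
    w_a, w_b = encode(k0_A, a, delta), encode(k0_B, b, delta)

    # Addition: both sides do one local mod-p addition, nothing is sent.
    k0_sum = free_add(k0_A, k0_B)          # garbler
    w_sum = free_add(w_a, w_b)             # evaluator
    ok_add = w_sum == encode(k0_sum, a + b, delta)

    # Public-constant multiplication: one local mod-p multiplication each.
    k0_prod = free_scale(c, k0_A)          # garbler
    w_prod = free_scale(c, w_a)            # evaluator
    ok_mul = w_prod == encode(k0_prod, c * a, delta)
    return ok_add and ok_mul
```

Here `demo(17, 25, 6)` returns `True`: the evaluator's locally derived keys coincide with the garbler's encodings of 42 and 102 on the respective output wires.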

Equality.

When computing $c_{sub}$, an equality test is needed to decide whether two input characters are identical. Let $a, b \in \{0, 1, \ldots, \zeta\}$ be two integers, and let $w_a$, $w_b$ be the wire-labels corresponding to a and b, respectively. To securely compute whether a equals b, first $d = a - b$ is securely computed, hence the garbler knows $k_0^D$ and Δ while the evaluator knows $w_d^D = 0^{40} \| (k_0^D +_p d \times_p \Delta)$. Then, since $d \in \{-\zeta, \ldots, \zeta\}$,

  1. the garbler samples a fresh pair of wire-labels $w_0^Z$ and $w_1^Z$ to encode signals 0 and 1 on the output-wire Z, and sends the following $2\zeta + 1$ garbled rows
    $$Enc_{w_0^D}(w_1^Z, id); \quad \text{and}$$
    $$Enc_{w_i^D}(w_0^Z, id), \quad \forall i \ne 0,\ i \in \{-\zeta, \ldots, \zeta\}$$
    in a randomly permuted order. Note that $id \in \mathbb{Z}$ is the identifier of this projection gate.
  2. the evaluator tries to decrypt the above $2\zeta + 1$ ciphertexts using $w_d^D$ as the key. Thus, only the ciphertext encrypted with key $w_d^D$ will be successfully decrypted to reveal the valid wire-label $w_z^Z$ encoding $(a = b)\ ?\ 1 : 0$.

Namely, the evaluator will learn $w_1^Z$ if and only if a = b; otherwise, it will learn $w_0^Z$.

The cost of the secure equality is linear in the range of $(a - b)$. Recall that the cost of traditional binary garbled circuit based integer comparison is linear in the number of bits needed to represent the input numbers. Therefore, when $a - b$ can be bounded by a constant (for application-specific reasons), our approach can reduce the cost by a factor of min(log a, log b).
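The equality gadget can be sketched as a toy Python program. Several parts are assumptions made only for illustration: the modulus `P = 2**61 - 1` is a small stand-in for $p > 2^{87}$; `enc` is a SHA-256-based one-time pad rather than the fixed-key AES construction the scheme relies on; and `equality_demo` plays both parties in one process. The sketch does show the core mechanics: $2\zeta+1$ shuffled rows and trial decryption guided by the 40-bit zero-tag.

```python
import hashlib
import secrets

P = 2**61 - 1        # toy prime (assumption); the text suggests p > 2^87
LABEL_BYTES = 16     # 128-bit wire-labels
TAG_BYTES = 5        # 40-bit zero-tag

def label(k0, a, delta):
    """Wire-label 0^40 || (k0 + a*delta mod p) as 16 big-endian bytes."""
    return ((k0 + a * delta) % P).to_bytes(LABEL_BYTES, "big")

def enc(key, msg, gate_id):
    """Toy cipher (stand-in): XOR msg with SHA-256(key || gate_id).
    XOR with the same pad also decrypts."""
    pad = hashlib.sha256(key + gate_id.to_bytes(8, "big")).digest()[:LABEL_BYTES]
    return bytes(p ^ m for p, m in zip(pad, msg))

def garble_equality(k0_d, delta, zeta, gate_id):
    """Garbler: one row per possible d in {-zeta..zeta}, randomly permuted."""
    k0_z = secrets.randbelow(P)
    rows = [enc(label(k0_d, i, delta), label(k0_z, int(i == 0), delta), gate_id)
            for i in range(-zeta, zeta + 1)]
    secrets.SystemRandom().shuffle(rows)
    return k0_z, rows

def evaluate(rows, w_in, gate_id):
    """Evaluator: trial-decrypt every row, keep the one with a valid zero-tag."""
    for row in rows:
        cand = enc(w_in, row, gate_id)
        if cand[:TAG_BYTES] == bytes(TAG_BYTES):
            return cand
    raise ValueError("no row decrypted to a valid zero-tag")

def equality_demo(a, b, zeta):
    """Requires |a - b| <= zeta. Returns 1 if a == b, else 0."""
    delta = 1 + secrets.randbelow(P - 1)
    k0_a, k0_b = secrets.randbelow(P), secrets.randbelow(P)
    k0_d = (k0_a - k0_b) % P                         # garbler: free subtraction
    k0_z, rows = garble_equality(k0_d, delta, zeta, gate_id=1)
    w_a, w_b = (k0_a + a * delta) % P, (k0_b + b * delta) % P
    w_d = ((w_a - w_b) % P).to_bytes(LABEL_BYTES, "big")   # evaluator: free
    w_z = evaluate(rows, w_d, gate_id=1)
    return 1 if w_z == label(k0_z, 1, delta) else 0
```

The final comparison against `label(k0_z, 1, delta)` uses the garbler's secrets and is only there to decode the result for the demonstration; in a protocol run the evaluator would simply carry `w_z` forward.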

Minimum.

First, we observe that given two integers a, b, $\min(a, b) = a - [a - b]_{+}$, where “$[\cdot]_{+}$” is a function defined as follows,

$$[x]_{+} = \begin{cases} x, & \text{if } x \ge 0; \\ 0, & \text{otherwise.} \end{cases}$$

In essence, “$[\cdot]_{+}$” is a generalized comparison, which can be realized using the same idea of secure projection as in the equality gadget above. Let X, Z be the input and output wires, respectively, and assume $x \in \{-\zeta, \ldots, \zeta\}$; the garbler simply sends the following $2\zeta + 1$ ciphertexts in a randomly permuted order:

$$Enc_{w_i^X}(w_0^Z, id), \quad \forall i \in \{-\zeta, \ldots, -1\}; \quad \text{and}$$
$$Enc_{w_i^X}(w_i^Z, id), \quad \forall i \in \{0, \ldots, \zeta\}$$

where for $0 \le i \le \zeta$, $w_i^Z$ is the wire-label representing plaintext value i on the wire Z. When a, b are large but $|a - b|$ is bounded by some constant (which is indeed the case for the string metrics considered in this paper), we save a factor of min(log a, log b) compared to traditional garbling.
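The minimum gadget can be illustrated end to end with a toy Python sketch that combines a positive-part projection with two free subtractions. As assumptions for illustration only, the modulus is the small prime `2**61 - 1`, the cipher `enc` is a SHA-256 pad standing in for fixed-key AES, and `min_demo` simulates garbler and evaluator in one process.

```python
import hashlib
import secrets

P = 2**61 - 1        # toy prime (assumption); the text suggests p > 2^87
LABEL_BYTES, TAG_BYTES = 16, 5

def to_label(key):
    """0^40 || key as 16 big-endian bytes (key < p)."""
    return key.to_bytes(LABEL_BYTES, "big")

def enc(key, msg, gate_id):
    """Toy cipher (stand-in for fixed-key AES): XOR with a SHA-256 pad."""
    pad = hashlib.sha256(key + gate_id.to_bytes(8, "big")).digest()[:LABEL_BYTES]
    return bytes(p ^ m for p, m in zip(pad, msg))

def garble_positive_part(k0_x, delta, zeta, gate_id):
    """Rows mapping x in {-zeta..zeta} to max(x, 0) on a fresh wire Z."""
    k0_z = secrets.randbelow(P)
    rows = [enc(to_label((k0_x + i * delta) % P),
                to_label((k0_z + max(i, 0) * delta) % P), gate_id)
            for i in range(-zeta, zeta + 1)]
    secrets.SystemRandom().shuffle(rows)
    return k0_z, rows

def evaluate(rows, w_in, gate_id):
    """Trial decryption; returns the output wire-key as an integer."""
    for row in rows:
        cand = enc(to_label(w_in), row, gate_id)
        if cand[:TAG_BYTES] == bytes(TAG_BYTES):
            return int.from_bytes(cand, "big")
    raise ValueError("no row decrypted to a valid zero-tag")

def min_demo(a, b, zeta):
    """Requires |a - b| <= zeta. Returns min(a, b) computed on wire-keys."""
    delta = 1 + secrets.randbelow(P - 1)
    k0_a, k0_b = secrets.randbelow(P), secrets.randbelow(P)
    # Garbler: X = A - B (free), Z = [X]_+ (projection), M = A - Z (free).
    k0_x = (k0_a - k0_b) % P
    k0_z, rows = garble_positive_part(k0_x, delta, zeta, gate_id=2)
    k0_m = (k0_a - k0_z) % P
    # Evaluator: mirrors the same steps on the active keys.
    w_a, w_b = (k0_a + a * delta) % P, (k0_b + b * delta) % P
    w_x = (w_a - w_b) % P
    w_z = evaluate(rows, w_x, gate_id=2)
    w_m = (w_a - w_z) % P
    # Decoding with the garbler's secrets, for demonstration only.
    return next(v for v in range(max(a, b) + 1) if (k0_m + v * delta) % P == w_m)
```

Note that the number of rows depends only on ζ: `min_demo(1000000, 1000002, 3)` still uses seven ciphertexts even though the operands need about 20 bits each.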

Table-lookup.

A one-dimensional table of n entries can be viewed as an association-list

$$\{(0, v_0), (1, v_1), \ldots, (n-1, v_{n-1})\},$$

where the $v_i$'s are bounded integer values. A table-lookup gadget can be treated as a unary gate with input-wire I and output-wire V. Given a wire-label $w_i^I$ that encodes plaintext index i, a secure table look-up will output a wire-label $w_{v_i}^V$ that encodes $v_i$. In our scheme, this can be straightforwardly realized as follows:

  1. The garbler generates fresh wire-labels $w_{v_0}^V, \ldots, w_{v_{n-1}}^V$ to encode $v_0, \ldots, v_{n-1}$ on the output-wire V, and sends the following n ciphertexts in a randomly permuted order:
    $$Enc_{w_i^I}(w_{v_i}^V, id), \quad \forall i \in \{0, \ldots, n-1\}$$
    where $w_i^I$ encodes i on the input index wire I.
  2. The evaluator uses $w_i^I$ as key to decrypt the above n ciphertexts. Due to the way the ciphertexts are constructed, precisely one of them will be successfully decrypted, revealing the wire-label $w_{v_i}^V$ that encodes $v_i$.

Moreover, looking up a multi-dimensional table with our scheme is readily reducible to a one-dimensional table lookup problem. Take the two-dimensional m-by-n table lookup as an example. A two-dimensional table can always be mapped to a one-dimensional table by concatenating the rows, i.e., an index (i, j) (where $0 \le i < m$, $0 \le j < n$) over the 2D-table can be translated into an index $k = i \cdot n + j$ over a 1D-table of size mn, since each of the m rows contributes n entries. Since n is public, the affine mapping of wire-labels $w_i^I$ and $w_j^J$ (that encode the row and column indices) to the wire-label $w_k^K$ (that encodes the translated index) is almost free with our scheme. Once the translation is done, the secure 2D-table lookup reduces to sending and trial-decrypting mn ciphertexts, the same as the treatment to securely look up a 1D-table of mn entries.
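A toy Python sketch of the two-dimensional lookup follows. It assumes row-major concatenation of a table with n columns, so the flattened index is `k = i * n + j`; the modulus `2**61 - 1`, the SHA-256-based `enc`, and all helper names are illustrative stand-ins rather than the scheme's actual primitives, and `lookup_demo` simulates both parties.

```python
import hashlib
import secrets

P = 2**61 - 1        # toy prime (assumption); the text suggests p > 2^87
LABEL_BYTES, TAG_BYTES = 16, 5

def to_label(key):
    return key.to_bytes(LABEL_BYTES, "big")

def enc(key, msg, gate_id):
    """Toy cipher (stand-in for fixed-key AES): XOR with a SHA-256 pad."""
    pad = hashlib.sha256(key + gate_id.to_bytes(8, "big")).digest()[:LABEL_BYTES]
    return bytes(p ^ m for p, m in zip(pad, msg))

def garble_lookup(flat_table, k0_k, delta, gate_id):
    """One row per table entry: key encodes index k, payload encodes v_k."""
    k0_v = secrets.randbelow(P)
    rows = [enc(to_label((k0_k + k * delta) % P),
                to_label((k0_v + v * delta) % P), gate_id)
            for k, v in enumerate(flat_table)]
    secrets.SystemRandom().shuffle(rows)
    return k0_v, rows

def evaluate(rows, w_in, gate_id):
    for row in rows:
        cand = enc(to_label(w_in), row, gate_id)
        if cand[:TAG_BYTES] == bytes(TAG_BYTES):
            return int.from_bytes(cand, "big")
    raise ValueError("no row decrypted to a valid zero-tag")

def lookup_demo(table, i, j):
    """Look up table[i][j] (m rows, n columns) through wire-keys."""
    n = len(table[0])
    flat = [v for row in table for v in row]      # concatenate the rows
    delta = 1 + secrets.randbelow(P - 1)
    k0_i, k0_j = secrets.randbelow(P), secrets.randbelow(P)
    # Garbler: k = i*n + j is affine in (i, j), so wire K costs nothing.
    k0_k = (n * k0_i + k0_j) % P
    k0_v, rows = garble_lookup(flat, k0_k, delta, gate_id=3)
    # Evaluator: same affine map on the active keys, then trial decryption.
    w_i, w_j = (k0_i + i * delta) % P, (k0_j + j * delta) % P
    w_k = (n * w_i + w_j) % P
    w_v = evaluate(rows, w_k, gate_id=3)
    # Decoding with the garbler's secrets, for demonstration only.
    return next(v for v in set(flat) if (k0_v + v * delta) % P == w_v)
```

With `table = [[5, 1, 4], [2, 0, 3]]`, the call `lookup_demo(table, 1, 2)` returns 3 after sending six ciphertexts, one per table entry.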

Recall that with traditional binary circuit garbling schemes, a generic multiplexer-based secure table lookup is significantly more expensive because: 1) each index and each content integer in the table need to be encoded by multiple wire-labels; 2) n multiplexers would be needed to scan the table while the cost of each multiplexer depends on the bit length of the table content values as well as the length of the index. Alternatively, if the table is small, a secure table lookup can be realized as a giant garbled truth table like Huang et al. suggested [12]. However, it is unclear how this can be efficiently realized with AESNI support because log n keys (one key per bit of the index) are involved in producing every garbled row. A more straightforward solution would use SHA hashing, which, however, is orders-of-magnitude slower than AESNI instructions. In contrast, secure table lookup with our garbling scheme is significantly cheaper.

Handle Initial Inputs.

We assume the initial inputs to our (arithmetic) circuit are in bits, and the processing of these binary input values resembles that in binary garbled circuit protocols, i.e., the circuit generator's private input bits are encoded by wire-labels that are directly sent to the evaluator, while the circuit evaluator's private input bits are translated to their corresponding wire-labels through oblivious transfer. We stress, however, that the format of the wire-labels that encode the initial input bits conforms to the mod-p field notion of wire-labels required by our garbling scheme. Therefore, a set of addition and public-constant multiplication gadgets will be used to translate the bit representation of input values into their arithmetic representations.

Implementation.

Today's high-performance garbling schemes rely heavily on ideal block ciphers instantiated with fixed-key AES. Our scheme can also leverage fast fixed-key AES garbling accelerated by AESNI. For all the building blocks, our garbling scheme requires only one cryptographic primitive, $\mathsf{Enc}_{w_{in}}(i, w_{out})$, where $w_{in}$ and $w_{out}$ are 128-bit wire-labels with valid zero-tags and $i < 2^{128}$ is an integer serving as a gadget counter (the gadget id). Similar to Half-Gates [10], we implement $\mathsf{Enc}_{w_{in}}(i, w_{out})$ as

$$\mathsf{Enc}_{w_{in}}(i, w_{out}) = \pi(K) \oplus K \oplus w_{out}$$

where $K = 2w_{in} \oplus i$ (note that $2w_{in}$ refers to doubling $w_{in}$ in $GF(2^{128})$) and $\pi$ is a random permutation realized using fixed-key AES. We can implement $\mathsf{Dec}_{w_{in}}(i, c)$ as

$$\mathsf{Dec}_{w_{in}}(i, c) = \begin{cases} m := \pi(K) \oplus K \oplus c, & \text{if } m \text{ has the zero-tag;} \\ \bot, & \text{otherwise,} \end{cases}$$

where $K$ is as defined before.
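The Enc/Dec pair above can be sketched in a few lines of Python. This is a toy under stated assumptions: `pi` is a truncated SHA-256 used only as a stand-in for the fixed-key AES permutation (it is not a permutation), the GF(2^128) reduction polynomial x^128 + x^7 + x^2 + x + 1 is assumed, and labels are taken to carry 40 leading zero bits followed by an 88-bit payload.

```python
import hashlib

MASK128 = (1 << 128) - 1
TAG_BITS = 40  # assumption: a valid label has 40 leading zero bits

def pi(x: int) -> int:
    """Stand-in for fixed-key AES (toy: SHA-256 truncated to 128 bits)."""
    digest = hashlib.sha256(x.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:16], "big")

def double(x: int) -> int:
    """Doubling in GF(2^128), assuming x^128 + x^7 + x^2 + x + 1."""
    x <<= 1
    if x >> 128:
        x = (x & MASK128) ^ 0x87
    return x

def enc(w_in: int, i: int, w_out: int) -> int:
    """Enc_{w_in}(i, w_out) = pi(K) xor K xor w_out with K = 2*w_in xor i."""
    k = double(w_in) ^ i
    return pi(k) ^ k ^ w_out

def dec(w_in: int, i: int, c: int):
    """Return the plaintext label if it carries the zero-tag, else None."""
    k = double(w_in) ^ i
    m = pi(k) ^ k ^ c
    return m if m >> (128 - TAG_BITS) == 0 else None
```

An evaluator would call `dec` on every row of a gate with its single input label and keep the one result that is not `None`.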

3.3. Formal Analysis

Complexity.

With our approach, the dominating cost can be attributed to the projection gates (used in computing the minimum and equality). For edit-distance, to compute each entry of the $n^2$-entry dynamic programming matrix, only two projection gates are needed: one 8-row projection for equality and another 8-row projection for minimum. So overall, the cost is $16n^2$ garbled rows. In comparison, using Half-Gates [10] garbling to compute edit-distance, computing each entry of the $n^2$-entry DP matrix requires a fixed-width equality gate (2 garbled rows), two variable-width minimum gates ($2 \log n \times 2$ rows), and one variable-width addition gate ($2 \log n$ rows), totaling $(6 \log n + 2) n^2$ garbled rows. With Ball-Malkin-Rosulek, even if additions are ignored, the cost per matrix entry will still be $c_1 \log n$ rows for equality plus $c_2 \log n$ rows for minimum (where $c_1$, $c_2$ are fairly large constants depending on the choice of the CRT-representation), totaling $(c_1 + c_2) n^2 \log n$ rows. Therefore, our approach brings a $\log n$-factor saving over the best existing generic garbling schemes.
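As a rough worked example of the row counts above, the following sketch evaluates both formulas for a given string length. It assumes $\log n$ means the base-2 logarithm rounded up and counts garbled rows only, ignoring any difference in row size; the function names are illustrative.

```python
import math

def rows_ours(n: int) -> int:
    """16 garbled rows per DP entry (two 8-row projection gates)."""
    return 16 * n * n

def rows_half_gates(n: int) -> int:
    """(6 log n + 2) garbled rows per DP entry.
    Assumption: log n is taken as ceil(log2(n))."""
    log_n = math.ceil(math.log2(n))
    return (6 * log_n + 2) * n * n
```

For n = 4000 this gives 256,000,000 rows versus 1,184,000,000 rows under these assumptions.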

Correctness.

We formalize our garbling scheme in Figure 2. It is easy to verify that correctness of this garbling scheme fails only if more than one row in the same gate decrypts to valid (but different) wire-labels. Against semi-honest attackers, the length of the zero-tags provides statistical security. Thus correctness fails only when multiple honestly-garbled rows in the same gate happen to decrypt to different wire-labels, each with a valid zero-tag (such that an evaluator would be confused), which is bounded by $2^{-40}$ in each gate. One might also worry that large circuits may be more likely to fail, since a circuit with $|C|$ non-free gates might fail with probability $\sum_{i=1}^{|C|} 2^{-40} n_i$ (this is actually a loose upper-bound), where $n_i$ is the number of rows in the $i$-th gate. However, this is not the case because for every internal gate, the evaluator can always try all seemingly-valid wire-labels in a subsequent gate to eliminate such spurious wire-labels. So a spurious wire-label can only propagate with probability at most $2^{-40} n_1 \cdot 2^{-40} n_2$, where $n_1$, $n_2$ are the numbers of garbled rows in the two connected gates, respectively. Since the number of rows in every garbled gate is bounded by a small constant in practice, we have $2^{-40} n_1 \cdot 2^{-40} n_2 \ll 2^{-40}$. Therefore, the overall failure probability only depends on the final layer of gates, that is, at most $\sum_{i=1}^{|C_{out}|} 2^{-40} n_i$, where $|C_{out}|$ is the number of non-free gates in the final layer and $n_i$ is the number of rows in the $i$-th final-layer gate. Since $|C_{out}|$ is typically a small constant in MPC applications (e.g., $4 \le |C_{out}| \le 20$ to denote the output distance/score in private string-comparison applications), our garbling schemes are correct except with negligible probability (in the concrete sense). In Section 4.2, we give a technique to increase both the computational and statistical security to 127 bits, which addresses the concern even in the presence of malicious attackers.

Fig. 2:

The garbling scheme

Security.

Note that the equality, minimum and table-lookup gadgets are all essentially realized by a primitive operation called projection. Secure projection obliviously maps an input signal $a_i$ to a predefined output signal $b_i$ based on a public table $\{(a_1, b_1), \dots, (a_n, b_n)\}$. Thus, to prove the garbling scheme secure, it suffices to consider only addition and projection.

Theorem 1.

If π is an ideal block cipher that is used to realize Enc and Dec as described above, then the scheme in Figure 2 satisfies the privacy and obliviousness definitions given in Section 2.1, and an application-dependent notion of authenticity.

Proof of Theorem 1 is given in Section A.

4. Extensions

In this section, we discuss three extensions of our approach: one for garbling arbitrary computations, the second for increasing the security parameters, and the third for achieving application-independent authenticity.

4.1. Garbling Arbitrary Computations

Our garbling scheme as is described so far can’t handle generic computations because we haven’t discussed how to multiply two secret values efficiently. To efficiently handle arbitrary computations, our basic idea is to tether the above scheme with a traditional binary circuit garbling such as Half-Gate.

Arithmetic Wire-labels to Binary Wire-labels.

Suppose the circuit garbler knows $w_0 = 0^{40} \| k_0$ and $\Delta$, whereas the evaluator knows $w_a = 0^{40} \| (k_0 +_p a \times_p \Delta)$. Let the binary form of the integer $a$ be $a_1 a_2 \dots a_n$. After conversion, we want the garbler to learn wire-labels $w_{1,0}, \dots, w_{n,0}$ and $\Delta$, while the evaluator learns $w_{1,a_1}, \dots, w_{n,a_n}$ such that $w_{i,a_i} = w_{i,0} \oplus a_i \Delta$. We describe two methods to accomplish this goal that exhibit complementary tradeoffs between performance and generality.

4.1.0.1 Via secret shares: Suppose the range of $a$ is publicly known to be restricted to $\{0, \dots, \zeta\}$. The basic idea is to let the garbler send a random permutation of

$$\mathsf{Enc}_{w_i}(i \oplus m), \quad i \in \{0, \dots, \zeta\}$$

where $m$ is a $\log \zeta$-bit secret mask picked by P1. Thus, the evaluator who has $w_a$ is able to recover $a \oplus m$. Then, the two parties can use traditional garbled circuit protocols [10] to run any followup computation over $a$ by starting from their respective shares $m$ and $a \oplus m$.

To convert an arithmetic wire, it costs $\zeta + 1$ encryptions to send the encrypted masked-shares, 176 encryptions to translate the garbler's input bits, and 88 oblivious transfers (for the evaluator's 88-bit input) in the second stage of the secure computation. This approach is preferred when $\zeta$ is known to be relatively small. As $\zeta$ grows too big, it becomes infeasible to transmit $O(\zeta)$ encryptions, in which case we can opt for an alternative conversion method suitable for large $\zeta$.

4.1.0.2 Via generic secure modular-arithmetic: The basic idea is to construct a binary garbled circuit to securely compute $(k_a - k_0)/\Delta$ where "−" and "/" are mod-p subtraction and division, respectively. By requiring the garbler to locally compute $\Delta^{-1} \bmod p$, we can reduce the above computation to a secure mod-p subtraction followed by a secure mod-p multiplication, both realized by a traditional binary circuit garbling scheme.

Because $k_0, k_a, p \in \{0,1\}^{88}$, the cost of this approach is that of a traditional garbled circuit secure computation protocol with 88 × 3 input bits (88 × 2 bits from the garbler and 88 bits from the evaluator), an 88-bit mod-p secure subtraction, and an 88-bit mod-p secure multiplication. Since it only depends on the computational security parameter rather than the range of the plaintext values, it fits better when the range of $a$ can be very big (e.g., more than $2^{17}$).
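The arithmetic that the binary garbled circuit evaluates in this second method can be written out in the clear as a sanity check. The sketch below is plaintext Python only, with a stand-in prime `P` and a hypothetical function name; in the protocol the same subtraction and multiplication would run inside a garbled circuit on secret inputs.

```python
# Assumption: stand-in prime modulus (the paper uses an 88-bit prime p).
P = 2**89 - 1

def recover_plaintext(k_a: int, k_0: int, delta: int) -> int:
    """Compute (k_a - k_0) * delta^{-1} mod P, which equals a
    whenever k_a = k_0 + a * delta (mod P)."""
    delta_inv = pow(delta, -1, P)  # garbler can precompute this locally
    return ((k_a - k_0) * delta_inv) % P
```

Precomputing the inverse is what turns the mod-p division into a single mod-p multiplication.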

With either approach, we stress that the authenticity of the final output-wire labels holds if $a \ll p$, because without knowing $w_0$ and $\Delta$, for any $a, b \in \mathbb{Z}_p$,

$$(w_0 +_p a \times_p \Delta,\; w_0 +_p b \times_p \Delta) \approx (X, Y)$$

where $X$, $Y$ are uniform random samples from $0^{40} \| \mathbb{Z}_p$. So, for example, when it is known that $a \le 2^{32}$ from the application context, our approach can offer at least 87 − 32 = 55 bits of authenticity.

Binary Circuit Wire-labels to Arithmetic Wire-labels.

Converting wire-labels from traditional binary circuit garbling to the arithmetic wire-labels used in our scheme is more straightforward: the garbler only needs to send a randomly permuted pair of ciphertexts

$$[\mathsf{Enc}_{w_0}(w'_0),\; \mathsf{Enc}_{w_1}(w'_1)]$$

per wire in the binary circuit, where $w_0, w_1$ are wire-labels conforming to the format required by the traditional garbling (e.g., $\forall b \in \{0,1\}$, $w_b = w_0 \oplus b\Delta$, $\Delta \in \{0,1\}^{128}$), and $w'_0, w'_1$ are freshly sampled labels based on our garbling scheme (e.g., $\forall b \in \{0,1\}$, $w'_b = 0^{40} \| k_b$, $k_b = k_0 +_p b \times_p \Delta'$, $\Delta' \in \mathbb{Z}_p$). The evaluator can thus decrypt the ciphertext corresponding to the binary circuit wire-label it learns from the evaluation.

To derive an arithmetic wire-label $w'_a$ that encodes

$$a = a_0 + a_1 \times 2 + \dots + a_n \times 2^{n}, \quad a_i \in \{0,1\}$$

from binary wire-labels $w_{a_0}, \dots, w_{a_n}$, it suffices to first convert the binary encodings $w_{a_0}, \dots, w_{a_n}$ to arithmetic encodings $w'_{a_0}, \dots, w'_{a_n}$; then $w'_a$ can be derived from $w'_{a_0}, \dots, w'_{a_n}$ through local constant multiplication and local addition.
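The final local step (constant multiplication plus addition) can be sketched as follows. This is an illustration under assumptions: each bit is already encoded arithmetically as k0 + bit·Δ (mod p) with a shared Δ, bit index i carries weight 2^i, and `P`, `bit_label`, and `compose` are hypothetical names.

```python
# Assumption: stand-in prime modulus (the paper uses an 88-bit prime p).
P = 2**89 - 1

def bit_label(k0: int, delta: int, bit: int) -> int:
    """Arithmetic encoding of a single bit: k0 + bit * delta (mod P)."""
    return (k0 + bit * delta) % P

def compose(values: list) -> int:
    """Weighted sum over Z_P: sum of 2^i * values[i].
    Applied to bit labels it yields the label of the integer;
    applied to zero-offsets it yields the integer wire's zero-offset."""
    return sum((1 << i) * v for i, v in enumerate(values)) % P
```

The evaluator applies `compose` to the bit labels it holds, while the garbler applies the same function to the per-bit offsets, so both sides stay consistent without further communication.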

4.2. Increase Security Parameters

The scheme as we described thus far only guarantees 87 bits computational and 40 bits statistical security for semi-honest adversaries. Next, we show how to modify our scheme to provide 127 bits computational and 128 bits statistical security for semi-honest adversaries (or 128 bit computational security for malicious adversaries).

The key idea is to set $p$ to be a 128-bit prime (in doing so, we abandon the idea of using 40-bit all-zero tags to identify successful decryptions) and add to each garbled row $\mathsf{Enc}_{w_{in}}(i, w_{out})$ a 128-bit tag. That is,

$$\mathsf{Enc}_{w_{in}}(i, w_{out}) = (C_1, C_2)$$

where

$$C_1 = \pi(K) \oplus K \oplus w_{out}$$
$$C_2 = \pi(K \oplus 1) \oplus K \oplus w_{out}$$
$$K = 2w_{in} \oplus i$$

where $2w_{in}$ refers to doubling $w_{in}$ in $GF(2^{128})$ and $\pi$ is an ideal block cipher realized by fixed-key AES.

Symmetrically, we can define

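$$\mathsf{Dec}_{w_{in}}(i, (C_1, C_2)) = \begin{cases} m_1, & m_1 = m_2 \\ \bot, & \text{otherwise} \end{cases}$$

where

$$m_1 = \pi(K) \oplus K \oplus C_1$$
$$m_2 = \pi(K \oplus 1) \oplus K \oplus C_2,$$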

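and $K$ is as defined above. Thus, the evaluator, who obtains a candidate $w'_{out}$ by trial-decrypting garbled rows in the $i$-th gate with its wire-label $w'_{in}$, can verify whether

$$\pi(2w'_{in} \oplus i \oplus 1) \oplus 2w'_{in} \oplus i \oplus w'_{out} = C_2$$

to tell if the decryption was successful. The intuitive reason behind this is that if $w'_{in}$ is not equal to $w_{in}$ (the key used to generate $(C_1, C_2)$), then $w'_{out} \ne w_{out}$ and

$$\pi(2w'_{in} \oplus i \oplus 1) \oplus 2w'_{in} \oplus i \oplus w'_{out} \ne \pi(2w_{in} \oplus i \oplus 1) \oplus 2w_{in} \oplus i \oplus w_{out},$$

for all but a negligible probability.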
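A toy version of this two-ciphertext variant is sketched below. As assumptions, `pi` is a truncated SHA-256 standing in for fixed-key AES (not an actual permutation), the GF(2^128) reduction polynomial x^128 + x^7 + x^2 + x + 1 is assumed, and the function names are illustrative.

```python
import hashlib

MASK128 = (1 << 128) - 1

def pi(x: int) -> int:
    """Stand-in for fixed-key AES (toy: SHA-256 truncated to 128 bits)."""
    digest = hashlib.sha256(x.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:16], "big")

def double(x: int) -> int:
    """Doubling in GF(2^128), assuming x^128 + x^7 + x^2 + x + 1."""
    x <<= 1
    if x >> 128:
        x = (x & MASK128) ^ 0x87
    return x

def enc_tagged(w_in: int, i: int, w_out: int) -> tuple:
    """Return (C1, C2) with C1 = pi(K)^K^w_out and C2 = pi(K^1)^K^w_out."""
    k = double(w_in) ^ i
    return (pi(k) ^ k ^ w_out, pi(k ^ 1) ^ k ^ w_out)

def dec_tagged(w_in: int, i: int, row: tuple):
    """Accept only if both ciphertexts decrypt to the same message."""
    k = double(w_in) ^ i
    m1 = pi(k) ^ k ^ row[0]
    m2 = pi(k ^ 1) ^ k ^ row[1]
    return m1 if m1 == m2 else None
```

Here the second ciphertext plays the role of the tag, so labels no longer need reserved zero bits.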

4.3. Application-Independent Authenticity

The authenticity of the garbling scheme described before (as well as the CRT-based garbling scheme of Ball-Malkin-Rosulek) is application-dependent since its authenticity-error $n/p$ can grow with $n$ (see proof of Theorem 1), the size of the application-dependent plaintext-domain. To provide the standard, application-independent notion of authenticity, we can modify our garbling scheme so that every wire's plaintext value $a$ is encoded as a pair $(k_0 +_p a \times_p \Delta,\; \hat{k}_0 +_p a \times_p \hat{\Delta})$ where $\Delta, \hat{\Delta}$ are the garbler's two independently-sampled, circuit-global secrets and $k_0, \hat{k}_0$ are the garbler's two independently-sampled wire-specific secrets. In more detail,

  • Encode. To encode a plaintext value $a \in \mathbb{Z}_p$, the garbler picks uniform $k_0, \hat{k}_0, \Delta, \hat{\Delta} \in \mathbb{Z}_p$ and computes
    $$\mathbf{L}_a := (k_0 +_p a \times_p \Delta,\; \hat{k}_0 +_p a \times_p \hat{\Delta})$$
    as the encoding (i.e., wire-label) of $a$.
    If the garbler (who knows $k_0, \hat{k}_0, \Delta, \hat{\Delta}$) receives an encoding $\mathbf{L} = (L, \hat{L})$, to check the validity of the encoding, he/she will verify
    $$(L - k_0) \times_p \Delta^{-1} = (\hat{L} - \hat{k}_0) \times_p \hat{\Delta}^{-1}$$
    and decode $\mathbf{L}$ to $(L - k_0) \times_p \Delta^{-1}$ only if the above equality holds.
  • Addition. Since the encoding is additively homomorphic, given two encodings $\mathbf{L}_a$ and $\mathbf{L}_b$, the encoding of their sum can be locally computed as $\mathbf{L}_{a+b} := \mathbf{L}_a +_p \mathbf{L}_b$.

  • Projection. To garble a projection $(v_1 \mapsto u_1, \dots, v_t \mapsto u_t)$, a $t$-row garbled gate is computed as follows:
    $$\mathsf{Enc}_{\mathbf{L}_{v_1}}(\mathbf{L}_{u_1}),\; \mathsf{Enc}_{\mathbf{L}_{v_2}}(\mathbf{L}_{u_2}),\; \dots,\; \mathsf{Enc}_{\mathbf{L}_{v_t}}(\mathbf{L}_{u_t}).$$
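The Encode, validity check, and Addition steps of this pair encoding can be sketched in Python as below. The modulus `P` is a stand-in prime and the function names are hypothetical; projection gates are omitted because they reuse the Enc primitive.

```python
# Assumption: stand-in prime modulus for illustration.
P = 2**89 - 1

def encode_pair(k0, k0_hat, delta, delta_hat, a):
    """L_a = (k0 + a*delta, k0_hat + a*delta_hat) over Z_P."""
    return ((k0 + a * delta) % P, (k0_hat + a * delta_hat) % P)

def decode_pair(k0, k0_hat, delta, delta_hat, label):
    """Garbler-side check: both halves must decode to the same value."""
    lab, lab_hat = label
    a = ((lab - k0) * pow(delta, -1, P)) % P
    a_hat = ((lab_hat - k0_hat) * pow(delta_hat, -1, P)) % P
    return a if a == a_hat else None

def add_pairs(x, y):
    """Component-wise addition: the encoding is additively homomorphic."""
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)
```

A forged label passes `decode_pair` only if both halves are shifted consistently with the two independent secrets, which is what removes the dependence on the plaintext-domain size.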

Theorem 2.

The improved garbling scheme of Section 4.3 satisfies the privacy, obliviousness, and authenticity properties outlined in Section 2.1.

Proof of Theorem 2 can be found in Section C.2.

5. The Malicious Model

In this section, we give a general approach to compile our semi-honest protocols into ones secure against malicious adversaries. We consider the standard definition of active-security of secure two-party computation with respect to the standard ideal model execution: the trusted party, upon receiving input string x and y from party P1 and P2, respectively, computes the agreed string metric between x and y and sends the result to P2.

Protocol Design Intuition.

We use the cut-and-choose technique, where the circuit generator sends n garbled circuits to the evaluator, k of which will be checked and the rest will be evaluated to derive the final outcome. For improved performance, we use the probabilistic cut-and-choose strategy of [30] to fix n but pick k from a public distribution, based on the observed cost ratio between checking and evaluation per GC. Also, the garbler sends only hashes of GCs in the "garble" step to save bandwidth, but re-generates the evaluation GCs in the "evaluate" step. Our protocol succeeds as long as at least one of the evaluation circuits is correctly generated. Due to the page limit, we describe our malicious-model protocol in Section C.1 and formally prove its security in Section C.3, but state Theorem 3 below for completeness.

Theorem 3.

The protocol of Section C.1 securely computes f in the presence of malicious adversaries.

6. Evaluation

In this section, we evaluate a set of secure string comparing protocols which motivated our work.

Experiment Setup.

We used two n1-standard-1 instances (1vCPU, 3.75GB memory, priced at 3 cents/hour) on Google Cloud Platform. The LAN setting has 2Gbps with 1ms latency. The WAN has 200Mbps with 40ms latency. The computational security parameter κ is 127 and the statistical security parameter s = 40. Unless specified otherwise, the performance numbers are averaged over 10 runs.

We implemented our scheme in C/C++, using Intel AESNI intrinsic instructions to realize the fixed block cipher π. We use emp-tool [13]’s implementation of Half-Gate [10] garbling and efficient OT extension [31], [32] to construct the baseline protocols to compare with. For fair comparison, all baseline protocols use their best possible custom circuits.

6.1. Application Performance

We applied the proposed garbling scheme to implementing five string metrics: edit distance, weighted edit distance, Needleman-Wunsch, longest common subsequence (LCS), and heaviest common subsequence.

6.1.1. Semi-honest Model Performance

Table 1 highlights the performance improvements of our protocols in comparison with the best previous results. Generally, the gains of our approach are slightly bigger on Edit Distance and LCS, since the choices of weights can affect the sizes of the projection gates (as indicated by Proposition 1). We observe that running times on LAN and WAN all conform very well with the linear cost model: $\mathrm{Time}_{overall} = \mathrm{Time}_{computation} + \mathrm{Size}_{traffic} / \mathrm{Speed}_{network}$. Thus, in the scalability experiments below we will focus on the LAN setting.
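The linear cost model can be applied as in the following sketch, which estimates end-to-end time from a computation time and a traffic volume. The function name and the sample numbers in the usage note are hypothetical and are not measurements from this paper.

```python
def overall_time(compute_seconds: float, traffic_bytes: int,
                 bandwidth_bits_per_s: float) -> float:
    """Time_overall = Time_computation + Size_traffic / Speed_network.
    Traffic is given in bytes and bandwidth in bits per second."""
    return compute_seconds + traffic_bytes * 8 / bandwidth_bits_per_s
```

For instance, a hypothetical run with 10 s of computation and 250 MB of traffic over a 2 Gbps link would be estimated at 11 s.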

Figure 3 and Figure 4 delineate the time and bandwidth costs of these end-to-end applications over input strings of lengths 800–4000 characters. The curves all show a quadratic shape, which is consistent with the asymptotic complexity of the underlying dynamic programming algorithms. We set $c_{ins}$ and $c_{del}$ to 5 and $c_{sub}$ to 1 for Weighted-ED. Our Needleman-Wunsch used the weight tables of Figure 1.

6.1.2. Malicious Model Performance

Table 2 shows the performance of our actively-secure protocols in the malicious threat model. Input strings in these experiments each have 4000 nucleotides (or 8000 bits). We exploited the game-theoretic cut-and-choose strategy for single-cut protocols proposed by Zhu et al. [30] to pick n and k based on the actual cost ratio $C_{eval}/C_{chk}$, which can vary with network settings and applications. In our experiments, the cost ratio is affected most by the network performance due to the large bandwidth savings from sending only the hashes of check-GCs. As a result, this optimization saved more than 1/2 (or 1/4) of the bandwidth in the WAN (or LAN) setting. Compared to their semi-honest versions (see Table 1), the overall slowdown factor is about 10 (or 20) in the WAN (or LAN) environment. This is by far the best performance for securely computing these string metrics in the malicious model.

6.2. Comparison with [9] and [8]

These heuristics-based protocols are still more efficient than our protocols. However, those protocols are only able to approximate certain computations over very restricted sets of low-entropy strings and are not provably secure with respect to the standard definition of security for MPC protocols. It is also crucial to select a "good" reference string, since the accuracy of these heuristic protocols can be very sensitive to the choice of the reference strings. However, no secure methods to choose "good" reference strings are known.

The quality of an approximation method can be measured by the Root-Mean-Square Relative-Error $\sqrt{\left(\sum_{i=1}^{n} [(v_i - u)/u]^2\right)/n}$ where $\{v_1, \dots, v_n\}$ are $n$ approximations of a ground-truth value $u$ using the method. Typically, approximation methods with RMSRE ≥ 50% are not usable in most real-world string-comparison applications. We ran an experiment over the same dataset used by [8], [9], where we uniformly picked 2000 pairs of 3500-nucleotide genome strings and computed the edit distances between them using the protocols of [8], [9] with a randomly generated reference string. We observed a root-mean-square relative-error (RMSRE) of 75% and 59% using [9]'s and [8]'s approach, respectively. Both numbers clearly indicate serious accuracy issues in applying their methods in practice.
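The RMSRE measure can be computed directly from its definition, as in this small sketch. The function name is illustrative, and the formula is taken with the square root implied by "root-mean-square".

```python
import math

def rmsre(approximations, truth: float) -> float:
    """Root-mean-square relative error of approximations v_1..v_n
    against a single ground-truth value u."""
    n = len(approximations)
    total = sum(((v - truth) / truth) ** 2 for v in approximations)
    return math.sqrt(total / n)
```

As a toy example, approximations 150 and 50 of a true value 100 give an RMSRE of 0.5, i.e., 50%.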

In contrast, our approach doesn’t require any public reference string to work, can always produce accurate results, and can work for many variant string metrics over arbitrary strings. However, without knowing how to pick good reference strings, it is not possible to draw meaningful performance comparisons, even merely for the standard edit-distance case.

6.3. Comparison with protocols using ABY

We also find our GC-based baseline better than ABY-based protocols. Analytically speaking, this is because, for the string metrics considered in this paper,

  1. a pure Y approach is essentially the same as our baseline;

  2. a pure B approach (i.e., GMW on binary circuits) is no cheaper than Y; it merely allows moving the expensive cryptography into an input-independent offline phase (at the cost of linear online rounds). Thus, for overall efficiency, it does not make sense to combine B and Y either;

  3. even if A allows free addition, it can't do secure comparison efficiently (other than by first translating arithmetic encodings into binary encodings and then comparing using either B or Y). In the best known circuits for computing these string metrics, every addition gate is immediately followed by a comparison gate. Because secure wire-label conversion is not cheaper than secure addition using Y or B, using A alone or mixing it with B or Y won't produce better protocols than our baseline.

Micro-benchmarks.

We also measured the costs of addition, projection, and wire-label conversion. Due to page limit, we report our micro-benchmark experiments in Section D.

7. Conclusion

Customizing garbling schemes to specific computations can bring dramatic efficiency benefits. We have taken a first step to explore this methodology in constructing secure protocols for several representative string-comparison metrics. Our protocols are up to an order of magnitude more efficient than the best existing results, yet remain generic, accurate, and provably secure under the standard, preferred definition of security. The resulting actively-secure versions of these protocols are also the best of their kind. Our findings may shed some light on designing other application-specific MPC protocols in the future.

Appendix A. Proof of Theorem 1

Theorem 1.

If π is an ideal block cipher that is used to realize Enc and Dec as described above, then the scheme in Figure 2 satisfies the privacy and obliviousness definitions given in Section 2.1, and an application-dependent notion of authenticity.

Proof. Privacy:

Figure 5 describes a simulator $\mathsf{Sim}_{prv}$ that can be used to show our garbling scheme is private. The construction of $\mathsf{Sim}_{prv}$ is similar to $\mathsf{Gb}$ except for three changes that we highlighted in red: (1) $\mathsf{Sim}_{prv}$ has a third input $f(x)$; (2) it uses $f(x)_i$ to replace $t$ when producing the decoding information $d^{O_i}$; and (3) it calls $\mathsf{En}$ with an arbitrary legitimate input $x_0$ to produce $X$ in the end.

Fig. 5:

The Simulator for Proving Privacy

For any $x$, consider $(F, X, d)$ generated by

$$(F, e, d) \leftarrow \mathsf{Gb}(1^k, f)$$
$$X := \mathsf{En}(e, x)$$

and the tuple $(F', X', d')$ produced by $\mathsf{Sim}_{prv}^{x_0}(1^k, f, f(x))$. Should $\mathsf{Sim}_{prv}^{x_0}$ know $x$, it would not replace $t$ with $f(x)_i$ in producing $d_t^{O_i}$ but simply call $\mathsf{En}(\hat{e}, x)$ in the end to generate its encoded input. Let $(F'', X'', d'')$ denote the output of this variant, $\mathsf{Sim}_{prv}^{x}$. It is easy to see that $(F, X, d)$ and $(F'', X'', d'')$ are identically distributed. Now, to see $(F'', X'', d'') \approx (F', X', d')$, we note that

  1. the distinguisher cannot tell the two distributions apart by examining any garbled gates, because $\hat{e}$ is a tuple of uniform strings and $\mathsf{Sim}_{prv}^{x}$ and $\mathsf{Sim}_{prv}^{x_0}$ use exactly the same procedure to produce all garbled gates.

  2. For every output-wire $O_i$, for every $w_t^{O_i}$ the distinguisher does not learn, $d_t^{O_i}$ is no different from a random string (because π is an ideal cipher); from the $w_t^{O_i}$ learned by the distinguisher, the distinguisher can only get $f(x)_i$ from decrypting $d_t^{O_i}$, which is no different from what it would learn from examining $(F, X, d)$.

Obliviousness:

We simply observe that in $\mathsf{Sim}_{prv}$, $f(x)$ is used only to compute $d$, which is dropped in the security definition of obliviousness. Thus, the simulator $\mathsf{Sim}_{obl}$ can be derived from $\mathsf{Sim}_{prv}^{x}$ simply by dropping the input $f(x)$ and the third component $d$ in the output. The proof of privacy carries over to prove obliviousness.

Application-dependent Authenticity:

We note that due to the construction of $\mathsf{Enc}$, if the adversary $\mathcal{A}$ can provide any $Y$ such that $Y \ne \mathsf{Ev}(F, X)$ and $\mathsf{De}(d, Y) = k$ (where $(F, e, d) \leftarrow \mathsf{Gb}(1^k, f)$, $X := \mathsf{En}(e, x)$), then $\mathcal{A}$ must know $w_k = w_0 +_p k \times_p \Delta$, which is the output wire-label corresponding to $k$. However, without knowing $w_0$ and $\Delta$, for any particular $k$, $\mathcal{A}$ can only guess $w_k = w_0 +_p k \times_p \Delta$ correctly with probability at most $1/p$. Hence, letting $n$ be the size of the domain of the plaintext $k$, the adversary can only succeed in guessing a valid output wire-label with probability at most $n/p$. Thus, our scheme guarantees $n/p$-authenticity. Since $n$ can vary with the application, we call this notion of authenticity application-dependent.

Remark.

In many practical applications such as string-comparison, it is easy to bound the value of n to small (application-specific) constants (e.g., < 300 in all applications considered in this paper) so that n/p is negligible.
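To get a feel for the n/p bound, the following sketch converts a plaintext-domain size into bits of authenticity. It assumes p > 2^87, matching the 87-bit figure used in Section 4.1; the function name is illustrative.

```python
import math

def authenticity_bits(domain_size: int, p_bits: int = 87) -> float:
    """Bits of authenticity implied by an error of n/p with p > 2^p_bits,
    i.e. p_bits - log2(n). Assumes p_bits = 87 unless stated otherwise."""
    return p_bits - math.log2(domain_size)
```

With a domain size of 2^32 this reproduces the 55-bit figure from Section 4.1, and with n = 300 it gives a little under 79 bits.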

Appendix B. Proof of Proposition 1

Proposition 1.

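Let $s$, $t$, $D_{i,j}$, $c_{ins}$, $c_{del}$, $c_{sub}$ be defined as in Section 2.2, where $c_{ins}$, $c_{del}$ are generalized to one-dimensional tables and $c_{sub}$ is generalized to a two-dimensional table. Let

$$m_{i,j} = \min(D_{i,j-1} + c_{del}[t[j]],\; D_{i-1,j-1} + c_{sub}[s[i], t[j]])$$
$$u_{i,j} = (D_{i,j-1} + c_{del}[t[j]]) - (D_{i-1,j-1} + c_{sub}[s[i], t[j]])$$
$$v_{i,j} = (D_{i-1,j} + c_{ins}[s[i]]) - m_{i,j}$$

Then, there exist public constants $C_1$, $C_2$, $C_3$, $C_4$, which are independent of $D_{i,j}$, such that for all valid indices $i, j$,

$$C_1 \le u_{i,j} \le C_2, \quad C_3 \le v_{i,j} \le C_4.$$

Proof. Because $|D_{i,j-1} - D_{i-1,j-1}| \le c_{ins}[s[i]]$, we have

$$D_{i-1,j-1} - c_{ins}[s[i]] \le D_{i,j-1} \le D_{i-1,j-1} + c_{ins}[s[i]]$$

so,

$$D_{i,j-1} + c_{del}[t[j]] \ge D_{i-1,j-1} - c_{ins}[s[i]] + c_{del}[t[j]]$$
$$D_{i,j-1} + c_{del}[t[j]] \le D_{i-1,j-1} + c_{ins}[s[i]] + c_{del}[t[j]]$$

hence,

$$u_{i,j} = D_{i,j-1} + c_{del}[t[j]] - (D_{i-1,j-1} + c_{sub}[s[i], t[j]]) \ge c_{del}[t[j]] - c_{ins}[s[i]] - c_{sub}[s[i], t[j]] \quad (5)$$
$$u_{i,j} = D_{i,j-1} + c_{del}[t[j]] - (D_{i-1,j-1} + c_{sub}[s[i], t[j]]) \le c_{ins}[s[i]] + c_{del}[t[j]] - c_{sub}[s[i], t[j]] \quad (6)$$

So we can set

$$C_1 := \min_{i,j}\,(c_{del}[t[j]] - c_{ins}[s[i]] - c_{sub}[s[i], t[j]]),$$
$$C_2 := \max_{i,j}\,(c_{ins}[s[i]] + c_{del}[t[j]] - c_{sub}[s[i], t[j]]),$$

and we have $C_1 \le u_{i,j} \le C_2$.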

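Symmetrically, we can derive that

$$(D_{i-1,j} + c_{ins}[s[i]]) - (D_{i-1,j-1} + c_{sub}[s[i], t[j]]) \ge c_{ins}[s[i]] - c_{sub}[s[i], t[j]] - c_{del}[t[j]] \quad (7)$$
$$(D_{i-1,j} + c_{ins}[s[i]]) - (D_{i-1,j-1} + c_{sub}[s[i], t[j]]) \le c_{ins}[s[i]] + c_{del}[t[j]] - c_{sub}[s[i], t[j]] \quad (8)$$

(7) − (6) yields

$$(D_{i-1,j} + c_{ins}[s[i]]) - (D_{i,j-1} + c_{del}[t[j]]) \ge -2c_{del}[t[j]] \quad (9)$$

(8) − (5) yields

$$(D_{i-1,j} + c_{ins}[s[i]]) - (D_{i,j-1} + c_{del}[t[j]]) \le 2c_{ins}[s[i]] \quad (10)$$

Since $v_{i,j}$ is the larger of the two differences bounded above, we know from (7) and (9) that

$$v_{i,j} \ge \max(c_{ins}[s[i]] - c_{sub}[s[i], t[j]] - c_{del}[t[j]],\; -2c_{del}[t[j]])$$

and from (8) and (10) that

$$v_{i,j} \le \max(c_{ins}[s[i]] + c_{del}[t[j]] - c_{sub}[s[i], t[j]],\; 2c_{ins}[s[i]])$$

Finally, by defining

$$C_3 := \min_{i,j}\,(\max(c_{ins}[s[i]] - c_{sub}[s[i], t[j]] - c_{del}[t[j]],\; -2c_{del}[t[j]]))$$
$$C_4 := \max_{i,j}\,(\max(c_{ins}[s[i]] + c_{del}[t[j]] - c_{sub}[s[i], t[j]],\; 2c_{ins}[s[i]]))$$

we have proved $C_3 \le v_{i,j} \le C_4$. □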
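The bounds can be checked empirically for the simplest case. The sketch below assumes the standard unit-cost edit distance (insertion and deletion cost 1, substitution cost 0 or 1), for which the formulas above give $C_1 = C_3 = -1$ and $C_2 = C_4 = 2$; the function names are illustrative and the general weighted case is not covered.

```python
def edit_distance_table(s: str, t: str):
    """Standard unit-cost edit-distance DP table D[i][j]."""
    D = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        D[i][0] = i
    for j in range(len(t) + 1):
        D[0][j] = j
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1, D[i][j - 1] + 1,
                          D[i - 1][j - 1] + sub)
    return D

def u_v_ranges(s: str, t: str):
    """Observed (min, max) of u_{i,j} and v_{i,j} over the whole table."""
    D = edit_distance_table(s, t)
    us, vs = [], []
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            a = D[i][j - 1] + 1        # deletion branch
            b = D[i - 1][j - 1] + sub  # substitution branch
            x = D[i - 1][j] + 1        # insertion branch
            us.append(a - b)
            vs.append(x - min(a, b))
    return (min(us), max(us)), (min(vs), max(vs))
```

The small, input-independent ranges are what keep the projection gates at a constant number of rows.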

Appendix C. Actively-Secure Protocols

We use three ideal functionalities $\mathcal{F}_{\text{IHash}}$ (interactive hash), $\mathcal{F}_{\text{COT}}$ (correlated OT), and $\mathcal{F}_{\text{coin-toss}}$ (coin tossing) defined in Section C.4. First, the garbler is required to use coin-tossed randomness to run $\mathsf{Gb}$. This prevents an adversarial garbler from compromising the correctness of any garbled table by selecting problematic randomness. To ensure P1's input wire-labels to the evaluation circuits denote the same plaintext value, we use $\mathcal{F}_{\text{IHash}}$, an XOR-homomorphic interactive hash implementation that was also used by JIMU [33] for similar purposes. Each initial input and final output wire in f's circuit is associated with a random permutation bit $\lambda_I$, and the evaluator knows the i-hash of the bit $\lambda_I$ and the i-hash of $m^I_{\lambda_I}$ (the wire-label denoting the bit $\lambda_I$). Thanks to the XOR-homomorphism of $\mathcal{F}_{\text{IHash}}$, it is easy for the evaluator to securely translate a wire-label $m^I_b$ (a label on the master circuit denoting $b$) into $m^{I,i}_b$ (a label on the $i$-th GC denoting $b$) for any $b \in \{0,1\}$, given their i-hashes and their XOR-differences (see Step 5).

We assume y has more than 40 bits. To ensure the evaluator uses a consistent y in all evaluation GCs, the parties run the correlated OT functionality $\mathcal{F}_{\text{COT}}$ once for the evaluator to learn the wire-labels $\{m^I_{y_I}\}_{I \in \mathsf{Inp}(P_2)}$, which represent y on the master circuit. For evaluation, these master wire-labels are translated to wire-labels on each evaluation GC using XOR-differences whose validity is guaranteed by $\mathcal{F}_{\text{IHash}}$.

Finally, $\mathcal{F}_{\text{IHash}}$ also allows the evaluator to determine which output wire-labels are valid and to identify inconsistent but valid output wire-labels. Note that inconsistent valid wire-labels reveal $\delta$ of the master circuit, and further reveal the garbler's input x when the garbler cheats (see Step 6).

C.1. Full Protocol Description

Let s be the statistical security parameter. Assume P1 (the circuit generator holding input string x) and P2 (the circuit evaluator holding input string y that has more than s bits) want to compute a string-comparison metric f between x, y.

1. Setup.

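On cut-and-choose parameter $n$ and computational security parameter $\kappa$, P1 and P2 call $\mathcal{F}_{\text{coin-toss}}$ for P1 to learn $\{\mathsf{seed}_i \in \{0,1\}^\kappa\}_{i \in [n]}$. For $i \in [n]$, P1 sets

$$(\delta_i, \Delta_i) := (\mathsf{PRG}(\mathsf{seed}_i, \text{"delta"}),\; \mathsf{PRG}(\mathsf{seed}_i, \text{"Delta"}))$$

and sends $\{\delta_i\}_{i \in [n]}$ to P2 through $\mathcal{F}_{\text{IHash}}$.

The master circuit.

P1 samples $\delta \leftarrow \{0,1\}^\kappa$. For every input-wire $I$ of P1's input to $f$, P1 samples a uniform random bit $\lambda_I := \mathsf{PRG}(\delta, I \| \text{"lambda"})$; for every input-wire $I$ of P2's input to $f$ and every output-wire $I$ of $f$, P1 sets $\lambda_I := 0$. For every input-wire or output-wire $I$ of $f$, P1 samples $m_0^I \leftarrow \{0,1\}^\kappa$, sets $m_1^I := m_0^I \oplus \delta$, and sends $\lambda_I$, $m^I_{\lambda_I}$, $\delta$ to P2 through $\mathcal{F}_{\text{IHash}}$.

OT of seeds.

P1 picks a uniform $\Delta \in \{0,1\}^\kappa$. P1 and P2 call $\mathcal{F}_{\text{coin-toss}}$ for P2 to learn an $n$-bit string $J$ sampled from a certain public distribution (see [30] for the details on how this public distribution of $J$ is calculated). Then P1 with input $(\Delta, \{\mathsf{seed}_i\}_{i \in [n]})$ and P2 with input $J$ call $\mathcal{F}_{\text{COT}}$ for P2 to learn $\{\mathsf{seed}_i \mid J_i = 1\}$ and $\{\mathsf{seed}_i \oplus \Delta \mid J_i = 0\}$.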

2. Inputs.

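For every input-wire $I$ in the $i$-th GC, P1 sets

$$m_0^{I,i} := \mathsf{PRG}(\delta_i, (I, i) \| \text{"label"}),$$
$$m_1^{I,i} := m_0^{I,i} \oplus \delta_i.$$

Then,

  1. P1's Input. For every input-wire $I$ of P1's input in the $i$-th garbled circuit, P1 sets
    $$\lambda_{I,i} := \mathsf{PRG}(\delta_i, (I, i) \| \text{"lambda"}).$$
  2. P2's Input. For every input-wire $I$ of P2's input in the $i$-th garbled circuit, P1 sets
    $$\lambda_{I,i} := 0.$$
    (C-OT) P1 with $(\delta, \{m_0^I\}_{I \in \mathsf{Inp}(P_2)})$ and P2 with $\{y_I\}_{I \in \mathsf{Inp}(P_2)}$ invoke $\mathcal{F}_{\text{COT}}$ so P2 learns $\{m^I_{y_I}\}_{I \in \mathsf{Inp}(P_2)}$. P2 verifies that $m^I_{y_I}$ matches $m_0^I \oplus y_I \delta$ for all $I$, and aborts otherwise.

    Now, for every input-wire $I$ in the $i$-th GC, P1 sends $\lambda_{I,i}$, $m^{I,i}_{\lambda_{I,i}}$ to P2 through $\mathcal{F}_{\text{IHash}}$.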

3. Garble.

P1 generates $n$ garbled circuits $\{GC_i\}_{i \in [n]}$ for $f$ as follows:

  1. For every input-wire $I$ of the $i$-th garbled circuit, P1 generates arithmetic wire-labels
    $$w_0^{I,i} := \mathsf{PRG}(\delta_i, I, i)$$
    $$w_1^{I,i} := w_0^{I,i} +_p \Delta_i,$$
    then sends an ordered pair
    $$[\mathsf{Enc}_{m^{I,i}_{\lambda_{I,i}}}(w^{I,i}_{\lambda_{I,i}}),\; \mathsf{Enc}_{m^{I,i}_{1-\lambda_{I,i}}}(w^{I,i}_{1-\lambda_{I,i}})],$$
    which allows securely translating binary field encodings into their $\mathbb{Z}_p$ encodings.
  2. For addition, subtraction, constant multiplication and bounded-value projection gates, P1 runs the $\mathsf{Gb}$ algorithm of the garbling scheme of Figure 2.

  3. For every output-wire $I$ of the $i$-th garbled circuit, P1 sends a secure projection table that allows translating the arithmetic encoding $w_v^{I,i}$ ($v$ takes a bounded number of values) to its binary encodings $m^{I,i,0}_{b_0}, \dots, m^{I,i,k}_{b_k}$ where $v = b_0 b_1 \dots b_k$ and $m_0^{I,i,j} := \mathsf{PRG}(\delta_i, I, i, j)$, $m_1^{I,i,j} := m_0^{I,i,j} \oplus \delta_i$ for all $j \in [k]$. P1 sends $\{m_0^{I,i,j}\}_{I \in \mathsf{Output}(f),\, i \in [n],\, j \in [k]}$ via $\mathcal{F}_{\text{IHash}}$.

P1 sends $H(GC_i)$ to P2 ($H$ is a collision-resistant hash).

4. Check.

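For each check-circuit $GC_i$, namely those $i \in [n]$ with $J_i = 1$, P2 uses $\mathsf{seed}_i$ to verify that P1 has played honestly in all previous steps, and aborts otherwise. In particular, P2 checks the following constraints:

  1. $GC_i$ generated from $\mathsf{seed}_i$ matches its hash $H(GC_i)$.

  2. $\delta_i$ generated from $\mathsf{seed}_i$ matches the i-hash of $\delta_i$ received via $\mathcal{F}_{\text{IHash}}$.

  3. $m_0^{I,i}$ generated from $\delta_i$ matches the i-hash of $m_0^{I,i}$ received via $\mathcal{F}_{\text{IHash}}$.

  4. $\lambda_{I,i}$ generated from $\delta_i$ matches the i-hash of $\lambda_{I,i}$ received via $\mathcal{F}_{\text{IHash}}$.

  5. For all $I \in \mathsf{Output}(f)$, $j \in [k]$, the wire-label $m_0^{I,i,j}$ of $GC_i$ matches the i-hash of $m_0^{I,i,j}$ received via $\mathcal{F}_{\text{IHash}}$.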

5. Evaluate.

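For each evaluation-circuit $GC_i$, namely those $i \in [n]$ with $J_i = 0$, P2 sends $\mathsf{seed}_i \oplus \Delta$ to P1, who verifies the consistency of the value. Then, P1 and P2 collaborate to evaluate these circuits. For every evaluation-circuit $GC_i$, P1 sends $\delta \oplus \delta_i$ to P2, who verifies it against the i-hashes of $\delta$ and $\delta_i$.

  1. P1's Input. For every input-wire $I$ of P1's input $x_I$, P1 sends $m^I_{x_I}$, $\lambda_I \oplus \lambda_{I,i}$, and $m^I_{\lambda_I} \oplus m^{I,i}_{\lambda_{I,i}} \oplus (\lambda_I \oplus \lambda_{I,i})\delta_i$ to P2, who verifies their validity against their i-hashes through $\mathcal{F}_{\text{IHash}}$. P2 computes $m^{I,i}_{x_I} := m^I_{x_I} \oplus (m^I_{\lambda_I} \oplus m^{I,i}_{\lambda_{I,i}} \oplus (\lambda_I \oplus \lambda_{I,i})\delta_i) \oplus (\lambda_I \oplus x_I)(\delta \oplus \delta_i)$.

  2. P2's Input. For every input-wire $I$ of P2's input $y_I$, P1 sends $m_0^I \oplus m_0^{I,i}$ to P2, who verifies its validity against the i-hashes through $\mathcal{F}_{\text{IHash}}$. P2 computes $m^{I,i}_{y_I} := m^I_{y_I} \oplus (m_0^I \oplus m_0^{I,i}) \oplus y_I(\delta \oplus \delta_i)$.

  3. Eval. With the wire-labels obtained above, P2 evaluates the garbled circuit according to the garbling scheme's $\mathsf{Ev}$ method.

  4. Check. P2 verifies that all $GC_i$ ($i \in [n]$) received match their hashes received in Step 3.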

6. Output.

For every output-wire I in the i-th evaluation circuit, P1 sends $m_0^{I} \oplus m_0^{I,i}$ to P2, who verifies its validity through FIHash. P2 validates every output wire-label obtained from circuit evaluation against their i-hashes, then translates them to plaintext values.

  1. If all valid wire-labels that P2 obtained from evaluating the nk circuits refer to the same plaintext value, then P2 outputs this value and halts.

  2. If P2 obtains two valid output wire-labels on the same output wire I which decode to different plaintext values, then P2 can obtain valid $m_0^{I}$ and $m_1^{I}$ simultaneously by:
    $m_0^{I} := m_0^{I,i_1} \oplus (m_0^{I} \oplus m_0^{I,i_1})$
    $m_1^{I} := m_1^{I,i_2} \oplus (m_0^{I} \oplus m_0^{I,i_2}) \oplus (\delta \oplus \delta_{i_2})$
    for some $i_1$, $i_2$. Then P2 can learn $\delta := m_0^{I} \oplus m_1^{I}$, whose value can be validated through FIHash. With $\delta$, P2 can learn $\{\lambda_I\}_{I \in \mathrm{Inp}(P_1)}$, and further recover P1’s input x from $\{\lambda_I\}_{I \in \mathrm{Inp}(P_1)}$, $\{m_{\lambda_I}^{I}\}_{I \in \mathrm{Inp}(P_1)}$ and those $\{m_{x_I}^{I}\}_{I \in \mathrm{Inp}(P_1)}$ it received in Step 5. P2 locally computes f(x, y) and outputs it.
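The recovery of $\delta$ in case 2 is a short XOR computation. The following sketch replays it on random values, assuming (as in the formulas) that circuit $i_1$ yielded the label for 0 and circuit $i_2$ yielded the label for 1 on the same output wire; the 16-byte label length and the variable names are hypothetical.

```python
import secrets

K = 16  # label length in bytes (assumed)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def recover_delta():
    # Secrets held by P1.
    delta = secrets.token_bytes(K)       # global offset
    m0 = secrets.token_bytes(K)          # global zero-label of output wire I
    delta_i1, delta_i2 = secrets.token_bytes(K), secrets.token_bytes(K)
    m0_i1, m0_i2 = secrets.token_bytes(K), secrets.token_bytes(K)

    # What P2 holds after evaluation and Steps 5 and 6.
    label_zero_from_i1 = m0_i1                   # decodes to 0 in circuit i1
    label_one_from_i2 = xor(m0_i2, delta_i2)     # decodes to 1 in circuit i2
    diff_i1 = xor(m0, m0_i1)                     # m0 XOR m0^{i1}, sent in Step 6
    diff_i2 = xor(m0, m0_i2)                     # m0 XOR m0^{i2}, sent in Step 6
    offset_diff_i2 = xor(delta, delta_i2)        # delta XOR delta_{i2}, sent in Step 5

    # P2's local recovery, following the two formulas above.
    rec_m0 = xor(label_zero_from_i1, diff_i1)
    rec_m1 = xor(xor(label_one_from_i2, diff_i2), offset_diff_i2)
    return xor(rec_m0, rec_m1), delta


recovered, actual = recover_delta()
```

Inconsistent outputs therefore hand P2 the global offset, which is what lets it unmask P1's input and compute the result locally.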

C.2. Proof of Theorem 2

Theorem 2.

The improved garbling scheme of Section 4.3 satisfies the privacy, obliviousness, and authenticity properties outlined in Section 2.1.

Proof. Note that the garbling mechanism for addition and projection is the same as that of our basic garbling scheme described in Section 3. The proofs of the privacy and obliviousness properties are essentially the same as those of the main theorem (Theorem 1). Thus, below we focus on the proof of authenticity. Assume for the purpose of contradiction that the adversary A can provide some Y such that $Y \neq \mathrm{Ev}(F, X)$ but $\mathrm{De}(d, Y) = k$ (where $(F, e, d) \leftarrow \mathrm{Gb}(1^k, f)$, $X := \mathrm{En}(e, x)$). Then A must know $L_a = (k_0 +_p a \times_p \Delta,\ \hat{k}_0 +_p a \times_p \hat{\Delta})$ for some a. However, without knowing any of $k_0, \hat{k}_0, \Delta, \hat{\Delta}$, for any particular a, the probability that A succeeds in guessing an $L_a = (L, \hat{L})$ such that

$(L - k_0) \times_p \Delta^{-1} = (\hat{L} - \hat{k}_0) \times_p \hat{\Delta}^{-1}$

is at most 1/p (because both sides of the equation above are uniformly distributed over $\mathbb{Z}_p$). Therefore, with e.g. $p > 2^{87}$, the authenticity error will be less than $2^{-87}$. □
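The consistency equation in the proof can be exercised directly. The sketch below builds labels of the form $(k_0 + a\Delta,\ \hat{k}_0 + a\hat{\Delta})$ modulo a prime and checks that honest labels satisfy the equation while a blind guess almost never does. The modulus $2^{89} - 1$ (a Mersenne prime larger than $2^{87}$) is an assumed stand-in for the paper's prime, and the function names are hypothetical. The modular inverse via three-argument `pow` requires Python 3.8 or later.

```python
import secrets

P = 2**89 - 1  # assumed stand-in prime with P > 2**87


def make_label(k0: int, k0_hat: int, d: int, d_hat: int, a: int):
    """Honest label pair encoding the value a."""
    return ((k0 + a * d) % P, (k0_hat + a * d_hat) % P)


def is_consistent(lab, k0: int, k0_hat: int, d: int, d_hat: int) -> bool:
    """Check (L - k0) * d^-1 == (L_hat - k0_hat) * d_hat^-1 (mod P)."""
    L, L_hat = lab
    lhs = (L - k0) * pow(d, -1, P) % P
    rhs = (L_hat - k0_hat) * pow(d_hat, -1, P) % P
    return lhs == rhs


k0, k0_hat = secrets.randbelow(P), secrets.randbelow(P)
d, d_hat = 1 + secrets.randbelow(P - 1), 1 + secrets.randbelow(P - 1)  # nonzero

honest_ok = all(is_consistent(make_label(k0, k0_hat, d, d_hat, a), k0, k0_hat, d, d_hat)
                for a in range(16))

# A guess made without the secrets is consistent only with probability about 1/P.
guess = (secrets.randbelow(P), secrets.randbelow(P))
guess_ok = is_consistent(guess, k0, k0_hat, d, d_hat)
```

For an honest label both sides of the equation equal a, so the check also recovers the encoded value.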

C.3. Proof of Theorem 3

Theorem 3.

The protocol of Section C.1 securely computes f in the presence of malicious adversaries.

Proof. We prove the security in a hybrid-model where the parties have access to ideal functionalities for FIHash, FCOT, and Fcoin-toss. The standard composition theorem [34] implies security when the sub-routines are instantiated with secure implementations of these functionalities.

If P1 is corrupted. We construct an efficient simulator S interacting with the ideal string-metrics functionality as P1. S runs the corrupted real-model P1 as a subroutine, interacting with it like real-model P2 with input y = 0 using the protocol of Section C.1, except for the following changes:

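  1. In Step 1., through the simulated FIHash, S learns $\{\lambda_I\}_{I \in \mathrm{Input}(P_1)}$.

  2. In Step 5.a, S learns $\{m_{x_I}^{I}\}_{I \in \mathrm{Input}(P_1)}$. For all $I \in \mathrm{Input}(P_1)$, if $m_{x_I}^{I}$ matches $m_{\lambda_I}^{I}$, then S sets $x_I := \lambda_I$; if $m_{x_I}^{I}$ matches $m_{\lambda_I}^{I} \oplus \delta$, then S sets $x_I := \overline{\lambda_I}$.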

  3. In Step 6., S submits x to the trusted functionality and outputs whatever P1 outputs.
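The input-extraction rule used by the simulator in the second change can be written as a small function. This is a sketch with hypothetical names and a 16-byte label length; returning `None` for a label that matches neither candidate is an added convention for the illustration, not something the proof specifies.

```python
import secrets
from typing import Optional

K = 16  # label length in bytes (assumed)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def extract_input_bit(received: bytes, m_lambda: bytes,
                      delta: bytes, lam: int) -> Optional[int]:
    """Recover x_I from the label P1 sent, given the mask bit lam and its label."""
    if received == m_lambda:
        return lam          # label encodes the mask bit itself
    if received == xor(m_lambda, delta):
        return 1 - lam      # label encodes the complement of the mask bit
    return None             # neither candidate matches


delta = secrets.token_bytes(K)
m_lambda = secrets.token_bytes(K)
lam = 1
same = extract_input_bit(m_lambda, m_lambda, delta, lam)
flipped = extract_input_bit(xor(m_lambda, delta), m_lambda, delta, lam)
```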

To show that the joint output distribution in this ideal-model involving S is indistinguishable from that of the real-model involving the corrupted P1, we consider a series of experiments each with a slightly modified simulator.

  1. Hybrid1 The simulator S1 interacts with the corrupted P1 running the real-model protocol, where S1 uses P2’s actual input y as its input. Their interaction is identical to the real-model execution.

  2. Hybrid2 Simulator S2 acts the same way as S1 in Hybrid1, except:
    1. In Step 1., through the simulated FIHash, S2 learns $\{\lambda_I\}_{I \in \mathrm{Input}(P_1)}$.
    2. In Step 5.a, S2 learns $\{m_{x_I}^{I}\}_{I \in \mathrm{Input}(P_1)}$. For all $I \in \mathrm{Input}(P_1)$, if $m_{x_I}^{I}$ matches $m_{\lambda_I}^{I}$, then S2 sets $x_I := \lambda_I$; if $m_{x_I}^{I}$ matches $m_{\lambda_I}^{I} \oplus \delta$, then S2 sets $x_I := \overline{\lambda_I}$.
    3. In Step 6., S2 outputs f(x, y).
    We claim Hybrid2 ≈ Hybrid1 because
    • To the corrupted P1, the only messages it receives from the simulators are $\{\mathrm{seed}_i \oplus \Delta \mid i \in [n], J_i = 0\}$ in Step 4. However, these messages in Hybrid1 and Hybrid2 are identically distributed, a fact guaranteed by FCOT.
    • In both experiments, the simulators correctly output f(x, y) if at least one correct GC is evaluated.
  3. Hybrid3 Simulator S3 acts the same way as S2 in Hybrid2, except:
    1. S3 runs the corrupted P1 as a sub-routine and takes an ideal P1’s role to interact with the ideal string-metrics functionality.
    2. In Step 6., S3 submits x to the ideal string-metrics functionality and outputs whatever the corrupted P1 outputs.
    We claim Hybrid3 ≈ Hybrid2 because
    • S3’s output is the same as the corrupted P1’s in Hybrid2.
    • The ideal P2 in Hybrid3 and S2 in Hybrid2 both output f(x, y).
  4. Hybrid4 Simulator S4 acts the same way as S3, except that S4 uses y = 0 instead of P2’s actual input as its input when interacting with the corrupted P1. S4 is identical to S. This is the ideal-model execution.

We claim Hybrid4 ≈ Hybrid3 because the real-model P2’s outgoing-message distributions (including whether and when P2 aborts) do not depend on the value of its input y.

If P2 is corrupted. We construct an efficient simulator S interacting with the ideal string-metrics functionality as an ideal-model P2. S will run the corrupted P2 as a sub-routine, interacting with it as real-model P1 with input x = 0 using the protocol of Section C.1, except for the following changes:

  1. In Step 1., S learns J through the simulated FCOT.

  2. In Step 2., S learns y through the simulated FCOT. S sends y to the ideal functionality and gets back z=f(x,y).

  3. In Step 3., for all $i \in [n]$ with $J_i = 1$, S generates $\mathrm{GC}_i$ honestly. For all $i \in [n]$ with $J_i = 0$, S runs the simulator $S_{\mathrm{prv}}(f, z)$ to produce $\mathrm{GC}_i$ (see the privacy definition of garbling for $S_{\mathrm{prv}}$).

  4. In Step 6., S outputs whatever the malicious P2 outputs.

To show that the joint output distribution in this ideal-model involving S is indistinguishable from that of the real-model involving the corrupted P2, we consider a series of experiments each with a slightly modified simulator.

  1. Hybrid1 The simulator S1 interacts with the corrupted P2 using the real-model protocol with P1’s actual input x. This is the real-model execution.

  2. Hybrid2 The simulator S2 is the same as S1 in Hybrid1, except:
    1. In Step 1., S2 learns J through the simulated FCOT.
    2. In Step 2., S2 learns y through the simulated FCOT.
    3. In Step 3., for all $i \in [n]$ with $J_i = 1$, S2 generates $\mathrm{GC}_i$ honestly. For all $i \in [n]$ with $J_i = 0$, S2 runs the simulator $S_{\mathrm{prv}}(f, z)$ to produce $\mathrm{GC}_i$ (see the privacy definition of garbling for $S_{\mathrm{prv}}$).

    We claim Hybrid2 ≈ Hybrid1 because our garbling scheme is proven to be private, hence the corrupted P2 cannot tell if a GC is honestly garbled or simulated with a chosen output z.

  3. Hybrid3 Simulator S3 is the same as S2 in Hybrid2, except:
    1. S3 runs the corrupted P2 as a subroutine and interacts with the ideal string-metrics functionality as an ideal-model P2.
    2. In Step 3., instead of computing f(x, y), S3 submits y to the ideal functionality and receives f(x, y).
    3. In Step 6., S3 outputs whatever the corrupted P2 outputs.
    We claim Hybrid3 ≈ Hybrid2 because
    • S3’s output is the same as the corrupted P2’s in Hybrid2.
    • The ideal-model P1 in Hybrid3 and S2 in Hybrid2 both have no output.
  4. Hybrid4 The simulator S4 is the same as S3 in Hybrid3, except that it uses x = 0 as its input to interact with the corrupted P2. S4 is identical to S and this is the ideal-model execution.

We claim Hybrid4 ≈ Hybrid3 because the real-model P1’s outgoing-message distributions (including whether and when P1 aborts) do not depend on the value of x. □

C.4. Definition of FIHash, FCOT, and Fcoin-toss

The FIHash Functionality.

We adopt the definition of FIHash from that of JIMU [33]. Note that FIHash allows verifying the validity of the XOR-difference among several previously hashed messages. FIHash also enables the receiver to verify any single message by calling Verify on a single message (i.e., t = 1).

Fig. 6: The ideal functionality FIHash.
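Since the figure itself is not reproduced in this text, the following Python class is only a guess at the shape of such an ideal functionality, based on the prose description above: messages are registered once, and a later verification checks a claimed XOR of t previously registered messages (t = 1 verifies a single message). The class name, the handle-based interface, and the method names are all hypothetical.

```python
from functools import reduce


class IdealIHash:
    """Toy trusted party: stores hashed messages and answers XOR-verification queries."""

    def __init__(self):
        self._store = {}

    def hash(self, handle: str, message: bytes) -> None:
        """Sender registers a message under a handle; it cannot be changed later."""
        if handle in self._store:
            raise ValueError("handle already used")
        self._store[handle] = message

    def verify(self, handles, claimed: bytes) -> bool:
        """Receiver checks that the XOR of the named messages equals `claimed`."""
        messages = [self._store[h] for h in handles]
        combined = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), messages)
        return combined == claimed


functionality = IdealIHash()
functionality.hash("m0", b"\x0f\x0f")
functionality.hash("m0_i", b"\xf0\x01")
single_ok = functionality.verify(["m0"], b"\x0f\x0f")            # t = 1
difference_ok = functionality.verify(["m0", "m0_i"], b"\xff\x0e")  # t = 2
```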

The FCOT Functionality.

FCOT is the correlated OT functionality as defined in Figure 7. It can be efficiently realized with a small modification to the actively-secure OT-extension protocol of Keller et al. [35]. The idea of FCOT was also used in authenticated garbling [15], [36] to construct authenticated multiplicative triples.

The Fcoin-toss Functionality.

On receiving “init” from both parties, Fcoin-toss samples a uniform bit-string s and sends it to the designated party (while allowing premature aborts).

Appendix D. Micro-benchmark Experiments

We measured the performance of several basic operations under our garbling scheme. All experiments in this subsection are conducted with respect to 87-bit computational security.

Secure Addition.

Table 3 shows the performance of secure addition in our approach. Because addition is (almost) free, our scheme is able to perform one addition every 2.8 nanoseconds, regardless of the bit-length of the numbers to add. This result is in line with the cost of computing a mod-p addition on this hardware. In contrast, the costs of binary-circuit-based adders (powered by Half-Gates) increase roughly linearly with the width of the adder. By the timings in Table 3, ours is roughly 500–4,000 times faster (1420/2.8 ≈ 507 for 8-bit and 11100/2.8 ≈ 3,964 for 64-bit operands) and consumes no bandwidth.

Fig. 7: The Correlated OT functionality FCOT.

Fig. 8: Costs of secure table-lookup. (Timings are measured by averaging over $10^6$ runs.)

Secure Table-Lookup.

This is also the essential enabling primitive for secure comparison and bounded range minimum computations. Figure 8 shows the efficiency of secure table-lookup with our scheme and compares it to the best existing garbled-circuit-based implementation. Two relevant parameters are used to describe the table: the table size (i.e., the number of entries in the table) and the bit-length of each entry. With our scheme, the cost of secure table-lookup grows linearly with the number of entries in the table, but not the bit-length of the entries.

In contrast, a garbled-circuit-based table-lookup costs more when the values in the table grow bigger, because the secure multiplexers have to take wider inputs. In our experiments, we assumed the table contains either 4-, 8-, or 12-bit values, representing the value range of constant tables used in many practical applications. On these table parameters, our approach is 3.6–20 times faster and 6–23 times more bandwidth-efficient.

Wire-label Conversions.

Converting Boolean wire-labels from the binary circuit garbling scheme into arithmetic wire-labels in our scheme is highly efficient, at about 420 ns (and ∼32 bytes of bandwidth) per bit of Boolean wire-label, since it involves only two garbled rows per Boolean wire (Table 4).

Converting arithmetic wire-labels into the Boolean ones used in Half-Gates is comparatively more expensive. The generic method needs 9.6 milliseconds and 2 MB per arithmetic wire-label, mostly spent on oblivious mod-p multiplication under the Half-Gates garbling scheme. However, if the arithmetic wire-label is known to denote values of a smaller range (usually < $2^{20}$ possibilities), the faster secret-sharing-based label conversion method turns out to be very efficient. For example, if the range of the arithmetic signal is up to $2^8$, converting an arithmetic wire-label takes less than 11 μs and 4.2 KB of bandwidth (Table 4). We empirically find that the secret-sharing-based conversion can outperform the generic method when the plaintext value is within $2^{16}$.

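TABLE 3: Costs of secure additions

| Scheme | Time (ns), 8-bit | Time (ns), 16-bit | Time (ns), 32-bit | Time (ns), 64-bit | Bandwidth (bytes), 8-bit | Bandwidth (bytes), 16-bit | Bandwidth (bytes), 32-bit | Bandwidth (bytes), 64-bit |
|---|---|---|---|---|---|---|---|---|
| Half-Gates [10] | 1420 | 2770 | 5520 | 11100 | 154 | 330 | 682 | 1386 |
| This Work | 2.8 | 2.8 | 2.8 | 2.8 | 0 | 0 | 0 | 0 |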

We note the timings coincide well with the cost of AESNI-based garbling (~45 ns/row) and that of modulo arithmetic with respect to an 88-bit prime (~2.8 ns per mod-p addition). Timings are averaged over $10^6$ runs for Half-Gates and $10^9$ runs for ours.
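As a quick cross-check of the speedup implied by Table 3, the ratios can be recomputed from the reported timings. The snippet uses only the numbers in the table; the variable names are illustrative.

```python
# Timings from Table 3, in nanoseconds per addition.
half_gates_ns = {8: 1420, 16: 2770, 32: 5520, 64: 11100}
ours_ns = 2.8  # independent of the operand width

# Speedup of the arithmetic scheme over Half-Gates for each adder width.
speedup = {width: t / ours_ns for width, t in half_gates_ns.items()}

for width, ratio in speedup.items():
    print(f"{width}-bit adder: about {ratio:.0f}x faster")
```

The ratio roughly doubles each time the adder width doubles, reflecting the linear growth of the Half-Gates cost against a constant cost for mod-p addition.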

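TABLE 4: Costs of label conversions.

| Conversion | Time (μs), 8-bit | Time (μs), 16-bit | Time (μs), 32-bit | Time (μs), 64-bit | Bandwidth (KB), 8-bit | Bandwidth (KB), 16-bit | Bandwidth (KB), 32-bit | Bandwidth (KB), 64-bit |
|---|---|---|---|---|---|---|---|---|
| Boolean to Arithmetic | 3.34 | 6.69 | 13.31 | 26.75 | 0.26 | 0.51 | 1.02 | 2.05 |
| Arithmetic to Boolean (via secret-shares) | 10.89 | 743.2 | - | - | 4.22 | 1048.83 | - | - |
| Arithmetic to Boolean (via generic secure modulo-arithmetic) | 9628 | 9628 | 9628 | 9628 | 2004.96 | 2004.96 | 2004.96 | 2004.96 |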

Timings in the first two rows are averaged over $10^6$ runs while those in the third row are over $10^3$ runs.
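The per-bit figures quoted for Boolean-to-arithmetic conversion, and the range in which the secret-sharing-based conversion beats the generic one, can be recomputed from Table 4. The snippet below uses only the table's numbers; treating 1 KB as 1000 bytes is an assumption, and the variable names are illustrative.

```python
# Table 4, Boolean to Arithmetic: time in microseconds and bandwidth in KB per label.
bool_to_arith_us = {8: 3.34, 16: 6.69, 32: 13.31, 64: 26.75}
bool_to_arith_kb = {8: 0.26, 16: 0.51, 32: 1.02, 64: 2.05}

# Table 4, Arithmetic to Boolean: secret-share method (measured widths) vs. generic method.
share_based_us = {8: 10.89, 16: 743.2}
generic_us = 9628

# Cost per bit of Boolean wire-label (1 KB taken as 1000 bytes, an assumption).
ns_per_bit = {w: t * 1000 / w for w, t in bool_to_arith_us.items()}
bytes_per_bit = {w: kb * 1000 / w for w, kb in bool_to_arith_kb.items()}

# Widths at which the secret-share method is faster than the generic one.
share_wins = {w: t < generic_us for w, t in share_based_us.items()}
```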

Footnotes

1.

The protocol of [8] does not actually compute edit distance; it aims at computing the closest matches under the edit-distance metric (a task that doesn’t necessarily require computing edit distances).

2.

For every specific computation, this assumption can be guaranteed to hold by setting p to be a sufficiently large prime so that no intermediate values in the computation could overflow. For example, fixing p to the largest 88-bit prime suffices for edit-distance-based human genome comparisons. We also note that, without incurring significant overhead, it is possible to use a 128-bit prime p with the extension technique discussed in Section 4.

Contributor Information

Ruiyu Zhu, IU Bloomington, with main research focus in applied cryptography, Indiana University, Bloomington.

Yan Huang, Computer Science at Indiana University Bloomington.

References

  • [1].Needleman S. and Wunsch C, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of molecular biology, vol. 48, no. 3, 1970.
  • [2].Cancer Genome Atlas Network, “Comprehensive molecular portraits of human breast tumours,” Nature, vol. 490, no. 7418, 2012.
  • [3].Waddell N, Pajic M, Patch A-M, Chang DK, Kassahn KS, Bailey P, Johns AL, Miller D, Nones K, Quek K. et al. , “Whole genomes redefine the mutational landscape of pancreatic cancer,” Nature, vol. 518, no. 7540, 2015.
  • [4].Evans WE and Relling MV, “Moving towards individualized medicine with pharmacogenomics,” Nature, vol. 429, 2004.
  • [5].Forrest S, Hofmeyr SA, and Somayaji A, “Computer immunology,” Communications of the ACM, vol. 40, no. 10, pp. 88–96, 1997.
  • [6].Warrender C, Forrest S, and Pearlmutter B, “Detecting intrusions using system calls: Alternative data models,” in IEEE S&P, 1999.
  • [7].Gao D, Reiter MK, and Song D, “Behavioral distance for intrusion detection,” in Workshop on Recent Advances in Intrusion Detection, 2006.
  • [8].Asharov G, Halevi S, Lindell Y, Rabin T, “Privacy-preserving search of similar patients in genomic data.” in PETS, 2018.
  • [9].Wang X, Huang Y, Zhao Y, Tang H, Wang X, and Bu D, “Efficient genome-wide, privacy-preserving similar patient query based on private edit distance,” in ACM CCS, 2015.
  • [10].Zahur S, Rosulek M, and Evans D, “Two halves make a whole: reducing data transfer in garbled circuits using half gates,” in EUROCRYPT, 2015.
  • [11].Demmler D, Schneider T, Zohner M, “ABY: A framework for efficient mixed-protocol two-party computation,” in NDSS, 2015.
  • [12].Huang Y, Evans D, Katz J, and Malka L, “Faster secure two-party computation using garbled circuits,” in USENIX Security, 2011.
  • [13].Malozemoff A. and Wang X, “EMP-Toolkit,” https://github.com/emp-toolkit, 2016.
  • [14].Ball M, Malkin T, and Rosulek M, “Garbling gadgets for boolean and arithmetic circuits,” in ACM CCS, 2016.
  • [15].Wang X, Ranellucci S, and Katz J, “Global-scale secure multiparty computation,” in ACM CCS, 2017.
  • [16].Jha S, Kruger L, and Shmatikov V, “Towards practical privacy for genomic computation,” in IEEE S&P, 2008.
  • [17].Bellare M, Hoang VT, Keelveedhi S, and Rogaway P, “Efficient garbling from a fixed-key blockcipher,” in IEEE S&P, 2013.
  • [18].Bellare M, Hoang VT, and Rogaway P, “Foundations of garbled circuits,” in ACM CCS, 2012.
  • [19].Kolesnikov V. and Schneider T, “Improved garbled circuit: Free XOR gates and applications,” in ICALP, 2008.
  • [20].Mohassel P. and Rindal P, “ABY3: a mixed protocol framework for machine learning,” in ACM CCS. 2018, pp. 35–52.
  • [21].Dessouky G, Koushanfar F, Sadeghi A-R, Schneider T, Zeitouni S, and Zohner M, “Pushing the communication barrier in secure computation using lookup tables.” in NDSS, 2017.
  • [22].Yao AC-C, “How to generate and exchange secrets (extended abstract),” in FOCS, 1986.
  • [23].Pinkas B, Schneider T, Smart NP, and Williams SC, “Secure two-party computation is practical,” in ASIACRYPT, 2009.
  • [24].Gueron S, Lindell Y, Nof A, and Pinkas B, “Fast garbling of circuits under standard assumptions,” in ACM CCS, 2015.
  • [25].Tamura K, Peterson D, Peterson N, Stecher G, Nei M, and Kumar S, “Mega5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods,” Molecular biology and evolution, vol. 28, 2011.
  • [26].Kumar S, Tamura K, and Nei M, “Mega: molecular evolutionary genetics analysis software for microcomputers,” Computer applications in the biosciences: CABIOS, vol. 10, no. 2, pp. 189–191, 1994.
  • [27].Kimura M, “A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences,” Journal of molecular evolution, vol. 16, no. 2, 1980.
  • [28].Tajima F. and Nei M, “Estimation of evolutionary distance between nucleotide sequences.” Molecular biology and evolution, 1984.
  • [29].Amir A, Gotthilf Z, and Shalom BR, “Weighted LCS,” Journal of Discrete Algorithms, vol. 8, no. 3, 2010.
  • [30].Zhu R, Huang Y, Katz J, and Shelat A, “The cut-and-choose game and its application to cryptographic protocols,” in USENIX Security, 2016.
  • [31].Kolesnikov V. and Kumaresan R, “Improved OT extension for transferring short secrets,” in CRYPTO, 2013.
  • [32].Ishai Y, Kilian J, Nissim K, and Petrank E, “Extending oblivious transfers efficiently,” in CRYPTO, 2003.
  • [33].Zhu R. and Huang Y, “Jimu: Faster lego-based secure computation using additive homomorphic hashes,” in ASIACRYPT, 2017.
  • [34].Canetti R, “Security and composition of multiparty cryptographic protocols,” Journal of Cryptology, vol. 13, no. 1, pp. 143–202, 2000.
  • [35].Keller M, Orsini E, and Scholl P, “Actively secure OT extension with optimal overhead,” in CRYPTO, 2015.
  • [36].Wang X, Ranellucci S, and Katz J, “Authenticated garbling and efficient maliciously secure two-party computation,” in CCS, 2017.
