Author manuscript; available in PMC: 2024 Nov 20.
Published in final edited form as: Proc ACM Manag Data. 2024 May 14;2(2):85. doi: 10.1145/3651148

Tight Lower Bounds for Directed Cut Sparsification and Distributed Min-Cut

Yu Cheng, Max Li, Honghao Lin, Zi-Yi Tai, David P Woodruff, Jason Zhang
PMCID: PMC11576265  NIHMSID: NIHMS2027450  PMID: 39569053

Abstract

In this paper, we consider two fundamental cut approximation problems on large graphs. We prove new lower bounds for both problems that are optimal up to logarithmic factors.

The first problem is to approximate cuts in balanced directed graphs. In this problem, the goal is to build a data structure that $(1\pm\varepsilon)$-approximates cut values in graphs with $n$ vertices. For arbitrary directed graphs, such a data structure requires $\Omega(n^2)$ bits even for constant $\varepsilon$. To circumvent this, recent works study $\beta$-balanced graphs, meaning that for every directed cut, the total weight of edges in one direction is at most $\beta$ times that in the other direction. We consider two models: the for-each model, where the goal is to approximate each cut with constant probability, and the for-all model, where all cuts must be preserved simultaneously. We improve the previous $\Omega(n\sqrt{\beta/\varepsilon})$ lower bound to $\tilde{\Omega}(n\sqrt{\beta}/\varepsilon)$ in the for-each model, and we improve the previous $\Omega(n\beta/\varepsilon)$ lower bound to $\Omega(n\beta/\varepsilon^2)$ in the for-all model.¹ This resolves the main open questions of (Cen et al., ICALP, 2021).

The second problem is to approximate the global minimum cut in a local query model, where we can only access the graph via degree, edge, and adjacency queries. We improve the previous $\Omega(m/k)$ query complexity lower bound to $\Omega(\min\{m, \frac{m}{\varepsilon^2 k}\})$ for this problem, where $m$ is the number of edges, $k$ is the size of the minimum cut, and we seek a $(1+\varepsilon)$-approximation. In addition, we show that existing upper bounds with slight modifications match our lower bound up to logarithmic factors.

1. Introduction

The notion of cut sparsifiers has been extremely influential. It was introduced by Benczúr and Karger [BK96] and is defined as follows: given a graph $G=(V,E,w)$ with $n=|V|$ vertices, $m=|E|$ edges, edge weights $w_e\ge 0$, and a desired error parameter $\varepsilon>0$, a $(1\pm\varepsilon)$ cut sparsifier of $G$ is a subgraph $H$ on the same vertex set $V$ with (possibly) different edge weights, such that $H$ approximates the value of every cut in $G$ within a factor of $(1\pm\varepsilon)$. Benczúr and Karger [BK96] showed that every undirected graph has a $(1\pm\varepsilon)$ cut sparsifier with only $O(n\log n/\varepsilon^2)$ edges. This was later extended to the stronger notion of spectral sparsifiers [ST11] and the number of edges was improved to $O(n/\varepsilon^2)$ [BSS12]; see also related work with different bounds for both cut and spectral sparsifiers [FHHP19, KP12, ST04, SS11, LS17, CKST19].

In the database community, a key result is the work of [AGM12], which shows how to construct a sparsifier using $\tilde{O}(n/\varepsilon^2)$ linear measurements to $(1+\varepsilon)$-approximate all cut values. Sketching massive graphs arises in various applications where there are entities and relationships between them, such as webpages and hyperlinks, people and friendships, and IP addresses and data flows. As large graph databases are often distributed or stored on external memory, sketching algorithms are useful for reducing communication and memory usage in distributed and streaming models. We refer the readers to [McG14] for a survey of graph stream algorithms in the database community.

For very small values of $\varepsilon$, the $1/\varepsilon^2$ dependence in known cut sparsifiers may be prohibitive. Motivated by this, the work of [ACK+16] relaxed the cut sparsification problem to outputting a data structure $D$, such that for any fixed cut $S\subseteq V$, the value $D(S)$ is within a $(1\pm\varepsilon)$ factor of the cut value of $S$ in $G$ with probability at least 2/3. Notice the order of quantifiers: the data structure only needs to preserve the value of any fixed cut (chosen independently of its randomness) with high constant probability. This is referred to as the for-each model, and the data structure is called a for-each cut sketch. Surprisingly, [ACK+16] showed that every undirected graph has a $(1\pm\varepsilon)$ for-each cut sketch of size $\tilde{O}(n/\varepsilon)$ bits, reducing the dependence on $\varepsilon$ to linear. They also showed an $\Omega(n/\varepsilon)$ bits lower bound in the for-each model. The improved dependence on $\varepsilon$ indeed comes from relaxing the original sparsification problem to the for-each model: [ACK+16] proved an $\Omega(n/\varepsilon^2)$ bit lower bound on any data structure that preserves all cuts simultaneously, which is referred to as the for-all model. This lower bound in the for-all model was strengthened to $\Omega(n\log n/\varepsilon^2)$ bits in [CKST19].

While the above results provide a fairly complete picture for undirected graphs, a natural question is whether similar improvements are possible for directed graphs. This is the main question posed by [CCPS21]. For directed graphs, even in the for-each model, there is an $\Omega(n^2)$ lower bound without any assumptions on the graph. Motivated by this, [EMPS16, IT18, CCPS21] introduced the notion of $\beta$-balanced directed graphs, meaning that for every directed cut $(S,V\setminus S)$, the total weight of edges from $S$ to $V\setminus S$ is at most $\beta$ times that from $V\setminus S$ to $S$. The notion of $\beta$-balanced graphs turned out to be very useful for directed graphs, as [IT18, CCPS21] showed an $\tilde{O}(n\sqrt{\beta}/\varepsilon)$ upper bound in the for-each model, and an $\tilde{O}(n\beta/\varepsilon^2)$ upper bound in the for-all model, thus giving non-trivial bounds for both problems for small values of $\beta$. The work of [CCPS21] also proved lower bounds: they showed an $\Omega(n\sqrt{\beta/\varepsilon})$ lower bound in the for-each model, and an $\Omega(n\beta/\varepsilon)$ lower bound in the for-all model. While their lower bounds are tight for constant $\varepsilon$, there is a quadratic gap for both models in terms of the dependence on $\varepsilon$. The main open question of [CCPS21] is to determine the optimal dependence on $\varepsilon$, which we resolve in this work.

Recent work further explored spectral sketches, faster computation of sketches, and sparsification of Eulerian graphs (β-balanced graphs with β=1) [ACK+16, JS18, CKK+18, CGP+23, SW19]. In this paper, we focus on the space complexity of cut sketches for general values of β.

As observed in [ACK+16], one of the main ways to use for-each cut sketches is to solve the distributed minimum cut problem. This is the problem of computing a $(1+\varepsilon)$-approximate global minimum cut of a graph whose edges are distributed across multiple servers. One can ask each server to compute a $(1\pm 0.2)$ for-all cut sketch and a $(1\pm\varepsilon)$ for-each cut sketch. This allows one to find all $O(1)$-approximate minimum cuts, and because there are at most $n^{O(C)}$ cuts with value within a factor of $C$ of the minimum cut, one can query all these $\mathrm{poly}(n)$ cuts using the more accurate for-each cut sketches, resulting in an optimal linear in $1/\varepsilon$ dependence in the communication.

Motivated by this connection to distributed minimum cut estimation, we also consider the problem of directly approximating the minimum cut in a local query model, which was introduced in [RSW18] and studied for minimum cut in [ER18, BGMP21]. The model is defined as follows.

Let G(V,E) be an unweighted and undirected graph, where the vertex set V is known but the edge set E is unknown. In the local query model, we have access to an oracle that can answer the following three types of local queries:

  1. Degree query: Given $u\in V$, the oracle returns the degree of $u$.

  2. Edge query: Given $u\in V$ and index $i$, the oracle returns the $i$-th neighbor of $u$, or $\bot$ if the edge does not exist.

  3. Adjacency query: Given $u,v\in V$, the oracle returns whether $(u,v)\in E$.
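The three query types can be mimicked by a small toy oracle. The sketch below is only an illustration: the class name `LocalQueryOracle`, its method names, the 0-indexed neighbor positions, and the use of `None` for a missing edge are all assumptions made here, not part of the model's definition.

```python
class LocalQueryOracle:
    """Toy oracle for an unweighted, undirected graph (hypothetical helper)."""

    def __init__(self, n, edges):
        # Adjacency lists in insertion order; this order defines the "i-th neighbor".
        self.adj = [[] for _ in range(n)]
        for u, v in edges:
            self.adj[u].append(v)
            self.adj[v].append(u)
        self.edge_set = {frozenset(e) for e in edges}
        self.queries = 0  # complexity is measured by the number of queries

    def degree(self, u):
        """Degree query."""
        self.queries += 1
        return len(self.adj[u])

    def neighbor(self, u, i):
        """Edge query (0-indexed here); None if u has no i-th neighbor."""
        self.queries += 1
        return self.adj[u][i] if 0 <= i < len(self.adj[u]) else None

    def adjacent(self, u, v):
        """Adjacency query."""
        self.queries += 1
        return frozenset((u, v)) in self.edge_set


# A 4-cycle 0-1-2-3-0; its global minimum cut has size 2.
oracle = LocalQueryOracle(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

An algorithm in this model only sees the graph through these three methods, and the counter `queries` is the quantity that the lower bound constrains.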

In the Min-Cut problem, our goal is to estimate the global minimum cut up to a (1±ε)-factor using these local queries. The complexity of the problem is measured by the number of queries, and we want to use as few queries as possible. For this problem we focus on undirected graphs.

Previous work [ER18] showed an $\Omega(m/k)$ query complexity lower bound, where $k$ is the size of the minimum cut. The main open question is what the dependence on $\varepsilon$ should be. There is also an $\tilde{O}\left(\frac{m}{k\,\mathrm{poly}(\varepsilon)}\right)$ upper bound in [BGMP21], and a natural question is to close this gap.

1.1. Our Results

We resolve the main open questions mentioned above.

Cut Sketch for Balanced (Directed) Graphs.

We study the space complexity of $(1\pm\varepsilon)$ cut sketches for $n$-node $\beta$-balanced (directed) graphs. Previous work [IT18, CCPS21] gave an $\tilde{O}(n\beta/\varepsilon^2)$ upper bound in the for-all model and an $\tilde{O}(n\sqrt{\beta}/\varepsilon)$ upper bound in the for-each model, along with an $\Omega(n\beta/\varepsilon)$ lower bound and an $\Omega(n\sqrt{\beta/\varepsilon})$ lower bound, respectively.

We close these gaps and resolve the dependence on ε, improving the lower bounds to match the upper bounds for all parameters n, β, and ε (up to logarithmic factors). Formally, we have:

Theorem 1.1 (For-Each Cut Sketch for Balanced Graphs).

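Let $\beta\ge 1$ and $0<\varepsilon<1$. Assume $\sqrt{\beta}/\varepsilon\le n/2$. Any $(1\pm\varepsilon)$ for-each cut sketching algorithm for $\beta$-balanced $n$-node graphs must output $\tilde{\Omega}(n\sqrt{\beta}/\varepsilon)$ bits.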

Theorem 1.2 (For-All Cut Sketch for Balanced Graphs).

Let $\beta\ge 1$ and $0<\varepsilon<1$. Assume $\beta/\varepsilon^2\le n/2$. Any $(1\pm\varepsilon)$ for-all cut sketching algorithm for $\beta$-balanced $n$-node graphs must output $\Omega(n\beta/\varepsilon^2)$ bits.

Query Complexity of Min-Cut in the Local Query Model.

We study the problem of (1±ε)-approximating the (undirected) global minimum cut in a local query model, where we can only access the graph via degree, edge, and adjacency queries.

We close the gap on the $\varepsilon$ dependence in the query complexity of this problem by proving a tight $\Omega(\min\{m,\frac{m}{\varepsilon^2 k}\})$ lower bound, where $m$ is the number of edges and $k$ is the size of the minimum cut. This improves the previous $\Omega(m/k)$ lower bound in [ER18]. Formally, we have:

Theorem 1.3 (Approximating Min-Cut using Local Queries).

Any algorithm that estimates the size of the global minimum cut of a graph $G$ up to a $(1\pm\varepsilon)$ factor requires $\Omega(\min\{m,\frac{m}{\varepsilon^2 k}\})$ queries in expectation in the local query model, where $m$ is the number of edges in $G$ and $k$ is the size of the minimum cut.

We also show that with a slight modification, the $\tilde{O}\left(\frac{m}{k\,\mathrm{poly}(\varepsilon)}\right)$ query complexity upper bound in [BGMP21] can be improved to $\tilde{O}\left(\frac{m}{\varepsilon^2 k}\right)$, which implies that our lower bound is tight (up to logarithmic factors).

1.2. Our Techniques

A common technique we use for the different problems is communication complexity games that involve the approximation parameter $\varepsilon$. For example, suppose Alice has a bit string $s$ of length $1/\varepsilon^2$, and she can encode $s$ into a graph $G$ such that, if she sends Bob a $(1\pm\varepsilon)$ (for-each or for-all) cut sketch, then Bob can recover a specific bit of $s$ with high constant probability. By communication complexity lower bounds, we know Alice must send $\Omega(1/\varepsilon^2)$ bits to Bob, which gives a lower bound on the size of the cut sketch.

For-Each Cut Sketch Lower Bound.

Let $k=\sqrt{\beta}/\varepsilon$. At a high level, we partition the $n$ nodes into $n/(2k)$ sub-graphs, where each sub-graph is a $k$-by-$k$ bipartite graph with two parts $L$ and $R$. We then divide $L$ and $R$ into $\sqrt{\beta}$ disjoint clusters each, with $|L_1|=|L_2|=\cdots=|L_{\sqrt{\beta}}|=1/\varepsilon$ and $|R_1|=|R_2|=\cdots=|R_{\sqrt{\beta}}|=1/\varepsilon$. For every cluster pair $L_i$ and $R_j$, there are a total of $1/\varepsilon^2$ edges. Intuitively, we wish to encode a bit string $s\in\{-1,1\}^{1/\varepsilon^2}$ into forward edges (left to right) each with weight $\Theta(1)$, and add backward edges (right to left) each with weight $1/\beta$ so that the graph is $\beta$-balanced. If we could approximately decode this string from a for-each cut sketch, then we would get an $\Omega((n/k)\cdot(\sqrt{\beta})^2\cdot(1/\varepsilon)^2)=\Omega(n\sqrt{\beta}/\varepsilon)$ lower bound.

However, if we use a simple encoding method [ACK+16, CCPS21] where each bit $s_i$ is encoded into one edge $(u,v)$ (e.g., with weight 1 or 2) and query the edges leaving $S=\{u\}\cup(R\setminus\{v\})$, then the $(k-1)^2=\Omega(\beta/\varepsilon^2)$ backward edges with weight $1/\beta$ will cause the cut value to be $\Omega(1/\varepsilon^2)$. The $(1\pm\varepsilon)$ cut sketch will have additive error $\Omega(1/\varepsilon)\gg\Theta(1)$, which will obscure $s_i\in\{-1,1\}$. To address this, we instead encode $1/\varepsilon^2$ bits of information across $1/\varepsilon^2$ edges simultaneously. When we want to decode a specific bit $s_i$, we query the (directed) cut values between two carefully designed subsets $A\subseteq L_i$ and $B\subseteq R_j$. The key idea of our construction is that, although each edge in $A\times B$ is used to encode many bits of $s$, the encoding of different bits of $s$ is never too correlated: while encoding other bits does affect the total weight from $A$ to $B$, this effect is similar to adding noise which only varies the total weight from $A$ to $B$ by a small amount.

For-All Cut Sketch Lower Bound.

Let $k=\beta/\varepsilon^2$. At a high level, we partition the $n$ nodes into $n/(2k)$ sub-graphs, where each sub-graph is a $k$-by-$k$ bipartite graph with two parts $L$ and $R$. Let $L=\{1,\ldots,k\}$. We partition $R$ into $\beta$ disjoint clusters with $|R_1|=\cdots=|R_\beta|=1/\varepsilon^2$. We use edges from $i$ to $R_j$ to encode a bit string $s\in\{0,1\}^{1/\varepsilon^2}$ by setting the weight of each forward edge to 1 or 2, and adding a backward edge of weight $1/\beta$ to balance the graph.

We can show that the following problem requires $\Omega(1/\varepsilon^2)$ bits of communication: Consider $i\in L$ and a random subset $T\subseteq R_j$ where $|T|=|R_j|/2$. Let $N_i$ denote $i$'s neighbors $v$ such that $(i,v)$ has weight 2, which is uniformly random if $s$ is uniformly random. The problem is to decide whether $|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}$ or $|N_i\cap T|\le\frac{1}{4\varepsilon^2}-\frac{c}{2\varepsilon}$ for a sufficiently small constant $c>0$. Intuitively, the graph encodes a $(k\beta)$-fold version of this communication problem, which implies an $\Omega((n/k)\cdot k\beta\cdot(1/\varepsilon)^2)=\Omega(n\beta/\varepsilon^2)$ lower bound.

We need to show that Bob can distinguish between the two cases of $|N_i\cap T|$ given a for-all cut sketch. However, there are some challenges. The difference between the two cases is $\Theta(1/\varepsilon)$ while the natural cut to query $S=\{i\}\cup(R\setminus T)$ has value $\Omega(\beta/\varepsilon^4)$. The $(1\pm\varepsilon)$ cut sketch will have additive error $\Omega(\beta/\varepsilon^3)\gg\Theta(1/\varepsilon)$, which is too much. To overcome this, note that we have not used the property that the for-all cut sketch preserves all cuts. We make use of the following crucial observation in [ACK+16]: In expectation, roughly half of the nodes $i\in L$ satisfy $|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}$ because $c$ is small. If Bob enumerates all subsets $Q\subseteq L$ of size $|L|/2$, he will eventually get lucky and find a set $Q$ that contains almost all such nodes. Since there are roughly $|L|/2=\frac{\beta}{2\varepsilon^2}$ such nodes, the $(c/\varepsilon)$ bias per node will contribute $\Omega(c\beta/\varepsilon^3)$ in total, which is enough to be detected even under an $O(\beta/\varepsilon^3)$ additive error.

Query Complexity of Min-Cut in the Local Query Model.

We prove our lower bound using communication complexity, but unlike previous work [ER18], we consider the following 2SUM problem [WZ14]: Given $2t$ length-$L$ binary strings $x^1,x^2,\ldots,x^t$ and $y^1,y^2,\ldots,y^t$, we want to approximate the value of $\sum_{i\in[t]}\mathrm{DISJ}(x^i,y^i)$ up to a $\sqrt{t}$ additive error, with the promise that at least a constant fraction of the pairs $(x^i,y^i)$ satisfy $\mathrm{INT}(x^i,y^i)=\alpha$ while the remaining pairs satisfy $\mathrm{INT}(x^i,y^i)=0$ or $\alpha$. Here $\mathrm{INT}(x,y)=\sum_{i=1}^{L}x_i y_i$ is the number of indices where $x$ and $y$ are both 1, and $\mathrm{DISJ}(x,y)$ is the set-disjointness problem, i.e., $\mathrm{DISJ}(x,y)=1$ if $\mathrm{INT}(x,y)=0$ and $\mathrm{DISJ}(x,y)=0$ otherwise. The parameters $L$, $t$, and $\alpha$ will be chosen later.

We construct our graph $G_{x,y}$ based on the vectors $x^i$ and $y^i$ in a way inspired by [ER18]. We then give a careful analysis of the size of the minimum cut of $G_{x,y}$, and show that under certain conditions, the size of the minimum cut is exactly $2\sum_{i\in[t]}\mathrm{INT}(x^i,y^i)$. Consequently, a $(1\pm\varepsilon)$-approximation of the minimum cut yields an approximation of $\sum_{i\in[t]}\mathrm{DISJ}(x^i,y^i)$ up to a $\varepsilon$ additive error, which implies the desired lower bound.
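The two primitives INT and DISJ, and the sum that the 2SUM problem asks to approximate, can be written out directly. This is a minimal sketch for illustration only; `sum_disj` and the example strings are hypothetical and chosen here so that every pair has intersection size 0 or $\alpha=2$.

```python
def INT(x, y):
    """Number of indices where the 0/1 strings x and y are both 1."""
    return sum(a & b for a, b in zip(x, y))


def DISJ(x, y):
    """Set disjointness: 1 if x and y share no common 1, else 0."""
    return 1 if INT(x, y) == 0 else 0


def sum_disj(xs, ys):
    """The quantity that 2SUM approximates: the sum of DISJ over all pairs."""
    return sum(DISJ(x, y) for x, y in zip(xs, ys))


# t = 3 pairs of length-4 strings; each pair intersects in 0 or alpha = 2 places.
xs = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
ys = [[1, 0, 1, 1], [1, 0, 1, 0], [0, 0, 1, 1]]
total = sum_disj(xs, ys)
```

Here the first pair intersects in two positions and the other two pairs are disjoint, so `total` counts the two disjoint pairs.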

2. Preliminaries

Let $G=(V,E,w)$ be a weighted (directed) graph with $n$ vertices and $m$ edges, where each edge $e\in E$ has weight $w_e\ge 0$. We write $G=(V,E)$ if $G$ is unweighted and leave out $w$. For two sets of nodes $S,T\subseteq V$, let $E(S,T)=\{(u,v)\in E: u\in S, v\in T\}$ denote the set of edges from $S$ to $T$. Let $w(S,T)=\sum_{e\in E(S,T)}w_e$ denote the total weight of edges from $S$ to $T$. For a node $u\in V$ and a set of nodes $S\subseteq V$, we write $w(u,S)$ for $w(\{u\},S)$.

We write $[n]$ for $\{1,\ldots,n\}$. We use $\mathbf{1}$ to denote the all-ones vector. For a vector $v$, we write $\|v\|_2$ and $\|v\|_\infty$ for the $\ell_2$ and $\ell_\infty$ norms of $v$ respectively. For two vectors $u,v\in\mathbb{R}^n$, let $u\otimes v\in\mathbb{R}^{n^2}$ be the tensor product of $u$ and $v$. Given a matrix $A$, we use $A_i$ to denote the $i$-th row of $A$.

Directed Cut Sketches.

We start with the definitions of β-balanced graphs, for-all and for-each cut sketches [BK96, ST11, ACK+16, CCPS21].

We say a directed graph is balanced if all cuts have similar values in both directions.

Definition 2.1 (β-Balanced Graphs).

A strongly connected directed graph $G=(V,E,w)$ is $\beta$-balanced if, for all $\emptyset\ne S\subsetneq V$, it holds that $w(S,V\setminus S)\le\beta\cdot w(V\setminus S,S)$.
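For very small graphs, Definition 2.1 can be checked by brute force over all cuts. The sketch below is an illustration under assumptions made here: the graph is given as a dictionary from directed edges `(u, v)` to weights on nodes `0..n-1`, and the helper names `cut_weight` and `balance` are hypothetical. It enumerates all $2^n-2$ cuts, so it is only meant for toy instances.

```python
from itertools import combinations


def cut_weight(w, S):
    """w(S, V minus S): total weight of edges leaving the node set S."""
    S = set(S)
    return sum(wt for (u, v), wt in w.items() if u in S and v not in S)


def balance(w, n):
    """Smallest beta for which the graph is beta-balanced (inf if some cut
    has zero weight in the reverse direction)."""
    nodes = set(range(n))
    worst = 0.0
    for size in range(1, n):
        for S in combinations(range(n), size):
            out_w = cut_weight(w, S)
            in_w = cut_weight(w, nodes - set(S))
            if in_w == 0:
                return float("inf")
            worst = max(worst, out_w / in_w)
    return worst


# One forward edge of weight 2 and one backward edge of weight 1/2.
two_node = {(0, 1): 2.0, (1, 0): 0.5}
beta_two_node = balance(two_node, 2)
```

In the two-node example the cut $S=\{0\}$ has weight 2 forward and 1/2 backward, so the graph is 4-balanced and no better.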

We say sk(G) is a for-all cut sketch if the value of all cuts can be approximately recovered from it. Note that sk(G) is not necessarily a graph and can be an arbitrary data structure.

Definition 2.2 (For-All Cut Sketch).

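Let $0<\varepsilon<1$. We say $\mathcal{A}$ is a $(1\pm\varepsilon)$ for-all cut sketching algorithm if there exists a recovering algorithm $f$ such that, given a directed graph $G=(V,E,w)$ as input, $\mathcal{A}$ can output a sketch $\mathrm{sk}(G)$ such that, with probability at least 2/3, for all $S\subseteq V$:

$$(1-\varepsilon)\cdot w(S,V\setminus S)\le f(S,\mathrm{sk}(G))\le(1+\varepsilon)\cdot w(S,V\setminus S).$$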

Another notion of cut approximation is that of a “for-each” cut sketch, which requires that the value of each individual cut is preserved with high constant probability, rather than approximating the values of all cuts simultaneously.

Definition 2.3 (For-Each Cut Sketch).

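Let $0<\varepsilon<1$. We say $\mathcal{A}$ is a $(1\pm\varepsilon)$ for-each cut sketching algorithm if there exists a recovering algorithm $f$ such that, given a directed graph $G=(V,E,w)$ as input, $\mathcal{A}$ can output a sketch $\mathrm{sk}(G)$ such that, for each $S\subseteq V$, with probability at least 2/3,

$$(1-\varepsilon)\cdot w(S,V\setminus S)\le f(S,\mathrm{sk}(G))\le(1+\varepsilon)\cdot w(S,V\setminus S).$$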

In Definitions 2.2 and 2.3, the sketching algorithm 𝒜 and the recovering algorithm f can be randomized, and the probability is over the randomness in 𝒜 and f.

3. For-Each Cut Sketch

In this section, we prove an $\tilde{\Omega}(n\sqrt{\beta}/\varepsilon)$ lower bound on the output size of $(1\pm\varepsilon)$ for-each cut sketching algorithms (Definition 2.3).

Theorem 1.1 (For-Each Cut Sketch for Balanced Graphs).

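Let $\beta\ge 1$ and $0<\varepsilon<1$. Assume $\sqrt{\beta}/\varepsilon\le n/2$. Any $(1\pm\varepsilon)$ for-each cut sketching algorithm for $\beta$-balanced $n$-node graphs must output $\tilde{\Omega}(n\sqrt{\beta}/\varepsilon)$ bits.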

Our result uses the following communication complexity lower bound for a variant of the Index problem, where Alice and Bob’s inputs are random.

Lemma 3.1 ([KNR01]).

Suppose Alice has a uniformly random string $s\in\{-1,1\}^n$ and Bob has a uniformly random index $i\in[n]$. If Alice sends a single (possibly randomized) message to Bob, and Bob can recover $s_i$ with probability at least 2/3 (over the randomness in the input and their protocol), then Alice must send $\Omega(n)$ bits to Bob.

Our lower-bound construction relies on the following technical lemma.

Lemma 3.2.

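For any integer $k\ge 1$, there exists a matrix $M\in\{-1,1\}^{(2^k-1)^2\times 2^{2k}}$ such that:

  1. $\langle M_t,\mathbf{1}\rangle=0$ for all $t\in[(2^k-1)^2]$.

  2. $\langle M_t,M_{t'}\rangle=0$ for all $1\le t<t'\le(2^k-1)^2$.

  3. For all $t\in[(2^k-1)^2]$, the $t$-th row of $M$ can be written as $M_t=u\otimes v$ where $u,v\in\{-1,1\}^{2^k}$ and $\langle u,\mathbf{1}\rangle=\langle v,\mathbf{1}\rangle=0$.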

Proof.

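Our construction is based on the Hadamard matrix $H=H_{2^k}\in\{-1,1\}^{2^k\times 2^k}$. Recall that the first row of $H$ is the all-ones vector and that $\langle H_i,H_j\rangle=0$ for all $i\ne j$. For every $2\le i,j\le 2^k$, we add $H_i\otimes H_j\in\{-1,1\}^{2^{2k}}$ as a row of $M$, so $M$ has $(2^k-1)^2$ rows.

Condition (3) holds because $\langle H_i,\mathbf{1}\rangle=\langle H_j,\mathbf{1}\rangle=0$ for all $i,j\ge 2$. For Conditions (1) and (2), note that for any vectors $u$, $v$, $w$, and $z$, we have $\langle u\otimes v,w\otimes z\rangle=\langle u,w\rangle\langle v,z\rangle$. Using this fact, Condition (1) holds because $\langle M_t,\mathbf{1}\rangle=\langle H_i\otimes H_j,\mathbf{1}\otimes\mathbf{1}\rangle=\langle H_i,\mathbf{1}\rangle\langle H_j,\mathbf{1}\rangle=0$, and Condition (2) holds because $(i,j)\ne(i',j')$ and thus $\langle M_t,M_{t'}\rangle=\langle H_i\otimes H_j,H_{i'}\otimes H_{j'}\rangle=\langle H_i,H_{i'}\rangle\langle H_j,H_{j'}\rangle=0$. □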
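The construction in this proof is small enough to check numerically. The sketch below assumes the Sylvester (recursive doubling) form of the Hadamard matrix, which is one standard choice whose first row is all ones; the helper names `hadamard`, `tensor`, `dot`, and `build_M` are hypothetical, and rows are 0-indexed so the all-ones row is row 0.

```python
from itertools import product


def hadamard(k):
    """Sylvester construction of the 2^k x 2^k Hadamard matrix (+1/-1 entries)."""
    H = [[1]]
    for _ in range(k):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def tensor(u, v):
    """Tensor product u (x) v, indexed first by u's coordinate, then by v's."""
    return [a * b for a in u for b in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_M(k):
    """Rows H_i (x) H_j over all pairs of non-first rows of H."""
    H = hadamard(k)
    rows = range(1, len(H))  # skip the all-ones first row
    return [tensor(H[i], H[j]) for i, j in product(rows, rows)]


M = build_M(2)  # (2^2 - 1)^2 = 9 rows, each of length 2^4 = 16
```

For $k=2$ this gives 9 rows of length 16; each row sums to zero (Condition 1) and distinct rows are orthogonal (Condition 2).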

We first prove a lower bound for the special case $n=\Theta(\sqrt{\beta}/\varepsilon)$. Our proof for this special case introduces important building blocks for proving the general case $n=\Omega(\sqrt{\beta}/\varepsilon)$.

Lemma 3.3.

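Suppose $n=\Theta(\sqrt{\beta}/\varepsilon)$. Any $(1\pm\varepsilon)$ for-each cut sketching algorithm for $\beta$-balanced $n$-node graphs must output $\tilde{\Omega}(n\sqrt{\beta}/\varepsilon)=\tilde{\Omega}(\beta/\varepsilon^2)$ bits.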

At a high level, we reduce the Index problem (Lemma 3.1) to the for-each cut sketching problem. Given Alice’s string s, we construct a graph G to encode s, such that Bob can recover any single bit in s by querying O(1) cut values of G. Our lower bound (Lemma 3.3) then follows from the communication complexity lower bound of the Index problem (Lemma 3.1), because Alice can run a for-each cut sketching algorithm and send the cut sketch to Bob, and Bob can successfully recover the O(1) cut values with high constant probability.

Proof of Lemma 3.3.

We reduce from the Index problem. Let $s\in\{-1,1\}^{\beta(\frac{1}{\varepsilon}-1)^2}$ denote Alice's random string.

Construction of G.

We construct a directed complete bipartite graph $G$ to encode $s$. Let $L$ and $R$ denote the left and right nodes of $G$, where $|L|=|R|=\sqrt{\beta}/\varepsilon$. We partition $L$ into $\sqrt{\beta}$ disjoint blocks $L_1,\ldots,L_{\sqrt{\beta}}$ of equal size, and similarly partition $R$ into $R_1,\ldots,R_{\sqrt{\beta}}$. We divide $s$ into $\beta$ disjoint strings $s_{i,j}\in\{-1,1\}^{(\frac{1}{\varepsilon}-1)^2}$ of the same length. We will encode $s_{i,j}$ using the edges from $L_i$ to $R_j$. Note that the encoding of each $s_{i,j}$ is independent since $E(L_i,R_j)\cap E(L_{i'},R_{j'})=\emptyset$ for $(i,j)\ne(i',j')$.

We fix $i$ and $j$ and focus on the encoding of $s_{i,j}$. Note that $|L_i|=|R_j|=1/\varepsilon$. We refer to the edges from $L_i$ to $R_j$ as forward edges and the edges from $R_j$ to $L_i$ as backward edges. Let $w\in\mathbb{R}^{1/\varepsilon^2}$ denote the weights of the forward edges, which we will choose soon. Every backward edge has weight $1/\beta$.

Let $z=s_{i,j}\in\{-1,1\}^{(\frac{1}{\varepsilon}-1)^2}$. Assume w.l.o.g. that $1/\varepsilon=2^k$ for some integer $k$. Consider the vector $x=\sum_{t=1}^{(\frac{1}{\varepsilon}-1)^2}z_t M_t\in\mathbb{R}^{1/\varepsilon^2}$ where $M$ is the matrix in Lemma 3.2 with $2^k=1/\varepsilon$. Because $z_t\in\{-1,1\}$ is uniformly random, each coordinate of $x$ is a sum of $O(1/\varepsilon^2)$ i.i.d. random variables of value $\pm 1$. By the Chernoff bound and the union bound, we know that with probability at least 99/100, $\|x\|_\infty\le c_1\ln(1/\varepsilon)/\varepsilon$ for some constant $c_1>0$. If this happens, we set $w=\varepsilon x+2c_1\ln(1/\varepsilon)\mathbf{1}$, so that each entry of $w$ is between $c_1\ln(1/\varepsilon)$ and $3c_1\ln(1/\varepsilon)$. Otherwise, we set $w=2c_1\ln(1/\varepsilon)\mathbf{1}$ to indicate that the encoding failed.

We first verify that $G$ is $O(\beta\log(1/\varepsilon))$-balanced. This is because every edge has a reverse edge with similar weight: For every $u\in L$ and $v\in R$, the edge $(u,v)$ has weight $\Theta(\log(1/\varepsilon))$, while the edge $(v,u)$ has weight $1/\beta$.

We will show that given a $\left(1\pm\frac{c_2\varepsilon}{\ln(1/\varepsilon)}\right)$ cut sketch for some constant $c_2>0$, Bob can recover a specific bit of $z$ using 4 cut queries. By Lemma 3.1, this implies an $\Omega(\beta/\varepsilon^2)=\tilde{\Omega}(\beta'/\varepsilon'^2)$ lower bound for cut sketching algorithms for $\beta'=O(\beta\log(1/\varepsilon))$ and $\varepsilon'=c_2\varepsilon/\ln(1/\varepsilon)$.

Recovering a bit in s from a for-each cut sketch of G.

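Suppose Bob wants to recover a specific bit of $s$, which belongs to the substring $z=s_{i,j}$ and has an index $t$ in $z$. We assume that $z$ is successfully encoded by the subgraph between $L_i$ and $R_j$.

For simplicity, we index the nodes in $L_i$ as $1,\ldots,(1/\varepsilon)$ and similarly for $R_j$. We index the forward edges $(u,v)$ in alphabetical order, first by $u\in L_i$ and then by $v\in R_j$. Under this notation, $\langle w,\mathbf{1}_A\otimes\mathbf{1}_B\rangle$ gives the total weight $w(A,B)$ of forward edges from $A$ to $B$, where $\mathbf{1}_A,\mathbf{1}_B\in\{0,1\}^{1/\varepsilon}$ are the indicator vectors of $A\subseteq L_i$ and $B\subseteq R_j$.

The crucial observation is that, given a cut sketch of $G$, Bob can approximate $\langle w,M_t\rangle$ using 4 cut queries. By Lemma 3.2, $M_t=h_A\otimes h_B$ for some $h_A,h_B\in\{-1,1\}^{1/\varepsilon}$. Let $A\subseteq L_i$ be the set of nodes $u\in L_i$ with $h_A(u)=1$. Let $B\subseteq R_j$ be the set of nodes $v\in R_j$ with $h_B(v)=1$. Let $\bar{A}=L_i\setminus A$ and $\bar{B}=R_j\setminus B$.

$$\langle w,M_t\rangle=\langle w,h_A\otimes h_B\rangle=\langle w,(\mathbf{1}_A-\mathbf{1}_{\bar{A}})\otimes(\mathbf{1}_B-\mathbf{1}_{\bar{B}})\rangle=w(A,B)-w(A,\bar{B})-w(\bar{A},B)+w(\bar{A},\bar{B}).$$

To approximate the value of $w(A,B)$ (and similarly $w(A,\bar{B})$, $w(\bar{A},B)$, $w(\bar{A},\bar{B})$), Bob can query $w(S,V\setminus S)$ for $S=A\cup(R\setminus B)$. Consider the edges from $S$ to $(V\setminus S)$: the forward edges are from $A$ to $B$, each with weight $\Theta(\log(1/\varepsilon))$; and the backward edges are from $(R\setminus B)$ to $(L\setminus A)$, each with weight $1/\beta$. See Figure 1 as an example.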
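The four-term identity can be sanity-checked on a small random weight vector. The sketch below is an illustration only: `m` plays the role of $1/\varepsilon$, the sign vectors `hA` and `hB` are arbitrary balanced examples, and the helper `block_weight` (hypothetical) sums forward weights with edge $(u,v)$ stored at position `u * m + v`, matching the indexing "first by $u$, then by $v$".

```python
import random


def tensor(u, v):
    """Tensor product, indexed first by u's coordinate and then by v's."""
    return [a * b for a in u for b in v]


def block_weight(w, A, B, m):
    """w(A, B): total forward weight from A to B; edge (u, v) sits at u*m + v."""
    return sum(w[u * m + v] for u in A for v in B)


random.seed(0)
m = 4  # stands in for 1/eps
w = [random.uniform(1.0, 3.0) for _ in range(m * m)]
hA = [1, -1, 1, -1]
hB = [1, 1, -1, -1]

A = [u for u in range(m) if hA[u] == 1]
A_bar = [u for u in range(m) if hA[u] == -1]
B = [v for v in range(m) if hB[v] == 1]
B_bar = [v for v in range(m) if hB[v] == -1]

# Left side: <w, hA (x) hB>.  Right side: the signed sum of four block weights.
lhs = sum(wi * mi for wi, mi in zip(w, tensor(hA, hB)))
rhs = (block_weight(w, A, B, m) - block_weight(w, A, B_bar, m)
       - block_weight(w, A_bar, B, m) + block_weight(w, A_bar, B_bar, m))
```

The two sides agree up to floating-point rounding, which is why four block weights are enough to evaluate the inner product.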

Figure 1: For $S=A\cup(R\setminus B)$, the (directed) edges from $S$ to $(V\setminus S)$ consist of the following: the forward edges from $A$ to $B$, each with weight $\Theta(\log(1/\varepsilon))$, and the backward edges from $(R\setminus B)$ to $(L\setminus A)$, each with weight $1/\beta$.

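By Lemma 3.2, $\langle h_A,\mathbf{1}\rangle=\langle h_B,\mathbf{1}\rangle=0$, so $|A|=|B|=\frac{|L_i|}{2}=\frac{|R_j|}{2}=\frac{1}{2\varepsilon}$. The total weight of the forward edges is $\Theta(\log(1/\varepsilon)/\varepsilon^2)$, and the total weight of the backward edges is $\left(\frac{\sqrt{\beta}}{\varepsilon}-\frac{1}{2\varepsilon}\right)^2\cdot\frac{1}{\beta}=\Theta(1/\varepsilon^2)$, so the cut value $w(S,V\setminus S)$ is $\Theta(\log(1/\varepsilon)/\varepsilon^2)$. Given a $\left(1\pm\frac{c_2\varepsilon}{\ln(1/\varepsilon)}\right)$ for-each cut sketch, Bob can obtain a $\left(1\pm\frac{c_2\varepsilon}{\ln(1/\varepsilon)}\right)$ multiplicative approximation of $w(S,V\setminus S)$, which has $O(c_2/\varepsilon)$ additive error. After subtracting the total weight of backward edges, which is fixed, Bob has an estimate of $w(A,B)$ with $O(c_2/\varepsilon)$ additive error. Consequently, Bob can approximate $\langle w,M_t\rangle$ with $O(c_2/\varepsilon)$ additive error using 4 cut queries.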
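The error bookkeeping in this paragraph can be written out in one line. In the sketch below, $C$ is a placeholder introduced here for the constant hidden in the $\Theta(\log(1/\varepsilon)/\varepsilon^2)$ bound on the cut value; it does not appear in the text.

```latex
\[
\underbrace{\frac{c_2\,\varepsilon}{\ln(1/\varepsilon)}}_{\text{relative error of the sketch}}
\cdot
\underbrace{C\,\frac{\ln(1/\varepsilon)}{\varepsilon^{2}}}_{\text{upper bound on } w(S,\,V\setminus S)}
\;=\; \frac{C\,c_2}{\varepsilon}
\;=\; O\!\left(\frac{c_2}{\varepsilon}\right).
\]
```

The logarithmic factors cancel, which is why the sketch accuracy is set to $c_2\varepsilon/\ln(1/\varepsilon)$ rather than $c_2\varepsilon$: the additive error per query then scales as $c_2/\varepsilon$ and can be made a small fraction of $1/\varepsilon$ by choosing $c_2$ small.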

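Now consider $\langle w,M_t\rangle$. By Lemma 3.2, $\langle M_t,\mathbf{1}\rangle=0$ and the rows of $M$ are orthogonal, so

$$\langle w,M_t\rangle=\langle\varepsilon x,M_t\rangle=\varepsilon\sum_{t'}z_{t'}\langle M_{t'},M_t\rangle=\varepsilon z_t\|M_t\|_2^2=\frac{z_t}{\varepsilon}.$$

We can see that, for a sufficiently small universal constant $c_2$, Bob can distinguish whether $z_t=1$ or $z_t=-1$ based on an $O(c_2/\varepsilon)$ additive approximation of $\langle w,M_t\rangle$.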
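The encode-then-decode step can be simulated end to end with exact inner products. The sketch below is illustrative and makes several assumptions: it uses the Sylvester Hadamard matrix, takes $1/\varepsilon=4$, replaces the offset $2c_1\ln(1/\varepsilon)$ by an arbitrary constant `shift`, and uses exact values of $\langle w,M_t\rangle$ instead of noisy cut-sketch estimates. All helper names are hypothetical.

```python
import random


def hadamard(k):
    """Sylvester construction of the 2^k x 2^k Hadamard matrix."""
    H = [[1]]
    for _ in range(k):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def tensor(u, v):
    return [a * b for a in u for b in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


k = 2
m = 2 ** k           # m stands in for 1/eps
eps = 1.0 / m
H = hadamard(k)
M = [tensor(H[i], H[j]) for i in range(1, m) for j in range(1, m)]

random.seed(1)
z = [random.choice([-1, 1]) for _ in M]          # Alice's bits
x = [sum(z[t] * M[t][c] for t in range(len(M))) for c in range(m * m)]
shift = 5.0  # placeholder for 2*c1*ln(1/eps); it cancels because <M_t, 1> = 0
w = [eps * xc + shift for xc in x]               # forward edge weights

# Bob's decoding rule: the sign of <w, M_t>, which equals z_t / eps.
inner = [dot(w, Mt) for Mt in M]
decoded = [1 if val > 0 else -1 for val in inner]
```

With exact inner products every bit is recovered; in the actual reduction each inner product is only known up to $O(c_2/\varepsilon)$ additive error, which is harmless as long as it stays below $1/\varepsilon$.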

Bob’s success probability is at least 0.95, because the encoding of $z$ fails with probability at most 0.01, and each of the 4 cut queries fails with probability at most 0.01.²

We next consider the case with general values of $n$, $\beta$, and $\varepsilon$, and prove Theorem 1.1.

Proof of Theorem 1.1.

Let $k=\sqrt{\beta}/\varepsilon$. We assume w.l.o.g. that $k$ is an integer, $n$ is a multiple of $k$, and $(1/\varepsilon)$ is a power of 2. Suppose Alice has a random string $s\in\{-1,1\}^{\Omega(nk)}$. We will show that $s$ can be encoded into a graph $G$ such that

  1. $G$ has $n$ nodes and is $O(\beta\log(1/\varepsilon))$-balanced, and

  2. Given a $\left(1\pm\frac{c_2\varepsilon}{\ln(1/\varepsilon)}\right)$ for-each cut sketch of $G$ and an index $q$, where $c_2>0$ is a sufficiently small universal constant, Bob can recover $s_q$ with probability at least 2/3.

Consequently, by Lemma 3.1, any for-each cut sketching algorithm must output $\Omega(nk)=\Omega(n\sqrt{\beta}/\varepsilon)=\tilde{\Omega}(n\sqrt{\beta'}/\varepsilon')$ bits for $\beta'=O(\beta\log(1/\varepsilon))$ and $\varepsilon'=c_2\varepsilon/\ln(1/\varepsilon)$.

We first describe the construction of $G$. We partition the $n$ nodes into $\ell=n/k\ge 2$ disjoint sets $V_1,\ldots,V_\ell$, each containing $k$ nodes. Let $s$ be Alice's random string with length $\beta\left(\frac{1}{\varepsilon}-1\right)^2(\ell-1)=\Omega(k^2\ell)=\Omega(nk)$. We partition $s$ into $(\ell-1)$ strings $(s_i)_{i=1}^{\ell-1}$, with $\Theta(k^2)$ bits in each substring. We then follow the same procedure as in Lemma 3.3 to encode $s_i$ into a complete bipartite graph between $V_i$ and $V_{i+1}$. Notice that we have $|s_i|=\beta\left(\frac{1}{\varepsilon}-1\right)^2$ and $|V_i|=|V_{i+1}|=\sqrt{\beta}/\varepsilon$, which is the same setting as in Lemma 3.3.

We can verify that $G$ is $O(\beta\log(1/\varepsilon))$-balanced. This is because every edge $e$ has a reverse edge whose weight is at most $O(\beta\log(1/\varepsilon))$ times the weight of $e$. For every $u\in V_i$ and $v\in V_{i+1}$, the edge $(u,v)$ has weight $\Theta(\log(1/\varepsilon))$, while the edge $(v,u)$ has weight $1/\beta$.

We next show that Bob can recover the $q$-th bit of $s$. Suppose Bob's index $q$ belongs to the substring $s_i$ which is encoded by the subgraph between $V_i$ and $V_{i+1}$. Similar to the proof of Lemma 3.3, Bob only needs to approximate $w(A,B)$ for 4 pairs of $(A,B)$ with $O(1/\varepsilon)$ additive error, where $A\subseteq V_i$, $B\subseteq V_{i+1}$, and $|A|=|B|=\frac{1}{2\varepsilon}$. To achieve this, Bob can query the cut value $w(S,V\setminus S)$ for $S=A\cup(V_{i+1}\setminus B)\cup\bigcup_{j=i+2}^{\ell}V_j$. The edges from $S$ to $(V\setminus S)$ are:

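  • $\frac{1}{4\varepsilon^2}$ forward edges from $A$ to $B$, each with weight $\Theta(\log(1/\varepsilon))$.

  • $\left(\frac{\sqrt{\beta}}{\varepsilon}-\frac{1}{2\varepsilon}\right)^2$ backward edges from $V_{i+1}\setminus B$ to $V_i\setminus A$, each with weight $\frac{1}{\beta}$.

  • $\frac{\sqrt{\beta}}{2\varepsilon^2}$ backward edges from $A$ to $V_{i-1}$ when $i\ge 2$, each with weight $\frac{1}{\beta}$.

The cut value $w(S,V\setminus S)$ is $\Theta(\log(1/\varepsilon)/\varepsilon^2)$. Consequently, given a $\left(1\pm\frac{c_2\varepsilon}{\ln(1/\varepsilon)}\right)$ for-each cut sketch, after subtracting the fixed weight of the backward edges, Bob can approximate $w(A,B)$ with $O(c_2/\varepsilon)$ additive error. Similar to the proof of Lemma 3.3, for sufficiently small constant $c_2>0$, repeating this process for 4 different pairs of $(A,B)$ will allow Bob to recover $s_q\in\{-1,1\}$. □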

4. For-All Cut Sketch

In this section, we prove an $\Omega(n\beta/\varepsilon^2)$ lower bound on the output size of $(1\pm\varepsilon)$ for-all cut sketching algorithms (Definition 2.2).

Theorem 1.2 (For-All Cut Sketch for Balanced Graphs).

Let $\beta\ge 1$ and $0<\varepsilon<1$. Assume $\beta/\varepsilon^2\le n/2$. Any $(1\pm\varepsilon)$ for-all cut sketching algorithm for $\beta$-balanced $n$-node graphs must output $\Omega(n\beta/\varepsilon^2)$ bits.

Our proof is inspired by [ACK+16] and uses the following communication complexity lower bound for an n-fold version of the Gap-Hamming problem.

Lemma 4.1 ([ACK+16]).

Consider the following distributional communication problem: Alice has $h$ strings $s_1,\ldots,s_h\in\{0,1\}^{1/\varepsilon^2}$ of Hamming weight $\frac{1}{2\varepsilon^2}$. Bob has an index $i\in[h]$ and a string $t\in\{0,1\}^{1/\varepsilon^2}$ of Hamming weight $\frac{1}{2\varepsilon^2}$, drawn as follows:

  1. $i$ is chosen uniformly at random;

  2. every $s_{i'}$ for $i'\ne i$ is chosen uniformly at random;

  3. $s_i$ and $t$ are chosen uniformly at random, conditioned on their Hamming distance $\Delta(s_i,t)$ being, with equal probability, either $\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$ or $\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$ for some universal constant $c>0$.

Consider a (possibly randomized) one-way protocol, in which Alice sends Bob a message, and Bob then determines with success probability at least 2/3 whether $\Delta(s_i,t)$ is $\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$ or $\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$. Then Alice must send $\Omega(h/\varepsilon^2)$ bits to Bob.

Before proving Theorem 1.2, we first consider the special case $n=\Theta(\beta/\varepsilon^2)$.

Lemma 4.2.

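Suppose $n=\Theta(\beta/\varepsilon^2)$. Any $(1\pm\varepsilon)$ for-all cut sketching algorithm for $\beta$-balanced $n$-node graphs must output $\Omega(n\beta/\varepsilon^2)=\Omega(\beta^2/\varepsilon^4)$ bits.

We reduce the distributional Gap-Hamming problem (Lemma 4.1) to the for-all cut sketching problem. Suppose Alice has $h$ strings $s_1,s_2,\ldots,s_h\in\{0,1\}^{1/\varepsilon^2}$ where $h=\beta^2/\varepsilon^2$, and Bob has an index $i\in[h]$ and a string $t\in\{0,1\}^{1/\varepsilon^2}$. We construct a graph $G$ to encode $s_1,s_2,\ldots,s_h$, such that given a for-all cut sketch of $G$, Bob can determine whether $\Delta(s_i,t)\ge\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$ or $\Delta(s_i,t)\le\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$ with high constant probability. Our lower bound then follows from Lemma 4.1.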

Construction of G.

We construct a directed complete bipartite graph $G$. Let $L$ and $R$ denote the left and right nodes of $G$, where $|L|=|R|=\beta/\varepsilon^2$. We partition $R$ into $\beta$ disjoint sets with $|R_1|=\cdots=|R_\beta|=1/\varepsilon^2$.

Consider the distributional Gap-Hamming problem in Lemma 4.1 with $h=\beta^2/\varepsilon^2$. We re-index Alice's $\beta^2/\varepsilon^2$ strings as $s_{i,j}$, where $i\in[\beta/\varepsilon^2]$ and $j\in[\beta]$. Let $1,2,\ldots,\beta/\varepsilon^2$ be the nodes in $L$. We encode $s_{i,j}\in\{0,1\}^{1/\varepsilon^2}$ using the edges from $i$ to $R_j$: For node $i$ and the $v$-th node in $R_j$, the forward edge $(i,v)$ has weight $s_{i,j}(v)+1$, and the backward edge $(v,i)$ has weight $1/\beta$. Note that the encoding of each $s_{i,j}$ is independent since $E(i,R_j)\cap E(i',R_{j'})=\emptyset$ for $(i,j)\ne(i',j')$.

Determining $\Delta(s_{i,j},t)$ from a for-all cut sketch of $G$.

Suppose Bob's input (after re-indexing) is $1\le i\le\beta/\varepsilon^2$, $1\le j\le\beta$, and $t\in\{0,1\}^{1/\varepsilon^2}$. Bob wants to decide whether $\Delta(s_{i,j},t)\ge\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$ or $\Delta(s_{i,j},t)\le\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$.

Let $N_i$ denote the set of nodes $v\in R_j$ where the forward edge $(i,v)$ has weight 2, which corresponds to the positions of 1 in $s_{i,j}$. Let $T$ be the set of nodes $v\in R_j$ such that $t(v)=1$.

$$\Delta(s_{i,j},t)=|N_i\setminus T|+|T\setminus N_i|=|N_i|+|T|-2|N_i\cap T|=\frac{1}{\varepsilon^2}-2|N_i\cap T|.$$
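The identity relating the Hamming distance to the intersection size is easy to confirm on random strings of the prescribed weight. In the sketch below, `m` stands in for $1/\varepsilon^2$, and the helper names `hamming` and `random_half_weight` are hypothetical.

```python
import random


def hamming(s, t):
    """Hamming distance between two equal-length 0/1 strings."""
    return sum(a != b for a, b in zip(s, t))


def random_half_weight(m, rng):
    """Uniformly random 0/1 string of length m with exactly m/2 ones."""
    ones = set(rng.sample(range(m), m // 2))
    return [1 if v in ones else 0 for v in range(m)]


rng = random.Random(0)
m = 16  # stands in for 1/eps^2
s = random_half_weight(m, rng)
t = random_half_weight(m, rng)

N = {v for v in range(m) if s[v] == 1}  # positions of 1 in Alice's string
T = {v for v in range(m) if t[v] == 1}  # positions of 1 in Bob's string
delta = hamming(s, t)
```

Since $|N|=|T|=m/2$, the distance `delta` equals `m - 2 * len(N & T)`, so a large intersection corresponds to a small Hamming distance and vice versa.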

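Hence, to determine whether $\Delta(s_{i,j},t)\le\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$ or $\Delta(s_{i,j},t)\ge\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$, Bob only needs to decide whether $|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}$ or $|N_i\cap T|\le\frac{1}{4\varepsilon^2}-\frac{c}{2\varepsilon}$.

Let $S=\{i\}\cup(R\setminus T)$. The cut $w(S,V\setminus S)$ consists of forward edges from $i$ to $T$ and backward edges from $(R\setminus T)$ to $(L\setminus\{i\})$. Ideally, if Bob knows $w(S,V\setminus S)$, he can subtract the weight of backward edges to obtain $w(i,T)=|T|+|N_i\cap T|=\frac{1}{2\varepsilon^2}+|N_i\cap T|$ and recover $|N_i\cap T|$. However, Bob can only get a $(1\pm\varepsilon)$-approximation of $w(S,V\setminus S)$, which may have $\Theta(\beta/\varepsilon^3)$ additive error because $w(S,V\setminus S)=\Theta(\beta/\varepsilon^4)$. With this much error, Bob cannot distinguish between the two cases.

To overcome this issue, we follow the idea of [ACK+16]. Intuitively, when $c$ is small, roughly half of $i\in L$ satisfy $|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}$. By enumerating all subsets $Q\subseteq L$ of size $|Q|=\frac{|L|}{2}$, Bob can find a set $Q$ such that most nodes $i\in Q$ satisfy $|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}$. Since $|Q|=\frac{\beta}{2\varepsilon^2}$, the $\frac{c}{2\varepsilon}$ bias per node adds up to roughly $\frac{c\beta}{2\varepsilon^3}$, which can be detected even with $\Theta(\beta/\varepsilon^3)$ error.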

To prove Lemma 4.2, we need the following two technical lemmas, which are essentially proved in [ACK+16].

Lemma 4.3 (Claim 3.5 in [ACK+16]).

Let c>0 and βε10c. Consider the following sets:

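$$L_{\mathrm{high}}=\left\{i\in L:|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}\right\},\quad\text{and}\quad L_{\mathrm{low}}=\left\{i\in L:|N_i\cap T|\le\frac{1}{4\varepsilon^2}-\frac{c}{2\varepsilon}\right\}.$$

With probability at least 0.98, we have $\frac{1}{2}-10c\le\frac{|L_{\mathrm{high}}|}{|L|}\le\frac{1}{2}$ and $\frac{1}{2}-10c\le\frac{|L_{\mathrm{low}}|}{|L|}\le\frac{1}{2}$.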

Lemma 4.4 (Lemma 3.4 in [ACK+16]).

Let $c_1>0$ be a sufficiently small universal constant. Suppose one can approximate $w(U,T)$ with additive error $c_1\beta/\varepsilon^3$ for every $U\subseteq L$ with $|U|=\frac{|L|}{2}$. Let $Q\subseteq L$ be the subset with the highest (approximate) cut value. Then, with probability at least 0.96, we have $\frac{|L_{\mathrm{high}}\cap Q|}{|L_{\mathrm{high}}|}\ge\frac{4}{5}$.

We are now ready to prove Lemma 4.2.

Proof of Lemma 4.2.

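We reduce from the distributional Gap-Hamming problem (Lemma 4.1) with $h=\beta^2/\varepsilon^2$. We re-index Alice's $h$ strings as $s_{i,j}$, where $i\in[\beta/\varepsilon^2]$ and $j\in[\beta]$.

We construct a directed bipartite graph $G$ with two parts $L$ and $R$, where $|L|=|R|=\beta/\varepsilon^2$. Let $L=\{1,2,\ldots,\beta/\varepsilon^2\}$. We partition $R$ into $\beta$ disjoint sets with $|R_1|=\cdots=|R_\beta|=1/\varepsilon^2$. We encode $s_{i,j}\in\{0,1\}^{1/\varepsilon^2}$ using the edges from $i$ to $R_j$: For node $i$ and the $v$-th node in $R_j$, the forward edge $(i,v)$ has weight $s_{i,j}(v)+1$, and the backward edge $(v,i)$ has weight $1/\beta$.

Note that $G$ is $(2\beta)$-balanced. We will show that given a $(1\pm c_2\varepsilon)$ for-all cut sketch of $G$ for some constant $c_2>0$, Bob can decide whether $\Delta(s_{i,j},t)\ge\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$ or $\Delta(s_{i,j},t)\le\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$ with probability at least 2/3. Consequently, by Lemma 4.1, any for-all cut sketching algorithm must output $\Omega(h/\varepsilon^2)=\Omega(\beta^2/\varepsilon^4)=\Omega(\beta'^2/\varepsilon'^4)$ bits for $\beta'=2\beta$ and $\varepsilon'=c_2\varepsilon$.

Bob enumerates every $U\subseteq L$ with $|U|=\frac{|L|}{2}=\frac{\beta}{2\varepsilon^2}$ and uses the cut sketch to approximate $w(U,T)$, where $T\subseteq R_j$ corresponds to the positions of 1 in Bob's string $t$ and $|T|=\frac{1}{2\varepsilon^2}$. Let $S=U\cup(R\setminus T)$. The cut $(S,V\setminus S)$ has $\frac{\beta}{4\varepsilon^4}$ forward edges from $U$ to $T$ with weights 1 or 2, and $\left(\frac{\beta}{\varepsilon^2}-\frac{1}{2\varepsilon^2}\right)\cdot\frac{\beta}{2\varepsilon^2}=O\left(\frac{\beta^2}{\varepsilon^4}\right)$ backward edges from $(R\setminus T)$ to $(L\setminus U)$ with weight $\frac{1}{\beta}$. The total weight of these edges is $O(\beta/\varepsilon^4)$. Therefore, given a $(1\pm c_2\varepsilon)$ for-all cut sketch, Bob can subtract the fixed weight of the backward edges and approximate $w(U,T)$ with additive error $O(c_2\beta/\varepsilon^3)$. When $c_2$ is sufficiently small, this additive error is at most $c_1\beta/\varepsilon^3$. By Lemma 4.4, Bob can find $Q\subseteq L$ with $|Q|=\frac{|L|}{2}$ such that $\frac{|L_{\mathrm{high}}\cap Q|}{|L_{\mathrm{high}}|}\ge\frac{4}{5}$. Finally, if $i\in Q$, Bob decides $|N_i\cap T|\ge\frac{1}{4\varepsilon^2}+\frac{c}{2\varepsilon}$ and $\Delta(s_{i,j},t)\le\frac{1}{2\varepsilon^2}-\frac{c}{\varepsilon}$; and if $i\notin Q$, Bob decides $\Delta(s_{i,j},t)\ge\frac{1}{2\varepsilon^2}+\frac{c}{\varepsilon}$.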

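Suppose Bob’s index is $(i,j)$. Notice that Bob uses $j$ to determine which $R_j$ to look at, but does not use any information about $i$. Therefore, when $\Delta(s_{i,j}, t) \le \frac{1}{2\varepsilon^2} - \frac{c}{\varepsilon}$ and $|N(i) \cap T| \ge \frac{1}{4\varepsilon^2} + \frac{c}{2\varepsilon}$,

$$\Pr[i \in Q] = \frac{|L_{\mathrm{high}} \cap Q|}{|L_{\mathrm{high}}|} \ge \frac{4}{5}.$$

Conversely, when $\Delta(s_{i,j}, t) \ge \frac{1}{2\varepsilon^2} + \frac{c}{\varepsilon}$ and $|N(i) \cap T| \le \frac{1}{4\varepsilon^2} - \frac{c}{2\varepsilon}$, because $L_{\mathrm{low}} \cap L_{\mathrm{high}} = \emptyset$,

$$\Pr[i \notin Q] = \frac{|L_{\mathrm{low}} \setminus Q|}{|L_{\mathrm{low}}|} = \frac{|L_{\mathrm{low}}| - |L_{\mathrm{low}} \cap Q|}{|L_{\mathrm{low}}|} \ge \frac{|L_{\mathrm{low}}| - \frac{1}{5}|L_{\mathrm{high}}|}{|L_{\mathrm{low}}|} \ge \frac{3}{4}.$$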

The last inequality holds because $|L_{\mathrm{low}}| \ge 0.4|L|$ and $|L_{\mathrm{high}}| \le 0.5|L|$ by Lemma 4.3 when $c \le 0.01$ (so that $\frac{1}{2} - 10c \ge 0.4$).
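Spelled out, the two bounds from Lemma 4.3 give the final inequality as follows (a worked version of the step above, using only the quantities already defined):

```latex
\[
\frac{|L_{\mathrm{low}}| - \tfrac{1}{5}|L_{\mathrm{high}}|}{|L_{\mathrm{low}}|}
= 1 - \frac{|L_{\mathrm{high}}|}{5\,|L_{\mathrm{low}}|}
\ge 1 - \frac{0.5\,|L|}{5 \cdot 0.4\,|L|}
= 1 - \frac{1}{4}
= \frac{3}{4}.
\]
```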

We analyze Bob’s success probability. Lemma 4.3 fails with probability at most 0.02, Lemma 4.4 fails with probability at most 0.04, and the for-all cut sketch fails with probability at most 0.01 (see footnote 3). If they all succeed, Bob’s probability of answering correctly is at least $\min\left\{\frac{|L_{\mathrm{high}} \cap Q|}{|L_{\mathrm{high}}|}, \frac{|L_{\mathrm{low}} \setminus Q|}{|L_{\mathrm{low}}|}\right\} \ge \frac{3}{4}$. Bob’s overall failure probability is at most $0.02 + 0.04 + 0.01 + 0.25 < 1/3$. □

We next consider the case with general values of $n$, $\beta$, and $\varepsilon$, and prove Theorem 1.2.

Proof of Theorem 1.2.

Let $k = \beta/\varepsilon^2$. We assume w.l.o.g. that $k$ is an integer and $n$ is a multiple of $k$. We reduce from the distributional Gap-Hamming problem in Lemma 4.1 with $h = \Omega(n\beta)$. We will show that Alice’s strings can be encoded into a graph $G$ such that

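  1. $G$ has $n$ nodes and is $(2\beta)$-balanced, and

  2. After receiving a string $t$, an index $q \in [h]$, and a $(1 \pm c_2\varepsilon)$ for-all cut sketch of $G$ for some universal constant $c_2 > 0$, Bob can distinguish whether $\Delta(s_q, t) \le \frac{1}{2\varepsilon^2} - \frac{c}{\varepsilon}$ or $\Delta(s_q, t) \ge \frac{1}{2\varepsilon^2} + \frac{c}{\varepsilon}$ with probability at least $2/3$.

Consequently, by Lemma 4.1, any for-all cut sketching algorithm must output $\Omega(h/\varepsilon^2) = \Omega(n\beta/\varepsilon^2) = \Omega(n\beta'/\varepsilon'^2)$ bits for $\beta' = 2\beta$ and $\varepsilon' = c_2\varepsilon$.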

We first describe the construction of $G$. We partition the $n$ nodes into $\ell = n/k \ge 2$ disjoint sets $V_1, V_2, \ldots, V_\ell$, each containing $k$ nodes. Let $s_1, s_2, \ldots, s_h \in \{0,1\}^{1/\varepsilon^2}$ be Alice’s random strings, where $h = (\ell - 1)\beta^2/\varepsilon^2 = \Omega(n/k) \cdot \beta^2/\varepsilon^2 = \Omega(n\beta)$. We partition the $h$ strings into $\ell - 1$ disjoint sets $S_1, S_2, \ldots, S_{\ell-1}$, each with $\beta^2/\varepsilon^2$ strings. We then follow the same procedure as in Lemma 4.2 to encode $S_i$ into a complete bipartite graph between $V_i$ and $V_{i+1}$. Notice that $S_i$ has $\beta^2/\varepsilon^2$ strings and $|V_i| = |V_{i+1}| = k = \beta/\varepsilon^2$, which is the same setting as in Lemma 4.2.
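The count of encoded strings can be checked directly. Writing $\ell = n/k$ for the number of node sets and using $\ell \ge 2$ (so that $\ell - 1 \ge \ell/2$) together with $k = \beta/\varepsilon^2$:

```latex
\[
h = (\ell - 1)\,\frac{\beta^2}{\varepsilon^2}
\;\ge\; \frac{\ell}{2}\cdot\frac{\beta^2}{\varepsilon^2}
= \frac{n}{2k}\cdot\frac{\beta^2}{\varepsilon^2}
= \frac{n\,\varepsilon^2}{2\beta}\cdot\frac{\beta^2}{\varepsilon^2}
= \frac{n\beta}{2}
= \Omega(n\beta).
\]
```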

We can verify that $G$ is $(2\beta)$-balanced. This is because every edge $e$ has a reverse edge whose weight is at most $2\beta$ times the weight of $e$. For every $u \in V_i$ and $v \in V_{i+1}$, the edge $(u,v)$ has weight 1 or 2, while the edge $(v,u)$ has weight $1/\beta$.

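We next show how Bob can distinguish between the two cases. Suppose Bob’s index $q$ specifies a string encoded by the subgraph between $V_i$ and $V_{i+1}$. Similar to the proof of Lemma 4.2, we only need to show that given a $(1 \pm c_2\varepsilon)$ for-all cut sketch, Bob can approximate $w(U,T)$ with additive error $O(\beta/\varepsilon^3)$ for every $U \subseteq V_i$ with $|U| = \frac{|V_i|}{2} = \frac{\beta}{2\varepsilon^2}$ and for some $T \subseteq V_{i+1}$ with $|T| = \frac{1}{2\varepsilon^2}$. To see this, consider $S = U \cup (V_{i+1} \setminus T) \cup \bigcup_{j \ge i+2} V_j$, where the union ranges over all node sets $V_j$ with index at least $i+2$. The edges from $S$ to $V \setminus S$ are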

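  • $\frac{\beta}{4\varepsilon^4}$ forward edges from $U$ to $T$, each with weight 1 or 2.

  • $\left(\frac{\beta}{\varepsilon^2} - \frac{1}{2\varepsilon^2}\right)\frac{\beta}{2\varepsilon^2}$ backward edges from $V_{i+1} \setminus T$ to $V_i \setminus U$, each with weight $\frac{1}{\beta}$.

  • $\frac{\beta^2}{2\varepsilon^4}$ backward edges from $U$ to $V_{i-1}$ when $i \ge 2$, each with weight $\frac{1}{\beta}$.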

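The total weight of these edges is $w(S, V \setminus S) = O(\beta/\varepsilon^4)$. Consequently, given a $(1 \pm c_2\varepsilon)$ cut sketch, Bob can subtract the fixed weight of the backward edges and approximate $w(U,T)$ with $O(c_2\varepsilon \cdot \beta/\varepsilon^4) = O(c_2\beta/\varepsilon^3)$ additive error. Similar to the proof of Lemma 4.2, for a sufficiently small constant $c_2 > 0$, this allows Bob to distinguish between the two cases $\Delta(s_q, t) \le \frac{1}{2\varepsilon^2} - \frac{c}{\varepsilon}$ and $\Delta(s_q, t) \ge \frac{1}{2\varepsilon^2} + \frac{c}{\varepsilon}$ with probability at least $2/3$. □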

5. Local Query Complexity of Min-Cut

In this section, we present an $\Omega\left(\min\left\{m, \frac{m}{\varepsilon^2 k}\right\}\right)$ lower bound on the query complexity of approximating the global minimum cut of an undirected graph $G$ to a $(1 \pm \varepsilon)$ factor in the local query model. Formally, we have the following theorem.

Theorem 1.3 (Approximating Min-Cut using Local Queries).

Any algorithm that estimates the size of the global minimum cut of a graph $G$ up to a $(1 \pm \varepsilon)$ factor requires $\Omega\left(\min\left\{m, \frac{m}{\varepsilon^2 k}\right\}\right)$ queries in expectation in the local query model, where $m$ is the number of edges in $G$ and $k$ is the size of the minimum cut.

To achieve this, we define a variant of the 2-SUM communication problem in Section 5.1, show a graph construction in Section 5.2, and show that approximating 2-SUM can be reduced to the minimum cut problem using our graph construction in Section 5.3. In Section 5.4, we will show that our lower bound is tight up to logarithmic factors.

5.1. 2-SUM Preliminaries

Building off of the work of [WZ14], we define the following variant of the 2-SUM(t,L,α) problem.

Definition 5.1.

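For binary strings $x = (x_1, \ldots, x_L) \in \{0,1\}^L$ and $y = (y_1, \ldots, y_L) \in \{0,1\}^L$, let $\mathrm{INT}(x,y) = \sum_{i=1}^{L} x_i y_i$ denote the number of indices where $x$ and $y$ are both 1. Let $\mathrm{DISJ}(x,y)$ denote whether $x$ and $y$ are disjoint. That is, $\mathrm{DISJ}(x,y) = 1$ if $\mathrm{INT}(x,y) = 0$, and $\mathrm{DISJ}(x,y) = 0$ if $\mathrm{INT}(x,y) \ge 1$.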

Definition 5.2.

Suppose Alice has $t$ binary strings $X_1, \ldots, X_t$, where each string $X_i \in \{0,1\}^L$ has length $L$, and likewise Bob has $t$ strings $Y_1, \ldots, Y_t$, each of length $L$. $\mathrm{INT}(X_i, Y_i)$ is guaranteed to be either 0 or $\alpha \ge 1$ for each pair of strings $(X_i, Y_i)$. Furthermore, at least a $1/1000$ fraction of the pairs $(X_i, Y_i)$ are guaranteed to satisfy $\mathrm{INT}(X_i, Y_i) = \alpha$. In the 2-SUM$(t, L, \alpha)$ problem, Alice and Bob want to approximate $\sum_{i \in [t]} \mathrm{DISJ}(X_i, Y_i)$ up to additive error $\sqrt{t}$ with high constant probability.
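Definitions 5.1 and 5.2 translate directly into code. The following is a minimal sketch; the function names are hypothetical, strings are represented as lists of 0/1 integers, and the code computes the exact target quantity (it does not model any communication).

```python
def int_count(x, y):
    """INT(x, y): number of positions where both strings are 1."""
    return sum(a & b for a, b in zip(x, y))

def disj(x, y):
    """DISJ(x, y): 1 if the strings are disjoint, 0 otherwise."""
    return 1 if int_count(x, y) == 0 else 0

def two_sum_target(X, Y):
    """The quantity that 2-SUM asks Alice and Bob to approximate."""
    return sum(disj(x, y) for x, y in zip(X, Y))

def satisfies_promise(X, Y, alpha):
    """Check the promise: every INT is 0 or alpha, and at least a 1/1000
    fraction of the pairs intersect."""
    ints = [int_count(x, y) for x, y in zip(X, Y)]
    only_allowed_values = all(v in (0, alpha) for v in ints)
    enough_intersections = 1000 * sum(v == alpha for v in ints) >= len(ints)
    return only_allowed_values and enough_intersections

# Toy instance with t = 2 pairs of length L = 3 and alpha = 1.
X = [[1, 0, 1], [0, 1, 0]]
Y = [[1, 1, 0], [1, 0, 1]]
```

In this toy instance the first pair intersects in exactly one position and the second pair is disjoint, so the target value is 1.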

Lemma 5.3.

To solve 2-SUM(t,L,1) with high constant probability, the expected number of bits Alice and Bob need to communicate is Ω(tL).

Proof.

[WZ14] proved an expected communication complexity of $\Omega(tL)$ for 2-SUM$(t, L, 1)$ without the promise that at least a $1/1000$ fraction of the $t$ string pairs intersect. Adding this promise does not change the communication complexity, because if $X_1, \ldots, X_t$ and $Y_1, \ldots, Y_t$ do not satisfy the promise, we can add a number of new $X_i$ and $Y_i$ to satisfy the promise and later subtract their contribution to approximate $\sum_{i \in [t]} \mathrm{DISJ}(X_i, Y_i)$ with additive error $\Theta(\sqrt{t})$. □

Theorem 5.4.

To solve 2-SUM(t,L,α) with high constant probability, the expected number of bits Alice and Bob need to communicate is Ω(tL/α).

Proof.

Consider an instance of 2-SUM$(t, L/\alpha, 1)$ with Alice’s strings $X_1, \ldots, X_t$ and Bob’s strings $Y_1, \ldots, Y_t$, each with length $L/\alpha$. For each of Alice’s strings $X_i$ with length $L/\alpha$, we produce $X_{i,\alpha}$ (with length $L$) by concatenating $\alpha$ copies of $X_i$, and likewise we produce $Y_{i,\alpha}$ for each of Bob’s strings $Y_i$. The setup where Alice has strings $X_{1,\alpha}, \ldots, X_{t,\alpha}$ and Bob has strings $Y_{1,\alpha}, \ldots, Y_{t,\alpha}$ is an instance of 2-SUM$(t, L, \alpha)$. From Lemma 5.3, the communication complexity of 2-SUM$(t, L/\alpha, 1)$ is $\Omega(tL/\alpha)$. Thus, the communication complexity of 2-SUM$(t, L, \alpha)$ is $\Omega(tL/\alpha)$. □
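The padding step in this proof is easy to check on a small example: repeating each string $\alpha$ times multiplies every intersection count by $\alpha$ and leaves disjoint pairs disjoint. The sketch below uses hypothetical helper names and lists of 0/1 integers.

```python
def int_count(x, y):
    """INT(x, y): number of positions where both strings are 1."""
    return sum(a & b for a, b in zip(x, y))

def amplify(s, alpha):
    """Concatenate alpha copies of the string s."""
    return s * alpha

alpha = 3
x, y = [1, 0, 1, 0], [1, 1, 0, 0]        # INT(x, y) = 1
x0, y0 = [1, 0, 0, 0], [0, 1, 1, 0]      # disjoint pair

x_big, y_big = amplify(x, alpha), amplify(y, alpha)
x0_big, y0_big = amplify(x0, alpha), amplify(y0, alpha)
```

Since the two parties amplify their own strings locally, the step costs no communication.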

5.2. Graph Construction

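Inspired by the graph construction from [ER18], given two strings $x, y \in \{0,1\}^N$, we construct a graph $G_{x,y}(V,E)$ such that $V$ is partitioned into $A$, $A'$, $B$, and $B'$, where $|A| = |A'| = |B| = |B'| = \sqrt{N} = \ell$. Note that since $\ell^2 = N$, we can index the bits in $x$ by $x_{i,j}$, where $1 \le i, j \le \ell$. We construct the edges $E$ according to the following rule:

$$\begin{cases} (a_i, b'_j), (b_i, a'_j) \in E & \text{if } x_{i,j} = y_{i,j} = 1, \\ (a_i, a'_j), (b_i, b'_j) \in E & \text{otherwise.} \end{cases}$$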

Figure 2 illustrates an example of the graph $G_{x,y}(V,E)$ when $x = 000000100$ and $y = 100010100$.

Figure 2: Example of $G_{x,y}(V,E)$ where $x = 000000100$ and $y = 100010100$. The red edges represent the intersection at $x_{31} = y_{31} = 1$. The green edges represent all the non-intersections in $x$ and $y$.
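The construction can be reproduced for the example strings of Figure 2. The sketch below is illustrative only: the function name `build_gxy`, the vertex labels, and the row-major indexing of the bits (bit $x_{i,j}$ stored at position $(i-1)\ell + (j-1)$) are assumptions, chosen so that the seventh bit corresponds to $x_{31}$ as in the figure caption.

```python
import math
from collections import Counter

def build_gxy(x, y):
    """Edge list of G_{x,y}; vertices are labeled ('a', i), ("a'", j), ('b', i), ("b'", j)."""
    n_bits = len(x)
    ell = math.isqrt(n_bits)
    assert ell * ell == n_bits == len(y), "string length must be a perfect square"
    edges = []
    for i in range(ell):
        for j in range(ell):
            if x[i * ell + j] == 1 and y[i * ell + j] == 1:
                # intersection: two edges crossing between the A-side and the B-side
                edges.append((("a", i), ("b'", j)))
                edges.append((("b", i), ("a'", j)))
            else:
                edges.append((("a", i), ("a'", j)))
                edges.append((("b", i), ("b'", j)))
    return edges

x = [int(c) for c in "000000100"]
y = [int(c) for c in "100010100"]
edges = build_gxy(x, y)

degrees = Counter(v for e in edges for v in e)
# Edges with one endpoint in A or A' and the other in B or B'.
crossing = [e for e in edges if e[0][0][0] != e[1][0][0]]
```

Every vertex has degree $\ell = 3$, the graph has $2N = 18$ edges, and exactly two edges cross between $A \cup A'$ and $B \cup B'$, one pair per intersection.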

We will show that under certain assumptions about $N$ and $\mathrm{INT}(x,y)$, the size of the minimum cut in $G_{x,y}$ is twice the number of intersections in $x$ and $y$.

Lemma 5.5.

Given $x, y \in \{0,1\}^N$, if $\sqrt{N} \ge 3\,\mathrm{INT}(x,y)$, then $\mathrm{MINCUT}(G_{x,y}) = 2\,\mathrm{INT}(x,y)$.

Proof.

To prove this, we use some properties of the $\gamma$-connectivity of a graph. A graph $G$ is $\gamma$-connected if at least $\gamma$ edges must be removed from $G$ to disconnect it. In other words, if a graph $G$ is $\gamma$-connected, then $\mathrm{MINCUT}(G) \ge \gamma$. Equivalently, a graph $G$ is $\gamma$-connected if for every $u, v \in V$, there are at least $\gamma$ edge-disjoint paths between $u$ and $v$. Therefore, given $\mathrm{INT}(x,y) = \gamma$, if we can show that $G_{x,y}$ is $2\gamma$-connected and that there exists a cut of size exactly $2\gamma$, then $\mathrm{MINCUT}(G_{x,y}) = 2\gamma$. By the construction of the graph, it is easy to see that $\mathrm{CUT}(A \cup A', B \cup B')$ has size $2\gamma$, since each intersection of $x$ and $y$ produces two crossing edges. Therefore, all we need to show is that if $\sqrt{N} \ge 3\gamma$, then $G_{x,y}$ is $2\gamma$-connected.

Similar to [ER18], we prove this by looking at each pair of $u, v \in V$. Our goal is to show that for every $u, v \in V$, there exist at least $2\gamma$ edge-disjoint paths from $u$ to $v$.

Case 1.

$u, v \in A$ (or symmetrically $u, v \in A'$, $B$, or $B'$). For each pair $u, v \in A$, there are at least $\ell - \gamma$ distinct common neighbors in $A'$. This is because one intersection at $x_{ij}$ and $y_{ij}$ implies that the edge $(a_i, a'_j)$ is not contained in $E$, and would remove at most one common neighbor in $A'$. Since $\ell = \sqrt{N} \ge 3\gamma$, there are at least $\ell - \gamma \ge 2\gamma$ distinct common neighbors in $A'$, which we denote by $u_1^{A'}, u_2^{A'}, \ldots, u_{2\gamma}^{A'}$. Therefore, the paths $u \to u_i^{A'} \to v$ are edge-disjoint, and we have at least $2\gamma$ edge-disjoint paths from $u$ to $v$, as shown in Figure 3.

Figure 3: $u, v \in A$. We omit all the $(a_i, b'_j)$, $(b_i, b'_j)$, and $(b_i, a'_j)$ edges.

Case 2.

$u \in A$, $v \in A'$ (or symmetrically $u \in B$, $v \in B'$). Since $\ell - \gamma \ge 2\gamma$, $v$ has at least $2\gamma$ distinct neighbors in $A$, which we denote by $u_1^{A}, u_2^{A}, \ldots, u_{2\gamma}^{A}$. From Case 1, each $u_i^{A}$ has at least $2\gamma$ distinct common neighbors with $u$ in $A'$. Therefore, we can choose $v_1^{A'}, v_2^{A'}, \ldots, v_{2\gamma}^{A'}$ such that the paths $u \to v_i^{A'} \to u_i^{A} \to v$ are edge-disjoint, so we have at least $2\gamma$ edge-disjoint paths from $u$ to $v$, as shown in Figure 4. Note that it may be the case that $u_i^{A} = u$. In this case, we can simply take the edge $(u,v)$ to be one of the edge-disjoint paths.

Figure 4: $u \in A$, $v \in A'$. We omit all the $(a_i, b'_j)$, $(b_i, b'_j)$, and $(b_i, a'_j)$ edges. The green edges exist since $v$ has at least $2\gamma$ neighbors in $A$. The orange edges exist since $u_i^{A}$ and $u$ have at least $2\gamma$ common neighbors in $A'$.

Case 3.

$u \in A$, $v \in B'$ (or symmetrically $u \in A'$, $v \in B$). In this case, we show two sets of edge-disjoint paths, where each set has at least $\gamma$ edge-disjoint paths from $u$ to $v$, and the two sets of paths do not overlap. Overall, we have at least $2\gamma$ edge-disjoint paths.

The first set of paths $S_1$ uses the edges between $A'$ and $B$. Let $(w_1, x_1), (w_2, x_2), \ldots, (w_\gamma, x_\gamma) \in A' \times B$ be the edges between $A'$ and $B$. Each of these edges represents one intersection in $x$ and $y$. Therefore, there are exactly $\gamma$ of them. From Case 2, we have that for every $w_i$, there are $2\gamma$ edge-disjoint paths from $u$ to $w_i$. Hence, for every $w_i$, we can choose a path from $u$ to $w_i$ such that these $\gamma$ paths are edge-disjoint. Figure 5 illustrates the paths $u \to u_i \to u'_i \to w_i \to x_i$. By symmetry, we can extend the paths from $x_i$ to $v$. This gives us $\gamma$ edge-disjoint paths from $u$ to $v$.

Figure 5: $u \in A$, $v \in B'$. The first set of paths $S_1$ goes from $u \to u_i \to u'_i \to w_i \to x_i$. We omit the paths from $x_i$ to $v$, as they are symmetric to the paths from $w_i$ to $u$. Once we extend the paths from $x_i$ to $v$, we have $\gamma$ edge-disjoint paths from $u$ to $v$. Note that the $w_i$ and $x_i$ may not be distinct.

We now consider the second set of paths $S_2$. Let

$$(y_1, z_1), (y_2, z_2), \ldots, (y_\gamma, z_\gamma) \in A \times B'$$

be the distinct edges between $A$ and $B'$. Once again, it suffices to prove that there are $2\gamma$ edge-disjoint paths from $u$ to $y_i$, since the paths from $v$ to $z_i$ would be symmetric. From Case 1, we have that for every $y_i$, there are at least $2\gamma$ common neighbors between $y_i$ and $u$. Therefore, we can always find distinct $\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_\gamma$ such that the paths $u \to \tilde{u}_i \to y_i$ are edge-disjoint, as shown in Figure 6. Once we extend the paths from $z_i$ to $v$, we have $\gamma$ edge-disjoint paths in the second set.

Figure 6: $u \in A$, $v \in B'$. The second set of paths $S_2$ goes from $u \to \tilde{u}_i \to y_i \to z_i$. We omit the paths from $z_i$ to $v$, as they are symmetric to the paths from $y_i$ to $u$. Once we extend the paths from $z_i$ to $v$, we have $\gamma$ edge-disjoint paths from $u$ to $v$. Note that the $y_i$ and $z_i$ may not be distinct.

Now we have two sets of paths $S_1$ and $S_2$, each containing at least $\gamma$ edge-disjoint paths. It remains to show that the paths in $S_1$ and $S_2$ can be chosen to be edge-disjoint from each other. Observe that the only possible edge overlaps between the paths from $u$ to the $w_i$ and the paths from $u$ to the $y_i$ are the first edges $u \to u_i$ (in $S_1$) and $u \to \tilde{u}_i$ (in $S_2$), since $u_i$ and $\tilde{u}_i$ are both neighbors of $u$. However, what we have shown is that for every $w_i$ or $y_i$, there are at least $2\gamma$ edge-disjoint paths from $u$ to $w_i$ or $y_i$. Therefore, one can choose $2\gamma$ edge-disjoint paths from $u$ to the $w_i$ and the $y_i$ such that the $u_i$ and the $\tilde{u}_i$ do not overlap. Similarly, one can choose $2\gamma$ edge-disjoint paths from $v$ to the $z_i$ and the $x_i$. Overall, we have $2\gamma$ edge-disjoint paths from $u$ to $v$.

Case 4.

$u \in A$, $v \in B$ (or symmetrically $u \in A'$, $v \in B'$). This case is similar to Case 3, where we have two edge-disjoint sets $S_1$ and $S_2$. Consider the set of paths $S_1$, where we use the edges

$$(w_1, x_1), (w_2, x_2), \ldots, (w_\gamma, x_\gamma) \in A' \times B.$$

We can construct the paths from $u$ to $w_i$ in the same way as for $S_1$ in Case 3 (Figure 5). For the paths from $x_i$ to $v$, however, we construct them in the same way as for $S_2$ in Case 3 (Figure 6). By connecting these paths, we obtain at least $\gamma$ edge-disjoint paths in $S_1$. Similarly, we can also construct at least $\gamma$ edge-disjoint paths in $S_2$, where we use the edges

$$(y_1, z_1), (y_2, z_2), \ldots, (y_\gamma, z_\gamma) \in A \times B'.$$

As in Case 3, the paths in $S_1$ and $S_2$ can be chosen to be edge-disjoint from each other. □
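Lemma 5.5 can be sanity-checked by brute force on tiny instances. The sketch below is an illustration, not part of the proof: it rebuilds $G_{x,y}$ for $N = 9$ (so $\ell = 3$ and the lemma applies when $\mathrm{INT}(x,y) \le 1$) and enumerates all bipartitions of the 12 vertices. The function names and the row-major bit indexing are assumptions.

```python
import math

def build_gxy(x, y):
    """Edge list of G_{x,y} with vertices ('a', i), ("a'", j), ('b', i), ("b'", j)."""
    ell = math.isqrt(len(x))
    edges = []
    for i in range(ell):
        for j in range(ell):
            if x[i * ell + j] == 1 and y[i * ell + j] == 1:
                edges += [(("a", i), ("b'", j)), (("b", i), ("a'", j))]
            else:
                edges += [(("a", i), ("a'", j)), (("b", i), ("b'", j))]
    return edges

def min_cut_brute_force(edges):
    """Global minimum cut by enumerating every bipartition (tiny graphs only)."""
    verts = sorted({v for e in edges for v in e})
    idx = {v: k for k, v in enumerate(verts)}
    n = len(verts)
    best = len(edges)
    # The last vertex is always outside the chosen side, so each
    # bipartition into two nonempty parts is visited exactly once.
    for mask in range(1, 1 << (n - 1)):
        cut = sum(((mask >> idx[u]) & 1) != ((mask >> idx[v]) & 1) for u, v in edges)
        best = min(best, cut)
    return best

def bits(s):
    return [int(c) for c in s]

one_intersection = min_cut_brute_force(build_gxy(bits("000000100"), bits("100010100")))
no_intersection = min_cut_brute_force(build_gxy(bits("000000000"), bits("100010100")))
```

With one intersection the minimum cut is 2, and with no intersections the two halves $A \cup A'$ and $B \cup B'$ are disconnected, so the minimum cut is 0; both agree with $2\,\mathrm{INT}(x,y)$.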

5.3. Reducing 2-SUM to MINCUT

In this section, we use the graph constructions in Section 5.2 to reduce the 2-SUM(t,L,α) problem to MINCUT and derive a lower bound on the number of queries in the local query model.

Lemma 5.6.

Given $M, \lambda > 0$ and $0 < \varepsilon < 1$, suppose that we have any algorithm $\mathcal{A}$ that can estimate the size of the minimum cut of a graph up to a $(1 \pm \varepsilon)$ multiplicative factor with $T$ expected queries in the local query model. Then there exists an algorithm that can approximate 2-SUM$(\varepsilon^{-2}, \varepsilon^2 M, \max\{\varepsilon^2\lambda, 1\})$ up to an additive error $\sqrt{\varepsilon^{-2}} = \varepsilon^{-1}$ using at most $O(T)$ bits of communication in expectation, given $\sqrt{M} \ge 3\max\{\lambda, \varepsilon^{-2}\}$.

Proof.

We will show that the following algorithm satisfies the above conditions:

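  1. Given Alice’s strings $(X_1, \ldots, X_{\varepsilon^{-2}})$, each of length $\varepsilon^2 M$, let $x$ be the concatenation of Alice’s strings, having total length $\varepsilon^{-2} \cdot \varepsilon^2 M = M$. Similarly, let $y \in \{0,1\}^M$ be the concatenation of Bob’s strings.

  2. Construct a graph $G_{x,y}$ as in Section 5.2 using the above concatenated strings as $x$, $y$.

  3. Run $\mathcal{A}(G_{x,y})$ and output $\frac{1}{\varepsilon^2} - \frac{\mathcal{A}(G_{x,y})}{2\max\{\varepsilon^2\lambda, 1\}}$ as the solution to 2-SUM$(\varepsilon^{-2}, \varepsilon^2 M, \max\{\varepsilon^2\lambda, 1\})$.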

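For the 2-SUM problem, let $r = \varepsilon^{-2} - \sum_{i \in [\varepsilon^{-2}]} \mathrm{DISJ}(X_i, Y_i)$ be the number of string pairs with intersections. Since there are $\varepsilon^{-2}$ pairs $(X_i, Y_i)$, $r$ is at most $\varepsilon^{-2}$. From our definition of 2-SUM, each intersecting string pair has $\max\{\varepsilon^2\lambda, 1\}$ intersections. Since $x$ and $y$ are formed by concatenation, $\mathrm{INT}(x,y) = r\max\{\varepsilon^2\lambda, 1\}$. Since $\sqrt{M} \ge 3\max\{\lambda, \varepsilon^{-2}\} = 3\varepsilon^{-2}\max\{\varepsilon^2\lambda, 1\} \ge 3r\max\{\varepsilon^2\lambda, 1\} = 3\,\mathrm{INT}(x,y)$, Lemma 5.5 is applicable to $G_{x,y}$ so that

$$\mathrm{MINCUT}(G_{x,y}) = 2r\max\{\varepsilon^2\lambda, 1\}.$$

Since $\mathcal{A}$ approximates MINCUT up to a $(1 \pm \varepsilon)$ factor, $\mathcal{A}(G_{x,y}) = 2r(1 \pm \varepsilon)\max\{\varepsilon^2\lambda, 1\}$. Thus, the output of the above algorithm for the 2-SUM problem is within $\varepsilon^{-2} - r \pm r\varepsilon = \sum_{i \in [\varepsilon^{-2}]} \mathrm{DISJ}(X_i, Y_i) \pm r\varepsilon$. Recall that $r \le \varepsilon^{-2}$, so $r\varepsilon \le \varepsilon^{-1}$. Hence the algorithm approximates 2-SUM$(\varepsilon^{-2}, \varepsilon^2 M, \max\{\varepsilon^2\lambda, 1\})$ up to additive error $\varepsilon^{-1}$.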
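The conversion in step 3 and its error bound can be illustrated numerically. The sketch below is a minimal example with assumed toy values ($\varepsilon = 0.5$, $\lambda = 8$, and $r = 3$ intersecting pairs); `two_sum_estimate` is a hypothetical name, and the min-cut estimates are supplied by hand rather than produced by a real min-cut algorithm.

```python
def two_sum_estimate(mincut_estimate, eps, lam):
    """Turn an estimate of MINCUT(G_{x,y}) into an estimate of the sum of DISJ values."""
    alpha = max(eps ** 2 * lam, 1)          # intersections per intersecting pair
    return 1 / eps ** 2 - mincut_estimate / (2 * alpha)

eps, lam = 0.5, 8            # 1/eps^2 = 4 string pairs, alpha = 2
r = 3                        # number of intersecting pairs
alpha = max(eps ** 2 * lam, 1)
true_mincut = 2 * r * alpha  # value given by Lemma 5.5
true_sum = 1 / eps ** 2 - r  # number of disjoint pairs

exact = two_sum_estimate(true_mincut, eps, lam)
worst_high = two_sum_estimate((1 + eps) * true_mincut, eps, lam)
worst_low = two_sum_estimate((1 - eps) * true_mincut, eps, lam)
```

With an exact min-cut value the estimate equals the true sum, and the two extreme $(1 \pm \varepsilon)$ estimates are off by exactly $r\varepsilon = 1.5$, which is within the allowed additive error $\varepsilon^{-1} = 2$.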

To compare the complexities of $\mathcal{A}$ and the 2-SUM algorithm above, recall that $\mathcal{A}$ is measured by degree, neighbor, and pair queries, whereas the 2-SUM algorithm is measured by bits of communication. Given the construction of $G_{x,y}$, as shown in [ER18], degree, neighbor, and pair queries can each be simulated using at most 2 bits of communication:

  • Degree queries: each vertex in $G_{x,y}$ has degree $\sqrt{M}$, so Alice and Bob do not need to communicate to simulate degree queries.

  • Neighbor queries: assuming an ordering where the $j$-th neighbor of $a_i$ is either $a'_j$ or $b'_j$, Alice and Bob can exchange $x_{i,j}$ and $y_{i,j}$ with 2 bits of communication to simulate a neighbor query.

  • Pair queries: Alice and Bob can exchange $x_{i,j}$ and $y_{i,j}$ with 2 bits of communication to determine whether the edges $(a_i, b'_j)$ and $(b_i, a'_j)$ exist.

As each of $\mathcal{A}$’s queries can be simulated using at most 2 bits of communication, the 2-SUM algorithm can use $O(T)$ bits of communication to simulate $T$ queries of $\mathcal{A}$. So we have established a reduction from approximating 2-SUM$(\varepsilon^{-2}, \varepsilon^2 M, \max\{\varepsilon^2\lambda, 1\})$ up to additive error $\varepsilon^{-1}$ to approximating MINCUT up to a $(1 \pm \varepsilon)$ multiplicative factor. □
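The query simulation can be sketched as a small class that answers local queries on $G_{x,y}$ while counting the bits Alice and Bob would exchange. This is an illustration under assumptions: the class name, the vertex labels, the row-major bit indexing, and the neighbor ordering for primed vertices (the $i$-th neighbor of $a'_j$ is $a_i$ or $b_i$) are not taken from the source. For simplicity one object holds both strings; in the actual protocol Alice holds $x$ and Bob holds $y$.

```python
import math

class LocalQuerySimulator:
    """Answers degree, neighbor, and pair queries on G_{x,y}, counting exchanged bits."""

    def __init__(self, x, y):
        self.x, self.y = x, y
        self.ell = math.isqrt(len(x))
        self.bits = 0

    def _intersect(self, i, j):
        # Alice sends x_{i,j} and Bob sends y_{i,j}: 2 bits in total.
        self.bits += 2
        k = i * self.ell + j
        return self.x[k] == 1 and self.y[k] == 1

    def degree(self, vertex):
        return self.ell                     # known without communication

    def neighbor(self, vertex, idx):
        side, pos = vertex
        if not side.endswith("'"):          # vertex is a_i or b_i, with i = pos
            hit = self._intersect(pos, idx)
            flipped = "b'" if side == "a" else "a'"
            return (flipped if hit else side + "'", idx)
        hit = self._intersect(idx, pos)     # vertex is a'_j or b'_j, with j = pos
        base = side[0]
        flipped = "b" if base == "a" else "a"
        return (flipped if hit else base, idx)

    def pair(self, u, v):
        if u[0].endswith("'"):
            u, v = v, u
        if u[0].endswith("'") or not v[0].endswith("'"):
            return False                    # edges join an unprimed and a primed vertex
        hit = self._intersect(u[1], v[1])
        same_letter = u[0] == v[0][0]
        return same_letter != hit           # a-a' / b-b' iff no intersection, else a-b' / b-a'

x = [int(c) for c in "000000100"]
y = [int(c) for c in "100010100"]
sim = LocalQuerySimulator(x, y)
first = sim.neighbor(("a", 2), 0)           # row 3, column 1 is the intersection
second = sim.pair(("a", 0), ("a'", 0))
third = sim.degree(("b", 1))
```

After one neighbor query and one pair query the counter reads 4 bits, and the degree query adds nothing, matching the accounting of at most 2 bits per query.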

We are now ready to prove Theorem 1.3.

Proof of Theorem 1.3.

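Given an instance of 2-SUM$(\varepsilon^{-2}, \varepsilon^2 m, \max\{\varepsilon^2 k, 1\})$, we construct the graph $G_{x,y}$ in the same way as in Lemma 5.6. From the construction of $G_{x,y}$, the number of edges is $2m$, since each pair $(x_i, y_i)$ corresponds to 2 edges. Using the promise from 2-SUM, we get that $r \ge \varepsilon^{-2}/1000$, where $r = \varepsilon^{-2} - \sum_{i \in [\varepsilon^{-2}]} \mathrm{DISJ}(X_i, Y_i)$ is the number of intersecting string pairs, as in Lemma 5.6. This means that the size of the minimum cut of $G_{x,y}$ is $2r\max\{\varepsilon^2 k, 1\} \ge \Omega(\max\{k, \varepsilon^{-2}\})$. When $k \ge \varepsilon^{-2}$, the size of the minimum cut of $G_{x,y}$ is $\Omega(k)$, and from Lemma 5.6 and Theorem 5.4 we obtain that any algorithm $\mathcal{A}$ that satisfies the guarantee on the distribution of $G_{x,y}$ must use $\Omega(m/(\varepsilon^2 k))$ queries in expectation. When $k < \varepsilon^{-2}$, the size of the minimum cut of $G_{x,y}$ is $\Omega(\varepsilon^{-2})$, and similarly we get that any algorithm $\mathcal{A}$ that satisfies the guarantee on the distribution of $G_{x,y}$ must use $\Omega(m)$ queries in expectation. Combining the two, we finally obtain an $\Omega\left(\min\left\{m, \frac{m}{\varepsilon^2 k}\right\}\right)$ lower bound on the expected number of queries in the local query model. □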
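Both cases follow from a single substitution into the $\Omega(tL/\alpha)$ bound of Theorem 5.4, with $t = \varepsilon^{-2}$ string pairs of length $L = \varepsilon^2 m$ and $\alpha = \max\{\varepsilon^2 k, 1\}$:

```latex
\[
\frac{tL}{\alpha}
= \frac{\varepsilon^{-2}\cdot\varepsilon^{2} m}{\max\{\varepsilon^{2}k,\,1\}}
= \frac{m}{\max\{\varepsilon^{2}k,\,1\}}
= \min\left\{ m,\ \frac{m}{\varepsilon^{2}k} \right\}.
\]
```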

5.4. Almost Matching Upper Bound

In this section, we will show that our lower bound is tight up to logarithmic factors. In the work of [BGMP21], the authors presented an algorithm that uses $O\left(\frac{m}{k}\,\mathrm{poly}(\log n, 1/\varepsilon)\right)$ queries, where $k$ is the size of the minimum cut. We will show that, despite their analysis giving a dependence of $1/\varepsilon^4$, a slight modification of their algorithm yields a dependence of $1/\varepsilon^2$. Formally, we have the following theorem.

Theorem 5.7 (essentially [BGMP21]).

There is an algorithm that solves the minimum cut query problem up to a $(1 \pm \varepsilon)$-multiplicative factor with high constant probability in the local query model. Moreover, the expected number of queries used by this algorithm is $\tilde{O}\left(\frac{m}{\varepsilon^2 k}\right)$.

To prove Theorem 5.7, we first give a high-level description of the algorithm in [BGMP21]. The algorithm is based on the following sub-routine.

Lemma 5.8 ([BGMP21]).

There exists an algorithm Verify-Guess$(D, t, \varepsilon)$ which makes $\tilde{O}(\varepsilon^{-2} m/t)$ queries in expectation such that (here $D$ is the degree of each node)

  1. If $t \ge \frac{2000 \log n}{\varepsilon^2} k$, then Verify-Guess$(D, t, \varepsilon)$ rejects $t$ with probability at least $1 - \frac{1}{\mathrm{poly}(n)}$.

  2. If $t \le k$, then Verify-Guess$(D, t, \varepsilon)$ accepts $t$ and outputs a $(1 \pm \varepsilon)$-approximation of $k$ with probability at least $1 - \frac{1}{\mathrm{poly}(n)}$.

Given the above sub-routine, the algorithm initializes a guess t=n2 for the value of the minimum cut k and proceeds as follows:

  • if Verify-Guess(D,t,ε) rejects t, set t=t/2 and repeat the process.

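  • if Verify-Guess$(D, t, \varepsilon)$ accepts $t$, set $t = t/\kappa$ where $\kappa = \frac{2000 \log n}{\varepsilon^2}$. Let $\tilde{k}$ be the output of Verify-Guess$(D, t, \varepsilon)$ and return the value of $\tilde{k}$ as the output.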

To analyze the query complexity of the algorithm, notice that when Verify-Guess first accepts $t$, we have $\frac{k}{2} < t < \kappa k$, which means that $t/\kappa < k$, and hence one call to Verify-Guess$(D, t/\kappa, \varepsilon)$ will get the desired output. However, when called with $t = \Theta(k/\kappa)$, the Verify-Guess procedure needs to make $\tilde{O}\left(\frac{m}{\varepsilon^4 k}\right)$ queries in expectation.

To avoid this, the crucial observation is that, during the above binary search process, the error parameter of Verify-Guess$(D, t, \varepsilon)$ does not have to be set to $\varepsilon$. Using a small constant $\beta_0$ is sufficient. This way, when Verify-Guess$(D, t, \beta_0)$ first accepts $t$, we have $\frac{k}{2} < t < c \log(n)\, k$, where $c$ is a constant. Consequently, the output of Verify-Guess$(D, t/(c \log n), \varepsilon)$ will satisfy the error guarantee. Using the analysis in [BGMP21], we can show that the query complexity of the new algorithm is $\tilde{O}\left(\frac{m}{\varepsilon^2 k}\right)$.
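The effect of searching with a constant error parameter can be seen in a small simulation. Everything here is a hedged sketch: `PessimisticVerifyGuess` is a hypothetical stand-in for Verify-Guess (not the procedure of [BGMP21]) that accepts exactly when $t \le k$, which is consistent with both guarantees of Lemma 5.8, and that charges $m/(\varepsilon^2 t)$ queries per call with logarithmic factors dropped. The initial guess `t0` is passed in as a parameter.

```python
import math

class PessimisticVerifyGuess:
    """Idealized stand-in for Verify-Guess on a graph with m edges and min cut k."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.queries = 0.0

    def __call__(self, t, eps):
        self.queries += self.m / (eps ** 2 * t)   # query cost, log factors dropped
        if t <= self.k:
            return True, self.k                    # idealized: the estimate is k itself
        return False, None

def estimate_min_cut(n, eps, search_eps, t0, verify_guess):
    """Halve the guess with error parameter search_eps, then make one accurate call."""
    kappa = 2000 * math.log(n) / search_eps ** 2
    t = t0
    while not verify_guess(t, search_eps)[0]:
        t /= 2
    _, estimate = verify_guess(t / kappa, eps)
    return estimate

n, m, k, eps, beta0, t0 = 1000, 10 ** 6, 50, 0.1, 0.5, 1024
original = PessimisticVerifyGuess(m, k)   # search with error parameter eps
modified = PessimisticVerifyGuess(m, k)   # search with constant error parameter beta0
est_original = estimate_min_cut(n, eps, eps, t0, original)
est_modified = estimate_min_cut(n, eps, beta0, t0, modified)
```

In this model both variants return the same estimate, but the variant that searches with $\beta_0$ uses fewer queries by a factor of about $(\beta_0/\varepsilon)^2$, reflecting the improvement from $1/\varepsilon^4$ to $1/\varepsilon^2$.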

Acknowledgement

Yu Cheng is supported in part by NSF Award CCF-2307106. Honghao Lin and David Woodruff would like to thank support from the National Institutes of Health (NIH) grant 5R01 HG 10798-2, and a Simons Investigator Award. Part of this work was done while D. Woodruff was visiting the Simons Institute for the Theory of Computing.

Footnotes

1

In this paper, we use $\tilde{O}(\cdot)$ and $\tilde{\Omega}(\cdot)$ to hide logarithmic factors in their parameters.

2

The success probability of a cut query given a for-each cut sketch (Definition 2.3) can be boosted from 2/3 to 99/100, e.g., by running the sketching and recovering algorithms O(1) times and taking the median. This increases the length of Alice’s message by a constant factor, which does not affect our asymptotic lower bound.

3

The probability that a for-all cut sketch (Definition 2.2) preserves all cuts simultaneously can be boosted from 2/3 to 99/100, e.g., by running the sketching and recovering algorithms O(1) times and taking the median. This increases the length of Alice’s message by a constant factor, which does not affect our asymptotic lower bound.

References

  • [ACK+16]. Andoni Alexandr, Chen Jiecao, Krauthgamer Robert, Qin Bo, Woodruff David P., and Zhang Qin. On sketching quadratic forms. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science (ITCS), pages 311–319, 2016.
  • [AGM12]. Ahn Kook Jin, Guha Sudipto, and McGregor Andrew. Graph sketches: sparsification, spanners, and subgraphs. In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 5–14, 2012.
  • [BGMP21]. Bishnu Arijit, Ghosh Arijit, Mishra Gopinath, and Paraashar Manaswi. Query complexity of global minimum cut. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), volume 207 of Leibniz International Proceedings in Informatics (LIPIcs), pages 6:1–6:15, 2021.
  • [BK96]. Benczúr András A. and Karger David R. Approximating s-t minimum cuts in $\tilde{O}(n^2)$ time. In Proceedings of the 28th Annual ACM Symposium on the Theory of Computing (STOC), pages 47–55, 1996.
  • [BSS12]. Batson Joshua D., Spielman Daniel A., and Srivastava Nikhil. Twice-Ramanujan sparsifiers. SIAM J. Comput., 41(6):1704–1721, 2012.
  • [CCPS21]. Cen Ruoxu, Cheng Yu, Panigrahi Debmalya, and Sun Kevin. Sparsification of directed graphs via cut balance. In 48th International Colloquium on Automata, Languages, and Programming (ICALP), volume 198 of LIPIcs, pages 45:1–45:21, 2021.
  • [CGP+23]. Chu Timothy, Gao Yu, Peng Richard, Sachdeva Sushant, Sawlani Saurabh, and Wang Junxing. Graph sparsification, spectral sketches, and faster resistance computation via short cycle decompositions. SIAM J. Comput., 52(6):S18–85, 2023.
  • [CKK+18]. Cohen Michael B., Kelner Jonathan A., Kyng Rasmus, Peebles John, Peng Richard, Rao Anup B., and Sidford Aaron. Solving directed Laplacian systems in nearly-linear time through sparse LU factorizations. In Proceedings of the 59th IEEE Annual Symposium on Foundations of Computer Science (FOCS), pages 898–909, 2018.
  • [CKST19]. Carlson Charles, Kolla Alexandra, Srivastava Nikhil, and Trevisan Luca. Optimal lower bounds for sketching graph cuts. In Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2565–2569, 2019.
  • [EMPS16]. Ene Alina, Miller Gary L., Pachocki Jakub, and Sidford Aaron. Routing under balance. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 598–611, 2016.
  • [ER18]. Eden Talya and Rosenbaum Will. Lower bounds for approximating graph parameters via communication complexity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), volume 116 of Leibniz International Proceedings in Informatics (LIPIcs), pages 11:1–11:18, 2018.
  • [FHHP19]. Fung Wai Shing, Hariharan Ramesh, Harvey Nicholas J. A., and Panigrahi Debmalya. A general framework for graph sparsification. SIAM J. Comput., 48(4):1196–1223, 2019.
  • [IT18]. Ikeda Motoki and Tanigawa Shin-ichi. Cut sparsifiers for balanced digraphs. In Approximation and Online Algorithms - 16th International Workshop (WAOA), volume 11312 of Lecture Notes in Computer Science, pages 277–294, 2018.
  • [JS18]. Jambulapati Arun and Sidford Aaron. Efficient $\tilde{O}(n/\epsilon)$ spectral sketches for the Laplacian and its pseudoinverse. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2487–2503, 2018.
  • [KNR01]. Kremer Ilan, Nisan Noam, and Ron Dana. Errata for: “on randomized one-round communication complexity”. Comput. Complex., 10(4):314–315, 2001.
  • [KP12]. Kapralov Michael and Panigrahy Rina. Spectral sparsification via random spanners. In Innovations in Theoretical Computer Science (ITCS), pages 393–398, 2012.
  • [LS17]. Lee Yin Tat and Sun He. An SDP-based algorithm for linear-sized spectral sparsification. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 678–687, 2017.
  • [McG14]. McGregor Andrew. Graph stream algorithms: a survey. ACM SIGMOD Record, 43(1):9–20, 2014.
  • [RSW18]. Rubinstein Aviad, Schramm Tselil, and Weinberg S. Matthew. Computing exact minimum cuts without knowing the graph. In 9th Innovations in Theoretical Computer Science Conference (ITCS), volume 94 of LIPIcs, pages 39:1–39:16, 2018.
  • [SS11]. Spielman Daniel A. and Srivastava Nikhil. Graph sparsification by effective resistances. SIAM J. Comput., 40(6):1913–1926, 2011.
  • [ST04]. Spielman Daniel A. and Teng Shang-Hua. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 81–90, 2004.
  • [ST11]. Spielman Daniel A. and Teng Shang-Hua. Spectral sparsification of graphs. SIAM J. Comput., 40(4):981–1025, 2011.
  • [SW19]. Saranurak Thatchaphol and Wang Di. Expander decomposition and pruning: Faster, stronger, and simpler. In Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2616–2635, 2019.
  • [WZ14]. Woodruff David P. and Zhang Qin. An optimal lower bound for distinct elements in the message passing model. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 718–733, 2014.
