Algorithmica. 2021 Feb 22;83(8):2578–2605. doi: 10.1007/s00453-021-00811-0

Faster algorithms for counting subgraphs in sparse graphs

Marco Bressan

Abstract

Given a k-node pattern graph H and an n-node host graph G, the subgraph counting problem asks to compute the number of copies of H in G. In this work we address the following question: can we count the copies of H faster if G is sparse? We answer in the affirmative by introducing a novel tree-like decomposition for directed acyclic graphs, inspired by the classic tree decomposition for undirected graphs. This decomposition gives a dynamic program for counting the homomorphisms of H in G by exploiting the degeneracy of G, which allows us to beat the state-of-the-art subgraph counting algorithms when G is sparse enough. For example, we can count the induced copies of any k-node pattern H in time $2^{O(k^2)}\,O(n^{0.25k+2}\log n)$ if G has bounded degeneracy, and in time $2^{O(k^2)}\,O(n^{0.625k+2}\log n)$ if G has bounded average degree. These bounds are instantiations of a more general result, parameterized by the degeneracy of G and the structure of H, which generalizes classic bounds on counting cliques and complete bipartite graphs. We also give lower bounds based on the Exponential Time Hypothesis, showing that our results are actually a characterization of the complexity of subgraph counting in bounded-degeneracy graphs.

Keywords: Subgraph counting, Tree decomposition, Degeneracy, Sparsity

Introduction

We address the following fundamental subgraph counting problem:

Input: an n-node graph G (the host graph) and a k-node graph H (the pattern)

Output: the number of induced copies of H in G

If no further assumptions are made, the best possible algorithm for this problem is likely to have running time $f(k)\cdot n^{\Theta(k)}$. Indeed, the naive brute-force algorithm has running time $O(k^2 n^k)$, and under the Exponential Time Hypothesis [23] any algorithm for counting k-cliques has running time $n^{\Omega(k)}$ [8, 9]. The best algorithm known, which was given over 30 years ago by Nešetřil and Poljak [29] and is based on fast matrix multiplication, is only slightly faster than $O(k^2 n^k)$. Ignoring $\mathrm{poly}(k)$ factors, the algorithm runs in time $O(n^{\omega\lceil k/3\rceil+2})$ where $\omega$ is the matrix multiplication exponent. Since $\omega \le 2.373$ [25], this gives a state-of-the-art running time of $O(n^{0.791k+2})$.
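To make the baseline concrete, here is a minimal Python sketch of the naive brute-force counter (the graph encoding and the function name are ours, not from the paper):

```python
from itertools import combinations, permutations

def count_induced_copies_naive(G, H):
    """Count induced copies of H in G by brute force: O(n^k) vertex subsets,
    and O(k! k^2) work per subset to test isomorphism with H.
    G and H are dicts mapping each vertex to the set of its neighbours."""
    k, hv = len(H), list(H)
    count = 0
    for S in combinations(G, k):                       # every k-subset of V(G)
        for p in permutations(S):                      # every bijection V(H) -> S
            if all((hv[j] in H[hv[i]]) == (p[j] in G[p[i]])
                   for i in range(k) for j in range(i + 1, k)):
                count += 1                             # G[S] is isomorphic to H
                break                                  # count each subset once
    return count
```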

In this work, we aim at breaking through this “$n^{\Theta(k)}$ barrier” by assuming that G is sparse, and in particular, that G has bounded degeneracy. This assumption is often made for real-world graphs like social networks, since it agrees well with their structural properties [17]. The family of bounded-degeneracy graphs is rich from a theoretical point of view, too: it includes many important classes such as Barabási-Albert preferential attachment graphs, graphs excluding a fixed minor, planar graphs, bounded-treewidth graphs, bounded-degree graphs, and bounded-genus graphs, see [20]. Unfortunately, even when G has bounded degeneracy, the state of the art remains the $O(n^{0.791k+2})$-time algorithm by Nešetřil and Poljak, unless one makes further assumptions. For example, one can count the copies of any given pattern H in time O(n), provided G is planar [15] or has bounded treewidth [27] or has bounded degree [30]; all conditions that are stricter than bounded degeneracy. Alternatively, if G has bounded degeneracy, O(n)-time algorithms exist when H is the clique [1, 10, 16], or when H is a complete bipartite graph, if we do not require the copies of H to be induced [14]. Unfortunately, it is not clear how to extend the techniques behind these results to all patterns H and all G with bounded degeneracy. Thus, to what extent a small degeneracy of G makes subgraph counting easier remains an open question.

In this work we introduce a novel tree-like graph decomposition, to be applied to the pattern graph H, designed to exploit the degeneracy of G when counting the homomorphisms from H to G. When G is sparse enough, this decomposition yields subgraph counting algorithms faster than the state of the art. For example, we show how to count the induced copies of any k-node pattern H in time $2^{O(k^2)}\,O(n^{0.25k+2}\log n)$ when G has bounded degeneracy, and in time $2^{O(k^2)}\,O(n^{0.625k+2}\log n)$ when G has bounded average degree. These results are instantiations of a more general result which says that H can be counted in time $f(k)\,O(d^{\,k-\tau(H)} n^{\tau(H)}\log n)$, where d is the degeneracy of G, and $\tau(H)$ is a certain measure of “width” of H arising from our decomposition. Assuming the Exponential Time Hypothesis, we also show that $n^{\Omega(\tau(H)/\log\tau(H))}$ operations are required in the worst case, even if G has degeneracy 2. This provides a novel characterization of the complexity of subgraph counting in bounded-degeneracy graphs.

Results

We divide our results into bounds (Sect. 1.1.1) and techniques (Sect. 1.1.2). We denote by d the degeneracy of G, and we denote by hom(H,G), sub(H,G), ind(H,G) the number of, respectively, homomorphisms, occurrences, and induced occurrences of H in G. See Sect. 1.2 for further definitions and notation. We remark that, unless otherwise specified, our bounds hold for every H including disconnected ones.

Bounds

Our first results are two running time bounds parameterized by the sparsity of G.

Theorem 1

For any k-node pattern H one can compute hom(H,G) and sub(H,G) in time $2^{O(k\log k)}\cdot O(d^{\,k-(\lceil k/4\rceil+2)}\, n^{\lceil k/4\rceil+2}\log n)$, and one can compute ind(H,G) in time $2^{O(k^2)}\cdot O(d^{\,k-(\lceil k/4\rceil+2)}\, n^{\lceil k/4\rceil+2}\log n)$, where d is the degeneracy of G.

This bound reduces the exponent of n to $\lceil k/4\rceil+2\approx 0.25k+2$, down from the state-of-the-art $\omega\lceil k/3\rceil+2\approx 0.791k+2$ of the Nešetřil–Poljak bound. This implies that our polynomial dependence on n is better whenever $d=O(n^{0.721})$, and in any case (that is, even if $\omega=2$) whenever $d=O(n^{0.556})$. As a corollary of Theorem 1, since $d=O(\sqrt{rn})$ where r is the average degree of G, we obtain:

Theorem 2

For any k-node pattern H one can compute hom(H,G) and sub(H,G) in time $2^{O(k\log k)}\cdot O\big(r^{\frac{1}{2}(k-\lceil k/4\rceil)-1}\, n^{\frac{1}{2}(k+\lceil k/4\rceil)+1}\log n\big)$, and one can compute ind(H,G) in time $2^{O(k^2)}\cdot O\big(r^{\frac{1}{2}(k-\lceil k/4\rceil)-1}\, n^{\frac{1}{2}(k+\lceil k/4\rceil)+1}\log n\big)$, where r is the average degree of G.

This bound has a polynomial dependence on n better than Nešetřil–Poljak whenever $r=O(n^{0.221})$, and in any case (that is, even if $\omega=2$) whenever $r=O(n^{0.056})$. In particular, we have a $2^{O(k^2)}\cdot O(n^{0.625k+1}\log n)$-time algorithm when $r=O(1)$. These are the first improvements over the Nešetřil–Poljak algorithm for graphs with small degeneracy or small average degree.

As a second result, we give improved bounds for some classes of patterns. The first is the class of quasi-cliques, a typical target pattern for social networks [5–7, 31–33]. We prove:

Theorem 3

If H is the clique minus $\epsilon$ edges, then one can compute hom(H,G) and sub(H,G) in time $2^{O(k\log k)}\cdot O(d^{\,k-\lceil\frac12+\sqrt{\epsilon/2}\rceil}\, n^{\lceil\frac12+\sqrt{\epsilon/2}\rceil}\log n)$, and ind(H,G) in time $2^{O(\epsilon+k\log k)}\cdot O(d^{\,k-\lceil\frac12+\sqrt{\epsilon/2}\rceil}\, n^{\lceil\frac12+\sqrt{\epsilon/2}\rceil}\log n)$.

This generalizes the classic $O(d^{k-1} n)$ bound for counting cliques by Chiba and Nishizeki [10], at the price of an extra factor $2^{O(\epsilon+k\log k)}\,O(\log n)$. Next, we consider complete quasi-multipartite graphs:

Theorem 4

If H is a complete multipartite graph, then one can compute hom(H,G) and sub(H,G) in time $2^{O(k\log k)}\cdot O(d^{k-1} n\log n)$. If H is a complete multipartite graph plus $\epsilon$ edges, then one can compute hom(H,G) and sub(H,G) in time $2^{O(k\log k)}\cdot O(d^{\,k-\lceil\epsilon/4\rceil-2}\, n^{\lceil\epsilon/4\rceil+2}\log n)$.

This generalizes an existing $O(d^3 2^{2d} n)$ bound for counting the non-induced copies of complete (maximal) bipartite graphs [14], again at the price of an extra factor $2^{O(k\log k)}\log n$.

Table 1 summarizes our upper bounds. We remark that our algorithms work for the colored versions of the problem (count only copies of H with prescribed vertex and/or edge colors) as well as the weighted versions of the problem (compute the total node or edge weight of copies of H in G). This can be obtained by a straightforward adaptation of our homomorphism counting algorithms.

Table 1.

Summary of upper bounds for the problem of counting the number of occurrences of H in G

Pattern H | Time to compute ind(H,G) | References
All (even disconnected) | $O(n^{\omega\lceil k/3\rceil+2})$ | [29]
All (even disconnected) | $2^{O(k^2)}\cdot O(d^{\,k-\lceil k/4\rceil-2}\, n^{\lceil k/4\rceil+2}\log n)$ | This work
All (even disconnected) | $2^{O(k^2)}\cdot O(r^{\frac12(k-\lceil k/4\rceil)-1}\, n^{\frac12(k+\lceil k/4\rceil)+1}\log n)$ | This work
$K_k$ | $O(d^{k-1} n)$ | [10]
$K_k$ minus $\epsilon$ edges | $2^{O(\epsilon+k\log k)}\cdot O(d^{\,k-\lceil\frac12+\sqrt{\epsilon/2}\rceil}\, n^{\lceil\frac12+\sqrt{\epsilon/2}\rceil}\log n)$ | This work
$K_{k_1,k_2}$ | $O(d^3 2^{2d} n)$  (sub(H,G) only) | [14]
$K_{k_1,\dots,k_\ell}$ | $2^{O(k\log k)}\cdot O(d^{k-1} n\log n)$  (sub(H,G) only) | This work
$K_{k_1,\dots,k_\ell}$ plus $\epsilon$ edges | $2^{O(k\log k)}\cdot O(d^{\,k-\lceil\epsilon/4\rceil-2}\, n^{\lceil\epsilon/4\rceil+2}\log n)$  (sub(H,G) only) | This work

The graphs G and H have n and k vertices respectively, and d is the degeneracy of G. All except the last three bounds hold for counting the number of induced occurrences.

Techniques

The bounds of Sect. 1.1.1 are instantiations of a single, more general result. This result is based on a novel notion of width, the dag treewidth τ(H) of H, which captures the relevant structure of H when counting its copies in a d-degenerate graph. In a simplified form, the bound is the following:

Theorem 5

For any k-node pattern H one can compute hom(H,G), sub(H,G), and ind(H,G) in time $f(k)\cdot O(d^{\,k-\tau(H)}\, n^{\tau(H)}\log n)$.

Let us briefly explain this result. The heart of the problem is computing hom(H,G); once we know how to do this, we can obtain sub(H,G) and ind(H,G) via inclusion-exclusion arguments at the price of an extra multiplicative factor f(k), like in [3, 11]. To compute hom(H,G), we give G an acyclic orientation with maximum outdegree d. Then, we take every possible acyclic orientation P of H, and compute hom(P,G), where by hom(P,G) we mean the number of homomorphisms from P to G that respect the orientations of the arcs. Note that the number of such homomorphisms can be $n^{\Omega(k)}$ even if G has bounded degeneracy (for example, if P is an independent set), so we cannot list them explicitly. At this point we introduce our technical tool, the dag tree decomposition of P. This is a tree T that captures the relevant reachability relations between the nodes of P. Given T, one can compute hom(P,G) via dynamic programming in time $f(k)\cdot O(d^{\,k-\tau(T)}\, n^{\tau(T)}\log n)$, where $\tau(T)\in\{1,\dots,k\}$ is the width of T. The dynamic program computes hom(P,G) by combining carefully the homomorphism counts of certain subgraphs of P. The dag-treewidth $\tau(H)$, which is the parameter appearing in the bound of Theorem 5, is the maximum width of the optimal dag tree decomposition of any acyclic orientation P of any graph obtainable by identifying nodes of and/or adding edges to H (this arises from the inclusion-exclusion arguments). With this, our technical machinery is complete. To obtain the bounds of the previous paragraph, we show how to compute efficiently dag tree decompositions of low width, and apply a more technical version of Theorem 5.

We conclude by complementing Theorem 5 with a lower bound based on the Exponential Time Hypothesis. This lower bound shows that in the worst case the dag-treewidth τ(H) cannot be beaten, and therefore our decomposition captures, at least in part, the complexity of counting subgraphs in d-degenerate graphs.

Theorem 6

Under the Exponential Time Hypothesis [23], no algorithm can compute sub(H,G) or ind(H,G) in time $f(d,k)\cdot n^{o(\tau(H)/\log\tau(H))}$ for all H.

Preliminaries and notation

Both $G=(V,E)$ and $H=(V_H,E_H)$ are simple graphs, possibly disconnected. For any subset $V'\subseteq V$ we denote by $G[V']$ the subgraph of G induced by $V'$; the same notation applies to any graph. A homomorphism from H to G is a map $\phi: V_H\to V$ such that $\{u,u'\}\in E_H$ implies $\{\phi(u),\phi(u')\}\in E$. We write $\phi: H\to G$ to highlight the edges that $\phi$ preserves. When H and G are oriented, $\phi$ must preserve the direction of the arcs. If $\phi$ is injective then we have an injective homomorphism. We denote by hom(H,G) and inj(H,G) the number of homomorphisms and injective homomorphisms from H to G. To avoid confusion, we will use the symbol $\psi$ to denote maps that are not necessarily homomorphisms. The symbol $\simeq$ denotes isomorphism. A copy of H in G is a subgraph $F\subseteq G$ such that $F\simeq H$. If moreover $F=G[V_F]$ then F is an induced copy. We denote by sub(H,G) and ind(H,G) the number of copies and induced copies of H in G; we may omit G if clear from the context. When we give an acyclic orientation to the edges of H, we denote the resulting dag by P. All the notation described above applies to directed graphs in the natural way.

The degeneracy of G is the smallest integer d such that there is an acyclic orientation of G with maximum outdegree bounded by d. Such an orientation can be found in time O(|E|) by repeatedly removing from G a minimum-degree node [27]. From now on we assume that G has this orientation. Equivalently, d is the smallest integer that bounds from above the minimum degree of every subgraph of G.
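A small Python sketch of this orientation step (the heap-based implementation and the names are our own illustration; a bucket queue gives the stated O(|E|) bound):

```python
import heapq

def degeneracy_orientation(adj):
    """Orient G acyclically with maximum outdegree equal to its degeneracy d,
    by repeatedly removing a node of minimum degree. adj: vertex -> set of neighbours.
    Returns (d, out) where out[v] is the set of out-neighbours of v."""
    remaining = {v: set(ns) for v, ns in adj.items()}
    heap = [(len(ns), v) for v, ns in remaining.items()]
    heapq.heapify(heap)
    out, removed, d = {v: set() for v in adj}, set(), 0
    while heap:
        deg, v = heapq.heappop(heap)
        if v in removed or deg != len(remaining[v]):
            continue                                  # stale heap entry
        removed.add(v)
        d = max(d, len(remaining[v]))
        out[v] = set(remaining[v])                    # orient edges towards later-removed nodes
        for u in remaining[v]:
            remaining[u].discard(v)
            heapq.heappush(heap, (len(remaining[u]), u))
    return d, out
```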

We assume the following operations take constant time: accessing the i-th arc of any node $u\in V$, and checking if (u, v) is an arc of G for any pair (u, v). Our upper bounds still hold if checking an arc takes time $O(\log n)$, which can be achieved via binary search if we first sort the adjacency lists of G. The $\log n$ factor in our bounds appears since we assume logarithmic access time for our dictionaries, each of which holds $O(n^k)$ entries. This factor can be removed by using dictionaries with worst-case O(1) access time (e.g., hash maps), at the price of obtaining probabilistic/amortized bounds rather than deterministic ones.

Finally, we recall the tree decomposition and treewidth of a graph. For any two nodes X, Y in a tree T, we denote by T(X, Y) the unique path between X and Y in T.

Definition 1

(see [13], Ch. 12.3) Given a graph $G=(V,E)$, a tree decomposition of G is a tree $T=(V_T,E_T)$ such that each node $X\in V_T$ is a subset $X\subseteq V$, and that:

  1. $\bigcup_{X\in V_T} X = V$

  2. for every edge $e=\{u,v\}$ of G there exists $X\in V_T$ such that $u,v\in X$

  3. for all $X,X',X''\in V_T$, if $X\in T(X',X'')$ then $X'\cap X''\subseteq X$

The width of a tree decomposition T is $t(T)=\max_{X\in V_T}|X|-1$. The treewidth t(G) of a graph G is the minimum of t(T) over all tree decompositions T of G.

Related work

As discussed above, the fastest algorithm known for computing ind(H,G) is the one by Nešetřil and Poljak [29] that runs in time $O(n^{\omega\lfloor k/3\rfloor+(k\bmod 3)})$ where $\omega$ is the matrix multiplication exponent. With the current bound $\omega\le 2.373$, this running time is in $O(n^{0.791k+2})$. Unfortunately, the algorithm is based on fast matrix multiplication, which makes it oblivious to the sparsity of G.

Under certain assumptions on G, faster algorithms are known. If G has bounded maximum degree, $\Delta=O(1)$, then we can compute ind(H,G) in time $c^k\cdot O(n)$ for some $c=c(\Delta)$ via multivariate graph polynomials [30]. If G has bounded treewidth, $t(G)=O(1)$, and we are given a tree decomposition of G of such width, then we can compute ind(H,G) in time $2^{O(k\log k)}\,O(n)$; see Lemma 18.4 of [27]. When G is planar, we obtain an $f(k)\,O(n)$ algorithm where f is exponential in k [15]. All these assumptions are stronger than bounded degeneracy, and the techniques cannot be extended easily. A more general class that captures all these cases is that of nowhere-dense graphs [28], for which there exist fixed-parameter-tractable subgraph counting algorithms [22]. Nowhere-dense graphs however do not include all bounded-degeneracy graphs or all graphs with bounded average degree.

Even assuming G has bounded degeneracy, algorithms faster than Nešetřil–Poljak are known only when H belongs to special classes. The earliest result of this kind is the classic algorithm by Chiba and Nishizeki [10] to list all k-cliques in time $O(d^{k-1} n)$. Eppstein showed that one can list all maximal cliques in time $O(d\,3^{d/3} n)$ [16] and all non-induced complete bipartite subgraphs in time $O(d^3 2^{2d} n)$ [14]. These algorithms exploit the degeneracy ordering of G in a way similar to ours. In fact, our techniques can be seen as a generalization of [10] that takes into account the structure of H. We note that a fundamental limitation of [10, 14, 16] is that they list all the copies of H, which for a generic H might be $\Theta(n^k)$ even if G has bounded degeneracy (for example if H is the independent set). In contrast, we list the copies of subgraphs of H, and combine them to infer the number of copies of H. To be more precise, we list homomorphisms rather than copies, which is another difference we have with [10, 14, 16] and a point we have in common with previous work [11].

Regarding our “dag tree decomposition”, it is inspired by the standard notion of tree decomposition of a graph, and it yields a similar dynamic program. Yet, the similarity between the two decompositions is rather superficial; indeed, our dag-treewidth can be O(1) when the treewidth is Ω(k), and vice versa. Our decomposition is unrelated to the several notions of tree decomposition for directed graphs already known [19]. Finally, our lower bounds are novel; no general lower bound in terms of d and of the structure of H was available before.

Manuscript organisation

In Sect. 2 we build the intuition with a gentle introduction to our approach. In Sect. 3 we give our dag tree decomposition and the dynamic program for counting homomorphisms. In Sect. 4 we show how to compute good dag tree decompositions. Finally, in Sect. 5 we prove the lower bounds.

Exploiting degeneracy orientations

We build the intuition behind our approach, starting from the classic algorithm for counting cliques by Chiba and Nishizeki [10]. The algorithm begins by orienting G acyclically so that $\max_{v\in G} d_{\mathrm{out}}(v)\le d$, which takes time O(|E|). With G oriented acyclically, we take each $v\in G$ in turn, enumerate every subset of $(k-1)$ out-neighbors of v, and check its edges. In this way we can explicitly find all k-cliques of G in time $O(k^2 d^{k-1} n)$. Observe that the crucial fact here is that an acyclically oriented clique has exactly one source, that is, a node with no incoming arcs. We would like to extend this approach to an arbitrary pattern H. Since every copy of H in G appears with exactly one acyclic orientation, we take every possible acyclic orientation P of H, count the copies of P in G, and sum all the counts. Thus, the problem reduces to counting the copies of an arbitrary dag P in our acyclic orientation of G.
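As an illustration, a short Python sketch of this clique-counting scheme, assuming the degeneracy orientation `out` computed as in the snippet of Sect. 1.2 (the function name is ours):

```python
from itertools import combinations

def count_cliques(out, k):
    """Count k-cliques in time O(k^2 d^{k-1} n): every acyclically oriented clique has a
    unique source v, so each clique is found exactly once among the (k-1)-subsets of the
    out-neighbours of v. out: vertex -> set of out-neighbours, max outdegree <= d."""
    def adjacent(x, y):
        return y in out[x] or x in out[y]
    count = 0
    for v in out:                                     # candidate source of the clique
        for S in combinations(out[v], k - 1):         # at most C(d, k-1) subsets
            if all(adjacent(x, y) for x, y in combinations(S, 2)):
                count += 1
    return count
```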

Let us start in the naive way. Suppose P has s sources. Fix a directed spanning forest F of P. This is a collection of s directed disjoint trees rooted at the sources of P (arcs pointing away from the roots). Clearly, each copy of P in G contains a copy of F. Hence, we can enumerate the copies of F in G, and for each one check if it is a copy of P. To this end, first we enumerate the $O(n^s)$ possible s-tuples of V to which the sources of P can be mapped. For each such s-tuple, we enumerate the possible mappings of the remaining $k-s$ nodes of the forest. This can be done in time $O(d^{k-s})$ by a straightforward extension of the out-neighbor listing algorithm above. Finally, for each mapping we check if its nodes induce P in G, in time $O(k^2)$. The total running time is $O(k^2 d^{k-s} n^s)$. Unfortunately, if P is an independent set then s=k and the running time is $O(k^2 n^k)$, so we have made no progress over the naive algorithm.

At this point we introduce our first idea. For reference we use the toy pattern P in Fig. 1. Instead of enumerating the copies of P in G, we decompose P into two pieces, P(1) and P(3, 5). Here, P(1) denotes the subgraph of P reachable from 1 (that is, the transitive closure of 1 in P). The same for P(3) and P(5), and we let $P(3,5)=P(3)\cup P(5)$. Now we count the copies of P(1), and then the copies of P(3, 5), hoping to combine the result in some way to obtain the count of P. To simplify the task, we focus on counting homomorphisms rather than copies (see below). Thus, we want to compute hom(P,G) by combining hom(P(1),G) and hom(P(3,5),G).

Fig. 1

Toy example: an acyclic orientation P of $H=C_6$, decomposed into two pieces

Now, clearly, knowing hom(P(1),G) and hom(P(3,5),G) is not sufficient to infer hom(P,G). Thus, we need to solve a slightly more complex problem. For every pair $x,y\in V(G)$, let $\phi:\{2,6\}\to V(G)$ be the map given by $\phi(2)=x$ and $\phi(6)=y$. We let hom(P,G,(x,y)) be the number of homomorphisms of P in G whose restriction to $\{2,6\}$ is $\phi$. By a counting argument one can immediately see that:

$$\mathrm{hom}(P,G)=\sum_{\phi:\{2,6\}\to V(G)}\mathrm{hom}(P,G,\phi) \qquad (1)$$

Thus, to compute hom(P,G) we only need to compute hom(P,G,$\phi$) for all possible $\phi$. Now, define hom(P(1),G,$\phi$) and hom(P(3,5),G,$\phi$) with the same meaning as above. A crucial observation is that $\{2,6\}$, the domain of $\phi$, is precisely the set of nodes in $P(1)\cap P(3,5)$. It is not difficult to see that this implies:

$$\mathrm{hom}(P,G,\phi)=\mathrm{hom}(P(1),G,\phi)\cdot\mathrm{hom}(P(3,5),G,\phi) \qquad (2)$$

Thus, now our goal is to compute hom(P(1),G,$\phi$) and hom(P(3,5),G,$\phi$) for all $\phi:\{2,6\}\to V(G)$. To this end, we list all $\phi_{P(1)}: P(1)\to G$ with the technique above, and for each such $\phi_{P(1)}$ we increment a counter associated to $(\phi_{P(1)}(2),\phi_{P(1)}(6))$ in a dictionary with default value 0. Thus, we obtain hom(P(1),G,$\phi$) for all $\phi:\{2,6\}\to V$. Since P(1) has one source, we enumerate $O(k^2 d^{k-1} n)$ maps. If the dictionary takes time $O(\log n)$ to access an entry, the total running time is $O(k^2 d^{k-1} n\log n)$. The same technique applied to P(3, 5) yields a running time of $O(k^2 d^{k-2} n^2\log n)$, since P(3, 5) has two sources. Finally, we apply Eq. 1 by running over all entries in the first dictionary and retrieving the corresponding value from the second dictionary. The total running time is $O(k^2 d^{k-2} n^2\log n)$, while enumerating the homomorphisms of P would have required time $O(k^2 d^{k-3} n^3)$.

Let us abstract the general approach from this toy example. We want to decompose P into a set of pieces $P_1,P_2,\dots$ with the following properties: (i) each piece $P_i$ has a small number of sources $s(P_i)$, and (ii) we can obtain hom(P,G,$\phi$) by combining the homomorphism counts of the $P_i$. This is achieved by the dag tree decomposition, which we introduce in Sect. 3. Like the tree decomposition for undirected graphs, the dag tree decomposition leads to a dynamic program to compute hom(P,G).
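The following Python sketch carries out the toy computation above for the alternating orientation of C6 (arcs 1→2, 1→6, 3→2, 3→4, 5→4, 5→6, matching Fig. 1 as we read it); the graph encoding and names are ours, and plain hash maps replace the O(log n)-access dictionaries, so the bound becomes amortized:

```python
from collections import defaultdict

def hom_C6_via_pieces(out):
    """hom(P, G) for the toy pattern P via Eqs. (1)-(2): build one dictionary per piece,
    keyed by the images of the shared nodes {2, 6}, and sum the products of the counts.
    out: vertex -> set of out-neighbours of the acyclically oriented host G."""
    c1 = defaultdict(int)                       # hom(P(1), G, .): arcs 1->2, 1->6
    for v1 in out:
        for v2 in out[v1]:
            for v6 in out[v1]:
                c1[(v2, v6)] += 1
    c2 = defaultdict(int)                       # hom(P(3,5), G, .): arcs 3->2, 3->4, 5->4, 5->6
    for v3 in out:
        for v2 in out[v3]:
            for v4 in out[v3]:
                for v5 in out:
                    if v4 in out[v5]:
                        for v6 in out[v5]:
                            c2[(v2, v6)] += 1
    return sum(cnt * c2[key] for key, cnt in c1.items())   # Eq. (1) combined with Eq. (2)
```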

DAG tree decompositions

Let $P=(V_P,A_P)$ be a directed acyclic graph. We denote by $S_P$, or simply S, the set of nodes of P having no incoming arc. These are the sources of P. We denote by $V_P(u)$ the transitive closure of u in P, i.e. the set of nodes of P reachable from u, and we let $P(u)=P[V_P(u)]$ be the corresponding subgraph of P. For a subset of sources $B\subseteq S$ we let $V_P(B)=\bigcup_{u\in B}V_P(u)$ and $P(B)=P[V_P(B)]$. Thus, P(B) is the subgraph of P induced by all nodes reachable from B. We call B a bag of sources. We can now formally introduce our decomposition.

Definition 2

(Dag tree decomposition) Let $P=(V_P,A_P)$ be a dag. A dag tree decomposition (d.t.d.) of P is a (rooted) tree $T=(\mathcal{B},\mathcal{E})$ with the following properties:

  1. each node $B\in\mathcal{B}$ is a bag of sources $B\subseteq S_P$

  2. $\bigcup_{B\in\mathcal{B}} B = S_P$

  3. for all $B,B_1,B_2\in T$, if $B\in T(B_1,B_2)$ then $V_P(B_1)\cap V_P(B_2)\subseteq V_P(B)$

One can see the similarity with the tree decomposition of an undirected graph (Definition 1). However, our dag tree decomposition differs crucially in two aspects. First, the bags are subsets of S rather than subsets of $V_P$. This is because the time needed to list the homomorphisms between $P(B_i)$ and G is driven by $n^{|B_i|}$. Second, the path-intersection property (3) concerns the pieces reachable from the bags rather than the bags themselves. The reason is that, to combine the counts of two pieces together, their intersection must form a separator in P (similarly to the tree decomposition of an undirected graph). The dag tree decomposition induces the following notions of width, used throughout the rest of the article.

Definition 3

The width of T is $\tau(T)=\max_{B\in\mathcal{B}}|B|$. The dag treewidth $\tau(P)$ of P is the minimum of $\tau(T)$ over all dag tree decompositions T of P.

Clearly $\tau(P)\in\{1,\dots,k\}$ for any k-node dag P. Figure 2 shows a pattern P together with a d.t.d. of width 1. We observe that $\tau(P)$ has no obvious relation to the treewidth t(H) of H; see the discussion in Sect. 3.2.

Fig. 2

Left: a dag P formed by five pieces. Right: a dag tree decomposition T for P. Since $\tau(T)=1$ and the largest piece contains 4 nodes, we can compute hom(P,G) in time $O(d^3 n\log n)$

Counting homomorphisms via dag tree decompositions

For any $B\in\mathcal{B}$ let T(B) be the subtree of T rooted at B. We let $\Gamma[B]$ be the down-closure of B in T, that is, the union of all bags in T(B). Consider $P(\Gamma[B])$, the subgraph of P induced by the nodes reachable from $\Gamma[B]$ (note the difference with P(B), which contains only the nodes reachable from B). We compute hom($P(\Gamma[B])$,G) in a bottom-up fashion over all B, starting with the leaves of T and moving towards the root. This is similar to the dynamic program given by the standard tree decomposition (see [18]).

As anticipated, we actually compute hom($P(\Gamma[B])$,G,$\phi$), the number of homomorphisms that extend a fixed mapping $\phi$. We need the following concept:

Definition 4

Let $P_1=(V_{P_1},A_{P_1})$ and $P_2=(V_{P_2},A_{P_2})$ be two subgraphs of P, and let $\phi_1: P_1\to G$ and $\phi_2: P_2\to G$ be two homomorphisms. We say $\phi_1$ and $\phi_2$ respect each other if $\phi_1(u)=\phi_2(u)$ for all $u\in V_{P_1}\cap V_{P_2}$.

Given some $\phi_2$, we denote by hom($P_1$,G,$\phi_2$) the number of homomorphisms from $P_1$ to G that respect $\phi_2$. We can now present our main algorithmic result.

Theorem 7

Let P be any k-node dag, and $T=(\mathcal{B},\mathcal{E})$ be a d.t.d. for P. Fix any $B\in\mathcal{B}$ as the root of T. There is a dynamic programming algorithm HomCount(P, T, B) that in time $O(|\mathcal{B}|\, k^2\, d^{\,k-\tau(T)}\, n^{\tau(T)}\log n)$ computes hom($P(\Gamma[B])$,G,$\phi_B$) for all $\phi_B: P(B)\to G$. This is also a bound on the time needed to compute hom(P,G).

The proof of Theorem 7 is given in the next subsection. Before continuing, let $f_T(k)$ be an upper bound on the time needed to compute a d.t.d. of minimum width with at most $2^k$ bags for a pattern on k nodes. We can show that such a d.t.d. always exists:

Lemma 1

Any k-node dag P has a minimum-width d.t.d. on at most $2^k$ bags.

Proof

We show that, if a d.t.d. $T=(\mathcal{B},\mathcal{E})$ has two bags containing exactly the same sources, then one of the two bags can be removed. This implies that there exists a minimum-width d.t.d. where every bag contains a distinct source set, which therefore has at most $2^k$ bags. Suppose indeed T contains two bags X and X′ formed by the same subset of sources. Let $B^*$ be the neighbor of X′ on the unique path T(X,X′). Let $T'=(\mathcal{B}',\mathcal{E}')$ be the tree obtained from T by replacing the edge {B,X′} with {B,$B^*$} for every neighbor $B\ne B^*$ of X′ and then deleting X′. Clearly $|\mathcal{B}'|=|\mathcal{B}|-1$, and properties (1) and (2) of Definition 2 are satisfied. Let us then check property (3). Consider a generic path $T'(B_1,B_2)$ and look at the corresponding path $T(B_1,B_2)$. If $T(B_1,B_2)$ does not contain edges that we deleted, then $T'(B_1,B_2)=T(B_1,B_2)$. In this case the property holds for any bag in $T'(B_1,B_2)$ since it holds in T. Suppose instead $T(B_1,B_2)$ contains edges that we deleted. Then $T'(B_1,B_2)$ contains the same bags of $T(B_1,B_2)$ save that X′ is replaced by $B^*$. Thus we only need to check that $V_P(B_1)\cap V_P(B_2)\subseteq V_P(B^*)$. By property (3), $V_P(B_1)\cap V_P(B_2)\subseteq V_P(X')$. Moreover, since by construction $B^*\in T(X,X')$, property (3) also gives $V_P(X')=V_P(X)\cap V_P(X')\subseteq V_P(B^*)$. Thus $V_P(B_1)\cap V_P(B_2)\subseteq V_P(B^*)$. Therefore T′ is a d.t.d. for P.

Then, as an immediate corollary of Theorem 7, we have:

Theorem 8

Let P be any k-node dag, and $T=(\mathcal{B},\mathcal{E})$ be a d.t.d. for P. We can compute hom(P,G) in time $f_T(k)+O(k^2 2^k d^{\,k-\tau(P)}\, n^{\tau(P)}\log n)$.

Theorem 8 will be used in Sect. 3.2 to prove the bounds for our original problem of counting the copies of H via inclusion-exclusion arguments.

Proof of Theorem 7

The algorithm behind Theorem 7 is similar to the one for counting homomorphisms using a tree decomposition. To start, we prove that our dag tree decomposition enjoys a separator property similar to the one enjoyed by tree decompositions.

Lemma 2

Let T be a rooted d.t.d. and let $B_1,\dots,B_l$ be the children of B in T. Then for all $i\in[l]$:

  (a) $V_P(\Gamma[B_i])\cap V_P(\Gamma[B_j])\subseteq V_P(B)$ for all $j\ne i$

  (b) for any arc $(u,u')\in P(\Gamma[B])$, if $u\in V_P(\Gamma[B_i])\setminus V_P(B)$ then $u'\in V_P(\Gamma[B_i])$

  (c) for any arc $(u,u')\in P(\Gamma[B])$, if $u'\in V_P(\Gamma[B_i])\setminus V_P(B)$ then $u\in V_P(\Gamma[B_i])$

Proof

We prove (a). Suppose for some $i\ne j$ we have $V_P(\Gamma[B_i])\cap V_P(\Gamma[B_j])\not\subseteq V_P(B)$. So there exists some node $u\in V_P$ such that $u\in V_P(\Gamma[B_i])$, $u\in V_P(\Gamma[B_j])$, and $u\notin V_P(B)$. By definition of $\Gamma[\cdot]$, this implies $u\in V_P(B_i')$ and $u\in V_P(B_j')$ for some bags $B_i'\in T(B_i)$ and $B_j'\in T(B_j)$. Observe however that $B\in T(B_i',B_j')$. Thus, by point (3) of Definition 2, we have $u\in V_P(B)$. This contradicts the assumption $u\notin V_P(B)$.

Now we prove (b) and (c). For (b), since $u\in V_P(\Gamma[B_i])$ and $(u,u')\in P$, then $u'\in V_P(\Gamma[B_i])$ too. For (c), suppose by contradiction $u\notin V_P(\Gamma[B_i])$. Therefore, either $u\in V_P(B)$, or $u\in V_P(\Gamma[B_j])$ for some $j\ne i$. In both cases however we have $u'\in V_P(B)$: in the first case this holds since $u'$ is reachable from u, and in the second case since $u'\in V_P(\Gamma[B_i])\cap V_P(\Gamma[B_j])$, which by (a) is contained in $V_P(B)$. Thus in any case $u'\in V_P(B)$, which contradicts $u'\notin V_P(B)$.

Lemma 2 says that $V_P(B)$ is a separator for the sub-patterns $P(\Gamma[B_i])$ in P. This allows us to compute hom($P(\Gamma[B])$) by combining hom($P(\Gamma[B_1])$), ..., hom($P(\Gamma[B_l])$).

Next, we show that each homomorphism $\phi$ of $P(\Gamma[B])$ is the juxtaposition (definition below) of some homomorphism $\phi_B$ of P(B) and some homomorphisms $\phi_1,\dots,\phi_l$ of $P(\Gamma[B_1]),\dots,P(\Gamma[B_l])$, provided they respect $\phi_B$. This establishes a bijection, implying that we can count the homomorphisms $\phi$ by multiplying the counts of the homomorphisms $\phi_B,\phi_1,\dots,\phi_l$.

Definition 5

Let $\{\phi_1,\dots,\phi_\ell\}$ be any set of homomorphisms, where for all $i=1,\dots,\ell$ we have $\phi_i: X_i\to G$ and $\phi_i$ respects $\phi_j$ for all $j=1,\dots,\ell$. The juxtaposition of $\phi_1,\dots,\phi_\ell$, denoted by $\phi_1\oplus\dots\oplus\phi_\ell$, is the homomorphism $\phi:\bigcup_{i=1}^{\ell}X_i\to G$ such that $\phi(u)=\phi_i(u)$ whenever $u\in X_i$.

Note that the juxtaposition is always well-defined and unique, since the $\phi_i$ respect each other and the image of every u is determined by at least one among $\phi_1,\dots,\phi_\ell$.

Lemma 3

Let T be a d.t.d. and let $B_1,\dots,B_l$ be the children of B in T. Fix any $\phi_B: P(B)\to G$. Let $\Phi(\phi_B)=\{\phi: P(\Gamma[B])\to G \mid \phi\ \text{respects}\ \phi_B\}$, and for $i=1,\dots,l$ let $\Phi_i(\phi_B)=\{\phi: P(\Gamma[B_i])\to G \mid \phi\ \text{respects}\ \phi_B\}$. Then there is a bijection between $\Phi(\phi_B)$ and $\Phi_1(\phi_B)\times\dots\times\Phi_l(\phi_B)$, and therefore:

$$\mathrm{hom}(P(\Gamma[B]),G,\phi_B)=\prod_{i=1}^{l}\mathrm{hom}(P(\Gamma[B_i]),G,\phi_B) \qquad (3)$$
Proof

First, we show there is an injection between $\Phi(\phi_B)$ and $\Phi_1(\phi_B)\times\dots\times\Phi_l(\phi_B)$. Fix any $\phi\in\Phi(\phi_B)$, and consider the tuple $(\phi_1,\dots,\phi_l)$ where each $\phi_i$ is the restriction of $\phi$ to $P(\Gamma[B_i])$. Note that $\phi_i$ is unique, and that it respects $\phi_B$ since $\phi$ does. Thus $\phi_i\in\Phi_i(\phi_B)$. It follows that $(\phi_1,\dots,\phi_l)\in\Phi_1(\phi_B)\times\dots\times\Phi_l(\phi_B)$. Now we show there is an injection between $\Phi_1(\phi_B)\times\dots\times\Phi_l(\phi_B)$ and $\Phi(\phi_B)$. Consider any tuple $(\phi_1,\dots,\phi_l)\in\Phi_1(\phi_B)\times\dots\times\Phi_l(\phi_B)$, and consider the juxtaposition $\phi=\phi_B\oplus\phi_1\oplus\dots\oplus\phi_l$. Then $\phi: P(\Gamma[B])\to G$ and $\phi$ respects $\phi_B$. It follows that $\phi\in\Phi(\phi_B)$.

Last, we bound the cost of enumerating the homomorphisms of a piece of P.

Lemma 4

Given any $B\subseteq S$, the set of homomorphisms $\Phi=\{\phi: P(B)\to G\}$ has size $O(d^{\,k-|B|} n^{|B|})$ and can be enumerated in time $O(k^2 d^{\,k-|B|} n^{|B|})$.

Proof

We prove the bound on the enumeration time; the proof gives immediately also the bound on $|\Phi|$. Let $B=\{u_1,\dots,u_b\}$ where $b=|B|$. Fix a spanning forest $\{T_1,\dots,T_b\}$ of P(B), where each $T_i=(V_i,A_i)$ is a directed tree rooted at $u_i$ (arcs pointing away from the root). Consider any $\phi\in\Phi$, and let $\phi_i$ be its restriction to $V_i$. Clearly, $\phi=\phi_1\oplus\dots\oplus\phi_b$. Note that $\phi_i$ is a homomorphism of $T_i$ in G. Thus, to enumerate $\Phi$ we can enumerate each possible tuple $(\phi_1,\dots,\phi_b)$ where $\phi_i$ is a homomorphism of $T_i$ in G for all i. Note that not all such tuples give a valid juxtaposition that is a homomorphism $\phi\in\Phi$. However, we can check if $\phi\in\Phi$ in time $O(k^2)$ by checking the arcs between the vertices of G in the image of $\phi$.

Let then $\Phi_{T_i}$ be the set of homomorphisms of $T_i$ in G. We show how to enumerate $\Phi_{T_i}$ in time $O(d^{|V_i|-1} n)$, and thus all tuples $(\phi_1,\dots,\phi_b)\in\Phi_{T_1}\times\dots\times\Phi_{T_b}$ in time $\prod_{i=1}^{b}O(d^{|V_i|-1} n)=O(d^{\,k-b} n^b)$. Together with the check on the arcs, this gives a total running time of $O(k^2 d^{\,k-b} n^b)$ for enumerating $\Phi$, as desired. To enumerate $\Phi_{T_i}$, we take each $v\in G$ and enumerate all $\phi_i\in\Phi_{T_i}$ such that $\phi_i(u_i)=v$. To this end note that, once we have fixed $\phi_i(x)$, for each arc $(x,y)\in T_i$ we have at most d choices for $\phi_i(y)$. Thus we can enumerate all $\phi_i\in\Phi_{T_i}$ that map $u_i$ to v in time $d^{|V_i|-1}$. The total time to enumerate $\Phi_{T_i}$ is therefore $O(d^{|V_i|-1} n)$, as claimed.
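A Python sketch of this enumeration (our own rendering of the argument: the source images range over V(G), every other node of P(B) is placed through an already-placed in-neighbour with at most d choices, and the remaining arcs are checked at the end):

```python
def reachable(P, roots):
    """V_P(roots): nodes of the dag P (dict: node -> set of out-neighbours) reachable from roots."""
    seen, stack = set(roots), list(roots)
    while stack:
        for w in P[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def enumerate_piece(P, B, out):
    """Yield all homomorphisms phi: P(B) -> G as dicts {pattern node: host vertex}.
    out: vertex -> set of out-neighbours of the oriented host G (every vertex is a key)."""
    nodes = reachable(P, B)
    indeg = {u: sum(1 for v in nodes if u in P[v]) for u in nodes}
    order, queue = [], [u for u in nodes if indeg[u] == 0]
    while queue:                                        # topological order of P(B)
        u = queue.pop()
        order.append(u)
        for w in P[u]:
            if w in indeg:
                indeg[w] -= 1
                if indeg[w] == 0:
                    queue.append(w)
    arcs = [(x, y) for x in nodes for y in P[x] if y in nodes]
    maps = [dict()]
    for u in order:
        if u in B:                                      # a source: n choices
            maps = [{**phi, u: v} for phi in maps for v in out]
        else:                                           # <= d choices via an in-neighbour
            pred = next(x for x in nodes if u in P[x])
            maps = [{**phi, u: w} for phi in maps for w in out[phi[pred]]]
    for phi in maps:
        if all(phi[y] in out[phi[x]] for (x, y) in arcs):   # O(k^2) final check
            yield phi
```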

We can now describe our dynamic programming algorithm, HomCount, to compute hom(P(Γ[B]),G). Given a d.t.d. T of P, the algorithm goes bottom-up from the leaves of T towards the root, combining the counts using Lemma 3. For readability, we write the algorithm in a recursive fashion. We prove:

[Algorithm HomCount(P, T, B): pseudocode figure omitted in this version; the line numbers cited in the proof of Lemma 5 refer to that listing.]
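Below is a hedged Python reconstruction of the dynamic program (it reuses `reachable` and `enumerate_piece` from the sketch after Lemma 4; the line numbers cited in the proof of Lemma 5 refer to the original pseudocode, not to this sketch):

```python
from collections import defaultdict

def down_closure(T, B):
    """Gamma[B]: the union of all bags in the subtree of T rooted at B.
    T: dict mapping each bag (a frozenset of sources) to the list of its child bags."""
    nodes = set(B)
    for child in T.get(B, []):
        nodes |= down_closure(T, child)
    return nodes

def hom_count(P, T, B, out):
    """Returns a dict C with C[phi_B] = hom(P(Gamma[B]), G, phi_B) for every
    phi_B: P(B) -> G, where phi_B is frozen as a sorted tuple of (node, image) pairs."""
    VB = reachable(P, B)
    C = {tuple(sorted(phi.items())): 1 for phi in enumerate_piece(P, B, out)}
    for Bi in T.get(B, []):
        Ci = hom_count(P, T, Bi, out)                    # recurse on the child
        sep = VB & reachable(P, down_closure(T, Bi))     # V_P(B) ∩ V_P(Gamma[B_i])
        agg = defaultdict(int)                           # the AGG dictionary of Eq. (4)
        for key, val in Ci.items():
            agg[tuple((u, x) for u, x in key if u in sep)] += val
        for key in C:                                    # Eq. (5): multiply in the child counts
            C[key] *= agg.get(tuple((u, x) for u, x in key if u in sep), 0)
    return C

# With B the root of T, hom(P, G) = sum(hom_count(P, T, B, out).values()).
```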

Lemma 5

Let P be any dag, $T=(\mathcal{B},\mathcal{E})$ any d.t.d. for P, and B any element of $\mathcal{B}$. HomCount(P, T, B) runs in time $O(|\mathcal{B}|\,\mathrm{poly}(k)\, d^{\,k-\tau(T)}\, n^{\tau(T)}\log n)$ and returns a dictionary $C_B$ that for all $\phi_B: P(B)\to G$ satisfies $C_B(\phi_B)=\mathrm{hom}(P(\Gamma[B]),G,\phi_B)$.

Proof

We first prove the correctness, by induction on the nodes of T. The base case is when B is a leaf of T. In this case $P(B)=P(\Gamma[B])$, and the algorithm sets $C_B(\phi_B)=1$ for each $\phi_B: P(B)\to G$. Therefore $C_B(\phi_B)=\mathrm{hom}(P(\Gamma[B]),G,\phi_B)$ as desired. The inductive case is when B is an internal node of T. As inductive hypothesis we assume that, for every child $B_i$ of B, the dictionary $C_{B_i}$ computed at line 8 satisfies $C_{B_i}(\phi)=\mathrm{hom}(P(\Gamma[B_i]),G,\phi)$ for every $\phi: P(B_i)\to G$. Let $\Phi_{P(\Gamma[B_i])}$ be the set of homomorphisms from $P(\Gamma[B_i])$ to G, and let $\Phi_{P(\Gamma[B_i])}(\phi)$ be the subset of elements of $\Phi_{P(\Gamma[B_i])}$ that respect $\phi$. Thus, the inductive hypothesis says that $C_{B_i}(\phi)=|\Phi_{P(\Gamma[B_i])}(\phi)|$ for every $\phi: P(B_i)\to G$.

Now consider the loop at lines 10–12. We claim that, after that loop, we have:

$$\mathrm{AGG}_{B_i}(\phi_r)=\sum_{\substack{\phi\ \text{in}\ C_{B_i}\\ \phi\ \text{respects}\ \phi_r}} C_{B_i}(\phi)=\sum_{\substack{\phi:P(B_i)\to G\\ \phi\ \text{respects}\ \phi_r}} C_{B_i}(\phi)=\sum_{\substack{\phi:P(B_i)\to G\\ \phi\ \text{respects}\ \phi_r}} |\Phi_{P(\Gamma[B_i])}(\phi)|=|\Phi_{P(\Gamma[B_i])}(\phi_r)| \qquad (4)$$

The first equality holds since the loop adds $C_{B_i}(\phi)$ to $\mathrm{AGG}_{B_i}(\phi_r)$ if and only if the restriction of $\phi$ to $V_P(B)\cap V_P(\Gamma[B_i])$ is $\phi_r$, that is, if and only if $\phi$ respects $\phi_r$. The second equality holds since the keys of $C_{B_i}$ are a subset of $\{\phi: P(B_i)\to G\}$ and $C_{B_i}(\phi)=0$ if $\phi$ is not in $C_{B_i}$. The third equality holds by the inductive hypothesis above. The fourth equality holds since the sets $\Phi_{P(\Gamma[B_i])}(\phi)$ form a partition of $\Phi_{P(\Gamma[B_i])}(\phi_r)$.

Finally, consider the loop at lines 13–15. We claim that line 15 sets:

$$C_B(\phi)=\prod_{i=1}^{l}|\Phi_{P(\Gamma[B_i])}(\phi_i)|=\prod_{i=1}^{l}|\Phi_{P(\Gamma[B_i])}(\phi)|=\mathrm{hom}(P(\Gamma[B]),G,\phi) \qquad (5)$$

The first equality holds by coupling line 15 and Eq. (4). The second equality holds since any element of $\Phi_{P(\Gamma[B_i])}$ respects $\phi$ if and only if it respects its restriction $\phi_i$ to $V_P(B)\cap V_P(\Gamma[B_i])$, thus $\Phi_{P(\Gamma[B_i])}(\phi_i)=\Phi_{P(\Gamma[B_i])}(\phi)$. The last equality holds by definition of $\Phi_{P(\Gamma[B_i])}(\phi_i)$ and by Lemma 3. This proves the correctness.

Let us turn to the running time. For each dictionary C we let |C| be the number of distinct keys in C. Recall that reading or writing an entry in our dictionary takes time $O(\mathrm{poly}(k)\log n)$. We split the running time as follows:

  • (i)

    the cost of the base case (lines 2–4). Since the loop has $|C_B|$ iterations, and each one costs $O(\mathrm{poly}(k)\log n)$, this cost is in $O(\mathrm{poly}(k)\,|C_B|\log n)$.

  • (ii)

    the cost of the iteration at lines 7–12, performed by the parent of B in T, where the considered child is B, excluding obviously the cost of the recursive call at line 8. Similarly to (i), this cost is bounded by $O(\mathrm{poly}(k)\,|C_B|\log n)$.

  • (iii)

    the same as (ii), but for the loop at lines 13–15. This cost is in $O(\mathrm{poly}(k)\,|C_{B'}|\log n)$, where B′ is the parent of B in T.

We charge B with every cost among (i), (ii), (iii) that applies (this depends on whether B is the root, a leaf, or an internal node of T). It is easy to check that the sum of all charged costs accounts for the total running time of HomCount(P, T, B) when B is the root of T. Now, by Lemma 4, for every $B\in\mathcal{B}$ we have $|C_B|=O(d^{\,k-|B|} n^{|B|})$, which is in $O(d^{\,k-\tau(T)} n^{\tau(T)})$ by definition of $\tau(T)$. Thus each one of the three costs above is in $O(\mathrm{poly}(k)\, d^{\,k-\tau(T)}\, n^{\tau(T)}\log n)$. Summing over $B\in\mathcal{B}$ yields the claimed bound.

Inclusion–exclusion arguments and the dag-treewidth of undirected graphs

We turn to computing hom(H,G), sub(H,G) and ind(H,G). We do so via standard inclusion-exclusion arguments, using our algorithm for computing hom(P,G) as a primitive. To this end we shall define appropriate notions of width for undirected pattern graphs. Let $\Sigma(H)$ be the set of all dags P that can be obtained by orienting H acyclically. Let $\Theta(H)$ be the set of all equivalence relations on $V_H$ (that is, all the partitions of $V_H$), and for $\theta\in\Theta(H)$ let $H/\theta$ be the pattern obtained from H by identifying equivalent nodes according to $\theta$ and removing loops and multiple edges. Let $D(H)$ be the set of all supergraphs of H on the node set $V_H$, including H.

Definition 6

The dag treewidth of H is $\tau(H)=\tau_3(H)$, where:

$$\tau_1(H)=\max\{\tau(P): P\in\Sigma(H)\} \qquad (6)$$
$$\tau_2(H)=\max\{\tau_1(H/\theta): \theta\in\Theta(H)\} \qquad (7)$$
$$\tau_3(H)=\max\{\tau_2(H'): H'\in D(H)\} \qquad (8)$$

Note that $\tau(H)$ is unrelated to the treewidth t(H). For example, when H is a clique we have $t(H)=k-1$ and $\tau(H)=1$; when H is the independent set we have $t(H)=1$ and $\tau(H)=\Theta(k)$, see Lemma 14; and when H is an expander we have $t(H),\tau(H)\in\Theta(k)$, see again Lemma 14. In fact, $\tau(H)$ is within constant factors of the independence number $\alpha(H)$ of H (see Sect. 4.4), and thus decreases as H becomes denser. This happens because adding arcs increases the number of nodes reachable from the sources of $P\in\Sigma(H)$, so we may need fewer sources to reach a given piece of P. When H is a clique, P is reachable from just one source and thus $\tau(H)=1$.

Clearly, $\tau_1(H)\le\tau_2(H)\le\tau(H)$. The intuition behind $\tau_1(H)$ is that, in G, each homomorphism of H corresponds to a homomorphism of some acyclic orientation P of H. Thus to compute hom(H,G) we sum hom(P,G) over all orientations P of H, and the running time is dominated by the P with largest dag treewidth. The intuition behind $\tau_2(H)$ is similar but now we look at computing sub(H,G). Since homomorphisms can map different nodes of H to the same node of G, to recover sub(H,G) we must combine hom(H′,G) for all possible $H'=H/\theta$ through inclusion-exclusion arguments. The intuition behind $\tau_3(H)$ is that to compute ind(H,G) we must remove from sub(H,G) the counts of sub(H′,G) for certain supergraphs H′ of H. Indeed, the three measures $\tau_1(H),\tau_2(H),\tau(H)$ yield:

Theorem 9

Consider any k-node pattern graph $H=(V_H,E_H)$, and let $f_T(k)$ be an upper bound on the time needed to compute a d.t.d. of minimum width on $2^{O(k\log k)}$ bags for any k-node dag. Then one can compute:

  • hom(H,G) in time $2^{O(k\log k)}\cdot O(f_T(k)+d^{\,k-\tau_1(H)}\, n^{\tau_1(H)}\log n)$,

  • sub(H,G) in time $2^{O(k\log k)}\cdot O(f_T(k)+d^{\,k-\tau_2(H)}\, n^{\tau_2(H)}\log n)$,

  • ind(H,G) in time $2^{O(k^2)}\cdot O(f_T(k)+d^{\,k-\tau(H)}\, n^{\tau(H)}\log n)$.

The claim still holds if we replace $\tau_1,\tau_2,\tau$ with upper bounds, and $f_T(k)$ with the time needed to compute a d.t.d. on $2^{O(k\log k)}$ bags that satisfies those upper bounds.

Proof

We prove the three bounds in three separate steps. The last claim follows straightforwardly.

From dags to undirected patterns. Let H be any undirected pattern. First, note that:

$$\mathrm{hom}(H,G)=\sum_{P\in\Sigma(H)}\mathrm{hom}(P,G) \qquad (9)$$

Let indeed $\Phi(H)=\{\phi: H\to G\}$ be the set of homomorphisms from H to G. Similarly, for any $P\in\Sigma(H)$ define $\Phi(P)=\{\phi_P: P\to G\}$ (note that $\phi_P$ must preserve the direction of the arcs). Then, there is a bijection between $\Phi(H)$ and $\bigcup_{P\in\Sigma(H)}\Phi(P)$. Consider indeed any $\phi\in\Phi(H)$. Let $\sigma$ be the orientation of H that assigns to $\{u,v\}\in E_H$ the orientation of $\{\phi(u),\phi(v)\}$ in G, and let $P=H_\sigma$. Then $\phi$ is a homomorphism of P in G. On the other hand consider any homomorphism $\phi\in\Phi(P)$ for some acyclic orientation P of H. By ignoring the orientation of the edges, $\phi\in\Phi(H)$, too.

Thus, to compute hom(H,G) we compute hom(P,G) for all $P\in\Sigma(H)$ and apply Eq. 9. Clearly, enumerating $\Sigma(H)$ takes time $O(k!)=2^{O(k\log k)}$. For each P, by Lemma 1 in time $f_T(k)$ we compute a d.t.d. $T=(\mathcal{B},\mathcal{E})$ of width $\tau(P)$ such that $|\mathcal{B}|\le 2^k$. Then, by Lemma 5 we compute hom(P,G) in time $O(2^k\,\mathrm{poly}(k)\, d^{\,k-\tau(P)}\, n^{\tau(P)}\log n)$. Thus, we can compute every hom(P,G) in time $O(f_T(k)+2^{O(k)} d^{\,k-\tau(P)}\, n^{\tau(P)}\log n)$. Multiplying by $2^{O(k\log k)}$ gives the first bound of the theorem.

From homomorphisms to non-induced copies. Recall that $H/\theta$ is the graph obtained from H by identifying the nodes in the same equivalence class and removing loops and multiple edges, where $\theta\in\Theta(H)$ is an equivalence relation (or partition) over $V_H$. Then, by Equation 15 of [3]:

$$\mathrm{inj}(H,G)=\sum_{\theta\in\Theta(H)}\mu(\theta)\,\mathrm{hom}(H/\theta,G) \qquad (10)$$

where $\mu(\theta)=\prod_{A\in\theta}(-1)^{|A|-1}(|A|-1)!$, where A runs over the equivalence classes (the sets) in $\theta$. Thus, to compute inj(H,G), we enumerate all $\theta\in\Theta(H)$, compute hom($H/\theta$,G), and apply Eq. 10. It is known that $|\Theta(H)|=2^{O(k\ln k)}$ (see e.g. [2]), and clearly for each $\theta$ we can compute $\mu(\theta)$ and $H/\theta$ in time $O(\mathrm{poly}(k))$. Thus, the first bound of the theorem holds for computing inj(H,G) too. Finally, we compute $\mathrm{sub}(H,G)=\mathrm{inj}(H,G)/\mathrm{aut}(H)$, where aut(H) is the number of automorphisms of H, which can be computed in time $2^{O(k\ln k)}$ [26]. This proves the second bound of the theorem.
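As an illustration, a Python sketch of Eq. 10 (the partition generator and names are ours; `hom_of` stands for any routine that returns hom(H/θ, G), e.g. the machinery of Sect. 3):

```python
from math import factorial

def partitions(items):
    """All partitions of a list, as lists of blocks (the set Theta(H))."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):                     # put `first` into an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part                         # or into a new singleton block

def inj_from_hom(H, hom_of):
    """Eq. (10): inj(H,G) = sum over theta of mu(theta) * hom(H/theta, G).
    H: dict node -> set of neighbours; hom_of(Hq) must return hom(Hq, G)."""
    total = 0
    for theta in partitions(list(H)):
        rep = {u: block[0] for block in theta for u in block}    # one representative per class
        Hq = {block[0]: set() for block in theta}                # H/theta: drop loops, merge edges
        for u in H:
            for v in H[u]:
                if rep[u] != rep[v]:
                    Hq[rep[u]].add(rep[v])
                    Hq[rep[v]].add(rep[u])
        mu = 1
        for block in theta:
            mu *= (-1) ** (len(block) - 1) * factorial(len(block) - 1)
        total += mu * hom_of(Hq)
    return total
```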

From non-induced to induced. Finally, let D(H) be the set of all supergraphs of H on the same node set. Then from Equation 14 of [3]:

$$\mathrm{ind}(H,G)=\sum_{H'\in D(H)}(-1)^{|E_{H'}\setminus E_H|}\,\mathrm{inj}(H',G) \qquad (11)$$

To compute ind(H,G), we take every $H'\in D(H)$, compute inj(H′,G), and apply Eq. 11. Since $|D(H)|\le 2^{k^2}$, the third bound of the theorem follows.
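Analogously, a short sketch of Eq. 11 (again `inj_of` is an assumed callable, e.g. built from Eq. 10):

```python
from itertools import combinations

def ind_from_inj(H, inj_of):
    """Eq. (11): ind(H,G) = sum over supergraphs H' of H on V_H of
    (-1)^{|E_{H'} minus E_H|} * inj(H', G). H: dict node -> set of neighbours."""
    V = list(H)
    non_edges = [(u, v) for u, v in combinations(V, 2) if v not in H[u]]
    total = 0
    for t in range(len(non_edges) + 1):
        for extra in combinations(non_edges, t):       # every way to add t missing edges
            H2 = {u: set(H[u]) for u in V}
            for (u, v) in extra:
                H2[u].add(v)
                H2[v].add(u)
            total += (-1) ** t * inj_of(H2)
    return total
```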

The algorithmic part of our work is complete. We shall now focus on computing good dag tree decompositions, so to instantiate Theorem 9 and obtain the upper bounds of Sect. 1.1.

Computing good dag tree decompositions

In this section we show how to compute dag tree decompositions of low width. First, we show that for every k-node dag P we can compute in time $2^{O(k)}$ a dag tree decomposition T that satisfies $\tau(T)\le\lceil k/4\rceil+2$. This result requires a nontrivial proof. As a corollary, we prove Theorems 1 and 2. Second, we give improved bounds for cliques minus $\epsilon$ edges; as a corollary, we prove Theorem 3. Third, we give improved bounds for complete multipartite graphs plus $\epsilon$ edges; as a corollary, we prove Theorem 4. Finally, we show that $\Omega(\alpha(H))\le\tau(H)\le\alpha(H)$ where $\alpha(H)$ is the independence number of H, which is of independent interest. This implies that the trivial decomposition on one bag has width that is asymptotically optimal, since in any orientation of H, the set of sources is an independent set.

To proceed, we need some additional notation. For a dag P, we say $v\in V_P$ is a joint if it is reachable from at least two sources, i.e., if $v\in V_P(u)\cap V_P(u')$ for some $u,u'\in S$ with $u\ne u'$. Let J be the set of joints of P. We write J(u) for the set of joints reachable from u, and for any $X\subseteq V_P$ we let $J(X)=\bigcup_{u\in X}J(u)$. Similarly, we denote by S(y) the sources from which y is reachable, and we let $S(X)=\bigcup_{y\in X}S(y)$.

A bound for all patterns

This subsection is devoted to proving:

Theorem 10

For any dag $P=(V_P,A_P)$, in time $O(1.7549^k)$ we can compute a dag tree decomposition $T=(\mathcal{B},\mathcal{E})$ with $\tau(T)\le\min(\lceil e/4\rceil,\lceil k/4\rceil)+2$ and $|\mathcal{B}|=O(k)$, where $k=|V_P|$ and $e=|A_P|$.

By combining Theorem 10 with Definition 6 and Theorem 9, we obtain as a corollary Theorem 1. The proof of Theorem 10 is divided in four steps, as follows. First (step 1), we remove the “easy” pieces of P; this can break P into several connected components, and we show that their d.t.d.’s can be composed into a d.t.d. for P. Next, we show that if the i-th component has $k_i$ nodes, then it admits a d.t.d. of width $\lceil k_i/4\rceil+2$. This requires “peeling” the component to remove its tree-like parts (step 2) and decomposing the remainder using a reduction to standard tree decompositions (step 3). Finally, we wrap up our results and conclude the proof (step 4).

Throughout the proof, the relevant structure of P is encoded by a graph that we call the skeleton of P, defined as follows.

Definition 7

The skeleton of a dag $P=(V_P,A_P)$ is the bipartite dag $\Lambda(P)=(V_\Lambda,E_\Lambda)$ where $V_\Lambda=S\cup J$ and $E_\Lambda\subseteq S\times J$, and $(u,v)\in E_\Lambda$ if and only if $v\in J(u)$.

Figure 3 gives an example. Note that $\Lambda(P)$ does not contain nodes that are neither sources nor joints, as they are irrelevant to the d.t.d. Note also that computing $\Lambda(P)$ takes time $O(\mathrm{poly}(k))$.

Fig. 3

Left: a dag P. Right: its skeleton $\Lambda(P)$ (sources S above, joints J below)

Let us now delve into the proof. For any node x, we denote by $d_x$ the current degree of x in the skeleton.

Step 1: Greedy bag construction. Set $B^{(0)}=\emptyset$ and let $\Lambda^{(0)}=(V_{\Lambda^{(0)}},E_{\Lambda^{(0)}})=\Lambda$. Set $j=0$ and proceed iteratively as follows, recalling that $V_{\Lambda^{(j)}}=S_{\Lambda^{(j)}}\cup J_{\Lambda^{(j)}}$. For any source $u\in S_{\Lambda^{(j)}}$ let $V_{\Lambda^{(j)}}(u)$ be the transitive closure of u in $\Lambda^{(j)}$. If there exists a source $u\in S_{\Lambda^{(j)}}$ such that $|V_{\Lambda^{(j)}}(u)|\ge 4$, then let $B^{(j+1)}=B^{(j)}\cup\{u\}$, and let $\Lambda^{(j+1)}$ be obtained from $\Lambda^{(j)}$ by removing $V_{\Lambda^{(j)}}(u)$ from $V_{\Lambda^{(j)}}$. Repeat the procedure until $|V_{\Lambda^{(j)}}(u)|\le 3$ for all u. Suppose the procedure stops at $j=j^*$, producing the subset $B^*=B^{(j^*)}$ and the residual skeleton $\Lambda^*=\Lambda^{(j^*)}=(V_{\Lambda^*},E_{\Lambda^*})$.
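A small Python sketch of this greedy phase (our own rendering; the skeleton is represented simply as a map from each source to its set of reachable joints):

```python
def greedy_bags(joints_of):
    """Step 1: repeatedly pick a source whose closure in the current skeleton has at least
    4 nodes (the source plus >= 3 surviving joints), add it to B*, and delete that closure.
    joints_of: source -> set of joints reachable from it (the skeleton Lambda).
    Returns (B*, residual) where residual encodes the residual skeleton Lambda*."""
    B_star = set()
    alive = {u: set(J) for u, J in joints_of.items()}
    while True:
        u = next((s for s in alive if len(alive[s]) >= 3), None)
        if u is None:
            break                                  # every surviving source reaches <= 2 joints
        B_star.add(u)
        dead = alive.pop(u)                        # remove V_Lambda(u): u and its joints ...
        for s in alive:
            alive[s] -= dead                       # ... everywhere in the skeleton
    return B_star, alive
```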

Lemma 6

$|B^*|\le\min\!\big(\frac{k-|V_{\Lambda^*}|}{4},\frac{e-|E_{\Lambda^*}|}{4}\big)$, where $k=|V_P|$ and $e=|A_P|$.

Proof

For the first term of the min, note that at each step we remove at least 4 nodes from $\Lambda^{(j)}$ and add one node to $B^{(j)}$. Hence $4|B^*|\le|V_\Lambda\setminus V_{\Lambda^*}|\le k-|V_{\Lambda^*}|$, which implies $|B^*|\le\frac{k-|V_{\Lambda^*}|}{4}$. For the second term of the min, we show that the set of nodes $V_{\Lambda^{(j)}}(u)$ removed at step j identifies at least 4 unique arcs of P. To this end, consider the sub-pattern $P^{(j)}=P\setminus P(B^{(j)})$ containing all nodes not reachable from $B^{(j)}$. Note that $\Lambda^{(j)}$ is the skeleton of $P^{(j)}$. Indeed, if $v\in J^{(j)}$ then v is not reachable from any source in $B^{(j)}$, since otherwise v would have been removed before step j. Thus, at step j we are removing at least 3 joint nodes of $P^{(j)}(u)$. Therefore, $P^{(j)}(u)$ contains at least three arcs pointing to its joints. In addition, by definition, the joints of $P^{(j)}(u)$ must be reachable from some node in $P^{(j)}\setminus P^{(j)}(u)$. Thus there is an arc from $P^{(j)}\setminus P^{(j)}(u)$ to a node of $P^{(j)}(u)$, and this node is therefore a joint itself. Thus, $P^{(j)}$ contains at least 4 arcs pointing to the joints of $P^{(j)}(u)$. Since the joints of $P^{(j)}(u)$ are then removed from $P^{(j)}$, these arcs are counted only once. Hence $e\ge 4|B^*|+|E_{\Lambda^*}|$, and $|B^*|\le\frac{e-|E_{\Lambda^*}|}{4}$.

Now, if $B^*=S$, then $T=(\{B^*\},\emptyset)$ is a d.t.d. of P whose width is $\tau(T)=|B^*|$. By Lemma 6, $|B^*|\le\min(\frac{k}{4},\frac{e}{4})$, so Theorem 10 is proven.

If instead $B^*\ne S$, then $\Lambda^*$ has $\ell\ge 1$ nonempty connected components. For each $i=1,\dots,\ell$, let $\Lambda_i=(S_i\cup J_i,E_i)$ be the i-th component of $\Lambda^*$. Let $P'=P\setminus P(B^*)$, and let $P_i=P'(S_i)$. Then, $\Lambda_i$ is the skeleton of $P_i$; this follows from the same argument used in the proof of Lemma 6. We shall now see that we can obtain a d.t.d. for P by arranging the d.t.d.’s of the $P_i$ into a tree, and adding $B^*$ to all bags.

Lemma 7

For each $i=1,\dots,\ell$, let $T_i=(\mathcal{B}_i,\mathcal{E}_i)$ be a d.t.d. of $P_i$. Consider the tree T obtained as follows. The root of T is the bag $B^*$, and the subtrees below $B^*$ are $T_1,\dots,T_\ell$, where each bag $B\in T_i$ has been replaced by $B\cup B^*$. Then $T=(\mathcal{B},\mathcal{E})$ is a d.t.d. of P with $\tau(T)\le|B^*|+\max_{i=1,\dots,\ell}\tau(T_i)$ and $|\mathcal{B}|=1+\sum_{i=1}^{\ell}|\mathcal{B}_i|$.

Proof

The claims on $\tau(T)$ and $|\mathcal{B}|$ are straightforward. Let us check that T is a d.t.d. of P, via Definition 2. Property (1) is immediate. For property (2), note that $\bigcup_{B\in\mathcal{B}_i}B=S_i$ because $T_i$ is by hypothesis a d.t.d. of $P_i$. Thus $\bigcup_{B\in\mathcal{B}}B=B^*\cup(\bigcup_{i=1}^{\ell}S_i)=S_P$. We turn to property (3). Choose any two bags $B\cup B^*$ and $B'\cup B^*$ of T, where $B\in T_i$ and $B'\in T_j$ for some $i,j\in\{1,\dots,\ell\}$, and any bag $B''\cup B^*\in T(B\cup B^*,B'\cup B^*)$. Suppose first $i=j$; thus by construction $B''\in T_i(B,B')$. Since $T_i$ is a d.t.d., then $V_{P_i}(B)\cap V_{P_i}(B')\subseteq V_{P_i}(B'')$, and in T this implies $V_P(B\cup B^*)\cap V_P(B'\cup B^*)\subseteq V_P(B''\cup B^*)$. Suppose instead $i\ne j$. Then $J_i(S_i)\cap J_j(S_j)=\emptyset$, and this means that $J(S_i)\cap J(S_j)\subseteq J(B^*)$. But $V_P(B)\cap V_P(B')\subseteq J(S_i)\cap J(S_j)$ and $J(B^*)\subseteq V_P(B^*)$, thus $V_P(B)\cap V_P(B')\subseteq V_P(B^*)$. It follows that for every bag $B''\cup B^*$ of T we have $V_P(B\cup B^*)\cap V_P(B'\cup B^*)\subseteq V_P(B''\cup B^*)$.

Step 2: Peeling Λi. We now remove the tree-like parts of Λi. These include, for instance, sources that have only one reachable joint. For each such source, we create a dedicated bag which becomes the child of another bag that reaches the same joint. This removes a source without increasing the width of the decomposition.

The construction is recursive. Let $P_i^{(0)}=P_i$ and $\Lambda_i^{(0)}=(S_i^{(0)}\cup J_i^{(0)},E_i^{(0)})=\Lambda_i$. Set $j=0$. For any node $x\in\Lambda_i^{(j)}$, we denote by $d_x^{(j)}$ its degree in $\Lambda_i^{(j)}$. We will show that the tree $T_i^{(0)}$ returned by our recursive construction is a d.t.d. for $P_i^{(0)}=P_i$.

The base case is $|S_i^{(j)}|=1$. In this case we set $T_i^{(j)}=(\{S_i^{(j)}\},\emptyset)$. Clearly, $T_i^{(j)}$ is a d.t.d. for $P_i^{(j)}$ of width 1 and we are done. Suppose instead $|S_i^{(j)}|>1$. Recall that $d_u^{(j)}\le 2$ for all $u\in S_i^{(j)}$. Consider the first one of these three cases that applies (if none of them does, then we stop):

  1. $\exists u\in S_i^{(j)}: d_u^{(j)}=1$. Then choose any such u, and choose any $u'\in S_i^{(j)}\setminus\{u\}$ with $J_i^{(j)}(u)\subseteq J_i^{(j)}(u')$.

  2. $J_i^{(j)}(u)=J_i^{(j)}(u')$ for some $u,u'\in S_i^{(j)}$ with $u\ne u'$. Then, choose any such $u,u'$.

  3. $\exists v\in J_i^{(j)}: d_v^{(j)}=1$. Then choose any such v, let u be the unique source such that $v\in J_i^{(j)}(u)$, and let $u'\ne u$ be any source with $J_i^{(j)}(u)\cap J_i^{(j)}(u')\ne\emptyset$.

Then, we define $T_i^{(j)}$ recursively as follows. Let $P_i^{(j+1)}=P_i(S_i^{(j)}\setminus\{u\})$, and let $\Lambda_i^{(j+1)}$ be the skeleton graph obtained from $\Lambda_i^{(j)}$ by removing u and (in the third case) the node $v\in J_i^{(j)}(u)$ that is reachable only from u. We invoke the procedure recursively on $\Lambda_i^{(j+1)}$. Suppose the recursive procedure returns a d.t.d. $T_i^{(j+1)}$ of $P_i^{(j+1)}$. Then, $T_i^{(j+1)}$ must contain a bag B such that $u'\in B$. Create the bag $B_u=\{u\}$, set it as a child of B in $T_i^{(j+1)}$, and let the resulting tree be $T_i^{(j)}$. Let us check that $T_i^{(j)}$ is a d.t.d. for $P_i^{(j)}$. Properties (1) and (2) of Definition 2 are obviously satisfied. For property (3), since $B_u$ is a leaf, we only need to check that $J_i^{(j)}(B_u)\cap J_i^{(j)}(B')\subseteq J_i^{(j)}(B'')$ for all $B'\in T_i^{(j)}$ and all $B''\in T_i^{(j)}(B_u,B')$. To this end note that, by the choice of u and u′, for any $u''\in S_i^{(j)}\setminus\{u,u'\}$ we have $J_i^{(j)}(u)\cap J_i^{(j)}(u'')\subseteq J_i^{(j)}(u')$. We repeat the entire procedure until we reach the base case, or until $|S_i^{(j)}|>1$ and none of the three cases above holds, in which case we move to the next phase.

Before continuing, we make sure that the procedure above is well defined; we must guarantee that, in each of the three cases, the node u′ exists. One can see that u′ exists whenever $|S_i^{(j)}|>1$ (which is true by hypothesis) and $\Lambda_i^{(j)}$ is connected. To see that $\Lambda_i^{(j)}$ is connected, note that if this were not the case then at some step $h<j$ we removed a source u with $d_u^{(h)}=2$ such that no other source u′ has $J_i^{(h)}(u')=J_i^{(h)}(u)$. However, this cannot happen by construction of the procedure.

Step 3: Decomposing the core. Suppose the peeling phase stops at $j=j'$. Let $P_i'=P_i^{(j')}$ and $\Lambda_i'=(S_i'\cup J_i',E_i')=\Lambda_i^{(j')}$. We say $P_i'$ is the core of $P_i$; this is the part that determines the dag treewidth. Now, since $\Lambda_i'$ violates all three conditions of the peeling step, we have $d_u=2$ for every source u and $d_v\ge 2$ for every joint v. Thus $\Lambda_i'$ can be encoded as a simple graph. Formally, let $C_i=(V_{C_i},E_{C_i})$ where $V_{C_i}=J_i'$ and $E_{C_i}=\{e_u: u\in S_i'\}$, where $e_u=J_i'(u)$ for each $u\in S_i'$. To ease the discussion, for the edges we use u and $e_u$ interchangeably. Figure 4 gives an example. Note that $\Lambda_i'$ is the skeleton of $P_i'$, since $S_i'$ are the sources of $P_i'$ and the degree bound above implies that each $v\in J_i'$ is reachable from at least two sources of $P_i'$. In what follows we let $k_i=|S_i'\cup J_i'|=|E_{C_i}|+|V_{C_i}|$.

Fig. 4

Above: example of a skeleton component $\Lambda_i$. Below: the core $\Lambda_i'$ obtained from $\Lambda_i$ after peeling (left), and its encoding as $C_i$ (right)

We use $C_i$ to compute a good d.t.d. $T_i'=(\mathcal{B}_i',\mathcal{E}_i')$ of $P_i'$ via tree decompositions. First, we show that for our purposes it is sufficient to find a d.t.d. of width at most $\frac{|E_{C_i}|}{5}+3$.

Lemma 8

If $\tau(P_i')\le\frac{|E_{C_i}|}{5}+3$ then $\tau(P_i')\le\lceil\frac{k_i}{4}\rceil+2$.

Proof

First, suppose that $|V_{C_i}|\le 4$. Since all nodes of $C_i$ have degree at least 2, then $C_i$ contains a cycle on at most 4 nodes, and thus an edge cover $B_{\mathrm{cov}}$ of size 2. We then build $T_i'$ by setting $B_{\mathrm{cov}}$ as root, and $B_u=\{u\}$ for every $u\in E_{C_i}\setminus B_{\mathrm{cov}}$ as child of $B_{\mathrm{cov}}$. This is clearly a d.t.d. for $P_i'$ of width $2<\lceil\frac{k_i}{4}\rceil+2$, and thus $\tau(P_i')<\lceil\frac{k_i}{4}\rceil+2$.

Suppose instead that $|V_{C_i}|\ge 5$. Note that $|E_{C_i}|\ge|V_{C_i}|$ by construction of $C_i$. One can check that these conditions imply $\frac{|E_{C_i}|}{20}+\frac{|V_{C_i}|}{4}>1$, which in turn gives:

$$\tau(P_i')\le\frac{|E_{C_i}|}{5}+3\le\frac{|E_{C_i}|+|V_{C_i}|}{4}+2=\frac{k_i}{4}+2 \qquad (12)$$

concluding the proof.

Therefore, we compute a d.t.d. of width at most $\frac{|E_{C_i}|}{5}+3$. We do this in two steps.

Lemma 9

In time $O(1.7549^{k_i})$ one can compute a tree decomposition $D=(V_D,E_D)$ for $C_i$ with treewidth at most $\frac{|E_{C_i}|}{5}+2$ and $O(k_i)$ bags.

Proof

By Theorem 2 of [24], the treewidth of a graph $G=(V,E)$ is at most $\frac{|E|}{5}+2$. By Theorems 5.23–5.24 of [18], we can compute a minimum-width tree decomposition of an n-node graph in time $O(1.7549^n)$. By Lemma 5.16 of [18], in time O(n) we can transform such a decomposition into one that contains at most 4n bags, leaving its width unchanged. Therefore, in time $O(1.7549^{k_i})$ we can build a tree decomposition D for $C_i$ with $O(k_i)$ bags that satisfies $t(D)\le\frac{|E_{C_i}|}{5}+2$.

Lemma 10

Let $D=(V_D,E_D)$ be a tree decomposition of $C_i$. In time $\mathrm{poly}(k)$ we can build a d.t.d. $T_i'=(\mathcal{B}_i',\mathcal{E}_i')$ for $P_i'$ such that $\tau(T_i')\le t(D)+1$ and $|\mathcal{B}_i'|=O(|V_D|+k_i)$.

Proof

To simplify the notation let us write $T,\mathcal{B},\mathcal{E}$ in place of $T_i',\mathcal{B}_i',\mathcal{E}_i'$. We first show how to build the tree T. The tedious part is proving it is a valid d.t.d. for $P_i'$.

The intuition is that D covers the edges of $C_i$, which correspond to the sources of $P_i'$. This gives a way to “convert” the bags of D into bags for T. For every $v\in V_{C_i}$ choose an arbitrary incident edge $u_v=\{v,z\}\in E_{C_i}$. Replace each bag $Y\in D$ by $B(Y)=\{u_v: v\in Y\}$, and for every $u\in S_i'\setminus\bigcup_{Y\in D}B(Y)$, choose a bag B(Y) such that $J(u)\subseteq Y$, and set the bag $B_u=\{u\}$ as child of B(Y). Let T be the resulting tree. To see that the construction is well-defined, note that, by point (2) of Definition 1, for any $u\in E_{C_i}$ there exists some $Y\in D$ such that $e_u=\{x,y\}\subseteq Y$. Therefore assigning $B_u$ as child of some B(Y) with $e_u\subseteq Y$ is licit. Now, $\tau(T)\le t(D)+1$ follows immediately by the facts that $|B(Y)|\le|Y|$ for all $Y\in D$ and that $|B_u|=1$ for each of the bags $B_u$ above, and by Definitions 1 and 2. The bound $|\mathcal{B}_i'|=O(|V_D|+k_i)$ holds since T contains a bag for each bag of D, plus at most one bag for each node in $S_i'$, and $|S_i'|\le k_i$.

Let us then check that T is a d.t.d. for $P_i'$ via Definition 2. Clearly, T is a tree and satisfies property (1). For property (2), let $E_{C_i}(D)=\bigcup_{Y\in D}B(Y)$. Observe that by construction $\bigcup_{B\in T}B=E_{C_i}(D)\cup(\bigcup_{u\in E_{C_i}\setminus E_{C_i}(D)}B_u)$. The right-hand expression is $E_{C_i}$.

It remains to check property (3). First, if we have set $B_u$ as child of B(Y) then by construction $J(B_u)\subseteq J(B(Y))$. Thus we can ignore any such $B_u$ and focus on the remaining bags of T, proving that every $B,B',B''$ such that $B''\in T(B,B')$ satisfy $J(B)\cap J(B')\subseteq J(B'')$. Let $Y,Y',Y''$ be the three bags of D from which the construction produced respectively $B,B',B''$. Observe that $B''\in T(B,B')$ implies $Y''\in D(Y,Y')$. Now suppose that, by contradiction, there exists $v\in J(B)\cap J(B')$ such that $v\notin J(B'')$. Note that, by construction, we must have put some u with $e_u=\{v,z\}$ in B and some u′ with $e_{u'}=\{v,z'\}$ in B′, for some $z,z'\in V_{C_i}$. Moreover, $Y\cap\{v,z\}\ne\emptyset$ and $Y'\cap\{v,z'\}\ne\emptyset$, else we could not have $u\in B$ and $u'\in B'$. Finally, bear in mind that $v\notin Y''$ and $u,u'\notin B''$, otherwise $v\in J(B'')$, contradicting the hypothesis. Now we consider three cases. We use repeatedly properties (2) and (3) of Definition 1.

Case 1: $v\in Y\cap Y'$. Then $v\in Y''$, a contradiction.

Case 2: $v\in Y$ and $v\notin Y'$. Then $z'\in Y'$ and u′, with $e_{u'}=\{v,z'\}$, is the edge chosen to cover z′, else we would not have put $u'\in B(Y')$. Moreover there must be $\hat{Y}\in D$ such that $e_{u'}=\{v,z'\}\subseteq\hat{Y}$. For the sake of the proof root D at $Y''$, so that Y and Y′ are in distinct subtrees. If $\hat{Y}$ and Y′ are in the same subtree then $Y''\in D(Y,\hat{Y})$, but $v\in Y\cap\hat{Y}$ and thus $v\in Y''$, a contradiction. Otherwise $Y''\in D(Y',\hat{Y})$, and since $z'\in Y'\cap\hat{Y}$ then $z'\in Y''$ and then $u'\in B(Y'')$, a contradiction.

Case 3: $v\notin Y$ and $v\notin Y'$. Then $z\in Y$, $z'\in Y'$, and $u,u'$ are the sources chosen to cover respectively $z,z'$. Moreover there must be $\hat{Y},\hat{Y}'\in D$ such that $e_u=\{v,z\}\subseteq\hat{Y}$ and $e_{u'}=\{v,z'\}\subseteq\hat{Y}'$. Root again D at $Y''$. If $Y''\in D(\hat{Y},\hat{Y}')$ then since $v\in\hat{Y}\cap\hat{Y}'$ it holds $v\in Y''$, a contradiction. Otherwise $\hat{Y},\hat{Y}'$ are in the same subtree of D. If the subtree is the same as Y's, then $Y''\in D(Y',\hat{Y}')$, but $z'\in Y'\cap\hat{Y}'$ and thus $z'\in Y''$ and thus $u'\in B(Y'')$, a contradiction. Otherwise we have $Y''\in D(Y,\hat{Y})$; but $z\in Y\cap\hat{Y}$, thus $z\in Y''$ and $u\in B(Y'')$, again a contradiction.

Combining Lemma 8, Lemma 9, and Lemma 10, we obtain:

Lemma 11

In time $O(1.7549^{k_i}\,\mathrm{poly}(k_i))$ we can compute a d.t.d. $T_i'=(\mathcal{B}_i',\mathcal{E}_i')$ for $P_i'$ such that $\tau(T_i')\le\lceil\frac{k_i}{4}\rceil+2$ and $|\mathcal{B}_i'|=O(k_i)$.

With Lemma 11, we are almost done. It remains to wrap all our bounds together.

Step 4: Assembling the tree. Recall the sub-patterns $P_i$ obtained after the greedy bag construction (step 1). Let $T_i=(\mathcal{B}_i,\mathcal{E}_i)$ be the d.t.d. for $P_i$ as returned by the recursive peeling and the core decomposition. Since the peeling phase only adds bags of size 1, then $\tau(T_i)=\tau(T_i')$. Therefore, by Lemma 11, $\tau(T_i)\le\lceil\frac{k_i}{4}\rceil+2$. Moreover, since each bag added in the peeling phase corresponds to a unique source, then $|\mathcal{B}_i|=O(k_i+|V(P_i)|)=O(|V(P_i)|)$.

Let now $T=(\mathcal{B},\mathcal{E})$ be the d.t.d. for P obtained by assembling the trees $T_1,\dots,T_\ell$ as in Lemma 7. By Lemma 7 itself, $\tau(T)\le|B^*|+\max_{i=1,\dots,\ell}\tau(T_i)$, thus:

$$\tau(T)\le|B^*|+\max_{i=1,\dots,\ell}\Big\lceil\frac{k_i}{4}\Big\rceil+2 \qquad (13)$$

Now, by Lemma 6 we know that $P(B^*)$ has at least $4|B^*|$ nodes and $4|B^*|$ arcs. Similarly, since each $\Lambda_i$ has at least $k_i$ nodes and $k_i$ arcs, then $P\setminus P(B^*)$ has at least $\sum_{i=1}^{\ell}k_i$ nodes and $\sum_{i=1}^{\ell}k_i$ arcs. Then $\tau(T)\le\lceil\frac{k}{4}\rceil+2$ and $\tau(T)\le\lceil\frac{e}{4}\rceil+2$, so $\tau(T)\le\min(\lceil\frac{k}{4}\rceil,\lceil\frac{e}{4}\rceil)+2$. Moreover, $|\mathcal{B}|=1+\sum_{i=1}^{\ell}|\mathcal{B}_i|=O(\sum_{i=1}^{\ell}|V(P_i)|)=O(k)$. Finally, by Lemma 11 the time to build $T_i$ is $O(1.7549^{k_i}\,\mathrm{poly}(k_i))$, since the peeling phase clearly takes time $\mathrm{poly}(k_i)$. The total time to build T is therefore $O(1.7549^{k}\,\mathrm{poly}(k))$. This concludes the proof of Theorem 10.

Bounds for quasi-cliques (Theorem 3)

Lemma 12

If a k-node dag P has $\binom{k}{2}-\epsilon$ edges, then in time $O(\mathrm{poly}(k))$ one can compute a d.t.d. T for P on two bags such that $\tau(T)\le\lceil\frac12+\sqrt{\epsilon/2}\rceil$.

Proof

The source set S of P is an independent set. Hence $\epsilon\ge\binom{|S|}{2}$, and $|S|\le 1+\sqrt{2\epsilon}$. Consider any tree T on two bags $B_1,B_2$ such that $B_1\cup B_2=S$, $|B_1|=\lceil|S|/2\rceil$, and $|B_2|=\lfloor|S|/2\rfloor$. It is immediate to check that T satisfies the claim.

By coupling Lemma 12 and Theorem 9, for computing hom(H,G) and sub(H,G) we obtain a running time bound of $2^{O(k\log k)}\cdot O(d^{\,k-\lceil\frac12+\sqrt{\epsilon/2}\rceil}\, n^{\lceil\frac12+\sqrt{\epsilon/2}\rceil}\log n)$. For ind(H,G), we refine the bound of Theorem 9 by observing that $|D(H)|\le 2^{\epsilon}$. This yields a running time bound of $2^{O(\epsilon+k\log k)}\cdot O(d^{\,k-\lceil\frac12+\sqrt{\epsilon/2}\rceil}\, n^{\lceil\frac12+\sqrt{\epsilon/2}\rceil}\log n)$. This concludes the proof of Theorem 3.

Bounds for quasi-multipartite graphs (Theorem 4)

Lemma 13

If H is a complete multipartite graph, then $\tau_2(H)=1$. If H is a complete multipartite graph plus $\epsilon$ edges, then $\tau_2(H)\le\lceil\epsilon/4\rceil+2$. In either case, for any $\theta\in\Theta(H)$, for any acyclic orientation P of $H/\theta$ we can compute in time $2^{O(k)}$ a d.t.d. of P on O(k) bags whose width satisfies the bounds above.

Proof

First, suppose H=(VH,EH) is complete multipartite, so VH=VH1VHκ where each VHj is a maximal independent set in H. In any acyclic orientation P of H, the source set S satisfies SVHj for some j{1,,κ}. Moreover, VP(u)=VP(u) for any u,uS. A d.t.d. T for P of width τ(T)=1 is the tree on |S| bags with one source per bag, which can be computed in time O(poly(k)).

Suppose now we add ϵ arcs to P, with any orientation; this means H is a complete multipartite graph plus ϵ edges. Again we have S ⊆ V_H^j, but now for some u, u′ ∈ S we might have V_P(u) ≠ V_P(u′), so the d.t.d. above might not be valid anymore. Let P_j = P[V_H^j] and consider any d.t.d. T for P_j. We argue that T is a valid d.t.d. for P as well. First, the source set of P_j is the same as that of P (that is, S). Thus, since T satisfies properties (1) and (2) of Definition 2 for P_j, it does so for P, too. For property (3), note that every node v ∈ V_H \ V_H^j is reachable from every u ∈ S. Thus, all bags B of T satisfy V_P(B) = (V_H \ V_H^j) ∪ (V_P(B) ∩ V_H^j). As a consequence, for any three bags B, B_1, B_2, if V_{P_j}(B_1) ∩ V_{P_j}(B_2) ⊆ V_{P_j}(B), then V_P(B_1) ∩ V_P(B_2) ⊆ V_P(B). Thus T satisfies property (3) and is a d.t.d. for P. Therefore, any d.t.d. for P_j is a d.t.d. for P. Now, since P_j has at most ϵ edges, by Theorem 10 in time 2^{O(k)} we can compute a d.t.d. for it of width at most ϵ/4 + 2 on O(k) bags.

Consider now any θ ∈ Θ(H). For any v ∈ V_H, we denote by θ(v) the node of H/θ corresponding to v, and for any node x of H/θ we let θ^{-1}(x) = {u ∈ V_H : θ(u) = x} be the set of nodes of H identified in x. Let P be any acyclic orientation of H/θ. Since the sources S of P form an independent set, ∪_{x∈S} θ^{-1}(x) ⊆ V_H^j for some j. Moreover, for any node x of P, if v ∈ θ^{-1}(x) for some v ∈ V_H^i with i ≠ j, then x is reachable from every node in S. Therefore, if we let V_P^j = ∪_{v∈V_H^j} θ(v) and P_j = P[V_P^j], the arguments above apply and we obtain the same bound.

By coupling Lemma 13 and Theorem 9, when H is complete multipartite, for computing hom(H,G) and sub(H,G) we obtain a time bound of 2^{O(k log k)} · O(d^{k−1} n log n). Similarly, when H is complete multipartite plus ϵ edges, we obtain a time bound of 2^{O(k log k)} · O(d^{k−ϵ/4−2} n^{ϵ/4+2} log n). This proves Theorem 4.
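
As a concrete instance (our instantiation of the first bound): for H = K_{3,3,3}, so k = 9, the bound above gives time 2^{O(k log k)} · O(d^8 n log n) for hom(H,G) and sub(H,G), i.e. time linear in n when the degeneracy d of G is bounded.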

Independence number and dag treewidth

Recall that α(H) is the independence number of H. We show:

Lemma 14

Any k-node graph H satisfies Ω(α(H)) ≤ τ(H) ≤ α(H).

Proof

For the upper bound, note that α(H′/θ) ≤ α(H) for any H′ ∈ D(H) and any θ ∈ Θ(H′). Moreover, in any acyclic orientation P of H′/θ the sources form an independent set. Thus τ(P) ≤ α(H). The bound follows by Definition 6.

For the lower bound, we exhibit a pattern H′, obtained by adding edges to H, such that τ(P) = Ω(α(H)) for all its acyclic orientations P. Let I ⊆ V_H be an independent set of H with |I| = Ω(α(H)) and |I| ≡ 0 (mod 5). We add edges to I, so as to obtain the 1-subdivision of an expander. Partition I into I_J, I_S where |I_J| = (2/5)|I| and |I_S| = (3/5)|I|. Consider a 3-regular expander E = (I_J, E_E) of linear treewidth t(E) = Ω(|I_J|). It is well known that such expanders exist (see e.g. Proposition 1 and Theorem 5 of [21]). Note that |E_E| = (3/2)|I_J| = |I_S|. For each edge {u,v} ∈ E_E, we choose a distinct node in I_S, denoted by e_{uv}, and we add to H the edges {e_{uv}, u} and {e_{uv}, v}. Let H′ be the resulting pattern. Observe that H′[I] is the 1-subdivision of E, and that t(E) = Ω(α(H)) since |I_J| = Ω(α(H)).
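
The sizes are chosen so that the nodes of I_S can be matched bijectively with the edges of the expander (our check of the arithmetic): a 3-regular graph on |I_J| nodes has (3/2)|I_J| = (3/2)·(2/5)|I| = (3/5)|I| = |I_S| edges.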

Let now P = (V_P, A_P) be any acyclic orientation of H′ such that I_S ⊆ S_P, where S_P denotes the set of sources of P. Such an orientation exists since I_S is an independent set in H′. Let T be any d.t.d. of P. We show that τ(T) ≥ (1/2)(t(E) + 1) = Ω(α(H)), which implies τ(P) = Ω(α(H)) and therefore the claim τ(H) = Ω(α(H)). To this end, consider the tree D obtained from T by replacing each bag of sources B with the bag of nodes J(B) ∩ I_J. We claim that D is a tree decomposition of E of width at most 2τ(T) − 1. Let us start by checking the properties of Definition 1.

Property (1). By point (2) of Definition 2, the d.t.d. T satisfies ∪_{B∈T} B = S_P. Therefore, by construction of D, we have ∪_{X∈D} X = ∪_{B∈T} (J(B) ∩ I_J) = I_J.

Property (2). Let {v,w} be any edge of E, where we recall that {v,w} ⊆ I_J. By construction of H′, there exists u ∈ I_S such that J(u) = {v,w} in P. Since T is a d.t.d., it satisfies point (2) of Definition 2, hence u ∈ B for some B ∈ T. By construction of D this implies that there is some bag X ∈ D with {v,w} = J(u) ⊆ X.

Property (3). Fix any three bags X_1, X_2, X_3 ∈ D such that X_1 ∈ D(X_2, X_3). By construction, X_1 = J(B_1) ∩ I_J, X_2 = J(B_2) ∩ I_J, X_3 = J(B_3) ∩ I_J for some B_1, B_2, B_3 ∈ T such that B_1 ∈ T(B_2, B_3). Consider any v ∈ X_2 ∩ X_3; we need to show that v ∈ X_1. By construction of D, we have v ∈ X_2 ∩ X_3 = J(B_2) ∩ J(B_3) ∩ I_J. Thus, there exist u ∈ B_2 and u′ ∈ B_3 such that v ∈ J(u) ∩ I_J and v ∈ J(u′) ∩ I_J. However, since B_1 ∈ T(B_2, B_3), point (3) of Definition 2 implies J(u) ∩ J(u′) ⊆ J(B_1). Therefore v ∈ J(B_1) as well. Moreover v ∈ I_J, and thus v ∈ J(B_1) ∩ I_J. But J(B_1) ∩ I_J = X_1, so v ∈ X_1.

Hence, D is a tree decomposition of E. Finally, note that any bag X ∈ D by construction satisfies |X| = |J(B) ∩ I_J| ≤ 2|B|, since any source u ∈ B has at most 2 arcs towards I_J. Then by Definition 1 and Definition 3 we have t(E) ≤ 2τ(P) − 1, that is, τ(P) ≥ (1/2)(t(E) + 1), as claimed.
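
Spelling out this last step (our elaboration): t(E) ≤ max_{X∈D} |X| − 1 ≤ 2 max_{B∈T} |B| − 1 = 2τ(T) − 1, and since T was an arbitrary d.t.d. of P, this gives τ(P) ≥ (1/2)(t(E) + 1).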

Lower bounds

We prove Theorem 6, in a more technical form. Note that, since τ(H)=Θ(α(H)) by Lemma 14, the bound still holds if one replaces τ(H) by α(H). The proof uses the following result:

Theorem 11

([12], Theorem I.2) The following problems are #W[1]-hard and, assuming ETH, cannot be solved in time f(k)·n^{o(k/log k)} for any computable function f: counting (directed) paths or cycles of length k, and counting edge-colorful or uncolored k-matchings in bipartite graphs.

Let us now state the lower bound.

Theorem 12

Choose any function a: ℕ → ℕ such that a(k) ∈ [1, k] for all k ∈ ℕ. There exists an infinite family ℋ of patterns such that (1) for all H ∈ ℋ we have τ(H) = Θ(a(|V(H)|)), and (2) if there exists an algorithm that, for all H ∈ ℋ, computes ind(H,G) or sub(H,G) in time f(d,k)·n^{o(τ(H)/log τ(H))}, where d is the degeneracy of G, then ETH fails.

Proof

We reduce counting cycles in an arbitrary graph to counting a gadget pattern on k nodes with dag treewidth O(a(k)) in a d-degenerate graph.

First, fix a function d: ℕ → ℕ such that d(k) = Ω(k/a(k)). Now consider a simple cycle on k₀ ≥ 3 nodes and any arbitrarily large k ≥ 3. Our gadget pattern on k nodes is the following. For each edge e = uv of the cycle create a clique C_e on d(k) − 1 nodes; delete e and connect both u and v to every node of C_e. The resulting pattern H has k = k₀·d(k) nodes. Let us prove that τ(H) ≤ k₀; since k₀ = k/d(k) = O(a(k)), this implies τ(H) = O(a(k)) as desired. Consider again the generic edge e = uv. In any acyclic orientation P of H, the set C_e ∪ {u} induces a clique, and thus can contain at most one source. Applying the argument to all e shows that |S(P)| ≤ k₀, hence τ(P) ≤ k₀. This holds also if we add edges and/or identify nodes of P, hence τ(H) ≤ k₀.
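
Our check of the node count: each of the k₀ cycle edges contributes the d(k) − 1 nodes of its clique C_e, so k = k₀ + k₀·(d(k) − 1) = k₀·d(k).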

Now consider a simple graph G₀ on n₀ nodes and m₀ edges. We replace each edge of G₀ as described above, which takes time O(poly(n₀)). The resulting graph G has n = m₀(d − 1) + n₀ = O(d·n₀²) nodes and degeneracy d. Every k₀-cycle of G₀ is uniquely associated with a copy of H in G (note that every copy of H in G is automatically an induced copy). Suppose we have an algorithm that computes ind(H,G) or sub(H,G) in time f(d,k)·n^{o(τ(H)/log τ(H))}. Since τ(H) ≤ k₀, n = O(d·n₀²), k = f₁(d, k₀), and d = f₂(k₀) for some functions f₁, f₂, the running time is of the form f′(k₀)·n₀^{o(k₀/log k₀)} for some function f′. Invoking Theorem 11 concludes the proof.
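
As a concrete illustration of this edge-replacement step (an illustrative sketch, not taken from the paper; the graph encoding and the function name edge_gadget_reduction are ours, and d plays the role of the clique parameter d(k)):

def edge_gadget_reduction(g0, d):
    # Return the host graph G obtained from G0 by replacing each edge {u, v}
    # with a clique C_e on d - 1 fresh nodes joined to both u and v.
    g = {v: set() for v in g0}          # original nodes kept, original edges dropped
    def add_edge(a, b):
        g.setdefault(a, set()).add(b)
        g.setdefault(b, set()).add(a)
    seen = set()
    edge_index = 0
    for u in g0:
        for v in g0[u]:
            if (v, u) in seen:          # handle each undirected edge only once
                continue
            seen.add((u, v))
            clique = [("c", edge_index, i) for i in range(d - 1)]
            edge_index += 1
            for i, a in enumerate(clique):
                for b in clique[i + 1:]:
                    add_edge(a, b)      # make C_e a clique
            for a in clique:
                add_edge(u, a)          # join C_e to u ...
                add_edge(v, a)          # ... and to v
    return g

# Example: a triangle with d = 4 gives n = n0 + m0*(d-1) = 3 + 3*3 = 12 nodes.
g0 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
assert len(edge_gadget_reduction(g0, 4)) == 12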

Conclusions

We have shown how, by introducing a novel tree-like decomposition for directed acyclic graphs, one can improve on the decades-old state-of-the-art subgraph counting algorithms when the host graph is sufficiently sparse. This decomposition seems to capture the structure of the problem, and may therefore be of independent interest.

We leave open the (important) problem of finding a tight characterization of the class of patterns for which homomorphism, subgraph, and induced subgraph counting are fixed-parameter tractable when the complexity is parameterized by the degeneracy of the host graph. Our results represent one step in this direction, identifying properties (the boundedness of our notions of width) that are sufficient, but perhaps not necessary. Indeed, our lower bounds only give evidence that such properties are necessary for counting induced or non-induced occurrences, and only for some classes of patterns. Finding a complete characterization, and thereby establishing a dichotomy over pattern classes for all three problems, would be an exciting development.

Acknowledgements

Most of this work was done while the author was affiliated with the Department of Computer Science of the Sapienza University of Rome. The author was supported in part by: BICI, the Bertinoro International Center for Informatics; a Google Focused Award “Algorithms and Learning for AI” (ALL4AI); the ERC Starting Grant DMAP 680153; the “Dipartimenti di Eccellenza 2018–2022” Grant awarded to the Department of Computer Science of the Sapienza University of Rome. The author thanks the anonymous reviewers for their valuable feedback.

Funding

Open access funding provided by Università degli Studi di Milano within the CRUI-CARE Agreement.

Footnotes

1

In this paper we suppress poly(k) factors by default; when needed, we state them explicitly in order to emphasize that the dependence on k is polynomial rather than exponential.

2

Formally, we should define a tree together with a mapping between its nodes and the subsets of V. However, the definition adopted here is sufficient for our purposes and lightens the notation.

A preliminary version of this article, [4], appeared in the proceedings of the 14th International Symposium on Parameterized and Exact Computation (IPEC 2019).

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Alon N, Dao P, Hajirasouliha I, Hormozdiari F, Sahinalp SC. Biomolecular network motif counting and discovery by color coding. Bioinformatics. 2008;24(13):i241–249. doi: 10.1093/bioinformatics/btn163.
2. Berend D, Tassa T. Improved bounds on Bell numbers and on moments of sums of random variables. Probab. Math. Stat. 2010;30(2):185–205.
3. Borgs, C., Chayes, J., Lovász, L., Sós, V.T., Vesztergombi, K.: Counting Graph Homomorphisms, pp. 315–371 (2006)
4. Bressan, M.: Faster subgraph counting in sparse graphs. In: Proceedings of IPEC, vol. 148, pp. 6:1–6:15 (2019)
5. Bressan, M., Chierichetti, F., Kumar, R., Leucci, S., Panconesi, A.: Counting graphlets: space vs. time. In: Proceedings of ACM WSDM, pp. 557–566 (2017)
6. Bressan M, Chierichetti F, Kumar R, Leucci S, Panconesi A. Motif counting beyond five nodes. ACM Trans. Knowl. Discov. Data. 2018;20(2):1–25. doi: 10.1145/3186586.
7. Bressan M, Leucci S, Panconesi A. Motivo: fast motif counting via succinct color coding and adaptive sampling. Proc. VLDB Endow. 2019;12(11):1651–1663. doi: 10.14778/3342263.3342640.
8. Chen J, Chor B, Fellows M, Huang X, Juedes D, Kanj IA, Xia G. Tight lower bounds for certain parameterized NP-hard problems. Inf. Comput. 2005;201(2):216–231. doi: 10.1016/j.ic.2005.05.001.
9. Chen J, Huang X, Kanj IA, Xia G. Strong computational lower bounds via parameterized complexity. J. Comput. Syst. Sci. 2006;72(8):1346–1367. doi: 10.1016/j.jcss.2006.04.007.
10. Chiba N, Nishizeki T. Arboricity and subgraph listing algorithms. SIAM J. Comput. 1985;14(1):210–223. doi: 10.1137/0214017.
11. Curticapean, R., Dell, H., Marx, D.: Homomorphisms are a good basis for counting small subgraphs. In: Proceedings of ACM STOC, pp. 210–223 (2017)
12. Curticapean, R., Marx, D.: Complexity of counting subgraphs: only the boundedness of the vertex-cover number counts. In: Proceedings of IEEE FOCS, pp. 130–139 (2014)
13. Diestel R. Graph Theory, 5th edn. Berlin: Springer; 2017.
14. Eppstein D. Arboricity and bipartite subgraph listing algorithms. Inf. Process. Lett. 1994;51(4):207–211. doi: 10.1016/0020-0190(94)90121-X.
15. Eppstein D. Subgraph isomorphism in planar graphs and related problems. J. Graph Algorithms Appl. 1999;3(3):1–27. doi: 10.7155/jgaa.00014.
16. Eppstein, D., Löffler, M., Strash, D.: Listing all maximal cliques in sparse graphs in near-optimal time. In: Algorithms and Computation, pp. 403–414. Springer, Berlin (2010)
17. Eppstein, D., Strash, D.: Listing all maximal cliques in large sparse real-world graphs. In: Experimental Algorithms, pp. 364–375. Springer, Berlin (2011)
18. Fomin, F.V., Kratsch, D.: Exact Exponential Algorithms, 1st edn. Springer, Berlin (2010)
19. Ganian R, Hliněný P, Kneis J, Meister D, Obdržálek J, Rossmanith P, Sikdar S. Are there any good digraph width measures? J. Comb. Theory Ser. B. 2016;116:250–286. doi: 10.1016/j.jctb.2015.09.001.
20. Grohe M, Kreutzer S, Siebertz S. Characterisations of nowhere dense graphs (invited talk). Proc. FSTTCS. 2013;24:21–40.
21. Grohe M, Marx D. On tree width, bramble size, and expansion. J. Comb. Theory Ser. B. 2009;99(1):218–228. doi: 10.1016/j.jctb.2008.06.004.
22. Grohe, M., Schweikardt, N.: First-order query evaluation with cardinality conditions. In: Proceedings of ACM SIGMOD, pp. 253–266 (2018)
23. Impagliazzo, R., Paturi, R., Zane, F.: Which problems have strongly exponential complexity? In: Proceedings of IEEE FOCS, pp. 653–662 (1998)
24. Kneis, J., Mölle, D., Richter, S., Rossmanith, P.: Algorithms based on the treewidth of sparse graphs. In: Graph-Theoretic Concepts in Computer Science, pp. 385–396. Springer, Berlin (2005)
25. Le Gall, F.: Powers of tensors and fast matrix multiplication. In: Proceedings of ISSAC, pp. 296–303 (2014)
26. Mathon R. A note on the graph isomorphism counting problem. Inf. Process. Lett. 1979;8(3):131–136. doi: 10.1016/0020-0190(79)90004-8.
27. Nešetřil, J., de Mendez, P.O.: Sparsity: Graphs, Structures, and Algorithms. Algorithms and Combinatorics. Springer, Berlin (2012)
28. Nešetřil J, de Mendez PO. On nowhere dense graphs. Eur. J. Combin. 2011;32(4):600–617. doi: 10.1016/j.ejc.2011.01.006.
29. Nešetřil J, Poljak S. On the complexity of the subgraph problem. Comment. Math. Univ. Carol. 1985;26(2):415–419.
30. Patel V, Regts G. Computing the number of induced copies of a fixed graph in a bounded degree graph. Algorithmica. 2018;81(5):1844–1858. doi: 10.1007/s00453-018-0511-9.
31. Sariyüce, A.E., Pinar, A.: Peeling bipartite networks for dense subgraph discovery. In: Proceedings of ACM WSDM, pp. 504–512 (2018)
32. Sariyüce AE, Seshadhri C, Pinar A, Çatalyürek ÜV. Nucleus decompositions for identifying hierarchy of dense subgraphs. ACM Trans. Web. 2017;11(3):16:1–16:27. doi: 10.1145/3057742.
33. Tsourakakis, C.E., Pachocki, J., Mitzenmacher, M.: Scalable motif-aware graph clustering. In: Proceedings of WWW, pp. 1451–1460 (2017)
