ABSTRACT
A wide class of problems in combinatorics, computer science and physics can be described along the following lines. There are a large number of variables ranging over a finite domain that interact through constraints that each bind a few variables and either encourage or discourage certain value combinations. Examples include the k‐SAT problem or the Ising model. Such models naturally induce a Gibbs measure on the set of assignments, which is characterised by its partition function. The present paper deals with the partition function of problems where the interactions between variables and constraints are induced by a sparse random (hyper)graph. According to physics predictions, a generic recipe called the “replica symmetric cavity method” yields the correct value of the partition function if the underlying model enjoys certain properties [Krzakala et al., PNAS 104 (2007) 10318–10323]. Guided by this conjecture, we prove general sufficient conditions for the success of the cavity method. The proofs are based on a “regularity lemma” for probability measures on sets of the form $\Omega^n$ for a finite Ω and a large n that may be of independent interest. © 2016 Wiley Periodicals, Inc. Random Struct. Alg., 49, 694–741, 2016
Keywords: random graphs, Belief Propagation, cavity method, regularity lemma
1. INTRODUCTION
Despite their simplicity, or perhaps because thereof, the first and the second moment method are the most widely used techniques in probabilistic combinatorics. Erdős famously employed the first moment method to lower‐bound the Ramsey number as well as to establish the existence of graphs of high girth and high chromatic number [28, 29]. Even a half‐century on, deterministic constructions cannot hold a candle to these probabilistic results [14, 41]. Moreover, the second moment method has been used to count prime factors [51] and Hamilton cycles [47] as well as to determine the two possible values of the chromatic number of a sparse random graph [3].
Yet there are quite a few problems for which the standard first and the second moment methods are too simplistic. The random k‐SAT model is a case in point. There are n Boolean variables $x_1,\ldots,x_n$ and m clauses $a_1,\ldots,a_m$, where $m = \lceil \alpha n \rceil$ (the real αn rounded up to the next integer) for some fixed $\alpha > 0$. Each clause binds k variables, which are chosen independently and uniformly, and discourages them from taking precisely one of the $2^k$ possible truth value combinations. The forbidden combination is chosen uniformly and independently for each clause.
The random k‐SAT instance $\Phi$ naturally gives rise to a probability measure on the set $\{0,1\}^n$ of all Boolean assignments. Indeed, for a given parameter $\beta \ge 0$ the Gibbs measure is defined by letting
$$\mu_{\Phi,\beta}(\sigma)=\frac{\exp(-\beta\,E_\Phi(\sigma))}{Z_\beta(\Phi)} \qquad (1.1)$$
for every assignment $\sigma\in\{0,1\}^n$, where $E_\Phi(\sigma)$ denotes the number of clauses of Φ that σ violates and
$$Z_\beta(\Phi)=\sum_{\tau\in\{0,1\}^n}\exp(-\beta\,E_\Phi(\tau)) \qquad (1.2)$$
is called the partition function. Thus, the Gibbs measure weighs assignments according to the number of clauses that they violate. In effect, by tuning β we can interpolate between just the uniform distribution on $\{0,1\}^n$ (β = 0) and a measure that strongly favours satisfying assignments (β → ∞). Hence, if we think of $E_\Phi$ as inducing a “height function” on the set of assignments, then varying β allows us to explore the resulting landscape. Apart from its intrinsic combinatorial interest, the shape of the height function, the so‐called “Hamiltonian”, governs the performance of algorithms such as the Metropolis process or Simulated Annealing.
To understand the Gibbs measure it is key to get a handle on the partition function $Z_\beta(\Phi)$. Of course, the default approach to this kind of problem would be to apply the first and second moment methods. However, upon closer inspection it emerges that with high probability $\ln Z_\beta(\Phi) \le \ln \mathbb E[Z_\beta(\Phi)] - \Omega(n)$ for any $\beta > 0$ [5]. In other words, the first moment over‐estimates the partition function of a typical random formula by an exponential factor. The reason for this is a “lottery effect”: a tiny minority of formulas render an exceptionally high contribution to $\mathbb E[Z_\beta(\Phi)]$. Unsurprisingly, going to the second moment only exacerbates the problem, and for any $\beta > 0$ we find $\mathbb E[Z_\beta(\Phi)^2] \ge \exp(\Omega(n))\,\mathbb E[Z_\beta(\Phi)]^2$. In other words, the second moment method fails rather spectacularly for all possible parameter combinations.
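The lottery effect can be seen in vitro. The following toy experiment (our own illustrative sketch, not part of the paper; all identifiers and parameter choices are ours) brute‐forces $Z_\beta$ on tiny random k‐SAT instances and compares the first moment $\ln \mathbb E[Z_\beta]$ with the typical value $\mathbb E[\ln Z_\beta]$; even at such small n the first moment visibly exceeds the typical value.

```python
import itertools, math, random, statistics

def random_ksat(n, m, k, rng):
    """A random k-SAT formula: each clause is k uniformly chosen variable
    indices plus one uniformly chosen forbidden truth-value combination."""
    return [([rng.randrange(n) for _ in range(k)],
             tuple(rng.randrange(2) for _ in range(k))) for _ in range(m)]

def violated(formula, sigma):
    """E_Phi(sigma): number of clauses whose forbidden combination sigma hits."""
    return sum(1 for vs, forb in formula
               if all(sigma[v] == f for v, f in zip(vs, forb)))

def log_partition(formula, n, beta):
    """ln Z_beta(Phi) by brute-force enumeration (tiny n only)."""
    return math.log(sum(math.exp(-beta * violated(formula, sigma))
                        for sigma in itertools.product((0, 1), repeat=n)))

rng = random.Random(0)
n, k, alpha, beta, trials = 12, 3, 4.0, 2.0, 200
logs = [log_partition(random_ksat(n, math.ceil(alpha * n), k, rng), n, beta)
        for _ in range(trials)]
# ln E[Z] (first moment, estimated) versus E[ln Z] (typical value):
ln_first_moment = math.log(statistics.fmean(math.exp(l) for l in logs))
print("ln E[Z]/n ~ %.3f   E[ln Z]/n ~ %.3f"
      % (ln_first_moment / n, statistics.fmean(logs) / n))
```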
The first and the second moment method fall victim to similar large deviations effects in many other “random constraint satisfaction problems”. These problems, ubiquitous in combinatorics, information theory, computer science and physics [4, 36, 46], can be described along the following lines. A random factor graph, chosen either from a uniform distribution (like the random k‐SAT model above) or from a suitable configuration model, induces interactions between the variables and the constraints. The variables range over a fixed finite domain Ω and each constraint binds a few variables. The constraints come with “weight functions” that either encourage or discourage certain value combinations of the incident variables. Multiplying up the weight functions of all the constraints just as in (1.1)–(1.2), we obtain the Gibbs measure and the partition function.
With the standard first and second moment method drawing a blank, we seem to be at a loss as far as calculating the partition function is concerned. However, physicists have put forward an ingenious albeit non‐rigorous alternative called the cavity method [36]. This technique, which applies almost mechanically to any problem that can be described in the language of sparse random factor graphs, yields an explicit conjecture as to the value of the partition function. More specifically, the cavity method comes in several installments. In this paper, we are concerned with the simplest, so‐called “replica symmetric” version.
In one of their key papers [34] physicists hypothesized abstract conditions under which the replica symmetric cavity method yields the correct value of the partition function. The thrust of this paper is to prove corresponding rigorous results. Specifically, according to [34] the replica symmetric cavity method gives the correct answer if the Gibbs measure satisfies certain correlation decay properties. For example, the Gibbs uniqueness condition requires that under the Gibbs measure the value assigned to a variable x is asymptotically independent of the values assigned to the variables at a large distance from x in the factor graph. In Corollary 4.6 below we prove that this condition is indeed sufficient to guarantee the success of the cavity method. Additionally, Theorems 4.4 and 4.5 yield rigorous sufficient conditions in terms of substantially weaker conditions, namely a symmetry property and the non‐reconstruction property.
A key feature of the paper is that we establish these results not for specific examples but generically for a very wide class of factor graph models. Of course, stating and proving general results requires a degree of abstraction. In particular, we resort to the framework of local weak convergence of graph sequences [8, 35]. This framework suits the physics predictions well, which come in terms of the “limiting tree” that describes the local structure of a large random factor graph. To be precise, the replica symmetric prediction is given by a functional called the Bethe free energy applied to an (infinite) random tree.
The principal tool to prove these results is a theorem about the structure of probability measures on sets of the form $\Omega^n$ for some fixed finite set Ω and a large integer n, Theorem 2.1 below. We expect that this result, which is inspired by Szemerédi's regularity lemma [49], will be of independent interest. To prove our results about random factor graphs, we combine Theorem 2.1 with the theory of local weak convergence to carry out, completely generically, “smart” first and second moment arguments that avoid the lottery effects that the standard arguments fall victim to.
In Section 2 we begin with the abstract results about probability measures on cubes. Subsequently, in Section 3 we set the stage by introducing the formalism of factor graphs and local weak convergence. Further, in Section 4 we state and prove the main results about Gibbs measures on random factor graphs. Finally, Section 5 contains the proof of a technical result that enables us to control the local structure of random factor graphs.
1.1. Related Work
A detailed (non‐rigorous) discussion of the cavity method can be found in [36]. It is known that the replica symmetric version of the cavity method does not always yield the correct value of the partition function. For instance, in some factor graph models there occurs a “condensation phase transition” beyond which the replica symmetric prediction is off [20, 34]. The more complex “1‐step replica symmetry breaking (1RSB)” version of the cavity method [37] is expected to yield the correct value of the partition function some way beyond condensation. However, another phase transition called “full replica symmetry breaking” spells doom for even the 1RSB cavity method [36].
The replica symmetric cavity method has been vindicated rigorously in various special cases. For instance, Montanari and Shah [38] proved that in the random k‐SAT model the replica symmetric prediction is correct up to the Gibbs uniqueness threshold. A similar result was obtained by Bandyopadhyay and Gamarnik [9] for graph colorings and independent sets. Furthermore, Dembo, Montanari and Sun [23] proved the replica symmetric conjecture for a class of models with specific types of constraints. A strength of [23] is that the result applies even to sequences of non‐random factor graphs under a local weak convergence assumption. But both [23, 38] are based on the “interpolation method” [30, 32, 42], which entails substantial restrictions on the types of models that can be handled. By contrast, the present proof method is based on a completely different approach centered around the abstract classification of measures on cubes that we present in Section 2.
Since the “vanilla” second moment method fails on the random k‐SAT model, more sophisticated variants have been proposed. The basic idea is to apply the second moment method not to the partition function itself but to a tweaked random variable. For instance, Achlioptas and Moore [2] applied the second moment method to the number of NAE‐satisfying assignments, i.e., assignments such that both the assignment and its binary inverse satisfy all clauses. However, the number of NAE‐satisfying assignments is exponentially smaller than the total number of satisfying assignments and thus this type of argument cannot yield the typical value of the partition function. The same is true of the more subtle random variable of Achlioptas and Peres [6]. Furthermore, the work of Ding, Sly and Sun [26] that yields the precise k‐SAT threshold for large k is based on applying the second moment method to a random variable whose construction is guided by the 1RSB cavity method. Among other things, the random variable from [26] incorporates conditioning on the local structure of the factor graph, an idea that will be fundamental to our arguments as well.
The material of Section 2 of the present paper has recently been investigated from the more analytic viewpoint of the theory of graph limits [18]. This leads to a general notion of limits of probability measures on discrete cubes. The article [18] also discusses the connection with the Aldous‐Hoover representation of exchangeable arrays, which has long been known to be related to the theory of graph limits, and Panchenko's notion of asymptotic Gibbs measures [7, 33, 44, 45]. A further recent application of the methods of the present paper to two special classes of random factor graph models can be found in [19].
1.2. Notation
If $\mathcal X$ is a finite set, then we denote by $\mathcal P(\mathcal X)$ the set of probability measures on $\mathcal X$. Moreover, $\|\cdot\|_{TV}$ signifies the total variation norm. If μ is a probability measure on a product space $\Omega^V$ for finite sets Ω, V and $U\subset V$, then $\mu_U$ denotes the marginal distribution of μ on U. That is, if $\sigma\in\Omega^U$, then
$$\mu_U(\sigma)=\sum_{\tau\in\Omega^V:\,\tau|_U=\sigma}\mu(\tau).$$
If $U=\{u_1,\ldots,u_k\}$ for some $u_1,\ldots,u_k\in V$, then we briefly write $\mu_{u_1,\ldots,u_k}$ rather than $\mu_{\{u_1,\ldots,u_k\}}$. Further, if $S\subset\Omega^V$ is an event with $\mu(S)>0$, then $\mu[\cdot\,|\,S]$ is the conditional distribution given S. That is, for any event $A\subset\Omega^V$ we have $\mu[A\,|\,S]=\mu(A\cap S)/\mu(S)$.
The entropy of a probability measure $\mu\in\mathcal P(\mathcal X)$ is denoted by $H(\mu)$. Thus, with the convention that $0\ln0=0$ we have $H(\mu)=-\sum_{x\in\mathcal X}\mu(x)\ln\mu(x)$. Further, agreeing that $0\ln\frac00=0$ as well, we recall that the Kullback‐Leibler divergence of $\mu,\nu\in\mathcal P(\mathcal X)$ is
$$D_{\mathrm{KL}}(\mu\,\|\,\nu)=\sum_{x\in\mathcal X}\mu(x)\ln\frac{\mu(x)}{\nu(x)}.$$
We are going to work a great deal with probability measures on sets $\Omega^n$ for a (small) finite Ω and a large integer n. If $\mu\in\mathcal P(\Omega^n)$, then we write $\sigma_\mu,\tau_\mu$ for two independent samples from μ. Where μ is obvious from the context we just write σ, τ. Additionally, if $X:\Omega^n\to\mathbb R$ is a random variable, then $\langle X\rangle_\mu$ stands for the expectation of X with respect to μ. Further, if $\sigma\in\Omega^n$ and $\emptyset\neq U\subset[n]$, then we let
$$\sigma[U]=\frac1{|U|}\sum_{x\in U}\delta_{\sigma_x}\in\mathcal P(\Omega).$$
Thus, $\sigma[U]$ is a probability distribution on Ω, namely the empirical distribution of $\sigma_x$ for a uniformly random $x\in U$. If $U=\{x\}$ for some $x\in[n]$, then we just write $\sigma[x]$ rather than $\sigma[\{x\}]$. Clearly, $\langle\sigma[U]\rangle_\mu=\frac1{|U|}\sum_{x\in U}\mu_x$.
We use the notation $\langle\,\cdot\,\rangle_\mu$ for averages over μ to avoid confusion with averages over other, additional random quantities, for which we reserve the common symbols $\mathbb E$ and $\mathbb P$. Furthermore, we frequently work with conditional expectations. Hence, let us recall that for a probability space $(\mathcal X,\mathcal A,P)$, a random variable X and a σ‐algebra $\mathcal F\subset\mathcal A$ the conditional expectation $\mathbb E[X\,|\,\mathcal F]$ is a $\mathcal F$‐measurable random variable on $\mathcal X$ such that for every $\mathcal F$‐measurable event F we have $\mathbb E[X\mathbf 1_F]=\mathbb E[\mathbb E[X\,|\,\mathcal F]\mathbf 1_F]$. Moreover, recall that the conditional variance is defined as $\mathrm{Var}[X\,|\,\mathcal F]=\mathbb E[X^2\,|\,\mathcal F]-\mathbb E[X\,|\,\mathcal F]^2$.
In line with the two previous paragraphs, if $Y:\Omega^n\to\mathbb R$ is a random variable and $\mathcal F$ is a σ‐algebra on $\Omega^n$, then we write $\langle Y\,|\,\mathcal F\rangle_\mu$ for the conditional expectation, which is a $\mathcal F$‐measurable random variable $\Omega^n\to\mathbb R$. Accordingly, for an event A with $\mu(A)>0$ we write $\langle Y\,|\,A\rangle_\mu$ for the expectation of Y given A.
Finally, we need the Paley‐Zygmund inequality [43, Lemma 19]: if $X\ge0$ is a random variable with a finite second moment, then for any $0<t<1$,
$$\Pr\left[X>t\,\mathbb E[X]\right]\ge(1-t)^2\,\frac{\mathbb E[X]^2}{\mathbb E[X^2]}. \qquad (1.3)$$
2. PROBABILITY MEASURES ON THE CUBE
In this section we present a general “regularity lemma” for probability measures on sets $\Omega^n$ for some finite set Ω and a large integer n (Theorem 2.1 below).
2.1. Examples
Needless to say, probability distributions on sets $\Omega^n$ for a small finite Ω and a large integer n are ubiquitous. To get an idea of what we might hope to prove about them in general, let us look at a few examples.
The simplest case certainly is a product measure $\mu=p^{\otimes n}$ with $p\in\mathcal P(\Omega)$. By the Chernoff bound, for any fixed $\varepsilon>0$ there is $\delta>0$ such that for every $U\subset[n]$ of size $|U|\ge\delta n$ we have
$$\mu\left(\left\{\sigma:\|\sigma[U]-p\|_{TV}\le\varepsilon\right\}\right)\ge1-\varepsilon. \qquad (2.1)$$
In words, if we fix a large enough set U of coordinates and then choose σ from μ at random, then with probability close to one the empirical distribution $\sigma[U]$ will be close to p.
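For illustration, here is a small Monte Carlo check of (2.1) for a product measure (a self‐contained sketch with made‐up parameters, not code from the paper):

```python
import collections, random

def tv(p, q):
    """Total variation distance between dicts mapping value -> probability."""
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))

def empirical(sigma, U):
    """sigma[U]: the empirical distribution of the coordinates in U."""
    counts = collections.Counter(sigma[x] for x in U)
    return {w: c / len(U) for w, c in counts.items()}

rng = random.Random(1)
n, p = 10_000, {"a": 0.2, "b": 0.3, "c": 0.5}      # mu = p tensored n times
values, weights = zip(*p.items())
U = rng.sample(range(n), 500)                      # a large set of coordinates
dists = [tv(empirical(rng.choices(values, weights, k=n), U), p)
         for _ in range(100)]
print("typical ||sigma[U] - p||_TV ~", sum(dists) / len(dists))
```

With these parameters the observed distances concentrate around a few percent, in line with the Chernoff bound.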
As a twist on the previous example, let $p\in\mathcal P(\Omega)$, assume that n is a square and define a measure μ by letting
$$\mu(\sigma)=\prod_{i=1}^{\sqrt n}p\big(\sigma_{(i-1)\sqrt n+1}\big)\,\mathbf 1\big\{\sigma_{(i-1)\sqrt n+1}=\cdots=\sigma_{i\sqrt n}\big\}.$$
In words, the coordinates come in blocks of size $\sqrt n$. While the values of all the coordinates in one block coincide and have distribution p, the coordinates in different blocks are independent. Although μ is not a product distribution, (2.1) is satisfied for any fixed $\varepsilon>0$ and large enough n. Furthermore, if for a fixed k > 1 we choose $x_1,\ldots,x_k\in[n]$ uniformly and independently, then
$$\mathbb E\,\big\|\mu_{x_1,\ldots,x_k}-\mu_{x_1}\otimes\cdots\otimes\mu_{x_k}\big\|_{TV}<\varepsilon \qquad (2.2)$$
provided that n is sufficiently large. This is because for large enough n it is unlikely that two of the randomly chosen $x_1,\ldots,x_k$ belong to the same block.
As a third example, consider the set $\{0,1\}^n$ and let μ be the even mixture of two product measures, one biased towards 0 and the other, symmetrically, towards 1. All the marginals $\mu_x$, $x\in[n]$, are equal to the uniform distribution on {0, 1}. But of course the uniform distribution on $\{0,1\}^n$ is a horrible approximation to μ. Indeed, by the Chernoff bound with overwhelming probability a point σ drawn from μ either satisfies $\sum_x\sigma_x<n/2-\Omega(n)$ or $\sum_x\sigma_x>n/2+\Omega(n)$. However, the conditional distribution given, say, $\sum_x\sigma_x<n/2$, is close to a product measure. Thus, μ induces a decomposition of $\{0,1\}^n$ into two “states” $S_1,S_2$ such that $\mu[\cdot\,|\,S_1],\mu[\cdot\,|\,S_2]$ are close to product measures.
As a final example, consider $\Omega=\{0,1\}$, assume that n is even and define $\mu=p^{\otimes n/2}\otimes q^{\otimes n/2}$ for two distinct $p,q\in\mathcal P(\Omega)$. In words, μ is a product measure with marginal distribution p on the first $n/2$ coordinates and q on the other coordinates. Clearly, μ satisfies (2.1) with p for sets $U\subset\{1,\ldots,n/2\}$ and with q for sets $U\subset\{n/2+1,\ldots,n\}$, provided that n is large.
In summary, the following picture emerges. The conditions (2.1) and (2.2) are proxies for saying that a given measure μ resembles a product measure. Furthermore, in order to obtain from a given μ measures that satisfy (2.1) or (2.2) it may be necessary to decompose the space $\Omega^n$ into “states” $S_i$ so that the conditional distributions $\mu[\cdot\,|\,S_i]$ have these properties. In addition, because different coordinates may have different marginal distributions, for (2.1) to hold it may be necessary to partition the set of coordinates.
2.2. Homogeneity
The main result of this section shows that by partitioning the space $\Omega^n$ and/or the set $[n]$ of coordinates it is always possible to “approximate” a given measure μ by measures that satisfy (2.1) for some suitable p as well as (2.2). In fact, the number of parts that we have to partition $\Omega^n$ and $[n]$ into is bounded only in terms of the desired accuracy but independently of n.
Let us introduce some terminology. If $\mathcal V=(V_1,\ldots,V_k)$ is a partition of some set V, then we call $\#\mathcal V=k$ the size of $\mathcal V$. Furthermore, a partition $\mathcal W$ refines another partition $\mathcal V$ if for each $W\in\mathcal W$ there is $V\in\mathcal V$ such that $W\subset V$.
For $\delta>0$ we say that $\mu\in\mathcal P(\Omega^n)$ is δ‐regular on a set $U\subset[n]$ if for every subset $W\subset U$ of size $|W|\ge\delta|U|$ we have
$$\big\langle\,\big\|\sigma[W]-\langle\sigma[U]\rangle_\mu\big\|_{TV}\,\big\rangle_\mu\le\delta.$$
Further, μ is δ‐regular with respect to a partition $\mathcal V=(V_1,\ldots,V_k)$ of $[n]$ if there is a set $R\subset[k]$ with $\sum_{i\in R}|V_i|\ge(1-\delta)n$ such that μ is δ‐regular on $V_i$ for all $i\in R$. Additionally, if $\mathcal S=(S_1,\ldots,S_\ell)$ is a partition of $\Omega^n$ and $\mathcal V=(V_1,\ldots,V_k)$ is a partition of $[n]$, then we say that μ is ε‐homogeneous with respect to $(\mathcal S,\mathcal V)$ if there is a subset $I\subset[\ell]$ such that the following is true.
1. HM1: We have for all and .
2. HM2: For all and we have
3. HM3: For all $i\in I$ the measure $\mu[\cdot\,|\,S_i]$ is ε‐regular with respect to $\mathcal V$.
4. HM4: μ is ε‐regular with respect to $\mathcal V$.
Theorem 2.1
For any $\varepsilon>0$ and any integer $k_0\ge1$ there exists $N=N(\varepsilon,k_0,\Omega)>0$ such that for every n > N, any measure $\mu\in\mathcal P(\Omega^n)$ and any partition $\mathcal V_0$ of $[n]$ of size at most $k_0$ the following is true. There exist a refinement $\mathcal V$ of $\mathcal V_0$ and a partition $\mathcal S$ of $\Omega^n$ such that $\#\mathcal V,\#\mathcal S\le N$ and such that μ is ε‐homogeneous with respect to $(\mathcal S,\mathcal V)$.
Informally speaking, Theorem 2.1 shows that any probability measure $\mu\in\mathcal P(\Omega^n)$ admits a partition such that the following is true. Almost the entire probability mass of μ belongs to parts $S_i$ such that the conditional measure $\mu[\cdot\,|\,S_i]$ is ε‐regular w.r.t. $\mathcal V$. This means that almost every coordinate belongs to a class $V_j$ such that for every “large” $W\subset V_j$, for σ chosen from $\mu[\cdot\,|\,S_i]$ very likely the empirical distribution $\sigma[W]$ is close to the marginal distribution of the entire class.
Theorem 2.1 and its proof, which we defer to Section 2.3, are inspired by Szemerédi's regularity lemma [49]. Let us proceed to state a few consequences of Theorem 2.1.
An ε‐state of μ is a set $S\subset\Omega^n$ such that $\mu(S)>0$ and
$$\frac1{n^2}\sum_{x,y=1}^n\big\|\mu[\cdot\,|\,S]_{x,y}-\mu[\cdot\,|\,S]_x\otimes\mu[\cdot\,|\,S]_y\big\|_{TV}<\varepsilon.$$
In other words, if we choose two coordinates $x,y\in[n]$ independently and uniformly at random, then the expected total variation distance between the joint distribution of $\sigma_x,\sigma_y$ under $\mu[\cdot\,|\,S]$ and the product of the corresponding marginal distributions is small.
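The ε‐state condition is easy to probe empirically. The sketch below (an illustration under our own toy setup: the mixture measure from the third example above, realised with biases 1/4 and 3/4) estimates the average pairwise total variation distance once for samples from μ itself and once within a single state:

```python
import random

def tv(p, q):
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))

def pair_decorrelation(samples, num_pairs, rng):
    """Monte Carlo estimate of the average, over random coordinate pairs
    (x, y), of || mu_{x,y} - mu_x (x) mu_y ||_TV, from a list of samples."""
    n, m, total = len(samples[0]), len(samples), 0.0
    for _ in range(num_pairs):
        x, y = rng.randrange(n), rng.randrange(n)
        joint, mx, my = {}, {}, {}
        for s in samples:
            joint[s[x], s[y]] = joint.get((s[x], s[y]), 0) + 1
            mx[s[x]] = mx.get(s[x], 0) + 1
            my[s[y]] = my.get(s[y], 0) + 1
        prod = {(a, b): mx[a] * my[b] / m ** 2 for a in mx for b in my}
        total += tv({kv: c / m for kv, c in joint.items()}, prod)
    return total / num_pairs

rng = random.Random(2)
n = 200
def draw(bias=None):
    b = rng.choice((0.25, 0.75)) if bias is None else bias
    return tuple(int(rng.random() < b) for _ in range(n))

mixed  = [draw()     for _ in range(2000)]   # samples from mu
single = [draw(0.75) for _ in range(2000)]   # samples from mu[.|S], one state
print("mixture:", pair_decorrelation(mixed, 50, rng))   # ~0.125: not a state
print("state  :", pair_decorrelation(single, 50, rng))  # ~0: an eps-state
```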
Corollary 2.2
For any $\varepsilon>0$ there exists $N=N(\varepsilon,\Omega)>0$ such that for every $n>N$ any measure $\mu\in\mathcal P(\Omega^n)$ has pairwise disjoint ε‐states $S_1,\ldots,S_\ell$ with $\ell\le N$ such that $\mu(S_i)>0$ for all $i\le\ell$ and $\sum_{i\le\ell}\mu(S_i)\ge1-\varepsilon$.
Thus, we can chop the space $\Omega^n$ into subsets $S_1,\ldots,S_\ell$ that capture almost the entire probability mass such that $\mu[\cdot\,|\,S_i]$ “resembles a product measure” for each $i\le\ell$. We prove Corollary 2.2 in Section 2.4.
Let us call μ ε‐symmetric if the entire cube $\Omega^n$ itself is an ε‐state.
Corollary 2.3
For any $\varepsilon>0$ there exists δ > 0 such that for any Ω there is $N>0$ such that for all $n>N$ and all $\mu\in\mathcal P(\Omega^n)$ the following is true. If for any two δ‐states $S_1,S_2$ with $\mu(S_1),\mu(S_2)\ge\delta$ we have
$$\frac1n\sum_{x=1}^n\big\|\mu[\cdot\,|\,S_1]_x-\mu[\cdot\,|\,S_2]_x\big\|_{TV}<\delta, \qquad (2.3)$$
then μ is ε‐symmetric.
Thus, the entire measure μ “resembles a product measure” if extensive states have similar marginal distributions. Conversely, we have the following.
Corollary 2.4
For any $\varepsilon>0$ there exists δ > 0 such that for all sufficiently large n and all $\mu\in\mathcal P(\Omega^n)$ the following is true. If μ is δ‐symmetric, then for any $S\subset\Omega^n$ with $\mu(S)\ge\varepsilon$ we have
$$\frac1n\sum_{x=1}^n\big\|\mu[\cdot\,|\,S]_x-\mu_x\big\|_{TV}\le\varepsilon.$$
The proofs of Corollaries 2.3 and 2.4 can be found in Sections 2.5 and 2.6, respectively. Finally, in Section 2.7 we prove the following fact that will be useful in Section 4.
Proposition 2.5
For any $\varepsilon>0$ there exists $\delta>0$ such that for large enough n the following is true. If $\mu\in\mathcal P(\Omega^n)$ is δ‐symmetric, then $\mu\otimes\mu$, viewed as a measure on $(\Omega\times\Omega)^n$, is ε‐symmetric.
2.3. Proof of Theorem 2.1
Throughout this section we assume that n is sufficiently large. To prove Theorem 2.1 and guided by [49], we define the index of μ with respect to a partition $\mathcal V=(V_1,\ldots,V_k)$ of $[n]$ as
The index can be viewed as a conditional variance (cf. [50]). Indeed, choose $x\in[n]$ uniformly and independently of σ. Furthermore, let $\mathcal F$ be the σ‐algebra generated by the events $\{x\in V_j\}$ for $j\in[k]$. Writing $\mathbb E_x$ and $\mathrm{Var}_x$ for the expectation and variance with respect to the choice of x only, we see that
Lemma 2.6
For any partition of we have . If is a refinement of , then
The fact that is immediate from the definition. Moreover, if refines , then . Consequently, . Averaging over yields .
Lemma 2.7
If fails to be ‐regular with respect to , then there is a refinement of such that and
Let be the set of all indices such that there exists of size such that
(2.4) Since μ fails to be ‐regular with respect to we have
(2.5) For each pick a set such that (2.4) is satisfied. Then there exists such that
(2.6) Let be the partition obtained from by splitting each class V j, , into the sub‐classes . Clearly, . Furthermore,
(2.7)
If then (2.6) implies that on V j we have
(2.8) Hence, combining (2.5) and (2.8), we find
(2.9)
Proof of Theorem 2.1
The set $\mathcal P(\Omega)$ is compact. Therefore, there exists a partition of $\mathcal P(\Omega)$ into finitely many pairwise disjoint sets such that any two measures that lie in the same class have small total variation distance.
Given any partition $\mathcal V$ of $[n]$, we can construct a corresponding decomposition of $\Omega^n$ as follows. Call $\sigma,\tau\in\Omega^n$ $\mathcal V$‐equivalent if for every $V\in\mathcal V$ the empirical distributions $\sigma[V],\tau[V]$ belong to the same class of that partition of $\mathcal P(\Omega)$. Then the decomposition comprises the equivalence classes.
We construct the desired partition of inductively, starting from any given partition of size at most . The construction stops once μ is ‐homogeneous with respect to . Assuming that this is not the case, we obtain from as follows. If μ fails to be ‐regular with respect to , then we let be the partition promised by Lemma 2.7, which guarantees that
(2.10) Otherwise let and for the sake of brevity. Further, let for with . Moreover, let be the set of all such that and fails to be ‐regular with respect to . If μ fails to be ‐homogeneous with respect to but μ is ‐regular w.r.t. , then
(2.11) Lemma 2.7 shows that for any there exists a refinement of such that
(2.12) Let be the coarsest common refinement of all the partitions . Then
(2.13) In addition, (2.12) and Lemma 2.6 imply
(2.14) Therefore, by (2.11), (2.14) and Bayes’ rule
(2.15)
Combining (2.10), (2.15) and Lemma 2.6, we conclude that μ is ‐homogeneous with respect to for some . Finally, (2.13) entails that are bounded in terms of only.
2.4. Proof of Corollary 2.2
To derive Corollary 2.2 from Theorem 2.1 we use the following handy sufficient condition for ε‐symmetry.
Lemma 2.8
For any there is such that for large enough n the following is true. Assume that is δ‐regular with respect to a partition and set for . If
(2.16) then μ is ‐symmetric.
Choose a small and a smaller . Then (2.16) implies that there is satisfying
(2.17) such that for all we have
(2.18) In particular, we claim that (2.18) implies the following (if ξ is small enough):
(2.19)
Indeed, assume that and for some . Then because is a probability measure on Ω for every x, there exists such that the set has size . In particular, Therefore, by Markov's inequality
Consequently, we obtain
Since , this is a contradiction to (2.18).
Now, fix any and let be chosen independently and uniformly at random. Let be the event that for all . We are going to show that for ,
(2.20) In the case h = 0 there is nothing to show. As for the inductive step, condition on .
Case 1: regardless of the choice of we have
Case 2:
Hence, (2.20) follows.
To complete the proof, we are going to show by induction on that
(2.21) For h = 1 there is nothing to show. To proceed from h to h + 1 we use the triangle inequality to write
Invoking the induction hypothesis and (2.20) completes the proof.
Proof of Corollary 2.2
For a small enough let be a pair of partitions of size at most such that μ is ‐homogeneous with respect to as guaranteed by Theorem 2.1. Let and let J be the set of all such that and such that is δ‐regular with respect to . Then
Furthermore, for every the measure satisfies (2.16) due to HM2. Therefore, Lemma 2.8 implies that is ‐symmetric. Consequently, the sets are pairwise disjoint ‐states with for all and .
2.5. Proof of Corollary 2.3
Pick small enough . Then by Theorem 2.1 μ is γ‐homogeneous with respect to for partitions that satisfy . Let contain all j such that is γ‐regular with respect to and such that . Let . Then by HM2 for every we have
Therefore, Lemma 2.8 implies that S j is a ‐state. Consequently, our assumption (2.3) and the triangle inequality entail that for all ,
| (2.22) |
Choosing η small, we can ensure that . Therefore, letting , we obtain from (2.22)
| (2.23) |
Since μ is γ‐regular and thus ‐regular w.r.t. by HM4, (2.23) and Lemma 2.8 imply that μ is ‐symmetric.
2.6. Proof of Corollary 2.4
Choose small enough, assume that satisfies and that μ is ‐symmetric. Assume for contradiction that
| (2.24) |
Let
Then (2.24) entails that . Therefore, there is such that for either or s = – 1. Let for the sake of brevity. Of course, by the definition of ,
| (2.25) |
Since μ is ‐symmetric,
| (2.26) |
On the other hand we have
| (2.27) |
Finally, plugging (2.25) and (2.26) into (2.27), we find , which is a contradiction if δ is small enough.
2.7. Proof of Proposition 2.5
Choose small enough and an even smaller and assume that μ is ‐symmetric. Suppose that μ is χ‐homogeneous with respect to a partition such that as promised by Theorem 2.1. Let J be the set of all such that . Moreover, let I be the set of all such that μ is χ‐regular on V i and . By Corollary 2.4 we have
provided that δ is chosen small enough. Therefore, letting , for all we have
| (2.28) |
Fix some . We claim that is α‐regular on V i. Hence, let be a set of size and let
Then (2.28) implies that , because μ is γ‐regular on V i. Now, fix some . For let . Let
If , then due to (2.28) and γ‐regularity we obtain, by a similar token as previously, . Consequently, the event that occurs for all ω satisfying has probability at least . Therefore, for any we obtain
Summing over all and choosing γ small enough, we conclude that is α‐regular on V i.
Finally, (2.28) implies that satisfies
Therefore, picking α small enough, we can apply Lemma 2.8 to conclude that is ‐symmetric.
3. FACTOR GRAPHS
3.1. Examples
The aim in this section is to set up a comprehensive framework for the study of “random factor graphs” and their corresponding Gibbs measures. To get started let us ponder a few concrete examples.
In the Ising model on a graph $G=(V,E)$ the variables of the problem are just the vertices of the graph. The values available for each variable are ±1. Thus, an assignment is simply a map $\sigma:V\to\{\pm1\}$. Moreover, each edge of G gives rise to a constraint. Specifically, given a parameter $\beta>0$ we define a weight function $\psi_e$ corresponding to the edge $e=\{u,v\}$ by letting $\psi_e(\sigma)=\exp(\beta\mathbf 1\{\sigma_u=\sigma_v\})$. Thus, edges give larger weight to assignments σ such that $\sigma_u=\sigma_v$ than in the case $\sigma_u\neq\sigma_v$. The corresponding partition function reads
$$Z_\beta(G)=\sum_{\sigma:V\to\{\pm1\}}\prod_{e\in E}\psi_e(\sigma).$$
Further, the Gibbs distribution induced by G, β is the probability measure on $\{\pm1\}^V$ defined by
$$\mu_{G,\beta}(\sigma)=\frac1{Z_\beta(G)}\prod_{e\in E}\psi_e(\sigma).$$
Thus, $\mu_{G,\beta}$ weighs assignments according to the number of edges $\{u,v\}$ such that $\sigma_u=\sigma_v$.
The Ising model has been studied extensively in the mathematical physics literature on various classes of graphs, including and particularly random graphs. For instance, if $\mathbb G=\mathbb G(n,d)$ is a random regular graph of degree d on n vertices, then $\frac1n\ln Z_\beta(\mathbb G)$ is known to “converge” to the value predicted by the cavity method [22]. Formally, the cavity method yields a certain number $B=B(d,\beta)$ such that
$$\lim_{n\to\infty}\frac1n\mathbb E\left[\ln Z_\beta(\mathbb G)\right]=B. \qquad (3.1)$$
Because $Z_\beta(\mathbb G)$ is exponential in n with high probability, the scaling applied in (3.1) is the appropriate one to obtain a finite limit. Furthermore, by Azuma's inequality $\ln Z_\beta(\mathbb G)$ is concentrated about its expectation. Therefore, (3.1) implies that $\frac1n\ln Z_\beta(\mathbb G)$ converges to B in probability.
The Potts antiferromagnet on a graph $G=(V,E)$ can be viewed as a twist on the Ising model. In this case we look at assignments $\sigma:V\to\{1,\ldots,k\}$ for some number $k\ge2$. The weight functions associated with the edges $e=\{u,v\}$ are defined by $\psi_e(\sigma)=\exp(-\beta\mathbf 1\{\sigma_u=\sigma_v\})$ for some $\beta>0$. Thus, this time the edges prefer that the incident vertices receive different values. The Gibbs measure and the partition function read
$$\mu_{G,\beta}(\sigma)=\frac1{Z_\beta(G)}\prod_{e\in E}\psi_e(\sigma),\qquad Z_\beta(G)=\sum_{\sigma:V\to\{1,\ldots,k\}}\prod_{e\in E}\psi_e(\sigma).$$
While it is known that the limit $\lim_{n\to\infty}\frac1n\mathbb E[\ln Z_\beta(\mathbb G)]$ exists and that $\ln Z_\beta(\mathbb G)$ is concentrated about its expectation [15], the precise value remains elusive for a wide range of d, β (in contrast to the ferromagnetic version of the model [24]). However, it is not difficult to see that for sufficiently large values of d, β we have [12]
$$\mathbb E[\ln Z_\beta(\mathbb G)]\le\ln\mathbb E[Z_\beta(\mathbb G)]-\Omega(n).$$
Hence, just like in the random k‐SAT model the first moment overshoots the actual value of the partition function by an exponential factor. The Potts model is closely related to the k‐colorability problem. Indeed, if we think of the k possible values as colors, then for large β the Gibbs measure concentrates on colorings with few monochromatic edges.
As a third example let us consider the following version of the random k‐SAT model. Let $k\ge2$, $\Delta\ge1$ be fixed integers, let $V=\{x_1,\ldots,x_n\}$ be a set of Boolean variables and let $d_+,d_-:V\to\{0,\ldots,\Delta\}$ be maps such that
$$m=\frac1k\sum_{x\in V}d_+(x)+d_-(x)$$
is an integer. Then we let Φ be a random k‐CNF formula with m clauses in which each variable x appears precisely $d_+(x)$ times as a positive literal and precisely $d_-(x)$ times as a negative literal. As in Section 1, for a clause a and a truth assignment σ we let $E_a(\sigma)=\mathbf 1\{\sigma\text{ violates }a\}$. Then for a given parameter $\beta>0$ we obtain a Gibbs measure $\mu_{\Phi,\beta}$ that weighs assignments by the number of clauses that they violate and a corresponding partition function $Z_\beta(\Phi)$, cf. (1.1), (1.2). Hence, for given β and degree assignments the problem of determining $\frac1n\mathbb E[\ln Z_\beta(\Phi)]$ arises. This question is anything but straightforward even in the special case that $d_+(x)=d_-(x)$ is the same for all x. In [11] we show how the results of the present paper can be put to work to tackle this case.
3.2. Random Factor Graphs
The following definition encompasses a variety of concrete models.
Definition 3.1
Let $\Delta>0$ be an integer, let $\Omega,\Theta$ be finite sets and let Ψ be a finite set of functions $\psi:\Omega^{k(\psi)}\to(0,\infty)$ of arity $k(\psi)\le\Delta$. A $(\Delta,\Omega,\Theta,\Psi)$‐model consists of
1. M1: a countable set V of variable nodes,
2. M2: a countable set F of constraint nodes,
3. M3: a map $d:V\cup F\to\{1,\ldots,\Delta\}$ such that
$$\sum_{x\in V}d(x)=\sum_{a\in F}d(a), \qquad (3.2)$$
4. M4: a map $t:C_V\cup C_F\to\Theta$, where we let
$$C_V=\{(x,i):x\in V,\,i\in[d(x)]\},\qquad C_F=\{(a,i):a\in F,\,i\in[d(a)]\},$$
such that
$$|t^{-1}(\theta)\cap C_V|=|t^{-1}(\theta)\cap C_F|\quad\text{for all }\theta\in\Theta, \qquad (3.3)$$
5. M5: a map $a\in F\mapsto\psi_a\in\Psi$ such that for all $a\in F$ the arity of $\psi_a$ equals d(a).
The size of the model is $|V|$. Furthermore, a $(\Delta,\Omega,\Theta,\Psi)$‐factor graph is a bijection $G:C_V\to C_F$ such that for all $(x,i)\in C_V$
$$t(G(x,i))=t(x,i).$$
Of course, (3.2) and (3.3) require that either both quantities are infinite or both are finite.
The semantics is that Δ is the maximum degree of a factor graph. Moreover, Ω is the set of possible values that the variables of the model range over, e.g., the set {±1} in the Ising model. Further, Θ is a set of “types”. For instance, in the random k‐SAT model the types can be used to specify the signs of the literals. Additionally, Ψ is a set of possible weight functions.
A model comes with a set V of variable nodes and a set F of constraint nodes. The degrees of these nodes are prescribed by the map d. Just like in the “configuration model” of graphs with a given degree sequence we create d(v) “clones” of each node v. The sets $C_V$, $C_F$ contain the clones of the variable and constraint nodes, respectively. Further, the map t assigns a type to each “clone” of either a constraint or variable node and each constraint node a comes with a weight function $\psi_a$.
A $(\Delta,\Omega,\Theta,\Psi)$‐factor graph is thus a type‐preserving matching G of the variable and constraint clones. Let $\mathcal G$ be the set of all such factor graphs and write $\mathbf G$ for a uniformly random sample from $\mathcal G$. Contracting the clones of each node, we obtain a bipartite (multi‐)graph with variable nodes V and constraint nodes F. We often identify $\mathbf G$ with this multi‐graph. For instance, if we speak of the distance of two vertices in $\mathbf G$ we mean the length of a shortest path in this multi‐graph.
For a clone we denote by the clone that G matches (x, i) to. Similarly, for we write for the variable clone (x, i) such that . Moreover, for a variable x we let and analogously for we set . To economise notation we sometimes identify a clone (x, i) with the underlying variable x. For instance, if is an assignment, then we take the liberty of writing . Additionally, where convenient we view as the set of all constraint nodes such that there exist such that . The corresponding convention applies to .
An assignment is a map $\sigma:V\to\Omega$ and we define $\psi_a(\sigma)$ as the weight that $\psi_a$ assigns to the values of the variables incident to a (in the order of the clones of a).
Further, the Gibbs distribution and the partition function of G are
$$\mu_G(\sigma)=\frac1{Z(G)}\prod_{a\in F}\psi_a(\sigma),\qquad Z(G)=\sum_{\tau\in\Omega^V}\prod_{a\in F}\psi_a(\tau). \qquad (3.4)$$
We denote expectations with respect to the Gibbs measure $\mu_G$ by $\langle\,\cdot\,\rangle_G$.
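To make (3.4) concrete, the following sketch (our own toy encoding of a factor graph, not the paper's data structures) computes the Gibbs measure and the partition function of a small instance by brute force:

```python
import itertools, math

Omega = (-1, 1)
n, beta = 6, 0.5
# Ising-style constraints on a 6-cycle: psi_e = exp(beta * 1{u == v});
# each constraint is a (weight function, incident variables) pair.
constraints = [((lambda u, v: math.exp(beta * (u == v))), (i, (i + 1) % n))
               for i in range(n)]

def weight(sigma):
    """Product of the constraint weights, cf. (3.4)."""
    w = 1.0
    for psi, incident in constraints:
        w *= psi(*(sigma[x] for x in incident))
    return w

Z = sum(weight(s) for s in itertools.product(Omega, repeat=n))
gibbs = {s: weight(s) / Z for s in itertools.product(Omega, repeat=n)}
print("Z =", round(Z, 4), " total Gibbs mass =", round(sum(gibbs.values()), 6))
```

Since all weight functions are strictly positive, Z > 0 and the Gibbs measure is well‐defined, exactly as required by Definition 3.1.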
The fundamental problem that arises is the study of the random variable $Z(\mathbf G)$. As mentioned in Section 1, this random variable holds the key to getting a handle on the Gibbs measure and thus on the combinatorics of the problem. The following proposition establishes concentration about the expectation. For two factor graphs G, G′ let
| (3.5) |
Proposition 3.2
For any $\varepsilon>0$ there exists $\delta>0$ such that for any $(\Delta,\Omega,\Theta,\Psi)$‐model of size n, with n sufficiently large, we have
$$\mathbb P\left[\left|\ln Z(\mathbf G)-\mathbb E[\ln Z(\mathbf G)]\right|\ge\varepsilon n\right]\le2\exp(-\delta n).$$
Proof. There exists a number $c>0$ that depends on Δ, Ψ only such that for any two factor graphs G, G′ we have $|\ln Z(G)-\ln Z(G')|\le c\cdot\operatorname{dist}(G,G')$. Therefore, the assertion follows from Azuma's inequality.
Thus, Proposition 3.2 reduces our task to calculating the expectation $\mathbb E[\ln Z(\mathbf G)]$. Generally, the standard first and second moment method do not suffice to tackle this problem because the logarithm sits inside the expectation. While, of course, Jensen's inequality guarantees that
$$\mathbb E[\ln Z(\mathbf G)]\le\ln\mathbb E[Z(\mathbf G)], \qquad (3.6)$$
equality does not typically hold. In fact, we already saw examples where $\ln\mathbb E[Z(\mathbf G)]-\mathbb E[\ln Z(\mathbf G)]$ is linear in the size of the model. If so, then the Paley‐Zygmund inequality (1.3) entails that $\ln\mathbb E[Z(\mathbf G)^2]-2\ln\mathbb E[Z(\mathbf G)]$ is linear in n as well, dooming the second moment method. Furthermore, even if (3.6) holds with asymptotic equality the second moment method does not generally succeed [20]. Let us now revisit the examples from Section 3.1.
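Before doing so, let us spell out the Paley‐Zygmund step used above (a short derivation under the stated assumption of a linear first‐moment gap; we take t = 1/2 in (1.3)):

```latex
% Suppose E[Z^2] <= C * E[Z]^2 for a constant C > 0. Paley-Zygmund with t = 1/2 gives
\Pr\Big[Z \ge \tfrac12\,\mathbb{E}[Z]\Big]
   \;\ge\; \Big(\tfrac12\Big)^2 \frac{\mathbb{E}[Z]^2}{\mathbb{E}[Z^2]}
   \;\ge\; \frac{1}{4C},
% i.e. with probability at least 1/(4C) we would have ln Z >= ln E[Z] - ln 2.
% This contradicts ln Z <= ln E[Z] - Omega(n) w.h.p., and therefore
% E[Z^2] / E[Z]^2 = exp(Omega(n)).
```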
Example 3.3
(the Ising model on the random d‐regular graph). Suppose that $d\ge1$ and $\beta>0$. Let $\Omega=\{\pm1\}$, let $\Theta=\{\theta\}$ consist of a single type, and set $\Psi=\{\psi_\beta\}$ with $\psi_\beta(\sigma_1,\sigma_2)=\exp(\beta\mathbf 1\{\sigma_1=\sigma_2\})$. Further, given n such that dn is even we define a model by letting $V=\{x_1,\ldots,x_n\}$, $F=\{a_1,\ldots,a_{dn/2}\}$, d(x) = d for all $x\in V$, d(a) = 2 for all $a\in F$, $t\equiv\theta$, and $\psi_a=\psi_\beta$ for all $a\in F$. Thus, all clones have the same “type” and all constraint nodes have arity two and the same weight function. Hence, the random graph $\mathbf G$ is obtained by matching the dn variable clones randomly to the dn constraint clones. If we simply replace the constraint nodes, which have degree two, by edges joining the two adjacent variable nodes, then the resulting random multigraph is contiguous to the uniformly random d‐regular graph on n vertices. In this model (3.6) holds with (asymptotic) equality for all d, β [22].
Example 3.4
(the Potts antiferromagnet on the random d‐regular graph). The construction is similar to the previous example, except that $\Omega=\{1,\ldots,k\}$ is the set of colors and $\psi_\beta(\sigma_1,\sigma_2)=\exp(-\beta\mathbf 1\{\sigma_1=\sigma_2\})$. In this example (3.6) holds with asymptotic equality if either d or β is below a certain critical value. However, for sufficiently large d, β there occurs a linear gap [12, 21].
Example 3.5
(random k‐SAT). To capture the random k‐SAT model we let Δ be a maximum degree and $\Omega=\{0,1\}$. Further, each $s\in\{\pm1\}^k$ gives rise to a function
$$\psi_s(\sigma_1,\ldots,\sigma_k)=\exp\big(-\beta\,\mathbf 1\{\sigma_1=(1-s_1)/2,\ldots,\sigma_k=(1-s_k)/2\}\big)$$
and we let $\Psi=\{\psi_s:s\in\{\pm1\}^k\}$. The idea is that s is the “sign pattern” of a k‐clause, with $s_i=\pm1$ indicating that the ith literal is positive/negative. Then a truth assignment σ of the k variables is satisfying unless $\sigma_i=(1-s_i)/2$ for all i. The corresponding model has a set $V=\{x_1,\ldots,x_n\}$ of Boolean variables and a set $F=\{a_1,\ldots,a_m\}$ of clauses. Moreover, the map d prescribes the degree of each variable, while of course each clause has degree k. Additionally, the map t prescribes the positive/negative occurrences of the variables and the sign patterns of the clauses. Thus, a variable x occurs a prescribed number of times positively/negatively and the sign of the jth literal of a clause a is given by the type of the clone (a, j). Finally, the weight function of clause a is $\psi_s$ for the sign pattern s of a. The bound (3.6) does not generally hold with equality [5, 11].
While Definition 3.1 encompasses many problems of interest, there are two restrictions. First, because all weight functions take strictly positive values, Definition 3.1 does not allow for “hard” constraints. For instance, Definition 3.1 does not accommodate the graph coloring problem, which imposes the strict requirement that no single edge be monochromatic. However, for some purposes hard constraints can be approximated by soft ones, e.g., by choosing a very large value of β in the Potts antiferromagnet. Moreover, many of the arguments in the following sections do extend to hard constraints with a bit of care. However, the assumption that all ψ are strictly positive saves us many case distinctions as it ensures that Z(G) is strictly positive and that therefore the Gibbs measure is well‐defined.
The second restriction is that we prescribe a fixed maximum degree Δ. Thus, if we consider a sequence of $(\Delta,\Omega,\Theta,\Psi)$‐models with $n\to\infty$, then all factor graphs have a bounded degree. By comparison, if we choose a k‐SAT formula with n variables and $m=\lceil\alpha n\rceil$ clauses uniformly at random for fixed $\alpha>0$, then the maximum variable degree will be of order $\ln n/\ln\ln n$. Yet this case can be approximated well by a sequence of models with a large enough maximum degree Δ. In fact, if we calculate the limit of $\frac1n\mathbb E[\ln Z]$ for any fixed Δ, then the limit $\Delta\to\infty$ is easily seen to yield the answer in the case of uniformly random formulas. Nevertheless, the bounded degree assumption is technically convenient because it facilitates the use of local weak convergence, as we will discuss next.
Remark 3.6
For the sake of simplicity in (3.4) we defined the partition function as the sum over all $\sigma\in\Omega^V$. However, the results stated in the following carry over to the cases where Z is defined as the sum over all configurations of a subset of $\Omega^V$, e.g., all σ that have Hamming distance at most αn from some reference assignment $\sigma_0$ for a fixed $\alpha>0$. Of course, in this case the Gibbs measure is defined such that its support is equal to this subset.
3.3. Local Weak Convergence
Suppose that we fix $\Delta,\Omega,\Theta,\Psi$ as in Definition 3.1 and that we are given a sequence of $(\Delta,\Omega,\Theta,\Psi)$‐models such that the nth model has size n. Let us write $\mathbf G=\mathbf G_n$ for the resulting random factor graph for the sake of brevity. According to the cavity method, $\frac1n\mathbb E[\ln Z(\mathbf G)]$ is determined by the “limiting local structure” of the random factor graph $\mathbf G$. To formalise this concept, we adapt the concept of local weak convergence of graph sequences [8, 35] to our current setup, thereby generalising the approach taken in [23].
Definition 3.7
A $(\Delta,\Omega,\Theta,\Psi)$‐template consists of a $(\Delta,\Omega,\Theta,\Psi)$‐model, a connected factor graph H on that model and a root $r_H$, which is a variable or factor node. Its size is the size of the underlying model. Moreover, two templates with models $\mathcal M$, $\mathcal M'$ are isomorphic if there exists a bijection between their node sets such that
ISM1: ,
ISM2: and ,
ISM3: for all ,
ISM4: for all ,
ISM5: for all , and
ISM6: if satisfy , then
Thus, a template is, basically, a finite or countably infinite connected factor graph with a distinguished root. Moreover, an isomorphism preserves the root as well as degrees, types, weight functions and adjacencies.
Let us write for the isomorphism class of a template and let be the set of all isomorphism classes of ‐templates. For each and let be the isomorphism class of the template obtained by removing all vertices at a distance greater than from the root. We endow with the coarsest topology that makes all the functions
continuous. Moreover, the space of probability measures on the set of isomorphism classes carries the weak topology, and so does the space of probability measures on that space. For an isomorphism class Γ we write $\delta_\Gamma$ for the Dirac measure that puts mass one on the single point Γ. Similarly, for a measure λ we let $\delta_\lambda$ be the Dirac measure on λ. Our assumption that the maximum degree is bounded by a fixed number Δ ensures that these spaces are compact Polish spaces.
For a factor graph and a variable or constraint node v we write for the isomorphism class of the connected component of v in G rooted at v. Then each factor graph gives rise to the empirical distribution
We say that converges locally to if
| (3.7) |
Denote a random isomorphism class chosen from the distribution by . Unravelling the definitions, we see that (3.7) holds iff for every integer and every we have
| (3.8) |
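For intuition, the empirical distribution of depth‐ℓ neighborhoods is straightforward to tabulate on concrete graphs. The sketch below is a hypothetical helper of our own; the BFS‐string “signature” is a stand‐in for the isomorphism classes and is only faithful on acyclic neighborhoods, which dominate in the high‐girth regime discussed later:

```python
import collections, random

def signature(adj, root, depth):
    """Canonical string for the depth-`depth` neighborhood of `root`:
    serialise the exploration tree with sorted sub-signatures."""
    def sig(v, parent, d):
        if d == 0:
            return "()"
        return "(" + "".join(sorted(sig(w, v, d - 1)
                                    for w in adj[v] if w != parent)) + ")"
    return sig(root, None, depth)

def empirical_local_distribution(adj, depth):
    """The analogue of the empirical distribution appearing in (3.8)."""
    counts = collections.Counter(signature(adj, v, depth) for v in adj)
    return {s: c / len(adj) for s, c in counts.items()}

# Random 3-regular multigraph via the configuration model.
rng = random.Random(3)
n, d = 2000, 3
clones = [v for v in range(n) for _ in range(d)]
rng.shuffle(clones)
adj = collections.defaultdict(list)
for u, v in zip(clones[::2], clones[1::2]):
    adj[u].append(v); adj[v].append(u)
dist = empirical_local_distribution(adj, 2)
# Almost all mass sits on the class of the 3-regular tree.
print(sorted(dist.items(), key=lambda kv: -kv[1])[:2])
```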
We are going to be interested in the case that converges locally to a distribution on acyclic templates. Thus, let be the set of all acyclic templates. Further, we write for the set of all templates whose root is a variable node and for the set of all templates whose root is a constraint node. Additionally, for a template we write for the root vertex, for its degree and for the weight function of the root vertex if . Moreover, for we write for the template obtained from by re‐rooting the template at the jth neighbor of . (This makes sense because condition ISM6 from Definition 3.7 preserves the order of the neighbors.)
We will frequently condition on the depth‐ neighborhoods of the random factor graph for some finite . Hence, for and we write if for all variable nodes and for all constraint nodes . Let be the σ‐algebra on generated by the equivalence classes of the relation . Additionally, for and we let
be the empirical distribution of the depth‐ neighborhood structure.
Furthermore, let
Then for a probability measure we denote by the image of under the map
Because all degrees are bounded by Δ, the set is finite for every . Hence, (3.8) entails that converges locally to iff
| (3.9) |
3.4. The Planted Distribution
While $\mathbf G$ is chosen uniformly at random (from the configuration model), we need to consider another distribution that weighs factor graphs by their partition function. Specifically, given ℓ, let $\hat{\mathbf G}$ be a random graph chosen according to the distribution
| (3.10) |
which we call the planted distribution. The definition (3.10) ensures that the distribution of the “depth‐ℓ neighborhood structure” of $\hat{\mathbf G}$ coincides with that of $\mathbf G$.
Perhaps more intuitively, the planted distribution can be described by the following experiment. First, choose a random factor graph $\mathbf G$. Then, given the depth‐ℓ neighborhood structure of $\mathbf G$, choose the factor graph $\hat{\mathbf G}$ randomly such that a graph G comes up with a probability that is proportional to Z(G). Perhaps despite appearances, the planted distribution is reasonably easy to work with in many cases. For instance, it has been employed successfully to study random k‐SAT as well as random graph or hypergraph coloring problems [1, 11, 13, 20, 26].
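Operationally, the reweighting can be mimicked by rejection sampling. Below is a minimal sketch under obvious assumptions: `sample_uniform` and `log_z` are hypothetical callables of ours, and since the acceptance rate is typically exponentially small in n, this is only practical for toy sizes.

```python
import math, random

def sample_planted(sample_uniform, log_z, log_z_max, rng, max_tries=10**6):
    """Draw from the Z-weighted (planted) distribution: propose G from the
    uniform model and accept with probability Z(G) / Z_max, where
    log_z_max upper-bounds log Z on the support."""
    for _ in range(max_tries):
        g = sample_uniform(rng)
        if math.log(rng.random()) < log_z(g) - log_z_max:
            return g
    raise RuntimeError("acceptance rate too low")
```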
3.5. Short Cycles
In most cases of interest the random factor graph $\mathbf G$ is unlikely to contain many short cycles, and it will be convenient for us to exploit this fact. Hence, let us call a factor graph G l‐acyclic if it does not contain a cycle of length at most l. We say that the sequence of models has high girth if for any $l>0$ we have
$$\liminf_{n\to\infty}\mathbb P\left[\mathbf G\text{ is }l\text{‐acyclic}\right]>0\quad\text{and}\quad\liminf_{n\to\infty}\mathbb P\left[\hat{\mathbf G}\text{ is }l\text{‐acyclic}\right]>0. \qquad (3.11)$$
Thus, there is a non‐vanishing probability that the random factor graph is l‐acyclic. Moreover, short cycles do not have too heavy an impact on the partition function, as the graph chosen from the planted distribution has a non‐vanishing probability of being l‐acyclic as well.
In the following, we are going to denote the event that a random factor graph is l‐acyclic by $\mathcal A_l$. Let us highlight the following consequence of the high girth condition and the construction of the planted distribution.
Proposition 3.8
Assume that is a sequence of ‐models of high girth. Let be an integer and suppose that is an event such that . If b is a real and is an integer such that
(3.12) then
Since the high girth condition (3.11) implies that for every l. Set . Then by the definition (3.10) of the planted distribution,
Consequently, . Hence, (3.12) yields
Therefore, the assertion follows from (3.11).
Remark 3.9
Strictly speaking, the first condition in (3.11) is superfluous as it is implied by the second one.
From here on out we assume that we are given a sequence of $(\Delta,\Omega,\Theta,\Psi)$‐models of high girth that converges locally, and we fix this setup for the rest of the paper.
4. THE BETHE FREE ENERGY
In this section we present the main results of the paper. The thrust is that certain basic properties of the Gibbs measure entail an asymptotic formula for $\frac1n\mathbb E[\ln Z(\mathbf G)]$. The results are guided by the physics predictions from [34].
4.1. An Educated Guess
The formula for $\frac1n\mathbb E[\ln Z(\mathbf G)]$ that the cavity method predicts, the so‐called “replica symmetric solution”, comes in terms of the distribution on trees to which the random factor graph converges locally. Thus, the cavity method claims that in order to calculate $\frac1n\mathbb E[\ln Z(\mathbf G)]$ it is not necessary to deal with the mind‐boggling complexity of the random factor graph with its expansion properties, long cycles etc. Instead, it suffices to think about a random tree drawn from the limiting distribution, a dramatically simpler object. The following definition will help us formalise this notion.
Definition 4.1
A marginal assignment is a measurable map $T\mapsto p_T$ such that
1. MA1: $p_T\in\mathcal P(\Omega)$ for all acyclic templates T rooted at a variable node,
2. MA2: $p_T\in\mathcal P(\Omega^{d_T})$ for all acyclic templates T rooted at a constraint node, and for all such T and all $j\le d_T$ the jth marginal of $p_T$ coincides with the distribution assigned to the template re‐rooted at the jth neighbor of the root,
3. MA3: for all T rooted at a constraint node we have
(4.1)
Further, the Bethe free energy of p with respect to the limiting distribution is given by (4.2), where, of course, the expectations refer to the choice of the random tree.
Thus, a marginal assignment provides a probability distribution $p_T$ on Ω for each tree T whose root is a variable node. Furthermore, for trees T rooted at a constraint node $p_T$ is a distribution on $\Omega^{d_T}$, which we think of as the joint distribution of the variables involved in the constraint. The distributions assigned to T rooted at a constraint node must satisfy a consistency condition: the jth marginal of $p_T$ has to coincide with the distribution assigned to the tree rooted at the jth child of the root of T for every $j\le d_T$; of course, this re‐rooted tree is a tree rooted at a variable node. In addition, MA3 requires that the distribution $p_T$ maximises the functional in (4.1) amongst all distributions ν with the same marginal distributions as $p_T$. Furthermore, the Bethe free energy is a functional that maps each marginal assignment p to a real number. For a detailed derivation of this formula based on physics intuition we refer to [36].
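For orientation, we record the standard finite‐graph form of the Bethe free energy from [36, Chapter 14]; the tree functional in (4.2) is its local‐limit analogue. The message notation $\nu_{x\to a},\hat\nu_{a\to x}$ below is the textbook's, used here only as a guide; it is not necessarily the exact normalisation used in (4.2).

```latex
% Bethe free energy of a finite factor graph G = (V, F) with messages
% nu_{x->a}, hat-nu_{a->x} in P(Omega) on the edges (cf. [36, Ch. 14]):
\mathcal{B}
  = \sum_{x \in V} \ln\Big[ \sum_{\sigma \in \Omega}
        \prod_{a \in \partial x} \hat\nu_{a \to x}(\sigma) \Big]
  + \sum_{a \in F} \ln\Big[ \sum_{\tau \in \Omega^{\partial a}} \psi_a(\tau)
        \prod_{x \in \partial a} \nu_{x \to a}(\tau_x) \Big]
  - \sum_{(x,a):\, x \in \partial a} \ln\Big[ \sum_{\sigma \in \Omega}
        \nu_{x \to a}(\sigma)\, \hat\nu_{a \to x}(\sigma) \Big]
```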
The basic idea behind Definition 4.1 is to capture the limiting distribution of the marginals of the variables of the random factor graph $\mathbf G$. More specifically, Definition 4.1 aims to provide the “limiting object” of the following combinatorial construction: for a fixed depth ℓ take a random factor graph on a large enough number n of variable nodes and for each possible tree T record the empirical distribution of the Gibbs marginals of the nodes whose depth‐ℓ neighborhood is isomorphic to T. In the simplest possible case (which here we confine ourselves to), in the limit of large ℓ we expect to obtain distributions that satisfy MA2. That is, in the limit of large ℓ the empirical distribution of the marginals of variable nodes converges to a deterministic limit; going from ℓ to ℓ + 1 (the depth up to which constraint nodes can “see”) does not make much of a difference. Moreover, in the proof of Corollary 4.8 we are going to see that MA3 is the “correct” way of linking the constraint/variable distributions.
Given a distribution on trees, the cavity method provides a plausible recipe for constructing marginal assignments. Roughly speaking, the idea is to identify fixed points of an operator called Belief Propagation on the random infinite tree [36]. However, this procedure is difficult to formalise mathematically because generally there are several Belief Propagation fixed points and model‐dependent considerations are necessary to identify the “correct” one. To keep matters as simple as possible we are therefore going to assume that a marginal assignment is given.
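The Belief Propagation operator itself is simple to state in code. The following sketch (an illustration of the textbook message updates on a finite factor graph, with our own data layout) performs one synchronous round; iterating it on a finite tree converges within diameter‐many rounds, and there the fixed point yields the exact Gibbs marginals:

```python
import itertools

def bp_update(msg, psi, neighbors, omega):
    """One synchronous BP round. msg[(a, x)]: constraint-to-variable
    message (dict over omega); psi[a]: weight function taking one value
    per variable in the ordered list neighbors[a]."""
    # Variable-to-constraint messages: normalised products of the
    # incoming messages from all *other* constraints.
    nu = {}
    for a, vs in neighbors.items():
        for x in vs:
            m = {w: 1.0 for w in omega}
            for b, ws in neighbors.items():
                if b != a and x in ws:
                    for w in omega:
                        m[w] *= msg[(b, x)][w]
            s = sum(m.values())
            nu[(x, a)] = {w: m[w] / s for w in omega}
    # Constraint-to-variable messages: marginalise psi against the
    # messages of the other incident variables.
    new = {}
    for a, vs in neighbors.items():
        for i, x in enumerate(vs):
            m = dict.fromkeys(omega, 0.0)
            for tau in itertools.product(omega, repeat=len(vs)):
                w = psi[a](tau)
                for j, y in enumerate(vs):
                    if y != x:
                        w *= nu[(y, a)][tau[j]]
                m[tau[i]] += w
            s = sum(m.values())
            new[(a, x)] = {o: m[o] / s for o in omega}
    return new
```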
Remark 4.2
Because the entropy is concave, conditions MA2 and MA3 specify the distributions $p_T$ for T rooted at a constraint node uniquely. In other words, a marginal assignment is actually determined completely by the distributions $p_T$ for T rooted at a variable node.
For a marginal assignment p, an integer and a tree we define
Thus, is the conditional expectation of p given the first layers of the tree. To avoid notational hazards we let be the uniform distribution on Ω for all .
Lemma 4.3
For any $\varepsilon>0$ there is $\ell_0\ge0$ such that for all $\ell\ge\ell_0$ the expected total variation distance between the conditional expectation of p given the first ℓ layers of the random tree and p itself is less than ε.
Proof. Define an equivalence relation on the set of templates by letting two templates be equivalent iff their first ℓ layers coincide. Then the sequence of conditional expectations is a martingale with respect to the filtration generated by the equivalence classes. By the martingale convergence theorem [27, Theorem 5.7], the conditional expectation converges almost surely to p.
Unless specified otherwise, in the rest of this section p is understood to be a marginal assignment.
4.2. Symmetry
In the terminology of Section 2, the cavity method claims that $\frac1n\ln Z(\mathbf G)$ converges to the Bethe free energy of a suitable marginal assignment iff
| (4.3) |
This claim is, of course, based on bold non‐rigorous deliberations. Nonetheless, we aim to prove a rigorous statement that comes reasonably close.
To this end, let p be a marginal assignment. We say that $\mathbf G$ is p‐symmetric if for every $\varepsilon>0$ there is $\ell>0$ such that for all sufficiently large n we have
| (4.4) |
In other words, for any ε > 0 and sufficiently large ℓ the random factor graph enjoys the following property with high probability. If we pick two variable nodes x, y of $\mathbf G$ uniformly and independently, then the joint distribution of $\sigma_x,\sigma_y$ under the Gibbs measure is close to the product distribution determined by the depth‐ℓ neighborhoods of x, y. Of course, as $\mathbf G$ has bounded maximum degree the distance between randomly chosen x, y is going to be greater than, say, 2ℓ with high probability. Thus, similar in spirit to (4.3), (4.4) provides that far‐apart variables typically decorrelate and that p captures the Gibbs marginals.
In analogy to (4.4), we say that the planted distribution of $\mathbf G$ is p‐symmetric if for every ε > 0 there is ℓ > 0 such that for all sufficiently large n we have
The main result of this paper is
Theorem 4.4
If $\mathbf G$ is p‐symmetric, then $\limsup_{n\to\infty}\frac1n\mathbb E[\ln Z(\mathbf G)]$ is bounded above by the Bethe free energy of p. If the planted distribution of $\mathbf G$ is p‐symmetric as well, then $\frac1n\mathbb E[\ln Z(\mathbf G)]$ converges to the Bethe free energy of p.
Thus, the basic symmetry assumption (4.4) implies that the Bethe free energy is an upper bound on the limit of $\frac1n\mathbb E[\ln Z(\mathbf G)]$. If, additionally, the symmetry condition holds in the planted model, then this upper bound is tight. In particular, in this case the limit of $\frac1n\mathbb E[\ln Z(\mathbf G)]$ is completely determined by the limiting local structure and p.
The proof of Theorem 4.4, which can be found in Section 4.6, is based on Theorem 2.1, the decomposition theorem for probability measures on cubes. More precisely, we combine Theorem 2.1 with a conditional first and a second moment argument given the local structure of the factor graph, i.e., given the depth‐ℓ neighborhood structure for a large ℓ. The fact that it is necessary to condition on the local structure in order to cope with “lottery effects” has been noticed in prior work [6, 17, 22, 23]. Most prominently, such a conditioning was crucial in order to obtain the precise k‐SAT threshold for large enough k [26]. But here the key insight is that Theorem 2.1 enables us to carry out conditional moment calculations in a fairly elegant and generic way.
The obvious question that arises from Theorem 4.4 is whether there is a simple way to show that $\mathbf G$ is p‐symmetric (and that the same is true of the planted distribution). In Sections 4.3 and 4.4 we provide two sufficient conditions called non‐reconstruction and Gibbs uniqueness. That these two conditions entail symmetry was predicted in [34], and Theorem 2.1 enables us to prove it.
While the present paper deals with a very general class of factor graphs, the methods give somewhat stronger results in special classes, e.g., models with only one type in which all variable nodes have the same degree, or variable nodes with Poisson degrees. The details have been worked out in [18, 19].
4.3. Non‐reconstruction
Following [34] we define a correlation decay condition, the “non‐reconstruction” condition, on factor graphs and show that it implies symmetry. The basic idea is to formalise the following. Given ε > 0 pick a large ℓ, choose a random factor graph $\mathbf G$ for some large n and pick a variable node x uniformly at random. Further, sample an assignment σ randomly from the Gibbs measure $\mu_{\mathbf G}$. Now, sample a second assignment τ from $\mu_{\mathbf G}$ subject to the condition that $\tau_y=\sigma_y$ for all variable nodes y at distance at least 2ℓ from x. Then the non‐reconstruction condition asks whether the distribution of $\tau_x$ is markedly different from the unconditional marginal $\mu_{\mathbf G,x}$. More precisely, non‐reconstruction occurs if for any ε > 0 there is ℓ such that with high probability $\mathbf G$ is such that the shift that a random “boundary condition” σ induces does not exceed ε in total variation distance.
Of course, instead of conditioning on the values of all variables at distance at least 2ℓ from x, we might as well just condition on the variables at distance either 2ℓ or 2ℓ + 1 from x, depending on parity. This is immediate from the definition (3.4) of the Gibbs measure.
As for the formal definition, suppose that G is a factor graph, let $\ell\ge1$ and let x be a variable node. Let $\mathcal F$ signify the σ‐algebra on $\Omega^V$ generated by the events $\{\sigma_y=\omega\}$ for $\omega\in\Omega$ and y at distance either 2ℓ or 2ℓ + 1 from x. Thus, conditioning on $\mathcal F$ pins down the values on a layer that separates x from all far‐away variables. Then we say that $\mathbf G$ has non‐reconstruction with respect to a marginal assignment p if for any ε > 0 there is ℓ such that
To parse the above, the outer expectation refers to the choice of $\mathbf G$ and x. The middle average is over the choice of the boundary condition called σ above. Finally, the innermost average is the random choice of $\tau_x$ given the boundary condition.
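As a toy illustration of this quantity (our own exact‐enumeration sketch on a small chain, standing in for the random factor graph; at small β the shift is small and it grows with β):

```python
import itertools, math

def reconstruction_shift(weight, n, omega, x, boundary):
    """Expected TV distance between the marginal at coordinate x and the
    same marginal conditioned on a Gibbs-random boundary configuration."""
    states = list(itertools.product(omega, repeat=n))
    w = [weight(s) for s in states]
    z = sum(w)
    probs = [wi / z for wi in w]
    marginal = dict.fromkeys(omega, 0.0)
    for s, pr in zip(states, probs):
        marginal[s[x]] += pr
    blocks = {}
    for s, pr in zip(states, probs):
        blocks.setdefault(tuple(s[y] for y in boundary), []).append((s, pr))
    shift = 0.0
    for block in blocks.values():
        mass = sum(pr for _, pr in block)
        cond = dict.fromkeys(omega, 0.0)
        for s, pr in block:
            cond[s[x]] += pr / mass
        shift += mass * 0.5 * sum(abs(cond[o] - marginal[o]) for o in omega)
    return shift

beta = 0.3
def weight(s):  # Ising chain of length 9
    return math.exp(beta * sum(s[i] == s[i + 1] for i in range(len(s) - 1)))
print(reconstruction_shift(weight, 9, (-1, 1), x=4, boundary=(0, 8)))
```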
Analogously, the planted distribution of $\mathbf G$ has non‐reconstruction with respect to p if for any ε > 0 there exists ℓ such that
Theorem 4.5
If $\mathbf G$ has non‐reconstruction with respect to p, then $\mathbf G$ is p‐symmetric. If the planted distribution of $\mathbf G$ has non‐reconstruction with respect to p, then it is p‐symmetric.
In concrete applications the non‐reconstruction condition is typically reasonably easy to verify. For instance, in [11] we determine the precise location of the so‐called “condensation phase transition” in the regular k‐SAT model via Theorems 4.4 and 4.5. The proof of Theorem 4.5 can be found in Section 4.7.
4.4. Gibbs Uniqueness
Although the non‐reconstruction condition is reasonably handy, to verify it we still need to “touch” the complex random graph $\mathbf G$. Ideally, we might hope for a condition that can be stated solely in terms of the limiting distribution on trees, which is conceptually far more accessible. The “Gibbs uniqueness” condition as put forward in [34] fills this order.
Specifically, suppose that T is a finite acyclic template whose root $r_T$ is a variable node. Then we say that T is (ε, ℓ)‐unique with respect to a marginal assignment p if
| (4.5) |
To parse (4.5), we observe that the left hand side is a random variable, namely the average of the value assigned to the root variable under the Gibbs measure $\mu_T$ given the values of the variables at distance at least 2ℓ from $r_T$. Hence, (4.5) requires that this average is at total variation distance less than ε for every possible assignment of the variables at distance at least 2ℓ from $r_T$, i.e., for every “boundary condition”.
More generally, we say that a template is (ε, ℓ)‐unique with respect to p if the finite template obtained by truncating it at depth 2ℓ + 1 has this property. (That this truncated template is finite follows once more from the fact that all degrees are bounded by Δ.) Further, we call the limiting measure Gibbs‐unique with respect to p if for any ε > 0 we have
Corollary 4.6
If the limiting distribution on trees is Gibbs‐unique with respect to p, then $\frac1n\mathbb E[\ln Z(\mathbf G)]$ converges to the Bethe free energy of p.
Proof. If the limiting measure is Gibbs‐unique with respect to p, then (3.9) guarantees that $\mathbf G$ has non‐reconstruction with respect to p. Indeed, given ε > 0, ℓ and a graph G let U(G) denote the set of vertices whose depth‐2ℓ neighborhood is acyclic and (ε, ℓ)‐unique. Then we have
and by (3.9) the resulting error term tends to 0 as $n\to\infty$. Similarly, because the distribution of the depth‐ℓ neighborhood structure in the planted distribution coincides with that of $\mathbf G$, Gibbs‐uniqueness implies that the planted model has non‐reconstruction with respect to p as well. Therefore, the assertion follows from Theorems 4.4 and 4.5.
In problems such as the random k‐SAT model, the Ising model or the Potts antiferromagnet that come with an “inverse temperature” parameter β, Gibbs uniqueness is always satisfied for sufficiently small values of β. Consequently, Corollary 4.6 shows that the cavity method always yields the correct value of $\frac1n\mathbb E[\ln Z(\mathbf G)]$ in the case of small β, the so‐called “high temperature” case in physics jargon. Furthermore, if the Gibbs uniqueness condition is satisfied then there is a canonical way of constructing the marginal assignment p by means of the Belief Propagation algorithm [36, Chapter 14]. Hence, Corollary 4.6 provides a fully comprehensive answer in this case.
4.5. Meet the Expectation
In this section we lay the groundwork for proving Theorem 4.4. In particular, the conditions MA1–MA3 will be used in the proofs of Corollaries 4.8 and 4.9 in this section, which will be vital to the proof of Theorem 4.4 in Section 4.6. To proceed, we need to get a handle on the conditional expectation of Z given the local structure, and for this purpose we need to study the possible empirical distributions of the values assigned to the variables of a concrete factor graph G. Specifically, by a (G, l)‐marginal sequence we mean a map q such that
1. MS1: $q_T\in\mathcal P(\Omega)$ if T is rooted at a variable node,
2. MS2: $q_T\in\mathcal P(\Omega^{d_T})$ if T is rooted at a constraint node,
3. MS3: for all T we have
(4.6)
Thus, q assigns each tree rooted at a variable node a distribution on Ω and each tree rooted at a constraint node a distribution on $\Omega^{d_T}$, just like in Definition 4.1. Furthermore, the consistency condition (4.6) provides that for a given T rooted at a variable node the average marginal distribution, taken over all pairs of a constraint‐rooted tree and a position whose re‐rooting yields T, is equal to $q_T$. However, in contrast to condition MA2 from Definition 4.1, MS3 does not require this marginalisation to work out for every constraint‐rooted tree individually.
Suppose now that is a set of constraint nodes such that for all . Then for we let
Thus, is the empirical distribution of the sequences . A factor graph G and induce a ‐marginal sequence canonically, namely the empirical distributions
Conversely, given a ‐marginal sequence q let be the set of all such that for all we have
| (4.7) |
Moreover, let
Finally, define
In Section 5 we are going to prove the following formula for the expectation of .
Proposition 4.7
For any $\varepsilon>0$ there is $\delta>0$ such that for large enough n the following is true. Assume that G is l‐acyclic and let q be a (G, l)‐marginal sequence. Then
We are going to be particularly interested in the expectation for q “close” to a specific marginal assignment p (in the sense of Definition 4.1). Formally, a (G, l)‐marginal sequence q is ε‐judicious with respect to p if
We say that an assignment σ is ε‐judicious with respect to p if the empirical distribution induced by σ is ε‐judicious w.r.t. p.
Let us explain this definition briefly. Suppose we are given a factor graph G and a certain “depth” l. Then for an assignment σ we can jot down the empirical distribution of the values assigned to the variables x with depth‐l neighborhood T for each tree T. The “judicious” condition essentially provides that these empirical distributions are fairly “homogeneous”. That is, if we refine our classification of variable nodes according to the depth‐(l + 1) structure, then the resulting empirical distributions are close to the coarser ones obtained at level l. Clearly, this condition is closely related to the MA2 condition from Definition 4.1.
Before we prove Theorem 4.4 we deduce two corollaries to Proposition 4.7. The first one gives an upper bound on the “judicious part” of . The second one yields a lower bound.
Corollary 4.8
Suppose that p is a marginal assignment. For any there exist such that for all and all the following is true. Let be the event that . Then
Proof. Pick a small enough . By Lemma 4.3 there exists such that for all . Now, fix any , pick small enough and assume that n is big enough. Let Q(G) be the set of all (G, l)‐marginal sequences that are ‐judicious w.r.t. p. Because is a finite set, there exists a number such that for every factor graph G there is a subset of size such that the following is true. If is ‐judicious w.r.t. p, then . Therefore, for all G we have
(4.8) Proposition 4.7 and (4.8) imply that, for ξ small enough and n large enough, for any factor graph there is such that
(4.9)
Further, for any the function is uniformly continuous because is compact. By the same token, is uniformly continuous for any . Consequently, if for some and ε is chosen small enough, then we obtain
(4.10) Similarly, because our choice of l ensures that and because of the uniform degree bound Δ, condition MA2 from Definition 4.1 implies that
Moreover, for any and any there is a unique distribution with marginals that maximises because the entropy is concave and the map is linear. In fact, the map is uniformly continuous. Therefore, MA2 and MA3 show that for ,
(4.11) Combining (4.9), (4.10) and (4.11), we get
(4.12) Finally, the assertion follows from (4.12) and Bayes’ rule.
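For completeness, the uniqueness claim used in the proof above is the standard strict‐concavity argument; schematically:

```latex
% The distributions on Omega^k with prescribed marginals form a convex
% polytope, and the entropy is strictly concave on it:
\[
  H(\mu) = -\sum_{\sigma \in \Omega^k} \mu(\sigma) \ln \mu(\sigma),
  \qquad
  H\Big(\frac{\mu + \nu}{2}\Big) > \frac{H(\mu) + H(\nu)}{2}
  \quad \text{whenever } \mu \neq \nu .
\]
```

Two distinct maximisers μ ≠ ν with the same marginals would thus be beaten by their average, which has the same marginals; hence the maximiser is unique.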
Corollary 4.9
Suppose that p is a marginal assignment. For any there exists such that for all ,
Proof. Choose a small . By Lemma 4.3 there exists such that for all . Hence, fix some and define
Moreover, for let be the (unique) distribution that maximises subject to the condition that for all (cf. (4.1)). Then q is a marginal sequence. Indeed, MS1, MS2 are trivially satisfied and MS3 holds because for all . Further, if we pick small enough, then Proposition 4.7 implies that for large n and any
(4.13)
To complete the proof, we need to compare the r.h.s. of (4.13) with . Thus, let us write
Because w.h.p. by (3.9), by our choice of l, the entropy is uniformly continuous on and is uniformly bounded for all T, (3.11) ensures that we can make so small that
(4.14) A similar argument applies to . Indeed, since and because all degrees are bounded by Δ, condition MA2 from Definition 4.1 implies that
provided is small enough. In effect, because for any the function is uniformly continuous, MA3 and the construction of q_T for ensure that
(4.15) Since w.h.p. by (3.9), (4.15) implies that
(4.16) Finally, the assertion follows from (4.13), (4.14) and (4.16).
4.6. Proof of Theorem 4.4
We begin by spelling out the following consequence of the symmetry assumption. Let p be a marginal assignment.
Lemma 4.10
If is p‐symmetric, then for any , for all sufficiently large we have
(4.17)
(4.18)
Proof. Choose small enough. For an integer consider the event
If is p‐symmetric, then for sufficiently large . Similarly, if the planted distribution is p‐symmetric, then for large .
Hence, assume that . Then by the triangle inequality, for any ,
Therefore,
(4.19) Furthermore, by (4.19) and the triangle inequality,
(4.20) Since , (4.20) entails that
i.e., G is ‐symmetric.
Together with Lemma 4.10 the following lemma shows that under the assumptions of Theorem 4.4 the partition function is dominated by its judicious part.
Lemma 4.11
There is a number such that for all there exists such that for large enough n the following is true. If is a ‐acyclic factor graph such that
(4.21)
and μ_G is ‐symmetric, then
Proof. Pick small, smaller and smaller still, and assume that . Let be the partition of V_n such that belong to the same class iff . By Theorem 2.1 there exists a refinement of such that μ_G is γ‐homogeneous with respect to for some partition of such that . We may index the classes of as with for all x in the class and for some integer N_T.
Let J be the set of all such that and is γ‐regular. Then
(4.22) Choosing χ small enough, we obtain from Corollary 2.4 that
Therefore, by (4.21) and the triangle inequality, for we get
Consequently, by (4.22), Bayes' rule and the triangle inequality, summing on and we get
(4.23)
Applying the triangle inequality once more, we find
(4.24) Further, consider such that and let . Because G is ‐acyclic, there exists a set with the following two properties. First, for every constraint node a with the variable node satisfies . Second, for every variable node x with there is a constraint node a with such that . For let be the number of constraint nodes a with such that belongs to . Then by the triangle inequality,
(4.25)
where the last inequality follows because all degrees are between one and Δ. Finally, the assertion follows from (4.24) and (4.25).
We proceed by proving the upper bound and the lower bound statement from Theorem 4.4 separately. Strictly speaking, the proof of the lower bound implies the upper bound as well. But presenting the arguments separately makes them slightly easier to follow.
Proof of Theorem 4.4, upper bound
We assume that is p‐symmetric. Pick and fix a number ; we aim to show that for large enough n,
(4.26) For let
Additionally, let be the event that is ‐symmetric and let be the event that . Corollary 4.8 shows that for some small enough and large enough (both dependent on α), for all and large enough n we have
(4.27) To apply this bound we are going to argue that is not much smaller than for most .
The proof of this fact is based on Lemma 4.11. To apply it, we need to pick and fix some specific, large enough (upon which the value of χ provided by Lemma 4.11 will depend). By Lemma 4.3 there is such that
(4.28) Further, by Lemma 4.10 there is such that
Let . Now, Lemma 4.11 yields such that the following is true. Consider the event . Then
(4.29) Combining (4.27) and (4.29) we obtain
(4.30) Hence, we are left to estimate the probability of the event . With respect to we obtain from Lemma 4.10 that and . Moreover, the local convergence assumption (3.9) implies that . Consequently,
(4.31) Hence, the high girth assumption (3.11) yields
(4.32) Finally, combining (4.30) and (4.32) and using Markov's inequality, we obtain
Further, since (4.32) shows that the probability of the event is bounded away from 0, Proposition 3.2 yields
(4.33) Because is bounded by some number by the definition (3.4) of Z, (4.26) follows from (4.33).
To establish the lower bound we introduce a construction reminiscent of those used in [24, 25, 31, 39, 48]. Namely, starting from the sequence of ‐models, we define another sequence of models as follows. Let and let us denote pairs by . Further, for any we define a function
Let . Then the ‐model gives rise to the ‐model .
Clearly, there is a canonical bijection . Moreover, the construction ensures that the Gibbs measure equals . Explicitly, for all ,
| (4.34) |
In effect, we obtain
| (4.35) |
Further, writing for the ‐templates and the acyclic ‐templates, we can lift the marginal assignment p from to by letting for all T. Additionally, let be the image of under the map so that
| (4.36) |
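Although the weight functions are not displayed here, the standard computation behind a construction of this kind (a hedged reconstruction, assuming the lifted weights factorise as ψ′_a(σ, τ) = ψ_a(σ)ψ_a(τ)) shows why an identity like (4.35) turns the second moment of Z into a first moment over the lifted model:

```latex
% Sketch of the identity behind (4.35), assuming the lifted weights
% factorise over the two coordinates of (sigma, tau):
\[
  Z(G') \;=\; \sum_{\sigma, \tau} \prod_{a} \psi_a(\sigma)\, \psi_a(\tau)
        \;=\; \Big( \sum_{\sigma} \prod_{a} \psi_a(\sigma) \Big)
              \Big( \sum_{\tau} \prod_{a} \psi_a(\tau) \Big)
        \;=\; Z(G)^2 .
\]
```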
Proof of Theorem 4.4, lower bound
We assume that is p‐symmetric and that the same is true of the planted distribution. For consider the event
(4.37) and let be the event that is ‐symmetric. Moreover, as before let . Basically, we are going to apply the same argument as in the proof of the upper bound to the random factor graph and to for a large enough l.
Hence, let . Then Corollary 4.8 applied to yields a small and a large such that for all and large enough n we have
(4.38) Further, by Lemma 4.3 and (4.34) there exists such that
(4.39) Moreover, Lemma 4.10 shows that for some we have
(4.40) Similarly, (4.34), (4.39), the planted p‐symmetry assumption and Lemma 4.10 imply that there is such that for any , for large enough , we have
(4.41) Additionally, Corollary 4.9 shows that for a certain we have
(4.42) Let .
Applying Lemma 4.11 to , we obtain such that the following is true: let and define . Then (using (4.35))
(4.43) Further, Proposition 2.5 and Lemma 4.10 imply that
(4.44) Combining (3.9), (4.40) and (4.44), we get
(4.45) Now, (4.36), (4.38) and (4.43) give an upper bound on the second moment of , namely
(4.46)
As a next step, we are going to show that
(4.47) Indeed, by Proposition 2.5 and Lemma 4.10 we have
(4.48) for large enough l. Further, the local convergence assumption (3.9) and the construction (3.10) of the planted distribution ensure that for large enough l,
Hence, (4.41) and (4.48) show that for l large enough
(4.49) Thus, (4.42), (4.49) and Proposition 3.8 yield (4.47).
Finally, combining (4.46) and (4.47) and applying the Paley‐Zygmund inequality (1.3), we obtain for large n,
Because this holds for any , the assertion follows from Proposition 3.2.
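For reference, the Paley–Zygmund inequality [43] used in the last step (presumably the content of (1.3)) reads: for a non‐negative random variable Z with finite second moment and any 0 ≤ θ < 1,

```latex
\[
  \Pr\big[ Z > \theta\, \mathbb{E}[Z] \big]
  \;\ge\; (1 - \theta)^2 \, \frac{\mathbb{E}[Z]^2}{\mathbb{E}[Z^2]} .
\]
```

So once the second moment is bounded by a constant times the squared first moment, Z exceeds a fixed fraction of its expectation with probability bounded away from 0.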
4.7. Proof of Theorem 4.5
The key step of the proof is to establish the following statement.
Lemma 4.12
For any there exists such that for any there exists n_0 such that for all the following is true. Assume that satisfies
(4.50)
Then G is ‐symmetric and
Before we prove Lemma 4.12 let us show how it implies Theorem 4.5.
Proof of Theorem 4.5
If satisfies , is ‐symmetric and , then by the triangle inequality
Therefore, the theorem follows by applying Lemma 4.12 either to the random factor graph or to the random factor graph chosen from the planted model.
Proof of Lemma 4.12
The proof is morally similar to the one for the special case of the "stochastic block model" from [40]. Let be sufficiently small. By Theorem 2.1 we can pick small enough so that there exists a partition with with respect to which μ_G is ‐homogeneous. Suppose that V_i, S_j are classes such that and such that is ‐regular on V_i. We claim that
(4.51) The assertion is immediate from this inequality. Indeed, suppose that (4.51) is true for all i, j such that is ‐regular on V_i. Then because
Hence, by HM1 and Bayes' rule, . Further, (4.52) and Lemma 2.8 imply that μ_G is ‐regular (provided that we pick γ small enough). Thus, we are left to prove (4.51).
Assume for contradiction that (4.51) is violated for V_i, S_j such that . Then by the triangle inequality there is a set of size at least such that for all we have
For pick such that is maximum. Then by the pigeonhole principle there exist and , such that either
(4.53)
or
(4.54)
In particular, for some ω we have
(4.55) We claim that there is a set of size with the following properties.
- (i) the pairwise distance between any two is at least .
- (ii) for all we have
(4.56) Indeed, because and the assumption (4.50) implies that
(4.57) Since , (4.57) implies that there is a set of size such that (4.56) holds for all . Now, construct a sequence inductively as follows. In step pick some . Then contains x_i and all whose distance from x_i is greater than . Since for each x_i the total number of variable nodes at distance at most is bounded by and , the set has size at least , provided that n is large enough. Finally, simply pick any subset of size .
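The inductive construction just described is a generic greedy separation argument. A toy sketch (hypothetical path graph, l = 1): each chosen vertex excludes only the boundedly many vertices within distance 2l, so a linear‐sized candidate set leaves a large well‐separated subset.

```python
# Greedy selection of vertices with pairwise distance greater than 2*l.
from collections import deque

adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
l = 1

def ball(v, radius):
    """All vertices within the given distance of v (BFS)."""
    seen, queue = {v}, deque([(v, 0)])
    while queue:
        u, d = queue.popleft()
        if d == radius:
            continue
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append((w, d + 1))
    return seen

def greedy_separated(candidates):
    remaining, chosen = set(candidates), []
    while remaining:
        x = min(remaining)           # any choice works
        chosen.append(x)
        remaining -= ball(x, 2 * l)  # enforce pairwise distance > 2l
    return chosen

print(greedy_separated(adj))         # e.g. [1, 4], at distance 3
```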
Consider the event . We claim that
(4.58) Indeed, by (4.56) and the union bound we have
(4.59)
Now, let be the coarsest σ‐algebra such that for all . Suppose that is such that
(4.60) We claim that (4.60) implies
(4.61) Indeed, let . Then (4.60) implies that
(4.62) Furthermore, the pairwise distance of the variables in L is at least and given the values of the variables at distance either or from each are fixed. Therefore, given the events are mutually independent. In effect, X is stochastically dominated by a sum of independent random variables. Hence, recalling that γ is much smaller than δ, we see that (4.61) follows from (4.62) and the Chernoff bound. Finally, combining (4.59) and (4.61) we obtain (4.58).
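To spell out the Chernoff step: one standard form of the bound for a sum S of independent [0, 1]‐valued random variables (which stochastically dominates X here, so its tail bounds transfer) is

```latex
% Bernstein--Chernoff tail bound for a sum S of independent random
% variables taking values in [0,1]:
\[
  \Pr\big[ S \ge \mathbb{E}[S] + t \big]
  \;\le\; \exp\!\Big( - \frac{t^2}{2\,(\mathbb{E}[S] + t/3)} \Big),
  \qquad t > 0 .
\]
```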
But (4.58) does not sit well with (4.55). In fact, (4.55) entails that ; for consider the random variable . Then (4.55) yields . Hence, by Markov's inequality
Combining this bound with (4.58), we obtain . Thus, choosing δ much smaller than γ, we conclude that , which is a contradiction. Thus, we have established (4.51).
5. CONDITIONING ON THE LOCAL STRUCTURE
5.1. A Generalised Configuration Model
The aim in this section is to prove Proposition 4.7. The obvious problem is the conditioning on the σ‐algebra that fixes the depth‐ neighborhoods of all variable nodes and the depth‐ neighborhoods of all constraint nodes. Following 16, we deal with this conditioning by setting up a generalised configuration model.
Recall that is the (finite) set of all isomorphism classes for and for . Let be integers and let be a ‐model of size n. Moreover, let be a ‐acyclic factor graph. Then we define an enhanced ‐model with type set as follows. The set of variable nodes is V, the set of constraint nodes is F, the degrees are given by d and the weight function associated with each constraint a is ψ a just as in . Moreover, the type of a variable clone (x, i) is . Further, the type of a constraint clone (a, j) such that is . Clearly, . The following lemma shows that the model can be used to generate factor graphs whose local structure coincides with that of G.
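The pairing mechanism of such a configuration model can be illustrated by the following toy sketch: clones carry types, and a uniformly random perfect matching is drawn between variable clones and constraint clones of equal type. All names and type labels are hypothetical; in the model above the types are the neighbourhood classes fixed by the conditioning.

```python
# Toy configuration model: match clones uniformly within each type class.
import random
from collections import defaultdict

# (node, clone index) -> type
var_clones = {("x1", 0): "t", ("x1", 1): "s", ("x2", 0): "t", ("x2", 1): "s"}
con_clones = {("a", 0): "t", ("a", 1): "s", ("b", 0): "t", ("b", 1): "s"}

def random_pairing(rng=random):
    by_type_v, by_type_c = defaultdict(list), defaultdict(list)
    for clone, t in var_clones.items():
        by_type_v[t].append(clone)
    for clone, t in con_clones.items():
        by_type_c[t].append(clone)
    edges = []
    for t in by_type_v:
        vs, cs = by_type_v[t][:], by_type_c[t][:]
        assert len(vs) == len(cs), "clone counts per type must agree"
        rng.shuffle(cs)
        edges.extend(zip(vs, cs))  # uniform matching within each type
    return edges

print(random_pairing())
```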
Lemma 5.1
Assume that and that is ‐acyclic. Then , viewed as a ‐factor graph, satisfies
Proof. We are going to show inductively for that . The case l = 0 is immediate from the construction. Thus, assume that l > 0, let and let B be the set of all clones that have distance precisely l – 1 from (x, i). Since is ‐acyclic, the pairwise distance of any two clones in B is at least 2. Moreover, by induction we know that for all . Therefore, .
In order to prove Proposition 4.7 we need to enhance the model further to accommodate an assignment that provides a value from Ω for each clone. Thus, let be a map. We call valid if for all and if for all we have
Of course, we can extend a valid to a map . Given a valid we define a ‐model with variable nodes V, constraint nodes F, degrees d and weight functions such that the type of a variable clone (x, i) is and such that the type of a constraint clone (a, j) with is . By construction, . Let us recall the definition of the distance from (3.5). Further, for two maps let . In Section 5.2 we are going to establish the following.
Lemma 5.2
For any there is such that for the following holds. If is a ‐model of size n, is ‐acyclic and is valid, then with probability at least the random factor graph has the following property. There exist a valid and a ‐acyclic such that
To proceed consider a ‐marginal sequence q. We call q‐valid if the following two conditions hold.
- V1: For all we have
- V2: For all we have
Lemma 5.3
For any there is such that for the following holds. Assume that is a ‐model of size n, is ‐acyclic and q is a ‐marginal sequence such that there exists a q‐valid . Then with the sum ranging over all q‐valid we have
We defer the proof of Lemma 5.3 to Section 5.3.
Proof of Proposition 4.7
We claim that
(5.1) To see this, apply Lemma 5.2 to the constant map for some fixed . Then we conclude that with probability at least 1/2 the random graph is at distance at most from a ‐acyclic . Furthermore, by Lemma 5.1 this factor graph , viewed as an element of , satisfies . Finally, since the total number of factor graphs at distance at most from is bounded by because all degrees are bounded, we obtain (5.1).
Let be small enough. If , then by (4.7) there exists a ‐marginal sequence such that for all . Because is finite and , the total number of such is bounded by a polynomial in n. Moreover, due to the continuity of we can choose small enough so that for all such . Hence, summing over all corresponding to , we obtain from (5.1) and Lemma 5.3 that
Conversely, by Lemma 5.2 with probability at least 1/2 the graph is within distance at most of a ‐acyclic , which satisfies by Lemma 5.1. As before, the total number of graphs at distance at most from is bounded by . Similarly, the total number of at distance at most from is bounded by . Therefore, by Lemma 5.1
as desired.
5.2. Proof of Lemma 5.2
Let be the set of all possible types. For each let be the number of clones with . Throughout this section we assume that is sufficiently large.
Lemma 5.4
There exists such that the following is true. For any there exists such that for every either or
Proof. The number of possible types is bounded independently of n. Hence, choosing β small enough, we can ensure that there exists an integer j > 0 such that .
Fix as in the previous lemma. Call τ rare if and common otherwise. Let Y be the number of variable clones that belong to cycles of length at most in .
Lemma 5.5
For large enough n we have
Proof. Let R be the set of variable clones (v, i) of a rare type and let U be the set of all variable clones whose distance from R in G does not exceed . Since the maximum degree as well as the total number of types are bounded, we have , provided that n is big enough. Thus, to get the desired bound on we merely need to consider the set W of common clones that are at distance more than from R.
More specifically, let (v, i) be a common clone. We are going to bound the probability that and that (v, i) lies on a cycle of length at most . To this end, we are going to explore the (random) factor graph from (v, i) via the principle of deferred decisions. Let be a sequence of indices. If (v, i) lies on a cycle of length at most , then there exists such a sequence that corresponds to this cycle. Namely, with the cycle comprises the clones such that . In particular, . Clearly, the total number of sequences is bounded. Furthermore, given that (v_l, i_l) is common, the probability that is bounded by . Since , the linearity of expectation implies that .
Lemma 5.6
Assume that satisfies . Then there is a ‐acyclic such that
Proof. Let R be the set of variable clones (v, i) of a rare type and let U be the set of all variable clones whose distance from R in G does not exceed . Moreover, let minimise subject to the condition that for all . Then because the total number of types is bounded. Therefore, the assumption implies that , say. In addition, because G is ‐acyclic, none of the clones in R lies on a cycle of length at most in .
Altering only a bounded number of edges in each step, we are now going to remove the short cycles of one by one. Let C be the set of common clones. The construction of ensures that only common clones lie on cycles of length at most . Consider one such clone (v, i) and let N be the set of all variable clones that can be reached from (v, i) by traversing precisely two edges of ; thus, N contains all clones (w, j) such that w has distance two from v and all clones (v, j) that are incident to the same constraint node as (v, i). Once more by the construction of we have . Furthermore, .
We claim that there exists and a bijection such that the following conditions are satisfied.
- (i) for all
- (ii) the pairwise distance in between any two clones in is at least .
- (iii) the distance in between and is at least .
- (iv) the distance between R and is at least .
- (v) any is at distance at least from any clone that belongs to a cycle of of length at most .
Since the maximum degree of is bounded by Δ, no more than clones violate condition (iii), (iv) or (v). By comparison, there are at least clones of any common type. Hence, the existence of ξ follows.
Now, obtain from as follows.
- let and for all ;
- let for all .
It is immediate from the construction that any clone on a cycle of length at most in also lies on such a cycle of . Moreover, (v, i) does not lie on a cycle of length at most in . Hence, . In addition, all clones on cycles of length at most and their neighbours are common. Hence, the construction can be repeated on . Since , we ultimately obtain a ‐acyclic with .
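Reduced to its combinatorial core, each step above is a switching: a clone on a short cycle and a far‐away clone of the same type exchange their matched partners, which preserves all degrees and types while, thanks to the separation conditions (i)–(v), destroying the short cycle without creating a new one nearby. A minimal sketch with illustrative names:

```python
# Toy switching step: the matching maps variable clones to constraint
# clones, and two variable clones of the same type swap partners.
def switch(matching, cycle_clone, far_clone):
    """Exchange the matched partners of two variable clones."""
    matching[cycle_clone], matching[far_clone] = (
        matching[far_clone], matching[cycle_clone])

matching = {("x1", 0): ("a", 0), ("x7", 0): ("b", 1)}
switch(matching, ("x1", 0), ("x7", 0))
print(matching)  # {('x1', 0): ('b', 1), ('x7', 0): ('a', 0)}
```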
Proof of Lemma 5.2
The assertion is immediate from Lemmas 5.5 and 5.6 and Markov's inequality.
5.3. Proof of Lemma 5.3
Let and for let n_T be the number of variable nodes x such that . By Stirling's formula the number of assignments with marginals as prescribed by q satisfies
| (5.2) |
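In the simplest case of a single tree type, the estimate behind (5.2) is the classical multinomial count: assuming the numbers nq(ω) are integers, the number of assignments with empirical distribution q is

```latex
% Multinomial count of assignments with prescribed empirical
% distribution q, with its Stirling approximation via the entropy H(q):
\[
  \frac{n!}{\prod_{\omega \in \Omega} \big( n\, q(\omega) \big)!}
  \;=\; \exp\big( n H(q) + O(\ln n) \big),
  \qquad
  H(q) = -\sum_{\omega \in \Omega} q(\omega) \ln q(\omega) .
\]
```

One such factor per tree type T, weighted by n_T, is presumably how the entropy terms in (5.2) arise.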
Further, for and let be the set of all clones such that . Moreover, let be the set of all clones such that . Additionally, let be the set of all pairs with such that there is such that . Of course, the total number of perfect matchings between and equals . If we fix , then any such perfect matching induces an assignment by mapping a clone matched to (x, i) to the value . Let be the event that in such a random matching for all and all ω we have
Moreover, for let be the number of such that . Then
Let . Multiplying up over all (T, i), we obtain for
| (5.3) |
where the constant hidden in the depends on only.
Further, for let be the event that for every we have
Then
| (5.4) |
Moreover,
| (5.5) |
In addition,
| (5.6) |
Further, because q is a ‐marginal sequence, condition MS3 guarantees that
| (5.7) |
Hence, letting , we obtain from (5.4) and (5.5)
| (5.8) |
Once more the constant hidden in the depends on only. Further, given we have
| (5.9) |
Finally, the assertion follows from (5.2), (5.3), (5.8) and (5.9).
ACKNOWLEDGEMENTS
The second author thanks Dimitris Achlioptas for inspiring discussions. We also thank two anonymous reviewers for their careful reading and their invaluable comments, which led to an improved version of Corollary 2.4, among other things.
Supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007‐2013), ERC Grant Agreement n. 278857‐PTCC.
A preliminary version [10] of this paper, presented by the first author at RANDOM 2015 and by the second author at the RS&A 2015 conference, contained a critical technical error that affected its main results. This present version is based on similar key insights but the main results are different from the ones stated in [10].
Contributor Information
Victor Bapst, Email: bapst@math.uni-frankfurt.de.
Amin Coja‐Oghlan, Email: acoghlan@math.uni-frankfurt.de.
REFERENCES
- 1. Achlioptas D. and Coja‐Oghlan A., Algorithmic barriers from phase transitions, In Proceedings of 49th FOCS, IEEE, Philadelphia, 2008, pp. 793–802.
- 2. Achlioptas D. and Moore C., Random k‐SAT: Two moments suffice to cross a sharp threshold, SIAM J Comput 36 (2006), 740–762.
- 3. Achlioptas D. and Naor A., The two possible values of the chromatic number of a random graph, Ann Math 162 (2005), 1333–1349.
- 4. Achlioptas D., Naor A., and Peres Y., Rigorous location of phase transitions in hard optimization problems, Nature 435 (2005), 759–764.
- 5. Achlioptas D., Naor A., and Peres Y., On the maximum satisfiability of random formulas, J ACM 54 (2007).
- 6. Achlioptas D. and Peres Y., The threshold for random k‐SAT is , J AMS 17 (2004), 947–973.
- 7. Aldous D., Representations for partially exchangeable arrays of random variables, J Multivariate Anal 11 (1981), 581–598.
- 8. Aldous D. and Steele J., The objective method: probabilistic combinatorial optimization and local weak convergence, In Kesten H., editor, Probability on discrete structures, Encyclopaedia of Mathematical Sciences, Vol. 110, Springer, Berlin, 2004, pp. 1–72.
- 9. Bandyopadhyay A. and Gamarnik D., Counting without sampling: Asymptotics of the log‐partition function for certain statistical physics models, Random Struct Algorithms 33 (2008), 452–479.
- 10. Bapst V. and Coja‐Oghlan A., Harnessing the Bethe free energy, In Proceedings of 19th RANDOM, Leibniz International Proceedings in Informatics, Princeton, 2015, pp. 467–480. Also available as arXiv:1504.03975, version 1.
- 11. Bapst V. and Coja‐Oghlan A., The condensation phase transition in the regular k‐SAT model, In Proceedings of 20th RANDOM, Leibniz International Proceedings in Informatics, Paris, 2016, pp. 22:1–22:18.
- 12. Bapst V., Coja‐Oghlan A., Hetterich S., Rassmann F., and Vilenchik D., The condensation phase transition in random graph coloring, Comm Math Phys 341 (2016), 543–606.
- 13. Bapst V., Coja‐Oghlan A., and Rassmann F., A positive temperature phase transition in random hypergraph 2‐coloring, Ann Appl Probab 26 (2016), 1362–1406.
- 14. Barak B., Rao A., Shaltiel R., and Wigderson A., 2‐source dispersers for sub‐polynomial entropy and Ramsey graphs beating the Frankl‐Wilson construction, In Proceedings of 38th STOC, ACM, Seattle, 2006, pp. 671–680.
- 15. Bayati M., Gamarnik D., and Tetali P., Combinatorial approach to the interpolation method and scaling limits in sparse random graphs, Ann Probab 41 (2013), 4080–4115.
- 16. Bordenave C. and Caputo P., Large deviations of empirical neighborhood distribution in sparse random graphs, Probab Theory Relat Fields 163 (2015), 149–222.
- 17. Coja‐Oghlan A. and Panagiotou K., The asymptotic k‐SAT threshold, Adv Math 288 (2016), 985–1068.
- 18. Coja‐Oghlan A., Perkins W., and Skubch K., Limits of discrete distributions and Gibbs measures on random graphs, preprint, arXiv:1512.06798, 2015.
- 19. Coja‐Oghlan A. and Perkins W., Belief Propagation on replica symmetric random factor graph models, In Proceedings of 20th RANDOM, Leibniz International Proceedings in Informatics, 2016, pp. 27:1–27:15.
- 20. Coja‐Oghlan A. and Zdeborová L., The condensation transition in random hypergraph 2‐coloring, In Proceedings of 23rd SODA, ACM‐SIAM, Kyoto, 2012, pp. 241–250.
- 21. Contucci P., Dommers S., Giardina C., and Starr S., Antiferromagnetic Potts model on the Erdős‐Rényi random graph, Commun Math Phys 323 (2013), 517–554.
- 22. Dembo A. and Montanari A., Ising models on locally tree‐like graphs, Ann Appl Probab 20 (2010), 565–592.
- 23. Dembo A., Montanari A., and Sun N., Factor models on locally tree‐like graphs, Ann Probab 41 (2013), 4162–4213.
- 24. Dembo A., Montanari A., Sly A., and Sun N., The replica symmetric solution for Potts models on d‐regular graphs, Comm Math Phys 327 (2014), 551–575.
- 25. Ding J., Sly A., and Sun N., Satisfiability threshold for random regular NAE‐SAT, In Proceedings of 46th STOC, ACM, New York, 2014, pp. 814–822.
- 26. Ding J., Sly A., and Sun N., Proof of the satisfiability conjecture for large k, In Proceedings of 47th STOC, ACM, Portland, 2015, pp. 59–68.
- 27. Durrett R., Probability: Theory and examples, 4th edition, Cambridge University Press, Cambridge, 2010.
- 28. Erdős P., Some remarks on the theory of graphs, Bull Am Math Soc 53 (1947), 292–294.
- 29. Erdős P., Graph theory and probability, Canad J Math 11 (1959), 34–38.
- 30. Franz S. and Leone M., Replica bounds for optimization problems and diluted spin systems, J Stat Phys 111 (2003), 535–564.
- 31. Galanis A., Stefankovic D., and Vigoda E., Inapproximability for antiferromagnetic spin systems in the tree non‐uniqueness region, In Proceedings of 46th STOC, ACM, New York, 2014, pp. 823–831.
- 32. Guerra F., Broken replica symmetry bounds in the mean field spin glass model, Comm Math Phys 233 (2003), 1–12.
- 33. Hoover D., Relations on probability spaces and arrays of random variables, Preprint, Institute for Advanced Study, Princeton, 1979.
- 34. Krzakala F., Montanari A., Ricci‐Tersenghi F., Semerjian G., and Zdeborová L., Gibbs states and the set of solutions of random constraint satisfaction problems, Proc Nat Acad Sci 104 (2007), 10318–10323.
- 35. Lovász L., Large networks and graph limits, Vol. 60, Colloquium Publications, AMS, Providence, 2012.
- 36. Mézard M. and Montanari A., Information, physics and computation, Oxford University Press, Oxford, 2009.
- 37. Mézard M., Parisi G., and Zecchina R., Analytic and algorithmic solution of random satisfiability problems, Science 297 (2002), 812–815.
- 38. Montanari A. and Shah D., Counting good truth assignments of random k‐SAT formulae, In Proceedings of 18th SODA, ACM‐SIAM, New Orleans, 2007, pp. 1255–1264.
- 39. Mossel E., Weitz D., and Wormald N., On the hardness of sampling independent sets beyond the tree threshold, Probab Theory Relat Fields 143 (2009), 401–439.
- 40. Mossel E., Neeman J., and Sly A., Reconstruction and estimation in the planted partition model, Probab Theory Relat Fields 162 (2014), 1–31.
- 41. Nešetřil J., A combinatorial classic – sparse graphs with high chromatic number, In Lovász L. et al., editors, Erdős Centennial, Springer, Berlin, 2013.
- 42. Panchenko D. and Talagrand M., Bounds for diluted mean‐field spin glass models, Probab Theory Relat Fields 130 (2004), 319–336.
- 43. Paley R. and Zygmund A., On some series of functions, (3), Proc Camb Philos Soc 28 (1932), 190–205.
- 44. Panchenko D., Spin glass models from the point of view of spin distributions, Ann Probab 41 (2013), 1315–1361.
- 45. Panchenko D., The Sherrington‐Kirkpatrick model, Springer, Berlin, 2013.
- 46. Richardson T. and Urbanke R., Modern coding theory, Cambridge University Press, Cambridge, 2008.
- 47. Robinson R. and Wormald N., Almost all regular graphs are Hamiltonian, Random Struct Algorithms 5 (1994), 363–374.
- 48. Sly A. and Sun N., The computational hardness of counting in two‐spin models on d‐regular graphs, In Proceedings of 53rd FOCS, IEEE, New Brunswick, 2012, pp. 361–369.
- 49. Szemerédi E., Regular partitions of graphs, Colloq Inter CNRS 260 (1978), 399–401.
- 50. Tao T., Szemerédi's regularity lemma revisited, Contrib Discrete Math 1 (2006), 8–28.
- 51. Turán P., On a theorem of Hardy and Ramanujan, J London Math Soc 9 (1934), 274–276.
