. 2020 Sep 16;191(2):487–558. doi: 10.1007/s10107-020-01562-6

Complete positivity and distance-avoiding sets

Evan DeCorte 1, Fernando Mário de Oliveira Filho 2, Frank Vallentin 3
PMCID: PMC8863813  PMID: 35250093

Abstract

We introduce the cone of completely positive functions, a subset of the cone of positive-type functions, and use it to fully characterize maximum-density distance-avoiding sets as the optimal solutions of a convex optimization problem. As a consequence of this characterization, it is possible to reprove and improve many results concerning distance-avoiding sets on the sphere and in Euclidean space.

Keywords: Hadwiger-Nelson problem, Chromatic number of Euclidean space, Semidefinite programming, Copositive programming, Harmonic analysis

Introduction

The two prototypical geometrical problems considered in this paper are:

  (P1) What is the maximum surface measure $m_0(S^{n-1})$ that a subset of the unit sphere $S^{n-1} = \{x \in \mathbb{R}^n : \|x\| = 1\}$ can have if it does not contain pairs of orthogonal vectors?

  (P2) What is the maximum density $m_1(\mathbb{R}^n)$ that a subset of $\mathbb{R}^n$ can have if it does not contain pairs of points at distance 1?

Problem (P1) was posed by Witsenhausen [48]. Two antipodal open spherical caps of radius $\pi/4$ form a subset of $S^{n-1}$ with no pairs of orthogonal vectors, and Kalai [20, Conjecture 2.8] conjectured that this construction is optimal, that is, that it attains $m_0(S^{n-1})$; this conjecture remains open for all $n \geq 2$. Problem (P1) will be considered in depth in Sect. 8, where many upper bounds for $m_0(S^{n-1})$ will be improved.

Problem (P2) figures in Moser’s collection of problems [32] and was popularized by Erdős, who conjectured that $m_1(\mathbb{R}^2) < 1/4$ (cf. Székely [45]); this conjecture is still open. A long-standing conjecture of L. Moser (cf. Conjecture 1 in Larman and Rogers [26]), related to Erdős’s conjecture, would imply that $m_1(\mathbb{R}^n) \leq 1/2^n$ for all $n \geq 2$. Moser’s conjecture asserts that the maximum measure of a subset of the unit ball having no pairs of points at distance 1 is at most $1/2^n$ times the measure of the unit ball; it has recently been shown to be false [34]: the behavior of subsets of the unit ball that avoid distance 1 resembles Kalai’s double cap conjecture. Problem (P2) will be considered in detail in Sect. 9, where upper bounds for $m_1(\mathbb{R}^n)$ will be improved.

Bachoc et al. [1] proposed an upper bound for $m_0(S^{n-1})$ similar to the linear programming bound of Delsarte et al. [10] for the maximum cardinality of spherical codes. Recall that a continuous function $f\colon [-1,1] \to \mathbb{R}$ is of positive type for $S^{n-1}$ if for every finite set $U \subseteq S^{n-1}$ the matrix $(f(x \cdot y))_{x,y \in U}$ is positive semidefinite. Bachoc, Nebe, Oliveira, and Vallentin showed that the optimal value of the infinite-dimensional optimization problem

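$$\begin{array}{ll}
\text{maximize} & \int_{S^{n-1}} \int_{S^{n-1}} f(x \cdot y)\, d\omega(y)\, d\omega(x) \\
& f(1) = \omega(S^{n-1})^{-1}, \\
& f(0) = 0, \\
& f\colon [-1,1] \to \mathbb{R} \text{ is continuous and of positive type for } S^{n-1}
\end{array} \tag{1}$$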

is an upper bound for $m_0(S^{n-1})$. Here, $\omega$ is the surface measure on $S^{n-1}$.

Later, Oliveira and Vallentin [36] proposed an upper bound for $m_1(\mathbb{R}^n)$ similar to the linear programming bound of Cohn and Elkies [7] for the maximum density of a sphere packing in $\mathbb{R}^n$; the Cohn–Elkies bound has recently been used to solve the sphere-packing problem in dimensions 8 and 24 [8, 46]. Recall that a continuous function $f\colon \mathbb{R}^n \to \mathbb{R}$ is of positive type if for every finite set $U \subseteq \mathbb{R}^n$ the matrix $(f(x - y))_{x,y \in U}$ is positive semidefinite. Oliveira and Vallentin showed that the optimal value of the infinite-dimensional optimization problem

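$$\begin{array}{ll}
\text{maximize} & M(f) \\
& f(0) = 1, \\
& f(x) = 0 \text{ if } \|x\| = 1, \\
& f\colon \mathbb{R}^n \to \mathbb{R} \text{ is continuous and of positive type}
\end{array} \tag{2}$$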

is an upper bound for $m_1(\mathbb{R}^n)$. Here, $M(f)$ is the mean value of $f$, defined as

$$M(f) = \lim_{T \to \infty} \frac{1}{\operatorname{vol}[-T,T]^n} \int_{[-T,T]^n} f(x)\, dx.$$
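As a rough numerical illustration of the mean value, the following sketch approximates $M(f)$ in dimension $n = 1$ by truncating the limit at a finite $T$ and applying the midpoint rule. The function name `mean_value_1d`, the cutoff `T`, and the sample count are assumptions made for this example only; they do not come from the paper.

```python
import numpy as np

def mean_value_1d(f, T, samples=400_000):
    """Approximate (1 / vol[-T, T]) * integral of f over [-T, T] (midpoint rule)."""
    h = 2.0 * T / samples                      # width of each subinterval
    x = -T + h * (np.arange(samples) + 0.5)    # midpoints of the subintervals
    return float(np.mean(f(x)))                # average value = integral / (2T)

# The constant function 1 has mean value 1.
m_const = mean_value_1d(lambda x: np.ones_like(x), T=1000.0)

# cos(x)^2 = (1 + cos(2x)) / 2 has mean value 1/2; the oscillating part averages out.
m_cos2 = mean_value_1d(lambda x: np.cos(x) ** 2, T=1000.0)
```

For a finite `T` the result is only an approximation of the limit; for $\cos^2$ the truncation error is of order $1/T$.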

An explicit characterization of functions of positive type for $S^{n-1}$ is given by Schoenberg’s theorem [40]. Likewise, functions of positive type on $\mathbb{R}^n$ are characterized by Bochner’s theorem [38, Theorem IX.9]. Using these characterizations, it is possible to rewrite and simplify problems (1) and (2), which become infinite-dimensional linear programs. It then becomes possible to solve these problems by computer or even analytically; in this way, one obtains upper bounds for the geometrical parameters $m_0(S^{n-1})$ and $m_1(\mathbb{R}^n)$. Both optimization problems above can also be strengthened by the addition of extra constraints. The best bounds for both geometrical parameters, in several dimensions, were obtained through strengthenings of the optimization problems above; see Sects. 8 and 9.

A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is completely positive if it is a conic combination of rank-one, symmetric, and nonnegative matrices, that is, if there are nonnegative vectors $f_1, \ldots, f_k \in \mathbb{R}^n$ such that

$$A = f_1 \otimes f_1^* + \cdots + f_k \otimes f_k^*.$$

The set of all completely positive matrices is a closed and convex cone of symmetric matrices that is strictly contained in the cone of positive-semidefinite matrices. Completely positive matrices are the main object of study in this paper.
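The definition can be made concrete with a small numerical sketch. The code below builds a completely positive matrix from nonnegative factors and checks the two necessary properties mentioned above (entrywise nonnegativity and positive semidefiniteness). The helper name and the example vectors are hypothetical; note that passing these checks does not certify complete positivity in general, since the containment in the positive-semidefinite cone is strict.

```python
import numpy as np

def completely_positive_from_factors(factors):
    """Return A = sum_i f_i f_i^T for a list of entrywise-nonnegative vectors f_i."""
    factors = [np.asarray(f, dtype=float) for f in factors]
    if any((f < 0).any() for f in factors):
        raise ValueError("every factor must be entrywise nonnegative")
    return sum(np.outer(f, f) for f in factors)

# Hypothetical example data: two nonnegative vectors in R^3.
A = completely_positive_from_factors([[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]])

is_symmetric = bool(np.allclose(A, A.T))
is_nonnegative = bool((A >= 0).all())                   # necessary condition
min_eigenvalue = float(np.linalg.eigvalsh(A).min())     # >= 0 up to rounding
```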

A continuous function $f\colon [-1,1] \to \mathbb{R}$ is of completely positive type for $S^{n-1}$ if for every finite set $U \subseteq S^{n-1}$ the matrix $(f(x \cdot y))_{x,y \in U}$ is completely positive. Analogously, a continuous function $f\colon \mathbb{R}^n \to \mathbb{R}$ is of completely positive type if for every finite set $U \subseteq \mathbb{R}^n$ the matrix $(f(x - y))_{x,y \in U}$ is completely positive. Notice that functions of completely positive type are functions of positive type, but not every function of positive type is of completely positive type.

The central result of this paper is that, by considering functions of completely positive type instead of functions of positive type, one fully characterizes the geometrical parameters in (P1) and (P2).

Theorem 1.1

If in (1) we require $f$ to be of completely positive type, then the optimal value of the problem is exactly $m_0(S^{n-1})$. Similarly, if in (2) we require $f$ to be of completely positive type, then the optimal value is exactly $m_1(\mathbb{R}^n)$.

The significance of this result is twofold.

First, it gives us a source of constraints that can be added to (1) or (2) and asserts that this source is complete, that is, that the constraints are sufficient for us to obtain the exact parameters. Namely, for every finite set $U \subseteq S^{n-1}$ we can add to (1) the constraint that $(f(x \cdot y))_{x,y \in U}$ has to be completely positive, and similarly for (2). All strengthenings of problems (1) and (2) considered so far in the literature have used such constraints. In this paper, by systematically using them, we are able to improve many of the known upper bounds for $m_0(S^{n-1})$ and $m_1(\mathbb{R}^n)$; see Table 1 in Sect. 8 and Table 2 in Sect. 9.

Table 1.

New upper bounds for the independence ratio of $G(S^{n-1}, \{\pi/2\})$. Next to each bound is the number of $\mathrm{BQP}(U)$-constraints used to obtain it. The lower bounds come from two opposite spherical caps. The bound for $n = 3$ improves on a previous bound of 0.308 by Zhao (personal communication); the bounds for $n \geq 4$ improve on Witsenhausen’s bound [48] of $1/n$

n    Upper bound    Lower bound    # Extra constraints
3    0.30153        0.2929         11
4    0.21676        0.1817         2
5    0.16765        0.1161         1
6    0.13382        0.0756         3
7    0.11739        0.0498         2
8    0.09981        0.0331         2

Table 2.

The bounds for $n = 3$ are due to Oliveira and Vallentin [36]; all other bounds are due to Bachoc et al. [2]. The graphs used for the subgraph constraints are indicated in the last column; they are the same ones used by Bachoc, Passuello, and Thiery (ibid., Table 2), except for the 8-simplex, which is the regular simplex of side-length 1 in $\mathbb{R}^8$

     Upper bound for $\alpha_{\bar\delta}$    Lower bound for $\chi_m$
n    Previous     New          Previous    New    Graphs used
3    0.1645090    0.1532996    7           7      None
4    0.1000620    0.0985701    10          11     600-cell
5    0.0677778    0.0624485    15          17     600-cell
6    0.0478444    0.0450325    21          23     600-cell
7    0.0276502    0.0260782    37          39     $E_8$ kissing
8    0.0195941    0.0190945    52          53     $E_8$ and 8-simplex

Second, the characterization of $m_0(S^{n-1})$ and $m_1(\mathbb{R}^n)$ in terms of convex optimization problems, even computationally difficult ones, is good enough to allow us to derive some interesting theoretical results through analytical methods. For instance, denote by $m_{d_1,\ldots,d_N}(\mathbb{R}^n)$ the maximum density that a Lebesgue-measurable set $I \subseteq \mathbb{R}^n$ can have if it is such that $\|x - y\| \notin \{d_1, \ldots, d_N\}$ for all distinct $x$, $y \in I$. Bukh [6] showed, unifying results by Furstenberg et al. [17], Bourgain [5], Falconer [14], and Falconer and Marstrand [13], that, as the distances $d_1$, ..., $d_N$ space out, $m_{d_1,\ldots,d_N}(\mathbb{R}^n)$ approaches $(m_1(\mathbb{R}^n))^N$. This precise asymptotic result can be recovered from (2) by using functions of completely positive type in a systematic way. Another result of Bukh (ibid.) that can be proved using this approach is the Turing-machine computability of $m_1(\mathbb{R}^n)$. Using our convex formulation one can in principle extend this computability result to distance-avoiding sets in other geometric spaces.

Outline of the paper

The main theorem proved in this paper is Theorem 5.1, from which Theorem 1.1 follows. Theorem 5.1 is stated in terms of graphs on topological spaces and is much more general than Theorem 1.1. It has a rather technical statement, but it is in fact a natural extension of a well-known result in combinatorial optimization, namely that the independence number of a graph is the optimal value of a convex optimization problem over the cone of completely positive matrices. This connection is the main thread of this paper; it will be clarified in Sect. 3.

In Sect. 2 we will see how geometrical parameters such as $m_0(S^{n-1})$ and $m_1(\mathbb{R}^n)$ can be modeled as the independence number of certain graphs defined over topological spaces such as the sphere. In Sect. 3 this will allow us to extend the completely positive formulation for the independence number from finite graphs to these topological graphs; this extension will rely on the introduction of the cone of completely positive operators on a Hilbert space. A study of these operators, carried out in Sect. 4, will then allow us to prove Theorem 5.1 in Sect. 5 and extend it from compact spaces to $\mathbb{R}^n$ in Sect. 6. In Sects. 7, 8, and 9 we will see how to use Theorem 5.1 to obtain better bounds for $m_0(S^{n-1})$ and $m_1(\mathbb{R}^n)$; these sections will be focused on computational techniques. We close in Sect. 10 by seeing how Theorem 5.1 can be used to prove Bukh’s results [6] concerning sets avoiding many distances and the computability of $m_1(\mathbb{R}^n)$.

Notation

All graphs considered have neither loops nor parallel edges. Often, the edge set of a graph $G = (V, E)$ is also seen as a symmetric subset of $V \times V$. In this case, $x$, $y \in V$ are adjacent if and only if $(x, y)$, $(y, x) \in E$. A graph $G = (V, E)$ is a topological graph if $V$ is a topological space; topological properties of $E$ (e.g., closedness, compactness) always refer to $E$ as a subset of $V \times V$.

If $V$ is a metric space with metric $d$, then for $x \in V$ and $\delta > 0$ we denote by

$$B(x, \delta) = \{y \in V : d(y, x) < \delta\}$$

the open ball with center $x$ and radius $\delta$. The topological closure of a set $X$ is denoted by $\operatorname{cl} X$. The term “neighborhood” always means “open neighborhood”, though the distinction is never really relevant.

The Euclidean inner product on $\mathbb{R}^n$ is denoted by $x \cdot y = x_1 y_1 + \cdots + x_n y_n$ for $x$, $y \in \mathbb{R}^n$. The $(n-1)$-dimensional unit sphere is $S^{n-1} = \{x \in \mathbb{R}^n : \|x\| = 1\}$.

All functions considered are real valued unless otherwise noted. If $V$ is a measure space with measure $\omega$, then the inner product of $f$, $g \in L^2(V)$ is

$$(f, g) = \int_V f(x) g(x)\, d\omega(x).$$

The inner product of kernels $A$, $B \in L^2(V \times V)$ is

$$\langle A, B \rangle = \int_V \int_V A(x, y) B(x, y)\, d\omega(y)\, d\omega(x).$$

When $V$ is finite and $\omega$ is the counting measure, then $\langle A, B \rangle$ is the trace inner product. If $f \in L^2(V)$, then $f \otimes f^*$ denotes the kernel $(x, y) \mapsto f(x) f(y)$.

Denote by $L^2_{\mathrm{sym}}(V \times V)$ the space of all kernels that are symmetric, that is, self adjoint as operators. Note that $A \in L^2_{\mathrm{sym}}(V \times V)$ if and only if $A \in L^2(V \times V)$ and $A(x, y) = A(y, x)$ almost everywhere. A symmetric kernel $A$ is positive if for all $f \in L^2(V)$ we have

$$\int_V \int_V A(x, y) f(x) f(y)\, d\omega(y)\, d\omega(x) \geq 0.$$

Locally independent graphs

Let $G = (V, E)$ be a graph (without loops and parallel edges). A set $I \subseteq V$ is independent if it does not contain pairs of adjacent vertices, that is, if for all $x$, $y \in I$ we have $(x, y) \notin E$. The independence number of $G$, denoted by $\alpha(G)$, is the maximum cardinality of an independent set in $G$. The problem of computing the independence number of a finite graph figures, as the complementary maximum-clique problem, in Karp’s original list of 21 NP-hard problems [21].

To model the geometrical parameters $m_0(S^{n-1})$ and $m_1(\mathbb{R}^n)$ as the independence number of some graph, we will have to extend the concept of independence number from finite to infinite graphs. Then the nature of both the vertex and edge sets plays a role; this can be best seen considering a few examples.

Let $V$ be a metric space with metric $d$ and take $D \subseteq (0, \infty)$. The $D$-distance graph on $V$ is the graph $G(V, D)$ whose vertex set is $V$ and in which vertices $x$, $y$ are adjacent if $d(x, y) \in D$. Independent sets in $G(V, D)$ are sometimes called $D$-avoiding sets. Let us consider a few concrete choices for $V$ and $D$, corresponding to central problems in discrete geometry.

  • (i)

    The kissing number problem: $V = S^{n-1}$ and $D = (0, \pi/3)$. Here we consider the metric $d(x, y) = \arccos x \cdot y$. In this case, all independent sets in $G(V, D)$ are finite; even more, the independence number is finite. The independent sets in $G(V, D)$ are exactly the contact points of kissing configurations in $\mathbb{R}^n$, so $\alpha(G(V, D))$ is the kissing number of $\mathbb{R}^n$.

  • (ii)

    Witsenhausen’s problem (P1): $V = S^{n-1}$ and $D = \{\pi/2\}$. Again we consider the metric $d(x, y) = \arccos x \cdot y$. An independent set in $G(V, D)$ is a set without pairs of orthogonal vectors. These sets can be infinite and even have positive surface measure, so $\alpha(G(V, D)) = \infty$. The right concept in this case is the measurable independence number
    $$\alpha_\omega(G(V, D)) = \sup\{\omega(I) : I \subseteq V \text{ is measurable and independent}\},$$
    where $\omega$ is the surface measure on the sphere. Then $\alpha_\omega(G(V, D)) = m_0(S^{n-1})$.
  • (iii)

    The sphere-packing problem: $V = \mathbb{R}^n$ and $D = (0, 1)$. Here we consider the Euclidean metric. The independent sets in $G(V, D)$ are the sets of centers of spheres in a packing of spheres of radius $1/2$ in $\mathbb{R}^n$. So independent sets in $G(V, D)$ can be infinite but are always discrete, hence $\alpha(G(V, D)) = \infty$ while independent sets always have Lebesgue measure 0. A better definition of independence number in this case would be the center density of the corresponding packing, that is, the average number of points per unit volume.

  • (iv)

    Measurable one-avoiding sets (P2): $V = \mathbb{R}^n$ and $D = \{1\}$. In this case, $G(V, D)$ is called the unit-distance graph of $\mathbb{R}^n$. Independent sets in this graph can be infinite and even have infinite Lebesgue measure, hence $\alpha(G(V, D)) = \infty$. So the right notion of independence number is the density of a set, informally the fraction of space it covers. We will formally define the independence density $\alpha_{\bar\delta}(G(V, D)) = m_1(\mathbb{R}^n)$ in Sect. 6.
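For finite point sets, the definition of a $D$-avoiding set can be checked directly. The following sketch tests whether a finite set of points is independent in $G(V, D)$, with $D$ passed as a membership predicate. The names `is_avoiding`, `angular`, and `euclidean`, the numerical tolerance, and the sample points are assumptions made for illustration; the measure-theoretic notions in (ii) and (iv) are not captured by such a finite check.

```python
import numpy as np

def is_avoiding(points, in_D, metric):
    """True if no two distinct points are at a distance d with in_D(d) true."""
    pts = [np.asarray(p, dtype=float) for p in points]
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if in_D(metric(pts[i], pts[j])):
                return False
    return True

def angular(x, y):
    """Metric d(x, y) = arccos(x . y) on the unit sphere."""
    return float(np.arccos(np.clip(np.dot(x, y), -1.0, 1.0)))

def euclidean(x, y):
    return float(np.linalg.norm(x - y))

tol = 1e-9  # tolerance for floating-point comparison (an assumption of this sketch)
orthogonal = lambda d: abs(d - np.pi / 2) < tol   # D = {pi/2}, example (ii)
unit = lambda d: abs(d - 1.0) < tol               # D = {1}, example (iv)

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
diag = (e1 + e2) / np.sqrt(2.0)

sphere_ok = is_avoiding([e1, diag], orthogonal, angular)    # angle pi/4 apart
sphere_bad = is_avoiding([e1, e2], orthogonal, angular)     # orthogonal pair
plane_ok = is_avoiding([[0.0, 0.0], [0.5, 0.0]], unit, euclidean)
plane_bad = is_avoiding([[0.0, 0.0], [1.0, 0.0]], unit, euclidean)
```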

In the first two examples above, the vertex set is compact. In (i), there is $\delta > 0$ such that $(0, \delta) \subseteq D$. Then every point has a neighborhood that is a clique (that is, a set of pairwise adjacent vertices), and this implies that all independent sets are discrete and hence finite, given the compactness of $V$. In (ii), 0 is isolated from $D$. Then every point has an independent neighborhood and there are independent sets of positive measure.

In the last two examples, the vertex set is not compact. In (iii), again there is $\delta > 0$ such that $(0, \delta) \subseteq D$, and this implies that all independent sets are discrete, though since $V$ is not compact they can be infinite. In (iv), 0 is again isolated from $D$, hence there are independent sets of positive measure and even infinite measure, given that $V$ is not compact.

We have therefore two things at play. First, compactness of the vertex set. Second, the nature of the edge set, which in the examples above depends on 0 being isolated from D or not.

In this paper, the focus rests on graphs with compact vertex sets, though the noncompact case of $\mathbb{R}^n$ can be handled by seeing $\mathbb{R}^n$ as a limit of tori (see Sect. 6 below). As for the edge set, we consider graphs like the ones in examples (ii) and (iv).

The graphs in examples (i) and (iii) are topological packing graphs, a concept introduced by de Laat and Vallentin [25]. These are topological graphs in which every finite clique is a subset of an open clique. In particular, every vertex has a neighborhood that is a clique. Here and in the remainder of the paper we consider locally independent graphs, which are in a sense the complements of topological packing graphs.

Definition 2.1

A topological graph is locally independent if every compact independent set in it is a subset of an open independent set.

In particular, every vertex of a locally independent graph has an independent neighborhood. The graphs in examples (ii) and (iv) are locally independent, as follows from the following theorem.

Theorem 2.2

If G=(V,E) is a topological graph, if V is metrizable, and if E is closed, then G is locally independent.

Proof

Let $d$ be a metric that induces the topology on $V$. For $V \times V$ we consider the metric

$$d((x, y), (x', y')) = \max\{d(x, x'), d(y, y')\}$$

which induces on $V \times V$ the product topology.

Consider the function $d_E\colon V \times V \to \mathbb{R}$ such that

$$d_E(x, y) = d((x, y), E) = \inf\{d((x, y), (x', y')) : (x', y') \in E\};$$

this is a continuous function.

Let $I \subseteq V$ be a nonempty and compact independent set. Since $I \times I$ is compact, the function $d_E$ has a minimum $\delta$ over $I \times I$. Note $\delta > 0$. Indeed, since $I \times I$ is compact, there is $(x, y) \in I \times I$ such that $d((x, y), E) = \delta$. Since $I$ is independent, $(x, y) \notin E$. But then from the closedness of $E$ there is $\epsilon > 0$ such that $E \cap (B(x, \epsilon) \times B(y, \epsilon)) = \emptyset$, whence $\delta > 0$.

Next take the set

$$S = \bigcup_{x \in I} B(x, \delta).$$

This is an open set that contains $I$; it is moreover independent. Indeed, suppose $x'$, $y' \in S$ are adjacent. Take $x$, $y \in I$ such that $x' \in B(x, \delta)$ and $y' \in B(y, \delta)$. Then

$$d((x, y), (x', y')) = \max\{d(x, x'), d(y, y')\} < \delta,$$

a contradiction since $(x', y') \in E$, $x$, $y \in I$, and $d_E(x, y) \geq \delta$.

Let $G = (V, E)$ be a topological graph and $\omega$ be a Borel measure on $V$. The independence number of $G$ with respect to the measure $\omega$ is

$$\alpha_\omega(G) = \sup\{\omega(I) : I \subseteq V \text{ is measurable and independent}\};$$

when speaking of the independence number of a graph, the measure considered will always be clear from the context. The following theorem is a converse of sorts to Theorem 2.2.

Theorem 2.3

If $G = (V, E)$ is locally independent, then so is $G' = (V, \operatorname{cl} E)$. Moreover, if $\omega$ is an inner-regular Borel measure on $V$, then $\alpha_\omega(G') = \alpha_\omega(G)$.

Proof

Let $I \subseteq V$ be a compact independent set in $G'$. Then $I$ is also an independent set in $G$ and, since $G$ is locally independent, there is an open independent set $S$ in $G$ that contains $I$. Since $S$ is independent, $E \cap (S \times S) = \emptyset$, and hence $E \subseteq (V \times V) \setminus (S \times S)$. Now $(V \times V) \setminus (S \times S)$ is a closed set and so $\operatorname{cl} E \subseteq (V \times V) \setminus (S \times S)$, whence $S$ is also an independent set in $G'$, finishing the proof that $G'$ is locally independent.

As for the second part of the statement, clearly $\alpha_\omega(G') \leq \alpha_\omega(G)$, so we prove the reverse inequality. Since $\omega$ is inner regular, we can restrict ourselves to compact sets, writing

$$\alpha_\omega(G) = \sup\{\omega(I) : I \subseteq V \text{ is compact and independent}\}.$$

So, to prove the reverse inequality, it suffices to show that a compact independent set in $G$ is also independent in $G'$. Let $I$ be a compact independent set in $G$ and let $S$ be an open independent set in $G$ that contains $I$, which exists since $G$ is locally independent. Since $S$ is independent, $E \cap (S \times S) = \emptyset$, and hence $E \subseteq (V \times V) \setminus (S \times S)$. Now $(V \times V) \setminus (S \times S)$ is closed, and so $\operatorname{cl} E \subseteq (V \times V) \setminus (S \times S)$, whence $\operatorname{cl} E \cap (S \times S) = \emptyset$ and $\operatorname{cl} E \cap (I \times I) = \emptyset$, that is, $I$ is independent in $G'$.

A conic programming formulation for the independence number

One of the best polynomial-time-computable upper bounds for the independence number of a finite graph is the theta number, a graph parameter introduced by Lovász [27]. Let G=(V,E) be a finite graph. The theta number and its variants can be defined in terms of the following conic program, in which a linear function is maximized over the intersection of a convex cone with an affine subspace:

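$$\begin{array}{ll}
\text{maximize} & \langle J, A \rangle \\
& \operatorname{tr} A = 1, \\
& A(x, y) = 0 \text{ if } (x, y) \in E, \\
& A \in \mathcal{K}(V).
\end{array} \tag{3}$$

Here, $A\colon V \times V \to \mathbb{R}$ is the optimization variable, $J\colon V \times V \to \mathbb{R}$ is the all-ones matrix, $\langle J, A \rangle = \operatorname{tr} JA = \sum_{x, y \in V} A(x, y)$, and $\mathcal{K}(V) \subseteq \mathbb{R}^{V \times V}$ is a convex cone of symmetric matrices. Both the optimal value of the problem above and the problem itself are denoted by $\vartheta(G, \mathcal{K}(V))$.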

The theta number of $G$, denoted by $\vartheta(G)$, is simply $\vartheta(G, \mathrm{PSD}(V))$, where $\mathrm{PSD}(V)$ is the cone of positive-semidefinite matrices. In this case our problem becomes a semidefinite program, whose optimal value can be approximated in polynomial time to within any desired precision using the ellipsoid method [19] or interior-point methods [24]. We have moreover $\vartheta(G) \geq \alpha(G)$: if $I \subseteq V$ is a nonempty independent set and $\chi_I\colon V \to \{0, 1\}$ is its characteristic function, then $A = |I|^{-1} \chi_I \otimes \chi_I^*$, which is the matrix such that

$$A(x, y) = |I|^{-1} \chi_I(x) \chi_I(y),$$

is a feasible solution of $\vartheta(G, \mathrm{PSD}(V))$; moreover $\langle J, A \rangle = |I|$, and hence $\vartheta(G) \geq |I|$. Since $I$ is any nonempty independent set, $\vartheta(G) \geq \alpha(G)$ follows.
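The feasibility argument can be verified numerically on a small graph. The sketch below uses the 5-cycle with the independent set $\{0, 2\}$ as a hypothetical example (the graph, the vertex labels, and the helper name are assumptions for illustration) and checks the constraints of (3) together with the objective value $|I|$.

```python
import numpy as np

def indicator_solution(n_vertices, independent_set):
    """Return A = |I|^{-1} chi_I chi_I^T for the characteristic vector chi_I."""
    chi = np.zeros(n_vertices)
    chi[list(independent_set)] = 1.0
    return np.outer(chi, chi) / len(independent_set)

# Hypothetical example: the 5-cycle on vertices 0..4 and the independent set {0, 2}.
edges = [(i, (i + 1) % 5) for i in range(5)]
I = [0, 2]
A = indicator_solution(5, I)

trace_A = float(np.trace(A))                                  # constraint tr A = 1
objective = float(A.sum())                                    # <J, A>
zero_on_edges = all(A[x, y] == 0 and A[y, x] == 0 for x, y in edges)
is_psd = bool(np.linalg.eigvalsh(A).min() >= -1e-12)          # A lies in PSD(V)
```

Here `objective` equals $|I| = 2$, which is also the independence number of the 5-cycle.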

A strengthening of the Lovász theta number is the parameter $\vartheta'(G)$ introduced independently by McEliece et al. [30] and Schrijver [41], obtained by taking $\mathcal{K}(V) = \mathrm{PSD}(V) \cap \mathrm{NN}(V)$, where $\mathrm{NN}(V)$ is the cone of matrices with nonnegative entries.

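Another choice for $\mathcal{K}(V)$ is the cone

$$\mathcal{C}(V) = \operatorname{cone}\{f \otimes f^* : f\colon V \to \mathbb{R} \text{ and } f \geq 0\} \subseteq \mathrm{PSD}(V) \cap \mathrm{NN}(V)$$

of completely positive matrices. The proof above that $\vartheta(G) \geq \alpha(G)$ works just as well when $\mathcal{K}(V) = \mathcal{C}(V)$, and hence

$$\vartheta(G, \mathrm{PSD}(V)) \geq \vartheta(G, \mathrm{PSD}(V) \cap \mathrm{NN}(V)) \geq \vartheta(G, \mathcal{C}(V)) \geq \alpha(G). \tag{4}$$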

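De Klerk and Pasechnik [23] observed that a theorem of Motzkin and Straus [33] implies that the last inequality in (4) is actually tight; a streamlined proof of this fact goes as follows. If $A$ is a feasible solution of $\vartheta(G, \mathcal{C}(V))$, then, after suitable normalization,

$$A = \alpha_1 f_1 \otimes f_1^* + \cdots + \alpha_n f_n \otimes f_n^*, \tag{5}$$

where $\alpha_i > 0$, $f_i \geq 0$, and $\|f_i\| = 1$ for all $i$. Since $\|f_i\| = 1$, we have $\operatorname{tr} f_i \otimes f_i^* = 1$, and then since $\operatorname{tr} A = 1$ we must have $\alpha_1 + \cdots + \alpha_n = 1$. It follows that for some $i$ we have $\langle J, f_i \otimes f_i^* \rangle \geq \langle J, A \rangle$; assume then that this is the case for $i = 1$.

Next, observe that since $A(x, y) = 0$ for all $(x, y) \in E$ and each $f_i$ is nonnegative, we must have $f_1(x) f_1(y) = 0$ for all $(x, y) \in E$. This implies that $I$, the support of $f_1$, is an independent set. Denoting by $(f, g) = \sum_{x \in V} f(x) g(x)$ the Euclidean inner product in $\mathbb{R}^V$, we then have

$$\langle J, A \rangle \leq \langle J, f_1 \otimes f_1^* \rangle = (f_1, \chi_I)^2 \leq \|f_1\|^2 \|\chi_I\|^2 = |I| \leq \alpha(G)$$

and, since $A$ is any feasible solution, we get $\vartheta(G, \mathcal{C}(V)) \leq \alpha(G)$.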
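The two steps of this argument (pick the best rank-one term, then read off its support) can be traced on a toy instance. The sketch below assumes a decomposition as in (5) is already given; it does not compute one. The function name, the 5-cycle, and the two unit vectors are hypothetical choices for this example.

```python
import numpy as np

def best_support(alphas, factors, edges):
    """Pick the factor f_i maximizing <J, f_i f_i^T> and return its support."""
    factors = [np.asarray(f, dtype=float) for f in factors]
    scores = [float(f.sum() ** 2) for f in factors]   # <J, f f^T> = (sum of entries)^2
    i = int(np.argmax(scores))
    support = [int(v) for v in np.flatnonzero(factors[i] > 0)]
    # The support is independent if no edge has both endpoints in it.
    independent = all(not (x in support and y in support) for x, y in edges)
    objective = float(sum(a * s for a, s in zip(alphas, scores)))  # <J, A>
    return support, independent, objective

# Hypothetical feasible solution on the 5-cycle: A = 0.5 f1 f1^T + 0.5 f2 f2^T,
# with unit vectors f1, f2 supported on the independent sets {0, 2} and {1}.
edges = [(i, (i + 1) % 5) for i in range(5)]
f1 = np.array([1.0, 0.0, 1.0, 0.0, 0.0]) / np.sqrt(2.0)
f2 = np.array([0.0, 1.0, 0.0, 0.0, 0.0])

support, independent, objective = best_support([0.5, 0.5], [f1, f2], edges)
```

In this instance the objective $\langle J, A \rangle = 1.5$ is bounded by the size of the extracted independent set, as the chain of inequalities predicts.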

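Problem (3) can be naturally extended to infinite topological graphs, as we will see now. Let $G = (V, E)$ be a topological graph where $V$ is compact, $\omega$ be a Borel measure on $V$, $J \in L^2(V \times V)$ be the constant 1 kernel, and $\mathcal{K}(V) \subseteq L^2_{\mathrm{sym}}(V \times V)$ be a convex cone of symmetric kernels. When $V$ is finite with the discrete topology and $\omega$ is the counting measure, the following optimization problem is exactly (3):

$$\begin{array}{ll}
\text{maximize} & \langle J, A \rangle \\
& \int_V A(x, x)\, d\omega(x) = 1, \\
& A(x, y) = 0 \text{ if } (x, y) \in E, \\
& A \text{ is continuous and } A \in \mathcal{K}(V).
\end{array} \tag{6}$$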

As before, we will denote both the optimal value (that is, the supremum of the objective function) of this problem and the problem itself by ϑ(G,K(V)).

The problem above is a straightforward extension of (3), except that instead of the trace of the operator $A$ we take the integral over the diagonal. Not every Hilbert–Schmidt operator has a trace, so if we were to insist on using the trace instead of the integral, we would have to require that $A$ be trace class. Recall that $A$ is trace class and has trace $\tau$ if for every complete orthonormal system $(f_\alpha)$ of $L^2(V)$ we have

$$\tau = \sum_\alpha (A f_\alpha, f_\alpha).$$

Mercer’s theorem says that a continuous and positive kernel A has a spectral decomposition in terms of continuous eigenfunctions that moreover converges absolutely and uniformly. This implies in particular that A is trace class and that its trace is the integral over the diagonal. So, as long as K(V) is a subset of the cone of positive kernels, taking the integral over the diagonal or the trace is the same.

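As before, there are at least two cones that can be put in place of $\mathcal{K}(V)$. One is the cone $\mathrm{PSD}(V)$ of positive kernels. The other is the cone of completely positive kernels on $V$, namely

$$\mathcal{C}(V) = \operatorname{cl} \operatorname{cone}\{f \otimes f^* : f \in L^2(V) \text{ and } f \geq 0\}, \tag{7}$$

with the closure taken in the norm topology on $L^2(V \times V)$, and where $f \geq 0$ means that $f$ is nonnegative almost everywhere. Note that $\mathcal{C}(V) \subseteq \mathrm{PSD}(V)$, and hence $\vartheta(G, \mathrm{PSD}(V)) \geq \vartheta(G, \mathcal{C}(V))$.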

Theorem 3.1

If $G = (V, E)$ is a locally independent graph, if $V$ is a compact Hausdorff space, and if $\omega$ is an inner-regular Borel measure on $V$ such that $0 < \alpha_\omega(G) < \infty$, then $\vartheta(G, \mathcal{C}(V)) \geq \alpha_\omega(G)$.

Bachoc et al. [1] proved a similar result for the special case of distance graphs on the sphere; the proof below uses similar ideas.

Proof

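Fix $0 < \epsilon < \alpha_\omega(G)$. Since $\omega$ is inner regular and $0 < \alpha_\omega(G) < \infty$, there is a compact independent set $I$ such that $\omega(I) \geq \alpha_\omega(G) - \epsilon > 0$.

Since $G$ is locally independent, there is an open independent set $S$ that contains $I$. Now $V$ is a compact Hausdorff space and hence normal [16, Proposition 4.25] and $I$ and $V \setminus S$ are disjoint closed sets, so from Urysohn’s lemma there is a continuous function $f\colon V \to [0, 1]$ such that $f(x) = 1$ for $x \in I$ and $f(x) = 0$ for $x \in V \setminus S$.

Note $\|f\| > 0$ since $\omega(I) > 0$. Set $A = \|f\|^{-2} f \otimes f^*$. Then $A$ is a feasible solution of $\vartheta(G, \mathcal{C}(V))$. Indeed, $A$ is continuous and belongs to $\mathcal{C}(V)$, and moreover $\int_V A(x, x)\, d\omega(x) = 1$. Since $S$ is independent and the support of $f$ is a subset of $S$, $A(x, y) = 0$ if $(x, y) \in E$, and hence $A$ is feasible.

Finally, since $S$ is independent, $\omega(S) \leq \alpha_\omega(G)$. But then $\|f\|^2 \leq \omega(S)$ and

$$\langle J, A \rangle = \frac{\langle J, f \otimes f^* \rangle}{\|f\|^2} \geq \frac{\omega(I)^2}{\omega(S)} \geq \frac{(\alpha_\omega(G) - \epsilon)^2}{\alpha_\omega(G)}.$$

Since $\epsilon$ can be taken arbitrarily small and the right-hand side tends to $\alpha_\omega(G)$ as $\epsilon \to 0$, the theorem follows.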

Theorem 5.1 in Sect. 5 states that, under some extra assumptions on $G$ and $\omega$, one has $\vartheta(G, \mathcal{C}(V)) = \alpha_\omega(G)$, as in the finite case. The proof of this theorem is fundamentally the same as in the finite case; here is an intuitive description.

There are two key steps in the proof for finite graphs as given above. First, the matrix $A$ is a convex combination of rank-one nonnegative matrices, as in (5). Second, this together with the constraints of our problem implies that the support of each $f_i$ in (5) is an independent set. Then the support of one of the $f_i$ will give us a large independent set.

In the proof that $\vartheta(G, \mathcal{C}(V)) = \alpha_\omega(G)$ for an infinite topological graph we will have to repeat the two steps above. Now $A$ will be a kernel, so it will not be in general a convex combination of finitely many rank-one kernels as in (5); Choquet’s theorem [43, Theorem 10.7] will allow us to express $A$ as a sort of convex combination of infinitely many rank-one kernels. Next, it will not be the case that the support of any function appearing in the decomposition of $A$ will be independent, but depending on some properties of $G$ and $\omega$ we will be able to fix this by removing from the support the measure-zero set consisting of all points that are not density points.

To be able to apply Choquet’s theorem, we first need to better understand the cone $\mathcal{C}(V)$; this we do next.

The completely positive and the copositive cones on compact spaces

Throughout this section, V will be a compact Hausdorff space and ω will be a finite Borel measure on V such that every open set has positive measure and ω(V)=1; the normalization of ω is made for convenience only.

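For $f \in L^2(V)$ and $g \in L^\infty(V)$, write $f \odot g$ for the function $x \mapsto f(x) g(x)$; note that $f \odot g \in L^2(V)$. For $A \in L^2(V \times V)$ and $B \in L^\infty(V \times V)$, define $A \odot B$ analogously. For $U \subseteq V$ and $A \in L^2(V \times V)$, denote by $A[U]$ the restriction of $A$ to $U \times U$.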

There are two useful topologies to consider on the $L^2$ spaces we deal with: the norm topology and the weak topology. We begin with a short discussion about them, based on Chapter 5 of Simon [43]. Statements will be given in terms of $L^2(V)$, but they also hold for $L^2(V \times V)$ and $L^2_{\mathrm{sym}}(V \times V)$.

The norm topology on $L^2(V)$ coincides with the Mackey topology, the strongest topology for which only the linear functionals $f \mapsto (f, g)$ for $g \in L^2(V)$ are continuous.

The weak topology on $L^2(V)$ is the weakest topology for which all linear functionals $f \mapsto (f, g)$ for $g \in L^2(V)$ are continuous. A net $(f_\alpha)$ converges in the weak topology if and only if $((f_\alpha, g))$ converges for all $g \in L^2(V)$.

The weak and norm topologies are dual topologies, that is, the topological dual of $L^2(V)$ is the same for both topologies, and hence it is isomorphic to $L^2(V)$. Theorem 5.2 (iv) (ibid.) says that if $X \subseteq L^2(V)$ is a convex set, then $\operatorname{cl} X$ is the same whether it is taken in the weak or norm topology. Since the set

$$\operatorname{cone}\{f \otimes f^* : f \in L^2(V) \text{ and } f \geq 0\}$$

is convex, it follows that if we take the closure in (7) in the weak topology we also obtain $\mathcal{C}(V)$.

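The dual cone of $\mathcal{C}(V)$ is

$$\mathcal{C}^*(V) = \{Z \in L^2_{\mathrm{sym}}(V \times V) : \langle Z, f \otimes f^* \rangle \geq 0 \text{ for all } f \in L^2(V) \text{ with } f \geq 0\};$$

it is the cone of copositive kernels on $V$. This is a convex cone and, since it is closed in the weak topology on $L^2_{\mathrm{sym}}(V \times V)$, it is also closed in the norm topology. Moreover, the dual of $\mathcal{C}^*(V)$, namely

$$(\mathcal{C}^*(V))^* = \{A \in L^2_{\mathrm{sym}}(V \times V) : \langle Z, A \rangle \geq 0 \text{ for all } Z \in \mathcal{C}^*(V)\}$$

is exactly $\mathcal{C}(V)$ by the Bipolar Theorem [43, Theorem 5.5]; see also Problem 1, §IV.5.3 in Barvinok [3].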

Theorem 4.1

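Let $A \in \mathcal{C}(V)$ and $Z \in \mathcal{C}^*(V)$. Then:

  • (i)

    If $U \subseteq V$ is measurable and has positive measure, then $A[U] \in \mathcal{C}(U)$ and $Z[U] \in \mathcal{C}^*(U)$, where $U$ inherits its topology and measure from $V$.

  • (ii)

    If $g \in L^\infty(V)$ is nonnegative, then $A \odot (g \otimes g^*) \in \mathcal{C}(V)$ and $Z \odot (g \otimes g^*) \in \mathcal{C}^*(V)$.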

Proof

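The first statement is immediate, so let us prove the second. If $f \in L^2(V)$ is nonnegative, then $f \odot g \geq 0$, and so $(f \otimes f^*) \odot (g \otimes g^*) = (f \odot g) \otimes (f \odot g)^* \in \mathcal{C}(V)$. This implies that if $A \in \mathcal{C}(V)$, then $A \odot (g \otimes g^*) \in \mathcal{C}(V)$.

Now take $Z \in \mathcal{C}^*(V)$. If $f \in L^2(V)$ is nonnegative, then

$$\langle Z \odot (g \otimes g^*), f \otimes f^* \rangle = \langle Z, (f \odot g) \otimes (f \odot g)^* \rangle \geq 0,$$

and hence $Z \odot (g \otimes g^*) \in \mathcal{C}^*(V)$.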
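The identity behind statement (ii), namely that the entrywise product of two rank-one kernels is again a rank-one kernel built from the pointwise product of the factors, is easy to confirm in the finite case. The vectors below are hypothetical example data.

```python
import numpy as np

f = np.array([1.0, 2.0, 0.0, 3.0])   # nonnegative factor (hypothetical data)
g = np.array([0.5, 0.0, 4.0, 1.0])   # nonnegative bounded weight (hypothetical data)

lhs = np.outer(f, f) * np.outer(g, g)   # entrywise product of the two rank-one kernels
rhs = np.outer(f * g, f * g)            # rank-one kernel of the pointwise product f*g

identity_holds = bool(np.allclose(lhs, rhs))
factor_nonnegative = bool((f * g >= 0).all())   # so rhs is again completely positive
```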

Partitions and averaging

An $\omega$-partition of $V$ is a partition of $V$ into finitely many measurable sets each of positive measure. Given a function $f \in L^2(V)$ and an $\omega$-partition $P$ of $V$, the averaging of $f$ on $P$ is the function $f * P\colon V \to \mathbb{R}$ such that

$$(f * P)(x) = \omega(X)^{-1} \int_X f(x')\, d\omega(x')$$

for all $X \in P$ and $x \in X$. It is immediate that $f * P \in L^2(V)$. We also see $f * P$ as a function with domain $P$, writing $(f * P)(X)$ for the common value of $f * P$ in $X \in P$.

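Given $A \in L^2(V \times V)$, the averaging of $A$ on $P$ is the function $A * P\colon V \times V \to \mathbb{R}$ such that

$$(A * P)(x, y) = \omega(X)^{-1} \omega(Y)^{-1} \int_X \int_Y A(x', y')\, d\omega(y')\, d\omega(x')$$

for all $X$, $Y \in P$ and $x \in X$, $y \in Y$. Again, $A * P \in L^2(V \times V)$; moreover, if $A$ is symmetric, then so is $A * P$. The kernel $A * P$ can also be seen as a function with domain $P \times P$ (that is, as a matrix), so $(A * P)(X, Y)$ is the common value of $A * P$ in $X \times Y$ for $X$, $Y \in P$. Seeing $A * P$ as a matrix allows us to show that, as a kernel, $A * P$ has finite rank. Note also that $(f \otimes f^*) * P = (f * P) \otimes (f * P)^*$.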

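The averaging operation preserves step functions and step kernels on the partition $P$. In particular, it is idempotent: if $f \in L^2(V)$, then $(f * P) * P = f * P$, and similarly for kernels. Moreover, if $A$, $B \in L^2(V \times V)$, then

$$\langle A * P, B \rangle = \langle A * P, B * P \rangle = \langle A, B * P \rangle.$$

For a proof, simply expand all the inner products. On the one hand,

$$\langle A * P, B * P \rangle = \sum_{X, Y \in P} \int_X \int_Y (A * P)(x, y) (B * P)(x, y)\, d\omega(y)\, d\omega(x) = \sum_{X, Y \in P} (A * P)(X, Y) (B * P)(X, Y)\, \omega(X) \omega(Y).$$

On the other hand,

$$\begin{aligned}
\langle A * P, B \rangle &= \sum_{X, Y \in P} \int_X \int_Y (A * P)(x, y) B(x, y)\, d\omega(y)\, d\omega(x) \\
&= \sum_{X, Y \in P} (A * P)(X, Y) \int_X \int_Y B(x, y)\, d\omega(y)\, d\omega(x) \\
&= \sum_{X, Y \in P} (A * P)(X, Y) (B * P)(X, Y)\, \omega(X) \omega(Y) \\
&= \langle A * P, B * P \rangle.
\end{aligned}$$

One concludes similarly that $\langle A, B * P \rangle = \langle A * P, B * P \rangle$.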
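A finite model makes these identities easy to test. In the sketch below, $V$ is a five-point set with positive weights playing the role of $\omega$, a partition is a list of index lists, and the inner product of kernels is the weighted sum corresponding to the double integral. All names, the weights, the partition, and the random data are assumptions made for illustration.

```python
import numpy as np

def average_function(f, w, partition):
    """Averaging of f on the partition: weighted mean of f on each part."""
    out = np.empty_like(f, dtype=float)
    for X in partition:
        out[X] = (w[X] @ f[X]) / w[X].sum()
    return out

def average_kernel(A, w, partition):
    """Averaging of A on the partition: weighted mean of A on each block X x Y."""
    out = np.empty_like(A, dtype=float)
    for X in partition:
        for Y in partition:
            block_mean = (w[X] @ A[np.ix_(X, Y)] @ w[Y]) / (w[X].sum() * w[Y].sum())
            out[np.ix_(X, Y)] = block_mean
    return out

def inner(A, B, w):
    """<A, B> = sum over x, y of A(x, y) B(x, y) w(x) w(y)."""
    return float(w @ (A * B) @ w)

rng = np.random.default_rng(0)
A, B, f = rng.random((5, 5)), rng.random((5, 5)), rng.random(5)
w = np.array([0.1, 0.2, 0.3, 0.25, 0.15])   # point masses standing in for omega
partition = [[0, 1], [2], [3, 4]]           # every part has positive measure

AP, BP, fP = (average_kernel(A, w, partition),
              average_kernel(B, w, partition),
              average_function(f, w, partition))

gap_left = abs(inner(AP, B, w) - inner(AP, BP, w))
gap_right = abs(inner(A, BP, w) - inner(AP, BP, w))
idempotent = bool(np.allclose(average_kernel(AP, w, partition), AP))
rank_one_ok = bool(np.allclose(average_kernel(np.outer(f, f), w, partition),
                               np.outer(fP, fP)))
```

Both gaps vanish up to rounding, and the last check confirms that averaging a rank-one kernel gives the rank-one kernel of the averaged function.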

Theorem 4.2

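Let $P$ be an $\omega$-partition. If $A \in \mathcal{C}(V)$, then $A * P \in \mathcal{C}(V)$ and $A * P \in \mathcal{C}(P)$, where on $P$ we consider the discrete topology and the counting measure. Similarly, if $Z \in \mathcal{C}^*(V)$, then $Z * P \in \mathcal{C}^*(V)$ and $Z * P \in \mathcal{C}^*(P)$.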

Proof

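Let us prove the second statement first. Take $Z \in \mathcal{C}^*(V)$ and $f \in L^2(V)$ with $f \geq 0$. Then $f * P \geq 0$ and

$$\langle Z * P, f \otimes f^* \rangle = \langle Z, (f \otimes f^*) * P \rangle = \langle Z, (f * P) \otimes (f * P)^* \rangle \geq 0,$$

whence $Z * P \in \mathcal{C}^*(V)$.

To see that $Z * P \in \mathcal{C}^*(P)$, take a function $\phi\colon P \to \mathbb{R}$ with $\phi \geq 0$. Let $f \in L^2(V)$ be the function such that $f(x) = \phi(X) \omega(X)^{-1}$ for all $X \in P$ and $x \in X$; notice $f \geq 0$. Then

$$\sum_{X, Y \in P} (Z * P)(X, Y) \phi(X) \phi(Y) = \sum_{X, Y \in P} \int_X \int_Y (Z * P)(x, y) \phi(X) \phi(Y) \omega(X)^{-1} \omega(Y)^{-1}\, d\omega(y)\, d\omega(x) = \langle Z * P, f \otimes f^* \rangle \geq 0,$$

and $Z * P \in \mathcal{C}^*(P)$.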

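Now take $A \in \mathcal{C}(V)$. If $Z \in \mathcal{C}^*(V)$, then since $Z * P \in \mathcal{C}^*(V)$ we have

$$\langle A * P, Z \rangle = \langle A, Z * P \rangle \geq 0.$$

So, since $(\mathcal{C}^*(V))^* = \mathcal{C}(V)$, we have $A * P \in \mathcal{C}(V)$.

Seeing that $A * P \in \mathcal{C}(P)$ is only slightly more complicated. Given $Z \in \mathcal{C}^*(P)$, consider the kernel $Z' \in L^2(V \times V)$ such that $Z'(x, y) = Z(X, Y) \omega(X)^{-1} \omega(Y)^{-1}$ for all $X$, $Y \in P$ and $x \in X$, $y \in Y$. Then $Z' \in \mathcal{C}^*(V)$. Indeed, let $f \in L^2(V)$ be nonnegative. Note $Z' * P = Z'$ and expand $\langle Z', f \otimes f^* \rangle$ to get

$$\begin{aligned}
\langle Z', f \otimes f^* \rangle &= \langle Z' * P, f \otimes f^* \rangle \\
&= \langle Z', (f * P) \otimes (f * P)^* \rangle \\
&= \sum_{X, Y \in P} \int_X \int_Y Z(X, Y) \omega(X)^{-1} \omega(Y)^{-1} (f * P)(X) (f * P)(Y)\, d\omega(y)\, d\omega(x) \\
&= \sum_{X, Y \in P} Z(X, Y) (f * P)(X) (f * P)(Y) \geq 0,
\end{aligned}$$

since $f * P \geq 0$. So $Z' \in \mathcal{C}^*(V)$. Now, since $A * P \in \mathcal{C}(V)$ and $Z' \in \mathcal{C}^*(V)$,

$$\sum_{X, Y \in P} (A * P)(X, Y) Z(X, Y) = \langle A * P, Z' \rangle \geq 0,$$

and $A * P \in \mathcal{C}(P)$.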

Corollary 4.3

If $P$ is an $\omega$-partition and if $A \in \mathcal{C}(V)$, then there are nonnegative and nonzero functions $f_1$, ..., $f_n \in L^2(V)$, each one constant in each $X \in P$, such that

$$A * P = f_1 \otimes f_1^* + \cdots + f_n \otimes f_n^*.$$

Proof

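From Theorem 4.2 we know that $A * P \in \mathcal{C}(P)$. So there are nonnegative and nonzero functions $\phi_1$, ..., $\phi_n$ with domain $P$ such that

$$A * P = \phi_1 \otimes \phi_1^* + \cdots + \phi_n \otimes \phi_n^*,$$

where $A * P$ is seen as a function on $P \times P$. The result now follows by taking $f_i(x) = \phi_i(X)$ for $X \in P$ and $x \in X$.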

Approximation of continuous kernels

The main use of averaging is in approximating continuous kernels by finite-rank ones. We say that a continuous kernel $A\colon V \times V \to \mathbb{R}$ varies (strictly) less than $\epsilon$ over an $\omega$-partition $P$ if the variation of $A$ in each $X \times Y$ for $X$, $Y \in P$ is less than $\epsilon$. We say that a partition $P$ of $V$ separates $U \subseteq V$ if $|U \cap X| \leq 1$ for all $X \in P$. The main tool we need is the following result.

Theorem 4.4

If $A\colon V \times V \to \mathbb{R}$ is continuous and if $U \subseteq V$ is finite, then for every $\epsilon > 0$ there is an $\omega$-partition $P$ that separates $U$ and over which $A$ varies less than $\epsilon$.

Proof

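Since $V$ is a Hausdorff space and $U$ is finite, every $x \in V$ has a neighborhood $N_x$ such that every $y \in U \setminus \{x\}$ is in the exterior of $N_x$. Since $A$ is continuous, for every $(x, y) \in V \times V$ we can choose neighborhoods $N^x_{x,y}$ of $x$ and $N^y_{x,y}$ of $y$ such that the variation of $A$ in $N^x_{x,y} \times N^y_{x,y}$ is less than $\epsilon/2$. The same is then true of the neighborhoods $N^x_{x,y} \cap N_x$ and $N^y_{x,y} \cap N_y$ of $x$ and $y$.

The sets $(N^x_{x,y} \cap N_x) \times (N^y_{x,y} \cap N_y)$ form an open cover of $V \times V$, and since $V \times V$ is compact there is a finite subcover $\mathcal{B}$ consisting of such sets. Set

\[ \mathcal{C} = \{\, S \subseteq V : \text{there is $T$ such that $S \times T \in \mathcal{B}$ or $T \times S \in \mathcal{B}$} \,\}. \]

Note $\mathcal{C}$ is an open cover of $V$. Moreover, by construction, $|U \cap S| \leq 1$ for all $S \in \mathcal{C}$ and, if $x \in U$ is such that $x \notin S$ for some $S \in \mathcal{C}$, then $x$ is in the exterior of $S$. Let us turn this open cover $\mathcal{C}$ into the desired $\omega$-partition $P$.

For $\mathcal{S} \subseteq \mathcal{C}$, consider the set

\[ E_{\mathcal{S}} = \bigcap_{S \in \mathcal{S}} S \setminus \bigcup_{S \in \mathcal{C} \setminus \mathcal{S}} S = \bigcap_{S \in \mathcal{S}} S \cap \bigcap_{S \in \mathcal{C} \setminus \mathcal{S}} V \setminus S. \]

Write $\mathcal{R} = \{\, E_{\mathcal{S}} : \mathcal{S} \subseteq \mathcal{C} \text{ and } E_{\mathcal{S}} \neq \emptyset \,\}$. Then $\mathcal{R}$ is a partition of $V$ that, by construction, separates $U$. Moreover, if $X$, $Y \in \mathcal{R}$, then the variation of $A$ in $X \times Y$ is less than $\epsilon/2$. Indeed, note that if $\mathcal{S} \subseteq \mathcal{C}$ and $S \in \mathcal{C}$ are such that $E_{\mathcal{S}} \cap S \neq \emptyset$, then $E_{\mathcal{S}} \subseteq S$. Since $\mathcal{B}$ is a cover of $V \times V$, given $X$, $Y \in \mathcal{R}$ there must be $S \times T \in \mathcal{B}$ such that $(X \times Y) \cap (S \times T) \neq \emptyset$, implying that $X \cap S \neq \emptyset$ and $Y \cap T \neq \emptyset$, whence $X \subseteq S$ and $Y \subseteq T$. But then $X \times Y \subseteq S \times T$, and we know that the variation of $A$ in $S \times T$ is less than $\epsilon/2$.

Now $\mathcal{R}$ may not be an $\omega$-partition: though the sets in $\mathcal{R}$ are measurable, some may have measure 0. This does not happen, however, for sets in $\mathcal{R}$ that contain some point in $U$. Indeed, if for $\mathcal{S} \subseteq \mathcal{C}$ and $x \in U$ we have $x \in E_{\mathcal{S}}$, then $x \in \bigcap_{S \in \mathcal{S}} S$, which is an open set. Moreover, $x \notin S$ for all $S \in \mathcal{C} \setminus \mathcal{S}$, and hence $x$ is in the exterior of each $S \in \mathcal{C} \setminus \mathcal{S}$. But then $x$ is in the interior of $E_{\mathcal{S}}$ and so $E_{\mathcal{S}}$ has nonempty interior and hence positive measure.

Let us fix $\mathcal{R}$ by getting rid of sets with measure 0. Let $W$ be the union of all sets in $\mathcal{R}$ with measure 0. Note $\operatorname{cl}(V \setminus W) = V$. For if not, then there would be $x \in W$ and a neighborhood $N$ of $x$ such that $N \cap \operatorname{cl}(V \setminus W) = \emptyset$. But then $N \subseteq V \setminus \operatorname{cl}(V \setminus W) \subseteq W$, and hence $\omega(W) > 0$, a contradiction.

Let $X_1$, ..., $X_n$ be the sets of positive measure in $\mathcal{R}$. Set

\[ X_i' = X_i \cup (W \cap \operatorname{cl} X_i) \setminus (X_1' \cup \cdots \cup X_{i-1}'). \]

Since $V = \operatorname{cl}(V \setminus W) = \operatorname{cl} X_1 \cup \cdots \cup \operatorname{cl} X_n$, $P = \{X_1', \ldots, X_n'\}$ is an $\omega$-partition of $V$; moreover, since $U \cap W = \emptyset$, $P$ separates $U$. Now $X_i' \subseteq \operatorname{cl} X_i$, and so the variation of $A$ in $X \times Y$ for $X$, $Y \in P$ is at most $\epsilon/2$, and hence less than $\epsilon$.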

The existence of ω-partitions over which A has small variation allows us to approximate a continuous kernel by its averages.

Theorem 4.5

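If a continuous kernel $A\colon V \times V \to \mathbb{R}$ varies less than $\epsilon$ over an $\omega$-partition $P$, then $|A(x, y) - (A * P)(x, y)| < \epsilon$ for all $x$, $y \in V$.

Proof

Take $x$, $y \in V$ and say $x \in X$, $y \in Y$ for some $X$, $Y \in P$. Then

\[ (A * P)(x, y) = \omega(X)^{-1}\omega(Y)^{-1} \int_X \int_Y A(x', y')\, d\omega(y')\, d\omega(x') < \omega(X)^{-1}\omega(Y)^{-1} \int_X \int_Y A(x, y) + \epsilon\, d\omega(y')\, d\omega(x') = A(x, y) + \epsilon. \]

Similarly, $(A * P)(x, y) > A(x, y) - \epsilon$, and the theorem follows.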
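
As a sanity check of Theorem 4.5 in a finite setting, the sketch below averages a sample kernel over a partition and compares the pointwise error with the largest variation over the blocks. The setting is an assumption made for illustration only: twelve equally spaced points of $[0, 1)$ with the counting measure, the kernel $(x - y)^2$, and parts of three consecutive points; none of this is from the paper.

```python
from fractions import Fraction

n = 12
points = [Fraction(i, n) for i in range(n)]        # assumed finite stand-in for V
parts = [points[i:i + 3] for i in range(0, n, 3)]  # partition into 4 parts

def kernel(x, y):
    """Sample continuous kernel, chosen only for illustration."""
    return (x - y) ** 2

def variation(X, Y):
    """Variation (max minus min) of the kernel on the block X x Y."""
    values = [kernel(x, y) for x in X for y in Y]
    return max(values) - min(values)

def block_average(X, Y):
    """Value of the averaged kernel on the block X x Y (counting measure)."""
    return sum(kernel(x, y) for x in X for y in Y) / (len(X) * len(Y))

# Largest variation over all blocks, and largest pointwise averaging error.
eps = max(variation(X, Y) for X in parts for Y in parts)
max_error = max(
    abs(kernel(x, y) - block_average(X, Y))
    for X in parts for Y in parts for x in X for y in Y
)
```

Since a block average lies between the minimum and the maximum of the kernel on that block, `max_error` stays below `eps`, in line with the bound of the theorem.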

Corollary 4.6

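If a continuous kernel $A\colon V \times V \to \mathbb{R}$ varies less than $\epsilon$ over an $\omega$-partition $P$, then $\|A - A * P\| < \epsilon$. If moreover $A$ is positive, then $|\operatorname{tr} A - \operatorname{tr} A * P| < \epsilon$.

Proof

Using Theorem 4.5 we get

\[ \|A - A * P\|^2 = \int_V \int_V (A(x, y) - (A * P)(x, y))^2\, d\omega(y)\, d\omega(x) < \epsilon^2, \]

as desired.

Since $A$ is positive and continuous, Mercer’s theorem implies that the trace of $A$ is the integral over the diagonal. Since $A * P$ is a finite-rank step kernel, its trace is also the integral over the diagonal. Then, using Theorem 4.5,

\[ |\operatorname{tr} A - \operatorname{tr} A * P| = \biggl| \int_V A(x, x) - (A * P)(x, x)\, d\omega(x) \biggr| \leq \int_V |A(x, x) - (A * P)(x, x)|\, d\omega(x) < \epsilon, \]

as we wanted.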

A continuous kernel $A\colon V \times V \to \mathbb{R}$ is positive if and only if the matrix $A[U]$ is positive semidefinite for all finite $U \subseteq V$ (cf. Bochner [4]). An analogous result holds for $\mathcal{C}(V)$ and its dual; see also Lemma 2.1 of Dobre et al. [12].

Theorem 4.7

A continuous kernel $A\colon V \times V \to \mathbb{R}$ belongs to $\mathcal{C}(V)$ if and only if $A[U]$ belongs to $\mathcal{C}(U)$ for all finite $U \subseteq V$, where we consider for $U$ the discrete topology and the counting measure. Likewise, a continuous $Z\colon V \times V \to \mathbb{R}$ belongs to $\mathcal{C}^*(V)$ if and only if $Z[U]$ belongs to $\mathcal{C}^*(U)$ for all finite $U \subseteq V$.

Proof

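Take $A \in \mathcal{C}(V)$ and let $U \subseteq V$ be finite. For $n \geq 1$, let $P_n$ be an $\omega$-partition that separates $U$ and over which $A$ varies less than $1/n$, as given by Theorem 4.4. Since $A * P_n \in \mathcal{C}(P_n)$ and $P_n$ separates $U$, Theorem 4.2 implies that $(A * P_n)[U] \in \mathcal{C}(U)$ for all $n \geq 1$; Theorem 4.5 implies that $A[U]$ is the limit, in the norm topology, of $((A * P_n)[U])$, so $A[U] \in \mathcal{C}(U)$. One proves similarly that if $Z \in \mathcal{C}^*(V)$, then $Z[U] \in \mathcal{C}^*(U)$ for all finite $U \subseteq V$.

Now let $A\colon V \times V \to \mathbb{R}$ be a continuous kernel such that $A \notin \mathcal{C}(V)$. Let us show that there is a finite set $U \subseteq V$ such that $A[U] \notin \mathcal{C}(U)$. If $A$ is not symmetric, we are done. So assume $A$ is symmetric and let $Z \in \mathcal{C}^*(V)$ be such that $\langle A, Z \rangle = \delta < 0$.

Corollary 4.6 together with the Cauchy–Schwarz inequality implies that, if $A$ varies less than $\epsilon$ over an $\omega$-partition $P$, then $|\langle A, Z \rangle - \langle A * P, Z \rangle| < \epsilon \|Z\|$. So, for all small enough $\epsilon$, if $A$ varies less than $\epsilon$ over the $\omega$-partition $P$, then

\[ \delta/2 > \langle A * P, Z \rangle = \langle A * P, Z * P \rangle = \sum_{X, Y \in P} (A * P)(X, Y)(Z * P)(X, Y)\omega(X)\omega(Y). \tag{8} \]

Let $g \in L^\infty(V)$ be the function such that $g(x) = \omega(X)$ for $X \in P$ and $x \in X$. Theorems 4.1 and 4.2 say that $Z' = (Z * P)(g \otimes g) \in \mathcal{C}^*(V)$. For $x$, $y \in V$, write $s(x, y) = \operatorname{sgn} Z'(x, y)$. Let $U \subseteq V$ be a set of representatives of the parts of $P$. Develop (8) using Theorem 4.5 to obtain

\[ \delta/2 > \sum_{x, y \in U} (A * P)(x, y) Z'(x, y) \geq \sum_{x, y \in U} (A(x, y) - s(x, y)\epsilon) Z'(x, y) = \sum_{x, y \in U} A(x, y) Z'(x, y) - \epsilon \sum_{x, y \in U} s(x, y) Z'(x, y). \tag{9} \]

Now notice that, if $P$ is an $\omega$-partition, then $\|Z * P\|_1 \leq \|Z\|_1$. So

\[ \sum_{x, y \in U} s(x, y) Z'(x, y) = \|Z * P\|_1 \leq \|Z\|_1. \]

Together with (9) this gives

\[ \sum_{x, y \in U} A(x, y) Z'(x, y) < \delta/2 + \epsilon \|Z\|_1. \]

Since $U$ is a set of representatives of the parts of $P$, Theorem 4.2 says $Z'[U] \in \mathcal{C}^*(U)$. Since $\|Z\|_1 < \infty$ (as $\omega$ is finite, $L^2(V \times V) \subseteq L^1(V \times V)$), by taking $\epsilon$ sufficiently small we see that $A[U] \notin \mathcal{C}(U)$, as we wanted.

The analogous result for $\mathcal{C}^*(V)$ can be similarly proved.

Using Theorem 4.7, we can rewrite problem $\vartheta(G, \mathcal{C}(V))$ (see (6)) by replacing the constraint “$A \in \mathcal{C}(V)$” by infinitely many constraints on finite subkernels of $A$.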

The tip of the cone of completely positive kernels

A base of a cone $K$ is a set $B \subseteq K$ that does not contain the origin and is such that for every nonzero $x \in K$ there is a unique $\alpha > 0$ for which $\alpha^{-1} x \in B$. Cones with compact and convex bases have many pleasant properties that are particularly useful to the theory of conic programming [3, Chapter IV].

It is not in general clear whether $\mathcal{C}(V)$ has a compact and convex base. However, the following subset of $\mathcal{C}(V)$, its tip, will be just as useful in the coming developments:

\[ \mathcal{T}(V) = \operatorname{cch}\{\, f \otimes f : f \in L^2(V),\ f \geq 0, \text{ and } \|f\| \leq 1 \,\}, \]

where $\operatorname{cch} X$ is the closure of the convex hull of $X$. Notice the closure is the same whether taken in the norm or the weak topology.

If $\|f\| \leq 1$, then $\|f \otimes f\| = \|f\|^2 \leq 1$, so $\mathcal{T}(V)$ is a closed subset of the closed unit ball in $L^2(V \times V)$, and hence by Alaoglu’s theorem [16, Theorem 5.18] it is weakly compact. If $L^2(V \times V)$ is separable, then the weak topology on the closed unit ball of $L^2(V \times V)$, and hence the weak topology on $\mathcal{T}(V)$, is metrizable [16, p. 171, Exercise 50].

The tip displays a key property of a base, at least for continuous kernels.

Theorem 4.8

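If $A \in \mathcal{C}(V)$ is nonzero and continuous, then $(\operatorname{tr} A)^{-1} A \in \mathcal{T}(V)$.

Proof

For $n \geq 1$, let $P_n$ be an $\omega$-partition over which $A$ varies less than $1/n$. For each $n \geq 1$, use Corollary 4.3 to write

\[ A * P_n = \sum_{m=1}^{r_n} \alpha_{mn} f_{mn} \otimes f_{mn}, \]

where $\alpha_{mn} \geq 0$, $f_{mn} \geq 0$, and $\|f_{mn}\| = 1$.

The kernel $A$ is in $\mathcal{C}(V)$ and hence positive, so using Corollary 4.6 we have

\[ \lim_{n \to \infty} (\operatorname{tr} A * P_n)^{-1} A * P_n = (\operatorname{tr} A)^{-1} A \]

in the norm topology. Now $\operatorname{tr} A * P_n = \sum_{m=1}^{r_n} \alpha_{mn} > 0$ for all large enough $n$, and then $(\operatorname{tr} A * P_n)^{-1} A * P_n \in \mathcal{T}(V)$ for all large enough $n$, proving the theorem.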

Finally, we also know what the extreme points of $\mathcal{T}(V)$ look like.

Theorem 4.9

An extreme point of $\mathcal{T}(V)$ is either 0 or of the form $f \otimes f$ for $f \in L^2(V)$ with $f \geq 0$ and $\|f\| = 1$.

Proof

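We show first that the set $\mathcal{B} = \{\, f \otimes f : f \in L^2(V),\ f \geq 0, \text{ and } \|f\| \leq 1 \,\}$ is weakly closed. Then, since $\mathcal{T}(V)$ is weakly compact and convex and since the weak topology is locally convex, it will follow from Milman’s theorem [43, Theorem 9.4] that all extreme points of $\mathcal{T}(V)$ are contained in $\mathcal{B}$.

Let $(f_\alpha \otimes f_\alpha)$ be a weakly converging net with $f_\alpha \in L^2(V)$, $f_\alpha \geq 0$, and $\|f_\alpha\| \leq 1$ for all $\alpha$. The net $(f_\alpha)$ lies in the closed unit ball, which is weakly compact, and hence it has a weakly converging subnet. So we may assume that the net $(f_\alpha)$ is itself weakly converging; let $f$ be its limit.

Immediately we have $f \geq 0$ and $\|f\| \leq 1$. Claim: $f \otimes f$ is the limit of $(f_\alpha \otimes f_\alpha)$. Proof: We have to show that, if $G \in L^2(V \times V)$, then

\[ \langle f_\alpha \otimes f_\alpha, G \rangle \to \langle f \otimes f, G \rangle. \tag{10} \]

Let $S$ be a complete orthonormal system of $L^2(V)$; then $\{\, g \otimes h : g, h \in S \,\}$ is a complete orthonormal system of $L^2(V \times V)$. Given $G \in L^2(V \times V)$, write

\[ G = \sum_{i=1}^\infty \lambda_i\, g_i \otimes h_i, \]

where $g_i$, $h_i \in S$ and $\sum_{i=1}^\infty \lambda_i^2 = \|G\|^2$. For every $\epsilon > 0$, let $N_\epsilon$ be such that the finite-rank kernel

\[ G_\epsilon = \sum_{i=1}^{N_\epsilon} \lambda_i\, g_i \otimes h_i \]

satisfies $\|G - G_\epsilon\| < \epsilon$. Apply the Cauchy–Schwarz inequality to get

\[ |\langle g \otimes h, G \rangle - \langle g \otimes h, G_\epsilon \rangle| < \epsilon \tag{11} \]

for every $g$, $h \in L^2(V)$ with $\|g\| = \|h\| \leq 1$.

Since $f$ is the weak limit of $(f_\alpha)$, for $g$, $h \in L^2(V)$ we have

\[ \langle f_\alpha \otimes f_\alpha, g \otimes h \rangle = (f_\alpha, g)(f_\alpha, h) \to (f, g)(f, h) = \langle f \otimes f, g \otimes h \rangle. \]

Now, $G_\epsilon$ has finite rank for every $\epsilon > 0$, so we must have

\[ \langle f_\alpha \otimes f_\alpha, G_\epsilon \rangle \to \langle f \otimes f, G_\epsilon \rangle \]

and, together with (11), it follows that $\mathcal{B}$ is weakly closed.

Now we only have to argue that $f \otimes f$ for $f \geq 0$ is an extreme point if and only if $f = 0$ or $\|f\| = 1$. First, if $0 < \|f\| < 1$, then $f \otimes f$ is a convex combination of 0 and $\|f\|^{-2} f \otimes f$, and hence not an extreme point.

Conversely, 0 is clearly not a convex combination of nonzero points, and hence it is an extreme point. Moreover, if $\|f\| = 1$, then $\|f \otimes f\| = 1$. Now, by the Cauchy–Schwarz inequality, it is impossible for a vector of norm 1 in $L^2$ to be a nontrivial convex combination of other vectors of norm 1, so $f \otimes f$ is an extreme point.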

When is the completely positive formulation exact?

Throughout this section, the Haar measure on a compact group will always be normalized so the group has total measure 1.

When is $\vartheta(G, \mathcal{C}(V)) = \alpha_\omega(G)$? When $G$ is a finite graph and $\omega$ is the counting measure, equality holds, as we saw in the introduction. In the finite case, actually, equality holds irrespective of the measure. In this section, we will see some sufficient conditions on $G$ and $\omega$ under which $\vartheta(G, \mathcal{C}(V)) = \alpha_\omega(G)$; these conditions will be satisfied by the main examples of infinite graphs considered here.

Let $G = (V, E)$ be a topological graph. An automorphism of $G$ is a homeomorphism $\sigma\colon V \to V$ such that $(x, y) \in E$ if and only if $(\sigma x, \sigma y) \in E$. Denote by $\operatorname{Aut}(G)$ the set of all automorphisms of $G$, which is a group under function composition.

Say $V$ is a set and $\Gamma$ a group that acts on $V$. We say that $\Gamma$ acts continuously on $V$ if

  (i) for every $\sigma \in \Gamma$, the map $x \mapsto \sigma x$ from $V$ to $V$ is continuous and

  (ii) for every $x \in V$, the map $\sigma \mapsto \sigma x$ from $\Gamma$ to $V$ is continuous.

We say that $\Gamma$ acts transitively on $V$ if for all $x$, $y \in V$ there is $\sigma \in \Gamma$ such that $\sigma x = y$.

Assume that $\Gamma$ is compact and that it acts continuously and transitively on $V$ and let $\mu$ be its Haar measure. Fix $x \in V$ and consider the function $p\colon \Gamma \to V$ such that $p(\sigma) = \sigma x$. The pushforward of $\mu$ is the measure $\omega$ on $V$ defined as follows: a set $X \subseteq V$ is measurable if $p^{-1}(X)$ is measurable and its measure is $\omega(X) = \mu(p^{-1}(X))$. The pushforward is a Borel measure; moreover, since $\Gamma$ acts transitively and since $\mu$ is invariant, it is independent of the choice of $x$. The pushforward is also invariant under the action of $\Gamma$, that is, if $X \subseteq V$ and $\sigma \in \Gamma$, then

\[ \omega(\sigma X) = \omega(\{\, \sigma x : x \in X \,\}) = \omega(X). \]

Let $V$ be a metric space with metric $d$ and $\omega$ be a Borel measure on $V$ such that every open set has positive measure. A point $x$ in a measurable set $S \subseteq V$ is a density point of $S$ if

\[ \lim_{\delta \downarrow 0} \frac{\omega(S \cap B(x, \delta))}{\omega(B(x, \delta))} = 1. \]

We say that the metric $d$ is a density metric for $\omega$ if for every measurable set $S \subseteq V$ the set of all density points of $S$ has the same measure as $S$, that is, almost all points of $S$ are density points. For example, Lebesgue’s density theorem states that the Euclidean metric on $\mathbb{R}^n$ is a density metric for the Lebesgue measure.

We now come to the main theorem of the paper.

Theorem 5.1

Let $G = (V, E)$ be a locally independent graph where $V$ is a compact Hausdorff space, $\Gamma \subseteq \operatorname{Aut}(G)$ be a compact group that acts continuously and transitively on $V$, and $\omega$ be a multiple of the pushforward of the Haar measure on $\Gamma$. If $\Gamma$ is metrizable via a bi-invariant density metric for the Haar measure, then $\vartheta(G, \mathcal{C}(V)) = \alpha_\omega(G)$.

Here, a bi-invariant metric on $\Gamma$ is a metric $d$ such that for all $\lambda$, $\gamma$, $\sigma$, $\tau \in \Gamma$ we have $d(\lambda\sigma\gamma, \lambda\tau\gamma) = d(\sigma, \tau)$.

Theorem 5.1 implies for instance that

\[ \vartheta(G(S^{n-1}, \{\theta\}), \mathcal{C}(S^{n-1})) = \alpha_\omega(G(S^{n-1}, \{\theta\})) \]

for every angle $\theta > 0$. Indeed, $G(S^{n-1}, \{\theta\})$ is a locally independent graph. For $\Gamma$ we take the orthogonal group $\mathrm{O}(n)$; this group acts continuously and transitively on $S^{n-1}$ and the surface measure on the sphere is a multiple of the pushforward of the Haar measure [29, Theorem 3.7]. The metric on $\mathrm{O}(n) \subseteq \mathbb{R}^{n \times n}$ inherited from the Euclidean metric is bi-invariant and is moreover a density metric since $\mathrm{O}(n)$ is a Riemannian manifold [15]. More generally, any compact Lie group is metrizable via a bi-invariant metric [31, Corollary 1.4].

In the proof of the theorem, the symmetry provided by the group Γ is used to reduce the problem to an equivalent problem on a graph over Γ, a Cayley graph.

Cayley graphs

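Let $\Gamma$ be a topological group with identity 1 and $\Sigma \subseteq \Gamma$ be such that $1 \notin \Sigma$ and $\Sigma^{-1} = \{\, \sigma^{-1} : \sigma \in \Sigma \,\} = \Sigma$. Consider the graph whose vertex set is $\Gamma$ and in which $\sigma$, $\tau \in \Gamma$ are adjacent if and only if $\sigma^{-1}\tau \in \Sigma$ (which happens, since $\Sigma^{-1} = \Sigma$, if and only if $\tau^{-1}\sigma \in \Sigma$). This is the Cayley graph over $\Gamma$ with connection set $\Sigma$; it is denoted by $\operatorname{Cayley}(\Gamma, \Sigma)$. Note that $\Gamma$ acts on itself continuously and transitively and that left multiplication by an element of $\Gamma$ is an automorphism of the Cayley graph.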
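
The definition can be made concrete on a small finite group. The sketch below is an illustration under assumptions not made in the paper: the group is the symmetric group on three points (permutations stored as tuples), and the connection set consists of two transpositions. It builds the edge set from the rule "$\sigma$, $\tau$ adjacent if and only if $\sigma^{-1}\tau \in \Sigma$".

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Assumed example: S_3 with a connection set of two transpositions.
# Each transposition is its own inverse and the identity is excluded.
group = list(permutations(range(3)))
connection = {(1, 0, 2), (0, 2, 1)}

# s and t are adjacent if and only if s^{-1} t lies in the connection set.
edges = {(s, t) for s in group for t in group
         if compose(inverse(s), t) in connection}

# Left multiplication by g maps edges to edges.
left_ok = all((compose(g, s), compose(g, t)) in edges
              for g in group for (s, t) in edges)
```

In this example `left_ok` is true, matching the remark that left multiplication is an automorphism. Right multiplication need not preserve edges here, because this particular connection set is not closed under conjugation.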

We will use the following construction to relate a vertex-transitive graph to a Cayley graph over any transitive subgroup of its automorphism group. Let $G = (V, E)$ be a topological graph and $\Gamma \subseteq \operatorname{Aut}(G)$ be a group that acts transitively on $V$. Fix $x_0 \in V$ and set $\Sigma_{G, x_0} = \{\, \sigma \in \Gamma : (\sigma x_0, x_0) \in E \,\}$. Since $\Gamma \subseteq \operatorname{Aut}(G)$, we have $\Sigma_{G, x_0}^{-1} = \Sigma_{G, x_0}$.

Lemma 5.2

If $G = (V, E)$ is a locally independent graph and if $\Gamma \subseteq \operatorname{Aut}(G)$ is a topological group that acts continuously and transitively on $V$, then $\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0})$ is locally independent for all $x_0 \in V$. If moreover $\omega$ is a multiple of the pushforward of the Haar measure $\mu$ on $\Gamma$, then for every $M \geq 0$ the graph $G$ has a measurable independent set of measure at least $M$ if and only if $\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0})$ has a measurable independent set of measure at least $M/\omega(V)$; in particular,

\[ \alpha_\mu(\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0})) = \alpha_\omega(G)/\omega(V) \]

for all $x_0 \in V$.

Proof

Independent sets in $G$ and $\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0})$ are related: if $p\colon \Gamma \to V$ is the function such that $p(\sigma) = \sigma x_0$, then (i) if $I \subseteq V$ is independent, then so is $p^{-1}(I)$; conversely, (ii) if $I \subseteq \Gamma$ is independent, then so is $p(I)$.

Let us first prove the second statement of the lemma. By normalizing $\omega$ if necessary, we may assume that $\omega(V) = 1$. Then $\omega$ is the pushforward of $\mu$, and (i) implies directly that if $I \subseteq V$ is a measurable independent set, then $p^{-1}(I) \subseteq \Gamma$ is a measurable independent set with $\mu(p^{-1}(I)) = \omega(I)$.

Now suppose $I \subseteq \Gamma$ is a measurable independent set. The Haar measure is inner regular, meaning that we can take a sequence $C_1$, $C_2$, ... of compact subsets of $I$ such that $\mu(I \setminus C_n) < 1/n$. Let $C$ be the union of all $C_n$. Since $C \subseteq I$, we have that $C$, and hence $p(C)$, are both independent sets. Since $C_n$ is compact, $p(C_n)$ is also compact and hence measurable. But then since

\[ p(C) = \bigcup_{n=1}^\infty p(C_n), \]

it follows that $p(C)$ is measurable. Finally, $\omega(p(C)) = \mu(p^{-1}(p(C))) \geq \mu(C) = \mu(I)$, as we wanted.

As for the first statement of the lemma, suppose $G$ is locally independent and let $I \subseteq \Gamma$ be a compact independent set. The function $p$ is continuous and hence $p(I) \subseteq V$ is compact. Since $G$ is locally independent and $p(I)$ is independent, there is an open independent set $S$ in $G$ that contains $p(I)$. But then $p^{-1}(S)$ is an open independent set in $\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0})$ that contains $I$, and thus the Cayley graph is locally independent.

The theta parameters of G and any corresponding Cayley graph are also related:

Lemma 5.3

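If $G = (V, E)$ is a locally independent graph, if $\Gamma \subseteq \operatorname{Aut}(G)$ is a compact group that acts continuously and transitively on $V$, and if $\omega$ is a multiple of the pushforward of the Haar measure $\mu$ on $\Gamma$, then

\[ \vartheta(G, \mathcal{C}(V))/\omega(V) \leq \vartheta(\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0}), \mathcal{C}(\Gamma)) \]

for all $x_0 \in V$.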

In fact, there is nothing special about the cone C(V) in the above statement; the statement holds for any cone invariant under the action of Γ, for example the cone of positive kernels.

Proof

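We may assume that $\omega(V) = 1$. Fix $x_0 \in V$ and let $\Phi\colon L^2(V \times V) \to L^2(\Gamma \times \Gamma)$ be the operator such that

\[ \Phi(A)(\sigma, \tau) = A(\sigma x_0, \tau x_0) \]

for all $\sigma$, $\tau \in \Gamma$. Since $\Gamma$ acts continuously on $V$, if $A$ is continuous, then so is $\Phi(A)$. Moreover,

\[ \int_\Gamma \Phi(A)(\sigma, \sigma)\, d\mu(\sigma) = \int_V A(x, x)\, d\omega(x). \]

Indeed,

\[ \int_\Gamma \Phi(A)(\sigma, \sigma)\, d\mu(\sigma) = \int_\Gamma A(\sigma x_0, \sigma x_0)\, d\mu(\sigma). \tag{12} \]

Now, the right-hand side above is independent of $x_0$. For if $x_0' \neq x_0$, then since $\Gamma$ acts transitively on $V$ there is $\tau \in \Gamma$ such that $x_0' = \tau x_0$. Then using the right invariance of the Haar measure we get

\[ \int_\Gamma A(\sigma x_0', \sigma x_0')\, d\mu(\sigma) = \int_\Gamma A(\sigma\tau x_0, \sigma\tau x_0)\, d\mu(\sigma) = \int_\Gamma A(\sigma x_0, \sigma x_0)\, d\mu(\sigma). \]

The measure $\omega$ is the pushforward of $\mu$, so it is invariant under the action of $\Gamma$ and $\omega(V) = 1$. Continuing (12) we get

\[ \int_\Gamma A(\sigma x_0, \sigma x_0)\, d\mu(\sigma) = \int_V \int_\Gamma A(\sigma x, \sigma x)\, d\mu(\sigma)\, d\omega(x) = \int_\Gamma \int_V A(\sigma x, \sigma x)\, d\omega(x)\, d\mu(\sigma) = \int_V A(x, x)\, d\omega(x), \]

as we wanted. Similarly, one can prove that $\langle \Phi(A), \Phi(B) \rangle = \langle A, B \rangle$ for all $A$, $B \in L^2(V \times V)$; in particular, $\|\Phi(A)\| = \|A\|$ and we see that $\Phi$ is a bounded operator.

Now let $A$ be a feasible solution of $\vartheta(G, \mathcal{C}(V))$. Claim: $\Phi(A)$ is a feasible solution of $\vartheta(\operatorname{Cayley}(\Gamma, \Sigma_{G, x_0}), \mathcal{C}(\Gamma))$.

Indeed, $\int_\Gamma \Phi(A)(\sigma, \sigma)\, d\mu(\sigma) = 1$. If $\sigma$, $\tau \in \Gamma$ are adjacent in the Cayley graph, then $(\sigma x_0, \tau x_0) \in E$, so that $\Phi(A)(\sigma, \tau) = A(\sigma x_0, \tau x_0) = 0$. So it remains to show that $\Phi(A) \in \mathcal{C}(\Gamma)$.

Note $A$ is the limit, in the norm topology, of a sequence $(A_n)$, where each $A_n$ is a finite sum of kernels of the form $f \otimes f$ with $f \in L^2(V)$ nonnegative. Since $\Phi$ is linear and since $\Phi(f \otimes f) \in \mathcal{C}(\Gamma)$ for all nonnegative $f \in L^2(V)$, we have $\Phi(A_n) \in \mathcal{C}(\Gamma)$ for all $n$. Now $\|\Phi(A_n - A)\| = \|A_n - A\|$, so $\Phi(A)$ is the limit of $(\Phi(A_n))$, and hence $\Phi(A) \in \mathcal{C}(\Gamma)$, proving the claim.

Finally, $\langle J, \Phi(A) \rangle = \langle \Phi(J), \Phi(A) \rangle = \langle J, A \rangle$, and since $A$ is any feasible solution of $\vartheta(G, \mathcal{C}(V))$, the lemma follows.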

The Reynolds operator

Let $V$ be a compact Hausdorff space, let $\Gamma$ be a compact group that acts continuously and transitively on $V$, and consider on $V$ a multiple of the pushforward of the Haar measure $\mu$ on $\Gamma$. An important tool in the proof of Theorem 5.1 will be the Reynolds operator $R\colon L^2(V \times V) \to L^2(V \times V)$ that maps a kernel to its symmetrization: for $A \in L^2(V \times V)$,

\[ R(A)(x, y) = \int_\Gamma A(\sigma x, \sigma y)\, d\mu(\sigma) \]

almost everywhere (see footnote 3) in $V \times V$. The operator is defined given a group that acts on $V$; the group and its action will always be clear from context. Since $\Gamma$ is compact and therefore the Haar measure is both left and right invariant, the Reynolds operator is self adjoint, that is, $\langle R(A), B \rangle = \langle A, R(B) \rangle$.
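
A finite analogue may help fix ideas. In the sketch below, which is an illustration under assumptions and not the paper's setting, the group is the cyclic group $\mathbb{Z}_5$ acting on itself by translation, the Haar measure is the normalized counting measure, and kernels are dictionaries indexed by pairs; the names `reynolds` and `inner` are hypothetical.

```python
from fractions import Fraction

n = 5  # assumed example: the cyclic group Z_5 acting on itself by translation

def reynolds(A):
    """R(A)(x, y): average over g in Z_n of A(x + g, y + g)."""
    return {
        (x, y): Fraction(sum(A[(x + g) % n, (y + g) % n] for g in range(n)), n)
        for x in range(n) for y in range(n)
    }

def inner(A, B):
    """Inner product of kernels for the normalized counting measure."""
    return Fraction(sum(A[key] * B[key] for key in A), n * n)

# Two arbitrary sample kernels, chosen only for illustration.
A = {(x, y): x * x + 3 * y for x in range(n) for y in range(n)}
B = {(x, y): (x * y) % 4 for x in range(n) for y in range(n)}

RA, RB = reynolds(A), reynolds(B)

# Self-adjointness: <R(A), B> equals <A, R(B)>.
self_adjoint = inner(RA, B) == inner(A, RB)
# Invariance: R(A) is unchanged when both arguments are translated.
invariant = all(RA[(x + 1) % n, (y + 1) % n] == RA[x, y]
                for x in range(n) for y in range(n))
```

Both checks succeed in this toy case, and applying `reynolds` twice gives the same kernel as applying it once, as expected of a symmetrization.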

Lemma 5.4

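If $V$ is a compact space, if $\Gamma$ is a compact group that acts continuously and transitively on $V$, and if $V$ is metrizable via a $\Gamma$-invariant metric, then for every continuous $A\colon V \times V \to \mathbb{R}$ the kernel $R(A)$ is also continuous.

Here we say that a metric $d$ on $V$ is $\Gamma$-invariant if $d(\sigma x, \sigma y) = d(x, y)$ for all $x$, $y \in V$ and $\sigma \in \Gamma$.

Proof

If $d$ is a $\Gamma$-invariant metric on $V$, then

\[ d((x, y), (x', y')) = \max\{d(x, x'), d(y, y')\} \]

is a metric inducing the product topology on $V \times V$. Now $A$ is continuous, and hence uniformly continuous on the compact metric space $V \times V$. So for every $\epsilon > 0$ there is $\delta > 0$ such that for all $(x, y)$, $(x', y') \in V \times V$,

\[ \text{if $d((x, y), (x', y')) < \delta$, then $|A(x, y) - A(x', y')| < \epsilon$.} \]

Since $d$ is $\Gamma$-invariant, $d((\sigma x, \sigma y), (\sigma x', \sigma y')) = d((x, y), (x', y'))$, and

\[ \text{if $d((x, y), (x', y')) < \delta$, then $|A(\sigma x, \sigma y) - A(\sigma x', \sigma y')| < \epsilon$ for all $\sigma \in \Gamma$.} \tag{13} \]

So, given $\epsilon > 0$, if $\delta > 0$ is such that (13) holds, then $d((x, y), (x', y')) < \delta$ implies that

\[ |R(A)(x, y) - R(A)(x', y')| \leq \int_\Gamma |A(\sigma x, \sigma y) - A(\sigma x', \sigma y')|\, d\mu(\sigma) < \epsilon, \]

proving that $R(A)$ is continuous.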

Lemma 5.5

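If $V$ is a compact space, if $\Gamma$ is a compact group that acts continuously and transitively on $V$, if $V$ is metrizable via a $\Gamma$-invariant metric, and if on $V$ we consider a multiple $\omega$ of the pushforward of the Haar measure on $\Gamma$, then for every $f \in L^2(V)$ the kernel $R(f \otimes f)$ is continuous.

Proof

By normalizing $\omega$ if necessary, we may assume that $\omega(V) = 1$. Fix $x \in V$. Given a function $f \in L^2(V)$, consider the function $\phi\colon \Gamma \to \mathbb{R}$ such that $\phi(\sigma) = f(\sigma x)$; given $g \in L^2(V)$, define $\psi\colon \Gamma \to \mathbb{R}$ similarly. Then

\[ (f, g) = (\phi, \psi), \tag{14} \]

where $(\cdot, \cdot)$ denotes the usual $L^2$ inner product in the respective spaces; this implies in particular that $\phi$, $\psi \in L^2(\Gamma)$. To see (14) note that, since $\Gamma$ acts transitively, for every $x' \in V$ there is $\tau \in \Gamma$ such that $x' = \tau x$. Then use the invariance of the Haar measure to get

\[ \int_\Gamma f(\sigma x') g(\sigma x')\, d\mu(\sigma) = \int_\Gamma f(\sigma\tau x) g(\sigma\tau x)\, d\mu(\sigma) = \int_\Gamma f(\sigma x) g(\sigma x)\, d\mu(\sigma) = (\phi, \psi). \]

So, using the invariance of $\omega$ under the action of $\Gamma$,

\[ (\phi, \psi) = \int_V \int_\Gamma f(\sigma x') g(\sigma x')\, d\mu(\sigma)\, d\omega(x') = \int_\Gamma \int_V f(\sigma x') g(\sigma x')\, d\omega(x')\, d\mu(\sigma) = (f, g), \]

as we wanted.

Assume without loss of generality that $\|f\| \leq 1$. Continuous functions are dense in $L^2(V)$, so given $\epsilon > 0$ there is a continuous function $g$ such that $\|f - g\| < \epsilon$. Then, for $x$, $y \in V$,

\[ \biggl| \int_\Gamma f(\sigma x) f(\sigma y) - g(\sigma x) g(\sigma y)\, d\mu(\sigma) \biggr| = \biggl| \int_\Gamma f(\sigma x) f(\sigma y) - g(\sigma x) f(\sigma y) + g(\sigma x) f(\sigma y) - g(\sigma x) g(\sigma y)\, d\mu(\sigma) \biggr| \leq \int_\Gamma |f(\sigma x) - g(\sigma x)|\, |f(\sigma y)|\, d\mu(\sigma) + \int_\Gamma |g(\sigma x)|\, |f(\sigma y) - g(\sigma y)|\, d\mu(\sigma). \]

Since $\|f\| \leq 1$, and hence $\|g\| \leq 1 + \epsilon$, the Cauchy–Schwarz inequality together with (14) implies that the right-hand side above is less than $\epsilon + (1 + \epsilon)\epsilon$. So

\[ |R(f \otimes f)(x, y) - R(g \otimes g)(x, y)| < \epsilon + (1 + \epsilon)\epsilon \]

for all $x$, $y \in V$.

Now $g \otimes g$ is continuous, so Lemma 5.4 says that $R(g \otimes g)$ is continuous. With the above inequality, this implies that $R(f \otimes f)$ is the uniform limit of continuous functions, and hence continuous.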

Proof of Theorem 5.1

Under the hypotheses of Theorem 5.1, we must establish the identity $\vartheta(G, \mathcal{C}(V)) = \alpha_\omega(G)$. The ‘$\geq$’ inequality follows from Theorem 3.1; for the reverse inequality we use the following lemma.

Lemma 5.6

Let $G = (V, E)$ be a locally independent graph where $V$ is a compact Hausdorff space, let $\Gamma \subseteq \operatorname{Aut}(G)$ be a compact group that acts continuously and transitively on $V$, let $\omega$ be a multiple of the pushforward of the Haar measure on $\Gamma$, and assume $\Gamma$ is metrizable via a bi-invariant density metric for the Haar measure. If $A$ is a feasible solution of $\vartheta(G, \mathcal{C}(V))$, then there is a measurable independent set in $G$ with measure at least $\langle J, A \rangle$.

Proof

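In view of Lemmas 5.2 and 5.3, it is sufficient to prove that, if $\Sigma \subseteq \Gamma$ is a connection set such that $\operatorname{Cayley}(\Gamma, \Sigma)$ is a locally independent graph and if $A$ is a feasible solution of $\vartheta(\operatorname{Cayley}(\Gamma, \Sigma), \mathcal{C}(\Gamma))$, then there is an independent set in $\operatorname{Cayley}(\Gamma, \Sigma)$ of measure at least $\langle J, A \rangle$.

So fix a connection set $\Sigma \subseteq \Gamma$ and suppose $\operatorname{Cayley}(\Gamma, \Sigma)$ is locally independent. Throughout the rest of the proof, $E_\Sigma$ will be the edge set of $\operatorname{Cayley}(\Gamma, \Sigma)$. It is immediate that

\[ \vartheta(\operatorname{Cayley}(\Gamma, \Sigma), \mathcal{C}(\Gamma)) = \vartheta((\Gamma, E_\Sigma), \mathcal{C}(\Gamma)) = \vartheta((\Gamma, \operatorname{cl} E_\Sigma), \mathcal{C}(\Gamma)), \]

that is, considering the closure of the edge set does not change the optimal value. Together with Theorem 2.3, this implies that we may assume that $E_\Sigma$ is closed.

Notice that $\Gamma$ is a Hausdorff space (topological groups are Hausdorff spaces by definition) and that $\mu$ is an inner-regular Borel measure (because it is a Haar measure) that is positive on open sets (indeed, if $S \subseteq \Gamma$ is open, then $\{\, \sigma S : \sigma \in \Gamma \,\}$ is an open cover of $\Gamma$; since $\Gamma$ is compact, there is a finite subcover, hence $\mu(S) > 0$ or else we would have $\mu(\Gamma) = 0$). So we can use the results of Sect. 4.

There is a countable set $E' \subseteq E_\Sigma$ such that $\operatorname{cl} E' = E_\Sigma$. Indeed, since $E_\Sigma$ is closed and hence compact, for every $n \geq 1$ we can cover $E_\Sigma$ with finitely many open balls of radius $1/n$; now choose one point of $E_\Sigma$ in each such ball and let $E'$ be the set of all points chosen for $n = 1$, 2, ....

Let $(\sigma_1, \tau_1)$, $(\sigma_2, \tau_2)$, ... be an enumeration of $E'$. For $n \geq 1$ consider the kernel

\[ T_n = \sum_{i=1}^\infty 2^{-i} \mu(B(\sigma_i, 1/n))^{-1} \mu(B(\tau_i, 1/n))^{-1} \chi_{B(\sigma_i, 1/n) \times B(\tau_i, 1/n)}. \]

This is indeed a kernel: the norm of each summand is $2^{-i}$ times a constant that depends only on $n$, so $T_n$ is square integrable.

If $A\colon \Gamma \times \Gamma \to \mathbb{R}$ is continuous, and hence uniformly continuous, then for every $\epsilon > 0$ there is $n_0$ such that for all $n \geq n_0$ we have

\[ |A(\sigma, \tau) - A(\sigma_i, \tau_i)| < \epsilon \quad \text{for all $i \geq 1$, $\sigma \in B(\sigma_i, 1/n)$, and $\tau \in B(\tau_i, 1/n)$.} \]

This implies that

\[ \lim_{n \to \infty} \langle T_n, A \rangle = \sum_{i=1}^\infty 2^{-i} A(\sigma_i, \tau_i). \tag{15} \]

Let $A$ be a feasible solution of $\vartheta(\operatorname{Cayley}(\Gamma, \Sigma), \mathcal{C}(\Gamma))$. Since $\operatorname{tr} A = 1$, Theorem 4.8 tells us that $A \in \mathcal{T}(\Gamma)$, where $\mathcal{T}(\Gamma)$ is the tip of $\mathcal{C}(\Gamma)$; see Sect. 4.3. Also from Sect. 4.3 we know that $\mathcal{T}(\Gamma)$ is weakly compact, that it is a subset of $L^2(\Gamma \times \Gamma)$, whose weak topology is locally convex, and that the weak topology on $\mathcal{T}(\Gamma)$ is metrizable (see footnote 4). So we can apply Choquet’s theorem [43, Theorem 10.7] to get a probability measure $\nu$ on $\mathcal{T}(\Gamma)$ with barycenter $A$ and $\nu(\mathcal{X}) = 1$, where $\mathcal{X}$ is the set of extreme points of $\mathcal{T}(\Gamma)$. From Theorem 4.9 we know that any element of $\mathcal{X}$ is of the form $f \otimes f$ for some nonnegative $f \in L^2(\Gamma)$ that is either 0 or such that $\|f\| = 1$. So $A$ being the barycenter of $\nu$ means that for every $K \in L^2_{\mathrm{sym}}(\Gamma \times \Gamma)$ we have

\[ \langle K, A \rangle = \int_{\mathcal{X}} \langle K, f \otimes f \rangle\, d\nu(f \otimes f). \tag{16} \]

Since $A$ is feasible, its symmetrization $R(A)$ is also feasible, and in particular $R(A)(\sigma, \tau) = 0$ for all $(\sigma, \tau) \in E_\Sigma$. (Note that here we need to use Lemma 5.4, and for that we need the left invariance of the metric on $\Gamma$.) This, together with (15), (16), and the self-adjointness of the Reynolds operator gives

\[ 0 = \lim_{n \to \infty} \langle T_n, R(A) \rangle = \lim_{n \to \infty} \langle R(T_n), A \rangle = \lim_{n \to \infty} \int_{\mathcal{X}} \langle R(T_n), f \otimes f \rangle\, d\nu(f \otimes f) = \lim_{n \to \infty} \int_{\mathcal{X}} \langle T_n, R(f \otimes f) \rangle\, d\nu(f \otimes f). \]

Fatou’s lemma now says that we can exchange the integral with the limit (that becomes a lim inf) to get

\[ 0 \geq \int_{\mathcal{X}} \liminf_{n \to \infty} \langle T_n, R(f \otimes f) \rangle\, d\nu(f \otimes f). \]

So, since $T_n$ and all $f$s above are nonnegative, the set

\[ \{\, f \otimes f : \liminf_{n \to \infty} \langle T_n, R(f \otimes f) \rangle > 0 \,\} \]

has measure 0 with respect to $\nu$.

Taking $K = J$ in (16), we see that we can choose $f \geq 0$ with $\|f\| = 1$ such that $\langle J, f \otimes f \rangle \geq \langle J, A \rangle$ and

\[ \liminf_{n \to \infty} \langle T_n, R(f \otimes f) \rangle = 0. \]

By Lemma 5.5, $R(f \otimes f)$ is continuous, and hence from (15) we see that $f$ satisfies

\[ \sum_{i=1}^\infty 2^{-i} R(f \otimes f)(\sigma_i, \tau_i) = 0. \]

So it must be that $R(f \otimes f)(\sigma_i, \tau_i) = 0$ for all $i$, and hence $R(f \otimes f)(\sigma, \tau) = 0$ for all $(\sigma, \tau) \in E_\Sigma$.

We are now almost done. Let $I$ be the set of density points in the support of $f$ (note that $f \in L^2(\Gamma)$, so its support is not clearly defined; here it suffices to take, however, an arbitrary representative of the equivalence class of $f$ and then its support). Claim: $I$ is independent. Proof: Since $R(f \otimes f)(\sigma, \tau) = 0$ for every $(\sigma, \tau) \in E_\Sigma$, it suffices to show that if $\sigma$, $\tau \in I$, then $R(f \otimes f)(\sigma, \tau) > 0$.

Since $\sigma$, $\tau \in I$ are density points, there is $\delta > 0$ such that

\[ \frac{\mu(I \cap B(\sigma, \delta))}{\mu(B(\sigma, \delta))} \geq 2/3 \quad \text{and} \quad \frac{\mu(I \cap B(\tau, \delta))}{\mu(B(\tau, \delta))} \geq 2/3. \tag{17} \]

For $\zeta \in \Gamma$, write $N_\zeta = \{\, \gamma \in \Gamma : \gamma\zeta \in I \,\}$; note that $I = N_\zeta \zeta$. The right invariance of the metric on $\Gamma$ implies that $B(\zeta, \delta) = B(1, \delta)\zeta$ for all $\zeta \in \Gamma$ and $\delta > 0$. Then, using (17) and the invariance of $\mu$,

\[ 1 \geq \mu(B(1, \delta))^{-1} \mu((N_\sigma \cup N_\tau) \cap B(1, \delta)) = \mu(B(1, \delta))^{-1} \bigl( \mu(N_\sigma \cap B(1, \delta)) + \mu(N_\tau \cap B(1, \delta)) - \mu(N_\sigma \cap N_\tau \cap B(1, \delta)) \bigr) \geq 4/3 - \mu(B(1, \delta))^{-1} \mu(N_\sigma \cap N_\tau \cap B(1, \delta)). \]

Hence $\mu(N_\sigma \cap N_\tau) \geq \mu(N_\sigma \cap N_\tau \cap B(1, \delta)) \geq \mu(B(1, \delta))/3 > 0$. Finally, since $f(\gamma) > 0$ for all $\gamma \in I$,

\[ R(f \otimes f)(\sigma, \tau) = \int_{N_\sigma \cap N_\tau} f(\gamma\sigma) f(\gamma\tau)\, d\mu(\gamma) > 0, \]

proving the claim.

So $I$ is independent; it remains to estimate its measure. Recall $I$ has the same measure as the support of $f$. Since $\|f\| = 1$, if $\chi_\Gamma$ is the constant 1 function, then

\[ \langle J, A \rangle \leq \langle J, f \otimes f \rangle = (f, \chi_\Gamma)^2 = (f, \chi_I)^2 \leq \|f\|^2 \|\chi_I\|^2 = \mu(I), \]

proving the lemma.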
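
The last chain of inequalities is an instance of the Cauchy–Schwarz inequality and can be checked numerically in a finite analogue. The sketch below assumes, purely for illustration, a ten-point space with the normalized counting measure standing in for $\mu$, and an arbitrary nonnegative function rescaled to have norm 1.

```python
import math

n = 10
values = [0, 0, 3, 1, 0, 2, 0, 0, 4, 0]            # assumed nonnegative sample function
norm = math.sqrt(sum(v * v for v in values) / n)   # L2 norm, normalized counting measure
f = [v / norm for v in values]                     # rescaled so that ||f|| = 1

support_measure = sum(1 for v in f if v > 0) / n   # plays the role of mu(I)
mean_squared = (sum(f) / n) ** 2                   # plays the role of <J, f (x) f> = (f, 1)^2
```

For this sample, `mean_squared` is about 1/3 while `support_measure` is 0.4, so the quantity corresponding to $\langle J, f \otimes f \rangle$ is bounded by the measure of the support, as in the final step of the proof.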

Proof of Theorem 5.1

Theorem 3.1 says that $\vartheta(G, \mathcal{C}(V)) \geq \alpha_\omega(G)$. The reverse inequality follows directly from Lemma 5.6.

Notice that, if $\vartheta(G, \mathcal{C}(V))$ has an optimal solution, then Lemma 5.6 implies that the measurable independence number is attained, that is, there is a measurable independent set $I$ with $\omega(I) = \alpha_\omega(G)$. This is the case, for instance, of the distance graph $G = G(S^{n-1}, \{\theta\})$ for $n \geq 3$. In this case, a convergence argument, akin to the one we will use in Sect. 10.2, can be used to show that $\vartheta(G, \mathcal{C}(V))$ has an optimal solution. This provides another proof of a result of DeCorte and Pikhurko [9].

Distance graphs on the Euclidean space

Theorem 5.1 applies only to graphs on compact spaces, but thanks to a limit argument it can be extended to some graphs on $\mathbb{R}^n$; we will see now how to make this extension for distance graphs.

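Let $D \subseteq (0, \infty)$ be a set of forbidden distances and consider the $D$-distance graph $G(\mathbb{R}^n, D)$, where two vertices $x$, $y \in \mathbb{R}^n$ are adjacent if $\|x - y\| \in D$. To measure the size of an independent set in $G(\mathbb{R}^n, D)$ we use the upper density. Given a Lebesgue-measurable set $X \subseteq \mathbb{R}^n$, its upper density is

\[ \overline{\delta}(X) = \sup_{p \in \mathbb{R}^n} \limsup_{T \to \infty} \frac{\operatorname{vol}(X \cap (p + [-T, T]^n))}{\operatorname{vol} [-T, T]^n}, \]

where $\operatorname{vol}$ is the Lebesgue measure. The independence density of $G(\mathbb{R}^n, D)$ is

\[ \alpha_{\overline{\delta}}(G(\mathbb{R}^n, D)) = \sup\{\, \overline{\delta}(I) : \text{$I \subseteq \mathbb{R}^n$ is Lebesgue-measurable and independent} \,\}. \]

Periodic sets and limits of tori

The key idea is to consider independent sets that are periodic. A set $X \subseteq \mathbb{R}^n$ is periodic if there is a lattice $\Lambda \subseteq \mathbb{R}^n$ whose action leaves $X$ invariant, that is, $X + v = X$ for all $v \in \Lambda$; in this case we say that $\Lambda$ is a periodicity lattice of $X$. Given a lattice $\Lambda \subseteq \mathbb{R}^n$ spanned by vectors $u_1$, ..., $u_n$, its (strict) fundamental domain with respect to $u_1$, ..., $u_n$ is the set

\[ F = \{\, \alpha_1 u_1 + \cdots + \alpha_n u_n : \alpha_i \in [-1/2, 1/2) \text{ for all } i \,\}. \]

A periodic set with periodicity lattice $\Lambda$ repeats itself in copies of $F$ translated by vectors in $\Lambda$. We identify the torus $\mathbb{R}^n/\Lambda$ with the fundamental domain $F$ of $\Lambda$, identifying a coset $S$ with the unique $x \in F$ such that $S = x + \Lambda$. When speaking of an element $x \in \mathbb{R}^n/\Lambda$, it is always implicit that $x$ is the unique representative of $x + \Lambda$ that lies in the fundamental domain.

Given a lattice $\Lambda \subseteq \mathbb{R}^n$, consider the graph $G(\mathbb{R}^n/\Lambda, D)$ whose vertex set is the torus $\mathbb{R}^n/\Lambda$ and in which vertices $x$, $y \in \mathbb{R}^n/\Lambda$ are adjacent if there is $v \in \Lambda$ such that $\|x - y + v\| \in D$. Independent sets in $G(\mathbb{R}^n/\Lambda, D)$ correspond to periodic independent sets in $G(\mathbb{R}^n, D)$ with periodicity lattice $\Lambda$ and vice versa.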

Lemma 6.1

If $D \subseteq (0, \infty)$ is closed and bounded, then $G(\mathbb{R}^n/L\mathbb{Z}^n, D)$ is locally independent for every $L > 2 \sup D$.

The hypothesis that $D$ is bounded is essential: for instance, if $D = (1, \infty)$, then for every $L > 0$, any $x \in \mathbb{R}^n/L\mathbb{Z}^n$ would be adjacent to itself. When $D$ is unbounded, however, a theorem of Furstenberg et al. [17] implies that $\alpha_{\overline{\delta}}(G(\mathbb{R}^n, D)) = 0$, so this case is not really interesting.

Though the lemma is stated in terms of the lattice $L\mathbb{Z}^n$, a similar statement holds for any lattice $\Lambda$, as long as the shortest nonzero vectors have length greater than $2 \sup D$. The lattice $L\mathbb{Z}^n$ is chosen here for concreteness and also because it is the lattice that will be used later on.

Proof

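The torus $\mathbb{R}^n/L\mathbb{Z}^n$ is a metric space, for instance with the metric

\[ d(x, y) = \inf_{v \in L\mathbb{Z}^n} \|x - y + v\| \tag{18} \]

for $x$, $y \in \mathbb{R}^n/L\mathbb{Z}^n$. If $x$, $y$ lie in the fundamental domain with respect to the canonical basis vectors, then $\|x - y\|_\infty < L$ and $\|x - y\| < L n^{1/2}$. So if $\|v\|_\infty \geq L + L n^{1/2}$, then $\|x - y + v\| \geq \|x - y + v\|_\infty > L n^{1/2}$. This shows that the infimum above is attained by one of the finitely many vectors $v \in L\mathbb{Z}^n$ with $\|v\|_\infty < L + L n^{1/2}$.

Let $L > 2 \sup D$. Since any nonzero $v \in L\mathbb{Z}^n$ is such that $\|v\| \geq L$, the graph $G = G(\mathbb{R}^n/L\mathbb{Z}^n, D)$ is loopless. We show that $x$, $y \in \mathbb{R}^n/L\mathbb{Z}^n$ are adjacent in $G$ if and only if $d(x, y) \in D$, so $G$ is a distance graph. Since $D$ is closed, this will moreover imply that the edge set of $G$ is closed and then, since the torus is metrizable, from Theorem 2.2 it will follow that $G$ is locally independent.

If $d(x, y) \in D$, then immediately we have that $x$, $y$ are adjacent. So suppose that $x$, $y$ are adjacent, that is, that there is $v \in L\mathbb{Z}^n$ such that $\|x - y + v\| \in D$. Claim: $d(x, y) = \|x - y + v\|$. Indeed, take $w \in L\mathbb{Z}^n$, $w \neq v$. Note that $\|x - y + v\|_\infty \leq \|x - y + v\| \leq \sup D < L/2$ and that $\|w - v\|_\infty \geq L$. So

\[ \|x - y + w\| \geq \|x - y + w\|_\infty = \|x - y + v + (w - v)\|_\infty > L/2, \]

proving the claim.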
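
For the lattice $L\mathbb{Z}^n$ the metric (18) can be computed coordinate by coordinate, because the squared Euclidean norm splits over coordinates and each coordinate of $v$ can be chosen independently. The sketch below is an illustration with assumed sample values ($L = 3$, $D = \{1.25\}$, two points of the fundamental domain in the plane); the function names are hypothetical. It also compares the result with a brute-force minimum over the shifts in $\{-L, 0, L\}^n$, which suffice for points of the fundamental domain.

```python
import itertools
import math

def torus_distance(x, y, L):
    """d(x, y) = min over v in L*Z^n of ||x - y + v||, computed per coordinate."""
    total = 0.0
    for xi, yi in zip(x, y):
        diff = (xi - yi) % L          # reduce the difference to [0, L)
        diff = min(diff, L - diff)    # shortest way around the circle
        total += diff * diff
    return math.sqrt(total)

def brute_force_distance(x, y, L):
    """Minimum of ||x - y + v|| over v in {-L, 0, L}^n."""
    best = math.inf
    for shift in itertools.product((-L, 0.0, L), repeat=len(x)):
        norm = math.sqrt(sum((a - b + s) ** 2 for a, b, s in zip(x, y, shift)))
        best = min(best, norm)
    return best

def adjacent(x, y, L, distances, tol=1e-9):
    """Adjacency in the torus distance graph: d(x, y) lies in the set of distances."""
    d = torus_distance(x, y, L)
    return any(abs(d - r) <= tol for r in distances)

# Assumed sample data: L > 2 sup D holds since 3 > 2.5.
L = 3.0
D = [1.25]
x = (-1.0, 0.25)
y = (1.0, -0.5)
d = torus_distance(x, y, L)
```

Here the two representatives are at Euclidean distance greater than 2, yet `d` equals 1.25 because of the wrap-around in the first coordinate, so `adjacent(x, y, L, D)` holds.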

The independence numbers of the graphs $G(\mathbb{R}^n/L\mathbb{Z}^n, D)$ are also related to the independence density of $G(\mathbb{R}^n, D)$:

Lemma 6.2

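If $D \subseteq (0, \infty)$ is bounded, then

\[ \limsup_{L \to \infty} \frac{\alpha_{\operatorname{vol}}(G(\mathbb{R}^n/L\mathbb{Z}^n, D))}{\operatorname{vol}(\mathbb{R}^n/L\mathbb{Z}^n)} = \alpha_{\overline{\delta}}(G(\mathbb{R}^n, D)), \]

where $\operatorname{vol}$ denotes the Lebesgue measure.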

It is well known that the densities of periodic sphere packings approximate the sphere-packing density arbitrarily well [7, Appendix A]. The proof of the lemma above is very similar to the proof of this fact.

Proof

Any independent set in $G(\mathbb{R}^n/L\mathbb{Z}^n, D)$ gives rise to a periodic independent set in $G(\mathbb{R}^n, D)$, so the ‘$\leq$’ inequality is immediate. Let us then prove the reverse inequality.

If $D = \emptyset$, the statement is trivial. So assume $D \neq \emptyset$, write $r = \sup D$, and let $I \subseteq \mathbb{R}^n$ be a measurable independent set. From the definition of upper density, for every $\epsilon > 0$ there is a point $p \in \mathbb{R}^n$ such that for every $L_0 \geq 0$ there is $L \geq L_0$ with

\[ \biggl| \frac{\operatorname{vol}(I \cap (p + [-L/2, L/2]^n))}{\operatorname{vol} [-L/2, L/2]^n} - \overline{\delta}(I) \biggr| < \epsilon/2. \tag{19} \]

Now take $L > 2r$ satisfying (19) and write $X = I \cap (p + [-L/2 + r, L/2 - r]^n)$; in words, $X$ is obtained from $I \cap (p + [-L/2, L/2]^n)$ by erasing a border of width $r$ around the facets of the hypercube. Then consider the set

\[ I' = \bigcup_{v \in L\mathbb{Z}^n} X + v. \]

The set $I'$ is, by construction, periodic with periodicity lattice $L\mathbb{Z}^n$, measurable, and independent. If moreover we take $L$ large enough compared to $r$, then the volume of the border that was erased is negligible compared to the volume of the hypercube, and so using (19) we can make sure that $|\overline{\delta}(I') - \overline{\delta}(I)| < \epsilon$. Since $I$ is an arbitrary measurable independent set, we just proved that for any $\epsilon > 0$ and any $L_0 \geq 0$ there is $L \geq L_0$ such that

\[ \biggl| \frac{\alpha_{\operatorname{vol}}(G(\mathbb{R}^n/L\mathbb{Z}^n, D))}{\operatorname{vol}(\mathbb{R}^n/L\mathbb{Z}^n)} - \alpha_{\overline{\delta}}(G(\mathbb{R}^n, D)) \biggr| < \epsilon, \]

establishing the reverse inequality.

Some harmonic analysis

This is a good place to gather some notation and basic facts about harmonic analysis, which will be used next to extend Theorem 5.1 to $G(\mathbb{R}^n, D)$; harmonic analysis will again be used in Sects. 9 and 10. For background, see e.g. the book by Reed and Simon [38]. In this section, functions are complex-valued unless stated otherwise.

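A function $f \in L^\infty(\mathbb{R}^n)$ is said to be of positive type if $f(x) = \overline{f(-x)}$ for all $x \in \mathbb{R}^n$ and if for every $\rho \in L^1(\mathbb{R}^n)$ we have

\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(x - y) \rho(x) \overline{\rho(y)}\, dy\, dx \geq 0. \]

A continuous function $f\colon \mathbb{R}^n \to \mathbb{C}$ is of positive type if and only if for every finite $U \subseteq \mathbb{R}^n$ the matrix

\[ \bigl( f(x - y) \bigr)_{x, y \in U} \]

is (Hermitian) positive semidefinite. This characterization shows that if $f$ is a continuous function of positive type, then $\|f\|_\infty = f(0)$, since for every $x \in \mathbb{R}^n$ the matrix

\[ \begin{pmatrix} f(0) & f(x) \\ f(-x) & f(0) \end{pmatrix} \]

is positive semidefinite and hence $|f(x)| \leq f(0)$. The set of all functions of positive type is a closed and convex cone, which we denote by $\operatorname{PSD}(\mathbb{R}^n)$.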
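
The step from positive semidefiniteness of the $2 \times 2$ matrix to the bound $|f(x)| \leq f(0)$ can be spelled out as follows. This is a routine expansion added for convenience, using only the relation $f(-x) = \overline{f(x)}$ from the definition of positive type and the fact that a Hermitian positive semidefinite matrix has nonnegative diagonal entries and nonnegative determinant.

```latex
\[
  f(0) \geq 0
  \quad\text{and}\quad
  \det\begin{pmatrix} f(0) & f(x) \\ f(-x) & f(0) \end{pmatrix}
  = f(0)^2 - f(x) f(-x)
  = f(0)^2 - f(x)\overline{f(x)}
  = f(0)^2 - |f(x)|^2 \geq 0,
\]
\[
  \text{so that } |f(x)| \leq f(0) \text{ for all } x \in \mathbb{R}^n .
\]
```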

Bochner’s theorem says that functions of positive type are exactly the Fourier transforms of finite measures: a continuous function $f\colon \mathbb{R}^n \to \mathbb{C}$ is of positive type if and only if

\[ f(x) = \int_{\mathbb{R}^n} e^{i u \cdot x} \, d\nu(u) \tag{20} \]

for some finite (positive) Borel measure $\nu$, with the integral converging uniformly (see footnote 5) over $\mathbb{R}^n$.

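A continuous function of positive type $f\colon \mathbb{R}^n \to \mathbb{C}$ has a well-defined mean value

\[ M(f) = \lim_{T \to \infty} \frac{1}{\operatorname{vol} [-T, T]^n} \int_{[-T, T]^n} f(x) \, dx, \]

and if $\nu$ is the measure in (20), then $M(f) = \nu(\{0\})$. To see this last identity, for $T > 0$ and $u \in \mathbb{R}^n$, write

\[ g_T(u) = \frac{1}{\operatorname{vol} [-T, T]^n} \int_{[-T, T]^n} e^{i u \cdot x} \, dx. \]

Let $g\colon \mathbb{R}^n \to \mathbb{R}$ be the function such that $g(0) = 1$ and $g(u) = 0$ for all nonzero $u \in \mathbb{R}^n$. Then $g$ is the pointwise limit of $g_T$ as $T \to \infty$. Moreover, $|g_T(u)| \le 1$ for all $u$, and the constant one function is integrable with respect to the measure $\nu$, since $\nu$ is finite. So we may use Lebesgue’s dominated convergence theorem, and together with (20) we get

\[ M(f) = \lim_{T \to \infty} \int_{\mathbb{R}^n} g_T(u) \, d\nu(u) = \int_{\mathbb{R}^n} g(u) \, d\nu(u) = \nu(\{0\}). \]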
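The identity $M(f) = \nu(\{0\})$ can be illustrated numerically. The sketch below takes $n = 1$ and a hypothetical discrete measure $\nu$ with mass $0.3$ at $u = 0$ and mass $0.35$ at each of $u = \pm 2$, so that $f(x) = 0.3 + 0.7\cos(2x)$; the measure and all function names are assumptions made for illustration only. In one dimension $g_T(u) = \sin(Tu)/(Tu)$ for $u \ne 0$, which is what the code evaluates.

```python
import math

# Hypothetical discrete measure nu on R, given as (location u, mass) pairs.
ATOMS = [(0.0, 0.3), (2.0, 0.35), (-2.0, 0.35)]

def g_T(u, T):
    """Average of exp(i*u*x) over [-T, T]: sin(T*u)/(T*u), and 1 at u = 0."""
    if u == 0.0:
        return 1.0
    return math.sin(T * u) / (T * u)

def mean_over_box(T):
    """(1/2T) times the integral of f over [-T, T], computed as the integral of g_T against nu."""
    return sum(mass * g_T(u, T) for u, mass in ATOMS)

# The averages approach nu({0}) = 0.3 as T grows.
for T in (1.0, 10.0, 1000.0):
    print(T, mean_over_box(T))
```

Only the atom at the origin survives in the limit, which is the dominated-convergence argument in miniature.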

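A function $f\colon \mathbb{R}^n \to \mathbb{C}$ is periodic if there is a lattice $\Lambda \subseteq \mathbb{R}^n$ whose action leaves $f$ invariant, that is, $f(x + v) = f(x)$ for all $x \in \mathbb{R}^n$ and $v \in \Lambda$; in this case we say that $\Lambda$ is a periodicity lattice of $f$. If $f$ is periodic with periodicity lattice $\Lambda$, then

\[ M(f) = \frac{1}{\operatorname{vol}(\mathbb{R}^n/\Lambda)} \int_{\mathbb{R}^n/\Lambda} f(x) \, dx. \]

So we may equip $L^2(\mathbb{R}^n/\Lambda)$ with the inner product

\[ (f, g) = \operatorname{vol}(\mathbb{R}^n/\Lambda) \, M\bigl(x \mapsto f(x) \overline{g(x)}\bigr). \]

Then the functions $x \mapsto e^{i u \cdot x}$, for $u \in 2\pi\Lambda^*$ where

\[ \Lambda^* = \{\, v \in \mathbb{R}^n : u \cdot v \in \mathbb{Z} \text{ for all } u \in \Lambda \,\} \]

is the dual lattice of $\Lambda$, form a complete orthogonal system of $L^2(\mathbb{R}^n/\Lambda)$. Given $f \in L^2(\mathbb{R}^n/\Lambda)$ and $u \in 2\pi\Lambda^*$, the Fourier coefficient of $f$ at $u$ is

\[ \widehat{f}(u) = \frac{1}{\operatorname{vol}(\mathbb{R}^n/\Lambda)} \bigl(f, x \mapsto e^{i u \cdot x}\bigr). \]

We then have that

\[ f(x) = \sum_{u \in 2\pi\Lambda^*} \widehat{f}(u) e^{i u \cdot x} \]

with convergence in $L^2$ norm, and from this follows Parseval’s identity: if $f$, $g \in L^2(\mathbb{R}^n/\Lambda)$, then

\[ (f, g) = \sum_{u \in 2\pi\Lambda^*} \widehat{f}(u) \overline{\widehat{g}(u)}. \]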

An exact completely positive formulation

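Let $D \subseteq (0, \infty)$ be a set of forbidden distances and $\mathcal{K}(\mathbb{R}^n) \subseteq \mathrm{PSD}(\mathbb{R}^n)$ be a convex cone; consider the optimization problem

\[
\begin{array}{rl}
\text{maximize} & M(f) \\
& f(0) = 1, \\
& f(x) = 0 \quad \text{if } \|x\| \in D, \\
& f\colon \mathbb{R}^n \to \mathbb{R} \text{ is continuous and } f \in \mathcal{K}(\mathbb{R}^n).
\end{array} \tag{21}
\]

We denote both the problem above and its optimal value by $\vartheta(G(\mathbb{R}^n, D), \mathcal{K}(\mathbb{R}^n))$. Notice that, since $\mathcal{K}(\mathbb{R}^n) \subseteq \mathrm{PSD}(\mathbb{R}^n)$, every $f \in \mathcal{K}(\mathbb{R}^n)$ has a mean value, so the objective function is well defined.

Again, there are at least two cones that can be put in place of $\mathcal{K}(\mathbb{R}^n)$. One is the cone $\mathrm{PSD}(\mathbb{R}^n)$ of functions of positive type. The other is the cone of real-valued completely positive functions on $\mathbb{R}^n$, namely

\[ \mathcal{C}(\mathbb{R}^n) = \operatorname{cl}\{\, f \in L^\infty(\mathbb{R}^n) : f \text{ is real valued and continuous and } (f(x - y))_{x, y \in U} \in \mathcal{C}(U) \text{ for all finite } U \subseteq \mathbb{R}^n \,\}, \]

where the closure is taken in the $L^\infty$ norm; note that $\mathcal{C}(\mathbb{R}^n)$ is a cone contained in $\mathrm{PSD}(\mathbb{R}^n)$.

Theorem 6.3

If $D \subseteq (0, \infty)$ is closed, then $\vartheta(G(\mathbb{R}^n, D), \mathcal{C}(\mathbb{R}^n)) = \alpha_{\overline{\delta}}(G(\mathbb{R}^n, D))$.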

Write $G = G(\mathbb{R}^n, D)$ for short. Since $D$ is closed and does not contain 0, Theorem 2.2 implies that $G$ is locally independent. Recall that, if $D$ is unbounded, then a theorem of Furstenberg et al. [17] implies that $\alpha_{\overline{\delta}}(G) = 0$. In this case, one can show that $\vartheta(G, \mathcal{C}(\mathbb{R}^n)) = 0$; actually, $\vartheta(G, \mathrm{PSD}(\mathbb{R}^n)) = 0$, as shown by Oliveira and Vallentin [36, Theorem 5.1] (see also Sect. 10 below).

To prove the theorem we may therefore assume that $D$ is bounded and nonempty. Write $r = \sup D$, and for $L > 2r$ write $V_L = \mathbb{R}^n/L\mathbb{Z}^n$; note $V_L$ is a compact Abelian group. Lemma 6.1 says that $G_L = G(V_L, D)$ is locally independent. Since $V_L$ is metrizable via the bi-invariant metric (18), by taking $V = \Gamma = V_L$ and letting $\omega$ be the Lebesgue measure on $V_L$, the graph $G_L$ satisfies the hypotheses of Theorem 5.1, and so

\[ \vartheta(G_L, \mathcal{C}(V_L)) = \alpha_{\operatorname{vol}}(G_L). \]

Lemma 6.2 then implies that

\[ \limsup_{L \to \infty} \frac{\vartheta(G_L, \mathcal{C}(V_L))}{\operatorname{vol} V_L} = \alpha_{\overline{\delta}}(G). \tag{22} \]

So to prove Theorem 6.3 it suffices to show that the limit above is equal to $\vartheta(G, \mathcal{C}(\mathbb{R}^n))$. The proof of this fact is a bit technical, but the main idea is simple; we prove the following two assertions:

  (A1) If $A$ is a feasible solution of $\vartheta(G_L, \mathcal{C}(V_L))$ for $L > 2r$, then there is a feasible solution $f$ of $\vartheta(G, \mathcal{C}(\mathbb{R}^n))$ such that $M(f) = (\operatorname{vol} V_L)^{-1} \langle J, A \rangle$.

  (A2) If $f$ is a feasible solution of $\vartheta(G, \mathcal{C}(\mathbb{R}^n))$, then for every $L > 2r$ there is a feasible solution $A_L$ of $\vartheta(G_L, \mathcal{C}(V_L))$ and $(\operatorname{vol} V_L)^{-1} \langle J, A_L \rangle \to M(f)$ as $L \to \infty$.

The first assertion establishes that the limit in (22) is $\le \vartheta(G, \mathcal{C}(\mathbb{R}^n))$; the second assertion establishes the reverse inequality.

To prove (A1), fix $L > 2r$ and let $A$ be a feasible solution of $\vartheta(G_L, \mathcal{C}(V_L))$. By applying the Reynolds operator to $A$ if necessary, we may assume that $A$ is invariant under the action of $V_L$, that is, $A(x + z, y + z) = A(x, y)$ for all $x$, $y$, $z \in V_L$. Indeed, if $A$ is feasible, then $R(A)$ is also feasible, and to see this it suffices to show that $R(A)$ is continuous, since the other constraints are easily seen to be satisfied. But the continuity of $R(A)$ follows from Lemma 5.4, since $V_L$ is metrizable via the invariant metric (18).

Since $A$ is invariant, there is a function $g\colon V_L \to \mathbb{R}$ such that

\[ A(x, y) = g(x - y) \quad \text{for all } x, y \in V_L. \]

Then:

  (i) $g$ is continuous;

  (ii) since $L > 2r$, if $x \in \mathbb{R}^n$ is such that $\|x\| \in D$, then $x$ lies in the fundamental domain of $L\mathbb{Z}^n$ with respect to the canonical basis vectors, and so $g(x) = A(0, x) = 0$ since 0 and $x$ are adjacent in $G_L$;

  (iii) since $A \in \mathcal{C}(V_L)$, using Theorem 4.7 we see that $g \in \mathcal{C}(\mathbb{R}^n)$;

  (iv) since $A$ is invariant, its diagonal is constant, and then since $\operatorname{tr} A = 1$ we have $g(0) = (\operatorname{vol} V_L)^{-1}$.

This all implies that $f = (\operatorname{vol} V_L) g$ is a feasible solution of $\vartheta(G, \mathcal{C}(\mathbb{R}^n))$; all that is left to do is to compute $M(f)$. Since $g$ is periodic, its mean value is the integral of $g$ on the fundamental domain $F$ of the periodicity lattice divided by the volume of $F$, hence

\[ \langle J, A \rangle = \int_{V_L} \int_{V_L} g(x - y) \, dy \, dx = \int_{V_L} \int_{V_L} g(y) \, dy \, dx = (\operatorname{vol} V_L)^2 M(g), \]

and we get $M(f) = (\operatorname{vol} V_L) M(g) = (\operatorname{vol} V_L)^{-1} \langle J, A \rangle$, as we wanted.

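To prove (A2), let $f$ be a feasible solution of $\vartheta(G, \mathcal{C}(\mathbb{R}^n))$ and fix $L > 2r$. Let $W_L = [-L/2, L/2]^n$ and consider the kernel $H\colon W_L \times W_L \to \mathbb{R}$ such that $H(x, y) = f(x - y)$. Note $H$ is continuous and, since $f \in \mathcal{C}(\mathbb{R}^n)$, using Theorem 4.7 we see that $H \in \mathcal{C}(W_L)$.

Let $W_L' = [-L/2 + r, L/2 - r]^n$ and consider the kernel $F\colon V_L \times V_L \to \mathbb{R}$ such that

\[ F(x, y) = \begin{cases} H(x, y) & \text{if } x, y \in W_L'; \\ 0 & \text{otherwise.} \end{cases} \]

If $x$, $y \in V_L$ are adjacent in $G_L$, then $F(x, y) = 0$. Indeed, if either $x$ or $y$ is not in $W_L'$, then $F(x, y) = 0$. If $x$, $y \in W_L'$, then $\|x - y\|_\infty \le L - 2r$ and, if $v \in L\mathbb{Z}^n$ is nonzero, then $\|v\|_\infty \ge L$ and, since the Euclidean norm dominates the $\infty$-norm, $\|x - y + v\| \ge 2r > r$, whence $\|x - y + v\| \notin D$. But then if $x$ and $y$ are adjacent, we must have $\|x - y\| \in D$ and $F(x, y) = H(x, y) = f(x - y) = 0$.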

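Now $F$ is not continuous, but $R(F)$ is; here is a proof. Since $H$ is continuous and positive (recall $H \in \mathcal{C}(W_L)$), Mercer’s theorem says that there are continuous functions $\phi_i\colon W_L \to \mathbb{R}$ with $\|\phi_i\| = 1$ and numbers $\lambda_i \ge 0$ for $i = 1$, 2, ... such that $\sum_{i=1}^\infty \lambda_i < \infty$ and

\[ H(x, y) = \sum_{i=1}^\infty \lambda_i \phi_i(x) \phi_i(y) = \sum_{i=1}^\infty \lambda_i (\phi_i \otimes \phi_i^*)(x, y) \]

with absolute and uniform convergence over $W_L \times W_L$.

For $i = 1$, 2, ... define the function $\psi_i\colon V_L \to \mathbb{R}$ by setting

\[ \psi_i(x) = \begin{cases} \phi_i(x) & \text{if } x \in W_L'; \\ 0 & \text{otherwise.} \end{cases} \]

Then

\[ F(x, y) = \sum_{i=1}^\infty \lambda_i \psi_i(x) \psi_i(y) = \sum_{i=1}^\infty \lambda_i (\psi_i \otimes \psi_i^*)(x, y). \]

We show now that the series

\[ \sum_{i=1}^\infty \lambda_i R(\psi_i \otimes \psi_i^*)(x, y) \]

converges absolutely and uniformly over $V_L \times V_L$ and, since $R(\psi_i \otimes \psi_i^*)$ is continuous by Lemma 5.5, this will imply that $R(F)$ is continuous.

For $u \in V_L$ and $\psi\colon V_L \to \mathbb{R}$, write $\psi_u$ for the function such that $\psi_u(x) = \psi(x + u)$. Then

\[ R(\psi_i \otimes \psi_i^*)(x, y) = \frac{1}{\operatorname{vol} V_L} \int_{V_L} \psi_i(x + z) \psi_i(y + z) \, dz = \frac{1}{\operatorname{vol} V_L} \bigl((\psi_i)_x, (\psi_i)_y\bigr). \]

Now $|((\psi_i)_x, (\psi_i)_y)| \le \|\psi_i\|^2 \le \|\phi_i\|^2 = 1$, so

\[ \sum_{i=1}^\infty \bigl|\lambda_i ((\psi_i)_x, (\psi_i)_y)\bigr| \le \sum_{i=1}^\infty \lambda_i < \infty, \]

establishing absolute convergence. For uniform convergence, note that given $\epsilon > 0$ there is $m \ge 1$ such that $\sum_{i=m}^\infty \lambda_i < \epsilon$. But then

\[ \sum_{i=m}^\infty \bigl|\lambda_i ((\psi_i)_x, (\psi_i)_y)\bigr| \le \sum_{i=m}^\infty \lambda_i < \epsilon, \]

establishing uniform convergence and thus finishing the proof that $R(F)$ is continuous.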

Now that we know that $R(F)$ is continuous, we can show that $R(F) \in \mathcal{C}(V_L)$. Indeed, since $H$ is continuous and belongs to $\mathcal{C}(W_L)$, using Theorem 4.7 it is straightforward to show that, if $U \subseteq V_L$ is finite, then $F[U] \in \mathcal{C}(U)$ and hence also $R(F)[U] \in \mathcal{C}(U)$. But then, since $R(F)$ is continuous, Theorem 4.7 implies that $R(F) \in \mathcal{C}(V_L)$.

So far we can conclude that $A_L = (\operatorname{tr} R(F))^{-1} R(F)$ is a feasible solution of $\vartheta(G_L, \mathcal{C}(V_L))$. To estimate $\langle J, A_L \rangle$ we use the following fact.

Lemma 6.4

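If $f\colon \mathbb{R}^n \to \mathbb{C}$ is continuous and of positive type, then

\[ \lim_{T \to \infty} \frac{1}{(\operatorname{vol} [-T, T]^n)^2} \int_{[-T, T]^n} \int_{[-T, T]^n} f(x - y) \, dy \, dx = M(f). \tag{23} \]

Proof

The function $g\colon \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{C}$ such that $g(x, y) = f(x - y)$ is continuous and of positive type. Indeed, let $\nu$ be the measure given by Bochner’s theorem such that (20) holds and consider the Borel measure $\mu$ on $\mathbb{R}^n \times \mathbb{R}^n$ such that

\[ \mu(X) = \nu(\{\, u \in \mathbb{R}^n : (u, -u) \in X \,\}) \]

for all measurable $X \subseteq \mathbb{R}^n \times \mathbb{R}^n$. Then $\mu$ is a finite measure and

\[ g(x, y) = f(x - y) = \int_{\mathbb{R}^n} e^{i u \cdot (x - y)} \, d\nu(u) = \int_{\mathbb{R}^n \times \mathbb{R}^n} e^{i (u \cdot x + v \cdot y)} \, d\mu(u, v), \]

so $\mu$ is the measure representing $g$. But then the left-hand side of (23) is $M(g) = \mu(\{(0, 0)\}) = \nu(\{0\}) = M(f)$.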

Now note that

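\[ \operatorname{tr} R(F) = \int_{V_L} F(x, x) \, dx = (\operatorname{vol} W_L') f(0) = \operatorname{vol} W_L'. \]

Since $r$ is fixed,

\[ \lim_{L \to \infty} \frac{\operatorname{vol} W_L'}{\operatorname{vol} V_L} = 1. \]

So using the lemma above we get

\[
\begin{aligned}
\lim_{L \to \infty} (\operatorname{vol} V_L)^{-1} \langle J, A_L \rangle
&= \lim_{L \to \infty} \frac{1}{\operatorname{vol} V_L} \int_{V_L} \int_{V_L} A_L(x, y) \, dy \, dx \\
&= \lim_{L \to \infty} \frac{1}{(\operatorname{vol} V_L)(\operatorname{vol} W_L')} \int_{W_L'} \int_{W_L'} f(x - y) \, dy \, dx \\
&= \lim_{L \to \infty} \frac{\operatorname{vol} W_L'}{\operatorname{vol} V_L} \cdot \frac{1}{(\operatorname{vol} W_L')^2} \int_{W_L'} \int_{W_L'} f(x - y) \, dy \, dx \\
&= M(f),
\end{aligned}
\]

finishing the proof of (A2). Here, the second identity follows from the definition of $A_L$ and the self-adjointness of the Reynolds operator.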

Proof of Theorem 6.3

Follows from (A1) and (A2), proved above.

The Boolean-quadratic cone and polytope

As was said in Sect. 1, one can use valid inequalities for $\mathcal{C}(V)$ to strengthen the upper bound provided by $\vartheta(G, \mathrm{PSD}(V))$. This is one of our goals: to obtain better upper bounds in some particular cases of interest, like the unit-distance graph on Euclidean space or distance graphs on the sphere.

From a practical standpoint, and for reasons that will become clear soon, instead of using valid inequalities for the completely positive cone, it is more convenient to use valid inequalities for the Boolean-quadratic cone. Given a nonempty finite set $V$, the Boolean-quadratic cone on $V$ is

\[ \mathrm{BQC}(V) = \operatorname{cone}\{\, f \otimes f^* : f\colon V \to \{0, 1\} \,\}; \]

notice that $\mathrm{BQC}(V) \subseteq \mathcal{C}(V)$. The dual cone of $\mathrm{BQC}(V)$ is

\[ \mathrm{BQC}^*(V) = \{\, Z\colon V \times V \to \mathbb{R} : Z \text{ is symmetric and } \langle Z, A \rangle \ge 0 \text{ for all } A \in \mathrm{BQC}(V) \,\}. \]

Now let $V$ be a compact topological space and $\omega$ be a finite Borel measure on $V$ and consider the cone

\[ \mathrm{BQC}(V) = \operatorname{cl}\{\, A \in L^2(V \times V) : A \text{ is continuous and } A[U] \in \mathrm{BQC}(U) \text{ for all finite } U \subseteq V \,\}, \]

with the closure taken in the $L^2$-norm topology. In view of Theorem 4.7, if $V$ is a compact Hausdorff space and $\omega$ is positive on open sets, then $\mathrm{BQC}(V) \subseteq \mathcal{C}(V)$.

Let $V$ be a compact Hausdorff space and $\omega$ be a finite Borel measure on $V$. If $G = (V, E)$ is a locally independent graph, then since $\mathrm{BQC}(V) \subseteq \mathcal{C}(V)$ we have

\[ \vartheta(G, \mathrm{BQC}(V)) \le \vartheta(G, \mathcal{C}(V)). \]

If $V$ is finite and $\omega$ is the counting measure, then recalling the proof of the inequality $\vartheta(G, \mathcal{C}(V)) \ge \alpha_\omega(G)$ given in Sect. 3 we immediately get

\[ \vartheta(G, \mathrm{BQC}(V)) \ge \alpha_\omega(G). \tag{24} \]

If $V$ is infinite, it is not clear that (24) holds; at least the proof of Theorem 3.1 does not go through anymore: if $f\colon V \to \mathbb{R}$ is the continuous function approximating the characteristic function of the independent set, then in general it is not true that $\|f\|^{-2} f \otimes f^* \in \mathrm{BQC}(V)$. If $G$ and $\omega$ satisfy the hypotheses of Theorem 5.1, however, then (24) holds and we have:

Theorem 7.1

Let $G = (V, E)$ be a locally independent graph where $V$ is a compact Hausdorff space, $\Gamma \subseteq \operatorname{Aut}(G)$ be a compact group that acts continuously and transitively on $V$, and $\omega$ be a multiple of the pushforward of the Haar measure on $\Gamma$. If $\Gamma$ is metrizable via a bi-invariant density metric for the Haar measure, then $\vartheta(G, \mathrm{BQC}(V)) = \alpha_\omega(G)$.

The proof requires the use of the Reynolds operator on V, namely of Lemma 5.5. For this we need a Γ-invariant metric on V, whose existence is implied by the metrizability of Γ via a bi-invariant metric, as shown by the following lemma.

Lemma 7.2

Let V be a compact Hausdorff space and Γ be a compact group that acts continuously and transitively on V. If Γ is metrizable via a bi-invariant metric, then V is metrizable via a Γ-invariant metric.

Proof

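For $x \in V$, consider the map $p_x\colon \Gamma \to V$ such that $p_x(\sigma) = \sigma x$; the continuous action of $\Gamma$ implies that $p_x$ is continuous for every $x \in V$. Since $\Gamma$ is compact and Hausdorff and $V$ is Hausdorff, $p_x$ is a closed and proper map: images of closed sets are closed and preimages of compact sets are compact.

Let $d_\Gamma$ be a bi-invariant metric that induces the topology on $\Gamma$ and for $\sigma \in \Gamma$ and $\delta \ge 0$ let

\[ \overline{B}_\Gamma(\sigma, \delta) = \{\, \tau \in \Gamma : d_\Gamma(\sigma, \tau) \le \delta \,\} \]

be the closed ball in $\Gamma$ with center $\sigma$ and radius $\delta$. For $x$, $y \in V$, let

\[ d_V(x, y) = \inf\{\, \delta : y \in p_x(\overline{B}_\Gamma(1, \delta)) \,\} = \inf\{\, d_\Gamma(1, \sigma) : \sigma \in \Gamma,\ \sigma x = y \,\}. \]

It is easy to show that $d_V$ is a $\Gamma$-invariant metric; we show now that it induces the topology on $V$.

To this end, for $x \in V$ consider the closed ball with center $x$ and radius $\delta \ge 0$, namely

\[ \overline{B}_V(x, \delta) = \{\, y \in V : d_V(x, y) \le \delta \,\} = \{\, \sigma x : \sigma \in \Gamma \text{ and } d_\Gamma(1, \sigma) \le \delta \,\} = p_x(\overline{B}_\Gamma(1, \delta)). \]

Notice that this ball is closed since $\overline{B}_\Gamma(1, \delta)$ is closed and $p_x$ is a closed map. We show now that the collection of finite unions of such balls is a base of closed sets of the topology on $V$, and it will follow that the metric $d_V$ induces the topology on $V$.

Let $X \subseteq V$ be a closed set and take $x \notin X$. Note $p_x^{-1}(X)$ and $p_x^{-1}(\{x\})$ are compact and disjoint, so

\[ \delta = d_\Gamma\bigl(p_x^{-1}(X), p_x^{-1}(\{x\})\bigr) > 0. \]

Since $p_x^{-1}(X)$ is compact, it can be covered by finitely many closed balls of radius $\delta/2$, say $\overline{B}_\Gamma(\sigma_i, \delta/2)$ with $\sigma_i \in p_x^{-1}(X)$ for $i = 1$, ..., $N$; moreover, by the definition of $\delta$, we have that $p_x^{-1}(\{x\})$ is disjoint from each such ball. But then

\[ X \subseteq p_x(p_x^{-1}(X)) \subseteq \bigcup_{i=1}^N p_x(\overline{B}_\Gamma(\sigma_i, \delta/2)) = \bigcup_{i=1}^N p_{\sigma_i x}(\overline{B}_\Gamma(1, \delta/2)) = \bigcup_{i=1}^N \overline{B}_V(\sigma_i x, \delta/2) \]

and $x \notin \bigcup_{i=1}^N \overline{B}_V(\sigma_i x, \delta/2)$. We have shown that, given any closed set $X \subseteq V$ and any $x \notin X$, there is a finite union of $d_V$-balls that contains $X$ but not $x$, that is, finite unions of $d_V$-balls form a base of closed sets of the topology on $V$.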

Proof of Theorem 7.1

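Since $\mathrm{BQC}(V) \subseteq \mathcal{C}(V)$, from Theorem 5.1 it suffices to show that (24) holds. So let $I \subseteq V$ be a measurable independent set with $\omega(I) > 0$ (such a set exists since $G$ is locally independent and $\omega$ is positive on open sets) and consider the kernel $A = \omega(I)^{-1} R(\chi_I \otimes \chi_I^*)$. Using Lemma 7.2 we know that $V$ is metrizable via a $\Gamma$-invariant metric, and then using Lemma 5.5 we see that $A$ is continuous; it is also immediate that $\operatorname{tr} A = 1$ and $A(x, y) = 0$ if $x$, $y \in V$ are adjacent. Let us then show that $A \in \mathrm{BQC}(V)$.

Indeed, given a finite $U \subseteq V$, note that for any $Z \in \mathrm{BQC}^*(U)$, if $\mu$ is the Haar measure on $\Gamma$, then

\[ \sum_{x, y \in U} Z(x, y) A(x, y) = \omega(I)^{-1} \int_\Gamma \sum_{x, y \in U} Z(x, y) \chi_I(\sigma x) \chi_I(\sigma y) \, d\mu(\sigma) \ge 0, \]

whence $A[U] \in \mathrm{BQC}(U)$. So $A$ is a feasible solution of $\vartheta(G, \mathrm{BQC}(V))$ with $\langle J, A \rangle = \omega(I)$, establishing (24).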

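A corresponding result holds for the bound for distance graphs on $\mathbb{R}^n$, presented in Sect. 6, by considering the cone

\[ \mathrm{BQC}(\mathbb{R}^n) = \operatorname{cl}\{\, f \in L^\infty(\mathbb{R}^n) : f \text{ is real valued and continuous and } (f(x - y))_{x, y \in U} \in \mathrm{BQC}(U) \text{ for all finite } U \subseteq \mathbb{R}^n \,\}, \]

with the closure taken in the $L^\infty$ norm. Note that $\mathrm{BQC}(\mathbb{R}^n) \subseteq \mathcal{C}(\mathbb{R}^n)$.

Theorem 7.3

If $D \subseteq (0, \infty)$ is closed, then

\[ \vartheta(G(\mathbb{R}^n, D), \mathrm{BQC}(\mathbb{R}^n)) = \alpha_{\overline{\delta}}(G(\mathbb{R}^n, D)). \]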

Proof

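Recall from Sect. 6.3 that we may assume $D$ is bounded. In view of Theorem 6.3, it then suffices to show that $\vartheta(G(\mathbb{R}^n, D), \mathrm{BQC}(\mathbb{R}^n)) \ge \alpha_{\overline{\delta}}(G(\mathbb{R}^n, D))$.

Let $I \subseteq \mathbb{R}^n$ be a measurable and periodic independent set with $\overline{\delta}(I) > 0$ (which exists since $D$ is bounded) and consider the function $f\colon \mathbb{R}^n \to \mathbb{R}$ given by

\[ f(x) = \overline{\delta}(I)^{-1} \lim_{T \to \infty} \frac{1}{\operatorname{vol} [-T, T]^n} \int_{[-T, T]^n} \chi_I(z) \chi_I(x + z) \, dz \]

(notice the limit above exists since $I$ is periodic). This function is continuous and satisfies $f(0) = 1$ and $f(x) = 0$ if $\|x\| \in D$, since if $\|x\| \in D$ then for all $z$ we cannot have both $z$ and $x + z \in I$. Moreover, $f \in \mathrm{BQC}(\mathbb{R}^n)$: if $U \subseteq \mathbb{R}^n$ is finite and $Z \in \mathrm{BQC}^*(U)$, then

\[
\begin{aligned}
\sum_{x, y \in U} Z(x, y) f(x - y)
&= \overline{\delta}(I)^{-1} \lim_{T \to \infty} \frac{1}{\operatorname{vol} [-T, T]^n} \int_{[-T, T]^n} \sum_{x, y \in U} Z(x, y) \chi_I(z) \chi_I(x - y + z) \, dz \\
&= \overline{\delta}(I)^{-1} \lim_{T \to \infty} \frac{1}{\operatorname{vol} [-T, T]^n} \int_{[-T, T]^n} \sum_{x, y \in U} Z(x, y) \chi_I(x + z) \chi_I(y + z) \, dz \ge 0,
\end{aligned}
\]

whence $f$ is a feasible solution of $\vartheta(G(\mathbb{R}^n, D), \mathrm{BQC}(\mathbb{R}^n))$. We also have $M(f) = \overline{\delta}(I)$. Indeed, the characteristic function $\chi_I$ of $I$ is periodic, say with periodicity lattice $\Lambda$. For $x \in \mathbb{R}^n$, consider the function $(\chi_I)_x$ such that $(\chi_I)_x(z) = \chi_I(x + z)$. Then it is easy to check that the Fourier coefficient of $(\chi_I)_x$ at $u$ equals $e^{i u \cdot x} \widehat{\chi}_I(u)$, and thus Parseval’s identity gives us

\[ f(x) = \overline{\delta}(I)^{-1} \bigl((\chi_I)_x, \chi_I\bigr) = \overline{\delta}(I)^{-1} \sum_{u \in 2\pi\Lambda^*} |\widehat{\chi}_I(u)|^2 e^{i u \cdot x}. \]

From this it is clear that $M(f) = \widehat{f}(0) = \overline{\delta}(I)^{-1} |\widehat{\chi}_I(0)|^2 = \overline{\delta}(I)$, since $\widehat{\chi}_I(0) = \overline{\delta}(I)$.

To finish, note that $I$ is any measurable and periodic independent set, so using Lemma 6.2 the theorem follows.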

Theorem 7.1 tells us that any number of constraints of the form

\[ \sum_{x, y \in U} Z(x, y) A(x, y) \ge 0, \]

for finite $U \subseteq V$ and $Z \in \mathrm{BQC}^*(U)$, can be added to $\vartheta(G, \mathrm{PSD}(V))$, and that the resulting problem still provides an upper bound for the independence number. Moreover, if all such constraints are added, then we obtain the independence number. Theorem 7.3 says the same for the independence density of $G(\mathbb{R}^n, D)$.

The main advantage of using $\mathrm{BQC}(U)$ instead of $\mathcal{C}(U)$ is that the Boolean-quadratic cone in finite dimension is a polyhedral cone, so for finite $U$ one is able to compute all (or at least some of) the facets of $\mathrm{BQC}(U)$, though the amount of work gets prohibitively large already for $|U| = 7$ [11, §30.6]. The better upper bounds described in Sects. 8 and 9 were obtained by the use of constraints based on such facets.

Subgraph constraints

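Constraints from subgraphs of $G(\mathbb{R}^n, \{1\})$ played a central role in the computation of the best upper bounds for the independence density of the unit-distance graph [2, 22, 36].

Such subgraph constraints are as follows. Let $G = (V, E)$ be a locally independent graph and $\omega$ be a Borel measure on $V$ and assume $G$ and $\omega$ satisfy the hypotheses of Theorem 5.1. Let $U \subseteq V$ be finite and for every $x_0 \in V$ consider the inequality

\[ \sum_{y \in U} A(x_0, y) \le \alpha(G[U]) A(x_0, x_0), \tag{25} \]

where $A \in L^2(V \times V)$ is continuous and $G[U]$ is the subgraph of $G$ induced by $U$.

After adding any number of such constraints to $\vartheta(G, \mathrm{PSD}(V))$ we still get an upper bound for $\alpha_\omega(G)$. Indeed, if $I \subseteq V$ is a measurable independent set of positive measure, then $A = \omega(I)^{-1} R(\chi_I \otimes \chi_I^*)$ is continuous, positive, and such that $\operatorname{tr} A = 1$, $A(x, y) = 0$ if $x$, $y \in V$ are adjacent, and $\langle J, A \rangle = \omega(I)$ (recall the proof of Theorem 7.1). Moreover, since $A(x, x) = \omega(V)^{-1}$ for all $x \in V$, and since for every $\sigma \in \Gamma \subseteq \operatorname{Aut}(G)$ the set $\sigma^{-1} I$ is independent, we get

\[
\begin{aligned}
\sum_{y \in U} A(x_0, y)
&= \sum_{y \in U} \omega(I)^{-1} \int_\Gamma \chi_I(\sigma x_0) \chi_I(\sigma y) \, d\mu(\sigma) \\
&= \omega(I)^{-1} \int_\Gamma \chi_I(\sigma x_0) \sum_{y \in U} \chi_I(\sigma y) \, d\mu(\sigma) \\
&= \omega(I)^{-1} \int_\Gamma \chi_I(\sigma x_0) \, |U \cap \sigma^{-1} I| \, d\mu(\sigma) \\
&\le \frac{\alpha(G[U])}{\omega(V)} = \alpha(G[U]) A(x_0, x_0).
\end{aligned}
\]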

Notice these constraints do not come directly from C(V) or BQC(V), since they rely on the edge set of the graph. Theorem 5.1 says that they must be somehow implied by the constraints coming from C(V) together with the other constraints of problem ϑ(G,C(V)), but the way in which this implication is carried out is not necessarily simple: it could be that only by adding many constraints from the completely positive cone for sets other than U one would get the implication.

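The situation is clearer when one considers instead the Boolean-quadratic cone. In this case, a subgraph constraint for a given finite $U \subseteq V$ and a given $x_0 \in V$ is implied by a single constraint from $\mathrm{BQC}(U \cup \{x_0\})$ together with the constraints $A(x, y) = 0$ for adjacent $x$ and $y$.

To see this, assume for the sake of simplicity that $x_0 \notin U$ and write $U' = U \cup \{x_0\}$ (if $x_0 \in U$, a simple modification of the argument below works). Let $C\colon U' \times U' \to \mathbb{R}$ be the matrix such that

\[ C(x, y) = \begin{cases} \alpha(G[U]) & \text{if } x = y = x_0; \\ -1/2 & \text{if } x = x_0 \text{ or } y = x_0; \\ 0 & \text{otherwise.} \end{cases} \]

Then the subgraph constraint (25) is

\[ \sum_{x, y \in U'} C(x, y) A(x, y) \ge 0. \]

We now show that there are matrices $Z \in \mathrm{BQC}^*(U')$ and $B\colon U' \times U' \to \mathbb{R}$ such that $B(x, y) = 0$ if $x$, $y \in U'$ are not adjacent satisfying $C = Z + B$, and it will follow that, if $A$ is feasible for $\vartheta(G, \mathrm{PSD}(V))$ and $\sum_{x, y \in U'} Z(x, y) A(x, y) \ge 0$, then

\[ \sum_{x, y \in U'} C(x, y) A(x, y) = \sum_{x, y \in U'} Z(x, y) A(x, y) + \sum_{x, y \in U'} B(x, y) A(x, y) \ge 0, \]

whence $A$ satisfies the subgraph constraint.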

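For $Z$, consider the matrix

\[ Z(x, y) = \begin{cases} \alpha(G[U]) & \text{if } x = y = x_0; \\ -1/2 & \text{if } x = x_0 \text{ or } y = x_0; \\ 1/2 & \text{if } (x, y) \in E; \\ 0 & \text{otherwise,} \end{cases} \tag{26} \]

and for $B$ take the matrix with $-1/2$ on entries corresponding to edges of $G[U]$ and 0 everywhere else. Then $C = Z + B$, and it remains to show that $Z \in \mathrm{BQC}^*(U')$. To this end, take $f\colon U' \to \{0, 1\}$. If $f(x_0) = 0$, then clearly $\langle Z, f \otimes f^* \rangle \ge 0$. So suppose $f(x_0) = 1$ and write $S = \{\, x \in U : f(x) = 1 \,\}$. Then

\[ \langle Z, f \otimes f^* \rangle = \alpha(G[U]) - |S| + |E(G[S])|. \]

Now let $X \subseteq S$ be a maximal independent set in $G[S]$. Then $|X| \le \alpha(G[U])$. Since $X$ is maximal, every $y \in S \setminus X$ is adjacent to some $x \in X$, so $|S \setminus X| \le |E(G[S])|$, and

\[ \alpha(G[U]) - |S| + |E(G[S])| = \alpha(G[U]) - |X| - |S \setminus X| + |E(G[S])| \ge 0, \]

showing that $Z \in \mathrm{BQC}^*(U')$.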
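For a small graph, the membership of $Z$ in the dual cone can also be confirmed by exhaustive enumeration. The following sketch builds the matrix of (26) for a finite graph and minimizes $\langle Z, f \otimes f^* \rangle$ over all $f$ with values in $\{0, 1\}$. The choice of the 5-cycle as $G[U]$, the label `"x0"` for the extra point, and all function names are assumptions made for illustration.

```python
from itertools import combinations, product

def independence_number(vertices, edges):
    """Brute-force alpha(G): size of a largest vertex set spanning no edge."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(vertices), 0, -1):
        for cand in combinations(vertices, size):
            if all(frozenset(p) not in edge_set for p in combinations(cand, 2)):
                return size
    return 0

def subgraph_matrix(vertices, edges, apex="x0"):
    """Matrix Z of (26) on U' = U + {apex}, stored as a dict on ordered pairs."""
    alpha = independence_number(vertices, edges)
    nodes = [apex] + list(vertices)
    Z = {(x, y): 0.0 for x in nodes for y in nodes}
    Z[(apex, apex)] = float(alpha)
    for x in vertices:
        Z[(apex, x)] = Z[(x, apex)] = -0.5
    for x, y in edges:
        Z[(x, y)] = Z[(y, x)] = 0.5
    return nodes, Z

def min_pairing(nodes, Z):
    """Minimum of <Z, f (x) f*> over all f: nodes -> {0, 1}."""
    best = None
    for bits in product((0, 1), repeat=len(nodes)):
        support = [v for v, b in zip(nodes, bits) if b]
        value = sum(Z[(x, y)] for x in support for y in support)
        best = value if best is None else min(best, value)
    return best

# Example: U is the vertex set of a 5-cycle (an assumption for illustration).
cycle = list(range(5))
cycle_edges = [(i, (i + 1) % 5) for i in range(5)]
nodes, Z = subgraph_matrix(cycle, cycle_edges)
print(min_pairing(nodes, Z))
```

A nonnegative minimum means that $Z$ is nonnegative on every generator of the Boolean-quadratic cone, and hence on the whole cone.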

Finally, subgraph constraints can also be used for distance graphs on $\mathbb{R}^n$: given a set $D \subseteq (0, \infty)$ of forbidden distances, one can add to $\vartheta(G(\mathbb{R}^n, D), \mathrm{PSD}(\mathbb{R}^n))$ any number of constraints of the form

\[ \sum_{y \in U} f(x_0 - y) \le \alpha(G(\mathbb{R}^n, D)[U]) f(0), \]

where $U \subseteq \mathbb{R}^n$ is finite and $x_0 \in \mathbb{R}^n$ is fixed. Such constraints have been used by Oliveira and Vallentin [36] to get improved upper bounds for the independence density of the unit-distance graph on $\mathbb{R}^n$ in several dimensions; the sets $U$ used were always vertex sets of regular simplices in $\mathbb{R}^n$. Keleti et al. [22] used the points of the Moser spindle to get improved bounds for the independence density of $G(\mathbb{R}^2, \{1\})$; Bachoc et al. [2] used several different graphs to get better bounds for the independence density of $G(\mathbb{R}^n, \{1\})$ for $n = 4$, ..., 24 and a better asymptotic bound.

A new class of graphical facets of the Boolean-quadratic cone

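The matrix $Z$ defined in (26) is sometimes an extreme ray of $\mathrm{BQC}^*(U')$, that is, $\langle Z, A \rangle \ge 0$ induces a facet of $\mathrm{BQC}(U')$. In fact, matrices like $Z$ comprise a whole class of facets of the Boolean-quadratic cone that generalizes the class of clique inequalities introduced by Padberg [37].

Let $G = (V, E)$ be a finite graph with at least two vertices. We say that $G$ is $\alpha$-critical if $\alpha(G - e) > \alpha(G)$ for all $e \in E$; $\alpha$-critical graphs have been extensively studied in the context of combinatorial optimization [42, §68.5].

Assume $\emptyset \notin V$ and write $W = V \cup \{\emptyset\}$. Consider the matrix $Q_G\colon W \times W \to \mathbb{R}$ defined as

\[ Q_G(x, y) = \begin{cases} \alpha(G) & \text{if } x = y = \emptyset; \\ -1/2 & \text{if } x = \emptyset \text{ or } y = \emptyset; \\ 1/2 & \text{if } (x, y) \in E; \\ 0 & \text{otherwise.} \end{cases} \]

Theorem 7.4

Let $G = (V, E)$ be a finite graph with at least two vertices, and assume $\emptyset \notin V$. The inequality $\langle Q_G, A \rangle \ge 0$ induces a facet of $\mathrm{BQC}(W)$, where $W = V \cup \{\emptyset\}$, if and only if $G$ is connected and $\alpha$-critical.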

Proof

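The argument given in the previous section shows that $\langle Q_G, A \rangle \ge 0$ is valid for $\mathrm{BQC}(W)$; let us then establish the necessary and sufficient conditions for it to be facet defining.

As a subset of the space of symmetric matrices indexed by $W \times W$, the cone $\mathrm{BQC}(W)$ is full dimensional. Indeed, it suffices to notice that the $1 + |W|(|W| + 1)/2$ matrices $\chi_U \otimes \chi_U^*$ for $U \subseteq W$ with $|U| \le 2$ are affinely independent.

We first show necessity. If $G = G_1 + G_2$, where $G_1$, $G_2$ have disjoint vertex sets and $G_1$ is a connected component of $G$, then $Q_G = Q_{G_1'} + P$, where $G_1' = (V, E(G_1))$ and $P\colon W \times W \to \mathbb{R}$ is such that $P(\emptyset, \emptyset) = \alpha(G_2)$ and $P(x, y) = 1/2$ if $(x, y) \in E(G_2)$. Now $\langle Q_{G_1'}, A \rangle \ge 0$ is valid for $\mathrm{BQC}(W)$ and, since $P \ge 0$, so is $\langle P, A \rangle \ge 0$. Since $\alpha(G) = \alpha(G_1) + \alpha(G_2)$ and since $\mathrm{BQC}(W)$ is full dimensional, we see that $\langle Q_G, A \rangle \ge 0$ does not induce a facet.

Similarly, if $\alpha(G - e) = \alpha(G)$ for some $e = (x, y) \in E$, then $Q_G = Q_{G - e} + P$, where $P(x, y) = P(y, x) = 1/2$, and we see that $\langle Q_G, A \rangle \ge 0$ does not induce a facet.

To see sufficiency, assume $G$ is connected and $\alpha$-critical. Now suppose $Z\colon W \times W \to \mathbb{R}$ is such that $\langle Z, A \rangle \ge 0$ induces a facet of $\mathrm{BQC}(W)$ and

\[ \{\, A \in \mathrm{BQC}(W) : \langle Q_G, A \rangle = 0 \,\} \subseteq \{\, A \in \mathrm{BQC}(W) : \langle Z, A \rangle = 0 \,\}. \]

To show that $\langle Q_G, A \rangle \ge 0$ induces a facet it suffices to show that $Z$ is a nonnegative multiple of $Q_G$.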

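To this end, notice first that if $x \in V$, then $\langle Q_G, \chi_{\{x\}} \otimes \chi_{\{x\}}^* \rangle = 0$, so

\[ Z(x, x) = \langle Z, \chi_{\{x\}} \otimes \chi_{\{x\}}^* \rangle = 0. \]

Next, let $x$, $y \in V$ and assume $(x, y) \notin E$. Then $\langle Q_G, \chi_{\{x, y\}} \otimes \chi_{\{x, y\}}^* \rangle = 0$, whence

\[ Z(x, y) = Z(y, x) = \langle Z, \chi_{\{x, y\}} \otimes \chi_{\{x, y\}}^* \rangle = 0. \]

Note that, for all $U \subseteq V$, if $S = U \cup \{\emptyset\}$, then

\[ \langle Q_G, \chi_S \otimes \chi_S^* \rangle = \alpha(G) - |U| + |E(G[U])|. \]

Take now $(x, y) \in E$. Let $I \subseteq V$ be a maximum independent set in $G - (x, y)$; then $|I| = \alpha(G) + 1$ and hence we must have $x$, $y \in I$. Write $S = I \cup \{\emptyset\}$, so

\[ \langle Q_G, \chi_S \otimes \chi_S^* \rangle = \alpha(G) - (\alpha(G) + 1) + 1 = 0 \]

and similarly

\[ \langle Q_G, \chi_{S - x} \otimes \chi_{S - x}^* \rangle = 0, \]

whence $\langle Z, \chi_S \otimes \chi_S^* \rangle = \langle Z, \chi_{S - x} \otimes \chi_{S - x}^* \rangle = 0$. Now, since $Z(x, y) = 0$ if $(x, y) \notin E$,

\[ 0 = \langle Z, \chi_S \otimes \chi_S^* \rangle = \langle Z, \chi_{S - x} \otimes \chi_{S - x}^* \rangle + 2 Z(\emptyset, x) + 2 Z(x, y) = 2 Z(\emptyset, x) + 2 Z(x, y). \]

Since $x$ and $y$ are interchangeable in the above argument, we see immediately that $Z(\emptyset, x) = -Z(x, y) = Z(\emptyset, y)$. Now $G$ is connected, and so it follows immediately that there is a number $a$ such that $Z(\emptyset, x) = -a$ for all $x \in V$ and $Z(x, y) = a$ for all $(x, y) \in E$.

We are almost done. If $(x, y) \in E$, then $\langle Z, \chi_{\{x, y\}} \otimes \chi_{\{x, y\}}^* \rangle \ge 0$, so $a \ge 0$. If $I$ is a maximum independent set in $G$ and $S = I \cup \{\emptyset\}$, then $\langle Q_G, \chi_S \otimes \chi_S^* \rangle = 0$ and

\[ 0 = \langle Z, \chi_S \otimes \chi_S^* \rangle = Z(\emptyset, \emptyset) - 2a|I|, \]

whence $Z(\emptyset, \emptyset) = 2a\alpha(G)$ and $Z = 2a Q_G$, as we wanted.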
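The two hypotheses of Theorem 7.4 are easy to test by brute force on small graphs. The sketch below checks connectivity and $\alpha$-criticality for a graph on vertices $0, \ldots, n-1$; the two example graphs (a 5-cycle, which satisfies both conditions, and a path on three vertices, which is connected but not $\alpha$-critical) and the function names are assumptions chosen for illustration.

```python
from itertools import combinations

def alpha(n, edges):
    """Independence number of the graph on vertices 0..n-1, by brute force."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for cand in combinations(range(n), size):
            if all(frozenset(p) not in edge_set for p in combinations(cand, 2)):
                return size
    return 0

def is_connected(n, edges):
    """Depth-first search from vertex 0."""
    adj = {v: set() for v in range(n)}
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n

def is_alpha_critical(n, edges):
    """True if deleting any single edge strictly increases alpha."""
    base = alpha(n, edges)
    return all(alpha(n, edges[:i] + edges[i + 1:]) > base for i in range(len(edges)))

five_cycle = [(i, (i + 1) % 5) for i in range(5)]
path = [(0, 1), (1, 2)]
print(is_connected(5, five_cycle) and is_alpha_critical(5, five_cycle))  # True
print(is_connected(3, path) and is_alpha_critical(3, path))              # False
```

By the theorem, the inequality built from the 5-cycle is therefore facet defining, while the one built from the path is valid but not facet defining.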

An alternative normalization and polytope constraints

The constraint “$\operatorname{tr} A = 1$” in (6) is there to prevent the problem from being unbounded: it is a normalization constraint. There is another kind of normalization constraint that can be used to replace the trace constraint; by doing so we obtain an equivalent problem and also gain the ability to add to our problem constraints from the Boolean-quadratic polytope, which given a nonempty finite set $V$ is defined as

\[ \mathrm{BQP}(V) = \operatorname{conv}\{\, f \otimes f^* : f\colon V \to \{0, 1\} \,\}. \]

Such constraints are also implied by constraints from the Boolean-quadratic cone, but in practice, given our limited computational power, they are useful. For instance, the inclusion–exclusion inequalities used by Keleti et al. [22] to get better upper bounds for $G(\mathbb{R}^2, \{1\})$ come from facets of $\mathrm{BQP}(V)$, as we will soon see.

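Let $G = (V, E)$ be a topological graph where $V$ is a compact Hausdorff space, $\omega$ be a finite Borel measure on $V$, and $\mathcal{K}(V) \subseteq \mathrm{PSD}(V)$ be a convex cone. Since $\mathcal{K}(V)$ is a subset of the cone of positive kernels, Mercer’s theorem implies that any continuous kernel in $\mathcal{K}(V)$ is trace class and that the trace is the integral over the diagonal. The alternative version of (6) is:

\[
\begin{array}{rl}
\text{maximize} & \operatorname{tr} A \\
& A(x, y) = 0 \quad \text{if } (x, y) \in E, \\
& \begin{pmatrix} 1 & \operatorname{tr} A \\ \operatorname{tr} A & \langle J, A \rangle \end{pmatrix} \text{ is positive semidefinite}, \\
& A \text{ is continuous and } A \in \mathcal{K}(V).
\end{array} \tag{27}
\]

If $A$ is a feasible solution of the above problem, then $A' = (\operatorname{tr} A)^{-1} A$ is feasible for $\vartheta(G, \mathcal{K}(V))$. Moreover, the positive-semidefiniteness of the $2 \times 2$ matrix in (27) implies that $(\operatorname{tr} A)^2 \le \langle J, A \rangle$, whence

\[ \langle J, A' \rangle = (\operatorname{tr} A)^{-1} \langle J, A \rangle \ge \operatorname{tr} A, \]

so $\vartheta(G, \mathcal{K}(V))$ is at least the optimal value of (27). The reverse inequality is also true: if $A$ is a feasible solution of (6), then one easily checks that $A' = \langle J, A \rangle A$ is a feasible solution of (27) and that $\operatorname{tr} A' = \langle J, A \rangle$. So problems (6) and (27) are actually equivalent.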

Fix a finite set $U \subseteq V$ and let $Z\colon U \times U \to \mathbb{R}$ be a symmetric matrix and $\beta$ be a real number such that $\langle Z, A \rangle \ge \beta$ is a valid inequality for $\mathrm{BQP}(U)$, that is, $\langle Z, A \rangle \ge \beta$ for all $A \in \mathrm{BQP}(U)$.

If $G$ and $\omega$ satisfy the hypotheses of Theorem 5.1, then any number of constraints

\[ \sum_{x, y \in U} Z(x, y) A(x, y) \ge \beta \tag{28} \]

can be added to (27) with $\mathcal{K}(V) = \mathrm{PSD}(V)$ and we still get an upper bound for $\alpha_\omega(G)$. Indeed, if $I$ is a measurable independent set of positive measure, then $A = R(\chi_I \otimes \chi_I^*)$ is easily checked to be a feasible solution of (27) with $\mathcal{K}(V) = \mathrm{PSD}(V)$ that moreover satisfies (28), and $\operatorname{tr} A = \omega(I)$. The alternative normalization is essential for this approach to work: if we try to add constraint (28) to (6), then if $\beta \ne 0$ we get a nonlinear constraint because of the different normalization, making it more difficult to deal with the resulting problem in practice.

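The same ideas can be applied to problem (21). First, given a closed set $D \subseteq (0, \infty)$ of forbidden distances, we consider an alternative normalization that gives rise to an equivalent problem:

\[
\begin{array}{rl}
\text{maximize} & f(0) \\
& f(x) = 0 \quad \text{if } \|x\| \in D, \\
& \begin{pmatrix} 1 & f(0) \\ f(0) & M(f) \end{pmatrix} \text{ is positive semidefinite}, \\
& f\colon \mathbb{R}^n \to \mathbb{R} \text{ is continuous and } f \in \mathcal{K}(\mathbb{R}^n).
\end{array} \tag{29}
\]

Then, we observe that we can add to this problem, with $\mathcal{K}(\mathbb{R}^n) = \mathrm{PSD}(\mathbb{R}^n)$, any number of constraints of the form

\[ \sum_{x, y \in U} Z(x, y) f(x - y) \ge \beta \tag{30} \]

for finite $U \subseteq \mathbb{R}^n$ and $Z$, $\beta$ such that $\langle Z, A \rangle \ge \beta$ is valid for $\mathrm{BQP}(U)$ and still prove that the optimal value provides an upper bound for the independence density of $G(\mathbb{R}^n, D)$.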

Given points $x_1$, ..., $x_N \in \mathbb{R}^n$, the inclusion-exclusion inequality used by Keleti, Matolcsi, Oliveira, and Ruzsa is

\[ \sum_{1 \le i < j \le N} f(x_i - x_j) - N f(0) \ge -1. \]

This constraint is just (30) with $Z$ such that $Z(x_i, x_i) = -1$ for all $i$ and $Z(x_i, x_j) = 1/2$ for all $i \ne j$. It can be easily checked that $\langle Z, A \rangle \ge -1$ is a valid inequality for $\mathrm{BQP}(\{x_1, \ldots, x_N\})$; one can even verify that it gives a facet of the polytope, simply by finding enough affinely independent points in the polytope for which the inequality is tight.
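The validity claim can be checked by enumeration for small $N$: on a vertex $f \otimes f^*$ of the polytope with $|\{i : f(i) = 1\}| = k$, the left-hand side equals $-k + k(k-1)/2$, which is at least $-1$, with equality exactly when $k \in \{1, 2\}$. The sketch below does this enumeration and also counts the tight vertices; the function names are hypothetical, and the check covers only the values of $N$ that are actually enumerated.

```python
from itertools import product

def inclusion_exclusion_value(bits):
    """<Z, f (x) f*> for Z with -1 on the diagonal and 1/2 off the diagonal."""
    n = len(bits)
    total = 0.0
    for i in range(n):
        for j in range(n):
            coeff = -1.0 if i == j else 0.5
            total += coeff * bits[i] * bits[j]
    return total

def check(n):
    """Return (minimum value, number of 0/1 vectors attaining it) over {0,1}^n."""
    values = [inclusion_exclusion_value(bits) for bits in product((0, 1), repeat=n)]
    low = min(values)
    return low, sum(1 for v in values if v == low)

for n in range(2, 7):
    print(n, check(n))
```

The number of tight vertices is $N + \binom{N}{2} = N(N+1)/2$, the dimension of the space of symmetric $N \times N$ matrices, which is consistent with the facet claim; affine independence of these vertices would still need to be verified separately.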

Constraints from $\mathrm{BQP}(U)$ for a finite $U \subseteq \mathbb{R}^n$ are implied by constraints from $\mathrm{BQC}(U \cup \{\emptyset\})$ together with the other constraints from (6) or (21). It is still useful to consider constraints from $\mathrm{BQP}(U)$ mainly since $U \cup \{\emptyset\}$ is a larger set than $U$, and therefore computing the facets of $\mathrm{BQC}(U \cup \{\emptyset\})$ can be much harder than computing the facets of $\mathrm{BQC}(U)$, as is the case already when $|U| = 6$. For instance, Deza and Laurent [11, §30.6] survey some numbers for the cut polytope, which is equivalent to the Boolean-quadratic polytope under a linear transformation. For 6 points, the total number of facets is 116,764, distributed among 11 equivalence classes. The approach we use to find violated constraints cannot, however, exploit the full symmetry of the polytope, so we end up using a list of 428 facets. For 7 points, the total number of facets is 217,093,472, distributed among 147 classes. Taking into account the smaller symmetry group we use, the total list of facets needed for our procedure would have more than ten thousand entries.

Better upper bounds for the independence number of graphs on the sphere

By adding $\mathrm{BQP}(U)$-constraints to $\vartheta(G(S^{n-1}, \{\pi/2\}), \mathrm{PSD}(S^{n-1}))$ using the approach described in Sect. 7.2, one is able to improve on the best upper bounds for $\alpha_\omega(G(S^{n-1}, \{\pi/2\})) = m_0(S^{n-1})$. Table 1 shows bounds thus obtained for the independence ratio, namely

\[ \alpha_\omega(G(S^{n-1}, \{\pi/2\})) / \omega_n, \]

for $n = 3$, ..., 8. The rest of this section is devoted to an explanation of how these bounds were computed. The bounds have also been checked to be correct; the verification procedure is explained in detail in a document available with the arXiv version of this paper. The programs used for verification can also be found with the arXiv version.

Invariant kernels on the sphere

Let $\mathrm{O}(n)$ be the orthogonal group on $\mathbb{R}^n$, that is, the group of $n \times n$ orthogonal matrices. The orthogonal group acts on a kernel $A\colon S^{n-1} \times S^{n-1} \to \mathbb{R}$ by

\[ (T \cdot A)(x, y) = A(T^{-1} x, T^{-1} y), \]

where $T \in \mathrm{O}(n)$; we say that $A$ is invariant if $T \cdot A = A$ for all $T \in \mathrm{O}(n)$. An invariant kernel is thus a real-valued function with domain $[-1, 1]$, since if $x \cdot y = x' \cdot y'$, then $A(x, y) = A(x', y')$.

Let $D \subseteq (0, \pi]$ be a set of forbidden distances. If the cone $\mathcal{K}(S^{n-1})$ is invariant under the action of the orthogonal group, then one can add to the problem $\vartheta(G(S^{n-1}, D), \mathcal{K}(S^{n-1}))$ the restriction that $A$ has to be invariant without changing the optimal value of the resulting problem. Indeed, if $A$ is a feasible solution, then so is $T \cdot A$ for all $T \in \mathrm{O}(n)$, and hence its symmetrization

\[ \overline{A}(x, y) = \int_{\mathrm{O}(n)} A(T^{-1} x, T^{-1} y) \, d\mu(T), \]

where $\mu$ is the Haar measure on $\mathrm{O}(n)$, is also feasible and has the same objective value as $A$.

The advantage of requiring $A$ to be invariant is that invariant and positive kernels can be easily parameterized. Indeed, let $P_k^n$ denote the Jacobi polynomial of degree $k$ and parameters $(\alpha, \alpha)$, where $\alpha = (n - 3)/2$, normalized so $P_k^n(1) = 1$ (for background on Jacobi polynomials, see the book by Szegö [44]). A theorem of Schoenberg [40] says that $A\colon S^{n-1} \times S^{n-1} \to \mathbb{R}$ is continuous, invariant, and positive if and only if there are nonnegative numbers $a(0)$, $a(1)$, ... such that $\sum_{k=0}^\infty a(k) < \infty$ and

\[ A(x, y) = \sum_{k=0}^\infty a(k) P_k^n(x \cdot y) \tag{31} \]

for all $x$, $y \in S^{n-1}$; in particular, the sum above converges absolutely and uniformly on $S^{n-1} \times S^{n-1}$.
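To make the parameterization (31) concrete, here is a minimal sketch for evaluating $P_k^n$ and a truncated kernel. It relies on the standard fact that Jacobi polynomials with parameters $((n-3)/2, (n-3)/2)$ are, up to scaling, Gegenbauer polynomials with parameter $(n-2)/2$, whose normalized version satisfies the three-term recurrence $(k + n - 3) P_k^n(t) = (2k + n - 4)\, t\, P_{k-1}^n(t) - (k - 1) P_{k-2}^n(t)$ with $P_0^n = 1$ and $P_1^n(t) = t$. The function names are hypothetical, and this is not claimed to be how the computations behind Table 1 were carried out.

```python
def jacobi_normalized(n, k, t):
    """P_k^n(t), scaled so that P_k^n(1) = 1, for integers n >= 2 and k >= 0.

    Uses the recurrence
        (k + n - 3) P_k = (2k + n - 4) t P_{k-1} - (k - 1) P_{k-2},
    which is the Legendre recurrence when n = 3.
    """
    if k == 0:
        return 1.0
    prev, cur = 1.0, t
    for j in range(2, k + 1):
        prev, cur = cur, ((2 * j + n - 4) * t * cur - (j - 1) * prev) / (j + n - 3)
    return cur

def kernel(n, coeffs, x, y):
    """Truncated expansion (31): sum over k of coeffs[k] * P_k^n(x . y)."""
    t = sum(xi * yi for xi, yi in zip(x, y))
    return sum(a * jacobi_normalized(n, k, t) for k, a in enumerate(coeffs))

# Example on S^2 with hypothetical coefficients a(0) = a(2) = 1/2.
north = (0.0, 0.0, 1.0)
east = (1.0, 0.0, 0.0)
print(kernel(3, [0.5, 0.0, 0.5], north, north))
print(kernel(3, [0.5, 0.0, 0.5], north, east))
```

With nonnegative coefficients, Schoenberg's theorem guarantees that any kernel built this way is positive, so feasibility in the cone reduces to sign constraints on the numbers $a(k)$.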

Primal and dual formulations

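When a continuous, invariant, and positive kernel $A$ is represented as in (31), constraint (28) becomes

\[ \beta \le \sum_{x, y \in U} Z(x, y) A(x, y) = \sum_{k=0}^\infty a(k) \sum_{x, y \in U} Z(x, y) P_k^n(x \cdot y) = \sum_{k=0}^\infty a(k) r(k), \]

where $r\colon \mathbb{N} \to \mathbb{R}$ is the function such that

\[ r(k) = \sum_{x, y \in U} Z(x, y) P_k^n(x \cdot y). \]

Let $\mathcal{R}$ be a finite collection of $\mathrm{BQP}(U)$-constraints represented as pairs $(r, \beta)$, where $r$ is given by the above expression for a valid inequality $\langle Z, A \rangle \ge \beta$ for $\mathrm{BQP}(U)$ for some finite $U \subseteq S^{n-1}$.

If a continuous, invariant, and positive kernel $A$ is given by expression (31), then $\langle J, A \rangle = \omega_n^2 a(0)$. Moreover, all diagonal entries of $A$ are the same, and hence

\[ \operatorname{tr} A = \omega_n \sum_{k=0}^\infty a(k). \]

Using the alternative normalization of Sect. 7.2, problem $\vartheta(G(S^{n-1}, \{\theta\}), \mathrm{PSD}(S^{n-1}))$, strengthened with the $\mathrm{BQP}(U)$-constraints in $\mathcal{R}$, can be equivalently written as

\[
\begin{array}{rl}
\text{maximize} & \sum_{k=0}^\infty a(k) \\
& \sum_{k=0}^\infty a(k) P_k^n(\cos\theta) = 0, \\
& \sum_{k=0}^\infty a(k) r(k) \ge \beta \quad \text{for } (r, \beta) \in \mathcal{R}, \\
& \begin{pmatrix} 1 & \omega_n \sum_{k=0}^\infty a(k) \\ \omega_n \sum_{k=0}^\infty a(k) & \omega_n^2 a(0) \end{pmatrix} \text{ is positive semidefinite}, \\
& a(k) \ge 0 \quad \text{for all } k \ge 0.
\end{array} \tag{32}
\]

Notice that the objective function was scaled so the optimal value is a bound for the independence ratio $\alpha_\omega(G(S^{n-1}, \{\theta\})) / \omega_n$.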

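A dual for this problem is the following optimization problem on variables $\lambda$, $y(r, \beta)$ for $(r, \beta) \in \mathcal{R}$, and $z_1$, $z_2$, $z_3$:

\[
\begin{array}{rl}
\text{minimize} & z_1 + \sum_{(r, \beta) \in \mathcal{R}} y(r, \beta) \beta \\
& \lambda + \sum_{(r, \beta) \in \mathcal{R}} y(r, \beta) r(0) + z_2 \omega_n + z_3 \omega_n^2 \ge 1, \\
& \lambda P_k^n(\cos\theta) + \sum_{(r, \beta) \in \mathcal{R}} y(r, \beta) r(k) + z_2 \omega_n \ge 1 \quad \text{for } k \ge 1, \\
& \begin{pmatrix} z_1 & -\tfrac{1}{2} z_2 \\ -\tfrac{1}{2} z_2 & -z_3 \end{pmatrix} \text{ is positive semidefinite}, \\
& y \le 0.
\end{array} \tag{33}
\]

In practice, this is the problem that we solve to obtain an upper bound; there are two main reasons for this. The first one comes from weak duality: the objective value of any feasible solution of this problem is an upper bound for the independence ratio. Indeed, let $\lambda$, $y$, $z_1$, $z_2$, $z_3$ be a feasible solution of (33) and $a$ be a feasible solution of (32). Then

\[
\begin{aligned}
z_1 + \sum_{(r, \beta) \in \mathcal{R}} y(r, \beta) \beta
&\ge z_1 + \sum_{(r, \beta) \in \mathcal{R}} y(r, \beta) \sum_{k=0}^\infty a(k) r(k) \\
&= z_1 + \sum_{k=0}^\infty a(k) \sum_{(r, \beta) \in \mathcal{R}} y(r, \beta) r(k) \\
&\ge z_1 + a(0)(-z_3 \omega_n^2) + \sum_{k=0}^\infty a(k) \bigl(1 - \lambda P_k^n(\cos\theta) - z_2 \omega_n\bigr) \\
&= z_1 - z_3 \omega_n^2 a(0) + (1 - z_2 \omega_n) \sum_{k=0}^\infty a(k) - \lambda \sum_{k=0}^\infty a(k) P_k^n(\cos\theta) \\
&= z_1 - z_3 \omega_n^2 a(0) - z_2 \omega_n \sum_{k=0}^\infty a(k) + \sum_{k=0}^\infty a(k) \\
&\ge \sum_{k=0}^\infty a(k),
\end{aligned}
\]

as we wanted, where for the last inequality we use the positive-semidefiniteness of the $2 \times 2$ matrices in (32) and (33).

The second reason is that the dual is a semidefinite program with finitely many variables, though infinitely many constraints, including one constraint for each $k \ge 0$. In practice, we choose $d > 0$ and disregard all constraints for $k > d$. Then we solve a finite semidefinite program, and later on we prove that a suitable modification of the solution found is indeed feasible for the infinite problem, as we will see now.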

Finding feasible dual solutions and checking them

To find good feasible solutions of (33), we start by taking R=. Then we turn our problem into a finite one: we choose d>0 and disregard all constraints for k>d. We have then a finite semidefinite program, which we solve using standard semidefinite programming solvers. The idea is that, if d is large enough, then the solution found will be close enough to being feasible, and so by slightly changing z1, z2, and z3 we will be able to find a feasible solution.

By solving the finite problem we obtain at the same time an optimal solution of the corresponding finite primal problem, in which a(k)=0 if k>d (notice this is likely not an optimal solution of the original primal problem). We use this primal solution to perform a separation round, that is, to look for violated polytope constraints that we can add to the problem. One way to do this is as follows.

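Say $a$ is the primal solution and let

$$A(x,y)=\sum_{k=0}^\infty a(k)P_k^n(x\cdot y).$$

Fix an integer $N\geq2$, write $[N]=\{1,\ldots,N\}$, and let $Z\in\mathbb{R}^{N\times N}$, $\beta\in\mathbb{R}$ be such that $\langle Z,X\rangle\geq\beta$ is valid for $\mathrm{BQP}([N])$. Then we try to find points $x_1$, ..., $x_N\in S^{n-1}$ that maximize the violation

$$\beta-\sum_{i,j=1}^NZ(i,j)A(x_i,x_j)\tag{34}$$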

of the polytope inequality. If we find points such that the violation is positive, then we have a violated constraint which can be added to R; the whole procedure can then be repeated: the dual problem is solved again and a new separation round is performed.

To find violated constraints we need to know valid inequalities, or better yet facets, of $\mathrm{BQP}([N])$. Up to $N=6$ it is possible to work with a full list of facets; for $N=7$ only with a partial list. To find points $x_1$, ..., $x_N\in S^{n-1}$ maximizing (34), we represent the points on the sphere by stereographic projection on the plane $x_n=-1$ and use some method for unconstrained optimization that converges to a local optimum.

After a few optimization/separation rounds, one starts to notice only minor improvements to the bound. Then it is time to check how far from feasible the dual solution is and to fix it in order to get a truly feasible solution and therefore an upper bound. A detailed description of the verification procedure, together with a program to check the dual solutions used for the results in this section, can be found together with the arXiv version of this paper.

Better upper bounds for the independence density of unit-distance graphs

Just like in the case of graphs on the sphere, we can add $\mathrm{BQP}(U)$-constraints to $\vartheta(G(\mathbb{R}^n,\{1\}),\mathrm{PSD}(\mathbb{R}^n))$ and so obtain improved upper bounds for $\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))$ for $n=3$, ..., 8. These improved upper bounds then provide new lower bounds for the measurable chromatic number $\chi_m(G(\mathbb{R}^n,\{1\}))$ of the unit-distance graph, which is the minimum number of measurable independent sets needed to partition $\mathbb{R}^n$, for $n=4$, ..., 8. Indeed, since

$$\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))\,\chi_m(G(\mathbb{R}^n,\{1\}))\geq1,$$

if $\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))\leq u$, then $\chi_m(G(\mathbb{R}^n,\{1\}))\geq1/u$.

Table 2 shows these new bounds compared to the previously best ones. To obtain the bounds for n=4, ..., 8, subgraph constraints (see Sect. 7.1) have also been used. In the remainder of this section we will see how these bounds have been computed; they have also been checked to be correct, and the verification procedure is explained in detail in a document available with the arXiv version of this paper. The programs used for the verification can also be found with the arXiv version.

Radial functions

The orthogonal group $O(n)$ acts on a function $f\colon\mathbb{R}^n\to\mathbb{C}$ by

$$(T\cdot f)(x)=f(T^{-1}x),$$

where $T\in O(n)$; we say that $f$ is radial if it is invariant under this action, that is, if $T\cdot f=f$ for all $T\in O(n)$. A radial function $f$ is thus a function of one real variable, since if $\|x\|=\|y\|$, then $f(x)=f(y)$.

Let $D\subseteq(0,\infty)$ be a set of forbidden distances. If the cone $\mathcal{K}(\mathbb{R}^n)\subseteq L^\infty(\mathbb{R}^n)$ is invariant under the action of the orthogonal group, then one can add to the problem $\vartheta(G(\mathbb{R}^n,D),\mathcal{K}(\mathbb{R}^n))$ the restriction that $f$ has to be radial without changing the optimal value of the resulting problem. Indeed, if $f$ is a feasible solution, then so is $T\cdot f$ for all $T\in O(n)$, and hence its radialization

$$\bar f(x)=\int_{O(n)}f(T^{-1}x)\,d\mu(T)=\frac{1}{\omega(S^{n-1})}\int_{S^{n-1}}f(\|x\|\xi)\,d\omega(\xi),$$

where $\mu$ is the Haar measure on $O(n)$, is also feasible and has the same objective value as $f$.

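The advantage of requiring $f$ to be radial is that radial functions of positive type can be easily parameterized. Indeed, if $f\in\mathrm{PSD}(\mathbb{R}^n)$ is continuous, then Bochner’s theorem says that there is a finite Borel measure $\nu$ on $\mathbb{R}^n$ such that

$$f(x)=\int_{\mathbb{R}^n}e^{iu\cdot x}\,d\nu(u).$$

But then we obtain the following expression, due to Schoenberg [39], for the radialization of $f$:

$$
\begin{aligned}
\bar f(x)&=\frac{1}{\omega(S^{n-1})}\int_{S^{n-1}}\int_{\mathbb{R}^n}e^{iu\cdot\|x\|\xi}\,d\nu(u)\,d\omega(\xi)\\
&=\int_{\mathbb{R}^n}\frac{1}{\omega(S^{n-1})}\int_{S^{n-1}}e^{iu\cdot\|x\|\xi}\,d\omega(\xi)\,d\nu(u)\\
&=\int_0^\infty\Omega_n(t\|x\|)\,d\alpha(t),
\end{aligned}
\tag{35}
$$

where

$$\Omega_n(\|u\|)=\frac{1}{\omega(S^{n-1})}\int_{S^{n-1}}e^{iu\cdot\xi}\,d\omega(\xi)\tag{36}$$

for $u\in\mathbb{R}^n$ and $\alpha$ is the Borel measure on $[0,\infty)$ such that

$$\alpha(X)=\nu(\{\lambda\xi:\lambda\in X\text{ and }\xi\in S^{n-1}\})$$

for every measurable set $X$. The function $\Omega_n$ has a simple expression in terms of Bessel functions, namely

$$\Omega_n(t)=\Gamma\Bigl(\frac n2\Bigr)\Bigl(\frac2t\Bigr)^{(n-2)/2}J_{(n-2)/2}(t)\tag{37}$$

for $t>0$ and $\Omega_n(0)=1$, where $J_\alpha$ denotes the Bessel function of the first kind of order $\alpha$ (for background, see the book by Watson [47]).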
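Formula (37) is easy to evaluate numerically. The following is a minimal sketch, not the code used for the results of this paper: it assumes that plain double precision and the standard power series of the Bessel function are adequate, which is only reasonable for moderate arguments (roughly $t\leq20$). The function name `omega` is an illustrative choice. Substituting the series of $J_{(n-2)/2}$ into (37) gives $\Omega_n(t)=\sum_{k\geq0}(-1)^k(t/2)^{2k}\,\Gamma(n/2)/(k!\,\Gamma(k+n/2))$, which is what the loop accumulates.

```python
import math

def omega(n, t, terms=150):
    """Omega_n(t) from (37), summed via the power series of the Bessel function.

    Consecutive terms of the series differ by the factor
    -(t/2)^2 / ((k+1) * (k + n/2)); the zeroth term equals 1, so
    omega(n, 0) == 1 as required.  Only suitable for moderate t.
    """
    q = -(t / 2.0) ** 2
    term = 1.0
    total = 1.0
    for k in range(terms):
        term *= q / ((k + 1) * (k + n / 2.0))
        total += term
    return total

# For n = 3 the function reduces to sin(t)/t, and for n = 2 to J_0(t).
print(omega(3, 1.0), math.sin(1.0))
```

For large arguments the alternating series suffers from cancellation, so a library implementation of the Bessel function would be the better choice there.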

Primal and dual formulations

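When a continuous radial function $f$ of positive type is represented as in (35), constraint (30) becomes

$$\beta\leq\sum_{x,y\in U}Z(x,y)f(x-y)=\int_0^\infty\sum_{x,y\in U}Z(x,y)\Omega_n(t\|x-y\|)\,d\alpha(t)=\int_0^\infty r(t)\,d\alpha(t),$$

where $r\colon[0,\infty)\to\mathbb{R}$ is the continuous function such that

$$r(t)=\sum_{x,y\in U}Z(x,y)\Omega_n(t\|x-y\|).$$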

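As shown in Sect. 7.1, a subgraph constraint is implied by one $\mathrm{BQP}(U)$-constraint together with the other constraints of $\vartheta(G(\mathbb{R}^n,\{1\}),\mathrm{PSD}(\mathbb{R}^n))$, so in the discussion below we treat them as $\mathrm{BQP}(U)$-constraints.

Let $\mathcal{R}$ be a finite collection of $\mathrm{BQP}(U)$-constraints represented as pairs $(r,\beta)$, where $r$ is given by the above expression for a valid inequality $\langle Z,A\rangle\geq\beta$ for $\mathrm{BQP}(U)$ for some finite $U\subseteq\mathbb{R}^n$. Using the alternative normalization of Sect. 7.2, problem $\vartheta(G(\mathbb{R}^n,\{1\}),\mathrm{PSD}(\mathbb{R}^n))$, strengthened with the $\mathrm{BQP}(U)$-constraints in $\mathcal{R}$, can be equivalently written as

$$
\begin{array}{rl}
\text{maximize} & \alpha([0,\infty))\\[2pt]
& \int_0^\infty\Omega_n(t)\,d\alpha(t)=0,\\[2pt]
& \int_0^\infty r(t)\,d\alpha(t)\geq\beta\quad\text{for }(r,\beta)\in\mathcal{R},\\[2pt]
& \begin{pmatrix}1 & \alpha([0,\infty))\\ \alpha([0,\infty)) & \alpha(\{0\})\end{pmatrix}\text{ is positive semidefinite},\\[2pt]
& \alpha\text{ is a finite Borel measure on }[0,\infty).
\end{array}
\tag{38}
$$

A dual for this problem is the following optimization problem on variables $\lambda$, $y(r,\beta)$ for $(r,\beta)\in\mathcal{R}$, and $z_1$, $z_2$, $z_3$:

$$
\begin{array}{rl}
\text{minimize} & z_1+\sum_{(r,\beta)\in\mathcal{R}}y(r,\beta)\beta\\[2pt]
& \lambda+\sum_{(r,\beta)\in\mathcal{R}}y(r,\beta)r(0)+z_2+z_3\geq1,\\[2pt]
& \lambda\Omega_n(t)+\sum_{(r,\beta)\in\mathcal{R}}y(r,\beta)r(t)+z_2\geq1\quad\text{for }t>0,\\[2pt]
& \begin{pmatrix}z_1 & -\frac12z_2\\ -\frac12z_2 & -z_3\end{pmatrix}\text{ is positive semidefinite},\\[2pt]
& y\leq0.
\end{array}
\tag{39}
$$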

Again, this is the problem that we solve to obtain an upper bound, and the two reasons for this are the same as before. The first one comes from weak duality: the objective value of any feasible solution of this problem is an upper bound for the independence density. Indeed, let λ, y, z1, z2, z3 be a feasible solution of (39) and α be a feasible solution of (38). Then

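$$
\begin{aligned}
z_1+\sum_{(r,\beta)\in\mathcal{R}}y(r,\beta)\beta
&\geq z_1+\sum_{(r,\beta)\in\mathcal{R}}y(r,\beta)\int_0^\infty r(t)\,d\alpha(t)\\
&=z_1+\int_0^\infty\sum_{(r,\beta)\in\mathcal{R}}y(r,\beta)r(t)\,d\alpha(t)\\
&\geq z_1+\alpha(\{0\})(-z_3)+\int_0^\infty\bigl(1-\lambda\Omega_n(t)-z_2\bigr)\,d\alpha(t)\\
&=z_1-z_3\alpha(\{0\})+(1-z_2)\alpha([0,\infty))-\lambda\int_0^\infty\Omega_n(t)\,d\alpha(t)\\
&=z_1-z_3\alpha(\{0\})-z_2\alpha([0,\infty))+\alpha([0,\infty))\\
&\geq\alpha([0,\infty)),
\end{aligned}
$$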

as we wanted.

The second reason is that the dual is a semidefinite program with finitely many variables, though infinitely many constraints, including one constraint for each t>0. In practice, we discretize the set of constraints and solve a finite semidefinite program, later on proving that a suitable modification of the solution found is indeed feasible for the infinite problem, as we discuss now.

Finding feasible dual solutions and checking them

To find good feasible solutions of (39), we start by taking $\mathcal{R}=\emptyset$. Then we discretize the constraint set: we choose a finite sample $S\subset(0,\infty)$ and instead of all constraints for $t>0$ we only consider constraints for $t\in S$. Then we have a semidefinite program, which we solve using standard semidefinite programming solvers. The idea is that, if the sample $S$ is fine enough, then the solution found will be close enough to being feasible, and so by slightly increasing $z_1$ and $z_2$ we will be able to find a feasible solution.

By solving the discretized dual problem we obtain at the same time an optimal solution of the discretized primal problem, in which $\alpha$ is a sum of Dirac $\delta$ measures supported on $S\cup\{0\}$ (notice this is likely not an optimal solution of the original primal problem, but of the discretized one). We use this primal solution to perform a separation round, that is, to look for violated $\mathrm{BQP}(U)$-constraints that we can add to the problem. One way to do this is as follows.

Say that $\alpha$ is the primal solution and let

$$f(x)=\int_0^\infty\Omega_n(t\|x\|)\,d\alpha(t).$$

Fix an integer $N\geq2$, write $[N]=\{1,\ldots,N\}$, and let $Z\in\mathbb{R}^{N\times N}$, $\beta\in\mathbb{R}$ be such that $\langle Z,A\rangle\geq\beta$ is valid for $\mathrm{BQP}([N])$. Then we try to find points $x_1$, ..., $x_N\in\mathbb{R}^n$ that maximize the violation

$$\beta-\sum_{i,j=1}^NZ(i,j)f(x_i-x_j)\tag{40}$$

of the $\mathrm{BQP}(U)$-constraint. If we find points such that the violation is positive, then we have a violated constraint which can be added to $\mathcal{R}$; the whole procedure can then be repeated: the dual problem is solved again and a new separation round is performed. To find violated constraints we work with a list of facets of $\mathrm{BQP}([N])$, as in Sect. 8.3. To find points $x_1$, ..., $x_N\in\mathbb{R}^n$ maximizing (40) we simply use some method for unconstrained optimization.
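A separation round of this kind can be sketched as follows. This is an illustration under several assumptions and not the procedure actually used for Table 2: the dimension is fixed to $n=3$, where (37) reduces to $\Omega_3(t)=\sin t/t$; the primal solution is passed as a list `atoms` of hypothetical pairs (support point, weight); the local optimization is a naive random search; and the inequality used in the usage lines, $X_{12}+X_{13}+X_{23}-X_{11}-X_{22}-X_{33}\geq-1$, is one standard valid inequality for $\mathrm{BQP}([3])$, written with a symmetric $Z$ whose off-diagonal entries are $1/2$. All function names are hypothetical.

```python
import math
import random

def omega3(t):
    """Omega_3(t) = sin(t)/t, the n = 3 case of (37)."""
    return 1.0 if t == 0 else math.sin(t) / t

def f(x, atoms):
    """f(x) = sum_j w_j * Omega_3(t_j * |x|) for alpha = sum_j w_j * delta_{t_j}."""
    norm = math.sqrt(sum(c * c for c in x))
    return sum(w * omega3(t * norm) for t, w in atoms)

def violation(points, Z, beta, atoms):
    """The quantity (40): beta - sum_{i,j} Z(i,j) f(x_i - x_j)."""
    total = 0.0
    for i, xi in enumerate(points):
        for j, xj in enumerate(points):
            diff = [a - b for a, b in zip(xi, xj)]
            total += Z[i][j] * f(diff, atoms)
    return beta - total

def local_search(Z, beta, atoms, N, rounds=200, step=0.3, seed=0):
    """Naive random search: perturb all points, keep the move if (40) increases."""
    rng = random.Random(seed)
    pts = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(N)]
    best = violation(pts, Z, beta, atoms)
    for _ in range(rounds):
        cand = [[c + rng.gauss(0, step) for c in p] for p in pts]
        val = violation(cand, Z, beta, atoms)
        if val > best:
            pts, best = cand, val
    return pts, best

# Usage with hypothetical data: the weights below are made up for illustration.
Z = [[-1.0, 0.5, 0.5], [0.5, -1.0, 0.5], [0.5, 0.5, -1.0]]
atoms = [(0.0, 0.3), (2.0, 0.7)]
points, value = local_search(Z, -1.0, atoms, N=3)
print(value > 0)  # a positive value would indicate a violated constraint
```

In practice one would replace the random search by a gradient-based method and run it from many starting configurations, since (40) has many local maxima.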

After a few optimization/separation rounds, one starts to notice only minor improvements to the bound. Then it is time to check how far from feasible the dual solution is and to fix it in order to get a truly feasible solution and therefore an upper bound. The verification procedure for the dual solution has already been outlined by Keleti et al. [22] and will be omitted here; the dual solutions that give the bounds in Table 2 and a program to verify them can be found together with the arXiv version of this paper.

Sets avoiding many distances in Rn and the computability of the independence density

Reassuring though Theorem 5.1 may be, the computational results of Sects. 8 and 9 do not use it, or rather use only the easy direction of the statement. In this section we will see how the full power of Theorem 5.1 can be used to recover results about densities of sets avoiding several distances in Euclidean space.

Furstenberg et al. [17] showed that, if $n\geq2$, then any subset of $\mathbb{R}^n$ with positive upper density realizes all sufficiently large distances. More precisely, if $I\subseteq\mathbb{R}^n$ has positive upper density, then there is $d_0>0$ such that for all $d>d_0$ there are $x$, $y\in I$ with $\|x-y\|=d$. This fails for $n=1$: the set $\bigcup_{k\in\mathbb{Z}}(2k,2k+1)$ has density $1/2$ but does not realize any odd distance.

Falconer [14] proved the following related theorem: if $(d_m)$ is a sequence of positive numbers that converges to 0, then for all $n\geq2$

$$\lim_{m\to\infty}\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,\ldots,d_m\}))=0.$$

This theorem also fails when $n=1$, as can be seen from an adaptation of the previous example.

Bukh [6] proved a theorem that implies both theorems above; namely, he showed that, as the ratios $d_2/d_1$, ..., $d_m/d_{m-1}$ between the distances $d_1$, ..., $d_m$ go to infinity, $\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,\ldots,d_m\}))$ goes to $\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))^m$, provided $n\geq2$. More precisely, for every $n\geq2$ and every $m\geq2$,

$$\lim_{q\to\infty}\sup\{\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,\ldots,d_m\})):d_k/d_{k-1}>q\}=\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))^m.\tag{41}$$

Oliveira and Vallentin [36] showed that the limit above decreases exponentially fast as $m$ increases. They showed that

$$\lim_{q\to\infty}\sup\{\vartheta(G(\mathbb{R}^n,\{d_1,\ldots,d_m\}),\mathrm{PSD}(\mathbb{R}^n)):d_k/d_{k-1}>q\}\leq2^{-m},$$

using in the proof only a few properties of the Bessel function. In this section, we will see how Bukh’s result (41) can be obtained in a similar fashion using Theorem 5.1. This illustrates how the completely positive formulation provides a good enough characterization of the independence density to allow us to prove such precise asymptotic results.

Bukh derives his asymptotic result from an algorithm to compute the independence density to any desired precision. As a by-product of the approach of this section we also obtain such an algorithm based on solving a sequence of stronger and stronger convex optimization problems.

Finally, similar decay results can be proved for distance graphs on other metric spaces, such as the sphere or the real or complex projective space [35]. The methods of this section can in principle be applied to any metric space, as long as the harmonic analysis can be tackled successfully.

Thick constraints

The better bounds for the independence density described in Sect. 9 were obtained by adding to the initial problem ϑ(G(Rn,{1}),PSD(Rn)) a few BQP(U)-constraints for finite sets U. Our approach in this section is similar: we wish to add more and more constraints to the initial problem in a way that is guaranteed to give us closer and closer approximations of the independence density. The constraints used in Sect. 9 are easy to deal with in computations, but it is not clear (and we do not know) whether by adding a finite number of them to the initial problem we can get arbitrarily close to the independence density. A slight modification of these constraints, however, displays this property, even though such modified constraints are much harder to deal with in practice.

For a finite set $U\subseteq\mathbb{R}^n$ write

$$m(U)=\min\{\|x-y\|:x,y\in U,\ x\neq y\}$$

for the minimum distance between pairs of distinct points in $U$. The following lemma provides an alternative characterization of $\mathcal{C}(\mathbb{R}^n)$.

Lemma 10.1

A continuous and real-valued function $f\in L^\infty(\mathbb{R}^n)$ belongs to $\mathcal{C}(\mathbb{R}^n)$ if and only if

$$\sum_{x,y\in U}Z(x,y)\int_{B(x,\delta)}\int_{B(y,\delta)}f(x'-y')\,dy'\,dx'\geq0\tag{42}$$

for all finite $U\subseteq\mathbb{R}^n$, $Z\in\mathcal{C}^*(U)$, and $0<\delta\leq m(U)/2$.

Compare this lemma to the definition of $\mathcal{C}(\mathbb{R}^n)$ from Sect. 6.3. A constraint (42) is obtained from

$$\sum_{x,y\in U}Z(x,y)f(x-y)\geq0$$

by considering an open ball of radius $\delta$ around each point in $U$; since $\delta\leq m(U)/2$, balls around different points do not intersect. So we are “thickening” each point in $U$.

Proof

Let $f\in L^\infty(\mathbb{R}^n)$ be a continuous and real-valued function and suppose there is a finite $U\subseteq\mathbb{R}^n$ and $Z\in\mathcal{C}^*(U)$ such that

$$\sum_{x,y\in U}Z(x,y)f(x-y)<0.$$

Since $f$ is continuous, for every $\epsilon>0$ there is $\delta>0$ such that for all $x$, $y\in U$ we have $|f(x-y)-f(x'-y')|<\epsilon$ for all $x'\in B(x,\delta)$ and $y'\in B(y,\delta)$. So for all $x$, $y\in U$ one has

$$
\begin{aligned}
&\Bigl|f(x-y)-(\operatorname{vol}B(0,\delta))^{-2}\int_{B(x,\delta)}\int_{B(y,\delta)}f(x'-y')\,dy'\,dx'\Bigr|\\
&\qquad\leq(\operatorname{vol}B(0,\delta))^{-2}\int_{B(x,\delta)}\int_{B(y,\delta)}|f(x-y)-f(x'-y')|\,dy'\,dx'<\epsilon.
\end{aligned}
$$

It follows that, by taking $\epsilon$ small enough, the left-hand side of (42) for the corresponding $\delta$ will be negative.

For the other direction, we approximate integrals of $f$ by finite sums. If $f$ is such that the left-hand side of (42) is negative, then take for $U'$ the set consisting of a fine sample of points inside each $B(x,\delta)$ for $x\in U$. In this way one approximates by summation the double integrals in (42), showing that

$$\sum_{x,y\in U'}Z'(x,y)f(x-y)<0,$$

where $Z'\colon U'\times U'\to\mathbb{R}$ is the copositive matrix derived from $Z$ by duplication of rows and columns.

Recall from Sect. 9.1 that a continuous radial function $f\in L^\infty(\mathbb{R}^n)$ of positive type can be represented by a finite Borel measure $\alpha$ on $[0,\infty)$ via

$$f(x)=\int_0^\infty\Omega_n(t\|x\|)\,d\alpha(t).$$

Using this expression, a constraint like (42) becomes

$$\int_0^\infty r(t)\,d\alpha(t)\geq0,$$

where $r\colon[0,\infty)\to\mathbb{R}$ is the function such that

$$r(t)=\sum_{x,y\in U}Z(x,y)\int_{B(x,\delta)}\int_{B(y,\delta)}\Omega_n(t\|x'-y'\|)\,dy'\,dx';\tag{43}$$

note that $r$ is continuous. The following lemma establishes two key properties of such a function $r$.

Lemma 10.2

If $r$ is given as in (43), then $r$ vanishes at infinity. If moreover $n\geq2$ and $\operatorname{tr}Z\neq0$, then $r(t)\geq0$ for all large enough $t$.

Proof

Let $B$ be an open ball centered at the origin and fix $z\in\mathbb{R}^n$. Let $\mu$ be the Haar measure on the orthogonal group $O(n)\subseteq\mathbb{R}^{n\times n}$, normalized so the total measure is 1. Averaging over $O(n)$ the Fourier transform (on the space $\mathbb{R}^{2n}$) of the characteristic function $\chi_{B\times(z+B)}$ of $B\times(z+B)$ we get

$$
\begin{aligned}
\int_{O(n)}\hat\chi_{B\times(z+B)}(Tu,-Tu)\,d\mu(T)
&=\int_{O(n)}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\chi_B(x)\chi_{z+B}(y)e^{-i(Tu\cdot x-Tu\cdot y)}\,dy\,dx\,d\mu(T)\\
&=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\chi_B(x)\chi_{z+B}(y)\int_{O(n)}e^{-iTu\cdot(x-y)}\,d\mu(T)\,dy\,dx\\
&=\int_B\int_{z+B}\Omega_n(\|u\|\|x-y\|)\,dy\,dx,
\end{aligned}
$$

which provides us with an expression for the double integrals appearing in (43) in terms of the Fourier transform of $\chi_{B\times(z+B)}$; the lemma will follow from this relation.

First, it is immediate from this relation that $r$ vanishes at infinity. Indeed, the Riemann–Lebesgue lemma [38, Theorem IX.7] says that the Fourier transform of the characteristic function vanishes at infinity (that is, as $\|u\|\to\infty$) and so, since $Z$ is a fixed matrix, we must have that $r$ vanishes at infinity.

To see that $r$ is nonnegative at infinity is only slightly more complicated. Note

$$\hat\chi_{B\times(z+B)}(u,-u)=e^{iu\cdot z}\hat\chi_{B\times B}(u,-u).$$

Since $B$ is centered at the origin, $\hat\chi_{B\times B}(Tu,-Tu)=\hat\chi_{B\times B}(u,-u)$ for all $T\in O(n)$, so averaging gives us

$$
\begin{aligned}
\int_B\int_{z+B}\Omega_n(\|u\|\|x-y\|)\,dy\,dx
&=\int_{O(n)}e^{iTu\cdot z}\hat\chi_{B\times B}(Tu,-Tu)\,d\mu(T)\\
&=\int_{O(n)}e^{iTu\cdot z}\hat\chi_{B\times B}(u,-u)\,d\mu(T)\\
&=\Omega_n(\|u\|\|z\|)\hat\chi_{B\times B}(u,-u).
\end{aligned}
\tag{44}
$$

Recall that $\Omega_n(0)=1$. Since $n\geq2$, the function $\Omega_n$ vanishes at infinity.$^6$ Then, since $\operatorname{tr}Z\neq0$, and hence $\operatorname{tr}Z>0$ as $Z$ is copositive, using (44) it follows that for all large $t$ the diagonal summands in (43) together dominate the off-diagonal ones.

Now $\hat\chi_{B\times B}(u,-u)\geq0$, as follows from the definition of the Fourier transform. So since $\operatorname{tr}Z>0$, it follows that for all large enough $t$ we have $r(t)\geq0$.

Say now $\mathcal{R}$ is any finite collection of functions $r$, each one defined in terms of a thick constraint as in (43), and let $d_1$, ..., $d_m$ be $m$ distinct positive numbers. Consider the optimization problem

$$
\begin{array}{rl}
\text{maximize} & \alpha(\{0\})\\[2pt]
& \alpha([0,\infty))=1,\\[2pt]
& \int_0^\infty\Omega_n(d_it)\,d\alpha(t)=0\quad\text{for }i=1,\ldots,m,\\[2pt]
& \int_0^\infty r(t)\,d\alpha(t)\geq0\quad\text{for }r\in\mathcal{R},\\[2pt]
& \alpha\text{ is a Borel measure on }[0,\infty).
\end{array}
\tag{45}
$$

This problem is comparable to (38), but instead of using the alternative normalization of Sect. 7.2, the standard normalization is used, and instead of considering only distance 1 as a forbidden distance, distances $d_1$, ..., $d_m$ are forbidden; this way we get an infinite-dimensional linear program instead of a semidefinite program. By construction, the optimal value of (45) is an upper bound for $\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,\ldots,d_m\}))$.

A dual problem for (45) is the following (cf. problem (39)):

$$
\begin{array}{rl}
\text{minimize} & \lambda\\[2pt]
& \lambda+\sum_{i=1}^mz_i+\sum_{r\in\mathcal{R}}y(r)r(0)\geq1,\\[2pt]
& \lambda+\sum_{i=1}^mz_i\Omega_n(d_it)+\sum_{r\in\mathcal{R}}y(r)r(t)\geq0\quad\text{for all }t>0,\\[2pt]
& y\leq0.
\end{array}
\tag{46}
$$

(Recall $\Omega_n(0)=1$, hence the coefficient of $z_i$ in the first constraint is 1.) Weak duality holds between (45) and (46): if $\lambda$, $z$, and $y$ is any feasible solution of the dual problem and $\alpha$ is any feasible solution of the primal problem, then $\alpha(\{0\})\leq\lambda$; the proof of this fact is analogous to the proof of the weak duality relation between problems (38) and (39), given in Sect. 9.2. So any feasible solution $\lambda$, $z$, and $y$ of the dual provides an upper bound for the independence density, namely

$$\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,\ldots,d_m\}))\leq\lambda.$$

A sequence of primal problems

For each finite nonempty set $U$, the set

$$\mathcal{T}(U)=\{Z\in\mathcal{C}^*(U):\|Z\|_1\leq1\},$$

the tip of $\mathcal{C}^*(U)$, is a compact convex set, and every copositive matrix is a multiple of a matrix in the tip.$^7$ There is then a countable dense subset $\mathcal{T}_0(U)$ of $\mathcal{T}(U)$, and we may assume that all $Z\in\mathcal{T}_0(U)$ are such that $\operatorname{tr}Z>0$ and $\langle J,Z\rangle>0$.

If $U\subseteq\mathbb{R}^n$ is finite, then the set of constraints of the form (42) with $Z\in\mathcal{T}_0(U)$ and $\delta=m(U)/(2k)$ for integer $k\geq1$ is countable. If we consider all finite subsets $U$ of $\mathbb{Q}^n$ and all corresponding constraints, then the set of all constraints thus obtained is also countable. The corresponding functions (43) can be enumerated as $r_1$, $r_2$, .... We use this enumeration to define a sequence of optimization problems, the $N$th one being

$$
\begin{array}{rl}
\text{maximize} & \alpha(\{0\})\\[2pt]
& \alpha([0,\infty))=1,\\[2pt]
& \int_0^\infty\Omega_n(t)\,d\alpha(t)=0,\\[2pt]
& \int_0^\infty r_k(t)\,d\alpha(t)\geq0\quad\text{for }1\leq k\leq N,\\[2pt]
& \alpha\text{ is a Borel measure on }[0,\infty).
\end{array}
\tag{47}
$$

Note this is just problem (45) with $\mathcal{R}=\{r_1,\ldots,r_N\}$, $m=1$, and $d_1=1$. Let $\vartheta_N$ denote both the $N$th optimization problem above and its optimal value, and denote by $\vartheta_\infty$ the optimization problem in which constraints for all $k\geq1$ are added, as well as the optimal value of this problem. We know that $\vartheta_N\geq\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))$ for all $N\geq1$. By the construction of the $r_k$ functions, using Lemma 10.1 and Theorem 6.3, we also know that $\vartheta_\infty=\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))$.

Theorem 10.3

If $n\geq2$, then $\lim_{N\to\infty}\vartheta_N=\vartheta_\infty$.

Proof

Since $\vartheta_N\geq\vartheta_{N+1}$ and $\vartheta_N\geq\vartheta_\infty$ for all $N\geq1$, the limit exists and is at least $\vartheta_\infty$; we show now the reverse inequality.

So let $(\alpha_N)$ be a sequence of measures such that $\alpha_N$ is a feasible solution of $\vartheta_N$ and $\alpha_N(\{0\})\geq L$ for all $N\geq1$ and some $L>0$. Each $\alpha_N$ is a finite Radon measure (since $[0,\infty)$ is a complete separable metric space), being therefore an element of the space $\mathcal{M}([0,\infty))$ of signed Radon measures of bounded total variation. By the Riesz Representation Theorem [16, Theorem 7.17], the space $\mathcal{M}([0,\infty))$ is the dual space of $C_0([0,\infty))$, which is the space of continuous functions vanishing at infinity equipped with the supremum norm.

For $f\in C_0([0,\infty))$ and $\mu\in\mathcal{M}([0,\infty))$, write

$$[f,\mu]=\int_0^\infty f(t)\,d\mu(t).$$

If $\|f\|_\infty\leq1$, then $|[f,\alpha_N]|\leq1$ since $\alpha_N([0,\infty))=1$. So all $\alpha_N$ belong to the closed unit ball

$$\{\mu\in\mathcal{M}([0,\infty)):|[f,\mu]|\leq1\text{ for all }f\in C_0([0,\infty))\text{ with }\|f\|_\infty\leq1\},$$

which by Alaoglu’s theorem [16, Theorem 5.18] is compact in the weak-$*$ topology on $\mathcal{M}([0,\infty))$.

So $(\alpha_N)$ has a weak-$*$-convergent subsequence$^8$; let us assume that the sequence itself converges to a measure $\alpha\in\mathcal{M}([0,\infty))$. Here is what we want to prove:

  • (i) $\alpha(\{0\})\geq\lim_{N\to\infty}\alpha_N(\{0\})$;

  • (ii) $\alpha([0,\infty))\leq1$;

  • (iii) $\alpha([0,\infty))^{-1}\alpha$ is a feasible solution of $\vartheta_\infty$.

From these three claims the reverse inequality, and hence the theorem, follows.

To see (i), note first that $\alpha$ must be nonnegative. For suppose $\alpha(X)<0$ for some set $X$. Since $\alpha$ is Radon, it is inner regular on $\sigma$-finite sets [16, Proposition 7.5], so there is a compact set $C\subseteq X$ such that $\alpha(C)<0$. For $k\geq1$, let $U_k$ be the set of all points at distance less than $1/k$ from $C$; note that $U_k$ is open and that $C$ is the intersection of $U_k$ for $k\geq1$.

For every $k\geq1$, Urysohn’s lemma says that there is a continuous function $f_k\colon[0,\infty)\to[0,1]$ that is 1 on $C$ and 0 outside of $U_k$, and since $U_k$ is bounded this function vanishes at infinity. Now $\alpha(C)=\lim_{k\to\infty}\alpha(U_k)$, so if $k$ is large enough we have

$$0>[f_k,\alpha]=\lim_{N\to\infty}[f_k,\alpha_N],$$

and for some $N$ we must have $[f_k,\alpha_N]<0$, a contradiction since $f_k\geq0$ and $\alpha_N$ is nonnegative.

Next, for every $\epsilon>0$ let $f_\epsilon\colon[0,\infty)\to[0,1]$ be a continuous function such that $f_\epsilon(0)=1$ and $f_\epsilon(t)=0$ for $t\geq\epsilon$. Note that

$$\alpha(\{0\})=\lim_{\epsilon\downarrow0}\alpha([0,\epsilon)).$$

Now

$$\alpha([0,\epsilon))\geq[f_\epsilon,\alpha]=\lim_{N\to\infty}[f_\epsilon,\alpha_N]\geq\lim_{N\to\infty}\alpha_N(\{0\}),$$

proving (i).

For (ii), if $\alpha([0,\infty))>1$, then there is $U$ such that $\alpha([0,U))>1$. Let $f\colon[0,\infty)\to[0,1]$ be a continuous function such that $f(t)=1$ for $t\in[0,U)$ and $f(t)=0$ for $t\geq U+1$. Then

$$1<\alpha([0,U))\leq[f,\alpha]=\lim_{N\to\infty}[f,\alpha_N],$$

and for some $N$ we have $\alpha_N([0,U+1))\geq[f,\alpha_N]>1$, a contradiction since $\alpha_N$ is feasible for $\vartheta_N$.

Finally, for (iii), recall that $\Omega_n$ vanishes at infinity for $n\geq2$. Then

$$\int_0^\infty\Omega_n(t)\,d\alpha(t)=[\Omega_n,\alpha]=\lim_{N\to\infty}[\Omega_n,\alpha_N]=0.$$

From Lemma 10.2 we know that $r_k$ vanishes at infinity for all $k$, so similarly we have $[r_k,\alpha]\geq0$ for all $k\geq1$, finishing the proof of (iii) and that of the theorem.

A sequence of dual problems

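Following (46), here is a dual problem for $\vartheta_N$:

$$
\begin{array}{rl}
\text{minimize} & \lambda\\[2pt]
& \lambda+z+\sum_{k=1}^Ny_kr_k(0)\geq1,\\[2pt]
& \lambda+z\Omega_n(t)+\sum_{k=1}^Ny_kr_k(t)\geq0\quad\text{for all }t>0,\\[2pt]
& y\leq0.
\end{array}
\tag{48}
$$

Weak duality holds between this problem and $\vartheta_N$, but in this case we know even more, namely that there is no duality gap between primal and dual problems:

Theorem 10.4

If $n\geq2$, then the optimal value of (48) is $\vartheta_N$.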

In Sect. 9.3 we saw how problem (39), which is similar to (48), is solved: we disregard all constraints for $t>L$ for some $L>0$, take a finite sample $S$ of points in $[0,L]$, and consider only constraints for $t\in S$. We then have a finite linear program, which can be solved by computer. Most likely, an optimal solution of this problem will be (slightly) infeasible for the original, infinite problem. However, the hope is that, if $L$ is large enough and the sample $S$ is fine enough, then the solution obtained from the discretized problem can be fixed to become a feasible solution of the original problem.
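To make the discretization step concrete, here is a minimal sketch for the simplest instance of (48), the one with no thick constraints ($N=0$), so that the only variables are $\lambda$ and $z$. The assumptions are loud ones: the sample is a uniform grid on $(0,L]$, the values $L=15$ and the step $0.01$ are arbitrary illustrative choices, $\Omega_n$ is evaluated by its power series (adequate only for moderate arguments), and the resulting two-variable linear program is solved in closed form rather than with a solver. The closed form holds because, writing $m$ for the smallest sampled value of $\Omega_n$ (negative for the dimensions and range used here), the binding constraints are $\lambda\geq1-z$ and $\lambda\geq-zm$, which cross at $z=1/(1-m)$.

```python
def omega(n, t, terms=150):
    """Omega_n(t) from (37) via the power series of the Bessel function."""
    q = -(t / 2.0) ** 2
    term = 1.0
    total = 1.0
    for k in range(terms):
        term *= q / ((k + 1) * (k + n / 2.0))
        total += term
    return total

def discretized_dual(n, L=15.0, step=0.01):
    """Solve: minimize lam  s.t.  lam + z >= 1,  lam + z*Omega_n(t) >= 0 for t in S.

    S is the uniform grid {step, 2*step, ..., L}.  Returns (lam, z, m) where
    m is the minimum of Omega_n over S; assumes m < 0.
    """
    S = [k * step for k in range(1, int(round(L / step)) + 1)]
    m = min(omega(n, t) for t in S)
    z = 1.0 / (1.0 - m)       # crossing point of the two binding constraints
    lam = -m / (1.0 - m)      # common value 1 - z = -z*m at the crossing
    return lam, z, m

lam, z, m = discretized_dual(2)
print(round(lam, 4))  # optimal value of the discretized problem for n = 2
```

The number returned is the optimal value of the discretized problem only; turning it into a certified bound still requires controlling the constraints between sample points and for $t>L$, which is exactly what the argument below does in general.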

The proof of the above theorem follows the same strategy, but while in Sect. 9.3 we did not have to argue that this solution strategy always works (since we were only interested in having it work for the cases considered), here we have to. For that we need two lemmas, the first one to help us find the number L.

Lemma 10.5

If $n\geq2$ and if $t_0>0$ is such that $\Omega_n(t_0)<0$ and $r_k(t_0)\geq0$ for $k=1$, ..., $N$, then the polyhedron in $\mathbb{R}^{N+2}$ consisting of vectors $(\lambda,z,y_1,\ldots,y_N)$ satisfying

$$
\begin{array}{l}
-1\leq\lambda\leq2,\\[2pt]
y_k\leq0\quad\text{for }k=1,\ldots,N,\\[2pt]
\lambda+z+\sum_{k=1}^Ny_kr_k(0)\geq1,\\[2pt]
\lambda+z\Omega_n(t_0)+\sum_{k=1}^Ny_kr_k(t_0)\geq0
\end{array}
\tag{49}
$$

is bounded.

Note that such a $t_0$ as in the statement above exists, as follows from Lemma 10.2 since $\Omega_n$ has zeros of arbitrarily large magnitude.$^9$

Proof

Let $K\subseteq\mathbb{R}^{N+2}$ be the cone generated by the $N+4$ vectors

$$
\begin{aligned}
l_1&=(1,0,\ldots,0),\qquad l_2=(-1,0,\ldots,0),\\
e_1&=(0,0,-1,\ldots,0),\quad e_2=(0,0,0,-1,\ldots,0),\quad\ldots,\quad e_N=(0,0,0,\ldots,-1),\\
s_1&=(1,1,r_1(0),\ldots,r_N(0)),\\
s_2&=(1,\Omega_n(t_0),r_1(t_0),\ldots,r_N(t_0)).
\end{aligned}
$$

The polyhedron given by the inequalities (49) is bounded if and only if $K=\mathbb{R}^{N+2}$; let us show that this is the case.$^{10}$

By construction we have $r_k(0)>0$ (recall that the copositive matrix $Z$ used in the definition of $r_k$ is such that $\langle J,Z\rangle>0$; see Sect. 10.2); add nonnegative multiples of $l_2$, $e_1$, ..., $e_N$ to $s_1$ to get $w_1=(0,1,0,\ldots,0)\in K$. Since $r_k(t_0)\geq0$, add nonnegative multiples of $l_2$, $e_1$, ..., $e_N$ to $s_2$ and rescale the result to see that $-w_1\in K$.

Finally, for each $k=1$, ..., $N$, add to $s_1$ nonnegative multiples of $l_2$, $-w_1$, and $e_i$ for $i\neq k$ and rescale the result to see that $-e_k\in K$, finishing the proof that $K=\mathbb{R}^{N+2}$.

The second lemma provides some crude bounds on the derivative of the functions Ωn and rk, and will be used to help us decide how fine the sample S has to be.

Lemma 10.6

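If $n\geq2$, then for all $t\geq0$ we have $|\Omega_n'(t)|\leq\Gamma(n/2)$. If $r$ is given as in (43), then

$$|r'(t)|\leq\sum_{x,y\in U}|Z(x,y)|\,(\|x-y\|+2\delta)(\operatorname{vol}B(0,\delta))^2\,\Gamma(n/2).$$

Proof

It follows directly from the series expansion of the Bessel function of order $\alpha$ that

$$\frac{d\,t^{-\alpha}J_\alpha(t)}{dt}=-t^{-\alpha}J_{\alpha+1}(t),$$

and so from (37) we get

$$\Omega_n'(t)=-\Gamma\Bigl(\frac n2\Bigr)\Bigl(\frac2t\Bigr)^{(n-2)/2}J_{n/2}(t).$$

Compare this with the expression for $\Omega_{n+2}$ to get

$$\Omega_n'(t)=-(t/n)\,\Omega_{n+2}(t).$$

Now $|J_\alpha(t)|\leq1$ for all $\alpha\geq0$ and $t\geq0$ [47, equation (10), §13.42]. Combine this with the first expression for $\Omega_n'$ to see that for $t\geq2$ we have $|\Omega_n'(t)|\leq\Gamma(n/2)$. From the definition (36) of $\Omega_n$, it follows that $|\Omega_n(t)|\leq1$ for all $t$, hence from the second expression for $\Omega_n'$ it is clear that $|\Omega_n'(t)|\leq2/n$ for $t\leq2$. For $n\geq2$ we have $\Gamma(n/2)\geq2/n$, and so $|\Omega_n'(t)|\leq\Gamma(n/2)$.

For the estimate on $r'$, take $x$, $y\in U$. Then

$$
\begin{aligned}
\Bigl|\frac{d}{dt}\int_{B(x,\delta)}\int_{B(y,\delta)}\Omega_n(t\|x'-y'\|)\,dy'\,dx'\Bigr|
&=\Bigl|\int_{B(x,\delta)}\int_{B(y,\delta)}\frac{d\,\Omega_n(t\|x'-y'\|)}{dt}\,dy'\,dx'\Bigr|\\
&\leq\int_{B(x,\delta)}\int_{B(y,\delta)}\|x'-y'\|\,|\Omega_n'(t\|x'-y'\|)|\,dy'\,dx'\\
&\leq(\|x-y\|+2\delta)(\operatorname{vol}B(0,\delta))^2\,\Gamma(n/2),
\end{aligned}
$$

and the estimate for $r'$ follows.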
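The first estimate of the lemma is easy to sanity-check numerically. The sketch below is only an illustration and proves nothing: it evaluates $\Omega_n'$ through the identity $\Omega_n'(t)=-(t/n)\Omega_{n+2}(t)$ from the proof and compares its absolute value on a finite grid with $\Gamma(n/2)$. The function names, the grid, and the range $t\leq20$ are arbitrary choices, and the power series used for $\Omega_n$ is assumed to be accurate enough on that range.

```python
import math

def omega(n, t, terms=150):
    """Omega_n(t) from (37) via the power series of the Bessel function."""
    q = -(t / 2.0) ** 2
    term = 1.0
    total = 1.0
    for k in range(terms):
        term *= q / ((k + 1) * (k + n / 2.0))
        total += term
    return total

def omega_prime(n, t):
    """Derivative of Omega_n, using Omega_n'(t) = -(t/n) * Omega_{n+2}(t)."""
    return -(t / n) * omega(n + 2, t)

def max_abs_derivative(n, t_max=20.0, step=0.05):
    """Largest |Omega_n'(t)| over the grid {0, step, 2*step, ..., t_max}."""
    count = int(round(t_max / step))
    return max(abs(omega_prime(n, k * step)) for k in range(count + 1))

for n in range(2, 7):
    # Each line shows the observed maximum next to the bound Gamma(n/2).
    print(n, max_abs_derivative(n), math.gamma(n / 2.0))
```

For $n=3$ the identity can also be checked by hand, since $\Omega_3(t)=\sin t/t$ has derivative $(t\cos t-\sin t)/t^2$.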

We now have everything needed to prove that there is no duality gap.

Proof of Theorem 10.4

Fix $\epsilon>0$ and let $t_0$ be such that $\Omega_n(t_0)<0$ and $r_k(t_0)\geq0$ for all $k=1$, ..., $N$. Lemma 10.5 says that the polyhedron described by the inequalities (49) is bounded; let $M$ be an upper bound on the Euclidean norm of any vector in this polyhedron. Since $\Omega_n$ vanishes at infinity and so does $r_k$ for all $k$ (cf. Lemma 10.2), there is $L\geq t_0$ such that

$$\|(\Omega_n(t),r_1(t),\ldots,r_N(t))\|\leq\epsilon/M\quad\text{for all }t\geq L.\tag{50}$$

Lemma 10.6 implies that there is a constant $D$ such that

$$\|(\Omega_n'(t),r_1'(t),\ldots,r_N'(t))\|\leq D\quad\text{for all }t\geq0.\tag{51}$$

Let $S\subseteq[0,L]$ be a finite set of points with the property that given $t\in[0,L]$ there is $s\in S$ with $|t-s|\leq\epsilon/(MD)$, and make sure that both $t_0$ and $L$ are in $S$.

Now consider the optimization problem

$$
\begin{array}{rl}
\text{minimize} & \lambda\\[2pt]
& \lambda+z+\sum_{k=1}^Ny_kr_k(0)\geq1,\\[2pt]
& \lambda+z\Omega_n(t)+\sum_{k=1}^Ny_kr_k(t)\geq0\quad\text{for all }t\in S,\\[2pt]
& -1\leq\lambda\leq2,\\[2pt]
& y\leq0,
\end{array}
\tag{52}
$$

which is a finite linear program. Let $\lambda$, $z$, and $y$ be an optimal solution of this problem and write

$$g(t)=z\Omega_n(t)+\sum_{k=1}^Ny_kr_k(t).$$

Since $t_0\in S$, we know from Lemma 10.5 that $\|(z,y_1,\ldots,y_N)\|\leq M$. Using the Cauchy–Schwarz inequality together with (50) we see that, for all $t\geq L$,

$$|g(t)|\leq M(\epsilon/M)=\epsilon.\tag{53}$$

Given $t\in[0,L]$, there is $s\in S$ such that $|t-s|\leq\epsilon/(MD)$. Then using the mean-value theorem, the Cauchy–Schwarz inequality, and (51) we get

$$|g(t)-g(s)|\leq|t-s|MD\leq\epsilon.\tag{54}$$

Since $\lambda+g(s)\geq0$, we then have that $\lambda+g(t)\geq-\epsilon$.

The estimates (53) and (54) together show that $\lambda+\epsilon$, $z$, and $y$ is a feasible solution of (48). We now find a solution of $\vartheta_N$, defined in (47), of value close to it.

To do so, notice that if $\epsilon$ is small enough, then (53) implies in particular that $\lambda>-1$, or else $\lambda+g(L)<0$, a contradiction. Since our solution is optimal, we must also have $\lambda<2$ (notice $\lambda=1$, $z=0$, and $y=0$ is a feasible solution of our problem).

Now problem (52) is a finite linear program, and we can apply the strong duality theorem. Its dual looks very much like problem $\vartheta_N$, except that the measure $\alpha$ is now a discrete measure supported on $S\cup\{0\}$ and there are two extra variables corresponding to the constraints $\lambda\geq-1$ and $\lambda\leq2$. Since our optimal solution of (52) is such that $-1<\lambda<2$, complementary slackness implies that these two extra variables of the dual of (52) will be 0 in an optimal solution. So if $\alpha$ is an optimal solution of the dual of (52), then it is also a feasible (though likely not optimal) solution of $\vartheta_N$.

We then have a solution of $\vartheta_N$ of value $\lambda$ and a feasible solution of (48) of value $\lambda+\epsilon$. Making $\epsilon$ approach 0 we obtain the theorem.

Asymptotics for many distances

The theorem below implies the ‘$\leq$’ direction of Bukh’s result (41). The reverse inequality is much simpler to prove; the reader is referred to Bukh’s paper [6].

Theorem 10.7

If $n\geq2$ and $m\geq2$, then for every $\epsilon>0$ there is $q$ such that if $d_1$, ..., $d_m$ are positive numbers such that $d_i/d_{i-1}>q$ for $i=2$, ..., $m$, then

$$\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,\ldots,d_m\}))\leq\bigl(\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))+\epsilon\bigr)^m+\epsilon(m-1).$$

Proof

All ideas required for the proof can be more clearly presented when only two distances are considered; for larger values of m one only has to use induction.

So fix $\epsilon>0$. Theorems 6.3 and 10.3 imply that we can choose $N$ such that $\vartheta_N\leq\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))+\epsilon/2$ and Theorem 10.4 then says that we can take a feasible solution $\lambda$, $z$, and $y$ of the dual (48) of $\vartheta_N$ satisfying

$$\lambda\leq\vartheta_N+\epsilon/2\leq\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))+\epsilon.$$

We may assume moreover that $\lambda\leq1$. Since $\lambda$ is an upper bound on the independence density of the unit-distance graph, which is positive, by taking $\epsilon$ small enough we assume that $\lambda\geq\epsilon$.

Write

$$g(t)=z\Omega_n(t)+\sum_{k=1}^Ny_kr_k(t);$$

note that $g$ is continuous. Since $(\lambda,z,y)$ is feasible, we know that $g(0)\geq1-\lambda$ and $g(t)\geq-\lambda$ for all $t>0$. Now $\Omega_n$ vanishes at infinity for $n\geq2$, and together with Lemma 10.2 this implies that $g$ also vanishes at infinity, so there is $L>0$ such that $|g(t)|\leq\epsilon$ for all $t\geq L$. Since $g$ is continuous at 0, we can pick $\eta>0$ such that $g(t)\geq1-\lambda-\epsilon$ for all $t\in[0,\eta]$.

Set $q=L/\eta$ and suppose $d_1$, $d_2$ are distances satisfying $d_2/d_1>q$. The independence density does not change if we scale the forbidden distances, so we may assume that $d_2=1$ and then $d_1<q^{-1}$. Consider the function $h(t)=g(d_1t)$. Then $\lambda^2+\epsilon+g(t)+\lambda h(t)$ is

  • (i) at least $1+\epsilon$ if $t=0$;

  • (ii) at least $\epsilon-\lambda\epsilon\geq0$ if $t\in[0,L]$, since $0\leq\lambda\leq1$ and $d_1t<q^{-1}t=\eta t/L\leq\eta$;

  • (iii) at least 0 if $t\geq L$, since $\lambda\geq\epsilon$.
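The three bounds follow from a short computation, spelled out here for convenience. It uses only the facts collected above: $g(0)\geq1-\lambda$; $g(t)\geq-\lambda$ for $t>0$; $|g(t)|\leq\epsilon$ for $t\geq L$; $g(s)\geq1-\lambda-\epsilon$ for $s\in[0,\eta]$; and $0\leq\lambda\leq1$. Recall that $h(t)=g(d_1t)$.

$$
\begin{aligned}
t=0:&\quad\lambda^2+\epsilon+g(0)+\lambda g(0)\geq\lambda^2+\epsilon+(1+\lambda)(1-\lambda)=1+\epsilon,\\
0<t\leq L:&\quad\lambda^2+\epsilon+g(t)+\lambda g(d_1t)\geq\lambda^2+\epsilon-\lambda+\lambda(1-\lambda-\epsilon)=\epsilon-\lambda\epsilon,\\
t\geq L:&\quad\lambda^2+\epsilon+g(t)+\lambda g(d_1t)\geq\lambda^2+\epsilon-\epsilon+\lambda(-\lambda)=0.
\end{aligned}
$$

In the second line $d_1t\leq\eta$ is what allows the bound $g(d_1t)\geq1-\lambda-\epsilon$; in the third line $g(d_1t)\geq-\lambda$ holds because $g\geq-\lambda$ everywhere (at $0$ because $g(0)\geq1-\lambda$).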

Now notice

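$$h(t)=z\Omega_n(d_1t)+\sum_{k=1}^Ny_kr_k(d_1t),$$

where from (43)

$$
\begin{aligned}
r_k(d_1t)&=\sum_{x,y\in U_k}Z_k(x,y)\int_{B(x,\delta_k)}\int_{B(y,\delta_k)}\Omega_n(d_1t\|x'-y'\|)\,dy'\,dx'\\
&=\sum_{x,y\in U_k}Z_k(x,y)\int_{B(x,\delta_k)}\int_{B(y,\delta_k)}\Omega_n(t\|d_1x'-d_1y'\|)\,dy'\,dx'\\
&=\sum_{x,y\in U_k}Z_k(x,y)\int_{d_1B(x,\delta_k)}\int_{d_1B(y,\delta_k)}\Omega_n(t\|x'-y'\|)\,d_1^{-2n}\,dy'\,dx'\\
&=\sum_{x,y\in U_k}\bigl(d_1^{-2n}Z_k(x,y)\bigr)\int_{B(d_1x,d_1\delta_k)}\int_{B(d_1y,d_1\delta_k)}\Omega_n(t\|x'-y'\|)\,dy'\,dx'\\
&=\sum_{x,y\in d_1U_k}\bigl(d_1^{-2n}Z_k(d_1^{-1}x,d_1^{-1}y)\bigr)\int_{B(x,d_1\delta_k)}\int_{B(y,d_1\delta_k)}\Omega_n(t\|x'-y'\|)\,dy'\,dx'.
\end{aligned}
$$

This shows that $\tilde r_k(t)=r_k(d_1t)$ also comes from a thick constraint through (43). Write now $\mathcal{R}=\{r_1,\ldots,r_N,\tilde r_1,\ldots,\tilde r_N\}$. Then from (i)–(iii) we see that

$$\bar\lambda=\lambda^2+\epsilon,\quad\bar z_1=\lambda z,\quad\bar z_2=z,\quad\bar y(r_k)=y_k\text{ for }k=1,\ldots,N,\quad\text{and}\quad\bar y(\tilde r_k)=\lambda y_k\text{ for }k=1,\ldots,N$$

is a feasible solution of (46) for distances $d_1$, $d_2$, whence

$$\alpha_{\bar\delta}(G(\mathbb{R}^n,\{d_1,d_2\}))\leq\bar\lambda=\lambda^2+\epsilon\leq\bigl(\alpha_{\bar\delta}(G(\mathbb{R}^n,\{1\}))+\epsilon\bigr)^2+\epsilon,$$

as we wanted.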

Computability of the independence density

The sequence of dual problems of Sect. 10.3 can be used to construct a Turing machine that computes the independence density of the unit-distance graph up to any prescribed precision. Here is a brief sketch of the idea.

First we describe a Turing machine that computes an increasing sequence of lower bounds for the independence density that come arbitrarily close to it.

Given $T>0$, let $\mathcal{P}_{T,N}$ be the partition of $[-T,T)^n$ consisting of all half-open cubes $C_1\times\cdots\times C_n$ with

$$C_i\in\{[-T+2kT/N,\,-T+2(k+1)T/N):k=0,\ldots,N-1\}.$$

For each such partition let $G_{T,N}$ be the graph whose vertex set is $\mathcal{P}_{T,N}$ and in which two vertices $X$, $Y$ are adjacent if and only if there are $x\in X$ and $y\in Y$ such that $\|x-y\|=1$. Given $T$ and $N$, the finite graph $G_{T,N}$ can be computed by a Turing machine.

By construction, if $\mathcal{I}$ is an independent set of $G_{T,N}$, then the union $I$ of all $X$ in $\mathcal{I}$ is an independent set of the unit-distance graph with measure $|\mathcal{I}|\operatorname{vol}[0,2T/N]^n$ and

$$\bigcup_{v\in(2T+1)\mathbb{Z}^n}(v+I)$$

is a periodic independent set of the unit-distance graph with density

$$\frac{|\mathcal{I}|\operatorname{vol}[0,2T/N]^n}{\operatorname{vol}[-T-1/2,\,T+1/2]^n}.\tag{55}$$

We know from Sect. 6.1 that periodic independent sets can come arbitrarily close to the independence density. It is then not hard to show that by taking larger and larger T and larger and larger N one can by the above construction generate lower bounds for the independence density that can come arbitrarily close to it.

So our Turing machine simply fixes an enumeration $(T_1,N_1)$, $(T_2,N_2)$, ... of $(\mathbb{N}\setminus\{0\})^2$, computes the independence number of $G_{T_i,N_i}$ for all $i$, uses (55) to get a lower bound, and outputs at each step the best lower bound found so far.
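One step of this machine can be sketched in a few lines. The code below is a toy illustration under stated assumptions: it uses exact rational arithmetic, takes positive integers $T$ and $N$, and finds the independence number by brute force over all subsets, so it is usable only for very small grids. All function names are illustrative. Adjacency uses the observation that the differences $x-y$ for $x$, $y$ in two half-open cubes fill an open box, so distance 1 is realized exactly when the infimum of the distances is less than 1 and the supremum is greater than 1; a cube that contains two points at distance 1 is treated as having a loop and is excluded from independent sets.

```python
from fractions import Fraction
from itertools import combinations, product

def cells(T, N, n):
    """Lower corners of the N**n half-open cubes partitioning [-T, T)^n, and their side."""
    side = Fraction(2 * T, N)
    ticks = [Fraction(-T) + k * side for k in range(N)]
    return list(product(ticks, repeat=n)), side

def adjacent(a, b, side):
    """True iff some x in cube a and y in cube b satisfy |x - y| = 1."""
    lo2 = Fraction(0)
    hi2 = Fraction(0)
    for ai, bi in zip(a, b):
        c = abs(ai - bi)
        lo = max(c - side, Fraction(0))   # infimum of |x_i - y_i|
        hi = c + side                     # supremum of |x_i - y_i|
        lo2 += lo * lo
        hi2 += hi * hi
    return lo2 < 1 < hi2

def independence_number(vertices, side):
    """Brute-force independence number of G_{T,N} (exponential time)."""
    m = len(vertices)
    adj = [[adjacent(u, v, side) for v in vertices] for u in vertices]
    best = 0
    for mask in range(1 << m):
        chosen = [i for i in range(m) if (mask >> i) & 1]
        if len(chosen) <= best or any(adj[i][i] for i in chosen):
            continue
        if all(not adj[i][j] for i, j in combinations(chosen, 2)):
            best = len(chosen)
    return best

def density_lower_bound(T, N, n):
    """The lower bound (55) obtained from a maximum independent set of G_{T,N}."""
    vertices, side = cells(T, N, n)
    alpha = independence_number(vertices, side)
    return alpha * side ** n / Fraction(2 * T + 1) ** n

print(density_lower_bound(1, 4, 1))  # 1/3
```

For $n=1$, $T=1$, $N=4$ the four cells have length $1/2$, only cells two apart are adjacent, and two cells can be chosen, giving $2\cdot\frac12/3=1/3$. A real implementation would replace the subset enumeration by a proper maximum-independent-set solver.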

Let us now see how to construct a Turing machine that computes a decreasing sequence of upper bounds for the independence density that come arbitrarily close to it.

The idea is to find at the $N$th step a feasible solution of the dual (48) of $\vartheta_N$ with value at most $\vartheta_N+1/N$. This we do by mimicking the proof of Theorem 10.4: we disregard constraints for $t\geq L$ for some large $L$ and we discretize the interval $[0,L]$. Following the proof of the theorem, one sees that it is possible to estimate algorithmically how large $L$ has to be and how fine the discretization has to be so that we obtain a feasible solution of value at most $\vartheta_N+1/N$.

One problem now is that we have to work with rational numbers and not real numbers. The Bessel function and all integrals involved have to be approximated by rationals, which can be done to any desired precision algorithmically. In the end, however, we are not solving the original dual problem, but an approximated version of it. Why is the solution of this approximated version close to the solution of the original version, given, that is, that the approximation is good enough? Such a result, related to what is known in linear programming as sensitivity analysis, follows from Lemma 10.5: we work with problems of bounded feasible region, so there is a universal upper bound on the magnitude of any number appearing in any feasible solution, and it is possible to show that if the input data approximates the real data well enough, then the solutions will be very close together; moreover, it is possible to estimate how good the approximation has to be.

Another problem is to see that the set $\{r_1, r_2, \ldots\}$ can be enumerated by a Turing machine. The only difficulty here is how to enumerate the set $T_0(U)$ for some finite set $U$. One way to do it is as follows. First, note that $T(U)$ is a subset of the $L^1$ unit ball in $\mathbb{R}^{U \times U}$. Given $\epsilon > 0$, consider a finite $\epsilon$-net $N_\epsilon$ for this unit ball. Let now $N'_\epsilon$ be a finite set containing, for each $A \in N_\epsilon$, a matrix $B \in T(U)$ with $\|B\|_1 \le 1$ such that $\|A - B\|_1 \le \epsilon$, if it exists. Then, since $N_\epsilon$ is an $\epsilon$-net, for every $Z \in T(U)$ there is $B \in N'_\epsilon$ such that $\|Z - B\|_1 \le 2\epsilon$. Indeed, pick $A \in N_\epsilon$ with $\|Z - A\|_1 \le \epsilon$; then $Z$ itself shows that a matrix $B$ exists for this $A$, and the triangle inequality gives the bound. So we may take for $T_0(U)$ the union of $N'_{1/k}$ for $k \ge 1$.

It only remains to show how $N'_\epsilon$ can be computed. Given $A \in N_\epsilon$, we want to solve the following finite-dimensional optimization problem:

\[
\begin{array}{rl}
\text{minimize} & \|A - B\|_1\\
\text{subject to} & \|B\|_1 \le 1,\\
& B \in C(U).
\end{array}
\]

The $L^1$ norms above can be equivalently rewritten using linear constraints, so the above problem is a conic program that can be solved with the ellipsoid method (the separation problem is NP-hard, as follows from the equivalence between separation and optimization [19], but in this case we do not care for efficiency: it is enough to have a separation algorithm for the copositive cone, and we do [18]). By solving this problem repeatedly one can construct $N'_\epsilon$.
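The rewriting of the $L^1$ norms by linear constraints can be sketched as follows. This is the standard epigraph reformulation, written under the assumption that the $L^1$ norm of a matrix indexed by the finite set $U$ is the sum of the absolute values of its entries (another normalization would only rescale the constraints). The auxiliary matrices $S$ and $R$ are introduced here for illustration only.

```latex
\[
\begin{array}{rll}
\text{minimize}   & \displaystyle\sum_{x, y \in U} S(x, y) \\[2ex]
\text{subject to} & -S(x, y) \le A(x, y) - B(x, y) \le S(x, y) & \text{for all } x, y \in U,\\[1ex]
                  & -R(x, y) \le B(x, y) \le R(x, y)           & \text{for all } x, y \in U,\\[1ex]
                  & \displaystyle\sum_{x, y \in U} R(x, y) \le 1,\\[2ex]
                  & B \in C(U).
\end{array}
\]
```

At an optimal solution one has $S(x, y) = |A(x, y) - B(x, y)|$, so the optimal value equals the minimum of $\|A - B\|_1$, and the only nonlinear constraint left is the conic one.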

So we have two Turing machines, one to find better and better lower bounds, and one to find better and better upper bounds. Running the two alternately, one constructs a third Turing machine that given ϵ>0 stops when the best lower bound is ϵ-close to the best upper bound found.
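The alternation of the two machines can be illustrated by the following sketch, in which each machine is modeled as a Python iterator producing bounds one at a time. The function name `squeeze` and the toy sequences in the usage example are assumptions made for illustration; the actual bound computations are those described above.

```python
from fractions import Fraction
from itertools import count
from typing import Iterator, Tuple


def squeeze(lower: Iterator, upper: Iterator, eps) -> Tuple:
    """Alternate between two bound generators until the gap is at most eps.

    `lower` yields lower bounds and `upper` yields upper bounds for the same
    unknown quantity; both are assumed to come arbitrarily close to it.
    """
    best_lo, best_hi = float("-inf"), float("inf")
    while True:
        best_lo = max(best_lo, next(lower))   # one step of the first machine
        best_hi = min(best_hi, next(upper))   # one step of the second machine
        if best_hi - best_lo <= eps:
            return best_lo, best_hi


# Usage with toy sequences converging to 1/2 from below and from above.
lows = (Fraction(1, 2) - Fraction(1, k) for k in count(1))
highs = (Fraction(1, 2) + Fraction(1, k) for k in count(1))
lo, hi = squeeze(lows, highs, Fraction(1, 10))
```

The loop terminates for every positive `eps` exactly because both sequences approach the common limit; if either machine stalled away from it, the procedure would run forever.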

Acknowledgements

We would like to thank Etienne de Klerk for pointing us to references [23, 33] and Stefan Krupp, Markus Kunze, and Fabrício Caluza Machado for reading an early version of the manuscript and providing useful comments. We are also grateful to the anonymous referees who read the paper carefully and made many useful suggestions and corrections. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Agreement Number 764759.

Footnotes

1

For more about nets, see Folland [16].

2

The results in this section are similar to those related to step kernels in the theory of graph limits of Lovász and Szegedy [28, §4.2].

3

First, the integral is well defined, as the composition $A \circ (\sigma \mapsto (\sigma x, \sigma y))$ is measurable, since $A$ is measurable and the map $\sigma \mapsto (\sigma x, \sigma y)$ is continuous by the continuity of the action of $\Gamma$. Second, the pushforward of the Haar measure is a finite measure. Then $L^2(V \times V) \subseteq L^1(V \times V)$ [16, Exercise 5, §6.1], and Tonelli’s theorem applied to the product measure on $(V \times V) \times \Gamma$ says that $(x, y) \mapsto \int_\Gamma |A(\sigma x, \sigma y)|\, d\mu(\sigma)$, and hence $R(A)(x, y)$, exists for almost all $(x, y) \in V \times V$. One checks similarly that $R(A) \in L^2(V \times V)$.

4

Since $\Gamma$ is compact and metrizable, it is separable. This implies that $L^2(\Gamma \times \Gamma)$ is separable, and hence $T(\Gamma)$ is metrizable; see Sect. 4.3.

5

For every $\epsilon > 0$, there is a compact set $B \subseteq \mathbb{R}^n$ such that $\bigl|f(x) - \int_B e^{iu \cdot x}\, d\nu(u)\bigr| < \epsilon$ for all $x \in \mathbb{R}^n$.

6

This follows e.g. from the asymptotic formula for the Bessel function [47, equation (1), §7.21] and is false for n=1.

7

Here we take the $L^1$ norm for the matrix $Z$ simply for convenience; except for the developments of Sect. 10.5, any norm will do.

8

In principle, we know that $(\alpha_N)$ has a weak*-convergent subnet, which is not necessarily a sequence. However, since $C_0([0, \infty))$ with the supremum norm is separable, the closed unit ball in $M([0, \infty))$ is second countable [16, p. 171, Exercise 50], and hence the sequence $(\alpha_N)$ has a weak*-convergent subsequence.

9

This is true for the Bessel function [47, Chapter XV].

10

This follows from Farkas’s Lemma. The vectors above form the rows of the constraint matrix of the finite linear-inequality system (49).

The first author was supported by CRM Applied Math Laboratory and NSERC Discovery Grant 2015-0674. Part of this research was carried out while the second author was at the Institute of Mathematics and Statistics of the University of São Paulo; the second author was partially supported by the São Paulo State Science Foundation (FAPESP) under Grant 2013/03447-6. The third author was partially supported by the SFB/TRR 191 “Symplectic Structures in Geometry, Algebra and Dynamics”, funded by the DFG.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Evan DeCorte, Email: pevdecorte@gmail.com.

Fernando Mário de Oliveira Filho, Email: fmario@gmail.com.

Frank Vallentin, Email: frank.vallentin@uni-koeln.de.

References

  • 1. Bachoc C, Nebe G, de Oliveira Filho FM, Vallentin F. Lower bounds for measurable chromatic numbers. Geom. Funct. Anal. 2009;19:645–661.
  • 2. Bachoc C, Passuello A, Thiery A. The density of sets avoiding distance 1 in Euclidean space. Discrete Comput. Geom. 2015;53:783–808.
  • 3. Barvinok A. A Course in Convexity. Graduate Studies in Mathematics 54. Providence, RI: American Mathematical Society; 2002.
  • 4. Bochner S. Hilbert distances and positive definite functions. Ann. Math. 1941;42:647–656.
  • 5. Bourgain J. A Szemerédi type theorem for sets of positive density in $\mathbb{R}^k$. Isr. J. Math. 1986;54:307–316.
  • 6. Bukh B. Measurable sets with excluded distances. Geom. Funct. Anal. 2008;18:668–697.
  • 7. Cohn H, Elkies N. New upper bounds on sphere packings I. Ann. Math. 2003;157:689–714.
  • 8. Cohn H, Kumar A, Miller SD, Radchenko D, Viazovska M. The sphere packing problem in dimension 24. Ann. Math. 2017;185:1017–1033.
  • 9. DeCorte E, Pikhurko O. Spherical sets avoiding a prescribed set of angles. Int. Math. Res. Not. 2016;20:6095–6117.
  • 10. Delsarte P, Goethals JM, Seidel JJ. Spherical codes and designs. Geom. Dedicata. 1977;6:363–388.
  • 11. Deza MM, Laurent M. Geometry of Cuts and Metrics. Algorithms and Combinatorics 15. Berlin: Springer; 1997.
  • 12. Dobre C, Dür ME, Frerick L, Vallentin F. A copositive formulation of the stability number of infinite graphs. Math. Program. Ser. A. 2016;160:65–83.
  • 13. Falconer KJ. The realization of distances in measurable subsets covering $\mathbb{R}^n$. J. Comb. Theory Ser. A. 1981;31:184–189.
  • 14. Falconer KJ, Marstrand JM. Plane sets with positive density at infinity contain all large distances. Bull. Lond. Math. Soc. 1986;18:471–474.
  • 15. Federer H. Geometric Measure Theory. Die Grundlehren der mathematischen Wissenschaften. New York: Springer; 1969.
  • 16. Folland GB. Real Analysis: Modern Techniques and Their Applications. New York: Wiley; 1999.
  • 17. Furstenberg H, Katznelson Y, Weiss B. Ergodic theory and configurations in sets of positive density. In: Nešetřil J, Rödl V, editors. Mathematics of Ramsey Theory. Berlin: Springer; 1990. pp. 184–198.
  • 18. Gaddum JW. Linear inequalities and quadratic forms. Pac. J. Math. 1958;8:411–414.
  • 19. Grötschel M, Lovász L, Schrijver A. Geometric Algorithms and Combinatorial Optimization, Algorithms and Combinatorics 2. Berlin: Springer; 1988.
  • 20. Kalai G. Some old and new problems in combinatorial geometry I: around Borsuk’s problem. In: Surveys in Combinatorics 2015, London Mathematical Society Lecture Note Series 424. Cambridge: Cambridge University Press; 2015. pp. 147–174.
  • 21. Karp RM. Reducibility among combinatorial problems. In: Miller RE, Thatcher JW, editors. Complexity of Computer Computations (Proceedings of a symposium on the Complexity of Computer Computations, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, 1972). New York: Plenum Press; 1972. pp. 85–103.
  • 22. Keleti T, Matolcsi M, de Oliveira Filho FM, Ruzsa IZ. Better bounds for planar sets avoiding unit distances. Discrete Comput. Geom. 2016;55:642–661.
  • 23. de Klerk E, Pasechnik DV. A linear programming reformulation of the standard quadratic optimization problem. J. Global Optim. 2007;37:75–84.
  • 24. de Klerk E, Vallentin F. On the Turing model complexity of interior point methods for semidefinite programming. SIAM J. Optim. 2016;26:1944–1961.
  • 25. de Laat D, Vallentin F. A semidefinite programming hierarchy for packing problems in discrete geometry. Math. Program. Ser. B. 2015;151:529–553.
  • 26. Larman DG, Rogers CA. The realization of distances within sets in Euclidean space. Mathematika. 1972;19:1–24.
  • 27. Lovász L. On the Shannon capacity of a graph. IEEE Trans. Inf. Theory. 1979;IT-25:1–7.
  • 28. Lovász L, Szegedy B. Limits of dense graph sequences. J. Comb. Theory Ser. B. 2006;96:933–957.
  • 29. Mattila P. Geometry of Sets and Measures in Euclidean Space: Fractals and Rectifiability, Cambridge Studies in Advanced Mathematics 44. Cambridge: Cambridge University Press; 1995.
  • 30. McEliece RJ, Rodemich ER, Rumsey HC. The Lovász bound and some generalizations. J. Comb. Inf. Syst. Sci. 1978;3:134–152.
  • 31. Milnor J. Curvatures of left invariant metrics on Lie groups. Adv. Math. 1976;21:293–329.
  • 32. Moser WOJ. Problems, problems, problems. Discrete Appl. Math. 1991;31:201–225.
  • 33. Motzkin TS, Straus EG. Maxima for graphs and a new proof of a theorem of Turán. Can. J. Math. 1965;17:533–540.
  • 34. de Oliveira Filho FM, Vallentin F. A counterexample to a conjecture of Larman and Rogers on sets avoiding distance 1. Mathematika. 2019;65:785.
  • 35. de Oliveira Filho FM, Vallentin F. A quantitative version of Steinhaus’s theorem for compact, connected, rank-one symmetric spaces. Geom. Dedicata. 2013;167:295–307.
  • 36. de Oliveira Filho FM, Vallentin F. Fourier analysis, linear programming, and densities of distance-avoiding sets in $\mathbb{R}^n$. J. Eur. Math. Soc. 2010;12:1417–1428.
  • 37. Padberg M. The Boolean quadric polytope: some characteristics, facets and relatives. Math. Program. Ser. B. 1989;45:139–172.
  • 38. Reed M, Simon B. Methods of Modern Mathematical Physics II: Fourier Analysis, Self-adjointness. New York: Academic Press; 1975.
  • 39. Schoenberg IJ. Metric spaces and completely monotone functions. Ann. Math. 1938;39:811–841.
  • 40. Schoenberg IJ. Positive definite functions on spheres. Duke Math. J. 1942;9:96–108.
  • 41. Schrijver A. A comparison of the Delsarte and Lovász bounds. IEEE Trans. Inf. Theory. 1979;IT-25:425–429.
  • 42. Schrijver A. Combinatorial Optimization: Polyhedra and Efficiency. Berlin: Springer; 2003.
  • 43. Simon B. Convexity: An Analytic Viewpoint, Cambridge Tracts in Mathematics 187. Cambridge: Cambridge University Press; 2011.
  • 44. Szegö G. Orthogonal Polynomials. American Mathematical Society Colloquium Publications Volume XXIII. 4th ed. Providence: American Mathematical Society; 1975.
  • 45. Székely LA. Erdős on unit distances and the Szemerédi–Trotter theorems. In: Halász G, Lovász L, Simonovits M, Sós VT, editors. Paul Erdős and His Mathematics II, Bolyai Society Mathematical Studies 11, János Bolyai Mathematical Society, Budapest. Berlin: Springer; 2002. pp. 646–666.
  • 46. Viazovska MS. The sphere packing problem in dimension 8. Ann. Math. 2017;185:991–1015.
  • 47. Watson GN. A Treatise on the Theory of Bessel Functions. Cambridge: Cambridge University Press; 1922.
  • 48. Witsenhausen HS. Spherical sets without orthogonal point pairs. Am. Math. Mon. 1974;81:1101–1102.

Articles from Mathematical Programming are provided here courtesy of Springer
