2021 Jul 1;192(1-2):339–360. doi: 10.1007/s10107-021-01674-7

Fair colorful k-center clustering

Xinrui Jia 1, Kshiteej Sheth 1, Ola Svensson 1
PMCID: PMC8907124  PMID: 35300155

Abstract

An instance of colorful k-center consists of points in a metric space that are colored red or blue, along with an integer k and a coverage requirement for each color. The goal is to find the smallest radius ρ such that there exist balls of radius ρ around k of the points that meet the coverage requirements. The motivation behind this problem is twofold. First, from fairness considerations: each color/group should receive a similar service guarantee, and second, from the algorithmic challenges it poses: this problem combines the difficulties of clustering with those of the subset-sum problem. In particular, we show that this combination results in strong integrality gap lower bounds for several natural linear programming relaxations. Our main result is an efficient approximation algorithm that overcomes these difficulties to achieve an approximation guarantee of 3, nearly matching the tight approximation guarantee of 2 for the classical k-center problem which this problem generalizes. Previous algorithms either opened more than k centers or only worked in the special case when the input points are in the plane.

Keywords: Approximation algorithms, k-center, Clustering and facility location, Fairness

Introduction

In the colorful k-center problem introduced in [5], we are given a set of n points P in a metric space partitioned into a set R of red points and a set B of blue points, along with parameters k, r, and b.

The goal is to find a set of k centers $C \subseteq P$ that minimizes ρ so that balls of radius ρ around each point in C cover at least r red points and at least b blue points.
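To make the objective concrete, the following minimal Python sketch checks whether a set of centers meets the coverage requirements at a given radius and finds the smallest feasible radius by exhaustive search. It assumes points are coordinate tuples under the Euclidean metric; the names `covered`, `is_feasible`, and `brute_force_radius` are illustrative and do not come from the paper. The search is exponential in k and is meant only for tiny instances.

```python
from itertools import combinations
from math import dist


def covered(points, centers, rho):
    """Indices of points within distance rho of at least one center."""
    return {i for i, p in enumerate(points)
            if any(dist(p, points[c]) <= rho for c in centers)}


def is_feasible(points, colors, centers, rho, r, b):
    """True if balls of radius rho around `centers` cover >= r red and >= b blue points."""
    cov = covered(points, centers, rho)
    red = sum(1 for i in cov if colors[i] == "red")
    blue = len(cov) - red
    return red >= r and blue >= b


def brute_force_radius(points, colors, k, r, b):
    """Smallest feasible radius and a witnessing center set (exhaustive search)."""
    # The optimal radius is one of the O(n^2) pairwise distances.
    radii = sorted({dist(p, q) for p in points for q in points})
    for rho in radii:
        for centers in combinations(range(len(points)), k):
            if is_feasible(points, colors, centers, rho, r, b):
                return rho, centers
    return None
```

For example, with two red points at (0,0) and (1,0), two blue points at (10,0) and (11,0), and a lone red point at (20,0), requiring two points of each color with k = 2 forces one center into each group rather than covering only red points.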

More generally, the points can be partitioned into ω color classes $C_1, \dots, C_\omega$, with coverage requirements $p_1, \dots, p_\omega$. To keep the exposition of our ideas as clean as possible, we concentrate the bulk of our discussion on the version with two colors. In Sect. 3 we show how our algorithm can be generalized to ω color classes in a rather straightforward way, with an exponential dependence on ω in the running time, thus giving a polynomial-time algorithm for constant ω.

This generalization of the classic k-center problem has applications in situations where fairness is a concern. For example, if a telecommunications company is required to provide service to at least 90% of the people in a country, it would be cost effective to only provide service in densely populated areas. This is at odds with the ideal that at least some people in every community should receive service. In the absence of color classes, an approximation algorithm could be “unfair” to some groups by completely considering them as outliers. The inception of fairness in clustering can be found in the recent paper [8] (see also [1, 4]), which uses a related but incomparable notion of fairness. Their notion of fairness requires each individual cluster to have a balanced number of points from each color class, which leads to very different algorithmic considerations and is motivated by other applications, such as “feature engineering”.

The other motive for studying the colorful k-center problem derives from the algorithmic challenges it poses. One can observe that it generalizes the k-center problem with outliers, which is equivalent to only having red points and needing to cover at least r of them. This outlier version is already more challenging than the classic k-center problem: only recent results give tight 2-approximation algorithms [6, 12], improving upon the 3-approximation guarantee of [7]. In contrast, such algorithms for the classic k-center problem have been known since the ’80s [10, 13]. That the approximation guarantee of 2 is tight, even for classic k-center, was proved in [14].

At the same time, a special case of subset-sum with polynomial-sized numbers is embedded within the colorful k-center problem. To see this, consider n numbers $a_1, \dots, a_n$ and let $A = \sum_{i=1}^{n} a_i$. Construct an instance of the colorful k-center problem with $r = k \cdot A + A/2$, $b = k \cdot A - A/2$, and for every $i \in \{1, \dots, n\}$, a ball of radius one containing $A + a_i$ red points and $A - a_i$ blue points. These balls are assumed to be far apart so that any single ball that covers two of these balls must have a very large radius. It is easy to see that the constructed colorful k-center instance has a solution of radius one if and only if there is a size k subset of the n numbers whose sum exactly equals $A/2$.
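The equivalence can be checked by brute force on small inputs. The sketch below is only an illustration: it represents each far-apart unit ball by its pair of (red, blue) counts rather than by actual points, and it assumes positive integers with an even total A so that A/2 is an integer. The function names are hypothetical.

```python
from itertools import combinations


def reduction(a, k):
    """Build the ball counts and coverage requirements (r, b) from numbers a."""
    A = sum(a)
    assert A % 2 == 0  # assumption of this sketch: A/2 is an integer
    balls = [(A + ai, A - ai) for ai in a]  # (red, blue) points per unit ball
    return balls, k * A + A // 2, k * A - A // 2


def radius_one_solution_exists(balls, k, r, b):
    """With far-apart balls, a radius-one solution picks k whole balls."""
    return any(
        sum(balls[i][0] for i in S) >= r and sum(balls[i][1] for i in S) >= b
        for S in combinations(range(len(balls)), k)
    )


def subset_sum_exists(a, k):
    """Is there a size-k subset of a summing to exactly A/2?"""
    return any(sum(S) == sum(a) // 2 for S in combinations(a, k))
```

Selecting a set S of k balls covers $kA + \sum_{i \in S} a_i$ red and $kA - \sum_{i \in S} a_i$ blue points, so both requirements hold exactly when the selected numbers sum to $A/2$.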

We use this connection to subset-sum to show that the standard linear programming (LP) relaxation of the colorful k-center problem has an unbounded integrality gap even after a linear number of rounds of the powerful Lasserre/Sum-of-Squares hierarchy (see Sect. 4.1). We remark that the standard linear programming relaxation gives a 2-approximation algorithm for the outliers version even without applying lift-and-project methods. Another natural approach for strengthening the standard linear programming relaxation is to add flow-based inequalities specially designed to solve subset-sum problems. However, in Sect. 4.2, we prove that they do not improve the integrality gap due to the clustering feature of the problem. This shows that clustering and the subset-sum problem are intricately related in colorful k-center. This interplay makes the problem more complex, and prior to our work only a randomized constant-factor approximation algorithm was known when the points are in $\mathbb{R}^2$ with an approximation guarantee greater than 6 [5].

Our main result overcomes these difficulties and we give a nearly tight approximation guarantee:

Theorem 1

There is a 3-approximation algorithm for the colorful k-center problem.

As mentioned above, our techniques can be easily extended to a constant number of color classes but we restrict the discussion here to two colors.

On a very high level, our algorithm manages to decouple the clustering and the subset-sum aspects. First, our algorithm guesses certain centers of the optimal solution that it then uses to partition the point set into a “dense” part $P_d$ and a “sparse” part $P_s$. The dense part is clustered using a subset-sum instance while the sparse set is clustered using the techniques of Bandyapadhyay, Inamdar, Pai, and Varadarajan [5] (see Sect. 2.1). Specifically, we use the pseudo-approximation of [5] that satisfies the coverage requirements using $k+1$ balls of at most twice the optimal radius.

While our approximation guarantee is nearly tight, it remains an interesting open problem to give a 2-approximation algorithm or to show that the ratio 3 is tight. One possible direction is to understand the strength of the relaxation obtained by combining the Lasserre/Sum-of-Squares hierarchy with the flow constraints. While we show that individually they do not improve the integrality gap, we believe that their combination can lead to a strong relaxation.

Independent work. Independently of and concurrently with our work, the authors of [2] obtained a 4-approximation algorithm for the colorful k-center problem with $\omega = O(1)$ and running time $|P|^{O(\omega)}$ using different techniques than the ones described in this work. Furthermore, they show that, assuming $P \neq NP$, if ω is allowed to be unbounded then the colorful k-center problem admits no algorithm guaranteeing a finite approximation. They also show that, assuming the Exponential Time Hypothesis, colorful k-center is inapproximable if ω grows faster than $\log n$.

Organization. We begin by giving some notation and definitions and describing the pseudo-approximation algorithm in [5]. We then describe a 2-approximation algorithm on a certain class of instances that are well-separated, from which the 3-approximation follows almost immediately. This 2-approximation proceeds in two phases: the first is dedicated to the guessing of certain centers, while the second processes the dense and sparse sets. Section 3 explains the generalization to ω color classes. In Sect. 4 we present our integrality gaps under the Sum-of-Squares hierarchy and additional constraints deriving from a flow network to solve subset-sums.

A 3-approximation algorithm

In this section we present our 3-approximation algorithm. We briefly describe the pseudo-approximation algorithm of Bandyapadhyay et al. [5] since we use it as a subroutine in our algorithm.

Notation. We assume that our problem instance is normalized to have an optimal radius of one and we refer to the set of centers in an optimal solution as OPT. The set of all points at distance at most ρ from a point j is denoted by $B(j, \rho)$ and we refer to this set as a ball of radius ρ at j. We write $B(j)$ for $B(j, 1)$. By a ball of OPT we mean $B(j)$ for some $j \in \mathrm{OPT}$.

The pseudo-approximation algorithm

The algorithm of Bandyapadhyay et al. [5] first guesses the optimal radius for the instance (there are at most $O(n^2)$ distinct values the optimal radius can take), which we assume by normalization to be one, and considers the natural LP relaxation LP1 depicted on the left in Fig. 1. The variable $x_i$ indicates the extent to which point i is fractionally opened as a center and $z_i$ indicates the extent to which i is covered by centers.

Fig. 1. The linear programs used in the pseudo-approximation algorithm

Given a fractional solution to LP1, the algorithm of [5] finds a clustering of the points. The clusters that are produced are of radius two, and with a simple modification (details can be found in Appendix 2), can be made to have a special structure that we call a flower:

Definition 1

For $j \in P$, a flower centered at j is the set $F(j) = \bigcup_{i \in B(j)} B(i)$.
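As a small illustration of this definition, the sketch below computes unit balls and flowers for points given as coordinate tuples under the Euclidean metric. The representation and the helper names `ball` and `flower` are assumptions made for this example only.

```python
from math import dist


def ball(points, j, rho=1.0):
    """B(j, rho): indices of all points at distance at most rho from point j."""
    return {i for i, p in enumerate(points) if dist(p, points[j]) <= rho}


def flower(points, j):
    """F(j): union of the unit balls B(i) over all i in B(j)."""
    return set().union(*(ball(points, i) for i in ball(points, j)))
```

By the triangle inequality every flower F(j) is contained in the ball B(j, 2), which is why clusters that are subsets of flowers have radius at most two.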

More specifically, given a fractional solution $(x, z)$ to LP1, the clustering algorithm in [5] produces a set of points $S \subseteq P$ and a cluster $C_j \subseteq P$ for every $j \in S$ such that:

  1. The set S is a subset of the points $\{j \in P : z_j > 0\}$ with positive z-values.

  2. For each $j \in S$, we have $C_j \subseteq F(j)$ and the clusters $\{C_j\}_{j \in S}$ are pairwise disjoint.

  3. If we let $r_j = |C_j \cap R|$ and $b_j = |C_j \cap B|$ for $j \in S$, then the linear program LP2 (depicted on the right in Fig. 1) has a feasible solution y of value at least r.

As LP2 has only two non-trivial constraints, any extreme point will have at most two variables attaining strictly fractional values. So at most $k+1$ variables of y are non-zero. The pseudo-approximation of [5] now simply takes those non-zero points as centers. Since each flower is of radius two, this gives a 2-approximation algorithm that opens at most $k+1$ centers. (Note that, as the clusters $\{C_j\}_{j \in S}$ are pairwise disjoint, at least b blue points are covered, and at least r red points are covered since the value of the solution is at least r.)

Obtaining a constant-factor approximation algorithm that only opens k centers turns out to be significantly more challenging.

Nevertheless, the above techniques form an important subroutine in our algorithm. Given a fractional solution $(x, z)$ to LP1, we proceed as above to find S and an extreme point to LP2 of value at least r. However, instead of selecting all points with positive y-value, we, in the case of two fractional values, only select the one whose cluster covers more blue points. This gives us a solution of at most k centers whose clusters cover at least b blue points. Furthermore, the number of red points that are covered is at least $r - \max_{j \in S} r_j$ since we disregarded at most one center. As $S \subseteq \{j : z_j > 0\}$ (see first property above) and $C_j \subseteq F(j)$ (see second property above), we have $\max_{j \in S} r_j \le \max_{j : z_j > 0} |F(j) \cap R|$. We summarize the obtained properties in the following lemma.

Lemma 1

Given a fractional solution $(x, z)$ to LP1, there is a polynomial-time algorithm that outputs at most k clusters of radius two that cover at least b blue points and at least $r - \max_{j : z_j > 0} |F(j) \cap R|$ red points.
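The rounding step behind this lemma can be sketched as follows. The code assumes an extreme point y of LP2 is already available as a dictionary (solving the LP is not shown), together with the red and blue counts of each cluster; the function name and data layout are hypothetical. It keeps every integrally opened center and, among the at most two fractional ones, only the one whose cluster covers more blue points.

```python
def round_lp2_extreme_point(y, blue, eps=1e-9):
    """Round an extreme point of LP2 to a set of centers.

    y    -- dict: cluster center j -> fractional opening y_j in [0, 1]
    blue -- dict: cluster center j -> number of blue points b_j in cluster C_j
    """
    integral = [j for j, v in y.items() if v >= 1 - eps]
    fractional = [j for j, v in y.items() if eps < v < 1 - eps]
    # An extreme point of LP2 has at most two strictly fractional values.
    assert len(fractional) <= 2, "input is not an extreme point of LP2"
    chosen = list(integral)
    if fractional:
        # Keep only the fractional center whose cluster has more blue points.
        chosen.append(max(fractional, key=lambda j: blue[j]))
    return chosen
```

In a toy instance with k = 1, b = 2, and two clusters with (red, blue) counts (4, 0) and (0, 4), the LP optimum opens each cluster to extent 1/2; rounding keeps the blue-heavy cluster, which meets the blue requirement and loses at most the red count of the one discarded cluster.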

We can thus find a 2-approximate solution that covers sufficiently many blue points but may cover fewer red points than necessary. The idea now is that, if the number of red points in any cluster is not too large, i.e., $\max_{j : z_j > 0} |F(j) \cap R|$ is “small”, then we can hope to meet the coverage requirements for the red points by increasing the radius around some opened centers. Our algorithm builds on this intuition to get a 2-approximation algorithm using at most k centers for well-separated instances as defined below.

Definition 2

An instance of colorful k-center is well-separated if there does not exist a ball of radius three that covers at least two balls of OPT.

Our main result of this section can now be stated as follows:

Theorem 2

There is a 2-approximation algorithm for well-separated instances.

The above theorem immediately implies Theorem 1, i.e., the 3-approximation algorithm for general instances. Indeed, if the instance is not well-separated, we can find a ball of radius three that covers at least two balls of OPT by trying all n points and running the pseudo-approximation of [5] on the remaining uncovered points with $k-2$ centers. A more formal description of the algorithm is as follows:

[Algorithm listing (image 10107_2021_1674_Figa_HTML.jpg)]

In the correct iteration, this gives us at most $k-1$ centers of radius two, which when combined with the ball of radius three that covers two balls of OPT, is a 3-approximation.

Our algorithm for well-separated instances now proceeds in two phases with the objective of finding a subset of P on which the pseudo-approximation algorithm produces subsets of flowers containing not too many red points. In addition, we maintain a partial solution set of centers (some guessed in the first phase), so that we can expand the radius around these centers to recover the deficit of red points from closing one of the fractional centers.

Phase I

In this phase we will guess some balls of OPT that can be used to construct a bound on $\max_{j : z_j > 0} |R \cap F(j)|$. To achieve this, we define the notion of $\mathrm{Gain}(p, q)$ for any point $p \in P$ and $q \in B(p)$.

Definition 3

For any $p \in P$ and $q \in B(p)$, let

$\mathrm{Gain}(p, q) := R \cap F(q) \setminus B(p)$

be the set of red points added to B(p) by forming a flower centered at q.
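The definition translates directly into set operations. The following sketch again assumes points are coordinate tuples under the Euclidean metric and that colors are given as a list of strings; `ball`, `flower`, and `gain` are illustrative names.

```python
from math import dist


def ball(points, j, rho=1.0):
    """B(j, rho): indices of points within distance rho of point j."""
    return {i for i, p in enumerate(points) if dist(p, points[j]) <= rho}


def flower(points, j):
    """F(j): union of the unit balls around every point of B(j)."""
    return set().union(*(ball(points, i) for i in ball(points, j)))


def gain(points, colors, p, q):
    """Gain(p, q): red points of the flower F(q) that lie outside B(p)."""
    assert q in ball(points, p), "Gain(p, q) is defined only for q in B(p)"
    return {i for i in flower(points, q) - ball(points, p) if colors[i] == "red"}
```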

Our algorithm in this phase proceeds by guessing three centers $c_1, c_2, c_3$ of the optimal solution OPT:

[Algorithm listing (image 10107_2021_1674_Figb_HTML.jpg)]

The time it takes to guess $c_1, c_2$, and $c_3$ is $O(n^3)$ and for each $c_i$ we find the $q_i \in B(c_i)$ such that $|\mathrm{Gain}(c_i, q_i) \cap P_i|$ is maximized by trying all points in $B(c_i)$ (at most n many).

For notation, define $\mathrm{Guess} := \bigcup_{i=1}^{3} B(c_i)$ and let

$\tau = |\mathrm{Gain}(c_3, q_3) \cap P_3|.$
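For one fixed guess of the three centers, the bookkeeping of this phase can be sketched as below. The full algorithm tries all $O(n^3)$ triples; this sketch processes a single triple. It assumes $P_1 = P$ and $P_{i+1} = P_i \setminus F(q_i)$, which is consistent with the later definition of $P_4$ but is stated here as an assumption because the algorithm listing is not available. All function names are hypothetical, and ties between candidate points $q_i$ are broken by smallest index.

```python
from math import dist


def ball(points, j, rho=1.0):
    return {i for i, p in enumerate(points) if dist(p, points[j]) <= rho}


def flower(points, j):
    return set().union(*(ball(points, i) for i in ball(points, j)))


def red_gain(points, colors, p, q, subset):
    """Gain(p, q) restricted to `subset`: red points of F(q) outside B(p)."""
    region = (flower(points, q) - ball(points, p)) & subset
    return {i for i in region if colors[i] == "red"}


def phase_one(points, colors, guessed):
    """Return ([q1, q2, q3], tau, P4) for the guessed centers (c1, c2, c3)."""
    remaining = set(range(len(points)))  # P_1 = P
    qs, sizes = [], []
    for c in guessed:
        # Try every q in B(c) and keep the one maximizing |Gain(c, q) ∩ P_i|.
        best_q = max(sorted(ball(points, c)),
                     key=lambda q: len(red_gain(points, colors, c, q, remaining)))
        sizes.append(len(red_gain(points, colors, c, best_q, remaining)))
        qs.append(best_q)
        remaining = remaining - flower(points, best_q)  # assumed update of P_i
    return qs, sizes[-1], remaining  # tau is the gain of the third center
```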

The reason for guessing three points is that later we lose up to 3τ red points after closing one extra center opened by running the pseudo-approximation on a pre-processed instance (see Lemma 4).

The important properties guaranteed by the first phase are summarized in the following lemma.

Lemma 2

Assuming that $c_1, c_2$, and $c_3$ are guessed correctly, we have that

  1. the $k-3$ balls of radius one in $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ are contained in $P_4$ and cover $b - |B \cap \mathrm{Guess}|$ blue points and $r - |R \cap \mathrm{Guess}|$ red points; and

  2. the three clusters $F(q_1), F(q_2)$, and $F(q_3)$ are contained in $P \setminus P_4$ and cover at least $|B \cap \mathrm{Guess}|$ blue points and at least $|R \cap \mathrm{Guess}| + 3 \cdot \tau$ red points.

Proof

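(1) We claim that the intersection of any ball of $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ with $F(q_i)$ in P is empty, for all $1 \le i \le 3$. Then the $k-3$ balls in $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ satisfy the statement of (1). To prove the claim, suppose that there is $p \in \mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ such that $B(p) \cap F(q_i) \neq \emptyset$ for some $1 \le i \le 3$. Note that $F(q_i) = \bigcup_{j \in B(q_i)} B(j)$, so this implies that $B(p) \cap B(q) \neq \emptyset$, for some $q \in B(q_i)$. Hence, a ball of radius three around q covers both $B(p)$ and $B(c_i)$ as $c_i \in B(q_i)$, which contradicts that the instance is well-separated.

(2) Note that for $1 \le i \le 3$, $B(c_i) \cup \mathrm{Gain}(c_i, q_i) \subseteq F(q_i)$, and that $B(c_i)$ and $\mathrm{Gain}(c_i, q_i)$ are disjoint. The balls $B(c_i)$ cover at least $|B \cap \mathrm{Guess}|$ blue points and $|R \cap \mathrm{Guess}|$ red points, while $\sum_{i=1}^{3} |\mathrm{Gain}(c_i, q_i) \cap P_i| \ge 3\tau$.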

Phase II

Throughout this section we assume $c_1, c_2$, and $c_3$ have been guessed correctly in Phase I so that the properties of Lemma 2 hold. Furthermore, by the selection and the definition of τ, we also have

$|\mathrm{Gain}(p, q) \cap P_4| \le \tau$ for any $p \in P_4 \cap \mathrm{OPT}$ and $q \in B(p) \cap P_4$.   (1)

This implies that $F(p) \setminus B(p)$ contains at most τ red points of $P_4$. However, to apply Lemma 1 we need that the number of red points of $P_4$ in the whole flower $F(p)$ is bounded. To deal with balls with many more than τ red points, we will iteratively remove dense sets from $P_4$ to obtain a subset $P_s$ of sparse points.

Definition 4

When considering a subset of the points $P_s \subseteq P$, we say that a point $j \in P_s$ is dense if the ball $B(j)$ contains strictly more than $2 \cdot \tau$ red points of $P_s$. For a dense point j, we also let $I_j \subseteq P_s$ contain those points $i \in P_s$ whose intersection $B(i) \cap B(j)$ contains strictly more than τ red points of $P_s$.

We remark that in the above definition, we have in particular that $j \in I_j$ for a dense point $j \in P_s$. Our iterative procedure now works as follows:

[Algorithm listing (image 10107_2021_1674_Figc_HTML.jpg)]

Let $P_d = P_4 \setminus P_s$ denote those points that were removed from $P_4$. We will cluster the two sets $P_s$ and $P_d$ of points separately. Indeed, the following lemma says that a center in $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ either covers points in $P_s$ or $P_d$ but not points from both sets. Recall that $D_j$ denotes the set of points that are removed from $P_s$ in the iteration when j was selected and so $P_d = \bigcup_j D_j$.
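A possible reading of the iterative removal is sketched below. Because the algorithm listing is not available, the update rule is an assumption inferred from the surrounding text: while some point j of the current set is dense, record $I_j$, remove $D_j = \bigcup_{i \in I_j} B(i)$ restricted to the current set, and repeat. Function names and the choice of the smallest-index dense point are illustrative.

```python
from math import dist


def ball(points, j, rho=1.0):
    return {i for i, p in enumerate(points) if dist(p, points[j]) <= rho}


def remove_dense(points, colors, P4, tau):
    """Split P4 into the sparse part P_s and a list of removed groups (I_j, D_j)."""
    Ps = set(P4)
    groups = []

    def red_count(S):
        return sum(1 for i in S if colors[i] == "red")

    while True:
        # A point j is dense if B(j) holds more than 2*tau red points of P_s.
        dense = [j for j in sorted(Ps) if red_count(ball(points, j) & Ps) > 2 * tau]
        if not dense:
            break
        j = dense[0]
        Bj = ball(points, j)
        # I_j: points whose ball shares more than tau red points of P_s with B(j).
        I_j = {i for i in Ps if red_count(ball(points, i) & Bj & Ps) > tau}
        # D_j: assumed to be the points of P_s covered by the balls around I_j.
        D_j = set().union(*(ball(points, i) for i in I_j)) & Ps
        groups.append((I_j, D_j))
        Ps -= D_j
    return Ps, groups
```

The loop terminates because a dense point j always belongs to its own $I_j$, so each iteration removes at least j from the current set.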

Lemma 3

For any $c \in \mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ and any $I_j \in I$, either $c \in I_j$ or $B(c) \cap D_j = \emptyset$.

Proof

Let $c \in \mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$, $I_j \in I$, and suppose $c \notin I_j$. If $B(c) \cap D_j \neq \emptyset$, there is a point p in the intersection $B(c) \cap B(i)$ for some $i \in I_j$. Suppose first that $B(c) \cap B(j) \neq \emptyset$. Then, since $c \notin I_j$, the intersection $B(c) \cap B(j)$ contains at most τ red points from $D_j$ (recall that $D_j$ contains the points of $B(j)$ in $P_s$ at the time j was selected). But by the definition of dense points, $B(j) \cap D_j$ has more than $2 \cdot \tau$ red points, so $(B(j) \setminus B(c)) \cap D_j$ has more than τ red points. This region is a subset of $\mathrm{Gain}(c, p) \cap P_4$, which contradicts (1). This is shown in Fig. 2a. Now consider the second case when $B(c) \cap B(j) = \emptyset$ and there is a point p in the intersection $B(c) \cap B(i)$ for some $i \in I_j$ and $i \neq j$. Then, by the definition of $I_j$, $B(i) \cap B(j)$ has more than τ red points of $D_j$. However, this is also a subset of $\mathrm{Gain}(c, p) \cap P_4$ so we reach the same contradiction. See Fig. 2b.

Fig. 2. The shaded regions are subsets of $\mathrm{Gain}(c, p)$, which contain the darkly shaded regions that have more than τ red points

Our algorithm now proceeds by guessing the number $k_d$ of balls of $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ contained in $P_d$. We also guess the numbers $r_d$ and $b_d$ of red and blue points, respectively, that these balls cover in $P_d$. Note that after guessing $k_d$, we know that the number of balls in $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ contained in $P_s$ equals $k_s = k - 3 - k_d$. Furthermore, by the first property of Lemma 2, these balls cover at least $b_s = b - |B \cap \mathrm{Guess}| - b_d$ blue points in $P_s$ and at least $r_s = r - |R \cap \mathrm{Guess}| - r_d$ red points in $P_s$. As there are $O(n^3)$ possible values of $k_d, b_d$, and $r_d$ (each can take a value between 0 and n) we can try all possibilities by increasing the running time by a multiplicative factor of $O(n^3)$. Henceforth, we therefore assume that we have guessed those parameters correctly. In that case, we show that we can recover an equally good solution for $P_d$ and a solution for $P_s$ that covers $b_s$ blue points and almost $r_s$ red points:

Lemma 4

There exist two polynomial-time algorithms $A_d$ and $A_s$ such that if $k_d, r_d$, and $b_d$ are guessed correctly then

  • $A_d$ returns $k_d$ balls of radius one that cover $b_d$ blue points of $P_d$ and $r_d$ red points of $P_d$;

  • $A_s$ returns $k_s$ balls of radius two that cover at least $b_s$ blue points of $P_s$ and at least $r_s - 3 \cdot \tau$ red points of $P_s$.

Proof

We first describe and analyze the algorithm Ad followed by As.

The algorithm $A_d$ for the dense point set $P_d$. By Lemma 3, we have that all $k_d$ balls in $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ that cover points in $P_d$ are centered at points in $\bigcup_j I_j$. Furthermore, we have that each $I_j$ contains at most one center of OPT. This is because every $i \in I_j$ is such that $B(i) \cap B(j) \neq \emptyset$ and so, by the triangle inequality, $B(j, 3)$ contains all balls $\{B(i)\}_{i \in I_j}$. Hence, by the assumption that the instance is well-separated, the set $I_j$ contains at most one center of OPT.

We now reduce our problem to a 3-dimensional subset-sum problem.

For each $I_j \in I$, form a group consisting of an item for each $p \in I_j$. The item corresponding to $p \in I_j$ has the 3-dimensional value vector $(1, |B(p) \cap D_j \cap B|, |B(p) \cap D_j \cap R|)$. Our goal is to find $k_d$ items such that at most one item per group is selected and their 3-dimensional vectors sum up to $(k_d, b_d, r_d)$. Such a solution, if it exists, can be found by standard dynamic programming that has a table of size $O(n^4)$. For completeness, we provide the recurrence and precise details of this standard technique in Appendix 1. Furthermore, since the $D_j$’s are disjoint by definition, this gives $k_d$ centers that cover $b_d$ blue points and $r_d$ red points in $P_d$, as required in the statement of the lemma.
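The grouped subset-sum problem described above can be solved by a reachability-style dynamic program. The sketch below is a generic version of that standard technique and is not taken from Appendix 1: each item is given as a (blue, red) pair, the count coordinate is implicit, and the table is kept as a dictionary from reachable states to one selection achieving them. The function name and data layout are hypothetical.

```python
def grouped_subset_sum(groups, kd, bd, rd):
    """Pick exactly kd items, at most one per group, summing to (bd, rd).

    groups -- list of groups, each a list of (blue, red) item values
    Returns a list of (group_index, item_index) pairs, or None if impossible.
    """
    # reachable[(count, blue_sum, red_sum)] = one selection reaching that state
    reachable = {(0, 0, 0): []}
    for g, items in enumerate(groups):
        nxt = dict(reachable)  # option: take nothing from group g
        for (cnt, blue_sum, red_sum), choice in reachable.items():
            for t, (item_blue, item_red) in enumerate(items):
                state = (cnt + 1, blue_sum + item_blue, red_sum + item_red)
                # Values are non-negative, so overshooting states can be pruned.
                within = state[0] <= kd and state[1] <= bd and state[2] <= rd
                if within and state not in nxt:
                    nxt[state] = choice + [(g, t)]
        reachable = nxt
    return reachable.get((kd, bd, rd))
```

Since every coordinate is bounded by n, there are $O(n^3)$ states per group, which matches the $O(n^4)$ table size once the group index is counted.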

It remains to show that such a solution exists. Let $o_1, o_2, \dots, o_{k_d}$ denote the centers of the balls in $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ that cover points in $P_d$. Furthermore, let $I_{j_1}, \dots, I_{j_{k_d}}$ be the sets in I such that $o_i \in I_{j_i}$ for $i \in \{1, \dots, k_d\}$. Notice that by Lemma 3 we have that $B(o_i)$ is disjoint from $P_d \setminus D_{j_i}$, i.e., $B(o_i)$ is contained in $D_{j_i}$. It follows that the 3-dimensional vector corresponding to an OPT center $o_i$ equals $(1, |B(o_i) \cap D_{j_i} \cap B|, |B(o_i) \cap D_{j_i} \cap R|)$. This is equivalent to just $(1, |B(o_i) \cap B|, |B(o_i) \cap R|)$ and so the definition of the value vectors does indeed give the correct contribution of points. Therefore, the sum of these vectors corresponding to $o_1, \dots, o_{k_d}$ results in the vector $(k_d, b_d, r_d)$, where we used that our guesses of $k_d, b_d$, and $r_d$ were correct.

The algorithm $A_s$ for the sparse point set $P_s$. Assuming that the guesses are correct we have that $\mathrm{OPT} \setminus \{c_i\}_{i=1}^{3}$ contains $k_s$ balls that cover $b_s$ blue points of $P_s$ and $r_s$ red points of $P_s$. Hence, LP1 has a feasible solution $(x, z)$ to the instance defined by the point set $P_s$, the number of balls $k_s$, and the constraints $b_s$ and $r_s$ on the number of blue and red points to be covered, respectively. Lemma 1 then says that we can in polynomial time find $k_s$ balls of radius two such that at least $b_s$ blue points of $P_s$ are covered and at least

$r_s - \max_{j : z_j > 0} |F(j) \cap R|$

red points of $P_s$ are covered. Here, $F(j)$ refers to the flower restricted to the point set $P_s$.

To prove the second part of Lemma 4, it is thus sufficient to show that LP1 has a feasible solution where $z_j = 0$ for all $j \in P_s$ such that $|F(j) \cap R| > 3 \cdot \tau$. In turn, this follows by showing that, for any such $j \in P_s$ with $|F(j) \cap R| > 3 \cdot \tau$, no point in $B(j)$ is in OPT (since then $z_j = 0$ in the integral solution corresponding to OPT). Such a feasible solution can be found by adding the constraints $x_i = 0$ for all $i \in B(j)$, for all such j, to LP1.

To see why this holds, suppose towards a contradiction that there is a $c \in \mathrm{OPT}$ such that $c \in B(j)$. First, since there are no dense points in $P_s$, we have that the number of red points in $B(c) \cap P_s$ is at most $2 \cdot \tau$. Therefore the number of red points of $P_s$ in $F(j) \setminus B(c)$ is strictly more than τ. In other words, we have $\tau < |\mathrm{Gain}(c, j) \cap P_s| \le |\mathrm{Gain}(c, j) \cap P_4|$ which contradicts (1).

Equipped with the above lemma we are now ready to finalize the proof of Theorem 2.

Proof of Theorem 2

Our algorithm guesses the optimal radius and the centers $c_1, c_2, c_3$ in Phase I, and $k_d, r_d, b_d$ in Phase II. There are at most $n^2$ choices of the optimal radius, n choices for each $c_i$, and $n+1$ choices for each of $k_d, r_d, b_d$ (ranging from 0 to n). We can thus try all these possibilities in polynomial time and, since all other steps in our algorithm run in polynomial time, the total running time will be polynomial. The algorithm tries all these guesses and outputs the best solution found over all choices. For the correct guesses, we output a solution with $3 + k_d + k_s = k$ balls of radius at most two. Furthermore, by the second property of Lemma 2 and the two properties of Lemma 4, we have that

  • the number of blue points covered is at least $|B \cap \mathrm{Guess}| + b_d + b_s = b$; and

  • the number of red points covered is at least $|R \cap \mathrm{Guess}| + 3\tau + r_d + r_s - 3\tau = r$.

We have thus given a polynomial-time algorithm that returns a solution where the balls are of radius at most twice the optimal radius.

Constant number of colors

Our algorithm extends easily to a constant number ω of color classes $C_1, \dots, C_\omega$ with coverage requirements $p_1, \dots, p_\omega$. We use the LPs in Fig. 3 for a general number of colors, where $p_{j,i}$ in LP2(ω) indicates the number of points of color class i in cluster $j \in S$. Here S is the set of cluster centers obtained by applying the modified clustering algorithm in Appendix 2 to instances with ω color classes. LP2(ω) has only ω non-trivial constraints, so any extreme point has at most ω variables attaining strictly fractional values, and a feasible solution attaining objective value at least $p_1$ will have at most $k + \omega - 1$ positive values. By rounding up to 1 the fractional value of the center that contains the largest number of points of $C_\omega$, we can cover $p_\omega$ points of $C_\omega$. We would like to be able to close the remaining fractional centers, so we apply a procedure analogous to the one used for just two colors.

Fig. 3. Linear programs for ω color classes

We can guess $3(\omega-1)$ centers of OPT for each of the $\omega-1$ colors whose coverage requirements are to be satisfied. Then we bound the number of points of each color that may be found in a cluster, by removing dense sets that contain too many points of any one color and running a dynamic program on the removed sets. The final step is to run the clustering algorithm of [5] on the remaining points, round to one the fractional center with the largest number of points of $C_1$, and close all other fractional centers.

In particular, we get a running time with a factor of $n^{O(\omega^2)}$. The remainder of this section gives a formal description of the algorithm for ω color classes.

Formal algorithm for ω colors

The following is a natural generalization of Lemma 1 and summarizes the main properties of the clustering algorithm of Appendix 2 for instances with ω color classes.

Lemma 1′

Given a fractional solution $(x, z)$ to LP1(ω), there is a polynomial-time algorithm that outputs at most k clusters of radius two that cover at least $p_1$ points of $C_1$, and at least $p_i - (\omega-1) \max_{j : z_j > 0} |F(j) \cap C_i|$ points of $C_i$ for $2 \le i \le \omega$.

Since we may not meet the coverage requirements for $\omega-1$ color classes, it is necessary to guess some balls of OPT for each of those colors, and for each fractional center. In total we guess $3(\omega-1)^2$ points of OPT as follows:

[Algorithm listing (image 10107_2021_1674_Figd_HTML.jpg)]

This guessing takes $O(n^{3(\omega-1)^2})$ rounds. It is possible that some $c_{j,i}$ coincide, but this does not affect the correctness of the algorithm. In fact, this can only improve the solution, in the sense that the coverage requirements will be met with fewer than k centers. Let $k_c$ denote the number of distinct $c_{j,i}$ obtained in the correct guess. For notation, define

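$\mathrm{Guess} := \bigcup_{j=2}^{\omega} \bigcup_{i=1}^{3(\omega-1)} B(c_{j,i}), \qquad \tau_j = |C_j \cap \mathrm{Gain}(c_{j,3(\omega-1)}, q_{j,3(\omega-1)}) \cap P_{j,3(\omega-1)}|.$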

To be consistent with previous notation, let

$P_4 := P \setminus \bigcup_{j=2}^{\omega} \bigcup_{i=1}^{3(\omega-1)} F(q_{j,i}).$

The important properties guaranteed by the first phase can be summarized in the following lemma whose proof is the natural extension of Lemma 2.

Lemma 2′

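Assuming that $c_{j,i}$ are guessed correctly, we have that

  1. the $k - 3(\omega-1)^2$ balls of radius one in $\mathrm{OPT} \setminus \bigcup_{j=2}^{\omega} \bigcup_{i=1}^{3(\omega-1)} \{c_{j,i}\}$ are contained in $P_4$ and cover $p_\omega - |C_\omega \cap \mathrm{Guess}|$ points of $C_\omega$ and $p_j - |C_j \cap \mathrm{Guess}|$ points of $C_j$ for $j = 2, \dots, \omega$; and

  2. the clusters $F(q_{j,i})$ are contained in $P \setminus P_{3(\omega-1)+1}$ and cover at least $|C_\omega \cap \mathrm{Guess}|$ points of $C_\omega$ and at least $|C_j \cap \mathrm{Guess}| + 3(\omega-1) \cdot \tau_j$ points of $C_j$.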

Now we need to remove points which contain many points from any one of the color classes to partition the instance into dense and sparse parts which leads to the following generalized definition of dense points.

Definition 4′

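When considering a subset of the points $P_s \subseteq P$, we say that a point $p \in P_s$ is j-dense if $|C_j \cap B(p) \cap P_s| > 2\tau_j$. For a j-dense point p, we also let $I_p \subseteq P_s$ contain those points $i \in P_s$ such that $|C_j \cap B(i) \cap B(p) \cap P_s| > \tau_j$, for every $2 \le j \le \omega$.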

Now we perform a similar iterative procedure as for two colors:

[Algorithm listing (image 10107_2021_1674_Fige_HTML.jpg)]

As in the case of two colors, set $P_d = P_{3(\omega-1)} \setminus P_s$. By naturally extending Lemma 3 and its proof, we can ensure that any ball of $\mathrm{OPT} \setminus \bigcup_{j=2}^{\omega} \bigcup_{i=1}^{3(\omega-1)} \{c_{j,i}\}$ is completely contained in either $P_d$ or $P_s$. We guess the number $k_d$ of such balls of OPT contained in $P_d$, and guess the numbers $d_1, \dots, d_\omega$ of points of $C_1, \dots, C_\omega$ covered by these balls in $P_d$. There are $O(n^{\omega+1})$ possible values of $k_d, d_1, \dots, d_\omega$ and all the possibilities can be tried by increasing the running time by a multiplicative factor. The number of balls of $\mathrm{OPT} \setminus \bigcup_{j=2}^{\omega} \bigcup_{i=1}^{3(\omega-1)} \{c_{j,i}\}$ contained in $P_s$ is given by $k_s = k - k_c - k_d$ and these balls cover at least $s_j = p_j - |C_j \cap \mathrm{Guessall}| - d_j$ points of $C_j$ in $P_s$, $1 \le j \le \omega$.

Assuming that the parameters are guessed correctly we can show, similar to Lemma 4, that the following holds.

Lemma 4′

There exist two polynomial-time algorithms $A'_d$ and $A'_s$ such that if $k_d, d_1, \dots, d_\omega$ are guessed correctly then

  • $A'_d$ returns $k_d$ balls of radius one that cover $d_1, \dots, d_\omega$ points of $C_1, \dots, C_\omega$ of $P_d$;

  • $A'_s$ returns $k_s$ balls of radius two that cover at least $s_1$ points of $C_1$ of $P_s$ and at least $s_j - 3(\omega-1) \cdot \tau_j$ points of $C_j$ of $P_s$, $2 \le j \le \omega$.

The algorithm $A'_d$ of Lemma 4′ proceeds as $A_d$ of Lemma 4 did, with the modification that the dynamic program is now $(\omega+1)$-dimensional. Algorithm $A'_s$ is also similar to $A_s$, because LP1 has a feasible solution where $z_p = 0$ for all $p \in P_s$ such that $|F(p) \cap C_j| > 3\tau_j$ holds for any $2 \le j \le \omega$. Hence, we output a solution with $k_c + k_d + k_s = k$ balls of radius at most two, and

  • the number of points of $C_1$ covered is at least $|C_1 \cap \mathrm{Guess}| + d_1 + s_1 = p_1$; and

  • the number of points of $C_j$ covered is at least $|C_j \cap \mathrm{Guess}| + 3(\omega-1)\tau_j + d_j + s_j - 3(\omega-1)\tau_j = p_j$, for all $j = 2, \dots, \omega$.

This is a polynomial-time algorithm for colorful k-center with a constant number of color classes.

LP integrality gaps

In this section, we present two natural ways to strengthen LP1 and show that they both fail to close the integrality gap, providing evidence that clustering and knapsack feasibility cannot be decoupled in the colorful k-center problem. On one hand, the Sum-of-Squares hierarchy is ineffective for knapsack problems, while on the other hand, adding knapsack constraints to LP1 is also insufficient due to the clustering aspect of this problem.

Sum-of-squares integrality gap

The Sum-of-Squares (equivalently Lasserre [16, 17]) hierarchy is a method of strengthening linear programs that has been used in constraint satisfaction problems, set-cover, and graph coloring, to name just a few examples [3, 9, 18]. We use the same notation for the Sum-of-Squares hierarchy, abbreviated as SoS, as in Karlin et al. [15]. For a set V of variables, $\mathcal{P}(V)$ is the power set of V and $\mathcal{P}_t(V)$ is the collection of subsets of V of size at most t. Their succinct definition of the hierarchy makes use of the shift operator: for two vectors $x, y \in \mathbb{R}^{\mathcal{P}(V)}$ the shift operator is the vector $x * y \in \mathbb{R}^{\mathcal{P}(V)}$ such that

$(x * y)_I = \sum_{J \subseteq V} x_J \, y_{I \cup J}.$

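Analogously, for a polynomial $g(x) = \sum_{I \subseteq V} a_I \prod_{i \in I} x_i$ we have $(g * y)_I = \sum_{J \subseteq V} a_J \, y_{I \cup J}$. In particular, we work with the linear inequalities $g_1, \dots, g_m$ so that the polytope to be lifted is

$K = \{x \in [0,1]^n : g_\ell(x) \ge 0 \text{ for } \ell = 1, \dots, m\}.$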

Let $\mathcal{T}$ be a collection of subsets of V and y a vector in $\mathbb{R}^{\mathcal{T}}$. The matrix $M_{\mathcal{T}}(y)$ is indexed by elements of $\mathcal{T}$ such that

$(M_{\mathcal{T}}(y))_{I,J} = y_{I \cup J}.$
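The two operations just defined are easy to state in code. The sketch below stores a vector indexed by subsets as a dictionary keyed by `frozenset`, and a polynomial as a dictionary from monomials (index sets) to coefficients; these representations and the function names are assumptions made for illustration. It does not test positive semidefiniteness, which would require a numerical library.

```python
from itertools import combinations


def subsets_up_to(V, t):
    """P_t(V): all subsets of V of size at most t, as frozensets."""
    return [frozenset(c) for size in range(t + 1) for c in combinations(V, size)]


def moment_matrix(y, T):
    """M_T(y): the matrix with entry y[I ∪ J] in row I and column J, for I, J in T."""
    return [[y[I | J] for J in T] for I in T]


def shift(g, y, index_sets):
    """(g * y)_I = sum over monomials J of a_J * y[I ∪ J], for each I in index_sets.

    g -- dict mapping a monomial (frozenset of variable indices) to its coefficient
    """
    return {I: sum(a * y[I | J] for J, a in g.items()) for I in index_sets}
```

As a sanity check, if y comes from a 0/1 point x via $y_I = \prod_{i \in I} x_i$, then $M_{\mathcal{T}}(y)$ is the outer product of the vector $(y_I)_{I \in \mathcal{T}}$ with itself, and $(g * y)_I = y_I \cdot g(x)$.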

We can now define the t-th SoS lifted polytope.

Definition 5

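For any $1 \le t \le n$, the t-th SoS lifted polytope $\mathrm{SoS}_t(K)$ is the set of vectors $y \in [0,1]^{\mathcal{P}_{2t}(V)}$ such that $y_\emptyset = 1$, $M_{\mathcal{P}_t(V)}(y) \succeq 0$, and $M_{\mathcal{P}_{t-1}(V)}(g_\ell * y) \succeq 0$ for all $\ell$.

A point $x \in [0,1]^n$ belongs to the t-th SoS polytope $\mathrm{SoS}_t(K)$ if there exists $y \in \mathrm{SoS}_t(K)$ such that $y_{\{i\}} = x_i$ for all $i \in V$.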

We use a reduction from Grigoriev’s SoS lower bound for knapsack [11] to show that the following instance has a fractional solution with small radius that is valid for a linear number of rounds of SoS.

Theorem 3

At least $\min\{2\min\{k/2, n - k/2\} + 3, n\}$ rounds of SoS are required to recognize that the following polytope contains no integral solution for $k \in \mathbb{Z}$ odd.

$\sum_{i=1}^{n} 2w_i = k, \qquad w_i \in [0,1] \ \ \forall i.$

Consider an instance of colorful k-center with two colors, 8n points, $k = n$, and $r = b = 2n$ where n is odd. For each $i \in [2n]$, the points $4i-3, 4i-2, 4i-1, 4i$ belong to cluster $C_i$ of radius one. For odd i, $C_i$ has three red points and one blue point and for even i, $C_i$ has one red point and three blue points. A picture is shown in Fig. 4. In an optimal integer solution, one center needs to cover at least 2 of these clusters while a fractional solution satisfying LP1 can open a center of 1/2 around each cluster of radius 1. Hence, LP1 has an unbounded integrality gap since the clusters can be arbitrarily far apart. This instance takes an odd number of copies of the integrality gap example given in [5].
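The counting behind this gap can be verified directly for small n. The sketch below represents each far-apart unit cluster only by its (red, blue) counts, checks that opening every cluster to extent 1/2 meets both requirements with total opening k, and brute-forces all choices of k whole clusters. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def gap_instance(n):
    """2n unit clusters: odd-indexed (3 red, 1 blue), even-indexed (1 red, 3 blue)."""
    clusters = [(3, 1) if i % 2 == 1 else (1, 3) for i in range(1, 2 * n + 1)]
    return clusters, n, 2 * n, 2 * n  # clusters, k, r, b


def fractional_coverage(clusters, opening):
    """(red, blue) covered when every cluster is opened to the same extent."""
    red = sum(opening * c[0] for c in clusters)
    blue = sum(opening * c[1] for c in clusters)
    return red, blue


def integral_radius_one_feasible(clusters, k, r, b):
    """Can k whole unit clusters cover at least r red and b blue points?"""
    return any(
        sum(clusters[i][0] for i in S) >= r and sum(clusters[i][1] for i in S) >= b
        for S in combinations(range(len(clusters)), k)
    )
```

Choosing a red-heavy and n − a blue-heavy clusters covers n + 2a red and 3n − 2a blue points, so both requirements force a = n/2, which is impossible for odd n.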

Fig. 4. Integrality gap example for linear rounds of SoS
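The parity obstruction in this instance can be checked by brute force for small $n$. The sketch below is illustrative only: it assumes the clusters are far enough apart that a ball of radius one covers the points of exactly one cluster, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def clusters(n):
    """(red, blue) counts of the 2n clusters: odd i -> (3, 1), even i -> (1, 3)."""
    return [(3, 1) if i % 2 == 1 else (1, 3) for i in range(1, 2 * n + 1)]

def fractional_coverage(n):
    """Centers used and red/blue coverage when every cluster is opened to extent 1/2."""
    half = Fraction(1, 2)
    cl = clusters(n)
    opened = half * len(cl)
    red = sum(half * r for r, _ in cl)
    blue = sum(half * b for _, b in cl)
    return opened, red, blue

def integral_feasible(n):
    """Can at most k = n radius-one balls cover 2n red and 2n blue points?

    Assumes each ball covers exactly one cluster (clusters far apart).
    """
    cl = clusters(n)
    for size in range(n + 1):
        for chosen in combinations(cl, size):
            red = sum(r for r, _ in chosen)
            blue = sum(b for _, b in chosen)
            if red >= 2 * n and blue >= 2 * n:
                return True
    return False
```

Opening $a$ odd-indexed and $n - a$ even-indexed clusters covers $n + 2a$ red and $3n - 2a$ blue points, so both requirements force $a = n/2$, which is impossible for odd $n$; the brute force confirms this for small cases.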

We can do a simple mapping from a feasible solution for the $t$-th round of SoS on the system of equations in Theorem 3 to our variables in the $t$-th round of SoS on LP1 for this instance to demonstrate that the infeasibility of balls of radius one is not recognized. More precisely, we assign a variable $w_i$ to each pair of clusters of radius one as shown in Fig. 4, corresponding to opening each cluster in the pair by $w_i$ amount. Then a fractional opening of balls of radius one can be mapped to variables that satisfy the polytope in Theorem 3.

The remainder of this subsection is dedicated to formally describing the reduction from Theorem 3. Let $W$ denote the set of variables used in the polytope defined in Theorem 3. Let $w$ be in the $t$-th round of SoS applied to the system in Theorem 3 so that $w$ is indexed by subsets of $W$ of size at most $t$. Let $V = V_x \cup V_z$, where $V_x = \{x_1, \dots, x_{8n}\}$ and $V_z = \{z_1, \dots, z_{8n}\}$, be the set of variables used in LP1 for the instance shown in Fig. 4. We define a vector $y$ with entries indexed by subsets of $V$, and show that $y$ is in the $t$-th SoS lifting of LP1. In each ball we pick a representative $x_i$, $i \equiv 1 \pmod 4$, to indicate how much the ball is opened, so we set $y_I = 0$ if $x_j \in I$ for some $j \not\equiv 1 \pmod 4$. Otherwise, we set $y_I = w_{\pi(I)}$ where

$$\pi(I) = \{w_i : x_{8i-3} \text{ or } x_{8i-7} \text{ or } z_{8i-j} \in I, \text{ for some } i \in [n],\ j \in [7]\}.$$

We have $M_{\mathcal{P}_t(W)}(w) \succeq 0$, and for $g_1 = -n + \sum_{i=1}^{n} 2x_i$ and $g_2 = n - \sum_{i=1}^{n} 2x_i$, $M_{\mathcal{P}_{t-1}(W)}(g_\ell * w) \succeq 0$ for $\ell = 1, 2$ since $w$ satisfies the $t$-th round of SoS. Because $g_2 = -g_1$, these two matrices are negatives of each other, which implies that $M_{\mathcal{P}_{t-1}(W)}(g_\ell * w)$ is the zero matrix.

To show that $M_{\mathcal{P}_t(V)}(y) \succeq 0$, we start with $M_{\mathcal{P}_t(W)}(w)$ and construct a sequence of matrices such that the semidefiniteness of one implies the semidefiniteness of the next, until we arrive at a matrix that is $M_{\mathcal{P}_t(V)}(y)$ with rows and columns permuted, i.e. $M_{\mathcal{P}_t(V)}(y)$ multiplied on the left and right by a permutation matrix and its transpose. Since the eigenvalues of a matrix are invariant under this operation, $M_{\mathcal{P}_t(W)}(w) \succeq 0$ implies that $M_{\mathcal{P}_t(V)}(y) \succeq 0$.

Lemma 5

There exists a sequence of square matrices $M_{\mathcal{P}_t(W)}(w) =: M_0, M_1, M_2, \dots, M_p$, such that the rank of $M_i$ is the same as the rank of $M_{i+1}$, $M_i$ is the leading principal submatrix of $M_{i+1}$ of dimension one less, and $M_p$ is $M_{\mathcal{P}_t(V)}(y)$ with rows and columns permuted.

Proof

We claim that this sequence of matrices exists with the following description. Firstly, the matrix $M_{i+1}$ has one more row and column than $M_i$, and agrees with $M_i$ on its leading principal submatrix of that size. Then there are two possibilities:

  1. The last row and column of $M_{i+1}$ are all zeroes, or

  2. for some $j$, the last row of $M_{i+1}$ is a copy of the $j$th row of $M_i$, the last column is a copy of the $j$th column of $M_i$, and the last entry is $(M_i)_{j,j}$.

Either way, the rank of $M_{i+1}$ would be the same as the rank of $M_i$.

To prove this claim, it suffices to consider a sequence of indices of the matrix $M_{\mathcal{P}_t(V)}(y)$. The matrix $M_0$ in our sequence will be the submatrix of $M_{\mathcal{P}_t(V)}(y)$ indexed by the first $k$ indices, where $k$ is the dimension of $M_{\mathcal{P}_t(W)}(w)$, i.e. the number of subsets of $W$ of size at most $t$. Each subsequent matrix $M_i$ will be the submatrix of $M_{\mathcal{P}_t(V)}(y)$ indexed by the first $k+i$ indices. Note that the rows/columns of $M_{\mathcal{P}_t(V)}(y)$ can be considered to be indexed by all the subsets of $V$ of size at most $t$. With this in mind, consider a sequence of subsets of $V$ of size at most $t$ with the following properties:

  1. All subsets of $\{x_{8i-7} : i \in [n]\}$ of size at most $t$ form a prefix of our sequence.

  2. Each set index after the first has exactly one more element than some set index that came earlier in the sequence.

It is clear that it is possible to arrange all the subsets of $V$ of size at most $t$ in a sequence to satisfy these properties. It only remains to show that this sequence produces the desired construction for $M_0, M_1, \dots, M_p$.

We have

$$M_{\mathcal{P}_t(V)}(y)_{I,J} = y_{I \cup J} = w_{\pi(I \cup J)} = w_{\pi(I) \cup \pi(J)}$$

so property (1) guarantees that we begin with $M_0$ being $M_{\mathcal{P}_t(W)}(w)$, up to the correct permutation of subsets of $\{x_{8i-7} : i \in [n]\}$. Now consider some $k'$-th index in the sequence, $k' > k$, where $k$ is the dimension of $M_{\mathcal{P}_t(W)}(w)$. By property (2), it is of the form $J \cup \{x\}$, where $J$ is one of the first $k'-1$ indices, and $x \in V$. There are two cases:

  • If $x$ is some $x_i$ with $i \not\equiv 1 \pmod 4$, then $y_{I \cup J \cup \{x\}} = 0$ for every index set $I$.

  • Otherwise, $\pi(J \cup \{x\}) = \pi(J)$.

In the first case, the matrix constructed from the first $k'$ indices falls under the first possibility above (a zero last row and column), and in the second case under the second possibility (a copied row and column). Finally, it is clear that at each step the dimension of the matrices increases by one, and that each matrix is the leading principal submatrix of the following matrix in the sequence, until we end up with $M_{\mathcal{P}_t(V)}(y)$ (up to some permutation of its rows and columns).

By the rank-nullity theorem, $M_{i+1}$ has one more 0 eigenvalue than $M_i$, so we can apply the following theorem.

Theorem 4

Let $A$ be a symmetric $n \times n$ matrix and $B$ be a principal submatrix of $A$ of dimension $(n-1) \times (n-1)$. If the eigenvalues of $A$ are $\alpha_1 \ge \dots \ge \alpha_n$ and the eigenvalues of $B$ are $\beta_1 \ge \dots \ge \beta_{n-1}$ then $\alpha_1 \ge \beta_1 \ge \alpha_2 \ge \beta_2 \ge \dots \ge \alpha_{n-1} \ge \beta_{n-1} \ge \alpha_n$.

Apply Theorem 4 with $M_{i+1} = A$ and $M_i = B$, and suppose $M_i$ is positive semidefinite of rank $\rho$. Interlacing gives $\alpha_j \ge \beta_j > 0$ for all $j \le \rho$, and since $M_{i+1}$ has the same rank as $M_i$ while its zero eigenspace has dimension one greater, all remaining eigenvalues of $M_{i+1}$ are zero; in particular $\alpha_n = 0$. Hence, $M_{i+1}$ has no negative eigenvalues if $M_i$ has no negative eigenvalues. This is sufficient to show that each matrix in the sequence constructed is positive semidefinite, and concludes the proof that $M_{\mathcal{P}_t(V)}(y) \succeq 0$.

It remains to show that the matrices arising from the shift operator between $y$ and the linear constraints of our polytope are positive semidefinite. Let $h_i$ denote the linear inequalities in LP1. In essence, the corresponding moment matrices $M_{\mathcal{P}_{t-1}(V)}(h_i * y)$ are zero matrices since all $h_i$ are tight for the example in Fig. 4. Formally, we have
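The two extension steps used in Lemma 5 can be checked on a small matrix. The following sketch is a sanity check under stated assumptions rather than part of the proof: the starting matrix and all function names are hypothetical, and exact rational arithmetic is used so that no numerical tolerance is needed. Positive semidefiniteness is tested by repeated Schur complements, and rank by Gaussian elimination.

```python
from fractions import Fraction

def is_psd(M):
    """Exact PSD test for a symmetric rational matrix via Schur complements."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A zero diagonal entry of a PSD matrix forces a zero row and column.
            if any(A[i][k] != 0 for i in range(k + 1, n)):
                return False
            continue
        # Replace the trailing block by its Schur complement with respect to A[k][k].
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[j][k] / pivot
    return True

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    rows, cols, r = len(A), len(A[0]), 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def extend_zero(M):
    """Append an all-zero last row and column."""
    n = len(M)
    return [row + [0] for row in M] + [[0] * (n + 1)]

def extend_copy(M, j):
    """Append a copy of row/column j (M symmetric); the new corner entry is M[j][j]."""
    return [row + [row[j]] for row in M] + [M[j] + [M[j][j]]]

M0 = [[2, 1], [1, 2]]      # hypothetical PSD starting matrix of rank 2
M1 = extend_copy(M0, 0)    # copy row and column 0
M2 = extend_zero(M1)       # zero row and column
```

Both extensions leave the rank at 2 and keep the matrix positive semidefinite, matching the inductive step of the argument.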

Lemma 6

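Matrices $M_{\mathcal{P}_{t-1}(V)}(h * y)$ are the zero matrix, for each $h$ a linear constraint from LP1.

Proof

Let $h_{1,j}$ be the linear polynomial that corresponds to the first inequality of LP1 for $j \in P$. First, if $i \not\equiv 1 \pmod 4$, then $y_{I \cup \{x_i\}} = 0$ for any $I \subseteq V$. Otherwise, we have

$$(M_{\mathcal{P}_{t-1}}(h_{1,j} * y))_{I,J} = \sum_{i \in B(j,1)} y_{I \cup J \cup \{x_i\}} - y_{I \cup J \cup \{z_j\}} = w_{\pi(I \cup J) \cup \pi(\{x_i\})} - w_{\pi(I \cup J) \cup \pi(\{z_j\})} = 0$$

since $\pi(\{x_i\}) = \pi(\{z_j\})$ for $i \in B(j,1)$, $i \equiv 1 \pmod 4$. For the remaining inequalities of LP1: $h_2$, $h_3$, and $h_4$, we have that $M_{\mathcal{P}_{t-1}(V)}(h_\ell * y)$ is the zero matrix because of how we defined the projection onto $w$:

$$\begin{aligned}
(M_{\mathcal{P}_{t-1}}(h_2 * y))_{I,J} &= n\, y_{I \cup J} - \sum_{x_j \in V_x} y_{I \cup J \cup \{x_j\}} \\
&= n\, w_{\pi(I \cup J)} - \sum_{j=1}^{n} 2\, w_{\pi(I \cup J) \cup \{w_j\}} \\
&= (M_{\mathcal{P}_{t-1}}(g_2 * w))_{\pi(I),\pi(J)} = 0, \\
(M_{\mathcal{P}_{t-1}}(h_3 * y))_{I,J} = (M_{\mathcal{P}_{t-1}}(h_4 * y))_{I,J} &= \sum_{j \in R} y_{I \cup J \cup \{z_j\}} - 2n\, y_{I \cup J} \\
&= \sum_{i=1}^{n} 4\, w_{\pi(I \cup J) \cup \{w_i\}} - 2n\, w_{\pi(I \cup J)} \\
&= 2\,(M_{\mathcal{P}_{t-1}}(g_1 * w))_{\pi(I),\pi(J)} = 0.
\end{aligned}$$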

This concludes the formal proof of the following theorem.

Theorem 5

The integrality gap of LP1 with $8n$ points persists up to $\Omega(n)$ rounds of Sum-of-Squares.

Flow constraints

In this section we add additional constraints based on standard techniques to LP1. These incorporate knapsack constraints for the fractional centers produced in the hope of obtaining a better clustering and we show that this fails to reduce the integrality gap.

We define an instance of a knapsack problem with multiple objectives. Each point $p \in P$ corresponds to an item with three dimensions: a dimension of size one to restrict the number of centers, $|B \cap B(p)|$, and $|R \cap B(p)|$. We set up a flow network with an $(n+1) \times n \times n \times k$ grid of nodes and we name each node by the coordinate $(w, x, y, z)$ of its position. The source $s$ is located at $(0, 0, 0, 0)$ and we add an extra node $t$ for the sink. Assign an arbitrary order to the points in $P$. For the item corresponding to $i \in P$, for each $x \in [n]$, $y \in [n]$, $z \in [k]$:

  1. Add an edge from $(i, x, y, z)$ to $(i+1, x, y, z)$ with flow variable $e_{i,x,y,z}$.

  2. With $b_i := |B \cap B(i)|$ and $r_i := |R \cap B(i)|$, if $z < k$ add an edge from $(i, x, y, z)$ to $(i+1, \min\{x + b_i, n\}, \min\{y + r_i, n\}, z+1)$ with flow variable $f_{i,x,y,z}$.

For each $x \in [b, n]$, $y \in [r, n]$:

  3. Add an edge from $(n+1, x, y, k)$ to $t$ with flow variable $g_{x,y}$.

Set the capacities of all edges to one. In addition to the usual flow constraints, add to LP1 the constraints

$$x_i = \sum_{x, y \in [n],\, z \in [k]} f_{i,x,y,z} \quad \text{for all } i \in P, \tag{2}$$
$$1 - x_i = \sum_{x, y \in [n],\, z \in [k]} e_{i,x,y,z} \quad \text{for all } i \in P. \tag{3}$$

We refer to the resulting linear program as LP3. Notice that an integral solution to LP1 defines a path from $s$ to $t$ through which one unit of flow can be sent; hence LP3 is a valid relaxation. On the other hand, any path $P$ from $s$ to $t$ defines a set $C_P$ of at most $k$ centers by taking those points $c$ for which $f_{c,x,y,z} \in P$ for some $x$, $y$, and $z$. Moreover, as $t$ can only be reached from a coordinate with $x \ge b$ and $y \ge r$ we have that $\sum_{c \in C_P} |B(c) \cap B| \ge b$ and $\sum_{c \in C_P} |B(c) \cap R| \ge r$. It follows that $C_P$ forms a solution to the problem of radius one if the balls are disjoint. In particular, our integrality gap instances for the Sum-of-Squares hierarchy do not fool LP3.

The example in Fig. 5 shows that in an instance where balls overlap, the integrality gap remains large. Here, the fractional assignment of open centers is 1/2 for each of the six balls and this gives a fractional covering of 8 red and 8 blue points as required. This assignment also satisfies the flow constraints because the three balls at the top of the diagram define a path disjoint from the three at the bottom. By double counting the five points in the intersection of two balls we cover 8 red and 8 blue points with each set of three balls. Hence, we can send flow along each path. However, this does not give a feasible integral solution with three centers as any set of three clusters does not contain enough points. In fact, the four clusters can be placed arbitrarily far from each other and in this way we have an unbounded integrality gap since one ball needs to cover two clusters.

Fig. 5. $k = 3$, $r = b = 8$

Conclusion and open questions

Our 3-approximation algorithm for colorful $k$-center with $\omega$ color classes runs in time $|P|^{O(\omega^2)}$, where the quadratic term arises from guessing linearly many optimal centers to make up for linearly many extra centers in the pseudo-approximation. In [2], it was shown that a linear exponential dependence on $\omega$ is necessary assuming ETH holds. It would be interesting to obtain the same approximation factor but without the quadratic dependence on $\omega$, or better yet, obtain a tight result. The current best hardness of approximation of $2 - \epsilon$ comes from the standard $k$-center problem. Note that in the well-separated case where no ball of radius 3 covers two optimal balls, we obtain a 2-approximation. This well-separated condition is crucial in the design of our algorithm, however, so it seems that significantly new ideas would be required to decrease this factor, if it is at all possible.

Another direction is to explore fair coverage constraints for other clustering problems such as k-median and k-means. The natural linear programming relaxations for these problems have unbounded integrality gaps even in the case of just one color class, i.e. the case of outliers.

Having multiple color classes requires solving subset-sum in some form. Investigating these problems could shed light on general combinatorial optimization problems that involve subset-sum.

Dynamic programming for dense points

In this section we describe the dynamic programming algorithm discussed in Lemma 4. As stated in the proof of Lemma 4, given $I = \bigcup_j I_j$ and correct guesses for $k_d, b_d, r_d$, we need to find $k_d$ balls of radius one centered at points in $I$ covering $b_d$ blue and $r_d$ red points with at most one point from each $I_j \in \mathcal{I}$ picked as a center. To do this, we first order the sets in $\mathcal{I}$ arbitrarily as $\mathcal{I} = \{I_{j_1}, \dots, I_{j_m}\}$, $m = |\mathcal{I}|$. We create a 4-dimensional table $T$ of dimension $(m, b_d, r_d, k_d)$. $T[m', b', r', k']$ stores whether there is a set of $k'$ balls in the first $m'$ sets of $\mathcal{I}$ covering $b'$ blue and $r'$ red points. The recurrence relation for $T$ is

$$\begin{aligned}
T[0,0,0,0] &= \text{True}, \\
T[0,b',r',k'] &= \text{False} \quad \text{for any } (b',r',k') \neq (0,0,0), \\
T[m',b',r',k'] &= \begin{cases}
\text{True} & \text{if } T[m'-1,b',r',k'] = \text{True}, \\
\text{True} & \text{if } \exists\, c \in I_{j_{m'}} \text{ s.t. } T[m'-1,b'',r'',k'-1] = \text{True}, \\
& \text{for } b'' = b' - |B(c) \cap D_{j_{m'}} \cap B|,\ r'' = r' - |B(c) \cap D_{j_{m'}} \cap R|, \\
\text{False} & \text{otherwise.}
\end{cases}
\end{aligned}$$

The table $T$ has size $O((m+1) \cdot (n+1) \cdot (n+1) \cdot (n+1)) = O(n^4)$ since the first parameter ranges from 0 to $m$, and the other parameters can take values from 0 up to at most $n$. Moreover, since $|\bigcup_{i=1}^{m} I_{j_i}| \le n$ and this is a disjoint union, for each of the $O(n^3)$ choices of $b', r', k'$ we can determine all of the $m$ entries in $O(n)$ time. Hence, we can compute the whole table in time $O(n^4)$ using, for example, the bottom-up approach. We can also remember the choices in a separate table and so we can find a solution in time $O(n^4)$ if it exists.
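The recurrence can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the input format is an assumption (each set $I_{j_m}$ is given as a list of hypothetical `(blue, red)` pairs, standing for $|B(c) \cap D_{j_m} \cap B|$ and $|B(c) \cap D_{j_m} \cap R|$ of each candidate center $c$), and reachable states are stored in dictionaries instead of a dense 4-dimensional table. The back-pointers play the role of the separate table of remembered choices.

```python
def dense_dp(groups, k_d, b_d, r_d):
    """Pick at most one candidate per group so that exactly k_d candidates are
    chosen and their gains sum to exactly (b_d, r_d).

    groups[m] is a list of (blue, red) gains of the candidates in the m-th set.
    Returns {group index: candidate index}, or None if no solution exists.
    """
    # reach[m] maps a state (b, r, k) after the first m groups to
    # (previous state, chosen candidate index or None).
    reach = [{(0, 0, 0): None}]
    for cands in groups:
        prev_states = reach[-1]
        cur = {s: (s, None) for s in prev_states}      # skip this group
        for (b, r, k) in prev_states:
            if k == k_d:
                continue
            for c, (gb, gr) in enumerate(cands):
                s = (b + gb, r + gr, k + 1)
                if s[0] <= b_d and s[1] <= r_d:         # gains are nonnegative
                    cur.setdefault(s, ((b, r, k), c))
        reach.append(cur)

    target = (b_d, r_d, k_d)
    if target not in reach[-1]:
        return None

    # Follow the back-pointers to recover one solution.
    picks, state = {}, target
    for m in range(len(groups), 0, -1):
        state_before, choice = reach[m][state]
        if choice is not None:
            picks[m - 1] = choice
        state = state_before
    return picks

# Hypothetical input with three groups.
groups = [[(2, 1), (1, 3)], [(0, 2)], [(3, 0), (1, 1)]]
solution = dense_dp(groups, k_d=2, b_d=3, r_d=2)
```

As in the recurrence, the target values are matched exactly, which is why the surrounding algorithm guesses $k_d$, $b_d$, and $r_d$.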

The clustering algorithm

In this section we present the clustering algorithm used in [5] with a simple modification. The algorithm is described in pseudo-code in Algorithm 1. We now state the theorem describing the properties of this clustering algorithm that are used in Sect. 2.1.

Theorem 6

Given a feasible fractional solution $(x, z)$ to LP1, the set of points $S \subseteq P$ and clusters $C_j \subseteq P$ for every $j \in S$ produced by Algorithm 1 satisfy:

  1. The set $S$ is a subset of the points $\{j \in P : z_j > 0\}$ with positive $z$-values.

  2. For each $j \in S$, we have $C_j \subseteq F(j)$ and the clusters $\{C_j\}_{j \in S}$ are pairwise disjoint.

Moreover, if we let $R_j = C_j \cap R$ and $B_j = C_j \cap B$ with $r_j = |R_j|$ and $b_j = |B_j|$ for $j \in S$, then $y$ is a feasible solution to LP2 (depicted on the right in Fig. 1) with objective value at least $r$.

Proof

The proof of the first statement is clear from the condition in the while loop of the algorithm.

For the second statement, observe that, by the definition of $C_j$ as stated in the algorithm, $C_j \subseteq \bigcup_{i \in B(j)} B(i) = F(j)$. Since in each iteration the cluster is removed from $P$, the clusters are clearly disjoint.

In order to prove that $y$ is feasible, we first state some useful observations.

  • Firstly, for any $i \in P$ there is at most one $j \in S$ such that $d(i, j) \le 1$. This is true because if there were $j, j' \in S$ such that both $j, j' \in B(i)$ then, assuming w.l.o.g. that $j$ was considered before $j'$ in the while loop, $j' \in C_j$ and thus $j'$ cannot be in $S$, which is a contradiction.

  • Secondly, note that for any $j_1 \in P$ such that $j_1 \in C_j$ for some $j$, we have $\tilde{z}_j \ge z_{j_1}$. This is trivially true if $\tilde{z}_j = 1$; otherwise $\tilde{z}_j = \sum_{i \in B(j)} x_i \ge z_j \ge z_{j_1}$, where the first inequality follows from the LP1 constraints and the second from the fact that when $C_j$ was removed, $z_j$ had the highest $z$-value.

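Now we show that $y$ is feasible for LP2 with objective value at least $r$. Firstly we show that $\sum_{j \in S} r_j y_j \ge r$. To see this,

$$\begin{aligned}
\sum_{j \in S} r_j y_j = \sum_{j \in S} |R_j|\, y_j &= \sum_{j \in S} \sum_{j' \in R_j} \tilde{z}_j && (y_j = \tilde{z}_j \text{ for any } j \in S) \\
&\ge \sum_{j \in S} \sum_{j' \in R_j} z_{j'} && (\text{from the second observation, } \tilde{z}_j \ge z_{j'} \text{ for any } j' \in C_j) \\
&= \sum_{j' \in R :\, z_{j'} > 0} z_{j'} && (\text{since the } C_j\text{'s are disjoint and contain all } j' \text{ s.t. } z_{j'} > 0) \\
&= \sum_{j' \in R} z_{j'} \ge r && (\text{since } z \text{ satisfies LP1}).
\end{aligned}$$

Similarly $\sum_{j \in S} b_j y_j \ge b$. Finally we will show that $\sum_{j \in S} y_j \le k$:

$$\begin{aligned}
\sum_{j \in S} y_j &\le \sum_{j \in S} \sum_{j' \in B(j)} x_{j'} && \Big(\text{since } y_j \le \sum_{j' \in B(j)} x_{j'}\Big) \\
&\le \sum_{j' \in P} x_{j'} && (\text{from the first observation}) \\
&\le k && (\text{since } x \text{ satisfies LP1}).
\end{aligned}$$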

This concludes the proof of the claim that y is a feasible solution to LP2 with objective value at least r.
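Algorithm 1 is not reproduced in the text above, so the following Python sketch is only a guess at a greedy clustering that is consistent with the properties used in the proof: repeatedly pick the remaining point $j$ with the largest positive $z_j$, set $C_j$ to the remaining points in $F(j) = \bigcup_{i \in B(j)} B(i)$, and set $y_j = \tilde{z}_j = \min\{1, \sum_{i \in B(j)} x_i\}$. The function name, the input format, and the tiny example instance are all assumptions made for illustration.

```python
def greedy_clusters(points, dist, x, z):
    """Hypothetical reconstruction of the greedy clustering.

    points: list of point ids; dist(i, j): metric; x, z: dicts of LP values.
    Returns (S, C, y): chosen points, their clusters, and their y-values.
    """
    ball = {j: {i for i in points if dist(i, j) <= 1} for j in points}
    remaining = set(points)
    S, C, y = [], {}, {}
    while any(z[j] > 0 for j in remaining):
        # Remaining point with the highest (positive) z-value.
        j = max((p for p in remaining if z[p] > 0), key=lambda p: z[p])
        # F(j) is the union of the unit balls around the points of B(j).
        F_j = set().union(*(ball[i] for i in ball[j]))
        C[j] = F_j & remaining
        y[j] = min(1, sum(x[i] for i in ball[j]))
        S.append(j)
        remaining -= C[j]
    return S, C, y

# Hypothetical instance: five points on a line with positions 0, 1, 2, 10, 11.
position = {0: 0, 1: 1, 2: 2, 3: 10, 4: 11}
points = list(position)
dist = lambda i, j: abs(position[i] - position[j])
x = {0: 0, 1: 0.5, 2: 0, 3: 0.5, 4: 0}
z = {p: 0.5 for p in points}   # each point is fractionally covered to extent 1/2

S, C, y = greedy_clusters(points, dist, x, z)
```

On this instance the sketch produces two clusters, $\{0, 1, 2\}$ and $\{3, 4\}$, regardless of how ties in $z$ are broken.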

Funding

Open Access funding provided by EPFL Lausanne.

Footnotes

Supported by the Swiss National Science Foundation project 200021-184656 “Randomness in Problem Instances and Randomized Algorithms.”


Contributor Information

Xinrui Jia, Email: xinrui.jia@epfl.ch.

Kshiteej Sheth, Email: kshiteej.sheth@epfl.ch.

Ola Svensson, Email: ola.svensson@epfl.ch.

References

  • 1. Anagnostopoulos, A., Becchetti, L., Böhm, M., Fazzone, A., Leonardi, S., Menghini, C., Schwiegelshohn, C.: Principal fairness: removing bias via projections. CoRR arXiv:1905.13651 (2019)
  • 2. Anegg, G., Angelidakis, H., Kurpisz, A., Zenklusen, R.: A technique for obtaining true approximations for k-center with covering constraints. In: International Conference on Integer Programming and Combinatorial Optimization (IPCO). pp. 52–65 (2020)
  • 3. Arora, S., Ge, R.: New tools for graph coloring. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. pp. 1–12. Springer (2011)
  • 4. Backurs, A., Indyk, P., Onak, K., Schieber, B., Vakilian, A., Wagner, T.: Scalable fair clustering. In: Proceedings of the 36th International Conference on Machine Learning, ICML. pp. 405–413 (2019)
  • 5. Bandyapadhyay, S., Inamdar, T., Pai, S., Varadarajan, K.R.: A constant approximation for colorful k-center. In: 27th Annual European Symposium on Algorithms, ESA. pp. 1–14 (2019)
  • 6. Chakrabarty, D., Goyal, P., Krishnaswamy, R.: The non-uniform k-center problem. In: 43rd International Colloquium on Automata, Languages, and Programming, ICALP. pp. 1–15 (2016)
  • 7. Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). pp. 642–651 (2001)
  • 8. Chierichetti, F., Kumar, R., Lattanzi, S., Vassilvitskii, S.: Fair clustering through fairlets. In: Advances in Neural Information Processing Systems (NIPS). pp. 5029–5037 (2017)
  • 9. Chlamtac, E., Friggstad, Z., Georgiou, K.: Understanding set cover: Sub-exponential time approximations and lift-and-project methods. CoRR arXiv:1204.5489 (2012)
  • 10. Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293–306 (1985). doi: 10.1016/0304-3975(85)90224-5
  • 11. Grigoriev, D.: Complexity of Positivstellensatz proofs for the knapsack. Comput. Complex. 10(2), 139–154 (2001). doi: 10.1007/s00037-001-8192-0
  • 12. Harris, D.G., Pensyl, T., Srinivasan, A., Trinh, K.: A lottery model for center-type problems with outliers. ACM Trans. Algorithms 15(3), 1–25 (2019)
  • 13. Hochbaum, D.S., Shmoys, D.B.: A best possible heuristic for the k-center problem. Math. Oper. Res. 10(2), 180–184 (1985). doi: 10.1287/moor.10.2.180
  • 14. Hsu, W.L., Nemhauser, G.L.: Easy and hard bottleneck location problems. Discret. Appl. Math. 1(3), 209–215 (1979). doi: 10.1016/0166-218X(79)90044-1
  • 15. Karlin, A.R., Mathieu, C., Nguyen, C.T.: Integrality gaps of linear and semi-definite programming relaxations for knapsack. In: Integer Programming and Combinatorial Optimization, IPCO. pp. 301–314 (2011)
  • 16. Lasserre, J.B.: An explicit exact SDP relaxation for nonlinear 0-1 programs. In: International Conference on Integer Programming and Combinatorial Optimization (IPCO). pp. 293–303 (2001)
  • 17. Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11(3), 796–817 (2001). doi: 10.1137/S1052623400366802
  • 18. Tulsiani, M.: CSP gaps and reductions in the Lasserre hierarchy. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC. pp. 303–312 (2009)

Articles from Mathematical Programming are provided here courtesy of Springer
