Abstract
Mycielski graphs are a family of triangle-free graphs $M_k$ with arbitrarily high chromatic number. $M_k$ has chromatic number $k$ and there is a short informal proof of this fact, yet finding proofs of it via automated reasoning techniques has proved to be a challenging task. In this paper, we study the complexity of clausal proofs of the uncolorability of $M_k$ with $k-1$ colors. In particular, we consider variants of the $\mathsf{PR}$ (propagation redundancy) proof system that are without new variables, and with or without deletion. These proof systems are of interest due to their potential uses for proof search. As our main result, we present a sublinear-length and constant-width $\mathsf{PR}$ proof without new variables or deletion. We also implement a proof generator and verify the correctness of our proof. Furthermore, we consider formulas extended with clauses from the proof until a short resolution proof exists, and investigate the performance of CDCL in finding the short proof. This turns out to be difficult for CDCL with the standard heuristics. Finally, we describe an approach inspired by SAT sweeping to find proofs of these extended formulas.
Introduction
Proof complexity investigates the relative strengths of Cook–Reckhow proof systems [7], defined in terms of the length of the shortest proof of a tautology as a function of the length of the tautology. Proof systems are separated with respect to their strengths by establishing lower and upper bounds on the lengths of the proofs of certain “difficult” tautologies in each system. Finding short proofs of such tautologies in a proof system is a method for proving small upper bounds, which provide evidence for the strength of a proof system. Similarly, the existence of a large lower bound implies that a proof system is relatively weak. The related field of SAT solving involves the study of search algorithms that have corresponding proof systems, and concerns itself with not only the existence of short proofs, but also the prospect of finding them automatically when they exist. As a result, the two areas interact. The long-term agenda of proof complexity is to prove lower bounds on proof systems of increasing strength towards concluding
$\mathsf{NP} \neq \mathsf{coNP}$, whereas SAT solving benefits from strong proof systems with properties that make them suitable for automation. A recently proposed such system is $\mathsf{PR}$ (propagation redundancy) [14], together with some of its variants: $\mathsf{SPR}$ (subset $\mathsf{PR}$), $\mathsf{PR}^-$ (without new variables), and $\mathsf{DPR}$ (allowing deletion).
For several difficult tautologies,
has been shown to admit proofs that are short (at most polynomial length), narrow (small clause width), and without extension (disallowing new variables) [5, 12, 13, 14]. From the perspective of proof search, these are favorable qualities for a proof system:
Polynomial length is essentially a necessity.
Small width implies that we may limit the search to narrow proofs.
Eliminating extension drastically shrinks the search space.
Compared to strong proof systems with extension, a proof system with the above properties may admit a proof search algorithm that is effective in practice.
Mycielski graphs are a family of triangle-free graphs $M_k$ with arbitrarily high chromatic number. In particular, $M_k$
has chromatic number k. Despite having a simple informal proof, this has been a difficult fact to prove via automated reasoning techniques, and the state-of-the-art tools can only handle instances up to
or
[6, 9, 18, 19, 20, 21, 23]. Symmetry breaking [8], a crucial automated reasoning technique for hard graph coloring instances, is hardly effective on these graphs as the largest clique has size 2. Most short $\mathsf{PR}$ proofs for hard problems are based on symmetry arguments. Donald Knuth challenged us in personal communication (footnote 1) to explore whether short $\mathsf{PR}$ proofs exist for Mycielski graph formulas.
In this paper, we provide short proofs in $\mathsf{PR}^-$ and $\mathsf{DPR}^-$ for the colorability of Mycielski graphs [17]. Our proofs are of length quasilinear (with deletion and low discrepancy) and sublinear (without deletion but high discrepancy) in the length of the original formula, and include clauses that are at most ternary. With deletion allowed, the $\mathsf{PR}$
inferences have short witnesses, which allows us to additionally establish the existence of quasilinear-length
proofs. We also implement a proof generator and verify the generated proofs with dpr-trim. Furthermore, we experiment with adding various combinations of the clauses in the proofs to the formulas and observe their effect on conflict-driven clause learning (CDCL) solver [3, 16] performance. It turns out that the resulting formulas are still difficult for state-of-the-art CDCL solvers despite the existence of short resolution proofs, reinforcing a recent result by Vinyals [22]. We then demonstrate an approach inspired by SAT sweeping [24] to solve these difficult formulas automatically.
Preliminaries
In this work we focus on propositional formulas in conjunctive normal form (CNF), which consist of the following: $n$ Boolean variables, at most $2n$ literals $x_i$ and $\overline{x}_i$ referring to different polarities of variables, and $m$ clauses $C_1, \dots, C_m$ where each clause is a disjunction of literals. The CNF formula is the conjunction of all clauses. Formulas in CNF can be treated as sets of clauses, and clauses as sets of literals. For two clauses C, D such that $p \in C$ and $\overline{p} \in D$, their resolvent on $p$ is the clause $(C \setminus \{p\}) \cup (D \setminus \{\overline{p}\})$. A clause is called a tautology if it includes both $p$ and $\overline{p}$. We denote the empty clause by $\bot$.
An assignment $\alpha$ is a partial mapping of variables in a formula to truth values in $\{0, 1\}$. We denote assignments by a conjunction of the literals they satisfy. As an example, the assignment $x_1 \mapsto 1,\ x_2 \mapsto 0,\ x_3 \mapsto 1$ is denoted by $x_1 \wedge \overline{x}_2 \wedge x_3$. The set of variables assigned by $\alpha$ is denoted by $\mathrm{dom}(\alpha)$. We denote by $F|_\alpha$ the restriction of a formula F under an assignment $\alpha$, the formula obtained by removing satisfied clauses and falsified literals from F. A clause C is said to block the assignment $\bigwedge_{p \in C} \overline{p}$, which we denote by $\overline{C}$.
A clause is called unit if it contains a single literal. Unit propagation refers to the iterative procedure where we assign the variables in a formula F to satisfy the unit clauses, restrict the formula under the assignment, and repeat until no unit clauses remain. If this procedure yields the empty clause
$\bot$, we say that unit propagation derives a conflict on F.
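
The procedure just described can be sketched in a few lines. This is only an illustration under assumptions of our own: literals are encoded as nonzero integers in the DIMACS style (a negative integer stands for a negated variable), and the function name `unit_propagate` is hypothetical. Actual solvers use watched-literal data structures rather than rebuilding the clause list.

```python
def unit_propagate(clauses):
    """Run unit propagation on a CNF formula.

    Clauses are iterables of nonzero ints (-x is the negation of x).
    Returns (conflict, assignment): conflict is True if the empty clause
    is derived, and assignment is the set of literals made true.
    """
    clauses = [frozenset(c) for c in clauses]
    assignment = set()
    while True:
        if frozenset() in clauses:
            return True, assignment
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return False, assignment
        (lit,) = unit
        assignment.add(lit)
        # Restrict under lit: drop satisfied clauses, remove falsified literals.
        clauses = [c - {-lit} for c in clauses if lit not in c]
```

For instance, `unit_propagate([[1], [-1, 2], [-2]])` reports a conflict, while `unit_propagate([[1], [-1, 2]])` stops without a conflict after assigning both variables to true.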
Assume for the rest of the paper that F, H are formulas in CNF, C is a clause, and $\alpha = \overline{C}$ is the assignment blocked by C. Formulas F, H are equisatisfiable if either they are both satisfiable or both unsatisfiable. C is redundant with respect to F if F and $F \wedge C$ are equisatisfiable. C is blocked with respect to F if there exists a literal $p \in C$ such that for each clause $D \in F$ that includes $\overline{p}$, the resolvent of C and D on p is a tautology [15]. C is a reverse unit propagation ($\mathsf{RUP}$) inference from F if unit propagation derives a conflict on $F \wedge \overline{C}$ [11]. F implies H by unit propagation, denoted $F \vdash_1 H$, if each clause $D \in H$ is a $\mathsf{RUP}$ inference from F. Let us state a lemma about implication by unit propagation for later use.
Lemma 1
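([5]). Let C, D be clauses such that $C \vee D$ is not a tautology and let $\alpha = \overline{C}$ be the assignment blocked by C. Then $F|_\alpha \vdash_1 D \setminus C$ if and only if $F \vdash_1 C \vee D$.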
Letting
be either a unit clause or a conjunction of unit clauses, we will use the notation
to mean that for each
we have
. This serves as a compact way of writing a sequence of unit clauses that become true on the way to deriving
from F via unit propagation.
PR proof system
Redundancy is the basis for clausal proof systems. In a clausal proof of a contradiction, we start with the formula and introduce redundant clauses until we can finally introduce the empty clause. Since satisfiability is preserved at each step due to redundancy, introduction of the empty clause implies that the formula is unsatisfiable. The sequence of redundant clauses constitutes a proof of the formula. Also note that since only unsatisfiable formulas are of interest, we use “proof” and “refutation” interchangeably.
Definition 1
For a formula F, a valid clausal proof of it is a sequence of clause–witness pairs $(C_1, \omega_1), \dots, (C_N, \omega_N)$ where, defining $F_i = F \wedge C_1 \wedge \dots \wedge C_i$, we have
- each clause $C_i$ is redundant with respect to the conjunction of the formula with the preceding clauses in the proof, that is, $F_{i-1}$ and $F_i$ are equisatisfiable,
- there exists a predicate $r(F_{i-1}, C_i, \omega_i)$ computable in polynomial time that indicates whether $C_i$ is redundant with respect to $F_{i-1}$,
- $C_N = \bot$.
For a clausal proof P of length N, we call $\max_{i} |C_i|$ its width.
Definition 2
C is propagation redundant with respect to F if there exists an assignment $\omega$ satisfying C such that $F|_\alpha \vdash_1 F|_\omega$, where $\alpha = \overline{C}$ is the assignment blocked by C.
Note that propagation redundancy can be decided in polynomial time given a witness
$\omega$ due to the existence of efficient unit propagation algorithms. Unit propagation is a core primitive in SAT solvers, and despite the prevalence of large collections of heuristics implemented in solvers, in practice the majority of the runtime of a SAT solver is spent performing unit propagation inferences.
Theorem 1
([14]). If C is propagation redundant with respect to F, then it is redundant with respect to F.
Theorem 1 allows us to define a specific clausal proof system:
Definition 3
A $\mathsf{PR}$ proof is a clausal proof where the predicate in Definition 1 computes the relation $F_{i-1}|_{\alpha_i} \vdash_1 F_{i-1}|_{\omega_i}$, where $\alpha_i = \overline{C_i}$ is the assignment blocked by $C_i$.
Resolvents, blocked clauses, and $\mathsf{RUP}$ inferences are propagation redundant. Hence they are valid steps in a $\mathsf{PR}$ proof.
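
To make the polynomial-time check concrete, the following sketch decides reverse unit propagation and propagation redundancy for a given witness. It is an illustration rather than a reference checker: the integer encoding of literals and the names `restrict`, `up_conflict`, `is_rup`, and `is_pr` are assumptions of this example, and witnesses are assumed to be consistent assignments.

```python
def restrict(clauses, assignment):
    """F|_assignment: drop satisfied clauses and remove falsified literals."""
    out = []
    for clause in clauses:
        if any(lit in assignment for lit in clause):
            continue
        out.append(frozenset(l for l in clause if -l not in assignment))
    return out


def up_conflict(clauses):
    """Return True if unit propagation derives the empty clause."""
    clauses = [frozenset(c) for c in clauses]
    while True:
        if frozenset() in clauses:
            return True
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return False
        (lit,) = unit
        clauses = [c - {-lit} for c in clauses if lit not in c]


def is_rup(formula, clause):
    """clause is a RUP inference if propagation refutes formula AND NOT clause."""
    return up_conflict(restrict(formula, {-l for l in clause}))


def is_pr(formula, clause, witness):
    """Check that every clause of F|_omega is a RUP inference from F|_alpha."""
    alpha = {-l for l in clause}       # assignment blocked by the clause
    omega = set(witness)
    if not omega & set(clause):        # the witness must satisfy the clause
        return False
    f_alpha = restrict(formula, alpha)
    return all(is_rup(f_alpha, d) for d in restrict(formula, omega))
```

As a toy usage example, with the formula $x_1 \vee x_2$ encoded as `[[1, 2]]`, the unit clause `[1]` is not a RUP inference, yet `is_pr([[1, 2]], [1], [1])` succeeds: the clause is redundant without being implied.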
Let us also mention a few notable variants of the $\mathsf{PR}$ proof system:
- $\mathsf{SPR}$: For each clause–witness pair $(C, \omega)$ in the proof and $\alpha$ the assignment blocked by $C$, require that $\mathrm{dom}(\omega) = \mathrm{dom}(\alpha)$.
- $\mathsf{PR}^-$: No clause C in the proof can include a variable that does not occur in the formula F being proven.
- $\mathsf{DPR}$: In addition to introducing redundant clauses, allow deletion of a previous clause in the proof (or the original formula), that is, allow $F_{i+1} = F_i \setminus \{C\}$ for some $C \in F_i$.
Following the notation of Buss and Thapen [5], the prefix “$\mathsf{D}$” denotes a variant of a proof system with deletion allowed, and the superscript “−” denotes a variant disallowing new variables.
Expressiveness of PR
Intuition
$\mathsf{PR}$ allows us to introduce clauses that intuitively support the following reasoning:
If there exists a satisfying assignment, then there exists a satisfying assignment with a certain property X, described by the witness $\omega$. This is because we can take any assignment that does not have X, apply a transformation to it that does not violate any original constraints of the formula, and obtain a new satisfying assignment with property X. The validation of such a transformation in general is $\mathsf{NP}$-hard. Transformations are limited such that they can be validated using unit propagation.
Hence, if our goal is to find some (not all) of the satisfying assignments to a formula or to refute it, then we can extend the formula by introducing useful assumptions without harming our goal since satisfiability is preserved with each assumption. The redundancy of each assumption is efficiently checkable using the blocked assignment $\alpha$ and the witness $\omega$, which together describe the transformation that we apply to a solution without property X to obtain another with X. Having this kind of understanding and mentally executing unit propagation allows us to look for $\mathsf{PR}$ proofs while continuing to reason at a relatively intuitive level. This proves useful when working towards upper bounds.
Upper bounds
For several difficult tautologies (pigeonhole principle, bit pigeonhole principle, parity principle, clique-coloring principle, Tseitin tautologies) short
proofs exist [5, 14]. Still, there are several problems mentioned by Buss and Thapen [5] for which there are no known
proofs of polynomial length. Furthermore, we do not know whether there are short
proofs of the Mycielski graph formulas. Buss and Thapen [5] have a partial simulation result between
and
depending on a notion called “discrepancy”, defined as follows.
Definition 4
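For a $\mathsf{PR}$ inference with blocked assignment $\alpha$ and witness $\omega$, its discrepancy is $|\mathrm{dom}(\omega) \setminus \mathrm{dom}(\alpha)|$.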
Theorem 2
Let F be a formula with a
refutation of length N such that
. Then, F has an
refutation of length
without using variables not in the
refutation.
As a result, a
proof of length N with maximum discrepancy at most
directly gives an
proof of length
. In our case, the maximum discrepancy of the
proof is
, hence we cannot utilize Theorem 2 to obtain a polynomial-length
proof. For our
proof, the maximum discrepancy is 2, and by Theorem 2 there do exist quasilinear-length
proofs of the Mycielski graph formulas.
Proofs of Mycielski graph formulas
Mycielski graphs
Let $G = (V, E)$ be a graph. Its Mycielski graph $\mu(G)$ is constructed as follows:
- Include G in $\mu(G)$ as a subgraph.
- For each vertex $v_i \in V$, add a new vertex $u_i$ that is connected to all the neighbors of $v_i$ in G.
- Add a vertex w that is connected to each $u_i$.
Unless G has a triangle, $\mu(G)$ does not have a triangle, and $\mu(G)$ has chromatic number one higher than G. We denote the chromatic number of G by $\chi(G)$.
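Starting with $M_2 = K_2$ (the complete graph on 2 vertices) and applying $\mu$ repeatedly, we obtain triangle-free graphs with arbitrarily large chromatic number. We call $M_k = \mu(M_{k-1})$ the kth Mycielski graph. Since $\chi(M_2) = 2$ and $\mu$ increases the chromatic number by one, we have $\chi(M_k) = k$. The graph $M_k$ has $3 \cdot 2^{k-2} - 1$ vertices and $\frac{1}{2}(7 \cdot 3^{k-2} + 1) - 3 \cdot 2^{k-2}$ edges [1].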
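
The construction is easy to reproduce programmatically. The sketch below builds the Mycielski graphs from the three construction steps and checks triangle-freeness on small instances. The vertex numbering (V as `0..n-1`, U as `n..2n-1`, w as `2n`) and the function names are choices made for this example only. The closed forms used in the usage note, $3 \cdot 2^{k-2} - 1$ vertices and $\frac{1}{2}(7 \cdot 3^{k-2} + 1) - 3 \cdot 2^{k-2}$ edges, are stated as assumptions of the sketch; they agree with the recurrences $n' = 2n + 1$ and $e' = 3e + n$ implied by the construction.

```python
def mycielski(n, edges):
    """Return (n', edges') for the Mycielski graph of a graph on vertices 0..n-1."""
    new_edges = set(edges)
    for a, b in edges:
        # u_a = n + a is joined to the neighbours of v_a, and likewise for u_b.
        new_edges.add((n + a, b))
        new_edges.add((n + b, a))
    w = 2 * n
    for i in range(n):
        new_edges.add((n + i, w))      # w is joined to every u_i
    return 2 * n + 1, new_edges


def mycielski_graph(k):
    """Build M_k by starting from M_2 = K_2 and applying the construction k - 2 times."""
    n, edges = 2, {(0, 1)}
    for _ in range(k - 2):
        n, edges = mycielski(n, edges)
    return n, edges


def is_triangle_free(n, edges):
    """A graph has no triangle iff no edge has endpoints with a common neighbour."""
    adj = [set() for _ in range(n)]
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return all(not (adj[a] & adj[b]) for a, b in edges)
```

For example, `mycielski_graph(5)` yields a graph with 23 vertices and 71 edges, and `is_triangle_free` holds for it.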
Fig. 1.
The first few graphs in the sequence of Mycielski graphs.
Let us denote by $\mathrm{MYC}_k$ the contradiction that $M_k$ is colorable with $k-1$ colors. We will present short $\mathsf{PR}^-$ and $\mathsf{DPR}^-$ proofs of $\mathrm{MYC}_k$ in Section 4.2. Before doing so, let us present the short informal argument to prove that applying $\mu$ increases the chromatic number, which implies that $\chi(M_k) \geq k$.
Proposition 1
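$\chi(\mu(G)) \geq \chi(G) + 1$.
Proof
Assume we partition the vertices of $\mu(G)$ as $V \cup U \cup \{w\}$ where V is the set of vertices of G which is included as a subgraph, U is the set of newly added vertices corresponding to each vertex in V, and w is the vertex that is connected to all of U.
Let $k = \chi(G)$, and denote $[k] = \{1, \dots, k\}$. Denote the set of neighbors of a vertex v by N(v). Consider a proper k-coloring $\phi$ of $\mu(G)$. Assume that in this coloring U uses only the first $k-1$ colors. Then we can define a $(k-1)$-coloring $\phi'$ of G by setting $\phi'(v_i) = \phi(u_i)$ for $v_i \in V$ with $\phi(v_i) = k$ and copying $\phi$ for the remaining vertices. The coloring $\phi'$ is proper, because for any two $v_i, v_j \in V$,
- if $\phi(v_i) = \phi(v_j) = k$, then no edges exist between them;
- if $\phi(v_i) \neq k$ and $\phi(v_j) \neq k$, then their colors are not modified;
- if $\phi(v_i) = k$, $\phi(v_j) \neq k$, and the two vertices are adjacent, then $\phi'(v_i) \neq \phi'(v_j)$ since for all $v \in N(v_i)$ we have $\phi(u_i) \neq \phi(v)$.
As a result, we can obtain a proper $(k-1)$-coloring of G, contradiction. Hence, U must use at least k colors in a proper coloring of $\mu(G)$, and since w then has to have a color greater than k we have $\chi(\mu(G)) \geq k + 1$.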
Theorem 3
$M_k$ is not colorable with $k-1$ colors.
Proof
Follows from the fact that $\chi(M_2) = 2$ and Proposition 1 via induction.
PR proofs
To obtain
and
proofs, we follow a different kind of reasoning than that of the informal proof in the previous section. Let
. Denote by
the vertices and the edge set of the
th Mycielski graph, respectively. Assume we partition the vertices of
as in the proof of Proposition 1 into
. Let
.
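In propositional logic, the formula is defined on the variables $v_{i,c}$, $u_{i,c}$, $w_c$ for $i \in \{1, \dots, |V|\}$, $c \in \{1, \dots, k-1\}$. The variable $v_{i,c}$ indicates that the vertex $v_i$ is assigned color c, and $u_{i,c}$, $w_c$ have similar meanings. Writing $x_c$ for the variable of a vertex $x$ and a color $c$, the formula consists of the clauses
$$\bigvee_{c=1}^{k-1} x_c \quad \text{for each vertex } x, \qquad \overline{x_c} \vee \overline{y_c} \quad \text{for each edge } (x, y) \text{ and each color } c.$$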
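
As a sanity check on the encoding, the sketch below generates the clauses and compares their number with the #vars and #cls columns reported in Table 1. It assumes that the formula contains exactly one at-least-one-color clause per vertex and one binary clause per edge and color; this reading is ours, but it reproduces the reported counts for the instances tested. The variable numbering and function names are hypothetical.

```python
def mycielski_graph(k):
    """M_k as (number of vertices, edge list), starting from M_2 = K_2."""
    n, edges = 2, [(0, 1)]
    for _ in range(k - 2):
        edges = (edges
                 + [(n + a, b) for a, b in edges]
                 + [(n + b, a) for a, b in edges]
                 + [(n + i, 2 * n) for i in range(n)])
        n = 2 * n + 1
    return n, edges


def coloring_cnf(n, edges, colors):
    """Return (number of variables, clauses) stating that the graph is colorable."""
    def var(vertex, color):
        return vertex * colors + color + 1    # DIMACS variables start at 1

    # Every vertex receives at least one color.
    clauses = [[var(x, c) for c in range(colors)] for x in range(n)]
    # Adjacent vertices never share a color.
    clauses += [[-var(a, c), -var(b, c)] for a, b in edges for c in range(colors)]
    return n * colors, clauses


def myc(k):
    """The unsatisfiable formula: M_k is colorable with k - 1 colors."""
    n, edges = mycielski_graph(k)
    return coloring_cnf(n, edges, k - 1)
```

With these definitions, `myc(5)` has 92 variables and 307 clauses, matching the first row of Table 1.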
For both the $\mathsf{PR}^-$ and the $\mathsf{DPR}^-$
proofs, the high-level strategy is to introduce clauses that effectively insert edges between any
for which
. In other words, if there is an edge
, we introduce clauses that imply the existence of the edge
, resulting in the modified graph
that has an induced subgraph
isomorphic to
, and has all of its vertices connected to w. As an example, Figure 3a shows the result of this step on
. Then we partition the vertices of
into new
similar to the way we did for
. Such a partition exists as
is isomorphic to
which by construction has this partition. Then we inductively repeat the whole process. Figure 3c displays the result of repeating it once. Finally, the added edges result in a k-clique in
, as illustrated in Figure 3d. The vertices that participate in the clique are the two
’s of the subgraph we obtain at the last step that is isomorphic to
and the w’s of all the intermediate graphs isomorphic to
for
. Since we have
colors available, the problem then reduces to the pigeonhole principle with k pigeons and
holes (denoted
), for which we know there exists a polynomial-length
proof due to Heule et al. [14]. At the end we simply concatenate the pigeonhole proof for the clique, which derives the empty clause as desired.
Fig. 3.
Illustrations of the proof steps in the case where
is the initial graph, i.e.
is the formula being refuted. The blue and the red edges correspond to the clauses introduced as RUP inferences, and the clauses corresponding to the faded edges are deleted.
The primary difference between the versions of the proof with and without deletion is the discrepancy of the
inferences. Deletion allows us to detach
from
, as illustrated in Figure 3b, by removing each preceding clause that contains both a variable corresponding to some vertex in U and another corresponding to some vertex in V. This makes it possible to introduce the
clauses with discrepancy bounded by a constant. Without deletion, we instead introduce the
inferences at each inductive step which imply that every
has the same color as its corresponding
, and this requires us to keep track of sets of equivalent vertices and assign them together in the witnesses. Figure 4 displays the effect of introducing these clauses on
.
Fig. 4.

Equivalent vertices and implied edges. Groups of equivalent vertices are highlighted. Dashed edges are implied by unit propagation.
For ease of presentation, we first describe the $\mathsf{DPR}^-$ proof, followed by the $\mathsf{PR}^-$ proof.
Theorem 4
has quasilinear-length
and
refutations.
Proof
At each step below, let F denote the conjunction of
with the clauses introduced in the previous steps.
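- As the first step, we introduce the blocked clauses
$$\overline{x_c} \vee \overline{x_{c'}} \qquad (1)$$
for each vertex $x$ and each pair of colors such that $c < c'$, where $x_c$ denotes the variable of vertex $x$ and color $c$. These clauses assert that each vertex in the graph can be assumed to have at most one color.
- Then, we introduce the $\mathsf{PR}$ clauses
$$\overline{v_{i,c}} \vee \overline{u_{i,c'}} \vee w_c \qquad (2)$$
for each $v_i \in V$ and each pair of distinct colors $c \neq c'$. Intuitively, these clauses introduce the assumption that if there exists a solution, then there exists a solution that does not simultaneously have $v_i$ colored c, $u_i$ colored $c'$, and w not colored c. If $u_i$ has color $c'$, then we can switch its color to c and still have a valid coloring. The validity of this new coloring is verifiable relying only on unit propagation inferences. It does not create any monochromatic edges between $u_i$ and $v_j \in N(v_i)$, as $v_j$ would already not have the color c. It also does not create a monochromatic edge between $u_i$ and w since w is already assumed not to have color c. Figure 2 shows this argument with a diagram. The corresponding witness for this transformation is $\omega = v_{i,c} \wedge \overline{u_{i,c'}} \wedge u_{i,c} \wedge \overline{w_c}$, leading to a discrepancy of 1.
- Next, we introduce $\mathsf{RUP}$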
inferences
Let
4
and
. From the previous set of
inferences in (3) we have
Due to the edge
we also have
and consequently
. Since
, we have
by Lemma 1. With the addition of this last set of assumptions, we have effectively copied the edges between
to between
. Figure 3a visualizes the result of this step on
with the red edges corresponding to the newly introduced assumptions. After the addition of the new edges, we delete the clauses introduced in steps 2, 3, and the clauses corresponding to the edges between U and V of the current Mycielski graph. Figure 3b displays the graph after the deletions.
Then we inductively repeat steps 2–5, that is, we introduce clauses and delete the intermediate ones for each subgraph isomorphic to Mycielski graphs of descending order. Figure 3c shows the result of repeating the process on a subgraph isomorphic to
, with the blue edges corresponding to the latest assumptions. After an edge is inserted between the two
of the subgraph isomorphic to
, we obtain a k-clique on the two
and all of the previous w’s. Then we delete all the clauses corresponding to the edges leaving the clique. This detaches the clique from the rest of the graph as illustrated for
in Figure 3d. Since
-colorability of the k-clique is exactly the pigeonhole principle, we simply concatenate a
proof of the pigeonhole principle as described by Heule et al. [14], which has maximum discrepancy 2. This completes the
proof that
is not colorable with
colors.
Fig. 2.

Schematic form of the argument for the
inference. With
and
, the above diagram shows the transformation we can apply to a solution to obtain another valid solution. A vertex colored black on the inside means that it does not have the outer color, i.e. w has some color other than red. Unit propagation implies that
is not colored red.
In total, the proof has length
and the
inferences have maximum discrepancy 2. Hence, by Theorem 2, there also exists a
proof of length
. Since
has length
, if we denote the length of the formula by S then the proof is of quasilinear length
. 
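
The redundancy claims in the first two steps can be checked mechanically on a small instance. The sketch below does so for $M_4$ with 3 colors, adding the clauses one at a time as in a proof. It rests on our reading of the prose: the step 2 clauses are taken to be $\overline{v_{i,c}} \vee \overline{u_{i,c'}} \vee w_c$ with witness $v_{i,c} \wedge \overline{u_{i,c'}} \wedge u_{i,c} \wedge \overline{w_c}$, and the at-most-one-color clauses of step 1 are assumed to be present already. The helper names and variable numbering are hypothetical, and the checker is a plain illustration of Definition 2, not dpr-trim.

```python
def restrict(clauses, assignment):
    """F|_assignment: drop satisfied clauses and remove falsified literals."""
    out = []
    for clause in clauses:
        if any(lit in assignment for lit in clause):
            continue
        out.append(frozenset(l for l in clause if -l not in assignment))
    return out


def up_conflict(clauses):
    """Return True if unit propagation derives the empty clause."""
    clauses = [frozenset(c) for c in clauses]
    while True:
        if frozenset() in clauses:
            return True
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return False
        (lit,) = unit
        clauses = [c - {-lit} for c in clauses if lit not in c]


def is_pr(formula, clause, witness):
    """Check F|_alpha |-_1 F|_omega, with alpha the assignment blocked by clause."""
    alpha = {-l for l in clause}
    omega = set(witness)
    if not omega & set(clause):
        return False
    f_alpha = restrict(formula, alpha)
    return all(up_conflict(restrict(f_alpha, {-l for l in d}))
               for d in restrict(formula, omega))


COLORS = 3                                      # 3-coloring of M_4
M3 = [(0, 1), (2, 1), (3, 0), (2, 4), (3, 4)]   # M_3 is a 5-cycle
N = 5                                           # V = 0..4, U = 5..9, w = 10
W = 2 * N
EDGES = (M3 + [(N + a, b) for a, b in M3] + [(N + b, a) for a, b in M3]
         + [(N + i, W) for i in range(N)])


def var(x, c):
    return x * COLORS + c + 1


# Original clauses followed by the at-most-one-color clauses of step 1.
formula = [[var(x, c) for c in range(COLORS)] for x in range(W + 1)]
formula += [[-var(a, c), -var(b, c)] for a, b in EDGES for c in range(COLORS)]
formula += [[-var(x, c), -var(x, d)] for x in range(W + 1)
            for c in range(COLORS) for d in range(c + 1, COLORS)]

# Step 2: check each PR clause against the formula extended so far.
results = []
for i in range(N):
    for c in range(COLORS):
        for d in range(COLORS):
            if c == d:
                continue
            v, u = i, N + i
            clause = [-var(v, c), -var(u, d), var(W, c)]
            witness = [var(v, c), -var(u, d), var(u, c), -var(W, c)]
            results.append(is_pr(formula, clause, witness))
            formula.append(clause)
```

Each of the 30 entries of `results` corresponds to one clause of the assumed form, and each witness assigns exactly one variable outside the blocked assignment, in line with the discrepancy of 1 stated above.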
Theorem 5
has sublinear-length
refutations.
Proof
At a high level, the proof is similar to the
proof. However, in order to avoid deletion we introduce assumptions at each inductive step that imply the equivalence of every
with its corresponding
. This eliminates the need to detach
from
, but leads to sets of vertices forced to have the same color. As a result, the witnesses for the
inferences after the first inductive step that refer to switching the color of a vertex
need to also include all the previous vertices forced to have the same color as
.
- We start by introducing the blocked clauses from (1).
- Then we introduce the $\mathsf{PR}$ inferences from (2).
- It becomes possible to infer the following
clauses via
.
Let
5
, and denote the conjunction of the formula and the clauses in (1) and (2) by F. In step 3 of the previous proof we showed that
. Then, by Lemma 1, we have
. Hence, we can switch the color of
from
to c. This does not result in any conflicts since
having color c implies that no
has the color c, and
is implied by unit propagation. As a result, the clause
is
with witness
. After the addition of these clauses, the equivalence
is implied via unit propagation. Due to the edge
, the existence of the edge
is also implied via unit propagation. This step allows us to avoid deletion. At this point, we inductively repeat steps 2–3 for each subgraph isomorphic to Mycielski graphs of descending order. However, due to the equivalences
, any subsequent
inference that argues by way of switching a vertex
’s color should include in its witness the same color switch for all the vertices that are transitively equivalent to
from the previous steps. For instance, if a witness contains
, then for each vertex
that is equivalent to
it also has to contain
. The maximum number of such vertices for any
occurring in the proof is
. After the
clauses are introduced for the subgraph isomorphic to
, the existence of a k-clique is implied via unit propagation. Figure 4 shows the equivalent vertices and the implied edges after the last inductive step when starting from
. At the end, we simply concatenate a proof of the pigeonhole principle as before, taking care to include in the witnesses all the equivalent vertices (as described in the previous step) to each vertex whose color is switched by a witness.
The proof has length
, and
has length
. Letting S denote the length of the formula, the proof has sublinear length
. 
In the
proof, the maximum discrepancy is
. Letting N be the length of the proof, this becomes
. As a result, we cannot rely on Theorem 2, and the existence of a polynomial-length
proof for Mycielski graph formulas remains open. While the existence of such a proof is plausible, we conjecture that it will not be of constant width as the ones we present.
Experimental results
All of the formulas, proofs, and the code for our experiments are available at https://github.com/emreyolcu/mycielski.
Proof verification
In order to verify the proofs we described in the previous section, we implemented two proof generators for
and checked the
and
proofs with dpr-trim for values of k from 5 to 10. Figure 5 shows a plot of the lengths of the formulas and the proofs, and Table 1 shows their exact sizes.
Fig. 5.

Plot of the length of the formula and the lengths of the proofs versus k.
Table 1.
Formula and proof sizes. For each formula
, this table shows the number of variables and clauses in the formula, and the lengths of the proofs.
| k | #vars | #cls | $\mathsf{DPR}^-$ proof length | $\mathsf{PR}^-$ proof length |
|---|---|---|---|---|
| 5 | 92 | 307 | 1572 | 600 |
| 6 | 235 | 1227 | 7635 | 2165 |
| 7 | 570 | 4625 | 33178 | 6796 |
| 8 | 1337 | 16711 | 134855 | 19523 |
| 9 | 3064 | 58551 | 524456 | 52816 |
| 10 | 6903 | 200531 | 1976271 | 136905 |
Effect of redundant clauses on CDCL performance
Suppose we have a proof search algorithm for
and that the redundant clauses we introduce in the
proof are discovered automatically. Assuming they are found by some method, we look at their effect on the efficiency of CDCL at finding the rest of the proof automatically. In addition, we generate satisfiable instances of the coloring problem (denoted
and stating that
is colorable with k colors) and compare how many of the satisfying assignments remain after the clauses are introduced. The reduction in the number of solutions suggests that the added clauses do a significant amount of work in breaking symmetries.
Let us denote by
- BC: the blocked clauses that we add in step 1,
- PR: the $\mathsf{PR}$ clauses that we add inductively in step 2,
- R1: the $\mathsf{RUP}$ inferences that we add inductively in step 3,
- R2: the $\mathsf{RUP}$ inferences that we add inductively in step 4.
We consider extended versions of the formulas where we gradually include more of the redundant clauses. We cumulatively introduce the redundant clauses from each step, i.e. when we add the PR clauses we also add the BC clauses.
For the satisfiable formulas
, the remaining numbers of solutions are in Table 2. We used allsat to count the exact number of solutions. Adding only the BC or PR clauses drastically reduces the number of solutions. Adding them both leaves only a small fraction of the solutions (for k = 4, 792 out of 163680).
Table 2.
Number of solutions left in
after introducing redundant clauses. PR $\setminus$ BC is the version of the formula where we add the PR clauses but not the BC ones. For
, it takes longer than 24 hours to count all solutions, so we only included the results for two small formulas here.
| k | original | + BC | + PR $\setminus$ BC | + PR |
|---|---|---|---|---|
| 3 | 60 | 30 | 36 | 18 |
| 4 | 163680 | 12480 | 6576 | 792 |
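
The k = 3 row of Table 2 is small enough to recount by brute force, which also serves as a check on the clause sets. The sketch below enumerates all $2^{15}$ assignments for $M_3$ (a 5-cycle) with 3 colors. It assumes the BC clauses are the at-most-one-color clauses and the PR clauses have the form $\overline{v_{i,c}} \vee \overline{u_{i,c'}} \vee w_c$ for $c \neq c'$; both are our reading of the prose, and the vertex numbering is arbitrary. It is not the allsat tool used for the table.

```python
from itertools import product

COLORS = 3
# M_3 with v1 = 0, v2 = 1, u1 = 2, u2 = 3, w = 4.
EDGES = [(0, 1), (2, 1), (3, 0), (2, 4), (3, 4)]
V_U_PAIRS = [(0, 2), (1, 3)]
W = 4
NUM_VARS = 5 * COLORS


def var(vertex, color):
    return vertex * COLORS + color            # 0-based variable index


# A literal is a pair (variable index, required truth value).
base = [[(var(x, c), True) for c in range(COLORS)] for x in range(5)]
base += [[(var(a, c), False), (var(b, c), False)]
         for a, b in EDGES for c in range(COLORS)]
bc = [[(var(x, c), False), (var(x, d), False)]
      for x in range(5) for c in range(COLORS) for d in range(c + 1, COLORS)]
pr = [[(var(v, c), False), (var(u, d), False), (var(W, c), True)]
      for v, u in V_U_PAIRS
      for c in range(COLORS) for d in range(COLORS) if c != d]


def count_models(clauses):
    """Count total assignments over NUM_VARS variables satisfying all clauses."""
    return sum(
        all(any(bits[i] == value for i, value in clause) for clause in clauses)
        for bits in product([False, True], repeat=NUM_VARS)
    )


counts = {
    "original": count_models(base),
    "+ BC": count_models(base + bc),
    "+ PR \\ BC": count_models(base + pr),
    "+ PR": count_models(base + bc + pr),
}
```

Under these assumptions the four counts come out as 60, 30, 36, and 18, in agreement with the first row of the table.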
For the unsatisfiable formulas, we ran CaDiCaL [3] with a timeout of 2000 seconds on the original formulas and the versions including the clauses introduced at each step. The results are in Table 3. These runtimes are somewhat unexpected as R1 and R2 can be derived from the formula extended with PR with relatively few resolution steps. One would therefore expect the performance on the formulas extended with PR, R1, and R2 to be similar. We study this observation in the next subsection.
Table 3.
CDCL performance on formulas with additional clauses. Each cell shows the time (in seconds) it takes for CaDiCaL to prove unsatisfiability. The cells with dashes indicate that the solver ran out of time before finding a proof.
| k | original | + BC | + PR | + R1 | + R2 |
|---|---|---|---|---|---|
| 5 | 0.07 | 0.04 | 0.03 | 0.01 | 0.00 |
| 6 | 29.53 | 24.51 | 1.17 | 0.03 | 0.01 |
| 7 | — | — | 26.80 | 0.28 | 0.02 |
| 8 | — | — | 1503 | 1.33 | 0.19 |
| 9 | — | — | — | 22.99 | 0.88 |
| 10 | — | — | — | 196.18 | 12.88 |
Difficult extended Mycielski graph formulas
The CDCL paradigm has been highly successful, because it has been able to find short refutations for problems arising from various applications. However, the above results show that there exist formulas for which CDCL cannot find the short refutations. In particular, the
PR formulas have length
and there exist resolution refutations of length
: Each clause in R1 and R2, of which there are
, can be derived in O(k) steps of resolution. As for the clique, it is known that
has resolution refutations of length
[4].
This shows that, even if we devise an algorithm to discover the redundant
clauses automatically, the Mycielski graph formulas still remain difficult for the standard tools. After the clauses in BC and PR become part of the formula, the difficulty lies in deriving the R2 clauses automatically. If we resort to incremental SAT solving [10] and provide the cubes
(negation of each clause in R2) as assumptions to the solver, the formulas become relatively easily solvable. For instance,
PR takes approximately 3 minutes on a single CPU. Although it is unlikely that a solver can run this efficiently without any explicit guidance, the small runtime provides evidence that the shortest resolution proof of
PR is of modest length.
In this section, we describe a method for discovering useful cubes automatically and using them to solve the
PR formulas. While inefficient, with this method it at least becomes possible to find proofs of these formulas in a matter of minutes, compared to CDCL which did not succeed even with a timeout of three days on
PR. Given a formula F, the below procedure discovers binary clauses, inserts them to F, and attempts to solve F via CDCL.
Iteratively remove the clause that has the largest number of resolution candidates until the formula becomes satisfiable. For
PR, this corresponds to simply removing the clause
. Call the newly obtained formula, which is satisfiable,
.- Repeat:
- Find all pairs of literals
that do not appear together in any of the solutions sampled so far. Form a list with the cubes
, and shuffle it in order to avoid ordering the pairs with respect to variable indices. In the case of
PR, the clause
is implied by
, hence
must be among the pairs found. - If the number of pairs found did not decrease by more than 1 percent after the latest addition of satisfying assignments, break.
- Repeat:
  - Partition the remaining cubes into P pieces. Use P workers in parallel to perform incremental solving with a limit of L conflicts allowed on the instances of the formula F using each separate piece as the set of assumptions. Aggregate a list of refuted cubes.
  - For each refuted cube B, append $\overline{B}$ to the formula F.
  - If the number of refuted cubes is less than half of the previous iteration, break.
- Run CDCL on the final formula F that includes negations of all the refuted cubes.
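
The pair-filtering step of the procedure can be sketched as follows. This is an illustration under stated assumptions, not the code used in the experiments: each sample is assumed to be a set of true literals (DIMACS-style integers, one per variable), complementary pairs such as $(x, \overline{x})$ are skipped because they trivially never occur together, and the function name `candidate_cubes` is hypothetical.

```python
from itertools import combinations


def candidate_cubes(samples, num_vars):
    """Return the pairs of literals that are never true together in any sample.

    Each returned pair (a, b) stands for the cube a AND b; if a solver refutes
    the cube, the binary clause [-a, -b] can be added to the formula.
    """
    seen = set()
    for sample in samples:
        # Record every pair of literals that this solution makes true together.
        seen.update(combinations(sorted(sample), 2))
    literals = sorted(l for v in range(1, num_vars + 1) for l in (v, -v))
    return [(a, b) for a, b in combinations(literals, 2)
            if a != -b and (a, b) not in seen]
```

For example, `candidate_cubes([{1, 2}, {1, -2}], 2)` returns `[(-2, -1), (-1, 2)]`, the two pairs involving $\overline{x_1}$, since no sample sets $x_1$ to false. In the procedure above the resulting list would then be shuffled (for instance with `random.shuffle`) before being partitioned among the workers.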
Table 4 displays the results for formulas with $7 \leq k \leq 10$ and varying numbers of parallel workers P.
Table 4.
Results on finding proofs for
PR. From left to right, the columns correspond to the number of samples used for obtaining a list of cubes, the number of cubes obtained after filtering pairs of literals, time it takes to sample solutions using a local search solver with 20 workers and filter pairs of literals, maximum number L of conflicts allowed to the incremental SAT solver, number of parallel workers P, total time it takes to refute cubes and prove unsatisfiability of the final formula F, percentage of time spent in the final CDCL run on F, number of iterations spent refuting cubes and adding them to the formula.
| k | #samples | #cubes | time to cubes | L | P | time to solve | final% | #iter |
|---|---|---|---|---|---|---|---|---|
| 7 | 2000 | 9675 | 18.4s | 100 | 1 | 15.4s | 0.39% | 2 |
| | | | | | 12 | 5.7s | 0.87% | 3 |
| | | | | | 25 | 5.3s | 0.94% | 4 |
| | | | | | 50 | 6.3s | 0.80% | 4 |
| 8 | 2000 | 38255 | 2m 15s | 100 | 1 | 2m 50s | 0.12% | 2 |
| | | | | | 12 | 44.4s | 0.43% | 4 |
| | | | | | 25 | 30.6s | 0.65% | 4 |
| | | | | | 50 | 33.5s | 0.60% | 5 |
| 9 | 3000 | 148624 | 10m 37s | 100 | 1 | 38m 40s | 0.03% | 2 |
| | | | | | 12 | 7m 4s | 0.14% | 5 |
| | | | | | 25 | 5m 22s | 0.24% | 6 |
| | | | | | 50 | 3m 26s | 0.26% | 5 |
| 10 | 3000 | 568214 | 35m 18s | 100 | 1 | 11h 37m | 0.003% | 3 |
| | | | | | 12 | 1h 55m | 0.04% | 6 |
| | | | | | 25 | 1h 7m | 0.33% | 5 |
| | | | | | 50 | 42m 18s | 0.32% | 6 |
Conclusion
We showed that there exist short
,
, and
proofs of the colorability of Mycielski graphs. Interesting questions about the proof complexity of
variants remain. For instance,
has not been shown to separate from
or Frege, and even simpler questions regarding upper bounds for some difficult tautologies are open. It is also unknown, although plausible, whether there exists a polynomial-length
proof of the Mycielski graph formulas.
Apart from our theoretical results, we encountered formulas with short resolution proofs for which CDCL requires substantial runtime. We developed an automated reasoning method to solve these formulas. In future work, we plan to study whether this method is also effective on other problems that are challenging for CDCL.
Acknowledgements
This work has been supported by the National Science Foundation (NSF) under grant CCF-1813993.
Footnotes
Email correspondence on May 25, 2019
Contributor Information
Luca Pulina, Email: lpulina@uniss.it.
Martina Seidl, Email: martina.seidl@jku.at.
Emre Yolcu, Email: emreyolcu@cmu.edu.
Xinyu Wu, Email: xinyuwu@cmu.edu.
Marijn J. H. Heule, Email: marijn@cmu.edu.
References
- 1. The On-Line Encyclopedia of Integer Sequences. Published electronically at https://oeis.org/A122695
- 2. Biere, A.: CaDiCaL, Lingeling, Plingeling, Treengeling, YalSAT entering the SAT competition 2017. In: Proceedings of SAT Competition 2017 – Solver and Benchmark Descriptions. vol. B-2017-1, pp. 14–15 (2017)
- 3. Biere, A.: CaDiCaL at the SAT Race 2019. In: Proceedings of SAT Race 2019 – Solver and Benchmark Descriptions. vol. B-2019-1, pp. 8–9 (2019)
- 4. Buss, S., Pitassi, T.: Resolution and the weak pigeonhole principle. In: Computer Science Logic, pp. 149–156 (1998)
- 5. Buss, S., Thapen, N.: DRAT proofs, propagation redundancy, and extended resolution. In: Theory and Applications of Satisfiability Testing – SAT 2019. pp. 71–89 (2019)
- 6. Caramia, M., Dell’Olmo, P.: Coloring graphs by iterated local search traversing feasible and infeasible solutions. Discrete Applied Mathematics 156(2), 201–217 (2008). doi: 10.1016/j.dam.2006.07.013
- 7. Cook, S.A., Reckhow, R.A.: The relative efficiency of propositional proof systems. The Journal of Symbolic Logic 44(1), 36–50 (1979). doi: 10.2307/2273702
- 8. Crawford, J.M., Ginsberg, M.L., Luks, E.M., Roy, A.: Symmetry-breaking predicates for search problems. In: Proceedings of the Fifth International Conference on Principles of Knowledge Representation and Reasoning. pp. 148–159 (1996)
- 9. Desrosiers, C., Galinier, P., Hertz, A.: Efficient algorithms for finding critical subgraphs. Discrete Applied Mathematics 156(2), 244–266 (2008). doi: 10.1016/j.dam.2006.07.019
- 10. Eén, N., Sörensson, N.: Temporal induction by incremental SAT solving. Electronic Notes in Theoretical Computer Science 89(4), 543–560 (2003). doi: 10.1016/S1571-0661(05)82542-3
- 11. Goldberg, E., Novikov, Y.: Verification of proofs of unsatisfiability for CNF formulas. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE 2003). pp. 886–891 (2003)
- 12. Heule, M.J.H., Biere, A.: What a difference a variable makes. In: Tools and Algorithms for the Construction and Analysis of Systems. pp. 75–92 (2018)
- 13. Heule, M.J.H., Kiesl, B., Biere, A.: Clausal proofs of mutilated chessboards. In: NASA Formal Methods. pp. 204–210 (2019)
- 14. Heule, M.J.H., Kiesl, B., Biere, A.: Strong extension-free proof systems. Journal of Automated Reasoning 64(3), 533–554 (2020). doi: 10.1007/s10817-019-09516-0
- 15. Kullmann, O.: On a generalization of extended resolution. Discrete Applied Mathematics 96–97, 149–176 (1999). doi: 10.1016/S0166-218X(99)00037-2
- 16. Marques-Silva, J.P., Sakallah, K.A.: GRASP—a new search algorithm for satisfiability. In: Proceedings of the 1996 IEEE/ACM International Conference on Computer-Aided Design. pp. 220–227 (1997)
- 17. Mycielski, J.: Sur le coloriage des graphs. Colloquium Mathematicae 3(2), 161–162 (1955). doi: 10.4064/cm-3-2-161-162
- 18. Ramani, A., Aloul, F.A., Markov, I.L., Sakallah, K.A.: Breaking instance-independent symmetries in exact graph coloring. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE 2004). pp. 324–329 (2004)
- 19. Schaafsma, B., Heule, M.J.H., van Maaren, H.: Dynamic symmetry breaking by simulating Zykov contraction. In: Theory and Applications of Satisfiability Testing – SAT 2009. pp. 223–236 (2009)
- 20. Trick, M.A., Yildiz, H.: A large neighborhood search heuristic for graph coloring. In: Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems. pp. 346–360 (2007)
- 21. Van Gelder, A.: Another look at graph coloring via propositional satisfiability. Discrete Applied Mathematics 156(2), 230–243 (2008). doi: 10.1016/j.dam.2006.07.016
- 22. Vinyals, M.: Hard examples for common variable decision heuristics. In: Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (2020)
- 23. Zhou, Z., Li, C.M., Huang, C., Xu, R.: An exact algorithm with learning for the graph coloring problem. Computers and Operations Research 51, 282–301 (2014). doi: 10.1016/j.cor.2014.05.017
- 24. Zhu, Q., Kitchen, N., Kuehlmann, A., Sangiovanni-Vincentelli, A.: SAT sweeping with local observability don’t-cares. In: Proceedings of the 43rd Annual Design Automation Conference. pp. 229–234 (2006)