A Graph Coloring Algorithm for Large Scheduling Problems

Frank Thomson Leighton

doi:10.6028/jres.084.024

1979 Nov-Dec;84(6):489–506. doi: 10.6028/jres.084.024

PMCID: PMC6756213 PMID: 34880531

Abstract

A new graph coloring algorithm is presented and compared to a wide variety of known algorithms. The algorithm is shown to exhibit O(n²) time behavior for most sparse graphs and thus is found to be particularly well suited for use with large-scale scheduling problems. In addition, a procedure for generating large random test graphs with known chromatic number is presented and is used to evaluate heuristically the capabilities of the algorithms discussed.

Keywords: Algorithm, chromatic number, color function, graph, graph coloring, heuristic, interchange, random test graphs, scheduling, time complexity

AMS-MOS 1970 Subject Classification: 05C15, 68A10, 68A20, 90B35

1. Introduction

Graph coloring has considerable application to a large variety of complex problems involving optimization. In particular, conflict resolution, or the optimal partitioning of mutually exclusive events, can often be accomplished by means of graph coloring. Examples of such problems include: the scheduling of exams in the smallest number of time periods such that no individual is required to participate in two exams simultaneously (see appendix A), the storage of chemicals on the minimum number of shelves such that no two mutually dangerous chemicals (i.e., dangerous when one is in the presence of the other) are stored on the same shelf, and the pairing of individuals (as in a computer dating agency) such that the maximal number of compatible persons are paired together.

In each of the above problems, the constraints are usually expressible in the form of pairs of incompatible objects (e.g., pairs of chemicals that cannot be stored on the same shelf). Such incompatibilities are usefully embodied through the structure of a graph. Each object is represented by a node and each incompatibility is represented by an edge joining the two nodes. A coloring of this graph is then simply a partitioning of the objects into blocks (or colors) such that no two incompatible objects end up in the same block. Thus, optimal solutions to such problems may be found by determining minimal colorings for the corresponding graphs. Unfortunately, this may not always be accomplishable in a reasonable amount of time.

As the graph coloring problem is known to be NP-complete [1], there is no known algorithm which, for every graph, will optimally color the nodes of the graph in a time bounded by a polynomial in the number of nodes. Since exponential time algorithms [5, 6, 7, 9, 18] are prohibitively expensive for use with large-scale problems, much attention has been focused on the development of heuristic algorithms which will usually produce a good, though not necessarily optimal, coloring for any graph in a reasonable amount of time.

This paper describes a new graph coloring algorithm, the recursive largest first (RLF) coloring algorithm. In addition, a variety of existing coloring procedures are presented and their performance on a wide range of test data is compared to that of the RLF algorithm.

Also described is a procedure for generating random graphs with known chromatic number. The existence of such a procedure, heretofore lacking in the experimental literature, provides a standard method for testing the accuracy of graph coloring algorithms.

2. Preliminary Definitions

Throughout this paper, the graph G with nodes V and edges E, denoted by (V, E), is assumed to contain no loops or multiple edges. The subgraph of G = (V, E) induced by a subset U of the nodes V consists of those nodes and all the edges that directly connect them. This subgraph is represented by < U > or (U, E′) where E′ = {(w₁, w₂) ∣ (w₁, w₂) ∈ E, w₁ ∈ U, w₂ ∈ U}. The degree of a node w ∈ G, denoted by d(w), is the number of nodes adjacent to w in G. Define d_U(w) to be the number of nodes in U adjacent to w in G. This is equivalent to the degree of w in < U ∪ {w} >.

A coloring of G is an assignment of colors to the nodes of G such that no two adjacent nodes share the same color. More formally, a k-coloring of G is a mapping f: V → {1, 2, …, k} such that f(u) = f(v) only if (u, v) ∉ E. The chromatic number of G, denoted χ(G), is the minimal number of colors necessary to color G. An optimal coloring of G is one which uses exactly χ(G) colors.
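The definition of a coloring can be checked mechanically. The following is a minimal sketch in Python; the function name `is_proper_coloring` and the edge-list representation are illustrative choices, not part of the original paper.

```python
def is_proper_coloring(edges, f):
    """Return True if no edge joins two nodes that share a color.

    edges: iterable of (u, v) pairs; f: dict mapping node -> color.
    """
    return all(f[u] != f[v] for u, v in edges)


# The 4-cycle 0-1-2-3-0 admits a 2-coloring that alternates colors.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_proper_coloring(edges, {0: 1, 1: 2, 2: 1, 3: 2}))  # True
print(is_proper_coloring(edges, {0: 1, 1: 1, 2: 2, 3: 2}))  # False: edge (0, 1)
```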

3. Sequential Coloring Algorithms

One of the simplest coloring algorithms is the randomly ordered sequential (RND) graph coloring algorithm [14]. Given a graph G = (V, E), the algorithm randomly orders the nodes so that V = {v₁, …, v_n} and then assigns colors to the nodes in the following manner. The first node, v₁, is assigned color number 1. Once the first i nodes have been colored (1 ≤ i ≤ n − 1), v_{i + 1} is assigned the lowest possible color number such that no previously colored node adjacent to v_{i + 1} has been assigned the same color number.

Though this algorithm is locally optimal in the sense that each node is assigned the smallest possible number, the overall action is highly dependent on the initial ordering of the nodes. For any graph, there exists an ordering for which this algorithm will produce an optimal coloring [14], while a less fortuitous ordering may lead to an extremely poor coloring. Thus the problem of finding an optimal initial ordering of the nodes is equivalent to the problem of optimally coloring the graph.
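A minimal Python sketch of the sequential procedure may help make the dependence on ordering concrete. The adjacency-dictionary representation and the names `sequential_coloring` and `rnd_coloring` are assumptions made for illustration only.

```python
import random


def sequential_coloring(adj, order):
    """Color nodes in the given order, assigning each node the lowest
    color number not already used by one of its colored neighbors."""
    color = {}
    for v in order:
        used = {color[w] for w in adj[v] if w in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color


def rnd_coloring(adj, seed=None):
    """RND: sequential coloring applied to a random ordering of the nodes."""
    order = list(adj)
    random.Random(seed).shuffle(order)
    return sequential_coloring(adj, order)


# The path 0-1-2-3 is 2-colorable, but the outcome depends on the ordering.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(sequential_coloring(adj, [0, 1, 2, 3]))  # {0: 1, 1: 2, 2: 1, 3: 2}
print(sequential_coloring(adj, [0, 3, 1, 2]))  # {0: 1, 3: 1, 1: 2, 2: 3}
```

The second ordering colors both end nodes first and is thereby forced to use a third color on node 2.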

This fact has led to the development of a large number of algorithms, each differing from RND only in the method of initially ordering the nodes [7, 14]. Two such algorithms are the largest first (LF) and smallest last (SL) sequential coloring algorithms.

The LF algorithm orders the nodes such that d(v_i) ≥ d(v_{i + 1}) for 1 ≤ i < n where V = {v₁, …, v_n}. The SL algorithm is similar in strategy but recursively orders the smallest degree nodes last. An SL ordering is one in which $d(v_{n}) = \min_{w \in V} d(w)$ and, for n − 1 ≥ i ≥ 1, $d_{U}(v_{i}) = \min_{w \in U} d_{U}(w)$ where U = V − {v_n, …, v_{i + 1}}.
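The two orderings can be sketched as follows. This is an illustrative Python version that favors clarity over speed; the function names and the adjacency-dictionary input are assumptions, and ties are broken by input order, which the text leaves unspecified.

```python
def lf_order(adj):
    """Largest first: nodes sorted by non-increasing degree."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)


def sl_order(adj):
    """Smallest last: repeatedly remove a node of minimum degree in the
    remaining subgraph; the node removed first is placed last."""
    remaining = list(adj)
    deg = {v: len(adj[v]) for v in adj}  # degree within the remaining subgraph
    removed = []
    while remaining:
        v = min(remaining, key=lambda u: deg[u])
        remaining.remove(v)
        removed.append(v)
        for w in adj[v]:
            deg[w] -= 1  # entries for already removed nodes are never read again
    return removed[::-1]


# A small example: node 0 has degree 3, node 3 has degree 2, the rest degree 1.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
print(lf_order(adj))
print(sl_order(adj))
```

Either ordering can then be passed to any sequential coloring routine in place of a random ordering.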

Note that both the LF and SL algorithms tend to order the high degree nodes before the low degree nodes. Computational experience has shown that this is generally a good strategy, whereas algorithms which color the higher degree nodes last have often been found to produce colorings worse than those produced by a random ordering.

Each of the sequential coloring algorithms presented in this section requires O(n²) time and O(n²) space to color a graph with n nodes. Quadratic time and space complexities are generally quite acceptable for use with large-scale coloring problems. If only they gave guaranteed optimal colorings, we would look no further.

4. More Sophisticated Algorithms

One successful variation of the sequential coloring algorithms involves what is known as an interchange. Given any G = (V, E) and color function f such that f(w) ∈ {i, j} for all w ∈ V, an (i, j)-interchange on G is a redefinition of f such that if f(w) = i originally, f(w) is now assigned j and vice versa for all w ∈ V.

Appropriate use of the interchange process has been found to yield particularly good results when used in conjunction with the LF and SL algorithms [14]. The resulting procedures are referred to as the smallest last with interchange (SLI) and largest first with interchange (LFI) coloring algorithms.

The SLI (LFI) algorithm operates just like the SL (LF) algorithm except when the latter requires the introduction of a new color. Suppose that such a situation occurs when v_m is the node to be colored and that $k = \max_{i < m} f(v_{i})$. For 1 ≤ i < j ≤ k, define G_ij to be the subgraph of G induced by the nodes of G previously colored i or j. If possible, choose i and j such that no connected component of G_ij contains two differently colored nodes both adjacent to v_m. If such a G_ij is found to exist, then perform an (i, j)-interchange on each connected component of G_ij which contains an i-colored node adjacent to v_m in G. It is now possible to assign color i to v_m and thus the addition of a new color has been avoided. If no such G_ij exists, however, then regardless of what interchange is performed, v_m must be assigned color k + 1.
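The interchange step described above can be sketched in Python as follows. This is an illustrative reading of the paragraph, not the author's implementation: the names `components_ij` and `try_interchange` are hypothetical, `adj` is assumed to map each node to a set of neighbors, and pairs (i, j) are tried in increasing order since the text does not say which qualifying pair to prefer.

```python
from itertools import combinations


def components_ij(adj, color, i, j):
    """Connected components of the subgraph induced by nodes colored i or j."""
    nodes = {v for v, c in color.items() if c in (i, j)}
    seen, comps = set(), []
    for s in nodes:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in nodes and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def try_interchange(adj, color, vm, k):
    """Try to free a color in 1..k for the uncolored node vm.

    Returns the freed color (after swapping colors in place) or None."""
    nbrs = adj[vm]
    for i, j in combinations(range(1, k + 1), 2):
        comps = components_ij(adj, color, i, j)
        # Reject the pair if a component holds neighbors of vm in both colors.
        if any(len({color[u] for u in comp & nbrs}) > 1 for comp in comps):
            continue
        for comp in comps:
            if any(color[u] == i for u in comp & nbrs):
                for u in comp:
                    color[u] = j if color[u] == i else i
        return i
    return None


# Path 0-1-2-3 with nodes 0, 3, 1 already colored; node 2 sees colors 1 and 2.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
color = {0: 1, 3: 1, 1: 2}
color[2] = try_interchange(adj, color, 2, k=2)
print(color)  # node 3 is switched to color 2, so node 2 can take color 1
```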

This version of the SLI (LFI) algorithm initially appeared in [11] and is an extension of the original version which is described in [14]. The original version allows an (i, j)-interchange only when v_m is adjacent to exactly one node colored i and one node colored j. There is little difference between the original and extended versions of the SLI and LFI algorithms in terms of colorings produced or time required. While the extended versions may be able to perform a useful interchange impossible in the original version, they will likely take slightly longer to do so. All four algorithms require O(n³) time and O(n²) space to color an n node graph. Based on a limited amount of computational experience, the extended version of the SLI algorithm (henceforth to be referred to simply as the SLI algorithm) was found to produce slightly better results than did the other interchange procedures.

All of the algorithms thus far presented are capable of producing very bad colorings, in terms of number of colors used, for certain graphs. Johnson [10, 11] has given constructions of 3-colorable graphs on O(n) vertices for which each of the above algorithms requires n colors. Since no more than O(n) colors may be used to color an O(n) node graph, such colorings are, up to a constant, the worst possible.

There is an algorithm, however, which will color any graph G with n nodes in $O\left(\frac{n}{\log n}\right) \chi(G)$ or fewer colors. While this worst-case behavior is still unacceptable in practice, the approximately maximum independent set (AMIS) algorithm is interesting because it is the only algorithm known not to exhibit the worst possible worst-case behavior [11]. The algorithm proceeds as follows. Given G = (V, E), select the node with minimum degree in G, say v₁, and color it 1. Once i nodes have been assigned color 1, select, if possible, v_{i + 1} ∈ U such that d_U(v_{i + 1}) is minimal for nodes in U, where U is the set of uncolored nodes not adjacent to any colored node. If no such selection is possible, i.e., U is empty, then repeat the entire procedure on the subgraph of G induced by the uncolored nodes of G, using the next available color. This process is then, in turn, repeated until all the nodes of G have been colored.
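A compact Python sketch of the AMIS procedure is given below. It is an illustration under stated assumptions: `adj` maps each node to a set of neighbors, node labels are sortable, and ties are broken by smallest label (the text does not specify a tie rule).

```python
def amis_coloring(adj):
    """Build one color class at a time, always adding a node of minimum
    degree within U, the uncolored nodes not adjacent to the current class."""
    color = {}
    uncolored = set(adj)
    c = 0
    while uncolored:
        c += 1
        U = set(uncolored)
        while U:
            v = min(sorted(U), key=lambda u: len(adj[u] & U))
            color[v] = c
            uncolored.discard(v)
            U -= adj[v] | {v}  # v and its neighbors can no longer join class c
    return color


# The 5-cycle has chromatic number 3.
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = amis_coloring(adj)
print(coloring, max(coloring.values()))
```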

Interestingly enough, while this algorithm exhibits better worst-case behavior than the other algorithms thus far discussed, computational experience has shown that, on the average, the colorings it produces are substantially inferior to those produced by the LF, SL, and SLI algorithms.

5. The Recursive Largest First (RLF) Algorithm

The RLF algorithm combines the strategy of the LF algorithm with the structure of the AMIS algorithm. Like the LF algorithm, at each step in the RLF procedure a node is selected for coloring which will, in some sense, leave the resulting uncolored nodes colorable in as few colors as possible. As with the AMIS algorithm, the RLF procedure completes the assignment of color i before commencing assignment of color i + 1.

The RLF graph coloring algorithm proceeds as follows. Given G = (V, E), assign color 1 to the node with maximal degree in G, say v₁. Once i nodes have been assigned color 1, select, if possible, v_{i + 1} ∈ U₁ such that $d_{U_{2}}(v_{i + 1})$ is maximal for nodes in U₁, where U₁ is the set of uncolored nodes not adjacent to any colored node and U₂ is the set of uncolored nodes adjacent to at least one colored node. Ties are, if possible, broken by choosing the node that has minimal degree in < U₁ >. If no such selection is possible, i.e., U₁ is empty, then repeat the entire process recursively on the subgraph of G induced by the uncolored nodes of G, using the next available color. This recursion is then repeated until all of the nodes in G are colored. Several examples of this procedure are worked out in appendix A.
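The procedure can be sketched in Python as follows. This is a straightforward illustration, not the PL-1 program of appendix B, and it makes no attempt to achieve the stated time bounds. It assumes `adj` maps each node to a set of neighbors, that node labels are sortable, that remaining ties go to the smallest label, and that "colored node" within a recursive step means a node given the current color (earlier color classes having been deleted from the graph).

```python
def rlf_coloring(adj):
    color = {}
    uncolored = set(adj)
    c = 0
    while uncolored:
        c += 1
        u1 = set(uncolored)  # uncolored, not adjacent to any node colored c
        u2 = set()           # uncolored, adjacent to at least one node colored c
        # First node of the class: maximal degree in the uncolored subgraph.
        v = max(sorted(u1), key=lambda u: len(adj[u] & u1))
        while True:
            color[v] = c
            uncolored.discard(v)
            u1.discard(v)
            moved = adj[v] & u1
            u1 -= moved
            u2 |= moved
            if not u1:
                break
            # Most neighbors in U2; ties broken by minimal degree in <U1>.
            v = max(sorted(u1),
                    key=lambda u: (len(adj[u] & u2), -len(adj[u] & u1)))
    return color


# The 5-cycle (chromatic number 3) and the complete bipartite graph K(3,3).
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
k33 = {i: ({3, 4, 5} if i < 3 else {0, 1, 2}) for i in range(6)}
print(max(rlf_coloring(c5).values()), max(rlf_coloring(k33).values()))  # 3 2
```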

As was true with the SLI algorithm, the RLF algorithm, in general, requires O(n³) time and O(n²) space to color an n node graph. Unlike the SLI algorithm, however, the RLF algorithm requires only O(n²) time to color graphs for which k · e ≈ n² where k is the number of colors used to color the graph, e is the number of edges in the graph, and n is the number of nodes in the graph (see appendix B for proof). Such graphs, which are usually sparse, quite commonly arise in practical applications such as exam scheduling. For example, the graph associated with the 1977–8 Princeton University fall term course examinations schedule consisted of 273 nodes, 6727 edges, and required 17 colors to be colored by the RLF algorithm. Thus, for practical purposes, the RLF algorithm, if programmed properly, exhibits an O(n²) time dependence for many applications. Appendix B presents a PL-1 listing of the RLF algorithm as well as a rigorous analysis of its time complexity.
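As a rough check of the condition k · e ≈ n² against the Princeton figures quoted above (a worked calculation added for illustration, using only the numbers in the text):

$$
k \cdot e = 17 \cdot 6727 = 114359, \qquad n^{2} = 273^{2} = 74529, \qquad \frac{k \cdot e}{n^{2}} \approx 1.53 .
$$

The product k · e is thus within a small constant factor of n², which is the regime in which the O(n²) behavior is claimed.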

6. Generation of Test Graphs With Known Chromatic Number

A few papers have been published which compare the performance of various algorithms on large (usually 100-node) randomly generated graphs [14, 21, 23]. Unfortunately, none of these empirical studies provide the chromatic numbers of the test graphs used. Indeed, the task of closely approximating the chromatic number of a graph is NP-complete [8] and thus virtually impossible to accomplish for large graphs. Consequently, approximations of upper and lower bound results established for χ(G) have generally been crude and of little practical use [1, 7, 14, 19].

The lack of such information makes an accurate interpretation of the experimental data very difficult. For instance, if algorithm A required 22 colors to color G while algorithm B required only 20, the conclusions drawn about their relative effectiveness if χ(G) = 20 might be quite different from those drawn if χ(G) = 4. Further, without knowledge of χ(G), no statement can be made at all about the accuracy or closeness to optimality of either algorithm A or B. Thus there is a need for a standard procedure for generating random test graphs with known chromatic numbers. Such a procedure will now be presented.

Suppose it is desired to construct an n-node graph G with e edges and chromatic number k. For the purposes of the following argument, assume that k ∣ n. This is not a significant restriction since most test or modeling uses of a large graph generator are likely to allow some flexibility in the choices of n and k. For such a graph to exist under these restrictions, e must be such that

$$
\frac{k (k - 1)}{2} \leq e \leq \frac{n^{2} (k - 1)}{2 k} .
$$
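One way to see where these bounds come from (a short sketch, not taken from the original text): a graph with chromatic number k has at least as many edges as a single k-clique, and a k-colorable graph on n nodes has the most edges when it is complete k-partite with n/k nodes in each color class:

$$
e_{\min} = \binom{k}{2} = \frac{k (k - 1)}{2}, \qquad
e_{\max} = \binom{k}{2} \left( \frac{n}{k} \right)^{2} = \frac{k (k - 1)}{2} \cdot \frac{n^{2}}{k^{2}} = \frac{n^{2} (k - 1)}{2 k} .
$$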

The first step in the procedure is to choose positive integers a, c and m such that:

m ≫ n,
(n, m) = k,
(c, m) = 1,
p ∣ m → p ∣ (a − 1) for all primes p, and
4 ∣ m → 4 ∣ (a − 1).

Next generate a uniform sequence of random numbers {X_i} on the interval 0 to m − 1 by the linear congruential method described in [12]. This is accomplished by fixing X₀ and then, for each i > 0, setting X_i = MOD(aX_{i − 1} + c, m) where $MOD(X, Y) = X - \left\lfloor \frac{X}{Y} \right\rfloor \cdot Y$. Sequences generated in such a manner exhibit two important properties [12]. First, for every i and j such that 0 ≤ j ≤ m − 1 and i ≥ 0, there exists an r such that i ≤ r ≤ i + m − 1 and X_r = j. Second, for every i ≥ 0, X_i = X_{i + m}.
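The parameter conditions and the two stated properties of {X_i} can be checked numerically. The sketch below is illustrative: the helper names are hypothetical, and the parameter values (n = 12, k = 3, m = 3⁷, a = 850, c = 1001, X₀ = 7) are assumptions chosen only because they satisfy the listed conditions, not values from the paper.

```python
from itertools import islice
from math import gcd


def prime_factors(m):
    """Set of primes dividing m (trial division; fine for small m)."""
    p, out = 2, set()
    while p * p <= m:
        while m % p == 0:
            out.add(p)
            m //= p
        p += 1
    if m > 1:
        out.add(m)
    return out


def parameters_ok(n, k, a, c, m):
    """Check (n, m) = k, (c, m) = 1, p | m -> p | (a - 1), 4 | m -> 4 | (a - 1)."""
    return (gcd(n, m) == k and gcd(c, m) == 1
            and all((a - 1) % p == 0 for p in prime_factors(m))
            and (m % 4 != 0 or (a - 1) % 4 == 0))


def lcg(x0, a, c, m):
    """Yield X_1, X_2, ... with X_i = MOD(a * X_{i-1} + c, m)."""
    x = x0
    while True:
        x = (a * x + c) % m
        yield x


n, k, a, c, m = 12, 3, 850, 1001, 3**7
print(parameters_ok(n, k, a, c, m))           # True
xs = list(islice(lcg(7, a, c, m), 2 * m))
print(sorted(xs[:m]) == list(range(m)))       # every value occurs once per period
print(xs[:m] == xs[m:])                       # X_i = X_{i+m}
```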

Next construct the sequence {Y_i} on the interval 0 to n − 1 so that Y_i = MOD(X_i, n). Note that, unless k = n, n does not divide m (since (n, m) = k) and thus {Y_i} is not a uniform random number sequence on the interval 0 to n − 1.

By defining V = {0, 1, …, n − 1}, it is possible to associate two consecutive values of {Y_i} with edges to be added to E. Similarly, it is possible to associate h consecutive values of {Y_i} with h-cliques to be implanted in G. For example, the subsequence {Y₁, Y₂, Y₃} corresponds to the edge subset $\{(v_{Y_{1}}, v_{Y_{2}}), (v_{Y_{1}}, v_{Y_{3}}), (v_{Y_{2}}, v_{Y_{3}})\}$. By identifying certain subsequences of consecutive elements of {Y_i} and adding the corresponding edges to E, it is possible to construct the desired graph G.

More precisely, define the (k − 1)-vector b = (b_k, b_{k−1}, …, b₂) so that b_k ≥ 1 and b_i ≥ 0 for 2 ≤ i ≤ k − 1. Each b_i corresponds to the number of i-cliques to be implanted in G. Specifically, given the sequence {Y_i} and vector b, proceed as follows. Select the first k values of {Y_i} starting with Y₁ and add the corresponding edges to E. If b_k > 1, select the next k values of {Y_i} and add the corresponding edges to E. Repeat this process until b_k k-cliques have been implanted in G. Next add, in an identical fashion, b_{k−1} (k − 1)-cliques to G. Continue the process until b₂ 2-cliques or edges have been added to E. Note that some edges may be “added” several times and thus it may not be possible to precalculate a vector b such that there are exactly e edges in the resulting graph. It is possible, however, to keep track of how many edges have been added at any point and to eliminate the addition of i-cliques which might result in the addition of too many edges to E. Since edges may be added one at a time, it is not difficult to show that graphs having exactly e edges may be constructed in this manner for any e such that

$$
\frac{k (k - 1)}{2} \leq e \leq \frac{n^{2} (k - 1)}{2 k} .
$$
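The clique-implanting construction can be sketched in Python as follows. This is a simplified illustration: it omits the edge-count bookkeeping described above, the function names are hypothetical, and the parameter values are assumptions chosen to satisfy the stated conditions (k ∣ n, (n, m) = k, and so on), not the values used for the paper's test graphs.

```python
from itertools import combinations


def y_sequence(n, x0, a, c, m, count):
    """Return [Y_1, ..., Y_count], where X_i = MOD(a * X_{i-1} + c, m)
    and Y_i = MOD(X_i, n)."""
    ys, x = [], x0
    for _ in range(count):
        x = (a * x + c) % m
        ys.append(x % n)
    return ys


def generate_graph(n, k, x0, a, c, m, b):
    """b maps a clique size h (2 <= h <= k) to the number b_h of h-cliques.
    Returns the edge set as pairs (u, v) with u < v."""
    total = sum(h * b.get(h, 0) for h in range(2, k + 1))
    ys = iter(y_sequence(n, x0, a, c, m, total))
    edges = set()
    for h in range(k, 1, -1):  # k-cliques first, single edges last
        for _ in range(b.get(h, 0)):
            nodes = [next(ys) for _ in range(h)]
            for u, v in combinations(nodes, 2):
                edges.add((min(u, v), max(u, v)))  # repeated edges collapse
    return edges


# Assumed example parameters: n = 12, k = 3, m = 3**7, so that (n, m) = 3.
edges = generate_graph(12, 3, x0=7, a=850, c=1001, m=3**7, b={3: 4, 2: 10})
print(len(edges), sorted(edges)[:5])
```

Because the edge set discards duplicates, the number of edges produced may be smaller than the number of pairs drawn, which is the reason the text suggests tracking the edge count during construction.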

It now remains to be shown that χ(G) = k for any G constructed in this manner. Since b_k ≥ 1, G contains a k-clique and thus χ(G) ≥ k. Before establishing that χ(G) ≤ k, it is useful to examine the structure of the sequence $\{Y_{i}^{'}\}$ where $Y_{i}^{'} = MOD(Y_{i}, k)$. Since k ∣ n and k ∣ m,

$$
\begin{aligned}
Y_{i + 1}^{'} &= MOD (Y_{i + 1}, k) = MOD (MOD (X_{i + 1}, n), k) = MOD (X_{i + 1}, k) \\
&= MOD (MOD (a X_{i} + c, m), k) = MOD (a X_{i} + c, k) \\
&= MOD (MOD (a X_{i} + c, n), k) = MOD (a Y_{i} + c, k), \\
Y_{i + 1}^{'} &= MOD (a Y_{i}^{'} + c, k) .
\end{aligned}
$$

Further,

$$
\begin{aligned}
p \mid k &\to p \mid m \to p \mid (a - 1), \\
4 \mid k &\to 4 \mid m \to 4 \mid (a - 1), \text{ and} \\
(c, m) = 1 &\to (c, k) = 1 .
\end{aligned}
$$

Thus $\{Y_{i}^{'}\}$ is a uniform sequence of random numbers on the interval 0 to k − 1.

This structure of the {Y_i} modulo k allows the following coloring of G. For each i, define $f(v_{Y_{i}}) = MOD(i, k)$. Since for all j such that 0 ≤ j < n, there exists an i > 0 such that Y_i = j, it is clear that every node is assigned a color by this procedure. Since $\{Y_{i}^{'}\}$ is a uniform sequence of random numbers on the interval from 0 to k − 1 (and so takes each value exactly once in every k consecutive terms), we know that if Y_i = Y_j then $Y_{i}^{'} = Y_{j}^{'}$ and MOD(i, k) = MOD(j, k) and thus that $f(v_{Y_{i}}) = f(v_{Y_{j}})$. This means that f is well defined. Finally, it is easily verified that $v_{Y_{i}}, v_{Y_{i + 1}}, \dots, v_{Y_{i + h - 1}}$ are all colored differently if h ≤ k, for all i ≥ 0. This means that edges occur only between differently colored nodes, and that f is a proper coloring of G. Thus χ(G) = k.
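The argument can be checked numerically for a particular choice of parameters. The sketch below builds f over one full period and raises an error if two occurrences of the same node would receive different colors; the function name and the parameter values are assumptions made for illustration (they satisfy k ∣ n, (n, m) = k and the other stated conditions).

```python
def implied_coloring(n, k, x0, a, c, m):
    """Return f with f[Y_i] = MOD(i, k) for i = 1..m, checking consistency."""
    f, x = {}, x0
    for i in range(1, m + 1):
        x = (a * x + c) % m
        y = x % n
        if f.setdefault(y, i % k) != i % k:
            raise ValueError("f is not well defined for these parameters")
    return f


f = implied_coloring(12, 3, x0=7, a=850, c=1001, m=3**7)
print(len(f), sorted(set(f.values())))  # 12 [0, 1, 2]
```

Any k consecutive values of {Y_i} then carry k distinct colors, so every implanted clique of size at most k is properly colored by f.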

It should be noted that the above result is a special case of a more general result for arbitrary k and n. That is, if k and n are such that k ≤ d where d = (n, m), then k ≤ χ(G) ≤ k + MOD(d, k) < 2k. The proof of the general result is not given here but is similar to that of the special case when MOD(d, k) = 0.

As will be demonstrated shortly, the range of graphs which can be generated by this procedure is quite large. The node degrees of such graphs may vary between 0 and $n - \frac{n}{k}$ while the average node degree may vary between $\frac{k (k - 1)}{n}$ and $\frac{k - 1}{k} n$ . The variety of distributions of node degrees is also quite large. Most importantly, however, the procedure generates graphs which are as difficult to color as are randomly generated graphs (where the chromatic number is not known). Demonstration of this fact is provided in section 7.

Another advantage of this procedure is that the test graphs may be easily characterized. For example, only k + 5 values are required to generate an n-node graph with chromatic number k. These values are n, k, X₀, a, c, m, b_k, b_{k−1}, …, b₃ and b₂. Whereas it would be infeasible to completely describe a large, randomly generated graph by conventional means in a short paper, graphs generated by this procedure are easily described. Thus, in future publications concerning the effectiveness of various graph coloring algorithms, it will be possible to specify precisely which graphs were used to test the various algorithms. There are several conceivable situations where such documentation could be valuable to the interested reader. For example, should the reader desire to compare the effectiveness of a new graph coloring algorithm to those in the literature, he would need only to regenerate the graphs used in published tests and color them with the new algorithm. This would eliminate the necessity of developing an entirely new set of test data and of having to rerun all previous algorithms on such data. Pursuant to these goals, a complete characterization of the test graphs referred to in tables 1 and 2 is provided in appendix C.

Table 1.

Results for 150-node test graphs generated according to the procedure detailed in section 6.

χ	d	number of colors used					average number of excess colors used
χ	d	RND	LF	SL	RLF	SLI	RND	LF	SL	RLF	SLI
5	11	(9, 9, 9)	(7, 8, 7)	(7, 7, 8)	(6, 6, 6)	(6, 6, 6)	4	2⅓	2⅓	1	1
	19	(11, 11, 12)	(10, 10, 9)	(9, 10, 9)	(7, 7, 8)	(9, 5, 8)	6⅓	4⅔	4⅓	2⅓	2⅓
	24	(13, 12, 13)	(10, 11, 11)	(10, 9, 10)	(7, 6, 7)	(8, 7, 6)	7⅔	5⅔	4⅔	1⅔	2
Ave. total							6	4²/₉	3⁷/₉	1⁶/₉	1⁷/₉
10	11	(11, 11, 12)	(10, 10, 10)	(10, 10, 10)	(10, 10, 10)	(10, 10, 10)	1⅓	0	0	0	0
	21	(15, 14, 16)	(12, 12, 12)	(12, 12, 12)	(11, 11, 11)	(11, 11, 11)	5	2	2	1	1
	29	(17, 16, 18)	(14, 14, 14)	(14, 15, 15)	(13, 13, 12)	(13, 13, 12)	7	4	4⅔	2⅔	2⅔
Ave. total							4⁴/₉	2	2²/₉	1²/₉	1²/₉
15	12	(15, 16, 15)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)	⅓	0	0	0	0
	25	(19, 18, 18)	(17, 16, 16)	(16, 16, 15)	(15, 15, 15)	(15, 15, 15)	3⅓	1⅓	⅔	0	0
	34	(19, 21, 20)	(17, 17, 18)	(17, 19, 17)	(16, 16, 16)	(16, 16, 16)	5	2⅓	2⅔	1	1
Ave. total							2⁸/₉	1²/₉	1¹/₉	⅓	⅓
Overall average total							4¹²/₂₇	2¹³/₂₇	2¹⁰/₂₇	1²/₂₇	1³/₂₇
Total time (seconds)		10.2	13.3	13.5	15.3	49.2

Table 2.

Results for 450-node test graphs generated according to the procedure detailed in section 6.

χ	d	number of colors used					average number of excess colors used
χ	d	RND	LF	SL	RLF	SLI	RND	LF	SL	RLF	SLI
5	25	(14, 13)	(11, 12)	(11, 12)	(8, 8)	(10, 9)	8½	6½	6½	3	4½
	43	(17, 18)	(12, 14)	(11, 15)	(5, 5)	(5, 5)	12½	8	8	0	0
Ave. total							10½	7¼	7¼	1½	2¼
15	36	(22, 22)	(18, 18)	(18, 18)	(17, 16)	(16, 16)	7	3	3	1½	1
	74	(30, 31)	(26, 26)	(26, 26)	(23, 23)	(23, 24)	15½	11	11	8	8½
Ave. total							11¼	7	7	4¾	4¾
25	37	(29, 27)	(26, 25)	(25, 25)	(25, 25)	(25, 25)	3	½	0	0	0
	77	(37, 35)	(29, 30)	(31, 31)	(28, 28)	(28, 29)	11	4½	6	3	3½
Ave. total							7	2½	3	1½	1¾
Overall average total							9⁷/₁₂	5⁷/₁₂	5⁹/₁₂	2⁷/₁₂	2¹¹/₁₂
Total time (seconds)		32.0	42.8	44.9	80.7	308.8

7. Test Results

The procedure described above was used to generate 27 150-node graphs and 12 450-node graphs of varying edge density and chromatic number. In addition, 27 completely random 150-node graphs were generated with varying edge density and unknown chromatic number. The RND, LF, SL, RLF and SLI algorithms were tested on each of the 66 graphs. The resulting data are displayed in tables 1, 2, and 3, respectively.

Table 3.

Results for random 150-node test graphs.

Lower bound on χ	d	RND	LF	SL	RLF	SLI
5	10	(9, 9, 8)	(7, 7, 7)	(7, 7, 7)	(6, 6, 6)	(6, 6, 6)
	18	(11, 11, 11)	(10, 10, 10)	(10, 10, 9)	(8, 8, 8)	(9, 8, 8)
	24	(13, 13, 12)	(10, 12, 11)	(11, 11, 11)	(9, 9, 9)	(10, 10, 10)
10	10	(16, 15, 16)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)
	18	(16, 17, 16)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)
	24	(17, 18, 17)	(15, 16, 16)	(15, 15, 15)	(15, 15, 15)	(15, 15, 15)
15	8	(24, 22, 21)	(24, 22, 21)	(24, 22, 21)	(24, 22, 21)	(24, 22, 21)
	16	(23, 23, 24)	(23, 23, 24)	(23, 23, 24)	(23, 23, 24)	(23, 23, 24)
	24	(24, 24, 24)	(24, 24, 24)	(24, 24, 24)	(24, 24, 24)	(24, 24, 24)
Total time (seconds)		9.8	12.9	12.9	19.0	384.3

In each table, the graphs are subdivided into groups according to chromatic number, χ (or, in the case of the completely random graphs, a known lower bound for χ), and average node degree, d. There are three graphs in each 150-node group and two graphs in each 450-node group. The numbers in parentheses indicate the number of colors used by an algorithm to color the first, second, and possibly, third graph of that category. For the graphs where χ was known, the average number of excess colors used by the algorithm was computed for each group and totaled. For example, the RLF algorithm optimally colored each of the 10-colorable 150-node graphs with average degree 11 in table 1 but required, on the average, 4¾ extra colors to color each of the four 15-colorable 450-node graphs in table 2.
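The averaging can be reproduced with exact fractions. The short Python sketch below recomputes the 4¾ figure from the RLF entries for the 15-colorable graphs in table 2; the function name `average_excess` is an illustrative choice.

```python
from fractions import Fraction


def average_excess(chi, colors_used):
    """Average number of colors used beyond chi over one group of graphs."""
    return Fraction(sum(colors_used), len(colors_used)) - chi


# Table 2, chi = 15: RLF used (17, 16) colors at d = 36 and (23, 23) at d = 74.
groups = [average_excess(15, (17, 16)), average_excess(15, (23, 23))]
print(groups)                     # [Fraction(3, 2), Fraction(8, 1)]
print(sum(groups) / len(groups))  # 19/4, i.e. 4 3/4
```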

The total run time for each algorithm is also included in each table. This figure represents execution time in seconds on an IBM 360–91. It should be noted that such figures are highly dependent on factors unrelated to the inherent efficiency of the algorithm, such as programmer skill and machine characteristics. The time complexity estimates provided earlier are much more rigorous measures of the algorithms’ relative speeds. Except for the SLI time in table 3, the run times are in accordance with these theoretical estimates. It is quite possible that the graphs referenced in table 3 have chromatic numbers much higher than the minimum estimate, and that the SLI algorithm was thus induced to attempt and possibly perform a large number of time-consuming interchanges. This example points out the highly variable amounts of time required by most interchange algorithms to color various graphs (a phenomenon also observable in the data of [14]).

The random graphs of table 3 were included only for the purpose of demonstrating that the graphs generated by the technique discussed in section 6 are just as suitable for testing the relative capabilities of graph coloring algorithms as are completely randomly generated graphs. As was pointed out earlier, the data in table 3 cannot be used to draw conclusions about the accuracy of the tested algorithms. From the data in tables 1 and 2, however, we observe that, for the graphs considered, the LF and SL algorithms required about twice as many extra colors to color the graphs as did the RLF and SLI algorithms. Similarly, the RND algorithm required about twice as many extra colors as did the LF and SL algorithms. Significantly, this observation can be made for most of the graphs on an individual basis. The RND algorithm never used fewer colors than the LF and SL algorithms which, in turn, never used fewer colors than the RLF or SLI algorithms; in most cases the differences were strict.

There is not as clear a distinction between the performance of the LF and SL algorithms or the RLF and SLI algorithms. The LF and SL algorithms required virtually the same number of colors on the average and required nearly the same amount of time. While the colorings produced by both the RLF and SLI algorithms for test graphs were, on the average, quite good, the RLF algorithm required substantially less time and used approximately 12 percent fewer excess colors on the 450-node graphs and 3 percent fewer excess colors on the 150-node graphs than did the SLI algorithm. Of the eight 450-node graphs which were not optimally colored by both the RLF and SLI algorithms, the RLF algorithm required the fewest colors for four of the graphs and the most for only one graph.

As a final note, the edge density, $\frac{d}{n}$, of the test graphs did not exceed ¼. This results from the fact that for most large-scale practical applications, the edge density of the graphs to be colored is generally small. For instance, the Princeton University exam scheduling graph mentioned in Section 5 had an edge density of approximately ⅙.

8. Conclusions

From the data presented it is apparent that the RLF algorithm, when not optimal, colored large graphs with substantially fewer colors than did any of the other algorithms that did not involve interchanges. When compared with interchange algorithms, the RLF algorithm was found to produce slightly better colorings in substantially less time. While the RLF and interchange algorithms in general each require O(n³) time to color an n-node graph, the RLF procedure is unique in that it exhibits O(n²) time behavior for graphs with low edge density. Thus the RLF algorithm is particularly well suited for use with large-scale practical problems.

The method described in section 6, for generating random graphs with a known chromatic number, was found to produce test data which can be used to determine heuristically a given algorithm’s accuracy as well as algorithms’ relative capabilities. Previously, published comparison tests have been made only on graphs with unknown chromatic numbers, which rendered impossible any evaluation of an individual algorithm’s accuracy and questionable any statement about two algorithms’ relative capabilities. In addition, the procedure provides a standard method of generating test data for coloring algorithms; by its use a large graph with known chromatic number may be uniquely constructed from only a few parameters.

Acknowledgments

In addition to Professor Acton, the author would like to thank Dr. Charles Johnson, Dr. James Lawrence, and Dr. Alan Goldman for their helpful remarks.

This work was done in part while the author was a staff member of the Center for Applied Mathematics of the National Bureau of Standards during the summer of 1976 and in part while the author was an undergraduate majoring in Electrical Engineering and Computer Sciences at Princeton University working under the supervision of Professor Forman S. Acton.

10. Appendix A: Application to Examination Scheduling

The examination scheduling problem is probably the best known of a large class of scheduling problems in applied mathematics and operations research. It consists of scheduling exams such that no individual is required to participate in two or more exams simultaneously. It is usually assumed desirable to schedule the exams such that the total number of time periods required for the examinations is minimized. Sometimes, additional restraints are imposed. Requiring that some examinations be given or not given in specified time periods and scheduling the exams so that a certain subset of the exams will be completed as early as is possible are examples of such restraints.

Consider the following exam scheduling problem.

In addition to the information contained in figures 1 and 2, assume that we also know that exam 2 must be scheduled in time period 1 and that the final schedule must be such that the last exam involving a participant of type A is scheduled as early as possible.

We will now proceed to solve the above scheduling problem utilizing the RLF graph coloring algorithm.

Since exam 2 must be scheduled in time period 1, we will do so and amend figure 2 so that exams incompatible (i.e., may not be scheduled concurrently) with exam 2 will not be scheduled in time period 1. This information is included in figure 3.

The restriction placed on exams involving type A participants may be satisfied, as far as is possible by heuristic means, by scheduling the exams involving type A individuals first and then, using this information, scheduling the remaining exams. The graph in figure 4 contains the information necessary for the first step.

Node E_i represents exam i and node T_j represents time period j for all i, j. There is an edge between every pair of time period nodes to insure that no two time periods are assigned the same color. An edge is inserted between node E_i and node E_j if and only if exams i and j may not be scheduled simultaneously. Finally, an edge is inserted between node E_i and node T_j if and only if exam i may not be given during time period j.
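The construction of such a graph can be sketched in Python. Since figures 1 through 4 are not reproduced here, the small instance below is entirely hypothetical, as are the function name and the node labels ("E", i) and ("T", j).

```python
def build_exam_graph(num_exams, num_periods, conflicts, forbidden):
    """Build the adjacency sets of the exam and time period graph.

    conflicts: pairs (i, j) of exams that may not be held simultaneously.
    forbidden: pairs (i, j) meaning exam i may not be given in period j.
    """
    adj = {("E", i): set() for i in range(1, num_exams + 1)}
    adj.update({("T", j): set() for j in range(1, num_periods + 1)})

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    # Every pair of time period nodes is joined so that periods get distinct colors.
    for j1 in range(1, num_periods + 1):
        for j2 in range(j1 + 1, num_periods + 1):
            link(("T", j1), ("T", j2))
    for i, j in conflicts:
        link(("E", i), ("E", j))
    for i, j in forbidden:
        link(("E", i), ("T", j))
    return adj


# Hypothetical instance: exams 1 and 2 conflict; exam 3 may not use period 1.
adj = build_exam_graph(3, 2, conflicts=[(1, 2)], forbidden=[(3, 1)])
print(sorted(adj[("T", 1)]))  # [('E', 3), ('T', 2)]
```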

To obtain the desired schedule we will color the graph in figure 4 using a slightly modified version of the RLF algorithm. At the beginning of each recursive step, we will first color the earliest uncolored time period node instead of the uncolored node adjacent to the most uncolored nodes. This will guarantee that exams are assigned to the earliest time periods first. We will denote the set of uncolored nodes adjacent to at least one colored node, U₂, by encircling such nodes. Colors, as usual, will be denoted by a number. Uncolored nodes not adjacent to any colored node, nodes in U₁, will not be labeled.

The coloring then proceeds as follows. Node T₁ is colored 1 and nodes T₂, T₃, T₄, T₅, E₄, E₆, E₁₀, and E₁₂ are circled. This is illustrated in figure 5.

Only nodes E₁ and E₇ remain in U₁. Node E₁ is adjacent to five nodes in U₂ while node E₇ is adjacent to only two nodes in U₂. Thus node E₁ is colored 1 and node E₇ is circled. This leaves every node either colored or circled (U₁ is empty) and thus we must delete the colored nodes (T₁ and E₁) from the graph and repeat the procedure on the resulting graph starting with color 2. Once T₂ has been assigned color 2 and the appropriate nodes circled, we have the graph in figure 6.

Both nodes E₇ and E₁₀ are adjacent to one node in U₂ while all other nodes in U₁ are not adjacent to any node in U₂. Since E₇ is connected to only one node in U₁ while $d_{U_{1}}(E_{10}) = 2$, E₇ is colored 2 and E₄ is circled. This leaves the graph in figure 7.

Completing the assignment of color 2, E₁₀ is assigned color 2, node E₆ is circled and, finally, E₁₂ is colored 2. Since U₁ is now empty, we delete the colored nodes and repeat the process on the graph in figure 8 using color 3.

This graph is trivially colored by assigning color 3 to nodes T₃, E₄ and E₆; color 4 to T₄; and color 5 to T₅.

The schedule may now be constructed by assigning to each time period those exams which were assigned the color of that time period. Should the number of colors used exceed the number of time period nodes used, the additional colors may be arbitrarily associated with additional time periods (assuming they exist). In this example, however, the number of time period nodes, 5, exceeded the number of colors used, 3, and no such additional assignment of time periods was necessary. The resulting schedule is displayed in figure 9.
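The translation from a coloring to a schedule can be sketched as follows. The function name `schedule_from_coloring` and the node encoding are assumptions of this sketch; the color assignments in the example are the ones obtained in the walk-through above (T₁ and E₁ colored 1; T₂, E₇, E₁₀, and E₁₂ colored 2; T₃, E₄, and E₆ colored 3; T₄ colored 4; T₅ colored 5).

```python
def schedule_from_coloring(color):
    """Map a coloring {("T", j) or ("E", i): color} to {period: [exams]}.

    Exams take the period whose node shares their color. A color used by
    no time period node is associated with the next unused period number
    (one arbitrary choice among those the text allows).
    """
    period_of_color = {c: node[1] for node, c in color.items() if node[0] == "T"}
    next_period = max(period_of_color.values(), default=0) + 1
    schedule = {p: [] for p in period_of_color.values()}
    for (kind, idx), c in sorted(color.items()):
        if kind != "E":
            continue
        if c not in period_of_color:
            # Extra color: open an additional time period for it.
            period_of_color[c] = next_period
            schedule[next_period] = []
            next_period += 1
        schedule[period_of_color[c]].append(idx)
    return schedule

# Coloring obtained above for the exams involving type A participants.
coloring = {
    ("T", 1): 1, ("E", 1): 1,
    ("T", 2): 2, ("E", 7): 2, ("E", 10): 2, ("E", 12): 2,
    ("T", 3): 3, ("E", 4): 3, ("E", 6): 3,
    ("T", 4): 4, ("T", 5): 5,
}
partial_schedule = schedule_from_coloring(coloring)
```

Here periods 4 and 5 remain empty, in agreement with the observation that only 3 of the 5 available colors were needed for the exams.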

This completes the scheduling of exams involving type A individuals. We must now schedule the remaining exams taking into account the partial schedule in figure 9 and the information displayed in figure 3. This information is summarized in figure 10.

Combining this information with that in figure 1, the graph in figure 11 is readily constructed.

All that remains is to color the graph displayed in figure 11. This is easily done using the RLF algorithm. The final coloring is displayed in figure 12.

Note that, depending on the order in which the nodes were considered, E₁₁ could have been assigned color 2 and E₈ color 3 since the two nodes are identical as far as the RLF algorithm is concerned. This illustrates the fact that the final colorings assigned to a graph may, to a small extent, depend on the initial ordering of the nodes (i.e., on the manner in which nodes with identical characteristics are distinguished). Finally, since 4 colors were necessary to color the graph, those exams colored 4 will be scheduled in the first available, unused time period, the fourth time period.

This results in the partial schedule in figure 13.

The completed exam schedule for all exams is displayed in figure 14. In this particular case, the schedule produced is optimal.

11. Appendix B: Computer Implementation of the RLF Algorithm

As was demonstrated in Appendix A, the RLF algorithm can be used to color small graphs easily by hand. For large graphs, however, such a task can only be accomplished reliably by computer. In order to program the RLF algorithm efficiently, the values of $d_{U_{1}}$ and $d_{U_{2}}$ must be stored for each node and updated whenever U₁ or U₂ is modified. This is accomplished by defining two arrays, E and F:

$$E(w) \; \begin{cases} < 0 & \text{if } w \text{ is colored} \\ < 0 & \text{if } w \in U_{2} \\ = d_{U_{1}}(w) & \text{if } w \in U_{1} \end{cases}$$

and

$$F(w) \; \begin{cases} < 0 & \text{if } w \text{ is colored} \\ = d_{U_{1} \cup U_{2}}(w) & \text{if } w \in U_{2} \\ = d_{U_{1} \cup U_{2}}(w) & \text{if } w \in U_{1} \end{cases}$$

Initially U₁ consists of every node in G and thus E(w) = F(w) = d(w) for all w ∈ V. If node w′ is selected for coloring, then we must modify E and F so that:

$$\begin{aligned}
E(w') &\leftarrow -1, \\
E(w) &\leftarrow -1 \quad \text{for } w \text{ adjacent to } w', \\
E(w) &\leftarrow E(w) - (\text{the number of nodes in } U_{1} \text{ adjacent to both } w \text{ and } w') \quad \text{for } w \in U_{1} \text{ and nonadjacent to } w', \\
F(w') &\leftarrow -1, \text{ and} \\
F(w) &\leftarrow F(w) - 1 \quad \text{for } w \text{ adjacent to } w'.
\end{aligned}$$

The next node to be colored will, of the nodes in U₁, then have the maximum value of F(w) − E(w) and, if a tie exists, the minimum value of E(w) among those nodes that tied. When U₁ is empty (this corresponds to E(w) < 0 for all w ∈ V), the values of E are modified so that E(w) ← F(w) for all w ∈ V. This corresponds to a reinitialization of the values of E for the subgraph of G induced by the uncolored nodes.

The above operations on E and F can be easily performed by appropriate use of the following subroutine, procedure DELETE. Given an array D and node w′, DELETE performs the following operations on D:

$$\begin{aligned}
D(w') &\leftarrow -1, \\
D(w) &\leftarrow D(w) - 1 \quad \text{for } w \text{ adjacent to } w'.
\end{aligned}$$

Thus whenever node w′ is selected for coloring we may modify E and F by simply performing DELETE on (F, w′) and (E, w′) as well as on (E, w) for all w in U₁ adjacent to w′. It is not difficult to verify that such a procedure maintains the desired values of E and F.
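The bookkeeping described above can be sketched in Python as follows. This is not the PL-1 listing referred to in this appendix, only a minimal illustration of the E and F arrays and of procedure DELETE. It assumes 0-based node numbers, symmetric neighbor lists without self-loops or repeated edges, that the first node of each color class is the uncolored node of largest degree in the uncolored subgraph, and that remaining ties are broken by lowest node number; the names `rlf_color`, `delete`, and `color_node` are chosen here for illustration.

```python
def rlf_color(adj):
    """Color a graph given as a list of neighbor lists; returns colors 1..k."""
    n = len(adj)
    F = [len(nbrs) for nbrs in adj]  # degree in the uncolored subgraph
    E = list(F)                      # d_{U1}(w) if w is in U1, negative otherwise
    color = [0] * n

    def delete(D, w):
        # Procedure DELETE: D(w) <- -1, D(x) <- D(x) - 1 for x adjacent to w.
        D[w] = -1
        for x in adj[w]:
            D[x] -= 1

    def color_node(w, c):
        # Record the U1 neighbors of w before any array is modified.
        u1_neighbors = [x for x in adj[w] if E[x] >= 0]
        color[w] = c
        delete(F, w)
        delete(E, w)
        for x in u1_neighbors:  # these nodes move from U1 to U2
            delete(E, x)

    c = 0
    while any(f >= 0 for f in F):    # some node is still uncolored
        c += 1
        E[:] = F                     # reinitialize: U1 = all uncolored nodes
        # First node of the color class: largest degree among uncolored nodes.
        w = max((v for v in range(n) if E[v] >= 0), key=lambda v: E[v])
        color_node(w, c)
        while True:
            u1 = [v for v in range(n) if E[v] >= 0]
            if not u1:
                break
            # Maximum F(w) - E(w), ties broken by minimum E(w).
            w = max(u1, key=lambda v: (F[v] - E[v], -E[v]))
            color_node(w, c)
    return color

# A 5-cycle, which requires three colors.
cycle5 = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
colors = rlf_color(cycle5)
```

The scan over all nodes to find U₁ costs O(n) per selection, so the selections contribute O(n²) in total; a production implementation would organize this search differently, but the updates to E and F are exactly the DELETE calls described above.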

A complete PL-1 computer program listing of the RLF procedure is included at the end of this appendix. The program is written in subroutine form and assumes that values for CI and CL are provided on input. Array CI serves as an index array to CL, the node adjacency list. For example, the nodes adjacent to the ith node are stored sequentially in CL(CI(i − 1) + 1), CL(CI(i − 1) + 2), …, CL(CI(i)). The data are stored in this compact form to minimize the amount of storage required, a very important consideration when working with large graphs.
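The compact CI/CL layout can be illustrated with the following Python sketch. The helper names `build_ci_cl` and `neighbors` and the small edge list are assumptions made for the example. Because Python lists are 0-based, the 1-based range CL(CI(i − 1) + 1), …, CL(CI(i)) corresponds to the slice `CL[CI[i-1]:CI[i]]`.

```python
def build_ci_cl(n, edges):
    """Build index array CI and adjacency list CL for nodes numbered 1..n."""
    degree = [0] * (n + 1)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # CI[i] is the running total of degrees of nodes 1..i, with CI[0] = 0.
    CI = [0] * (n + 1)
    for i in range(1, n + 1):
        CI[i] = CI[i - 1] + degree[i]
    CL = [0] * CI[n]
    next_slot = CI[:]  # next_slot[i - 1] is the next free position for node i
    for u, v in edges:
        CL[next_slot[u - 1]] = v
        next_slot[u - 1] += 1
        CL[next_slot[v - 1]] = u
        next_slot[v - 1] += 1
    return CI, CL

def neighbors(CI, CL, i):
    """Nodes adjacent to node i (1-based node number)."""
    return CL[CI[i - 1]:CI[i]]

# Hypothetical 4-node graph with edges 1-2, 1-3, and 3-4.
CI, CL = build_ci_cl(4, [(1, 2), (1, 3), (3, 4)])
```

Storage is n + 1 entries for CI plus 2e entries for CL, where e is the number of edges, rather than the n² entries of a full adjacency matrix.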

Each node w in the graph is processed by procedure DELETE, i.e. deleted, f(w) times where f(w) is the color number assigned to w. This claim is easily established by observing that for each new color i introduced, i ≤ f(w), w is either colored i or adjacent to a node colored i. In either case w is deleted exactly once. Once w is colored, then it may never subsequently be deleted. Thus exactly

$$\sum_{i = 1}^{k} i\, n_{i} \leq \sum_{i = 1}^{k} k\, n_{i} = k \cdot \sum_{i = 1}^{k} n_{i} = n k$$

deletions are performed on G, where k is the number of colors used to color G, n_i is the number of nodes colored i and n is the number of nodes in G. Since each deletion requires O(d) time where d is the average degree of a node, all the deletions may be accomplished in O(kdn) time. It is easily checked that all other operations may be accomplished in O(n²) time. Thus the algorithm requires O(n³) time and O(n²) space to color an arbitrary n node graph. However, for those graphs where kd = O(n) or, equivalently, ke = O(n²), the RLF algorithm consumes only O(n²) time and O(n²) space in coloring the graph. As was pointed out in the text, many large-scale practical problems involve graphs for which this property holds and thus may be colored with the RLF algorithm in O(n²) time.
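The equivalence of the two conditions follows from the fact that the degrees of a graph sum to twice its number of edges e. The short derivation below spells this out and adds a rough numerical check; the numbers assume, purely for illustration, a 150-node graph of average degree 11 that is colored with k = 5 colors.

```latex
e = \tfrac{1}{2}\, n d
\;\Longrightarrow\;
k e = \tfrac{1}{2}\, n \,(k d),
\qquad\text{so}\qquad
k d = O(n) \iff k e = O(n^{2}).

% Illustrative values: n = 150, d = 11, k = 5
k d = 55 < n = 150,
\qquad
\sum_{i=1}^{k} i\, n_{i} \;\le\; n k = 750 \ \text{deletions}.
```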

[Program listing: PL-1 implementation of the RLF procedure, reproduced in the original as images jres-84-489-g001.jpg and jres-84-489-g002.jpg.]

12. Appendix C: Characterization of Test Graphs

Tables 4 and 5 of this appendix provide the information needed to regenerate each of the test graphs used in the preparation of the data included in tables 1 and 2, respectively. Each graph is referenced by its chromatic number, its average node degree, and the order in which the colorings (or values of X₀) are given. For example, the parameters for the third 150-node graph with chromatic number 5 and average node degree 11 (i.e., the graph that SL 8-colored) are: n = 150, k = 5, a = 8401, c = 6859, m = 84035, b₅ = 19, b₄ = 60, b₃ = 97, b₂ = 210, and X₀ = 22093. Using these parameters, it is possible to regenerate the graph by using the procedure described in section 6.

Table 4.

Generation Parameters for the Test Graphs Used in Table 1.

χ	d avg.	n	k	a	c	m	b	3 values of X₀
5	11	150	5	8401	6859	84035	(19, 60, 97, 210)	0,33289,22093
	19	150	5	8401	6859	84035	(39, 120, 195, 420)	21047,55697,74912
	24	150	5	8401	6859	84035	(58, 180, 292, 630)	78692,83491,52870
10	11	150	10	8401	6859	168070	(10, 0, 0, 12, 0, 0, 25, 0, 24)	8879,64827,78005
	21	150	10	8401	6859	168070	(21, 0, 0, 24, 0, 0, 51, 0, 48)	5293,59845,102567
	29	150	10	8401	6859	168070	(31, 0, 0, 36, 0, 0, 76, 0, 72)	107489,118239,101759
15	12	150	15	8401	6859	252105	(4, 0, 0, 0, 0, 7, 0, 0, 0, 0, 12, 0, 0, 148)	80589,60363,94632
	25	150	15	8401	6859	252105	(9, 0, 0, 0, 0, 15, 0, 0, 0, 0, 24, 0, 0, 297)	220881,67107,198723
	34	150	15	8401	6859	252105	(13, 0, 0, 0, 0, 22, 0, 0, 0, 0, 36, 0, 0, 445)	66684,189309,9534


Table 5.

Generation Parameters for the Test Graphs Used in Table 2.

χ	d avg.	n	k	a	c	m	b	2 values of X₀
5	25	450	5	8401	6859	84035	(175, 540, 877, 1890)	0,41794
	43	450	5	8401	6859	84035	(409, 1260, 2047, 4410)	35428,47927
15	36	450	15	8401	6859	252105	(40, 0, 0, 0, 0, 67, 0, 0, 0, 0, 108, 0, 0, 1336)	36276,213549
	74	450	15	8401	6859	252105	(94, 0, 0, 0, 0, 157, 0, 0, 0, 0, 252, 0, 0, 3118)	161712,160056
25	37	450	25	8401	6859	420175	(13, 0, 0, 0, 0, 0, 0, 0, 0, 27, 0, 0, 0, 0, 0, 0, 0, 0, 40, 0, 0, 0, 0, 1336)	192625,358531
	77	450	25	8401	6859	420175	(31, 0, 0, 0, 0, 0, 0, 0, 0, 63, 0, 0, 0, 0, 0, 0, 0, 0, 94, 0, 0, 0, 0, 3118)	247337,274955


Footnotes

Numbers in brackets indicate literature references at the end of this paper.

9. References

[1] Aho A. V., Hopcroft J. E., and Ullman J. D., The Design and Analysis of Computer Algorithms, (Addison-Wesley, Reading, MA, 1974), pp. 364–404.
[2] Bondy J. A., Bounds for the chromatic number of a graph, Journal of Combinatorial Theory, 7 (1969), pp. 96–8.
[3] Broder S., Final examination scheduling, Communications of the ACM, Vol. 7, No. 8 (1964), pp. 494–8.
[4] Brooks R. L., On coloring the nodes of a network, Proceedings of the Cambridge Philosophical Society, 37 (1941), p. 194.
[5] Brown J. R., Chromatic scheduling and the chromatic number problem, Management Science, Vol. 19, No. 4 (1972), pp. 456–463.
[6] Christofides N., An algorithm for the chromatic number of a graph, The Computer Journal, Vol. 14, No. 1 (1971), pp. 38–9.
[7] Christofides N., Graph Theory — An Algorithmic Approach, (Academic Press, New York, 1975), pp. 58–78.
[8] Garey M. R., and Johnson D. S., The Complexity of Near-optimal Graph Coloring, Journal of the ACM, Vol. 23, No. 1 (January 1976), pp. 43–9.
[9] Hall A. D., and Acton F. S., Scheduling University Course Examinations by Computer, Communications of the ACM, Vol. 10, No. 4 (April 1967), pp. 235–8.
[10] Johnson D. S., Approximation Algorithms for Combinatorial Problems, Journal of Computer and System Sciences 9 (1974), pp. 256–78.
[11] Johnson D. S., Worst Case Behavior of Graph Coloring Algorithms, Proceedings of the 5th Southeast Conference on Combinatorics, Graph Theory, and Computing (1974), pp. 513–27.
[12] Knuth D. E., The Art of Computer Programming, (Addison-Wesley, Reading, MA, 1969), Vol. 2, pp. 9–18.
[13] Leighton F. T., A New Solution to the Exam Scheduling Problem, unpublished paper.
[14] Matula D. W., Marble G., and Isaacson J. D., Graph Coloring Algorithms, Graph Theory and Computing, Read Ronald C., editor, (Academic Press, New York, 1972), pp. 109–122.
[15] Matula D. W., Bounded Color Functions on Graphs, Networks 2 (1972), pp. 29–44.
[16] Ore Oystein, The Four Color Problem, (Academic Press, New York, London, 1967).
[17] Peck J. E. L., and Williams M. R., Examination Scheduling, Communications of the ACM, Vol. 9, No. 6 (1966), pp. 433–4.
[18] Pershin O. Y., An Algorithm for Determining the Minimum Coloring of a Finite Graph, Engineering Cybernetics, Vol. 11, No. 6 (1973), pp. 980–5.
[19] Szekeres G., and Wilf H. S., An Inequality for the Chromatic Number of a Graph, Journal of Combinatorial Theory, 4 (1968), pp. 1–3.
[20] Welsh D. J. A., and Powell M. B., An Upper Bound for the Chromatic Number of a Graph and its Applications to Timetabling Problems, The Computer Journal, 10 (1967), pp. 85–6.
[21] Williams M. R., The Coloring of Very Large Graphs, Combinatorial Structures and Their Applications — Proceedings of the Calgary International Conference on Combinatorial Structures and Their Applications, Gordon and Breach, New York (June 1969), pp. 477–8.
[22] Wood D. C., A System for Computing University Examination Timetables, The Computer Journal, Vol. 11, No. 1 (May 1968), p. 41.
[23] Wood D. C., A Technique for Coloring a Graph Applicable to Large Scale Timetabling Problems, The Computer Journal, 12 (1969), p. 317.

