Journal of Research of the National Bureau of Standards. 1979 Nov-Dec;84(6):489–506. doi: 10.6028/jres.084.024

A Graph Coloring Algorithm for Large Scheduling Problems

Frank Thomson Leighton 1,**
PMCID: PMC6756213  PMID: 34880531

Abstract

A new graph coloring algorithm is presented and compared to a wide variety of known algorithms. The algorithm is shown to exhibit O(n²) time behavior for most sparse graphs and thus is found to be particularly well suited for use with large-scale scheduling problems. In addition, a procedure for generating large random test graphs with known chromatic number is presented and is used to evaluate heuristically the capabilities of the algorithms discussed.

Keywords: Algorithm, chromatic number, color function, graph, graph coloring, heuristic, interchange, random test graphs, scheduling, time complexity

AMS-MOS 1970 Subject Classification: 05C15, 68A10, 68A20, 90B35

1. Introduction

Graph coloring has considerable application to a large variety of complex problems involving optimization. In particular conflict resolution, or the optimal partitioning of mutually exclusive events, can often be accomplished by means of graph coloring. Examples of such problems include: the scheduling of exams in the smallest number of time periods such that no individual is required to participate in two exams simultaneously (see appendix A), the storage of chemicals on the minimum number of shelves such that no two mutually dangerous chemicals (i.e., dangerous when one is in the presence of the other) are stored on the same shelf, and the pairing of individuals (as in a computer dating agency) such that the maximal number of compatible persons are paired together.

In each of the above problems, the constraints are usually expressible in the form of pairs of incompatible objects (e.g., pairs of chemicals that cannot be stored on the same shelf). Such incompatibilities are usefully embodied through the structure of a graph. Each object is represented by a node and each incompatibility is represented by an edge joining the two nodes. A coloring of this graph is then simply a partitioning of the objects into blocks (or colors) such that no two incompatible objects end up in the same block. Thus, optimal solutions to such problems may be found by determining minimal colorings for the corresponding graphs. Unfortunately, this may not always be accomplishable in a reasonable amount of time.

As the graph coloring problem is known to be NP-complete [1], there is no known algorithm which, for every graph, will optimally color the nodes of the graph in a time bounded by a polynomial in the number of nodes. Since exponential time algorithms [5, 6, 7, 9, 18] are prohibitively expensive for use with large-scale problems, much attention has been focused on the development of heuristic algorithms which will usually produce a good, though not necessarily optimal, coloring for any graph in a reasonable amount of time.

This paper describes a new graph coloring algorithm, the recursive largest first (RLF) coloring algorithm. In addition, a variety of existing coloring procedures are presented and their performance on a wide range of test data is compared to that of the RLF algorithm.

Also described is a procedure for generating random graphs with known chromatic number. The existence of such a procedure, heretofore lacking in the experimental literature, provides a standard method for testing the accuracy of graph coloring algorithms.

2. Preliminary Definitions

Throughout this paper, the graph G with nodes V and edges E, denoted by (V, E), is assumed to contain no loops or multiple edges. The subgraph of G = (V, E) induced by a subset U of the nodes V consists of those nodes and all the edges that directly connect them. This subgraph is represented by < U > or (U, E′) where E′ = {(w1, w2)∣(w1, w2)ϵE, w1ϵU, w2ϵU}. The degree of a node wϵG, denoted by d(w), is the number of nodes adjacent to w in G. Define dU(w) to be the number of nodes in U adjacent to w in G. This is equivalent to the degree of w in < U ⋃ {w} >.

A coloring of G is an assignment of colors to the nodes of G such that no two adjacent nodes share the same color. More formally, a k-coloring of G is a mapping f: V → {1, 2, …, k} such that f(u) = f(v) only if (u, v) ∉ E. The chromatic number of G, denoted χ(G), is the minimal number of colors necessary to color G. An optimal coloring of G is one which uses exactly χ(G) colors.

3. Sequential Coloring Algorithms

One of the simplest coloring algorithms is the randomly ordered sequential (RND) graph coloring algorithm [14]. Given a graph G = (V, E), the algorithm randomly orders the nodes so that V = {v1, …, vn} and then assigns colors to the nodes in the following manner. The first node, v1, is assigned color number 1. Once the first i nodes have been colored (1 ≤ i ≤ n − 1), vi + 1 is assigned the lowest possible color number such that no previously colored node adjacent to vi + 1 has been assigned the same color number.
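The sequential rule can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the experiments in this paper; the adjacency-dictionary representation and the names `sequential_coloring` and `rnd_coloring` are assumptions made for the example.

```python
import random

def sequential_coloring(adj, order):
    """Color nodes in the given order; each node gets the lowest color
    number not already used by one of its previously colored neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

def rnd_coloring(adj, seed=None):
    """RND: apply the sequential rule to a random ordering of the nodes."""
    order = list(adj)
    random.Random(seed).shuffle(order)
    return sequential_coloring(adj, order)

# Hypothetical example graph: a 5-cycle.
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
coloring = rnd_coloring(cycle, seed=0)
```

Because every node of the 5-cycle has degree 2, the sequential rule never needs more than three colors on it, whatever ordering is drawn.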

Though this algorithm is locally optimal in the sense that each node is assigned the smallest possible number, the overall action is highly dependent on the initial ordering of the nodes. For any graph, there exists an ordering for which this algorithm will produce an optimal coloring [14], while a less fortuitous ordering may lead to an extremely poor coloring. Thus the problem of finding an optimal initial ordering of the nodes is equivalent to the problem of optimally coloring the graph.

This fact has led to the development of a large number of algorithms, each differing from RND only in the method of initially ordering the nodes [7, 14]. Two such algorithms are the largest first (LF) and smallest last (SL) sequential coloring algorithms.

The LF algorithm orders the nodes such that d(vi) ≥ d(vi + 1) for 1 ≤ i < n where V = {v1, …, vn}. The SL algorithm is similar in strategy but recursively orders the smallest degree nodes last. An SL ordering is one in which d(vn) = min{d(w) : w ϵ V} and, for n − 1 ≥ i ≥ 1, dU(vi) = min{dU(w) : w ϵ U} where U = V − {vn, …, vi + 1}.
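The two orderings can be sketched as follows. This is an illustrative Python sketch under the assumption that the graph is stored as an adjacency dictionary; the function names `lf_order` and `sl_order` are hypothetical, and ties are broken arbitrarily.

```python
def lf_order(adj):
    """Largest first: nodes sorted by non-increasing degree."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)

def sl_order(adj):
    """Smallest last: repeatedly remove a node of minimum degree in the
    remaining induced subgraph; the node removed first is ordered last."""
    remaining = set(adj)
    deg = {v: len(adj[v]) for v in adj}   # degree within `remaining`
    removed = []
    while remaining:
        v = min(remaining, key=lambda w: deg[w])
        remaining.remove(v)
        removed.append(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return removed[::-1]

# Hypothetical example graph.
g = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
lf = lf_order(g)
sl = sl_order(g)
```

Either ordering can then be handed to a sequential coloring routine that assigns each node the lowest color not used by its already colored neighbors.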

Note that both the LF and SL algorithms tend to order the high degree nodes before the low degree nodes. Computational experience has shown that this is generally a good strategy, whereas algorithms which color the higher degree nodes last have often been found to produce colorings worse than those produced by a random ordering.

Each of the sequential coloring algorithms presented in this section requires O(n²) time and O(n²) space to color a graph with n nodes. Quadratic time and space complexities are generally quite acceptable for use with large-scale coloring problems. If only they gave guaranteed optimal colorings, we would look no further.

4. More Sophisticated Algorithms

One successful variation of the sequential coloring algorithms involves what is known as an interchange. Given any G = (V, E) and color function f such that f(w) ϵ {i, j} for all w ϵ V, an (i, j)-interchange on G is a redefinition of f such that if f(w) = i originally, f(w) is now assigned j and vice versa for all w ϵ V.

Appropriate use of the interchange process has been found to yield particularly good results when used in conjunction with the LF and SL algorithms [14]. The resulting procedures are referred to as the smallest last with interchange (SLI) and largest first with interchange (LFI) coloring algorithms.

The SLI (LFI) algorithm operates just like the SL (LF) algorithm except when the latter requires the introduction of a new color. Suppose that such a situation occurs when vm is the node to be colored and that k = max{f(vi) : i < m}. For 1 ≤ i < j ≤ k, define Gij to be the subgraph of G induced by the nodes of G previously colored i or j. If possible, choose i and j such that no connected component of Gij contains two differently colored nodes both adjacent to vm. If such a Gij is found to exist, then perform an (i, j)-interchange on each connected component of Gij which contains an i-colored node adjacent to vm in G. It is now possible to assign color i to vm and thus the addition of a new color has been avoided. If no such Gij exists, however, then regardless of what interchange is performed, vm must be assigned color k + 1.
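The interchange step described above might be sketched as follows. This is a simplified Python illustration, not the code of [11] or [14]; the function names, the adjacency-dictionary representation, and the choice to take the first admissible pair (i, j) are all assumptions of the example.

```python
from itertools import combinations

def component(adj, color, start, pair):
    """Nodes reachable from `start` through nodes whose color is in `pair`."""
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen and color.get(y) in pair:
                seen.add(y)
                stack.append(y)
    return seen

def try_interchange(adj, color, vm, k):
    """Look for colors i < j such that no component of G_ij contains both an
    i-colored and a j-colored neighbor of vm. On success, swap i and j in the
    components holding i-colored neighbors of vm and return i (now free for
    vm); otherwise return None and leave `color` unchanged."""
    for i, j in combinations(range(1, k + 1), 2):
        i_nbrs = [u for u in adj[vm] if color.get(u) == i]
        j_nbrs = {u for u in adj[vm] if color.get(u) == j}
        to_swap, ok = set(), True
        for u in i_nbrs:
            if u in to_swap:
                continue
            comp = component(adj, color, u, (i, j))
            if comp & j_nbrs:          # component holds both colors next to vm
                ok = False
                break
            to_swap |= comp
        if ok:
            for x in to_swap:
                color[x] = j if color[x] == i else i
            return i
    return None

# Hypothetical example: vm = "v" sees color 1 on "a" and color 2 on "b".
adj = {"v": ["a", "b"], "a": ["v"], "b": ["v"]}
color = {"a": 1, "b": 2}
freed = try_interchange(adj, color, "v", k=2)
```

In the example, "a" and "b" lie in different components of G12, so swapping the colors in the component of "a" frees color 1 for "v".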

This version of the SLI (LFI) algorithm initially appeared in [11] and is an extension of the original version which is described in [14]. The original version allows an (i, j)-interchange only when vm is adjacent to exactly one node colored i and one node colored j. There is little difference between the original and extended versions of the SLI and LFI algorithms in terms of colorings produced or time required. While the extended versions may be able to perform a useful interchange impossible in the original version, they will likely take slightly longer to do so. All four algorithms require O(n³) time and O(n²) space to color an n-node graph. Based on a limited amount of computational experience, the extended version of the SLI algorithm (henceforth to be referred to simply as the SLI algorithm) was found to produce slightly better results than did the other interchange procedures.

All of the algorithms thus far presented are capable of producing very bad colorings, in terms of number of colors used, for certain graphs. Johnson [10, 11] has given constructions of 3-colorable graphs on O(n) vertices which each of the above algorithms requires n colors to color completely. Since no more than O(n) colors may be used to color an O(n) node graph, such colorings are, up to a constant, the worst possible.

There is an algorithm, however, which will color any graph G with n nodes in O(n / log n) · χ(G) or fewer colors. While this worst-case behavior is still unacceptable in practice, the approximately maximum independent set (AMIS) algorithm is interesting because it is the only algorithm known not to exhibit the worst possible worst-case behavior [11]. The algorithm proceeds as follows. Given G = (V, E), select the node with minimum degree in G, say v1, and color it 1. Once i nodes have been assigned color 1, select, if possible, vi + 1 ϵ U such that dU(vi + 1) is minimal for nodes in U, where U is the set of uncolored nodes not adjacent to any colored node. If no such selection is possible, i.e., U is empty, then repeat the entire procedure on the subgraph of G induced by the uncolored nodes of G, using the next available color. This process is then, in turn, repeated until all the nodes of G have been colored.
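A compact Python sketch of the AMIS procedure is given below. It is a direct, unoptimized transcription of the description above; the name `amis_coloring` and the adjacency-dictionary input are assumptions of the example, and ties are broken arbitrarily.

```python
def amis_coloring(adj):
    """Build one color class at a time. Within a class, repeatedly pick the
    node of minimum degree in <U>, where U holds the uncolored nodes not
    adjacent to any node already placed in the current class."""
    color = {}
    uncolored = set(adj)
    c = 0
    while uncolored:
        c += 1
        U = set(uncolored)
        while U:
            v = min(U, key=lambda w: sum(1 for x in adj[w] if x in U))
            color[v] = c
            uncolored.discard(v)
            U.discard(v)
            U -= set(adj[v])      # neighbors of v can no longer take color c
    return color

# Hypothetical example graphs.
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
k4_colors = amis_coloring(k4)
cycle_colors = amis_coloring(cycle)
```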

Interestingly enough, while this algorithm exhibits better worst-case behavior than the other algorithms thus far discussed, computational experience has shown that, on the average, the colorings it produces are substantially inferior to those produced by the LF, SL, and SLI algorithms.

5. The Recursive Largest First (RLF) Algorithm

The RLF algorithm combines the strategy of the LF algorithm with the structure of the AMIS algorithm. Like the LF algorithm, at each step in the RLF procedure a node is selected for coloring which will, in some sense, leave the resulting uncolored nodes colorable in as few colors as possible. As with the AMIS algorithm, the RLF procedure completes the assignment of color i before commencing assignment of color i + 1.

The RLF graph coloring algorithm proceeds as follows. Given G = (V, E), assign color 1 to the node with maximal degree in G, say v1. Once i nodes have been assigned color 1, select, if possible, vi + 1 ϵ U1 such that dU2(vi + 1) is maximal for nodes in U1, where U1 is the set of uncolored nodes not adjacent to any colored node and U2 is the set of uncolored nodes adjacent to at least one colored node. Ties are, if possible, broken by choosing the node that has minimal degree in < U1 >. If no such selection is possible, i.e., U1 is empty, then repeat the entire process recursively on the subgraph of G induced by the uncolored nodes of G, using the next available color. This recursion is then repeated until all of the nodes in G are colored. Several examples of this procedure are worked out in appendix A.
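The selection rule can be illustrated with a short Python sketch. It is a naive transcription written for clarity, not the PL-1 listing of appendix B, and it makes no attempt to reach the time bounds analyzed there; the name `rlf_coloring` and the adjacency-dictionary input are assumptions of the example.

```python
def rlf_coloring(adj):
    """Recursive largest first, written iteratively: one color class per
    pass over the subgraph induced by the still-uncolored nodes."""
    color = {}
    uncolored = set(adj)
    c = 0
    while uncolored:
        c += 1
        U1 = set(uncolored)   # uncolored, not adjacent to a node colored c
        U2 = set()            # uncolored, adjacent to a node colored c
        first = True
        while U1:
            if first:
                # Start each class with a node of maximal degree in <U1>.
                v = max(U1, key=lambda w: sum(x in U1 for x in adj[w]))
                first = False
            else:
                # Most neighbors in U2; ties broken by fewest neighbors in U1.
                v = max(U1, key=lambda w: (sum(x in U2 for x in adj[w]),
                                           -sum(x in U1 for x in adj[w])))
            color[v] = c
            uncolored.discard(v)
            U1.discard(v)
            moved = U1 & set(adj[v])
            U1 -= moved
            U2 |= moved
    return color

# Hypothetical example: the complete bipartite graph K(3,3).
A, B = ["a0", "a1", "a2"], ["b0", "b1", "b2"]
k33 = {**{a: list(B) for a in A}, **{b: list(A) for b in B}}
k33_colors = rlf_coloring(k33)
```

On K(3,3) the rule fills the first color class with one side of the bipartition, since after the first choice only nodes on that side remain in U1.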

As was true with the SLI algorithm, the RLF algorithm, in general, requires O(n³) time and O(n²) space to color an n-node graph. Unlike the SLI algorithm, however, the RLF algorithm requires only O(n²) time to color graphs for which k · e ≈ n², where k is the number of colors used to color the graph, e is the number of edges in the graph, and n is the number of nodes in the graph (see appendix B for proof). Such graphs, which are usually sparse, quite commonly arise in practical applications such as exam scheduling. For example, the graph associated with the 1977–8 Princeton University fall term course examinations schedule consisted of 273 nodes, 6727 edges, and required 17 colors to be colored by the RLF algorithm. Thus, for practical purposes, the RLF algorithm, if programmed properly, exhibits an O(n²) time dependence for many applications. Appendix B presents a PL-1 listing of the RLF algorithm as well as a rigorous analysis of its time complexity.

6. Generation of Test Graphs With Known Chromatic Number

A few papers have been published which compare the performance of various algorithms on large (usually 100-node) randomly generated graphs [14, 21, 23]. Unfortunately, none of these empirical studies provide the chromatic numbers of the test graphs used. Indeed, the task of closely approximating the chromatic number of a graph is NP-complete [8] and thus virtually impossible to accomplish for large graphs. Consequently, approximations of upper and lower bound results established for χ(G) have generally been crude and of little practical use [1, 7, 14, 19].

The lack of such information makes an accurate interpretation of the experimental data very difficult. For instance, if algorithm A required 22 colors to color G while algorithm B required only 20, the conclusions drawn about their relative effectiveness if χ(G) = 20 might be quite different from those drawn if χ(G) = 4. Further, without knowledge of χ(G), no statement can be made at all about the accuracy or closeness to optimality of either algorithm A or B. Thus there is a need for a standard procedure for generating random test graphs with known chromatic numbers. Such a procedure will now be presented.

Suppose it is desired to construct an n-node graph G with e edges and chromatic number k. For the purposes of the following argument, assume that k ∣ n. This is not a significant restriction since most test or modeling uses of a large graph generator are likely to allow some flexibility in the choices of n and k. For such a graph to exist under these restrictions, e must be such that

$$\frac{k(k-1)}{2} \le e \le \frac{n^{2}(k-1)}{2k}.$$
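The two bounds are the edge count of a single k-clique and that of a complete k-partite graph with n/k nodes in each part. A small helper, sketched here in Python under the stated assumption that k divides n (the name `edge_bounds` is hypothetical), evaluates them:

```python
def edge_bounds(n, k):
    """Smallest and largest edge counts for the construction, assuming k | n:
    one k-clique at the low end, a complete k-partite graph with equal
    parts of size n/k at the high end."""
    if n % k != 0:
        raise ValueError("this sketch assumes k divides n")
    low = k * (k - 1) // 2
    high = n * n * (k - 1) // (2 * k)
    return low, high

bounds_150_5 = edge_bounds(150, 5)
bounds_450_25 = edge_bounds(450, 25)
```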

The first step in the procedure is to choose positive integers a, c and m such that:

  1. m ≫ n,

  2. (n, m) = k,

  3. (c, m) = 1,

  4. p ∣ m → p ∣ (a − 1) for all primes p, and

  5. 4 ∣ m → 4 ∣ (a − 1).

Next generate a uniform sequence of random numbers {Xi} on the interval 0 to m − 1 by the linear congruential method described in [12]. This is accomplished by fixing X0 and then, for each i > 0, setting X_i = MOD(a·X_{i−1} + c, m), where MOD(X, Y) = X − ⌊X/Y⌋·Y. Sequences generated in such a manner exhibit two important properties [12]. First, for every i and j such that 0 ≤ j ≤ m − 1 and i ≥ 0, there exists an r such that i ≤ r ≤ i + m − 1 and X_r = j. Second, for every i ≥ 0, X_i = X_{i+m}.
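Both properties can be checked numerically on a small example. The Python sketch below uses the illustrative values m = 16, a = 5, c = 3 and X0 = 7, which are assumptions chosen only because they satisfy conditions 3 to 5 above; they are not parameters from this paper.

```python
from itertools import islice

def lcg(x0, a, c, m):
    """Yield X_1, X_2, ... with X_i = (a * X_{i-1} + c) mod m."""
    x = x0
    while True:
        x = (a * x + c) % m
        yield x

m = 16                      # 2 | m and 4 | m, so a - 1 must be divisible by 4
xs = list(islice(lcg(x0=7, a=5, c=3, m=m), 2 * m))

full_period = sorted(xs[:m]) == list(range(m))   # every residue appears once
repeats = xs[:m] == xs[m:]                        # X_i = X_{i+m}
```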

Next construct the sequence {Yi} on the interval 0 to n − 1 so that Yi = MOD(Xi, n). Note that unless k = n, n ∤ m and {Yi} is not a uniform random number sequence on the interval 0 to n − 1.

By defining V = {0, 1, …, n − 1}, it is possible to associate two consecutive values of {Yi} with edges to be added to E. Similarly, it is possible to associate h consecutive values of {Yi} with h-cliques to be implanted in G. For example, the subsequence {Y1, Y2, Y3} corresponds to the edge subset {(v_{Y1}, v_{Y2}), (v_{Y1}, v_{Y3}), (v_{Y2}, v_{Y3})}. By identifying certain subsequences of consecutive elements of {Yi} and adding the corresponding edges to E, it is possible to construct the desired graph G.

More precisely, define the (k − 1)-vector b = (bk, bk−1, …, b2) so that bk ≥ 1 and bi ≥ 0 for 2 ≤ i ≤ k − 1. Each bi corresponds to the number of i-cliques to be implanted in G. Specifically, given the sequence {Yi} and vector b, proceed as follows. Select the first k values of {Yi} starting with Y1 and add the corresponding edges to E. If bk > 1, select the next k values of {Yi} and add the corresponding edges to E. Repeat this process until bk k-cliques have been implanted in G. Next add, in an identical fashion, bk−1 (k − 1)-cliques to G. Continue the process until b2 2-cliques or edges have been added to E. Note that some edges may be “added” several times and thus it may not be possible to precalculate a vector b such that there are exactly e edges in the resulting graph. It is possible, however, to keep track of how many edges have been added at any point and to eliminate the addition of i-cliques which might result in the addition of too many edges to E. Since edges may be added one at a time, it is not difficult to show that graphs having exactly e edges may be constructed in this manner for any e such that
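The clique-implanting loop can be sketched in Python as follows. The sketch omits the edge-count bookkeeping mentioned above, and the function name `implant_cliques` and the sample parameters (n = 12, k = 3, m = 81, a = 4, c = 5, X0 = 1, b3 = 4, b2 = 6) are illustrative assumptions chosen to satisfy conditions 2 to 5; they are not the test graphs of this paper.

```python
from itertools import combinations

def implant_cliques(n, x0, a, c, m, b):
    """b maps a clique size to the number of cliques of that size.
    Sizes are processed from largest to smallest; each clique consumes
    that many consecutive values of Y_i = X_i mod n."""
    edges = set()
    x = x0
    for size in sorted(b, reverse=True):
        for _clique in range(b[size]):
            nodes = []
            for _step in range(size):
                x = (a * x + c) % m        # next X_i
                nodes.append(x % n)        # Y_i
            for u, v in combinations(nodes, 2):
                edges.add((min(u, v), max(u, v)))   # repeated edges collapse
    return edges

# gcd(12, 81) = 3 = k, gcd(5, 81) = 1, and 3 | (4 - 1).
edges = implant_cliques(n=12, x0=1, a=4, c=5, m=81, b={3: 4, 2: 6})
```

With these parameters the residue of Y_i modulo 3 cycles with period 3, so the two endpoints of every generated edge differ modulo 3, and coloring node v with MOD(v, 3) is a proper 3-coloring.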

$$\frac{k(k-1)}{2} \le e \le \frac{n^{2}(k-1)}{2k}.$$

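It now remains to be shown that χ(G) = k for any G constructed in this manner. Since bk ≥ 1, G contains a k-clique and thus χ(G) ≥ k. Before establishing that χ(G) ≤ k, it is useful to examine the structure of the sequence {Y′i} where Y′i = MOD(Yi, k). Since k ∣ n and k ∣ m,

$$
\begin{aligned}
Y'_{i+1} &= \mathrm{MOD}(Y_{i+1}, k) = \mathrm{MOD}(\mathrm{MOD}(X_{i+1}, n), k) = \mathrm{MOD}(X_{i+1}, k) \\
&= \mathrm{MOD}(\mathrm{MOD}(aX_i + c, m), k) = \mathrm{MOD}(aX_i + c, k) \\
&= \mathrm{MOD}(\mathrm{MOD}(aX_i + c, n), k) = \mathrm{MOD}(aY_i + c, k) \\
\Rightarrow\quad Y'_{i+1} &= \mathrm{MOD}(aY'_i + c, k).
\end{aligned}
$$

Further,

$$p \mid k \Rightarrow p \mid m \Rightarrow p \mid (a-1), \qquad 4 \mid k \Rightarrow 4 \mid m \Rightarrow 4 \mid (a-1), \qquad \text{and} \qquad (c, m) = 1 \Rightarrow (c, k) = 1.$$

Thus {Y′i} is a uniform sequence of random numbers on the interval 0 to k − 1.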

This structure of the {Yi} modulo k allows the following coloring of G. For each i, define f(v_{Y_i}) = MOD(i, k). Since for all j such that 0 ≤ j < n there exists an i > 0 such that Yi = j, it is clear that every node is assigned a color by this procedure. Since {Y′i}, with Y′i = MOD(Yi, k), is a uniform sequence of random numbers on the interval from 0 to k − 1, we know that if Yi = Yj then Y′i = Y′j and MOD(i, k) = MOD(j, k), and thus that f(v_{Y_i}) = f(v_{Y_j}). This means that f is well defined. Finally, it is easily verified that v_{Y_i}, v_{Y_{i+1}}, …, v_{Y_{i+h−1}} are all colored differently if h ≤ k, for all i ≥ 0. This means that edges occur only between differently colored nodes, and that f is a proper coloring of G. Thus χ(G) = k.
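The claim that f is well defined and covers every node can be checked numerically over one full period of the generator. The Python sketch below does so for the illustrative parameters n = 12, k = 3, m = 81, a = 4, c = 5, X0 = 1; these values and the name `index_coloring` are assumptions of the example, not data from this paper.

```python
def index_coloring(n, k, x0, a, c, m):
    """Assign f(Y_i) = i mod k for i = 1..m. Return the mapping, or None
    if some node would receive two different colors."""
    f = {}
    x = x0
    for i in range(1, m + 1):
        x = (a * x + c) % m        # X_i
        y = x % n                  # Y_i
        if f.setdefault(y, i % k) != i % k:
            return None            # f would not be well defined
    return f

f = index_coloring(n=12, k=3, x0=1, a=4, c=5, m=81)
```

Since k consecutive indices have k distinct residues modulo k, any clique built from h ≤ k consecutive values of the sequence receives h distinct colors under this mapping.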

It should be noted that the above result is a special case of a more general result for arbitrary k and n. That is, if k and n are such that k ≤ d where d = (n, m), then k ≤ χ(G) ≤ k + MOD(d, k) < 2k. The proof of the general result is not given here but is similar to that of the special case when MOD(d, k) = 0.

As will be demonstrated shortly, the range of graphs which can be generated by this procedure is quite large. The node degrees of such graphs may vary between 0 and n − n/k while the average node degree may vary between k(k − 1)/n and (k − 1)n/k. The variety of distributions of node degrees is also quite large. Most importantly, however, the procedure generates graphs which are as difficult to color as are randomly generated graphs (where the chromatic number is not known). Demonstration of this fact is provided in section 7.

Another advantage of this procedure is that the test graphs may be easily characterized. For example, only k + 5 values are required to generate an n-node graph with chromatic number k. These values are n, k, X0, a, c, m, bk, bk−1, …, b3 and b2. Whereas it would be infeasible to completely describe a large, randomly generated graph by conventional means in a short paper, graphs generated by this procedure are easily described. Thus, in future publications concerning the effectiveness of various graph coloring algorithms, it will be possible to specify precisely which graphs were used to test the various algorithms. There are several conceivable situations where such documentation could be valuable to the interested reader. For example, should the reader desire to compare the effectiveness of a new graph coloring algorithm to those in the literature, he would need only to regenerate the graphs used in published tests and color them with the new algorithm. This would eliminate the necessity of developing an entirely new set of test data and of having to rerun all previous algorithms on such data. Pursuant to these goals, a complete characterization of the test graphs referred to in tables 1 and 2 is provided in appendix C.

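Table 1.

Results for 150-node test graphs generated according to the procedure detailed in section 6. The "colors" columns give the number of colors used on each of the three graphs in a group; the "excess" columns give the average number of excess colors used.

| χ | d | RND colors | LF colors | SL colors | RLF colors | SLI colors | RND excess | LF excess | SL excess | RLF excess | SLI excess |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 11 | (9, 9, 9) | (7, 8, 7) | (7, 7, 8) | (6, 6, 6) | (6, 6, 6) | 4 | 2⅓ | 2⅓ | 1 | 1 |
| 5 | 19 | (11, 11, 12) | (10, 10, 9) | (9, 10, 9) | (7, 7, 8) | (9, 5, 8) | 6⅓ | 4⅔ | 4⅓ | 2⅓ | 2⅓ |
| 5 | 24 | (13, 12, 13) | (10, 11, 11) | (10, 9, 10) | (7, 6, 7) | (8, 7, 6) | 7⅔ | 5⅔ | 4⅔ | 1⅔ | 2 |
| | Ave. total | | | | | | 6 | 4 2/9 | 3 7/9 | 1 6/9 | 1 7/9 |
| 10 | 11 | (11, 11, 12) | (10, 10, 10) | (10, 10, 10) | (10, 10, 10) | (10, 10, 10) | 1⅓ | 0 | 0 | 0 | 0 |
| 10 | 21 | (15, 14, 16) | (12, 12, 12) | (12, 12, 12) | (11, 11, 11) | (11, 11, 11) | 5 | 2 | 2 | 1 | 1 |
| 10 | 29 | (17, 16, 18) | (14, 14, 14) | (14, 15, 15) | (13, 13, 12) | (13, 13, 12) | 7 | 4 | 4⅔ | 2⅔ | 2⅔ |
| | Ave. total | | | | | | 4 4/9 | 2 | 2 2/9 | 1 2/9 | 1 2/9 |
| 15 | 12 | (15, 16, 15) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) | ⅓ | 0 | 0 | 0 | 0 |
| 15 | 25 | (19, 18, 18) | (17, 16, 16) | (16, 16, 15) | (15, 15, 15) | (15, 15, 15) | 3⅓ | 1⅓ | ⅔ | 0 | 0 |
| 15 | 34 | (19, 21, 20) | (17, 17, 18) | (17, 19, 17) | (16, 16, 16) | (16, 16, 16) | 5 | 2⅓ | 2⅔ | 1 | 1 |
| | Ave. total | | | | | | 2 8/9 | 1 2/9 | 1 1/9 | 3/9 | 3/9 |
| | Overall average total | | | | | | 4 12/27 | 2 13/27 | 2 10/27 | 1 2/27 | 1 3/27 |
| | Total time (seconds) | 10.2 | 13.3 | 13.5 | 15.3 | 49.2 | | | | | |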

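Table 2.

Results for 450-node test graphs generated according to the procedure detailed in section 6. The "colors" columns give the number of colors used on each of the two graphs in a group; the "excess" columns give the average number of excess colors used.

| χ | d | RND colors | LF colors | SL colors | RLF colors | SLI colors | RND excess | LF excess | SL excess | RLF excess | SLI excess |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 25 | (14, 13) | (11, 12) | (11, 12) | (8, 8) | (10, 9) | 8½ | 6½ | 6½ | 3 | 4½ |
| 5 | 43 | (17, 18) | (12, 14) | (11, 15) | (5, 5) | (5, 5) | 12½ | 8 | 8 | 0 | 0 |
| | Ave. total | | | | | | 10½ | 7¼ | 7¼ | 1½ | 2¼ |
| 15 | 36 | (22, 22) | (18, 18) | (18, 18) | (17, 16) | (16, 16) | 7 | 3 | 3 | 1½ | 1 |
| 15 | 74 | (30, 31) | (26, 26) | (26, 26) | (23, 23) | (23, 24) | 15½ | 11 | 11 | 8 | 8½ |
| | Ave. total | | | | | | 11¼ | 7 | 7 | 4¾ | 4¾ |
| 25 | 37 | (29, 27) | (26, 25) | (25, 25) | (25, 25) | (25, 25) | 3 | ½ | 0 | 0 | 0 |
| 25 | 77 | (37, 35) | (29, 30) | (31, 31) | (28, 28) | (28, 29) | 11 | 4½ | 6 | 3 | 3½ |
| | Ave. total | | | | | | 7 | 2½ | 3 | 1½ | 1¾ |
| | Overall average total | | | | | | 9 7/12 | 5 7/12 | 5 9/12 | 2 7/12 | 2 11/12 |
| | Total time (seconds) | 32.0 | 42.8 | 44.9 | 80.7 | 308.8 | | | | | |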

7. Test Results

The procedure described above was used to generate 27 150-node graphs and 12 450-node graphs of varying edge density and chromatic number. In addition, 27 completely random 150-node graphs were generated with varying edge density and unknown chromatic number. The RND, LF, SL, RLF and SLI algorithms were tested on each of the 66 graphs. The resulting data are displayed in tables 1, 2, and 3, respectively.

Table 3.

Results for random 150-node test graphs. Each entry gives the number of colors used on each of the three graphs in a group.

| Lower bound on χ | d | RND | LF | SL | RLF | SLI |
|---|---|---|---|---|---|---|
| 5 | 10 | (9, 9, 8) | (7, 7, 7) | (7, 7, 7) | (6, 6, 6) | (6, 6, 6) |
| 5 | 18 | (11, 11, 11) | (10, 10, 10) | (10, 10, 9) | (8, 8, 8) | (9, 8, 8) |
| 5 | 24 | (13, 13, 12) | (10, 12, 11) | (11, 11, 11) | (9, 9, 9) | (10, 10, 10) |
| 10 | 10 | (16, 15, 16) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) |
| 10 | 18 | (16, 17, 16) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) |
| 10 | 24 | (17, 18, 17) | (15, 16, 16) | (15, 15, 15) | (15, 15, 15) | (15, 15, 15) |
| 15 | 8 | (24, 22, 21) | (24, 22, 21) | (24, 22, 21) | (24, 22, 21) | (24, 22, 21) |
| 15 | 16 | (23, 23, 24) | (23, 23, 24) | (23, 23, 24) | (23, 23, 24) | (23, 23, 24) |
| 15 | 24 | (24, 24, 24) | (24, 24, 24) | (24, 24, 24) | (24, 24, 24) | (24, 24, 24) |
| | Total time (seconds) | 9.8 | 12.9 | 12.9 | 19.0 | 384.3 |

In each table, the graphs are subdivided into groups according to chromatic number, χ, (or as in the case with the completely random graphs, to a known lower bound for χ) and average node degree, d. There are three graphs in each 150-node group and two graphs in each 450-node group. The numbers in parentheses indicate the number of colors used by an algorithm to color the first, second, and possibly, third graph of that category. For the graphs where χ was known, the average number of excess colors used by the algorithm was computed for each group and totaled. For example, the RLF algorithm optimally colored each of the 10-colorable 150-node graphs with average degree 11 in table 1 but required, on the average, 4¾ extra colors to color each of the four 15-colorable 450-node graphs in table 2.
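The averaging can be reproduced with exact fractions. The Python sketch below recomputes the RLF figure quoted above from the color counts listed in table 2 for the 15-colorable 450-node graphs; the function name `average_excess` is a hypothetical helper introduced for the example.

```python
from fractions import Fraction

def average_excess(chi, groups):
    """Mean over groups of the mean number of colors used beyond chi."""
    group_means = [Fraction(sum(c - chi for c in g), len(g)) for g in groups]
    return sum(group_means) / len(group_means)

# RLF colors on the chi = 15, 450-node groups (d = 36 and d = 74) of table 2.
rlf_15 = average_excess(15, [(17, 16), (23, 23)])
```

Here the two group means are 1½ and 8, whose average is 4¾.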

The total run time for each algorithm is also included in each table. This figure represents execution time in seconds on an IBM 360–91. It should be noted that such figures are highly dependent on factors unrelated to the inherent efficiency of the algorithm, such as programmer skill and machine characteristics. The time complexity estimates provided earlier are much more rigorous measures of the algorithms’ relative speeds. Except for the SLI time in table 3, the run times are in accordance with these theoretical estimates. It is quite possible that the graphs referenced in table 3 have chromatic numbers much higher than the minimum estimate, and that the SLI algorithm was thus induced to attempt and possibly perform a large number of time-consuming interchanges. This example points out the highly variable amounts of time required by most interchange algorithms to color various graphs (a phenomenon also observable in the data of [14]).

The random graphs of table 3 were included only for the purpose of demonstrating that the graphs generated by the technique discussed in section 6 are just as suitable for testing the relative capabilities of graph coloring algorithms as are completely randomly generated graphs. As was pointed out earlier, the data in table 3 cannot be used to draw conclusions about the accuracy of the tested algorithms. From the data in tables 1 and 2, however, we observe that, for the graphs considered, the LF and SL algorithms required about twice as many extra colors to color the graphs as did the RLF and SLI algorithms. Similarly, the RND algorithm required about twice as many extra colors as did the LF and SL algorithms. Significantly, this observation can be made for most of the graphs on an individual basis. The RND algorithm always used more colors than the LF and SL algorithms which, in turn, always used more colors than the RLF or SLI algorithms.

There is not as clear a distinction between the performance of the LF and SL algorithms or the RLF and SLI algorithms. The LF and SL algorithms required virtually the same number of colors on the average and required nearly the same amount of time. While the colorings produced by both the RLF and SLI algorithms for test graphs were, on the average, quite good, the RLF algorithm required substantially less time and used approximately 12 percent fewer excess colors on the 450-node graphs and 3 percent fewer excess colors on the 150-node graphs than did the SLI algorithm. Of the eight 450-node graphs which were not optimally colored by both the RLF and SLI algorithms, the RLF algorithm required the fewest colors for four of the graphs and the most for only one graph.

As a final note, the edge density, d/n, of the test graphs did not exceed ¼. This results from the fact that for most large-scale practical applications, the edge density of the graphs to be colored is generally small. For instance, the Princeton University exam scheduling graph mentioned in Section 5 had an edge density of approximately 1/6.

8. Conclusions

From the data presented it is apparent that the RLF algorithm, when not optimal, colored large graphs with substantially fewer colors than did any of the other algorithms that did not involve interchanges. When compared with interchange algorithms, the RLF algorithm was found to produce slightly better colorings in substantially less time. While the RLF and interchange algorithms in general each require O(n³) time to color an n-node graph, the RLF procedure is unique in that it exhibits O(n²) time behavior for graphs with low edge density. Thus the RLF algorithm is particularly well suited for use with large-scale practical problems.

The method described in section 6, for generating random graphs with a known chromatic number, was found to produce test data which can be used to determine heuristically a given algorithm’s accuracy as well as algorithms’ relative capabilities. Previously, published comparison tests have been made only on graphs with unknown chromatic numbers, which rendered impossible any evaluation of an individual algorithm’s accuracy and questionable any statement about two algorithms’ relative capabilities. In addition, the procedure provides a standard method of generating test data for coloring algorithms; by its use a large graph with known chromatic number may be uniquely constructed from only a few parameters.

Acknowledgments

In addition to Professor Acton, the author would like to thank Dr. Charles Johnson, Dr. James Lawrence, and Dr. Alan Goldman for their helpful remarks.

This work was done in part while the author was a staff member of the Center for Applied Mathematics of the National Bureau of Standards during the summer of 1976 and in part while the author was an undergraduate majoring in Electrical Engineering and Computer Sciences at Princeton University working under the supervision of Professor Forman S. Acton.

10. Appendix A: Application to Examination Scheduling

The examination scheduling problem is probably the best known of a large class of scheduling problems in applied mathematics and operations research. It consists of scheduling exams such that no individual is required to participate in two or more exams simultaneously. It is usually assumed desirable to schedule the exams such that the total number of time periods required for the examinations is minimized. Sometimes, additional restraints are imposed. Requiring that some examinations be given or not given in specified time periods and scheduling the exams so that a certain subset of the exams will be completed as early as is possible are examples of such restraints.

Consider the following exam scheduling problem.

Figure 1.

Figure 2.

In addition to the information contained in figures 1 and 2, assume that we also know that exam 2 must be scheduled in time period 1 and that the final schedule must be such that the last exam involving a participant of type A is scheduled as early as possible.

We will now proceed to solve the above scheduling problem utilizing the RLF graph coloring algorithm.

Since exam 2 must be scheduled in time period 1, we will do so and amend figure 2 so that exams incompatible (i.e., may not be scheduled concurrently) with exam 2 will not be scheduled in time period 1. This information is included in figure 3.

Figure 3.

The restriction placed on exams involving type A participants may be satisfied, as far as is possible by heuristic means, by scheduling the exams involving type A individuals first and then, using this information, scheduling the remaining exams. The graph in figure 4 contains the information necessary for the first step.

Node Ei represents exam i and node Tj represents time period j for all i, j. There is an edge between every pair of time period nodes to ensure that no two time periods are assigned the same color. An edge is inserted between node Ei and node Ej if and only if exams i and j may not be scheduled simultaneously. Finally, an edge is inserted between node Ei and node Tj if and only if exam i may not be given during time period j.
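The construction of such a graph can be sketched in Python as follows. Because the data of figures 1 to 4 are not reproduced here, the sample exams, periods, conflicts, and forbidden assignments below are purely hypothetical, as is the function name `build_exam_graph`.

```python
from itertools import combinations

def build_exam_graph(exams, periods, conflicts, forbidden):
    """Nodes are ("E", i) for exams and ("T", j) for time periods.
    conflicts: pairs (i1, i2) of exams that may not run simultaneously.
    forbidden: pairs (i, j) meaning exam i may not be given in period j."""
    adj = {("E", i): set() for i in exams}
    adj.update({("T", j): set() for j in periods})

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for j1, j2 in combinations(periods, 2):   # periods get distinct colors
        link(("T", j1), ("T", j2))
    for i1, i2 in conflicts:
        link(("E", i1), ("E", i2))
    for i, j in forbidden:
        link(("E", i), ("T", j))
    return adj

# Hypothetical data: three exams, two periods.
graph = build_exam_graph(exams=[1, 2, 3], periods=[1, 2],
                         conflicts=[(1, 2)], forbidden=[(3, 1)])
```

In a proper coloring of the resulting graph, each exam node shares its color with at most one time period node, which identifies the period in which the exam is held.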

Figure 4.

To obtain the desired schedule we will color the graph in figure 4 using a slightly modified version of the RLF algorithm. At the beginning of each recursive step, we will first color the earliest uncolored time period node instead of the uncolored node adjacent to the most uncolored nodes. This will guarantee that exams are assigned to the earliest time periods first. We will denote the set of uncolored nodes adjacent to at least one colored node, U2, by encircling such nodes. Colors, as usual, will be denoted by a number. Uncolored nodes not adjacent to any colored node, nodes in U1, will not be labeled.

The coloring then proceeds as follows. Node T1 is colored 1 and nodes T2, T3, T4, T5, E4, E6, E10, and E12 are circled. This is illustrated in figure 5.

Figure 5.

Only nodes E1 and E7 remain in U1. Node E1 is adjacent to five nodes in U2 while node E7 is adjacent to only two nodes in U2. Thus node E1 is colored 1 and node E7 is circled. This leaves every node either colored or circled (U1 is empty), and thus we must delete the colored nodes (T1 and E1) from the graph and repeat the procedure on the resulting graph starting with color 2. Once T2 has been assigned color 2 and the appropriate nodes circled, we have the graph in figure 6.

Figure 6.

Both nodes E7 and E10 are adjacent to one node in U2 while all other nodes in U1 are not adjacent to any node in U2. Since E7 is connected to only one node in U1 while d_U1(E10) = 2, E7 is colored 2 and E4 is circled. This leaves the graph in figure 7.

Figure 7.

Completing the assignment of color 2, E10 is assigned color 2, node E6 is circled and, finally, E12 is colored 2. Since U1 is now empty, we delete the colored nodes and repeat the process on the graph in figure 8 using color 3.

Figure 8.

This graph is trivially colored by assigning color 3 to nodes T3, E4 and E6; color 4 to T4; and color 5 to T5.

The schedule may now be constructed by assigning to each time period those exams which were assigned the color of that time period. Should the number of colors used exceed the number of time period nodes used, the additional colors may be arbitrarily associated with additional time periods (assuming they exist). In this example, however, the number of time period nodes, 5, exceeded the number of colors used, 3, and no such additional assignment of time periods was necessary. The resulting schedule is displayed in figure 9.
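Reading the schedule off a finished coloring can be sketched as follows. The function name, the tuple node labels, and the sample coloring are assumptions introduced for illustration; colors not shared with any time period node are simply given the next unused period numbers, which is one arbitrary choice among those the text allows.

```python
def schedule_from_coloring(color, n_periods):
    """Map each time period to the exams that share its node's color."""
    period_of_color = {color[("T", j)]: j for j in range(1, n_periods + 1)}
    next_extra = n_periods + 1
    schedule = {}
    for (kind, i), c in sorted(color.items()):
        if kind != "E":
            continue
        if c not in period_of_color:
            # Color used by no time period node: assign an additional period.
            period_of_color[c] = next_extra
            next_extra += 1
        schedule.setdefault(period_of_color[c], []).append(i)
    return schedule

# Hypothetical coloring with two time period nodes and one surplus color.
sample = {("T", 1): 1, ("T", 2): 2, ("E", 1): 1, ("E", 2): 2, ("E", 3): 3}
schedule = schedule_from_coloring(sample, n_periods=2)
```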

Figure 9.

This completes the scheduling of exams involving type A individuals. We must now schedule the remaining exams taking into account the partial schedule in figure 9 and the information displayed in figure 3. This information is summarized in figure 10.

Figure 10.

Combining this information with that in figure 1, the graph in figure 11 is readily constructed.

Figure 11.

All that remains is to color the graph displayed in figure 11. This is easily done using the RLF algorithm. The final coloring is displayed in figure 12.

Figure 12.

Note that, depending on the order in which the nodes were considered, E11 could have been assigned color 2 and E8 color 3 since the two nodes are identical as far as the RLF algorithm is concerned. This illustrates the fact that the final colorings assigned to a graph may, to a small extent, depend on the initial ordering of the nodes (i.e., on the manner in which nodes with identical characteristics are distinguished). Finally, since 4 colors were necessary to color the graph, those exams colored 4 will be scheduled in the first available, unused time period, the fourth time period.

This results in the partial schedule in figure 13.

Figure 13.

The completed exam schedule for all exams is displayed in figure 14. In this particular case, the schedule produced is optimal.

Figure 14.

11. Appendix B: Computer Implementation of the RLF Algorithm

As was demonstrated in Appendix A, the RLF algorithm can be used to color small graphs easily by hand. For large graphs, however, such a task can only be accomplished reliably by computer. In order to program the RLF algorithm efficiently, the values of d_U1 and d_U2 must be stored for each node and updated whenever U1 or U2 is modified. This is accomplished by defining two arrays, E and F:

$$E(w) \begin{cases} < 0 & \text{if } w \text{ is colored} \\ < 0 & \text{if } w \in U_2 \\ = d_{U_1}(w) & \text{if } w \in U_1 \end{cases}$$

and

$$F(w) \begin{cases} < 0 & \text{if } w \text{ is colored} \\ = d_{U_1 \cup U_2}(w) & \text{if } w \in U_2 \\ = d_{U_1 \cup U_2}(w) & \text{if } w \in U_1 \end{cases}$$

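Initially U1 consists of every node in G and thus E(w) = F(w) = d(w) for all w ∈ V. If node w′ is selected for coloring, then we must modify E and F so that:

$$\begin{aligned}
E(w') &\leftarrow -1, \\
E(w) &\leftarrow -1 \quad \text{for } w \text{ adjacent to } w', \\
E(w) &\leftarrow E(w) - (\text{the number of nodes in } U_1 \text{ adjacent to both } w \text{ and } w') \quad \text{for } w \in U_1 \text{ and nonadjacent to } w', \\
F(w') &\leftarrow -1, \text{ and} \\
F(w) &\leftarrow F(w) - 1 \quad \text{for } w \text{ adjacent to } w'.
\end{aligned}$$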

The next node to be colored is then the node in U1 with the maximum value of F(w) − E(w), which for w ∈ U1 equals d_U2(w); if a tie exists, the node with the minimum value of E(w) among those tied is chosen. When U1 is empty (this corresponds to E(w) < 0 for all w ∈ V), the values of E are modified so that E(w) ← F(w) for all w ∈ V. This corresponds to a reinitialization of the values of E for the subgraph of G induced by the uncolored nodes.

The above operations on E and F can be easily performed by appropriate use of the following subroutine, procedure DELETE. Given an array D and node w′, DELETE performs the following operations on D:

$$D(w') \leftarrow -1, \qquad D(w) \leftarrow D(w) - 1 \quad \text{for } w \text{ adjacent to } w'.$$

Thus whenever node w′ is selected for coloring we may modify E and F by simply performing DELETE on (F, w′) and (E, w′) as well as on (E, w) for all w in U1 adjacent to w′. It is not difficult to verify that such a procedure maintains the desired values of E and F.
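The bookkeeping described above can be sketched in Python as follows. This is not the PL-1 listing; it is a minimal illustration that assumes 0-based node numbers, an adjacency-list input, and that the first node of each color class is the uncolored node adjacent to the most uncolored nodes (the rule stated in Appendix A). The names `delete` and `rlf` are chosen here for convenience.

```python
def delete(D, adj, w0):
    """Procedure DELETE: set D(w0) to -1 and decrement D for each neighbour."""
    D[w0] = -1
    for w in adj[w0]:
        D[w] -= 1

def rlf(adj):
    """Return a list of colors (1, 2, ...) for the graph adj (list of lists)."""
    n = len(adj)
    F = [len(nbrs) for nbrs in adj]  # degree among uncolored nodes; < 0 once colored
    E = list(F)                      # d_U1(w) while w is in U1; < 0 otherwise
    color = [0] * n
    remaining, k = n, 0
    while remaining:
        k += 1
        E = list(F)  # U1 is again every uncolored node
        # First node of the class: uncolored node with the most uncolored neighbours.
        w0 = max((w for w in range(n) if E[w] >= 0), key=lambda w: E[w])
        while True:
            color[w0] = k
            remaining -= 1
            moved = [w for w in adj[w0] if E[w] >= 0]  # U1 neighbours of w0
            delete(F, adj, w0)
            delete(E, adj, w0)
            for w in moved:          # these nodes pass from U1 to U2
                delete(E, adj, w)
            u1 = [w for w in range(n) if E[w] >= 0]
            if not u1:
                break
            # Maximum F - E (neighbours in U2), ties broken by minimum E.
            w0 = max(u1, key=lambda w: (F[w] - E[w], -E[w]))
    return color

# A 5-cycle, which needs three colors.
cycle = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
colors = rlf(cycle)
```

The scan that rebuilds `u1` costs O(n) per colored node, which stays within the O(n^2) budget for the non-deletion work; a production version would likely maintain U1 incrementally.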

A complete PL-1 computer program listing of the RLF procedure is included at the end of this appendix. The program is written in subroutine form and assumes that values for CI and CL are provided on input. Array CI serves as an index array to CL, the node adjacency list. For example, the nodes adjacent to the ith node are sequentially stored in CL(CI(i − 1) + 1), CL(CI(i − 1) + 2), …, CL(CI(i)). The data are stored in this compact form to minimize the amount of storage required, a very important consideration when working with large graphs.
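The compact storage scheme can be illustrated with a short Python sketch. It assumes 1-based node numbers as in the PL-1 program and CI(0) = 0, which the indexing formula implies for i = 1; a dummy entry at position 0 of `CL` keeps the 1-based subscripts. The helper names are hypothetical.

```python
def build_ci_cl(adj_lists):
    """adj_lists[i - 1] holds the (1-based) neighbours of node i."""
    CI = [0]      # CI[0] = 0 so that node 1 starts at CL[1]
    CL = [None]   # dummy entry: CL is addressed from 1
    for nbrs in adj_lists:
        CL.extend(nbrs)
        CI.append(len(CL) - 1)  # position of the last neighbour of this node
    return CI, CL

def neighbours(CI, CL, i):
    """Return CL(CI(i - 1) + 1), ..., CL(CI(i))."""
    return CL[CI[i - 1] + 1 : CI[i] + 1]

# Node 1 is adjacent to nodes 2 and 3; nodes 2 and 3 are adjacent only to node 1.
CI, CL = build_ci_cl([[2, 3], [1], [1]])
```

Each edge appears twice in `CL`, so the two arrays together occupy n + 1 + 2e entries (plus the dummy).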

Each node w in the graph is processed by procedure DELETE, i.e., deleted, f(w) times, where f(w) is the color number assigned to w. This claim is easily established by observing that for each new color i introduced, i ≤ f(w), w is either colored i or adjacent to a node colored i. In either case w is deleted exactly once. Once w is colored, it may never subsequently be deleted. Thus exactly

$$\sum_{i=1}^{k} i\,n_i \;\le\; \sum_{i=1}^{k} k\,n_i \;=\; k \sum_{i=1}^{k} n_i \;=\; nk$$

deletions are performed on G, where k is the number of colors used to color G, n_i is the number of nodes colored i, and n is the number of nodes in G. Since each deletion requires O(d) time, where d is the average degree of a node, all the deletions may be accomplished in O(kdn) time. It is easily checked that all other operations may be accomplished in O(n^2) time. Thus the algorithm requires O(n^3) time and O(n^2) space to color an arbitrary n node graph. However, for those graphs where kd = O(n) or, equivalently, ke = O(n^2), the RLF algorithm consumes only O(n^2) time and O(n^2) space in coloring the graph. As was pointed out in the text, many large-scale practical problems involve graphs for which this property holds and thus may be colored with the RLF algorithm in O(n^2) time.
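The equivalence of the two conditions follows from counting edges. A short derivation, assuming e denotes the number of edges of G (the symbol is not defined within this appendix):

```latex
e = \frac{1}{2} \sum_{w \in V} d(w) = \frac{nd}{2}
\quad\Longrightarrow\quad
kdn = 2ke ,
\qquad\text{so}\qquad
kd = O(n) \iff kdn = O(n^2) \iff ke = O(n^2).
```

Under either condition the deletion cost O(kdn) is O(n^2), matching the cost of the remaining operations.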

[PL-1 program listing of the RLF procedure: two page images (jres-84-489-g001.jpg, jres-84-489-g002.jpg), not reproduced in this text.]

12. Appendix C: Characterization of Test Graphs

The data provided in tables 4 and 5 of this appendix provide the necessary information to regenerate each of the test graphs used in the preparation of the data included in tables 1 and 2, respectively. Each graph is referenced by its chromatic number, average node degree, and order in which the colorings (or values of X0) are given. For example, the parameters for the third 150-node graph with chromatic number 5 and average node degree 11 (i.e., the graph that SL 8-colored) are: n = 150, k = 5, a = 8401, c = 6859, m = 84035, b5 = 19, b4 = 60, b3 = 97, b2 = 210, and X0 = 22093. Using these parameters, it is possible to regenerate the graph by using the procedure described in section 6.
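The procedure of section 6 is not reproduced in this appendix, so the sketch below covers only the random number stream. It assumes that a, c, m and X0 are the multiplier, increment, modulus and seed of a linear congruential generator of the kind treated in reference [12]; how the stream and the values b_i are turned into edges is left to section 6. The function names are hypothetical. The check `has_full_period` applies the standard Hull-Dobell conditions.

```python
from itertools import islice
from math import gcd

def lcg(a, c, m, x0):
    """Yield x1, x2, ... with x_{i+1} = (a * x_i + c) mod m, starting from seed x0."""
    x = x0
    while True:
        x = (a * x + c) % m
        yield x

def prime_factors(n):
    """Return the set of prime divisors of n by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, c, m):
    """Hull-Dobell conditions for the generator to have period exactly m."""
    return (gcd(c, m) == 1
            and all((a - 1) % p == 0 for p in prime_factors(m))
            and (m % 4 != 0 or (a - 1) % 4 == 0))

# Parameters quoted in the text for the third 150-node, 5-chromatic graph.
a, c, m, x0 = 8401, 6859, 84035, 22093
first_values = list(islice(lcg(a, c, m, x0), 3))
```

Under this reading, the four moduli in tables 4 and 5 all satisfy the full-period conditions with a = 8401 and c = 6859, so every seed X0 yields a stream that visits each residue once per cycle.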

Table 4.

Generation Parameters for the Test Graphs Used in Table 1.

χ   d   n    k   a     c     m       b                                                3 values of X0
5   11  150  5   8401  6859  84035   (19, 60, 97, 210)                                0, 33289, 22093
5   19  150  5   8401  6859  84035   (39, 120, 195, 420)                              21047, 55697, 74912
5   24  150  5   8401  6859  84035   (58, 180, 292, 630)                              78692, 83491, 52870
10  11  150  10  8401  6859  168070  (10, 0, 0, 12, 0, 0, 25, 0, 24)                  8879, 64827, 78005
10  21  150  10  8401  6859  168070  (21, 0, 0, 24, 0, 0, 51, 0, 48)                  5293, 59845, 102567
10  29  150  10  8401  6859  168070  (31, 0, 0, 36, 0, 0, 76, 0, 72)                  107489, 118239, 101759
15  12  150  15  8401  6859  252105  (4, 0, 0, 0, 0, 7, 0, 0, 0, 0, 12, 0, 0, 148)    80589, 60363, 94632
15  25  150  15  8401  6859  252105  (9, 0, 0, 0, 0, 15, 0, 0, 0, 0, 24, 0, 0, 297)   220881, 67107, 198723
15  34  150  15  8401  6859  252105  (13, 0, 0, 0, 0, 22, 0, 0, 0, 0, 36, 0, 0, 445)  66684, 189309, 9534

Table 5.

Generation Parameters for the Test Graphs Used in Table 2.

χ   d (avg.)  n    k   a     c     m       b                                                                               2 values of X0
5   25        450  5   8401  6859  84035   (175, 540, 877, 1890)                                                           0, 41794
5   43        450  5   8401  6859  84035   (409, 1260, 2047, 4410)                                                         35428, 47927
15  36        450  15  8401  6859  252105  (40, 0, 0, 0, 0, 67, 0, 0, 0, 0, 108, 0, 0, 1336)                               36276, 213549
15  74        450  15  8401  6859  252105  (94, 0, 0, 0, 0, 157, 0, 0, 0, 0, 252, 0, 0, 3118)                              161712, 160056
25  37        450  25  8401  6859  420175  (13, 0, 0, 0, 0, 0, 0, 0, 0, 27, 0, 0, 0, 0, 0, 0, 0, 0, 40, 0, 0, 0, 0, 1336)  192625, 358531
25  77        450  25  8401  6859  420175  (31, 0, 0, 0, 0, 0, 0, 0, 0, 63, 0, 0, 0, 0, 0, 0, 0, 0, 94, 0, 0, 0, 0, 3118)  247337, 274955

Footnotes

1

Numbers in brackets indicate literature references at the end of this paper.

9. References

  • [1] Aho A. V., Hopcroft J. E., and Ullman J. D., The Design and Analysis of Computer Algorithms, (Addison-Wesley, Reading, MA, 1974), pp. 364–404.
  • [2] Bondy J. A., Bounds for the chromatic number of a graph, Journal of Combinatorial Theory, 7 (1969), pp. 96–8.
  • [3] Broder S., Final examination scheduling, Communications of the ACM, Vol. 7, No. 8 (1964), pp. 494–8.
  • [4] Brooks R. L., On coloring the nodes of a network, Proceedings of the Cambridge Philosophical Society, 37 (1941), p. 194.
  • [5] Brown J. R., Chromatic scheduling and the chromatic number problem, Management Science, Vol. 19, No. 4 (1972), pp. 456–463.
  • [6] Christofides N., An algorithm for the chromatic number of a graph, The Computer Journal, Vol. 14, No. 1 (1971), pp. 38–9.
  • [7] Christofides N., Graph Theory: An Algorithmic Approach, (Academic Press, New York, 1975), pp. 58–78.
  • [8] Garey M. R., and Johnson D. S., The Complexity of Near-optimal Graph Coloring, Journal of the ACM, Vol. 23, No. 1 (January 1976), pp. 43–9.
  • [9] Hall A. D., and Acton F. S., Scheduling University Course Examinations by Computer, Communications of the ACM, Vol. 10, No. 4 (April 1967), pp. 235–8.
  • [10] Johnson D. S., Approximation Algorithms for Combinatorial Problems, Journal of Computer and System Sciences, 9 (1974), pp. 256–78.
  • [11] Johnson D. S., Worst Case Behavior of Graph Coloring Algorithms, Proceedings of the 5th Southeast Conference on Combinatorics, Graph Theory, and Computing (1974), pp. 513–27.
  • [12] Knuth D. E., The Art of Computer Programming, (Addison-Wesley, Reading, MA, 1969), Vol. 2, pp. 9–18.
  • [13] Leighton F. T., A New Solution to the Exam Scheduling Problem, unpublished paper.
  • [14] Matula D. W., Marble G., and Isaacson J. D., Graph Coloring Algorithms, Graph Theory and Computing, Read Ronald C., editor, (Academic Press, New York, 1972), pp. 109–122.
  • [15] Matula D. W., Bounded Color Functions on Graphs, Networks, 2 (1972), pp. 29–44.
  • [16] Ore Oystein, The Four Color Problem, (Academic Press, New York, London, 1967).
  • [17] Peck J. E. L., and Williams M. R., Examination Scheduling, Communications of the ACM, Vol. 9, No. 6 (1966), pp. 433–4.
  • [18] Pershin O. Y., An Algorithm for Determining the Minimum Coloring of a Finite Graph, Engineering Cybernetics, Vol. 11, No. 6 (1973), pp. 980–5.
  • [19] Szekeres G., and Wilf H. S., An Inequality for the Chromatic Number of a Graph, Journal of Combinatorial Theory, 4 (1968), pp. 1–3.
  • [20] Welsh D. J. A., and Powell M. B., An Upper Bound for the Chromatic Number of a Graph and its Applications to Timetabling Problems, The Computer Journal, 10 (1967), pp. 85–6.
  • [21] Williams M. R., The Coloring of Very Large Graphs, Combinatorial Structures and Their Applications, Proceedings of the Calgary International Conference on Combinatorial Structures and Their Applications, Gordon and Breach, New York (June 1969), pp. 477–8.
  • [22] Wood D. C., A System for Computing University Examination Timetables, The Computer Journal, Vol. 11, No. 1 (May 1968), p. 41.
  • [23] Wood D. C., A Technique for Coloring a Graph Applicable to Large Scale Timetabling Problems, The Computer Journal, 12 (1969), p. 317.

Articles from Journal of Research of the National Bureau of Standards are provided here courtesy of National Institute of Standards and Technology
