Proceedings of the National Academy of Sciences of the United States of America. 2003 Sep 22;100(20):11211–11215. doi: 10.1073/pnas.1635191100

Scaling and universality in continuous length combinatorial optimization

David Aldous, Allon G Percus
PMCID: PMC208736  PMID: 14504403

Abstract

We consider combinatorial optimization problems defined over random ensembles and study how solution cost increases when the optimal solution undergoes a small perturbation δ. For the minimum spanning tree, the increase in cost scales as $\delta^2$. For the minimum matching and traveling salesman problems in dimension d ≥ 2, the increase scales as $\delta^3$; this is observed in Monte Carlo simulations in d = 2, 3, 4 and in theoretical analysis of a mean-field model. We speculate that the scaling exponent could serve to classify combinatorial optimization problems of this general kind into a small number of distinct categories, similar to universality classes in statistical physics.


The interface of statistical physics, algorithmic theory, and mathematical probability is an active research field, containing diverse topics such as mixing times of Glauber-type dynamics (ref. 1 and many others), reconstruction of broadcast information (2), and probabilistic analysis of paradigm computational problems such as k-SAT (3–5). In this article we introduce a topic whose motivation is simpler than those.

Freshman calculus tells us that, for a smooth function $F:\mathbb{R}\to\mathbb{R}$ attaining its minimum at $x^*$, for $x$ near $x^*$ the relation between $\delta = |x - x^*|$ and $\varepsilon = F(x) - F(x^*)$ is $\varepsilon \approx \tfrac{1}{2}F''(x^*)\,\delta^2$. If instead we consider a function $F:\mathbb{R}^d\to\mathbb{R}$ on d-dimensional space, sophomore calculus tells us that similarly

$$\varepsilon(\delta) := \min\{F(x) - F(x^*) : |x - x^*| = \delta\} \approx c\,\delta^2$$

for appropriate c. So in a sense the scaling exponent 2 is naturally associated with “smooth” or “regular” optimization problems.

Now consider a graph-based combinatorial optimization problem, such as the traveling salesman problem (TSP): each feasible solution has n constituents (edges) and associated continuous costs (lengths), the sum of which gives the overall solution cost. Compare an arbitrary feasible solution x with the optimal (minimal) solution x*, which for generic lengths is unique, by the two quantities

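$$\delta_n(x) = \frac{\text{number of edges in } x \text{ but not in } x^*}{n}, \qquad \varepsilon_n(x) = \frac{(\text{cost of } x) - (\text{cost of } x^*)}{s(n)},$$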

where s(n) expresses the rate at which the optimal cost scales in n. Define $\varepsilon_n(\delta)$ to be the minimum value of $\varepsilon_n(x)$ over all feasible solutions x for which $\delta_n(x) \ge \delta$. Although the function $\varepsilon_n(\delta)$ will depend on n and the problem instance, we anticipate that for typical instances drawn from a suitable probability model it will converge in the n → ∞ limit to some deterministic function ε(δ).
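As an illustration of the two quantities just defined, the following minimal sketch computes them for solutions given as edge lists. It assumes the reading that the distance counts edges of x absent from x*, divided by n, and that the cost excess is divided by s(n). The function names and the 4-vertex instance are hypothetical, chosen only for the example.

```python
def as_edge_set(edges):
    """Treat each edge as an unordered vertex pair."""
    return {frozenset(e) for e in edges}


def delta_n(solution, optimum, n):
    """Number of edges in `solution` but not in `optimum`, divided by n."""
    return len(as_edge_set(solution) - as_edge_set(optimum)) / n


def epsilon_n(solution, optimum, length, s_n):
    """Cost excess of `solution` over `optimum`, divided by the scale s(n)."""
    def cost(edges):
        return sum(length[e] for e in as_edge_set(edges))
    return (cost(solution) - cost(optimum)) / s_n


# Hypothetical 4-vertex instance: four sides of length 1, two diagonals of length 1.5.
length = {frozenset(p): 1.0 for p in [(0, 1), (1, 2), (2, 3), (3, 0)]}
length.update({frozenset((0, 2)): 1.5, frozenset((1, 3)): 1.5})
optimum = [(0, 1), (1, 2), (2, 3), (3, 0)]   # tour around the sides
other = [(0, 2), (2, 1), (1, 3), (3, 0)]     # tour using both diagonals

delta_val = delta_n(other, optimum, n=4)             # 2 new edges / 4
eps_val = epsilon_n(other, optimum, length, s_n=4)   # (5.0 - 4.0) / 4
```

The function ε_n(δ) of the text is then the minimum of `epsilon_n` over all feasible solutions whose `delta_n` is at least δ, which is feasible to enumerate only for very small instances.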

The universality paradigm from statistical physics suggests there may be a scaling exponent a such that

$$\varepsilon(\delta) \sim \delta^{a} \quad \text{as } \delta \to 0,$$

and that the exponent should be robust under model details. In statistical physics, universality classes are typically defined by critical exponents that characterize the behavior of measurable quantities both near and at a phase transition. Although a is not a critical exponent here, and there is no phase transition, we suggest that it could play a similar role, categorizing combinatorial optimization problems into a small set of classes. If our analogy with freshman calculus is apposite, we expect that the simplest problems should have scaling exponent 2.

This approach may seem obvious in retrospect and fits within a long-standing tradition in the physical sciences (see Discussion). However, it has never been proposed or studied explicitly. In this article we report on three aspects of our program. For the minimum spanning tree (MST), a classic “algorithmically easy” problem solvable to optimality by greedy methods, we confirm that the scaling exponent is indeed 2. We then turn to two harder problems: minimum matching (MM) and the TSP. Under a mean-field model, our mathematical analysis methods combined with numerics show that the scaling exponent is 3 for both MM and TSP, independent of the pseudo-dimension defined below. For the Euclidean model the exponent is 2 in the (essentially trivial) 1D case, while Monte Carlo simulations suggest it is 3 in higher dimensions.

Models

In the Euclidean model we take n random points in a d-dimensional cube whose volume scales as n. Interpoint lengths are Euclidean distances. To reduce finite-size effects, we take the space to have periodic (toroidal) boundary conditions when calculating the distances.

In the mean-field or random link model we imagine n random points in some abstract space such that the $\binom{n}{2}$ vertex pair lengths are i.i.d. random variables distributed as $n^{1/d}\,l$, with probability density $p(l) \sim l^{d-1}$ for small $l$. Here 0 < d < ∞ is the pseudo-dimension parameter and the distribution of small single interpoint lengths mimics that in the Euclidean model of corresponding dimension d, up to a proportionality constant. Both models are set up so that nearest-neighbor distances are of order 1 and the scaling of overall cost in the optimization problems is s(n) = n.
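The two ensembles can be sampled as in the sketch below. The Euclidean generator places n uniform points in a periodic cube of volume n. For the mean-field generator the text fixes only the small-length behavior of the density, so the specific choice here (l = U^{1/d} with U uniform, which has density d l^{d-1} on (0, 1), rescaled by n^{1/d}) is an assumption made for illustration, as are the function names.

```python
import math
import random


def euclidean_torus_lengths(n, d, rng):
    """Pairwise distances of n uniform points in a periodic cube of volume n."""
    side = n ** (1.0 / d)
    pts = [[rng.uniform(0.0, side) for _ in range(d)] for _ in range(n)]
    lengths = {}
    for i in range(n):
        for j in range(i + 1, n):
            sq = 0.0
            for a, b in zip(pts[i], pts[j]):
                diff = abs(a - b)
                diff = min(diff, side - diff)   # toroidal wrap-around
                sq += diff * diff
            lengths[(i, j)] = math.sqrt(sq)
    return lengths


def mean_field_lengths(n, d, rng):
    """I.i.d. pair lengths n**(1/d) * U**(1/d); density ~ l**(d-1) at small l.

    The uniform-based law is an illustrative assumption, not the only choice.
    """
    return {(i, j): (n * rng.random()) ** (1.0 / d)
            for i in range(n) for j in range(i + 1, n)}


rng = random.Random(1)
eu = euclidean_torus_lengths(30, 2, rng)
mf = mean_field_lengths(30, 2, rng)
```

In both cases the typical nearest-neighbor length is of order 1, matching the normalization s(n) = n used in the text.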

A Simple Case: The MST

For the MST, given any reasonable model of interpoint lengths including the two models above, we expect a scaling exponent of 2. We will provide a rigorous account elsewhere, but the underlying idea is simple. The classical greedy algorithm gives the following explicit inclusion criterion for whether an edge $e = (v_1, v_2)$ of a graph belongs in the MST. Consider the subgraph containing edges between any two vertices within length t of each other. Let perc(e) ≤ length(e) be the smallest t that keeps $v_1$ and $v_2$ within the same connected component. It is not difficult to see that $e \in \mathrm{MST}$ if and only if length(e) = perc(e).
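The inclusion criterion can be checked numerically on a small complete graph. The sketch below is one possible way to compute perc(e) for every pair: it runs the greedy (Kruskal) procedure and records, for each pair of vertices, the edge length at which their components first merge. The function name and the random test instance are hypothetical.

```python
import random


def perc_and_mst(n, lengths):
    """Return (perc, mst) for a complete graph with lengths {(i, j): t}, i < j.

    perc[(i, j)] is the smallest threshold t at which i and j are connected
    using only edges of length <= t; mst is the set of greedy (Kruskal) edges.
    """
    comp = list(range(n))                    # component label of each vertex
    members = {i: [i] for i in range(n)}     # vertices in each component
    perc, mst = {}, set()
    for (i, j), t in sorted(lengths.items(), key=lambda kv: kv[1]):
        a, b = comp[i], comp[j]
        if a == b:
            continue                         # already connected below t
        mst.add((i, j))                      # this edge joins two components
        for u in members[a]:
            for v in members[b]:
                perc[(min(u, v), max(u, v))] = t
        if len(members[a]) < len(members[b]):
            a, b = b, a                      # merge the smaller into the larger
        for v in members[b]:
            comp[v] = a
        members[a].extend(members.pop(b))
    return perc, mst


rng = random.Random(3)
n = 12
lengths = {(i, j): rng.random() for i in range(n) for j in range(i + 1, n)}
perc, mst = perc_and_mst(n, lengths)
in_mst_iff_equal = all((e in mst) == (lengths[e] == perc[e]) for e in lengths)
```

For generic (distinct) lengths, `in_mst_iff_equal` holds, and the differences length(e) − perc(e) over non-tree edges are exactly the quantities whose small-value density drives the argument that follows.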

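Given a probability model for n random points and their interpoint lengths, define a measure $\mu_n(x)$ on $x \in (0, \infty)$ in terms of the expectation

$$\mu_n(x) = \frac{1}{n}\,E\big[\text{number of edges } e \text{ with } 0 < \mathrm{length}(e) - \mathrm{perc}(e) < x\big].$$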

For any reasonable model we expect an n → ∞ limit measure μ(x), with a density ν(x) = dμ/dx having a nonzero limit ν(0+).

Now modify the MST by adding an edge e with length(e) – perc(e) = b, for some small b, to create a cycle: then delete the longest edge e′ ≠ e of that cycle, which necessarily has length(e′) = perc(e). This gives a spanning tree containing exactly one edge not in the MST and having length greater by b. Repeat this procedure with every edge e for which 0 < length(e) – perc(e) < β, for some small β. The number of such edges is nμ(β) ≈ nν(0+)β to first order in β, and as there is negligible overlap between cycles, each of the new edges will increase the tree length by ∼β/2 on average. So

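$$\delta(\beta) \approx \nu(0^+)\,\beta, \qquad \varepsilon(\beta) \approx \tfrac{1}{2}\,\nu(0^+)\,\beta^2, \qquad \text{hence} \qquad \varepsilon(\delta) \approx \frac{\delta^2}{2\,\nu(0^+)}.$$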

This construction must yield essentially the minimum value of ε for given δ, so the scaling exponent is 2.

Poisson Weighted Infinite Tree (PWIT)

We now consider the MM and TSP. In MM, we ask for the minimum total length $L_n$ of n/2 edges matching n random points and study the normalized limit expectation $\lim_{n\to\infty} (2/n)\,E[L_n]$. Taking the mean-field model with d = 1 for simplicity, the limit value $\pi^2/6$ was obtained in ref. 6 by using the replica method from statistical physics. We work in the framework of ref. 7, which rederives this limit rigorously by doing calculations within an n = ∞ limit structure, the PWIT.

Briefly, the PWIT is an infinite degree rooted tree in which the edge weights (lengths) at each vertex are distributed as the successive points $0 < \xi_1 < \xi_2 < \cdots$ of a Poisson process with a mean number $x^d$ of points in [0, x], i.e., a process with rate increasing as $d\,x^{d-1}$. In this way, the PWIT corresponds to the mean-field model at a given d (see ref. 8 for review).
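The edge lengths at a single PWIT vertex can be sampled by a standard time change: if Γ_1 < Γ_2 < ... are the arrival times of a unit-rate Poisson process, then the points Γ_i^{1/d} have mean number x^d in [0, x]. The sketch below uses this construction; the function name and the truncation to the first k edges are choices made here for illustration.

```python
import random


def pwit_edge_lengths(d, k, rng):
    """First k edge lengths at one PWIT vertex, for pseudo-dimension d.

    Gamma_i are unit-rate Poisson arrival times; xi_i = Gamma_i ** (1/d)
    then has a mean number x**d of points in [0, x].
    """
    lengths, gamma = [], 0.0
    for _ in range(k):
        gamma += rng.expovariate(1.0)
        lengths.append(gamma ** (1.0 / d))
    return lengths


rng = random.Random(2)
xs = pwit_edge_lengths(2.0, 30, rng)

# Monte Carlo check of the mean count in [0, 1.5] for d = 2 (target 1.5**2).
counts = [sum(1 for v in pwit_edge_lengths(2.0, 30, rng) if v <= 1.5)
          for _ in range(4000)]
mean_count = sum(counts) / len(counts)
```

A full PWIT is obtained by attaching an independent copy of this process to every vertex, generation by generation; only the lengths at one vertex are needed for the recursions discussed below.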

Consider a matching on an instance of a rooted PWIT, as well as a matching on the same instance but with the root removed, as shown in Fig. 1. Introduce the variable

$$X = (\text{length of optimal matching on the PWIT}) - (\text{length of optimal matching on the PWIT with root removed}).$$

Both lengths are infinite, so this is interpreted as a limit of finite differences. If $X_i$ is the analogous quantity for the ith constituent subtree of the rootless PWIT instance and $\xi_i$ the length of the root's ith edge, these variables satisfy the recursion

$$X = \min_{1 \le i < \infty} (\xi_i - X_i). \qquad [1]$$

Fig. 1. Matching on a PWIT with (a) and without (b) root node. Numbers represent edge weights (lengths).

Now take the $\{\xi_i\}$ to be the Poisson-distributed edge lengths and the $\{X_i\}$ to be independent random variables from the same random process that produces X. Eq. 1 is then a distributional equation for X and can be shown (7) for d = 1 to have as its unique solution the logistic distribution

$$P(X \le x) = \frac{1}{1 + e^{-x}}, \qquad -\infty < x < \infty. \qquad [2]$$
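The fixed-point property can be checked by simulation. The sketch below assumes that Eq. 1 has the form X = min_i (ξ_i − X_i) and that Eq. 2 is the standard logistic law; it feeds independent logistic variables into one step of the recursion for d = 1 and records the output sample. It is a one-step consistency check, not the bootstrap scheme used later in the text, and the truncation to the first k edges is an approximation introduced here.

```python
import math
import random


def sample_logistic(rng):
    """Standard logistic variate, P(X <= x) = 1 / (1 + exp(-x))."""
    u = rng.random()
    while u == 0.0:              # avoid log(0)
        u = rng.random()
    return math.log(u / (1.0 - u))


def recursion_step(rng, k=30):
    """One draw of min_i (xi_i - X_i) for d = 1, using the first k edges."""
    xi, best = 0.0, math.inf
    for _ in range(k):
        xi += rng.expovariate(1.0)           # next point of the rate-1 process
        best = min(best, xi - sample_logistic(rng))
    return best


rng = random.Random(0)
out = [recursion_step(rng) for _ in range(10000)]
mean = sum(out) / len(out)
var = sum((x - mean) ** 2 for x in out) / len(out)
```

If the output is again logistic, its mean should be close to 0 and its variance close to π²/3 ≈ 3.29, the variance of the standard logistic distribution.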

The PWIT structure further leads to the following inclusion criterion. Consider an edge of length x in the tree, and the two subtrees formed by deleting that edge. The memoryless nature of the Poisson process allows us to consider each of these subtrees as independent copies of a PWIT, with their roots at the vertices of the deleted edge. It may be seen that including the edge in the optimal matching incurs a cost of $x - X_1 - X_2$, where $X_1$ and $X_2$ are the X variables as defined above, but for the two subtrees. Thus, an edge of length x is present in the minimal matching if and only if

$$x < X_1 + X_2. \qquad [3]$$

The probability density function for edge lengths in the MM is then

$$p(x) = P(X_1 + X_2 > x), \qquad 0 < x < \infty.$$

Here $X_1$ and $X_2$ are independent random variables distributed according to Eq. 2, from which the mean edge length can be calculated:

$$\int_0^\infty x\,P(X_1 + X_2 > x)\,dx = \frac{\pi^2}{6}.$$
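The value π²/6 can be reproduced by a short Monte Carlo calculation. The sketch assumes, for d = 1, that the edge-length density in the matching is P(X_1 + X_2 > x) with X_1, X_2 independent standard logistic variables. Writing S = X_1 + X_2 and S^+ = max(S, 0), the integral of P(S > x) over x > 0 equals E[S^+], and the integral of x P(S > x) equals E[(S^+)^2]/2, so both can be estimated from samples of S.

```python
import math
import random


def sample_logistic(rng):
    """Standard logistic variate via the inverse distribution function."""
    u = rng.random()
    while u == 0.0:              # avoid log(0)
        u = rng.random()
    return math.log(u / (1.0 - u))


rng = random.Random(5)
n_samples = 100000
total_mass = 0.0     # estimates E[S^+]: total mass of the edge-length density
mean_length = 0.0    # estimates E[(S^+)^2] / 2: mean matched-edge length
for _ in range(n_samples):
    s_plus = max(sample_logistic(rng) + sample_logistic(rng), 0.0)
    total_mass += s_plus
    mean_length += 0.5 * s_plus * s_plus
total_mass /= n_samples
mean_length /= n_samples
```

Under these assumptions `total_mass` should be close to 1 (each vertex is matched by exactly one edge) and `mean_length` close to π²/6 ≈ 1.645.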

Mean-Field MM and TSP

The previous section summarized analysis from ref. 7; now we continue with new analysis. To study scaling exponents, we introduce a parameter λ > 0 that plays the role of a Lagrange multiplier. Penalize edges used in the optimal matching by adding λ to their length. Let us study optimal solutions to the MM problem on this new penalized instance. Precisely, on a realization of the PWIT, define Y and Z as

$$Y \ (\text{or } Z) = (\text{length of optimal matching on new rooted tree}) - (\text{length of optimal matching on new rootless tree}),$$

where Y and Z differ in the definition of the edge lengths of the new tree: for Y, the edges penalized are those used by the original rooted optimal matching; for Z, they are those used by the original rootless optimal matching.

For the penalized problem the recursion Eq. 1 for X is supplemented by the following recursions for (X, Y, Z) jointly. Let i* be the value of i that minimizes $\xi_i - X_i$. Then

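$$Y = \min\Big(\lambda + \xi_{i^*} - Z_{i^*},\ \min_{i \ne i^*}(\xi_i - Y_i)\Big), \qquad Z = \min_{1 \le i < \infty}(\xi_i - Y_i),$$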

where, as before, the $\{Y_i\}$ and $\{Z_i\}$ are independent random variables from the same random process producing Y and Z.

Moreover, we get an inclusion criterion, analogous to Eq. 3: an edge of length x is included if and only if

graphic file with name M18.gif

In terms of the expected unique joint distribution for (X, Y, Z), the quantities δ and ε that compare the penalized solution (as a nonoptimal solution of the original problem) with the original optimal solution are

graphic file with name M19.gif

and

graphic file with name M20.gif

By the theory of Lagrange multipliers these functions ε(λ), δ(λ) determine ε(δ). We do not have explicit analytic expressions analogous to Eq. 2 for the joint distribution of (X, Y, Z) in terms of λ. However, we can use routine bootstrap Monte Carlo simulations to simulate the distribution and thence estimate the functions δ(λ) and ε(λ) numerically. And as indicated in refs. 7, 9, and 10 the mean-field MM and the mean-field TSP can be studied by using similar techniques; the TSP analysis is just a minor variation of the MM analysis. For instance, recursion Eq. 1 becomes

$$X = \min^{[2]}_{1 \le i < \infty} (\xi_i - X_i),$$

where $\min^{[2]}$ denotes the second minimum.
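The only change between the two recursions is which order statistic of the differences ξ_i − X_i is taken. A minimal sketch of one update step for each problem, assuming that Eq. 1 reads X = min_i (ξ_i − X_i) and that its TSP counterpart replaces the minimum by the second minimum, could look as follows. The function names and the finite input lists are hypothetical.

```python
import heapq


def matching_step(xi, x_children):
    """MM update: smallest value of xi_i - X_i."""
    return min(x - c for x, c in zip(xi, x_children))


def tsp_step(xi, x_children):
    """TSP update: second-smallest value of xi_i - X_i."""
    return heapq.nsmallest(2, (x - c for x, c in zip(xi, x_children)))[1]


xi = [1.0, 2.0, 3.0]             # edge lengths at the root (increasing)
x_children = [0.5, 2.5, 0.0]     # X values of the three subtrees
mm_value = matching_step(xi, x_children)    # differences are 0.5, -0.5, 3.0
tsp_value = tsp_step(xi, x_children)
```

The second minimum reflects that each vertex of a tour uses two edges, whereas each vertex of a matching uses one.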

Table 1 reports numerical results showing good agreement with $\varepsilon \propto \delta^3$ in both problems for d = 1. These numerics are compatible with independent MM results obtained recently (11), as well as with our direct simulations on mean-field TSP instances at n = 512. The same exponent 3 arises for other d.

Table 1. Scaling for mean-field MM and TSP in pseudo-dimension d = 1, obtained by simulating the joint distribution of (X, Y, Z).

        MM                          TSP
λ       δ       ε       2.3δ^3      δ       ε       2.0δ^3
0.02    0.112   0.004   0.003       0.128   0.009   0.006
0.04    0.156   0.010   0.009       0.175   0.015   0.011
0.06    0.190   0.017   0.016       0.212   0.023   0.019
0.08    0.219   0.024   0.024       0.243   0.030   0.029
0.10    0.244   0.035   0.033       0.270   0.042   0.039
0.12    0.267   0.042   0.044       0.300   0.053   0.051
0.14    0.287   0.053   0.054       0.318   0.065   0.064
0.16    0.306   0.067   0.066       0.340   0.077   0.079
0.18    0.323   0.080   0.078       0.360   0.091   0.093
0.20    0.340   0.089   0.090       0.379   0.104   0.109

Results show a good fit to $\varepsilon \sim 2.3\,\delta^3$ (MM) and $\varepsilon \sim 2.0\,\delta^3$ (TSP). In more detail, δ scales as $\lambda^{1/2}$ while ε scales as $\lambda^{3/2}$. Estimates for ε have standard deviation of ≈0.001 for MM and 0.003 for TSP.
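The prefactors quoted with Table 1 can be recovered from the tabulated values. The sketch below fixes the exponent at 3 and fits the single constant c in ε = c δ^3 by least squares through the origin. This is a consistency check on the constants 2.3 and 2.0 only; it is not an independent estimate of the exponent, and the fitting procedure is an assumption made here rather than the one used to produce the table.

```python
# Table 1 values (d = 1), copied from the text.
mm_delta = [0.112, 0.156, 0.190, 0.219, 0.244, 0.267, 0.287, 0.306, 0.323, 0.340]
mm_eps = [0.004, 0.010, 0.017, 0.024, 0.035, 0.042, 0.053, 0.067, 0.080, 0.089]
tsp_delta = [0.128, 0.175, 0.212, 0.243, 0.270, 0.300, 0.318, 0.340, 0.360, 0.379]
tsp_eps = [0.009, 0.015, 0.023, 0.030, 0.042, 0.053, 0.065, 0.077, 0.091, 0.104]


def cubic_prefactor(delta, eps):
    """Least-squares c in eps = c * delta**3 (fit through the origin)."""
    num = sum(e * d ** 3 for d, e in zip(delta, eps))
    den = sum(d ** 6 for d in delta)
    return num / den


c_mm = cubic_prefactor(mm_delta, mm_eps)
c_tsp = cubic_prefactor(tsp_delta, tsp_eps)
```

Both fitted constants land close to the quoted 2.3 and 2.0. The smallest-λ rows carry ε values comparable to the stated standard deviations, so they constrain the fit only weakly.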

Euclidean MM and TSP

We consider the d = 1 case where the scaling exponent can be found exactly and give numerical results for other cases. We restrict the discussion to the Euclidean TSP, although as for the mean-field model, MM is phenomenologically similar.

Take the Euclidean TSP in d = 1, with periodic boundary conditions. The optimal tour here is trivial (with high probability a straight line of length n) but nevertheless instructive to analyze. As before, add a penalty term λ to each edge used in the tour and consider how the optimal tour changes in this new penalized instance. When λ is small, changes to the tour will consist of “2-changes” shown in Fig. 2 and will occur when an original edge length is < λ. A simple nearest-neighbor distance argument gives the distribution of edge lengths in the original tour as $p(l) \sim e^{-l}$. Since two edges are modified in each 2-change,

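$$\delta(\lambda) \sim 2\int_0^{\lambda} p(l)\,dl \sim \lambda, \qquad \varepsilon(\lambda) \sim 2\int_0^{\lambda} l\,p(l)\,dl \sim \lambda^2, \qquad \text{so } \varepsilon \sim \delta^2.$$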

The scaling exponent of 2 is not surprising, as the 1D TSP behaves very similarly to the 1D MST on penalized instances. Furthermore, it is consistent with the intuition that the “easiest” problems scale in this way. A similar argument applies to MM, and in both cases a more rigorous analysis yields bounds on δ and ε that confirm the exponent.
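The exponent 2 in the 1D case can be seen from a short calculation. The sketch below assumes unit-rate exponential edge lengths, p(l) = e^{-l}, and that a 2-change occurs exactly at edges shorter than λ, so that up to constant factors δ(λ) is the integral of p(l) over [0, λ] and ε(λ) the integral of l p(l) over the same range. Constant factors are dropped, since they do not affect the exponent; the function names are hypothetical.

```python
import math


def one_d_scaling(lam):
    """Scaling forms for the 1D penalized tour, constant factors dropped."""
    delta = 1.0 - math.exp(-lam)                 # integral of exp(-l) on [0, lam]
    eps = 1.0 - (1.0 + lam) * math.exp(-lam)     # integral of l * exp(-l) on [0, lam]
    return delta, eps


def local_exponent(lam1, lam2):
    """Slope of log(eps) against log(delta) between two penalty values."""
    d1, e1 = one_d_scaling(lam1)
    d2, e2 = one_d_scaling(lam2)
    return math.log(e2 / e1) / math.log(d2 / d1)


slope = local_exponent(0.01, 0.02)
```

For small λ the slope approaches 2, since δ grows linearly and ε quadratically in λ.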

Fig. 2. 2-change schematic. Original optimal tour is shown by dashed line. New optimal tour on penalized instance is shown by solid line; over sufficiently short lengths, the tour doubles back to avoid using penalized edges.

For d > 1, numerical results are shown in Fig. 3. These have been obtained by finding exact solutions to randomly generated n = 512 Euclidean instances in d = 2, 3, 4, using the concorde TSP solver available at www.math.princeton.edu/tsp/concorde.html. For each instance, the optimum was obtained on the original instance as well as on the instance penalized with a range of λ values. For each λ value, δ(λ) and ε(λ) were averaged over the sample of instances. The resulting numerics are closely consistent with a scaling exponent of 3 (in spite of suffering from some finite-size effects at smaller δ), suggesting that the mean-field picture gives the correct exponent in all but the trivial 1D case. In the language of critical exponents, this would correspond to an “upper critical dimension” of 2.
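The penalization procedure itself is easy to reproduce on tiny instances. The sketch below replaces the exact solver used for the reported results with exhaustive enumeration, which is practical only for a handful of points and therefore cannot reproduce the n = 512 numerics. It solves a random toroidal instance, adds λ to every edge of the optimal tour, solves again, and reports δ and ε measured with the original lengths. All function names are hypothetical.

```python
import itertools
import math
import random


def torus_dist(p, q, side):
    """Euclidean distance with periodic boundary conditions."""
    return math.sqrt(sum(min(abs(a - b), side - abs(a - b)) ** 2
                         for a, b in zip(p, q)))


def tour_edges(order):
    """Undirected edges of the closed tour visiting `order` in sequence."""
    k = len(order)
    return {frozenset((order[i], order[(i + 1) % k])) for i in range(k)}


def best_tour(n, cost):
    """Exhaustive search over tours starting at vertex 0 (tiny n only)."""
    best, best_edges = math.inf, None
    for perm in itertools.permutations(range(1, n)):
        edges = tour_edges((0,) + perm)
        c = sum(cost[e] for e in edges)
        if c < best:
            best, best_edges = c, edges
    return best_edges


def penalized_delta_epsilon(n, d, lam, rng):
    """delta and epsilon for one random instance and penalty lam."""
    side = n ** (1.0 / d)
    pts = [[rng.uniform(0.0, side) for _ in range(d)] for _ in range(n)]
    length = {frozenset((i, j)): torus_dist(pts[i], pts[j], side)
              for i in range(n) for j in range(i + 1, n)}
    opt = best_tour(n, length)
    penalized = {e: l + (lam if e in opt else 0.0) for e, l in length.items()}
    new = best_tour(n, penalized)
    delta = len(new - opt) / n
    eps = (sum(length[e] for e in new) - sum(length[e] for e in opt)) / n
    return delta, eps


d0, e0 = penalized_delta_epsilon(7, 2, 0.0, random.Random(4))
d1, e1 = penalized_delta_epsilon(7, 2, 0.5, random.Random(4))
```

Averaging `delta` and `eps` over many instances for a range of λ values, as described in the text, gives the curves from which the exponent is read off; at such small n, finite-size effects dominate and no exponent should be inferred.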

Fig. 3. Scaling for Euclidean TSP in d = 2, 3, 4, based on exact solutions for 100 instances in each case. Data points correspond to λ values from 0.004 to 0.05. Slopes of best-fit lines vary from 2.94 to 3.24. Standard deviation is $\approx 3 \times 10^{-3}$ for δ and $3 \times 10^{-5}$ for ε.

Discussion

The goal of our scaling study has been to address a new kind of problem in the theory of algorithms, using concepts from statistical physics. Traditionally, work on the TSP within the theory of algorithms (12) has emphasized algorithmic performance, rather than the kinds of questions we ask here. Rigorous study of the Euclidean TSP model within mathematical probability (13) has yielded a surprising amount of qualitative information: existence of an n → ∞ limit constant giving the mean edge-length in the optimal TSP tour (14), and large deviation bounds for the probability that the total tour length differs substantially from its mean (15). However, calculation of explicit constants in dimensions d ≥ 2 seems beyond the reach of analytic techniques. For the mean-field bipartite MM problem, impressive recent work (26, 27) has proven an exact formula giving the expectation of the finite-n minimum total matching length, though such exact methods seem unlikely to be widely feasible.

On the other hand, there has been significant progress over the past 20 years in the use of statistical physics techniques on combinatorial optimization problems in general. Finding optimal solutions to these problems is a direct analog to determining ground states in statistical physics models of disordered systems (16). This observation has motivated the development of such approaches as simulated annealing (17), the replica method (18), and the cavity method (4). Condensed matter physics, particularly models arising in spin glass theory, has provided a powerful means to study algorithmic problems; at the same time, algorithmic results have implications for the associated physical models. It is instructive to consider our work in that context.

Researchers in the physical sciences have long been interested in the low-temperature thermodynamics (18, 19) of disordered systems, investigating properties of near-optimal states in spin glass models. Our procedure for studying near-optimal solutions by way of a penalty parameter is similar to a method, known as ε-coupling (20–22), used for calculating low-energy excitations in spin glasses. Making use of this method, physicists have obtained quantities closely analogous to our scaling exponents for models of RNA folding (22). Furthermore, in the last year independent work (11) has explored ε-coupling on MM, numerically identifying a different but related scaling exponent.

For the TSP, analytical and numerical studies were performed >15 years ago (23, 24) on the thermodynamics of the model, with overlap quantities calculated for near-optimal solutions. The results have suggested that at low temperature T, the cost excess ε scales as $T^2$ while the average fraction of differing edges between solutions (1 – q, with q being the “overlap fraction”) scales as T. This leads to $\varepsilon \sim (1 - q)^2$, in apparent contradiction with our exponent of 3. However, at low temperatures, q represents overlaps between typical near-optimal solutions, whereas our δ measures overlaps between a near-optimal solution and the optimum. The different definitions of these two quantities could account for the discrepancy in scaling exponent: it is not surprising that 1 – q grows faster than δ as one considers solutions of increasing cost. At the same time, a possible implication of these results is that at low temperature, $\delta \sim T^{2/3}$. We are not aware of any direct theoretical arguments to explain this and consider it an intriguing open question.

It is also important to note that the underlying property δ → 0 as ε → 0 cannot always be taken for granted. This property is called asymptotic essential uniqueness (AEU) (7). AEU requires, among other things, that the optimum itself be unique. In principle, even if it is not, one could still analyze near-optimal scaling by considering sufficiently local perturbations from a given optimum. It is natural to expect the resulting exponent to be independent of the specific optimum chosen. However, this may not be true in the event of what statistical physicists call replica symmetry breaking (RSB) (18, 19). AEU is a special case of replica symmetry, so while RSB implies the absence of AEU, the absence of AEU does not necessarily imply RSB. A current debate in condensed matter literature concerns whether or not low-temperature spin glasses display RSB (20, 21, 25). It is generally believed that RSB is incompatible with unique nonzero values of various scaling exponents. Thus, the correct approach to analyzing near-optimal scaling in such problems remains another open question.

One final example may serve to illustrate the diversity of possible applications for our type of scaling analysis, as well as an instance where the absence of AEU is surmountable. In oriented percolation on the 2D lattice, there are independent random traversal times on each oriented (up or right) edge. The percolation time $T_n$ is the minimum, over all $\binom{2n}{n}$ paths from (0, 0) to (n, n), of the time to traverse the path. So $(2n)^{-1}E[T_n] \to t^*$, a time constant. It is elementary that there will be near-optimal paths, with lengths $T'_n$ such that $n^{-1}(E[T'_n] - E[T_n]) \to 0$ and which are almost disjoint from the optimal path. So our ε(δ) analysis applied to paths will not be useful: even with a unique optimum, AEU will not hold. But we can rephrase the problem in terms of flows. A flow on the n × n oriented torus assigns to each edge a flow of size ∈ [0, 1], such that at each vertex, in-flow equals out-flow. Let t(δ) be the minimum, over all flows with mean flow-per-edge = δ, of the flow-weighted average edge traversal time. In the n → ∞ limit, one can show that as δ → 0, t(δ) → t* where t* is the same limiting constant as before. We therefore expect a scaling $t(\delta) - t^* \sim \delta^{a}$. Mean field analysis gives the scaling exponent a = 2, and Monte Carlo study of the d = 2 case is in progress.
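The percolation time over up/right paths can be computed by dynamic programming over the lattice, as in the sketch below. The array layout (`right[i][j]` for the edge from (i, j) to (i+1, j), `up[i][j]` for the edge from (i, j) to (i, j+1)) and the function name are conventions chosen here. The flow reformulation is a linear program and is not attempted in this sketch.

```python
import random


def percolation_time(n, right, up):
    """Minimum traversal time from (0, 0) to (n, n) over up/right paths.

    right[i][j]: time of edge (i, j) -> (i+1, j), 0 <= i < n, 0 <= j <= n.
    up[i][j]:    time of edge (i, j) -> (i, j+1), 0 <= i <= n, 0 <= j < n.
    """
    inf = float("inf")
    t = [[inf] * (n + 1) for _ in range(n + 1)]
    t[0][0] = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            if i > 0:
                t[i][j] = min(t[i][j], t[i - 1][j] + right[i - 1][j])
            if j > 0:
                t[i][j] = min(t[i][j], t[i][j - 1] + up[i][j - 1])
    return t[n][n]


n = 5
unit_right = [[1.0] * (n + 1) for _ in range(n)]
unit_up = [[1.0] * n for _ in range(n + 1)]
unit_time = percolation_time(n, unit_right, unit_up)   # every path has 2n edges

rng = random.Random(6)
rand_right = [[rng.random() for _ in range(n + 1)] for _ in range(n)]
rand_up = [[rng.random() for _ in range(n)] for _ in range(n + 1)]
rand_time = percolation_time(n, rand_right, rand_up)
# One particular path: n steps right along the bottom row, then n steps up.
corner_path = (sum(rand_right[i][0] for i in range(n))
               + sum(rand_up[n][j] for j in range(n)))
```

Averaging `percolation_time` over many random instances and dividing by 2n gives a finite-n estimate of the time constant t*.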

Conclusions

We have studied the scaling of the relative cost difference ε between optimal and near-optimal solutions to combinatorial optimization problems, as a function of the solution's relative distance δ from optimality. This kind of scaling study, although well accepted in theoretical physics, is new to combinatorial optimization. For the MST, we have found $\varepsilon \sim \delta^2$. For the MM and TSP, in the 1D Euclidean case $\varepsilon \sim \delta^2$ as well, while in both the mean-field model and higher Euclidean dimensions $\varepsilon \sim \delta^3$.

The scaling exponent may categorize combinatorial optimization problems into a small number of classes. The fact that MST is solvable by a simple greedy algorithm, and that the 1D case of the MM and TSP is essentially trivial, suggests that a scaling exponent of 2 characterizes problems of very low complexity. The exponent of 3 characterizes problems that are algorithmically more difficult. Of course, this is a different kind of classification from traditional notions of computational complexity: MM is solvable to optimality in $O(n^3)$ time whereas the TSP is in the NP-hard class. Rather, these exponent classes are reminiscent of universality classes in statistical physics, which unite diverse physical systems exhibiting identical behavior near phase transitions.

A key question in the study of critical phenomena is whether mean-field models correctly describe phase transition behavior in the geometric models they approximate. The TSP and MM do not involve critical behavior, but the fact that mean-field and geometric scaling exponents coincide for d ≥ 2 is significant. It provides evidence that in a combinatorial setting, the mean-field approach can give a valuable and accurate description of the structure of near-optimal solutions.

Acknowledgments

We thank Mike Steele for helpful discussions. D.A.'s research is supported by National Science Foundation Grant DMS-0203062. A.P. acknowledges support from Department of Energy Grant LDRD/ER 20030137 and the kind hospitality of the Institute for Pure and Applied Mathematics at the University of California (Los Angeles), where much of this work was conducted.

Abbreviations: TSP, traveling salesman problem; MST, minimum spanning tree; MM, minimum matching; PWIT, Poisson weighted infinite tree; AEU, asymptotic essential uniqueness; RSB, replica symmetry breaking.

References

  1. Dyer, M., Greenhill, C. & Molloy, M. (2002) Random Struct. Algorithms 20, 98–114.
  2. Evans, W., Kenyon, C., Peres, Y. & Schulman, L. J. (2000) Ann. Appl. Probability 10, 410–431.
  3. Coppersmith, D., Gamarnik, D., Hajiaghayi, M. T. & Sorkin, G. B. (2003) in Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, ed. Farach-Colton, M. (Association for Computing Machinery, New York), pp. 364–373.
  4. Mézard, M., Parisi, G. & Zecchina, R. (2002) Science 297, 812–815.
  5. Monasson, R., Zecchina, R., Kirkpatrick, S., Selman, B. & Troyansky, L. (1999) Nature 400, 133–137.
  6. Mézard, M. & Parisi, G. (1985) J. Phys. Lett. 46, L771–L778.
  7. Aldous, D. J. (2001) Random Struct. Algorithms 18, 381–418.
  8. Aldous, D. J. & Steele, J. M. (2004) in Probability on Discrete Structures, ed. Kesten, H. (Springer, Berlin), in press.
  9. Krauth, W. & Mézard, M. (1987) Europhys. Lett. 8, 213–218.
  10. Cerf, N. J., Boutet de Monvel, J., Bohigas, O., Martin, O. C. & Percus, A. G. (1997) J. Phys. I 7, 117–136.
  11. Ratiéville, M. (2002) Ph.D. thesis (Université Pierre et Marie Curie, Paris and Università degli Studi di Roma “La Sapienza,” Rome).
  12. Gutin, G. & Punnen, A. P., eds. (2002) The Traveling Salesman Problem and its Variations (Kluwer, Dordrecht, The Netherlands).
  13. Steele, J. M. (1997) Probability Theory and Combinatorial Optimization (Society for Industrial and Applied Mathematics, Philadelphia).
  14. Beardwood, J., Halton, J. H. & Hammersley, J. M. (1959) Proc. Cambridge Philos. Soc. 55, 299–327.
  15. Rhee, W. T. & Talagrand, M. (1989) Ann. Probability 17, 1–8.
  16. Moore, M. A. (1987) Phys. Rev. Lett. 58, 1703–1706.
  17. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. (1983) Science 220, 671–680.
  18. Mézard, M., Parisi, G. & Virasoro, M. A. (1987) Spin Glass Theory and Beyond (World Scientific, Singapore).
  19. Dotsenko, V. (2001) Introduction to the Replica Theory of Disordered Statistical Systems (Cambridge Univ. Press, Cambridge, U.K.).
  20. Palassini, M. & Young, A. P. (2000) Phys. Rev. Lett. 85, 3017–3020.
  21. Marinari, E. & Parisi, G. (2001) Phys. Rev. Lett. 86, 3887–3890.
  22. Marinari, E., Pagnani, A. & Ricci-Tersenghi, F. (2002) Phys. Rev. E 65, 041919.
  23. Mézard, M. & Parisi, G. (1986) J. Phys. 47, 1285–1296.
  24. Sourlas, N. (1986) Europhys. Lett. 2, 919–923.
  25. Krzakala, F. & Martin, O. C. (2000) Phys. Rev. Lett. 85, 3013–3016.
  26. Nair, C., Prabhakar, B. & Sharma, M. (2003) in Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (Institute of Electrical and Electronics Engineers, Los Alamitos, CA), in press.
  27. Linusson, S. & Wästlund, J. (2003) Technical report LiTH-MAT-R-2003-03 (Linköping University, Linköping, Sweden).

