Abstract
The 2-opt heuristic is a very simple local search heuristic for the traveling salesperson problem. In practice it usually converges quickly to solutions within a few percent of optimality. In contrast to this, its running-time is exponential and its approximation performance is poor in the worst case. Englert, Röglin, and Vöcking (Algorithmica, 2014) provided a smoothed analysis in the so-called one-step model in order to explain the performance of 2-opt on d-dimensional Euclidean instances, both in terms of running-time and in terms of approximation ratio. However, translating their results to the classical model of smoothed analysis, where points are perturbed by Gaussian distributions with standard deviation σ, yields only weak bounds. We prove bounds that are polynomial in n and 1/σ for the smoothed running-time with Gaussian perturbations. In addition, our analysis for Euclidean distances is much simpler than the existing smoothed analysis. Furthermore, we prove a smoothed approximation ratio of O(log(1/σ)). This bound is almost tight, as we also provide a lower bound of Ω(log n / log log n) for σ = O(1/√n). Our main technical novelty here is that, unlike existing smoothed analyses, we do not separately analyze objective values of the global and local optimum on all inputs (which only allows for a bound of O(1/σ)), but simultaneously bound them on the same input.
Keywords: Travelling salesperson problem, Local search, Smoothed analysis, Approximation ratio, 2-opt
2-Opt and Smoothed Analysis
The traveling salesperson problem (TSP) is one of the classical combinatorial optimization problems. Euclidean TSP is the following variant: given a point set X ⊆ R^d, find the shortest Hamiltonian cycle that visits all points in X (also called a tour). Even this restricted variant is NP-hard for d = 2 [26]. We consider Euclidean TSP with Manhattan and Euclidean distances as well as squared Euclidean distances to measure the distances between points. For the former two, there exist polynomial-time approximation schemes (PTAS) [2, 25]. The latter, which has applications in power assignment problems for wireless networks [15], admits a PTAS in low dimensions and is APX-hard in higher dimensions [30].
As it is unlikely that there are efficient algorithms for solving Euclidean TSP optimally, heuristics have been developed in order to find near-optimal solutions quickly. One very simple and popular heuristic is 2-opt: starting from an initial tour, we iteratively replace two edges by two other edges to obtain a shorter tour until we have found a local optimum. Experiments indicate that 2-opt converges to near-optimal solutions quite quickly [16, 17], but its worst-case performance is bad: the worst-case running-time is exponential even for d = 2 [12], and the worst-case approximation ratio on Euclidean instances is Ω(log n / log log n) [5, 7].
An alternative to worst-case analysis is average-case analysis, where the expected performance with respect to some probability distribution is measured. The average-case running-time for Euclidean and random metric instances and the average-case approximation ratio for non-metric instances of 2-opt have been analyzed [4, 7, 11, 19]. However, while worst-case analysis is often too pessimistic because it is dominated by artificial instances that are rarely encountered in practice, average-case analysis is dominated by random instances, which often have, with high probability, very special properties that they do not share with typical instances.
In order to overcome the drawbacks of both worst-case and average-case analysis and to explain the performance of the simplex method, Spielman and Teng invented smoothed analysis [28], a hybrid of worst-case and average-case analysis: an adversary specifies an instance, and then this instance is slightly randomly perturbed. The smoothed performance is the expected performance, where the expected value is taken over the random perturbation. The underlying assumption is that real-world instances are often subjected to a small amount of random noise. This noise can stem from measurement or rounding errors, or it might be a realistic assumption that the instances are influenced by unknown circumstances, but we do not have any reason to believe that these are adversarial. Smoothed analysis often allows more realistic conclusions about the performance than worst-case or average-case analysis. Since its invention, it has been applied successfully to explain the performance of a variety of algorithms. We refer to two surveys for an overview of smoothed analysis in general [22, 29] and a more recent survey about smoothed analysis applied to local search algorithms [21].
Related Results
Running-time. Englert, Röglin, and Vöcking [12] provided a smoothed analysis of 2-opt in order to explain its performance. They used the one-step model: an adversary specifies n probability density functions f_1, …, f_n : [0, 1]^d → [0, φ]. Then the n points x_1, …, x_n are drawn independently according to the densities f_1, …, f_n, respectively. Here, φ ≥ 1 is the perturbation parameter. If φ = 1, then the only possibility is the uniform distribution on [0, 1]^d, and we obtain an average-case analysis. The larger φ, the more powerful the adversary. Englert et al. [12] proved that the expected number of iterations of 2-opt is bounded by polynomials in n and φ for Manhattan and for Euclidean distances. These bounds can be improved slightly by choosing the initial tour with an insertion heuristic. However, if we transfer these bounds to the classical model of points perturbed by Gaussian distributions of standard deviation σ, we obtain bounds that are polynomial in n and σ^(−d) [12, Section 6], since the maximum density of a d-dimensional Gaussian with standard deviation σ is of order σ^(−d). While this is polynomial for any fixed d, it is unsatisfactory that the degree of the polynomial depends on d.
Approximation ratio.
Much less is known about the smoothed approximation performance of algorithms. Karger and Onak have shown that multi-dimensional bin packing can be approximated arbitrarily well for smoothed instances [18] and there are frameworks to approximate Euclidean optimization problems such as TSP for smoothed instances [3, 8]. However, these approaches mostly consider algorithms tailored to solving smoothed instances.
With respect to concrete algorithms other than 2-opt, we are only aware of analyses of the jump and lex-jump heuristics for scheduling [6, 13].
Englert et al. [12] proved a bound of O(φ^(1/d)) for the smoothed approximation ratio in the one-step model. Translated to Gaussians, this yields a bound of O(1/σ) if we truncate the Gaussians such that all points lie in a hypercube of constant side length. This result, however, does not explain the approximation performance of 2-opt, as the bound is still quite large, even for relatively large values of σ.
Our Contribution
In order to improve our understanding of the practical performance of 2-opt, we provide an improved smoothed analysis of both its running-time and its approximation ratio. To do this, we use the classical smoothed analysis model: an adversary chooses n points from the d-dimensional unit hypercube [0, 1]^d, and then these points are independently perturbed by Gaussian random variables of standard deviation σ.
Running-time The bounds that we prove are polynomial in n and 1/σ. In contrast to earlier results, the degree of the polynomial is independent of d. As distance measures, we consider Manhattan (Sect. 3.3), Euclidean (Sect. 3.5), and squared Euclidean distances (Sect. 3.4).
The analysis for Manhattan distances is essentially an adaptation of the existing analysis by Englert et al. [12, Section 4.1]. Note that our bound does not have any factor that is exponential in d.
Our analysis for Euclidean distances is considerably simpler than the one by Englert et al., which is rather technical and takes more than 25 pages [12, Section 4.2 and Appendix C].
The analysis for squared Euclidean distances is, to our knowledge, not preceded by a smoothed analysis in the one-step model. Because of the nice properties of squared Euclidean distances and Gaussian perturbations, this smoothed analysis is relatively compact and elegant.
Table 1 summarizes our bounds for the number of iterations.
Table 1.
Our bounds compared to the bounds obtained by Englert et al. [12] for the one-step model (columns: Manhattan, Euclidean, and squared Euclidean distances; rows: the bounds of Englert et al. [12], our general bounds, our bounds for small and for large σ, and remarks; the formula entries are omitted here)

The bounds can roughly be transferred to Gaussian noise by replacing φ with σ^(−d). For convenience, we added our bounds for small and for large values of σ. The notation O_d(·) means that factors depending on d are hidden in the O. The remarks apply only to our bounds; in particular, the improved bounds via linked 2-changes for squared Euclidean distances hold only for d ≥ 3, and a weaker bound holds for d = 2.
Recently, building on our analysis, Manthey and van Rhijn [23] have improved the running-time bounds for Euclidean distances, reducing our upper bound further at the cost of a significantly more complex proof. Furthermore, while their analysis remains restricted to Euclidean distances, we cover Manhattan and squared Euclidean distances as well. In particular, our analysis for squared Euclidean distances is very compact, in contrast to most applications of smoothed analysis, which makes this case particularly appealing, not least for teaching courses on this subject.
Approximation ratio Like the earlier smoothed analysis by Englert et al. [12], we provide bounds on the quality of the worst local optimum. While this measure is rather unrealistic and pessimistic, it decouples the analysis from the seeding of the heuristic. Taking the seeding into account would probably complicate the analysis severely.
Our bound of O(log(1/σ)) improves significantly upon the direct translation of the bound of Englert et al. [12] to Gaussian perturbations (see Sect. 4.2 for how to translate the bound to Gaussian perturbations without truncation). It smoothly interpolates between the constant average-case approximation ratio and the worst-case bound of O(log n).
In order to obtain our improved bound for the smoothed approximation ratio, we take into account the origins of the points, i.e., their unperturbed positions. Although this information is not available to the algorithm, it can be exploited in the analysis. The smoothed analyses of approximation ratios so far [3, 6, 8, 12, 13, 18] essentially ignored this information. While this simplifies the analysis, being oblivious to the unperturbed positions seems to be too pessimistic. In fact, we see that the bound of Englert et al. [12] cannot be improved beyond Ω(1/σ) by ignoring the positions of the points (Sect. 4.2). The reason for this limitation is that the lower bound for the global optimum is attained if all points have the same origin, which corresponds to an average-case rather than a smoothed analysis. On the other hand, the upper bound for the local optimum has to hold for all choices of the unperturbed points, most of which yield higher costs for the global optimum than the average-case analysis. Taking this into account carefully yields our bound of O(log(1/σ)) (Sect. 4.3).
To complement our upper bound, we show that the lower bound of Ω(log n / log log n) by Chandra et al. [7] remains true for σ = O(1/√n) (Sect. 4.4). This implies that a smoothed bound of o(log(1/σ) / log log(1/σ)) is impossible, and, thus, our bound cannot be improved significantly.
2-Opt and Smoothing Model
2-Opt Heuristic for the TSP
Let X ⊆ R^d be a set of n points. The goal of the TSP is to find a Hamiltonian cycle (also called a tour) T through X that has minimum length according to some distance measure. In this paper, we consider standard Euclidean distances for both approximation ratio and running-time as well as squared Euclidean distances and Manhattan distances for the running-time.
Given a tour T, a 2-change replaces two edges {x_1, x_2} and {x_3, x_4} of T by the two new edges {x_1, x_3} and {x_2, x_4}, provided that this yields again a tour (this is the case if x_1, x_2, x_3, x_4 appear in this order in the tour) and that this decreases the length of the tour, i.e., d(x_1, x_2) + d(x_3, x_4) > d(x_1, x_3) + d(x_2, x_4), where d(x, y) = ‖x − y‖ (Euclidean distances), d(x, y) = ‖x − y‖_1 (Manhattan distances), or d(x, y) = ‖x − y‖² (squared Euclidean distances). The 2-opt heuristic iteratively improves an initial tour by applying 2-changes until it reaches a local optimum. A local optimum is called a 2-optimal tour.
Smoothing Model
Throughout the rest of this paper, let Y = {y_1, …, y_n} ⊆ [0, 1]^d be a set of n points from the unit hypercube. In the smoothed analysis, these points are chosen by an adversary, and they serve as unperturbed origins. Let g_1, …, g_n be n independent d-dimensional Gaussian random variables with mean 0 and standard deviation σ. By slight abuse of notation, standard deviation σ refers here to the multivariate normal distribution with covariance matrix σ²·I_d. We obtain the perturbed point set X = {x_1, …, x_n} by setting x_i = y_i + g_i for each i ∈ {1, …, n}; this makes explicit from which point set Y the points in X are obtained.
We assume that σ ≤ 1 throughout the paper. This is justified for two reasons. First, small σ is the interesting case, i.e., when the order of magnitude of the perturbation is relatively small. Second, smoothed performance guarantees are monotonically decreasing in σ: if we have σ > 1, then rescaling by 1/σ shows that this is equivalent to adversarial instances in [0, 1/σ]^d that are perturbed with standard deviation 1. This in turn is dominated by adversarial instances in [0, 1]^d that are perturbed with standard deviation 1, as [0, 1/σ]^d ⊆ [0, 1]^d. Thus, any upper bound for σ = 1 (be it for the number of iterations or the approximation ratio) holds also for larger σ.
Let us make a final remark about the smoothing model: while the algorithm itself, the 2-opt heuristic in our case, only sees X and does not know anything about the origins y_1, …, y_n, we can of course exploit the positions of the unperturbed points in the analysis.
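The smoothing model above can be sketched in a few lines (an illustrative sketch; `smoothed_instance` is our own helper name, not notation from the paper):

```python
import random

def smoothed_instance(origins, sigma, rng=random):
    """Perturb each origin y_i in [0,1]^d by adding independent Gaussian
    noise with mean 0 and standard deviation sigma to every coordinate."""
    return [tuple(y_ij + rng.gauss(0.0, sigma) for y_ij in y_i)
            for y_i in origins]

# the adversary picks the origins inside the unit hypercube [0,1]^d
rng = random.Random(0)
d, n, sigma = 2, 5, 0.1
origins = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
X = smoothed_instance(origins, sigma, rng)
```

Note that the algorithm is later run on X only; the origins are used exclusively in the analysis.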
Smoothed Analysis of the Running-Time
In this section, we make the dependence on all parameters (the number n of points, the dimension d, and the perturbation parameter σ) explicit. This means that the O-notation does not hide any factors, not even factors depending on d, which is often considered a constant and therefore ignored. (This is in contrast to our analysis of the approximation ratio, where the hidden constants can indeed depend on d.)
Probability Theory for the Running-Time
In order to get an upper bound for the length of the initial tour, we need an upper bound for the diameter of the point set X. Such an upper bound is also necessary for the analysis of 2-changes with Euclidean distances (Sect. 3.5). We choose D such that X ⊆ [−D, D]^d with a probability of at least 1 − 1/n!. For fixed d and σ, we can choose D according to the following lemma. For σ ≤ 1, we have D = O(√(n log n)).
Lemma 3.1
Let c be a sufficiently large constant, and let D = c·(1 + σ·√(n log n)). Then P(X ⊄ [−D, D]^d) ≤ 1/n!.
Proof
We have X ⊄ [−D, D]^d only if there is a point x_i ∈ X that has a coordinate perturbed by more than D − 1. According to Durrett [10, Theorem 1.2.3], the probability that a 1-dimensional Gaussian of standard deviation σ is more than t away from its mean is bounded from above by (σ/(t·√(2π)))·exp(−t²/(2σ²)). Thus, the probability that X ⊄ [−D, D]^d is bounded from above by n·d·(σ/((D − 1)·√(2π)))·exp(−(D − 1)²/(2σ²)). For sufficiently large c, this is at most 1/n!. □
Note that the constant c in Lemma 3.1 does not depend on the dimension d.
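The Gaussian tail bound quoted from Durrett in the proof above can be checked numerically against the exact tail (a sketch; `gauss_tail` computes the exact one-sided tail via the complementary error function, and `tail_bound` is the bound used in the proof):

```python
from math import erfc, exp, pi, sqrt

def gauss_tail(t, sigma):
    """Exact tail P(N(0, sigma^2) > t)."""
    return 0.5 * erfc(t / (sigma * sqrt(2)))

def tail_bound(t, sigma):
    """Upper bound (sigma / (t sqrt(2 pi))) * exp(-t^2 / (2 sigma^2))."""
    return sigma / (t * sqrt(2 * pi)) * exp(-t * t / (2 * sigma * sigma))

# the bound dominates the exact tail for every t > 0
for sigma in (0.1, 1.0):
    for t in (0.5, 1.0, 3.0, 8.0):
        assert gauss_tail(t, sigma) <= tail_bound(t, sigma)
```

The bound becomes tight as t/σ grows, which is exactly the regime used in Lemma 3.1.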
The following lemma is well known and follows from the fact that the density of a d-dimensional Gaussian with standard deviation σ is bounded from above by (2πσ²)^(−d/2) and the volume of a d-dimensional ball of radius r is π^(d/2)·r^d / Γ(d/2 + 1).
Lemma 3.2
Let a ∈ R^d be drawn according to a d-dimensional Gaussian distribution of standard deviation σ, and let B be a d-dimensional hyperball of radius r centered at an arbitrary point z ∈ R^d. Then P(a ∈ B) ≤ (r/σ)^d.
For x, y ∈ R^d with x ≠ y, let L(x, y) denote the straight line through x and y.
Lemma 3.3
Let a, b ∈ R^d be arbitrary with a ≠ b. Let c ∈ R^d be drawn according to a d-dimensional Gaussian distribution with standard deviation σ. Then the probability that c is δ-close to L(a, b), i.e., dist(c, L(a, b)) ≤ δ, is bounded from above by (δ/σ)^(d−1).
Proof
We divide drawing c into drawing a 1-dimensional Gaussian c_∥ in the direction of b − a and drawing a (d − 1)-dimensional Gaussian c_⊥ in the hyperplane orthogonal to b − a and containing a. Then the distance of c to L(a, b) equals the distance of c_⊥ to a within this hyperplane. For every c_∥, the point c is δ-close to L(a, b) only if c_⊥ falls into a (d − 1)-dimensional hyperball of radius δ around a in the (d − 1)-dimensional subspace orthogonal to b − a. Now the lemma follows by applying Lemma 3.2. □
We need the following lemma in Sect. 3.5.
Lemma 3.4
Let f : R → R be a differentiable function, and let B be an upper bound for the absolute value of the derivative of f. Let c ∈ R be distributed according to a Gaussian distribution with standard deviation σ. Let I be an interval of size ε, and let f(I) be the image of I. Then P(c ∈ f(I)) ≤ B·ε/(σ·√(2π)).
Proof
Since the derivative of f is bounded by B, the set f(I) is contained in an interval of length B·ε. The lemma follows since the density of c is bounded from above by 1/(σ·√(2π)). □
The chi distribution [14, Section 8] is the distribution of the Euclidean length of a d-dimensional Gaussian random vector of standard deviation σ and mean 0. In the following, we denote its density function by f_chi. It is given by

f_chi(x) = x^(d−1) · exp(−x²/(2σ²)) / (2^(d/2 − 1) · σ^d · Γ(d/2)) for x ≥ 0, (1)

where Γ denotes the gamma function. We need the following lemma several times.
Lemma 3.5
Assume that c ≥ 1 is a fixed constant and d is arbitrary with d > c. Then we have

∫_0^∞ x^(−c) · f_chi(x) dx = Γ((d − c)/2) / (2^(c/2) · σ^c · Γ(d/2)) = O((σ·√d)^(−c)).
Proof
The first equality follows by integration, substituting u = x²/(2σ²). For the second equality, we observe that c is a fixed constant (which also never depends on d when we apply the lemma) and that

Γ(x) = √(2π/x) · (x/e)^x · e^(μ(x))

for some function μ with μ(x) → 0 as x → ∞ according to Stirling's formula [1, 6.1.37]. Applying this to both gamma functions yields

Γ((d − c)/2) / Γ(d/2) = Θ( ((d − c)/2)^((d−c)/2) / (d/2)^(d/2) · e^(c/2) ) = Θ( (d/2)^(−c/2) · (1 − c/d)^((d−c)/2) · e^(c/2) ) = Θ(d^(−c/2)),

since (1 − c/d)^((d−c)/2) = Θ(e^(−c/2)) for fixed c. Dividing by 2^(c/2)·σ^c shows that the whole integral is O((σ·√d)^(−c)). □
The analysis with Euclidean and squared Euclidean distances depends on the distribution of the distance between two points perturbed by Gaussians, where a larger distance between the two points is better for the analysis. The following two lemmas show that, given that larger distance is better, we can replace the distribution of the distance by the corresponding chi distribution. Since we do not know the original positions of the points involved, this allows us to replace unknown distributions by the chi distribution.
Lemma 3.6
Assume that a is drawn according to a d-dimensional Gaussian distribution with standard deviation σ and mean 0. Assume that b is drawn according to a d-dimensional Gaussian distribution with standard deviation σ and arbitrary mean μ ∈ R^d. Then ‖b‖ stochastically dominates ‖a‖, i.e., P(‖b‖ ≥ t) ≥ P(‖a‖ ≥ t) for all t ≥ 0.
Proof
For d = 1 and mean μ ≥ 0 (the case μ ≤ 0 is symmetric), we have the following for every t ≥ 0:

P(|b| ≥ t) = P(b ≥ t) + P(b ≤ −t) ≥ P(a ≥ t) + P(a ≤ −t) = P(|a| ≥ t),

where the inequality follows from the symmetry and unimodality of the Gaussian density: shifting the mean from 0 to μ moves at least as much probability mass above t as it moves above −t from below.

Now we prove the lemma for larger d. Since Gaussian distributions are rotationally symmetric, we can assume that μ = (m, 0, …, 0) for some m ≥ 0.

We observe that ‖b‖ dominates ‖a‖ if and only if ‖b‖² dominates ‖a‖². Write a = (a_1, a′) and b = (b_1, b′) with a′, b′ ∈ R^(d−1). It suffices to prove the lemma conditioned on b′ = a′, as b′ follows the same distribution as a′. Fixing b′ = a′ fixes also ‖b′‖² = ‖a′‖². Then ‖b‖² = b_1² + ‖a′‖² dominates ‖a‖² = a_1² + ‖a′‖² if b_1² dominates a_1². This is true because the lemma holds for d = 1. □
Lemma 3.7
Let b be as in Lemma 3.6, and let h : [0, ∞) → [0, ∞) be a monotonically decreasing function. Let g be the density function of ‖b‖. Then

∫_0^∞ h(x)·g(x) dx ≤ ∫_0^∞ h(x)·f_chi(x) dx,

provided that both integrals exist.
Proof
Let a denote a d-dimensional Gaussian random variable of standard deviation σ and mean 0. Then ‖a‖ has density f_chi. By Lemma 3.6, ‖a‖ is dominated by ‖b‖. This implies that h(‖a‖) dominates h(‖b‖) since h is monotonically decreasing. The lemma follows by observing that the two integrals are the expected values of h(‖b‖) and h(‖a‖), respectively. □
For Euclidean and squared Euclidean distances, it turns out to be useful to study η(a, b, z) = d(z, a) − d(z, b) for points a, b, z ∈ R^d. By abusing notation, we sometimes write η(a, b) instead of η(a, b, z) for short. A 2-change that replaces the edges {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4} improves the tour length by η(x_2, x_3, x_1) + η(x_3, x_2, x_4).
2-Opt State Graph and Linked 2-Changes
The number of iterations that 2-opt needs depends of course heavily on the initial tour and on which 2-change is chosen in each iteration. We do not make any assumptions about the initial tour or about which 2-change is chosen. Following Englert et al. [12], we consider the 2-opt state graph: we have a node for every tour and a directed edge from tour T to tour T′ if T′ can be obtained from T by one 2-change. The 2-opt state graph is a directed acyclic graph, and the length of the longest path in the 2-opt state graph is an upper bound for the number of successful iterations that 2-opt needs.
In order to improve the bounds, we also consider pairs of linked 2-changes [12]. Two 2-changes form a pair of linked 2-changes if there is one edge that is added in one 2-change and removed in the other 2-change. Formally, one 2-change replaces {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4}, and the other 2-change replaces {x_1, x_3} and {x_5, x_6} by two other edges. The edge {x_1, x_3} is the one that appears and disappears again (or the other way round). It can happen that {x_5, x_6} and {x_2, x_4} intersect. Englert et al. [12] called a pair of linked 2-changes a type i pair if |{x_5, x_6} ∩ {x_2, x_4}| = i. As type 2 pairs, which involve only four nodes, are difficult to analyze because of dependencies, we ignore them. Fortunately, the following lemma states that we will find enough disjoint pairs of linked 2-changes of type 0 and 1 in any sufficiently long sequence of 2-changes.
Lemma 3.8
(Englert et al. [12, Lemma 9 of corrected version]) Every sequence of t consecutive 2-changes contains at least c_1·t − c_2·n² disjoint pairs of linked 2-changes of type 0 or type 1, for suitable constants c_1, c_2 > 0.
Following Englert et al. [12, Figure 8], we subdivide type 1 pairs into type 1a and type 1b depending on how {x_5, x_6} and {x_2, x_4} intersect. One of the 2-changes replaces {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4}. Then the other 2-change, i.e., the one that removes the edge {x_1, x_3} shared by the linked pair, determines its type:

Type 0: {x_1, x_3} and {x_5, x_6} are replaced by {x_1, x_5} and {x_3, x_6}.
Type 1a: {x_1, x_3} and {x_4, x_5} are replaced by {x_1, x_4} and {x_3, x_5}.
Type 1b: {x_1, x_3} and {x_4, x_5} are replaced by {x_1, x_5} and {x_3, x_4}.
The main idea in the proofs by Englert et al. [12] and also in our proofs is to bound the minimal improvement of any 2-change or the minimal improvement of any pair of linked 2-changes. We denote the smallest improvement of any 2-change by Δ_min and the smallest improvement of any pair of linked 2-changes of type 0, 1a, or 1b by Δ_link. It will be clear from the context which distance measure is used for Δ_min and Δ_link.
Suppose that the initial tour has a length of at most L. Then 2-opt cannot run for more than L/Δ_min iterations and, because of Lemma 3.8, not for more than O(L/Δ_link) iterations, provided that L/Δ_link = Ω(n²).

The following lemma formalizes this and shows how to bound the expected number of iterations using a tail bound for Δ_min or Δ_link.
Lemma 3.9
Suppose that, with a probability of at least 1 − 1/n!, any tour has a length of at most L. Let α > 0. Then:

1. If P(Δ_min ≤ ε) ≤ α·ε for all ε > 0, then the expected length of the longest path in the 2-opt state graph is bounded from above by O(α·L·n·log n).
2. If P(Δ_min ≤ ε) ≤ α·ε² for all ε > 0, then the expected length of the longest path in the 2-opt state graph is bounded from above by O(L·√α).
3. The same bounds as in (1) and (2) hold if we replace Δ_min by Δ_link, provided that α·L·n·log n = Ω(n²) in Case 1 and L·√α = Ω(n²) in Case 2.
Proof
If the length of the longest tour is longer than L, then we use the trivial upper bound of n! for the number of iterations. Since this happens with a probability of at most 1/n!, it contributes only O(1) to the expected value, which, by slight abuse of mathematical correctness, we ignore in the following.

Consider the first statement. Let T be the length of the longest path in the 2-opt state graph. If T ≥ t, then Δ_min ≤ L/t. Plugging this in and observing that n! is an upper bound for T yields

E(T) = Σ_{t=1}^{n!} P(T ≥ t) ≤ Σ_{t=1}^{n!} min{1, α·L/t} = O(α·L·log(n!)) = O(α·L·n·log n).

Now consider the second statement, and let T be as above. Let t_0 = L·√α. Then

E(T) ≤ t_0 + Σ_{t > t_0} P(T ≥ t) ≤ t_0 + Σ_{t > t_0} α·L²/t² = O(t_0 + α·L²/t_0) = O(L·√α).

Finally, we consider the third statement. The statement follows from the observation that the maximal number of disjoint pairs of linked 2-changes and the length of the longest path in the 2-opt state graph are asymptotically equal if they are of length at least Ω(n²) (Lemma 3.8), and the probability statements become nontrivial only for α·L·n·log n = Ω(n²) in the first and L·√α = Ω(n²) in the second case. □
Manhattan Distances
The essence of our analysis for Manhattan distances is a straightforward adaptation of the analysis in the one-step model. The extra factors compared to the one-step model stem from the bound on the length of the initial tour and from stating the dependence on d explicitly while removing any exponential dependence on d [12, Proofs of Theorem 7 and Lemma 10].
Lemma 3.10
For all ε > 0, we have P(Δ_link ≤ ε) = O(n⁶·d²·ε²/σ²).
Proof
We consider a pair of linked 2-changes as described in Sect. 3.2. The improvement of the first 2-change is

Δ_1 = Σ_{i=1}^d (|x_{1,i} − x_{2,i}| + |x_{3,i} − x_{4,i}| − |x_{1,i} − x_{3,i}| − |x_{2,i} − x_{4,i}|),

where x_{j,i} is the i-th coordinate of x_j. The improvement of the second 2-change is

Δ_2 = Σ_{i=1}^d (|x_{1,i} − x_{3,i}| + |x_{5,i} − x_{6,i}| − |x_{1,i} − x_{5,i}| − |x_{3,i} − x_{6,i}|).

Note that we can have a type 1 pair, i.e., two of the points x_2, x_4, x_5, x_6 can be identical.

Each ordering of the coordinate values x_{1,i}, …, x_{6,i} gives rise to a linear combination for Δ_1 and Δ_2. If we examine the case analysis by Englert et al. [12, Lemmas 11, 12, 13] closely, we see that any pair of linear combinations is either impossible (it uses a different ordering of the variables for Δ_1 and Δ_2, or one of Δ_1 and Δ_2 is non-positive, so the corresponding 2-change is in fact not a 2-change), or there is one variable x_{j,i} that has a non-zero coefficient in Δ_1 and a coefficient of 0 in Δ_2 and another variable x_{j′,i′} that has a non-zero coefficient in Δ_2 and a coefficient of 0 in Δ_1. The absolute values of these non-zero coefficients are 2. Now Δ_1 falls into (0, ε] only if x_{j,i} falls into an interval of length ε/2. This happens with a probability of at most ε/(2σ√(2π)). By independence, the same holds for Δ_2 and x_{j′,i′}.

However, a union bound over all orderings of the coordinate values would incur an extra factor that is exponential in d, and we would like to remove all exponential dependence on d. In order to do this, we assume that we know i and i′ already. This comes at the expense of a factor of d² for taking a union bound over the choices of i and i′. We let an adversary fix the values of all coordinates x_{j,ℓ} with ℓ ∉ {i, i′}. Since we know i and i′, we are left with only a constant number of possible linear combinations.

Finally, the lemma follows by taking a union bound over the O(n⁶) possible pairs of linked 2-changes and the d² choices of i and i′. □
Theorem 3.11
The expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with Manhattan distances is at most O(n⁴·d²·D/σ), where D is chosen as in Lemma 3.1.
Proof
The initial tour has a length of at most L = 2ndD with a probability of at least 1 − 1/n! by Lemma 3.1. We apply Lemma 3.9 for linked 2-changes (Case 2) using Lemma 3.10 and α = O(n⁶·d²/σ²), which yields a bound of O(L·√α) = O(n⁴·d²·D/σ). □
Squared Euclidean Distances
Preparation
In this section, we have d(x, y) = ‖x − y‖² for x, y ∈ R^d.
Assume that we have a 2-change that replaces the edges {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4}. The improvement caused by this 2-change is ‖x_1 − x_2‖² + ‖x_3 − x_4‖² − ‖x_1 − x_3‖² − ‖x_2 − x_4‖². Given the positions of all four nodes except for a single one, such a 2-change yields a small improvement only if a certain difference of squared distances involving the remaining point falls into some interval of small size. The following lemma gives an upper bound for the probability that this happens.
Lemma 3.12
Let a, b ∈ R^d with a ≠ b, let ε > 0, and let c be drawn according to a d-dimensional Gaussian distribution with standard deviation σ. Let I ⊆ R be an interval of length ε. Then

P(‖c − a‖² − ‖c − b‖² ∈ I) ≤ ε / (2·‖a − b‖·σ·√(2π)).
Proof
Since Gaussian distributions are rotationally symmetric, we can assume without loss of generality that a = 0 and b = (δ, 0, …, 0) with δ = ‖a − b‖ > 0. Let c = (c_1, …, c_d). Then ‖c − a‖² − ‖c − b‖² = 2δ·c_1 − δ². Thus, ‖c − a‖² − ‖c − b‖² ∈ I if and only if c_1 falls into an interval of length ε/(2δ). Since c_1 is a 1-dimensional Gaussian random variable with a standard deviation of σ, the probability for this is bounded from above by ε/(2δσ√(2π)), since the maximum density of a 1-dimensional Gaussian of standard deviation σ is bounded from above by 1/(σ√(2π)). □
Single 2-Changes
In this section, we prove a simple bound for the expected number of iterations of 2-opt with squared Euclidean distances. This bound holds for all d ≥ 2. In the next section, we improve this bound for the case d ≥ 3 using pairs of linked 2-changes.
Lemma 3.13
For d ≥ 2 and all ε > 0, we have P(Δ_min ≤ ε) = O(n⁴·ε/(σ²·√d)).
Proof
Consider a 2-change where the edges {x_1, x_2} and {x_3, x_4} are replaced by {x_1, x_3} and {x_2, x_4}. Its improvement is given by (‖x_1 − x_2‖² − ‖x_1 − x_3‖²) + (‖x_3 − x_4‖² − ‖x_2 − x_4‖²). We let an adversary fix x_1. Then we draw x_2 and x_3. This fixes the distance ‖x_2 − x_3‖ as well as the first term ‖x_1 − x_2‖² − ‖x_1 − x_3‖², and hence the interval into which the second term must fall. The 2-change yields an improvement of at most ε only if ‖x_3 − x_4‖² − ‖x_2 − x_4‖² falls into an interval of size at most ε. According to Lemma 3.12, the probability that this happens when we draw x_4 is at most min{1, ε/(2‖x_2 − x_3‖σ√(2π))}.

Now let g be the probability density of ‖x_2 − x_3‖. Then the probability that the 2-change yields an improvement of at most ε is bounded from above by

∫_0^∞ g(r)·min{1, ε/(2rσ√(2π))} dr ≤ (ε/(2σ√(2π)))·∫_0^∞ f_chi(r)/r dr = O(ε/(σ²·√d)).

The first step is due to Lemma 3.7. The second step is due to Lemma 3.5 using c = 1, which is allowed since d ≥ 2. The lemma follows by a union bound over the O(n⁴) possible 2-changes. □
Theorem 3.14
For all d ≥ 2, the expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with squared Euclidean distances is at most O(n⁶·√d·D²·log n/σ²), where D is chosen as in Lemma 3.1.
Proof
With a probability of at least 1 − 1/n!, the instance is contained in the hypercube [−D, D]^d. In this case, the longest edge has a squared Euclidean length of at most 4dD². Therefore, the initial tour has a length of at most L = 4ndD². We combine this with Lemmas 3.9 (Case 1 with α = O(n⁴/(σ²√d))) and 3.13 to complete the proof. □
Pairs of Linked 2-Changes
We can obtain a better bound than in the previous section by analyzing pairs of linked 2-changes. With the following three lemmas, we analyze the probability that pairs of linked 2-changes of type 0, 1a, or 1b yield an improvement of at most ε.
Lemma 3.15
For d ≥ 2, the probability that there exists a pair of linked 2-changes of type 0 that yields an improvement of at most ε is bounded from above by O(n⁶·ε²/(σ⁴·d)).
Proof
Consider a fixed pair of linked 2-changes of type 0 involving the six points x_1, …, x_6 as described in Sect. 3.2. We show that the probability that it yields an improvement of at most ε is at most O(ε²/(σ⁴·d)). A union bound over the O(n⁶) possibilities for pairs of type 0 then yields the lemma.

The basic idea is that we restrict ourselves to analyzing the positions of x_4 and x_6 only in order to bound the probability that we have a small improvement. In this way, we use the principle of deferred decisions to show that we can analyze the improvements of the two 2-changes as if they were independent:

- We let an adversary fix x_1 arbitrarily.
- We draw x_2 and x_3, which determines the distance ‖x_2 − x_3‖. This also fixes the position of the "bad" interval for x_4: its size is already fixed since we know the positions of x_2 and x_3, while the position of x_4 is still random.
- We draw x_4. The probability that x_4 assumes a position such that the first 2-change yields an improvement of at most ε is thus at most min{1, ε/(2‖x_2 − x_3‖σ√(2π))} by Lemma 3.12.
- We draw x_5. This determines the distance ‖x_3 − x_5‖.
- We draw x_6. The probability that x_6 assumes a position such that the second 2-change yields an improvement of at most ε is thus at most min{1, ε/(2‖x_3 − x_5‖σ√(2π))}.

Let g be the probability density function of the distance ‖x_2 − x_3‖, and let g′ be the probability density function of the distance ‖x_3 − x_5‖. Since x_4 and x_6 are drawn independently, the probability that both 2-changes of the pair yield an improvement of at most ε is bounded from above by

∫∫ g(r)·g′(r′)·min{1, ε/(2rσ√(2π))}·min{1, ε/(2r′σ√(2π))} dr dr′.

We observe that min{1, ε/(2rσ√(2π))} is monotonically decreasing in r. Thus, by Lemma 3.7, we can replace g and g′ by the density f_chi of the chi distribution to get the following upper bound for the probability that a pair of type 0 yields an improvement of at most ε:

(∫_0^∞ f_chi(r)·min{1, ε/(2rσ√(2π))} dr)² ≤ ((ε/(2σ√(2π)))·∫_0^∞ f_chi(r)/r dr)² = O(ε²/(σ⁴·d)).

Here, we use Lemma 3.5 with c = 1, which is allowed since d ≥ 2. □
Lemma 3.16
For d ≥ 2, the probability that there exists a pair of linked 2-changes of type 1a that yields an improvement of at most ε is bounded from above by O(n⁵·ε²/(σ⁴·d)).
Proof
We can analyze pairs of type 1a in the same way as type 0 pairs in Lemma 3.15. To do this, we analyze the positions of x_2 and x_5:

- We let an adversary fix the position of x_3.
- We draw x_4. This fixes the distance ‖x_3 − x_4‖.
- We draw x_1. This fixes the distance ‖x_1 − x_4‖. In addition, this fixes the positions of the intervals into which x_2 and x_5 must fall if the first or second 2-change yields an improvement of at most ε.
- We draw x_2.
- We draw x_5.

The remainder of the proof is identical to the proof of Lemma 3.15, except that we have to take a union bound over only O(n⁵) possible choices. □
Lemma 3.17
For d ≥ 3, the probability that there exists a pair of linked 2-changes of type 1b that yields an improvement of at most ε is bounded from above by O(n⁵·ε²/(σ⁴·d)).
Proof
Again, we proceed similarly to Lemma 3.15. We analyze a fixed pair of type 1b, where
and
are replaced by
and
in one step and
and
are replaced by
and
, and apply a union bound over the
possible type 1a pairs. We analyze the probability that
or
assume a bad value.
We draw the points in the following order:
We fix
.We draw
. This fixes the distance
, which is crucial for both 2-changes.We draw
.We draw
. The probability that the first 2-change yields an improvement of at most
is at most
.We draw
. The probability that the second 2-change yields an improvement of at most
is at most
.
The main difference to Lemma 3.15 is that the sizes of the bad intervals are not independent. However, once the size of the bad intervals is fixed, we can analyze the probabilities that
or
fall into their bad intervals as independent. Given that
is fixed, the probability that the first and the second 2-change yield an improvement of at most
is bounded from above by
. Since this is decreasing in
, we can replace the distribution of
by the chi distribution to obtain an upper bound according to Lemma 3.7. Thus, using Lemma 3.5 with
and
, we obtain the following upper bound for the probability that a pair of type 1b yields an improvement of at most
:
[Equation]
With the three lemmas above, we can obtain a bound on the expected number of iterations of 2-opt for TSP with squared Euclidean distances.
Theorem 3.18
For
, the expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with squared Euclidean distances is at most
.
Proof
The probability that any pair of linked 2-changes of type 0, 1a, or 1b yields an improvement of at most
is bounded from above by
. We apply Lemma 3.9 with
and observe that the initial tour has a length of at most
with a probability of at least
. 
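To make the object of study concrete, the following sketch (our own illustration, not code from the paper) runs the plain 2-opt heuristic under squared Euclidean distances on Gaussian-perturbed points and counts the number of improving 2-changes until a local optimum is reached; all names and the improvement threshold are ours.

```python
import random

def sq_dist(p, q):
    # squared Euclidean distance between two points
    return sum((a - b) ** 2 for a, b in zip(p, q))

def two_opt(points, dist):
    """Run 2-opt to a local optimum; return the final tour and the
    number of improving 2-changes performed."""
    n = len(points)
    tour = list(range(n))
    steps = 0
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            # avoid pairing the edge (tour[0], tour[1]) with (tour[n-1], tour[0])
            for j in range(i + 2, n if i > 0 else n - 1):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # improvement of the 2-change replacing {a,b}, {c,d} by {a,c}, {b,d}
                delta = dist(points[a], points[b]) + dist(points[c], points[d]) \
                    - dist(points[a], points[c]) - dist(points[b], points[d])
                if delta > 1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    steps += 1
                    improved = True
    return tour, steps

random.seed(0)
pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(30)]
tour, steps = two_opt(pts, sq_dist)
```

The variable `steps` corresponds to one path in the 2-opt state graph; the theorem above bounds the expectation of the longest such path.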
Euclidean Distances
Differences of Euclidean Distances
In this section, we have
for
. Analyzing
turns out to be more difficult than analyzing
in the previous section. In particular, the case when
is close to its maximal value of
requires special attention. Intuitively, this is for the following reason: if
, then z is close to L(a, b). Assume that
for the moment. Then either z lies between a and b, which is fine, or it does not; in the latter case, moving z in the direction of L(a, b) does not change
at all.
We observe that
behaves essentially 2-dimensionally: it depends only on the distance of z from L(a, b) (this is x in the following lemma) and on the position of the projection of z onto L(a, b) (this is y in the following lemma). It also depends on the distance
between a and b (this is
in the following lemma, and we had this dependency also in the previous section about squared Euclidean distances). The following lemma makes the connection between x and y explicit for a given
. Figure 1 depicts the situation described in the lemma.
Fig. 1.

The situation for Lemma 3.19
Lemma 3.19
Let
,
,
. Let
and
be two points at a distance of
. Let
. Then we have
[Equation (2)]
for
and
[Equation (3)]
for
. Furthermore,
is impossible.
Proof
The last statement follows from the triangle inequality.
We have
. Rearranging terms and squaring implies
[Equation]
Squaring again yields
[Equation]
By rearranging terms again, we obtain
[Equation]
Using the assumption
or
implies the two claims. 
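As a quick numeric sanity check of the setup above (our own, with hypothetical coordinates), the sketch below places a and b on a horizontal line and verifies that the difference of Euclidean distances is determined by the height x of z above the line, the projection position y, and the distance between a and b, and that its absolute value never exceeds that distance, matching the triangle-inequality statement of the lemma.

```python
import math, random

def diff(delta, x, y):
    """d(a, z) - d(b, z) with a = (0, 0), b = (delta, 0), and z having
    height x above the line through a and b and projection position y."""
    a, b, z = (0.0, 0.0), (delta, 0.0), (y, x)
    return math.dist(a, z) - math.dist(b, z)

random.seed(1)
for _ in range(1000):
    delta = random.uniform(0.1, 5.0)
    x = random.uniform(0.0, 5.0)
    y = random.uniform(-5.0, 5.0)
    # triangle inequality: the difference can never exceed d(a, b) = delta
    assert abs(diff(delta, x, y)) <= delta + 1e-9

# the extreme value delta is attained exactly when z lies on the line
# through a and b but not between them (here: beyond b)
extreme = diff(2.0, 0.0, 5.0)
```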
As said before, the difficult case in analyzing
is when
. In terms of the previous lemma, this can only happen if x is small, i.e., if z is close to L(a, b), but not between a and b. The following lemma makes a quantitative statement about this connection.
Lemma 3.20
Let
. Assume that
and that z has a distance of x from L(a, b). Then
[Equation (4)]
Proof
Let y be the distance of z from
, and let
. Then, according to (3), we have
[Equation]
We have
. This and the upper bound
yield the following weaker bound:
[Equation (5)]
We distinguish two cases. The first case is that
. In this case, it suffices to show that
in order to prove (4). Since
, this holds because
.
The second case is that
. We have
[Equation]
Replacing
by
in the numerator and
by
in the denominator of (5), we obtain
[Equation]
Rearranging terms completes the proof. 
In order to be able to apply Lemma 3.4, we need the following upper bound on the derivative of y with respect to
, given that x is fixed.
Lemma 3.21
For
, let
with
. Assume further that
and that
. Then the derivative of y with respect to
is bounded by
[Equation]
Proof
The derivative of y with respect to
is given by
[Equation]
We observe that
for all x and allowed choices of
and
. For the second term, we have
[Equation]
By assumption, we have
and
. Thus, we have
[Equation]
Using Lemmas 3.21 and 3.4, we can bound the probability that
assumes a value in an interval of size
.
Lemma 3.22
Let
. Let
be arbitrary,
, and let z be drawn according to a Gaussian distribution with standard deviation
. Let
. Let I be an interval of length
. Then
[Equation]
Proof
We assume throughout this proof that
. The case that this is not satisfied is taken care of by the second term in the upper bound for the probability in the statement of the lemma.
Let x denote the distance of z to L(a, b), and let y denote the position of the projection of z onto L(a, b). First, let us assume that x is fixed. Then, by Lemmas 3.21 and 3.4, the probability that
is bounded from above by
[Equation]
Here, the requirements of Lemma 3.21 are satisfied because of Lemma 3.20, or we have
.
We observe that this probability is decreasing in x. Thus, in order to get an upper bound for the probability with random x, we can use the
-dimensional chi distribution for x according to Lemma 3.7. We obtain
[Equation]
by Lemma 3.5 using
and
. Since
, the lemma follows. 
Analysis of Pairs of 2-Changes
We immediately go to pairs of linked 2-changes, as these yield the better bounds.
Lemma 3.23
For
, the probability that a pair of linked 2-changes of type 0 yields an improvement of at most
or some point lies outside
is bounded from above by
[Equation]
Proof
We proceed similarly as in the proof of Lemma 3.15 for type 0 pairs for squared Euclidean distances. We draw the points of a fixed pair of linked 2-changes as in the proof of Lemma 3.15.
In the same way as in the proof of Lemma 3.15, using Lemma 3.22 instead of Lemma 3.12, we obtain that the probability that one fixed 2-change of the pair yields an improvement of at most
is bounded from above by
[Equation]
Here, we applied Lemma 3.5 with
.
Again in the same way as in the proof of Lemma 3.15, we can analyze both 2-changes of the type 0 pair as if they were independent. Finally, the lemma follows by a union bound over the
possibilities for a type 0 pair. 
Lemma 3.24
For
, the probability that a pair of linked 2-changes of type 1a yields an improvement of at most
or some point lies outside
is bounded from above by
[Equation]
Proof
The lemma can be proved in the same way as Lemma 3.16 with differences analogous to the proof of Lemma 3.23. 
Lemma 3.25
For
, the probability that a pair of linked 2-changes of type 1b yields an improvement of at most
or some point lies outside
is bounded from above by
[Equation]
Proof
Similar to the proof of Lemma 3.17 and using Lemma 3.22, the probability that the two 2-changes of the pair both yield an improvement of at most
is bounded from above by
[Equation]
Now the lemma follows by applying Lemma 3.5 with
. 
Theorem 3.26
For
, the expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with Euclidean distances is at most
.
Proof
We have
by Lemmas 3.23, 3.24, and 3.25. If all points are in
, then the longest edge has a length of
. Thus, the initial tour has a length of at most
. Plugging this into Lemma 3.9 yields the result. 
Smoothed Analysis of the Approximation Ratio
Technical Preparation
The following standard lemma provides a convenient way to bound the deviation of a perturbed point from its mean in the two-step model.
Lemma 4.1
(Chi-square bound [28, Cor. 2.19]) Let x be a Gaussian random vector in
of standard deviation
centered at the origin. Then, for
, we have 
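Since the display of the bound is not reproduced above, the following Monte Carlo sketch (names and the threshold parametrization are our assumptions) merely illustrates the qualitative content of Lemma 4.1: the norm of a centered Gaussian vector exceeds a threshold of the form t·σ·√d only with rapidly decaying probability.

```python
import math, random

def tail_prob(d, sigma, t, trials=20000):
    """Empirically estimate P(||x|| >= t * sigma * sqrt(d)) for a Gaussian
    vector x in R^d with independent N(0, sigma^2) coordinates.
    The threshold parametrization t * sigma * sqrt(d) is our assumption."""
    rng = random.Random(2)
    thresh = t * sigma * math.sqrt(d)
    hits = 0
    for _ in range(trials):
        norm = math.sqrt(sum(rng.gauss(0, sigma) ** 2 for _ in range(d)))
        if norm >= thresh:
            hits += 1
    return hits / trials

# the tail probability decays quickly in t (here d = 3, sigma = 1)
p_moderate = tail_prob(3, 1.0, 1.5)
p_large = tail_prob(3, 1.0, 2.5)
```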
To give large-deviation bounds on sums of independent variables with bounded support, we will make use of a standard Chernoff-Hoeffding bound.
Lemma 4.2
(Chernoff-Hoeffding Bound [9, Exercise 1.1]) Let
, where
are independently distributed in [0, 1], and
. Then, for
, we have
[Equation]
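Analogously, a small Monte Carlo sketch (our own; the exact form of the bound is not reproduced above) illustrating Lemma 4.2 for sums of independent Bernoulli variables: larger relative deviations from the mean are far less likely.

```python
import random

def upper_tail(n, p, eps, trials=5000):
    """Estimate P(X >= (1 + eps) * mu) for X a sum of n independent
    Bernoulli(p) variables with mean mu = n * p."""
    rng = random.Random(3)
    mu = n * p
    hits = 0
    for _ in range(trials):
        x = sum(1 for _ in range(n) if rng.random() < p)
        if x >= (1 + eps) * mu:
            hits += 1
    return hits / trials

# a larger relative deviation is exponentially less likely
p_small_dev = upper_tail(500, 0.2, 0.1)
p_large_dev = upper_tail(500, 0.2, 0.5)
```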
Throughout this part of the paper, we assume that the dimension
is a fixed constant. Given a sequence of points
, we call a collection
of edges a tour if T is connected and every
has in- and outdegree exactly one in T. Note that we consider directed tours, which is useful for the analysis in this part of the paper, but our distances are always symmetric.
Given any collection of edges S, its length is denoted by
, where d(u, v) denotes the Euclidean distance
between points
and
.
We call a collection
a partial 2-optimal tour if T is a subset of a tour and
holds for all edges
. Our main interests are the traveling salesperson functional
as well as the functional
that maps the point set X to the length of the longest 2-optimal tour through X.
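The 2-optimality condition for a collection of edges can be checked directly: assuming the standard condition that for every pair of edges (u, v) and (x, y), replacing them by (u, x) and (v, y) must not decrease the total length, the following sketch (names and the square example are ours) applies this check to edge collections.

```python
import math
from itertools import combinations

def is_two_optimal(points, edges):
    """Check the (assumed) 2-optimality condition for a collection of
    directed edges: for every pair of edges (u, v), (x, y) it must hold
    that d(u, v) + d(x, y) <= d(u, x) + d(v, y)."""
    d = lambda i, j: math.dist(points[i], points[j])
    for (u, v), (x, y) in combinations(edges, 2):
        if d(u, v) + d(x, y) > d(u, x) + d(v, y) + 1e-12:
            return False
    return True

# four corners of a unit square: the perimeter tour is 2-optimal,
# while a tour using both diagonals admits an improving 2-change
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
perimeter = [(0, 1), (1, 2), (2, 3), (3, 0)]
crossing = [(0, 2), (2, 1), (1, 3), (3, 0)]
```

Any subset of the edges of `perimeter` is a partial 2-optimal tour in the sense defined above.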
We note that the results in Sect. 4.2 hold for metrics induced by arbitrary norms in
(Lemmas 4.4 and 4.5) or typical
norms (Lemmas 4.6 and 4.7), not only for the Euclidean metric. We conjecture that the upper bound in Sect. 4.3 also holds for more general metrics, while the lower bound in Sect. 4.4 is probably specific to the Euclidean metric. Still, we think that the construction can be adapted to work for most natural metrics.
For obtaining lower bounds on the length of optimal tours, we consider the boundary functional
:
is the length of the shortest tour through all points in X, where we are allowed to connect points directly or to the boundary, and traversing the boundary of
has zero cost. For a proof of the following lemma and for more details about boundary functionals of Euclidean optimization problems, we refer to the monograph by Yukich [31].
Lemma 4.3
(Boundary Functional [31, Lemma 3.7]) There is a constant
such that for all sets
of n points, we have
.
Length of 2-Optimal Tours under Perturbations
In this section, we provide an upper bound for the length of any 2-optimal tour and a lower bound for the length of any global optimum. These two results yield an upper bound of
for the approximation ratio.
Chandra et al. [7] proved a bound on the worst-case length of 2-optimal tours that, in fact, already holds for the more general notion of partial 2-optimal tours. For an intuition for why this is true, note that their proof strategy is to argue that, due to the 2-optimality of the edges, not too many long arcs in a tour may have similar directions, while short edges do not contribute much to the length. The claim then follows from a packing argument. It is straightforward to verify that the collection of edges is never required to be closed or connected.
Lemma 4.4
(Length of partial 2-optimal tours [7, Theorem 5.1], paraphrased) There exists a constant
such that for every sequence X of n points in
, any partial 2-optimal tour has length less than
.
While this bound directly applies to any perturbed instance under the one-step model, Gaussian perturbations fail to satisfy the premise of bounded support in
. However, Gaussian tails are sufficiently light to enable us to translate the result to the two-step model by carefully handling outliers.
Lemma 4.5
There exists a constant
such that for any
the following statement holds. For any
, the probability that any partial 2-optimal tour on
has length greater than
, i.e.,
, is bounded by
. Furthermore,
[Equation]
Proof
By translation, assume without loss of generality that the input points are contained in
. We define cubes
with
. The side length of cube
is
. We consider the partitioning of
into the regions
and
for
. For some cube C and any tour T, let
denote the edges in T that are completely contained in C. For any tour T, the sequence
defined by
and
, for
, partitions the edges of T. Thus,
.
For any outcome of the perturbed points, let T be the longest 2-optimal tour. Then, each
is a partial 2-optimal tour in
. Let
be the (random) number of points in
, which is an upper bound on the number of points in
. At most
vertices are incident to the edges
, since each such edge is incident to at least one endpoint in
and every point has degree 2 in T. Since
is a translated unit cube scaled by
, Lemma 4.4 yields
.
Observe that
is not contained in
only if its origin has been perturbed by noise of length at least
. Thus, let
and note that
implies that
. Hence, for each point
, Lemma 4.1 yields
[Equation]
By linearity of expectation, we conclude that
for
. This yields
[Equation]
where we used Jensen’s inequality for the first inequality.
To derive tail bounds for the length of any 2-optimal tour, let
be the upper bound on
derived above. By the Chernoff bound (Lemma 4.2), we have
[Equation]
This guarantee is only strong as long as
is sufficiently large. Hence, we use this guarantee only for
, where
is chosen such that
. Assume that
for all
. Then, analogously to the above calculation, the contribution of
is bounded by
[Equation]
Let
denote the probability that some
fails to satisfy
. Then,
[Equation]
Let us continue assuming that all
satisfy
. Since in particular
, at most
vertices remain outside
. Let
. By a union bound,
[Equation]
Assume that the corresponding event holds (i.e.,
); then the remaining points outside
(and hence, outside
) are contained in
. We conclude that, with probability at least
, we have
[Equation]
This finishes the proof, since we have shown that with probability
, both the contribution of
and
is bounded by
. 
We complement the bound above by a lower bound on tour lengths of perturbed inputs, making use of the following result by Englert et al. [12] for the one-step model.
Lemma 4.6
(Englert et al. [12, Proof of Theorem 1.4]) Let
be a
-perturbed instance. Then with probability
, any tour on
has length at least
.
It also follows from their results that this bound translates to the two-step model consistently with the intuitive correspondence of
between the one-step and the two-step model.
Lemma 4.7
Let
be an instance of points in the unit cube perturbed by Gaussians of standard deviation
. Then with probability
any tour on
has length at least
.
Proof
We summarize the arguments of Englert et al. [12, Section 6] first, who considered truncated Gaussian perturbations: Here, we condition the Gaussian perturbation
for each input point
to be contained in
for some
. Conditioned on this event, the resulting input instance is contained in the cube
. By straightforward calculations, the conditional distribution of each point in C has maximum density bounded by
. Moreover, the probability that the condition fails for a single point is bounded by
for all i. Thus, by choosing
sufficiently large, each point has at least constant probability to satisfy the condition
.
Given any instance (with Gaussian perturbations which are not truncated), first reveal the (random) subinstance of those points for which the condition
is satisfied and let
be the number of such points. By the Chernoff bound (Lemma 4.2), and
, we have
for some
with probability at least
. If this event occurs, we obtain a random instance of
points and maximum density
. Hence an application of Lemma 4.6 yields that, for some constant
, the probability that a tour of length less than
exists is at most
. 
Note that Lemmas 4.5 and 4.7 almost immediately yield the following bound on the approximation performance for the two-step model. (The large-deviation bound is immediate. For the expected approximation ratio, we make use of the older and non-tight worst-case bound of
, given in Lemma 4.9 below.)
Observation 4.8
Let
be an instance of points in the unit cube perturbed by Gaussians of standard deviation
. Then the approximation performance of 2-Opt is bounded by
in expectation and with probability
.
We remark that this bound is best possible for an analysis of perturbed instances that separately bounds the lengths of any 2-optimal tour from above and gives a lower bound on any optimal tour. To see this, we argue that Lemma 4.6, Lemma 4.4 (even under
-perturbed input), Lemma 4.7, and Lemma 4.5 cannot be improved in general. This is straightforward for Lemma 4.6, since n points distributed uniformly at random in a cube of volume
always have, by scaling and Lemma 4.4, a tour of length
. Hence, the lower bound on optimal tours on perturbed instances is tight. To see that the upper bound on any 2-optimal tour is tight, take n uniformly distributed points that have, by Lemma 4.6, an optimal tour of length
with high probability and thus also in expectation.
Naturally, this transfers to the case of Gaussian perturbations, although it is more technical to verify: If we place n identical points in
, say at the origin, and perturb them with Gaussians of standard deviation
, then we may without loss of generality scale the unit cube to
and perturb the points with standard deviation 1 instead. By Lemma 4.5, any 2-optimal tour and, thus, any optimal tour on these points has a length of
on the scaled instance, since the origins are still contained in the unit cube. Thus, the optimal tour on the original instance has a length of at most
in expectation and with high probability.
We only sketch that 2-optimal tours can have a length of at least
: We distribute the n (unperturbed) points into
groups of
points each, and we partition the cube
into
subcubes of equal side length. Let
be a constant such that with high probability, at least
points of a group remain in their subcube after perturbation. We call these points successful. Since successful points are identically distributed, conditioned on falling into a compact set, the shortest tour through these (at least)
points has a length of at least
for some other constant
[31]. (This is just a scaled version of perturbing and truncating a Gaussian of standard deviation 1 to a unit hypercube, which would result in a tour length of
for m points.) By closeness of the tour on all points to the boundary functional and geometric superadditivity of the boundary functional (see Yukich [31] for details), it follows that the optimal tour on all successful points has a length of at least
.
Upper Bound on the Approximation Performance
In this section, we establish an upper bound on the approximation performance of 2-Opt under Gaussian perturbations. We achieve a bound of
. Due to the lower bound presented in Sect. 4.4, improving the smoothed approximation ratio to
is impossible. Thus, our bound is almost tight.
As noted in the previous section, to beat
it is essential to exploit the structure of the unperturbed input. This will be achieved by classifying edges of a tour into long and short edges and bounding the length of long edges by a (worst-case) global argument and short edges locally against the partial optimal tour on subinstances (by a reduction to an (almost-)average case). The local arguments for short edges will exploit how many unperturbed origins lie in the vicinity of a given region.
The global argument bounding long edges follows from the older
bound on the worst-case approximation performance [7] that we rephrase here for our purposes.
Lemma 4.9
(Chandra et al. [7, Proof of Theorem 4.3]) Let T be a 2-optimal tour and
denote the length of the optimal traveling salesperson tour
. Let
contain the set of all edges in T whose length is in
. Then
. In particular, it follows that
.
In the proof of our bound of
, the above lemma accounts for all edges of length
. A central idea to bound all shorter edges is to apply the one-step model result to small parts of the input space. In particular, we will condition sets of points to be perturbed into cubes of side length
. The following technical lemma helps to capture what values of
suffice to express the conditional density function of these points depending on the distance of their unperturbed origins to the cube. This allows for appealing to the one-step model result of Lemma 4.6.
Lemma 4.10
Let
and
. Let Y be the random variable
conditioned on
and
be the corresponding probability density function. Then
is bounded from above by
.
Proof
Let
be the probability density function of X. Let
be the point in Q that is closest to c. Then, since
is rotationally invariant around c and decreasing in
, the density
inside Q is maximized at
. Likewise,
minimizes the density inside Q. Since Q is a
-cube in
,
, where
denotes the all-ones vector. Given
, we can thus bound the conditional probability density function
for
by
[Equation]
It remains to bound, for
,
[Equation]
Since for all
,
, we can bound
, yielding the claim. 
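A numeric sanity check (our own, with hypothetical parameters) of the extremal-point argument in the proof above: for a Gaussian centered outside a small cube, the density over the cube is maximized and minimized at the cube's closest and farthest points from the center, so its variation over the cube is a bounded factor.

```python
import math

def gauss_density(p, c, sigma):
    """Density at p of a spherical Gaussian centered at c with std. dev. sigma."""
    d = len(p)
    sq = sum((a - b) ** 2 for a, b in zip(p, c))
    return math.exp(-sq / (2 * sigma ** 2)) / ((2 * math.pi) ** (d / 2) * sigma ** d)

# a small cube Q = [1, 1.25]^2 and a Gaussian centered at c outside of Q;
# for this configuration the closest and farthest points of Q from c are
# corners, so the density over Q is extremal at corners
c, sigma, xi = (3.0, 0.0), 1.0, 0.25
corners = [(x, y) for x in (1.0, 1.0 + xi) for y in (1.0, 1.0 + xi)]
vals = [gauss_density(p, c, sigma) for p in corners]
ratio = max(vals) / min(vals)  # bounded variation of the density over Q
```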
The main result of this section is the following theorem, which will be proved in the remainder of the section.
Theorem 4.11
Let
be an instance of points in
perturbed by Gaussians of standard deviation
. With probability
for any constant
, we have
. Furthermore,
[Equation]
Since the approximation performance of 2-Opt is bounded by
in the worst case, we may assume that
for all constant
, since otherwise our smoothed result is superseded by Lemma 4.9. Furthermore, we may also assume that
, since otherwise Observation 4.8 already yields the result. In what follows, let
and T be any optimal and longest 2-optimal, respectively, traveling salesperson tour on
. Furthermore, we let
denote the length of the shortest traveling salesperson tour.
Outliers and Long Edges
We will first show that the contribution of almost all points outside
is bounded by
with high probability and in expectation, similar to Lemma 4.5. For this, we define growing cubes
, where we set
for
and
. Let
be the number of points not contained in
. For every point
, Lemma 4.1 with
bounds
(note that we have chosen the
such that
). Thus,
. We define
as the set of edges of the longest 2-optimal tour T contained in
with at least one endpoint in
. We first bound the contribution of the edges in
with
.
Lemma 4.12
With probability
for any constant
, we have
[Equation]
In addition, we have
.
Proof
The proof is analogous to the proof of Lemma 4.5. Linearity of expectation, Lemma 4.4, and Jensen’s inequality yield
[Equation]
By observing that
is bounded by a constant, we conclude that
is bounded by
.
Let
be the upper bound on
derived above. By the Chernoff bound (Lemma 4.2), we have
[Equation]
Choose
such that
. Thus,
. Assume that
for all
. Then, analogously to the above calculation, the contribution of
is bounded by
[Equation]
Note that the probability that some
fails to satisfy
is bounded by
[Equation]
for any constant
. Since
, at most
vertices remain outside
. Let
. By a union bound, for any constant
,
[Equation]
Assume that we have the (very likely) event that all points are in
, then the remaining points outside
are contained in
. We conclude that
[Equation]
In the remainder of the proof, we bound the total length of edges inside
. Define
and note that all edges in C have bounded length
. We let
contain the set of all those edges within C (in the longest 2-optimal tour T) whose lengths are in
. Let
be such that
. Then
for all
, since no longer edges exist. Let
be such that
. Then
by Lemma 4.9. This argument bounds the contribution of long edges, i.e., edges longer than
, in the worst case, after observing the perturbation of the input points. It remains to bound the length of short edges in C, which we do in the next section.
Short Edges
To account for the length of the remaining edges, we take a different route than for the long edges: Call an edge that is shorter than
a short edge and partition the bounding box
into a grid of
-cubes
with
, which we call cells. All edges in
for
, i.e., short edges, are completely contained in a single cell or run from some cell
to one of its
neighboring cells. For a given tour T, let
denote the short edges of T for which at least one of the endpoints lies in
.
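The grid bookkeeping above can be sketched as follows (assuming, consistently with the 3^d − 1 neighbouring cells mentioned above, that the distance between cells is the maximum coordinate-wise index difference; names are ours).

```python
def cell_index(point, xi):
    """Map a point to the index tuple of its grid cell of side length xi."""
    return tuple(int(c // xi) for c in point)

def cell_distance(q1, q2):
    """Distance between two cells as the maximum coordinate-wise index
    difference, so the distance-1 cells are the 3^d - 1 neighbours."""
    return max(abs(a - b) for a, b in zip(q1, q2))

xi = 0.25
pts = [(0.1, 0.1), (0.2, 0.24), (0.3, 0.7), (0.9, 0.9)]
cells = [cell_index(p, xi) for p in pts]
```

An edge shorter than xi then either stays inside one cell or connects two cells at distance 1.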
We aim to relate the length of the edges
for the longest 2-optimal tour T to the length of the edges
of the optimal tour
. This local approach is justified by the following property.
Lemma 4.13
For any tour
, the contribution
of cell
is lower bounded by
.
Proof
Consider all edges S in
that have at least one endpoint in
. Replacing those edges
with
and
by the shortest edge connecting u to the boundary of
does not increase the total edge length by triangle inequality. If
were the unit cube,
would thus be lower bounded by the boundary functional
. Instead, we scale the instance
by
to obtain an instance
in the unit cube, satisfying
and, as argued above,
. Thus an application of Lemma 4.3 yields
[Equation]
Intuitively, a cell
is of one of two kinds: either few points are expected to be perturbed into it and hence it cannot contribute much to the length of any 2-optimal tour (a sparse cell), or many unperturbed origins are close to the cell (a heavy cell). In the latter case, either the conditional densities of points perturbed into
are small, hence any optimal tour inside
has a large value by Lemma 4.6, or we find another cell close to
that has a very large contribution to the length of any tour.
To formalize this intuition, fix a cell
and let
be the expected number of points
with
. Assume for convenience that
and
are integers. We describe the position of a cube
canonically by indices
. For two cells
and
, we define their distance as
. For
, let
denote all cells of distance k to
and let
denote the cardinality of unperturbed origins located in a cell in
. We call a perturbed point
with unperturbed origin
, for some
, a k-successful point. Let
denote the set of all k-successful points. Then
.
Our first technical lemma shows that any cell
, having (in expectation) a large number
of points perturbed into it from cells of distance at most K, contributes at least
to the length of the optimal tour.
Lemma 4.14
Let
and define
as the set of k-successful points for
. Let
. If
, then with probability
, we have
[Equation]
Proof
Note that by Lemma 4.13,
. Fix any realization of
, i.e., a choice of unperturbed origins inside some cell in
whose perturbed points fall into
. We can simulate the distribution of
(under this realization of
) by appealing to the one-step model. Note that each point in
is distributed as a Gaussian conditioned on containment in cell
. By rotational invariance of the Gaussian distribution, Lemma 4.10 is applicable and bounds the conditional density function of each point in
by
. By scaling, we obtain an instance in the unit cube with
points distributed according to density functions of maximum density
. Hence, by Lemma 4.6 we obtain that any tour has length
on the scaled instance with probability
. Scaling back to
, we obtain
. Since by Chernoff bounds (Lemma 4.2),
with probability
, we finally obtain, using Lemma 4.13,
[Equation]
with probability
, where we used that
. 
The following simple technical lemma shows that with constant probability, a point is perturbed into the cell it originates in.
Lemma 4.15
Let
and
. Then
.
Proof
Let
be the probability density function of Z. For all
, we have
and hence
. This yields
[Equation]
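A Monte Carlo sketch (our own; the lemma's exact constants are not reproduced above) of the statement: once the cell side is a constant multiple of the standard deviation, a point is perturbed into its own cell with constant probability. Placing the origin at the cell's center is our simplifying assumption.

```python
import random

def stay_prob(side, sigma, d=2, trials=20000):
    """Estimate the probability that a point at the center of a d-dimensional
    cell of side length `side`, perturbed by independent N(0, sigma^2) noise
    per coordinate, lands inside its own cell."""
    rng = random.Random(4)
    hits = 0
    for _ in range(trials):
        if all(abs(rng.gauss(0, sigma)) <= side / 2 for _ in range(d)):
            hits += 1
    return hits / trials

# for a side length that is a constant multiple of sigma, the probability
# of staying inside the cell is bounded below by a constant
p_stay = stay_prob(side=2.0, sigma=1.0)
```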
We are now set up to formally prove the classification of heavy cells. Recall that
denotes the number of cells
.
Lemma 4.16
Let
,
and
for sufficiently small constants
. Then we can classify each cell
with
into one of the following two types.
- (T1) With probability
for any constant
, we have 
- (T2) There is some
such that for any
, we have
with probability
for any constant
.
Proof
We start with some intuition. By Lemma 4.4, we can bound
. If we have
, then Lemma 4.14 already proves
to have type (T1). Otherwise, by tail bounds for the Gaussian distribution, we argue that some cell
in distance at most
contains at least
unperturbed origins. These are sufficiently many to let
contribute
, for any
, to the optimal tour length.
To make the intuition formal, note that all edges in
are contained in a cube of side length
around
. By Chernoff bounds (Lemma 4.2), at most
points are contained in
with probability
. Hence, Lemma 4.4 bounds
[Equation (6)]
with probability
.
Case 1:
. In this case, we may appeal to Lemma 4.14 (since
) and obtain
[Equation (7)]
with probability
, since
and
can be chosen sufficiently small. By a union bound, (6) and (7) hold with probability
for any constant
, proving that
has type (T1).
Case 2:
. Every point in
has an
-distance of at least
to every point in
. Thus, by Lemma 4.1, we have
[Equation (8)]
for sufficiently large k. Since
, we can choose a sufficiently small constant
such that
satisfies
. From
, we conclude
[Equation]
Hence, we have
[Equation]
By (8), it follows that
[Equation]
unperturbed origins are situated in cells in distance
from
. Note that there are at most
such cells and
for any
. By the pigeonhole principle, there is a cell
with
many unperturbed origins.
Let
be the 0-successful points for cell
, i.e., the points with origin in
that are perturbed into
. By Lemma 4.15, each unperturbed origin
has constant probability to be perturbed into
, i.e.,
. Hence,
. Thus, Lemma 4.14 bounds
[Equation (9)]
with probability
. Since (6) and (9) hold simultaneously with probability
for any constant
, this proves that
has type (T2).
Total Length of 2-Optimal Tours
With the analyses of the previous subsections, we can finally bound the total length of 2-optimal tours. To bound the total length of short edges, consider first sparse cells
, i.e., cells containing
perturbed points in expectation (recall that
, where
is the number of cells). For each such cell, the Chernoff bound (Lemma 4.2) yields that with probability
, at most
points are contained in
, since each point is perturbed independently. By a union bound, no sparse cell contains more than
points with probability at least
for any constant
. In this event, Lemma 4.4 allows for bounding the contribution of sparse cells by
[Equation (10)]
For bounding the length in the remaining cells (the heavy cells), let
and
. We observe the following: with probability at least
, all type-(T1) cells
satisfy
. Thus,
[Equation (11)]
where the last inequality follows from
, which holds since every edge in
(inside C) is counted at most twice on the left-hand side.
Let
be any function that assigns to each cell
of type-(T2) a corresponding cell
satisfying the condition (T2). We say that
charges
. We can choose any
and have with probability at least
that
for all
. Assume that this event occurs. Since every cell
can only be charged by cells in distance
, each cell can only be charged
times. Hence,
[Equation]
Since
, choosing
sufficiently large yields
[Equation (12)]
Proof of Theorem 4.11
By a union bound, we can bound by
, for any constant
, the probability that (i)
(by Lemma 4.7), (ii) all edges outside C contribute
(by Lemma 4.12), (iii) all sparse cells contribute
(by (10)), (iv) the type-(T1) cells
induce a cost of
(by (11)), and (v) the type-(T2) cells induce a cost of
(by (12)). Since the remaining edges are long edges and contribute only
, we obtain that every 2-optimal tour has a length of at most
with probability
.
Since a 2-optimal tour always constitutes a
-approximation to the optimal tour length by Lemma 4.9, we also obtain that the expected cost of the worst 2-optimal tour is bounded by
[Equation]
Lower Bound on the Approximation Ratio
We complement our upper bound on the approximation performance by the following lower bound: for
, the worst-case lower bound is robust against perturbations. For this, we face the technical difficulty that in general, a single outlier might destroy the 2-optimality of a desired long tour, potentially cascading into a series of 2-Opt iterations that result in a substantially different or even optimal tour.
Theorem 4.17
Let
. For infinitely many n, there is an instance X of points in
perturbed by normally distributed noise of standard deviation
such that with probability
for any constant
, we have
. This also yields
[Equation]
We remark that our result transfers naturally to the one-step model with
and, interestingly, holds with probability 1 over such random perturbations.
Proof of Theorem 4.17. We alter the construction of Chandra et al. [7] to strengthen it against Gaussian perturbations with standard deviation
(see Fig. 2). Let
be an odd integer and
. The original instance of [7] is a subset of the
-grid, which we embed into
by scaling by 1/P, and consists of three parts
,
and
. The vertices in
are partitioned into the layers
. Layer i consists of
equidistant vertices, each of which has a vertical distance of
to the point above it in Layer
and a horizontal distance of
to the nearest neighbor(s) in the same layer. The set
is a copy of
shifted to the right by a distance of 2/3. The remaining part
consists of a copy of Layer p of
shifted to the right by 1/3 to connect
and
by a path of points. We regard
as the set of Layer-i points in
.
Fig. 2.
Parts
and
of the lower bound instance. Each point is contained in a corresponding small container (depicted as brown circle) with high probability. The black lines indicate the constructed 2-optimal tour, which runs analogously on 
As in the original construction, we will construct an instance of
points, which implies
. Let
be the largest odd integer such that
. In our construction, we drop all Layers
in both
and
, as well as Layer p in
. Instead, we connect
and
already in Layer t by an altered copy of Layer t of
shifted to the right by 1/3. Let C be an arbitrary point of our construction; for convenience, we will use the central point of Layer t in
. We introduce
additional copies of this point C. These surplus points serve as a “padding” of the instance to ensure
. Note that the resulting instance has
layers
. We choose t such that the magnitude of perturbation is negligible compared to the pairwise distances of all non-padding points. Furthermore, the restriction on
ensures that incorporating the padding points increases the optimal tour length only by a constant.
Lemma 4.18
With probability
for any constant
, the optimal tour has length O(1).
Proof
Let n be the number of points in the constructed instance. Note that
consists of (i) a subset
of the instance of Chandra et al. [7], plus (ii) an additional copy
of Layer t and (iii) the padding points
in
. Denote the number of points in
by
. We have
![]() |
by choice of t. Hence
. It is easy to see [7] that the original instance of Chandra et al. has a minimum spanning tree of length
. (This is achieved by the spanning tree that includes, for each Layer-i vertex with
, the vertical edge to the point above it, and each edge between consecutive points on Layer p.) Clearly,
![]() |
Consider the perturbed instance
. Note that for every constant
, we have
for sufficiently large n. Thus for each
, the Gaussian noise
satisfies
with probability at least
by Lemma 4.1. By a union bound, we have
with probability at least
. In this case, by the triangle inequality, the fact that
for all point sets Y and since only a constant number of edges connects the three parts, we obtain
![]() |
Note that we may translate and scale
to be contained in
, by which
may be regarded as the optimal tour length on an instance of
points in
perturbed by Gaussians with standard deviation 1. By Lemma 4.5, any 2-optimal tour and hence also the optimal tour on the scaled instance has length
with probability
. Scaling back to the original instance, we obtain
with probability
. This yields the result by a union bound. 
We find a long 2-optimal tour on all non-padding points analogously to the original construction by taking a shortcut of the original 2-optimal tour, which connects
and
already in Layer t (see Fig. 2).
Consider the padding points, which are yet to be connected. Let
denote the nearest point in Layer t of
that is to the left of C. Symmetrically,
is the nearest point to the right of C. Let
be any 2-optimal path from
to
that passes through all the padding points (including C). We replace the edges
and
by the path
, completing the construction of our tour T.
Lemma 4.19
Let
be arbitrary. With probability
, T is 2-optimal and has a length of
.
Note that given Lemma 4.19, Theorem 4.17 follows directly using Lemma 4.18. The (rather technical) proof of Lemma 4.19 hence concludes our lower bound.
Probability of 2-optimality.
To account for the perturbation in the analysis, we define a safe region for every point. More formally, let
be any unperturbed origin. We define its container
as the circle centered at
with radius
. With high probability, all perturbed points lie in their containers.
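The container argument rests on the Gaussian tail: a two-dimensional Gaussian perturbation with standard deviation sigma per coordinate leaves a disk of radius r around its origin with probability exactly exp(-r^2/(2 sigma^2)), since the norm of the perturbation is Rayleigh distributed. The following sketch (Python, not from the paper) compares this exact tail to a Monte Carlo estimate; the values of `sigma` and `r` are illustrative only, not the container radius used in the construction:

```python
import math
import random

def escape_probability(sigma, r, trials=200_000, seed=1):
    # Empirical probability that a 2D Gaussian perturbation with
    # standard deviation sigma (per coordinate) leaves a disk of radius r.
    rng = random.Random(seed)
    escapes = sum(
        1 for _ in range(trials)
        if math.hypot(rng.gauss(0, sigma), rng.gauss(0, sigma)) > r
    )
    return escapes / trials

# Illustrative values: container radius three standard deviations.
sigma, r = 0.01, 0.03
exact = math.exp(-r * r / (2 * sigma * sigma))  # Rayleigh tail, approx. 0.011
```

With these values the empirical estimate agrees with the exact Rayleigh tail to three decimal places, which is the quantitative content of "with high probability" above.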
Lemma 4.20
For sufficiently large p, the tour T constructed as described in Sect. 4.4 is 2-optimal, provided that all points
lie in their corresponding containers
.
We first show that this lemma implies Lemma 4.19.
Proof of Lemma 4.19
Let
, and let
be arbitrary. Since
, we have
for sufficiently large n. By definition of the containers, Lemma 4.1 yields that for any point
and sufficiently large n,
![]() |
By a union bound, we conclude that with probability
, all points are contained in their corresponding containers and hence, by the previous lemma, T is 2-optimal.
Recall that t is the largest odd integer satisfying
. Since
, this implies
. Observe that T visits
many layers and crosses a horizontal distance of 2/3 in each of them. Hence, it has a length of at least
. 
In the remainder of this section, we prove Lemma 4.20, i.e., we show that the constructed tour is 2-optimal, provided all points stay inside their respective containers. Clearly, it suffices to show that, for any pair of edges (u, v) and (w, z) in the tour, the corresponding 2-change, i.e., replacing these edges by (u, w) and (v, z), does not reduce the tour length, i.e.,
. We first state the technical lemmas capturing the ideas behind the construction.
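This 2-change condition can be checked mechanically. A minimal sketch (Python, Euclidean distances; not taken from the paper) that tests 2-optimality of a tour by enumerating all non-adjacent edge pairs:

```python
import math

def dist(p, q):
    # Euclidean distance between two points in the plane
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_two_optimal(tour):
    # A tour is 2-optimal if no 2-change, i.e. replacing edges
    # (u, v) and (w, z) by (u, w) and (v, z), shortens it.
    n = len(tour)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges share the endpoint tour[0]
            u, v = tour[i], tour[(i + 1) % n]
            w, z = tour[j], tour[(j + 1) % n]
            if dist(u, w) + dist(v, z) < dist(u, v) + dist(w, z) - 1e-12:
                return False
    return True
```

For instance, the unit square visited in cyclic order is 2-optimal, whereas visiting it in the order (0,0), (1,1), (0,1), (1,0) produces a crossing and is not.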
The first lemma treats pairs of horizontal edges and establishes how large their vertical distance must be in order to make swapping these edges increase the length of the tour. It is a generalization of a similar lemma of Chandra et al. [7] to a perturbation setting, in which points are placed arbitrarily into small containers.
Note that in what follows, for a point
, we let
denote its x-coordinate and
its y-coordinate. Furthermore, for any points
, we let
and
denote their horizontal and vertical distance, respectively.
Lemma 4.21
Let
and
be horizontal line segments in the Euclidean plane with
and
. Let
,
,
and
be circles of radius
with centers p, q, r and s, respectively. If
and the vertical distance
between
and
is at least
![]() |
then, for all
, we have
![]() |
Proof
Note that
. Furthermore, we have that
![]() |
and hence
![]() |
where the right-hand side expression is at least 0, since
by assumption. Let
and
, then it is straightforward to verify that the expression
![]() |
(13)
subject to
is minimized when
.
Hence, we can bound (13) by
![]() |
where the third line follows from our assumption on v. 
The following very basic lemma shows that a sequence of edges that share roughly the same direction will always be 2-optimal.
Lemma 4.22
Let
and
be a sequence of points in
such that all connecting segments
fulfill
. Then,
![]() |
Proof
For any point p, let
denote the cone
. Let
, then by assumption, we have
and thus
. Let us assume that
(the other case is symmetric). Since by assumption,
, we have for
that
and
for some
and
with
. If
, the claim is immediate from
. Otherwise, for
, we obtain
![]() |
By an analogous computation,
follows and hence the claim. 
We can now prove Lemma 4.20. Assume that all points are contained in their respective containers. We call an edge between
and
horizontal (or vertical) if the edge between
and
is horizontal (or vertical) and neither
nor
belong to the set of padding points. In what follows, we will first consider horizontal-horizontal, horizontal-vertical and vertical-vertical edge pairs and then turn to pairs of edges for which at least one edge is adjacent to some padding point. Recall that
is chosen so as to satisfy
.
Horizontal-horizontal edge pair Let
and
be two horizontal edges. Horizontal edges
with
appear only if
. We distinguish the following cases.
: Both edges are in the same layer. Note that no 2-change swaps neighboring edges. Assume without loss of generality that
(the other case is symmetric). Since
, we have that
Similarly,
and
. This shows that Lemma 4.22 is applicable to
, which yields that no 2-change can be profitable.
, and
. By construction of T, the edges have opposite direction. Assume that
and hence
(the other case is symmetric). By construction
. We have that
. The same reasoning shows that
. Similarly, one can show that
for all
and
. Hence the 2-change to
and
has a crossing, which by the triangle inequality cannot be profitable.
, and
with
and
. Either both edges have opposite directions, then the previous argument shows that a 2-change is not profitable. Otherwise, note that the first requirement of Lemma 4.21,
, is fulfilled. Also note that
, since
. We have
since for sufficiently large p, we have
. Consequently, Lemma 4.21 applies and shows that the 2-change does not yield an improvement.
Horizontal-vertical edge pair. Let
be a vertical edge and
be a horizontal edge. We assume that the vertical edge is in
, since the case
is symmetric. Exactly one of the following cases occurs.
and
with
. The horizontal edge is in the same layer as one of the end points of the vertical edge. Clearly,
and
. Since a 2-change cannot swap neighboring edges, at least one horizontal segment lies between both edges. By construction of the tour, one of the edges
and
crosses a vertical distance of at least
and the other a horizontal distance of at least
. Hence
since
.
and
with
. As in the previous case,
and
. Consider first the case that
, then by construction of the tour, one of the edges
and
crosses a horizontal distance of at least
and the other edge crosses a vertical distance of at least
, yielding
since
. Otherwise, if
, the edge
crosses a vertical distance of at least
and hence
since
. Thus in both cases, a 2-change is not profitable.
Vertical-vertical edge pair. Let
and
be vertical edges.
and
with
, i.e., the vertical edges are above each other. By swapping the x- and y-axis in Lemma 4.22, we can show that a 2-change is not profitable, since it is easy to see that
for all consecutive pairs (p, q) in
.
and
with
. Clearly,
and
, while
and
. Hence a 2-change is not profitable, since
.
Padding points.
Since we assumed for convenience that the padding points are placed at the central vertex C of Layer t in
, only the edges with at least one endpoint in
are relevant candidates for the treatment of padding points. This is because all other edges have both endpoints at a distance of 1/6 from the padding points, which can never be accounted for by the edge length, since all edges except those in Layer 0 are much shorter than 1/3. Separately, the Layer-0 edges can be handled easily as well: an edge
with
is a horizontal edge, hence the pair
and a Layer-0 edge triggers the corresponding case of horizontal-horizontal edge pairs, with an even smaller length of the edge
in Layer t.
It remains to handle the following cases, where we regard C as a padding point, i.e.,
, not as a Layer-t point.
, and
. Clearly,
and
. Furthermore, at least one of
has a horizontal distance of at least
to
. Hence, 
and
. These edge pairs behave exactly like regular pairs of Layer-t edges, and the corresponding case of horizontal-horizontal edge pairs applies.
. All such edges are 2-optimal by construction, since a 2-optimal path from
to
passing by all padding points was used.
This concludes the case analysis and thus the proof of Lemma 4.20.
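Several cases above dismiss a 2-change because the new edges (u, w) and (v, z) would cross: writing X for their intersection point, d(u, w) + d(v, z) = d(u, X) + d(X, w) + d(v, X) + d(X, z) >= d(u, v) + d(w, z) by two applications of the triangle inequality. A small numeric illustration (Python; the concrete points are chosen for illustration only):

```python
import math

def dist(p, q):
    # Euclidean distance in the plane
    return math.hypot(p[0] - q[0], p[1] - q[1])

def segments_cross(u, w, v, z):
    # Proper intersection test for segments (u, w) and (v, z)
    # via orientation (signed area) signs.
    def orient(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (orient(u, w, v) * orient(u, w, z) < 0
            and orient(v, z, u) * orient(v, z, w) < 0)

# If the new edges (u, w) and (v, z) cross, the triangle inequality
# through their intersection point gives
#     d(u, w) + d(v, z) >= d(u, v) + d(w, z),
# so the 2-change cannot shorten the tour.
u, v, w, z = (0.0, 0.0), (1.0, 0.2), (0.9, 1.0), (0.1, 0.9)
```

Here the segments (u, w) and (v, z) cross, and indeed the crossing pair is strictly longer than the non-crossing pair (u, v), (w, z).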
Concluding Remarks
Running-time. Our approach for Euclidean distances does not work for
and
. However, we can use the bound of Englert et al. [12] for Euclidean distances, which yields a bound polynomial in n and
for
.
In the same way as Englert et al. [12], we can slightly improve the smoothed number of iterations by using an insertion heuristic to choose the initial tour. We save a factor of
for Manhattan and Euclidean distances and a factor of
for squared Euclidean distances. The reason is that there always exist tours of length
for n points in
for Euclidean and Manhattan distances and of length
for squared Euclidean distances for
[31] (the constants in these upper bounds depend on d). Taking into account also that, because Gaussians have light tails, only few points are far away from the hypercube
after perturbation, one might get an even better bound. However, we did not take these improvements into account in our analysis to keep the paper concise.
Of course, even our improved bounds do not fully explain the linear number of iterations observed in experiments. However, we believe that new approaches, beyond analyzing the smallest improvement, are needed in order to further improve the smoothed bounds on the running-time.
Approximation ratio.
We have proved an upper bound of
for the smoothed approximation ratio of 2-Opt. Furthermore, we have proved that the lower bound of Chandra et al. [7] remains robust even for
. We leave it as an open problem to generalize our upper bounds to the one-step model and thereby improve the current bound of
[12], but we conjecture that this might be difficult, because of the lack of the nice structure that Gaussian distributions provide.
Given the recent improvement from
to
by Brodowsky et al. [5], we raise the question of tightening our upper bound to
.
While our bound significantly improves the previously known bound for the smoothed approximation ratio of 2-Opt, we readily admit that it still does not explain the performance observed in practice. A possible explanation is that when the initial tour is not picked by an adversary or the nearest neighbor heuristic, but using a construction heuristic such as the spanning tree heuristic or an insertion heuristic, an approximation factor of 2 is guaranteed even before 2-Opt has begun to improve the tour [27]. We chose to compare the worst local optimum to the global optimum, as this is arguably the simplest of all technically difficult possibilities.
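The effect of a good initialization can be made concrete. The following toy pipeline (Python; a plain sketch, not an implementation from the literature) builds a nearest-neighbor tour on random points and then runs 2-Opt to a local optimum:

```python
import math
import random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def tour_length(tour):
    n = len(tour)
    return sum(dist(tour[i], tour[(i + 1) % n]) for i in range(n))

def nearest_neighbor_tour(points):
    # Greedy construction: repeatedly visit the closest unvisited point.
    unvisited = list(points[1:])
    tour = [points[0]]
    while unvisited:
        nxt = min(unvisited, key=lambda q: dist(tour[-1], q))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def two_opt(tour):
    # Apply improving 2-changes until a local optimum is reached.
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these edges share an endpoint
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if dist(a, c) + dist(b, d) < dist(a, b) + dist(c, d) - 1e-12:
                    # Reversing the segment replaces edges (a, b), (c, d)
                    # by (a, c), (b, d).
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(60)]
start = nearest_neighbor_tour(pts)
final = two_opt(list(start))
```

On such random instances the 2-Opt phase typically shaves a few percent off the nearest-neighbor tour, consistent with the experimental picture referenced above; the sketch makes no claim about worst-case or smoothed guarantees.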
However, a smoothed analysis of the approximation ratio of 2-Opt initialized with a good heuristic might be difficult: even in the average case, it is only known that the length of an optimal TSP tour is concentrated around
for some constant
. But the precise value of
is unknown [31]. Since experiments suggest that 2-Opt even with good initialization does not achieve an approximation ratio of
[16, 17], one has to deal with the precise constants, which seems challenging.
Finally, we conjecture that many examples for showing lower bounds for the approximation ratio of concrete algorithms for Euclidean optimization such as the TSP remain stable under perturbation for
. The question remains whether such small values of
, although they often suffice to prove polynomial smoothed running time, are essential to explain practical approximation ratios, or whether more slowly decreasing
provide a sufficient explanation.
Footnotes
This paper is based on results presented at ISAAC 2013 [24] and ICALP 2015 [20].
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Abramowitz, M., Stegun, I.A. (eds.): Pocketbook of Mathematical Functions. Harri Deutsch (1984)
- 2.Arora, S.: Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. J. ACM 45(5), 753–782 (1998) [Google Scholar]
- 3.Bläser, M., Manthey, B., Raghavendra Rao, B.V.: Smoothed analysis of partitioning algorithms for Euclidean functionals. Algorithmica 66(2), 397–418 (2013) [Google Scholar]
- 4.Bringmann, K., Engels, C., Manthey, B., Raghavendra Rao, B. V.: Random shortest paths: Non-euclidean instances for metric optimization problems. In Krishnendu Chatterjee and Jiří Sgall, editors, Proc. of the 38th Int. Symp. on mathematical foundations of computer science (MFCS), volume 8087 of lecture notes in computer science, pages 219–230. Springer, (2013)
- 5.Brodowsky, U.A., Hougardy, S., Zhong, X.: The approximation ratio of the k-opt heuristic for the Euclidean traveling salesman problem. SIAM J. Comput. 52(4), 841–864 (2023) [Google Scholar]
- 6.Brunsch, T., Röglin, H., Rutten, C., Vredeveld, T.: Smoothed performance guarantees for local search. Math. Program. 146(1–2), 185–218 (2014) [Google Scholar]
- 7.Chandra, B., Karloff, H., Tovey, C.: New results on the old k-opt algorithm for the traveling salesman problem. SIAM J. Comput. 28(6), 1998–2029 (1999) [Google Scholar]
- 8.Curticapean, R., Künnemann, M.: A quantization framework for smoothed analysis of Euclidean optimization problems. Algorithmica 73(3), 1–28 (2015) [Google Scholar]
- 9.Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, Cambridge (2009) [Google Scholar]
- 10.Durrett, R.: Probability: Theory and Examples. Cambridge University Press, Cambridge (2013) [Google Scholar]
- 11.Engels, C., Manthey, B.: Average-case approximation ratio of the 2-opt algorithm for the TSP. Oper. Res. Lett. 37(2), 83–84 (2009) [Google Scholar]
- 12.Englert, M., Röglin, H., Vöcking, B.: Worst case and probabilistic analysis of the 2-Opt algorithm for the TSP. Algorithmica 68(1), 190–264 (2014) [Google Scholar]
- 13.Etscheid, M.: Performance guarantees for scheduling algorithms under perturbed machine speeds. Discret. Appl. Math. 195, 84–100 (2015) [Google Scholar]
- 14.Evans, M., Hastings, N., Peacock, B.: Statistical Distributions, 3rd edn. Wiley, Hoboken (2000) [Google Scholar]
- 15.Funke, S., Laue, S., Lotker, Z., Naujoks, R.: Power assignment problems in wireless communication: Covering points by disks, reaching few receivers quickly, and energy-efficient travelling salesman tours. Ad Hoc Netw. 9(6), 1028–1035 (2011) [Google Scholar]
- 16.Johnson, D.S., McGeoch, L.A.: The traveling salesman problem: a case study. In: Aarts, E., Lenstra, J.K. (eds.) Local Search in Combinatorial Optimization. Wiley, Hoboken (1997) [Google Scholar]
- 17.Johnson, D.S., McGeoch, L.A.: Experimental analysis of heuristics for the STSP. In: Gutin, G., Punnen, A.P. (eds.) The Traveling Salesman Problem and its Variations. Kluwer Academic Publishers, Dordrecht (2002) [Google Scholar]
- 18.Karger, D., Onak, K.: Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems. In Proc. of the 18th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 1207–1216. SIAM, (2007)
- 19.Kern, W.: A probabilistic analysis of the switching algorithm for the TSP. Math. Program. 44(2), 213–219 (1989) [Google Scholar]
- 20.Künnemann, M., Manthey, B.: Towards understanding the smoothed approximation ratio of the 2-opt heuristic. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, In: Proc. of the 42nd Int. Coll. on Automata, Languages and Programming (ICALP), volume 9134 of Lecture Notes in Computer Science, pages 859–871. Springer, (2015)
- 21.Manthey, B.: Smoothed analysis of local search. In: Roughgarden, T. (ed.) Beyond the Worst-Case Analysis of Algorithms, pp. 285–308. Cambridge University Press, Cambridge (2020) [Google Scholar]
- 22.Manthey, B., Röglin, H.: Smoothed analysis: analysis of algorithms beyond worst case. IT Inf Technol 53(6), 280–286 (2011) [Google Scholar]
- 23.Manthey, B.,Rhijn, J.V.: Improved smoothed analysis of 2-opt for the Euclidean TSP. In Satoru Iwata and Naonori Kakimura, editors, In: Proc. 34th Int. Symposium on Algorithms and Computation (ISAAC), volume 283 of LIPIcs, pages 52:1–52:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, (2023)
- 24.Manthey, B.,Veenstra, R.: Smoothed analysis of the 2-Opt heuristic for the TSP: Polynomial bounds for Gaussian noise. In Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam, editors, In: Proc. of the 24th Ann. Int. Symp. on Algorithms and Computation (ISAAC), volume 8283 of Lecture Notes in Computer Science, pages 579–589. Springer, (2013)
- 25.Mitchell, J.S.B.: Guillotine subdivisions approximate polygonal subdivisions: a simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems. SIAM J. Comput. 28(4), 1298–1309 (1999) [Google Scholar]
- 26.Papadimitriou, C.H.: The Euclidean traveling salesman problem is NP-complete. Theoret. Comput. Sci. 4(3), 237–244 (1977) [Google Scholar]
- 27.Rosenkrantz, D.J., Stearns, R.E., Lewis, P.M., II.: An analysis of several heuristics for the traveling salesman problem. SIAM J. Comput. 6(3), 563–581 (1977) [Google Scholar]
- 28.Spielman, D.A., Teng, S.-H.: Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM 51(3), 385–463 (2004) [Google Scholar]
- 29.Spielman, D.A., Teng, S.-H.: Smoothed analysis: an attempt to explain the behavior of algorithms in practice. Commun. ACM 52(10), 76–84 (2009) [Google Scholar]
- 30.Nijnatten, F.v., Sitters, R., Woeginger, G. J., Wolff, A., de Berg, M.: The traveling salesman problem under squared Euclidean distances. In Jean-Yves Marion and Thomas Schwentick, editors, In: Proc. of the 27th Int. Symp. on Theoretical Aspects of Computer Science (STACS), volume 5 of LIPIcs, pages 239–250. Schloss Dagstuhl– Leibniz-Zentrum für Informatik, (2010)
- 31.Yukich, J.E.: Probability theory of classical Euclidean optimization problems. Springer, Berlin (1998) [Google Scholar]