Abstract
The 2-opt heuristic is a very simple local search heuristic for the traveling salesperson problem. In practice it usually converges quickly to solutions within a few percent of optimality. In contrast to this, its running-time is exponential and its approximation performance is poor in the worst case. Englert, Röglin, and Vöcking (Algorithmica, 2014) provided a smoothed analysis in the so-called one-step model in order to explain the performance of 2-opt on d-dimensional Euclidean instances, both in terms of running-time and in terms of approximation ratio. However, translating their results to the classical model of smoothed analysis, where points are perturbed by Gaussian distributions with standard deviation σ, yields only weak bounds. We prove bounds that are polynomial in n and 1/σ for the smoothed running-time with Gaussian perturbations. In addition, our analysis for Euclidean distances is much simpler than the existing smoothed analysis. Furthermore, we prove a smoothed approximation ratio of O(log(1/σ)). This bound is almost tight, as we also provide a lower bound of Ω(log n / log log n) for σ = O(1/√n). Our main technical novelty here is that, unlike existing smoothed analyses, we do not separately analyze objective values of the global and local optimum on all inputs (which only allows for a bound of O(1/σ)), but simultaneously bound them on the same input.
Keywords: Travelling salesperson problem, Local search, Smoothed analysis, Approximation ratio, 2-opt
2-Opt and Smoothed Analysis
The traveling salesperson problem (TSP) is one of the classical combinatorial optimization problems. Euclidean TSP is the following variant: given a point set X ⊆ R^d, find the shortest Hamiltonian cycle that visits all points in X (also called a tour). Even this restricted variant is NP-hard for d = 2 [26]. We consider Euclidean TSP with Manhattan and Euclidean distances as well as squared Euclidean distances to measure the distances between points. For the former two, there exist polynomial-time approximation schemes (PTAS) [2, 25]. The latter, which has applications in power assignment problems for wireless networks [15], admits a PTAS in low dimensions and is APX-hard in higher dimensions [30].
As it is unlikely that there are efficient algorithms for solving Euclidean TSP optimally, heuristics have been developed in order to find near-optimal solutions quickly. One very simple and popular heuristic is 2-opt: starting from an initial tour, we iteratively replace two edges by two other edges to obtain a shorter tour until we have found a local optimum. Experiments indicate that 2-opt converges to near-optimal solutions quite quickly [16, 17], but its worst-case performance is bad: the worst-case running-time is exponential even for d = 2 [12], and the worst-case approximation ratio on Euclidean instances is Ω(log n / log log n) [5, 7].
An alternative to worst-case analysis is average-case analysis, where the expected performance with respect to some probability distribution is measured. The average-case running-time for Euclidean and random metric instances and the average-case approximation ratio for non-metric instances of 2-opt have been analyzed [4, 7, 11, 19]. However, while worst-case analysis is often too pessimistic because it is dominated by artificial instances that are rarely encountered in practice, average-case analysis is dominated by random instances, which often have, with high probability, very special properties that they do not share with typical instances.
In order to overcome the drawbacks of both worst-case and average-case analysis and to explain the performance of the simplex method, Spielman and Teng invented smoothed analysis [28], a hybrid of worst-case and average-case analysis: an adversary specifies an instance, and then this instance is slightly randomly perturbed. The smoothed performance is the expected performance, where the expected value is taken over the random perturbation. The underlying assumption is that real-world instances are often subjected to a small amount of random noise. This noise can stem from measurement or rounding errors, or it might be a realistic assumption that the instances are influenced by unknown circumstances, but we do not have any reason to believe that these are adversarial. Smoothed analysis often allows more realistic conclusions about the performance than worst-case or average-case analysis. Since its invention, it has been applied successfully to explain the performance of a variety of algorithms. We refer to two surveys for an overview of smoothed analysis in general [22, 29] and a more recent survey about smoothed analysis applied to local search algorithms [21].
Related Results
Running-time. Englert, Röglin, and Vöcking [12] provided a smoothed analysis of 2-opt in order to explain its performance. They used the one-step model: an adversary specifies n probability density functions f_1, …, f_n : [0, 1]^d → [0, φ]. Then the n points x_1, …, x_n are drawn independently according to the densities f_1, …, f_n, respectively. Here, φ ≥ 1 is the perturbation parameter. If φ = 1, then the only possibility is the uniform distribution on [0, 1]^d, and we obtain an average-case analysis. The larger φ, the more powerful the adversary. Englert et al. [12] proved that the expected number of iterations of 2-opt is bounded by polynomials in n and φ for Manhattan and for Euclidean distances. These bounds can be improved slightly by choosing the initial tour with an insertion heuristic. However, if we transfer these bounds to the classical model of points perturbed by Gaussian distributions of standard deviation σ, we obtain bounds that are polynomial in n and σ^(−d) [12, Section 6], since the maximum density of a d-dimensional Gaussian with standard deviation σ is of order σ^(−d). While this is polynomial for any fixed d, it is unsatisfactory that the degree of the polynomial depends on d.
Approximation ratio.
Much less is known about the smoothed approximation performance of algorithms. Karger and Onak have shown that multi-dimensional bin packing can be approximated arbitrarily well for smoothed instances [18] and there are frameworks to approximate Euclidean optimization problems such as TSP for smoothed instances [3, 8]. However, these approaches mostly consider algorithms tailored to solving smoothed instances.
With respect to concrete algorithms other than 2-opt, we are only aware of analyses of the jump and lex-jump heuristics for scheduling [6, 13].
Englert et al. [12] proved a bound of O(φ^(1/d)) for the smoothed approximation ratio in the one-step model. Translated to Gaussians, this yields a bound of O(1/σ) if we truncate the Gaussians such that all points lie in a hypercube of constant side length. This result, however, does not explain the approximation performance of 2-opt, as the bound is still quite large, even for relatively large values of σ.
Our Contribution
In order to improve our understanding of the practical performance of 2-opt, we provide an improved smoothed analysis of both its running-time and its approximation ratio. To do this, we use the classical smoothed analysis model: an adversary chooses n points from the d-dimensional unit hypercube [0, 1]^d, and then these points are independently perturbed by Gaussian random variables of standard deviation σ.
Running-time The bounds that we prove are polynomial in n and 1/σ. In contrast to earlier results, the degree of the polynomial is independent of d. As distance measures, we consider Manhattan (Sect. 3.3), Euclidean (Sect. 3.5), and squared Euclidean distances (Sect. 3.4).
The analysis for Manhattan distances is essentially an adaptation of the existing analysis by Englert et al. [12, Section 4.1]. Note that our bound does not have any factor that is exponential in d.
Our analysis for Euclidean distances is considerably simpler than the one by Englert et al., which is rather technical and takes more than 25 pages [12, Section 4.2 and Appendix C].
The analysis for squared Euclidean distances is, to our knowledge, not preceded by a smoothed analysis in the one-step model. Because of the nice properties of squared Euclidean distances and Gaussian perturbations, this smoothed analysis is relatively compact and elegant.
Table 1 summarizes our bounds for the number of iterations.
Table 1.
Our bounds compared to the bounds obtained by Englert et al. [12] for the one-step model (columns: Manhattan, Euclidean, and squared Euclidean distances; rows: the bounds of Englert et al. [12], our general bounds, our bounds for small and for large σ, and remarks; the formula entries are omitted here)

The bounds can roughly be transferred to Gaussian noise by replacing φ with σ^(−d). For convenience, we added our bounds for small and for large values of σ. The notation O_d(·) means that factors depending on d are hidden in the O. The remarks apply only to our bounds; in particular, the improved bounds via linked 2-changes for squared Euclidean distances hold only for d ≥ 3, and a weaker bound holds for d = 2.
Recently, building on our analysis, Manthey and van Rhijn [23] have improved the running-time bounds for Euclidean distances, reducing our upper bound further at the cost of a significantly more complex proof. Furthermore, while their analysis remains restricted to Euclidean distances, we cover Manhattan and squared Euclidean distances as well. In particular, our analysis for squared Euclidean distances is very compact, in contrast to most applications of smoothed analysis, which makes this case particularly appealing, not least for teaching courses on this subject.
Approximation ratio Like the earlier smoothed analysis by Englert et al. [12], we provide bounds on the quality of the worst local optimum. While this measure is rather unrealistic and pessimistic, it decouples the analysis from the seeding of the heuristic. Taking the seeding into account would probably complicate the analysis severely.
Our bound of O(log(1/σ)) improves significantly upon the direct translation of the bound of Englert et al. [12] to Gaussian perturbations (see Sect. 4.2 for how to translate the bound to Gaussian perturbations without truncation). It smoothly interpolates between the constant average-case approximation ratio and the worst-case bound of O(log n).
In order to obtain our improved bound for the smoothed approximation ratio, we take into account the origins of the points, i.e., their unperturbed positions. Although this information is not available to the algorithm, it can be exploited in the analysis. The smoothed analyses of approximation ratios so far [3, 6, 8, 12, 13, 18] essentially ignored this information. While this simplifies the analysis, being oblivious to the unperturbed positions seems to be too pessimistic. In fact, we see that the bound of Englert et al. [12] cannot be improved beyond Ω(1/σ) by ignoring the positions of the points (Sect. 4.2). The reason for this limitation is that the lower bound for the global optimum is attained if all points have the same origin, which corresponds to an average-case rather than a smoothed analysis. On the other hand, the upper bound for the local optimum has to hold for all choices of the unperturbed points, most of which yield higher costs for the global optimum than the average-case analysis. Taking this into account carefully yields our bound of O(log(1/σ)) (Sect. 4.3).
To complement our upper bound, we show that the lower bound of Ω(log n / log log n) by Chandra et al. [7] remains true for σ = O(1/√n) (Sect. 4.4). This implies that a smoothed bound of o(log(1/σ) / log log(1/σ)) is impossible, and, thus, our bound cannot be improved significantly.
2-Opt and Smoothing Model
2-Opt Heuristic for the TSP
Let X ⊆ R^d be a set of n points. The goal of the TSP is to find a Hamiltonian cycle (also called a tour) T through X that has minimum length according to some distance measure. In this paper, we consider standard Euclidean distances for both approximation ratio and running-time as well as squared Euclidean distances and Manhattan distances for the running-time.
Given a tour T, a 2-change replaces two edges {x_1, x_2} and {x_3, x_4} of T by the two new edges {x_1, x_3} and {x_2, x_4}, provided that this yields again a tour (this is the case if x_1, x_2, x_3, x_4 appear in this order in the tour) and that this decreases the length of the tour, i.e., d(x_1, x_2) + d(x_3, x_4) > d(x_1, x_3) + d(x_2, x_4), where d(x, y) = ‖x − y‖ (Euclidean distances), d(x, y) = ‖x − y‖_1 (Manhattan distances), or d(x, y) = ‖x − y‖² (squared Euclidean distances). The 2-opt heuristic iteratively improves an initial tour by applying 2-changes until it reaches a local optimum. A local optimum is called a 2-optimal tour.
Smoothing Model
Throughout the rest of this paper, let Y = {y_1, …, y_n} ⊆ [0, 1]^d be a set of n points from the unit hypercube. In the smoothed analysis, these points are chosen by an adversary, and they serve as unperturbed origins. Let g_1, …, g_n be n independent d-dimensional Gaussian random variables with mean 0 and standard deviation σ. By slight abuse of notation, standard deviation σ refers here to the multivariate normal distribution with covariance matrix σ²·I_d. We obtain the perturbed point set X = {x_1, …, x_n} by setting x_i = y_i + g_i for each i ∈ {1, …, n}; this makes explicit from which point set Y the points in X are obtained.
We assume that σ ≤ 1 throughout the paper. This is justified for two reasons. First, small σ is the interesting case, i.e., when the order of magnitude of the perturbation is relatively small. Second, smoothed performance guarantees are monotonically decreasing in σ: if we have σ > 1, then rescaling by 1/σ shows that this is equivalent to adversarial instances in [0, 1/σ]^d that are perturbed with standard deviation 1. This in turn is dominated by adversarial instances in [0, 1]^d that are perturbed with standard deviation 1, as [0, 1/σ]^d ⊆ [0, 1]^d. Thus, any upper bound for σ = 1 (be it for the number of iterations or the approximation ratio) holds also for larger σ.
Let us make a final remark about the smoothing model: while the algorithm itself, the 2-opt heuristic in our case, only sees X and does not know anything about the origins y_1, …, y_n, we can of course exploit the positions of the unperturbed points in the analysis.
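The smoothing model above can be sketched in a few lines (an illustrative sketch; `smoothed_instance` is our own helper name, not notation from the paper):

```python
import random

def smoothed_instance(origins, sigma, rng=random):
    """Perturb each origin y_i in [0,1]^d by adding independent Gaussian
    noise with mean 0 and standard deviation sigma to every coordinate."""
    return [tuple(y_ij + rng.gauss(0.0, sigma) for y_ij in y_i)
            for y_i in origins]

# the adversary picks the origins inside the unit hypercube [0,1]^d
rng = random.Random(0)
d, n, sigma = 2, 5, 0.1
origins = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
X = smoothed_instance(origins, sigma, rng)
```

Note that the algorithm is later run on X only; the origins are used exclusively in the analysis.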
Smoothed Analysis of the Running-Time
In this section, we make the dependence on all parameters (the number n of points, the dimension d, and the perturbation parameter σ) explicit. This means that the O-notation does not hide any factors, not even factors depending on d, which is often considered a constant and therefore ignored. (This is in contrast to our analysis of the approximation ratio, where the hidden constants can indeed depend on d.)
Probability Theory for the Running-Time
In order to get an upper bound for the length of the initial tour, we need an upper bound for the diameter of the point set X. Such an upper bound is also necessary for the analysis of 2-changes with Euclidean distances (Sect. 3.5). We choose D such that X ⊆ [−D, D]^d with a probability of at least 1 − 1/n!. For fixed d and σ, we can choose D according to the following lemma. For σ ≤ 1, we have D = O(√(n log n)).
Lemma 3.1
Let c be a sufficiently large constant, and let D = c·(1 + σ·√(n log n)). Then P(X ⊄ [−D, D]^d) ≤ 1/n!.
Proof
We have X ⊄ [−D, D]^d only if there is a point x_i ∈ X that has a coordinate perturbed by more than D − 1. According to Durrett [10, Theorem 1.2.3], the probability that a 1-dimensional Gaussian of standard deviation σ is more than t away from its mean is bounded from above by (σ/(t·√(2π)))·exp(−t²/(2σ²)). Thus, the probability that X ⊄ [−D, D]^d is bounded from above by n·d·(σ/((D − 1)·√(2π)))·exp(−(D − 1)²/(2σ²)). For sufficiently large c, this is at most 1/n!. □
Note that the constant c in Lemma 3.1 does not depend on the dimension d.
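The Gaussian tail bound quoted from Durrett in the proof above can be checked numerically against the exact tail (a sketch; `gauss_tail` computes the exact one-sided tail via the complementary error function, and `tail_bound` is the bound used in the proof):

```python
from math import erfc, exp, pi, sqrt

def gauss_tail(t, sigma):
    """Exact tail P(N(0, sigma^2) > t)."""
    return 0.5 * erfc(t / (sigma * sqrt(2)))

def tail_bound(t, sigma):
    """Upper bound (sigma / (t sqrt(2 pi))) * exp(-t^2 / (2 sigma^2))."""
    return sigma / (t * sqrt(2 * pi)) * exp(-t * t / (2 * sigma * sigma))

# the bound dominates the exact tail for every t > 0
for sigma in (0.1, 1.0):
    for t in (0.5, 1.0, 3.0, 8.0):
        assert gauss_tail(t, sigma) <= tail_bound(t, sigma)
```

The bound becomes tight as t/σ grows, which is exactly the regime used in Lemma 3.1.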
The following lemma is well known and follows from the fact that the density of a d-dimensional Gaussian with standard deviation σ is bounded from above by (2πσ²)^(−d/2) and the volume of a d-dimensional ball of radius r is π^(d/2)·r^d / Γ(d/2 + 1).
Lemma 3.2
Let a ∈ R^d be drawn according to a d-dimensional Gaussian distribution of standard deviation σ, and let B be a d-dimensional hyperball of radius r centered at an arbitrary point z ∈ R^d. Then P(a ∈ B) ≤ (r/σ)^d.
For x, y ∈ R^d with x ≠ y, let L(x, y) denote the straight line through x and y.
Lemma 3.3
Let a, b ∈ R^d be arbitrary with a ≠ b. Let c ∈ R^d be drawn according to a d-dimensional Gaussian distribution with standard deviation σ. Then the probability that c is δ-close to L(a, b), i.e., dist(c, L(a, b)) ≤ δ, is bounded from above by (δ/σ)^(d−1).
Proof
We divide drawing c into drawing a 1-dimensional Gaussian c_∥ in the direction of b − a and drawing a (d − 1)-dimensional Gaussian c_⊥ in the hyperplane orthogonal to b − a and containing a. Then the distance of c to L(a, b) equals the distance of c_⊥ to a within this hyperplane. For every c_∥, the point c is δ-close to L(a, b) only if c_⊥ falls into a (d − 1)-dimensional hyperball of radius δ around a in the (d − 1)-dimensional subspace orthogonal to b − a. Now the lemma follows by applying Lemma 3.2. □
We need the following lemma in Sect. 3.5.
Lemma 3.4
Let f : R → R be a differentiable function, and let B be an upper bound for the absolute value of the derivative of f. Let c ∈ R be distributed according to a Gaussian distribution with standard deviation σ. Let I be an interval of size ε, and let f(I) be the image of I. Then P(c ∈ f(I)) ≤ B·ε/(σ·√(2π)).
Proof
Since the derivative of f is bounded by B, the set f(I) is contained in an interval of length B·ε. The lemma follows since the density of c is bounded from above by 1/(σ·√(2π)). □
The chi distribution [14, Section 8] is the distribution of the Euclidean length of a d-dimensional Gaussian random vector of standard deviation σ and mean 0. In the following, we denote its density function by f_chi. It is given by

f_chi(x) = x^(d−1) · exp(−x²/(2σ²)) / (2^(d/2 − 1) · σ^d · Γ(d/2)) for x ≥ 0, (1)

where Γ denotes the gamma function. We need the following lemma several times.
Lemma 3.5
Assume that c ≥ 1 is a fixed constant and d is arbitrary with d > c. Then we have

∫_0^∞ x^(−c) · f_chi(x) dx = Γ((d − c)/2) / (2^(c/2) · σ^c · Γ(d/2)) = O((σ·√d)^(−c)).
Proof
The first equality follows by integration, substituting u = x²/(2σ²). For the second equality, we observe that c is a fixed constant (which also never depends on d when we apply the lemma) and that

Γ(x) = √(2π/x) · (x/e)^x · e^(μ(x))

for some function μ with μ(x) → 0 as x → ∞ according to Stirling's formula [1, 6.1.37]. Applying this to both gamma functions yields

Γ((d − c)/2) / Γ(d/2) = Θ( ((d − c)/2)^((d−c)/2) / (d/2)^(d/2) · e^(c/2) ) = Θ( (d/2)^(−c/2) · (1 − c/d)^((d−c)/2) · e^(c/2) ) = Θ(d^(−c/2)),

since (1 − c/d)^((d−c)/2) = Θ(e^(−c/2)) for fixed c. Dividing by 2^(c/2)·σ^c shows that the whole integral is O((σ·√d)^(−c)). □
The analysis with Euclidean and squared Euclidean distances depends on the distribution of the distance between two points perturbed by Gaussians, where a larger distance between the two points is better for the analysis. The following two lemmas show that, given that larger distance is better, we can replace the distribution of the distance by the corresponding chi distribution. Since we do not know the original positions of the points involved, this allows us to replace unknown distributions by the chi distribution.
Lemma 3.6
Assume that a is drawn according to a d-dimensional Gaussian distribution with standard deviation σ and mean 0. Assume that b is drawn according to a d-dimensional Gaussian distribution with standard deviation σ and arbitrary mean μ ∈ R^d. Then ‖b‖ stochastically dominates ‖a‖, i.e., P(‖b‖ ≥ t) ≥ P(‖a‖ ≥ t) for all t ≥ 0.
Proof
For d = 1 and mean μ ≥ 0 (the case μ ≤ 0 is symmetric), we have the following for every t ≥ 0:

P(|b| ≥ t) = P(b ≥ t) + P(b ≤ −t) ≥ P(a ≥ t) + P(a ≤ −t) = P(|a| ≥ t),

where the inequality follows from the symmetry and unimodality of the Gaussian density: shifting the mean from 0 to μ moves at least as much probability mass above t as it moves above −t from below.

Now we prove the lemma for larger d. Since Gaussian distributions are rotationally symmetric, we can assume that μ = (m, 0, …, 0) for some m ≥ 0.

We observe that ‖b‖ dominates ‖a‖ if and only if ‖b‖² dominates ‖a‖². Write a = (a_1, a′) and b = (b_1, b′) with a′, b′ ∈ R^(d−1). It suffices to prove the lemma conditioned on b′ = a′, as b′ follows the same distribution as a′. Fixing b′ = a′ fixes also ‖b′‖² = ‖a′‖². Then ‖b‖² = b_1² + ‖a′‖² dominates ‖a‖² = a_1² + ‖a′‖² if b_1² dominates a_1². This is true because the lemma holds for d = 1. □
Lemma 3.7
Let b be as in Lemma 3.6, and let h : [0, ∞) → [0, ∞) be a monotonically decreasing function. Let g be the density function of ‖b‖. Then

∫_0^∞ h(x)·g(x) dx ≤ ∫_0^∞ h(x)·f_chi(x) dx,

provided that both integrals exist.
Proof
Let a denote a d-dimensional Gaussian random variable of standard deviation σ and mean 0. Then ‖a‖ has density f_chi. By Lemma 3.6, ‖a‖ is dominated by ‖b‖. This implies that h(‖a‖) dominates h(‖b‖) since h is monotonically decreasing. The lemma follows by observing that the two integrals are the expected values of h(‖b‖) and h(‖a‖), respectively. □
For Euclidean and squared Euclidean distances, it turns out to be useful to study η(a, b, z) = d(z, a) − d(z, b) for points a, b, z ∈ R^d. By abusing notation, we sometimes write η(a, b) instead of η(a, b, z) for short. A 2-change that replaces the edges {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4} improves the tour length by η(x_2, x_3, x_1) + η(x_3, x_2, x_4).
2-Opt State Graph and Linked 2-Changes
The number of iterations that 2-opt needs depends of course heavily on the initial tour and on which 2-change is chosen in each iteration. We do not make any assumptions about the initial tour or about which 2-change is chosen. Following Englert et al. [12], we consider the 2-opt state graph: we have a node for every tour and a directed edge from tour T to tour T′ if T′ can be obtained from T by one 2-change. The 2-opt state graph is a directed acyclic graph, and the length of the longest path in the 2-opt state graph is an upper bound for the number of successful iterations that 2-opt needs.
In order to improve the bounds, we also consider pairs of linked 2-changes [12]. Two 2-changes form a pair of linked 2-changes if there is one edge that is added in one 2-change and removed in the other 2-change. Formally, one 2-change replaces {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4}, and the other 2-change replaces {x_1, x_3} and {x_5, x_6} by two other edges. The edge {x_1, x_3} is the one that appears and disappears again (or the other way round). It can happen that {x_5, x_6} and {x_2, x_4} intersect. Englert et al. [12] called a pair of linked 2-changes a type i pair if |{x_5, x_6} ∩ {x_2, x_4}| = i. As type 2 pairs, which involve only four nodes, are difficult to analyze because of dependencies, we ignore them. Fortunately, the following lemma states that we will find enough disjoint pairs of linked 2-changes of type 0 and 1 in any sufficiently long sequence of 2-changes.
Lemma 3.8
(Englert et al. [12, Lemma 9 of corrected version]) Every sequence of t consecutive 2-changes contains at least c_1·t − c_2·n² disjoint pairs of linked 2-changes of type 0 or type 1, for suitable constants c_1, c_2 > 0.
Following Englert et al. [12, Figure 8], we subdivide type 1 pairs into type 1a and type 1b depending on how {x_5, x_6} and {x_2, x_4} intersect. One of the 2-changes replaces {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4}. Then the other 2-change, i.e., the one that removes the edge {x_1, x_3} shared by the linked pair, determines its type:

Type 0: {x_1, x_3} and {x_5, x_6} are replaced by {x_1, x_5} and {x_3, x_6}.
Type 1a: {x_1, x_3} and {x_4, x_5} are replaced by {x_1, x_4} and {x_3, x_5}.
Type 1b: {x_1, x_3} and {x_4, x_5} are replaced by {x_1, x_5} and {x_3, x_4}.
The main idea in the proofs by Englert et al. [12] and also in our proofs is to bound the minimal improvement of any 2-change or the minimal improvement of any pair of linked 2-changes. We denote the smallest improvement of any 2-change by Δ_min and the smallest improvement of any pair of linked 2-changes of type 0, 1a, or 1b by Δ_link. It will be clear from the context which distance measure is used for Δ_min and Δ_link.
Suppose that the initial tour has a length of at most L. Then 2-opt cannot run for more than L/Δ_min iterations and, because of Lemma 3.8, not for more than O(L/Δ_link) iterations, provided that L/Δ_link = Ω(n²).

The following lemma formalizes this and shows how to bound the expected number of iterations using a tail bound for Δ_min or Δ_link.
Lemma 3.9
Suppose that, with a probability of at least 1 − 1/n!, any tour has a length of at most L. Let α > 0. Then:

1. If P(Δ_min ≤ ε) ≤ α·ε for all ε > 0, then the expected length of the longest path in the 2-opt state graph is bounded from above by O(α·L·n·log n).
2. If P(Δ_min ≤ ε) ≤ α·ε² for all ε > 0, then the expected length of the longest path in the 2-opt state graph is bounded from above by O(L·√α).
3. The same bounds as in (1) and (2) hold if we replace Δ_min by Δ_link, provided that α·L·n·log n = Ω(n²) in Case 1 and L·√α = Ω(n²) in Case 2.
Proof
If the length of the longest tour is longer than L, then we use the trivial upper bound of n! for the number of iterations. Since this happens with a probability of at most 1/n!, it contributes only O(1) to the expected value, which, by slight abuse of mathematical correctness, we ignore in the following.

Consider the first statement. Let T be the length of the longest path in the 2-opt state graph. If T ≥ t, then Δ_min ≤ L/t. Plugging this in and observing that n! is an upper bound for T yields

E(T) = Σ_{t=1}^{n!} P(T ≥ t) ≤ Σ_{t=1}^{n!} min{1, α·L/t} = O(α·L·log(n!)) = O(α·L·n·log n).

Now consider the second statement, and let T be as above. Let t_0 = L·√α. Then

E(T) ≤ t_0 + Σ_{t > t_0} P(T ≥ t) ≤ t_0 + Σ_{t > t_0} α·L²/t² = O(t_0 + α·L²/t_0) = O(L·√α).

Finally, we consider the third statement. The statement follows from the observation that the maximal number of disjoint pairs of linked 2-changes and the length of the longest path in the 2-opt state graph are asymptotically equal if they are of length at least Ω(n²) (Lemma 3.8), and the probability statements become nontrivial only for α·L·n·log n = Ω(n²) in the first and L·√α = Ω(n²) in the second case. □
Manhattan Distances
The essence of our analysis for Manhattan distances is a straightforward adaptation of the analysis in the one-step model. The extra factors compared to the one-step model stem from the bound on the length of the initial tour and from stating the dependence on d explicitly while removing any exponential dependence on d [12, Proofs of Theorem 7 and Lemma 10].
Lemma 3.10
For all ε > 0, we have P(Δ_link ≤ ε) = O(n⁶·d²·ε²/σ²).
Proof
We consider a pair of linked 2-changes as described in Sect. 3.2. The improvement of the first 2-change is

Δ_1 = Σ_{i=1}^d (|x_{1,i} − x_{2,i}| + |x_{3,i} − x_{4,i}| − |x_{1,i} − x_{3,i}| − |x_{2,i} − x_{4,i}|),

where x_{j,i} is the i-th coordinate of x_j. The improvement of the second 2-change is

Δ_2 = Σ_{i=1}^d (|x_{1,i} − x_{3,i}| + |x_{5,i} − x_{6,i}| − |x_{1,i} − x_{5,i}| − |x_{3,i} − x_{6,i}|).

Note that we can have a type 1 pair, i.e., two of the points x_2, x_4, x_5, x_6 can be identical.

Each ordering of the coordinate values x_{1,i}, …, x_{6,i} gives rise to a linear combination for Δ_1 and Δ_2. If we examine the case analysis by Englert et al. [12, Lemmas 11, 12, 13] closely, we see that any pair of linear combinations is either impossible (it uses a different ordering of the variables for Δ_1 and Δ_2, or one of Δ_1 and Δ_2 is non-positive, so the corresponding 2-change is in fact not a 2-change), or there is one variable x_{j,i} that has a non-zero coefficient in Δ_1 and a coefficient of 0 in Δ_2 and another variable x_{j′,i′} that has a non-zero coefficient in Δ_2 and a coefficient of 0 in Δ_1. The absolute values of these non-zero coefficients are 2. Now Δ_1 falls into (0, ε] only if x_{j,i} falls into an interval of length ε/2. This happens with a probability of at most ε/(2σ√(2π)). By independence, the same holds for Δ_2 and x_{j′,i′}.

However, a union bound over all orderings of the coordinate values would incur an extra factor that is exponential in d, and we would like to remove all exponential dependence on d. In order to do this, we assume that we know i and i′ already. This comes at the expense of a factor of d² for taking a union bound over the choices of i and i′. We let an adversary fix the values of all coordinates x_{j,ℓ} with ℓ ∉ {i, i′}. Since we know i and i′, we are left with only a constant number of possible linear combinations.

Finally, the lemma follows by taking a union bound over the O(n⁶) possible pairs of linked 2-changes and the d² choices of i and i′. □
Theorem 3.11
The expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with Manhattan distances is at most O(n⁴·d²·D/σ), where D is chosen as in Lemma 3.1.
Proof
The initial tour has a length of at most L = 2ndD with a probability of at least 1 − 1/n! by Lemma 3.1. We apply Lemma 3.9 for linked 2-changes (Case 2) using Lemma 3.10 and α = O(n⁶·d²/σ²), which yields a bound of O(L·√α) = O(n⁴·d²·D/σ). □
Squared Euclidean Distances
Preparation
In this section, we have d(x, y) = ‖x − y‖² for x, y ∈ R^d.
Assume that we have a 2-change that replaces the edges {x_1, x_2} and {x_3, x_4} by {x_1, x_3} and {x_2, x_4}. The improvement caused by this 2-change is ‖x_1 − x_2‖² + ‖x_3 − x_4‖² − ‖x_1 − x_3‖² − ‖x_2 − x_4‖². Given the positions of all four nodes except for a single one, such a 2-change yields a small improvement only if a certain difference of squared distances involving the remaining point falls into some interval of small size. The following lemma gives an upper bound for the probability that this happens.
Lemma 3.12
Let a, b ∈ R^d with a ≠ b, let ε > 0, and let c be drawn according to a d-dimensional Gaussian distribution with standard deviation σ. Let I ⊆ R be an interval of length ε. Then

P(‖c − a‖² − ‖c − b‖² ∈ I) ≤ ε / (2·‖a − b‖·σ·√(2π)).
Proof
Since Gaussian distributions are rotationally symmetric, we can assume without loss of generality that a = 0 and b = (δ, 0, …, 0) with δ = ‖a − b‖ > 0. Let c = (c_1, …, c_d). Then ‖c − a‖² − ‖c − b‖² = 2δ·c_1 − δ². Thus, ‖c − a‖² − ‖c − b‖² ∈ I if and only if c_1 falls into an interval of length ε/(2δ). Since c_1 is a 1-dimensional Gaussian random variable with a standard deviation of σ, the probability for this is bounded from above by ε/(2δσ√(2π)), since the maximum density of a 1-dimensional Gaussian of standard deviation σ is bounded from above by 1/(σ√(2π)). □
Single 2-Changes
In this section, we prove a simple bound for the expected number of iterations of 2-opt with squared Euclidean distances. This bound holds for all d ≥ 2. In the next section, we improve this bound for the case d ≥ 3 using pairs of linked 2-changes.
Lemma 3.13
For d ≥ 2 and all ε > 0, we have P(Δ_min ≤ ε) = O(n⁴·ε/(σ²·√d)).
Proof
Consider a 2-change where the edges {x_1, x_2} and {x_3, x_4} are replaced by {x_1, x_3} and {x_2, x_4}. Its improvement is given by (‖x_1 − x_2‖² − ‖x_1 − x_3‖²) + (‖x_3 − x_4‖² − ‖x_2 − x_4‖²). We let an adversary fix x_1. Then we draw x_2 and x_3. This fixes the distance ‖x_2 − x_3‖ as well as the first term ‖x_1 − x_2‖² − ‖x_1 − x_3‖², and hence the interval into which the second term must fall. The 2-change yields an improvement of at most ε only if ‖x_3 − x_4‖² − ‖x_2 − x_4‖² falls into an interval of size at most ε. According to Lemma 3.12, the probability that this happens when we draw x_4 is at most min{1, ε/(2‖x_2 − x_3‖σ√(2π))}.

Now let g be the probability density of ‖x_2 − x_3‖. Then the probability that the 2-change yields an improvement of at most ε is bounded from above by

∫_0^∞ g(r)·min{1, ε/(2rσ√(2π))} dr ≤ (ε/(2σ√(2π)))·∫_0^∞ f_chi(r)/r dr = O(ε/(σ²·√d)).

The first step is due to Lemma 3.7. The second step is due to Lemma 3.5 using c = 1, which is allowed since d ≥ 2. The lemma follows by a union bound over the O(n⁴) possible 2-changes. □
Theorem 3.14
For all d ≥ 2, the expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with squared Euclidean distances is at most O(n⁶·√d·D²·log n/σ²), where D is chosen as in Lemma 3.1.
Proof
With a probability of at least 1 − 1/n!, the instance is contained in the hypercube [−D, D]^d. In this case, the longest edge has a squared Euclidean length of at most 4dD². Therefore, the initial tour has a length of at most L = 4ndD². We combine this with Lemmas 3.9 (Case 1 with α = O(n⁴/(σ²√d))) and 3.13 to complete the proof. □
Pairs of Linked 2-Changes
We can obtain a better bound than in the previous section by analyzing pairs of linked 2-changes. With the following three lemmas, we analyze the probability that pairs of linked 2-changes of type 0, 1a, or 1b yield an improvement of at most ε.
Lemma 3.15
For d ≥ 2, the probability that there exists a pair of linked 2-changes of type 0 that yields an improvement of at most ε is bounded from above by O(n⁶·ε²/(σ⁴·d)).
Proof
Consider a fixed pair of linked 2-changes of type 0 involving the six points x_1, …, x_6 as described in Sect. 3.2. We show that the probability that it yields an improvement of at most ε is at most O(ε²/(σ⁴·d)). A union bound over the O(n⁶) possibilities for pairs of type 0 then yields the lemma.

The basic idea is that we restrict ourselves to analyzing the positions of x_4 and x_6 only in order to bound the probability that we have a small improvement. In this way, we use the principle of deferred decisions to show that we can analyze the improvements of the two 2-changes as if they were independent:

- We let an adversary fix x_1 arbitrarily.
- We draw x_2 and x_3, which determines the distance ‖x_2 − x_3‖. This also fixes the position of the "bad" interval for x_4: its size is already fixed since we know the positions of x_2 and x_3, while the position of x_4 is still random.
- We draw x_4. The probability that x_4 assumes a position such that the first 2-change yields an improvement of at most ε is thus at most min{1, ε/(2‖x_2 − x_3‖σ√(2π))} by Lemma 3.12.
- We draw x_5. This determines the distance ‖x_3 − x_5‖.
- We draw x_6. The probability that x_6 assumes a position such that the second 2-change yields an improvement of at most ε is thus at most min{1, ε/(2‖x_3 − x_5‖σ√(2π))}.

Let g be the probability density function of the distance ‖x_2 − x_3‖, and let g′ be the probability density function of the distance ‖x_3 − x_5‖. Since x_4 and x_6 are drawn independently, the probability that both 2-changes of the pair yield an improvement of at most ε is bounded from above by

∫∫ g(r)·g′(r′)·min{1, ε/(2rσ√(2π))}·min{1, ε/(2r′σ√(2π))} dr dr′.

We observe that min{1, ε/(2rσ√(2π))} is monotonically decreasing in r. Thus, by Lemma 3.7, we can replace g and g′ by the density f_chi of the chi distribution to get the following upper bound for the probability that a pair of type 0 yields an improvement of at most ε:

(∫_0^∞ f_chi(r)·min{1, ε/(2rσ√(2π))} dr)² ≤ ((ε/(2σ√(2π)))·∫_0^∞ f_chi(r)/r dr)² = O(ε²/(σ⁴·d)).

Here, we use Lemma 3.5 with c = 1, which is allowed since d ≥ 2. □
Lemma 3.16
For d ≥ 2, the probability that there exists a pair of linked 2-changes of type 1a that yields an improvement of at most ε is bounded from above by O(n⁵·ε²/(σ⁴·d)).
Proof
We can analyze pairs of type 1a in the same way as type 0 pairs in Lemma 3.15. To do this, we analyze the positions of x_2 and x_5:

- We let an adversary fix the position of x_3.
- We draw x_4. This fixes the distance ‖x_3 − x_4‖.
- We draw x_1. This fixes the distance ‖x_1 − x_4‖. In addition, this fixes the positions of the intervals into which x_2 and x_5 must fall if the first or second 2-change yields an improvement of at most ε.
- We draw x_2.
- We draw x_5.

The remainder of the proof is identical to the proof of Lemma 3.15, except that we have to take a union bound over only O(n⁵) possible choices. □
Lemma 3.17
For d ≥ 3, the probability that there exists a pair of linked 2-changes of type 1b that yields an improvement of at most ε is bounded from above by O(n⁵·ε²/(σ⁴·d)).
Proof
Again, we proceed similarly to Lemma 3.15. We analyze a fixed pair of type 1b, where
and
are replaced by
and
in one step and
and
are replaced by
and
, and apply a union bound over the
possible type 1a pairs. We analyze the probability that
or
assume a bad value.
We draw the points in the following order:
We fix
.We draw
. This fixes the distance
, which is crucial for both 2-changes.We draw
.We draw
. The probability that the first 2-change yields an improvement of at most
is at most
.We draw
. The probability that the second 2-change yields an improvement of at most
is at most
.
The main difference to Lemma 3.15 is that the sizes of the bad intervals are not independent. However, once the size of the bad intervals is fixed, we can analyze the probabilities that
or
fall into their bad intervals as independent. Given that
is fixed, the probability that the first and the second 2-change yield an improvement of at most
is bounded from above by
. Since this is decreasing in
, we can replace the distribution of
by the chi distribution to obtain an upper bound according to Lemma 3.7. Thus, using Lemma 3.5 with
and
, we obtain the following upper bound for the probability that a pair of type 1b yields an improvement of at most
:
[Equation]
With the three lemmas above, we can obtain a bound on the expected number of iterations of 2-opt for TSP with squared Euclidean distances.
Theorem 3.18
For
, the expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with squared Euclidean distances is at most
.
Proof
The probability that any pair of linked 2-changes of type 0, 1a, or 1b yields an improvement of at most
is bounded from above by
. We apply Lemma 3.9 with
and observe that the initial tour has a length of at most
with a probability of at least
. 
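To make the object of study concrete, the following sketch (our own illustration, not code from the paper) runs the plain 2-opt heuristic under squared Euclidean distances on Gaussian-perturbed points and counts the number of improving 2-changes until a local optimum is reached; all names and the improvement threshold are ours.

```python
import random

def sq_dist(p, q):
    # squared Euclidean distance between two points
    return sum((a - b) ** 2 for a, b in zip(p, q))

def two_opt(points, dist):
    """Run 2-opt to a local optimum; return the final tour and the
    number of improving 2-changes performed."""
    n = len(points)
    tour = list(range(n))
    steps = 0
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            # avoid pairing the edge (tour[0], tour[1]) with (tour[n-1], tour[0])
            for j in range(i + 2, n if i > 0 else n - 1):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # improvement of the 2-change replacing {a,b}, {c,d} by {a,c}, {b,d}
                delta = dist(points[a], points[b]) + dist(points[c], points[d]) \
                    - dist(points[a], points[c]) - dist(points[b], points[d])
                if delta > 1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    steps += 1
                    improved = True
    return tour, steps

random.seed(0)
pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(30)]
tour, steps = two_opt(pts, sq_dist)
```

The variable `steps` corresponds to one path in the 2-opt state graph; the theorem above bounds the expectation of the longest such path.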
Euclidean Distances
Differences of Euclidean Distances
In this section, we have
for
. Analyzing
turns out to be more difficult than analyzing
in the previous section. In particular, the case when
is close to its maximal value of
requires special attention. Intuitively, this is for the following reason: if
, then z is close to L(a, b). Assume that
for the moment. Then either z lies between a and b, which is fine, or it does not; in the latter case, moving z in the direction of L(a, b) does not change
at all.
We observe that
behaves essentially 2-dimensionally: it depends only on the distance of z from L(a, b) (this is x in the following lemma) and on the position of the projection of z onto L(a, b) (this is y in the following lemma). It also depends on the distance
between a and b (this is
in the following lemma, and we had this dependency also in the previous section about squared Euclidean distances). The following lemma makes the connection between x and y explicit for a given
. Figure 1 depicts the situation described in the lemma.
Fig. 1.

The situation for Lemma 3.19
Lemma 3.19
Let
,
,
. Let
and
be two points at a distance of
. Let
. Then we have
[Equation (2)]
for
and
[Equation (3)]
for
. Furthermore,
is impossible.
Proof
The last statement follows from the triangle inequality.
We have
. Rearranging terms and squaring implies
[Equation]
Squaring again yields
[Equation]
By rearranging terms again, we obtain
[Equation]
Using the assumption
or
implies the two claims. 
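As a quick numeric sanity check of the setup above (our own, with hypothetical coordinates), the sketch below places a and b on a horizontal line and verifies that the difference of Euclidean distances is determined by the height x of z above the line, the projection position y, and the distance between a and b, and that its absolute value never exceeds that distance, matching the triangle-inequality statement of the lemma.

```python
import math, random

def diff(delta, x, y):
    """d(a, z) - d(b, z) with a = (0, 0), b = (delta, 0), and z having
    height x above the line through a and b and projection position y."""
    a, b, z = (0.0, 0.0), (delta, 0.0), (y, x)
    return math.dist(a, z) - math.dist(b, z)

random.seed(1)
for _ in range(1000):
    delta = random.uniform(0.1, 5.0)
    x = random.uniform(0.0, 5.0)
    y = random.uniform(-5.0, 5.0)
    # triangle inequality: the difference can never exceed d(a, b) = delta
    assert abs(diff(delta, x, y)) <= delta + 1e-9

# the extreme value delta is attained exactly when z lies on the line
# through a and b but not between them (here: beyond b)
extreme = diff(2.0, 0.0, 5.0)
```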
As said before, the difficult case in analyzing
is when
. In terms of the previous lemma, this can only happen if x is small, i.e., if z is close to L(a, b), but not between a and b. The following lemma makes a quantitative statement about this connection.
Lemma 3.20
Let
. Assume that
and that z has a distance of x from L(a, b). Then
[Equation (4)]
Proof
Let y be the distance of z from
, and let
. Then, according to (3), we have
[Equation]
We have
. This and the upper bound
yield the following weaker bound:
[Equation (5)]
We distinguish two cases. The first case is that
. In this case, it suffices to show that
in order to prove (4). Since
, this holds because
.
The second case is that
. We have
[Equation]
Replacing
by
in the numerator and
by
in the denominator of (5), we obtain
[Equation]
Rearranging terms completes the proof. 
In order to be able to apply Lemma 3.4, we need the following upper bound on the derivative of y with respect to
, given that x is fixed.
Lemma 3.21
For
, let
with
. Assume further that
and that
. Then the derivative of y with respect to
is bounded by
[Equation]
Proof
The derivative of y with respect to
is given by
[Equation]
We observe that
for all x and allowed choices of
and
. For the second term, we have
[Equation]
By assumption, we have
and
. Thus, we have
[Equation]
Using Lemmas 3.21 and 3.4, we can bound the probability that
assumes a value in an interval of size
.
Lemma 3.22
Let
. Let
be arbitrary,
, and let z be drawn according to a Gaussian distribution with standard deviation
. Let
. Let I be an interval of length
. Then
[Equation]
Proof
We assume throughout this proof that
. The case that this is not satisfied is taken care of by the second term in the upper bound for the probability in the statement of the lemma.
Let x denote the distance of z to L(a, b), and let y denote the position of the projection of z onto L(a, b). First, let us assume that x is fixed. Then, by Lemmas 3.21 and 3.4, the probability that
is bounded from above by
[Equation]
Here, the requirements of Lemma 3.21 are satisfied because of Lemma 3.20, or we have
.
We observe that this probability is decreasing in x. Thus, in order to get an upper bound for the probability with random x, we can use the
-dimensional chi distribution for x according to Lemma 3.7. We obtain
[Equation]
by Lemma 3.5 using
and
. Since
, the lemma follows. 
Analysis of Pairs of 2-Changes
We immediately go to pairs of linked 2-changes, as these yield the better bounds.
Lemma 3.23
For
, the probability that a pair of linked 2-changes of type 0 yields an improvement of at most
or some point lies outside
is bounded from above by
[Equation]
Proof
We proceed similarly as in the proof of Lemma 3.15 for type 0 pairs for squared Euclidean distances. We draw the points of a fixed pair of linked 2-changes as in the proof of Lemma 3.15.
In the same way as in the proof of Lemma 3.15, using Lemma 3.22 instead of Lemma 3.12, we obtain that the probability that one fixed 2-change of the pair yields an improvement of at most
is bounded from above by
[Equation]
Here, we applied Lemma 3.5 with
.
Again in the same way as in the proof of Lemma 3.15, we can analyze both 2-changes of the type 0 pair as if they were independent. Finally, the lemma follows by a union bound over the
possibilities for a type 0 pair. 
Lemma 3.24
For
, the probability that a pair of linked 2-changes of type 1a yields an improvement of at most
or some point lies outside
is bounded from above by
[Equation]
Proof
The lemma can be proved in the same way as Lemma 3.16 with differences analogous to the proof of Lemma 3.23. 
Lemma 3.25
For
, the probability that a pair of linked 2-changes of type 1b yields an improvement of at most
or some point lies outside
is bounded from above by
[Equation]
Proof
Similar to the proof of Lemma 3.17 and using Lemma 3.22, the probability that the two 2-changes of the pair both yield an improvement of at most
is bounded from above by
[Equation]
Now the lemma follows by applying Lemma 3.5 with
. 
Theorem 3.26
For
, the expected length of the longest path in the 2-opt state graph corresponding to d-dimensional instances with Euclidean distances is at most
.
Proof
We have
by Lemmas 3.23, 3.24, and 3.25. If all points are in
, then the longest edge has a length of
. Thus, the initial tour has a length of at most
. Plugging this into Lemma 3.9 yields the result. 
Smoothed Analysis of the Approximation Ratio
Technical Preparation
The following standard lemma provides a convenient way to bound the deviation of a perturbed point from its mean in the two-step model.
Lemma 4.1
(Chi-square bound [28, Cor. 2.19]) Let x be a Gaussian random vector in
of standard deviation
centered at the origin. Then, for
, we have 
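Since the display of the bound is not reproduced above, the following Monte Carlo sketch (names and the threshold parametrization are our assumptions) merely illustrates the qualitative content of Lemma 4.1: the norm of a centered Gaussian vector exceeds a threshold of the form t·σ·√d only with rapidly decaying probability.

```python
import math, random

def tail_prob(d, sigma, t, trials=20000):
    """Empirically estimate P(||x|| >= t * sigma * sqrt(d)) for a Gaussian
    vector x in R^d with independent N(0, sigma^2) coordinates.
    The threshold parametrization t * sigma * sqrt(d) is our assumption."""
    rng = random.Random(2)
    thresh = t * sigma * math.sqrt(d)
    hits = 0
    for _ in range(trials):
        norm = math.sqrt(sum(rng.gauss(0, sigma) ** 2 for _ in range(d)))
        if norm >= thresh:
            hits += 1
    return hits / trials

# the tail probability decays quickly in t (here d = 3, sigma = 1)
p_moderate = tail_prob(3, 1.0, 1.5)
p_large = tail_prob(3, 1.0, 2.5)
```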
To give large-deviation bounds on sums of independent variables with bounded support, we will make use of a standard Chernoff-Hoeffding bound.
Lemma 4.2
(Chernoff-Hoeffding Bound [9, Exercise 1.1]) Let
, where
are independently distributed in [0, 1], and
. Then, for
, we have
[Equation]
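Analogously, a small Monte Carlo sketch (our own; the exact form of the bound is not reproduced above) illustrating Lemma 4.2 for sums of independent Bernoulli variables: larger relative deviations from the mean are far less likely.

```python
import random

def upper_tail(n, p, eps, trials=5000):
    """Estimate P(X >= (1 + eps) * mu) for X a sum of n independent
    Bernoulli(p) variables with mean mu = n * p."""
    rng = random.Random(3)
    mu = n * p
    hits = 0
    for _ in range(trials):
        x = sum(1 for _ in range(n) if rng.random() < p)
        if x >= (1 + eps) * mu:
            hits += 1
    return hits / trials

# a larger relative deviation is exponentially less likely
p_small_dev = upper_tail(500, 0.2, 0.1)
p_large_dev = upper_tail(500, 0.2, 0.5)
```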
Throughout this part of the paper, we assume that the dimension
is a fixed constant. Given a sequence of points
, we call a collection
of edges a tour if T is connected and every
has in- and outdegree exactly one in T. Note that we consider directed tours, which is useful for the analysis in this part of the paper, but our distances are always symmetric.
Given any collection of edges S, its length is denoted by
, where d(u, v) denotes the Euclidean distance
between points
and
.
We call a collection
a partial 2-optimal tour if T is a subset of a tour and
holds for all edges
. Our main interests are the traveling salesperson functional
as well as the functional
that maps the point set X to the length of the longest 2-optimal tour through X.
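The 2-optimality condition for a collection of edges can be checked directly: assuming the standard condition that for every pair of edges (u, v) and (x, y), replacing them by (u, x) and (v, y) must not decrease the total length, the following sketch (names and the square example are ours) applies this check to edge collections.

```python
import math
from itertools import combinations

def is_two_optimal(points, edges):
    """Check the (assumed) 2-optimality condition for a collection of
    directed edges: for every pair of edges (u, v), (x, y) it must hold
    that d(u, v) + d(x, y) <= d(u, x) + d(v, y)."""
    d = lambda i, j: math.dist(points[i], points[j])
    for (u, v), (x, y) in combinations(edges, 2):
        if d(u, v) + d(x, y) > d(u, x) + d(v, y) + 1e-12:
            return False
    return True

# four corners of a unit square: the perimeter tour is 2-optimal,
# while a tour using both diagonals admits an improving 2-change
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
perimeter = [(0, 1), (1, 2), (2, 3), (3, 0)]
crossing = [(0, 2), (2, 1), (1, 3), (3, 0)]
```

Any subset of the edges of `perimeter` is a partial 2-optimal tour in the sense defined above.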
We note that the results in Sect. 4.2 hold for metrics induced by arbitrary norms in
(Lemmas 4.4 and 4.5) or typical
norms (Lemmas 4.6 and 4.7), not only for the Euclidean metric. We conjecture that the upper bound in Sect. 4.3 also holds for more general metrics, while the lower bound in Sect. 4.4 is probably specific to the Euclidean metric. Still, we think that the construction can be adapted to work for most natural metrics.
For obtaining lower bounds on the length of optimal tours, we consider the boundary functional
:
is the length of the shortest tour through all points in X, where we are allowed to connect points directly or to the boundary, and traversing the boundary of
has zero cost. For a proof of the following lemma and for more details about boundary functionals of Euclidean optimization problems, we refer to the monograph by Yukich [31].
Lemma 4.3
(Boundary Functional [31, Lemma 3.7]) There is a constant
such that for all sets
of n points, we have
.
Length of 2-Optimal Tours under Perturbations
In this section, we provide an upper bound for the length of any 2-optimal tour and a lower bound for the length of any global optimum. These two results yield an upper bound of
for the approximation ratio.
Chandra et al. [7] proved a bound on the worst-case length of 2-optimal tours that, in fact, already holds for the more general notion of partial 2-optimal tours. For an intuition for why this is true, note that their proof strategy is to argue that, due to the 2-optimality of the edges, not too many long arcs in a tour may have similar directions, while short edges do not contribute much to the length. The claim then follows from a packing argument. It is straightforward to verify that the collection of edges is never required to be closed or connected.
Lemma 4.4
(Length of partial 2-optimal tours [7, Theorem 5.1], paraphrased) There exists a constant
such that for every sequence X of n points in
, any partial 2-optimal tour has length less than
.
While this bound directly applies to any perturbed instance under the one-step model, Gaussian perturbations fail to satisfy the premise of bounded support in
. However, Gaussian tails are sufficiently light to enable us to translate the result to the two-step model by carefully handling outliers.
Lemma 4.5
There exists a constant
such that for any
the following statement holds. For any
, the probability that any partial 2-optimal tour on
has length greater than
, i.e.,
, is bounded by
. Furthermore,
[Equation]
Proof
By translation, assume without loss of generality that the input points are contained in
. We define cubes
with
. The side length of cube
is
. We consider the partitioning of
into the regions
and
for
. For some cube C and any tour T, let
denote the edges in T that are completely contained in C. For any tour T, the sequence
defined by
and
, for
, partitions the edges of T. Thus,
.
For any outcome of the perturbed points, let T be the longest 2-optimal tour. Then, each
is a partial 2-optimal tour in
. Let
be the (random) number of points in
, which is an upper bound on the number of points in
. At most
vertices are incident to the edges
, since each such edge is incident to at least one endpoint in
and every point has degree 2 in T. Since
is a translated unit cube scaled by
, Lemma 4.4 yields
.
Observe that
is not contained in
only if its origin has been perturbed by noise of length at least
. Thus, let
and note that
implies that
. Hence, for each point
, Lemma 4.1 yields
[Equation]
By linearity of expectation, we conclude that
for
. This yields
[Equation]
where we used Jensen’s inequality for the first inequality.
To derive tail bounds for the length of any 2-optimal tour, let
be the upper bound on
derived above. By the Chernoff bound (Lemma 4.2), we have
[Equation]
This guarantee is only strong as long as
is sufficiently large. Hence, we use this guarantee only for
, where
is chosen such that
. Assume that
for all
. Then, analogously to the above calculation, the contribution of
is bounded by
[Equation]
Let
denote the probability that some
fails to satisfy
. Then,
[Equation]
Let us continue assuming that all
satisfy
. Since in particular
, at most
vertices remain outside
. Let
. By a union bound,
[Equation]
Assume that the corresponding event holds (i.e.,
); then the remaining points outside
(and hence, outside
) are contained in
. We conclude that, with probability at least
, we have
[Equation]
This finishes the proof, since we have shown that with probability
, both the contribution of
and
is bounded by
. 
We complement the bound above by a lower bound on tour lengths of perturbed inputs, making use of the following result by Englert et al. [12] for the one-step model.
Lemma 4.6
(Englert et al. [12, Proof of Theorem 1.4]) Let
be a
-perturbed instance. Then with probability
, any tour on
has length at least
.
It also follows from their results that this bound translates to the two-step model consistently with the intuitive correspondence of
between the one-step and the two-step model.
Lemma 4.7
Let
be an instance of points in the unit cube perturbed by Gaussians of standard deviation
. Then with probability
any tour on
has length at least
.
Proof
We summarize the arguments of Englert et al. [12, Section 6] first, who considered truncated Gaussian perturbations: Here, we condition the Gaussian perturbation
for each input point
to be contained in
for some
. Conditioned on this event, the resulting input instance is contained in the cube
. By straightforward calculations, the conditional distribution of each point in C has maximum density bounded by
. Moreover, the probability that the condition fails for a single point is bounded by
for all i. Thus, by choosing
sufficiently large, each point has at least constant probability to satisfy the condition
.
Given any instance (with Gaussian perturbations which are not truncated), first reveal the (random) subinstance of those points for which the condition
is satisfied and let
be the number of such points. By the Chernoff bound (Lemma 4.2), and
, we have
for some
with probability at least
. If this event occurs, we obtain a random instance of
points and maximum density
. Hence an application of Lemma 4.6 yields that, for some constant
, the probability that a tour of length less than
exists is at most
. 
Note that Lemmas 4.5 and 4.7 almost immediately yield the following bound on the approximation performance for the two-step model. (The large-deviation bound is immediate. For the expected approximation ratio, we make use of the older and non-tight worst-case bound of
, given in Lemma 4.9 below.)
Observation 4.8
Let
be an instance of points in the unit cube perturbed by Gaussians of standard deviation
. Then the approximation performance of 2-Opt is bounded by
in expectation and with probability
.
We remark that this bound is best possible for an analysis of perturbed instances that separately bounds the lengths of any 2-optimal tour from above and gives a lower bound on any optimal tour. To see this, we argue that Lemma 4.6, Lemma 4.4 (even under
-perturbed input), Lemma 4.7, and Lemma 4.5 cannot be improved in general. This is straightforward for Lemma 4.6, since n points distributed uniformly at random in a cube of volume
always have, by scaling and Lemma 4.4, a tour of length
. Hence, the lower bound on optimal tours on perturbed instances is tight. To see that the upper bound on any 2-optimal tour is tight, take n uniformly distributed points that have, by Lemma 4.6, an optimal tour of length
with high probability and thus also in expectation.
Naturally, this transfers to the case of Gaussian perturbations, although it is more technical to verify: If we place n identical points in
, say at the origin, and perturb them with Gaussians of standard deviation
, then we may without loss of generality scale the unit cube to
and perturb the points with standard deviation 1 instead. By Lemma 4.5, any 2-optimal tour and, thus, any optimal tour on these points has a length of
on the scaled instance, since the origins are still contained in the unit cube. Thus, the optimal tour on the original instance has a length of at most
in expectation and with high probability.
We only sketch that 2-optimal tours can have a length of at least
: We distribute the n (unperturbed) points into
groups of
points each, and we partition the cube
into
subcubes of equal side length. Let
be a constant such that with high probability, at least
points of a group remain in their subcube after perturbation. We call these points successful. Since successful points are identically distributed, conditioned on falling into a compact set, the shortest tour through these (at least)
points has a length of at least
for some other constant
[31]. (This is just a scaled version of perturbing and truncating a Gaussian of standard deviation 1 to a unit hypercube, which would result in a tour length of
for m points.) By closeness of the tour on all points to the boundary functional and geometric superadditivity of the boundary functional (see Yukich [31] for details), it follows that the optimal tour on all successful points has a length of at least
.
Upper Bound on the Approximation Performance
In this section, we establish an upper bound on the approximation performance of 2-Opt under Gaussian perturbations. We achieve a bound of
. Due to the lower bound presented in Sect. 4.4, improving the smoothed approximation ratio to
is impossible. Thus, our bound is almost tight.
As noted in the previous section, to beat
it is essential to exploit the structure of the unperturbed input. This will be achieved by classifying edges of a tour into long and short edges and bounding the length of long edges by a (worst-case) global argument and short edges locally against the partial optimal tour on subinstances (by a reduction to an (almost-)average case). The local arguments for short edges will exploit how many unperturbed origins lie in the vicinity of a given region.
The global argument bounding long edges follows from the older
bound on the worst-case approximation performance [7] that we rephrase here for our purposes.
Lemma 4.9
(Chandra et al. [7, Proof of Theorem 4.3]) Let T be a 2-optimal tour and
denote the length of the optimal traveling salesperson tour
. Let
contain the set of all edges in T whose length is in
. Then
. In particular, it follows that
.
In the proof of our bound of
, the above lemma accounts for all edges of length
. A central idea to bound all shorter edges is to apply the one-step model result to small parts of the input space. In particular, we will condition sets of points to be perturbed into cubes of side length
. The following technical lemma helps to capture what values of
suffice to express the conditional density function of these points depending on the distance of their unperturbed origins to the cube. This allows for appealing to the one-step model result of Lemma 4.6.
Lemma 4.10
Let
and
. Let Y be the random variable
conditioned on
and
be the corresponding probability density function. Then
is bounded from above by
.
Proof
Let
be the probability density function of X. Let
be the point in Q that is closest to c. Then, since
is rotationally invariant around c and decreasing in
, the density
inside Q is maximized at
. Likewise,
minimizes the density inside Q. Since Q is a
-cube in
,
, where
denotes the all-ones vector. Given
, we can thus bound the conditional probability density function
for
by
[Equation]
It remains to bound, for
,
[Equation]
Since for all
,
, we can bound
, yielding the claim. 
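A numeric sanity check (our own, with hypothetical parameters) of the extremal-point argument in the proof above: for a Gaussian centered outside a small cube, the density over the cube is maximized and minimized at the cube's closest and farthest points from the center, so its variation over the cube is a bounded factor.

```python
import math

def gauss_density(p, c, sigma):
    """Density at p of a spherical Gaussian centered at c with std. dev. sigma."""
    d = len(p)
    sq = sum((a - b) ** 2 for a, b in zip(p, c))
    return math.exp(-sq / (2 * sigma ** 2)) / ((2 * math.pi) ** (d / 2) * sigma ** d)

# a small cube Q = [1, 1.25]^2 and a Gaussian centered at c outside of Q;
# for this configuration the closest and farthest points of Q from c are
# corners, so the density over Q is extremal at corners
c, sigma, xi = (3.0, 0.0), 1.0, 0.25
corners = [(x, y) for x in (1.0, 1.0 + xi) for y in (1.0, 1.0 + xi)]
vals = [gauss_density(p, c, sigma) for p in corners]
ratio = max(vals) / min(vals)  # bounded variation of the density over Q
```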
The main result of this section is the following theorem, which will be proved in the remainder of the section.
Theorem 4.11
Let
be an instance of points in
perturbed by Gaussians of standard deviation
. With probability
for any constant
, we have
. Furthermore,
[Equation]
Since the approximation performance of 2-Opt is bounded by
in the worst case, we may assume that
for all constant
, since otherwise our smoothed result is superseded by Lemma 4.9. Furthermore, we may also assume that
, since otherwise Observation 4.8 already yields the result. In what follows, let
and T be any optimal and longest 2-optimal, respectively, traveling salesperson tour on
. Furthermore, we let
denote the length of the shortest traveling salesperson tour.
Outliers and Long Edges
We will first show that the contribution of almost all points outside
is bounded by
with high probability and in expectation, similar to Lemma 4.5. For this, we define growing cubes
, where we set
for
and
. Let
be the number of points not contained in
. For every point
, Lemma 4.1 with
bounds
(note that we have chosen the
such that
). Thus,
. We define
as the set of edges of the longest 2-optimal tour T contained in
with at least one endpoint in
. We first bound the contribution of the edges in
with
.
Lemma 4.12
With probability
for any constant
, we have
[Equation]
In addition, we have
.
Proof
The proof is analogous to the proof of Lemma 4.5. Linearity of expectation, Lemma 4.4, and Jensen’s inequality yield
[Equation]
By observing that
is bounded by a constant, we conclude that
is bounded by
.
Let
be the upper bound on
derived above. By the Chernoff bound (Lemma 4.2), we have
[Equation]
Choose
such that
. Thus,
. Assume that
for all
. Then, analogously to the above calculation, the contribution of
is bounded by
[Equation]
Note that the probability that some
fails to satisfy
is bounded by
[Equation]
for any constant
. Since
, at most
vertices remain outside
. Let
. By a union bound, for any constant
,
[Equation]
Assume that we have the (very likely) event that all points are in
, then the remaining points outside
are contained in
. We conclude that
[Equation]
In the remainder of the proof, we bound the total length of edges inside
. Define
and note that all edges in C have bounded length
. We let
contain the set of all those edges within C (in the longest 2-optimal tour T) whose lengths are in
. Let
be such that
. Then
for all
, since no longer edges exist. Let
be such that
. Then
by Lemma 4.9. This argument bounds the contribution of long edges, i.e., edges longer than
, in the worst case, after observing the perturbation of the input points. It remains to bound the length of short edges in C, which we do in the next section.
Short Edges
To account for the length of the remaining edges, we take a different route than for the long edges: Call an edge that is shorter than
a short edge and partition the bounding box
into a grid of
-cubes
with
, which we call cells. All edges in
for
, i.e., short edges, are completely contained in a single cell or run from some cell
to one of its
neighboring cells. For a given tour T, let
denote the short edges of T for which at least one of the endpoints lies in
.
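The grid bookkeeping above can be sketched as follows (assuming, consistently with the 3^d − 1 neighbouring cells mentioned above, that the distance between cells is the maximum coordinate-wise index difference; names are ours).

```python
def cell_index(point, xi):
    """Map a point to the index tuple of its grid cell of side length xi."""
    return tuple(int(c // xi) for c in point)

def cell_distance(q1, q2):
    """Distance between two cells as the maximum coordinate-wise index
    difference, so the distance-1 cells are the 3^d - 1 neighbours."""
    return max(abs(a - b) for a, b in zip(q1, q2))

xi = 0.25
pts = [(0.1, 0.1), (0.2, 0.24), (0.3, 0.7), (0.9, 0.9)]
cells = [cell_index(p, xi) for p in pts]
```

An edge shorter than xi then either stays inside one cell or connects two cells at distance 1.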
We aim to relate the length of the edges
for the longest 2-optimal tour T to the length of the edges
of the optimal tour
. This local approach is justified by the following property.
Lemma 4.13
For any tour
, the contribution
of cell
is lower bounded by
.
Proof
Consider all edges S in
that have at least one endpoint in
. Replacing those edges
with
and
by the shortest edge connecting u to the boundary of
does not increase the total edge length by triangle inequality. If
were the unit cube,
would thus be lower bounded by the boundary functional
. Instead, we scale the instance
by
to obtain an instance
in the unit cube, satisfying
and, as argued above,
. Thus an application of Lemma 4.3 yields
[Equation]
Intuitively, a cell
is of one of two kinds: either few points are expected to be perturbed into it and hence it cannot contribute much to the length of any 2-optimal tour (a sparse cell), or many unperturbed origins are close to the cell (a heavy cell). In the latter case, either the conditional densities of points perturbed into
are small, hence any optimal tour inside
has a large value by Lemma 4.6, or we find another cell close to
that has a very large contribution to the length of any tour.
To formalize this intuition, fix a cell
and let
be the expected number of points
with
. Assume for convenience that
and
are integers. We describe the position of a cube
canonically by indices
. For two cells
and
, we define their distance as
. For
, let
denote all cells of distance k to
and let
denote the cardinality of unperturbed origins located in a cell in
. We call a perturbed point
with unperturbed origin
, for some
, a k-successful point. Let
denote the set of all k-successful points. Then
.
Our first technical lemma shows that any cell
, having (in expectation) a large number
of points perturbed into it from cells of distance at most K, contributes at least
to the length of the optimal tour.
Lemma 4.14
Let
and define
as the set of k-successful points for
. Let
. If
, then with probability
, we have
[Equation]
Proof
Note that by Lemma 4.13,
. Fix any realization of
, i.e., a choice of unperturbed origins inside some cell in
whose perturbed points fall into
. We can simulate the distribution of
(under this realization of
) by appealing to the one-step model. Note that each point in
is distributed as a Gaussian conditioned on containment in cell
. By rotational invariance of the Gaussian distribution, Lemma 4.10 is applicable and bounds the conditional density function of each point in
by
. By scaling, we obtain an instance in the unit cube with
points distributed according to density functions of maximum density
. Hence, by Lemma 4.6 we obtain that any tour has length
on the scaled instance with probability
. Scaling back to
, we obtain
. Since by Chernoff bounds (Lemma 4.2),
with probability
, we finally obtain, using Lemma 4.13,
[Equation]
with probability
, where we used that
. 
The following simple technical lemma shows that with constant probability, a point is perturbed into the cell it originates in.
Lemma 4.15
Let
and
. Then
.
Proof
Let
be the probability density function of Z. For all
, we have
and hence
. This yields
[Equation]
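A Monte Carlo sketch (our own; the lemma's exact constants are not reproduced above) of the statement: once the cell side is a constant multiple of the standard deviation, a point is perturbed into its own cell with constant probability. Placing the origin at the cell's center is our simplifying assumption.

```python
import random

def stay_prob(side, sigma, d=2, trials=20000):
    """Estimate the probability that a point at the center of a d-dimensional
    cell of side length `side`, perturbed by independent N(0, sigma^2) noise
    per coordinate, lands inside its own cell."""
    rng = random.Random(4)
    hits = 0
    for _ in range(trials):
        if all(abs(rng.gauss(0, sigma)) <= side / 2 for _ in range(d)):
            hits += 1
    return hits / trials

# for a side length that is a constant multiple of sigma, the probability
# of staying inside the cell is bounded below by a constant
p_stay = stay_prob(side=2.0, sigma=1.0)
```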
We are now set up to formally prove the classification of heavy cells. Recall that
denotes the number of cells
.
Lemma 4.16
Let
,
and
for sufficiently small constants
. Then we can classify each cell
with
into one of the following two types.
- (T1) With probability
for any constant
, we have 
- (T2) There is some
such that for any
, we have
with probability
for any constant
.
Proof
We start with some intuition. By Lemma 4.4, we can bound
. If we have
, then Lemma 4.14 already proves
to have type (T1). Otherwise, by tail bounds for the Gaussian distribution, we argue that some cell
in distance at most
contains at least
unperturbed origins. These are sufficiently many to let
contribute
, for any
, to the optimal tour length.
To make the intuition formal, note that all edges in
are contained in a cube of side length
around
. By Chernoff bounds (Lemma 4.2), at most
points are contained in
with probability
. Hence, Lemma 4.4 bounds
[Equation (6)]
with probability
.
Case 1:
. In this case, we may appeal to Lemma 4.14 (since
) and obtain
[Equation (7)]
with probability
, since
and
can be chosen sufficiently small. By a union bound, (6) and (7) hold with probability
for any constant
, proving that
has type (T1).
Case 2:
. Every point in
has an
-distance of at least
to every point in
. Thus, by Lemma 4.1, we have
[Equation (8)]
for sufficiently large k. Since
, we can choose a sufficiently small constant
such that
satisfies
. From
, we conclude
[Equation]
Hence, we have
[Equation]
By (8), it follows that
[Equation]
unperturbed origins are situated in cells in distance
from
. Note that there are at most
such cells and
for any
. By the pigeonhole principle, there is a cell
with
many unperturbed origins.
Let
be the 0-successful points for cell
, i.e., the points with origin in
that are perturbed into
. By Lemma 4.15, each unperturbed origin
has constant probability to be perturbed into
, i.e.,
. Hence,
. Thus, Lemma 4.14 bounds
[Equation (9)]
with probability
. Since (6) and (9) hold simultaneously with probability
for any constant
, this proves that
has type (T2).
Total Length of 2-Optimal Tours
With the analyses of the previous subsections, we can finally bound the total length of 2-optimal tours. To bound the total length of short edges, consider first sparse cells
, i.e., cells containing
perturbed points in expectation (recall that
, where
is the number of cells). For each such cell, the Chernoff bound (Lemma 4.2) yields that with probability
, at most
points are contained in
, since each point is perturbed independently. By a union bound, no sparse cell contains more than
points with probability at least
for any constant
. In this event, Lemma 4.4 allows for bounding the contribution of sparse cells by
[Equation (10)]
For bounding the length in the remaining cells (the heavy cells), let
and
. We observe the following: with probability at least
, all type-(T1) cells
satisfy
. Thus,
[Equation (11)]
where the last inequality follows from
, which holds since every edge in
(inside C) is counted at most twice on the left-hand side.
Let
be any function that assigns to each cell
of type-(T2) a corresponding cell
satisfying the condition (T2). We say that
charges
. We can choose any
and have with probability at least
that
for all
. Assume that this event occurs. Since every cell
can only be charged by cells in distance
, each cell can only be charged
times. Hence,
[Equation]
Since
, choosing
sufficiently large yields
[Equation (12)]
Proof of Theorem 4.11
By a union bound, we can bound by
, for any constant
, the probability that (i)
(by Lemma 4.7), (ii) all edges outside C contribute
(by Lemma 4.12), (iii) all sparse cells contribute
(by (10)), (iv) the type-(T1) cells
induce a cost of
(by (11)), and (v) the type-(T2) cells induce a cost of
(by (12)). Since the remaining edges are long edges and contribute only
, we obtain that every 2-optimal tour has a length of at most
with probability
.
Since a 2-optimal tour always constitutes a
-approximation to the optimal tour length by Lemma 4.9, we also obtain that the expected cost of the worst 2-optimal tour is bounded by
[Equation]
Lower Bound on the Approximation Ratio
We complement our upper bound on the approximation performance by the following lower bound: for
, the worst-case lower bound is robust against perturbations. For this, we face the technical difficulty that in general, a single outlier might destroy the 2-optimality of a desired long tour, potentially cascading into a series of 2-Opt iterations that result in a substantially different or even optimal tour.
Theorem 4.17
Let
. For infinitely many n, there is an instance X of points in
perturbed by normally distributed noise of standard deviation
such that with probability
for any constant
, we have
. This also yields
[Equation]
We remark that our result transfers naturally to the one-step model with
and, interestingly, holds with probability 1 over such random perturbations.
Proof of Theorem 4.17. We alter the construction of Chandra et al. [7] to strengthen it against Gaussian perturbations with standard deviation
(see Fig. 2). Let
be an odd integer and
. The original instance of [7] is a subset of the
-grid, which we embed into
by scaling by 1/P, and consists of three parts
,
and
. The vertices in
are partitioned into the layers
. Layer i consists of
equidistant vertices, each of which has a vertical distance of
to the point above it in Layer
and a horizontal distance of
to the nearest neighbor(s) in the same layer. The set
is a copy of
shifted to the right by a distance of 2/3. The remaining part
consists of a copy of Layer p of
shifted to the right by 1/3 to connect
and
by a path of points. We regard
as the set of Layer-i points in
.
Fig. 2.
Parts
and
of the lower bound instance. Each point is contained in a corresponding small container (depicted as brown circle) with high probability. The black lines indicate the constructed 2-optimal tour, which runs analogously on 
As in the original construction, we will construct an instance of
points, which implies
. Let
be the largest odd integer such that
. In our construction, we drop all Layers
in both
and
, as well as Layer p in
. Instead, we connect
and
already in Layer t by an altered copy of Layer t of
shifted to the right by 1/3. Let C be an arbitrary point of our construction; for convenience, we will use the central point of Layer t in
. We introduce
additional copies of this point C. These surplus points serve as a “padding” of the instance to ensure
. Note that the resulting instance has
layers
. We choose t such that the magnitude of perturbation is negligible compared to the pairwise distances of all non-padding points. Furthermore, the restriction on
ensures that incorporating the padding points increases the optimal tour length only by a constant.
Lemma 4.18
With probability
for any constant
, the optimal tour has length O(1).
Proof
Let n be the number of points in the constructed instance. Note that
consists of (i) a subset
of the instance of Chandra et al. [7], plus (ii) an additional copy
of Layer t and (iii) the padding points
in
. Denote the number of points in
by
. We have
![]() |
by choice of t. Hence
. It is easy to see [7] that the original instance of Chandra et al. has a minimum spanning tree of length
. (This is achieved by the spanning tree that includes, for each Layer-i vertex with
, the vertical edge to the point above it, and each edge between consecutive points on Layer p.) Clearly,
![]() |
Consider the perturbed instance
. Note that for every constant
, we have
for sufficiently large n. Thus for each
, the Gaussian noise
satisfies
with probability at least
by Lemma 4.1. By a union bound, we have
with probability at least
. In this case, by the triangle inequality, the fact that
for all point sets Y and since only a constant number of edges connects the three parts, we obtain
![]() |
Note that we may translate and scale
to be contained in
, by which
may be regarded as the optimal tour length on an instance of
points in
perturbed by Gaussians with standard deviation 1. By Lemma 4.5, any 2-optimal tour and hence also the optimal tour on the scaled instance has length
with probability
. Scaling back to the original instance, we obtain
with probability
. This yields the result by a union bound. 
We find a long 2-optimal tour on all non-padding points analogously to the original construction by taking a shortcut of the original 2-optimal tour, which connects
and
already in Layer t (see Fig. 2).
Consider the padding points, which are yet to be connected. Let
denote the nearest point in Layer t of
that is to the left of C. Symmetrically,
is the nearest point to the right of C. Let
be any 2-optimal path from
to
that passes through all the padding points (including C). We replace the edges
and
by the path
, completing the construction of our tour T.
Lemma 4.19
Let
be arbitrary. With probability
, T is 2-optimal and has a length of
.
Note that given Lemma 4.19, Theorem 4.17 follows directly using Lemma 4.18. The (rather technical) proof of Lemma 4.19 hence concludes our lower bound.
Probability of 2-optimality.
To account for the perturbation in the analysis, we define a safe region for every point. More formally, let
be any unperturbed origin. We define its container
as the circle centered at
with radius
. With high probability, all perturbed points lie in their containers.
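The container argument rests on the Gaussian tail: a two-dimensional Gaussian perturbation with standard deviation sigma per coordinate leaves a disk of radius r around its origin with probability exactly exp(-r^2/(2 sigma^2)), since the norm of the perturbation is Rayleigh distributed. The following sketch (Python, not from the paper) compares this exact tail to a Monte Carlo estimate; the values of `sigma` and `r` are illustrative only, not the container radius used in the construction:

```python
import math
import random

def escape_probability(sigma, r, trials=200_000, seed=1):
    # Empirical probability that a 2D Gaussian perturbation with
    # standard deviation sigma (per coordinate) leaves a disk of radius r.
    rng = random.Random(seed)
    escapes = sum(
        1 for _ in range(trials)
        if math.hypot(rng.gauss(0, sigma), rng.gauss(0, sigma)) > r
    )
    return escapes / trials

# Illustrative values: container radius three standard deviations.
sigma, r = 0.01, 0.03
exact = math.exp(-r * r / (2 * sigma * sigma))  # Rayleigh tail, approx. 0.011
```

With these values the empirical estimate agrees with the exact Rayleigh tail to three decimal places, which is the quantitative content of "with high probability" above.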
Lemma 4.20
For sufficiently large p, the tour T constructed as described in Sect. 4.4 is 2-optimal, provided that all points
lie in their corresponding containers
.
We first show that this lemma implies Lemma 4.19.
Proof of Lemma 4.19
Let
, and let
be arbitrary. Since
, we have
for sufficiently large n. By definition of the containers, Lemma 4.1 yields that for any point
and sufficiently large n,
![]() |
By a union bound, we conclude that with probability
, all points are contained in their corresponding containers and hence, by the previous lemma, T is 2-optimal.
Recall that t is the largest odd integer satisfying
. Since
, this implies
. Observe that T visits
many layers and crosses a horizontal distance of 2/3 in each of them. Hence, it has a length of at least
. 
In the remainder of this section, we prove Lemma 4.20, i.e., we show that the constructed tour is 2-optimal, provided all points stay inside their respective containers. Clearly, it suffices to show that, for any pair of edges (u, v) and (w, z) in the tour, the corresponding 2-change, i.e., replacing these edges by (u, w) and (v, z), does not reduce the tour length, i.e.,
. We first state the technical lemmas capturing the ideas behind the construction.
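This 2-change condition can be checked mechanically. A minimal sketch (Python, Euclidean distances; not taken from the paper) that tests 2-optimality of a tour by enumerating all non-adjacent edge pairs:

```python
import math

def dist(p, q):
    # Euclidean distance between two points in the plane
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_two_optimal(tour):
    # A tour is 2-optimal if no 2-change, i.e. replacing edges
    # (u, v) and (w, z) by (u, w) and (v, z), shortens it.
    n = len(tour)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges share the endpoint tour[0]
            u, v = tour[i], tour[(i + 1) % n]
            w, z = tour[j], tour[(j + 1) % n]
            if dist(u, w) + dist(v, z) < dist(u, v) + dist(w, z) - 1e-12:
                return False
    return True
```

For instance, the unit square visited in cyclic order is 2-optimal, whereas visiting it in the order (0,0), (1,1), (0,1), (1,0) produces a crossing and is not.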
The first lemma treats pairs of horizontal edges and establishes how large their vertical distance must be in order to make swapping these edges increase the length of the tour. It is a generalization of a similar lemma of Chandra et al. [7] to a perturbation setting, in which points are placed arbitrarily into small containers.
Note that in what follows, for a point
, we let
denote its x-coordinate and
its y-coordinate. Furthermore, for any points
, we let
and
denote their horizontal and vertical distance, respectively.
Lemma 4.21
Let
and
be horizontal line segments in the Euclidean plane with
and
. Let
,
,
and
be circles of radius
with centers p, q, r and s, respectively. If
and the vertical distance
between
and
is at least
![]() |
then, for all
, we have
![]() |
Proof
Note that
. Furthermore, we have that
![]() |
and hence
![]() |
where the right-hand side expression is at least 0, since
by assumption. Let
and
, then it is straightforward to verify that the expression
![]() |
(13)
subject to
is minimized when
.
Hence, we can bound (13) by
![]() |
where the third line follows from our assumption on v. 
The following very basic lemma shows that a sequence of edges that share roughly the same direction will always be 2-optimal.
Lemma 4.22
Let
and
be a sequence of points in
such that all connecting segments
fulfill
. Then,
![]() |
Proof
For any point p, let
denote the cone
. Let
, then by assumption, we have
and thus
. Let us assume that
(the other case is symmetric). Since by assumption,
, we have for
that
and
for some
and
with
. If
, the claim is immediate from
. Otherwise, for
, we obtain
![]() |
By an analogous computation,
follows and hence the claim. 
We can now prove Lemma 4.20. Assume that all points are contained in their respective containers. We call an edge between
and
horizontal (or vertical) if the edge between
and
is horizontal (or vertical) and neither
nor
belong to the set of padding points. In what follows, we will first consider horizontal-horizontal, horizontal-vertical and vertical-vertical edge pairs and then turn to pairs of edges for which at least one edge is adjacent to some padding point. Recall that
is chosen so as to satisfy
.
Horizontal-horizontal edge pair Let
and
be two horizontal edges. Horizontal edges
with
appear only if
. We distinguish the following cases.
: Both edges are in the same layer. Note that no 2-change swaps neighboring edges. Assume without loss of generality that
(the other case is symmetric). Since
, we have that
Similarly,
and
. This shows that Lemma 4.22 is applicable to
, which yields that no 2-change can be profitable.
, and
. By construction of T, the edges have opposite direction. Assume that
and hence
(the other case is symmetric). By construction
. We have that
. The same reasoning shows that
. Similarly, one can show that
for all
and
. Hence the 2-change to
and
has a crossing, which by the triangle inequality cannot be profitable.
, and
with
and
. Either both edges have opposite directions, then the previous argument shows that a 2-change is not profitable. Otherwise, note that the first requirement of Lemma 4.21,
, is fulfilled. Also note that
, since
. We have
since for sufficiently large p, we have
. Consequently, Lemma 4.21 applies and shows that the 2-change does not yield an improvement.
Horizontal-vertical edge pair. Let
be a vertical edge and
be a horizontal edge. We assume that the vertical edge is in
, since the case
is symmetric. Exactly one of the following cases occurs.
and
with
. The horizontal edge is in the same layer as one of the end points of the vertical edge. Clearly,
and
. Since a 2-change cannot swap neighboring edges, at least one horizontal segment lies between both edges. By construction of the tour, one of the edges
and
crosses a vertical distance of at least
and the other a horizontal distance of at least
. Hence
since
.
and
with
. As in the previous case,
and
. Consider first the case that
, then by construction of the tour, one of the edges
and
crosses a horizontal distance of at least
and the other edge crosses a vertical distance of at least
, yielding
since
. Otherwise, if
, the edge
crosses a vertical distance of at least
and hence
since
. Thus in both cases, a 2-change is not profitable.
Vertical-vertical edge pair. Let
and
be vertical edges.
and
with
, i.e., the vertical edges are above each other. By swapping the x- and y-axis in Lemma 4.22, we can show that a 2-change is not profitable, since it is easy to see that
for all consecutive pairs (p, q) in
.
and
with
. Clearly,
and
, while
and
. Hence a 2-change is not profitable, since
.
Padding points.
Since we assumed for convenience that the padding points are placed at the central vertex C of Layer t in
, only the edges with at least one endpoint in
are relevant candidates for the treatment of padding points. This is because all other edges have both endpoints at a distance of 1/6 from the padding points, which can never be accounted for by the edge length, since all edges except those in Layer 0 are much shorter than 1/3. Separately, the Layer-0 edges can be handled easily as well: an edge
with
is a horizontal edge, hence the pair
and a Layer-0 edge triggers the corresponding case of horizontal-horizontal edge pairs, with an even smaller length of the edge
in Layer t.
It remains to handle the following cases, where we regard C as a padding point, i.e.,
, not as a Layer-t point.
, and
. Clearly,
and
. Furthermore, at least one of
has a horizontal distance of at least
to
. Hence, 
and
. These edge pairs behave exactly like regular pairs of Layer-t edges, and the corresponding case of horizontal-horizontal edge pairs applies.
. All such edges are 2-optimal by construction, since a 2-optimal path from
to
passing by all padding points was used.
This concludes the case analysis and thus the proof of Lemma 4.20.
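Several cases above dismiss a 2-change because the new edges (u, w) and (v, z) would cross: writing X for their intersection point, d(u, w) + d(v, z) = d(u, X) + d(X, w) + d(v, X) + d(X, z) >= d(u, v) + d(w, z) by two applications of the triangle inequality. A small numeric illustration (Python; the concrete points are chosen for illustration only):

```python
import math

def dist(p, q):
    # Euclidean distance in the plane
    return math.hypot(p[0] - q[0], p[1] - q[1])

def segments_cross(u, w, v, z):
    # Proper intersection test for segments (u, w) and (v, z)
    # via orientation (signed area) signs.
    def orient(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (orient(u, w, v) * orient(u, w, z) < 0
            and orient(v, z, u) * orient(v, z, w) < 0)

# If the new edges (u, w) and (v, z) cross, the triangle inequality
# through their intersection point gives
#     d(u, w) + d(v, z) >= d(u, v) + d(w, z),
# so the 2-change cannot shorten the tour.
u, v, w, z = (0.0, 0.0), (1.0, 0.2), (0.9, 1.0), (0.1, 0.9)
```

Here the segments (u, w) and (v, z) cross, and indeed the crossing pair is strictly longer than the non-crossing pair (u, v), (w, z).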
Concluding Remarks
Running-time. Our approach for Euclidean distances does not work for
and
. However, we can use the bound of Englert et al. [12] for Euclidean distances, which yields a bound polynomial in n and
for
.
In the same way as Englert et al. [12], we can slightly improve the smoothed number of iterations by using an insertion heuristic to choose the initial tour. We save a factor of
for Manhattan and Euclidean distances and a factor of
for squared Euclidean distances. The reason is that there always exist tours of length
for n points in
for Euclidean and Manhattan distances and of length
for squared Euclidean distances for
[31] (the constants in these upper bounds depend on d). Taking into account also that, because Gaussians have light tails, only few points are far away from the hypercube
after perturbation, one might get an even better bound. However, we did not take these improvements into account in our analysis to keep the paper concise.
Of course, even our improved bounds do not fully explain the linear number of iterations observed in experiments. However, we believe that new approaches, beyond analyzing the smallest improvement, are needed in order to further improve the smoothed bounds on the running-time.
Approximation ratio.
We have proved an upper bound of
for the smoothed approximation ratio of 2-Opt. Furthermore, we have proved that the lower bound of Chandra et al. [7] remains robust even for
. We leave it as an open problem to generalize our upper bounds to the one-step model and thereby improve the current bound of
[12], but we conjecture that this might be difficult, because of the lack of the nice structure that Gaussian distributions provide.
Given the recent improvement from
to
by Brodowsky et al. [5], we raise the question of tightening our upper bound to
.
While our bound significantly improves the previously known bound for the smoothed approximation ratio of 2-Opt, we readily admit that it still does not explain the performance observed in practice. A possible explanation is that when the initial tour is not picked by an adversary or the nearest neighbor heuristic, but using a construction heuristic such as the spanning tree heuristic or an insertion heuristic, an approximation factor of 2 is guaranteed even before 2-Opt has begun to improve the tour [27]. We chose to compare the worst local optimum to the global optimum, as this is arguably the simplest of all technically difficult possibilities.
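The effect of a good initialization can be made concrete. The following toy pipeline (Python; a plain sketch, not an implementation from the literature) builds a nearest-neighbor tour on random points and then runs 2-Opt to a local optimum:

```python
import math
import random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def tour_length(tour):
    n = len(tour)
    return sum(dist(tour[i], tour[(i + 1) % n]) for i in range(n))

def nearest_neighbor_tour(points):
    # Greedy construction: repeatedly visit the closest unvisited point.
    unvisited = list(points[1:])
    tour = [points[0]]
    while unvisited:
        nxt = min(unvisited, key=lambda q: dist(tour[-1], q))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def two_opt(tour):
    # Apply improving 2-changes until a local optimum is reached.
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these edges share an endpoint
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if dist(a, c) + dist(b, d) < dist(a, b) + dist(c, d) - 1e-12:
                    # Reversing the segment replaces edges (a, b), (c, d)
                    # by (a, c), (b, d).
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(60)]
start = nearest_neighbor_tour(pts)
final = two_opt(list(start))
```

On such random instances the 2-Opt phase typically shaves a few percent off the nearest-neighbor tour, consistent with the experimental picture referenced above; the sketch makes no claim about worst-case or smoothed guarantees.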
However, a smoothed analysis of the approximation ratio of 2-Opt initialized with a good heuristic might be difficult: even in the average case, it is only known that the length of an optimal TSP tour is concentrated around
for some constant
. But the precise value of
is unknown [31]. Since experiments suggest that 2-Opt even with good initialization does not achieve an approximation ratio of
[16, 17], one has to deal with the precise constants, which seems challenging.
Finally, we conjecture that many examples for showing lower bounds for the approximation ratio of concrete algorithms for Euclidean optimization such as the TSP remain stable under perturbation for
. The question remains whether such small values of
, although they often suffice to prove polynomial smoothed running time, are essential to explain practical approximation ratios, or whether more slowly decreasing
provide a sufficient explanation.
Footnotes
This paper is based on results presented at ISAAC 2013 [24] and ICALP 2015 [20].
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Abramowitz, M., Stegun, I.A. (eds.): Pocketbook of Mathematical Functions. Harri Deutsch (1984)
- 2.Arora, S.: Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. J. ACM 45(5), 753–782 (1998) [Google Scholar]
- 3.Bläser, M., Manthey, B., Raghavendra Rao, B.V.: Smoothed analysis of partitioning algorithms for Euclidean functionals. Algorithmica 66(2), 397–418 (2013) [Google Scholar]
- 4.Bringmann, K., Engels, C., Manthey, B., Raghavendra Rao, B. V.: Random shortest paths: Non-euclidean instances for metric optimization problems. In Krishnendu Chatterjee and Jiří Sgall, editors, Proc. of the 38th Int. Symp. on mathematical foundations of computer science (MFCS), volume 8087 of lecture notes in computer science, pages 219–230. Springer, (2013)
- 5.Brodowsky, U.A., Hougardy, S., Zhong, X.: The approximation ratio of the k-opt heuristic for the Euclidean traveling salesman problem. SIAM J. Comput. 52(4), 841–864 (2023) [Google Scholar]
- 6.Brunsch, T., Röglin, H., Rutten, C., Vredeveld, T.: Smoothed performance guarantees for local search. Math. Program. 146(1–2), 185–218 (2014) [Google Scholar]
- 7.Chandra, B., Karloff, H., Tovey, C.: New results on the old k-opt algorithm for the traveling salesman problem. SIAM J. Comput. 28(6), 1998–2029 (1999) [Google Scholar]
- 8.Curticapean, R., Künnemann, M.: A quantization framework for smoothed analysis of Euclidean optimization problems. Algorithmica 73(3), 1–28 (2015) [Google Scholar]
- 9.Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, Cambridge (2009) [Google Scholar]
- 10.Durrett, R.: Probability: Theory and Examples. Cambridge University Press, Cambridge (2013) [Google Scholar]
- 11.Engels, C., Manthey, B.: Average-case approximation ratio of the 2-opt algorithm for the TSP. Oper. Res. Lett. 37(2), 83–84 (2009) [Google Scholar]
- 12.Englert, M., Röglin, H., Vöcking, B.: Worst case and probabilistic analysis of the 2-Opt algorithm for the TSP. Algorithmica 68(1), 190–264 (2014) [Google Scholar]
- 13.Etscheid, M.: Performance guarantees for scheduling algorithms under perturbed machine speeds. Discret. Appl. Math. 195, 84–100 (2015) [Google Scholar]
- 14.Evans, M., Hastings, N., Peacock, B.: Statistical Distributions, 3rd edn. Wiley, Hoboken (2000) [Google Scholar]
- 15.Funke, S., Laue, S., Lotker, Z., Naujoks, R.: Power assignment problems in wireless communication: Covering points by disks, reaching few receivers quickly, and energy-efficient travelling salesman tours. Ad Hoc Netw. 9(6), 1028–1035 (2011) [Google Scholar]
- 16.Johnson, D.S., McGeoch, L.A.: The traveling salesman problem: a case study. In: Aarts, E., Lenstra, J.K. (eds.) Local Search in Combinatorial Optimization. Wiley, Hoboken (1997) [Google Scholar]
- 17.Johnson, D.S., McGeoch, L.A.: Experimental analysis of heuristics for the STSP. In: Gutin, G., Punnen, A.P. (eds.) The Traveling Salesman Problem and its Variations. Kluwer Academic Publishers, Dordrecht (2002) [Google Scholar]
- 18.Karger, D., Onak, K.: Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems. In Proc. of the 18th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 1207–1216. SIAM, (2007)
- 19.Kern, W.: A probabilistic analysis of the switching algorithm for the TSP. Math. Program. 44(2), 213–219 (1989) [Google Scholar]
- 20.Künnemann, M., Manthey, B.: Towards understanding the smoothed approximation ratio of the 2-opt heuristic. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, In: Proc. of the 42nd Int. Coll. on Automata, Languages and Programming (ICALP), volume 9134 of Lecture Notes in Computer Science, pages 859–871. Springer, (2015)
- 21.Manthey, B.: Smoothed analysis of local search. In: Roughgarden, T. (ed.) Beyond the Worst-Case Analysis of Algorithms, pp. 285–308. Cambridge University Press, Cambridge (2020) [Google Scholar]
- 22.Manthey, B., Röglin, H.: Smoothed analysis: analysis of algorithms beyond worst case. IT Inf Technol 53(6), 280–286 (2011) [Google Scholar]
- 23.Manthey, B.,Rhijn, J.V.: Improved smoothed analysis of 2-opt for the Euclidean TSP. In Satoru Iwata and Naonori Kakimura, editors, In: Proc. 34th Int. Symposium on Algorithms and Computation (ISAAC), volume 283 of LIPIcs, pages 52:1–52:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, (2023)
- 24.Manthey, B.,Veenstra, R.: Smoothed analysis of the 2-Opt heuristic for the TSP: Polynomial bounds for Gaussian noise. In Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam, editors, In: Proc. of the 24th Ann. Int. Symp. on Algorithms and Computation (ISAAC), volume 8283 of Lecture Notes in Computer Science, pages 579–589. Springer, (2013)
- 25.Mitchell, J.S.B.: Guillotine subdivisions approximate polygonal subdivisions: a simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems. SIAM J. Comput. 28(4), 1298–1309 (1999) [Google Scholar]
- 26.Papadimitriou, C.H.: The Euclidean traveling salesman problem is NP-complete. Theoret. Comput. Sci. 4(3), 237–244 (1977) [Google Scholar]
- 27.Rosenkrantz, D.J., Stearns, R.E., Lewis, P.M., II.: An analysis of several heuristics for the traveling salesman problem. SIAM J. Comput. 6(3), 563–581 (1977) [Google Scholar]
- 28.Spielman, D.A., Teng, S.-H.: Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM 51(3), 385–463 (2004) [Google Scholar]
- 29.Spielman, D.A., Teng, S.-H.: Smoothed analysis: an attempt to explain the behavior of algorithms in practice. Commun. ACM 52(10), 76–84 (2009) [Google Scholar]
- 30.Nijnatten, F.v., Sitters, R., Woeginger, G. J., Wolff, A., de Berg, M.: The traveling salesman problem under squared Euclidean distances. In Jean-Yves Marion and Thomas Schwentick, editors, In: Proc. of the 27th Int. Symp. on Theoretical Aspects of Computer Science (STACS), volume 5 of LIPIcs, pages 239–250. Schloss Dagstuhl– Leibniz-Zentrum für Informatik, (2010)
- 31.Yukich, J.E.: Probability theory of classical Euclidean optimization problems. Springer, Berlin (1998) [Google Scholar]