Abstract
The VC-dimension, which has wide uses in learning theory, has been used in the analysis and design of graph algorithms recently. In this paper, we study the problem of bounding the VC-dimension of unique round-trip shortest path set systems (URTSP), which are set systems induced by sets of vertices in unique round-trip shortest paths in directed graphs. We first show that different from the VC-dimensions of set systems induced by unique undirected and directed shortest paths in undirected and directed graphs respectively, the VC-dimension of URTSP can be larger than 3. We then prove that the VC-dimension of URTSP is at most 32. Furthermore, we apply the VC-dimension result to the minimum k-round-trip shortest path cover problem (k-RTSPC), which is to find for a directed graph a minimum vertex set to intersect every round-trip shortest path containing at least k vertices, and derive an upper bound on the size of the vertex set. The k-RTSPC problem can be useful in many real-world applications, including optimal placement of facilities.
1. Introduction
The Vapnik-Chervonenkis dimension (VC-dimension) [22] is a widely used tool in machine learning theory and computational geometry. More recently, it has also been used in graph analysis [17] and in graph algorithm design [1, 21, 15]. Abraham et al. [1] analyzed the VC-dimension of unique undirected shortest path set systems (UUSP), which are set systems induced by sets of vertices in unique undirected shortest paths in undirected graphs. In a UUSP(U, S) with respect to (w.r.t.) an undirected graph G, the vertices in G form the element set U, and S is the collection of vertex sets of the unique shortest paths in G. Uniqueness means that exactly one shortest path exists between every pair of vertices in G; this can be easily achieved by adding negligible perturbations to the edge weights. It can be shown that in such systems the VC-dimension is upper bounded by 2. They also provided an alternative definition of highway dimension [2] based on set systems to model road networks. The VC-dimension result can then be used to yield an improved time bound on point-to-point shortest path queries in road networks.
Tao et al. [21] independently derived the same VC-dimension result in their study of k-skip shortest paths in road networks. A k-skip shortest path contains at least one vertex out of every k consecutive vertices of the original shortest path. With the VC-dimension result, they prove that a small subset V* ⊆ V of size O(|V|/k) suffices to form the vertices of every k-skip shortest path in a graph G(V, E), and they design an algorithm for efficiently answering k-skip shortest path queries. For directed graphs (digraphs), it was only recently that Funke et al. [15] proved that in the corresponding unique directed shortest path set systems (UDSP), induced by unique directed shortest paths, the VC-dimension is upper bounded by 3. With this result, they gave a logarithmic approximation algorithm that constructs a small set of vertices hitting every one-way directed shortest path consisting of at least k vertices.
In digraphs, round-trip shortest paths were first studied in the context of routing schemes [8, 9], and then studied extensively in the context of round-trip graph spanners [20, 24, 19, 23]. In a digraph G, the round-trip shortest path between vertices u and v is the concatenation of the one-way shortest path from u to v in G and the one-way shortest path from v to u in G. Unlike one-way shortest paths, round-trip shortest paths are symmetric and define a metric in digraphs. Therefore, it appears that in digraphs round-trip shortest paths and distances are easier to handle and study.
For instance, it is well known that for an n-vertex undirected graph G, there exists a (2k − 1)-spanner of size O(n^(1+1/k)), in which the distance between every pair of vertices is at most 2k − 1 times their distance in G [3]. The factor 2k − 1 is called the stretch. The definition of graph spanners can be naturally extended to digraphs. However, for one-way shortest paths, it is not even meaningful to study spanners because of the well-known Ω(n^2) size lower bound. In contrast, the stretch-size trade-off for round-trip spanners [24] is close to the optimal trade-off for spanners in undirected graphs, if we believe Erdős’s girth conjecture [10].
Unfortunately, the VC-dimension of unique round-trip shortest path set systems (URTSP), which are set systems induced by round-trip shortest paths, has not been considered before, although the VC-dimension of unique directed shortest path set systems has been studied [15].
Our Contribution. In this paper, we study the VC-dimension of URTSP. We first show that, unlike the VC-dimensions of UUSP and UDSP, the VC-dimension of URTSP can be larger than 3. This contrasts with the conventional understanding that round-trip shortest paths and distances are easier to study than their one-way counterparts. The difficulty stems from the following special behavior of round-trip shortest paths. A one-way shortest path, e.g., P = (v1, v5, v3, v4) in Figure 1, must include the shortest path for any ordered pair of vertices in P respecting their order in P, e.g., the shortest path from v5 to v3. In contrast, a round-trip shortest path, e.g., PRT = (v4, v1, v2, v3, v4) between v2 and v4 in Figure 1, need not include the shortest path for every vertex pair in PRT, e.g., the shortest path (v1, v5, v3) from v1 to v3. This suggests that round-trip shortest paths behave in a slightly more complicated way than one-way shortest paths. We then prove that the VC-dimension of URTSP is upper bounded by 32. We believe that the VC-dimension result can be useful in graph algorithm design.
Figure 1: An example of the difference in behavior between one-way and round-trip shortest paths in digraphs. All edge weights are one except that the weight of the edge (v1, v2) is two.
Furthermore, we apply the VC-dimension result to the minimum k-round-trip shortest path cover problem (k-RTSPC), which is to construct in a digraph a minimum set of vertices that hits every round-trip shortest path consisting of at least k vertices. Previously, shortest path covers, mostly for undirected graphs, have been extensively used in algorithms for problems on road networks, e.g., shortest path queries [2, 1, 18], low-distortion embedding [13, 14, 4] and the k-center problem [12, 4]. k-RTSPC is the corresponding problem in digraphs and can have many real-world applications, such as the optimal placement of facilities in road networks. For instance, organizations aim at placing facilities (e.g., advertisements) at a small number of vertices in a road network while providing good coverage. A solution is to first construct a vertex set for the corresponding k-RTSPC problem, and then place facilities only at the vertices in that set. In this way, the coverage can be reasonably good because every round-trip shortest path consisting of at least k vertices (e.g., the route from home to work and back) is guaranteed to include at least one facility, while the number of facilities to place remains small. Given our result that the VC-dimension is constant, and following analyses similar to those in [21, 15], we get an upper bound of O(n/k · log(n/k)) on the size of the vertex set constructed. As already shown in [1], better hitting set approximations can be computed for set systems of smaller VC-dimension. With the constant VC-dimension, we therefore immediately have a logarithmic approximation algorithm. Our main theoretical results are summarized in Theorems 1 and 2.
Theorem 1. The VC-dimension of URTSP is upper bounded by 32.
Definition 1 (Minimum k-Round-Trip Shortest Path Cover). Given a weighted digraph G(V, E), the problem of Minimum k-Round-Trip Shortest Path Cover (k-RTSPC) is to find a minimum set of vertices C ⊆ V such that, for every round-trip shortest path PRT consisting of at least k vertices in G, we have C ∩ PRT ≠ ∅.
Theorem 2. Given an n-vertex digraph, there exists a solution of size O(n/k · log(n/k)) for k-RTSPC, and there exists an O(log h)-approximation algorithm for k-RTSPC, where h is the size of the optimal solution.
Organization. In Section 2, we provide the notation and definitions used in the paper. In Section 3, we present the details of the VC-dimension analysis and its application to k-RTSPC. We conclude the paper with a brief discussion of future work in Section 4.
2. Notation and Definition
In this paper, we consider weighted directed graphs G(V, E, W). Here V is a vertex set, E ⊆ V × V is an edge set, and W is a weight function which assigns a positive real-valued weight to each edge in E. W can be omitted from the presentation if it is clear from the context. A (directed) path from u to v in G is a sequence of edges (u, u1), (u1, u2), … , (uk, v) from u to v. It can be denoted as (u, u1, u2, … , uk, v), and its path distance is the sum of the weights of the edges in the sequence. We only consider simple paths, in which no vertex is repeated in the list u, u1, … , uk, v.
A one-way shortest path from u to v in G is the path with the minimum distance among all paths from u to v in G. A round-trip shortest path PRT (or simply P) between u and v in G is the concatenation of a one-way shortest path from u to v and a one-way shortest path from v to u. We assume that the round-trip shortest path between each vertex pair in G is unique. This assumption is reasonable because we can perturb the input graph G to enforce uniqueness of round-trip shortest paths.
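The definitions above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, and the edge set of `figure1_graph` is an assumption inferred from the paths and weights quoted in the text for Figure 1 (the figure itself is not reproduced here). The sketch computes one-way shortest paths with Dijkstra's algorithm and concatenates two of them into a round-trip shortest path.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a digraph given as {u: {v: weight}}.

    Returns (distance, vertex list) of a one-way shortest path,
    or None if target is unreachable from source.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        if x == target:
            break
        for y, w in graph.get(x, {}).items():
            nd = d + w
            if y not in dist or nd < dist[y]:
                dist[y] = nd
                prev[y] = x
                heapq.heappush(heap, (nd, y))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


def round_trip_shortest_path(graph, u, v):
    """Concatenate the one-way shortest paths u -> v and v -> u."""
    forward = shortest_path(graph, u, v)
    backward = shortest_path(graph, v, u)
    if forward is None or backward is None:
        return None
    # Drop the repeated vertex v where the two one-way paths meet.
    return forward[0] + backward[0], forward[1] + backward[1][1:]


# Assumed edge set, inferred from the description of Figure 1:
# all weights are one except the edge (v1, v2), which has weight two.
figure1_graph = {
    "v1": {"v2": 2, "v5": 1},
    "v2": {"v3": 1},
    "v3": {"v4": 1},
    "v4": {"v1": 1},
    "v5": {"v3": 1},
}

round_trip = round_trip_shortest_path(figure1_graph, "v4", "v2")
one_way = shortest_path(figure1_graph, "v1", "v3")
print(round_trip)  # (5, ['v4', 'v1', 'v2', 'v3', 'v4'])
print(one_way)     # (2, ['v1', 'v5', 'v3'])
```

Under the assumed edge set, the round-trip shortest path between v2 and v4 visits v1 and v3 but not v5, although v5 lies on the one-way shortest path from v1 to v3. This is the behavior described for Figure 1.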
3. The VC-Dimension and Its Application
The VC-dimension is formally defined in Definition 2, and in the next paragraph we illustrate it using an example from unique undirected shortest path set systems UUSP.
Definition 2 (VC-Dimension). Given a set system (U, S) with U being an element set and S being a collection of subsets of U, we say a set U′ ⊆ U can be shattered by S, if for every subset B ⊆ U′, there exists a set Si ∈ S such that Si ∩ U′ = B. The VC-dimension d of the set system (U, S) is the largest integer such that there exists a set of size d that can be shattered by S.
In a UUSP(U, S) w.r.t. an undirected graph G, the vertices in G form the element set U and S is the collection of vertex sets of the unique undirected shortest paths in G. Consider a graph G(V, E) where V = {u, v, w}, E = {(u, v), (v, w)}, W(u, v) = 1 and W(v, w) = 1. In the UUSP(U, S) w.r.t. G, the element set U is equal to {u, v, w}, and S is equal to {{u}, {v}, {w}, {u, v}, {v, w}, {u, v, w}}. The set U′ = {u, v, w} cannot be shattered by S because its subset {u, w} cannot be realized (the shortest path between u and w passes through v). Therefore, the VC-dimension of this UUSP is at most 2.
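Definition 2 and the three-vertex example can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the helper names `is_shattered` and `vc_dimension` are hypothetical, and the exhaustive search is exponential in the size of the element set, so it is suitable only for tiny set systems.

```python
from itertools import combinations


def is_shattered(subset, sets):
    """True if every subset B of `subset` equals Si & subset for some Si in `sets`."""
    subset = frozenset(subset)
    traces = {frozenset(s) & subset for s in sets}
    # The traces are always subsets of `subset`, so shattering means
    # that all 2^|subset| of them occur.
    return len(traces) == 2 ** len(subset)


def vc_dimension(universe, sets):
    """Largest size of a shattered subset of `universe` (brute force)."""
    best = 0
    for size in range(1, len(universe) + 1):
        if any(is_shattered(c, sets) for c in combinations(universe, size)):
            best = size
        else:
            # Every subset of a shattered set is shattered, so once no set
            # of this size is shattered, no larger set can be.
            break
    return best


# The UUSP of the path u - v - w with unit weights, as given in the text.
U = ["u", "v", "w"]
S = [{"u"}, {"v"}, {"w"}, {"u", "v"}, {"v", "w"}, {"u", "v", "w"}]

print(is_shattered(U, S))           # False
print(is_shattered(["u", "v"], S))  # True
print(vc_dimension(U, S))           # 2
```

For this example the search confirms that {u, v, w} is not shattered, while the two-element set {u, v} is, so the VC-dimension of this particular set system is exactly 2.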
In the following, we consider unique round-trip shortest path set systems URTSP, which are set systems induced by unique round-trip shortest paths in digraphs. In a URTSP(U, S) w.r.t. a digraph G, the vertices in the digraph G form the element set U and S is the collection of vertex sets of the unique round-trip shortest paths for all pairs of vertices in G.
3.1. Details of the VC-Dimension Analysis
We first show that the upper bounds on the VC-dimensions of UUSP and UDSP, i.e., 2 and 3 respectively, do not carry over to URTSP. Specifically, the VC-dimension of URTSP can be at least 4. To this end, we need an instance URTSP (U, S) in which there exists a set U′ ⊆ U of size 4 that can be shattered by S. Such a graph is shown in Figure 2, where we consider U′ = {v1, v2, v3, v4}. By Definition 2, it suffices to show that every subset of U′ is realized by some set in S. We consider the subsets of U′ in increasing order of size. The subsets of size one, e.g., {v1}, are realized by themselves in S. The subsets U1 = {v1, v2}, U2 = {v2, v3} and U3 = {v3, v4} of size two are realized by the round-trip shortest paths P1, P2 and P3, respectively. Similarly, the subsets U4 = {v1, v3}, U5 = {v2, v4} and U6 = {v1, v4} are realized by P4, P5 and P6, respectively. For U1, U2 and U3, the two vertices of each set, e.g., v1 and v2 for U1, are included in each one-way shortest path of the corresponding round-trip shortest path, e.g., P1 between v1 and v2, as in Figure 2. However, for U4, U5 and U6, the two vertices of each set, e.g., v1 and v3 in U4, are included in two different one-way shortest paths of the round-trip shortest path, e.g., P4 between u1 and w1, as in Figure 2. As mentioned earlier, although the round-trip shortest path P4 traverses and includes v1 and v3, it does not need to include the shortest path from v1 to v3, which includes v2. This allows U4 to be realized by S. The same holds for U5 and U6.
For the subsets of size three, we first consider U7 = {v1, v2, v4}, which is realized by P7. P7 includes v1 and v2 in one of its one-way shortest paths, and v4 in the other one-way shortest path. It does not need to include the vertices on the one-way shortest path from v2 to v4, e.g., v3. This allows U7 to be realized by S. The other subsets of size three, U8 = {v1, v3, v4}, U9 = {v1, v2, v3} and U10 = {v2, v3, v4}, are realized similarly by P8, P9 and P10, respectively. Finally, U′ itself, of size four, is realized by P11. Therefore, U′ can be shattered by S.
Figure 2: A URTSP instance showing that the VC-dimension of URTSP can be at least 4. Edge weights are one unless marked otherwise. Some round-trip shortest paths to realize subsets of U′ = {v1, v2, v3, v4}: P1 = (v1, v2, v1) between v1 and v2, P2 = (v2, v3, v2) between v2 and v3, P3 = (v3, v4, v3) between v3 and v4, P4 = (u1, v3, w1, v1, u1) between u1 and w1, P5 = (u3, v4, w3, v2, u3) between u3 and w3, P6 = (u2, v4, w2, v1, u2) between u2 and w2, P7 = (u4, v1, v2, w4, v4, u4) between u4 and w4, P8 = (u5, v3, v4, w5, v1, u5) between u5 and w5, P9 = (v1, v2, v3, v2, v1) between v1 and v3, P10 = (v2, v3, v4, v3, v2) between v2 and v4 and P11 = (v1, v2, v3, v4, v3, v2, v1) between v1 and v4.
It is then natural to ask for an upper bound on the VC-dimension of URTSP. We show that the VC-dimension is at most 32, as stated in Theorem 1, and prove it as follows.
Proof. (Theorem 1) We prove that, for a URTSP (U, S), an arbitrary set of thirty-three vertices U′ = {v1, v2, … , v33} ⊆ U cannot be shattered by S. If there exists no round-trip shortest path containing all thirty-three vertices in U′, then U′ itself cannot be realized and the claim holds trivially. Otherwise, suppose that there exists such a round-trip shortest path PRT between u and v, and denote its two one-way shortest paths by Pu→v (from u to v) and Pv→u (from v to u). Since every vertex of U′ lies on Pu→v or Pv→u, more than half of the vertices of U′, i.e., at least seventeen, are included in one of the two paths. Without loss of generality (w.l.o.g.), assume that Pu→v includes a larger number of vertices of U′ than Pv→u, and that the subset of U′ in Pu→v is {v1, v2, … , v17}, where the indices of the vertices reflect their order in Pu→v. In the following, we will show that two subsets of U′, B1 = {v1, v3, v5, v7, v9, v11, v13, v15, v17} and B2, which is a special subset of B1 that will be specified later, cannot be realized at the same time.
Suppose for contradiction that both B1 and B2 can be realized by round-trip shortest paths QRT and HRT such that QRT ∩ U′ = B1 and HRT ∩ U′ = B2, respectively. Denote the two one-way shortest paths of QRT by Q1 and Q2. More than half of the nine vertices of B1, i.e., at least five, are included in Q1 or in Q2. Assume w.l.o.g. that Q1 includes a larger number of vertices of B1 than Q2, and that the subset of B1 in Q1 is {v1, v3, v5, v7, v9}. The order of these vertices in Q1 must be … v9 … v7 … v5 … v3 … v1 …. To see this, assume w.l.o.g. that v1 comes before v3 in Q1. Then v2 must be in Q1 as well, because according to Pu→v the one-way shortest path from v1 to v3 includes v2, and because of our assumption that round-trip shortest paths are unique. In this case, we have QRT ∩ U′ ⊃ B1, contradicting the hypothesis that QRT ∩ U′ = B1.
Let B2 = {v1, v5, v9}, and denote the two one-way shortest paths of HRT by H1 and H2. More than half of the three vertices of B2, i.e., at least two, are included in H1 or in H2. W.l.o.g., assume that H1 includes a larger number of vertices of B2 than H2, and that the subset of B2 in H1 is {v1, v5}. If H1 includes the one-way shortest path from v1 to v5, then according to Pu→v, H1 must also include v2, v3 and v4. Otherwise, H1 includes the one-way shortest path from v5 to v1, and according to Q1, H1 must also include v3. Thus in both cases, we have HRT ∩ U′ ⊃ B2, contradicting the hypothesis that HRT ∩ U′ = B2. Hence B1 and B2 cannot both be realized, and U′ cannot be shattered by S.
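The counting behind the choice of thirty-three vertices in the proof of Theorem 1 can be written out as a short worked chain. The block below is only a restatement of the three halving (pigeonhole) steps used in the proof, typeset in LaTeX with the `align*` environment of the amsmath package assumed; it adds no new claim.

```latex
\begin{align*}
|U'| = 33
  &\;\Longrightarrow\; \lceil 33/2 \rceil = 17
  \text{ vertices of } U' \text{ lie on one one-way path of } P_{RT}, \\
|B_1| = |\{v_1, v_3, \dots, v_{17}\}| = 9
  &\;\Longrightarrow\; \lceil 9/2 \rceil = 5
  \text{ vertices of } B_1 \text{ lie on one one-way path of } Q_{RT}, \\
|B_2| = |\{v_1, v_5, v_9\}| = 3
  &\;\Longrightarrow\; \lceil 3/2 \rceil = 2
  \text{ vertices of } B_2 \text{ lie on one one-way path of } H_{RT}.
\end{align*}
```

Each step keeps every other vertex of the previous ordered set, so thirty-three vertices are exactly what is needed for the last step to leave two vertices of B2 on a common one-way path.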
3.2. The Application in k-RTSPC
Now we apply the VC-dimension result to the k-RTSPC problem. By an analysis similar to that in [21, 15], the problem can be connected to the theory of ϵ-nets. Given a set system (U, S), a hitting set is a set of elements U′ ⊆ U such that each set of elements Si ∈ S is hit by U′, i.e., Si ∩ U′ ≠ ∅. An ϵ-net is a hitting set for all sets of elements Si ∈ S satisfying |Si| ≥ ϵ|U|, where ϵ ∈ (0, 1). In a set system URTSP, an ϵ-net with ϵ = k/n, where n is the number of vertices in the input graph, is a solution for k-RTSPC. The ϵ-net theorem [16] states that every set system with VC-dimension d has an ϵ-net of size O(d/ϵ · log(1/ϵ)). Therefore, with our VC-dimension result and the ϵ-net theorem, there exists an upper bound O(n/k · log(n/k)) on the size of a solution for k-RTSPC.
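The substitution behind this size bound can be spelled out as follows. This is a worked restatement of the argument above in LaTeX (amsmath assumed), using only the ϵ-net theorem and the bound d ≤ 32 from Theorem 1.

```latex
\begin{align*}
|S_i| \ge k = \frac{k}{n} \cdot n = \epsilon\,|U|
  &\quad\text{for every round-trip shortest path with at least } k \text{ vertices, where } \epsilon = \frac{k}{n}, \\
O\!\left(\frac{d}{\epsilon}\,\log\frac{1}{\epsilon}\right)
  = O\!\left(32 \cdot \frac{n}{k}\,\log\frac{n}{k}\right)
  &= O\!\left(\frac{n}{k}\,\log\frac{n}{k}\right)
  \quad\text{since } d \le 32 \text{ is a constant.}
\end{align*}
```

The first line shows why an ϵ-net with ϵ = k/n hits every path that k-RTSPC must cover; the second line is the resulting size bound.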
For the approximation algorithm, it has been explained in [1] that for set systems with low VC-dimension, there are better hitting set approximation algorithms than in the general case. Specifically, for a set system with an optimum hitting set of size h and VC-dimension d, there exists an efficient algorithm to find a hitting set of size O(dh log(dh)). For the details, there are randomized algorithms [6, 7], a deterministic algorithm [5], and a linear programming based algorithm [11]. By plugging in the constant VC-dimension, we immediately get that for URTSP with an optimum hitting set of size h, there exists an efficient algorithm to find a hitting set of size O(h log(h)). This implies the stronger statement that, for a digraph, there exists a logarithmic approximation algorithm to construct a minimum vertex set intersecting every round-trip shortest path with an arbitrary number of vertices, not only those consisting of at least k vertices.
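To make the k-RTSPC problem concrete, the following Python sketch builds the vertex sets of all round-trip shortest paths with at least k vertices and covers them with the classical greedy hitting set heuristic. This is a deliberately simpler stand-in and not one of the algorithms of [5, 6, 7, 11]; in particular, it does not by itself give the O(h log h) bound discussed above. All function names are hypothetical, "at least k vertices" is interpreted as at least k distinct vertices, and the six-vertex directed cycle is an illustrative input that does not appear in the paper.

```python
import heapq
from itertools import combinations


def one_way_path(graph, s, t):
    """Dijkstra: vertex list of a shortest s-to-t path, or None if unreachable."""
    dist, prev, heap, done = {s: 0}, {}, [(0, s)], set()
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        for y, w in graph.get(x, {}).items():
            if y not in dist or d + w < dist[y]:
                dist[y], prev[y] = d + w, x
                heapq.heappush(heap, (d + w, y))
    if t not in dist:
        return None
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]


def long_round_trip_sets(graph, k):
    """Vertex sets of round-trip shortest paths with at least k distinct vertices."""
    vertices = sorted(set(graph) | {y for nbrs in graph.values() for y in nbrs})
    result = []
    for u, v in combinations(vertices, 2):
        fwd, bwd = one_way_path(graph, u, v), one_way_path(graph, v, u)
        if fwd is not None and bwd is not None:
            vertex_set = frozenset(fwd) | frozenset(bwd)
            if len(vertex_set) >= k:
                result.append(vertex_set)
    return result


def greedy_cover(sets):
    """Greedy hitting set: repeatedly pick the vertex that hits most unhit sets."""
    remaining, cover = list(sets), []
    while remaining:
        counts = {}
        for s in remaining:
            for x in s:
                counts[x] = counts.get(x, 0) + 1
        best = max(sorted(counts), key=counts.get)  # ties broken by name
        cover.append(best)
        remaining = [s for s in remaining if best not in s]
    return cover


# Illustrative input: a directed 6-cycle c0 -> c1 -> ... -> c5 -> c0, unit weights.
cycle = {f"c{i}": {f"c{(i + 1) % 6}": 1} for i in range(6)}
paths = long_round_trip_sets(cycle, k=3)
cover = greedy_cover(paths)
print(len(paths), cover)  # 15 ['c0']
```

On the directed cycle every round trip traverses the whole cycle, so a single vertex covers all fifteen round-trip shortest paths. On general digraphs the greedy choice is only a heuristic baseline against which the VC-dimension based algorithms can be compared.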
4. Conclusion and Future Work
In this paper, we study the VC-dimension of URTSP, which are set systems induced by unique round-trip shortest paths in digraphs. We show that the VC-dimension of URTSP can be larger than 3 and is upper bounded by 32, in contrast to the existing VC-dimension results on set systems induced by either unique undirected shortest paths or unique directed shortest paths. We then apply the VC-dimension result to the k-RTSPC problem, which is to find a minimum set of vertices hitting every round-trip shortest path consisting of at least k vertices. We believe that the VC-dimension result can have fruitful applications in graph algorithm design.
A major open problem is to determine the exact VC-dimension of URTSP, which lies between 4 and 32. It is also interesting to study how to define a highway dimension-like measure in digraphs to better capture special properties of road networks, which can be naturally modelled as digraphs, and how the VC-dimensions of URTSP and UDSP can be used to improve the time bound of point-to-point shortest path queries. Furthermore, an interesting direction is to study more graph problems, e.g., fault-tolerant graph spanners that still function in the presence of failed vertices or edges, on the round-trip shortest path metric and on one-way shortest paths in digraphs, in order to better understand both and the relationship between them.
Highlights.
The VC-dimension of unique round-trip shortest path systems is upper bounded by 32
A difference in behavior between one-way and round-trip shortest paths is revealed
The application to the minimum k-round-trip shortest path cover problem is discussed
Acknowledgments
Chun Jiang Zhu was partially supported by NIH grant R01-DA037349-01A1 and NSF grant IIS-1718738. Jinbo Bi was also supported by NSF grants CCF-1514357, DBI-1356655, and NIH grant K02-DA043063. Kam-Yiu Lam was supported by the Strategic Grant of City University of Hong Kong with grant number 7004679.
Contributor Information
Chun Jiang Zhu, Email: chunjiang.zhu@uconn.edu.
Kam-Yiu Lam, Email: cskylam@cityu.edu.hk.
Joseph Kee Yin Ng, Email: jng@comp.hkbu.edu.hk.
Jinbo Bi, Email: jinbo.bi@uconn.edu.
References
[1] Abraham I, Delling D, Fiat A, Goldberg AV, and Werneck RF. VC-dimension and shortest path algorithms. In Proceedings of ICALP Conference, pages 690–699, 2011.
[2] Abraham I, Fiat A, Goldberg AV, and Werneck RF. Highway dimension, shortest paths, and provably efficient algorithms. In Proceedings of the ACM-SIAM SODA Conference, pages 782–793, 2010.
[3] Althöfer I, Das G, Dobkin DP, Joseph D, and Soares J. On sparse spanners of weighted graphs. Discrete & Computational Geometry, 9:81–100, 1993.
[4] Becker A, Klein PN, and Saulpic D. Polynomial-time approximation schemes for k-center, k-median and capacitated vehicle routing in bounded highway dimension. In Proceedings of European Symposium on Algorithms Conference, pages 8:1–8:15, 2018.
[5] Brönnimann H and Goodrich M. Almost optimal set covers in finite VC-dimension. Discrete & Computational Geometry, 14:463–497, 1995.
[6] Clarkson KL. A Las Vegas algorithm for linear programming when the dimension is small. In Proceedings of the IEEE FOCS Conference, pages 452–456, 1988.
[7] Clarkson KL. Algorithms for polytope covering and approximation. In Proceedings of the WADS Conference, pages 246–252, 1993.
[8] Cowen L and Wagner C. Compact roundtrip routing in digraphs. In Proceedings of SIAM SODA Conference, pages 885–886, 1999.
[9] Cowen L and Wagner C. Compact roundtrip routing in directed graphs. In Proceedings of ACM PODC Conference, pages 51–59, 2000.
[10] Erdős P. Extremal problems in graph theory. Theory of Graphs and Its Applications, pages 29–36, 1964.
[11] Even G, Rawitz D, and Shahar S. Hitting sets when the VC-dimension is small. Information Processing Letters, 95:358–362, 2005.
[12] Feldmann AE. Fixed parameter approximations for k-center problems in low highway dimension graphs. In Proceedings of ICALP Conference, pages 588–600, 2015.
[13] Feldmann AE, Fung WS, Könemann J, and Post I. A (1+ϵ)-embedding of low highway dimension graphs into bounded treewidth graphs. In Proceedings of ICALP Conference, pages 469–480, 2015.
[14] Feldmann AE, Fung WS, Könemann J, and Post I. A (1+ϵ)-embedding of low highway dimension graphs into bounded treewidth graphs. SIAM Journal on Computing, 47(4):1667–1704, 2018.
[15] Funke S, Nusser A, and Storandt S. On k-path covers and their applications. The VLDB Journal, 25(1):103–123, 2016.
[16] Haussler D and Welzl E. Epsilon-nets and simplex range queries. Discrete & Computational Geometry, 2:127–151, 1987.
[17] Kleinberg J. Detecting a network failure. In Proceedings of the IEEE FOCS Conference, pages 231–239, 2008.
[18] Kosowski A and Viennot L. Beyond highway dimension: small distance labels using tree skeletons. In Proceedings of the ACM-SIAM SODA Conference, pages 1462–1478, 2017.
[19] Pachocki J, Roditty L, Sidford A, Tov R, and Williams V. Approximating cycles in directed graphs: fast algorithms for girth and roundtrip spanners. In Proceedings of the ACM-SIAM SODA Conference, pages 1374–1392, 2018.
[20] Roditty I, Thorup M, and Zwick U. Roundtrip spanners and roundtrip routing in directed graphs. ACM Transactions on Algorithms, 4(3), 2008.
[21] Tao Y, Sheng C, and Pei J. On k-skip shortest paths. In Proceedings of the ACM SIGMOD Conference, pages 421–432, 2011.
[22] Vapnik V and Chervonenkis A. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16:264–280, 1971.
[23] Zhu C and Lam K. Source-wise round-trip spanners. Information Processing Letters, 124(C):42–45, 2017.
[24] Zhu C and Lam K. Deterministic improved round-trip spanners. Information Processing Letters, 129:57–60, 2018.


