Skip to main content
Data in Brief logoLink to Data in Brief
. 2023 Sep 7;50:109553. doi: 10.1016/j.dib.2023.109553

Data, instance sets, and instances generator for the Hop-Constrained Minimum Spanning Tree problem, the Delay-Constrained Minimum Spanning Tree problem, and their bi-objective variants

Iago A Carvalho a,⁎,1, Amadeu A Coco b,2
PMCID: PMC10514421  PMID: 37743887

Abstract

This article proposes a benchmark instance generator for the Hop-Constrained Minimum Spanning Tree problem, the Delay-Constrained Minimum Spanning Tree problem, and their bi-objective variants. The generator is developed in C++ and does not uses external libraries, being understandable, easy-to-read, and easy-to-use. Furthermore, the generator employs five parameters that makes possible to generate personalized benchmark instances for these problems. We also describe 640 benchmark instances that were previously used in computational experiments in the literature. Lastly, we include raw results obtained from computational experiments with the described benchmark instances. We hope that the data introduced in this article can foster the development and the evaluation of new algorithms for solving constrained minimum spanning tree problems.

Keywords: Multi-objective optimization, Bi-objective optimization, Constrained minimum spanning tree problems, Graphs, Optimization


Specifications Table

Subject Decision Sciences

Specific subject area Multi-objective optimization problems in graphs, most specifically bi-objective constrained minimum spanning tree problems
Data format Raw, Analysed
Type of data C++ File, CSV Tables, Text Files
Data collection The data has been generated in-silico in a laboratory environment. It is useful to evaluate both exact and heuristic multi-objective optimization algorithms. The dataset can be separated into three pieces. The first is the instance generator, which is a C++ algorithm, used to generate instances for the Hop-Constrained Minimum Spanning Tree problem [1] and the Delay-Constrained Minimum Spanning Tree problem [2]. This instance generator makes use of 5 parameters to generate different instances. The second piece is the instances generated using the instance generator, which was employed in the paper [3]. The last piece is the raw data used to generate parts of Tables 1 and 2 and Figs. 4 and 5 of the paper [3].
Data source location All data was sintetically generated in the Computational Intelligence Laboratory of the Computer Science Department of the Universidade Federal de Alfenas (UNIFAL). The university is located in Av. Jovino Fernandes de Sales, 2600 - Santa Clara, Alfenas - MG, 37133-840, Brazil. The information contained in the dataset has no geographic reference.
Data accessibility Repository name: Data for "On solving bi-objective constrained minimum spanning tree problems"
Data identification number: 10.17632/zck24s5wdf.2
Direct URL to data: doi:10.17632/zck24s5wdf.2
Related research article Carvalho, I. A., & Coco, A. A. (2023). On solving bi-objective constrained minimum spanning tree problems. Journal of Global Optimization, 1-23. https://doi.org/10.1007/s10898-023-01295-8

1. Value of the data

  • The dataset contains 320 instances for the Hop-Constrained Minimum Spanning Tree problem and other 320 instances for the Delay-Constrained Minimum Spanning Tree problem. These instances can be used as benchmark for these problems or for their bi-objective variant and were used as benchmark instances in [3].

  • The dataset also contains the algorithm employed to generate the benchmark instances. This algorithm was developed in C++ and contain 5 parameters that can be used to customize the generated instance.

  • There exists 4 .csv files containing the raw experimental results of [3]. These .csv files were employed to generate Tables 1 and 2 and Figs. 4 and 5 of the paper [3].

  • The instances can be used to evaluate algorithms for the Hop-Constrained Minimum Spanning Tree problem, the Delay-Constrained Minimum Spanning Tree problem, or their bi-objective variants.

  • The instance generator is useful to construct other instances for these problems. The generator is able to build small-, medium-, and large-sized instances in two different classic topologies of the literature.

  • The instance generator can be adapted to generate instances for other constrained minimum spanning tree problems, such as the Degree-Constrained Minimum Spanning Tree problem [4] or the Minimum-Cost Bounded-Error Minimum Spanning Tree problem [5].

2. Objective

This paper describes an instance generator for the Hop-Constrained Minimum Spanning Tree problem, the Delay-Constrained Minimum Spanning Tree problem, and their bi-objective variants. Along with the instance generator, it also presents 640 benchmark instances, being 320 instances for each problem, which were employed in the experiments of [3]. The objective is to provide an adequate set of benchmark instances to evaluate algorithms for solving these problems.

3. Data description

The dataset is divided into three directories, as presented in Fig. 1. Directory “Instances generator” contains the instance generator algorithm developed in C++, directory “Instances” contains the benchmark instances generated by the above-mentioned generator algorithm and employed in [3], and directory “Results” contains the raw data used to construct Tables 1 and 2 and Figs. 4 and 5 of [3]. These directories and their files will be presented in details below.

Fig. 1.

Fig 1:

Organization of the Mendeley data repository.

3.1. Directory instances generator

This directory contains a single file denominated “generator.cpp”. This is a C++ algorithm that can be used to generate benchmark instances for the Delay-Constrained Minimum Spanning Tree problem (DCMST) and for the Hop-Constrained Minimum Spanning Tree problem (HCMST). These instances can also be used for their bi-objective variants, i.e., the Bi-Objective Delay-Constrained Minimum Spanning Tree problem (B-DCMST) and the Bi-Objective Hop-Constrained Minimum Spanning Tree problem (B-DCMST).

The algorithm do not makes use of external libraries and can be easily compiled using any C++ compiler. For example, in Linux environments, one can compile the algorithm using the command g++ generator.cpp -o generator. Other compilers, such as clang or Intel C++ compiler, can also be used.

The compiled algorithm makes use of five parameters, as shown in Table 1. They need to be passed to the algorithm in the exact order they are displayed in Table 1. For example, in Linux environments, one could run the algorithm using the command ./generator 20 1 3 2 100, whereas 20 represents the number of nodes in the instance, 1 denotes the instance's topology, 3 is the seed for the Mersenne Twister [6] pseudo-random number generator algorithm, 2 represents the chosen problem, and 100 is the maximum allowed delay for the DCMST instance (or B-DCMST instance). In the case that the parameters are not properly passed to the algorithm, a help message will be displayed for the user.

Table 1.

Parameters employed by the benchmark instance's generator.

Parameter Description Possible values
size Number of nodes of the benchmark instance Natual numbers greater than 2
type Instance topology {1, 2}
seed Seed for the pseudo-random number generator Any integer number
problem The problem for which the benchmark instance will be created {1, 2}
delay Maximum delay for the DCMST or B-DCMST Natural numbers higher than 1

Every parameter has a predefined set of allowed values:

  • size: it should be a natural number greater than 2.

  • type: there exists two instance's topology, namely TC and TE. The value 1 denotes the TC instance, while the value 2 denotes the TE instance. Both instances topology are exemplified in Fig. 2. One may observe that the only difference between them is on the position of the root node. In TC instances, the root node is positioned in the center of the plane. On the other hand, in TE instances, the root node is positioned in the corner of the plane.

  • seed: it should be an integer number.

  • problem: two problems can be selected. The value 1 corresponds to a benchmark instance for the HCMST (or the B-HCMST), while the value 2 denotes a benchmark instance for the DCMST (or the B-DCMST).

  • delay: it should be a natual number greater than 1. It will only be used in the cases that problem is set to 2. When problem is set to 1, this parameter will be discarded.

Fig. 2.

Fig 2:

Illustrations of possible generated benchmark instances with 20 nodes. In these examples, the root node is drawn in black.

The generated instance will be a complete undirected graph. Every node (except the root) is randomly placed into a 40 × 40 plane using an uniform distribution. Furthermore, the edges’ weights are computed as the Euclidean distance between the nodes. For the DCMST (or B-DCMST) instances, the delay associated to every edge is computed randomly using an uniform distribution. The benchmark instance data is displayed for the user in the default output employed by the user.

The benchmark instance data is easy-to-read and to understand. In the first line, it displays, respectively, the number of nodes and the number of edges of the instance. Then, every remaining line denotes an edge. The first and second columns give the endpoints of the edge, while the third column shows the edge's weight. In the case of instances for the DCMST (or B-DCMST), a fourth column that denotes the delay of the edge also exists. One must observe that the root node is represented as the node zero.

3.2. Directory instances

This directory is divided into two sub-directories. The first, denominated as “B-DCMST”, contains the benchmark instances for the DCMST and for the B-DCMST along with a .txt file that describes the benchmark instance files. The second directory, denominated as “B-HCMST”, contains the benchmark instances for the HCMST and for the B-DCMST along with a .txt file that describes the benchmark instance files. All benchmark instances were employed in the computational experiments of [3]. Fig. 3(a) shows an excerpt from the instance TC-16-7.dat from the first sub-directory, while Fig. 3(b) shows an excerpt from the instance TC-11-100-9.dat from the second sub-directory.

Fig. 3.

Fig 3:

Excerpts from possible benchmark instances obtained from the generator algorithm.

For the “B-DCMST” directory, the instance name is given by a tuple <a>-<b>-<c>.dat. The value of a denotes the instance topology, i.e., TC or TE. Furthermore, the entry b gives the number of nodes in the benchmark instance. Besides that, the symbol c presents the seed passed to the pseudo-random number generator. This directory contains a total of 320 benchmark instances whose size varies from 10 to 25 nodes. It is possible to observe that exists 20 different instances of each size, which were created using distinct seeds for the pseudo-random number generator.

For the “B-HCMST” directory, the instance name is given by a tuple <a>-<b>-<c>-<d>.dat. The symbols a, b, and c denote the same parameters as described for the “B-DCMST” directory. In addition, the entry d gives the maximum delay of the benchmark instance. This directory also contains 320 benchmark instances whose size varies from 10 to 25 nodes. Similarly, there exists 20 different instances of each size that were created using different seed values for the pseudo-random number generator.

3.3. Directory results

This directory contains 4 .csv files reporting the results obtained by the algorithms of [3] on the benchmark instances contained in the directory “Instances”. The files files “B-HCMST-cost.csv” and “B-HCMST-hop.csv” give the results of the computational experiments for the B-HCMST, while “B-DCMST-cost.csv” and “B-DCMST-delay.csv” report the computational results for the B-DCMST.

Every .csv file is composed of 5 columns. The first column denotes the benchmark instance topology, i.e., TC or TE. The second column gives the number of nodes in the instance, while the third column presents the seed used for the pseudo-random number generator. The fourth and fifth columns give the results of the algorithm. The fourth column reports the status of the algorithm, which can be “optimal” if an optimal solution was found, or “interrupted” if an optimal solution was not found in less than 10 hours. The fifth column report the time spent running the algorithm.

4. Experimental design, materials and methods

The benchmark instances contained in directory “Instances” were artificially generated using the C++ algorithm give in this dataset. The data of the .csv files of directory “Results” were obtained using two different implementations of the Augmented ε-constraints algorithm (AUG) [7]. Each .csv file give the result of a different implementation of AUG for a different problem. The raw data in files “B-HCMST-cost.csv” and “B-HCMST-hop.csv” were employed to generate Table 1 of [3], while the raw data in files “B-DCMST-cost.csv” and “B-DCMST-delay.csv” were used as source for computing Table 2 of [3]. The raw data of these files were also employed to plot Figs. 4 and 5 of [3].

Limitations

The data reported in directory “Results” is not sufficient for reconstructing Tables 1 and 2 of [3] as the size of the Pareto-sets are not given in this repository.

Ethics statement

The authors have read and follow the ethical requirements for publication in Data in Brief and confirm that the current work does not involve human subjects, animal experiments, or any data collected from social media platforms.

CRediT authorship contribution statement

Iago A. Carvalho: Conceptualization, Data curation, Writing – original draft. Amadeu A. Coco: Writing – review & editing.

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, the Conselho Nacional de Desenvolvimento Científico e Tecnológico - Brasil (CNPq), and the Fundação de Amparo à Pesquisa do Estado de Minas Gerais - Brasil (FAPEMIG).

Data availability

References

  • 1.Gouveia L. Using the Miller-Tucker-Zemlin constraints to formulate a minimal spanning tree problem with hop constraints. Appl. Comput. Oper. Res. Miner. Ind., Proc. Int. Symp., 30th. 1995;22(9):959–970. doi: 10.1016/0305-0548(94)00074-I. [DOI] [Google Scholar]
  • 2.Salama H.F., Reeves D.S., Viniotis Y. Proceedings Second IEEE Symposium on Computer and Communications. IEEE; 1997. The delay-constrained minimum spanning tree problem; pp. 699–703. [DOI] [Google Scholar]
  • 3.Carvalho I.A., Coco A.A. On solving bi-objective constrained minimum spanning tree problems. J. Global Optim. 2023:1–23. doi: 10.1007/s10898-023-01295-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.de Almeida A.M., Martins P., de Souza M.C. Min-degree constrained minimum spanning tree problem: complexity, properties, and formulations. Int. Trans. Oper. Res. 2012;19(3):323–352. doi: 10.1111/j.1475-3995.2011.00830.x. [DOI] [Google Scholar]
  • 5.Carvalho I.A., Ribeiro M.A. An exact approach for the minimum-cost bounded-error calibration tree problem. Ann Oper Res. 2020;287(1):109–126. doi: 10.1007/s10479-019-03443-4. [DOI] [Google Scholar]
  • 6.Matsumoto M., Nishimura T. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans Model Comput Simul. 1998;8(1):3–30. doi: 10.1145/272991.272995. [DOI] [Google Scholar]
  • 7.Mavrotas G. Effective implementation of the ε-constraint method in multi-objective mathematical programming problems. Appl. Math. Comput. 2009;213(2):455–465. doi: 10.1016/j.amc.2009.03.037. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement


Articles from Data in Brief are provided here courtesy of Elsevier

RESOURCES