Table 3.

(a) Computations on cluster: wall-time seconds

Data set | eleg | Klein | HIV | drag 2 | random | genome |
---|---|---|---|---|---|---|
Size of complex | 4.4 × 10⁶ | 1.1 × 10⁷ | 2.1 × 10⁸ | 1.3 × 10⁹ | 3.1 × 10⁹ | 4.5 × 10⁸ |
Max. dim. | 2 | 2 | 2 | 2 | 8 | 2 |
javaPlex (st) | 84 | 747 | - | - | - | - |
Dionysus (st) | 474 | 1,830 | - | - | - | - |
DIPHA (st) | 6 | 90 | 1,631 | 142,559 | - | 9,110 |
Perseus | 543 | 1,978 | - | - | - | - |
Dionysus (d) | 513 | 145 | - | - | - | - |
DIPHA (d) | 4 | 6 | 81 | 2,358 | 5,096 | 232 |
Gudhi | 36 | 89 | 1,798 | 14,368 | - | 4,753 |
Ripser | 1 | 1 | 2 | 6 | 349 | 3 |

(b) Computations on cluster: CPU seconds

Data set | eleg | Klein | HIV | drag 2 | random | genome |
---|---|---|---|---|---|---|
Size of complex | 4.4 × 10⁶ | 1.1 × 10⁷ | 2.1 × 10⁸ | 1.3 × 10⁹ | 3.1 × 10⁹ | 4.5 × 10⁸ |
Max. dim. | 2 | 2 | 2 | 2 | 8 | 2 |
javaPlex (st) | 284 | 1,031 | - | - | - | - |
Dionysus (st) | 473 | 1,824 | - | - | - | - |
DIPHA (st) | 68 | 1,360 | 25,950 | 1,489,615 | - | 130,972 |
Perseus | 542 | 1,974 | - | - | - | - |
Dionysus (d) | 513 | 145 | - | - | - | - |
DIPHA (d) | 39 | 73 | 1,276 | 37,572 | 79,691 | 3,622 |
Gudhi | 36 | 88 | 1,794 | 14,351 | - | 4,764 |
Ripser | 1 | 1 | 2 | 5 | 348 | 2 |

(c) Computations on shared-memory system: wall-time seconds

Data set | eleg | Klein | HIV | drag 2 | genome | fract r |
---|---|---|---|---|---|---|
Size of complex | 3.2 × 10⁸ | 1.1 × 10⁷ | 2.1 × 10⁸ | 1.3 × 10⁹ | 4.5 × 10⁸ | 2.8 × 10⁹ |
Max. dim. | 3 | 2 | 2 | 2 | 2 | 3 |
javaPlex (st) | 13,607 | 1,358 | 43,861 | - | 28,064 | - |
Perseus | - | 1,271 | - | - | - | - |
Dionysus (d) | - | 100 | 142,055 | 35,366 | - | 572,764 |
DIPHA (d) | 926 | 13 | 773 | 4,482 | 1,775 | 3,923 |
Gudhi | 381 | 6 | 177 | 1,518 | 442 | 4,590 |
Ripser | 2 | 1 | 2 | 5 | 3 | 1,517 |

For each data set, we indicate the size of the simplicial complex and the maximum dimension up to which we construct the VR complex. For all data sets, we construct the filtered VR complex up to the maximum distance between any two points. We indicate the implementation of the standard algorithm using the abbreviation ‘st’ following the name of the package, and we indicate the implementation of the dual algorithm using the abbreviation ‘d’. The symbol ‘-’ signifies that we were unable to finish computations for this data set, because the machine ran out of memory. Perseus implements only the standard algorithm, and Gudhi and Ripser implement only the dual algorithm.

(a), (b) We run DIPHA on one node with 16 cores for the data sets eleg, Klein, and genome; on two nodes of 16 cores each for the HIV data set; on two and three nodes of 16 cores each for the dual and standard implementations, respectively, for drag 2; and on eight nodes of 16 cores each for random. (The maximum number of processes that we could use at any one time was 128.)

(c) We run DIPHA on a single core.
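
The node counts in the note for panels (a) and (b) can be checked against the stated limit of 128 processes with a little arithmetic. The following is a minimal sketch, assuming one process per core (the caption does not state this explicitly); the dictionary keys and the helper `process_count` are illustrative names, not part of DIPHA.

```python
# Cores per node and the process limit, both as stated in the caption.
CORES_PER_NODE = 16
MAX_PROCESSES = 128

# Number of nodes used for each DIPHA run on the cluster,
# transcribed from the caption for panels (a) and (b).
nodes_per_run = {
    "eleg": 1,
    "Klein": 1,
    "genome": 1,
    "HIV": 2,
    "drag 2 (dual)": 2,
    "drag 2 (standard)": 3,
    "random": 8,
}


def process_count(nodes, cores_per_node=CORES_PER_NODE):
    """Return the number of processes, ASSUMING one process per core."""
    return nodes * cores_per_node


# Process count per run under the one-process-per-core assumption.
processes = {name: process_count(n) for name, n in nodes_per_run.items()}

for name, count in processes.items():
    print(f"{name}: {count} processes")

# No run may exceed the limit quoted in the caption.
assert max(processes.values()) <= MAX_PROCESSES
```

Under this assumption, the largest run (random, on eight nodes) uses exactly the 128-process limit quoted in the caption.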