Table 2.
Dataset | NCI | PubChem Subset | PubChem Compound | PubChem Compound |
---|---|---|---|---|
Similarity measure | Atom pair | Atom pair | Atom pair | Fingerprint |
Total clustering time (h) | | | | |
Jarvis–Patrick | 72.9 | 7355.6 | N/A | N/A |
EI-Clustering | 3.5 | 92.2 | 1517.2 | 2869.71 |
Jaccard coefficient | 0.9913 | 0.9887 | N/A | N/A |
The table compares the time and accuracy performance of EI-Clustering with that of Jarvis–Patrick clustering when exhaustive search methods are used to generate the required nearest neighbor information. Compute time is given in hours of total CPU time. The agreement between the clustering results is given in the last row as Jaccard partition coefficients. Clustering the PubChem Compound dataset with the exhaustive search methods was not possible (N/A) because their performance is insufficient for a dataset of this size.
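To make the agreement measure in the last row concrete, the following is a minimal sketch of a Jaccard partition coefficient between two clusterings of the same items. It assumes the common pair-counting definition (the number of item pairs grouped together in both clusterings, divided by the number of pairs grouped together in at least one of them); the source does not spell out the formula, and the function name `jaccard_partition_coefficient` is hypothetical.

```python
from collections import Counter
from math import comb


def jaccard_partition_coefficient(labels_a, labels_b):
    """Pair-counting Jaccard agreement between two clusterings.

    labels_a[i] and labels_b[i] are the cluster labels of item i in the
    two clusterings. Assumed definition (not taken from the source):
    J = (pairs together in both) / (pairs together in at least one).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")

    # Sizes of each cluster, and of each overlap between clusters.
    sizes_a = Counter(labels_a)
    sizes_b = Counter(labels_b)
    overlaps = Counter(zip(labels_a, labels_b))

    # Number of item pairs that share a cluster in A, in B, and in both.
    pairs_a = sum(comb(size, 2) for size in sizes_a.values())
    pairs_b = sum(comb(size, 2) for size in sizes_b.values())
    pairs_both = sum(comb(size, 2) for size in overlaps.values())

    pairs_either = pairs_a + pairs_b - pairs_both
    # If no pair is co-clustered anywhere, the partitions agree trivially.
    return pairs_both / pairs_either if pairs_either else 1.0


if __name__ == "__main__":
    # Toy labels for five items; these are illustrative, not data from the table.
    print(jaccard_partition_coefficient([0, 0, 0, 1, 1], [0, 0, 1, 1, 1]))
```

Under this definition the coefficient depends only on which items are grouped together, not on the label values, and it equals 1 when the two clusterings are identical. Values close to 1, such as those in the table, would then indicate that nearly all co-clustered pairs are shared between the two methods.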