PeerJ. 2023 Sep 8;11:e15838. doi: 10.7717/peerj.15838

Table 5. Selected clustering results with high or moderate clustering partition metrics.

Clustering validity metrics and the number of clusters k for partitions obtained from different data representations. Repr. denotes the data representation used for clustering: either an embedding produced by a manifold learning algorithm (SE, Spectral Embedding; LLE, Locally Linear Embedding; t-SNE, t-distributed Stochastic Neighbor Embedding; UMAP, Uniform Manifold Approximation and Projection) or pairwise distances computed from the data (L1, Manhattan distance in the original space of taxonomic abundances). Spectral, Spectral Clustering algorithm. D-B index, Davies-Bouldin index. Silh. score, Silhouette score. DBCV, Density-Based Clustering Validation index. Ent., Entropy. Notation as in Table 2.

Dataset	Tax	Repr.	Cluster method	k	D-B index	Silh. score	DBCV	Prediction Strength	Ent.
AGP	O	L1	Spectral	2	0.60	0.60	−0.63	0.98	0.06
	O	LLE	Spectral	2	0.49	0.74	−0.86	0.94	0.06
	O	LLE	Spectral	3	0.60	0.60	−0.91	0.91	0.18
	O	SE	Spectral	2	0.50	0.68	−0.91	0.96	0.09
	O	SE	Spectral	3	0.57	0.63	−0.92	0.94	0.19
	F	t-SNE	HDBSCAN	2	1.38	0.14	0.15	1.00	0.09
	F	UMAP	HDBSCAN	2	1.02	0.17	0.22	1.00	0.06
	G	UMAP	HDBSCAN	2	1.03	0.23	0.25	1.00	0.08
HMP	O	t-SNE	HDBSCAN	2	1.00	0.13	0.12	1.00	0.06
	O	UMAP	HDBSCAN	2	0.87	0.15	0.19	1.00	0.08
	O	UMAP	HDBSCAN	3	1.02	0.06	0.19	1.00	0.16
	F	UMAP	HDBSCAN	2	1.03	0.08	0.10	1.00	0.08
	F	SE	HDBSCAN	2	0.53	0.64	−0.63	1.00	0.09
	F	t-SNE	HDBSCAN	2	1.11	0.09	0.21	0.97	0.09
	G	UMAP	HDBSCAN	2	1.24	−0.02	0.16	1.00	0.06