2020 Jan 7;17(5):496–502. doi: 10.1016/j.gpb.2018.10.008

Table 3. Clustering results and performance of Gclust at 90% eMEMi using 16 threads

Dataset          No. of sequences   No. of clusters   Running time (min)
Viral data       9578               9101              8.7
Archaeal data    38,381             16,064            88.0
Fungal data      79,365             68,698            1322.8
Bacterial data   112,111            105,867           7678.8

Note: The parameters used for clustering are as follows: -minlen 41 -both -nuc -threads 16 -chunk 400 -loadall -memiden 90 -rebuild -ext 1 -sparse 4. Parameter “-both” indicates that Gclust compares both strands of DNA sequences. MEM, maximal exact match. eMEMi, extended MEM identity.
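
To make the table easier to compare across datasets, the following is a minimal Python sketch that derives a few ratios from the rows above (clusters per input sequence, running time in hours, and sequences processed per minute) and assembles the option list given in the note. The row values and the options are copied from the table and its note. The function names, the executable name `gclust`, the placement of the input file as a trailing argument, and the file name `genomes.fna` are assumptions made for illustration and are not confirmed by the table.

```python
# Rows copied from Table 3: (dataset, sequences, clusters, running time in minutes).
TABLE_3 = [
    ("Viral data", 9578, 9101, 8.7),
    ("Archaeal data", 38381, 16064, 88.0),
    ("Fungal data", 79365, 68698, 1322.8),
    ("Bacterial data", 112111, 105867, 7678.8),
]

# Options exactly as listed in the table note, split into separate tokens.
NOTE_OPTIONS = (
    "-minlen 41 -both -nuc -threads 16 -chunk 400 -loadall "
    "-memiden 90 -rebuild -ext 1 -sparse 4"
).split()


def summarize(rows):
    """Return derived statistics for each dataset, keyed by dataset name."""
    summary = {}
    for name, n_seqs, n_clusters, minutes in rows:
        summary[name] = {
            # Values near 1.0 mean few sequences were merged into shared clusters.
            "clusters_per_sequence": n_clusters / n_seqs,
            "hours": minutes / 60,
            "sequences_per_minute": n_seqs / minutes,
        }
    return summary


def build_command(input_fasta, binary="gclust"):
    """Assemble an argument list from the note's options.

    ASSUMPTION: the binary name and the trailing positional input file are
    hypothetical; check the tool's own usage text before running anything.
    """
    return [binary, *NOTE_OPTIONS, input_fasta]


if __name__ == "__main__":
    for name, stats in summarize(TABLE_3).items():
        print(
            f"{name}: {stats['clusters_per_sequence']:.3f} clusters/sequence, "
            f"{stats['hours']:.2f} h, "
            f"{stats['sequences_per_minute']:.1f} sequences/min"
        )
    print(" ".join(build_command("genomes.fna")))
```

The ratios are only a reading aid: they restate the table and say nothing about why running time varies between datasets.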