Table 4.
Dataset | Size (Mbp) | No. of sequences | Length of the longest sequence (Mbp) | Running time (s): BLASTclust | Running time (s): Gclust | No. of clusters: BLASTclust | No. of clusters: Gclust |
---|---|---|---|---|---|---|---|
Viral subset | 213 | 8584 | 2.474 | 10,075 | 245 | 8454 | 8215 |
Archaeal subset | 192 | 4135 | 3.122 | 8148 | 224 | 4085 | 2364 |
Fungal subset | 129 | 502 | 6.910 | / | 71 | / | 402 |
Bacterial subset | 331 | 14,891 | 0.997 | 73,672 | 237 | 9206 | 2284 |
Note: The parameters used in Gclust are as follows: -minlen 21 -both -nuc -threads 8 -rebuild -loadall -memiden 90; the parameters used in BLASTclust are as follows: -a 8 -p F -L 0.1 -b F -S 90. “/” means that BLASTclust could not process the fungal subset because the longest sequence was too long.
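
The running-time columns can be compared directly as a ratio of BLASTclust time to Gclust time. The following is a minimal sketch of that arithmetic, using only the values in the table; the names `running_time_s` and `speedup` are illustrative and not part of either tool, and the "/" entry is represented as `None`.

```python
# Running times in seconds, copied from the table as (BLASTclust, Gclust).
# None marks the "/" entry, where BLASTclust could not process the dataset.
running_time_s = {
    "Viral subset": (10075, 245),
    "Archaeal subset": (8148, 224),
    "Fungal subset": (None, 71),
    "Bacterial subset": (73672, 237),
}


def speedup(blastclust_s, gclust_s):
    """Return BLASTclust time divided by Gclust time, or None if BLASTclust did not run."""
    if blastclust_s is None:
        return None
    return blastclust_s / gclust_s


speedups = {name: speedup(b, g) for name, (b, g) in running_time_s.items()}

for name, ratio in speedups.items():
    label = "n/a" if ratio is None else f"{ratio:.1f}x"
    print(f"{name}: {label}")
```

The ratio is left undefined for the fungal subset rather than treated as zero or infinite, since no BLASTclust measurement exists for it.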