Table 1:
Some existing algorithms commonly used by the microbiology community for TIA
Method | Function supported | Alignment method | Clustering method^a | Using reference database | Generating distance matrix | Computational complexity | Space complexity |
---|---|---|---|---|---|---|---|
DOTUR | Clustering | N/A | HC | N | Y | O(N^2) | O(N^2) |
Mothur | Sequence alignment + clustering | Profile based MSA method | HC | Y | Y | O(N^2) | O(N^2) |
ESPRIT | Sequence alignment + clustering | PSA | HC | N | Y | O(N^2) | O(N^2) |
ESPRIT-Tree | Sequence alignment + clustering | PSA | HC | N | N | O(N^1.2) | O(N) |
NAST^b | Sequence alignment | Profile based MSA method | N/A | Y | Y | O(N) | O(N^2) |
RDP/Pyro | Sequence alignment + clustering | Infernal aligner | HC | Y | Y | O(N^2) | O(N^2) |
CD-HIT | Sequence alignment + clustering | PSA | Greedy heuristic clustering | N | N | O(N^1.2) | O(N) |
UCLUST | Sequence alignment + clustering | PSA | Greedy heuristic clustering | N | N | O(N^1.2) | O(N) |
MUSCLE | Sequence alignment | MSA | N/A | N | Y | O(N^4) | O(N^2) |
^a Complete linkage is the default method in DOTUR, mothur, ESPRIT and RDP/Pyro. ESPRIT-Tree supports only average linkage.

^b NAST only supports the sequence-alignment step. By aligning query sequences against a database, its computational complexity grows linearly with respect to the number of sequences. However, according to the NAST website, it aligns at a rate of approximately 10 sequences per minute.

N/A = not applicable; N = no; Y = yes.
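
To get a feel for what the complexity columns and the NAST footnote imply at scale, the following is a rough back-of-the-envelope sketch in Python. It is not taken from any of the tools listed. The function names, the 4-byte distance entry and the lower-triangle storage layout are illustrative assumptions, and asymptotic exponents ignore constant factors, so the numbers only indicate relative growth.

```python
def nast_minutes(n_sequences, rate_per_minute=10.0):
    """Estimated NAST alignment time, using the ~10 sequences/minute
    rate quoted in the table footnote (linear in N)."""
    return n_sequences / rate_per_minute


def distance_matrix_bytes(n_sequences, bytes_per_entry=4):
    """Memory for a pairwise distance matrix, O(N^2) space.

    Assumption (hypothetical): only the lower triangle without the
    diagonal is stored, with 4 bytes per distance.
    """
    n_pairs = n_sequences * (n_sequences - 1) // 2
    return n_pairs * bytes_per_entry


def growth_factor(exponent, scale):
    """How much the cost of an O(N^exponent) method grows when the
    number of sequences is multiplied by `scale`."""
    return scale ** exponent


if __name__ == "__main__":
    n = 100_000
    print(f"NAST at 10 seq/min, N={n}: {nast_minutes(n) / 60:.1f} hours")
    print(f"Distance matrix, N={n}: {distance_matrix_bytes(n) / 1e9:.1f} GB")
    for label, exponent in [("O(N)", 1), ("O(N^1.2)", 1.2),
                            ("O(N^2)", 2), ("O(N^4)", 4)]:
        print(f"{label}: 10x more sequences -> "
              f"{growth_factor(exponent, 10):.1f}x more work")
```

Under these assumptions, a tenfold increase in the number of sequences costs roughly 16 times more work for the O(N^1.2) methods but 100 times more for the O(N^2) methods, which is why the methods that avoid generating a full distance matrix scale to much larger data sets.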