Table 3.
Performance of the matching algorithm using the 4Mycotoxins training set (1,338 sequences) and the 97AerobiotaSamples testing set by signature length λ
| Algorithm | λ | μ98 | | | | t |
|---|---|---|---|---|---|---|
| aodp | 16 | 1352 | 0.93 | 0.317 | 17.41 | 17039 |
| aodp | 24 | 1353 | 0.94 | 0.311 | 13.27 | 9720 |
| aodp | 32 | 1342 | 0.95 | 0.299 | 11.83 | 6362 |
| aodp | 40 | 1325 | 0.94 | 0.298 | 11.06 | 3031 |
| USEARCH | | | | | | 32560 |
| BLAST | | | | | | 74335 |
μ98: number of matching query sequences with similarity α ≥ 1 − 2ε = 0.98; t: running time in seconds (system description in the “Comparisons with other algorithms” section). Average values (algorithm 1) are reported for the size of the matching kernel and the number of sequences in all matching clusters. The ratio is that of the average size of the result set to the average size of the matching kernel. Running times are also reported for USEARCH and BLAST.
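
To make the running-time comparison easier to read, the following minimal sketch computes how many times faster aodp ran than USEARCH and BLAST at each signature length. It uses only the running times listed in the table. The names `AODP_SECONDS`, `BASELINE_SECONDS`, and `speedups` are illustrative and do not come from the aodp software, and the ratios assume that all runs used the same system.

```python
# Running times in seconds, copied from Table 3 (keys are signature lengths λ).
AODP_SECONDS = {16: 17039, 24: 9720, 32: 6362, 40: 3031}
BASELINE_SECONDS = {"USEARCH": 32560, "BLAST": 74335}


def speedups(aodp, baselines):
    """Return {baseline name: {λ: baseline time / aodp time}}."""
    return {
        name: {lam: t_base / t_aodp for lam, t_aodp in aodp.items()}
        for name, t_base in baselines.items()
    }


if __name__ == "__main__":
    for name, by_lambda in speedups(AODP_SECONDS, BASELINE_SECONDS).items():
        for lam, factor in sorted(by_lambda.items()):
            print(f"{name:8s} λ={lam:2d}: aodp is {factor:.1f}x faster")
```

Under these assumptions, the ratio grows with λ because the aodp running time falls as the signature length increases, while the USEARCH and BLAST times are fixed.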