Table 2.
Parameters used in profile–sequence clustering grid search.
Parameter name | Values | Brief description |
---|---|---|
--cluster-mode | 0, 1, 2 | Clustering algorithm to use<sup>a</sup> |
--num-iterations | 1, 2, 3 | Number of iterations to do<sup>b</sup> |
-s | 1, 4, 7 | Sensitivity<sup>c</sup> |
--min-seq-id | 0.35, 0.3, 0.25, 0.2, 0.15 | Sequence identity threshold (as a fraction) |
-c/--cov | 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55 | Coverage threshold |
-e/--e-profile | 1e-5, 1e-3 | E-value threshold |
<sup>a</sup> Different algorithms are available to turn the graph of pairwise edges into clusters: 0 = set cover, 1 = single-linkage (like blastclust), 2 = greedy incremental (like CD-HIT). Details are in the MMseqs2 manual.
<sup>b</sup> Performs this many PSI-BLAST-like iterations of finding homologs and updating profiles before clustering.
<sup>c</sup> Higher sensitivity values allow less similar k-mers to count as matches that can seed an alignment.
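
To give a sense of the size of this search space, the following minimal sketch enumerates the parameter combinations in Table 2. It assumes a full factorial grid, which the table itself does not state. The names `GRID` and `iter_grid` are hypothetical and introduced only for illustration; the keys are the parameter labels exactly as listed in the table, and no MMseqs2 command line is constructed.

```python
from itertools import product

# Values copied from Table 2. The dictionary layout and the assumption that
# every combination is evaluated (full factorial) belong to this sketch only.
GRID = {
    "--cluster-mode": [0, 1, 2],
    "--num-iterations": [1, 2, 3],
    "-s": [1, 4, 7],
    "--min-seq-id": [0.35, 0.3, 0.25, 0.2, 0.15],
    "-c/--cov": [0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55],
    "-e/--e-profile": [1e-5, 1e-3],
}


def iter_grid(grid):
    """Yield one dict per parameter combination (hypothetical helper)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


if __name__ == "__main__":
    combos = list(iter_grid(GRID))
    # 3 * 3 * 3 * 5 * 7 * 2 = 1890 combinations under the full-factorial assumption
    print(len(combos))
    print(combos[0])
```

Under that assumption the grid contains 3 × 3 × 3 × 5 × 7 × 2 = 1890 parameter settings; if only a subset was actually evaluated, the dictionary can be filtered before use.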