PeerJ. 2021 May 5;9:e11348. doi: 10.7717/peerj.11348

Table 6. Feature comparison between dRep, Assembly-Dereplicator (A-D) and TQMD.

Feature	dRep	A-D	TQMD
main engine(s)	Mash + ANIm (or gANI)	Mash	JELLYFISH or Mash
other dependencies	CheckM (optional)	none	QUAST (optional), RNAmmer (optional), CD-HIT-EST (optional), Forty-Two (optional), CheckM (optional)
relational database	N	N	Y
genome source	custom	custom	RefSeq, GenBank, custom
taxonomic filters	N	N	Y (when downloading and clustering)
automatic genome download	N	N	Y
distance metric(s)	Mash distance (estimated JI) then ANI	Mash distance (estimated JI)	1-JI (exact) or Mash distance (estimated JI) or 1-IGF (exact)
heuristic(s)	biphasic approach: Mash for fast and rough clustering followed by ANI for slow and accurate clustering	d-and-c strategy (serial)	iterative greedy algorithm (serial) + d-and-c strategy (parallel)
stop condition(s)	unspecified	first failure to dereplicate any serial batch	any of 3 possible cut-offs (number of rounds, number of representatives, clustering ratio)
d-and-c dividing scheme	unspecified	random	random or taxonomic
selection of representatives	formula based on genome size, assembly quality and contamination level (incl. strain heterogeneity)	assembly quality	formula based on genome size, assembly quality, annotation richness and contamination level (fully customisable with 30 possible metrics)
parameterization of representative selection	Y (parameter weights)	N	Y (simplified formula)
grid engine support	N	N	Y (SGE/OGE) (optional)
distribution	source (pip), conda, Galaxy	source	source (Bitbucket), Singularity container
CPU usage	fixed on launch	fixed on launch	specified as a maximum (decreases over time)

Notes.

JI: Jaccard Index
IGF: Identical Genome Fraction
ANI: average nucleotide identity
d-and-c: divide-and-conquer
SGE/OGE: Sun/Open Grid Engine
Y: present feature
N: absent feature
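
The distance metrics listed in the table can be illustrated with a minimal sketch. It is not code from any of the three tools. It assumes genomes are plain strings, computes the exact Jaccard Index on full k-mer sets, and applies the published Mash formula D = -(1/k) ln(2j / (1 + j)) to that exact value, whereas Mash itself estimates j from MinHash sketches. The function names are hypothetical.

```python
import math


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_index(a, b):
    """Exact Jaccard Index of two k-mer sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 1.0  # assumption made here: two empty sets count as identical
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """The '1-JI (exact)' distance from the table."""
    return 1.0 - jaccard_index(a, b)


def mash_distance(j, k):
    """Mash distance for a Jaccard Index j and k-mer size k."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: treated as maximal distance
    # Clamp at 0 to avoid a negative zero when j == 1.
    return max(0.0, -math.log(2.0 * j / (1.0 + j)) / k)


if __name__ == "__main__":
    k = 4  # toy value; real analyses use much larger k
    a = kmers("ACGTACGT", k)
    b = kmers("ACGTTTTT", k)
    j = jaccard_index(a, b)
    print(j, jaccard_distance(a, b), mash_distance(j, k))
```

The two toy sequences share one 4-mer out of seven distinct ones, so the exact JI is 1/7. The point of the sketch is only that both distances derive from the same quantity: 1-JI is linear in the shared k-mer fraction, while the Mash distance rescales it logarithmically by k.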