2021 May 5;9:e11348. doi: 10.7717/peerj.11348

Figure 2. Illustration of the divide and conquer strategy of the dereplication phase.

Starting from a list of bacterial genomes downloaded from RefSeq (or GenBank), TQMD either sorts the list (by the NCBI taxonomic lineage of each genome) or randomizes it, then splits it into packs of a given size. Each pack can thus be dereplicated separately, and the packs can be processed in parallel. The resulting lists of representative genomes are then merged, and TQMD decides whether it can stop or must feed the merged list back in for another round.
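The loop described in the caption can be sketched in Python as follows. This is a minimal illustration, not TQMD's actual code: the function names, the genome record layout (`id`, `lineage`), the `dereplicate_pack` callback and the stopping rule (stop once at most `max_genomes` remain, or once a round removes nothing) are all assumptions made for the example, since the caption does not specify them. Packs are processed sequentially here; because each pack is independent, they could equally be dispatched to parallel workers.

```python
import random
from typing import Callable, Dict, List

Genome = Dict[str, object]  # hypothetical record: {"id": str, "lineage": tuple}


def split_into_packs(items: List, pack_size: int) -> List[List]:
    """Cut a list into consecutive packs of at most pack_size items."""
    return [items[i:i + pack_size] for i in range(0, len(items), pack_size)]


def divide_and_conquer(
    genomes: List[Genome],
    dereplicate_pack: Callable[[List[Genome]], List[Genome]],
    pack_size: int,
    max_genomes: int,          # assumed stopping threshold, not from the source
    sort_by_lineage: bool = True,
    max_rounds: int = 10,      # safety cap, also an assumption
    seed: int = 0,
) -> List[Genome]:
    rng = random.Random(seed)
    current = list(genomes)
    for _ in range(max_rounds):
        # Either sort by taxonomic lineage or randomize the list.
        if sort_by_lineage:
            current.sort(key=lambda g: g["lineage"])
        else:
            rng.shuffle(current)
        # Dereplicate each pack on its own, then merge the representatives.
        packs = split_into_packs(current, pack_size)
        merged = [rep for pack in packs for rep in dereplicate_pack(pack)]
        no_progress = len(merged) == len(current)
        current = merged
        # Decide whether to stop or feed the merged list into another round.
        if len(current) <= max_genomes or no_progress:
            break
    return current


def toy_dereplicate(pack: List[Genome]) -> List[Genome]:
    """Toy stand-in for real dereplication: keep the first genome per genus."""
    seen, kept = set(), []
    for genome in pack:
        genus = genome["lineage"][-1]
        if genus not in seen:
            seen.add(genus)
            kept.append(genome)
    return kept
```

With the toy callback, redundancy that straddles a pack boundary survives the first round and is only removed once the merged list is split again, which is why the caption's strategy may need several rounds.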