2021 Oct 13;1(1):vbab027. doi: 10.1093/bioadv/vbab027

Table 1.

Datasets used for evaluation

Name	Locus	Mean reference length (bp)	Simulated query length (bp)
bac-150	16S	1256	150
hiv-104	Viral genomes	9096	150; 500
neotrop-512	16S	1766	150; 300
tara-3748	16S	1406	150; 300
bv-797	16S	1341	150
epa-218	16S	1483	150
epa-628	5.8S	780	150
epa-714	16S	1169	150
wol-43	Microbial genomes	52 768 066	150

CPU-652	16S	1315	150
CPU-512	16S	1766	150

Notes: Overview of the datasets used in the evaluation. The columns give the dataset name used in this manuscript, the locus from which the sequences originate, the mean length of the reference sequences and the simulated read (query) lengths used during evaluation. The name of each dataset includes its number of reference sequences. The first nine datasets are used in the PAC workflow; the last two are used in the RES workflow.