Table 1.
Name | Locus | Mean length (bp) | Query length (bp) |
---|---|---|---|
bac-150 | 16S | 1256 | 150 |
hiv-104 | Viral genomes | 9096 | 150; 500 |
neotrop-512 | 16S | 1766 | 150; 300 |
tara-3748 | 16S | 1406 | 150; 300 |
bv-797 | 16S | 1341 | 150 |
epa-218 | 16S | 1483 | 150 |
epa-628 | 5.8S | 780 | 150 |
epa-714 | 16S | 1169 | 150 |
wol-43 | Microbial genomes | 52 768 066 | 150 |
| |||
CPU-652 | 16S | 1315 | 150 |
CPU-512 | 16S | 1766 | 150 |
Notes: Overview of the datasets used in the evaluation. The columns give the dataset name used in this manuscript, the locus from which the sequences originate, the mean length of the reference sequences, and the simulated read lengths used as queries during evaluation. The name of each dataset includes its number of reference sequences. The first nine datasets are used in the PAC workflow, the last two in the RES workflow.