Table 1. Summary of the six datasets. Each dataset is numbered, named, and described, together with its intended use and source reference.
Dataset | Name | Description | Intended use | Reference |
---|---|---|---|---|
1 | Boston outbreak | A cohort of 63 samples from a real outbreak with three introductions, metagenomic approach | To understand the features of virus transmission in a real outbreak setting. | Lemieux et al. (2021) |
2 | CoronaHiT rapid | A cohort of 39 samples prepared with an 18 h wet-lab protocol and sequenced on two platforms (Illumina vs MinION), amplicon-based approach | To verify that a bioinformatics pipeline finds virtually no differences between sequences from the same genome run on different platforms. | Baker et al. (2021) |
3 | CoronaHiT routine | A cohort of 69 samples prepared with a 30 h wet-lab protocol and sequenced on two platforms (Illumina vs MinION), amplicon-based approach | To verify that a bioinformatics pipeline finds virtually no differences between sequences from the same genome run on different platforms. | Baker et al. (2021) |
4 | VOI/VOC lineages | A cohort of 16 samples from 11 representative CDC-defined VOI/VOC^a lineages as of 05/30/2021, amplicon-based approach | To benchmark lineage-calling bioinformatics software, especially for VOI/VOCs. | This study |
5 | Non-VOI/VOC lineages | A cohort of 39 samples from representative non-VOI/VOC^a lineages, amplicon-based approach | To benchmark lineage-calling bioinformatics software, nonspecific to VOI/VOCs. | This study |
6 | Failed QC | A cohort of 24 samples that failed basic QC metrics, covering 8 possible failure scenarios, amplicon-based approach | To serve as controls to test bioinformatics QC cutoffs. | This study |
Notes.
^a VOI, variant of interest; VOC, variant of concern.
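
For readers who want to script against these datasets, the table can be held as a small set of records and filtered by intended use. The following is a minimal sketch only: the `BenchmarkDataset` class, the `purpose` tags, and the `by_purpose` helper are hypothetical names introduced here for illustration and are not part of the study's software. The sample counts, names, and approaches are taken from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkDataset:
    number: int
    name: str
    samples: int   # cohort size as listed in Table 1
    approach: str  # "metagenomic" or "amplicon"
    purpose: str   # hypothetical short tag summarising the "Intended use" column


# Rows of Table 1; the purpose tags are illustrative assumptions.
DATASETS = [
    BenchmarkDataset(1, "Boston outbreak", 63, "metagenomic", "outbreak"),
    BenchmarkDataset(2, "CoronaHiT rapid", 39, "amplicon", "platform-concordance"),
    BenchmarkDataset(3, "CoronaHiT routine", 69, "amplicon", "platform-concordance"),
    BenchmarkDataset(4, "VOI/VOC lineages", 16, "amplicon", "lineage-calling"),
    BenchmarkDataset(5, "Non-VOI/VOC lineages", 39, "amplicon", "lineage-calling"),
    BenchmarkDataset(6, "Failed QC", 24, "amplicon", "qc-control"),
]


def by_purpose(purpose):
    """Return the datasets whose (hypothetical) purpose tag matches."""
    return [d for d in DATASETS if d.purpose == purpose]


# Sum of the cohort sizes listed in Table 1 (samples, not sequencing runs).
total_samples = sum(d.samples for d in DATASETS)

# Names of the datasets intended for benchmarking lineage-calling software.
lineage_sets = [d.name for d in by_purpose("lineage-calling")]
```

Note that datasets 2 and 3 were each sequenced on two platforms, so the number of sequencing runs is larger than the sample counts used above.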