PeerJ. 2022 Sep 5;10:e13821. doi: 10.7717/peerj.13821

Table 1. Summary description of the six datasets. Each dataset is numbered, named, and given a description. The intended use is also listed.

| Dataset | Name | Description | Intended use | Reference |
|---|---|---|---|---|
| 1 | Boston outbreak | A cohort of 63 samples from a real outbreak with three introductions; metagenomic approach | To understand the features of virus transmission during a real outbreak setting. | Lemieux et al. (2021) |
| 2 | CoronaHiT rapid | A cohort of 39 samples prepared by an 18 h wet-lab protocol and sequenced on two platforms (Illumina vs MinION); amplicon-based approach | To verify that a bioinformatics pipeline finds virtually no differences between sequences from the same genome run on different platforms. | Baker et al. (2021) |
| 3 | CoronaHiT routine | A cohort of 69 samples prepared by a 30 h wet-lab protocol and sequenced on two platforms (Illumina vs MinION); amplicon-based approach | To verify that a bioinformatics pipeline finds virtually no differences between sequences from the same genome run on different platforms. | Baker et al. (2021) |
| 4 | VOI/VOC lineages | A cohort of 16 samples from 11 representative CDC-defined VOI/VOC^a lineages as of 05/30/2021; amplicon-based approach | To benchmark lineage-calling bioinformatics software, especially for VOI/VOCs. | This study |
| 5 | Non-VOI/VOC lineages | A cohort of 39 samples from representative non-VOI/VOC^a lineages; amplicon-based approach | To benchmark lineage-calling bioinformatics software, nonspecific to VOI/VOCs. | This study |
| 6 | Failed QC | A cohort of 24 samples that failed basic QC metrics, covering 8 possible failure scenarios; amplicon-based approach | To serve as controls to test bioinformatics QC cutoffs. | This study |

Notes.

^a VOI, variant of interest; VOC, variant of concern.