Table 1. Summary of the whole genome sequencing settings applied in each SNP pipeline.
Settings | RIVM SNP | Oxford University SNP | Research Center Borstel (MTBseq) SNP | SSI SNP |
---|---|---|---|---|
H37Rv reference genome version | 3 | 2 | 3 | 3 |
Alignment software | Bowtie | Stampy | BWA | BWA |
SNP calling software | Breseq | Samtools | Samtools | Samtools |
Minimum mean sample coverage depth | ≥ 20x | NA | ≥ 30x | ≥ 20x |
Minimum sample coverage breadth | NA | > 88% | ≥ 80% fulfilling thresholds for variant detection | ≥ 95% |
Genomic regions excluded | Repeats | Repeats | Repeats, resistance genes | Repeats |
Minimum coverage depth to support a SNP | NA | 5x (one forward, one reverse, < 10% alternative allele) | 8x (four forward, four reverse, four with phred score ≥ 20) | 8x (four forward, four reverse) |
Excluding SNPs within 12 bp | Yes | No | Yes | Yes |
Allele frequency | ≥ 80% | ≥ 90% | ≥ 75% | ≥ 85% |
Dealing with low coverage positions or positions not meeting variant call criteria when calculating the genetic distance | Report reference base | Report consensus base | Report consensus base or exclude position if data quality is below thresholds in >5% of samples | Complement with data from aligned reads if coverage is > 5x or exclude position if data quality is below threshold |
BWA: Burrows-Wheeler Alignment; MTB: Mycobacterium tuberculosis; NA: not applicable; RIVM: National Institute for Public Health and the Environment; SNP: single nucleotide polymorphism; SSI: Statens Serum Institut.
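
As an illustration of how the per-position thresholds in Table 1 combine, the following minimal sketch checks a single candidate SNP against the MTBseq column (four forward reads, four reverse reads, four reads with phred score ≥ 20, allele frequency ≥ 75%). The class `SnpThresholds`, the function `passes_thresholds` and its arguments are hypothetical names introduced here for illustration. They are not part of MTBseq or of any of the pipelines listed, and the real tools may count reads differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpThresholds:
    """Per-position criteria for accepting a SNP call (hypothetical structure)."""
    min_forward: int        # reads supporting the allele on the forward strand
    min_reverse: int        # reads supporting the allele on the reverse strand
    min_high_quality: int   # supporting reads with phred score >= 20
    min_allele_freq: float  # fraction of covering reads that carry the allele


# Values copied from the MTBseq column of Table 1.
MTBSEQ = SnpThresholds(
    min_forward=4,
    min_reverse=4,
    min_high_quality=4,
    min_allele_freq=0.75,
)


def passes_thresholds(t: SnpThresholds, forward: int, reverse: int,
                      high_quality: int, total_depth: int) -> bool:
    """Return True if a candidate SNP meets every criterion in `t`.

    Assumption: `forward` and `reverse` count reads supporting the
    alternative allele, and `total_depth` counts all reads covering
    the position.
    """
    if total_depth <= 0:
        return False
    allele_freq = (forward + reverse) / total_depth
    return (
        forward >= t.min_forward
        and reverse >= t.min_reverse
        and high_quality >= t.min_high_quality
        and allele_freq >= t.min_allele_freq
    )
```

Under these assumptions, a position covered by 12 reads with 5 forward and 5 reverse reads supporting the allele would be accepted (allele frequency about 83%), whereas the same depth with only 4 forward and 4 reverse supporting reads would be rejected (about 67%, below the 75% threshold).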