TABLE 1.
Advantages and disadvantages of whole genome sequencing platforms.
Platform | Read length | Technique | Advantages | Disadvantages |
Illumina (MiSeq) | Short (2 × 300 bp) | Amplifies fragmented DNA and primers on a chip holding oligonucleotides with reversible dye-terminator bases before capturing the fluorescently labeled terminator nucleotides using a prepared library. | Allows for antibiotic resistance prediction, construction of novel TB genomes, and SNP and indel analysis. Short read lengths enable accurate read data and a low per-base error rate. Able to generate paired-end reads (Illumina, 2020). SMOR analysis determines low-frequency alleles in complex samples using targeted amplicons and deep sequencing, giving increased sensitivity over standard WGS (Colman et al., 2015). Superior performance over Ion Torrent (PGM) for GC rich (homopolymer) regions (Zhang G. et al., 2015). High throughput and high sequence yield. Low cost per sample at $22.41 (WHO, 2018b). Low input DNA (1 ng) required for library preparation, allowing for use directly on clinical specimens (WHO, 2018b). Shorter sample and library preparation (Loman et al., 2012a). | Long run times and large data output per run. High instrument cost. Short read lengths are problematic for repetitive regions like the PE and PPE gene families, which are generally excluded (Meehan et al., 2019). |
Ion Torrent (PGM) | Short (400–600 bp) | Semiconductor chip detects the protons (hydrogen ions) released during DNA polymerization as an electrical signal. Uses emulsion PCR, a washing step and library preparation. | Allows for antibiotic resistance prediction, construction of novel TB genomes and full-length gene analysis for novel mutations (Daum et al., 2012; ION Torrent Next-Generation Sequencing Technology, 2020). Short read lengths enable accurate read data and a low per-base error rate. Longer read length compared to Illumina. Short run time and low data output per run. Moderate instrument cost (WHO, 2018b). | Low throughput, few samples per run and more hands-on time required. Reads obtained are single-stranded (WHO, 2018b). High cost per sample at $48.75 (WHO, 2018b). Requires a compressed nitrogen cylinder and water purification system (WHO, 2018b). Requires preparation of amplified sequence libraries through emulsion PCR and enrichment stages off the instrument (Loman et al., 2012a). Associated with homopolymer errors (Loman et al., 2012b). |
Oxford Nanopore Technologies (MinION) | Long (100,000–2 million bp) | Single-molecule, real-time nanopore sequencing detects changes in the ionic current as DNA strands pass through a biological nanopore. Read length depends on library preparation, so the user can choose it. | Allows for antibiotic resistance prediction, construction of novel TB genomes, SNP and indel analysis, and detection of genomic rearrangements and nucleotide modifications such as cytosine methylation. Longest individual read length enables easier assembly, even for un-sheared DNA. Short run time. Lowest instrument cost at $1000. Portable, palm-sized instrument. Sample preparation and library steps are shorter (WHO, 2018b). Can continue sequencing until sufficient genome coverage is obtained, avoiding sequencing failures from low bacterial loads (Votintseva et al., 2017b). Low cost for high throughput ($16.67 per sample for 54 samples) (WHO, 2018b). | High per-base error rate (20–35%) (WHO, 2018b). Lower single raw-read accuracy. Moderate data output per run (WHO, 2018b). Not intended for clinical use. Unable to reliably detect variant frequencies below 40%; requires resistant subpopulations to be at least 50% present in a mixed sample for low-frequency allele detection (Tafess et al., 2019). High cost for lower throughput ($42.86 per sample for 22 samples) (WHO, 2018b). |
PacBio (RSII) | Long (60,000 bp) | Single Molecule Real Time (SMRT) sequencing uses fluorescence detection. | Allows for antibiotic resistance prediction, construction of novel TB genomes, and SNP, indel and epigenetic analysis (McNerney et al., 2017; Lee et al., 2019). Long read length enables easier assembly of repetitive gene regions (Ley et al., 2019). Short run time. Low data output per run. | High per-base error rate and lowest raw-read accuracy. High instrument cost ($750,000) and difficult installation. Moderate throughput and limited capacity to multiplex (WHO, 2018b). |
bp, base pairs; PGM, Personal Genome Machine; TB, tuberculosis; SNP, single nucleotide polymorphism; SMOR, Single Molecule-Overlapping Reads; SMRT, Single Molecule Real Time; NGS, next generation sequencing; WGS, whole genome sequencing; GC, guanine-cytosine; DNA, deoxyribonucleic acid; PE, proline-glutamic acid; PPE, proline-proline-glutamic acid; PCR, polymerase chain reaction.
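
The per-sample costs quoted in Table 1 can be compared directly, and the two MinION figures suggest that the cost of a run is roughly fixed and simply divided among the multiplexed samples. The short Python sketch below illustrates that arithmetic. The per-sample values are taken from the table (WHO, 2018b); the per-run totals are back-calculated here as an assumption and are not figures reported in the source, and the function names are hypothetical.

```python
# Per-sample costs in USD as quoted in Table 1 (WHO, 2018b).
PER_SAMPLE_COST_USD = {
    "Illumina (MiSeq)": 22.41,
    "Ion Torrent (PGM)": 48.75,
    "ONT MinION, 54 samples": 16.67,
    "ONT MinION, 22 samples": 42.86,
}


def implied_run_cost(per_sample_cost, samples_per_run):
    """Back-calculate an approximate per-run cost.

    Assumption (not stated in the table): the quoted per-sample cost is
    the total run cost divided evenly among the multiplexed samples.
    """
    return round(per_sample_cost * samples_per_run, 2)


def rank_by_cost(costs):
    """Return (platform, cost) pairs sorted from cheapest to most expensive."""
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for platform, cost in rank_by_cost(PER_SAMPLE_COST_USD):
        print(f"{platform}: ${cost:.2f} per sample")

    # Implied MinION run totals under the even-split assumption.
    print(implied_run_cost(16.67, 54))
    print(implied_run_cost(42.86, 22))
```

Under this even-split assumption, both MinION configurations imply a run cost in the region of $900 to $950, which is consistent with the table's point that the per-sample cost depends mainly on how many samples share a run.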