Table 3.
Dataset | Method | Feature Type | Total | TP | FP | Singleton | Strains Detected |
---|---|---|---|---|---|---|---|
Zymo | DADA2 | ASVs | 29 | 29 | 0 | 0 | 8 |
Zymo | mothur | OTUs | 24 | 8 | 5 | 11 | 8 |
Zymo | uparse | OTUs | 8 | 8 | 0 | 0 | 8 |
HMP | DADA2 | ASVs | 51 | 51 | 0 | 0 | 17 |
HMP | mothur | OTUs | 29 | 16 | 4 | 9 | 16 |
HMP | uparse | OTUs | 16 | 16 | 0 | 0 | 16 |
The PacBio long-read amplicon sequencing data from the Zymo and HMP mock communities were processed by DADA2, mothur and uparse (as implemented in usearch). DADA2 identifies exact amplicon sequence variants (ASVs) with single-nucleotide resolution, while mothur and uparse identify OTUs, i.e. clusters of reads within a 97% similarity threshold. Total: The number of features (ASVs or OTUs) identified. TP: True positives. FP: False positives. Singleton: Features with just one read. In these datasets, the singleton OTUs output by mothur consisted of chimeras, sequencing errors and contaminants. Strains Detected: The number of strains in the mock communities identified by each method.
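
As a reading aid, the following minimal Python sketch checks how the feature counts in Table 3 relate to one another. It assumes that TP, FP and Singleton are mutually exclusive categories that together account for Total, which holds for every row as reported but is not stated explicitly in the caption. The names `rows`, `unaccounted` and `non_tp_fraction` are hypothetical and exist only for this illustration.

```python
# Rows transcribed from Table 3: (dataset, method, total, tp, fp, singleton).
rows = [
    ("Zymo", "DADA2", 29, 29, 0, 0),
    ("Zymo", "mothur", 24, 8, 5, 11),
    ("Zymo", "uparse", 8, 8, 0, 0),
    ("HMP", "DADA2", 51, 51, 0, 0),
    ("HMP", "mothur", 29, 16, 4, 9),
    ("HMP", "uparse", 16, 16, 0, 0),
]


def unaccounted(table):
    """Return rows where TP + FP + Singleton does not equal Total.

    Assumption: the three categories partition the features.
    """
    return [r for r in table if r[3] + r[4] + r[5] != r[2]]


def non_tp_fraction(total, tp):
    """Fraction of reported features that are not true positives."""
    return (total - tp) / total


for dataset, method, total, tp, fp, singleton in rows:
    frac = non_tp_fraction(total, tp)
    print(f"{dataset:5s} {method:7s} non-TP features: {total - tp:2d} ({frac:.0%})")
```

Under that assumption, `unaccounted(rows)` is empty, and only the mothur rows contain features that are not true positives.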