Table 4. Denoising of mtDNA COI amplicons generated from community samples of Collembola from two forest sites on the island of Tenerife with the pipeline of Yu et al. (2012).
Site | Site 1 | Site 1 | Site 1 | Site 2 | Site 2 | Site 2 |
Library | MID7 | MID8 | MID9 | MID10 | MID11 | MID12 |
Total read count | | | | | | |
Step 1 (quality control) | 10,413 | 10,581 | 8,416 | 7,043 | 10,635 | 10,737 |
Step 2 (PyNAST, 60%) | 10,394 | 10,572 | 8,392 | 7,040 | 10,603 | 10,712 |
Unique read count | | | | | | |
Step 2 (PyNAST, 60%) | 7,040 | 7,306 | 5,822 | 4,752 | 7,492 | 7,443 |
Step 2 (USEARCH) | 2,021 [12] | 2,032 [13] | 1,621 [16] | 1,452 [5] | 2,255 [15] | 2,147 [30] |
Step 3 (MACSE) | 709 | 719 | 305 | 152 | 940 | 817 |
Step 4 (DNACLUST, 99%) | 580 | 610 | 267 | 139 | 805 | 710 |
Step 5 (CROP, 98%) | 69 | 52 | 52 | 65 | 103 | 98 |
Steps 1–3 reduce the number of unique sequences by denoising, while steps 4 and 5 reduce it further by grouping sequences into summary clusters. See Yu et al. (2012) for a detailed explanation of each step. Bracketed values in step 2 are the numbers of sequences inferred to be chimeras by UCHIME, the de novo chimera detection function in USEARCH, all of which were removed in step 4 of PyroClean.
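
As a rough illustration of how much each step shrinks the unique read count, the percentage reduction relative to the preceding row can be computed directly from the values in Table 4. The sketch below is not part of the original analysis; the names `UNIQUE_READS`, `percent_reduction`, and `stepwise_reduction` are hypothetical and introduced here only for illustration.

```python
# Unique read counts copied from Table 4, in column order MID7..MID12.
# Chimera counts (bracketed values in the table) are not included here.
LIBRARIES = ["MID7", "MID8", "MID9", "MID10", "MID11", "MID12"]
UNIQUE_READS = {
    "Step 2 (PyNAST, 60%)": [7040, 7306, 5822, 4752, 7492, 7443],
    "Step 2 (USEARCH)": [2021, 2032, 1621, 1452, 2255, 2147],
    "Step 3 (MACSE)": [709, 719, 305, 152, 940, 817],
    "Step 4 (DNACLUST, 99%)": [580, 610, 267, 139, 805, 710],
    "Step 5 (CROP, 98%)": [69, 52, 52, 65, 103, 98],
}


def percent_reduction(before, after):
    """Percentage of unique reads lost between two consecutive rows."""
    return 100.0 * (before - after) / before


def stepwise_reduction(counts):
    """Map each step (except the first) to its per-library reduction,
    measured against the row immediately above it in the table."""
    steps = list(counts)  # dicts preserve insertion order (Python 3.7+)
    result = {}
    for previous, current in zip(steps, steps[1:]):
        result[current] = [
            round(percent_reduction(b, a), 1)
            for b, a in zip(counts[previous], counts[current])
        ]
    return result


if __name__ == "__main__":
    for step, values in stepwise_reduction(UNIQUE_READS).items():
        row = ", ".join(f"{lib}: {v}%" for lib, v in zip(LIBRARIES, values))
        print(f"{step} -> {row}")
```

Each percentage is relative to the previous row only, so the values describe the marginal effect of a step rather than the cumulative reduction from the quality-controlled reads.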