2012 Aug 30;7(8):e44357. doi: 10.1371/journal.pone.0044357

Table 1. Number of sequences failing quality screening criteria and total number of sequences remaining (bold, italics), for standard processing pipeline and for AmpliconNoise processing.

| Screening Criteria | Standard Output | AmpliconNoise Output |
|---|---|---|
| Initial | 409,997 | 232,792 |
| <100% match to 5' primer | 31,948 | NA |
| Sequence length <120 bp | 137,753 | NA |
| Ambiguous bases present | 6,295 | 0 |
| Homopolymers >6 bases | 149 | 12 |
| Avg Qscore <25 | 8,985 | NA |
| Poorly aligning to database | 138 | 106 |
| Remainder after 1st stage screening | 253,973 | 232,674 |
| Uniques | 22,351 | 3,053 |
| Pre-cluster sequences differing by 1 bp | 10,529 uniques | NA |
| Flagged as chimeras | 1,081 | 1,460 |
| Phyla other than target | 25 | 25 |
| Net sequence read yield | 252,867 | 231,189 |
| Sequence reads per sample (+/−SE) | 4,214 +/− 101 | 3,853 +/− 99 |
| Unique sequence reads | 10,250 | 2,751 |
| OTUs | 1,166 | 792 |
| OTUs containing shared reads | 769 | 739 |
| Equalize sampling effort (subsample to 3,000 reads per sample) | | |
| OTUs per sample (+/−SE) | 112 +/− 2 | 120 +/− 2 |

The sum of sequences failing each criterion in the initial screening is greater than the number of sequences dropped because some sequences failed on multiple criteria. AmpliconNoise processing includes a test of matching to the 5' primer, and does not make use of quality scores. OTUs were defined based on a 3% sequence dissimilarity threshold, using the average neighbor method.
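The arithmetic behind the first sentence of this note can be checked directly from the table values. The following is a minimal sketch in Python, not part of the original analysis; the dictionary keys and the `summarize` helper are illustrative names chosen here. It compares the sum of per-criterion failures with the number of sequences actually dropped in the first screening stage, and confirms that the net read yield equals the first-stage remainder minus chimeras and non-target phyla.

```python
# Values copied from Table 1. All names below are illustrative, not from the paper.
standard = {
    "initial": 409_997,
    "failed": {
        "primer_mismatch": 31_948,
        "length_under_120bp": 137_753,
        "ambiguous_bases": 6_295,
        "homopolymers_over_6": 149,
        "avg_qscore_under_25": 8_985,
        "poor_alignment": 138,
    },
    "remainder": 253_973,
    "chimeras": 1_081,
    "non_target_phyla": 25,
    "net_yield": 252_867,
}

amplicon_noise = {
    "initial": 232_792,
    # Criteria reported as NA in the table are omitted here.
    "failed": {
        "ambiguous_bases": 0,
        "homopolymers_over_6": 12,
        "poor_alignment": 106,
    },
    "remainder": 232_674,
    "chimeras": 1_460,
    "non_target_phyla": 25,
    "net_yield": 231_189,
}


def summarize(pipeline):
    """Compare per-criterion failure counts with the sequences actually dropped."""
    dropped = pipeline["initial"] - pipeline["remainder"]
    flagged = sum(pipeline["failed"].values())
    expected_net = (
        pipeline["remainder"]
        - pipeline["chimeras"]
        - pipeline["non_target_phyla"]
    )
    return {
        "dropped": dropped,
        "flagged": flagged,
        # Extra failure flags beyond one per dropped sequence. This is not the
        # number of multi-failure sequences: a sequence failing three criteria
        # contributes two extra flags.
        "excess_flags": flagged - dropped,
        "net_yield_consistent": expected_net == pipeline["net_yield"],
    }


for name, pipeline in [("Standard", standard), ("AmpliconNoise", amplicon_noise)]:
    print(name, summarize(pipeline))
```

For the standard pipeline the per-criterion failures sum to 185,268 while only 156,024 sequences were dropped, so 29,244 failure flags fall on sequences that had already failed another criterion. For AmpliconNoise the two figures agree at 118, meaning no sequence in that output failed more than one of the reported criteria.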