Author manuscript; available in PMC: 2015 Jul 28.
Published in final edited form as: Methods Enzymol. 2013;531:371–444. doi: 10.1016/B978-0-12-407863-5.00019-8

Table 3.

Comparison of OTU picking approaches. The table shows when each OTU picking approach must be used and when it cannot be applied, and briefly describes the advantages and disadvantages of each.

| | de novo | closed-reference | open-reference |
|---|---|---|---|
| Must use if | There is no reference sequence collection to cluster against (e.g., an infrequently used marker gene). | Comparing non-overlapping amplicons. The reference set of sequences must span both of the regions being sequenced. | - |
| Cannot use if | Comparing non-overlapping amplicons (e.g., V2 and V4 regions of 16S rRNA). | There is no reference sequence collection to cluster against (e.g., an infrequently used marker gene). | Comparing non-overlapping amplicons (e.g., V2 and V4 regions of 16S rRNA). There is no reference sequence collection to cluster against (e.g., an infrequently used marker gene). |
| Pros | All reads are clustered. | Fast, as it is fully parallelizable (useful for extremely large datasets). Better tree and taxonomy quality, since the OTUs are already defined on the reference set. | All reads are clustered. Fast, as it runs partially in parallel. |
| Cons | Time consuming, since it runs in serial. | Inability to detect novel diversity with respect to the reference set: reads that don’t hit the reference sequence collection are discarded, so the analysis focuses on the “already known” diversity. If the studied environment is not well characterized, a large fraction of the reads can be thrown away. | Some steps are still performed in serial. If the data set contains a lot of novel diversity with respect to the reference set, this can still be slow. |
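
The "Must use if" and "Cannot use if" rows amount to a small decision rule. The following Python sketch is one possible encoding of it. The function name `usable_otu_approaches` and its two boolean parameters are hypothetical and introduced only for illustration. They are not part of any OTU picking tool. The sketch also assumes that, when amplicons do not overlap, the caller has already checked that the reference set spans all of the sequenced regions.

```python
def usable_otu_approaches(has_reference, non_overlapping_amplicons):
    """Return the OTU picking approaches that the table does not rule out.

    has_reference: True if a reference sequence collection exists for the
        marker gene (hypothetical flag, for illustration only).
    non_overlapping_amplicons: True if the study compares amplicons that do
        not overlap, e.g. V2 and V4 regions of 16S rRNA.
    Assumption: when amplicons do not overlap, the reference set is taken
    to span every sequenced region.
    """
    approaches = []
    # de novo cannot be used when comparing non-overlapping amplicons.
    if not non_overlapping_amplicons:
        approaches.append("de novo")
    # closed-reference cannot be used without a reference collection.
    if has_reference:
        approaches.append("closed-reference")
    # open-reference needs a reference collection and overlapping amplicons.
    if has_reference and not non_overlapping_amplicons:
        approaches.append("open-reference")
    return approaches


if __name__ == "__main__":
    for has_ref in (True, False):
        for non_overlap in (True, False):
            print(has_ref, non_overlap, usable_otu_approaches(has_ref, non_overlap))
```

When exactly one approach remains, it corresponds to a "Must use if" cell. When several remain, the Pros and Cons rows (speed, retention of all reads, ability to detect novel diversity) guide the choice. An empty list means none of the three approaches fits the study as designed.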