Table 1.
Number of nucleotide sequences and represented species in the developed plant ITS2 and rbcL databases at several key points of the DB4Q2 workflow
| Workflow step | ITS2, without dereplication | ITS2, with dereplication | rbcL, without dereplication | rbcL, with dereplication |
|---|---|---|---|---|
| After download from NCBI | 238,018 (74,411) | 238,018 (74,411) | 201,740 (62,314) | 201,740 (62,314) |
| After culling (and dereplication) | 223,947 (70,339) | 173,597 (70,339) | 197,071 (60,769) | 135,473 (60,769) |
| After misidentification filtering | 221,954 (69,799) | 171,754 (69,785) | 195,946 (60,342) | 134,321 (60,315) |
| After amplicon-based restriction | 35,505 (15,425) | 29,545 (15,416) | 113,526 (44,269) | 81,415 (44,244) |
Numbers in parentheses give the count of species represented at each step.
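
To make the effect of each step easier to compare across markers, the counts can be expressed as fractions of the initial NCBI download. The following is a minimal sketch that uses only the "with dereplication" values from Table 1. The dictionary keys and the `retention` helper are illustrative names chosen here, not part of the DB4Q2 workflow.

```python
# (sequences, species) per step, copied from the "with dereplication"
# columns of Table 1. Step names are shorthand labels, not DB4Q2 terms.
TABLE_1 = {
    "ITS2": {
        "download": (238_018, 74_411),
        "culling": (173_597, 70_339),
        "misidentification_filtering": (171_754, 69_785),
        "amplicon_restriction": (29_545, 15_416),
    },
    "rbcL": {
        "download": (201_740, 62_314),
        "culling": (135_473, 60_769),
        "misidentification_filtering": (134_321, 60_315),
        "amplicon_restriction": (81_415, 44_244),
    },
}


def retention(marker):
    """Return {step: (sequence fraction, species fraction)} relative to download."""
    steps = TABLE_1[marker]
    base_seqs, base_species = steps["download"]
    return {
        step: (seqs / base_seqs, species / base_species)
        for step, (seqs, species) in steps.items()
    }


if __name__ == "__main__":
    for marker in TABLE_1:
        for step, (seq_frac, sp_frac) in retention(marker).items():
            print(f"{marker:5s} {step:28s} "
                  f"sequences {seq_frac:6.1%}  species {sp_frac:6.1%}")
```

Expressed this way, the amplicon-based restriction stands out as the step that removes the largest share of both sequences and species, and it does so more strongly for ITS2 than for rbcL.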