Table 1. NCBI database sources used for the probe specificity check.
Source | Number of sequences | Database fraction |
---|---|---|
Bacteria | 7 658 345 | 7.55% |
Environmental samples | 7 276 975 | 7.18% |
Invertebrates | 27 651 271 | 27.27% |
Patented sequences | 31 140 928 | 30.71% |
Plants | 3 798 824 | 3.75% |
Viruses | 1 837 439 | 1.81% |
Archaea | 38 310 | 0.04% |
Fungi | 3 889 143 | 3.84% |
Protozoa | 3 880 518 | 3.83% |
WGS project sequences | 14 220 046 | 14.02% |
Total number of sequences | 101 391 799 | 100.00% |
For each source, the number of sequences and its share of the entire data pool are listed.
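
The database fractions in Table 1 follow directly from the sequence counts: each count is divided by the total and expressed as a percentage. As a minimal sketch for checking the table, the calculation could look as follows. The counts are copied from Table 1; the names `counts` and `database_fractions` are illustrative and are not taken from the original analysis.

```python
# Sequence counts copied from Table 1 (NCBI database sources).
counts = {
    "Bacteria": 7_658_345,
    "Environmental samples": 7_276_975,
    "Invertebrates": 27_651_271,
    "Patented sequences": 31_140_928,
    "Plants": 3_798_824,
    "Viruses": 1_837_439,
    "Archaea": 38_310,
    "Fungi": 3_889_143,
    "Protozoa": 3_880_518,
    "WGS project sequences": 14_220_046,
}


def database_fractions(counts):
    """Return each source's share of the whole pool, in percent.

    Hypothetical helper for illustration only.
    """
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


total = sum(counts.values())
fractions = database_fractions(counts)

# Print one row per source in the same layout as Table 1.
for name, n in counts.items():
    print(f"{name} | {n:,} | {fractions[name]:.2f}% |".replace(",", " "))
print(f"Total number of sequences | {total:,} | 100.00% |".replace(",", " "))
```

Rounded to two decimals, the recomputed shares agree with the values listed in the table, and the counts sum to the stated total.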