PLoS One. 2024 Apr 9;19(4):e0301871. doi: 10.1371/journal.pone.0301871

Table 1. Statistics for the main database, for the biggest order-level databases, and for all databases combined.

The number of proteins in each database is the number of distinct sequences, not the number of protein-coding genes.

| Database | Genomes | Genera | Species | Proteins (millions) | Clusters (millions) |
|---|---|---|---|---|---|
| Main | 6,377 | 6,377 | 6,377 | 21.8 | - |
| Pseudomonadales | 4,321 | 163 | 1,807 | 11.7 | 1.9 |
| Burkholderiales | 3,772 | 340 | 1,997 | 12.3 | 2.8 |
| Actinomycetales | 3,339 | 245 | 1,891 | 7.7 | 2.2 |
| Rhizobiales | 3,271 | 215 | 1,643 | 12.5 | 2.4 |
| Lactobacillales | 3,223 | 98 | 1,071 | 3.7 | 0.9 |
| Combined | 56,186 | 6,377 | 29,413 | 217.3 | 44.3 |
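
The note under the table caption distinguishes distinct protein sequences from protein-coding genes. The sketch below illustrates that distinction on toy data; it is not the authors' pipeline. The function name, genome IDs, and sequences are all hypothetical, and it assumes that "distinct" means exact sequence identity.

```python
def count_genes_and_distinct_proteins(genomes):
    """Return (gene_count, distinct_protein_count).

    `genomes` maps a genome ID to the list of protein sequences it encodes,
    one entry per protein-coding gene (hypothetical input layout).
    """
    # Every list entry corresponds to one protein-coding gene.
    gene_count = sum(len(seqs) for seqs in genomes.values())
    # Identical sequences, within or across genomes, collapse to one entry.
    distinct = {seq for seqs in genomes.values() for seq in seqs}
    return gene_count, len(distinct)


# Toy data (made up for illustration): "MKTAYIAK" occurs three times.
toy_genomes = {
    "genome_A": ["MKTAYIAK", "MSLLNVPA", "MKTAYIAK"],
    "genome_B": ["MKTAYIAK", "MTTQAPTF"],
}

genes, proteins = count_genes_and_distinct_proteins(toy_genomes)
print(genes, proteins)  # prints "5 3"
```

Under this counting, a database's protein total can be smaller than the sum of gene counts over its genomes, because shared sequences are counted once.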