A Search against a Representative Subset of Structures Keeps the Main Information Retrieved from an Exhaustive Search
The plot shows two distributions of the structure similarity score S obtained from one-against-n structure database searches. The query, a cytoplasmic protein of unknown function from S. typhimurium (2gjv@1), was searched against all known biological assemblies (thin line, n = 138,294 items; November 7, 2013) and against a representative subset of them (bold line, n = 41,325 items, Sr = 90%; Experimental Procedures). The dashed vertical lines mark the thresholds above which S is considered significant at the 3σ level (Experimental Procedures). Note that the thresholds are practically identical for the two distributions.
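As a rough illustration of how such a threshold might be computed, the sketch below assumes that "significant at the 3σ level" means a score exceeding the mean of the score distribution by three standard deviations. The exact definition is given in Experimental Procedures and may differ. The function names and the synthetic scores are hypothetical and are not data from the searches described here.

```python
import random
import statistics


def three_sigma_threshold(scores):
    """Return mean + 3 * population standard deviation of the scores.

    Assumption: this is one common reading of a "3 sigma" cutoff;
    the definition used for the figure may differ.
    """
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return mean + 3.0 * sd


def significant_hits(scores):
    """Return the scores strictly above the 3-sigma threshold."""
    cutoff = three_sigma_threshold(scores)
    return [s for s in scores if s > cutoff]


if __name__ == "__main__":
    # Synthetic stand-in scores, NOT the real search results.
    rng = random.Random(0)
    full = [rng.gauss(0.0, 1.0) for _ in range(10000)]
    # A random subsample stands in for the representative subset;
    # the real subset is built by similarity filtering, not at random.
    subset = rng.sample(full, 3000)
    print("full search threshold:  ", three_sigma_threshold(full))
    print("subset search threshold:", three_sigma_threshold(subset))
```

Under this assumption, two distributions with similar mean and spread give similar thresholds regardless of the number of items searched, which is the behavior the dashed lines display.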