. 2009 Jun 18;106(27):11079–11084. doi: 10.1073/pnas.0905029106

Fig. 2.

Unique and repetitious structural coverage as a function of year and size of the sequence database. Coverage is the percentage of single-domain architecture (SDA) families containing at least one sequence of known three-dimensional structure (in the PDB). For unique coverage, we count each family once, whereas for repetitious coverage we count every sequence in the family. A family is counted in structural genomics coverage only if all of its known structures were determined by structural genomics; if any structure of a family is not from structural genomics, the entire family is excluded. (Left) Unique coverage with merged CDART sequence profiles increased from 17% in 1980 to 26% now, with a 5% increase since 2004 due to structural genomics. (Right) This increase in coverage occurred during a period when the number of sequences increased 900-fold (from 8,600 to 7.6 million). The upper curves show the corresponding data for repetitious coverage, which is higher at 71%; this is expected because larger families are more likely to contain a member with a known structure. Repetitious coverage indicates the maximum number of sequences (4.2 million) that could be modeled by homology. (Center) Coverage with unmerged sequence profiles is significantly lower (22% and 54% for unique and repetitious coverage, respectively); this is expected because families are smaller with unmerged sequence profiles and less likely to contain a member with a known structure.
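
The coverage definitions in the caption can be illustrated with a minimal sketch. The `Family` class, the `coverage` function, and the toy numbers are hypothetical and chosen only for illustration; they are not the authors' pipeline or data. The sketch assumes each family records its sequence count and, for each known structure, whether that structure came from structural genomics.

```python
from dataclasses import dataclass, field


@dataclass
class Family:
    """Hypothetical record for one SDA family (illustrative only)."""
    n_sequences: int
    # One entry per known structure; True if it came from structural genomics.
    structures_from_sg: list = field(default_factory=list)


def coverage(families):
    """Return unique, repetitious, and structural-genomics coverage in percent."""
    # A family is covered if it has at least one known structure.
    covered = [f for f in families if f.structures_from_sg]
    # A family counts toward structural genomics only if every one of its
    # structures came from structural genomics.
    sg_only = [f for f in covered if all(f.structures_from_sg)]
    total_sequences = sum(f.n_sequences for f in families)
    return {
        # Unique coverage: each family counts once.
        "unique": 100 * len(covered) / len(families),
        # Repetitious coverage: every sequence in a covered family counts.
        "repetitious": 100 * sum(f.n_sequences for f in covered) / total_sequences,
        "unique_sg": 100 * len(sg_only) / len(families),
    }


# Toy data (made up): two large covered families, two small uncovered ones.
toy = [
    Family(100, [False, True]),  # mixed sources, so not a structural genomics family
    Family(50, [True]),          # all structures from structural genomics
    Family(5),
    Family(5),
]
result = coverage(toy)
print(result)
```

In this toy set, half of the families are covered, but those families hold most of the sequences, so repetitious coverage exceeds unique coverage, which mirrors the caption's explanation that larger families are more likely to contain a member with a known structure.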