Author manuscript; available in PMC: 2013 May 28.
Published in final edited form as: Pac Symp Biocomput. 2013:433–444.

Table 1. Statistics of PDB-PMID-Residue relationships in CSA.

PDB = Protein Data Bank. CSA = Catalytic Site Atlas. CSA-Lit = the subset of CSA annotations marked as based on literature. PMID = PubMed ID. A verified text residue is a residue that has been identified through text mining, and mapped to a physical residue in the corresponding PDB protein sequence. “Site” refers to a particular numbered location in a protein sequence.

| Row | Source | Set | Residues | PDB | PMIDs | (PMID,Site) |
|---|---|---|---|---|---|---|
| 1 | PDB | PDB residues, with abstract | 17904740 | 30816 | 17595 | 4797110 |
| 2 | PDB | PDB residues with verified text residues (abstracts) | 44701 | 9923 | 5236 | 14127 |
| 3 | PDB | PDB residues with verified text residues (full text) | | | 7309 | 107153 |
| 4 | CSA | PDB residues in CSA | 112031 | 17524 | | |
| 5 | CSA | PDB residues in CSA, with abstract | 94327 | 14673 | 7587 | 29447 |
| 6 | CSA | Verified text residues; match to CSA (abstracts) | 9059 | 3163 | 1630 | 2708 |
| 7 | CSA-Lit | PDB residues in CSA-Lit | 6372 | 942 | | |
| 8 | CSA-Lit | PDB residues in CSA-Lit, with abstract | 5586 | 831 | 823 | 2799 |
| 9 | CSA-Lit | PDB residues in CSA-Lit, with abstract with at least one verified text residue | 2116 | 343 | 341 | 1139 |
| 10 | CSA-Lit | Verified text residues; match to CSA-Lit (abstracts) | 878 | 259 | 259 | 476 |
| 11 | CSA-Lit | Verified text residues; match to CSA-Lit (full text) | | | 312 | 805 |
| 12 | CSA-Lit | Verified text residues; match to CSA-Lit (full text + abstract) | | | 444 | 1052 |
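
The legend's notion of a verified text residue (a residue mentioned in text that maps to a physical residue in the PDB protein sequence) can be illustrated with a minimal sketch. This is not the authors' pipeline. The mention pattern, the function name `verified_text_residues`, and the `first_position` offset are all hypothetical, and the sketch assumes that a mention such as "His57" counts as verified when the sequence holds a histidine at site 57. Real PDB numbering can contain gaps and insertion codes, which this sketch ignores.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Hypothetical mention pattern: a three-letter code, an optional hyphen,
# then a site number, e.g. "His57" or "Ser-195".
MENTION = re.compile(r"\b(" + "|".join(THREE_TO_ONE) + r")-?(\d+)\b")


def verified_text_residues(text, sequence, first_position=1):
    """Return sorted (site, one_letter_code) pairs mentioned in `text`
    whose amino acid matches `sequence` at that site.

    `first_position` is the site number of sequence[0]; it is an
    assumption of this sketch, not something taken from the paper.
    """
    verified = set()
    for name, pos in MENTION.findall(text):
        site = int(pos)
        index = site - first_position
        code = THREE_TO_ONE[name]
        # Keep the mention only if the site exists and the residue agrees.
        if 0 <= index < len(sequence) and sequence[index] == code:
            verified.add((site, code))
    return sorted(verified)
```

Under these assumptions, each distinct verified site found in one abstract would contribute one (PMID,Site) pair of the kind counted in the last column of the table.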