NSM |
Near a small molecule in the crystal structure |
NSM-valid |
Subset of NSM interactions annotated as valid in the MOAD database [26], [27] |
NSM-invalid |
Subset of NSM interactions annotated as invalid in the MOAD database [26], [27] |
CSA |
Catalytic residue from the Catalytic Site Atlas [28] with evidence type LIT |
Text Residue |
Physically verified residue mentioned in the abstract of the primary PDB reference |
Transfer MSA |
Protein-level or family-level multiple sequence alignment used for transferring annotations between protein domains |
Conservation MSA |
Subset of family-level transfer MSAs involving ten or more distinct proteins, used to calculate residue conservation scores |
S |
Comprehensive set of 106,411 protein domains |
Sx |
Subset of S consisting of 98,934 domains on which structure-based prediction was performed |
C |
Corpus of 17,595 MEDLINE abstracts representing primary references for PDB entries |
E1 |
Evaluation corpus of 813 abstracts annotated with mutation mentions, compiled for MutationFinder |
E1_dev |
Development subset of E1, consisting of 305 abstracts |
E1_test |
Test subset of E1, consisting of 508 abstracts |
E2 |
Evaluation corpus of 100 abstracts annotated with both amino acid residues and mutation mentions, compiled by Nagel et al. [33] |
E3 |
Evaluation corpus of 50 full-text articles containing mentions of residues and mutations, compiled by the authors for this study |