Author manuscript; available in PMC: 2014 Dec 1.
Published in final edited form as: Proteins. 2013 Oct 17;81(12):2096–2105. doi: 10.1002/prot.24422

Table I.

Benchmark and validation sets used for optimization and evaluation of PeptiMap peptide binding site identification

(A) Initial Calibration Benchmark (Extracted from PeptiDB 26; n=21)
| Name | bound | unbound | Peptide sequence | Cath domain | domains/multimer | treatment | rank of hit | Pepsite2 (ref. 18) ranks |
|---|---|---|---|---|---|---|---|---|
| dystrophin (WW) | 1EG4A | 1EG3A_1 | NMTPYRSPPPYVP [a] | 2.20.70.10 ([b]; 2x1.10.238.10) | 1 of 4 different domains (47-84) | split: only first domain used | 2,3 | 6,8,10 |
| sh2a1 (SH2) | 1D4TA | 1D1ZA | KSLTIYAQVQK | 3.30.505.10 | | | 1 | - |
| lsb3 sla1 (SH3) | 1SSHA | 1OOTA | GPPPAMPARPT | 2.30.30.40 | | | 2 | 1-6 |
| erbb2 (PDZ) | 1MFGA | 2H3LA | EYLGLDVPV | 2.30.42.10 | | | 1 | - |
| wdr5 (WD40) | 2H9MA | 2H14A | ARTKQT | 2.120.10.80 | central hole not accessible to peptide | mask internal sites | 3 | 1,3,5,6 |
| usp7 | 2FOJA | 2F1WA | GARAHSS | NA* (β-sandwich) | | | 1,2 | 1-6 |
| cyclophilin | 1AWR | 2ALFA | HAGPIA | 2.40.100.10 | | | 1 | 1-10 |
| p97 N-glycanase | 2HPLA | 2HPJA | DDLYG | NA* (p97) | | | 1 | - |
| traf2 | 1CZY | 1CA4A | ace-PQQATDD | 2.60.210.10 | weak homotrimer (ABC) | use monomer [c] | 3 | - |
| i-ap1 | 1JD5A | 1JD4A | AIAYFIPD | 1.10.1170.10 | weak homodimer (AB) | use monomer | 1,2 | 4 |
| gga1 | 1JWG_AC | 1JWFA | DEDLLHI | 1.25.40.90 | weak homodimer (AC - in bound structure) | use monomer | 2 | - |
| ntf2 | 1GYB_AB | 1GY7_AB | FSF | 3.10.450.50 | tight homodimer | unit: 2 chains | 1 / - | - |
| Clpx | 2DS8_AB | 2DS5_AB | ALRVVK | NA* (clpx) | tight homodimer, 2 sites | unit: 2 chains | 1,2 | - |
| calpain small subunit | 1NX1A | 1ALVAB | DAIDALSSDFT | 1.10.238.10 | tight homodimer; central hole not accessible to peptide | unit: 2 chains [c] | - | - |
| ap2 | 2VJ0A | 1B9KA_1 / 1B9KA_2 | PKGWVTFE / FEDNFVP | 2.60.40.1030 / 3.30.310.30 | 2 domains | split: both have peptide bound | - / 3 | - / - |
| pim1 kinase transferase domain | 2C3I | 2J2IB_2 | KRRRHPSG | (3.30.200.20) 1.10.510.10 | 2 domains | split: only second used | 2 | - |
| cdk2 cyclin | 2CCHB | 1H1RB | HTLKGRRLVFDN | 1.10.472.10 x2 | 2 same domains | do not split | 3 | 1,6 |
| trypsin | 2AGEX | 1UTNA | Sin-AAPR | 2.40.10.10 x2 | 2 same domains | do not split | 1 | 6 |
| Pcna | 1RXZ | 1RWZA | KSTQATLERWF | 3.70.10.10 | 2 same domains | do not split | 2 | - |
| Endothiapepsin | 1ER8 | 4APEA | PFHLLVY | 2.40.70.10 | 2 same domains | do not split | 1,2,3 | 1-6 |
(B) Validation set (Compiled from recent PDB releases 1-4/2013; n=9)
| Name | bound | unbound | Peptide sequence | Cath domain | domains/multimer | treatment | rank of hit | Pepsite2 ranks |
|---|---|---|---|---|---|---|---|---|
| γ2 adaptin Ear domain | 2YMT | 4BCXA | EWGPWV | 2.60.40.1230 | 1 domain | | 3 | - |
| βcop WD40 | 4J73 | 3MKQA | EAKKLV | 2x 2.130.10.10 + additional domain (based on 1VYH) | 3 domains; central hole not accessible to peptides | split, only first domain (2-300); site 1 masked | 2,3 | 2,3,5,6 |
| RADa | 4B3B | 4A74A | aceFHTA | 3.40.50.300 (based on 1PZN) | weak homodimer | use monomer | 2,3 | - |
| FKBP35 | 4ITZA | 3NI6A | sinALPFnit | 3.10.50.40 (based on 1KT0A) | weak homodimer | use monomer | 1 | 1-10 |
| jnk1 kinase transferase domain | 3VUH | 3ELJ | PKRPTTLNLF | (3.30.200.20); 1.10.510.10 | 2 domains | split: only second domain | - | 1-10 |
| HIF alpha | 4B7E | 2W0X | EVVKLLLEHGADVLAQD | 2-305; 1.10.287.1010 | 2 domains | split: only first domain | 1,3 | 1,4,7-10 |
| MLH1 mut alpha c-term | 4FMNA | 4E4WA | VRSKYFK | NA* | multidomain | too big without splitting [c] | - (No split information available) | - |
| demethylase | 4FWF | 4FWE | ARTMQTARKSTGG | (domain A02: 280-405 & 522-836, based on 2V1DA) | multidomain | split: only second domain | 1 | - |
| ck2 kinase | 4IB5 | 3BQC | GCRLYGFKIHGCG | 3.30.200.20; (1.10.510.10) | 2 domains | split: only first domain | 3 | - |
[a] Peptide residues accurately mapped by fragments are highlighted in bold and underlined; peptide residues with area covered by fragments are highlighted in bold but not underlined.

[b] XX: unassigned domains.

[c] Better prediction obtained with SCOP classification (1CA4: 1,2 instead of 3), or domain parser (1ALV).

* No domain assignment available based on CATH.