Table 2. Different methods and different gold standards<sup>a</sup>.
Assignment used for training | Data set used for evaluation | Correctly predicted one-domain (random) | Correctly predicted two-domains (random) | Linker position for two-domain proteins (equal split) |
---|---|---|---|---|
SCOP | set_SCOP | 49% (83%) | 49% (15%) | 50% (42%) |
SCOP | set_CATH | 69% (88%) | 44% (11%) | 40% (27%) |
SCOP | set_Jones | 42% (76%) | 58% (18%) | 53% (49%) |
CATH | set_SCOP | 58% (88%) | 54% (10%) | 49% (52%) |
CATH | set_CATH | 52% (88%) | 53% (14%) | 58% (47%) |
CATH | set_Jones | 42% (76%) | 59% (18%) | 53% (53%) |
SCOP+CATH | set_SCOP | 73% (88%) | 40% (10%) | 46% (44%) |
SCOP+CATH | set_CATH | 73% (88%) | 41% (10%) | 47% (39%) |
SCOP+CATH | set_Jones | 60% (76%) | 50% (18%) | 51% (43%) |
PrISM | set_PrISM | 63% (63%) | 55% (24%) | 48% (30%) |
PrISM | set_Jones | 52% (76%) | 52% (18%) | 42% (53%) |
DomSSEA<sup>b</sup> | set_Jones | 82% (76%) | 46% (17%) | 49% (49%) |
DGS-M<sup>b</sup> | set_Jones | 100% (76%) | 0% (17%) | 46% (49%) |
<sup>a</sup>The leftmost column distinguishes the different versions of CHOPnet (trained on SCOP, CATH, SCOP+CATH and PrISM) from previously published methods [DomSSEA (47) and DGS-M (43)]. The second column identifies the test set [set_Jones taken from (47)]. The third and fourth columns give the accuracy in predicting the number of domains for one-domain and two-domain proteins, respectively (values in brackets give the accuracy of random predictions). The last column gives the percentage of two-domain proteins for which the linker region was predicted within 20 residues of the observed position (values in brackets give the performance of an equal split).
<sup>b</sup>All values taken directly from a previous publication (47).
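The linker-position measure in the last column can be illustrated with a minimal sketch. It assumes that each two-domain protein has a single observed boundary given as a residue index, and that the equal-split baseline places the boundary at the midpoint of the sequence; the function names and the toy data are hypothetical and are not taken from the original evaluation.

```python
from typing import Sequence

TOLERANCE = 20  # residues, as stated in the table footnote


def linker_accuracy(predicted: Sequence[int],
                    observed: Sequence[int],
                    tolerance: int = TOLERANCE) -> float:
    """Percentage of proteins whose predicted boundary lies within
    `tolerance` residues of the observed boundary."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one protein is required")
    hits = sum(abs(p - o) <= tolerance for p, o in zip(predicted, observed))
    return 100.0 * hits / len(observed)


def equal_split(lengths: Sequence[int]) -> list[int]:
    """Baseline (assumed definition): boundary at the sequence midpoint."""
    return [length // 2 for length in lengths]


# Toy data, invented for illustration only.
lengths = [200, 300, 150, 400]    # sequence lengths
observed = [95, 210, 80, 150]     # observed boundary positions
predicted = [110, 200, 40, 160]   # boundaries from some predictor

method_pct = linker_accuracy(predicted, observed)
baseline_pct = linker_accuracy(equal_split(lengths), observed)
print(f"method: {method_pct:.0f}% (equal split: {baseline_pct:.0f}%)")
```

Reporting the method and the baseline side by side, as the table does, shows how much of the apparent accuracy would also be obtained by simply cutting each protein in half.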