Table 1. Properties of the three datasets.
Property | PSICOV | New | CASP10 |
Entries | 150 | 383 | 114 |
Median number of effective sequences | 1029 | 165 | 1224 |
Median length | 143 AA | 161 AA | 216 AA |
Part of complex | 9% | 70% | 50% |
Multi-domain | 0% | 10% | 22% |
The number of effective sequences is computed by clustering sequences with identity above a pre-computed threshold; the table reports its median for each dataset. Part of complex is the fraction of proteins that are part of a complex according to PISA [49]. Multi-domain is the fraction of proteins containing more than one Pfam domain [50]. Median resolution is the median resolution of the crystal structures, where numbers in brackets indicate the fraction of structures from NMR.
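As an illustration of how a number of effective sequences can be obtained from an alignment, the following is a minimal sketch of one common reweighting scheme: each sequence is down-weighted by the number of sequences (itself included) whose identity to it is at or above the threshold, and the weights are summed. This is an assumption about the general technique, not necessarily the exact procedure used for the table; the function names and the threshold value of 0.8 are hypothetical.

```python
from typing import Sequence


def fractional_identity(a: str, b: str) -> float:
    """Fraction of aligned columns in which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def effective_sequences(alignment: Sequence[str], threshold: float = 0.8) -> float:
    """Sum of per-sequence weights 1 / (number of neighbours at or above threshold).

    The default threshold is an illustrative assumption, not a value from the text.
    """
    total = 0.0
    for seq in alignment:
        # Each sequence counts itself as a neighbour, so the count is at least 1.
        neighbours = sum(
            1 for other in alignment
            if fractional_identity(seq, other) >= threshold
        )
        total += 1.0 / neighbours
    return total


# Toy alignment: three near-identical sequences and one unrelated sequence.
msa = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY"]
print(effective_sequences(msa, threshold=0.8))
```

In the toy alignment the three near-identical sequences each receive weight 1/3 and the unrelated sequence receives weight 1, so the four raw sequences count as 2 effective sequences. The pairwise comparison is quadratic in the number of sequences, which is acceptable for a sketch but would need optimisation for large alignments.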