2022 Nov 17;2:1033775. doi: 10.3389/fbinf.2022.1033775

TABLE 2.

Class imbalance for CATH hierarchy (a).

| | Alpha | Beta | Alpha/beta | Few secondary structures |
|---|---|---|---|---|
| Number of queries | 2,668 | 2,328 | 5,773 | 105 |
| Number of targets | 3,987 | 3,159 | 7,105 | 182 |
| QrawTop1 family, knnProtT5 | 65.6% | 72.9% | 73.7% | 53.3% |
| QrawTop1 family, MMseqs2 | 35.9% | 36.6% | 43.8% | 45.7% |
| QrawTop1 class, knnProtT5 | 93.1% | 92.1% | 95.7% | 58.1% |
| QrawTop1 class, MMseqs2 | 57.6% | 56.7% | 79.3% | 45.7% |
(a) Data set: CATH20 (redundancy reduced at PIDE ≤ 20). Performance measure: QrawTop1 (Eq. 1), the percentage of queries for which the first hit was correct (same CATH identifier). Columns: CATH classes; at the level of class (C), CATH (Orengo et al., 1997; Sillitoe et al., 2019) distinguishes between mostly-alpha, mostly-beta, mixed alpha/beta, and “few secondary structures”. Rows: the number of queries and targets, followed by two ways of compiling accuracy. The first (QrawTop1 family) is the fraction of queries whose top hit is from the same CATH family; the second (QrawTop1 class) does the same but considers one level higher in the CATH hierarchy.
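The two accuracy rows can be illustrated with a minimal sketch of a top-1 score computed at different depths of the CATH identifier. This is not the authors' code and does not reproduce Eq. 1 exactly. It assumes that identifiers are dot-separated strings, that "family" corresponds to all four fields and "class" to the first field, and that a query without any hit counts as incorrect. The function names and the toy identifiers are hypothetical.

```python
from typing import Optional, Sequence


def cath_prefix(cath_id: str, level: int) -> str:
    """Return the first `level` dot-separated fields of a CATH identifier."""
    return ".".join(cath_id.split(".")[:level])


def q_raw_top1(
    query_labels: Sequence[str],
    top_hit_labels: Sequence[Optional[str]],
    level: int,
) -> float:
    """Percentage of queries whose top hit matches the query at `level`.

    Assumption: a query with no hit (None) is counted as incorrect.
    """
    if len(query_labels) != len(top_hit_labels):
        raise ValueError("one top hit (or None) is required per query")
    if not query_labels:
        raise ValueError("at least one query is required")
    correct = sum(
        hit is not None and cath_prefix(query, level) == cath_prefix(hit, level)
        for query, hit in zip(query_labels, top_hit_labels)
    )
    return 100.0 * correct / len(query_labels)


# Toy labels, made up for illustration only.
queries = ["1.10.8.10", "2.40.50.140", "3.40.50.300", "3.30.70.330"]
top_hits = ["1.10.8.10", "2.60.40.10", "3.40.50.720", None]

family_score = q_raw_top1(queries, top_hits, level=4)  # all four fields must match
class_score = q_raw_top1(queries, top_hits, level=1)   # only the class field must match
print(f"family: {family_score:.1f}%, class: {class_score:.1f}%")
```

Because a match on all four fields implies a match on the first field, the class-level score in this sketch can never be lower than the family-level score for the same hits, which is consistent with the pattern in the table.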