2022 Apr 5;5:316. doi: 10.1038/s42003-022-03261-8

Fig. 2. Distribution of protein secondary structure classes and fold classes of confident domains of AlphaFold2 models.

a Secondary structure classes assigned to SCOPe domains and to high-confidence domains in AlphaFold2 models. Four classes were considered: α, β, αβ, and small proteins. Left, SCOPe (232,630 domains); right, high-confidence domains in AlphaFold2 models (508,787 domains). The classification was performed using a bagged SVM ensemble (see Methods). SCOPe domains (left) were also classified with the SVM ensemble so that the results can be compared with those for AlphaFold2 domains (right).

b Fold classification of the high-confidence AlphaFold2 structure domains. The classification was performed with deep neural networks trained on the fold assignments provided in SCOPe (see Methods). The outer wheel indicates the fraction of each fold. Folds are ordered according to SCOPe IDs. Left, the fold distribution of AlphaFold2 domains using the deep network trained on 3D Zernike descriptors (3DZDs) of the full-atom domain surface. The inner wheel shows the fraction of secondary structure classes. Because this classification is based on the fold assignment, the fractions are overall consistent with, but not identical to, those shown in panel a. The top 10 most abundant folds are indicated. Right, the fold distribution using the deep network trained on 3DZDs of the surface shape computed from main-chain atoms.

c The 10 most abundant folds among AlphaFold2 domains. The fraction of each fold is indicated in the left wheel diagram in panel b. For each fold, an example AlphaFold2 domain is shown.
(1) Non-globular all-alpha subunits of globular proteins (a.137): A0A1D6E4Z3_F1, residues 823-895 (maize).
(2) ROP-like (a.30): A0A1D6MV33_F1, residues 758-815 (maize).
(3) Mediator hinge subcomplex-like (a.252): Q4DL50_F1, residues 384-495 (T. cruzi).
(4) BAR/IMD domain-like (a.238): Q8LE58_F1, residues 2-133 (Arabidopsis).
(5) Intrinsically disordered proteins (g.88): I1L2C2_F1, residues 210-284 (soybean).
(6) N-terminal domain of bifunctional PutA protein (a.176): A7MBM2_F1, residues 157-225 (human).
(7) L27 domain (a.194): A0A1D6PKM6_F1, residues 314-375 (maize).
(8) alpha-alpha superhelix (a.118): K7KHY8_F, residues 213-524 (soybean).
(9) Spectrin repeat-like (a.7): P38637_F1, residues 149-238_AFv1 (S. cerevisiae).
(10) SRF-like (d.88): A0A1D6NUQ9_F1, residues 2-74 (maize).
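
The relation between the outer wheel (per-fold fractions, ordered by SCOPe ID) and the inner wheel (secondary structure class fractions) in panel b can be illustrated with a short tally. This is a minimal sketch, not the authors' pipeline: the function names and the toy input are hypothetical, and the mapping from the SCOPe class letter to the four classes (in particular, grouping letters c and d as αβ) is an assumption made here for illustration.

```python
from collections import Counter

# ASSUMPTION: map the SCOPe class letter (the part of a fold ID before the dot)
# to the four classes named in the caption. Grouping c and d as "αβ" is a guess.
SCOPE_CLASS = {"a": "α", "b": "β", "c": "αβ", "d": "αβ", "g": "small"}


def _scope_sort_key(fold_id):
    """Sort 'a.30' before 'a.137' by comparing the numeric part as an integer."""
    letter, number = fold_id.split(".")
    return (letter, int(number))


def fold_fractions(fold_ids):
    """Return (outer, inner): per-fold fractions ordered by SCOPe ID,
    and per-class fractions derived from the same fold assignments."""
    counts = Counter(fold_ids)
    total = sum(counts.values())

    # Outer wheel: fraction of each fold, in SCOPe ID order.
    outer = {f: counts[f] / total for f in sorted(counts, key=_scope_sort_key)}

    # Inner wheel: sum the fold counts within each class, then normalize.
    class_counts = Counter()
    for fold_id, n in counts.items():
        letter = fold_id.split(".")[0]
        class_counts[SCOPE_CLASS.get(letter, "other")] += n
    inner = {cls: n / total for cls, n in class_counts.items()}
    return outer, inner


def top_folds(fold_ids, k=10):
    """Return the k most abundant fold IDs, most frequent first."""
    return [fold_id for fold_id, _ in Counter(fold_ids).most_common(k)]


if __name__ == "__main__":
    # Hypothetical toy assignments, one fold ID per domain (not data from the paper).
    toy = ["a.137"] * 4 + ["a.30"] * 3 + ["d.88"] * 2 + ["g.88"]
    outer, inner = fold_fractions(toy)
    print(outer)
    print(inner)
    print(top_folds(toy, k=2))
```

Because the inner wheel is computed from the fold assignments rather than from a separate class predictor, its fractions need not match an independent class-level classification such as the one in panel a.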