2021 Apr 30;7:612920. doi: 10.3389/fmolb.2020.612920

FIGURE 2.


Top 100 concepts from the inferred dictionary. Representative structural cartoons of the top 100 concepts from the inferred dictionary of 1,493 concepts, ranked in decreasing order of the number of secondary-structure elements (row-wise, top-left to bottom-right: c_0001 to c_0100). Strands of β-sheets are shown in red; helices in blue. (See the website for the full interactive listing.) The whole dictionary was inferred automatically, without any prior knowledge or preconceived notions of these recurrent themes. The inferred concepts subsume known patterns; for example, the figure shows: “α-β Barrel” (c_0005), “Armadillo repeat” (c_0083), “β Barrel” (c_0061), “β Propeller” (c_0004), “Icosahedral virus coat protein” (c_0067), “Immunoglobulin” (c_0062), “Jellyroll architecture” (c_0084), “Left-handed β-Helix” (c_0001), “Leucine-rich repeat” (c_0076), “Right-handed quadrilateral β-Helix” (c_0058), “NAD-binding domain” (c_0002), “TIM barrel” (c_0008), etc. Other classical supersecondary structures not shown in this figure, such as the β-hairpin (c_1442), α-hairpin (c_1484), and β-α-β unit (c_1240), appear lower down in the dictionary, because the concepts are ordered from largest to smallest.