2017 Oct 19;114(44):11703–11708. doi: 10.1073/pnas.1707642114

Fig. 1.


(A) The most reused themes in a protein P are derived from the set of meaningful alignments between P and other proteins (in this example, proteins 1–4). For any candidate theme (for example, theme T, which spans residues s–e), we consider the parts of the alignments that are restricted to these residues, marked here by black rectangles. (B) We assign a score to every theme within protein P based on these restricted parts: the score is the sum of the BLOSUM-62 scores over the aligned parts. (C) Our goal is to identify the largest set of nonoverlapping themes (for example, theme_A and theme_B) such that the sum of their scores is optimal. Rather than exhaustively scoring all possible combinations of theme end points, we find the optimum more efficiently using dynamic programming (see SI Appendix, Methods for details).
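The selection step in panel C can be illustrated with a generic dynamic program over residue positions. The sketch below is not the authors' implementation (that is described in SI Appendix, Methods); it only shows the standard recurrence for choosing nonoverlapping intervals with maximal total score. The function name `best_nonoverlapping_themes`, the `score` callback, the `min_len` parameter, and the toy scores are all hypothetical and stand in for the BLOSUM-62-based theme scores of panel B.

```python
from typing import Callable, List, Optional, Tuple


def best_nonoverlapping_themes(
    n: int,
    score: Callable[[int, int], float],
    min_len: int = 1,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Pick nonoverlapping themes (s, e) in residues 1..n with maximal total score.

    `score(s, e)` is an assumed callback returning the score of the theme
    spanning residues s..e (inclusive); it is a placeholder for panel B.
    """
    # best[e] = best total score using only residues 1..e
    best = [0.0] * (n + 1)
    # choice[e] = start residue of the theme ending at e, or None if e is unused
    choice: List[Optional[int]] = [None] * (n + 1)

    for e in range(1, n + 1):
        best[e] = best[e - 1]  # option 1: residue e is not the end of a theme
        for s in range(1, e - min_len + 2):  # option 2: a theme s..e ends here
            candidate = best[s - 1] + score(s, e)
            if candidate > best[e]:
                best[e] = candidate
                choice[e] = s

    # Trace back through the recorded choices to recover the selected themes.
    themes: List[Tuple[int, int]] = []
    e = n
    while e > 0:
        s = choice[e]
        if s is None:
            e -= 1
        else:
            themes.append((s, e))
            e = s - 1
    themes.reverse()
    return best[n], themes


# Toy example with made-up scores (not real alignment data).
toy_scores = {(1, 3): 5.0, (2, 6): 6.0, (4, 6): 4.0, (7, 8): 2.0}


def toy_score(s: int, e: int) -> float:
    # Any span without a listed score is penalized so it is never chosen.
    return toy_scores.get((s, e), -1.0)


total, themes = best_nonoverlapping_themes(8, toy_score)
print(total, themes)  # 11.0 [(1, 3), (4, 6), (7, 8)]
```

In this toy case the single highest-scoring theme, (2, 6), is left out because the three compatible themes around it sum to more. With a score lookup for every pair (s, e), the double loop costs O(n²) evaluations, compared with the exponential number of subsets an exhaustive search over theme combinations would have to consider.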