2016 Jun 2;12(6):e1004817. doi: 10.1371/journal.pcbi.1004817

Fig 6. IC-based sequence divergences in the S1A protein family.

The panels show scatterplots of sequences in the S1A alignment along dimensions ($\tilde{U}_{1 \ldots 6}^{p}$) that correspond to sequence variation in the positions contributing to each of the top six ICs of the SCA coevolution matrix. The mapping from positional coevolution to sequence relationships is achieved using the reduced alignment matrix x, as per Eqs (14) and (15). Sequences are colored by enzymatic activity (A-C; the haptoglobins are non-catalytic members of the S1A family), annotated catalytic specificity (D-F), or taxonomic origin (G-I). For each graph, the stacked histograms show the distributions of these classifications along each dimension. Note that trypsin, tryptase, kallikreins, and certain granzymes have tryptic specificity, whereas chymotrypsin and most granzymes have chymotryptic specificity. The data show that IC1 specifically separates sequences by enzymatic activity (A), IC2 separates sequences by catalytic specificity (D), IC3 separates sequences by invertebrate/vertebrate origin (H), and ICs 4–6 show more minor variations by catalytic specificity (E-F). These data (1) recapitulate and extend previous observations [10] and (2) demonstrate the functional relevance of the IC-based decomposition.