Author manuscript; available in PMC: 2009 Nov 25.
Published in final edited form as: J Proteomics Bioinform. 2009 Mar 21;2:139–149. doi: 10.4172/jpb.1000071

Figure 1. Single amino-acid phylogenetic profile generation and hierarchical clustering of structurally resolved SH2 domains.

(a) Workflow for using GDDA-BLAST to generate SAPPs from 131 profiles associated with peripheral lipid-binding (PLB) activity. (i) For a query, embedded alignments are generated with each of the 131 PLB profiles using GDDA-BLAST (i.e. reverse-COBBLER). These alignments are filtered using %identity and %coverage thresholds (see Methods). A profile boundary in the query sequence is defined by overlapping the positive seeded alignments, excluding the seed, over the query. (ii) Using the Smith-Waterman algorithm, the optimal local alignment is generated between a profile boundary region of the query and the consensus sequence of a PLB-positive profile (i.e. a profile with at least one positive seeded alignment to the query). Based on the local alignment, each amino acid is scored +2 for an identity and +1 for a positive substitution. For a query, the previous steps are repeated for every PLB profile. The raw score at each amino acid is then normalized by subtracting the average raw score over all amino acids. Finally, the SAPP (Single Amino-acid Phylogenetic Profile) of the query, a vector of normalized positional scores with one entry per amino acid, is generated. (iii) The distribution of each amino acid and the number of amino acids with given chemical properties (hydrophobic, positive charge and negative charge) from the SAPP are incorporated into an N (query) by M (compositional profile) matrix (31). Using this matrix, the query sequences are then hierarchically clustered with the Pearson correlation metric.
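
The scoring and normalization in step (ii) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Smith-Waterman alignments are already available as gapped strings with a query start offset, that raw scores from different profiles are summed per residue (the caption does not state how they are combined), and that "positive substitution" can be approximated by a small hand-picked set of conservative pairs. The names `POSITIVE_PAIRS`, `score_alignment`, `build_sapp` and `pearson` are hypothetical. The compositional matrix of step (iii) is not reproduced; only the Pearson correlation that would feed a clustering step is shown.

```python
from statistics import mean

# Assumption: a tiny illustrative subset of conservative ("positive") pairs.
# A real run would consult a full substitution matrix instead.
POSITIVE_PAIRS = {frozenset(p) for p in ("IV", "IL", "LV", "KR", "DE", "ST", "FY")}


def score_alignment(aligned_query, aligned_consensus, query_start, query_length):
    """Per-residue raw scores over the full query for one local alignment.

    +2 for an identity, +1 for a positive substitution, 0 otherwise.
    """
    scores = [0] * query_length
    pos = query_start  # 0-based index of the first aligned query residue
    for q, c in zip(aligned_query, aligned_consensus):
        if q == "-":
            continue  # gap in the query: no query residue is consumed
        if c != "-":
            if q == c:
                scores[pos] += 2
            elif frozenset((q, c)) in POSITIVE_PAIRS:
                scores[pos] += 1
        pos += 1
    return scores


def build_sapp(query_length, alignments):
    """Sum raw scores over all alignments (assumed), then subtract the mean."""
    raw = [0] * query_length
    for aligned_query, aligned_consensus, start in alignments:
        one = score_alignment(aligned_query, aligned_consensus, start, query_length)
        raw = [r + s for r, s in zip(raw, one)]
    average = mean(raw)
    return [r - average for r in raw]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors with nonzero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Toy query of 9 residues with two hypothetical profile alignments.
sapp = build_sapp(9, [("KVLDE", "KILDD", 1), ("DE-ST", "DEGSA", 4)])
```

In a clustering step, `1 - pearson(u, v)` between two rows of the N by M matrix could serve as the pairwise distance. That choice is an assumption here, since the caption names only the correlation metric.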

(b) Dendrogram of SH2 domains hierarchically clustered using peripheral lipid-binding SAPPs. We observe three major clades in this dendrogram, all of which receive robust statistical support. We also observe that each clade contains SH2 domains that have been experimentally demonstrated to bind lipid. These results suggest that all of the SH2 domains tested possess lipid-binding activity.