Figure 5. DNA binding domains of the TFs most relevant to FC enhancer classification.
Only the DNA binding domains of the fifty most relevant TFs are shown. TFs were ranked by the SVM weights of their respective motifs, which represent the motifs' discriminating power. Only the highest-scoring motif for each TF was considered (median ranks computed across 10 random partitions of the training data varied between 12 and 117). De novo motifs were explicitly excluded from this analysis. TF domains and sequences were clustered using average linkage and Euclidean distance. The dendrogram above the heatmap represents the relationships among the sequences in the training data, based on the presence or absence of TFBSs recognized by a specific class of TF DNA binding domain. The dendrogram to the left of the heatmap shows the relationships among the different TF DNA binding domains.
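
As an illustration of the clustering step named in the caption, the following is a minimal sketch of agglomerative clustering with average linkage and Euclidean distance applied to a presence/absence matrix. It is not the authors' code: the function name `average_linkage`, the toy matrix, and the choice of domain classes used as row labels are all assumptions made for the example. Clustering the rows corresponds to the dendrogram over DNA binding domains, and clustering the transposed matrix corresponds to the dendrogram over sequences.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(vectors):
    """Agglomerative clustering with average linkage and Euclidean distance.

    Returns the merges in order as (members_a, members_b, distance),
    where members are tuples of row indices into `vectors`.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []

    def avg_dist(a, b):
        # Average linkage: mean of all pairwise distances between members.
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((a, b, avg_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical toy data: rows are DNA binding domain classes, columns are
# sequences; 1 means a TFBS for that domain class is present in the sequence.
presence = {
    "homeobox": [1, 1, 0, 0],
    "forkhead": [1, 1, 0, 1],
    "bHLH":     [0, 0, 1, 1],
}
domains = list(presence)

# Dendrogram over domain classes (rows of the heatmap).
row_merges = average_linkage([presence[d] for d in domains])

# Dendrogram over sequences (columns of the heatmap): cluster the transpose.
col_merges = average_linkage(list(zip(*presence.values())))

for a, b, d in row_merges:
    print([domains[i] for i in a], "+", [domains[i] for i in b], f"at {d:.3f}")
```

The merge list carries the same information as a dendrogram: each entry records which two groups were joined and at what height. In practice a library routine would normally be used instead of this quadratic-per-step loop; the explicit version is shown only to make the linkage rule visible.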