(A) Density of Hox/PBC/Meis clusters in the genomes of various organisms. The density was calculated on all non-coding regions of each genome (blue-filled bars), repeat-masked except for Amphimedon, where the repeat-masked genome was the same as the non-repeat-masked one. Two other conditions (white bars) represent subsets of the genome. The conserved non-coding elements (CNEs) shared by 12 Drosophila genomes show a higher density than the genome-wide analyses. These results suggest that the search space needs to be reduced to obtain a good signal-to-noise ratio. One idea would be to search for conserved regions within cnidarians, taking Nematostella as the reference, but such an analysis was beyond the scope of this project, and the divergence times of the available cnidarian genomes might not be suitable for it. We attempted a related analysis using the microsyntenic regions described in
Irimia et al. (2012). The results do not show a density as high as that of the Drosophila CNEs. We hypothesize that these microsyntenic regions still have a low signal-to-noise ratio for these motif clusters, or that, biologically, the Hox/PBC/Meis cluster does not play a major role in the regulation taking place at these regions conserved throughout metazoans.
(B) Validation of cluster enrichment. For each organism, we calculated the enrichment of Hox/PBC/Meis clusters compared to a random control. At first, we conducted this analysis on random sequences artificially generated from Markov models trained on the genomes of interest, taking into account the number and positions of the repeats. The results, however, were highly dependent on the order of the Markov model. To circumvent this, we used randomized motifs as a control, which allowed us to keep working on the real genomic sequences. We permuted the motif positions 100 times, removing the biological signal while retaining the statistical properties of the PSSMs describing the motifs. We observed a clear enrichment of the clusters in Drosophila CNEs. In contrast, most of the genome-wide analyses did not reveal any enrichment, except in Drosophila, which showed a slight enrichment similar to that seen in Trichoplax. Because Trichoplax has no HX-containing ANTP proteins (and therefore no Hox/PBC/Meis network), and the signal-to-noise ratio is still quite low, we did not consider this slight enrichment further.
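
The randomized-motif control in (B) can be illustrated with a minimal Python sketch. It is not the pipeline used for the figure, and it rests on several assumptions: "permuting the motif positions" is read here as shuffling the columns of a PSSM; a PSSM is represented as a list of per-position dictionaries mapping a base to a score; hits are counted for a single motif with a fixed score threshold rather than for full Hox/PBC/Meis clusters; and the function names (`shuffle_columns`, `count_hits`, `permutation_control`) are hypothetical.

```python
import random


def shuffle_columns(pssm, rng):
    """Return a copy of the PSSM with its positions (columns) in random order.

    Column contents are untouched, so per-position score distributions are
    preserved while the order that encodes the binding site is destroyed.
    """
    columns = list(pssm)
    rng.shuffle(columns)
    return columns


def count_hits(pssm, seq, threshold):
    """Count sequence windows whose summed PSSM score reaches the threshold."""
    width = len(pssm)
    hits = 0
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Bases absent from a column (e.g. masked 'N') score -inf: never a hit.
        score = sum(col.get(base, float("-inf")) for col, base in zip(pssm, window))
        if score >= threshold:
            hits += 1
    return hits


def permutation_control(pssm, seq, threshold, n_perm=100, seed=0):
    """Return (observed hits, mean hits over n_perm column-shuffled motifs).

    The same real sequence is scanned in both cases; only the motif changes.
    """
    rng = random.Random(seed)
    observed = count_hits(pssm, seq, threshold)
    control = [
        count_hits(shuffle_columns(pssm, rng), seq, threshold)
        for _ in range(n_perm)
    ]
    return observed, sum(control) / n_perm
```

An enrichment value can then be taken as the ratio of the observed count to the control mean, provided the control mean is non-zero. For short motifs, a shuffled PSSM will sometimes coincide with the original, which slightly inflates the control.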