Figure 1.
Scheme of the motif analysis algorithm DistAMo. (A) The amino acid sequences that can be encoded by a motif-containing DNA sequence (potential motifs; "pot motifs") are determined. For each potential motif, the probability that it is encoded by a motif-containing sequence is calculated. (B) The positions of potential motifs are detected in the proteome using a suffix tree search and assigned to the corresponding genes, gene groups or chromosomal regions, depending on the type of analysis. Under a random distribution, the number of motifs follows a Poisson binomial distribution. The z-score (significance value) is determined from the actual number of motifs, the mean (the expected number of motifs) and the standard deviation of the Poisson binomial distribution.
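The z-score computation in (B) can be illustrated with a minimal sketch. It assumes that each potential motif position contributes an independent Bernoulli trial with its own probability, so that the mean of the Poisson binomial distribution is the sum of the probabilities and its variance is the sum of p(1 - p). The function name `poisson_binomial_zscore` and the example probabilities are hypothetical and are not taken from DistAMo itself.

```python
import math


def poisson_binomial_zscore(observed, probs):
    """Z-score of an observed motif count under a Poisson binomial model.

    observed: actual number of motifs found (hypothetical input).
    probs: per-position probabilities that a potential motif is encoded
           by a motif-containing sequence (hypothetical input).
    """
    # Expected number of motifs: sum of the individual probabilities.
    mean = sum(probs)
    # Variance of a sum of independent Bernoulli trials.
    variance = sum(p * (1.0 - p) for p in probs)
    if variance == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return (observed - mean) / math.sqrt(variance)


# Four potential motif positions, each with probability 0.5 (illustrative values).
example_probs = [0.5, 0.5, 0.5, 0.5]
z = poisson_binomial_zscore(4, example_probs)
print(z)  # mean = 2.0, standard deviation = 1.0, so z = 2.0
```

A positive z-score indicates more motifs than expected under the random model, a negative one fewer.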
