2005 Jun 27;102(27):9481–9486. doi: 10.1073/pnas.0501620102

Fig. 1.

Transcription factor binding sites and motif weight matrix. (a) A given transcription factor can bind to specific DNA segments, 5 to 20 bp long, in the regulatory regions of different genes. Each line represents the regulatory sequence of one gene, and the small rectangles on each line represent the transcription factor binding sites, called motif instances. Note that motif instances can occur anywhere in the sequences. (b) The alignment of some motif instances from a, in which the ith position of each motif instance is aligned with the ith position of every other motif instance. (c) From the alignment in b, assuming a flat prior distribution for the nucleotide composition at each position in the motif, we obtain the weight matrix of the motif by using Bayes' theorem. Each column of the weight matrix corresponds to one position in the motif, ordered from the first position to the last. Each row gives the probability that the corresponding nucleotide occurs at each position of the motif, with rows ordered A, C, G, T. If all of the numbers in one column of the weight matrix are close to 0.25, the corresponding position is degenerate and carries little information. A motif in which many positions are degenerate is a weak motif.
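The construction in (c) can be illustrated with a minimal sketch. It assumes that the flat prior is a uniform Dirichlet prior over the four nucleotides and that each matrix entry is the posterior mean, which amounts to adding one pseudocount per nucleotide; the caption does not spell out the estimator, so this is one plausible reading rather than the authors' exact procedure. The function names `weight_matrix` and `column_information` and the example sequences are hypothetical.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGT"  # row order used in the caption


def weight_matrix(instances, pseudocount=1.0):
    """Estimate a motif weight matrix from aligned, equal-length motif instances.

    Assumption: a flat (uniform Dirichlet) prior, so each entry is the
    posterior mean (count + pseudocount) / (N + 4 * pseudocount).
    Returns a dict mapping each nucleotide to its row of probabilities.
    """
    width = len(instances[0])
    if any(len(s) != width for s in instances):
        raise ValueError("all motif instances must have the same length")
    n = len(instances)
    matrix = {nt: [] for nt in NUCLEOTIDES}
    for i in range(width):
        # Count the nucleotides observed at aligned position i.
        counts = Counter(s[i] for s in instances)
        for nt in NUCLEOTIDES:
            matrix[nt].append((counts[nt] + pseudocount) / (n + 4 * pseudocount))
    return matrix


def column_information(matrix):
    """Information content (bits) per column relative to a uniform background.

    A column whose four entries are all 0.25 scores 0 bits, i.e. it is
    degenerate in the sense used in the caption.
    """
    width = len(matrix["A"])
    info = []
    for i in range(width):
        entropy_term = sum(
            matrix[nt][i] * math.log2(matrix[nt][i])
            for nt in NUCLEOTIDES
            if matrix[nt][i] > 0
        )
        info.append(2.0 + entropy_term)
    return info


# Hypothetical aligned motif instances (not taken from the paper).
example_instances = ["TGACT", "TGACA", "TGCCG", "TGACC"]
example_matrix = weight_matrix(example_instances)
example_info = column_information(example_matrix)
```

In this toy alignment the first position is always T, so its T entry is (4 + 1) / (4 + 4) = 0.625, whereas the last position shows each nucleotide once, so every entry is 0.25 and the column carries 0 bits. Summing the per-column information gives a rough measure of how strong or weak the motif is.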