Figure - PMC

Genome Res. 2001 Aug;11(8):1404–1409. doi: 10.1101/gr.186401

Copyright © 2001, Cold Spring Harbor Laboratory Press

Outline of the Bayesian classifier. (A) For a given motif length (in this figure, two base pairs), the occurrences of all overlapping motifs in each genome are recorded in a motif occurrence profile (B). The motif occurrence profile for each genome is then transformed into a motif frequency profile (C) by dividing each motif's occurrence count by the total number of motifs in that genome. (D) A sequence, S, of arbitrary length and consisting of j overlapping motifs, is taken at random from any of the genomes. (E) The probability of obtaining the motifs present in sequence S is calculated separately for each genome and each motif. For example, the probability of obtaining motif i in E. coli, P(M_i:G_e), is estimated by the frequency of that motif in the E. coli genome, calculated in (C). The probability of obtaining the motif distribution present in sequence S is then estimated as the product of the individual probabilities of obtaining each motif. (F) The classifier predicts the most probable genomic origin, that is, the genome with the highest probability P(S:G).
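
Steps (A) through (F) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy genomes, and the `floor` value used for motifs that never occur in a genome are all hypothetical, and the product in (E) is computed as a sum of logarithms to avoid numerical underflow on long sequences.

```python
import math
from collections import Counter


def motif_counts(seq, k):
    """Steps A-B: count every overlapping motif of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def motif_frequencies(seq, k):
    """Step C: divide each motif count by the total number of motifs."""
    counts = motif_counts(seq, k)
    total = sum(counts.values())
    return {motif: n / total for motif, n in counts.items()}


def log_probability(fragment, freqs, k, floor=1e-9):
    """Step E: log of the product of per-motif probabilities.

    `floor` is an assumption of this sketch: it stands in for the
    probability of a motif that was never seen in the genome, so that
    log(0) is avoided. The caption does not say how this case is handled.
    """
    return sum(
        math.log(freqs.get(fragment[i:i + k], floor))
        for i in range(len(fragment) - k + 1)
    )


def classify(fragment, profiles, k):
    """Step F: return the genome with the highest (log) probability."""
    scores = {
        name: log_probability(fragment, freqs, k)
        for name, freqs in profiles.items()
    }
    return max(scores, key=scores.get), scores


# Toy "genomes" (hypothetical data, for illustration only).
genomes = {
    "genome_a": "ATATATATATATATAT",
    "genome_b": "GCGCGCGGCCGCGCGC",
}
k = 2  # motif length, as in the figure
profiles = {name: motif_frequencies(seq, k) for name, seq in genomes.items()}

# Step D: a short sequence S whose origin is to be predicted.
best, scores = classify("ATATAT", profiles, k)
print(best, scores)
```

Working in log space does not change which genome is selected, because the logarithm is monotonic: the genome that maximizes the product of motif probabilities also maximizes the sum of their logarithms.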