Figure - PMC

2008 Dec 16;37(3):815–824. doi: 10.1093/nar/gkn981

© 2008 The Author(s)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 6. Optimal number of pseudocounts, m, as a function of the number of independent observations, n. Using the data listed in Table 3 and the method illustrated in Figure 5, we found the optimal number of pseudocounts for varying n. The method cannot be valid for n < 212 (vertical dotted line), because the calculated decrease in model description length for is greater than the description length of the model at , but it is not possible for a model to have a negative description length. For n between 212 and 1000, the calculation suggests using a nearly constant number of pseudocounts, m ≈ 19.4. In the limit of very large n, the MDL principle suggests that the number of pseudocounts should grow in proportion to n^(1/3).
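As a rough illustration of how a pseudocount value such as m ≈ 19.4 might be used in practice, the sketch below blends observed counts with m pseudocounts. The smoothing formula, the uniform background distribution, the example counts, and the function names `suggested_pseudocounts` and `smoothed_frequencies` are all assumptions made for illustration; the caption does not say how the pseudocounts are distributed. The caption gives no proportionality constant for the large-n regime, so the helper returns `None` outside the range 212 to 1000.

```python
def suggested_pseudocounts(n):
    """Return the roughly constant m quoted in the caption, or None.

    The caption only gives a numeric value (about 19.4) for n between
    212 and 1000; outside that range no number is stated, so we return None.
    """
    if 212 <= n <= 1000:
        return 19.4
    return None


def smoothed_frequencies(counts, m, background=None):
    """Blend observed counts with m pseudocounts (hypothetical scheme).

    Assumption: pseudocounts are spread according to `background`,
    which defaults to a uniform distribution over the observed symbols.
    """
    symbols = list(counts)
    n = sum(counts.values())
    if background is None:
        background = {s: 1.0 / len(symbols) for s in symbols}
    return {s: (counts[s] + m * background[s]) / (n + m) for s in symbols}


# Made-up example counts for one alignment column, n = 300 observations.
counts = {"A": 150, "C": 30, "G": 20, "T": 100}
n = sum(counts.values())
m = suggested_pseudocounts(n)
freqs = smoothed_frequencies(counts, m)
```

With a uniform background, each symbol receives m/4 pseudocounts, so rare symbols are pulled up slightly and common symbols pulled down, while the frequencies still sum to 1.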
