2012 Oct 30;7:37. doi: 10.1186/1745-6150-7-37

Copyright ©2012 Wood et al.; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Data flow through our analysis pipeline. Annotations and sequences were obtained from GenBank, and all sequences were processed with the Glimmer 3 gene finder to obtain gene predictions. The predicted genes were filtered to exclude annotated genes and pseudogenes, leaving a set of candidate missed genes. These candidates were used as BLAST queries against a database of all bacterial genes in RefSeq. Each candidate was then designated a named missed gene if it had a significant alignment to a non-hypothetical protein, or a hypothetical missed gene if it aligned only to hypothetical proteins. Each of these two sets was further analyzed by ComBlast, which uses BLAST and the COMBREX database to associate genes with additional attributes, such as experimentally determined function, 3D structure, conservation, and phenotype information, and to assign a COMBREX support level to each potential missed gene.
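
The filtering and designation steps in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Gene` and `Hit` types, the same-strand overlap test used to exclude annotated genes, the E-value cutoff of 1e-5, and the keyword match on "hypothetical" are all assumptions made for illustration, since the caption does not specify how filtering or significance was defined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: 1-based inclusive coordinates plus strand."""
    start: int
    end: int
    strand: str


@dataclass(frozen=True)
class Hit:
    """Hypothetical BLAST hit: subject description and E-value."""
    description: str
    evalue: float


def overlaps(a: Gene, b: Gene) -> bool:
    # Assumption: a prediction counts as "already annotated" if it shares
    # any position with an annotated gene or pseudogene on the same strand.
    return a.strand == b.strand and a.start <= b.end and b.start <= a.end


def candidate_missed(predicted: List[Gene], annotated: List[Gene]) -> List[Gene]:
    """Keep only predictions that overlap no annotated gene or pseudogene."""
    return [p for p in predicted if not any(overlaps(p, a) for a in annotated)]


def designate(hits: List[Hit], max_evalue: float = 1e-5) -> Optional[str]:
    """Label a candidate 'named', 'hypothetical', or None (no significant hit).

    Assumption: a hit is significant when its E-value is at most max_evalue.
    """
    significant = [h for h in hits if h.evalue <= max_evalue]
    if not significant:
        return None
    # Any significant alignment to a non-hypothetical protein makes it "named".
    if any("hypothetical" not in h.description.lower() for h in significant):
        return "named"
    return "hypothetical"


# Usage with made-up coordinates and hits.
predicted = [Gene(100, 400, "+"), Gene(1000, 1300, "-")]
annotated = [Gene(50, 450, "+")]
candidates = candidate_missed(predicted, annotated)

named_label = designate([Hit("DNA polymerase III", 1e-30),
                         Hit("hypothetical protein", 1e-40)])
hypothetical_label = designate([Hit("hypothetical protein", 1e-20)])
no_label = designate([Hit("protein kinase", 0.5)])
```

In this sketch the gene prediction (Glimmer 3), the BLAST search, and the ComBlast annotation step are treated as external tools whose outputs have already been parsed into these records.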