Author manuscript; available in PMC: 2011 Aug 23.
Published in final edited form as: Methods Enzymol. 2004;375:3–20. doi: 10.1016/s0076-6879(03)75001-0

Table 1. Comparison of search strategies for H3 histone sequences.

Entrez queries of the NCBI protein database were conducted from the NCBI website, www.ncbi.nlm.nih.gov/Entrez. BLAST searches using human or yeast histone H3 sequences were performed from the command line in a Unix environment.

Method       Query                         uniq gi  H3    success  efficiency
reference H3 set                           1742     1742

ENTREZ       “eukaryota[ORGN]”             1143461  1742  100.0%   0.2%
ENTREZ       “H3”                          3303     1452  83.4%    44.0%
ENTREZ       “histon”                      9297     1653  94.9%    17.8%
ENTREZ       “eukaryota[ORGN] and H3”      2703     1452  83.4%    53.7%
ENTREZ       “eukaryota[ORGN] and histon”  7453     1653  94.9%    22.2%
BLASTPGP     H3human                       1747     1719  98.7%    98.4%
BLASTPGP     H3human+seg                   1747     1719  98.7%    98.4%
BLASTPGP     H3human+eukgi                 1754     1722  98.9%    98.2%
BLASTPGP     H3human+eukgi+seg             1754     1722  98.9%    98.2%
BLASTPGP     H3yeast                       1777     1718  98.6%    96.7%
BLASTPGP     H3yeast+seg                   1777     1718  98.6%    96.7%
BLASTPGP     H3yeast+eukgi                 1780     1718  98.6%    96.5%
BLASTPGP     H3yeast+eukgi+seg             1780     1718  98.6%    96.5%
PSIBLASTPGP  H3human                       1897     1726  99.1%    91.0%
PSIBLASTPGP  H3human+seg                   1897     1726  99.1%    91.0%
PSIBLASTPGP  H3human+eukgi                 1949     1727  99.1%    88.6%
PSIBLASTPGP  H3human+eukgi+seg             1949     1727  99.1%    88.6%
PSIBLASTPGP  H3yeast                       2011     1726  99.1%    85.8%
PSIBLASTPGP  H3yeast+seg                   2011     1726  99.1%    85.8%
PSIBLASTPGP  H3yeast+eukgi                 2077     1727  99.1%    83.1%
PSIBLASTPGP  H3yeast+eukgi+seg             2077     1727  99.1%    83.1%
WINBLASTPGP  H3human                       69678    1730  99.3%    2.5%
WINBLASTPGP  H3human+eukgi                 60821    1732  99.4%    2.8%
WINBLASTPGP  H3human+eukgi+seg             1697     1646  94.5%    97.0%
WINBLASTPGP  H3yeast                       70864    1730  99.3%    2.4%
WINBLASTPGP  H3yeast+eukgi                 63949    1730  99.3%    2.7%
WINBLASTPGP  H3yeast+eukgi+seg             1788     1646  94.5%    92.1%

BLASTPGP = gapped protein BLAST; PSIBLASTPGP = iterated gapped protein BLAST using profiles; WINBLASTPGP = gapped protein BLAST for short, nearly exact matches, using sequence windows as queries; eukgi = search restricted to sequences from eukaryotes; seg = SEG filtering of low-complexity regions enabled. All results were compared to a curated reference set of H3 sequences (the "reference H3 set" row). Column headers: uniq gi = number of unique sequence records retrieved; H3 = number of retrieved unique gis shared with the reference set; success = percent H3/reference set; efficiency = percent H3/uniq gi.
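
The two derived columns can be recomputed from the raw counts. The following is a minimal sketch in Python, assuming only the definitions given in the footnote; the function and variable names are illustrative and do not come from the original work. The counts are copied from three rows of Table 1.

```python
# Size of the curated reference H3 set, taken from the first row of Table 1.
REFERENCE_SET_SIZE = 1742


def success(h3_hits: int, reference_size: int = REFERENCE_SET_SIZE) -> float:
    """Percent of the reference set that the search recovered (H3 / reference set)."""
    return 100.0 * h3_hits / reference_size


def efficiency(h3_hits: int, unique_gi: int) -> float:
    """Percent of retrieved unique records that belong to the reference set (H3 / uniq gi)."""
    return 100.0 * h3_hits / unique_gi


# (method, query, uniq gi, H3) for three rows of Table 1.
rows = [
    ("ENTREZ", "H3", 3303, 1452),
    ("BLASTPGP", "H3human", 1747, 1719),
    ("WINBLASTPGP", "H3human+eukgi+seg", 1697, 1646),
]

for method, query, uniq_gi, h3 in rows:
    print(
        f"{method} {query}: "
        f"success {success(h3):.1f}%, efficiency {efficiency(h3, uniq_gi):.1f}%"
    )
```

In information-retrieval terms, success corresponds to recall against the reference set and efficiency to precision, so a search such as ENTREZ “eukaryota[ORGN]” can reach 100.0% success while its efficiency stays near zero.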