Table 2. Cross-hybridizing sequence hits resulting from bioinformatic evaluation of 2,226 microarray probes against the CAMERA repository.
Database name as defined in CAMERA | Total length (bp) | # sequences | # Cross-hyb hits | # probes with hits |
GOS: Site-specific 16S Sequences (N) | 3,118,182 | 4,125 | n/a | 0 |
GOS: move858 Assembled 0.002-0.22 Chesapeake Bay (N) | 8,669,804 | 5,357 | n/a | 0 |
FarmSoil: Assembled Sequences (N) (Minnesota farm soil) | 144,897,582 | 139,340 | 2 | 2 |
HOT: All ORFs (N) (Hawaii Ocean Time-series ALOHA) | 169,784,453 | 449,086 | 2 | 1 |
Moore Foundation Marine Microbial Genomes (N)* | 856,811,427 | 12,886 | 630 | 526 |
GOS: Combined Assembly Coding Sequences (N) | 3,668,987,939 | 6,115,750 | 137 | 62 |
Eukaryotic Microbial Genomes (N)* | 6,342,658,807 | 1,453,409 | 3 | 3 |
All NCBI Environmental Samples (ENV_NT) | 7,194,061,284 | 17,695,887 | 218 | 118 |
All Prokaryotic Genomes (N)* | 9,577,197,991 | 655,666 | 3,770 | 2,186 |
CAMERA's Non-Identical Nucleotide Sequences (N)* | 179,511,589,666 | 38,512,986 | 4,367 | 2,077 |
Columns 2 and 3 give the total sequence length (in bp) and the number of individual sequences, respectively, in each database listed in Column 1. Column 4 gives the number of sequences identified as potentially cross-hybridizing with the microarray probes. Column 5 gives the number of microarray probes that produced at least one hit in the corresponding CAMERA database. Asterisks mark databases with taxonomic annotations. The two databases used for taxonomic analysis of probe hits are shown in bold.