Table 7.
Clone no. and ORF<sup>a</sup> | Number of amino acids | %GC | Lowest ɛ (species)<sup>b</sup> | Δ%ɛ | Protein homology (%ID, %Sim) | Organism with closest protein homology |
---|---|---|---|---|---|---|
179_D14 | 240 | 69 | 31.98 (PA) | 179.2 | Flp pilus assembly protein, ATPase CpaF (91, 93) | Azotobacter vinelandii |
132_O3 (ORF1) | 191 | 49 | 33.81 (NM) | 55.3 | Conserved hypoth. protein (52, 71) | Chromobacterium violaceum ATCC12472 |
151_O4 | 385 | 29 | 26.4 (BB) | 38.7 | YhbX/YhjW/YijP/YjdB family (48, 69) | N. meningitidis MC58 |
125_L2 (ORF3) | 143 | 37 | 32.6 (GT) | 37.9 | Anaerobic dicarboxylate transporter (71, 83) | Shigella flexneri 2a |
121_L20 | 149 | 32 | 35.66 (MM) | 32.7 | SAM-dependent methyltransferase | |
55_M14 | 275 | 28 | 27.21 (CJ) | 30.1 | Hydrolase (metallo-beta-lactamase) (33, 52) | H. influenzae R2846 |
173_G10 | 225 | 34 | 29.85 (CM) | 29.7 | Hypoth. protein Hflu20300043 (98, 98) | H. influenzae R2866 |
104_E15 (ORF1) | 180 | 41 | 35.39 (BH) | 26.4 | Hypoth. protein Hsom02001338 (61, 76) | H. somnus 129PT |
124_K2 | 567 | 43 | 24.64 (VC) | 23.9 | Type I restriction enzyme HsdR (70, 81) | Vibrio cholerae O1 biovar eltor Str. N16961 |
125_L2 (ORF1) | 144 | 32 | 34.91 (GT) | 23.2 | Transcriptional regulator (LysR family) (68, 77) | Shigella flexneri 2a |
120_O6 (ORF2) | 198 | 41 | 32.06 (HP) | 22.7 | Hypoth. protein (73, 82) | Actinobacillus pleuropneumoniae 4074 |
125_L2 (ORF2) | 230 | 34 | 31.14 (UU) | 21.3 | Unknown (putative aspartate racemase) (58, 77) | Shigella flexneri 2a |
13_D9 (ORF2) | 381 | 36 | 24.64 (A118) | 19.2 | TnaB (96, 97) | H. influenzae R2866 |
121_J7 | 293 | 33 | 33.35 (CB) | 19.1 | DNA methylase (58, 74) | E. coli |
32_B2 | 198 | 33 | 22.12 (SA) | 14.6 | Hemoglobin–haptoglobin binding protein HhuA (57, 73) | H. influenzae Str. TN106 |
47_C3 | 306 | 41 | 29.04 (VC) | 9.8 | Recombinational DNA repair protein (99, 99) | H. influenzae 86-028NP |
183_E8 | 177 | 44 | 40.5 (BS) | 8.8 | Transcriptional regulator (100, 100) | H. influenzae 86-028NP |
159_B20 (ORFs 1, 2) | 157 | 38 | 40.62 (A118) | 8.8 | HD0114 and HD0115 (40, 69) (weaker protein homologies to HI1496 and HI1495) | H. ducreyi 35000HP |
96_C16 | 371 | 33 | 19.82 (LL) | 7 | Restriction/modification protein HI0216 (68, 77) | H. influenzae Rd |
13_D9 (ORF1) | 185 | 43 | 32.29 (GT) | 5.3 | TnaA (98, 98) | H. influenzae R2866 |
134_O6 | 270 | 39 | 30.24 (PM) | 4.6 | Fapy DNA glycosylase (98, 99) | H. influenzae R2866 |
112_A12 (ORF1) | 250 | 34 | 23.99 (PM) | 3.9 | ADP-heptose:LPS heptosyltransferase (100, 100) | H. influenzae R2866 |
43_I10 | 358 | 35 | 24.28 (SA) | 2.1 | Putative glucosidase (73, 84) | Yersinia pestis CO92 |
110_E11 (ORF1) | 151 | 49 | 46.65 (HP) | 1.4 | Hypoth. protein HD1532 (68, 81) | H. ducreyi 35000HP |
128_C1 | 141 | 45 | 33.0 (PM) | 0.8 | Hypoth. protein Lin1719 | Listeria innocua |
93_G12/117_A22 (ORF1) | 254 | 43 | 30.79 (LP) | 0.7 | Hypoth. protein Hinf8010011272 | H. influenzae 86-028NP |
112_A12 (ORF2) | 116 | 40 | 47.52 (PM) | 0.6 | Fapy DNA glycosylase (100, 100) | H. influenzae R2866 |
<sup>a</sup>The number of the ORF within the clone, if the clone contained multiple ORFs; ɛ = codon usage bias similarity statistic.
<sup>b</sup>The letters in parentheses indicate the species that gave the lowest ɛ value; PA = Pseudomonas aeruginosa; NM = Neisseria meningitidis; CB = Clostridium butyricum; BH = Bacillus halodurans; MP = Mycoplasma penetrans; VC = Vibrio cholerae; CJ = Campylobacter jejuni; LL = Lactococcus lactis; T4 = enterobacteriophage T4; PM = Pasteurella multocida; NG = Neisseria gonorrhoeae; HI = Haemophilus influenzae.
%GC = the percentage of guanine and cytosine nucleotide bases in an ORF; Δ%ɛ = the percentage difference in ɛ values between the best-fitting genome and the best-fitting Haemophilus group genome; %ID = the percentage of amino acids from the novel ORF that are identical to those of the protein encoded by the orthologous/paralogous reference gene; %Sim = the percentage of amino acids from the novel ORF that are similar to those of the protein encoded by the orthologous/paralogous reference gene.
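
The %GC and Δ%ɛ definitions above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and the choice of denominator for Δ%ɛ (the lowest overall ɛ) is an assumption, since the footnote does not state the formula explicitly. Values above 100 in the Δ%ɛ column are consistent with that choice, but do not confirm it.

```python
def gc_percent(orf_sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence (the %GC column)."""
    seq = orf_sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


def delta_percent_epsilon(lowest_epsilon: float, haemophilus_epsilon: float) -> float:
    """Percentage difference between two epsilon values (the Δ%ɛ column).

    ASSUMPTION: the difference is expressed relative to the lowest overall
    epsilon (best-fitting genome); the source does not give the denominator.
    """
    if lowest_epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return 100.0 * (haemophilus_epsilon - lowest_epsilon) / lowest_epsilon


# Hypothetical inputs, not taken from the table.
print(gc_percent("ATGC"))                 # 2 of 4 bases are G or C
print(delta_percent_epsilon(20.0, 30.0))  # 10 / 20, as a percentage
```

Under this reading, a Δ%ɛ near zero means the ORF's codon usage fits a Haemophilus genome almost as well as the best-fitting genome overall, while a large value points to a markedly better fit elsewhere.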