Table 1.
L. delbrueckii ssp. bulgaricus strains | L. delbrueckii ssp. lactis strains | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Strain | ATCC 11842 | ATCC BAA-365 | 2038 | VIB27 | VIB44 | NDO2 | CNRZ226 | CNRZ327 (e) | CNRZ333 | CNRZ700 |
Assembled genome size (a) | 1,864,998 | 1,856,951 | 1,872,918 | 1,838,091 | 1,810,332 | 2,125,753 | 1,904,440 | 1,844,879/1,938,538 | 1,996,651 | 1,989,632 |
Estimated genome size (b) | N/A | N/A | N/A | 1,853,000 | 1,818,000 | N/A | 1,911,000 | 1,969,000/2,105,000 | 2,052,000 | 2,086,000 |
Number of contigs | 1 | 1 | 1 | 32 | 27 | 1 | 21 | 161/571* | 87 | 333 |
Number of scaffolds | 1 | 1 | 1 | 14 | 14 | 1 | 10 | 33/1 | 23 | 75 |
Average sequencing depth | - | - | - | 86 | 94 | - | 71 | 78 | 77 | 56 |
Number of CDS (c) | 1,466 | 1,380 | 1,333 | 1,783 | 1,711 | 1,666 | 1,665 | 1,525 | 1,721 | 1,593 |
Number of pseudogene-fragments (d) | 630 | 341 | 459 | 388 | 423 | 346 | 390 | 545 | 381 | 408 |
Number of CDS with unknown function | 642 | 294 | 343 | 442 | 434 | 317 | 361 | 315 | 369 | 345 |
Overall GC content (%) | 49.7 | 49.7 | 49.7 | 49.4 | 49.7 | 49.6 | 49.8 | 49.8 | 48.2 | 49.5 |
GC content of CDS (%) | 50.8 | 51.2 | 51.9 | 51.7 | 51.8 | 51.5 | 52.0 | 52.2 | 51.6 | 51.8 |
GC content of CDS at codon position 3 (%) | 65.0 | 64.8 | 64.9 | 66.0 | 66.7 | 64.0 | 67.0 | 65.1 | 63.4 | 67.4 |
CDS as % of genome sequence | 73.4 | 68.3 | 69.2 | 77.1 | 76.5 | 75 | 77.4 | 62.9 | 75.3 | 68.4 |
Number of rrn operons | 9 | 9 | 9 | - | - | 9 | - | 9 | - | - |
Protein localization prediction | ||||||||||
Cytoplasmic | 1,089 | 996 | 958 | 1,346 | 1,277 | 1,245 | 1,237 | 1,140 | 1,272 | 1,182 |
Membrane | 225 | 227 | 208 | 247 | 253 | 242 | 248 | 223 | 259 | 237 |
Surface exposed | 86 | 101 | 115 | 118 | 115 | 119 | 119 | 101 | 123 | 113 |
Secreted | 69 | 56 | 52 | 72 | 66 | 60 | 61 | 61 | 67 | 61 |
a, without paired-end sequencing results.
b, assembled sequence plus estimated size of sequence gaps (estimated from paired-end sequencing results).
c, not counting pseudogenes.
d, corresponding to CDS annotated as “fragment”.
e, where two values are given for CNRZ327, the second (set in italics in the original table) represents data after genome finishing.
*The increase in the number of contigs after genome finishing is due to the addition of sequence fragments in the original sequence gaps.
-, Data not available.
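
Footnote b defines the estimated genome size as the assembled sequence plus the estimated size of the sequence gaps, so the gap total implied for each draft genome can be recovered by subtraction. The following Python sketch does this for the strains that report both values. The function and variable names are illustrative, not from the source, and the CNRZ327 entry assumes that the first value in each of its cells is the pre-finishing figure.

```python
# Assembled and estimated genome sizes (bp) copied from Table 1 for the
# draft genomes that report both values.
# Assumption: CNRZ327 uses the first value of each cell (before finishing).
SIZES = {
    "VIB27": (1_838_091, 1_853_000),
    "VIB44": (1_810_332, 1_818_000),
    "CNRZ226": (1_904_440, 1_911_000),
    "CNRZ327": (1_844_879, 1_969_000),
    "CNRZ333": (1_996_651, 2_052_000),
    "CNRZ700": (1_989_632, 2_086_000),
}


def implied_gap_bp(assembled, estimated):
    """Footnote b rearranged: gap size = estimated size - assembled size."""
    return estimated - assembled


def gap_table(sizes):
    """Map each strain to (gap in bp, gap as % of the estimated genome size)."""
    rows = {}
    for strain, (assembled, estimated) in sizes.items():
        gap = implied_gap_bp(assembled, estimated)
        rows[strain] = (gap, round(100 * gap / estimated, 2))
    return rows


if __name__ == "__main__":
    for strain, (gap, pct) in gap_table(SIZES).items():
        print(f"{strain}: {gap:,} bp in gaps ({pct}% of estimated size)")
```

Because the estimated sizes in the table are rounded to the nearest thousand, the resulting gap totals are approximate.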