The Cag7 protein sequence is aligned with the protein translated from cosmid 36 by combining ORF14, ORF13, and the intervening part; the combination requires a single-base (+1) frame shift after amino acid 682 (counting from the N terminus of ORF14). With the frame shift introduced, the DNA sequence encodes a protein that, apart from two gaps, aligns more than 90% with Cag7. The first gap corresponds to amino acids 9–138 of Cag7, consisting of unit 𝒜; the second gap corresponds to amino acids 1114–1182, consisting of two consecutive repeat triplet groups, namely (α-ε-λ)–(β-δ-μ) (see text). The Cag7 ortholog jhp0476 of strain J99 is displayed below Cag7. The jhp0476 sequence lacks a segment equivalent to 𝒜 of Cag7, and its unit corresponding to 𝒜** is 16 aa longer. The same two consecutive triplet groups missing from cosmid 36 are also missing from jhp0476, whereas repeat II in jhp0476 is 78 aa longer, augmented by the two successive triplet groups (δ-μ-α)–(δ-μ-α). Cag7 and jhp0476 can each be divided into three parts corresponding to ORF14, ORF13, and the intervening piece in cosmid 36. The stars (★) mark two significantly long uncharged (potential transmembrane) segments: the first spans amino acid positions 343–370, just downstream of repeat I, and the second spans positions 1836–1870, near the C terminus. The + marks a concentrated charge region. The arrows indicate the extent and orientation of the ORFs.
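As an illustration only, the kind of scan that would flag long uncharged segments such as those marked in the figure can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used for the figure: the charged residue set (D, E, K, R), the 25-residue threshold, and the function names are all hypothetical choices, since the caption does not specify the criterion.

```python
import re

# Assumption: Asp, Glu, Lys and Arg count as charged; His is treated as uncharged.
CHARGED = "DEKR"


def uncharged_segments(seq, min_len=25):
    """Return 1-based inclusive (start, end) positions of runs of at least
    min_len consecutive residues that contain no charged residue.
    The default min_len is a hypothetical threshold, not taken from the source."""
    pattern = re.compile(rf"[^{CHARGED}]{{{min_len},}}")
    return [(m.start() + 1, m.end()) for m in pattern.finditer(seq.upper())]


def span_length(start, end):
    """Length in residues of an inclusive 1-based range."""
    return end - start + 1


if __name__ == "__main__":
    # Toy sequence, not Cag7: a 30-residue uncharged run flanked by charged residues.
    toy = "MKD" + "A" * 30 + "KE" + "GLV" * 3 + "R"
    print(uncharged_segments(toy))

    # Lengths of the ranges quoted in the caption.
    for start, end in [(9, 138), (1114, 1182), (343, 370), (1836, 1870)]:
        print(f"{start}-{end}: {span_length(start, end)} aa")
```

By this arithmetic the two gaps cover 130 and 69 residues, and the two uncharged segments cover 28 and 35 residues.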