(A) Schematic representation of the main features of the Spok genes. All homologs share an intron within the 5' UTR. At the start of the coding region (CDS) there is a repeat region, in which the number of repeats varies among the homologs. The central portion of the CDS has a number of indels, which appear to be independent deletions in each of Spok2, Spok3, and Spok4. There is a frameshift mutation at the 3' end of the CDS that shifts the stop codon of Spok3 and Spok4 into what is the 3' UTR of Spok1 and Spok2. The pseudogenized Spok gene (SpokΨ1) contains none of the aforementioned central indels and appears to share the stop codon of Spok1 and Spok2. However, it carries numerous mutations that introduce stop codons within the CDS, as well as an insertion of a full DNA transposon (discoglosse). No homologous sequence of the 5' end of SpokΨ1 is present. (B) A NeighborNet split network of all active Spok genes from all strains sequenced in this study. The four homologs cluster together well, but there are a number of reticulations, which presumably result from gene conversion events. (C) Maximum likelihood trees based on three separate regions of the Spok genes: the 5' UTR, the CDS, and the 3' UTR (starting from the stop codon of Spok3 and Spok4). The trees are rooted arbitrarily using Spok2. Branches are drawn proportional to the scale bar (substitutions per site), with bootstrap support values higher than 70 shown above the branches.