Sequences were aligned using MUSCLE [50]. Amino acids highlighted in green are conserved in a majority of the sequences. Sequence sources are as follows: meta, translated from GenBank accession number AACY020544127 (DNA from the Sargasso Sea); Euglena_1, translated from Euglena gracilis EST with accession number EC679450; Euglena_2, translated from Euglena gracilis EST with accession number EC678321; Glaucocystis, translated from Glaucocystis nostochinearum EST with accession number EC122554; Tf_1, translated from Tritrichomonas foetus EST with accession number CX156355; Tf_2, translated from Tritrichomonas foetus EST with accession number CX157959; Reticulitermes_gut_1, translated from Reticulitermes flavipes symbiont ESTs with accession numbers FL643370 and FL643898; Reticulitermes_gut_2, translated from Reticulitermes flavipes symbiont ESTs with accession numbers FL637453 and FL638405; Reticulitermes_gut_3, translated from Reticulitermes flavipes symbiont EST with accession number GO904605; Tv_1, from Trichomonas vaginalis protein XP_001316516; Tv_2, from Trichomonas vaginalis protein XP_001322305; Tv_3, from Trichomonas vaginalis protein XP_001324076; Clytia, translated from Clytia hemisphaerica EST with accession number FP933787; Zootermopsis_gut, translated from Zootermopsis symbiont EST with accession number EG751663; Aureococcus, from Aureococcus anophagefferens protein EGB07434; Lentisphaera, from Lentisphaera araneosa protein ZP_01876528; Porphyridium, translated from Porphyridium purpureum EST with accession number HS847715; Akkermansia, from Akkermansia muciniphila protein YP_001878433. Neither the two Euglena ESTs nor the two Tritrichomonas foetus ESTs contained a start codon preceded by an in-frame stop codon. Thus, it is unclear whether the amino-terminal sequences shown here are correct for these species. The Reticulitermes_gut_3 and Zootermopsis_gut ESTs encode only a portion of the protein.
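The start-codon criterion mentioned in this legend can be illustrated with a minimal Python sketch. It is not the procedure used to prepare the figure; the function name `start_is_supported`, the toy sequences, and the use of the standard stop codons (TAA, TAG, TGA) are assumptions made for illustration only. The idea is that a candidate ATG is considered supported when, reading upstream in the same frame, a stop codon occurs before the 5' end of the EST is reached.

```python
# Standard-code stop codons (an assumption; some organisms use variant codes).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def start_is_supported(cdna: str, atg_pos: int) -> bool:
    """Return True if the ATG at atg_pos has an upstream in-frame stop codon.

    cdna    -- nucleotide sequence of the EST (hypothetical input)
    atg_pos -- 0-based index of the candidate start codon
    """
    cdna = cdna.upper()
    if cdna[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    # Walk upstream one codon at a time, staying in frame with the ATG.
    for i in range(atg_pos - 3, -1, -3):
        if cdna[i:i + 3] in STOP_CODONS:
            return True
    # Reached the 5' end without a stop: the true start may lie further upstream.
    return False


if __name__ == "__main__":
    print(start_is_supported("TAAGCCATGAAA", 6))  # in-frame TAA upstream
    print(start_is_supported("GCCGCCATGAAA", 6))  # no upstream stop
```

When the function returns False, the open reading frame may extend beyond the 5' end of the EST, which is why the amino-terminal residues of such sequences cannot be taken as confirmed.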