Figure 3. Multiple sequence alignment of NFACT-N+HhH domains with selected FMN-DG+HhH domain sequences. FMN-DG sequences from solved crystal structures are at the top of the alignment, followed by the NFACT-N sequences. Secondary structure elements are depicted as follows: extended loop regions are represented by black lines, β-strands by orange arrows, and α-helices by purple cylinders. The transition from the core enzymatic domain to the HhH domains is labeled with a black arrow above the alignment. Individual sequences are labeled to the left with gene name, organism name, and gi number, separated by underscores. Gene names are replaced by PDB identifiers where appropriate. Numbers to the left and right of the alignment correspond to amino acid positions within the protein containing the domain. Insert regions are excised and replaced with numbers indicating the length of the insert in amino acids. Due to the “re-wiring” between the core enzymatic domains (Fig. 2), FMN-DG sequences are not presented in linear order; “breaks” in this order are marked at appropriate positions with “x.” Coloring is based on the consensus line at the bottom of the alignment: h, hydrophobic (shaded in yellow); s, small (shaded in green); l, aliphatic (shaded in yellow); -, negatively charged (shaded in purple); p, polar (shaded in blue); +, positively charged (shaded in purple); a, aromatic (shaded in yellow); b, big (shaded in gray); u, tiny (shaded in green); c, charged (shaded in purple). 
Columns corresponding to active site residue positions are shaded in red, colored in white, and marked at the top with “*.” Columns corresponding to positions involved in either direct or indirect substrate recognition are shaded in brown, colored in white, and marked with “^.” The column corresponding to the conserved glutamate/histidine residue in the HhH domains is shaded in red, colored in yellow, and marked with “&.” The column corresponding to the conserved lysine/arginine residue specific to NFACT-N is marked with “%.” Organism abbreviations are as follows: Aboo, Aciduliprofundum boonei; Alai, Acholeplasma laidlawii; Aory, Aspergillus oryzae; Aper, Aeropyrum pernix; Atha, Arabidopsis thaliana; Bcer, Bacillus cereus; Bthu, Bacillus thuringiensis; CCal, Candidatus Caldiarchaeum; CChl, Candidatus Chloracidobacterium; CKor, Candidatus Korarchaeum; Cele, Caenorhabditis elegans; Cint, Ciona intestinalis; Cowc, Capsaspora owczarzaki; Cpas, Clostridium pasteurianum; Cpha, Chlorobium phaeobacteroides; Csym, Cenarchaeum symbiosum; Dpul, Daphnia pulex; Drer, Danio rerio; Dtur, Dictyoglomus turgidum; Ecol, Escherichia coli; Efae, Enterococcus faecalis; Ehis, Entamoeba histolytica; Fnec, Fusobacterium necrophorum; Glam, Giardia lamblia; Gsp., Geobacillus sp.; Gste, Geobacillus stearothermophilus; Hmag, Hydra magnipapillata; Hsap, Homo sapiens; Hter, Halorubrum terrestre; Hvol, Haloferax volcanii; Klac, Kluyveromyces lactis; Ldel, Lactobacillus delbrueckii; Llac, Lactococcus lactis; Lmon, Listeria monocytogenes; Mbre, Monosiga brevicollis; Mmus, Mus musculus; Mpsy, Methanolobus psychrophilus; Myel, Metallosphaera yellowstonensis; Nfis, Neosartorya fischeri; Ngar, Natrinema gari; Ngru, Naegleria gruberi; Pmar, Prochlorococcus marinus; Psp., Pyrococcus sp.; Ptet, Paramecium tetraurelia; Rnor, Rattus norvegicus; Saci, Sulfolobus acidocaldarius; Saur, Staphylococcus aureus; Scer, Saccharomyces cerevisiae; Sequ, Streptococcus equi; Skow, Saccoglossus kowalevskii; Spur, Strongylocentrotus purpuratus; Tbru, Trypanosoma brucei; Tcru, Trypanosoma cruzi; Tlie, Thermovirga lienii; Tori, Theileria orientalis; Tsp., Thermococcus sp.; Tsp., Thermotoga sp.; Uarc, uncultured archaeon.
