ABSTRACT
A fundamental requirement for life is the replication of an organism’s DNA. Studies in Escherichia coli and Bacillus subtilis have set the paradigm for DNA replication in bacteria. During replication initiation in E. coli and B. subtilis, the replicative helicase is loaded onto the DNA at the origin of replication by an ATPase helicase loader. However, most bacteria do not encode homologs to the helicase loaders in E. coli and B. subtilis. Recent work has identified the DciA protein as a predicted helicase operator that may perform a function analogous to the helicase loaders in E. coli and B. subtilis. DciA proteins, which are defined by the presence of a DUF721 domain (termed the DciA domain herein), are conserved in most bacteria but have only been studied in mycobacteria and gammaproteobacteria (Pseudomonas aeruginosa and Vibrio cholerae). Sequences outside the DciA domain in Mycobacterium tuberculosis DciA are essential for protein function but are not conserved in the P. aeruginosa and V. cholerae homologs, raising questions regarding the conservation and evolution of DciA proteins across bacterial phyla. To comprehensively define the DciA protein family, we took a computational evolutionary approach and analyzed the domain architectures and sequence properties of DciA domain-containing proteins across the tree of life. These analyses identified lineage-specific domain architectures among DciA homologs, as well as broadly conserved sequence-structural motifs. The diversity of DciA proteins represents the evolution of helicase operation in bacterial DNA replication and highlights the need for phylum-specific analyses of this fundamental biological process.
IMPORTANCE Despite the fundamental importance of DNA replication for life, this process remains understudied in bacteria outside Escherichia coli and Bacillus subtilis. In particular, most bacteria do not encode the helicase-loading proteins that are essential in E. coli and B. subtilis for DNA replication. Instead, most bacteria encode a DciA homolog that likely constitutes the predominant mechanism of helicase operation in bacteria. However, it is still unknown how DciA structure and function compare across diverse phyla that encode DciA proteins. In this study, we performed computational evolutionary analyses to uncover tremendous diversity among DciA homologs. These studies provide a significant advance in our understanding of an essential component of the bacterial DNA replication machinery.
KEYWORDS: Actinobacteria, DNA helicase, DNA replication, DUF721, DciA, domain architecture, helicase loader, phylogeny, eubacteria, proteobacteria
INTRODUCTION
DNA replication is a process critical to life for all organisms. The current paradigm for the process of DNA replication in bacteria has primarily been based on studies in Escherichia coli and Bacillus subtilis. Bacterial DNA replication begins with the binding of the replication initiation protein DnaA to specific sequences referred to as DnaA boxes at the origin of replication (oriC) (1–7). DnaA binding to double-stranded DNA (dsDNA) triggers DNA unwinding at an AT-rich region of DNA called the DNA unwinding element (DUE), leaving a bubble of single-stranded DNA (ssDNA) (1, 4, 6, 8, 9). The ssDNA bubble is coated by single-stranded binding protein (SSB) (10), followed by the concerted loading of two hexameric replicative helicases onto the SSB-coated replication fork. The two helicases translocate along the two sides of the replication fork, unwinding the dsDNA as they move (1, 4, 5, 11–14).
Bacterial replicative helicases (DnaB in E. coli and DnaC in B. subtilis) are superfamily IV-type helicases, which are defined as hexameric RecA ATPases (4, 15, 16) that translocate in the 5′ to 3′ direction (11, 12, 17). The bacterial replicative helicase translocates on ssDNA using a “hand-over-hand” mechanism, which is driven by nucleotide hydrolysis (18, 19; reviewed in reference 17). The C terminus of the bacterial replicative helicase contains the RecA-like fold that is responsible for the ATPase activity and is connected to an N-terminal scaffolding domain via a linker region (1, 20, 21). The replicative helicase must oligomerize into a double-layered hexameric ring to be active during replication, with one layer made up of the N termini and the other layer composed of the C termini (1, 22, 23). In E. coli and B. subtilis, the loading of the replicative helicase is performed with the help of a helicase loader, named DnaC in E. coli and DnaI in B. subtilis (4, 24–27). dnaC and dnaI were acquired by E. coli and B. subtilis, respectively, via domestication of related but distinct phage ATPase-containing genes (28–31). DnaC and DnaI are both in the ATPases associated with diverse cellular activities (AAA+) ATPase family, and the ATPase activity of DnaC is required for its helicase loading function at the origin of replication (32, 33).
E. coli and B. subtilis have long represented the paradigm of helicase loading during bacterial replication. However, most bacteria do not encode ATPase helicase loader homologs to DnaC or DnaI. Instead, most bacteria encode the ancestral protein, DciA (DnaC/I antecedent) (28, 34), which is defined by the presence of a domain of unknown function, DUF721 (termed DciA domain herein). Despite the prevalence of DciA domain-containing proteins in bacteria (28), DciA homologs have only been studied in actinobacterial (Mycobacterium tuberculosis and Mycobacterium smegmatis) and gammaproteobacterial (Pseudomonas aeruginosa and Vibrio cholerae) species (28, 34–36). DciA homologs interact with the replicative helicase DnaB and are essential for M. tuberculosis, M. smegmatis, and P. aeruginosa DNA replication and viability (28, 34–36). Based on DciA’s interaction with the replicative helicase and requirement for DNA replication, DciA has been proposed to perform a function analogous to that of the DnaC/I helicase loaders. However, DciA does not have a predicted ATPase domain and, therefore, cannot be considered a helicase loader like DnaC and DnaI. Instead, DciA is referred to as a predicted helicase operator, although the mechanism of DciA helicase operation is still unknown (28, 34).
Based on studies in M. tuberculosis, the DciA domain was predicted to contain a region with structural homology to the N terminus of DnaA (34), which was subsequently confirmed in V. cholerae DciA (35, 36). The DciA domain in the M. tuberculosis DciA homolog is essential for protein function (34). In addition, mycobacterial DciA homologs encode a 58-amino-acid (aa) sequence extension N terminal to the DciA domain that is essential for M. smegmatis viability (34). However, this sequence is not conserved in the P. aeruginosa or V. cholerae DciA. Instead, sequences C terminal to the DciA domain in the V. cholerae DciA that are not shared with mycobacterial DciA are essential for the interaction between DciA and the replicative helicase DnaB (35). Therefore, the DciA homologs in mycobacteria and the gammaproteobacteria P. aeruginosa and V. cholerae have diverged in the relative positions of the DciA domain and the presence of N- or C-terminal sequence extensions. These sequence variations raise questions of whether there are functional consequences of these differences and whether a broader view of the DciA protein family would reveal further diversification. To begin to address these open questions, we took a computational evolutionary approach and analyzed the phylogenetic distribution, domain architecture, and conservation of sequence properties among 26,789 DciA homologs. Our analysis revealed that most bacterial DciA proteins encode a single annotated domain, the eponymous DciA domain. However, we also identified multiple DciA homologs with novel domain architectures, which could provide clues to specialized functions or biology in those bacteria. Among the bacterial DciA single-annotated-domain proteins, there was lineage-specific variation in the total protein lengths and positioning of the DciA domain. 
Despite this variation in sequence properties, AlphaFold (37) structural predictions identified a broadly conserved pattern of structural motifs where the DciA domain was connected to alpha-helical structures via an unstructured linker. Therefore, our analyses reveal conserved sequence-structural features of DciA homologs across bacterial phyla, as well as lineage- and, sometimes, species-specific features, highlighting the need for expanded studies of this protein family.
RESULTS
DciA domain-containing proteins are predominantly found in bacteria, with rare transfers to eukaryota.
A protein homology search with the M. tuberculosis DciA (AAK44227.1) only identifies closely related homologs, predominantly in Actinobacteria; similarly, homologs of the P. aeruginosa DciA (AAG07793.1) are mostly proteobacterial. These data suggest that the DciA homologs across different phyla exhibit low conservation in their primary amino acid sequences. This divergence warranted a more comprehensive approach that used multiple DciA domain-containing proteins as starting points to retrieve the rich repertoire of DciA-like proteins from across the tree of life. Therefore, we analyzed the ~27,000 InterPro entries for DciA domain-containing proteins (DUF721, Dna[CI] antecedent DciA, Pfam identification code [ID] PF05258) (38), excluding metagenome data entries. We also added valuable metadata to this data set, including protein accession numbers, taxonomic identifiers (taxIDs), species, and complete and collapsed lineages for each protein (Table S1 in the supplemental material and https://github.com/JRaviLab/dcia_evolution). Our initial characterization revealed that these ~27,000 DciA-like proteins spanned the three kingdoms of life, with 37 bacterial lineages (with assigned lineages), 76 bacterial Candidatus lineages (as-yet unassigned), and 5 archaeal, 1 viral, and 6 eukaryotic lineages.
Next, we analyzed the phyletic distribution and domain architectures of all ~27,000 DciA proteins using the MolEvolvR web application (39). This analysis highlighted that most (99.9%) of the proteins were found in the kingdom Eubacteria (Fig. 1A). In addition, there were 26 DciA domain-containing proteins from eukaryotic, archaeal, and viral sequences (Fig. 1A and Table S2). We performed BLASTP homology searches with the 26 nonbacterial DciA domain-containing proteins (40) and found that over 95% of the top hits with the highest confidence levels retrieved for 24 of these proteins were from bacterial genomes, suggesting that these DciA domain-containing proteins had been mistakenly attributed to archaeal, eukaryotic, or viral genomes (Table S2). The two remaining nonbacterial DciA-like proteins occurred in Kingdonia uniflora (KAF6150485.1), an endangered angiosperm with overrepresentation in DNA replication and repair genes (41), and the fungus Hyaloscypha bicolor E (PMD57303.1) (Table S2 and Fig. S1A). All close homologs of the K. uniflora DciA protein were in Magnoliopsida, a class of flowering plants. The DciA protein from H. bicolor E predominantly retrieved (98/100) DciA-like proteins in fungal species. We also verified that the two eukaryotic DciA proteins were part of annotated open reading frames (ORFs) in complete genomes, suggesting that they were truly eukaryotic proteins. Furthermore, using AlphaFold structure prediction, we found that the DciA domains within the K. uniflora and H. bicolor E proteins resembled a truncated version of the V. cholerae DciA domain, solved previously using nuclear magnetic resonance (NMR) (Fig. S1B) (36). Each eukaryotic DciA domain is approximately half the median length of the V. cholerae DciA domain and is predicted to contain just the first alpha-helix and beta-sheet present in the V. cholerae DciA domain (Fig. S1B) (36). Our analysis using MolEvolvR revealed that the DciA domain-containing protein in K. uniflora also carries an annotated HMA (heavy metal-associated) domain (Pfam ID PF00403) overlapping the DciA domain (Fig. S1A and B). HMA domains are typically involved in heavy metal transport and detoxification in both eukaryotic and prokaryotic species (42–44), where heavy metal exposure can induce DNA damage (45–47). Although there were no additional Pfam domains annotated in the H. bicolor E DciA domain-containing protein, we identified two AN11006-like domains (PANTHER family: PTHR42085) of unknown function on either side of the DciA domain (Table S2). Given that the DciA domain was previously considered to be an exclusively bacterial protein domain, the identification of predicted DciA structural domains in two eukaryotic species reveals the potential for much broader evolutionary distribution than previously appreciated, providing avenues for future structure-function studies.
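The criterion applied to the 26 nonbacterial entries (over 95% of top BLASTP hits coming from bacterial genomes) can be illustrated with a minimal sketch. It assumes that the top hits for one query have already been reduced to a list of kingdom-level labels; the function names, the `cutoff` parameter, and the example labels are hypothetical and are not taken from the BLAST or MolEvolvR outputs.

```python
def bacterial_fraction(hit_kingdoms):
    """Return the fraction of hits labeled 'Bacteria' (0.0 for an empty list)."""
    if not hit_kingdoms:
        return 0.0
    n_bacterial = sum(1 for label in hit_kingdoms if label == "Bacteria")
    return n_bacterial / len(hit_kingdoms)


def likely_misattributed(hit_kingdoms, cutoff=0.95):
    """Flag a nonbacterial entry whose top hits are overwhelmingly bacterial.

    The 0.95 cutoff mirrors the "over 95%" wording in the text; treating it
    as a strict threshold is an assumption of this sketch.
    """
    return bacterial_fraction(hit_kingdoms) > cutoff


# Hypothetical query: 98 of 100 top hits are bacterial.
example_hits = ["Bacteria"] * 98 + ["Eukaryota"] * 2
flagged = likely_misattributed(example_hits)
```

A query whose top hits are mostly from its own annotated kingdom, as described for the K. uniflora and H. bicolor E proteins, would not be flagged under this rule.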
FIG 1.
DciA protein phyletic spread and domain architectures. (A) Phyletic spread of DciA domain-containing proteins. The heatmap shows the abundances of DciA domain-containing proteins across bacterial (B), archaeal (A), viral (V), and eukaryotic (E) lineages. The color gradient indicates the number of homologs in a particular lineage. (B) Phyletic spread of diverse DciA domain architectures (excluding DciA single-annotated-domain proteins) across bacterial lineages. The color gradient indicates the number of homologs with that domain architecture in a particular lineage. Domain architectures involving TPR repeats were combined based on the number of repeats. (C) Key domain architectures of bacterial DciA homologs (including Pfam domains and MobiDB-lite-predicted IDRs). Representative proteins for each Pfam domain architecture within annotated bacterial lineages (no Candidatus or metagenomes) were selected. Each representative protein is marked with the genus species (represented as “Gspecies”), and the NCBI protein accession number. The Pfam and MobiDB annotations for each domain prediction are shown in the key. The arrow lengths represent the overall protein length. The characteristic DciA domain is indicated with a black arrowhead in the key (gray domains).
Bacterial DciA homolog domain architectures. (i) DciA domains are mostly loners.
The bacterial DciA proteins from InterPro fell into 37 bacterial lineages (Fig. 1A), plus many sequences annotated as Candidatus due to incomplete taxonomic classification (Table S1, https://github.com/JRaviLab/dcia_evolution). In line with sequencing and publication bias, Proteobacteria, Actinobacteria, and Bacteroidetes genomes were overrepresented in our homolog space (Fig. 1A) (48–50). Given the variation noted in the two eukaryotic DciA domain-containing proteins, we then characterized the domain architectures of the ~26,000 bacterial DciA homologs using MolEvolvR (39). We found that the vast majority (99.7%) of bacterial DciA proteins carried a lone DciA domain. In addition, we identified 38 variations where DciA homologs contained either multiple DciA domains or additional annotated domains (Fig. 1B and C). The domain architecture variations were often lineage specific, as described below, suggesting that they evolved later during speciation to adapt to lineage-specific biology.
(ii) Proteobacteria-specific variations in DciA domain architecture.
Within the Proteobacteria, we identified four lineage-specific DciA domain architectures (Fig. 1B), including some domains present in other proteins associated with DNA replication. Specifically, 15 proteobacterial DciA homologs contained a thioredoxin-like domain (Pfam family thioredoxin [TRX], Pfam ID PF13462; e.g., ODT99650.1, Rhodospirillales) (Fig. 1B and C). TRX domains are present in a large class of redox-regulated proteins (51) and play roles in oxidative stress responses (52, 53). In addition, TRX1 in the alphaproteobacterium Caulobacter crescentus (CcTRX1), an essential TRX domain protein, is upregulated during DNA replication initiation (54). Two other proteobacterial DciA homologs contain PDZ domains (Pfam ID PF13180; e.g., KQZ00574.1, Pseudolabrys) (Fig. 1B and C), which are generally involved in protein-protein interactions (55, 56) and could facilitate the interaction between these DciA proteins and other replication proteins. The PDZ domain-containing DciA homologs also carry trypsin-like peptidase domains (Pfam ID PF13365) (Fig. 1B and C), which have recently been linked to DNA replication in humans, where the trypsin-like peptidase domain in the protein FAM111A is necessary for overcoming replication fork stalling (57). The DciA homolog in the alphaproteobacterium Micavibrio aeruginosavorus (PZQ43964.1) encodes three translation elongation factor P domains (KOW-like, Pfam ID PF08207; OB, Pfam ID PF01132; and C-terminal, Pfam ID PF09285) (Fig. 1B and C).
(iii) Actinobacterial variations in DciA domain architectures.
We identified four distinct lineage-specific domain architectures within the Actinobacteria, all of which involve domains associated with nucleotide sensing and DNA replication. Three Streptomyces DciA homologs contain a YspA domain (Pfam family YAcAr, Pfam ID PF10686; e.g., SNC77843.1) in addition to the DciA domain (Fig. 1B and C). YspA domains typically have fusions to domains that sense and process nucleotide-derived ligands like ADP-ribose (58). We found that the DciA protein in Bifidobacterium callitrichos (PST49340.1) contains the C-terminal domain of the DEAD box RNA helicase family (Pfam ID PF00271) (59) and the res subunit of the type III restriction enzyme, which encodes ATPase activity (Pfam ID PF04851) (Fig. 1B and C) (60). In addition, five actinobacterial DciA homologs contain an N-terminal RecF/RecN/SMC domain (Pfam ID PF02463; e.g., OUE04448.1, Clavibacter) (Fig. 1B and C), which is common in the N termini of structural maintenance of chromosome (SMC) proteins. The SMC domain typically includes a nucleoside triphosphate (NTP)-binding motif, and SMC proteins are involved in chromosome partitioning and DNA recombination and repair (61–65). In addition, six actinobacterial DciA homologs contain multiple DciA domains (e.g., OEJ23183.1, Streptomyces) (Fig. 1B and C).
In addition to Actinobacteria-specific domain architectures, DciA proteins in Micromonospora endolithica (RKN42798.1) and Bacteroidetes species Rhodothermus marinus (BBM73918.1 and ACY49481.1) contain the γ/τ (Pfam ID PF12169) and δ (Pfam ID PF13177) subunit domains of DNA polymerase III (Fig. 1B and C), which make up part of the clamp loader complex in E. coli (66, 67). The DNA Pol III domains present in these DciA homologs also contain AAA+ ATPase domains, also found in DnaC/DnaI helicase loaders from E. coli and B. subtilis (29, 32, 33, 67).
(iv) Other DciA domain architectures involving domains associated with DNA replication.
Thirty-six DciA homologs from Nitrospinae and “Candidatus Rokubacteria” were annotated to contain tetratricopeptide repeat (TPR) domains (Fig. 1B and C). TPR domains are largely eukaryotic protein-protein interaction domains (68). For example, the TPR domain of the replication regulator Dia2 in eukaryotes is essential for the association of Dia2 with the replisome progression complex, which interacts with the MCM2-7 helicase at the replication fork (69). TPR proteins have also been identified in bacteria. For example, in the alphaproteobacterium Orientia tsutsugamushi, two TPR proteins bind to the eukaryotic RNA helicase DDX3 to inhibit host cell translation (70). In “Candidatus Rokubacteria”, these TPR repeats are sometimes associated with an anaphase-promoting complex domain (Pfam ID PF12895; e.g., OLB41025.1), which along with TPR repeat domains, is present in eukaryotic cell cycle regulators (71, 72).
The DciA homologs from two Planctomycetes (KAF0244841.1 and NUN49423.1) contain a histidine kinase-, DNA gyrase B-, and HSP90-like ATPase domain (Pfam ID PF02518), a DNA gyrase B domain (Pfam ID PF00204), a Toprim domain (DNA topoisomerase II Pfam ID PF01751), and a DNA gyrase B C terminus domain (Pfam ID PF00986). A closer look at one of the Planctomycetes DciA sequences (KAF0244841.1) shows that DciA and GyrB are annotated as a single fused protein. GyrB, part of the bacterial gyrase, is responsible for the negative supercoiling of dsDNA, which is essential during the opening of the DNA replication bubble (73).
(v) Homology search with multiple starting points identifies additional DciA domain architectures.
To ensure that we identified all sequenced DciA homologs and the entire repertoire of domain architectures in this protein family, we selected 94 unique lineage-domain architecture pairs from 36 bacterial lineages and 27 total domain architectures to identify novel DciA homologs across the tree of life using MolEvolvR (Fig. S2A). The only phylum without a representative from our initial analysis was Atribacterota, which did not contain class level assignments. In addition to the novel domain architectures discussed above, the homology search identified 8 new domain architectures, including 11 actinobacterial homologs with a DciA dyad (e.g., WP_165395910.1 from Streptomyces), several fusions with TPR repeats, and a DciA domain fused with an amino methyltransferase domain (WP_238693333.1, Micrococcus sp. HSID17227) (Fig. S2B). Together, our initial characterization along with these homologs captured the rich repertoire of DciA variation.
DciA single-annotated-domain homologs vary in protein length and the position of the DciA domain.
Despite having only one annotated domain, the DciA single-annotated-domain proteins varied considerably in length, ranging from 32 to 482 amino acids (Fig. 2A). In addition to total protein length, the distance of the DciA domain from the N or C terminus varied widely, where the DciA domain sometimes fell in the middle, at the N terminus, or at the C terminus of the protein (Fig. 2A). The median amino acid length of the annotated DciA domains in our data set is 85 aa (Table S1), indicating that DciA homologs close to this protein length would only contain the DciA structural domain. An example of a DciA protein where the total protein length is similar to the size of the single DciA structural domain is the Bacteroides fragilis DciA homolog (AUI45441.1), which is 96 aa long with the DciA domain annotated from amino acids 8 to 95. The AlphaFold structural prediction tool predicts that the entire B. fragilis DciA protein consists of the predicted DciA domain structure, with one alpha-helix followed by two beta-sheet motifs, one kinked alpha-helix, and a third beta-sheet (Fig. 2B) (34, 36). However, with a median protein length of 148 aa (Fig. 2A), most DciA single-annotated-domain proteins are longer than the typical DciA domain length, suggesting that there may exist other functional sequences in DciA single-annotated-domain proteins that are not annotated as domains. For example, V. cholerae DciA is 157 aa long, with the DciA domain positioned at the N terminus (amino acids 12 to 90), followed by a 67-aa C-terminal sequence extension. An NMR structure of the N terminus of V. cholerae DciA confirms that the DciA domain in the N terminus is sufficient to form the DciA domain structure (36). In addition to confirming the DciA domain structure in the N terminus, AlphaFold prediction of the full-length V. cholerae DciA protein (AAF95538.1) depicts the C terminus as an alpha-helix immediately following the DciA domain, connected to two terminal alpha-helices by an 11-aa linker (Fig. 2B). Given that the C terminus of V. cholerae DciA is required for its interaction with the replicative helicase in vitro (35), these data demonstrate not only that most DciA single-annotated-domain homologs are longer than the DciA domain alone, but also that the sequence extensions appended to the DciA domain can be physiologically relevant. The identification of DciA homologs with different domain architectures (Fig. 1), various protein lengths (Fig. 2A), and different positioning of the DciA domain (Fig. 2A) suggests that some DciA homologs have evolved sequence properties with likely functional consequences for DciA activity.
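The distances of the DciA domain from the two termini follow directly from the protein length and the annotated domain boundaries. The following is a minimal sketch of that arithmetic, assuming 1-based, inclusive domain coordinates; the function name is hypothetical, and the exact convention used to compute the distances in Fig. 2A may differ.

```python
def flank_lengths(protein_length, domain_start, domain_end):
    """Return (n_flank, c_flank) in amino acids.

    Coordinates are assumed to be 1-based and inclusive, so a domain that
    starts at residue 12 leaves 11 residues on its N-terminal side.
    """
    if not (1 <= domain_start <= domain_end <= protein_length):
        raise ValueError("domain coordinates must lie within the protein")
    n_flank = domain_start - 1
    c_flank = protein_length - domain_end
    return n_flank, c_flank


# Values quoted in the text.
vc_flanks = flank_lengths(157, 12, 90)  # V. cholerae DciA -> (11, 67)
bf_flanks = flank_lengths(96, 8, 95)    # B. fragilis DciA -> (7, 1)
```

With these coordinates, the V. cholerae protein has the 67-aa C-terminal extension described above, whereas the B. fragilis protein is almost entirely DciA domain.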
FIG 2.
Length and DciA domain positioning in DciA single-annotated-domain proteins. (A) Distributions of DciA single-annotated-domain protein lengths and distances of the DciA domain from the N and C termini across bacterial lineages. Summary statistics for the single-annotated-domain DciA proteins were calculated for all bacterial lineages (no Candidatus or metagenomes). The blue lines represent the 25th and 75th (light blue) and median/50th (dark blue) percentiles across lineages. (B) AlphaFold structural predictions for the B. fragilis (AUI45441.1) and V. cholerae (AAF95538.1) DciA proteins visualized with ChimeraX. Blue DciA bracket and numbering indicates the amino acids comprising the DciA domain based on Pfam annotation. Proteins are oriented with the N terminus on the left and the C terminus on the right. Color key indicates accuracy confidence (0 to 100).
Many DciA single-annotated-domain homologs are predicted to encode intrinsically disordered regions.
Small-angle X-ray scattering, intrinsic disorder prediction tools, and molecular dynamics simulations have shown that the C terminus of V. cholerae DciA, which is required for DnaB binding in vitro, contains an intrinsically disordered region (IDR) (35). In addition, there is precedent for IDRs in other bacterial DNA replication proteins, including SSB (74–77), the replication restart helicase Rep (78, 79), the helicase loaders DnaC and DnaI, and the replication initiation protein DnaA (35). To estimate the cooccurrence of IDRs with DciA domains, we used MobiDB-lite (from within MolEvolvR [80]) to predict IDRs in each of the bacterial DciA homologs (Fig. 3A and Table S1). Out of ~23,000 total bacterial non-Candidatus DciA single-annotated-domain homologs, ~8,000 proteins contained at least one IDR, and some (4,728) had multiple regions of disorder predicted. These IDRs were present in DciA homologs from 24 phyla (Fig. 3A). Analysis of the lengths of the IDRs in the DciA homologs from each phylum found that these IDRs ranged in length from 14 aa to 213 aa (Fig. 3B). The M. tuberculosis DciA was among the single-annotated-domain proteins predicted to contain IDRs both N and C terminal to the DciA domain. Deletion of the N-terminal sequence of the M. tuberculosis DciA that encodes the predicted IDR renders mycobacteria nonviable, demonstrating that this region predicted to form an IDR in DciA is essential for its cellular function (34).
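The tallies reported here (proteins with at least one IDR, proteins with multiple IDRs, and the range of IDR lengths) can be derived from per-protein lists of predicted disordered intervals. The sketch below assumes that the predictions have already been parsed into a dictionary mapping each protein to a list of 1-based, inclusive `(start, end)` pairs; this input layout, the function name, and the toy identifiers are hypothetical and do not represent the MobiDB-lite or MolEvolvR output format.

```python
def idr_summary(idrs_by_protein):
    """Summarize predicted IDRs given {protein_id: [(start, end), ...]}."""
    lengths = [
        end - start + 1
        for regions in idrs_by_protein.values()
        for start, end in regions
    ]
    return {
        "with_idr": sum(1 for r in idrs_by_protein.values() if len(r) >= 1),
        "with_multiple": sum(1 for r in idrs_by_protein.values() if len(r) > 1),
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
    }


# Toy input with made-up identifiers and coordinates.
toy_idrs = {
    "protA": [(1, 14)],
    "protB": [(5, 30), (100, 130)],
    "protC": [],
}
toy_summary = idr_summary(toy_idrs)
```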
FIG 3.
Disordered regions in DciA single-annotated-domain proteins. (A) Phyletic spreads of intrinsically disordered region (IDR)-containing DciA proteins. The heatmap shows the abundance of IDR-containing DciA proteins across bacterial lineages (no Candidatus or metagenomes). “x” in “IDR(x)” indicates the number of IDRs predicted. The color gradient indicates the number of homologs in a particular lineage. (B) Distribution of IDR lengths across bacterial lineages. Summary statistics for the single-annotated-domain DciA proteins with IDRs were calculated and plotted for each bacterial lineage. The blue lines represent the 25th and 75th (light blue) and median/50th (dark blue) percentiles across lineages.
DciA single-annotated-domain homologs can be separated into four groups based on protein length and the positioning of the DciA domain.
The importance of the predicted IDR sequences for DciA activity in M. tuberculosis (34) and V. cholerae (35), along with the prevalence of predicted IDRs in DciA homologs, suggests that IDRs in other DciA homologs may also be functionally relevant. In addition, we hypothesized that the N- and C-terminal sequence extensions in DciA single-annotated-domain proteins without predicted IDRs may also be functionally relevant. Since the minimum IDR sequence length in DciA homologs was 14 aa (Fig. 3B), we reasoned that any N- or C-terminal sequence extension of ≥14 aa in a DciA protein could comprise a functionally relevant sequence. To identify DciA proteins with potentially relevant sequence extensions associated with the DciA domain, we binned the bacterial non-Candidatus DciA single-annotated-domain homologs (23,309 total proteins) into four groups based on sequence length and position of the DciA domain, as follows: (i) group 1 proteins with ≥14 aa N terminal to the DciA domain and <14 aa C terminal, (ii) group 2 proteins with <14 aa N terminal to the DciA domain and ≥14 aa C terminal, (iii) group 3 proteins with ≥14 aa both N and C terminal to the DciA domain, and (iv) group 4 proteins with <14 aa on both N and C termini (Fig. 4, top, and Table S1). Group 4 proteins would, therefore, be classified as truly single-domain DciA proteins.
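The four-group binning rule described above can be written as a short function. This is a minimal sketch: the function name and the `threshold` parameter are hypothetical, and the flank lengths are assumed to be the numbers of residues N terminal and C terminal to the annotated DciA domain.

```python
def dcia_group(n_flank, c_flank, threshold=14):
    """Assign a single-annotated-domain DciA protein to group 1, 2, 3, or 4.

    Group 1: >=14 aa N terminal only; group 2: >=14 aa C terminal only;
    group 3: >=14 aa on both sides; group 4: <14 aa on both sides.
    """
    long_n = n_flank >= threshold
    long_c = c_flank >= threshold
    if long_n and long_c:
        return 3
    if long_n:
        return 1
    if long_c:
        return 2
    return 4


# Flank lengths derived from coordinates quoted earlier in the text,
# assuming 1-based inclusive domain boundaries.
vc_group = dcia_group(11, 67)  # V. cholerae DciA (157 aa, domain 12 to 90)
bf_group = dcia_group(7, 1)    # B. fragilis DciA (96 aa, domain 8 to 95)
```

Under these assumptions, the V. cholerae protein falls into group 2 and the B. fragilis protein into group 4.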
FIG 4.
Grouping of DciA single-annotated-domain proteins and their phyletic spread. Top, the four groups of bacterial DciA single-annotated-domain proteins binned based on the lengths of flanking N- and C-terminal extensions of DciA domains as follows: (i) group 1 proteins with ≥14 aa N terminal to the DciA domain and <14 aa C terminal, (ii) group 2 proteins with <14 aa N terminal to the DciA domain and ≥14 aa C terminal, (iii) group 3 proteins with ≥14 aa both N and C terminal to the DciA domain, and (iv) group 4 proteins with <14 aa on both N and C termini. (A to C) Stacked bar plots of 4 groups of DciA single-annotated-domain proteins are plotted across all bacterial lineages (A), with a focus on predominant lineages further resolved into subphyla (Proteobacteria, Actinobacteria, and Bacteroidetes) (B) and other lineages (C). The number of homologs in each group is further characterized based on their lineage-wise distribution. No proteins from bacterial Candidatus or metagenomes are displayed.
We then analyzed the groupings of DciA homologs within each phylum. Most (~90%) actinobacterial DciA homologs fell into group 3, with sequences on both sides of the DciA domain, although there were examples of actinobacterial DciA proteins in each of the other three groups as well (Fig. 4A). When we separated the actinobacterial DciA proteins by class, we found that Actinomycetia and Coriobacteriia DciA proteins were mostly in group 3, whereas Acidimicrobiia, Nitriliruptoria, and Thermoleophilia DciA proteins were mostly in group 1 (Fig. 4B). Group 3 also contained most (~65%) proteobacterial DciA homologs; however, there were also 2,958 proteobacterial DciA proteins in group 2, as well as representatives in groups 1 and 4 (Fig. 4A). When we separated proteobacterial DciA proteins into individual classes, homologs from Alphaproteobacteria, Betaproteobacteria, Deltaproteobacteria, Gammaproteobacteria, Oligoflexia, and Zetaproteobacteria fell mostly into group 3 (Fig. 4B). In contrast, DciA homologs in Epsilonproteobacteria and Hydrogenophilalia mostly fell into group 2. Unlike both Actinobacteria and Proteobacteria DciA proteins, Bacteroidetes DciA homologs fell almost exclusively into groups 1 and 4, with the majority (~84%) belonging to group 4 (Fig. 4A). When we separated Bacteroidetes DciA homologs into classes, we found that only the Cytophagia contained mostly group 1 homologs, while the rest of the Bacteroidetes classes contained mostly group 4 homologs, indicating lineage- and class-specific selection of DciA homolog sequence structures (Fig. 4B).
To better visualize the distribution of DciA single-annotated-domain proteins from phyla with smaller numbers of representative sequences, we removed the proteobacterial, actinobacterial, and Bacteroidetes sequences and plotted the distribution of the remaining DciA single-annotated-domain homologs in groups 1 to 4 (Fig. 4C). Some lineages had almost all DciA homologs classed within a single group, indicating the evolution and conservation of a particular DciA sequence organization for that lineage. For example, 94% of Verrucomicrobia and 96% of Chlamydiae DciA homologs fell into group 1. Other lineages had DciA homologs more broadly distributed across groups, although there were still lineage-specific biases toward single groups. More than 65% of Elusimicrobia, Lentisphaerae, Ignavibacteriae, and Fibrobacteres DciA proteins were classed in group 1, >65% of Cyanobacteria, Nitrospirae, Deinococcus-Thermus, and Thermodesulfovibrio DciA homologs were classed in group 2, and >65% of Synergistetes DciA homologs were classed in group 3. Planctomycetes exhibited differentiation of DciA homolog grouping at the class level. Planctomycetes class Planctomycetia homologs were mostly in group 1, while Planctomycetes class Phycisphaerae homologs were split evenly between groups 1 and 3 (Fig. 4C; Table S1). Firmicutes and Fusobacteria DciA homologs were represented equally in groups 2 and 3, with a small number of representatives in groups 1 and 4. In contrast, Gemmatimonadetes DciA homologs were predominantly split between groups 1 and 4, and Nitrospinae homologs were split between groups 1 and 2.
In addition to Bacteroidetes, other lineages that were predominantly in group 4 included Chloroflexi (56% of DciA homologs), Rhodothermaeota (81%), Calditrichaeota (74%), Chlorobi (86%), and Thermotogae (91%) (Fig. 4C). Group 4 proteins only encode the DciA structural domain, indicating that the DciA domain is sufficient for DciA activity in these bacteria, although this has yet to be experimentally tested. DciA homologs from Acidobacteria, Spirochaetes, and Armatimonadetes did not follow any apparent trends in DciA homolog classification, and Kiritimatiellaeota, Aquificae, Chrysiogenetes, Abditibacteriota, Tenericutes, Balneolaeota, Caldiserica, Dictyoglomi, Atribacterota, and Deferribacteres had <20 sequences in our data set. The lineage- and class-specific trends observed in the four different DciA groupings suggest differentiation of DciA proteins in a largely lineage-specific manner. The sequences N and/or C terminal to the DciA domain in groups 1 to 3 could have functional consequences for the mechanisms of DciA proteins in different bacterial lineages.
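The kind of tally summarized above can be illustrated with a short Python sketch. It assigns a single-annotated-domain protein to a group according to which side(s) of the DciA domain carry additional sequence (group 1, N-terminal extension; group 2, C-terminal extension; group 3, both; group 4, neither, as read from the representative proteins described in this section) and then reports the percentage of each lineage's homologs per group. The `min_ext` cutoff, the function names, and the toy records are hypothetical and are not the criteria or data used in this study.

```python
from collections import Counter, defaultdict


def assign_group(length, dom_start, dom_end, min_ext=20):
    """Assign a group (1-4) from 1-based DciA domain coordinates.

    min_ext is an ASSUMED minimum number of residues that counts as a
    sequence extension; the study's actual cutoff is not restated here.
    """
    n_ext = (dom_start - 1) >= min_ext      # residues before the domain
    c_ext = (length - dom_end) >= min_ext   # residues after the domain
    if n_ext and c_ext:
        return 3
    if n_ext:
        return 1
    if c_ext:
        return 2
    return 4


def group_percentages(records):
    """records: iterable of (lineage, length, dom_start, dom_end)."""
    counts = defaultdict(Counter)
    for lineage, length, start, end in records:
        counts[lineage][assign_group(length, start, end)] += 1
    return {
        lineage: {g: 100.0 * c[g] / sum(c.values()) for g in (1, 2, 3, 4)}
        for lineage, c in counts.items()
    }


# Toy records with made-up coordinates (not taken from Table S1).
toy = [
    ("LineageA", 150, 60, 145),  # N-terminal extension only
    ("LineageA", 100, 5, 95),    # domain only
    ("LineageB", 200, 3, 90),    # C-terminal extension only
    ("LineageB", 250, 70, 160),  # extensions on both sides
]
print(group_percentages(toy))
```

Keeping the cutoff as an explicit parameter makes it easy to check how sensitive the per-lineage percentages are to the definition of an extension.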
Structure predictions of DciA single-annotated-domain homologs reveal conserved structural motifs within and outside the DciA domain.
To understand the impact of sequence extensions on protein structure in DciA single-annotated-domain homologs, we used AlphaFold structure prediction to model representative DciA proteins from groups 1 to 4 (Fig. 5). For groups 1 to 3, we also compared the structures of homologs with and without IDRs predicted in the sequence extensions. Regardless of group designation, all DciA domains were predicted to fold into a structure similar to that previously predicted for the M. tuberculosis DciA (34) and experimentally validated for the V. cholerae DciA (Fig. 5) (36).
FIG 5.
Structure prediction of representative DciA homologs by groups (with and without IDR). Shown are representative AlphaFold structural predictions, including group 1 DciA proteins from A. aerolatus (left; GEO04242.1) and R. islandica (right; KLU02380.1), group 2 DciA proteins from P. aeruginosa (left; AAG07793.1) and O. acuminata (right; AFY85399.1), group 3 DciA proteins from L. interrogans (left; AAN47203.1) and M. tuberculosis (right; BAL63824.1), and group 4 DciA proteins from R. conorii (left; AAL03818.1) and T. maritima (right; AAD35914.1). Models were visualized with ChimeraX. Proteins are oriented with the N terminus on the left and the C terminus on the right, with the termini marked on each structure. DciA domains are identified in blue text and IDRs with black text and brackets, with the corresponding amino acids annotated for each. Key indicates accuracy confidence (0 to 100). See Materials and Methods for accession codes of the models deposited in ModelArchive.
For a group 1 DciA from Planctomycetes (KLU02380.1, Rhodopirellula islandica) that is predicted to have a 48-aa IDR N terminal to the DciA domain, AlphaFold predicts that the N-terminal sequence extension does not form any structured helices or beta sheets, consistent with this region being intrinsically disordered (Fig. 5). In contrast, the structural prediction of a group 1 DciA single-annotated-domain protein from Bacteroidetes (GEO04242.1, Adhaeribacter aerolatus) with no predicted IDR displays the N-terminal sequence extension beginning with a single alpha-helix, connected to the DciA domain by an unstructured linker (Fig. 5).
A group 2 DciA homolog from cyanobacteria (AFY85399.1, Oscillatoria acuminata) that is predicted to contain a C-terminal IDR exhibited the classic DciA domain folds followed by one alpha-helix connected to two more C-terminal helices via a 24-aa linker, which coincided with the predicted IDR (Fig. 5). The group 2 DciA homolog from P. aeruginosa (AAG07793.1, Gammaproteobacteria), with no predicted IDR, contains the DciA domain folds connected to two C-terminal helices via a 16-aa linker (Fig. 5), which is reminiscent of the structure of the O. acuminata DciA.
The group 3 DciA protein from the actinobacterium M. tuberculosis (BAL63824.1) consists of a single 18-aa alpha-helix in its N terminus that is connected to the DciA domain by a 34-aa linker region, which is predicted to contain an IDR (Fig. 5). The C terminus of the M. tuberculosis DciA is also predicted to contain a 19-aa IDR. A group 3 protein with no predicted IDR from Leptospira interrogans (AAN47203.1, Spirochaetes) contains sequence extensions at both its N and C termini (Fig. 5), with the same double alpha-helix fold in its C terminus as observed in the group 2 proteins of P. aeruginosa and O. acuminata. In this protein, the DciA domain is also connected to the two-alpha-helix domain by a 20-aa unstructured linker region.
Similar to the structure predicted for the B. fragilis DciA (Fig. 2B), other group 4 DciA homologs from Thermotogae (AAD35914.1, Thermotoga maritima) and Alphaproteobacteria (AAL03818.1, Rickettsia conorii) only contain the folds previously reported for the DciA domain in V. cholerae (Fig. 5) (36). These predictions are consistent with the idea that the DciA structural domain alone is sufficient for DciA activity in group 4 bacteria.
Visualization of these representative DciA structures reveals that when unannotated sequences are appended to the DciA domain (groups 1 to 3), they tend to form an unstructured linker region connected to 1 or 2 alpha-helices at the termini, regardless of whether the sequences are positioned N or C terminal to the DciA domain. In some group 1 to 3 DciA homologs, the linker is predicted to comprise an IDR. Notably, the V. cholerae DciA has an experimentally verified C-terminal IDR (35), although this was not predicted by MobiDB-lite (Table S1). Analysis of the V. cholerae DciA using circular dichroism and secondary structure prediction tools indicates that the C-terminal IDR can transiently form two alpha-helices (Fig. 2B) (35). These helices occur at a similar position in the P. aeruginosa DciA as well (Fig. 5). Therefore, it is possible that the unstructured linker motifs in P. aeruginosa and other group 1 to 3 DciA homologs comprise IDRs that are not predicted by MobiDB-lite because of the alpha-helices that can form in the termini. The conservation of the predicted structures of DciA homologs across bacterial lineages, where the DciA domain is connected to alpha-helical structures via an unstructured linker, suggests that these structural motifs are important for DciA activity in bacteria that encode group 1 to 3 homologs.
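One way to look more closely at a predicted linker is to read the per-residue confidence from a model file. AlphaFold writes its per-residue confidence score (pLDDT, 0 to 100) into the B-factor column of PDB-format output, and low-confidence stretches often coincide with flexible or disordered regions. The Python sketch below averages that score over a residue range. The function names and the synthetic demo records are hypothetical, and this check was not part of the analyses reported here.

```python
def ca_plddt(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor field of CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, residue number 23-26,
        # B-factor 61-66 (1-based), hence the 0-based slices below.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def mean_plddt(scores, start, end):
    """Mean pLDDT over residues start..end (inclusive); None if no residues."""
    vals = [scores[i] for i in range(start, end + 1) if i in scores]
    return sum(vals) / len(vals) if vals else None


# Synthetic four-residue example (made-up values, not a deposited model).
demo = "\n".join(
    f"ATOM  {i:5d}  CA  ALA A{i:4d}    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}"
    for i, b in enumerate([90.0, 85.0, 40.0, 35.0], start=1)
)
scores = ca_plddt(demo)
print(mean_plddt(scores, 1, 2), mean_plddt(scores, 3, 4))
```

A low mean score over a linker does not by itself establish intrinsic disorder, so such values are best treated as a prompt for experimental follow-up.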
DciA evolution across the tree of life.
The natural next and final question is how these different DciA proteins evolved: are there species- or lineage-specific, domain architecture-specific, or group-specific migration patterns? To address these questions, we used all DciA proteins presented in Fig. 1 to generate a phylogenetic tree (Fig. 6). Most strikingly, we observed that DciA homologs clustered by lineages (Fig. 6A). The three largest lineages, Actinobacteria, Bacteroidetes, and Proteobacteria, are labeled based on their dominant membership (Fig. 6A). We noted two main proteobacterial clades (left and bottom), likely due to class-wise grouping. We therefore further resolved the tree by the predominant bacterial classes from these three phyla (Fig. 6B). The class-resolved tree explains the distinct proteobacterial clusters observed in the phylum-based tree (Fig. 6A), wherein Alphaproteobacteria and Beta-/Gammaproteobacteria form distinct clusters (Fig. 6B). This migration pattern of the alphaproteobacterial homologs suggests that the DciA proteins in this lineage have diverged evolutionarily from the rest of the proteobacterial lineages. The phylogenetic analysis indicates that variations in DciA protein domain architectures, protein lengths, and DciA domain positioning likely occurred after lineage-specific divergence of bacterial classes. Overall, the DciA phylogenetic tree delineates the evolution of this critical panbacterial protein across all major lineages.
FIG 6.
DciA evolution across the tree of life. All DciA domain-containing proteins presented in Fig. 1 were used to reconstruct the DciA phylogenetic tree. Kalign3 (87) was used for multiple-sequence alignment, and FastTree (88) was used to generate the tree. The resulting tree was visualized using FigTree (http://tree.bio.ed.ac.uk/software/figtree/). The tree was colored by major lineages (left, bacterial phyla; right, key bacterial classes) with the remaining DciA proteins in black. The three predominant bacterial lineages, Proteobacteria, Actinobacteria, and Bacteroidetes, are marked on the phylum-based tree next to their corresponding largest clusters (A), and the major bacterial classes are marked next to their largest clusters in the class-based tree (B).
DISCUSSION
The recent discovery of DciA as a predicted helicase operator in bacteria (28, 34, 36) has begun to shed light on a long-standing open question of how the majority of bacteria facilitate helicase activity during DNA replication in the absence of the ATPase helicase loaders expressed by E. coli and B. subtilis. The wide distribution of DciA in diverse bacterial phyla indicates that these proteins likely represent the predominant paradigm for helicase operation, despite not being conserved in E. coli and B. subtilis, the organisms typically used as models for bacterial replication. DciA proteins are defined by the presence of the DciA domain (28). Prior phylogenetic analysis indicates that dnaC and dnaI homologs were acquired through evolution at the expense of dciA (named for dna[CI] antecedent) (28, 30, 31), suggesting that DciA and DnaC/DnaI perform a common function. In addition, it has been shown that DciA interacts with the replicative helicase and is required for DNA replication and viability in the few organisms in which it has been studied (28, 34–36). However, the mechanism by which DciA mediates replication initiation is still unknown.
Our comprehensive evolutionary study of DciA proteins has revealed both lineage-specific and conserved features among homologs. We find that most homologs are DciA single-annotated-domain proteins in bacteria, with rare instances of additional annotated domains in DciA homologs, many of which have known roles in DNA replication and repair (Fig. 1 and Fig. S2). These additional domain architectures were predominantly phylum specific and sometimes species specific, suggesting that they have been acquired and maintained to facilitate lineage-specific requirements during DNA replication. Further study of these DciA domain architectures could shed light on various mechanisms of the regulation of bacterial replication, or other roles for DciA in the cell. Similarly, we identified two eukaryotic proteins that encode partial DciA domains (Fig. S1), raising the question of how this domain would function in eukaryotes in the absence of its bacterial replicative helicase binding partner. The eukaryotic DciA domain-containing proteins also harbored additional domains without known connections to DNA replication, possibly suggesting that the DciA domain has been coopted for other purposes in these organisms.
Even though most bacterial DciA homologs were single-annotated-domain proteins, they exhibited a wide variety of sequence lengths and positioning of the DciA domain (Fig. 2 and Table S1). When we grouped the DciA homologs based on the position of the DciA domain and the presence of N- and C-terminal extensions, we identified lineage-specific trends in these sequence features (Fig. 4), suggesting that these variations mostly evolved following speciation and highlighting how the regulation of helicase activity during DNA replication initiation could differ between phyla. For example, most actinobacterial and proteobacterial DciA single-annotated-domain homologs fell into groups 2 and 3, which harbored sequence extensions either C terminally or on both sides of the DciA domain, while most Bacteroidetes DciA homologs were classed in group 4, encoding only the DciA structural domain. The sequence extensions in the DciA proteins of the actinobacterium M. tuberculosis and the proteobacterium V. cholerae have been shown to be essential for DciA activity in vivo and in vitro, respectively (34, 35). The absence of these sequences in most Bacteroidetes homologs suggests differing requirements for DciA activity in different bacteria.
Despite the lineage-specific grouping of DciA homologs based on sequence lengths and positioning of the DciA domain, AlphaFold structural prediction of representative DciA homologs from each group revealed common structural patterns (Fig. 5). The DciA domain structure (36) was conserved across DciA homologs and has previously been noted to resemble the structure of the N terminus of DnaA (34–36, 81). The N terminus of DnaA is critical for the interaction with the DnaB replicative helicase and other regulators (82, 83); however, the role of the DciA domain in the interaction with DnaB has yet to be established. A tryptophan residue conserved in the DciA domains of many DciA homologs is positioned similarly to a phenylalanine residue in the DnaA N terminus that has been predicted to have a key role in making contacts between DnaA and its interacting partners, including DnaB (34, 84). Mutation of the tryptophan in the DciA domain of M. tuberculosis DciA results in slow growth and decreased DNA replication (34), supporting a key role for this residue in DciA function in vivo in mycobacteria. However, not all DciA homologs encode this tryptophan residue within their DciA domains, a key example being the V. cholerae DciA, which has an isoleucine at this position (35). Therefore, even the defining DciA domain of DciA proteins exhibits some variation in different bacteria that may reflect differences in mechanisms of action or modes of interaction with the replicative helicase.
In addition to conservation of the DciA domain structure, when sequence extensions were associated with the DciA domain, they tended to form unstructured linkers terminating in 1 or 2 alpha-helices (Fig. 5). These linker and alpha-helical structures were identified either C or N terminal to the DciA domain, depending on the homolog, where the role of this positioning in terms of function is still unknown. At least one third of the DciA single-annotated-domain homologs were also predicted to contain IDRs (Fig. 3 and 5). The IDR C terminal to the DciA domain in the V. cholerae homolog is required for its interaction with the DnaB helicase (35). IDRs in other replication proteins, including SSB and Rep, also play key roles in facilitating protein-protein interactions within the replisome (74–79), indicating a common theme and function of these domains during bacterial DNA replication. However, if the IDR in the unstructured linker sequence is required for the interaction between DciA and the replicative helicase, this would imply that group 4 DciA homologs, which only encode the DciA structural domain and no linker domains, employ other sequences or mechanisms to interact with DnaB. Thus far, no DciA homologs in group 4 have been studied either genetically or biochemically, and so, how the DciA domain functions on its own remains a mystery.
Our computational evolutionary analyses have enabled an in-depth delineation of the evolution of DciA across the tree of life and elucidated the variation in DciA domain architectures and cooccurrences with IDRs and the many flavors of bacterial DciA sequence-structural features. Despite this deep exploration, many unknowns remain regarding DciA proteins and bacterial DNA replication. Our results highlight the complexities and diversity that have evolved in the fundamental process of DNA replication, where no single species of bacteria will be able to represent a central dogma that holds true throughout the kingdom. These studies provide a framework for researchers to consider the evolutionary variation while dissecting the mechanistic basis for helicase operation in bacteria.
MATERIALS AND METHODS
Analysis using MolEvolvR.
(i) Query selection. We started with ~27,000 DciA-carrying proteins from Pfam (UniProt sequences from the InterPro database [38]). We added relevant metadata to each of these homologs, including corresponding NCBI protein accession numbers, protein length, taxID, species, and lineages. We used MolEvolvR (39) and the underlying InterProScan (85) to explore the domain architectures, cellular localizations, and disorder predictions for all ~27,000 DciA proteins. The resulting domain architectures (from MolEvolvR) were also appended to each homolog’s metadata. All these data are available in Table S1 and on GitHub (https://github.com/JRaviLab/dcia_evolution).
(ii) Domain architectures. We used MolEvolvR (39) to determine and characterize all DciA-containing proteins from InterPro and their homologs. We first downloaded the sequences of all DciA-containing proteins identified by InterPro from UniProt. The domain architectures were identified using InterProScan (85) and analyzed by lineage, quantified with heatmaps, and visualized by unique combinations of Pfam domains. During this analysis, we renamed the MobiDB-lite predictions to “intrinsically disordered regions” to be consistent with previous literature. Also, DciA homologs with TPR fusions were condensed by the number of TPR repeats for clarity, e.g., TPR+TPR+TPR was condensed to TPR (3). We then aggregated relevant species and protein annotation metadata from the NCBI into our combined data set. As queries for the MolEvolvR homology search, we selected 94 representatives that covered each combination of bacterial lineage (excluding Atribacterota, which did not contain class-level assignments) and Pfam domain architecture.
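The TPR condensation described in (ii) can be sketched in Python as follows. This is a minimal illustration, assuming that domain architectures are stored as "+"-delimited strings; the function name, the "DciA" label used in the example, and the choice to leave a single TPR unchanged are assumptions rather than the code used in MolEvolvR.

```python
from itertools import groupby


def condense_repeats(architecture, domain="TPR", sep="+"):
    """Collapse consecutive runs of `domain`, e.g. TPR+TPR+TPR -> TPR (3)."""
    condensed = []
    for name, run in groupby(architecture.split(sep)):
        n = len(list(run))
        if name == domain and n > 1:
            condensed.append(f"{name} ({n})")
        else:
            # Other domains (and a lone TPR) are kept exactly as written.
            condensed.extend([name] * n)
    return sep.join(condensed)


print(condense_repeats("TPR+TPR+TPR"))
print(condense_repeats("DciA+TPR+TPR"))
```

Because only consecutive runs are collapsed, the order of domains within the architecture is preserved.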
(iii) Homology search. These 94 proteins were submitted to MolEvolvR to identify homologs in the NCBI RefSeq nonredundant proteins database (86). The domain architectures and sequence-structure motif predictions, as well as lineage and protein-related metadata, were determined using MolEvolvR for each of the homologs (as described above) for downstream analyses. The resulting data were summarized and visualized from within MolEvolvR using custom R scripts.
BLASTP homology search of nonbacterial DciA proteins.
We performed homology searches of the 27 nonbacterial DciA proteins using NCBI’s BLASTP (40). When examining the results, we excluded any of these 27 nonbacterial query proteins if fewer than 5% of its top 100 hits (or 23 hits in the case of the protein with accession number RLI67678.1) were nonbacterial, that is, if more than 95% of the hits were bacterial.
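The exclusion rule above can be expressed as a small Python function. This is a sketch only: it assumes that each BLASTP hit has already been annotated with a superkingdom label, and the function name, parameter names, and label strings are hypothetical.

```python
def keep_query(hit_superkingdoms, max_hits=100, min_nonbacterial_pct=5):
    """Return True if a nonbacterial query passes the filter.

    hit_superkingdoms: superkingdom labels of the hits, best hit first.
    A query is excluded when fewer than `min_nonbacterial_pct` percent of
    its top hits (at most `max_hits`) are nonbacterial.
    """
    top = hit_superkingdoms[:max_hits]
    if not top:
        return False  # ASSUMPTION: a query with no hits is not retained
    nonbacterial = sum(1 for label in top if label != "Bacteria")
    # Integer comparison avoids floating-point edge cases at exactly 5%.
    return 100 * nonbacterial >= min_nonbacterial_pct * len(top)


# 4 of 100 hits nonbacterial -> excluded; 5 of 100 -> retained.
print(keep_query(["Eukaryota"] * 4 + ["Bacteria"] * 96))
print(keep_query(["Eukaryota"] * 5 + ["Bacteria"] * 95))
```

Using the number of hits actually returned as the denominator handles queries with fewer than 100 hits, such as the one with 23 hits mentioned above.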
Phylogenetic tree.
FASTA sequences of all DciA-containing proteins from InterPro (~27,000) were obtained from UniProtKB. These sequences were aligned using kalign3 (87), and a phylogenetic tree was constructed using FastTree (88). The resulting tree was visualized with FigTree (http://tree.bio.ed.ac.uk/software/figtree/) and color coded by lineage.
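The alignment and tree-building steps can be chained with a short Python wrapper. This is a sketch under stated assumptions: the executables are assumed to be installed and named `kalign` and `FastTree` on the PATH, default program options are assumed because the options used in this study are not restated here, and all file names are hypothetical.

```python
import shutil
import subprocess


def build_commands(fasta, aln, kalign_bin="kalign", fasttree_bin="FastTree"):
    """Return the two command lines: align with Kalign, then build the tree."""
    align_cmd = [kalign_bin, "-i", fasta, "-o", aln]
    # FastTree treats the input as a protein alignment by default and
    # writes the Newick tree to standard output.
    tree_cmd = [fasttree_bin, aln]
    return align_cmd, tree_cmd


def run_pipeline(fasta, aln, tree, **bins):
    """Run both steps and write the tree to `tree` (hypothetical paths)."""
    align_cmd, tree_cmd = build_commands(fasta, aln, **bins)
    for cmd in (align_cmd, tree_cmd):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run(align_cmd, check=True)
    with open(tree, "w") as handle:
        subprocess.run(tree_cmd, check=True, stdout=handle)


print(build_commands("dcia.fasta", "dcia.afa"))
```

The resulting Newick file can then be opened in FigTree for lineage-based coloring.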
AlphaFold structure prediction.
We used the AlphaFold structural prediction Colab notebook (37, 89) via the ChimeraX 1.4 daily build software (90) downloaded on 2 July 2022 for all protein models. Visualization and analyses of models were performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (90).
Data availability.
All our data, analyses, and visualizations summarizing the DciA homologs across the bacterial kingdom, along with their domain architectures and phyletic spreads, are available at https://github.com/JRaviLab/dcia_evolution. Detailed legends for our data tables, structure predictions (PDB format, under “model_structures”), and sequences used for tree generation are also available in our GitHub repository. All AlphaFold structures have been deposited via ModelArchive (http://modelarchive.org/) and are available with the following accession codes: ma-q8sq3 (K. uniflora), ma-vyetl (H. bicolor E), ma-z6rsv (B. fragilis), ma-1hvpi (V. cholerae), ma-tk4v8 (R. islandica), ma-9cnra (A. aerolatus), ma-v2jc1 (O. acuminata), ma-ibgex (P. aeruginosa), ma-2ovw9 (M. tuberculosis), ma-n7tbg (L. interrogans), ma-02qnb (T. maritima), and ma-3ee0e (R. conorii).
ACKNOWLEDGMENTS
We are very grateful to the Midwest Microbial Pathogenesis Conference (MMPC) 2021 organizers for providing H.C.B. a travel award and the opportunity to present the DciA story. This interactive venue enabled the start of this collaboration between J.R., J.T.B., C.L.S., and H.C.B. C.L.S. is supported by a Burroughs Wellcome Fund investigator in the pathogenesis of infectious disease award. H.C.B. is supported by the Sondra Schlesinger student fellowship in molecular microbiology. J.R. is supported by Michigan State University (MSU) College of Veterinary Medicine endowed research funds and MSU start-up funds.
UCSF ChimeraX has support from the National Institutes of Health (grant number R01-GM129325) and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases.
Footnotes
Supplemental material is available online only.
Contributor Information
Janani Ravi, Email: janani@msu.edu, janani.ravi@cuanschutz.edu.
Christina L. Stallings, Email: stallings@wustl.edu.
Michael J. Federle, University of Illinois at Chicago
REFERENCES
- 1. Chodavarapu S, Kaguni JM. 2016. Replication initiation in bacteria. Enzymes 39:1–30. doi:10.1016/bs.enz.2016.03.001.
- 2. Fuller RS, Funnell BE, Kornberg A. 1984. The DnaA protein complex with the E. coli chromosomal replication origin (oriC) and other DNA sites. Cell 38:889–900. doi:10.1016/0092-8674(84)90284-8.
- 3. Fuller RS, Kornberg A. 1983. Purified DnaA protein in initiation of replication at the Escherichia coli chromosomal origin of replication. Proc Natl Acad Sci USA 80:5817–5821. doi:10.1073/pnas.80.19.5817.
- 4. Jameson KH, Wilkinson AJ. 2017. Control of initiation of DNA replication in Bacillus subtilis and Escherichia coli. Genes 8:22. doi:10.3390/genes8010022.
- 5. Kaguni JM. 2011. Replication initiation at the Escherichia coli chromosomal origin. Curr Opin Chem Biol 15:606–613. doi:10.1016/j.cbpa.2011.07.016.
- 6. Mott ML, Berger JM. 2007. DNA replication initiation: mechanisms and regulation in bacteria. Nat Rev Microbiol 5:343–354. doi:10.1038/nrmicro1640.
- 7. Schaper S, Messer W. 1995. Interaction of the initiator protein DnaA of Escherichia coli with its DNA target. J Biol Chem 270:17622–17626. doi:10.1074/jbc.270.29.17622.
- 8. O’Donnell M, Langston L, Stillman B. 2013. Principles and concepts of DNA replication in bacteria, archaea, and eukarya. Cold Spring Harb Perspect Biol 5:a010108. doi:10.1101/cshperspect.a010108.
- 9. Bramhill D, Kornberg A. 1988. Duplex opening by DnaA protein at novel sequences in initiation of replication at the origin of the E. coli chromosome. Cell 52:743–755. doi:10.1016/0092-8674(88)90412-6.
- 10. Meyer RR, Laine PS. 1990. The single-stranded DNA-binding protein of Escherichia coli. Microbiol Rev 54:342–380. doi:10.1128/mr.54.4.342-380.1990.
- 11. Baker TA, Funnell BE, Kornberg A. 1987. Helicase action of DnaB protein during replication from the Escherichia coli chromosomal origin in vitro. J Biol Chem 262:6877–6885. doi:10.1016/S0021-9258(18)48326-3.
- 12. LeBowitz JH, McMacken R. 1986. The Escherichia coli DnaB replication protein is a DNA helicase. J Biol Chem 261:4738–4748. doi:10.1016/S0021-9258(17)38564-2.
- 13. Lewis JS, Jergic S, Dixon NE. 2016. The E. coli DNA replication fork. Enzymes 39:31–88. doi:10.1016/bs.enz.2016.04.001.
- 14. Oakley AJ. 2019. A structural view of bacterial DNA replication. Protein Sci 28:990–1004. doi:10.1002/pro.3615.
- 15. Gorbalenya AE, Koonin EV. 1993. Helicases: amino acid sequence comparisons and structure-function relationships. Curr Opin Struct Biol 3:419–429. doi:10.1016/S0959-440X(05)80116-2.
- 16. Leipe DD, Aravind L, Grishin NV, Koonin EV. 2000. The bacterial replicative helicase DnaB evolved from a RecA duplication. Genome Res 10:5–16.
- 17. Fernandez AJ, Berger JM. 2021. Mechanisms of hexameric helicases. Crit Rev Biochem Mol Biol 56:621–639. doi:10.1080/10409238.2021.1954597.
- 18. Itsathitphaisarn O, Wing RA, Eliason WK, Wang J, Steitz TA. 2012. The hexameric helicase DnaB adopts a nonplanar conformation during translocation. Cell 151:267–277. doi:10.1016/j.cell.2012.09.014.
- 19. Spinks RR, Spenkelink LM, Stratmann SA, Xu Z-Q, Stamford NPJ, Brown SE, Dixon NE, Jergic S, van Oijen AM. 2021. DnaB helicase dynamics in bacterial DNA replication resolved by single-molecule studies. Nucleic Acids Res 49:6804–6816. doi:10.1093/nar/gkab493.
- 20. Nakayama N, Arai N, Kaziro Y, Arai K. 1984. Structural and functional studies of the DnaB protein using limited proteolysis. Characterization of domains for DNA-dependent ATP hydrolysis and for protein association in the primosome. J Biol Chem 259:88–96. doi:10.1016/S0021-9258(17)43625-8.
- 21. Sakamoto Y, Nakai S, Moriya S, Yoshikawa H, Ogasawara N. 1995. The Bacillus subtilis dnaC gene encodes a protein homologous to the DnaB helicase of Escherichia coli. Microbiology 141:641–644. doi:10.1099/13500872-141-3-641.
- 22. Arias-Palomo E, Puri N, O’Shea Murray VL, Yan Q, Berger JM. 2019. Physical basis for the loading of a bacterial replicative helicase onto DNA. Mol Cell 74:173–184.e4. doi:10.1016/j.molcel.2019.01.023.
- 23. Liu B, Eliason WK, Steitz TA. 2013. Structure of a helicase–helicase loader complex reveals insights into the mechanism of bacterial primosome assembly. Nat Commun 4:2495. doi:10.1038/ncomms3495.
- 24. Bruand C, Farache M, McGovern S, Ehrlich SD, Polard P. 2001. DnaB, DnaD and DnaI proteins are components of the Bacillus subtilis replication restart primosome. Mol Microbiol 42:245–255. doi:10.1046/j.1365-2958.2001.02631.x.
- 25. Kobori JA, Kornberg A. 1982. The Escherichia coli dnaC gene product. II. Purification, physical properties, and role in replication. J Biol Chem 257:13763–13769. doi:10.1016/S0021-9258(18)33514-2.
- 26. Koonin EV. 1992. DnaC protein contains a modified ATP-binding motif and belongs to a novel family of ATPases including also DnaA. Nucleic Acids Res 20:1997. doi:10.1093/nar/20.8.1997.
- 27. Wahle E, Lasken RS, Kornberg A. 1989. The DnaB-DnaC replication protein complex of Escherichia coli. II. Role of the complex in mobilizing DnaB functions. J Biol Chem 264:2469–2475. doi:10.1016/S0021-9258(19)81637-X.
- 28. Brézellec P, Vallet-Gely I, Possoz C, Quevillon-Cheruel S, Ferat J-L. 2016. DciA is an ancestral replicative helicase operator essential for bacterial replication initiation. Nat Commun 7:13271. doi:10.1038/ncomms13271.
- 29. Chase J, Berger J, Jeruzalmi D. 2022. Convergent evolution in two bacterial replicative helicase loaders. Trends Biochem Sci 47:620–630. doi:10.1016/j.tibs.2022.02.005.
- 30. Rokop ME, Auchtung JM, Grossman AD. 2004. Control of DNA replication initiation by recruitment of an essential initiation protein to the membrane of Bacillus subtilis. Mol Microbiol 52:1757–1767. doi:10.1111/j.1365-2958.2004.04091.x.
- 31. Weigel C, Seitz H. 2006. Bacteriophage replication modules. FEMS Microbiol Rev 30:321–381. doi:10.1111/j.1574-6976.2006.00015.x.
- 32. Davey MJ, Fang L, McInerney P, Georgescu RE, O’Donnell M. 2002. The DnaC helicase loader is a dual ATP/ADP switch protein. EMBO J 21:3148–3159. doi:10.1093/emboj/cdf308.
- 33. Ioannou C, Schaeffer PM, Dixon NE, Soultanas P. 2006. Helicase binding to DnaI exposes a cryptic DNA-binding site during helicase loading in Bacillus subtilis. Nucleic Acids Res 34:5247–5258. doi:10.1093/nar/gkl690.
- 34. Mann KM, Huang DL, Hooppaw AJ, Logsdon MM, Richardson K, Lee HJ, Kimmey JM, Aldridge BB, Stallings CL. 2017. Rv0004 is a new essential member of the mycobacterial DNA replication machinery. PLoS Genet 13:e1007115. doi:10.1371/journal.pgen.1007115.
- 35. Chan-Yao-Chong M, Marsin S, Quevillon-Cheruel S, Durand D, Ha-Duong T. 2020. Structural ensemble and biological activity of DciA intrinsically disordered region. J Struct Biol 212:107573. doi:10.1016/j.jsb.2020.107573.
- 36. Marsin S, Adam Y, Cargemel C, Andreani J, Baconnais S, Legrand P, Li de la Sierra-Gallay I, Humbert A, Aumont-Nicaise M, Velours C, Ochsenbein F, Durand D, Le Cam E, Walbott H, Possoz C, Quevillon-Cheruel S, Ferat J-L. 2021. Study of the DnaB:DciA interplay reveals insights into the primary mode of loading of the bacterial replicative helicase. Nucleic Acids Res 49:6569–6586. doi:10.1093/nar/gkab463.
- 37. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. 2021. Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589. doi:10.1038/s41586-021-03819-2.
- 38. Blum M, Chang H-Y, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, Richardson L, Salazar GA, Williams L, Bork P, Bridge A, Gough J, Haft DH, Letunic I, Marchler-Bauer A, Mi H, Natale DA, Necci M, Orengo CA, Pandurangan AP, Rivoire C, Sigrist CJA, Sillitoe I, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Bateman A, Finn RD. 2021. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res 49:D344–D354. doi:10.1093/nar/gkaa977.
- 39. Burke JT, Chen SZ, Sosinski LM, Johnston JB, Ravi J. 2022. MolEvolvR: a web-app for characterizing proteins using molecular evolution and phylogeny. bioRxiv doi:10.1101/2022.02.18.461833.
- 40. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. doi:10.1016/S0022-2836(05)80360-2.
- 41. Sun Y, Deng T, Zhang A, Moore MJ, Landis JB, Lin N, Zhang H, Zhang X, Huang J, Zhang X, Sun H, Wang H. 2020. Genome sequencing of the endangered Kingdonia uniflora (Circaeasteraceae, Ranunculales) reveals potential mechanisms of evolutionary specialization. iScience 23:101124. doi:10.1016/j.isci.2020.101124.
- 42. Arnesano F, Banci L, Bertini I, Ciofi-Baffoni S, Molteni E, Huffman DL, O’Halloran TV. 2002. Metallochaperones and metal-transporting ATPases: a comparative analysis of sequences and structures. Genome Res 12:255–271. doi:10.1101/gr.196802.
- 43. Dykema PE, Sipes PR, Marie A, Biermann BJ, Crowell DN, Randall SK. 1999. A new class of proteins capable of binding transition metals. Plant Mol Biol 41:139–150. doi:10.1023/a:1006367609556.
- 44. Sun X-H, Yu G, Li J-T, Jia P, Zhang J-C, Jia C-G, Zhang Y-H, Pan H-Y. 2014. A heavy metal-associated protein (AcHMA1) from the halophyte, Atriplex canescens (Pursh) Nutt., confers tolerance to iron and other abiotic stresses when expressed in Saccharomyces cerevisiae. Int J Mol Sci 15:14891–14906. doi:10.3390/ijms150814891.
- 45. Morales ME, Derbes RS, Ade CM, Ortego JC, Stark J, Deininger PL, Roy-Engel AM. 2016. Heavy metal exposure influences double strand break DNA repair outcomes. PLoS One 11:e0151367. doi:10.1371/journal.pone.0151367.
- 46. Stohs SJ, Bagchi D. 1995. Oxidative mechanisms in the toxicity of metal ions. Free Radic Biol Med 18:321–336. doi:10.1016/0891-5849(94)00159-h.
- 47. Zhou S, Wei C, Liao C, Wu H. 2008. Damage to DNA of effective microorganisms by heavy metals: impact on wastewater treatment. J Environ Sci 20:1514–1518. doi:10.1016/S1001-0742(08)62558-9.
- 48. Blackwell GA, Hunt M, Malone KM, Lima L, Horesh G, Alako BTF, Thomson NR, Iqbal Z. 2021. Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences. PLoS Biol 19:e3001421. doi:10.1371/journal.pbio.3001421.
- 49. Land M, Hauser L, Jun S-R, Nookaew I, Leuze MR, Ahn T-H, Karpinets T, Lund O, Kora G, Wassenaar T, Poudel S, Ussery DW. 2015. Insights from 20 years of bacterial genome sequencing. Funct Integr Genomics 15:141–161. doi:10.1007/s10142-015-0433-4.
- 50. Tatusova T, Ciufo S, Fedorov B, O’Neill K, Tolstoy I. 2014. RefSeq microbial genomes database: new representation and annotation strategy. Nucleic Acids Res 42:D553–D559. doi:10.1093/nar/gkt1274.
- 51. Atkinson HJ, Babbitt PC. 2009. An atlas of the thioredoxin fold class reveals the complexity of function-enabling adaptations. PLoS Comput Biol 5:e1000541. doi:10.1371/journal.pcbi.1000541.
- 52.Lee S, Kim SM, Lee RT. 2013. Thioredoxin and thioredoxin target proteins: from molecular mechanisms to functional significance. Antioxid Redox Signal 18:1165–1207. 10.1089/ars.2011.4322. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Lu J, Holmgren A. 2014. The thioredoxin superfamily in oxidative protein folding. Antioxid Redox Signal 21:457–470. 10.1089/ars.2014.5849. [DOI] [PubMed] [Google Scholar]
- 54.Goemans CV, Beaufay F, Wahni K, Van Molle I, Messens J, Collet J-F. 2018. An essential thioredoxin is involved in the control of the cell cycle in the bacterium Caulobacter crescentus. J Biol Chem 293:3839–3848. 10.1074/jbc.RA117.001042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Muley VY, Akhter Y, Galande S. 2019. PDZ domains across the microbial world: molecular link to the proteases, stress response, and protein synthesis. Genome Biol Evol 11:644–659. 10.1093/gbe/evz023. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Nourry C, Grant SGN, Borg J-P. 2003. PDZ domain proteins: plug and play! Sci STKE 2003:RE7. 10.1126/scisignal.1792003re7. [DOI] [PubMed] [Google Scholar]
- 57.Kojima Y, Machida Y, Palani S, Caulfield TR, Radisky ES, Kaufmann SH, Machida YJ. 2020. FAM111A protects replication forks from protein obstacles via its trypsin-like domain. Nat Commun 11:1318. 10.1038/s41467-020-15170-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Burroughs AM, Zhang D, Schäffer DE, Iyer LM, Aravind L. 2015. Comparative genomic analyses reveal a vast, novel network of nucleotide-centric systems in biological conflicts, immunity and signaling. Nucleic Acids Res 43:10633–10654. 10.1093/nar/gkv1267. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Cordin O, Banroques J, Tanner NK, Linder P. 2006. The DEAD-box protein family of RNA helicases. Gene 367:17–37. 10.1016/j.gene.2005.10.019. [DOI] [PubMed] [Google Scholar]
- 60.Wyszomirski KH, Curth U, Alves J, Mackeldanz P, Möncke-Buchner E, Schutkowski M, Krüger DH, Reuter M. 2012. Type III restriction endonuclease EcoP15I is a heterotrimeric complex containing one Res subunit with several DNA-binding regions and ATPase activity. Nucleic Acids Res 40:3610–3622. 10.1093/nar/gkr1239. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Britton RA, Lin DC-H, Grossman AD. 1998. Characterization of a prokaryotic SMC protein involved in chromosome partitioning. Genes Dev 12:1254–1259. 10.1101/gad.12.9.1254. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Hirano T. 2005. SMC proteins and chromosome mechanics: from bacteria to humans. Philos Trans R Soc Lond B Biol Sci 360:507–514. 10.1098/rstb.2004.1606. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Pellegrino S, Radzimanowski J, de Sanctis D, Boeri Erba E, McSweeney S, Timmins J. 2012. Structural and functional characterization of an SMC-like protein RecN: new insights into double-strand break repair. Structure 20:2076–2089. 10.1016/j.str.2012.09.010. [DOI] [PubMed] [Google Scholar]
- 64.Strunnikov AV, Jessberger R. 1999. Structural maintenance of chromosomes (SMC) proteins. Eur J Biochem 263:6–13. 10.1046/j.1432-1327.1999.00509.x. [DOI] [PubMed] [Google Scholar]
- 65.Strunnikov AV. 2006. SMC complexes in bacterial chromosome condensation and segregation. Plasmid 55:135–144. 10.1016/j.plasmid.2005.08.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Kelch BA, Makino DL, O’Donnell M, Kuriyan J. 2012. Clamp loader ATPases and the evolution of DNA replication machinery. BMC Biol 10:34. 10.1186/1741-7007-10-34. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.O’Donnell M, Jeruzalmi D, Kuriyan J. 2001. Clamp loader structure predicts the architecture of DNA polymerase III holoenzyme and RFC. Curr Biol 11:R935–R946. 10.1016/S0960-9822(01)00559-0. [DOI] [PubMed] [Google Scholar]
- 68.Zeytuni N, Zarivach R. 2012. Structural and functional discussion of the tetra-trico-peptide repeat, a protein interaction module. Structure 20:397–405. 10.1016/j.str.2012.01.006. [DOI] [PubMed] [Google Scholar]
- 69.Morohashi H, Maculins T, Labib K. 2009. The amino-terminal TPR domain of Dia2 tethers SCFDia2 to the replisome progression complex. Curr Biol 19:1943–1949. 10.1016/j.cub.2009.09.062. [DOI] [PubMed] [Google Scholar]
- 70.Bang S, Min C-K, Ha N-Y, Choi M-S, Kim I-S, Kim Y-S, Cho N-H. 2016. Inhibition of eukaryotic translation by tetratricopeptide-repeat proteins of Orientia tsutsugamushi. J Microbiol 54:136–144. 10.1007/s12275-016-5599-5. [DOI] [PubMed] [Google Scholar]
- 71.Alfieri C, Zhang S, Barford D. 2017. Visualizing the complex functions and mechanisms of the anaphase promoting complex/cyclosome (APC/C). Open Biol 7:170204. 10.1098/rsob.170204. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Sudakin V, Ganoth D, Dahan A, Heller H, Hershko J, Luca FC, Ruderman JV, Hershko A. 1995. The cyclosome, a large complex containing cyclin-selective ubiquitin ligase activity, targets cyclins for destruction at the end of mitosis. Mol Biol Cell 6:185–197. 10.1091/mbc.6.2.185. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Cozzarelli NR. 1980. DNA gyrase and the supercoiling of DNA. Science 207:953–960. 10.1126/science.6243420. [DOI] [PubMed] [Google Scholar]
- 74.Antony E, Lohman TM. 2019. Dynamics of E. coli single stranded DNA binding (SSB) protein-DNA complexes. Semin Cell Dev Biol 86:102–111. 10.1016/j.semcdb.2018.03.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Bianco PR, Pottinger S, Tan HY, Nguyenduc T, Rex K, Varshney U. 2017. The IDL of E. coli SSB links ssDNA and protein binding by mediating protein-protein interactions. Protein Sci 26:227–241. 10.1002/pro.3072. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Kozlov AG, Weiland E, Mittal A, Waldman V, Antony E, Fazio N, Pappu RV, Lohman TM. 2015. Intrinsically disordered C-terminal tails of E. coli single-stranded DNA binding protein regulate cooperative binding to single-stranded DNA. J Mol Biol 427:763–774. 10.1016/j.jmb.2014.12.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Tan HY, Wilczek LA, Pottinger S, Manosas M, Yu C, Nguyenduc T, Bianco PR. 2017. The intrinsically disordered linker of E. coli SSB is critical for the release from single-stranded DNA. Protein Sci 26:700–717. 10.1002/pro.3115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Guy CP, Atkinson J, Gupta MK, Mahdi AA, Gwynn EJ, Rudolph CJ, Moon PB, van Knippenberg IC, Cadman CJ, Dillingham MS, Lloyd RG, McGlynn P. 2009. Rep provides a second motor at the replisome to promote duplication of protein-bound DNA. Mol Cell 36:654–666. 10.1016/j.molcel.2009.11.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Nguyen B, Shinn MK, Weiland E, Lohman TM. 2021. Regulation of E. coli Rep helicase activity by PriC. J Mol Biol 433:167072. 10.1016/j.jmb.2021.167072. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Necci M, Piovesan D, Dosztányi Z, Tosatto SCE. 2017. MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins. Bioinformatics 33:1402–1404. 10.1093/bioinformatics/btx015. [DOI] [PubMed] [Google Scholar]
- 81.Jameson KH, Rostami N, Fogg MJ, Turkenburg JP, Grahl A, Murray H, Wilkinson AJ. 2014. Structure and interactions of the Bacillus subtilis sporulation inhibitor of DNA replication, SirA, with domain I of DnaA. Mol Microbiol 93:975–991. 10.1111/mmi.12713. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Abe Y, Jo T, Matsuda Y, Matsunaga C, Katayama T, Ueda T. 2007. Structure and function of DnaA N-terminal domains: specific sites and mechanisms in inter-DnaA interaction and in DnaB helicase loading on oriC. J Biol Chem 282:17816–17827. 10.1074/jbc.M701841200. [DOI] [PubMed] [Google Scholar]
- 83.Sutton MD, Carr KM, Vicente M, Kaguni JM. 1998. Escherichia coli DnaA protein. The N-terminal domain and loading of DnaB helicase at the E. coli chromosomal origin. J Biol Chem 273:34255–34262. 10.1074/jbc.273.51.34255. [DOI] [PubMed] [Google Scholar]
- 84.Keyamura K, Abe Y, Higashi M, Ueda T, Katayama T. 2009. DiaA dynamics are coupled with changes in initial origin complexes leading to helicase loading. J Biol Chem 284:25038–25050. 10.1074/jbc.M109.002717. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, Quinn AF, Sangrador-Vegas A, Scheremetjew M, Yong SY, Lopez R, Hunter S. 2014. InterProScan 5: genome-scale protein function classification. Bioinformatics 30:1236–1240. 10.1093/bioinformatics/btu031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.O’Leary NA, Wright MW, Brister JR, Ciufo S, Haddad D, McVeigh R, Rajput B, Robbertse B, Smith-White B, Ako-Adjei D, Astashyn A, Badretdin A, Bao Y, Blinkova O, Brover V, Chetvernin V, Choi J, Cox E, Ermolaeva O, Farrell CM, Goldfarb T, Gupta T, Haft D, Hatcher E, Hlavina W, Joardar VS, Kodali VK, Li W, Maglott D, Masterson P, McGarvey KM, Murphy MR, O’Neill K, Pujar S, Rangwala SH, Rausch D, Riddick LD, Schoch C, Shkeda A, Storz SS, Sun H, Thibaud-Nissen F, Tolstoy I, Tully RE, Vatsan AR, Wallin C, Webb D, Wu W, Landrum MJ, Kimchi A, et al. 2016. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res 44:D733–D745. 10.1093/nar/gkv1189. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Lassmann T. 2019. Kalign 3: multiple sequence alignment of large datasets. Bioinformatics 36:1928–1929. 10.1093/bioinformatics/btz795. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Price MN, Dehal PS, Arkin AP. 2009. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 26:1641–1650. 10.1093/molbev/msp077. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Varadi M, Anyango S, Deshpande M, Nair S, Natassia C, Yordanova G, Yuan D, Stroe O, Wood G, Laydon A, Žídek A, Green T, Tunyasuvunakool K, Petersen S, Jumper J, Clancy E, Green R, Vora A, Lutfi M, Figurnov M, Cowie A, Hobbs N, Kohli P, Kleywegt G, Birney E, Hassabis D, Velankar S. 2022. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res 50:D439–D444. 10.1093/nar/gkab1061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Pettersen EF, Goddard TD, Huang CC, Meng EC, Couch GS, Croll TI, Morris JH, Ferrin TE. 2021. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci 30:70–82. 10.1002/pro.3943. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Table S1. jb.00163-22-s0001.xlsx (XLSX file, 3.8 MB)
Table S2. jb.00163-22-s0002.xlsx (XLSX file, 0.01 MB)
Supplemental material. jb.00163-22-s0003.pdf (PDF file, 0.5 MB)
Data Availability Statement
All data, analyses, and visualizations summarizing the DciA homologs across the bacterial kingdom, along with their domain architectures and phyletic spreads, are available at https://github.com/JRaviLab/dcia_evolution. Detailed legends for the data tables, the structure predictions (PDB format, under “model_structures”), and the sequences used for tree generation are also available in the GitHub repository. All AlphaFold structures have been deposited in ModelArchive (http://modelarchive.org/) and are available under the following accession codes: ma-q8sq3 (K. uniflora), ma-vyetl (H. bicolor E), ma-z6rsv (B. fragilis), ma-1hvpi (V. cholerae), ma-tk4v8 (R. islandica), ma-9cnra (A. aerolatus), ma-v2jc1 (O. acuminata), ma-ibgex (P. aeruginosa), ma-2ovw9 (M. tuberculosis), ma-n7tbg (L. interrogans), ma-02qnb (T. maritima), and ma-3ee0e (R. conorii).






