Molecular Biology and Evolution. 2024 Mar 25;41(4):msae067. doi: 10.1093/molbev/msae067

Emergence of an Orphan Nitrogenase Protein Following Atmospheric Oxygenation

Bruno Cuevas-Zuviría 1,#,a, Amanda K Garcia 2,#, Alex J Rivier 3, Holly R Rucker 4, Brooke M Carruthers 5, Betül Kaçar 6
Editor: Fabia Ursula Battistuzzi
PMCID: PMC11018506  PMID: 38526235

Abstract

Molecular innovations within key metabolisms can have profound impacts on element cycling and ecological distribution. Yet, much of the molecular foundations of early evolved enzymes and metabolisms are unknown. Here, we bring one such mystery to relief by probing the birth and evolution of the G-subunit protein, an integral component of certain members of the nitrogenase family, the only enzymes capable of biological nitrogen fixation. The G-subunit is a Paleoproterozoic-age orphan protein that appears more than 1 billion years after the origin of nitrogenases. We show that the G-subunit arose with novel nitrogenase metal dependence and the ecological expansion of nitrogen-fixing microbes following the transition in environmental metal availabilities and atmospheric oxygenation that began ∼2.5 billion years ago. We identify molecular features that suggest early G-subunit proteins mediated cofactor or protein interactions required for novel metal dependency, priming ancient nitrogenases and their hosts to exploit these newly diversified geochemical environments. We further examined the degree of functional specialization in G-subunit evolution with extant and ancestral homologs using laboratory reconstruction experiments. Our results indicate that permanent recruitment of the orphan protein depended on the prior establishment of conserved molecular features and showcase how contingent evolutionary novelties might shape ecologically important microbial innovations.

Keywords: nitrogenase, early life and evolution, nitrogen fixation, orphan genes, ancestral sequence reconstruction, planetary biology

Introduction

Over billions of years, life generated an enormous wealth of biomolecular diversity, producing an estimated 10^5 extant protein families (Choi and Kim 2006) and a much larger multitude of extinct proteins. Through this process, biology became a defining component of the Earth system, which experienced tremendous revolutions in climate and biogeochemistry that are largely attributable to the molecular innovations of life itself (Knoll 2003). A unified understanding of the Earth-life system requires investigating how the proteins and biogeochemical processes that power planetary phenomena emerged, evolved, and proliferated.

Tracking molecular novelties in the histories of biogeochemically critical enzymes is rarely attempted and represents a considerable challenge given the antiquity of relevant protein targets. Many of these enzymes, including those required for essential elemental cycling, likely evolved more than 3 billion years ago (Knoll 2003; Falkowski et al. 2008; Moore et al. 2017). Certain novel functions have arisen through the progressive duplication, accretion, and recombination of preexisting protein elements (Eck and Dayhoff 1966; Alva et al. 2015; Andersson et al. 2015; Copley 2020). In rarer cases, complete novelty has emerged through the de novo birth of genes (Andersson et al. 2015; Van Oss and Carvunis 2019). This scenario represents a more difficult evolutionary hurdle due to the improbability that formerly noncoding DNA would produce a structurally or functionally coherent peptide (Beasley and Hecht 1997). Importantly, de novo events are so far undocumented in critical, early evolved microbial metabolisms.

Here, we identify a unique case of molecular novelty in the evolutionary history of a critical enzyme enabling biological nitrogen fixation, nitrogenase. Nitrogenase metalloenzymes provide the sole biochemical gateway for essential nitrogen into Earth's biosphere by catalyzing the reduction, or “fixation,” of relatively inert, atmospheric nitrogen to biologically usable ammonia. Over their ∼3-billion-year history (Stueken et al. 2015; Parsons et al. 2021), these enzymes consequently transformed the planet (Falkowski 1997; Falkowski et al. 2008; Sanchez-Baracaldo et al. 2014; Allen et al. 2019; Rucker and Kacar 2023). The nitrogenase family today includes three homologous isozymes: molybdenum- (Nif), vanadium- (Vnf), and iron-only (Anf) nitrogenases, the latter two of which evolved more recently from Nif (Garcia et al. 2020). The isozymes are named for the differing compositions of their active-site metal cofactors but otherwise share a core oxidoreductase architecture (Eady 1996; Mus et al. 2018; Einsle and Rees 2020). The antiquity and biological significance of this architecture are highlighted by its conservation in distantly related enzymes within the nitrogenase superfamily (Ghebreamlak and Mansoorabadi 2020). Although nitrogenase itself likely evolved more recently (Boyd and Peters 2013), nitrogenase-like enzymes are proposed to date back to the last universal common ancestor (Weiss et al. 2016; Ghebreamlak and Mansoorabadi 2020).

A striking structural distinction in all experimentally studied Vnf and Anf nitrogenase complexes is a protein called the “G-subunit” (Eady et al. 1987; Chatterjee et al. 1997; Sippel and Einsle 2017; Pence et al. 2021). The G-subunit does not exist in any known Nif nitrogenases, and its functional role in Vnf and Anf nitrogenases remains enigmatic. Prior experimental investigations of the G-subunit in the model nitrogen-fixing bacterium Azotobacter vinelandii indicate that it is essential for Vnf and Anf nitrogenase assembly and activity (Waugh et al. 1995; Chatterjee et al. 1996, 1997). More recent structural analyses of specific G-subunit homologs have proposed roles related to electron transfer, protein interaction mediation, and cofactor stabilization (Pence et al. 2021; Schmidt et al. 2024). It is not known whether any of these possible roles were beneficial early in the evolutionary history of the G-subunit and whether they are today conserved across the broad diversity of Vnf and Anf enzymes. Importantly, the originating circumstances of the G-subunit and this protein's impact on the evolutionary trajectory of biological nitrogen fixation have not yet been explored.

The adaptive landscape for nitrogen fixation shifted considerably with Earth’s surface oxygenation ∼2.5 billion years ago (Lyons et al. 2014), which resulted in changes to the environmental availabilities of the redox-sensitive metals used by nitrogenases (Anbar and Knoll 2002; Anbar 2008; Robbins et al. 2016). Thus, it has been suggested that ancient transitions in environmental geochemistry were important factors in the diversification of nitrogenase metal dependence across Nif, Vnf, and Anf isozymes (Anbar and Knoll 2002; Glass et al. 2009; Boyd, Anbar, et al. 2011; Garcia et al. 2020). This interplay between environmental change and molecular innovations has been discussed for metalloenzymes more broadly (Williams 1997; Dupont et al. 2010; Moore et al. 2017; Kacar et al. 2021). What is not known is how certain molecular innovations like the nitrogenase G-subunit—as well as the evolutionary mechanisms by which they arise—regulate the ability of metabolic processes to both adapt to and drive planetary change.

Here, we trace the evolutionary history of the nitrogenase G-subunit and the interplay between molecular evolutionary events and environmental triggers in the ecological diversification of biological nitrogen fixation. We infer the ancestral sequences and structures of nitrogenase enzymes through the emergence of the G-subunit, replaying the sequence of events that gave rise to this nitrogenase protein component >1.5 billion years ago. We combine phylogenetics, reconstructions of ancient protein sequences and structures, and genetic experiments to interrogate G-subunit protein emergence, recruitment, and functional significance.

Results

The Nitrogenase G-Subunit Is an Ancient Orphan Protein

The extant G-subunit protein is a small structural element of the nitrogenase enzyme complex that otherwise contains two major components. The first component is an H-subunit homodimer that delivers electrons to the second, catalytic component, which, in Nif, is a D- and K-subunit heterotetramer. In Vnf/Anf (collectively referred to as “alternative nitrogenases”), however, the second component is a D-, G-, and K-subunit heterohexamer (Sippel and Einsle 2017; Trncik et al. 2023) (Fig. 1a). Thus, the G-subunit is a primary distinction in the otherwise structurally comparable Nif, Vnf, and Anf complexes. Available crystallographic and cryo-EM structures of VnfG (Sippel and Einsle 2017) and AnfG (Trncik et al. 2023; Schmidt et al. 2024) proteins show that they are relatively short (∼115 amino acids), four-helix proteins (Fig. 1b). In both Vnf and Anf, the G-subunit neighbors the D-subunit that houses the active-site cofactor where N2 is bound and reduced (Hoffman et al. 2014). The metal content of the cofactor differs between Nif, Vnf, and Anf nitrogenase isozymes, incorporating Fe and Mo (“FeMo-co”), Fe and V (“FeV-co”), or only Fe (“FeFe-co”), respectively. Finally, although sharing a common ancestry (Raymond et al. 2004; Boyd and Peters 2013; Garcia et al. 2020) and core N2-reduction mechanism (Harris et al. 2019), Nif, Vnf, and Anf nitrogenases vary in their kinetic behavior and reactivities to different substrates (Eady 1996; Hu et al. 2011; Zheng et al. 2018; Harris et al. 2019).

Fig. 1.


Structure and genetics of the nitrogenase G-subunit. a) Left, crystallographic structure of the extant A. vinelandii Vnf nitrogenase complex (VnfH, PDB 6Q93 [Rohde et al. 2018]; VnfDGK, PDB 5N6Y [Sippel and Einsle 2017]) containing the G-subunit (“G”). The AnfDGK crystallographic structure (PDB 8BOQ [Trncik et al. 2023], not shown) exhibits a similar heterohexameric configuration to VnfDGK but binds FeFe-co instead of FeV-co. Right, crystallographic structure of the extant Nif nitrogenase complex (NifH, PDB 1M34 [Schmid, Einsle, et al. 2002]; NifDK, PDB 3U7Q [Spatzal et al. 2011]) lacking the G-subunit. b) Secondary structure of the Vnf G-subunit. c) Arrangement of Nif, Vnf, and Anf structural genes in A. vinelandii. Vnf/AnfG is located between Vnf/AnfD and Vnf/AnfK genes.

We began our evolutionary study of the nitrogenase G-subunit by constraining its place within the broader diversity of protein sequence and structure. Specifically, we sought to identify distant homologs that might provide insights into G-subunit ancestry and function. We implemented a homology search workflow that included both sequence-based (BLASTP, TBLASTN [Altschul et al. 1990], PSI-BLAST [Altschul et al. 1997], and HMMER [Finn et al. 2011]) and structure-based (FoldSeek; van Kempen et al. 2024) methods appropriate for detection of distantly related proteins.

Our search returned 412 canonical G-subunit proteins, but we did not find convincing similarity to any other type of protein. We defined canonical G-subunit proteins as those that were both homologous to our search queries and encoded by genes within vnf/anf structural gene operons. Any other results obtained by our relatively permissive search parameters (see Materials and Methods) were considered “distant” hits. However, all distant hits had very low statistical significance and were therefore likely to be spurious. E-values for nearly all were larger than 0.01 (compared with maximum E-values for canonical G-subunit sequences of ∼1e−30), and many were only partially aligned to our search queries (Table 1; supplementary fig. S1, Supplementary Material online). One group of hits from our HMMER search had E-values of ∼4e−9 but belonged to an invertebrate metazoan and thus would not plausibly be the closest homologs of the exclusively prokaryote-hosted G-subunit. Distant hits from the structure-based FoldSeek search exhibit a general four-helix arrangement like the G-subunit but again have very high E-values (>5) (supplementary figs. S2 and S3, supplementary table S4, Supplementary Material online). All are ambiguously annotated and predominantly neighbor genes related to DNA binding, modification, and repair, which does not suggest any clear connection to nitrogenase function (supplementary fig. S4, Supplementary Material online). Finally, we note that our search did not return any hits to NafY (cofactor chaperone) and CowN (CO-protection protein), both of which have previously been proposed to resemble the G-subunit in their interactions with the nitrogenase complex (Rubio and Ludden 2005; Sippel and Einsle 2017; Medina et al. 2021).
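The hit-sorting logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the workflow used in the study: the `Hit` record, the `in_vnf_anf_operon` flag, the "weak" label, and the example entries are hypothetical. Only the definition of canonical hits (homologous and encoded within vnf/anf operons) and the 0.01 E-value comparison are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    evalue: float
    in_vnf_anf_operon: bool  # hypothetical flag from inspecting genomic context


def classify_hit(hit: Hit, weak_evalue: float = 0.01) -> str:
    """Label a homology-search hit as canonical, distant, or distant (weak)."""
    if hit.in_vnf_anf_operon:
        return "canonical"
    # Any hit outside a vnf/anf operon is "distant"; flag those with E > 0.01.
    return "distant (weak)" if hit.evalue > weak_evalue else "distant"


# Hypothetical example hits; E-values echo magnitudes quoted in the text.
hits = [
    Hit("example_VnfG", 1e-30, True),
    Hit("example_uncharacterized", 0.026, False),
    Hit("example_metazoan", 4.0e-9, False),
]
labels = {h.name: classify_hit(h) for h in hits}
```

A hit such as the hypothetical `example_metazoan` passes the E-value comparison but is still only "distant"; as in the text, taxonomy and genomic context would then need to be weighed separately.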

Table 1.

Summary of G-subunit homolog search results

| Method | Query | Database | Total hits (including canonical G-subunit proteins) | Distant hits | E-value range of distant hits | TM-score^a of distant hits | Annotation of distant hits | Taxonomy of distant hits |
|---|---|---|---|---|---|---|---|---|
| BLASTP | A. vinelandii VnfG | NCBI nonredundant proteins (nr) | 433 | 0 | n/a | n/a | n/a | n/a |
| TBLASTN | A. vinelandii VnfG | NCBI nucleotide collection (nr/nt) | 205 | 0 | n/a | n/a | n/a | n/a |
| PSI-BLAST | A. vinelandii VnfG | NCBI nonredundant proteins (nr) | 446 | 7 | 0.026 to 0.064 | n/a | TIGR04141 family sporadically distributed protein; Uncharacterized protein | Pseudomonas spp. |
| PSI-BLAST | VnfAnfAnc G | NCBI nonredundant proteins (nr) | 543 | 0 | n/a | n/a | n/a | n/a |
| HMMER hmmsearch | Extant G-subunit alignment | UniProt Reference Proteomes | 94 | 1 | 0.061 | n/a | Uncharacterized protein | Coregonus sp. |
| HMMER hmmsearch | Extant & ancestral G-subunit alignment | UniProt Reference Proteomes | 96 | 3 | 4.0e−09 to 0.01 | n/a | Uncharacterized protein | Brugia malayi, Loa loa |
| FoldSeek | A. vinelandii VnfG (PDB 5N6Y) | Protein Data Bank | 3 | 0 | n/a | n/a | n/a | n/a |
| FoldSeek | A. vinelandii VnfG (PDB 5N6Y) | AlphaFold-UniProt/Swiss-Prot | 39 | 2 | 5.2 to 8.7 | 0.52 to 0.54 | Uncharacterized protein | Vibrio vulnificus, Deinococcus grandis |

^a Template modeling score for structural alignments.

If the G-subunit shares an ancient ancestry with another extant protein family, no strong vestige of this evolutionary relationship appears presently detectable. The absence of distant homologs to the G-subunit family is comparable with the case of “orphan genes,” which are individual genes that either have no resemblance to other sequences or have few, taxonomically restricted relatives (Tautz and Domazet-Loso 2011; Van Oss and Carvunis 2019). Proposed mechanisms for the emergence of orphan genes include duplication followed by rapid divergence and de novo birth from previously nongenic DNA (Andersson et al. 2015; Van Oss and Carvunis 2019). The latter has more recently been recognized as a significant source of orphan genes (Heames et al. 2020; Vakirlis, Acar, et al. 2020; Vakirlis, Carvunis, and McLysaght 2020) and thus would plausibly explain G-subunit origins. Regardless of the specific originating mechanism, our results indicate that the orphan G-subunit protein is a molecular innovation as unique to nitrogenases as it is to broader protein diversity.

Evolution of the G-subunit Was Associated with the Diversification of Nitrogenase Metal Usage and Host Microbial Ecology

We reconstructed phylogenies of nitrogenase G-subunit proteins and examined their genetic distribution across extant genomes and microbial host environments. We identified G-subunit homologs exclusively from vnf and anf gene clusters in bacterial and archaeal genomes (similarly, all identified vnf/anf clusters include the G-subunit), confirming that this protein is a significant molecular distinction between Vnf/Anf and Nif nitrogenases (Dos Santos et al. 2012). In all analyzed clusters, vnf/anfG genes are located between vnf/anfD and vnf/anfK loci (Fig. 1c). By contrast, nifD and nifK genes in our data set are universally contiguous with small intergenic distances (median ≈ 25 bp) (supplementary fig. S5, Supplementary Material online). Finally, all complete genomes harboring vnf and/or anf nitrogenase genes in our data set also possess nif genes. Therefore, the presence of the G-subunit is universally associated with the capacity of a host nitrogen fixer, or diazotroph, to express at least two different nitrogenase isozymes with distinct metal dependencies. We built two maximum-likelihood G-subunit phylogenies: the first from individual G-subunit protein sequences and the second from G-subunit sequences concatenated with other nitrogenase subunit sequences (H-, D-, and K-subunit sequences; see Materials and Methods). Both trees show separate clustering of Vnf and Anf sequences, although deep branch support in the concatenated tree is significantly improved (Fig. 2; supplementary figs. S6 and S7, Supplementary Material online).
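The two gene-arrangement checks described above (G located between D and K, and the intergenic distance between contiguous nifD and nifK genes) can be illustrated as follows. This is a sketch only: the tuple layout, the function names, and all coordinates are hypothetical, and coordinates are assumed to be 1-based and inclusive on a single strand.

```python
from statistics import median

# Each gene is a hypothetical (name, start, end) tuple, 1-based inclusive.


def g_between_d_and_k(cluster):
    """Return True if the G gene lies between the D and K genes."""
    order = [name for name, _, _ in sorted(cluster, key=lambda g: g[1])]
    d, g, k = (order.index(x) for x in ("D", "G", "K"))
    return min(d, k) < g < max(d, k)


def intergenic_distance(upstream, downstream):
    """Number of base pairs separating two neighboring genes."""
    return downstream[1] - upstream[2] - 1


# Illustrative coordinates, not taken from any real genome.
vnf_cluster = [("D", 100, 1500), ("G", 1520, 1860), ("K", 1880, 3300)]
nif_pairs = [
    (("D", 100, 1580), ("K", 1600, 3170)),
    (("D", 200, 1670), ("K", 1696, 3260)),
    (("D", 50, 1530), ("K", 1571, 3100)),
]
distances = [intergenic_distance(d, k) for d, k in nif_pairs]
median_distance = median(distances)
```

With these made-up coordinates the three distances are 19, 25, and 40 bp, so `median_distance` is 25 bp; the value was chosen to echo the reported median and carries no independent meaning.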

Fig. 2.


Evolutionary and environmental history of the nitrogenases and the G-subunit. Maximum-likelihood phylogeny built from concatenated nitrogenase subunit proteins. Oxygen tolerance, predicted OGTs, and taxonomy of represented host microbes, as well as the metal content of the active-site metallocluster, are mapped to the phylogeny. Root position is approximated based on prior phylogenetic analyses (Garcia et al. 2020, 2022), which support rooting of the nitrogenase family between Vnf/Anf/Group III Nif and Group I/Group II Nif clades. Nif clades are labeled according to the nomenclature used by Raymond et al. (2004) (e.g. Nif-I = Group I Nif, etc.). The G-subunit origin event is indicated by a black hash mark. Branches that track the emergence of the G-subunit and its diversification into Vnf and Anf groups are shown in bold. Ancestral nodes (NifAnc, VnfAnfAnc, VnfAnc, and AnfAnc) targeted for sequence reconstruction and analysis are highlighted by bold, outlined circles and described in the table at the upper right. Homologs from select model organisms are labeled on the tree (avin, A. vinelandii; cpas, C. pasteurianum; koxy, Klebsiella oxytoca; mace, Methanosarcina acetivorans; rcap, Rhodobacter capsulatus; rpal, R. palustris).

The overall topology of our concatenated nitrogenase tree (Fig. 2) mirrors previous phylogenetic analyses of nitrogenase proteins reconstructed without G-subunit sequences. These prior analyses, which used an expanded outgroup to root the nitrogenase tree, nest the Vnf and Anf clades among Nif lineages, indicating that Vnf and Anf are more recently evolved (Boyd, Hamilton, and Peters 2011; Garcia et al. 2020). Given this root placement, the G-subunit of Vnf/Anf is established as the most recently evolved subunit in the nitrogenase complex. Minimum age estimates of the Vnf/Anf clade range from ∼1.5 to 2.5 Ga, based on the timing of horizontal gene transfer events that pervade nitrogenase evolutionary history (Parsons et al. 2021). These estimates indicate that the Paleoproterozoic origin of the G-subunit likely followed the initial accumulation of oxygen in the Earth’s surface environment (Lyons et al. 2014). The earliest whiffs of this oxygen have been identified from geochemical signatures in ∼2.5- to 3.0-Ga sediments (Anbar et al. 2007; Ostrander et al. 2021), and oxygen did not reach modern levels until ∼0.5 Ga (Cole et al. 2020).

Our analyses demonstrate that the presence of the G-subunit is associated both with diversity of nitrogenase metal dependence and host diazotroph ecology. Vnf and Anf sequences specifically branch from within the “Group III” nitrogenase lineage (Raymond et al. 2004), which is additionally populated by Nif homologs of either experimentally characterized or computationally predicted Mo dependence (Kessler et al. 1997; Garcia et al. 2020). Group III Nif is exclusively hosted by anaerobic taxa (primarily methanogenic Euryarchaeota and Firmicutes), many of which are probable thermophiles or hyperthermophiles (optimal growth temperatures [OGTs] were predicted from genome content; see Materials and Methods) (Fig. 2; supplementary table S5, Supplementary Material online). The first alternative nitrogenases were likely similarly hosted by anaerobic organisms (because Vnf and Anf homologs from aerobic organisms are each nested in anaerobe-associated lineages) but were later horizontally transferred to more taxonomically diverse and mesophilic bacteria and archaea. These early alternative nitrogenase hosts also likely harbored “Group I and II” Nif nitrogenases (Raymond et al. 2004), similar to the well-studied diazotrophic models like A. vinelandii, Clostridium pasteurianum, and Rhodopseudomonas palustris. Surprisingly, Vnf and Anf are not hosted by organisms that harbor Group III Nif, despite Vnf/Anf emerging from within the Group III lineage. Within Group III, Vnf and Anf are the only clades that contain sequences from aerobic taxa, including cyanobacteria that appear to only host Nif and Vnf nitrogenases. Together, these data indicate that certain G-subunit-containing alternative nitrogenases proliferated into different ecological niches than those that represent the earlier diverged Group III Nif.

D-Subunit Surface Residues and Properties Were Maintained through Birth of the G–D Interface

To replay the first evolutionary steps in the birth of the G-subunit, we reconstructed and analyzed ancestral nitrogenase protein sequences. These include the common ancestor of both Vnf and Anf (VnfAnfAnc) and a Nif ancestor immediately preceding VnfAnfAnc (NifAnc; we note that NifAnc does not represent the last common ancestor of all Nif nitrogenases) (Fig. 2). Because the G-subunit is exclusive to Vnf and Anf nitrogenases, the most parsimonious evolutionary scenario is that it emerged along the branch leading to VnfAnfAnc. We, therefore, compared sequences and structures of nitrogenase ancestors reconstructed both immediately before (NifAnc) and after (VnfAnfAnc) gain of the G-subunit. We examined the degree to which ancestral protein surface features were remodeled to accommodate a new subunit, hypothesizing that the emergence of the G-subunit triggered the evolution of complementary surface residues on the neighboring D-subunit protein.

We mapped the interface between ancestral G- and D-subunit proteins (“G–D interface”) by building its residue interaction network from an AlphaFold-predicted VnfAnfAnc structure (Fig. 3a to d). Nearly all well-conserved, G-subunit residues lie within the G–D interface and contribute significantly to the interface's stability, which we assessed by in silico alanine scanning (supplementary fig. S8, Supplementary Material online). The few exceptions are conserved residues distal from the interface with side chains oriented toward the interior of the helical bundle, which likely stabilize its globular arrangement.
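As a rough illustration of how an interface can be mapped from a structural model, the sketch below lists residue pairs whose atoms lie within a distance cutoff and counts how many partners each residue has. It is not the network-building method used here: the 5 Å cutoff, the function names, the D-subunit residue labels, and the toy coordinates are all assumptions made for the example.

```python
from math import dist


def interface_contacts(chain_g, chain_d, cutoff=5.0):
    """Return (G residue, D residue) pairs with any atom pair within `cutoff` Å."""
    contacts = set()
    for res_g, atoms_g in chain_g.items():
        for res_d, atoms_d in chain_d.items():
            if any(dist(a, b) <= cutoff for a in atoms_g for b in atoms_d):
                contacts.add((res_g, res_d))
    return contacts


def degree(contacts, index):
    """Count contact partners per residue (index 0 = G side, 1 = D side)."""
    counts = {}
    for pair in contacts:
        counts[pair[index]] = counts.get(pair[index], 0) + 1
    return counts


# Toy atom coordinates in Å; residue labels on the D side are hypothetical.
chain_g = {"Leu18": [(0, 0, 0), (1, 0, 0)], "Tyr113": [(20, 0, 0)]}
chain_d = {"D_res_A": [(3, 0, 0)], "D_res_B": [(0, 4, 0)], "D_res_C": [(30, 0, 0)]}

contacts = interface_contacts(chain_g, chain_d)
g_degree = degree(contacts, 0)
```

In this toy system only Leu18 contacts the D-subunit, with two partners. A residue with many partners in such a network is the kind of site one would prioritize for in silico alanine scanning.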

Fig. 3.


Birth of the nitrogenase G-subunit interface. a) Predicted NifAnc D-subunit structure, prior to G-subunit emergence. Inset shows cross-section at the putative metallocluster insertion site, the entrance of which forms a “cleft” in the D-subunit surface. Superimposed G-subunit structure (transparent) highlights relative position of the cleft. b) Predicted VnfAnfAnc D- and G-subunit structures, following G-subunit emergence, shown in the same view as a). Side chain of the G-subunit Leu18 residue is anchored within the D-subunit cleft. a, b) Heterometal content for the ancestor metalloclusters is inferred. c) NifAnc G–D proto-interaction network. Highlighted residues are located within an interface patch conserved in extant Vnf/Anf. Residues that are later substituted in VnfAnfAnc are shown bold and labeled to permit comparison with d). Residues within the cleft are circled. d) VnfAnfAnc G–D interaction network, shown in the same view as c). Residues newly substituted in VnfAnfAnc are labeled. e) Surface view of the NifAnc D-subunit proto-interface patch. f) Surface view of the VnfAnfAnc D-subunit interface patch, shown in the same view as e). Residues that are either maintained or substituted between NifAnc and VnfAnfAnc are highlighted. g) Evolution of residues within the interface patch through the emergence of the G-subunit. For substituted sites, the posterior probability of the most likely VnfAnfAnc residue is <0.3 in NifAnc and >0.7 in VnfAnfAnc and, for maintained sites, >0.7 in NifAnc and >0.7 in VnfAnfAnc.

We tracked the evolution of D-subunit residues through the emergence of the ancestral G–D interface. We identified 42 D-subunit sites interacting with the G-subunit following its initial recruitment in ancestral VnfAnfAnc (supplementary tables S6-S7, Supplementary Material online). Of the 42 sites, 26 are well conserved among extant Vnf/Anf nitrogenases and form a core interface patch that was also likely integral to the early G–D interface (Fig. 3e and f). We inspected these 26 sites and found that only nine (∼35%) of these sites evolved (i.e. were substituted) alongside G-subunit emergence, imposing subtle changes to the electrostatic surface potential of the VnfAnfAnc interface patch (supplementary figs. S9 and S10, Supplementary Material online).

Intriguingly, residues and properties of the core, D-subunit interface patch are mostly maintained through the emergence of the G-subunit. Most interface residues (16 out of 26 sites, ∼62%) do not change (Fig. 3f and g). This finding suggests that a “proto-interface” had already appeared by NifAnc and required relatively minor residue changes to form G-subunit interactions. We observe that the residue-level similarity between the NifAnc and VnfAnfAnc D-subunit interface patches translates to similar surface hydrophobicity (median apolar fraction ≈ 0.2) before and after G-subunit recruitment (supplementary fig. S11, Supplementary Material online), indicating a predisposition for residue burial by protein–protein contacts.
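The posterior-probability thresholds given in the Fig. 3g legend, and the percentages quoted above, can be written out explicitly. This is a sketch: the function name and the "ambiguous" label for sites that meet neither criterion are illustrative additions, while the 0.3 and 0.7 thresholds and the site counts (9 and 16 of 26) come from the text.

```python
def classify_site(pp_in_nifanc, pp_in_vnfanfanc, low=0.3, high=0.7):
    """Classify an interface site from the posterior probability of the most
    likely VnfAnfAnc residue, evaluated at the NifAnc and VnfAnfAnc nodes."""
    if pp_in_vnfanfanc > high and pp_in_nifanc < low:
        return "substituted"
    if pp_in_vnfanfanc > high and pp_in_nifanc > high:
        return "maintained"
    return "ambiguous"  # illustrative label; not a category used in the text


# Percentages of the 26 core interface sites quoted in the text.
core_sites = 26
pct_substituted = round(100 * 9 / core_sites)
pct_maintained = round(100 * 16 / core_sites)
```

For example, `classify_site(0.1, 0.9)` returns "substituted" and `classify_site(0.95, 0.9)` returns "maintained"; the two percentages evaluate to 35 and 62, matching the rounded values reported above.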

The D-subunit interface patch of the AlphaFold-modeled NifAnc also exhibits an early evolved shape complementarity with the G-subunit that would have facilitated initial interactions between the two proteins. In both A. vinelandii Vnf (Sippel and Einsle 2017) and Anf (Trncik et al. 2023), we identify a “cleft” within the D-subunit surface that is occupied by a G-subunit leucine residue (Leu18; site index here and hereafter from A. vinelandii VnfG). Leu18 is universally conserved by all extant and ancestral G-subunit proteins in our data set (supplementary fig. S8a and c, Supplementary Material online) and the cleft itself is present in both ancestral and representative extant nitrogenase structures, including that of NifAnc (Figs. 3a and b and 4). Leu18 is among the most well-connected within the G–D interaction network (Fig. 3d) and is computationally inferred to be critical for interface stability (supplementary fig. S8b, Supplementary Material online), perhaps acting as an important anchor point for G-subunit binding. Finally, we predict Leu18 to be allosterically active based on perturbation response scanning (see Materials and Methods), which indicates that Leu18 movements can generate long-distance effects that reach across opposite ends of the nitrogenase complex (supplementary fig. S12, Supplementary Material online). These effects suggest the importance of the G–D interface in controlling the global motions of the nitrogenase complex and evoke similar, long-distance structural perturbations that are induced through the nitrogenase catalytic cycle (Huang et al. 2021; Rutledge et al. 2022).

Fig. 4.


Conservation of the D-subunit cofactor insertion cleft across extant and ancestral nitrogenases. Small arrow indicates position of the proposed cofactor insertion cleft. Dotted bold lines outline the cofactor insertion channel or cleft. a) Crystallographic structure of the A. vinelandii apo-NifD protein (PDB 1L5H). Cross-section view (right) reveals the proposed metallocluster insertion site (Schmid, Ribbe, et al. 2002), the entrance of which structurally aligns with the G–D interface of VnfD. Aligned, cross-section views are displayed for extant b) A. vinelandii NifD (PDB 3U7Q), c) C. pasteurianum NifD (PDB 4WES), d) A. vinelandii VnfD (PDB 5N6Y), and e) A. vinelandii AnfD (PDB 8BOQ) crystallographic structures as well as predicted, ancestral f) NifAnc D-subunit and g) VnfAnfAnc D-subunit structures. a to c, f) A. vinelandii VnfG (PDB 5N6Y; transparent) is superimposed on Nif structures to enable comparison across nitrogenases with and without native G-subunit proteins. a to g) The cross-section slices are at the same depth for all seven aligned structures.

An Ancient Assembly Pathway Preconditioned Early Nitrogenases for G-Subunit Interaction

The functional significance of the nitrogenase D-subunit interface patch provides an explanation for why it resisted significant remodeling following G-subunit recruitment. By comparison with extant Nif, we find that the ancestral (proto-)interface patch of both NifAnc and VnfAnfAnc D-subunits overlaps with the entrance to a proposed nitrogenase cofactor insertion site (Fig. 4a). In the extant apo-NifD protein, this feature adopts an open conformation to permit cofactor loading (Schmid, Einsle, et al. 2002). In the closed conformation, only the aforementioned D-subunit cleft remains visible at the surface, which aligns with the entrance of the open cofactor insertion channel (Fig. 4b). Given the consistent presence of the cleft that we observe across Nif, Vnf, and Anf nitrogenases (Fig. 4), a similar cofactor insertion mechanism can be predicted for Vnf and Anf. It is evidently an ancient, universal feature of nitrogenase assembly that predates the G-subunit. In fact, we find that 12 interface residues surrounding the insertion site are well conserved in extant Group III Nif clades that diverge prior to the origin of the G-subunit, and five of these are either universally or nearly universally conserved among all nitrogenases (supplementary fig. S13, Supplementary Material online). We expect that the essentiality of the cofactor insertion site would have made extensive residue changes to the NifAnc D-subunit surface deleterious during the formation of a new G–D interface.

The close residue-level interactions that we observe between both extant and ancestral G-subunit proteins and the cofactor insertion site of the D-subunit indicate that the G-subunit was integrated into the nitrogenase assembly process early in its evolution. In VnfAnfAnc, the side chain of the conserved G-subunit Leu18 residue is oriented into the cleft marking the entrance of the insertion site and toward the loaded active site (Figs. 3a and b and 4). An unusual feature of the G-subunit, a C-terminal tail that terminates in a strictly conserved tyrosine residue (Tyr113), bridges the presumed opening of the ancestral apo-D-subunit protein and neighbors universally conserved D-subunit residues (supplementary fig. S13, Supplementary Material online). A role in nitrogenase assembly is supported by experimental evidence that extant VnfG and FeV-co of A. vinelandii associate in vitro, as well as by the requirement of both VnfG and FeV-co to stabilize an active complex (Chatterjee et al. 1996, 1997).

Cofactor insertion in extant nitrogenases is otherwise considered to be mediated by transient interactions with assembly proteins, with the particular set of proteins varying across Nif, Vnf, and Anf nitrogenase systems (Schmid, Einsle, et al. 2002; Ruttimann-Johnson et al. 2003; Hernandez et al. 2011; Jimenez-Vicente et al. 2018; Perez-Gonzalez et al. 2021). Although this process can involve additional assembly factors (Jimenez-Vicente et al. 2018), minimally, it must require cofactor transfer from an assembly scaffold protein (e.g. NifEN) on which the cofactor is matured or directly from the cofactor precursor biosynthesis protein NifB that is shared between Nif, Vnf, and Anf systems (Perez-Gonzalez et al. 2021). These required protein interactions would presumably have shaped nitrogenase surface properties prior to the emergence of the G-subunit, particularly near the cofactor insertion site that in alternative nitrogenases also serves as the G–D interface. This possibility provides an explanation for the early evolved hydrophobicity that we observe for the ancestral, NifAnc D-subunit proto-interface patch (supplementary fig. S11, Supplementary Material online). An ancestral D-subunit surface that was already primed for protein interactions would have been preconditioned for G-subunit binding. A newly evolved G-subunit would have been integrated into this assembly pathway, perhaps replacing and/or mediating earlier evolved protein or cofactor interactions. However, unlike putative assembly factors, G and D interactions were evidently later tuned to form a permanent interface in alternative nitrogenases.

The G-Subunit is Functionally Specialized between Vnf and Anf Nitrogenases

A role in nitrogenase assembly suggests that the G-subunit would have evolved functional specialization between Vnf and Anf nitrogenases due to the different interactions with either FeV-co or FeFe-co, respectively, and/or other unique proteins involved in their respective cofactor insertion pathways (Schmid, Einsle, et al. 2002; Ruttimann-Johnson et al. 2003; Hernandez et al. 2011; Jimenez-Vicente et al. 2018; Perez-Gonzalez et al. 2021). For example, in A. vinelandii, NafY and NifY proteins are proposed to interact directly with the Nif nitrogenase to promote cofactor delivery (Jimenez-Vicente et al. 2018). A comparable protein, VnfY, is suggested to provide the same function for Vnf nitrogenase (Ruttimann-Johnson et al. 2003). In contrast, the Anf nitrogenase system does not contain a counterpart for NafY/NifY/VnfY. Instead, cofactor delivery is proposed to occur directly between Anf and the NifB cofactor biosynthesis protein (Perez-Gonzalez et al. 2021). We therefore investigated sequence signatures of specialization between Vnf and Anf (i.e. protein sites that diverge between Vnf and Anf but are otherwise conserved within each of the two groups) that are likely to generate functional differences. We find the degree of specialization (relative to protein size) within the G-subunit to be on par with that of the D-subunit but observe comparatively little specialization within the K-subunit (supplementary fig. S14a, Supplementary Material online; supplementary table S8, Supplementary Material online). Therefore, the G-subunit contributes an outsized fraction of the total sequence specialization between Vnf and Anf, despite its remoteness from any nitrogenase cofactors and active sites. Specialized sites within the G-subunit are also distal from the G–D interface and primarily located at the surface of the protein (supplementary fig. S14b, Supplementary Material online). These include one residue, His110, close to the presumed H-subunit binding site, which might support a previously proposed role for the G-subunit in differentiating the H-subunit proteins of Nif, Vnf, or Anf systems in the same organism, thereby optimizing electron transfer between cognate nitrogenase components (Pence et al. 2021; Schmidt et al. 2024).

We examined how these signatures of sequence-level specialization manifested early in G-subunit evolution by tracking their amino acid content through the divergence of Vnf and Anf clades. First, we find that the specialized sequence motifs of VnfAnc (Vnf ancestor) and AnfAnc (Anf ancestor) G-subunit proteins most closely resemble those of their respective Vnf and Anf descendants (Fig. 5a; supplementary fig. S14c, Supplementary Material online). The most likely VnfAnc motif across six noncontiguous specialized sites is “FIKIDV” (compared with extant Vnf consensus “FIKVDH”) whereas the most likely AnfAnc motif is “VNVDYD” (compared with extant Anf consensus “VNVDYD”). Second, however, we find that the most likely motif of the common VnfAnfAnc ancestor, “VIKIDV,” is more like that of Vnf than Anf, even accounting for the statistical uncertainty associated with ancestral reconstruction. These results indicate that the earliest G-subunit proteins were Vnf-like (in agreement with previous inferences of ancestral nitrogenases that indicate Vnf evolved prior to Anf; Garcia et al. 2020). Further, the G-subunit likely contributed to the protein-level specificity required for the assembly of an early, vanadium-dependent nitrogenase in the wake of Earth’s atmospheric oxygenation ∼2.5 billion years ago (Fig. 5b).
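As a rough illustration of the motif comparison described above, the following Python sketch counts identical positions between the most likely VnfAnfAnc motif and the other motifs quoted in the text. The motif strings come from the text; the helper name `shared_sites` is illustrative, and the sketch ignores the posterior probabilities that the full analysis takes into account.

```python
# Most likely motifs across the six specialized sites, as quoted in the text.
MOTIFS = {
    "VnfAnfAnc": "VIKIDV",
    "VnfAnc": "FIKIDV",
    "AnfAnc": "VNVDYD",
    "Vnf consensus": "FIKVDH",
    "Anf consensus": "VNVDYD",
}


def shared_sites(a: str, b: str) -> int:
    """Count positions at which two equal-length motifs carry the same residue."""
    if len(a) != len(b):
        raise ValueError("motifs must have the same length")
    return sum(x == y for x, y in zip(a, b))


ancestor = MOTIFS["VnfAnfAnc"]
for name in ("VnfAnc", "Vnf consensus", "AnfAnc", "Anf consensus"):
    n = shared_sites(ancestor, MOTIFS[name])
    print(f"VnfAnfAnc vs {name}: {n}/{len(ancestor)} identical sites")
```

Under this simple identity count, the VnfAnfAnc motif shares five of six sites with VnfAnc and three of six with the extant Vnf consensus, but only one of six with either Anf motif.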

Fig. 5.

Evolution of G-subunit functional specialization and Vnf/Anf metal dependence. a) Hypothesized functional specialization of Vnf and Anf G-subunit proteins. Specialized sequence motifs, represented by sequence logos, of ancestral G-subunit proteins (VnfAnfAnc, VnfAnc, and AnfAnc) are shown beside their respective nodes in the stylized nitrogenase phylogeny. Sequence logos scale with amino acid posterior probabilities at each site. VnfAnfAnc dependence on FeV-co is inferred based on comparisons of specialized sequence motifs (indicated by a bold asterisk above FeV-co for VnfAnfAnc). Branch lengths in the phylogeny do not scale with the timeline in b). b) Geological context of G-subunit emergence in VnfAnfAnc nitrogenase. Plot shows the association between changes in bulk marine concentrations of redox-sensitive metals involved in nitrogen fixation (Mo, V, and Fe) and the progressive oxygenation of the Earth’s surface environment. Ancient vanadium concentrations are not well known, although oxygenation likely impacted the speciation and solubility of V and Mo similarly over geologic time (Emerson and Huested 1991; Moore et al. 2020). Metal concentrations are from Zerkle et al. (2005) and references therein and pO2 data are from Lyons et al. (2014). Estimated age constraints of nitrogenase origins and VnfAnfAnc (i.e. G-subunit emergence) are from Stueken et al. (2015) and Parsons et al. (2021), respectively. GOE, great oxidation event; PAL, present atmospheric level. c) Schematic of the experimental strategy to replace A. vinelandii VnfG with the ancestral VnfAnfAnc G-subunit (GAnc) or A. vinelandii AnfG and characterization via the acetylene (C2H2) reduction assay. d) Genome engineering approach to modify the vnfG gene of A. vinelandii by reciprocal recombination with donor plasmid DNA. All reported strains have Nif inactivated by nifD replacement with a kanamycin resistance cassette (KanR). e) Cellular acetylene reduction activity of engineered A. vinelandii strains (n = 4 biological replicates for WT vnfG and ΔvnfG::anfG; n = 3 for ΔvnfG::GAnc and ΔvnfG). Individual data points are shown by dark circles and mean values are shown by bar plots. Statistical tests were performed by one-way ANOVA and post hoc Tukey’s Honestly Significant Difference test.

We next assessed whether the divergence in Vnf and Anf G-subunit features since their common VnfAnfAnc ancestor was sufficient to diminish compatibilities of different G-subunit variants with other proteins in an alternative nitrogenase. We constructed several genomically engineered A. vinelandii strains to express hybrid Vnf enzymes containing different G-subunit proteins. In these strains, the vnfG gene was either (i) replaced by extant A. vinelandii anfG (“ΔvnfG::anfG”), (ii) replaced by a gene encoding the common VnfAnfAnc G-subunit ancestor (“ΔvnfG::GAnc”), or (iii) fully removed by markerless deletion (“ΔvnfG”) (Fig. 5c and d). A nifD deletion mutation was also introduced in all strains, including a fourth control strain with no modification to vnfG (“WT vnfG”), to ensure that any observed change in phenotype would not be confounded by Nif activity (vanadium-amended growth medium was also confirmed to contain only trace Mo < 5 nM; supplementary fig. S15, Supplementary Material online). A. vinelandii VnfG and AnfG amino acid sequences are ∼40% identical, and each is ∼52% identical to the G-subunit protein of their common ancestor, VnfAnfAnc. All are structurally comparable and share several conserved residues at the G–D interface, as described above.

We observed that the replacement of VnfG by the VnfAnfAnc G-subunit or its complete deletion in A. vinelandii substantially decreased in vivo nitrogenase substrate reduction, assessed by the cellular rate of acetylene (C2H2; an alternative substrate of nitrogenase) reduction to ethylene (C2H4) (Hardy et al. 1968). Mean acetylene reduction rates of cells derepressed for nitrogenase expression in V-replete and Mo-deficient media were decreased by ∼73% in the ΔvnfG::GAnc strain (P < 0.001) and ∼90% in the ΔvnfG strain (P < 0.001) compared with the WT vnfG strain (Fig. 5e). Acetylene reduction activity was more moderately reduced in the ΔvnfG::anfG strain (∼47%; P < 0.001), indicating that compatibility between the A. vinelandii Vnf enzyme and its AnfG protein is higher than between the former and the ancestral GAnc protein. This was unexpected because our sequence signature analysis predicted that GAnc would exhibit Vnf-like qualities (see above), and its overall sequence identity with A. vinelandii VnfG is higher than with AnfG. It is possible that other factors have shaped G-subunit features (and thus their cross compatibilities) in extant microbes, including for example the fact that both A. vinelandii VnfG and AnfG are expressed within an obligately aerobic microbe compared with a likely anaerobe-hosted VnfAnfAnc ancestor. Nevertheless, we conclude that the evolutionary divergence between the tested extant and ancestral G-subunit variants is sufficient to substantially impact the substrate reduction behavior of a hybrid Vnf enzyme.

Discussion

In the present study, we interrogated the origin, early evolution, and functional significance of the G-subunit protein that is unique to alternative Vnf and Anf nitrogenases. The emergence of the G-subunit constituted an unprecedented gain in structural and genetic complexity in the evolutionary history of nitrogen fixation >1.5 billion years ago (Parsons et al. 2021).

We demonstrate that there currently exists no candidate evolutionary predecessor of the G-subunit family, whether another protein, protein fragment, or segment of nongenic DNA. This finding sets the origin of the G-subunit apart from other major evolutionary events in the history of nitrogen fixation that are explainable by gene duplication, including the divergence between other catalytic subunits, between nitrogenases and their maturation protein scaffolds, and between nitrogenase isozymes (Boyd and Peters 2013; Garcia et al. 2022). The addition of such a small and evolutionarily enigmatic protein domain appears to be rare across biogeochemical enzymes.

With the birth of the nitrogenase G-subunit, not only did a rare evolutionary event occur whose associated fitness costs were overcome (Andersson et al. 2015), but also a novel structural domain became fully integrated into the nitrogenase complex and was retained in one of the most central metabolic processes to Earth's biosphere. We propose that the G-subunit was formed de novo from previously nongenic DNA, more recently acknowledged as a viable, although arguably infrequent (Moyers and Zhang 2016), process for generating genetic novelty (Heames et al. 2020; Vakirlis, Acar, et al. 2020; Vakirlis, Carvunis, and McLysaght 2020). Large (>300 bp) insertion events between D- and K-genes like that which would have preceded the birth of a new G-subunit gene also appear to be rare, since we find that most nifD-to-nifK intergenic distances in extant organisms are <100 bp. It must typically be evolutionarily advantageous for D- and K-subunit genes to remain in close proximity. Nevertheless, the G-subunit gene was retained and may have accompanied the genetic duplication and divergence of Nif that produced alternative nitrogenases (Boyd and Peters 2013).

By analyzing predicted nitrogenase ancestors, we reveal an evolutionary stepping stone in the initial recruitment of the G-subunit. Our reconstruction of ancestral, D-subunit surface features indicates that the G-subunit binding site was built from an essential, preexisting cofactor insertion site. This site additionally appears previously adapted for transient protein interactions with chaperones that, like in extant microbes, would have been required for nitrogenase assembly (Schmid, Einsle, et al. 2002; Ruttimann-Johnson et al. 2003; Hernandez et al. 2011; Jimenez-Vicente et al. 2018; Perez-Gonzalez et al. 2021). This assembly scheme likely dates as far back as the nitrogenase family itself, hundreds of millions of years prior to the earliest whiffs of Earth's oxygenation that accompanied G-subunit emergence (Anbar et al. 2007; Ostrander et al. 2021). Thus, the birth of the permanent G–D interface represents a gradual subsumption of formerly transient interactions that provided an evolutionary “cradle” for a permanent, novel structural domain.

Although the G-subunit is required for extant Vnf/Anf activity, its contemporary essentiality does not, on its own, prove that it was beneficial early in its emergence. Protein subunits can emerge via neutral evolutionary processes (Munoz-Gomez et al. 2021) and become entrenched in complexes via the accumulation of neutral mutations (Hochberg et al. 2020). Nevertheless, our evolutionary analyses of the G-subunit protein, coupled with prior experimental support of a role in nitrogenase assembly (Chatterjee et al. 1997), are suggestive of at least one early beneficial role within the nitrogenase assembly pathway by replacing and/or mediating earlier evolved protein or cofactor interactions. Specifically, we find that G-subunit residues, despite being relatively distal from the nitrogenase active site and not permanently retaining bound, reactive cofactors like other subunits, have an outsized contribution to the sequence-level specialization between alternative nitrogenases.

Our genetic experiments further confirm that extant and ancestral G-subunit proteins are not functionally interchangeable in a modern diazotroph, likely due to the degree of specialization required for their role in nitrogenases. Possible functional implications might include improved specificity in either cofactor or protein interactions during assembly (i.e. with aforementioned assembly factors that differ across Nif, Vnf, and Anf; Schmid, Einsle, et al. 2002; Ruttimann-Johnson et al. 2003; Hernandez et al. 2011; Jimenez-Vicente et al. 2018; Perez-Gonzalez et al. 2021) and in mediating crosstalk between electron transfer (i.e. H-subunit) and catalytic nitrogenase components (Pence et al. 2021; Schmidt et al. 2024). Given a capacity to mediate the assembly of alternative nitrogenases and the incorporation of novel metal cofactors, the G-subunit may have been essential for the initial diversification of Vnf/Anf. These molybdenum-independent nitrogenases would be advantageous for new, molybdenum-limited ecological niches and for temporal fluctuations and spatial heterogeneity in environmental redox conditions (and associated trace metal availabilities) following the rise of oxygenic phototrophs (Hazen et al. 2008; Reinhard et al. 2013; Lyons et al. 2014), one of the most significant ecological revolutions in Earth history. Although the precise role remains to be fully characterized, our work provides ancestral G-subunit protein targets with which evolutionary mechanisms in the history of this protein can continue to be tested.

The ancient, orphan G-subunit protein was birthed by seemingly improbable events. Early Earth’s surface oxygenation generated conditions favorable for the emergence of the G-subunit and alternative metal dependence. However, a preexisting molecular foundation set the primary constraints on the emergence of the protein. G-subunit recruitment was thus enabled by a conserved nitrogenase assembly pathway that originated hundreds of millions of years before the rise of oxygen and the associated shifts in metal availabilities. Future bridges between deep time evolution, molecular novelties, and planetary circumstances will refine our understanding of how life responds to environmental change.

Materials and Methods

Nitrogenase Protein Sequence Curation

A preliminary 412-sequence data set of nitrogenase Vnf/AnfG proteins was compiled by BLASTP (Altschul et al. 1990) search (expect value threshold = 0.1) against the National Center for Biotechnology Information (NCBI) nonredundant protein sequence database (nr) (Sayers et al. 2022) (accessed January 2022), using the A. vinelandii VnfG query sequence (WP_012698949.1). BLASTP hits were curated by filtering against partial sequences and homologs whose encoding genes are distantly located from vnf or anf operons (and thus unlikely to be canonical vnf/anfG genes). Certain VnfG sequences were manually extracted from fused VnfDG proteins, using MAFFT v7.490 (Katoh and Standley 2013) alignments to other D- or G-subunit sequences as a guide. The preliminary Vnf/AnfG sequence data set was pruned by CD-HIT (Fu et al. 2012) (sequence identity threshold = 90%), yielding a final 188-sequence G-subunit data set.

G-Subunit Homolog Search

In addition to the BLASTP search described above, three sequence-based search tools—TBLASTN (Altschul et al. 1990), PSI-BLAST (Altschul et al. 1997), and HMMER (Finn et al. 2011)—were used to identify distantly related homologs of the nitrogenase G-subunit (conducted January 2022). TBLASTN and PSI-BLAST searches against the NCBI nucleotide collection (nr/nt) and nr protein database, respectively, were executed with the A. vinelandii VnfG query (WP_012698949.1), an expect value threshold = 0.1, and word size = 2 (all other parameters default). HMMER search (hmmsearch) against the UniProt Reference Proteome database (UniProt Consortium 2023) was performed using HMM profiles generated from (i) the MAFFT-aligned, 188-sequence G-subunit data set (see above) and (ii) the PAML output alignment with both 188 extant and 187 ancestral G-subunit sequences. The search was run with an expect value threshold = 0.1 for both target sequences and hits (all other parameters default).

Structure-based searches of distant G-subunit homologs were performed by the FoldSeek server (van Kempen et al. 2024) using the A. vinelandii VnfG crystallographic structure (PDB 5N6Y; Sippel and Einsle 2017) query against the Protein Data Bank and AlphaFold-UniProt/Swiss-Prot databases. We employed both 3Di-AA and TM-Align modes with a 3Di-AA expect value threshold = 0.01 or a TM-score = 0.5 (template modeling score).

Phylogenetic and Ancestral Sequence Reconstruction

The curated, 188-sequence Vnf/AnfG data set was aligned by MAFFT. No outgroup to G-subunit sequences was included because the distant homology search described above did not yield convincing hits (see Results). IQ-TREE v1.6.12 (Nguyen et al. 2015) was used to reconstruct a maximum-likelihood G-subunit protein phylogeny, using the best-fit evolutionary model, WAG + R6 (selected by the Bayesian information criterion [BIC]), tested by ModelFinder (Kalyaanamoorthy et al. 2017), with branch support calculated by ultrafast bootstrap approximation (Hoang et al. 2018) and the Shimodaira–Hasegawa-like approximate likelihood-ratio test (Guindon et al. 2010).

A phylogeny of concatenated Vnf/AnfHDGK proteins was generated by retrieving syntenous subunit sequences to the curated VnfG/AnfG data set. Certain VnfH sequences that recently diverged from NifH clades were omitted due to having different evolutionary histories than those of other Vnf/Anf proteins. NifHDK sequences from a prior phylogenetic analysis (Garcia et al. 2023) were included as an outgroup (n = 67, closely branching to Anf/Vnf), and the full prior data set was used to calculate intergenic nifD-to-nifK distances (n = 306). Each subunit sequence data set was aligned by MAFFT prior to concatenation. Maximum-likelihood phylogenetic reconstruction was performed by IQ-TREE as described above (best-fit evolutionary model, BIC: LG + F + R9).
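For readers unfamiliar with the intergenic distance measure mentioned above, a minimal Python sketch follows. It assumes 1-based, inclusive gene coordinates on the same strand, with nifD upstream of nifK; the `Gene` record, the function name, and the example coordinates are all hypothetical and are not taken from the study's data set.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def intergenic_distance(upstream: Gene, downstream: Gene) -> int:
    """Number of base pairs between two genes on the same strand.

    A negative value means the two coding sequences overlap.
    """
    return downstream.start - upstream.end - 1


# Hypothetical coordinates for illustration only.
nifD = Gene("nifD", 1000, 2478)
nifK = Gene("nifK", 2530, 4101)
print(intergenic_distance(nifD, nifK))  # 51
```

A distance computed this way can then be compared against the <100 bp and >300 bp ranges discussed in the text.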

Marginal maximum-likelihood ancestral sequence reconstruction of concatenated Vnf/AnfHDGK sequences was performed by PAML v4.9j (Yang 2007) (LG + F + G4 model). Mean sitewise posterior probabilities of the D- and G-subunit ancestors analyzed in this study range from 0.75 to 0.96 (supplementary table S1, Supplementary Material online). Sequence gaps were reconstructed as described by Aadland et al. (2019). Briefly, the input protein sequence alignment was recoded as a binary amino-acid presence/absence matrix. Ancestral presence/absence states were reconstructed by PAML using a binary character model and state frequencies inferred by maximum likelihood.
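The recoding step described above can be sketched in Python as follows. This is only an illustration of the presence/absence idea, not the pipeline of Aadland et al. (2019); the function name and the toy alignment are hypothetical.

```python
def recode_presence_absence(alignment: dict[str, str], gap_chars: str = "-") -> dict[str, str]:
    """Recode each aligned sequence as a string of '1' (residue) and '0' (gap)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return {
        name: "".join("0" if char in gap_chars else "1" for char in seq)
        for name, seq in alignment.items()
    }


# Toy alignment for illustration only.
toy_alignment = {
    "seqA": "MK-LV",
    "seqB": "MKTLV",
    "seqC": "M--LV",
}
for name, states in recode_presence_absence(toy_alignment).items():
    print(name, states)
```

The resulting binary matrix is the kind of input that a two-state character model would take to infer whether each ancestral site was present or absent.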

Logo representations of extant sequence variation were generated with the Python library LogoMaker (Tareen and Kinney 2020) using the probability transformation of the count matrix. Representations of ancestral sequences were generated from the posterior probability matrices produced by PAML.
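To make the probability transformation concrete, the sketch below converts aligned sequences into per-column residue frequencies using only the Python standard library. It does not call LogoMaker, whose own transformation may differ (for example, by applying pseudocounts); the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def probability_matrix(sequences: list[str], ignore: str = "-") -> list[dict[str, float]]:
    """Return one {residue: frequency} dict per alignment column.

    Gap characters are ignored, so each non-empty column sums to 1.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    matrix = []
    for column in zip(*sequences):
        counts = Counter(char for char in column if char not in ignore)
        total = sum(counts.values())
        matrix.append({aa: n / total for aa, n in counts.items()} if total else {})
    return matrix


# Toy six-site motifs for illustration only.
toy = ["FIKVDH", "FIKVDH", "FIKIDH", "FLKVDH"]
for position, column in enumerate(probability_matrix(toy), start=1):
    print(position, column)
```

Each row of the resulting matrix corresponds to one logo position, with letter heights proportional to the frequencies.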

Sequence and phylogenetic tree data sets are available at https://github.com/kacarlab/nitrogenase-g-subunit.

OGT Prediction

For taxa represented in the nitrogenase G-subunit sequence data set, OGTs were predicted from genomic content using the method described by Sauer and Wang (2019). Due to genome incompleteness and the absence of 16S rRNA sequences in fragmented genomes for certain representative taxa, the selected “Superkingdom Bacteria” and “Superkingdom Archaea” regression models for OGT prediction excluded genome size and 16S rRNA.

Protein Conservation and Divergence Analysis

Sequence conservation for D- and G-subunit proteins was determined by the ConSurf server (Ashkenazy et al. 2016). Sites were considered well conserved if their ConSurf grade was >7. Sequence specialization was determined by the TwinCons package (Penev et al. 2021). TwinCons was run separately on individual alignments of extant H, D, G, and K subunit protein sequences with user-defined Vnf and Anf groups, using default parameters and the LG substitution matrix. TwinCons specialization scores were normalized to a 0-to-1 scale per analyzed subunit.
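The text does not state which formula was used to rescale scores to a 0-to-1 range. One common choice is min-max scaling, sketched below purely as an assumption; the function name and example values are hypothetical and are not TwinCons output.

```python
def min_max_normalize(scores: list[float]) -> list[float]:
    """Rescale scores linearly so that the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        # All scores are identical; there is no spread to rescale.
        return [0.0 for _ in scores]
    return [(score - lowest) / (highest - lowest) for score in scores]


# Hypothetical per-site scores for one subunit.
print(min_max_normalize([-2.0, 0.0, 2.0, 6.0]))  # [0.0, 0.25, 0.5, 1.0]
```

Applying such a rescaling separately to each subunit, as the text describes, makes scores comparable within a subunit but not directly across subunits.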

Structural Modeling

Ancestral and extant NifHDGK protein structures were predicted using ColabFold (Mirdita et al. 2022), which combines the AlphaFold2 method for structure prediction from multiple-sequence alignments with the MMseqs2 method to obtain evolutionary information (Steinegger and Söding 2017). Specifically, the colabfold_batch v1.4.0 implementation (https://github.com/YoshitakaMo/localcolabfold) was used with the AlphaFold-v2.0-multimer prediction model and standard options (3 recycles and AMBER all-atom optimization in GPU). Protein structures were visualized by ChimeraX (Pettersen et al. 2021). A reference list of residue identifiers for the analyzed structures is provided in supplementary tables S8-S9, Supplementary Material online.

Structure Analysis

Residue interaction networks (Csermely 2008) were generated using ProLIF (Bouysset and Fiorucci 2021), which finds the different possible interactions (hydrophobic, electrostatic, hydrogen bonds, etc.) between two groups of atoms (here defined as G- and D-subunit groups) within a structure snapshot (e.g. a single structure or a trajectory frame). Interactions were compiled into networks by NetworkX (Hagberg et al. 2008), which were later visualized in Gephi (Bastian et al. 2009). Scripts used to generate the networks are available at https://github.com/kacarlab/nitrogenase-g-subunit.

Surface electrostatic maps were computed using the Poisson Boltzmann Solvent-Accessible model implemented in APBS (Jurrus et al. 2018) through the PyMOL plugin (Schrödinger, Inc.), using standard PDB2PQR and APBS protocols at pH 7.

Sitewise contribution to D-G interface stability was computed using in silico alanine scanning, which substitutes residues with a minimal methyl moiety (alanine) (Kortemme et al. 2004). The changes in the binding free energy values upon substitution were obtained using the machine learning-enabled mCSM2 method (Rodrigues et al. 2019) with default parameters.

Allosteric effects across the nitrogenase complex were modeled using Perturbation Response Scanning (Gerek and Ozkan 2011), implemented in ProDy (Gerek and Ozkan 2011) with default parameters and provided with PDB 5N6Y (Sippel and Einsle 2017) alpha carbon coordinates. Scripts used to model allosteric effects are available at https://github.com/kacarlab/nitrogenase-g-subunit.

A. vinelandii Strains and Culturing

All A. vinelandii strains used in the present study (supplementary table S2, Supplementary Material online) are derivatives of the wild-type DJ strain, generously provided by Dennis Dean (Virginia Tech). A. vinelandii cells were cultured in Burk's medium (containing 1 µM Na2MoO4), with nitrogen (13 mM ammonium acetate), vanadium (1 µM Na2MoO4 replaced with 1 µM Na2O2V), and/or antibiotic amendments (0.1-µg/mL streptomycin, 0.6-µg/mL kanamycin, and 5-µg/mL rifampicin) as needed (Dos Santos 2019). For most procedures, cultures were grown at 30 °C and agitated at 300 rpm. Competent cells for strain engineering (see below) were cultured at 30 °C and agitated at 170 rpm.

Plasmid Construction

All plasmids are listed in supplementary table S2, Supplementary Material online. Plasmids for A. vinelandii strain engineering were constructed via BsmBI-based Golden Gate assembly of synthetic DNA parts (Twist Biosciences, GenScript Biotech), including G-subunit variant coding sequences, homology arms, and antibiotic resistance cassettes. The ancestral G-subunit nucleotide sequence was codon-optimized as previously described (Garcia et al. 2023). Briefly, optimization was based on A. vinelandii DJ codon frequencies (Codon Usage Database, https://www.kazusa.or.jp/codon/), with the exception that the vnfG wild-type codon was assigned for identical amino acid residues. A. vinelandii vnfD and vnfK (including the vnfG-vnfK intergenic region) sequences were used for 5′ and 3′ homology arms, respectively. An “ASWSHPQFEK” C-terminal Strep-tag was included in the vnfD homology arm. Notably, the overlapping vnfD stop and vnfG start codons that are a feature of the A. vinelandii vnf operon were preserved in the plasmid assembly design. All DNA parts were synthesized with flanking BsmBI sites and 4-bp overhangs for scarless genomic integration into A. vinelandii. Internal BsmBI sites (two within the vnfG-vnfK intergenic region and one within the vnfD coding sequence) in A. vinelandii native sequences were removed prior to synthesis.
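Checking a sequence for internal BsmBI sites before synthesis can be illustrated with a short Python sketch. It searches both strands for the BsmBI recognition sequence (5′-CGTCTC-3′); the function names and the example sequence are hypothetical, and the sketch is not the authors' design workflow.

```python
BSMBI_SITE = "CGTCTC"  # BsmBI recognition sequence, 5' to 3'


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_bsmbi_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based position, strand) for every BsmBI recognition site."""
    seq = seq.upper()
    site_rc = reverse_complement(BSMBI_SITE)  # site as read on the opposite strand
    hits = []
    for i in range(len(seq) - len(BSMBI_SITE) + 1):
        window = seq[i:i + len(BSMBI_SITE)]
        if window == BSMBI_SITE:
            hits.append((i, "+"))
        elif window == site_rc:
            hits.append((i, "-"))
    return hits


# Hypothetical sequence containing one site on each strand.
print(find_bsmbi_sites("ATGCGTCTCAAAGAGACGTT"))  # [(3, '+'), (12, '-')]
```

Any hit inside a native sequence would need to be removed, for example by a synonymous substitution within a coding region, before the part can be used in a BsmBI-based assembly.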

Synthesized parts were assembled into a domesticated pUC19 vector, pSR01, wherein BsmBI sites were reengineered to flank the multiple cloning site via site-directed mutagenesis (Q5 Site Directed Mutagenesis Kit, New England Biolabs, Cat. No. E0554S). Golden Gate assembly was performed with BsmBI-v2 (New England Biolabs, Cat. No. R0739S) and T4 ligase (New England Biolabs, Cat. No. M0202S). Assembled plasmids were cloned into 5-alpha Competent Escherichia coli (New England Biolabs, Cat. No. C2987). Isolated transformants were screened for ampicillin resistance and stored in 80% glycerol at −80 °C. Assembly was confirmed by whole-plasmid sequencing (Plasmidsaurus, Eugene, OR, USA). Plasmids were purified with the QIAprep Spin Miniprep Kit (Qiagen, Cat. No. 27106) prior to A. vinelandii transformation.

Strain Engineering

Engineering of A. vinelandii strains used established methods, following Dos Santos (2019). All transformations were performed by first inducing genetic competence via metal starvation (growth in molybdenum- and iron-free Burk's medium). Competent cells were incubated with donor plasmid and screened on appropriate selective, solid Burk's medium. Mutations were incorporated into the genome via double reciprocal recombination. Isolated transformants were passaged on a selective solid medium two to three times to ensure phenotypic stability prior to storage in phosphate buffer containing 7% DMSO at −80 °C. Desired mutations were confirmed by Sanger sequencing. Oligonucleotides used for strain engineering and screening are listed in supplementary table S3, Supplementary Material online.

A vnfG knockout mutant (“ΔvnfG::StrR”) was constructed by transforming A. vinelandii DJ with a plasmid bearing a streptomycin resistance cassette (pSR22) and screening for streptomycin resistance. G-subunit gene variants were subsequently introduced into the vnf operon of ΔvnfG::StrR via congression transformation with a plasmid bearing the desired G-subunit gene mutation (pSR19, pSR21, or pSR24) and pDB303 (conferring rifampicin resistance) to facilitate selection (Dos Santos 2019). A fourth strain, containing Strep-tagged vnfK and wild-type vnfG, was constructed similarly by transformation with a ∼3,400-bp vnfDGK fragment PCR amplified from A. vinelandii strain DJ2254. Nif was subsequently inactivated in each of the four strains by secondary transformation with a ∼4,000-bp fragment PCR amplified from strain DJ2278, carrying a ΔnifD::KanR mutation, and screened for kanamycin resistance. Together, the transformation steps yielded four strains used for subsequent phenotypic characterization: WT vnfG, ΔvnfG::anfG, ΔvnfG::GAnc, and ΔvnfG. Strains DJ2254 and DJ2278, as well as plasmid pDB303, were provided by Dennis Dean (Virginia Tech).

Cellular Acetylene Reduction Assays

A. vinelandii seed cultures (50 mL in 125-mL flasks) representing independent biological replicates were prepared from plated cells in liquid, molybdenum-free Burk's medium amended with both vanadium and fixed nitrogen sources (“BNV medium”; see above) and left to grow for 24 h. Seed cultures were then used to inoculate 100-mL fresh BNV medium to an optical density (measured at 600 nm, “OD600”) of ∼0.01 and grown to an OD600 of ∼0.5 to 0.7 (mid-log phase). Cells were derepressed for Vnf expression by centrifuging 4,225 × g for 10 min and resuspending in 100-mL molybdenum-free Burk's medium amended with vanadium (“BV medium”) in 250-mL flasks, followed by incubation at 30 °C, 300 rpm for 4 h. Each flask was then affixed with a rubber septum cap. Twenty-five milliliters of headspace was removed and replaced by injecting an equivalent volume of acetylene gas. The cultures were then incubated as described above for 60 min. During this period, headspace samples were taken after 20, 40, and 60 min of incubation for ethylene quantification by a Nexis GC-2030 gas chromatograph (Shimadzu, Kyoto, Japan). After the incubation period, cells were pelleted at 4,225 × g for 10 min, washed once with 3 mL of phosphate buffer, and pelleted once more prior to storage at −80 °C. Total protein in cell lysates was quantified using the Pierce BCA Protein Assay Kit (Thermo Fisher Scientific, Cat. No. 23225) according to the manufacturer’s instructions on a CLARIOstar Plus plate reader (BMG Labtech, Ortenberg, Germany). The acetylene reduction rate for each replicate was calculated using an ethylene standard curve and normalized to total protein. Statistical significance was assessed by one-way ANOVA and post hoc Tukey’s Honestly Significant Difference test.
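The rate calculation can be illustrated with a short Python sketch. It assumes that ethylene amounts at the three sampling times have already been obtained from the standard curve and that the rate is taken as the least-squares slope over those time points, divided by total protein. The fitting approach, function names, units, and example numbers are assumptions for illustration, not values or procedures reported in the study.

```python
def least_squares_slope(x: list[float], y: list[float]) -> float:
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def acetylene_reduction_rate(times_min: list[float], ethylene_nmol: list[float], protein_mg: float) -> float:
    """Ethylene production rate in nmol per min per mg total protein."""
    return least_squares_slope(times_min, ethylene_nmol) / protein_mg


# Hypothetical replicate: headspace sampled at 20, 40, and 60 min.
rate = acetylene_reduction_rate([20, 40, 60], [100.0, 200.0, 300.0], protein_mg=2.5)
print(f"{rate:.2f} nmol C2H4 per min per mg protein")  # 2.00
```

Per-replicate rates computed in this way would then be the inputs to the one-way ANOVA and Tukey comparison described above.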

Molybdenum Concentration Analysis

To determine the amount of trace molybdenum in molybdenum-free Burk's medium, samples of different medium preparations were analyzed using inductively coupled plasma mass spectrometry (ICP-MS). BV growth medium was prepared in both a 500-mL plastic bottle and a 500-mL glass bottle. Only the latter was autoclaved following medium preparation. Additionally, an ultrapure water sample (Milli-Q purification system, Millipore Sigma, Burlington, MA, USA) was prepared to quantify trace molybdenum in water used to prepare growth media. Finally, an external calibration curve of sodium molybdate ranging from 0 to 96 ppb was prepared via serial dilution (supplementary fig. S15a, Supplementary Material online). Twenty-five-milliliter samples were acidified with 2% HNO3. Sample analysis was performed via high matrix introduction on an 8900 Triple Quadrupole ICP-MS (Agilent, Santa Clara, CA, USA). Molybdenum was quantified at mass shifts of 95 (Mo) and 127 in O2 gas (MoO2).
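Because the calibration range is given in ppb while the trace Mo limit in the Results is given in nM, a unit conversion is helpful when comparing the two. The sketch below assumes that ppb refers to µg of elemental Mo per liter of dilute aqueous solution (density of about 1 kg/L) and uses the standard atomic weight of Mo (95.95 g/mol); the function names are illustrative.

```python
MO_ATOMIC_WEIGHT = 95.95  # g/mol, standard atomic weight of molybdenum


def ppb_to_nM(ppb: float, molar_mass: float = MO_ATOMIC_WEIGHT) -> float:
    """Convert ppb (assumed equal to µg/L) to nanomolar."""
    return ppb * 1000.0 / molar_mass


def nM_to_ppb(nanomolar: float, molar_mass: float = MO_ATOMIC_WEIGHT) -> float:
    """Convert nanomolar to ppb (assumed equal to µg/L)."""
    return nanomolar * molar_mass / 1000.0


print(f"96 ppb Mo is about {ppb_to_nM(96):.0f} nM")
print(f"5 nM Mo is about {nM_to_ppb(5):.2f} ppb")
```

Under these assumptions, the top of the calibration range (96 ppb) corresponds to roughly 1 µM Mo, and a 5 nM limit corresponds to roughly 0.5 ppb.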

Acknowledgments

This research was supported by the Margarita Salas Postdoctoral Fellowship, funded by the European Union, Next Generation EU (B.C.Z.; UP2021-035), the Hypothesis Fund (B.K.), the National Aeronautics and Space Administration (NASA) Interdisciplinary Consortia for Astrobiology Research (ICAR) (80NSSC22K0546), the NASA Arizona Space Grant (B.M.C.), and the National Science Foundation Emerging Frontiers Award (B.K.; 2228495). We thank Steven J. Russell for assistance with plasmid construction, Morgan Sobol for assistance with OGT prediction, and Lance Seefeldt, Derek Harris, and Joanna Masel for valuable discussions and feedback.

Contributor Information

Bruno Cuevas-Zuviría, Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.

Amanda K Garcia, Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.

Alex J Rivier, Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.

Holly R Rucker, Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.

Brooke M Carruthers, Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.

Betül Kaçar, Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.

Conflict of Interest

The authors declare no conflict of interest.

Data Availability

The data underlying this article are available at https://github.com/kacarlab/nitrogenase-g-subunit or in the Supplementary Material available at Molecular Biology and Evolution online.

Articles from Molecular Biology and Evolution are provided here courtesy of Oxford University Press
