Abstract
Cyanobacterial CO2 concentrating mechanisms (CCMs) sequester a globally consequential proportion of carbon into the biosphere. Proteinaceous microcompartments, called carboxysomes, play a critical role in CCM function, housing two enzymes to enhance CO2 fixation: carbonic anhydrase (CA) and Rubisco. Despite its importance, our current understanding of the carboxysomal CAs found in α-cyanobacteria, CsoSCA, remains limited, particularly regarding the regulation of its activity. Here, we present a structural and biochemical study of CsoSCA from the cyanobacterium Cyanobium sp. PCC7001. Our results show that the Cyanobium CsoSCA is allosterically activated by the Rubisco substrate ribulose-1,5-bisphosphate and forms a hexameric trimer of dimers. Comprehensive phylogenetic and mutational analyses are consistent with this regulation appearing exclusively in cyanobacterial α-carboxysome CAs. These findings clarify the biologically relevant oligomeric state of α-carboxysomal CAs and advance our understanding of the regulation of photosynthesis in this globally dominant lineage.
The carbonic anhydrase essential for carbon fixation in α-cyanobacteria is regulated by the Rubisco substrate RuBP.
INTRODUCTION
A myriad of CO2 concentrating mechanisms (CCMs) have independently evolved to promote the rapid and efficient reduction of atmospheric CO2 into organic compounds. CCMs work to increase the local concentration of CO2 near ribulose-1,5-bisphosphate (RuBP) carboxylase/oxygenase (Rubisco), the primary carboxylase of the Calvin–Benson (CB) cycle, thereby increasing its substrate turnover and competitively inhibiting competing oxygenation reactions (1–4). These systems are an essential component of the global carbon cycle, catalyzing about half of global photosynthesis (2, 5). The bacterial CCM, found in all cyanobacteria and some autotrophic bacteria, consists of two key elements: First, energy-coupled inorganic carbon (Ci; primarily HCO3− and CO2) transporters actively establish a concentrated pool of HCO3− within a cytosol lacking free carbonic anhydrases (CAs); this HCO3− then diffuses into proteinaceous microcompartments called carboxysomes that house CA and Rubisco (2, 3). Here, the CA converts HCO3− to CO2 to elevate luminal CO2, promoting Rubisco-catalyzed CO2 reduction (6). Bacterial CCMs have arisen in two distinct lineages: β-carboxysomes are found exclusively in β-cyanobacteria, containing form IB Rubisco with component genes encoded by the ccm operon and satellite loci (7), whereas α-carboxysomes are found in photoautotrophic α-cyanobacteria and several bacterial chemoautotrophs and are distinguished by the presence of form IA Rubisco and the clustering of carboxysome-associated genes into a discrete cso carboxysome operon (8).
Regulation of carbon fixation is essential for effective energy production. Indeed, the CCM is a notable example of how cells may induce physiological changes in response to environmental conditions. This adaptive capacity is a critical feature of these processes, involving regulation at the transcriptional and protein level, allowing the bacterial CCM to competitively support life in a range of ecological contexts (9, 10). For example, CCM-related Ci transporters are regulated by gene expression and allosteric effectors (11–13), and carboxysome composition and morphology are responsive to environmental cues (14–17). Likewise, Rubisco content is transcriptionally regulated, and its activity is modulated by activases (18–21). However, little is known about how, or whether, the other enzymatic component of the carboxysome, the CA, is regulated (22).
The fundamental role of CAs in photosynthesis is well established (6). This versatile protein superfamily catalyzes the reversible hydration of CO2 [CO2 + H2O ⇌ HCO3− + H+] and comprises eight reportedly evolutionarily distinct classes (α, β, γ, δ, θ, η, ζ, and ι) distributed across the tree of life in a kingdom-nonspecific manner (23). In many cases, the enzyme directly supplies Rubisco with CO2, promoting its efficient reduction by ensuring reaction rate optimization through a constant, high concentration of the enzyme-substrate complex (6, 24). Controlling Rubisco activity through buffering a bicarbonate pool in this way optimizes carbon fixation, coordinating CO2 assimilation rates with the generation of nicotinamide adenine dinucleotide phosphate/adenosine triphosphate produced in light reactions (24). Correspondingly, microbial and biochemical studies have established an absolute requirement for CA activity within the carboxysome (25, 26). The α-carboxysome contains a highly divergent β-CA known as CsoSCA, characterization of which has occurred exclusively through the isoform from the chemoautotroph Halothiobacillus neapolitanus (27, 28). Indeed, structural and compositional studies have revealed sequence variation between cyanobacterial and proteobacterial carboxysome components and distinct carboxysome organization between these taxa (29–34). Given the differences in underlying metabolism between these photo- and chemoautotrophs, this has restricted our understanding of the CA-Rubisco feedback in α-cyanobacterial carboxysomes.
Here, we present a detailed biochemical, structural, and evolutionary analysis of a CsoSCA from a photoautotrophic cyanobacterium, Cyanobium sp. PCC7001 (Cyanobium), revealing previously unknown aspects of this isoform’s activity and molecular structure that form the basis for carboxysome regulation and organization. We found that, unlike the CA from the chemoautotrophic bacterium H. neapolitanus (HnCsoSCA), the Cyanobium isoform (CyCsoSCA) is regulated by the Rubisco substrate RuBP for activity, constituting a feedback loop at a key junction in α-cyanobacterial carbon metabolism. Detailed evolutionary analysis extends this, revealing that the sequence motifs for this regulation are not found in chemoautotrophic bacteria and expanding our understanding of α-cyanobacterial photosynthesis and CCM diversification more broadly.
RESULTS
Cyanobium CsoSCA requires RuBP for activity
Despite homology to the constitutively active HnCsoSCA (27), in our hands, CyCsoSCA did not show detectable HCO3− dehydration/CO2 hydration activity under standard assay conditions (14). This unexpected result indicated a potential additional requirement for CyCsoSCA function. Given the previous observation of Cyanobium carboxysome function in vitro (14) and the established reliance on CA activity (6, 25, 35), we assessed CyCsoSCA function under Rubisco assay conditions, where RuBP and Mg2+ are the key additional components. The addition of RuBP resulted in the concentration-dependent activation of CyCsoSCA, with a KM (Michaelis constant) for RuBP of 18 ± 2 μM, in a similar range to the Cyanobium Rubisco KM for RuBP (36 μM) (35). Comparatively, HnCsoSCA activity was unaffected by RuBP. Above 100 μM RuBP, CyCsoSCA activity rates match those recorded for HnCsoSCA (Fig. 1A). Notably, the CyCsoSCA RuBP response curve is best described by a sigmoidal Hill function (R2 = 0.988), typically indicative of an allosteric activation mechanism. To mitigate precipitation, the disordered N-terminal region was removed from these isoforms (fig. S1). Subsequent analyses were conducted with these truncated forms, unless stated.
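To illustrate what a sigmoidal Hill response of this kind looks like, the following is a minimal sketch rather than the fitting procedure used for Fig. 1A (the curves there were fitted in GraphPad Prism). It assumes that the reported value of 18 μM can be treated as the half-saturation constant of the Hill function. The Hill coefficient `HILL_N` is a hypothetical placeholder, because the fitted value is not stated in the text, and rates are expressed as a fraction of an arbitrary maximum.

```python
def hill_rate(rubp_um, v_max, k_half_um, hill_n):
    """Rate predicted by a Hill function at a given RuBP concentration (in uM)."""
    if rubp_um <= 0:
        return 0.0
    ratio = (rubp_um / k_half_um) ** hill_n
    return v_max * ratio / (1.0 + ratio)


K_HALF_UM = 18.0  # from the text: 18 +/- 2 uM, assumed here to be the half-saturation point
HILL_N = 2.0      # hypothetical Hill coefficient; the fitted value is not given in the text

# Print the fraction of maximal activity at a few RuBP concentrations (uM).
for conc_um in (5.0, 18.0, 36.0, 100.0):
    fraction = hill_rate(conc_um, 1.0, K_HALF_UM, HILL_N)
    print(f"{conc_um:6.1f} uM RuBP -> {fraction:.3f} of maximal rate")
```

Whatever Hill coefficient is chosen, the function returns half of the maximal rate when the RuBP concentration equals the half-saturation constant; a coefficient above 1 makes the rise around that point steeper than a hyperbolic Michaelis-Menten curve.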
Fig. 1. RuBP allosterically regulates CyCsoSCA.
(A) CyCsoSCA and HnCsoSCA activity as a function of RuBP concentration, measured by membrane inlet mass spectrometry (MIMS). Measurements reported are an average of three technical replicates, and error bars represent standard error. Curves were fitted using GraphPad Prism. (B) The homohexameric CyCsoSCA structure solved to 2.3 Å (PDB ID: 8THM). The “trimer of dimers” arrangement is shown with dimers colored pink, green, or blue; monomers are indicated with different shades. The black square denotes the apical structural zinc atoms within the globular NTD with further detail shown in the right inset. Waters in the octahedral coordination sphere are shown as red crosses. Zinc ions are shown as gray spheres, and the bicarbonate ion is shown in ball-and-stick representation. Bottom: 2mFo − dFc density at key interacting residues is shown (1 σ). Polder omit map (green) is shown at a contour level of 5.0 σ to highlight the bicarbonate ion density at the A/D/F apex. (C) The C/D dimer with a box highlighting the RuBP binding pocket of chain D. Right: The RuBP binding site is shown with the Polder omit map density (75) of RuBP overlaid at a contour level of 7.0 σ. Sulfate ions and RuBP are shown in ball-and-stick representation. Polar interactions are shown as dashed lines, RuBP interactions are shown in yellow, and other secondary/SO4 interactions are shown in gray. (D) The relative rate of CyCsoSCA, the K469D mutant of CyCsoSCA, and HnCsoSCA are shown over time as a proportion of the maximum recorded reaction rate for each respective wild-type enzyme. Time points at which the protein and 100 μM RuBP were added to the MIMS cuvette are indicated. Curve is the mean of three technical replicates with the shaded area indicative of standard (not visible; see table S4). All structural and biochemical data were collected with the truncated CyCsoSCA (fig. S1).
Cyanobium CsoSCA structure reveals RuBP binding site and oligomeric state
To understand the structural basis for RuBP activation, CsoSCA was co-crystallized with RuBP. Diffracting crystals were obtained almost exclusively in saturating levels of RuBP with the final CsoSCA crystal structure solved through molecular replacement to a resolution of 2.3 Å (table S1). The resulting structure showed a homohexameric trimer of dimers consistently arranged in the asymmetric unit with P212121 symmetry (Fig. 1B). Size exclusion chromatography (SEC) corroborates that both CyCsoSCA and HnCsoSCA are primarily hexameric in solution (fig. S13), contrasting with previous observations (27). While the CyCsoSCA dimer interface is highly reminiscent of HnCsoSCA (27), additional contacts at the N-terminal domain (NTD) of each monomer mediate further quaternary assembly, forming two apices of the final hexamer (Fig. 1B and fig. S1). For comparison, the average buried surface area for individual CyCsoSCA and HnCsoSCA dimers is 1497.4 and 1458.4 Å2, respectively. An additional 460.7 Å2 of buried surface area at the trimeric apices is seen in the CyCsoSCA complex solved here, with a total of 26,330 Å2 for the entire complex. A metal ion is evident at each apex, coordinated by a His3(H2O)3 octahedral coordination sphere in which each His155 is donated by a distinct monomer (Fig. 1B). This residue sits within a helical bundle in the NTD denoted here as the “hook motif.” The electron density and coordination geometry are consistent with a zinc ion. Density corresponding to a HCO3− ion was observed at one of the trimer apices. While the pH of crystallization conditions favors bicarbonate, this species was not present at saturating levels under the crystallization conditions, which could explain its absence at the opposing apex.
Density consistent with RuBP was observed in all monomers within a positively charged pocket near the dimer interface that extends into the protein core (Fig. 1C, figs. S3 and S11, and tables S2 and S3). While variations in omit map density at these positions were observed, given the consistency of this density across each chain, the dependence on RuBP for crystallization, and the observation that RuBP was an allosteric activator of CA (Fig. 1), we were confident in the modeling of RuBP at this site. Two sulfate ions, likely from the crystallization solvent, could be modeled with high confidence at the entrance of this site in all monomers. While slight variations in RuBP ligand conformation are evident in each chain, contacts at Arg266, Lys469, and Arg560 are consistently observed (figs. S3 and S4). Most notably, RuBP curls around Lys469, mediating multiple H-bonds with the ligand. To confirm that this region is responsible for RuBP binding, we mutated Lys469 to Asp, the amino acid at the corresponding position in the constitutively active HnCsoSCA isoform. This results in a biphasic activity profile with detectable CA activity evident in the absence of RuBP and a minor increase in activity upon addition of the ligand (Fig. 1D). This directly implicates K469 in the RuBP-mediated activation mechanism and further supports this region as the RuBP binding pocket.
RuBP regulation of Cyanobium CsoSCA is allosteric
Given the sigmoidal CyCsoSCA activation curve and RuBP binding site distinct from the active site (Fig. 1), we hypothesized that RuBP acts as an allosteric activator. The β-CA family is the only CA family known to exhibit allostery to date (36). Alignments between CyCsoSCA and a structure of a previously characterized type II β-CA bound to the allosteric bicarbonate ion show that RuBP engages distinct residues and sits further from the active site, indicating a distinct regulatory mechanism (fig. S6). The RuBP site overlays with the region of the CsoSCA C-terminal domain (CTD) that has lost the second symmetric catalytic zinc site seen in canonical β-CAs, likely following gene duplication and divergence of the catalytic domain (27). A structural analysis of the CyCsoSCAs and HnCsoSCAs was conducted to identify potential allosteric networks within the cyanobacterial variant. CyCsoSCA monomers align well with the canonical HnCsoSCA [37.2% sequence identity and Cα root mean square deviation (RMSD) of 1.5 Å; Fig. 2A], and the three domains (NTD, catalytic domain, and CTD) (27) are evident (Fig. 2A). The Cys2His(H2O) tetrahedral coordination of the catalytic zinc ion typical of β-CAs is maintained, and the overarching active site is highly homologous between each CsoSCA isoform. The Asp-Arg dyad between active-site residues Asp246 and Arg248 precludes the inactive Cys2HisAsp coordination sphere across all CyCsoSCA monomers, reinforcing the classification of CsoSCA as type I (27). This type of β-CA has not previously been associated with allostery. Manual inspection of H-bonds within the protein identified a network linking the catalytic Asp246 backbone and the Arg266 side chain that, in turn, binds RuBP, mediated by a water molecule and Leu249 backbone groups (Fig. 2B). 
In the corresponding region in the constitutively active HnCsoSCA, Lys179 (Ala250 in CyCsoSCA) occupies this space, coordinating a more extensive interaction network reinforced by multiple water molecules. Notably, Arg196 (Arg266 in CyCsoSCA) binds Asp409 (Lys469 in CyCsoSCA) identified above as a key determinant in RuBP-dependent activity (Fig. 2C).
Fig. 2. Active-site differences from HnCsoSCA are key to allosteric activation of CyCsoSCA.
(A) Structural alignment of CyCsoSCA and HnCsoSCA monomers excluding the disordered N-terminal tails (Cα RMSD 1.5 Å). The CyCsoSCA variant is colored by domain (white: NTD, green: catalytic domain, and dark gray: CTD). HnCsoSCA (PDB ID: 2FGY) is shown in orange. The box denotes the active site and RuBP binding site region. (B) A close-up view of the CyCsoSCA active site is shown with key residues in stick representation, colored as in (A). Corresponding HnCsoSCA residues (orange) are overlaid. All ligands (CO2, RuBP, and SO4) are shown in ball-and-stick representation. Catalytically relevant water molecules are shown as crosses (waters in black are those in the HnCsoSCA structure, and red indicates those in CyCsoSCA). Dashed lines indicate polar bonds (gray indicates those between HnCsoSCA molecules and yellow denotes CyCsoSCA interactions). Residue names are annotated according to the CyCsoSCA structure. (C) The proposed allosteric network in CyCsoSCA (top, green) and the corresponding region in HnCsoSCA aligned. Key residues for each structure are annotated in orange (HnCsoSCA) or green (CyCsoSCA). (D) PCA comparing Cartesian coordinates of the CyCsoSCA backbone in each MD simulation. Replicate simulations of CyCsoSCA with and without RuBP are shown in green (holoprotein) and gray (apoprotein), respectively. (E) The average difference in RMSF values of holoprotein and apoprotein simulations (Δ RMSF), where a negative value indicates a greater RMSF (and thus more mobile residue) in the holoprotein. Values are mapped onto the CyCsoSCA monomer. As the CyCsoSCA structure solved here was used for MD simulations, it excludes the 116-residue disordered N-terminal tail (fig. S1).
Molecular dynamics (MD) simulations of the CyCsoSCA structure (300-ns replicates) were conducted in the presence (holoprotein) and absence (apoprotein) of RuBP to assess conformational changes upon ligand binding and thereby evaluate the allosteric effect of RuBP. Analyses of replicate simulations are consistent with CyCsoSCA accessing different conformational landscapes when RuBP is present or absent. Principal components analysis (PCA) of the trajectories highlights differences between conformations sampled in apo- and holoprotein states, with holoprotein replicates converging on a distinct cluster as the simulations equilibrate (Fig. 2D and fig. S8). On average, residues across the structure had higher root mean square fluctuations (RMSFs) in the apoprotein trajectories, indicating that they are more mobile in the absence of RuBP (Fig. 2E). While it is difficult to ascribe a molecular mechanism with confidence, these results support a model in which RuBP stabilizes CyCsoSCA by establishing an internal H-bond network, promoting access to the active conformation.
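The per-residue comparison summarized as Δ RMSF can be sketched as follows. This is an illustrative calculation on toy coordinates, not the analysis pipeline used for Fig. 2E. It assumes that trajectory frames have already been superposed on a common reference and that each residue is represented by a single atom. The function names are hypothetical, and the sign convention (apoprotein minus holoprotein) is chosen to match the caption of Fig. 2E, in which a negative value indicates greater mobility in the holoprotein.

```python
import math


def rmsf_per_atom(frames):
    """Root mean square fluctuation of each atom about its mean position.

    frames: list of frames; each frame is a list of (x, y, z) tuples, one per atom.
    Frames are assumed to be superposed on a common reference already.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][d] for frame in frames) / n_frames for d in range(3)]
        # Mean squared displacement of atom i from that mean position.
        msd = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3)) for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


def delta_rmsf(apo_frames, holo_frames):
    """Apo minus holo RMSF per atom; negative means more mobile in the holoprotein."""
    apo = rmsf_per_atom(apo_frames)
    holo = rmsf_per_atom(holo_frames)
    return [a - h for a, h in zip(apo, holo)]


# Toy two-atom, two-frame trajectories (hypothetical coordinates in angstroms).
toy_apo = [[(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)], [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]]
toy_holo = [[(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)], [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]]
print(delta_rmsf(toy_apo, toy_holo))
```

In the toy data, the first atom moves only in the apoprotein trajectory, so its Δ RMSF is positive, while the static second atom gives zero.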
Sequence patterns indicate that allosteric CsoSCA is limited to cyanobacteria
We sought to investigate the prevalence of RuBP allostery within the broader CsoSCA protein family by mapping the sequence diversity of the family to CyCsoSCA functional variation. To examine CsoSCA divergence, maximum likelihood (ML) phylogeny was inferred from a curated sequence database of the CsoSCA Pfam (PF08936) (Fig. 3A and fig. S10). Cyanobacteria form a clear, tight cluster distinct from other bacterial species, supported by a high bootstrap value, as seen in related studies (30, 37, 38). We hypothesized that RuBP regulation may be specific to photoautotrophs, evolving as the cso operon adapted to these organisms’ light-dependent metabolic requirements relative to ancestral chemoautotrophic α-carboxysomes (38).
Fig. 3. CsoSCA sequence analysis and mutagenesis.
(A) Unrooted ML phylogeny of 504 CsoSCA sequences produced by IQ-Tree including 19 canonical β-CAs as an outgroup (annotated as β-CA). The cluster containing members associated with the cso operon is shown in detail. Tips are colored by taxonomy according to the legend. Tree scale refers to the number of substitutions per site. See the Supplementary Materials for complete tree annotation and documentation. Complete tree files and associated alignments provided as Supplementary data files. (B) Sites targeted for mutagenesis shown as purple spheres on the CyCsoSCA dimer. (C) Sequence logos based on α-cyanobacterial cso-associated CsoSCA sequences (Cyanobacteria) or other bacterial cso-associated CsoSCA sequences (Other) of key sites targeted for mutation. Residues colored by chemistry, and logos were generated using WebLogo3. Mutations are as represented above the logo. (D) Activity assays of targeted mutants relative to the maximum rate recorded for wild-type CyCsoSCA. HnCsoSCA activity is also shown for comparison as a proportion of its maximum recorded rate. HCO3− dehydration activity was recorded using MIMS after the addition of CsoSCA variants (+Protein) and upon addition of 100 μM RuBP (+RuBP). Three technical replicates were recorded for each variant, and the standard deviation is indicated by shading (error on some samples not visible due to scale; see table S4). All mutants created in the truncated CyCsoSCA wild-type background lacking the 116 residue N-terminal disordered region (fig. S1).
To test this, concurrent approaches based on rational design and directed evolution were used to discern key residues involved in RuBP regulation. The conservation of these positions was then assessed across the CsoSCA protein family (Fig. 3). Candidate residues for targeted mutagenesis were chosen through successive steps of analyzing the sequence and structure of CyCsoSCA and HnCsoSCA to locate residues with distinct biophysical properties near the RuBP pocket. Final mutations were made at sites that differed between the two characterized isoforms and had varying levels of conserved difference between cyanobacterial taxa and other bacterial CsoSCA isoforms more broadly (K469D, H472Q, I466D, and H436N). Manual sequence inspection revealed a loop region in HnCsoSCA (position in HnCsoSCA) that contained an insertion conserved across cyanobacterial species (position in CyCsoSCA) but absent or nonconserved in other proteins. Mutants were created in the CyCsoSCA background with either a deletion of this loop (loop deletion) or with the corresponding HnCsoSCA loop sequence substituted at this site (loop insertion). Alongside this approach, CyCsoSCA was also randomly mutagenized, facilitating a broader exploration of the sequence space involved in this activation mechanism.
Mutants were screened using an in-house CA knockout Escherichia coli strain (39) for variants with CA activity independent of RuBP. Activity assays of these mutants revealed that, in addition to K469D, an H472Q mutation (targeted approach) and T477A (random approach) also resulted in a biphasic activity profile with CA function independent of RuBP (Fig. 3D). Other mutations resulted in either reduced or undetectable CA activity, making their specific effects on RuBP dependence difficult to infer. Sequence-based analyses show that residues with apparent involvement in RuBP dependence are well conserved in α-cyanobacteria but absent or nonconserved in other taxa (Fig. 3C). The conservation of sites underpinning RuBP dependence is consistent with this regulation existing primarily, if not exclusively, in photoautotrophic CsoSCA variants.
The unique N-terminal oligomerization domain is exclusive to carboxysomal CAs
Further bioinformatics analysis of the CsoSCA protein family revealed an orphan cluster within this β-CA clade (Fig. 4). Manual inspection of the gene neighborhoods of these sequences demonstrates that they are not associated with the cso operon, appearing instead within other metabolic gene clusters, often associated with nicotinamide adenine dinucleotide or [Fe-Ni] hydrogenases or permeases (fig. S15). Structural modeling of “non-cso” sequences and subsequent structure-based searches using Foldseek (40) and DALI (41) shows that these non-cso sequences align preferentially to the published HnCsoSCA structure with high confidence relative to other β-CAs (table S5). These sequences appear to have lost the typical β-CA twofold symmetry, containing only one predicted zinc-binding site per pseudo-dimer, a defining feature of canonical CsoSCA sequences (Fig. 4). However, all non-cso sequences are notably shorter than carboxysome-associated variants. This is underlined by a consistent insertion in cso-associated sequences within the NTD, which encodes a hook-like bundle of α helices dubbed the “hook” motif (Fig. 4) shown here to facilitate structural zinc binding and oligomerization (Fig. 3). This is consistent with NTD presence and, thus, hexamer formation being a more recent adaptation unique to carboxysome-encapsulated variants of this family.
Fig. 4. The NTD is unique to carboxysome operon–associated CsoSCA homologs.
(A) ML tree of the CsoSCA Pfam (PF08936) and canonical β-CA with nodes colored by predicted mass as per the legend. (B) Structural alignment of Cα backbones of CyCsoSCA (NTD: gray, catalytic domain: green, and CTD: black) to an AF2-generated model of a candidate non-cso sequence (UniProt ID: A0A080M7C6; purple) with a close-up view of key active-site residues. The insertion exclusive to cso-associated sequences that encodes the NTD hook motif is annotated. (C) Sequence logo of alignments of either CsoSCA members associated with cso operons (cso) or non-cso sequences. The cso insertion encoding the NTD hook motif is annotated. Key catalytic residues are indicated with arrows, and zinc-binding residues are indicated with asterisks.
Carboxysome functional modeling indicates an adaptive advantage for RuBP-regulated CsoSCA
The results presented above support the hypothesis that CsoSCA RuBP dependence is a fixed trait unique to cyanobacterial α-carboxysome systems. To determine whether this feature emerged as an adaptive or neutral change in CsoSCA variants, we aimed to discern a functional benefit for RuBP regulation within the α-cyanobacterial system. An in vivo assessment of CsoSCA regulation is currently intractable, with no effective genetic transformation techniques reported to date. Instead, we modified our carboxysome steady-state diffusion model (35), incorporating kinetic data presented in Fig. 1A to compare the activity of a Cyanobium carboxysome with and without an RuBP-dependent CA (Fig. 5A). While second to physiological data, this approach permitted insights into any direct effects this regulation may have on core enzyme activity and metabolite flux of the Cyanobium α-carboxysome. No substantial changes in Rubisco carboxylation or oxygenation rates were observed between the standard model and one incorporating an RuBP-dependent CA (Fig. 5B and fig. S12). We note that modeling of the unmodified Cyanobium carboxysome here had a slightly alkaline carboxysome pH compared with the observation of an acidic carboxysome in our previous modeling (35). This is due to the use of Cyanobium Rubisco kinetics, the concentration of HCO3− tested here (20 mM), and a modification of the Rubisco active-site concentrations compared with previous modeling [a reduction from 10 to 5.7 mM to fit recent estimates (29)]. Applying RuBP dependence on CA activity in this model causes a shift to a more acidic carboxysomal lumen under low cellular RuBP levels. This implicates a regulated CA in maintaining an acidic carboxysome lumen under low-RuBP conditions in the Cyanobium system, likely experienced as light levels fluctuate throughout the diurnal cycle (14). 
Though the current model framework does not permit direct modeling of the effect of pH on carboxylation over time, previous work has shown an optimum for Rubisco activity (42) and a pH effect on net CO2 supply within carboxysomes (35, 43). Together, and in lieu of direct experimental data, the effect of RuBP-mediated CA modulation may contribute to CCM function by maintaining optimal pH conditions within the carboxysome for efficient CO2 fixation.
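Because the modeled effect is reported as a shift in carboxysomal pH, it can help to translate a pH difference into the corresponding change in H+ concentration. The sketch below uses only the definition pH = −log10[H+], under the assumption that activity can be approximated by molar concentration. The pH values are hypothetical and are not outputs of the carboxysome model.

```python
def proton_conc_molar(ph):
    """H+ concentration in mol/L for a given pH (activity taken as concentration)."""
    return 10.0 ** (-ph)


def proton_fold_change(ph_from, ph_to):
    """Fold change in H+ concentration when pH moves from ph_from to ph_to."""
    return proton_conc_molar(ph_to) / proton_conc_molar(ph_from)


# Hypothetical example: a lumen that acidifies by half a pH unit.
PH_UNMODIFIED = 8.0  # hypothetical value, not taken from the model
PH_MODIFIED = 7.5    # hypothetical value, not taken from the model
print(proton_fold_change(PH_UNMODIFIED, PH_MODIFIED))
```

Since the scale is logarithmic, a drop of one pH unit corresponds to a tenfold increase in H+ concentration, so even modest modeled shifts represent sizeable relative changes in the luminal proton pool.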
Fig. 5. Results of a reaction-diffusion model adapted to emulate Cyanobium α-carboxysome function with an RuBP-dependent CA (modified CA; green dots) or a constitutively active CA (unmodified CA; black dots).
(A) In the modified model, carboxysomal CA activity was altered based on data shown in Fig. 1A to be dependent on carboxysomal RuBP concentrations [Ext RuBP (mM)]. “CA flux” is indicative of CA activity, confirming RuBP dependence in the modified model. (B) Rubisco carboxylation turnover rates {Rubisco [kcatc (s−1)]} as a function of modeled cellular RuBP concentrations [Ext RuBP (mM)] in the modified and unmodified systems. (C) Modeled carboxysomal pH is plotted as a function of the modeled cellular RuBP content [Ext RuBP (mM)] and indicates a decrease in carboxysomal pH when CA function is allosterically controlled by RuBP, with potential to aid Rubisco function as described previously (35).
DISCUSSION
The α-carboxysomal CCM enables efficient Ci fixation across a diverse range of microorganisms, comprising a major component of the global biosphere (5, 44). Given the essential nature of CAs in this system (6), characterizing the CsoSCA variant present in photosynthetic organisms and understanding its evolutionary trajectory provide important insights into the emergence and function of bacterial CCMs. Data presented here are consistent with RuBP allosterically activating CyCsoSCA and the likely confinement of this property to α-cyanobacteria. A hexameric trimer of dimers quaternary state is also described, coordinated by NTD contacts with structural zinc ions that appear only in carboxysome-associated members of the CsoSCA protein family.
A distinct paradigm for CA allosteric regulation
Allosteric regulation of CAs is rare, particularly allosteric activation. The β-CA family is the only CA family known to exhibit allostery, with HCO3− acting as both a substrate and an inhibitor in type II members. Binding of HCO3− disrupts the “gatekeeper” Asp-Arg dyad, allowing an Asp-Zn bond to form that displaces the catalytic H2O and inhibits the enzyme (36, 45–47). Beyond RuBP activating rather than inhibiting the enzyme, its binding pocket presented here appears distinct from these previously characterized sites, engaging different residues and sitting further from the respective active sites (figs. S3 and S6). Indeed, the RuBP site sits near the defunct active site of the CTD within the CsoSCA pseudo-dimer, previously identified as a highly divergent catalytic domain that has lost key zinc-binding sites and catalytic loops (27). The duplication and divergence of domains, particularly at protein termini, are a common motif in protein evolution (48). While the CTD seemingly acts as a regulatory domain in the cyanobacterial CsoSCA, it is unclear whether it takes on alternative, more cryptic regulatory roles in RuBP-independent CsoSCA isoforms or indeed in the smaller uncharacterized non-cso isoforms.
We propose an allosteric activation mechanism in which RuBP binding establishes an internal H-bond network near the core of CyCsoSCA that has a broadly stabilizing effect on the protein; this, in turn, promotes access to the active conformation. RuBP binding engages Arg266 in H-bonding, establishing an H-bonding network that links to active-site loops (Fig. 3). The equivalent network in HnCsoSCA is more extensive, involving more residues, particularly negatively charged residues that are absent in the CyCsoSCA isoform. Specifically, in HnCsoSCA, the Arg266 equivalent (Arg196) is stabilized by analogous interactions with Asp409 (Lys469 in CyCsoSCA), thereby negating the ligand-binding requirement for activation. This proposed mechanism is also consistent with the observed biphasic activity profile of mutants, many of which introduce a relatively negative charge to this site, most notably K469D (Fig. 3). However, it remains difficult to rationalize the relatively large increase in RuBP-independent activity observed in the CyCsoSCA T477A mutant. While Thr residues are typically known to stabilize β sheets, given that this site is buried, it may be that this exchange reduces secondary structure strain and enhances protein core packing while disrupting the overarching protein fold less than other mutations (49). Consequently, T477A would result in increased apoprotein stability and a wild-type-like RuBP pocket, leading to high levels of ligand-independent activity and a notable increase with RuBP. Although the involvement of the Asp-Arg dyad destabilized by HCO3− does seem reasonable in the context of CyCsoSCA given its established role as an allosteric switch (36, 45), we could not establish this as the clear mechanism for allosteric signal propagation. 
Notably, carboxysomal CAs in both α- and β-carboxysomes exhibit various levels of redox regulation, all appearing to be inactivated by reducing conditions, typical of the cytoplasm, and activated upon oxidation, as expected to occur within the distinct carboxysomal lumen environment (22, 28, 50). Though a detailed redox characterization of CsoSCA is lacking in the literature, this regulation appears to be distinct from RuBP activation (fig. S5). Future efforts to elucidate this redox regulation will be required for a complete understanding of carboxysomal CAs.
An adaptive advantage for an RuBP-regulated CA in some photoautotrophs
The conservation of residues central to RuBP-dependent activity corresponds with the divergence of the α-cyanobacterial clade in the CsoSCA phylogeny (Fig. 3). This suggests that allosteric RuBP activation is either a neutral or adaptive change uniquely fixed in α-cyanobacteria relative to other α-carboxysomal taxa. RuBP allostery appears infrequently in the literature, primarily characterized in Rubisco-adjacent proteins such as the AAA+ red-type activase CbbX (51). In these cases, it is proposed to synchronize the activity of such proteins with Rubisco, hinting at an intricate regulatory cycle to ensure efficient Rubisco function. Indeed, modeling and experimental data have established a similar functional link between CA and Rubisco, demonstrating that Rubisco function in carboxysomes is intrinsically dependent on CA activity (6, 25, 35). This type of posttranslational regulation would enable more rapid responses to transient metabolic signals, directly synchronizing CB cycle fluxes with Rubisco-mediated carbon fixation.
We propose that RuBP-dependent allostery may be linked to the large fluctuations in cellular RuBP observed in photosynthetic α-cyanobacteria but absent in proteobacterial systems such as H. neapolitanus that are likely under more constant environmental substrate supply (14, 52). Using a carboxysome reaction-diffusion model, we previously identified that carboxysomes may require molecular mechanisms that modulate internal pH as RuBP concentrations vary to the detriment of Rubisco function (35). Modification of this model to allow for allosteric activation of CA in Cyanobium-like carboxysomes indicates that such regulation ameliorates this effect, resulting in a modulation of carboxysome pH without any change in Rubisco carboxylation (Fig. 5 and fig. S12). Thus, the emergence of RuBP regulation in these systems may have been prompted by a requirement for more fine-tuned control over carboxysomal H+ concentrations in photoautotrophic systems that are subject to more drastic cellular RuBP fluctuations across the diurnal cycle. That the protein can be switched from a constitutive to an autoinhibitory isoform by a single residue change highlights the evolutionary pliability of this protein, suggesting that allostery may have evolved from a constitutively active isoform through a relatively small sequence-level change in ancestral photoautotrophic hosts in the presence of new fitness pressures. This raises the question as to why β-carboxysomes, present exclusively in photosynthetic cyanobacteria, do not appear to have the same requirement (50, 53). In the absence of a model, computational or otherwise, that can explicitly measure this, we speculate that the larger size of β-carboxysomes relative to α-carboxysomes (~250 to 500 versus ~100 nm in diameter), and thereby their smaller surface area:volume ratio, may lead to diffusional differences between these structures (35, 38).
Specifically, the smaller α-carboxysomes likely experience greater diffusional rates and faster achievement of homeostasis. This may mean that β-carboxysomes do not need to respond rapidly to internal species fluctuations. An RuBP-regulated CA may offer a mechanism to enable rapid homeostasis in smaller compartments. Given this, we conclude that the emergence of an RuBP-dependent CA comprises a key molecular step in the adaptation of the α-carboxysome within cyanobacterial lineages.
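To make the surface area:volume argument concrete, the following minimal sketch compares the compartment diameters quoted above. It assumes, purely for illustration, that each carboxysome can be approximated as a sphere (real carboxysomes are polyhedral), in which case the ratio reduces to 6/d. The function name is hypothetical.

```python
import math


def surface_area_to_volume(diameter_nm: float) -> float:
    """Surface area:volume ratio (nm^-1) of a sphere of the given diameter.

    Spherical geometry is an assumption made only for illustration.
    """
    radius = diameter_nm / 2.0
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return area / volume


# Diameters quoted in the text: ~100 nm (alpha) and ~250 to 500 nm (beta).
alpha_ratio = surface_area_to_volume(100.0)
beta_ratios = [surface_area_to_volume(d) for d in (250.0, 500.0)]

# Fold difference in surface area:volume ratio, alpha relative to beta.
fold_difference = [alpha_ratio / b for b in beta_ratios]
```

Under this spherical approximation, the ~100-nm compartment has a 2.5- to 5-fold higher surface area:volume ratio than compartments of ~250 to 500 nm.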
The α-carboxysomal CA is hexameric in solution
The oligomeric state of enzyme cargo is an important detail of bacterial microcompartment systems, providing insights into cargo organization and interaction networks within the shell. Here, we show that CyCsoSCA and HnCsoSCA are both hexameric in solution, not dimeric as previously described (27), and that this quaternary structure is mediated by the globular NTD (Fig. 1 and figs. S1 and S2). This hexamer likely eluded detection in the HnCsoSCA structure solved previously due to an accidental mutation that introduced an artificial second zinc site within the globular NTD of each monomer (27). This would have inhibited the structural zinc interactions that we observe as essential for the coordination of dimers within the hexameric complex (Fig. 1 and fig. S13). It is interesting that the previous study did not report any difficulty in purifying this HnCsoSCA mutant and expressed the complete disordered N-terminal region without serious aggregation, in contrast to our and others’ accounts of wild-type isoforms (30) (fig. S1). It may be that hexamerization increases the propensity of CsoSCA to precipitate by enhancing the local density of disordered N-terminal tails that may self-interact, resulting in aggregates at high concentrations, or perhaps phase separation as has been reported between interaction partners in the system (30, 54). Given the stability and dominance of hexamers in vitro (fig. S13), it is plausible that this conformation, or perhaps a mix of hexameric and dimeric forms, exists in native carboxysomes. Hopefully, the identification and characterization of the hexameric CsoSCA in this work will aid future efforts to elucidate the internal architecture of α-carboxysomes in vivo.
CsoSCA is now known to interact with form IA Rubisco through its N-terminal disordered tail in a manner reminiscent of the α-carboxysome structural protein CsoS2 (30, 50). Though the conservation of this binding motif is limited, the N-terminal disordered tail likely contains only a single binding motif. By comparison, CsoS2 contains multiple Rubisco binding motifs, thereby enhancing the valency and strength of the interaction. Considering this, a hexameric assembly would enhance the local concentration of Rubisco interaction motifs and, thus, the multivalency of the system to promote CsoSCA-Rubisco binding. In this way, the emergence of the NTD may constitute a molecular marker for the early association of an ancestral CsoSCA-Rubisco complex, hypothesized as a likely carboxysome evolutionary precursor (35, 55).
As we begin to resolve the evolutionary trajectories of bacterial CCMs, a detailed understanding of structural variation of core system components and how this relates to function will be essential for resolving plausible evolutionary routes. We have identified a distinct divergence between α-cyanobacteria and other α-carboxysome taxa, presenting a paradigm for CO2 fixation in photoautotrophic α-cyanobacteria that hinges on the regulation of CsoSCA by RuBP. These results highlight the intrinsic role of CAs in photosynthesis, comprising an evolutionarily pliable enzyme at the bottleneck of key reactions in the carbon metabolism of many diverse organisms. This aspect of carboxysome function and evolution must be considered in future biotechnological applications that seek to adapt such systems, particularly efforts to transplant the bacterial CCM into photoautotrophic crop species.
MATERIALS AND METHODS
CsoSCA expression constructs
In initial expression tests, we found that the N-terminal disordered region (fig. S1) causes severe aggregation, inhibiting expression and purification; it was therefore removed in all CsoSCA isoforms discussed in this investigation. This propensity to aggregate was also noted in previous work, in the supplemental methods addendum (30). The wild-type HnCsoSCA (O85042 with first 49 residues removed) and CyCsoSCA [WP_071778263.1 with first 116 residues removed (fig. S1)] were codon-optimized for expression in E. coli and cloned into the pHue expression system (56). These N-terminal truncations removed the nonconserved disordered N-terminal region of these proteins. While these disordered regions are known to be essential for CsoSCA encapsulation during carboxysome biogenesis, they are assumed not to have a role in CA activity or enzyme function (30, 57). Increased severity of aggregation in our work and in previous works relative to the original HnCsoSCA study may be due to the inadvertent mutation in that original study that prohibits hexamer formation (fig. S11) (27, 30). The presence of these larger quaternary complexes would increase the local concentration of disordered regions and perhaps lead to greater condensation/aggregation in solution (fig. S2). Codon-optimized genes encoding CyCsoSCA mutants were synthesized through Twist Bioscience and cloned into the pHue vector using standard Gibson assembly methods (58). Primers used are listed in table S5.
CsoSCA random mutagenesis
To create random CsoSCA mutants with RuBP-independent activity profiles, random base changes were introduced using error-prone polymerase chain reaction (epPCR), and the resulting libraries were screened for desired activity profiles. The GeneMorph II Kit (Agilent, catalog no. 200550), containing the Mutazyme II DNA polymerase blend, was used in the epPCR reaction to generate mutant libraries using conditions specified in the manual for medium rates of mutagenesis. Primers were designed to enable a one-pot cloning reaction of resulting PCR fragments into a pET16 expression vector (table S6). The resulting one-pot reaction mix was dialyzed using the MF-Millipore 0.25 μm MCE Membrane (Merck, Ireland) to remove salts and transformed into dense aliquots of electrocompetent DH5α E. coli cells. The library was plated and grown overnight at 22°C until pinprick colonies were visible. Plates were then washed with 5 ml of lysogeny broth (LB) to recover colonies, and plasmids were extracted using the QIAprep Spin Miniprep Kit (Qiagen, catalog no. 27104). The “DCAKO” cell line is an E. coli strain, created in the Price lab, in which both native CA genes have been knocked out. Consequently, these cells cannot grow under ambient conditions. Five microliters of this library was transformed into dense aliquots of electrocompetent DCAKO E. coli cells to screen the mutant library for members with the desired activity profile (activity independent of RuBP). Transformed cells were grown on triplicate plates at 37°C/ambient or 37°C/4% CO2 conditions. Colonies that appeared on plates grown under ambient conditions were regrown in liquid cultures at 37°C/4% CO2, and plasmids were extracted for sequencing. Resulting mutant csoSCA sequences were analyzed using the Geneious Prime software package. From the sequenced mutants that permitted DCAKO growth under atmospheric conditions, a representative pool was chosen for crude activity assays to confirm CsoSCA activity independent of RuBP.
Three substitutions (H519Q, T477A, and N278D) consistently corresponded to RuBP-independent CA activity, and all had notably different chemistry to the analogous sites in HnCsoSCA. However, in the mutant library, these substitutions were always present within a small background of other site changes. To assess the effect of these substitutions directly, single-site mutants were generated using custom primers and cloned into the pHue expression system (table S6). The N278D mutant proved very difficult to clone and consistently precipitated during expression, leading us to exclude it from further analysis.
Protein expression and purification
The USP2-pHue expression system was used for heterologous expression of CsoSCAs (56). NEB T7 Express pLysY chemically competent E. coli cells were transformed with a pHue plasmid containing the corresponding CsoSCA sequence. Single colony glycerol stocks (1:1 ratio of 40% glycerol and cell culture) were used to inoculate 3-ml cultures in LB media supplemented with ampicillin (100 μg/ml; LB + Amp), which were grown overnight at 37°C with shaking at 200 rpm. From these overnight cultures, 1 ml was used to inoculate 25 ml of fresh LB + Amp, which, in turn, was incubated at 37°C for 6 hours, and 10 ml of this was then used to inoculate 500 ml of fresh LB + Amp. This was grown at 37°C for 2 hours, followed by induction with 100 μM isopropyl-β-d-thiogalactopyranoside and further incubation overnight at 28°C. Cell cultures were pelleted and stored at −80°C until required.
Cells were thawed, resuspended in binding buffer [50 mM tris (pH 7.8), 300 mM NaCl, and 25 mM imidazole], and incubated at room temperature with shaking for 30 min with 10% rLysozyme (EMD Millipore Corp, USA) and 0.2 μl of Turbonuclease (Sigma-Aldrich). Cells were lysed with three passes of the Emulsiflex (Avestin, USA). Lysate was clarified by centrifugation (16,000g for 30 min at 4°C), and the soluble fraction was passed through a 0.44-μm syringe filter before application to a pre-equilibrated 5-ml HisTrap HP column (GE Healthcare). The column was washed with 50 ml of binding buffer, and the target protein was eluted in elution buffer [50 mM tris (pH 7.8), 300 mM NaCl, and 500 mM imidazole]. Fractions containing protein were pooled and concentrated to a maximum of 5 ml using a centrifugal filter (Amicon Ultra-15 Centrifugal Filter Unit) and buffer-exchanged using a PD-10 column (GE Healthcare, lot 9760001) into size exclusion buffer [SEC buffer; 50 mM tris (pH 7.8) and 300 mM NaCl]. Samples were incubated with prepurified USP2 at a 1:10 protein-of-interest:USP2 molar ratio, with rocking, at 4°C for at least 12 hours with 2 μl of β-mercaptoethanol. Samples were then passed over clean HisTrap columns equilibrated with 50 ml of SEC buffer to remove protease. Flow-through was pooled and concentrated by centrifugation as above and further purified by SEC on a HiLoad 26/600 Superdex 200 Column (GE Healthcare), eluting in SEC buffer. Protein purity was confirmed by SDS–polyacrylamide gel electrophoresis, and protein concentrations were determined spectrophotometrically by measuring A280 (absorbance at 280 nm) using a NanoDrop One (Thermo Fisher Scientific) and molar absorption coefficients generated by ProtParam (http://expasy.org/tools/protparam.html) (59).
CA activity assay
All CA activity was measured using the MIMS technique, and relevant equations were adapted for pure protein samples (26). This is based on measuring the loss of 18O from a labeled Ci source to water through the CA-catalyzed hydration and dehydration of CO2 and HCO3−, respectively. Briefly, NaH13CO3 is incubated for 24 hours with H218O, creating a 13C-labeled carbon source enriched with 18O. The change in the 18O enrichment of CO2 species is measured once chemical equilibrium has been reached, both before and after the addition of enzyme. The 18O atom fraction (the atom % enrichment) was calculated at each point by summing the related carbon species determined by mass spectrometry. CA activity is reported as the rate of decline of the atom % enrichment, calculated from the slope of the change in log(atom % enrichment) over time (LogEnrich min−1). This gives a measure of the first-order rate constant. As this refers to the decline of 18O, this rate becomes increasingly negative with CA activity. For simplicity, we display the absolute value here. All assays were conducted with 0.1 μM protein and 2.7 mM heavy isotope bicarbonate (NaH13C18O3) in CA buffer [50 mM EPPS-NaOH and 20 mM MgCl2 (pH 7.8)] in a final cuvette volume of 600 μl. CsoSCA has previously been shown to be inactivated by reducing agents, indicative of an oxidative activation mechanism as yet unknown (28). Enzymes were expressed and purified under atmospheric conditions, and we assumed them to be largely oxidized. A 30 μM 5,5′-dithiobis-(2-nitrobenzoic acid) (DTNB) solution was added to reaction mixes before measurement to avoid the formation of unwanted disulfides with other proteins in clarified lysates. Figure S5 demonstrates that this addition does not alter CA activity or the response to RuBP. The rate of consumption of each measured isotope was determined at equilibrium before and after protein was added to the system and upon addition of RuBP when relevant.
Activity measurements for CyCsoSCA mutants were conducted as above, all in the presence of 100 μM RuBP.
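The rate calculation described above can be sketched as follows. This is a minimal illustration, not the adapted equations of (26): it assumes that the 13C-labeled CO2 isotopologues are read at m/z 45 (13C16O2), 47 (13C16O18O), and 49 (13C18O2), that the atom fraction is the fraction of oxygen positions occupied by 18O, and that the logarithm is base 10. The function names and the synthetic data are hypothetical.

```python
import math


def atom_fraction_18o(m45: float, m47: float, m49: float) -> float:
    """Fraction of oxygen atoms in 13CO2 that are 18O.

    Assumes signals at m/z 45, 47, and 49 correspond to 13CO2 carrying
    zero, one, and two 18O atoms, respectively.
    """
    return (m47 + 2.0 * m49) / (2.0 * (m45 + m47 + m49))


def log_enrichment_slope(times_min, atom_percent):
    """Least-squares slope of log10(atom % enrichment) versus time (min^-1)."""
    ys = [math.log10(p) for p in atom_percent]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return sxy / sxx


# Synthetic first-order decay of enrichment (illustrative values only).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
true_slope = -0.05
enrichment = [20.0 * 10 ** (true_slope * t) for t in times]

# Activity is reported as the absolute value of the (negative) slope.
rate = abs(log_enrichment_slope(times, enrichment))
```

In practice, the slope measured before enzyme addition would serve as the uncatalyzed background against which the slopes after addition of enzyme and RuBP are compared.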
Crystallization, data collection, and structure determination
Initial screening to determine crystallization conditions was performed at CyCsoSCA concentrations of 10 mg/ml at 20/18°C in 96-well plates using the sitting drop vapor diffusion method and commercially available sparse matrix screens [ShotGun SG1 (Molecular Dimensions), PegIon (Hampton), Crystal Screen HT (Hampton), JCSG (Molecular Dimensions), PACT Premier (Molecular Dimensions), Index HT (Hampton), PegRx (Hampton), and Salt Rx (Hampton)]. In each case, 200-nl drops comprising 100 nl of protein solution and 100 nl of reservoir were prepared on hanging-drop seals using an NT8-Drop Setter robot (Formulatrix, Bedford, MA, USA) and equilibrated against 100 μl of reservoir solution. Further optimization was carried out in a 24-well hanging-drop vapor diffusion plate format under varying pH, precipitant, and protein concentration conditions. The final optimized crystallization conditions were 0.2 M ammonium sulfate, 0.1 M bis-tris (pH 6.5), 21% polyethylene glycol 3350, and 15% ethylene glycol, seeded with crystals from the same conditions at a protein concentration of 7 mg/ml and an RuBP:CyCsoSCA molar ratio of 10:1. Crystals were cryoprotected with glycerol and flash-cooled in liquid nitrogen. X-ray diffraction data were collected at beamline MX2 at the Australian Synchrotron. Data were processed using XDS and Aimless, and molecular replacement was performed using Phaser. A model of CyCsoSCA for molecular replacement was generated using the ColabFold implementation of AlphaFold2 with MMseqs2 in a Google Colab notebook (60, 61). Iterative cycles of manual model building and refinement were performed using Coot (62), CCP4 (63), and phenix.refine (64). TLS (Translation/Libration/Screw) parameter refinement was also used with TLS groups automatically selected by phenix.refine. Data collection and refinement statistics are provided in table S1. Structures were subsequently annotated and analyzed in PyMOL and PDBePISA (https://ebi.ac.uk/pdbe/pisa/).
Molecular dynamics
The structure of CyCsoSCA chain D reported here, and thus the truncated CyCsoSCA (fig. S1), was used for all simulations. For holoprotein preparation, chain D was used as is; apoprotein starting structures were generated by deleting RuBP from the starting file. Automated structure refinement was conducted with the protein preparation wizard module of Maestro (Schrodinger, version 12.9.137 release 2021-3). This comprised assignment of bond order, addition of missing hydrogens, and generation of het states using the Epik program at a target pH of 7.4. The hydrogen bond network was subsequently optimized with the default parameters, and protonation states were assigned to residues at a pH of 7.4 using PROPKA. System builder was used to set up an orthorhombic box filled with TIP5P water models, the charge was neutralized with 18 Na+ ions, a buffer distance of 15 Å was used, and the OPLS4 force field was applied. A single run of the holoprotein and apoprotein with water model TIP4P was conducted for comparison (fig. S9). A short minimization of 1 ns was performed on the resultant structure. Simulations were performed for a total time of 300 ns at a time step of 50 ps in three replicates for each system, each with a different initial random seed. Resultant trajectories were analyzed using the simulation interaction diagram module in Schrodinger and the MDTraj Python package (65).
Phylogenetic reconstruction
All CsoSCA sequences within the protein family (PF08936) were harvested from the Pfam database, including the primary query sequence WP_071778263.1. Duplicates were removed, and a local all-versus-all BLAST search (66) was conducted to generate a sequence similarity network of the sequences with an e value cutoff of 10e−10. This was used to identify classes within the family, and using incremental percent identity edge cutoffs in Cytoscape (version 3.9.1), clusters within these classes were examined. An additional 19 reviewed sequences of known β-CAs were appended to this database for comparison with the non-cso clade. Sequences without key catalytic sites and outliers were manually removed, and the remaining sequences were filtered using CD-HIT to remove redundancy to 90%, resulting in a final database of 504 sequences (19 reviewed β-CAs and 485 CsoSCA protein family sequences) (67); details of these, including accessions, are documented in the Supplementary Materials. A multiple sequence alignment of these was constructed using MAFFT-LINSI (68). Columns with >90% gaps were removed using trimAl through the Phylemon2 web server (http://phylemon2.bioinfo.cipf.es/) (69). Given the lack of conservation in the disordered N-terminal tail, there was little phylogenetic signal in this portion of the alignment; thus, it was removed during this editing process. This edited alignment was submitted to IQ-TREE (http://iqtree.cibiv.univie.ac.at/) for phylogenetic inference via ML (70, 71). The ML sequence evolution model (WAG+F+I+G4) was selected as implemented in ModelFinder. Branch supports were measured by ultrafast bootstrap approximation and approximate likelihood ratio test, each conducted to 1000 replicates in IQ-TREE. Five independent replicates of tree search were conducted, one of which was selected due to high branch supports at key bifurcations. Trees were visualized and edited using iTOL (https://itol.embl.de/) (72).
Sequence logos of alignments presented in the text were generated through WebLogo3 (https://weblogo.threeplusone.com/create.cgi) (73). The genetic context of non-cso sequences was visualized and manually inspected through JGI IMG (https://img.jgi.doe.gov/) (74).
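The gap-filtering step applied to the alignment (removal of columns with >90% gaps) can be illustrated with the short sketch below. It is not the trimAl implementation; it is a plain-Python stand-in that assumes the alignment is held as a dictionary of equal-length strings with "-" as the gap character. The function name and the toy alignment are hypothetical.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.9):
    """Return a copy of the alignment without columns exceeding the gap limit.

    alignment: dict mapping sequence name -> aligned sequence ("-" = gap).
    A column is removed only when its gap fraction is strictly greater
    than max_gap_fraction.
    """
    if not alignment:
        return {}
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    n = len(names)
    keep = [
        i
        for i in range(length)
        if sum(alignment[name][i] == "-" for name in names) / n <= max_gap_fraction
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment of ten sequences: column 2 holds exactly 90% gaps (kept),
# column 3 holds 100% gaps (removed).
toy = {"seq0": "MK-A"}
toy.update({f"seq{i}": "M--A" for i in range(1, 10)})
trimmed = drop_gappy_columns(toy)
```

Because the threshold is a strict inequality, a column with exactly 90% gaps is retained, matching the ">90% gaps" wording above.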
COPASI modeling
Modeling of carboxysome function with and without RuBP-dependent CA activity was carried out using the COPASI biochemical modeling simulator as described previously (35). Modeling was carried out for a single Cyanobium carboxysome of 100 nm diameter, containing a Rubisco active-site concentration of 5.68 mM based on the average number of Rubisco holoenzymes per Cyanobium carboxysome previously reported at 224 (29) and the Rubisco catalytic parameters reported previously (kcatC = 9.4 s−1, KMCO2 = 169 μM, KMO2 = 1.4 mM, kcatO = 1.42 s−1, and KMRuBP = 40 μM) (29, 35). Modeling was carried out at 20 mM HCO3−, pH 8.0, and atmospheric O2 concentrations over an exponential range of RuBP concentrations from 0.1 μM to 5 mM, spanning the apparent KM for CA activation (18 μM). Permeabilities were set to simulate a carboxysome, with CA activity confined to the Rubisco compartment (35). The model was run with either unmodified CA function or modified to be dependent on RuBP concentrations to replicate the observed RuBP response in Fig. 1. Specifically, the model modifies CA activity as a function of the deprotonated, and most abundant, form of RuBP within the carboxysome (RuBP4−).
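As a consistency check on the Rubisco active-site concentration quoted above, the following sketch recomputes it from the holoenzyme count and carboxysome diameter. It assumes a spherical compartment of 100 nm diameter and eight active sites per form I (L8S8) holoenzyme; the spherical geometry and the function name are assumptions made for illustration.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def active_site_concentration_mM(holoenzymes, diameter_nm, sites_per_holoenzyme=8):
    """Active-site concentration (mM) inside a spherical compartment."""
    radius_m = diameter_nm * 1e-9 / 2.0
    # Sphere volume in m^3, converted to litres (1 m^3 = 1000 L).
    volume_l = (4.0 / 3.0) * math.pi * radius_m ** 3 * 1000.0
    moles = holoenzymes * sites_per_holoenzyme / AVOGADRO
    return moles / volume_l * 1000.0  # mol/L -> mmol/L


# 224 Rubisco holoenzymes in a 100-nm-diameter carboxysome.
concentration = active_site_concentration_mM(224, 100.0)
```

Under these assumptions, the result rounds to the 5.68 mM used in the model.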
Rate constant for the unmodified CA forward reaction (CO2 + H2O → HCO3− + H+)
Rate constant for the unmodified CA backward reaction (HCO3− + H+ → CO2 + H2O)
where the carboxysomal CA factor is set to 100,000, k1 is 0.05, k2 is 100, and [CO2]c, [HCO3−]c, and [H+]c are the carboxysomal concentrations of CO2, HCO3−, and protons, respectively (35). Where CA function is not present in the model (in the external and unstirred compartments), CA factor is set to 1 (1 × the background rate of interconversion).
Carboxysomal CA function within the model was modified to be dependent on RuBP concentrations by multiplying each of the forward and backward rate constants by the following function
where [RuBP]c is the carboxysomal concentration of RuBP4−, h is the Hill slope (set to 2.214), and Khalf is the apparent KMRuBP required to achieve half-maximal CA activity (18 μM). Both h and Khalf were determined through Hill-reaction curve fitting of the RuBP response curve of Cyanobium CA in Fig. 1 using GraphPad Prism.
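A minimal sketch of the RuBP-dependent multiplier is given below. It assumes the standard Hill form, f = [RuBP]^h / (Khalf^h + [RuBP]^h), with the fitted values h = 2.214 and Khalf = 18 μM; the exact functional form used in the COPASI model, the function names, and the number of points in the RuBP series are assumptions made for illustration. The multiplier is applied here to the carboxysomal CA factor as a simple scalar, which is equivalent to scaling the forward and backward rate constants equally only if both are proportional to that factor.

```python
def hill_activation(rubp_uM, k_half_uM=18.0, hill_slope=2.214):
    """Fractional CA activation (0 to 1) at a given RuBP concentration (uM).

    Standard Hill form; assumed here, not taken from the model file.
    """
    if rubp_uM < 0:
        raise ValueError("concentration must be non-negative")
    numerator = rubp_uM ** hill_slope
    return numerator / (k_half_uM ** hill_slope + numerator)


def rubp_series(start_uM=0.1, stop_uM=5000.0, points=10):
    """Exponentially spaced RuBP concentrations from 0.1 uM to 5 mM."""
    ratio = (stop_uM / start_uM) ** (1.0 / (points - 1))
    return [start_uM * ratio ** i for i in range(points)]


CA_FACTOR = 100_000.0  # carboxysomal CA factor stated in the text

# Effective CA factor at each RuBP concentration under the assumed Hill form.
series = rubp_series()
effective_ca_factor = [CA_FACTOR * hill_activation(r) for r in series]
```

With these parameters, the multiplier is 0.5 at 18 μM RuBP and approaches 1 at the upper end of the modeled range.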
Acknowledgments
We acknowledge the use of the Australian Synchrotron MX facility and thank the staff for their support. This research was undertaken in part using the MX2 beamline at the Australian Synchrotron, part of ANSTO, and made use of the Australian Cancer Research Foundation (ACRF) detector.
Funding: Funding for some salary components (B.F., S.B.P., and B.M.L.), materials, DNA constructs, and sequencing work was supported by a grant from the Australian Research Council Centre of Excellence for Translational Photosynthesis (CE140100015) to G.D.P. C.J.J. and S.B.P. also acknowledge support from the Australian Research Council Centre of Excellence in Synthetic Biology (project number CE200100029). S.B.P. acknowledges stipend funding from the Westpac Scholars Trust.
Author contributions: Conceptualization, experimental design, data collection and analysis, writing, and editing: S.B.P. and B.M.L. Data collection and analysis: M.A.O. Data collection: B.F. Data collection and analysis: S.J.W. Data collection and analysis, writing and editing, and funding: C.J.J. Conceptualization and experimental design: M.R.B. Conceptualization, experimental design, editing, and funding: G.D.P. Experimental design and data collection: T.R.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. The DCAKO E. coli strain with native CAs knocked out can be provided by G.D.P. pending scientific review. Requests for the strain should be submitted to dean.price@anu.edu.au. The Cyanobium CsoSCA structure is available at the PDB (PDB ID: 8THM). MD trajectory data are available at https://doi.org/10.5281/zenodo.8227167.
Supplementary Materials
This PDF file includes:
Supplementary Materials and Methods
Tables S1 to S6
Figs. S1 to S15
Legends for data S1 to S3
References
Other Supplementary Material for this manuscript includes the following:
Data files S1 to S3
REFERENCES AND NOTES
- 1. Raven J. A., Cockell C. S., De La Rocha C. L., The evolution of inorganic carbon concentrating mechanisms in photosynthesis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 363, 2641–2650 (2008).
- 2. Badger M. R., Price G. D., The CO2 concentrating mechanism in cyanobacteria and microalgae. Physiol. Plant. 84, 606–615 (1992).
- 3. Badger M. R., Hanson D., Price G. D., Evolution and diversity of CO2 concentrating mechanisms in cyanobacteria. Funct. Plant Biol. 29, 161–173 (2002).
- 4. Tcherkez G. G. B., Farquhar G. D., Andrews T. J., Despite slow catalysis and confused substrate specificity, all ribulose bisphosphate carboxylases may be nearly perfectly optimized. Proc. Natl. Acad. Sci. 103, 7246–7251 (2006).
- 5. Cabello-Yeves P. J., Scanlan D. J., Callieri C., Picazo A., Schallenberg L., Huber P., Roda-Garcia J. J., Bartosiewicz M., Belykh O. I., Tikhonova I. V., Orcello-Requena A., De Prado P. M., Millard A. D., Camacho A., Rodriguez-Valera F., Puxty R. J., α-Cyanobacteria possessing form IA RuBisCO globally dominate aquatic habitats. ISME J. 16, 2421–2432 (2022).
- 6. Badger M., The roles of carbonic anhydrases in photosynthetic CO2 concentrating mechanisms. Photosynth. Res. 77, 83–94 (2003).
- 7. Sommer M., Cai F., Melnicki M., Kerfeld C. A., β-Carboxysome bioinformatics: Identification and evolution of new bacterial microcompartment protein gene classes and core locus constraints. J. Exp. Bot. 68, 3841–3855 (2017).
- 8. Rae B. D., Long B. M., Badger M. R., Price G. D., Functions, compositions, and evolution of the two types of carboxysomes: Polyhedral microcompartments that facilitate CO2 fixation in cyanobacteria and some proteobacteria. Microbiol. Mol. Biol. Rev. 77, 357–379 (2013).
- 9. Badger M. R., Price G. D., Long B. M., Woodger F. J., The environmental plasticity and ecological genomics of the cyanobacterial CO2 concentrating mechanism. J. Exp. Bot. 57, 249–265 (2006).
- 10. Raven J. A., Beardall J., CO2 concentrating mechanisms and environmental change. Aquat. Bot. 118, 24–37 (2014).
- 11. Fang S., Huang X., Zhang X., Zhang M., Hao Y., Guo H., Liu L.-N., Yu F., Zhang P., Molecular mechanism underlying transport and allosteric inhibition of bicarbonate transporter SbtA. Proc. Natl. Acad. Sci. U.S.A. 118, e2101632118 (2021).
- 12. Kaczmarski J. A., Hong N.-S., Mukherjee B., Wey L. T., Rourke L., Förster B., Peat T. S., Price G. D., Jackson C. J., Structural basis for the allosteric regulation of the SbtA bicarbonate transporter by the PII-like protein, SbtB, from Cyanobium sp. PCC7001. Biochemistry 58, 5030–5039 (2019).
- 13. Woodger F. J., Badger M. R., Price G. D., Inorganic carbon limitation induces transcripts encoding components of the CO2-concentrating mechanism in Synechococcus sp. PCC7942 through a redox-independent pathway. Plant Physiol. 133, 2069–2080 (2003).
- 14. Whitehead L., Long B. M., Price G. D., Badger M. R., Comparing the in vivo function of α-carboxysomes and β-carboxysomes in two model cyanobacteria. Plant Physiol. 165, 398–411 (2014).
- 15. Sun Y., Casella S., Fang Y., Huang F., Faulkner M., Barrett S., Liu L.-N., Light modulates the biosynthesis and organization of cyanobacterial carbon fixation machinery through photosynthetic electron flow. Plant Physiol. 171, 530–541 (2016).
- 16. Rohnke B. A., Singh S. P., Pattanaik B., Montgomery B. L., RcaE-dependent regulation of carboxysome structural proteins has a central role in environmental determination of carboxysome morphology and abundance in Fremyella diplosiphon. mSphere 3, e00617–e00617 (2018).
- 17. Sun Y., Wollman A. J. M., Huang F., Leake M. C., Liu L.-N., Single-organelle quantification reveals stoichiometric and structural variability of carboxysomes dependent on the environment. Plant Cell 31, 1648–1664 (2019).
- 18. Harano K., Ishida H., Kittaka R., Kojima K., Inoue N., Tsukamoto M., Satoh R., Himeno M., Iwaki T., Wadano A., Regulation of the expression of ribulose-1,5-bisphosphate carboxylase/oxygenase (EC 4.1.1.39) in a cyanobacterium, Synechococcus PCC7942. Photosynth. Res. 78, 59–65 (2003).
- 19. Huang F., Vasieva O., Sun Y., Faulkner M., Dykes G. F., Zhao Z., Liu L.-N., Roles of RbcX in carboxysome biosynthesis in the cyanobacterium Synechococcus elongatus PCC7942. Plant Physiol. 179, 184–194 (2019).
- 20. Tsai Y.-C. C., Lapina M. C., Bhushan S., Mueller-Cajar O., Identification and characterization of multiple rubisco activases in chemoautotrophic bacteria. Nat. Commun. 6, 8883 (2015).
- 21. Sutter M., Roberts E. W., Gonzalez R. C., Bates C., Dawoud S., Landry K., Cannon G. C., Heinhorst S., Kerfeld C. A., Structural characterization of a newly identified component of α-carboxysomes: The AAA+ domain protein CsoCbbQ. Sci. Rep. 5, 16243 (2015).
- 22. Peña K. L., Castel S. E., de Araujo C., Espie G. S., Kimber M. S., Structural basis of the oxidative activation of the carboxysomal γ-carbonic anhydrase, CcmM. Proc. Natl. Acad. Sci. U.S.A. 107, 2455–2460 (2010).
- 23. Smith K. S., Jakubzick C., Whittam T. S., Ferry J. G., Carbonic anhydrase is an ancient enzyme widespread in prokaryotes. Proc. Natl. Acad. Sci. 96, 15184–15189 (1999).
- 24. Igamberdiev A. U., Control of Rubisco function via homeostatic equilibration of CO2 supply. Front. Plant Sci. 6, (2015).
- 25. Price G. D., Badger M. R., Expression of human carbonic anhydrase in the cyanobacterium Synechococcus PCC7942 creates a high CO2-requiring phenotype. Plant Physiol. 91, 505–513 (1989).
- 26.Badger M. R., Price G. D., G. D., Carbonic anhydrase activity associated with the cyanobacterium Synechococcus PCC7942. Plant Physiol. 89, 51–60 (1989). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Sawaya M. R., Cannon G. C., Heinhorst S., Tanaka S., Williams E. B., Yeates T. O., Kerfeld C. A., The structure of β-carbonic anhydrase from the carboxysomal shell reveals a distinct subclass with one active site for the price of two. J. Biol. Chem. 281, 7546–7555 (2006).
- 28. Heinhorst S., Williams E. B., Cai F., Murin C. D., Shively J. M., Cannon G. C., Characterization of the carboxysomal carbonic anhydrase CsoSCA from Halothiobacillus neapolitanus. J. Bacteriol. 188, 8087–8094 (2006).
- 29. Ni T., Sun Y., Burn W., al-Hazeem M. M. J., Zhu Y., Yu X., Liu L.-N., Zhang P., Structure and assembly of cargo Rubisco in two native α-carboxysomes. Nat. Commun. 13, 4299 (2022).
- 30. Blikstad C., Dugan E. J., Laughlin T. G., Liu M. D., Shoemaker S. R., Remis J. P., Savage D. F., Discovery of a carbonic anhydrase-Rubisco supercomplex within the alpha-carboxysome. bioRxiv 467472 [Preprint] (2021). 10.1101/2021.11.05.467472.
- 31. Metskas L. A., Ortega D., Oltrogge L. M., Blikstad C., Lovejoy D. R., Laughlin T. G., Savage D. F., Jensen G. J., Rubisco forms a lattice inside alpha-carboxysomes. Nat. Commun. 13, 4863 (2022).
- 32. Evans S. L., Al-Hazeem M. M. J., Mann D., Smetacek N., Beavil A. J., Sun Y., Chen T., Dykes G. F., Liu L.-N., Bergeron J. R. C., Single-particle cryo-EM analysis of the shell architecture and internal organization of an intact α-carboxysome. bioRxiv 481072 [Preprint] (2022). 10.1101/2022.02.18.481072.
- 33. Sun Y., Harman V. M., Johnson J. R., Brownridge P. J., Chen T., Dykes G. F., Lin Y., Beynon R. J., Liu L.-N., Decoding the absolute stoichiometric composition and structural plasticity of α-carboxysomes. mBio 13, e0362921 (2022).
- 34. Chaijarasphong T., Nichols R. J., Kortright K. E., Nixon C. F., Teng P. K., Oltrogge L. M., Savage D. F., Programmed ribosomal frameshifting mediates expression of the α-carboxysome. J. Mol. Biol. 428, 153–164 (2016).
- 35. Long B. M., Förster B., Pulsford S. B., Price G. D., Badger M. R., Rubisco proton production can drive the elevation of CO2 within condensates and carboxysomes. Proc. Natl. Acad. Sci. U.S.A. 118, e2014406118 (2021).
- 36. Cronk J. D., Rowlett R. S., Zhang K. Y. J., Tu C., Endrizzi J. A., Lee J., Gareiss P. C., Preiss J. R., Identification of a novel noncatalytic bicarbonate binding site in eubacterial β-carbonic anhydrase. Biochemistry 45, 4351–4361 (2006).
- 37. Marin B., Nowack E. C. M., Glöckner G., Melkonian M., The ancestor of the Paulinella chromatophore obtained a carboxysomal operon by horizontal gene transfer from a Nitrococcus-like γ-proteobacterium. BMC Evol. Biol. 7, 85 (2007).
- 38. Rae B., Forster B., Badger M., Price G., The CO2-concentrating mechanism of Synechococcus WH5701 is composed of native and horizontally-acquired components. Photosynth. Res. 109, 59–72 (2011).
- 39. Förster B., Rourke L. M., Weerasooriya H. N., Pabuayon I. C. M., Rolland V., Au E. K., Bala S., Bajsa-Hirschel J., Kaines S., Kasili R. W., LaPlace L. M., Machingura M. C., Massey B., Rosati V. C., Stuart-Williams H., Badger M. R., Price G. D., Moroney J. V., The Chlamydomonas reinhardtii chloroplast envelope protein LCIA transports bicarbonate in planta. J. Exp. Bot. 74, 3651–3666 (2023).
- 40. van Kempen M., Kim S. S., Tumescheit C., Mirdita M., Lee J., Gilchrist C. L. M., Söding J., Steinegger M., Fast and accurate protein structure search with Foldseek. bioRxiv 479398 [Preprint] (2023). 10.1101/2022.02.07.479398.
- 41. Holm L., Using Dali for protein structure comparison, in Structural Bioinformatics: Methods and Protocols, Z. Gáspári, Ed. (Springer US, 2020), pp. 29–42.
- 42. Badger M. R., Kinetic properties of ribulose 1,5-bisphosphate carboxylase/oxygenase from Anabaena variabilis. Arch. Biochem. Biophys. 201, 247–254 (1980).
- 43. Mangan N. M., Flamholz A., Hood R. D., Milo R., Savage D. F., pH determines the energetic efficiency of the cyanobacterial CO2 concentrating mechanism. Proc. Natl. Acad. Sci. U.S.A. 113, E5354–E5362 (2016).
- 44. Flamholz A., Shih P. M., Cell biology of photosynthesis over geologic time. Curr. Biol. 30, R490–R494 (2020).
- 45. Rowlett R. S., Tu C., Lee J., Herman A. G., Chapnick D. A., Shah S. H., Gareiss P. C., Allosteric site variants of Haemophilus influenzae β-carbonic anhydrase. Biochemistry 48, 6146–6156 (2009).
- 46. Kim S., Yeon J., Sung J., Kim N. J., Hong S., Jin M. S., Structural insights into novel mechanisms of inhibition of the major β-carbonic anhydrase CafB from the pathogenic fungus Aspergillus fumigatus. J. Struct. Biol. 213, 107700 (2021).
- 47. Covarrubias A. S., Bergfors T., Jones T. A., Högbom M., Structural mechanics of the pH-dependent activity of β-carbonic anhydrase from Mycobacterium tuberculosis. J. Biol. Chem. 281, 4993–4999 (2006).
- 48. Todd A. E., Orengo C. A., Thornton J. M., Evolution of function in protein superfamilies, from a structural perspective. J. Mol. Biol. 307, 1113–1143 (2001).
- 49. Fujiwara K., Toda H., Ikeguchi M., Dependence of α-helical and β-sheet amino acid propensities on the overall protein fold type. BMC Struct. Biol. 12, 18 (2012).
- 50. McGurn L. D., Moazami-Goudarzi M., White S. A., Suwal T., Brar B., Tang J. Q., Espie G. S., Kimber M. S., The structure, kinetics and interactions of the β-carboxysomal β-carbonic anhydrase, CcaA. Biochem. J. 473, 4559–4572 (2016).
- 51. Mueller-Cajar O., Stotz M., Wendler P., Hartl F. U., Bracher A., Hayer-Hartl M., Structure and function of the AAA+ protein CbbX, a red-type Rubisco activase. Nature 479, 194–199 (2011).
- 52. Dobrinski K. P., Longo D. L., Scott K. M., The carbon-concentrating mechanism of the hydrothermal vent chemolithoautotroph Thiomicrospira crunogena. J. Bacteriol. 187, 5761–5766 (2005).
- 53. de Araujo C., Arefeen D., Tadesse Y., Long B. M., Price G. D., Rowlett R. S., Kimber M. S., Espie G. S., Identification and characterization of a carboxysomal γ-carbonic anhydrase from the cyanobacterium Nostoc sp. PCC 7120. Photosynth. Res. 121, 135–150 (2014).
- 54. Oltrogge L. M., Chaijarasphong T., Chen A. W., Bolin E. R., Marqusee S., Savage D. F., Multivalent interactions between CsoS2 and Rubisco mediate α-carboxysome formation. Nat. Struct. Mol. Biol. 27, 281–287 (2020).
- 55. Flamholz A. I., Dugan E., Panich J., Desmarais J. J., Oltrogge L. M., Fischer W. W., Singer S. W., Savage D. F., Trajectories for the evolution of bacterial CO2-concentrating mechanisms. Proc. Natl. Acad. Sci. U.S.A. 119, e2210539119 (2022).
- 56. Catanzariti A.-M., Soboleva T. A., Jans D. A., Board P. G., Baker R. T., An efficient system for high-level expression and easy purification of authentic recombinant proteins. Protein Sci. 13, 1331–1339 (2004).
- 57. Aussignargues C., Paasch B. C., Gonzalez-Esquer R., Erbilgin O., Kerfeld C. A., Bacterial microcompartment assembly: The key role of encapsulation peptides. Commun. Integr. Biol. 8, e1039755 (2015).
- 58. Gibson D. G., Young L., Chuang R.-Y., Venter J. C., Hutchison C. A. III, Smith H. O., Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
- 59. Wilkins M. R., Gasteiger E., Bairoch A., Sanchez J. C., Williams K. L., Appel R. D., Hochstrasser D. F., Protein identification and analysis tools in the ExPASy server. Methods Mol. Biol. 112, 531–552 (1999).
- 60. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A., Bridgland A., Meyer C., Kohl S. A. A., Ballard A. J., Cowie A., Romera-Paredes B., Nikolov S., Jain R., Adler J., Back T., Petersen S., Reiman D., Clancy E., Zielinski M., Steinegger M., Pacholska M., Berghammer T., Bodenstein S., Silver D., Vinyals O., Senior A. W., Kavukcuoglu K., Kohli P., Hassabis D., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
- 61. Mirdita M., Schütze K., Moriwaki Y., Heo L., Ovchinnikov S., Steinegger M., ColabFold: Making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).
- 62. Emsley P., Cowtan K., Coot: Model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004).
- 63. Agirre J., Atanasova M., Bagdonas H., Ballard C. B., Baslé A., Beilsten-Edmands J., Borges R. J., Brown D. G., Burgos-Mármol J. J., Berrisford J. M., Bond P. S., Caballero I., Catapano L., Chojnowski G., Cook A. G., Cowtan K. D., Croll T. I., Debreczeni J. É., Devenish N. E., Dodson E. J., Drevon T. R., Emsley P., Evans G., Evans P. R., Fando M., Foadi J., Fuentes-Montero L., Garman E. F., Gerstel M., Gildea R. J., Hatti K., Hekkelman M. L., Heuser P., Hoh S. W., Hough M. A., Jenkins H. T., Jiménez E., Joosten R. P., Keegan R. M., Keep N., Krissinel E. B., Kolenko P., Kovalevskiy O., Lamzin V. S., Lawson D. M., Lebedev A. A., Leslie A. G. W., Lohkamp B., Long F., Malý M., McCoy A. J., McNicholas S. J., Medina A., Millán C., Murray J. W., Murshudov G. N., Nicholls R. A., Noble M. E. M., Oeffner R., Pannu N. S., Parkhurst J. M., Pearce N., Pereira J., Perrakis A., Powell H. R., Read R. J., Rigden D. J., Rochira W., Sammito M., Sánchez Rodríguez F., Sheldrick G. M., Shelley K. L., Simkovic F., Simpkin A. J., Skubak P., Sobolev E., Steiner R. A., Stevenson K., Tews I., Thomas J. M. H., Thorn A., Valls J. T., Uski V., Usón I., Vagin A., Velankar S., Vollmar M., Walden H., Waterman D., Wilson K. S., Winn M. D., Winter G., Wojdyr M., Yamashita K., The CCP4 suite: Integrative software for macromolecular crystallography. Acta Crystallogr. D Struct. Biol. 79, 449–461 (2023).
- 64. Liebschner D., Afonine P. V., Baker M. L., Bunkóczi G., Chen V. B., Croll T. I., Hintze B., Hung L.-W., Jain S., McCoy A. J., Moriarty N. W., Oeffner R. D., Poon B. K., Prisant M. G., Read R. J., Richardson J. S., Richardson D. C., Sammito M. D., Sobolev O. V., Stockwell D. H., Terwilliger T. C., Urzhumtsev A. G., Videau L. L., Williams C. J., Adams P. D., Macromolecular structure determination using X-rays, neutrons and electrons: Recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
- 65. McGibbon R. T., Beauchamp K. A., Harrigan M. P., Klein C., Swails J. M., Hernández C. X., Schwantes C. R., Wang L.-P., Lane T. J., Pande V. S., MDTraj: A modern open library for the analysis of molecular dynamics trajectories. Biophys. J. 109, 1528–1532 (2015).
- 66. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T. L., BLAST+: Architecture and applications. BMC Bioinformatics 10, 421 (2009).
- 67. Fu L., Niu B., Zhu Z., Wu S., Li W., CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
- 68. Katoh K., Standley D. M., MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
- 69. Capella-Gutiérrez S., Silla-Martínez J. M., Gabaldón T., trimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
- 70. Minh B. Q., Schmidt H. A., Chernomor O., Schrempf D., Woodhams M. D., von Haeseler A., Lanfear R., IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
- 71. Hoang D. T., Chernomor O., von Haeseler A., Minh B. Q., Vinh L. S., UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).
- 72. Letunic I., Bork P., Interactive Tree Of Life (iTOL) v5: An online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).
- 73. Crooks G. E., Hon G., Chandonia J.-M., Brenner S. E., WebLogo: A sequence logo generator. Genome Res. 14, 1188–1190 (2004).
- 74. Markowitz V. M., Korzeniewski F., Palaniappan K., Szeto E., Werner G., Padki A., Zhao X., Dubchak I., Hugenholtz P., Anderson I., Lykidis A., Mavromatis K., Ivanova N., Kyrpides N. C., The integrated microbial genomes (IMG) system. Nucleic Acids Res. 34, D344–D348 (2006).
- 75. Liebschner D., Afonine P. V., Moriarty N. W., Poon B. K., Sobolev O. V., Terwilliger T. C., Adams P. D., Polder maps: Improving OMIT maps by excluding bulk solvent. Acta Crystallogr. D Struct. Biol. 73, 148–157 (2017).
- 76. Angleton E. L., Van Wart H. E., Preparation and reconstitution with divalent metal ions of class I and class II Clostridium histolyticum apocollagenases. Biochemistry 27, 7406–7412 (1988).
- 77. Cronk J. D., Endrizzi J. A., Cronk M. R., O'Neill J. W., Zhang K. Y. J., Crystal structure of E. coli β-carbonic anhydrase, an enzyme with an unusual pH-dependent activity. Protein Sci. 10, 911–922 (2001).
- 78. Yariv B., Yariv E., Kessel A., Masrati G., Chorin A. B., Martz E., Mayrose I., Pupko T., Ben-Tal N., Using evolutionary data to make sense of macromolecules with a “face-lifted” ConSurf. Protein Sci. 32, e4582 (2023).
- 79. Landfried D. A., Vuletich D. A., Pond M. P., Lecomte J. T. J., Structural and thermodynamic consequences of b heme binding for monomeric apoglobins and other apoproteins. Gene 398, 12–28 (2007).
- 80. Clark J. J., Benson M. L., Smith R. D., Carlson H. A., Inherent versus induced protein flexibility: Comparisons within and between apo and holo structures. PLOS Comput. Biol. 15, e1006705 (2019).
Supplementary Materials
Supplementary Materials and Methods
Tables S1 to S6
Figs. S1 to S15
Legends for data S1 to S3
References
Data files S1 to S3