Abstract
As a part of innate immunity, the complement system relies on activation of the alternative pathway (AP). While feed-forward amplification generates an immune response towards foreign surfaces, the process requires regulation to prevent an immune response on the surface of host cells. Factor H (FH) is a complement protein secreted by native cells to negatively regulate the AP. In terms of structure, FH is composed of 20 complement-control protein (CCP) modules that are structurally homologous but vary in composition and function. Mutations in these CCPs have been linked to states of autoimmunity. In particular, several mutations in CCP 19-20 are correlated to atypical hemolytic uremic syndrome (aHUS). From crystallographic structures there are three putative binding sites of CCP 19-20 on C3d. Since there has been some controversy over the primary mode of binding from experimental studies, we approach characterization of binding using computational methods. Specifically, we compare each binding mode in terms of electrostatic character, structural stability, dissociative and associative properties, and predicted free energy of binding. After a detailed investigation, we found two of the three binding sites to be similarly stable while varying in the number of contacts to C3d and in the energetic barrier to complex dissociation. These sites are likely physiologically relevant and may facilitate multivalent binding of FH CCP 19-20 to C3b and either C3d or host glycosaminoglycans. We propose thermodynamically stable binding with modules 19 and 20, the latter driven by electrostatics, acting synergistically to increase the apparent affinity of FH for host surfaces.
Keywords: complement system, C3d, factor H, Poisson-Boltzmann electrostatics, molecular dynamics, steered molecular dynamics, MM/GBSA binding free energy
Statement: Crystallographic structures suggest three binding sites for the complex of immune system protein C3d with the regulator factor H 19-20. Since crystallization of proteins can perturb physiological protein structure, we computationally characterize the interactions at the three putative binding sites. Our studies contribute to understanding the physicochemical basis of FH 19-20 interactions with C3d and provide a molecular basis for understanding autoimmune diseases such as atypical hemolytic uremic syndrome.
Introduction
The complement system is a family of serum proteins and cell receptors that mediate innate immunity. These proteins act as the first line of defense against invading pathogens and can discriminate native surfaces from foreign surfaces. The complement system comprises the classical, lectin, and alternative pathways, all of which converge on the C3 convertase enzyme to generate an immune response. As a part of feed-forward amplification, this enzyme generates opsonins that produce more C3 convertase. Of the three pathways, only the alternative pathway generates C3 convertase at a basal level in the absence of foreign surfaces.
In the alternative pathway (AP) of the complement system, an immune response is facilitated through conversion of C3 to form C3b. Also referred to as tickover, this process consumes approximately one percent of C3 molecules per hour.1 C3b can further interact with factor B (FB) and factor D to form C3 convertase. As a result, positive feedback occurs, and C3 convertase cleaves C3 to C3a, an anaphylatoxin, and C3b, an opsonin. Since C3b and C3 convertase are capable of binding to host and foreign surfaces via a thioester domain located within the C3d domain, the positive amplification loop occurs at these surfaces, generating an immune response.1 While tickover is essential for immune response via the AP, overactivation and/or under-regulation can cause severe damage to native tissues due to the nonspecific nature of the interaction. In order to prevent a state of autoimmunity, native tissues express factor H (FH). This biomolecule competes with FB, limiting the formation of C3 convertase,2 acts as a cofactor for the degradation of C3b to iC3b, and accelerates decay of C3 convertase.2 Figure 1 illustrates the concepts of complement regulation on the surfaces of self and of complement activation on the surfaces of nonself.
Figure 1.
Comparison of complement response from the AP on surfaces of self (blue) and surfaces of nonself (red). (1, 4) C3b binds to both native and foreign surfaces through nucleophilic attack on an internal thioester bond. (2) On surfaces of self, FH is more likely to interact with attached C3b, preventing association with factor B and limiting the formation of the C3 convertase. In the case that C3 convertase does form, FH acts to accelerate decay of the complex. (3) Additionally, FH serves as a cofactor for conversion of attached C3b to inactive C3b that is eventually cleaved to C3d that remains surface-bound. (5) On surfaces of nonself, factor B associates with bound C3b. (6) Factor D subsequently cleaves a portion of factor B to form the C3 convertase. (7) This enzyme cleaves serum C3 into C3b, an opsonin, and C3a, an anaphylatoxin. (8) C3b generated by the convertase is capable of binding to surfaces once again and results in amplification of immune response.
FH is a 155 kDa protein composed of 20 complement control protein (CCP) modules.2 In order to explain negative regulation of AP amplification on host cells, several models that propose a mechanism to discriminate between self and nonself have been hypothesized that rely on functional CCP modules.2–5 In short, C3b binding is believed to occur at CCP 1-4 and 19-20 while polyanion binding is thought to occur at CCP 1, 5, 7, 13, and 20.4,6 Mutations in these modules have been linked to autoimmune disease. Of note for the present study, mutations in FH 19-20 have been linked to atypical hemolytic uremic syndrome (aHUS).2,3,6 Moreover, mutating these aHUS-linked residues has been shown to affect C3d binding.2,3 In order to understand pathogenesis of aHUS-linked mutations, studies have involved investigating relative binding affinity of FH mutants and characterizing the crystallographic structures of C3d in complex with FH CCP 19-20.2,3 These structures suggest three different modes of binding as summarized in Figure 2. Protein data bank (PDB) crystallographic structures 3OXU2 and 2XQW3 contain multiple chain pairings of C3d with FH CCP 19-20 that are described as binding sites 1-3, herein. Site 1 involves a negatively charged surface of C3d, referred to as the acidic patch,7 binding to positively charged FH CCP 20, while site 2 involves interaction between a ridge of C3d, not far from the acidic patch, and the negatively charged FH CCP 19. Located on the opposite face of C3d from sites 1-2, site 3 involves binding of the slightly positive, basic surface of C3d to the positively charged FH CCP 20.
Figure 2.
Molecular graphic representation of the competing binding modes from two crystallographic structures in the Protein Data Bank, 3OXU and 2XQW. Each binding site is described by one or more pairs of interacting chains (separated by a colon) from a specific crystallographic structure (PDB identifier of source noted). C3d is colored dim gray and specific residues associated with the acidic patch are colored in red while residues associated with the thioester domain are colored in yellow. FH is represented by CCP modules 19 and 20. FH CCP 20 is colored orange while CCP 19 is colored purple. At site 1, the binding configuration of C3d and FH varies between models (PDB 3OXU and 2XQW) suggesting apical flexibility.
In considering the physiological relevance of sites 1-3, a primary, C3b binding mode at site 2 has been suggested that does not involve the acidic patch or thioester bond of C3d.3,5 To reach this conclusion, studies considered the relative accessibility of the C3d subdomain3 within C3b and ensured that the proposed models enable discrimination between self and nonself.2,3 Additionally, studies map functional domains of FH CCP 19-20 by experimentally characterizing binding of C3d with mutant FH.2,3 Ultimately, these models emphasize the physiological importance of site 2; however, CCP 20 has a positive charge that may interact with the negatively charged acidic patch of C3d at site 1.6 One study does suggest a dual interaction of CCP 19 and 20 by one molecule of FH CCP 19-20 binding one C3b and either C3d or a host glycosaminoglycan (GAG);3 however, this conclusion is based on a mutated protein structure to enable crystallization.3 Crystallization conditions can significantly alter protein binding configurations, introducing nonphysiological states. In the past we have explored such an issue regarding the physiological binding mode of CR2 to C3d8–14 where zinc ions stabilized a protein complex thought to be nonphysiological. To propose the physiologically relevant binding mode or set of modes in the case of C3d and FH CCP 19-20, we compare each binding site in terms of electrostatic character, structural stability, dissociative properties, associative properties, and predicted free energy of binding using computational methods.
Results
AESOP analysis
Using a computational framework developed in our lab and termed Analysis of Electrostatic Similarities of Proteins (AESOP), we investigate the electrostatic character of each binding site (Fig. 2). Comprehensive computational alanine mutagenesis of each ionizable amino acid in C3d, one at a time, revealed a high degree of electrostatic contribution in the binding modes of site 1 and site 2, as indicated by several mutations that cause loss or gain of binding beyond the energetic fluctuations of thermal effects (up to 2.5 kJ/mol) (Fig. 3). Additionally, mutations that affect binding at one site generally do not affect the other binding sites. This result is consistent with each binding site being located on a distinct surface area of C3d. From AESOP analysis of FH CCP 19-20, sites 1 and 2 have the most electrostatic character, while site 3 generally lacks mutations with significant energetic perturbations. In general for sites 1 and 2, mutation of acidic residues in C3d and basic residues in FH CCP 19-20 results in loss of binding relative to wild type (WT), while mutation of basic residues in C3d and acidic residues in FH CCP 19-20 results in gain of binding. Site 3 deviates notably from this pattern, in which mutation of acidic residues of FH typically results in gain of binding. Instead, at site 3, mutation of acidic residues results in loss of binding and mutation of basic residues results in gain of binding, which is in agreement with the basic surface of the thioester domain.7
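The classification used in this scan can be sketched as follows. This is a minimal illustration and not the AESOP implementation: the function name `classify_mutation` and the example values are hypothetical, and only the 2.5 kJ/mol thermal threshold and the sign convention (positive values are loss of binding, negative values are gain of binding) are taken from the text.

```python
THERMAL_KJ_PER_MOL = 2.5  # thermal fluctuation threshold quoted in the text


def classify_mutation(ddg_kj_per_mol, threshold=THERMAL_KJ_PER_MOL):
    """Label a mutation from its change in binding free energy relative to wild type.

    Positive values are loss of binding, negative values are gain of binding;
    values within the thermal threshold are treated as neutral.
    """
    if ddg_kj_per_mol > threshold:
        return "loss of binding"
    if ddg_kj_per_mol < -threshold:
        return "gain of binding"
    return "neutral"


# Hypothetical values for illustration only (not results from this study).
example_scan = {"mutant_A": 6.1, "mutant_B": -4.3, "mutant_C": 1.2}
labels = {name: classify_mutation(value) for name, value in example_scan.items()}
print(labels)
```

Treating values exactly at the threshold as neutral is a choice made for this sketch; the text only states that thermal effects account for fluctuations of up to 2.5 kJ/mol.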
Figure 3.
Change in free energy of binding compared with the wild-type molecule resulting from computational alanine mutagenesis of (A) C3d and (B) FH. The heatmap scale is kept consistent between panels. Black for residue D1119 in panel B indicates no data since there is not an ionizable amino acid in this position within model 2XQW. Mutations boxed in green represent mutations that have been linked to aHUS. Negative values are gain of binding mutations while positive values are loss of binding mutations.
To compare our comprehensive alanine scan to mutations reported in the literature, we compiled a list of individual mutants from previous mutagenesis studies of C3d and FH in which residues with partial charges or ionizable amino acids were mutated to residues other than alanine.2,3 We then applied AESOP to the resulting list of mutations. Mutations in C3d at residues D122 and Q168 have been shown to decrease binding. In our results (Fig. 4), the D122N mutation corresponds to loss of binding at site 2, while Q168K corresponds to loss of binding at site 1. Shifting focus to FH, a number of mutations in the literature are linked to aHUS. Our results (Fig. 4) suggest that a number of aHUS-linked mutations correspond to loss of binding at sites 1 and 2. Despite this general agreement, several mutations of aHUS-linked residues result in gain of binding in our model at both sites 1 and 2.
Figure 4.
Change in free energy of binding compared with the wild type molecule resulting from computational site-directed mutagenesis for (A) C3d and (B) FH. The heatmap scale is kept consistent between panels. The first amino acid in brackets represents the residue in 3OXU, while the second is the residue in 2XQW. Mutations boxed in green represent mutations that have been linked to aHUS. Negative values are gain of binding mutations.
Analysis of MD simulations
In order to characterize structural stability in terms of protein-protein interactions, we analyzed trajectories from molecular dynamics (MD) simulations. For each binding mode of C3d and FH CCP 19-20, root-mean-square deviation (RMSD) from the initial structure stabilizes relatively quickly (Supporting Information Fig. S1), though there is some moderate fluctuation about the mean value across the triplicate trajectories that may be attributed to the flexibility of the FH CCP 19 and 20 modules (Supporting Information Figs. S2 and S3). Solvent accessible surface area (SASA) lost upon binding (interfacial SASA) seems to be a better indicator of system stability for these systems and stabilizes within 10 ns for the binding modes at site 1 and site 2 (Fig. 5). Site 3, on the other hand, experiences a steady decrease in interfacial SASA during the first 10 ns of the simulation, stabilizing only after extending the simulation to 20 ns. Fluctuations in the interfacial SASA at site 1 likely reflect the flexibility of FH CCP 19-20 which is exacerbated by the end-to-end binding configuration that is represented in Figure 2, with the C-terminus of FH CCP 20 binding to residues of the acidic patch on C3d. This property is further illustrated by the conformational variation observed at binding site 1 between crystallographic structures (Fig. 2).
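One common way to compute the SASA lost upon binding is to subtract the SASA of the complex from the summed SASA of the separated partners. The sketch below assumes that definition, since the text does not give the formula, and shows how a per-frame mean and standard deviation across triplicate trajectories could be obtained. All function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def interfacial_sasa(sasa_c3d, sasa_fh, sasa_complex):
    """SASA lost upon binding (assumed definition): partners alone minus complex."""
    return sasa_c3d + sasa_fh - sasa_complex


def per_frame_mean_and_std(replicates):
    """Mean and sample standard deviation at each frame across replicate traces."""
    frames = list(zip(*replicates))  # regroup values frame by frame
    return [mean(f) for f in frames], [stdev(f) for f in frames]


# Hypothetical interfacial SASA traces (square angstroms), three replicates, four frames.
replicates = [
    [900.0, 880.0, 870.0, 865.0],
    [920.0, 890.0, 860.0, 855.0],
    [910.0, 870.0, 880.0, 860.0],
]
means, stds = per_frame_mean_and_std(replicates)
print(means)
print(stds)
```

The mean trace and the one-standard-deviation band correspond to the solid and dotted lines described for the interfacial SASA plot.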
Figure 5.
Interfacial SASA is plotted for each binding mode above and is colored according to binding site. Large, solid lines represent the mean values across a set of three MD trajectories, while smaller, dotted lines represent one standard deviation above or below the mean. Inset next to the key is the Interfacial SASA for site 3's extended MD trajectories from 10 to 20 ns (solid black and red lines here represent the extrapolated mean Interfacial SASA for each associated binding mode and are used to compare with the values for site 3).
Interaction profiles from all trajectories seem to vary drastically between binding modes. When considering potential aliphatic interactions with a cutoff distance of 4 Å (Fig. 6), site 1 has the fewest interactions, followed by site 3, and site 2 has the most. Also notable is the nearly time-invariant presence of many potential aliphatic interactions at site 2. In terms of hydrogen bonding (Fig. 7), site 2 also exhibits the greatest number of hydrogen bonds throughout all trajectories. Once again, many of these interactions are nearly time-invariant, with occupancies approaching 100%. Site 1 exhibits one nearly time-invariant predicted hydrogen bond between FH residue R1203 and C3d residue E160, along with three other high-occupancy interactions. Site 3 exhibits the fewest hydrogen bonds, with only two moderate-occupancy interactions. Considering potential salt bridges within a cutoff distance of 5 Å (Fig. 8), sites 1 and 2 exhibit two high-occupancy interactions. In addition, site 1 has three lower-occupancy predicted salt bridges. Site 3 has several lower-occupancy salt bridges that appear to fluctuate over the course of the trajectories (Supporting Information Fig. S4). In two replicate simulations at this site, there is an average of four salt bridges near the end of the trajectories; however, in one replicate, the average drops to one salt bridge.
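Percent occupancy, as used for these interaction profiles, is the fraction of snapshots in which a pair of groups lies within the cutoff distance. A minimal sketch follows; the function name, the inclusive cutoff comparison, and the distance values are assumptions for illustration, while the 4 Å and 5 Å cutoffs come from the text.

```python
def percent_occupancy(distances, cutoff):
    """Percentage of snapshots in which the pair distance is within the cutoff."""
    if not distances:
        raise ValueError("at least one snapshot is required")
    within = sum(1 for d in distances if d <= cutoff)
    return 100.0 * within / len(distances)


ALIPHATIC_CUTOFF = 4.0  # angstroms, from the text
SALT_BRIDGE_CUTOFF = 5.0  # angstroms, from the text

# Hypothetical distances (angstroms) for one residue pair, pooled over replicates.
pair_distances = [3.6, 3.9, 4.4, 3.8, 5.2, 4.8, 3.7, 4.1]

print(percent_occupancy(pair_distances, ALIPHATIC_CUTOFF))
print(percent_occupancy(pair_distances, SALT_BRIDGE_CUTOFF))
```

Pooling the snapshots from all replicates before counting mirrors the figure captions, which report occupancies throughout the triplicate trajectories for each binding site.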
Figure 6.
Percent occupancies of aliphatic interactions between C3d and FH at (A) binding site 1, (B) binding site 2, and (C) binding site 3. These values represent the percentage of snapshots throughout the triplicate trajectories for each binding site where a pair of functional groups with aliphatic features were observed to be within 4 Å in the C3d:FH complex. Residue labels are colored according to secondary structure. Black represents a coil or turn; red represents an α-helix; blue represents a 3₁₀-helix; green represents a β-sheet; and magenta represents an isolated β-bridge.
Figure 7.
Percent occupancies of hydrogen bond interactions between C3d and FH at (A) binding site 1, (B) binding site 2, and (C) binding site 3. These values represent the percentage of snapshots throughout the triplicate trajectories for each binding site where a hydrogen bond is present in the C3d:FH complex. Residue labels are colored according to secondary structure. Black represents a coil or turn; red represents an α-helix; blue represents a 3₁₀-helix; green represents a β-sheet; and magenta represents an isolated β-bridge.
Figure 8.
Percent occupancies of salt bridges between C3d and FH at (A) binding site 1, (B) binding site 2, and (C) binding site 3. These values represent the percentage of snapshots throughout the triplicate trajectories for each binding site where a pair of chemical groups capable of forming a salt bridge were observed to be within 5 Å from each other in the C3d:FH complex. Residue labels are colored according to secondary structure. Black represents a coil or turn; red represents an α-helix; blue represents a 3₁₀-helix; and green represents a β-sheet.
Analysis of SMD Simulations
During steered molecular dynamics (SMD) simulations, FH CCP 19-20 is pulled away from C3d at a constant velocity. The binding modes exhibit similar peak forces but differ in the length of time until intermolecular forces are essentially eliminated (Fig. 9). For site 2, forces reach a minimum after approximately 1.3 ns, while sites 1 and 3 take about 0.8 and 0.6 ns, respectively. For site 1, the peak forces show greater variation, and the mean force appears to plateau slightly longer than for site 3.
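The two quantities compared here, the peak force and the time until the force is essentially eliminated, could be extracted from replicate force traces along the following lines. This is a sketch under stated assumptions: the function names, the force tolerance used to decide when forces are "essentially eliminated", and all data values are hypothetical.

```python
from statistics import mean


def mean_trace(traces):
    """Average force at each time point across replicate SMD trajectories."""
    return [mean(values) for values in zip(*traces)]


def peak_force(trace):
    """Largest force in a trace."""
    return max(trace)


def time_to_separation(times, trace, tolerance):
    """First time after which the force stays at or below the tolerance."""
    last_above = -1
    for i, force in enumerate(trace):
        if force > tolerance:
            last_above = i
    if last_above + 1 >= len(trace):
        return None  # force never settles below the tolerance
    return times[last_above + 1]


# Hypothetical data: times in ns, forces in pN, two replicates.
times = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
traces = [
    [0.0, 1400.0, 900.0, 300.0, 40.0, 20.0],
    [0.0, 1600.0, 700.0, 500.0, 60.0, 40.0],
]
avg = mean_trace(traces)
print(peak_force(avg))
print(time_to_separation(times, avg, tolerance=100.0))
```

Requiring the force to stay below the tolerance, rather than merely dip below it once, avoids reporting an early separation time when the trace is noisy.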
Figure 9.
Forces required to move FH at a constant velocity along the surface normal vector are plotted over the length of time for each SMD simulation for each binding mode. Large, solid lines represent the mean values across five, independent trajectories for each binding mode. Smaller, dotted lines represent one standard deviation above or below the mean value. Force-time responses are color coded according to the key in the top right corner.
Analysis of predicted association rate constants
In order to characterize the associative properties of the C3d:FH(19-20) complex, we calculated association rate constants using the TransComp15 computational platform. For site 1, C3d and FH(19-20) are predicted to associate with a rate constant of 3.53 × 10⁷ M⁻¹ s⁻¹, while site 2 and 3 are predicted to have rate constants of 3.26 × 10⁵ and 6.05 × 10⁴ M⁻¹ s⁻¹, respectively. If electrostatics are neglected and only association from random diffusion is considered, sites 1, 2, and 3 were found to associate at rate constants of 5.48 × 10⁵, 3.12 × 10⁵, and 4.07 × 10⁵ M⁻¹ s⁻¹, respectively.
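The effect of electrostatics on association can be made explicit by dividing each full rate constant by its diffusion-only counterpart. The sketch below uses the rate constants reported above. The conversion of each ratio to an energy assumes a Boltzmann-type relation, k = k0 exp(-ΔG/RT), and a temperature of 298.15 K; neither is stated in the text, so the energies are illustrative only.

```python
import math

# Rate constants from the text, in M^-1 s^-1.
K_FULL = {"site 1": 3.53e7, "site 2": 3.26e5, "site 3": 6.05e4}
K_DIFFUSION_ONLY = {"site 1": 5.48e5, "site 2": 3.12e5, "site 3": 4.07e5}

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant
TEMPERATURE_K = 298.15  # assumed; the text does not state a temperature


def enhancement(site):
    """Ratio of the full rate constant to the diffusion-only rate constant."""
    return K_FULL[site] / K_DIFFUSION_ONLY[site]


def implied_electrostatic_energy(site):
    """Energy (kJ/mol) implied by the ratio, assuming k = k0 * exp(-dG / RT)."""
    return -R_KJ_PER_MOL_K * TEMPERATURE_K * math.log(enhancement(site))


for site in K_FULL:
    ratio = enhancement(site)
    print(site, round(ratio, 2), round(math.log10(ratio), 2),
          round(implied_electrostatic_energy(site), 2))
```

A ratio above 1 indicates that electrostatics accelerate association, a ratio near 1 indicates little effect, and a ratio below 1 indicates that electrostatics slow association.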
Analysis of binding free energies
To assess the energetics of binding at sites 1-3, binding free energies are calculated using a molecular mechanics method with a generalized Born surface area approximation (MM/GBSA). Specifically, we report distributions of these free energy calculations from multiple time points in the molecular dynamics trajectories, as discussed in Materials and Methods. From Brown-Forsythe and Bartlett's tests, we found that the variances of the distributions of overall binding free energies differ significantly across binding modes at a confidence level of 95%. In terms of free energy of binding between C3d and FH CCP 19-20, each binding mode is characterized by a distinct distribution of predicted energies that arises from different contributions by electrostatic and nonpolar interactions (Fig. 10). Site 2 is predicted to have the lowest free energy of binding, largely owing to nonpolar interactions in the protein complex. In addition to exhibiting the lowest free energy from nonpolar interactions of all binding modes, site 2 is predicted to have the second largest electrostatic contribution to the free energy of binding. These results are in agreement with MD and AESOP analyses. In terms of overall predicted binding free energy, the distributions of energies for site 1 and site 3 overlap, with site 1 exhibiting a much wider distribution. Site 1 is predicted to have the largest contribution to free energy of binding from electrostatic interactions while having the lowest contribution from nonpolar interactions. This reflects the low interfacial surface area expected from the end-to-end binding configuration and the results from SMD analysis. Site 3, on the other hand, has the lowest electrostatic contribution to free energy of binding and the second highest nonpolar contribution. The bimodal distribution of energies from nonpolar interactions for site 3 likely reflects the gradually decreasing interfacial surface area that is present briefly during the last 10 ns of the MD simulations.
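The Brown-Forsythe test compares variances by running a one-way analysis of variance on the absolute deviations of each value from its group median. The sketch below computes only the test statistic, using the standard library; the function name and the energy values are hypothetical and are not results from this study. In practice, `scipy.stats.levene` with `center='median'` and `scipy.stats.bartlett` provide both tests along with p-values.

```python
from statistics import mean, median


def brown_forsythe_statistic(groups):
    """Brown-Forsythe W: one-way ANOVA F on absolute deviations from group medians."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = mean([x for d in deviations for x in d])
    group_means = [mean(d) for d in deviations]
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((x - m) ** 2
                 for d, m in zip(deviations, group_means) for x in d)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical binding free energies (kJ/mol) for a narrow and a wide distribution.
narrow = [-100.0, -102.0, -98.0, -101.0, -99.0]
wide = [-80.0, -120.0, -60.0, -140.0, -100.0]

w_stat = brown_forsythe_statistic([narrow, wide])
print(round(w_stat, 2))
```

The statistic is compared against an F distribution with k - 1 and N - k degrees of freedom, where k is the number of groups and N is the total number of values; larger values indicate stronger evidence that the variances differ.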
Figure 10.
Histograms of energies from MM/GBSA calculations for each binding site are shown above with a bin size of 5 kJ/mol. (A) Distributions for change in the overall free energy of binding. (B) Distributions for change in free energy resulting from electrostatic interactions. (C) Distributions for change in free energy from nonpolar interactions. Energies were calculated for all but the first 100 ps of each MD trajectory, pooling replicates from the same binding modes.
Discussion
Methods that assess the electrostatic nature of the interactions between C3d and FH reveal key differences in the basis of the interactions at each binding mode. AESOP suggests that site 1 has the most electrostatic character, based on the greatest number of mutations in C3d and FH that are predicted to cause a change in binding. This result is supported by MM/GBSA calculations. At site 2, AESOP suggests a lesser degree of electrostatic interaction, based on the smaller number of mutations that result in predicted gain or loss of binding. Site 3, on the other hand, is suggested to interact primarily through nonpolar interactions, based on the lack of mutations in AESOP that are predicted to increase or decrease binding in excess of energetic variations arising from thermal fluctuations. This lack of electrostatic character at site 3 is supported by results from the MM/GBSA calculations. Of key residues where mutations are linked to atypical hemolytic uremic syndrome (aHUS), site 1 exhibits the most mutations at these loci that result in loss of binding (R1182, R1203). Additionally, most mutations of basic residues probably affect polyanion binding.
Given the variation in interfacial surface area that each binding mode exhibits, stability of the corresponding protein complexes is difficult to determine. Without considering this variation, site 2 would be judged the most stable from interfacial SASA plotted over time (Fig. 5). Interfacial SASA for site 1 exhibits greater variation, which may be caused in part by the end-to-end binding configuration between C3d and FH CCP 20. Interfacial SASA decreases at site 3 during the first part of the MD trajectory, suggesting the initial crystallized structure may not be physiological. This is not to say, however, that the binding mode is not physiological, since crystallization of the protein complex could have induced structural perturbations. Rather, the complex could be stable in a different configuration that approaches the structure towards the end of our MD simulation. As mentioned, interfacial surface area does not fully explain complex stability. In the same way, investigation of predicted protein interactions gives insight into the basis of the binding between C3d and FH while being skewed by varying interfacial surface areas. At site 2, we observed a large number of nearly time-invariant aliphatic interactions (Fig. 6), suggesting a stable structure as well as a high degree of nonpolar interactions. The presence of several hydrogen bonds as well as two salt bridges (Figs. 7 and 8) further implies stability of the site 2 complex. In contrast, site 1 lacks the number of stabilizing aliphatic interactions observed in other binding modes; however, it has a nearly time-invariant salt bridge (in combination with several other lower occupancy salt bridges and hydrogen bonds) that may promote binding of C3d and FH. This is not to say, however, that the structure is unstable, since this observation gives no insight into the resistance to external force in SMD simulations. Similar to site 2, site 3 has a large number of aliphatic interactions; however, these interactions tend to have lower occupancies than those at site 2. This result suggests slightly lower stability of the crystallized structure, as one of three MD trajectories for site 3 decreases in number of aliphatic interactions present after 15 ns and does not stabilize. Moreover, there are relatively few predicted hydrogen bonds and salt bridges when compared with other binding modes.
From studying the dissociative properties throughout SMD simulations, binding modes seem to share similar stability despite variations in types of interactions. This conclusion is based on the similar peak force of 1500 pN that is observed across the sets of replicate SMD trajectories for each binding site. Despite this similarity, binding modes vary in the time required to separate the complex and eliminate protein interactions. Site 2 requires the longest time to separate, at 1.3 ns, suggesting a larger energetic barrier to dissociation. As mentioned in the Results, sites 1 and 3 require similar amounts of energy to separate C3d and FH, with site 1 requiring slightly more energy on average.
In predicting associative properties of C3d and FH(19-20), we considered electrostatic interactions at each binding site that enhanced or diminished the association from random diffusion. From only diffusion, each binding configuration exhibited similar association rate constants. However, electrostatic interactions at each binding site had a profound effect on overall association rate constants. At site 1 electrostatic interactions enhanced association by two orders of magnitude, while electrostatic interactions at site 2 did not significantly affect association. Only site 3 had unfavorable electrostatic interactions that diminished the association rate constant. These observations are consistent with the degree of electrostatic character that we expect in binding modes 1-3. Moreover, maps of electrostatic potentials on the surface of C3d and FH(19-20) support the degree of electrostatic character found at each binding site as determined by both AESOP and free energy calculations (Supporting Information Fig. S5).
As an alternative method to AESOP that accounts for both electrostatic and nonpolar interactions between C3d and FH CCP 19-20, MM/GBSA gives insights into the free energy of binding for each binding mode. Moreover, results from MM/GBSA are in agreement with our predicted outcomes from AESOP. As expected site 1 has the greatest electrostatic contribution to the free energy of binding, while site 3 has the least. Since AESOP does not account for hydrophobic interactions, we were unable to resolve the nonpolar contributions to the free energy of binding at site 2 from AESOP alone; however, occupancy maps from MD analysis are in agreement with findings from MM/GBSA. Furthermore, MM/GBSA is in agreement with AESOP concerning the presence of electrostatic interactions at site 2 that are lesser in overall magnitude than the electrostatic contribution from site 1. When considering nonpolar interactions, MM/GBSA predicts that site 1 has the lowest contribution of nonpolar interactions to the free energy of binding while site 2 has the greatest nonpolar contribution. Perhaps a minor discrepancy between analytical methods is found when comparing results from MM/GBSA and SMD. From SMD analysis, we expect site 1 to be slightly more thermodynamically stable than site 3; however, MM/GBSA predicts that site 3 has a slightly lower mean free energy of binding than site 1. This result may characterize some uncertainty associated with the overlapping distributions of binding free energies between sites 1 and 3 and with the larger distribution of energies from site 1.
When comparing our data from AESOP and protein interaction studies to mutations from literature that are considered to affect binding of C3d and FH, we observe the most agreement at residues within sites 1 and 2 (Supporting Information Table S1). Specifically, mutations at residues D36, E160, D163, and Q168 on C3d and at residues R1182, E1198, R1203, and R1215 on FH suggest site 1 to be important to C3d and FH binding. Furthermore, mutations at residues E117 and D122 on C3d and at residues P1114, D1119, and Q1139 on FH suggest site 2 is also important for C3d binding. In contrast, site 3 lacks mutations that suggest a role in C3d and FH binding. There is one mutation mentioned in literature, however, that may result in increased binding between C3d and FH at site 3.
As expected, mutation of residues predicted to participate in salt bridge interactions between C3d and FH corresponds to loss of binding from AESOP. At site 1, alanine mutations of E160, E167, and D292 on C3d and of R1182, R1203, R1215, and K1230 on FH are predicted to cause loss of binding. Each of these residues participates in a high occupancy salt bridge interaction from our MD simulations. At site 2, alanine mutations of K112 and D122 on C3d and of D1119 and K1188 on FH are predicted to cause loss of binding. Once again, these residues participate in high occupancy salt bridge interactions from MD simulations. These results show agreement between AESOP and MD simulations. In general, we are not able to compare aliphatic or hydrogen bond interactions (that are not salt bridges) to results from AESOP, since our computational framework focuses on mutation of ionizable residues. In the directed mutagenesis studies, however, we did perform the mutation Q1139G which resulted in loss of binding at site 2. From MD analysis, Q1139 participated in a hydrogen bond interaction at site 2. Overall, each residue that participates in a high occupancy salt bridge at one binding site is observed as loss of binding mutation in AESOP at that corresponding site.
After comprehensive characterization of electrostatics, structural stability, dissociative properties, associative properties, and predicted free energy of binding, we suggest the physiological relevance of sites 1 and 2 and propose that FH may interact with C3b/d in a multivalent manner. This conclusion is in agreement with experimental studies and mechanisms proposed in literature.3,5,16 We find that binding at site 3 is less likely to be physiological since simulations seem less stable, comparatively, and superposition of the C3d:FH(19-20) structure with the structure of C3b (PDB 2I07) reveals clashing of CCP 19 with a domain of C3b (Supporting Information Fig. S6). We cannot rule out binding at site 3, however, since intermodular flexibility could resolve the clashes with C3b. This same structural comparison suggests that binding of FH CCP 19-20 at site 1 or site 2 is feasible (Supporting Information Fig. S6). In at least one study, binding at site 1 was suggested to require a conformational change of C3b.3 We suggest that the one clash observed from the superpositioning of FH CCP 19-20 with C3b of site 1 may be eliminated by flexibility in the end-to-end binding configuration. With this apically flexible structure, the complex may remain stable as a result of electrostatic interactions. We propose that sites 1 and 2 represent the major modes of binding between C3b and FH. Our results indicate that the complex at site 2 is very stable with a high degree of hydrophobic and electrostatic interactions and the most favorable binding free energy. Site 1, on the other hand, contains the negatively charged acidic patch of C3d and the positively charged FH (CCP 20) that may promote longer-range interactions of FH with C3d or C3b, resulting in a higher association rate than site 2 by two orders of magnitude. 
This property, combined with insights from experimental studies (Supporting Information Table S1), suggests that site 1 is potentially more important than site 2 for binding of FH to host surfaces.
In literature, two studies suggest competing modes of binding. Morgan et al. suggested that the physiological mode of binding between C3d and FH CCP 19-20 occurs at sites 2 and 3. At site 3 specifically, Morgan et al. suggest CCP 20 binds to the thioester domain and polyanionic surfaces simultaneously.2 In this manner, the study hypothesized that FH CCP 19-20 can distinguish between self and nonself. While we cannot rule out this mode of binding, our computational study suggests that such a complex would be less stable and require significant bending at the intermodular domain of CCP 19 and 20. A second study by Kajander et al. suggested a physiological mode of C3b binding at site 2 through FH CCP 19, excluding binding at site 1 for steric reasons.3 Furthermore, they hypothesized a dual interaction of FH CCP 19-20 with C3b at site 2 and either C3d at site 1 or host GAGs. Through this multivalent interaction, literature suggests avidity of FH to self surfaces is greatly increased and discrimination between self and nonself is facilitated.3,5 Our study is in agreement that site 2 is important for C3b binding; however, we further suggest that site 1 may play an important role. Due to the electrostatic nature of the protein interaction, we hypothesize the apical binding conformation of FH CCP 20 is flexible, yet stable. Furthermore, literature mentions bacterial proteins that confer immune resistance bind to the acidic patch of C3d (site 1) in a crystallographic structure, while binding with nanomolar affinity to C3b.17 Thus, site 1 is suggested to play some role in complement regulation and may be accessible in C3b. With regard to the dual interaction of CCP 19-20 with host surfaces, our study is unable to directly assess this behavior. However, structure superpositioning seems to indicate that CCP 20 is largely accessible when binding occurs at site 2 through CCP 19 (Supporting Information Fig. S6).
Assuming that a dual interaction of FH CCP 19-20 does occur, we hypothesize that CCP 19 and 20 complement each other in such a way to increase the apparent affinity of FH for host surfaces. Kajander et al. mention that mutations in CCP 20 significantly reduce the binding of FH(19-20) to C3b.3 We propose that site 1 may be the initial binding mode that precedes binding at site 2 where there exists a higher energetic barrier to dissociation. The electrostatic nature of binding at site 1 may increase the apparent on rate of CCP 19 to site 2. This effect is likely more influential in vivo where CCP 20 is additionally capable of binding to C3d or host GAGs. Additionally, the more thermodynamically stable binding at site 2 likely decreases the apparent off rate of CCP 20 from either C3d or host GAGs.
In conclusion, sites 1 and 2 appear to be the most physiologically relevant binding modes. To assess the dual interaction of FH(19-20) with host surfaces, future mechanistic studies can investigate the structural stability of the resulting quaternary structure. Additionally, mechanistic studies may more directly assess the feasibility of FH(19-20) binding at site 1 on C3b. Further exploration of protein interactions between FH CCP 19-20 and C3b at sites 1 and 2 is required for a deeper understanding of the physiological implications that are relevant to describing the pathophysiology of aHUS.
Materials and Methods
Structure preparation
Crystallographic structures of C3d in complex with FH were downloaded from the PDB using the identification codes 3OXU and 2XQW. From each structure, all pairs of interacting protein chains were identified and superimposed based on the structure of C3d within the chain pair. Expression tags were removed from each sequence, and all chains were trimmed for consistency across all models. As a result, C3d contained residues 10 through 294, and FH contained residues 1109-1166 of CCP 19 and residues 1167-1230 of CCP 20. Note that the first residue of FH after the expression tag is residue 1107. For electrostatic analysis, each pair of interacting protein chains from each PDB model was examined. For subsequent molecular dynamics simulations, only one representative chain pair from 3OXU was used for each of the three binding modes. Site 1 consists of chains B and D, site 2 consists of chains B and F, and site 3 consists of chains A and D, as illustrated in Figure 2. Note that in the 2XQW crystallographic structure, aspartic acid 1119 was mutated to glycine and glutamine 1139 to alanine in order to enable crystallization of the protein complex. Given the important roles these residues play, the mutations pose a significant threat to the integrity of the structural model; however, the 2XQW structure remains in general agreement with 3OXU.
AESOP calculations
In order to analyze the energetic contributions of each individual amino acid to the binding of C3d and FH, our lab has developed and implemented a computational framework for predicting changes in free energy of binding using a mutagenesis perturbation approach.18–22 This method substitutes alanine for each ionizable amino acid, one at a time, and predicts the change in free energy of binding, as a computational alanine scan, according to
$$\Delta\Delta G_{\mathrm{binding}} = \Delta G_{\mathrm{mutant}}^{\mathrm{assoc}} - \Delta G_{\mathrm{parent}}^{\mathrm{assoc}} \qquad (1)$$

$$\Delta G^{\mathrm{assoc}} = \Delta G^{\mathrm{Coul}} + \Delta G_{AB}^{\mathrm{solv}} - \Delta G_{A}^{\mathrm{solv}} - \Delta G_{B}^{\mathrm{solv}} \qquad (2)$$

The predicted electrostatic free energies of binding for the mutants are relative to the parent protein, given by Eq. 1, where $\Delta G_{\mathrm{mutant}}^{\mathrm{assoc}}$ and $\Delta G_{\mathrm{parent}}^{\mathrm{assoc}}$ correspond to electrostatic free energies of association, calculated using a thermodynamic cycle that includes Coulombic and solvation effects, as previously described14,19,20 and shown in Eq. 2, where AB denotes the complex and A and B denote the free proteins. To prepare our collection of structures, a PQR file was generated for each structure of interacting chain pairs (PDB structure) using PDB2PQR23 and the PARSE force field.24,25 This file contained hydrogen atoms omitted in the PDB file and parameters necessary for electrostatic calculations. APBS26 was used for all electrostatic calculations using a solvent dielectric value of 78.54, protein dielectric constant of 20, and a grid spacing of 1 Å. Furthermore, ionic strength of the solvent was set to 150 mM. Electrostatic free energies of association were calculated as described previously.22
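The bookkeeping behind the alanine scan can be sketched in a few lines. This is a minimal illustration, not the AESOP implementation: it assumes the association free energy is the Coulombic term plus the solvation free energy of the complex minus those of the free proteins, and that the individual energies have already been obtained (for example from APBS output). The class name `ElecTerms`, the function names, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ElecTerms:
    """Electrostatic terms for one structure, in any consistent energy unit.

    All field names are hypothetical; values would come from a
    Poisson-Boltzmann solver run on the complex and on each free protein.
    """
    coulombic: float      # Coulombic interaction energy between A and B
    solv_complex: float   # solvation free energy of the complex AB
    solv_a: float         # solvation free energy of free protein A
    solv_b: float         # solvation free energy of free protein B


def association_free_energy(t: ElecTerms) -> float:
    """Thermodynamic cycle: Coulombic term plus the change in solvation."""
    return t.coulombic + t.solv_complex - t.solv_a - t.solv_b


def ddg_binding(mutant: ElecTerms, parent: ElecTerms) -> float:
    """Mutant association free energy relative to the parent.

    A positive value means the mutant associates less favorably than the
    parent, i.e. a predicted loss of binding.
    """
    return association_free_energy(mutant) - association_free_energy(parent)


# Illustrative numbers only (not taken from the study).
parent = ElecTerms(coulombic=-100.0, solv_complex=-500.0, solv_a=-300.0, solv_b=-250.0)
mutant = ElecTerms(coulombic=-80.0, solv_complex=-480.0, solv_a=-290.0, solv_b=-250.0)
print(ddg_binding(mutant, parent))
```

In a full scan, `ddg_binding` would be evaluated once per alanine mutant against the same parent structure, and the sign of each value would be used to label the mutation as a predicted loss or gain of binding.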
In addition to the comprehensive computational alanine scan, we performed computational site directed mutagenesis. All parameters from the previous method were used; however, SCWRL427 was used to replace the specified residues with another user-defined residue. Mutations were selected to compare against previous results found in literature;2,3 however, we only performed mutations on residues that are ionizable or contain partial charges as this is a limitation of our computational method.
MD simulations
For each site of interaction on C3d with FH, three explicit solvent MD simulations were performed using NAMD28 and the CHARMM27 force field.29 Structures were initially minimized with the water molecules contained in the crystallographic structure and subsequently solvated in a cubic TIP3P water box leaving a minimum margin of 12 Å around the protein structure. Sodium and chloride ions were also added to the water box, bringing the ionic strength to 150 mM. Minimization of the solvated structure was performed for 25,000 steps (50 ps) before heating to 310 K over 64 ps. Next, equilibration was performed in stages: all protein atoms were initially constrained with a force constant of 10 kcal/mol/Å², which was then relaxed to 5, 2, and 1 kcal/mol/Å² before constraints were removed altogether for the final equilibration run. Following equilibration, the MD simulation was run with periodic boundary conditions, Langevin temperature and pressure control, and particle mesh Ewald (PME) electrostatics. Relevant parameters for the simulation included a nonbonded interaction cutoff of 12 Å and a switching distance of 10 Å. Bonds involving hydrogen atoms were held rigid using the SHAKE algorithm, which permitted an integration time step of 2 fs. At least one of these simulations was run for 20 ns to establish the time required to reach structural stability within the simulation (i.e., a plateau of interfacial surface area and atomic RMSD throughout the trajectory). The remaining two simulations were run for 10 ns, except in the case of site 3, where simulations required extension to 20 ns.
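Two of the quantities in this protocol follow from simple arithmetic, sketched below. The example assumes a 1:1 salt (so that ionic strength equals the salt concentration), ignores any counter-ions needed to neutralize the protein, and uses a hypothetical cubic box edge of 100 Å that is not taken from the study. The function names are illustrative.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1e-27   # 1 Å^3 expressed in liters


def ion_pairs_for_ionic_strength(box_edge_angstrom: float, molar: float = 0.150) -> int:
    """Number of Na+/Cl- pairs giving the target concentration in a cubic box.

    Assumes a 1:1 electrolyte and that the whole box volume is solvent,
    which slightly overestimates the count for a solvated protein.
    """
    volume_liters = box_edge_angstrom ** 3 * LITERS_PER_CUBIC_ANGSTROM
    return round(molar * AVOGADRO * volume_liters)


def simulated_time_ps(n_steps: int, timestep_fs: float = 2.0) -> float:
    """Convert a step count to picoseconds for a given integration time step."""
    return n_steps * timestep_fs / 1000.0


# Force constants (kcal/mol/Å^2) of the staged equilibration described above.
RESTRAINT_SCHEDULE = [10.0, 5.0, 2.0, 1.0, 0.0]

print(ion_pairs_for_ionic_strength(100.0))  # hypothetical 100 Å box edge
print(simulated_time_ps(25_000))            # 25,000 steps at 2 fs
```

The second call reproduces the correspondence stated in the text between 25,000 steps and 50 ps at a 2 fs time step.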
SMD simulations
Using the longest MD simulation of each binding site on C3d, snapshots from the stabilized portion of the trajectory were clustered according to the RMSD of the FH CCPs relevant for the corresponding binding mode. Specifically, RMSD was based on CCP 20 for sites 1 and 3 but on CCP 19 and 20 for site 2. The resulting dendrograms were split into five bins, and the medoid of each bin was reported as a representative structure for the corresponding binding site. In order to prepare the SMD simulation, each structure was oriented so that the force was applied to FH in the z-direction. The z-direction also corresponds to the normal vector associated with the surface of C3d within 8 Å of FH. Each resulting structure was minimized without solvent for 5000 steps before solvation in a TIP3P water box with a minimum margin of 12 Å in all dimensions and a maximum margin of 36 to 46 Å in the z-direction to allow for a minimum separation of about 10 Å, at which interactions between protein chains were effectively eliminated. Addition of ions was performed as in MD simulations, followed by minimization of 25,000 steps (50 ps). Subsequent heating to 310 K was performed over 32 ps and was followed by one equilibration step with a constraint constant of 10 kcal/mol/Å². In the actual SMD production run, all residues on C3d farther than 12 Å from FH were constrained with a constant of 10 kcal/mol/Å², and a constant velocity of 10 Å/ns was applied to the center of mass of FH. For sites 1 and 3, the center of mass was calculated from all residues in CCP 20, and for site 2 the center of mass was calculated from all residues in CCP 19 and 20. For all stages in SMD simulations, force-field parameters were retained from the MD simulations, and periodic boundary conditions were only used in minimization with solvent and heating. Langevin temperature and pressure controls, PME grid electrostatics, and rigid bonds to hydrogen atoms were used in all steps except minimization without solvent. Furthermore, an integration time step of 1 fs was used for all steps after minimization with solvent.
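The selection of representative structures and the length of a pulling run can be illustrated as follows. This sketch assumes that a pairwise RMSD matrix and a cluster label per snapshot are already available from the hierarchical clustering step; it only shows how a medoid is picked within each cluster and how many steps a constant-velocity pull takes. The function names and the toy matrix are hypothetical.

```python
def cluster_medoids(dist, labels):
    """Return {cluster label: snapshot index of the medoid}.

    The medoid is the member with the smallest summed distance to all
    other members of its cluster. `dist` is a symmetric matrix (list of
    lists) of pairwise RMSD values and `labels` assigns each snapshot to
    a cluster.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    medoids = {}
    for label, members in clusters.items():
        medoids[label] = min(members, key=lambda i: sum(dist[i][j] for j in members))
    return medoids


def smd_steps(distance_angstrom, velocity_angstrom_per_ns=10.0, timestep_fs=1.0):
    """Integration steps needed to pull a given distance at constant velocity."""
    time_ns = distance_angstrom / velocity_angstrom_per_ns
    return round(time_ns * 1e6 / timestep_fs)  # 1 ns = 1e6 fs


# Toy example: snapshots 0-2 form one cluster, snapshot 3 is a singleton.
toy_rmsd = [
    [0.0, 1.0, 3.0, 6.0],
    [1.0, 0.0, 1.5, 5.0],
    [3.0, 1.5, 0.0, 4.0],
    [6.0, 5.0, 4.0, 0.0],
]
print(cluster_medoids(toy_rmsd, [0, 0, 0, 1]))
print(smd_steps(10.0))  # steps to separate by 10 Å at 10 Å/ns with a 1 fs step
```

With five clusters per binding site, `cluster_medoids` would return five snapshot indices, each of which seeds one SMD run.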
Calculation of binding free energies
Excluding the first 100 ps for each MD simulation trajectory of C3d and FH, we calculated free energies of binding for all remaining snapshots according to standard MM/GBSA methods20,30 for the set of replicates corresponding to each binding mode, according to
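$$\Delta G_{n} = G_{n}^{AB} - G_{n}^{A} - G_{n}^{B} \qquad (3)$$

$$G_{\mathrm{MM/GBSA}} = G_{\mathrm{np}} + G_{\mathrm{elec}} \qquad (4)$$

$$G_{\mathrm{np}} = G_{\mathrm{vdw}} + G_{\mathrm{np,solv}} \qquad (5)$$

$$G_{\mathrm{elec}} = G_{\mathrm{Coul}} + G_{\mathrm{elec,solv}} \qquad (6)$$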
Changes in free energy are calculated according to Eq. 3, where AB represents the protein complex and A or B individually represents a protein of the complex. Each right hand term of Eq. 3 is calculated using an equation from Eqs. (4–6), denoted by n. Free energy of binding was calculated and reported according to Eq. 4 as described in previous, published work.20,30 The subscripts np, elec, vdw, solv, and Coul represent nonpolar, electrostatic, van der Waals, solvation, and Coulombic interactions. The first terms on the right hand side of Eqs. 5 and 6 are calculated using atomic radii and the CHARMM27 force field29 within CHARMM. The second right hand term of Eq. 5 is calculated by multiplying a surface tension parameter (γ = 0.005 kcal/mol/Å2) by the protein complex interfacial surface area, while the second right hand term of Eq. 6 is calculated from Generalized Born approximation within CHARMM. After calculating all free energies for triplicate trajectories, we binned values into histograms with a bin size of 5 kJ/mol for each binding site. Next, we fit Gaussians to visualize these distributions. Statistical analysis was performed using Brown-Forsythe and Bartlett's tests to characterize the variance in free energies of binding and the relative contributions of electrostatic and nonpolar interactions.
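A minimal sketch of the per-snapshot bookkeeping is given below. It assumes that each species (complex, A, B) contributes a van der Waals energy, a surface area, a Coulombic energy, and a Generalized Born solvation energy, that the nonpolar solvation term is the surface tension parameter multiplied by surface area, and that the binding free energy is the sum of the nonpolar and electrostatic differences. The dictionary keys, function names, and numbers are hypothetical; in the study these terms were computed within CHARMM.

```python
import math
from collections import Counter

GAMMA = 0.005  # surface tension parameter, kcal/mol/Å^2 (value from the text)


def g_nonpolar(s):
    """Nonpolar term: van der Waals energy plus surface-area solvation."""
    return s["vdw"] + GAMMA * s["sasa"]


def g_electrostatic(s):
    """Electrostatic term: Coulombic energy plus Generalized Born solvation."""
    return s["coul"] + s["gb"]


def delta(term, cplx, a, b):
    """Complex minus free components for a given energy term."""
    return term(cplx) - term(a) - term(b)


def binding_free_energy(cplx, a, b):
    """Sum of nonpolar and electrostatic contributions for one snapshot."""
    return delta(g_nonpolar, cplx, a, b) + delta(g_electrostatic, cplx, a, b)


def histogram(values, bin_width=5.0):
    """Count values per bin; keys are lower bin edges in the input unit."""
    return Counter(math.floor(v / bin_width) * bin_width for v in values)


# Illustrative single snapshot (numbers are not from the study).
cplx = {"vdw": -150.0, "sasa": 20000.0, "coul": -900.0, "gb": -3000.0}
prot_a = {"vdw": -60.0, "sasa": 14000.0, "coul": -500.0, "gb": -2100.0}
prot_b = {"vdw": -40.0, "sasa": 7500.0, "coul": -300.0, "gb": -980.0}
print(binding_free_energy(cplx, prot_a, prot_b))
print(histogram([-77.5, -76.0, -81.0, -72.0]))
```

Because the surface-area term enters as a difference between the complex and the free proteins, it is proportional to the interfacial surface area, which matches the description of the nonpolar solvation term above. The histogram helper works in whatever unit the input values carry, so any unit conversion has to happen before binning.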
Association rate constant prediction
We used the TransComp15,31,32 web server (http://pipe.sc.fsu.edu/transcomp/) to predict association rate constants with an ionic strength of 150 mM. For each binding site, we used the same chain pairs from PDB 3OXU as for MD simulations. Through a combination of force-free Brownian dynamics simulations and electrostatic free energy calculations, this method provides both the association rate constant from random diffusion and an overall association rate constant from both diffusion and long-range electrostatic interactions.15
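The relation between the two reported rate constants can be illustrated with the expression commonly used in transient-complex theory, in which the overall rate equals the diffusion-limited basal rate scaled by a Boltzmann factor of the electrostatic interaction free energy in the transient complex. The sketch below is not the TransComp code; the temperature of 298.15 K, the function names, and the basal rate value are assumptions made for illustration.

```python
import math

GAS_CONSTANT = 0.0019872041  # kcal/mol/K


def overall_rate(basal_rate, dg_el, temperature=298.15):
    """Overall association rate from the basal rate and electrostatic term.

    basal_rate: rate constant from random diffusion alone (M^-1 s^-1)
    dg_el: electrostatic interaction free energy of the transient complex
           (kcal/mol); negative values accelerate association
    """
    return basal_rate * math.exp(-dg_el / (GAS_CONSTANT * temperature))


def dg_el_for_enhancement(fold, temperature=298.15):
    """Electrostatic free energy that yields a given fold rate enhancement."""
    return -GAS_CONSTANT * temperature * math.log(fold)


# Electrostatic free energy corresponding to a 100-fold rate enhancement.
dg = dg_el_for_enhancement(100.0)
print(round(dg, 2))
print(overall_rate(1.0e5, dg))  # hypothetical basal rate of 1e5 M^-1 s^-1
```

Under these assumptions, a difference of two orders of magnitude in association rate between two sites with similar basal rates corresponds to a difference of roughly 2.7 kcal/mol in the electrostatic term.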
Analytical tools
Analysis of computational data was largely performed within R33 using the Bio3D34 package as necessary. Visualization and analysis of protein interactions from the simulation trajectory snapshots were performed within UCSF Chimera.35 In particular, Chimera was used to calculate hydrogen bonds. Aliphatic interactions and salt bridges were calculated with custom scripts in R. VMD36 was used to translate atomic coordinates within proteins as well as to calculate interfacial surface area. Interfacial surface area was calculated by subtracting individual SASA values for each protein from the SASA of the protein complex. Our AESOP framework is available upon request (http://biomodel.engr.ucr.edu/software/index.html).
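Two of these analyses reduce to short calculations, sketched here in Python rather than in the R and VMD scripts used in the study. The interfacial surface area follows the definition given above. The salt bridge occupancy is taken to be the fraction of trajectory frames in which a charged-group distance falls within a cutoff; the 4.0 Å cutoff, the function names, and all numbers are assumptions for illustration, not values reported by the authors.

```python
def interfacial_surface_area(sasa_complex, sasa_a, sasa_b):
    """SASA of the complex minus the SASA of each free protein (Å^2).

    With this sign convention the result is negative when surface is
    buried on binding; its magnitude is the total buried area.
    """
    return sasa_complex - sasa_a - sasa_b


def salt_bridge_occupancy(distances, cutoff=4.0):
    """Fraction of frames in which the distance is within the cutoff.

    `distances` holds one distance per frame (Å) between the charged
    groups of an acidic and a basic residue. The default cutoff is a
    hypothetical choice.
    """
    if not distances:
        raise ValueError("at least one frame is required")
    within = sum(1 for d in distances if d <= cutoff)
    return within / len(distances)


# Illustrative values only.
print(interfacial_surface_area(20000.0, 14000.0, 7500.0))
print(salt_bridge_occupancy([3.0, 3.5, 5.0, 4.0]))
```

A residue pair would be called a high occupancy salt bridge when this fraction exceeds a chosen threshold over the analyzed portion of the trajectory.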
Glossary
- AP: alternative pathway
- C3: complement component 3
- C3b: complement component 3b
- FB: factor B
- C3a: complement component C3a
- C3d: complement component 3d
- FH: factor H
- iC3b: proteolytically-inactive complement component 3b
- CCP: complement control protein
- aHUS: atypical hemolytic uremic syndrome
- PDB: protein data bank
- GAG: glycosaminoglycan
- CR2: complement receptor 2
- AESOP: analysis of electrostatic similarities of proteins
- WT: wild type
- MD: molecular dynamics
- RMSD: root-mean-square deviation
- SASA: solvent-accessible surface area
- SMD: steered molecular dynamics
- MM/GBSA: molecular mechanics/generalized Born surface area
Supporting Information
Additional Supporting Information may be found in the online version of this article.
References
- Thurman JM, Holers VM. The central role of the alternative complement pathway in human disease. J Immunol. 2006;176:1305–1310. doi: 10.4049/jimmunol.176.3.1305.
- Morgan HP, Schmidt CQ, Guariento M, Blaum BS, Gillespie D, Herbert AP, Kavanagh D, Mertens HDT, Svergun DI, Johansson CM, Uhrin D, Barlow PN, Hannan JP. Structural basis for engagement by complement factor H of C3b on a self surface. Nature Struct Mol Biol. 2011;18:463–470. doi: 10.1038/nsmb.2018.
- Kajander T, Lehtinen MJ, Hyvarinen S, Bhattacharjee A, Leung E, Isenman DE, Meri S, Goldman A, Jokiranta TS. Dual interaction of factor H with C3d and glycosaminoglycans in host-nonhost discrimination by complement. Proc Natl Acad Sci USA. 2011;108:2897–2902. doi: 10.1073/pnas.1017087108.
- Schmidt CQ, Herbert AP, Hocking HG, Uhrín D, Barlow PN. Translational mini-review series on complement Factor H: Structural and functional correlations for factor H: structure-function of factor H. Clinical Experim Immunol. 2007;151:14–24. doi: 10.1111/j.1365-2249.2007.03553.x.
- Gros P. In self-defense. Nature Struct Mol Biol. 2011;18:401–402. doi: 10.1038/nsmb.2036.
- Kieslich CA, Vazquez H, Goodman GN, de Victoria AL, Morikis D. The effect of electrostatics on factor H function and related pathologies. J Mol Graph Model. 2011;29:1047–1055. doi: 10.1016/j.jmgm.2011.04.010.
- Kieslich CA, Morikis D. The two sides of complement C3d: evolution of electrostatics in a link between innate and adaptive immunity. PLoS Comput Biol. 2012;8:e1002840. doi: 10.1371/journal.pcbi.1002840.
- Szakonyi G. Structure of complement receptor 2 in complex with its C3d ligand. Science. 2001;292:1725–1728. doi: 10.1126/science.1059118.
- Isenman DE, Leung E, Mackay JD, Bagby S, Van Den Elsen JMH. Mutational analyses reveal that the Staphylococcal immune evasion molecule Sbi and complement receptor 2 (CR2) share overlapping contact residues on C3d: implications for the controversy regarding the CR2/C3d cocrystal structure. J Immunol. 2010;184:1946–1955. doi: 10.4049/jimmunol.0902919.
- Van den Elsen JMH, Isenman DE. A crystal structure of the complex between human complement receptor 2 and its ligand C3d. Science. 2011;332:608–611. doi: 10.1126/science.1201954.
- Shaw CD, Storek MJ, Young KA, Kovacs JM, Thurman JM, Holers VM, Hannan JP. Delineation of the complement receptor type 2–C3d complex by site-directed mutagenesis and molecular docking. J Mol Biol. 2010;404:697–710. doi: 10.1016/j.jmb.2010.10.005.
- Kieslich CA, Morikis D, Yang J, Gunopulos D. Automated computational framework for the analysis of electrostatic similarities of proteins. Biotechnol Prog. 2011;27:316–325. doi: 10.1002/btpr.541.
- Morikis D (2011) F1000. London, UK. Available from: http://f1000.com/prime/10371956. Last accessed 2 February 2014.
- Mohan R, Gorham RD Jr, Morikis D. A theoretical view of the C3d:CR2 binding controversy. Mol Immunol. 2015;64:112–122. doi: 10.1016/j.molimm.2014.11.006.
- Qin S, Pang X, Zhou H-X. Automated prediction of protein association rate constants. Structure. 2011;19:1744–1751. doi: 10.1016/j.str.2011.10.015.
- Prosser BE, Johnson S, Roversi P, Herbert AP, Blaum BS, Tyrrell J, Jowitt TA, Clark SJ, Tarelli E, Uhrin D, Barlow PN, Sim RB, Day AJ, Lea SM. Structural basis for complement factor H linked age-related macular degeneration. J Exp Med. 2007;204:2277–2283. doi: 10.1084/jem.20071069.
- Hammel M, Sfyroera G, Ricklin D, Magotti P, Lambris JD, Geisbrecht BV. A structural basis for complement inhibition by Staphylococcus aureus. Nat Immunol. 2007;8:430–437. doi: 10.1038/ni1450.
- Kieslich CA (2012) Development and applications of a computational framework for protein and drug design. PhD Thesis. Riverside, CA: Department of Bioengineering, University of California. Available from: http://www.escholarship.org/uc/item/6p31w1xs. Last accessed 2 February 2014.
- Gorham R, Kieslich C, Morikis D. Electrostatic clustering and free energy calculations provide a foundation for protein design and optimization. Ann Biomed Eng. 2011;39:1252–1263. doi: 10.1007/s10439-010-0226-9.
- Gorham RD, Rodriguez W, Morikis D. Molecular analysis of the interaction between Staphylococcal virulence factor Sbi-IV and complement C3d. Biophys J. 2014;106:1164–1173. doi: 10.1016/j.bpj.2014.01.033.
- Kieslich CA, Gorham RD, Morikis D. Is the rigid-body assumption reasonable? J Non-Crystal Solids. 2011;357:707–716.
- Gorham RD, Kieslich CA, Nichols A, Sausman NU, Foronda M, Morikis D. An evaluation of Poisson-Boltzmann electrostatic free energy calculations through comparison with experimental mutagenesis data. Biopolymers. 2011;95:746–754. doi: 10.1002/bip.21644.
- Dolinsky TJ, Nielsen JE, McCammon JA, Baker NA. PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations. Nucleic Acids Res. 2004;32:W665–W667. doi: 10.1093/nar/gkh381.
- Sitkoff D, Sharp KA, Honig B. Accurate calculation of hydration free energies using macroscopic solvent models. J Phys Chem. 1994;98:1978–1988.
- Tang CL, Alexov E, Pyle AM, Honig B. Calculation of pKas in RNA: on the structural origins and functional roles of protonated nucleotides. J Mol Biol. 2007;366:1475–1496. doi: 10.1016/j.jmb.2006.12.001.
- Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc Natl Acad Sci USA. 2001;98:10037–10041. doi: 10.1073/pnas.181342398.
- Krivov GG, Shapovalov MV, Dunbrack RL. Improved prediction of protein side-chain conformations with SCWRL4. Proteins. 2009;77:778–795. doi: 10.1002/prot.22488.
- Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kalé L, Schulten K. Scalable molecular dynamics with NAMD. J Comput Chem. 2005;26:1781–1802. doi: 10.1002/jcc.20289.
- MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiorkiewicz-Kuczera J, Yin D, Karplus M. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B. 1998;102:3586–3616. doi: 10.1021/jp973084f.
- Kieslich CA, Tamamis P, Gorham RD Jr, Lopez de Victoria A, Sausman NU, Archontis G, Morikis D. Exploring protein-protein and protein-ligand interactions in the immune system using molecular dynamics and continuum electrostatics. Curr Phys Chem. 2012;2:324–343.
- Alsallaq R, Zhou H-X. Electrostatic rate enhancement and transient complex of protein–protein association. Proteins. 2008;71:320–335. doi: 10.1002/prot.21679.
- Qin S, Zhou H-X. Prediction of salt and mutational effects on the association rate of U1A protein and U1 small nuclear RNA stem/loop II. J Phys Chem B. 2008;112:5955–5960. doi: 10.1021/jp075919k.
- R Core Team (2013) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available from: http://www.R-project.org/
- Grant BJ, Rodrigues APC, ElSawy KM, McCammon JA, Caves LSD. Bio3d: an R package for the comparative analysis of protein structures. Bioinformatics. 2006;22:2695–2696. doi: 10.1093/bioinformatics/btl461.
- Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. UCSF Chimera–a visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084.
- Humphrey W, Dalke A, Schulten K. VMD: Visual molecular dynamics. J Mol Graph. 1996;14:33–38. doi: 10.1016/0263-7855(96)00018-5.