Abstract
In signaling cascades, where domain-motif interactions tend to have relatively low affinity (allowing for reversibility), signaling proteins often encode multiple domains or motifs, which creates the possibility of avidity: a drastic increase in interaction strength and duration as a result of multivalent binding. However, given the large combinatorial space, predicting and validating multivalent interactions that bind with avidity is a challenge. Here, we integrate mechanistic modeling, structure-based analysis, and experimental approaches as a framework for defining the conditions under which avidity plays a role. We explore the tandem SH2 domain family of interactions with bisphosphorylated partners as a multivalent archetype, which encompasses key secondary messengers in tyrosine kinase signaling networks. While certain multivalent interactions have been shown to be necessary in immune receptor recruitment of partners, bivalent recruitment of tandem SH2 domains more broadly is poorly understood. Theoretical modeling suggests that maximum avidity occurs with closely spaced or flexibly linked phosphotyrosine sites, combined with moderate monovalent affinities, which fall within the innate range of SH2 domain affinity. Surprisingly, despite sequence diversity, structure-based analysis showed remarkably conserved three-dimensional spacing between SH2 domains across all tandem SH2 families, which we corroborate experimentally, suggesting evolutionary optimization for avidity interactions. The combination of structure-based analysis of domain spacing with available monovalent experimental data appears to be sufficiently accurate to predict and rank order high affinity interactions of tandem SH2 domain recruitment to the EGFR C-terminal tail. These approaches lay the groundwork for larger utility in multivalent prediction and testing to help better understand protein interactions that drive cell signaling.
Keywords: SH2 domains, protein interactions, computational biology, phosphotyrosine, avidity, bivalent, biolayer interferometry, BLI
Introduction
Avidity is the property that defines the effective binding strength of interactions that involve multiple monovalent interactions. Frequently, avidity is greater than the sum of the individual monovalent affinities, which occurs due to biophysical constraints where the initial interaction locks the unbound interface into a region of space constrained by their shared linker, thus increasing its “effective concentration” and, as a result, its binding driving force (1). It is a well-documented phenomenon in antibody binding, where multiple antibody epitopes boost affinity (2), which scientists have exploited for antibody-drug conjugates to increase therapeutic impact (3, 4). Cell signaling also relies on multivalent interactions to drastically increase sensitivity and specificity in network connections, and one subset of multivalent proteins, termed tandem SH2 proteins because they possess two SH2 domains, is a strong example of this (5, 6). Recruitment of the tyrosine kinase ZAP70, a key regulator of T cell receptor activation, requires both phosphotyrosine (pTyr) sites in the immunoreceptor tyrosine-based activation motif (ITAM) sequences in the CD3ζ chain of the T-cell receptor. Not only are the two SH2 domains of ZAP70 recruited with 100-fold higher affinity than either SH2 domain individually, but deletion of one of ZAP70’s SH2 domains prevents localization of the protein to the T-cell receptor (7, 8). This phenomenon has been well documented in other immune receptors that regulate the T-cell synapse. For example, SYK, the homolog of ZAP70, is recruited to Fc receptors for immunoglobulin E (FcϵRI) with three times greater affinity than either individual domain, and phosphorylation within the SH2 linker has been shown to alter the binding affinity, suggesting that the interaction occurs through avidity that depends on the biophysical constraints of the linker (9).
Furthermore, the programmed cell death protein 1 (PD-1) and B- and T- lymphocyte attenuator (BTLA) both contain an immunoreceptor tyrosine-based inhibitory motif (ITIM) and immunoreceptor tyrosine-based switch motif (ITSM), providing two pTyr sites in close proximity, which recruit tandem SH2 domain containing phosphatases (PTPN6 and PTPN11) to the complex. Avidity is necessary for BTLA recruitment. Like the TCR:ZAP70 interaction, PTPN11 recruitment to BTLA is lost if one of the pTyr sites is removed (10). PD-1 can recruit PTPN11 as long as the ITSM motif is intact – albeit with lower affinity – and there is evidence that PTPN11’s preferred interaction crosslinks two PD-1 proteins in cis by binding to both ITSM motifs (11), an unexpected but potent example of avidity that provides additional context to the importance of receptor dimerization and clustering.
In addition to PTPN6/PTPN11 and ZAP70/SYK, three additional families (PI3K, PLCγ, and RASA1) covering ten total human proteins also contain tandem SH2 domains. Although this accounts for only 10% of the human SH2 domain-containing proteins, they are key proteins that transition signaling from receptor-proximal phosphotyrosine signaling to downstream lipid signaling (PI3K and PLCγ) and serine/threonine signaling (RASA1). Multisite phosphorylation is also more broadly observed beyond immune receptor systems. It occurs on other receptor tyrosine kinases (RTKs) and on downstream targets. For example, GAB1 gets robustly phosphorylated to act as a docking protein for multiple tandem SH2 proteins such as PTPN11, PI3K, and RASA1, allowing for simultaneous regulation of different signaling networks (12). The binding of PTPN11 to Y627/Y659 in GAB1 has been shown to occur at a stronger rate when both pTyr sites are available, which impacts activation of PTPN11 and has downstream consequences for the EGFR-MAPK pathway (13).
Despite the clear importance that bivalent interactions of tandem SH2 proteins play in immune signaling, we generally lack a strong, comprehensive understanding of tandem SH2 domain recruitment to multi-phosphorylated proteins across the broader human proteome. Beyond tandem SH2 domain containing proteins, approximately half of the SH2 domain family of proteins contain two protein interaction modules (e.g. SH3-SH2 and PTB-SH2) (14) and multidomain protein interaction modules are found broadly across the proteome. A challenge to broadly identifying multivalent avidity interactions across the proteome is the complication of modeling multivalent interactions in a framework that captures the biophysics of the constrained interaction and the many species that are formed (such as an N-terminal SH2 domain bound to both pTyr sites and bivalent species bound in different orientations). Errington et al. developed an ordinary differential equation (ODE) model that accounts for the biophysical parameters of the linker constraint, using this to scale the driving force of the second interaction in the complex once the first has formed (15), which we adapted here for the purpose of exploring the theoretical regimes that define asymmetric binding.
An additional challenge in the field of tandem SH2 domain interactions is the experimental validation of a theoretical model, especially for interactions where the bisphosphorylated partners might be far apart and separated by a highly flexible linker – such as in receptor tails (e.g. EGFR). Peptide synthesis of phosphorylated proteins becomes both costly and complex, with limitations on total size and number of pTyr residues that can be incorporated (16). We developed a synthetic toolkit that can produce large amounts of multiply phosphorylated protein, including sites across the entirety of the EGFR C-terminal tail in a single protein construct (17). Using both of these advances, computational and experimental, we set out to test the theoretical modeling results. Using structure-based analysis, we also explored how to inform the linker biophysics for more accurate models, finding that, despite high sequence length variability, tandem SH2 domains are conserved across the family in their spatial distance, suggesting a converged structural property across the family that encodes high avidity partnerships with cognate bisphosphorylated partners. All together, the framework presented here, and the findings in the tandem SH2 domain family, offer an approach for tackling the high complexity of identifying and testing protein interactions with emergent properties that significantly differ from their monovalent parts.
Results
Establishing a computational model of tandem SH2 binding.
Errington et al.’s ordinary differential equation (ODE) model simulates a surface plasmon resonance (SPR) experiment between a multivalent receptor-ligand pair, tracking all possible binding configurations over time, 15 of which exist in a bivalent system (15). Key parameters in the model include the monovalent kon and koff rate constants, species concentrations, and linker parameters. Linker parameters include the sequence length and protein flexibility, as determined by the persistence length (lp), which are used to calculate the “effective concentration” of an unbound species following initial binding of the first domain, thus altering the driving force of the subsequent interaction and producing avidity. We adapted the model to allow monovalent affinity parameterization of all possible SH2-pTyr interactions and to improve control over the persistence length parameter. We modeled dynamic behavior under different SH2 domain concentrations, extracting the fraction bound to both SH2 domains (in either configuration) at equilibrium to estimate the bivalent dissociation constant (KD,B). We additionally simulated the monovalent interaction of each domain with the SH2 ligand by setting the affinity of the other domain to 1 M, which effectively removes that interaction. We refer to these monovalent dissociation constants as KD,N and KD,C and calculate avidity as the inverse of the bivalent dissociation constant (KA,B) divided by the sum of the inverses of the monovalent dissociation constants (KA,N and KA,C) (Equation 1):
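$$\text{Avidity} = \frac{K_{A,B}}{K_{A,N} + K_{A,C}} = \frac{1/K_{D,B}}{1/K_{D,N} + 1/K_{D,C}} \qquad (1)$$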
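As a worked illustration of Equation 1, the short Python sketch below evaluates the avidity from three dissociation constants. The function name `avidity` and its argument names are illustrative choices, not part of the published model code; the example values are the PTPN11:GAB1 parameters and prediction reported in the following subsection (287 nM, 20 μM, and 8.39 nM).

```python
def avidity(kd_bivalent, kd_n, kd_c):
    """Avidity per Equation 1: K_A,B / (K_A,N + K_A,C).

    All three dissociation constants must be given in the same units.
    """
    ka_b = 1.0 / kd_bivalent  # bivalent association constant
    ka_n = 1.0 / kd_n         # monovalent N-SH2 association constant
    ka_c = 1.0 / kd_c         # monovalent C-SH2 association constant
    return ka_b / (ka_n + ka_c)


# PTPN11:GAB1 values quoted in the text, all in nM
ptpn11_gab1 = avidity(kd_bivalent=8.39, kd_n=287.0, kd_c=20000.0)
print(f"avidity = {ptpn11_gab1:.1f}")
```

With these inputs the expression evaluates to roughly 34, in line with the avidity reported for this interaction. Because the weak C-SH2 term contributes little to the denominator, the result is dominated by the ratio of the N-SH2 monovalent KD to the bivalent KD.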
Simulations recapitulate avidity in PTPN11:GAB1 interaction.
To test the ability of this modeling approach to be suitably parameterized from available data and to predict avidity, we selected an interesting cytosolic interaction that is known to occur bivalently – PTPN11 binding to Y627/Y659 bisphosphorylated GAB1 (13). The linker separating the PTPN11 SH2 domains is 9 amino acids long, and the GAB1 pTyr sites are separated by 31 amino acids. The persistence length of the SH2 linker was set to 30Å as has been measured in inter-domain spanning protein linkers. The pTyr linker was set to 4Å to match previously measured persistence lengths of intrinsically disordered proteins (18–20) (Fig. 1A). Estimates of monovalent KD values were taken from a deep re-analysis of fluorescence polarization experiments (21, 22) by Ronan et al. (23). The N-SH2 domain of PTPN11 is estimated to bind to site Y627 with a monovalent KD of 287 nM whereas the C-SH2 domain was not found to bind to either site. We modeled the C-SH2 domain binding to Y659 with 20 μM, based on the experimental limits of detection from the studies (21, 22). For simplicity, we assumed no binding of the C-SH2 domain to site Y627 to prevent interference with the N-terminal interaction. Rate constants were set with a constant koff of 1 s−1, adjusting kon for the desired KD. Although specific rates would be important to dynamic models, here we are specifically interested in using equilibrium behavior to understand overall binding affinity between a tandem SH2 domain and a bisphosphorylated partner.
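The rate-constant convention described above (a fixed koff with kon chosen to give the desired KD) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming KD = koff / kon; the helper name `kon_from_kd` and the dictionary keys are hypothetical labels introduced here.

```python
K_OFF = 1.0  # s^-1, the constant off-rate stated in the text


def kon_from_kd(kd_molar, koff=K_OFF):
    """Return kon (M^-1 s^-1) such that koff / kon equals the requested KD."""
    if not kd_molar > 0:
        raise ValueError("KD must be positive")
    return koff / kd_molar


# Monovalent KD values used for the PTPN11:GAB1 model, in molar
rates = {
    "N-SH2:pY627": kon_from_kd(287e-9),
    "C-SH2:pY659": kon_from_kd(20e-6),
}
for pair, kon in rates.items():
    print(f"{pair}: kon = {kon:.3g} M^-1 s^-1, koff = {K_OFF} s^-1")
```

Fixing koff and varying kon changes the time scale on which each interaction equilibrates, which is why this convention is appropriate only when, as here, the quantity of interest is the equilibrium behavior.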
Fig. 1.
Computational simulations to determine theoretical effects of interaction parameters on avidity. A) The generalized ODE model considers linker parameters for both partners and the monovalent affinities of all pairs. An interaction that is not expected to occur is listed as a non-binder (NB). This toy model is annotated with parameters specifically used to model tandem SH2 domain binding of PTPN11 to doubly phosphorylated Y627/Y659 of GAB1, based on literature and sequence values. B) A serial dilution simulation of surface plasmon resonance (SPR) experiments of PTPN11tan and GAB1 was performed using the ODE model. Equilibrium concentrations of bound protein were used to generate a concentration effect curve and estimate the KD,B. C) We used the model to independently explore the impact of ligand linker biophysics and affinity of the monovalent species. Here, the PTPN11 tandem SH2 domain parameters were used, and either the monovalent affinities noted in Fig. 1A were fixed while the linker parameters were varied, or the linker parameters were fixed while the monovalent affinities were varied. The heat map color indicates the avidity measured under each condition, with black indicating an inability to achieve bivalent binding. D) The kinetics of transient binding (interaction does not reach equilibrium) of the SH2 domain are plotted here under different model parameters that yield varying levels of avidity. The black curve showcases the behavior of a monovalent species that experiences no avidity.
Simulations were performed with a series of eleven PTPN11 concentrations, and the resulting concentration effect curve provided a predicted KD,B of 8.39 nM and avidity of 34, meaning the presence of both SH2 domains makes the binding affinity of the interaction 34-fold stronger than the individual monomers (Fig. 1B). In short, the capacity for tandem SH2 domain interaction with GAB1 converts a moderate affinity interaction that could at best be 287 nM to a strong interaction of 8.39 nM, which has significant implications for the degree and duration of the PTPN11:GAB1 interaction and suggests that the modeling approach informed by available monovalent interaction data successfully recapitulates a known bivalent interaction property.
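The step from equilibrium bound fractions to a KD,B estimate can be illustrated with a minimal Python sketch. It is not the fitting procedure used in this work: it assumes a simple 1:1 binding isotherm, a least-squares grid search over log10(KD), and a hypothetical eleven-point two-fold dilution series starting at 1 μM, with noise-free synthetic data generated from the predicted KD,B of 8.39 nM.

```python
def fraction_bound(conc, kd):
    """1:1 binding isotherm (assumed functional form for this sketch)."""
    return conc / (kd + conc)


def fit_kd(concs, fractions, log10_lo=-12.0, log10_hi=-3.0, steps=9001):
    """Least-squares estimate of KD by scanning log10(KD) on a uniform grid."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps):
        kd = 10 ** (log10_lo + (log10_hi - log10_lo) * i / (steps - 1))
        sse = sum(
            (fraction_bound(c, kd) - f) ** 2 for c, f in zip(concs, fractions)
        )
        if best_sse > sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Hypothetical eleven-point, two-fold dilution series starting at 1 uM (molar)
concs = [1e-6 / 2 ** i for i in range(11)]
true_kd = 8.39e-9  # the KD,B predicted in the text, used to make synthetic data
fractions = [fraction_bound(c, true_kd) for c in concs]

kd_est = fit_kd(concs, fractions)
print(f"estimated KD = {kd_est * 1e9:.2f} nM")
```

The grid spacing of 0.001 log units bounds the recoverable precision at well under 1%, which is ample for a sketch; a gradient-based fit would be the natural choice for noisy data.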
The role of linkers on shaping avidity.
Having demonstrated that PTPN11 is capable of experiencing avidity, we used this interaction to explore the theoretical effects of avidity with a binding partner of different linker biophysics, since this parameter drives avidity. Keeping the model linker parameters for PTPN11 and fixing KD,N and KD,C at 4 μM, we simulated a series of receptor linker and persistence lengths while calculating avidity (Fig. 1C). We observed maximum avidity occurs in conditions where the pTyr linker is relatively short or when it is highly flexible. As the distance between the pTyr sites increases and the linker becomes more rigid, avidity decreases until bivalent binding is not possible. Across the theoretical range of linkers, it turns out that the maximum avidity attainable is matched by that of the PTPN11:GAB1 interaction, suggesting that partnership is fairly optimized.
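To give a feel for how linker length shapes the effective concentration, the Python sketch below uses a Gaussian-chain approximation for a single flexible linker. This is a simplified stand-in, not the linker treatment of the ODE model: it assumes a mean-square end-to-end distance of 2·lp·L (reasonable only when the contour length L is much longer than lp), a contour length of 3.8 Å per residue, and that the linker must span 17.3 Å, the structure-based PTPN11 SH2 spacing reported later in the text. The function and constant names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23
ANGSTROM3_PER_LITER = 1e27
RESIDUE_LENGTH_A = 3.8  # assumed contour length per residue, in angstroms


def effective_concentration(n_residues, persistence_length_a, span_a):
    """Effective molar concentration of a tethered site at distance span_a.

    Gaussian-chain approximation with mean-square end-to-end distance
    2 * lp * L; only meaningful when L is much longer than lp.
    """
    contour = n_residues * RESIDUE_LENGTH_A
    mean_sq = 2.0 * persistence_length_a * contour
    density = (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5 * math.exp(
        -3.0 * span_a ** 2 / (2.0 * mean_sq)
    )  # probability density in 1/angstrom^3
    return density * ANGSTROM3_PER_LITER / AVOGADRO


# Flexible pTyr linker (lp = 4 angstroms) of increasing length, fixed 17.3 A span
c_eff = {n: effective_concentration(n, 4.0, 17.3) for n in (31, 100, 200)}
for n, c in c_eff.items():
    print(f"{n:>3} residues: C_eff = {c * 1e3:.2f} mM")
```

Under these assumptions the effective concentration falls as the flexible linker grows well beyond the distance it needs to span, which is the qualitative trend behind the loss of avidity for widely separated pTyr sites.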
The role of monovalent affinities on shaping avidity.
The other key parameter that determines the driving force behind a tandem SH2 interaction is the monovalent affinity of each SH2 domain. To test the impact of monovalent KD on avidity, we fixed the linker parameters to those of PTPN11 and GAB1 and then independently varied the monovalent affinities (Fig. 1C). The strongest avidities occurred when each monovalent affinity was moderately high. As either KD,N or KD,C became too large or too small, avidity decreased. The behavior in these two regimes is quite different, however, from a protein-protein interaction standpoint. Under high monovalent affinities, interactions are still strong, and there is diminishing benefit or need for a second recruitment site. On the opposite end of the avidity drop-off, under low enough monovalent affinities, the partners are not interacting and there is no initial binding event that seeds the biophysical constraint that leads to a secondary binding event. Theoretically, the range where avidity is strongest occurs in the typical range of human SH2 domain binding (0.1 to 10 μM) (24).
The primary model of avidity is each domain binding with near exclusivity to one cognate partner on the other protein with little to no cross-reactivity, such as in the PTPN11-GAB1 model. To explore how cross-reactivity between partners impacts avidity, we altered the relative contributions of each potential domain/pTyr interaction towards the overall KD,N and KD,C by increasing the degree of cross-reactivity for both domains. Monovalent affinity remained constant, with KD,N–N and KD,C–C decreasing to compensate for the increases in KD,N–C and KD,C–N (Fig. S1). The maximum avidity decreases with cross-reactivity, though it remains in the greater than 15-fold range when both domains have reasonably strong affinity (2.5 μM) for the ligand, except under extreme conditions where both domains are competing for the same single pTyr site with near exclusivity. However, moving to a model where there is one strong recruitment domain (2.5 μM) and one weaker domain (25 μM) results in a significantly larger reduction in avidity with the introduction of cross-reactivity. Hence, the characteristics of high avidity include strong preferences for separate domain specificity with minimal cross-reactivity.
The effects of avidity on binding duration.
Signaling proteins bind in a transient manner, meaning they bind, unbind, and typically re-bind a number of times before fully dissociating and moving away within the cell (25). Since we have established that avidity within tandem SH2 proteins likely occurs due to an increased effective concentration of unbound domain upon initial binding of the first SH2 domain, it stands to reason that a multivalent protein in the process of unbinding from a target where one domain is still bound would experience the same increased driving force. In this manner, we hypothesize that protein interactions that experience stronger avidity will re-bind with a stronger affinity, thus extending the amount of time two proteins are engaged. To test this hypothesis, the PTPN11:GAB1 interaction was used as a baseline, and linker and monovalent KD parameters were altered to achieve different avidities, including a simulated monovalent condition where no avidity occurs. The different conditions were then used to simulate a transient binding interaction where there is no longer a constant influx of PTPN11 to maintain a steady concentration and drive the system to equilibrium, and instead the proteins are allowed to bind and fully unbind from one another (Fig. 1D). As hypothesized, the conditions that represented a larger avidity also resulted in the proteins remaining bound for a significantly longer period of time.
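The rebinding argument above can be made concrete with a minimal three-state sketch in Python. It is not the ODE model used for Fig. 1D: it assumes two equivalent domains sharing one koff, a singly bound state that either closes to the doubly bound state at rate kon·C_eff or fully dissociates at rate koff, and no rebinding from solution. The function name and the C_eff values are hypothetical; the kon corresponds to a 4 μM monovalent KD with koff = 1 s−1.

```python
def mean_residence_time(koff, kon, c_eff):
    """Mean time (s) from first contact to full dissociation.

    Three-state scheme assumed for this sketch:
      S (one domain bound) to B (both bound) at rate kon * c_eff
      B to S at rate 2 * koff (either domain may release)
      S to U (unbound, absorbing) at rate koff
    Solving the first-passage equations from S gives
      1 / koff + k_close / (2 * koff**2).
    """
    k_close = kon * c_eff  # intramolecular ring-closing rate, s^-1
    return 1.0 / koff + k_close / (2.0 * koff ** 2)


koff = 1.0          # s^-1
kon = koff / 4e-6   # M^-1 s^-1 for a 4 uM monovalent KD
c_eff_values = [0.0, 1e-4, 1e-3, 1e-2]  # molar; 0 mimics a monovalent binder
times = [mean_residence_time(koff, kon, c) for c in c_eff_values]
for c, t in zip(c_eff_values, times):
    print(f"C_eff = {c:g} M: mean residence time = {t:g} s")
```

With C_eff set to zero the expression reduces to 1/koff, the monovalent lifetime, and the residence time grows linearly with the effective concentration, consistent with the hypothesis that stronger avidity prolongs engagement.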
Establishing an experimental model of tandem SH2 domain binding.
Although the adapted ODE modeling approach appears to describe a known bivalent interaction well, experimental validation of model outputs is necessary, especially since there may often be uncertainty in the parameterization of monovalent affinities and linker biophysics. Therefore, we set out to establish a complementary experimental validation. We selected biolayer interferometry (BLI) for measuring binding kinetics between an SH2 domain and a phosphorylated peptide partner immobilized to a streptavidin probe. However, synthesizing multiply phosphorylated peptides is challenging and expensive, especially at the length necessary to cover physiologically relevant partners. This limitation was overcome by utilizing our lab’s SISA-KiT system which involves co-expressing a protein fused to a p40 polyproline sequence with a constitutively active tyrosine kinase fused to an ABL SH3 domain (Fig. 2A). The affinity between the ABL SH3 domain and the p40 sequence localizes the kinase to our protein of interest, driving phosphorylation and helping us achieve multi-site phosphorylation. This approach has the benefit of cheaply and easily producing large batches of phosphorylated proteins, although it produces a mixture of peptide species, including nonphosphorylated, singly phosphorylated, and doubly phosphorylated partners. In some ways, this experimental system is a strength – it captures the complexity of ligand presentation in cells, where it is unlikely that tandem SH2 domains interact with a pure population of doubly phosphorylated partners. We set out to establish the relevancy of this experimental system and to test model predictions of tandem SH2 domain binding.
Fig. 2.
Experimental testing of the PTPN11:GAB1 interaction. A) SISA-KiT was used to phosphorylate both pTyr sites within the GAB1 region expressed using SRC kinase catalytic domain. The secondary interaction used in SISA-KiT to enhance phosphorylation of the substrate is the ABL SH3 domain with the p40 polyproline sequence (APTYSPPPPP). B) The degree of phosphorylation on purified GAB1 was evaluated by running native PAGE, immunoblotting with MYC, and confirming by pan-specific phosphoantibody. GAB1 WT refers to an unmutated GAB1 region, Y1F is a Y627F mutant, and Y2F is a Y627F/Y659F negative control. C) Purified GAB1 proteins co-expressed with kinase were biotinylated using BirA ligase and streptavidin BLI tips were used to measure the binding kinetics with SH2 domain. Shown here are the background corrected on- and off-kinetics and the fit of the BLI data with a 1:1 kinetic model. We adjusted the serial dilution series to match the predicted affinity, and the specific concentrations of SH2 domain in the experiment are indicated. Where fit was sufficiently reliable in the experiment, we report an effective affinity (KD,eff). D) Chimeras were produced by swapping out the PTPN11N and PTPN11C SH2 domains with RASA1N and PLCγ1C, respectively. Simulations of these chimeras interacting with GAB1 were performed in the model, and BLI experiments were performed to calculate KD,eff for each new species.
Experimentally testing PTPN11:GAB1 interaction.
To evaluate the general experimental approach of recombinant production of bisphosphorylated proteins using SISA-KiT, we selected the positive control GAB1 and PTPN11 interaction. We expressed the approximately 100 amino acid long span (residues 590 to 694) of the GAB1 C-terminus, where Y627/Y659 were the only tyrosines, along with a non-phosphorylatable control (Y2F) and an additional form with only Y659 (Y1F). To test the yield of phosphorylation, we used native PAGE to resolve the phosphorylated forms, estimating that 20% of our GAB1 protein was doubly phosphorylated and 30% was singly phosphorylated (Fig. 2B). These constructs were engineered to include Avitag™, which we used for biotinylation so that interactions with GAB1 species could be measured by BLI. For BLI experiments, we tested three serial dilutions of SH2 domain protein, using the Y2F control to test for the phosphospecific binding of the interaction, and performed a global fit to measure binding affinity. The range of serial dilutions was adjusted throughout the experiments to better match the affinity of each interaction, since binding saturation leads to nonlinearities and the inability to accurately extract on and off rates (Fig. 2C–D). Like other experimental kinetic or thermodynamic analyses, the BLI signal represents the mixture of all configurations of the SH2 domain protein bound to the ligand, and hence we refer to the measured affinity from BLI as an “effective” affinity (KD,eff).
We hypothesized that if the degree of doubly phosphorylated protein was sufficient to induce bivalent binding, then the tandem SH2 domain binding would be significantly stronger than the monovalent SH2 domain binding. Hence, for this experimental test we also tested the monovalent interactions of the N-terminal and C-terminal PTPN11 domains with the phosphorylated GAB1, allowing us to calculate the avidity effect of the interaction (Fig. S2). By presenting the same phospholigand species mixture to individual N- or C-terminal domains, the measurement reflects the combined KD,N and KD,C interaction with both phosphorylation sites and can be compared directly to the tandem SH2 domain interaction. Throughout this work, we report the lowest KD,eff of duplicate experiments, except in cases of poor fit, as determined in prior work (23).
Consistent with the literature values, individual PTPN11 domains bound weakly to GAB1, whereas the tandem PTPN11 domains bound strongly with an estimated KD,eff of 5.92 nM, 34-fold stronger than the highest monovalent affinity, matching the avidity predicted by the model. In contrast with the literature affinities, it was PTPN11N that appeared to be below the signal of our experimental range and PTPN11C that bound with a KD,eff of 200 nM. The weakest affinity we can detect is limited by the concentration of SH2 protein we are able to produce, and it is possible that our results were additionally limited by misfolding or loss of activity of the isolated N-terminal SH2 domain in our monovalent experiments. However, the bivalent affinity demonstrates strongly that both domains are functionally intact in the tandem domain species. The model parameter of 287 nM for the monovalent species agrees well with the measured 200 nM monovalent interaction, and the predicted bivalent affinity of 8.39 nM is very similar to the experimentally measured 5.92 nM. Together, these suggest that, as a set of interactions, the model affinities and the linker parameters were well calibrated to predict the overall increase in binding affinity in tandem interactions, which is more than an order of magnitude (34-fold) stronger than the best monovalent affinity. Importantly, these results demonstrate that the experimental approach of producing large phosphorylated species, coupled with BLI, is capable of measuring high avidity interactions to test model predictions.
SH2 chimeras to interrogate model parameters.
To overcome the signal limitations of testing low affinity SH2 monomers with BLI, we developed a new approach for evaluating the discrepancy in monovalent affinities of each PTPN11 domain. We selected two alternate tandem SH2 domain proteins, RASA1 and PLCγ1, which also had previously characterized binding affinities with GAB1 Y627 and Y659 (23). We made two domain “swaps” that would allow us to isolate and interrogate each monovalent PTPN11 domain under bivalent conditions that are better for BLI – increasing protein size and boosting affinity within the optimum range of the instrument (Fig. 2D). The PTPN11N - PLCγ1C chimera replaces what we experimentally found to be the important PTPN11:GAB1 interaction with a new C-terminal domain with previously measurable affinity to both GAB1 sites. The RASA1N - PTPN11C chimera replaces the immeasurable PTPN11 N-terminal domain with an interaction previously characterized with GAB1 Y627 as 3.97 μM. We simulated and experimentally tested these chimeras, keeping the originally proposed PTPN11 domain interaction parameters (Fig. 1A). For the PTPN11N - PLCγ1C chimera, the model and BLI experiments were in strong agreement with a KD,B of 6.98 nM and a KD,eff of 6.8 nM, suggesting that our parameterization of the PTPN11 N-SH2 domain is fairly accurate, and the monomer is binding with an affinity of roughly 300 nM. However, for the RASA1N - PTPN11C chimera, the model and experimental results differed by a wide margin. The model provided a KD,B prediction of 94.3 nM whereas BLI yielded a KD,eff of 23 nM. This disagreement suggests that the parameterization of the PTPN11 C-SH2 affinity of 20 μM was inaccurate, and the monomer is capable of binding with a stronger affinity as supported by the PTPN11C BLI results. This is not surprising given the 20 μM parameter was arbitrarily chosen due to limitations in literature data.
These results highlight the importance of having accurate experimental data to parameterize the ODE model. However, the strong agreement between the model and experimental results of the PTPN11tan:GAB1 interaction, despite the uncertainty surrounding PTPN11C, demonstrates the ability of the model to handle errors in one domain if the remaining monovalent interaction is relatively strong and can drive avidity. Furthermore, these results introduce the idea that certain pTyr sites within the human proteome that are not expected to interact with any SH2 domain-containing protein, based on available monovalent binding experiments, may actually be critical binding sites that enhance bivalent binding of a tandem SH2 domain partner, changing both the amount and duration of the interaction.
Exploring generalizability of tandem SH2 domain binding.
Having established that the computational approach, when parameterized by previously defined monovalent affinities and estimates of linker parameters, can successfully predict high affinity bivalent interactions, and that these predictions can be confirmed experimentally, we set out to explore the generalizability of this approach to more broadly predict tandem SH2 domain recruitment. First, we asked if we could develop predictions for each family member of the tandem SH2 domains, which vary in the SH2 domain linker region. Second, we asked if we could scan a set of different pTyr pairs along a receptor tyrosine kinase C-terminal tail and identify the preferred binding partners of an SH2 domain.
Expanding to all tandem SH2 domain family members.
Across the five structurally homologous groups of tandem SH2 proteins, the SH2 linker is highly variable. Based on sequence alone, PTPN11 and PLCγ1 are similar in length being 9 and 10 amino acids long, respectively. On the other hand, the ZAP70 linker is 60 amino acids long and rich in alpha helices, making it significantly more rigid (the persistence length of alpha helices is typically between 150 – 200Å (26, 27)). Similarly, the PIK3R1 linker is rigid and 195 amino acids long. RASA1’s SH2 domains are split by an SH3 domain, providing moderate rigidity and a length of 78 amino acids (28). Using reasonable estimates for SH2 domain persistence lengths, the model-based approach predicts highly specific individual family member models. To evaluate the accuracy of these estimates we measured the physical distance across the linker for available PDB structures. Surprisingly, while the model-estimated distances between SH2 domains varied widely, the structure-estimated distances were much more consistent across all families, based on average distance across structures (Fig. 3A). There was only a 23Å difference across all tandem SH2 domains based on structure (17.3Å for PTPN11 versus 40.9Å for ZAP70), versus a model predicted 290Å difference (27Å for PTPN11 versus 321.75Å for PIK3R1). This suggests that despite exhibiting high variability within the sequence of their SH2 linkers, tandem SH2 proteins evolved to maintain a strongly conserved physical distance between domains.
Fig. 3.
Evaluating linkers between all tandem SH2 domains across human SH2 families. A) The structure-based end-to-end distance versus the primary sequence length for each tandem SH2 protein family is shown. Structure-based distance is averaged across available structures for that protein. B) We compared the effect on avidity of two possible linkers for ZAP70 using a model parameterized by either the primary sequence-based estimate or the structure-extracted distance. C) We tested the hypothesis that structure-based distance is similar between family members, despite high sequence variability, by replacing the short linker region between the PTPN11 SH2 domains with the large, helical region of ZAP70. We modeled and experimentally measured the effective affinity of the PTPN11ZAP70 chimera with doubly phosphorylated GAB1 Y627/659. The experimentally measured chimera affinity (6.79 nM) is very close to that of PTPN11 binding to GAB1 (5.92 nM), indicating that the model estimate using structure-informed parameters (9.24 nM) is significantly closer to experiment than the estimate using the primary sequence length (396.6 nM).
To evaluate the effect of the modeled linker versus the structure-extracted linker distances on avidity, we performed a computational experiment. We set KD,N and KD,C constant at 4 μM and performed a pTyr linker sweep with ZAP70 using both the model-estimated and structure-estimated linker lengths (Fig. 3B). There is a striking difference, not just in the idealized pTyr partner linker parameters that create optimal avidity, but also in the maximal avidity, which would be lower overall if the ZAP70 SH2 domains were separated by the larger distance expected by the model. These data suggest that the highly conserved physical separation of all tandem SH2 domains likely plays a role in enabling them to bind with high avidity.
To confirm that the tandem SH2 domain separation is parameterized best by the structure-extracted distance, we created a linker “swap”. ZAP70 has the largest structure-predicted difference in separation distance compared to PTPN11, so we replaced the PTPN11 SH2 domain linker with the ZAP70 linker (Fig 3C). The (PTPN11ZAP70) chimera is predicted (parameterized by the structure-based distance) to bind with high avidity on a similar scale to the original PTPN11 wild type protein (9.24 nM compared to 8.39 nM of PTPN11). Experimentally, the PTPN11ZAP70 bound to GAB1 with an effective affinity of 6.79 nM, closely matching the model estimate and, as expected, being only slightly higher than the original PTPN11tan KD,eff of 5.92 nM (Fig. S4). In comparison, when parameterized to match the longer model-predicted linker length of ZAP70, the interaction was predicted to achieve an effective affinity of 369.6 nM, clearly demonstrating that the structure-based estimates of distance are better representations of the biophysical structure of tandem SH2 proteins.
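As a quick arithmetic check on this comparison, the fold difference between each model prediction and the measured chimera affinity can be computed directly. This is a minimal sketch using only the values quoted in the paragraph above; the function and variable names are illustrative and are not part of the published code.

```python
# Values quoted in the paragraph above (nM); names are illustrative only.
measured_chimera = 6.79   # BLI-measured KD,eff of the PTPN11-ZAP70 chimera
pred_structure = 9.24     # prediction using the structure-based linker distance
pred_sequence = 369.6     # prediction using the sequence-based linker length


def fold_difference(predicted, measured):
    """Ratio of the larger to the smaller value, so the result is always >= 1."""
    return max(predicted, measured) / min(predicted, measured)


fold_structure = fold_difference(pred_structure, measured_chimera)
fold_sequence = fold_difference(pred_sequence, measured_chimera)

print(f"structure-based prediction: {fold_structure:.2f}-fold from measurement")
print(f"sequence-based prediction:  {fold_sequence:.1f}-fold from measurement")
```

Under these numbers the structure-based prediction lies within about 1.4-fold of the measurement, whereas the sequence-based prediction is off by more than 50-fold.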
Given this new insight into the importance of structure-based linker estimates and to confirm the accuracy of our previous investigations into PTPN11:GAB1, new simulations of these dynamics were performed using the structure-based PTPN11 linker parameters where the linker length is 5.75 amino acids (SFig. S5). Since there is a minimal difference between the model- and structure-based estimates of the PTPN11 linker, the impact of this change was negligible, and our conclusions from the PTPN11:GAB1 studies remain sound. The results also suggest that the tandem SH2 domain family members are not only constrained to similar physical separation, but are also all “tuned” to bind similarly spaced pTyr residues.
Ranking EGFR phosphotyrosine pair recruitment of PLCγ1.
We have demonstrated the utility of the ODE model to investigate the PTPN11:GAB1 interaction, including chimeras and their impacts. Next, we wanted to test whether the framework could translate well to a new set of interactions. We elected to study recruitment of PLCγ1 to pTyr pairs along the EGFR C-terminal tail, where bivalent interactions are not yet well characterized. The EGFR C-tail contains nine pTyr sites, and both SH2 domains of PLCγ1 have a measured monovalent affinity towards at least four of these sites (Fig. 4A). We used the ODE model to predict the KD,B of each potential pTyr pair. However, for two pairs spanning almost the entire length of the tail (998/1197 and 1016/1197), the estimation of the linker biophysics could not be solved within the memory limits of our computing environment, so these pairs were not considered. All remaining pair predictions were ranked by bivalent avidity (Fig. 4B), suggesting a broad range of effective affinities, some with very high avidity.
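The pair enumeration described above can be sketched as follows. This is an illustrative example only: it uses just the EGFR sites named explicitly in this section rather than all nine C-tail sites, and it reports sequence separation, not the predicted KD,B, which requires the full ODE model.

```python
from itertools import combinations

# Only the EGFR C-tail sites named explicitly in this section; the full tail has nine.
sites = [998, 1016, 1092, 1138, 1197]


def intervening_residues(a, b):
    """Number of residues strictly between two sites in the primary sequence."""
    return abs(b - a) - 1


# All unordered site pairs, sorted from the closest to the most widely separated.
pairs = sorted(combinations(sites, 2), key=lambda p: intervening_residues(*p))

for a, b in pairs:
    print(f"{a}/{b}: {intervening_residues(a, b)} intervening residues")
```

In this subset, the two most widely separated pairs are 1016/1197 and 998/1197, the same two pairs for which the linker estimation could not be solved.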
Fig. 4.
Predicting and testing bivalent recruitment of PLCγ1 to EGFR C-terminal tail phosphotyrosine pairs. A) Literature-based monovalent affinities of PLCγ1 with pTyr sites across the EGFR C-terminal tail (23), where N is the N-terminal SH2 domain affinity, C is the C-terminal SH2 domain affinity, and the array is proportional to affinity. B) Using literature affinity values, structure-based SH2 domain distance, and estimated biophysical distances for the disordered tail, we predicted the bivalent binding KD,B of PLCγ1 to EGFR pTyr pairs. Orange and gray bars indicate pairs experimentally tested (orange where interactions were successfully tested in our experimental system). C) The experimental BLI data for the Y7F pTyr pairs shown in orange in panel B and a Y9F control, with the effective affinity for each interaction.
We selected seven pTyr pairs spanning a range of bivalent affinity predictions for experimental validation. Here, we compared Y7F EGFR C-tail constructs containing two tyrosines to a Y9F control (no tyrosines). We found that native PAGE was unable to sufficiently separate the phospho-species of these large proteins, so we also used a Phos-Tag approach (29), where we were able to achieve separation of some species of phosphorylation. We used phosphospecific antibodies to the EGFR sites to confirm that we had successfully produced phosphorylation of target sites in the kinase conditions selected (SFig. S6). However, without the ability to resolve doubly phosphorylated forms, we chose to go forward with BLI testing knowing that if an interaction was observed at higher affinity than the monovalent data it suggests that sufficient bisphosphorylation had been achieved for evaluation of the EGFR:PLCγ1 partnership to be tested.
Three pTyr pairs predicted to have the highest binding affinity produced strong BLI responses, and the predicted rank was recapitulated by experiments: 1016/1138 was strongest (KD,eff of 63.4 nM), followed by 1138/1197 (KD,eff of 116 nM) and 1016/1092 (KD,eff of 157 nM) (Fig. 4C, SFig. S7). Unfortunately, the other four EGFR pairs tested did not achieve an adequate fit across any replicates, so their KD,eff cannot be reported (SFig. S8). Poor binding behavior in the BLI system might be attributed to insufficient bisphosphorylated receptor or a limitation of the maximum concentration of PLCγ1tan, both of which would be problematic for these lower affinity interactions along the spectrum of predicted affinity. Regardless, the three strongest pairs that were measurable in our system matched the model-predicted rank order, suggesting that the model can be used to effectively estimate preferred binding partners within a complex system and is useful in wider contexts beyond the PTPN11:GAB1 interaction.
While the relative affinity matches the predicted order, it should be noted that the exact KD,eff value for each EGFR pair differs from its predicted KD,B by a wider margin than observed in the PTPN11:GAB1 interaction – with systematically stronger model predicted affinities. Our model assumed similar flexibility of the EGFR C-tail as the GAB1 linker (a persistence length of 4Å), but it is possible that the C-tail achieves a higher rigidity following phosphorylation which could account for the reduced affinity. However, the C-tail clearly maintains a relatively high degree of flexibility as evidenced by the ability for PLCγ1 – with its short 10 amino acid SH2 linker – to incur strong bivalent binding with sites separated by 121 amino acids in the 1016/1138 pair. Although estimating the specific persistence length of disordered regions like receptor tyrosine kinase tails may lead to some general errors, it is encouraging that despite this the approach is able to identify affinity interactions that emerge from bivalent cooperation and their relative rank. Importantly, these results demonstrate that tandem SH2 domain recruitment may be a much more generalized phenomenon in tyrosine kinase signaling than was previously anticipated and that these bivalent events might span regions much larger than the spacing found in the classic immune ITAM-type pTyr patterns.
Discussion
A guiding principle in the field of phosphotyrosine signaling is that the relatively low affinity of SH2 domains (on the order of μM) is important to the reversibility of signaling – allowing the dissociation of signaling complexes and phosphatase access (30–32). This study, along with existing knowledge in the immune receptor system, indicates that tandem SH2 domain containing proteins can be recruited with a strikingly higher affinity (on the order of nM). Both total amount and dynamics shape cellular outcomes (33, 34), and increasing this by 30- to 100-fold has profound impacts on protein interactions and downstream consequences. Notably, we found that the normal range of affinity for an SH2 domain (0.1 - 10 μM) matches well to the range of monovalent KD values that maximize avidity, providing a new perspective on SH2 binding affinity. Perhaps the relatively low affinity of SH2 domains evolved because it optimizes the advantage of multivalency within the tandem SH2 proteins to allow for these dramatic improvements in affinity. Additionally, we found it interesting that the tandem SH2 domain families appear to have evolved relatively similar geometries in 3D structure that set them up exceedingly well for high avidity interactions and were surprised to find that the EGFR C-terminal tail allowed for high avidity interactions, despite significant sequence separation between sites – an experimental finding that would not have been possible via peptide synthesis methods. Given the breadth of signaling proteins covered by the tandem SH2 domain family, there is a possibility that almost all phosphotyrosine signaling is governed to some degree by tandem SH2 domain interaction avidity. Finally, if effective affinities are really in the nM range, then active forms of reversibility may be necessary and might explain findings in the regulation of SH2 domain linker regions (9) and on or near binding interface residues of the SH2 domain (14).
Given the high degree of multidomain interaction units throughout the proteome, it is likely that broad swaths of molecular biology research should consider avidity, yet the combinatorial complexity of these systems has created barriers to such avidity incorporation. Combining mechanistic modeling and structure-informed parameterization presents a broader framework for identifying protein interactions with high effective affinities due to multivalency, and it provides theoretical constraints that would help identify those interactions that experience avidity, such as matched distancing between interfaces and the presence of a monovalent interaction strength that is capable of initially recruiting and a second site that has the capacity to stabilize this interaction. Even with limited information about monovalent species or details of linker biophysics, modeling helps to bound the reasonable space of whether high affinity interactions might occur and even to rank-order them, even if it cannot perfectly predict effective affinity. Additionally, the synthetic toolkit for driving tyrosine phosphorylation on proteins can easily be extended to cover alternate post-translational modifications and mixtures of bivalency (e.g. polyproline and a pTyr for SH3-SH2 testing and possibly acetylation recruitment of tandem bromodomains). Of course, even though modeling and experiments of these two-part systems suggest important findings for signaling consequences, they do not yet consider competition, cellular localization constraints, or other dynamic processes that are also at play in physiological signaling systems.
Despite this, the framework presented here offers an approach for tackling the high complexity of identifying and testing protein interactions with emergent properties that significantly differ from their monovalent parts, which is an important step in modeling and dissecting complex interactions that govern cell physiology and considering more broadly how to intervene in dysregulated signaling cascades in disease.
Materials and Methods
Plasmid cloning and mutagenesis.
We used plasmid backbones as described in Ryan et al. (17) to clone GAB1 and EGFR into the toolkit substrate backbones (sVd with Avitag™) for co-expression and phosphorylation with a kVh-kinase (also described in Ryan et al.). The Avitag™ originally came from the pGEX-SH2A-SH2 plasmid which was a gift from Bruce Mayer (Addgene plasmid # 46481; http://n2t.net/addgene:46481; RRID:Addgene_46481) (35). SH2 domains were cloned into a pGEX backbone, with an N-terminal GST fusion. We used either restriction/ligation cloning or In-Fusion cloning for engineering of backbone vectors, insertion of tags and targeting, and all kinase and substrate insertions. We used SnapGene to design primers and verified constructs by Sanger sequencing. Sequencing was performed to get full coverage of the insert. Some constructs were whole-plasmid sequenced to verify the entire plasmid. For cloning, we used NEB Phusion polymerase (part no. M0530S) and included extensions for cloning into the target backbones using NEB enzymes. We used Agilent’s QuikChange mutagenesis kit (part no. 210518) to perform point mutations. We used carbenicillin at 100μg per mL for selection of ampicillin plasmids (substrate and GST plasmids) and kanamycin at 50μg per mL for selection of kanamycin plasmids (kinase plasmids). Subcloning Efficiency DH5α Competent Cells (ThermoFisher Scientific Cat. 18265017) were used for DNA propagation. The sources for SH2 domains were: PTPN11 – from pBABE-PTPN11 which was a gift from Dr. Matt Lazzara (36); RASA1 – pDONR223_RASA1_WT was a gift from Jesse Boehm & William Hahn & David Root (Addgene plasmid # 81778; http://n2t.net/addgene:81778; RRID:Addgene_81778); ZAP70 – pDONR223-ZAP70 was a gift from William Hahn & David Root (Addgene plasmid # 23887; http://n2t.net/addgene:23887; RRID:Addgene_23887); PLCγ1 – pGEX PLCg1(NC)-SH2 was a gift from Bruce Mayer (Addgene plasmid # 46471; http://n2t.net/addgene:46471; RRID:Addgene_46471).
GAB1 – from pCDNA3.1-HA-GAB1 which was a gift from Dr. Matt Lazzara (37). EGFR C-terminal tail was gifted to us by Linda Pike in both Y9F and Y8F formats (38). We selected the appropriate Y8F plasmids to mutate a second tyrosine to phenylalanine to create the Y7F plasmids.
Protein expression and purification.
Protein induction and bacterial lysis.
SH2 Domains
Plasmids were transformed into BL21 gold DE3 (Agilent part no. 230132) and an isolated colony was used to inoculate a 5 mL LB, 5 μL carbenicillin culture which grew overnight at 37°C. 2.5 mLs of this culture was used to inoculate a 100 mL LB, 100 μL carbenicillin culture which shook at 37°C until it reached the desired OD600 (0.7-1.0). The culture was then induced using 0.5 mM IPTG and grown overnight in an 18°C shaker. The culture was spun down at 8500 rpm for 10 minutes and the pellet was frozen at −20°C until lysis.
Phosphorylated Proteins
Plasmids were co-transformed along with a kinase plasmid into BL21 gold DE3 (Agilent part no. 230132) and an isolated colony was used to inoculate a 5 mL LB, 5μL carbenicillin/kanamycin culture which grew overnight at 37°C. 2.5 mLs of this culture was used to inoculate a 100 mL LB, 100μL carbenicillin/kanamycin culture which shook at 37°C until it reached the desired OD600 (0.7-1.0). The culture was then induced using 0.5 mM IPTG and grown overnight in an 18°C shaker. The culture was spun down at 8500 rpm for 10 minutes and resuspended in 100 mL LB, 100 uL carbenicillin/kanamycin, and 0.2% L-arabinose. The culture was grown for another 4 hrs in a 37°C shaker and then spun down at 8500 rpm for 10 minutes and the pellet was frozen at −20°C until lysis.
Lysis
We resuspended bacterial cell pellets in a 10:1 ratio of final growth volume to lysis buffer. Lysis buffer was 50mM Tris-HCl at pH7.0 and 150mM NaCl, supplemented with protease inhibitors (EMD Millipore 539137), phosphatase inhibitors (Millipore Sigma P5726), and PMSF (1:1000). For small volume lysis of 1mL to 5mL, we used bead beating, adding 400μL of beads to each 1mL of resuspended cell pellet and bead beating for 2 minutes. For larger lysis volumes, we sonicated resuspended pellets for 5 minutes with 40% pulse sequence. Following lysis, we clarified by centrifugation at 13,000 rpm for 10 minutes at 4°C.
Protein purification.
Nickel/6xHis Purification.
We used Genscript Ni-NTA MagBeads (cat no. L00295), using a magnetic tube stand for isolating beads during decanting. Prior to protein binding, we equilibrated beads in lysis buffer (50mM Tris-HCl, 300mM NaCl (pH 8.0), 0.5% TX-100, 80 mM imidazole). Beads were isolated by magnet and decanted. We added equilibration buffer again, this time in addition to 1mL of clarified lysate and incubated at 4°C for at least 1hr up to overnight, while rotating. Isolated beads were washed in a 20-fold excess (relative to bead volume) of wash buffer (50mM Tris-HCl, 300mM NaCl, 40mM Imidazole, 0.5% Triton-X (pH 8.0)) a total of three times, incubating in the wash buffer for 5 minutes each time. PreScission protease (GenScript cat no. Z02799) was used to cleave the N-terminal solubility tag and p40 sequence. The beads were resuspended in 50 mM Tris-HCl, 150 mM NaCl, 3 mM DTT at 4 times the resin bead volume and incubated overnight at 4°C with 1 μL of PreScission. Protein was eluted using 2 times the resin bed volume with elution buffer (50mM Tris-HCl, 300mM NaCl, 500mM Imidazole pH 8.0). We repeated elution two total times. We dialyzed protein following manufacturer directions for the Pierce Slide-A-Lyzer mini dialysis with a 3.5kDa cutoff into 50mM Tris-HCl, 150mM NaCl at pH. 7.8. Proteins were aliquoted and frozen at −80°C.
GST Purification
We used Pierce Glutathione Agarose (part no. 16100) and followed the manufacturer’s base protocol for GST-based purification. Beads were equilibrated in buffer (50mM Tris, 150mM NaCl) and protein capture was performed for one hour to overnight at 4°C. Beads were washed (three times) with a 10-fold excess volume of equilibration buffer, with centrifugation to isolate the beads. We used PreScission protease-based elution according to the manufacturer’s directions (GenScript GST-PreScission protease part no. Z02799), incubating protease with beads resuspended in 50 mM Tris-HCl, 150 mM NaCl, 3 mM DTT overnight at 4°C. PreScission cleavage was performed to ensure no dimerization of the GST domain (31).
Biotinylation.
Following nickel purification and dialysis, phosphorylated substrates were biotinylated using the BirA500 kit (Avidity EC 6.3.4.15). We reacted 10nmol of substrate with 2.5μg of BirA ligase for between 30 minutes and 5 hours at 30°C depending on the protein concentration. Proteins were dialyzed following biotinylation to remove excess biotin. Biotinylation was frequently confirmed through 680IRdye-conjugated streptavidin, incubated for one hour at 4°C (at 1:1500), prior to final washes and LICOR-based imaging.
Western-based analysis.
Samples were reduced in Laemmli loading buffer (Boston BioProducts) and boiled for 10 minutes at 95°C, unless they contained imidazole, where we boiled at 70°C for 10 minutes to avoid breaking protein bonds. We performed standard SDS-PAGE analysis using purchased precast NuPAGE gels from Invitrogen, typically using 4-12% gradients or single 10% gels, running in a MOPS SDS running buffer (Novex Invitrogen). Transfer to nitrocellulose membranes occurred in a buffer (Novex Invitrogen) with 20% methanol (membranes were pre-wetted in transfer buffer). Pre-stained ladder from LICOR (Chameleon Duo) was used along with IRDye-680 and −800 secondary antibodies for infrared scanning. We used Intercept Blocking Buffer (LI-COR), diluted 1:1 in Tris Buffered Saline (TBS) for blocking membranes (one hour at room temperature or overnight at 4°C) and preparing primary and secondary antibodies. Following primary incubation (1 hour at room temperature or overnight at 4°C) and secondary incubation (1 hour at room temperature at 1:10,000), we washed membranes in a TBS-T (1% tween solution) or TBS solution. Antibody stripping was done with 0.2N NaOH for 30 minutes, as needed for efficient stripping. Membranes were scanned on a LI-COR Odyssey system.
Antibodies
We used the following antibodies for verification of expression and phosphorylation: MYC antibody Mouse Bio X Cell BE0238 at 1:3000 dilution; Rabbit PY1000 antibody CST 8954S at 1:1500 dilution; EGFR pY998 Rabbit CST 2641 at 1:1000 dilution; EGFR pY992/EGFR pY1016 Rabbit CST 2235 at 1:1000 dilution; ERBB2 pY1196 (with cross-reactivity of EGFR pY1138 and pY1016) Rabbit CST 6942 at 1:1000 dilution; EGFR pY1068/EGFR pY1092 Rabbit CST 3777 at 1:1000 dilution; EGFR pY1148/EGFR pY1172 Rabbit CST 4404 at 1:1000 dilution; EGFR pY1173/EGFR pY1197 Rabbit CST 4407 at 1:1000 dilution. We used the following secondary antibodies from LI-COR Biosciences (at 1:10,000 dilution): Donkey Anti-Mouse IgG 680RD; Donkey Anti-Rabbit IgG 800CW; Donkey Anti-Rabbit IgG 680RD; Donkey Anti-Mouse IgG 800CW.
Native PAGE:
Phosphorylation efficiency of phosphoproteins was evaluated by running protein samples by native PAGE to allow for separation of protein species with varying levels of phosphorylation. A 9% acrylamide separating gel was poured with 1.8 mLs 30% acrylamide, 1.5 mLs 1.5 M Tris-HCl, 600 μLs 10% APS, and 6 μL TEMED. A stacking gel was poured using 533 μL 30% acrylamide, 1 mL 1.5 M Tris-HCl, 40 μL 10% APS, and 8 μL TEMED. Gels were cast using the Bio-Rad Mini-PROTEAN® Tetra Handcast System. 1L of 1x Running Buffer (250mM Tris-HCl and 1.92M Glycine) was used, and gels were run at 4°C at 90V until separation was achieved. Samples were prepared in loading dye: 40% Glycerol, 248 mM Tris-HCl, 0.02% Bromophenol Blue.
Protein concentration estimation:
We used PAGE-based separation and Coomassie staining (ThermoScientific Pierce Coomassie Brilliant Blue G250 part no 20279), followed by 680-infrared detection by LICOR for estimating protein concentration compared to a BSA standard curve.
Biolayer Interferometry (BLI).
Experiments were run on the Gator® Pilot BLI instrument using Streptavidin (SA) probes (Gator SKU 160002), MAX plates (Gator SKU 130062) for probe loading, and black flat plates (Gator SKU 130150) for sample loading. A sampling rate of 10 Hz was used, and a temperature of 30°C was maintained within the instrument. 250 uL of buffer (50 mM Tris-HCl, 150 mM NaCl, 1% Bovine Serum Albumin, filter sterilized) was loaded into the MAX plate to allow for equilibration of the probes, and a 10-minute, 400 rpm equilibration step was performed at the beginning of each experiment. 180 uL of sample was loaded into the black flat plate and held at a tilt throughout the experiment. Experiments were run with a 30 second baseline in buffer, 120 second load of biotinylated, phosphorylated protein, 30 second baseline in buffer, 60 second association of SH2 domain, and 60 second dissociation in buffer. The load wells contain equal concentrations of phosphorylated protein, whereas the association wells contain a serial dilution of SH2 domain within the first three wells of the column. The final well contains buffer to act as a reference. Plates were shaken throughout the experiment with PTPN11:GAB1 experiments being shaken at 1000 rpm and PLCγ1:EGFR experiments being shaken at 400 rpm.
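The acquisition schedule above can be written down as a small data structure, which makes the total run length and the number of samples per sensor explicit. This is an illustrative sketch built from the step durations and sampling rate stated in the text; it is not an export of, or an interface to, the instrument software.

```python
# Step names and durations (seconds) as described in the text, in run order.
steps = [
    ("baseline", 30),
    ("load", 120),
    ("baseline", 30),
    ("association", 60),
    ("dissociation", 60),
]
sampling_rate_hz = 10

total_seconds = sum(duration for _, duration in steps)
total_points = total_seconds * sampling_rate_hz

# Start time of each step, e.g. for locating the association step in a trace.
start_times = []
elapsed = 0
for name, duration in steps:
    start_times.append((name, elapsed))
    elapsed += duration

association_start = next(t for name, t in start_times if name == "association")

print(f"run length: {total_seconds} s, {total_points} points per sensor")
print(f"association begins at t = {association_start} s")
```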
BLI Analysis.
Kinetic analysis was performed within the Gator One® software (version 2.16.6.0130). Data were aligned on the Y-axis at the association step, and an inter-step correction was applied along with Savitzky-Golay filtering to reduce noise. The reference well was subtracted from the other three wells to account for non-binding artifacts within the signal. A 1:1 global binding model was fit to both the association and dissociation curves of each set of three serial dilutions with an unlinked Rmax to quantify the dissociation constant, KD. BLI figures contained within this paper use data processed in the manner described above, and both the data and binding fit curves have been re-plotted in MATLAB®. The plotted data are subsampled to include every 10th data point. Data provided in the supplement were not subsampled but did undergo the other pre-processing steps.
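For readers unfamiliar with the 1:1 binding model, the curve shapes being fit can be sketched as below. This is the standard textbook 1:1 (Langmuir) form written in plain Python, not the Gator One® implementation; the rate constants are hypothetical, and the filtering, reference subtraction, and global fitting steps are not reproduced.

```python
import math


def association(t, conc, kon, koff, rmax):
    """1:1 association response at time t (s) for analyte concentration conc (M)."""
    kobs = kon * conc + koff                  # observed rate constant (s^-1)
    req = rmax * conc / (conc + koff / kon)   # equilibrium response at this concentration
    return req * (1.0 - math.exp(-kobs * t))


def dissociation(t, r0, koff):
    """Exponential decay from response r0 once the sensor is moved to buffer."""
    return r0 * math.exp(-koff * t)


# Hypothetical parameters, not fitted values from this study.
kon, koff, rmax = 1.0e5, 1.0e-2, 1.0   # M^-1 s^-1, s^-1, response units
kd = koff / kon                        # dissociation constant (M), here 100 nM

# 60 s association sampled at 10 Hz, then every 10th point kept, as for the figures.
times = [i / 10 for i in range(601)]
trace = [association(t, kd, kon, koff, rmax) for t in times]
subsampled = trace[::10]

print(f"KD = {kd * 1e9:.0f} nM; response at 60 s = {trace[-1]:.3f}")
```

With the analyte concentration equal to KD, the association curve approaches half of Rmax, which is a convenient sanity check on any fitted parameter set.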
ODE Modeling of tandem SH2 domain interactions.
The original code was written by Dr. Wesley Errington in the Sarkar lab at the University of Minnesota (15) and shared by compressed folder by Dr. Sarkar. The code was adapted to include greater flexibility within the parameters. Most notably, additional kon and koff parameters were incorporated into the differential equations to allow for different affinities by each SH2 domain for each pTyr site, and persistence length was made universally adjustable instead of binning all protein interactions into two possible groups, flexible and rigid. The output of the original code is a series of plots mapping the binding behavior of the proteins under different initial concentrations of SH2 domain. Fifteen binding configurations are possible within a bivalent system, but for the purposes of this paper, only subsets were summed together and plotted as a single metric of binding. When analyzing tandem SH2 domains, the two bivalently bound configurations (inline and twisted) were isolated, whereas configurations detailing single domain binding were utilized when simulating monovalent binding. Additional code was written to extract the equilibrium concentration of each of these plots and feed that into a concentration effect curve plot. This operates by plotting the fraction of bound species at equilibrium – calculated by dividing the equilibrium concentration by the starting concentration of phosphorylated protein (R0) – against the starting concentration of SH2 domain (L0). Within the model, the concentration of SH2 domain is assumed to remain constant throughout the simulation. The value of L0 that corresponds to 50% of the phosphorylated protein being bound at equilibrium is the dissociation constant (KD). Results provided were generated using Matlab R2024a or R2024b. All parameters have been provided in Data S1. We used the University of Virginia’s High Performance Computing environment Rivanna for model predictions. 
All Matlab code is provided in our repository at https://github.com/NaegleLab/BivalentModeling_tanSH2.
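The concentration effect step can be illustrated with a deliberately reduced example. The sketch below is written in Python rather than Matlab and is not the repository code: it replaces the fifteen-configuration ODE system with a single-site model that has a closed-form equilibrium under the same constant-L0 assumption, and then recovers KD as the L0 giving 50% bound. All function names are hypothetical.

```python
import math


def fraction_bound(l0, kd):
    """Equilibrium fraction of phosphorylated protein bound for a single site,
    assuming the SH2 domain concentration l0 (M) stays constant."""
    return l0 / (l0 + kd)


def half_max_concentration(curve, lo=1e-12, hi=1e-1, iters=200):
    """Bisect on a log scale for the l0 at which an increasing curve crosses 0.5."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)   # geometric midpoint of the current bracket
        if curve(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


kd_true = 4e-6   # M, the monovalent affinity used in the ZAP70 linker sweep
kd_recovered = half_max_concentration(lambda l0: fraction_bound(l0, kd_true))

print(f"recovered KD = {kd_recovered * 1e6:.3f} uM")
```

In the full model, the curve passed to the half-maximum search would instead be the summed equilibrium concentration of the bivalently bound configurations divided by R0.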
Estimation of diameters from structure.
For simulations provided in this paper, estimates of the molecular weight of SH2 domains and pTyr sites were calculated using the following website https://web.expasy.org/compute_pi/. For pTyr sites, the sequence spanning two sites N-terminal of the pTyr site and four sites C-terminal was used to estimate molecular weight. Sites Y627 and Y659 within Gab1 were calculated and used as an estimate for all pTyr pairs.
SH2 diameters
We identified PDB entries containing the domain of interest (SH2) and their corresponding domain boundaries using CoDIAC (14). We included wildtype and mutant structures only if the mutations resided well outside the domain regions. To estimate domain diameters, we assumed each domain adopts an approximately spherical shape. Based on this assumption, the diameter was defined as the longest distance between any two atoms within the domain. To compute this, we measured all pairwise distances between backbone atoms (N and C) within the defined domain boundaries and recorded the maximum distance as the domain’s diameter for multiple PDB structures available for the same domain. Since the ODE code assumed the same size for both entities, and since we saw relatively small differences between the N and C terminal diameters, we averaged all diameters for both domains across all available structures. All diameter calculations are provided in Data S2. Code for extraction of domain diameters is provided at https://github.com/NaegleLab/BivalentModeling_tanSH2.
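The diameter definition above, the longest distance between any two backbone atoms in the domain, can be sketched as follows. This is an illustrative Python example with made-up coordinates; it does not parse PDB files or call CoDIAC, and the function name is hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def max_pairwise_distance(coords):
    """Largest Euclidean distance (Å) between any two points in coords."""
    return max(math.dist(a, b) for a, b in combinations(coords, 2))


# Hypothetical backbone-atom coordinates (Å) for one domain in two structures.
structure_1 = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 12.0)]
structure_2 = [(0.0, 0.0, 0.0), (0.0, 0.0, 15.0), (2.0, 2.0, 2.0)]

diameters = [max_pairwise_distance(s) for s in (structure_1, structure_2)]
average_diameter = mean(diameters)   # averaged across available structures

print(f"per-structure diameters: {diameters}, average: {average_diameter:.1f} Å")
```

The all-pairs search is quadratic in the number of atoms, which is acceptable for a single domain of roughly one hundred residues.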
Estimation of linkers from structure.
pTyr linkers
Linker length was estimated as the number of amino acids between the pTyr sites or SH2 domains, unless a PDB file was available for the protein in which case linker length was estimated using the same PDB-extraction based approach as SH2 domains.
SH2 domain linkers
Structures exist for all tandem SH2 domain family members and so we used those to estimate the linker lengths. We used UniProt defined SH2 domain boundaries to define the linker (as the first amino acid after the N-terminal SH2 domain and up to the amino acid before the start of the C-terminal SH2 domain). Then for all available structures, we calculated the average distance between all atoms of the two amino acids farthest apart from each other on the linker region (i.e. the residues immediately adjacent to the domains). All linker calculations are provided in Data S2. Code for extraction of linker distances is provided at https://github.com/NaegleLab/BivalentModeling_tanSH2.
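The linker distance calculation can be sketched in the same spirit: for each structure, average the distance over every atom pair drawn from the first and last linker residues, then average across structures. The coordinates below are made up, the function name is hypothetical, and the published extraction code is in the repository linked above.

```python
import math
from statistics import mean


def mean_interresidue_distance(res_a, res_b):
    """Mean distance (Å) over every atom pair drawn from two residues."""
    return mean(math.dist(a, b) for a in res_a for b in res_b)


# Hypothetical atom coordinates (Å) for the first and last linker residues
# in two structures of the same protein.
structures = [
    ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(20.0, 0.0, 0.0), (21.0, 0.0, 0.0)]),
    ([(0.0, 0.0, 0.0)], [(0.0, 30.0, 0.0)]),
]

per_structure = [mean_interresidue_distance(a, b) for a, b in structures]
linker_distance = mean(per_structure)

print(f"per-structure: {per_structure}, average: {linker_distance:.1f} Å")
```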
Estimation of monovalent KD values.
Estimates of monovalent KD were taken from the Ronan et al. meta-analysis of fluorescence polarization experiments (23). All koff values were set to 1 s−1, and kon values were adjusted to match the estimated KD.
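Because KD = koff/kon, fixing koff at 1 s−1 determines kon for each interaction. A minimal sketch of that conversion follows; the function name is hypothetical and the example value is the 4 μM affinity used in the ZAP70 linker sweep.

```python
KOFF = 1.0  # s^-1, fixed for every interaction as stated in the text


def kon_from_kd(kd_molar, koff=KOFF):
    """Association rate constant (M^-1 s^-1) that reproduces a given KD at fixed koff."""
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return koff / kd_molar


kon = kon_from_kd(4e-6)   # KD of 4 uM
print(f"kon = {kon:.3g} M^-1 s^-1")
```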
Supplementary Material
ACKNOWLEDGEMENTS
Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Numbers R35GM138127 and T32GM008715 (to Reagan Portelance). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We would like to thank Dr. Casim Sarkar and Dr. Wesley Errington for sharing the Matlab code and giving guidance on model adaptations. We would additionally like to thank Devon Semoy and Dr. Neel Shah for their helpful guidance in native PAGE for the separation of phosphospecies. The authors acknowledge Research Computing at The University of Virginia for providing computational resources and technical support that have contributed to the results reported within this publication. URL: https://rc.virginia.edu
Bibliography
- 1.Kane Ravi S.. Thermodynamics of multivalent interactions: influence of the linker. Langmuir : the ACS journal of surfaces and colloids, 26(11):8636–8640, June 2010. ISSN 1520-5827 0743-7463. doi: 10.1021/la9047193. Place: United States. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Vauquelin Georges and Charlton Steven J.. Exploring avidity: understanding the potential gains in functional affinity and target residence time of bivalent and heterobivalent ligands. British journal of pharmacology, 168(8):1771–1785, April 2013. ISSN 1476-5381 0007-1188. doi: 10.1111/bph.12106. Place: England. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Nuñez-Prado Natalia, Compte Marta, Harwood Seandean, Álvarez Méndez Ana, Lykkemark Simon, Sanz Laura, and Vallina Luis Álvarez. The coming of age of engineered multivalent antibodies. Drug discovery today, 20(5):588–594, May 2015. ISSN 1878-5832 1359-6446. doi: 10.1016/j.drudis.2015.02.013. Place: England. [DOI] [PubMed] [Google Scholar]
- 4.Cuesta Angel M., Sainz-Pastor Noelia, Bonet Jaume, Oliva Baldomero, and Alvarez-Vallina Luis. Multivalent antibodies: when design surpasses evolution. Trends in biotechnology, 28(7):355–362, July 2010. ISSN 1879-3096 0167-7799. doi: 10.1016/j.tibtech.2010.03.007. Place: England. [DOI] [PubMed] [Google Scholar]
- 5.Ottinger Elizabeth A., Botfield Martyn C., and Shoelson Steven E.. Tandem SH2 Domains Confer High Specificity in Tyrosine Kinase Signaling*. Journal of Biological Chemistry, 273(2):729–735, January 1998. ISSN 0021-9258. doi: 10.1074/jbc.273.2.729. [DOI] [PubMed] [Google Scholar]
- 6.Deng Yunxin, Efremov Artem K., and Yan. Jie Modulating binding affinity, specificity, and configurations by multivalent interactions. Biophysical journal, 121(10):1868–1880, May 2022. ISSN 1542-0086 0006-3495. doi: 10.1016/j.bpj.2022.04.017. Place: United States. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Isakov N., Wange R. L., Burgess W. H., Watts J. D., Aebersold R., and Samelson L. E.. ZAP-70 binding specificity to T cell receptor tyrosine-based activation motifs: the tandem SH2 domains of ZAP-70 bind distinct tyrosine-based activation motifs with varying affinity. The Journal of experimental medicine, 181(1):375–380, January 1995. ISSN 0022-1007 1540-9538. doi: 10.1084/jem.181.1.375. Place: United States. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Sloan-Lancaster J., Presley J., Ellenberg J., Yamazaki T., Lippincott-Schwartz J., and Samelson L. E.. ZAP-70 association with T cell receptor zeta (TCRzeta): fluorescence imaging of dynamic changes upon cellular stimulation. The Journal of cell biology, 143(3):613–624, November 1998. ISSN 0021-9525 1540-8140. doi: 10.1083/jcb.143.3.613. Place: United States. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Travers Timothy, Kanagy William K., Mansbach Rachael A., Jhamba Elton, Cleyrat Cedric, Goldstein Byron, Lidke Diane S., Wilson Bridget S., and Gnanakaran S.. Combinatorial diversity of Syk recruitment driven by its multivalent engagement with FcεRIγ. Molecular Biology of the Cell, 30(17):2331–2347, August 2019. ISSN 1059-1524. doi: 10.1091/mbc. E18-11-0722. Publisher: American Society for Cell Biology (mboc). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Xu Xiaozheng, Masubuchi Takeya, Cai Qixu, Zhao Yunlong, and Hui Enfu. Molecular features underlying differential SHP1/SHP2 binding of immune checkpoint receptors. eLife, 10:e74276, November 2021. ISSN 2050-084X. doi: 10.7554/eLife.74276. Publisher: eLife Sciences Publications, Ltd.
- 11.Patsoukis Nikolaos, Duke-Cohan Jonathan S., Chaudhri Apoorvi, Aksoylar Halil-Ibrahim, Wang Qi, Council Asia, Berg Anders, Freeman Gordon J., and Boussiotis Vassiliki A. Interaction of SHP-2 SH2 domains with PD-1 ITSM induces PD-1 dimerization and SHP-2 activation. Communications Biology, 3(1):128, March 2020. ISSN 2399-3642. doi: 10.1038/s42003-020-0845-0.
- 12.Kiyatkin Anatoly, Aksamitiene Edita, Markevich Nick I., Borisov Nikolay M., Hoek Jan B., and Kholodenko Boris N. Scaffolding protein Grb2-associated binder 1 sustains epidermal growth factor-induced mitogenic and survival signaling by multiple positive feedback loops. The Journal of biological chemistry, 281(29):19925–19938, July 2006. ISSN 0021-9258 1083-351X. doi: 10.1074/jbc.M600482200. Place: United States.
- 13.Cunnick Jess M., Mei Lin, Doupnik Craig A., and Wu Jie. Phosphotyrosines 627 and 659 of Gab1 Constitute a Bisphosphoryl Tyrosine-based Activation Motif (BTAM) Conferring Binding and Activation of SHP2. Journal of Biological Chemistry, 276(26):24380–24387, June 2001. ISSN 0021-9258. doi: 10.1074/jbc.M010275200. Publisher: Elsevier.
- 14.Kandoor Alekhya, Martinez Gabrielle, Hitchcock Julianna M., Angel Savannah, Campbell Logan, Rizvi Saqib, and Naegle Kristen M. CoDIAC: A comprehensive approach for interaction analysis reveals novel insights into SH2 domain function and regulation. bioRxiv, page 2024.07.18.604100, January 2025. doi: 10.1101/2024.07.18.604100.
- 15.Errington Wesley J., Bruncsics Bence, and Sarkar Casim A. Mechanisms of noncanonical binding dynamics in multivalent protein–protein interactions. Proceedings of the National Academy of Sciences, 116(51):25659–25667, December 2019. doi: 10.1073/pnas.1902909116. Publisher: Proceedings of the National Academy of Sciences.
- 16.Bilbrough Tim, Piemontese Emanuele, and Seitz Oliver. Dissecting the role of protein phosphorylation: A chemical biology toolbox. Chemical Society Reviews, 51(13):5691–5730, July 2022. ISSN 1460-4744. doi: 10.1039/D1CS00991E.
- 17.Ryan Margaret M., Portelance Reagan, Newman Graham F., Martinez Gabrielle, Shekharan Swathi, Wu Anqi, Angel Savannah, Schaberg Katherine E., Gilmore Petra, Sprung Robert, Townsend Reid, and Naegle Kristen M. A signaling inspired synthetic toolkit for efficient production of tyrosine phosphorylated proteins. bioRxiv, page 2024.12.22.629992, January 2025. doi: 10.1101/2024.12.22.629992.
- 18.Zhou Huan-Xiang. Quantitative Relation between Intermolecular and Intramolecular Binding of Pro-Rich Peptides to SH3 Domains. Biophysical Journal, 91(9):3170–3181, November 2006. ISSN 0006-3495. doi: 10.1529/biophysj.106.090258.
- 19.Chin Alexander F., Toptygin Dmitri, Elam W. Austin, Schrank Travis P., and Hilser Vincent J. Phosphorylation Increases Persistence Length and End-to-End Distance of a Segment of Tau Protein. Biophysical Journal, 110(2):362–371, January 2016. ISSN 0006-3495. doi: 10.1016/j.bpj.2015.12.013.
- 20.Stirnemann Guillaume, Giganti David, Fernandez Julio M., and Berne B. J. Elasticity, structure, and relaxation of extended proteins under force. Proceedings of the National Academy of Sciences, 110(10):3847–3852, March 2013. doi: 10.1073/pnas.1300596110. Publisher: Proceedings of the National Academy of Sciences.
- 21.Hause Ronald J., Leung Kin K., Barkinge John L., Ciaccio Mark F., Chuu Chih-pin, and Jones Richard B. Comprehensive Binary Interaction Mapping of SH2 Domains via Fluorescence Polarization Reveals Novel Functional Diversification of ErbB Receptors. PLoS ONE, 7(9):e44471, September 2012. ISSN 1932-6203. doi: 10.1371/journal.pone.0044471.
- 22.Leung Kin K, Hause Ronald J, Barkinge John L, Ciaccio Mark F, Chuu Chih-Pin, and Jones Richard B. Enhanced prediction of Src homology 2 (SH2) domain binding potentials using a fluorescence polarization-derived c-Met, c-Kit, ErbB, and androgen receptor interactome. Molecular & cellular proteomics: MCP, 13(7):1705–23, July 2014. ISSN 1535-9484. doi: 10.1074/mcp.M113.034876.
- 23.Ronan Tom, Garnett Roman, and Naegle Kristen M. New analysis pipeline for high-throughput domain–peptide affinity experiments improves SH2 interaction data. Journal of Biological Chemistry, 295(32):11346–11363, August 2020. ISSN 0021-9258. doi: 10.1074/jbc.RA120.012503.
- 24.Kaneko Tomonori, Joshi Rakesh, Feller Stephan M., and Li Shawn S. C. Phosphotyrosine recognition domains: the typical, the atypical and the versatile. Cell Communication and Signaling, 10(1):32, November 2012. ISSN 1478-811X. doi: 10.1186/1478-811X-10-32.
- 25.Vauquelin Georges and Charlton Steven J. Long-lasting target binding and rebinding as mechanisms to prolong in vivo drug action. British journal of pharmacology, 161(3):488–508, October 2010. ISSN 1476-5381 0007-1188. doi: 10.1111/j.1476-5381.2010.00936.x. Place: England.
- 26.Sivaramakrishnan Sivaraj and Spudich James A. Systematic control of protein interaction using a modular ER/K α-helix linker. Proceedings of the National Academy of Sciences, 108(51):20467–20472, December 2011. doi: 10.1073/pnas.1116066108. Publisher: Proceedings of the National Academy of Sciences.
- 27.Wilcox Kathryn G., Dingle Marlee E., Saha Ankit, Hore Michael J. A., and Morozova Svetlana. Persistence length of α-helical poly-l-lysine. Soft Matter, 18(35):6550–6560, 2022. ISSN 1744-683X. doi: 10.1039/D2SM00921H. Publisher: The Royal Society of Chemistry.
- 28.Kjaergaard Magnus. Estimation of Effective Concentrations Enforced by Complex Linker Architectures from Conformational Ensembles. Biochemistry, 61(3):171–182, February 2022. ISSN 0006-2960. doi: 10.1021/acs.biochem.1c00737. Publisher: American Chemical Society.
- 29.Kinoshita Eiji, Kinoshita-Kikuta Emiko, and Koike Tohru. Phos-tag SDS-PAGE systems for phosphorylation profiling of proteins with a wide range of molecular masses under neutral pH conditions. Proteomics, 12:192–202, 2012. doi: 10.1002/pmic.201100524.
- 30.Kaneko Tomonori, Huang Haiming, Cao Xuan, Li Xing, Li Chengjun, Voss Courtney, Sidhu Sachdev S., and Li Shawn S. C. Superbinder SH2 Domains Act as Antagonists of Cell Signaling. Science Signaling, 5(243):ra68–ra68, September 2012. doi: 10.1126/scisignal.2003021. Publisher: American Association for the Advancement of Science.
- 31.Ladbury J E, Lemmon M A, Zhou M, Green J, Botfield M C, and Schlessinger J. Measurement of the binding of tyrosyl phosphopeptides to SH2 domains: A reappraisal. Proceedings of the National Academy of Sciences of the United States of America, 92(8):3199–203, May 1995. ISSN 0027-8424.
- 32.Ladbury John E. and Arold Stefan T. Energetics of Src Homology Domain Interactions in Receptor Tyrosine Kinase-Mediated Signaling, volume 488. Elsevier Inc., 2011. ISBN 978-0-12-381268-1. doi: 10.1016/B978-0-12-381268-1.00007-0.
- 33.Grammer T. C. and Blenis J. Evidence for MEK-independent pathways regulating the prolonged activation of the ERK-MAP kinases. Oncogene, 14(14):1635–1642, April 1997. ISSN 0950-9232. doi: 10.1038/sj.onc.1201000. Place: England.
- 34.Santos Silvia D. M., Verveer Peter J., and Bastiaens Philippe I. H. Growth factor-induced MAPK network topology shapes Erk response determining PC-12 cell fate. Nature Cell Biology, 9(3):324–330, March 2007. ISSN 1476-4679. doi: 10.1038/ncb1543.
- 35.Machida Kazuya, Thompson Christopher M., Dierck Kevin, Jablonowski Karl, Kärkkäinen Satu, Liu Bernard, Zhang Haimin, Nash Piers D., Newman Debra K., Nollau Peter, Pawson Tony, Renkema G. Herma, Saksela Kalle, Schiller Martin R., Shin Dong-Guk, and Mayer Bruce J. High-throughput phosphotyrosine profiling using SH2 domains. Molecular cell, 26(6):899–915, June 2007. ISSN 1097-2765. doi: 10.1016/j.molcel.2007.05.031. Place: United States.
- 36.Buonato Janine M., Lan Ingrid S., and Lazzara Matthew J. EGF augments TGFβ-induced epithelial-mesenchymal transition by promoting SHP2 binding to GAB1. Journal of Cell Science, 128(21):3898–3909, January 2015. ISSN 1477-9137. doi: 10.1242/jcs.169599.
- 37.Furcht Christopher M., Buonato Janine M., and Lazzara Matthew J. EGFR-activated Src family kinases maintain GAB1-SHP2 complexes distal from EGFR. Science Signaling, 8(376):1–13, 2015. ISSN 1937-9145. doi: 10.1126/scisignal.2005697.
- 38.Gill Kamaldeep, Macdonald-Obermann Jennifer L., and Pike Linda J. Epidermal growth factor receptors containing a single tyrosine in their C-terminal tail bind different effector molecules and are signaling-competent. Journal of Biological Chemistry, 292(50):20744–20755, December 2017. ISSN 0021-9258. doi: 10.1074/jbc.M117.802553.