2025 Nov 28;94(1):136–148. doi: 10.1007/s00239-025-10291-3

Correlated Evolution Drives Structural Convergence of Interacting Proteins

Ksenia Macias Calix 1, Antara Anika Piya 1, Raquel Assis 1,2,3
PMCID: PMC12920738  PMID: 41313352

Abstract

Understanding the relationship between protein structures and their interactions is a fundamental biological problem. Here we broadly tackle this problem by examining associations between protein structural features and interaction patterns in rodents and yeast, two highly divergent taxa from different kingdoms. In both taxa, we uncover positive correlations between intrinsic disorders of interacting proteins, consistent with a prior study showing stronger affinity between proteins with similar structures. However, closer examination reveals that these relationships are restricted to proteins involved in evolutionarily conserved interactions, or interologs. We also find that interologs generally exhibit more similar protein structures and less evolutionary structural divergence than non-interologs, supporting the hypothesis that conserved interactions are associated with structural convergence of interacting proteins. Further analyses show that interologs are typically less intrinsically disordered and play more central functional roles than non-interologs, suggesting that these structural similarities may help preserve stable interactions involved in essential biological processes. Overall, this study underscores the interconnected evolution of protein structures and their interactions, illustrating how the optimization of protein fitness landscapes for both structural and functional stability may promote structural convergence across divergent taxa.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00239-025-10291-3.

Keywords: Protein-protein interaction, Interolog, Protein structure, Protein disorder, Coevolution, Structural convergence

Introduction

Exploring the relationship between protein structures and their interaction patterns can enhance understanding of key biological processes. In particular, mutations that alter protein structures can affect their interactive capabilities and, consequently, their functions. As such, protein-protein interactions (PPIs) often impose constraints on structural evolution, shaping the ways that proteins fold and function together (Teichmann 2002; Worth et al. 2009; Andreani and Guerois 2014). For instance, comparisons of homologous proteins across diverse species have demonstrated that structural conservation often coincides with conserved interaction patterns (Teichmann 2002; Andreani et al. 2012; Holland et al. 2017). Moreover, large-scale interactome studies have shown that network hub proteins tend to be more evolutionarily conserved than their peripheral counterparts, suggesting that interactions critical to cellular function are under strong purifying selection (Fraser et al. 2002; Wuchty et al. 2003). These findings illustrate that the evolution of protein structures is closely tied to the evolution of their interactions, such that structurally conserved motifs can retain their biological functions over time.

Due to their intimate relationships, interacting proteins often exhibit signatures of correlated evolution, particularly at their binding interfaces (Liu et al. 2007; Lovell and Robertson 2010). These patterns may arise from coevolution driven by reciprocal selective pressures (Lovell and Robertson 2010) or simply from shared functional constraints or cellular contexts (Clark et al. 2012; Kann et al. 2009; Lovell and Robertson 2010). In several well-characterized systems, such as enzyme-inhibitor and receptor-ligand pairs, evidence of coevolution has been observed in the form of compensatory mutations that preserve binding functionality across diverging sequences (Kesteren et al. 1996; Damasceno et al. 2008). Specifically, mutations that weaken or abolish interactions are typically deleterious unless accompanied or followed by compensatory mutations that restore these interactions (Kimura 1983). Indeed, studies have shown that coevolutionary dynamics between interacting proteins are often mediated by compensatory changes that preserve the physical and chemical properties of binding sites, thereby allowing protein structures to maintain their interactions across long evolutionary timescales (Liu et al. 2007; Storz 2018; Chaurasia and Dutheil 2022).

In this study, we sought to determine whether and how the structures of interacting proteins are related to one another. Though the evolutionary trajectories of interacting proteins are often closely intertwined, it is unclear whether their structures are also associated. If so, then do the structures of interacting proteins tend to be more similar to one another, or do opposites attract? In either case, are there specific structural or functional outcomes of such relationships? To answer these questions, we employed expansive datasets of known PPIs and protein structures for two parallel analyses in rodents and yeast, which represent highly divergent taxa from different kingdoms. Specifically, we performed analyses between mouse (Mus musculus) and rat (Rattus norvegicus), as well as between brewer’s yeast (Saccharomyces cerevisiae) and fission yeast (Schizosaccharomyces pombe), to uncover patterns in the relationships between PPI partners and their evolutionary underpinnings. Thus, our study adds to a wealth of knowledge about the intricacies of PPI evolution by investigating and shedding light on the structural and functional optima of their fitness landscapes.

Results

As a first step toward understanding the relationship between structures of interacting proteins, we compared their estimated intrinsic disorders within rodents and within yeast (see Methods). To establish a baseline, we generated a randomized dataset by shuffling protein partners (see Methods). In both taxa, we uncovered weak but highly significant positive correlations between intrinsic disorders of interacting proteins, contrasting with the absence of correlations among randomized pairs (top panels of Fig. 1; see Methods). These findings suggest that interacting proteins tend to exhibit similar levels of disorder. Such similarity is consistent with a theoretical study showing stronger attraction between proteins with similar structures (Lukatsky et al. 2007) and may represent a generalization of the recent finding that interacting protein homologs tend to be less structurally divergent than non-interacting homologs (Naveenkumar et al. 2022).

Fig. 1.

Relationships between intrinsic disorder of interacting proteins in rodents (left) and yeast (right), approximated by mean pLDDT score. Scatterplots overlaid with least squares regression lines and Pearson (r) correlation coefficients depict pairwise correlations between disorder metrics. Top: interacting proteins (green) compared with randomized non-interacting pairs (purple). Bottom: interologs (blue) compared with non-interologs (red). **Inline graphic, ***Inline graphic (see Methods). Data used to generate these plots are provided in Tables S1 and S2

To investigate this pattern further, we divided interacting protein pairs into two subsets: interologs and non-interologs. Interologs are broadly defined as protein pairs whose interaction is conserved between two species, whereas non-interologs lack such conservation (Fig. S1). In our study, we distinguished between interologs and non-interologs by considering the conservation of interactions between the two species within a particular taxon (either rodents or yeast). Strikingly, the positive relationships in intrinsic disorder of interacting proteins were entirely driven by interologs, with little or no correlations observed for non-interologs (bottom panels of Fig. 1). This contrast persisted when we used an alternative metric for intrinsic disorder (Fig. S2; see Methods). Together, these findings suggest that evolutionary conservation of PPIs plays a central role in shaping structural similarity between partners.

It is worth noting that relative differences in correlation strengths of all interacting proteins and interologs between taxa are likely influenced by the greater availability of experimental PPI data for yeast. Consequently, the yeast dataset includes a broader representation of both interologs and non-interologs, whereas the rodent dataset may be biased toward well-studied proteins, which are more likely to be interologs. Nonetheless, analyses using larger datasets without experimental filters confirmed the overall trend, yielding weak positive correlations for all interacting proteins (Inline graphic in rodents, Inline graphic in yeast, Inline graphic in both) and interologs (Inline graphic in rodents, Inline graphic in yeast, Inline graphic in both), and correlations hovering around zero for non-interologs (Inline graphic in rodents, Inline graphic in yeast, Inline graphic in rodents, Inline graphic in yeast; see Methods). Comparisons of aligned structures using FATCAT similarity scores (structural alignment metric), FATCAT optimized root-mean-square deviation (RMSD) values (atomic distance metric), and template modeling (TM) scores (topological similarity metric) reinforced this finding. Specifically, interologs showed higher FATCAT similarity scores and TM-scores and lower RMSD values than non-interologs in both taxa (Fig. 2). Thus, interacting proteins exhibit similar levels of intrinsic disorder and structures primarily when their interactions are conserved between species. To illustrate this contrast, we superimposed representative interolog and non-interolog structures from rodents and yeast (Fig. 3).

Fig. 2.

Comparisons of structural similarities between interacting proteins in rodents (left) and yeast (right). Boxplots depict distributions of structural similarity metrics for interologs (blue) and non-interologs (red). Top: FATCAT similarity scores. Middle: FATCAT optimized RMSD values. Bottom: TM-scores. ***Inline graphic (see Methods). Data used to generate these plots are provided in Tables S1 and S2

Fig. 3.

Superimposed structural alignments of representative interacting proteins in rodents (left) and yeast (right). Top: interologs. Bottom: non-interologs

Based on our findings, we hypothesized that conservation of interactions between species is associated with correlated structural evolution of interologs, potentially reflecting constraints that preserve interacting interfaces in new genetic backgrounds or environments. While correlated evolution has been documented in specific cases of interacting proteins (Teichmann 2002; Mintseris and Weng 2005; Perica et al. 2012; Mukherjee and Chakrabarti 2021), including among interologs (Leducq et al. 2012; Vo et al. 2016), structural convergence has not yet been widely recognized as a general outcome of this process. Therefore, to assess whether conserved interactions impose structural constraints, we examined the structural evolutionary divergence of orthologous interologs and non-interologs, with the expectation that interologs would exhibit slower rates of structural divergence (Mandloi and Chakrabarti 2017). Comparisons of FATCAT similarity scores, FATCAT optimized RMSD values, and TM-scores revealed that orthologous interologs consistently displayed greater structural conservation than orthologous non-interologs in both taxa (Fig. 4; see Methods). Specifically, FATCAT similarity and TM-scores were notably higher, while FATCAT optimized RMSD values were lower for interologs compared to non-interologs. This analysis also adds to a bounty of work illustrating slower sequence divergence rates of interologs (Yu et al. 2004; Leducq et al. 2012; Vo et al. 2016) and proteins with high connectivities (Fraser et al. 2002; Brown and Jurisica 2007; Teppa et al. 2017), which are often interologs (Fox et al. 2009). Moreover, when considered alongside the observed structural similarities of interologs (Figs. 1 bottom, 2), these results suggest that their correlated evolution may frequently lead to structural convergence.

Fig. 4.

Evolutionary divergence between orthologous protein structures in rodents (left) and yeast (right). Boxplots depict distributions of structural similarity metrics for interologs (blue) and non-interologs (red). Top: FATCAT similarity scores. Middle: FATCAT optimized RMSD values. Bottom: TM-scores. ***Inline graphic (see Methods). Data used to generate these plots are provided in Tables S3 and S4

Last, we considered whether structural convergence might represent a generally preferred outcome of correlated evolution among interologs. To address this question, we compared estimated intrinsic disorders of interologs within rodents and within yeast (see Methods). In both taxa, we found that interologs tend to be less disordered than non-interologs (Fig. 5 top). This finding aligns with expectations based on the slower structural divergence rates of interologs (Fig. 4), as less disordered proteins typically evolve more slowly (Ward et al. 2004; Brown et al. 2011; Szalkowski and Anisimova 2011; Gerek et al. 2013; Marsh and Teichmann 2014). Further, less disordered proteins tend to form more thermodynamically stable interactions due to lower entropic costs during binding (Karshikoff et al. 2015), which may facilitate the slower structural divergence and trans-species conservation of interologs. On the other hand, more disordered proteins are prone to promiscuous interactions (Schreiber and Keating 2011; Marsh and Teichmann 2014), perhaps promoting the faster structural divergence and lack of trans-species conservation of non-interologs.

Fig. 5.

Structural and functional characteristics of interacting proteins in rodents (left) and yeast (right). Top: Boxplots depict distributions of intrinsic disorder for interologs (blue) and non-interologs (red). ***Inline graphic (see Methods). Middle: REVIGO scatterplots visualize GO enrichment for proteins with low intrinsic disorder. Bottom: REVIGO scatterplots visualize GO enrichment for proteins with high intrinsic disorder. For both REVIGO scatterplots, circle size is proportional to GO term frequency, and color indicates Inline graphic, with green representing the lowest and most significant values. Data used to generate these plots are provided in Tables S1, S2, S5, and S6

To gain further insight into the functional implications of this finding, we performed gene ontology (GO) enrichment analyses within each taxon to identify overrepresented functional categories among proteins with low and high intrinsic disorders (see Methods). In both taxa, we found that less disordered proteins tend to be involved in important metabolic and biosynthetic processes (Fig. 5 middle), whereas more disordered proteins are often involved in gene regulatory processes (Fig. 5 bottom; Tables S5 and S6). Taken together with our other findings, this observation suggests that interologs may generally evolve in a unidirectional and coordinated manner toward less disordered structures, as such conformations enable their participation in essential biological processes.

Discussion

Our study contributes to a growing body of knowledge about how protein structures influence their interactions and, conversely, how these interactions shape structural evolution. This bidirectional relationship underscores both the evolutionary pressures that preserve PPIs and the structural constraints essential for their stability, offering insight into how proteins maintain essential functions across species while accommodating diverse binding partners. The tendency of structurally similar proteins to interact aligns with prior studies showing a statistical preference for structurally similar interaction partners (Lukatsky et al. 2007). However, our analysis revealed that this trend is specifically driven by interologs, which exhibit significantly slower rates of structural divergence than non-interologs. We hypothesize that structural similarity among interologs reflects evolutionary constraints acting to preserve binding interfaces and essential interactions across species (Liu et al. 2007; Lovell and Robertson 2010). While previous studies have shown that interacting proteins (including interologs) often undergo correlated evolution (Teichmann 2002; Mintseris and Weng 2005; Leducq et al. 2012; Perica et al. 2012; Vo et al. 2016; Mukherjee and Chakrabarti 2021), structural convergence has not been widely recognized as a general outcome of these dynamics.

In our study, we use structural convergence to describe the tendency of interologous proteins to evolve toward increasingly similar structural configurations under shared functional constraints. Because interologs share common ancestry, convergence here does not imply the independent invention of identical folds. Rather, it captures correlated evolutionary dynamics in which selective pressures to preserve conserved interactions guide proteins along parallel structural trajectories, often at both global and local levels, such as binding interfaces. In this sense, structural convergence reflects the reduction of structural divergence between interacting partners across evolutionary timescales, consistent with the maintenance of essential and stable PPIs. Hence, our findings suggest that stronger affinities between topologically similar structures may guide the correlated evolution of PPI partners, such that even proteins with divergent sequences undergo analogous structural adaptations to maintain stable and precise interactions.

For example, conserved obligate assemblies such as the proteasome and ribosome exemplify the strong structural organization observed among interologs. In the proteasome, α-subunits (PSMA) and β-subunits (PSMB) assemble into the 20S core particle, which combines with regulatory particles to form the active 26S holoenzyme (Tanaka 2009). In the ribosome, large subunit proteins (RPLs) stabilize pre-rRNA structures and coordinate 60S maturation, and their disruption destabilizes preribosomal intermediates (Wild et al. 2010; Komili et al. 2007; Ohmayer et al. 2015). Despite sequence divergence, homologous RPLs from other eukaryotes can substitute for yeast proteins, underscoring the conservation of these structural roles (Ross et al. 2007).

By contrast, scaffold and regulatory proteins show a different pattern. For instance, the kinetochore protein Knl1 (also known as CASC5/Spc105) coordinates spindle checkpoint signaling through numerous short linear motifs and regions of low structural complexity, enabling dynamic and context-dependent interactions (Ghongane et al. 2014; Bollen 2014). Similarly, the yeast transcription factor Msn2 relies on an extensive intrinsically disordered region to integrate coactivator recruitment and promoter selection, with transcriptional activity diverging rapidly across orthologs despite conservation of its DNA-binding specificity (Mindel et al. 2024). Together, these case studies suggest that structural convergence is characteristic of obligate complexes but less apparent in flexible scaffolds or disordered regulators, where functional versatility is prioritized over strict structural conservation.

One alternative hypothesis for the observed structural similarity between interologs is the prevalence of homo-domain interactions, in which proteins sharing the same domain type preferentially interact due to inherent structural compatibility (Orlowski et al. 2007; Björkholm and Sonnhammer 2009; Maleki et al. 2011; Finn et al. 2014). While such interactions are well documented for specific domain families (e.g., SH3, WD40; Stirnimann et al. 2010; Kurochkina and Guha 2013; Jain and Pandey 2018), their impact should be similar for both interologs and non-interologs, as both categories encompass diverse protein interactions involving various domain architectures. If homo-domain interactions were the primary driver of our findings, we would therefore expect to see comparable levels of structural similarity between interologs and non-interologs. However, our results reveal significantly greater structural conservation between interologs, suggesting that this pattern is not merely a byproduct of domain-level compatibility but instead reflects evolutionary pressures acting specifically on conserved interactions. Such evolutionary pressures may result in a delicate balance between preserving interactions and enabling functional diversity.

Further analyses revealed that, in addition to their greater structural similarity, interologs exhibit lower intrinsic disorder than non-interologs. This observation is indirectly supported by previous studies showing that less disordered proteins are typically subject to strong evolutionary pressures (Siltberg-Liberles et al. 2011; Tóth-Petróczy and Tawfik 2011). Our gene ontology (GO) analysis bridges these findings by demonstrating that less disordered proteins are often involved in critical metabolic and biosynthetic processes, which tend to be conserved (Peregrín-Alvarez et al. 2009; Muto et al. 2013; Moolhuijzen et al. 2020). In contrast, the conformational flexibility of intrinsically disordered proteins enables their engagement in diverse interactions, such that they are commonly involved in regulatory roles (Wright and Dyson 2015; Bhattarai and Emerson 2020), as highlighted by our GO analysis. Though this versatility benefits dynamic cellular functions, it can also reduce interaction stability (Kaare et al. 2011) and increase susceptibility to mutations (Ward et al. 2004; Brown et al. 2011; Szalkowski and Anisimova 2011; Gerek et al. 2013; Marsh and Teichmann 2014), potentially explaining their lower evolutionary conservation. Together, our findings suggest that interologs may evolve along a unidirectional path characterized by structural convergence and reduced intrinsic disorder. This trajectory likely reflects an evolutionary propensity toward stable structures that facilitate essential biological interactions, reinforcing the structural and functional fidelity across species. Thus, the conservation of PPIs appears to impose structural constraints that enable proteins to remain robust yet adaptable within diverse cellular contexts.

Notably, our findings were mirrored in two taxa from different kingdoms, suggesting that the observed patterns may be broadly generalizable. Though we hypothesize that this is indeed the case, the scope of our conclusions is limited by the conservative design of our study. In particular, the STRING database (Szklarczyk et al. 2023) from which we obtained our PPIs integrates information from a variety of sources, potentially introducing inconsistencies or conflicting evidence for some interactions. We therefore chose to focus our analysis on PPIs that were experimentally verified, enhancing the reliability of our findings and ensuring that our conclusions were grounded in high-confidence data. Yet, availability of such high-quality data is biased toward well-studied organisms, thereby restricting the taxa that we were able to employ for our analyses. Further, our assessment of protein disorder was based on the AlphaFold pLDDT metric (Jumper et al. 2021) (see Methods), which does not directly represent disorder, but rather the level of agreement between predicted and experimental structures (Jumper et al. 2021; Ruff and Pappu 2021). Nevertheless, pLDDT is considered a reliable estimate of intrinsic disorder that has been widely employed for this purpose (Ruff and Pappu 2021; Tunyasuvunakool et al. 2021). Importantly, we did not solely rely on pLDDT when assessing structural similarity; we coupled these analyses with those of protein structural alignments, yielding consistent findings for both approaches. Hence, this study provides a foundation for further exploration of the interplay between PPIs, structural evolution, and conserved interaction networks. Whereas improved predictive models or experimental data may help to refine these findings, our work provides a crucial step toward understanding the evolutionary dynamics of protein networks, with broad implications for exploring protein function and resilience.

Methods

Data Acquisition and Processing

Protein structure data for mouse (Mus musculus), rat (Rattus norvegicus), brewer’s yeast (Saccharomyces cerevisiae), and fission yeast (Schizosaccharomyces pombe) were downloaded as macromolecular crystallographic information (mmCIF) and Protein Data Bank (PDB) files from version 4 of the AlphaFold protein structure database (Jumper et al. 2021) at https://alphafold.ebi.ac.uk. For each protein, we extracted predicted local distance difference test (pLDDT) values for all residues, which estimate how well the predicted structure agrees with the experimental structure (Jumper et al. 2021; Ruff and Pappu 2021). These values range from 0 to 100, with smaller values generally corresponding to higher intrinsic disorder (Jumper et al. 2021; Ruff and Pappu 2021). We considered mean pLDDT in our primary analyses, though we also computed the fraction of residues with pLDDT Inline graphic as a complementary measure that reflects the extent of disordered regions (Fig. S2). To normalize left-skewed distributions of pLDDTs, we applied the transformation Inline graphic, where x is pLDDT.
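The pLDDT extraction described above can be sketched as follows. This is a minimal Python illustration, not the authors' pipeline: it assumes AlphaFold PDB files, where per-residue pLDDT is stored in the B-factor field of each ATOM record, and reads one value per residue from the CA atoms. The function names, the toy `atom_line` helper, and the cutoff of 50 used in the example are all hypothetical; the cutoff used in the study is not recoverable from this text, so it is left as a parameter, and the normalizing transformation is not reproduced.

```python
from statistics import mean

def residue_plddts(pdb_text):
    """Return one pLDDT per residue, read from the B-factor field of CA atoms."""
    values = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values

def disorder_summary(plddts, low_cutoff):
    """Mean pLDDT and fraction of residues with pLDDT below low_cutoff."""
    frac_low = sum(v < low_cutoff for v in plddts) / len(plddts)
    return mean(plddts), frac_low

def atom_line(serial, name, res_seq, plddt):
    """Build a minimal fixed-width ATOM record (toy helper for the example)."""
    return (f"ATOM  {serial:>5} {name:<4} ALA A{res_seq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")

# Toy two-residue structure: one confident residue, one low-confidence residue.
toy_pdb = "\n".join([
    atom_line(1, "N", 1, 90.0),
    atom_line(2, "CA", 1, 90.0),
    atom_line(3, "CA", 2, 40.0),
])
plddts = residue_plddts(toy_pdb)
summary = disorder_summary(plddts, low_cutoff=50.0)  # cutoff chosen arbitrarily
```

Reading only CA atoms avoids counting a residue once per atom, since all atoms of a residue carry the same pLDDT in these files.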

Protein-protein interaction (PPI) data for all species were obtained from version 12.0 of the STRING database (Szklarczyk et al. 2023) at https://string-db.org. For each species, we downloaded files containing interactions with physical links. In total, there were 305,266 PPIs involving 15,686 unique proteins in mouse, 333,704 PPIs involving 15,213 unique proteins in rat, 171,914 PPIs involving 5,651 unique genes in brewer’s yeast, and 53,726 PPIs involving 3,316 unique genes in fission yeast. We then extracted only those with experimental support (experimental score Inline graphic). After this filtering step, there were 53,850 PPIs involving 9,747 unique proteins in mouse, 8,197 PPIs involving 2,833 unique proteins in rat, 121,926 PPIs involving 5,630 unique genes in brewer’s yeast, and 9,989 PPIs involving 2,562 unique genes in fission yeast (Tables S1 and S2). To generate a baseline for comparisons, we constructed a randomized dataset of 10,000 protein pairs by shuffling partners while preserving the pLDDT value of each protein. This approach maintained pLDDT scores while ensuring that pairs did not correspond to known interactions in the original dataset.
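One way to build such a randomized baseline is sketched below in Python. It is an illustration under stated assumptions rather than the authors' procedure: partners are re-paired by sampling from the two partner columns, self-pairs and pairs present in the original interaction set are rejected, and each protein keeps its own pLDDT because only the pairing changes. The function name, the `seed` argument, and the attempt limit are hypothetical.

```python
import random

def randomized_pairs(interactions, n_pairs, seed=0):
    """Draw up to n_pairs distinct protein pairs that are not known interactions.

    interactions: iterable of (protein_a, protein_b) tuples.
    """
    interactions = list(interactions)
    rng = random.Random(seed)
    known = {frozenset(p) for p in interactions}
    left = [a for a, _ in interactions]
    right = [b for _, b in interactions]
    pairs = set()
    attempts = 0
    # Rejection sampling with an attempt limit so the loop always terminates.
    while len(pairs) < n_pairs and attempts < 100 * n_pairs:
        attempts += 1
        a, b = rng.choice(left), rng.choice(right)
        candidate = frozenset((a, b))
        if a != b and candidate not in known:
            pairs.add(candidate)
    return sorted(tuple(sorted(p)) for p in pairs)

toy_ppis = [("A", "B"), ("C", "D"), ("E", "F")]
baseline = randomized_pairs(toy_ppis, n_pairs=4)
```

Storing pairs as unordered sets treats A-B and B-A as the same interaction, which matters when checking candidates against the known set.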

We obtained all 1:1 orthologs in rodents and in yeast from Ensembl release 110 (Martin et al. 2023) via the BioMart database (Smedley et al. 2009). In total, there were 15,362 1:1 orthologs in rodents and 4,011 1:1 orthologs in yeast. Because we were interested in associations of PPI conservation status with protein structural properties of interacting partners, rather than with their evolutionary gains or losses, for such analyses we only considered PPIs with 1:1 orthologs for both partners. After removing PPIs not meeting this requirement, there were 2,039 PPIs in rodents and 1,442 PPIs in yeast (Tables S3 and S4). Each PPI was classified as an interolog if it was observed in both species within a taxon (either rodents or yeast) and as a non-interolog if it was observed in only one of the two species (Fig. S1). Interologs comprised Inline graphic of the rodent dataset and Inline graphic of the yeast dataset, a difference that may reflect the more extensive experimental coverage and database representation of yeast PPIs. To accurately reflect the proportion of interolog versus non-interolog interactions for each protein, we relied on data containing duplicated ortholog entries, as these preserve the overall interaction status across multiple interactions of the same protein. A flowchart summarizing all data acquisition and processing steps is provided in Fig. S3.
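The classification rule can be illustrated with a short Python sketch. It assumes PPIs are given as unordered protein pairs and that 1:1 orthologs are available as a dictionary from one species to the other; the function and variable names are hypothetical. The sketch labels PPIs from the perspective of one species, so PPIs seen only in the second species would need a second call with the roles and the ortholog map reversed.

```python
def classify_ppis(ppis_a, ppis_b, orthologs_a_to_b):
    """Label each species-A PPI as 'interolog' or 'non-interolog'.

    ppis_a, ppis_b: iterables of (protein, protein) pairs for species A and B.
    orthologs_a_to_b: dict mapping species-A proteins to 1:1 orthologs in B.
    PPIs lacking a 1:1 ortholog for either partner are skipped.
    """
    observed_in_b = {frozenset(p) for p in ppis_b}
    labels = {}
    for p, q in ppis_a:
        if p not in orthologs_a_to_b or q not in orthologs_a_to_b:
            continue
        mapped = frozenset((orthologs_a_to_b[p], orthologs_a_to_b[q]))
        labels[(p, q)] = "interolog" if mapped in observed_in_b else "non-interolog"
    return labels

# Toy data with made-up identifiers.
mouse_ppis = [("m1", "m2"), ("m1", "m3"), ("m4", "m5")]
rat_ppis = [("r2", "r1")]
orthologs = {"m1": "r1", "m2": "r2", "m3": "r3"}
labels = classify_ppis(mouse_ppis, rat_ppis, orthologs)
```

Here the m1-m2 interaction maps onto an observed rat interaction and is labeled an interolog, m1-m3 is a non-interolog, and m4-m5 is dropped because neither partner has a 1:1 ortholog in the toy map.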

Comparisons of Protein Structures

We used three approaches to evaluate structural differences between interacting and orthologous proteins. First, we compared their intrinsic disorders by evaluating correlations between transformed pLDDTs (Fig. 1; see Data acquisition and processing). Second, we compared folded tertiary structures by aligning them and estimating their structural similarities and optimized RMSD values with FATCAT 2.0 (Li et al. 2020), a flexible alignment algorithm that accounts for rotations and translations while minimizing the overall RMSD between structures. Because our analyses span proteins from divergent species, we employed flexible rather than rigid alignments. Flexible FATCAT accounts for local structural rearrangements, domain motions, and evolutionary shifts in fold geometry, thereby capturing conserved structural cores even when global topologies have diverged. This property makes it particularly suitable for assessing structural similarities relevant to the preservation of protein-protein interactions across evolutionary timescales. Third, we complemented FATCAT alignments by generating independent alignments using TM-align to compute TM-scores, which assess both structural similarity and coverage of the aligned region.

To perform FATCAT alignments, we provided Protein Data Bank (PDB) files from AlphaFold (see Data acquisition and processing) as input, then extracted similarity scores and optimized RMSD values. Similarity scores range from 0 to 100 and estimate how well two structures superimpose (Yu et al. 2004), while optimized RMSD values measure the average distance between corresponding atoms in the aligned regions after flexible superposition. We provided the same PDB files to TM-align and extracted aligned lengths and TM-scores, which range from 0 to 1, with higher values indicating closer structural correspondence. Distributions of protein lengths and numbers of annotated domains were summarized for unique proteins in each taxon (Fig. S4). We also evaluated alignment lengths, which showed comparable distributions between interologs and non-interologs in rodents, but a moderate shift toward longer alignments for non-interologs in yeast (Fig. S5). This difference is unlikely to account for the observed results, as TM-align explicitly normalizes for alignment length, and consistent patterns were obtained across all structural similarity metrics.
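To make the two distance-based metrics concrete, the Python sketch below evaluates RMSD and the standard TM-score formula for a set of residue pairs that are already aligned and superimposed. It does not reproduce FATCAT or TM-align, which also search for the alignment and superposition that optimize these quantities. The function names are hypothetical, and the sketch assumes a reference length above 21 residues so that the length-dependent scale d0 stays positive.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superimposed atoms."""
    squared = [math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b)]
    return math.sqrt(sum(squared) / len(squared))

def tm_score(coords_a, coords_b, ref_length):
    """TM-score of a fixed superposition, normalized by ref_length residues."""
    if ref_length <= 21:
        raise ValueError("this sketch assumes ref_length > 21")
    # Length-dependent distance scale from the standard TM-score definition.
    d0 = 1.24 * (ref_length - 15) ** (1 / 3) - 1.8
    terms = [1 / (1 + (math.dist(a, b) / d0) ** 2)
             for a, b in zip(coords_a, coords_b)]
    # Unaligned residues contribute nothing, so partial coverage lowers the score.
    return sum(terms) / ref_length

# Toy example: 30 CA positions on a line, and a copy shifted by 3 angstroms.
chain = [(float(i), 0.0, 0.0) for i in range(30)]
shifted = [(x, y + 3.0, z) for x, y, z in chain]

identical_tm = tm_score(chain, chain, ref_length=30)
half_coverage_tm = tm_score(chain, chain, ref_length=60)
shifted_rmsd = rmsd(chain, shifted)
shifted_tm = tm_score(chain, shifted, ref_length=30)
```

Dividing by the reference length rather than the aligned length is what lets the TM-score reflect both similarity and coverage: a perfect match over half of a 60-residue reference scores 0.5, not 1.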

For Fig. 3, we selected representative examples of interologs and non-interologs with protein lengths close to the medians of their respective distributions (Fig. S5) and TM-scores above the median within their class. These criteria ensured that the chosen pairs were illustrative rather than extreme. Structural superimpositions of these examples were generated and visualized using PyMOL version 3.1.0 (Schrödinger LLC 2023).

Gene Ontology Enrichment Analyses

Gene ontology (GO) enrichment analysis was conducted with the STRING web-based tool at https://string-db.org (accessed on September 23, 2024) (Szklarczyk et al. 2023). Specifically, STRING was used to identify enriched GO terms in protein lists ranked by their pLDDT scores, using combined data from M. musculus and R. norvegicus for rodents, and S. cerevisiae and S. pombe for yeast. Note that GO analyses were performed separately for rodents and yeast. The input for STRING consisted of protein names and their ranked pLDDT scores, and the output included enriched GO terms, their associations with low or high pLDDT scores, and false discovery rate (FDR)-corrected P-values. The REVIGO tool at http://revigo.irb.hr/ (accessed on October 3, 2024) (Supek et al. 2011) was used to remove redundant GO terms and visualize cluster representatives by applying multidimensional scaling to a matrix of the semantic similarities of the GO terms (Fig. 5 middle, bottom).

Statistical Analyses

All statistical analyses were performed in R (R Core Team 2023) with the RStudio IDE (RStudio Team 2024). We used the cor.test() function in the stats package (R Core Team 2023) to estimate Pearson correlation coefficients (Pearson 1896) and evaluate their statistical significance for the relationships between transformed pLDDT scores (see Data acquisition and processing) in Figs. 1 and S2. Two-tailed Mann–Whitney U tests (Mann and Whitney 1947), implemented with the wilcox.test() function in the stats package (R Core Team 2023), were utilized to compare distributions in Figs. 2, 4, and 5.
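For readers who want to see what these two statistics compute, the sketch below reimplements them in plain Python. It is not the R code used in the study: it returns only the Pearson coefficient (no significance test), and its Mann–Whitney p-value uses the normal approximation with tie and continuity corrections, whereas wilcox.test() may use an exact test for small samples without ties. The function names are hypothetical, and the sketch assumes the two samples do not consist of a single repeated value.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mann_whitney_u(x, y):
    """U statistic for x and a two-sided p-value (normal approximation)."""
    n1, n2 = len(x), len(y)
    values = sorted(list(x) + list(y))
    rank_of, tie_term, i = {}, 0, 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    n = n1 + n2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (abs(u - mu) - 0.5) / sigma  # continuity correction
    p = math.erfc(max(z, 0.0) / math.sqrt(2))
    return u, p

r_value = pearson_r([1, 2, 3, 4], [4, 3, 2, 1])
u_value, p_value = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

The U statistic here is computed for the first sample, which matches the W value that wilcox.test() reports for its first argument.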

Author Contributions

Conceptualization: R.A. and K.M.C.; formal analysis: K.M.C.; funding acquisition: R.A.; investigation: K.M.C.; methodology: R.A., K.M.C., and A.A.P.; supervision: R.A.; visualization: K.M.C.; writing—original draft preparation: K.M.C.; writing—review and editing: R.A. and A.A.P.

Funding

This work was supported by National Institutes of Health grant R35GM142438 and National Science Foundation grant DBI-2130666.

Data Availability

Data produced and analyzed in this study are provided in Supplementary Tables S1-S6.

Declarations

Conflict of interest

The authors declare that they have no conflict of interest, financial or otherwise.

References

  1. Andreani J, Guerois R (2014) Evolution of protein interactions: from interactomes to interfaces. Arch Biochem Biophys 554:65–75
  2. Andreani J, Faure G, Guerois R (2012) Versatility and invariance in the evolution of homologous heteromeric interfaces. PLoS Comput Biol
  3. Bhattarai A, Emerson I (2020) Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. Springerlink, 45
  4. Björkholm P, Sonnhammer ELL (2009) Comparative analysis and unification of domain–domain interaction networks. Bioinformatics 25:3020–3025
  5. Bollen M (2014) Kinetochore signalling: the kiss that melts Knl1. Curr Biol 24:R68–R70
  6. Brown C, Johnson A, Dunker A, Daughdrill G (2011) Evolution and disorder. Curr Opin Struct Biol 21:441–446
  7. Brown KR, Jurisica I (2007) Unequal evolutionary conservation of human protein interactions in interologous networks. Genome Biol 8
  8. Chaurasia S, Dutheil JY (2022) The structural determinants of intra-protein compensatory substitutions. Mol Biol Evol 39
  9. Clark NL, Alani E, Aquadro CF (2012) Evolutionary rate covariation reveals shared functionality and coexpression of genes. Genome Res 22:714–720
  10. Damasceno CM, Bishop JG, Ripoll DR, Win J, Kamoun S, Rose JK (2008) Structure of the glucanase inhibitor protein (GIP) family from Phytophthora species suggests coevolution with plant endo-beta-1,3-glucanases. Mol Plant Microbe Interact 21:820–830
  11. Finn RD, Miller BL, Clements J, Bateman A (2014) iPfam: a database of protein family and domain interactions found in the Protein Data Bank. Nucleic Acids Res 42:D364–D373
  12. Fox A, Taylor D, Slonim DK (2009) High throughput interaction data reveals degree conservation of hub proteins. Pac Symp Biocomput, 391–402
  13. Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW (2002) Evolutionary rate in the protein interaction network. Science 296:750–752
  14. Gerek ZN, Kumar S, Ozkan SB (2013) Structural dynamics flexibility informs function and evolution at a proteome scale. Evol Appl 6:423–433
  15. Ghongane P, Kapanidou M, Asghar A, Elowe S, Bolanos-Garcia VM (2014) The dynamic protein Knl1—a kinetochore rendezvous. J Cell Sci 127:3415–3423
  16. Holland DO, Shapiro BH, Xue P, Johnson ME (2017) Protein-protein binding selectivity and network topology constrain global and local properties of interface binding networks. Sci Rep 7
  17. Jain BP, Pandey S (2018) WD40 repeat proteins: signalling scaffold with diverse functions. Protein 37:391–406
  18. Jumper J, Evans R, Pritzel A, Green T, Figurnov M et al (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589
  19. Kaare T, Johan G, Birthe B (2011) Protein stability, flexibility and function. Biochim Biophys Acta 1814:969–976
  20. Kann MG, Shoemaker BA, Panchenko AR, Przytycka TM (2009) Correlated evolution of interacting proteins: looking behind the mirrortree. J Mol Biol 385:91–98
  21. Karshikoff A, Nilsson L, Ladenstein R (2015) Rigidity versus flexibility: the dilemma of understanding protein thermal stability. FEBS J 282:3899–3917
  22. Kesteren R, Tensen C, Smit A, Minnen J, Kolakowski L, Meyerhof W, Richter D, Heerikhuizen H, Vreugdenhil E, Geraerts W (1996) Co-evolution of ligand-receptor pairs in the vasopressin/oxytocin superfamily of bioactive peptides. Nucleic Acids Protein Synth Mol Genet 271:3619–3626
  23. Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge
  24. Komili S, Farny NG, Roth FP, Silver PA (2007) Functional specificity among ribosomal proteins regulates gene expression. PLoS Biol 131:557–571
  25. Kurochkina N, Guha U (2013) SH3 domains: modules of protein-protein interactions. Biophys Rev 5:29–39
  26. Leducq JB, Charron G, Diss G, Gagnon-Arsenault I, Dube AK (2012) Evidence for the robustness of protein complexes to inter-species hybridization. PLoS Genet
  27. Li Z, Jaroszewski L, Iyer M, Sedova M, Godzik A (2020) FATCAT 2.0: towards a better understanding of the structural diversity of proteins. Nucleic Acids Res 48:W60–W64
  28. Liu QX, Nakashima-Kamimura N, Ikeo K, Hirose S, Gojobori T (2007) Compensatory change of interacting amino acids in the coevolution of transcriptional coactivator MBF1 and TATA-box–binding protein. Mol Biol Evol 24:1458–1463
  29. Lovell S, Robertson D (2010) An integrated view of molecular coevolution in protein–protein interactions. Mol Biol Evol 27:2567–2575
  30. Lukatsky D, Shakhnovich B, Mintseris J, Shakhnovich E (2007) Structural similarity enhances interaction propensity of proteins. J Mol Biol 365:1596–606
  31. Maleki M, Aziz M, Rueda L (2011) Analysis of obligate and non-obligate complexes using desolvation energies in domain-domain interactions. Association for Computing Machinery, New York
  32. Mandloi S, Chakrabarti S (2017) Protein sites with more coevolutionary connections tend to evolve slower, while more variable protein families acquire higher coevolutionary connections. F1000Res
  33. Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18:50–60
  34. Marsh JA, Teichmann SA (2014) Protein flexibility facilitates quaternary structure assembly and evolution. PLoS Biol 12(5):e1001870
  35. Martin FJ, Amode MR, Aneja A, Austine-Orimoloye O, Azov AG et al (2023) Ensembl 2023. Nucleic Acids Res 51(D1):D933–D941
  36. Mindel V, Brodsky S, Cohen A, Manadre W, Jonas F et al (2024) Intrinsically disordered regions of the Msn2 transcription factor encode multiple functions using interwoven sequence grammars. Nucleic Acids Res 52:2260–2272
  37. Mintseris J, Weng Z (2005) Structure, function, and evolution of transient and obligate protein–protein interactions. PNAS 102:10930–10935
  38. Moolhuijzen PM, Muria-Gonzalez MJ, Syme R, Rawlinson C, See PT, Moffat CS, Ellwood SR (2020) Expansion and conservation of biosynthetic gene clusters in pathogenic Pyrenophora spp. Toxins 12
  39. Mukherjee I, Chakrabarti S (2021) Co-evolutionary landscape at the interface and non-interface regions of protein-protein interaction complexes. Comput Struct Biotechnol J 19:3779–3795
  40. Muto A, Kotera M, Tokimatsu T, Nakagawa Z, Goto S, Kanehisa M (2013) Modular architecture of metabolic pathways revealed by conserved sequences of reactions. J Chem Inf Model 53
  41. Naveenkumar N, Prabantu VM, Vishwanath S, Sowdhamini R, Srinivasan N (2022) Structures of distantly related interacting protein homologs are less divergent than non-interacting homologs. FEBS Open Bio 12:2147–2153
  42. Ohmayer U, Gil-Hernández Á, Sauert M, Martín-Marcos P, Tamame M et al (2015) Studies on the coordination of ribosomal protein assembly events involved in processing and stabilization of yeast early large ribosomal subunit precursors. PLoS ONE 10
  43. Orlowski J, Kaczanowski S, Zielenkiewicz P (2007) Overrepresentation of interactions between homologous proteins in interactomes. FEBS Lett 581:52–56
  44. Pearson K (1896) Mathematical contributions to the theory of evolution. III. Regression, heredity, and panmixia. Philos Trans R Soc Lond Ser A Contain Pap Math Phys Charact 187:253–318
  45. Peregrín-Alvarez JM, Sanford C, Parkinson J (2009) The conservation and evolutionary modularity of metabolism. Genome Biol 10
  46. Perica T, Chothia C, Teichmann SA (2012) Evolution of oligomeric state through geometric coupling of protein interfaces. PNAS 109:8127–8132
  47. R Core Team (2023) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
  48. Ross CL, Patel RR, Mendelson TC, Ware VC (2007) Functional conservation between structurally diverse ribosomal proteins from Drosophila melanogaster and Saccharomyces cerevisiae: fly L23a can substitute for yeast L25 in ribosome assembly and function. Nucleic Acids Res 35:4503–14
  49. RStudio Team (2024) RStudio: integrated development environment for R. Posit Software, PBC, Boston
  50. Ruff K, Pappu R (2021) AlphaFold and implications for intrinsically disordered proteins. J Mol Biol 433:167208
  51. Schreiber G, Keating AE (2011) Protein binding specificity versus promiscuity. Curr Opin Struct Biol 21:50–61
  52. Schrödinger LLC (2023) The PyMOL molecular graphics system, Version 3.1.0. Schrödinger, LLC, New York
  53. Siltberg-Liberles J, Grahnen JA, Liberles DA (2011) The evolution of protein structures and structural ensembles under functional constraint. Genes 2:748–762
  54. Smedley D, Haider S, Ballester B, Holland R, London D et al (2009) BioMart–biological queries made easy. BMC Genom
  55. Stirnimann CU, Petsalaki E, Russell RB, Müller CW (2010) WD40 proteins propel cellular networks. Trends Biochem Sci 35:565–574
  56. Storz JF (2018) Compensatory mutations and epistasis for protein function. Curr Opin Struct Biol 50:18–25
  57. Supek F, Bošnjak M, Škunca N, Šmuc T (2011) REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS ONE 6
  58. Szalkowski AM, Anisimova M (2011) Markov models of amino acid substitution to study proteins with intrinsically disordered regions. PLoS ONE 6:e20488
  59. Szklarczyk D, Kirsch R, Koutrouli M, Nastou K, Mehryary F (2023) The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 51(D1):D638–D646
  60. Tanaka K (2009) The proteasome: overview of structure and functions. Proc Jpn Acad Ser B Phys Biol Sci 85:12–36
  61. Teichmann SA (2002) The constraints protein-protein interactions place on sequence divergence. J Mol Biol 324:399–407
  62. Teppa E, Zea DJ, Marino-Buslje C (2017) Protein–protein interactions leave evolutionary footprints: High molecular coevolution at the core of interfaces. Protein Sci 26:2438–2444
  63. Tunyasuvunakool K, Adler J, Wu Z (2021) Highly accurate protein structure prediction for the human proteome. Nature 596:590–596
  64. Tóth-Petróczy A, Tawfik DS (2011) Slow protein evolutionary rates are dictated by surface–core association. PNAS 108
  65. Vo TV, Das J, Meyer MJ, Pleiss JA, Xia Y, Yu H (2016) A proteome-wide fission yeast interactome reveals network evolution principles from yeasts to human. Cell 164:310–323
  66. Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337:635–645
  67. Wild T, Horvath P, Wyler E, Widmann B, Badertscher L et al (2010) A protein inventory of human ribosome biogenesis reveals an essential function of exportin 5 in 60S subunit export. PLoS Biol
  68. Worth CL, Gong S, Blundell TL (2009) Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol 10:709–720
  69. Wright P, Dyson H (2015) Intrinsically disordered proteins in cellular signaling and regulation. Nat Rev Mol Cell Biol 16:18–29
  70. Wuchty S, Oltvai ZN, Barabási AL (2003) Evolutionary conservation of motif constituents in the yeast protein interaction network. Nat Genet 35:176–179
  71. Yu H, Luscombe NM, Lu HX, Zhu X, Xia Y et al (2004) Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs. Genome Res 14:1107–18

Articles from Journal of Molecular Evolution are provided here courtesy of Springer