The Journal of Biological Chemistry. 2024 Feb 8;300(3):105736. doi: 10.1016/j.jbc.2024.105736

Rheostats, toggles, and neutrals, Oh my! A new framework for understanding how amino acid changes modulate protein function

Liskin Swint-Kruse 1, Aron W Fenton 1
PMCID: PMC10914490  PMID: 38336297

Abstract

Advances in personalized medicine and protein engineering require accurately predicting outcomes of amino acid substitutions. Many algorithms correctly predict that evolutionarily-conserved positions show “toggle” substitution phenotypes, which is defined when a few substitutions at that position retain function. In contrast, predictions often fail for substitutions at the less-studied “rheostat” positions, which are defined when different amino acid substitutions at a position sample at least half of the possible functional range. This review describes efforts to understand the impact and significance of rheostat positions: (1) They have been observed in globular soluble, integral membrane, and intrinsically disordered proteins; within single proteins, their prevalence can be up to 40%. (2) Substitutions at rheostat positions can have biological consequences and ∼10% of substitutions gain function. (3) Although both rheostat and “neutral” (defined when all substitutions exhibit wild-type function) positions are nonconserved, the two classes have different evolutionary signatures. (4) Some rheostat positions have pleiotropic effects on function, simultaneously modulating multiple parameters (e.g., altering both affinity and allosteric coupling). (5) In structural studies, substitutions at rheostat positions appear to cause only local perturbations; the overall conformations appear unchanged. (6) Measured functional changes show promising correlations with predicted changes in protein dynamics; the emergent properties of predicted, dynamically coupled amino acid networks might explain some of the complex functional outcomes observed when substituting rheostat positions. Overall, rheostat positions provide unique opportunities for using single substitutions to tune protein function. Future studies of these positions will yield important insights into the protein sequence/function relationship.

Keywords: protein evolution, protein engineering, protein function, protein stability, personalized medicine


Many applications, from protein engineering to interpreting differences in genome sequences, require accurately predicting the effects of amino acid substitutions. Unfortunately, most prediction algorithms have low success rates, indicating that some fundamental feature(s) of the sequence/function relationship have not yet been accounted for (1). The purpose of the current review is to briefly describe the limitations and biases that have historically affected prediction algorithms and then to review the past decade’s experimental studies that aimed to rectify deficiencies in the data used to build prediction algorithms. These studies have revealed several surprising features of an important class of “rheostat” protein positions, for which substitutions modulate protein function. In the future, it will be critical to understand these features at the biophysical level to improve predictions about functional outcomes.

As a prelude to these topics, it is useful to first define “protein function”. In this article, “function” refers broadly to any molecular event performed by a folded protein (with the exception of intrinsically disordered proteins, discussed below) and not including changes in protein stability. When particular examples are reviewed, biochemical descriptors are provided (e.g., binding affinity, Kd). We further note that a complete description of “protein function” usually requires multiple biochemical parameters. For example, full characterization of an enzyme includes binding affinities for all substrates and turnover numbers for forward and reverse reactions. Ideally, one would monitor all parameters associated with all of a protein’s functions, but this is often impractical. To circumvent this limitation, some studies monitor substitution outcomes using biological assays that are sensitive to changes in multiple functional parameters; these assays are often also sensitive to changes in protein concentration, folding, or stability. The tradeoff is that biological readouts are less informative for learning how proteins actually work; biochemical and biophysical assays that quantify discrete functional aspects (or stability) provide higher-resolution insight. Fortunately, even with their respective limitations, both biochemical and biological studies have been useful for elucidating the characteristics of rheostat positions.

Historical limitations of prediction algorithms and the existence of rheostat positions were identified by a “semiSM” experimental design

Nearly all prediction algorithms incorporate the same input and three underlying assumptions about protein substitutions. Their common input is the use of a multiple sequence alignment (MSA) for each protein of interest. Within each MSA, sequence variation is treated as a “natural mutation experiment”: If a particular amino acid is present in a particular MSA column (amino acid position) that amino acid is presumed to allow protein function to occur. A corollary is that the absence of a particular amino acid from a position’s column is often inferred to indicate that the substitution is detrimental.
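The "natural mutation experiment" reading of an MSA column can be sketched in a few lines of Python. This is only an illustration of the inference described above, not any published algorithm: the toy alignment and the function names (`column_profile`, `presumed_tolerated`, `inferred_detrimental`) are hypothetical.

```python
from collections import Counter

# Toy alignment (hypothetical sequences; one string per homolog, equal lengths).
msa = [
    "MSTKL",
    "MSSKL",
    "MTTRL",
    "MSTKI",
]

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def column_profile(alignment, col):
    """Count the amino acids observed in one alignment column (gaps are ignored)."""
    return Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)


def presumed_tolerated(alignment, col):
    """Amino acids presumed to allow function because they occur in the column."""
    return set(column_profile(alignment, col))


def inferred_detrimental(alignment, col):
    """Corollary: amino acids absent from the column are often inferred to be detrimental."""
    return AMINO_ACIDS - presumed_tolerated(alignment, col)


for col in range(len(msa[0])):
    observed = dict(column_profile(msa, col))
    n_absent = len(inferred_detrimental(msa, col))
    print(f"column {col}: observed={observed}, absent={n_absent}")
```

In this toy example, the second column contains only serine and threonine, mirroring the hydroxyl-containing active-site example discussed in the next paragraph. The studies reviewed here show that this inference often fails at rheostat positions.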

The three additional assumptions that underlie prediction algorithms are the culmination of decades of substitution experiments and analyses of protein families. The first two “rules” are even taught to undergraduates: (i) If a particular position in a protein is “important”, most amino acid substitutions are expected to be catastrophic. (ii) A few side chains might be tolerated (allow function) if they are chemically similar to the wild-type amino acid; these alternative amino acids are often observed in an MSA column. For example, if a side chain hydroxyl is required in an active site, all homologs contain either a serine or a threonine at that position. The third (albeit often implicit) assumption arises because an MSA is used as input: (iii) Information about an amino acid change at one position in one protein can be extrapolated to the analogous position in all other homologs.

Unfortunately, these assumptions about substitution outcomes have been influenced by an under-appreciated bias: Before ∼2012, most substitution studies were performed at evolutionarily conserved positions (2). At these positions, many changes are indeed catastrophic and “toggle off” function, consistent with the canonical rules. (Indeed, historic studies of evolutionarily-conserved positions provided the observations used to generate the canonical rules.) In contrast, changes at nonconserved positions were largely overlooked, despite their key roles in paralog evolution. Thus, a decades-long study of LacI and other members of the LacI/GalR family was carried out to test whether the canonical rules described above are appropriate for understanding amino acid substitutions at nonconserved positions (3, 4, 5, 6, 7, 8, 9, 10, 11, 12). A key design feature of these (and subsequent) studies was to substitute each position with ∼10 to 14 different, randomly chosen amino acids—a semi-saturating mutagenesis approach (semiSM). Protein variants were then assessed with functional assays.

This semiSM approach revealed a novel class of “rheostat” positions (e.g., Fig. 1; (11)) that would not have been readily apparent from other strategies used to identify functionally relevant positions, such as alanine scans, directed evolution, and ancestral reconstruction. Furthermore, substitutions at the rheostat positions did not follow any of the canonical substitution rules listed above: First, the protein variants derived from semiSM of each position exhibited a wide range of functional outcomes. For example, ten substitutions at one LacI position exhibited DNA binding affinities that sampled nearly three orders of magnitude (4); follow-up studies with other positions in LacI and its homologs identified many positions with this behavior (5, 6, 7, 8, 10, 11). Second, neither side-chain chemical similarities nor evolutionary frequency explained the rank order of substitution outcomes observed for individual positions (11); these findings were not unique to the LacI/GalR homologs – other studies also noted such puzzling results (13, 14, 15). Third, a specific substitution in one homolog did not cause the same functional change in other homologs (5, 6, 7, 8, 10, 11, 12). Fourth, ∼10% of substitutions at rheostat positions showed a gain-of-function (11); a similar percentage of gain-of-function variants was identified in a recent search of the variants in the Human Gene Mutation Database (16). Finally, separately predicting the outcomes of amino acid changes for both rheostat and toggle positions, using 16 mathematically distinct computer algorithms, was successful for toggle positions but not rheostat positions (1).

Figure 1.

Examples of two conventional and three possible rheostat substitution outcomes. The overall substitution behaviors of neutral (top left), toggle (top right), and rheostat (bottom) positions can be revealed by a semiSM approach. The double arrow in the middle of this figure represents the total range of possible functional outcomes that can be measured for a hypothetical functional parameter; circles are used to indicate the relative locations of wild-type (WT) and “dead” variants; the arrow extends to the left of WT to indicate gain-of-function variants (“GoF”). Five possible outcomes are shown: (upper left) neutral position, all substituted variants cluster with the wild-type phenylalanine; (upper right) toggle position, all substitutions are catastrophic; (below the arrow) three versions of a rheostat position, variants exhibit a range of functional outcomes. The three rheostat examples shown here exhibit (i) an “ideal” rheostat behavior, for which the 20 possible substitutions evenly sample the full functional range; (ii) a “partial range” behavior, for which 20 substitutions sample more than half but less than the full functional range; and (iii) a “sampled range” behavior, for which fewer than 20 substitutions are sufficient to sample more than half of the full functional range (white circles represent untested amino acids). These five examples are also used in Figure 2 to exemplify the modified histogram analysis for quantitative neutral, toggle, and rheostat assignments.

Clearly, a better understanding of this under-appreciated class of rheostat positions provides one avenue for improving predictions of functional change. In the years since the LacI/GalR studies, semiSM studies of multiple model proteins have been performed to (i) assess which structural classes of proteins have the potential to contain rheostat positions; (ii) assess the prevalence of rheostat positions within individual proteins; (iii) assess whether rheostat positions are associated with a particular type of “nonconservation”; (iv) identify nonconserved positions that are not important to function (i.e., neutral positions, which accept most substitutions without consequence) and are thus distinct from rheostat positions; (v) explore the range of functional parameters that can be affected by substitutions at rheostat positions; and (vi) glean information about potential conformational and dynamic changes that give rise to the non-canonical substitution outcomes.

Rheostat, toggle, and neutral positions can be objectively assigned

In addition to revealing the novel class of rheostat positions, a comprehensive study of LacI/GalR homologs (11) identified positions with the two conventional substitution outcomes illustrated in Figure 1: First, as mentioned above, the canonical rules predict that substitutions at important amino acid positions exhibit a “toggle” outcome; that is, the function would either be “on” or “off”, depending on which side chain is present. Second (and at the other extreme), nonconserved positions are often expected to be “neutral”; that is, substitutions at these positions have no functional effect (with the possible exceptions of proline and glycine, which alter the properties of the peptide backbone).

Assigning rheostat, toggle, and neutral outcomes to each position in the LacI/GalR study was readily accomplished by visual inspection of the data, which spanned five orders of magnitude. In subsequent studies, the need for quantitative assignments became clear and a modified histogram analysis (“RheoScale”) was developed (Fig. 2; (17)). Using this scoring system, a quantitative definition of rheostat positions was based on their sets of substitutions sampling at least half of the possible functional range. This scoring system is further described in the legend to Figure 2 and was used to mathematically demonstrate that a semiSM approach can obviate the expense of site-saturating mutagenesis of an individual amino acid position (SSM; all 20 amino acids assessed at each position) (17).
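The intuition for why a semiSM design can stand in for SSM can be illustrated with a small simulation. This is a sketch under stated assumptions, not the mathematical demonstration in (17): the functional range is rescaled to 0 to 1, ten equal-width bins are used, the "ideal" rheostat position is hypothetical, and the unweighted fraction of occupied bins is used as a simplified stand-in for the rheostat score.

```python
import random

N_BINS = 10  # assumption: ten equal-width bins across the functional range


def bin_index(value, lo, hi, n_bins=N_BINS):
    """Map a value in [lo, hi] to a bin index in 0..n_bins-1."""
    frac = (value - lo) / (hi - lo)
    return min(int(frac * n_bins), n_bins - 1)


def fraction_bins_occupied(values, lo, hi, n_bins=N_BINS):
    """Unweighted fraction of histogram bins that contain at least one variant."""
    occupied = {bin_index(v, lo, hi, n_bins) for v in values}
    return len(occupied) / n_bins


# Hypothetical "ideal" rheostat position: 20 substitutions evenly spread over
# a functional range that has been rescaled to run from 0 to 1.
full_set = [(i + 0.5) / 20 for i in range(20)]

# Repeatedly draw a random semiSM subset of 10 substitutions and record how
# much of the range the subset still samples.
rng = random.Random(0)  # fixed seed so the simulation is repeatable
fractions = [
    fraction_bins_occupied(rng.sample(full_set, 10), 0.0, 1.0)
    for _ in range(1000)
]

print("SSM (20 substitutions):", fraction_bins_occupied(full_set, 0.0, 1.0))
print("semiSM (10 substitutions), worst case:", min(fractions))
print("semiSM (10 substitutions), mean:", sum(fractions) / len(fractions))
```

Because this idealized position places two substitutions in every bin, any ten of them must occupy at least five bins, so every semiSM subset still samples at least half of the range. Real positions are less evenly distributed, which is why the quantitative treatment in (17) is needed.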

Figure 2.

Example histogram analyses used to derive rheostat (R), toggle (T), and neutral (N) scores for the examples in Figure 1. When functional outcomes are known for a set of substitutions at a single position (either semiSM or SSM), modified histogram analysis data can be used to classify each amino acid position as a neutral, rheostat, or toggle position. This analysis can be used for any quantitatively assessed functional parameter (biochemical or biological), as well as measures of protein stability. Two features are important for this analysis: the overall range and the number of bins. The histogram range is derived from the entire substitution set, which preferably includes data for multiple positions. As shown by the arrow in Figure 1, one end of the range is defined by a “dead” variant that lacks detectable activity (magenta dot). The other end is defined by the “best” observed function; this could be the wild-type variant (green dot), although many studies reported to date have identified variants with function better than wild-type. Considerations for choosing the bin number are described in (17). Empirically, ten bins work well for most datasets; sparse datasets can be analyzed with as few as four bins (e.g., LacI in (33)). Analyses score bin occupancy as either 0 (unoccupied) or 1 (occupied by any number of substitutions); to account for experimental error, intermediate bins can be weighted more heavily than those adjacent to the WT; the occupancy of all bins is used to calculate rheostat, toggle, and neutral scores, each of which ranges from 0 to 1. Here, the examples in Figure 1 were simulated with a WT functional value of 1, a “best” value of 0.2, and a “dead” value of 30. Histogram bins are labeled with the log of the value of their upper limit. Four examples were simulated with n = 20 variants; the third rheostat example was simulated with ten variants. The R, N, and T scores associated with each example are shown within the plots.
The neutral score reflects the percent of non-WT substitutions that exhibit WT-like function; factors relevant to a significance threshold of 0.7 are in (31); a truly neutral position must be neutral for all parameters that can be measured (31). The toggle score reflects the fraction of substitutions that result in “dead” protein; factors relevant to a significance threshold of 0.64 are in (1). The rheostat score reflects how well a set of substitutions samples the observed range; the significance threshold of 0.5 can be simplistically interpreted as “the set of substitutions samples at least half of the functional range” (17). The sampled range can either be continuous (e.g., “ideal rheostat” and “partial range”) or discontinuous (e.g., “sampled range”).
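The scoring logic in this legend can be sketched as follows. This is a deliberately simplified illustration, not the published RheoScale calculator (17): bin weighting is omitted, "WT-like" and "dead" are approximated as "falls in the same bin as WT" or "falls in the same bin as the dead value", the WT value is assumed to count toward bin occupancy, and the order in which thresholds are checked in `classify` is an assumption. The WT, best, and dead values and the thresholds (0.7, 0.64, 0.5) are taken from the legend.

```python
import math

WT, BEST, DEAD = 1.0, 0.2, 30.0  # simulated values from the Figure 2 legend
N_BINS = 10                      # ten bins, as suggested for most datasets
LO, HI = math.log10(BEST), math.log10(DEAD)


def bin_index(value):
    """Assign a functional value to a log-scaled histogram bin (0..N_BINS-1)."""
    x = min(max(math.log10(value), LO), HI)  # clamp to the observed range
    return min(int((x - LO) / (HI - LO) * N_BINS), N_BINS - 1)


def scores(variant_values):
    """Simplified, unweighted neutral (N), toggle (T), and rheostat (R) scores."""
    bins = [bin_index(v) for v in variant_values]
    wt_bin, dead_bin = bin_index(WT), bin_index(DEAD)
    n = len(bins)
    return {
        "N": sum(b == wt_bin for b in bins) / n,    # fraction that is WT-like
        "T": sum(b == dead_bin for b in bins) / n,  # fraction that is dead
        "R": len(set(bins) | {wt_bin}) / N_BINS,    # fraction of bins occupied
    }


def classify(s):
    """Apply the significance thresholds quoted in the legend (order assumed)."""
    if s["N"] >= 0.7:
        return "neutral"
    if s["T"] >= 0.64:
        return "toggle"
    if s["R"] >= 0.5:
        return "rheostat"
    return "unclassified"


# Three hypothetical positions, each with 20 substitutions.
neutral_pos = [1.0, 0.95, 1.1, 1.2, 1.05] * 4
toggle_pos = [30.0] * 20
ideal_rheostat_pos = [10 ** (LO + (i + 0.5) / 20 * (HI - LO)) for i in range(20)]

for name, values in [("neutral", neutral_pos),
                     ("toggle", toggle_pos),
                     ("ideal rheostat", ideal_rheostat_pos)]:
    s = scores(values)
    print(name, s, "->", classify(s))
```

Positions that meet none of the thresholds are returned as "unclassified", which corresponds to the intermediate behaviors along the neutral-rheostat-toggle continuum.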

Among its advantages, RheoScale analyses avoid the false dichotomy of classifying each position as either “WT-like” or “deleterious”; when dichotomous categories are used to classify either position phenotypes or substitution outcomes, results with intermediate values are often ignored (e.g., (18, 19)). Furthermore, Rheostat scores (i) provide a convenient summary of the SSM datasets that are usually shown as large heatmaps (e.g., (20, 21, 22, 23, 24, 25)) but are difficult to digest or compare to structures, yet (ii) retain the position-specific information that is obscured in histograms that aggregate all substitution data for all positions (e.g., (23, 25, 26, 27)). A “weighted score” was recently used to quantify inactive variants (27) and is similar to the “toggle score” of RheoScale, but such a score misses the important rheostat positions. Rheostat positions can also be obscured when SSM data are analyzed as the average/median for each position (e.g., (28, 29), see (30) for further discussion).

Finally, examination of the RheoScale scores for the various proteins described below suggests the neutral, rheostat, and toggle substitution behaviors themselves fall along a continuum: Studies have observed (i) neutral positions (11, 31, 32); (ii) positions for which substitution outcomes are not neutral but do not meet the rheostat score threshold, which have been referred to as “non-neutral” and “modest” rheostat positions in various publications (33, 34, 35); (iii) rheostat positions with substitution outcomes that sample the entire range (11, 33, 36); (iv) rheostat positions for which many substitutions are deleterious or catastrophic, i.e., at least half of the range is sampled, but the sampling is skewed towards the most deleterious outcomes (32, 37); and (v) toggle positions (1, 17).

Rheostat positions are widespread and biologically important

Rheostat positions are present in many protein types

The continuum of neutral, rheostat, and toggle substitution behaviors described above was collated from a wide variety of protein types: Since the first identification of rheostat positions was limited to homologs in the LacI/GalR family (11), the next generation of studies was designed to determine how widespread rheostat positions might be. To simplify the vast possibilities, one approach was to sample proteins that evolved under divergent structural constraints: globular soluble proteins, integral membrane proteins, and intrinsically disordered proteins (Table 1). Rheostat positions were identified in all three structural types; definitions of the various functional parameters measured and technical considerations important to these studies are listed in the Table 1 footnotes. In addition, retrospective analyses of other SSM datasets assessed with high-throughput techniques identified rheostat positions in a variety of other proteins (e.g., (29, 38, 39, 40, 41, 42, 43)). Rheostat positions can also be retrospectively inferred from alanine scans: Substitutions with partially detrimental (i.e., not catastrophic) or enhancing functional outcomes are likely to occur at rheostat positions (44). Our review of these ever-expanding datasets has not been exhaustive and, given the growing popularity of these high-throughput approaches, we expect many more rheostat positions will be identified in a wide variety of protein structures.

Table 1.

Model proteins used in studies of rheostat positions

| Protein (abbreviation; # of amino acids per monomer)^a | Overall function | Parameters measured^b,c | # Rheostat positions identified to date^d | Citations |
| --- | --- | --- | --- | --- |
| Globular soluble | | | | |
| E. coli lactose repressor protein (LacI; 360 aa) | Allosterically regulated transcription repressor | (1) In vivo repression benchmarked against DNA binding affinity; (2) In vivo induction at high IPTG^e concentration; EC50 for in vivo IPTG induction; (3) IPTG binding affinity; (4) Allosteric response | (1) 9 of 12 positions via semiSM and assayed by high-resolution in vivo repression assays; 1 of 1 position verified by biochemical assays; (2) 149 of 320 positions via semiSM assessed by low-resolution in vivo assays | (1, 3, 4, 11, 17, 37, 54, 57, 60, 139)^f |
| Engineered E. coli LacI/GalR paralogs (LLhX series, ∼320 aa) | Allosterically regulated transcription repressors | In vivo repression benchmarked against DNA binding affinity | 65 of 112 positions via semiSM | (5, 6, 7, 8, 10, 11, 12, 17, 49) |
| Human liver pyruvate kinase (hLPYK; 543 aa) | Allosterically regulated enzyme | (1) Apparent^g binding affinity for substrate (Kapp-PEP^h); (2) Apparent binding affinity for allosteric activator (KFBP^h) and allosteric inhibitor (KAla^h); (3) Allosteric coupling between active site and each of the allosteric sites (QFBP, QAla) | (1) 18 of 33 positions via semiSM substitutions; (2) >160 (of 543) rheostat positions estimated from alanine substitutions | (17, 33, 36, 44, 59, 99, 100)^i |
| Zymomonas mobilis pyruvate kinase (ZmPYK; 475 aa) | Non-allosterically regulated enzyme | Apparent^g binding affinity for substrate (Kapp-PEP) | Zero of 26 positions via semiSM | (34) |
| Human aldolase A (aldolase; 364 aa) | Non-allosterically regulated enzyme | Apparent^g binding affinity for substrate (Kapp), cooperativity, structures | 1 of 1 position via semiSM | (106) |
| Integral membrane | | | | |
| Sodium taurocholate co-transporting polypeptide (NTCP; 349 aa) | Bile acid and statin transport | Transport at a fixed substrate concentration normalized to cell surface expression, Michaelis constant (Km), maximal velocity (Vmax), computed energetic effects on structure | 4 of 4 positions via SSM | (32, 50, 55) |
| Organic anion transporting polypeptide 1B1 (OATP1b1; 691 aa) | Hepatic drug transport | Transport at a fixed substrate concentration normalized to cell surface expression | 1 of 1 position via semiSM; 11 estimated from cysteine-scanning variants | (140) |
| Intrinsically disordered | | | | |
| Human MHC class II transactivator protein (CIITA; isoform 3; 1130 aa) | Eukaryotic transcriptional activator | In vivo transcriptional activation normalized to cellular concentration | 4 of 7 positions via semiSM | (35) |
a

One criterion in selecting these proteins was that all functional assays were either independent of or accounted for changes in protein stability/cellular concentration. For example, in catalytic reactions, enzyme concentration is extremely low and measurements of Kd,app and allosteric coupling are independent of concentration. Despite these efforts, substitution effects could not be disentangled for some “dead” variants, which derive from either lost function or structural disruption (33, 34, 36). Fortunately, at rheostat positions, dead variants comprise the minority of substitutions and thus do not impact conclusions about the overall roles of these positions.

b

Functional assays should allow assessments for dozens, if not hundreds, of amino acid substitutions. For enzymes, the ability to carry out assays after partial purification increased throughput (31, 33, 34, 59, 106). For other proteins, protein purification is precluded with in vivo measurements, with the caveat that measured changes can arise from changes in multiple biochemical parameters, including effects on protein concentration/stability. Fortunately, the latter can be detected and compensated using parallel experiments to measure concentration (e.g., (11, 35, 50, 55, 141)). The possibility of combined outcomes is especially important to keep in mind for “deep mutational scanning” studies (e.g., (38, 41, 48, 142)), which – in addition to combining potential changes from multiple biochemical parameters – often make assumptions to connect the measured outcome (variant frequency within a genetic population) to a substitution’s effect(s) on the protein of interest; many of these studies are now adopting parallel assays of protein concentration (e.g., (143, 144, 145)).

c

To detect the intermediate functional outcomes common at rheostat positions, assay resolution is critical. With a few exceptions, biochemical and biophysical studies usually have smaller experimental errors than high-throughput in vivo studies. Experimental errors should be especially considered for assays with a narrow overall range; multiple biological and technical replicates for wild-type protein (10 or more) can provide a good description of error (31, 33, 35, 106). Fortunately, high-throughput robotics are being developed that allow biochemical measures for hundreds of variants (45, 146, 147) and will greatly advance experimental studies of rheostat positions.

d

The first number in this column (e.g., 9 or 149 for LacI in row 1) gives the number of rheostat positions experimentally identified by the method indicated; the second number (e.g., 12 or 320 for LacI) gives the total number of positions experimentally tested in the study.

e

Inducer “IPTG” is isopropyl β-d-1-thiogalactopyranoside.

f

The AlloRep database (54, 139) collates the published substitution outcomes known for LacI. For both LacI and its engineered paralogs, in vivo repression and induction assays have been extensively benchmarked against in vitro biochemical measurements and direct measures of in vivo concentration ((4, 5, 6, 7, 8, 10, 11, 12, 49, 58, 148) and many other LacI studies in (54, 139)).

g

Apparent binding affinities and allosteric coupling were derived from analyses of kinetic assays; mathematical descriptions for these parameters are in (59, 101, 102, 149).

h

Substrate “PEP” is phosphoenolpyruvate; allosteric activator “FBP” is fructose-1,6-bisphosphate; allosteric inhibitor “Ala” is free alanine.

i

The PYK-SubstitutionOME database (44) collates the published substitution outcomes known for many pyruvate kinase homologs.

Several surprising outcomes were noted in these studies. First, despite their widespread occurrence in many protein types, rheostat positions for function have not yet been found in some globular proteins (34, 45); this is discussed further below. A second surprise was the identification of rheostat positions in an intrinsically disordered (ID) region that comprises the transcriptional activation domain of the eukaryotic transcription factor CIITA (35). Originally, the CIITA ID region was anticipated either (i) to contain neutral positions, as was the case for the intrinsically disordered activation domains of p53 (22) and PPARγ (46), or (ii) to be affected only by substitutions that altered the balance of acidic and hydrophobic residues, as has been proposed for other ID activation domains (26, 47). Instead, the CIITA study shows that, as for globular proteins, functional sensitivity to single substitutions and the presence/absence of rheostat positions cannot be generalized among ID regions. A 2023 survey of the MAVE database (48) shows that ID regions are seldom targeted with SSM or semiSM experimental designs, although those designs are used to interrogate the globular domains present in the same proteins. Thus, more SSM/semiSM studies are needed to understand the sequence/function relationships of ID regions.

Substitutions at rheostat positions have a biological impact

The existence of rheostat positions is interesting from a purely academic perspective, but their significance hinges on whether changes at these positions have measurable biological impacts. For known rheostat positions, there is both direct and indirect evidence of biological importance: In the LacI/GalR homologs, any measurable change in repression (≥2-fold) was sufficient to alter bacterial growth (49). In human NTCP, the site of a medically-relevant polymorphism was a rheostat position; furthermore, the functional difference between the two naturally occurring variants was substrate-dependent (50). In hLPYK, the locations of some rheostat positions coincide with known disease-causing mutations (30). In the database of clinically relevant mutations in von Willebrand factor, different amino acid substitutions at the same position are associated with different disease severity (51), consistent with the features of rheostat positions; likewise, biophysical assessments of von Willebrand factor show that some pathogenic variants cause partial loss of function whereas others lead to gain of function (52).

These studies suggest that non-catastrophic effects on protein function can have biological consequences. Different levels of functional change might lead to a range of disease severities, or disease might result when variant function crosses a biologically critical threshold (e.g., (23)). Note that the disease threshold will differ for each protein, and even for a single protein under different environments or epistatic conditions (i.e., other proteins in the organism have amino acid changes) (reviewed in (53)). Furthermore, in some proteins, the most deleterious substitutions may cause fetal death, so that only variants in the intermediate range of reduced function are seen as disease-causing. Nonetheless, any correlation of disease with mutations at rheostat positions is strong evidence for biological impact. Thus, as noted by Miller et al. (1), one goal of computational predictions should be to predict intermediate functional outcomes rather than imposing binary assignments such as “benign/detrimental”. Indeed, such intermediate functional outcomes might be one explanation for the 11% of “uncertain” human missense mutations assigned intermediate pathogenicity scores by the recently published predictor “AlphaMissense” (this study only reported confidence in substitutions predicted to be either “likely benign” or “likely pathogenic” (19)).

It is also interesting to think about the biological consequences of substitutions that lead to gain-of-function. As noted above, such gain-of-function outcomes were observed for ∼10% of substitutions in semiSM studies (e.g., (11, 33, 35, 50, 54, 55)). This suggests that the wild-type proteins have not evolved to their maximum possible function. One reason for this could be that “enhanced” function is biologically detrimental (e.g., (56)). Another reason could be that the protein evolved to be “good enough”; there was no pressure to evolve functional perfection. Yet a third reason is that enhancing one aspect of function may simultaneously alter another; the related functions must simultaneously evolve to be “good enough.” Future studies of gain-of-function substitutions will advance the engineering of protein reagents for biotechnology.

Rheostat positions can be prevalent within individual proteins

Demonstrating the importance of rheostat positions also requires assessing their prevalence within individual proteins. In LacI, retrospective analyses of the Miller lab’s semiSM studies (57, 58) suggest that at least 40% of this protein’s positions are rheostat positions ((33); Fig. 3). In hLPYK, results from a whole-protein alanine scan (59) indicate that >30% of this protein’s positions are rheostat positions ((44); Fig. 3). In the region of human ACE2 that binds the SARS-CoV-2 spike protein, 26% of the 117 positions assessed were rheostat positions for spike binding ((30); data from (29) were later published in (40)). With these high numbers in three distinct proteins, it is safe to conclude that rheostat positions can be prevalent within individual proteins!

Figure 3.

Prevalence of position classes in representative proteins. The fraction of toggle positions (“T”) is shown with magenta; the fraction of rheostat positions (“R”) is shown with light gray; and the fraction of neutral positions (“N”) is shown with green. The fraction of positions for which no assignment could be made is shown in dark gray (unclassified, “U”); at these positions, data were either too sparse or their RheoScale scores did not meet any threshold (most often, such positions were non-neutral “moderate” rheostat positions (33)). For LacI, the numbers of rheostat, toggle, and neutral positions (33) were estimated from low-resolution in vivo repression assays (57, 58); as detailed in (33), the number of neutral positions is likely an over-estimate and the number of rheostat positions is likely an underestimate. For hLPYK, numbers were estimated from a whole-protein alanine scan (44, 59); the number of rheostat positions (based on the number of alanines with either gain- or partial-loss-of-function) is a low estimate; positions counted in the neutral and toggle categories are likely high over-estimates (lighter colors). For ACE2, variants were generated by SSM of the ACE2 spike binding interface, and functional outcomes were assessed via binding of surface-expressed ACE2 to soluble, GFP-tagged SARS-CoV-2 spike protein and measured with flow cytometry (29, 40).

Nonetheless, rheostat positions have not been identified in some proteins. No rheostat positions were observed in ZmPYK when substrate affinity was assessed at 26 positions (34). Instead, the variants at most positions could be categorized as either “near wild-type” or “dead.” At first glance, many of these ZmPYK positions appeared to exhibit classic toggle behavior; however, they failed to meet two criteria: (i) Only about half of the variants for each position lacked activity, in contrast to toggle positions, for which most variants lack activity. (ii) Amino acid substitutions that retained near wild-type function showed few chemical similarities. The ZmPYK data were also striking in comparison to its homolog hLPYK, which is estimated to have >30% rheostat positions (Fig. 3). A second example of paltry substitution effects on a functional parameter was found in studies of the alkaline phosphatase PafA. In a comprehensive valine/glycine/alanine scan of this protein’s enzymatic function, only 17 of its >520 substituted positions exhibited non-catastrophic, non-neutral effects on the Michaelis constant Km (45).

Together, these results suggest that some proteins have functions that are not substantially impacted by single amino acid substitutions. The evolution of such proteins must instead be accomplished by combinations of substitutions at multiple positions, which can produce change via (i) additive combinations of small effects (e.g., 2- to 3-fold) (60, 61, 62) and/or (ii) larger, non-additive effects arising from epistasis among substitutions with small/neutral effects (63, 64, 65, 66, 67). By extrapolation, these proteins may lack rheostat positions for any of their functional parameters. This difference is not related to either (i) the assayed function – both hLPYK (with many rheostat positions) and ZmPYK (with no identified rheostat positions) catalyze the same enzymatic reaction (reviewed in (44)), or (ii) overall protein fold – hLPYK (e.g., (68)) and ZmPYK (69) have similar monomeric structures. In the future, it will be important to determine which biophysical properties allow the presence of rheostat positions in some proteins and not in others.
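As a rough worked illustration of option (i), the Python sketch below counts how many small-effect substitutions would be needed to reach a given overall change. It assumes that "additive" means additive in free energy (so that fold-changes multiply) and that every substitution contributes the same fold-change; both are idealizations. The function name and the 100-fold target are hypothetical choices made for illustration and are not taken from the cited studies.

```python
def substitutions_needed(per_substitution_fold: float, target_fold: float) -> int:
    """Smallest n such that per_substitution_fold ** n >= target_fold.

    Assumption: effects are additive in free energy, i.e. multiplicative
    in fold-change, and identical at every substituted position.
    """
    if per_substitution_fold <= 1.0 or target_fold < 1.0:
        raise ValueError("expects per_substitution_fold > 1 and target_fold >= 1")
    n = 0
    total = 1.0
    while total < target_fold:
        total *= per_substitution_fold
        n += 1
    return n


# Hypothetical target: a 100-fold overall change built from 2- or 3-fold steps.
for fold in (2.0, 3.0):
    print(f"{fold:.0f}-fold steps:", substitutions_needed(fold, 100.0), "substitutions")
```

Under these assumptions, several substitutions must accumulate before the overall change approaches the range that a single rheostat position can sample in other proteins.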

Characterization of rheostat positions reveals complex relationships

Both rheostat and neutral positions can be “nonconserved”

Another historical reason why rheostat positions were overlooked is that many have assumed that nonconserved positions are neutral and did not include them in experiments. Thus, it is important to differentiate these two types of nonconserved positions and to directly test the association between nonconservation and functional neutrality. To that end, one avenue of study has been to identify whether any evolutionary features found in MSAs can discriminate neutral and rheostat positions.

When analyzing MSAs, one important insight is recognizing that “nonconservation” can be defined many ways. First, “nonconserved” is contextual. One context is provided by the sequence identity threshold used to define a protein family (53). For example, rheostat positions in the LacI/GalR family are nonconserved in the super-family and often conserved within subfamilies (9, 11). A second context is provided by the taxonomic distribution of the protein family. For example, two protein families with correlations between evolutionary features and position types—LacI/GalR and pyruvate kinase (PYK)—are taxonomically diverse (9, 34, 70). LacI/GalR homologs are present in most bacteria; LacI/GalR paralogs bind different DNA and allosteric ligands (9, 71). PYK homologs are present in all domains of life; isozymes catalyze the same chemical reaction with varied allosteric regulators (34, 70, 72, 73, 74). In contrast, the CIITA family is limited to vertebrates (75, 76), and no correlation was observed between evolutionary features and rheostat positions in human CIITA (35). Even though only a few positions were tested in CIITA, it seems reasonable to infer that MSA analyses are of limited utility for taxonomically restricted proteins.

Second, within an MSA, several types of “nonconservation” can occur: Changes at some pairs (or larger clusters) of positions are correlated, suggesting that they co-evolve (e.g., (77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92)). At other positions, amino acid changes correlate with the branch points of the MSA’s phylogenetic tree (e.g., (93, 94, 95, 96)). When compared to experiments, all of these patterns can identify “important” protein positions (e.g., (9, 77, 79)), and all patterns appear to reflect various biological or biochemical pressures that occurred during protein evolution. Thus, Martin et al. (31) hypothesized that positions with the least evidence of a changing pattern (i.e., most random evolutionary change quantified by a “least-patterned” score) would be likely candidates for neutral positions.

When the least-patterned hypothesis was tested experimentally in hLPYK, results were encouraging (31). Although it is impossible to prove a negative (as extensively discussed (31)), several of the substituted positions showed zero change in the functions monitored, and others showed only modest deviations from wild-type. A second set of neutral positions was gleaned from a high-throughput/low-resolution study of LacI (33, 57, 58). When the least-patterned scores of (31) were calculated for LacI, a subset of the neutral positions was highly discriminated from rheostat and toggle positions (Fig. 4, A and B). Since neutral positions provide a much-needed control for understanding the biophysical underpinnings of rheostat positions, future studies should be geared towards identifying neutral positions in more types of proteins.

Figure 4.


Evolutionary signatures of neutral (N), rheostat (R), and toggle (T) positions in LacI. For each position in LacI, an overall substitution outcome was determined using a semiSM set of variants assessed with in vivo functional assays (33, 58); for the resulting groups of N, R, and T positions, scores derived from various MSA analyses were compared. A, the “least patterned” score reported in (31) distinguishes neutral positions from rheostat/toggle positions; the dashed horizontal line illustrates a threshold that perfectly discriminates a subset of the neutral positions. B, the receiver–operator curve (ROC) for the true positive (sensitivity) and false positive (1 − specificity) outcomes that arise from using all possible thresholds of the least-patterned score to discriminate the set of neutral positions versus the set of [Rheostat + Toggle] positions; the area under the curve (AUC) is 0.8669 with a 95% confidence interval of 0.8128 to 0.9210; the curve for a perfect predictor would be a step function with AUC 1.0; the dashed line represents a useless predictor with AUC 0.5. C, distributions of LacI ConSurf raw scores for the three position classes (33). ConSurf was also successful at discriminating neutral and non-neutral [Rheostat + Toggle] positions, with the AUC of a ROC curve of 0.8369 (see Supplemental Fig. S8 in (33)). The dashed lines show the statistically-determined thresholds for separating these three categories (33). Given the substantial additional effort required to calculate the “least patterned” score of (31), ConSurf is a practical choice. For (A) and (C), ANOVA was carried out using Tukey’s multiple comparisons test of the means; ∗∗∗∗p < 0.0001; ∗∗∗p = 0.0003; “ns”, not significant.
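The threshold sweep described for panel B of Figure 4 can be sketched in a few lines of Python. This is a minimal illustration of how a ROC curve and its AUC are obtained from per-position scores, not the analysis code used in (31) or (33). The scores and labels are made up, and the sketch assumes that a higher score means "more likely neutral", which may not match the direction of the published least-patterned score.

```python
def roc_curve(scores, labels):
    """Return ordered (false_positive_rate, true_positive_rate) points.

    One point is produced per distinct threshold; a position is called
    "positive" (here: predicted neutral) when its score >= threshold.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Trapezoidal area under a ROC curve given as ordered (fpr, tpr) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up example: True = experimentally neutral, False = rheostat or toggle.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, False, True, False, True, False, False]
print("AUC:", auc(roc_curve(scores, labels)))
```

An AUC of 1.0 corresponds to the step function of a perfect predictor and 0.5 to the diagonal of a useless one, matching the reference lines described in the caption.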

Evolutionary signatures have some utility for discriminating rheostat, neutral, and toggle positions

Given the success of combining MSA analyses to identify a subset of neutral positions, the question was raised whether a similar strategy would effectively discriminate the three classes of rheostat, toggle, and neutral positions. To that end, the LacI experimental dataset was compared against results from 12 independent MSA analyses, as well as combinations of analyses. Each analysis suggested differences among the three classes, but the three distributions of scores showed substantial overlap (33). Of these, the phylogenetic analysis ConSurf best distinguished the three classes ((33); Fig. 4C).

ConSurf uses a tripartite function to quantify conservation at each position (MSA column); this information is then combined with information from the protein’s phylogenetic tree to generate a score (94) that can be used to discriminate positions with whole-family conservation, subfamily conservation, and nonconservation (e.g., (9, 33)). The set of positions that show subfamily-specific conservation are enriched for rheostat positions (33). Conveniently, ConSurf is available as a recently updated web server and as a stand-alone pipeline ((94); https://consurf.tau.ac.il). Notably, ConSurf separated LacI rheostat and toggle positions much better than co-evolutionary methods, and a combinatorial approach showed no improvement. When phylogenetic scores were used to predict the locations of rheostat positions in hLPYK, experiments showed that several of these positions were highly rheostatic for multiple functional parameters; none of the predicted positions were neutral (33).

Despite this success, the distributions of the scores for neutral, rheostat, and toggle positions show significant overlap. One reason for this overlap could be that the following assumptions inherent to MSA analyses are not always appropriate:

Assumption 1

Every MSA analysis implicitly assumes that the role of each position (MSA column) is the same for all homologs in the family. This need not be so. For example, when subfamilies of the LacI/GalR family were separately analyzed, some conserved and co-evolving positions were in different locations on their common structural architecture, suggesting that these positions play roles unique to each subfamily (79). Furthermore, when analogous positions were experimentally assessed in ten engineered LacI/GalR paralogs, their substitution classes were not always the same. For example, several positions showed rheostatic substitution behavior in 80% of the homologs tested but exhibited different behaviors (i.e., toggle or neutral) in the remaining homologs (11, 17). It would not be surprising to find that positional contributions to function also vary among natural paralogs.

Assumption 2

A second assumption that may not always be true – and therefore may contribute to the observed overlap in Figure 4 – is implicit in MSA analyses: predicting substitution classes from MSAs assumes that all homologs in the family actually have rheostat positions and that the protein function can be tuned by single substitutions. As described above, available data for ZmPYK suggests that this is not true for all PYK homologs (34).

Assumption 3

MSA analyses detect amino acid conservation or patterns of change that are assumed to arise from evolutionary pressures on the protein. However, every protein experiences multiple pressures to maintain various structural and functional features (described in (34, 97)). Since the various pressures all contribute to the patterns of change observed in MSAs, this likely confounds assigning MSA signatures to any one functional parameter or substitution behavior (e.g., functional rheostat positions).

Since MSAs are central inputs to essentially all algorithms that predict substitution outcomes, these three assumptions must be kept in mind: their associated limitations may restrict the amount and types of information that can be extracted from sequence alignments. When these assumptions are not true, one result might be the overlap exemplified in Figure 4. Nonetheless, as detailed above, MSA-based predictions of rheostat and neutral positions can be useful for guiding experiments and selecting subsets of positions that are enriched for the desired behavior.

Multiple aspects of function can be simultaneously altered by substitutions at rheostat positions

Since patterns of change observed in MSAs can reflect multiple evolutionary pressures, one might expect substitutions at rheostat positions to simultaneously affect multiple functional parameters. Indeed, using high-resolution biochemical assays, many examples of this behavior have been found.

For example, substitutions at one LacI rheostat position simultaneously modulated DNA binding, effector binding, and the magnitude of allosteric response (4). Even in low-resolution in vivo assays, at least 27 LacI rheostat positions had simultaneous effects on repression and induction (98). In a second example, a rheostat position in the bile acid/statin transport protein NTCP showed ligand-dependent effects on each substitution’s outcome (50); this indicates that the substitutions altered substrate specificity (i.e., which substrate is the most preferred ligand; further discussed in (12)). Furthermore, changes were observed in both kinetic parameters (Km, Vmax), with uncorrelated outcomes (50, 55). In a third example, experiments for hLPYK were designed to simultaneously assess three apparent binding affinities and two allosteric coupling constants (Table 1). Substitutions at hLPYK rheostat positions often affected more than one parameter, with at least one position affecting all five; other hLPYK positions showed rheostat outcomes for some parameters and toggle outcomes for others (17, 33, 36, 99, 100).

For parameters related to allosteric regulation, simultaneous effects are expected: Allosteric coupling is mathematically defined as a non-unity ratio of substrate binding in the absence of effector versus substrate binding in the presence of saturating concentrations of the effector (e.g., (59, 101, 102, 103)). As such, simultaneous effects on binding and allosteric coupling arise from the same functional change (e.g., Fig. 5A). However, deviations from perfect correlation (e.g., Fig. 5B) and uncorrelated effects on other parameters (e.g., Fig. 5C) have also been observed. Since the functions in this hLPYK example involve three separate binding sites, each separated from the others by ≥20 Å, the combined results suggest that substitutions at rheostat positions have long-range effects that are mediated by changes in the protein structure.
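To make the ratio definition concrete, the following Python sketch computes a coupling constant Q from two apparent binding constants and shows why a change in one binding constant necessarily appears in Q as well. It assumes dissociation-type constants (a lower value means tighter binding), so that Q > 1 indicates activation and Q < 1 indicates inhibition; this orientation, the function names, and all numerical values are illustrative assumptions rather than data from the cited studies.

```python
import math


def coupling_constant(k_without_effector: float, k_with_saturating_effector: float) -> float:
    """Q = K(substrate, no effector) / K(substrate, saturating effector).

    Both constants must be in the same units; Q itself is unitless.
    """
    if k_without_effector <= 0 or k_with_saturating_effector <= 0:
        raise ValueError("binding constants must be positive")
    return k_without_effector / k_with_saturating_effector


def describe(q: float, log_tolerance: float = 0.05) -> str:
    """Label the coupling; Q within the tolerance of 1 counts as uncoupled."""
    if abs(math.log10(q)) < log_tolerance:
        return "no coupling"
    return "activation" if q > 1 else "inhibition"


# Hypothetical wild-type values (arbitrary concentration units).
wt_k_free = 1.0        # substrate binding, no effector
wt_k_activator = 0.1   # substrate binding, saturating activator
wt_k_inhibitor = 5.0   # substrate binding, saturating inhibitor

q_act = coupling_constant(wt_k_free, wt_k_activator)
q_inh = coupling_constant(wt_k_free, wt_k_inhibitor)
print("activator:", q_act, describe(q_act))
print("inhibitor:", q_inh, describe(q_inh))

# A hypothetical variant that only weakens effector-free substrate binding
# 2-fold changes both the binding constant and Q, from one underlying change.
variant_q = coupling_constant(2.0 * wt_k_free, wt_k_activator)
print("variant Q / wild-type Q:", variant_q / q_act)
```

Because Q is built from the same binding constants that are reported separately, correlated changes in binding and coupling are expected; deviations from that correlation are the informative cases.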

Figure 5.


Correlations of measured parameters for substitutions at two hLPYK rheostat positions. The enzymatic reaction and allosteric regulation of this enzyme are described in Table 1. Functional parameters measured for each of the variants at hLPYK (A) position 192 and (B and C) position 56 are shown with dots; error bars are errors of fit and may be smaller than the symbol size. Data and analyses were taken from (33, 36, 100). For position 56, in addition to substitutions shown on the plots, S56C had no measurable activity; S56N and S56Q showed no binding for the allosteric activator FBP; S56G showed so little binding for FBP that QFBP could not be measured. The allosteric coupling constants (Q) are derived from a ratio of binding affinities and are unitless. This figure also provides an example in which physico-chemically similar amino acid side chains exhibit different outcomes (e.g., at position 56, S and T differ for QAla and QFBP, and N and Q have disparate KAla).

Structural features that might enable functional rheostat positions

Functional rheostat positions have no obvious relationship to binding sites

To understand the mechanisms that give rise to the noncanonical substitution outcomes at rheostat positions, an obvious line of inquiry is to determine their locations on protein structures. A simple explanation could be that a position’s substitution behavior is related to its proximity to ligand binding/active sites: Toggle positions might directly contact the ligand; rheostat positions might occur in the “second shell” from the ligand; functionally neutral positions might be located far from the ligand binding site, including on the protein’s surface. However, to date, no such structural pattern has been observed for the locations of functional rheostat positions. The side chains of some rheostat positions directly contact ligands; many others do not. Visual inspection shows that the locations of rheostat positions are scattered all over the structures of LacI (e.g., Fig. 6) and hLPYK (33). In contrast, many ZmPYK positions close to its active site were not rheostat positions (34).

Figure 6.


Locations of rheostat positions on the LacI homodimer. For all positions in LacI, the Miller lab used semiSM and low-resolution assays (57, 58) to assess each variant for effects on (i) transcription repression—which is mediated by DNA (black ladder) binding and (ii) transcription induction—which is mediated by binding an allosteric inhibitor such as IPTG that diminishes DNA binding (binding sites denoted with ligand in gray spheres). In vivo data have been extensively benchmarked against biochemical analyses of purified proteins (summarized in (54)) and were analyzed to identify rheostat positions (33). Rheostat positions that alter either repression (magenta) or induction (green) are shown on pdb 1EFA (104). Positions that were rheostatic for both functional outcomes are in blue; such “multiplex” rheostat positions were also identified in pyruvate kinase (17, 36). Notably, the side chains for many of the LacI rheostat positions are solvent-exposed. Forty percent of LacI positions had rheostatic behaviors, which exceeds the numbers of either toggle or neutral positions (33). This illustration was created in UCSF Chimera (137).

Thus, the substitution effects of many rheostat positions must propagate “long range” through the structure to some region of the protein that is energetically coupled to a binding/active site. In other words, altered ligand binding can arise even if there are no observed changes in the binding site when the ligand is absent. Future studies should identify the biophysical properties that allow such propagation in some proteins and not others. Such studies could provide key insights into which types of proteins can contain rheostat positions.

Functional rheostat positions do not cause substantial conformational change

Another question about substitutions at rheostat positions involves assessing what changes occur in the proteins’ folded conformations. For a protein’s overall secondary/tertiary/quaternary structure, various studies have provided both indirect and direct evidence that suggests the answer is “Very little.”

The indirect evidence is that most substitutions at rheostat positions have measurable functions. This would not be possible if the substitution substantially distorted the protein’s overall structure. Indeed, all substitutions at rheostat positions in LacI and engineered LacI/GalR paralogs were capable of binding DNA, even if their binding affinity was too weak to repress transcription (11). Likewise, most substitutions at hLPYK were able to catalyze the native reaction and bind the same allosteric effectors (44); this would not be possible if the conformation was substantially changed. Even glycine and proline substitutions at rheostat positions, which are expected to cause drastic structural changes (albeit from historical studies of toggle positions), allowed measurable function in several model proteins (4, 33). In several instances, proline was not even the worst-performing variant.

Comparison of paralog structures also suggests no global conformational changes. For example, the whole protein Cα RMSD for LacI (PDB 1EFA; (104)) and Escherichia coli PurR (PDB 1WET; (105)) is <1.5 Å; nonetheless, the intra-protein interactions of the nonconserved rheostat positions show substantial local re-packing, with different amino acids at analogous positions contacting different partners (3). This type of side-chain re-packing (i.e., plasticity) was also computationally predicted for NTCP (55). When substitutions were modeled for the highly buried side chain at rheostat position 271, only localized re-packing was observed; essentially no changes were predicted for the overall conformation in either the open or the closed states. Another 12 NTCP positions were predicted to have similar structural plasticities. Finally, crystal structures of substitutions at a rheostat position in human aldolase A showed essentially no change beyond the immediate region of the substituted side chain (106). Thus, the protein regions that accommodate rheostat positions and their substitutions appear to show structural plasticity.

However, none of the studies described above rule out altering the distributions of alternative conformations on each protein’s conformational landscape. Individual substitutions—even at the same position—could have unique effects on this landscape. Indeed, a conclusion from structural studies of human aldolase A is that substituting a rheostat position shifts its conformational ensemble, with some of the alternative conformations being inactive (106): Four variants at a single rheostat position crystallized in the same space group and their structures had the same fold; nevertheless, the different resolutions of the four variants correlated with a changing functional parameter. A fifth variant at the same position crystallized but did not diffract, had unchanged secondary structure by circular dichroism (suggesting that its overall fold was intact), yet lacked detectable enzymatic activity. Thus, the authors proposed that substitutions at this functional rheostat position shifted the aldolase conformation (106), similar to the shifts observed with fluorescence spectroscopy for substitutions that altered its substrate specificity (107). A second example of a protein for which enzymatically inactive substitutions arise from shifts in the conformational ensemble is the alkaline phosphatase PafA (45).

Substitutions at functional rheostat positions might cause functionally important changes in dynamics

Another possible mechanism by which substitutions at rheostat positions could exert their long-range, through-structure effects on protein function is by altering protein dynamics. Support for this hypothesis can be found in the dynamics computations for variants at a LacI rheostat position (37). In this study, the protein conformation was subjected to all-atom molecular dynamics simulations and then treated like a 3D network of coupled balls (atoms) and springs (bonds). In this model, when one atom is perturbed—by Brownian motion or ligand binding—vibrations propagate through the anisotropic network of the 3D structure (e.g., Fig. 7); some of these vibrations reach the atoms in the binding site. When an amino acid is substituted with another side chain, the resulting alterations effectively “re-wire” that local section of the network. Although the resulting conformational changes look small and localized, effects on vibrational propagations can be large. Intriguingly, for substitutions at a LacI rheostat position, changes in the flexibilities of the DNA binding contacts correlated with changes in the measured DNA binding affinities (37).

Figure 7.


Protein structures are anisotropic networks. In an anisotropic network, the propagation of a perturbation from node 1 to node 2 differs from the propagation from node 2 to node 1 (e.g., (37)). One consequence of anisotropy is that coupled positions can have unique substitution outcomes. In the example illustrated here, a perturbation (upper black arrow) at position 94 (green space filling) propagates through contacting residues (magenta ball and sticks) to the rest of the protein structure (cyan arrows), including to position 220 (large shaded arrow). Because position 220 has different contact partners than position 94, the same size perturbation (lower black arrow) propagates differently through the structure (yellow arrows), and dynamics coupling from position 220 to position 94 differs (smaller shaded arrow). This illustration uses the LacI pdb 1EFA (104) and was created with ChimeraX (138).

As such, the LacI dynamics study suggests a potential approach for predicting substitutions with partially deleterious or enhancing outcomes on function. In support of this approach, other computational studies have correlated such dynamic changes with pathogenic missense substitutions (108) and leveraged this information to design enhancing substitutions (109). Amino acid-dependent, long-range coupling has also been observed experimentally via NMR (110, 111). In future studies, such dynamic correlations should be tested in other proteins and for a range of ligand types.

Substitutions at functional rheostat positions have unknown effects on stability

To date, most studies of rheostat positions have focused on changes in protein function, using assays that were largely insensitive to changes in protein stability (the equilibrium between folded and unfolded protein, measured by the free energy of unfolding, ΔGu). However, given the large numbers of amino acid substitutions that are known to alter ΔGu (e.g., (21, 112, 113, 114, 115, 116, 117)), the relationship between function and stability should also be considered for substitutions at rheostat positions.

One possibility is that ΔGu must substantially favor the folded state to observe the functional modulation associated with destabilizing substitutions. In support of this hypothesis, several studies have concluded that mutations that increase protein stability enable natural and directed evolution of function variation (e.g., (118, 119, 120, 121)) and that ancestral proteins (prior to the evolution of functional variation) might have had higher stability (122). High stability is known for at least one of the model proteins with a high prevalence of functional rheostat positions: tetrameric LacI. With a ΔGu of −60 kcal/mol for the wild-type protein (123), most single mutations in LacI probably do not alter stability sufficiently to cause a biochemically or biologically meaningful effect.

Related to this idea, since the stabilities of smaller proteins have been shown to be more sensitive to perturbations than larger proteins (112, 124), the presence (or prevalence) of functional rheostat positions may depend upon protein size: In smaller proteins, substitutions that rheostatically alter function might have a greater likelihood of producing a concomitant, catastrophic effect on stability. In larger proteins, substitutions that alter function might have a greater chance that they can be accommodated without causing catastrophic changes in ΔGu.
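The stability argument in the two preceding paragraphs can be illustrated with a simple two-state calculation. The Python sketch below is a deliberately idealized model: it treats folding as a unimolecular two-state equilibrium (which ignores the concentration dependence of an oligomer such as tetrameric LacI) and takes `stability_kcal` as the magnitude of the free energy favoring the folded state, so that larger positive values mean a more stable protein. The sign convention, the function name, and the example values other than the 60 kcal/mol magnitude quoted above are assumptions made for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def fraction_folded(stability_kcal: float, temperature_k: float = 298.15) -> float:
    """Fraction of molecules folded in a two-state model.

    stability_kcal > 0 favors the folded state (assumed sign convention).
    Written in a numerically stable form to avoid overflow in exp().
    """
    x = stability_kcal / (R_KCAL * temperature_k)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))


# Hypothetical destabilizing substitution that costs 4 kcal/mol.
penalty = 4.0
for label, stability in (("very stable protein", 60.0), ("marginally stable protein", 5.0)):
    before = fraction_folded(stability)
    after = fraction_folded(stability - penalty)
    print(f"{label}: {before:.6f} folded before, {after:.6f} after")
```

In this sketch the same energetic penalty is invisible for the very stable protein but unfolds a noticeable fraction of the marginally stable one, which is the intuition behind the proposed dependence on protein stability and size.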

Finally, we note that the folded/unfolded equilibrium that defines ΔGu has a different relationship to function in globular proteins than in intrinsically disordered (ID) proteins: Globular proteins must be folded to function. In contrast, some ID proteins may never adopt a regular structure; “function” is a property of the ensemble of unfolded conformations rather than of folded conformations. In other ID proteins, ligand binding and protein folding are tightly coupled (125), making it extremely difficult to experimentally dissect the two parameters. For example, in the ID transactivation domain of CIITA, substitutions at two rheostat positions show a correlation between transcriptional activation and intrinsic structural propensities, consistent with substitution effects on coupled folding and binding (35).

“Stability rheostat positions” modulate ΔGu

Analogous to functional rheostat positions, “stability rheostat positions” were hypothesized in 2016 (53). At such positions, effects on stability would vary in an amino acid-dependent manner, and values of ΔGu would span a wide range. Subsequently, Matthews et al. carried out deep mutational scanning studies on a region in a TIM barrel known to alter stability in three indole-3-glycerol phosphate synthase isozymes (42). Several of these positions had rheostatic outcomes (17), although effects on stability and function were likely to be intertwined in this assay. More convincingly, Mayo et al. combined SSM with a high-throughput, quantitative approach for measuring ΔGu in the B1 domain of Streptococcal Protein G (GB1; (115)). Forty percent of GB1’s positions had rheostatic substitution outcomes on stability (Fig. 8); substitutions at the strongest rheostat position sampled nearly the full range of observed stabilities.

Figure 8.


SSM in GB1 leads to a wide range of stabilities and reveals numerous rheostat positions. Top, data from (115) were used to create histograms of (dark gray) stability outcomes for all 933 GB1 variants (including both quantitatively measured and “dead” variants, which were assigned a ΔΔGu of −4) and (light gray) values for 18 amino acid substitutions at position 54. The bins corresponding to WT (green; 0.14) and dead (magenta; −3.41) are indicated with circles. Bottom, RheoScale analyses (17) were carried out using histograms for each set of variants at all GB1 positions; calculations of rheostat and toggle scores used ten bins; calculations for neutral scores used five bins to encompass error in the wild-type measurements (as described in (31)). In addition to rheostat (R), toggle (T) and neutral (N) substitution outcomes, a few positions show “moderate” effects (M) on stability; that is, most substitutions showed significant differences from wild-type but did not meet the threshold of sampling at least half of the stability range. Substitution outcomes for a few positions could not be classified (U).
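The histogram-based classification described in this caption can be approximated with a short Python sketch. This is a simplified stand-in and not the published RheoScale calculator (17): here the rheostat score is the fraction of bins occupied, the toggle score is the fraction of variants that fall in the same bin as the "dead" value, and the neutral score is the fraction that fall in the same (coarser) bin as wild-type. The 0.5 rheostat cutoff follows the "at least half of the range" definition; the toggle and neutral cutoffs, the function names, and all data values are hypothetical.

```python
def bin_index(value, low, high, n_bins):
    """Index of the equal-width bin on [low, high] that contains value."""
    width = (high - low) / n_bins
    idx = int((value - low) / width)
    return min(max(idx, 0), n_bins - 1)  # clamp values at or beyond the edges


def classify_position(values, wild_type, dead, low, high,
                      n_bins=10, n_neutral_bins=5,
                      rheostat_cut=0.5, toggle_cut=0.65, neutral_cut=0.7):
    """Return (label, scores) for one position's set of variant values."""
    if high <= low or not values:
        raise ValueError("need a non-empty value list and high > low")
    occupied = {bin_index(v, low, high, n_bins) for v in values}
    dead_bin = bin_index(dead, low, high, n_bins)
    wt_bin = bin_index(wild_type, low, high, n_neutral_bins)
    scores = {
        "rheostat": len(occupied) / n_bins,
        "toggle": sum(bin_index(v, low, high, n_bins) == dead_bin
                      for v in values) / len(values),
        "neutral": sum(bin_index(v, low, high, n_neutral_bins) == wt_bin
                       for v in values) / len(values),
    }
    if scores["neutral"] >= neutral_cut:
        label = "N"
    elif scores["toggle"] >= toggle_cut:
        label = "T"
    elif scores["rheostat"] >= rheostat_cut:
        label = "R"
    else:
        label = "U"  # this sketch does not separate "moderate" positions
    return label, scores


# Made-up stability changes for three hypothetical positions.
examples = {
    "spread":   [-3.8, -3.1, -2.4, -1.9, -1.2, -0.6, -0.1, 0.3, 0.8],
    "mostly dead": [-4.0, -4.0, -4.0, -4.0, -3.9, -4.0, 0.1, -4.0],
    "wild-type-like": [0.1, 0.2, 0.0, 0.3, -0.2, 0.15],
}
for name, vals in examples.items():
    label, _ = classify_position(vals, wild_type=0.14, dead=-4.0, low=-4.0, high=1.0)
    print(name, "->", label)
```

Using coarser bins for the neutral score mirrors the caption's rationale of tolerating measurement error around the wild-type value.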

Mayo et al. also noted that, except for proline and glycine substitutions, some positions tolerated a wide range of side chains with little effect on ΔGu, whereas other positions did not. Thus they concluded that a position’s location on the structure was more important to governing stability than the chemistry of the amino acid substitution (115). Likewise, positions associated with stability changes in the TIM barrels study (42) also showed a structural pattern, with the loops enriched for rheostat positions and one side of the beta-barrel enriched for toggle positions (17). These location-based findings reinforce the value of assessing each position’s structural (or functional) role instead of considering separate roles for each of the 20 possible amino acid side chains with which it can be substituted.

Going forward, it will be interesting to determine whether the presence of stability rheostat positions and the distance correlation observed in GB1 generalize to other proteins. Another consideration is that, with 56 amino acids, GB1 is fairly small, and single substitutions alter a large percent of its bonds (which additively contribute to its stability; described in (126, 127, 128, 129)). As such, it could be easier for different substitutions at a single position to widely sample a range of stabilities, allowing stability rheostat positions to exist. In contrast, substitutions in larger proteins—which alter a smaller percent of the bonds that contribute to stability—might be limited to small effects on protein stability, precluding the presence of stability rheostat positions. To tune the stability of larger proteins over a large range, simultaneous substitutions at multiple positions might be required. Nonetheless, even oligomeric proteins may be susceptible: a 2024 study found that substitutions at a nonconserved, interfacial position in a trimeric chloramphenicol acetyltransferase (291 aa monomer) had rheostatic effects on the midpoint for thermal denaturation (Tm), with substitutions sampling a range of nearly 30 °C (130).

Finally, future studies of globular proteins should consider whether stability and functional rheostat positions comprise separate sets of positions or are intertwined. For example, it is not yet known whether (i) the GB1 stability rheostat positions play any role in GB1 function or (ii) the functional rheostat positions identified in various model proteins (Table 1) also alter ΔGu. Interestingly, for enzymes, there is a widespread expectation of a “stability/activity tradeoff” (e.g., (131, 132, 133, 134)). However, Whitehead et al. conducted SSM for two enzymes and compared each variant’s effect on (i) its cellular function, as a proxy for protein function and (ii) the amount of folded protein recognizable by antibody in a yeast display assay, as a proxy for protein stability. These studies found a range of “stability/activity” relationships (21). As such, the function/stability relationship does not appear generalizable within or among proteins, and we do not expect this to be the case for rheostat positions.

Thus, predictions about substitution outcomes will need to account for changes in both function and stability. An early prototype of such a combined algorithm, which uses a simple biochemical equation, is described in (120), and a recent advance that uses machine learning to analyze high-throughput functional data, patterns of evolutionary change, and predicted stability changes is described in (135). Ideally, in the future, parallel stability and functional data will be generated for large numbers of model proteins to further generalize such models.
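One way to see why function and stability need to be considered together is a two-state folding sketch. This is an illustrative assumption only and is not necessarily the equation used in (120): the symbols K_u, f_N, A_obs, and A_int are introduced here for illustration, whereas ΔGu is the unfolding free energy discussed above. If only the folded (native) state N is active and it is in equilibrium with the unfolded state U, then:

```latex
\begin{align}
K_u &= \frac{[U]}{[N]} = \exp\!\left(-\frac{\Delta G_u}{RT}\right) \\
f_N &= \frac{[N]}{[N]+[U]} = \frac{1}{1+K_u} \\
A_{\mathrm{obs}} &= f_N \, A_{\mathrm{int}}
\end{align}
```

Under this assumption, the observed activity A_obs equals the intrinsic activity of the folded protein, A_int, scaled by the folded fraction f_N. When ΔGu is much larger than RT, f_N is close to 1 and changes in stability are nearly invisible in functional measurements; when ΔGu approaches zero (f_N = 0.5 at ΔGu = 0), a substitution's effect on stability can dominate its apparent effect on function.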

Conclusion

Tuneability—a new conceptual framework for understanding the protein structure-function relationship?

In hindsight, perhaps the existence of functional rheostat positions should have been anticipated from earlier studies. Their effects are almost certainly manifest in various “scanning” data sets, which have revealed that some protein functions are highly tunable by single amino acid substitutions. For example, when 427 of 543 positions were substituted in an alanine scan of hLPYK, values of Kapp-PEP fully sampled the observed >200-fold range (Fig. 9; (59)). Rheostat positions are but a special case of this substitution behavior: the same range of functional outcomes is available via substitutions at just a single position. For example, in hLPYK, the same 200-fold functional range was well-sampled by just 14 substitutions at a single rheostat position (33), and >80 hLPYK positions appear to be rheostat positions for Kapp-PEP (Fig. 9; (44)).

Figure 9.

Functional tunability. The structure on the left shows one monomer of tetrameric hLPYK. Gray side chains are used to show the locations of 427 positions that were substituted with alanine (59); green spacefilling is used to show the location of rheostat position 538, which was substituted with 13 additional amino acids (14 total variants; (33)). The enzyme active site is shown in yellow; many of these positions were also substituted with alanine. The center plot depicts a histogram of the apparent binding affinities for substrate, Kapp-PEP, (which is mediated by the active site) measured for each of these two sets of variants. Note that 14 substitutions at rheostat position 538 (light gray bars) sampled nearly the exact same range of change as the whole-protein alanine scan (dark gray bars). The structure on the right shows the >80 positions (magenta) that are predicted to have rheostatic effects on Kapp-PEP (44); other side chains are colored the same as on the left. This illustration was created in UCSF Chimera (137) from the hLPYK PDB 4IMA (68).
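The comparison in Figure 9 (how much of the observed functional range is covered by the variants at one position) can be sketched in a few lines of Python. This is a simplified, histogram-based illustration and not the RheoScale implementation (17). The function names, the choice of 10 log-spaced bins, and all Kapp-PEP values below are hypothetical; the 0.5 threshold follows the definition that substitutions at a rheostat position sample at least half of the possible functional range.

```python
import math


def fraction_of_range_sampled(values, range_min, range_max, n_bins=10):
    """Return the fraction of log-spaced bins spanning [range_min, range_max]
    that contain at least one variant value (hypothetical helper)."""
    if range_min <= 0 or range_max <= range_min:
        raise ValueError("range must be positive and increasing")
    lo, hi = math.log10(range_min), math.log10(range_max)
    width = (hi - lo) / n_bins
    occupied = set()
    for v in values:
        if v <= 0:
            raise ValueError("values must be positive to take a logarithm")
        # Clamp values that fall outside the reference range to its edges.
        x = min(max(math.log10(v), lo), hi)
        # The upper edge of the range belongs to the last bin.
        occupied.add(min(int((x - lo) / width), n_bins - 1))
    return len(occupied) / n_bins


def classify_position(values, range_min, range_max, n_bins=10, threshold=0.5):
    """Label a position 'rheostat-like' if its variants occupy at least
    `threshold` of the bins; the labels are illustrative only."""
    frac = fraction_of_range_sampled(values, range_min, range_max, n_bins)
    return "rheostat-like" if frac >= threshold else "not rheostat-like"


# Hypothetical Kapp-PEP values (arbitrary units) spanning a 200-fold range.
RANGE_MIN, RANGE_MAX = 0.05, 10.0
rheostat_like = [0.05, 0.08, 0.12, 0.2, 0.3, 0.5, 0.8,
                 1.2, 2.0, 3.0, 4.5, 6.0, 8.0, 10.0]
toggle_like = [0.05, 0.06, 9.0, 9.5, 10.0]

for name, vals in [("rheostat_like", rheostat_like), ("toggle_like", toggle_like)]:
    frac = fraction_of_range_sampled(vals, RANGE_MIN, RANGE_MAX)
    print(name, frac, classify_position(vals, RANGE_MIN, RANGE_MAX))
```

Binning on a log scale is a design choice made here because binding parameters such as Kapp-PEP are usually compared as fold-changes; a real analysis would also need to account for measurement error when deciding whether two variants differ.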

Thus, we conclude here by considering how the existence of rheostat positions might give us new insights into the general relationship between protein sequence and function. Rather than attempting to formulate separate structural mechanisms that explain how each substitution at each rheostat position modulates function, it is intriguing to consider whether a single mechanistic framework can provide an explanation for many of the changes. For example, the substitutional modulation of an anisotropic dynamic network may provide one global perspective for describing altered functional outcomes (e.g., (37, 136)).

Understanding the molecular mechanisms by which substitutions at rheostat positions give rise to a wide range of responses will be critical for advancing personalized medicine and protein engineering. Future studies to identify the locations of rheostat positions will enable experimental studies that focus on these interesting and widespread positions. Given the wide range of complex functional outcomes that arise from substitutions at rheostat positions—including their potential for enhancing function, simultaneous effects on multiple biochemical parameters, effects on ligand specificity, their independence of side chain chemical “similarities”, and their long-range through-protein effects—it is imperative to understand the biophysical roots of these intriguing protein positions.

Conflict of interest

The authors declare that they have no conflicts of interest with the contents of this article.

Acknowledgments

We thank our many fantastic collaborators who have participated in studies of rheostat positions, in particular: Drs Joe Fontes, Alexey Ladokhin, Bruno Hagenbuch and Antonio Artigues (KUMC), Audrey Lamb (University of Texas-San Antonio), John Karanicolas (Fox Chase Cancer Center), Paul E. Smith (Kansas State University), S. Banu Ozkan and Paul Campitelli (Arizona State University). We also thank current and past members of the Fenton and Swint-Kruse labs for their many discussions on these topics and contributions to published studies; in particular, Shwetha Sreenivasan, Dr Pierce O'Neil, and Anastasiia Sivchenko made comments on this manuscript; SS also contributed to the discussion about the relationship between IDR fold/function and PO contributed to discussions about the relationship between functional variation and protein stability.

Author contributions

L. S.-K. and A. W. F. conceptualization; L. S.-K. data curation; L. S.-K. and A. W. F. writing–original draft; L. S.-K. and A. W. F. writing–review & editing; L. S.-K. visualization; L. S.-K. and A. W. F. project administration; L. S.-K. and A. W. F. funding acquisition.

Funding and additional information

This work was supported by funding from the W. M. Keck Foundation (L. S.-K., A. W. F.) and the National Institutes of Health: GM118589 (L. S.-K., A. W. F.), GM147635 (L. S.-K.), GM115340 (A. W. F.) and DK78076 (A. W. F.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Reviewed by members of the JBC Editorial Board. Edited by Karen Fleming

References

  • 1. Miller M., Bromberg Y., Swint-Kruse L. Computational predictors fail to identify amino acid substitution effects at rheostat positions. Sci. Rep. 2017;7 doi: 10.1038/srep41329.
  • 2. Gray V.E., Kukurba K.R., Kumar S. Performance of computational tools in evaluating the functional impact of laboratory-induced amino acid mutations. Bioinformatics. 2012;28:2093–2096. doi: 10.1093/bioinformatics/bts336.
  • 3. Swint-Kruse L., Larson C., Pettitt B.M., Matthews K.S. Fine-tuning function: correlation of hinge domain interactions with functional distinctions between LacI and PurR. Protein Sci. 2002;11:778–794. doi: 10.1110/ps.4050102.
  • 4. Zhan H., Swint-Kruse L., Matthews K.S. Extrinsic interactions dominate helical propensity in coupled binding and folding of the lactose repressor protein hinge helix. Biochemistry. 2006;45:5896–5906. doi: 10.1021/bi052619p.
  • 5. Tungtur S., Egan S.M., Swint-Kruse L. Functional consequences of exchanging domains between LacI and PurR are mediated by the intervening linker sequence. Proteins. 2007;68:375–388. doi: 10.1002/prot.21412.
  • 6. Meinhardt S., Swint-Kruse L. Experimental identification of specificity determinants in the domain linker of a LacI/GalR protein: bioinformatics-based predictions generate true positives and false negatives. Proteins. 2008;73:941–957. doi: 10.1002/prot.22121.
  • 7. Zhan H., Taraban M., Trewhella J., Swint-Kruse L. Subdividing repressor function: DNA binding affinity, selectivity, and allostery can be altered by amino acid substitution of nonconserved residues in a LacI/GalR homologue. Biochemistry. 2008;47:8058–8069. doi: 10.1021/bi800443k.
  • 8. Tungtur S., Meinhardt S., Swint-Kruse L. Comparing the functional roles of nonconserved sequence positions in homologous transcription repressors: implications for sequence/function analyses. J. Mol. Biol. 2010;395:785–802. doi: 10.1016/j.jmb.2009.10.001.
  • 9. Tungtur S., Parente D.J., Swint-Kruse L. Functionally important positions can comprise the majority of a protein's architecture. Proteins. 2011;79:1589–1608. doi: 10.1002/prot.22985.
  • 10. Tungtur S., Skinner H., Zhan H., Swint-Kruse L., Beckett D. In vivo tests of thermodynamic models of transcription repressor function. Biophys. Chem. 2011;159:142–151. doi: 10.1016/j.bpc.2011.06.005.
  • 11. Meinhardt S., Manley M.W., Jr., Parente D.J., Swint-Kruse L. Rheostats and toggle switches for modulating protein function. PLoS One. 2013;8 doi: 10.1371/journal.pone.0083502.
  • 12. Tungtur S., Schwingen K.M., Riepe J.J., Weeramange C.J., Swint-Kruse L. Homolog comparisons further reconcile in vitro and in vivo correlations of protein activities by revealing over-looked physiological factors. Protein Sci. 2019;28:1806–1818. doi: 10.1002/pro.3695.
  • 13. Jonson P.H., Petersen S.B. A critical view on conservative mutations. Protein Eng. 2001;14:397–402. doi: 10.1093/protein/14.6.397.
  • 14. Gilbert G.E., Novakovic V.A., Kaufman R.J., Miao H., Pipe S.W. Conservative mutations in the C2 domains of factor VIII and factor V alter phospholipid binding and cofactor activity. Blood. 2012;120:1923–1932. doi: 10.1182/blood-2012-01-408245.
  • 15. Pál G., Kouadio J.-L.K., Artis D.R., Kossiakoff A.A., Sidhu S.S. Comprehensive and quantitative mapping of energy landscapes for protein-protein interactions by rapid combinatorial scanning. J. Biol. Chem. 2006;281:22378–22385. doi: 10.1074/jbc.M603826200.
  • 16. Sevim Bayrak C., Stein D., Jain A., Chaudhary K., Nadkarni G.N., Van Vleck T.T., et al. Identification of discriminative gene-level and protein-level features associated with pathogenic gain-of-function and loss-of-function variants. Am. J. Hum. Genet. 2021;108:2301–2318. doi: 10.1016/j.ajhg.2021.10.007.
  • 17. Hodges A.M., Fenton A.W., Dougherty L.L., Overholt A.C., Swint-Kruse L. RheoScale: a tool to aggregate and quantify experimentally determined substitution outcomes for multiple variants at individual protein positions. Hum. Mutat. 2018;39:1814–1826. doi: 10.1002/humu.23616.
  • 18. Matreyek K.A., Stephany J.J., Ahler E., Fowler D.M. Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Med. 2021;13:165. doi: 10.1186/s13073-021-00984-x.
  • 19. Cheng J., Novati G., Pan J., Bycroft C., Žemgulytė A., Applebaum T., et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science. 2023;381 doi: 10.1126/science.adg7492.
  • 20. Starr T.N., Greaney A.J., Stewart C.M., Walls A.C., Hannon W.W., Veesler D., et al. Deep mutational scans for ACE2 binding, RBD expression, and antibody escape in the SARS-CoV-2 Omicron BA.1 and BA.2 receptor-binding domains. PLoS Pathog. 2022;18 doi: 10.1371/journal.ppat.1010951.
  • 21. Klesmith J.R., Bacik J.-P., Wrenbeck E.E., Michalczyk R., Whitehead T.A. Trade-offs between enzyme fitness and solubility illuminated by deep mutational scanning. Proc. Natl. Acad. Sci. U. S. A. 2017;114:2265–2270. doi: 10.1073/pnas.1614437114.
  • 22. Giacomelli A.O., Yang X., Lintner R.E., McFarland J.M., Duby M., Kim J., et al. Mutational processes shape the landscape of TP53 mutations in human cancer. Nat. Genet. 2018;50:1381–1387. doi: 10.1038/s41588-018-0204-y.
  • 23. Chakraborty S., Ahler E., Simon J.J., Fang L., Potter Z.E., Sitko K.A., et al. Profiling of drug resistance in Src kinase at scale uncovers a regulatory network coupling autoinhibition and catalytic domain dynamics. Cell Chem. Biol. 2023 doi: 10.1016/j.chembiol.2023.08.005.
  • 24. Hobbs H.T., Shah N.H., Shoemaker S.R., Amacher J.F., Marqusee S., Kuriyan J. Saturation mutagenesis of a predicted ancestral Syk-family kinase. Protein Sci. 2022;31 doi: 10.1002/pro.4411.
  • 25. Mathy C.J.P., Mishra P., Flynn J.M., Perica T., Mavor D., Bolon D.N.A., et al. A complete allosteric map of a GTPase switch in its native cellular network. Cell Syst. 2023;14:237–246.e7. doi: 10.1016/j.cels.2023.01.003.
  • 26. Staller M.V., Ramirez E., Kotha S.R., Holehouse A.S., Pappu R.V., Cohen B.A. Directed mutational scanning reveals a balance between acidic and hydrophobic residues in strong human activation domains. Cell Syst. 2022;13:334–345.e5. doi: 10.1016/j.cels.2022.01.002.
  • 27. Leander M., Liu Z., Cui Q., Raman S. Deep mutational scanning and machine learning reveal structural and molecular rules governing allosteric hotspots in homologous proteins. Elife. 2022;11 doi: 10.7554/eLife.79932.
  • 28. Amorosi C.J., Chiasson M.A., McDonald M.G., Wong L.H., Sitko K.A., Boyle G., et al. Massively parallel characterization of CYP2C9 variant enzyme activity and abundance. Am. J. Hum. Genet. 2021;108:1735–1751. doi: 10.1016/j.ajhg.2021.07.001.
  • 29. Procko E. The sequence of human ACE2 is suboptimal for binding the S spike protein of SARS coronavirus 2. bioRxiv. 2020 doi: 10.1101/2020.03.16.994236. [preprint]
  • 30. Fenton A.W., Page B.M., Spellman-Kruse A., Hagenbuch B., Swint-Kruse L. Rheostat positions: a new classification of protein positions relevant to pharmacogenomics. Med. Chem. Res. 2020;29:1133–1146. doi: 10.1007/s00044-020-02582-9.
  • 31. Martin T.A., Wu T., Tang Q., Dougherty L.L., Parente D.J., Swint-Kruse L., et al. Identification of biochemically neutral positions in liver pyruvate kinase. Proteins. 2020;88:1340–1350. doi: 10.1002/prot.25953.
  • 32. Ruggiero M.J. Characterization of the Function and Expression of Variants at Potential Rheostat, Toggle, and Neutral Positions in the Na+/Taurocholate Cotransporting Polypeptide. University of Kansas, KU Scholarworks; Lawrence, KS: 2021.
  • 33. Swint-Kruse L., Martin T.A., Page B.M., Wu T., Gerhart P.M., Dougherty L.L., et al. Rheostat functional outcomes occur when substitutions are introduced at nonconserved positions that diverge with speciation. Protein Sci. 2021;30:1833–1853. doi: 10.1002/pro.4136.
  • 34. Page B.M., Martin T.A., Wright C.L., Fenton L.A., Villar M.T., Tang Q., et al. Odd one out? Functional tuning of Zymomonas mobilis pyruvate kinase is narrower than its allosteric, human counterpart. Protein Sci. 2022;31 doi: 10.1002/pro.4336.
  • 35. Sreenivasan S., Heffren P., Suh K.S., Rodnin M.V., Kosa E., Fenton A.W., et al. The intrinsically disordered transcriptional activation domain of CIITA is functionally tuneable by single substitutions: an exception or a new paradigm? Protein Sci. 2024;33 doi: 10.1002/pro.4863.
  • 36. Wu T., Swint-Kruse L., Fenton A.W. Functional tunability from a distance: rheostat positions influence allosteric coupling between two distant binding sites. Sci. Rep. 2019;9 doi: 10.1038/s41598-019-53464-z.
  • 37. Campitelli P., Swint-Kruse L., Ozkan S. Substitutions at non-conserved rheostat positions modulate function by re-wiring long-range, dynamic interactions. Mol. Biol. Evol. 2021;38:201–214. doi: 10.1093/molbev/msaa202.
  • 38. Fowler D.M., Fields S. Deep mutational scanning: a new style of protein science. Nat. Methods. 2014;11:801–807. doi: 10.1038/nmeth.3027.
  • 39. Roscoe B.P., Thayer K.M., Zeldovich K.B., Fushman D., Bolon D.N.A. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J. Mol. Biol. 2013;425:1363–1377. doi: 10.1016/j.jmb.2013.01.032.
  • 40. Chan K.K., Dorosky D., Sharma P., Abbasi S.A., Dye J.M., Kranz D.M., et al. Engineering human ACE2 to optimize binding to the spike protein of SARS coronavirus 2. Science. 2020;369:1261–1265. doi: 10.1126/science.abc0870.
  • 41. Hietpas R.T., Jensen J.D., Bolon D.N.A. Experimental illumination of a fitness landscape. Proc. Natl. Acad. Sci. U. S. A. 2011;108:7896–7901. doi: 10.1073/pnas.1016024108.
  • 42. Chan Y.H., Venev S.V., Zeldovich K.B., Matthews C.R. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints. Nat. Commun. 2017;8 doi: 10.1038/ncomms14614.
  • 43. Dutta S., Gulla S., Chen T.S., Fire E., Grant R.A., Keating A.E. Determinants of BH3 binding specificity for Mcl-1 versus Bcl-xL. J. Mol. Biol. 2010;398:747–762. doi: 10.1016/j.jmb.2010.03.058.
  • 44. Swint-Kruse L., Dougherty L.L., Page B., Wu T., O'Neil P.T., Prasannan C.B., et al. PYK-SubstitutionOME: an integrated database containing allosteric coupling, ligand affinity and mutational, structural, pathological, bioinformatic and computational information about pyruvate kinase isozymes. Database (Oxford). 2023;2023 doi: 10.1093/database/baad030.
  • 45. Markin C.J., Mokhtari D.A., Sunden F., Appel M.J., Akiva E., Longwell S.A., et al. Revealing enzyme functional architecture via high-throughput microfluidic enzyme kinetics. Science. 2021;373 doi: 10.1126/science.abf8761.
  • 46. Majithia A.R., Tsuda B., Agostini M., Gnanapradeepan K., Rice R., Peloso G., et al. Prospective functional classification of all possible missense variants in PPARG. Nat. Genet. 2016;48:1570–1575. doi: 10.1038/ng.3700.
  • 47. Staller M.V., Holehouse A.S., Swain-Lenz D., Das R.K., Pappu R.V., Cohen B.A. A high-throughput mutational scan of an intrinsically disordered acidic transcriptional activation domain. Cell Syst. 2018;6:444–455.e6. doi: 10.1016/j.cels.2018.01.015.
  • 48. Esposito D., Weile J., Shendure J., Starita L.M., Papenfuss A.T., Roth F.P., et al. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol. 2019;20:223. doi: 10.1186/s13059-019-1845-6.
  • 49. Meinhardt S., Manley M.W., Becker N.A., Hessman J.A., Maher L.J., Swint-Kruse L. Novel insights from hybrid LacI/GalR proteins: family-wide functional attributes and biologically significant variation in transcription repression. Nucleic Acids Res. 2012;40:11139–11154. doi: 10.1093/nar/gks806.
  • 50. Ruggiero M.J., Malhotra S., Fenton A.W., Swint-Kruse L., Karanicolas J., Hagenbuch B. A clinically-relevant polymorphism in the Na(+)/taurocholate cotransporting polypeptide (NTCP) occurs at a rheostat position. J. Biol. Chem. 2021;296 doi: 10.1074/jbc.RA120.014889.
  • 51. Saunders R.E., O'Connell N.M., Lee C.A., Perry D.J., Perkins S.J. Factor XI deficiency database: an interactive web database of mutations, phenotypes, and structural analysis tools. Hum. Mutat. 2005;26:192–198. doi: 10.1002/humu.20214.
  • 52. Auton M., Sedlák E., Marek J., Wu T., Zhu C., Cruz M.A. Changes in thermodynamic stability of von Willebrand factor differentially affect the force-dependent binding to platelet GPIbalpha. Biophys. J. 2009;97:618–627. doi: 10.1016/j.bpj.2009.05.009.
  • 53. Swint-Kruse L. Using evolution to guide protein engineering: the Devil IS in the details. Biophys. J. 2016;111:10–18. doi: 10.1016/j.bpj.2016.05.030.
  • 54. Sousa F.L., Parente D.J., Shis D.L., Hessman J.A., Chazelle A., Bennett M.R., et al. AlloRep: a repository of sequence, structural and mutagenesis data for the LacI/GalR transcription regulators. J. Mol. Biol. 2016;428:671–678. doi: 10.1016/j.jmb.2015.09.015.
  • 55. Ruggiero M.J., Malhotra S., Fenton A.W., Swint-Kruse L., Karanicolas J., Hagenbuch B. Structural plasticity is a feature of rheostat positions in the human Na(+)/Taurocholate cotransporting polypeptide (NTCP). Int. J. Mol. Sci. 2022;23:3211. doi: 10.3390/ijms23063211.
  • 56. Liu L., Okada S., Kong X.F., Kreins A.Y., Cypowyj S., Abhyankar A., et al. Gain-of-function human STAT1 mutations impair IL-17 immunity and underlie chronic mucocutaneous candidiasis. J. Exp. Med. 2011;208:1635–1648. doi: 10.1084/jem.20110958.
  • 57. Markiewicz P., Kleina L.G., Cruz C., Ehret S., Miller J.H. Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as “spacers” which do not require a specific sequence. J. Mol. Biol. 1994;240:421–433. doi: 10.1006/jmbi.1994.1458.
  • 58. Suckow J., Markiewicz P., Kleina L.G., Miller J., Kisters-Woike B., Müller-Hill B. Genetic studies of the Lac repressor. XV: 4000 single amino acid substitutions and analysis of the resulting phenotypes on the basis of the protein structure. J. Mol. Biol. 1996;261:509–523. doi: 10.1006/jmbi.1996.0479.
  • 59. Tang Q., Fenton A.W. Whole-protein alanine-scanning mutagenesis of allostery: a large percentage of a protein can contribute to mechanism. Hum. Mutat. 2017;38:1132–1143. doi: 10.1002/humu.23231.
  • 60. Tack D.S., Tonner P.D., Pressman A., Olson N.D., Levy S.F., Romantseva E.F., et al. The genotype-phenotype landscape of an allosteric protein. Mol. Syst. Biol. 2021;17 doi: 10.15252/msb.202110847.
  • 61. Bloom J.D., Arnold F.H., Wilke C.O. Breaking proteins with mutations: threads and thresholds in evolution. Mol. Syst. Biol. 2007;3:76. doi: 10.1038/msb4100119.
  • 62. Poelwijk F.J., Kiviet D.J., Weinreich D.M., Tans S.J. Empirical fitness landscapes reveal accessible evolutionary paths. Nature. 2007;445:383–386. doi: 10.1038/nature05451.
  • 63. Natarajan C., Signore A.V., Bautista N.M., Hoffmann F.G., Tame J.R.H., Fago A., et al. Evolution and molecular basis of a novel allosteric property of crocodilian hemoglobin. Curr. Biol. 2023;33:98–108.e4. doi: 10.1016/j.cub.2022.11.049.
  • 64. Nishikawa K.K., Hoppe N., Smith R., Bingman C., Raman S. Epistasis shapes the fitness landscape of an allosteric specificity switch. Nat. Commun. 2021;12:5562. doi: 10.1038/s41467-021-25826-7.
  • 65. Poelwijk F.J., Socolich M., Ranganathan R. Learning the pattern of epistasis linking genotype and phenotype in a protein. Nat. Commun. 2019;10:4213. doi: 10.1038/s41467-019-12130-8.
  • 66. Khersonsky O., Lipsh R., Avizemer Z., Ashani Y., Goldsmith M., Leader H., et al. Automated design of efficient and functionally diverse enzyme repertoires. Mol. Cell. 2018;72:178–186.e5. doi: 10.1016/j.molcel.2018.08.033.
  • 67. Harms M.J., Eick G.N., Goswami D., Colucci J.K., Griffin P.R., Ortlund E.A., et al. Biophysical mechanisms for large-effect mutations in the evolution of steroid hormone receptors. Proc. Natl. Acad. Sci. U. S. A. 2013;110:11475–11480. doi: 10.1073/pnas.1303930110.
  • 68. Holyoak T., Zhang B., Deng J., Tang Q., Prasannan C.B., Fenton A.W. Energetic coupling between an oxidizable cysteine and the phosphorylatable N-terminus of human liver pyruvate kinase. Biochemistry. 2013;52:466–476. doi: 10.1021/bi301341r.
  • 69. Meneely K.M., McFarlane J.S., Wright C.L., Vela K., Swint-Kruse L., Fenton A.W., et al. The 2.4 Å structure of Zymomonas mobilis pyruvate kinase: implications for stability and regulation. Arch. Biochem. Biophys. 2023;744 doi: 10.1016/j.abb.2023.109679.
  • 70. Pendergrass D.C., Williams R., Blair J.B., Fenton A.W. Mining for allosteric information: natural mutations and positional sequence conservation in pyruvate kinase. IUBMB Life. 2006;58:31–38. doi: 10.1080/15216540500531705.
  • 71. Swint-Kruse L., Matthews K.S. Allostery in the LacI/GalR family: variations on a theme. Curr. Opin. Microbiol. 2009;12:129–137. doi: 10.1016/j.mib.2009.01.009.
  • 72. Oria-Hernández J., Riveros-Rosas H., Ramírez-Sílva L. Dichotomic phylogenetic tree of the pyruvate kinase family: K+ dependent and independent enzymes. J. Biol. Chem. 2006;281:30717–30724. doi: 10.1074/jbc.M605310200.
  • 73. Schramm A., Siebers B., Tjaden B., Brinkmann H., Hensel R. Pyruvate kinase of the hyperthermophilic crenarchaeote Thermoproteus tenax: physiological role and phylogenetic aspects. J. Bacteriol. 2000;182:2001–2009. doi: 10.1128/jb.182.7.2001-2009.2000.
  • 74. Muñoz M.E., Ponce E. Pyruvate kinase: current status of regulatory and functional properties. Comp. Biochem. Physiol. B Biochem. Mol. Biol. 2003;135:197–218. doi: 10.1016/s1096-4959(03)00081-2.
  • 75. León Machado J.A., Steimle V. The MHC class II transactivator CIITA: not (quite) the odd-one-out anymore among NLR proteins. Int. J. Mol. Sci. 2021;22:1074. doi: 10.3390/ijms22031074.
  • 76. Hughes A.L. Evolutionary relationships of vertebrate NACHT domain-containing proteins. Immunogenetics. 2006;58:785–791. doi: 10.1007/s00251-006-0148-8.
  • 77. Parente D.J., Ray J.C., Swint-Kruse L. Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scores. Proteins. 2015;83:2293–2306. doi: 10.1002/prot.24948.
  • 78. Lee Y., Mick J., Furdui C., Beamer L.J. A coevolutionary residue network at the site of a functionally important conformational change in a phosphohexomutase enzyme family. PLoS One. 2012;7 doi: 10.1371/journal.pone.0038114.
  • 79. Parente D.J., Swint-Kruse L. Multiple co-evolutionary networks are supported by the common tertiary scaffold of the LacI/GalR proteins. PLoS One. 2013;8 doi: 10.1371/journal.pone.0084398.
  • 80. Brown C.A., Brown K.S. Validation of coevolving residue algorithms via pipeline sensitivity analysis: ELSC and OMES and ZNMI, Oh my! PLoS One. 2010;5 doi: 10.1371/journal.pone.0010779.
  • 81.Halabi N., Rivoire O., Leibler S., Ranganathan R. Protein sectors: evolutionary units of three-dimensional structure. Cell. 2009;138:774–786. doi: 10.1016/j.cell.2009.07.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Jana B., Morcos F., Onuchic J.N. From structure to function: the convergence of structure based models and co-evolutionary information. Phys. Chem. Chem. Phys. 2014;16:6496–6507. doi: 10.1039/c3cp55275f. [DOI] [PubMed] [Google Scholar]
  • 83.Hopf T.A., Ingraham J.B., Poelwijk F.J., Schärfe C.P., Springer M., Sander C., et al. Mutation effects predicted from sequence co-variation. Nat. Biotechnol. 2017;35:128–135. doi: 10.1038/nbt.3769. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Gobel U., Sander C., Schneider R., Valencia A. Correlated mutations and residue contacts in proteins. Proteins. 1994;18:309–317. doi: 10.1002/prot.340180402. [DOI] [PubMed] [Google Scholar]
  • 85.Lockless S.W., Ranganathan R. Evolutionarily conserved pathways of energetic connectivity in protein families. Science. 1999;286:295–299. doi: 10.1126/science.286.5438.295. [DOI] [PubMed] [Google Scholar]
  • 86.Olmea O., Rost B., Valencia A. Effective use of sequence correlation and conservation in fold recognition. J. Mol. Biol. 1999;293:1221–1239. doi: 10.1006/jmbi.1999.3208. [DOI] [PubMed] [Google Scholar]
  • 87.Dekker J.P., Fodor A., Aldrich R.W., Yellen G. A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments. Bioinformatics. 2004;20:1565–1572. doi: 10.1093/bioinformatics/bth128. [DOI] [PubMed] [Google Scholar]
  • 88.Gloor G.B., Martin L.C., Wahl L.M., Dunn S.D. Mutual information in protein multiple sequence alignments reveals two classes of coevolving positions. Biochemistry. 2005;44:7156–7165. doi: 10.1021/bi050293e. [DOI] [PubMed] [Google Scholar]
  • 89.Buslje C.M., Santos J., Delfino J.M., Nielsen M. Correction for phylogeny, small number of observations and data redundancy improves the identification of coevolving amino acid pairs using mutual information. Bioinformatics. 2009;25:1125–1131. doi: 10.1093/bioinformatics/btp135.
  • 90.Simonetti F.L., Teppa E., Chernomoretz A., Nielsen M., Marino Buslje C. MISTIC: mutual information server to infer coevolution. Nucleic Acids Res. 2013;41:W8–W14. doi: 10.1093/nar/gkt427.
  • 91.Konecki D.M., Hamrick S., Wang C., Agosto M.A., Wensel T.G., Lichtarge O. CovET: a covariation-evolutionary trace method that identifies protein structure-function modules. J. Biol. Chem. 2023;299 doi: 10.1016/j.jbc.2023.104896.
  • 92.Baker F.N., Porollo A. CoeViz: a web-based tool for coevolution analysis of protein residues. BMC Bioinformatics. 2016;17:119. doi: 10.1186/s12859-016-0975-z.
  • 93.Ye K., Vriend G., Ijzerman A.P. Tracing evolutionary pressure. Bioinformatics. 2008;24:908–915. doi: 10.1093/bioinformatics/btn057.
  • 94.Yariv B., Yariv E., Kessel A., Masrati G., Chorin A.B., Martz E., et al. Using evolutionary data to make sense of macromolecules with a “face-lifted” ConSurf. Protein Sci. 2023;32 doi: 10.1002/pro.4582.
  • 95.Gu X., Vander Velden K. DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics. 2002;18:500–501. doi: 10.1093/bioinformatics/18.3.500.
  • 96.Matthew Ward R., Venner E., Daines B., Murray S., Erdin S., Kristensen D.M., et al. Evolutionary Trace Annotation Server: automated enzyme function prediction in protein structures using 3D templates. Bioinformatics. 2009;25:1426–1427. doi: 10.1093/bioinformatics/btp160.
  • 97.Chi P.B., Liberles D.A. Selection on protein structure, interaction, and sequence. Protein Sci. 2016;25:1168–1178. doi: 10.1002/pro.2886.
  • 98.Bantis L.E., Parente D.J., Fenton A.W., Swint-Kruse L. “Multiplex” rheostat positions cluster around allosterically critical regions of the lactose repressor protein. bioRxiv. 2020 doi: 10.1101/2020.11.17.386979. [preprint]
  • 99.Ishwar A., Tang Q., Fenton A.W. Distinguishing the interactions in the fructose 1,6-bisphosphate binding site of human liver pyruvate kinase that contribute to allostery. Biochemistry. 2015;54:1516–1524. doi: 10.1021/bi501426w.
  • 100.Tang Q., Alontaga A.Y., Holyoak T., Fenton A.W. Exploring the limits of the usefulness of mutagenesis in studies of allosteric mechanisms. Hum. Mutat. 2017;38:1144–1154. doi: 10.1002/humu.23239.
  • 101.Fenton A.W., Alontaga A.Y. The impact of ions on allosteric functions in human liver pyruvate kinase. Methods Enzymol. 2009;466:83–107. doi: 10.1016/S0076-6879(09)66005-5.
  • 102.Fenton A.W. Allostery: an illustrated definition for the 'second secret of life'. Trends Biochem. Sci. 2008;33:420–425. doi: 10.1016/j.tibs.2008.05.009.
  • 103.Swint-Kruse L., Matthews K.S. Thermodynamics, protein modification, and molecular dynamics in characterizing lactose repressor protein: strategies for complex analyses of protein structure-function. Methods Enzymol. 2004;379:188–209. doi: 10.1016/S0076-6879(04)79011-4.
  • 104.Bell C.E., Lewis M. A closer view of the conformation of the Lac repressor bound to operator. Nat. Struct. Biol. 2000;7:209–214. doi: 10.1038/73317.
  • 105.Schumacher M.A., Glasfeld A., Zalkin H., Brennan R.G. The X-ray structure of the PurR-guanine-purF operator complex reveals the contributions of complementary electrostatic surfaces and a water-mediated hydrogen bond to corepressor specificity and binding affinity. J. Biol. Chem. 1997;272:22648–22653. doi: 10.1074/jbc.272.36.22648.
  • 106.Fenton K.D., Meneely K.M., Wu T., Martin T.A., Swint-Kruse L., Fenton A.W., et al. Substitutions at a rheostat position in human aldolase A cause a shift in the conformational population. Protein Sci. 2022;31:357–370. doi: 10.1002/pro.4222.
  • 107.Rago F., Saltzberg D., Allen K.N., Tolan D.R. Enzyme substrate specificity conferred by distinct conformational pathways. J. Am. Chem. Soc. 2015;137:13876–13886. doi: 10.1021/jacs.5b08149.
  • 108.Ose N.J., Butler B.M., Kumar A., Kazan I.C., Sanderford M., Kumar S., et al. Dynamic coupling of residues within proteins as a mechanistic foundation of many enigmatic pathogenic missense variants. PLoS Comput. Biol. 2022;18 doi: 10.1371/journal.pcbi.1010006.
  • 109.Kazan I.C., Sharma P., Rahman M.I., Bobkov A., Fromme R., Ghirlanda G., et al. Design of novel cyanovirin-N variants by modulation of binding dynamics through distal mutations. Elife. 2022;11 doi: 10.7554/eLife.67474.
  • 110.Lee E., Redzic J.S., Zohar Eisenmesser E. Relaxation and single site multiple mutations to identify and control allosteric networks. Methods. 2023;216:51–57. doi: 10.1016/j.ymeth.2023.06.002.
  • 111.Otten R., Liu L., Kenner L.R., Clarkson M.W., Mavor D., Tawfik D.S., et al. Rescue of conformational dynamics in enzyme catalysis by directed evolution. Nat. Commun. 2018;9:1314. doi: 10.1038/s41467-018-03562-9.
  • 112.Watson M.D., Monroe J., Raleigh D.P. Size-dependent relationships between protein stability and thermal unfolding temperature have important implications for analysis of protein energetics and high-throughput assays of protein-ligand interactions. J. Phys. Chem. B. 2018;122:5278–5285. doi: 10.1021/acs.jpcb.7b05684.
  • 113.Baase W.A., Liu L., Tronrud D.E., Matthews B.W. Lessons from the lysozyme of phage T4. Protein Sci. 2010;19:631–641. doi: 10.1002/pro.344.
  • 114.Noel A.F., Bilsel O., Kundu A., Wu Y., Zitzewitz J.A., Matthews C.R. The folding free-energy surface of HIV-1 protease: insights into the thermodynamic basis for resistance to inhibitors. J. Mol. Biol. 2009;387:1002–1016. doi: 10.1016/j.jmb.2008.12.061.
  • 115.Nisthal A., Wang C.Y., Ary M.L., Mayo S.L. Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis. Proc. Natl. Acad. Sci. U. S. A. 2019;116:16367–16377. doi: 10.1073/pnas.1903888116.
  • 116.Robinson A.C., Castañeda C.A., Schlessman J.L., García-Moreno E. B. Structural and thermodynamic consequences of burial of an artificial ion pair in the hydrophobic interior of a protein. Proc. Natl. Acad. Sci. U. S. A. 2014;111:11685–11690. doi: 10.1073/pnas.1402900111.
  • 117.Marx D.C., Fleming K.G. Influence of protein scaffold on side-chain transfer free energies. Biophys. J. 2017;113:597–604. doi: 10.1016/j.bpj.2017.06.032.
  • 118.Bloom J.D., Silberg J.J., Wilke C.O., Drummond D.A., Adami C., Arnold F.H. Thermodynamic prediction of protein neutrality. Proc. Natl. Acad. Sci. U. S. A. 2005;102:606–611. doi: 10.1073/pnas.0406744102.
  • 119.Bloom J.D., Labthavikul S.T., Otey C.R., Arnold F.H. Protein stability promotes evolvability. Proc. Natl. Acad. Sci. U. S. A. 2006;103:5869–5874. doi: 10.1073/pnas.0510098103.
  • 120.Pena M.I., Davlieva M., Bennett M.R., Olson J.S., Shamoo Y. Evolutionary fates within a microbial population highlight an essential role for protein folding during natural selection. Mol. Syst. Biol. 2010;6:387. doi: 10.1038/msb.2010.43.
  • 121.Tokuriki N., Tawfik D.S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 2009;19:596–604. doi: 10.1016/j.sbi.2009.08.003.
  • 122.Trudeau D.L., Kaltenbach M., Tawfik D.S. On the potential origins of the high stability of reconstructed ancestral proteins. Mol. Biol. Evol. 2016;33:2633–2641. doi: 10.1093/molbev/msw138.
  • 123.Barry J.K., Matthews K.S. Thermodynamic analysis of unfolding and dissociation in lactose repressor protein. Biochemistry. 1999;38:6520–6528. doi: 10.1021/bi9900727.
  • 124.Rees D.C., Robertson A.D. Some thermodynamic implications for the thermostability of proteins. Protein Sci. 2001;10:1187–1194. doi: 10.1110/ps.180101.
  • 125.Yang J., Gao M., Xiong J., Su Z., Huang Y. Features of molecular recognition of intrinsically disordered proteins via coupled folding and binding. Protein Sci. 2019;28:1952–1965. doi: 10.1002/pro.3718.
  • 126.Dill K.A. Dominant forces in protein folding. Biochemistry. 1990;29:7133–7155. doi: 10.1021/bi00483a001.
  • 127.Matthew J.B., Gurd F.R.N. Methods in Enzymology. Academic Press; San Diego, CA: 1986. [18] Stabilization and destabilization of protein structure by charge interactions; pp. 437–453.
  • 128.Swint-Kruse L., Robertson A.D. Hydrogen bonds and the pH dependence of ovomucoid third domain stability. Biochemistry. 1995;34:4724–4732. doi: 10.1021/bi00014a029.
  • 129.Robertson A.D., Murphy K.P. Protein structure and the energetics of protein stability. Chem. Rev. 1997;97:1251–1268. doi: 10.1021/cr960383c.
  • 130.Hoya M., Matsunaga R., Nagatoishi S., Tsumoto K. Experimental modification in thermal stability of oligomers by alanine substitution and site saturation mutagenesis of interfacial residues. Biochem. Biophys. Res. Commun. 2024;691 doi: 10.1016/j.bbrc.2023.149316.
  • 131.Xie W.J., Warshel A. Natural evolution provides strong hints about laboratory evolution of designer enzymes. Proc. Natl. Acad. Sci. U. S. A. 2022;119 doi: 10.1073/pnas.2207904119.
  • 132.Xie W.J., Asadi M., Warshel A. Enhancing computational enzyme design by a maximum entropy strategy. Proc. Natl. Acad. Sci. U. S. A. 2022;119 doi: 10.1073/pnas.2122355119.
  • 133.Khersonsky O., Rothlisberger D., Dym O., Albeck S., Jackson C.J., Baker D., et al. Evolutionary optimization of computationally designed enzymes: Kemp eliminases of the KE07 series. J. Mol. Biol. 2010;396:1025–1042. doi: 10.1016/j.jmb.2009.12.031.
  • 134.Hou Q., Rooman M., Pucci F. Enzyme stability-activity trade-off: new insights from protein stability weaknesses and evolutionary conservation. J. Chem. Theory Comput. 2023;19:3664–3671. doi: 10.1021/acs.jctc.3c00036.
  • 135.Cagiada M., Bottaro S., Lindemose S., Schenstrøm S.M., Stein A., Hartmann-Petersen R., et al. Discovering functionally important sites in proteins. Nat. Commun. 2023;14:4175. doi: 10.1038/s41467-023-39909-0.
  • 136.Kazan I.C., Mills J.H., Ozkan S.B. Allosteric regulatory control in dihydrofolate reductase is revealed by dynamic asymmetry. Protein Sci. 2023;32 doi: 10.1002/pro.4700.
  • 137.Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., et al. UCSF Chimera--a visualization system for exploratory research and analysis. J. Comput. Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084.
  • 138.Pettersen E.F., Goddard T.D., Huang C.C., Meng E.C., Couch G.S., Croll T.I., et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 2021;30:70–82. doi: 10.1002/pro.3943.
  • 139.Sousa F.L., Parente D.J., Hessman J.A., Chazelle A., Teichmann S.A., Swint-Kruse L. Data on publications, structural analyses, and queries used to build and utilize the AlloRep database. Data Brief. 2016;8:948–957. doi: 10.1016/j.dib.2016.07.006.
  • 140.Ohnishi S., Hays A., Hagenbuch B. Cysteine scanning mutagenesis of transmembrane domain 10 in organic anion transporting polypeptide 1B1. Biochemistry. 2014;53:2261–2270. doi: 10.1021/bi500176e.
  • 141.Cagiada M., Johansson K.E., Valanciute A., Nielsen S.V., Hartmann-Petersen R., Yang J.J., et al. Understanding the origins of loss of protein function by analyzing the effects of thousands of variants on activity and abundance. Mol. Biol. Evol. 2021;38:3235–3246. doi: 10.1093/molbev/msab095.
  • 142.Da K., Weile J., Kishore N., Rubin A.F., Fields S., Fowler D.M., et al. MaveRegistry: a collaboration platform for multiplexed assays of variant effect. Bioinformatics. 2021;37:3382–3383. doi: 10.1093/bioinformatics/btab215.
  • 143.Ahler E., Register A.C., Chakraborty S., Fang L., Dieter E.M., Sitko K.A., et al. A combined approach reveals a regulatory mechanism coupling Src’s kinase activity, localization, and phosphotransferase-independent functions. Mol. Cell. 2019;74:393–408.e20. doi: 10.1016/j.molcel.2019.02.003.
  • 144.Suiter C.C., Moriyama T., Matreyek K.A., Yang W., Scaletti E.R., Nishii R., et al. Massively parallel variant characterization identifies NUDT15 alleles associated with thiopurine toxicity. Proc. Natl. Acad. Sci. U. S. A. 2020;117:5394–5401. doi: 10.1073/pnas.1915680117.
  • 145.Zhang Y., Minagawa Y., Kizoe H., Miyazaki K., Iino R., Ueno H., et al. Accurate high-throughput screening based on digital protein synthesis in a massively parallel femtoliter droplet array. Sci. Adv. 2019;5 doi: 10.1126/sciadv.aav8185.
  • 146.Mokhtari D.A., Appel M.J., Fordyce P.M., Herschlag D. High throughput and quantitative enzymology in the genomic era. Curr. Opin. Struct. Biol. 2021;71:259–273. doi: 10.1016/j.sbi.2021.07.010.
  • 147.Fram B., Truebridge I., Su Y., Riesselman A.J., Ingraham J.B., Passera A., et al. Simultaneous enhancement of multiple functional properties using evolution-informed protein design. bioRxiv. 2023 doi: 10.1101/2023.05.09.539914. [preprint]
  • 148.Kleina L.G., Miller J.H. Genetic studies of the lac repressor. XIII. Extensive amino acid replacements generated by the use of natural and synthetic nonsense suppressors. J. Mol. Biol. 1990;212:295–318. doi: 10.1016/0022-2836(90)90126-7.
  • 149.Reinhart G.D. Quantitative analysis and interpretation of allosteric behavior. Methods Enzymol. 2004;380:187–203. doi: 10.1016/S0076-6879(04)80009-0.

Articles from The Journal of Biological Chemistry are provided here courtesy of American Society for Biochemistry and Molecular Biology
