Author manuscript; available in PMC: 2020 Feb 1.
Published in final edited form as: J Phys Chem B. 2019 Nov 27;123(49):10441–10455. doi: 10.1021/acs.jpcb.9b07278

Computational Analysis of Energy Landscapes Reveals Dynamic Features That Contribute to Binding of Inhibitors to CFTR-Associated Ligand

Graham T Holt †,‡,, Jonathan D Jou †,, Nicholas P Gill §, Anna U Lowegard †,, Jeffrey W Martin , Dean R Madden §, Bruce R Donald †,∥,⊥,*
PMCID: PMC6995034  NIHMSID: NIHMS1068674  PMID: 31697075

Abstract

The CFTR-associated ligand PDZ domain (CALP) binds to the cystic fibrosis transmembrane conductance regulator (CFTR) and mediates lysosomal degradation of mature CFTR. Inhibition of this interaction has been explored as a therapeutic avenue for cystic fibrosis. Previously, we reported the ensemble-based computational design of a novel peptide inhibitor of CALP, which resulted in the most binding-efficient inhibitor to date. This inhibitor, kCAL01, was designed using OSPREY and evinced significant biological activity in in vitro cell-based assays. Here, we report a crystal structure of kCAL01 bound to CALP and compare structural features against iCAL36, a previously developed inhibitor of CALP. We compute side-chain energy landscapes for each structure to not only enable approximation of binding thermodynamics but also reveal ensemble features that contribute to the comparatively efficient binding of kCAL01. Finally, we compare the previously reported design ensemble for kCAL01 vs the new crystal structure and show that, despite small differences between the design model and crystal structure, significant biophysical features that enhance inhibitor binding are captured in the design ensemble. This suggests not only that ensemble-based design captured thermodynamically significant features observed in vitro, but also that a design eschewing ensembles would miss the kCAL01 sequence entirely.

1. INTRODUCTION

Interactions between proteins and short linear motif peptides are important in many cellular contexts.1 One such class of peptide-binding proteins is the PDZ (PSD-95, discs large, ZO-1) domain family, characterized by an 80–90 residue motif2 that adopts a conserved fold composed of 2–3 α-helices and 5–6 β-strands and binds C-terminal peptides through β-sheet interactions.3 These domains commonly modulate protein localization and complex assembly4–6 and regulate cell signaling,7,8 thereby playing critical roles in auditory and visual systems;9,10 epilepsy, pain, and addiction;6,11 synapse formation;5 cancer;7,8,12,13 and cystic fibrosis.14–17

Protein-peptide interactions have been implicated as therapeutic targets in cystic fibrosis (CF), a genetic disease characterized by defects in the cystic fibrosis transmembrane conductance regulator (CFTR) that result in impaired chloride ion transport.14 Approximately 90% of CF patients are homozygous or heterozygous for the F508del (c.1521_1523delCTT) mutation,14,18,19 which encodes a protein variant F508del-CFTR (p.Phe508del) with severe loss of function. This variant exhibits impaired folding,20 increased degradation by endoplasmic reticulum (ER) quality control machinery,21 reduced capacity for Cl− transport,14 and decreased half-life at the plasma membrane.22 CFTR is recycled from the cell membrane and preferentially targeted for lysosomal degradation by interaction of the CFTR C-terminus with the CFTR-associated ligand PDZ domain (CALP).15,16 CALP has been implicated in both decreasing concentration of CFTR at the membrane16 and arresting CFTR trafficking in the ER,17 and knockdown of CALP has been shown to rescue transepithelial chloride transport in polarized CFBE41o- cells expressing F508del-CFTR by increasing the concentration of F508del-CFTR at the plasma membrane.23 Hence, inhibition of the interaction between the CFTR C-terminal peptide and CALP is a potential therapeutic avenue for CF. Understanding of the CALP:CFTR binding interaction is critical for the development of therapeutic inhibitors.

Previous work toward inhibitor development24–27 resulted in extensive characterization of the structural and stereochemical components of CALP binding. The structure of CALP bound to the CFTR C-terminal peptide was solved by solution NMR24 with well-resolved interactions between the 4 C-terminal peptide residues (P−3–P0) and CALP. This structure revealed canonical class 1 PDZ interactions28 including those between Leu P0 and a hydrophobic pocket between secondary-structure elements α2 and β2, and the essential hydrogen bond between Thr P−2 and a histidine residue in helix α2. Peptide screening and iterative optimization using substitutional analysis25 revealed significant affinity effects of residues at other peptide positions up to P−9 and resulted in a decapeptide inhibitor (iCAL36, ANSRWPTSII) with an affinity of 22.6 ± 8.0 μM25,26 that rescued functional CFTR activity as assessed by in vitro Ussing chamber assays.29 Crystal structures of iCAL36 (and substituted peptide variants) in complex with CALP26,27 revealed structural features that influence CALP binding and selectivity. In particular, shifts in peptide orientation and location, along with conformational shifts in the carboxylate-binding loop (characterized by an XΦ1GΦ2 sequence motif, where Φi represents a hydrophobic residue and X is any residue), affect the binding geometry and specificity of the peptide P0 residue,26 allowing CALP to accommodate both Leu and Ile at P0. Additionally, side-chain interactions at P−1, P−3, P−4, and P−5 modulate affinity and specificity of CALP binding.27 Finally, despite the fact that CALP:CFTR binding is thought to be primarily driven by enthalpic effects,30 NMR data and molecular dynamics (MD) simulations suggest that entropy may play a role in modulating CALP binding,24 a hypothesis which is reflected in studies of other PDZ domains.31–34

Previously,35 we developed the most binding-efficient36 inhibitor of CALP to date using the OSPREY37 protein design software package, suggesting that components of CALP binding can be effectively captured using provable, ensemble-based computational protein design algorithms. Starting from the solution NMR structure of CALP:CFTR,24 we used the K* algorithm38 to compute approximations to Ka (the K* score) for CALP binding to ≈8000 hexameric C-terminal peptides (residue positions P−5–P0). Retrospective predictions on 6223 previously characterized sequences25,30 showed that our algorithm was able to effectively classify sequences by binding affinity to CALP, with an area under the receiver operating characteristic (ROC) curve of 0.84. Additionally, OSPREY designs on 2166 sequences resulted in novel peptides that bind tightly to CALP. All of the top 11 prospective predictions were experimentally shown to bind with high affinity to CALP, and the tightest binding hexamer, kCAL01 (Ac-WQVTRV), bound with Ki = 2.3 ± 0.2 μM.35 kCAL01 bound more tightly than both the previous best hexamer (iCAL35, WQTSII, Ki = 14.0 ± 1 μM)35 and decamer (iCAL36, ANSRWPTSII, Ki = 22.6 ± 8.0 μM)26 peptide inhibitors. Despite its small size (MW 829 Da), kCAL01 binds with an affinity comparable to a much larger (MW 1502 Da) fluorescein-modified version of iCAL36 (F*-iCAL36, Kd = 1.3 ± 0.1 μM),25 yielding a much better binding efficiency for kCAL01 (for molecular weights and inhibition constants of various inhibitors see Table S1). Furthermore, kCAL01 rescued chloride ion transport activity of F508del-CFTR in cell-based assays.35,39 Ensemble-based design algorithms were shown to be critical for the success of this design: Ranking by energy of the global minimum energy conformations (GMECs) resulted in poor prediction accuracy and little overlap with the ensemble-based predictions.35 These data not only suggest that computational structural protein design (CSPD) algorithms can capture features that contribute to CALP:peptide binding, but also that ensemble-based or entropic effects are critical for prediction accuracy.
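The binding-efficiency comparison above can be made concrete with a small calculation. The sketch below assumes a binding efficiency index defined as −log10 of the inhibition (or dissociation) constant divided by molecular weight in kDa; the text cites ref 36 for the metric without restating the formula, so this definition and the function name are illustrative assumptions. The affinities and molecular weights are those quoted above.

```python
import math

def binding_efficiency_index(k_molar: float, mw_da: float) -> float:
    """Assumed definition (not stated in the text): -log10(K in mol/L) / MW in kDa."""
    return -math.log10(k_molar) / (mw_da / 1000.0)

# Values quoted in the text: kCAL01 (Ki = 2.3 uM, MW 829 Da) and
# F*-iCAL36 (Kd = 1.3 uM, MW 1502 Da).
kcal01_bei = binding_efficiency_index(2.3e-6, 829.0)
f_ical36_bei = binding_efficiency_index(1.3e-6, 1502.0)

print(f"kCAL01:    {kcal01_bei:.2f}")
print(f"F*-iCAL36: {f_ical36_bei:.2f}")
```

Under this assumed definition, the two peptides have similar affinities, but the smaller kCAL01 scores considerably higher per unit of mass.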

Indeed, computational designs are more biophysically accurate when they model protein thermodynamic ensembles.35,38,40–47 The objectives of CSPD algorithms are to (1) compute biophysical or thermodynamic properties of a protein or protein complex and (2) efficiently search for optimal sequences given an objective function. Without loss of generality, we choose binding affinity as our biophysical property of interest. CSPD algorithms search over a user-specified input model (viz., a structural model, allowed side-chain and backbone flexibility, allowed mutations, energy function, etc.37). Because proteins exist as thermodynamic ensembles,41,48 principled algorithms should exploit statistical thermodynamics of non-covalent binding, and therefore require approximation of the partition function.41,49 However, because the conformation space available to proteins in vivo and in vitro is massive and grows exponentially with the number of flexible amino acid residues, protein design algorithms often make simplifying modeling assumptions to allow tractable computation. Such assumptions often include (1) modeling only rigid, discrete side-chain configurations, or rotamers,50 and a small set of discrete backbone conformations,51–53 (2) considering (or approximating) only a single global minimum energy conformation (GMEC),51,54–59 and (3) approximating the partition function using stochastic, heuristic sampling methods.60–62 However, these assumptions (1) fail to model small, commonly observed side-chain and backbone movements, (2) entirely omit conformational entropy, and (3) often fail to find even the GMEC.53 Algorithms in the OSPREY software package efficiently solve protein design problems without these simplifications.37
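To illustrate why a partition-function treatment can rank sequences differently from a GMEC-only treatment, the following minimal sketch computes a K*-style score as the ratio of the bound-state partition function to the product of the unbound-state partition functions. The toy energies and function names are hypothetical, and exact sums over a handful of conformations stand in for the provable approximations that OSPREY computes over very large conformation spaces.

```python
import math

RT = 1.9872e-3 * 298.15  # gas constant (kcal/(mol*K)) times temperature (K)

def partition_function(energies):
    """Boltzmann-weighted sum over conformation energies (kcal/mol)."""
    return sum(math.exp(-e / RT) for e in energies)

def kstar_score(complex_e, protein_e, ligand_e):
    """Ratio of bound to unbound partition functions (approximates Ka)."""
    return partition_function(complex_e) / (
        partition_function(protein_e) * partition_function(ligand_e)
    )

# Hypothetical toy ensembles: both sequences share the same bound GMEC energy,
# but sequence B has additional low-energy bound conformations.
protein = [-4.0]
ligand = [-2.0, -1.9]
seq_a_complex = [-10.0]
seq_b_complex = [-10.0, -9.8, -9.7]

score_a = kstar_score(seq_a_complex, protein, ligand)
score_b = kstar_score(seq_b_complex, protein, ligand)
print(score_b > score_a)  # a GMEC-only ranking would call these a tie
```

In this toy case a GMEC-based ranking cannot distinguish the two sequences, whereas the ensemble-based score favors the one with more accessible low-energy bound states.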

Recently we developed the MARK* algorithm, which, in addition to provably and efficiently approximating partition functions for input model states (i.e., bound complex, unbound protein, and ligand), allows visualization of the entire energy landscape accessible to a protein input model.63 MARK* provably bounds the energy and statistical weight of every conformation in the input model conformation space, allowing designers to compute and visualize changes in conformation distribution, instead of merely analyzing changes to a small set of low-energy conformations, or to ensemble averaged values like free energy or Ka. By computing both a conformation distribution and provably approximating energies for every conformation, MARK* enables visualization of the energy landscape. This novel capability complements traditional structural analysis by providing insight into entropic and dynamic contributions to binding.

In this work, we report a 1.7 Å resolution crystal structure of a decapeptide variant of the peptide inhibitor kCAL01 bound to CALP (PDB ID: 6OV7). To evaluate the structural basis for the enhanced binding efficiency of kCAL01, we compare this structure to that of a previously developed decapeptide inhibitor of CALP, iCAL36 (PDB ID: 4E34).26 In addition to performing traditional structural analysis, we compute energy landscapes for bound and unbound structural models for CALP:kCAL01 and CALP:iCAL36 using MARK*. From these landscapes we compute approximations to the free energy, internal energy, and entropy for each model state (i.e., bound complex, unbound protein, unbound ligand) and use these quantities to model thermodynamics of binding for CALP:kCAL01 and CALP:iCAL36. Additionally, we analyze the energy landscapes for dynamic effects and show that these energy landscapes foreground important structural features of binding and reveal dynamic features that may contribute to the efficient binding of kCAL01. We conclude that investigation of energy landscapes complements traditional analysis of one or few low-energy structures represented in crystal structures and provides important information about the entire conformational ensemble that is available to a protein structure model. Finally, to assess the extent to which designs reported in ref 35 are a result of accurate modeling of structural and dynamic components of CALP binding, we compare the design output ensemble for kCAL01 to the newly solved crystal structure. We show that, despite notable differences between the NMR-based CALP structure used as design input vs the bound crystallographic conformation of CALP, many significant crystal structure features are captured in the design output ensemble. This suggests that the success of the ensemble-based computational design of kCAL01 (ref 35) was a result of effective modeling of structural and dynamic features of binding.

An overview of our system, model, and data is as follows. Table 1 shows data collection and refinement statistics for the crystal structure of CALP:kCAL01. Figure 1 depicts the crystal structure of CALP:kCAL01 and important structural features. Figure 2 shows a detailed structural comparison between the carboxylate-binding loops of CALP:kCAL01 and CALP:iCAL36. Figure 3 details structural views of predicted conformational heterogeneity at P0. Figure 4 depicts energy landscape diagrams for CALP:kCAL01. Figure 5 shows energy landscape diagrams for CALP:iCAL36. Figure 6 shows a schematic diagram of structural comparisons performed in this work. Figure 7 shows the original design model reported in ref 35 and highlights its similarity to the crystal structure of CALP:kCAL01.

Table 1.

Data Collection and Refinement Statistics for CALP:kCAL01 Complex (PDB ID: 6OV7)

Data Collection
    space group                          P2₁2₁2₁
    unit cell dimensions a, b, c (Å)     43.6, 60.8, 81.2
    resolution (Å) [a]                   48.6–1.71 (1.83–1.71)
    Rsym (%) [b]                         10.8 (95.9)
    I/σI                                 14.0 (2.0)
    completeness (%)                     99.6 (99.8)
Refinement
    total number of reflections          23681
    reflections in test set              1160
    Rwork/Rfree (%) [c, d]               18.2/21.4 (28.1/30.7)
    no. atoms, protein                   1470
    no. atoms, water                     175
    Ramachandran plot (%) [e]            98.4, 1.6, 0, 0
    Bav (Å²), protein                    22.3
    Bav (Å²), solvent                    33.1
    bond length RMSD (Å)                 0.01
    bond angle RMSD (deg)                1.18

[a] Values in parentheses are for data in the highest-resolution shell.

[b] $R_{\mathrm{sym}} = \sum_h \sum_i |I(h) - I_i(h)| / \sum_h \sum_i I_i(h)$, where $I_i(h)$ and $I(h)$ values are the ith and mean measurements of the intensity of reflection h.

[c] $R_{\mathrm{work}} = \sum_h \big| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \big| / \sum_h |F_{\mathrm{obs}}|$, h ∈ {working set}.

[d] Rfree is calculated as Rwork for the reflections h ∈ {test set}.

[e] Core, allowed, generously allowed, disallowed.

Figure 1.

Crystal structure of CALP:kCAL01 (PDB ID: 6OV7) displays canonical class 1 PDZ binding and favorable interactions at P−1 and P−4. The CALP and kCAL01 crystal structures are shown in green and pink, respectively (protomer A). Hydrogen bonds predicted by the Probe78 software are represented as dashed yellow lines. The kCAL01 peptide binds in the groove defined by helix α2 and strand β2. (A) Gln P−4 makes favorable hydrogen bonds with Glu300 or His301 and forms van der Waals interactions with His341. Additionally, Gln P−4 can coordinate with several water molecules, shown as red, nonbonded oxygens. (B) kCAL01 binds in the groove defined by helix α2 and strand β2 and forms an antiparallel β-sheet interaction with strand β2. The C-terminus of kCAL01 forms favorable hydrogen bonds with the carboxylate-binding loop (CBL). (C) kCAL01 displays features of class 1 PDZ binding,28 including the conserved hydrogen bond between Thr P−2 and His341, as well as the interaction between the Val P0 side chain and the hydrophobic pocket. (D) Arg P−1 appears to form favorable π-interactions with His311 and also coordinates with several waters.

Figure 2.

Binding geometry of kCAL01 P0 and the carboxylate-binding loop. Superimposed views of the P0 interaction with the carboxylate-binding loop (CBL) and hydrophobic pocket (side chains shown as lines) are shown for CALP:kCAL01 protomer A (green:pink), CALP:iCAL36 protomer A (blue:orange), and CALP:iCAL36 protomer B (purple:yellow). (A) Superimposed Cα traces show that the CALP conformation at the Ile Φ2 Cα is more similar to the CALP:iCAL36 protomer A conformation than the CALP:iCAL36 protomer B conformation. (B) A pairwise comparison shows that the CALP:kCAL01 CBL geometry matches most closely with CALP:iCAL36 protomer A, seen at the side chains at CBL positions Φ1 and Φ2. However, the kCAL01 peptide P0 shifts toward the CBL by 0.7 Å relative to the CALP:iCAL36 structure. (C) A pairwise comparison shows that the CALP:kCAL01 peptide orientation matches most closely with CALP:iCAL36 protomer B, seen at position P0. However, the CALP:iCAL36 CBL shifts outward by 1.3 Å relative to the CALP:kCAL01 structure, and the hydrophobic pocket expands due to changes in rotamer at CBL position Φ1.

Figure 3.

Energy landscape analysis reveals conformational heterogeneity at Val P0 for CALP:kCAL01. Energy landscape analysis of bound kCAL01 indicates three rotamers at peptide P0 that contribute significantly to the partition function. We refer to these rotamers as m, t, or p, which describe the valine N–Cα–Cβ–Cγ1 dihedral angle as minus 60 (~−60°), trans (~180°), or plus 60 (~60°), respectively, conforming to the convention defined in ref 50. This landscape analysis (see Figure 4C, outermost ring) suggests that the complex can sample any of these rotamers with relatively high probability, but that the m rotamer will be most occupied, and the p rotamer will be least occupied. Conformations containing each of these three rotamers were selected from the bound kCAL01 ensemble. Interactions between the P0 side-chain atoms and the CALP structure are shown using Probe dots,78 where green and blue dots indicate favorable interactions, yellow dots indicate small overlaps, and red and pink lines show steric clashes. (A) The m rotamer forms favorable interactions with Thr P−2, Val345, Leu348, Ile Φ2, and Leu Φ1. (B) The t rotamer forms favorable interactions with Thr P−2, Val345, Ser349, Leu348, and Ile Φ2. The slight overlaps (yellow dots) generated due to the interaction with Leu348, along with the lack of interaction with Leu Φ1, suggest that this conformation is slightly less favorable than the m rotamer. (C) The p rotamer forms favorable interactions with Ser349, Leu348, Ile Φ2, and Leu Φ1. Slight overlaps (yellow dots) can be seen in interactions with Leu348, Ile Φ2, and Leu Φ1, and there is no interaction with Val345, suggesting that this rotamer may be slightly less favorable than either the m or t rotamers. Nevertheless, all three rotamers are well-sampled in the ensemble and contribute significantly to the partition function.
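The m/t/p naming used in this caption can be expressed as a simple binning of the χ1 dihedral angle. The sketch below is illustrative only: the function name is hypothetical, and the bin boundaries at 0° and ±120° are an assumption chosen so that the nominal values of −60°, 180°, and 60° fall at the bin centers.

```python
def chi1_bin(chi1_deg: float) -> str:
    """Classify a chi1 dihedral (degrees) as 'm' (~-60), 't' (~180), or 'p' (~+60)."""
    # Wrap the angle into the interval [-180, 180).
    angle = (chi1_deg + 180.0) % 360.0 - 180.0
    if -120.0 <= angle < 0.0:   # assumed boundary choice
        return "m"
    if 0.0 <= angle < 120.0:    # assumed boundary choice
        return "p"
    return "t"

for chi1 in (-60.0, 180.0, 60.0):
    print(chi1, chi1_bin(chi1))
```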

Figure 4.

Energy landscape analysis reveals components of binding thermodynamics for CALP:kCAL01. Upper bounds on the Boltzmann-weighted partition function computed using the MARK* algorithm63 in OSPREY37 for a 15-residue design at the protein–protein interface of CALP:kCAL01 are shown as colored ring charts. A brief explanation of the ring chart diagram can be found in Section 2.4. (A, B) Energy landscapes for CALP in the bound (A) and unbound (B) states show the change in conformation distribution induced by binding. (A) Bound CALP has a narrow distribution, with the GMEC accounting for nearly 50% of the partition function. (B) In contrast, unbound CALP shows a wide conformation distribution, with the unbound GMEC accounting for roughly 5% of the partition function, with conformational entropy generated largely by residues Thr296 and His301, indicated by a large number of similarly sized arcs at their corresponding rings. Importantly, the bound and unbound GMECs are not the same conformation and do not have the same energy. (C, D) Energy landscapes for kCAL01 in the bound (C) and unbound (D) states show a similar change in conformation distribution upon binding. (C) The bound kCAL01 energy landscape shows that the GMEC accounts for roughly 5% of the partition function. Even so, the landscape suggests considerable entropy, driven by residues at P−4, P−2, P−1, and P0. Prediction of conformational heterogeneity at P0 is particularly interesting given its buried location. (D) Conversely, unbound kCAL01 shows a very high-entropy conformation distribution, with many conformations that contribute to the partition function, as seen by the presence of many small arcs in the outer ring. This is consistent with our expectations for an extended peptide backbone. Thermodynamic parameters calculated from these partition functions (see Section 3.2.2) indicate that binding of kCAL01 to CALP results in a decrease in internal energy and a decrease in entropy, which is represented visually in these energy landscapes.

Figure 5.

Energy landscape analysis reveals components of binding thermodynamics for CALP:iCAL36. Upper bounds on the Boltzmann-weighted partition function computed using the MARK* algorithm63 in OSPREY37 for a 15-residue design at the protein–protein interface of CALP:iCAL36 are shown as colored ring charts. A brief explanation of the ring chart diagram can be found in Section 2.4. (A, B) Energy landscapes for CALP in the bound and unbound states show the change in conformation distribution induced by binding. (A) Bound CALP has a narrower distribution, with the GMEC accounting for roughly 20% of the partition function. (B) Unbound CALP shows a wide conformation distribution, with the unbound GMEC accounting for roughly 2% of the partition function, with conformational variation in multiple residues, indicated by a large number of similarly sized arcs at multiple rings. (C, D) Energy landscapes for iCAL36 in the bound and unbound states show a change in conformation distribution upon binding. (C) The bound iCAL36 energy landscape exhibits a lower-entropy distribution, with the GMEC accounting for roughly 3% of the partition function. Even so, the landscape suggests considerable heterogeneity, much of which is attributable to variation at P−2 and P−3. Interestingly, in contrast to the bound kCAL01 landscape, little heterogeneity is predicted at P0. (D) Unbound iCAL36 has a high-entropy conformation distribution, with many conformations that contribute to the partition function, as seen by the presence of many small arcs in the outer ring. This is consistent with our expectations for an extended peptide backbone. Thermodynamic parameters calculated from these partition functions (see Section 3.2.2) indicate that binding of iCAL36 to CALP results in a decrease in internal energy and a decrease in entropy, which is represented visually in these energy landscapes.

Figure 6.

Schematic diagram of design process reported in ref 35 and structural comparisons presented in this work. A flowchart of design work performed in ref 35 is shown with black arrows. First, a design input model was generated by performing MD refinement of an NMR structure of CALP:CFTR.24 Using OSPREY, this input model was used for the K* algorithm’s38 ensemble-based design of peptide inhibitors of CALP, resulting in a design output model (ensemble) of CALP:kCAL01.35 Finally, in this work we perform detailed structural comparisons between several CALP:peptide structures and models, indicated by red arrows.

Figure 7.

Structural analysis of the kCAL01 design models35 reveals similarities to the CALP:kCAL01 crystal structure. The design output model ensemble35 of CALP (gray) bound to kCAL01 (orange) closely resembles the bound CALP:kCAL01 crystal structure (Figure 1). (A) Comparison of the design input model (gray) and 6OV7 crystal structure (green) CALP conformations shows significant shifts (red arrows) in strand β2 and the β2–β3 loop. These shifts greatly expand the binding cleft between helix α2 and strand β2. The shift in β2–β3 loop conformation is a result of MD refinement, as it is not present in the structure of 2LOB before refinement (see Figure S2). The 100 conformations in the design output model ensemble capture (B) the interaction between Arg P−1 and His311 and (C) the interaction between Thr P−2 and His341. (D) The design output ensemble models conformational heterogeneity at P0, suggesting that modeling of entropy at this site was important for this design’s success. The t, p, and m rotamers of P0 are shown in red, orange, and yellow, respectively.

2. METHODS

2.1. Structure Determination of CALP:kCAL01.

Recombinant CAL PDZ (CALP; UniProt accession number Q9HD26-2; residues 278–362) was expressed and purified as described previously.64 Briefly, an expression construct was engineered in pET16b containing an N-terminal decahistidine tag, a short linker, and a human rhinovirus (HRV) 3C protease cleavage site, followed by the CALP sequence. The construct was transformed into E. coli BL21 (DE3) RIL cells; expression was induced as previously described64 (except that TB medium was used), and protein was purified by nickel-nitrilotriacetic acid (NiNTA) affinity chromatography and size-exclusion chromatography (SEC). Following removal of the affinity tag and linker by HRV-3C protease cleavage, CALP was recovered in the flow-through fraction of a NiNTA affinity column and further purified by SEC. To facilitate crystallization, kCAL01 was synthesized as a decapeptide (ANSRWQVTRV) containing four N-terminal residues (ANSR) that form lattice contacts in other CALP:peptide co-crystals.26,27,64,65 The kCAL01 decapeptide was synthesized using standard Fmoc solid-phase peptide synthesis and purified by reverse-phase HPLC. Peptide mass was confirmed using liquid chromatography/mass spectrometry (LC/MS). Using the hanging-drop method, CALP:kCAL01 co-crystals were obtained by mixing 1 mM kCAL01 and 6 mg/mL CALP with reservoir solution containing 25% (w/v) PEG 8000, 5% (v/v) PEG 400, 150 mM sodium chloride, and 100 mM Tris pH 8.5. Crystals were transferred into cryoprotectant solution (25% [w/v] PEG 8000, 15% [v/v] PEG 400, 150 mM sodium chloride, and 100 mM Tris pH 8.5) and flash-cooled in a liquid-nitrogen bath. Oscillation diffraction data were recorded at 100 K on beamline BL9-2 at the Stanford Synchrotron Radiation Lightsource (SSRL) over a 180° range with 0.5 s, 0.2° exposures. Reflection intensities were integrated and scaled using the XDS package66 (version 20190315). Initial phase estimates were obtained by molecular replacement using Phaser67 within the Phenix package68 (version 1.15.2) and using PDB ID 4E3426 as the search model (containing chains A and C only). Subsequent model building and refinement were performed using Phenix and Coot69 (version 0.8.9.2) to generate the final model of CALP in a complex with the kCAL01 decapeptide at 1.71 Å resolution. Data quality and refinement statistics are reported in Table 1. The coordinates and structure factors have been deposited in the Protein Data Bank (www.rcsb.org) with ID 6OV7.

2.2. Computational Methods.

The new crystal structure of CALP:kCAL01 (PDB ID: 6OV7, protomer A) and the crystal structure of CALP:iCAL36 (PDB ID: 4E34, protomer A) were used to model energy landscapes of CALP binding to kCAL01 and iCAL36, respectively. For each structure, we computed energy landscapes for three states: the bound CALP:peptide complex, the unbound CALP, and the unbound peptide. This was accomplished by first defining a set of accessible conformations for each state and then using the MARK* algorithm63 in OSPREY 3.037 to compute both a provable approximation to the partition function value and an approximation to the energy landscape.

Sets of accessible conformations, or conformation spaces, were defined as follows for each state. These conformation spaces are an approximation to the ensemble of conformations available to each state in vivo. First, hydrogens were added to each crystal structure using the MolProbity server70 in order to generate protonated crystal structures. Backbone atom coordinates for the bound complex state were obtained directly from protonated crystal structures 6OV7 (for the CALP:kCAL01 state) and 4E34 (for the CALP:iCAL36 state). Nine residues for CALP and the six most C-terminal residues for kCAL01 or iCAL36 (for a total of 15 residues in each bound complex, see Table S2) were modeled as continuously flexible using continuous rotamers71,72 in OSPREY. As in refs 46, 63, 71, and 73, rotamers from the Penultimate Rotamer Library50 were allowed to adopt any side-chain conformation such that all χ-angles are within ±9° of their modal χ-angles. For all other residues, side-chain coordinates were obtained from protonated crystal structures. Models for unbound CALP and peptide states were obtained by removing all atoms of the peptide or CALP structure, respectively, from the complex state. Thus, we defined approximations to the conformational ensembles for bound and unbound states, herein referred to as models, for CALP:kCAL01 and CALP:iCAL36.
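As a rough picture of the continuous-rotamer model described above, each rotamer can be thought of as a box ("voxel") of allowed χ-angles centered on its modal values, and the conformation space as the set of all combinations of such boxes across the flexible residues. The sketch below is not OSPREY code: the function names are hypothetical, the valine χ1 values are the nominal angles quoted in the Figure 3 caption, and the per-residue rotamer count is an arbitrary placeholder.

```python
HALF_WIDTH_DEG = 9.0  # from the text: chi angles within +/-9 degrees of modal values

def rotamer_voxel(modal_chis):
    """Return (low, high) bounds in degrees for each chi angle of one rotamer."""
    return [(chi - HALF_WIDTH_DEG, chi + HALF_WIDTH_DEG) for chi in modal_chis]

def count_rotamer_assignments(rotamers_per_residue):
    """Number of rotamer combinations (voxels) in the conformation space."""
    total = 1
    for n in rotamers_per_residue:
        total *= n
    return total

# Nominal valine chi1 values (m, t, p) as quoted in the Figure 3 caption.
valine = {"m": [-60.0], "t": [180.0], "p": [60.0]}
print({name: rotamer_voxel(chis) for name, chis in valine.items()})

# Placeholder: 15 flexible residues with 3 rotamers each (hypothetical count).
print(count_rotamer_assignments([3] * 15))
```

Even with only three rotamers per position, 15 flexible residues give millions of voxels, which is why provable bounding rather than exhaustive minimization is needed.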

For each model, we computed ε-approximate bounds on the value of the partition function to a deterministic, guaranteed accuracy of ε < 0.01 using the MARK* algorithm63 in OSPREY. All computations were run on 40–48 core Intel Xeon nodes with up to 500 GB of memory. As proved previously,63 not only does MARK* compute a provable ε-approximation to the partition function, it also bounds the energy landscape by provably approximating the energy and therefore statistical weight of all model conformations in the conformation space.
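The stopping criterion implied by an ε-approximation can be sketched as follows. Given provable lower and upper bounds on the partition function, the gap (upper − lower)/upper is compared with ε; bounds on individual conformations are tightened until the gap is small enough. This toy loop is only a schematic under stated assumptions, not the MARK* algorithm: the function name, the greedy "loosest bound first" order, and the lookup table standing in for an expensive energy minimization are all hypothetical.

```python
def refine_until_epsilon(conf_bounds, exact_weight, epsilon=0.01):
    """Tighten per-conformation Boltzmann-weight bounds until the gap is <= epsilon.

    conf_bounds:  dict name -> (lower, upper) bounds on the Boltzmann weight
    exact_weight: dict name -> exact weight (stand-in for a costly minimization)
    Returns (z_lower, z_upper, names_refined).
    """
    bounds = dict(conf_bounds)
    refined = []
    while True:
        z_lower = sum(lo for lo, _ in bounds.values())
        z_upper = sum(hi for _, hi in bounds.values())
        if (z_upper - z_lower) / z_upper <= epsilon:
            return z_lower, z_upper, refined
        # Refine the conformation whose bounds are currently loosest.
        name = max(bounds, key=lambda k: bounds[k][1] - bounds[k][0])
        bounds[name] = (exact_weight[name], exact_weight[name])
        refined.append(name)

toy_bounds = {"a": (90.0, 110.0), "b": (5.0, 9.0), "c": (0.0, 0.5)}
toy_exact = {"a": 100.0, "b": 7.0, "c": 0.1}
z_lo, z_hi, order = refine_until_epsilon(toy_bounds, toy_exact)
print(z_lo, z_hi, order)
```

In this toy case the low-weight conformation "c" never needs to be refined, which illustrates how a guaranteed accuracy can be reached without evaluating every conformation exactly.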

2.3. Entropy, Internal Energy, and Helmholtz Free Energy Calculation.

Aggregate values for the ensembles in each state were computed by bounding the energy for each conformation in the ensemble and combining these energy bounds. For each bound and unbound state, we first computed bounds on the energy of each conformation in the conformational ensemble defined by that state, as was done in refs 35, 37, 44, 63, 74, and 75 and described in Section 2.2. Using these energy bounds, we then computed bounds on the corresponding Boltzmann-weighted partition function $Z_C = \sum_{c \in C} \exp(-E(c)/RT)$, where E is a function that returns the energy of conformation c, by computing and summing bounds on the Boltzmann weights for conformations c in that state C (see ref 63 for details). We then divided the upper bound on Boltzmann weight of c by the upper bound on the partition function to compute the probability $p_c$ for each conformation c within the ensemble. Using these probabilities, we then calculated the entropy $S = -R \sum_{c \in C} p_c \ln p_c$ and internal energy $U = \sum_{c \in C} p_c E(c)$ of the ensemble and combined these two values to compute the Helmholtz free energy $F = U - TS$ at a temperature of 298.15 K. Here, E is a function that returns the lower bound on the energy of conformation c. We also used the upper bounds on Boltzmann weight for each conformation to compute energy landscape diagrams,63 which are explained in Section 2.4.
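The calculation described in this section can be sketched numerically as below. This is a minimal illustration, assuming a small list of exact conformation energies in kcal/mol rather than the provable energy bounds used in the actual computation; the function name and toy energies are hypothetical. For exact Boltzmann probabilities the result satisfies F = U − TS = −RT ln Z, which the sketch uses as a consistency check.

```python
import math

R = 1.9872e-3  # gas constant, kcal/(mol*K)
T = 298.15     # temperature in K, as stated in the text

def ensemble_thermodynamics(energies):
    """Return (Z, probabilities, U, S, F) for conformation energies in kcal/mol."""
    e_min = min(energies)
    # Shift by the minimum energy so the exponentials stay well-scaled.
    shifted = [math.exp(-(e - e_min) / (R * T)) for e in energies]
    z_shifted = sum(shifted)
    probs = [w / z_shifted for w in shifted]
    u = sum(p * e for p, e in zip(probs, energies))          # internal energy
    s = -R * sum(p * math.log(p) for p in probs if p > 0.0)  # entropy
    f = u - T * s                                            # Helmholtz free energy
    z = z_shifted * math.exp(-e_min / (R * T))
    return z, probs, u, s, f

# Hypothetical toy ensemble of three conformations.
z, probs, u, s, f = ensemble_thermodynamics([-10.0, -9.5, -9.0])
print(f"U = {u:.3f} kcal/mol, TS = {T * s:.3f} kcal/mol, F = {f:.3f} kcal/mol")
```

A single-conformation ensemble has zero entropy in this model, so its free energy equals its energy; adding low-energy alternatives lowers F through the −TS term.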

Although the K* score exhibits good Spearman’s rank correlation with experimental Ka values,37,74 the correlation between K* scores and Ka is not yet quantitative. First, most physics-based energy functions are based on small-molecule energetics, which can overestimate van der Waals terms and thereby overestimate internal energy. Additionally, the input models used in the current computation model only a subset of biologically available flexibility; in this case, flexibility was restricted to up to 4 side-chain χ angles per residue. We allowed side-chain χ angles to minimize continuously within ±9° of modal χ-angles but did not model backbone flexibility. Furthermore, we did not model explicit waters, instead relying on the EEF1 implicit solvation model76 in OSPREY. As a result, our models likely underestimate entropy and overestimate internal energy. Therefore, we scaled our thermodynamic values, decreasing internal energy U by a factor of 4, similar to the method described in ref 63.

2.4. Interpretation of Energy Landscape Diagrams.

Visual representations of computed energy landscapes can be found in Figures 4 and 5. For a full description of ring diagram visualizations, see ref 63. Briefly, each concentric ring represents a design amino acid residue, with each ring arc representing a single rotamer assignment to that residue given the residue assignments of the inner arcs, or “partial conformation”. Therefore, any arc in the outermost ring represents a “full conformation”, where all amino acid positions are each assigned a single rotamer. The angle of any given arc corresponds to the partition function contribution of all conformations that contain the given partial conformation. The color of any given arc corresponds to the smallest energy difference between the GMEC and the lowest-energy conformation that contains the given partial conformation, with small energy differences colored green, and larger energy differences colored red. Notably, white gaps are indicative of relatively high-energy conformations that individually contribute less than 0.1% of the partition function value. Therefore, a ring diagram visually represents the entire energy landscape for a design problem, showing the distribution of conformations according to their probability.
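The arc geometry described above can be sketched in a few lines: the angle of an arc at a given ring is the fraction of the partition function contributed by all full conformations that share that partial conformation, scaled to 360°. The data layout and function name below are hypothetical and the weights are toy values; the actual diagrams are built from upper bounds on Boltzmann weights (see ref 63).

```python
from collections import defaultdict

def ring_arc_angles(conformations, depth):
    """Arc angles (degrees) for the ring at a given depth.

    conformations: list of (rotamer_tuple, boltzmann_weight) for full conformations
    depth:         number of residues assigned in the partial conformation
    """
    z = sum(weight for _, weight in conformations)
    arcs = defaultdict(float)
    for rotamers, weight in conformations:
        # Every full conformation contributes to the arc of its prefix.
        arcs[rotamers[:depth]] += 360.0 * weight / z
    return dict(arcs)

# Toy two-residue design: the first residue has rotamers m/t, the second a/b.
toy = [(("m", "a"), 6.0), (("m", "b"), 2.0), (("t", "a"), 2.0)]
print(ring_arc_angles(toy, 1))  # innermost ring
print(ring_arc_angles(toy, 2))  # outermost ring (full conformations)
```

Because each outer arc is nested inside the arc of its prefix, the angles at every ring sum to 360° (before any low-weight arcs are dropped as white gaps).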

3. RESULTS AND DISCUSSION

3.1. Structural Analysis of CALP:kCAL01.

Co-crystals were formed with recombinant CALP and a decapeptide variant of kCAL01 with a four-residue N-terminal extension (ANSRWQVTRV, extension in italics). The refined model exhibits an excellent fit to the density, with Rwork and Rfree of 0.182 and 0.214, respectively, and lies within typical peptide geometry constraints (Table 1). The asymmetric unit consists of two protomers (A and B) of CALP complexed with kCAL01 (Figure S1A). The 9 or 8 C-terminal residues of the peptide ligand are well-resolved in protomers A and B, respectively. The crystal structure was deposited as PDB ID: 6OV7.

Alignment of protomers A and B of 6OV7 by CALP main-chain atoms using PyMOL77 results in good overlap, with 281 of the 348 total backbone atoms aligning with an RMSD of 0.32 Å. Notable differences can be seen between the two protomer structures at two sites: helix α1 and the region adjacent to the carboxylate-binding loop (CBL). Significant distortion of the protomer B α1 helix results from an interprotomer disulfide bond between CALP residues 319 (protomer A) and 319 (protomer B) (Figure S1A,B). We hypothesize that this disulfide bond and resulting helix α1 distortion are artifacts of crystallization. Pronounced conformational differences adjacent to the protomer A and B CBLs, which connect the β-strands β1 and β2, are evident between CALP residues 284 and 289 (Figure S1B). However, these differences occur upstream of the CBL sequence motif residues 291 (Φ1) and 293 (Φ2) and do not appear to affect peptide binding. Due to the distortion of protomer B helix α1, the following analysis focuses on the protomer A CALP:kCAL01 structure.

3.1.1. Gross Structural Analysis of CALP:kCAL01 Reveals Canonical PDZ:Peptide Binding.

The overall topology of the protomer A CALP fold, composed of 6 β-strands and 2 α-helices, matches well with previous CALP structures and represents a canonical PDZ fold.2,3 Class 1 PDZ domains bind to peptides containing a C-terminal S/T-X-Φ binding motif, where Φ is hydrophobic and X is any residue; the bound peptide forms an antiparallel β-sheet with the β2 strand.28 kCAL01 binds in a manner consistent with typical class 1 PDZ domains, occupying the groove defined by helix α2 and strand β2 (Figure 1). Four main-chain hydrogen bonds are formed between CALP β2 and the 3 C-terminal kCAL01 residues P−2–P0, forming an antiparallel β-strand interaction (Figure 1B). This positions the most C-terminal peptide residue (P0) such that the main-chain carboxyl terminus interacts with the CBL, defined by an XΦ1GΦ2 sequence motif, where Φi is a hydrophobic amino acid.3,26 Additionally, the hydrophobic P0 side chain is buried in the pocket defined by CALP residues Leu291, Ile293, Ile295, Val345, and Leu348 (Figure 1C). The CALP:kCAL01 structure also contains the critical hydrogen bond between Thr P−2 and His341, which plays a significant role in defining PDZ domain class 1 specificity28 (Figure 1C). Notably, only the six C-terminal residues of the extended kCAL01 peptide form any direct contacts with CALP; the four additional N-terminal residues make only lattice contacts and were added to facilitate crystallization. Overall, this structure depicts a binding interaction that is consistent with the structural characteristics observed for canonical class 1 PDZ domains.3,28

To evaluate the basis for the enhanced efficiency of the CALP:kCAL01 binding interaction, we compared this crystal structure (PDB ID: 6OV7, protomer A) to the structure of CALP bound to iCAL36 (PDB ID: 4E34, protomers A and B), a previously developed inhibitor of CALP that also exhibits in-cell activity29 but binds less tightly to CALP. We note that CALP residue numbering differs between these two structures, with numbering for 4E34 offset by +8 relative to 6OV7. Unless otherwise noted, all residue numbering refers to the 6OV7 numbering convention.

3.1.2. Comparison of CALP:kCAL01 to CALP:iCAL36 Reveals Differences in Carboxylate-Binding Loop Conformation.

First, we analyzed the CBL conformation and peptide orientation, because previous work26 demonstrated that these features play a role in modulating CALP specificity for peptide residue P0. In particular, through analysis of 4E34, ref 26 presented two structural mechanisms by which CALP accommodates an Ile P0 side chain: (1) a CBL conformation that narrows the entrance to the hydrophobic binding pocket concomitant with an N-terminal peptide shift (4E34, protomer A), and (2) a CBL conformation that widens the entrance to the hydrophobic binding pocket concomitant with a change in rotamer at Leu Φ1, thus expanding the hydrophobic binding pocket (4E34, protomer B). kCAL01, in contrast, has a valine at position P0. To investigate the structural consequences of this substitution, we aligned each protomer of CALP:iCAL36 to CALP:kCAL01 by the main-chain atoms of CALP secondary-structure elements β2 and α2, which flank the peptide-binding groove. Structures of CALP:kCAL01 protomer A and CALP:iCAL36 protomer A showed good correspondence at the binding pocket, with an RMSD of 0.24 Å, and CALP:kCAL01 protomer A and CALP:iCAL36 protomer B aligned with an RMSD of 0.41 Å (RMSD calculated using backbone heavy atoms of secondary-structure elements α2 and β2).
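For readers who want to reproduce this kind of comparison outside PyMOL, the sketch below computes an RMSD after optimal rigid-body superposition of matched atoms. It uses the Kabsch algorithm implemented with NumPy, which is a stand-in and not necessarily the procedure used here; the function name `kabsch_rmsd` and the synthetic coordinates are hypothetical, and real input would be the matched backbone atoms of the chosen secondary-structure elements.

```python
import numpy as np

def kabsch_rmsd(mobile, target):
    """RMSD after optimal rigid superposition of matched atoms.

    mobile, target: (N, 3) coordinate arrays listing the same atoms in the
    same order. The result is in the units of the input (e.g. angstroms).
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    p = mobile - mobile.mean(axis=0)   # center both coordinate sets
    q = target - target.mean(axis=0)
    h = p.T @ q                        # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Flip one axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = (rot @ p.T).T - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))

# Synthetic check: a rotated and translated copy should superpose exactly.
rng = np.random.default_rng(0)
coords = rng.normal(size=(20, 3))
theta = 0.7
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = coords @ rz.T + np.array([1.0, 2.0, 3.0])
print(kabsch_rmsd(coords, moved))
```

Restricting the input to atoms of β2 and α2 mirrors the choice made in the text: the fit is driven by the elements that flank the binding groove, so differences elsewhere in the domain do not influence the superposition.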

The overall binding geometry of the CBL and peptide for CALP:kCAL01 contains distinct features of both CALP:iCAL36 protomers A and B (Figure 2). The CBL conformation of kCAL01 at residue Φ2 is more similar to that of the iCAL36 protomer A than protomer B, with Cα deviations of 0.5 and 1.3 Å, respectively (see Figure 2A). Additionally, the rotamer at loop residue Φ1 matches with iCAL36 protomer A (Figure 2B). Overall, the CALP CBL when bound to kCAL01 adopts a conformation that narrows the entrance of the hydrophobic binding pocket, which is similar to the previously reported CALP:iCAL36 protomer A structure (Figure 2B). This suggests that kCAL01 binding to CALP does not require hydrophobic pocket expansion to accommodate the Val P0.

However, the bound kCAL01 peptide shifts toward the CBL, similar to the CALP:iCAL36 protomer B structure (Figure 2C). kCAL01 shifts toward the CBL relative to iCAL36 protomer A by 0.7 Å measured at the P0 Cα (Figure 2B). This results in side-chain positioning at kCAL01 Val P0 that is intermediate to CALP:iCAL36 protomers A and B. This shift propagates up the backbone of the peptide, as the P−1 Cα also shifts by 0.8 Å. These results suggest that the presence of a valine at P0, rather than a sterically larger leucine or isoleucine, allows shifts in the peptide backbone that accommodate the less common C-terminal side chain within a high-affinity interaction.

On the basis of this structural analysis, it is unclear how the changes in P0 binding mode between CALP:kCAL01 and CALP:iCAL36 affect binding affinity. On one hand, the Val P0 present in kCAL01 appears to allow a peptide C-terminal shift without requiring a shift in the CBL and hydrophobic pocket expansion. On the other hand, it is not clear whether this C-terminal peptide shift is favorable or unfavorable for binding. Indeed, the interactions formed by Val and Ile P0 in 6OV7 and 4E34, respectively, appear qualitatively similar, and the inclusion of the sterically larger Ile could more effectively fill the pocket. Overall, more analysis is needed to clarify the effects of structural variation at this site. This analysis is provided in Section 3.2.1, where investigation of CALP:kCAL01 and CALP:iCAL36 energy landscapes suggests that these structural shifts allow the kCAL01 Val P0 to sample three low-energy rotamers, which we predict to be favorable for binding.

3.1.3. Comparison of CALP:kCAL01 and CALP:iCAL36 at Modulator Residues Reveals Interactions That Favor kCAL01 Binding.

Previous work25,27,65 pinpointed “modulator” residues at P−1, P−3, and P−4–P−9 that show individually modest effects on binding and specificity but together can create significant effects. We compared the CALP:kCAL01 and CALP:iCAL36 protomer A structures in order to determine the effect of these modulator residues on inhibitor binding.

kCAL01 contains an arginine residue at P−1 that interacts with His311 in an apparent π-cation interaction (Figure 1D). In contrast, iCAL36 contains an isoleucine at this position, which forms minor van der Waals interactions with His311 and Ser294. These interactions appear to be much less extensive than those formed by the Arg P−1 (see Figure S3A). Favorable interactions between CALP and kCAL01 are indicated at Gln P−4, which hydrogen bonds with Glu300 or His301, in addition to forming van der Waals interactions with His341 and interacting with several waters (Figure 1A). While the Pro P−4 found in the CALP:iCAL36 structure does interact with His301, His341, and a single water molecule, these interactions appear to be less favorable (see Figure S3B).

kCAL01 and iCAL36 differ only slightly at P−3, containing a valine and threonine residue, respectively. Both residues form van der Waals interactions between a methyl group and Ser308, but only the kCAL01 Val P−3 forms interactions with Thr296 due to the larger steric volume of the methyl group. These differences, while minor, suggest slightly more favorable interactions for kCAL01 at this position (see Figure S3C).

Overall, the most notable structural differences in binding stereochemistry between kCAL01 and iCAL36 occur at residues P−1 and P−4, where mutations to long polar and charged residues likely result in an increase in favorable energetic interactions. These results suggest that sequence differences shift the thermodynamic balance: the more hydrophobic iCAL36 peptide may have higher energy alone in solvent, whereas the more polar kCAL01 sequence is preferentially stabilized in the bound state.

3.2. Energy Landscape Analysis of CALP:kCAL01.

Conformational entropy can play a significant role in defining protein structure and function.79–81 For this reason, when modeling binding of protein:ligand complexes, it is useful to compute partition functions over protein ensembles to better model and understand binding thermodynamics.35,38,40–46 To approximate the conformational ensembles involved in CALP:kCAL01 and CALP:iCAL36 binding, we computed partition functions and energy landscapes using OSPREY for bound and unbound models of CALP:kCAL01 (PDB ID: 6OV7, protomer A) and CALP:iCAL36 (PDB ID: 4E34, protomer A) as described in Section 2.2. We compared energy landscape features of CALP:kCAL01 and CALP:iCAL36 to reveal dynamic features that contribute to CALP:kCAL01 binding. Furthermore, we used these energy landscapes to compute approximations to thermodynamic components of binding (described in Section 2.3) to analyze differences in binding of CALP:kCAL01 and CALP:iCAL36. A discussion on how to interpret energy landscape diagrams presented herein can be found in Section 2.4.

3.2.1. Energy Landscape Comparison of CALP:kCAL01 and CALP:iCAL36 Foregrounds Structural and Dynamic Features That Explain Differences in Binding.

Detailed comparison of CALP:kCAL01 (Figure 4) and CALP:iCAL36 (Figure 5) landscapes reveals local differences in side-chain conformational distributions for each structure. This is most notable when comparing the bound inhibitor landscapes of kCAL01 and iCAL36 (Figures 4C and 5C, respectively). Note that for ease of comparison we have decomposed the bound complex CALP:kCAL01 landscape into bound CALP and kCAL01 landscapes. The original convolved landscapes can be found in Figure S4.

The bound kCAL01 landscape (Figure 4C) indicates that residue Val P0 adopts three significant rotamers, shown by subdivision into three arcs at the outermost ring (ring 5). In contrast, the bound iCAL36 landscape (Figure 5C) indicates that residue Ile P0 adopts only one significant rotamer, shown in the second ring from the center (ring 1). Structural analysis of the rotamer distribution for this residue (Figure 3) suggests that Val P0 forms favorable interactions with the CALP hydrophobic pocket in each of three rotamers (Figure 3A–C) defined by a rotation of ~120° around the N–Cα–Cβ–Cγ1 dihedral angle. As a result, the landscape analysis of bound kCAL01 suggests that residue Val P0 interacts with low energy and locally high entropy. Conversely, the iCAL36 Ile P0 is likely too large to interact favorably in multiple rotameric states and is predicted to occupy only one significant rotamer. These predicted differences in conformational heterogeneity at P0 could help explain the improved binding efficiency of kCAL01. Ensemble features are difficult to visualize when examining only a static crystal structure, but are now made clear by the energy landscape analysis.
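To give a sense of scale for this kind of local entropy, the following sketch evaluates the −TS term for a discrete rotamer distribution using S = −R Σ p ln p. The equal populations are a hypothetical upper bound chosen for illustration, not the populations computed for Val P0, and the function name `rotamer_entropy_term` is likewise an assumption.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def rotamer_entropy_term(populations, temperature=298.0):
    """Return -T*S (kcal/mol) for a discrete rotamer distribution.

    populations: non-negative relative populations of each rotamer; they are
    normalized internally, so Boltzmann weights or fractions both work.
    """
    total = sum(populations)
    probs = [p / total for p in populations if p > 0]
    entropy = -R_KCAL * sum(p * math.log(p) for p in probs)
    return -temperature * entropy

# Hypothetical limiting cases: three equally populated rotamers vs. one.
three_rotamers = rotamer_entropy_term([1.0, 1.0, 1.0])
one_rotamer = rotamer_entropy_term([1.0])
print(f"{three_rotamers:.2f} kcal/mol vs {one_rotamer:.2f} kcal/mol")
```

In the equal-population limit the three-rotamer case contributes −RT ln 3, about −0.65 kcal/mol at 298 K, relative to a single rotamer. Unequal populations give a smaller magnitude, so this number bounds the effect rather than estimating it.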

Comparison of the energy landscapes of bound CALP for the CALP:kCAL01 and CALP:iCAL36 models (Figures 4A and 5A) also reveals differences in conformational heterogeneity. CALP bound to kCAL01 appears to be heavily conformationally restricted, with the GMEC occupying nearly 50% of the landscape (Figure 4A), while CALP bound to iCAL36 is less conformationally restricted, with the GMEC occupying roughly 20% of the landscape (Figure 5A). These differences appear to be driven in large part by differences in residue conformational heterogeneity at His311 and His301 (CALP:kCAL01 residue numbering; CALP:iCAL36 numbering is +8 relative), which can be seen by comparing the innermost two rings (rings 0 and 1) in the two bound CALP landscapes (Figures 4A and 5A). In each of ring 0 (the innermost) and ring 1 (the second innermost), the bound CALP landscape in the iCAL36 structure shows an additional minor rotamer population, shown by a purple or blue arc, respectively. This indicates that the rotamer distribution for His311 and His301 in the bound CALP:iCAL36 model has more entropy than that in the CALP:kCAL01 model.

The greater calculated side-chain entropy for His301 and His311 in the iCAL36-bound state might appear counterintuitive, given the better affinity of the kCAL01 complex. However, these entropic contributions may be offset by other changes in the protein and by a loss of favorable energetic interactions locally. Indeed, examination of structural interactions between His311 and His301 and the peptide inhibitor for CALP:kCAL01 and CALP:iCAL36 structures suggests that this relative increase in entropy for CALP:iCAL36 can be explained by a loss of interactions between His311 and peptide P−1, and between His301 and peptide P−4. Specifically, kCAL01 Arg P−1 forms strong π-stacking interactions with His311, while iCAL36 Ile P−1 forms weaker interactions with the analogous His319. Similarly, kCAL01 Gln P−4 forms a hydrogen bond with His301, while iCAL36 Pro P−4 forms van der Waals interactions. As a result, we expect these histidines to form more energetically favorable interactions with kCAL01 than iCAL36. Our models predict that these favorable interactions are sensitive to the rotamer choice at His311 and His301, resulting in a less conformationally heterogeneous ensemble at these positions.

Thus, energy landscape analysis both reveals conformational heterogeneity at Val P0 and draws attention to important modulator residue interactions in the bound state. These ensemble features are not clearly evident from electron density or B-factor analysis, indicating that our models capture information that is missed by traditional structural analysis. This information could likely be captured by measuring 3-bond scalar couplings by NMR,82 analysis of a higher-resolution structure with Ringer83 or qFit,84–86 or use of advanced crystallography techniques including room-temperature85,87 and multitemperature88 crystallography.

3.2.2. Thermodynamics of CALP:kCAL01 and CALP:iCAL36 Energy Landscapes Reveal Decreases in Internal Energy and Entropy upon Binding.

Energy landscapes of CALP:kCAL01 binding visualize the loss of entropy upon binding. Figure 4 depicts energy landscapes of the unbound CALP (Figure 4B), unbound kCAL01 (Figure 4D), and bound CALP and kCAL01 (Figure 4A,C, respectively). Comparison of the unbound and bound ensemble landscapes for CALP (Figure 4A,B) reveals the significant loss of entropy due to conformational rearrangement upon binding. The unbound CALP landscape (Figure 4B) shows many low-energy conformations that contribute to the partition function, indicated by the many green-blue arcs in the outermost ring. Conversely, the bound ensemble of CALP (Figure 4A) is dominated by a single low-energy conformation, indicated by the large green arc, with the rest of the landscape occupied by higher-energy minor conformations. This indicates that the conformational rearrangement due to binding of kCAL01 imposes a significant entropic cost that must be compensated for by the gain of favorable intermolecular interactions.

Comparison of the bound and unbound ensemble landscapes for kCAL01 (Figure 4C,D, respectively) reveals a similar picture, with the decrease in entropy upon binding illustrated by the decrease in number and increase in size of the arcs in the outermost ring. Notably, for the unbound kCAL01 landscape, the GMEC occupies less than 0.2% of the partition function and outer rings are characterized by extensive whitespace, indicating the presence of many conformations that individually occupy less than 0.1% of the partition function. Together, these features are indicative of a high-entropy landscape. In contrast, the bound kCAL01 landscape is characterized by fewer conformations that contribute relatively more to the partition function, and a GMEC that occupies roughly 5% of the partition function. These landscape representations depict the loss of entropy upon binding of CALP:kCAL01.

Using these energy landscapes, we calculated approximations to the ensemble-weighted internal energy and entropy for the bound and unbound states of CALP:kCAL01 as described in Section 2.3. Additionally, we computed the same approximations for the binding-competent ensemble89 (an alchemical state defined by the conformations and occupancies found in the bound protein or ligand, modeled with the energy field of the unbound state) to deconvolve the changes in entropy vs internal energy upon binding. Conveniently, as shown previously,89 this construction allows us to decompose binding into an “induced fit” step, involving a change in conformation distribution, and a “lock and key” step, in which protein:ligand interactions are formed, without regard to actual mechanism. At the chosen temperature of ~298 K, both CALP and kCAL01 exhibit a change in conformation distribution upon binding. This change in distribution results in a change in internal energy and entropy for CALP of 0.113 and −1.97 kcal/mol, respectively, and for kCAL01 of 0.234 and −2.23 kcal/mol, respectively, quantifying the large decrease in entropy due to binding, where the total contribution of the entropy change to the Helmholtz free energy, −TΔSbinding, is +4.19 kcal/mol. Complex formation (the “lock and key” step) results in a decrease in internal energy: whereas the combined internal energy of the unbound models is −18.84 kcal/mol, the bound model internal energy is −28.49 kcal/mol, resulting in a ΔUbinding of −9.65 kcal/mol. As a result, the approximated change in Helmholtz free energy due to binding, ΔFbinding, is −5.46 kcal/mol. These models suggest that both binding partners incur penalties to entropy and internal energy when adopting the binding-competent ensemble, which are compensated for by a large decrease in internal energy upon complex formation.
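The bookkeeping behind these values can be checked with a few lines of Python. The sketch uses only the numbers quoted above and the relation ΔF = ΔU − TΔS; the variable names are illustrative, and treating the reported per-partner entropy changes as TΔS terms in kcal/mol is an assumption about how they are expressed.

```python
# Values quoted in the text for CALP:kCAL01, all in kcal/mol.
u_unbound = -18.84          # combined internal energy of the unbound models
u_bound = -28.49            # internal energy of the bound model
minus_t_ds_binding = 4.19   # reported -T*dS contribution to the free energy

# Per-partner entropy changes on adopting the binding-competent ensemble.
# Assumption: the text reports these as T*dS terms in kcal/mol.
t_ds_induced_fit = {"CALP": -1.97, "kCAL01": -2.23}

# Helmholtz free energy change: dF = dU - T*dS.
delta_u_binding = u_bound - u_unbound
delta_f_binding = delta_u_binding + minus_t_ds_binding
minus_t_ds_from_parts = -sum(t_ds_induced_fit.values())

print(f"dU_binding = {delta_u_binding:.2f} kcal/mol")
print(f"dF_binding = {delta_f_binding:.2f} kcal/mol")
print(f"-T*dS from rounded per-partner terms = {minus_t_ds_from_parts:.2f} kcal/mol")
```

The first two lines reproduce the reported ΔUbinding of −9.65 kcal/mol and ΔFbinding of −5.46 kcal/mol. Summing the rounded per-partner entropy terms gives 4.20 rather than 4.19 kcal/mol, a difference that is consistent with rounding of the individual values.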

Energy landscapes of CALP:iCAL36 binding reveal a loss of entropy upon binding that is similar to that of CALP:kCAL01. Figure 5 depicts energy landscapes of the unbound CALP (Figure 5B), unbound iCAL36 (Figure 5D), and bound CALP and iCAL36 (Figure 5A,C, respectively). Comparison of bound and unbound states for CALP and iCAL36 also reveals a decrease in entropy upon binding for both binding partners, indicated by CALP and iCAL36 energy landscapes exhibiting a reduction in the number of arcs and an increase in arc size upon binding.

Approximations of ensemble-weighted internal energy and entropy for bound, binding-competent ensemble, and unbound models of CALP:iCAL36 revealed a smaller decrease in entropy upon binding compared to CALP:kCAL01, but also showed a smaller decrease in internal energy upon binding. At a temperature of ~298 K, our models indicate that both CALP and iCAL36 undergo a change in conformation distribution upon binding. This results in a change in internal energy and entropy for CALP of 0.004 and −1.60 kcal/mol, respectively, and for iCAL36 of 0.558 and −1.89 kcal/mol, respectively, illustrating a decrease in entropy due to binding, where the total contribution of the entropy change to the Helmholtz free energy, −TΔSbinding, is +3.48 kcal/mol. Complex formation results in a decrease in internal energy: whereas the combined internal energy of the unbound models is −17.7 kcal/mol, the bound model internal energy is −25.6 kcal/mol, resulting in a ΔUbinding of −7.84 kcal/mol. As a result, the approximated change in Helmholtz free energy due to binding, ΔFbinding, is −4.36 kcal/mol. Similar to CALP:kCAL01, both binding partners undergo a loss of entropy and reduction in internal energy upon binding.

Overall, these landscapes and thermodynamic calculations suggest that although both CALP:kCAL01 and CALP:iCAL36 undergo a decrease in both entropy and internal energy upon binding, the energetic interactions gained upon binding are less favorable for CALP:iCAL36 than for CALP:kCAL01. This is reflected in the change in internal energy due to binding for each model, with CALP:kCAL01 and CALP:iCAL36 exhibiting ΔUbinding of −9.65 kcal/mol and ΔUbinding of −7.84 kcal/mol, respectively. Although the entropic penalty due to binding is less for CALP:iCAL36, these models predict that kCAL01 binds more tightly to CALP than does iCAL36, with ΔFbinding values of −5.46 kcal/mol and −4.36 kcal/mol, respectively. Despite the fact that our models account for only a subset of biologically relevant flexibility (see Section 2.3), this predicted 1.1 kcal/mol difference in free energy of binding is semi-quantitatively in line with experimentally determined inhibition constants25,26,35 (see Table S1), suggesting that these models are capturing biologically relevant features.
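To put the predicted free energy difference in more familiar terms, the sketch below converts a difference in binding free energy into an implied ratio of association constants using the standard relation ΔF = −RT ln K. This is an illustrative conversion only: the function name `affinity_ratio` is hypothetical, and because the computed values are scaled approximations, the resulting ratio should be read as an order-of-magnitude guide and not as a predicted measurement.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def affinity_ratio(delta_f_a, delta_f_b, temperature=298.0):
    """Ratio K(A)/K(B) of association constants implied by two binding
    free energies (kcal/mol), assuming dF = -RT ln K for each complex."""
    return math.exp(-(delta_f_a - delta_f_b) / (R_KCAL * temperature))

# Computed binding free energies quoted in the text (kcal/mol).
delta_f_kcal01 = -5.46
delta_f_ical36 = -4.36

ratio = affinity_ratio(delta_f_kcal01, delta_f_ical36)
print(f"ddF = {delta_f_kcal01 - delta_f_ical36:.2f} kcal/mol, implied ratio = {ratio:.1f}")
```

Under these assumptions, a 1.1 kcal/mol difference at 298 K corresponds to a ratio of about 6.4 in favor of kCAL01.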

These results suggest that side-chain conformational entropy played a role in the design for improved binding efficiency of kCAL01, an observation that supports related work by Head-Gordon and co-workers.81 Previous studies24 suggested that backbone flexibility may play a role in CALP:peptide binding, but backbone conformational entropy was not addressed in this study. Investigation into the effects of backbone flexibility on predicted energy landscapes and binding thermodynamics for both kCAL01 and CALP is a promising avenue for future work. In particular, future directions include modeling backbone flexibility with the CATS59 and DEEPER90 algorithms in OSPREY, focusing on the CALP α2 helix, β1–β2 loop, and β2–β3 loop. Investigation of the backbone flexibility of iCAL36 and kCAL01 would also be valuable, especially given that iCAL36 contains a Pro P−4, whereas kCAL01 contains a Gln P−4.

3.3. Design Model Corresponds Closely with Bound Crystal Structure.

In this section, we briefly compare the design model reported in ref 35 with 6OV7 to determine which key structural features contributed to a successful kCAL01 design. We perform structural comparisons for both the CALP:kCAL01 design output model (defined as the ensemble of structures output from the K* algorithm38 in OSPREY) and the CALP:CFTR design input model (defined as the structural input to OSPREY). A schematic diagram depicting these definitions is shown in Figure 6. We begin by comparing gross features, and then we proceed to identify shared side-chain energetic and dynamic interactions.

To determine the accuracy of the structural design model reported in ref 35, we compared the crystal structure of CALP:kCAL01 to the ensemble of 100 low-energy structures that comprise the CALP:kCAL01 design output model35 (Figure 7). We aligned members of the design output ensemble to the CALP:kCAL01 crystal structure using the main chain of secondary-structure elements α2 and β2 and obtained good alignment quality, with one representative structure aligning with an RMSD of 0.94 Å. Deviations were primarily a result of a more relaxed hydrophobic binding pocket in the design model relative to the CALP:kCAL01 crystal structure 6OV7, involving an outward shift of the β2 strand (Figure 7A). Additionally, the loop connecting β2 and β3 adopts a different conformation, resulting in a large loop shift and a significant change in orientation of the β2 strand (Figure 7A).

We hypothesized that these differences were inherited from the CALP:CFTR design input model, generated during MD refinement24,35 of the bound NMR structure of CALP:CFTR (PDB ID: 2LOB).24 This MD-refined structure was chosen as the design input35 due to the optimization of the β-strand interactions between the CFTR peptide and strand β2.24 In order to test this hypothesis, we aligned both 2LOB NMR model 1 and the design input model to 6OV7 by the main-chain atoms of secondary-structure elements α2 and β2, which revealed good alignment quality, with RMSDs of 0.69 and 0.94 Å, respectively. Overall, 2LOB is more relaxed than the bound CALP:kCAL01 structure, with slight outward shifts and changes in angle in both helix α2 and strand β2 that result in an apparent expansion of the hydrophobic pocket that interacts with peptide residue P0 (Figure S2). However, the loop connecting β2 and β3 shows good correspondence between the CALP:CFTR NMR structure and the CALP:kCAL01 crystal structure (Figure S2). The change in loop conformation and significant reorientation of strand β2 is a result of MD refinement and does not appear in the unrefined NMR structure (2LOB), the CALP:kCAL01 crystal structure (6OV7), or indeed the CALP:iCAL36 structure (4E34). Therefore, we conclude that deviations in CALP conformation observed in the design output model of ref 35 were inherited from MD refinement of a relaxed NMR structure.

Nonetheless, this NMR-based design model captured key structural and ensemble properties of the CALP:kCAL01 complex,91 which allowed OSPREY to design kCAL01, the most binding-efficient inhibitor of CALP to date. Almost all members of the design ensemble predict favorable interactions between Arg P−1 and His311 (Figure 7B). Additionally, all three rotamers of Val P0 (t, p, and m) appear in the design output ensemble, indicating that the K* algorithm successfully identified sequences with multiple low-energy states (Figure 7D). A significant subset of the design ensemble captures the important hydrogen bond between His341 and Thr P−2, indicating that the design of kCAL01 captured key components of the class 1 PDZ binding geometry28 (Figure 7C). Interestingly, some members of the design output ensemble do not contain the hydrogen bond between His341 and Thr P−2, consistent with observations of hydrogen bond breaking and reformation from experimental92 and MD simulation93–96 studies. Indeed, solution NMR studies of ubiquitin that measured 3-bond scalar couplings showed that threonine residues occupy multiple rotameric states.82 Overall, the crystal structure and design ensemble are similar both quantitatively and qualitatively, despite differences in CALP structure, indicating that the design presented in ref 35 succeeded in capturing important structural and ensemble interactions. We conclude that these features allowed OSPREY to design kCAL01, the most binding-efficient inhibitor of CALP to date with rescue activity for F508del-CFTR, a disease-associated variant present in approximately 90% of CF patients. Key interactions and entropic effects predicted by the OSPREY design model are supported by the new crystal structure and landscape analysis presented herein.

3.4. PDZ Domain Energy Landscapes.

Modeling of energy landscapes complements traditional structural analysis of CALP:peptide crystal structures and provides a novel way to probe the conformational distribution available to the protein complex. We submit that these tools may prove useful for analyzing PDZ:peptide complexes in general. To investigate differences in conformational distributions for PDZ domains, we computed partition functions and energy landscapes for 10 structures of bound PDZ:peptide complexes. A preliminary analysis of these landscapes reveals a general trend of loss of entropy upon binding for both PDZ and peptide ligands, similar to that observed for CALP:kCAL01 and CALP:iCAL36. Additionally, these supplementary energy landscapes predict no significant conformational heterogeneity for any studied system at the peptide position P0 in the bound state, contrasting with the heterogeneity we observed for CALP:kCAL01 at Val P0. This raises the intriguing possibility that, similarly to kCAL01 for CALP, more binding-efficient inhibitors for other PDZ domains could be designed by maximizing relative entropy at P0. We include these energy landscapes for the scientific community in SI Section S1.3 in the hope that these insights and data may be of further benefit.

4. CONCLUSION

In this work we investigated the basis for the binding efficiency of kCAL01, an OSPREY-designed peptide inhibitor of CALP that rescued functional CFTR activity as assessed by in vitro Ussing chamber assays.35 On the basis of structure and energy landscape analysis of the new crystal structure of CALP:kCAL01, we conclude that the comparative binding efficiency of kCAL01 stems from entropic effects at P0 and substitutions that result in more favorable energetic interactions at modulator residues. This conclusion is supported not only by comparative analysis of the CALP:kCAL01 and CALP:iCAL36 crystal structure conformations, but also by investigation of energy landscapes for each ensemble model. We used energy landscape analysis enabled by the MARK* algorithm63 in OSPREY to provably approximate the energies of all conformations in each ensemble model, generated by assigning flexibility to residues in each crystal structure. These landscapes probed local residue conformational heterogeneity and enabled us to approximate binding thermodynamics and thereby correctly predict that kCAL01 binds more tightly to CALP than does iCAL36. We conclude that modeling of energy landscapes complemented traditional structural analysis of CALP:peptide crystal structures and provided a novel way to probe the conformational distribution available to the protein complex. Modeling energy landscapes may prove useful for analyzing PDZ:peptide complexes in general, and hence, we provide energy landscapes for 10 additional bound PDZ:peptide complexes. Finally, we show that our successful design of kCAL01 was a result of effective modeling of both energetic and ensemble properties of CALP:peptide binding.

ACKNOWLEDGMENTS

The authors thank Carrie Ann Davison and Dr. Mark Spaller for kCAL01 peptide synthesis; Dr. Terrence Oas, Drs. Jane and Dave Richardson, Nathan Guerin, Hong Niu, and all members of the Donald lab for helpful discussion and comments; the NIH (R01-GM078031, R01-GM118543 to B.R.D.; R01-DK101541, P20-GM113132, P30-DK117469, T32-GM008704 to D.R.M.) for funding; and the Stanford Synchrotron Radiation Lightsource (SSRL) staff, especially Irimpan I. Mathews. Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515. The SSRL Structural Molecular Biology Program is supported by the DOE Office of Biological and Environmental Research, and by the National Institutes of Health, National Institute of General Medical Sciences.

Footnotes

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.9b07278.

Supplementary figures and tables of structures (Figures S1–S3), previous biochemical characterization (Table S1), computational design flexibility (Table S2), and energy landscapes (Figures S4–S15) (PDF)

The authors declare the following competing financial interest(s): B.R.D. and J.D.J. are founders of Gavilan Biodesign, Inc. All other authors declare no conflicts of interest.

REFERENCES

  • (1) Diella F; Haslam N; Chica C; Budd A; Michael S; Brown NP; Trave G; Gibson TJ Understanding eukaryotic linear motifs and their role in cell signaling and regulation. Front. Biosci 2008, 13, 6580.
  • (2) Fanning AS; Anderson JM Protein–protein interactions: PDZ domain networks. Curr. Biol 1996, 6, 1385–1388.
  • (3) Lee H-J; Zheng JJ PDZ domains and their binding partners: structure, specificity, and modification. Cell Commun. Signal 2010, 8, 8.
  • (4) Harris BZ; Lim WA Mechanism and role of PDZ domains in signaling complex assembly. J. Cell Sci 2001, 114, 3219–3231.
  • (5) Kim E; Sheng M PDZ domain proteins of synapses. Nat. Rev. Neurosci 2004, 5, 771–781.
  • (6) Piserchio A; Salinas GD; Li T; Marshall J; Spaller MR; Mierke DF Targeting specific PDZ domains of PSD-95: structural basis for enhanced affinity and enzymatic stability of a cyclic peptide. Chem. Biol 2004, 11, 469–473.
  • (7) Lee H-J; Wang NX; Shi D-L; Zheng JJ Sulindac inhibits canonical Wnt signaling by blocking the PDZ domain of the protein Dishevelled. Angew. Chem. Int. Ed 2009, 48, 6448–52.
  • (8) Babault N; Cordier F; Lafage M; Cockburn J; Haouz A; Prehaud C; Rey F; Delepierre M; Buc H; Lafon M; et al. Peptides targeting the PDZ domain of PTPN4 are efficient inducers of glioblastoma cell death. Structure 2011, 19, 1518–1524.
  • (9) Reiners J; Nagel-Wolfrum K; Jürgens K; Märker T; Wolfrum U Molecular basis of human Usher syndrome: Deciphering the meshes of the Usher protein network provides insights into the pathomechanisms of the Usher disease. Exp. Eye Res 2006, 83, 97–119.
  • (10) Liu W; Wen W; Wei Z; Yu J; Ye F; Liu C-H; Hardie RC; Zhang M The INAD scaffold is a dynamic, redox-regulated modulator of signaling in the Drosophila eye. Cell 2011, 145, 1088–1101.
  • (11) Thorsen TS; Madsen KL; Rebola N; Rathje M; Anggono V; Bach A; Moreira IS; Stuhr-Hansen N; Dyhring T; Peters D; et al. Identification of a small-molecule inhibitor of the PICK1 PDZ domain that inhibits hippocampal LTP and LTD. Proc. Natl. Acad. Sci. U. S. A 2010, 107, 413–8.
  • (12) Humbert P; Russell S; Richardson H Dlg, Scribble and Lgl in cell polarity, cell proliferation and cancer. BioEssays 2003, 25, 542–553.
  • (13) Shepherd TR; Klaus SM; Liu X; Ramaswamy S; DeMali KA; Fuentes EJ The Tiam1 PDZ domain couples to Syndecan1 and promotes cell-matrix adhesion. J. Mol. Biol 2010, 398, 730–746.
  • (14) Guggino WB; Stanton BA New insights into cystic fibrosis: molecular switches that regulate CFTR. Nat. Rev. Mol. Cell Biol 2006, 7, 426–436.
  • (15) Cheng J; Moyer BD; Milewski M; Loffing J; Ikeda M; Mickle JE; Cutting GR; Li M; Stanton BA; Guggino WB A Golgi-associated PDZ domain protein modulates cystic fibrosis transmembrane regulator plasma membrane expression. J. Biol. Chem 2002, 277, 3520–3529.
  • (16) Cheng J; Wang H; Guggino WB Modulation of mature cystic fibrosis transmembrane regulator protein by the PDZ domain protein CAL. J. Biol. Chem 2004, 279, 1892–1898.
  • (17) Bergbower E; Boinot C; Sabirzhanova I; Guggino W; Cebotaru L The CFTR-associated ligand arrests the trafficking of the mutant ΔF508 CFTR channel in the ER contributing to cystic fibrosis. Cell. Physiol. Biochem 2018, 45, 639–655.
  • (18) Worldwide Survey of the ΔF508 Mutation: Report from the Cystic Fibrosis Genetic Analysis Consortium; 1990; Vol. 47, pp 354–359.
  • (19) Population variation of common cystic fibrosis mutations. Hum. Mutat 1994, 4, 167–177.
  • (20) Cheng SH; Gregory RJ; Marshall J; Paul S; Souza DW; White GA; O’Riordan CR; Smith AE Defective intracellular transport and processing of CFTR is the molecular basis of most cystic fibrosis. Cell 1990, 63, 827–834.
  • (21) Welch WJ Role of quality control pathways in human diseases involving protein misfolding. Semin. Cell Dev. Biol 2004, 15, 31–38.
  • (22) Swiatecka-Urban A; Brown A; Moreau-Marquis S; Renuka J; Coutermarsh B; Barnaby R; Karlson KH; Flotte TR; Fukuda M; Langford GM; et al. The short apical membrane half-life of rescued ΔF508-cystic fibrosis transmembrane conductance regulator (CFTR) results from accelerated endocytosis of ΔF508-CFTR in polarized human airway epithelial cells. J. Biol. Chem 2005, 280, 36762–36772.
  • (23) Wolde M; Fellows A; Cheng J; Kivenson A; Coutermarsh B; Talebian L; Karlson K; Piserchio A; Mierke DF; Stanton BA; et al. Targeting CAL as a negative regulator of ΔF508-CFTR cell-surface expression: an RNA interference and structure-based mutagenic approach. J. Biol. Chem 2007, 282, 8099–8109.
  • (24) Piserchio A; Fellows A; Madden DR; Mierke DF Association of the cystic fibrosis transmembrane regulator with CAL: structural features and molecular dynamics. Biochemistry 2005, 44, 16158–16166.
  • (25) Vouilleme L; Cushing PR; Volkmer R; Madden DR; Boisguerin P Engineering peptide inhibitors to overcome PDZ binding promiscuity. Angew. Chem. Int. Ed 2010, 49, 9912–9916.
  • (26) Amacher JF; Cushing PR; Bahl CD; Beck T; Madden DR Stereochemical determinants of C-terminal specificity in PDZ peptide-binding domains: a novel contribution of the carboxylate-binding loop. J. Biol. Chem 2013, 288, 5114–26.
  • (27) Amacher JF; Cushing PR; Brooks L; Boisguerin P; Madden DR Stereochemical preferences modulate affinity and selectivity among five PDZ domains that bind CFTR: comparative structural and sequence analyses. Structure 2014, 22, 82–93.
  • (28) Songyang Z; Fanning AS; Fu C; Xu J; Marfatia SM; Chishti AH; Crompton A; Chan AC; Andersen JM; Cantley LC Recognition of unique carboxyl-terminal motifs by distinct PDZ domains. Science 1997, 275, 73–77.
  • (29) Cushing PR; Vouilleme L; Pellegrini M; Boisguérin P; Madden DR A stabilizing influence: CAL PDZ inhibition extends the half-life of ΔF508-CFTR. Angew. Chem. Int. Ed 2010, 49, 9907–9911.
  • (30) Cushing PR; Fellows A; Villone D; Boisguerin P; Madden DR The relative binding affinities of PDZ partners for CFTR: a biochemical basis for efficient endocytic recycling. Biochemistry 2008, 47, 10084–98.
  • (31) Basdevant N; Weinstein H; Ceruso M Thermodynamic basis for promiscuity and selectivity in protein-protein interactions: PDZ domains, a case study. J. Am. Chem. Soc 2006, 128, 12766–12777.
  • (32) Gerek ZN; Keskin O; Ozkan SB Identification of specificity and promiscuity of PDZ domain interactions through their dynamic behavior. Proteins: Struct., Funct., Genet 2009, 77, 796–811.
  • (33) Petit CM; Zhang J; Sapienza PJ; Fuentes EJ; Lee AL Hidden dynamic allostery in a PDZ domain. Proc. Natl. Acad. Sci. U. S. A 2009, 106, 18249–54.
  • (34) Lee JH; Park H; Park SJ; Kim HJ; Eom SH The structural flexibility of the Shank1 PDZ domain is important for its binding to different ligands. Biochem. Biophys. Res. Commun 2011, 407, 207–212.
  • (35) Roberts KE; Cushing PR; Boisguerin P; Madden DR; Donald BR Computational design of a PDZ domain peptide inhibitor that rescues CFTR activity. PLoS Comput. Biol 2012, 8, No. e1002477.
  • (36) Abad-Zapatero C; Metz JT Ligand efficiency indices as guideposts for drug discovery. Drug Discovery Today 2005, 10, 464–469.
  • (37) Hallen MA; Martin JW; Ojewole A; Jou JD; Lowegard AU; Frenkel MS; Gainza P; Nisonoff HM; Mukund A; Wang S; et al. OSPREY 3.0: open-source protein redesign for you, with powerful new features. J. Comput. Chem 2018, 39, 2494–2507.
  • (38) Lilien RH; Stevens BW; Anderson AC; Donald BR A novel ensemble-based scoring and search algorithm for protein redesign and its application to modify the substrate specificity of the gramicidin synthetase A phenylalanine adenylation enzyme. J. Comput. Biol 2005, 12, 740–761.
  • (39) Qian Z; Xu X; Amacher JF; Madden DR; Cormet-Boyaka E; Pei D Intracellular delivery of peptidyl ligands by reversible cyclization: discovery of a PDZ domain inhibitor that rescues CFTR activity. Angew. Chem. Int. Ed 2015, 54, 5874–5878.
  • (40) Tzeng SR; Kalodimos CG Protein activity regulation by conformational entropy. Nature 2012, 488, 236–240.
  • (41) Gilson MK; Given JA; Bush BL; McCammon JA The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys. J 1997, 72, 1047–1069.
  • (42) Sciretti D; Bruscolini P; Pelizzola A; Pretti M; Jaramillo A Computational protein design with side-chain conformational entropy. Proteins: Struct., Funct., Genet 2009, 74, 176–191.
  • (43) Georgiev I; Keedy D; Richardson JS; Richardson DC; Donald BR Algorithm for backrub motions in protein design. Bioinformatics 2008, 24, i196–i204.
  • (44) Chen C-Y; Georgiev I; Anderson AC; Donald BR Computational structure-based redesign of enzyme activity. Proc. Natl. Acad. Sci. U. S. A 2009, 106, 3764–3769.
  • (45) Silver NW; King BM; Nalam MNL; Cao H; Ali A; Reddy GSKK; Rana TM; Schiffer CA; Tidor B Efficient computation of small-molecule configurational binding entropy and free energy changes by ensemble enumeration. J. Chem. Theory Comput 2013, 9, 5098–5115.
  • (46) Georgiev I; Lilien RH; Donald BR The minimized dead-end elimination criterion and its application to protein redesign in a hybrid scoring and search algorithm for computing partition functions over molecular ensembles. J. Comput. Chem 2008, 29, 1527–1542.
  • (47) Donald BR Algorithms in Structural Molecular Biology; MIT Press: Cambridge, MA, 2011.
  • (48) Karplus M; Ichiye T; Pettitt BM Configurational entropy of native proteins. Biophys. J 1987, 52, 1083–5.
  • (49) Dill K; Bromberg S Molecular Driving Forces: Statistical Thermodynamics in Biology, Chemistry, Physics, and Nanoscience; CRC Press, 2012.
  • (50) Lovell SC; Word JM; Richardson JS; Richardson DC The penultimate rotamer library. Proteins: Struct., Funct., Genet 2000, 40, 389–408.
  • (51) Traoré S; Allouche D; André I; de Givry S; Katsirelos G; Schiex T; Barbe S A new framework for computational protein design through cost function network optimization. Bioinformatics 2013, 29, 2129–36.
  • (52) Viricel C; Simoncini D; Schiex T; Barbe S Guaranteed Weighted Counting for Affinity Computation: Beyond Determinism and Structure. The 22nd International Conference on Principles and Practice of Constraint Programming; 2016.
  • (53) Simoncini D; Allouche D; de Givry S; Delmas C; Barbe S; Schiex T Guaranteed discrete energy optimization on large protein design problems. J. Chem. Theory Comput 2015, 11, 5980–9.
  • (54) Georgiev I; Lilien RH; Donald BR Improved pruning algorithms and divide-and-conquer strategies for dead-end elimination, with application to protein design. Bioinformatics 2006, 22, e174–e183.
  • (55) Dahiyat BI; Mayo SL De novo protein design: fully automated sequence selection. Science 1997, 278, 82–7.
  • (56) Leach AR; Lemon AP Exploring the conformational space of protein side chains using dead-end elimination and the A* algorithm. Proteins: Struct., Funct., Genet 1998, 33, 227–39.
  • (57) Chazelle B; Kingsford C; Singh M A semidefinite programming approach to side chain positioning with new rounding strategies. INFORMS J. Comput 2004, 16, 380–392.
  • (58) Hallen MA; Gainza P; Donald BR Compact representation of continuous energy surfaces for more efficient protein design. J. Chem. Theory Comput 2015, 11, 2292–2306.
  • (59) Hallen MA; Donald BR CATS (coordinates of atoms by Taylor series): protein design with backbone flexibility in all locally feasible directions. Bioinformatics 2017, 33, i5–i12.
  • (60) Lee C; Subbiah S Prediction of protein side-chain conformation by packing optimization. J. Mol. Biol 1991, 217, 373–388.
  • (61) Kuhlman B; Baker D Native protein sequences are close to optimal for their structures. Proc. Natl. Acad. Sci. U. S. A 2000, 97, 10383–8.
  • (62) Leaver-Fay A; Tyka M; Lewis SM; Lange OF; Thompson J; Jacak R; Kaufman K; Renfrew PD; Smith CA; Sheffler W; et al. ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol 2011, 487, 545–74.
  • (63) Jou JD; Holt GT; Lowegard AU; Donald BR Minimization-aware recursive K* (MARK*): a novel, provable algorithm that accelerates ensemble-based protein design and provably approximates the energy landscape. J. Comput. Biol In press.
  • (64) Amacher JF; Cushing PR; Weiner JA; Madden DR Crystallization and preliminary diffraction analysis of the CAL PDZ domain in complex with a selective peptide inhibitor. Acta Crystallogr., Sect. F: Struct. Biol. Cryst. Commun 2011, 67, 600–3.
  • (65) Amacher JF; Zhao R; Spaller MR; Madden DR Chemically modified peptide scaffolds target the CFTR-associated ligand PDZ domain. PLoS One 2014, 9, No. e103650.
  • (66) Kabsch W XDS. Acta Crystallogr., Sect. D: Biol. Crystallogr 2010, 66, 125–32.
  • (67) McCoy AJ; Grosse-Kunstleve RW; Adams PD; Winn MD; Storoni LC; Read RJ Phaser crystallographic software. J. Appl. Crystallogr 2007, 40, 658–674.
  • (68) Adams PD; Afonine PV; Bunkóczi G; Chen VB; Davis IW; Echols N; Headd JJ; Hung L-W; Kapral GJ; Grosse-Kunstleve RW; et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr., Sect. D: Biol. Crystallogr 2010, 66, 213–21.
  • (69) Emsley P; Lohkamp B; Scott WG; Cowtan K Features and development of Coot. Acta Crystallogr., Sect. D: Biol. Crystallogr 2010, 66, 486–501.
  • (70) Chen VB; Arendall WB; Headd JJ; Keedy DA; Immormino RM; Kapral GJ; Murray LW; Richardson JS; Richardson DC MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr., Sect. D: Biol. Crystallogr 2010, 66, 12–21.
  • (71) Gainza P; Roberts KE; Donald BR Protein design using continuous rotamers. PLoS Comput. Biol 2012, 8, e1002335.
  • (72) Roberts KE; Donald BR Improved energy bound accuracy enhances the efficiency of continuous protein design. Proteins: Struct., Funct., Genet 2015, 83, 1151–1164.
  • (73) Ojewole AA; Jou JD; Fowler VG; Donald BR BBK* (branch and bound over K*): a provable and efficient ensemble-based protein design algorithm to optimize stability and binding affinity over large sequence spaces. J. Comput. Biol 2018, 25, 726–739.
  • (74) Reeve SM; Gainza P; Frey KM; Georgiev I; Donald BR; Anderson AC Protein design algorithms predict viable resistance to an experimental antifolate. Proc. Natl. Acad. Sci. U. S. A 2015, 112, 749–54.
  • (75) Gorczynski MJ; Grembecka J; Zhou Y; Kong Y; Roudaia L; Douvas MG; Newman M; Bielnicka I; Baber G; Corpora T; et al. Allosteric inhibition of the protein-protein interaction between the leukemia-associated proteins RUNX1 and CBFβ. Chem. Biol 2007, 14, 1186–1197.
  • (76) Lazaridis T; Karplus M Effective energy function for proteins in solution. Proteins: Struct., Funct., Genet 1999, 35, 133–152.
  • (77) Schrödinger LLC The PyMOL Molecular Graphics System, version 1.8; 2015.
  • (78) Word JM; Lovell SC; LaBean TH; Taylor HC; Zalis ME; Presley BK; Richardson JS; Richardson DC Visualizing and quantifying molecular goodness-of-fit: small-probe contact dots with explicit hydrogen atoms. J. Mol. Biol 1999, 285, 1711–33.
  • (79) Frederick KK; Marlow MS; Valentine KG; Wand AJ Conformational entropy in molecular recognition by proteins. Nature 2007, 448, 325–329.
  • (80) Fleishman SJ; Khare SD; Koga N; Baker D Restricted sidechain plasticity in the structures of native proteins and complexes. Protein Sci 2011, 20, 753–7.
  • (81) Bhowmick A; Sharma SC; Honma H; Head-Gordon T The role of side chain entropy and mutual information for improving the de novo design of Kemp eliminases KE07 and KE70. Phys. Chem. Chem. Phys 2016, 18, 19386–19396.
  • (82) Chou JJ; Case DA; Bax A Insights into the mobility of methyl-bearing side chains in proteins from ³JCC and ³JCN couplings. J. Am. Chem. Soc 2003, 125, 8959–8966.
  • (83) Lang PT; Ng HL; Fraser JS; Corn JE; Echols N; Sales M; Holton JM; Alber T Automated electron-density sampling reveals widespread conformational polymorphism in proteins. Protein Sci 2010, 19, 1420–1431.
  • (84) Van Den Bedem H; Dhanik A; Latombe JC; Deacon AM Modeling discrete heterogeneity in X-ray diffraction data by fitting multi-conformers. Acta Crystallogr., Sect. D: Biol. Crystallogr 2009, 65, 1107–1117.
  • (85) Fraser JS; Van Den Bedem H; Samelson AJ; Lang PT; Holton JM; Echols N; Alber T Accessing protein conformational ensembles using room-temperature X-ray crystallography. Proc. Natl. Acad. Sci. U. S. A 2011, 108, 16247–16252.
  • (86) Keedy DA; Fraser JS; van den Bedem H Exposing hidden alternative backbone conformations in X-ray crystallography using qFit. PLoS Comput. Biol 2015, 11, e1004507.
  • (87) Woldeyes RA; Sivak DA; Fraser JS E pluribus unum, no more: From one crystal, many conformations. Curr. Opin. Struct. Biol 2014, 28, 56–62.
  • (88) Keedy DA; Hill ZB; Biel JT; Kang E; Rettenmaier TJ; Brandão-Neto J; Pearce NM; von Delft F; Wells JA; Fraser JS An expanded allosteric network in PTP1B by multitemperature crystallography, fragment screening, and covalent tethering. eLife 2018, 7, e36307.
  • (89) Qi Y; Martin JW; Barb AW; Thelot F; Yan A; Donald BR; Oas TG Continuous interdomain orientation distributions reveal components of binding thermodynamics. J. Mol. Biol 2018, 430, 3412–3426.
  • (90) Hallen MA; Keedy DA; Donald BR Dead-End Elimination with Perturbations (“DEEPer”): A provable protein design algorithm with continuous sidechain and backbone flexibility. Proteins: Struct., Funct., Genet 2013, 81, 18–39.
  • (91) Schneider M; Fu X; Keating AE X-ray vs. NMR structures as templates for computational protein design. Proteins: Struct., Funct., Genet 2009, 77, 97–110.
  • (92) Gaffney KJ; Piletic IR; Fayer MD Hydrogen bond breaking and reformation in alcohol oligomers following vibrational relaxation of a non-hydrogen-bond donating hydroxyl stretch. J. Phys. Chem. A 2002, 106, 9428–9435.
  • (93) Sessions RB; Gibbs N; Dempsey CE Hydrogen bonding in helical polypeptides from molecular dynamics simulations and amide hydrogen exchange analysis: alamethicin and melittin in methanol. Biophys. J 1998, 74, 138–152.
  • (94) Bondar A-N; White SH Hydrogen bond dynamics in membrane protein function. Biochim. Biophys. Acta, Biomembr 2012, 1818, 942–50.
  • (95) Kaiser A; Ismailova O; Koskela A; Huber SE; Ritter M; Cosenza B; Benger W; Nazmutdinov R; Probst M Ethylene glycol revisited: Molecular dynamics simulations and visualization of the liquid and its hydrogen-bond network. J. Mol. Liq 2014, 189, 20–29.
  • (96) Guardia E; Skarmoutsos I; Masia M Hydrogen bonding and related properties in liquid water: a Car–Parrinello molecular dynamics simulation study. J. Phys. Chem. B 2015, 119, 8926–8938.
