Abstract
Intrinsically disordered regions (IDRs) are common and important functional domains in many proteins. However, IDRs are difficult to target for drug development due to the lack of defined structures that would facilitate the identification of possible drug-binding pockets. Galectin-3 is a carbohydrate-binding protein whose overexpression has been implicated in a wide variety of disorders, including cancer and inflammation. Apart from its carbohydrate-recognition/binding domain (CRD), Galectin-3 also contains a functionally important disordered N-terminal domain (NTD) that contacts the C-terminal domain (CTD) and could be a target for drug development. To overcome the challenges that the lack of structure and the highly dynamic nature of the NTD pose for inhibitor design, we used a protocol combining nuclear magnetic resonance data from recombinant Galectin-3 with accelerated molecular dynamics (MD) simulations. This approach identified a pocket in the CTD with which the NTD makes frequent contact. In accordance with this model, mutation of residues L131 and L203 in this pocket caused loss of Galectin-3 agglutination ability, signifying the functional relevance of the cavity. In silico screening was used to design candidate inhibitory peptides targeting the newly discovered cavity, and experimental testing of only three of these yielded one peptide that inhibits the agglutination promoted by wild-type Galectin-3. NMR experiments further confirmed that this peptide indeed binds to a cavity in the CTD, not within the actual CRD. Our results show that it is possible to apply a combination of MD simulations and NMR experiments to precisely predict the binding interface of a disordered domain with a structured domain, and furthermore to use this predicted interface for designing inhibitors. This procedure can potentially be extended to many other targets in which similar IDR interactions play a vital functional role.
Significance
Numerous carbohydrate-binding proteins including Galectin-3 contribute significantly to important pathologies but are difficult drug targets. Galectin-3 contains an N-terminal end without defined structure that is essential for its function. Designing inhibitors of this domain using computational methods is challenging because it rapidly switches between structurally different conformations under physiological conditions. We combined long-timescale accelerated molecular dynamics with existing structural data to predict the binding interface of the unstructured N-terminal domain with a domain in the protein with defined structure. We were then able to design a peptide in silico and show that it inhibits Galectin-3 function in tissue cultured cells. Many other proteins that contain regions without defined structure could be drug targeted using similar approaches.
Introduction
Many multifunctional proteins contain one or more domains with no clearly defined three-dimensional structure (1). Although such intrinsically disordered regions (IDRs) are generally small (less than 100 amino acids), they are surprisingly abundant and have important functions in the proteins that contain them: Afanasyeva et al. (2), who analyzed such structures, identified 6600 human proteins containing IDRs. The lack of a higher-order structure in IDRs allows such domains to be extremely flexible, and most IDR-containing proteins are known to functionally engage in protein and RNA/DNA interactions (2).
Galectin-3 can be described as a carbohydrate-binding protein, but this does not adequately capture its highly diverse cellular roles: it has been recovered at many different subcellular locations, including the nucleus, the cytoplasm (at the ER-mitochondrial interface (3), in spindle poles (4), and associated with lysosomes and autosomes (5)), in membrane-less cytoplasmic ribonucleotide-protein (RNP) particles (6,7) as well as bound to the cell surface, and secreted into the extracellular space, including peripheral blood. More than 300 proteins can form complexes with Galectin-3 in hematopoietic stem cells and peripheral blood mononuclear cells (8), and Galectin-3 has been implicated in numerous pathologies ranging from heart disease and diabetes to cancer (9,10).
The N-terminal end of Galectin-3 contains an IDR of around 80 amino acids, with the C-terminal domain (CTD) consisting of two faces (Fig. 1 A). The S face is the moiety that recognizes and binds to specific glycoproteins and includes the carbohydrate-recognition/binding domain (CRD). The function of the CRD has been studied in most detail on the surface of cells, which are covered by a dense layer of carbohydrate-containing biomolecules, including glycoproteins and glycolipids. At that location, extracellular Galectin-3 regulates signal transduction strength of glycoprotein receptors through its multimerization and crosslinking activity, resulting in intermolecular and intercellular lattice complex formation (11).
Figure 1.
NMR-guided Galectin-3 conformer generation. (A) Description of the key domains in the Galectin-3 structure. (B) Comparison of the experimental CSDs to those calculated from the AMD structural ensemble after filtering using the chemical shift data. (C) Conformations from the AMD simulations were clustered by structural similarity; for each cluster, the average number of NTD-CTD contacts (error bar shows the range of contacts) and RMSD from experimental CSDs are plotted against each other; clusters that show high NTD-CTD contacts and low RMSD with experimental CSDs are circled. (D) The two major NTD structural ensembles that agree with the experimental NMR data are shown; NTD residues that form major contacts with the CTD in these ensembles (Y36 and Y45) are highlighted. (E) Expanded view of the interface between the NTD and the CTD showing major CTD residues in contact with the NTD; the residues are colored by their peak intensities in the experimental CSD plot shown below. Green, strong intensity; magenta, medium intensity. To see this figure in color, go online.
For a Figure360 author presentation of this figure, see https://doi.org/10.1016/j.bpj.2022.10.008.
Increased Galectin-3 expression correlates with many different disease states, including inflammation and cancer, and for some of these a direct cause-effect relationship has also been demonstrated using knockout models. Thus, the ability to inhibit the protein is viewed as an important goal with the ultimate objective to therapeutically target Galectin-3 in different diseases (12,13). To this end, efforts have mainly focused on the CRD: because the structure of the CTD has been determined, and the interactions of the CRD with glycans have been well described, many carbomimetics that interfere with the ability of Galectin-3 to bind to glycoprotein targets have been reported (14,15). TD139 is a Galectin-3 inhibitor in this category (16) that is being tested as an inhaled drug in clinical trials for idiopathic pulmonary fibrosis (17). However, some carbomimetics may have unfavorable pharmacokinetic properties (18) and, as reviewed in (19), there are currently few examples of glycan-directed therapies that have transitioned to clinical use. This may also be due to challenges relating to shallow solvent-exposed binding surfaces, the scarcity of hydrophobic residues for ligand contact, and the low residence time of bound inhibitors when lectins bind to their carbohydrates.
The N-terminal domain (NTD) of Galectin-3 also appears to have a critically important contribution to its function (20) and was recently shown to mediate protein multimerization (21). Removal of this domain yields a CTD Galectin-3 protein with dominant negative activity (22,23,24). The CTD of Galectin-3 also contains a domain that is not the main site of direct carbohydrate recognition/binding called the F face. Ippel et al. (25) showed that the NTD interacts transiently with the CTD F face and characterized this interaction in more detail. Moreover, Lin et al. (26) reported that the disordered NTD including amino acids 20–100 forms a fuzzy complex with β strand regions of the F face. Importantly, the NTD mediates liquid-liquid phase separation of Galectin-3 (21,27), which could explain its contribution to forming membrane-less structures such as cytoplasmic ribonucleotide-protein (RNP) particles.
The NMR-based chemical shift differences (CSDs) measured by Ippel et al. (25) between full-length and CTD-only Galectin-3 provided information about the dynamics, but these are averaged values and do not inform on individual structures. However, it is likely that the Galectin-3 IDR will adopt an ensemble of structurally diverse conformations that transition in the picosecond to millisecond timescale under physiological conditions, which poses serious challenges to the application of computational methods. Here, we have approached the general problem of IDR characterization using accelerated molecular dynamics (AMD) combined with existing structural data of Galectin-3 to predict the binding interface of the CTD with the IDR. The CTD binding/N-terminal interface, as observed in the AMD simulations, includes a diverse ensemble of structures in which multiple amino acid motifs between residues 20 and 100 of Galectin-3 engage with the CTD. We show that these structures collectively explain the NMR data from Ippel et al. (25) and agree with the fuzzy complex model of IDR interaction. In silico-designed peptides based on the interacting N-terminal motifs were then used to validate the model predicted by AMD. The process described here could be used to economically target other IDR interactions with proteins or protein domains with defined structures.
Materials and methods
Molecular modeling and MD simulations
We retrieved the human Galectin-3 CTD crystal structure from the Protein Data Bank (PDB: 6FOF) (28). The NTD was added as a random chain using Modeller (29). The full-length structure was subjected to 100 ns of MD simulation in an implicit solvent environment (igb = 8) using AMBER (version 18, University of California, San Francisco) (30,31). For the implicit solvent simulations, the protein was parameterized using the AMBER FF14SBonlysc force field (32). To improve sampling of the NTD backbone, the simulation was performed at an elevated temperature of 450 K. To avoid denaturing the folded domain, distance restraints were added to all Cα atom pairs of the CTD that were within 7.5 Å of each other. The resulting protein conformations were then clustered by backbone RMSD and the mean radius of gyration was calculated for each cluster. We selected a representative structure from the cluster whose mean radius of gyration was closest to the experimentally measured one for Galectin-3 (26). This structure was used as the starting conformation for the AMD simulations. The starting structure for the AMD was solvated in explicit water and ions were added to neutralize the net charge. The system was parameterized using the a99sb-disp force field, which has been shown to perform well with both folded and disordered proteins (33). Since Galectin-3 consists of both a folded and a disordered domain, this force field is a suitable choice. Further, hydrogen mass repartitioning (HMR) was implemented in order to use a 4-fs time step to accelerate the simulations further (34). HMR has previously been reported in molecular dynamics studies of IDRs and the results compared favorably with experimental properties such as radius of gyration (35,36). The system was first heated at constant volume from 0 to 310 K over 30 ns with harmonic restraints applied to the protein heavy atoms. Then the system was equilibrated for 50 ns in the NPT ensemble, while the heavy atom restraints were gradually reduced to zero. Finally, the system was equilibrated for a further 50 ns unrestrained.
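In AMD, the potential energy of the system is modified by adding a boost factor, where the value of the boost, ΔV(r) (r: coordinate vector of atomic positions; V(r): unmodified potential energy), is determined based on the equation below:

$$\Delta V(\mathbf{r}) = \begin{cases} \dfrac{\left(E - V(\mathbf{r})\right)^{2}}{\alpha + E - V(\mathbf{r})}, & V(\mathbf{r}) < E \\ 0, & V(\mathbf{r}) \geq E \end{cases} \tag{1}$$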
Here, E is a threshold below which the boost is applied, and the value of α determines the depth and smoothness of the modified potential energy wells. We used the dual boost AMD protocol, which applies separate boost potentials to the net potential energy and the protein dihedral potential. In our simulations, the values of the E and α parameters were determined using the guidelines followed in (37). First, a short 50 ns regular MD simulation was carried out to estimate the average potential energy of the system and that of the protein dihedrals (−882,845 kcal/mol and 3,039 kcal/mol, respectively). Based on these values and the system size (290,304 atoms and 252 protein residues), the AMD parameters were set as follows: E(tot) = −836,400 kcal/mol, α(tot) = 46,449 kcal/mol and E(dihed) = 4,042 kcal/mol, α(dihed) = 202 kcal/mol.
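The parameter choice above can be illustrated with a minimal sketch. It assumes the standard AMD boost form, (E − V)²/(α + E − V) below the threshold and zero otherwise, and it assumes rule-of-thumb coefficients of 0.16 kcal/mol per atom and 4 kcal/mol per residue; these coefficients are not stated in the text and are used here only because they roughly reproduce the reported values. The function and argument names are hypothetical.

```python
def amd_boost(v, e, alpha):
    """Return the AMD boost (kcal/mol) for a potential energy v.

    Assumes the standard form: (e - v)^2 / (alpha + e - v) below the
    threshold e, and zero at or above it.
    """
    if v >= e:
        return 0.0
    return (e - v) ** 2 / (alpha + e - v)


def amd_parameters(v_tot_avg, v_dih_avg, n_atoms, n_residues,
                   per_atom=0.16, per_residue=4.0):
    """Estimate dual-boost thresholds and alphas from a short plain MD run.

    per_atom and per_residue (kcal/mol) are ASSUMED rule-of-thumb
    coefficients, not values taken from the text.
    """
    alpha_tot = per_atom * n_atoms
    alpha_dih = per_residue * n_residues / 5.0
    return {
        "E_tot": v_tot_avg + alpha_tot,
        "alpha_tot": alpha_tot,
        "E_dih": v_dih_avg + per_residue * n_residues,
        "alpha_dih": alpha_dih,
    }


# Averages and system size as reported in the text.
params = amd_parameters(-882_845.0, 3_039.0, n_atoms=290_304, n_residues=252)
```

Under these assumptions, the two α values round to the reported 46,449 and 202 kcal/mol, while the two thresholds land within a few kcal/mol of the reported values; the small differences may reflect rounding of the reported averages.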
The application of boost potential during AMD enables efficient exploration of the conformational space but leads to a biased structural ensemble. Theoretically, it would be possible to remove this bias introduced in AMD by reweighting based on the recorded values of the boost potential for each simulation frame. However, in practice it is nearly impossible, beyond very small molecules such as alanine dipeptide, due to the amplified noise in the energy terms when converted to exponentials (Boltzmann factors) (38). This issue led to the introduction of a refined AMD approach, the Gaussian accelerated MD (GAMD) (39). In GAMD, practical reweighting is possible through the use of a cumulant expansion technique, although this necessitates proper choice of reaction coordinates. In our system, GAMD proved unstable even after testing multiple adjustments of the boost parameters, whereas classical AMD was stable. Hence, instead of using reweighting, we decided to make use of published NMR data to refine the structural ensemble obtained through AMD.
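The noise problem with exponential reweighting can be made concrete with a small sketch. It converts per-frame boost energies into normalized Boltzmann factors and reports the Kish effective sample size; that diagnostic is an assumption introduced here for illustration and is not part of the described protocol. The boost values in the usage note are hypothetical.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def reweighting_weights(boosts, temperature=310.0):
    """Normalized exp(dV / kT) weights for boost energies in kcal/mol."""
    beta = 1.0 / (KB * temperature)
    top = max(boosts)  # subtract the maximum to avoid overflow
    raw = [math.exp(beta * (dv - top)) for dv in boosts]
    total = sum(raw)
    return [r / total for r in raw]


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)
```

For example, `effective_sample_size(reweighting_weights([0.0, 0.0, 0.0, 20.0]))` is close to 1: a single frame with a boost 20 kcal/mol higher than the others carries essentially all the weight, which is the behavior described above for larger systems.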
Five independent AMD simulations at 310 K (NPT ensemble), each lasting 250 ns, were performed using the GPU (graphics processing unit)-accelerated AMBER software package (40). Most of these trajectories were consistent in terms of the radius of gyration distribution (Fig. S1 A) and the number of NTD-CTD contacts (Fig. S1 B), although run 5 sampled slightly more compact conformations than the other four runs. Notably, the radii of gyration observed during MD (19–21 Å) are smaller than the experimentally derived value for the NTD-CTD complex using small-angle X-ray scattering (28–29 Å) (26). This suggests that the Galectin-3 conformations sampled during MD are more compact than those observed experimentally, despite the use of a force field that is tailored toward IDRs. Comparing the most frequent NTD-CTD contacts among the five runs revealed a heterogeneous contact landscape where each trajectory explores unique contacts, while a few core contacts, such as Y107-Y247, F5-I250, and D3-R129, are populated in multiple trajectories (Fig. S1 C). The Galectin-3 NTD conformations resulting from the five simulations were clustered by backbone dihedral RMSD, and, for each cluster, the average number of NTD-CTD contacts was determined. Residue pairs for which the Cα atoms were within 8.5 Å were defined as contacts. We also calculated the average per-residue CSD for each cluster using the software SHIFTX2 (41). To do so, we calculated the chemical shifts for the full-length Galectin-3 and for the CTD only, the latter by truncating the NTD region. The CSD was then obtained as the difference between the two shifts, as in Ippel et al. (25). Finally, the RMSD between the calculated and experimental CSDs was determined for each cluster and plotted against the number of NTD-CTD contacts (Fig. 1 C). The clusters with the lowest RMSD as well as with an average number of NTD-CTD contacts greater than five (circled clusters in Fig. 1 C) were combined to obtain a conformational ensemble with significant NTD-CTD interactions that is in agreement with experimental NMR data (Fig. 1 B).
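The two per-cluster quantities used above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used for the study: coordinates are plain (x, y, z) tuples in Å, "within 8.5 Å" is taken as less than or equal to the cutoff, and the function names are hypothetical.

```python
import math


def count_contacts(ntd_ca, ctd_ca, cutoff=8.5):
    """Count NTD-CTD residue pairs whose C-alpha atoms lie within cutoff (A)."""
    return sum(
        1
        for a in ntd_ca
        for b in ctd_ca
        if math.dist(a, b) <= cutoff
    )


def csd_rmsd(calculated, experimental):
    """Root-mean-square deviation between calculated and experimental CSDs."""
    if len(calculated) != len(experimental):
        raise ValueError("CSD lists must have the same length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))
```

Averaging `count_contacts` over the frames of a cluster and evaluating `csd_rmsd` on the cluster-averaged CSDs gives the two axes of a plot like Fig. 1 C.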
It is important to note in this context that methods such as SHIFTX2 are trained based on available protein crystal structures and hence are likely to show higher error when applied to disordered regions such as the NTD. However, SHIFTX2 has been applied previously to derive CSDs of disordered proteins from MD simulations and compare them with experimental data (42). Here, we applied SHIFTX2 for analyzing the CSDs of the CTD residues only, rather than the full-length Galectin-3. Even in deconvoluting the NTD ensemble, our analysis focused on parts of the NTD that interact with the CTD. In the bound state, these NTD regions are likely to be less dynamic than in the free state and more amenable to CSD predictions using methods such as SHIFTX2. Notably, the chemical shift predictions using SHIFTX2 for the CTD residues show reasonable agreement with the experimental values (Fig. S2 A–D) (41,43,44,45). Further, analyzing CSDs rather than the chemical shifts themselves may lead to less uncertainty due to error cancellation. Nevertheless, in the absence of prediction methods that are specifically tailored toward IDRs, the comparison of MD-derived CSDs to experimental data may arguably remain a challenging area.
Bayesian maximum entropy method
The details of the Bayesian maximum entropy (BME) approach are described in (46). Briefly, the weights for the AMD-derived protein ensemble ($w_1 \ldots w_n$, where $n$ is the total number of conformations) were obtained by minimizing the cost function $\mathcal{L}(w_1 \ldots w_n) = \frac{m}{2}\chi^2(w_1 \ldots w_n) - \theta S_{\mathrm{rel}}(w_1 \ldots w_n)$, where $\chi^2 = \frac{1}{m}\sum_{i=1}^{m} \left(\sum_{j=1}^{n} w_j F_i(x_j) - F_i^{\mathrm{EXP}}\right)^2 / \sigma_i^2$ is the agreement between calculated and experimental CSDs and $S_{\mathrm{rel}} = -\sum_{j=1}^{n} w_j \log\left(w_j / w_j^0\right)$ is the entropy relative to starting weights. Here, $x_j$ denotes the set of protein coordinates for the jth conformation, $F_i(x_j)$ represents the CSD of the ith residue calculated using SHIFTX2, and $F_i^{\mathrm{EXP}}$ is the experimental CSD for the ith residue. $m$ is the number of residues for which experimental CSDs are available. Initially, all conformations were assigned the same weight $w^0$, where $w^0 = 1/n$. $\sigma_i$ denotes the uncertainty of SHIFTX2 in calculating the 1H and 15N CSDs from structure and is obtained from (41). $\theta$ is an adjustable parameter that determines the tradeoff between the entropy and the agreement with experiments. Using the relationship $w_j^* = \frac{1}{Z}\, w_j^0 \exp\left[-\sum_{i=1}^{m} \lambda_i^* F_i(x_j)\right]$ (where $\lambda_i^*$ represents a set of Lagrange multipliers, one for each CSD, $w_j^*$ and $w_j^0$ represent the optimal and initial weights of the jth frame, respectively, and $Z$ is a normalization factor to ensure that the weights add up to 1) and assuming that the error involved in estimating the CSDs from SHIFTX2 follows a Gaussian distribution, a modified objective function $\Gamma(\lambda)$ can be derived, $\Gamma(\lambda) = \log Z(\lambda) + \sum_{i=1}^{m} \lambda_i F_i^{\mathrm{EXP}} + \frac{\theta}{2}\sum_{i=1}^{m} \lambda_i^2 \sigma_i^2$. Given a certain $\theta$, this objective function can be minimized to first derive the set of optimal $\lambda$ values and then the weights from the derived $\lambda$s. As explained in (46), this step allows the optimization to be performed in the smaller Lagrange multiplier space (corresponding to 226 CSDs in our case), as opposed to the space of $w_j$ (corresponding to 17,363 MD frames), thereby facilitating convergence. Starting with equal initial weights assigned to each MD frame, the optimization was performed using the L-BFGS-B method (47), where the gradient of $\Gamma(\lambda)$ was analytically derived from its algebraic expression. The optimal value of $\theta$ (0.08) was determined by performing the optimization for different values of $\theta$ and locating the elbow of the $\chi^2$ versus $S_{\mathrm{rel}}$ curve, as suggested by (46) (Fig. S3). The Pearson’s correlation coefficient between the experimental CSDs and those from the reweighted MD ensemble using the optimized weights was 0.76. The optimization was carried out using the “stats” package in R. Comparing the CSDs of the CTD residues derived from the BME ensemble with those from experiments, we found a reasonable concordance (Fig. S2).
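The optimization in Lagrange multiplier space can be sketched in a few lines. This is a toy illustration, not the R implementation used in the study: it assumes the usual BME form of the objective, Γ(λ) = log Z(λ) + Σ λᵢFᵢ(EXP) + (θ/2) Σ λᵢ²σᵢ², and it swaps L-BFGS-B for plain fixed-step gradient descent to stay dependency-free. All names, the step size, and the toy data are hypothetical.

```python
import math


def weights_from_lambda(lam, F, w0):
    """w_j proportional to w0_j * exp(-sum_i lam_i * F[j][i]), normalized."""
    logw = [
        math.log(w0[j]) - sum(l * f for l, f in zip(lam, F[j]))
        for j in range(len(F))
    ]
    top = max(logw)  # shift for numerical stability
    raw = [math.exp(v - top) for v in logw]
    z = sum(raw)
    return [r / z for r in raw]


def ensemble_average(w, F):
    """Weighted average of each observable over all frames."""
    return [sum(w[j] * F[j][i] for j in range(len(F))) for i in range(len(F[0]))]


def bme_reweight(F, f_exp, sigma, theta, w0=None, lr=0.5, steps=2000):
    """Minimize Gamma(lambda) by gradient descent (stand-in for L-BFGS-B)."""
    n, m = len(F), len(f_exp)
    w0 = w0 if w0 is not None else [1.0 / n] * n
    lam = [0.0] * m
    for _ in range(steps):
        avg = ensemble_average(weights_from_lambda(lam, F, w0), F)
        # dGamma/dlam_i = F_exp_i - <F_i>_w + theta * sigma_i^2 * lam_i
        grad = [f_exp[i] - avg[i] + theta * sigma[i] ** 2 * lam[i] for i in range(m)]
        lam = [lam[i] - lr * grad[i] for i in range(m)]
    return weights_from_lambda(lam, F, w0), lam


# Toy ensemble: three frames, one observable (hypothetical values).
F = [[0.0], [1.0], [2.0]]
weights, lam = bme_reweight(F, f_exp=[1.5], sigma=[0.5], theta=1.0)
```

In this toy case the reweighted average moves from the unweighted value of 1.0 toward the target of 1.5 without reaching it, because the θ term penalizes departure from the initial weights. For a real ensemble with hundreds of observables, a quasi-Newton optimizer such as L-BFGS-B, as used in the study, would be the more sensible choice.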
Choice of initial weights
Conventionally, in applying the BME method to biased MD ensembles, it is recommended to set the initial weights to the reweighting factors derived from the boost potentials applied to each frame during AMD. However, in our case, the reweighting factors exhibit too much noise, with only a few conformations showing very high weights and the rest having near-zero weights (see also the discussion under section “molecular modeling and MD simulations”). Therefore, using the AMD reweighting factors as initial weights in BME was not feasible. To verify whether using equal initial weights for all conformations could influence the BME convergence, we performed the BME procedure starting with randomly assigned initial weights, instead of the same weight for all conformations. We tested two different initial weight distributions, uniform-random and bimodal, as shown in Fig. S3 C and F. In the bimodal distribution, most conformations were assigned a weight that is lower than average, while a small fraction of the conformations received higher weights, which is typically the scenario in AMD. In both cases, the choice of the initial weights did not significantly affect the results obtained for different values of θ (Fig. S3 D and G) or the converged BME weights (Fig. S3 E and H, r² = 0.97–0.99). Therefore, we conclude that omitting the correction of the AMD bias by reweighting the initial ensemble is unlikely to have affected the outcome of the BME calculation, since the convergence of the BME function in our system is insensitive to the initial weight distribution.
Peptide mutation scanning
From the AMD simulations, we selected four templates (two centered around Y36 and two around Y45), as described in the “results” section. The following are the sequences of the NTD fragments corresponding to the four templates: Y36_cls3/Y36_cls5: A1GAGGY6PGASY11, Y45_cls1/Y45_cls70: S1YPGAY6PGQAP11. In each template, all sequence positions excluding three residues centered around Y6 were mutated to all 20 amino acids, including alternative forms of histidine (protonated and unprotonated). The mutants were sorted by the Δaffinity score (48) (defined as the difference in binding free energy between the mutant and the wild-type template sequences, which represents the improvement in affinity of the mutant peptide over the starting NTD fragment) and the sequence positions frequently appearing at the top, and the corresponding mutated amino acids were analyzed (Figs. S4–S7 A; Table 1). Thus, three positions for Y36_cls3, four positions for Y36_cls5, four positions for Y45_cls1, and five positions for Y45_cls70 were identified as most promising for mutagenesis. These single mutations (82 total for the four templates) were then combined into double, triple, quadruple, and, in the case of Y45_cls70, quintuple mutants. In total, 7400 multiple mutants were scanned. From each template, the top 20–25 mutants were visualized to assess sidechain packing, burying of hydrophobic residues, and desolvation energy. We selected two of the best mutants from each template for further stability verification using MD (Figs. S4–S7 B).
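The step that combines single mutations into multiple mutants can be sketched as below, using the Y36_cls3 template and its selected single mutants from Table 1. This is an illustration only: the text does not spell out how combinations were generated, so the sketch makes no attempt to reproduce the reported total of 7400 multiple mutants, and the function names are hypothetical.

```python
from itertools import combinations, product

TEMPLATE = "AGAGGYPGASY"  # Y36_cls3 template sequence
# Selected single mutants per position (1-based), as listed in Table 1.
SINGLES = {2: "WMFRTYHISL", 4: "IRMFLV", 8: "RFYM"}


def apply_mutations(template, mutations):
    """Return the template with {position: residue} substitutions applied."""
    seq = list(template)
    for position, residue in mutations.items():
        seq[position - 1] = residue  # positions are counted from 1
    return "".join(seq)


def multi_mutants(template, singles, min_order=2):
    """Enumerate every combination of single mutants at distinct positions."""
    positions = sorted(singles)
    mutants = []
    for order in range(min_order, len(positions) + 1):
        for combo in combinations(positions, order):
            for residues in product(*(singles[p] for p in combo)):
                mutants.append(apply_mutations(template, dict(zip(combo, residues))))
    return mutants


candidates = multi_mutants(TEMPLATE, SINGLES)
```

For this template the enumeration yields 124 double and 240 triple mutants. One of the doubles, `apply_mutations(TEMPLATE, {2: "R", 4: "M"})`, gives ARAMGYPGASY, which matches the peptide-1 sequence listed in the Peptides section.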
Table 1.
Selected mutations from the single-mutant scanning step shown for each peptide template
| Template | Sequence | Selected single mutants (position: mutation) |
|---|---|---|
| Y36_cls3 | A1GAGGY6PGASY11 | 2: W, M, F, R, T, Y, H, I, S, L; 4: I, R, M, F, L, V; 8: R, F, Y, M |
| Y36_cls5 | A1GAGGY6PGASY11 | 2: R, Y, H, M, W, I, F; 4: R, H, F, M, W, K; 8: W, H, I; 9: M, F, Y, R, P, L, H |
| Y45_cls1 | S1YPGAY6PGQAP11 | 3: R, M; 4: R; 8: L, I, M, F, R, H, S, P, Y, E, D; 10: M, F, R, I, L, H |
| Y45_cls70 | S1YPGAY6PGQAP11 | 1: N, M, Y, Q, W, HIE; 8: R, Q, Y, V, HIP, W, M; 9: R; 10: Y, F; 11: R, F |
The amino acid position is counted from the left of the template sequence by designating the first position as 1.
Cells, culture and agglutination assay
LAX56 human precursor B acute lymphoblastic leukemia (pre-B ALL) cells were routinely co-cultured with mitomycin-C inactivated OP9 stromal cells. These previously described primary leukemia cells grew directly out from a relapse bone marrow sample (49,50). For agglutination assays, cells were harvested, washed once in α-MEM medium, resuspended in 10 mL of X-VIVO 15 medium (Lonza, Walkersville, MD), and incubated at 37°C for 24 h to remove the Galectin-3 produced by OP9 stromal cells. For the assay, acute lymphoblastic leukemia (ALL) cells were resuspended in X-VIVO 15 medium at a concentration of 1 × 10⁶/mL and seeded into wells at 2 × 10⁵ cells/200 μL. Glutathione S-transferase (GST) or GST-Galectin-3 (150 μg/mL, Fig. 4; 25 μg/mL, Fig. 5) was added in 300 μL of X-VIVO 15 medium to wells. Peptides, if included, were preincubated for 5 min with the recombinant proteins and added at different concentrations as indicated in the figures. TD139 was purchased from MedChemExpress (Monmouth Junction, NJ) and used at 100 μM. Phase contrast images were taken after 1–2 h. Agglutination was defined as aggregates containing >10 cells per cluster. Between five and 13 images from different areas were taken and evaluated for cell clusters per condition. Biological data were graphed with GraphPad Prism software (San Diego, CA, version 8.3.1). Values represent mean ± SEM of the number of aggregates scored per independent image.
Figure 4.
Peptide-3 inhibits agglutination of human leukemia cells mediated by Galectin-3. Representative bright-field images of suspensions of pediatric pre-B acute lymphoblastic leukemia LAX56 cells. Cells were incubated for 2 h (A) with no added protein (B) with control GST, or (C–G) with GST-Gal3. (D) Addition of 100 μM TD139 Galectin-3 inhibitor. (E–H) Peptides as indicated were added together with GST or GST-Gal3. GST-Gal3 was used at 150 μg/mL. Similar results were obtained with two independently generated batches of GST-Gal3; representative images are shown. (I) Quantification of agglutination expressed as the number of cell aggregates. Agglutination is defined as aggregates containing >10 cells per cluster. Error bars, mean ± SEM of cell cluster counts from five to 13 images from different areas per condition. ∗∗∗∗p < 0.0001, one-way ANOVA, Gal-3 agglutination compared with Gal3 + P1, TD139, or P3. All samples were processed in a single experiment. To see this figure in color, go online.
Figure 5.
Site-directed mutagenesis of Galectin-3 shows combined L131 and L203 are essential for agglutination. (A–U) Bright-field images of LAX56 cells incubated with the recombinant fusion proteins indicated above the panels, all used at a concentration of 25 μg/mL. Peptides P1 or P3 added at 100 μg/mL together with the recombinant proteins are also noted above the images. (V) Quantitation of cellular aggregation under the indicated conditions. Error bars, mean ± SEM. Separate graphs below show 1) the comparison between the GST-Gal3 agglutination and that of the different mutants; ∗∗∗∗p < 0.0001, one-way ANOVA, multiple comparisons. 2) The effect of P3 on the agglutination of each GST-Gal3 protein; ∗p < 0.05; ∗∗∗p < 0.001, one-way ANOVA, pairwise comparisons between GST-Gal3 and mutants or between GST-Gal3 (mutants) alone or incubated with 100 μg/mL peptide-3. To see this figure in color, go online.
GST-Galectin-3 and mutants
Full-length human GST-Galectin-3 (hereafter named GST-Gal3) in pGEX2T was previously described (51). To generate mutants, we used Takara (San Jose, CA) online primer design tools and a Takara In-Fusion Snap Assembly Kit (catalog no. 638945) to generate mutations according to the manufacturer’s instructions. DNAs run on agarose gels were purified using a Thermo Scientific (Irwindale, CA) GeneJET Gel Extraction Kit (catalog no. K0691). In-Fusion reactions (Takara) were assembled and Stellar competent cells (Takara) used for transformation. All constructs were verified by DNA sequencing (Eton Bioscience, San Diego, CA).
Galectin-3 CTD construct for NMR
The Galectin-3 CTD construct was generated using the same methods described above for the mutants. The protein includes Galectin-3 amino acids P117-I250 as well as residual attached glycine and serine residues after thrombin cleavage. Single colonies were grown overnight in LB medium, collected by centrifugation, then inoculated in M9 medium with ammonium-15N chloride (catalog no. 299251 Sigma-Aldrich, Burlington, MA) and grown for 3–4 h. After induction of protein production with IPTG for an additional 3–4 h, cells were harvested and suspended in 1% NP40, PI, PMSF, 1 mM DTT, 50 mM Tris-HCl, pH 7.5. Cells were disrupted by sonication. GST-Galectin-3 was bound to glutathione-agarose (catalog no. L00207 Genscript, Piscataway, NJ) overnight at 4°C. Beads were washed four times in lysis buffer, then suspended in 50 mM Tris-HCl pH 7.5, 0.1 mM DTT and treated with 60 U thrombin/mL (Cytiva Thrombin Protease, catalog no. 45-001-320 Fisher Scientific, Pittsburgh, PA) for 16 h at RT. The supernatant containing Galectin-3 protein was treated with benzamidine Sepharose (HiTrap Benzamidine FF, Sigma-Aldrich, catalog no. GE17-5143-02) to bind and remove thrombin. Protein was concentrated using an Amicon 3K filter and used in 20 mM potassium phosphate buffer pH 6.8, 0.1 mM DTT for NMR. Protein concentrations were determined by BCA.
Ippel et al. (25) reported that, in their NMR experiments, agglutination happens at concentrations above 20 μM and is mediated by the F face of the CTD. Lin et al. (26) and Chiu et al. (21,27), however, reported that the CTDs do not self-associate; Lin et al. (26) using paramagnetic relaxation enhancement experiments found no obvious intensity bleaching in the CTD, indicating that intermolecular interactions between the CTDs were negligible. Chiu et al. stated that, “galectin-3 does not agglutinate in the absence of the NTD” and reported no self-association between CTDs. Here, in our final NMR studies, protein was used at a starting concentration of 13.8 μM. Under these conditions we did not see formation of aggregates.
NMR experiments and data analysis
1H-15N HSQC experiments were carried out at 30°C on a 700 MHz Bruker AvanceIII (Billerica, MA) with a TXI-triple resonance cryoprobe. The 20 μM Galectin-3 CTD was prepared in 20 mM potassium phosphate buffer, pH 6.8, 0.1 mM DTT, and complexes with peptide-3 were prepared at the molar ratios indicated in Fig. 6. The data were processed and analyzed using NMRPipe (52) and NMRFAM-SPARKY (53). The chemical shift perturbation (CSP), in units of hertz, was calculated from the nitrogen and proton CSDs between free 15N-CTD and 15N-CTD in the mixture with P3 peptide. Assignments are based on the Galectin-3 CTD NMR data of Ippel et al. (25) and Umemoto and Leffler (54). NTD sequences in our construct are slightly different from theirs, causing limited misassignment in the NTD, and ambiguity in residues 240–248 due to their close contact with the short β strand in the slightly different NTDs. Because L135 and W181 patterns differ between Ippel et al. and Umemoto and Leffler, their assignments could not be unambiguously determined. In addition, the position of T137 differs between Ippel et al. and Umemoto and Leffler, and position changes of T248 make its assignment unclear. However, none of these residues appears to be involved in the interaction with P3 peptide, since their chemical shift perturbations were quite small, except for residue T248, with a CSP of about 10.9 Hz and one unit of RMSD.
Figure 6.
Residues in the Galectin-3 CTD with significant chemical shift perturbations through interaction with peptide-3 (P3). (A) The 1H-15N HSQC spectrum in blue was acquired on free 15N-labeled Galectin-3 CTD. The spectrum in red was acquired on the complex between 15N-labeled CTD and peptide-3 with a molar ratio of 100:1 between P3 peptide and 15N-labeled CTD. Some residues with significant chemical shift perturbations are labeled, including two side chains from Q201 and N214. (B) Selected overlay of the 1H-15N HSQC spectrum region of 15N-labeled CTD versus titration of P3 peptide. Residues with notable chemical shift changes are labeled together with the cross-peak moving direction, as indicated by the arrow, with increased concentration of P3 peptide. Spectrum in black is free CTD; spectra in red, green, blue, and magenta are from complexes with molar ratios of 20:1, 40:1, 60:1, and 100:1 between P3 peptide and Galectin-3 CTD, respectively. (C) Chemical shift perturbation of 15N-CTD in complex with P3 peptide versus primary sequence of residues 117–250. The chemical shift changes between free 15N-CTD and 15N-CTD in complex with a 100-fold molar excess of P3 peptide are presented. The thin horizontal line indicates the limit above which values of 15N-CTD in complex with P3 peptide are two times the RMSD of the CSP of free 15N-CRD. Residues V138 and E205 are color coded in blue since they shifted apart in the complex from the overlaid cross peak in free 15N-CTD, and their CSP values could be swapped. To see this figure in color, go online.
Peptides
Peptides were purified by high-performance liquid chromatography (HPLC). These included peptide-1 ACE-ARAMGYPGASY-NH2, peptide-2 ACE-ARAFGYPIYSY-NH2, and peptide-3 ACE-YYPGAYPRRYR-NH2, where ACE denotes N-terminal acetylation and NH2 denotes C-terminal amidation. Peptide-4 was the Galectin-3 inhibitory peptide ANTPCGPYTHDCPVKR (G3-C12), described in Zou et al. (55) as targeting the CTD, and peptide-5 was the scrambled negative control peptide PTHVTCKYCPAGNRDP (G3-H12s) described in the same study. Neither peptide-4 nor peptide-5 had an effect on Galectin-3-mediated agglutination (Fig. S8 G and H).
Results
Derivation of the Galectin-3 NTD ensemble
An enhanced MD method called accelerated MD (AMD) uses energy rescaling to increase the probability of observing long-timescale transitions (potentially on the order of milliseconds) that are beyond the reach of conventional MD (56). Therefore, we applied AMD to the problem of conformational sampling of the Galectin-3 NTD, which is an IDR. Starting from the initial Galectin-3 structure, where the CTD was modeled based on an existing crystal structure and the NTD was modeled as a random polymer chain, AMD was used to generate the initial conformational ensemble consisting of 17,000 NTD conformations. For each of these conformations, the corresponding chemical shifts were predicted using SHIFTX2 (41), for both the full-length protein and the CTD alone. The CSDs were then calculated from the chemical shift differences of the 15N-labeled backbone nitrogen and hydrogen atoms between full-length and CTD-only Galectin-3 (Fig. 1 B, top panel). The NTD conformations were clustered by their structural similarity (detailed in the “materials and methods” section) and, for each cluster, the root-mean-square deviation (RMSD) from the experimental NMR CSDs (25) was calculated. The clusters were first filtered by average number of NTD-CTD contacts, retaining those that showed a minimum of five contacts. These clusters were then sorted by increasing chemical shift RMSD, and the five clusters with the lowest RMSD, comprising a total of 1300 conformations, were selected for further processing (highlighted in Fig. 1 C). As shown in Fig. 1 B, there was excellent agreement between the AMD-calculated and experimental CSDs within the filtered ensemble.
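The cluster selection described above (filter by contacts, rank by agreement with the experimental CSDs, keep the best five) can be sketched as follows. This is an illustration only: the `Cluster` record, its field names, and the use of a cluster-averaged predicted CSD profile are assumptions made for the example, not the authors' implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    mean_contacts: float   # average number of NTD-CTD contacts in the cluster
    predicted_csd: list    # hypothetical cluster-averaged CSD per CTD residue
    n_conformations: int

def csd_rmsd(predicted, experimental):
    """RMSD between predicted and experimental per-residue CSD profiles."""
    if len(predicted) != len(experimental):
        raise ValueError("profiles must cover the same residues")
    sq = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(sq / len(predicted))

def select_clusters(clusters, experimental, min_contacts=5, top_n=5):
    """Keep clusters with enough contacts, then return the top_n lowest-RMSD ones."""
    kept = [c for c in clusters if c.mean_contacts >= min_contacts]
    kept.sort(key=lambda c: csd_rmsd(c.predicted_csd, experimental))
    return kept[:top_n]

# Toy data for illustration only.
experimental = [0.10, 0.20]
clusters = [
    Cluster("a", 6.0, [0.10, 0.20], 10),
    Cluster("b", 3.0, [0.10, 0.20], 10),   # too few contacts, filtered out
    Cluster("c", 8.0, [0.50, 0.50], 10),
]
print([c.name for c in select_clusters(clusters, experimental)])
```

Sorting in increasing RMSD order puts the clusters that best reproduce the experimental profile first, which is why the slice takes the head of the list.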
Analysis of the NTD conformations that agreed with the experimental NMR data identified two major classes of NTD-CTD contacts, in which either Y36 or Y45, of all the residues in the NTD, made the longest-lasting contact with a shallow cavity in the CTD, as shown in Fig. 1 D and E. The model predicted that the cavity would encompass candidate contacts including residues L131, F192, F198, K199, Q201, V202, L203, V204, K210, D215, A216, H217, L219, and Q220. These residues, which showed close contact with the NTD in the MD ensemble, also correspond to the strongest peaks in the experimental CSD profile, as shown in Fig. 1 E.
BME approach uncovers diverse CTD-bound NTD conformations
We further investigated the NTD-CTD interaction obtained from AMD using the BME method. Details of the BME approach are given in the “materials and methods” section. In brief, the BME approach tries to achieve agreement between an MD-derived ensemble and available experimental data, while maximizing the information entropy within the obtained ensemble. This leads to a conformational ensemble that maintains its diversity, while still agreeing with experimental data. The BME approach assigns a weight to each conformation, which is proportional to its contribution to the measured experimental property.
By applying the BME approach, and using the per residue CSDs as experimental data, we calculated the weight of each NTD conformation from AMD. The highest weighted conformations were then clustered by dihedral RMSD and, within each cluster, the frequencies of pairwise residue contacts between the NTD and the CTD were obtained. Fig. 2 shows the normalized frequency of each inter-residue contact within the different clusters. Applying the BME approach, we therefore obtained a diverse ensemble, in which, apart from Y36 and Y45, multiple NTD residues have significant interactions with the CTD. The contacts where the CTD residue shows a significant peak in the experimental CSD profile are highlighted in red in the heatmap. Interestingly, we find that multiple NTD residues, notably several aromatic residues, such as W22, Y101, Y41, Y45, Y54, Y70, and Y79, make contact with the CTD in a way that satisfies the NMR data. Also, looking at the pairwise interactions, it appears that, in many cases, multiple NTD residues interact with a single CTD residue in different conformations. Examples of such contacts include Y41/Y45/G47/Q48 (NTD) → D215 (CTD), Y79/A73/T104/T98/P71/P106/Y89/Y54 (NTD) → Y247 (CTD), and A100/G112 (NTD) → T243 (CTD).
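Once each conformation carries a weight, the reweighted contact frequencies and the "several NTD residues to one CTD residue" pattern can be tabulated as in the sketch below. The data layout (one set of residue pairs per conformation) and the normalization by the sum of weights are assumptions for illustration; the weights themselves would come from the BME procedure, which is not reproduced here.

```python
def weighted_contact_frequency(contacts_per_frame, weights):
    """Weight-normalized frequency of each (NTD residue, CTD residue) contact.

    contacts_per_frame: list of sets of (ntd_res, ctd_res) pairs, one per conformation.
    weights: one non-negative weight per conformation.
    """
    if len(contacts_per_frame) != len(weights):
        raise ValueError("need exactly one weight per conformation")
    total = sum(weights)
    freq = {}
    for contacts, w in zip(contacts_per_frame, weights):
        for pair in contacts:
            freq[pair] = freq.get(pair, 0.0) + w / total
    return freq

def ntd_partners_by_ctd_residue(freq, min_freq=0.0):
    """Group NTD residues by the CTD residue they contact (many-to-one view)."""
    partners = {}
    for (ntd_res, ctd_res), f in freq.items():
        if f > min_freq:
            partners.setdefault(ctd_res, []).append(ntd_res)
    return {ctd: sorted(ntd) for ctd, ntd in partners.items()}

# Toy ensemble of three conformations (illustrative, not simulation output).
frames = [
    {("Y41", "D215")},
    {("Y45", "D215")},
    {("Y41", "D215"), ("A100", "T243")},
]
weights = [0.5, 0.25, 0.25]
freq = weighted_contact_frequency(frames, weights)
print(ntd_partners_by_ctd_residue(freq))
```

In the toy data, D215 is contacted by Y41 in some conformations and by Y45 in others, which is the kind of many-to-one pattern described in the text.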
Figure 2.
Contact heatmap of conformations with high BME weights. The colors represent the contact frequencies reweighted according to the BME weights. CTD residues with significant experimental NMR shifts are as indicated with red font below the heatmap. NTD residues making frequent CTD contacts are listed and include A2, A49, A53, A69, D3, F5, G108, G112, G43, G47, G52, G68, G72, H8, P106, P71, Q20, Q48, S84, T98, V78, W22, Y101, Y41, Y45, Y54, Y70, and Y79. To see this figure in color, go online.
Both our original clustering approach and the BME method using existing NMR CSD data identified multiple NTD-CTD contacts that involved a targetable pocket on the F face of the CTD. The involvement of the F face of the CTD in mediating NTD interactions agrees with other independently measured CSD and paramagnetic relaxation enhancement (PRE) profiles (26). We used two approaches to test this experimentally. Mutation of critical residues in that pocket could abolish binding to the IDR and the agglutination activity of Galectin-3. Also, a peptide could potentially fit in the shallow pocket and inhibit the IDR interaction. A classical test for carbohydrate-binding activity of a lectin including Galectin-3 is an agglutination assay (49,51,57). In this assay, recombinant Galectin-3 is tested for its ability to promote lattice formation by binding in a multivalent manner to glycoproteins located on the cell surface: when cell surface glycoprotein targets are located on different cells, carbohydrate binding combined with multimer formation causes cellular agglutination. Such an assay has widespread use for testing Galectin-3 inhibitors (e.g., (21,55)). Thus, we used an agglutination assay in which recombinant Galectin-3 was added to patient-derived pre-B ALL as a readout for Galectin-3 lattice-forming activity.
Computational design of inhibitory peptide sequences
To design inhibitory peptides, a limited number of backbone templates were initially selected based on the ensemble of NTD conformations that showed agreement with NMR. Briefly, the NTD conformations that showed agreement with NMR (Fig. 1 D) were clustered by Cα RMSD and the representative conformations from the most populated clusters were selected for template design. The initial templates were obtained by retaining five amino acids on both sides of Y36 or Y45 in the CTD-bound NTD conformations. The main steps involved in obtaining the peptide templates from the Galectin-3 NTD ensemble are shown in Fig. 3 A and B, and described in more detail below.
Figure 3.
In silico peptide design. (A, B) Steps in generating the (A) initial peptide templates and (B) top peptide candidates starting from the initial templates. (C) The protein-peptide interaction energies of the top eight candidate peptides. (D) Duration of binding to the CTD. To see this figure in color, go online.
We first categorized the 1300 NMR-supported Galectin-3 conformations into two groups, based on whether Y36 or Y45 of the NTD contacts the CTD (see Fig. 1 D). These conformations were then converted into peptide templates by retaining five amino acids on both sides of the Y residue of the NTD and clustered by Cα RMSD (Fig. S9 A and B). From each cluster, the template structure with the lowest binding free energy was retained. When selecting the final templates, we visualized the top structures by binding free energy and the key amino acid interactions with the CTD. We observed that the central Tyr of the NTD segment (Y36/Y45) makes two types of contacts with the CTD: (1) hydrogen bonding of the terminal hydroxyl of the Tyr side chain with the side chain of H217 and (2) packing against the hydrophobic core formed by several leucine and alanine residues in the CTD pocket. Both patterns of contacts were found in the template structures with the lowest binding free energy, as seen in Fig. S9. From each category of conformations (either Y36- or Y45-bound), we selected two templates showing either a hydrogen bond with H217 or hydrophobic packing.
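Retaining five residues on each side of the anchoring tyrosine yields an 11-residue window. A minimal sketch of that slicing step is shown below; the function name is hypothetical, and the demonstration sequence is a placeholder built for the example, not the Galectin-3 NTD sequence.

```python
def window_around(sequence, center_pos, flank=5):
    """Return residues center_pos-flank .. center_pos+flank (1-based, inclusive)."""
    start = center_pos - flank
    end = center_pos + flank
    if start < 1 or end > len(sequence):
        raise ValueError("window runs off the end of the sequence")
    return sequence[start - 1:end]

# Placeholder sequence: glycines with tyrosines at positions 36 and 45.
residues = ["G"] * 60
residues[36 - 1] = "Y"
residues[45 - 1] = "Y"
placeholder = "".join(residues)

for anchor in (36, 45):
    template = window_around(placeholder, anchor)
    print(anchor, template, len(template))
```

The anchoring tyrosine always lands at position 6 of the 11-residue template, which matches the position of the central Y in the candidate sequences listed in Table 2.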
Notably, while we prioritized the templates having the lowest binding free energy, we did not rely on energy as the only selection criterion for choosing the final templates. Alongside energetic stability, we wanted the templates to be as diverse as possible with respect to peptide orientation. In the case of the Y36-bound conformations, this was easily achievable since the top two cluster representatives by energy (cls 5 and cls 3) showed distinctly different peptide orientations within the CTD cavity (Fig. S9 A). However, in the case of the Y45-bound conformations, the top structures by binding free energy showed similar orientations of Y45 (Fig. S9 B). In contrast, cls 70 showed a Y45 orientation that was distinct and buried deep inside the CTD pocket. Additionally, Y45 showed T-shaped pi-stacking with H217, alongside hydrophobic packing with other CTD residues. These structural features indicated that cls 70 would be a promising template for peptide design. Finally, a total of four templates were selected from the Y36- and Y45-mediated NTD-CTD complexes based on binding free energy, interaction pattern of the NTD, and overall packing of the NTD against the CTD, as judged by manual visualization.
Starting from a given peptide template, each residue was systematically mutated to all 20 amino acids and a Δaffinity score (see Peptide mutation scanning under Materials and methods) was calculated using the Maestro software (Schrödinger, NY). Each mutant was energy minimized, followed by binding free energy calculation using Prime MM-GBSA (58). Full details and validity of the Δaffinity score can be found in (48). The top-scoring mutations were analyzed to identify three to five sequence positions in each template that frequently appeared at the top when sorted by the Δaffinity score (Figs. S4–S7 A). These positions were then mutated combinatorially to generate multiple double, triple, and quadruple (quintuple in one template) mutants (Figs. S4–S7 B), and the top mutants by Δaffinity score were analyzed for features such as strong interaction with the CTD hydrophobic cavity, low desolvation energy, and sequence diversity. This step generated eight peptide candidates (two from each template), which were then subjected to 500 ns of all-atom MD simulations in an explicit water environment to test the stability of their binding to the CTD. In addition, the binding free energies were calculated using the MM-PBSA method from the AMBER software package (59) (Table 2; Fig. 3 C). During MD, four of the eight peptides left the CTD cavity within 300 ns and were deemed unstable (Table 2; Fig. 3 D). Among the four that remained bound and also showed strong interaction with the CTD, as measured by the protein-peptide interaction energy and number of hydrogen bonds, one peptide, Y45_cls70_Y1_M8_R9_F10_R11, was very similar in sequence to another peptide in the list and was hence eliminated. The other three peptides were subjected to experimental testing (Table 2).
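The enumeration side of the mutation scan (not the scoring, which relies on Maestro and Prime MM-GBSA) can be sketched in a few lines. The helper names are hypothetical. The label convention used here, a residue letter followed by a 1-based position such as `R8`, is inferred from the peptide names and sequences in Table 2 and is an assumption.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def apply_mutations(sequence, labels):
    """Apply labels such as 'R8' (place R at 1-based position 8)."""
    residues = list(sequence)
    for label in labels:
        new_res, pos = label[0], int(label[1:])
        residues[pos - 1] = new_res
    return "".join(residues)

def single_mutants(sequence):
    """Every single substitution, skipping the residue already present."""
    out = []
    for i, current in enumerate(sequence):
        for aa in AMINO_ACIDS:
            if aa != current:
                out.append(sequence[:i] + aa + sequence[i + 1:])
    return out

def combinatorial_mutants(sequence, candidates):
    """candidates maps position -> residues to try; enumerate all combinations."""
    positions = sorted(candidates)
    out = []
    for choice in product(*(candidates[p] for p in positions)):
        labels = [f"{res}{pos}" for res, pos in zip(choice, positions)]
        out.append(apply_mutations(sequence, labels))
    return out

peptide1 = "ARAMGYPGASY"                       # sequence of peptide-1 (Table 2)
print(apply_mutations(peptide1, ["M2", "R8"]))
print(len(single_mutants(peptide1)))
print(combinatorial_mutants(peptide1, {2: "MR", 8: "GR"}))
```

In a real workflow each enumerated sequence would be passed to the structure-based scoring step; only the bookkeeping is shown here.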
Table 2.
Binding properties for eight candidate peptides calculated from all-atom MD simulations
| Peptide | Stable binding duration (ns) | Binding free energy (kcal/mol) | SEM | Protein-peptide interaction energy (kcal/mol) | Desolvation energy (kcal/mol) | Sequence | No. of stable peptide-protein hydrogen bonds | Experimental designation |
|---|---|---|---|---|---|---|---|---|
| Y36_cls3_M2_M4_R8 | 278 | −25.9 | 0.06 | −85.6 | 59.7 | ACE-AMAMGYPRASY-NH2 | 0 | |
| Y36_cls3_R2_M4∗ | 394 | −46.2 | 0.06 | −140.4 | 94.2 | ACE-ARAMGYPGASY-NH2 | 2 | Peptide 1 |
| Y36_cls5_M2_R4_W8_Y9 | 184 | −22.3 | 0.08 | −64.1 | 41.8 | ACE-AMARGYPWYSY-NH2 | 0 | |
| Y36_cls5_R2_F4_I8_Y9∗ | 500 | −42.3 | 0.06 | −124.5 | 82.2 | ACE-ARAFGYPIYSY-NH2 | 1 | Peptide 2 |
| Y45_cls1_M3_R4_M8_I10 | 172 | −16.2 | 0.06 | −66.2 | 50 | ACE-SYMRAYPMQIP-NH2 | 0 | |
| Y45_cls1_M3_R4_M8_M10 | 91 | −35.5 | 0.2 | −87.8 | 52.3 | ACE-SYMRAYPMQMP-NH2 | 0 | |
| Y45_cls70_Y1_M8_R9_F10_R11 | 500 | −34.4 | 0.5 | −136.6 | 102.2 | ACE-YYPGAYPMRFR-NH2 | 0 | |
| Y45_cls70_Y1_R8_R9_Y10_R11∗ | 500 | −29.5 | 0.04 | −169.3 | 139.8 | ACE-YYPGAYPRRYR-NH2 | 2 | Peptide 3 |
∗These peptides are the best binders according to their duration of binding to the CTD, number of protein-peptide hydrogen bonds, and binding free energy. These peptides were selected for experimental testing and were designated as peptides 1, 2, and 3 in the experimental assays. The last row shows peptide 3, which was found to be a positive hit in the agglutination assay.
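The selection logic applied to Table 2 can be replayed directly from the tabulated values, as in the sketch below. The 300 ns stability cutoff comes from the text. The check that the binding free energy equals the sum of the interaction and desolvation columns is a numerical observation about the table made for this example, not a relation stated by the authors, and the tuple layout is hypothetical.

```python
# (name, stable_ns, dG_bind, interaction, desolvation, h_bonds), values from Table 2
TABLE2 = [
    ("Y36_cls3_M2_M4_R8",           278, -25.9,  -85.6,  59.7, 0),
    ("Y36_cls3_R2_M4",              394, -46.2, -140.4,  94.2, 2),
    ("Y36_cls5_M2_R4_W8_Y9",        184, -22.3,  -64.1,  41.8, 0),
    ("Y36_cls5_R2_F4_I8_Y9",        500, -42.3, -124.5,  82.2, 1),
    ("Y45_cls1_M3_R4_M8_I10",       172, -16.2,  -66.2,  50.0, 0),
    ("Y45_cls1_M3_R4_M8_M10",        91, -35.5,  -87.8,  52.3, 0),
    ("Y45_cls70_Y1_M8_R9_F10_R11",  500, -34.4, -136.6, 102.2, 0),
    ("Y45_cls70_Y1_R8_R9_Y10_R11",  500, -29.5, -169.3, 139.8, 2),
]

def stable_binders(rows, min_ns=300):
    """Names of peptides that stayed bound for at least min_ns."""
    return [name for name, ns, *_ in rows if ns >= min_ns]

def energy_sum_matches(row, tol=0.05):
    """True if interaction + desolvation reproduces the tabulated dG_bind."""
    _, _, dg, interaction, desolvation, _ = row
    return abs(interaction + desolvation - dg) <= tol

print(stable_binders(TABLE2))
print(all(energy_sum_matches(r) for r in TABLE2))
```

Four peptides pass the stability filter; the text then removes Y45_cls70_Y1_M8_R9_F10_R11 for sequence similarity, leaving the three starred entries.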
Peptide testing on pre-B ALL cells
We tested these peptides in the agglutination assay. As shown in Fig. 4 A, without treatment, LAX56 cells appear as a single-cell suspension. When GST alone was added as a negative control (Fig. 4 B), no agglutination was measured, whereas GST-Gal3 (Fig. 4 C) caused cellular agglutination as expected. We used the glycomimetic TD-139 ((60,61) and citations therein) as a positive control (62), and the compound clearly inhibited Galectin-3-mediated agglutination (Fig. 4 D). Two of the three peptides tested, peptide-1 (P1; Fig. 4 E) and peptide-2 (P2; Fig. S8 E), had no effect on agglutination. However, peptide-3 (P3) was clearly inhibitory: there was a dose response, with a correlation between the concentration of P3 and the degree of disruption of Galectin-3-mediated lattice formation (Fig. 4 G and H; quantitation Fig. 4 I).
Site-directed mutagenesis identifies residues important for Galectin-3 agglutination function
The model predicted strong contacts of, among others, Y36 and Y45 in the IDR with amino acids L131, Q201, L203, and H217 in the CTD (Fig. S10). Therefore, these amino acids were mutated to alanine to test their contribution to the Galectin-3 agglutination activity. The assay showed that all single mutants (Fig. 5 G, J, P, S) were still able to agglutinate LAX56 cells. The L131A mutant caused stronger agglutination than wild-type Gal3, whereas the L203A and H217A mutants caused less (Fig. 5 V-1). Peptide-3 was also still able to reduce agglutination mediated by the mutants (Fig. 5 V-2). We also generated an L131A/L203A double mutant. As shown in Fig. 5, this mutant was functionally inactive and failed to agglutinate LAX56 leukemia cells. This identified the combination of L131 and L203 as essential for agglutination.
NMR identifies contacts of peptide-3 with the F face of the CTD
If peptide-3 (P3) inhibits GST-Gal3-mediated agglutination by interfering with the interaction of the IDR with the CTD, P3 would likely make contact with the CTD. We next used NMR to investigate this. We generated a Galectin-3 CTD construct including amino acids 117–250 and used published NMR structure data (25,54,63,64) for assignments of CTD amino acid residues. As shown in Fig. 6, NMR showed that P3 makes extensive contacts with amino acids in the CTD. In a dose-response titration with increasing concentrations of P3, as exemplified in Fig. 6 B, large Δδ shifts were measured for a number of amino acids, such as L203, V204, A212, and L218. Fig. 6 C provides a summary of the CSP measured at the highest molar ratio of P3 to Galectin-3 CTD for the amino acid residues identified. Seven amino acids had a CSP of more than twice the RMSD when exposed to P3. These included residues K210, V211, A212, and V213 in the β8 sheet as well as A216, L218, and L219 in the β9 sheet. Other residues with a CSP of around twice the RMSD included V202, V204, and E205, located in the β7 sheet, and I132 in the β2 sheet. These residues are all located on the F face of the CTD (Fig. S10 A). Residues such as R186, K227, or Y221, which are located in the S face of the CTD, exhibited no shift upon exposure to P3, as shown in Fig. 6 C.
Discussion
Proteins containing IDRs have emerged as major players in various biochemical pathways, creating unprecedented opportunities for drug targeting. However, three main challenges in designing drugs targeting IDRs using structure-based approaches are 1) their lack of well-defined structure, 2) the difficulty in translating experimental structural information into three-dimensional atomic coordinates, and 3) our lack of understanding how conventional drug-design approaches that target specific protein structures can be applied to the ensemble of diverse conformations of an IDR (65,66,67,68). Moreover, designing therapeutics necessitates a detailed mechanistic understanding of IDR dynamics and the interaction with self or other partners.
We here used Galectin-3 as a prototypical IDR-containing protein to explore the potential of designing IDR-directed function-targeting therapeutics. Using the AMD method that efficiently samples the IDR conformational space, and existing NMR data that enable filtering the MD-derived ensemble into an experimentally relevant subset, we identified diverse NTD conformations bound to the CTD. This is a significant development, since the NMR data alone only allowed the identification of the CTD residues that interact with the NTD, but not specific NTD structures that contribute to this interaction.
The in silico exploration of the Galectin-3 conformational ensemble and the subsequent mutagenesis results obtained here indicate a complicated mechanistic picture of the NTD-CTD interaction. Our initial analysis of the AMD-generated ensemble pointed to two residues, Y36 and Y45, as making the most stable contacts with the CTD as well as their region of contact within the CTD. However, site-directed mutagenesis of Y45 alone or combined with Y36 still yielded a Galectin-3 protein capable of agglutinating pre-B ALL cells, which was sensitive to P3 mediated inhibition (Fig. S11). Subsequently, applying the BME approach, we obtained a more diverse structural ensemble consistent with NMR, in which, apart from Y36 and Y45, multiple IDR/NTD residues have significant interactions with the CTD. These observations are entirely consistent with Lin et al. (26), who, using NTD-truncated constructs, concluded that multiple aromatic residues in the NTD interact with the CTD, and with Zhao et al. (21), who mutated 14 prolines in the NTD to show that many of them are involved in the NTD-CTD interaction.
From a thermodynamic point of view, transient interactions of multiple NTD motifs with the CTD (“many-to-one” interactions) may lead to a dynamic structural ensemble with high entropy. We anticipate that this should contribute positively to the stability of the NTD-CTD complex. Stability of protein-protein interactions is determined by the free energy difference ΔG (which is comprised of entropic and enthalpic components) between the bound and the unbound states. In interactions involving folded proteins, the formation of the complex is associated with a large loss of protein configurational entropy that needs to be compensated through enthalpic and solvent entropy gain in order for the complex to be energetically stable (i.e., negative ΔG). With proteins such as Galectin-3, involving interactions with an IDR, entropic loss, relative to the unbound state in which no NTD-CTD interactions are present, can be minimized through many-to-one interactions. Therefore, the enthalpic gain upon binding need not be as strong as in the case of folded protein interactions. We note that it should be possible to estimate the enthalpic and entropic components of the binding free energy of the NTD from MD simulations. However, considering the challenge of proper reweighting of the AMD ensembles, as discussed in the “materials and methods” section, we did not calculate those here. Since many-to-one interactions are hallmarks of many disordered proteins, the above principle applies to other IDR-containing proteins with repeat motifs as well. An elegant discussion of the role of repeat motifs and entropy in IDR interactions can be found in Flock et al. (69).
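The free-energy argument in this paragraph can be summarized with the standard thermodynamic relation below. The split of the entropy change into a configurational and a solvent term follows the wording of the text but is a simplification, and the subscripted symbols are notation introduced here for illustration.

```latex
\Delta G_{\mathrm{bind}} = \Delta H - T\,\Delta S,
\qquad
\Delta S \approx \Delta S_{\mathrm{conf}} + \Delta S_{\mathrm{solv}}
```

A complex is stable when $\Delta G_{\mathrm{bind}} < 0$. If many-to-one interactions keep $\Delta S_{\mathrm{conf}}$ from becoming strongly negative, the term $-T\,\Delta S$ is less unfavorable, so a smaller enthalpic gain (a less negative $\Delta H$) is enough to reach a negative $\Delta G_{\mathrm{bind}}$.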
Using NMR and a functional agglutination assay, we showed, respectively, that peptide-3 interacts with a region in the Galectin-3 F face that includes L131/L203 and that mutation of these residues together abrogates Galectin-3-mediated agglutination. We did not perform experiments that would show that P3 peptide and L131/L203 mutations prevent interactions of the CTD with the NTD. To demonstrate this directly, NMR experiments would need to be done with a CTD-only Galectin-3 protein (wild type and L131/L203 mutant) and add-back of isolated recombinant NTD. The NTD should make contacts with the wild-type CTD but not with the L131/L203 double-mutant CTD. Similarly, one could add peptide-3 to the L131/L203 CTD and to the wild-type CTD. Peptide-3 should not make contact with the L131/L203 CTD.
Our computational analysis did, however, demonstrate that the IDR in the NTD makes contact with the region including L131 and L203. In addition, there is abundant documentation from the literature that removal of the Galectin-3 NTD generates a protein that can no longer cause agglutination. Our AMD simulations and subsequent analysis detected a pocket in the F face of CTD as a possible area of NTD contact and identified L131/L203 as essential for the NTD interaction (Fig. 1 E). L203 was previously shown to be important for the interaction between the NTD and CTD (25), and an L203A Galectin-3 mutant has reduced capacity to form liquid-liquid phase separation droplets (21). In concordance with this, we also found that the L203A single mutant had reduced ability to agglutinate the leukemia cells, although it still retained some activity.
To assess the impact of L131 and L203 on the NTD interaction, we also calculated the average interaction energy of each CTD residue in the F face in the NMR-filtered AMD ensemble. The top five CTD residues showing the lowest interaction energy are shown in Fig. S12. We calculated the interaction energy separately in the two conformational clusters, where either Y36 or Y45 makes contact with the CTD. In both cases, L131 and L203 show up at the top, along with several polar residues, such as H217, Q201, and D215. The impact of polar residues on protein-protein interactions is likely to be small due to competition with the polar solvent. This leaves L131 and L203 as the key hydrophobic residues that contribute to NTD binding, with an interaction energy of −1 to −1.5 kcal/mol. According to previous computational studies on protein-protein interface (PPI) hotspots, an energy contribution stronger (more negative) than −2 kcal/mol is likely to affect binding of partner proteins significantly (70). Here, individually, the energy contributions of the two hydrophobic residues are weaker than −2 kcal/mol, but, together, their contribution is a substantial −2 to −2.5 kcal/mol. The importance of AMD analysis was thus illustrated by the additional identification of the need for cooperativity of L131 with L203 in Galectin-3 to allow this lectin to agglutinate leukemia cells.
Although studies to inhibit Galectin-3 classically focused on blocking the binding of carbohydrate substrates to the recognition domain (14,55,71), galactomannans (72) and PTX008, a calixarene (63), may also target the NTD-CTD interaction. PTX008 contacts V202, K210, V211, and A216 located in the F face of the CTD, which are also contacted by peptide-3 in our study. In contrast to peptide-3, galactomannans and PTX008 may also make some contacts with the S face of the CTD and inhibit Galectin-1, a lectin whose binding targets overlap with those of Galectin-3 (49,73).
Peptide-3 was identified using a completely in silico approach. The computational rational design approach reduced the large number of candidate peptides for experimental screening (20⁸–20¹⁰ possible amino acid combinations) to a very small number. Out of the three predicted peptide candidates, one (P3) showed Galectin-3 targeting activity in the experimental agglutination assay. Therefore, taken together, the computational design workflow presented here has significant predictive power. However, it is not entirely clear why the other two peptides failed to show any effect, although they both showed weaker interaction energy with the CTD compared with P3. Importantly, peptide-3 includes the -PGAY- motif previously shown to be critical in the Galectin-3 NTD-CTD interaction (25). In that study, the PGAY peptide was reported to mainly contact residues G124, F192, Q201, V202, L203, V204, K210, V211, A212, V213, D215, A216, L218, L219, and Q220. According to our NMR data, the CTD residues that show significant chemical shifts in response to peptide-3 binding are I132, V202, V204, E205, K210, V211, A212, V213, A216, L218, and L219 (Fig. S10 A and C). According to the MD simulation, these residues are all located within 5 Å of the predicted binding site of peptide-3. Thus, the contacts made by the PGAY peptide and P3 have a large degree of overlap, but also some differences. However, most of the residues that contact the PGAY peptide are located within 5 Å of peptide-3, with the exception of G124 and Q220. This indicates that the gross binding sites of the two peptides are highly similar. In particular, I132, which is located in the β2 sheet of the CTD, is of interest because mutation of the adjacent L131 combined with L203 abrogated Galectin-3-mediated agglutination, suggesting that the β2 sheet may make a critical contribution to the IDR-CTD interaction.
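The overlap between the two contact lists quoted above can be made explicit with simple set operations. The residue lists are copied from the text; the helper and variable names are hypothetical.

```python
# Residues reported to contact the PGAY peptide (25), as listed in the text.
PGAY_CONTACTS = {
    "G124", "F192", "Q201", "V202", "L203", "V204", "K210", "V211",
    "A212", "V213", "D215", "A216", "L218", "L219", "Q220",
}
# Residues with significant chemical shifts on peptide-3 binding, as listed in the text.
P3_CONTACTS = {
    "I132", "V202", "V204", "E205", "K210", "V211", "A212", "V213",
    "A216", "L218", "L219",
}

def by_residue_number(residues):
    """Sort labels such as 'V202' by their sequence number."""
    return sorted(residues, key=lambda r: int(r[1:]))

shared = by_residue_number(PGAY_CONTACTS & P3_CONTACTS)
p3_only = by_residue_number(P3_CONTACTS - PGAY_CONTACTS)
pgay_only = by_residue_number(PGAY_CONTACTS - P3_CONTACTS)

print("shared:   ", shared)
print("P3 only:  ", p3_only)
print("PGAY only:", pgay_only)
```

Nine residues appear in both lists, two (I132 and E205) are specific to the peptide-3 list, and six appear only in the PGAY list.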
The tissue culture agglutination assay entirely depends on the ability of Galectin-3 to oligomerize. It is noteworthy that, in such experiments, the proteins are added at much lower final overall concentrations than in NMR experiments. We used GST-Gal3 proteins at 150–25 μg/mL, corresponding to an approximate concentration of around 2.8–0.5 μM, respectively, which is well below the 20 μM in vitro oligomerization concentration reported by Ippel et al. as mediating CTD-CTD interactions (25). However, it is highly likely that the interaction of Galectin-3 with glycoproteins in tissue culture experiments will concentrate it in a much smaller volume and that this may lead to oligomerization and possibly liquid-liquid condensation. Although we have not examined this possibility experimentally, Chiu et al. made a very interesting remark in this context: they wrote, “Although galectin-3 only undergoes LLPS [liquid-liquid phase separation] under non-physiological conditions, it agglutinates in the extracellular matrix through interactions with cell surface ligands that locally increase its concentration. This agglutination may involve the same multivalent mechanism as the one that drives LLPS” and “Galectin-3 binding on the surface of LPS [lipopolysaccharide] micelles increases the local protein concentration to levels in the tens of mM range” (13.3 mM estimate in Chiu et al. (27) supplemental note). Thus, local concentrations of Galectin-3 found in vivo could be sufficient to promote agglutination, albeit possibly not through CTD-CTD interactions.
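The mass-to-molar conversion quoted above is a one-line calculation, sketched here. The molecular weight of 53 kDa for the GST-Gal3 fusion is an assumption chosen because it reproduces the quoted micromolar values; it is not stated in the text.

```python
def ug_per_ml_to_um(conc_ug_per_ml, mw_kda):
    """Convert a mass concentration to micromolar.

    1 ug/mL is 1e-3 g/L and 1 kDa is 1e3 g/mol, so the ratio is directly in uM.
    """
    return conc_ug_per_ml / mw_kda

ASSUMED_MW_KDA = 53.0  # hypothetical value for the GST-Gal3 fusion protein

for conc in (150.0, 25.0):
    print(conc, "ug/mL ->", round(ug_per_ml_to_um(conc, ASSUMED_MW_KDA), 1), "uM")
```

Both values fall well below the 20 μM threshold cited from Ippel et al., which is the comparison the paragraph draws.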
It is not exactly clear how the intrinsically disordered region in Galectin-3 works together with its structured C-terminal end to achieve a biological effect such as agglutination. NMR studies have indicated that the interaction of the IDR/NTD with the CTD does not cause a strong conformational shift in critical residues of the S face (25), ruling out a simple allosteric effect in which contact of the IDR with the F face in the CTD would open up the carbohydrate-binding site on the S face, located on the opposite side, and allow entry of the client glycoprotein. Zhao et al. (21) showed that the IDR/NTD did not need to be physically attached to the CTD for Galectin-3 to cause liquid-liquid phase separation of a client glycoprotein, CD45, in vitro. Their data provide evidence for a model in which the S face of the CTD binds to a glycoprotein and leaves the F face to interact with the IDR of other Galectin-3 molecules as a key step in polymerization. Our BME results support such a mechanism of cooperation between the NTD and the CTD by showing the involvement of multiple NTD motifs in recognizing the CTD. This can lead to stable CTD binding by minimizing entropic loss through many-to-one interactions, as discussed previously. Moreover, during oligomerization, when multiple CTDs from different Galectin-3 molecules could be interacting with the NTD, the repeat NTD motifs can support such interactions. In such a scenario, each NTD motif can bind to a different CTD, as opposed to multiple CTDs competing for the same NTD binding site. Importantly, this property could allow a single NTD to interact with multiple F-face pockets in different CTDs. In this model, the switch from one NTD binding to its own CTD to one NTD binding to multiple CTDs could be concentration driven, where low Galectin-3 levels may favor the former, while high protein levels promote the latter interactions.
Conclusion
In the current study, we have addressed the question of whether it is possible to interfere with the interaction of an IDR with a domain of defined structure using Galectin-3 as a test case. Our results show that this is feasible. Because IDRs are enriched in many important proteins that form RNP complexes and membrane-less subcellular compartments such as stress granules, it may be possible to use a strategy similar to the one used here to disperse complexes that they are part of and inhibit their function.
Author contributions
S.B., M.Z., W.H., T.Q., and N.H. designed the studies and analyzed the data. S.B., M.Z., W.H., and T.Q. performed the experiments. N.H. wrote the original draft, which was reviewed and edited by S.B., M.Z., and W.H.
Acknowledgments
This study was supported by NIH RO1 CA172040 to N.H. Research reported in this publication included work performed in the Synthetic and Biopolymer Chemistry, X-ray Crystallography, Integrative Genomics and Computational Therapeutics Cores supported by the National Cancer Institute of the National Institutes of Health under award number P30CA033572. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Kevin Mayo and Hans Ippel are acknowledged for sharing Galectin-3 NMR data. We thank Yuelong Ma (Shared Resources-Synthetic Chemistry) for peptide synthesis and the Nuclear Magnetic Resonance Core for performing the NMR experiments.
Declaration of interests
City of Hope submitted a patent application for the recognition of IDR contacts with ordered domains and the design of peptide-3 in August of 2021. It has not been licensed for production or testing. S.B. and N.H. are listed as inventors. No other competing interests are present for any of the other authors.
Editor: Frauke Graeter.
Footnotes
Supriyo Bhattacharya and Mingfeng Zhang contributed equally to this work.
Supporting material can be found online at https://doi.org/10.1016/j.bpj.2022.10.008.