Fundamental Research. 2025 Jan 28;6(1):53–61. doi: 10.1016/j.fmre.2025.01.013

Decoding intrinsically disordered regions in biomolecular condensates

Minglei Shi a, Zhaoxu Wu a, Yi Zhang a, Tingting Li a,b
PMCID: PMC12869743  PMID: 41647538

Abstract

Biomolecular condensates comprise a diverse array of molecular entities, with intrinsically disordered regions (IDRs) receiving mounting attention due to their pivotal roles. In recent years, significant progress has been made in understanding the linear and conformational molecular grammar of IDRs in biomolecular condensates. This review will focus on the advances in studying IDR conformational ensembles and their relationship to function, with a particular emphasis on molecular dynamics (MD) simulations and the emerging synergy between MD and machine learning (ML) methods. Nevertheless, the inherent flexibility and dynamic nature of IDRs continue to present substantial challenges for conformation analysis. The integration of advanced experimental techniques, computational methods, and evolutionary analysis promises to unveil the conformational mysteries and therapeutic potential of IDRs in condensates.

Keywords: Intrinsically disordered regions, Biomolecular condensates, Molecular grammar, Molecular dynamics, Artificial intelligence


Biomolecular condensates, as dynamic membraneless assemblies within cells, play crucial roles in regulating gene expression, signal transduction, stress responses, and various other key life processes [[1], [2], [3], [4]]. Intrinsically Disordered Regions (IDRs) are parts of proteins that lack a fixed three-dimensional structure [5,6]. While IDRs are neither universally necessary nor sufficient for condensate formation [[7], [8], [9]], it is widely accepted that they play a vital role in this process. Unlike globular proteins, IDRs exhibit highly flexible and dynamic conformational characteristics. This unique structural property endows IDRs with the ability to interact with multiple molecular partners and rapidly adjust their behavior under different environmental conditions, making them ideal drivers for the formation of biomolecular condensates [10,11]. However, this inherent flexibility also poses significant challenges for structural analysis.

In recent years, significant progress has been made in the structural dissection of IDRs, based on the synergistic development of experimental techniques and computational methods [12]. In this review, we will explore the current understanding of IDRs’ composition and sequence characteristics regarding the formation of biomolecular condensates. Then we delve into recent advancements in dissecting higher-order IDR structures, with a particular focus on molecular dynamics (MD) and machine learning (ML) approaches, highlighting their significant contributions to the prediction of IDR conformational ensembles. Given the vital role of evolutionary information in protein structure analysis, we will specifically introduce studies on the conservation of IDR sequences and their conformational ensembles. By examining these interconnected aspects, we aim to provide a comprehensive and holistic view of IDR structure and function, paving the way for future research directions.

1. Linear IDR grammars in biomolecular condensates

The discovery of IDRs marked significant progress in protein structure research. The systematic study and prediction of these regions gained momentum in the late 1990s and early 2000s, driven by two major factors: accumulating experimental evidence highlighting the functional importance of IDRs and the development of computational tools dedicated to IDR prediction [5,6]. This shift in perspective led to community-wide benchmarking of disorder predictors, first as an assessment category within the Critical Assessment of Protein Structure Prediction (CASP) [13,14], and more recently through the dedicated Critical Assessment of Protein Intrinsic Disorder (CAID) [15]. Over 100 prediction methods are now available [16,17], reflecting the field’s rapid growth and the increasing recognition of IDRs’ significance in vital cellular processes. These computational approaches range from straightforward sequence-based analyses to advanced machine learning algorithms, all designed to detect and characterize protein regions that lack stable three-dimensional conformations.

Beyond the enhanced ability to predict IDRs, a particularly intriguing advance is the discovery that certain IDRs can mediate biomolecular condensate formation through phase separation [18,19]. The specific attributes of IDRs capable of forming condensates have been conceptualized as a ‘molecular grammar’, which includes particular amino acid compositions and functional modules [20].

Current research has established a substantial understanding of the amino acid composition that drives condensate formation, as illustrated in Fig. 1A. A distinctive feature of IDRs is their depletion in hydrophobic amino acids, which hinders their ability to fold into stable three-dimensional structures. Instead, IDRs are enriched in polar and charged amino acids such as arginine (Arg), lysine (Lys), aspartic acid (Asp), and glutamic acid (Glu). Oppositely charged residues promote interactions between different proteins through electrostatic attraction [21]. Polar residues can participate in hydrogen bonds or dipole-dipole interactions [22]. Furthermore, aromatic residues play a significant role in condensation; they can engage in intramolecular interactions via side chain π-π interactions (π electrons), cation-π interactions (with arginine, lysine, and protonated histidine), methyl-π interactions, or hydrophobic interactions (with aliphatic residues) [23]. For example, prion-like low complexity domains (PLCDs) show an enrichment in aromatic (Y/F) and polar amino acids (G/S/Q/N), with a reduction in charged residues [24,25]. In addition to these well-studied amino acids, Shiv et al. have demonstrated that non-polar or uncharged amino acids can maintain phase separation as well [26].
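The compositional features discussed above can be summarized numerically for any sequence. The following is a minimal sketch, not a published method: the residue groupings (K/R as positive, D/E as negative, F/W/Y as aromatic, A/I/L/M/V as aliphatic) are simplifying assumptions, histidine is ignored, and the function name `composition_summary` and the example sequence are purely illustrative.

```python
from collections import Counter

# Residue groupings are simplifying assumptions made for this illustration.
POSITIVE = set("KR")
NEGATIVE = set("DE")
AROMATIC = set("FWY")
ALIPHATIC = set("AILMV")


def composition_summary(seq: str) -> dict:
    """Return simple compositional descriptors for a protein sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)

    def fraction(group):
        # Counter returns 0 for residues that never occur in the sequence.
        return sum(counts[aa] for aa in group) / n

    f_pos = fraction(POSITIVE)
    f_neg = fraction(NEGATIVE)
    return {
        "length": n,
        "fraction_charged": f_pos + f_neg,
        "net_charge_per_residue": f_pos - f_neg,
        "fraction_aromatic": fraction(AROMATIC),
        "fraction_aliphatic": fraction(ALIPHATIC),
    }


# Toy sequence invented for illustration; it is not taken from a real protein.
summary = composition_summary("GSYGQSRGDYSGEKG")
```

Descriptors of this kind only capture composition; they say nothing about how residues are ordered along the chain, which is the subject of the functional modules discussed next.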

Fig. 1.

Schematic representation of linear IDR grammars in biomolecular condensates. A. The “amino acid composition” panel illustrates that in IDRs, glycine (G) and proline (P) are the most prevalent amino acids, while hydrophobic amino acids are the least abundant. B. The “functional modules” panel delineates key structural elements within IDRs: Short Linear Motifs (SLiMs) and Low-Complexity Regions (LCRs), which can be conceptualized as a 1D grammar for IDRs. C. The “sticker-spacer” panel illustrates the stickers-and-spacers model, emphasizing how specific amino acid residues (stickers) drive intermolecular interactions, while spacer regions separate these stickers and modulate the properties of the condensates. This model can be viewed as a 2D grammar for IDRs, and has been widely applied in elucidating IDR interaction mechanisms.

The aforementioned amino acids can interact with each other, leading to the formation of condensates. To better understand these complex interactions, researchers have introduced the stickers-and-spacers model (Fig. 1C), based on studies of the fused in sarcoma (FUS) protein. This model offers a conceptual framework for understanding the mechanisms behind protein-protein interactions and phase separation [23,27,28]. According to the stickers-and-spacers model, certain amino acid sequences within functional modules act as “stickers” promoting intermolecular interactions, while other regions act as “spacers” influencing the overall properties of the condensates. In the case of FUS, spacer regions rich in glycine residues result in more dynamic and liquid-like condensates, while those rich in glutamic acid residues lead to more solid-like condensates. These properties vary spatially and are responsive to various chemical and biological stimuli [29,30].
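To make the stickers-and-spacers idea concrete, a sequence can be partitioned into sticker positions and the spacer lengths between them. This is a rough sketch under a stated assumption: aromatic residues and arginine (Y, F, W, R) are treated as stickers, which is only one possible choice and depends on the protein in question. The function name `stickers_and_spacers` and the example sequence are hypothetical.

```python
# Assumed sticker alphabet for illustration only; real sticker identities
# are protein-specific and must be established experimentally.
STICKERS = frozenset("YFWR")


def stickers_and_spacers(seq: str, stickers=STICKERS):
    """Return (sticker positions, spacer lengths between consecutive stickers)."""
    seq = seq.upper()
    positions = [i for i, aa in enumerate(seq) if aa in stickers]
    # Number of residues strictly between each pair of neighbouring stickers.
    spacers = [b - a - 1 for a, b in zip(positions, positions[1:])]
    return positions, spacers


# Toy sequence invented for illustration.
positions, spacers = stickers_and_spacers("GSYGSGRGGF")
valence = len(positions)  # number of stickers, a crude proxy for multivalency
```

In this picture, the number of stickers approximates the valence of the chain, while the length and composition of the spacers would be examined separately to assess their influence on condensate material properties.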

Beyond the interactions at the individual amino acid level, IDRs also harbor short, fixed amino acid sequences known as short linear motifs (SLiMs), which are essential for mediating protein condensation (Fig. 1B). SLiMs are 3–10 residue peptide segments within intrinsically disordered proteins (IDPs), marked by loosely conserved amino acid sequences interspersed with highly conserved residues [31]. Millions of SLiMs have been predicted in the human proteome [32]. Li et al. described a simple two-component condensate system consisting of arrays of binding domains and their corresponding SLiMs, which are connected by flexible linkers. This interaction module comprises SH3 binding domains and PRM SLiMs [19]. Further studies have demonstrated that similar condensates can be formed through interaction modules involving protein-protein interactions (e.g., SUMO-SIM) or even protein-RNA interactions (e.g., RRM-RNA motifs) [33]. SLiM-mediated interactions can also be integrated into the stickers-and-spacers model. Here, SLiMs and their interacting domains act as stickers, while the sequences surrounding the SLiMs and domains serve as spacers [34].
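Because SLiMs are short and partially degenerate, candidate instances are commonly located by pattern matching. The sketch below scans a sequence with a regular expression, using the `P..P` core of proline-rich motifs as an assumed, heavily simplified stand-in for SH3-binding PRMs; real motif definitions are more specific. The helper `find_motif` and the example sequence are illustrative only.

```python
import re


def find_motif(seq: str, pattern: str):
    """Return (start index, matched text) for every occurrence of a motif,
    including overlapping occurrences."""
    # A zero-width lookahead lets the scan report overlapping matches.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(seq.upper())]


# Simplified PxxP core, assumed here as a proxy for a proline-rich motif.
PXXP = "P..P"

# Toy sequence invented for illustration.
hits = find_motif("APPLPPRPAG", PXXP)
# hits == [(1, 'PPLP'), (2, 'PLPP'), (4, 'PPRP')]
```

A pattern this short matches frequently by chance, so in practice hits would be filtered by additional evidence such as conservation or location within a disordered region.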

Some IDRs do not rely on specific amino acid sequences for their function; instead, their role is maintained through their inherent flexibility. These IDRs can either serve as flexible linkers that permit relative movement between domains or as spacers that regulate the distances between domains [10]. The importance of IDR flexibility is highlighted by the conserved dynamic behavior of flexible linkers in the RPA70 subunit, even with minimal sequence conservation [35]. Similarly, the disordered linker in adenovirus E1A shows evolutionary conservation of the distance between binding motifs rather than the sequence itself, a phenomenon referred to as conformational buffering [36]. These examples underscore the vital role of IDRs in sustaining protein function through preserved flexibility and spatial arrangements, rather than strict sequence conservation.

In summary, based on the aforementioned amino acid compositions and functional modules, the phase-separated biomolecular condensates mediated by IDRs are associated with three types of amino acids or peptides: 1. Amino acid stickers which create weak multivalent interactions at the individual amino acid level; 2. SLiMs which engage in weak multivalent interactions as a set of fixed residues; 3. Spacers which do not interact directly but serve as separators between amino acid stickers or SLiMs or structural domains. Collectively, these elements contribute to the formation and function of biomolecular condensates through IDR-mediated phase separation.

2. Characterizing the dynamics of IDR conformation ensembles

While researchers have made significant strides in understanding the linear grammar of IDRs, it is crucial to acknowledge that IDRs function in three-dimensional space within biomolecular condensates. Therefore, studying their 3D conformations is increasingly recognized as essential for understanding their functions. Moving away from the traditional sequence-structure-function paradigm, a growing body of research suggests that IDR function can be elucidated through a novel sequence-ensemble-function paradigm [5,6,37,38]. An ensemble represents the collective set of all possible IDR conformations. The ensemble properties of IDRs are quantifiable and consist of measurable parameters that describe the three-dimensional characteristics derived from this collective set. These include global size metrics of IDRs (such as the radius of gyration, end-to-end distance, or hydrodynamic radius), local transient structures (e.g., sparse helices and extended conformations), and inter-residue distances [39]. The ensemble properties of IDRs are likely critical for their biological roles. For example, although IDRs lack stable secondary structures like α-helices or β-sheets, they may transiently form secondary structure elements, albeit with low stability and short lifetimes [40,41]. These fleeting secondary structures could predispose IDRs to interact with specific partners and significantly influence binding energetics. In other instances, the average end-to-end distance of IDRs bridging two folded domains might position these domains at distances that are functionally significant [39]. Therefore, alterations in ensemble properties can have profound effects on cellular functions. This sensitivity allows IDRs to integrate complex signaling pathways and intercellular communications across multiple inputs.

However, the conformational flexibility stemming from the low sequence complexity of IDRs, while beneficial for the regulation of biomolecular condensates, presents considerable challenges for defining a clear IDR grammar. In many cases, the precise molecular mechanisms by which IDRs confer biological function remain obscure.
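The global size metrics mentioned above are averages over an ensemble of conformations. The sketch below computes the radius of gyration and end-to-end distance for each conformation and averages them. It assumes one bead per residue with equal masses, and it uses a freely jointed random chain with 3.8 Å bonds purely as a toy stand-in for an ensemble; this is not a realistic IDR model, and all function names are illustrative.

```python
import math
import random


def radius_of_gyration(coords):
    """Root-mean-square distance of beads from their centroid (equal masses)."""
    n = len(coords)
    center = [sum(p[a] for p in coords) / n for a in range(3)]
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in coords) / n)


def end_to_end_distance(coords):
    """Distance between the first and last bead of a chain."""
    return math.dist(coords[0], coords[-1])


def random_chain(n_beads, bond=3.8, rng=None):
    """Toy freely jointed chain: each bond points in a uniformly random direction."""
    rng = rng or random.Random()
    coords = [(0.0, 0.0, 0.0)]
    for _ in range(n_beads - 1):
        z = rng.uniform(-1.0, 1.0)             # uniform in cos(theta)
        phi = rng.uniform(0.0, 2.0 * math.pi)  # uniform azimuthal angle
        s = math.sqrt(1.0 - z * z)
        x0, y0, z0 = coords[-1]
        coords.append((x0 + bond * s * math.cos(phi),
                       y0 + bond * s * math.sin(phi),
                       z0 + bond * z))
    return coords


def ensemble_properties(ensemble):
    """Average size metrics over a list of conformations."""
    rg = [radius_of_gyration(c) for c in ensemble]
    ree = [end_to_end_distance(c) for c in ensemble]
    return {"mean_rg": sum(rg) / len(rg), "mean_ree": sum(ree) / len(ree)}


# Toy ensemble of 200 conformations for a 50-residue chain.
rng = random.Random(0)
props = ensemble_properties([random_chain(50, rng=rng) for _ in range(200)])
```

The same two functions apply unchanged to coordinates taken from simulation trajectories or from experimentally restrained ensembles, provided each conformation is supplied as a list of three-dimensional points.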

2.1. Molecular simulations of ensembles

In recent years, there has been significant progress in analyzing the conformational ensembles of IDRs. This progress is due to various experimental techniques, such as nuclear magnetic resonance (NMR) spectroscopy, small-angle X-ray scattering (SAXS), single-molecule Förster resonance energy transfer (smFRET), electron paramagnetic resonance (EPR), and circular dichroism (CD) spectroscopy (Fig. 2A). These methods provide valuable data that enhance our understanding of the dynamic behavior and structural features of IDRs. However, each technique has its limitations. For example, while NMR spectroscopy can offer high-resolution structural information, it often requires high sample concentrations and complex data processing. SAXS and CD techniques are excellent for probing overall conformational shapes but do not offer high-resolution local structural information. smFRET and EPR can study intermolecular distances and interactions, but they frequently necessitate complex labeling and sample preparation processes [12]. These limitations can significantly hinder our understanding of the complex behavior of IDRs under physiological conditions.

Fig. 2.

Methods for characterizing the dynamics of IDR conformation ensembles. A. Experimental techniques suited for structural analysis of IDRs. Experimental methods can provide actual conformations of IDRs, but often at low resolution and low throughput. Currently, experimental data are primarily used to assist Molecular Dynamics (MD) modeling, which can significantly reduce computational requirements. B. Various levels of MD simulations for structural analysis of IDRs. When selecting appropriate methods, it is important to balance the complexity of the research subject with efficiency. C. Machine learning, particularly the integration of ML with MD simulations, plays a crucial role in predicting IDR conformational ensembles. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

MD simulations have become an essential tool for analyzing intrinsically disordered regions (IDRs) due to the limitations of experimental techniques [42,43]. These simulations model the forces between molecules at the atomic level and use high-performance computers to numerically solve Newton’s second law of motion, thereby predicting the dynamics of biomolecules. MD simulations can generate detailed trajectories that show how molecules reorganize between different structural states, revealing the structural fluctuations and dynamic characteristics of IDRs. The MD simulation techniques include a wide range of approaches such as quantum mechanics, atomistic simulations, coarse-grained models, meso‑scale methods, and analytical approaches (Fig. 2B). The choice of method often involves a trade-off between model detail and simulation efficiency, emphasizing the importance of selecting appropriate techniques based on specific research requirements and system size [44]. For instance, coarse-grained (CG) models are well-suited for simulating large IDP assemblies. In these models, groups of atoms are collectively represented as single “beads”. The level of protein coarse-graining is highly flexible and can be adjusted as needed. This adjustment can range from using multiple beads to represent a single amino acid to having one bead represent multiple amino acids. By adjusting the degree of coarse-graining, researchers can find an optimal balance between maintaining sufficient model detail and improving computational efficiency, thereby enabling more effective simulation of the behavior of large IDP systems [45,46]. By employing these diverse computational approaches, researchers can gain valuable insights into the complex behavior of IDPs and their role in phase separation phenomena, complementing experimental observations and advancing our understanding of these important biomolecular systems.
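The core loop of an MD simulation, numerically integrating Newton’s second law for interacting beads, can be illustrated with a very small coarse-grained example. The sketch below propagates a linear chain of beads connected by harmonic bonds using the velocity Verlet scheme. It works in reduced units and includes only bonded forces; production force fields add non-bonded terms, thermostats, and solvent models. All names and parameter values are assumptions chosen for illustration.

```python
import math


def bond_forces(pos, k=1.0, r0=1.0):
    """Harmonic bond forces and potential energy for a linear bead chain."""
    forces = [[0.0, 0.0, 0.0] for _ in pos]
    potential = 0.0
    for i in range(len(pos) - 1):
        d = [pos[i + 1][a] - pos[i][a] for a in range(3)]
        r = math.sqrt(sum(c * c for c in d))
        potential += 0.5 * k * (r - r0) ** 2
        scale = k * (r - r0) / r  # positive when the bond is stretched
        for a in range(3):
            forces[i][a] += scale * d[a]      # pulls bead i toward bead i+1
            forces[i + 1][a] -= scale * d[a]  # equal and opposite reaction
    return forces, potential


def kinetic_energy(vel, mass=1.0):
    """Total kinetic energy of beads with identical mass."""
    return 0.5 * mass * sum(c * c for v in vel for c in v)


def velocity_verlet(pos, vel, mass=1.0, dt=0.01, steps=1000):
    """Integrate F = m * a with velocity Verlet; returns new positions and velocities."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel]
    forces, _ = bond_forces(pos)
    for _ in range(steps):
        for i in range(len(pos)):
            for a in range(3):
                vel[i][a] += 0.5 * dt * forces[i][a] / mass  # first half kick
                pos[i][a] += dt * vel[i][a]                  # position update
        forces, _ = bond_forces(pos)                         # forces at new positions
        for i in range(len(pos)):
            for a in range(3):
                vel[i][a] += 0.5 * dt * forces[i][a] / mass  # second half kick
    return pos, vel


# Three beads with both bonds stretched from 1.0 to 1.2, initially at rest.
start = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (2.4, 0.0, 0.0)]
at_rest = [(0.0, 0.0, 0.0)] * 3
final_pos, final_vel = velocity_verlet(start, at_rest)
```

Velocity Verlet is used here because it is time-reversible and conserves energy well over long runs, which is why variants of it are common in MD software. A useful sanity check for any such integrator is that total energy (potential plus kinetic) stays nearly constant when no thermostat is applied.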

However, MD simulations also have their limitations. The forces between individual atoms are modeled in so-called “force fields”. Existing protein force fields are mainly optimized for describing single-domain globular proteins. When these force fields are applied to sample the dynamics of disordered domains, their accuracy may be insufficient and recalibration might be necessary. MD simulations are also limited by the spatial and temporal scales that can be sampled. Conventional all-atom MD simulations typically reach timescales of nanoseconds to microseconds, with milliseconds attainable only in exceptional cases, which is adequate for observing small protein folding, ligand binding, or short-timescale conformational changes. However, longer biological processes, such as large protein folding or protein-protein interactions, often occur on timescales from milliseconds to seconds, which is beyond the reach of traditional MD simulations [47]. To address this limitation, methods based on physical insights or coarse-grained models can be used to sacrifice spatial resolution in exchange for longer timescales. In addition, force fields and sampling methods suited to IDRs have been developed. Wang et al. [48] reviewed the recent progress in atomistic MD simulation studies of IDPs, including the development of force fields and sampling methods.

2.2. Combining ML and MD for ensemble predictions

In recent years, ML tools, such as AlphaFold, have achieved remarkable progress in protein structure prediction [49,50]. Although these ML tools still face challenges when predicting the structures of IDRs [51], they have demonstrated significant potential in analyzing the sequence characteristics and interaction behaviors of these regions. The primary advantage of ML lies in its ability to rapidly learn from existing data, including both experimental and simulated data. Experimental data on IDRs can help reduce the computational demands of MD simulations [42]. Concurrently, MD simulations that calculate conformations under various conditions can provide valuable training data for ML models.

Researchers have been making strides in understanding the ensemble properties of IDRs by leveraging the complementary strengths of MD simulations and ML techniques [52,53]. These approaches can be broadly divided into two categories according to their purpose and how they combine MD simulations with ML models. The first category trains ML models on short or limited MD simulations to generate full conformational ensembles. These are typically encoder-decoder models that map IDR conformations into a latent space and then generate new conformations by sampling within that space. Notable examples include the Variational Autoencoder (VAE) method developed by Zhu et al. [54], the generative autoencoder approach introduced by Gupta et al. [55], and the ICoN (Internal Coordinate Net) model, which learns the physical principles of conformational changes [56]. These techniques extract maximum information from limited simulation data, making them particularly valuable for studying large IDP systems that are difficult to simulate over extended timescales. By learning conformational features from short simulations, these methods can infer protein behavior over longer timescales, providing a powerful tool for investigating IDP structure-function relationships while significantly reducing computational resource demands.

The second category of methods focuses on training ML models with large-scale MD simulations to generate conformational ensembles from protein sequences or to predict ensemble properties. These are typically conditional generative models that generate an IDR conformational ensemble by inputting the corresponding sequence along with different noise vectors. This group includes several innovative approaches: the IdpGAN model, which employs a generative adversarial network architecture [57]; the idpSAM model, a latent diffusion model based on transformer neural networks [58]; the IDPFold model, a sequence-conditional SE(3) diffusion model [59]; the ALBATROSS tool, which combines rational sequence design with coarse-grained simulations [60]; and a machine-learning-based two-stage computational pipeline [61]. Furthermore, Tesei et al. utilized CALVADOS2 for extensive MD simulations and subsequently trained a Support Vector Regression model to predict conformational properties based on the MD data, generating conformational ensembles for 28,058 IDRs from the human proteome. Their analysis revealed that certain functions, such as DNA binding ability or localization in nuclear speckles, are enriched in proteins with compact IDRs, while mitochondrial and endosome components are enriched in proteins with expanded IDRs [62]. These methods, which typically leverage advanced deep learning techniques, can generate realistic conformational ensembles or predict ensemble properties directly from protein sequences. They offer the advantage of producing comprehensive ensembles for arbitrary IDP sequences, capturing overall dimensions, contact distributions, and multiple-order correlations.

3. Evolutionary conservation: a key to understanding IDR grammars and protein conformations

Identifying evolutionarily conserved amino acid sequences is crucial for recognizing functional motifs in structured proteins, as co-evolving residues often indicate interactions and functional importance [63]. Due to the lack of structural constraints, IDRs are known to evolve rapidly and are generally thought to have lower conservation levels [64]. However, systematic analyses of IDR conservation have shown that this cannot be generalized or oversimplified. A comprehensive study using disorder prediction and multiple sequence alignment of orthologous groups across 23 yeast species identified three biologically distinct classes of IDRs with specific functions: Flexible Disorder, Constrained Disorder, and Nonconserved Disorder [65] (Fig. 3). Flexible Disorder maintains conserved disorder but exhibits rapidly evolving amino acid sequences. It is associated with functions that require biophysical flexibility, such as spacers and linkers with entropic chain functions [10], and accounts for over half of the disordered residues in yeast. Constrained Disorder displays both conserved disorder and conserved amino acid sequences, including modular domains that may fold upon binding to their targets, comprising about one-third of disordered residues. Nonconserved Disorder, where even the property of being disordered is not conserved among closely related species, appears largely nonfunctional and accounts for approximately 17% of disordered residues. This classification result suggests that some IDRs exhibit evolutionarily conserved sequences, while others, despite lacking sequence conservation, maintain conserved amino acid compositions and physicochemical properties, which are sufficient to preserve their biological functions.

Fig. 3.

Evolutionary conservation-based classification of IDRs. A. IDRs can be categorized into three distinct classes based on evolutionary conservation: Constrained, Flexible, and Non-conserved. Constrained IDRs exhibit conservation in both disorder and amino acid (AA) sequence, primarily functioning at interaction interfaces. Flexible IDRs display high conservation in disorder features but low conservation in AA sequence, often performing entropic chain functions. Non-conserved IDRs, which show low conservation in both features, are largely considered nonfunctional. B. IDR sequence examples and conservation features. This panel illustrates representative IDR sequences categorized into “Constrained” (red), “Non-conserved” (gray), and “Flexible” (purple) regions. Conservation of AA sequence is represented by color intensity, with deeper red indicating higher conservation. The bar heights below each sequence depict the level of conservation in disorder features, with taller bars representing greater conservation. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
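The three-class scheme described above amounts to a decision on two axes: how well disorder is conserved, and how well the amino acid sequence is conserved. The sketch below expresses that logic as a small function. The two scores are assumed to be fractions between 0 and 1, and the cutoff values of 0.5 are hypothetical placeholders chosen for illustration, not the thresholds used in the cited study.

```python
def classify_idr(disorder_conservation: float,
                 sequence_conservation: float,
                 disorder_cutoff: float = 0.5,   # hypothetical threshold
                 sequence_cutoff: float = 0.5):  # hypothetical threshold
    """Assign an IDR to one of three conservation classes.

    Both scores are assumed to lie in [0, 1], e.g. the fraction of
    orthologs in which the region is disordered and the fraction of
    aligned positions that are conserved.
    """
    for score in (disorder_conservation, sequence_conservation):
        if not 0.0 <= score <= 1.0:
            raise ValueError("conservation scores must lie in [0, 1]")
    if disorder_conservation < disorder_cutoff:
        return "Nonconserved"  # disorder itself is not preserved
    if sequence_conservation >= sequence_cutoff:
        return "Constrained"   # disorder and sequence both preserved
    return "Flexible"          # disorder preserved, sequence free to vary


# Illustrative inputs, not real measurements.
labels = [classify_idr(0.9, 0.8), classify_idr(0.9, 0.1), classify_idr(0.2, 0.1)]
# labels == ['Constrained', 'Flexible', 'Nonconserved']
```

Checking disorder conservation first reflects the definition of the Nonconserved class, in which even the disordered state is not maintained and sequence conservation is therefore not informative.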

The functional conservation of IDRs involved in condensate formation provides an opportunity for deeper research into the functional relevance of IDR grammars. By analyzing IDRs from yeast and humans, researchers have uncovered a correlation between the evolutionary signatures of IDRs and their functions. In yeast, Zarin et al. [66] categorized the 82 evolutionary signatures they studied into five classes: amino acid composition, sequence motifs, sequence complexity, charge properties, and physicochemical properties. Interestingly, IDRs involved in the same cellular processes possess similar evolutionary signatures, hinting at a link between IDR function and specific molecular characteristics. For example, IDRs related to DNA repair display highly conserved charged residues and a significant separation of positive and negative charges in their linear sequences, a pattern that has been preserved throughout evolution. Pritišanac et al. [67] calculated the evolutionary signatures of the intrinsically disordered human proteome (IDR-ome). Their analysis revealed that when the IDR-ome is clustered based on evolutionary signatures, the resulting clusters are enriched in specific pathways, such as transcriptional regulation, RNA processing, and cellular signaling. This study also identified enriched IDR grammar features in particular condensates, including nucleoli, stress granules, and nuclear speckles (Fig. 4).

Fig. 4.

Functionally related IDRs share similar evolutionary signatures. A. This panel demonstrates how different feature vectors of IDRs, derived from their molecular features and calculated based on evolutionary conservation, correlate with their functional roles and cellular localization in structures like the nucleolus, P body, and nuclear speckle. B. The left panel explains the computation of the Z-score between the evolutionary mean of a feature in extant species (x) and the average evolutionary mean of the feature in simulations of the null hypothesis (µ), normalized by the standard deviation of the evolutionary mean in simulations of the null hypothesis (σ). The right panel shows the clustering of Z-scores of all molecular features (evolutionary signatures) for IDRs, summarizing conservation across the IDR-ome. Hierarchical clustering of IDRs based on these Z-scores reveals patterns that define a global map of the IDR-ome, with IDRs exhibiting similar evolutionary signatures appearing in close proximity. This approach encodes the molecular evolutionary properties of IDRs as feature vectors, predicts their functions and localizations based on evolutionary conservation, and highlights the significance of evolutionary signatures in understanding the roles of IDRs. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
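The Z-score described in panel B is Z = (x − µ) / σ, where x is the evolutionary mean of a feature observed across extant species, and µ and σ are the mean and standard deviation of that same quantity across simulations of the null hypothesis. A minimal sketch of the calculation follows. The use of the sample standard deviation, the function name, and the numerical inputs are assumptions for illustration; the cited studies may define these details differently.

```python
import statistics


def evolutionary_z_score(observed_mean: float, null_means) -> float:
    """Z-score of an observed evolutionary mean against null simulations.

    observed_mean: mean of a molecular feature across extant orthologs (x).
    null_means:    the same mean computed in each null-hypothesis simulation.
    """
    null_means = list(null_means)
    if len(null_means) < 2:
        raise ValueError("need at least two null simulations")
    mu = statistics.mean(null_means)
    sigma = statistics.stdev(null_means)  # sample standard deviation (assumed)
    if sigma == 0:
        raise ValueError("null simulations show no variance")
    return (observed_mean - mu) / sigma


# Illustrative numbers only: a feature more conserved than the null expects.
z = evolutionary_z_score(4.0, [1.0, 2.0, 3.0])
# z == 2.0
```

Computing one such Z-score per molecular feature yields the feature vector (evolutionary signature) for an IDR, and these vectors are what the hierarchical clustering operates on.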

In addition to the conservation at the level of linear IDR grammar mentioned above, researchers have also discovered conservation at the level of conformational ensembles. Nicolás et al. found that the viral E1A protein binds to the retinoblastoma (Rb) protein through two binding motifs connected by variable disordered linkers, thereby disrupting cell cycle regulation. The study revealed that although the linker sequences are highly variable, their dimensions are conserved. This conservation is achieved through compensatory changes in sequence length and composition, a phenomenon they call “conformational buffering” [36]. Conformational buffering and the co-evolution of motifs and linkers explain the robust encoding of function in highly variable disordered linkers, which may underlie functional selection in many IDRs. Tesei et al. predicted conformational properties for a set of homologous IDRs (26,839 human IDRs and their 1,088,250 homologs) and found that, in terms of conformational ensembles (conformational entropy and compaction), there is a high correlation between human sequences and their homologs, demonstrating a high degree of conservation [62].

In summary, compared to simple sequence conservation, the conservation of IDRs needs to be considered from multiple perspectives, including physicochemical properties and conformational ensembles. These findings highlight the importance of considering multiple aspects of conservation when studying IDRs and underscore their crucial roles in protein function and cellular organization, particularly in the context of biomolecular condensates.

4. Targeting intrinsically disordered regions: from computational dissection to therapeutic development

IDRs exhibit remarkable conformational plasticity, enabling them to engage in diverse molecular interactions, including protein-protein, protein-DNA, protein-RNA, and protein-small molecule interactions [68]. This versatility establishes IDRs as critical components in biomolecular condensates and promising therapeutic targets [69]. Several key drug targets, including the androgen receptor (AR), p53, Tau, and members of the FET protein family (FUS, EWS, and TAF15), undergo phase separation to form condensates through multivalent interactions between their IDRs [70]. Small molecules targeting these IDR regions, termed condensate-modulating compounds (c-mods), can achieve therapeutic effects by regulating condensate formation [71].

To enable systematic screening of c-mods targeting specific druggable pathways, Wang and colleagues developed DropScan, an advanced computer vision-based platform for monitoring phase separation dynamics [72]. DropScan facilitates the precise detection and quantification of subtle alterations in biomolecular condensates, providing a robust tool for high-throughput screening of small molecules designed to modulate aberrant phase separation. This innovative platform has been effectively utilized to identify small molecules capable of disrupting condensates formed by FET-ETS oncofusion proteins, highlighting its potential in therapeutic discovery against phase separation-driven pathologies.

Beyond drug screening, computational algorithms offer valuable insights into the molecular mechanisms underlying IDR-ligand interactions. All-atom molecular dynamics simulations have demonstrated great value in elucidating the mechanistic details of small molecule interactions with IDRs. This is exemplified by the work on EPI-002 and EPI-7170, compounds targeting the intrinsically disordered transactivation domain of the androgen receptor. These simulations revealed that both compounds induce the formation of partially folded collapsed helical states at the interface of two transiently helical regions [73]. The characteristic structural transitions of IDRs upon ligand binding provide crucial insights for developing therapeutic strategies against traditionally “undruggable” targets. In this context, the development of accurate prediction algorithms for the interactions of binders with IDRs becomes particularly significant. A breakthrough in this field was achieved by Wu et al., who developed an approach combining biophysical principles with deep learning to design specific binders for disordered sequences. Their method departs from conventional approaches that typically assume fixed target structures. Instead, they created a highly flexible combinatorial template library that can specifically recognize IDRs and induce them into appropriate conformations. The method is further enhanced by RFdiffusion refinement, which optimizes the molecular fit between the binder and its target. This approach has successfully generated binders with affinities ranging from nanomolar to picomolar for a diverse array of unstructured targets [74].

The integrated approach, from computational screening to mechanistic understanding and therapeutic development, represents a powerful paradigm for targeting IDRs in disease intervention. This synergistic strategy has greatly expanded the therapeutic target space and continues to open new avenues for developing more effective and specific treatments for diseases involving disordered proteins.

5. Challenges and future directions in AI-driven IDR research

Despite remarkable advances in molecular dynamics and machine learning (MD+ML) technologies for predicting IDR conformational ensembles, fundamental challenges remain in bridging the gap between computational predictions and intracellular microenvironments. Similar to how cellular populations constitute tissue microenvironments, biomolecular condensates formed by diverse molecular constituents create complex intracellular microenvironments (Fig. 5). These compartmentalized structures exhibit unique physicochemical properties such as molecular crowding, variations in ionic strength, pH fluctuations, and mechanical stress. These characteristics significantly influence the structure and function of intracellular biomolecules.

Fig. 5.

Biomolecular condensates formed by multiple components generate complex intracellular microenvironments. Diverse molecular constituents participate in these microenvironments: (1) IDRs mediating condensate formation through specific sequences that act as stickers and spacers, (2) tandem structured domains contributing to condensate formation through multivalent interactions, (3) DNA mediating phase separation in nuclear membrane-less organelles (MLOs), (4) RNA forming condensates in conjunction with RNA-binding proteins (RBPs), and (5) small molecules selectively partitioning into condensates and influencing their dynamics and properties. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

From a conformational perspective, IDRs undergo significant conformational transitions in biomolecular complexes compared with their unbound states [75,76]. Bhargava et al. investigated conformational dynamics in protein-nucleic acid interactions and found that 10% of IDRs become partially ordered following complex formation, while 42% achieve complete order upon complex formation. Additionally, 29% of IDRs that are fully ordered in their unbound states become disordered upon complexation, and 17% of IDRs that are partially ordered when unbound become entirely disordered after binding to nucleic acids [77]. A deeper understanding of this intricate relationship between IDR conformational dynamics and intracellular environments promises to significantly advance our knowledge of IDR conformational ensembles.

Intracellular microenvironments also provide new perspectives for understanding drug localization and resistance mechanisms. Pioneering research has revealed new models for understanding drug action and resistance through the lens of cellular compartmentalization. In 2020, Klein and colleagues demonstrated a novel mechanism connecting MED1 overexpression to tamoxifen resistance by monitoring the concentration of FLTX1 (a fluorescent tamoxifen analog) within MED1 condensates [78]. Subsequent systematic investigations have deepened our understanding of condensate-drug interactions: Michael Rosen’s team employed in vitro mass spectrometry to establish small molecule hydrophobicity as a key determinant of condensate entry [79], while Samie Jaffrey's team uncovered the essential role of phospholipids and their hydrophobic fatty acyl groups in various condensates [80]. Further studies by Richard Young’s team, utilizing extensive fluorescence probes, revealed that distinct condensate microenvironments exhibit unique specificities in their effects on small-molecule localization, highlighting the complex relationship between cellular compartmentalization and drug efficacy [81].
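
Drug enrichment in a condensate is commonly summarized as a partition coefficient, the ratio of the concentration (or background-corrected fluorescence intensity) inside the condensate to that in the surrounding dilute phase. The sketch below computes this ratio and the corresponding standard transfer free energy, ΔG = −RT ln K. The function names and input values are hypothetical and are not taken from the cited studies.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ / (mol K)


def partition_coefficient(conc_inside, conc_outside):
    """Ratio of concentration inside the condensate to the dilute phase."""
    if conc_inside <= 0 or conc_outside <= 0:
        raise ValueError("concentrations must be positive")
    return conc_inside / conc_outside


def transfer_free_energy_kj(k, temperature_k=298.15):
    """Standard free energy of transfer into the condensate, -RT ln K."""
    return -GAS_CONSTANT_KJ * temperature_k * math.log(k)


# Hypothetical measurement: 50 units inside the condensate, 5 units outside.
k_example = partition_coefficient(50.0, 5.0)
dg_example = transfer_free_energy_kj(k_example)
print(f"K = {k_example:.1f}, dG = {dg_example:.2f} kJ/mol")
```

A coefficient above 1 (negative ΔG) indicates enrichment in the condensate, which is the quantity one would compare across compounds of differing hydrophobicity.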

Currently, research on how intracellular microenvironments influence IDR conformations and the distribution of small-molecule drugs is in its early stages. Existing AI models face significant limitations in accurately predicting IDR behaviors. The adoption of diffusion models, as seen in AlphaFold 3, has introduced severe hallucination issues in IDR predictions [49]. Moreover, these AI-based predictions are heavily reliant on existing experimental data and computational models trained under idealized conditions. This dependence limits their applicability to real intracellular microenvironments, where dynamic IDR functions and environmental complexities are prevalent. The fundamental mismatch between the dynamic reality of IDR function and the static nature of current prediction methods highlights the urgent need for more sophisticated modeling approaches. Future models must account for both the structural flexibility of IDRs and the intricate environmental factors present in biological systems.

To advance our understanding of IDRs’ dynamic conformations and their interactions in complex intracellular microenvironments, a systematic multi-disciplinary approach is essential. First and most fundamentally, the experimental component centers on the comprehensive characterization of IDR structure and function. IDRs that adopt specific conformations upon ligand binding serve as excellent model systems for investigating the relationship between conformational dynamics and function. This investigation should combine high-resolution in vitro biophysical techniques with live-cell experiments and functional assays in physiological contexts, providing crucial temporal and spatial information about IDR behavior under native conditions. Second, building upon these experimental foundations, the computational aspect leverages MD simulations as powerful tools to explore the complete conformational ensemble of IDRs. By analyzing conformational changes induced by ligand binding, MD simulations can capture the environmental responses of IDRs, providing detailed insights into their structural transitions and interaction mechanisms that may be challenging to observe experimentally. Third, to integrate and extend these approaches, advanced AI modeling strategies should be developed to integrate multiple data sources: experimental measurements, MD simulation results, and evolutionary information. Evolutionary data is particularly valuable as it can reveal functionally relevant conformational states of IDRs specific to different species. This integration could leverage recent advances in deep learning, particularly dynamic modeling approaches and deep generative networks, to create more comprehensive and accurate predictions of IDR behavior [82]. The development of interpretable machine learning models and the utilization of transfer learning will also be crucial for improving our ability to predict IDR dynamics and understand their cellular functions in complex biological contexts.
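
As a small illustration of how a conformational ensemble from simulation can be reduced to a quantity comparable with experiment, the sketch below computes the radius of gyration of each conformation and the ensemble average. It assumes equal-mass beads (for example, one bead per residue) and uses two made-up three-bead conformations; the function names are hypothetical and do not correspond to any specific MD package.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) bead positions."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def ensemble_rg(ensemble):
    """Return (mean Rg, per-conformation Rg list) for a list of conformations."""
    values = [radius_of_gyration(conf) for conf in ensemble]
    return sum(values) / len(values), values


# Two toy conformations of a three-bead chain (arbitrary length units).
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mean_rg, rg_values = ensemble_rg([extended, bent])
print(f"per-conformation Rg: {rg_values}, ensemble mean: {mean_rg:.3f}")
```

For a real trajectory the same average would be taken over thousands of frames, and a mass-weighted definition would normally replace the equal-mass assumption used here.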

Looking forward, advancing IDR decoding within intracellular microenvironments requires coordinated efforts across multiple fronts, from cross-field collaboration between computational and experimental scientists to leveraging AI-driven simulation and incorporating biological context. This interdisciplinary approach promises more precise therapeutic strategies for condensate-related diseases, from neurodegenerative disorders to cancer, bringing new hope for human health.

Declaration of competing interest

The authors declare that they have no conflicts of interest in this work.

Declaration of generative AI and AI-assisted technologies in the writing process

During the preparation of this work, the author(s) used Claude 3.5 to extract literature information and revise the language. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (T2325003 and 32470575), the National Key Research and Development Program of China (2023YFF1204703 and 2021YFF1200900), and Beijing Natural Science Foundation (Z230014).

Biographies

Minglei Shi (BRID: 08092.00.01522) is a research associate professor at the Peking University Health Science Center. He specializes in developing and applying innovative multi-omics technologies, including three-dimensional genomics, proteomics, and spatial single-cell multi-omics techniques. By combining these cutting-edge approaches with artificial intelligence-based deep analysis, he is dedicated to exploring the mechanisms of biomolecular condensates and their roles in aging and disease progression.

Tingting Li (BRID: 09750.00.13021) is a Boya Distinguished Professor and tenured full professor at Peking University, a recipient of the National Science Fund for Distinguished Young Scholars, and a Young Changjiang Scholar of the Ministry of Education. As an expert specializing in the development and application of AI algorithms for protein phase separation analysis, Professor Li’s work involves establishing high-throughput detection methods and data platforms, mining novel features, constructing multi-modal prediction algorithms, and developing target identification and targeted drug computational screening platforms for biomolecular condensation. These interdisciplinary efforts aim to bridge information technology, biology, and medicine to advance our understanding of biomolecular condensation and provide innovative solutions for intervening in disease progression.

References

  • 1. Sabari B.R., Dall'Agnese A., Boija A., et al. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018;361(6400):eaar3958. doi: 10.1126/science.aar3958.
  • 2. Boija A., Klein I.A., Sabari B.R., et al. Transcription factors activate genes through the phase-separation capacity of their activation domains. Cell. 2018;175(7):1842–1855.e16. doi: 10.1016/j.cell.2018.10.042.
  • 3. Brodsky S., Jana T., Mittelman K., et al. Intrinsically disordered regions direct transcription factor in vivo binding specificity. Mol. Cell. 2020;79(3):459–471.e4. doi: 10.1016/j.molcel.2020.05.032.
  • 4. Calabretta S., Richard S. Emerging roles of disordered sequences in RNA-binding proteins. Trends Biochem. Sci. 2015;40(11):662–672. doi: 10.1016/j.tibs.2015.08.012.
  • 5. Wright P.E., Dyson H.J. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol. 1999;293(2):321–331. doi: 10.1006/jmbi.1999.3110.
  • 6. Dunker A.K., Lawson J.D., Brown C.J., et al. Intrinsically disordered protein. J. Molecular Graph. Modell. 2001;19(1):26–59. doi: 10.1016/s1093-3263(00)00138-8.
  • 7. Banerjee P.R., Holehouse A.S., Kriwacki R., et al. Dissecting the biophysics and biology of intrinsically disordered proteins. Trends Biochem. Sci. 2024;49(2):101–104. doi: 10.1016/j.tibs.2023.10.002.
  • 8. Borcherds W., Bremer A., Borgia M.B., et al. How do intrinsically disordered protein regions encode a driving force for liquid–liquid phase separation? Curr. Opin. Struct. Biol. 2021;67:41–50. doi: 10.1016/j.sbi.2020.09.004.
  • 9. Hou S., Hu J., Yu Z., et al. Machine learning predictor PSPire screens for phase-separating proteins lacking intrinsically disordered regions. Nat. Commun. 2024;15(1):2147. doi: 10.1038/s41467-024-46445-y.
  • 10. van der Lee R., Buljan M., Lang B., et al. Classification of intrinsically disordered regions and proteins. Chem. Rev. 2014;114(13):6589–6631. doi: 10.1021/cr400525m.
  • 11. Pajkos M., Dosztányi Z. In: Progress in Molecular Biology and Translational Science. Vol 183. Uversky VN, editor. Academic Press; 2021. Chapter two-functions of intrinsically disordered proteins through evolutionary lenses; pp. 45–74.
  • 12. Maiti S., Singh A., Maji T., et al. Experimental methods to study the structure and dynamics of intrinsically disordered regions in proteins. Curr. Res. Structur. Biol. 2024;21(7) doi: 10.1016/j.crstbi.2024.100138.
  • 13. Melamud E., Moult J. Evaluation of disorder predictions in CASP5. Proteins. 2003;53(Suppl 6):561–565. doi: 10.1002/prot.10533.
  • 14. Monastyrskyy B., Kryshtafovych A., Moult J., et al. Assessment of protein disorder region predictions in CASP10. Proteins. 2014;82(S2):127–137. doi: 10.1002/prot.24391.
  • 15. Necci M., Piovesan D., Hoque M.T., et al. Critical assessment of protein intrinsic disorder prediction. Nat. Methods. 2021;18(5):472–481. doi: 10.1038/s41592-021-01117-3.
  • 16. Wang K., Hu G., Basu S., et al. flDPnn2: Accurate and fast predictor of intrinsic disorder in proteins. J. Mol. Biol. 2024;436(17) doi: 10.1016/j.jmb.2024.168605.
  • 17. Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods. 2022;204:132–141. doi: 10.1016/j.ymeth.2022.03.018.
  • 18. Ibrahim A.Y., Khaodeuanepheng N.P., Amarasekara D.L., et al. Intrinsically disordered regions that drive phase separation form a robustly distinct protein class. J. Biol. Chem. 2023;299(1) doi: 10.1016/j.jbc.2022.102801.
  • 19. Li P., Banjade S., Cheng H.-C., et al. Phase transitions in the assembly of multivalent signalling proteins. Nature. 2012;483(7389):336–340. doi: 10.1038/nature10879.
  • 20. Chong S., Mir M. Towards decoding the sequence-based grammar governing the functions of intrinsically disordered protein regions. J. Mol. Biol. 2021;433(12) doi: 10.1016/j.jmb.2020.11.023.
  • 21. Fisher R.S., Elbaum-Garfinkle S. Tunable multiphase dynamics of arginine and lysine liquid condensates. Nat. Commun. 2020;11(1):4628. doi: 10.1038/s41467-020-18224-y.
  • 22. Schuster B.S., Dignon G.L., Tang W.S., et al. Identifying sequence perturbations to an intrinsically disordered protein that determine its phase-separation behavior. Proceed. Nation. Acad. Sci. 2020;117(21):11421–11431. doi: 10.1073/pnas.2000223117.
  • 23. Martin E.W., Holehouse A.S., Peran I., et al. Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science. 2020;367(6478):694–699. doi: 10.1126/science.aaw8653.
  • 24. Das R.K., Pappu R.V. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proceed. Nation. Acad. Sci. 2013;110(33):13392–13397. doi: 10.1073/pnas.1304749110.
  • 25. Romero P., Obradovic Z., Li X., et al. Sequence complexity of disordered protein. Proteins. 2001;42(1):38–48. doi: 10.1002/1097-0134(20010101)42:1<38::aid-prot50>3.0.co;2-3.
  • 26. Rekhi S., Garcia C.G., Barai M., et al. Expanding the molecular language of protein liquid–liquid phase separation. Nat. Chem. 2024;16(7):1113–1124. doi: 10.1038/s41557-024-01489-x.
  • 27. Wang J., Choi J.-M., Holehouse A.S., et al. A molecular grammar governing the driving forces for phase separation of prion-like RNA binding proteins. Cell. 2018;174(3):688–699.e16. doi: 10.1016/j.cell.2018.06.006.
  • 28. Choi J.-M., Holehouse A.S., Pappu R.V. Physical principles underlying the complex biology of intracellular phase transitions. Annu. Rev. Biophys. 2020;49:107–133. doi: 10.1146/annurev-biophys-121219-081629.
  • 29. Murthy A.C., Dignon G.L., Kan Y., et al. Molecular interactions underlying liquid−liquid phase separation of the FUS low-complexity domain. Nat. Struct. Mol. Biol. 2019;26(7):637–648. doi: 10.1038/s41594-019-0250-x.
  • 30. Quiroz F.G., Chilkoti A. Sequence heuristics to encode phase behaviour in intrinsically disordered protein polymers. Nat. Mater. 2015;14(11):1164–1171. doi: 10.1038/nmat4418.
  • 31. Davey N.E., Van Roey K., Weatheritt R.J., et al. Attributes of short linear motifs. Mol. Biosyst. 2012;8(1):268–281. doi: 10.1039/c1mb05231d.
  • 32. Dinkel H., Michael S., Weatheritt R.J., et al. ELM—The database of eukaryotic linear motifs. Nucleic Acids Res. 2012;40(D1):D242–D251. doi: 10.1093/nar/gkr1064.
  • 33. Banani S.F., Rice A.M., Peeples W.B., et al. Compositional control of phase-separated cellular bodies. Cell. 2016;166(3):651–663. doi: 10.1016/j.cell.2016.06.010.
  • 34. Hastings R.L., Boeynaems S. Designer Condensates: a toolkit for the biomolecular architect. J. Mol. Biol. 2021;433(12) doi: 10.1016/j.jmb.2021.166837.
  • 35. Daughdrill G.W., Narayanaswami P., Gilmore S.H., et al. Dynamic behavior of an intrinsically unstructured linker domain is conserved in the face of negligible amino acid sequence conservation. J. Mol. Evol. 2007;65(3):277–288. doi: 10.1007/s00239-007-9011-2.
  • 36. González-Foutel N.S., Glavina J., Borcherds W.M., et al. Conformational buffering underlies functional selection in intrinsically disordered protein regions. Nat. Struct. Mol. Biol. 2022;29(8):781–790. doi: 10.1038/s41594-022-00811-w.
  • 37. Uversky V.N., Gillespie J.R., Fink A.L. Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins: Structure, Function, and Bioinformatics. 2000;41(3):415–427. doi: 10.1002/1097-0134(20001115)41:3<415::aid-prot130>3.0.co;2-7.
  • 38. Uversky V.N. In: Progress in Molecular Biology and Translational Science. Vol 166. Uversky VN, editor. Elsevier; Amsterdam, Netherlands: 2019. Chapter one - protein intrinsic disorder and structure-function continuum; pp. 1–17.
  • 39. Holehouse A.S., Kragelund B.B. The molecular basis for cellular function of intrinsically disordered protein regions. Nat. Rev. Molecul. Cell Biol. 2024;25(3):187–211. doi: 10.1038/s41580-023-00673-0.
  • 40. Berlow R.B., Dyson H.J., Wright P.E. Functional advantages of dynamic protein disorder. FEBS Lett. 2015;589(19PartA):2433–2440. doi: 10.1016/j.febslet.2015.06.003.
  • 41. Hu X., Xu Y., Wang C., et al. Combined prediction and design reveals the target recognition mechanism of an intrinsically disordered protein interaction domain. Proceed. Nation. Acad. Sci. 2023;120(39) doi: 10.1073/pnas.2305603120.
  • 42. Ghosh C., Nagpal S., Muñoz V. Molecular simulations integrated with experiments for probing the interaction dynamics and binding mechanisms of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2024;84 doi: 10.1016/j.sbi.2023.102756.
  • 43. Dignon G.L., Zheng W., Mittal J. Simulation methods for liquid–liquid phase separation of disordered proteins. Curr. Opin. Chem. Eng. 2019;23:92–98. doi: 10.1016/j.coche.2019.03.004.
  • 44. Ruff K.M., Pappu R.V., Holehouse A.S. Conformational preferences and phase behavior of intrinsically disordered low complexity sequences: insights from multiscale simulations. Curr. Opin. Struct. Biol. 2019;56:1–10. doi: 10.1016/j.sbi.2018.10.003.
  • 45. Pak A.J., Voth G.A. Advances in coarse-grained modeling of macromolecular complexes. Curr. Opin. Struct. Biol. 2018;52:119–126. doi: 10.1016/j.sbi.2018.11.005.
  • 46. Dignon G.L., Zheng W., Kim Y.C., et al. Sequence determinants of protein phase behavior from a coarse-grained model. PLoS Comput. Biol. 2018;14(1) doi: 10.1371/journal.pcbi.1005941.
  • 47. Beck M., Covino R., Hänelt I., et al. Understanding the cell: Future views of structural biology. Cell. 2024;187(3):545–562. doi: 10.1016/j.cell.2023.12.017.
  • 48. Wang W. Recent advances in atomic molecular dynamics simulation of intrinsically disordered proteins. Phys. Chem. Chemic. Phys. 2021;23(2):777–784. doi: 10.1039/d0cp05818a.
  • 49. Abramson J., Adler J., Dunger J., et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630(8016):493–500. doi: 10.1038/s41586-024-07487-w.
  • 50. Yang Z., Zeng X., Zhao Y., et al. AlphaFold2 and its applications in the fields of biology and medicine. Signal Transduct. Targ. Ther. 2023;8(1):115. doi: 10.1038/s41392-023-01381-z.
  • 51. Wang L., Wen Z., Liu S.-W., et al. Overview of AlphaFold2 and breakthroughs in overcoming its limitations. Comput. Biol. Med. 2024;176 doi: 10.1016/j.compbiomed.2024.108620.
  • 52. Ramanathan A., Ma H., Parvatikar A., et al. Artificial intelligence techniques for integrative structural biology of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2021;66:216–224. doi: 10.1016/j.sbi.2020.12.001.
  • 53. Zheng L.-E., Barethiya S., Nordquist E., et al. Machine learning generation of dynamic protein conformational ensembles. Molecules. 2023;28(10):4047. doi: 10.3390/molecules28104047.
  • 54. Zhu J.-J., Zhang N.-J., Wei T., et al. Enhancing conformational sampling for intrinsically disordered and ordered proteins by variational autoencoder. Int. J. Mol. Sci. 2023;24(8):6896. doi: 10.3390/ijms24086896.
  • 55. Gupta A., Dey S., Hicks A., et al. Artificial intelligence guided conformational mining of intrinsically disordered proteins. Commun. Biol. 2022;5(1):610. doi: 10.1038/s42003-022-03562-y.
  • 56. Ruzmetov T., Hung T.I., Jonnalagedda S.P., et al. Sampling conformational ensembles of highly dynamic proteins via generative deep learning. bioRxiv. 2024;2024.05.05.592587. doi: 10.1021/acs.jcim.4c01838.
  • 57. Janson G., Valdes-Garcia G., Heo L., et al. Direct generation of protein conformational ensembles via machine learning. Nat. Commun. 2023;14(1):774. doi: 10.1038/s41467-023-36443-x.
  • 58. Janson G., Feig M. Transferable deep generative modeling of intrinsically disordered protein conformations. PLoS Comput. Biol. 2024;20(5) doi: 10.1371/journal.pcbi.1012144.
  • 59. Zhu J., Li Z., Zhang B., et al. Precise generation of conformational ensembles for intrinsically disordered proteins using fine-tuned diffusion models. bioRxiv. 2024;2024.05.05.592611.
  • 60. Lotthammer J.M., Ginell G.M., Griffith D., et al. Direct prediction of intrinsically disordered protein conformational properties from sequence. Nat. Methods. 2024;21(3):465–476. doi: 10.1038/s41592-023-02159-5.
  • 61. Taneja I., Lasker K. Machine-learning-based methods to generate conformational ensembles of disordered proteins. Biophys. J. 2024;123(1):101–113. doi: 10.1016/j.bpj.2023.12.001.
  • 62. Tesei G., Trolle A.I., Jonsson N., et al. Conformational ensembles of the human intrinsically disordered proteome. Nature. 2024;626(8000):897–904. doi: 10.1038/s41586-023-07004-5.
  • 63. Jumper J., Evans R., Pritzel A., et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–589. doi: 10.1038/s41586-021-03819-2.
  • 64. Ahmed S.S., Rifat Z.T., Lohia R., et al. Characterization of intrinsically disordered regions in proteins informed by human genetic diversity. PLoS Comput. Biol. 2022;18(3) doi: 10.1371/journal.pcbi.1009911.
  • 65. Bellay J., Han S., Michaut M., et al. Bringing order to protein disorder through comparative genomics and genetic interactions. Genome Biol. 2011;12(2):R14. doi: 10.1186/gb-2011-12-2-r14.
  • 66. Zarin T., Strome B., Nguyen Ba A.N., et al. Proteome-wide signatures of function in highly diverged intrinsically disordered regions. Elife. 2019;8:e46883. doi: 10.7554/eLife.46883.
  • 67. Pritišanac I., Alderson T.R., Kolarić Đ., et al. A functional map of the human intrinsically disordered proteome. bioRxiv. 2024;2024.03.15.585291.
  • 68. Pang Y., Liu B. DisoFLAG: Accurate prediction of protein intrinsic disorder and its functions using graph-based interaction protein language model. BMC Biol. 2024;22(1):3. doi: 10.1186/s12915-023-01803-y.
  • 69. Kilgore H.R., Young R.A. Learning the chemical grammar of biomolecular condensates. Nat. Chem. Biol. 2022;18(12):1298–1306. doi: 10.1038/s41589-022-01046-y.
  • 70. Li S., Wang Y., Lai L. Small molecules in regulating protein phase separation. Acta Biochim Biophys Sin (Shanghai). 2023;55(7):1075–1083. doi: 10.3724/abbs.2023106.
  • 71. Mitrea D.M., Mittasch M., Gomes B.F., et al. Modulating biomolecular condensates: a novel approach to drug discovery. Nat. Rev. Drug Discov. 2022;21(11):841–862. doi: 10.1038/s41573-022-00505-4.
  • 72. Wang Y., Yu C., Pei G., et al. Dissolution of oncofusion transcription factor condensates for cancer therapy. Nat. Chem. Biol. 2023;19(10):1223–1234. doi: 10.1038/s41589-023-01376-5.
  • 73. Zhu J., Salvatella X., Robustelli P. Small molecules targeting the disordered transactivation domain of the androgen receptor induce the formation of collapsed helical states. Nat. Commun. 2022;13(1):6390. doi: 10.1038/s41467-022-34077-z.
  • 74. Wu K., Jiang H., Hicks D.R., et al. Sequence-specific targeting of intrinsically disordered protein regions. bioRxiv. 2024;2024.07.15.603480.
  • 75. Fukuchi S., Sakamoto S., Nobe Y., et al. IDEAL: Intrinsically disordered proteins with extensive annotations and literature. Nucleic Acids Res. 2012;40(D1):D507–D511. doi: 10.1093/nar/gkr884.
  • 76. Mohan A., Oldfield C.J., Radivojac P., et al. Analysis of molecular recognition features (MoRFs). J. Mol. Biol. 2006;362(5):1043–1059. doi: 10.1016/j.jmb.2006.07.087.
  • 77. Bhargava P., Yadav P., Barik A. Computational insights into intrinsically disordered regions in protein-nucleic acid complexes. Int. J. Biol. Macromol. 2024;277 doi: 10.1016/j.ijbiomac.2024.134021.
  • 78. Klein I.A., Boija A., Afeyan L.K., et al. Partitioning of cancer therapeutics in nuclear condensates. Science. 2020;368(6497):1386–1392. doi: 10.1126/science.aaz4427.
  • 79. Ambadi Thody S., Clements H.D., Baniasadi H., et al. Small-molecule properties define partitioning into biomolecular condensates. Nat. Chem. 2024;16(11):1794–1802. doi: 10.1038/s41557-024-01630-w.
  • 80. Dumelie J.G., Chen Q., Miller D., et al. Biomolecular condensates create phospholipid-enriched microenvironments. Nat. Chem. Biol. 2024;20(3):302–313. doi: 10.1038/s41589-023-01474-4.
  • 81. Kilgore H.R., Mikhael P.G., Overholt K.J., et al. Distinct chemical environments in biomolecular condensates. Nat. Chem. Biol. 2024;20(3):291–301. doi: 10.1038/s41589-023-01432-0.
  • 82. Rangan R., Feathers R., Khavnekar S., et al. CryoDRGN-ET: Deep reconstructing generative networks for visualizing dynamic biomolecules inside cells. Nat. Methods. 2024;21(8):1537–1545. doi: 10.1038/s41592-024-02340-4.

Articles from Fundamental Research are provided here courtesy of The Science Foundation of China Publication Department, The National Natural Science Foundation of China