Cellular and Molecular Life Sciences: CMLS. 2026 Feb 14;83(1):113. doi: 10.1007/s00018-026-06087-3

Modern resources for intrinsic disorder predictions: protein language models, deep learning, meta-servers, and databases

Kui Wang 1, Gang Hu 1, Jing Yu 2, Lukasz Kurgan 2
PMCID: PMC12913823  PMID: 41689628

Abstract

Computational prediction of intrinsic disorder in protein sequences is an impactful and growing research area, recently infused with deep learning and protein language models, prompting the need to assess the impact of these advancements. We systematically surveyed 128 disorder predictors, many of which are accurate, and some that have been cited thousands of times. We demonstrated that recent methods utilizing protein language models outperform those that do not, particularly when combined with deep learning, yielding substantial gains in predictive quality. We place these observations within the context of other key factors, including runtime and coverage. We also identified and discussed resources that expedite and ease the collection of disorder predictions, including meta-web servers and large databases of pre-computed disorder predictions. Altogether, this work guides users in their pursuit of efficiently and conveniently obtaining accurate disorder predictions and offers practical insights for the developers of disorder predictors.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00018-026-06087-3.

Keywords: Intrinsic disorder, Intrinsically disordered proteins, Deep learning, Protein language model, Prediction, Protein function

Introduction

Sequences of many proteins include one or more intrinsically disordered regions (IDRs), which are defined as sequence segments that are devoid of a well-defined equilibrium structure and which form highly dynamic ensembles of conformations [1–4]. Several bioinformatics studies demonstrated that proteins with IDRs are ubiquitous in eukaryotic species [5–9]. Their functions complement the functional repertoire of structured proteins by covering a broad spectrum of intermolecular interactions that are crucial for molecular recognition, assembly, signaling, regulation, transcription, and translation [10–19]. Recent works also suggest that IDRs are instrumental for the biogenesis of membrane-less organelles [20–23] and that they are invaluable targets for drug development efforts [24–28], further motivating research on identifying and characterizing IDR-containing proteins. Experimental data on IDRs are available across several databases that include DisProt [29], Protein Data Bank (PDB) [30], DIBS [31], FuzDB [32] and MFIB [33]. However, these data cover a small fraction of the current protein sequence space that spans over 400 million sequences in the recent release of the RefSeq resource [34]. For instance, only about 25 thousand proteins with IDRs were identified in the PDB [35] and DisProt includes about 2700 such proteins [29]. This large and ever-growing annotation gap motivates the development of computational methods that accurately predict IDRs in protein sequences [36–38]. We note that predictions of IDRs are distinct from a similarly long-standing effort in predictions of protein domains [39–42], as IDRs are defined by their structural features, while domains are defined by their underlying functions. Disorder predictors were shown to accelerate the research on the IDR-containing proteins [43–45]. The ability to predict IDRs in protein sequences stems from the observation that disorder is an inherent/intrinsic property of the amino acid sequences [4], and that specific amino acid types are substantially enriched in IDRs [46, 47].

Several surveys have discussed and analyzed intrinsic disorder predictors [36–38, 48–54]. These studies describe and classify selected collections of disorder predictors, offer a historical perspective, and identify trends in the development efforts, collectively making this active field of research more accessible. Moreover, several comparative studies were published, offering invaluable guidance on identifying the most accurate methods [35, 55–67]. Arguably the most comprehensive review, published in 2021, identified 103 disorder predictors and stressed the surge in developing methods that rely on deep neural network models [36]. Recent works investigated the impact of deep learning, showing that these predictive models are broadly more accurate than other types of predictive models [68–71]. However, these works do not cover the most recent methods (i.e., 17 methods were published in 2024 and the first half of 2025) and, correspondingly, could not investigate or assess the impact of the most recent advancements related to the application of protein language models (PLMs). Moreover, they overlooked other relevant and impactful resources that facilitate applications of disorder predictors, such as databases of pre-computed disorder predictions and meta-web servers that generate multiple disorder predictions. To this end, we offer a current, holistic, and comprehensive review of the disorder prediction field. This work does not focus on comparative assessment of disorder predictors, as this aspect was recently addressed in the two community-organized assessments in 2023 [65] and 2025 [72]. We overview the history of the disorder prediction field, identify and summarize the most complete collection of 128 predictors (25% increase over the largest previously published study), which include 14 methods that utilize PLMs and 33 that have been published since 2020, evaluate the impact of PLMs on predictive accuracy, and discuss related resources, including several prediction databases and meta-web servers.

Computational predictors of intrinsic disorder

We identified a comprehensive collection of 128 disorder predictors by analyzing recent surveys and comparative studies [36–38, 54, 58, 65], citations of these studies, information from the recently released disorder prediction portals [73–75], and by performing manual screening of relevant PubMed searches. We included all methods for which we could identify details of the predictive model (in particular, deep network architectures and PLM types, if used), including some methods not published in peer-reviewed venues. We list them, including details of their publications, predictive models, and citations, in Supplementary Table S1.

Historical overview

We summarized historical progress in the development of these methods in Fig. 1. In line with prior studies [36, 52, 53], we found that the first disorder predictor was published in 1979 [76], and only five additional methods were released until 2002 [77–80]. Figure 1 shows a steep increase in the number of predictors published between 2003 and 2015, with 75 methods and an average of 5.8 methods per year. This trend coincides with the inclusion of the disorder prediction assessment into the CASP5 (5th Critical Assessment of protein Structure Prediction) experiment in 2002 [63] and the fact that this was continued as part of the biennial CASP events until CASP10 in 2012 [59–64]. CASP is a community-organized event where methods are evaluated on blind test datasets using community-accepted metrics of predictive performance by independent assessors. This means that the test datasets are withheld from the authors of the predictors before the assessment (i.e., authors are blind to the content of these test sets) and that the assessors do not participate in the event. Arguably, this setup is more objective when contrasted with the comparative studies done by the authors of individual predictors. The first assessment in CASP5 was limited in scope, with only four methods completing a sufficient number of predictions to perform their evaluation [63]. However, the participation grew steadily over the years, culminating with 28 predictors partaking in CASP10 [62]. The disorder prediction assessment was discontinued after CASP10, primarily because CASP focused on structure prediction, so the corresponding test datasets featured relatively few IDRs. Figure 1 suggests that the subsequent lack of community-organized evaluations deflated the development efforts, resulting in just five new predictors published between 2016 and 2017, corresponding to an annual average of 2.5.

Fig. 1.


Historical overview of the disorder prediction field. The bar chart represents the number of methods in each color-coded category published in each time interval defined on the horizontal axis. The yellow callouts at the top denote the timing of the CASP and CAID events that evaluated the predictive performance of disorder predictors. The callouts at the bottom name and give a timeline for the highly cited predictors (in orange), meta-web servers that provide access to multiple disorder and disorder function predictors (in red), and databases of disorder predictions (in purple)

The disorder prediction community established a new Critical Assessment of Intrinsic Protein Disorder (CAID) event that took place for the first time in 2019 (Fig. 1). Like CASP, CAID relies on blind test datasets and community-agreed performance metrics. However, methods developed by the organizers are included in the assessment, and test datasets are composed exclusively of proteins with IDRs, which are not accessible to the participants (and thus could not be used for model training) and which were subsequently included in the DisProt database [29]. Moreover, authors are required to deposit their methods with the organizers before the event starts, and the predictions are performed and evaluated entirely by the assessors. The latter facilitates measuring and comparing runtime and independently validates that the participating methods produce predictions in a fully autonomous manner (with no human input). The inaugural CAID1 event included 32 disorder predictors, and the results were published in 2021 [58]. The subsequent CAID2 and CAID3 events were held in 2022 (38 disorder predictors) and 2024 (61 disorder predictors), with the results released in 2023 [65] and 2025 [72], via the CAID portal at https://caid.idpcentral.org/challenge/results. These assessments, combined with new advances in predictive models, including deep learning and PLMs, are the likely reasons for the recent and substantial increase in the efforts to develop new disorder predictors (Fig. 1). We identified 43 predictors published from 2018 to mid-2025, corresponding to an average of 5.7 new methods per year. Figure 1 also reveals that 17 methods were published in 2024 and 2025 alone, suggesting a strong revival of interest in developing new disorder predictors.

Recent advances in the intrinsic disorder prediction

The bars in Fig. 1 are color-coded to reflect several different types of predictive models, including classical machine learning (ML) models (e.g., support vector machine, random forest, conditional random field, nearest neighbor, and shallow neural network), non-ML models that implement scoring functions (e.g., weighted linear functions), deep learning (DL) models (i.e., deep neural networks that have multiple/many hidden layers and use more advanced types of neurons and connections compared to the shallow neural networks), and the above three model types in combination with PLMs that are typically used as predictive inputs. The underlying data is included in Supplementary Table S1. Most of the 128 disorder predictors rely on classical ML models (73/128 = 57%) while only 21/128 = 16% utilize the non-ML models. The ML models dominated the development efforts until the mid-2010s, when the trends shifted toward the DL-based models. The first DL method was released in 2013 [81], and deep neural networks have become the most popular model type by the early 2020s. Altogether, as of mid-2025, 34/128 = 27% of all disorder predictors rely on the DL-based models, and these models account for a large majority (26/33 = 79%) of the methods released since 2020. This shift toward the DL-based models was already observed in prior studies [36, 69], and can be explained by the popularity of DL across many other areas of structural bioinformatics of proteins [82–87] and the success of deep network-based predictors in the recent assessments of the intrinsic disorder predictions [55, 58, 67, 69]. We highlight the diversity of the deep network architectures utilized by modern disorder predictors (Supplementary Table S1). They cover feed-forward (e.g., flDPnn [88], NeProc [89], and flDPnn2 [90]), convolutional (e.g., AUCpred [91], PUNCH2 [92], and PredIDR [93]), recurrent (e.g., DisoMine [94], Metapredict [95], and DisoFLAG [96]), transformer (e.g., ADOPT [97], IDP-Fusion [98], and DR-BERT [99]), and hybrid (e.g., convolutional and recurrent for SPOT-Disorder2 [100], RawMSA [101], and DeepIDP-2L [102]) topologies.

We note a recent trend of adopting PLMs to develop disorder predictors (dark shaded parts of bars in Fig. 1, with the underlying data in Supplementary Table S1). PLMs are used to produce sequence-derived inputs to the predictive models, and they are typically combined with other types of inputs, such as the sequence itself and multiple sequence alignments. The first disorder predictor that relies on PLMs, SETH [103], was published in 2022, and 13 more PLM-utilizing methods have been released. Interestingly, our analysis shows that PLMs were used with each of the three major model types, including non-ML models (e.g., UdonPred), classical ML models (e.g., SETH [103] and DisPredict3 [104]), and DL models (ADOPT [97], DR-BERT [99], IDP-ELM [105], PUNCH2 [92], and flDPnn3). Moreover, Fig. 1 reveals that most disorder predictors developed since 2022 utilize PLMs (14/26 = 54%), indicating that these efforts have fueled the recent progress in this research area. Like the diversity of the DL topologies, modern disorder predictors utilize several different types of PLMs (Supplementary Table S1). A significant majority of these methods utilize precomputed PLMs, without finetuning them for disorder prediction. These methods include SETH-0 [103], SETH-1 [103], ADOPT [97], LMDisorder [106], DisPredict3 [104], DisoFLAG [96], UdonPred, flDPnn3, IDP-ELM [105], PUNCH2 [92], PUNCH2-light [92], and DisorderUnetLM [107]. They take advantage of the mainstream PLMs, such as ProtTrans-T5 [108], ESM [109], ESM-2 [110], and ProtBERT [108], and we detail which methods use which PLMs in Supplementary Table S1. Authors of two disorder predictors computed and finetuned new PLMs for the disorder prediction: IDP-BERT, which is used in the IDP-LM predictor [111]; and DR-BERT, which is utilized by the disorder predictor of the same name [99]. The IDP-LM method [111] combines the use of two precomputed PLMs (ProtTrans-T5 and ProtBERT) with the new IDP-BERT PLM. 
The new PLM relies on the BERT (Bidirectional Encoder Representation from Transformers) architecture and was trained using a relatively small dataset of 105 thousand proteins, which cover 69 thousand proteins collected from MobiDB database that include curated or derived annotations of intrinsic disorder and 36 thousand fully structured proteins that were collected from PDB. Similarly, DR-BERT also utilizes the BERT architecture; however, the training was done using a larger dataset of 6.5 million proteins randomly sampled from the UniRef90 dataset, and this model was subsequently finetuned using a small collection of about 2,400 disordered proteins from the DisProt database [99]. Essentially, authors of both disorder-tuned PLMs rely on two distinct and disorder-specific collections of training sequences to ensure that their models are specifically tuned for disorder prediction. More broadly, Supplementary Table S1 shows that numerous combinations of PLM types and predictive model architectures were explored, contributing to the recent surge in developing new disorder predictors.

Impact measured by citations

Supplementary Table S1 reports the number of citations to the articles that introduced disorder predictors, which we collected using Google Scholar in July 2025. These citation counts can be used as one of the relatively easy-to-quantify measures of impact. We provide the total number of citations and the average annual number, which we computed by dividing the total by the number of years since the publication. The yearly numbers are more suitable for comparisons between predictors. We also exclude methods published in 2024 and 2025 from this analysis, as the corresponding articles are arguably too new to have accumulated usable citation data.
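The annualized citation rate described above is a simple division, which can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that "years since the publication" is the difference between the collection year (2025) and the publication year, and the predictor names and counts are hypothetical rather than taken from Supplementary Table S1.

```python
from statistics import median

COLLECTION_YEAR = 2025  # citations were collected in July 2025


def annual_citations(total_citations: int, publication_year: int,
                     collection_year: int = COLLECTION_YEAR) -> float:
    """Average citations per year since publication (assumed counting convention)."""
    years = collection_year - publication_year
    if years <= 0:
        raise ValueError("publication year must precede the collection year")
    return total_citations / years


# Hypothetical records: name -> (total citations, publication year).
examples = {
    "predictor_A": (1200, 2005),
    "predictor_B": (90, 2020),
    "predictor_C": (300, 2015),
}
rates = {name: annual_citations(c, y) for name, (c, y) in examples.items()}
median_rate = median(rates.values())
```

Normalizing by age in this way keeps an older, heavily cited method from automatically outranking a newer one that is cited at a similar yearly pace.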

The 111 predictors published up to 2023 and included in this analysis were collectively cited nearly 42,000 times, with a median citation number of 148 (Supplementary Table S1). The median of the average annual citations is 12.3, suggesting that a typical disorder predictor was cited once per month. Moreover, we identified 18 methods with an average annual number of citations exceeding 50, which are highlighted in bold font in Supplementary Table S1. They include (chronologically by their publication date) PONDR VL-XT [80], three variants of DisEMBL [112], GlobPlot [113], DISOPRED2 [5], FoldIndex [114], two variants of IUPred [115], two variants of PONDR VSL2 [116], PrDOS [117], PONDR FIT [118], DISOPRED3 [119], two variants of IUPred2A [120], flDPnn [88], and IUPred3 [121]. These most cited tools include older methods that rely on classical ML and non-ML models (i.e., PONDR, DisEMBL, IUPred, DISOPRED, PrDOS) and a newer DL-based model (flDPnn). This analysis suggests that disorder predictors are heavily utilized, with many methods cited hundreds or thousands of times.

Predictive quality in the era of deep learning and protein language models

Our objective was to investigate whether the new advances in this field, DL and PLMs, have led to measurable improvements in predictive quality. To this end, we used the recently released data from the community-run CAID3 event to comparatively evaluate the predictive performance of methods grouped by these two key factors that define their predictive models. Specifically, we considered four groups of intrinsic disorder predictors that included those that: (1) do not use DL and do not use PLM (noDL&noPLM group); (2) use DL and do not use PLM (DL&noPLM group); (3) do not use DL and use PLM (noDL&PLM group); and (4) use DL and PLM (DL&PLM group). We did not compare or analyze methods individually since such analysis is already available in CAID2 [65], CAID3 [72], and as part of the recent release of the CAID prediction portal [73], complementing the scope of our article that focuses on the big picture of new advances. Moreover, our analysis expands and complements the 2022 study that concentrated on comparing DL-based vs. classical ML-based disorder predictors using the older data from CAID1 [69].
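The four-way grouping above depends on only two yes/no properties of each method, which a short sketch can make explicit. The `Predictor` record and `group_label` helper are hypothetical names introduced for illustration, and the flags for the four example methods reflect one reading of the surrounding text; Supplementary Table S1 remains the authoritative source for each assignment.

```python
from collections import Counter
from typing import NamedTuple


class Predictor(NamedTuple):
    name: str
    uses_dl: bool   # relies on a deep neural network
    uses_plm: bool  # uses protein language model inputs


def group_label(p: Predictor) -> str:
    """Map the two flags onto the group names used in the text."""
    dl = "DL" if p.uses_dl else "noDL"
    plm = "PLM" if p.uses_plm else "noPLM"
    return f"{dl}&{plm}"


# Illustrative assignments (assumed from the narrative, not copied from Table S1).
methods = [
    Predictor("IUPred3", uses_dl=False, uses_plm=False),
    Predictor("SPOT-Disorder2", uses_dl=True, uses_plm=False),
    Predictor("SETH", uses_dl=False, uses_plm=True),
    Predictor("ADOPT", uses_dl=True, uses_plm=True),
]
group_sizes = Counter(group_label(m) for m in methods)
```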

We analyzed publicly available results from CAID3 (https://caid.idpcentral.org/challenge/results) using the blind (unavailable to participants) Disorder-NOX dataset, which includes 204 proteins. Per the CAID organizers, this is the primary benchmarking dataset in CAID3 [72]. It relies on the experimental annotations of disorder that were generated using a variety of methods that include circular dichroism, NMR spectroscopy, cryogenic electron microscopy, and infrared spectroscopy [72], and these proteins were later deposited into the DisProt database. The Disorder-NOX dataset excludes the X-ray missing residues from the assessment, motivated by an observation that these annotations cannot be used to infer disorder for fully or substantially disordered proteins and are mainly used to train disorder prediction methods [65, 72]. The latter suggests that this exclusion reduces the potential influence of training data. This dataset encompasses a diverse collection of disordered proteins, primarily focusing on eukaryotes, but also including bacteria and viruses. The protein sizes range between short peptides and five large proteins that are over 2,000 amino acids long, with a median chain length of 345 amino acids. They also represent a broad spectrum of the disorder content (i.e., fraction of disorder in the protein sequences), ranging from nearly fully structured proteins that have a marginal number of disordered residues (< 5%) to fully disordered proteins (100% disordered), with the median amount of disorder at 23.9%. Supplementary Figs. 1 A and B show detailed histograms of the sequence length and disorder content values for the proteins in the Disorder-NOX dataset. We also emphasize that this dataset was blind to the participating predictors, i.e., these proteins were not publicly released before the evaluation, which ensures that they could not be used for model training.
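Disorder content, as defined above, is the fraction of disordered residues in a sequence. The sketch below assumes a per-residue annotation string in which `1` marks a disordered residue, `0` a structured residue, and `-` a residue excluded from the assessment; this encoding and the function name are assumptions made for illustration and may differ from the actual CAID3 reference files.

```python
from statistics import median


def disorder_content(annotation: str) -> float:
    """Fraction of assessed residues that are annotated as disordered.

    Assumed encoding: '1' = disordered, '0' = structured, '-' = not assessed.
    """
    assessed = [ch for ch in annotation if ch in "01"]
    if not assessed:
        raise ValueError("no assessed residues in annotation")
    return assessed.count("1") / len(assessed)


# Toy annotations (hypothetical, far shorter than real proteins).
toy_dataset = ["1110000000", "11--00", "1111"]
contents = [disorder_content(a) for a in toy_dataset]
median_content = median(contents)
```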

We covered all methods that participated in CAID3 and for which the predictive model was described in the publication or on the CAID portal, including the use of DL and PLMs, which is necessary to group them appropriately. Consequently, we considered 57 methods, including 18 in the noDL&noPLM group, 26 in the DL&noPLM group, 6 in the noDL&PLM group, and 7 in the DL&PLM group. We computed four metrics of predictive performance reported in the past CAID events [58, 65]. They include the Area Under the Receiver Operating Characteristic curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC), which assess performance of the predicted real-valued propensities for disorder, and the F1 score and the Matthews correlation coefficient (MCC), which quantify performance of the putative binary predictions of disorder (disordered vs. structured). The F1 and MCC values rely on the thresholds (i.e., residues with disorder propensities > threshold are predicted as disordered) that generate the correct number of disordered residues. This ensures that F1 and MCC values are calibrated across predictors and can be compared directly. We also collected the coverage (i.e., fraction of the test proteins that a given method was able to predict) and runtime values from the CAID3 website. We provided these results in Supplementary Table S2.
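The metrics above can be illustrated with a small pure-Python sketch. It interprets "the correct number of disordered residues" as a threshold chosen so that the number of residues predicted as disordered equals the number annotated as disordered; that interpretation, the function names, and the toy data are assumptions, and AUPRC is omitted for brevity. With tied scores at the threshold the counts may not match exactly.

```python
import math


def auroc(scores, labels):
    """AUROC via the rank-sum identity; tied scores receive average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tie block
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibrated_threshold(scores, labels):
    """Threshold at which 'score > threshold' selects as many residues as are disordered."""
    n_pos = sum(labels)
    ranked = sorted(scores, reverse=True)
    return ranked[n_pos] if n_pos < len(ranked) else -math.inf


def f1_and_mcc(scores, labels, threshold):
    """F1 and MCC of the binary predictions obtained with 'score > threshold'."""
    pred = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 0)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Toy per-residue propensities and annotations (1 = disordered); values are hypothetical.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
thr = calibrated_threshold(scores, labels)
f1, mcc = f1_and_mcc(scores, labels, thr)
auc = auroc(scores, labels)
```

Because every predictor is forced to call the same number of residues disordered, differences in F1 and MCC reflect how well the scores rank residues rather than how each author chose a default cutoff.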

We focused on assessing differences between these four groups of methods, and correspondingly, we evaluated the statistical significance of differences between all pairs of these groups. We tested the normality of the measured performance metrics using the Shapiro-Wilk test at a 0.05 significance level. For normal data, we used the group t-test; otherwise, we employed the Wilcoxon-Mann-Whitney test. We assumed that a given difference is statistically significant if the corresponding p-value < 0.05. We summarized these results, including the distributions of the predictive performance metrics for each of the four predictor groups and the statistical significance analysis results, in Fig. 2. We found that 30 out of the 57 methods achieved AUROC > 0.80, suggesting that many disorder predictors yield accurate results. Moreover, four methods have AUROC > 0.85 (DisoFLAG-IDR [96], DisorderUnetLM [107], flDPnn3 (unpublished), and UdonPred (unpublished)), and they all utilize PLMs.
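For the non-parametric branch of this procedure, the following is a minimal pure-Python sketch of a two-sided Wilcoxon-Mann-Whitney test. It uses the normal approximation with a tie correction and no continuity correction, which is only a rough guide for groups as small as six or seven methods; an exact test from a statistics library would normally be preferred. The normality check and the t-test branch are not shown, and the AUROC values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties averaged) and the sizes of the tie blocks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Two-sided test; returns (U statistic for x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sd_u == 0:
        return u1, 1.0
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u1, p_value


# Hypothetical AUROC values for two groups of predictors.
group_a = [0.70, 0.72, 0.75, 0.78, 0.74, 0.71]
group_b = [0.82, 0.84, 0.85, 0.86, 0.83, 0.87]
u_stat, p = mann_whitney_u(group_a, group_b)
significant = p < 0.05
```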

Fig. 2.


Comparison of the four groups of disorder predictors that include methods that do not use DL and do not use PLM (noDL&noPLM group; green box plots); use DL and do not use PLM (DL&noPLM group; blue box plots); do not use DL and use PLM (noDL&PLM group; orange box plots); and use DL and PLM (DL&PLM group; purple box plots). The box plots detail distributions of the predictive performance metrics (AUROC, AUPRC, MCC, and F1) for the predictors in each group, where we show the minimum (bottom whisker), first quartile, median (red dashed line), third quartile, and maximum (top whisker). The assessment of the statistical significance of differences in the predictive performance for each pair of groups is presented at the top of each plot, where we provide p-values and use * to denote statistically significant differences at p-value < 0.05

As expected, we found that methods that use neither DL nor PLMs secure the lowest levels of predictive quality (green box plots in Fig. 2). The predictors that rely on DL but do not use PLMs (blue box plots in Fig. 2) are more accurate, and this improvement over the methods that use neither DL nor PLMs is statistically significant for AUROC (p-value < 0.05). This aligns with recent studies that similarly showed that using DL substantially improved intrinsic disorder prediction [69–71]. More importantly, our analysis revealed that tools that utilize PLMs (orange and purple box plots in Fig. 2) are significantly more accurate than the tools that do not (blue and green box plots in Fig. 2), including methods that use DL but not PLMs (p-values < 0.05 for AUROC, AUPRC, MCC, and F1). Moreover, combining DL and PLMs (purple box plots in Fig. 2) yields a modest improvement in AUROC, F1, and MCC compared to methods that use PLMs but not DL (orange box plots in Fig. 2), although this improvement is not statistically significant (p-value > 0.20). Altogether, these results suggest that the recent introduction of PLMs into the disorder prediction field has significantly improved predictive performance.

We provide additional context for this analysis in Fig. 3, which visualizes relationships between runtime, coverage, and predictive performance measured with AUROC. The corresponding raw values are reported in Supplementary Table S2. We include three methods with the highest AUROC values in each of the four groups, which we color-coded in Fig. 2, for which runtime and coverage are reported in CAID3. This analysis reveals that predictors vary widely in their prediction speed, from very fast methods that require under 1 s to predict a protein (ESpritz-D and IUPred3), to tools that are two orders of magnitude slower (rawMSA, UdonPred, DisPredict3, DisoFLAG-IDR, and DisorderUnetLM), and to methods that require over 1000 s to process a protein (SPOT-Disorder2). Moreover, while most of these 12 accurate methods were able to predict all test proteins (coverage of 100%), some were unable to complete predictions for a sizeable fraction of proteins, such as SPOT-Disorder2 and DisPredict3, which have coverage ≤ 95%. This is typically due to a limit on the maximum length of the input protein sequence. This analysis suggests that selecting a suitable disorder predictor should consider multiple factors, including predictive performance, runtime, and ability to secure predictions for all proteins. We note that different methods excel in various aspects, with no single tool being the most accurate, fastest, and providing the highest coverage. Instead, users have many options to consider, including extremely fast, relatively accurate, and high-coverage tools (ESpritz-D and IUPred3, both with AUROC of 0.81, ≥ 99% coverage, and runtime < 1 s per protein); relatively fast and very accurate methods with 100% coverage (flDPnn2 and flDPnn3 with AUROC of 0.83 and 0.86 and runtime of 15 and 33 s per protein, respectively); and the most accurate method that is slower and has more limited coverage (DisoFLAG-IDR with AUROC of 0.87, runtime of 151 s per protein, and 98% coverage).
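The trade-off described above can be sketched as a constrained selection: among the methods that satisfy a user's runtime and coverage requirements, pick the one with the highest AUROC. The AUROC, runtime, and coverage values below are those quoted in the text, with two stated simplifications: "runtime < 1 s" is entered as an upper bound of 1.0 s and "≥ 99% coverage" as a lower bound of 99%. The `Candidate` record and `pick` helper are hypothetical names.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    auroc: float
    runtime_s: float     # seconds per protein
    coverage_pct: float  # percent of test proteins predicted


CANDIDATES = [
    Candidate("ESpritz-D", 0.81, 1.0, 99.0),    # text: < 1 s, >= 99% (bounds used)
    Candidate("IUPred3", 0.81, 1.0, 99.0),      # text: < 1 s, >= 99% (bounds used)
    Candidate("flDPnn2", 0.83, 15.0, 100.0),
    Candidate("flDPnn3", 0.86, 33.0, 100.0),
    Candidate("DisoFLAG-IDR", 0.87, 151.0, 98.0),
]


def pick(candidates, max_runtime_s: float = float("inf"),
         min_coverage_pct: float = 0.0) -> Optional[Candidate]:
    """Most accurate candidate that meets the runtime and coverage constraints."""
    eligible = [c for c in candidates
                if c.runtime_s <= max_runtime_s and c.coverage_pct >= min_coverage_pct]
    return max(eligible, key=lambda c: c.auroc, default=None)


most_accurate = pick(CANDIDATES)
full_coverage_under_a_minute = pick(CANDIDATES, max_runtime_s=60, min_coverage_pct=100)
fastest_tier = pick(CANDIDATES, max_runtime_s=5)
```

Different constraints return different tools, which mirrors the observation that no single method is simultaneously the most accurate, the fastest, and the most complete.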

Fig. 3.


AUROC, runtime, and coverage for the three predictors from each of the four color-coded groups (noDL&noPLM in green, DL&noPLM in blue, noDL&PLM in orange, and DL&PLM in purple) that secured the highest AUROC values and for which runtime and coverage were reported in CAID3. The runtime is shown in the base 10 logarithmic scale on the y-axis. The coverage, which quantifies the fraction of the test proteins a given method could predict, is reported inside the callouts as the C value

Resources that facilitate the collection of disorder predictions

Besides the availability of numerous disorder predictors, many methods that predict specific cellular functions of intrinsic disorder have also been published. They focus on identifying disordered linker regions and IDRs that interact with various types of ligands. A few representative examples include DFLpred [122], APOD [123], TransDFL [124], and DisoFLAG [96] that predict disordered linkers; ANCHOR [125], MoRFpred [126], MoRFChibi [127], OPAL [128], and ANCHOR2 [120] that predict protein/peptide binding; DisoRDPbind [129, 130], CLIP [131], DeepDISObind [132], and DRPBind [133] that predict IDRs that interact with proteins, DNA, and RNA; bindEmbed21IDR [134] that targets interactions with metal ions, nucleic acids, and small molecules; DisoFLAG [96] that focuses on the protein, DNA, RNA, ion, and lipid binding IDRs; and DisoLipPred [135], CoMemMoRFPred [136], MemDis [137], and pLMMoRF [138] that predict IDRs that interact with lipids. More details concerning the disorder function prediction are available in several recently published surveys [139–143]. The impact of the disorder and disorder function predictors is boosted by the availability of resources that facilitate and ease access to these results. These resources include meta-web servers that produce multiple disorder-related predictions and databases of pre-computed disorder predictions.

The web servers are particularly suitable for users who perform predictions in an ad hoc manner and have no equipment and/or ability to install and run predictors locally. The predictions via the web servers are made on the server side and do not require software installation on the user’s end, making these resources very convenient. However, these servers are typically limited to running one or a few proteins at a time and could be busy running jobs for other users. A particular benefit of the meta-web servers is the ability to obtain and check multiple results for convergence to increase confidence in the resulting disorder prediction. This is supported by studies that empirically demonstrated that combining multiple disorder predictions (i.e., consensus prediction) typically leads to an improved predictive performance when compared to using predictors individually [35, 74, 144–146]. We identified five operational meta-web servers as of July 2025, when we tested them, that offer access to at least four disorder and/or disorder function predictions. They include MFDp [147–149], MetaDisorder [150], DisMeta [151], DEPICTER2 [74, 75], and the CAID prediction portal [73]. We provide their URLs and summarize their scope in Table 1. The three older meta servers, MFDp, MetaDisorder, and DisMeta, are limited to older disorder predictors and lack disorder function predictions, and as such are arguably less practical than the newer DEPICTER2 and CAID portal resources. The two newer resources provide complementary capabilities. DEPICTER2 produces disorder predictions with flDPnn [88], which was ranked as the top performer in CAID1 [152], and provides a broad coverage of the disorder function predictions generated by five popular methods: MoRFChibi [127] (peptide binding), ANCHOR2 [120] (protein binding), DisoRDPbind [129, 130, 153] (RNA and DNA binding), DisoLipPred [135] (lipid-binding), and DFLpred [122] (disordered linkers). These predictors are runtime-efficient, except for DisoLipPred, and correspondingly, this meta server is fast, facilitating batch predictions of up to 25 proteins when excluding the slow DisoLipPred predictor. The CAID prediction portal covers many predictors, including over 30 disorder predictors and a dozen predictors of disorder functions; Table 1 details these methods. While the number and scope of the included predictors are significantly larger compared to DEPICTER2, the CAID portal does not support batch predictions and utilizes a “CPU credit” system that further limits the ability to predict many proteins. To compare, DEPICTER2 offers a more limited number of methods but facilitates faster predictions for larger collections of proteins.
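Checking several predictions for convergence, as described above, can be sketched with a simple per-residue consensus. The example averages the propensities from several predictors and reports the fraction of predictors that agree on the binary call at each residue. This is an illustrative assumption and not the scheme used by any of the cited meta-predictors; it also assumes that all scores lie on a comparable 0 to 1 scale with a shared 0.5 cutoff, and the predictor names and values are hypothetical.

```python
def consensus(propensities, threshold=0.5):
    """Return (mean propensity, agreement fraction) for each residue.

    propensities: dict mapping predictor name -> list of per-residue scores.
    """
    lengths = {len(v) for v in propensities.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same non-empty set of residues")
    n = lengths.pop()
    k = len(propensities)
    columns = [[v[i] for v in propensities.values()] for i in range(n)]
    mean = [sum(col) / k for col in columns]
    votes = [sum(1 for s in col if s > threshold) for col in columns]
    # Agreement is the share of predictors on the majority side of the cutoff.
    agreement = [max(v, k - v) / k for v in votes]
    return mean, agreement


# Hypothetical outputs of three predictors for a three-residue peptide.
predictions = {
    "predictor_A": [0.9, 0.6, 0.2],
    "predictor_B": [0.8, 0.4, 0.1],
    "predictor_C": [0.7, 0.7, 0.3],
}
mean_scores, agreement = consensus(predictions)
```

Residues where the agreement is low (the second residue in this toy case) are the ones where a user may want to inspect the individual predictions before drawing conclusions.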

Table 1.

Resources that provide access to at least four disorder and disorder function predictions include databases of pre-computed and searchable predictions and meta-web servers that provide on-demand predictions. The databases and meta-web servers are sorted in chronological order of publication

| Type | Name | First published | Refs | Size [millions of proteins] | Disorder predictors included | Other disorder-related predictions included | URL |
|---|---|---|---|---|---|---|---|
| Databases | MobiDB ver. 6.1 | Aug. 2012 | [154–158] | 245.62 | MobiDB-lite [145], DisEMBL [112], ESpritz [159], IUPred [160], GlobPlot [112], and AF2-disorder [70] | Disordered protein-binding by ANCHOR [125]; Linear interacting peptides (LIPs) by FLIPPER [161]; Low complexity regions by SEG [162] | https://mobidb.org/ |
| Databases | D2P2 ver. 1.0 | Jan. 2013 | [163] | 10.43 | PONDR VL-XT [80], PONDR VSL2 [116, 164], PrDOS [165], PV2 [166], ESpritz [159], IUPred [115] | Disordered protein-binding residues by ANCHOR [125] | https://d2p2.pro/ |
| Databases | DescribePROT ver. 2.3 | Jan. 2021 | [167–169] | 2.28 | flDPnn [88] | Disordered DNA- and RNA-binding by DisoRDPbind [129, 130, 153]; Disordered protein-binding by DisoRDPbind [129, 130, 153] and MoRFChibi [127]; Disordered linkers by DFLpred [122] | http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/ |
| Databases | DisEnrichDB ver. 1.0 | Jan. 2022 | [170] | 0.08 | DISOPRED [171], IUPred2 [120], SPOT-Disorder [172] | Compositionally biased regions by fLPS [173] | http://prodata.swmed.edu/DisEnrichDB/ |
| Meta web servers | MFDp | Sept. 2010 | [147–149] | N/A | DISOclust [174], DISOPRED [171, 175], IUPred [115, 160], and MFDp [147] | None | http://biomine.cs.vcu.edu/servers/MFDp/ |
| Meta web servers | MetaDisorder | May 2012 | [150] | N/A | DisEMBL [112], DISOPRED2 [5], DISpro [176, 177], GlobPlot [113], iPDA [178], IUPred [115, 160], Pdisorder, POODLE-S [179], POODLE-L [180], PrDOS [117], Spritz [181], RONN [182], GSmetaserver [150, 183] | None | http://iimcb.genesilico.pl/metadisorder/ |
| Meta web servers | DisMeta | Oct. 2013 | [151] | N/A | DISOPRED2 [5], PONDR VSL2 [116], DisMeta [151], and SEG (low complexity regions) [162] | None | http://montelionelab.chem.rpi.edu/dismeta/ |
| Meta web servers | DEPICTER2 | May 2023 | [74, 75] | N/A | flDPnn [88] | Disordered protein-binding by MoRFChibi [127] and ANCHOR2 [120]; Disordered DNA- and RNA-binding by DisoRDPbind [129, 130, 153]; Disordered linkers by DFLpred [122]; Disordered lipid-binding by DisoLipPred [135] | http://biomine.cs.vcu.edu/servers/DEPICTER2/ |
| Meta web servers | CAID prediction portal | May 2023 | [73] | N/A | AIUPred [184], AUCPred [91], DISOPRED3 [119], DeepIDP-2L [102], DisEMBL [112], DisoMine [102], DisPredict2 [91], DisPredict3 [104], ESpritz [159], flDPnn [88], flDPnn2 [90], FoldUnfold [185], IDP-Fusion [98], IsUnstruct [186], IUPred3 [121], Metapredict [95], MobiDB-lite [145], PredIDR [93], PreDisorder [187], pyHCA [188], rawMSA [101], RONN [182], s2D-2 [189], SETH [103], SPOT-Disorder [172], SPOT-Disorder-Single [190], SPOT-Disorder2 [100], PONDR VSL2 [116] | Disordered linkers by DFLpred [122] and APOD [123]; Disordered protein-binding by MoRFChibi [127], ANCHOR2 [120], OPAL [128], DisoRDPbind [129, 130, 153], DeepDISObind [132], and DRPBind [133]; Disordered DNA- and RNA-binding by DisoRDPbind [129, 130, 153], DeepDISObind [132], and DRPBind [133]; Binding residues by bindEmbed21IDR [134] | https://caid.idpcentral.org/portal |

Obtaining disorder predictions can be time-consuming and challenging, especially when using individual predictors instead of the meta-web servers. Users must identify suitable predictors, navigate multiple servers and/or install several pieces of software, provide input data (sequences and identifiers) in different formats across different servers/software, and collect and standardize the outputs across various formats. The prediction itself can require substantial runtime, as much as several minutes per protein [58], which is particularly challenging when predicting large datasets, as is frequently done in the literature [5–7, 9, 191–197]. Moreover, different users may request predictions for the same protein(s) from the same method, which leads to wasteful duplication of predictions. These problems are addressed by databases of pre-computed disorder predictions, which facilitate rapid retrieval of predictions generated by multiple methods. This solution eliminates the long prediction runtime, the need to assemble predictions, and duplicate predictions. However, databases are limited to the collection of proteins they include, and users must still utilize predictors and/or a meta-web server to predict excluded and/or novel protein sequences. We found four databases of disorder predictions that were operational as of July 2025: D2P2 [163], MobiDB [154–158], DescribePROT [167–169], and DisEnrichDB [170], which we summarize in Table 1. Below, we briefly discuss their key characteristics.

MobiDB was initially published in 2012 and has been frequently updated since, with the newest release, 6.1, deployed in July 2024 [158]. It is almost exclusively focused on intrinsic disorder and covers the largest number of proteins, at roughly 245.6 million. MobiDB provides access to predictions generated by six disorder predictors and three disorder function predictors that target protein-binding IDRs and linear interacting peptides (LIPs) [198], and includes consensus disorder predictions produced by the MobiDB-lite algorithm [145]. It also provides access to experimental data on disorder that it draws from multiple sources: DIBS [31], DisProt [29], ELM [199], FuzDB [32], IDEAL [200], MFIB [201], PDBe [202], PhasePro [203], and UniProt [204].

D2P2 was released in 2013 and has not been updated since. It includes data for 10.4 million proteins, and, like MobiDB, it is centered on intrinsic disorder. It covers predictions generated by six disorder predictors, consensus disorder predictions based on the 75% consensus approach (i.e., a residue is predicted disordered if at least 75% of methods predict it as disordered), predictions of protein-binding IDRs, and predictions of protein domains and posttranslational modification sites by SUPERFAMILY [205] and PhosphoSitePlus [206], respectively.
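The 75% consensus rule described above can be sketched in a few lines of Python. This is a minimal illustration that assumes each predictor outputs one binary label per residue (1 for disordered, 0 for ordered); the function name `consensus_disorder` and the toy inputs are hypothetical, and the sketch does not reproduce the D2P2 implementation.

```python
from typing import List, Sequence


def consensus_disorder(
    predictions: Sequence[Sequence[int]], threshold: float = 0.75
) -> List[int]:
    """Label a residue as disordered (1) when at least `threshold` of the
    predictors label it as disordered; otherwise label it ordered (0).

    `predictions` holds one binary sequence per predictor, all of equal length.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same residues")

    n_methods = len(predictions)
    consensus = []
    for i in range(length):
        votes = sum(p[i] for p in predictions)  # predictors voting 'disordered'
        consensus.append(1 if votes >= threshold * n_methods else 0)
    return consensus


if __name__ == "__main__":
    # Toy example: four hypothetical predictors, five residues.
    toy = [
        [1, 1, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [1, 1, 1, 0, 1],
        [1, 0, 1, 0, 0],
    ]
    print(consensus_disorder(toy))  # [1, 1, 1, 0, 0]
```

Under this reading of the rule, a consensus over six predictors needs at least five votes per residue, since four of six (about 67%) falls below the 75% threshold.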

DescribePROT was first published in 2021 and has undergone multiple updates since, with the most recent version, 2.3, released in December 2024 [168]. In contrast to MobiDB and D2P2, it has a broader scope that covers intrinsic disorder, structure, and function, but for a smaller collection of 2.3 million proteins (i.e., 273 complete proteomes of popular/model organisms). It includes results generated by one disorder predictor, three disorder function predictors that focus on disordered linkers and protein, DNA, and RNA binding, and seven methods that predict protein and nucleic acid binding in structured regions, solvent accessibility, secondary structures, signal peptides, and posttranslational modification sites. Altogether, it provides access to 21.3 billion predictions. It also offers experimental annotations of intrinsic disorder, secondary structure, and solvent accessibility that were extracted from PDB [207] and DisProt [29].

DisEnrichDB is the newest database, released in 2022 with no subsequent updates so far [170]. It is singularly dedicated to intrinsic disorder, focusing on disseminating IDRs enriched in different combinations of amino acids for 80 thousand human proteins. Correspondingly, it provides disorder predictions generated by three methods and consensus disorder prediction from MobiDB, and it also annotates compositionally biased IDRs.

We note that these four databases provide options to collect data for individual proteins and download results for whole proteomes, facilitating analysis at diverse scales. For example, DescribePROT offers access to the residue-level predictions and protein-level summaries at the entire proteome level. Altogether, these databases vary in size, frequency of updates, scope of included predictions, and inclusion of experimental data, each providing valuable and complementary information.
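As an illustration of how residue-level predictions relate to protein-level summaries, the sketch below derives three simple quantities (disorder content, number of IDRs, and length of the longest IDR) from a string of binary per-residue labels. The input format, the function names, and the choice of summary statistics are assumptions made for this example and do not mirror the file formats or fields of DescribePROT or any other database.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def disordered_regions(labels: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of each run of '1'."""
    regions = []
    position = 1
    for value, run in groupby(labels):
        run_length = len(list(run))
        if value == "1":
            regions.append((position, position + run_length - 1))
        position += run_length
    return regions


def summarize(labels: str) -> Dict[str, float]:
    """Collapse per-residue labels ('1' disordered, '0' ordered) into
    protein-level summary values."""
    if not labels:
        raise ValueError("empty prediction")
    regions = disordered_regions(labels)
    lengths = [end - start + 1 for start, end in regions]
    return {
        "length": len(labels),
        "disorder_content": sum(lengths) / len(labels),  # fraction of residues
        "num_idrs": len(regions),
        "longest_idr": max(lengths, default=0),
    }


if __name__ == "__main__":
    # Toy 13-residue prediction with two putative IDRs.
    print(disordered_regions("1110000111110"))  # [(1, 3), (8, 12)]
    print(summarize("1110000111110"))
```

Applying `summarize` to every protein in a downloaded proteome yields the kind of protein-level table that can be analyzed alongside the residue-level data.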

Conclusions and discussion

The intrinsic disorder prediction field has delivered over 120 methods. We catalogued a comprehensive collection of these predictive tools, highlighted historical development trends, summarized their predictive models, discussed their impact, and evaluated the effects of the applications of DL and PLMs on predictive performance. We identified a recent surge in method development efforts, which likely stems from the introduction of the CAID events and the availability of new DL and PLM technologies that advance the predictive performance of disorder predictors. This resulted in the release of 17 methods in 2024 and the first half of 2025, demonstrating that this field of research is highly active and deserves attention. The latter claim is strengthened by our analysis, which reveals that many disorder predictors are highly cited, with 18 tools cited on average over 50 times per year.

Empirical analysis of results from the CAID3 events revealed that many disorder predictors are highly accurate, and that methods that apply PLMs outperform tools that do not utilize this technology. PLMs are particularly beneficial when combined with deep neural network models. While past studies have revealed that the use of DL has led to an increase in predictive quality [69–71], we found that adding PLMs provides an additional and substantial boost, resulting in a new generation of very accurate predictors. Unlike previously used approaches that derive inputs directly from protein sequences, PLMs are trained on vast datasets comprising dozens to hundreds of millions of sequences. They provide high-dimensional (hundreds to thousands of values per amino acid) and rich contextual representations that complement the sequence-derived information extracted by the older predictors, which is why their use leads to substantial improvements.

We believe that further advances in disorder prediction could be produced by designs that specialize in addressing specific disorder types (flavors) [208, 209]. For instance, based on differences in their conformational space, proteins with IDRs are categorized into native coils, native pre-molten globules, and native molten globules [11]. Moreover, IDRs differ in length and localization in the sequence, where short IDRs, typically located at a sequence terminus [204], are distinct from long IDRs that can cover the whole sequence length [210, 211]. A disorder-flavor-specific approach contrasts with the current designs that attempt to model all intrinsic disorder flavors together. We believe that simultaneous tailoring of network topologies and predictive inputs, including PLMs, should provide enough flexibility to produce models that accurately predict individual flavors of disorder. One of the key challenges will be detecting the disorder flavor of an input protein so that it can be matched to the corresponding model. However, such matching was already developed in the context of the prediction of protein binding [212] and DNA binding [213], where the most suitable predictor was identified for a given input protein.

This survey also highlights that selecting a suitable disorder predictor should consider multiple factors, including predictive performance, speed, and coverage. We did not find a silver bullet, i.e., a predictor that is the fastest, most accurate, and has the highest coverage, and correspondingly, users should prioritize their selection based on the most desired characteristics. This survey, the CAID1 and CAID2 articles [58, 65], and a recently published disorder prediction tutorial [37], offer practical insights that should help users to strike the right balance between runtime, performance, and coverage.

We also introduce and discuss related resources that facilitate the convenient collection of disorder predictions, including the meta-web servers and databases of pre-computed disorder predictions. These resources ease access for the end users and consequently boost the impact of disorder predictors. The meta-web servers expedite the collection of multiple disorder and disorder function predictions. We highlighted two modern and comprehensive meta-web servers, DEPICTER2 and the CAID prediction portal. These provide complementary services, i.e., smaller coverage but larger throughput for DEPICTER2 vs. much larger coverage but more limited throughput for the CAID portal. We also discussed four databases that offer near-instantaneous access to pre-computed disorder and disorder function predictions. They differ in the number of proteins included, the breadth of the coverage of disorder functions, and the inclusion of other aspects, such as information on protein structure and function. These resources are particularly valuable when analyzing intrinsic disorder in large protein datasets and when analyzing disorder in the context of other structural and functional features. Altogether, we show that end users now have access to a variety of readily available options to efficiently and conveniently obtain accurate disorder predictions.

Lastly, several studies investigated the potential use of AlphaFold2 [214], which has revolutionized protein structure prediction, in the context of intrinsic disorder prediction [68, 70, 71, 215]. They found that while AlphaFold2 can accurately identify IDRs, it is outperformed by modern disorder predictors. Moreover, the authors of AlphaFold3 note that their model generates “spurious structural order (hallucinations) in disordered regions” [216], suggesting that their tool is not meant to predict IDRs. One reason for this could be that AlphaFold3 was trained on ground truth annotations collected exclusively from PDB [216], which primarily focuses on well-structured globular proteins and offers limited coverage of disordered proteins; these are better covered in other repositories, such as DisProt, DIBS, and FuzDB. This implies that the protein structure and the intrinsic disorder prediction areas require different solutions, and it further justifies the recent spike in the development of disorder predictors.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (199KB, docx)

Acknowledgements

This research was funded in part by the National Science Foundation (grants 2125218 and 2146027) and the Robert J. Mattauch Endowment funds to L.K., the National Natural Science Foundation of China (grants 12326611, 32570776 and 92370128), and the Tianjin Science and Technology Program (grant 24ZXZSSS00320) to K.W. and G.H.

Authors’ contributions

Conceptualization: Lukasz Kurgan; Data curation: Lukasz Kurgan, Kui Wang, Gang Hu; Formal analysis: Lukasz Kurgan and Jing Yu; Funding acquisition: Lukasz Kurgan, Kui Wang, Gang Hu; Investigation: Lukasz Kurgan, Kui Wang, Gang Hu, Jing Yu; Project administration: Lukasz Kurgan; Visualization: Lukasz Kurgan, Kui Wang, Gang Hu; Writing – original draft: Lukasz Kurgan.

Data availability

Data underlying this study are available in the Supplement.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Conflict of interest

The authors declare no conflicts of interest.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Lieutaud P et al (2016) How disordered is my protein and what is its disorder for? A guide through the dark side of the protein universe. Intrinsically Disord Proteins 4(1):e1259708
  • 2. Oldfield CJ (2019) Introduction to intrinsically disordered proteins and regions. Intrinsically Disordered Proteins: Dynamics, Binding, and Function. pp 1–34
  • 3. Habchi J et al (2014) Introducing protein intrinsic disorder. Chem Rev 114(13):6561–6588
  • 4. Dunker AK et al (2013) What’s in a name? Why these proteins are intrinsically disordered. Intrinsically Disord Proteins 1(1):e24157
  • 5. Ward JJ et al (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337(3):635–645
  • 6. Xue B, Dunker AK, Uversky VN (2012) Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life. J Biomol Struct Dyn 30(2):137–149
  • 7. Peng Z et al (2015) Exceptionally abundant exceptions: comprehensive characterization of intrinsic disorder in all domains of life. Cell Mol Life Sci 72(1):137–151
  • 8. Yan J et al (2016) Molecular recognition features (MoRFs) in three domains of life. Mol Biosyst 12(3):697–710
  • 9. Zhao B et al (2020) IDPology of the living cell: intrinsic disorder in the subcellular compartments of the human cell. Cell Mol Life Sci. 10.1007/s00018-020-03654-0
  • 10. Oldfield CJ, Dunker AK (2014) Intrinsically disordered proteins and intrinsically disordered protein regions. Annu Rev Biochem 83:553–584
  • 11. Uversky VN (2013) Unusual biophysics of intrinsically disordered proteins. Biochim Biophys Acta 1834(5):932–951
  • 12. Peng Z et al (2014) A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome. Cell Mol Life Sci 71(8):1477–1504
  • 13. Peng ZL et al (2012) More than just tails: intrinsic disorder in histone proteins. Mol Biosyst 8(7):1886–1901
  • 14. Staby L et al (2017) Eukaryotic transcription factors: paradigms of protein intrinsic disorder. Biochem J 474(15):2509–2532
  • 15. Zhou JH, Zhao SW, Dunker AK (2018) Intrinsically disordered proteins link alternative splicing and post-translational modifications to complex cell signaling and regulation. J Mol Biol 430(16):2342–2359
  • 16. Tantos A, Han KH, Tompa P (2012) Intrinsic disorder in cell signaling and gene transcription. Mol Cell Endocrinol 348(2):457–465
  • 17. Zhao B et al (2021) Intrinsic disorder in human RNA-binding proteins. J Mol Biol 433(21):167229
  • 18. Wu Z et al (2015) In various protein complexes, disordered protomers have large per-residue surface areas and area of protein-, DNA- and RNA-binding interfaces. FEBS Lett 589(19 Pt A):2561–2569
  • 19. Fuxreiter M et al (2014) Disordered proteinaceous machines. Chem Rev 114(13):6806–6843
  • 20. Uversky VN (2024) Functional unfoldomics: roles of intrinsic disorder in protein (multi)functionality. Adv Protein Chem Struct Biol 138:179–210
  • 21. Ibrahim AY et al (2023) Intrinsically disordered regions that drive phase separation form a robustly distinct protein class. J Biol Chem 299(1):102801
  • 22. Fonin AV et al (2022) Biological soft matter: intrinsically disordered proteins in liquid-liquid phase separation and biomolecular condensates. Essays Biochem 66(7):831–847
  • 23. Darling AL et al (2018) Intrinsically disordered proteome of human membrane-less organelles. Proteomics 18(5–6):e1700193
  • 24. Qin C et al (2025) Current perspectives in drug targeting intrinsically disordered proteins and biomolecular condensates. BMC Biol 23(1):118
  • 25. Dhar A, Sisk TR, Robustelli P (2025) Ensemble docking for intrinsically disordered proteins. bioRxiv
  • 26. Uversky VN (2024) How to drug a cloud? Targeting intrinsically disordered proteins. Pharmacol Rev
  • 27. Hosoya Y, Ohkanda J (2021) Intrinsically disordered proteins as regulators of transient biological processes and as untapped drug targets. Molecules. 10.3390/molecules26082118
  • 28. Hu G et al (2016) Untapped potential of disordered proteins in current druggable human proteome. Curr Drug Targets 17(10):1198–1205
  • 29. Aspromonte MC et al (2024) DisProt in 2024: improving function annotation of intrinsically disordered proteins. Nucleic Acids Res 52(D1):D434–D441
  • 30. Le Gall T et al (2007) Intrinsic disorder in the Protein Data Bank. J Biomol Struct Dyn 24(4):325–342
  • 31. Schad E et al (2018) DIBS: a repository of disordered binding sites mediating interactions with ordered proteins. Bioinformatics 34(3):535–537
  • 32. Hatos A et al (2022) FuzDB: a new phase in understanding fuzzy interactions. Nucleic Acids Res 50(D1):D509–D517
  • 33. Ficho E et al (2017) MFIB: a repository of protein complexes with mutual folding induced by binding. Bioinformatics 33(22):3682–3684
  • 34. Goldfarb T et al (2025) NCBI RefSeq: reference sequence standards through 25 years of curation and annotation. Nucleic Acids Res 53(D1):D243–D257
  • 35. Walsh I et al (2015) Comprehensive large-scale assessment of intrinsic protein disorder. Bioinformatics 31(2):201–208
  • 36. Zhao B, Kurgan L (2021) Surveying over 100 predictors of intrinsic disorder in proteins. Expert Rev Proteom, pp 1–11
  • 37. Kurgan L et al (2023) Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins. Nat Protoc 18(11):3157–3172
  • 38. Erdos G, Dosztanyi Z (2024) Deep learning for intrinsically disordered proteins: from improved predictions to deciphering conformational ensembles. Curr Opin Struct Biol 89:102950
  • 39. Wang Y et al (2017) ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly. Nucleic Acids Res 45(W1):W400–W407
  • 40. Xue Z et al (2015) Extending protein domain boundary predictors to detect discontinuous domains. PLoS ONE 10(10):e0141541
  • 41. Xue Z et al (2013) ThreaDom: extracting protein domain boundary information from multiple threading alignments. Bioinformatics 29(13):i247–i256
  • 42. Shi Q et al (2019) DNN-Dom: predicting protein domain boundary from sequence alone by deep neural network. Bioinformatics 35(24):5128–5136
  • 43. Kurgan L et al (2020) On the importance of computational biology and bioinformatics to the origins and rapid progression of the intrinsically disordered proteins field. In: Biocomputing 2020, pp 149–158
  • 44. Kurgan L, Li M, Li Y (2021) The methods and tools for intrinsic disorder prediction and their application to systems medicine. In: Wolkenhauer O (ed) Systems medicine. Academic, Oxford, pp 159–169
  • 45. Deng X et al (2015) An overview of practical applications of protein disorder prediction and drive for faster, more accurate predictions. Int J Mol Sci 16(7):15384–15404
  • 46. Zhao B, Kurgan L (2022) Compositional bias of intrinsically disordered proteins and regions and their predictions. Biomolecules. 10.3390/biom12070888
  • 47. Campen A et al (2008) TOP-IDP-scale: a new amino acid scale measuring propensity for intrinsic disorder. Protein Pept Lett 15(9):956–963
  • 48. Deng X, Eickholt J, Cheng J (2012) A comprehensive overview of computational protein disorder prediction methods. Mol Biosyst 8(1):114–121
  • 49. Li J et al (2015) An overview of predictors for intrinsically disordered proteins over 2010–2014. Int J Mol Sci 16(10):23446–23462
  • 50. Liu Y, Wang X, Liu B (2019) A comprehensive review and comparison of existing computational methods for intrinsically disordered protein and region prediction. Brief Bioinform 20(1):330–346
  • 51. Meng F, Uversky V, Kurgan L (2017) Computational prediction of intrinsic disorder in proteins. Curr Protoc Protein Sci 88(1–2):1614
  • 52. Meng F, Uversky VN, Kurgan L (2017) Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 74(17):3069–3090
  • 53. He B et al (2009) Predicting intrinsic disorder in proteins: an overview. Cell Res 19(8):929–949
  • 54. Uversky VN, Kurgan L (2023) Overview update: computational prediction of intrinsic disorder in proteins. Curr Protoc 3(6):e802
  • 55. Katuwawala A, Kurgan L (2020) Comparative assessment of intrinsic disorder predictions with a focus on protein and nucleic acid-binding proteins. Biomolecules. 10.3390/biom10121636
  • 56. Necci M et al (2018) A comprehensive assessment of long intrinsic protein disorder from the DisProt database. Bioinformatics 34(3):445–452
  • 57. Peng ZL, Kurgan L (2012) Comprehensive comparative assessment of in-silico predictors of disordered regions. Curr Protein Pept Sci 13(1):6–18
  • 58. Necci M et al (2021) Critical assessment of protein intrinsic disorder prediction. Nat Methods 18(5):472–481
  • 59. Jin Y, Dunbrack RL Jr (2005) Assessment of disorder predictions in CASP6. Proteins 61(Suppl 7):167–175
  • 60. Bordoli L, Kiefer F, Schwede T (2007) Assessment of disorder predictions in CASP7. Proteins 69(Suppl 8):129–136
  • 61. Noivirt-Brik O, Prilusky J, Sussman JL (2009) Assessment of disorder predictions in CASP8. Proteins 77:210–216
  • 62. Monastyrskyy B et al (2014) Assessment of protein disorder region predictions in CASP10. Proteins 82(Suppl 2):127–137
  • 63. Melamud E, Moult J (2003) Evaluation of disorder predictions in CASP5. Proteins 53:561–565
  • 64. Monastyrskyy B et al (2011) Evaluation of disorder predictions in CASP9. Proteins 79:107–118
  • 65. Conte AD et al (2023) Critical assessment of protein intrinsic disorder prediction (CAID) - results of round 2. Proteins. 10.1002/prot.26582
  • 66. Zhang F, Kurgan L (2025) Evaluation of predictions of disordered binding regions in the CAID2 experiment. Comput Struct Biotechnol J 27:78–88
  • 67. Wang K et al (2024) Assessment of disordered linker predictions in the CAID2 experiment. Biomolecules. 10.3390/biom14030287
  • 68. Zhao B, Ghadermarzi S, Kurgan L (2023) Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. Comput Struct Biotechnol J 21:3248–3258
  • 69. Zhao B, Kurgan L (2022) Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 20:1286–1294
  • 70. Piovesan D, Monzon AM, Tosatto SCE (2022) Intrinsic protein disorder and conditional folding in AlphaFoldDB. Protein Sci 31(11):e4466
  • 71. Wilson CJ, Choy WY, Karttunen M (2022) AlphaFold2: a role for disordered protein/region prediction? Int J Mol Sci 23(9):4591
  • 72. Mehdiabadi M et al (2025) Critical assessment of protein intrinsic disorder round 3 - predicting disorder in the era of protein language models. Proteins
  • 73. Del Conte A et al (2023) CAID prediction portal: a comprehensive service for predicting intrinsic disorder and binding regions in proteins. Nucleic Acids Res 51(W1):W62–W69
  • 74. Barik A et al (2020) DEPICTER: intrinsic disorder and disorder function prediction server. J Mol Biol 432(11):3379–3387
  • 75. Basu S, Gsponer J, Kurgan L (2023) DEPICTER2: a comprehensive webserver for intrinsic disorder and disorder function prediction. Nucleic Acids Res. 10.1093/nar/gkad330
  • 76. Williams RJ (1979) The conformation properties of proteins in solution. Biol Rev Camb Philos Soc 54(4):389–437
  • 77. Wootton JC (1994) Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput Chem 18(3):269–285
  • 78. Romero P et al (1997) Identifying disordered regions in proteins from amino acid sequence. IEEE International Conference on Neural Networks, Vols 1–4, pp 90–95
  • 79. Romero O, Dunker K (1997) Sequence data analysis for long disordered regions prediction in the calcineurin family. Genome Inf Ser Workshop Genome Inf 8:110–124
  • 80. Romero P et al (2001) Sequence complexity of disordered protein. Proteins 42(1):38–48
  • 81.Eickholt J, Cheng J (2013) DNdisorder: predicting protein disorder using boosting and deep networks. BMC Bioinformatics 14:88 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Meng Y et al (2025) Protein structure prediction via deep learning: an in-depth review. Front Pharmacol 16:1498662 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Boadu F, Lee A, Cheng J (2025) Deep learning methods for protein function prediction. Proteomics 25(1–2):e2300471 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Alanazi W, Meng D, Pollastri G (2025) Advancements in one-dimensional protein structure prediction using machine learning and deep learning. Comput Struct Biotechnol J 27:1416–1430 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Zhang J, Durham J, Qian C (2024) Revolutionizing protein-protein interaction prediction with deep learning. Curr Opin Struct Biol 85:102775 [DOI] [PubMed] [Google Scholar]
  • 86.Lin P, Li H, Huang SY (2024) Deep learning in modeling protein complex structures: from contact prediction to end-to-end approaches. Curr Opin Struct Biol 85:102789 [DOI] [PubMed] [Google Scholar]
  • 87.Ismi DP, Pulungan R, Afiahayati (2022) Deep learning for protein secondary structure prediction: pre and post-AlphaFold. Comput Struct Biotechnol J 20:6271–6286 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Hu G et al (2021) FlDPnn: accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat Commun 12(1):4438 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Anbo H, Amagai H, Fukuchi S (2020) Neproc predicts binding segments in intrinsically disordered regions without learning binding region sequences. Biophys Physicobiol 17:147–154 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Wang K et al (2024) flDPnn2: accurate and fast predictor of intrinsic disorder in proteins. J Mol Biol 436(17):168605 [DOI] [PubMed] [Google Scholar]
  • 91.Wang S, Ma J, Xu J (2016) Aucpred: proteome-level protein disorder prediction by AUC-maximized deep convolutional neural fields. Bioinformatics 32(17):i672–i679 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Meng D, Pollastri G (2025) PUNCH2: explore the strategy for intrinsically disordered protein predictor. PLoS ONE 20(3):e0319208 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Han KS et al (2025) PredIDR: accurate prediction of protein intrinsic disorder regions using deep convolutional neural network. Int J Biol Macromol 284(Pt 1):137665 [DOI] [PubMed] [Google Scholar]
  • 94.Orlando G et al (2022) Prediction of disordered regions in proteins with recurrent neural networks and protein dynamics. J Mol Biol 434(12):167579 [DOI] [PubMed] [Google Scholar]
  • 95.Emenecker RJ, Griffith D, Holehouse AS (2021) Metapredict: a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Biophys J 120(20):4312–4319 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Pang Y, Liu B (2024) DisoFLAG: accurate prediction of protein intrinsic disorder and its functions using graph-based interaction protein language model. BMC Biol 22(1):3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Redl I et al (2023) ADOPT: intrinsic protein disorder prediction through deep bidirectional Transformers. NAR Genom Bioinform 5(2):lqad041 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Tang YJ et al (2023) Protein intrinsically disordered region prediction by combining neural architecture search and multi-objective genetic algorithm. BMC Biol 21(1):188 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Nambiar A et al (2024) DR-BERT: a protein language model to annotate disordered regions. Structure. 10.1016/j.str.2024.04.010 [DOI] [PubMed] [Google Scholar]
  • 100.Hanson J et al (2019) SPOT-Disorder2: improved protein intrinsic disorder prediction by ensembled deep learning. Genomics Proteomics Bioinformatics 17(6):645–656 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Mirabello C, Wallner B (2019) rawMSA: end-to-end deep learning using raw multiple sequence alignments. PLoS ONE 14(8):e0220182 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Tang YJ, Pang YH, Liu B (2022) DeepIDP-2L: protein intrinsically disordered region prediction by combining convolutional attention network and hierarchical attention network. Bioinformatics 38(5):1252–1260 [DOI] [PubMed] [Google Scholar]
  • 103.Ilzhofer D, Heinzinger M, Rost B (2022) SETH predicts nuances of residue disorder from protein embeddings. Front Bioinform 2:1019597 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Ul Kabir MW, Hoque MT (2024) DisPredict3.0: prediction of intrinsically disordered regions/proteins using protein language model. Appl Math Comput. 10.1016/j.amc.2024.128630 [Google Scholar]
  • 105.Xu S, Onoda A (2024) Accurate and fast prediction of intrinsically disordered protein by multiple protein language models and ensemble learning. J Chem Inf Model 64(7):2901–2911 [DOI] [PubMed] [Google Scholar]
  • 106.Song Y et al (2023) Fast and accurate protein intrinsic disorder prediction by using a pretrained language model. Brief Bioinform. 10.1093/bib/bbad173 [DOI] [PubMed] [Google Scholar]
  • 107.Kotowski K, Roterman I, Stapor K (2025) DisorderUnetLM: validating ProteinUnet for efficient protein intrinsic disorder prediction. Comput Biol Med 185:109586 [DOI] [PubMed] [Google Scholar]
  • 108.Elnaggar A et al (2022) ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans Pattern Anal Mach Intell 44(10):7112–7127 [DOI] [PubMed] [Google Scholar]
  • 109.Rives A et al (2021) Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci U S A. 10.1073/pnas.2016239118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Lin Z et al (2023) Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379(6637):1123–1130 [DOI] [PubMed] [Google Scholar]
  • 111.Pang Y, Liu B (2023) IDP-LM: prediction of protein intrinsic disorder and disorder functions based on language models. PLoS Comput Biol 19(11):e1011657 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112.Linding R et al (2003) Protein disorder prediction: implications for structural proteomics. Structure 11(11):1453–1459 [DOI] [PubMed] [Google Scholar]
  • 113.Linding R et al (2003) GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res 31(13):3701–3708 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.Prilusky J et al (2005) FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 21(16):3435–3438 [DOI] [PubMed] [Google Scholar]
  • 115.Dosztanyi Z et al (2005) IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21(16):3433–3434 [DOI] [PubMed] [Google Scholar]
  • 116.Peng K et al (2006) Length-dependent prediction of protein intrinsic disorder. BMC Bioinformatics 7:208 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117.Ishida T, Kinoshita K (2007) PrDOS: prediction of disordered protein regions from amino acid sequence. Nucleic Acids Res 35(Web Server issue):W460–W464 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118.Xue B et al (2010) PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochim Biophys Acta 1804(4):996–1010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119.Jones DT, Cozzetto D (2015) DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics 31(6):857–863 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120.Meszaros B, Erdos G, Dosztanyi Z (2018) IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Res 46(W1):W329–W337 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Erdos G, Pajkos M, Dosztanyi Z (2021) IUPred3: prediction of protein disorder enhanced with unambiguous experimental annotation and visualization of evolutionary conservation. Nucleic Acids Res 49(W1):W297–W303 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122.Meng F, Kurgan L (2016) DFLpred: high-throughput prediction of disordered flexible linker regions in protein sequences. Bioinformatics 32(12):i341–i350 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Peng Z, Xing Q, Kurgan L (2020) APOD: accurate sequence-based predictor of disordered flexible linkers. Bioinformatics 36(Supplement2):i754–i761 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124.Pang Y, Liu B (2023) TransDFL: identification of disordered flexible linkers in proteins by transfer learning. Genomics Proteomics Bioinformatics 21(2):359–369 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125.Dosztanyi Z, Meszaros B, Simon I (2009) ANCHOR: web server for predicting protein binding regions in disordered proteins. Bioinformatics 25(20):2745–2746 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126.Disfani FM et al (2012) MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins. Bioinformatics 28(12):i75–i83 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Malhis N, Jacobson M, Gsponer J (2016) MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences. Nucleic Acids Res. 10.1093/nar/gkw409 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128.Sharma R et al (2018) OPAL: prediction of MoRF regions in intrinsically disordered protein sequences. Bioinformatics 34(11):1850–1858 [DOI] [PubMed] [Google Scholar]
  • 129.Peng Z, Kurgan L (2015) High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder. Nucleic Acids Res 43(18):e121 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.Peng Z et al (2017) Prediction of disordered RNA, DNA, and protein binding regions using DisoRDPbind. Methods Mol Biol 1484:187–203 [DOI] [PubMed] [Google Scholar]
  • 131.Peng Z et al (2023) CLIP: accurate prediction of disordered linear interacting peptides from protein sequences using co-evolutionary information. Brief Bioinform. 10.1093/bib/bbac502 [DOI] [PubMed] [Google Scholar]
  • 132.Zhang F et al (2022) DeepDISOBind: accurate prediction of RNA-, DNA- and protein-binding intrinsically disordered residues with deep multi-task learning. Brief Bioinform. 10.1093/bib/bbab521 [DOI] [PubMed] [Google Scholar]
  • 133.Sharma R, Tsunoda T, Sharma A (2023) DRPBind: prediction of DNA, RNA and protein binding residues in intrinsically disordered protein sequences. bioRxiv 2023.03.20.533427
  • 134.Littmann M et al (2021) Protein embeddings and deep learning predict binding residues for various ligand classes. Sci Rep. 10.1038/s41598-021-03431-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135.Katuwawala A, Zhao B, Kurgan L (2021) DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning. Bioinformatics 38(1):115–124 [DOI] [PubMed] [Google Scholar]
  • 136.Basu S, Hegedus T, Kurgan L (2023) CoMemMoRFPred: sequence-based prediction of MemMoRFs by combining predictors of intrinsic disorder, MoRFs and disordered lipid-binding regions. J Mol Biol 435(21):168272 [DOI] [PubMed] [Google Scholar]
  • 137.Dobson L, Tusnady GE (2021) MemDis: predicting disordered regions in transmembrane proteins. Int J Mol Sci. 10.3390/ijms222212270 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138.Csepi M et al (2025) pLMMoRF: a web server that accurately predicts membrane-interacting molecular recognition features by employing a protein language model. J Mol Biol 437(17):169236 [DOI] [PubMed] [Google Scholar]
  • 139.Song JN, Kurgan L (2025) Two decades of advances in sequence-based prediction of MoRFs, disorder-to-order transitioning binding regions. Expert Rev Proteomics 22(1):1–9 [DOI] [PubMed] [Google Scholar]
  • 140.Basu S et al (2024) Twenty years of advances in prediction of nucleic acid-binding residues in protein sequences. Brief Bioinform. 10.1093/bib/bbaf016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141.Basu S, Kihara D, Kurgan L (2023) Computational prediction of disordered binding regions. Comput Struct Biotechnol J 21:1487–1497 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 142.Han B et al (2023) Computational prediction of protein intrinsically disordered region related interactions and functions. Genes. 10.3390/genes14020432 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143.Chen R et al (2022) Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 9:985022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 144.Peng Z, Kurgan L (2012) On the complementarity of the consensus-based disorder prediction. Pac Symp Biocomput, 176–187 [PubMed]
  • 145.Necci M et al (2017) MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins. Bioinformatics 33(9):1402–1404 [DOI] [PubMed] [Google Scholar]
  • 146.Fan X, Kurgan L (2014) Accurate prediction of disorder in protein chains with a comprehensive and empirically designed consensus. J Biomol Struct Dyn 32(3):448–464 [DOI] [PubMed] [Google Scholar]
  • 147.Mizianty MJ et al (2010) Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources. Bioinformatics 26(18):i489–i496 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148.Mizianty MJ, Peng Z, Kurgan L (2013) MFDp2: accurate predictor of disorder in proteins by fusion of disorder probabilities, content and profiles. Intrinsically Disord Proteins 1(1):e24428 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 149.Mizianty MJ, Uversky V, Kurgan L (2014) Prediction of intrinsic disorder in proteins using MFDp2. Methods Mol Biol 1137:147–162 [DOI] [PubMed] [Google Scholar]
  • 150.Kozlowski LP, Bujnicki JM (2012) MetaDisorder: a meta-server for the prediction of intrinsic disorder in proteins. BMC Bioinformatics 13:111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151.Huang YJ, Acton TB, Montelione GT (2014) DisMeta: a meta server for construct design and optimization. Methods Mol Biol 1091:3–16 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152.Lang B, Babu MM (2021) A community effort to bring structure to disorder. Nat Methods 18(5):454–455 [DOI] [PubMed] [Google Scholar]
  • 153.Oldfield CJ, Peng Z, Kurgan L (2020) Disordered RNA-binding region prediction with DisoRDPbind. Methods Mol Biol 2106:225–239 [DOI] [PubMed] [Google Scholar]
  • 154.Di Domenico T et al (2012) MobiDB: a comprehensive database of intrinsic protein disorder annotations. Bioinformatics 28(15):2080–2081 [DOI] [PubMed] [Google Scholar]
  • 155.Piovesan D et al (2023) MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res 51(D1):D438–D444 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156.Piovesan D et al (2018) MobiDB 3.0: more annotations for intrinsic disorder, conformational diversity and interactions in proteins. Nucleic Acids Res 46(D1):D471–D476 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157.Potenza E et al (2015) MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins. Nucleic Acids Res 43(Database issue):D315–D320 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158.Piovesan D et al (2025) MobiDB in 2025: integrating ensemble properties and function annotations for intrinsically disordered proteins. Nucleic Acids Res 53(D1):D495–D503 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159.Walsh I et al (2012) ESpritz: accurate and fast prediction of protein disorder. Bioinformatics 28(4):503–509 [DOI] [PubMed] [Google Scholar]
  • 160.Dosztanyi Z et al (2005) The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins. J Mol Biol 347(4):827–839 [DOI] [PubMed] [Google Scholar]
  • 161.Monzon AM et al (2021) FLIPPER: predicting and characterizing linear interacting peptides in the protein data bank. J Mol Biol 433(9):166900 [DOI] [PubMed] [Google Scholar]
  • 162.Wootton JC (1994) Nonglobular domains in protein sequences - automated segmentation using complexity-measures. Comput Chem 18(3):269–285 [DOI] [PubMed] [Google Scholar]
  • 163.Oates ME et al (2013) D(2)P(2): database of disordered protein predictions. Nucleic Acids Res 41(Database issue):D508–D516 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 164.Obradovic Z et al (2005) Exploiting heterogeneous sequence properties improves prediction of protein disorder. Proteins 61(Suppl 7):176–182 [DOI] [PubMed] [Google Scholar]
  • 165.Ishida T, Kinoshita K (2008) Prediction of disordered regions in proteins based on the meta approach. Bioinformatics 24(11):1344–1348 [DOI] [PubMed] [Google Scholar]
  • 166.Ghalwash MF, Dunker AK, Obradovic Z (2012) Uncertainty analysis in protein disorder prediction. Mol Biosyst 8(1):381–391 [DOI] [PubMed] [Google Scholar]
  • 167.Zhao B et al (2021) DescribePROT: database of amino acid-level protein structure and function predictions. Nucleic Acids Res 49(D1):D298–D308 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168.Basu S et al (2024) DescribePROT in 2023: more, higher-quality and experimental annotations and improved data download options. Nucleic Acids Res 52(D1):D426–D433 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169.Zhao B, Basu S, Kurgan L (2025) DescribePROT database of residue-level protein structure and function annotations. In: Kloczkowski A, Kurgan L, Faraggi E (eds) Prediction of protein secondary structure. Springer US, New York, NY, pp 169–184 [DOI] [PubMed] [Google Scholar]
  • 170.Medvedev KE, Pei J, Grishin NV (2022) DisEnrich: database of enriched regions in human dark proteome. Bioinformatics 38(7):1870–1876 [DOI] [PMC free article] [PubMed]
  • 171.Jones DT, Ward JJ (2003) Prediction of disordered regions in proteins from position specific score matrices. Proteins 53(Suppl 6):573–578 [DOI] [PubMed] [Google Scholar]
  • 172.Hanson J et al (2017) Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics 33(5):685–692 [DOI] [PubMed] [Google Scholar]
  • 173.Harrison PM (2017) fLPS: fast discovery of compositional biases for the protein universe. BMC Bioinformatics 18(1):476 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 174.McGuffin LJ (2008) Intrinsic disorder prediction from the analysis of multiple protein fold recognition models. Bioinformatics 24(16):1798–1804 [DOI] [PubMed] [Google Scholar]
  • 175.Ward JJ et al (2004) The DISOPRED server for the prediction of protein disorder. Bioinformatics 20(13):2138–2139 [DOI] [PubMed] [Google Scholar]
  • 176.Hecker J, Yang JY, Cheng JL (2008) Protein disorder prediction at multiple levels of sensitivity and specificity. BMC Genomics. 10.1186/1471-2164-9-S1-S9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 177.Cheng JL, Sweredoski MJ, Baldi P (2005) Accurate prediction of protein disordered regions by mining protein structure data. Data Min Knowl Disc 11(3):213–222 [Google Scholar]
  • 178.Su CT, Chen CY, Hsu CM (2007) iPDA: integrated protein disorder analyzer. Nucleic Acids Res 35:W465–W472 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 179.Shimizu K, Hirose S, Noguchi T (2007) POODLE-S: web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix. Bioinformatics 23(17):2337–2338 [DOI] [PubMed] [Google Scholar]
  • 180.Hirose S et al (2007) POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions. Bioinformatics 23(16):2046–2053 [DOI] [PubMed] [Google Scholar]
  • 181.Vullo A et al (2006) Spritz: a server for the prediction of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Res 34:W164–W168 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 182.Yang ZR et al (2005) RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins. Bioinformatics 21(16):3369–3376 [DOI] [PubMed] [Google Scholar]
  • 183.Kurowski MA, Bujnicki JM (2003) GeneSilico protein structure prediction meta-server. Nucleic Acids Res 31(13):3305–3307 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 184.Erdos G, Dosztanyi Z (2024) AIUPred: combining energy estimation with deep learning for the enhanced prediction of protein disorder. Nucleic Acids Res 52(W1):W176–W181 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 185.Galzitskaya OV, Garbuzynskiy SO, Lobanov MY (2006) FoldUnfold: web server for the prediction of disordered regions in protein chain. Bioinformatics 22(23):2948–2949 [DOI] [PubMed] [Google Scholar]
  • 186.Lobanov MY, Galzitskaya OV (2011) The Ising model for prediction of disordered residues from protein sequence alone. Phys Biol 8(3):035004 [DOI] [PubMed] [Google Scholar]
  • 187.Deng X, Eickholt J, Cheng J (2009) PreDisorder: ab initio sequence-based prediction of protein disordered regions. BMC Bioinformatics 10:436 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 188.Bruley A et al (2023) A sequence-based foldability score combined with AlphaFold2 predictions to disentangle the protein order/disorder continuum. Proteins 91(4):466–484 [DOI] [PubMed] [Google Scholar]
  • 189.Sormanni P et al (2015) The s2D method: simultaneous sequence-based prediction of the statistical populations of ordered and disordered regions in proteins. J Mol Biol 427(4):982–996 [DOI] [PubMed] [Google Scholar]
  • 190.Hanson J, Paliwal K, Zhou Y (2018) Accurate single-sequence prediction of protein intrinsic disorder by an ensemble of deep recurrent and convolutional architectures. J Chem Inf Model 58(11):2369–2376 [DOI] [PubMed] [Google Scholar]
  • 191.Dunker AK et al (2000) Intrinsic protein disorder in complete genomes. Genome Inf Ser Workshop Genome Inf 11:161–171 [PubMed] [Google Scholar]
  • 192.Hu G et al (2018) Taxonomic landscape of the dark proteomes: whole-proteome scale interplay between structural darkness, intrinsic disorder, and crystallization propensity. Proteomics. 10.1002/pmic.201800243 [DOI] [PubMed] [Google Scholar]
  • 193.Wang C, Uversky VN, Kurgan L (2016) Disordered nucleiome: abundance of intrinsic disorder in the DNA- and RNA-binding proteins in 1121 species from Eukaryota, Bacteria and Archaea. Proteomics 16(10):1486–1498 [DOI] [PubMed] [Google Scholar]
  • 194.Pentony MM, Jones DT (2010) Modularity of intrinsic disorder in the human proteome. Proteins 78(1):212–221 [DOI] [PubMed] [Google Scholar]
  • 195.Colak R et al (2013) Distinct types of disorder in the human proteome: functional implications for alternative splicing. PLoS Comput Biol 9(4):e1003030 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 196.Oldfield CJ et al (2020) Codon selection reduces GC content bias in nucleic acids encoding for intrinsically disordered proteins. Cell Mol Life Sci 77(1):149–160 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 197.Peng Z, Uversky VN, Kurgan L (2016) Genes encoding intrinsic disorder in Eukaryota have high GC content. Intrinsically Disord Proteins 4(1):e1262225 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 198.Piovesan D, Tosatto SCE (2018) Mobi 2.0: an improved method to define intrinsic disorder, mobility and linear binding regions in protein structures. Bioinformatics 34(1):122–123 [DOI] [PubMed] [Google Scholar]
  • 199.Dinkel H et al (2016) ELM 2016: data update and new functionality of the eukaryotic linear motif resource. Nucleic Acids Res 44(D1):D294–D300 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 200.Fukuchi S et al (2014) IDEAL in 2014 illustrates interaction networks composed of intrinsically disordered proteins and their binding partners. Nucleic Acids Res 42(Database issue):D320–D325 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 201.Ficho E et al (2025) MFIB 2.0: a major update of the database of protein complexes formed by mutual folding of the constituting protein chains. Nucleic Acids Res 53(D1):D487–D494 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 202.PDBe-KB consortium (2022) PDBe-KB: collaboratively defining the biological context of structural data. Nucleic Acids Res 50(D1):D534–D542 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 203.Meszaros B et al (2020) PhaSePro: the database of proteins driving liquid-liquid phase separation. Nucleic Acids Res 48(D1):D360–D367 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 204.UniProt Consortium (2025) UniProt: the universal protein knowledgebase in 2025. Nucleic Acids Res 53(D1):D609–D617 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 205.Gough J et al (2001) Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol 313(4):903–919 [DOI] [PubMed] [Google Scholar]
  • 206.Hornbeck PV et al (2012) PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res 40(Database issue):D261–D270 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 207.Burley SK et al (2023) RCSB protein data bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res 51(D1):D488–D508 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 208.Necci M, Piovesan D, Tosatto SC (2016) Large-scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe. Protein Sci 25(12):2164–2174 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 209.Deiana A et al (2019) Intrinsically disordered proteins and structured proteins with intrinsically disordered regions have different functional roles in the cell. PLoS ONE 14(8):e0217889 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 210.Nielsen JT, Mulder FA (2016) There is diversity in disorder: "In all chaos there is a cosmos, in all disorder a secret order". Front Mol Biosci 3:4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 211.Xue B et al (2009) CDF it all: consensus prediction of intrinsically disordered proteins based on various cumulative distribution functions. FEBS Lett 583(9):1469–1474 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 212.Zhang F et al (2020) PROBselect: accurate prediction of protein-binding residues from proteins sequences via dynamic predictor selection. Bioinformatics 36(Supplement2):i735–i744 [DOI] [PubMed] [Google Scholar]
  • 213.Nagarajan R, Ahmad S, Gromiha MM (2013) Novel approach for selecting the best predictor for identifying the binding sites in DNA binding proteins. Nucleic Acids Res 41(16):7606–7614 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 214.Jumper J et al (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596(7873):583–589 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 215.Aderinwale T et al (2022) Real-time structure search and structure classification for AlphaFold protein models. Commun Biol 5(1):316 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 216.Abramson J et al (2024) Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630(8016):493–500 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary Material 1 (199KB, docx)

Data Availability Statement

Data underlying this study are available in the Supplement.


Articles from Cellular and Molecular Life Sciences: CMLS are provided here courtesy of Springer