Abstract
Aging is associated with the disruption of key cellular processes, manifested as the well-established hallmarks of aging. Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) have no stable tertiary structure, which allows them to act as configurable hubs in signaling cascades and to regulate many processes, potentially including those related to aging. There is a need to clarify the roles of IDPs/IDRs in aging. A dataset of 1702 aging-related proteins was collected from established aging databases and experimental studies. IDPs/IDRs are noticeably present, accounting for about 36% of the aging-related dataset, which is, however, less than the disorder content of the whole human proteome (about 40%). A Gene Ontology analysis of the aging proteome used here reveals an abundance of IDPs/IDRs in one-third of aging-associated processes, especially in genome regulation. Signaling pathways associated with aging also contain IDPs/IDRs at different hierarchical levels, revealing the importance of the "structure-function continuum" in aging. Protein–protein interaction network analysis showed that IDPs are present in different clusters associated with different aging hallmarks. The protein cluster enriched in IDPs simultaneously shows high liquid–liquid phase separation (LLPS) probability, “nuclear” localization, and DNA-associated functions related to the aging hallmarks of genomic instability, telomere attrition, epigenetic alterations, and stem cell exhaustion. Intrinsic disorder, LLPS, and aggregation propensity should be considered as features that could be markers of pathogenic proteins. Overall, our analyses indicate that IDPs/IDRs play significant roles in aging-associated processes, particularly in the regulation of DNA functioning. IDP aggregation, which can lead to loss of function and toxicity, could be critically harmful to the cell.
A structure-based analysis of aging and the identification of proteins that are particularly susceptible to disturbances can enhance our understanding of the molecular mechanisms of aging and open up new avenues for slowing it down.
Supplementary Information
The online version contains supplementary material available at 10.1007/s00018-023-04897-3.
Keywords: Aggregation, Aging, Epigenetics, Intrinsic disorder, Liquid–liquid phase separation, Protein function, Protein–protein interaction, Protein structure, Proteostasis, Proteomics
Introduction
Aging is a process, occurring as a result of different uncompensated abnormalities at the intracellular, intercellular, organ/tissue, and organismal levels [1, 2]. Since aging causes the emergence of age-related diseases, an understanding of the aging process could be a synergistic tool for combating all age-related disorders [3, 4]. Aging inside the cell is characterized by dysfunctional regulation, structure disturbances, genome instability, epigenetic changes, loss of proteostasis, mitochondrial dysfunction, and disruption of cellular metabolism [1, 2].
Intrinsically disordered proteins (IDPs) and regions (IDRs) are typically characterized by low sequence complexity; their sequences are enriched in hydrophilic residues and depleted of hydrophobic residues [5]. Although such peculiarities of the amino acid sequences preclude IDPs/IDRs from spontaneous folding, the resulting structural plasticity confers multiple functional advantages on these proteins such as multifunctionality and binding promiscuity [6–13].
A lack of fixed structure allows IDPs/IDRs to have high binding specificity coupled with low affinity [14–16]. Consequently, signals can be "switched on" and "switched off" quickly. IDP-partner interactions can be weak but specific because, when an IDP binds to another molecule, a portion of the binding energy is often spent on folding (fully or partially) the IDP from its initially disordered state [17]. Many IDPs contain so-called molecular recognition features (MoRFs), which are disorder-based protein-partner interaction sites that acquire structure as a result of binding to partners [18–22].
The structural plasticity and conformational heterogeneity of IDPs allow them to interact with a large number of often unrelated partners, placing them at the centers of protein-protein interaction (PPI) networks [23–24]. This dramatically broadens the sequence-function relationships in proteins. Moreover, it represents a departure from the classical "lock-and-key" model of protein enzymatic activity, as well as from the "induced fit" model of protein-partner interaction [25], toward the more general "protein structure-function continuum" model. In such a continuum model, a given protein exists as a dynamic conformational ensemble that contains multiple proteoforms characterized by a wide variety of structural features and possessing various functional potentials [26–29].
IDRs can participate in liquid-liquid phase separation (LLPS), thereby playing crucial roles in the biogenesis of various proteinaceous membrane-less organelles (PMLOs) or biomolecular condensates [30]. Alternatively, IDPs can interact with specific ordered protein domains, forming transient protein complexes, as occurs, for example, in the transcription elongation system [31, 32]. The proteomes of PMLOs contain multiple IDPs; in addition, the disorder-based interactions among these proteins are considered to be the main driving force for the formation of PMLOs [33–44]. IDRs with a characteristic pattern of charged amino acids direct protein incorporation into PMLOs, much as leader sequences do for classical membrane-bound organelles [45].
The disorder content at the organism level increases with the complexity of the organism, which indicates the importance of IDPs [46–48]. Such prevalence of intrinsic disorder is explained by its crucial functions. First, the conformational transition occurs through intermediate disordered states. Second, a number of proteins require a transition from a structured state to a disordered state in order to perform a function. In addition, some proteins do not fold into a specific 3-D structure, but remain in the form of a conformational ensemble. In this case, some functions will be feasible precisely due to the complete absence of structure.
An illustrative example of an important aging-related IDP with characteristic IDP features is the MYC proto-oncogene protein (a.k.a. c-MYC, UniProt ID: P01106). The OSKM Yamanaka cocktail for epigenetic reprogramming is composed of MYC together with Oct-4 (Q01860), SOX2 (P48431), and KLF4 (O43474), each of which contains multiple IDRs (AlphaFold predicted structures are shown in Figure 1a-d; IDRs appear as orange regions, corresponding to low AlphaFold confidence).
The MYC protein is a transcription factor and a chromatin remodeling protein that regulates cell proliferation [49, 50] and is a central player in oncotransformation [51]. Moreover, MYC modulates the expression of the telomerase catalytic unit (TERT) [52] (telomere attrition aging hallmark) and is used for cell reprogramming [53] and organism rejuvenation [54] (intervention on the epigenetic alterations aging hallmark; however, MYC is excluded in the most recent rejuvenation studies [55, 56]). MYC can interact with a variety of partners due to its IDRs (orange regions in Figure 1d). IDRs in general are known to contain MoRFs (marked by patterned yellow segments, Figure 1e). Indeed, numerous regions capable of binding-induced folding, which can be found in MYC PDB structures and are shown by arrows, are located within MoRFs, IDRs, or sequence regions with a PONDR-FIT [57] disorder score around 0.5.
It is important to find out how structural intrinsic disorder is represented in proteins associated with aging and which aging-related cellular processes are most enriched in IDPs. First, the high binding specificity coupled with low affinity that is characteristic of IDPs is closely related to the “weak link theory of aging”. With aging, proteostasis quality control declines and damaged proteins accumulate. Under such stress conditions, chaperones, which essentially have IDRs [61], lose their specificity, and cellular "noise" increases [62], or weak interactions disappear [63]. As a result, the correct PPI network is destroyed, leading to systemic decrepitude of the cell and organism.
Second, intrinsically disordered proteins are the key proteins in signaling and stress-response processes [5] intersecting with hallmarks of aging. Indeed, phosphorylation sites [64, 65], as well as sites of many other enzymatically catalyzed posttranslational modifications (PTMs) are usually located within intrinsically disordered regions [11–13, 26, 66–68].
Third, as the abundance of IDPs among transcription factors is well known [69], the importance of structural disorder for the regulation of gene expression, including epigenetics, is well understood. One example is that IDRs help to regulate the circadian clock (the Cry1 tail modulates its interaction with the CLOCK:BMAL1 complex [70], the C-terminal transactivation domain (TAD) of BMAL1 affects the circadian rhythm by conformational switching [71], and other examples are reviewed in [72]). Loss of the circadian-regulated change in the amplitude of gene expression is a reason for many age-related disorders (cancer [73], metabolic diseases [74], reviewed in [75]).
Fourth, p53 is a widely studied IDP [76] that is a central player in cancer [77] and senescence [78].
Fifth, of the aging hallmarks, “loss of proteostasis” serves as another exemplar with respect to the role that IDPs play in the aging process. Aging-related loss of proteostasis is characterized by the formation of toxic protein aggregates [79]. Misfolding and toxic aggregation of IDPs are often associated with neurodegenerative diseases [80], such as Alzheimer's disease [81], Parkinson’s disease, frontotemporal lobar degeneration, and amyotrophic lateral sclerosis [82]. Additionally, disorder in IDPs/IDRs is strongly linked to other age-related diseases. Among them are cancer [83], amyloidoses [84], cardiovascular diseases [85], and diabetes [86]. Although aging itself is not a disease, it shares molecular and cellular mechanisms with age-related diseases [4].
Sixth, protein aggregation and LLPS of IDPs in general have different relationships to loss of proteostasis and aging. The propensity of proteins for LLPS and aggregation can be toxic or protective [87]. On the one hand, it has been shown that controlled protein aggregation has cytoprotective functions, vital for the maintenance of cell integrity and survival under adverse stress [88]. Moreover, functional aggregation is common among amyloids involved in processes such as skin pigmentation, necrosis activation, and RNA regulation [89, 90]; examples of functional amyloids and prions are collected systematically, for example, in the CPAD 2.0 database [91]. On the other hand, LLPS has a protective role against protein aggregation (as in the case of stress granules) [92, 82].
LLPS and the stability of PMLOs are tightly regulated in the cell, and when physicochemical conditions change, a reverse phase transition can occur. In this case, the uncontrolled formation of a gel or solid fibrous aggregates can be observed [93]. Deviation from phase transitions of the “liquid-liquid” type, which inevitably occurs with aging, can accelerate the aging of the body and the development of age-related diseases through several pathways.
Cellular aging disrupts the formation of canonical and non-canonical stress granules. It was shown that the accumulation of protein structural transformations during condensate aging leads to aberrant multiphase architectures [94, 95]. In fact, the inclusion of mutant forms of IDPs such as T-cell intracellular antigen-1 (TIA-1), TIA-1-related (TIAR), RNA-binding protein fused in sarcoma (FUS), heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1), transactive response DNA-binding protein 43 kDa (TDP-43), and polyadenylate-binding protein 1 (PABP1), which are characteristic of amyotrophic lateral sclerosis, Alzheimer's disease, and frontotemporal dementia, is associated with the dysregulation of the biogenesis, structure, and properties of these PMLOs [93, 96–100].
In addition, in old cells, adaptation to chronic and acute stress, as well as the regulation of stress organelles, is also impaired [100]. The transformation of stress granules into amyloid fibrils in neurodegenerative diseases may be related not only to the inclusion of mutant forms of aforementioned proteins in stress granules, but also to the disruption of stress granule degradation by autophagosomes [101, 102]. Permanent presence of stress granules in the cell cytoplasm promotes their transformation into amyloid-like fibrils and the development of neurodegeneration [103].
Thus, time-dependent degradation of PMLOs during cell or organism aging increases the probability of solid aggregate formation from or within the liquid droplet. However, other types of aberrant phase separation besides liquid-to-solid transitions can add to proteostasis disruption [104], maintaining the stability of stress granules and other PMLOs, and play a special role in aging [100].
Thus, considering the importance of intrinsic disorder in age-related diseases and accelerated aging hallmarks, it can be hypothesized that IDPs strongly affect the aging process. However, such example-based evidence for the connection between IDPs and aging needs systematic confirmation, which is provided in this work.
Bioinformatic analyses of changes in transcriptomes and proteomes with aging and age-related diseases have shown which differences in biological processes occur. Heinze et al. [105], based on a comparison of the livers of the long-lived naked mole-rat and human versus the shorter-lived guinea pig, showed that mitochondrial metabolism in these species is very different. Decreased oxidative phosphorylation and increased metabolism of fatty acids provide detoxification capable of increasing lifespan [105]. The team of Luigi Ferrucci [106] performed a proteomic analysis of the sarcopenia manifestation of aging and showed that the decline in bioenergetics is accompanied by inflammation. However, in old persons, some rescue might be achieved via protective mechanisms linked to the maintenance of proteostasis and an increase in alternative splicing [106].
To the best of our knowledge, there is no systematic analysis linking aging-related proteins to their structural disorder, propensity for LLPS, and aggregation. It is also worth annotating biological processes, signaling pathways, and protein–protein interaction networks according to their enrichment in IDPs. To fill this gap, we present here a bioinformatics analysis of 1702 proteins, selected from different aging-related databases, studies, and networks, to evaluate the prevalence of intrinsic disorder in proteins associated with aging. We show that signaling pathways associated with aging contain IDPs at different hierarchical levels. Enrichment in IDPs and LLPS-prone proteins was shown for the protein subgroup involved in DNA-associated aging hallmarks. LLPS and aggregation propensity should be considered as features that could be markers of pathogenic proteins. Knowledge of the level of intrinsic disorder and the set of functions of the proteins under consideration will make it possible to deepen the understanding of the mechanisms of aging and outline ways to slow it down.
Methods
AlphaFold structures comparison with partial experimental structures for OSKM proteins
Structural models generated by AlphaFold 2.3.1 [107] were visualized for the sequences of the proteins of interest. The first 15 BLASTp [108] structures for each protein found in the Protein Data Bank (PDB, https://www.rcsb.org/) and/or the PDB structures listed on the UniProt (https://www.uniprot.org/) “Feature viewer” page were superimposed on the corresponding AlphaFold structures. Only parts of the structures closer than 1 Å to the AlphaFold model are shown. Images were produced using ChimeraX [109].
PDB structures drawing
Visualizations of structures in Fig. 1e were obtained using the PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC. In every visualized complex, MYC is shown using the “surface” visualization mode with the color set to “light violet”, and its interaction partners are shown using the “cartoon” visualization mode with manually chosen colors.
Fig. 1.
Illustration of structural disorder of OSKM Yamanaka cocktail constituents. a–d AlphaFold prediction of structures for Oct-4 (Q01860), SOX2 (P48431), KLF4 (O43474), MYC (P01106). Structure prediction reveals the presence of IDRs (unstructured regions, corresponding to orange parts). AlphaFold models are colored according to the per-residue confidence metric, the predicted local distance difference test (pLDDT), which ranges from 0 to 100 [color code is blue, model confidence is very high (pLDDT > 90); cyan, model confidence is high (90 > pLDDT > 70); yellow, model confidence is low (70 > pLDDT > 50); orange, model confidence is very low (pLDDT < 50)]. AlphaFold structures match partial experimental structures inside ordered regions (see Figure S1). Furthermore, as reported previously, the AlphaFold confidence score correlates with the intrinsic disorder propensity [58–60], and regions with low confidence in AlphaFold models correspond to IDRs. e MYC interaction with different binding partners illustrates peculiarities of one-to-many signaling. An intrinsic disorder prediction by the PONDR-FIT algorithm on the MYC amino acid sequence is shown in the center of the figure (above 0.5 threshold = disordered regions, below = ordered, gray = errors) along with the structures of various regions of MYC bound to 14 different partners. The various regions of MYC are color coded to show their structures in the complex and to map the binding segments to the amino acid sequence.
Starting with the Fbw7-Skp1-MycNdegron complex (bottom, left, yellow MYC protein, two blue partners, pdb 7T1Z), and moving in a clockwise direction, the Protein Data Bank IDs and partner names are as follows for the 13 complexes: (6C4U—FHA with Myc-pTBD peptide), (1MV0—BIN1), (7T1Y—Fbw7-Skp1), (4Y7R—WDR5), (5I4Z—apo OmoMYC), (5I50—double-stranded DNA), (6G6K—MAX bHLHZip), (1NKP—DNA), (1A93—MAX heterodimeric leucine zipper), (2A93—MAX heterodimeric leucine zipper), (2OR9—anti-c-MYC antibody 9E10), (1EE4—yeast karyopherin (importin) alpha), and (6E24—TBP-TAF1)
Sources: proteins associated with aging
Two different approaches have been implemented to collect a set of aging-associated proteins. The first approach is based on the selection of proteins from databases of human genes and proteins, such as GenAge (version 20, 306 genes with experimental evidence of direct relation to human aging, representative of Human Aging Genomic Resources), Gene Ontology (GO:0007568 term "aging", 277 proteins, GO:0090398 term "cellular senescence", 75 proteins), KEGG (Longevity Regulating Pathway, 91 proteins) [110–113], Digital Ageing Atlas (2500 genes for Human) [114], and Aging Atlas (344 proteins) [115]. There are also other resources, which are not used here, such as AgeFactDB [116], which is quite old. We analyzed Digital Ageing Atlas (DAA) for the abundance of intrinsically disordered proteins. However, since this dataset was created for the purpose of most complete coverage, it includes proteins even without strong experimental evidence (1 reference was sufficient). Furthermore, this database contains a rather large set of proteins. Therefore, we did not analyze DAA proteins by the whole set of techniques utilized in this study. Of note, we failed to find protein clusters in DAA protein–protein interaction network that would correspond to the hallmarks of aging in GO biological process enrichment analysis (see Fig. S9A).
The second approach is to consider proteins whose expression profile changes with age. Notably, the expression of some proteins can increase in middle age relative to the young-age level and decrease at the end of life (see Supplementary Note 1). Review criteria were as follows. A search for original articles focused on aging and proteomes was performed in Google Scholar, PubMed and, for transcriptomes, in GSEA (https://www.gsea-msigdb.org/gsea/index.jsp) and ReGEO (https://regeo.org/search.jsp). The search terms used were “aging proteome,” “aging database,” “aging human,” “aging human proteins,” “aging fold enrichment,” “ageing,” and “differentially expressed,” as well as “systematic review” alone and in combination. The reference lists of the retrieved articles, as well as Mendeley Reference Manager suggestions of similar or connected papers, were used to search for further relevant papers.
Proteomics review (blood): 337 proteins
We used datasets with significant change in the level of protein expression during aging in blood plasma (altered intercellular communication aging hallmark, parabiosis intervention) (collection from [117], proteins that were found in 3 or 4 or more articles—a total of 337 proteins).
Gene expression review: 284 proteins
In a study by Peters et al. [118], a large-scale analysis of gene expression in human peripheral blood was carried out. The selection of genes differentially expressed with aging was carried out in two stages. At the first stage, 2228 genes were selected using literature analysis [118]. At the second stage, 7909 more blood samples were analyzed, which confirmed differential expression for 1497 genes, of which 897 are negatively associated with chronological age and 600 are positively correlated with it. This analysis was performed using blood samples from people of European descent. To distinguish between changes caused by cellular composition and biological processes, clustering was carried out for groups of genes negatively and positively associated with age using GeneNetwork. For our analysis, only those proteins that remained after clustering were selected: 184 genes that showed a decrease and 100 genes that showed an increase.
Aging brain: 418 proteins
In a study by Lu et al. [119], the age-dependent regulation of gene expression in the human brain was analyzed. The analysis was carried out on samples of the frontal lobe of the brain of 30 individuals from 26 to 106 years old. Two clusters of identically regulated genes were identified—decreasing and increasing expression with age. Interestingly, the group of people under 42 years of age showed the most homogeneous expression pattern, as did the group over 73 years of age. The expression of genes responsible for synaptic function and the plasticity required for memory and learning was significantly reduced in the aging brain. The same can be said about neuronal signaling, calcium homeostasis, and vesicular transport. On the contrary, genes regulating the stress response and DNA repair mechanisms were induced. For this study, 418 genes changing expression with age unambiguously were selected. Thus, a set of proteins, one way or another related to aging, was assembled. It claims to reflect the overall picture due to different sources and approaches to data collection.
For our analysis, only proteins with the age-related level changes were selected—263 proteins that showed an increase and 155 proteins that showed a decrease in abundance.
Chaperome: 163 proteins
The proteostasis maintenance system is of great importance for aging when viewed from the proteome perspective, as is done here. The study by Brehme et al. [120] showed that the expression of chaperones changes strongly with age: the expression level of 32% of chaperones, corresponding to ATP-dependent molecular machines, decreases, and that of 19.5% of chaperones, corresponding to ATP-independent ones, increases. For this analysis, 332 genes were selected, of which 88 are chaperone genes and 244 are co-chaperone genes. The expression in 48 samples of the superior frontal gyrus taken from healthy individuals aged 20 to 99 years was analyzed experimentally. Analysis of the correlation with aging showed the enrichment of two clusters with certain functional families of chaperones. TPR (tetratricopeptide repeat)-containing proteins tend to be induced, while HSP40 proteins show the opposite trend. In our study, an analysis was carried out for groups showing different directions of expression change with aging.
For our analysis, only those proteins with age-related changes were selected—101 genes that showed a decrease and 62 genes that showed an increase.
GWAS of centenarians: 35 proteins
In a study by Zeng and colleagues [121] of 2178 Han Chinese centenarians, 35 proteins and 4 processes were identified that were significantly associated with lifespan.
Full aging proteome: 1702 proteins
The complete aging proteome analyzed in this study includes 1702 proteins with experimental evidence. The sources sum to 306 GenAge + 344 Aging Atlas + 277 GO Aging + 75 GO Cellular Senescence + 91 KEGG LRP + 337 Proteomics review (Blood) + 284 Gene expression review (184 DOWN + 100 UP) + 418 Aging brain (263 UP + 155 DOWN) + 163 Chaperome (62 UP + 101 DOWN) + 35 GWAS centenarians = 2330 entries, which reduce to 1702 unique proteins because the sources intersect.
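The difference between the summed source sizes and the number of unique proteins comes from proteins that occur in more than one source. The following minimal sketch illustrates this de-duplication step; the identifiers and source names are hypothetical placeholders, and the function `merge_sources` is an illustration rather than the pipeline used in this study.

```python
from itertools import chain


def merge_sources(sources: dict[str, set[str]]) -> tuple[int, set[str]]:
    """Return (sum of source sizes, de-duplicated union of identifiers)."""
    naive_total = sum(len(ids) for ids in sources.values())
    merged = set(chain.from_iterable(sources.values()))
    return naive_total, merged


# Hypothetical toy identifiers, not real members of any aging dataset.
toy_sources = {
    "source_A": {"P1", "P2", "P3"},
    "source_B": {"P2", "P3", "P4"},
    "source_C": {"P4", "P5"},
}

naive_total, merged = merge_sources(toy_sources)
print(naive_total, len(merged))  # 8 entries in total, 5 unique identifiers
```

In practice, the identifiers would first need to be mapped to a common namespace (for example, reviewed UniProt accessions) before the union is taken.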
In this whole aging proteome, 446 proteins were DOWN-regulated, 394 proteins were UP-regulated, and 784 were not assigned as UP- or DOWN-regulated. The assignment was made by intersecting the gene transcription or proteome information from three sources (Gene expression review, Chaperome, and Aging brain) listed in the previous section, Sources: proteins associated with aging.
Random proteins control: 400, 500, 1702, 2500 proteins
As a control sample, 400 (Fig. 5), 500 (Fig. S4), 1702 (Fig. 4, Fig S9A) or 2500 (Fig. 5) human proteins were randomly taken from the UniProt database [122]. The control sample does not overlap with the protein sets described above.
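A non-overlapping random control of this kind can be drawn as in the sketch below. This is only an illustration under stated assumptions: the function name `sample_control`, the fixed seed, and the toy identifiers are hypothetical, and the actual sampling procedure used for the control sets is not specified in the text.

```python
import random


def sample_control(proteome_ids, excluded_ids, n, seed=0):
    """Draw n distinct identifiers from the proteome, excluding a given set."""
    # Sort the pool so that the result is reproducible for a given seed.
    pool = sorted(set(proteome_ids) - set(excluded_ids))
    if n > len(pool):
        raise ValueError("control size exceeds the number of available proteins")
    return random.Random(seed).sample(pool, n)


# Hypothetical toy identifiers standing in for UniProt accessions.
proteome = [f"ID{i}" for i in range(100)]
aging_set = {"ID1", "ID2", "ID3"}
control = sample_control(proteome, aging_set, n=10)
```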
Fig. 5.
Distribution of aging-related proteins to BP and disorder score (PPIDR by PONDR-FIT). a Distribution of disorder score by proteins in different Aging-associated datasets. Control (400 random proteins) was compared with all datasets except Digital Ageing Atlas, which was compared with Control Large (2500 proteins). b Aging-related processes, enriched for our aging dataset, with distribution of disorder score for the proteins
Fig. 4.
Correlation between intrinsic disorder and LLPS propensity for the aging dataset vs. controls. a PSPredictor score in relation to PPIDR by PONDR-FIT for the aging dataset (left panel) vs. the control dataset of 1702 random proteins (right panel). b Comparison of the partitioning of proteins into groups by disorder and LLPS propensity. Red columns are for IDPs: the densest red is for LLPS-prone IDPs, and the less dense red is for IDPs that are not LLPS-prone. Blue columns are for ordered proteins: the densest blue is for ordered proteins that are not LLPS-prone, and the less dense blue is for LLPS-prone ordered proteins. The control dataset (which was used with PSPredictor) shows results that are quite similar to those for the whole human proteome (from FuzDrop, see Figure S3).
ML aging-related set
For comparison, a machine learning (ML)-derived dataset of aging proteins was used [123]. This provides another approach to analyzing the relationship between intrinsic disorder and proteins involved in aging processes, based on a study by Kerepesi et al. [123]. In this study, human proteins were classified as related/not related to aging using machine learning methods. The models were built on the basis of 21,000 feature descriptions of proteins obtained using the UniProt [122], Gene Ontology, and GeneFriends databases. The proteins present in GenAge were considered as a selection of aging proteins, and all other Swiss-Prot proteins as non-aging proteins. The models were trained using five-fold cross-validation: all analyzed proteins were divided into five non-overlapping samples, and in each iteration four of them were used for model training and the fifth for testing. As a result, numerical values ranging from 0 to 1 were obtained for all proteins [123]. These values correspond to the expected relevance of a protein to aging. Based on the obtained model, new proteins that might play key roles in the aging process were identified. The results of the final model are in agreement with the previously known fact that human proteins associated with aging tend to primarily interact with each other.
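The five-fold cross-validation scheme described above can be sketched as follows. This is a generic illustration and not the implementation of Kerepesi et al.; the function name `five_fold_splits`, the shuffling seed, and the toy items are assumptions.

```python
import random


def five_fold_splits(items, n_folds=5, seed=0):
    """Yield (train, test) pairs; each fold serves as the test set exactly once."""
    items = list(items)
    random.Random(seed).shuffle(items)
    # Deal the shuffled items into n_folds non-overlapping samples.
    folds = [items[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, test


# Toy example with ten items standing in for proteins.
splits = list(five_fold_splits(range(10)))
for train, test in splits:
    print(len(train), len(test))  # 8 2 on every iteration
```

Because every protein falls into exactly one test fold, each protein receives a prediction from a model that did not see it during training.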
We considered these proteins independently. We found that only 184 proteins of our dataset are assigned as related and 1440 as unrelated to aging by this approach.
Enzymes, ordered protein control: 507 proteins
Enzymes, with their capability to catalyze specific reactions, are typically considered ordered proteins. A total of 507 enzymes were extracted from the collected dataset of 1702 proteins by a UniProt search for proteins with EC (Enzyme Commission) numbers and were used as an ordered protein control.
Disorder prediction: description and comparison of algorithms
The usage of several computational tools to predict the level of protein intrinsic disorder for a given sequence is common practice today. This is because different tools use different algorithms to make predictions and different features, such as amino acid composition, content of charged residues, hydropathy, etc., to describe proteins. As a consequence of these differences, disorder prediction algorithms often show considerable variation in their results. For a comparison of the features of different predictors see Supplementary Note 2. Until an ideal reference algorithm and feature set is established for disorder prediction, it will be necessary to consider the phenomenon of protein intrinsic disorder from multiple perspectives. In doing so, a deeper understanding is gained, and a more complete picture is formed.
In this work, we used Rapid Intrinsic Disorder Analysis Online (a.k.a. RIDAO, https://ridao.app) [124], a newly developed platform for disorder analysis that yields results from PONDR® VLXT [125], PONDR® VSL2 [126], PONDR® VL3 [127], PONDR® FIT [57], and IUPred2A [128] in a single run. RIDAO also performs CH-CDF analysis [129] using PONDR® VLXT. Each of these predictors is well established and trained on datasets emphasizing various aspects of protein intrinsic disorder. The PONDR predictors employed in this work utilize neural networks, while IUPred2A is a physics-based model that does not rely on machine learning.
Each algorithm takes the FASTA formatted amino acid sequences provided to RIDAO as input and outputs a number between 0 and 1 for each amino acid in the input sequence. If the predicted value for a residue exceeds or equals 0.5, this residue is considered disordered. RIDAO averages the obtained per-residue values for the whole protein sequence to provide a mean disorder score (MDS) that describes the level of protein flexibility as a whole. The predicted percent of intrinsically disordered residues (PPIDR) is defined as the number of disordered residues divided by the length of the sequence, expressed as a percentage. Based on their levels of intrinsic disorder, proteins are classified as highly disordered (MDS ≥ 0.5, PPIDR ≥ 30%), moderately disordered (0.25 ≤ MDS < 0.5, 10% ≤ PPIDR < 30%), and ordered (MDS < 0.25, PPIDR < 10%) [130]. The MDS is not directly related to PPIDR (in particular, at a PPIDR of 100%, the MDS can take any value in the range 0.5–1); therefore, the relationship between these two measures of protein intrinsic disorder should be examined using correlation analysis.
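The two measures and the three-class scheme can be computed from a per-residue disorder profile as in the minimal sketch below. The function names and the toy profile are hypothetical; the thresholds are those given above, and the two criteria are applied separately because they need not agree for a given protein.

```python
def mean_disorder_score(scores):
    """MDS: average of the per-residue disorder scores."""
    return sum(scores) / len(scores)


def ppidr(scores, threshold=0.5):
    """PPIDR: percentage of residues with a score >= threshold."""
    disordered = sum(1 for s in scores if s >= threshold)
    return 100.0 * disordered / len(scores)


def classify_by_mds(mds):
    if mds >= 0.5:
        return "highly disordered"
    if mds >= 0.25:
        return "moderately disordered"
    return "ordered"


def classify_by_ppidr(value):
    if value >= 30:
        return "highly disordered"
    if value >= 10:
        return "moderately disordered"
    return "ordered"


# Toy per-residue profile (hypothetical values, not a real prediction).
profile = [0.1, 0.2, 0.6, 0.9, 0.5, 0.3, 0.1, 0.1, 0.1, 0.1]
print(classify_by_mds(mean_disorder_score(profile)))  # moderately disordered
print(classify_by_ppidr(ppidr(profile)))              # highly disordered
```

The toy profile has an MDS of about 0.3 but a PPIDR of 30%, so the two criteria place it in different classes, which illustrates why the two measures are compared rather than treated as interchangeable.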
Important information regarding global classification of proteins based on their levels of intrinsic disorder can be obtained utilizing binary predictors such as charge–hydropathy (CH) plots [131, 132] and cumulative distribution functions (CDFs) [132]. By combining these classifiers, one can discriminate proteins in four structurally different classes [129, 133, 134]. The corresponding CH-CDF graph is generated as follows: the y-coordinate is calculated as distance from the boundary in the CH plot, and the x-coordinate is calculated as the average distance of a corresponding CDF curve from the CDF boundary. Positive and negative x-values correspond to ordered and disordered (by CDF) proteins, whereas positive and negative y-values correspond to intrinsically disordered and ordered proteins (by CH), respectively.
On the resulting CH-CDF graph, the lower-right quadrant (Q1) corresponds to ordered and compact proteins (agreement of both predictors); the lower-left quadrant (Q2) comprises proteins predicted to be disordered (by CDF) but to have a compact conformation (by CH). These proteins are usually native molten globules or hybrid proteins containing considerable levels of both ordered and disordered regions. Q3, the upper-left quadrant, includes proteins that are disordered according to both predictors: native coils and native pre-molten globules. The upper-right quadrant (Q4) is usually poorly filled. These rare proteins are predicted to be disordered by CH but ordered by CDF.
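The quadrant assignment described above can be written as a short Python sketch. The function name is hypothetical, and the treatment of points lying exactly on a boundary (a coordinate of 0) is an assumption, since the text does not specify it.

```python
def ch_cdf_quadrant(cdf_x, ch_y):
    """Map a (CDF distance, CH distance) pair to a CH-CDF quadrant label."""
    ordered_by_cdf = cdf_x >= 0    # assumption: x == 0 is counted as ordered
    disordered_by_ch = ch_y >= 0   # assumption: y == 0 is counted as disordered
    if ordered_by_cdf and not disordered_by_ch:
        return "Q1"  # lower right: ordered and compact by both predictors
    if not ordered_by_cdf and not disordered_by_ch:
        return "Q2"  # lower left: disordered by CDF, compact by CH
    if not ordered_by_cdf and disordered_by_ch:
        return "Q3"  # upper left: disordered by both predictors
    return "Q4"      # upper right: disordered by CH, ordered by CDF

# Hypothetical coordinates, for illustration only.
examples = {
    "ordered": ch_cdf_quadrant(0.2, -0.1),
    "molten_globule_like": ch_cdf_quadrant(-0.1, -0.05),
    "coil_like": ch_cdf_quadrant(-0.3, 0.1),
    "rare": ch_cdf_quadrant(0.1, 0.05),
}
```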
Notably, in this work, disorder was not calculated for proteins from the collected datasets that were too short (< 25 residues) or for which a FASTA sequence could not be obtained from UniProt. Additionally, because the disorder prediction algorithms used herein are not designed to operate on non-canonical amino acids, selenocysteine was replaced with cysteine when present. Since for some GeneIDs, there are several human proteins, a reviewed UniProtID entry was used.
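The sequence filtering described above can be sketched in Python as follows. This is an illustrative assumption about how such preprocessing might look, not the code used in the study; the function name is hypothetical.

```python
MIN_LENGTH = 25  # sequences shorter than this were not analyzed for disorder

def prepare_sequence(sequence):
    """Return a cleaned sequence, or None if it is too short to analyze."""
    cleaned = sequence.strip().upper().replace("U", "C")  # selenocysteine (U) -> cysteine (C)
    if len(cleaned) < MIN_LENGTH:
        return None
    return cleaned

# Hypothetical sequences, for illustration only.
kept = prepare_sequence("MKU" * 10)   # 30 residues, U replaced by C
skipped = prepare_sequence("MKU")     # 3 residues, too short
```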
MoRF prediction
In this study, MoRF regions were predicted using the ANCHOR web server [135], which is available at http://anchor.enzim.hu/index_multi.php. Amino acid residues with a value greater than 0.5 were considered to belong to MoRF regions. Disordered regions were visualized for many proteins using the D2P2 database [136] (available at https://d2p2.pro/search).
Aggregation propensity prediction
The AggreScan server (http://bioinf.uab.es/aggrescan/) was used to assess the aggregation propensity of each protein [137]. FASTA-formatted sequences were used as input.
Two output parameters were used for the aggregation propensity prediction: NnHS and Na4vSS. Hot Spots are protein segments that are usually hydrophobic and inaccessible to the solvent but that become exposed when the protein partially unfolds and then provoke aggregation. AggreScan uses experimentally derived amino acid aggregation-propensity values (propensities to be in Hot Spots) [138]. A Hot Spot (HS) is defined as a region of five or more consecutive residues with an a4v larger than the Hot Spot threshold (HST) and no proline (an aggregation breaker). The NnHS parameter is the number of Hot Spots normalized per 100 residues (the number of Hot Spots divided by the number of residues in the input sequence and multiplied by 100).
The a4vSS normalized per 100 residues (Na4vSS) is the a4vSS divided by the number of residues in the input amino acid sequence and multiplied by 100, where a4vSS is the sum of the amino acid aggregation-propensity values averaged over a sliding window.
A protein is assessed as aggregation-prone if it has a positive or low negative Na4vSS value or a high NnHS (number of Hot Spots) [137].
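The two normalizations defined above amount to simple arithmetic, sketched here in Python. The function names and input values are hypothetical; the raw Hot Spot count and a4vSS are assumed to come from AggreScan output and are not recomputed.

```python
def normalized_hot_spots(n_hot_spots, sequence_length):
    """NnHS: number of Hot Spots per 100 residues."""
    return 100.0 * n_hot_spots / sequence_length

def normalized_a4vss(a4vss, sequence_length):
    """Na4vSS: a4vSS per 100 residues."""
    return 100.0 * a4vss / sequence_length

# Hypothetical AggreScan-style outputs for a 300-residue protein.
nnhs = normalized_hot_spots(6, 300)
na4vss = normalized_a4vss(-45.0, 300)
```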
Protein self-aggregation propensity for the full human proteome was evaluated with PASTA 2 using the best energy, which is a negative value, with the lowest scores corresponding to the highest aggregation propensity [139].
Liquid–liquid phase separation (LLPS) prediction
Two algorithms were used to predict the probability of LLPS for the proteins: PSPredictor [140] (http://www.pkumdl.cn:8000/PSPredictor/) and FuzDrop (https://fuzdrop.bio.unipd.it) [141–143]. FASTA-formatted amino acid sequences were used as input, and proteins were predicted as LLPS-positive when the value was ≥ 0.5 for PSPredictor and ≥ 0.6 for FuzDrop. PScore was also used, with a threshold of ≥ 4 for an LLPS-positive call [144] (http://abragam.med.utoronto.ca/~JFKlab/Software/psp.htm).
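The thresholds above can be applied to precomputed predictor scores as in the following Python sketch. The dictionary layout and function name are hypothetical, and the scores are made up for illustration; the sketch does not call any of the prediction servers.

```python
# LLPS-positive thresholds as stated in the text.
LLPS_THRESHOLDS = {"PSPredictor": 0.5, "FuzDrop": 0.6, "PScore": 4.0}

def llps_calls(scores):
    """Return a per-predictor LLPS-positive flag for one protein."""
    return {name: value >= LLPS_THRESHOLDS[name] for name, value in scores.items()}

# Hypothetical scores for a single protein.
calls = llps_calls({"PSPredictor": 0.72, "FuzDrop": 0.55, "PScore": 4.0})
```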
Disorder parameter choice: correlation matrix
The correlation matrix of the disorder predictor assessments and other properties of all studied proteins (Figure S2) shows the Pearson correlation coefficient for each pair of parameters; the coefficient equals 1 if two parameters are fully correlated. The following parameters bring new independent information: "CDF" (Cumulative Distribution Function), "PONDR-FIT PPIDR" (PONDR® FIT assessment of predicted percentage of intrinsically disordered residues), "IUPred2A(Short) PPIDR" [assessment of predicted percentage of intrinsically disordered residues by the IUPred2A(Short) predictor], "VLXT PPIDR" (PPIDR estimated by the PONDR® VLXT predictor), "VL3 PPIDR" (PPIDR estimated by the PONDR® VL3 predictor), "VSL2 PPIDR" (PPIDR estimated by the PONDR® VSL2 predictor), "CH" (Charge–Hydropathy value), "Percentage of MoRF Residues" (percentage of residues prone to fold upon interaction with a partner), and "Protein length" (size of the protein in number of residues).
On the basis of this analysis, the PONDR® FIT, CH, and CDF scores were selected as representative disorder evaluations.
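A correlation matrix of this kind can be sketched in plain Python as follows. The column names echo the parameters listed above, but the values are hypothetical, and the helper functions are illustrative rather than the code used to produce Figure S2.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def correlation_matrix(columns):
    """Return a nested dict with the Pearson coefficient for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical per-protein values, for illustration only.
columns = {
    "PONDR-FIT PPIDR": [5.0, 12.0, 33.0, 60.0, 80.0],
    "VSL2 PPIDR": [10.0, 20.0, 45.0, 70.0, 95.0],
    "Protein length": [900.0, 450.0, 300.0, 520.0, 150.0],
}
matrix = correlation_matrix(columns)
```

Pairs of parameters with coefficients close to 1 carry largely redundant information, which is the rationale for keeping only a few representative scores.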
Data processing and statistical analysis
The prediction and analysis of protein intrinsic disorder were carried out using the disorder analysis platform, RIDAO [124] (https://ridao.app).
The PANTHER web server was used to classify the assembled sets of proteins according to biological processes. Protein UniProt IDs, fold enrichments, p-values, and GO terms were extracted automatically from the XML files using Python libraries (see Supplement Notebooks).
All 1702 proteins from the aging proteome were analyzed with the PANTHER [145] Overrepresentation Test (Released 20221013) on GO biological process complete (https://doi.org/10.5281/zenodo.6799722) for Homo sapiens; the reference dataset was the whole human proteome. Fisher’s test with False Discovery Rate (FDR) correction was used at a significance level of 0.05.
Fold enrichment is the number of proteins in the BP of interest in the aging proteome divided by the number of proteins expected in that BP for a random dataset of the same size as the aging proteome. nFoE denotes normalized fold enrichment.
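The fold enrichment definition above can be sketched in Python as follows. The function name and counts are hypothetical, and the normalization step leading to nFoE is not shown because it is not specified in this section.

```python
def fold_enrichment(hits_in_sample, sample_size, hits_in_reference, reference_size):
    """Observed count in the sample divided by the count expected by chance."""
    expected = sample_size * hits_in_reference / reference_size
    return hits_in_sample / expected

# Hypothetical counts: 40 of 1702 sample proteins annotated to a BP,
# against 200 of 20000 proteins in the reference proteome.
fe = fold_enrichment(40, 1702, 200, 20000)
```

A value above 1 indicates that the BP is overrepresented in the sample relative to a random dataset of equal size; whether the excess is significant is a separate question addressed by the statistical test.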
To compare the samples, the Wilcoxon test was used because the distribution of proteins by level of intrinsic disorder is not normal. The Wilcoxon test is non-parametric and tests the null hypothesis that the populations from which the two samples are taken have identical distributions, and hence identical medians.
Bootstrapping was used to quantify the differences between the medians of compared samples of significantly different sizes. Bootstrapping is a method of estimating statistics by generating pseudo-samples of the same size. In our study, bootstrapping was carried out to determine the confidence interval for the difference between the median values, obtained by generating random subsamples of the compared datasets (without replacement).
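A resampling procedure of this kind might look like the following Python sketch. The subsample size, the number of iterations, and the use of a percentile interval are assumptions made for illustration; the section does not state these details, and the function name is hypothetical.

```python
import random
from statistics import median

def median_difference_interval(sample_a, sample_b, subsample_size,
                               n_iter=2000, alpha=0.05, seed=0):
    """Percentile interval for median(a) - median(b) from equal-size subsamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_iter):
        sub_a = rng.sample(sample_a, subsample_size)  # drawn without replacement
        sub_b = rng.sample(sample_b, subsample_size)
        diffs.append(median(sub_a) - median(sub_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_iter)]
    upper = diffs[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper

# Hypothetical, clearly separated samples of different sizes.
small_sample = [float(v) for v in range(0, 100)]
large_sample = [float(v) for v in range(200, 1200)]
interval = median_difference_interval(small_sample, large_sample, subsample_size=50)
```

If the resulting interval does not contain zero, the medians of the two samples can be regarded as different at the chosen level.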
To compare the features of a subset with those of the full dataset, the Z-score was evaluated using the online resource https://www.socscistatistics.com/tests/ztest/default2.aspx.
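A Z-score for the difference between two proportions can be computed as in the following Python sketch. It assumes the pooled two-proportion form of the test with a two-tailed p-value, which is an assumption about the calculation performed by the online resource; the function name and counts are hypothetical.

```python
from math import erfc, sqrt

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-tailed p-value) for the difference of two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)  # pooled proportion under the null hypothesis
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = erfc(abs(z) / sqrt(2))  # two-tailed normal p-value
    return z, p_value

# Hypothetical counts: 52 of 100 proteins in a subset versus 360 of 1000 in a larger set.
z, p_value = two_proportion_z_test(52, 100, 360, 1000)
```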
Analyzing IDPs in interactomes: Cytoscape-assisted STRING PPI network creation and assessment of GO biological process and cellular compartment enrichment by the BiNGO app, PANTHER, and STRING functional enrichment
Protein–protein interaction (PPI) networks were built in Cytoscape (v.3.9.1; http://www.cytoscape.org) [146] with the STRING App [147] and BiNGO [148] applications according to the following protocol. To create the PPI network, a STRING protein query search was run for the whole collected protein dataset. The resulting network was then analyzed (tool “Analyze network”) to calculate the number of edges for each node (protein). The degree value defined in this way was used to identify hub proteins.
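The degree-based definition of hubs can be illustrated outside Cytoscape with a minimal Python sketch. The edge list and function names are hypothetical, and the sketch does not reproduce the Cytoscape "Analyze network" tool.

```python
from collections import Counter

def node_degrees(edges):
    """Count distinct interaction partners for every node in an undirected edge list."""
    unique_edges = {frozenset(pair) for pair in edges if pair[0] != pair[1]}  # drop duplicates and self-loops
    degree = Counter()
    for pair in unique_edges:
        for node in pair:
            degree[node] += 1
    return degree

def top_hubs(degree, n=1):
    """Return the n nodes with the highest degree."""
    return [node for node, _ in degree.most_common(n)]

# Hypothetical interactions between placeholder proteins A-D.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
degree = node_degrees(edges)
hubs = top_hubs(degree)
```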
Protein disorder scores, calculated as described above (“Methods”, subsection "Disorder Prediction: Description and Comparison of Algorithms"), were added to the node descriptor table using the Import function, with the query name selected as the key column. Excess edges (protein–protein interactions) were removed to retain only high-confidence connections by setting the STRING score to 0.95 and the subscores for textmining, databases, and experiments to 0.25, for coexpression to 0.03, and for cooccurrence, neighborhood, and fusion to 0. Nodes were clustered using Scaling (Layout Tools, Scale = 8) and the Edge-Weighted Spring Embedded Layout, which clusters by score, i.e., by interaction confidence (the distance matrix is obtained from the STRING global scores), so that interacting proteins with a higher global score are more likely to end up in the same cluster. The network was drawn with node size reflecting hubness (degree), node color reflecting the intrinsic disorder state (according to the RIDAO calculation of PONDR-FIT values), and edge color reflecting betweenness, so that the most confident interactions appear bolder.
Finally, Biological Process (BP) or Cellular Compartment overrepresentation (enrichment) analyses were performed with the BiNGO application for the protein subsets from the clusters. To verify that an enriched process indeed describes the selected cluster, enrichment was also calculated with STRING Functional Enrichment. An additional verification was carried out by analyzing the clusters in PANTHER [145] (http://www.pantherdb.org/).
Results
Bioinformatics analysis of structural intrinsic disorder of aging-associated proteins revealed the functional importance of full and partial absence of the structure.
Design of the analysis
A brief scheme of the study design is shown in Fig. 2. As discussed in “Methods”, subsection "Sources: proteins associated with aging", a dataset of 1702 proteins related to aging was collected using two different approaches: searching aging databases and reviewing the literature. For each protein of the resulting dataset, intrinsic disorder levels were analyzed using predictors of the PONDR family.
Fig. 2.
The scheme illustrating the study design
Next, these proteins were classified by disorder into distinct groups: ordered, partial IDPs, and IDPs. Using PANTHER [145] and BiNGO [148], key biological processes overrepresented in the obtained dataset were identified and further analyzed for IDP enrichment. KEGG pathways related to aging were analyzed for the presence of IDPs at different hierarchical levels. Following this step, the protein–protein interaction (PPI) network of all selected aging-related proteins was extensively analyzed using Cytoscape software (for clustering) with the BiNGO application (for Biological Process and Cellular Compartment enrichment) [148], along with PANTHER and STRING Functional Enrichment analysis of the protein clusters.
Afterward, the propensity for aggregation (by AggreScan and PASTA 2) of IDPs with a high propensity for LLPS (by FuzDrop, PSPredictor, and PScore) was assessed to identify specific markers of proteins prone to toxic aggregation.
Disorder evaluation results
Protein structural disorder was evaluated with the PONDR-FIT meta-predictor, which reflects the agreement of machine learning-based predictors. Combining the count of disordered residues [predicted percent of intrinsically disordered residues (PPIDR)] with the per-residue disorder score averaged over the whole protein sequence [mean disorder score (MDS)] (Fig. 3a) reveals that the majority of the selected proteins are moderately or highly disordered, and that 35.5% of proteins in the dataset are highly disordered and could be called IDPs (17.5% of proteins are in the pink area and 18% are in the red area, where PPIDR ≥ 30%). Because the PPIDR and MDS values from PONDR-FIT agree with each other, the PPIDR representation (PPIDRPONDR-FIT) was used to quantify intrinsic disorder in the remaining parts of the study. CH-CDF analysis, based on physical properties and machine learning prediction, also assigned 34% of proteins as IDPs (Fig. 3b, dots to the left of the vertical threshold CDF = 0).
Fig. 3.
Analysis of disorder scores for proteins from the whole dataset. a, d MDSPONDR-FIT (PPIDRPONDR-FIT) plot for the aging proteome and the whole human proteome, respectively. Labeled regions are marked by the 0.25 and 0.5 values of MDS and the 10% and 30% values of PPIDR. The MDS (mean disorder score, which describes the level of protein flexibility as a whole) and PPIDR (predicted percent of intrinsically disordered residues) scores from the PONDR-FIT assessment correlate with each other. b, e CH-CDF analysis for the aging proteome and the whole human proteome. CH-CDF analysis shows IDPs as proteins for which the CH and CDF predictions agree (upper-left quadrant) and as proteins predicted to be disordered by CDF only (bottom-left quadrant). c, f IDP abundance in the aging proteome and the whole human proteome
The whole human proteome has a slightly higher proportion of IDPs according to PPIDRPONDR-FIT (Fig. 3d, around 40%, i.e., 18% and 21.5% of proteins are in the pink and red areas) and CH-CDF analysis (Fig. 3e, around 40%).
Thus, examination of the level of intrinsic disorder in the aging protein dataset, collected from genomic and proteomic databases and studies, showed a high percentage (about 36%) of highly disordered proteins, which is nevertheless smaller than that evaluated for the whole human proteome (around 40%) (Fig. 3c, f, according to the threshold PPIDRPONDR-FIT ≥ 30%). The Z-score analysis, which assesses the significance of the difference between a quantitative property of a sub-group and the same property of a large population, was used here to assess the significance of the difference in IDP levels between the aging proteome and the whole human proteome. In this case, the Z-score is −3.5726, with a p-value of 0.00036, so the difference is significant at p < 0.05.
Analysis of the aging-associated proteins for LLPS propensities
Since intrinsically disordered proteins seem to be prone to toxic aggregation (β-amyloid, α-synuclein, tau protein, and others [148–157]), it is necessary to analyze whether highly disordered aging-associated proteins are potentially dangerous as drivers of proteinopathies. As a first step toward this goal, it is important to evaluate the probability of liquid–liquid phase separation for the proteins (Fig. 4). LLPS propensity was analyzed using algorithms (FuzDrop and PSPredictor) that do not correlate strongly with each other or with the disorder score (Figure S3a–c) and can therefore be considered independent.
It is of interest to analyze the distribution of proteins in the space spanned by PPIDRPONDR-FIT and the LLPS propensity predictors, PSPredictor and FuzDrop. Using PSPredictor, it is possible to compare the collected aging dataset with a control set of 1702 random proteins (Fig. 4a). In the aging dataset, 60% of proteins (against 54% for the control set) fall in the lower-left region, which corresponds to ordered proteins with a low probability of LLPS. A similar observation can be made using the larger, whole-proteome dataset with the FuzDrop predictor (Fig. 4b; blue bars for the aging dataset are higher than for the controls and the whole proteome; see also Figure S3d for FuzDrop LLPS vs. PPIDRPONDR-FIT plots). The Z-score for the difference in the abundance of ordered, non-LLPS proteins between the aging and whole human proteomes is 3.9796, with a p-value of 0.00006; the result is significant at p < 0.05. Thus, the aging dataset is, on average, biased toward ordered, non-LLPS proteins. This conclusion does not conflict with the results in Fig. 3c, where partial IDPs were separated from ordered proteins. The goal of this section is to understand whether the highly disordered proteins with high LLPS potential present in the aging dataset can serve as candidates for potentially toxic aggregation.
We can conclude from Figs. 3 and 4 that one specific characteristic of the collected aging dataset is an elevated percentage of ordered proteins with a low probability of LLPS. There is no over-abundance of intrinsic disorder in the aging dataset. However, it is well known that enzymes and secreted proteins, which are simple to monitor and are overrepresented in all studies, are biased toward the ordered state [158]. Furthermore, finding a specific IDP-enriched biological process would be quite useful for anti-aging interventions. To this end, we turned to a more detailed analysis.
Biological processes: analysis of the level of disorder in various biological processes
We checked the median disorder level independently for the protein subsets from the different sources that constitute our full aging dataset (Fig. 5a; see the detailed description of sources in Methods): databases (KEGG Longevity regulating pathway, GenAge, Gene Ontology terms “Aging” and “Cellular Senescence”, Aging Atlas, Digital Ageing Atlas); proteins whose expression profile changes with age, selected by literature analysis (Proteomics Review, Gene Expression Review, Chaperome, Aging brain); control samples (random datasets of 400 or 2500 human proteins); aging-associated and non-aging-associated proteins according to the Csaba Kerepesi classification of proteins [123] (ML aging-related, ML aging-not related); and an ordered protein control (Enzymes).
Comparison of the distribution of intrinsic disorder in each sample with that of the control random dataset, which represents the median proteome level of disorder, showed that none of the sources has a median disorder that is statistically significantly higher than that of the Random control. As a positive control, the highly ordered Enzymes dataset showed a strong statistical difference from the control (p-value ≤ 0.0001) according to the Wilcoxon test. The reviews summarize proteins whose expression differs significantly with aging (Proteomics and Gene Expression Reviews). Both focus on proteins (translated or transcribed) in blood plasma (see Supplementary Note 1 for a discussion of the validity of blood plasma data). Thus, the lower medians are possibly due to the fact that the majority of proteins in these two datasets are secreted. The Digital Ageing Atlas and "Gene Ontology Aging" datasets also have medians that are indeed lower than that of the control. Some aging-related datasets (in Fig. 5a, "GenAge", "Gene Ontology Cellular Senescence", "Aging Atlas", and "KEGG Longevity regulating pathway"), along with the Aging brain and Chaperome datasets, have a distribution very similar to that of the control sample.
Figure 5a also shows a comparison of proteins classified by the machine learning model [123] as related or not related to aging; these proteins were not taken into account when building our aging proteome, which was collected on the basis of experimental evidence. The Wilcoxon test confirms the difference between the ML aging-related and ML aging-not related samples. This result to some extent supports the hypothesis that aging-related proteins tend to have a higher level of intrinsic disorder. However, this model usually classifies proteins with a high number of interacting partners as "aging-related", and a high number of partners is a feature inherent to IDPs [159]. Thus, this result might be due to the specifics of the model and might not represent the general picture. Furthermore, the median level of disorder for the ML aging-related sub-dataset is only as high as in the control sample, so there is no abundance of disorder in the ML aging-related sample. In the subsequent analysis, we used neither the ML-derived aging dataset nor the Digital Ageing Atlas (see the comment in Methods, subsection "Sources: Proteins associated with aging").
To investigate the relation between intrinsic disorder and aging more deeply, we selected proteins (from our aging dataset) for which the direction of expression change was known from the literature [118, 120, 160]. UP- and DOWN-regulated proteins did not show statistical differences in levels of intrinsic disorder from the control sample or from each other (Figure S4). However, both the UP- and DOWN-regulated protein sets are composed of proteins that take part in a number of important biological processes. These biological processes could differ in their average levels of intrinsic disorder. Therefore, it is more important to analyze which of the most pronounced processes (in the resulting dataset) have increased (or decreased) levels of disorder, and whether aging proteins are more disordered than the other proteins of a given process.
The total set of proteins was analyzed for biological processes using the PANTHER web server [145], which showed the enrichment of many biological processes (BPs). Figure 5b shows the distribution of aging-associated proteins among 77 selected BPs representing aging hallmarks. Processes were selected based on three criteria: relevance to key aging mechanisms [1], enrichment for the BP and for regulation of this BP, and the presence of at least 8 proteins from the collected dataset.
All BP sub-groups of aging-related proteins were analyzed to find their disorder distributions (Fig. 5b). It can be concluded that a Gene Ontology analysis of the aging proteome revealed an abundance of IDPs in one-third of aging-associated processes, especially in regulation of gene expression, proteostasis maintenance, regulation of nutrient sensing, and DNA damage response. Such enrichment makes sense given the known regulatory function of IDPs, the need for disorder in interactions with DNA [30], and the presence of hormones in the IIS pathway (regulation of insulin-like growth signaling). Another one-third of aging-associated processes are depleted in IDPs because of their enrichment in enzymes. Finally, one-third of aging-associated processes have an IDP percentage similar to that of the whole human proteome. Taking into account that the median PPIDRPONDR-FIT value for the whole proteome is 22.44%, it can be stated that, among aging-related proteins, mostly ordered proteins take part in the processes below this threshold. In contrast, the aging-related proteins of processes above the threshold are mostly highly or moderately disordered. The graph also indicates that, for the selected aging-associated overrepresented processes, intrinsic disorder levels are higher than 10%, meaning that the median level of disorder for these processes is higher than the threshold for ordered proteins.
It is of interest to study the CH-CDF layout of the enriched biological processes. Most of them have a similar distribution, with around 40–75% of proteins in the lower-right quadrant (ordered proteins) (Fig. 6b). However, there are BPs enriched in ordered proteins by CH-CDF analysis (Fig. 6a): receptor signaling pathway via JAK/STAT (100% ordered proteins), regulation of receptor signaling pathway via JAK/STAT (100%), regulation of endoplasmic reticulum unfolded protein response (87.5%), phosphorus metabolic process (77.7%), G protein-coupled receptor signaling pathway (75.3%), and macroautophagy (75%). In contrast, other BPs are depleted in ordered proteins (Fig. 6c): epigenetic regulation of gene expression (24% ordered proteins), endoplasmic reticulum unfolded protein response (31.8%), histone deacetylation (33.3%), regulation of insulin-like growth factor receptor signaling (36.4%), DNA damage response, signal transduction by p53 class (37.5%), and circadian regulation of gene expression (38.1%).
Fig. 6.
CH-CDF analyses of proteins of our aging proteome in biological process sub-groups. a Processes relying on highly ordered proteins (more than 75% of proteins are ordered). b Processes relying on moderately ordered proteins (between 40 and 75% of proteins are ordered). c Processes depending on highly disordered proteins (less than 40% of proteins are ordered). Three examples are presented for each category
As a result, we can stress that different approaches to assessing protein structural disorder give consistent results in ranking BPs by disorder (compare the PONDR-FIT results in Fig. 5b with the analogous CH-CDF plot in Figure S5).
A special distribution in terms of intrinsic disorder may or may not be characteristic of the proteins from the collected dataset in comparison with the other proteins of each BP. To test this hypothesis, three aging-related processes were selected whose proteomes include a number of proteins from the obtained dataset large enough for a valid statistical analysis. For each process, the distribution of intrinsic disorder for the proteins from the obtained dataset was compared with that for the remaining proteins of this particular biological process (Figure S6). The results show that intrinsic disorder can be not only higher but also lower for aging-related proteins. Therefore, both the presence and the absence of protein intrinsic disorder may be important for the aging-related protein sub-group within a biological process.
Once the BPs overrepresented in the resulting dataset were identified, it was necessary to determine the contribution of the three disorder groups (ordered, partial IDP, and IDP) to the enrichment in each BP. Two complementary estimations were conducted, by PANTHER (Fig. 7 and Figure S7, sorted by the contribution of IDPs to enrichment) and by BiNGO (Figure S8, consistent with the PANTHER results). Based on these analyses, highly disordered proteins are most widely represented in the following processes: regulation of gene expression, histone deacetylation, endoplasmic reticulum unfolded protein response, and regulation of insulin-like growth factor receptor signaling pathway. Ordered proteins are depleted in DNA repair, cellular response to oxidative stress, and other processes in which IDPs are enriched in comparison with the whole proteome.
Fig. 7.
Contribution to BP enrichment of proteins with different disorder assignments. The PANTHER Overrepresentation Test (Released 20221013) was used to collect enriched biological processes, and those associated with aging hallmarks were chosen. All 1702 proteins from the aging proteome were analyzed on GO biological process complete [10.5281/zenodo.6799722] for Homo sapiens; the reference dataset was the whole human proteome. Fisher’s test with False Discovery Rate (FDR) correction was used at a significance level of 0.05. The contribution of the different disorder classes to enrichment is represented by coloring: blue for ordered, violet for partially disordered, and red for IDP. To compare the IDP presence in each BP with the level expected for the whole proteome, the distribution of whole human proteome proteins among the IDP, partial IDP, and ordered groups is shown, and vertical thresholds for the presence of IDPs and ordered proteins are drawn. The plots present normalized enrichment; non-normalized FoE plots are shown in Figure S8. nFoE denotes normalized fold enrichment (the number of proteins in the BP of interest in the aging proteome divided by the number of proteins expected in that BP for a random dataset of the same size as the aging proteome). * Values for “regulation of kinase activity” are averages of those for two GO terms (positive regulation of kinase activity, negative regulation of kinase activity); values for “G protein-coupled receptor signaling pathway” are averages of those for three GO terms (phospholipase C-activating G protein-coupled receptor signaling pathway, adenylate cyclase-activating G protein-coupled receptor signaling pathway, adenylate cyclase-modulating G protein-coupled receptor signaling pathway)
Such plots show again that gene expression regulation relies on highly disordered proteins, whereas enzyme-rich processes (such as "proteolysis", "metabolic process", etc.) are biased toward order when aging-related proteins are selected from these processes. However, this selection is sometimes ambiguous because there is no universal aging dataset or set of aging-related BPs. The databases used in the present study include different protein sets. Therefore, some incorrect conclusions about biases toward order or disorder in the aging-related selection can be ascribed to imperfect databases.
The presence of intrinsic disorder in signaling pathways associated with aging
The presence of intrinsic disorder in specific signaling pathways is of great interest because pathways integrate the BPs needed to accomplish a particular function. Figure 8a shows a diagram of the interactions of key proteins involved in the recognition of nutrients. Distortions in this pathway are one of the hallmarks of aging [1, 2], and interventions against this hallmark are very common [161]. The scheme illustrates the growth hormone and insulin-like growth factor signaling (IIS) and dietary restriction (DR) pathways and the relationship of these pathways with aging. The IIS pathway accelerates aging, whereas dietary restriction slows it down.
Fig. 8.
Nutrient sensing signaling pathway of mTOR complex 1. a Diagram showing key proteins and signaling pathways involved in nutrient recognition. The scheme of the pathway is modified from the publication of López-Otín and colleagues [1]. PPIDRPONDR-FIT values for each protein are labeled in protein boxes, coloring reflects classification by structural disorder: blue—ordered, violet—partially disordered, red—IDP. b Analysis of the level of intrinsic disorder of key proteins in aging regulating metabolism. Red-to-Green coloring (heat-map) represents high values as red: long length, long MoRFs length, high disorder score, high LLPS and aggregation propensities. UniProt IDs for the proteins: FOXO1—Q12778; PGC-1α—Q9UBK2; SIRT1—Q96EB6; IGF-1—P05019; AMPK gamma-2—Q9UGJ0; PTEN—P60484; Akt1—P31749; Akt3—Q9Y243; Akt2—P31751; mTOR—P42345; PI3K—P42336
The scheme in Fig. 8a demonstrates that opposing pathways (IIS, aging acceleration, and DR, aging deceleration) contain IDPs. Figure 8b illustrates the properties of key aging-related proteins in this metabolism-regulating network: PPIDRPONDR-FIT as the disorder score, abundance of MoRF regions, and LLPS and aggregation propensities. First, intrinsic disorder is represented differently in the key proteins in aging. Second, MoRF abundance and LLPS propensity correlate strongly with intrinsic disorder, whereas aggregation propensity predominantly anti-correlates with the other parameters. Only for AMPK subunit gamma-2 are disorder, LLPS propensity, and aggregation propensity all high at the same time. Thus, disordered and ordered proteins are mixed in the nutrient sensing signaling pathway.
Another pathway, the KEGG Longevity regulating pathway, is probably the most important pathway for the processes of aging. Analysis of this pathway (Fig. 9a) shows that most of the proteins involved in it are moderately (violet coloring) or completely intrinsically disordered (red boxes). The illustration also shows, as ovals, other signaling pathways intersecting with this pathway. All intersecting pathways are moderately intrinsically disordered: the median PPIDR value fluctuates around 19%. At the same time, for the proteins of the KEGG Longevity regulating pathway, the median value is considerably higher, 25%. Therefore, within this narrow dataset, we can say that aging-associated proteins turned out to be, on average, less ordered.
Fig. 9.
Classification of proteins of the KEGG Longevity regulating (a) and mTOR signaling pathway (b) according to the level of intrinsic disorder
It is interesting that both the KEGG Longevity regulating (Fig. 9a) and nutrient recognition (Fig. 9b) pathways (which partly overlap) have several highly ordered nodes that are followed by 1–2 disordered ones. This resembles brickwork (structure–function continuum [26–29]), where the interaction of ordered enzymes (“bricks”) is controlled by a mobile fraction of IDR-containing proteins (“viscous cement”).
Therefore, signaling pathways associated with aging contain IDPs on different hierarchical levels.
Analysis of protein–protein interaction (PPI) networks in BPs
In a study by Zeng and colleagues [121], 35 proteins and 4 processes were identified as significantly associated with lifespan. The PPI diagram (Fig. 10) shows that these proteins and processes are strongly interconnected. In the scheme, the levels of intrinsic disorder of the proteins are displayed and colored according to the boundary values: ordered proteins are shown in blue, partial IDPs in purple, and highly disordered proteins (IDPs) in red. From this scheme, it can be concluded that the proteins of the MAPK pathway, differentially expressed in centenarians, are characterized by a prevalence of intrinsic disorder, since all four proteins are highly or partially disordered. The immune response process can be called moderately disordered, and the response to stress process is biased toward ordered proteins. Thus, the 35 proteins characteristic of centenarians have different levels of disorder, and proteins with different levels of disorder make up different BPs.
Fig. 10.
Networks of proteins, characteristic for the Chinese centenarians, colored based on disorder (PPIDRPONDR-FIT)
Similar to this small protein dataset, the protein–protein interaction networks of our whole aging proteome were studied using the STRING database. PPI analysis complements the classification by BPs described in the previous sections because it groups proteins into clusters and reveals the main functions of the studied proteome. The PPI network obtained for the whole aging proteome is not scale-free (a scale-free network has a degree distribution that follows a power law); i.e., it contains clusters. Two random protein datasets of the same size as the aging dataset, drawn from the human proteome, show neither clusters nor enrichment in GO biological processes or cellular compartments (Figure S9a).
The PPI network of aging-associated proteins was divided into sub-groups (clusters) by interaction density. Proteins within a cluster share cellular compartment, biological process, and UP- or DOWN-regulation with aging [Fig. 11a; clusters were assigned to cellular compartments and BPs by agreement of the BiNGO app (Figure S9), PANTHER (Figure S10), and STRING Functional Enrichment (Figure S11)].
Fig. 11.
Analysis of the PPI network for the full collected dataset of aging-related proteins. a PPI network showing IDPs in red (PONDR-FIT PPIDR > 30%), partial IDPs in pink/violet (30% > PONDR-FIT PPIDR > 10%), and ordered proteins in blue (PONDR-FIT PPIDR < 10%). Node size is proportional to the number of connections. Four clusters are clearly distinguishable. GO analysis of the protein subsets of the clusters (Figure S11, S12) reveals their main functions (BPs), which are attributed to hallmarks of aging and written on the figure. For each cluster, the total number of proteins, the IDP fraction, and the LLPS-positive fraction are shown in black, red, and dark pink, respectively. b Distribution of disorder by cluster: blue parts, ordered proteins; violet parts, partial IDPs; red parts, IDPs. c FuzDrop and PSPredictor assessment of LLPS propensity for proteins from the PPI network clusters: violet parts, LLPS-prone proteins; green parts, proteins not prone to LLPS
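The disorder classes used for coloring can be illustrated with a minimal sketch. It assumes that per-residue disorder scores (for example, from PONDR-FIT) are already available as a list of numbers; the function names are hypothetical, and the assignment of the exact boundary values (10% and 30%) to the partial IDP class is an assumption, since the caption gives only strict inequalities.

```python
from typing import Sequence


def ppidr(scores: Sequence[float], cutoff: float = 0.5) -> float:
    """Percentage of residues whose disorder score exceeds the cutoff."""
    if not scores:
        raise ValueError("at least one per-residue score is required")
    disordered = sum(1 for s in scores if s > cutoff)
    return 100.0 * disordered / len(scores)


def disorder_class(ppidr_percent: float) -> str:
    """Map a PPIDR value (in percent) to one of the three classes.

    Thresholds follow the figure caption (10% and 30%). Values exactly
    on a boundary go to 'partial IDP', which is an assumption.
    """
    if ppidr_percent < 10.0:
        return "ordered"
    if ppidr_percent <= 30.0:
        return "partial IDP"
    return "IDP"


if __name__ == "__main__":
    # Hypothetical scores for a ten-residue toy sequence.
    toy_scores = [0.9, 0.8, 0.2, 0.1, 0.6, 0.3, 0.4, 0.7, 0.1, 0.2]
    value = ppidr(toy_scores)
    print(value, disorder_class(value))
```

Keeping the thresholds in one function makes it easy to switch to the stricter cut-offs used elsewhere in the text (for example, PPIDR > 50% for highly disordered proteins).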
Cluster №1, the densest one and therefore representative of the whole dataset, includes 264 proteins, ~30% of which are IDPs. It is enriched in nuclear localization and performs several hallmark-associated functions: nutrient sensing, intercellular communication, response to stress, maintenance of proteostasis, mitophagy, and cellular senescence. In contrast, the much smaller cluster №3, comprising 131 proteins, is enriched in IDPs (52% compared to 36% for the whole aging dataset, Fig. 11b; the difference is significant, with a z-score of 3.6521 and a p-value of 0.00026) and performs about the same number of functions related to the aging hallmarks, especially DNA-related ones, such as DNA repair, telomere maintenance, epigenetic regulation, stem cell exhaustion, and cellular senescence. Such multifunctionality of this cluster is expected given its enrichment in IDPs. Protein sub-group №3 is highly enriched in “nucleus” localization and DNA-associated functions, which confirms the importance of IDPs for the regulation of gene expression shown in the previous sections. Indeed, IDPs are known to be especially enriched among proteins that regulate chromatin and transcription [162]. High intrinsic disorder is also common for proteins of nuclear localization [34, 163].
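The significance estimate quoted above can be approximated with a pooled two-proportion z-test. The sketch below is an illustration rather than the authors' procedure: the counts (68 of 131 cluster proteins and 613 of 1702 reference proteins) are reconstructed from the rounded percentages, the 1702-protein aging dataset is assumed to be the reference set, and the choice of a pooled, two-sided test is also an assumption. Because of the rounding, the result should be close to, but not identical with, the reported z-score.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Assumed counts: ~52% of 131 proteins vs ~36% of 1702 proteins.
    z, p = two_proportion_z_test(68, 131, 613, 1702)
    print(f"z = {z:.3f}, p = {p:.5f}")
```

The same function can be applied to the LLPS-positive fractions of a cluster and of the whole dataset once the corresponding counts are known.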
The aging dataset is not expected to be unique in having a protein cluster with high intrinsic disorder and LLPS propensity that is related to “nuclear localization” and “genome regulation”. To test this hypothesis, a dataset of almost equal size was constructed from the GO biological processes translation (GO:0006412), glycolytic process (GO:0006096), and regulation of glycolytic process (GO:0006110), and the GO cellular compartment mitochondrion (GO:0005739). The resulting PPI network contains clusters, and the clusters with the highest IDP and LLPS content show, among others, enrichment in the nucleus and cytoskeleton cellular compartments and in DNA-associated biological processes (gene expression) (Fig. S9a). It is known that some cellular compartments are enriched in IDPs (cytosol, nucleus, membranes, cytoskeleton) [164]. It is therefore notable that the IDPs of the aging proteome appear to be biased toward nuclear localization.
Cluster №4, enriched in proteins that are UP-regulated with aging, regulates immune and stress responses. DOWN-regulated proteins are enriched in cluster №2, which ensures the protein folding function. The IDP content of these clusters does not differ significantly from that of the whole dataset.
Thus, in the collected dataset we observe UP-regulation of stress responses and DOWN-regulation of structural processes, such as proteostasis, which is consistent with previous works [119, 165].
Another important question is the relation between disorder and the characteristics of the networks formed. Two different networks were constructed for comparison: one formed by the aging-related ordered proteins (PONDR-FIT PPIDR 0–10%) and another formed by the aging-related highly disordered proteins (PONDR-FIT PPIDR > 50%). The network formed by ordered proteins turned out to be nearly 20% less dense and 10% less heterogeneous than the network formed by the disordered proteins (Figure S12). These results are in line with the idea that intrinsically disordered proteins have more interactions with partners.
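The two network characteristics compared above can be computed from an edge list as in the following minimal sketch. It assumes an undirected network without self-loops, defines density as the fraction of possible edges that are present, and takes heterogeneity to be the coefficient of variation of node degrees, which is one common definition and not necessarily the one used by the authors. The function name and the toy networks are hypothetical.

```python
import math
from collections import defaultdict
from typing import Hashable, Iterable, Sequence, Tuple


def network_stats(
    nodes: Sequence[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Tuple[float, float]:
    """Return (density, heterogeneity) of an undirected network."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    n = len(nodes)
    degrees = [len(neighbors[node]) for node in nodes]
    edge_count = sum(degrees) / 2

    # Density: present edges divided by the n * (n - 1) / 2 possible edges.
    density = 2.0 * edge_count / (n * (n - 1)) if n > 1 else 0.0

    # Heterogeneity: standard deviation of degrees divided by their mean.
    mean_degree = sum(degrees) / n if n else 0.0
    if mean_degree == 0:
        return density, 0.0
    variance = sum((d - mean_degree) ** 2 for d in degrees) / n
    return density, math.sqrt(variance) / mean_degree


if __name__ == "__main__":
    # Toy examples: a triangle (uniform degrees) and a star (one hub).
    triangle = network_stats(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
    star = network_stats(["hub", "x", "y", "z"], [("hub", "x"), ("hub", "y"), ("hub", "z")])
    print(triangle, star)
```

In the toy example the star network is less dense but more heterogeneous than the triangle, because a single hub concentrates the connections.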
LLPS prediction (Fig. 11c, Figure S13) showed that IDPs have a high probability of being engaged in liquid–liquid phase transitions. Protein clusters associated with molecular hallmarks of aging contain LLPS-positive proteins. The IDP-enriched cluster №3 is also enriched in LLPS-positive proteins, so nuclear IDPs associated with aging have a high LLPS propensity. These observations indicate that LLPS is coupled to intrinsic disorder and is quite important for aging. The z-score for the difference in enrichment of LLPS-positive proteins between cluster №3 and the whole aging dataset is 2.5844 (p-value < 0.01).
In-depth analysis of highly disordered proteins from various biological processes
Insoluble protein aggregates are the hallmarks of many neurodegenerative diseases. However, whether aggregates cause cellular toxicity is still not clear, even in simpler cellular systems. For example, aggregation of TDP-43 is not harmful but protects cells, most likely by taking the protein away from a toxic liquid-like phase [87]. On the other hand, it is recognized that protein aggregation could be dangerous; e.g., aggregated proteins can distort the biogenesis and mechanical properties of PMLOs [92, 166, 167]. Therefore, aggregation and LLPS propensities should be assessed and taken into account for the IDPs of the aging dataset that have representative biological functions (BPs).
For each of the biological processes found to be significantly overrepresented, some highly disordered proteins were selected for in-depth analysis (Fig. 12). We also analyzed proteins that form functional liquid-like states but are known to be able to switch to less dynamic reversible hydrogel structures or even irreversible gel-like states sustained by fibrillar aggregates: Fused in Sarcoma (FUS), Trans-activation response DNA-binding protein 43 (TDP-43), and heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) [94]. Functional amyloids and prions were analyzed as well (shown below the black line in Fig. 12). High scores for structural disorder, LLPS, and aggregation appear to be characteristic of proteinopathy-associated proteins (tau, prion protein, α-synuclein, SOD1, β-amyloid precursor protein, Huntingtin, and APOE, marked in bold black in Fig. 12).
Fig. 12.
Analysis of the most disordered proteins from different biological processes for “dangerous” features: LLPS propensity and aggregation. Proteinopathy-associated proteins are highlighted in bold black, and new “dangerous” IDPs are marked in red. Heatmap columns are colored on a gradient from red to green, from the highest to the lowest values of disorder (MoRFs, PONDR-FIT PPIDR), LLPS propensity (PSPredictor, FuzDrop, PScore), and aggregation scores (AggreScan number and area of “Hot Spots”, PASTA best energy). *Tau protein, which is known to aggregate, has a low AggreScan score, but in the original publication [137] the authors claim that a certain number of “Hot Spots” is enough to predict a protein as aggregation-prone
For ordinary proteins that are neither disease-associated nor dangerous, the picture is usually different: ordered proteins have low disorder and LLPS scores but high predicted aggregation, and, vice versa, IDPs have high disorder and LLPS scores but low aggregation. The reasons for such clustering are clear from the structural point of view. Hydrophobic regions inside the globule of an ordered protein will tend to aggregate with any other protein in the case of misfolding. In other words, structurally ordered domains are often prone to aggregation, but they can avoid it if they fold quickly [168]. In contrast, IDPs are predominantly hydrophilic and charged, so they mostly do not tend to hide any regions from water inside aggregates. Generally, the presence of IDRs coupled with both high LLPS and high aggregation propensity is characteristic of disease-associated proteins (reviewed in [166]). Therefore, we considered such a combination as a potential danger marker of a protein. For the well-known LLPS-positive proteins G3BP1, SERF1, TDP-43 (TARDBP), and VHL, the aggregation assessments are close to those of the control proteinopathy-associated proteins. Thus, we have a small positive control confirming that simultaneous disorder, LLPS, and aggregation propensity could reflect dangerous features.
These qualitative conclusions were verified by building a predictive model that uses intrinsic disorder, LLPS, and aggregation propensities to compute a score and label a protein as dangerous or non-dangerous. The model was trained on a small dataset of pathogenic proteins from the CPAD 2.0 database [91] and on neutral datasets. Aggregation propensity was evaluated with PASTA2 [139]. The model is described in Supplementary Note 3. The accuracy of the predictions is 0.84.
All coefficients of the linear regression model have signs consistent with the corresponding feature contributing to danger: intrinsic disorder has a positive impact on the final dangerous-protein probability score (coefficient 0.06), as does LLPS propensity by PScore (0.3); the best energy of the PASTA2 aggregation prediction is negative and is lowest for highly aggregating proteins, so its coefficient of −0.16 means that a smaller PASTA2 score yields a higher dangerous-protein score. We can therefore conclude that the features used are, to some extent, responsible for dangerous protein behavior.
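The direction in which each feature moves the score can be illustrated with a small sketch. Only the three coefficients (0.06, 0.3, and −0.16) are taken from the text; the intercept, the decision threshold, the feature scaling, and all function names are hypothetical placeholders, because the actual model is specified in Supplementary Note 3, which is not reproduced here. The sketch should therefore be read as a demonstration of the signs of the effects, not as a reimplementation of the model.

```python
# Coefficients quoted in the text for the three descriptors.
COEF_DISORDER = 0.06   # intrinsic disorder
COEF_PSCORE = 0.3      # LLPS propensity by PScore
COEF_PASTA = -0.16     # PASTA2 best energy (more negative = more aggregating)

# Hypothetical values: not given in the text.
INTERCEPT = 0.0
THRESHOLD = 0.5


def danger_score(disorder: float, pscore: float, pasta_energy: float) -> float:
    """Linear combination of the three descriptors."""
    return (
        INTERCEPT
        + COEF_DISORDER * disorder
        + COEF_PSCORE * pscore
        + COEF_PASTA * pasta_energy
    )


def is_dangerous(disorder: float, pscore: float, pasta_energy: float) -> bool:
    """Label a protein as dangerous when the score exceeds the threshold."""
    return danger_score(disorder, pscore, pasta_energy) > THRESHOLD


if __name__ == "__main__":
    # Two toy proteins differing only in PASTA2 best energy: the more
    # negative energy (stronger aggregation) gives the higher score.
    weak = danger_score(disorder=1.0, pscore=1.0, pasta_energy=-1.0)
    strong = danger_score(disorder=1.0, pscore=1.0, pasta_energy=-5.0)
    print(weak, strong, strong > weak)
```

Because the PASTA2 coefficient is negative and the energy itself is negative, the product is positive, which is why stronger predicted aggregation raises the score.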
This predictive model succeeded in labeling the following proteins from the CPAD 2 database of aggregating proteins: Fiber protein, Glucagon, PMEL M-alpha, Zona pellucida sperm-binding protein 1, Beta-endorphin, Corticoliberin, Defensin-6, Gastric inhibitory polypeptide, MLKL, Obestatin, Protachykinin-1, RIP1, RIP3, Somatotropin, and Urocortin-3 as “non-dangerous” functional amyloids.
For the aging dataset, 96 proteins were predicted as dangerous, but involvement in disease (based on the UniProt data) has been shown for only 57 of these proteins (57/96, i.e., accuracy of prediction 0.59). Therefore, the propensities of a protein for intrinsic disorder, liquid–liquid phase separation, and aggregation can serve as predictive markers of pathology, at least to some extent.
Conclusions
IDPs were given little importance in the past, and they could even be considered as a kind of rudiment without function, as was the case for non-coding RNA. Recently, the role of both has become clearer [169]. Among other things, their significant regulatory role was discovered, as well as their ability to influence the interactions of biopolymers with each other, including aggregation and a more subtle process, liquid–liquid phase separation. LLPS, as has become evident in the last decade, is closely related to the regulation of proteostasis and stress responses. At the same time, the peculiarities of IDPs, as well as the presence of phase separation in the cell, were ignored for a long time, in particular in aging research. To understand the role of disordered proteins and LLPS in the aging process, we conducted a comprehensive bioinformatics meta-analysis. Thus, we worked in two “dark” areas of aging research, both functional (IDP and LLPS) and methodological (bioinformatics), the combination of which can provide new information about the aging process.
To better uncover the mechanisms of aging, it is important to understand the prevalence of intrinsic disorder in proteins associated with aging and elucidate which cellular aging-related processes are most enriched in IDPs. Previously, similar intrinsic disorder-centric analyses have been performed for proteins involved in Alzheimer’s disease [81, 170], Parkinson’s disease [171, 172], as well as frontotemporal lobar degeneration and amyotrophic lateral sclerosis [82, 173]. These works showed the presence of IDPs in aging-related diseases.
Based on bioinformatics analysis, we investigated the characteristics of 1702 aging-associated proteins obtained from public databases or published studies. It has been shown that in aging-associated proteins, the disorder and LLPS propensity, coupled to each other, are on average lower than in the control sample and for the whole human proteome. Thus, both ordered and disordered LLPS-positive and -negative proteins are important for the aging processes.
Further, aging-associated proteins were attributed to distinct biological processes and consequently to different aging hallmarks. When the aging proteome was sorted by biological processes, it was discovered that there is significant heterogeneity in the average level of disorder between BPs—about one-third is enriched with ordered proteins and one-third with disordered ones. IDPs are enriched in DNA-associated processes (especially in regulation of gene expression, DNA damage response).
The analysis of signaling pathways showed that IDPs are located at all levels of the signaling system hierarchy: several highly ordered nodes are followed by 1–2 disordered ones. This looks like brickwork (structure–function continuum [26–29]), where the interaction of ordered enzymes (“bricks”) is controlled by the mobile fraction of IDR-containing proteins (“viscous cement”). If we add that IDPs are known for highly specific but weak interactions, we obtain even greater universality, which allows the formation of signaling networks of almost any shape and number of nodes, with high plasticity for removing or adding components. It should be further noted that some links of these pathways are protein complexes consisting of several subunits (such as mTORC1). Thus, a combination of ordered and disordered parts can be observed even within one component of the signaling system, in which one subunit may be responsible for enzymatic activity or scaffolding (ordered parts), while another may recruit interaction partners (parts enriched in disorder). This suggests an answer to the very delicate question of how one protein complex (mTORC1, AMPK, Sirt1, etc.) can quickly interact with dozens or hundreds of proteins, which may not even have a pronounced “lock-and-key” correspondence. The fact that the core metabolic pathways represented here are highly nutrient-dependent highlights the degree of human dependence on the environment, whose effects on IDR-containing proteins and LLPS are just beginning to be seriously explored.
For better understanding, aging-associated proteins were grouped by their mutual interactions, and four clusters were discovered. Two of them are mainly localized in the nucleus. One nuclear cluster is enriched in disordered proteins with a high propensity for LLPS and is responsible for DNA-associated functions. Therefore, we confirmed that IDPs and LLPS have a special role in aging, to the greatest extent within the processes associated with genome regulation.
Proteins UP-regulated with aging, which are associated with stress response, and DOWN-regulated proteins, which perform proteostasis functions, do not differ in intrinsic disorder and LLPS enrichment from each other or from the whole human proteome. Information about the change in expression with aging for the cluster of disordered and LLPS-prone proteins enriched in genome regulation is lacking and would be of high interest to explore.
Finally, we analyzed protein properties that could be used to distinguish dangerous proteins from neutral ones. A linear regression predictive model using IDP, LLPS, and aggregation propensities as protein descriptors has reasonable accuracy in revealing dangerous proteins. Therefore, we emphasize once again that intrinsic disorder levels and tendencies for liquid–liquid phase separation and aggregation should be considered in studies of aging and age-related pathologies.
Our approach to identifying dangerous markers in protein structure is consistent with the outputs of other studies. For example, one such marker is the presence of Low-complexity Aromatic-Rich Kinked Segments (LARKS) [1, 3] which tend to form inter-protein β-sheets (regions undergoing disorder-to-order transitions) in environments of high protein concentration [174]. The search for proteins prone to both aggregation and phase separation can be considered as a generalization of this approach.
In summary, our analysis reveals the special role of ID and LLPS in the regulation of aging, highlighting the group of proteins for a more thorough experimental study in that context.
Knowledge of the level of intrinsic disorder and the set of functions of the proteins under consideration will make it possible to deepen the understanding of the mechanisms of aging and outline ways to slow it down. To build a robust link between aging and structural disorder, as was done for other aging hallmarks [1, 2], positive and negative controls are required: rescue of accelerated aging by expression or depletion of IDPs, and premature aging after IDP knockout or overexpression, respectively. Some such studies have been reported so far [175, 176]. Further research is needed to add disturbance of IDPs, LLPS, and PMLOs as an aging hallmark.
Acknowledgements
NI acknowledges the Ministry of Science and Higher Education of the Russian Federation (Agreement # 075-03-2023-106, Project FSMG-2020-0003) for support of the analysis of protein–protein interactions. The research was also supported by the Ministry of Science and Higher Education of the Russian Federation (Agreement # 075-01593-23-04, Project 720000F.99.1.BN62AA40000) in the part of aggregation evaluation. The LLPS analysis part of this work was funded by the Russian Science Foundation, Grant number 21-75-10166 (A.V.F.).
Abbreviations
- IDP: Intrinsically disordered protein
- IDR: Intrinsically disordered region
- PONDR: Predictor of natural disordered regions
- PONDR-FIT, VLXT, VL3, VSL2, IUPred2A: Predictors of protein disorder (existence of the disordered residues in the protein) from PONDR family
- MDS: Mean disorder score, the average of PONDR scores of each residue for the whole protein
- PPIDR: Predicted percentage of intrinsically disordered residues (with propensity to be disordered higher than 0.5)
- CDF: Cumulative distribution function
- CH: Charge–hydropathy
- MoRF: Molecular recognition features (protein–partner interaction sites that acquire structure upon binding to partners)
- LLPS: Liquid–liquid phase separation
- FuzDrop, PSPredictor, PScore: Predictors of the LLPS propensity
- PPI: Protein–protein interaction
- BP: Biological process
- AggreScan, PASTA2: Predictors of the aggregation propensity
- IIS: Insulin and insulin-like signaling
- DR: Dietary restriction
- TERT: Telomerase catalytic unit
- nFoE: Normalized fold enrichment
- ML: Machine learning
Author contributions
VDM and NSI contributed equally to this work. NSI, VI, and VNU conceived and designed study. VDM, NSI, SVN, BMGAS, GWD, EVZ, SSM, AVF, IMK, KKT, and VNU conducted research, analyzed data, and collected and analyzed literature data. VDM, NSI, SVN, BMGAS, GWD, EVZ, and SSM designed illustrations. VDM, NSI, SVN, BMGAS, GWD, EVZ, SSM, AVF, IMK, KKT, VI, and VNU wrote the manuscript. All the authors have read and agreed to the published version of the manuscript.
Funding
This work was supported in part by the Ministry of Science and Higher Education of the Russian Federation (Agreement # 075-03-2023-106, Project FSMG-2020-0003; and Agreement # 075-01593-23-04, Project 720000F.99.1.BN62AA40000). The part of this work was funded by Russian Science Foundation (Grant number 21-75-10166).
Data availability
The authors confirm that the data supporting the findings of this study are available within the article and also present as electronic supplementary materials. Additional data will be available on reasonable request.
Declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Ethics approval and consent to participate
Not applicable.
Consent to publish
Not applicable.
Footnotes
Vladimir D. Manyilov and Nikolay S. Ilyinsky contributed equally.
Change history
1/13/2024
A Correction to this paper has been published: 10.1007/s00018-023-04989-0
Contributor Information
Nikolay S. Ilyinsky, Email: ilinsky@phystech.edu
Vladimir N. Uversky, Email: vuversky@usf.edu
References
- 1. López-Otín C, Blasco MA, Partridge L, et al. The hallmarks of aging. Cell. 2013;153:1194–1217. doi: 10.1016/j.cell.2013.05.039.
- 2. López-Otín C, Blasco MA, Partridge L, et al. Hallmarks of aging: an expanding universe. Cell. 2023;186:243–278.
- 3. Anisimov VN. Insulin/IGF-1 signaling pathway driving aging and cancer as a target for pharmacological intervention. Exp Gerontol. 2003;38:1041–1049. doi: 10.1016/S0531-5565(03)00169-4.
- 4. Franceschi C, Garagnani P, Morsiani C, et al. The continuum of aging and age-related diseases: common mechanisms but different rates. Front Med (Lausanne). 2018;5:61. doi: 10.3389/fmed.2018.00061.
- 5. Wright PE, Dyson HJ. Intrinsically disordered proteins in cellular signalling and regulation. Nat Rev Mol Cell Biol. 2015;16:18–29. doi: 10.1038/nrm3920.
- 6. Dunker AK, Silman I, Uversky VN, Sussman JL. Function and structure of inherently disordered proteins. Curr Opin Struct Biol. 2008;18:756–764. doi: 10.1016/j.sbi.2008.10.002.
- 7. Uversky VN. Intrinsic disorder-based protein interactions and their modulators. Curr Pharm Des. 2013;19:4191–4213. doi: 10.2174/1381612811319230005.
- 8. Uversky VN. Unusual biophysics of intrinsically disordered proteins. Biochim Biophys Acta (BBA) Proteins Proteom. 2013;1834:932–951. doi: 10.1016/j.bbapap.2012.12.008.
- 9. Uversky VN. Functional roles of transiently and intrinsically disordered regions within proteins. FEBS J. 2015;282:1182–1189. doi: 10.1111/febs.13202.
- 10. Dunker AK, Cortese MS, Romero P, et al. Flexible nets: the roles of intrinsic disorder in protein interaction networks. FEBS J. 2005;272:5129–5148. doi: 10.1111/j.1742-4658.2005.04948.x.
- 11. Bondos SE, Dunker AK, Uversky VN. On the roles of intrinsically disordered proteins and regions in cell communication and signaling. Cell Commun Signal. 2021;19:88. doi: 10.1186/s12964-021-00774-3.
- 12. Uversky VN, Oldfield CJ, Dunker AK. Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. J Mol Recognit: Interdiscip J. 2005;18:343–384. doi: 10.1002/jmr.747.
- 13. Van Der Lee R, Buljan M, Lang B, et al. Classification of intrinsically disordered regions and proteins. Chem Rev. 2014;114:6589–6631. doi: 10.1021/cr400525m.
- 14. Schulz GE. Nucleotide binding proteins. In: Molecular mechanisms of biological recognition. Amsterdam: Elsevier; 1979. pp 79–94.
- 15. Dunker AK, Obradovic Z. The protein trinity—linking function and disorder. Nat Biotechnol. 2001;19:805–806. doi: 10.1038/nbt0901-805.
- 16. Uversky VN, Dunker AK. Understanding protein non-folding. Biochim Biophys Acta (BBA) Proteins Proteom. 2010;1804:1231–1264. doi: 10.1016/j.bbapap.2010.01.017.
- 17. Oldfield CJ, Cheng Y, Cortese MS, et al. Coupled folding and binding with α-helix-forming molecular recognition elements. Biochemistry. 2005;44:12454–12470. doi: 10.1021/bi050736e.
- 18. Mohan A, Oldfield CJ, Radivojac P, et al. Analysis of molecular recognition features (MoRFs). J Mol Biol. 2006;362:1043–1059. doi: 10.1016/j.jmb.2006.07.087.
- 19. Vacic V, Oldfield CJ, Mohan A, et al. Characterization of molecular recognition features, MoRFs, and their binding partners. J Proteome Res. 2007;6:2351–2366. doi: 10.1021/pr0701411.
- 20. Cheng Y, Oldfield CJ, Meng J, et al. Mining α-helix-forming molecular recognition features with cross species sequence alignments. Biochemistry. 2007;46:13468–13477. doi: 10.1021/bi7012273.
- 21. Disfani FM, Hsu W-L, Mizianty MJ, et al. MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins. Bioinformatics. 2012;28:i75–i83. doi: 10.1093/bioinformatics/bts209.
- 22. Yan J, Dunker AK, Uversky VN, Kurgan L. Molecular recognition features (MoRFs) in three domains of life. Mol Biosyst. 2016;12:697–710. doi: 10.1039/C5MB00640F.
- 23. Mészáros B, Simon I, Dosztányi Z. The expanding view of protein–protein interactions: complexes involving intrinsically disordered proteins. Phys Biol. 2011;8:035003. doi: 10.1088/1478-3975/8/3/035003.
- 24. Malaney P, Pathak RR, Xue B, et al. Intrinsic disorder in PTEN and its interactome confers structural plasticity and functional versatility. Sci Rep. 2013;3:1–14. doi: 10.1038/srep02035.
- 25. Koshland DE Jr. Application of a theory of enzyme specificity to protein synthesis. Proc Natl Acad Sci. 1958;44:98–104. doi: 10.1073/pnas.44.2.98.
- 26. Uversky VN. Protein intrinsic disorder and structure-function continuum. Prog Mol Biol Transl Sci. 2019;166:1–17. doi: 10.1016/bs.pmbts.2019.05.003.
- 27. Fonin AV, Darling AL, Kuznetsova IM, et al. Multi-functionality of proteins involved in GPCR and G protein signaling: making sense of structure–function continuum with intrinsic disorder-based proteoforms. Cell Mol Life Sci. 2019;76:4461–4492. doi: 10.1007/s00018-019-03276-1.
- 28. Uversky VN. p53 proteoforms and intrinsic disorder: an illustration of the protein structure–function continuum concept. Int J Mol Sci. 2016;17:1874. doi: 10.3390/ijms17111874.
- 29. Uversky VN. Dancing protein clouds: the strange biology and chaotic physics of intrinsically disordered proteins. J Biol Chem. 2016;291:6681–6688. doi: 10.1074/jbc.R115.685859.
- 30. Nesterov SV, Ilyinsky NS, Uversky VN. Liquid–liquid phase separation as a common organizing principle of intracellular space and biomembranes providing dynamic adaptive responses. Biochim Biophys Acta (BBA) Mol Cell Res. 2021;1868:119102. doi: 10.1016/j.bbamcr.2021.119102.
- 31. Cermakova K, Demeulemeester J, Lux V, et al. A ubiquitous disordered protein interaction module orchestrates transcription elongation. Science. 2021;374:1113–1121. doi: 10.1126/science.abe2913.
- 32. Cermakova K, Veverka V, Hodges HC. The TFIIS N-terminal domain (TND): a transcription assembly module at the interface of order and disorder. Biochem Soc Trans. 2023;51:125–135. doi: 10.1042/BST20220342.
- 33. Uversky VN, Kuznetsova IM, Turoverov KK, Zaslavsky B. Intrinsically disordered proteins as crucial constituents of cellular aqueous two phase systems and coacervates. FEBS Lett. 2015;589:15–22. doi: 10.1016/j.febslet.2014.11.028.
- 34. Meng F, Na I, Kurgan L, Uversky VN. Compartmentalization and functionality of nuclear disorder: intrinsic disorder and protein–protein interactions in intra-nuclear compartments. Int J Mol Sci. 2016;17:24. doi: 10.3390/ijms17010024.
- 35. Uversky VN. Protein intrinsic disorder-based liquid–liquid phase transitions in biological systems: complex coacervates and membrane-less organelles. Adv Colloid Interface Sci. 2017;239:97–114. doi: 10.1016/j.cis.2016.05.012.
- 36. Uversky VN. Intrinsically disordered proteins in overcrowded milieu: membrane-less organelles, phase separation, and intrinsic disorder. Curr Opin Struct Biol. 2017;44:18–30. doi: 10.1016/j.sbi.2016.10.015.
- 37. Darling AL, Liu Y, Oldfield CJ, Uversky VN. Intrinsically disordered proteome of human membrane-less organelles. Proteomics. 2018;18:1700193. doi: 10.1002/pmic.201700193.
- 38. Darling AL, Zaslavsky BY, Uversky VN. Intrinsic disorder-based emergence in cellular biology: physiological and pathological liquid-liquid phase transitions in cells. Polymers (Basel). 2019;11:990. doi: 10.3390/polym11060990.
- 39. Turoverov KK, Kuznetsova IM, Fonin AV, et al. Stochasticity of biological soft matter: emerging concepts in intrinsically disordered proteins and biological phase separation. Trends Biochem Sci. 2019;44:716–728. doi: 10.1016/j.tibs.2019.03.005.
- 40. Uversky VN. Supramolecular fuzziness of intracellular liquid droplets: liquid–liquid phase transitions, membrane-less organelles, and intrinsic disorder. Molecules. 2019;24:3265. doi: 10.3390/molecules24183265.
- 41. Uversky VN. Recent developments in the field of intrinsically disordered proteins: intrinsic disorder-based emergence in cellular biology in light of the physiological and pathological liquid–liquid phase transitions. Annu Rev Biophys. 2021;50:135–156. doi: 10.1146/annurev-biophys-062920-063704.
- 42. Kulkarni P, Bhattacharya S, Achuthan S, et al. Intrinsically disordered proteins: critical components of the wetware. Chem Rev. 2022;122:6614–6633. doi: 10.1021/acs.chemrev.1c00848.
- 43. Antifeeva IA, Fonin AV, Fefilova AS, et al. Liquid–liquid phase separation as an organizing principle of intracellular space: overview of the evolution of the cell compartmentalization concept. Cell Mol Life Sci. 2022;79:251. doi: 10.1007/s00018-022-04276-4.
- 44. Fonin AV, Antifeeva IA, Kuznetsova IM, et al. Biological soft matter: intrinsically disordered proteins in liquid–liquid phase separation and biomolecular condensates. Essays Biochem. 2022;66:831–847. doi: 10.1042/EBC20220052.
- 45. Lyons H, Veettil RT, Pradhan P, et al. Functional partitioning of transcriptional regulators by patterned charge blocks. Cell. 2023;186:327–345.e28. doi: 10.1016/j.cell.2022.12.013.
- 46. Dunker AK, Romero P, Obradovic Z, et al. Intrinsic protein disorder in complete genomes. Genome Inform. 2000;11:161–171.
- 47. Ward JJ, Sodhi JS, McGuffin LJ, et al. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 2004;337:635–645. doi: 10.1016/j.jmb.2004.02.002.
- 48. Xue B, Dunker AK, Uversky VN. Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life. J Biomol Struct Dyn. 2012;30:137–149. doi: 10.1080/07391102.2012.675145.
- 49. Kim J-Y, Cho Y-E, Park J-H. The nucleolar protein GLTSCR2 is an upstream negative regulator of the oncogenic nucleophosmin-MYC axis. Am J Pathol. 2015;185:2061–2068. doi: 10.1016/j.ajpath.2015.03.016.
- 50. Shi Y, Xu X, Zhang Q, et al. tRNA synthetase counteracts c-Myc to develop functional vasculature. Elife. 2014;3:e02349. doi: 10.7554/eLife.02349.
- 51. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144:646–674. doi: 10.1016/j.cell.2011.02.013.
- 52. Wu K-J, Grandori C, Amacker M, et al. Direct activation of TERT transcription by c-MYC. Nat Genet. 1999;21:220–224. doi: 10.1038/6010.
- 53. Takahashi K, Tanabe K, Ohnuki M, et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell. 2007;131:861–872. doi: 10.1016/j.cell.2007.11.019.
- 54. Ocampo A, Reddy P, Martinez-Redondo P, et al. In vivo amelioration of age-associated hallmarks by partial reprogramming. Cell. 2016;167:1719–1733. doi: 10.1016/j.cell.2016.11.052.
- 55. Lu Y, Brommer B, Tian X, et al. Reprogramming to recover youthful epigenetic information and restore vision. Nature. 2020;588:124–129. doi: 10.1038/s41586-020-2975-4.
- 56.Yang J-H, Hayano M, Griffin PT, et al. Loss of epigenetic information as a cause of mammalian aging. Cell. 2023;186:305–326. doi: 10.1016/j.cell.2022.12.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Xue B, Dunbrack RL, Williams RW, et al. PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochim Biophys Acta (BBA) Proteins Proteom. 2010;1804:996–1010. doi: 10.1016/j.bbapap.2010.01.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Akdel M, Pires DEV, Pardo EP, et al. A structural biology community assessment of AlphaFold2 applications. Nat Struct Mol Biol. 2022;29:1056–1067. doi: 10.1038/s41594-022-00849-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Zhao B, Ghadermarzi S, Kurgan L. Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. Comput Struct Biotechnol J. 2023;21:3248–3258. doi: 10.1016/j.csbj.2023.06.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Piovesan D, Monzon AM, Tosatto SCE. Intrinsic protein disorder and conditional folding in AlphaFoldDB. Protein Sci. 2022 doi: 10.1002/pro.4466. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Kovacs D, Tompa P. Diverse functional manifestations of intrinsic structural disorder in molecular chaperones. Biochem Soc Trans. 2012;40(5):963–968. doi: 10.1042/BST20120108. [DOI] [PubMed] [Google Scholar]
- 62.Lindner AB, Demarez A. Protein aggregation as a paradigm of aging. Biochim Biophys Acta. 2009;1790(10):980–996. doi: 10.1016/j.bbagen.2009.06.005. [DOI] [PubMed] [Google Scholar]
- 63.Csermely P, Sőti C. Cellular networks and the aging process. Arch Physiol Biochem. 2006;112(2):60–64. doi: 10.1080/13813450600711243. [DOI] [PubMed] [Google Scholar]
- 64.Iakoucheva LM, Radivojac P, Brown CJ, et al. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32:1037–1049. doi: 10.1093/nar/gkh253. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Collins MO, Yu L, Campuzano I, et al. Phosphoproteomic analysis of the mouse brain cytosol reveals a predominance of protein phosphorylation in regions of intrinsic sequence disorder. Mol Cell Proteom. 2008;7:1331–1348. doi: 10.1074/mcp.M700564-MCP200. [DOI] [PubMed] [Google Scholar]
- 66.Darling AL, Uversky VN. Intrinsic disorder and posttranslational modifications: the darker side of the biological dark matter. Front Genet. 2018;9:158. doi: 10.3389/fgene.2018.00158. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Pejaver V, Hsu W, Xin F, et al. The structural and functional signatures of proteins that undergo multiple events of post-translational modification. Protein Sci. 2014;23:1077–1093. doi: 10.1002/pro.2494. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Xie H, Vucetic S, Iakoucheva LM, et al. Functional anthology of intrinsic disorder. 3. Ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins. J Proteome Res. 2007;6:1917–1932. doi: 10.1021/pr060394e. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Smith NS, Kuravsky M, Shammas SL, Matthews JM. Binding and folding in transcriptional complexes. Current Opinion in Structural Biology. 2021;66:156–162. doi: 10.1016/j.sbi.2020.10.026. [DOI] [PubMed] [Google Scholar]
- 70.Gian Carlo GP, Ivette P, Jennifer LF, Britney NH, Lee H-W, Carrie LP (2020) The human CRY1 tail controls circadian timing by regulating its association with CLOCK:BMAL1 Significance Proceedings of the National Academy of Sciences 117(45):27971–27979. 10.1073/pnas.1920653117 [DOI] [PMC free article] [PubMed]
- 71.Gustafson CL, Parsley NC, Asimgil H, et al. A Slow Conformational Switch in the BMAL1 Transactivation Domain Modulates Circadian Rhythms. Mol Cell. 2017;66(4):447–457.e7. doi: 10.1016/j.molcel.2017.04.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Partch CL. Orchestration of Circadian Timing by Macromolecular Protein Assemblies. J Mol Biol. 2020;432(12):3426–3448. doi: 10.1016/j.jmb.2019.12.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Vinogradova IA, Anisimov VN, Bukalev AV, et al. Circadian disruption induced by light-at-night accelerates aging and promotes tumorigenesis in rats. Aging. 2009;1(10):855–865. doi: 10.18632/aging.v1i1010.18632/aging.100092. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Riera CE, Dillin A. Tipping the metabolic scales towards increased longevity in mammals. Nat Cell Biol. 2015;17(3):196–203. doi: 10.1038/ncb3107. [DOI] [PubMed] [Google Scholar]
- 75.Campisi J, Kapahi P, Lithgow GJ, Melov S, Newman JC, Verdin E. From discoveries in ageing research to therapeutics for healthy ageing. Nature. 2019;571(7764):183–192. doi: 10.1038/s41586-019-1365-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Uversky VN, Oldfield CJ, Midic U, et al (2009) Unfoldomics of human diseases: linking protein intrinsic disorder with diseases BMC Genomics 10(Suppl 1): S7. 10.1186/1471-2164-10-S1-S7 [DOI] [PMC free article] [PubMed]
- 77.Joerger AC, Fersht AR. Structure–function–rescue: the diverse nature of common p53 cancer mutants. Oncogene. 2007;26(15):2226–2242. doi: 10.1038/sj.onc.1210291. [DOI] [PubMed] [Google Scholar]
- 78.Childs BG, Baker DJ, Kirkland JL, Campisi J, van Deursen JM. Senescence and apoptosis: dueling or complementary cell fates? EMBO reports. 2014;15(11):1139–1153. doi: 10.15252/embr.201439245. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Ilyinsky NS, Nesterov SV, Shestoperova EI, et al. On the role of normal aging processes in the onset and pathogenesis of diseases associated with the abnormal accumulation of protein aggregates. Biochem Mosc. 2021;86:275–289. doi: 10.1134/S0006297921030056. [DOI] [PubMed] [Google Scholar]
- 80.Uversky VN. Intrinsically disordered proteins and their (disordered) proteomes in neurodegenerative disorders. Front Aging Neurosci. 2015;7:18. doi: 10.3389/fnagi.2015.00018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Gadhave K, Gehi BR, Kumar P, et al. Subclassifying disordered proteins by the CH-CDF plot method. Cell Mol Life Sci. 2020;77:4163–4208. doi: 10.1007/s00018-019-03414-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Uversky VN. The roles of intrinsic disorder-based liquid-liquid phase transitions in the “Dr. Jekyll–Mr. Hyde” behavior of proteins involved in amyotrophic lateral sclerosis and frontotemporal lobar degeneration. Autophagy. 2017;13:2115–2162. doi: 10.1080/15548627.2017.1384889. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Iakoucheva LM, Brown CJ, Lawson JD, et al. Intrinsic disorder in cell-signaling and cancer-associated proteins. J Mol Biol. 2002;323:573–584. doi: 10.1016/S0022-2836(02)00969-5. [DOI] [PubMed] [Google Scholar]
- 84.Uversky VN. Amyloidogenesis of natively unfolded proteins. Curr Alzheimer Res. 2008;5:260–287. doi: 10.2174/156720508784533312. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Cheng Y, LeGall T, Oldfield CJ, et al. Abundance of intrinsic disorder in protein associated with cardiovascular disease. Biochemistry. 2006;45:10448–10460. doi: 10.1021/bi060981d. [DOI] [PubMed] [Google Scholar]
- 86.Du Z, Uversky VN. A comprehensive survey of the roles of highly disordered proteins in type 2 diabetes. Int J Mol Sci. 2017;18:2010. doi: 10.3390/ijms18102010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Bolognesi B, Faure AJ, Seuma M, et al. The mutational landscape of a prion-like domain. Nat Commun. 2019;10:1–12. doi: 10.1038/s41467-019-12101-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Miller SBM, Mogk A, Bukau B. Spatially organized aggregation of misfolded proteins as cellular stress defense strategy. J Mol Biol. 2015;427:1564–1574. doi: 10.1016/j.jmb.2015.02.006. [DOI] [PubMed] [Google Scholar]
- 89.Li J, McQuade T, Siemer AB, et al. The RIP1/RIP3 necrosome forms a functional amyloid signaling complex required for programmed necrosis. Cell. 2012;150:339–350. doi: 10.1016/j.cell.2012.06.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Garcia-Pardo J, Bartolomé-Nafría A, Chaves-Sanjuan A, et al. Cryo-EM structure of hnRNPDL-2 fibrils, a functional amyloid associated with limb-girdle muscular dystrophy D3. Nat Commun. 2023 doi: 10.1038/s41467-023-35854-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Rawat P, Prabakaran R, Sakthivel R, et al. CPAD 2.0: a repository of curated experimental data on aggregating proteins and peptides. Amyloid. 2020;27:128–133. doi: 10.1080/13506129.2020.1715363. [DOI] [PubMed] [Google Scholar]
- 92.Darling AL, Shorter J. Combating deleterious phase transitions in neurodegenerative disease. Biochim Biophys Acta (BBA) Mol Cell Res. 2021;1868:118984. doi: 10.1016/j.bbamcr.2021.118984. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Alberti S, Hyman AA. Are aberrant phase transitions a driver of cellular aging? BioEssays. 2016;38:959–968. doi: 10.1002/bies.201600042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Garaizar A, Espinosa JR, Joseph JA, et al. Aging can transform single-component protein condensates into multiphase architectures. Proc Nat Acad Sci. 2022;104:5925–5930. doi: 10.1073/pnas. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Guillén-Boixet J, Kopach A, Holehouse AS, et al. RNA-induced conformational switching and clustering of G3BP drive stress granule assembly by condensation. Cell. 2020;181:346–361.e17. doi: 10.1016/j.cell.2020.03.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Lechler MC, David DC. More stressed out with age? Check your RNA granule aggregation. Prion. 2017;11:313–322. doi: 10.1080/19336896.2017.1356559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Verdile V, De Paola E, Paronetto MP. Aberrant Phase Transitions: Side Effects and Novel Therapeutic Strategies in Human Disease. Front Genet. 2019;10:173. doi: 10.3389/fgene.2019.00173. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Shiina N. Liquid- and solid-like RNA granules form through specific scaffold proteins and combine into biphasic granules. J Biol Chem. 2019;294:3532–3548. doi: 10.1074/jbc.RA118.005423. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Tüű-Szabó B, Hoffka G, Duro N, Fuxreiter M. Altered dynamics may drift pathological fibrillization in membraneless organelles. Biochim Biophys Acta Proteins Proteom. 2019;1867:988–998. doi: 10.1016/j.bbapap.2019.04.005. [DOI] [PubMed] [Google Scholar]
- 100.Cao X, Jin X, Liu B. The involvement of stress granules in aging and aging-associated diseases. Aging Cell. 2020;19:e13136. doi: 10.1111/acel.13136. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Maharjan N, Künzli C, Buthey K, Saxena S. C9ORF72 regulates stress granule formation and its deficiency impairs stress granule assembly, hypersensitizing cells to stress. Mol Neurobiol. 2017;54:3062–3077. doi: 10.1007/s12035-016-9850-1. [DOI] [PubMed] [Google Scholar]
- 102.Chitiprolu M, Jagow C, Tremblay V, et al. A complex of C9ORF72 and p62 uses arginine methylation to eliminate stress granules by autophagy. Nat Commun. 2018 doi: 10.1038/s41467-018-05273-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Lechler MC, Crawford ED, Groh N, et al. Reduced insulin/IGF-1 signaling restores the dynamic properties of key stress granule proteins during aging. Cell Rep. 2017;18:454–467. doi: 10.1016/j.celrep.2016.12.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Portz B, Lee BL, Shorter J. FUS and TDP-43 phases in health and disease. Trends Biochem Sci. 2021;46:550–563. doi: 10.1016/j.tibs.2020.12.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Heinze I, Bens M, Calzia E, et al. Species comparison of liver proteomes reveals links to naked mole-rat longevity and human aging. BMC Biol. 2018 doi: 10.1186/s12915-018-0547-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Ubaida-Mohien C, Lyashkov A, Gonzalez-Freire M, et al. Discovery proteomics in aging human skeletal muscle finds change in spliceosome, immunity, proteostasis and mitochondria. Elife. 2019;8:e49874. doi: 10.7554/eLife.49874.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Altschup SF, Gish W, Miller W, et al. Basic Local Alignment Search Tool. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
- 109.Pettersen EF, Goddard TD, Huang CC, et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 2021;30:70–82. doi: 10.1002/pro.3943. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Tacutu R, Thornton D, Johnson E, et al. Human ageing genomic resources: new and updated databases. Nucleic Acids Res. 2018;46:D1083–D1090. doi: 10.1093/nar/gkx1042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25:25–29. doi: 10.1038/75556. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.The Gene Ontology Consortium The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021;49:D325–D334. doi: 10.1093/nar/gkaa1113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Craig T, Smelick C, Tacutu R, et al. The Digital Ageing Atlas: integrating the diversity of age-related changes into a unified resource. Nucleic Acids Res. 2015;43:D873–D878. doi: 10.1093/nar/gku843. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Liu GH, Bao Y, Qu J, et al. Aging Atlas: a multi-omics database for aging biology. Nucleic Acids Res. 2021;49:D825–D830. doi: 10.1093/nar/gkaa894. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Hühne R, Thalheim T, Sühnel J. AgeFactDB—The JenAge Ageing Factor Database—towards data integration in ageing research. Nucleic Acids Res. 2014 doi: 10.1093/nar/gkt1073. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Johnson AA, Shokhirev MN, Wyss-Coray T, Lehallier B. Systematic review and analysis of human proteomics aging studies unveils a novel proteomic aging clock and identifies key processes that change with age. Ageing Res Rev. 2020;60:101070. doi: 10.1016/j.arr.2020.101070. [DOI] [PubMed] [Google Scholar]
- 118.Peters MJ, Joehanes R, Pilling LC, et al. The transcriptional landscape of age in human peripheral blood. Nat Commun. 2015;6:1–14. doi: 10.1038/ncomms9570. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Lu T, Pan Y, Kao S-Y, et al. Gene regulation and DNA damage in the ageing human brain. Nature. 2004;429:883–891. doi: 10.1038/nature02661. [DOI] [PubMed] [Google Scholar]
- 120.Brehme M, Voisine C, Rolland T, et al. A chaperome subnetwork safeguards proteostasis in aging and neurodegenerative disease. Cell Rep. 2014;9:1135–1150. doi: 10.1016/j.celrep.2014.09.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Zeng YI, Nie C, Min J, et al. Novel loci and pathways significantly associated with longevity. Sci Rep. 2016;6:1–13. doi: 10.1038/srep21243. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Apweiler R, Bairoch A, Wu CH, et al. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2004;32:D115–D119. doi: 10.1093/nar/gkh131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Kerepesi C, Daróczy B, Sturm Á, et al. Prediction and characterization of human ageing-related proteins by using machine learning. Sci Rep. 2018;8:1–13. doi: 10.1038/s41598-018-22240-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Dayhoff GW, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci. 2022;31:e4496. doi: 10.1002/pro.4496. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Li X, Romero P, Rani M, et al. Predicting protein disorder for N-, C-and internal regions. Genome Inform. 1999;10:30–40. [PubMed] [Google Scholar]
- 126.Peng K, Radivojac P, Vucetic S, et al. Length-dependent prediction of protein intrinsic disorder. BMC Bioinform. 2006;7:1–17. doi: 10.1186/1471-2105-7-208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Radivojac P, Obradović Z, Brown CJ, Dunker AK (2002) Prediction of boundaries between intrinsically ordered and disordered protein regions. In: Biocomputing 2003. World Scientific, Singapore, pp 216–227 [PubMed]
- 128.Dosztanyi Z, Csizmok V, Tompa P, Simon I. The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins. J Mol Biol. 2005;347:827–839. doi: 10.1016/j.jmb.2005.01.071. [DOI] [PubMed] [Google Scholar]
- 129.Huang F, Oldfield C, Meng J et al (2012) Subclassifying disordered proteins by the CH-CDF plot method. In: Biocomputing 2012. World Scientific, Singapore, pp 128–139 [PubMed]
- 130.van Bibber NW, Haerle C, Khalife R, et al. Intrinsic disorder in tetratricopeptide repeat proteins. Int J Mol Sci. 2020;21:3709. doi: 10.3390/ijms21103709. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Uversky VN, Gillespie JR, Fink AL. Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins: Struct Funct Bioinform. 2000;41:415–427. doi: 10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7. [DOI] [PubMed] [Google Scholar]
- 132.Oldfield CJ, Cheng Y, Cortese MS, et al. Comparing and combining predictors of mostly disordered proteins. Biochemistry. 2005;44:1989–2000. doi: 10.1021/bi047993o. [DOI] [PubMed] [Google Scholar]
- 133.Mohan A, Sullivan WJ, Jr, Radivojac P, et al. Intrinsic disorder in pathogenic and non-pathogenic microbes: discovering and analyzing the unfoldomes of early-branching eukaryotes. Mol Biosyst. 2008;4:328–340. doi: 10.1039/b719168e. [DOI] [PubMed] [Google Scholar]
- 134.Huang F, Oldfield CJ, Xue B, et al. Improving protein order-disorder classification using charge-hydropathy plots. BMC Bioinform. 2014;15:1–13. doi: 10.1186/1471-2105-15-S17-S4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 135.Dosztányi Z, Mészáros B, Simon I. ANCHOR: web server for predicting protein binding regions in disordered proteins. Bioinformatics. 2009;25:2745–2746. doi: 10.1093/bioinformatics/btp518. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 136.Oates ME, Romero P, Ishida T, et al. D2P2: database of disordered protein predictions. Nucleic Acids Res. 2012;41:D508–D516. doi: 10.1093/nar/gks1226. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 137.Conchillo-Solé O, de Groot NS, Avilés FX, et al. AGGRESCAN: a server for the prediction and evaluation of “hot spots” of aggregation in polypeptides. BMC Bioinform. 2007;8:1–17. doi: 10.1186/1471-2105-8-65. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.de Groot NS, Pallarés I, Avilés FX, et al. Prediction of “hot spots” of aggregation in disease-linked polypeptides. BMC Struct Biol. 2005;5:1–15. doi: 10.1186/1472-6807-5-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 139.Walsh I, Seno F, Tosatto SCE, Trovato A. PASTA 2.0: an improved server for protein aggregation prediction. Nucleic Acids Res. 2014 doi: 10.1093/nar/gku399. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Chu X, Sun T, Li Q, et al. Prediction of liquid–liquid phase separating proteins using machine learning. BMC Bioinform. 2022;23:1–13. doi: 10.1186/s12859-022-04599-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 141.Hardenberg M, Horvath A, Ambrus V, et al. Widespread occurrence of the droplet state of proteins in the human proteome. Proc Natl Acad Sci. 2020;117:33254–33262. doi: 10.1073/pnas.2007670117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 142.Vendruscolo M, Fuxreiter M. Sequence determinants of the aggregation of proteins within condensates generated by liquid–liquid phase separation. J Mol Biol. 2022;434:167201. doi: 10.1016/j.jmb.2021.167201. [DOI] [PubMed] [Google Scholar]
- 143.Hatos A, Tosatto SCE, Vendruscolo M, Fuxreiter M. FuzDrop on AlphaFold: visualizing the sequence-dependent propensity of liquid–liquid phase separation and aggregation of proteins. Nucleic Acids Res. 2022;50:W337–W344. doi: 10.1093/nar/gkac386. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 144.McCoy Vernon R, Andrew Chong P, Tsang B, et al. Pi-Pi contacts are an overlooked protein feature relevant to phase separation. Elife. 2018;7:e31486. doi: 10.7554/eLife.31486.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Thomas PD, Campbell MJ, Kejariwal A, et al. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003;13:2129–2141. doi: 10.1101/gr.772403. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 146.Shannon P, Markiel A, Ozier O, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Szklarczyk D, Gable AL, Lyon D, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:D607–D613. doi: 10.1093/nar/gky1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 148.Maere S, Heymans K, Kuiper M. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics. 2005;21:3448–3449. doi: 10.1093/bioinformatics/bti551. [DOI] [PubMed] [Google Scholar]
- 149.Uversky VN. A protein-chameleon: conformational plasticity of α-synuclein, a disordered protein involved in neurodegenerative disorders. J Biomol Struct Dyn. 2003;21:211–234. doi: 10.1080/07391102.2003.10506918. [DOI] [PubMed] [Google Scholar]
- 150.Uversky VN. Targeting intrinsically disordered proteins in neurodegenerative and protein dysfunction diseases: another illustration of the D2 concept. Expert Rev Proteom. 2010;7:543–564. doi: 10.1586/epr.10.36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 151.Uversky VN. α-Synuclein misfolding and neurodegenerative diseases. Curr Protein Pept Sci. 2008;9:507–540. doi: 10.2174/138920308785915218. [DOI] [PubMed] [Google Scholar]
- 152.Uversky VN (2009) Intrinsic disorder in proteins associated with neurodegenerative diseases. In: Ovádi, J., Orosz, F. (eds) Protein Folding and Misfolding: Neurodegenerative Diseases. Focus on Structural Biology, vol 7. Springer, Dordrecht, pp 21–75
- 153.Uversky VN, Oldfield CJ, Dunker AK. Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys. 2008;37:215–246. doi: 10.1146/annurev.biophys.37.032807.125924. [DOI] [PubMed] [Google Scholar]
- 154.Breydo L, Wu JW, Uversky VN. α-Synuclein misfolding and Parkinson’s disease. Biochim Biophys Acta (BBA) Mol Basis Dis. 2012;1822:261–285. doi: 10.1016/j.bbadis.2011.10.002. [DOI] [PubMed] [Google Scholar]
- 155.Uversky VN. The triple power of D (3): protein intrinsic disorder in degenerative diseases. Front Biosci (Landmark Ed) 2014;19:181–258. doi: 10.2741/4204. [DOI] [PubMed] [Google Scholar]
- 156.Uversky VN. Looking at the recent advances in understanding α-synuclein and its aggregation through the proteoform prism. F1000Res. 2017;6:525. doi: 10.12688/f1000research.10536.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157.Coskuner-Weber O, Mirzanli O, Uversky VN. Intrinsically disordered proteins and proteins with intrinsically disordered regions in neurodegenerative diseases. Biophys Rev. 2022;14:679–707. doi: 10.1007/s12551-022-00968-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 158.DeForte S, Uversky VN. Not an exception to the rule: the functional significance of intrinsically disordered protein regions in enzymes. Mol Biosyst. 2017;13:463–469. doi: 10.1039/C6MB00741D. [DOI] [PubMed] [Google Scholar]
- 159.Haynes C, Oldfield CJ, Ji F, et al. Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes. PLoS Comput Biol. 2006;2:e100. doi: 10.1371/journal.pcbi.0020100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 160.Lehallier B, Gate D, Schaum N, et al. Undulating changes in human plasma proteome profiles across the lifespan. Nat Med. 2019;25:1843–1850. doi: 10.1038/s41591-019-0673-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 161.de Cabo R, Carmona-Gutierrez D, Bernier M, et al. The search for antiaging interventions: from elixirs to fasting regimens. Cell. 2014;157:1515–1526. doi: 10.1016/j.cell.2014.05.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 162.Cermakova K, Hodges HC. Interaction modules that impart specificity to disordered protein. Trends Biochem Sci. 2023;5:477–490. doi: 10.1016/j.tibs.2023.01.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 163.Wang C, Uversky VN, Kurgan L. Disordered nucleiome: abundance of intrinsic disorder in the DNA- and RNA-binding proteins in 1121 species from Eukaryota, Bacteria and Archaea. Proteomics. 2016;16:1486–1498. doi: 10.1002/pmic.201500177. [DOI] [PubMed] [Google Scholar]
- 164.Zhao B, Katuwawala A, Uversky VN, Kurgan L. IDPology of the living cell: intrinsic disorder in the subcellular compartments of the human cell. Cell Mol Life Sci. 2021;78:2371–2385. doi: 10.1007/s00018-020-03654-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 165.Lin MT, Beal MF. Mitochondrial dysfunction and oxidative stress in neurodegenerative diseases. Nature. 2006;443:787–795. doi: 10.1038/nature05292. [DOI] [PubMed] [Google Scholar]
- 166.Babinchak WM, Surewicz WK. Liquid–liquid phase separation and its mechanistic role in pathological protein aggregation. J Mol Biol. 2020;432:1910–1925. doi: 10.1016/j.jmb.2020.03.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 167.Ahmad A, Uversky VN, Khan RH. Aberrant liquid–liquid phase separation and amyloid aggregation of proteins related to neurodegenerative diseases. Int J Biol Macromol. 2022;220:703–720. doi: 10.1016/j.ijbiomac.2022.08.132. [DOI] [PubMed] [Google Scholar]
- 168.Tartaglia GG, Vendruscolo M. Proteome-level interplay between folding and aggregation propensities of proteins. J Mol Biol. 2010;402:919–928. doi: 10.1016/j.jmb.2010.08.013. [DOI] [PubMed] [Google Scholar]
- 169.Statello L, Guo C-J, Chen L-L, Huarte M. Gene regulation by long non-coding RNAs and its biological functions. Nat Rev Mol Cell Biol. 2021;22:96–118. doi: 10.1038/s41580-020-00315-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 170.Gadhave K, Kumar D, Uversky VN, Giri R. A multitude of signaling pathways associated with Alzheimer’s disease and their roles in AD pathogenesis and therapy. Med Res Rev. 2021;41:2689–2745. doi: 10.1002/med.21719. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 171.Coskuner O, Uversky VN. Intrinsically disordered proteins in various hypotheses on the pathogenesis of Alzheimer’s and Parkinson’s diseases. Prog Mol Biol Transl Sci. 2019;166:145–223. doi: 10.1016/bs.pmbts.2019.05.007. [DOI] [PubMed] [Google Scholar]
- 172.Salahuddin P, Fatima MT, Uversky VN, et al. The role of amyloids in Alzheimer’s and Parkinson’s diseases. Int J Biol Macromol. 2021;190:44–55. doi: 10.1016/j.ijbiomac.2021.08.197. [DOI] [PubMed] [Google Scholar]
- 173.Darling AL, Breydo L, Rivas EG, et al. Repeated repeat problems: combinatorial effect of C9orf72-derived dipeptide repeat proteins. Int J Biol Macromol. 2019;127:136–145. doi: 10.1016/j.ijbiomac.2019.01.035. [DOI] [PubMed] [Google Scholar]
- 174.Tejedor AR, Sanchez-Burgos I, Estevez-Espinosa M, et al. Protein structural transitions critically transform the network connectivity and viscoelasticity of RNA-binding protein condensates but RNA can prevent it. Nat Commun. 2022 doi: 10.1038/s41467-022-32874-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 175.Ganne A, Balasubramaniam M, Ayyadevara S, Shmookler Reis RJ. Machine-learning analysis of intrinsically disordered proteins identifies key factors that contribute to neurodegeneration-related aggregation. Front Aging Neurosci. 2022;14:938117. doi: 10.3389/fnagi.2022.938117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 176.Mukherjee P, Panda P, Kasturi P. A comparative meta-analysis of membraneless organelle-associated proteins with age related proteome of C. elegans. Cell Stress Chaperones. 2022;27:619–631. doi: 10.1007/s12192-022-01299-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
The authors confirm that the data supporting the findings of this study are available within the article and its electronic supplementary materials. Additional data are available from the authors on reasonable request.