Abstract
Plants have acquired the ability to adapt and respond to varying environmental conditions through modifications in their developmental programs. This adaptability relies on the plant’s capacity to sense environmental cues and respond via diverse signal transduction pathways and transcriptional regulation. Transcription factors are central in these processes, orchestrating specific gene expression in both developmental and stress responses. In Arabidopsis thaliana, 91% of transcription factors contain large intrinsically disordered regions (IDRs). The structural flexibility in these regions is critical in protein-protein interactions and contributes to functional versatility across different cell types. MADS-domain transcription factors constitute a eukaryotic protein family involved in a diversity of developmental processes and stress responses. Using bioinformatic tools, we found that most Arabidopsis MADS-domain proteins contain IDRs (≥30 residues) in their C-terminal region, with a higher proportion of global disorder in Type II compared to Type I MADS-domain proteins. Remarkably, orthologous proteins from non-plant species in the Eukarya domain (Drosophila melanogaster, Saccharomyces cerevisiae, and Homo sapiens) also present disordered C-terminal regions, containing longer IDRs than those found in Arabidopsis or the other plant species analyzed. Additionally, conserved motifs were identified within the C-terminal IDRs of Arabidopsis Type I and Type II MADS-domain proteins, suggesting interactions with co-regulatory partners. We also identified putative activation domains in the C-terminal region of Type I and Type II MADS-domain proteins. The involvement of IDRs in selecting co-regulators is further supported by the identification of Molecular Recognition Features (MoRFs) in Type II MADS-domain proteins.
The conserved structural disorder in the C-terminal region of MADS-domain proteins, which includes specific motifs, across diverse domains of life provides valuable insights into their structural properties and mechanisms of action as transcriptional regulators.
Introduction
Plants have developed a wide range of mechanisms to adapt to changing environmental conditions. Their ability to sense various signals and respond through diverse and complex signaling pathways that achieve accurate gene regulation allows them to exhibit phenotypic plasticity. This plasticity enables plants to optimize their growth and development in response to factors such as light, temperature, water availability, nutrient levels, and biotic stresses. Remarkably, different plant species share plastic traits associated with their specific habitats (e.g., [1]), highlighting the adaptive role of plastic responses and underscoring the functional role of the genes involved and their regulation [2].
Transcription factors (TFs) are central regulators in all organisms, playing key roles in diverse biological processes, such as growth, differentiation, hormone signaling, stress responses, and immune defense. By integrating signals from different pathways, TFs orchestrate the complex regulatory networks that underlie plant development and adaptation, enabling plants to respond dynamically to changing environmental conditions [3–9].
MADS-domain TFs belong to a eukaryotic protein family that participates in several developmental processes and stress responses [10,11]. The name MADS-box is derived from the four founding members of this family: MINICHROMOSOME MAINTENANCE 1 (MCM1) in yeast, AGAMOUS (AG) in Arabidopsis thaliana (from now on Arabidopsis), DEFICIENS (DEF) in Antirrhinum majus (snapdragon), and SERUM RESPONSE FACTOR (SRF) in humans [12]. The MADS (M) domain is highly conserved and present in all members of the family and is typically located in the N-terminal region of the protein. However, there are exceptions to this pattern, such as in SRF in humans and MCM1 in yeast, where the MADS domain is found in different positions within the protein [13,14]. While some organisms have just a few MADS-box genes, such as Saccharomyces cerevisiae (yeast) with four, Drosophila melanogaster with two, and Homo sapiens with five [15–18], plants exhibit a considerably higher abundance of these genes (52–167 in 21 angiosperms) [19–22]. MADS-domain proteins are classified into two main types: Type I (Serum Response Factor (SRF)-like), and Type II (Myocyte Enhancer Factor (MEF)-like) based on their sequence characteristics, genomic organization, and functional roles [23,24]. In yeast, MADS-domain proteins are involved in key processes such as arginine metabolism, osmotic stress response, and mating type regulation. In Drosophila, MEF2 genes participate in muscle differentiation, while in humans the different MEF2 genes also contribute to heart and neural development, and the response to various diseases [13,16,25–27]. In plants, MADS-domain proteins play several roles in development, encompassing the transition to flowering, flower organ development, fruit ripening, root development, and vegetative phase change, among others [24]. 
In Arabidopsis, Type I MADS-domain proteins have been primarily implicated in female gametogenesis and seed development, while Type II MADS-domain proteins play central roles in controlling floral organ identity and flower development in angiosperms and are involved in almost all Arabidopsis developmental processes [24,28]. Type I MADS-domain proteins typically consist of three domains: M, I (intervening), and C (C-terminal domain), whereas Type II proteins possess four domains, forming the acronym MIKC [29–33]. In both cases, the “M” domain is responsible for DNA binding but also participates in protein-protein interactions. MADS-domain proteins bind to a specific DNA sequence known as the “CArG” box located in the regulatory regions of target genes. The “I” domain is located between the “M” and the “K” domains. While it is less conserved than the “M” domain, it also participates in DNA-binding and dimerization specificity [33,34]. The “K” domain derives its name from its structural resemblance to proteins known as keratins. This domain is involved in protein-protein interactions and contributes to the formation of higher-order complexes. Finally, the “C” domain is located at the C-terminal of the protein and exhibits variations in length and sequence among different MADS-domain proteins. For some MADS-domain proteins, it has been shown that this domain participates in transactivation and in the formation of higher-order complexes [35].
A phylogenetic analysis of 107 Arabidopsis MADS-box sequences revealed that these genes can be grouped into two main lineages (Type I and Type II) or five subfamilies: Mα, Mβ, and Mγ, and MIKCc and MIKC*. The Mα, Mβ, and Mγ, subfamilies belong to Type I MADS-domain proteins whereas the MIKCc and MIKC* subfamilies are classified as Type II [24,36]. The genomic distribution of MIKC genes, along with evidence from genome history, suggests that these genes existed before the Arabidopsis genome polyploidization event and are distinct from the Mα, Mβ, and Mγ subfamilies [23,36].
In Arabidopsis, 91% of TFs are characterized by large intrinsically disordered regions (IDRs), crucial for diverse cellular functions [37]. IDRs are segments of proteins that lack a fixed or stable three-dimensional structure under physiological conditions. Their unique physicochemical properties arise from the specific nature of their amino acid sequences and the characteristics of the individual amino acid residues within these sequences. Intrinsically disordered proteins (IDPs) and/or IDRs often lack a significant number of hydrophobic residues typically associated with folded protein domains whereas they are enriched in amino acids that allow their flexibility [38]. The IDP amino acid sequences enable a structural dynamic that depends on the physicochemical properties of their environment and the interactions with other molecules [39,40]. This flexibility plays a pivotal role in mediating IDP multiple protein-protein interactions and in their participation in different developmental processes, such as cell cycle, transcriptional control, and responses to different stress conditions [41,42]. However, only a limited number of TF families have been analyzed, highlighting the presence of IDRs and their association with their role in their respective functions [43–47].
Interestingly, IDRs often contain specific sequences known as Molecular Recognition Features (MoRFs); these short, transiently folding sequences play a key role in determining site-partner specificity [48,49]. MoRFs show distinctive physicochemical characteristics that may aid in protein interaction through the hydrophobicity of their amino acid composition (rich in proline and methionine) and the hydrophilic nature of IDRs [50,51]. The charge of individual amino acids and their electrostatic interactions affect the conformational structure of the protein, which in turn affects its binding specificity and stability. This has led to the hypothesis that MoRFs are key participants in the multi-partner binding capacity of hub proteins [52–54]. The dynamic interconnectivity of IDPs/IDRs has also been associated with their ability to aggregate through liquid-liquid phase separation (LLPS), a process that plays a critical role in protein and RNA organization [55]. LLPS is now recognized as a fundamental organizational and regulatory principle across all organisms. It enables the concentration of specific proteins and nucleic acids into biomolecular condensates, i.e., membrane-less organelles that regulate cellular activity by localizing particular proteins, accelerating enzymatic reactions, and favoring selective interactions. This dynamic regulation supports the precise control of a diversity of processes involved in growth, development, and responses to environmental changes and pathogens [56].
In this work, we used bioinformatic tools and publicly available datasets to investigate the presence of IDRs in MADS-domain proteins from plants and other taxa. We found that Arabidopsis Type I and Type II MADS-domain TFs present a high level of disorder, ranging from 20 to 80%, with longer IDRs in their C-terminal domain. This characteristic was found to be conserved in orthologs of these proteins from other plant species as well as from Drosophila melanogaster, Saccharomyces cerevisiae, and Homo sapiens. Of note, within the C-terminal region of proteins in both the SOC1 and the FLC clades [23,24], we found two different motifs common to all members of the SOC1 and FLC clades, suggesting functional restriction and phylogenetic conservation within these groups. Interestingly, the SOC1 motif is conserved across SOC1 orthologs from different plant species, and in other Arabidopsis Type II MADS-domain proteins. Many Arabidopsis MADS-domain proteins also contain MoRFs that overlap with or are located just a few amino acids away from the SOC1 and the FLC motifs in the last residues of the C-terminal region. We mapped putative activation domains (ADs) across all clades of Type I and Type II MADS-domain proteins on their C-terminal region. Interestingly, Type I showed a higher proportion of ADs compared to Type II MADS-domain proteins. Finally, our in silico analysis predicts that some MADS-domain proteins may form condensates, potentially leading to different conformational arrangements and expanding their functional roles. The information in this work adds valuable insights to better understand the MADS TFs molecular mechanisms involved in the control of plant growth, morphogenesis, and responses to environmental changes.
Materials and methods
Complete sequence retrieval and domain description
Individual protein sequences of Arabidopsis thaliana were retrieved from the TAIR database [57], except for STK (AGL11), which was retrieved from UniProt (Q38836). Protein domains were identified using the UniProt database [58]. The I domain was defined based on the AtSOC1 protein obtained from the supplementary information in Lai et al. (2021) [33] for both Type I and Type II MADS-domain proteins. In this study, the C-terminal region was defined as the sequence located downstream of the I domain (for Type I MADS-domain proteins) or downstream of the K domain (for Type II MADS-domain proteins). Protein sequences from other plants and non-plant organisms were retrieved from GenBank, the Rice Annotation Project DataBase, and UniProt [58–60] (S1 Table). A total of 143 sequences were collected, including 100 from Arabidopsis, nine from rice, and at least one homolog from basal plants and other angiosperms (S1 Table). The selection of the Type I MADS-domain proteins was based on Bemer et al. (2010) [28], while the selection of rice sequences covering both types of MADS-domain protein orthologs was based on Arora et al. (2007) [61].
Sequence alignment and phylogenetic analysis
After retrieving the individual protein sequences, we aligned them with the MAFFT algorithm at the MAFFT web server [62,63]. The parameters used were L-INS-i and the mafft-homolog function with UniRef. After aligning the 143 sequences, we visually inspected the alignment for misaligned or ambiguous regions with the AliView program [64]. Using this alignment, we determined the position of the I domain for Type I and Type II MADS-domain proteins. The M and the K domains for all MADS-domain proteins were retrieved from UniProt. This approach allowed us to accurately map the C-terminal region along with its corresponding IDRs. By assigning approximately 190 amino acid residues up to the K-domain, we found a better alignment of conserved regions among the selected protein sequences. The alignment with the MIK domains was used to recover a phylogeny by applying a Maximum Likelihood approach with the RAxML algorithm, following a JTT substitution model, at the CIPRES website [65–68].
Motif and Molecular Recognition Features (MoRFs) identification
The MEME algorithm [69,70] was used to identify motifs within the whole protein sequence or the C-terminal regions. Molecular Recognition Features (MoRFs) are small IDRs potentially involved in the initial events of molecular recognition during protein-protein interactions. These regions undergo disorder-to-order transitions upon binding. MoRFs were identified in the complete protein sequences using the fMoRFpred tool that uses the physicochemical properties of amino acid residues to fit a Support Vector Machine (SVM) model for predicting the presence of MoRFs within IDRs [49,71]. To map the predicted MoRFs, their locations were aligned across the protein sequences alongside the MEME motifs.
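The mapping step described above amounts to comparing residue intervals. The following Python sketch shows one way to flag MoRFs that overlap a motif or lie within a few residues of it. It is an illustration only: the function name, the 1-based inclusive coordinate convention, the `max_gap` parameter, and the toy coordinates are assumptions and are not taken from the fMoRFpred or MEME outputs.

```python
def morfs_near_motifs(morfs, motifs, max_gap=0):
    """Pair each MoRF with every motif it overlaps or lies close to.

    morfs   : list of (start, end) tuples, 1-based inclusive (assumed convention)
    motifs  : list of (name, start, end) tuples, same convention
    max_gap : largest number of residues allowed between the two intervals;
              0 keeps overlapping or directly adjacent intervals only
    """
    pairs = []
    for morf_start, morf_end in morfs:
        for name, motif_start, motif_end in motifs:
            # Residues separating the intervals; negative when they overlap.
            gap = max(morf_start, motif_start) - min(morf_end, motif_end) - 1
            if gap <= max_gap:
                pairs.append(((morf_start, morf_end), name))
    return pairs


# Hypothetical coordinates, for illustration only.
toy_motifs = [("motif_4", 195, 213), ("motif_5", 50, 70)]
toy_morfs = [(200, 210), (216, 219)]
overlapping = morfs_near_motifs(toy_morfs, toy_motifs)
nearby = morfs_near_motifs(toy_morfs, toy_motifs, max_gap=3)
```

With these toy values, only the first MoRF is reported when `max_gap` is 0, and both are reported when up to three intervening residues are tolerated.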
Mapping of activation domains within the C-terminal region of MADS-domain proteins
Potential activation domains (ADs) in MADS-domain proteins were retrieved directly from Morffy et al. (2024) [72] (S2 Table). In that study, the authors experimentally identified activation domains in various plant TFs using a comprehensive library. This library consisted of overlapping 40 amino acid fragments, spanning the entire set of plant TFs, with a step size of 10 amino acids, resulting in a total of 68,441 fragments. These fragments were screened in yeast to assess their transcriptional activation capacity. Based on this experiment and subsequent normalization, a Plant Activation Domain Identification (PADI) score was assigned to each fragment. Among the fragments showing transcriptional activation activity, some corresponded to MADS-domain TFs. The PADI scores included in our manuscript are those reported in that study, and the localization of ADs within MADS-domain TFs was inferred from the results obtained in Morffy et al. (2024) [72], where the authors applied a neural network-based algorithm known as the transcriptional activation domain activity (TADA) network. Their work integrated multiple layers of analysis, including the construction of a feature matrix and the use of methods to assess the impact of both individual input features and broader local and global interactions predicted by TADA. Additionally, these results were further analyzed using deep learning to identify key properties relevant to the prediction of ADs. These authors also applied a Shapley additive explanations (SHAP) analysis to capture non-linear and linear correlations, thereby uncovering complex patterns. This was followed by additional deep-learning steps that culminated in the development of a tool capable of predicting potential ADs. Using the Morffy et al. (2024) [72] dataset, we filtered the information for MADS-domain proteins to localize the “NOT-AD” and the “AD” fragments within their C-terminal IDR.
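To make the fragment layout concrete, the sketch below tiles a sequence into 40-residue windows with a step of 10 residues, as described for the library. It is a minimal illustration rather than the procedure of Morffy et al. (2024): the function name is hypothetical, and the handling of sequences shorter than one window and of the final C-terminal window is an assumption made here.

```python
def tile_fragments(sequence, width=40, step=10):
    """Return (start, fragment) tuples, with 1-based start positions.

    Assumptions made for this sketch (not taken from the source study):
    a sequence shorter than one window is returned whole, and a final
    window is appended so that the C-terminal end is always covered.
    """
    if len(sequence) <= width:
        return [(1, sequence)]
    last_start = len(sequence) - width
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, sequence[s:s + width]) for s in starts]


# Toy 65-residue sequence: regular windows start at 1, 11, and 21,
# plus one extra window at 26 that covers the C-terminal end.
toy_fragments = tile_fragments("A" * 65)
```

Fragments that fall inside a C-terminal IDR can then be selected by comparing their start and end positions with the IDR boundaries.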
IDR prediction and structural disorder score of the complete proteins
The majority of Arabidopsis IDRs were retrieved from the AlphaFold section of the MobiDB database [73] (accessed in November 2023). For the remaining proteins, IDR prediction was conducted with the AlphaFold Colab Notebook with default values or directly in the AlphaFold web server [74] (accessed in January 2024). The resulting structures were visually inspected using the PDB file together with the Predicted Aligned Error (PAE) of the proteins in the ChimeraX molecular visualizer [75]. We also checked the IDRs for each prediction in Jalview and accounted for the regions with low values in the temperature factor field, which correspond to low pLDDT values [76]. Disordered regions predicted by AlphaFold are those with a pLDDT < 50, usually seen as ribbon-like structures [77,78]. The structural disorder analysis was performed using the RIDAO platform [79], which includes various intrinsic disorder predictors [79,80]. Protein sequences in FASTA format were used as input for the analysis; the amino acid sequence of the C-terminal region was manually extracted from the original full-length protein sequences and formatted in FASTA. The RIDAO platform provides two key metrics for each protein: the Average Disorder Score (ADS) and the Percent of Predicted Intrinsic Disorder Residues (PPIDR). The ADS indicates the overall propensity of a protein to be intrinsically disordered, allowing a comprehensive analysis of structural disorder. The PPIDR represents the proportion of amino acids predicted to be disordered, considering those with a significant intrinsic disorder score (> 0.5), relative to the total number of residues in the protein [81].
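The two thresholds given above (pLDDT < 50 for AlphaFold-based disorder and a per-residue score > 0.5 for PPIDR) can be illustrated with a short Python sketch. It is not the RIDAO or MobiDB implementation: the function names and toy values are hypothetical, and the 30-residue minimum length is applied here only because that is the IDR length used for counting in this study.

```python
def disorder_metrics(scores, threshold=0.5):
    """Return (ADS, PPIDR) from per-residue disorder scores.

    ADS is the mean score; PPIDR is the percentage of residues whose
    score exceeds the threshold (0.5 in the text).
    """
    ads = sum(scores) / len(scores)
    ppidr = 100.0 * sum(score > threshold for score in scores) / len(scores)
    return ads, ppidr


def find_idrs(plddt, cutoff=50.0, min_len=30):
    """Return 1-based inclusive (start, end) runs with pLDDT below the cutoff
    that span at least min_len residues."""
    regions = []
    run_start = None
    for position, value in enumerate(plddt, start=1):
        if value < cutoff:
            if run_start is None:
                run_start = position
        else:
            if run_start is not None and position - run_start >= min_len:
                regions.append((run_start, position - 1))
            run_start = None
    # Close a run that extends to the last residue.
    if run_start is not None and len(plddt) - run_start + 1 >= min_len:
        regions.append((run_start, len(plddt)))
    return regions


# Toy pLDDT profile: ordered N-terminal part, a 35-residue low-confidence
# stretch, a short ordered segment, and a 10-residue low-confidence tail.
toy_plddt = [90.0] * 60 + [40.0] * 35 + [80.0] * 5 + [30.0] * 10
toy_idrs = find_idrs(toy_plddt)
toy_ads, toy_ppidr = disorder_metrics([0.2, 0.6, 0.8, 0.4])
```

In this toy profile only the 35-residue stretch is reported, because the 10-residue tail is shorter than the minimum length.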
Physicochemical properties of disordered proteins
Parameters related to the amino acid residue charge (NCPR: net charge per residue, and FCR: fraction of charged residues) and their distribution (patterning parameter kappa) [52] across the C-terminal regions of MADS-domain proteins were calculated using tools available in the CIDER web server [82,83].
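For reference, the two charge parameters can be written as FCR = (N+ + N−)/N and NCPR = (N+ − N−)/N, where N+ and N− are the numbers of positively and negatively charged residues and N is the sequence length. The Python sketch below computes both. It does not reproduce the CIDER code; it assumes that Lys and Arg count as positive, Asp and Glu as negative, and His as neutral, and it omits kappa, whose calculation is more involved.

```python
POSITIVE = "KR"  # assumption: His is treated as neutral
NEGATIVE = "DE"


def charge_parameters(sequence):
    """Return (FCR, NCPR) for an amino acid sequence in one-letter code."""
    sequence = sequence.upper()
    n_positive = sum(sequence.count(residue) for residue in POSITIVE)
    n_negative = sum(sequence.count(residue) for residue in NEGATIVE)
    length = len(sequence)
    fcr = (n_positive + n_negative) / length
    ncpr = (n_positive - n_negative) / length
    return fcr, ncpr


# Toy peptide with two positive and four negative residues out of ten.
toy_fcr, toy_ncpr = charge_parameters("KKDDEEAAGG")
```

For the toy peptide, FCR is 0.6 and NCPR is −0.2; an NCPR close to zero combined with an appreciable FCR is the polyampholyte situation discussed for the C-terminal IDRs.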
Post-translational modifications (phosphorylation)
Phosphorylation in serine, threonine, and tyrosine residues within the complete protein sequences was predicted using the NetPhos algorithm [84–86]. Additionally, experimentally validated data were obtained from the ATHENA [87] (accessed in February 2024) and EPSD [88] (accessed in May 2025) databases.
Condensate formation tendency
Condensate propensity was calculated for the complete MADS-domain protein sequences and for their C-terminal region sequences using the FuzDrop algorithm [89,90]. Protein sequences in FASTA format were used as input for the analysis; the amino acid sequence of the C-terminal region was manually extracted from the original full-length protein sequences and formatted in FASTA.
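In this work the C-terminal sequences were extracted manually. For readers who prefer a scripted equivalent, the Python sketch below slices the region downstream of the last annotated domain and writes it as a FASTA record. The function name, the header layout, and the toy boundary are hypothetical and serve only as an illustration.

```python
def c_terminal_fasta(name, sequence, last_domain_end, width=60):
    """Return a FASTA record for the region downstream of the last domain.

    last_domain_end is the 1-based position of the final residue of the
    I domain (Type I) or K domain (Type II), following the definition of
    the C-terminal region used in this study.
    """
    region = sequence[last_domain_end:]
    if not region:
        raise ValueError("no residues downstream of the given domain end")
    header = f">{name}_Cterm {last_domain_end + 1}-{len(sequence)}"
    lines = [region[i:i + width] for i in range(0, len(region), width)]
    return header + "\n" + "\n".join(lines) + "\n"


# Toy protein: 100 residues of domains followed by a 70-residue C-terminal region.
toy_record = c_terminal_fasta("TOY", "M" * 100 + "S" * 70, last_domain_end=100)
```

The resulting record can be pasted into a web server that accepts FASTA input, in the same way as the manually prepared sequences.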
Data analyses and image rendering
We evaluated differences in disorder (ADS and PPIDR) among MADS-domain proteins with a Wilcoxon test under a permutation approach, using the coin R package [91]. We also compared the raw number of phosphorylation sites among Arabidopsis MADS-domain protein types with a Chi-squared test. All the analyses and graphics were performed in the R program [92]. Protein diagrams were made with the drawProteins library from the Bioconductor repository, and the final rendering was done with the GIMP program [93] (version 3.0.4; free software, GPLv3 license). Statistical graphs were made with the tidyverse libraries and with libraries that work alongside ggplot2. All other R packages used here are cited in S3 Table.
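The idea behind a permutation Wilcoxon test can be sketched in a few lines of Python. This is a Monte Carlo illustration and not the coin implementation used in this study: the function names, the number of permutations, the seed, and the toy data are assumptions, and the p-values it returns will differ slightly from those obtained with coin.

```python
import random


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def permutation_rank_sum_test(x, y, n_perm=10000, seed=0):
    """Two-sided Monte Carlo p-value for the rank sum of group x."""
    ranks = average_ranks(list(x) + list(y))
    n_x = len(x)
    expected = n_x * (len(ranks) + 1) / 2
    observed = abs(sum(ranks[:n_x]) - expected)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)  # reassign ranks to the two groups at random
        if abs(sum(ranks[:n_x]) - expected) >= observed - 1e-12:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Toy data: completely separated groups versus interleaved groups.
p_separated = permutation_rank_sum_test(range(1, 9), range(9, 17))
p_interleaved = permutation_rank_sum_test([1, 3, 5, 7], [2, 4, 6, 8])
```

Completely separated groups give a small p-value, whereas interleaved groups give a large one, which is the behavior expected of the test.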
Results
The C-terminal domain of Arabidopsis MADS-domain transcription factor family presents structural disorder propensity
The function of MADS-domain proteins in development is conserved across diverse plant species and other taxa. Given the diversity of cell types and conditions in which these transcription factors regulate gene expression, structural disorder appears to be an advantageous property, enabling efficient and versatile functionality. In this study, we evaluated the occurrence of structural disorder in 58 Type I and 42 Type II MADS-domain proteins from Arabidopsis to gain insight into their protein structure and its relationship to their function. To evaluate the degree of disorder, the Average Disorder Score (ADS) and the Percent of Predicted Intrinsic Disorder Residues (PPIDR) were calculated in the RIDAO platform for both the complete proteins and their C-terminal regions. For Type I MADS-domain proteins, both the mean ADS and mean PPIDR of the complete proteins and their C-terminal regions are similar (Fig 1A and 1C, left panel). In contrast, for Type II MADS-domain proteins, both the mean ADS and mean PPIDR are larger for the C-terminal region than for the complete proteins (Fig 1A and 1C, right panel). Also, the relation between protein length and ADS or PPIDR values revealed a broader length distribution (100–450 amino acid residues) for Type I MADS-domain proteins and only a minor inverse correlation between protein length and global disorder (Fig 1B and 1D, left panel). In contrast, Type II proteins exhibited a narrower length range (200–300 amino acid residues) but followed a positive trend between length and disorder (Fig 1B and 1D, right panel). Moreover, Type II MADS-domain proteins present significantly higher disorder (ADS and PPIDR) than Type I proteins (permutation Wilcoxon test, [ADS] Z = −3.1497, p = 0.0016; [PPIDR] Z = −3.897, p = 1 × 10⁻⁴; Fig 1).
AlphaFold-based analysis of the structural disorder distribution revealed that this property is predominantly located in the C-terminal regions for both types: in Type I proteins this region follows the I-domain, while in Type II proteins it is found beyond the K-domain (Fig 2A, S4 Table). Notably, the longest IDRs in both types were consistently located within the C-terminal region, and the IDRs at the C-terminal region in Type I MADS-domain proteins were longer than in the Type II MADS-domain proteins (permutation Wilcoxon test, Z = 4.1552, p < 1 × 10⁻⁴, Fig 2, S4 Table). The number of IDRs containing at least 30 amino acid residues was similar between the two groups (S4 Table, Type I total count = 51, Type II total count = 42, χ2 = 0.87, df = 1, p = 0.351). Nevertheless, in nine Type I MADS-domain proteins (AGAMOUS-LIKE91 [AGL91], AGL29, AGL58, AGL64, AGL87, AGL101, AGL59, AGL102, AGL85) we only detected IDRs shorter than 30 amino acid residues, whereas all Type II MADS-domain proteins contain at least one IDR with ≥30 amino acids. Interestingly, most MADS-domain proteins present an IDR at the beginning of the N-terminal region, in the first residues of the M-domain.
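As an arithmetic check on the comparison of IDR counts, the Python sketch below computes a Chi-squared goodness-of-fit statistic for the two totals (51 and 42). It assumes that the expected counts are equal for the two types, which is an assumption of this sketch and is not stated in the text; under that assumption the calculation is consistent with the values reported above. The function names are hypothetical.

```python
import math


def chi_square_equal_expected(counts):
    """Goodness-of-fit statistic against equal expected counts (assumption)."""
    expected = sum(counts) / len(counts)
    return sum((count - expected) ** 2 / expected for count in counts)


def p_value_one_df(statistic):
    """Upper-tail probability of a Chi-squared variable with df = 1."""
    return math.erfc(math.sqrt(statistic / 2))


# Counts of IDRs with at least 30 residues: Type I = 51, Type II = 42.
statistic = chi_square_equal_expected([51, 42])
p_value = p_value_one_df(statistic)
```

With an expected count of 46.5 per type, the statistic is 2 × (4.5² / 46.5) ≈ 0.87, in line with the reported χ2 value and p-value.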
Fig 1. The Average Disorder Score (ADS) and Predicted Percentage of Intrinsically Disordered Residues (PPIDR) values in Type I and Type II MADS-domain proteins.
(A) Distribution of ADS in MADS-domain TFs. (B) Scatterplots showing the association between protein length and ADS. (C) Distribution of PPIDR values in MADS-domain TFs. (D) Scatterplots showing the association between protein length and PPIDR. Red dots indicate the mean of each value, and black dots indicate the raw values of either ADS or PPIDR per protein. Statistical analysis using the Wilcoxon test shows a significant difference between Type I and Type II proteins (p = 0.0002).
Fig 2. MIKC domains, disordered regions, and phosphorylation sites for MADS-domain proteins.
(A) Schematic representation of Type I and Type II MADS-domain proteins indicating their distinctive domains: MADS (M = yellow), Intervening (I = green), Keratin-like (K = blue), and Intrinsically Disordered Regions (IDR = red). Numbers below indicate the amino acid positions. Dots above the domains indicate putative phosphorylation sites. The columns to the right of each protein diagram indicate their corresponding values for PPIDR and ADS. Proteins are arranged in descending order according to their PPIDR. (B) Distribution of predicted phosphorylation sites in the N-terminal and C-terminal regions of Type I and Type II MADS-domain proteins. Whiskers indicate ±1.5 times the interquartile range (IQR) according to the Tukey method; the middle line denotes the median.
The flexibility and dimensions of IDRs depend on their amino acid composition, where the charge content and hydrophobicity are key factors. Because the composition of most IDPs includes positive and negative charges, some of their characteristics can be described by the fraction of charged residues (FCR) and net charge per residue (NCPR). Nevertheless, the best descriptors for IDP conformational properties are the FCR and the distribution of oppositely charged residues, defined by a patterning parameter named kappa (k) [52]. On average, polypeptides with low kappa-values (closer to 0) are predicted to adopt more extended conformations, while sequences with higher kappa-values (closer to 1) are expected to form more compact, hairpin-like structures [52]. The kappa-values obtained for the C-terminal of MADS-domain proteins showed that these regions in Type I and Type II proteins are prone to adopting extended conformations, likely due to the counterbalance of the interchain electrostatic interactions resulting from the more even distribution of oppositely charged residues. This result aligns with a mean NCPR close to zero and a mean FCR value at the boundary between weak and strong polyampholytes obtained for these IDRs (S1 Fig). Although this conformational analysis was applied to the IDRs in isolation from the rest of the protein and under certain conditions, which may influence their overall conformational properties, these correlations further reinforce the potential impact of these physicochemical properties on their structural organization and dimensions.
The C-terminal region of Arabidopsis MADS-domain proteins has a high propensity for phosphorylation
As phosphorylation is closely associated with protein disorder [94], we predicted the phosphorylation propensity of Type I and Type II MADS-domain proteins using the NetPhos algorithm. When comparing the C-terminal sequences of both MADS-domain protein types, we found that the Type II C-terminal region contains a number of predicted phosphorylation sites similar to that of the Type I C-terminal region (number of phosphorylation sites/C-terminal length, permutation Wilcoxon test, Z = −1.8163, p = 0.069) (Fig 2B). In contrast, Type I MADS-domain proteins exhibited a greater abundance of predicted phosphorylation sites in regions outside of their C-terminal region (Fig 2B).
To further investigate the significance of the phosphorylation sites between the N-terminal and the C-terminal regions of MADS-domain proteins, we examined experimentally validated phosphorylation sites in Arabidopsis using two actively curated databases, ATHENA and EPSD. Compared with the large number of predicted sites from NetPhos, experimentally confirmed sites are less represented in Arabidopsis MADS-domain proteins. For Type I MADS-domain proteins, we found 48 experimentally verified sites in the N-terminal region and 17 sites in the C-terminal region. For Type II MADS-domain proteins, there were 18 sites in the N-terminal and 13 in the C-terminal regions (S5 Table).
The disordered C-terminal is a conserved feature of MADS-domain proteins across diverse taxa
Type I and Type II Arabidopsis MADS-domain proteins contain two or three domains, respectively, which we grouped into the N-terminal region (Fig 2A). As both types have a large, disordered C-terminal region (Fig 2A), we investigated whether this characteristic is more broadly conserved across organisms within the Eukarya domain. As a proof of concept, we only examined MADS-domain proteins from three of the four kingdoms within the Eukarya domain: Plantae, Fungi, and Animalia [95]. We selected sequences from Homo sapiens (five), Drosophila melanogaster (two), and Saccharomyces cerevisiae (four) and used Arabidopsis MADS-domain protein sequences as a reference due to their extensive characterization. These sequences were compared with those of various phyla within Plantae, including Chlorophytes, Charophytes, Gymnospermae, and Angiospermae. We extended this comparison to model species within Animalia (Chordata and Arthropoda) and Fungi (Ascomycota) (Fig 3 and S2 Fig). This analysis showed that all the selected MADS-domain proteins of non-plant species contain a disordered C-terminal region and that, at least for these proteins, these regions are considerably longer than in most plant MADS-domain proteins (Fig 3 and S2 Fig). Moreover, regardless of the taxonomic group to which a species belongs, the C-terminal region consistently presents a higher level of disorder than the full-length protein, whether measured by ADS or PPIDR (S3 Fig).
Fig 3. Intrinsically disordered regions (IDRs) in selected MADS-domain proteins from plant and non-plant species.
The phylogenetic tree presented here is a simplified and rescaled version of the original presented in S2 Fig, with branch lengths adjusted for clarity. Deeper nodes within Type I MADS-domain protein clades exhibit low phylogenetic support, and their relationships should be interpreted with caution. In contrast, the more derived nodes, particularly among Type II proteins, show stronger phylogenetic support, consistent with previously published phylogenies. PPIDR and ADS values for each protein are shown in the columns to the right of the diagram.
The disordered C-terminal regions of Arabidopsis MADS-domain proteins contain conserved motifs and MoRFs
To further characterize the C-terminal region of the MADS-domain proteins, we looked for conserved motifs within this region and identified three distinct motifs both in Type I and in Type II MADS-domain proteins (Fig 4). Among the Arabidopsis MADS-domain proteins, there is a conserved pattern of motif distribution, with motifs shared in a subfamily-specific manner. Within the β subfamily of Type I MADS-domain proteins [24], 10 out of 19 proteins (Fig 4) contain both motif 1 and motif 2, whereas seven proteins from both the γ and β subfamilies possess only motif 2 (Fig 4). In the γ subfamily [24], 10 out of 17 proteins have motif 3 (Fig 4). Our analysis shows that none of the α subfamily of Type I MADS-domain proteins contain any distinctive motif.
Fig 4. Motifs identified within the IDRs of the C-terminal regions of MADS-domain proteins.
Motifs were detected using the MEME algorithm [121] based on the C-terminal sequences of Type I (left panel) and Type II (right panel) MADS-domain proteins. Type I MADS-domain proteins do not contain molecular recognition features (MoRFs; purple rectangles) near or within the identified motifs. In contrast, in Type II MADS-domain proteins, MoRFs (purple rectangles) overlap with the predicted motifs, supporting their potential functional relevance. MoRFs were predicted using the fMoRFpred algorithm [123]. Consensus sequences of the identified motifs are shown in the lower panel.
Among Type II MADS-domain proteins, three motifs are shared by members of different clades. In the SOC1 clade (SOC1, XAL2, AGL42, AGL19, AGL71 and AGL72), all members contain motif 4, which is also present in proteins outside this clade, including SEP3, AGL24, SEP2, AGL15, and SVP (Fig 4, right panel). Within the FLC clade, all members share motif 5, suggesting functional constraints and convergent acquisition (Fig 4). Motif 6 is exclusive to CAL and AP1, which are partially redundant in floral meristem determination and belong to the AP1 clade [96]. Additionally, we selected various SOC1 orthologs and members of the FLC clade to further characterize their conserved motifs. For SOC1 orthologs, the consensus motif is predicted to span 19 residues, with a core of 10 highly conserved residues across angiosperms, except in Populus trichocarpa (S4A Fig). The FLC consensus motif is 21 amino acids long and contains at least eight conserved residues, primarily located in the latter portion of the sequence [24] (S4B Fig).
We also evaluated the Molecular Recognition Features (MoRFs) across the full-length MADS-domain proteins to determine whether the ubiquitous IDRs in the C-terminal region coincide with any MoRFs. Since MoRFs are known to mediate molecular recognition and are proposed to facilitate specific interactions among proteins, we hypothesized that the C-terminal region would contain at least one predicted MoRF. Indeed, we found that several MADS-domain proteins, regardless of their species of origin, have a MoRF within the last 10 amino acids of their C-terminal region (S5 Fig). Moreover, several MADS-domain proteins show MoRFs of 1–4 amino acids at the beginning of their N-terminal region. Intriguingly, one of these proteins is AG, which also has an IDR of at least 30 amino acids in the N-terminal region. Notably, the MoRFs in the SOC1, AGL24, AGL42 and SVP proteins were found associated with the SOC1 motif (Fig 4). Similarly, the FLC motif shows predicted MoRFs within its last 5–7 amino acids in all the proteins of the FLC clade. In contrast, Type I MADS-domain proteins do not present any MoRFs coinciding with the identified motifs (Fig 4).
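The two checks described above (whether a predicted MoRF overlaps a conserved motif, and whether a MoRF lies within the last 10 residues of a protein) reduce to simple interval comparisons. The following is a minimal Python sketch of that logic, not the pipeline used in this study. The function names are hypothetical, and the coordinates are invented for illustration, not taken from Fig 4 or S5 Fig.

```python
from typing import List, Tuple

# 1-based, inclusive residue positions (start, end)
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive residue intervals share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def morfs_in_motif(morfs: List[Interval], motif: Interval) -> List[Interval]:
    """Return the predicted MoRFs that overlap a given motif."""
    return [m for m in morfs if overlaps(m, motif)]


def has_terminal_morf(morfs: List[Interval], protein_length: int, window: int = 10) -> bool:
    """Return True if any MoRF reaches into the last `window` residues of the protein."""
    tail_start = protein_length - window + 1
    return any(end >= tail_start for _, end in morfs)


# Hypothetical coordinates, for illustration only
protein_length = 214
motif = (190, 208)
morfs = [(1, 3), (203, 209)]

print(morfs_in_motif(morfs, motif))
print(has_terminal_morf(morfs, protein_length))
```

With these invented coordinates, the second MoRF both overlaps the motif and falls within the C-terminal 10-residue window, which is the pattern reported for the SOC1 and FLC motifs.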
The disordered C-terminal regions of MADS-domain proteins contain potential activation domains
Activation domains (ADs) in TFs play a central role in the function of these proteins, as they constitute the recruiting sites of coactivator complexes to activate transcription [97]. A connection between IDRs and ADs has been established in several transcription factor families, highlighting the importance of structural flexibility in recognizing diverse molecular targets according to the cellular conditions [98,99]. The highly conserved presence of a C-terminal IDR in Type I and Type II MADS-domain proteins prompted us to search for potential ADs in this region. Using the plant activation domain identification (PADI) score developed by Morffy et al. (2024) [72] to analyze these C-terminal regions, we identified high-scoring regions in 34 (58.3%) of Type I MADS-domain proteins, indicating a high proportion of potential ADs, compared to 9 (22.5%) of Type II MADS-domain proteins with a high PADI score (Fig 5 and S2 Table).
Fig 5. Potential activation domains in the C-terminal region of MADS-domain proteins.

The plant activation domain identification (PADI) score, or scaled activation score, predicts the likelihood of putative activation domains (ADs), with fragments with a PADI score ≥1 classified as potential ADs [64]. In the graphs, dots indicate the positions of 40-amino-acid fragments predicted to contain ADs, while black stars mark experimentally validated ADs. Numbers along the horizontal lines show regions within the protein that are enriched in putative ADs. Colored boxes represent the defined domains in Type I and Type II MADS-domain proteins: MADS domain (yellow), I domain (green), K domain (blue), and C-terminal region (red).
Although not all MADS-domain proteins exhibited potential ADs, numerous regions with a high PADI score were found in some of them. To look for possible AD distribution patterns, we graphed the localization of the high PADI scoring regions across the C-terminal region of the MADS-domain proteins. Given the large number of Type I proteins, we grouped them by subfamily (Mα, Mβ, and Mγ), as previously described [24,36], to improve clarity in the analysis. The C-terminal region of Type I Mα MADS-domain proteins showed a higher abundance of potential ADs between residues 208 and 341. In contrast, the highest abundance of putative ADs in Type I Mβ was found between residues 111 and 324. For the Type I Mγ subfamily, the highest AD abundance was found in two distinct sections (161–278 and 281–339) of their central region. In the case of Type II MADS-domain proteins, potential ADs were more evenly distributed across the region between residues 171 and 255 (Fig 5 and S2 Table).
Additionally, for Type I MADS-domain proteins, we found that putative ADs are associated with motifs previously identified in one member of the Mβ subfamily (AGL81), and one member of the Mγ subfamily (PHE1) [36]. Similarly, for Type II MADS-domain proteins, some ADs are associated with different motifs identified in one member of the SOC1 clade (AGL19), in two members of the SQUA clade (CAL, AP1), and in two members of the SEP clade (SEP2 and SEP3) [24]. The overlap of specific motifs within particular phylogenetic groups of MADS-domain proteins supports the functional significance of these potential ADs.
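The labelling of fragments as "AD", "Maybe" or "Not AD" follows two thresholds stated in this paper: a PADI score ≥1, and a mean disorder >0.5 (the latter given in the supporting table legend). The Python sketch below illustrates that decision rule only. It assumes the PADI scores and mean disorder values have already been computed elsewhere; the function names and example values are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

PADI_CUTOFF = 1.0      # fragments with PADI >= 1 are potential ADs
DISORDER_CUTOFF = 0.5  # mean disorder > 0.5 is required for the "AD" label


def classify_fragment(padi: float, mean_disorder: float) -> str:
    """Label one fragment from its PADI score and its mean disorder."""
    if padi < PADI_CUTOFF:
        return "Not AD"
    if mean_disorder > DISORDER_CUTOFF:
        return "AD"
    return "Maybe"


def summarize(fragments: Iterable[Tuple[float, float]]) -> Counter:
    """Count how many fragments fall into each label."""
    return Counter(classify_fragment(p, d) for p, d in fragments)


# Hypothetical (PADI score, mean disorder) pairs for 40-residue fragments
example_fragments = [(1.8, 0.72), (1.2, 0.31), (0.4, 0.90), (1.0, 0.51)]

labels = [classify_fragment(p, d) for p, d in example_fragments]
print(labels)
print(summarize(example_fragments))
```

Note that a highly disordered fragment with a PADI score below 1 is still labelled "Not AD": disorder only distinguishes "AD" from "Maybe" among fragments that already pass the PADI cutoff.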
MADS-domain proteins show a propensity for liquid-liquid phase separation
Some proteins can form compartments within cells where certain proteins, RNA, and metabolites concentrate to orchestrate specific cellular processes. These compartments are generated via Liquid-Liquid Phase Separation (LLPS), a process in which certain molecules concentrate to form a new liquid phase distinct from the surroundings. Some proteins are considered droplet drivers according to their probability of spontaneously forming a droplet state (pLLPS). In this state, protein interactions can occur in different binding configurations, making IDPs common components of these structures. To explore the propensity of MADS-domain proteins to spontaneously undergo LLPS and form condensed cellular states, we used the FuzDrop server [89]. The FuzDrop algorithm defines a probability threshold value (pLLPS ≥ 0.60) to identify proteins capable of phase separation and driving droplet formation. This analysis showed that, among Type I MADS-domain proteins and their C-terminal regions, LLPS propensity is widely distributed among proteins with low or high ADS or PPIDR. We found 21 full-length proteins with LLPS propensity (AGL103, AGL93, AGL89, AGL53, AGL74, AGL48, AGL77, AGL102, AGL60, AGL64, AGL23, AGL56, AGL92, AGL98, AGL75, AGL52, AGL76, AGL45, AGL81, AGL43, AGL29). When only the C-terminal region was analyzed, three additional proteins passed the threshold (AGL91, AGL99, AGL86), whereas six of the initial 21 proteins (AGL60, AGL64, AGL98, AGL52, AGL76, AGL81) did not (Fig 6 and S4 Table). Regarding Type II MADS-domain proteins, only four full-length proteins showed LLPS propensity (AGL104, AGL79, AGL66, GOA), with ADS values between 0.4 and 0.6 (S4 Table). Seventeen additional proteins showed LLPS propensity when only the C-terminal region was analyzed (including SEP1, SEP4, AGL13, AGL15, AGL18, FLC, MAF1, AP1, AG, AGL19, AGL67, AGL24, SHP2, MAF5, AGL17, AGL72, MAF4) (Fig 6 and S4 Table).
Fig 6. Liquid-Liquid Phase Separation (LLPS) propensity in Type I (right panel) and Type II (left panel) MADS-domain proteins.
The LLPS probability index (pLLPS) was calculated using FuzDrop, while the ADS and PPIDR were derived from RIDAO predictions. The scatterplots illustrate the relationship between the pLLPS index and ADS (A) or PPIDR (B) for the full-length proteins (dark diamonds) and the C-terminal regions (light grey diamonds), for Type I and Type II MADS-domain proteins.
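The comparison between full-length proteins and their C-terminal regions amounts to applying the pLLPS ≥ 0.60 threshold to two sets of scores and taking set differences. The following Python sketch illustrates that bookkeeping under the assumption that pLLPS values have already been obtained from the server; the protein names and scores are hypothetical placeholders and are not values from S4 Table.

```python
from typing import Dict, Set

PLLPS_THRESHOLD = 0.60  # threshold stated in the text for droplet-driving proteins


def droplet_drivers(scores: Dict[str, float]) -> Set[str]:
    """Return the names of proteins whose pLLPS meets the threshold."""
    return {name for name, p in scores.items() if p >= PLLPS_THRESHOLD}


# Hypothetical pLLPS values, for illustration only
full_length = {"protA": 0.72, "protB": 0.41, "protC": 0.65}
c_terminal = {"protA": 0.80, "protB": 0.66, "protC": 0.52}

full_hits = droplet_drivers(full_length)
cterm_hits = droplet_drivers(c_terminal)

gained = cterm_hits - full_hits  # pass only when the C-terminal region is scored alone
lost = full_hits - cterm_hits    # pass only as full-length proteins

print(sorted(full_hits))
print(sorted(gained))
print(sorted(lost))
```

Reporting the gained and lost sets separately keeps the two effects distinct: some proteins cross the threshold only when the C-terminal region is analyzed alone, while others do so only as full-length sequences.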
Discussion
The Arabidopsis MADS-domain proteins participate in nearly all developmental processes and are also involved in many different stress responses [11,29,100]. These proteins are divided into two groups based on their phylogenetic relationship and protein domains: Type I with three domains (M, I, and C), and Type II with four domains (M, I, K, and C) [24,35,101,102].
In this work, we demonstrated through in silico analyses that the C-terminal regions of 100 Arabidopsis MADS-domain proteins listed in UniProt are enriched with IDRs (≥30 residues), with no significant differences in the number of IDRs between Type I and Type II proteins. Although we were unable to find information on the functional characterization of these regions in Type I proteins, several examples underscore the significance of the C-terminal domain in the function of Type II MADS-domain proteins. In particular, specific phenotypes and altered protein-protein interactions in Type II MADS-domain proteins have been associated with point mutations or deletions within IDRs in the C-terminal region, highlighting their functional importance (Fig 7 and S6 Table). For instance, in Arabidopsis, Raphanus sativus, Nicotiana sylvestris and N. tabacum, the C-terminal regions of APETALA 1 (AP1) and its orthologs have been shown to mediate transcriptional activation in yeast and mammalian cells [101]. Additionally, three AP1 loss-of-function mutants (ap1-4, ap1-6 and ap1-8), all with mutations in the C-terminal region, exhibit different phenotypes [103]. Similarly, the C-terminal domains of GLOBOSA (GLO) and DEFICIENS (DEF) in Antirrhinum majus are critical for the interaction between GLO and DEF and between DEF and SQUAMOSA (SQUA) [103]. Furthermore, the C-terminal domain plays a central role in mediating interactions between MADS-domain proteins and non-MADS-domain proteins. For instance, the co-repressors SEUSS (SEU) and LEUNIG (LUG) interact with AP1 or SEP3 through their C-terminal domains [104]. In Arabidopsis, the K and C-terminal domains of AGAMOUS (AG) are also indispensable for DNA binding [105,106]. Interestingly, a small IDR in the N-terminal region of AG, located before the MADS domain (Fig 1), is essential for its function, as constructs lacking this IDR exhibit a phenotype resembling that of an AP2 mutant.
Moreover, overexpression of AG without its C-terminal domain results in a phenotype similar to that of the AG loss-of-function mutant (ag), indicating that the C-terminal domain participates in AG functions [106]. For SEP3, the interaction between helices in the N-terminal domain and those in the C-terminal domain of different partner proteins creates a hydrophobic interface that facilitates dimerization [107]. This supports the hypothesis that the C-terminal domain participates in the stabilization of protein complexes. The dimerization of TFs has an important role in regulating heterodimerization, enabling dynamic temporal responses to changes in protein concentrations, among other functions [108]. Given that most MADS-domain proteins function as homo- or heterodimers [107], the role of IDRs in their C-terminal region is of particular interest.
Fig 7. Proposed model for the role of intrinsically disordered regions (IDRs) in the C-terminal region of MADS-domain transcription factors.
Top left panel – Schematic representation of Type I and Type II MADS proteins. Different functional domains are color-coded, with the C-terminal region illustrated as a red ribbon. The C-terminal IDR contains Activation Domains (ADs), Molecular Recognition Features (MoRFs) or motifs, and phosphorylation sites (see color codes to the left of the diagrams). These features are associated with promoting protein–protein interactions both between MADS-domain proteins and with other regulatory partners. Some MADS-domain proteins display a high propensity for liquid–liquid phase separation, suggesting their involvement in the formation of biomolecular condensates. Such transcription factor condensates have been reported to enhance target gene expression and, in some cases, to recruit components of the transcriptional machinery (e.g., RNA Pol II). In some instances, deletion of the C-terminal region of MADS-domain proteins disrupts their protein–protein interactions, potentially impairing dimer formation and resulting in reduced activation of target genes.
The analysis conducted in this study also revealed that structural disorder in the C-terminal region of MADS-domain proteins is widely conserved across diverse taxa, including Drosophila, yeast, and human MADS-domain proteins. In humans, there are four MADS-domain proteins (MEF2A-D) mainly involved in neural development, muscle formation, heart development, and carcinogenesis. Of note, like some Arabidopsis MADS-domain proteins, the C-terminal domain of human MEF2B and MEF2D is required for transactivation [109,110]. Furthermore, phosphorylation at specific sites within the C-terminal domain of MEF2D has been shown to inhibit its transcriptional activity [111]. Consistent with these observations, our analysis also found a high frequency of predicted phosphorylation sites in the C-terminal domain of both Type I and Type II Arabidopsis MADS-domain proteins. This finding also aligns with previous observations showing a higher phosphorylation propensity in IDRs compared to ordered regions across entire proteomes [94]. It underscores the functional relevance of these post-translational modifications and highlights the prevalence of multisite phosphorylation within disordered regions. Although some of the sites predicted by NetPhos may not be functional and/or still await experimental testing, the relatively balanced distribution of experimentally tested sites across both regions in Type II proteins suggests functional relevance in both the N-terminal and C-terminal regions. Furthermore, the higher representation of sites in the N-terminal regions (M, I, and K domains) may partly reflect a historical research focus on the DNA-binding function of the M domain. In contrast, the C-terminal region is more variable, has not been thoroughly studied, and might play an important but underappreciated role in MADS-domain protein function.
The presence of multiple phosphorylation sites in IDRs has been associated with their role in modulating gradual cellular responses and mediating protein-protein interactions, emphasizing the regulatory role of phosphorylation within disordered domains [112–115]. Further investigation into the function of the predicted phosphorylation sites within the C-terminal disordered region of MADS-domain proteins will enhance our understanding of the mechanisms by which these TFs regulate gene expression.
By searching for conserved motifs within the disordered C-terminal region of MADS-domain proteins, we identified three distinct and specific motifs in Type I and three in Type II MADS-domain proteins. Interestingly, motif 3 identified in Type I MADS-domain proteins corresponds to a conserved region previously reported by De Bodt et al. (2003) [116]. Furthermore, some of these motifs were found in all the members of different clades, suggesting that they are associated with particular functions mediated by specific interactions and shared between the related proteins. This is the case of two motifs found in Type I MADS-domain proteins that are shared among some MADS-domain proteins of the Mγ and Mβ clades [24]. Motifs 4 and 5, identified in Type II MADS-domain proteins (Fig 4B), are conserved across all SOC1-clade and FLC-clade proteins, respectively, highlighting their potential functional significance. Conducting motif-swapping experiments among SOC1 and FLC clade members and MADS-domain proteins that naturally lack these motifs would provide valuable insights into their functional significance. The presence of clade-specific motifs in MADS-domain proteins may be attributed to the highly conserved interaction networks within different plant MADS-domain protein clades [117]. Interactions among MADS-domain proteins are restricted by their highly conserved domains in the complete proteins [117]. Conserved motifs within the variable C-terminal region identified in this study might also be involved in facilitating and stabilizing the interactions between MADS-domain proteins.
Several examples in plants demonstrate that transcriptional activation depends on the recruitment of coactivators. In MADS-domain proteins, tetramer formation increases the DNA regions available for transcriptional binding, enhancing their regulatory capacity [107,118]. Using the data on activation domains (ADs) obtained by Morffy et al. (2024) [72], we showed that a significant proportion of the potential ADs of MADS-domain TFs are in their C-terminal IDR. Moreover, we found that some of the identified ADs overlap with a conserved motif in the MADS-domain TFs of the SOC1 clade. This is particularly evident in the C-terminal region of Type II MADS-domain proteins. The conserved motif in the C-terminal IDR of SEP homologs across different angiosperms [119], which corresponds to motif 4 in the SOC1-clade TFs, coincides with one of the ADs identified by Morffy et al. (2024) [72]. This overlap strongly suggests that this motif is functionally significant. We made similar observations for AP1 TFs, where ADs, characterized by the presence of acidic, proline-rich, and glutamine-rich subdomains, have been experimentally identified in their C-terminal IDRs [101].
Interestingly, Type I MADS-domain proteins exhibit more putative ADs than Type II proteins, and this could reflect specific roles for Type I MADS-domain proteins not only in female gametophyte development but also in seed development [28,120–124]. Unfortunately, we were unable to find any study specifically analyzing the functional relevance of the C-terminal domains or any other regions of these proteins. This gap adds further interest to the findings presented here and encourages future research into the role of the potential ADs in mediating the association between MADS-domain proteins and their co-regulators.
Finally, our analysis revealed that some MADS-domain proteins have a propensity to undergo liquid-liquid phase separation (LLPS). Nevertheless, even among those with high disorder scores, particularly within Type II MADS-domain proteins, not all appear capable of undergoing LLPS. This discrepancy may be influenced by the fact that our analysis was based solely on protein primary structures, without accounting for possible posttranslational modifications such as phosphorylation. As we show, MADS-domain TFs could be phosphorylated at multiple sites. These posttranslational modifications, whether occurring at one or several sites, could be implicated in the promotion of LLPS, with the extent of phase separation likely dependent on specific cellular conditions. Additionally, the LLPS propensity obtained in this analysis is in agreement with findings showing that the capacity to drive LLPS is not determined solely by the disordered nature of the sequence. Instead, it depends on specific sequence features, such as the distribution and patterning of aromatic and charged residues. These sequence-encoded patterns are essential for enabling the multivalent interactions required for the formation of biomolecular condensates [125]. When only the C-terminal region of MADS-domain proteins is analyzed for LLPS propensity, the effect of intrinsic disorder shifts the scores upward. A greater number of proteins present LLPS scores above 0.6 compared to analyses of the full-length proteins, supporting the contribution of IDRs to protein condensation propensity. Furthermore, some studies have suggested that ADs can drive TF phase separation, leading to the formation of transcriptional condensates associated with chromatin [126–128]. However, recent findings indicate that this mechanism is not universal, as the recruitment of activators can also occur independently of phase separation. 
Important events that enhance transcriptional activation include the multivalent interactions mediated by the ADs, which increase the residence time of TFs on chromatin and thereby promote the recruitment of coactivators [97].
This study highlights common characteristics shared by MADS-domain proteins, not only in plants but also across organisms from other domains of life. Notably, the presence of a disordered C-terminal region in Type I and Type II MADS-domain TFs stands out. The functional significance of this region is strongly supported by the identification of several potential ADs and phosphorylation sites. Furthermore, we identified not only putative protein-protein interaction sites within this region but also conserved motifs specific to evolutionarily related MADS-domain proteins, further supporting their role in the transcriptional regulatory function of these TFs.
The remarkable conservation of the structurally disordered C-terminal region in MADS-domain TFs suggests specific biological functions for this region. Some of these may be associated with the presence of conserved motifs and/or phosphorylation sites. However, these elements correspond to short segments within a broader region that, based on primary sequence alignments, does not appear to be under strong evolutionary constraint. To date, the persistence of IDRs across the proteomes of all analyzed organisms remains an open question. Although some evolutionary approaches have attempted to identify conserved molecular features (e.g., NCPR, kappa, and FCR) in the amino acid sequences of IDRs [129], the findings suggest that, while certain molecular features are linked to known functions in yeast IDRs and may reflect a mechanism of IDR evolution, this pattern does not appear to extend to multicellular organisms like Drosophila [130]. These observations show that although IDRs may follow unique patterns of amino acid substitutions, intrinsic disorder itself is subject to dynamic evolutionary processes, shaped by more complex evolutionary constraints across evolving properties of different domains of life.
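Two of the molecular features mentioned above, the fraction of charged residues (FCR) and the net charge per residue (NCPR), are simple composition statistics. The Python sketch below computes them for one sequence. It assumes the common convention of counting Lys and Arg as positive and Asp and Glu as negative, with histidine treated as neutral; whether the cited studies used exactly this convention is an assumption here. Kappa, which describes charge patterning, needs a dedicated implementation and is not computed. The function name and example sequence are hypothetical.

```python
POSITIVE = set("KR")  # assumption: histidine is treated as neutral
NEGATIVE = set("DE")


def charge_features(seq: str) -> dict:
    """Compute FCR and NCPR for a protein sequence in one-letter code."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    n_pos = sum(aa in POSITIVE for aa in seq)
    n_neg = sum(aa in NEGATIVE for aa in seq)
    n = len(seq)
    return {
        "FCR": (n_pos + n_neg) / n,   # fraction of charged residues
        "NCPR": (n_pos - n_neg) / n,  # net charge per residue
    }


# Hypothetical sequence, for illustration only
print(charge_features("DEKRAAAA"))
```

A sequence can therefore have a high FCR and an NCPR near zero when its positive and negative charges balance, which is why the two features are usually reported together.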
Considering the well-established role of MADS-domain TFs in regulating diverse developmental processes and stress responses, the conservation of a structurally disordered C-terminal region across all family members, along with the presence of conserved motifs, potential activation domains and phosphorylation sites, suggests a shared regulatory mechanism. Based on these findings, we propose a mechanistic model describing the functional role of specific structural elements within the C-terminal domain of both Type I and Type II MADS-domain TFs. The structural flexibility provided by the IDRs in the C-terminal domain, combined with the presence of MoRFs, ADs, and a high phosphorylation propensity, points to a regulatory role in modulating MADS-domain TF activity within transcriptional complexes (Fig 7, S6 Table). In this model, these IDRs confer both structural flexibility and modularity, facilitating the formation of dynamic protein complexes. This may occur either through a high propensity for liquid-liquid phase separation (LLPS) or by promoting transient interactions with other proteins, both of which support the formation of transcriptional condensates (Fig 7). In the present study, we identified MoRFs and ADs within the C-terminal IDRs. MoRFs likely mediate specific, transient interactions with regulatory proteins or other MADS-domain TFs. In concert with ADs, these elements may modulate the assembly and stability of transcriptional complexes. Previous reports suggest that the formation of condensates enriched in transcription factors enables fine-tuned regulation of transcriptional activity, particularly in response to physiological or developmental cues [131,132]. Within such condensates, a variety of proteins can interact with TF IDRs through their ADs and LLPS-driven mechanisms, further supporting the functional significance of the C-terminal IDRs in MADS-domain TFs.
Overall, this study provides valuable insights for a deeper understanding of the relationship between protein structure and function for MADS-domain TFs. We believe this information will encourage and support further experimental studies by researchers working on MADS-domain proteins in diverse biological systems, especially to investigate the functional relevance of these conserved regions.
Supporting information
(DOCX)
AD = activation domain. AD, Maybe and Not AD definitions are based on the PADI (plant activation domain identification) score. PADI score ≥1 and mean disorder >0.5 are defined as “AD”; PADI score ≥1 and mean disorder ≤0.5 are defined as “Maybe”; PADI score <1 is defined as “Not AD”.
(DOCX)
These packages are available from CRAN (https://CRAN.R-project.org/) or Bioconductor (Huber et al., 2015).
(DOCX)
(XLSX)
Sites were mapped on domains, according to UniProt and Liu et al., 2021 definition. See main text for a detailed description.
(DOCX)
(DOCX)
(A) Net charge per residue (NCPR), (B) Charge distribution (Kappa), and (C) Fraction of Charged Residues (FCR). Whiskers indicate ±1.5*IQR based on Tukey test. The middle line represents the median.
(TIF)
Oryza sativa japonica (Os), Solanum dulcamara (SD), Mangifera indica (MI), Populus trichocarpa (PT), Amborella trichopoda (AMB), Pinus radiata (Prad), Selaginella moellendorffii (SM), Chara braunii (CB), Chlorella desiccata (CD), Saccharomyces cerevisiae (SC), Drosophila melanogaster (DM), and Homo sapiens (HS). Yellow-shaded branches cover Type I MADS grouped with SRF-like and MEF-like MADS-domain proteins. Purple-shaded branches cover most Type II MADS-domain proteins. Numbers adjacent to nodes represent bootstrap support. Orange dots at particular nodes indicate the putative ancestral motif for that specific clade.
(TIF)
Boxplots showing the ADS and PPIDR values for the C-terminal region and the full-length proteins from: (A) Plant species analysed in this study, excluding Arabidopsis. (B) Saccharomyces cerevisiae, (C) Homo sapiens, and (D) Drosophila melanogaster. Whiskers indicate ±1.5*IQR according to the Tukey test. The middle line represents the median.
(TIF)
The phylogenetic tree was derived from the complete MADS-domain protein tree shown in Supplementary Fig 2. To enhance the visualization of phylogenetic relationships among the proteins, branch lengths were rescaled and truncated. The conserved amino acid residues of the SOC1-motif and FLC-motif are highlighted in bold within the consensus motif.
(TIF)
MADS-domain protein sequences from Arabidopsis, other plants, and non-plant organisms are shown, with predicted MoRFs highlighted in yellow. Putative MoRFs were identified using the fMoRFpred algorithm [123], based on the analysis of full-length protein sequences.
(TIF)
Acknowledgments
The first authors wish to thank Consejo Nacional de Humanidades, Ciencias y Tecnología (CONAHCyT) for the postdoctoral scholarships granted (E.R.A., CVU number 413896; T.N.R., CVU 501149).
Data Availability
All relevant data are within the paper and its Supporting information files.
Funding Statement
This work was partially financed by project PAPIIT No. IN213524 from Universidad Nacional Autónoma de México (UNAM) and project No. CBF2023-2024-1002 from the Consejo Nacional de Humanidades, Ciencias y Tecnologías (CONAHCyT), granted to A.G.A.; and by projects No. CF-2019/252952 and CF2023-I-503 from the Consejo Nacional de Humanidades, Ciencias y Tecnologías (CONAHCyT), granted to A.A.C.
References
- 1. Wells CL, Pigliucci M. Adaptive phenotypic plasticity: the case of heterophylly in aquatic plants. Perspect Plant Ecol Evol Syst. 2000;3:1–18.
- 2. Sultan SE. Phenotypic plasticity for plant development, function and life history. Trends Plant Sci. 2000;5(12):537–42. doi: 10.1016/s1360-1385(00)01797-0
- 3. Spitz F, Furlong EEM. Transcription factors: from enhancer binding to developmental control. Nat Rev Genet. 2012;13(9):613–26. doi: 10.1038/nrg3207
- 4. Schwechheimer C, Bevan M. The regulation of transcription factor activity in plants. Trends Plant Sci. 1998;3:378–83.
- 5. Mosa KA, Ismail A, Helmy M. Plant Stress Tolerance. Springer International Publishing. 2017.
- 6. Hoang XLT, Nhi DNH, Thu NBA, Thao NP, Tran LSP. Transcription factors and their roles in signal transduction in plants under abiotic stresses. Curr Genomics. 2017;18:483–97.
- 7. Kaufmann K, Muiño JM, Jauregui R, Airoldi CA, Smaczniak C, Krajewski P, et al. Target genes of the MADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biol. 2009;7(4):e1000090. doi: 10.1371/journal.pbio.1000090
- 8. Yu L-H, Miao Z-Q, Qi G-F, Wu J, Cai X-T, Mao J-L, et al. MADS-box transcription factor AGL21 regulates lateral root development and responds to multiple external and physiological signals. Mol Plant. 2014;7(11):1653–69. doi: 10.1093/mp/ssu088
- 9. Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol. 2004;14(3):283–91. doi: 10.1016/j.sbi.2004.05.004
- 10. Melzer R, Theissen G. MADS and more: transcription factors that shape the plant. Methods Mol Biol. 2011;754:3–18. doi: 10.1007/978-1-61779-154-3_1
- 11. Castelán-Muñoz N, Herrera J, Cajero-Sánchez W, Arrizubieta M, Trejo C, García-Ponce B. MADS-Box genes are key components of genetic regulatory networks involved in abiotic stress and plastic developmental responses in plants. Front Plant Sci. 2019;10:853.
- 12. Schwarz-Sommer Z, Huijser P, Nacken W, Saedler H, Sommer H. Genetic Control of Flower Development by Homeotic Genes in Antirrhinum majus. Science. 1990;250(4983):931–6. doi: 10.1126/science.250.4983.931
- 13. Wu W, de Folter S, Shen X, Zhang W, Tao S. Vertebrate paralogous MEF2 genes: origin, conservation, and evolution. PLoS One. 2011;6(3):e17334. doi: 10.1371/journal.pone.0017334
- 14. Acton TB, Zhong H, Vershon AK. DNA-binding specificity of Mcm1: operator mutations that alter DNA-bending and transcriptional activities by a MADS box protein. Mol Cell Biol. 1997;17(4):1881–9. doi: 10.1128/MCB.17.4.1881
- 15. Dichoso D, Brodigan T, Chwoe KY, Lee JS, Llacer R, Park M, et al. The MADS-Box factor CeMEF2 is not essential for Caenorhabditis elegans myogenesis and development. Dev Biol. 2000;223(2):431–40. doi: 10.1006/dbio.2000.9758
- 16. Lilly B, Galewsky S, Firulli AB, Schulz RA, Olson EN. D-MEF2: a MADS-box transcription factor expressed in differentiating mesoderm and muscle cell lineages during Drosophila embryogenesis. Proc Natl Acad Sci. 1994;91:5662–6.
- 17. Pollock R, Treisman R. Human SRF-related proteins: DNA-binding properties and potential regulatory targets. Genes Dev. 1991;5(12A):2327–41. doi: 10.1101/gad.5.12a.2327
- 18. Alvarez-Buylla ER, Liljegren SJ, Pelaz S, Gold SE, Burgeff C, Ditta GS, et al. MADS-box gene evolution beyond flowers: expression in pollen, endosperm, guard cells, roots and trichomes. Plant J. 2000;24(4):457–66. doi: 10.1046/j.1365-313x.2000.00891.x
- 19. Becker A, Theissen G. The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol Phylogenet Evol. 2003;29(3):464–89. doi: 10.1016/s1055-7903(03)00207-0
- 20. Meng D, Cao Y, Chen T, Abdullah M, Jin Q, Fan H, et al. Evolution and functional divergence of MADS-box genes in Pyrus. Sci Rep. 2019;9(1):1266. doi: 10.1038/s41598-018-37897-6
- 21. Shore P, Sharrocks AD. The MADS-box family of transcription factors. Eur J Biochem. 1995;229(1):1–13. doi: 10.1111/j.1432-1033.1995.tb20430.x
- 22. Qiu Y, Li Z, Walther D, Köhler C. Updated Phylogeny and Protein Structure Predictions Revise the Hypothesis on the Origin of MADS-box Transcription Factors in Land Plants. Mol Biol Evol. 2023;40(9):msad194. doi: 10.1093/molbev/msad194
- 23. Alvarez-Buylla ER, Pelaz S, Liljegren SJ, Gold SE, Burgeff C, Ditta GS, et al. An ancestral MADS-box gene duplication occurred before the divergence of plants and animals. Proc Natl Acad Sci U S A. 2000;97(10):5328–33. doi: 10.1073/pnas.97.10.5328
- 24. Smaczniak C, Immink RGH, Angenent GC, Kaufmann K. Developmental and evolutionary diversity of plant MADS-domain factors: insights from recent studies. Development. 2012;139(17):3081–98. doi: 10.1242/dev.074674
- 25. Chen X, Gao B, Ponnusamy M, Lin Z, Liu J. MEF2 signaling and human diseases. Oncotarget. 2017;8(67):112152–65. doi: 10.18632/oncotarget.22899
- 26. Mead J, Bruning AR, Gill MK, Steiner AM, Acton TB, Vershon AK. Interactions of the Mcm1 MADS box protein with cofactors that regulate mating in yeast. Mol Cell Biol. 2002;22(13):4607–21. doi: 10.1128/MCB.22.13.4607-4621.2002
- 27. Messenguy F, Dubois E. Role of MADS box proteins and their cofactors in combinatorial control of gene expression and cell development. Gene. 2003;316:1–21. doi: 10.1016/s0378-1119(03)00747-9
- 28. Bemer M, Heijmans K, Airoldi C, Davies B, Angenent GC. An atlas of type I MADS-box gene expression during female gametophyte and seed development in Arabidopsis. Plant Physiol. 2010;154:287–300.
- 29. Theißen G, Kim JT, Saedler H. Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes. J Mol Evol. 1996;43:484–516.
- 30. Krizek BA, Meyerowitz EM. Mapping the protein regions responsible for the functional specificities of the Arabidopsis MADS domain organ-identity proteins. Proc Natl Acad Sci U S A. 1996;93(9):4063–70. doi: 10.1073/pnas.93.9.4063
- 31. Riechmann JL, Krizek BA, Meyerowitz EM. Dimerization specificity of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA, and AGAMOUS. Proc Natl Acad Sci U S A. 1996;93(10):4793–8. doi: 10.1073/pnas.93.10.4793
- 32. Yang Y, Jack T. Defining subdomains of the K domain important for protein-protein interactions of plant MADS proteins. Plant Mol Biol. 2004;55(1):45–59. doi: 10.1007/s11103-004-0416-7
- 33.Lai X, Vega-Léon R, Hugouvieux V, Blanc-Mathieu R, van der Wal F, Lucas J, et al. The intervening domain is required for DNA-binding and functional identity of plant MADS transcription factors. Nat Commun. 2021;12:4760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Fan HY, Hu Y, Tudor M, Ma H. Specific interactions between the K domains of AG and AGLs, members of the MADS domain family of DNA binding proteins. Plant J. 1997;12(5):999–1010. doi: 10.1046/j.1365-313x.1997.12050999.x [DOI] [PubMed] [Google Scholar]
- 35.Riechmann JL, Meyerowitz EM. MADS domain proteins in plant development. Biol Chem. 1997;378(10):1079–101. [PubMed] [Google Scholar]
- 36.Parenicová L, de Folter S, Kieffer M, Horner DS, Favalli C, Busscher J, et al. Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world. Plant Cell. 2003;15(7):1538–51. doi: 10.1105/tpc.011544 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Salladini E, Jørgensen MLM, Theisen FF, Skriver K. Intrinsic Disorder in Plant Transcription Factor Systems: Functional Implications. Int J Mol Sci. 2020;21(24):9755. doi: 10.3390/ijms21249755 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradović Z. Intrinsic disorder and protein function. Biochemistry. 2002;41(21):6573–82. doi: 10.1021/bi012159+ [DOI] [PubMed] [Google Scholar]
- 39.Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, et al. Intrinsically disordered protein. J Mol Graph Model. 2001;19(1):26–59. doi: 10.1016/s1093-3263(00)00138-8 [DOI] [PubMed] [Google Scholar]
- 40.Radivojac P, Obradovic Z, Smith DK, Zhu G, Vucetic S, Brown CJ. Protein flexibility and intrinsic disorder. Protein Sci. 2004;13:71–80. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005;6(3):197–208. doi: 10.1038/nrm1589 [DOI] [PubMed] [Google Scholar]
- 42.Tompa P. Intrinsically disordered proteins: A 10-year recap. Trends in Biochem Sci. 2012;37:509–16. [DOI] [PubMed] [Google Scholar]
- 43.O’Shea C, Kryger M, Stender EGP, Kragelund BB, Willemoës M, Skriver K. Protein intrinsic disorder in Arabidopsis NAC transcription factors: transcriptional activation by ANAC013 and ANAC046 and their interactions with RCD1. Biochem J. 2015;465(2):281–94. doi: 10.1042/BJ20141045 [DOI] [PubMed] [Google Scholar]
- 44.Stender EG, O’Shea C, Skriver K. Subgroup-specific intrinsic disorder profiles of arabidopsis NAC transcription factors: Identification of functional hotspots. Plant Signal Behav. 2015;10:e1010967. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Sun X, Jones WT, Rikkerink EHA. GRAS proteins: the versatile roles of intrinsically disordered proteins in plant signalling. Biochem J. 2012;442(1):1–12. doi: 10.1042/BJ20111766 [DOI] [PubMed] [Google Scholar]
- 46.Valsecchi I, Guittard-Crilat E, Maldiney R, Habricot Y, Lignon S, Lebrun R, et al. The intrinsically disordered C-terminal region of Arabidopsis thaliana TCP8 transcription factor acts both as a transactivation and self-assembly domain. Mol Biosyst. 2013;9(9):2282–95. doi: 10.1039/c3mb70128j [DOI] [PubMed] [Google Scholar]
- 47.Tarczewska A, Greb-Markiewicz B. The Significance of the Intrinsically Disordered Regions for the Functions of the bHLH Transcription Factors. Int J Mol Sci. 2019;20(21):5306. doi: 10.3390/ijms20215306 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Oldfield CJ, Cheng Y, Cortese MS, Romero P, Uversky VN, Dunker AK. Coupled folding and binding with alpha-helix-forming molecular recognition elements. Biochemistry. 2005;44(37):12454–70. doi: 10.1021/bi050736e [DOI] [PubMed] [Google Scholar]
- 49.Yan J, Dunker AK, Uversky VN, Kurgan L. Molecular recognition features (MoRFs) in three domains of life. Mol Biosyst. 2016;12(3):697–710. doi: 10.1039/c5mb00640f [DOI] [PubMed] [Google Scholar]
- 50.Mohan A, Oldfield CJ, Radivojac P, Vacic V, Cortese MS, Dunker AK, et al. Analysis of molecular recognition features (MoRFs). J Mol Biol. 2006;362(5):1043–59. doi: 10.1016/j.jmb.2006.07.087 [DOI] [PubMed] [Google Scholar]
- 51.Vacic V, Oldfield CJ, Mohan A, Radivojac P, Cortese MS, Uversky VN, et al. Characterization of molecular recognition features, MoRFs, and their binding partners. J Proteome Res. 2007;6(6):2351–66. doi: 10.1021/pr0701411 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc Natl Acad Sci U S A. 2013;110(33):13392–7. doi: 10.1073/pnas.1304749110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Bhattarai A, Emerson IA. Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci. 2020;45:29. doi: 10.1007/s12038-020-0010-4 [DOI] [PubMed] [Google Scholar]
- 54.Dunker AK, Cortese MS, Romero P, Iakoucheva LM, Uversky VN. Flexible nets. FEBS Journal. 2005;272:5129–48. [DOI] [PubMed] [Google Scholar]
- 55.Alberti S. The wisdom of crowds: regulating cell function through condensed states of living matter. J Cell Sci. 2017;130(17):2789–96. doi: 10.1242/jcs.200295 [DOI] [PubMed] [Google Scholar]
- 56.Alberti S, Hyman AA. Biomolecular condensates at the nexus of cellular stress, protein aggregation disease and ageing. Nat Rev Mol Cell Biol. 2021;22(3):196–213. doi: 10.1038/s41580-020-00326-6 [DOI] [PubMed] [Google Scholar]
- 57.TAIR - Home. Available from: https://www.arabidopsis.org/.
- 58.UniProt. Available from: https://www.uniprot.org/.
- 59.National Center for Biotechnology Information. National Center for Biotechnology Information. Available from: https://www.ncbi.nlm.nih.gov/.
- 60.RAP-DB | HOME. Available from: https://rapdb.dna.affrc.go.jp/.
- 61.Arora R, Agarwal P, Ray S, Singh AK, Singh VP, Tyagi AK, et al. MADS-box gene family in rice: genome-wide identification, organization and expression profiling during reproductive development and stress. BMC Genomics. 2007;8:242. doi: 10.1186/1471-2164-8-242 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.MAFFT alignment and NJ/ UPGMA phylogeny. Available from: https://mafft.cbrc.jp/alignment/server/.
- 63.Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20(4):1160–6. doi: 10.1093/bib/bbx108 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Larsson A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics. 2014;30(22):3276–8. doi: 10.1093/bioinformatics/btu531 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8(3):275–82. doi: 10.1093/bioinformatics/8.3.275 [DOI] [PubMed] [Google Scholar]
- 66.Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. doi: 10.1093/bioinformatics/btu033 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Miller MA, Pfeiffer W, Schwartz T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: 2010 Gateway Computing Environments Workshop, GCE. 2010. [Google Scholar]
- 68.Portal | CIPRES. Available from: https://www.phylo.org/index.php/site.
- 69.Meme - Submission Form. Available from: https://meme-suite.org/meme/tools/meme.
- 70.Bailey TL, Johnson J, Grant CE, Noble WS. The MEME Suite. Nucleic Acids Research. 2015;43:W39-49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.fMoRFpred - fast Molecular Recognition Feature predictor. Available from: http://biomine.cs.vcu.edu/servers/fMoRFpred/.
- 72.Morffy N, Van den Broeck L, Miller C, Emenecker RJ, Bryant JA Jr, Lee TM, et al. Identification of plant transcriptional activation domains. Nature. 2024;632(8023):166–73. doi: 10.1038/s41586-024-07707-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Piovesan D, Del Conte A, Clementel D, Monzon AM, Bevilacqua M, Aspromonte MC, et al. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res. 2023;51:D438-44. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.AlphaFold. Google Colab. Available from: https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb. [Google Scholar]
- 75.Meng EC, Goddard TD, Pettersen EF, Couch GS, Pearson ZJ, Morris JH, et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 2023;32(11):e4792. doi: 10.1002/pro.4792 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Annotation from 3D structure data. Available from: https://www.jalview.org/help/html/features/xsspannotation.html.
- 77.Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. doi: 10.1038/s41586-021-03819-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Wilson CJ, Choy WY, Karttunen M. AlphaFold2: A role for disordered protein/region prediction? Int J Mol Sci. 2022;23:4591. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.RIDAO: Rapid Intrinsic Disorder Analysis Online. Available from: https://ridao.app/users/sign_in.
- 80.Dayhoff GW 2nd, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci. 2022;31(12):e4496. doi: 10.1002/pro.4496 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Shukla S, Lastorka SS, Uversky VN. Intrinsic Disorder and Phase Separation Coordinate Exocytosis, Motility, and Chromatin Remodeling in the Human Acrosomal Proteome. Proteomes. 2025;13(2):16. doi: 10.3390/proteomes13020016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Pappu Lab. Available from: https://pappulab.wustl.edu/CIDERinfo.html.
- 83.Holehouse AS, Das RK, Ahad JN, Richardson MOG, Pappu RV. CIDER: Resources to Analyze Sequence-Ensemble Relationships of Intrinsically Disordered Proteins. Biophys J. 2017;112(1):16–21. doi: 10.1016/j.bpj.2016.11.3200 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.NetPhos 3.1 - DTU Health Tech - Bioinformatic Services. Available from: https://services.healthtech.dtu.dk/services/NetPhos-3.1/.
- 85.Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 1999;294:1351–62. [DOI] [PubMed] [Google Scholar]
- 86.Blom N, Sicheritz-Pontén T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004;4(6):1633–49. doi: 10.1002/pmic.200300771 [DOI] [PubMed] [Google Scholar]
- 87.ATHENA. Available from: https://athena.proteomics.wzw.tum.de/master_arabidopsisshiny/.
- 88.Lin S, Wang C, Zhou J, Shi Y, Ruan C, Tu Y, et al. EPSD: a well-annotated data resource of protein phosphorylation sites in eukaryotes. Brief Bioinform. 2021;22(1):298–307. doi: 10.1093/bib/bbz169 [DOI] [PubMed] [Google Scholar]
- 89.Hatos A, Tosatto SCE, Vendruscolo M, Fuxreiter M. FuzDrop on AlphaFold: visualizing the sequence-dependent propensity of liquid-liquid phase separation and aggregation of proteins. Nucleic Acids Res. 2022;50(W1):W337–44. doi: 10.1093/nar/gkac386 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.FuzDrop. Available from: https://fuzdrop.bio.unipd.it/predictor
- 91.Hothorn T, Van De Wiel MA, Hornik K, Zeileis A. Implementing a class of permutation tests: The coin package. J Stat Softw. 2008;28:1–23.27774042 [Google Scholar]
- 92.R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. 2024. [Google Scholar]
- 93.GIMP Development Team. GNU Image Manipulation Program (GIMP), Version 3.0.4. Community. 2025. [Google Scholar]
- 94.Iakoucheva LM, Radivojac P, Brown CJ, O’Connor TR, Sikes JG, Obradovic Z, et al. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32(3):1037–49. doi: 10.1093/nar/gkh253 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Margulis L, Chapman MJ. Kingdoms and Domains: An Illustrated Guide to the Phyla of Life on Earth, Fourth Edition. Elsevier. 2009. [Google Scholar]
- 96.Alvarez-Buylla ER, García-Ponce B, Garay-Arroyo A. Unique and redundant functional domains of APETALA1 and CAULIFLOWER, two recently duplicated Arabidopsis thaliana floral MADS-box genes. J Exp Bot. 2006;57(12):3099–107. doi: 10.1093/jxb/erl081 [DOI] [PubMed] [Google Scholar]
- 97.Trojanowski J, Frank L, Rademacher A, Mücke N, Grigaitis P, Rippe K. Transcription activation is enhanced by multivalent interactions independent of phase separation. Mol Cell. 2022;82(10):1878–1893.e10. doi: 10.1016/j.molcel.2022.04.017 [DOI] [PubMed] [Google Scholar]
- 98.Liu J, Perumal NB, Oldfield CJ, Su EW, Uversky VN, Dunker AK. Intrinsic disorder in transcription factors. Biochemistry. 2006;45(22):6873–88. doi: 10.1021/bi0602718 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Dyson HJ, Wright PE. Role of Intrinsic Protein Disorder in the Function and Interactions of the Transcriptional Coactivators CREB-binding Protein (CBP) and p300. J Biol Chem. 2016;291(13):6714–22. doi: 10.1074/jbc.R115.692020 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Vandenbussche M, Theissen G, Van de Peer Y, Gerats T. Structural diversification and neo-functionalization during floral MADS-box gene evolution by C-terminal frameshift mutations. Nucleic Acids Res. 2003;31(15):4401–9. doi: 10.1093/nar/gkg642 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Cho S, Jang S, Chae S, Chung KM, Moon YH, An G, et al. Analysis of the C-terminal region of Arabidopsis thaliana APETALA1 as a transcription activation domain. Plant Mol Biol. 1999;40(3):419–29. doi: 10.1023/a:1006273127067 [DOI] [PubMed] [Google Scholar]
- 102.Kaufmann K, Melzer R, Theissen G. MIKC-type MADS-domain proteins: structural modularity, protein interactions and network evolution in land plants. Gene. 2005;347(2):183–98. doi: 10.1016/j.gene.2004.12.014 [DOI] [PubMed] [Google Scholar]
- 103.Egea-Cortines M, Saedler H, Sommer H. Ternary complex formation between the MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. EMBO J. 1999;18(19):5370–9. doi: 10.1093/emboj/18.19.5370 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Sridhar VV, Surendrarao A, Liu Z. APETALA1 and SEPALLATA3 interact with SEUSS to mediate transcription repression during flower development. Development. 2006;133(16):3159–66. doi: 10.1242/dev.02498 [DOI] [PubMed] [Google Scholar]
- 105.Shiraishi H, Okada K, Shimura Y. Nucleotide sequences recognized by the AGAMOUS MADS domain of Arabidopsis thaliana in vitro. Plant J. 1993;4(2):385–98. doi: 10.1046/j.1365-313x.1993.04020385.x [DOI] [PubMed] [Google Scholar]
- 106.Mizukami Y, Ma H. Determination of Arabidopsis floral meristem identity by AGAMOUS. Plant Cell. 1997;9(3):393–408. doi: 10.1105/tpc.9.3.393 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Puranik S, Acajjaoui S, Conn S, Costa L, Conn V, Vial A, et al. Structural basis for the oligomerization of the MADS domain transcription factor SEPALLATA3 in Arabidopsis. Plant Cell. 2014;26(9):3603–15. doi: 10.1105/tpc.114.127910 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Amoutzias GD, Robertson DL, Van de Peer Y, Oliver SG. Choose your partners: dimerization in eukaryotic transcription factors. Trends Biochem Sci. 2008;33(5):220–9. doi: 10.1016/j.tibs.2008.02.002 [DOI] [PubMed] [Google Scholar]
- 109.Martin JF, Miano JM, Hustad CM, Copeland NG, Jenkins NA, Olson EN. A Mef2 gene that generates a muscle-specific isoform via alternative mRNA splicing. Mol Cell Biol. 1994;14(3):1647–56. doi: 10.1128/mcb.14.3.1647-1656.1994 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Molkentin JD, Li L, Olson EN. Phosphorylation of the MADS-Box transcription factor MEF2C enhances its DNA binding activity. J Biol Chem. 1996;271(29):17199–204. doi: 10.1074/jbc.271.29.17199 [DOI] [PubMed] [Google Scholar]
- 111.Wang X, She H, Mao Z. Phosphorylation of neuronal survival factor MEF2D by glycogen synthase kinase 3beta in neuronal apoptosis. J Biol Chem. 2009;284(47):32619–26. doi: 10.1074/jbc.M109.067785 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Darling AL, Uversky VN. Intrinsic disorder and posttranslational modifications: The darker side of the biological dark matter. Front Genet. 2018;4:9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Jin F, Grater F. How multisite phosphorylation impacts the conformations of intrinsically disordered proteins. PLoS Comput Biol. 2021;17:e1008939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Nosella ML, Forman-Kay JD. Phosphorylation-dependent regulation of messenger RNA transcription, processing and translation within biomolecular condensates. Curr Opin Cell Biol. 2021;69:30–40. doi: 10.1016/j.ceb.2020.12.007 [DOI] [PubMed] [Google Scholar]
- 115.Shnitkind S, Martinez-Yamout MA, Dyson HJ, Wright PE. Structural Basis for Graded Inhibition of CREB:DNA Interactions by Multisite Phosphorylation. Biochemistry. 2018;57(51):6964–72. doi: 10.1021/acs.biochem.8b01092 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.De Bodt S, Raes J, Van de Peer Y, Theissen G. And then there were many: MADS goes genomic. Trends Plant Sci. 2003;8(10):475–83. doi: 10.1016/j.tplants.2003.09.006 [DOI] [PubMed] [Google Scholar]
- 117.Veron AS, Kaufmann K, Bornberg-Bauer E. Evidence of interaction network evolution by whole-genome duplications: a case study in MADS-box proteins. Mol Biol Evol. 2007;24(3):670–8. doi: 10.1093/molbev/msl197 [DOI] [PubMed] [Google Scholar]
- 118.Strader L, Weijers D, Wagner D. Plant transcription factors - being in the right place with the right company. Curr Opin Plant Biol. 2022;65:102136. doi: 10.1016/j.pbi.2021.102136 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Zahn LM, Leebens-Mack JH, Arrington JM, Hu Y, Landherr LL, DePamphilis CW. Conservation and divergence in the AGAMOUS subfamily of MADS-box genes: evidence of independent sub- and neofunctionalization events. Evol Dev. 2006;8:30–45. [DOI] [PubMed] [Google Scholar]
- 120.Bemer M, Wolters-Arts M, Grossniklaus U, Angenent GC. The MADS domain protein DIANA acts together with AGAMOUS-LIKE80 to specify the central cell in Arabidopsis ovules. Plant Cell. 2008;20(8):2088–101. doi: 10.1105/tpc.108.058958 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Colombo M, Masiero S, Vanzulli S, Lardelli P, Kater MM, Colombo L. AGL23, a type I MADS-box gene that controls female gametophyte and embryo development in Arabidopsis. Plant J. 2008;54(6):1037–48. doi: 10.1111/j.1365-313X.2008.03485.x [DOI] [PubMed] [Google Scholar]
- 122.Kang IH, Steffen JG, Portereiko MF, Lloyd A, Drews GN. The AGL62 MADS domain protein regulates cellularization during endosperm development in Arabidopsis. Plant Cell. 2008;20:635–47. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Köhler C, Hennig L, Spillane C, Pien S, Gruissem W, Grossniklaus U. The Polycomb-group protein MEDEA regulates seed development by controlling expression of the MADS-box gene PHERES1. Genes Dev. 2003;17:1540–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Zhang WJ, Zhou Y, Zhang Y, Su YH, Xu T. Protein phosphorylation: A molecular switch in plant signaling. Cell Rep. 2023;42(7):112729. doi: 10.1016/j.celrep.2023.112729 [DOI] [PubMed] [Google Scholar]
- 125.Borcherds W, Bremer A, Borgia MB, Mittag T. How do intrinsically disordered protein regions encode a driving force for liquid-liquid phase separation? Curr Opin Struct Biol. 2021;67:41–50. doi: 10.1016/j.sbi.2020.09.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Pei G, Lyons H, Li P, Sabari BR. Transcription regulation by biomolecular condensates. Nat Rev Mol Cell Biol. 2025;26(3):213–36. doi: 10.1038/s41580-024-00789-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Sabari BR. Biomolecular Condensates and Gene Activation in Development and Disease. Dev Cell. 2020;55(1):84–96. doi: 10.1016/j.devcel.2020.09.005 [DOI] [PubMed] [Google Scholar]
- 128.Wei M-T, Chang Y-C, Shimobayashi SF, Shin Y, Strom AR, Brangwynne CP. Nucleated transcriptional condensates amplify gene expression. Nat Cell Biol. 2020;22(10):1187–96. doi: 10.1038/s41556-020-00578-6 [DOI] [PubMed] [Google Scholar]
- 129.Zarin T, Strome B, Nguyen Ba AN, Alberti S, Forman-Kay JD, Moses AM. Proteome-wide signatures of function in highly diverged intrinsically disordered regions. Elife. 2019;8:e46883. doi: 10.7554/eLife.46883 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Singleton MD, Eisen MB. Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. PLoS Comput Biol. 2024;20(4):e1012028. doi: 10.1371/journal.pcbi.1012028 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Mann R, Notani D. Transcription factor condensates and signaling driven transcription. Nucleus. 2023;14:2205758. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132.Boija A, Klein IA, Sabari BR, Dall’Agnese A, Coffey EL, Zamudio AV, et al. Transcription Factors Activate Genes through the Phase-Separation Capacity of Their Activation Domains. Cell. 2018;175(7):1842-1855.e16. doi: 10.1016/j.cell.2018.10.042 [DOI] [PMC free article] [PubMed] [Google Scholar]






