Abstract
Deleted in breast cancer 1 (DBC1, CCAR2, KIAA1967) is a large, predominantly nuclear, multidomain protein that modulates gene expression by inhibiting several epigenetic modifiers, including the deacetylases SIRT1 and HDAC3, and the methyltransferase SUV39H1. DBC1 shares many highly conserved protein domains with its paralog cell cycle and apoptosis regulator 1 (CCAR1, CARP-1). In this study, we examined the full-length sequential and structural properties of DBC1 and CCAR1 from multiple species and correlated these properties with evolution. Our data shows that the conserved domains shared between DBC1 and CCAR1 have similar domain structures, as well as similar patterns of predicted disorder in less-conserved intrinsically disordered regions. Our analysis indicates similarities between DBC1, CCAR1, and the nematode protein lateral signaling target 3 (LST-3), suggesting that DBC1 and CCAR1 may have evolved from LST-3. Our data also suggests that DBC1 emerged later in evolution than CCAR1. DBC1 contains regions that show less conservation across species as compared to the same regions in CCAR1, suggesting a continuously evolving scenario for DBC1. Overall, this study provides insight into the structure and evolution of DBC1 and CCAR1, which may impact future studies on the biological functions of these proteins.
1. Introduction
DBC1 (deleted in breast cancer 1, KIAA1967, CCAR2) and paralog CCAR1 (cell cycle and apoptosis regulator 1, CARP1) are emerging as important regulators for a variety of physiological processes. DBC1 was originally identified by its localization to a region of chromosome 8p21 that is homozygously deleted in breast cancer. However, DBC1 is not localized to the epicenter of the deletion and is therefore not the strongest candidate tumor suppressor gene in this region [1, 2]. DBC1 is now also called CCAR2 (cell cycle and apoptosis regulator 2) in order to distinguish it from an unrelated protein that is also named DBC1 (deleted in bladder cancer 1) [3]. DBC1 exerts some of its biological effects through interactions with protein modifying enzymes, including the deacetylases SIRT1 and HDAC3, and the methyltransferase SUV39H1 [4–7]. Through its many interactions, DBC1 regulates a variety of cellular processes including aging, metabolism, apoptosis, and stress response pathways [4, 7–11]. DBC1 studies are currently expanding to uncover new interacting partners and the possibility of roles in other biological processes.
CCAR1 is the paralog to DBC1 that was originally identified as a mediator of apoptosis in a process that involves sequestration of 14-3-3 and altered expression of multiple cell cycle regulatory genes [8, 9, 12]. CCAR1 can bind to the mediator complex and enhance transcription of estrogen receptor and glucocorticoid receptor target genes and can act as a coactivator for p53-dependent transcription [13]. CCAR1 can also cooperatively bind to DBC1 and synergistically enhance estrogen receptor function [14]. Thus, like DBC1, CCAR1 is also involved in a variety of cellular processes and functions together with DBC1 in some cases.
DBC1 and CCAR1 share many of the same functional domains including an S1-Like domain and a nuclear localization signal (NLS) on the N-terminus, a Leucine zipper (LZ) domain and a Nudix domain that are centrally located, and an EF-Hand domain and coiled-coil segments on the C-terminus [9, 14, 15]. Experimental evidence regarding the specific functions of the S1-Like, Nudix, and EF-Hand domains for both DBC1 and CCAR1 have not yet been determined. CCAR1 (1150aa) is a larger protein compared to DBC1 (923aa) due to the presence of two extra domains, including a centrally located SAP domain and an extra coiled-coil segment found immediately after the SAP domain.
The N-terminus of DBC1 (aa1–264) is the region where most of the currently known protein-protein interactions have been mapped. The S1-Like domain was originally identified in the ribosomal protein S1. Proteins that contain homology to this domain typically have RNA binding capabilities, suggesting evolution from an ancient nucleic acid binding protein [16]. The NLS is an important site for regulation via post-translational modifications, where acetylation can disrupt DBC1 translocation into the nucleus and ultimately inhibit nuclear interactions [17]. The LZ is a structural motif that functions as a dimerization domain and can bind to DNA to regulate gene expression in DBC1, but it is likely non-functional in CCAR1 [18]. DBC1 interactions with epigenetic modifiers, nuclear receptors, and mRNA splicing components all take place within the N-terminal area [4–7, 14, 17]. The DBC1/SIRT1 interaction has been highly studied due to the important role that DBC1 plays in inhibiting the epigenetic modifications that are regulated by SIRT1. Conflicting data points to either the LZ of DBC1 (aa243–264) [4, 14, 19, 20] or the N-terminal amino acids 1–240 as being critical for this interaction with SIRT1 [17].
The central region of DBC1 and CCAR1 contains a Nudix domain that is catalytically inactive due to the absence of key amino acid residues within the catalytic site. However, it has been suggested to play a role in sensing the products of the SIRT1 deacetylase reaction [9]. The SAP domain, specific to CCAR1, is also centrally located. This domain shares homology with a DNA binding motif commonly found in a diverse set of nuclear proteins that are typically involved in chromosomal organization [21].
The C-terminus of both DBC1 and CCAR1 contains an EF-Hand domain. EF-Hand domains bind to calcium ions and regulate gene expression but, similar to the Nudix domain, the EF-Hand of DBC1 and CCAR1 may not be functional because it is unlikely to bind to calcium ions [8]. Both proteins also contain a coiled-coil segment in the C-terminus, with an extra coiled-coil segment present in CCAR1. Coiled-coil regions are known to contain important protein interaction motifs [22]. The coiled-coil region of DBC1 has been shown to interact with only one protein thus far, the circadian cycle nuclear receptor Rev-erbα. Overexpression of DBC1 can enhance the stability and expression of Rev-erbα, ultimately affecting circadian oscillations and metabolism [23].
Evaluating the detailed evolutionary path that DBC1 and CCAR1 have taken, and the factors that have influenced the evolutionary path, will expand our knowledge base of these two proteins. The presence of highly conserved domains shared between DBC1 and CCAR1 indicates that DBC1 and CCAR1 may have evolved from one common ancestor. Corresponding to this, an evolutionary connection between DBC1 and CCAR1 has been reported in the large ortholog database OrthoDB [24]. Interestingly, DBC1 and CCAR1 are predicted to be intrinsically disordered as demonstrated in the D2P2 database of predicted disordered proteins [25]. As shown by previous studies on other proteins [26–36], evolutionary analysis that takes protein intrinsic disorder into account is particularly informative. Intrinsically disordered regions within proteins can be critical for protein function, as these regions are structurally flexible and are frequent sites of protein interactions and various modifications [37–40]. In many cases, intrinsically disordered regions are also associated with higher substitution rates [32, 41, 42].
In this study, we have predicted the structural properties of DBC1 and CCAR1, and our findings support the function of both proteins in many protein-protein interactions due to the high occurrence of disordered residues. Phylogenetic analysis predicts that both DBC1 and CCAR1 evolved from one common ancestor, the nematode protein LST-3. Collectively, this study provides insight into the future studies of DBC1 and CCAR1 by evaluating both evolution and structure.
2. Materials and Methods
2.1. DBC1 Homologs and Paralogs
BLASTP [43, 44] was used to align the human DBC1 (hDBC1) protein sequence (UniProt ID: Q8N163) against the entire UniProt database [45]. The obtained UniProt hits were filtered using a cutoff alignment score set at 10% of the hDBC1-hDBC1 alignment score based on trial and error. The 129 sequences that remained had a minimal sequence similarity of ~30% and a minimal sequence coverage of ~50% as compared to hDBC1, complying with previous studies [42]. After removing redundant sequences using BLASTClust and cutoff of 90% sequence identity, 93 sequences from 65 species were used for analysis (Table S1; Supplementary Material available online at http://dx.doi.org/10.1155/2014/418458). The 65 species include mammals, birds, reptiles, amphibians, fish, insects, and nematodes. The proteins found include DBC1, CCAR1, LST-3, and other undefined generic names. In addition, the complete proteomes of human, zebrafish, and C. elegans were downloaded from UniProt for comparative studies.
2.2. Disorder Prediction
PONDR-FIT [46] and PONDR-VLXT [47] were employed to run disorder prediction analysis. The disorder scores from both PONDR-FIT and PONDR-VLXT were used to measure the flexibility of the amino acids in all of the proteins examined. PONDR-FIT is one of the most accurate disorder predictors as it adopts the metapredictor strategy. Metapredictors are prevailing in the field of disorder prediction due to improved accuracy [46, 48]. Many state-of-the-art disorder predictors are metapredictors [48–53]. PONDR-FIT uses an artificial neural network to optimize the prediction results from six component predictors: PONDR-VLXT [47], PONDR-VSL2 [54, 55], PONDR-VL3 [56, 57], FoldIndex [58], IUPred [59], and TopIDP [60]. PONDR-FIT improves the prediction accuracy significantly in various testing datasets compared to its component predictors [46]. PONDR-VLXT is the first generation of disorder prediction software that is specifically designed to detect local flexibility of amino acid sequences. Although PONDR-VLXT is not the most accurate tool, it is still powerful due to its sensitivity to amino acid composition [46, 61]. PONDR-VLXT has been successfully applied in detecting linear interaction motifs (MoRFs) [62–64], which have proven to be extremely abundant in protein-protein interactions [65–68]. The values of predicted disorder scores were applied to measure the structural flexibility of proteins and the combination of disorder scores from PONDR-VLXT, PONDR-FIT, and other predictors, has been applied in many studies in order to explore a broad range of biological questions. Examples of these studies include methionine oxidation [69], phosphorylation [70], p53 evolution [42], binding motifs [71], iPS transcription factors [72], PTEN interactions [70], the spliceosome [73], the structural flexibility of viral proteins [72], and evolution across species [74, 75]. In this paper, we have also used predicted disorder scores to measure the structural flexibility of whole protein sequences along with local regions.
2.3. CH-CDF Analysis
The previously mentioned predictors were used to predict intrinsic disorder at a residue level. Intrinsic disorder can also be measured at the entire sequence level using a Charge-Hydropathy and Cumulative Distribution Function (CH-CDF) plot [51, 76]. A CH-CDF plot is composed using parameters from both a Charge-Hydropathy (CH) plot [77] and a Cumulative Distribution Function (CDF) plot [78]. In each of these two plots, structured proteins and disordered proteins stay in different regions and can be separated by a linear boundary line. The distance of a protein from the boundary line in each of them (dCH in CH plot and dCDF in CDF plot) describes the tendency of the protein being structured or disordered. The sign of the distance (positive or negative) shows whether the entire protein is structured or disordered. The performances of these two individual plots are often complementary. Therefore, their combination improves the prediction accuracy at the sequential level to 90% [51, 76]. In the CH-CDF plot, the directional distance dCH in the CH plot is set as the y-axis, and the directional distance dCDF in the CDF plot is used as the x-axis. As dCH and dCDF both have positive and negative values, the entire CH-CDF plot can be split into four quadrants using dCH = 0 and dCDF = 0: (1) Q1, dCH ≥ 0 & dCDF ≥ 0; (2) Q2, dCH < 0 & dCDF ≥ 0; (3) Q3, dCH < 0 & dCDF < 0; and (4) Q4, dCH ≥ 0 & dCDF < 0. By definition, proteins in Q2 are predicted to be structured. Proteins in Q3 and Q4 are disordered. Proteins in Q1 have excessive charged residues but can be structured.
2.4. Three-Dimensional Structure Prediction
The 3D structure of the structured domains for both human DBC1 and CCAR1 were built using HHpred [79], RaptorX [80], and I-TASSER [81]. Each structured domain has three predicted structures. The structure with the highest QMean score [82] was selected as the final predicted structure.
2.5. Phylogenetic Analysis
Mega5 [83] was used to run multiple sequence alignments and to analyze the phylogeny of the sequences. CLUSTALW was chosen to perform the multiple sequence alignment (default PAM matrix, Gap opening score 10, and Gap extension score 0.1). Nearest-neighboring algorithm was selected to analyze the phylogenetic relations among the sequences. The final phylogenetic tree was obtained by bootstrapping 2000 times.
2.6. Genome Neighborhood Analysis
We calculated the Conservation of Genomic Neighborhood (CGN) score [84, 85] of both DBC1 and CCAR1 for selected species, including mammals, birds, insects, reptiles, amphibians, and fish. When calculating the CGN score, all of the genes within a window of two million bases from the center of DBC1 or CCAR1 of that species were extracted from GeneBank. The number of total genes in the window of the human genome was defined as M HS, the number of common genes between human and another species X was counted as C X, and then the CGN score of species X was calculated as CGNX = C X/M HS. The human and mouse gene lists were also used to build a synteny plot [86–88].
2.7. Mutation Rate Analysis
The substitution rate between each group of domains between each species set is calculated as follows. (1) Two groups of sequences were aligned using CLUSTALW. (2) The domain structure of hCCAR1 was used to label the location of similar regions in all of the other sequences. (3) The amino acid of each sequence, on each site within a specific region, in the second group was compared with the amino acid sequence in the first group. If no match was found, a substitution was recorded for this site. (4) This process was repeated for each group with step (3) for all of the sequences in the second group. The final substitution frequency is S i for the ith site. (5) The sum was calculated for all of the substitutions for the sites in the entire region to get the total substitution ∑S i (i = 1,…, L), where L is the number of sites in this region. (6) For example, if there were M sequences in the first group and N sequences in the second group, the final substitution rate would be ∑S i/(M∗N∗L).
3. Results
3.1. DBC1 and CCAR1 Are Intrinsically Disordered Proteins with a Similar Domain Structure and a Similar Pattern of Predicted Intrinsic Disorder
Human DBC1 (hDBC1) and human CCAR1 (hCCAR1) have ~30% sequence similarity to each other (Figure S1) and share multiple highly conserved domains. Table 1 shows the sequential locations of the functional domains for both proteins and indicates the known or predicted functions of the domains. The sequence alignment of each domain is provided in Figures S2 and S3. The size difference between DBC1 and CCAR1 is the result of three segments found in hCCAR1 that hDBC1 lacks, including an elongated N-terminal disordered region, a SAP domain, and an extra coiled-coil segment in its C-terminus (CC1).
Table 1.
| |
Domain Function | |
| |
S1-Likea,b | Homology to an RNA-binding domain. |
NLSa,c | Nuclear localization signal. Acetylation of the NLS in DBC1 regulates nuclear localization. |
LZa,b,c,d | Likely non-functional in CCAR1. Regulation of a diverse set of cellular pathways in DBC1. |
Nudixb | Catalytically inactive hydrolase domain in DBC1 and CCAR1. Predicted to function as a sensor in DBC1 that may bind to NAD metabolites and regulate SIRT1. |
SAPa,b,d | Homology to a putative DNA-binding motif predicted to be involved in chromosomal organization. |
EF-Handa,b | Inactive variant of a calcium dependent regulator of multiple cellular processes. |
CCa,c,d | Predicted protein-protein interaction motif. |
In order to compare the similarities in conformational properties between hDBC1 and hCCAR1, the degree of predicted protein intrinsic disorder of these two proteins was analyzed (Figure 1). While 41% of hDBC1 is composed of disordered residues, 61% of residues are disordered in hCCAR1. As expected, a majority of the functional domains listed in Table 1, including the S1-Like, LZ, Nudix, and EF-Hand, are located in the structured segments of both proteins. The domains that are intrinsically disordered, or have a high degree of structural flexibility as indicated by higher disorder score, are the NLS and coiled-coil segments on hDBC1 and hCCAR1, as well as the SAP domain that is specific to hCCAR1. The predicted 3D structures of these functional domains indicate that many of these domains are structured and have limited flexibility.
Analysis of the disorder curves on the C-terminal residues (~400aa) of hDBC1 and hCCAR1 indicates a high degree of similarity on the curve of predicted protein intrinsic disorder in this region, thereby suggesting a similar conformational fluctuation and further functional role (Figure 1, shaded region). Corresponding to this finding, a sequence alignment between these two regions shows 30% sequence identity, indicating evolutionary conservation (Figure S3). The high degree of similarity between the structure of the conserved domains in hDBC1 and hCCAR1, along with the similarity in the C-terminal region, leads to the presumption that hDBC1 and hCCAR1 may share a common molecular origin.
In addition to hDBC1 and hCCAR1, we determined the sequence level of intrinsic disorder from other species using a CH-CDF plot. We carried out CH-CDF analysis for all of the proteins in our dataset and found that all DBC1 and CCAR1 proteins have large negative values (<−0.1) on CDF distance, indicating that these proteins are mostly intrinsically disordered (Figure 2). In terms of CH distances, CCAR1 and DBC1 proteins have varying distributions. Most CCAR1 proteins have a positive CH distance, while all DBC1 proteins have a negative CH distance. Detailed analysis indicates that many groups have localized distribution on this CH-CDF plot, such as amphibian CCAR1 and bird CCAR1. Mammalian CCAR1 proteins have the broadest distribution on CH distance, followed by CCAR1 proteins from aquatic animals. Compared to mammalian DBC1 proteins, mammalian CCAR1 proteins are more scattered in the CH-CDF plot, indicating more structural variability. Therefore, based on the algorithms of CH and CDF distance, it can be concluded that most CCAR1 proteins have extra charged residues, while almost all DBC1 proteins are more structure-prone.
3.2. Human DBC1 Shares Common Ancestry with the Nematode CCAR1 Ortholog LST-3
To study the evolutionary relationship between DBC1 and CCAR1, phylogenetic analysis was performed with the DBC1 homologs and paralogs listed in Table S1, and the results are shown in Figure 3. Most mammals have evolved to incorporate both DBC1 and CCAR1 into their genomes (Figure 3, purple shaded region). As a comparison, insects and nematodes have only incorporated CCAR1 into their genomes (Figure 3, light and dark blue shaded regions). Interestingly, the first known evolutionary appearance of DBC1 is in zebrafish (Figure 3, pink shaded region). This is a clear indication that DBC1 emerged later than CCAR1. Another interesting observation is that the nematode LST-3 proteins are more closely related to CCAR1 in lower species, such as insects. These observations have revealed an interesting evolutionary picture of DBC1/CCAR1/LST-3, where DBC1 has evolved from CCAR1 and CCAR1 originated from LST-3.
To compare homology between nematode LST-3 and hDBC1/hCCAR1, a cross-validation between the nematode proteome and the human proteome was performed. Specifically, the nematode Caenorhabditis elegans protein LST-3 was aligned against the complete human proteome and the only two significant hits (E value <1.0e − 20) were hDBC1 and hCCAR1. Conversely, both hDBC1 and hCCAR1 protein sequences were aligned against the complete C. elegans proteome, and the only significant hit was LST-3. This result shows that the only possible ancestor of DBC1 and CCAR1 in higher species is nematode LST-3.
To further explore the evolutionary path of LST-3, CCAR1, and DBC1, we compared the full-length sequences of C. elegans LST-3, zebrafish CCAR1 (zCCAR1), and zebrafish DBC1 (zDBC1), as the first emergence of DBC1 is in zebrafish. The resulting gapped-disorder curve of this comparison is shown in Figure 4. A gapped disorder curve aligns proteins by the shape of the curve based on the disorder score. Instead of presenting the actual sequence similarity, the gapped disorder curve describes the overall similarity of the flexibility of the protein segments, which indicates any evolutionary gaps that may be present between proteins [72, 89, 90].
The structural similarities shared between zDBC1, zCCAR1, and LST-3 include a structured segment present at aa200 (Figure 4, shaded area 1), a similar disordered curvature spanning from aa200 to 400 (Figure 4, shaded area 2), the fluctuating peaks from aa500 to 600, and the increasingly disordered C-termini beginning at aa1100 and continuing to the end of the proteins (Figure 4, shaded area 3). The N-termini of zDBC1 and LST-3 have two gapped regions (pink lines) that are located immediately before and after the S1-Like domain. zCCAR1 contains two other gapped segments similar to zDBC1, one near the center of the protein that corresponds to the conserved Nudix domain and another on the C-terminus that does not correspond to a well-defined functional region. When compared to either zCCAR1 or LST-3, zDBC1 contains four additional gapped regions, thus indicating a difference of four insertions or deletions throughout evolution. The locations of these insertions and deletions are roughly in line with the gapped segments in the sequence alignment provided in Figures S4 and S5.
3.3. DBC1 Is More Conserved than CCAR1
The conservation of the genomic neighborhood (CGN) of DBC1 and CCAR1 was calculated for mammals, birds, insects, reptile, amphibians, and fish (Figure 5). The DBC1 gene is not present in birds, insects, or amphibians, but in the species where DBC1 gene is present, the CGN score of the DBC1 gene for that species is higher than that of CCAR1. This is another indication that the genomic region surrounding the CCAR1 gene is less conserved than the DBC1 gene.
Two synteny plots comparing the human and mouse genes for DBC1 and CCAR1 are illustrated in Figure 6. In these plots, more conserved genes can be found in the neighboring region of DBC1 as compared to CCAR1, also indicating that DBC1 is more conserved than CCAR1.
3.4. CCAR1 Appeared before DBC1 in Evolution
To further examine the variability of insertions and deletions in DBC1 and CCAR1, the amino acid substitution rate of each conserved domain across various groups of species was analyzed (Figure 7). The overall mutation rate for CCAR1 is approximately 20% from amphibian to human, while DBC1 has a relatively high substitution rate of about 50% from insect to mammal and 30% from Therapsida to primate. Even after DBC1 becomes more conserved after Therapsida, the substitution rate of various domains in DBC1 is still approximately 10% higher than that of the corresponding CCAR1 domains. Also, the mutation rate of each domain for DBC1 in fish is similar to that of insects for CCAR1, indicating a similar trend during the beginning of the evolutionary process between both proteins, with CCAR1 evolving first.
Examination of each domain individually reveals that the evolutionary process has varied between both proteins. Interestingly, the fish DBC1 protein contains a SAP domain that then disappears from amphibian and onward. The remaining S1-Like, NLS, LZ, Nudix, and CC2 domains continue to have high mutation rates from fish to amphibian, while the EF-Hand domain remains relatively conserved. Variation continues to be observed between all domains until a gradual steadying of mutation rates occurs in Therapsida and continues until human.
The mutation rate of each domain in CCAR1 exhibits a different trend. The mutation rate of the S1-Like, LZ, EF-Hand, SAP, NLS, and CC2 domains in CCAR1 varies from insects to fish, while the mutation rate between all domains decreases from fish to amphibian. Limited changes occur from amphibians to mammals in all domains except for the NLS and CC1 domains, where the mutation rate decreases more dramatically. This is followed by a further increase in mutation rates in all domains from mammals to Therapsida, with an eventual leveling off of the mutation rates in humans. This data further supports CCAR1 appearing first in evolution, not only by appearing in insects before DBC1, but also by becoming conserved much earlier in evolution.
3.5. DBC1 and CCAR1 Exhibit Similar Domain Flexibility
To determine if the mutation rate of each domain has affected protein structure and flexibility in the evolutionary process, the average disorder score of each domain across different species was compiled (Figure 8). The overall disorder scores between the various domains of DBC1 and CCAR1 are very similar. The NLS and CC2 domains are disordered in both proteins, while the Nudix, S1-Like, EF-Hand, and LZ are ordered domains. Also, the SAP domain in DBC1 that is only present in fish is highly disordered, corresponding with the intrinsic disorder of the SAP domain throughout CCAR1 evolution. Even though this overall similarity in structure exists, differences can be seen in the trend of intrinsic disorder across evolution between each domain in DBC1 and CCAR1. The S1-Like, Nudix, LZ, and EF-Hand domains in CCAR1 tend to become less structured throughout evolution, while in DBC1 the same domains become more structured.
During the evolutionary process, it appears that some domains underwent a drastic change in structural flexibility that is measured by predicted disorder score. In DBC1, the S1-Like and LZ domains decreased in structural flexibility from fish to amphibian, whereas the opposite trend is observed in CCAR1. The NLS and CC2 domains of DBC1 tend to drastically change in structural flexibility where a vibration pattern can be observed, before eventual leveling off into a disordered structure in primates, and carrying over into humans. In CCAR1, structural flexibility has increased in the SAP domain from fish to amphibian, The NLS domain had a sudden decrease in structural flexibility from mammals to Therapsida, before increasing from primate to human.
4. Discussion
DBC1 and CCAR1 are emerging as important regulators in a number of cellular pathways. DBC1 and CCAR1 share a similar domain structure, indicating they may have similar biological functions. The similar domain structure shared between the two multidomain proteins may also indicate origination from a common ancestor [91]. Therefore, it is important to investigate the process by which the functional domains of these two proteins have evolved. We predict here for the first time that DBC1 and CCAR1 are comprised of mostly intrinsically disordered regions, and that several of the functional domains in these two proteins are intrinsically disordered. These findings provide support for the role of these two proteins in many molecular interactions, as intrinsically disordered regions are frequent sites for protein-protein interactions [65]. In our prediction, the extended N-terminus of CCAR1 is much more flexible than that of DBC1, as denoted by a higher disorder score, which may indicate a unique functional role for this region. Corresponding to this, CCAR1 has recently been shown to have distinct functions apart from DBC1, such as binding directly to the LZ domain of DBC1 and synergistically regulating the function of DBC1 [14]. This regulatory ability would require a unique region capable of binding directly to DBC1, which our analysis suggests may be on the N-terminal domain.
It is often found that two protein paralogs develop throughout evolution from one common ancestor [92]. Supporting this notion, our analysis of DBC1 and CCAR1 from multiple species has established evidence that both proteins have evolved from one common nematode ancestor, LST-3. We have shown that CCAR1 first appears in insects, while DBC1 first appears much later in fish. Further supporting this claim is the fact that the DBC1 protein in fish is the only species in which DBC1 contains a SAP domain, a domain typically only found in CCAR1 proteins.
The SAP domain is a DNA/RNA binding domain with about 40 amino acids. The core structure of this domain is a two- or three-helix bundle. There are currently two X-ray structures of SAP domains in the PDB; however, in these two X-ray structures, the SAP domains are in complexes with other proteins and RNA. Therefore, these two X-ray structures of SAP domains cannot be used to assess the actual structural flexibility of the SAP domain by itself. There are also several other NMR structures for SAP domains in the PDB database. From these structures, it is clear that the SAP domain has huge structural flexibility. The RMSD values of structural alignment between these structures from different NMR experiments are very large. Therefore, results from NMR experiments provide evidence that the SAP domain is flexible. In addition, the length of the SAP domain is less than 40 amino acids, indicating that the SAP domain may not have enough hydrophobic interaction by itself. Further, in our prediction, it is clear that SAP domain is likely to be in a “dip” indicating a structure-prone tendency; however, this “dip” is flanked by two long disordered regions. Therefore, the possible 3D structural picture of the SAP domain is that this domain forms a small and flexible hydrophobic core and sits in the middle of a long disordered region. The SAP domain of LST-3 was likely passed on to the fish DBC1 protein in the early evolutionary history of DBC1. Other species may not have required this particular domain in DBC1 and hence continued to evolve without it. All of these findings combined suggest that LST-3 is the common ancestor and ortholog, to both DBC1 and CCAR1.
A gapped disorder plot shows that zDBC1 contains four additional gapped regions when compared to zCCAR1 and LST-3, indicating that a difference of four insertions/deletions and/or substitutions allowed for the first evolutionary appearance of DBC1 in zebrafish. As mutation rate is linked to evolution, understanding the mutation rates of these proteins can help to decipher their evolutionary history. We see that CCAR1 emerges earlier in evolution in insects and that it becomes relatively conserved by the amphibian period with the exception of the CC1 and the CC2 regions, which were still undergoing evolution. DBC1, on the other hand, took much longer to become relatively conserved, and still has not yet reached the low mutation rate of CCAR1, indicating the possibility of acquiring new functional roles in the future evolutionary period. DBC1 becomes relatively conserved in Therapsida. Taking into consideration that Therapsida appeared about 300 million years later than amphibians, this provides evidence that DBC1 arose much later in evolution than CCAR1, further supporting the notion that a CCAR1 homolog gave rise to DBC1 over time.
Domain-level analysis provides yet more information on the correlation between protein flexibility and evolution. The disorder scores for the NLS and CC2 domains in DBC1 have undergone drastic changes in the evolutionary process. This may indicate the sudden acquisition of a hydrophobic core, and thus an increased protein-binding ability. However, since this new function may not be essential for the species, the acquired hydrophobic binding sites may disappear in the evolutionary process.
The LZ domain of DBC1 has been implicated in a variety of regulatory processes. The LZ domain is a heptad repeat of leucine residues, which represents the hydrophobic core of a coiled-coil formed by two different chains. The basic regions next to the LZ domain along the coiled-coil can interact with the major groove of DNA to regulate the process of gene expression. A substitution in heptad repeats from hydrophobic to less-hydrophobic or hydrophilic and charged residues will lessen the hydrophobic interaction and distort the structure of the coiled-coil and thereby prevent DNA binding ability. Therefore, in order to keep the function of the LZ domain, all of the mutations on this domain should be hydrophobic-dominant and thus be more structure-prone with a lower disorder score. In DBC1, the LZ domain tends to become more structure-prone. Conversely, the LZ in CCAR1 loses structure slightly from insect to fish. It is interesting to see this deviation since DBC1 first appears in fish. The presence of DBC1 in fish may have resulted in a decreased requirement for the LZ domain in CCAR1, ultimately affecting the current functional role of the LZ in modern DBC1 and CCAR1. This same concept is also applicable to all of the structured domains as well.
Overall, our data presents new findings on the structure and evolution of DBC1 and CCAR1. Our findings support the function of both proteins in many protein-protein interactions due to the high occurrence of disordered residues. We have found an unstructured region on the N-terminus of CCAR1 that may be responsible for unique protein interactions independent of DBC1. Similarly, we have determined that the LZ of DBC1 may be involved in unique interactions, as the LZ of CCAR1 has become unstructured and possibly nonfunctional throughout evolution. We see that CCAR1 appeared much earlier in the process of evolution as compared to DBC1, and that the nematode LST-3 protein may be the common ancestor of DBC1 and CCAR1. As the nematode C. elegans is a model organism frequently used in experimental biology; this work may help to broaden DBC1 and CCAR1 studies by demonstrating the important role that nematodes have had in the evolution of these two proteins. Specifically, the nematode LST-3 protein may have undergone multiple insertions and deletions to give rise to modern-day CCAR1 and DBC1.
Supplementary Material
Acknowledgments
This work was supported by start-up grants from the Department of Cell Biology, Microbiology and Molecular Biology and College of Arts and Sciences at the University of South Florida to Sandy D. Westerheide and Bin Xue.
Abbreviations
- DBC1:
Deleted in breast cancer 1
- CCAR1:
Cell cycle and apoptosis regulator 1
- LST-3:
Lateral signaling target 3
- hDBC1:
Human DBC1
- hCCAR1:
Human CCAR1
- zDBC1:
Zebrafish DBC1
- zCCAR1:
Zebrafish CCAR1
- CC1:
Coiled-coil 1
- CC2:
Coiled-coil 2
- NLS:
Nuclear localization signal
- LZ:
Leucine zipper
- SIRT1:
Silent mating type information regulation 2 homolog
- HDAC3:
Histone deacetylase 3
- aa:
amino acid(s).
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
- 1.Hamaguchi M., Meth J. L., Von Klitzing C., Wei W., Esposito D., Rodgers L., Walsh T., Welcsh P., King M.-C., Wigler M. H. DBC2, a candidate for a tumor suppressor gene involved in breast cancer. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(21):13647–13652. doi: 10.1073/pnas.212516099. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Kim J.-E., Chen J., Lou Z. p30 DBC is a potential regulator of tumorigenesis. Cell Cycle. 2009;8(18):2932–2935. doi: 10.4161/cc.8.18.9473. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Gray K. A., Daugherty L. C., Gordon S. M., Seal R. L., Wright M. W., Bruford E. A. Genenames.org: the HGNC resources in 2013. Nucleic Acids Research. 2013;41(1):D545–D552. doi: 10.1093/nar/gks1066. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Kim J. E., Chen J., Lou Z. DBC1 is a negative regulator of SIRT1. Nature. 2008;451(7178):583–586. doi: 10.1038/nature06500. [DOI] [PubMed] [Google Scholar]
- 5.Zhao W., Kruse J.-P., Tang Y., Jung S. Y., Qin J., Gu W. Negative regulation of the deacetylase SIRT1 by DBC1. Nature. 2008;451(7178):587–590. doi: 10.1038/nature06515. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Li Z., Chen L., Kabra N., Wang C., Fang J., Chen J. Inhibition of SUV39H1 methyltransferase activity by DBC1. Journal of Biological Chemistry. 2009;284(16):10361–10366. doi: 10.1074/jbc.M900956200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Chini C. C. S., Escande C., Nin V., Chini E. N. HDAC3 is negatively regulated by the nuclear protein DBC1. The Journal of Biological Chemistry. 2010;285(52):40830–40837. doi: 10.1074/jbc.M110.153270. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Sundararajan R., Chen G., Mukherjee C., White E. Caspase-dependent processing activates the proapoptotic activity of deleted in breast cancer-1 during tumor necrosis factor-alpha-mediated death signaling. Oncogene. 2005;24(31):4908–4920. doi: 10.1038/sj.onc.1208681. [DOI] [PubMed] [Google Scholar]
- 9.Anantharaman V., Aravind L. Analysis of DBC1 and its homologs suggests a potential mechanism for regulation of sirtuin domain deacetylases by NAD metabolites. Cell Cycle. 2008;7(10):1467–1472. doi: 10.4161/cc.7.10.5883. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Raynes R., Pombier K. M., Nguyen K., Brunquell J., Mendez J. E., Westerheide S. D. The SIRT1 modulators AROS and DBC1 regulate HSF1 activity and the heat shock response. PLoS ONE. 2013;8(1) doi: 10.1371/journal.pone.0054364.e54364 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Park S. H., Riley P., IV, Frisch S. M. Regulation of anoikis by deleted in breast cancer-1 (DBC1) through NF-κB. Apoptosis. 2013;18(8):949–962. doi: 10.1007/s10495-013-0847-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Rishi A. K., Zhang L., Boyanapalli M., Wali A., Mohammad R. M., Yu Y., Fontana J. A., Hatfield J. S., Dawson M. I., Majumdar A. P. N., Reichert U. Identification and characterization of a cell cycle and apoptosis regulatory protein-1 as a novel mediator of apoptosis signaling by retinoid CD437. The Journal of Biological Chemistry. 2003;278(35):33422–33435. doi: 10.1074/jbc.M303173200. [DOI] [PubMed] [Google Scholar]
- 13.Kim J. H., Yang C. K., Heo K., Roeder R. G., An W., Stallcup M. R. CCAR1, a key regulator of mediator complex recruitment to nuclear receptor transcription complexes. Molecular Cell. 2008;31(4):510–519. doi: 10.1016/j.molcel.2008.08.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Ji Yu E., Kim S. H., Heo K., Ou C. Y., Stallcup M. R., Kim J. H. Reciprocal roles of DBC1 and SIRT1 in regulating estrogen receptor α activity and co-activator synergy. Nucleic Acids Research. 2011;39(16):6932–6943. doi: 10.1093/nar/gkr347. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Marchler-Bauer A., Lu S., Anderson J. B., Chitsaz F., Derbyshire M. K., deWeese-Scott C., Fong J. H., Geer L. Y., Geer R. C., Gonzales N. R., Gwadz M., Hurwitz D. I., Jackson J. D., Ke Z., Lanczycki C. J., Lu F., Marchler G. H., Mullokandov M., Omelchenko M. V., Robertson C. L., Song J. S., Thanki N., Yamashita R. A., Zhang D., Zhang N., Zheng C., Bryant S. H. CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Research. 2011;39(1):D225–D229. doi: 10.1093/nar/gkq1189. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Bycroft M., Hubbard T. J. P., Proctor M., Freund S. M. V., Murzin A. G. The solution structure of the S1 RNA binding domain: a member of an ancient nucleic acid-binding fold. Cell. 1997;88(2):235–242. doi: 10.1016/S0092-8674(00)81844-9. [DOI] [PubMed] [Google Scholar]
- 17.Hubbard B. P., Loh C., Gomes A. P., et al. Carboxamide SIRT1 inhibitors block DBC1 binding via an acetylation-independent mechanism. Cell Cycle. 2013;12(14):2233–2240. doi: 10.4161/cc.25268. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Alber T. Structure of the leucine zipper. Current Opinion in Genetics and Development. 1992;2(2):205–210. doi: 10.1016/S0959-437X(05)80275-8. [DOI] [PubMed] [Google Scholar]
- 19.Nin V., Escande C., Chini C. C., Giri S., Camacho-Pereira J., Matalonga J., Lou Z., Chini E. N. Role of Deleted in Breast Cancer 1 (DBC1) protein in SIRT1 deacetylase activation induced by protein kinase A and AMP-activated protein kinase. The Journal of Biological Chemistry. 2012;287(28):23489–23501. doi: 10.1074/jbc.M112.365874. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Kang H., Suh J. Y., Jung Y. S., Jung J. W., Kim M. K., Chung J. H. Peptide switch is essential for Sirt1 deacetylase activity. Molecular Cell. 2011;44(2):203–213. doi: 10.1016/j.molcel.2011.07.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Aravind L., Koonin E. V. SAP - A putative DNA-binding motif involved in chromosomal organization. Trends in Biochemical Sciences. 2000;25(3):112–114. doi: 10.1016/S0968-0004(99)01537-6. [DOI] [PubMed] [Google Scholar]
- 22.Neukirch S., Goriely A., Hausrath A. C. Chirality of coiled coils: Elasticity matters. Physical Review Letters. 2008;100(3) doi: 10.1103/PhysRevLett.100.038105.038105 [DOI] [PubMed] [Google Scholar]
- 23.Chini C. C. S., Escande C., Nin V., Chini E. N. DBC1 (Deleted in Breast Cancer 1) modulates the stability and function of the nuclear receptor Rev-erbα . Biochemical Journal. 2013;451(3):453–461. doi: 10.1042/BJ20121085. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Waterhouse R. M., Tegenfeldt F., Li J., Zdobnov E. M., Kriventseva E. V. OrthoDB: a hierarchical catalog of animal, fungal and bacterial orthologs. Nucleic Acids Research. 2013;41(1):D358–D365. doi: 10.1093/nar/gks1116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Oates M. E., Romero P., Ishida T., Ghalwash M., Mizianty M. J., Xue B., Dosztányi Z., Uversky V. N., Obradovic Z., Kurgan L., Dunker A. K., Gough J. D2P2: Database of disordered protein predictions. Nucleic Acids Research. 2013;41(1):D508–D516. doi: 10.1093/nar/gks1226. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Chen J. W., Romero P., Uversky V. N., Dunker A. K. Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions. Journal of Proteome Research. 2006;5(4):879–887. doi: 10.1021/pr060048x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Chen J. W., Romero P., Uversky V. N., Dunker A. K. Conservation of intrinsic disorder in protein domains and families: II. Functions of conserved disorder. Journal of Proteome Research. 2006;5(4):888–898. doi: 10.1021/pr060049p. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Daughdrill G. W., Narayanaswami P., Gilmore S. H., Belczyk A., Brown C. J. Dynamic behavior of an intrinsically unstructured linker domain is conserved in the face of negligible amino acid sequence conservation. Journal of Molecular Evolution. 2007;65(3):277–288. doi: 10.1007/s00239-007-9011-2. [DOI] [PubMed] [Google Scholar]
- 29.Brown C. J., Johnson A. K., Daughdrill G. W. Comparing models of evolution for ordered and disordered proteins. Molecular Biology and Evolution. 2010;27(3):609–621. doi: 10.1093/molbev/msp277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Wrabl J. O., Gu J., Liu T., Schrank T. P., Whitten S. T., Hilser V. J. The role of protein conformational fluctuations in allostery, function, and evolution. Biophysical Chemistry. 2011;159(1):129–141. doi: 10.1016/j.bpc.2011.05.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Brown C. J., Johnson A. K., Dunker A. K., Daughdrill G. W. Evolution and disorder. Current Opinion in Structural Biology. 2011;21(3):441–446. doi: 10.1016/j.sbi.2011.02.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Jeong C.-S., Kim D. Coevolved residues and the functional association for intrinsically disordered proteins. Pacific Symposium on Biocomputing. 2012:140–151. [PubMed] [Google Scholar]
- 33.Chemes L. B., Glavina J., Alonso L. G., Marino-Buslje C., de Prat-Gay G., Sánchez I. E. Sequence evolution of the intrinsically disordered and globular domains of a model viral oncoprotein. PLoS ONE. 2012;7(10) doi: 10.1371/journal.pone.0047661.e47661 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Mahani A., Henriksson J., Wright A. P. H. Origins of Myc proteins—using intrinsic protein disorder to trace distant relatives. PLoS ONE. 2013;8(9) doi: 10.1371/journal.pone.0075057.e75057 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Borcherds W., Kashtanov S., Wu H., Daughdrill G. W. Structural divergence is more extensive than sequence divergence for a family of intrinsically disordered proteins. Proteins: Structure, Function and Bioinformatics. 2013;81(10):1686–1698. doi: 10.1002/prot.24303. [DOI] [PubMed] [Google Scholar]
- 36.Peng Z., Oldfield C. J., Xue B., Mizianty M. J., Dunker A. K., Kurgan L., Uversky V. N. A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome. Cellular and Molecular Life Sciences. 2014;71(8):1477–1504. doi: 10.1007/s00018-013-1446-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Dunker A. K., Lawson J. D., Brown C. J., Williams R. M., Romero P., Oh J. S., Oldfield C. J., Campen A. M., Ratliff C. M., Hipps K. W., Ausio J., Nissen M. S., Reeves R., Kang C., Kissinger C. R., Bailey R. W., Griswold M. D., Chiu W., Garner E. C., Obradovic Z. Intrinsically disordered protein. Journal of Molecular Graphics and Modelling. 2001;19(1):26–59. doi: 10.1016/S1093-3263(00)00138-8. [DOI] [PubMed] [Google Scholar]
- 38.Iakoucheva L. M., Brown C. J., Lawson J. D., Obradović Z., Dunker A. K. Intrinsic disorder in cell-signaling and cancer-associated proteins. Journal of Molecular Biology. 2002;323(3):573–584. doi: 10.1016/S0022-2836(02)00969-5. [DOI] [PubMed] [Google Scholar]
- 39.Dunker A. K., Cortese M. S., Romero P., Iakoucheva L. M., Uversky V. N. Flexible nets: the roles of intrinsic disorder in protein interaction networks. FEBS Journal. 2005;272(20):5129–5148. doi: 10.1111/j.1742-4658.2005.04948.x. [DOI] [PubMed] [Google Scholar]
- 40.Uversky V. N., Oldfield C. J., Dunker A. K. Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. Journal of Molecular Recognition. 2005;18(5):343–384. doi: 10.1002/jmr.747. [DOI] [PubMed] [Google Scholar]
- 41.Szalkowski A. M., Anisimova M. Markov models of amino acid substitution to study proteins with intrinsically disordered regions. PLoS ONE. 2011;6(5) doi: 10.1371/journal.pone.0020488.e20488 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Xue B., Brown C. J., Dunker A. K., Uversky V. N. Intrinsically disordered regions of p53 family are highly diversified in evolution. Biochimica et Biophysica Acta. 2013;1834(4):725–738. doi: 10.1016/j.bbapap.2013.01.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J. Basic local alignment search tool. Journal of Molecular Biology. 1990;215(3):403–410. doi: 10.1006/jmbi.1990.9999. [DOI] [PubMed] [Google Scholar]
- 44.Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T. L. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10, article 421 doi: 10.1186/1471-2105-10-421. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Magrane M., Consortium U. P. UniProt Knowledgebase: a hub of integrated protein data. Database: The Journal of Biological Databases and Curation. 2011;2011 doi: 10.1093/database/bar009.bar009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Xue B., Dunbrack R. L., Williams R. W., Dunker A. K., Uversky V. N. PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochimica et Biophysica Acta: Proteins and Proteomics. 2010;1804(4):996–1010. doi: 10.1016/j.bbapap.2010.01.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Romero P., Obradovic Z., Li X., Brown C. J., Dunker A. K. Sequence complexity of disordered protein. Proteins. 2001;42(1):38–48. doi: 10.1002/1097-0134(20010101)42:1<38::aid-prot50>3.0.co;2-3. [DOI] [PubMed] [Google Scholar]
- 48.Schlessinger A., Punta M., Yachdav G., Kajan L., Rost B. Improved disorder prediction by combination of orthogonal approaches. PLoS ONE. 2009;4(2) doi: 10.1371/journal.pone.0004433.e4433 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Ishida T., Kinoshita K. Prediction of disordered regions in proteins based on the meta approach. Bioinformatics. 2008;24(11):1344–1348. doi: 10.1093/bioinformatics/btn195. [DOI] [PubMed] [Google Scholar]
- 50.Deng X., Eickholt J., Cheng J. PreDisorder: ab initio sequence-based prediction of protein disordered regions. BMC Bioinformatics. 2009;10, article 436 doi: 10.1186/1471-2105-10-436. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Xue B., Oldfield C. J., Dunker A. K., Uversky V. N. CDF it all: consensus prediction of intrinsically disordered proteins based on various cumulative distribution functions. FEBS Letters. 2009;583(9):1469–1474. doi: 10.1016/j.febslet.2009.03.070. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Mizianty M. J., Stach W., Chen K., Kedarisetti K. D., Disfani F. M., Kurgan L. Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources. Bioinformatics. 2010;26(18):i489–i496. doi: 10.1093/bioinformatics/btq373. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Kozlowski L. P., Bujnicki J. M. MetaDisorder: a meta-server for the prediction of intrinsic disorder in proteins. BMC Bioinformatics. 2012;13, article 111 doi: 10.1186/1471-2105-13-111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Vucetic S., Brown C. J., Dunker A. K., Obradovic Z. Flavors of protein disorder. Proteins: Structure, Function and Genetics. 2003;52(4):573–584. doi: 10.1002/prot.10437. [DOI] [PubMed] [Google Scholar]
- 55.Peng K., Radivojac P., Vucetic S., Dunker A. K., Obradovic Z. Length-dependent prediction of protein in intrinsic disorder. BMC Bioinformatics. 2006;7, article 208 doi: 10.1186/1471-2105-7-208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Obradovic Z., Peng K., Vucetic S., Radivojac P., Brown C. J., Dunker A. K. Predicting intrinsic disorder from amino acid sequence. Proteins: Structure, Function and Genetics. 2003;53(supplement 6):566–572. doi: 10.1002/prot.10532. [DOI] [PubMed] [Google Scholar]
- 57.Peng K., Vucetic S., Radivojac P., Brown C. J., Dunker A. K., Obradovic Z. Optimizing long intrinsic disorder predictors with protein evolutionary information. Journal of Bioinformatics and Computational Biology. 2005;3(1):35–60. doi: 10.1142/S0219720005000886. [DOI] [PubMed] [Google Scholar]
- 58.Prilusky J., Felder C. E., Zeev-Ben-Mordehai T., et al. FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics. 2005;21(16):3435–3438. doi: 10.1093/bioinformatics/bti537. [DOI] [PubMed] [Google Scholar]
- 59.Dosztányi Z., Csizmok V., Tompa P., Simon I. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics. 2005;21(16):3433–3434. doi: 10.1093/bioinformatics/bti541. [DOI] [PubMed] [Google Scholar]
- 60.Campen A., Williams R. M., Brown C. J., Meng J., Uversky V. N., Dunker A. K. TOP-IDP-scale: a new amino acid scale measuring propensity for intrinsic disorder. Protein and Peptide Letters. 2008;15(9):956–963. doi: 10.2174/092986608785849164. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.He B., Wang K., Liu Y., Xue B., Uversky V. N., Dunker A. K. Predicting intrinsic disorder in proteins: an overview. Cell Research. 2009;19(8):929–949. doi: 10.1038/cr.2009.87. [DOI] [PubMed] [Google Scholar]
- 62.Cheng Y., Oldfield C. J., Meng J., Romero P., Uversky V. N., Dunker A. K. Mining α-helix-forming molecular recognition features with cross species sequence alignments. Biochemistry. 2007;46(47):13468–13477. doi: 10.1021/bi7012273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Mohan A., Oldfield C. J., Radivojac P., Vacic V., Cortese M. S., Dunker A. K., Uversky V. N. Analysis of molecular recognition features (MoRFs) Journal of Molecular Biology. 2006;362(5):1043–1059. doi: 10.1016/j.jmb.2006.07.087. [DOI] [PubMed] [Google Scholar]
- 64.Oldfield C. J., Cheng Y., Cortese M. S., Romero P., Uversky V. N., Dunker A. K. Coupled folding and binding with α-helix-forming molecular recognition elements. Biochemistry. 2005;44(37):12454–12470. doi: 10.1021/bi050736e. [DOI] [PubMed] [Google Scholar]
- 65.Hsu W.-L., Oldfield C., Meng J., Huang F., Xue B., Uversky V. N., Romero P., Dunker A. K. Intrinsic protein disorder and protein-protein interactions. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing. 2012:116–127. [PubMed] [Google Scholar]
- 66.Hsu W.-L., Oldfield C. J., Xue B., Meng J., Huang F., Romero P., Uversky V. N., Dunker A. K. Exploring the binding diversity of intrinsically disordered proteins involved in one-to-many binding. Protein Science. 2013;22(3):258–273. doi: 10.1002/pro.2207. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Gypas F., Tsaousis G. N., Hamodrakas S. J. MpMoRFsDB: a database of molecular recognition features in membrane proteins. Bioinformatics. 2013;29(19):2517–2518. doi: 10.1093/bioinformatics/btt427. [DOI] [PubMed] [Google Scholar]
- 68.Kotta-Loizou I., Tsaousis G. N., Hamodrakas S. J. Analysis of molecular recognition features (MoRFs) in membrane proteins. Biochimica et Biophysica Acta: Proteins and Proteomics. 2013;1834(4):798–807. doi: 10.1016/j.bbapap.2013.01.006. [DOI] [PubMed] [Google Scholar]
- 69.Xue B., Oldfield C. J., Van Y.-Y., Dunker A. K., Uversky V. N. Protein intrinsic disorder and induced pluripotent stem cells. Molecular BioSystems. 2012;8(1):134–150. doi: 10.1039/c1mb05163f. [DOI] [PubMed] [Google Scholar]
- 70.Malaney P., Pathak R. R., Xue B., Uversky V. N., Davé V. Intrinsic disorder in PTEN and its interactome confers structural plasticity and functional versatility. Scientific Reports. 2013;3, article 2035 doi: 10.1038/srep02035. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Xue B., Dunker A. K., Uversky V. Retro-MoRFS: Identifying protein binding sites by normal and reverse alignment and intrinsic disorder prediction. International Journal of Molecular Sciences. 2010;11(10):3725–3747. doi: 10.3390/ijms11103725. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Xue B., Mizianty M. J., Kurgan L., Uversky V. N. Protein intrinsic disorder as a flexible armor and a weapon of HIV-1. Cellular and Molecular Life Sciences. 2012;69(8):1211–1259. doi: 10.1007/s00018-011-0859-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Ribeiro M. D. L. C., Espinosa J., Islam S., Martinez O., Thanki J. J., Mazariegos S., Nguyen T., Larina M., Xue B., Uversky V. N. Malleable ribonucleoprotein machine: protein intrinsic disorder in the Saccharomyces cerevisiae spliceosome. PeerJ. 2013;2013(1, article e2) doi: 10.7717/peerj.2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Xue B., Williams R. W., Oldfield C. J., Dunker A. K., Uversky V. N. Archaic chaos: intrinsically disordered proteins in Archaea. BMC Systems Biology. 2010;4(1, article S1) doi: 10.1186/1752-0509-4-S1-S1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Xue B., Dunker A. K., Uversky V. N. Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life. Journal of Biomolecular Structure and Dynamics. 2012;30(2):137–149. doi: 10.1080/07391102.2012.675145. [DOI] [PubMed] [Google Scholar]
- 76.Oldfield C. J., Cheng Y., Cortese M. S., Brown C. J., Uversky V. N., Bunker A. K. Comparing and combining predictors of mostly disordered proteins. Biochemistry. 2005;44(6):1989–2000. doi: 10.1021/bi047993o. [DOI] [PubMed] [Google Scholar]
- 77.Uversky V. N., Gillespie J. R., Fink A. L. Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins: Structure, Function and Genetics. 2000;41(3):415–427. doi: 10.1002/1097-0134(20001115)41:3<415::aid-prot130>3.0.co;2-7. [DOI] [PubMed] [Google Scholar]
- 78.Dunker A. K., Obradovic Z., Romero P., Garner E. C., Brown C. J. Intrinsic protein disorder in complete genomes. Genome Informatics. 2000;11:161–171. [PubMed] [Google Scholar]
- 79.Söding J., Biegert A., Lupas A. N. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Research. 2005;33(2):W244–W248. doi: 10.1093/nar/gki408. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Källberg M., Wang H. P., Wang S., et al. Template-based protein structure modeling using the RaptorX web server. Nature Protocols. 2012;7(8):1511–1522. doi: 10.1038/nprot.2012.085. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Roy A., Kucukural A., Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nature Protocols. 2010;5(4):725–738. doi: 10.1038/nprot.2010.5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Benkert P., Tosatto S. C. E., Schomburg D. QMEAN: a comprehensive scoring function for model quality assessment. Proteins: Structure, Function and Genetics. 2008;71(1):261–277. doi: 10.1002/prot.21715. [DOI] [PubMed] [Google Scholar]
- 83.Tamura K., Peterson D., Peterson N., Stecher G., Nei M., Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution. 2011;28(10):2731–2739. doi: 10.1093/molbev/msr121. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Rogozin I. B., Makarova K. S., Wolf Y. I., Koonin E. V. Computational approaches for the analysis of gene neighbourhoods in prokaryotic genomes. Briefings in Bioinformatics. 2004;5(2):131–149. doi: 10.1093/bib/5.2.131. [DOI] [PubMed] [Google Scholar]
- 85.De S., Teichmann S. A., Babu M. M. The impact of genomic neighborhood on the evolution of human and chimpanzee transcriptome. Genome Research. 2009;19(5):785–794. doi: 10.1101/gr.086165.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Poyatos J. F., Hurst L. D. The determinants of gene order conservation in yeasts. Genome Biology. 2007;8(11, article R233) doi: 10.1186/gb-2007-8-11-r233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Engström P. G., Sui S. J. H., Drivenes Ø., Becker T. S., Lenhard B. Genomic regulatory blocks underlie extensive microsynteny conservation in insects. Genome Research. 2007;17(12):1898–1908. doi: 10.1101/gr.6669607. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Dawson D. A., Åkesson M., Burke T., Pemberton J. M., Slate J., Hansson B. Gene order and recombination rate in homologous chromosome regions of the chicken and a passerine bird. Molecular Biology and Evolution. 2007;24(7):1537–1552. doi: 10.1093/molbev/msm071. [DOI] [PubMed] [Google Scholar]
- 89.Sun X., Xue B., Jones W. T., Rikkerink E., Dunker A. K., Uversky V. N. A functionally required unfoldome from the plant kingdom: intrinsically disordered N-terminal domains of GRAS proteins are involved in molecular recognition during plant development. Plant Molecular Biology. 2011;77(3):205–223. doi: 10.1007/s11103-011-9803-z. [DOI] [PubMed] [Google Scholar]
- 90.Wood M., Rae G. M., Wu R.-M., Walton E. F., Xue B., Hellens R. P., Uversky V. N. Actinidia DRM1—an intrinsically disordered protein whose mRNA expression is inversely correlated with spring budbreak in kiwifruit. PLoS ONE. 2013;8(3) doi: 10.1371/journal.pone.0057354.e57354 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Hegyi H., Gerstein M. Annotation transfer for genomics: measuring functional divergence in multi-domain proteins. Genome Research. 2001;11(10):1632–1640. doi: 10.1101/gr.183801. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Ispolatov I., Yuryev A., Mazo I., Maslov S. Binding properties and evolution of homodimers in protein-protein interaction networks. Nucleic Acids Research. 2005;33(11):3629–3635. doi: 10.1093/nar/gki678. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.