Abstract
Protein phosphorylation plays an important role in the regulation of protein function. Phosphorylated residues are generally assumed to be subject to functional constraint, but it has recently been suggested from a comparison of distantly related vertebrate species that most phosphorylated residues evolve at rates consistent with those of the surrounding regions. To resolve the controversy, we infer the ancestral phosphoproteome of human and mouse to compare the evolutionary rates of phosphorylated and nonphosphorylated serine (S), threonine (T), and tyrosine (Y) residues. This approach enables accurate estimation of evolutionary rates as it does not assume deep conservation of phosphorylated residues. We show that phosphorylated S/T residues tend to evolve more slowly than nonphosphorylated S/T residues not only in disordered but also in ordered protein regions, indicating evolutionary conservation of phosphorylated S/T residues in mammals. Thus, phosphorylated S/T residues tend to be subject to stronger functional constraint than nonphosphorylated residues regardless of the protein regions in which they reside. In contrast, phosphorylated Y residues evolve at rates similar to those of nonphosphorylated ones. We also find that the human lineage has gained more phosphorylated T residues and lost fewer phosphorylated Y residues than the mouse lineage. The cause of the gain/loss imbalance remains unknown but is worth exploring.
Keywords: phosphorylated residue, protein disordered region, evolutionary rate, functional constraint
Introduction
Protein phosphorylation plays an important role in protein function (Trinidad et al. 2008) and is critical to many biological processes such as intracellular signal transduction pathways (Shaywitz et al. 2002) and subcellular protein localization (Blenis and Resh 1993). Many large-scale studies have recently been conducted to identify phosphorylated residues and to study the function and evolution of protein phosphorylation (Iakoucheva et al. 2004; Gnad et al. 2007; Jimenez et al. 2007; Diella et al. 2008; Keshava Prasad et al. 2009; Landry et al. 2009). However, conflicting conclusions on the evolutionary conservation of phosphorylated residues have been reported. Phosphorylated residues have been reported to be better conserved than nonphosphorylated residues (Gnad et al. 2007; Aivaliotis et al. 2009). Moreover, Boekhorst et al. (2008) found significant overlaps of phosphorylated residues among species when studying the phosphoproteomes of six eukaryotes. In contrast, using the human phosphoproteome and the protein sequences of other vertebrates to estimate the evolutionary rates of phosphorylated and nonphosphorylated residues, Landry et al. (2009) concluded that these two types of residues evolved at comparable rates, though phosphorylated residues located in disordered regions seem to have evolved more slowly. However, because these authors used the human phosphoproteome data to infer the ancestral phosphorylation status of amino acid residues in the common ancestor of vertebrates, including zebrafish, the estimated evolutionary rates of phosphorylated residues may not be accurate.
The purpose of the present study was twofold. First, we wished to resolve the above controversy. To this end, we estimated the evolutionary rates of phosphorylated and nonphosphorylated serine (S), threonine (T), and tyrosine (Y) residues in the human and mouse lineages since their divergence, which is much more recent than the common ancestor of vertebrates, making the estimation of evolutionary rates easier. We reconstructed phosphorylated and nonphosphorylated S, T, and Y residues in the common ancestor of human, mouse, and dog, using opossum as the outgroup. We chose human and mouse because of the availability of extensive phosphorylation data in these two species, and we included dog because its inclusion could help infer the ancestral status of the analyzed residues. Our analysis scheme avoids the assumption of deep conservation of phosphorylated and nonphosphorylated residues. Moreover, we conducted another analysis in which we treated structurally ordered and disordered protein regions separately. Phosphorylated residues tend to be more solvent accessible (Gnad et al. 2007) and to be located in disordered regions (Iakoucheva et al. 2004; Jimenez et al. 2007). Furthermore, residues on the protein surface or in disordered regions tend to evolve faster (Brown et al. 2002; Dunker et al. 2002; Linding et al. 2003; Lin et al. 2007). Landry et al. (2009) ascribed the high evolutionary rate of phosphorylated residues to the lack of function of most of these residues in disordered regions and to the tendency of phosphorylated residues to be located in disordered or solvent-accessible regions. We addressed this issue by classifying S, T, and Y residues on the basis of their surface accessibility and disorder potential. Second, we wanted to infer the lineage-specific gains and losses of phosphorylated residues in the human and mouse lineages because such gains or losses might indicate divergence in protein function.
Materials and Methods
Data Set
The one-to-one orthologous protein sequences in human, mouse, dog, and opossum were downloaded from Ensembl v.53 (http://www.ensembl.org/). The orthologous protein pairs with <50% identity between human and any one of the other three species were excluded. The experimentally verified phosphorylated residues of human and mouse proteins were downloaded from UniProtKB v.15.1 (http://www.uniprot.org/), Phospho.ELM v.8.2 (Diella et al. 2008), and HPRD (Keshava Prasad et al. 2009). A total of 3,526 human–mouse–dog–opossum orthologous phosphoprotein groups were thus selected for subsequent analysis.
Reconstruction of Ancestral Protein Sequences
The ancestral sequences of human, mouse, and dog were reconstructed using PAML (Yang 1997) with human and mouse clustered in one clade and opossum as an outgroup species (fig. 1a). We used ProbCons (Do et al. 2005) with default parameters to align the 3,526 groups of orthologous phosphoproteins. For comparison, the same sequences were also aligned using MUSCLE (Edgar 2004). The aligned sequences were subsequently concatenated and analyzed using PAML to estimate the branch lengths of the phylogenetic tree of the four species; trees with the branch lengths estimated from individual proteins were also reconstructed for comparison. We used the resulting tree to reconstruct the ancestral sequence of each orthologous protein group separately using PAML. Moreover, as the phylogenetic relationship among human, mouse, and dog is not completely certain, we also reconstructed the ancestral sequences under the alternative phylogenetic tree in which human and dog were clustered in one clade. The PAML parameters used were as follows: aaRatefile = wag.dat, fix_alpha = 0, alpha = 0.04, cleandata = 0, and fix_blength = 2. The other parameters were left at their default values.
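As an illustration only, the stated PAML settings could be written into a codeml control file along the following lines. This is a minimal sketch: the five settings in `STATED` come from the text, whereas the file names and the remaining options in `ASSUMED` (`runmode`, `seqtype`, `model`, `RateAncestor`) are assumptions about a typical amino acid ancestral reconstruction run, not settings reported here.

```python
# Settings reported in the text.
STATED = {
    "aaRatefile": "wag.dat",
    "fix_alpha": 0,
    "alpha": 0.04,
    "cleandata": 0,
    "fix_blength": 2,
}

# Assumptions for illustration: file names are hypothetical, and the
# option values are a guess at a typical amino acid reconstruction setup.
ASSUMED = {
    "seqfile": "group.phy",     # hypothetical alignment of one ortholog group
    "treefile": "species.tre",  # hypothetical tree file with branch lengths
    "outfile": "group.out",     # hypothetical output file
    "runmode": 0,               # user-supplied tree
    "seqtype": 2,               # amino acid sequences
    "model": 2,                 # empirical model read from aaRatefile
    "RateAncestor": 1,          # request ancestral sequence reconstruction
}

# Topology described in the text: human and mouse in one clade, opossum as
# the outgroup. Branch lengths would have to be added before use with
# fix_blength = 2.
TREE_TOPOLOGY = "(((human,mouse),dog),opossum);"


def render_ctl(stated=STATED, assumed=ASSUMED):
    """Return control-file text, one 'name = value' line per option."""
    merged = {**assumed, **stated}  # stated values win on any overlap
    width = max(len(name) for name in merged)
    lines = [f"{name:>{width}} = {value}" for name, value in merged.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_ctl())
```

The alternative topology mentioned above, with human and dog in one clade, would correspond to `(((human,dog),mouse),opossum);` under the same assumptions.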
FIG. 1.
Reconstruction of ancestral phosphorylated residues. (a) Phylogenetic tree with branch length proportional to the number of expected substitutions per amino acid position estimated by PAML, using a concatenated alignment of 3,526 groups of human–mouse–dog–opossum orthologous phosphoproteins. The scale bar stands for 0.02 expected substitutions per site in the aligned regions. The oval denotes the common ancestor of human, dog, and mouse. (b) Classification of ancestral residues using serine as an example. Sp, Sn, S, Ns, and SE, respectively, stand for “serine residue that is predicted to be phosphorylated,” “serine residue that is not predicted to be phosphorylated,” “serine residue whose phosphorylation status is unknown,” “non-serine residue,” and “serine residue that is experimentally validated as phosphorylated.” A (B) is a conserved (one-change) ancestral phosphorylated residue, with a human or mouse descendant residue experimentally validated as phosphorylated; C (D) is a conserved (one-change) ancestral nonphosphorylated residue, with neither the human nor the mouse descendant residue experimentally validated as phosphorylated; E (F) is a conserved (one-change) ancestral phosphorylated residue, with neither the human nor the mouse descendant residue experimentally validated as phosphorylated; G (H) is a conserved (one-change) ancestral nonphosphorylated residue, with a human or mouse descendant residue experimentally validated as phosphorylated.
Prediction of Ancestral Phosphorylated and Nonphosphorylated Residues
Phosphorylation can occur at a serine (S), threonine (T), or tyrosine (Y) residue. We identified phosphorylated and nonphosphorylated residues in the ancestral sequences using two phosphorylated residue prediction tools, GPS2.1 (Xue et al. 2008) and KinasePhos 2.0 (Wong et al. 2007). The prediction threshold value for GPS2.1 was set to be “high.” For KinasePhos 2.0, two different threshold values were tried: “default” and “80% specificity.” Our inference of ancestral status of phosphorylation should be largely reliable because the phosphorylated residues predicted by the two tools are highly correlated (supplementary table S22, Supplementary Material online). An ancestral S/T/Y residue can fall into one and only one of the following four categories: “P” (it is predicted to be phosphorylated in the ancestral sequence by at least one prediction tool and the human or mouse residue has been experimentally validated as phosphorylated; A and B in fig. 1b), “NP” (it is not predicted to be phosphorylated in the ancestral sequence by at least one of the prediction tools and it is not known to be phosphorylated in either human or mouse; C and D in fig. 1b), “WP” (it is predicted to be phosphorylated in the ancestral sequence by both prediction tools and it is not known to be phosphorylated in either human or mouse; E and F in fig. 1b), and “HP” (it is not predicted to be phosphorylated in the ancestral sequence by at least one of the prediction tools and the human or mouse residue has been experimentally validated as phosphorylated; G and H in fig. 1b). However, the number of residues in the HP category is negligible (from 0 to 170, depending on the type of residue and prediction thresholds). Therefore, only residues in the P, NP, and WP categories are included in this study.
We also defined “NP-A” as the S/T/Y residues that are not known to be phosphorylated in either human or mouse phosphoproteins. It is generally assumed that most of these S/T/Y residues that are not known to be phosphorylated in phosphoproteins are “not-phosphorylated” (Gnad et al. 2007; Wong et al. 2007; Xue et al. 2008; Landry et al. 2009). Therefore, NP-A is the union set of NP and WP.
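The category assignment can be sketched as a small decision function. This is a hedged illustration, not the authors' implementation: the function and argument names are hypothetical, and one assumption is made so that every residue falls into exactly one category, namely that an experimentally validated residue predicted by exactly one tool is assigned to P, leaving HP for validated residues predicted by neither tool.

```python
def classify_ancestral_residue(pred_gps, pred_kinasephos, validated):
    """Assign an ancestral S/T/Y residue to P, NP, WP, or HP.

    pred_gps, pred_kinasephos: True if the respective tool predicts the
        ancestral residue to be phosphorylated.
    validated: True if the human or mouse descendant residue has been
        experimentally validated as phosphorylated.
    """
    n_predicted = int(pred_gps) + int(pred_kinasephos)
    if validated:
        # Assumption: one positive prediction is enough for P, so HP is
        # reserved for validated residues that neither tool predicts.
        return "P" if n_predicted >= 1 else "HP"
    # Not known to be phosphorylated in either species.
    return "WP" if n_predicted == 2 else "NP"


def is_np_a(category):
    """NP-A is the union of NP and WP."""
    return category in {"NP", "WP"}
```

For example, a residue predicted by both tools but validated in neither species is assigned to WP and therefore also belongs to NP-A.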
Calculation of the Rates of Change at Phosphorylated and Nonphosphorylated Residues
An ancestral S/T/Y residue was classified into one of the following three groups according to the level of conservation in human, mouse, and dog: 1) conserved sites: all three eutherian mammals and their ancestor have the same type of residue; 2) one-change sites: only two of the three eutherians have the same type of residue as the ancestor; 3) two-change sites: only human or mouse has the same type of residue as the ancestor. For two-change sites, the sites that changed in both human and mouse are excluded. If these sites are included, they will all be classified into NP or NP-A because of the lack of phosphorylation data of dog, and accordingly, the rate of change for NP or NP-A sites will be overestimated. After excluding these sites, the vast majority of the two-change sites are not derived from the shared changes before the divergence of human and mouse. This is the reason why we double the number of two-change sites when calculating R2 (see below). We calculated the rate of change of phosphorylated and nonphosphorylated residues separately as follows:
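Rate of change (one change):
$$R_1 = \frac{N_1}{N_c + N_1}$$
Rate of change (one and two changes):
$$R_2 = \frac{N_1 + 2N_2}{N_c + N_1 + N_2}$$
Here $N_c$, $N_1$, and $N_2$ denote the observed numbers of conserved, one-change, and two-change sites, respectively; these definitions reproduce the R1 and R2 values listed in table 1.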
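The site classification and the two rates can be sketched as follows. The rate formulas used here, R1 = N1 / (Nc + N1) and R2 = (N1 + 2 N2) / (Nc + N1 + N2), are an assumption inferred from the counts in table 1 and from the statement that two-change sites are doubled for R2; the function names are hypothetical.

```python
def classify_site(ancestor, human, mouse, dog):
    """Return 'conserved', 'one-change', 'two-change', or None if excluded.

    Residues are one-letter amino acid codes. Sites that changed in both
    human and mouse are excluded, as described in the text.
    """
    h, m, d = human == ancestor, mouse == ancestor, dog == ancestor
    if not h and not m:
        return None
    n_same = h + m + d  # number of species retaining the ancestral residue
    return {3: "conserved", 2: "one-change", 1: "two-change"}[n_same]


def rates_of_change(n_conserved, n_one, n_two):
    """Return (R1, R2) under the assumed formulas stated above."""
    r1 = n_one / (n_conserved + n_one)
    r2 = (n_one + 2 * n_two) / (n_conserved + n_one + n_two)
    return r1, r2


if __name__ == "__main__":
    # Serine residues in the P category, counts taken from table 1.
    r1, r2 = rates_of_change(7225, 774, 83)
    print(f"R1 = {r1:.3f}, R2 = {r2:.3f}")
```

With the serine P counts from table 1, the sketch gives values that round to the tabulated R1 and R2.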
We also used the probability of the ancestral state (from PAML) to determine the probability of a residue change and calculate the number of expected conserved (Ec), one-change (E1), and two-change (E2) sites as follows:
[Equations defining Ec, E1, and E2; not recoverable from the source]
where i, j, and k are the numbers of conserved, one-change, and two-change sites, respectively.
Prediction of Surface Accessibility and Disordered Regions
We predicted the surface accessibility and disorder potential of the ancestral residues using SABLE 2.0 (Wagner et al. 2005) and DISOPRED2 (version 2.2) (Ward et al. 2004), respectively, both with default parameters. We defined a residue as having high accessibility if its SABLE prediction score was ≥3 because the accessibility prediction scores for most residues fell between 0 and 5 (0 means fully buried and 9 means fully exposed). Furthermore, the average prediction scores of phosphorylated and nonphosphorylated residues were around 3.8 and 3.0, respectively (Gnad et al. 2007). The overall results remained the same when another SABLE prediction score cutoff (≥4) was tried. The false-positive prediction rate for DISOPRED2 was set to 5%.
Detection of Lineage-Specific Gains and Losses of Phosphorylated Residues
By aligning the sequences of human, mouse, and the human–mouse–dog common ancestor, we could identify human-specific gains (H-G), mouse-specific losses (M-L), mouse-specific gains (M-G), and human-specific losses (H-L) of phosphorylated residues. An H-G residue was defined as one where the human residue was experimentally verified to be phosphorylated, but neither the corresponding mouse residue nor the ancestral residue was S, T, or Y. An M-L residue was defined as one where the human residue was experimentally identified as phosphorylated and the aligned ancestral residue was also predicted to be phosphorylated, but the mouse residue was not S, T, or Y. The M-G and H-L residues were defined in a similar manner.
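These definitions can be sketched as a single function applied to each aligned position. The H-G and M-L rules follow the text; treating M-G and H-L as the mirror images obtained by swapping the two species is an assumption, and all names are hypothetical.

```python
STY = frozenset("STY")


def classify_event(focal, focal_validated, other_res, ancestral_res,
                   ancestral_predicted):
    """Classify one aligned position as a lineage-specific gain or loss.

    focal: 'H' or 'M', the species whose phosphorylation data are used.
    focal_validated: True if the focal residue is experimentally verified
        as phosphorylated.
    other_res: aligned residue in the other species (one-letter code).
    ancestral_res: reconstructed ancestral residue (one-letter code).
    ancestral_predicted: True if the ancestral residue is predicted to be
        phosphorylated.
    Returns 'H-G', 'M-L', 'M-G', 'H-L', or None.
    """
    if focal not in ("H", "M"):
        raise ValueError("focal must be 'H' or 'M'")
    if not focal_validated:
        return None
    other = "M" if focal == "H" else "H"
    # Gain in the focal lineage: neither the other species nor the
    # ancestor carries S, T, or Y at this position.
    if other_res not in STY and ancestral_res not in STY:
        return f"{focal}-G"
    # Loss in the other lineage: the ancestor is predicted to be
    # phosphorylated, but the other species no longer has S, T, or Y.
    if ancestral_predicted and other_res not in STY:
        return f"{other}-L"
    return None
```

For instance, `classify_event("H", True, "A", "S", True)` describes a validated human phosphoserine with a predicted ancestral phosphoserine and an alanine in mouse, which the sketch labels as a mouse-specific loss.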
Results and Discussion
Evolutionary Conservation of Phosphorylated Serine and Threonine Residues
It is generally assumed that most of the serine, threonine, or tyrosine residues that are not known to be phosphorylated in phosphoproteins are “nonphosphorylated” (Gnad et al. 2007; Wong et al. 2007; Xue et al. 2008; Landry et al. 2009). Accordingly, we first compared the ancestral residues with a human or mouse descendant residue that is experimentally validated as phosphorylated (P) with the ancestral residues whose human and mouse descendants are both not experimentally verified as phosphorylated in phosphoproteins (NP-A) (see Materials and Methods). In total, we identified 8,082 (210,214), 1,872 (130,341), and 1,791 (65,991) serine, threonine, and tyrosine residues in the P (NP-A) category, respectively (table 1). To compare the level of evolutionary conservation between P and NP-A residues, we calculated the fractions of these two types of residues that had changed since human, mouse, and dog diverged from their common ancestor. We first focused on one-change sites because the identification of these sites was more reliable and because one-change sites were more abundant than two-change sites. We found that the S and T residues in the P category were, on average, significantly more conserved than the corresponding NP-A residues (S: P < 2.20 × 10−16; T: P = 4.31 × 10−8, by Fisher's exact test, table 1). When two-change sites were included, the same trend was also observed (table 1). In contrast, phosphorylated and nonphosphorylated Y residues evolve, on average, at similar rates (P = 0.745, table 1).
Table 1.
The Numbers and Rates of Change of Phosphorylated Residues (P) and of Residues Not Predicted to Be Phosphorylated (NP) in Mammals.
| Residue | Category | No. of Conserved Sites | No. of One-Change Sites | No. of Two-Change Sites | Total Sites | R1^a | R2^a |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Serine | P | 7,225 | 774 | 83 | 8,082 | 0.097 | 0.116 |
| | NP | 40,326 | 5,843 | 610 | 46,779 | 0.127 | 0.151 |
| | NP-A^b | 175,170 | 31,247 | 3,797 | 210,214 | 0.151 | 0.185 |
| Threonine | P | 1,627 | 224 | 21 | 1,872 | 0.121 | 0.142 |
| | NP | 18,381 | 2,965 | 330 | 21,676 | 0.139 | 0.167 |
| | NP-A^b | 106,325 | 21,349 | 2,667 | 130,341 | 0.167 | 0.205 |
| Tyrosine | P | 1,689 | 97 | 5 | 1,791 | 0.054 | 0.060 |
| | NP | 14,445 | 865 | 64 | 15,374 | 0.056 | 0.065 |
| | NP-A^b | 61,582 | 4,109 | 300 | 65,991 | 0.063 | 0.071 |
NOTE: The prediction thresholds were set to “high” and “default,” respectively, for the phosphorylation prediction tools GPS2.1 and KinasePhos 2.0.
a “R1” and “R2,” respectively, stand for the rate of change at one-change sites and at one- and two-change sites.
b “NP-A” stands for all the residues whose phosphorylation status is unknown in the phosphoproteins of human or mouse.
A potential caveat of the above analysis is that some NP-A residues may actually be phosphorylated but remain undetected. To address this concern, from the NP-A group, we extracted a subset of residues that were less likely to be phosphorylated (i.e., NP) because they were not predicted to be phosphorylated by at least one of the prediction tools. The comparison between the P and NP residues reveals the same trend that phosphorylated S and T (but not Y) residues tend to evolve more slowly than nonphosphorylated ones (S: P = 1.34 × 10−14; T: P = 0.03, Y: P = 0.745, by Fisher's exact test, table 1). The result holds well when we use several alternative methods: 1) reconstructing ancestral sequences using the branch lengths estimated from each protein (supplementary table S1, Supplementary Material online); 2) using a different alignment tool—MUSCLE (Edgar 2004)—to align the sequences (supplementary tables S2–S3, Supplementary Material online); 3) using probability-based estimation of residue changes (here, the probability of a residue change was dependent on the probability of the ancestral state; see Materials and Methods and supplementary tables S4–S7, Supplementary Material online); 4) using a different threshold setting for KinasePhos 2.0 (Wong et al. 2007) (supplementary table S8, Supplementary Material online); and 5) applying an alternative phylogenetic tree in which human and dog instead of human and mouse were clustered in one clade (supplementary table S9, Supplementary Material online).
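The comparisons of changed versus conserved sites can be reproduced in outline with a two-sided Fisher's exact test on the one-change counts in table 1. The sketch below is a standard-library implementation written for illustration; it is an assumption that the 2 × 2 tables were built from one-change and conserved counts in this way, and the function names are hypothetical. A statistics package would normally be used instead.

```python
from math import exp, lgamma


def _log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    The P value is the total probability of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = _log_choose(row1 + row2, col1)

    def log_prob(x):
        # Hypergeometric log-probability of x in the top-left cell.
        return (_log_choose(row1, x) + _log_choose(row2, col1 - x)
                - log_denom)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = log_prob(a)
    total = 0.0
    for x in range(lo, hi + 1):
        lp = log_prob(x)
        if lp <= log_obs + 1e-7:  # small tolerance for rounding error
            total += exp(lp)
    return min(1.0, total)


if __name__ == "__main__":
    # Rows: P and NP; columns: one-change and conserved sites (table 1).
    for name, table in {
        "S": (774, 7225, 5843, 40326),
        "T": (224, 1627, 2965, 18381),
        "Y": (97, 1689, 865, 14445),
    }.items():
        print(name, fisher_exact_two_sided(*table))
```

Working in log space avoids overflow in the binomial coefficients, which matters for counts as large as those in the NP-A rows.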
Interestingly, we observed a faster rate of change for the S/T NP-A residues than for the S/T NP residues (table 1). It has been suggested that the residues on the protein surface or in disordered regions tend to evolve faster (Brown et al. 2002; Dunker et al. 2002; Linding et al. 2003; Lin et al. 2007). Therefore, the higher evolutionary rate of NP-A than NP residues may be ascribed to the larger proportion of NP-A (more specifically, WP, because NP-A is the union of NP and WP) residues that are located in disordered regions.
Conservation of Phosphorylated S/T Residues in Both Ordered and Disordered Regions
We next investigated whether local protein structure contributed to the differences in the rate of change. To take local protein structure into account, we predicted the surface accessibility and disorder potential of S/T/Y residues. A larger proportion of phosphorylated (P) than of nonphosphorylated S/T/Y residues has high accessibility and is located in disordered regions (fig. 2), which is consistent with the results of Landry et al. (2009). Furthermore, the NP-A (more specifically, WP, which is a subset of NP-A) category indeed has a larger proportion of residues with high accessibility and located in disordered regions than does the NP category (fig. 2). We also find that residues located in disordered regions tend to evolve faster than those located in ordered regions (P < 0.05, fig. 3 and supplementary tables S10 and S12–S18, Supplementary Material online), suggesting that local protein structure contributes to the difference in the rate of change.
FIG. 2.
Structural characteristics of phosphorylated residues (P), residues not predicted to be phosphorylated (NP), and residues not known to be phosphorylated in either human or mouse (NP-A). (a) Proportions (%) of P, NP, and NP-A residues in predicted disordered regions. (b) Proportions (%) of P, NP, and NP-A residues with high surface accessibility.
FIG. 3.
Rates of change of phosphorylated residues (P) and of residues not predicted as phosphorylated (NP). The rates of change of P-Disordered (P residues in disordered regions), NP-Disordered (NP residues in disordered regions), P-Ordered (P residues in ordered regions), and NP-Ordered (NP residues in ordered regions). A connector linking two bars indicates that the difference between the two bars is significant at P < 0.05 (by Fisher's exact test).
We next examined whether the trend that S and T residues of the P category evolve more slowly than the corresponding residues of the NP category still holds when local protein structure is considered. Our results show that not only in disordered regions (S: P < 2.2 × 10−16; T: P = 1.93 × 10−11, by Fisher's exact test) but also in ordered regions (S: P = 0.01; T: P = 6.05 × 10−4) (fig. 3 and supplementary tables S10 and S12–S18, Supplementary Material online), the S and T residues of the P category have a significantly lower average rate of change than those of the NP category. The same holds for the analysis of surface accessibility (S: P < 2.2 × 10−16; T: P = 1.61 × 10−4 for high accessibility and S: P = 1.62 × 10−4; T: P = 0.09 for low accessibility) (supplementary fig. S1 and tables S11 and S19, Supplementary Material online), although the difference for T residues with low accessibility is not statistically significant. When we applied the same analysis to the WP and NP categories, we found that the S/T residues of the two categories evolved at a comparable rate in both ordered and disordered regions (supplementary tables S10 and S12–S18, Supplementary Material online). Accordingly, when the factor of local structure is controlled, the rates of change for the S/T residues in NP and WP are similar and are higher than the rate for the residues in P. This observation suggests that the previously observed higher rate of change of WP than of NP residues could be ascribed to the larger proportion of WP than NP residues located in disordered regions. For Y residues, however, the previously observed trend that P and NP residues evolve at similar rates seems to hold only in ordered regions. In disordered regions, P residues show a lower rate of change than NP residues (P: 0.072 vs. NP: 0.093), although the difference is not statistically significant (P = 0.08; fig. 3 and supplementary tables S10–S18, Supplementary Material online), probably because of the small number of Y residues in disordered regions.
Taken together, our results suggest that phosphorylated S and T residues evolve more slowly than nonphosphorylated ones in both ordered and disordered regions and that classifying residues according to their regional selective constraints is important when calculating the rate of change.
Gains and Losses of Phosphorylated Residues in the Human and Mouse Lineages
If the conservation of phosphorylated residues is indicative of functionality, gains or losses of such residues may be associated with functional divergence. Therefore, it is interesting to investigate whether gains and losses of phosphorylated residues occur at different rates and whether the patterns of gain/loss differ between the human and mouse lineages. We obtained “human-specific gains” (H-G) and “mouse-specific losses” (M-L) from experimentally verified human phosphorylated residues and mouse-specific gains (M-G) and human-specific losses (H-L) from experimentally verified mouse phosphorylated residues (see Materials and Methods). We found no significant difference between gains and losses of phosphorylated S residues in either the human or the mouse lineage (P > 0.05, by Fisher's exact test, fig. 4a and supplementary table S20, Supplementary Material online). In contrast, we observed a larger proportion of gains than losses of phosphorylated T residues in the human lineage (H-G [0.038] > H-L [0.015]; P = 0.008) and a larger proportion of losses than gains of phosphorylated Y residues in the mouse lineage (M-L [0.023] > M-G [0.008]; P = 0.01) (fig. 4a and supplementary table S20, Supplementary Material online).
FIG. 4.
Gains and losses of phosphorylated residues. (a) The ratios of gains and losses of phosphorylated residues in the human and mouse lineages. A connector linking two bars indicates that the difference is significant at P < 0.05 (by Fisher's exact test). (b) The normalized ratios of (a). The ratios were normalized by dividing the proportions in supplementary table S20 (Supplementary Material online) by the branch lengths of the human and mouse lineages in fig. 1a, respectively.
In addition, compared with the human lineage, the mouse lineage has a larger proportion of gains of phosphorylated S and T residues and also a larger proportion of losses of phosphorylated S, T, and Y residues (supplementary table S20, Supplementary Material online). When the branch lengths in fig. 1a were used to normalize these proportions, the differences became small for S residues but remained substantial for T residues and Y residues (fig. 4b and supplementary table S21, Supplementary Material online). Collectively, our results suggest that the human lineage has had net gains of phosphorylated T residues and that the mouse lineage has had net losses of phosphorylated Y residues since the divergence of the two lineages. The functional effects of these lineage-specific gains/losses need further experimental characterization.
Conclusion
In this study, we have systematically analyzed the evolutionary rates of phosphorylated and nonphosphorylated residues. Our analysis provides evidence for significant evolutionary conservation of phosphorylated S and T residues in eutherian mammals, regardless of the protein regions studied, contrary to the conclusion of Landry et al. (2009). We also show that in the human and mouse lineages, the rates of gains and losses are approximately equal for phosphorylated S residues, but the human lineage has gained more phosphorylated T residues and has lost fewer phosphorylated Y residues than the mouse lineage.
This study demonstrates the importance of choosing an appropriate evolutionary scale for an accurate reconstruction of the ancestral sequences, so that one can obtain reliable estimates of evolutionary rates and lineage-specific gains and losses of phosphorylated residues. In the future, our approach can be applied to the evolutionary studies of other types of posttranslational modifications, such as glycosylation, methylation, and ubiquitination when genome-wide data sets of these modifications become available.
Supplementary Material
Supplementary tables S1–S22 and figure S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank the two anonymous reviewers for valuable comments. This study was supported by the National Science Council (fellowship NSC-096-2917-I-010-103 to S.C.C.C. and NSC 98-2311-B-400-002-MY3 to F.C.C.). This study was partially supported by the National Institutes of Health (grant GM30998 to W-H.L.) and by Academia Sinica, Taiwan, and also by National Health Research Institutes intramural funding (to F.C.C.).
References
- Aivaliotis M, Macek B, Gnad F, Reichelt P, Mann M, Oesterhelt D. Ser/Thr/Tyr protein phosphorylation in the archaeon Halobacterium salinarum—a representative of the third domain of life. PLoS One. 2009;4:e4777. doi: 10.1371/journal.pone.0004777.
- Blenis J, Resh MD. Subcellular localization specified by protein acylation and phosphorylation. Curr Opin Cell Biol. 1993;5:984–989. doi: 10.1016/0955-0674(93)90081-z.
- Boekhorst J, van Breukelen B, Heck AJ, Snel B. Comparative phosphoproteomics reveals evolutionary and functional conservation of phosphorylation across eukaryotes. Genome Biol. 2008;9:R144. doi: 10.1186/gb-2008-9-10-r144.
- Brown CJ, Takayama S, Campen AM, Vise P, Marshall TW, Oldfield CJ, Williams CJ, Dunker AK. Evolutionary rate heterogeneity in proteins with long disordered regions. J Mol Evol. 2002;55:104–110. doi: 10.1007/s00239-001-2309-6.
- Diella F, Gould CM, Chica C, Via A, Gibson TJ. Phospho.ELM: a database of phosphorylation sites–update 2008. Nucleic Acids Res. 2008;36:D240–D244. doi: 10.1093/nar/gkm772.
- Do CB, Mahabhashyam MS, Brudno M, Batzoglou S. ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res. 2005;15:330–340. doi: 10.1101/gr.2821705.
- Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradovic Z. Intrinsic disorder and protein function. Biochemistry. 2002;41:6573–6582. doi: 10.1021/bi012159+.
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340.
- Gnad F, Ren S, Cox J, Olsen JV, Macek B, Oroshi M, Mann M. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol. 2007;8:R250. doi: 10.1186/gb-2007-8-11-r250.
- Iakoucheva LM, Radivojac P, Brown CJ, O'Connor TR, Sikes JG, Obradovic Z, Dunker AK. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32:1037–1049. doi: 10.1093/nar/gkh253.
- Jimenez JL, Hegemann B, Hutchins JR, Peters JM, Durbin R. A systematic comparative and structural analysis of protein phosphorylation sites based on the mtcPTM database. Genome Biol. 2007;8:R90. doi: 10.1186/gb-2007-8-5-r90.
- Keshava Prasad TS, Goel R, Kandasamy K, et al. (30 co-authors) Human Protein Reference Database–2009 update. Nucleic Acids Res. 2009;37:D767–D772. doi: 10.1093/nar/gkn892.
- Landry CR, Levy ED, Michnick SW. Weak functional constraints on phosphoproteomes. Trends Genet. 2009;25:193–197. doi: 10.1016/j.tig.2009.03.003.
- Lin YS, Hsu WL, Hwang JK, Li WH. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol. 2007;24:1005–1011. doi: 10.1093/molbev/msm019.
- Linding R, Russell RB, Neduva V, Gibson TJ. GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res. 2003;31:3701–3708. doi: 10.1093/nar/gkg519.
- Shaywitz AJ, Dove SL, Greenberg ME, Hochschild A. Analysis of phosphorylation-dependent protein-protein interactions using a bacterial two-hybrid system. Sci STKE. 2002;2002:pl11. doi: 10.1126/stke.2002.142.pl11.
- Trinidad JC, Thalhammer A, Specht CG, Lynn AJ, Baker PR, Schoepfer R, Burlingame AL. Quantitative analysis of synaptic phosphorylation and protein expression. Mol Cell Proteomics. 2008;7:684–696. doi: 10.1074/mcp.M700170-MCP200.
- Wagner M, Adamczak R, Porollo A, Meller J. Linear regression models for solvent accessibility prediction in proteins. J Comput Biol. 2005;12:355–369. doi: 10.1089/cmb.2005.12.355.
- Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 2004;337:635–645. doi: 10.1016/j.jmb.2004.02.002.
- Wong YH, Lee TY, Liang HK, Huang CM, Wang TY, Yang YH, Chu CH, Huang HD, Ko MT, Hwang JK. KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res. 2007;35:W588–W594. doi: 10.1093/nar/gkm322.
- Xue Y, Ren J, Gao X, Jin C, Wen L, Yao X. GPS 2.0, a tool to predict kinase-specific phosphorylation sites in hierarchy. Mol Cell Proteomics. 2008;7:1598–1608. doi: 10.1074/mcp.M700574-MCP200.
- Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 1997;13:555–556. doi: 10.1093/bioinformatics/13.5.555.