Abstract
To find the most rapidly evolving regions in the yeast genome we compared most of chromosome III from three closely related lineages of the wild yeast Saccharomyces paradoxus. Unexpectedly, the centromere appears to be the fastest-evolving part of the chromosome, evolving even faster than DNA sequences unlikely to be under selective constraint (i.e., synonymous sites after correcting for codon usage bias and remnant transposable elements). Centromeres on other chromosomes also show an elevated rate of nucleotide substitution. Rapid centromere evolution has also been reported for some plants and animals and has been attributed to selection for inclusion in the egg or the ovule at female meiosis. But Saccharomyces yeasts have symmetrical meioses with all four products surviving, thus providing no opportunity for meiotic drive. In addition, yeast centromeres show the high levels of polymorphism expected under a neutral model of molecular evolution. We suggest that yeast centromeres suffer an elevated rate of mutation relative to other chromosomal regions and they change through a process of “centromere drift,” not drive.
COMPARISONS of genome sequences among species allow detailed analyses of the mode and tempo of evolution at the molecular level (Schein et al. 2004; Chimpanzee Sequencing and Analysis Consortium 2005; Shapiro et al. 2007). Comparisons of closely related species are especially needed to identify and analyze the fastest-evolving regions of genomes without ambiguities of homology or uncertainties due to multiple substitutions. Here we study the evolution of an entire chromosome (excluding telomeres and subtelomeres) by comparing sequences from three closely related and phylogenetically independent lineages of the wild yeast Saccharomyces paradoxus.
Yeasts provide an excellent model system for comparative studies in genome evolution, as they have small genomes, dense with genes and regulatory elements, and complete genome sequences are now available for a number of species (Goffeau et al. 1996; Cliften et al. 2003; Kellis et al. 2003, 2004; Dujon et al. 2004; Liti and Louis 2005). The closest relatives sequenced thus far, S. paradoxus and S. cerevisiae, however, are 13% divergent at the nucleotide level (Kellis et al. 2003), and many intergenic regions are difficult to align due to extensive insertions/deletions and ambiguities introduced by multiple substitutions. S. paradoxus strains from Europe, Far East Asia, and Brazil (also known as S. cariocanus) represent three genealogically independent populations that show partial hybrid sterility and much lower sequence divergence [1.5% divergence between Europe and the Far East and 5% between either one and S. cariocanus (Greig et al. 2003; Koufopanou et al. 2006; Liti et al. 2006)]. These three populations are ideal for population genomic studies, as they provide independent replicates for testing the repeatability of evolutionary patterns. Moreover, S. paradoxus is sufficiently closely related to S. cerevisiae that its genome can be annotated by homology, allowing full use of the vast amounts of information on S. cerevisiae. Finally, S. paradoxus has never been domesticated, and results will therefore reflect natural rather than artificial processes caused by human interventions.
MATERIALS AND METHODS
Strains:
To measure divergence we sequenced most of chromosome III from one Far East strain of S. paradoxus (CBS 8442) and the Type strain of S. cariocanus (CBS 8841) and compared these to the published sequence for the European Type strain of S. paradoxus (CBS 432) (Kellis et al. 2003). For polymorphism, we used 11 more European strains from Berkshire in the United Kingdom (T18.2, T26.3, T32.1, T62.1, T68.2, T76.6, Q4.1, Q6.1, Q14.4, Q15.1, and Q43.5) (Johnson et al. 2004) and 7 more Far East strains (CBS 8436, CBS 8437, CBS 8438, CBS 8439, CBS 8440, CBS 8441, and CBS 8444) (Naumov et al. 1997; Koufopanou et al. 2006). All strains were made fully homozygous prior to sequencing by isolating a single spore from a tetrad and allowing it to autodiploidize.
DNA sequencing, assembly, alignment, and annotation:
DNA sequence was obtained by a PCR-based strategy. The published S. paradoxus chromosome III sequence was used as a reference to design primers for PCR amplification and sequencing of the nontelomeric fraction of the chromosome from total genomic DNA. The PCRs generated overlapping 2-kb fragments, which were then sequenced with internal primers. Base-calling of DNA sequence traces was conducted using Phred (Ewing and Green 1998), and sequences were assembled using the Gap4 component of Staden (http://staden.sourceforge.net/). The 14 genes and adjacent intergenes used to estimate polymorphism levels included MRC1, SPB1, YCL045C, ATG22, ILV6, CIT2, PGK1, MAK32, FEN2, PER1, CTR86, HCM1, YCR072C, and KIN82. To ensure a high degree of confidence in the polymorphism data, only bases with a consensus Phred quality score ≥q40 were accepted (probability of miscall ≤1/10,000), the rest being treated as missing data. DNA sequences have been deposited in GenBank (accession nos. EU444725, EU444726, and EU444121–EU444533).
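The quality filter described above can be illustrated with a minimal sketch. It assumes the standard Phred definition, in which a score q corresponds to a miscall probability of 10^(−q/10); the function names are hypothetical and are not part of Phred or Staden.

```python
def phred_error_probability(q):
    """Miscall probability implied by a Phred quality score q."""
    return 10 ** (-q / 10)


def mask_low_quality(bases, quals, min_q=40):
    """Replace bases whose quality is below min_q with N (missing data).

    `bases` is a string of base calls and `quals` a sequence of integer
    Phred scores of the same length (hypothetical inputs for illustration).
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    return "".join(b if q >= min_q else "N" for b, q in zip(bases, quals))
```

With this definition a threshold of q40 corresponds to a miscall probability of at most 1 in 10,000.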
Sequences were aligned against the published sequences for chromosome III of S. cerevisiae (October 1, 2003 version: ftp://genome.cse.ucsc.edu/goldenPath/sacCer1/bigZips/chromFa.zip) and S. paradoxus (http://www.broad.mit.edu/ftp/pub/annotation/fungi/comp_yeasts/S1a.Assembly/), using mlagan, and further improved manually using SeaView and BioEdit (Galtier et al. 1996; Hall 1999; Brudno et al. 2003). Annotations of the S. cerevisiae chromosome (http://hgdownload.cse.ucsc.edu/goldenPath/sacCer1/database/) were transferred to the alignment using custom scripts. Sequence that aligned to the right of position 270,757 on S. cerevisiae chromosome III was excluded due to uncertainty in the orthology of the sequence. We also excluded six PCR fragments generated by pairs of primers that were predicted to amplify paralogous sequences on other chromosomes, using in silico PCR (isPcr) (settings -minPerfect=1; http://www.cse.ucsc.edu/~kent/src/) and the published S. paradoxus genome sequence. Long terminal repeat (LTR) regions were identified using RepeatMasker (version open-3.0); the repeat library included all sequences for Saccharomyces yeasts in RepBase 9.11 and the S. cerevisiae Ty4 retrotransposon. Only fixed LTRs were included in the analyses, to exclude recent inserts that would not be comparable to the rest of the chromosome.
Analyses:
Divergence and nucleotide diversity were estimated using polydNdS (http://molpopgen.org/) (Thornton 2003) and VariScan (Vilella et al. 2005). We do not correct for multiple hits, and insertions, deletions, and missing or ambiguous data are ignored. To remove the effect of codon usage bias from our estimates of synonymous divergence, we used the measures of codon bias (c) for each gene in Hirsh et al. (2005), calculated from several Saccharomyces species, including S. paradoxus.
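The two summary statistics can be sketched in a few lines. This is an illustration only, not the polydNdS or VariScan implementation: it assumes uncorrected pairwise differences, skips any alignment column in which either sequence has a gap, N, or ambiguity code, and takes π as the mean over all pairs of sequences. The function names are hypothetical.

```python
from itertools import combinations

VALID = set("ACGT")


def pairwise_divergence(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap, N, or ambiguity code in either sequence are skipped,
    and no multiple-hit correction is applied.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID and b in VALID:
            compared += 1
            differences += a != b
    return differences / compared if compared else float("nan")


def nucleotide_diversity(seqs):
    """Mean pairwise divergence (pi) over all pairs of sampled sequences."""
    values = [pairwise_divergence(a, b) for a, b in combinations(seqs, 2)]
    if not values:
        raise ValueError("at least two sequences are required")
    return sum(values) / len(values)
```

Production tools may treat missing data site by site rather than pair by pair, so values from this sketch need not match theirs exactly.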
RESULTS
Divergence along chromosome III:
We sequenced ∼295 kb of the Far East chromosome, representing ∼91% of the complete S. cerevisiae chromosome, and ∼250 kb of the S. cariocanus chromosome (∼76%). The overall nucleotide divergence between the European and the Far East chromosomes is 1.4% (about equal to that between humans and chimpanzees; Chimpanzee Sequencing and Analysis Consortium 2005); the divergence of either one from S. cariocanus is ∼4%. Levels of divergence vary significantly along the length of the chromosome; surprisingly, the greatest divergence is at the centromere (Figure 1).
Figure 1.—
Sliding-window analysis of nucleotide divergence along chromosome III. Each bar represents the pairwise divergence estimated from a 50-bp window, with 10 bp offset, chosen to capture divergence of short elements such as the centromere (red) and transposable element fragments (blue). Other intergenic regions are shown in gold, and genic regions (gray) include protein-coding exons, introns, tRNA, and snoRNA genes. All gaps shown are due to gaps in the alignment rather than zero divergence. For Europe vs. Far East there are 255,186 aligned sites, ∼81% of the complete S. cerevisiae chromosome; for S. paradoxus (Europe) vs. S. cariocanus there are 218,286 aligned sites (69%).
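A sliding-window scan of this kind can be sketched as follows, using the 50-bp window and 10-bp offset given in the caption. The function name and the choice to return None for windows with no comparable sites are assumptions made for illustration; they mirror the caption's point that gaps in the plot reflect gaps in the alignment rather than zero divergence.

```python
def sliding_window_divergence(seq_a, seq_b, window=50, step=10):
    """Return (start, divergence) for each window along a pairwise alignment.

    Sites with a gap or ambiguous base in either sequence are ignored.
    Windows with no comparable sites get None rather than 0.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        compared = differences = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a in valid and b in valid:
                compared += 1
                differences += a != b
        results.append((start, differences / compared if compared else None))
    return results
```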
Divergence of other centromeres:
To test whether the elevated divergence applies to other centromeres, we sequenced the centromeres of four additional chromosomes (CEN5, CEN7, CEN9, and CEN15). The rate of divergence does not differ significantly among centromeres (G-tests: the P-value for the Europe–Far East comparison is PEF = 0.26; that for European S. paradoxus–S. cariocanus is PEC = 0.41, with comparable values for Far East–S. cariocanus here and throughout the article). All five centromeres show high levels of divergence compared to other types of DNA (Table 1, Figure 2; see also supplemental Table 1).
TABLE 1.
Divergence and polymorphism for different classes of DNA
Class of DNA | Divergence: Europe vs. Far East | Divergence: S. paradoxus (Europe) vs. S. cariocanus^c | Polymorphism (π): Europe | Polymorphism (π): Far East |
---|---|---|---|---|
Centromere | ||||
CDEII | 12.9 (9.2–17.7) [5; 380] | 32.2 (27.3–37.7) [5; 410] | 0.55 (0.08–1.03) [5; 424] | 0.79 (0.23–1.30) [5; 411] |
CDEI, CDEIII | 4.3 (1.6–7.4) [9; 162] | 7.7 (4.8–10.1) [10; 169] | 0.10 (0–0.37) [10; 170] | 0.25 (0–0.75) [9; 158] |
Flanking intergene | 3.1 (2.7–3.9) [5; 8,444] | 8.2 (7.4–10.1) [5; 8,323] | 0.29 (0.12–0.52) [5; 8,765] | 0.43 (0.40–0.58) [5; 8,653] |
Other | ||||
Synonymous | 3.2 (2.9–3.5) [133; 36,528] | 10.3 (9.9–10.8) [122; 31,536] | 0.37 (0.18–0.61) [14; 4,685] | 0.11 (0.05–0.18) [14; 4,681] |
Synonymous (corrected)^a | 4.7 (4.2–5.2) [74; 22,075] | 15.3 (13.7–16.9) [70; 18,901] | NA | NA |
LTR^b | 4.6 (3.9–5.6) [10; 5,146] | 12.3 (9.4–14.7) [7; 1,626] | 0.40 (0.25–0.53) [11; 6,580] | 0.34 (0.24–0.42) [12; 6,632] |
Intergenic | 2.2 (1.9–2.4) [139; 73,202] | 6.2 (5.6–6.8) [126; 59,184] | 0.22 (0.14–0.27) [14; 8,688] | 0.13 (0.07–0.19) [14; 8,637] |
Nonsynonymous | 0.5 (0.4–0.6) [133; 141,675] | 1.3 (1.2–1.6) [122; 123,488] | 0.05 (0.04–0.07) [14; 18,850] | 0.02 (0.01–0.04) [14; 18,739] |
AT rich | 1.5 (0.7–2.5) [10; 846] | 4.5 (2.7–7.1) [9; 684] | 0.09 (0–0.24) [10; 851] | 0.20 (0.06–0.3) [10; 826] |

Mean × 100 (95% C.I.) [number of loci; total length in base pairs] is shown. Mean is weighted by length of each locus; confidence intervals are estimated from 10,000 bootstrap replicates. NA, not applicable.
^a Divergence at synonymous sites corrected for codon usage bias.
^b One transposable-element region has a single highly diverged allele in the European population that may have arisen by gene conversion from another locus and was excluded from the analyses.
^c Results are similar for the Far East vs. S. cariocanus comparisons.
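The length-weighted means and bootstrap confidence intervals reported in Table 1 can be approximated with a sketch like the one below. The table note does not specify the resampling unit or the interval type, so resampling whole loci with replacement and taking percentile limits are assumptions here, as are the function names.

```python
import random


def weighted_mean(loci):
    """Length-weighted mean; `loci` is a list of (value, length) pairs."""
    total_length = sum(length for _, length in loci)
    return sum(value * length for value, length in loci) / total_length


def bootstrap_ci(loci, replicates=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the length-weighted mean.

    Loci are resampled with replacement (an assumption about the procedure);
    `seed` makes the sketch reproducible.
    """
    rng = random.Random(seed)
    n = len(loci)
    means = sorted(
        weighted_mean([loci[rng.randrange(n)] for _ in range(n)])
        for _ in range(replicates)
    )
    lower = means[int((alpha / 2) * replicates)]
    upper = means[int((1 - alpha / 2) * replicates) - 1]
    return lower, upper
```

With only five loci, as for the centromeres, the number of distinct bootstrap samples is small, so intervals obtained this way are necessarily coarse.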
Figure 2.—
Polymorphism among European (circles) and Far East (triangles) strains as a function of divergence for different types of DNA sequence. Note the square-root scale on both axes, used to spread the points.
One possible cause of this high divergence is that centromeres on the same chromosome in different lineages are not orthologous, but instead have been transferred by gene conversion from some other chromosome. To test this possibility we aligned all the sequences and constructed a phylogeny. This shows the pattern expected from orthology: centromeres from the same chromosome cluster together, with Europe and the Far East more closely related to each other than either is to S. cariocanus (Figure 3).
Figure 3.—
Unrooted phylogram showing that centromeres from the same chromosome cluster together [maximum-parsimony analysis using PAUP v.4b10 (Swofford 1999); note that branch lengths between centromeres from different chromosomes are approximate due to uncertainties in the alignment].
Centromeres in Saccharomyces yeast are very short (∼120 bp long) and well defined and consist of three functionally distinct regions: two protein-binding sites [centromere DNA elements (CDE)I and CDEIII, 8 bp and ∼25 bp long] and a highly AT-rich spacer region separating them (CDEII, ∼90 bp long) (Clarke 1998). CDEII wraps around the centromere-specific histone Cse4; this binding is analogous to that between mammalian centromeric repeats and CENPA (the mammalian homolog of Cse4), although in the case of yeast only a single nucleosome is formed for each centromere (Sullivan et al. 2001). CDEIIs from different chromosomes are highly dissimilar (up to 60% differences among those sequenced here) yet functionally interchangeable (Clarke and Carbon 1983), indicating that the binding of CDEII to the centromere-specific histone Cse4 is not sequence specific, although changes in the length, AT content, and pattern of runs of A's and T's can disrupt centromere function, perhaps by altering DNA bendability or flexibility (Baker and Rogers 2005). CDEII diverges more than twice as fast as the two binding sites (Figure 4 and Table 1; G-test, PEF = 0.001, PEC = 2 × 10^−11), which do not differ significantly from each other (P > 0.4). CDEII also diverges about twice as fast as the 85-bp regions immediately flanking the two binding sites (data not shown), suggesting the effect is specific to the centromeres. In the remainder of this article we focus on the fast-evolving CDEII component of centromeres.
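A G-test of the kind cited above can be sketched for the simplest case, a 2 × 2 table of substituted vs. unchanged sites in two regions. How the counts were pooled in the actual analysis is not stated, so this layout is an assumption, and the function name is hypothetical. With 1 d.f. the chi-square tail probability reduces to a complementary error function, which keeps the sketch free of external libraries.

```python
import math


def g_test_2x2(diff_a, sites_a, diff_b, sites_b):
    """G-test of independence for differing vs. identical sites in two regions.

    Returns (G, p_value); the p-value uses the chi-square distribution
    with 1 degree of freedom, for which sf(x) = erfc(sqrt(x / 2)).
    """
    observed = [
        [diff_a, sites_a - diff_a],
        [diff_b, sites_b - diff_b],
    ]
    total = sites_a + sites_b
    row_totals = [sites_a, sites_b]
    col_totals = [diff_a + diff_b, total - diff_a - diff_b]
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to G
                e = row_totals[i] * col_totals[j] / total
                g += 2.0 * o * math.log(o / e)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    return g, math.erfc(math.sqrt(g / 2.0))
```

For instance, the Europe vs. Far East entries in Table 1 imply roughly 49 differences in 380 CDEII sites and 7 in 162 binding-site positions (counts back-calculated from rounded percentages, so approximate), which could be passed as `g_test_2x2(49, 380, 7, 162)`.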
Figure 4.—
Alignment of sequences at CEN3. N indicates missing data. SCA, S. cariocanus; SCE, S. cerevisiae. Also shown is the sequence for centromere 11 from S. cerevisiae, which has been shown to be functionally interchangeable with CEN3 (Clarke and Carbon 1983). For visual clarity three sequences with missing data are not shown.
Comparison with sequences likely to be evolving neutrally:
An obvious question is whether CDEII regions are evolving faster than selectively neutral sequences. One class of DNA likely to show little selective constraint is synonymous sites in genes with low codon bias (Akashi 2001; Fay and Benavides 2005). Synonymous sites as a whole diverge only a third as fast as CDEII (Wilcoxon tests: PEF = PEC = 0.0002). To estimate the rate of divergence of synonymous sites in the absence of codon bias, we calculated the regression of divergence against (1 − CAI), where CAI is the codon adaptation index of Hirsh et al. (2005) (Figure 5). Regression lines were forced through the origin (i.e., CAI = 1), on the grounds that complete bias should result in no divergence. The estimated divergence in the absence of bias (CAI = 0) is 50% higher than the observed synonymous divergence, but still only half the value for CDEII (Table 1).
Figure 5.—
Divergence at synonymous sites as a function of the degree of “unbias” in codon usage, 1 − CAI, where CAI is the codon adaptation index from Hirsh et al. (2005). Lines are regressions forced through the origin, and divergence in the absence of codon usage bias (Table 1) is estimated from the regression line using 1 − CAI = 1. For two genes there were <10 synonymous sites in the alignment of S. paradoxus (Europe) vs. S. cariocanus (due to missing sequence), and these have been omitted from the graph.
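The bias correction amounts to a least-squares regression constrained to pass through the origin, with the fitted value at 1 − CAI = 1 taken as the divergence expected without codon bias. A minimal sketch, assuming an unweighted fit (whether the original fit was weighted is not stated) and a hypothetical function name:

```python
def slope_through_origin(unbias, divergence):
    """Least-squares slope for divergence = slope * (1 - CAI), no intercept.

    `unbias` holds 1 - CAI per gene and `divergence` the matching synonymous
    divergence. Because the line passes through the origin, the slope is
    also the predicted divergence at 1 - CAI = 1 (no codon bias).
    """
    if len(unbias) != len(divergence):
        raise ValueError("inputs must have the same length")
    sxx = sum(x * x for x in unbias)
    if sxx == 0:
        raise ValueError("at least one gene must have 1 - CAI != 0")
    return sum(x * y for x, y in zip(unbias, divergence)) / sxx
```

Forcing the line through the origin means genes with the weakest bias (largest 1 − CAI) have the most influence on the estimate.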
Other DNA sequences likely to show little selective constraint are the remnant LTR regions of partially deleted transposable elements, as these are no longer functional. These sequences diverge at about the same rate as synonymous sites after codon bias correction, but again less than half as fast as CDEII (Wilcoxon tests: PEF and PEC < 0.003; Table 1). Divergence at other regions (intergenes and nonsynonymous sites) is lower still (Table 1).
Polymorphism, selection, and mutation:
For CDEII to diverge faster than selectively neutral sequences it must experience an elevated mutation rate or recurrent positive selection. To attempt to distinguish these possibilities we measured levels of polymorphism segregating in each of the European and Far East populations. If mutation rates are elevated, then polymorphism at CDEII will also be elevated, proportional to divergence, whereas if there has been recurrent positive selection, then polymorphism will be reduced (Hudson et al. 1987). We analyzed polymorphism at the five centromeres and their flanking intergenic regions, the LTRs, and 14 genes and adjacent intergenes along chromosome III, in 12 European and 8 Far Eastern strains (see supplemental Table 2). To minimize linkage effects the 14 genes were chosen so that they are at least 4 kb apart. We compared average π-values for the two populations between different classes of DNA using a Wilcoxon test and found that CDEIIs are more polymorphic than LTRs or synonymous sites, although with borderline statistical significance (P = 0.08 for both comparisons). These analyses are somewhat limited because the CDEII region is so short and we are unable to accurately assess the extent of heterogeneity in π for centromeres on different chromosomes. π ranges from 0 to 1.6% (Figure 2), but this variation is no more than expected by chance [P > 0.3 for both Europe and Far East; tested by simulating 10,000 data sets with the observed average π (0.0055/bp or 0.0079/bp), number of sequences (12 or 8), and sequence length (85 bp)]. These limitations notwithstanding, there is no evidence for the reduced polymorphism expected in a simple model of recurrent positive selection and the ratio of polymorphism to divergence for CDEII is not significantly different from that of LTRs or synonymous sites, either by a Wilcoxon test (P > 0.4) or in a coalescent-based maximum-likelihood analysis [MLHKA test, P > 0.2 (Wright and Charlesworth 2004); Figure 2].
Analysis of other AT-rich regions:
CDEII is extremely AT rich, averaging 90% for the five centromeres studied here. To test whether other AT-rich regions have elevated divergence, we scanned the published chromosome III sequence for the 10 most AT-rich regions of similar length to the CDEII region [85-bp-long regions, 86–94% AT; in cases where many consecutive windows showed the same proportion of A or T sites the first (left) window was selected; all regions found were >4 kb apart]. These show much lower divergence and polymorphism than CDEII, similar to other intergenic regions (Table 1). Thus AT-rich regions are not fast evolving in general, and AT richness in itself cannot account for the high divergence we observe.
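The scan for AT-rich windows can be sketched as below. The 85-bp window, the choice of 10 regions, and the leftmost-window tie rule come from the text; the greedy selection and the `min_gap` parameter that keeps chosen windows from overlapping are assumptions made for illustration, and the function name is hypothetical.

```python
def most_at_rich_windows(seq, window=85, top=10, min_gap=85):
    """Return up to `top` (start, AT fraction) pairs for the most AT-rich windows.

    Windows are ranked by AT content, ties going to the leftmost start, and a
    window is skipped if its start lies within `min_gap` bp of one already chosen.
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_at = [base in ("A", "T") for base in seq]
    # Running count of A/T bases so each window costs O(1) to update.
    count = sum(is_at[:window])
    counts = [count]
    for start in range(1, len(seq) - window + 1):
        count += is_at[start + window - 1] - is_at[start - 1]
        counts.append(count)
    ranked = sorted(range(len(counts)), key=lambda s: (-counts[s], s))
    chosen = []
    for start in ranked:
        if all(abs(start - other) >= min_gap for other in chosen):
            chosen.append(start)
            if len(chosen) == top:
                break
    return [(start, counts[start] / window) for start in chosen]
```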
DISCUSSION
We have compared near-complete sequences of chromosome III from three closely related lineages of yeast and found that the fastest-evolving sequence is the CDEII region of the centromere. Further analyses indicate that centromeres on other chromosomes also evolve rapidly, and even more rapidly than sequences likely to be evolving neutrally. To our knowledge, ours is the most detailed sequence analysis of centromere evolution thus far, made possible by the small size of yeast centromeres. Our analysis specifically points to centromeres having an elevated rate of nucleotide substitution rather than, say, more frequent repeat expansions and contractions.
Rapid centromere evolution has also been observed in some plants and animals (Haaf and Willard 1997; Henikoff et al. 2001; Lee et al. 2005; Ma et al. 2007; Ventura et al. 2007). In these taxa centromeres are extremely long and complex stretches of highly repetitive AT-rich satellite DNA, usually surrounded by or embedded in heterochromatin (Clarke 1998; Henikoff et al. 2001; Sullivan et al. 2001). Rapid centromere evolution in plants and animals has been attributed to recurrent positive selection for mutant repeats that distort their segregation at female meiosis, somehow orienting toward the spindle pole that will contribute to the egg or ovule and away from the polar bodies (the “centromere drive hypothesis”; Henikoff et al. 2001; Malik and Henikoff 2002). Saccharomyces yeasts have symmetrical meioses, with all four products being viable, and selection for this type of segregation distortion is not possible. Thus some other factor(s) must be responsible.
The two most obvious possible causes for rapid evolution of yeast centromeres are an elevated mutation rate and/or recurrent positive selection (due to some factor other than drive). Distinguishing these two possibilities is not always easy. Centromeres appear to be more polymorphic than neutral regions of the chromosome (with borderline statistical significance) and their ratio of polymorphism to divergence does not differ significantly from that of those regions. Thus the data are consistent with centromeres having an elevated rate of mutation, and this seems to us the most likely explanation (although we are not able to exclude some complex forms of selection, such as a combination of directional and balancing selection). Due to the small size of CDEII we were not able to accurately assess the extent of heterogeneity in polymorphism among centromeres on different chromosomes; more data from more isolates and centromeres would be useful in this regard. Interestingly, the two proteins directly interacting with CDEII, Cse4 and Mif2 (homologous to the fast-evolving mammalian CENPC protein), appear to be under strong purifying selection in yeast, with no indication of recurrent directional selection (Henikoff and Dalal 2005; Baker and Rogers 2006).
If yeast centromeres do suffer an elevated mutation rate, it is not clear what might be causing it. Recombination may be mutagenic (Bussell et al. 2006), but genealogical analysis of the five centromeres and flanking intergenes shows no evidence of recombination in any of these regions in either population (data not shown). CDEII is extremely AT rich, and runs of A's and T's can lead to insertion and deletion mutations by replication slippage (Levinson and Gutman 1987), but our analysis includes only base substitutions, not indels. Both polymorphism and divergence are sufficiently low to exclude the possibility of artificially inflated mutation rates due to misalignments (Figure 4 and supplemental Figure 1). In addition, other AT-rich regions do not show elevated divergence or polymorphism. Perhaps the structure of centromeres is such that CDEII is more exposed to damaging agents (e.g., free radicals) in the nuclear environment or less exposed to repair enzymes. It is also possible that there are occasional small-scale gene-conversion events from other centromeres, too short to be detected in our analysis of orthology, and that these contribute to the observed divergence.
An increased mutation rate need not affect all nucleotides equally. If C's and G's were disproportionately affected, then this could account in part for the A/T compositional bias of CDEII—indeed, if C/G is nine times more mutable than A/T, then a neutral sequence would evolve to be 90% A/T. Moreover, a 9-fold increase in the mutability of C/G would also produce a 1.8-fold increase in the rate of divergence at equilibrium (Haddrill et al. 2005). On the other hand, the A/T bias appears to be functionally important (Baker and Rogers 2005), and it could be maintained purely by selection, without a mutation bias. In this case there would be purifying selection against mutations to C or G, and the actual mutation rate at CDEII would be even higher than the divergence rate we observe.
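The two figures quoted above (90% A/T and a 1.8-fold rate increase) follow from a simple two-state model, sketched here under stated assumptions: u is the mutation rate from A/T to C/G per A/T site, v = 9u is the rate from C/G to A/T per C/G site, the sequence is neutral, and the comparison is with a sequence in which both rates equal u. The symbols u, v, p and K are introduced for this illustration only.

```latex
\begin{align*}
\text{Equilibrium A/T fraction:}\quad
  \hat{p}\,u &= (1-\hat{p})\,v
  \;\Longrightarrow\;
  \hat{p} = \frac{v}{u+v} = \frac{9u}{10u} = 0.9, \\[4pt]
\text{Substitution rate at equilibrium:}\quad
  K &= \hat{p}\,u + (1-\hat{p})\,v
     = \frac{2uv}{u+v}
     = \frac{2 \cdot u \cdot 9u}{10u}
     = 1.8\,u .
\end{align*}
```

Under these assumptions the same mutational bias yields both the observed base composition and a rate 1.8 times that of an unbiased sequence with rate u.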
To conclude, we find that centromeres are the fastest-evolving regions in the yeast genome (possibly excluding telomeres and subtelomeres), despite their essential and conserved role in chromosome segregation. Our results also indicate that rapid centromere evolution can occur in the absence of drive and instead point to elevated mutation rates as a possible explanation. Other Saccharomyces lineages also show relatively high rates of centromere evolution (D. Barton and E. Louis, unpublished results). The centromere divergence we observe may have no functional consequences, although experimental transfers between species would be the best way to test this idea. Elevated mutation rates should also be considered as a possible contributor to rapid centromere evolution in plants and animals. Centromeres in these taxa are determined epigenetically rather than by DNA sequence, and this could be an adaptation to maintain function in the face of unavoidably high mutation rates (Murphy and Karpen 1998).
Acknowledgments
We thank David Barton and Edward Louis for useful discussions and for showing us their manuscript prior to publication. We also thank Jason Tsai for help with the analysis and Casey Bergman, Brian Charlesworth, Harmit Malik, Mick Crawley, Daniela Delneri, and two anonymous reviewers for helpful comments and discussions. DNA sequencing was done by AGOWA (Berlin). This work was funded by the Biotechnology and Biological Sciences Research Council (to V.K. and A.B.) and a Wellcome Trust VIP award (to D.B.).
References
- Akashi, H., 2001. Gene expression and molecular evolution. Curr. Opin. Genet. Dev. 11 660–666.
- Baker, R. E., and K. Rogers, 2005. Genetic and genomic analysis of the AT-rich centromere DNA element II of Saccharomyces cerevisiae. Genetics 171 1463–1475.
- Baker, R. E., and K. Rogers, 2006. Phylogenetic analysis of fungal centromere H3 proteins. Genetics 174 1481–1492.
- Brudno, M., C. B. Do, G. M. Cooper, M. F. Kim, E. Davydov et al., 2003. LAGAN and multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res. 13 721–731.
- Bussell, J. J., N. M. Pearson, R. Kanda, D. A. Filatov and B. T. Lahn, 2006. Human polymorphism and human-chimpanzee divergence in pseudoautosomal region correlate with local recombination rate. Gene 368 94–100.
- Chimpanzee Sequencing and Analysis Consortium, 2005. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437 69–87.
- Clarke, L., 1998. Centromeres: proteins, protein complexes, and repeated domains at centromeres of simple eukaryotes. Curr. Opin. Genet. Dev. 8 212–218.
- Clarke, L., and J. Carbon, 1983. Genomic substitutions of centromeres in Saccharomyces cerevisiae. Nature 305 23–28.
- Cliften, P., P. Sudarsanam, A. Desikan, L. Fulton, B. Fulton et al., 2003. Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science 301 71–76.
- Dujon, B., D. Sherman, G. Fischer, P. Durrens, S. Casaregola et al., 2004. Genome evolution in yeasts. Nature 430 35–44.
- Ewing, B., and P. Green, 1998. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8 186–194.
- Fay, J. C., and J. A. Benavides, 2005. Hypervariable noncoding sequences in Saccharomyces cerevisiae. Genetics 170 1575–1587.
- Galtier, N., M. Gouy and C. Gautier, 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12 543–548.
- Goffeau, A., B. G. Barrell, H. Bussey, R. W. Davis, B. Dujon et al., 1996. Life with 6000 genes. Science 274 546.
- Greig, D., M. Travisano, E. J. Louis and R. H. Borts, 2003. A role for the mismatch repair system during incipient speciation in Saccharomyces. J. Evol. Biol. 16 429–437.
- Haaf, T., and H. F. Willard, 1997. Chromosome-specific alpha-satellite DNA from the centromere of chimpanzee chromosome 4. Chromosoma 106 226–232.
- Haddrill, P. R., B. Charlesworth, D. L. Halligan and P. Andolfatto, 2005. Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content. Genome Biol. 6 R67.
- Hall, T. A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41 95–98.
- Henikoff, S., and Y. Dalal, 2005. Centromeric chromatin: What makes it unique? Curr. Opin. Genet. Dev. 15 177–184.
- Henikoff, S., K. Ahmad and H. S. Malik, 2001. The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293 1098–1102.
- Hirsh, A. E., H. B. Fraser and D. P. Wall, 2005. Adjusting for selection on synonymous sites in estimates of evolutionary distance. Mol. Biol. Evol. 22 174–177.
- Hudson, R. R., M. Kreitman and M. Aguade, 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116 153–159.
- Johnson, L. J., V. Koufopanou, M. R. Goddard, R. Hetherington, S. M. Schafer et al., 2004. Population genetics of the wild yeast Saccharomyces paradoxus. Genetics 166 43–52.
- Kellis, M., N. Patterson, M. Endrizzi, B. Birren and E. S. Lander, 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423 241–254.
- Kellis, M., B. W. Birren and E. S. Lander, 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428 617–624.
- Koufopanou, V., J. Hughes, G. Bell and A. Burt, 2006. The spatial scale of genetic differentiation in a model organism: the wild yeast Saccharomyces paradoxus. Philos. Trans. R. Soc. B 361 1941–1946.
- Lee, H. R., W. L. Zhang, T. Langdon, W. W. Jin, H. H. Yan et al., 2005. Chromatin immunoprecipitation cloning reveals rapid evolutionary patterns of centromeric DNA in Oryza species. Proc. Natl. Acad. Sci. USA 102 11793–11798.
- Levinson, G., and G. A. Gutman, 1987. Slipped-strand mispairing—a major mechanism for DNA-sequence evolution. Mol. Biol. Evol. 4 203–221.
- Liti, G., and E. J. Louis, 2005. Yeast evolution and comparative genomics. Annu. Rev. Microbiol. 59 135–153.
- Liti, G., D. B. H. Barton and E. J. Louis, 2006. Sequence diversity, reproductive isolation and species concepts in Saccharomyces. Genetics 174 839–850.
- Ma, J., R. A. Wing, J. L. Bennetzen and S. A. Jackson, 2007. Plant centromere organization: a dynamic structure with conserved functions. Trends Genet. 23 134–139.
- Malik, H. S., and S. Henikoff, 2002. Conflict begets complexity: the evolution of centromeres. Curr. Opin. Genet. Dev. 12 711–718.
- Murphy, T. D., and G. H. Karpen, 1998. Centromeres take flight: alpha satellite and the quest for the human centromere. Cell 93 317–320.
- Naumov, G. I., E. S. Naumova and P. D. Sniegowski, 1997. Differentiation of European and Far East Asian populations of Saccharomyces paradoxus by allozyme analysis. Int. J. Syst. Bacteriol. 47 341–344.
- Schein, M., Z. Yang, T. Mitchell-Olds and K. J. Schmid, 2004. Rapid evolution of a pollen-specific oleosin-like gene family from Arabidopsis thaliana and closely related species. Mol. Biol. Evol. 21 659–669.
- Shapiro, J. A., X. Q. Huang, C. Zhang, M. J. Hubisz, J. Lu et al., 2007. Adaptive genic evolution in the Drosophila genomes. Proc. Natl. Acad. Sci. USA 104 2271–2276.
- Sullivan, B. A., M. D. Blower and G. H. Karpen, 2001. Determining centromere identity: cyclical stories and forking paths. Nat. Rev. Genet. 2 584–596.
- Swofford, D. L., 1999. PAUP* Phylogenetic Analysis Using Parsimony. Sinauer, Sunderland, MA.
- Thornton, K., 2003. libsequence: a C++ class library for evolutionary genetic analysis. Bioinformatics 19 2325–2327.
- Ventura, M., F. Antonacci, M. F. Cardone, R. Stanyon, P. D'Addabbo et al., 2007. Evolutionary formation of new centromeres in macaque. Science 316 243–246.
- Vilella, A. J., A. Blanco-Garcia, S. Hutter and J. Rozas, 2005. VariScan: analysis of evolutionary patterns from large-scale DNA sequence polymorphism data. Bioinformatics 21 2791–2793.
- Wright, S. I., and B. Charlesworth, 2004. The HKA test revisited: a maximum-likelihood-ratio test of the standard neutral model. Genetics 168 1071–1076.