Nucleic Acids Research. 2001 Dec 15;29(24):5156–5162. doi: 10.1093/nar/29.24.5156

Effect of chromosomal locus, GC content and length of homology on PCR-mediated targeted gene replacement in Saccharomyces

Misa Gray 1, Saul M Honigberg 1,a
PMCID: PMC97614  PMID: 11812849

Abstract

Targeted gene replacement (TGR) using fragments generated by PCR is a widely used technique for deleting genes in Saccharomyces cerevisiae. We found that the efficiency of this procedure, defined as the fraction of transformants in which the targeted gene is deleted, varied by >10-fold depending on the sequence being targeted. We examined the effect of chromosomal position, length of homology and GC content on TGR efficiency. When URA3 was positioned at five different chromosomal locations, the efficiency of replacing this gene with LEU2 remained the same. Similarly, varying the length of homology from 35 to 60 bp had only a small effect on the efficiency of targeting (<50%), though an increase in the length of homology to 200 bp on one end of the disruption fragment did increase TGR efficiency. Strikingly, as GC content in the target sequence increased, the efficiency of targeting also increased. When TGR efficiency was high, the frequency of untargeted integration events was low. These results suggest two strategies for designing TGR primers: (i) use 40 bp targeting sequences containing 40–50% GC, and (ii) if necessary, increase TGR efficiency by extending the length of homology on one end of the disruption fragment.

INTRODUCTION

Deletion of specific genes is a powerful tool for deciphering gene function. In the yeast Saccharomyces cerevisiae, a common method for gene deletion uses a fragment amplified by PCR. In a simple version of this method, here termed targeted gene replacement (TGR), a selectable marker is amplified using oligonucleotides with 5′-tails containing sequences identical to the ends of the targeted gene (see Fig. 1A) (1,2). This PCR fragment is transformed into yeast, and the transformants are selected for the marker gene. Homologous recombination between the disruption fragment and the targeted gene on the chromosome leads to replacement of the targeted gene with the selectable marker. Besides TGR, transformants can also arise from untargeted recombination events; for example, conversion of the chromosomal allele of the marker gene. Thus, transformants are screened to identify those resulting from TGR.

Figure 1. Effect of GC content on TGR efficiency. (A) Diagram of PCR fragment-mediated gene disruption as described previously (1). (B) TGR efficiency of different disruption fragments targeted at the same gene. TGR efficiencies of the CAT8 and SNF3 disruption fragments shown in Table 1 were compared to TGR efficiencies of disruption fragments targeted <100 bp from the original fragments. The average of the GC content in the left and right targeting regions is shown beneath each bar. (C) A scatter plot of the data shown in (B) and in Table 1. The arrows point to the data from the two CAT8 targeting experiments.

Because the frequency of TGR among transformants (termed TGR efficiency) is often much lower than the frequency of untargeted events, several strategies have been devised to increase this efficiency. One such strategy involves extending the length of the targeting region by fusing a targeting sequence of several hundred base pairs to each end of the marker gene (3,4); this extended homology greatly increases the TGR efficiency (5). Another strategy involves using strains in which the chromosomal copy of the marker gene is deleted, so that this gene cannot be converted to the wild-type allele by the disruption fragment. Finally, one can use marker genes that do not have close homologs in the S.cerevisiae genome, such as the Kluyveromyces lactis URA3 gene or the bacterial kanr gene (5–7). Gene replacements containing the kanr marker have now been constructed in >75% of the long ORFs present in the S.cerevisiae genome (8). Genomic DNA from these disruption strains can be used as template to amplify disruption fragments containing extensive homology on both ends of the gene.

In addition to deleting genes, TGR is also widely used to target fusion genes, conditional promoters and epitope-tagged genes to the chromosome (9–13). Thus, it is both interesting and important to define the parameters affecting TGR efficiency. These parameters might include: (i) the DNA sequence of the region of homology (sequence effects), (ii) the length of the homologous region (length effects), and (iii) the chromatin structure or other features determined by the location of the targeted gene on the chromosome (locus effects). These parameters have all been shown to affect the efficiency of other types of homologous recombination in yeast (14–18) as well as in other organisms.

In this study, we report that TGR efficiency varies >10-fold among the 10 different targeted regions we tested. Interestingly, targeted regions with higher GC contents tended to have higher efficiencies of targeting. In contrast, increasing the length of homology in the range from 35 to 60 bp had only a modest effect on targeting efficiency, and when the same gene was placed at different locations, the efficiency of TGR did not vary. Finally, increasing the length of homology to 200 bp, even on only one side of the disruption fragment, greatly increased TGR efficiency.

MATERIALS AND METHODS

Primers used to generate disruption fragments

All disruption fragments were generated by PCR and used either the pRS305 (LEU2) or pRS306 (URA3) vectors (19) as template. All primers used to generate disruption fragments contained the same (universal) vector sequence on their 3′ ends and differed only in their 5′-tail sequence: for the left primer the universal sequence was TAACTATGCGGCATCAGAGC and for the right primer this sequence was CCTGATGCGGTATTTTCTCC. The primers used to construct the 55 bp target RGT1 disruption fragment contained the following sequence at their 5′-tails: left primer, CGCAATTTTCATTGGTGCAATAATAACACGCTTCTTCGA; right primer, TGGTCATGAAGCTGTCAAACTCGATGAATGAAGTAGTTCAAATCACCAGCGTGCT. The primers used to construct the 35, 40, 45 and 50 bp RGT1 disruption fragments had the same 5′ end as above, but the targeted region extended for correspondingly fewer base pairs. Similarly, the primers for the 60 bp homology CAT8 disruption fragment contained the following sequence on their 5′ ends: left primer, AAATAATAATTCTGATCGACAAGGTTTGGAACCCAGAGTCATTAGAACTCTGGTTCACA; right primer, GTTTTGCCATTGGAATAAATCAGATACATTATCTGTGTTGGAACTTCCGCCACCAGAACT. The primers used to construct the 40 and 50 bp homology CAT8 disruption fragments had the same 5′ ends as above but the targeted region extended for correspondingly fewer base pairs.
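The primer layout described above (a gene-specific 5′-tail followed by a universal 3′ vector sequence) can be sketched in a few lines of Python. This is only an illustration: the function name `build_primer` is hypothetical, and the sketch assumes that the shorter-homology primers keep the 5′ end of the 55 nt tail and drop bases from the side that joins the vector sequence, which is how we read "the same 5′ end as above". The tail is the RGT1 right-primer sequence quoted in the text, written as one contiguous string.

```python
# Universal 3' vector sequence of the right primer, as quoted in the text.
UNIVERSAL_RIGHT = "CCTGATGCGGTATTTTCTCC"

# 5'-tail of the right primer for the 55 bp RGT1 disruption fragment (from the text).
RGT1_RIGHT_TAIL_55 = "TGGTCATGAAGCTGTCAAACTCGATGAATGAAGTAGTTCAAATCACCAGCGTGCT"


def build_primer(tail: str, universal: str, homology_len: int) -> str:
    """Return the first `homology_len` nt of the tail followed by the universal 3' end.

    Assumption (not stated explicitly in the text): shorter tails are made by
    keeping the 5' end and trimming the side adjacent to the vector sequence.
    """
    if not 0 < homology_len <= len(tail):
        raise ValueError("homology_len must be between 1 and len(tail)")
    return tail[:homology_len] + universal


# One primer per homology length used for the RGT1 series.
primers = {n: build_primer(RGT1_RIGHT_TAIL_55, UNIVERSAL_RIGHT, n)
           for n in (35, 40, 45, 50, 55)}

for n, seq in primers.items():
    print(f"{n} bp homology: {len(seq)} nt primer  {seq}")
```

Each primer is the homology length plus the 20 nt universal sequence, so the series runs from 55 to 75 nt.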

The 5′-tail sequence for the primers used to generate other disruption fragments all contained 40 bp of homology to the target at both left and right ends and were as follows. SNF3: left, CCGCATCGCACATTCTAAAGAACAAAAGGAGGAGGAACTG; right, CAGGTTGATTAGTGGCGTTTTCTTCGCTTGAGTGGAGGAT; SNF3 with low GC content: left, AATTCAGTCATACTGAGAAAAACTAGACAATAGTCCTATC; right, ATTAAATTAATTATTTCAAATCATTATTTTCATTTACAGG; CAT8 with high GC: left, CATTAGAACTCTGGGTTCACAAGCGCTTAGCGGTGGTAGC; right, ATTTTTCTGAGGCCGTCTTGGACCTCGGTGGGACGCCGAG; YHP1: left, GCAGAAATACCGTGCTTCCTTCTTTACCAAACATAATAAC; right, CAAGGGGTTTTCTTTCGTATGCATTGAACTTTAGGCGGTT; RGT2: left, AGAGCAGATCAGGAATAGTATCAATAGTCTAAACCATCAA; right, AATGTTAAAAGAAACTCAAGAGTCGTGTATAAGAATAAT; SUM1: left, CAGCCCCTTCTGATAACATAACCAATGAACAGAGACTTCC; right, GCGGTATGATTCTGGAATATTTGACCTGGTTGTTTTCTGA; SWE1: left, GATGAGTTCTTTGGACGAGGATGAAGAGGACTTCGAAATG; right, TTGCGTGTCATTTCTACATACAGGCATTCCTCAGTTTGTA; SWI4: left, CAGCAGTAATAACAATAATAATGGCAATAATAATAACTTG; right, GATACCAATCAAGTTTCTGTATTTATCTAACTTCACCGAT. All primers were chosen using Primer 3 (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi ) focusing on sequences just inside the ORF. Parameters were chosen to minimize secondary structure, primer–primer complementation and differences in Tm between the two primers.
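As a quick check on the targeting sequences listed above, the GC content of a tail can be computed directly. The sketch below is a minimal example, not part of the original methods; the helper name `gc_percent` is hypothetical, and the two sequences are the SNF3 left and right tails copied from the text.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# SNF3 targeting tails as listed in the text.
SNF3_LEFT = "CCGCATCGCACATTCTAAAGAACAAAAGGAGGAGGAACTG"
SNF3_RIGHT = "CAGGTTGATTAGTGGCGTTTTCTTCGCTTGAGTGGAGGAT"

for name, tail in (("SNF3 left", SNF3_LEFT), ("SNF3 right", SNF3_RIGHT)):
    print(f"{name}: {len(tail)} nt, {gc_percent(tail):.1f}% GC")
```

Both tails contain 19 G or C bases out of 40 (47.5%), which agrees with the 48% listed for SNF3 in Table 1 after rounding.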

The primers used to generate the v:LEU2:u fragment were a left primer consisting of the 20 bp of universal vector sequence described above and a right primer containing the sequence AATAGTCCTCTTCCAACAATAATAATGTCAGATCCTGTAG at its 5′ end (followed by the universal sequence). The URA3 extension fragment was amplified with a left primer that was the reverse complement of the right primer of v:LEU2:u and a right primer with the sequence CTTTTCTGTAACGTTCACCC; this fragment contains 199 bp of URA3 sequence.

PCR reactions

PCR reactions to generate disruption fragments contained 1 µM primer, 0.1 mM dNTP, 1.5 mM MgCl2, 10 mM Tris–HCl pH 8.3, 50 mM KCl, ∼0.01 µg pRS305 or pRS306 template DNA and 3 U of Taq polymerase (Promega) per 100 µl reaction. Reactions were heated to 95°C for 5 min, followed by 10 cycles of 94°C for 30 s, 54°C for 1 min and 72°C for 2 min, and then another 20 cycles of 94°C for 30 s, 65°C for 1 min and 72°C for 2 min. An elongation step at 72°C for 10 min was added at the end of the reaction. Disruption fragments were purified using the Wizard kit (Promega). The fusion PCR to generate the one-arm extension product was performed by mixing 1:100 dilutions of the v:LEU2:u and URA3 extension fragments. The 40 bp at the left end of the extension fragment matched the 40 bp at the right end of the v:LEU2:u fragment. Fusion PCR reaction conditions were the same as for the standard PCR reactions except that the annealing step in all 30 cycles was 55°C for 30 s.
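The standard cycling program can be written down as a small data structure, which makes the two-stage annealing scheme explicit and gives the total programmed hold time. This is a sketch for bookkeeping only; the variable names are hypothetical and ramp times between temperatures are ignored.

```python
# Each step is (temperature in deg C, duration in seconds), taken from the text.
INITIAL_DENATURATION = [(95, 5 * 60)]
CYCLE_LOW_ANNEAL = [(94, 30), (54, 60), (72, 120)]    # first 10 cycles
CYCLE_HIGH_ANNEAL = [(94, 30), (65, 60), (72, 120)]   # remaining 20 cycles
FINAL_ELONGATION = [(72, 10 * 60)]

# The program is a list of (number of repeats, steps) blocks.
PROGRAM = [
    (1, INITIAL_DENATURATION),
    (10, CYCLE_LOW_ANNEAL),
    (20, CYCLE_HIGH_ANNEAL),
    (1, FINAL_ELONGATION),
]


def hold_time_seconds(program) -> int:
    """Sum of all programmed hold times; ramping between steps is not counted."""
    return sum(repeats * sum(seconds for _, seconds in steps)
               for repeats, steps in program)


total_cycles = sum(repeats for repeats, steps in PROGRAM if len(steps) == 3)
print(f"{total_cycles} cycles, {hold_time_seconds(PROGRAM) / 60:.0f} min of hold time")
```

The hold times add up to 5 + 10 × 3.5 + 20 × 3.5 + 10 = 120 min over 30 cycles.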

Yeast strains and transformation

LiOAc yeast transformations were performed as described (20). All transformations were in the strain SH677, a derivative of W303-1a with the genotype MATa ade2 can1:ADE2:CAN1 his3-11,15 lys2(3′Δ):HIS3:lys2(5′Δ) leu2-3,112 trp1-1 ura3-1. An aliquot of 0.1–0.2 µg DNA was used for each transformation, yielding ∼2000–4000 colonies/µg. Transformants in which the targeted gene was replaced by the marker gene were identified by diagnostic PCR as follows. A small amount of the colony to be analyzed was lysed by heating in a microwave oven for 1 min and the DNA from this lysate was amplified as described above, except that the annealing temperature for all 30 cycles was 50°C and one of the original primers was replaced by a diagnostic primer (a detailed protocol can be found at http://sgi.bls.umkc.edu/honigberglab/). The 20 bp diagnostic primers were 50–200 bp outside the deleted region of the targeted gene. Six to eight positive control reactions with an isolate known to contain the disruption were performed with each set of diagnostic primers; data were tabulated when all positive controls yielded a band of the appropriate size. The parent strain served as a negative control and did not yield PCR fragments. In transformations in which URA3 was replaced by LEU2, Leu+ transformants were analyzed by replica plating to Ura− medium. Ura− Leu+ isolates were judged to derive from precise replacement of the URA3 gene by the LEU2 marker.

Allele tests for targeted integration and marker conversion

In addition to PCR analysis of transformants, allelism tests for the site of recombination were performed on the transformants generated in four of the experiments (cat8Δ::URA3, rgt2Δ::URA3, swi4Δ::URA3 and swe1Δ::URA3). Briefly, transformants were patched to a YPD master plate, and this plate was crossed to lawns of two different strains. Both strains are isogenic to SH773 (MATα ade2 can1:ADE2:CAN1 his3-11,15 leu2-3,112 trp1-3′Δ ura3-1); in one strain the ura3-1 gene was converted to URA3 (SH1334) and in the other strain the targeted gene was replaced with URA3 (SH1878, SH1651, SH1996 or SH1959). Diploids were selected by replica-plating to His−Lys−Ura− medium for 24 h; the His−Lys−Ura− plates were then replica-plated to sporulation medium. After 48 h, the sporulation plates were replica-plated to canavanine medium, which selects for the haploid products of sporulation. After an additional 2 days, the canavanine plates were replica-plated to medium containing FOA, which selects for Ura− isolates. When the URA3+ gene is present at the same locus in the transformant as in the tester strain, no Ura− haploids will be produced by sporulation (FOAs patch); in contrast, when the URA3 gene is integrated at a non-allelic position, most tetrads contain at least one Ura− spore (FOAr patch).
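The logic of the allele test can be summarized as a small decision function. The sketch below is our reading of the test, not a protocol from the original methods: the function name and the category labels are hypothetical, and the inputs are simply whether the patch was FOA-resistant after the cross to each of the two tester strains.

```python
def classify_transformant(foa_resistant_vs_target_tester: bool,
                          foa_resistant_vs_ura3_tester: bool) -> str:
    """Interpret the two allele-test crosses for one Ura+ transformant.

    An FOA-sensitive patch means no Ura- haploids segregated, i.e. URA3 sits at
    the same locus in the transformant and in that tester strain.
    """
    allelic_to_target = not foa_resistant_vs_target_tester
    allelic_to_ura3 = not foa_resistant_vs_ura3_tester

    if allelic_to_target and not allelic_to_ura3:
        return "targeted replacement"
    if allelic_to_ura3 and not allelic_to_target:
        return "recombination at the ura3-1 locus"
    if not allelic_to_target and not allelic_to_ura3:
        return "integration at another locus"
    # FOA-sensitive in both crosses is not expected for a single integration.
    return "inconclusive"


# Example: sensitive against the target tester, resistant against the URA3 tester.
print(classify_transformant(False, True))
```

A transformant that is FOA-resistant in both crosses carries URA3 at neither tested locus, which corresponds to the untargeted integrants counted in column 3 of Table 1.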

RESULTS

Efficiency of targeted recombination events varies at different genes

For this paper, the efficiency of TGR is defined as the percentage of transformants containing the targeted disruption. In the course of other experiments, we noticed that TGR efficiency varied from 2 to 20% depending on the targeted gene. To understand the causes for this variation, we transformed the same haploid yeast strain (SH677) with PCR disruption fragments targeted at eight different yeast genes (CAT8, RGT1, RGT2, SNF3, SUM1, SWE1, SWI4 and YHP1). Each of these PCR fragments contained 40 bp of homology to the left end of the targeted gene followed by the URA3 gene followed by 40 bp of homology to the right end of the targeted gene (see Materials and Methods). To determine the TGR efficiency, 84–108 independent transformants were screened by diagnostic PCR. Transformants can arise from either targeted or untargeted events (see Introduction), but because the disruption fragments being compared all contained the same marker, the frequency of non-targeted recombination should remain constant. Thus the fraction of transformants arising from TGR reflects the efficiency of targeted integration. This fraction varied from 5 to 33% for the eight genes tested (Table 1, column 2).

Table 1. Efficiency of targeted replacement using x:URA3:x.

Genotype  Percentage targeted (a)  Percentage untargeted (b)  Percentage GC (c)  P value (d)      Size (e) (kb)
                                                              Left    Right      Left    Right
CAT8      5 (100)                  17                         38      33         0.72    0.1      4.2
RGT2      11 (89)                  11                         35      28         0.98    0.33     2.4
SWI4      19 (108)                 3                          33      25         <0.001  0.03     2.2
RGT1      20 (85)                  n.d.                       38      43         0.03    0.33     1.5
SWE1      25 (100)                 4                          45      40         0.12    0.96     1.6
YHP1      26 (93)                  n.d.                       38      43         0.01    0.1      0.8
SUM1      27 (84)                  n.d.                       45      40         0.28    0.02     3.1
SNF3      33 (84)                  n.d.                       48      48         0.17    0.78     2.8

(a) (Targeted replacements/total transformants) × 100%; numbers of transformants tested are in parentheses.

(b) (Untargeted integration of marker/total transformants) × 100%; measured by allele test (see Materials and Methods); n.d., not determined.

(c) The GC content of the left end and the right end targeting region of each disruption fragment.

(d) P value representing the degree of homology between the left or right targeting region and its most homologous sequence in the yeast genome, not including the targeted gene itself.

(e) Size of the region replaced by the disruption fragment.

We asked whether variation in the efficiency of targeted replacement could be ascribed to a feature of the targeted sequence. Using Dotter software to compare sequences (21), we could not identify a consensus sequence containing at least 5 out of 6 bp of homology that was present in the targeting regions of all three fragments that targeted at high efficiency (SNF3, SUM1 and YHP1) and that was absent from sequences with low efficiency (CAT8 and RGT2). Thus we found no evidence of a conserved sequence associated with high TGR efficiency.

Among the eight targeted gene replacement fragments examined, the size of the sequence being replaced varied from 0.8 to 4.2 kb, but this size did not correlate with targeting efficiency (r = –0.40). Because it is possible that disruption fragments integrate into partially homologous sites on the yeast genome, we searched the complete yeast genome for sequences that were similar to the segments being targeted; these partially homologous sites could potentially compete with the targeted site. The degree of homology between the target sequence and its closest homolog in yeast is reflected by a P value (Table 1); this P value is the probability of finding, by chance, a sequence in the yeast genome with as much similarity as there is between the target and its closest yeast homolog. Thus, the smaller the P value, the more significant the homology. P values ranged from 5 × 10⁻⁵ (5′-targeted region of SWI4, which contained 31 out of 36 bp of homology) to 0.84 (5′-targeted region of RGT2, which contained 20 out of 25 bp of homology), but did not correlate with targeting efficiency whether we considered only the highest P value of the two ends, only the lowest, or the average of the two (for average P value, r = –0.3).

As an independent test of the role of partially homologous sequences in TGR efficiency, we examined the CAT8, SWE1, RGT2 and SWI4 transformants using an allelism test (see Materials and Methods). The same transformants identified by the PCR assay as targeted were also identified as targeted by the allelism test, confirming the accuracy of both assays. The allelism assay also distinguishes between transformants arising by recombination at the ura3-1 locus and transformants arising from insertion of the URA3 marker at non-targeted positions. As expected, >80% of untargeted transformants resulted from recombination at the ura3-1 locus (probably gene conversion). Segregation analysis of several of the remaining untargeted transformants showed that they contain URA3 stably integrated at loci other than either the target or ura3-1. The frequency of these non-targeted integrants varied from 3 to 17% of the total transformants (Table 1, column 3). Interestingly, when the TGR efficiency was higher, the frequency of untargeted integration tended to be lower.

Increased GC content in targeted region correlates with increased TGR efficiency

For the eight genes tested above, the GC content of the targeting regions of the fragments varied from 35 to 48%, and the efficiency of TGR tended to be higher when the GC content was higher (Table 1). The correlation coefficient (r) between TGR efficiency and GC content is 0.746, indicating that we can reject the null hypothesis (no correlation) with P < 0.05.
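The correlation can be recomputed from the values in Table 1. The sketch below assumes that the GC content used for each fragment is the mean of its left and right targeting regions (the convention stated in the Figure 1 legend); the helper name `pearson_r` is hypothetical. Because the table values are rounded, the result is expected to be close to, but not necessarily identical to, the reported coefficient.

```python
import math

# (gene, % targeted, % GC left, % GC right), copied from Table 1.
TABLE1 = [
    ("CAT8", 5, 38, 33),
    ("RGT2", 11, 35, 28),
    ("SWI4", 19, 33, 25),
    ("RGT1", 20, 38, 43),
    ("SWE1", 25, 45, 40),
    ("YHP1", 26, 38, 43),
    ("SUM1", 27, 45, 40),
    ("SNF3", 33, 48, 48),
]


def pearson_r(xs, ys) -> float:
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumption: GC content per fragment is the mean of the left and right ends.
mean_gc = [(left + right) / 2 for _, _, left, right in TABLE1]
efficiency = [targeted for _, targeted, _, _ in TABLE1]

r = pearson_r(mean_gc, efficiency)
print(f"r = {r:.3f} for n = {len(TABLE1)} fragments")
```

With eight data points, the two-tailed critical value of r at P = 0.05 is about 0.71, so a coefficient near 0.75 is consistent with the significance stated in the text.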

As a test of the correlation between TGR efficiency and GC content, we chose alternative targeting sequences for two of the genes tested above (SNF3 and CAT8). For both genes, the new targeted regions were within 100 bp of the original targeted regions. The new SNF3 targeted region had lower GC content than the original, whereas the new CAT8 target had higher GC content than the original. For both genes, we found that the efficiency of TGR was higher when the GC content was higher (Fig. 1B). When the results from these two transformations were considered along with the results from the eight genes described above, the correlation between GC content and TGR efficiency remained significant (r = 0.645, P < 0.05). All 10 experiments are shown in a scatter plot (Fig. 1C). Note that the range of TGR efficiency in this figure was >10-fold. It is also important to note that the CAT8 disruption fragments (Fig. 1C, arrows) promoted TGR at lower efficiencies relative to their GC content than the other fragments. Thus, it is unlikely that GC content is the only determinant of TGR efficiency.

Effect of changing length of targeted region on TGR efficiency

High TGR efficiency may correlate with high GC content because pairing between the disruption fragment and the chromosome is more stable at higher GC content. Because the stability (Tm) of any paired DNA region depends on both its length and GC content, we determined whether changing the length of the targeted region affects TGR efficiency. For this purpose, we chose a gene that had high TGR efficiency (RGT1) and a gene that had low TGR efficiency (CAT8). Each of the RGT1::URA3 disruption fragments had the same ends but the targeting region extended for either 35, 40, 45, 50 or 55 bp from these ends (Fig. 2A). Similarly, the CAT8:URA3 disruption fragments extended either 40, 50 or 60 bp from the same ends. For both genes, as the length of the homologous region increased, the Tm of this region increased 10–12°C while the GC content remained relatively constant. For both genes, on average the TGR efficiency increased only slightly as the length of homology increased, and in some cases the TGR efficiency actually decreased with increased size (Fig. 2B).

Figure 2. Effect of length of homology on efficiency of targeted integration. (A) Disruption fragments used in this experiment. All five fragments have the same left and right ends, but the regions of homology extend for different lengths (from 35 to 55 bp) before joining with the URA3 gene, indicated by the number to the left of each fragment. The right-hand targeted regions of these disruption fragments are not shown, but these sequences also share the same end as one another and extend for the same lengths as the left-hand ends. (B) The efficiency of transformation for disruption fragments of CAT8 (open circles) and RGT1 (filled circles) with targeted regions of different lengths as shown in (A). The error bars represent the standard error of the mean of three independent determinations; at least 60 transformants were screened for each determination.

To test reproducibility, each of the transformations for the experiment shown in Figure 2 was done in triplicate, and for each transformation, 50 to 100 colonies were screened. The data shown are the mean of the three experiments with the standard error of the mean represented by the length of the error bars. Note that the variation in the data between experiments is far less than the 10-fold range in TGR efficiency shown in Figure 1C.

Role of chromosome location

We next asked whether TGR efficiency was affected by chromosomal location. RGT2 and SNF3 are 0.1 Mb apart on chromosome IVL; YHP1 and SUM1 are ∼1.1 Mb away from these genes on the other arm of chromosome IV and separated from each other by 0.2 Mb. CAT8, RGT1, SWE1 and SWI4 are on other chromosomes (XIII, XI, X and V, respectively). The rgt2Δ::URA3, snf3Δ::URA3, yhp1Δ::URA3, rgt1Δ::URA3 and cat8Δ::URA3 strains constructed above were transformed with a fragment that targeted replacement of URA3 with LEU2 (‘marker replacement’). This disruption fragment, v:LEU2:u, contained 40 bp of homology to vector sequence inserted with the original URA3 disruption, followed by LEU2, followed by 40 bp of homology to an internal sequence in URA3. All five loci had TGR efficiencies in the range of 6–8% (Table 2). Thus, for the genes tested, the chromosomal location did not have a large effect on TGR efficiency. Interestingly, the TGR efficiency relative to the GC content (left side, 48%; right side, 35%) was lower for the v:LEU2:u fragment than for the x:URA3:x fragments shown in the previous experiments (Fig. 1C). The most likely explanation for this difference is that conversion of the leu2-3,112 locus occurred at higher frequencies than conversion of the ura3-1 locus.

Table 2. Efficiency of targeted replacement of URA3 gene.

Genotype    Percentage targeted (a)
            v:L:u (b)    v:L:U(199) (c)
CAT8:URA3   8 (100)      19 (129)
RGT2:URA3   7 (100)      16 (118)
RGT1:URA3   7 (96)       19 (120)
YHP1:URA3   7 (96)       15 (120)
SNF3:URA3   6 (98)       17 (120)

(a) (Replacements/total transformants) × 100%; numbers of transformants are in parentheses.

(b) v:L:u contains 40 bp of vector sequence present at the 5′ end of each disrupted gene, followed by LEU2, followed by 40 bp of URA3 sequence present at the 3′ end of each disrupted gene.

(c) v:L:U(199) contains 40 bp of vector sequence present at the 5′ end of each disrupted gene, followed by LEU2, followed by 199 bp of URA3 sequence present at the 3′ end of each disrupted gene.

200 bp one-arm extended fragments greatly increase TGR frequency

Although TGR efficiency increased only modestly when the length of homology was increased from 35 to 55 bp, earlier studies show that large regions of homology at both ends of a disruption fragment (3) greatly increase this efficiency (5). To further explore the effect of length on TGR efficiency, we extended the length of homology on one end of the v:LEU2:u disruption fragment from 40 to 199 bp using fusion PCR (see Materials and Methods). This ‘one-arm extended’ fragment, v:LEU2:u(199), was then used for marker replacement in the same five strains as in the previous experiment. The TGR efficiency with the one-arm extended fragment was roughly two to three times greater than with the original fragment in each of the five strains (Table 2). Thus, increasing the length of the targeting regions from 40 to 60 bp had little effect on TGR efficiency, but increasing the length of the targeting region on one end of the fragment to 200 bp greatly increased this efficiency.
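The size of the one-arm extension effect can be read directly from Table 2 by dividing the two percentages for each strain. The short sketch below does this; the dictionary layout and variable names are illustrative only, and the inputs are the rounded percentages from the table rather than raw colony counts.

```python
# strain: (% targeted with v:L:u, % targeted with v:L:U(199)), from Table 2.
TABLE2 = {
    "CAT8:URA3": (8, 19),
    "RGT2:URA3": (7, 16),
    "RGT1:URA3": (7, 19),
    "YHP1:URA3": (7, 15),
    "SNF3:URA3": (6, 17),
}

# Fold increase in TGR efficiency for the 199 bp arm relative to the 40 bp arm.
fold_increase = {strain: extended / short
                 for strain, (short, extended) in TABLE2.items()}

for strain, fold in fold_increase.items():
    print(f"{strain}: {fold:.1f}-fold")

mean_fold = sum(fold_increase.values()) / len(fold_increase)
print(f"mean: {mean_fold:.1f}-fold")
```

All five ratios fall between 2 and 3, with little variation between loci, in line with the earlier observation that chromosomal location had no large effect.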

DISCUSSION

TGR using fragments generated by PCR is a common method for creating gene deletions in yeast; thus it is both interesting and valuable to understand the parameters that determine the efficacy of this technique. Our two principal results are: (i) the frequency of TGR/transformant (termed TGR efficiency) varies by >10-fold when different sequences are targeted; (ii) the TGR efficiency tends to increase as GC content increases.

Differences in TGR efficiency at different genes might be caused by several factors. For example, the specific location of the targeted gene in the genome could affect the accessibility of this gene to the disruption fragment. However, we found that the same gene was targeted at the same efficiency when it was placed at five different locations in the genome. Thus our results indicate that chromosomal location is not the primary determinant of TGR efficiency, at least for the five genes we tested. This does not necessarily imply that chromatin structure does not affect TGR efficiency. For example, to the extent that the sequence of URA3 determines its chromatin structure, it would have the same structure regardless of its chromosomal location.

In addition to the chromosomal location of the target gene, other parameters that might affect TGR efficiency include: (i) the strain background, (ii) the marker gene present on the disruption fragment, and (iii) the size of the gene being replaced. The first two parameters were not addressed in this study because the same strain and marker gene were used for all comparisons. The size of the replaced genes ranged from 0.8 to 4.2 kb for the different targeted genes examined here, but the size of the replaced gene did not correlate with TGR efficiency.

Given the above results, the most likely cause for differences in TGR efficiency is the DNA sequence of the targeted region. We were unable to detect a specific sequence associated with high TGR efficiency among the 10 targeted sequences we examined, and weak homology of the targeted region to other sequences in the genome also did not correlate with TGR efficiency. On the other hand, GC content correlates strongly with TGR efficiency. Because GC base pairs are more stable than AT base pairs, one explanation for this correlation is that TGR efficiency is limited by the stability of the association between the disruption fragment and the chromosome. If this hypothesis were true, then increasing the length of the targeted region might also increase the stability of association and hence TGR efficiency. However, increasing the length of homology by up to 50% for either RGT1 or CAT8, while greatly increasing the Tm of the targeted region, only slightly increased the TGR efficiency.

One explanation for TGR efficiency being affected by GC content but not target length is that the initial region of pairing between disruption fragment and chromosome may be short (<35 bp). In this case, GC content would affect the stability of this pairing much more than the target size. Interestingly, in vitro pairing and strand exchange reactions mediated by either the Escherichia coli recA protein or the human rad51 protein are more efficient when substrates have high AT content (22); this is likely to be due to the rapid exchange of A:T base pairs relative to G:C base pairs (23). If increased GC content does not promote pairing, its positive effect on TGR efficiency might instead result from inhibiting mismatch repair. Mismatch repair enzymes block recombination between diverged sequences (reviewed in 23) and thus may also inhibit TGR efficiency, which depends on pairing relatively short regions of homology.

Creating disruption fragments that contain several hundred base pairs of homology at either end by fusion PCR (3) greatly increases the efficiency of TGR (5). Our results are consistent with these earlier studies and show additionally that TGR efficiency can be increased even when the homology is extended at only one end of the fragment (one-arm extension). We are uncertain why large, but not small, increases in the targeted region affect TGR efficiency, but one possibility is that homologous recombination proceeds through an alternative and more efficient pathway once the length of the targeting region reaches a threshold. Indeed, for several types of homologous recombination, the minimal length of homology (MEPS) required for homologous recombination has been extrapolated to 200–300 bp (16,17).

Although GC content correlates with TGR efficiency, other factors also affect this efficiency, and the effect of DNA sequence on integration efficiency is likely to be complex. For example, both CAT8 disruption fragments yielded lower TGR efficiency relative to their GC content than the other genes tested. This result might indicate that chromatin structure at the CAT8 locus is more refractory to TGR than at the other loci we tested. Another possibility is that secondary structure formed after resection of the 5′ ends of the disruption fragment affects TGR efficiency. Indeed, secondary structure in single-stranded substrates inhibits association of recombination proteins with DNA substrates in vitro (24). Although TGR efficiency did not correlate with the P values of the closest homolog to the target gene, for the four genes we tested, those with higher TGR efficiencies tended to have lower efficiencies of untargeted integration. One possible explanation is that the efficiency of processing the ends of the fragment (for example, to leave 3′-tails) limits TGR efficiency. By this view, once ends are formed they integrate either at a targeted locus or at an untargeted locus, and the relative efficiencies of these two pathways depend largely on the GC content of the targeted locus.

Our results suggest two strategies that may assist in construction of targeted gene replacements in yeast. First, it is important to design primers with high GC content, especially in the range from 40 to 50% GC, but it is unnecessary to initially target regions >40 bp. We are uncertain about the effect of GC content >50%, but for most gene replacements marked with URA3, a 40 bp disruption fragment should yield a TGR efficiency of >20% when the GC content is between 40 and 50%. Secondly, for genes where the TGR efficiency is unusually low, extending only one end of the original fragment by fusion PCR is expected to greatly increase this efficiency.
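The first strategy can be applied as a simple screen: slide a 40 nt window along a candidate region and keep the windows whose GC content falls between 40 and 50%. The sketch below is one possible way to do this and is not a procedure from the study; the function names are hypothetical, and the input is the 60 nt CAT8 right-primer tail from Materials and Methods, used here only as a convenient stand-in for a stretch of target sequence.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    return 100.0 * sum(base in "GC" for base in seq.upper()) / len(seq)


def candidate_windows(region: str, length: int = 40,
                      gc_min: float = 40.0, gc_max: float = 50.0):
    """Yield (offset, window, GC%) for every window inside the GC range."""
    region = region.upper()
    for start in range(len(region) - length + 1):
        window = region[start:start + length]
        gc = gc_percent(window)
        if gc_min <= gc <= gc_max:
            yield start, window, gc


# 60 nt CAT8 right-primer tail from the text, used as example input only.
REGION = "GTTTTGCCATTGGAATAAATCAGATACATTATCTGTGTTGGAACTTCCGCCACCAGAACT"

hits = list(candidate_windows(REGION))
for start, window, gc in hits:
    print(f"offset {start:2d}  {gc:4.1f}% GC  {window}")
```

In this example the first 40 nt have 32.5% GC and are rejected, whereas the window starting at offset 20 has 45% GC and is kept. Other primer design constraints mentioned in Materials and Methods, such as secondary structure and primer–primer complementarity, are not checked here.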

ACKNOWLEDGEMENTS

We thank Dr C. Radding (Yale University), Dr T. Menees (University of Missouri—Kansas City) and members of the Honigberg lab for helpful discussions. We thank Dr Marilyn Yoder (University of Missouri—Kansas City) for help with computer analysis of sequence homology, and Dr Benton Cobb (University of Kansas) for advice on statistical analysis. The work was supported by NIH grant R01-GM58013.

REFERENCES

  • 1. Baudin,A., Ozier-Kalogeropoulos,O., Denouel,A., Lacroute,F. and Cullin,C. (1993) A simple and efficient method for direct gene deletion in Saccharomyces cerevisiae. Nucleic Acids Res., 21, 3329–3330.
  • 2. Lorenz,M., Muir,R., Lim,E., McElver,J., Weber,S.C. and Heitman,J. (1995) Gene disruption with PCR products in Saccharomyces cerevisiae. Gene, 158, 113–117.
  • 3. Amberg,D.C., Botstein,D. and Beasley,E.M. (1995) Precise gene disruption in Saccharomyces cerevisiae by double fusion polymerase chain reaction. Yeast, 11, 1275–1280.
  • 4. Nikawa,J. and Kawabata,M. (1998) PCR- and ligation-mediated synthesis of marker cassettes with long flanking homology regions for gene disruption in Saccharomyces cerevisiae. Nucleic Acids Res., 26, 860–861.
  • 5. Wach,A. (1996) PCR-synthesis of marker cassettes with long flanking homology regions for gene disruptions in S. cerevisiae. Yeast, 12, 259–265.
  • 6. Wach,A., Brachat,A., Pohlmann,R. and Philippsen,P. (1994) New heterologous modules for classical or PCR-based gene disruptions in Saccharomyces cerevisiae. Yeast, 10, 1793–1808.
  • 7. Langle-Rouault,F. and Jacobs,E. (1995) A method for performing precise alterations in the yeast genome using a recyclable selectable marker. Nucleic Acids Res., 23, 3079–3081.
  • 8. Winzeler,E.A., Shoemaker,D.D., Astromoff,A., Liang,H., Anderson,K., Andre,B., Bangham,R., Benito,R., Boeke,J.D., Bussey,H. et al. (1999) Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science, 285, 901–906.
  • 9. Schneider,B.L., Seufert,W., Steiner,B., Yang,Q.H. and Futcher,A.B. (1995) Use of polymerase chain reaction epitope tagging for protein tagging in Saccharomyces cerevisiae. Yeast, 11, 1265–1274.
  • 10. Lafontaine,D. and Tollervey,D. (1996) One-step PCR mediated strategy for the construction of conditionally expressed and epitope tagged yeast proteins. Nucleic Acids Res., 24, 3469–3471.
  • 11. Wach,A., Brachat,A., Alberti-Segui,C., Rebischung,C. and Philippsen,P. (1997) Heterologous HIS3 marker and GFP reporter modules for PCR-targeting in Saccharomyces cerevisiae. Yeast, 13, 1065–1075.
  • 12. Puig,O., Rutz,B., Luukkonen,B.G., Kandels-Lewis,S., Bragado-Nilsson,E. and Seraphin,B. (1998) New constructs and strategies for efficient PCR-based gene manipulations in yeast. Yeast, 14, 1139–1146.
  • 13. Longtine,M.S., McKenzie,A.,III, Demarini,D.J., Shah,N.G., Wach,A., Brachat,A., Philippsen,P. and Pringle,J.R. (1998) Additional modules for versatile and economical PCR-based gene deletion and modification in Saccharomyces cerevisiae. Yeast, 14, 953–961.
  • 14. Mizuno,K., Emura,Y., Baur,M., Kohli,J., Ohta,K. and Shibata,T. (1997) The meiotic recombination hot spot created by the single-base substitution ade6-M26 results in remodeling of chromatin structure in fission yeast. Genes Dev., 11, 876–886.
  • 15. Fritze,C.E., Verschueren,K., Strich,R. and Easton Esposito,R. (1997) Direct evidence for SIR2 modulation of chromatin structure in yeast rDNA. EMBO J., 16, 6495–6509.
  • 16. Jinks-Robertson,S., Michelitch,M. and Ramcharan,S. (1993) Substrate length requirements for efficient mitotic recombination in Saccharomyces cerevisiae. Mol. Cell. Biol., 13, 3937–3950.
  • 17. Inbar,O., Liefshitz,B., Bitan,G. and Kupiec,M. (2000) The relationship between homology length and crossing over during the repair of a broken chromosome. J. Biol. Chem., 275, 30833–30838.
  • 18. Petes,T.D. (2001) Meiotic recombination hot spots and cold spots. Nat. Rev. Genet., 2, 360–369.
  • 19. Sikorski,R.S. and Hieter,P. (1989) A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics, 122, 19–27.
  • 20. Rose,M.D., Winston,F. and Hieter,P. (1990) Methods in Yeast Genetics: A Laboratory Course Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
  • 21. Sonnhammer,E.L. and Durbin,R. (1995) A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene, 167, GC1–GC10.
  • 22. Gupta,R.C., Folta-Stogniew,E. and Radding,C.M. (1999) Human Rad51 protein can form homologous joints in the absence of net strand exchange. J. Biol. Chem., 274, 1248–1256.
  • 23. Gupta,R.C., Folta-Stogniew,E., O'Malley,S., Takahashi,M. and Radding,C.M. (1999) Rapid exchange of A:T base pairs is essential for recognition of DNA homology by human Rad51 recombination protein. Mol. Cell, 4, 705–714.
  • 24. Biet,E., Sun,J. and Dutreix,M. (1999) Conserved sequence preference in DNA binding among recombination proteins: an effect of ssDNA secondary structure. Nucleic Acids Res., 27, 596–600.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
