Abstract
To investigate the relationship between T-DNA integration and double-stranded break (DSB) repair in Arabidopsis, we studied 67 T-DNA/plant DNA junctions and 13 T-DNA/T-DNA junctions derived from transgenic plants. Three different types of T-DNA-associated joining could be distinguished. A minority of T-DNA/plant DNA junctions were joined by a simple ligation-like mechanism, resulting in a junction without microhomology or filler DNA insertions. For about one-half of all analyzed junctions, joining of the two ends occurred without insertion of filler sequences. For these junctions, microhomology was strikingly combined with deletions of the T-DNA ends. For the remaining plant DNA/T-DNA junctions, up to 51-bp-long filler sequences were present between plant DNA and T-DNA contiguous sequences. These filler segments are built from several short sequence motifs, identical to sequence blocks that occur in the T-DNA ends and/or the plant DNA close to the integration site. Mutual microhomologies among the sequence motifs that constitute a filler segment were frequently observed. When T-DNA integration and DSB repair were compared, the most conspicuous difference was the frequency and the structural organization of the filler insertions. In Arabidopsis, no filler insertions were found at DSB repair junctions. In maize (Zea mays) and tobacco (Nicotiana tabacum), DSB repair-associated filler was normally composed of simple, uninterrupted sequence blocks. Thus, although DSB repair and T-DNA integration are probably closely related, both mechanisms have some exclusive and specific characteristics.
The bacterial phytopathogen Agrobacterium tumefaciens provides a natural system to transfer and introduce genetic information into susceptible plant cells. The T-DNA region, a portion of the tumor-inducing plasmid delimited by two 24-bp border repeats, is transferred to the plant cell as a single-stranded DNA after vir (virulence) gene induction (Gheysen et al., 1998; Gelvin, 2003). Upon arrival in the plant cell, the T-strand is targeted to the cell nucleus, and the T-DNA becomes integrated into the host genome by the plant's illegitimate recombination apparatus (Matsumoto et al., 1990; Gheysen et al., 1991; Mayerhofer et al., 1991; Tinland, 1996). Recently, T-DNA integration in the Arabidopsis genome has been shown to depend on the sequence of the pre-insertion site (Brunaud et al., 2002).
That T-DNA strands can be captured at double-stranded breaks (DSBs; Salomon and Puchta, 1998) in tobacco (Nicotiana tabacum) and also that T-DNA integration in yeast (Saccharomyces cerevisiae) depends on nonhomologous end-joining (NHEJ) enzymes (van Attikum et al., 2001) suggest that DSB repair via NHEJ is involved in T-DNA integration. More recently, the T-DNA integration efficiency has been found to decrease in Ku80- and DNA ligase IV-deficient plants (Friesner and Britt, 2003). The first molecular data on NHEJ in plants came from the footprint analysis of excised transposons. Repair at transposon-induced DSBs seems to generate only minor changes, such as small deletions, insertion of filler, or inversions (Gorbunova and Levy, 1999; Yan et al., 1999). Also, T-DNA integration has been studied as an example of NHEJ. Short regions of microhomology, insertions of filler DNA, and small deletions have been reported as features of T-DNA/plant DNA junctions (Matsumoto et al., 1990; Gheysen et al., 1991; Mayerhofer et al., 1991; Tinland, 1996; Fladung, 1999; Kumar and Fladung, 2002; Meza et al., 2002). More direct NHEJ analysis in plants involved the characterization of naturally occurring deletions in plant genomes (Ralston et al., 1988; Wessler et al., 1990) and the sequencing of newly formed junctions at repaired DSBs (Gorbunova and Levy, 1997; Salomon and Puchta, 1998). For five spontaneous deletions of the waxy locus in maize (Zea mays; Wessler et al., 1990), for a spontaneous deletion in the maize bz-R allele (Ralston et al., 1988), and for NHEJ-repaired junctions in tobacco cells (Gorbunova and Levy, 1997; Salomon and Puchta, 1998), the majority of repaired junctions contained uninterrupted filler insertions (Gorbunova and Levy, 1999). Recently, species-specific differences in plant DSB repair have been reported by Kirik et al. (2000), who showed that DSB repair is accompanied by insertions in tobacco cells but not in Arabidopsis, in which large deletions of the joined plant DNA ends are frequent. These findings have been further substantiated by Orel and Puchta (2003). These authors suggest that free DNA break ends in Arabidopsis are less stable and more prone to end degradation than free DNA ends in tobacco cells.
To investigate the relationship between DSB repair by NHEJ and T-DNA integration in Arabidopsis, we sequenced and analyzed 67 T-DNA/plant DNA junctions and 13 junctions between linked T-DNAs. We compared the outcome of the T-DNA/plant DNA and T-DNA/T-DNA joining reaction with the characteristics of NHEJ-repaired DSBs reported in Arabidopsis (Kirik et al., 2000; Orel and Puchta, 2003) and other plant species (Gorbunova and Levy, 1997; Salomon and Puchta, 1998). This comparison mainly focused on the frequency, the presence, and the origin of filler DNA sequences.
RESULTS
Experimental Setup
T-DNA junctions, derived from a population of Arabidopsis transformants (see “Materials and Methods”), were amplified and sequenced. We characterized 44 left T-DNA/plant DNA junctions, 23 right T-DNA/plant DNA junctions, eight junctions between tandemly linked T-DNAs, and five junctions connecting two T-DNAs in inverted orientation, for which one T-DNA was truncated for the T-DNA border region (see supplemental data, available in the online version of this article at http://www.plantphysiol.org). To determine the end point of the T-DNA borders, the T-DNA region was aligned against the T-DNA plasmid sequence. We will refer throughout to a correct nicking process of the T-strand when the T-DNA is cleaved between the third and fourth base of the 24-bp border repeat (Van Haaren et al., 1988).
Further sequence analysis revealed three distinct junction types. A first class included T-DNA junctions that were joined by a simple ligation-like mechanism, resulting in a junction without microhomology or filler DNA insertions. A second class had microhomology at the transition point between two adjacent DNA segments. Here, microhomology is defined as small DNA sequences, ranging from 1 to 7 bp and found at the junction point between the T-DNA and the plant DNA for plant DNA/T-DNA junctions and between two T-DNAs, when linked T-DNAs are studied (Roth et al., 1985; Roth and Wilson, 1986). A third class grouped junctions with insertion of filler DNA. These new, intermediate filler sequences were detected easily because the end point of the T-DNA border and the starting point of the plant DNA for T-DNA/plant DNA junctions could be determined. Similarly, for linked T-DNAs, the end points of the two adjacent T-DNA borders were known. Filler sequences longer than 17 bp were screened for complete colinear sequence identity with the Arabidopsis genome to evaluate whether the observed filler sequences could be uninterrupted sequence blocks that had been copied from the plant genome. In addition, for 21 filler DNAs of 6 bp or more, we analyzed the origin of the filler DNA by looking for identical sequence motifs with: (a) the first 100 bp of the T-DNA sequence situated immediately adjacent to the transition point of the T-DNA junction, (b) a 200-bp plant DNA segment surrounding the T-DNA integration point, and (c) the complete T-DNA plasmid sequence (see “Materials and Methods” and Supplemental Data). An empirical statistical analysis was performed to determine thresholds for reporting identities, allowing us to discriminate mechanistic relevance from chance occurrence (see “Materials and Methods” and Supplemental Data).
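The three junction classes can be expressed as a simple decision rule. The sketch below is illustrative only: the function names, the toy sequences, and the greedy prefix/suffix matching are assumptions of this example, not the procedure used in the study, which relied on alignment against the T-DNA plasmid and the Arabidopsis genome. It assumes a read that starts inside the T-DNA and ends inside the plant DNA, with both reference sequences already anchored to the read ends.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(read: str, tdna_ref: str, plant_ref: str) -> tuple[str, int]:
    """Classify a T-DNA/plant DNA junction read (hypothetical helper).

    read      : junction sequence, T-DNA side first, plant side last
    tdna_ref  : T-DNA reference starting at the first base of the read
    plant_ref : plant reference ending at the last base of the read
    """
    # Bases of the read explained by the T-DNA, counted from the left.
    t = common_prefix_len(read, tdna_ref)
    # Bases of the read explained by the plant DNA, counted from the right.
    p = common_prefix_len(read[::-1], plant_ref[::-1])
    overlap = t + p - len(read)
    if overlap > 0:
        return ("microhomology", overlap)   # bases assignable to both partners
    if overlap == 0:
        return ("simple ligation", 0)       # partners abut exactly
    return ("filler", -overlap)             # bases assignable to neither partner


# Toy sequences, invented for illustration only.
TDNA_REF = "ACGTACGGTCAATT"
PLANT_REF = "GGCCTTAAGCTAGC"

print(classify_junction("ACGTACGGGCTAGC", TDNA_REF, PLANT_REF))
print(classify_junction("ACGTACGGCCTTAAGCTAGC", TDNA_REF, PLANT_REF))
print(classify_junction("ACGTACGGCACGCTAGC", TDNA_REF, PLANT_REF))
```

Because the matching is greedy, a filler base that happens to equal the next reference base is counted as T-DNA or plant DNA, so the filler length reported by such a rule is a lower bound.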
Outcome of the Joining Reaction between T-DNA and Plant DNA Ends
We asked whether T-DNA junction characteristics and the mode of T-DNA joining are related. Therefore, we characterized the outcome of the joining reaction of 67 plant DNA/T-DNA junctions, taking into account the T-DNA border type involved and the degree of T-DNA end processing (Table I). Of 44 left border T-DNA/plant DNA junctions, four had the T-DNA end joined to the plant DNA without the presence of microhomology or the formation of filler DNA sequences, 23 displayed microhomology of 1 bp up to a stretch of 6 bp, and 17 had filler DNA insertions ranging from 1 up to 48 bp (Table I; Supplemental Data). Similar results were obtained for the right border junctions: three of 23 junctions were joined by means of a simple ligation-like joining reaction, 10 had microhomology of 2 up to 7 bp, and 10 had filler DNA ranging from 1 to 51 bp (Table I; Supplemental Data). These results indicate that the frequencies with which simple ligation, microhomology, and filler DNA are observed at left and right border junctions are comparable.
Table I.
T-DNA end | Simple Ligation | Microhomology | Filler Sequences |
---|---|---|---|
Left T-DNA border | 4/44 | 23/44 | 17/44 |
Right T-DNA border | 3/23 | 10/23 | 10/23 |
Intact T-DNA ends | 2/13 | 3/13 | 8/13 |
Processed T-DNA ends | 5/54 | 30/54 | 19/54 |
Total | 7/67 | 33/67 | 27/67 |
We also analyzed whether T-DNA joining characteristics and the degree of T-DNA end processing were correlated (Table I; Supplemental Data). Regarding the T-DNA end processing, our results confirm recent observations (Brunaud et al., 2002; Krysan et al., 2002; Meza et al., 2002) that, although the deletion size at the left T-DNA ends is larger than that at the right borders, processing and trimming is a ubiquitous feature of T-DNA ends. Interestingly, when the degree of T-DNA end processing is correlated with the outcome of the joining reaction, three of 13 intact T-DNA ends were joined to the plant DNA with microhomology (Table I), in contrast to 30 of 54 processed T-DNA ends (Table I). Also striking is that eight of 13 intact and only 19 of 54 processed T-DNA ends were joined with insertion of filler sequences. In conclusion, microhomology was on average 2.4-fold more associated with deleted than with intact T-DNA borders (Table I). Intact T-DNA ends were on average 1.7-fold more associated with filler DNA than were T-DNA junctions with deleted T-DNA ends (Table I). A Fisher exact test was performed that verified that the correlation between the intactness or the processing of T-DNA ends, and the presence of filler DNA or microhomology, respectively, was significant (P < 0.05; see “Materials and Methods” and Supplemental Data). However, these conclusions should be considered with caution because of the small number of analyzed junctions.
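The fold differences quoted above follow directly from the counts in Table I. The short calculation below reproduces them; the dictionary and function names are chosen for this example only.

```python
from fractions import Fraction

# Counts taken from Table I (rows "Intact" and "Processed" T-DNA ends).
intact = {"simple": 2, "microhomology": 3, "filler": 8}
processed = {"simple": 5, "microhomology": 30, "filler": 19}


def share(counts: dict, key: str) -> Fraction:
    """Fraction of junctions in one row of Table I that fall in class `key`."""
    return Fraction(counts[key], sum(counts.values()))


# Microhomology: processed ends relative to intact ends.
mh_fold = share(processed, "microhomology") / share(intact, "microhomology")
# Filler DNA: intact ends relative to processed ends.
filler_fold = share(intact, "filler") / share(processed, "filler")

print(f"microhomology fold: {float(mh_fold):.1f}")
print(f"filler fold:        {float(filler_fold):.1f}")
```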
Origin of Filler DNA at T-DNA/Plant DNA Junctions
Analysis of DSB lesions repaired via NHEJ in Arabidopsis has shown that none of 40 analyzed junctions had filler insertions (Kirik et al., 2000). Because of the striking difference with our observation that in Arabidopsis 27 of 67 T-DNA junctions (approximately 40%) harbor filler insertions (Table I), we investigated the origin and structure of the filler sequences found at the T-DNA junctions.
The origin of the filler DNA sequences was determined by comparative sequence analysis. The results of this analysis are presented in Figure 1A. For a full overview, containing all filler sequences and their origin analysis, see Supplemental Data. None of the filler sequences larger than 17 bp had a complete colinear sequence identity with the Arabidopsis genome sequence, indicating that the observed filler insertions did not consist of uninterrupted sequence blocks duplicated from the Arabidopsis genome. For 15 of the 17 T-DNA/plant DNA junctions with intermediate filler DNA of 6 bp or more, the filler DNA was made up of short duplicated sequences, identical to sequence motifs occurring in the plant DNA surrounding the T-DNA insertion point, in the T-DNA plasmid sequence, or in both (Fig. 1; Supplemental Data; plant DNA and T-DNA plasmid-derived repeats are indicated above or below the junction sequence, respectively). The remaining two filler insertions of 8 and 11 bp did not contain a sequence motif longer than the determined thresholds for reporting sequence identities. Therefore, in these two cases, we were unable to distinguish mechanistic relevance from chance occurrence (see “Materials and Methods”). Of the 15 analyzed filler DNAs for which the origin could be attributed, four contained sequence motifs identical to the plant DNA target as well as motifs identical to the T-DNA end, four had sequence blocks identical to the plant DNA target only, and seven consisted of sequence identities with the T-DNA end only (Fig. 1; Supplemental Data).
The first class of filler DNAs, with both plant DNA- and T-DNA-derived identical sequence blocks, is clearly illustrated by junction kg353LB-21 (Fig. 1A-3). This T-DNA/plant DNA junction had a 13-bp filler sequence, made up of five identified, different, and overlapping sequence motifs. Two sequence blocks of 9 bp (Fig. 1A-3, motifs 9 and 10) were identical to sequence motifs present in the plant target, whereas the other motifs, one of 7 bp (Fig. 1A-3, motif 13) and two of 6 bp (Fig. 1A-3, motifs 11 and 12), were identical to sequence blocks occurring in the T-DNA border sequence. The sequence identities with the plant genome corresponded either with the plant DNA immediately adjacent to the left border end of the T-DNA (Fig. 1A-3, motif 9) or with the plant DNA adjacent to the opposite T-DNA end of the same T-DNA insert (Fig. 1A-3, motif 10). The different sequence motifs, which make up the filler, shared small stretches of microhomology of up to a few bases, and microhomology was usually restricted to the ends of adjacent motifs. Moreover, microhomology was also observed between the filler DNA and the T-DNA end and/or the plant DNA end (Fig. 1A-3, motif 13). On the other hand, sequence blocks with a full overlap were also detected. For junction kg353LB-21, motif 12 lies entirely within motif 9 (Fig. 1A-3). Similar results were obtained for the other filler DNA sequences (see Supplemental Data), and mutual microhomology seemed to be a general characteristic of all sequence identities that make up a filler sequence at T-DNA/plant DNA junctions. Furthermore, junction kg353LB-21 also nicely illustrated that sequence identities that make up a filler can be in direct or inverted orientation relative to their template DNAs (Fig. 1, solid or arrowed line, respectively). One 9-bp sequence block, identical to a sequence motif present in the plant target (Fig. 1A-3, motif 9), was in inverted orientation, whereas the other four sequence identities were in direct orientation relative to their template DNAs. In general, 20% of all sequence identities in the filler had an inverted orientation, but the majority was in direct orientation relative to their template DNA.
The second class of filler sequences, containing sequences identical with the plant target only, is illustrated by junction kg353RB-8 (Fig. 1A-1). This right border T-DNA/plant DNA junction harbored a 51-bp filler segment composed of three sequence motifs, identical to sequence blocks occurring in the plant target. First, one 20-bp sequence block (Fig. 1A-1, motif 4) was identical to the plant DNA that was immediately adjacent to the filler sequence itself. Then, a 30-bp sequence (Fig. 1A-1, motif 1) was identical to the plant DNA adjacent to the opposite T-DNA end of the same T-DNA insert. The third, a 21-bp sequence block (Fig. 1A-1, motif A), was identical to part of the target site deletion.
The final class of filler sequences consisted solely of sequence motifs identical to sequence blocks occurring in the T-DNA. Junction kg150LB-8 (Fig. 1A-4) contained an 8-bp filler DNA harboring identical sequences in inverted orientation to two left-end T-DNA-derived sequence motifs (Fig. 1A-4, motifs 16 and 17). Also, a 16-bp sequence motif enclosing the 8-bp filler DNA was completely identical with a 16-bp sequence block present in the T-DNA 1,360 bp distant from the left border T-DNA end. This observation shows that sequence motifs observed in filler DNAs did not always originate from sequences proximal to the filler segment itself. Similarly, for the filler insertions at junctions kg44LB-2, CK2L102RB-7, and kg135RB-4, sequence motifs identical to sequence blocks positioned 293, 1,693, and 638 bp, respectively, from the corresponding T-DNA end were found (see Supplemental Data, A1, B1, and B4).
In addition, chimeric filler sequences (Fig. 1, dotted lines) were identified, meaning that the observed sequence motifs in these filler segments were duplicated within the filler DNA itself. Chimeric filler sequences were present in junctions kg353LB-20 (Fig. 1A-3, motifs 14 and 15), kg135RB-4 (Supplemental Data, B4, motif 43), and kg165-12 (Fig. 1C-1, motifs 28 and 29).
Origin of Filler DNA at T-DNA/T-DNA Junctions
Because the frequency of filler insertions differed in T-DNA/plant DNA junctions and DSB-repaired plant DNA/plant DNA joints, we also analyzed the frequency and origin of filler insertions at T-DNA/T-DNA junctions to investigate the influence of the plant DNA end in a T-DNA/plant DNA joining reaction. Of eight junctions between tandemly linked T-DNAs, three contained filler DNA sequences of 1, 21, and 38 bp; of three junctions between T-DNA-inverted repeats over the left border, one junction harbored an intermediate 21-bp filler DNA, and of two junctions between two right border T-DNA ends, both contained filler sequences of 4 and 40 bp. Taken together, six of 13 linked T-DNAs harbor filler insertions at their junction (see Supplemental Data).
For the two junctions between tandemly linked T-DNAs with filler DNA of 21 and 38 bp, the origin of the filler sequences was determined (Fig. 1B). For these two junctions, the origin of the filler was attributed to T-DNA sequences. The filler DNA at junction kd73-4 (Fig. 1B-1) was identical to a sequence motif found in the T-DNA sequence 1,181 bp distant from the T-DNA border end. It is the only filler reported in this study that consisted of an uninterrupted sequence motif. In contrast, the 38-bp filler sequence at junction kg150-7 (Fig. 1B-2) is a nice example of how alternating sequence motifs identical to different T-DNA sequences make up a filler DNA. Similar results were observed for the filler DNAs of two inverted repeat junctions (Fig. 1, C-1 and C-2).
In addition, we analyzed whether the observed filler DNAs at linked T-DNAs could be attributed to Arabidopsis genomic sequences. In none of the filler DNAs could a sequence longer than 17 bp be detected that was 100% identical to the plant genome. Only for junction kd12-1, a 16-bp plant DNA sequence was found that overlapped with two T-DNA-located sequence identities (Fig. 1C-2, italicized sequence). Nevertheless, the statistical significance of this 16-bp sequence motif, identical to a sequence block occurring in the plant genome, is difficult to evaluate because the surrounding plant DNA for the kd12 T-DNA locus had not been characterized. In conclusion, our findings with regard to the origin of the filler sequences at T-DNA/T-DNA junctions resemble those for T-DNA/plant DNA junctions.
DISCUSSION
That T-DNAs can be captured at DSB sites in tobacco (Salomon and Puchta, 1998), that NHEJ enzymes in yeast are involved in T-DNA integration (van Attikum et al., 2001), and that Ku80- and DNA ligase IV-deficient plants exhibit a reduced T-DNA integration efficiency (Friesner and Britt, 2003) have been taken as evidence for the role played by NHEJ mechanisms in a pathway for T-DNA integration. However, when we compare the outcome of the joining reaction at the T-DNA junctions with junction characteristics of repaired DSBs in plant cells, some remarkable differences appear. First, NHEJ-repaired junctions in Arabidopsis are characterized by the occurrence of large deletions ranging up to 2,207 bp (Kirik et al., 2000). T-DNA insertions in Arabidopsis are normally associated with much smaller target site deletions. We characterized target site deletions for eight single-copy CK2L-transgenic lines, and we found that on average, 48 bp of the genomic pre-insertion target site are deleted (data not shown), confirming earlier findings that on average the target site deletion in the host genome is small after T-DNA integration (Krysan et al., 2002; Meza et al., 2002). However, the most conspicuous difference is that approximately 40% of all T-DNA junctions harbor filler, whereas filler insertions are absent at NHEJ-repaired junctions in Arabidopsis (Kirik et al., 2000). Moreover, not only the frequency of filler insertions seems different, but also the composition of T-DNA-associated filler can be discriminated from filler insertions found at repaired DSBs in plant species other than Arabidopsis (Ralston et al., 1988; Wessler et al., 1990; Gorbunova and Levy, 1997; Salomon and Puchta, 1998; Yan et al., 1999). Filler insertions at T-DNA/plant DNA junctions consist of several, predominantly short, sequence motifs that are identical to sequence blocks occurring in: (a) the immediate surrounding of the plant T-DNA target site, (b) the T-DNA sequence, or (c) both.
The T-DNA target site-derived motifs were identical to the pre-insertion site deletion or to the plant DNA adjacent to either side of the T-DNA, whereas T-DNA-derived sequence identities were scattered along the whole T-DNA sequence. Alternatively, chimeric sequence motifs that form internal repeat structures in the filler DNA sequence are observed. The majority of the sequence motifs are in direct orientation relative to their template DNA (Fig. 1; Supplemental Data). In contrast, the majority of the filler insertions found at DSB-repaired junctions in tobacco and maize are made up of simple, uninterrupted sequence blocks identical to sequences of the plant genome. Analogously, in yeast, in which illegitimate recombination is normally not accompanied by insertion of filler DNA, filler sequences are observed at the junctions upon T-DNA integration into the yeast genome (Bundock and Hooykaas, 1996). Moreover, DNA ligase IV and Ku80, which are required for repair of DNA damage in Arabidopsis, seem not to be involved in the integration of T-DNA into plants (Gallego et al., 2003; van Attikum et al., 2003). Taken together, although DSB repair and T-DNA integration might be associated, both mechanisms probably have some exclusive and specific characteristics. The observed differences are seemingly linked to the T-DNA integration process, meaning that the T-DNA may follow a pathway of its own during T-DNA integration rather than an entirely plant cell-determined route.
The differences between T-DNA integration and DSB repair might be explained in several ways. First, the T-DNA is transferred to the plant nucleus as a nucleoprotein complex, and T-DNA-associated factors or cotransferred virulence proteins might influence the outcome of the joining reaction. After transposon excision, transposition-specific proteins probably play a role in protecting and repairing the broken sites (Rinehart et al., 1997; Gorbunova and Levy, 1999). We show that right and left T-DNA borders are joined to the plant DNA in a similar manner (Table I), implying that the right border-associated VirD2 does not play a major role in it, confirming the findings of Ziemienowicz et al. (2000). However, other—yet unidentified—virulence factors might take part in the T-DNA joining process. Second, subtle differences between T-DNA integration and DSB repair may affect the NHEJ process. In our opinion, complex filler at T-DNA junctions could result from unstable interactions between the plant DNA target site and the invading T-DNA end. DNA repair via NHEJ has been shown to involve DNA end-stabilizing factors, such as DNA end-binding Ku proteins, a DNA-dependent protein kinase, the XRCC4/DNA ligase IV complex, and the Mre11/Rad50/Xrs2 complex (for review, see Tsukamoto and Ikeda, 1998; Pfeiffer et al., 2000), for which analogs in Arabidopsis have been identified (Hartung and Puchta, 1999; West et al., 2000; Gallego et al., 2001; Daoudal-Cotterell et al., 2002; Tamura et al., 2002). This DNA end-joining protein complex might protect the DNA ends against nuclease-mediated degradation and also stabilize the intramolecular association of the interacting DNA ends (Tsukamoto and Ikeda, 1998). Our results suggest that when filler insertions are observed at T-DNA/plant DNA junctions, the initial interaction between the invading T-DNA end and the plant target is not stabilized immediately by this kind of protein complexes. 
Before stabilization, a free 3′ protruding T-DNA end either from the left or right T-DNA end lands and screens for microcomplementarity. Upon landing, the free 3′ end is taken as primer for simultaneous template-based DNA synthesis. Because at this stage the initial interaction is not stabilized by host-dependent NHEJ-associated protein complexes, the T-DNA takes off and lands in the neighborhood. Landing and taking off is repeated regularly. Different regions, such as the surroundings of the T-DNA target site and the T-DNA plasmid sequence, can be invaded, resulting in a patchwork-like filler sequence. It is tempting to speculate that at least during this initial interaction, DNA end-stabilizing protein complexes are absent, whereas a repair-polymerase complex might be involved. The presence of polymerase activity and the absence of exonuclease activity during this initial interaction might explain why filler DNA has been associated more frequently with intact T-DNA ends than with T-DNA ends that are joined via a microhomology-based mechanism. In contrast, DSB repair sites might not recruit DNA replication enzyme activities in combination with the plant's NHEJ machinery, which might result in a faster and stable interaction between both DSB ends.
MATERIALS AND METHODS
Transgenic Lines
Two populations of transgenic Arabidopsis C24 lines were used for amplification of T-DNA junction fragments. A first population included 21 transgenic lines that were cotransformed with two Agrobacterium tumefaciens strains containing each a different T-DNA, of which six transgenic lines (the kd lines) were cotransformed with the T-DNAs from plasmids pAK1202 and pAD1201 (De Neve et al., 1997), 14 lines (the kg lines) with those from plasmids pAK1202 and pAG1201 (De Neve et al., 1997), and one (Hsb-K2L 67/1) with those from plasmids pHsb and pK2L610 (De Buck et al., 1999). A second population included 12 CK2L-transgenic lines that were transformed with the A. tumefaciens strain C58C1RifR (pGV2260 and pK2L610; De Buck et al., 2000) and subsequently screened for single-copy inserts by T-DNA fingerprinting (Theuns et al., 2002).
Amplification of T-DNA Junctions
All T-DNA junctions were amplified by T-DNA fingerprinting (Theuns et al., 2002). From each transgenic line, a T-DNA fingerprint was generated with either MseI- or BfaI-digested DNA as template. Subsequently, the fragments were eluted and reamplified, and their sequence was determined (Windels et al., 2001). All transgenic lines were fingerprinted by using a right and a left border-specific anchor primer, situated 129 and 277 bp internally to the right and left border repeats, respectively.
Characterization of the T-DNA Junctions and Filler Sequences
Pair-wise sequence alignment was used to screen for sequence identity between the amplified junctions and the T-DNA plasmid. Sequence identity between the plant region of the T-DNA junction fragment and the Arabidopsis genome was determined with the BLAST algorithm (Altschul et al., 1990) against the Arabidopsis database (http://www.arabidopsis.org/BLAST; Huala et al., 2001).
We first screened the complete Arabidopsis genome sequence with the BLAST algorithm for the filler sequences larger than 17 bp. In this manner, we could determine whether the observed filler DNAs were uninterrupted sequence blocks that had been copied from the Arabidopsis genome. A threshold was set at 17 bp because the probability of finding a 17-bp identical sequence motif in a 125-Mb sequence, which is the length of the complete Arabidopsis genome sequence (The Arabidopsis Genome Initiative, 2000), is less than 0.01.
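The 17-bp threshold can be checked with a back-of-the-envelope estimate. The text does not state how the probability was derived, so the sketch below rests on assumptions of this example: bases are treated as independent and equally frequent, only one strand is scanned, and the expected number of chance matches is used as an upper bound on the probability of at least one match. The function names are hypothetical.

```python
def expected_chance_hits(k: int, target_len: int) -> float:
    """Expected number of exact chance matches of one k-bp motif in a target.

    Assumes uniform, independent bases (probability 1/4 each) and one strand.
    """
    positions = max(target_len - k + 1, 0)  # possible start positions
    return positions / 4 ** k


def min_motif_length(target_len: int, alpha: float = 0.01) -> int:
    """Shortest motif length whose expected chance-hit count is below alpha."""
    k = 1
    while expected_chance_hits(k, target_len) >= alpha:
        k += 1
    return k


GENOME_LEN = 125_000_000  # 125-Mb Arabidopsis genome sequence, as stated in the text

print(min_motif_length(GENOME_LEN))
print(expected_chance_hits(17, GENOME_LEN))
```

Under these assumptions, 17 bp is the shortest length for which the estimate drops below 0.01. Scanning both strands would double the estimate, and a skewed base composition would shift it further, so the figure should be read as an order-of-magnitude check.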
Subsequently, for those junctions for which a filler DNA segment of 6 bp or more was detected, their origin was analyzed as follows. First, we screened the filler DNA at plant DNA/T-DNA junctions for sequence motifs, identical either in direct or inverted orientation to sequence blocks occurring in the 200-bp plant DNA region surrounding the T-DNA integration point. For those T-DNA inserts for which both T-DNA junctions were amplified, the filler sequence was easily compared with the target site deletion and the plant DNA adjacent to the opposite T-DNA junction. For the filler in T-DNA junctions for which the corresponding, opposite T-DNA junction had not been amplified, sequence identities with the target site deletion or the plant DNA adjacent to the opposite T-DNA border could not be distinguished. We chose to evaluate the statistical relevance of the reported sequence identities by means of an experimental statistical test. A numerical statistical analysis of each observed identical sequence motif is extremely difficult because the statistical relevance is influenced by several parameters, such as length of the sequence identity, distance between the reported identity and the originating sequence, number of identities observed per filler insertion, and length of the filler sequence itself. Therefore, we compared the actual, observed filler sequences with the 200-bp plant target regions of other, characterized T-DNA inserts. Because this analysis revealed that sequence identities of 8 bp and shorter could occur simply by chance when a 200-bp plant region is screened (see Supplemental Data), only sequence identities of at least 9 bp with the plant target site were reported. Second, we compared the filler DNA with the first 100 bp of the T-DNA sequence that are situated immediately adjacent to the filler DNA.
To determine a threshold for statistically relevant sequence identities when this 100-bp T-DNA border region was taken into account, the actual filler sequences were shuffled and compared with a 100-bp T-DNA border region. In that case, identities of at least 6 bp in length were found to be relevant. To detect identical sequences in the filler with T-DNA plasmid sequences that were not situated immediately adjacent to the filler DNA, we performed a pair-wise comparison between the T-DNA plasmid and filler sequences. Here, only sequence identities of at least 11 bp are reported. The probability of finding an 11-bp repeat in a 12-kb sequence (which is the average length of the T-DNA plasmids used in this study) is less than 0.01.
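A shuffling test of the kind described for the 100-bp T-DNA border region could look roughly like the sketch below. It is not the authors' implementation: the function names, the number of shuffles, the 0.99 quantile, the reading of "inverted orientation" as the reverse complement, and the randomly generated sequences are all assumptions made for illustration.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_motif(filler: str, template: str) -> int:
    """Length of the longest filler substring found in the template,
    in direct or inverted (reverse-complement) orientation."""
    targets = (template, reverse_complement(template))
    best = 0
    for i in range(len(filler)):
        j = i + best + 1  # only lengths that would beat the current best
        while j <= len(filler) and any(filler[i:j] in t for t in targets):
            best = j - i
            j += 1
    return best


def chance_match_lengths(filler: str, template: str,
                         n_shuffles: int = 200, seed: int = 0) -> list[int]:
    """Longest shared motif for each of n_shuffles shuffled copies of the filler."""
    rng = random.Random(seed)
    bases = list(filler)
    lengths = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)  # keeps base composition, destroys order
        lengths.append(longest_shared_motif("".join(bases), template))
    return lengths


def empirical_threshold(lengths: list[int], quantile: float = 0.99) -> int:
    """Smallest length exceeding the chosen quantile of chance matches."""
    ordered = sorted(lengths)
    idx = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[idx] + 1


# Invented example data: a random 100-bp "border region" and a 21-bp "filler".
rng = random.Random(42)
border_region = "".join(rng.choice("ACGT") for _ in range(100))
filler = "".join(rng.choice("ACGT") for _ in range(21))
print(empirical_threshold(chance_match_lengths(filler, border_region)))
```

Shuffling preserves the base composition of each filler, which a purely analytical estimate based on equal base frequencies would ignore.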
Fisher Exact Test for Statistical Significance
To test whether the differences observed with regard to the degree of T-DNA processing and the frequency of either filler DNA or microhomology at the T-DNA junction were significant, a Fisher exact test was performed. A two-by-two contingency table was constructed with the following values: A1, 3; A2, 30; B1, 8; and B2, 19 (see Supplemental Data). A Fisher one-sided exact P value of 0.043 was obtained.
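The reported P value can be recomputed from the four cell counts with the hypergeometric distribution. The sketch below uses only the Python standard library; the function name is chosen for this example, and the assignment of A1, A2, B1, and B2 to rows and columns is an assumption (the result does not depend on whether the table is transposed).

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """Lower-tail Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of observing a count
    in the top-left cell that is less than or equal to `a`.
    """
    row1 = a + b          # e.g. junctions with microhomology
    col1 = a + c          # e.g. intact T-DNA ends
    n = a + b + c + d
    lowest = max(0, col1 - (n - row1))  # smallest feasible top-left count
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(lowest, a + 1)
    )
    return favourable / comb(n, col1)


# A1 = 3, A2 = 30 (microhomology: intact, processed);
# B1 = 8, B2 = 19 (filler DNA: intact, processed).
p_value = fisher_one_sided(3, 30, 8, 19)
print(f"{p_value:.3f}")
```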
Acknowledgments
The authors thank Isabel Roldán-Ruíz, Rosalinde van Lipzig, and Sergei Kushnir for critical reading of the manuscript and helpful comments, and Martine De Cock for layout.
This work was supported by the European Community (grant no. 18687-2001-11), by the European Union BIOTECH program (grant no. QLRT-2000-00078), and by the Fund for Scientific Research Flanders (G.0118.01).
The online version of this article contains Web-only data.
Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.103.027532.
References
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403-410
- The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815
- Brunaud V, Balzergue S, Dubreucq B, Aubourg S, Samson F, Chauvin S, Bechtold N, Cruaud C, DeRose R, Pelletier G et al. (2002) T-DNA integration into the Arabidopsis genome depends on sequences of pre-insertion sites. EMBO Rep 3: 1152-1157
- Bundock P, Hooykaas PJJ (1996) Integration of Agrobacterium tumefaciens T-DNA in the Saccharomyces cerevisiae genome by illegitimate recombination. Proc Natl Acad Sci USA 93: 15272-15275
- Daoudal-Cotterell S, Gallego ME, White CI (2002) The plant Rad50-Mre11 protein complex. FEBS Lett 516: 164-166
- De Buck S, De Wilde C, Van Montagu M, Depicker A (2000) Determination of the T-DNA transfer and the T-DNA integration frequencies upon cocultivation of Arabidopsis thaliana root explants. Mol Plant-Microbe Interact 13: 658-665
- De Buck S, Jacobs A, Van Montagu M, Depicker A (1999) The DNA sequences of T-DNA junctions suggest that complex T-DNA loci are formed by a recombination process resembling T-DNA integration. Plant J 20: 295-304
- De Neve M, De Buck S, Jacobs A, Van Montagu M, Depicker A (1997) T-DNA integration patterns in co-transformed plant cells suggest that T-DNA repeats originate from ligation of separate T-DNAs. Plant J 11: 15-29
- Fladung M (1999) Gene stability in transgenic aspen (Populus): I. Flanking DNA sequences and T-DNA structure. Mol Gen Genet 260: 574-581
- Friesner J, Britt AB (2003) Ku80- and DNA ligase IV-deficient plants are sensitive to ionizing radiation and defective in T-DNA integration. Plant J 34: 427-440
- Gallego ME, Bleuyard J-Y, Daoudal-Cotterell S, Jallut N, White CI (2003) Ku80 plays a role in non-homologous recombination but is not required for T-DNA integration in Arabidopsis. Plant J 35: 557-565
- Gallego ME, Jeanneau M, Granier F, Bouchez D, Bechtold N, White CI (2001) Disruption of the Arabidopsis RAD50 gene leads to plant sterility and MMS sensitivity. Plant J 25: 31-41
- Gelvin SB (2003) Agrobacterium-mediated plant transformation: the biology behind the “gene-jockeying” tool. Microbiol Mol Biol Rev 67: 16-37
- Gheysen G, Angenon G, Van Montagu M (1998) Agrobacterium-mediated plant transformation: a scientifically intriguing story with significant applications. In K Lindsey, ed, Transgenic Plant Research, Harwood Academic Publishers, Amsterdam, pp 1-33
- Gheysen G, Villarroel R, Van Montagu M (1991) Illegitimate recombination in plants: a model for T-DNA integration. Genes Dev 5: 287-297
- Gorbunova V, Levy AA (1997) Non-homologous DNA end joining in plant cells is associated with deletions and filler DNA insertions. Nucleic Acids Res 25: 4650-4657
- Gorbunova V, Levy AA (1999) How plants make ends meet: DNA double-strand break repair. Trends Plant Sci 4: 263-269
- Hartung F, Puchta H (1999) Isolation of the complete cDNA of the Mre11 homologue in Arabidopsis (Accession No. AJ243822) indicates conservation of DNA recombination mechanisms between plants and other eukaryotes (PGR 99-132). Plant Physiol 121: 311
- Huala E, Dickerman AW, Garcia-Hernandez M, Weems D, Reiser L, LaFond F, Hanley D, Kiphart D, Zhuang M, Huang W et al. (2001) The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res 29: 102-105
- Kirik A, Salomon S, Puchta H (2000) Species-specific double-strand break repair and genome evolution in plants. EMBO J 19: 5562-5566
- Krysan PJ, Young JC, Jester PJ, Monson S, Copenhaver G, Preuss D, Sussman MR (2002) Characterization of T-DNA insertion sites in Arabidopsis thaliana and the implications for saturation mutagenesis. OMICS 6: 163-174
- Kumar S, Fladung M (2002) Transgene integration in aspen: structures of integration sites and mechanism of T-DNA integration. Plant J 31: 543-551
- Matsumoto S, Ito Y, Hosoi T, Takahashi Y, Machida Y (1990) Integration of Agrobacterium T-DNA into a tobacco chromosome: possible involvement of DNA homology between T-DNA and plant DNA. Mol Gen Genet 224: 309-316
- Mayerhofer R, Koncz-Kalman Z, Nawrath C, Bakkeren G, Crameri A, Angelis K, Redei GP, Schell J, Hohn B, Koncz C (1991) T-DNA integration: a mode of illegitimate recombination in plants. EMBO J 10: 697-704
- Meza TJ, Stangeland B, Mercy IS, Skårn M, Nymoen DA, Berg A, Butenko MA, Håkelien A-M, Haslekås C, Meza-Zepeda LA et al. (2002) Analyses of single-copy Arabidopsis T-DNA transformed lines show that the presence of vector backbone sequences, short inverted repeats and DNA methylation is not sufficient or necessary for the induction of transgene silencing. Nucleic Acids Res 30: 4556-4566
- Orel N, Puchta H (2003) Differences in the processing of DNA ends in Arabidopsis thaliana and tobacco: possible implications for genome evolution. Plant Mol Biol 51: 523-531
- Pfeiffer P, Goedecke W, Obe G (2000) Mechanisms of DNA double-strand break repair and their potential to induce chromosomal aberrations. Mutagenesis 15: 289-302
- Ralston EJ, English JJ, Dooner HK (1988) Sequence of three bronze alleles of maize and correlation with the genetic fine structure. Genetics 119: 185-197
- Rinehart TA, Dean C, Weill CF (1997) Comparative analysis of non-random repair following Ac transposon excision in maize and Arabidopsis. Plant J 12: 1419-1427
- Roth DB, Porter TN, Wilson JH (1985) Mechanisms of nonhomologous recombination in mammalian cells. Mol Cell Biol 5: 2599-2607
- Roth DB, Wilson JH (1986) Nonhomologous recombination in mammalian cells: role for short sequence homologies in the joining reaction. Mol Cell Biol 6: 4295-4304
- Salomon S, Puchta H (1998) Capture of genomic and T-DNA sequences during double-strand break repair in somatic plant cells. EMBO J 17: 6086-6095
- Tamura K, Adachi Y, Chiba K, Aguchi K, Takahashi H (2002) Identification of Ku70 and Ku80 homologues in Arabidopsis thaliana: evidence for a role in the repair of DNA double-stranded breaks. Plant J 29: 771-781
- Theuns I, Windels P, De Buck S, Depicker A, Van Bockstaele E, De Loose M (2002) Identification and characterization of T-DNA inserts by T-DNA fingerprinting. Euphytica 123: 75-84
- Tinland B (1996) The integration of T-DNA into plant genomes. Trends Plant Sci 1: 178-184
- Tsukamoto Y, Ikeda H (1998) Double-strand break repair mediated by DNA end-joining. Genes Cells 3: 135-144
- van Attikum H, Bundock P, Hooykaas PJJ (2001) Non-homologous end-joining proteins are required for Agrobacterium T-DNA integration. EMBO J 20: 6550-6558
- van Attikum H, Bundock P, Overmeer RM, Lee L-Y, Gelvin SB, Hooykaas PJJ (2003) The Arabidopsis AtLIG4 gene is required for the repair of DNA damage, but not for the integration of Agrobacterium T-DNA. Nucleic Acids Res 31: 4247-4255
- Van Haaren MJJ, Sedee NJA, Krul M, Schilperoort RA, Hooykaas PJJ (1988) Function of heterologous and pseudo border repeats in T region transfer via the octopine virulence system of Agrobacterium tumefaciens. Plant Mol Biol 11: 773-781
- Wessler S, Tarpley A, Purugganan M, Spell M, Okagaki R (1990) Filler DNA is associated with spontaneous deletions in maize. Proc Natl Acad Sci USA 87: 8731-8735
- West CE, Waterworth WM, Jiang Q, Bray CM (2000) Arabidopsis DNA ligase IV is induced by γ-irradiation and interacts with an Arabidopsis homologue of the double strand break repair protein XRCC4. Plant J 24: 67-78
- Windels P, Taverniers I, Depicker A, Van Bockstaele E, De Loose M (2001) Characterisation of the Roundup Ready soybean insert. Eur Food Res Technol 213: 107-112
- Yan Y, Martinez-Ferez IM, Kavchok S, Dooner HK (1999) Origination of Ds elements from Ac elements in maize: evidence for rare repair synthesis at the site of Ac excision. Genetics 152: 1733-1740
- Ziemienowicz A, Tinland B, Bryant J, Gloeckler V, Hohn B (2000) Plant enzymes but not Agrobacterium VirD2 mediate T-DNA ligation in vitro. Mol Cell Biol 20: 6317-6322