Journal of Biomedicine and Biotechnology. 2010 Dec 23;2011:476723. doi: 10.1155/2011/476723

Construction, Characterization, and Preliminary BAC-End Sequence Analysis of a Bacterial Artificial Chromosome Library of the Tea Plant (Camellia sinensis)

Jinke Lin 1,2, Dave Kudrna 1, Rod A Wing 1,3,*
PMCID: PMC3017946  PMID: 21234344

Abstract

We describe the construction and characterization of a publicly available BAC library for the tea plant, Camellia sinensis. Using modified methods, the library was constructed with the aim of developing public molecular resources to advance tea plant genomics research. The library consists of a total of 401,280 clones with an average insert size of 135 kb, providing an approximate coverage of 13.5 haploid genome equivalents. No empty vector clones were observed in a random sampling of 576 BAC clones. Further analysis of 182 BAC-end sequences from randomly selected clones revealed a GC content of 40.35% and low chloroplast and mitochondrial contamination. Repetitive sequence analyses indicated that LTR retrotransposons were the predominant sequence class (86.93%–87.24%), followed by DNA transposons (11.16%–11.69%). Additionally, we found 25 simple sequence repeats (SSRs) that could potentially be used as genetic markers.

1. Introduction

Since 1994, bacterial artificial chromosome (BAC) libraries have become an invaluable resource for initiating genomics research in the areas of genome sequencing, physical mapping, positional cloning, complex analysis of targeted genomic regions, and analysis of gene structure and function [1–8]. More recently, physical map tiles of BACs have been shown to be suitable for next generation sequencing of chromosomal regions or whole genomes [9].

BAC-end sequencing (BES) is a powerful tool that enhances the value of BAC libraries as a genomic resource by providing partial sequence information that can be used to understand genome content and architecture and to develop genetic markers [10, 11]. Fingerprinted BAC clones, together with their associated BAC-end sequences, can be used to construct BAC fingerprint-/BES-based physical maps [1, 12] that can be aligned to reference genome sequences, to sequence genomes by “walking” from one clone to the next [13], to anchor whole genome shotgun sequence data, and to integrate genetic linkage maps with physical maps [14].

Tea (Camellia sinensis) is an important world beverage [15]. About 3.9 million tons of tea is produced (http://faostat.fao.org/site/567/DesktopDefault.aspx?PageID=567#ancor) and consumed yearly with an estimated world value of $9.04 billion ($2.39 per kg; http://www.fao.org/docrep/meeting/019/K8336E.pdf). Throughout history, it has become well known that tea is one of the most beneficial beverages to drink due to its health attributes. Tea polyphenols (catechins) and antioxidants, saponin, polysaccharides, L-theanine, pigments, and tea water extracts are the most important components that contribute to tea's medicinal qualities [16–19].

There are different types of tea, such as green tea, black tea, oolong tea, and white tea [20] (see also http://jingtea.com/tea-knowledge/tea-varieties). Oolong tea is a special variety of tea and has been studied for its effects on diabetes, eczema, allergies, bacterial infections, dental caries (cavities), obesity, cardiovascular disease, and cancer, and for its antioxidant properties [21–26]. It has been shown that consumption of oolong tea stimulates both energy expenditure and fat oxidation in normal weight men [21]. Oolong tea is semifermented, a process that occurs when the leaves are gently rolled during processing [20], which gives oolong tea a unique appearance and flavor. Chin-shin oolong is one of the most widely cultivated oolong tea varieties worldwide [27].

It has been suggested that functional genomic research should be a major emphasis of tea genetics and breeding in the future [28]. Although rapid progress in gene identification and isolation from tea plants has been made in the past several years [29], the study of the tea plant genome lags far behind that of other crop species due to the lack of good genomic research tools. This is most likely due to the difficulties of preparing high-quality tea DNA and to the distinctness of the tea plant from other taxa (i.e., perennial nature, high inbreeding depression, unavailability of distinct mutants of different biotic and abiotic stress, and large genome size of 4 Gb [30]). One key genomic tool that is completely lacking for tea is a high-quality, deep-coverage BAC library. In this paper, we report the construction of a high-quality, publicly available BAC library of tea plant from the variety Chin-shin oolong. We generated and analyzed a limited data set of BAC-end sequences from this library, which provided an early glimpse into the sequence composition of the tea genome.

2. Materials and Methods

2.1. Germplasm and Plant Tissue Processing

Sixty 6-year-old plants, derived from a clonally propagated single mother plant of a Camellia sinensis cultivar Chin-shin oolong, were selected as the plant germplasm source. Healthy young shoot tips and the uppermost two leaves were collected (similar to harvesting top-quality tea), then washed quickly to remove debris, and immediately frozen by submersion in liquid nitrogen followed by short-term storage at −80°C. All plant growth and tissue selection was kindly provided by Dr. Francis Zee (USDA, Hilo, Hawaii).

2.2. Preparation of High Molecular Weight (HMW) Tea Plant DNA in Agarose Plugs

Tea genetic resources have lagged behind other important plants due to the complex and difficult nature of adequate extraction of quality nuclear DNA. However, a detailed manuscript describing the experiments that led to the successful isolation of high-quality, high molecular weight, nuclear DNA, suitable for BAC library construction, was recently made available [31]. The method for tea plant BAC library construction utilized the standard Arizona Genomics Institute (AGI) protocol [32], the method of Luo and Wing [3], and Ammiraju et al. [2]. Modifications necessary to achieve successful construction of a tea plant BAC library are described as follows.

Twenty grams of frozen tissue was homogenized and transferred to a flask containing 200 mL of prechilled extraction buffer (10 mM Tris-HCl, pH 8.0, 10 mM EDTA, pH 8.0, 100 mM KCl, 0.5 M sucrose, 4 mM spermidine, 1 mM spermine, 0.10% w/v L-ascorbic acid, 2.00% w/v PVP-40, 0.13% w/v sodium diethyldithiocarbamate trihydrate; PVP-40 was only about 50 percent dissolved) and 400 μl β-mercaptoethanol. The homogenate was filtered into a flask that contained 200 mL of the same prechilled extraction buffer and 400 μl β-mercaptoethanol. The homogenate was filtered a second time into a fresh flask that contained 45 mL of prechilled extraction buffer with 10.00% v/v Triton X-100. The mixture was centrifuged for 15 min at 3250 rpm at 4°C. The resulting pellet was washed several times with the same buffer and resuspended in 1 mL of prechilled extraction buffer, incubated in a 45°C water bath for 5 minutes, and gently mixed with one-third volume of 1.00% low melting temperature agarose (in extraction buffer) at 45°C. The mixture was allowed to solidify after transferring to plug molds. Twenty-four plugs were transferred into a 50 mL Falcon tube containing 40 mL of standard proteinase K solution (1.00% w/v N-lauroylsarcosine (sodium salt), 0.1 mg/mL proteinase K, dissolved in 0.5 M EDTA, pH 9.4) and incubated in a hybridization oven at 50°C with gentle rotation for 24 h. After the plugs were incubated with fresh proteinase K solution for an additional 24 h, they were washed in two additional solutions: first, twice with 40 mL T10E10 containing 1 mM PMSF (phenylmethanesulfonyl fluoride), and then twice with 40 mL TE, each time for about 1 h at room temperature with gentle shaking. The plugs were stored in 70.00% ethanol at −20°C (for long-term storage).
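As a small worked example of the buffer recipe above, the masses of the components given in % w/v can be computed for the 200 mL batch volume. This is only an arithmetic sketch: the function and variable names are hypothetical, and it assumes the usual convention that 1% w/v means 1 g of solute per 100 mL of solution.

```python
def grams_for_w_v(percent_w_v: float, volume_ml: float) -> float:
    """Mass in grams of a solute given in % w/v (g per 100 mL) for a given volume."""
    return percent_w_v * volume_ml / 100.0


# % w/v components of the extraction buffer as listed in the text.
W_V_COMPONENTS = {
    "L-ascorbic acid": 0.10,
    "PVP-40": 2.00,
    "sodium diethyldithiocarbamate trihydrate": 0.13,
}

BUFFER_VOLUME_ML = 200.0  # batch volume stated in the text

# Mass of each % w/v component needed for one 200 mL batch.
masses_g = {
    name: grams_for_w_v(pct, BUFFER_VOLUME_ML)
    for name, pct in W_V_COMPONENTS.items()
}

for name, grams in masses_g.items():
    print(f"{name}: {grams:.2f} g per {BUFFER_VOLUME_ML:.0f} mL")
```

The molar components (Tris-HCl, EDTA, KCl, sucrose, spermidine, spermine) are left out because they would require molecular weights that the text does not give.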

2.3. Restriction Digestion of HMW DNA and Isolation of Size-Selected Fragments

Two and a half DNA plugs were used to establish optimal HindIII partial digestion conditions. Formal partial digestion, using 5 DNA plugs, was performed using 5 U of HindIII restriction enzyme added to each sample (per half plug). Digested samples were loaded onto a 1.00% agarose gel and subjected to pulsed-field gel electrophoresis (PFGE). DNA was visualized, and agarose fragments containing specific DNA sizes were cut from the gel slabs. Second and third PFGE runs of the fragments were performed to further purify the DNA and remove small DNA fragments. After the third size selection, the gel fractions containing different sized fragments (B2, B1, A2) were recovered and stored at −20°C in 70.00% ethanol (for long-term storage).

2.4. Ligation of Sized DNA Fragments

For each size-selected fraction, the high molecular weight genomic DNA was electroeluted from the agarose at 4°C. Pipet tips (with cutoff tips) were used when manipulating high molecular weight genomic DNA to avoid mechanical shearing. The DNA concentrations were estimated, and 120 ng–200 ng DNA was used to ligate to the linearized (HindIII site) and dephosphorylated vector pIndigoBAC536 SwaI, commonly known as pAGIBAC1 [2]. Separately, ligations were mixed well by tapping and then incubated in a water bath at 16°C for 19 hours. Ligation samples were transferred into 0.1 M glucose/1.00% agarose cones to desalt for 1.5 h on ice. Ligations were transferred into fresh microcentrifuge tubes and stored at 4°C until transformation tests were completed.

2.5. Transformation Testing, Library Name, and BAC Clone End Sequencing (BES)

Ligation samples were tested to determine transformation efficiency and cloned insert quality. The methods for test transformations, BAC DNA isolations, BAC insert size analyses, bulk transformations, colony arrays, and library characterizations followed exactly the AGI protocols (http://www.genome.arizona.edu/information/publications/wingpub/construct.pdf and http://www2.genome.arizona.edu/pdfs/publications/meizhong_metmb.pdf). From each HMW fraction (B2, B1, A2), 2.3 μl of each ligation was used to transform 20 μl of DH10B T1 phage-resistant E. coli cells (Invitrogen) by electroporation using the Cell Porator and Voltage Booster electroporation system (Life Technologies). Electroporation was performed on ice at 327 DC V with fast charge rate at a low resistance (4 kΩ) and a capacitance of 330 μF. The cells were transferred into 3 mL of SOC media, and incubated at 37°C for 1 h with shaking at 250 rpm, followed by the addition of an equal volume of sterile glycerol and gentle shaking for 3 minutes. These mixtures were immediately frozen by submersion into liquid nitrogen followed by long-term storage at −80°C.

To evaluate these transformation tests, 300 μl of each (containing cells, SOC, and glycerol) were spread on Petri dishes (containing LB–X-gal–IPTG agar with 12.5 μg/mL of chloramphenicol, 80 μg/mL X-gal, and 100 μg/mL IPTG) and incubated at 37°C overnight. Five-hundred-seventy-six white recombinants (positive for insert), 192 from each of the three sublibraries (B2, B1, A2), were randomly selected and grown overnight at 37°C in 1.2 mL LB broth (with chloramphenicol 12.5 μg/mL) with shaking at 220 rpm. BAC DNAs were isolated and digested with NotI to release the BAC insert. Digestions were separated by PFGE at 6 V/cm, switch time from 5 to 15 s, angle 120°, and run for 16 h followed by staining, destaining, and visualization.

Transformed E. coli from the B2 ligation, selected to contain the largest insert sizes, were prepared for array to 384-well microtiter dishes. Five mL of the mixture from the B2 ligation were spread on LB–X-gal–IPTG-CM agar Q-trays (22.5 × 22.5 cm) and incubated at 37°C overnight. The white (insert positive recombinant) clones were robotically picked (using a Genetix Q-Bot; Genetix Ltd.) into forty-eight 384-well microtiter plates containing freezing media and stored at −80°C. Hybridization screening filters were printed from library copies to facilitate future experiments. To further evaluate this sub library (B2 ligation), 576 BACs were randomly selected from these forty-eight plates and grown overnight at 37°C with shaking at 220 rpm in 1.2 mL LB supplemented with chloramphenicol (12.5 μg/mL). BAC DNAs were isolated and digested with NotI. The digested clones were separated by PFGE and analyzed. Additionally, 192 BAC clones were subjected to BAC clone end sequencing (BES) using the method of Kim et al. [12], and the resultant sequences were analyzed for chloroplast and mitochondrial genome contaminations and for repeat content using BLASTN searches with settings that required 98% identity over lengths of at least 51 bp.
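The BLASTN thresholds described above (98% identity over at least 51 bp) can be applied as a simple post-filter on search results. The sketch below is illustrative only: it assumes the hits are available in the common tab-delimited BLAST layout whose first four columns are query, subject, percent identity, and alignment length. The paper does not state which output format or software settings were used beyond the thresholds, and the example rows are invented placeholders.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float  # percent identity of the alignment
    length: int    # alignment length in bp


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-delimited BLAST rows; only the first four columns are used."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]), int(fields[3])))
    return hits


def filter_hits(hits: Iterable[Hit], min_identity: float = 98.0,
                min_length: int = 51) -> List[Hit]:
    """Keep hits with at least min_identity percent identity over at least min_length bp."""
    return [h for h in hits if h.pident >= min_identity and h.length >= min_length]


# Invented example rows (hypothetical read and subject names).
example_rows = [
    "bes_001\tchloroplast\t99.10\t120\t1\t0\t1\t120\t500\t619\t1e-50\t210",
    "bes_002\tmitochondrion\t97.50\t300\t7\t0\t1\t300\t10\t309\t1e-90\t400",
    "bes_003\tLTR_element\t98.00\t50\t1\t0\t1\t50\t20\t69\t1e-15\t85",
]

kept = filter_hits(parse_tabular(example_rows))
for hit in kept:
    print(hit.query, hit.subject, hit.pident, hit.length)
```

Only the first row passes both thresholds; the second fails on identity and the third on alignment length.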

The successful C. sinensis BAC library, composed of E. coli transformants from each of the three ligations, was named CSBCBa.

3. Results

3.1. Construction and Characterization of the Tea BAC Library

Camellia sinensis cultivar Chin-shin oolong was chosen to construct the BAC library since it is one of the most widely cultivated oolong tea varieties. To avoid contamination with small, trapped DNA fragments and to improve the size and uniformity of the inserts, the high molecular weight (HMW) genomic DNA was partially digested with HindIII and three separate size fractions were collected. Following ligation into the HindIII site of the pAGIBAC1 vector, the three size fractions were transformed, and the new tea BAC library, consisting of 401,280 BAC clones, was named CSBCBa; see Table 1. The overall average insert size of the total library was calculated to be 135 kb with inserts ranging from 8 to 314 kb, and the total library coverage was estimated at 13.54 haploid genome equivalents, assuming a genome size of 4,000 Mb. Sub library 1 (Ligation B2) was composed of 40,320 clones with an average insert size of 140 kb; Sub library 2 (Ligation B1) was composed of 160,896 clones with an average insert size of 140 kb; Sub library 3 (Ligation A2) was composed of 200,064 clones with an average insert size of 130 kb. Five-hundred-seventy-six BACs, 192 from each of the three sublibraries, were randomly selected and analyzed by NotI restriction enzyme digestion in order to evaluate the library (Figure 1). Based on this sample size, no empty vector clones were visualized. Over 74.40% of the BAC clones were shown to carry DNA inserts greater than 130 kb, with a reasonable fraction (22.30%) carrying inserts larger than 170 kb (Figure 2). BAC clones from sub library 1, Ligation B2, were spread on agar dishes, clones (18,432) were robotically picked and arrayed to forty-eight 384-well plates, and hybridization screening filters were printed to facilitate future experiments.

Table 1.

Composition of the Camellia sinensis tea BAC library, CSBCBa.

| Library | Size selection range (kb) | Number of clones | Average insert size (kb) | Genome equivalent coverage | Insert size range (kb) | Proportion of the whole library size (%) |
|---|---|---|---|---|---|---|
| Total Library | 190–350 | 401,280 | 135 | 13.5 | 8–316 | 100 |
| Sublibrary 1 (Ligation B2) | 300–350 | 40,320 | 140 | 1.4 | 8–316 | 10.05 |
| Sublibrary 2 (Ligation B1) | 240–300 | 160,896 | 140 | 5.6 | 32–185 | 40.10 |
| Sublibrary 3 (Ligation A2) | 190–240 | 200,064 | 130 | 6.5 | 19–180 | 49.85 |
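The coverage figures in Table 1 follow from multiplying clone number by average insert size and dividing by the genome size. The sketch below reproduces that arithmetic from the values in the text (clone counts, average insert sizes, and a 4,000 Mb genome). The final function uses the standard Clarke–Carbon relationship, P = 1 − (1 − I/G)^N, to estimate the chance that a given locus is present in the library; that formula is not used in the paper and is added here only as an illustrative assumption.

```python
GENOME_SIZE_KB = 4_000_000  # 4,000 Mb genome size assumed in the text

# (number of clones, average insert size in kb) for each sublibrary
SUBLIBRARIES = {
    "B2": (40_320, 140),
    "B1": (160_896, 140),
    "A2": (200_064, 130),
}


def coverage(n_clones: int, avg_insert_kb: float,
             genome_kb: float = GENOME_SIZE_KB) -> float:
    """Haploid genome equivalents represented by a set of clones."""
    return n_clones * avg_insert_kb / genome_kb


def prob_locus_present(n_clones: int, avg_insert_kb: float,
                       genome_kb: float = GENOME_SIZE_KB) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - I/G)^N (not taken from the paper)."""
    return 1.0 - (1.0 - avg_insert_kb / genome_kb) ** n_clones


total_clones = sum(n for n, _ in SUBLIBRARIES.values())
total_insert_kb = sum(n * size for n, size in SUBLIBRARIES.values())
mean_insert_kb = total_insert_kb / total_clones      # clone-weighted average insert size
total_coverage = total_insert_kb / GENOME_SIZE_KB    # whole-library coverage

for name, (n, size) in SUBLIBRARIES.items():
    print(f"{name}: {coverage(n, size):.1f}x")
print(f"total clones: {total_clones}")
print(f"mean insert: {mean_insert_kb:.0f} kb, coverage: {total_coverage:.2f}x")
print(f"P(locus present): {prob_locus_present(total_clones, mean_insert_kb):.6f}")
```

The same arithmetic accounts for the arrayed portion of the library: forty-eight 384-well plates hold 48 × 384 = 18,432 clones.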

Figure 1.

Insert DNA analysis of random BAC clones from the Camellia sinensis HindIII BAC library, CSBCBa, by pulsed-field gel electrophoresis. DNA samples were digested with NotI and separated by pulsed-field gel electrophoresis with ramp pulse time of 5–15 s at 6 V/cm at 14°C in 0.5× TBE buffer for 16 h. Markers used are midrange I (outside lanes). The 7.5-kb common band corresponds to the cloning vector pIndigoBAC536 SwaI (pAGIBAC1).

Figure 2.

Insert size distribution of Camellia sinensis BAC clones from the CSBCBa library. The average insert size of the BAC library was calculated to be 135 kb from an analyzed random sample size of 576 BAC clones.

3.2. Analysis of Repetitive Sequences from Pilot BAC-End Sequence of C. sinensis

To examine the quality of the library and to obtain a preliminary view of the major repetitive element content of the C. sinensis genome, unidirectional end sequencing was performed on 192 random BAC clones which produced 182 reads with an average length of 581 high-quality bases; the BESs are available at the URL: http://www2.genome.arizona.edu/files/camellia_BES.txt. Of the 182 random BESs, the total number of nucleotides was 111,695 bp and the GC level was 40.35% (Table 2).
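A pooled GC level such as the one reported above can be computed by counting G and C bases across all reads and dividing by the total number of unambiguous bases. The following is a minimal sketch under stated assumptions: the function names are hypothetical, ambiguous bases such as N are ignored (the paper does not say how they were handled), and the toy sequences are not real tea BESs.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of one sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases to score
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def pooled_gc(seqs: Iterable[str]) -> float:
    """Percent G+C over all reads combined, so longer reads carry more weight."""
    return gc_content("".join(seqs))


# Toy reads for illustration only.
toy_reads = ["ATGCATGCNN", "GGGCCCAAAT", "ttttaaaagc"]
print(f"pooled GC: {pooled_gc(toy_reads):.2f}%")
```

Pooling the bases, rather than averaging per-read percentages, matches a statistic reported alongside a total nucleotide count, although the paper does not specify which approach was used.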

Table 2.

Analysis of repetitive elements from preliminary BAC-end sequence (BES) of the Camellia sinensis tea BAC library, CSBCBa.

| No. of BES | No. of BES nucleotides (bp) | BES GC contents (%) | Compared to | Repeat coverage (bp/%) | LTR repeat coverage (bp/%) | DNA repeat coverage (bp/%) | LINE repeat coverage (bp/%) | Non-LTR retrotransposon repeat coverage (bp/%) |
|---|---|---|---|---|---|---|---|---|
| 182 | 111,695 | 40.35 | Rice | 5,251/4.70 | 4,581/87.24 | 614/11.69 | 56/1.06 | 0/0 |
| 182 | 111,695 | 40.35 | Arabidopsis | 5,457/4.88 | 4,744/86.93 | 609/11.16 | 0/0 | 104/1.90 |

The 182 random BAC-end sequences were also evaluated using BLAST searches against the transposable element databases of Arabidopsis (Arabidopsis thaliana) at the Genetic Information Research Institute and of rice (Oryza sativa) at the Arizona Genomics Institute. This preliminary analysis of repetitive sequences from pilot BAC-end sequences indicated that 97 to 98 percent of the BESs contained sequences related to transposable elements. LTR retrotransposons were the predominant class of repeat elements in C. sinensis (Arabidopsis 86.93% and rice 87.24%; Table 2). The second major class of transposable elements was DNA transposons (Arabidopsis 11.16% and rice 11.69%). LINE and non-LTR elements were also observed, though at much lower levels.
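The percentages in Table 2 can be recomputed from the base-pair counts. The arithmetic suggests that the total repeat coverage is expressed relative to all BES nucleotides, whereas the per-class values (LTR, DNA, LINE, non-LTR) are expressed relative to the repeat-covered bases. That reading is an inference from the numbers and is not stated explicitly in the text. A short sketch, with hypothetical names, using only the counts from Table 2:

```python
TOTAL_BES_BP = 111_695  # total nucleotides in the 182 BESs

# Base pairs of repeat coverage by reference database, from Table 2.
TABLE2_BP = {
    "rice": {"repeat": 5_251, "LTR": 4_581, "DNA": 614, "LINE": 56, "non-LTR": 0},
    "Arabidopsis": {"repeat": 5_457, "LTR": 4_744, "DNA": 609, "LINE": 0, "non-LTR": 104},
}

CLASSES = ("LTR", "DNA", "LINE", "non-LTR")


def summarize(database: str) -> dict:
    """Repeat coverage as % of all BES bases, and each class as % of repeat bases."""
    row = TABLE2_BP[database]
    summary = {"repeat_pct_of_bes": 100.0 * row["repeat"] / TOTAL_BES_BP}
    for cls in CLASSES:
        summary[cls] = 100.0 * row[cls] / row["repeat"]
    return summary


for db in TABLE2_BP:
    values = summarize(db)
    print(db, {key: round(val, 2) for key, val in values.items()})
```

Small differences in the last decimal place relative to Table 2 are possible because the published values appear to be truncated rather than rounded.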

BLASTN searches against the chloroplast and mitochondrial genomes of Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) using the 182 random BESs revealed no significant homology, indicating that the BAC library contained very low levels of chloroplast or mitochondrial DNA contamination. Visualization of the inserts from the 576 random BACs by pulsed-field gel electrophoresis also showed no patterns typical of organelle contamination (data not shown). Further repeat analysis of the 182 random BESs revealed 25 different microsatellite repeats (Table 3). The 25 simple sequence repeats (SSRs) were found in 17 different BAC clones (from the forward and reverse reads), suggesting that eight BACs represented genomic regions that contained high repeat content. Interestingly, one BES read, > _camellia_g1-12-3-07_D02_.g1, contained two different dimer SSRs and one pentamer SSR.

Table 3.

Identification of simple sequence repeats (SSRs) from preliminary BAC-end sequence (BES) of the Camellia sinensis tea BAC library, CSBCBa.

Sequence filename Type Motif No. of repeats SSR start SSR end
> _camellia_g1-12-3-07_G11_.g1 di ac 6 370 381
> _camellia_g1-12-3-07_A01_.g1 di ag 7 173 186
> _camellia_g1-12-3-07_C10_.g1 di ag 8 133 148
> _camellia_g1-12-3-07_D02_.g1 di ag 6 461 472
> _camellia_g1-12-3-07_H02_.g1 di ag 5 585 594
> _camellia_b1-12-3-07_G05_.b1 di at 5 306 315
> _camellia_g1-12-3-07_C10_.g1 di at 6 121 132
> _camellia_g1-12-3-07_D02_.g1 di at 11 194 215
> _camellia_g1-12-3-07_D08_.g1 di at 5 388 397
> _camellia_b1-12-3-07_A03_.b1 di ct 13 358 383
> _camellia_b1-12-3-07_D05_.b1 di ct 8 1 16
> _camellia_g1-12-3-07_A06_.g1 di ct 7 208 221
> _camellia_g1-12-3-07_B11_.g1 di ct 7 310 323
> _camellia_g1-12-3-07_G11_.g1 di ct 5 287 296
> _camellia_g1-12-3-07_B06_.g1 di ga 18 516 551
> _camellia_g1-12-3-07_C10_.g1 di ga 18 586 621
> _camellia_g1-12-3-07_D02_.g1 di ga 8 538 553
> _camellia_b1-12-3-07_C09_.b1 di ta 6 124 135
> _camellia_b1-12-3-07_F10_.b1 di ta 8 29 44
> _camellia_b1-12-3-07_G12_.b1 di ta 11 437 458
> _camellia_g1-12-3-07_D03_.g1 di ta 5 555 564
> _camellia_b1-12-3-07_A03_.b1 di tc 8 314 329
> _camellia_g1-12-3-07_B06_.g1 di tg 6 505 516
> _camellia_b1-12-3-07_H09_.b1 tri aat 5 197 211
> _camellia_g1-12-3-07_D02_.g1 penta gagat 9 556 600
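SSRs of the kind listed in Table 3 can be located with a regular expression that looks for a short motif followed by back-to-back copies of itself. The sketch below is not the tool used in the study, which the paper does not name. The minimum of five repeats is an assumption chosen because it is the smallest repeat count in Table 3, and the 1-based inclusive coordinates follow the pattern in the table (end = start + motif length × repeats − 1).

```python
import re
from typing import List, Tuple

MOTIF_TYPES = {2: "di", 3: "tri", 4: "tetra", 5: "penta"}


def find_ssrs(seq: str, min_repeats: int = 5,
              motif_lengths: Tuple[int, ...] = (2, 3, 4, 5)
              ) -> List[Tuple[str, str, int, int, int]]:
    """Return (type, motif, repeat count, start, end) with 1-based inclusive coordinates."""
    seq = seq.lower()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([acgt]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "aa", "atat").
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            n_repeats = len(match.group(0)) // k
            hits.append((MOTIF_TYPES[k], motif, n_repeats, match.start() + 1, match.end()))
    return sorted(hits, key=lambda hit: hit[3])


# Synthetic test sequence, not a real tea BES.
toy_seq = "ttc" + "ag" * 7 + "cc" + "aat" * 5 + "g"
for ssr in find_ssrs(toy_seq):
    print(ssr)
```

Because the scan reports the motif in the phase where the repeat is first encountered, the same underlying repeat can be reported as either "ag" or "ga", which is one possible reason both motifs appear in Table 3.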

4. Discussion

Generally, the world's agriculture and food systems come from the tremendous biological diversity encompassed in over 200 independently (and perhaps convergently) domesticated species of Angiosperms [33]. Paradoxically, much of the potential genomic research with these taxa has been centered on a few model species, or taxonomic families, that represent only a very small amount of this diversity. In the past decade, prominent plant genome initiatives have resulted in genomic resources of enormous magnitude. These include molecular/genetic maps, transcript databases, large-insert libraries, five complete genome sequences (rice [14], Arabidopsis [34], poplar [35], grape [36], and maize [37]), and a suite of several others scheduled for completion. The ultimate goals of these projects are to continue to discover new ways to meet future world agricultural and food needs while simultaneously providing an understanding of functional systems biology. However, the expected dramatic advances in theoretical and applied plant biology for other taxa, following these innovations, have been critically slow due to two major obstacles: a lack of adequate genomic tools and resources to efficiently and effectively transfer existing genomic knowledge to other diverse, economically and agriculturally important species, and a lack of representation, in the sequenced model species, of novel genes and regulatory networks that underlie key traits of agriculture and ecology. Despite the significant economic impact of tea and other similar commodities (e.g., coffee and cocoa), a lack of adequate public genomic resources, especially access to large-insert libraries, has contributed to a lack of advanced genetic knowledge usable for modern breeding and improvement.

Recent published tea plant works involving molecular tools have shown an AFLP and RAPD marker-based linkage map [38], identified cDNAs involved in secondary metabolism [39], described sequence analysis from 4320 tea ESTs [40], and used intersimple sequence repeats to analyze genetic variability of somaclonal embryo-derived tea plants [41]. A recent review described the use of molecular resources for tea cultivar classification, the identification of parentage of Mulberry scale resistance, and more advanced linkage maps with SSR markers [42]. While these and other reports provide important insight to the understanding of the tea plant, the availability and utilization of BAC resources are absent.

Here, we report the construction and public availability of a high-quality, deep-coverage, large-insert, bacterial artificial chromosome (BAC) library of cultivated tea plant (Camellia sinensis) variety Chin-shin oolong. This BAC library resource is publicly available in the form of whole libraries, filters, and individual clones, through the AGI BAC/EST Resource Center (http://www.genome.arizona.edu/orders), and we expect it to be extensively used worldwide for the analysis of genome evolution and organization, positional cloning, and eventual gap closure of a C. sinensis reference sequence.

This tea BAC library was made possible after significant methodological improvements were accomplished in the preparation of purified high molecular weight (HMW) DNA [31] and subsequent partial enzymatic digestions and ligations. We showed that, despite expected difficulties in obtaining HMW DNA of reliable quality from tissues known to contain compounds deleterious to established extraction techniques, our results yielded ligations that produced a BAC library with low organellar contamination, such as chloroplast and mitochondria, yet with high transformation efficiencies and large-insert sizes. Though the genome size of tea is quite large, 4 Gb, the described BAC library has an average insert size of 135 kb and provides over 13x genome coverage to allow for adequate utility. It was divided into three sublibraries: sub library 1 (Ligation B2) composed of 40,320 clones with average insert sizes of 140 kb, sub library 2 (Ligation B1) composed of 160,896 clones with average insert sizes of 140 kb, and sub library 3 (Ligation A2) composed of 200,064 clones with average insert sizes of 130 kb.

Fresh tea leaves are rich in both volatile secondary compounds such as tea polyphenols and carbohydrate matrices such as tea polysaccharides. It has been shown that tea leaves contain 20–33% dry weight of polyphenols [43], and polysaccharides have been shown to be present at 7.02% dry weight in oolong tea [44]. These types of compounds interfere [32] with the handling of the high molecular weight (HMW) DNA necessary for construction of large-insert genomic clone libraries. Tea polyphenols must be prevented from interacting with the nuclear DNA, and tea polysaccharides must be prevented from trapping nuclei in the process of tissue homogenization. In this study, to prepare high-quality HMW DNA, the frozen tissue was homogenized with a buffer (combining ingredients from two methods, with the PVP-40 left partially undissolved) in double volumes followed by additional filtration. Double volumes of buffer were found to dilute the polyphenols and polysaccharides, as evidenced by the absence of sticky nuclear pellets, thus allowing for increased removal during the centrifugation steps. Combining the nuclei with the low melting temperature agarose at a lower ratio of 1 : 3 (instead of 1 : 1) was required to concentrate the nuclei but was performed in the presence of the undissolved PVP-40. To lower the organellar DNA contamination and to further eliminate tea polysaccharides and tea polyphenols in the process of preparing tea plant HMW DNA plugs, additional washing steps were performed [45].

Removing small restriction fragments is vital for construction of a high-quality BAC library [46, 47]. Tea plant genomic plugs prepared by the method used in this paper contained abundant HMW DNA and produced satisfactory restriction fragments when digested with HindIII. To avoid contamination with small-trapped DNA fragments and improve the size and uniformity of the inserts, we performed three separate size selections. Usually two size selections are sufficient, but visual examination of the DNA fragments after the second size selection revealed that an additional selection was required (data not shown). A detailed manuscript describing the experiments for the appropriate isolation of high-quality, high molecular weight DNA that led to the successful construction of this tea plant BAC library (tea plant nuclei isolation, buffer compositions, HMW DNA, large-insert ligations, etc.) was recently published [31], and this method will also yield sufficient quality tea DNA for other purposes, such as next generation sequencing (NGS).

Pulsed-field gel electrophoresis analysis of 576 random BAC clone plasmids showed that the majority of C. sinensis cloned DNA inserts were present as single NotI fragments. This indicated that the C. sinensis genome apparently contained few NotI sites, a feature commonly observed with the genomes of other plant dicot species and contrary to the results obtained with monocot species [48].

Long terminal repeats (LTRs), a component of genome repeat analysis [49–51], are sequences of DNA that are repeated hundreds or thousands of times in the genome. They are often found in retrotransposons and flanking functional genes [51, 52]. LTR retrotransposons constitute a significant portion of most eukaryote genomes and in plants have been suggested to be causative for the dramatic differences of genome sizes (and polyploidization) and for disruptions of genome organization and structure [2, 51, 53, 54]. While insertions of DNA have been attributed to amplifications of retrotransposons [51, 54–57], deletions have been suggested to involve all DNA sequence class types and may be the result of homologous recombination and/or illegitimate recombination [56]. Recent analysis of BES from different Oryza species [2] found good correlation with flow cytometric genome sizing and repeat content. Our preliminary analysis of tea repetitive sequences from pilot BAC-end sequences indicated that over 98 percent of the tea genome could be repetitive. We found that LTR retrotransposons are the predominant class of repeat elements in C. sinensis, followed by DNA transposable elements. These results support the correlation of large genome size with proliferation of repeat content since the tea genome is quite large, 4,000 Mb, which is 1.6X maize and 0.8X barley, whose genomes are approximately 80–95 percent repetitive. Therefore, it is not surprising that the majority of tea BESs contained sequences highly enriched with transposable elements.

5. Conclusion

A high-quality, deep-coverage, HindIII BAC library of tea plant (Camellia sinensis) has been constructed. The library, named CSBCBa, is publicly available from the Arizona Genomics Institute Resource Center (http://www.genome.arizona.edu/orders/). The average insert size of the library was 135 kb, it contained very low organellar contamination, and it provides 13.54x genome equivalent coverage from a total of 401,280 clones. Analysis of BAC clone end sequences revealed that the repetitive fraction of the tea genome was highly enriched for LTRs and DNA TEs. This public resource will provide a useful platform for genomics research, such as genome sequencing, DNA fingerprinting and physical mapping, gene identification, isolation and regulation, as well as complex analysis of targeted genomic regions. Research on the genomic information and gene expression patterns of tea plant has advanced slowly, probably owing to the lack of an available high-quality BAC library, a key genomic tool. Since this BAC library has been made available to the public, we expect that the most advanced work with DNA tools, which has been initiated by other researchers, will move forward and deepen genomics research on the tea plant. Specific genes of the tea plant, such as the genes involved in the oxidative pathways of catechins by polyphenol oxidase [20], which are involved in the pathways of the powerful antioxidant epigallocatechin-3-gallate (EGCG), may be revealed by use of this resource. Previous work on tea plant genetics, such as linkage mapping, genetic integrity of somaclonal variants [38, 41, 58, 59], and tea plant functional genes and biopathways [39, 40, 60, 61], may be integrated and/or expanded with this library to help unravel the tea plant genome.

Acknowledgments

The authors thank Dr. Francis T. Zee (Research Leader, Supervisory Research Horticulturist at USDA/ARS, Hilo, Hawaii) for helpful discussions and for collecting and providing tea plant tissues for this research. Additional thanks go to the staff of AGI for robotic and barcode handling of the clones, DNA template extractions, BES, and the data pipeline.

References

  • 1. Wing RA, Ammiraju JSS, Luo M, et al. The Oryza map alignment project: the golden path to unlocking the genetic potential of wild rice species. Plant Molecular Biology. 2005;59(1):53–62. doi: 10.1007/s11103-004-6237-x.
  • 2. Ammiraju JSS, Luo M, Goicoechea JL, et al. The Oryza bacterial artificial chromosome library resource: construction and analysis of 12 deep-coverage large-insert BAC libraries that represent the 10 genome types of the genus Oryza. Genome Research. 2006;16(1):140–147. doi: 10.1101/gr.3766306.
  • 3. Luo M, Wing RA. An improved method for plant BAC library construction. In: Grotewold E, editor. Plant Functional Genomics: Methods and Protocols. Totowa, NJ, USA: Humana Press; 2003. pp. 3–19.
  • 4. Choi S, Wing RA. The construction of bacterial artificial chromosome (BAC) libraries. In: Gelvin S, Schilperoort R, editors. Plant Molecular Biology Manual. 2nd edition. Norwell, Mass, USA: Kluwer Academic Publishers; 2000. pp. 1–28.
  • 5. Bayou N, M’rad R, Belhaj A, et al. De novo balanced translocation t (7;16) (p22.1; p11.2) associated with autistic disorder. Journal of Biomedicine and Biotechnology. 2008;2008(1):5 pages. doi: 10.1155/2008/231904. Article ID 231904.
  • 6. Crawford B, Hussain AA, Jideama NM. Evidence of a genomic biomarker in normal human epithelial mammary cell line, MCF-10A, that is absent in the human breast cancer cell line, MCF-7. Journal of Biomedicine and Biotechnology. 2006;2006:5 pages. doi: 10.1155/JBB/2006/43181. Article ID 43181.
  • 7. Shizuya H, Birren B, Kim U-J, et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proceedings of the National Academy of Sciences of the United States of America. 1992;89(18):8794–8797. doi: 10.1073/pnas.89.18.8794.
  • 8. Breen JM, Wicker T, Kong X, et al. A highly conserved gene island of three genes on chromosome 3B of hexaploid wheat: diverse gene function and genomic structure maintained in a tightly linked block. BMC Plant Biology. 2010;10(1, article 98). doi: 10.1186/1471-2229-10-98.
  • 9. Rounsley S, Marri P, Yu Y, et al. De novo next generation sequencing of plant genomes. Rice. 2009;2:35–43.
  • 10. David P, Sévignac M, Thareau V, et al. BAC end sequences corresponding to the B4 resistance gene cluster in common bean: a resource for markers and synteny analyses. Molecular Genetics and Genomics. 2008;280(6):521–533. doi: 10.1007/s00438-008-0384-8.
  • 11. Han Y, Korban SS. An overview of the apple genome through BAC end sequence analysis. Plant Molecular Biology. 2008;67(6):581–588. doi: 10.1007/s11103-008-9321-9.
  • 12. Kim HR, Hurwitz B, Yu Y, et al. Construction, alignment and analysis of twelve framework physical maps that represent the ten genome types of the genus Oryza. Genome Biology. 2008;9(2, article R45). doi: 10.1186/gb-2008-9-2-r45.
  • 13. The Rice Chromosome 3 Sequencing Consortium. Sequence, annotation, and analysis of synteny between rice chromosome 3 and diverged grass species. Genome Research. 2005;15:1–10. doi: 10.1101/gr.3869505.
  • 14. International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature. 2005;436(7052):793–800. doi: 10.1038/nature03895.
  • 15. Chen ZM. Prospect on tea industry in the year of 2000. Journal of Tea Science. 1994;14(2):81–88.
  • 16. Khan N, Mukhtar H. Tea polyphenols for health promotion. Life Sciences. 2007;81(7):519–533. doi: 10.1016/j.lfs.2007.06.011.
  • 17. Jia X-D, Han C, Chen J-S. Tea pigments induce cell-cycle arrest and apoptosis in HepG2 cells. World Journal of Gastroenterology. 2005;11(34):5273–5276. doi: 10.3748/wjg.v11.i34.5273.
  • 18. Monobe M, Ema K, Kato F, Maeda-Yamamoto M. Immunostimulating activity of a crude polysaccharide derived from green tea (Camellia sinensis) extract. Journal of Agricultural and Food Chemistry. 2008;56(4):1423–1427. doi: 10.1021/jf073127h.
  • 19. Bukowski JF, Percival SS. L-theanine intervention enhances human γδ T lymphocyte function. Nutrition Reviews. 2008;66(2):96–102. doi: 10.1111/j.1753-4887.2007.00013.x.
  • 20. Karori SM, Wachira FN, Wanyoko JK, Ngure RM. Antioxidant capacity of different types of tea products. African Journal of Biotechnology. 2007;6(19):2287–2296.
  • 21. Rumpler W, Seale J, Clevidence B, et al. Oolong tea increases metabolic rate and fat oxidation in men. Journal of Nutrition. 2001;131(11):2848–2852. doi: 10.1093/jn/131.11.2848.
  • 22. Sano M, Suzuki M, Miyase T, Yoshino K, Maeda-Yamamoto M. Novel antiallergic catechin derivatives isolated from oolong tea. Journal of Agricultural and Food Chemistry. 1999;47(5):1906–1910. doi: 10.1021/jf981114l.
  • 23. Sato S, Adachi A, Sasaki Y, Ghazizadeh M. Oolong tea extract as a substitute for uranyl acetate in staining of ultrathin sections. Journal of Microscopy. 2008;229(1):17–20. doi: 10.1111/j.1365-2818.2007.01881.x.
  • 24. Kuroda Y, Hara Y. Antimutagenic and anticarcinogenic activity of tea polyphenols. Mutation Research. 1999;436(1):69–97. doi: 10.1016/s1383-5742(98)00019-2.
  • 25. Han LK, Takaku T, Li J, et al. Antiobesity action of oolong tea. International Journal of Obesity and Related Metabolic Disorders. 1999;23:98–105. doi: 10.1038/sj.ijo.0800766.
  • 26. Kurihara H, Fukami H, Toyoda Y, et al. Inhibitory effect of oolong tea on the oxidative state of low density lipoprotein (LDL). Biological and Pharmaceutical Bulletin. 2003;26(5):739–742. doi: 10.1248/bpb.26.739.
  • 27. Zhou SB, Chen XD. Tea in Taiwan. Chinese Tea. 2006;2006(2):34–35.
  • 28. Chen L, Zhou Z-X, Yang Y-J. Genetic improvement and breeding of tea plant (Camellia sinensis) in China: from individual selection to hybridization and molecular breeding. Euphytica. 2007;154(1-2):239–248.
  • 29. Ma C, Chen L. Research progress on isolation and cloning of functional genes in tea plants. Molecular Plant Breeding. 2006;4(3, supplement):16–22.
  • 30. Tanaka J, Taniguchi F, Hirai N, Yamaguchi S. Estimation of the genome size of tea (Camellia sinensis), Camellia (C. japonica), and their interspecific hybrids by flow cytometry. Tea Research Report. 2006;101:1–7.
  • 31. Lin JK, Kudrna D, Wing RA. High Molecular Weight (HMW) genomic DNA preparation from Tea plant (Camellia sinensis) for BAC library construction. Journal of Agricultural Science and Technology. 2009;3(1):1–10.
  • 32. Peterson D, Tomkins J, Frisch D, Wing R, Paterson A. Construction of plant bacterial artificial chromosome (BAC) libraries: an illustrated guide. Journal of Agricultural Genomics. 2000;5:1–100.
  • 33. Paterson AH. Leafing through the genomes of our major crop plants: strategies for capturing unique information. Nature Reviews Genetics. 2006;7(3):174–184. doi: 10.1038/nrg1806.
  • 34. Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2001;410(6826):p. 299.
  • 35. Brunner AM, Busov VB, Strauss SH. Poplar genome sequence: functional genomics in an ecologically dominant plant species. Trends in Plant Science. 2004;9(1):49–56. doi: 10.1016/j.tplants.2003.11.006.
  • 36. Jaillon O, Aury J-M, Noel B, et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449(7161):463–467. doi: 10.1038/nature06148.
  • 37. Schnable PS, Ware D, Fulton RS, et al. The B73 maize genome: complexity, diversity, and dynamics. Science. 2009;326(5956):1112–1115. doi: 10.1126/science.1178534.
  • 38. Hackett CA, Wachira FN, Paul S, Powell W, Waugh R. Construction of a genetic linkage map for Camellia sinensis (tea). Heredity. 2000;85(4):346–355. doi: 10.1046/j.1365-2540.2000.00769.x.
  • 39. Park J-S, Kim J-B, Hahn B-S, et al. EST analysis of genes involved in secondary metabolism in Camellia sinensis (tea), using suppression subtractive hybridization. Plant Science. 2004;166(4):953–961.
  • 40. Chen L, Zhao L-P, Gao Q-K. Generation and analysis of expressed sequence tags from the tender shoots cDNA library of tea plant (Camellia sinensis). Plant Science. 2005;168(2):359–363.
  • 41. Thomas J, Vijayan D, Joshi SD, Joseph Lopez S, Raj Kumar R. Genetic integrity of somaclonal variants in tea (Camellia sinensis (L.) O Kuntze) as revealed by inter simple sequence repeats. Journal of Biotechnology. 2006;123(2):149–154. doi: 10.1016/j.jbiotec.2005.11.005.
  • 42. Tanaka J, Taniguchi F. Kole C, editor. Tea. (Technical Crops). Genome Mapping and Molecular Breeding in Plants. 2007;6:119–125.
  • 43. Wan XC. Tea Biochemistry. 3rd edition. Beijing, China: China Agriculture Press; 2003.
  • 44. Fu BQ, Xie MY, Nie XP, et al. Method simplified in assaying tea polysaccharide. Food Science. 2001;22(11):69–73.
  • 45. Yüksel B, Paterson AH. Construction and characterization of a peanut HindIII BAC library. Theoretical and Applied Genetics. 2005;111(4):630–639. doi: 10.1007/s00122-005-1992-x.
  • 46. Zeng CJ, Pan HJ, Gong SB, Yu JQ, Wan QH, Fang SG. Giant panda BAC library construction and assembly of a 650-kb contig spanning major histocompatibility complex class II region. BMC Genomics. 2007;8, article 315. doi: 10.1186/1471-2164-8-315.
  • 47. Chalhoub B, Belcram H, Caboche M. Efficient cloning of plant genomes into bacterial artificial chromosome (BAC) libraries with larger and more uniform insert size. Plant Biotechnology Journal. 2004;2(3):181–188. doi: 10.1111/j.1467-7652.2004.00065.x.
  • 48. Noir S, Patheyron S, Combes M-C, Lashermes P, Chalhoub B. Construction and characterisation of a BAC library for genome analysis of the allotetraploid coffee species (Coffea arabica L.). Theoretical and Applied Genetics. 2004;109(1):225–230. doi: 10.1007/s00122-004-1604-1.
  • 49. Havecker ER, Gao X, Voytas DF. The diversity of LTR retrotransposons. Genome Biology. 2004;5(6, article 225). doi: 10.1186/gb-2004-5-6-225.
  • 50. Gao L, McCarthy EM, Ganko EW, McDonald JF. Evolutionary history of Oryza sativa LTR retrotransposons: a preliminary survey of the rice genome sequences. BMC Genomics. 2004;5, article 18. doi: 10.1186/1471-2164-5-18.
  • 51. Zuccolo A, Sebastian A, Talag J, et al. Transposable element distribution, abundance and role in genome size variation in the genus Oryza. BMC Evolutionary Biology. 2007;7, article 152. doi: 10.1186/1471-2148-7-152.
  • 52. Sasaki T, Burr B. International Rice Genome Sequencing Project: the effort to completely sequence the rice genome. Current Opinion in Plant Biology. 2000;3(2):138–141. doi: 10.1016/s1369-5266(99)00047-3.
  • 53. Ammiraju JSS, Zuccolo A, Yu Y, et al. Evolutionary dynamics of an ancient retrotransposon family provides insights into evolution of genome size in the genus Oryza. Plant Journal. 2007;52(2):342–351. doi: 10.1111/j.1365-313X.2007.03242.x.
  • 54. Hurwitz BL, Kudrna D, Yu Y, et al. Rice structural variation: a comparative analysis of structural variation between rice and three of its closest relatives in the genus Oryza. Plant Journal. 2010;63(6):990–1003. doi: 10.1111/j.1365-313X.2010.04293.x.
  • 55. Devos KM, Brown JKM, Bennetzen JL. Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis. Genome Research. 2002;12(7):1075–1079. doi: 10.1101/gr.132102.
  • 56. Ma J, Bennetzen JL. Rapid recent growth and divergence of rice nuclear genomes. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(34):12404–12410. doi: 10.1073/pnas.0403715101.
  • 57. Ma J, Devos KM, Bennetzen JL. Analyses of LTR-retrotransposon structures reveal recent and rapid genomic DNA loss in rice. Genome Research. 2004;14(5):860–869. doi: 10.1101/gr.1466204.
  • 58. Kaundun SS, Matsumoto S. Identification of processed Japanese green tea based on polymorphisms generated by STS-RFLP analysis. Journal of Agricultural and Food Chemistry. 2003;51(7):1765–1770. doi: 10.1021/jf020821i.
  • 59. Huang J, Li J, Huang Y, et al. Construction of AFLP molecular markers linkage map in tea plant. Journal of Tea Science. 2005;25(1):7–15.
  • 60. Jin JQ, Cui HR, Gong XC, Chen WY, Xin Y. Studies on tea plants (Camellia sinensis) germplasms using EST-SSR marker. Hereditas. 2007;29(1):103–108. doi: 10.1360/yc-007-0103.
  • 61. Punyasiri PAN, Abeysinghe ISB, Kumar V, et al. Flavonoid biosynthesis in the tea plant Camellia sinensis: properties of enzymes of the prominent epicatechin and catechin pathways. Archives of Biochemistry and Biophysics. 2004;431(1):22–30. doi: 10.1016/j.abb.2004.08.003.

Articles from Journal of Biomedicine and Biotechnology are provided here courtesy of Wiley
