Genome Res. 1998 Aug;8(8):842–847. doi: 10.1101/gr.8.8.842

Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)

Martin W Ganal 1,1, Rosemarie Czihal 1, Ulrich Hannappel 1, Dorothee-U Kloos 1, Andreas Polley 1, Hong-Qing Ling 1
PMCID: PMC310761  PMID: 9724330

Abstract

The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions.

[cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.]


Molecular markers used in mapping experiments should ideally have a known function. This requirement is fulfilled best by the use of known genes in the form of genomic clones or cDNA clones. Many genetic maps based on molecular markers, such as RFLPs (restriction fragment length polymorphisms) include a number of known genes. However, the number of known genes available for a given organism is usually very limited. To circumvent this problem, many maps contain a large number of randomly selected cDNA clones. cDNAs have the additional advantage that they are frequently single or low copy and thus ideal for genetic mapping (Bernatzky and Tanksley 1986). Furthermore, they show a higher level of conservation between species than genomic clones (Zamir and Tanksley 1988) and allow more efficient cross-mapping in related species for the determination of synteny (Ahn and Tanksley 1993).

In recent years, random cDNA clones have attracted considerable interest because sequencing such clones provides a fast way to catalog the expressed genes of a given organism without sequencing the entire genome. In plants, major efforts are now under way to sequence random cDNAs in Arabidopsis and rice (Sasaki et al. 1994; Delseny et al. 1997). Many thousands of such fragments have been deposited in the respective databases. For ∼30%–40% of the cDNA sequences, a possible function can be deduced based on homologies to known genes from bacteria, animals, or plants (Delseny et al. 1997).

For all expressed genes, not only the DNA sequence and the possible function should be known but also their map position on the chromosomes. In the long term this will allow merging of the classical genetic map based on mutations with potential candidate genes for these mutants. A common repertoire of mapped cDNA clones will, in the future, enable us to study synteny even between distantly related species (Paterson et al. 1996) for which studies by cross-hybridization are very difficult. The availability of sequence information for mapped cDNA clones should make conclusions from hybridization studies more firm and testable.

Tomato (Lycopersicon esculentum) has one of the most densely populated genetic maps among plants. Currently, >1000 RFLP markers have been published by Tanksley et al. (1992), of which >300 are random cDNA clones. An additional 500 RFLP markers have been localized in reference to this map because all markers from potato can be readily transferred to tomato on the basis of extensive synteny (Gebhardt et al. 1991; Jacobs et al. 1995). We report here the results of the sequencing of ∼90% of the mapped cDNA clones from the high-density tomato map.

RESULTS

cDNA Sequencing and Analysis

A total of >300 cDNA clones have been mapped onto the high-density map of tomato by Tanksley et al. (1992). These clones were derived from two different libraries: CD clones were generated from leaf tissue and CT clones from epidermal tissue. The clones were in four different vectors. For sequencing, only those clones were used that were generated in vectors with M13 or pBluescript primer sites. This excluded some of the early cDNA clones (Bernatzky and Tanksley 1986) because they were generated in pBR322. A total of 272 cDNA clones could be sequenced in this way from at least one direction. Sequences from another 15 clones were not included in the analysis because they contained obvious cloning artifacts that could not be resolved. In all, 145 clones were sequenced from both sides, and for 73 clones the entire insert was sequenced. The largest clone completely sequenced was 896 bp.

Because these clones had already been placed onto the genetic map of tomato, it was anticipated that most should represent different genes. Nevertheless, for some cDNA clones duplicates could be identified (clones CT71 and CT210, CT72 and CT242, CT75 and CT166, CT88 and CT257, CT115 and CT218, CT154 and CT214, CT223 and CD61, CD18 and CD34). In these cases, the cDNA clones show a complex hybridization pattern, from which it had not previously been possible to conclude that they are derived from the same gene.

Sequence Homologies

For 156 of the 272 analyzed clones (57%), significant nucleic acid and/or protein homologies were found with the respective databases. Of those, 125 clones showed matches with known DNA sequences and 141 clones showed matches with protein sequences with a BLAST score of <10⁻¹⁰. Some sequenced clones showed homology to other plant sequences only at the DNA level and not at the protein level, because the cDNA clones were not integrated directionally into the cloning vector and, for these clones, sequence information was obtained only from the untranslated 3′ region with the poly(A) tail. If matches on the DNA and protein level were found, only those were considered significant that matched to the same protein type or corresponding genes. Table 1 shows a summary of the data for the clones with significant homologies. For 25 tomato genes already described we found a direct match with the database.
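The significance rule described above can be illustrated with a minimal sketch. The Hit record, its field names, and the function name are hypothetical and chosen only for illustration; the 10⁻¹⁰ cutoff and the requirement that DNA-level and protein-level matches agree on the protein type come from the text. This is one possible reading of the rule, not the authors' actual procedure.

```python
from dataclasses import dataclass
from typing import Optional

THRESHOLD = 1e-10  # significance cutoff stated in the text


@dataclass
class Hit:
    """Best database match for one clone at one level (hypothetical record)."""
    protein_type: str  # e.g. a gene or protein family name
    score: float       # BLAST probability score; smaller means a better match


def is_significant(dna: Optional[Hit], protein: Optional[Hit]) -> bool:
    """Return True if a clone counts as having a significant homology."""
    # At least one level must have a match below the cutoff.
    strong = [h for h in (dna, protein) if h is not None and h.score < THRESHOLD]
    if not strong:
        return False
    # When matches exist at both levels, they must point to the same protein type.
    if dna is not None and protein is not None:
        return dna.protein_type == protein.protein_type
    return True
```

With such a function, a clone whose only match is a DNA-level hit above the cutoff is rejected, whereas a weak DNA-level hit is accepted when a strong protein-level hit names the same protein type.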

Table 1.

Nucleic Acid and Protein Homologies of Mapped cDNA Clones from Tomato

Columns: Number, Homology DNA, Homology protein, Match to organism.
[Table body not reproduced; in the source it is supplied as two image files, gr.8t1arev1.jpg and gr.8t1brev1.jpg.]

Those cDNA clones listed show a homology score of <10⁻¹⁰ at least on either the DNA or protein level. Homology scores of >10⁻¹⁰ were included when the lower match agreed with the higher score (>10⁻¹⁰) on the DNA or protein level, respectively. For homology searches on the DNA level, the GenBank DNA database was used. For homology searches on the protein level, the GenBank protein database was used. Accession numbers are listed in the corresponding columns for the respective databases. If the same match was found on the DNA and protein level, only a single organism is listed in the match to organism column. In some cases, two organisms are indicated: the most significant matches are to different organisms but the same protein type on the DNA and protein level, respectively. Clones that match directly to known tomato genes have been indicated by an asterisk (*).

DISCUSSION

The availability of 272 sequenced cDNA clones from the tomato map, together with other previously mapped genes of known function (Pillen et al. 1996b), creates a framework of >350 markers for the tomato genome for which at least part of the DNA sequence is known. This information will be sufficient for the generation of PCR-based markers for most regions of the tomato genome. Such sequence-tagged sites (STSs) will function as genetic anchor markers (Inoue et al. 1994) and permit fast and high-throughput analysis of loci in large populations (Schumacher et al. 1995). In plant breeding and genetic experiments, STS markers, for example, in the form of CAPS (cleaved amplified polymorphic sequences), are very useful to follow linked genes in an economical manner through generations (Konieczny and Ausubel 1993). Such markers are especially useful for preselecting recombinants in specific regions of the tomato genome for the purpose of high-resolution mapping of genes targeted for map-based cloning (Alpert and Tanksley 1996). At the same time, they provide starting points for the rapid isolation of large insert clones from tomato DNA libraries, such as yeast artificial chromosomes (YACs) (Martin et al. 1992; Bonnema et al. 1996; Pillen et al. 1996a) or bacterial artificial chromosomes (BACs) (Shizuya et al. 1992) and binary BAC (BiBAC) clones (Hamilton et al. 1996).
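To illustrate how sequence information could support a PCR-based marker assayed by restriction digestion, the sketch below compares in silico digests of two alleles of an amplicon. The allele sequences and function names are hypothetical, and EcoRI (which cuts G^AATTC) is used only as a familiar example enzyme; none of this is taken from markers actually developed for tomato.

```python
from typing import List


def fragment_lengths(amplicon: str, site: str, cut_offset: int) -> List[int]:
    """Return fragment lengths after cutting `amplicon` at every `site`.

    `cut_offset` is the cut position within the recognition site on the
    top strand (1 for EcoRI, G^AATTC). Only the given strand is searched,
    which is sufficient for palindromic sites such as GAATTC.
    """
    seq = amplicon.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def is_polymorphic(allele_a: str, allele_b: str, site: str, cut_offset: int) -> bool:
    """True if the two alleles give different fragment patterns after digestion."""
    pattern_a = sorted(fragment_lengths(allele_a, site, cut_offset))
    pattern_b = sorted(fragment_lengths(allele_b, site, cut_offset))
    return pattern_a != pattern_b


# Hypothetical alleles: allele_b carries a point mutation that destroys the site.
allele_a = "ACGTACGTACGT" + "GAATTC" + "TTGATTGA"
allele_b = "ACGTACGTACGT" + "GAGTTC" + "TTGATTGA"
```

Here allele_a is cut once and allele_b is not, so the two alleles would be distinguishable on a gel after digestion of the PCR product.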

Considerable effort is currently spent on the comparative mapping of highly conserved cDNA clones across a wide range of plant taxa. This has revealed that in limited regions, synteny exists even between distantly related plant species (Paterson et al. 1996). Sequence data from mapped cDNA clones will eventually help to reveal synteny between such plant species. For example, using the information from these mapped tomato clones, it will be possible to identify genes or expressed sequence tags (ESTs) with very high sequence homology on the DNA and/or protein level to the model plant organism Arabidopsis thaliana. Such probes are very likely to cross-hybridize between Arabidopsis and tomato. If they are single copy in hybridization on both genomes, they can be mapped comparatively in these species, and it can be determined whether there is conservation of linkage in certain areas of their genomes. Similarly, such experiments can be expanded to additional plant species for which large numbers of ESTs will be generated in the future. A carefully chosen approach involving sequence comparisons in combination with genetic mapping by hybridization is necessary for the study of synteny between distantly related plant species to discriminate between orthologous and paralogous genes (Tatusov et al. 1997). Large numbers of cDNA clones are currently being sequenced and mapped in some plant genomes. For rice, EST sequences were used for the construction of a high-density genetic map (Kurata et al. 1994). Large numbers of ESTs are mapped onto the genetic map or YAC contig map of A. thaliana (Agyare et al. 1997). With the sequencing of the entire Arabidopsis and rice genomes in the foreseeable future, it will be easier to study exclusively orthologous sequences from those two genomes in comparison to data from other plant species.
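One common sequence-based heuristic for separating likely orthologs from paralogs is the reciprocal best hit: two sequences are paired only if each is the other's best match. The sketch below shows the idea; it is not a method described in this paper, and the clone and gene identifiers in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]  # (query, subject) -> probability score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query to its best-scoring subject (lowest probability score)."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score < best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return (a, b) pairs in which each sequence is the other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)
```

For example, if hypothetical tomato clones T1 and T2 both have Arabidopsis sequence A1 as their best match, but A1's best match in tomato is T1, only the pair (T1, A1) is retained; T2 would be treated as a possible paralog and checked by mapping.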

METHODS

cDNA Clones

All cDNA clones from the map of Tanksley et al. (1992) that have been cloned in pUC vectors, the pBluescript SK vector, or the pCR II (Invitrogen) vector were used for this study. Insert sizes were confirmed prior to sequencing by PCR using M13 forward and reverse sequencing primers.

Plasmid Preparation and Sequencing

Plasmids were prepared according to standard protocols using Qiagen columns (Qiagen, Hilden, Germany) from 5-ml cultures. Plasmid DNA was sequenced using commercially available sequencing kits and analyzed on ALF (automated laser fluorescence) sequencers (Pharmacia) and ABI sequencers (Perkin Elmer) using either M13 forward (pUC vectors) or SK primers (pBluescript). Most of the clones were also sequenced from the opposite side using M13 reverse (pUC vectors) or KS primers (pBluescript).

Sequence Analysis

Raw sequences were transferred into the sequence analysis program DNAsis (Hitachi) and edited for vector sequences, poly(A) tails, and other cloning artifacts. If sequencing was performed from both sides of a clone, it was determined whether the two sequences overlap and in such cases they were edited and merged into a single file. The edited sequences were analyzed. The DNA sequence and the translated protein sequences were compared to all available DNA and protein sequences using the NIH BLAST server (BLASTN and BLASTX). Matches with a score of <10⁻¹⁰ were considered to be significant. Accession numbers correspond to the respective entries in the GenBank Nucleotide Sequence and GenBank Protein databases, respectively.
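The editing and merging steps described above were carried out in DNAsis; the following is only a minimal sketch of the same idea in Python, not a reproduction of that software. The function names and the default parameters (a poly(A) run of at least 10 bases, an exact overlap of at least 20 bases) are assumptions chosen for illustration, and vector trimming is not covered.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_poly_a(seq: str, min_run: int = 10) -> str:
    """Remove a trailing poly(A) run of at least `min_run` bases."""
    stripped = seq.rstrip("Aa")
    return stripped if len(seq) - len(stripped) >= min_run else seq


def merge_reads(forward: str, reverse: str, min_overlap: int = 20) -> Optional[str]:
    """Merge a forward read with a read from the opposite side of the clone.

    The opposite-side read is reverse-complemented first. The reads are
    merged if the end of `forward` matches the start of that sequence
    exactly over at least `min_overlap` bases; otherwise None is returned.
    """
    rc = reverse_complement(reverse)
    # Try the longest possible overlap first.
    for k in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward[-k:] == rc[:k]:
            return forward + rc[k:]
    return None
```

Real reads contain base-calling errors, so a practical implementation would tolerate mismatches in the overlap rather than require an exact match.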

Acknowledgments

The technical support of S. König and S. Gentz in sequencing is acknowledged. The data will also be available through the SolGenes database. Part of this research has been supported by the Deutsche Forschungsgemeinschaft (Ga470/1-2).

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

E-MAIL ganal@ipk-gatersleben.de; FAX 49-39482-5137.

REFERENCES

  1. Agyare FD, Lashkari DA, Lagos A, Namath AF, Lagos G, Davis RW, Lemieux B. Mapping expressed sequence tag sites on yeast artificial chromosome clones of Arabidopsis thaliana DNA. Genome Res. 1997;7:1–9. doi: 10.1101/gr.7.1.1.
  2. Ahn S, Tanksley SD. Comparative linkage maps of the rice and maize genomes. Proc Natl Acad Sci. 1993;90:7980–7984. doi: 10.1073/pnas.90.17.7980.
  3. Alpert KB, Tanksley SD. High resolution mapping and isolation of a yeast artificial chromosome contig containing fw2.2: A major fruit weight quantitative trait locus in tomato. Proc Natl Acad Sci. 1996;93:15503–15507. doi: 10.1073/pnas.93.26.15503.
  4. Bernatzky R, Tanksley SD. Majority of random cDNA clones correspond to single loci in the tomato genome. Mol Gen Genet. 1986;203:8–14.
  5. Bonnema G, Hontelez J, Verkerk R, Zhang YQ, van Daelen R, van Kamen A, Zabel P. An improved method for partially digesting plant megabase DNA suitable for YAC cloning: Application to the construction of a 5.5 genome equivalent YAC library of tomato. Plant J. 1996;9:125–133. doi: 10.1046/j.1365-313x.1996.09010125.x.
  6. Delseny M, Cooke R, Raynal M, Grellet F. The Arabidopsis thaliana cDNA sequencing projects. FEBS Lett. 1997;403:221–224. doi: 10.1016/s0014-5793(97)00075-6.
  7. Gebhardt C, Ritter E, Barone A, Debener T, Walkemeier B, Schachtschnabel U, Kaufmann H, Thompson RD, Bonierbale MW, Ganal MW, Tanksley SD, Salamini F. RFLP maps of potato and their alignment with the homeologous tomato genome. Theor Appl Genet. 1991;83:49–57. doi: 10.1007/BF00229225.
  8. Hamilton CM, Frary A, Lewis C, Tanksley SD. Stable transfer of intact high molecular weight DNA into plant chromosomes. Proc Natl Acad Sci. 1996;93:9975–9979. doi: 10.1073/pnas.93.18.9975.
  9. Inoue T, Zhong HS, Miyao A, Ashikawa I, Monna L, Fukuoka S, Miyadera N, Nagamura Y, Kurata N, Sasaki T, Minobe Y. Sequence-tagged sites (STSs) as standard landmarkers in rice genome. Theor Appl Genet. 1994;89:728–734. doi: 10.1007/BF00223712.
  10. Jacobs JME, van Eck HJ, Arens P, Verkerk-Bakker B, te Lintel Hekkert B, Bastiaanssen HJM, El-Kharborty A, Periera A, Jacobsen E, Stiekema WJ. A genetic map of potato (Solanum tuberosum) integrating molecular markers, including transposons and classical markers. Theor Appl Genet. 1995;91:289–300. doi: 10.1007/BF00220891.
  11. Konieczny A, Ausubel FM. A procedure for mapping Arabidopsis mutations using co-dominant ecotype-specific PCR-based markers. Plant J. 1993;4:403–410. doi: 10.1046/j.1365-313x.1993.04020403.x.
  12. Kurata N, Nagamura Y, Yamamoto K, Harushima Y, Sue N, Wu J, Antonio BA, Shomura A, Shimizu T, Lin S-Y, et al. A 300 kilobase interval genetic map of rice including 883 expressed sequences. Nature Genet. 1994;8:365–372. doi: 10.1038/ng1294-365.
  13. Martin GB, Ganal MW, Tanksley SD. Construction of a yeast artificial chromosome library of tomato and identification of cloned segments linked to two disease resistance loci. Mol Gen Genet. 1992;233:25–32. doi: 10.1007/BF00587557.
  14. Paterson AH, Lan T-H, Reischmann KP, Chang C, Lin Y-R, Liu S-C, Burow MD, Kowalski SP, Katsar CS, DelMonte TA, et al. Toward a unified genetic map of higher plants, transcending the monocot-dicot divergence. Nature Genet. 1996;14:380–382. doi: 10.1038/ng1296-380.
  15. Pillen K, Alpert KB, Giovannoni JJ, Ganal MW, Tanksley SD. Rapid and reliable screening of a tomato YAC library exclusively based on PCR. Plant Mol Biol Rep. 1996a;14:58–67.
  16. Pillen K, Pineda O, Lewis CB, Tanksley SD. Status of genome mapping tools in the taxon Solanaceae. In: Paterson AH, editor. Genome mapping in plants. Austin, TX: R.G. Landes; 1996b. pp. 281–308.
  17. Sasaki T, Song J, Koga-Ban Y, Matsui E, Fang F, Higo H, Nagasaki H, Hori M, Miya M, Murayama-Kayano E, et al. Toward cataloguing all rice genes: Large scale sequencing of randomly chosen rice cDNAs from a callus cDNA library. Plant J. 1994;6:615–624. doi: 10.1046/j.1365-313x.1994.6040615.x.
  18. Schumacher K, Ganal M, Theres K. Genetic and physical mapping of the lateral suppressor (ls) locus in tomato. Mol Gen Genet. 1995;246:761–766. doi: 10.1007/BF00290724.
  19. Shizuya H, Birren B, Kim U-J, Mancino V, Slepak T, Tachiiri Y, Simon M. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc Natl Acad Sci. 1992;89:8794–8797. doi: 10.1073/pnas.89.18.8794.
  20. Tanksley SD, Ganal MW, Prince JP, de Vicente MC, Bonierbale MW, Broun P, Fulton TM, Giovannoni JJ, Grandillo S, Martin GB, et al. High density molecular linkage maps of the tomato and potato genomes. Genetics. 1992;132:1141–1160. doi: 10.1093/genetics/132.4.1141.
  21. Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997;278:631–637. doi: 10.1126/science.278.5338.631.
  22. Zamir D, Tanksley SD. Tomato genome is comprised largely of fast-evolving low copy-number sequences. Mol Gen Genet. 1988;213:254–261.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press
