Genetics. 2020 Aug 26;216(2):261–262. doi: 10.1534/genetics.120.303647

Perspective: Linkage Maps, Communities of Geneticists, and Genome Databases

David Botstein
PMCID: PMC7536864  PMID: 33023927

Abstract

The Thomas Hunt Morgan Medal recognizes lifetime contributions to the field of genetics. The 2020 recipient is David Botstein of Calico Labs and Princeton University, recognizing his multiple contributions to genetics, including the collaborative development of methods for defining genetic pathways, mapping genomes, and analyzing gene expression.


The great honor of being awarded the Genetics Society of America (GSA)’s Thomas Hunt Morgan Medal stimulated me to think of how Morgan’s ideas and leadership in the development of genetics as a science influenced my own intellectual development. Three foundational elements stand out, two of which are directly traceable to Morgan himself and his “fly room” at Columbia University 100 years ago. Morgan’s ideas and example shaped my career and were the basis of the contributions I was able to make to the development of the modern sciences of genetics and genomics.

One of these elements is the idea that frequencies of recombination could be used quantitatively to map genes relative to one another on the chromosomes (Sturtevant 1913; Bridges 1921). The second, which grew organically out of the process of making comprehensive linkage maps, was the establishment and nurturing of an open and cooperative intellectual culture and community dedicated to the genetics of Drosophila melanogaster and the mapping and characterization of its functional genes. As scientists found new genes (characterized in those days by their heritable mutant phenotypes), they mapped them to each other by exchanging mutant strains and crossing them. The third element, the development of genome databases, came much later, following closely upon the determination of genome sequences and requiring, of course, the development of modern computers and computer science.
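
The first of these ideas can be illustrated with a minimal sketch in Python. It assumes three hypothetical loci (a, b, and c) and invented progeny counts chosen only for illustration; the function names are likewise hypothetical. The recombination fraction between two loci is the share of progeny that are recombinant for that pair, and for three linked loci the pair with the largest fraction is taken to flank the third.

```python
def recombination_fraction(recombinant_progeny: int, total_progeny: int) -> float:
    """Fraction of scored progeny that are recombinant for a pair of loci."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinant_progeny <= total_progeny:
        raise ValueError("recombinant_progeny must lie between 0 and total_progeny")
    return recombinant_progeny / total_progeny


def order_three_loci(rf: dict) -> tuple:
    """Order three linked loci from their three pairwise recombination fractions.

    rf maps frozenset({x, y}) to the recombination fraction between x and y.
    The pair that recombines most often is placed on the outside; the
    remaining locus is placed between them.
    """
    loci = set().union(*rf)
    if len(loci) != 3 or len(rf) != 3:
        raise ValueError("expected exactly three loci and three pairwise values")
    outer = max(rf, key=rf.get)
    left, right = sorted(outer)
    (middle,) = loci - outer
    return (left, middle, right)


# Invented counts (recombinants, total progeny) for hypothetical loci a, b, c.
counts = {
    ("a", "b"): (100, 1000),
    ("b", "c"): (200, 1000),
    ("a", "c"): (280, 1000),
}
rf = {frozenset(pair): recombination_fraction(*c) for pair, c in counts.items()}
gene_order = order_three_loci(rf)
print(gene_order)
```

With these invented numbers the a–c fraction is smaller than the sum of the a–b and b–c fractions, which is the usual consequence of undetected double crossovers over longer intervals.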

A very similar history underlies the development of the scientific community dedicated to the study of the genetics of budding yeast, Saccharomyces cerevisiae. As in the case of Drosophila, interest in yeast began with genetic linkage mapping, which gave rise to an open community of geneticists who cooperated in the construction and maintenance of the genetic map. Like the fly community, the yeast community is still active and vibrant today. I was fortunate to have joined the yeast community near its beginning, long before genome sequencing. At that time, yeast and fly geneticists who had found and mapped a new gene combined their data in periodic publications of the community’s map with a list of markers and their arrangement on the chromosomes. The last such yeast publication was a review (Mortimer and Schild 1985) and the last such fly publication was the “Red Book” (Lindsley and Zimm 1992). These maps and the data that underlay them became the starting point for SGD (Saccharomyces Genome Database; Cherry et al. 1997) and FlyBase (Ashburner and Drysdale 1994). One of the contributions to our science that I am most proud of was my participation in founding SGD and helping to guide its continuing success over the decades in providing current genomic information to the yeast community.

The necessity for a sustained cooperative endeavor to make and maintain linkage maps lay behind the formation of cooperative communities for other genetically tractable “model” organisms, notably the worm Caenorhabditis elegans and the bacterium Escherichia coli. As the sequences of these genomes became available, each of these communities transitioned from periodic review publications to public genomic databases as the way to maintain and distribute the maps and, crucially, the list of genes that had been mapped by linkage on the basis of mutant phenotypes. The databases contained not just gene order on the chromosomes: they assembled as much of the published information about each gene as could be obtained from the literature, ultimately vastly more than just map position and a one-sentence phenotypic description typical of the review papers. The sequence-based methods provided exponentially more genetic information and very soon the focus for virtually all genetic information became the gene and genome sequences as organized and delivered by the genome databases. The open and cooperative communities of scientists working on the biology of these organisms persisted in no small part because the aggregated information in a genomic database was even more useful than the linkage map alone had been.

It is important to note that gene and genome sequences by themselves are of limited value. The connection between a particular gene and its function in the biology of the organism, usually revealed by mutant phenotypes, is essential to biological understanding. Establishing this connection is still often done today by some variant of linkage analysis: a mutation or sequence variant is associated with a phenotype via its pattern of inheritance in crosses. In such studies a gene can be followed by either of two features: a phenotype or a sequence variation, usually a DNA polymorphism. Robust evidence for the connection between phenotype and sequence is generated when the two patterns coincide.
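
What it means for the two patterns to coincide can be sketched in Python, under simplifying assumptions that are mine rather than a description of any particular study: each offspring of a cross is scored as a pair (has the phenotype, carries the variant), each offspring inherits one of two alleles (as in haploid yeast segregants), and an unlinked variant would agree with the phenotype half the time. The progeny list is invented for illustration and the function names are hypothetical.

```python
from math import comb


def discordant_fraction(progeny: list) -> float:
    """Share of progeny in which phenotype and variant disagree.

    Under the stated assumptions this estimates the recombination
    fraction between the variant and the gene causing the phenotype.
    """
    if not progeny:
        raise ValueError("progeny must not be empty")
    discordant = sum(1 for has_phenotype, has_variant in progeny
                     if has_phenotype != has_variant)
    return discordant / len(progeny)


def chance_of_coincidence(concordant: int, total: int) -> float:
    """Probability that at least `concordant` of `total` progeny agree
    if the variant were unlinked (each agrees with probability 1/2)."""
    if not 0 <= concordant <= total:
        raise ValueError("concordant must lie between 0 and total")
    return sum(comb(total, i) for i in range(concordant, total + 1)) / 2 ** total


# Invented data: 20 progeny, one of which shows the phenotype without the variant.
progeny = [(True, True)] * 10 + [(False, False)] * 9 + [(True, False)]
n_progeny = len(progeny)
n_concordant = sum(1 for p, v in progeny if p == v)
estimated_recombination = discordant_fraction(progeny)
p_if_unlinked = chance_of_coincidence(n_concordant, n_progeny)
print(estimated_recombination, p_if_unlinked)
```

A small probability under the unlinked assumption, together with a low discordant fraction, is the sense in which coinciding patterns count as evidence; real analyses must also handle diploid genotypes, incomplete penetrance, and multiple testing across many markers.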

In 1980 my collaborators and I (Botstein et al. 1980) introduced the idea of a genome-scale linkage map of DNA polymorphisms as a way to map genes known only by their phenotype and their pattern of inheritance in humans (generally diseases inherited as single Mendelian factors). This idea, of course, traces directly back to Morgan, Sturtevant, Bridges, and the fly room. It took root in the human genetics community, and today literally thousands of human disease genes have been located on the human genome by this principle. The linkage map became a scaffold on which the first DNA sequence of the human genome was in part assembled.

One of the first fruits of the new science of genomics was the direct evidence, by sequence similarity, that all the organisms on earth are ultimately related by descent. In eukaryotes the similarities are so strong among expressed genes that one can usually infer function for orthologs whose function is known in one organism but not the other. In this way the function of many genes could be investigated in an experimentally suitable organism and annotated via sequence similarity in another, including the human. In many cases, including yeast, the functional relationship could be tested by complementation of a gene defect in yeast by DNA with a sequence from an ortholog in a mammal.

In truth, most geneticists already understood that these relationships must exist long before genome sequencing, going back to Morgan’s time. Such relationships are the logical consequence of evolution. The advent of the genomic sequences made these relationships accessible and the degree of relationship quantitative. By the 1980s my group was actively characterizing a number of yeast genes specifying important functions in eukaryotic cell biology. We chose the genes (e.g., those specifying actin and tubulins) by sequence similarity to mammalian genes characterized by biochemical analysis of their protein products in mammals (cf. Botstein and Fink 1988, 2011; Botstein et al. 1997).

The several eukaryotic “model organism” communities, including yeast, fly, and worm, had realized by the turn of the 21st century that transferring annotation for functional genes by homology was an important tool for understanding biological function in all organisms. They recognized this as a “tool for the unification of biology” (Ashburner et al. 2000) and organized a consortium of genome databases into the “Gene Ontology Consortium.” In the succeeding years, the gene ontology and its database have become a vital tool for connecting genetically determined biological functions in all organisms, including humans.

To conclude, my career was greatly influenced by the ideas and discoveries of Thomas Hunt Morgan. My work (and that of Gerry Fink, who was honored with me) was in many ways a logical consequence of his, and I cannot think of a more appropriate recognition for our work than to celebrate this continuity in the intellectual history of genetics.

Literature Cited

  1. Ashburner M., and Drysdale R., 1994. FlyBase - The Drosophila genetic database. Development 120: 2077–2079.
  2. Ashburner M., Ball C. A., Blake J. A., Botstein D., Butler H. et al., 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25: 25–29. 10.1038/75556
  3. Botstein D., and Fink G., 1988. Yeast: an experimental organism for modern biology. Science 240: 1439–1443. 10.1126/science.3287619
  4. Botstein D., and Fink G. R., 2011. Yeast: an experimental organism for 21st century biology. Genetics 189: 695–704. 10.1534/genetics.111.130765 (PMCID: PMC3213361)
  5. Botstein D., White R. L., Skolnick M., and Davis R. W., 1980. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32: 314–331.
  6. Botstein D., Amberg D., Huffaker T., Mulholland J., Adams A. et al., 1997. The yeast cytoskeleton, in Molecular and Cellular Biology of the Yeast Saccharomyces, Vol. 3, edited by Broach J. R., Pringle J. R., and Jones E. W. Cold Spring Harbor Press, New York.
  7. Bridges C. B., 1921. Current maps of the location of the mutant genes of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 7: 127–132. 10.1073/pnas.7.4.127
  8. Cherry J. M., Ball C., Weng S., Juvik G., Schmidt R. et al., 1997. Genetic and physical maps of Saccharomyces cerevisiae. Nature 387: 67–73. 10.1038/387s067
  9. Lindsley D. L., and Zimm G. G. (Editors), 1992. The Genome of Drosophila melanogaster, Vol. 1. Academic Press, San Diego.
  10. Mortimer R. K., and Schild D., 1985. Genetic map of Saccharomyces cerevisiae, edition 9. Microbiol. Rev. 49: 181–213.
  11. Sturtevant A. H., 1913. The linear arrangement of six sex‐linked factors in Drosophila, as shown by their mode of association. J. Exp. Zool. 14: 43–59. 10.1002/jez.1400140104

Articles from Genetics are provided here courtesy of Oxford University Press
