Genetics. 1999 Mar;151(3):1217–1228. doi: 10.1093/genetics/151.3.1217

The ancestry of a sample of sequences subject to recombination.

C Wiuf, J Hein
PMCID: PMC1460527  PMID: 10049937

Abstract

In this article we discuss the ancestry of sequences sampled from the coalescent with recombination under a constant population size of 2N. We study a number of variables through simulations of sample histories and derive some analytical results. Consider the leftmost nucleotide in the sequences. We show that, when two sequences are compared, the number of nucleotides sharing a most recent common ancestor (MRCA) with the leftmost nucleotide is approximately log(1 + 4NLr)/(4Nr), where L denotes the sequence length in nucleotides and r the recombination rate per generation between any two neighboring nucleotides. For larger samples, the number of nucleotides sharing an MRCA with the leftmost nucleotide decreases and becomes almost independent of 4NLr. Further, we show that a segment of the sequences sharing an MRCA consists on average of 3/(8Nr) nucleotides when two sequences are compared, and that this mean decreases toward 1/(4Nr) nucleotides when the whole population is sampled. A measure of the correlation between the genealogies of two nucleotides on two sequences is introduced. We show analytically that even when the nucleotides are separated by a large genetic distance, but share an MRCA, the genealogies will show only little correlation. This is surprising, because the time until the two nucleotides shared an MRCA is reciprocal to the genetic distance. Simulations show that the mean time until all positions in the sample have found an MRCA increases logarithmically with sequence length and is considerably lower than a theoretically predicted upper bound. The simulations also indicate that important properties of the coalescent with recombination for the whole population are reflected in the properties of a small sample.
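The closed-form approximations quoted in the abstract can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names and the parameter values in the usage note are hypothetical, and it assumes that "log" denotes the natural logarithm.

```python
import math


def shared_with_leftmost(N, L, r):
    """Approximate number of nucleotides sharing an MRCA with the leftmost
    nucleotide when two sequences are compared: log(1 + 4NLr) / (4Nr).

    Assumption: log is the natural logarithm.
    """
    return math.log1p(4 * N * L * r) / (4 * N * r)


def mean_segment_pair(N, r):
    """Mean length (in nucleotides) of a segment sharing an MRCA
    when two sequences are compared: 3 / (8Nr)."""
    return 3 / (8 * N * r)


def mean_segment_population(N, r):
    """Limiting mean segment length (in nucleotides) when the whole
    population is sampled: 1 / (4Nr)."""
    return 1 / (4 * N * r)


if __name__ == "__main__":
    # Hypothetical illustration values, not taken from the article.
    N, L, r = 10_000, 100_000, 1e-8
    print(shared_with_leftmost(N, L, r))
    print(mean_segment_pair(N, r))
    print(mean_segment_population(N, r))
```

Under these formulas the mean segment length for a pair is 1.5 times the whole-population limit regardless of N and r, and for very short sequences (4NLr much smaller than 1) the first expression is close to L, since almost every nucleotide then shares an MRCA with the leftmost one.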

Articles from Genetics are provided here courtesy of Oxford University Press
