2014 May 15;10(5):e1004342. doi: 10.1371/journal.pgen.1004342

Figure 1. An ancestral recombination graph (ARG) for four sequences.

(A) Going backwards in time (from bottom to top), the graph shows how lineages that lead to modern-day chromosomes (bottom) either “coalesce” into common ancestral lineages (dark blue circles), or split into the distinct parental chromosomes that were joined (in forward time) by recombination events (light blue circles). Each coalescence and recombination event is associated with a specific time (dashed lines), and each recombination event is also associated with a specific breakpoint along the chromosomes (two breakpoints in this example). Each nonrecombining interval of the sequences (shown in red, green, and purple) corresponds to a “local tree” embedded in the ARG (shown in matching colors). Recombinations cause these trees to change along the length of the sequences, making the correlation structure of the data set highly complex.

(B) Representation of the same ARG as a sequence of local trees and the recombination events between them. A local tree is shown for each nonrecombining segment in colors matching those in (A). Each tree can be viewed as being constructed from the previous tree by placing a recombination event along a branch of the previous tree (light blue circles), breaking the branch at this location, and then allowing the broken lineage to re-coalesce to the rest of the tree (dashed lines in matching colors; new coalescence points are shown in gray). Together, the local trees and recombinations provide a complete description of the ARG. The Sequentially Markov Coalescent (SMC) approximates the full coalescent-with-recombination by assuming that each local tree is statistically independent of all earlier trees given the tree immediately preceding it.

(C) An alignment of the four sequences, corresponding to the linearized ARG shown in (B). For simplicity, only the derived alleles at polymorphic sites are shown. The sequences are assumed to be generated by a process that samples an ancestral sequence from a suitable background distribution, then allows each nonrecombining segment of this sequence to mutate stochastically along the branches of the corresponding local tree. Notice that the correlation structure of the sequences is fully determined by the local trees; that is, the sequence data are conditionally independent of the recombinations given the local trees.
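
The tree-to-tree construction described in (B) can be illustrated as a prune-and-regraft edit on a rooted binary tree. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the dictionary representation (`parent`, `age`), the node labels, and the function name `apply_recombination` are all hypothetical, and the recombination and re-coalescence points are supplied by the caller rather than sampled from the SMC.

```python
def apply_recombination(parent, age, rec_node, rec_time, coal_node, coal_time):
    """Build the next local tree from the previous one (hypothetical helper).

    parent: dict mapping each node to its parent (the root maps to None).
    age:    dict mapping each node to its time (leaves are at time 0).
    The branch above `rec_node` is broken at `rec_time`; the broken lineage
    then re-coalesces onto the branch above `coal_node` at `coal_time`.
    """
    parent, age = dict(parent), dict(age)  # work on copies, keep inputs intact
    old = parent[rec_node]                 # coalescence node above the break
    if old is None or coal_node == rec_node:
        raise ValueError("invalid recombination or re-coalescence branch")
    if not (age[rec_node] <= rec_time <= age[old]) or coal_time < rec_time:
        raise ValueError("event times are inconsistent with the tree")

    # Breaking the branch removes the old coalescence node: the sibling
    # lineage now connects directly to the former grandparent.
    sibling = next(c for c, p in parent.items() if p == old and c != rec_node)
    parent[sibling] = parent[old]
    del parent[old], age[old]
    if coal_node == old:                   # that branch now belongs to the sibling
        coal_node = sibling

    above = parent[coal_node]
    upper = float("inf") if above is None else age[above]
    if not (age[coal_node] <= coal_time <= upper):
        raise ValueError("re-coalescence time lies outside the target branch")

    # Re-coalesce: insert a new node (reusing the old label) on the target branch.
    parent[old] = above
    age[old] = coal_time
    parent[coal_node] = old
    parent[rec_node] = old
    return parent, age


# Illustrative four-leaf tree ((a,b)x,(c,d)y)r with made-up node ages.
parent = {"a": "x", "b": "x", "c": "y", "d": "y", "x": "r", "y": "r", "r": None}
age = {"a": 0, "b": 0, "c": 0, "d": 0, "x": 1.0, "y": 2.0, "r": 3.0}

# Break the branch above "a" at time 0.5 and re-coalesce with the branch above "c" at 1.5.
next_parent, next_age = apply_recombination(parent, age, "a", 0.5, "c", 1.5)
# The next tree groups "a" with "c"; "b" now attaches directly to the root "r".
```

Because the function takes only the previous tree and one recombination as input, it mirrors the SMC assumption that the next local tree depends on earlier trees only through its immediate predecessor.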