Abstract
The genealogical relationships of individuals in a finite population can create statistical non-independence of alleles at unlinked loci. In this paper, we introduce a flexible graphical method for computing the probabilities that two individuals in a finite, randomly-mating population have the same haplotype or genotype at several loci. This method allows us to generalize the analysis of Laurie and Weir (2003) to cases with more loci and other models of mating. We show that monogamy increases the probabilities of genotypic matches at unlinked loci and that the effect of monogamy increases with the number L of loci. We conjecture a sharp upper bound on the effect of monogamy for a given L.
Keywords: match probability, product rule, unlinked, linkage disequilibrium, monogamy, match graph
1 Introduction
The probability of a complete genotypic match of two unrelated individuals at two or more unlinked loci is of importance to the forensic use of DNA typing. The question that often arises is the extent to which a genotypic match at several unlinked loci between a suspect and a blood or other sample from a crime scene indicates that the suspect is the source of the crime-scene sample (Evett and Weir, 2003). The standard procedure in US criminal courts is to assume that the probability of a genotypic match between two unrelated individuals in the same population can be obtained by assuming statistical independence of the loci. With that assumption, the probability of a genotypic match at all loci, called the random match probability (RMP), is obtained by multiplying the probabilities of genotypic matches at each locus, which are obtained from Hardy-Weinberg frequencies (Evett and Weir, 2003). This assumption, which is called the product rule in US courts, is the basis for computing such low RMPs that juries are usually convinced that a suspect whose genotype matches that from a crime-scene sample at several loci was indeed at the crime scene.
The product rule is based on the well-established population genetics theory that shows that recombination in an infinite population eliminates statistical dependence between pairs of loci, i.e., linkage disequilibrium (LD). In finite populations, however, genealogical relationships between unrelated individuals can create LD even between unlinked loci. For two loci the effect is very small (Hill and Robertson, 1968; Ohta and Kimura, 1969). Although this result supports the use of the product rule, it does not ensure that consistent deviations from the predictions of the product rule will not emerge when more than two loci are considered together. At present, 13 tetranucleotide microsatellite loci, called the Combined DNA Index System (CODIS) loci, are generally typed in the US and many other populations (the CODIS web-site is http://www.fbi.gov/hq/lab/codis/index1.htm). Because there are 78 pairs of CODIS loci, it is possible that subtle LD between each pair could result in substantial errors in the RMP for all 13 loci. In a detailed study of a very large data set of genotypes at 9 loci, Weir (2004) found approximate agreement between the numbers of individuals who had the same genotypes at 5 of 9 loci and the predictions of the product rule, provided that a large enough correction (denoted θ) for excess homozygosity was assumed.
Laurie and Weir (2003) presented a way to compute the probability that two unrelated individuals match at two and three loci in a finite randomly mating population. They showed that the product rule works quite well unless the mutation rate to new neutral alleles is unreasonably high. Their results are obtained from a system of coupled linear recurrence equations. The equilibrium match probabilities are found by assuming stationarity.
Although the method of Laurie and Weir (2003) is simple in principle, setting up the systems of recurrence equations becomes increasingly difficult for more than two unlinked loci. For the standard Wright-Fisher model of random mating, Laurie and Weir succeeded in computing the genotypic match probability for two loci and the haplotypic match probability for two and three loci, but they concluded that finding the genotypic match probability for more than two loci or the haplotypic match probability for more than three loci, “would be combinatorially very difficult.”
In this paper, we develop a simpler and more flexible framework for computing match probabilities. Using this framework, we can consider more than three loci and other models of mate choice. Our strategy is to represent match probabilities in terms of graphs. By performing a set of prescribed operations on a given graph at generation t, we determine how it is related to a linear combination of graphs at generation t − 1. The graphical method makes the combinatorial structure of the problem easier to understand. For constructing the required systems of equations, it is possible to implement our method in a fully automated program, thus reducing the chance of human error in finding the recurrence equations for a particular model. We have written such a program in Mathematica that can compute genotypic match probabilities for up to three loci and haplotypic match probabilities for up to five loci. It should be possible to analyze more loci by implementing our algorithm in a faster programming language such as C. If mutation rates at all loci are the same, then certain match probabilities become equal; this reduction in the number of independent variables should allow us to handle about twice as many loci.
In addition to the standard Wright-Fisher model of random mating, we consider a mating scheme with perfect monogamy. We show that the effect of monogamy on the L-locus match probability increases as L increases. Furthermore, for a given number of loci, we conjecture sharp upper bounds on the effect of monogamy on the haplotypic and genotypic match probabilities.
This paper is organized as follows. The models considered in this paper are described in Section 2. Our graphical framework is described in detail in Section 3, where we explain the correspondence between match probabilities and graphs, as well as the operations that one needs to perform on the graphs. Simple examples are provided in Section 4 and the main results on match probabilities are discussed in Section 5, where we also describe an approximation method and discuss the aforementioned sharp upper bounds on the effect of monogamy on match probabilities. We conclude with a discussion in Section 6.
2 Model Description
Some frequently used symbols are listed in Table 1. Throughout, we assume a neutral infinite-alleles model for a single population containing N diploid individuals where N is assumed to be large. By a gamete, we simply mean a collection of loci; different loci may physically reside on different chromosomes. We assume that generations are non-overlapping and that mutations occur at locus i with probability μi per gamete per generation, independently of other loci.
Table 1.
Frequently used notation.
| Notation | Explanation |
|---|---|
| 2N | Number of gametes in each generation. |
| L | Number of loci. |
| μi | Per gamete per generation mutation rate at locus i. |
| xi | Allele at locus i in either a haplotypic or a genotypic sequence (it will be clear which from context). |
| x | A haplotypic or a genotypic sequence x = x1x2 … xL. |
| xi ≡ yi | Allele xi matches allele yi. |
| x ≡ y | Allele xi matches allele yi for all loci i = 1,… L. |
| ℙh(xi ≡ yi) | One-locus haplotypic match probability for locus i. |
| ℙh(x ≡ y) | L-locus haplotypic match probability. |
| ℙg(xi≡ yi) | One-locus genotypic match probability for locus i. |
| ℙg(x ≡ y) | L-locus genotypic match probability. |
|  | The ratio under unconstrained and perfect monogamy mating schemes, respectively. |
|  | The ratio under unconstrained and perfect monogamy mating schemes, respectively. |
We use xi to denote the allele at locus i in gamete x. When many gametes are considered, a superscript is sometimes used to distinguish different gametes. For example, $x_i^k$ denotes the allele at locus i in gamete $x^k$. Our convention differs from that of Laurie and Weir (2003), who use subscripts to denote gamete labels. In their notation ai denotes the allele at locus a in gamete i.
2.1 Mating schemes
How gametes in the next generation are produced from those in the current generation depends on the assumed mating scheme. In this paper we consider the following two random mating schemes:
Unconstrained mating
Randomly sample two gametes, each with replacement. The same gamete may be sampled twice under this mating scheme. A new gamete is produced as a mosaic of the two samples (as described below). This is the standard Wright-Fisher model and the work of Laurie and Weir (2003) pertains to this model. With probability μi, the offspring gamete has an allele at locus i that has never been seen before.
Perfect monogamy
Before sampling, first randomly partition the 2N gametes into a set of N disjoint pairs. To create an offspring gamete, randomly sample a pair from the set of pairs, replacing the pair after sampling. As in unconstrained mating, a new gamete is produced as a mosaic of the two sampled gametes (see below), and with probability μi, the offspring gamete has an allele at locus i that has never been seen before. Unlike in unconstrained mating, the two parental gametes are always different gametes, though they may be identical by state.
2.2 Inheritance pattern of the offspring gamete
Two loci
Let x1x2 and y1y2 denote the two sampled parental gametes. Then, the inheritance pattern of the offspring gamete is x1x2, y1y2, x1y2, or y1x2, with probability $(1-r)/2$, $(1-r)/2$, $r/2$, or $r/2$, respectively, where r denotes the recombination probability between the two loci. Note that r = 1/2 corresponds to the case of unlinked loci.
More than two loci
Let x1x2 … xL and y1y2 … yL denote the two sampled parental gametes with L loci. For ease of discussion, we focus on a set of loci that are pairwise unlinked, as was done previously by other authors (Strobeck and Golding, 1983; Laurie and Weir, 2003). Hence, in the offspring gamete z1z2 … zL, the allele zi at locus i is equally likely to have descended from xi or yi. The probability of any particular inheritance pattern is $1/2^L$.
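The two mating schemes and the unlinked-loci inheritance rule can be sketched as a forward simulation step. This is only an illustrative sketch: the function names (`make_allele_source`, `make_offspring`, `next_generation`) and the representation of a gamete as a list of allele labels are assumptions made for this example and are not part of the paper.

```python
import itertools
import random


def make_allele_source(start=0):
    """Return a function producing never-before-seen allele labels (infinite alleles)."""
    counter = itertools.count(start)
    return lambda: next(counter)


def make_offspring(parent_x, parent_y, mu, rng, new_allele):
    """Build one offspring gamete as a mosaic of two parental gametes (unlinked loci)."""
    child = []
    for i, (a, b) in enumerate(zip(parent_x, parent_y)):
        # Each locus descends from either parent with probability 1/2, independently.
        allele = a if rng.random() < 0.5 else b
        # With probability mu[i] the allele mutates to a brand-new allele.
        if rng.random() < mu[i]:
            allele = new_allele()
        child.append(allele)
    return child


def next_generation(gametes, mu, rng, new_allele, monogamy=False):
    """Produce the next generation of len(gametes) = 2N gametes."""
    n = len(gametes)
    pairs = []
    if monogamy:
        # Perfect monogamy: partition the 2N gametes into N disjoint pairs once.
        order = list(range(n))
        rng.shuffle(order)
        pairs = [(order[j], order[j + 1]) for j in range(0, n, 2)]
    children = []
    for _ in range(n):
        if monogamy:
            ix, iy = pairs[rng.randrange(len(pairs))]   # sample a pair, with replacement
        else:
            ix, iy = rng.randrange(n), rng.randrange(n)  # may sample the same gamete twice
        children.append(make_offspring(gametes[ix], gametes[iy], mu, rng, new_allele))
    return children
```

Under monogamy the two parental indices are always distinct, whereas under unconstrained mating they may coincide, which is the distinction the vertex merge rules of Section 3.4 encode.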
3 Graphical Framework: Overall Idea
In this section, we lay out our strategy, explaining the correspondence between match probabilities and graphs, and that between the events in the assumed reproduction model and certain operations on graphs. In the previous section, we described a forward perspective on genealogy. Here, we adopt a backward point of view and determine how a match probability at generation t is related to a combination of match probabilities at generation t − 1. Henceforward, L denotes the number of loci.
3.1 Graphical representation of match probabilities
We use xi ≡ yi to denote that alleles at locus i are identical in gametes x and y. To a particular match probability (e.g., the probability of (xi ≡ yi) ⋀ (xj ≡ zj) ⋀ (yk ≡ zk)), we associate a fully-labeled graph as follows:
Vertex: Create a vertex labeled x for gamete x.
Edge: Draw an edge labeled i between vertices x and y if and only if xi ≡ yi.
For example, shown in Figure 1 are two graphs G1 and G2 which correspond to the match probabilities ℙ(x1 ≡ y1, x2 ≡ y2, x3 ≡ z3) and ℙ(x1 ≡ y1, x2 ≡ y2, y3 ≡ z3), respectively. Under random mating, note that these two probabilities are equal. More generally, any two match probabilities are equal under random mating if they are related by some permutation of the gamete labels. In terms of our graphical representation, this equality of match probabilities translates to the following equivalence relation: two fully-labeled graphs (i.e., all vertices and edges are labeled) are equivalent if they are isomorphic as edge-labeled graphs (i.e., ignoring vertex labels). In Figure 1, G1 and G2 are equivalent since they are isomorphic as edge-labeled graphs. In terms of this graphical framework, our objective is as follows.
Figure 1. Examples of fully-labeled graphs.

Vertex labels correspond to gamete labels and edge labels denote loci. The graph G1 represents the match probability ℙ(x1 ≡ y1, x2 ≡ y2, x3 ≡ z3), whereas G2 represents ℙ(x1 ≡ y1, x2 ≡ y2, y3 ≡ z3). Ignoring the vertex labels, these graphs are isomorphic as edge-labeled graphs. Under random mating, ℙ(x1 ≡ y1, x2 ≡ y2, x3 ≡ z3) = ℙ (x1 ≡ y1, x2 ≡ y2, y3 ≡ z3), and G1 and G2 are considered equivalent.
Main Goal
To develop a graphical method of setting up systems of equations that correctly relate edge-labeled graphs, in the same way that corresponding match probabilities are related.
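The equivalence of fully-labeled graphs described in Section 3.1 can be checked by brute force for the small graphs considered here. The sketch below assumes a graph is given as a list of `(vertex, vertex, locus)` triples; the function name `canonical_form` is hypothetical, and trying every vertex relabeling is a simple stand-in for a proper isomorphism test, not the method used in the authors' program.

```python
from itertools import permutations


def canonical_form(edges):
    """Return a vertex-label-free key for an edge-labeled (multi)graph.

    Two graphs get the same key exactly when they are isomorphic as
    edge-labeled graphs. Vertices without edges never appear in `edges`,
    so isolated vertices are ignored automatically.
    """
    vertices = sorted({v for a, b, _ in edges for v in (a, b)})
    best = None
    for perm in permutations(range(len(vertices))):
        relabel = dict(zip(vertices, perm))
        key = tuple(sorted(
            (min(relabel[a], relabel[b]), max(relabel[a], relabel[b]), locus)
            for a, b, locus in edges
        ))
        if best is None or key < best:
            best = key
    return best


# G1 and G2 of Figure 1, written as (gamete, gamete, locus) triples.
G1 = [("x", "y", 1), ("x", "y", 2), ("x", "z", 3)]
G2 = [("x", "y", 1), ("x", "y", 2), ("y", "z", 3)]
```

Comparing `canonical_form(G1)` with `canonical_form(G2)` then decides whether the two match probabilities are equal under random mating.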
3.2 Mutations (Vertex Count)
Let $x_i^1, \ldots, x_i^k$ denote a set of alleles at locus i in k gametes at time t. Under an infinite-alleles model, the alleles all match only if their parental alleles at time t − 1 all match and no mutation occurs between times t − 1 and t in the lineages relating them to their parents. Hence, the probability of any match relation at time t that requires $x_i^1 \equiv \cdots \equiv x_i^k$ must contain an overall factor of $(1-\mu_i)^k$ when written in terms of match probabilities at time t − 1. This fact translates to the following statement in our graphical representation:
Given a graph G, let V (G) denote the set of all vertices in G, and, for υ ∈ V (G), define
$$\delta_i(\upsilon) := \begin{cases} 1, & \text{if } \upsilon \text{ is incident with an edge labeled } i, \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$
That is, δi(υ) is an indicator variable that says whether the gamete associated with vertex υ is involved in a match relation at locus i. The total number of gametes involved in match relations at locus i is denoted by $\delta_i(G) := \sum_{\upsilon \in V(G)} \delta_i(\upsilon)$. When relating G to graphs in the previous generation, there will be an overall factor of
$$\prod_{i=1}^{L} (1-\mu_i)^{\delta_i(G)}.$$
For instance, each of the graphs shown in Figure 2 has δ1(G) = δ2(G) = 2, so the corresponding probability of each graph is proportional to $(1-\mu_1)^2(1-\mu_2)^2$.
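The vertex count can be illustrated with a few lines of code. The sketch assumes a graph is a list of `(vertex, vertex, locus)` triples and that `mu` maps each locus to its mutation rate; both the representation and the function names are choices made for this example.

```python
def delta_counts(edges):
    """Return {locus i: delta_i(G)}, the number of vertices touching an edge labeled i."""
    involved = {}
    for a, b, locus in edges:
        involved.setdefault(locus, set()).update((a, b))
    return {locus: len(vertices) for locus, vertices in involved.items()}


def mutation_factor(edges, mu):
    """Overall no-mutation factor: product over loci of (1 - mu_i) ** delta_i(G)."""
    factor = 1.0
    for locus, count in delta_counts(edges).items():
        factor *= (1.0 - mu[locus]) ** count
    return factor


# Two two-locus graphs with delta_1(G) = delta_2(G) = 2.
same_pair = [("x", "y", 1), ("x", "y", 2)]   # both loci match between x and y
chain = [("x", "y", 1), ("y", "z", 2)]       # locus 1 on x-y, locus 2 on y-z
```

Both example graphs yield the same factor, in line with the statement that the factor depends only on how many gametes are involved at each locus.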
Figure 2.

Two-locus match probabilities each proportional to $(1-\mu_1)^2(1-\mu_2)^2$.
3.3 Inheritance pattern across loci for each gamete (Vertex Split)
Here, we consider only a single gamete at time t and investigate the inheritance pattern across its loci. When more than one gamete is considered at time t, we also need to consider how they can share parental gametes. That will be discussed in the next subsection.
By “δ-degree” of a vertex υ, we mean the sum $\sum_{i=1}^{L} \delta_i(\upsilon)$, where δi(υ) is defined in (1); it is equal to the total number of distinctly labeled edges incident with υ. In the graphs corresponding to haplotypic match probabilities, each edge label appears at most once, so the δ-degree of any vertex coincides with its ordinary degree, the total number of edges incident with the vertex.
Two loci
Consider the case of two loci. Let x and y denote the two gametes sampled at time t − 1, giving rise to a child gamete h at time t. With probability r, one of the two loci in h has descended from x and the other from y, while with probability 1 − r, both loci in h have descended from a single parental gamete.
Let R denote a match relation at time t and G the corresponding match graph. Suppose first that only one of the two loci in a gamete is involved in R. For example, in R = (x1 ≡ y1) ⋀ (y2 ≡ z2), locus 2 of gamete x and locus 1 of gamete z are not involved in the match relation. Then, since we only need to track ancestral loci, we do not need to consider the possibility of the gamete having two parental gametes. Suppose instead that both loci in gamete h are involved in R, so that the vertex labeled h in G has δ-degree 2. If gamete h has two parental gametes, each contributing one locus to h, then that is represented in our graphical framework by splitting the vertex h into two vertices, distributing the edges that used to be incident with h such that each new vertex has δ-degree 1. An example is shown on the left hand side of Figure 3.
Figure 3. Illustration of vertex split operations on match graphs for two loci.

Vertex h has δ-degree 2. On the left hand side, vertex h is split into two vertices, and the edges that used to be incident with h are divided between the two new vertices such that each new vertex has δ-degree 1. On the right hand side, no vertex split operation is performed.
A graph obtained from splitting zero or more δ-degree-2 vertices in G is called a split graph of G, and G is called a pivot graph. The two new vertices that result from a vertex split are called a split pair. If G contains at least one δ-degree-2 vertex, then more than one inequivalent split graph can be obtained. Note that a split graph is only an intermediate graph that is useful for relating a pivot graph at time t to a set of relevant match graphs at time t − 1.
More than two loci
Suppose that L > 2. For ease of discussion, we focus on a set of loci that are pairwise unlinked. A case with linked loci can easily be accommodated in our framework by introducing more parameters (recombination rates) and putting constraints on vertex split operations.
Let D = {1, 2,…, n}, where n ≤ L, denote the set of distinct loci in gamete h that are involved in a match relation R. Let B1 ⊔ B2 denote a bipartition of D into two disjoint subsets, such that the loci in B1 and those in B2 come from different parental gametes. (Note that if the bipartition is Ø ⊔ D, then effectively there is only one parental gamete.) There are $2^{n-1}$ inequivalent bipartitions of D, and we assume that each bipartition has probability $1/2^{n-1}$. In the graph G corresponding to R, the vertex labeled h has δ-degree n, and the bipartition of D into {i1,…, ik} ⊔ {ik+1,…, in} corresponds to splitting h into two vertices υ1 and υ2, such that of all edges that used to be incident with h in G, those that had labels in Bi now become incident with υi, for i = 1, 2. An example is shown in Figure 4.
Figure 4. A split of a δ-degree-5 vertex in a model with unlinked loci.

This vertex split corresponds to a bipartition of {1,…, 5} into {1, 4} and {2, 3, 5}. These are not entire graphs; only the parts relevant for illustrating a vertex split are shown here.
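The set of vertex splits available at a δ-degree-n vertex can be enumerated directly. The following sketch lists the unordered bipartitions B1 ⊔ B2 of the loci at a vertex; the function name `bipartitions` is hypothetical, and fixing the smallest locus in B1 is just one convenient way to avoid counting B1 ⊔ B2 and B2 ⊔ B1 twice.

```python
from itertools import combinations


def bipartitions(loci):
    """List all unordered bipartitions (B1, B2) of `loci`, including (D, empty set)."""
    loci = sorted(loci)
    if not loci:
        return [(frozenset(), frozenset())]
    first, rest = loci[0], loci[1:]
    result = []
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            b1 = frozenset((first,) + extra)   # the block containing the smallest locus
            b2 = frozenset(loci) - b1          # the complementary block (may be empty)
            result.append((b1, b2))
    return result


splits = bipartitions({1, 2, 3, 4, 5})
probability_of_each = 1 / len(splits)   # unlinked loci: every bipartition equally likely
```

For five loci this lists sixteen bipartitions, one of which is the split {1, 4} ⊔ {2, 3, 5} of Figure 4.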
3.4 Sharing of parental gametes (Vertex Merge)
As described above, a vertex split operation is used to capture that a gamete at time t has inherited at least one locus from each of the two sampled gametes at time t − 1 (cf. Section 2.1). We now need to consider the possibility of a gamete at time t − 1 being a common parental gamete of two or more gametes at time t. This sharing of a parental gamete translates to merging relevant vertices in the split graph into a single vertex. The precise pattern of allowed sharing of parental gametes depends on the assumed mating scheme, and so do the set of allowed vertex merge operations and their associated probabilities. In what follows, we adopt the following convention:
Convention 1 When a set of vertices merge into a single vertex, we remove all edges that used to join any pair of vertices in that set.
Consider the example shown in Figure 5. The leftmost graph GP is a pivot graph corresponding to the probability of the match relation (x1 ≡ y1) ⋀ (x2 ≡ y2) at time t. Since there are two vertices in GP each with δ-degree greater than 1, we can perform zero, one or two vertex splits in GP. Shown in the middle of Figure 5 is the split graph GS obtained from two vertex splits in GP. We have given different labels to the vertices in GS for ease of discussion, but we are not saying that they necessarily correspond to distinct gametes at time t − 1. Graph GM1 on the right hand side of Figure 5 does correspond to the case in which all four vertices are associated with distinct gametes. If more than one vertex in GS in fact corresponds to the same gamete at time t − 1, then that is represented by merging those vertices into a single vertex.
Figure 5. Examples of vertex split and merge operations under unconstrained mating scheme.

There are other possible vertex merge operations not shown here. Further, there are other split graphs, obtained from either zero or one vertex split.
Unconstrained Mating
Under unconstrained mating, recall that the same gamete may be sampled twice, and each of the sampled gametes may transmit genetic material to its offspring. Hence, going backwards in time, an offspring gamete splits into two parental gametes as a consequence of “recombination” and then the latter two gametes may immediately find a common ancestor in the previous generation. Analogously, two vertices in GS that are a split pair (e.g., vertices w and x or y and z in GS in Figure 5), may merge into the same vertex. More generally, following a similar line of reasoning, we see that any set of vertices in GS may merge into a single vertex under unconstrained mating. This fact simplifies things considerably since we do not need to keep track of which vertices are a split pair.
Under unconstrained mating, determining the probability associated with a given merge operation on a given split graph is straightforward. Suppose that a split graph GS contains n vertices labeled by [n] = {1, 2,…, n}. Then, under unconstrained mating, there exists a one-to-one correspondence between the set of all vertex merge operations on GS and the set of all partitions of [n] into non-empty subsets; each subset of [n] corresponds to those vertices that merge. A partition of [n] into k non-empty subsets defines a particular case of assigning n labeled gametes to k distinct unlabeled parental gametes, with each of those k parents having at least one child. It is easy to see that the probability of such a choice under unconstrained mating is given by
$$f(n, k) := \frac{(2N)_{(k)}}{(2N)^{n}}, \tag{2}$$
where $z_{(k)}$ denotes the falling factorial z(z − 1) ⋯ (z − k + 1). Hence, the probability of a particular set of vertex merges in GS such that k vertices remain is given by f(n, k). It is important to note that different sets of merges can produce graphs that are equivalent. For example, consider GM2 on the right hand side of Figure 5. There are four different merge operations on GS (namely, merge w with x, w with z, y with x, or y with z) that produce match graphs equivalent to GM2 as edge-labeled graphs. Hence, the probability of obtaining GM2 from GS through merge operations is 4 × f(4, 3). In contrast, there exists a unique merge operation that produces GM3 from GS, and therefore the probability of obtaining GM3 from GS is f(4, 3). The same holds for GM4.
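A quick sanity check on the merge probabilities is that they sum to 1 over all partitions of the n vertices. The sketch below assumes that (2) has the form f(n, k) = (2N)(2N − 1) ⋯ (2N − k + 1) / (2N)^n, which is what the verbal description (n labeled gametes choosing k distinct parents among 2N) suggests; the helper names are made up for this example. It uses Stirling numbers of the second kind to count the partitions with k blocks.

```python
from fractions import Fraction


def falling_factorial(z, k):
    """z (z - 1) ... (z - k + 1), with the empty product equal to 1."""
    result = 1
    for j in range(k):
        result *= z - j
    return result


def f(n, k, N):
    """Assumed form of (2): probability of one specific merge pattern leaving k vertices."""
    return Fraction(falling_factorial(2 * N, k), (2 * N) ** n)


def stirling2(n, k):
    """Number of partitions of n labeled items into k non-empty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def total_merge_probability(n, N):
    """Sum of f(n, k) over all partitions of n vertices; should equal 1."""
    return sum(stirling2(n, k) * f(n, k, N) for k in range(1, n + 1))
```

For n = 4 there are six partitions into three blocks, which matches the count in the text: four merges giving GM2 plus one each for GM3 and GM4.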
Note that graphs GM3 and GM4 each contain an isolated vertex (a vertex with no incident edges). Such a vertex is not involved in any match relation and therefore can be ignored. We say that two graphs are i-equivalent, denoted by ḭ, if they become isomorphic as edge-labeled graphs after dropping isolated vertices. (See Figure 6 for examples.) Two i-equivalent graphs correspond to the same match probability. If a graph only contains isolated vertices, then it defines no match relation, and the associated probability is defined to be 1.
Figure 6. Examples of i-equivalent graphs.

Two graphs are said to be i-equivalent, denoted by ḭ if they become isomorphic as edge-labeled graphs after dropping isolated vertices.
Perfect Monogamy
In the case of perfect monogamy, vertex merge operations need to be constrained and merge probabilities modified. One needs to keep track of which vertices in each split graph are a split pair, to determine allowed merge operations. So, in drawing a split graph, we add a new edge labeled “s” between the two vertices in each split pair. The perfect monogamy condition imposes the following two constraints on vertex merges:
Two vertices joined by an edge labeled “s” may not merge. (Two gametes sampled under perfect monogamy, as described in Section 2.1, are necessarily different gametes, so if the offspring gamete is obtained via “recombination”, it must have two different parental gametes.)
Vertex merges may not produce a non-cyclic length-2 path with both edges labeled “s”. (If two gametes at time t each have two parental gametes at time t − 1, then their sets of parental gametes are either disjoint or the same, i.e., there can be no half-sibs.)
In addition to Convention 1, we remove all edges labeled “s” after vertex merge operations are complete. The above constraints imply that, under perfect monogamy, GM2, GM3, and GM4 in Figure 5 cannot be obtained from GS; i.e., the corresponding merge operations have probability zero under perfect monogamy. The graphs that can be obtained from allowed merge operations on GS are shown in Figure 7.
Figure 7. Examples of vertex split and merge operations under perfect monogamy.

In GS, an edge labeled “s” joins two vertices that are a split pair. No other vertex merge operations are possible for the given GS. There still are other split graphs, obtained from either zero or one vertex split.
For a given split graph GS of a pivot graph GP, label the vertices in the split graph with [n]. Let $\mathcal{X} = \{X_1, \ldots, X_k\}$ denote a partition of [n] into k non-empty subsets X1,…, Xk. The partition $\mathcal{X}$ defines a set of merges in GS, collapsing all vertices in Xi into a single vertex, for each i = 1,…, k. Let GM denote the graph resulting from those merge operations, and define
$$S := \{\upsilon \in [n] : \upsilon \text{ belongs to a split pair in } G_S\}, \qquad T := \{X_i \in \mathcal{X} : X_i \cap S \neq \emptyset\}, \qquad U := \mathcal{X} \setminus T.$$
Note that |T| + |U| = k is the number of vertices in GM, before dropping any isolated vertices. The set T corresponds to the vertices in GM that the vertices in S will map to under the merge operation defined by $\mathcal{X}$, whereas the set U corresponds to the remaining vertices in GM. Then, as described in Appendix A, the probability of the set of vertex merges corresponding to $\mathcal{X}$ is given by
| (3) |
provided that the merges are consistent with the aforementioned two constraints for perfect monogamy. Otherwise, the probability is defined to be zero. For a split graph obtained with no split operations, S = Ø, T = Ø, and |U| = k; and therefore (3) reduces to (2). (We use the convention that a product of form is defined to be 1 if l ≤ 0.)
Example
Consider the split graph GS shown in Figure 7. To distinguish edge labels from vertex labels, we have labeled the four vertices in GS with Ψ = {w, x, y, z} instead of [4] = {1, 2, 3, 4}. Since {w, x} and {y, z} are both split pairs in GS, we obtain S = Ψ, T equal to the set of all subsets in the partition, and U = Ø for all partitions of Ψ. Any partition that places w and x in the same subset is not compatible with perfect monogamy (since w, x are a split pair, they are not allowed to merge). The partition {{w}, {x}, {y}, {z}} is compatible with perfect monogamy and the corresponding merge operation produces GM1. Using n = 4, |S| = 4, |T| = 4, |U| = 0 in (3), we obtain (N − 1)/N for the probability of that merge operation. The two remaining compatible partitions, {{w, y}, {x, z}} and {{w, z}, {x, y}}, produce GM2 and GM3; for each of them, using n = 4, |S| = 4, |T| = 2, |U| = 0 in (3) produces 1/(2N). More examples can be found in Section 4.3.
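The two monogamy constraints can be checked mechanically for this example by enumerating all partitions of Ψ. The sketch below is one reading of the constraints, stated as an assumption: a merge is rejected if a split pair lands in one subset, or if some merged vertex ends up joined by “s” edges to two different merged vertices (a non-cyclic length-2 “s” path). The function names are hypothetical.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of sets."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [partition[i] | {first}] + partition[i + 1:]
        # ... or into a new block of its own.
        yield partition + [{first}]


def allowed_under_monogamy(partition, split_pairs):
    """Check the two perfect-monogamy constraints for one set of merges."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    s_neighbours = {}
    for a, b in split_pairs:
        ia, ib = block_of[a], block_of[b]
        if ia == ib:
            return False                      # constraint 1: a split pair may not merge
        s_neighbours.setdefault(ia, set()).add(ib)
        s_neighbours.setdefault(ib, set()).add(ia)
    # Constraint 2: no merged vertex may have "s" edges to two different vertices.
    return all(len(nb) <= 1 for nb in s_neighbours.values())


split_pairs = [("w", "x"), ("y", "z")]
allowed = [p for p in set_partitions("wxyz") if allowed_under_monogamy(p, split_pairs)]
```

Of the fifteen partitions of Ψ, only three pass both checks, and the probabilities quoted in the example, (N − 1)/N, 1/(2N) and 1/(2N), sum to 1 as they should.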
3.5 Summary
Schematically illustrated in Figure 8 is our method of generating the equation that relates a match probability at time t to appropriate match probabilities at time t − 1. Our strategy is to express a pivot graph GP at time t in terms of GMi at time t − 1, by considering all allowed vertex split and merge operations. In this framework, it is easy to keep track of the combinatorial factors and the probabilities associated with inheritance patterns and sharing of parental gametes.
Figure 8. Schematic summary of our graphical approach.

For each pivot graph GP, all allowed vertex split and merge operations are considered, keeping track of the corresponding probabilities. The pivot GP can be written as a linear combination of the resulting GMi.
Here is how our graphical framework can be used in practice. Suppose the match probability associated with a particular graph H is not known. To compute it, we need to find a closed system ℰ of equations that has H as one of its unknown variables. The construction keeps track of four sets of graphs: the known graphs, whose associated match probability values have already been determined; the pending graphs, on which vertex split and merge operations need to be performed; the new unknown graphs reached from the pending graphs via vertex split and merge operations; and V, the set of all variables in ℰ. With H as the only pending graph, no new unknown graphs, and V = Ø as initialization, our algorithm for constructing ℰ goes as follows:
1. For each pending pivot graph GP, consider all possible vertex split operations, producing a set SGP of split graphs. Record the probability of obtaining each split graph.
2. For all pending pivot graphs GP, in any order, carry out the following steps:
   (a) For each graph in SGP, consider all allowed vertex merge operations, again keeping track of the associated probabilities. Let ℳGP denote the set of all graphs obtained after considering the entire SGP. Now, GP can be written in terms of the graphs in ℳGP, with appropriate coefficients determined by split, merge and mutation probabilities.
   (b) Update the set of new unknown graphs by adding every graph in ℳGP that is not known, not pending, and not already in V.
3. Add all pending graphs to V.
4. If the set of new unknown graphs is non-empty, make it the new set of pending graphs and reset the set of new unknown graphs to Ø. Then, go back to step 1. If it is empty, then a closed system of equations has been obtained for the graphs in V and it can be solved.
Some explicit examples are provided in the following section.
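The closure procedure of Section 3.5 is essentially a worklist algorithm, and can be sketched as follows. This is a generic illustration, not the authors' Mathematica program: `expand` is a hypothetical user-supplied function that performs all split and merge operations on a pivot graph and returns its right-hand side as a dictionary from graphs to coefficients, and graphs are assumed to be hashable canonical keys.

```python
from collections import deque


def build_closed_system(target, expand, known):
    """Collect recurrence equations until no new unknown graph appears.

    target : the graph whose match probability is wanted
    expand : function graph -> {successor graph: coefficient}
    known  : set of graphs whose match probabilities are already determined
    Returns {graph: {successor: coefficient}} for every unknown graph reached.
    """
    equations = {}
    pending = deque([target])
    while pending:
        graph = pending.popleft()
        if graph in equations or graph in known:
            continue                      # already a variable, or already solved
        terms = expand(graph)
        equations[graph] = terms
        for successor in terms:
            if successor not in equations and successor not in known:
                pending.append(successor)  # a new unknown graph to process later
    return equations


# Toy dependency structure mimicking the two-locus example of Section 4:
# G3 depends on G3, G2, G1; G2 on G2, G1; G1 on G1; "P1" stands for a known
# one-locus graph. The coefficients here are placeholders, not real values.
toy = {
    "G3": {"G3": 0.1, "G2": 0.2, "G1": 0.3, "P1": 0.4},
    "G2": {"G2": 0.1, "G1": 0.2, "P1": 0.3},
    "G1": {"G1": 0.1, "P1": 0.2},
}
system = build_closed_system("G3", lambda g: toy[g], known={"P1"})
```

Once the dictionary of equations is complete, the stationary match probabilities follow from solving the corresponding linear system.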
4 Examples of Closed Systems of Equations
In this section, we consider some simple examples to elucidate the graphical framework described in the previous section. We adopt the following notational convention when discussing two-locus examples:
Convention 2 For two loci, there are only two edge types. So, to simplify notation, we adopt the convention of drawing edges for locus 1 (respectively, locus 2) as arcs above (respectively, below) vertices.
4.1 Simplest example
Most mating schemes have the same expression for the probability of xi ≡ yi, a one-locus match relation involving two gametes. As illustrated in Figure 9, the recurrence equation for ℙ(xi ≡ yi) and its solution at stationarity can easily be obtained using the graphical approach described above.
Figure 9. The equilibrium equation satisfied by the one-locus match probability ℙh(xi ≡ yi).

Here, f(n, k) is defined as in (2) and the factor $(1-\mu_i)^2$ arises as explained in Section 3.2. In deriving the recurrence equation, one needs to recall that a graph consisting of a single isolated vertex has probability 1.
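The one-locus case can be worked out numerically. The sketch below assumes that the equilibrium equation of Figure 9 reads P = (1 − μ)² [f(2, 1) + f(2, 2) P] with f(2, 1) = 1/(2N) and f(2, 2) = 1 − 1/(2N), which is what the description above implies; the function name is chosen for this example.

```python
def one_locus_match_probability(N, mu):
    """Stationary one-locus haplotypic match probability under the assumed recurrence.

    Solves P = (1 - mu)^2 * (c + (1 - c) * P) for P, where c = 1 / (2N) is the
    probability that two gametes share a parental gamete.
    """
    survive = (1.0 - mu) ** 2      # neither lineage mutates in one generation
    coalesce = 1.0 / (2 * N)       # f(2, 1): the two vertices merge
    return survive * coalesce / (1.0 - survive * (1.0 - coalesce))


p = one_locus_match_probability(N=10_000, mu=1e-4)
```

For small μ and large N the result is close to the familiar approximation 1/(1 + 4Nμ), and setting μ = 0 gives a match probability of 1, consistent with the check described in Section 4.2.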
4.2 Unconstrained mating example
We consider two-locus examples in the remainder of this section. Assuming stationarity and unconstrained mating, it is straightforward to obtain the system of coupled linear equations shown in Figure 10. Let G1, G2, and G3 denote the graphs on the left hand sides of those three equations, respectively, from top to bottom. Note that G1 does not contain any vertex with δ-degree greater than 1, so no vertex split is possible. Modulo $(1-\mu_1)^2(1-\mu_2)^2$, the expression on the right hand side of the equation for G1 is obtained from considering all possible merge operations on G1. The same combination of terms, denoted Ω1, also appears in the equation for G2, since G1 can be obtained from a vertex split operation on G2 and there are no constraints on vertex merges. The remaining terms, denoted Ω2, arise from performing all possible vertex merges in G2 without any vertex split. Note that Ω1 and Ω2 appear in the equation for G3, corresponding to performing two and one vertex splits, respectively, in G3, followed by all possible vertex merges. Notice the factor of 2 in 2r(1 − r) Ω2; it comes from the fact that the two possible ways of applying a single vertex split in G3 produce equivalent split graphs.
Figure 10. A closed system of coupled equations under unconstrained mating.

We use G1, G2 and G3 to refer to the graphs on the left hand side of the first, the second, and the third equation, respectively. These equations should be compared with the equations for perfect monogamy in Figure 11.
For μ1 = μ2 = 0, all match probabilities are equal to 1, and indeed the right hand side of each equation in Figure 10 sums to 1 in that case. Such consistency conditions are useful for checking that coefficients in recurrence equations have been determined correctly. Since the one-locus match probability ℙh(xi ≡ yi) can be determined as shown in Figure 9, the equations in Figure 10 form a closed system of coupled equations that can be solved for G1, G2, and G3.
4.3 Perfect monogamy example
We now consider the same three graphs G1, G2, G3 under the perfect monogamy model. For each graph, we need to consider the same set of vertex split operations as in the unconstrained mating scheme. However, vertex merges are constrained under perfect monogamy, and the allowed merges carry probabilities different from the corresponding merges under unconstrained mating. Using the allowed vertex merges described in Section 3.4 for perfect monogamy and the merge probability given in (3), at stationarity we obtain the set of equations shown in Figure 11. For μ1 = μ2 = 0, the right hand side of each equation correctly sums to 1 when all match probabilities are set to 1. As in the unconstrained mating case, these equations form a closed system of coupled equations, and we can solve it for G1, G2, and G3.
Figure 11. A closed system of coupled equations under perfect monogamy.

We use G1, G2 and G3 to refer to the graphs on the left hand side of the first, the second, and the third equation, respectively. These equations should be compared with the equations for unconstrained mating in Figure 10.
5 Match Probabilities
Given two gametes $h = h_1 h_2 \cdots h_L$ and $h' = h'_1 h'_2 \cdots h'_L$ randomly sampled without replacement, we define $\mathbb{P}_h(h \equiv h')$ as the L-locus haplotypic match probability. The product rule probability is given by $\prod_{i=1}^{L} \mathbb{P}_h(h_i \equiv h'_i)$, where $\mathbb{P}_h(h_i \equiv h'_i)$ is the one-locus match probability for locus i. We are interested in studying the following ratio:

$$R_h = \frac{\mathbb{P}_h(h \equiv h')}{\prod_{i=1}^{L} \mathbb{P}_h(h_i \equiv h'_i)}.$$

To study genotypic match probabilities, we consider two pairs of gametes sampled without replacement. Each pair of gametes defines an individual’s genotypic sequence. Let $g = g_1 g_2 \cdots g_L$ and $g' = g'_1 g'_2 \cdots g'_L$ denote the two genotypic sequences so obtained. We are interested in the ratio

$$R_g = \frac{\mathbb{P}_g(g \equiv g')}{\prod_{i=1}^{L} \mathbb{P}_g(g_i \equiv g'_i)},$$

with $\mathbb{P}_g(g \equiv g')$ being the L-locus genotypic match probability and $\mathbb{P}_g(g_i \equiv g'_i)$ the one-locus genotypic match probability for locus i.
In what follows, the superscript “U” is used to refer to the unconstrained mating scheme, whereas “M” is used to refer to the perfect monogamy model. The one-locus haplotypic match probability for unconstrained mating is equal to that for perfect monogamy. Similarly, the one-locus genotypic match probability for unconstrained mating is equal to that for perfect monogamy. Hence, it follows that

$$\frac{R_h^M}{R_h^U} = \frac{\mathbb{P}_h^M(h \equiv h')}{\mathbb{P}_h^U(h \equiv h')}, \qquad \frac{R_g^M}{R_g^U} = \frac{\mathbb{P}_g^M(g \equiv g')}{\mathbb{P}_g^U(g \equiv g')},$$

and these ratios capture the effect of monogamy on the L-locus match probability. At the end of this section, we conjecture sharp upper bounds on these ratios.
5.1 Two-locus haplotypic match probability
As a warm-up exercise, we first consider the two-locus haplotypic match probability. Given a random pair of gametes $h = h_1 h_2$ and $h' = h'_1 h'_2$, we are interested in comparing the two-locus haplotypic match probability $\mathbb{P}_h(h \equiv h')$ with the product $\mathbb{P}_h(h_1 \equiv h'_1)\,\mathbb{P}_h(h_2 \equiv h'_2)$. In our graphical framework, $\mathbb{P}_h(h \equiv h')$ is as shown in Figure 12. Hence, we can compute $\mathbb{P}_h(h \equiv h')$ for unconstrained mating and for perfect monogamy using the systems of coupled equations shown in Figures 10 and 11, respectively. Recall that $\mathbb{P}_h(h_1 \equiv h'_1)$ and $\mathbb{P}_h(h_2 \equiv h'_2)$ are as shown in Figure 9. Hence, the ratios $R_h^U$ (unconstrained mating) and $R_h^M$ (perfect monogamy) can easily be computed. With μ1 = μ2 = u, some numerical values of $R_h^U$ and $R_h^M$ are shown on the left hand side of Table 2 for N = 10,000 and r = 1/2. The values of $R_h^U$ shown there agree exactly with those of Laurie and Weir (see Table 2 of their paper), thus confirming the correctness of our graphical framework. Note that both ratios $R_h^U$ and $R_h^M$ can be substantially larger than 1, and that $R_h^M > R_h^U$ for all u. For two loci, mutation rates need to be rather high for the effect of monogamy to be noticeable. As we discuss later in Section 5.4, the effect of monogamy increases with the number of loci.
Figure 12.

The match graph corresponding to the two-locus haplotypic match probability, using Convention 2.
Table 2.
Ratios of the two-locus match probability to the product of one-locus match probabilities for N = 10,000, r = 1/2, and μ1 = μ2 = u.
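| u | Haplotypic, unconstrained ($R_h^U$) | Haplotypic, monogamy ($R_h^M$) | Genotypic, unconstrained ($R_g^U$) | Genotypic, monogamy ($R_g^M$) |
|---|---|---|---|---|
| $1 \times 10^{-1}$ | $2.1691 \times 10^{2}$ | $4.3279 \times 10^{2}$ | $2.3535 \times 10^{4}$ | $9.3678 \times 10^{4}$ |
| $2.5 \times 10^{-2}$ | $1.6747 \times 10^{1}$ | $3.2492 \times 10^{1}$ | $1.4097 \times 10^{2}$ | $5.2933 \times 10^{2}$ |
| $1 \times 10^{-2}$ | 3.6058 | 6.2113 | 7.0176 | $1.9858 \times 10^{1}$ |
| $5 \times 10^{-3}$ | 1.6590 | 2.3179 | 1.8782 | 3.1949 |
| $1 \times 10^{-3}$ | 1.0266 | 1.0532 | 1.0270 | 1.0547 |
| $1 \times 10^{-4}$ | 1.0003 | 1.0005 | 1.0003 | 1.0005 |
| $1 \times 10^{-5}$ | 1.0000 | 1.0000 | 1.0000 | 1.0000 |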
5.2 Two-locus genotypic match probability
Let w = w1w2 and x = x1x2 denote two gametes forming a genotypic sequence g = g1g2, and let y = y1y2 and z = z1z2 denote two other gametes forming another genotypic sequence g′ = g′1g′2. There are four possible ways, illustrated in Figure 13, that the genotypic match g ≡ g′ can happen. These possibilities are not mutually exclusive, and to compute the probability that at least one of them holds (that is, the probability of g ≡ g′) we invoke the inclusion-exclusion principle. First, we need to introduce a new definition. Given a set of fully-labeled graphs H1, H2,…, Hk with the same labeled vertex sets, we define H1 ⊕ ⋯ ⊕ Hk as the graph obtained by the following two steps:
1. Let ℋ denote the match graph obtained by taking a union of the edges in Ha, a = 1,…, k.
2. In ℋ, if xi ≡ yi is implied by transitivity of match relations but there is no edge labeled i between vertices x and y, then add such an edge. (By transitivity of match relations, we mean that xi ≡ zi and zi ≡ yi together imply xi ≡ yi.)
Figure 13. Four possible ways of having a two-locus genotypic match.

Convention 2 is used here. Gametes w and x form one genotype, and y and z form another. Note that G1 ~ G4 and G2 ~ G3, where ~ denotes equivalence as edge-labeled graphs. However, the ⊕ operation is defined on Gi as fully labeled graphs.
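Then, by the principle of inclusion-exclusion, we obtain

$$\mathbb{P}_g(g \equiv g') = \sum_{\emptyset \neq X \subseteq \{1,2,3,4\}} (-1)^{|X|+1} \, \mathbb{P}\Bigl(\bigoplus_{a \in X} G_a\Bigr),$$

where $\mathbb{P}(G)$ denotes the probability of the match relations encoded by the graph $G$.
Under random mating, this expression simplifies to the graphical representation shown in Figure 14, where we have dropped vertex labels and used the equivalence described in Section 3.1. In a similar vein, it is straightforward to show that the one-locus genotypic match probability $\mathbb{P}_g(g_i \equiv g'_i)$ for locus i is as illustrated in Figure 15. The only difference between $\mathbb{P}_g(g_1 \equiv g'_1)$ and $\mathbb{P}_g(g_2 \equiv g'_2)$ is in their corresponding mutation rates μ1 and μ2.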
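The ⊕ operation and the inclusion-exclusion sum can be sketched in a few lines of code. The example below is an illustration only: it represents a fully labeled match graph as a set of (locus, vertex pair) edges, and it assumes that the four graphs of Figure 13 correspond to choosing, independently at each locus, one of the two pairings (w with y and x with z, or w with z and x with y). All function names are hypothetical.

```python
from itertools import combinations, product

VERTICES = ("w", "x", "y", "z")
# Assumed: the two ways in which genotype {w, x} can match genotype {y, z}
# at a single locus.
PAIRINGS = (
    (("w", "y"), ("x", "z")),
    (("w", "z"), ("x", "y")),
)


def genotype_match_graphs(num_loci):
    """One fully labeled graph per choice of pairing at each locus."""
    graphs = []
    for choice in product(range(2), repeat=num_loci):
        edges = set()
        for locus, c in enumerate(choice, start=1):
            for u, v in PAIRINGS[c]:
                edges.add((locus, frozenset((u, v))))
        graphs.append(frozenset(edges))
    return graphs


def transitive_closure(pairs):
    """All vertex pairs linked by a chain of matches at a single locus."""
    parent = {v: v for v in VERTICES}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in pairs:
        parent[find(u)] = find(v)
    return {frozenset((u, v))
            for u, v in combinations(VERTICES, 2) if find(u) == find(v)}


def oplus(graphs, num_loci):
    """Step 1: union of edges. Step 2: add edges implied by transitivity."""
    union = set().union(*graphs)
    closed = set()
    for locus in range(1, num_loci + 1):
        pairs = [tuple(e) for lab, e in union if lab == locus]
        closed.update((locus, e) for e in transitive_closure(pairs))
    return frozenset(closed)


def signed_terms(num_loci):
    """Inclusion-exclusion coefficients, collected per fully labeled graph."""
    graphs = genotype_match_graphs(num_loci)
    terms = {}
    for k in range(1, len(graphs) + 1):
        for subset in combinations(graphs, k):
            g = oplus(subset, num_loci)
            terms[g] = terms.get(g, 0) + (-1) ** (k + 1)
    return {g: c for g, c in terms.items() if c != 0}


if __name__ == "__main__":
    terms = signed_terms(2)
    # If every match probability equals 1, the signed sum must equal 1.
    print(sum(terms.values()))
```

This sketch groups terms only by identical fully labeled graphs; grouping further by equivalence as edge-labeled graphs, as done in the text under random mating, would require an additional canonical-form step that is not shown here.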
Figure 14.

Two-locus genotypic match probability, adopting Convention 2.
Figure 15.

One-locus genotypic match probability $\mathbb{P}_g(g_i \equiv g'_i)$. Every edge shown here should be labeled i.
For μ1 = μ2 = u, numerical values of the genotypic ratios $R_g^U$ (unconstrained mating) and $R_g^M$ (perfect monogamy) are shown on the right hand side of Table 2. As mentioned before, our computation of the haplotypic ratio $R_h^U$ agrees exactly with that of Laurie and Weir (2003). However, for $u < 2.5 \times 10^{-2}$, there is a slight difference between our computation of the genotypic ratio $R_g^U$ and that reported by Laurie and Weir (see Table 1 of their paper). We found that the difference could be attributed to a minor error in the Maple code used to obtain their results. After correcting that error, we verified that their program produces exactly the same results as ours.
Note that $R_g^M > R_g^U$ for all u. Illustrated in Figure 16 are plots of $R_g^U$ and $R_g^M$ for N = 10,000 and N = 100,000. (The human effective population size before expansion into Europe has been estimated to be between 10,000 and 100,000. See Harding et al. 1997; Harpending et al. 1998; Takahata 1993; Ayala 1995. Note that Laurie and Weir (2003) also used N = 10,000 and N = 100,000 in reporting numerical results.) Although both $R_g^U$ and $R_g^M$ increase significantly as N increases, Figure 17 shows that the ratio $R_g^M / R_g^U$ does not depend as much on N, especially for large mutation rates. For low mutation rates, as u increases, $R_g^M / R_g^U$ increases at a faster rate for larger N. Figure 17 suggests that the ratio $R_g^M / R_g^U$ is bounded from above by a finite number. We return to this topic in Section 5.6.
Figure 16.

Ratios of two-locus genotypic match probabilities to the product of one-locus match probabilities, assuming μ1 = μ2 = u. As these plots show, the ratio for perfect monogamy can be much higher than the ratio for unconstrained mating. Both $R_g^U$ and $R_g^M$ increase significantly as N increases.
Figure 17.

Ratio of the two-locus genotypic match probability for perfect monogamy to the probability for unconstrained mating, with μ1 = μ2 = u. The ratio seems to approach an integer (namely, 4) as u approaches 1 from below. See Section 5.6 for further discussion.
5.3 1/N Expansion
In the L-locus case, a graph that arises in the haplotypic match probability computation can contain up to 2L vertices, while a graph in the genotypic case can contain up to 4L vertices. Let n denote the number of vertices in a split graph. For n ≥ 12, the total number of partitions of the set [n] = {1,…, n}, that is, the Bell number B(n), can be very large (e.g., B(12) = 4,213,597, B(13) = 27,644,437, and B(14) = 190,899,322). (Recall that a set partition of [n] defines a particular vertex merge operation on a split graph with vertices labeled by [n].) Hence, to handle many loci, we propose an approximation scheme that truncates the equations at a certain order in 1/N, where N is assumed to be sufficiently large.
Consider the vertex merge operation corresponding to a partition of [n] into k non-empty subsets, merging all vertices within each subset into a single vertex (k corresponds to the number of vertices after merges). Under unconstrained mating, the probability of such a merge operation is of order $1/N^{n-k}$, as can be seen in (2). Hence, in generating the required systems of equations, if we want to keep only those terms with coefficients of order $1/N^{m}$ where m ≤ 2 (call this order-2 truncation), then we only need to consider those partitions of [n] with k ≥ n − 2 non-empty subsets. So, the total number of merge operations we need to consider will be T(n) := S(n, n) + S(n, n − 1) + S(n, n − 2), with S(n, k) being the Stirling number of the second kind. Note that T(n) is substantially smaller than the Bell number B(n) for n ≥ 10. For example, T(12) = 1772, T(13) = 2510, and T(14) = 3459. Compare these numbers with the corresponding B(n) shown above.
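The counts quoted above can be reproduced with a short script. This is a minimal sketch using the standard recurrence for Stirling numbers of the second kind; the function names are chosen here for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of partitions of an n-element set into k non-empty subsets."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Element n either joins one of k existing blocks or forms its own block.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n):
    """Bell number B(n): total number of partitions of [n]."""
    return sum(stirling2(n, k) for k in range(n + 1))


def truncated_merge_count(n, order=2):
    """T(n) for order-2 truncation: partitions with at least n - order blocks."""
    return sum(stirling2(n, n - j) for j in range(order + 1) if n - j >= 0)


if __name__ == "__main__":
    for n in (12, 13, 14):
        print(n, bell(n), truncated_merge_count(n))
```

For n = 12, 13, 14 this prints B(n) and T(n) side by side, making the size of the reduction under unconstrained mating explicit.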
Truncation in the perfect monogamy model is a bit more subtle. In that case, some partitions with k = n − 3 or k = n − 4 have probabilities proportional to $1/N^{2}$. Therefore, to obtain those terms with coefficients of order $1/N^{m}$ where m ≤ 2, we need to consider the partitions of [n] with k ≥ n − 4 non-empty subsets that are consistent with the conditions of the perfect monogamy model (described in Section 3.4).
Shown in Table 3 are two-locus match ratios computed using order-2 truncation. Comparing that table with Table 2, we conclude that the proposed approximation scheme produces very accurate answers. The haplotypic ratios $R_h^U$ and $R_h^M$ in Table 3 are identical to those in Table 2, and we have noticed that even for more loci, $R_h^U$ and $R_h^M$ obtained from order-2 truncation are very close to the exact values. Regarding the genotypic match ratios $R_g^U$ and $R_g^M$, comparing Table 3 with Table 2 shows that the accuracy of order-2 truncation decreases with increasing mutation rate, but is still quite high (about 99.99%).
Table 3.
Approximate two-locus match probability ratios for N = 10,000, r = 1/2, and μ1 = μ2 = u.
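| u | Haplotypic, unconstrained ($R_h^U$) | Haplotypic, monogamy ($R_h^M$) | Genotypic, unconstrained ($R_g^U$) | Genotypic, monogamy ($R_g^M$) |
|---|---|---|---|---|
| $1 \times 10^{-1}$ | $2.1691 \times 10^{2}$ | $4.3279 \times 10^{2}$ | $2.3529 \times 10^{4}$ | $9.3691 \times 10^{4}$ |
| $2.5 \times 10^{-2}$ | $1.6747 \times 10^{1}$ | $3.2492 \times 10^{1}$ | $1.4093 \times 10^{2}$ | $5.2928 \times 10^{2}$ |
| $1 \times 10^{-2}$ | 3.6058 | 6.2113 | 7.0162 | $1.9856 \times 10^{1}$ |
| $5 \times 10^{-3}$ | 1.6590 | 2.3179 | 1.8780 | 3.1947 |
| $1 \times 10^{-3}$ | 1.0266 | 1.0532 | 1.0270 | 1.0547 |
| $1 \times 10^{-4}$ | 1.0003 | 1.0005 | 1.0003 | 1.0005 |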
These results were obtained using truncated systems of equations, ignoring terms with coefficients of $O(1/N^{3})$. Comparing this table with Table 2 shows that the proposed approximation method produces very accurate answers.
5.4 Multi-locus haplotypic match probabilities
To compute the L-locus haplotypic match probability ℙh(h ≡ h′), we need to solve for the graph shown in Figure 18. Taking that graph as a pivot graph, we need to perform all possible vertex split and merge operations, and then iterate the procedure on newly arising graphs, until we obtain a closed system of equations which we can solve. (See Section 3.5 for details. We remark that no two edges have the same label in any haplotypic match graph.) Under unconstrained mating, the same split graph GS may arise from different pivot graphs. We found that using dynamic programming, which allows one to avoid performing the same vertex merge operations on GS more than once, can considerably speed up the computation. Further, for both unconstrained mating and perfect monogamy, k-locus graphs, for k = 2, 3,…, L − 1, will appear in the L-locus computation, so one may again employ dynamic programming and carry out the computation sequentially in increasing number of loci.
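The iterate-until-closed procedure described above can be sketched generically. In the example below, `expand` is a hypothetical callback standing in for "perform all vertex splits and merges on this graph and return the resulting linear equation"; the toy two-graph system and its coefficients are invented purely to exercise the code and do not come from Figures 10 or 11. Storing each expanded graph in a dictionary is one simple way to avoid repeating work on a graph that arises more than once.

```python
from fractions import Fraction


def close_system(pivot, expand):
    """Expand graphs until no new graph appears.

    expand(g) must return (constant, {graph: coefficient}) for the
    equation g = constant + sum(coefficient * graph).
    """
    equations = {}
    pending = [pivot]
    while pending:
        g = pending.pop()
        if g in equations:  # already expanded: reuse the stored equation
            continue
        const, terms = expand(g)
        equations[g] = (const, terms)
        pending.extend(h for h in terms if h not in equations)
    return equations


def solve(equations):
    """Solve x = c + A x by Gauss-Jordan elimination on (I - A) x = c."""
    names = list(equations)
    idx = {g: i for i, g in enumerate(names)}
    n = len(names)
    rows = []
    for g in names:
        const, terms = equations[g]
        row = [Fraction(0)] * (n + 1)
        row[idx[g]] += 1
        for h, coeff in terms.items():
            row[idx[h]] -= coeff
        row[n] = Fraction(const)
        rows.append(row)
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return {g: rows[idx[g]][n] for g in names}


# Hypothetical toy system: A = 1/4 + B/2 and B = 1/2 + A/4.
TOY = {
    "A": (Fraction(1, 4), {"B": Fraction(1, 2)}),
    "B": (Fraction(1, 2), {"A": Fraction(1, 4)}),
}

if __name__ == "__main__":
    print(solve(close_system("A", TOY.__getitem__)))
```

In a real implementation the graphs would need a canonical form so that equivalent graphs map to the same dictionary key; that step is omitted here.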
Figure 18.

The L-locus haplotypic match probability ℙh(h ≡ h′).
The one-locus haplotypic match probability $\mathbb{P}_h(h_i \equiv h'_i)$ for locus i is shown in Figure 9. For L ≤ 5, the ratios $R_h^U$ and $R_h^M$ are shown in Table 4. For two and three loci, the values shown in that table agree with the corresponding results in Table 2 of Laurie and Weir (2003). To speed up the computation, we used order-2 truncation (described in Section 5.3) for the 5-locus case. Several conclusions can be drawn from this study. First, for a given mutation rate u, both $R_h^U$ and $R_h^M$ increase with the number of loci; the higher the mutation rate, the faster the increase. Second, the effect of monogamy increases with the number of loci, i.e., the ratio $R_h^M / R_h^U$ increases with the number of loci. Third, for a given number of loci, the effect of monogamy increases with the mutation rate.
Table 4.
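L-locus haplotypic match ratios for N = 10,000 and μi = u for all i = 1, …, L.
| u | 2-locus $R_h^U$ | 2-locus $R_h^M$ | 2-locus $R_h^M / R_h^U$ | 3-locus $R_h^U$ | 3-locus $R_h^M$ | 3-locus $R_h^M / R_h^U$ |
|---|---|---|---|---|---|---|
| $1 \times 10^{-1}$ | $2.1691 \times 10^{2}$ | $4.3279 \times 10^{2}$ | 1.995 | $1.7799 \times 10^{5}$ | $7.1055 \times 10^{5}$ | 3.992 |
| $2.5 \times 10^{-2}$ | $1.6747 \times 10^{1}$ | $3.2492 \times 10^{1}$ | 1.940 | $3.2277 \times 10^{3}$ | $1.2812 \times 10^{4}$ | 3.969 |
| $1 \times 10^{-2}$ | 3.6058 | 6.2113 | 1.723 | $2.1811 \times 10^{2}$ | $8.5372 \times 10^{2}$ | 3.914 |
| $5 \times 10^{-3}$ | 1.6590 | 2.3179 | 1.397 | $2.9387 \times 10^{1}$ | $1.1058 \times 10^{2}$ | 3.763 |
| $1 \times 10^{-3}$ | 1.0266 | 1.0532 | 1.026 | 1.2927 | 2.0111 | 1.556 |
| $1 \times 10^{-4}$ | 1.0003 | 1.0005 | 1.0003 | 1.0010 | 1.0025 | 1.0014 |

| u | 4-locus $R_h^U$ | 4-locus $R_h^M$ | 4-locus $R_h^M / R_h^U$ | 5-locus $R_h^U$ | 5-locus $R_h^M$ | 5-locus $R_h^M / R_h^U$ |
|---|---|---|---|---|---|---|
| $1 \times 10^{-1}$ | $1.6479 \times 10^{8}$ | $1.3145 \times 10^{9}$ | 7.977 | $1.5604 \times 10^{11}$ | $2.4855 \times 10^{12}$ | 15.93 |
| $2.5 \times 10^{-2}$ | $7.6574 \times 10^{5}$ | $6.0701 \times 10^{6}$ | 7.927 | $1.8809 \times 10^{8}$ | $2.9735 \times 10^{9}$ | 15.81 |
| $1 \times 10^{-2}$ | $2.0755 \times 10^{4}$ | $1.6247 \times 10^{5}$ | 7.828 | $2.0627 \times 10^{6}$ | $3.2122 \times 10^{7}$ | 15.57 |
| $5 \times 10^{-3}$ | $1.3677 \times 10^{3}$ | $1.0481 \times 10^{4}$ | 7.663 | $6.8626 \times 10^{4}$ | $1.0426 \times 10^{6}$ | 15.19 |
| $1 \times 10^{-3}$ | 4.0398 | $2.0942 \times 10^{1}$ | 5.184 | $3.3603 \times 10^{1}$ | $4.1157 \times 10^{2}$ | 12.25 |
| $1 \times 10^{-4}$ | 1.0027 | 1.0082 | 1.0056 | 1.0060 | 1.0252 | 1.0191 |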
All loci are assumed to be pairwise unlinked. For ease of reference, we repeat here the results for two loci. We used order-2 truncation for five loci and the exact computation for all other cases.
5.5 Three-locus genotypic match probability
We now consider the three-locus genotypic match probability. Let w = w1w2w3 and x = x1x2x3 denote two gametes forming a genotypic sequence g = g1g2g3, and let y = y1y2y3 and z = z1z2z3 denote two other gametes forming another genotypic sequence g′ = g′1g′2g′3. There are eight possible ways that the genotypic match g ≡ g′ can happen, as illustrated in Figure 19. As in the case of two loci, these possibilities are not mutually exclusive, and we need to use the inclusion-exclusion principle to compute the probability that at least one of them holds. More precisely,

$$\mathbb{P}_g(g \equiv g') = \sum_{X} (-1)^{|X|+1} \, \mathbb{P}\Bigl(\bigoplus_{a \in X} G_a\Bigr),$$

where X denotes a non-empty subset of {1, 2,…, 8} and the ⊕ operation is defined as in Section 5.2. This expression simplifies to an expression involving fourteen inequivalent edge-labeled graphs, not shown here. As in the two-locus case, the one-locus genotypic match probability for locus i is as shown in Figure 15.
Figure 19. Eight possible ways of having a three-locus genotypic match.

Gametes w and x form one genotype, and y and z form another. Edge labels are omitted here to avoid clutter; solid arcs above vertices are for locus 1, dotted lines are for locus 2, and solid arcs below vertices are for locus 3. Note that G1 ~ G8, G2 ~ G7, G3 ~ G6, and G4 ~ G5, where ~ denotes equivalence as edge-labeled graphs. Recall that the ⊕ operation is defined on Gi as fully labeled graphs.
Shown in Table 5 are the genotypic ratios $R_g^U$ and $R_g^M$ for N = 10,000, with μi = u for all i = 1,…, L. Two-locus results are repeated there for ease of comparison. Comparing these genotypic results with the haplotypic results in Table 4, we see that for two loci, $R_g^U > R_h^U$ and $R_g^M > R_h^M$ for any given mutation rate. For three loci, however, these inequalities are violated for low mutation rates (say, $u \lesssim 1.2 \times 10^{-3}$). As in the haplotypic case, $R_g^M > R_g^U$ for any given mutation rate. The results in Table 5 show that, as in the haplotypic case, the effect of monogamy grows with the number of loci; i.e., the ratio $R_g^M / R_g^U$ increases with the number of loci.
Table 5.
Genotypic match ratios for N = 10,000 and μi = u for all i = 1, …, L, with all loci assumed to be pairwise unlinked.
| u | 2-locus $R_g^U$ | 2-locus $R_g^M$ | 2-locus $R_g^M / R_g^U$ | 3-locus $R_g^U$ | 3-locus $R_g^M$ | 3-locus $R_g^M / R_g^U$ |
|---|---|---|---|---|---|---|
| $1 \times 10^{-1}$ | $2.35 \times 10^{4}$ | $9.37 \times 10^{4}$ | 3.98 | $7.92 \times 10^{9}$ | $1.26 \times 10^{11}$ | 16.0 |
| $2.5 \times 10^{-2}$ | $1.41 \times 10^{2}$ | $5.29 \times 10^{2}$ | 3.76 | $2.61 \times 10^{6}$ | $4.12 \times 10^{7}$ | 15.8 |
| $1 \times 10^{-2}$ | 7.016 | $1.986 \times 10^{1}$ | 2.840 | $1.20 \times 10^{4}$ | $1.84 \times 10^{5}$ | 15.3 |
| $5 \times 10^{-3}$ | 1.878 | 3.195 | 1.701 | $2.21 \times 10^{2}$ | $3.10 \times 10^{3}$ | 14.1 |
| $1 \times 10^{-3}$ | 1.027 | 1.055 | 1.027 | 1.210 | 1.861 | 1.538 |
| $1 \times 10^{-4}$ | 1.0003 | 1.0005 | 1.0003 | 1.0009 | 1.0020 | 1.0011 |
5.6 Sharp upper bounds on the effect of monogamy
Tables 4 and 5 suggest that the L-locus ratios $R_h^M / R_h^U$ and $R_g^M / R_g^U$ stay bounded by a finite number (dependent on L) as the common mutation rate u increases. We have checked numerically that this property still holds for mutation rates higher than $1 \times 10^{-1}$. Based on this empirical observation, we make the following two conjectures regarding sharp upper bounds on the effect of monogamy:
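Conjecture 1 Let $h = h_1 h_2 \cdots h_L$ and $h' = h'_1 h'_2 \cdots h'_L$ denote L-locus haplotypic sequences, and recall that $R_h^M / R_h^U$ is equal to the ratio of the L-locus haplotypic match probability $\mathbb{P}_h(h \equiv h')$ under perfect monogamy to that under unconstrained mating. Suppose that μi = u for all i = 1,…, L. Then,

$$\lim_{u \to 1^-} \frac{R_h^M}{R_h^U} = 2^{L-1},$$

and $R_h^M / R_h^U \le 2^{L-1}$ for all u.
Conjecture 2 Let $g = g_1 g_2 \cdots g_L$ and $g' = g'_1 g'_2 \cdots g'_L$ denote L-locus genotypic sequences, and recall that $R_g^M / R_g^U$ is equal to the ratio of the L-locus genotypic match probability $\mathbb{P}_g(g \equiv g')$ under perfect monogamy to that under unconstrained mating. Suppose that μi = u for all i = 1,…, L. Then,

$$\lim_{u \to 1^-} \frac{R_g^M}{R_g^U} = 2^{2L-2},$$

and $R_g^M / R_g^U \le 2^{2L-2}$ for all u.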
The above conjectures are independent of N. However, the larger the N, the faster the rate at which $R_h^M / R_h^U$ and $R_g^M / R_g^U$ approach their respective upper bounds as u increases. This property is illustrated in Figure 17 for the two-locus genotypic case. Since $R_h^M$ and $R_g^M$ are for perfect monogamy (i.e., the most extreme level of monogamy), the upper bounds shown in the above conjectures are also upper bounds for all intermediate levels of monogamy.
We believe that there may exist a simple combinatorial explanation for the upper bounds $2^{L-1}$ and $2^{2L-2}$ appearing in Conjectures 1 and 2, respectively. It would be interesting to study the asymptotic behavior analytically. Further, it would be worthwhile to study the dependence of $R_h^M / R_h^U$ and $R_g^M / R_g^U$ on the mutation rate u, especially for small u. As Figure 17 indicates, it seems that interesting dynamics can happen within a small window of u.
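The numerical pattern behind these conjectures can be checked directly against the u = 1 × 10⁻¹ rows of Tables 4 and 5. The short script below is only an illustration: it recomputes the monogamy effect from the rounded table entries and compares it with the conjectured bounds $2^{L-1}$ (haplotypic) and $2^{2L-2}$ (genotypic). The variable and function names are hypothetical, and small discrepancies from the ratio columns of the tables are expected because the inputs are rounded.

```python
# (unconstrained, monogamy) ratios at u = 1e-1 and N = 10,000,
# copied from Table 4 (haplotypic) and Table 5 (genotypic).
HAPLOTYPIC = {
    2: (2.1691e2, 4.3279e2),
    3: (1.7799e5, 7.1055e5),
    4: (1.6479e8, 1.3145e9),
    5: (1.5604e11, 2.4855e12),
}
GENOTYPIC = {
    2: (2.35e4, 9.37e4),
    3: (7.92e9, 1.26e11),
}


def monogamy_effect(table):
    """Monogamy ratio divided by unconstrained ratio, for each L."""
    return {num_loci: m / u for num_loci, (u, m) in table.items()}


def haplotypic_bound(num_loci):
    return 2 ** (num_loci - 1)


def genotypic_bound(num_loci):
    return 2 ** (2 * num_loci - 2)


if __name__ == "__main__":
    for num_loci, effect in monogamy_effect(HAPLOTYPIC).items():
        print("haplotypic", num_loci, round(effect, 3), haplotypic_bound(num_loci))
    for num_loci, effect in monogamy_effect(GENOTYPIC).items():
        print("genotypic", num_loci, round(effect, 3), genotypic_bound(num_loci))
```

At this mutation rate every recomputed effect lies just below the corresponding power of 2, which is the behavior the conjectures describe.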
6 Discussion and Conclusions
The goal of this paper is to provide a framework within which multi-locus probabilities that two unrelated individuals have the same genotype at several loci can be analyzed in a relatively simple manner. Although the analysis of models involving two or more loci is necessarily complicated because of the many ways in which identity and nonidentity propagate from one generation to the next, the graphical method introduced here makes the combinatorial structure of the problem clear and the analysis as simple as possible, and it leads to a method for automatic generation of the appropriate recurrence equations that minimizes the problem of human error. The graphical method takes advantage of the underlying symmetry of the inheritance of unlinked loci and can be adapted to the analysis of similar models.
We have shown that the qualitative conclusion of Laurie and Weir (2003) is correct under a wider range of conditions than they were able to consider with their method. In a randomly mating population, the product rule provides a very close approximation to the probability that two unrelated individuals have the same genotype provided that mutation rates are not too large. If the population size is 10,000, then u = 0.0001 corresponds to a heterozygosity of 80%, which is typical of CODIS loci (Budowle et al., 2001). For that value of u, the ratio R is very close to 1 even for the haplotypic match probability at 5 loci and even if there is complete monogamy (see Table 4).
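The heterozygosity figure quoted above can be checked with a one-line calculation, assuming the standard infinite-alleles equilibrium approximation for a diploid population of size N (the formula is not written out in the text, so its use here is an assumption):

```latex
H \approx \frac{4Nu}{1 + 4Nu}
  = \frac{4 \times 10^{4} \times 10^{-4}}{1 + 4 \times 10^{4} \times 10^{-4}}
  = \frac{4}{5} = 0.8 .
```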
One limitation of our study, as well as that of Laurie and Weir (2003), is that we assume an infinite alleles model of mutation. Consequently, identity in allelic state implies identity by descent. We do not allow for independent origins of the same allele, as can happen with microsatellite loci. Our results show, however, that there is no substantial increase in the joint probability of identity by descent because of shared genealogies in a finite population. That conclusion is true for other mutation models as well.
Acknowledgments
This research is supported in part by NSF grants CCF-0515278 and IIS-0513910 (YSS) and by NIH grant R01-GM40282 (MS). We thank C. Laurie for helpful comments on a preliminary version of this paper and for kindly providing a copy of the Maple program used to generate the results in Laurie and Weir (2003).
Appendix A. Derivation of (3)
We briefly describe here how the probability shown in (3) is obtained. The same notation introduced at the end of Section 3.4 is used here. A set partition {X1,…, Xk} of [n] defines a particular case of assigning n labeled gametes to k distinct unlabeled parental gametes, with each of those k parents having at least one child. The elements in T and U correspond to the parents. Suppose that the merge operation defined by this partition is consistent with perfect monogamy. Then, |T| is even, since two vertices in each split pair in the split graph map to two distinct subsets Xi, Xj, and two different split pairs map to either the same pair of subsets or two disjoint pairs of subsets. In the perfect monogamy model, recall that there are N pairs of parental gametes. Each split pair can choose a particular pair of parents with probability 1/N. Two split pairs w, x and y, z can choose the same pair of parents in two ways: either w collides with y and x collides with z, or w collides with z and x collides with y. Each possibility has probability 1/(2N). Putting all these things together, we conclude that the probability of surjectively assigning |S|/2 split pairs to |T|/2 disjoint pairs of parents is
The remaining n − |S| vertices in the split graph choose parents such that each parent in U has at least one child, and the associated probability is
Equation (3) now follows from putting the above two probabilities together.
Contributor Information
Yun S. Song, Email: yssong@cs.ucdavis.edu.
Montgomery Slatkin, Email: slatkin@berkeley.edu.
References
- Ayala FJ. The myth of Eve: molecular biology and human origins. Science. 1995;270:1930–1936. doi: 10.1126/science.270.5244.1930.
- Budowle B, Shea B, Niezgoda S, Chakraborty R. CODIS STR loci data from 41 sample populations. J Forensic Sci. 2001;46:453–489.
- Evett IW, Weir BS. Interpreting DNA Evidence. Sinauer Associates; Sunderland, Mass: 2003.
- Harding RM, Fullerton SM, Griffiths RC, Bond J, Cox MJ, Schneider JA, Moulin DS, Clegg JB. Archaic African and Asian lineages in the genetic ancestry of modern humans. Am J Hum Genet. 1997;60:772–789.
- Harpending HC, Batzer MA, Gurven M, Jorde LB, Rogers AR, Sherry ST. Genetic traces of ancient demography. Proc Nat Acad Sci. 1998;95:1961–1967. doi: 10.1073/pnas.95.4.1961.
- Hill WG, Robertson A. Linkage disequilibrium in finite populations. Theor Appl Genet. 1968;38:226–231. doi: 10.1007/BF01245622.
- Laurie C, Weir BS. Dependency effects in multi-locus match probabilities. Theor Popul Biol. 2003;63:207–219. doi: 10.1016/s0040-5809(03)00002-9.
- Ohta T, Kimura M. Linkage disequilibrium at steady state determined by random genetic drift and recurrent mutation. Genetics. 1969;63:229–238. doi: 10.1093/genetics/63.1.229.
- Strobeck C, Golding GB. The variance of linkage disequilibrium between three loci in a finite population. Can J Genet Cytol. 1983;25:139–145. doi: 10.1139/g83-026.
- Takahata N. Allelic genealogy and human evolution. Mol Biol Evol. 1993;10:2–22. doi: 10.1093/oxfordjournals.molbev.a039995.
- Weir BS. Matching and partially-matching DNA profiles. J Forensic Sci. 2004;49:1009–1014.
