Genetics. 2007 May;176(1):477–488. doi: 10.1534/genetics.106.065433

Variance of the Parental Genome Contribution to Inbred Lines Derived From Biparental Crosses

Matthias Frisch, Albrecht E. Melchinger
PMCID: PMC1893034  PMID: 17409089

Abstract

The expectation of the parental genome contribution to inbred lines derived from biparental crosses or backcrosses is well known, but no theoretical results exist for its variance. Our objective was to derive the variance of the parental genome contribution to inbred lines developed by the single-seed descent or double haploid method from biparental crosses or backcrosses. We derived formulas and tabulated results for the variance of the parental genome contribution depending on the chromosome lengths and the mating scheme used for inbred line development. A normal approximation of the probability distribution function of the parental genome contribution fitted well the exact distribution obtained from computer simulations. We determined upper and lower quantiles of the parental genome contribution for model genomes of sugar beet, maize, and wheat using normal approximations. These can be employed to detect essentially derived varieties in the context of plant variety protection. Furthermore, we outlined the application of our results to predict the response to selection. Our results on the variance of the parental genome contribution can assist breeders and geneticists in the design of experiments or breeding programs by assessing the variation around the mean parental genome contribution for alternative crossing schemes.


The expected contribution of a parental line to the genome of an inbred line derived from a biparental cross is $1/2$. For inbred lines derived from a backcross, the expected genome contribution of the nonrecurrent parent is $(1/2)^{t+1}$, where t is the number of backcross generations. Experimental studies showed considerable variation in the parental genome contribution around these mean values (Heckenberger et al. 2006), but until now no theoretical concept for describing the variance of the parental genome contribution to homozygous inbred lines existed.

Inbred lines are developed for various purposes in genetic research and applied plant breeding programs, e.g., for direct use as line cultivars or as parents of hybrid and synthetic varieties. A theoretical concept for calculating the variance of the parental genome contribution to inbred lines can be used (1) in plant variety protection to test hypotheses on the mating scheme that was employed for inbred line development and (2) to assess and compare the variability in experimental and breeding populations generated with a certain mating scheme depending on the number and length of the chromosomes of the species under consideration.

Hill (1993) derived the variance of the parental genome contribution to heterozygous backcross individuals under the assumption of no interference in crossover formation. Employing his formula for the variance, he found that a normal approximation fitted well the probability distribution of the parental genome contribution obtained from computer simulations. Using the cattle genome as an example, he demonstrated that his results can be employed to determine approximate upper bounds for the parental genome contribution of the nonrecurrent stock.

Our objectives were to (1) derive the variance of the parental genome contribution to inbred lines developed by the single-seed descent (SSD) or double haploid (DH) method from biparental crosses or backcrosses adopting the approach of Hill (1993), (2) investigate with computer simulations the fit of a normal approximation to the probability distribution of the parental genome contribution, and (3) demonstrate the application of the formulas in the context of plant variety protection.

THEORY

Assumptions:

We assume that the offspring are completely homozygous lines, derived without selection from a biparental cross of completely homozygous parents P1 and P2. For all derivations, we assume absence of interference (Stam 1979) in crossover formation such that the recombination frequency ruv between two loci on a chromosome with map positions u and v is calculated by Haldane's (1919) mapping function

$$r_{uv} = \tfrac{1}{2}\left(1 - e^{-2|u - v|}\right) \qquad (1)$$
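As a small illustration, Haldane's mapping function can be evaluated directly. The sketch below assumes its standard form, r = (1 − exp(−2|u − v|))/2, with map positions in Morgans; the function name `haldane_r` is chosen here for illustration only.

```python
import math

def haldane_r(u: float, v: float) -> float:
    """Recombination frequency between map positions u and v (Morgans),
    assuming no interference (Haldane's mapping function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * abs(u - v)))

# Tightly linked loci recombine rarely; distant loci approach r = 0.5.
for distance in (0.01, 0.1, 0.5, 1.0, 3.0):
    print(f"d = {distance:4.2f} M  ->  r = {haldane_r(0.0, distance):.4f}")
```

For short distances r is close to the map distance itself, and for long distances it levels off below 0.5, which is the behavior the later derivations rely on.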

Variance of the parental genome contribution:

Meiosis on different chromosomes is stochastically independent. Hence, the variance of the genome contribution Z of parent P1 to the genome of a derived line can be written in terms of the variances Var(Zi) for individual chromosomes as

$$\mathrm{Var}(Z) = \sum_{i=1}^{c} \left(\frac{l_i}{l}\right)^{2} \mathrm{Var}(Z_i) \qquad (2)$$

where c is the number of chromosomes, $l_i$ the length of the ith chromosome, and $l = \sum_{i=1}^{c} l_i$ the total length of the genome in Morgan units.
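The combination of chromosome-level variances into a genome-level variance can be sketched as follows. The weighting by the squared length share (l_i / l)² is the form assumed here for Equation 2; it reproduces the quantiles tabulated later for the model genomes. The function name is hypothetical, and the chromosome variance 0.0809 is the Table 3 entry for F2-SSD lines (t = 0) at 1.6 M.

```python
def genome_variance(lengths, chrom_variances):
    """Var(Z) as the sum of Var(Z_i) weighted by the squared length share
    (l_i / l)^2, assuming independent meioses on different chromosomes."""
    total = sum(lengths)
    return sum((l / total) ** 2 * v for l, v in zip(lengths, chrom_variances))

# Maize model genome: 10 chromosomes of 1.6 M, F2-SSD lines (Table 3: 0.0809).
var_z = genome_variance([1.6] * 10, [0.0809] * 10)
print(f"Var(Z) = {var_z:.5f}, SD = {var_z ** 0.5:.3f}")
```

With c chromosomes of equal length the expression reduces to Var(Z_i) / c, so the standard deviation shrinks with the square root of the chromosome number.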

Following the approach introduced by Hill (1993) in the context of backcross populations, the variance of the parental genome contribution to a chromosome equals the expected covariance between two randomly sampled loci on the chromosome,

graphic file with name M6.gif (3)

where Gu and Gv are random variables taking the value 1 if the loci at map positions u and v carry the allele of parent P2 and 0 otherwise, and Duv is a random variable describing the linkage disequilibrium between two loci on the chromosome with probability density

graphic file with name M7.gif (4)

Using the formulas for

graphic file with name M8.gif (5)

given in Frisch and Melchinger (2006, Table 1 therein), D(u, v) can be calculated as

graphic file with name M9.gif (6)

TABLE 1.

Formulas for the expected gametic disequilibrium D(u, v) between two loci at map positions u and v in populations of infinite size under four mating schemes

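D(u, v), with r = r_uv and d = |u − v|
Mating system | General form | After inserting Haldane's mapping function
(F2)t-SSD | $\frac{(1-r)^{t}(1-2r)}{4(1+2r)}$ | $\frac{e^{-2d}(1+e^{-2d})^{t}}{2^{t+2}(2-e^{-2d})}$
(F1)t-DH | $\frac{(1-r)^{t}(1-2r)}{4}$ | $\frac{e^{-2d}(1+e^{-2d})^{t}}{2^{t+2}}$
BCt-SSD | $\frac{(1-r)^{t}}{2^{t+1}(1+2r)} - \frac{1}{2^{2t+2}}$ | $\frac{(1+e^{-2d})^{t}}{2^{2t+1}(2-e^{-2d})} - \frac{1}{2^{2t+2}}$
BCt-DH | $\frac{(1-r)^{t+1}}{2^{t+1}} - \frac{1}{2^{2t+2}}$ | $\frac{(1+e^{-2d})^{t+1} - 1}{2^{2t+2}}$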

D(u, v) depends on the recombination frequency ruv between the two loci and the number t of intermating or backcrossing generations.

We present formulas for D(u, v) for the following four mating systems (Table 1): (1) (F2)t-SSD lines, developed by t (t ≥ 0) generations of random mating of a F2 population and subsequent application of the SSD method for line development; (2) (F1)t-DH lines, developed by t (t ≥ 0) generations of random mating of a F1 cross and subsequent inbred line development with the DH method; and (3) BCt-SSD and (4) BCt-DH lines, developed from a F1 cross backcrossed t (t ≥ 1) times to parent P1, with subsequent line development by the SSD or DH method.

Inserting D(u, v) (Table 1) into Equation 3 yields Var(Zi). Analytical results for Var(Zi) are derived in the appendix and summarized in Table 2. Numerical results for Var(Zi) are given in Table 3. To check our derivations, we determined the results in Table 3 also with computer simulations using Plabsoft (Maurer et al. 2004). The differences between simulated and analytically determined variances were < 0.001 if one million chromosomes were simulated.
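The numerical values of Var(Zi) can be checked without the closed-form solutions. The sketch below rests on two assumptions that are not copied from the source: (a) Equation 3 amounts to averaging D(u, v) over all pairs of positions on the chromosome, which for a D that depends only on d = |u − v| reduces to (2 / l²) ∫ (l − d) D(d) dd over 0 ≤ d ≤ l; and (b) the expressions for D written in the code, which are standard two-locus results under the stated mating schemes. Under these assumptions the computed values agree with Table 3. All function names are illustrative.

```python
import math

def haldane_r(d):
    """Haldane recombination frequency for map distance d (Morgans)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))

# Expected disequilibrium D as a function of r and t (assumed forms, see text).
def d_f2_ssd(r, t):
    return (1 - r) ** t * (1 - 2 * r) / (4 * (1 + 2 * r))

def d_f1_dh(r, t):
    return (1 - r) ** t * (1 - 2 * r) / 4

def d_bc_ssd(r, t):
    return (1 - r) ** t / (2 ** (t + 1) * (1 + 2 * r)) - 0.25 ** (t + 1)

def d_bc_dh(r, t):
    return ((1 - r) / 2) ** (t + 1) - 0.25 ** (t + 1)

def var_zi(d_func, length, t, n=2000):
    """(2 / l^2) * integral_0^l (l - d) D(d) dd, by Simpson's rule (n even)."""
    h = length / n

    def f(d):
        return (length - d) * d_func(haldane_r(d), t)

    total = f(0.0) + f(length)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return 2.0 * (h / 3.0) * total / length ** 2

# Compare with the Table 3 column for a chromosome of 1.0 M.
for name, func, t, table3 in [
    ("F2-SSD", d_f2_ssd, 0, 0.1091),
    ("F1-DH", d_f1_dh, 0, 0.1419),
    ("BC1-SSD", d_bc_ssd, 1, 0.0818),
    ("BC1-DH", d_bc_dh, 1, 0.0945),
]:
    print(f"{name:8s} computed = {var_zi(func, 1.0, t):.4f}   Table 3 = {table3:.4f}")
```

For F1-DH lines with t = 0 the integral has the simple closed form (2l − 1 + exp(−2l)) / (8 l²), which gives 0.1419 for l = 1.0 and can serve as an independent check of the integration routine.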

TABLE 2.

Formulas for the variance Var(Zi) of the parental genome contribution to a chromosome of length li under four mating schemes

Mating system Var(Zi)
(F2)t-SSD Inline graphica
(F1)t-DH Inline graphic
BCt-SSD Inline graphic
BCt-DH Inline graphic
a

Inline graphic

Inline graphic

Inline graphic

TABLE 3.

Variance Var(Zi) of the parental genome contribution to a chromosome of length li under four mating schemes

Chromosome length li
t 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0
(F2)t-SSD
0 0.1410 0.1231 0.1091 0.0978 0.08860 0.0809 0.0743 0.0687
1 0.1246 0.1064 0.0927 0.0821 0.07356 0.0666 0.0608 0.0559
2 0.1110 0.0931 0.0800 0.0701 0.06232 0.0561 0.0509 0.0467
3 0.0997 0.0823 0.0700 0.0608 0.05374 0.0481 0.0435 0.0398
(F1)t-DH
0 0.1740 0.1566 0.1419 0.1294 0.11867 0.1094 0.1014 0.0943
1 0.1517 0.1330 0.1181 0.1060 0.09604 0.0877 0.0806 0.0745
2 0.1335 0.1145 0.1000 0.0886 0.07948 0.0720 0.0657 0.0605
3 0.1186 0.0998 0.0860 0.0754 0.06711 0.0604 0.0549 0.0503
BCt-SSD
1 0.1058 0.0923 0.0818 0.0734 0.06654 0.0607 0.0558 0.0516
2 0.0576 0.0497 0.0436 0.0389 0.03500 0.0318 0.0291 0.0269
3 0.0283 0.0241 0.0209 0.0185 0.01654 0.0150 0.0137 0.0126
4 0.0133 0.0112 0.0096 0.0084 0.00749 0.0067 0.0061 0.0056
BCt-DH
1 0.1194 0.1057 0.0945 0.0854 0.07769 0.0712 0.0656 0.0608
2 0.0632 0.0550 0.0486 0.0435 0.03929 0.0358 0.0328 0.0303
3 0.0306 0.0262 0.0229 0.0203 0.01821 0.0165 0.0151 0.0139
4 0.0143 0.0121 0.0104 0.0092 0.00816 0.0074 0.0067 0.0061

Probability distribution of the parental genome contribution:

The probability distribution of the parental genome contribution is determined by the number and location of crossover events occurring during the meioses in inbred line development. We investigated the probability distribution assuming no interference in crossover formation (Stam 1979), employing properties of the Poisson process (cf. Karlin 1968).

For an individual chromosome, the probability that exactly k crossovers occur during all meioses in inbred line development can be obtained from the probability function of the Poisson distribution. If no crossover occurs (k = 0), then the genome contribution of parent P1 is either 0 or 1. Consequently, the probabilities P(Zi = 0) and P(Zi = 1) are positive, and Zi has discrete point masses at 0 and 1. If k > 0 crossovers occur, then the lengths of the chromosome segments between crossovers are exponentially distributed and sums of segment lengths are gamma distributed. Consequently, on the open interval (0, 1), Zi follows a mixture of linear transformations of gamma distributions for the different values of k. For the entire genome, the distribution of the parental genome contribution is the convolution of the distributions for the individual chromosomes.
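The mixed discrete and continuous nature of Zi can be illustrated with a small Monte Carlo sketch. It is not the Plabsoft simulation used in this study; it assumes the simplest case, a DH line derived from an F1 (a single meiosis), with crossovers on the gamete placed by a Poisson process with one expected crossover per Morgan. Function and variable names are hypothetical.

```python
import math
import random
import statistics

def simulate_dh_chromosome(length, rng):
    """Fraction of one F1 gamete (= DH line chromosome) that stems from P1."""
    pos = 0.0
    from_p1 = rng.random() < 0.5          # parental origin at the chromosome start
    p1_length = 0.0
    while True:
        step = rng.expovariate(1.0)       # distance to the next crossover (Morgans)
        nxt = min(pos + step, length)
        if from_p1:
            p1_length += nxt - pos
        if nxt >= length:
            break
        pos = nxt
        from_p1 = not from_p1             # a crossover switches the parental origin
    return p1_length / length

rng = random.Random(1)
length = 1.6
zs = [simulate_dh_chromosome(length, rng) for _ in range(100_000)]
var_hat = statistics.pvariance(zs)
boundary_frac = sum(z in (0.0, 1.0) for z in zs) / len(zs)

print(f"simulated Var(Zi)      = {var_hat:.4f}  (Table 3, F1-DH, t = 0: 0.1094)")
print(f"share with Zi in {{0, 1}} = {boundary_frac:.4f}  (exp(-l) = {math.exp(-length):.4f})")
```

In this setting the point masses at 0 and 1 together carry probability exp(−l), the Poisson probability of no crossover, which is about 0.20 for a chromosome of 1.6 M.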

Analytical results for the exact probability distribution of the parental genome contribution could be derived by employing the above considerations. However, the resulting equations would be rather unwieldy, and deriving important parameters such as quantiles directly from the density functions would require substantial, high-precision numerical computation. As an alternative, we suggest combining our relatively simple equations for the variance (Table 2) with a normal approximation.

DISCUSSION

Genetic model:

For all derivations we used the assumption of no interference (Stam 1979) underlying Haldane's (1919) mapping function. This is a simplified mathematical model and there exist more sophisticated models of crossover formation in meiosis, which fit experimental data better (McPeek and Speed 1995). Briefly, the advantages of the assumption of no interference are (1) mathematical simplicity, yielding equations that can be easily evaluated, and (2) that the results can be applied without knowing the exact amount of interference in the chromosome region under consideration. For a more detailed discussion concerning the use of the assumption of no interference see Frisch and Melchinger (2001).

Equation 3, defining the variance of the parental genome contribution in terms of the linkage disequilibrium D(u, v), and the formulas for D(u, v) in terms of the recombination frequency ruv presented in Table 1 hold true irrespective of the amount of interference. These formulas can be used with arbitrary mapping functions to derive the variance of the parental genome contribution under the assumption of interference. Presumably, analytical solutions such as those presented in the appendix cannot be derived for some mapping functions. In such cases, approximate solutions of Equation 3 can be obtained with the numerical integration routines of mathematical software packages.
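A numerical solution for a mapping function other than Haldane's can be sketched in a few lines. The example assumes F1-DH lines with D = (1 − 2r)/4, the averaging form (2 / l²) ∫ (l − d) D(d) dd for Equation 3, and Kosambi's function in its standard form r = tanh(2d)/2; these forms and all names are assumptions of the sketch, not taken from the source tables.

```python
import math

def haldane_r(d):
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def kosambi_r(d):
    return 0.5 * math.tanh(2.0 * d)

def var_zi_f1_dh(map_func, length, n=2000):
    """Var(Z_i) for F1-DH lines by Simpson's rule, for any mapping function."""
    h = length / n

    def f(d):
        return (length - d) * (1.0 - 2.0 * map_func(d)) / 4.0

    total = f(0.0) + f(length)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return 2.0 * (h / 3.0) * total / length ** 2

var_haldane = var_zi_f1_dh(haldane_r, 1.0)
var_kosambi = var_zi_f1_dh(kosambi_r, 1.0)
print(f"Haldane: {var_haldane:.4f}   Kosambi: {var_kosambi:.4f}")
```

For this scheme Kosambi's function yields the smaller variance, because it assigns a larger recombination frequency than Haldane's function to every map distance.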

Compared with no interference, positive interference results in a greater number of chromosome segments with intermediate length and a smaller number of very long or short chromosome segments. Therefore, positive interference will result in smaller variances of the parental genome contribution than those presented in our results. The opposite is the case for negative interference.

Comparison with previous studies:

Hill (1993) derived the variance of the parental genome contribution to backcross individuals. Each backcross individual receives from the recurrent backcross parent one set of homologous chromosomes, for which the variance of the parental genome contribution is zero. Hence, the variance of the parental genome contribution to backcross individuals is entirely determined by the variance of the parental genome contribution to the homologous chromosome set originating from the nonrecurrent parent. These homologous chromosomes are genetically identical to the chromosomes of DH lines derived from a backcross individual. In consequence, the variance derived by Hill (1993) for this chromosome set of backcross individuals equals Var(Zi) for BCt-DH lines.

Wang and Bernardo (2000) derived the variance V(kX) of marker estimates of parental genome contribution to F2- and BC1-SSD lines. They considered a finite number k of marker loci per chromosome and employed Kosambi's (1944) mapping function, $r_{uv} = \tfrac{1}{2}\tanh(2|u - v|)$. The major difference from our approach is that Wang and Bernardo (2000) obtain V(kX) by summing over a discrete number of marker loci, whereas we obtain Var(Zi) by integrating over an infinite number of genomic loci (Equation 3). The results on V(kX) and Var(Zi) can be related as follows. Inserting D(u, v) (Table 1) in Equation 3, but employing Kosambi's instead of Haldane's mapping function, yields $\lim_{k \to \infty} V(kX) = \mathrm{Var}(Z_i)$. In consequence, V(kX) of Wang and Bernardo (2000) converges to Var(Zi) for large numbers of markers on a chromosome (assuming that the same mapping function is employed).

Heckenberger et al. (2006) estimated the parental genome contribution to 102 F2-SSD and 11 BC1-SSD maize lines with 100 SSR and 1017 AFLP markers. They determined the standard deviations of the parental genome contribution (Table 4) and compared their results with computer simulations. The observed standard deviations were not significantly different (χ2 test with α = 0.05) from the simulated values. The standard deviations determined with Equation 2 as well as those obtained with the model of Wang and Bernardo (2000) were in good agreement with the experimental and simulated values (Table 4). In conclusion, both theoretical models fit the data set of Heckenberger et al. (2006) well.

TABLE 4.

Standard deviation $\sqrt{\mathrm{Var}(Z)}$ of the parental genome contribution to F2-SSD and BC1-SSD maize lines for experimental and simulated data (Heckenberger et al. 2006), the model of Wang and Bernardo (2000), and the model developed in this study

Mating system
F2-SSD BC1-SSD
Experimental data of Heckenberger et al. (2006)a
100 SSR markers 0.10 0.09
1017 AFLP markers 0.10 0.11
Simulations of Heckenberger et al. (2006)b
Entire genome 0.09 0.07
Model of Wang and Bernardo (2000)c
100 markers 0.088 0.076
1020 markers 0.090 0.078
∞ markers 0.090 0.078
Model of this study
Entire genome 0.088 0.076
a

The linkage map consisted of 10 chromosomes of 1.70, 1.30, 1.06, 1.48, 1.28, 1.15, 1.14, 1.21, 0.99, and 0.91 M length.

b

The linkage map of the experimental data and a noninterference model was used for the simulations.

c

The “nonterminal marker model” of the authors was employed with 10 chromosomes of 1.22 M length and 10 SSRs or 102 AFLPs equally spaced on each chromosome.

Numerical results:

The variance of the parental genome contribution to a chromosome depends on the expected number of crossovers occurring on the chromosome during inbred line development. A large number of expected crossovers results in many small chromosome segments, whereas few crossovers result in few long chromosome segments. With few long segments, the probability that chromosomes with very large or very small parental genome contributions do occur is greater and, therefore, the variance of the parental genome contribution is greater than for many small segments.

The number of crossovers expected per meiosis on a chromosome equals its length in Morgan units. Therefore, the variance of the parental genome contribution is smaller for long chromosomes than for short chromosomes. This trend can be observed irrespective of the employed breeding scheme for inbred line development (Table 3).

The total number of crossovers occurring on a chromosome during inbred line development depends on the total number of meioses and, hence, the employed breeding scheme. Intermating or backcrossing prior to employing the SSD or DH method results in an increased total number of meioses and, therefore, in a smaller variance of the parental genome contribution (Table 3). Generating DH lines comprises only one meiosis, whereas in the SSD scheme one meiosis occurs in each selfing generation. Therefore, the variance of the parental genome contribution is greater for DH than for SSD lines.

Normal approximation:

A normal approximation is not expected to fit the distribution of the parental genome contribution for individual chromosomes well, because Zi = 0 and Zi = 1 can occur with rather high probabilities, especially for short chromosomes or when inbreds are generated by the DH method. However, the genomes of important crops consist of many chromosomes (9 in sugar beet, 10 in maize, and 21 in wheat). Therefore, the random variable describing the parental genome contribution to the entire genome is a sum of independent random variables for the individual chromosomes. According to the central limit theorem (Shao 1999) the probability distribution of a sum of a large number of random variables converges to a normal distribution, irrespective of the type of distributions of random variables that are summed up. As a consequence, theory suggests that a normal approximation of the probability distribution of the parental genome contribution to the entire genome should fit the true distribution well.

To investigate the fit of the normal approximation, we used the software Plabsoft (Maurer et al. 2004) to simulate the parental genome contribution to (a) one chromosome of 1.6 M length and (b) a model of the maize genome consisting of 10 chromosomes each of 1.6 M length for the F2-SSD and BC1-SSD mating schemes. The normal approximations fit the simulated distributions of the parental genome contribution for individual chromosomes only poorly (Figure 1). In contrast, the fit was very good for the simulated distribution of the entire genomes for both F2-SSD and BC1-SSD lines. Hence, our formulas for the variances, together with a normal approximation, provide a good means by which to investigate the distribution of the parental genome contribution in many applications in genetics and breeding.

Figure 1. Simulated distribution (histogram) and normal approximation (solid line) of the parental genome contribution to one chromosome of 1.6 M length (left) and to a model of the maize genome with 10 chromosomes of 1.6 M length (right) for F2-SSD lines (top) and BC1-SSD lines (bottom).

Application in plant variety protection:

An essentially derived variety is a cultivar or an inbred line, which is for the most part identical to one of its ancestors. Essentially derived varieties can be detected by comparing predictions of the parental genome contribution to inbred lines with threshold values. The variances of the parental genome contribution derived here can be employed together with the prediction method described in a companion article (Frisch and Melchinger 2006) to establish a test for detecting essentially derived varieties.

The first step of the test is to identify breeding schemes that are generally considered acceptable for inbred line development. For example, in wheat breeding in Europe, it is an accepted breeding scheme to cross a proprietary inbred line with a registered line cultivar of a competitor and to select a new line cultivar from the resulting population of F2-SSD lines. In contrast, deriving inbred lines from backcross populations is not accepted.

Then the null hypothesis, “An inbred line was derived using an accepted breeding scheme,” is tested. The critical value for the test is determined from the quantiles of a normal approximation of the distribution of the parental genome contribution under the null hypothesis. For example, in wheat, the 0.99 quantile of the parental genome contribution to F2-SSD lines is 0.638 (Table 5). As test statistic, the genome contribution of the parental line that is assumed to be plagiarized to the putative essentially derived variety is determined by using the prediction method of Frisch and Melchinger (2006). If the test statistic is greater than the critical value, then the null hypothesis is rejected and plagiarism is assumed. (Of course, the accused breeder always has the possibility to prove that an accepted method was employed, e.g., by disclosing the breeding records.)

TABLE 5.

Quantiles of the parental genome contribution for models of the genomes of sugar beet (9 chromosomes of 1.0 M length), maize (10 chromosomes of 1.6 M length), and wheat (21 chromosomes of 1.8 M length)

Quantile
Mating system 0.005 0.025 0.050 0.900 0.950 0.975 0.990 0.995 0.999
Sugar beet model: 9 chromosomes of length 1 M
F2-SSD 0.216 0.284 0.319 0.641 0.681 0.716 0.756 0.784 0.840
(F2)t-SSD (t = 2) 0.257 0.315 0.345 0.621 0.655 0.685 0.719 0.743 0.791
F1-DH 0.177 0.254 0.293 0.661 0.707 0.746 0.792 0.823 0.888
(F1)t-DH (t = 2) 0.228 0.293 0.327 0.635 0.673 0.707 0.745 0.772 0.826
BC1-SSD 0.504 0.563 0.593 0.872 0.907 0.937 0.972 0.996 1.000
BC1-DH 0.486 0.549 0.581 0.881 0.919 0.951 0.988 1.000 1.000
Maize model: 10 chromosomes of length 1.6 M
F2-SSD 0.268 0.324 0.352 0.615 0.648 0.676 0.709 0.732 0.778
(F2)t-SSD (t = 2) 0.307 0.353 0.377 0.596 0.623 0.647 0.674 0.693 0.731
F1-DH 0.231 0.295 0.328 0.634 0.672 0.705 0.743 0.769 0.823
(F1)t-DH (t = 2) 0.281 0.334 0.360 0.609 0.640 0.666 0.697 0.719 0.762
BC1-SSD 0.549 0.597 0.622 0.850 0.878 0.903 0.931 0.951 0.991
BC1-DH 0.533 0.585 0.611 0.858 0.889 0.915 0.946 0.967 1.000
Wheat model: 21 chromosomes of length 1.8 M
F2-SSD 0.347 0.383 0.402 0.576 0.598 0.617 0.638 0.653 0.684
(F2)t-SSD (t = 2) 0.373 0.403 0.419 0.563 0.581 0.597 0.615 0.627 0.652
F1-DH 0.321 0.364 0.386 0.589 0.614 0.636 0.662 0.679 0.715
(F1)t-DH (t = 2) 0.356 0.390 0.408 0.572 0.592 0.610 0.630 0.644 0.673
BC1-SSD 0.617 0.649 0.665 0.816 0.835 0.851 0.870 0.883 0.909
BC1-DH 0.606 0.640 0.658 0.822 0.842 0.860 0.880 0.894 0.923

For use as threshold values, we determined quantiles of the parental genome contribution for model genomes of sugar beet (9 chromosomes of 1.0 M length), maize (10 chromosomes of 1.6 M length), and wheat (21 chromosomes of 1.8 M length) by employing normal approximations (Table 5). The upper quantiles were considerably lower for long genomes than for short ones, e.g., the 0.95 quantile for F2-SSD lines was 0.598 for wheat and 0.681 for sugar beet. Breeding schemes with intermating before inbred line development had slightly smaller 0.95 quantiles than the corresponding breeding schemes without intermating.
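The threshold values can be recomputed from the chromosome variances with a few lines of code. The sketch assumes equal chromosome lengths, so that Var(Z) = Var(Zi) / c, takes Var(Zi) from Table 3, uses a mean of 0.5 for F2-derived lines and 0.75 (recurrent parent) for BC1-derived lines, and clips quantiles to the interval [0, 1]. The clipping and the function name are assumptions made here.

```python
from statistics import NormalDist

def contribution_quantile(p, mean, chrom_var, n_chrom):
    """Quantile of the genome-wide parental contribution under a normal
    approximation, for n_chrom chromosomes of equal length."""
    sd = (chrom_var / n_chrom) ** 0.5
    q = NormalDist(mean, sd).inv_cdf(p)
    return min(1.0, max(0.0, q))          # a contribution cannot leave [0, 1]

wheat_f2 = contribution_quantile(0.99, 0.5, 0.0743, 21)    # Table 5: 0.638
beet_f2 = contribution_quantile(0.95, 0.5, 0.1091, 9)      # Table 5: 0.681
maize_bc1 = contribution_quantile(0.95, 0.75, 0.0607, 10)  # Table 5: 0.878
beet_bc1 = contribution_quantile(0.999, 0.75, 0.0818, 9)   # Table 5: 1.000

print(f"wheat F2-SSD, 0.99 quantile:  {wheat_f2:.3f}")
print(f"sugar beet F2-SSD, 0.95:      {beet_f2:.3f}")
print(f"maize BC1-SSD, 0.95:          {maize_bc1:.3f}")
print(f"sugar beet BC1-SSD, 0.999:    {beet_bc1:.3f}")
```

The same function can be applied to other species by supplying the appropriate chromosome number and the chromosome variance for the mating scheme of interest.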

The upper quantiles for F1-DH lines were considerably greater than those for F2-SSD lines. For example, the 0.95 quantile for F2-SSD lines of maize was 0.648, whereas for F1-DH lines it was 0.672 (Table 5). Typically, the expectation of the parental genome contribution is the criterion that determines acceptance or nonacceptance of a certain breeding scheme for inbred line development. The F2-SSD scheme is often suggested as an accepted breeding method for determining critical threshold values (cf. Heckenberger et al. 2006). If F2-SSD lines are considered acceptable, then F1-DH lines should also be considered acceptable, because both have an expected parental genome contribution of one-half. However, F1-DH lines have a considerably greater variance of the parental genome contribution (Table 3) and, consequently, greater upper quantiles (Table 5). Therefore, the F1-DH mating scheme seems in general more appropriate than the F2-SSD scheme for determining threshold values.

The test described above can be modified by using alternative test statistics and/or alternative methods to determine critical values. Alternative predictors of the parental genome contribution for use as test statistics were discussed by Frisch and Melchinger (2006), and alternative methods to determine critical threshold values were proposed by Smith et al. (1995), Wang and Bernardo (2000), and Heckenberger et al. (2005).

Smith et al. (1995) suggested employing fixed threshold values and proposed a parental genome contribution of 0.9 as threshold for maize lines. Compared with using fixed values as thresholds, our method has the advantage that it is genetically justified. For F2 and F1 derived lines of maize, the 0.999 quantiles of the parental genome contribution ranged between 0.73 and 0.82 (Table 5). In consequence, employing 0.9 as threshold value results in a low power of detecting backcross-derived inbreds.

Wang and Bernardo (2000) suggested determining threshold values using the variance of marker estimates of the parental genome contribution. Compared with the method of Wang and Bernardo (2000), our method has the advantage that the threshold values (Table 5) are independent of the employed set of molecular markers.

Heckenberger et al. (2005) suggested determining threshold values with computer simulations. Our results on the quantiles of the parental genome contribution for F2-SSD lines of maize were in good agreement with the corresponding results of Heckenberger et al. (2005). However, our method has the advantage that no computer simulations are required.

Application in selection theory:

Selection for parental marker alleles in backcross populations was investigated and a comprehensive selection theory was developed by Frisch and Melchinger (2005). That approach takes into account (a) the exact distribution of the parental genome contribution and (b) that selection for the parental alleles at marker loci is actually an indirect selection for the parental alleles at all loci of the entire genome. However, such theory is not available for inbred lines developed with the mating schemes considered in this study. Using a simpler mathematical model that neglects (a) and (b), the variances of the parental genome contribution can be employed to estimate the response to selection.

We consider a population of inbred lines analyzed for a large number of polymorphic molecular markers that cover the entire genome without larger gaps (e.g., one marker per centimorgan). Selection is carried out for the alleles of one parental line and the marker score is regarded as the target trait for selection. Under these assumptions, an approximate pre-test estimate of the response to selection R can be obtained from standard selection theory (Falconer and Mackay 1996, p. 189, Equation 11.3),

$$R = i\,h^{2}\,\sigma_p \qquad (7)$$

where i is the selection intensity, h2 the heritability, and σp the square root of the phenotypic variance. Assigning a heritability of h2 = 1 for the markers and using the variance of the parental genome contribution as phenotypic variance we obtain

$$R = i\,\sqrt{\mathrm{Var}(Z)} \qquad (8)$$
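A worked example may help to gauge the size of the expected response. The sketch assumes R = i · sqrt(Var(Z)) as described in the text, truncation selection in a large population with a normally distributed marker score, and the standard expression i = φ(z) / p for the selection intensity at selected proportion p. The maize value Var(Z) = 0.0809 / 10 combines the Table 3 entry for F2-SSD lines with 10 chromosomes of 1.6 M; the function names are illustrative.

```python
from statistics import NormalDist

def selection_intensity(p):
    """Selection intensity for truncation selection of the best fraction p."""
    z = NormalDist().inv_cdf(1.0 - p)     # truncation point on the standard scale
    return NormalDist().pdf(z) / p

def response_to_selection(p, var_z):
    """Approximate response R = i * sqrt(Var(Z)), assuming h^2 = 1."""
    return selection_intensity(p) * var_z ** 0.5

var_z_maize = 0.0809 / 10                 # F2-SSD, 10 chromosomes of 1.6 M
r_maize = response_to_selection(0.10, var_z_maize)
print(f"i = {selection_intensity(0.10):.3f}, R = {r_maize:.3f}")
```

Under these assumptions, selecting the best 10% of F2-SSD maize lines shifts the mean parental genome contribution by roughly 0.16, i.e., from 0.50 to about 0.66.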

Further applications:

In addition to the above applications, the results presented are of general interest for breeders and geneticists because they allow comparison of the distribution of the parental genome contribution for alternative mating schemes.

For example, an important goal in second-cycle breeding is the development of inbred lines that share the general characteristics with one parental line and are improved by specific characteristics of a second crossing partner. Such derived lines are then used as a replacement for parental lines in a breeding program. As a rule of thumb, the breeder may attempt to derive lines with a parental genome contribution of 3/4 from the parental line, which should be replaced by the derived line. The probability distribution of the parental genome contribution can help to assess the suitability of mating schemes to deliver such inbred lines. For sugar beet, the overlap of the probability density functions of the parental genome contribution to F2-SSD and BC1-SSD lines is considerable (Figure 2) and it is possible to select lines with a parental genome contribution of 70–75% from an F2-derived population. In contrast, for wheat, F2-SSD lines with parental genome contributions of 3/4 or more from one crossing partner do occur only with an extremely small probability (Figure 2). Therefore, in wheat a BC1-derived population must be generated to be able to select lines with the desired parental genome contribution.
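The contrast between sugar beet and wheat can be quantified with the normal approximation. The sketch below computes the probability that a line carries at least 3/4 of the genome of one parent, assuming equal chromosome lengths, Var(Zi) from Table 3, and means of 0.5 (F2-SSD) and 0.75 (BC1-SSD, recurrent parent). The function name is hypothetical.

```python
from statistics import NormalDist

def prob_at_least(threshold, mean, chrom_var, n_chrom):
    """P(Z >= threshold) under a normal approximation with Var(Z) = Var(Z_i) / c."""
    sd = (chrom_var / n_chrom) ** 0.5
    return 1.0 - NormalDist(mean, sd).cdf(threshold)

p_beet_f2 = prob_at_least(0.75, 0.5, 0.1091, 9)     # sugar beet, F2-SSD
p_wheat_f2 = prob_at_least(0.75, 0.5, 0.0743, 21)   # wheat, F2-SSD
p_wheat_bc1 = prob_at_least(0.75, 0.75, 0.0558, 21) # wheat, BC1-SSD

print(f"sugar beet F2-SSD: {p_beet_f2:.4f}")
print(f"wheat F2-SSD:      {p_wheat_f2:.2e}")
print(f"wheat BC1-SSD:     {p_wheat_bc1:.2f}")
```

Under these assumptions about 1 in 100 F2-SSD lines of sugar beet reaches a contribution of 3/4, whereas in wheat the corresponding probability is on the order of 1 in 100,000, so a BC1-derived population is the practical route there.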

Figure 2. Probability density functions of the normal approximation of the parental genome contribution to F2-SSD and BC1-SSD lines in (left) sugar beet (9 chromosomes of 1 M length) and (right) wheat (21 chromosomes of 1.8 M length).

These examples demonstrate that our results can be used to assess the expected variation of the parental genome contribution in populations derived from planned crosses of parental lines, depending on the number and length of the chromosomes of the species. This information can help breeders and geneticists in the design of breeding programs and experiments.

Acknowledgments

We thank Frank M. Gumpert for checking the derivations in the appendix and an anonymous reviewer for helpful comments and suggestions.

APPENDIX

We derive the variance of the parental genome contribution to a chromosome according to Equation 3 for four mating systems.

BCt-DH lines:

Inserting D(u, v) for BCt-DH lines (Table 1) into Equation 3 yields

graphic file with name M41.gif (A1)

With

graphic file with name M42.gif (A2)

and

graphic file with name M43.gif (A3)

we get

graphic file with name M44.gif (A4)

BCt-SSD lines:

For BCt-SSD lines we have (Table 1)

graphic file with name M45.gif (A5)

We consider the second indefinite integral in Equation A5 for the case u < v and set

graphic file with name M46.gif (A6)

and with logarithmic integration we get

graphic file with name M47.gif (A7)

Applying the same principle to the case u > v we get

graphic file with name M48.gif (A8)

We now consider the first indefinite integral in Equation A5. Adding to the numerator

graphic file with name M49.gif (A9)

and applying

graphic file with name M50.gif (A10)

we get

graphic file with name M51.gif (A11)

Hence, we get for a fixed value of v,

graphic file with name M52.gif (A12)

where

graphic file with name M53.gif (A13)

For symmetry reasons

graphic file with name M54.gif (A14)

Employing the dilogarithm function (cf. Galassi et al. 2006)

graphic file with name M55.gif (A15)

we get

graphic file with name M56.gif (A16)

Using this and

graphic file with name M57.gif (A17)

yields

graphic file with name M58.gif (A18)

(F1)t-DH lines:

For (F1)t-DH lines we have (Table 1)

graphic file with name M59.gif (A19)

Consider u < v and set

graphic file with name M60.gif (A20)

then

graphic file with name M61.gif (A21)

With integration by substitution we get

graphic file with name M62.gif (A22)

In analogy we get for u > v

graphic file with name M63.gif (A23)

and therefrom for a fixed v

graphic file with name M64.gif (A24)

We have

graphic file with name M65.gif (A25)

For symmetry reasons

graphic file with name M66.gif (A26)

and therefrom we get

graphic file with name M67.gif (A27)

(F2)t-SSD lines:

Using the definition of D(u, v) from Table 1 and Equation A11 we get for a fixed value of v

graphic file with name M68.gif (A28)

where ξ1, ξ2, ξ3, ξ4, and ξ5 are defined in Equation A13, and therefrom

graphic file with name M69.gif (A29)

References

  1. Falconer, D. S., and T. C. F. Mackay, 1996. Introduction to Quantitative Genetics, Ed. 4. Longman Group, Harlow, UK.
  2. Frisch, M., and A. E. Melchinger, 2001. The length of the intact chromosome segment around a target gene in marker-assisted backcrossing. Genetics 157: 1343–1356.
  3. Frisch, M., and A. E. Melchinger, 2005. Selection theory for marker-assisted backcrossing. Genetics 170: 909–917.
  4. Frisch, M., and A. E. Melchinger, 2006. Marker-based prediction of the parental genome contribution to inbred lines derived from biparental crosses. Genetics 174: 795–803.
  5. Galassi, M., J. Davies, J. Theiler, B. Gough, G. Jungman et al., 2006. GNU Scientific Library Reference Manual, Ed. 2, V1.8. Network Theory, Bristol, UK.
  6. Haldane, J. B. S., 1919. The combination of linkage values and the calculation of distance between the loci of linkage factors. J. Genet. 8: 299–309.
  7. Heckenberger, M., M. Bohn, M. Frisch, H. P. Maurer and A. E. Melchinger, 2005. Identification of essentially derived varieties with molecular markers: an approach based on statistical test theory and computer simulations. Theor. Appl. Genet. 111: 598–608.
  8. Heckenberger, M., J. Muminovic, J. R. van der Voort, J. Peleman, M. Bohn and A. E. Melchinger, 2006. Identification of essentially derived varieties obtained from biparental crosses of homozygous lines. III. AFLP data from maize inbreds and comparison with SSR data. Mol. Breeding 17: 111–125.
  9. Hill, W. G., 1993. Variation in genetic composition in backcrossing programs. J. Hered. 84: 212–213.
  10. Karlin, S., 1968. A First Course in Stochastic Processes. Academic Press, New York.
  11. Kosambi, D. D., 1944. The estimation of the map distance from recombination values. Ann. Eugen. 12: 172–175.
  12. Maurer, H. P., A. E. Melchinger and M. Frisch, 2004. Plabsoft: Software for simulation and data analysis in plant breeding. Proceedings of the 17th Eucarpia General Congress, September 8–11, 2004, Tulln, Austria, pp. 359–362.
  13. McPeek, M. S., and T. P. Speed, 1995. Modeling interference in genetic recombination. Genetics 139: 1031–1044.
  14. Shao, J., 1999. Mathematical Statistics. Springer-Verlag, New York.
  15. Smith, J. S. C., D. S. Ertl and B. A. Orman, 1995. Identification of maize varieties, pp. 253–264 in Identification of Food-Grain Varieties, edited by C. W. Wrigley. American Association of Cereal Chemists, St. Paul, MN.
  16. Stam, P., 1979. Interference in genetic crossing over and chromosome mapping. Genetics 92: 573–594.
  17. Wang, J., and R. Bernardo, 2000. Variance of marker estimates of parental contribution to F2 and BC1-derived inbreds. Crop Sci. 40: 659–665.

Articles from Genetics are provided here courtesy of Oxford University Press
