Note on bias from averaging repeated measurements in heritability studies

Benjamin B Risk; Hongtu Zhu

doi:10.1073/pnas.1719250115

letter

Proc Natl Acad Sci USA. 2017 Dec 29;115(2):E122. doi: 10.1073/pnas.1719250115

PMCID: PMC5777080 PMID: 29288225

Ge et al. (1) consider the extension of Fisher’s classic model for heritability to the case where there are repeated measurements on subjects. One approach to analyzing repeated measurements is to average the observations. The authors show, empirically and via simulations, that averaging repeated measurements leads to underestimates of heritability. Some may find the bias revealed by Ge et al. (1) surprising because averaging is commonly justified in other settings, such as repeated-measures ANOVA. In this letter we detail the bias that arises from conflating measurement error and unique environmental variance. This elucidates the authors’ empirical findings, which represent a case with large measurement error exacerbated by having only two measurements per subject.

Consider the model for additive genetic, common environmental, and unique environmental components. We use the mixed-model formulation as in ref. 2 but include measurement error. For conciseness, we assume no nuisance covariates. Let $y_{ijk}$ be the $k$th measurement, $k = 1, \ldots, n$ (for simplicity, we assume the same $n$ for all subjects), for the $j$th individual in the $i$th family. Let $a_{i} \sim N(0, σ_{A}^{2})$ denote the additive genetic component, and for dizygotics (DZs) and siblings we add $a_{ij}$ as in ref. 2. Let $c_{i} \sim N(0, σ_{C}^{2})$ denote the common environmental component, $e_{ij} \sim N(0, σ_{E}^{2})$ the unique environmental component, and $ε_{ijk} \sim N(0, σ_{M}^{2})$ the measurement error (as the authors note, $ε_{ijk}$ can also include biological transients). For monozygotic twins (MZs),

$$y_{ijk} = a_{i} + c_{i} + e_{ij} + ε_{ijk},$$

and for DZs and full siblings,

$$y_{ijk} = \sqrt{0.5}\, a_{i} + \sqrt{0.5}\, a_{ij} + c_{i} + e_{ij} + ε_{ijk},$$

and, consequently, $\mathrm{Var}(y_{ijk}) = σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2} + σ_{M}^{2}$ in both cases. Ge et al. (1) observe that $σ_{E}^{2}$ and $σ_{M}^{2}$ are identifiable if the data are not averaged, and they define the measurement-error-corrected heritability $h^{2} = σ_{A}^{2} / (σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2})$.
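
The identifiability observation can be made explicit by writing out the second moments implied by the model. The following is a sketch derived from the stated model under the assumption that all random components are mutually independent and that $a_{ij}$ has the same variance as $a_{i}$; it is not taken from Ge et al. (1). Here $k \neq k'$ denotes two distinct measurements and $j \neq j'$ two distinct members of the same family.

$$
\begin{aligned}
\mathrm{Var}(y_{ijk}) &= σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2} + σ_{M}^{2}, \\
\mathrm{Cov}(y_{ijk}, y_{ijk'}) &= σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2}, \\
\mathrm{Cov}(y_{ijk}, y_{ij'k'}) &= σ_{A}^{2} + σ_{C}^{2} \quad \text{(MZs)}, \\
\mathrm{Cov}(y_{ijk}, y_{ij'k'}) &= \tfrac{1}{2} σ_{A}^{2} + σ_{C}^{2} \quad \text{(DZs and full siblings)}.
\end{aligned}
$$

The difference between the first two lines is $σ_{M}^{2}$, and the difference between the second and third is $σ_{E}^{2}$, so the within-subject covariance across repeated measurements is what separates measurement error from unique environmental variance.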

Now consider the averaged data, $\bar{y}_{ij\cdot} = \frac{1}{n} \sum_{k = 1}^{n} y_{ijk}$. For MZs,

$$\bar{y}_{ij\cdot} = a_{i} + c_{i} + e_{ij} + \frac{1}{n} \sum_{k = 1}^{n} ε_{ijk},$$

and then $\mathrm{Var}(\bar{y}_{ij\cdot}) = σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2} + \frac{1}{n} σ_{M}^{2}$, and similarly for DZs and full siblings. Now $σ_{E}^{2}$ and $σ_{M}^{2}$ are no longer identifiable, so they are conflated as a single term; define $σ_{F}^{2} = σ_{E}^{2} + \frac{1}{n} σ_{M}^{2}$. Consequently, $h_{bias}^{2} = σ_{A}^{2} / (σ_{A}^{2} + σ_{C}^{2} + σ_{F}^{2})$; equivalently, $h_{bias}^{2} = σ_{A}^{2} / (σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2} + \frac{1}{n} σ_{M}^{2})$.
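
The variance of the averaged data can be checked numerically. The following is a minimal simulation sketch, not the authors' code: it draws one MZ-model subject per family (so that subjects are independent), uses only the Python standard library, and the function name `simulate_subject_means` and the variance values are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_subject_means(num_subjects, n, var_a, var_c, var_e, var_m, seed=0):
    """Return each subject's mean over n repeated measurements (MZ model)."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_subjects):
        # Stable part a_i + c_i + e_ij: drawn once and shared by all n repeats.
        stable = (
            rng.gauss(0.0, var_a ** 0.5)
            + rng.gauss(0.0, var_c ** 0.5)
            + rng.gauss(0.0, var_e ** 0.5)
        )
        # Measurement errors eps_ijk: redrawn for every repeat k.
        errors = [rng.gauss(0.0, var_m ** 0.5) for _ in range(n)]
        means.append(stable + sum(errors) / n)
    return means


# Hypothetical variance components, chosen only for illustration.
VAR_A, VAR_C, VAR_E, VAR_M = 1.0, 0.5, 0.5, 2.0

for n in (1, 2, 8):
    means = simulate_subject_means(20000, n, VAR_A, VAR_C, VAR_E, VAR_M)
    empirical = statistics.variance(means)
    theoretical = VAR_A + VAR_C + VAR_E + VAR_M / n
    print(f"n={n}: empirical {empirical:.3f}, theoretical {theoretical:.3f}")
```

The empirical variance of the subject means should track $σ_{A}^{2} + σ_{C}^{2} + σ_{E}^{2} + \frac{1}{n} σ_{M}^{2}$ up to sampling noise; from the averages alone, nothing distinguishes the $σ_{E}^{2}$ contribution from the $\frac{1}{n} σ_{M}^{2}$ contribution.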

Thus, $h_{bias}^{2} \leq h^{2}$, and the bias in the averaged data decreases as the number of repeated measurements increases and as the measurement error decreases; $h_{bias}^{2}$ approaches $h^{2}$ as $n$ grows or as $σ_{M}^{2}$ goes to zero. Here we have examined only the model-implied heritabilities, focusing on identifiability. In practice, maximum likelihood or other estimators may introduce additional biases, but the simple analysis here suffices to reveal the crux of the issue. The authors highlight a case with high variability and only two measurements ($n = 2$), and hence the differences between approaches are large. We commend the authors for pointing out the empirical differences that can arise but also note the role of the number of repeated measurements.
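
The dependence of the bias on the number of repeated measurements can be tabulated directly from the two heritability formulas. This is a small sketch under assumed values: the function names `corrected_h2` and `averaged_h2` and the variance components are hypothetical and are not estimates from Ge et al. (1).

```python
def corrected_h2(var_a, var_c, var_e):
    """Measurement-error-corrected heritability h^2."""
    return var_a / (var_a + var_c + var_e)


def averaged_h2(var_a, var_c, var_e, var_m, n):
    """Heritability implied by averaged data, h^2_bias, with n repeats."""
    return var_a / (var_a + var_c + var_e + var_m / n)


# Hypothetical variance components, chosen only for illustration.
VAR_A, VAR_C, VAR_E, VAR_M = 1.0, 0.5, 0.5, 2.0

target = corrected_h2(VAR_A, VAR_C, VAR_E)
print(f"corrected h^2 = {target:.4f}")
for n in (1, 2, 4, 8, 16, 100):
    biased = averaged_h2(VAR_A, VAR_C, VAR_E, VAR_M, n)
    # The gap target - biased is the downward bias from averaging.
    print(f"n={n:3d}: h^2_bias = {biased:.4f}, bias = {target - biased:.4f}")
```

With these assumed values the corrected heritability is 0.5, whereas averaging two measurements gives 1/3; the gap shrinks monotonically as $n$ increases.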

Footnotes

The authors declare no conflict of interest.

References

1. Ge T, Holmes AJ, Buckner RL, Smoller JW, Sabuncu MR. Heritability analysis with repeat measurements and its application to resting-state functional connectivity. Proc Natl Acad Sci USA. 2017;114:5521–5526. doi: 10.1073/pnas.1700765114.
2. Rabe-Hesketh S, Skrondal A, Gjessing H. Biometrical modeling of twin and family data using standard mixed model software. Biometrics. 2008;64:280–288. doi: 10.1111/j.1541-0420.2007.00803.x.
