Proceedings of the National Academy of Sciences of the United States of America
Letter. 2017 Dec 29;115(2):E122. doi: 10.1073/pnas.1719250115

Note on bias from averaging repeated measurements in heritability studies

Benjamin B Risk, Hongtu Zhu
PMCID: PMC5777080  PMID: 29288225

Ge et al. (1) consider the extension of Fisher’s classic model for heritability to the case where there are repeated measurements on subjects. One approach to analyzing repeated measurements is to average observations. The authors show empirically and via simulations that averaging repeated measurements leads to underestimates of heritability. Some may find the bias revealed by Ge et al. (1) surprising because averaging is commonly justified in other settings, such as repeated-measures ANOVA. In this letter we detail the bias that arises from conflating measurement error and unique environmental variance. This elucidates the authors’ empirical findings, which represent a case with large measurement error exacerbated by only two measurements per subject.

Consider the model for additive genetic, common environmental, and unique environmental components. We use the mixed-model formulation as in ref. 2 but include measurement error. For conciseness, we assume no nuisance covariates. Let $y_{ijk}$ be the $k$th measurement, $k = 1, \ldots, n$ (for simplicity, we assume the same $n$ for all subjects), for the $j$th individual in the $i$th family. Let $a_i \sim N(0, \sigma_A^2)$ denote the additive genetic component, and for dizygotics (DZs) and siblings we add $a_{ij}$ as in ref. 2. Let $c_i \sim N(0, \sigma_C^2)$ denote the common environmental component, $e_{ij} \sim N(0, \sigma_E^2)$ the unique environmental component, and $\varepsilon_{ijk} \sim N(0, \sigma_M^2)$ the measurement error (as the authors note, $\varepsilon_{ijk}$ can also include biological transients). For monozygotic twins (MZs),

$$y_{ijk} = a_i + c_i + e_{ij} + \varepsilon_{ijk}$$

and for DZs and full siblings,

$$y_{ijk} = \sqrt{0.5}\,a_i + \sqrt{0.5}\,a_{ij} + c_i + e_{ij} + \varepsilon_{ijk},$$

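and, consequently, $\mathrm{Var}(y_{ijk}) = \sigma_A^2 + \sigma_C^2 + \sigma_E^2 + \sigma_M^2$ in both cases. Ge et al. (1) observe that $\sigma_E^2$ and $\sigma_M^2$ are identifiable if the data are not averaged, and they define the measurement-error-corrected heritability $h^2 = \sigma_A^2 / (\sigma_A^2 + \sigma_C^2 + \sigma_E^2)$.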
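The claim that the total variance is the same for MZs and for DZs or siblings can be checked numerically. The sketch below is only illustrative: the variance components are hypothetical values, the function names are made up here, and it assumes that the two additive genetic terms for DZs and siblings each carry a coefficient of $\sqrt{0.5}$ with $a_{ij} \sim N(0, \sigma_A^2)$, which is the scaling under which the stated total variance holds.

```python
import math
import random
import statistics

# Hypothetical variance components (not taken from the letter); they sum to 1.0.
VAR_A, VAR_C, VAR_E, VAR_M = 0.4, 0.1, 0.3, 0.2
TOTAL_VAR = VAR_A + VAR_C + VAR_E + VAR_M


def simulate_measurement(rng, zygosity):
    """Draw a single measurement y_ijk for one individual."""
    a_family = rng.gauss(0.0, math.sqrt(VAR_A))   # a_i
    c = rng.gauss(0.0, math.sqrt(VAR_C))          # c_i
    e = rng.gauss(0.0, math.sqrt(VAR_E))          # e_ij
    eps = rng.gauss(0.0, math.sqrt(VAR_M))        # eps_ijk
    if zygosity == "MZ":
        genetic = a_family
    else:
        # DZ or full sibling: assumed sqrt(0.5) weights on a_i and a_ij.
        a_individual = rng.gauss(0.0, math.sqrt(VAR_A))  # a_ij
        genetic = math.sqrt(0.5) * a_family + math.sqrt(0.5) * a_individual
    return genetic + c + e + eps


def empirical_variance(zygosity, n_draws=100_000, seed=0):
    """Monte Carlo estimate of Var(y_ijk) from independent individuals."""
    rng = random.Random(seed)
    draws = [simulate_measurement(rng, zygosity) for _ in range(n_draws)]
    return statistics.pvariance(draws)


for zygosity in ("MZ", "DZ"):
    estimate = empirical_variance(zygosity)
    print(f"{zygosity}: empirical variance {estimate:.3f}, model value {TOTAL_VAR:.3f}")
```

Both estimates should land close to the model value up to Monte Carlo error; with weights of 0.5 instead of $\sqrt{0.5}$, the DZ variance would fall short by half of the additive genetic variance.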

Now consider $\bar{y}_{ij\cdot} = \frac{1}{n}\sum_{k=1}^{n} y_{ijk}$ (the averaged data). For MZs,

$$\bar{y}_{ij\cdot} = a_i + c_i + e_{ij} + \frac{1}{n}\sum_{k=1}^{n} \varepsilon_{ijk},$$

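and then $\mathrm{Var}(\bar{y}_{ij\cdot}) = \sigma_A^2 + \sigma_C^2 + \sigma_E^2 + \frac{1}{n}\sigma_M^2$, and similarly for DZs and full siblings. Now $\sigma_E^2$ and $\sigma_M^2$ are no longer identifiable, so they are conflated as a single term; define $\sigma_F^2 = \sigma_E^2 + \frac{1}{n}\sigma_M^2$. Consequently, $h_{\mathrm{bias}}^2 = \sigma_A^2/(\sigma_A^2 + \sigma_C^2 + \sigma_F^2)$; equivalently, $h_{\mathrm{bias}}^2 = \sigma_A^2/(\sigma_A^2 + \sigma_C^2 + \sigma_E^2 + \frac{1}{n}\sigma_M^2)$.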
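To make the two definitions concrete, the following minimal sketch evaluates the corrected heritability and the heritability implied by the averaged data for several values of n. The function names and the variance components are hypothetical and chosen only for illustration; the code evaluates the model-based formulas and does not fit any estimator.

```python
def corrected_heritability(var_a, var_c, var_e):
    """Measurement-error-corrected heritability: sigma_A^2 / (sigma_A^2 + sigma_C^2 + sigma_E^2)."""
    return var_a / (var_a + var_c + var_e)


def averaged_heritability(var_a, var_c, var_e, var_m, n):
    """Heritability implied by the model for averages of n repeated measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    var_f = var_e + var_m / n  # conflated term sigma_F^2
    return var_a / (var_a + var_c + var_f)


# Hypothetical variance components, not taken from the letter or from Ge et al.
VAR_A, VAR_C, VAR_E, VAR_M = 0.4, 0.1, 0.5, 1.0

h2 = corrected_heritability(VAR_A, VAR_C, VAR_E)
print(f"corrected h2 = {h2:.3f}")
for n in (1, 2, 5, 10, 100):
    h2_bias = averaged_heritability(VAR_A, VAR_C, VAR_E, VAR_M, n)
    print(f"n = {n:3d}: h2_bias = {h2_bias:.3f}, shortfall = {h2 - h2_bias:.3f}")
```

With these made-up values the shortfall shrinks steadily as n grows, and it would vanish entirely if the measurement-error variance were zero.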

Thus, we see that the bias in the averaged data decreases as the number of repeated measurements increases, as well as when the measurement error decreases. Here we have only examined the model-based heritabilities based on identifiability issues, whereas in practice maximum likelihood or other estimators may introduce some additional biases, but the simple analysis here suffices to reveal the crux of the issue. The authors highlight a case with high variability and only two measurements (n = 2), and hence the differences between approaches are large. We commend the authors for pointing out the empirical differences that can arise but also note the role of the number of repeated measurements.
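The dependence of the bias on the number of measurements and on the measurement-error variance can be written out explicitly. The shorthand $\sigma_P^2$ for the error-free phenotypic variance is introduced here for convenience and is not notation from the letter or from Ge et al. (1); the steps are elementary algebra on the two heritability definitions.

```latex
\begin{align*}
\sigma_P^2 &:= \sigma_A^2 + \sigma_C^2 + \sigma_E^2, \\
h^2 - h_{\mathrm{bias}}^2
  &= \frac{\sigma_A^2}{\sigma_P^2}
   - \frac{\sigma_A^2}{\sigma_P^2 + \tfrac{1}{n}\sigma_M^2} \\
  &= \frac{\sigma_A^2}{\sigma_P^2} \cdot
     \frac{\tfrac{1}{n}\sigma_M^2}{\sigma_P^2 + \tfrac{1}{n}\sigma_M^2} \\
  &= h^2 \, \frac{\sigma_M^2}{n\,\sigma_P^2 + \sigma_M^2}.
\end{align*}
```

The shortfall is therefore a fraction $\sigma_M^2/(n\sigma_P^2 + \sigma_M^2)$ of the corrected heritability, which tends to zero as $n \to \infty$ or as $\sigma_M^2 \to 0$ and is largest when $n$ is small and measurement error is large.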

Footnotes

The authors declare no conflict of interest.

References

  • 1. Ge T, Holmes AJ, Buckner RL, Smoller JW, Sabuncu MR. Heritability analysis with repeat measurements and its application to resting-state functional connectivity. Proc Natl Acad Sci USA. 2017;114:5521–5526. doi: 10.1073/pnas.1700765114.
  • 2. Rabe-Hesketh S, Skrondal A, Gjessing H. Biometrical modeling of twin and family data using standard mixed model software. Biometrics. 2008;64:280–288. doi: 10.1111/j.1541-0420.2007.00803.x.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
