Nature Communications. 2018 Jun 29;9:2537. doi: 10.1038/s41467-018-04807-3

Misestimation of heritability and prediction accuracy of male-pattern baldness

Chloe X Yap 1, Julia Sidorenko 1,2, Riccardo E Marioni 3,4, Loic Yengo 1, Naomi R Wray 1,5, Peter M Visscher 1,5
PMCID: PMC6026149  PMID: 29959328

Pirastu et al. (ref. 1) perform the largest GWAS to date on male-pattern baldness (MPB), discover 71 loci (of which 30 are new) and draw inferences about its heritability and genetic architecture. They report a SNP heritability on the scale of liability (h_l^2) of 94%, with 38% of total heritability explained by the 71 loci. From these estimates, they draw strong conclusions about the genetic architecture of MPB. However, the chosen definition of the phenotype and the transformation applied to the unobserved scale of liability have led to a large upward bias in the estimates of these parameters, as we show here from theory and from data.

In the UK Biobank (UKB), MPB is measured on a four-point ordinal scale (values 1–4, with 1 representing no sign of baldness). Using the same UKB sub-sample selection as Pirastu et al. (unrelated British, genetically Caucasian, n = 54,813), the proportions of men with self-reported MPB scores of 1, 2, 3 and 4 are 0.317, 0.229, 0.269 and 0.185, respectively. In their analysis, the authors ignore the 23% of the population with a score of 2, and define ‘cases’ as those with self-reported scores of 3 or 4 and ‘controls’ as those with a self-reported score of 1, leading to a ‘prevalence’ of 59%. Yet the reported h_l^2 estimates are presented as if they were parameters of the whole population. An implicit assumption of their approach is that those self-reporting a score of 2, which they consider to be ‘rather dubious baldness’, are randomly drawn from the population. To determine whether this assumption is valid, we took the 47 most associated independent autosomal loci that were identified independently of the UKB data (refs. 2–6,10; to avoid bias) and then used the same UKB data as Pirastu et al. to estimate the frequencies of the trait-increasing alleles for each of the four scores. The results (Fig. 1) show that these frequencies are approximately linear across scores 1–4, so score 2 is clearly not random with respect to liability. Moreover, the observed pattern is consistent with an additive model on the scale of these scores. Therefore, since a score of 2 is correlated with liability to MPB, ignoring individuals with a score of 2, without accounting for the resulting extreme-tail ascertainment, will bias the estimates of genetic parameters. We derived from theory the general transformation equation that should be applied to the estimate of heritability made on the binary observed scale in samples that are ascertained by tail selection and/or oversampling of cases or controls (h_{o[s]}^2) to achieve unbiased estimates of h_l^2 (equation [1] in Supplementary Methods).
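The sampling quantities implied by this case and control definition follow directly from the four category proportions. The short sketch below recomputes them; the function and variable names are illustrative only and do not come from Pirastu et al. or from the Supplementary Methods.

```python
# Category proportions for MPB scores 1-4 in the UKB sub-sample (from the text).
score_props = {1: 0.317, 2: 0.229, 3: 0.269, 4: 0.185}


def tail_design(props, control_scores, case_scores):
    """Return (K_L, K_U, P) for a case-control definition on ordinal scores.

    K_L: population proportion in the lower tail (controls).
    K_U: population proportion in the upper tail (cases).
    P:   proportion of cases among the individuals retained for analysis.
    """
    k_lower = sum(props[s] for s in control_scores)
    k_upper = sum(props[s] for s in case_scores)
    p_cases = k_upper / (k_lower + k_upper)
    return k_lower, k_upper, p_cases


# Design described in the text: controls = score 1, cases = scores 3 or 4.
k_l, k_u, p = tail_design(score_props, control_scores=[1], case_scores=[3, 4])
dropped = 1.0 - k_l - k_u  # share of the population with score 2, left out
print(f"K_L = {k_l:.3f}, K_U = {k_u:.3f}, P = {p:.3f}, dropped = {dropped:.3f}")
```

With these inputs, P comes out at about 0.59, the ‘prevalence’ quoted above, while only about 45% of the whole population actually has a score of 3 or 4.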

Fig. 1. Trait-increasing allele frequency by MPB score in UKB for 47 genome-wide significant GWAS loci identified in refs. 2–6,10. For each of the 47 loci, the trait-increasing allele frequency in the UK Biobank sample is given on the y-axis, as a deviation from its frequency in men with an MPB score of 1. The x-axis labels represent the observed MPB categories in the UK Biobank.

We first replicated the results of Pirastu et al., using their sampling design and model (as best as we could deduce from the details provided) and the same UK Biobank data. The GCTA (ref. 7) estimate of h_{o[s]}^2 for scores 3 + 4 vs. score 1 was 0.61 (standard error, s.e. = 0.03). If this is transformed to the scale of liability using the standard equation (ref. 8; equation [2] in Supplementary Methods), the estimate of h_l^2 is 0.98 (s.e. = 0.04), similar to the estimate reported by Pirastu et al. However, the correct transformation (equation [1] in Supplementary Methods) generates an estimate of 0.64 (s.e. = 0.03). To explore the assumptions of the liability threshold model empirically, we analysed random samples of 20,000 males dichotomised in a number of ways (Table 1). These analyses generated estimates of h_l^2 in the range 0.61–0.75. We also analysed MPB on the continuous scale of 1–4, which does not remove information through dichotomisation. Transforming this estimate of heritability to the liability scale (ref. 9; equation [3] in Supplementary Methods) gave h_l^2 = 0.69 (s.e. = 0.03).
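To illustrate where a value near 0.98 can come from, the sketch below applies the widely used observed-to-liability transformation of ref. 8, h_l^2 = h_o^2 K^2 (1 − K)^2 / [z^2 P (1 − P)], where K is the assumed population prevalence, P is the proportion of cases in the sample and z is the height of the standard normal density at the liability threshold. Setting K = P = 0.59 is an assumption made here about how the uncorrected figure arises. The sketch does not implement the corrected transformation (equation [1]), which is given only in the Supplementary Methods.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs, K, P):
    """Standard transformation of an observed-scale h2 to the liability scale.

    h2_obs: heritability estimated on the binary (0/1) observed scale.
    K:      assumed population prevalence of cases.
    P:      proportion of cases in the analysed sample.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - K)  # liability threshold for prevalence K
    z = std_normal.pdf(threshold)            # normal density at the threshold
    return h2_obs * (K * (1.0 - K)) ** 2 / (z ** 2 * P * (1.0 - P))


# Assumption: the sample case proportion (0.59) is also used as the prevalence.
h2_liab = observed_to_liability(0.61, K=0.59, P=0.59)
print(f"h_l^2 under the standard transformation: {h2_liab:.2f}")
```

Under these assumptions the result is roughly 0.98. The same function applied to a dichotomy without tail selection, for example h_o^2 = 0.36 with K = P = 0.19, returns approximately 0.75.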

Table 1. Estimates of heritability of liability of MPB using different random samples of 20,000 men ascertained in different ways

| MPB scores for cases | MPB scores for controls | K_L | K_U | P | h_{o[s]}^2 (s.e.) | h_l^2 (s.e.) | R^2 (a) |
|---|---|---|---|---|---|---|---|
| 4 | 1,2,3 | 0.81 | 0.19 | 0.19 | 0.36 (0.03) | 0.75 (0.06) | 0.15 |
| 3,4 | 1,2 | 0.54 | 0.46 | 0.46 | 0.46 (0.03) | 0.72 (0.05) | 0.16 |
| 2,3,4 | 1 | 0.68 | 0.32 | 0.32 | 0.41 (0.03) | 0.70 (0.05) | 0.17 |
| 3,4 (b) | 1 | 0.32 | 0.46 | 0.59 | 0.61 (0.03) | 0.64 (0.03) | 0.16 |
| 4 | 1 | 0.32 | 0.19 | 0.37 | 0.96 (0.03) | 0.63 (0.02) | 0.13 |
| Quantitative | 1,2,3,4 | | | | 0.59 (0.03) | 0.69 (0.03) | 0.16 |

K_L: proportion of the population in the lower tail (extreme controls). K_U: proportion of the population in the upper tail (cases). P: proportion of the samples used for analyses that are cases.

(a) Proportion of variance in liability explained by the 107-SNP predictor.

(b) The sampling strategy conducted by Pirastu et al.
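The ‘Quantitative’ row can be approximated with a common form of the threshold-model transformation for ordered categories: the observed-scale heritability is multiplied by the variance of the scores and divided by the squared sum of normal density heights at the category thresholds, each weighted by the score step. Whether this matches equation [3] in the Supplementary Methods exactly is an assumption, and the sketch uses the whole sub-sample category proportions rather than those of the analysed random sample.

```python
from statistics import NormalDist


def ordinal_to_liability(h2_obs, props, scores):
    """Transform h2 estimated on ordinal scores to the liability scale.

    props:  population proportion in each ordered category (sums to 1).
    scores: numeric score assigned to each category, in the same order.
    """
    std_normal = NormalDist()
    mean = sum(p * s for p, s in zip(props, scores))
    variance = sum(p * (s - mean) ** 2 for p, s in zip(props, scores))

    # Regression of score on liability: density at each threshold times the
    # jump in score when that threshold is crossed.
    slope = 0.0
    cumulative = 0.0
    for i in range(len(props) - 1):
        cumulative += props[i]
        threshold = std_normal.inv_cdf(cumulative)
        slope += std_normal.pdf(threshold) * (scores[i + 1] - scores[i])

    return h2_obs * variance / slope ** 2


h2_liab = ordinal_to_liability(0.59, [0.317, 0.229, 0.269, 0.185], [1, 2, 3, 4])
print(f"h_l^2 from the quantitative analysis: {h2_liab:.2f}")
```

Under these assumptions the result is close to the 0.69 reported in the table.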

We estimated the variance explained by the 107-SNP predictor from the difference in the estimate of total phenotypic variance between models excluding and including the predictor as a fixed effect. This method for estimating the contribution of the SNP predictor to trait variation differs from that presented by Pirastu et al. In contrast to their approach, it does not depend on unbiased estimation of genetic variance in the two models. Moreover, it is accurate (the s.e. of an estimated phenotypic variance is small) and quantifies the parameter that is most relevant to epidemiology and risk prediction. From the estimate of the variance explained by the predictor, we calculated the proportion of variance it explained on the observed scale and then transformed this proportion to the scale of liability. The results (Table 1) imply that the variance in liability attributable to this predictor is ~15–20%, substantially less than claimed by the authors.

In conclusion, the evidence presented by Pirastu et al. is not consistent with the claims that virtually all variation in liability to MPB is genetic and that common SNPs capture all of that variation. A correct transformation from the observed scale to the scale of liability results in an estimate of SNP heritability of ~60–70%, and the 71 loci (107-SNP predictor) explain about 15–20% of the variation in liability.

Acknowledgements

This research has been conducted using the UK Biobank Resource under project 12514.

Author contributions

P.M.V. and N.R.W. designed the experiment and derived theory. C.Y., J.S., R.E.M., and L.Y. performed analyses, and all authors contributed to writing the paper.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Change history

11/20/2018

The original version of this Article contained an error in the spelling of the author Julia Sidorenko, which was incorrectly given as Julia Sirodenko. This has now been corrected in both the PDF and HTML versions of the Article. Further, the sixth sentence of the second paragraph of the Correspondence and the legend to Fig. 1 incorrectly omitted citation of work by Heilmann-Helmbach, S. et al. This has now been corrected in both the PDF and HTML versions of the Article.

Electronic supplementary material

Supplementary Information accompanies this paper at 10.1038/s41467-018-04807-3.

References

1. Pirastu N, et al. GWAS for male-pattern baldness identifies 71 susceptibility loci explaining 38% of the risk. Nat. Commun. 2017;8:1584. doi: 10.1038/s41467-017-01490-8.
2. Li R, et al. Six novel susceptibility loci for early-onset androgenetic alopecia and their unexpected association with common diseases. PLOS Genet. 2012;8:e1002746. doi: 10.1371/journal.pgen.1002746.
3. Hillmer AM, et al. Susceptibility variants for male-pattern baldness on chromosome 20p11. Nat. Genet. 2008;40:1279–1281. doi: 10.1038/ng.228.
4. Brockschmidt FF, et al. Susceptibility variants on chromosome 7p21.1 suggest HDAC9 as a new candidate gene for male-pattern baldness. Br. J. Dermatol. 2011;165:1293–1302. doi: 10.1111/j.1365-2133.2011.10708.x.
5. Richards JB, et al. Male-pattern baldness susceptibility locus at 20p11. Nat. Genet. 2008;40:1282–1284. doi: 10.1038/ng.255.
6. Heilmann S, et al. Androgenetic alopecia: identification of four genetic risk loci and evidence for the contribution of WNT signaling to its etiology. J. Invest. Dermatol. 2013;133:1489–1496. doi: 10.1038/jid.2013.43.
7. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 2011;88:76–82. doi: 10.1016/j.ajhg.2010.11.011.
8. Lee SH, Wray NR, Goddard ME, Visscher PM. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 2011;88:294–305. doi: 10.1016/j.ajhg.2011.02.002.
9. Gianola D. Heritability of polychotomous characters. Genetics. 1979;93:1051–1055. doi: 10.1093/genetics/93.4.1051.
10. Heilmann-Helmbach S, et al. Meta-analysis identifies novel risk loci and yields systematic insights into the biology of male-pattern baldness. Nat. Commun. 2017;8:14694.

Articles from Nature Communications are provided here courtesy of Nature Publishing Group
