Author manuscript; available in PMC: 2020 Mar 17.
Published in final edited form as: Behav Genet. 2019 Nov 11;50(1):67–71. doi: 10.1007/s10519-019-09979-2

No evidence for social genetic effects or genetic similarity among friends beyond that due to population stratification: a reappraisal of Domingue et al (2018)

Loic Yengo a, Morgan Sidari b, Karin J H Verweij c, Peter M Visscher a, Matthew C Keller d,e, Brendan P Zietsch b,*
PMCID: PMC7077882  NIHMSID: NIHMS1559958  PMID: 31713005

Abstract

Using data from 5,500 adolescents from the National Longitudinal Study of Adolescent to Adult Health, Domingue et al. (2018) claimed to show that friends are genetically more similar to one another than randomly selected peers, beyond the confounding effects of population stratification by ancestry. The authors also claimed to show ‘social-genetic’ effects, whereby individuals’ educational attainment (EA) is influenced by their friends’ genes. We argue that neither claim is justified by the data. Mathematically we show that 1) the genetic similarity reported between friends is far larger than theoretically possible if it was caused by phenotypic assortment as the authors claim; uncontrolled population stratification is a likely reason for the genetic similarity they observed, and 2) significant association between individuals’ EA and their friends’ polygenic scores for EA is a necessary consequence of EA similarity among friends, and does not provide evidence for social-genetic effects. Going forward, we urge caution in the analysis and interpretation of data at the intersection of human genetics and the social sciences.


The availability of large samples of individuals with genome-wide genetic data in combination with behavioural phenotypes and social outcomes has led to a resurgence in research that addresses questions at the interface of genetics and the social sciences. Some of that research is hypothesis driven while much of it is data-driven and hypothesis-generating. The genetics and statistical analysis of human traits has a solid underpinning theory in quantitative and population genetics (Lynch & Walsh, 1998; Walsh & Lynch, 2018), and rigorous benchmarking against these underpinnings is essential – especially when novel or unexpected results in human behaviour are reported. In this paper, we highlight one example (and list others) where novel results and claims are not justified by the data presented and instead have alternative and more parsimonious explanations.

Using data from 5,500 adolescents from the National Longitudinal Study of Adolescent to Adult Health, Domingue et al. (2018) claimed to show that friends are genetically more similar to one another than randomly selected peers, beyond the confounding effects of population stratification by ancestry. The authors also claimed to find evidence of ‘social-genetic’ effects, whereby individuals’ educational attainment (EA) is influenced by their friends’ genes. Here we argue that neither claim is justified by the data.

Genetic similarity of friends – phenotypic assortment or uncontrolled population stratification?

One might intuit that individuals who pair up according to their similarity on a heritable trait (i.e. phenotypic assortment) should be genetically more similar than random pairs of individuals. Although this intuition is technically correct, the induced genetic similarity is trivially small for polygenic traits, as has been shown before (Robinson et al., 2017) and as we reiterate below. In this context, we refer to genetic similarity as the degree to which pairs of individuals – say, romantic partners or in this case friends – share alleles across the whole genome. More specifically, when single nucleotide polymorphism (SNP) data is available, the genetic similarity between two individuals i and j is classically measured using a genomic relationship coefficient (Yang et al., 2010) Aij defined as

$$A_{ij} = \frac{1}{M}\sum_{k=1}^{M}\frac{(x_{ik}-2p_k)(x_{jk}-2p_k)}{2p_k(1-p_k)}, \tag{1}$$

where M is the number of SNPs used to estimate $A_{ij}$, $p_k$ is the minor allele frequency of SNP k, and $x_{ik}$ and $x_{jk}$ are the numbers of minor alleles at SNP k carried by individuals i and j, respectively.
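As a minimal sketch of Equation (1), the function below computes the genomic relationship coefficient for one pair of individuals from plain Python lists. The function name, the argument names and the four-SNP toy data are illustrative assumptions, not part of the original analysis.

```python
def genomic_relationship(x_i, x_j, p):
    """Equation (1): average over SNPs of the standardized allele-count products.

    x_i, x_j: minor-allele counts (0, 1 or 2) for individuals i and j.
    p: minor allele frequency of each SNP (0 < p_k < 1).
    """
    if not (len(x_i) == len(x_j) == len(p)):
        raise ValueError("x_i, x_j and p must have the same length")
    total = 0.0
    for xi, xj, pk in zip(x_i, x_j, p):
        total += (xi - 2 * pk) * (xj - 2 * pk) / (2 * pk * (1 - pk))
    return total / len(p)


# Hypothetical toy data: 4 SNPs (real analyses use hundreds of thousands).
freqs = [0.5, 0.25, 0.1, 0.4]
person_i = [1, 0, 0, 2]
person_j = [1, 1, 0, 0]
a_ij = genomic_relationship(person_i, person_j, freqs)
print(f"A_ij = {a_ij:.4f}")
```

With so few SNPs the coefficient is dominated by sampling noise; the sketch only illustrates the arithmetic of the definition.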

Robinson et al. (2017) previously showed that the expected genomic relationship coefficient between individuals who assort on a trait equals $r h^2 \sigma^2_{GRM}$, where r is the trait correlation between assorted individuals, $h^2$ is the heritability of the trait and $\sigma^2_{GRM}$ is the variance of genomic relationship coefficients in the population. Previous studies (Goddard, Wray, Verbyla, & Visscher, 2009) have shown that, for a given population and a given set of common SNPs used to calculate $A_{ij}$, $\sigma^2_{GRM}$ is a fixed quantity that depends only on the effective size of the population ($N_e$). Assuming $N_e \sim 10{,}000$ (Takahata, 1993), Visscher and colleagues (2014) estimated in individuals of European ancestry that, for common SNPs, $\sigma^2_{GRM} \sim 2\times10^{-5}$. Therefore, assuming the SNP heritability of EA to be $h^2 \sim 0.12$ (Lee et al., 2018) and given the correlation of EA between friends (r = 0.415; from their Table 2) reported by Domingue et al., we would expect the mean genomic relationship coefficient between friends assorting on EA to be $0.12\times0.415\times2\times10^{-5}$, i.e. ~$10^{-6}$. Such a slight increase in genetic similarity would require millions of friend pairs to be reliably detected, as already emphasised in Robinson et al. (2017).
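The arithmetic in the paragraph above can be reproduced in a few lines. The variable names are illustrative; the numerical inputs are the values quoted in the text.

```python
# Inputs quoted in the text.
r_friends = 0.415   # EA correlation between friends (Domingue et al., Table 2)
h2_snp = 0.12       # SNP heritability of EA (Lee et al., 2018)
var_grm = 2e-5      # variance of genomic relationship coefficients, common SNPs

# Expected mean genomic relationship between friends assorting on EA.
expected_grm = r_friends * h2_snp * var_grm

# Upper bound: perfect assortment (r = 1) on a fully heritable trait (h2 = 1).
max_grm = 1.0 * 1.0 * var_grm

print(f"expected under assortment on EA: {expected_grm:.2e}")
print(f"upper bound (r = 1, h2 = 1):     {max_grm:.2e}")
```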

How does this expected genetic similarity compare to that reported by Domingue et al.? Unfortunately, such a comparison is not straightforward, because the friend-pair genetic similarity reported by Domingue et al. does not represent a standard measure used in the human genetics literature (Yang et al., 2010); instead it is an alternative measure introduced by the authors in a previous publication (Domingue, Fletcher, Conley, & Boardman, 2014). To enable a comparison, we derive the mathematical relation between Domingue et al.’s similarity measure, hereafter denoted I(μ), and the genomic relationship coefficient (Yang et al., 2010).

I(μ) is defined in Domingue et al. (2014) as an estimator of the area under the curve defined by the quantiles of the distribution of kinship coefficients under the null (random pairing) versus the quantiles of the distribution of kinship coefficients under the alternative (e.g. assortative mating). Given that kinship coefficients equal half of genomic relationship coefficients, we derive below an interpretation of I(μ) in terms of differences in mean genomic relationship coefficients between two groups of pairs. We consider two distributions of genomic relationship coefficients, one under the null ($H_0: N(0, \sigma^2_{GRM})$) and one under the alternative ($H_1: N(\mu, \sigma^2_{GRM})$). Therefore I(μ) can be expressed as

$$I(\mu) = \int_0^1 \Phi_1\!\left[\Phi_0^{-1}(u)\right]du - \frac{1}{2}, \tag{2}$$

where $\Phi_k$ is the cumulative distribution function of genomic relationship coefficients under $H_k$.

If we set $v = \Phi_0^{-1}(u)$, i.e. $u = \Phi_0(v)$, then $du = \phi_0(v)\,dv$, with $\phi_0(\cdot)$ being the probability density function of genomic relationship coefficients under the null. When u = 0, $v = \Phi_0^{-1}(0) = -\infty$ and when u = 1, $v = \Phi_0^{-1}(1) = +\infty$. Moreover, we can show, under Gaussian assumptions, that $\Phi_1(v) = \Phi_0(v + \mu)$. Therefore, I(μ) can be rewritten as

$$I(\mu) = \int_{-\infty}^{+\infty} \Phi_0(v+\mu)\,\phi_0(v)\,dv - \frac{1}{2}. \tag{3}$$

I(μ) cannot be calculated analytically from Equation (3). However, we can still derive its Taylor’s series expansion for small values of |μ| as

$$I(\mu) \underset{|\mu|\to 0}{\approx} I(0) + I'(0)(\mu - 0) = \frac{\mu}{\sqrt{4\pi\sigma^2_{GRM}}}. \tag{4}$$
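As a rough numerical check of the linear approximation in Equation (4), the sketch below evaluates the integral in Equation (3) with a trapezoidal rule and compares it with $\mu/\sqrt{4\pi\sigma^2_{GRM}}$. The function names, the integration range (±8 standard deviations) and the step count are assumptions chosen for illustration; the test value of μ is the order of magnitude discussed in this paper.

```python
import math


def norm_cdf(v, sigma):
    """CDF of N(0, sigma^2)."""
    return 0.5 * (1.0 + math.erf(v / (sigma * math.sqrt(2.0))))


def norm_pdf(v, sigma):
    """Density of N(0, sigma^2)."""
    return math.exp(-0.5 * (v / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def similarity_numeric(mu, sigma, n_steps=4000, width=8.0):
    """Equation (3) by the trapezoidal rule on [-width*sigma, +width*sigma]."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / n_steps
    total = 0.0
    for s in range(n_steps + 1):
        v = lo + s * h
        weight = 0.5 if s in (0, n_steps) else 1.0
        total += weight * norm_cdf(v + mu, sigma) * norm_pdf(v, sigma)
    return total * h - 0.5


def similarity_linear(mu, sigma):
    """Equation (4): first-order Taylor approximation around mu = 0."""
    return mu / math.sqrt(4.0 * math.pi * sigma ** 2)


sigma_grm = math.sqrt(2e-5)   # from the variance quoted in the text
mu_test = 4.9e-4              # illustrative mean shift
i_numeric = similarity_numeric(mu_test, sigma_grm)
i_linear = similarity_linear(mu_test, sigma_grm)
print(f"numerical integral: {i_numeric:.5f}, linear approximation: {i_linear:.5f}")
```

For mean shifts that are small relative to $\sigma_{GRM}$, the two values should agree closely, which is the regime relevant here.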

In Domingue et al. (2018), the reported genetic similarity between friends is 0.031 (95%CI: 0.022 – 0.036) (data from Table 1 in Domingue et al., 2018). Therefore, Equation (4) implies that such a level of genetic similarity would correspond to a mean difference in genomic relationship coefficients of ~$0.031\times\sqrt{4\pi\sigma^2_{GRM}}$, i.e. ~$4.9\times10^{-4}$ (95%CI: $3.5\times10^{-4}$ – $5.7\times10^{-4}$) between friend pairs and random pairs.

Although quite small in absolute terms, this value is still about 500 times larger than the theoretical value of ~$10^{-6}$ expected from friends phenotypically assorting on EA. Indeed, even if friends were perfectly correlated on a phenotype that was 100% heritable, we would still only expect a genetic similarity of ~$2\times10^{-5}$, an order of magnitude smaller than the lower bound of the smallest of Domingue et al.’s relatedness estimates (i.e. 0.011 from their Table S3, which corresponds to $0.011\times\sqrt{4\pi\sigma^2_{GRM}} = 1.7\times10^{-4}$). Therefore, the claim by Domingue et al. that the genetic similarity between friends could be due to phenotypic assortment on EA is incompatible with theory and inconsistent with known properties of the human genome and trait variation.
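The conversion from the reported similarity measure to the genomic relationship scale, and the comparison with the theoretical expectation, can be sketched as follows. The helper name is hypothetical; the inputs are the values quoted above.

```python
import math

var_grm = 2e-5
scale = math.sqrt(4.0 * math.pi * var_grm)   # inverse of the slope in Equation (4)


def implied_mean_difference(similarity):
    """Invert Equation (4): mean GRM difference implied by a reported I(mu)."""
    return similarity * scale


# Main estimate and 95% CI from Table 1 of Domingue et al. (2018).
point = implied_mean_difference(0.031)
ci_low = implied_mean_difference(0.022)
ci_high = implied_mean_difference(0.036)

# Lower CI bound of the REAP-based estimate (their Table S3).
reap_lower = implied_mean_difference(0.011)

expected_grm = 0.12 * 0.415 * var_grm   # expectation under assortment on EA
ratio_to_expected = point / expected_grm
ratio_to_maximum = reap_lower / var_grm  # versus r = 1, h2 = 1

print(f"implied mean difference: {point:.2e} ({ci_low:.2e} to {ci_high:.2e})")
print(f"ratio to expectation under EA assortment: {ratio_to_expected:.0f}")
print(f"REAP lower bound vs theoretical maximum:  {ratio_to_maximum:.1f}")
```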

An alternative explanation of such genetic similarity among friends is population stratification, whereby individuals are more likely to befriend others living in close geographical vicinity and thus of likely similar ancestry. Domingue et al. acknowledged the possibility of confounding due to population stratification, but we do not believe their correction was effective. The standard way to correct for stratification in these types of analyses is to control for principal components from a genomic relationship matrix (Price, Zaitlen, Reich, & Patterson, 2010). Instead, Domingue et al. reported (in their Supplementary Materials) a secondary analysis using the program REAP (Thornton et al., 2012), which they argue is robust to stratification and still reveals a significant genetic similarity among friends (though reduced: from 0.031 (95%CI: 0.022 – 0.036) to 0.02 (95%CI: 0.011 – 0.028); data from Table S3 in Domingue et al., 2018). However, REAP was designed to estimate kinship among related individuals in admixed samples with heterogeneous continental ancestry (Thornton et al., 2012), not to estimate genetic similarity among unrelated individuals with homogeneous continental ancestry (e.g. Domingue et al.’s sample of European ancestry; T. Thornton, personal communication, April 8, 2018). There is no evidence that REAP effectively corrects for subtle population stratification within a continental ancestry group (e.g. Northern vs. Southern European ancestry). Given that the results REAP has yielded in this case are implausible, as described above, it appears to us that the most likely and most parsimonious explanation for the observed genomic similarity among friends is residual within-continental population stratification. We cannot rule out other explanations for the high genetic similarity observed, but, importantly, we can rule out the explanation provided by Domingue et al. (friends’ assortment on EA scores).

‘Social-genetic effects’ or simply like-befriending-like?

Domingue et al. report a significant association between focal individuals’ educational attainment (EA) and their friends’ polygenic score for EA (PGSEA), controlling for focal individuals’ PGSEA. They argue that this is evidence for social-genetic effects, whereby individuals’ educational attainment (EA) is influenced by their friends’ genes via direct effects of one’s (heritable) EA on friends’ EA. Here, we demonstrate that, given the predictive ability of PGSEA, Domingue et al.’s findings are necessary consequences of the well-documented observation that friends tend to have similar EA values. Thus, the genetic findings they report are irrelevant to understanding why such similarity occurs, and provide no evidence for social-genetic effects over simpler alternatives, such as social homophily (the general tendency to associate and bond with similar others; Smirnov & Thurner, 2017; Tuma & Hallinan, 1979).

We show that the results Domingue et al. report as evidence for social-genetic effects can be derived from only two quantities: the EA-PGS correlation and the correlation between the EAs of friends. We define the following terms:

  • $\mathrm{EA}_i$: the EA of focal individual i

  • $\mathrm{EA}_j^{F_i}$: the EA of individual j, a member of individual i’s friend group ($F_i$)

  • $\mathrm{PGS}_i$: the PGS for EA of focal individual i

  • $\mathrm{PGS}_j^{F_i}$: the PGS for EA of individual j, a member of individual i’s friend group ($F_i$)

  • $\overline{\mathrm{PGS}}^{F_i}$: the mean PGS for EA across all friends of individual i ($F_i$)

Applying path tracing rules to Figure 1a (in which correlations are taken from Table 2 of Domingue et al.), the expected correlation between $\mathrm{EA}_i$ and $\mathrm{PGS}_j^{F_i}$ (or equivalently between $\mathrm{PGS}_i$ and $\mathrm{EA}_j^{F_i}$) is .26 × .42 = .109, and the expected correlation between $\mathrm{PGS}_i$ and $\mathrm{PGS}_j^{F_i}$ is .26 × .42 × .26 = .029 (all variables are standardized as noted on p. S3 of Domingue et al. Supplementary Materials). Thus, PGSs are expected to correlate between mates or friends whenever there is phenotypic assortment.
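The path-tracing products above can be checked directly. The variable names are illustrative; the two correlations are the rounded values shown in the text. With these rounded inputs the second product comes out at about .028, so the .029 in the text presumably reflects less heavily rounded inputs (an assumption on our part, since the unrounded values are not given here).

```python
# Rounded correlations from Table 2 of Domingue et al., as quoted in the text.
r_pgs_ea = 0.26       # correlation between an individual's PGS and own EA
r_ea_friends = 0.42   # correlation between friends' EA

# Path PGS_j -> EA_j <-> EA_i: friend's PGS with focal individual's EA.
r_ea_friend_pgs = r_pgs_ea * r_ea_friends

# Path PGS_i -> EA_i <-> EA_j <- PGS_j: focal PGS with friend's PGS.
r_pgs_friend_pgs = r_pgs_ea * r_ea_friends * r_pgs_ea

print(f"corr(EA_i, PGS_j)  = {r_ea_friend_pgs:.3f}")
print(f"corr(PGS_i, PGS_j) = {r_pgs_friend_pgs:.3f}")
```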

Figure 1.


Path models of the results Domingue et al. use to argue for social-genetic effects. Figure 1a can be used to derive the expected relationship between the polygenic risk scores (PGS) of education between friends given the phenotypic correlation of education between friends reported by Domingue et al. Figure 1b can be used to derive the expected slope of educational attainment ($\mathrm{EA}_i$) of a focal individual regressed on the average educational PGS of their friends ($\overline{\mathrm{PGS}}^{F_i}$). As shown in the diagrams, such associations are necessary consequences of phenotypic assortment.

In their Figure 2 and Table S6, Domingue et al. report the association (the slope in this case) between $\mathrm{EA}_i$ and $\overline{\mathrm{PGS}}^{F_i}$. This differs from the association between $\mathrm{EA}_i$ and $\mathrm{PGS}_j^{F_i}$ derived above, and depends on the number of friends included in the average PGS score, $\overline{\mathrm{PGS}}^{F_i}$. The number of friends an individual had in the study sample varied across individuals but had a mean of 2 (Domingue et al. Figure S3). For mathematical tractability, we assume the number of friends was constant at 2 across all focal individuals. Furthermore, the slope of $\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i}$ depends on the variances of both variables. The authors state that outcomes and predictors were standardized for this analysis (Domingue et al. Figure 2 caption), and our expectations below agree with this. We therefore assume that $\overline{\mathrm{PGS}}^{F_i}$ was standardized after taking the mean. The coefficients to $\overline{\mathrm{PGS}}^{F_i}$ in Figure 1b (.697 and .697) generate $\mathrm{var}(\overline{\mathrm{PGS}}^{F_i}) = 1$, after accounting for the correlation between $\mathrm{EA}_{j=1}^{F_i}$ and $\mathrm{EA}_{j=2}^{F_i}$. Our path model assumes that co-friends of a focal individual are correlated as highly as each friend is to the focal individual, but this assumption has only a minor influence on results: the coefficient from $\mathrm{PGS}_j^{F_i}$ to $\overline{\mathrm{PGS}}^{F_i}$ would be only slightly different ($\sqrt{.5} = .707$) if co-friends’ EA values were uncorrelated.
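One way to recover the .697 coefficient is sketched below, under the assumptions stated in the paragraph above: two friends per focal individual, standardized PGSs, and co-friends whose EA values correlate as highly as friends do with the focal individual. The mean of two standardized variables with correlation ρ has variance (1 + ρ)/2, so standardizing the mean rescales each friend's weight of 0.5 by the inverse of its standard deviation. The function name is hypothetical.

```python
import math


def coefficient_to_standardized_mean(rho):
    """Path coefficient from each friend's PGS to the standardized two-friend mean.

    rho: correlation between the two friends' (standardized) PGSs.
    """
    sd_of_mean = math.sqrt((1.0 + rho) / 2.0)
    return 0.5 / sd_of_mean


# Correlation between co-friends' PGSs implied by the path model:
# PGS -> EA <-> EA <- PGS, using the rounded values from the text.
rho_cofriends = 0.26 * 0.42 * 0.26

coef_correlated = coefficient_to_standardized_mean(rho_cofriends)
coef_uncorrelated = coefficient_to_standardized_mean(0.0)

print(f"co-friends correlated:   {coef_correlated:.3f}")
print(f"co-friends uncorrelated: {coef_uncorrelated:.3f}")
```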

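The expected slope of $\mathrm{EA}_i$ regressed on $\overline{\mathrm{PGS}}^{F_i}$, $E[\hat\beta_{\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i}}]$, can be expressed as

$$E\left[\hat\beta_{\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i}}\right] = E\left[\frac{\mathrm{cov}(\mathrm{EA}_i, \overline{\mathrm{PGS}}^{F_i})}{\mathrm{var}(\overline{\mathrm{PGS}}^{F_i})}\right] = E\left[\mathrm{cov}(\mathrm{EA}_i, \overline{\mathrm{PGS}}^{F_i})\right],$$

given that all variables are standardized. Using path tracing rules, $E[\mathrm{cov}(\mathrm{EA}_i, \overline{\mathrm{PGS}}^{F_i})] = 2 \times [.697 \times .26 \times .42] = .152$, which agrees closely with the reported $\hat\beta_{\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i}} = .175 \pm .03$ (Table S6, column 4).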

Domingue et al. then control for $\mathrm{PGS}_i$ and find that this partial slope is only slightly reduced ($\hat\beta_{\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i} \mid \mathrm{PGS}_i} = .154 \pm .03$) and still significant. They interpret this partial slope as evidence “…that the genetics of individuals in a person’s social environment influence that person’s phenotype,” (p. 705). However, controlling for a variable that is only weakly associated with the outcome and predictor variables, such as $\mathrm{PGS}_i$, is expected to change the slope by only a small amount. In particular, given that all variables are standardized,

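$$E\left[\hat\beta_{\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i} \mid \mathrm{PGS}_i}\right] = E\left[\frac{\mathrm{cov}(\mathrm{EA}_i, \overline{\mathrm{PGS}}^{F_i} \mid \mathrm{PGS}_i)}{\mathrm{var}(\overline{\mathrm{PGS}}^{F_i} \mid \mathrm{PGS}_i)}\right] = E\left[\frac{r_{\mathrm{EA}_i, \overline{\mathrm{PGS}}^{F_i}} - (r_{\mathrm{EA}_i, \mathrm{PGS}_i})(r_{\mathrm{PGS}_i, \overline{\mathrm{PGS}}^{F_i}})}{1 - r^2_{\mathrm{PGS}_i, \overline{\mathrm{PGS}}^{F_i}}}\right] \tag{5}$$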

Using path tracing rules and Figure 1b, the correlation between $\mathrm{PGS}_i$ and $\overline{\mathrm{PGS}}^{F_i}$ is 2 × [.697 × .26 × .42 × .26] = .0396 ≅ .04. Therefore, this expected partial slope is

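$$E\left[\hat\beta_{\mathrm{EA}_i \sim \overline{\mathrm{PGS}}^{F_i} \mid \mathrm{PGS}_i}\right] = \frac{.152 - .26 \times .04}{1 - .04^2} = .142 \tag{6}$$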

which is, again, not significantly different from the partial slope (.154 ± .03) reported in the manuscript (Domingue et al. Figure 2 and Column 2 of Table S7).
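The expected slope and partial slope can be reproduced from the rounded path coefficients used in the text. This is a sketch of the arithmetic only; the variable and function names are illustrative.

```python
coef_mean = 0.697     # path from each friend's PGS to the standardized mean PGS
r_pgs_ea = 0.26       # PGS-EA correlation
r_ea_friends = 0.42   # EA correlation between friends

# Expected slope of EA_i on the standardized mean PGS of two friends.
slope = 2 * (coef_mean * r_pgs_ea * r_ea_friends)

# Expected correlation between the focal PGS and the friends' mean PGS.
r_pgs_meanpgs = 2 * (coef_mean * r_pgs_ea * r_ea_friends * r_pgs_ea)


def partial_slope(r_yx, r_yz, r_xz):
    """Equation (5): standardized slope of y on x, controlling for z."""
    return (r_yx - r_yz * r_xz) / (1.0 - r_xz ** 2)


# Equation (6), using the rounded intermediate values from the text.
partial = partial_slope(0.152, r_pgs_ea, 0.04)

print(f"slope = {slope:.3f}")
print(f"corr(PGS_i, mean friend PGS) = {r_pgs_meanpgs:.4f}")
print(f"partial slope = {partial:.3f}")
```

Both expected values fall within one reported standard error (.03) of the corresponding estimates of Domingue et al. (.175 and .154).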

Thus, the results interpreted by Domingue et al. as evidence for social-genetic effects are expected under a simple model of individuals befriending others of similar educational attainment values. It is well established that friends have similar educational performance (Smirnov & Thurner, 2017; Tuma & Hallinan, 1979); more broadly, the general tendency to associate and bond with similar others is one of the most pervasive observations in the social sciences (McPherson, Smith-Lovin, & Cook, 2001). Therefore, there is no need to invoke social-genetic effects to explain Domingue et al.’s findings. Furthermore, other research suggests such effects are unlikely in any substantive sense. Using a large, longitudinal sample of high school and university students (N=6,000), Smirnov and Thurner (2017) showed that friend similarity in academic performance is due to initial choice of similar friends, not change in individuals’ academic performance towards that of their friends. This lack of an effect on friends’ academic performance is inconsistent with ‘social-genetic effects’ as envisaged by Domingue et al.
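The same point can be illustrated with a small simulation, offered here as a sketch under assumptions of our own rather than a reanalysis of the Add Health data. Each focal individual and two friends receive standardized EA values with pairwise correlation .42 (generated through a shared component that stands in for like-befriending-like), and each person's PGS correlates .26 with their own EA only. No friend's genotype or EA has any causal effect on the focal individual's EA. All names, the sample size and the seed are hypothetical.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=50000, r_pgs_ea=0.26, r_friends=0.42, seed=1):
    """Simulate focal individuals with two friends under pure EA assortment."""
    rng = random.Random(seed)
    ea_focal, pgs_focal, pgs_friends = [], [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)  # induces pairwise EA correlation r_friends
        eas = [math.sqrt(r_friends) * shared
               + math.sqrt(1.0 - r_friends) * rng.gauss(0.0, 1.0)
               for _ in range(3)]
        # Each PGS is related only to that same person's EA.
        pgss = [r_pgs_ea * ea + math.sqrt(1.0 - r_pgs_ea ** 2) * rng.gauss(0.0, 1.0)
                for ea in eas]
        ea_focal.append(eas[0])
        pgs_focal.append(pgss[0])
        pgs_friends.append((pgss[1] + pgss[2]) / 2.0)
    return ea_focal, pgs_focal, pgs_friends


ea_i, pgs_i, pgs_bar = simulate()

# With standardized variables the simple slope equals the correlation,
# and the partial slope follows Equation (5).
r_yx = corr(ea_i, pgs_bar)
r_yz = corr(ea_i, pgs_i)
r_xz = corr(pgs_i, pgs_bar)
sim_slope = r_yx
sim_partial = (r_yx - r_yz * r_xz) / (1.0 - r_xz ** 2)

print(f"simulated slope: {sim_slope:.3f}, partial slope: {sim_partial:.3f}")
```

Under these assumptions the simulated slopes should land near the path-tracing expectations of roughly .15 and .14, despite the absence of any social-genetic effect in the data-generating process.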

Conclusions

The advent of large samples of genotyped individuals with known social relationships has provided unprecedented opportunities for research at the intersection of human genetics and social sciences. However, analysis and interpretation of these data require great care. Several other high-profile papers (Christakis & Fowler, 2014; Connolly, Anney, Gallagher, & Heron, 2019; Domingue et al., 2014) on the genetic similarity of social or romantic mates have forwarded exciting but unfounded interpretations of results that probably have more parsimonious explanations, such as population stratification (e.g. see commentaries by Abdellaoui, Verweij, & Zietsch, 2014; Chen, 2014; Wray & Yengo, 2019).

Acknowledgements

This research was supported by the Australian Research Council (DP160102400; FT160100298; FL180100072), the Australian National Health and Medical Research Council (1113400 and 1078037) and the National Institutes of Health (NIH grants R01AH042568 and R01MH100141). K.J.H.V. is supported by the Foundation Volksbond Rotterdam.

References

  1. Abdellaoui A, Verweij KJH, & Zietsch BP (2014). No evidence for genetic assortative mating beyond that due to population stratification. Proceedings of the National Academy of Sciences, 111, E4137. doi: 10.1073/pnas.1410781111
  2. Chen G-B (2014). Where is the friend’s home? Frontiers in Genetics, 5. doi: 10.3389/fgene.2014.00400
  3. Christakis NA, & Fowler JH (2014). Friendship and natural selection. Proceedings of the National Academy of Sciences of the United States of America, 111, 10796–10801. doi: 10.1073/pnas.1400825111
  4. Connolly S, Anney R, Gallagher L, & Heron EA (2019). Evidence of assortative mating in Autism Spectrum Disorder. Biological Psychiatry. doi: 10.1016/j.biopsych.2019.04.014
  5. Domingue BW, Belsky DW, Fletcher JM, Conley D, Boardman JD, & Harris KM (2018). The social genome of friends and schoolmates in the National Longitudinal Study of Adolescent to Adult Health. Proceedings of the National Academy of Sciences. doi: 10.1073/pnas.1711803115
  6. Domingue BW, Fletcher J, Conley D, & Boardman JD (2014). Genetic and educational assortative mating among US adults. Proceedings of the National Academy of Sciences, 111, 7996–8000. doi: 10.1073/pnas.1321426111
  7. Goddard ME, Wray NR, Verbyla K, & Visscher PM (2009). Estimating effects and making predictions from genome-wide marker data. Statistical Science, 24, 517–529.
  8. Lee JJ, Wedow R, Okbay A, Kong E, Maghzian O, Zacher M, … Cesarini D (2018). Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat Genet, 50, 1112–1121. doi: 10.1038/s41588-018-0147-3
  9. Lynch M, & Walsh B (1998). Genetics and analysis of quantitative traits. Sunderland, MA: Sinauer.
  10. McPherson M, Smith-Lovin L, & Cook JM (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27, 415–444.
  11. Price AL, Zaitlen NA, Reich D, & Patterson N (2010). New approaches to population stratification in genome-wide association studies. Nat Rev Genet, 11, 459–463. doi: 10.1038/nrg2813
  12. Robinson MR, Kleinman A, Graff M, Vinkhuyzen AAE, Couper D, Miller MB, … Visscher PM (2017). Genetic evidence of assortative mating in humans. Nature Human Behaviour, 1, 0016. doi: 10.1038/s41562-016-0016
  13. Smirnov I, & Thurner S (2017). Formation of homophily in academic performance: Students change their friends rather than performance. PLoS One, 12, e0183473. doi: 10.1371/journal.pone.0183473
  14. Takahata N (1993). Allelic genealogy and human evolution. Molecular Biology and Evolution, 10, 2–22.
  15. Thornton T, Tang H, Hoffmann TJ, Ochs-Balcom HM, Caan BJ, & Risch N (2012). Estimating kinship in admixed populations. The American Journal of Human Genetics, 91, 122–138. doi: 10.1016/j.ajhg.2012.05.024
  16. Tuma NB, & Hallinan MT (1979). The effects of sex, race, and achievement on schoolchildren’s friendships. Social Forces, 57, 1265–1285.
  17. Visscher PM, Hemani G, Vinkhuyzen AA, Chen GB, Lee SH, Wray NR, … Yang J (2014). Statistical power to detect genetic (co)variance of complex traits using SNP data in unrelated samples. PLoS Genet, 10, e1004269. doi: 10.1371/journal.pgen.1004269
  18. Walsh B, & Lynch M (2018). Evolution and selection of quantitative traits. Oxford University Press.
  19. Wray NR, & Yengo L (2019). Assortative mating in Autism Spectrum Disorder: toward an evidence base from DNA data, but not there yet. Biological Psychiatry, 86, 250–252. doi: 10.1016/j.biopsych.2019.06.007
  20. Yang JA, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, … Visscher PM (2010). Common SNPs explain a large proportion of the heritability for human height. Nature Genetics, 42, 565–U131. doi: 10.1038/ng.608
