Summary
Eye color is highly variable in populations with European ancestry, ranging from low to high quantities of melanin in the iris. Polymorphisms in the HERC2/OCA2 locus have the largest effect on eye color in these populations, although other genomic regions also influence eye color. We performed genome-wide association studies of eye color in a Canadian cohort of European ancestry (N = 5,641) and investigated candidate causal variants. We uncovered several candidate causal signals in the HERC2/OCA2 region, whereas other loci likely harbor a single causal signal. We observed colocalization of eye color signals with the expression or methylation profiles of cultured primary melanocytes. Genetic correlations of eye and hair color suggest high genome-wide pleiotropy, but locus-level differences in the genetic architecture of both traits. Overall, we provide a better picture of the polymorphisms underpinning eye color variation, which may be a consequence of specific molecular processes in the iris melanocytes.
Subject areas: Genetics, Genomics, Human Genetics
Highlights
• Genome-wide association studies of eye color in 5,641 participants
• Multiple independent candidate causal variants were identified across HERC2/OCA2
• Single candidate causal variants observed on or near IRF4, SLC24A4, TYR, and TYRP1
• Colocalization of eye color signals with expression and methylation profiles
Introduction
Pigmentation levels in the iris vary among humans, ultimately leading to different eye colors. The melanin pigment in the iris is synthesized in the melanocytes, within organelles named melanosomes (Sturm and Frudakis, 2004). Eye color diversity is a consequence of different amounts of melanin concentrated in the melanocytes of the iris. In addition, the shape and distribution of melanosomes influence eye color variation. The mechanism is different from that of hair and skin pigmentation, in which two types of cells, melanocytes and keratinocytes (i.e. the epidermal melanin unit), play a key role in the production and distribution of melanin to give hair and skin color (Rees, 2003; Lin and Fisher, 2007; Parra, 2007). In addition, of the two types of melanin synthesized by melanocytes (i.e. eumelanin, a brown/black pigment, and pheomelanin, an orange/yellow pigment), different categorical iris colors result mainly from variation in eumelanin content, whereas there is little, nonsignificant variation in pheomelanin quantity, based on measurements of cultured uveal melanocytes (Wakamatsu et al., 2007).
At a molecular level, blue irises appear as melanin-free melanocytes, in which molecules in the iris scatter short blue wavelengths to the surface (Sturm and Frudakis, 2004). Green irises have medium levels of eumelanin, whereas high levels of eumelanin result in brown irises. Therefore, broad eye color classifications (i.e. blue, green, hazel, brown) cover a continuum of low to high quantities of eumelanin accumulated in the iris (Sturm and Frudakis, 2004).
Twin studies have shown that eye color is a highly heritable trait (>85%) and that it does not significantly vary throughout an adult’s lifespan (Bito et al., 1997; Larsson et al., 2003). Furthermore, association studies have demonstrated that eye color has a polygenic architecture (Sulem et al., 2007; Edwards et al., 2016; Lloyd-Jones et al., 2016; Adhikari et al., 2019). Some of the loci with moderate/large effects associated with eye color variation are at or near the following genes: OCA2, TYR, TYRP1, SLC45A2, SLC24A4, SLC24A5, and IRF4. However, the variant with the largest effect on eye color variation is an intronic SNP (rs12913832) located in an enhancer within the gene HERC2 that regulates the expression of the downstream gene OCA2 (Visser et al., 2012). Functional studies have shown that the A-allele of rs12913832 allows the formation of a chromatin loop with the promoter of OCA2, facilitating the transcription of the gene. In contrast, the G-allele hinders the formation of the chromatin loop, leading to a diminished expression of OCA2 (Visser et al., 2012; Visser et al., 2014).
The SNP rs12913832 is the key regulatory element of OCA2, but it has been hypothesized that additional distal elements within the same region may be involved in the regulation of OCA2, a process that is often tissue specific (Palstra, 2009; Visser et al., 2014). In fact, through conditional analyses of association, genome-wide association studies (GWAS) have highlighted the presence of additional SNPs associated with variation in pigmentary traits (i.e. skin, hair, and eye pigmentation) within the HERC2/OCA2 region (Beleza et al., 2013; Adhikari et al., 2019; Lona-Durazo et al., 2019; Landi et al., 2020).
These studies have identified variants within HERC2 and OCA2 that are in low (r2 < 0.2) linkage disequilibrium (LD) with rs12913832 (e.g. rs4778249, rs1667392, rs4778219, rs1800407, rs1448484), as well as other distant candidate regulatory variants near or within the APBA2 gene (e.g. rs4424881, rs36194177), which is located ∼700kb away from OCA2. However, pinpointing additional causal variants within the HERC2/OCA2 region is challenging due to the complex LD patterns among the genetic variants and the lack of tissue-specific regulatory annotations. For instance, the Gene and Tissue Expression (GTEx) database (Consortium et al., 2017) includes skin tissue, which beyond a very small proportion of melanocytes, encompasses a diverse set of cell types not involved in pigmentation variation.
To improve our understanding of the genetic mechanisms behind eye pigmentation and melanocyte biology, in this paper we present the results of a GWAS of eye color conducted in a Canadian cohort from the Canadian Partnership for Tomorrow’s Health (CanPath), along with fine-mapping analyses. We combined these results with gene expression and methylation data of cultured melanocytes by conducting colocalization analyses and transcriptome-wide association studies (TWAS). Our main results indicate that there are several candidate signals in the HERC2/OCA2 region associated with eye color, a different pattern from what is observed for hair color in the same sampled population. By integrating expression and methylation data assayed in melanocytes, we gain a better picture of how genetic polymorphisms may modulate eye color variation.
Results
Eye color distribution in the CanPath cohort
A total of 5,732 participants of the Canadian Partnership for Tomorrow’s Health (CanPath), who were genotyped using two genome-wide genotyping arrays (See STAR Methods for details), also self-reported their natural eye color using one of six possible answers: blue, gray, green, amber, hazel, or brown. We excluded amber eye color individuals due to the low number of individuals who self-reported this category. After quality control of the genotypes (i.e. exclusion of poor-quality samples and PCA outliers), we kept 5,641 individuals for further analyses. Overall, the distribution of eye color categories was quite similar across all provinces sampled (Figure 1), in which green and hazel were the least frequent categories and blue was the most common one. An important exception is the significantly lower proportion of individuals who self-reported blue eye color in Quebec, compared with other provinces (chi-square test: 104.39, df = 4, p value < 0.01). This pattern may be explained by the high proportion of French ancestry in the Quebec population (See Figure S1) due to the migration and settlement of French people in the province relatively recently (Bherer et al., 2011). This hypothesis is supported by the difference in allele frequencies between Quebec and all other provinces, in which the HERC2 rs12913832 G-allele has a lower frequency than in other provinces (see Table S1). In addition, a higher proportion of females self-reported green and hazel eye colors, relative to their male counterparts (chi-square test for green and hazel combined = 244.49; df = 1; p value < 0.01), which is similar to the observations previously reported in the case of green eye color (Sulem et al., 2007). Compared with several reported eye color frequencies per country, the CanPath eye color distributions differ from those found in other European and Asian countries, with the blue eye color frequency being most similar to that of Germany (39.6%) (Katsara and Nothnagel, 2019).
Genome-wide association studies and meta-analyses
We performed GWAS of eye color on each genotyping array (genotyped and imputed single-nucleotide polymorphisms (SNPs)) using a linear mixed model and an additive genetic model, using GCTA 1.26.0 (Yang et al., 2011, 2014). We coded eye color categories as follows: 1 = blue or gray, 2 = green, 3 = hazel, and 4 = brown. We included sex, age, and the first ten principal components (PCs) as fixed effects and a genetic relationship matrix (GRM) as random effect to control for subtle population structure. We did not detect residual population substructure, based on Q-Q plots, in which observed p values did not show an early deviation from the expected p values (See Figure S2).
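The phenotype coding described above can be sketched in a few lines of Python. This is only an illustration of the mapping stated in the text (blue and gray pooled as 1, amber excluded); the function name and the example self-reports are hypothetical and are not part of the study's pipeline.

```python
# Ordinal coding stated in the text: 1 = blue or gray, 2 = green,
# 3 = hazel, 4 = brown. Amber is absent because it was excluded.
EYE_COLOR_CODES = {"blue": 1, "gray": 1, "green": 2, "hazel": 3, "brown": 4}


def encode_eye_color(self_report):
    """Return the ordinal code for a self-reported category, or None if excluded."""
    return EYE_COLOR_CODES.get(self_report.strip().lower())


# Hypothetical self-reports, for illustration only.
reports = ["Blue", "gray", "amber", "hazel", "brown", "green"]
codes = [encode_eye_color(r) for r in reports]
print(codes)
```

Individuals whose code is None (here, the amber self-report) would be dropped before fitting the linear mixed model.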
We then carried out a meta-analysis of the summary statistics (beta and SE) from the two GWAS using METASOFT v2.0.1 (Han and Eskin, 2011). Q-Q plots of the meta-analyses (See Figure S3) and LD Score regression (intercept = 0.9935) indicated no residual population structure. We identified several known genome-wide significant loci (p value ≤ 5e-08) associated with eye color (Figure 2; see Figure S4), overlapping or near the genes TYRP1 (lead SNP: rs1326779; beta = 0.139; SE = 0.024), IRF4 (lead SNP: rs12203592; beta = −0.164; SE = 0.029), TYR (lead SNP: rs1126809; beta = −0.136; SE = 0.024), SLC24A4 (lead SNP: rs4144266; beta = −0.124; SE = 0.022), and HERC2 (lead SNP: rs1129038; beta = −1.239; SE = 0.024). In addition, we observed a signal on chromosome 6 overlapping the ILRUN gene (lead SNP: rs116072038; beta = −0.438; SE = 0.077), a locus that has not been previously associated with pigmentation. Data S1 summarizes the suggestive and genome-wide associated SNPs.
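As a rough consistency check, the reported effect sizes and standard errors can be turned into approximate two-sided p values with a Wald test (z = beta / SE under a standard normal null). The sketch below assumes this normal approximation and uses the rounded values quoted in the text, so it will not reproduce the exact METASOFT p values; the helper name is hypothetical.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-08

# Rounded meta-analysis estimates (beta, SE) as quoted in the text.
lead_snps = {
    "rs1326779 (TYRP1)": (0.139, 0.024),
    "rs12203592 (IRF4)": (-0.164, 0.029),
    "rs1126809 (TYR)": (-0.136, 0.024),
    "rs4144266 (SLC24A4)": (-0.124, 0.022),
    "rs1129038 (HERC2)": (-1.239, 0.024),
    "rs116072038 (ILRUN)": (-0.438, 0.077),
}


def wald_p_value(beta, se):
    """Two-sided p value of z = beta / SE under a standard normal null."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))


p_values = {snp: wald_p_value(b, s) for snp, (b, s) in lead_snps.items()}
for snp, p in p_values.items():
    print(f"{snp}: p ~ {p:.2e}, significant: {p <= GENOME_WIDE_THRESHOLD}")
```

Under these assumptions all six lead SNPs fall below the 5e-08 threshold. The HERC2 signal is so strong that its p value underflows to zero in double precision.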
Fine-mapping of GWAS hits
We conducted approximate conditional and joint analyses of association using GCTA-COJO (Yang et al., 2012), and using as LD reference the samples genotyped in this study with the UKBB array, to investigate if the genome-wide significant loci were being driven by one or more independent signals.
On the IRF4, TYRP1, TYR, and SLC24A4 regions, we identified one independent genome-wide significant SNP per locus (see Table S3), corresponding to known causal variants, such as rs12203592 on IRF4 and rs1126809 on TYR. The lead SNP on SLC24A4 is in high LD (r2 = 0.98) with rs12896399, a SNP previously associated with pigmentation (Sulem et al., 2007; Eriksson et al., 2010). In the case of TYRP1, the selected SNP (rs1326779) is downstream of TYRP1 and has not been previously highlighted in eye color studies. On the HERC2/OCA2 region, we identified six independent SNPs overlapping OCA2 and HERC2 with p values in the conditional analysis exceeding the genome-wide significant threshold (see Table S3). Several of these SNPs have evidence of heterogeneity among the two studies, as indicated by I2 and Cochran’s Q values in the meta-analysis. However, they all have genome-wide significant p values in the random effects (RE2) model too, which takes into account heterogeneity among studies (see Table S3). To validate our results, we carried out the same GCTA-COJO analysis a second time, using samples genotyped with the GSA array as LD reference. We obtained concordant results, with multiple independent SNPs on the HERC2/OCA2 region and a single SNP highlighted on the other pigmentation associated loci (i.e. IRF4, TYR, SLC24A4, and TYRP1) (see Table S4).
We also carried out a Bayesian fine-mapping analysis, in which all possible combinations of SNPs are iteratively considered without arbitrary selection of conditioned SNPs. We used the program FINEMAP (Benner et al., 2016) to perform fine-mapping analysis and to identify candidate causal SNPs for functional prioritization. In agreement with the GCTA-COJO analysis, by using FINEMAP we identified known pigmentation loci harboring one causal signal within IRF4, TYR, SLC24A4, and TYRP1 (Data S2). On the IRF4 locus, the only candidate causal SNP with considerable evidence of causality (log10BF > 2) was the same SNP highlighted by GCTA-COJO (rs12203592; PIP = 0.999). In contrast, the missense SNP rs1126809 on TYR had a low posterior inclusion probability (PIP = 0.209), due to high LD with other nearby SNPs. Other candidate causal SNPs in the 95% credible set of the TYR locus include intergenic variants and one SNP (rs11018578) on the 3′UTR region of NOX4. On the SLC24A4 locus, the variants with considerable evidence of causality (i.e. log10BF ≥ 2) include intronic SNPs within SLC24A4 and other variants upstream of the gene, including the SNP rs12896399 (log10BF = 2.58) (Data S2).
On the TYRP1 locus, all candidate causal SNPs in the credible set had a low (<0.1) PIP, most likely due to high LD among multiple SNPs in the locus (See Figure S5). The 95% credible set includes rs10809826 and rs1408799, two SNPs that have been previously associated with eye color (Sulem et al., 2008; Zhang et al., 2013; Galván-Femenía et al., 2018; Adhikari et al., 2019). Among the SNPs in the same 95% credible set, rs13297008 is located upstream of TYRP1, and it overlaps a DNase Hypersensitive Site identified in foreskin melanocytes (Data S2), indicative of an active transcriptional regulator region. We did not observe any poorly imputed or unimputed variants overlapping regulatory regions found in melanocytes. However, we did find that rs13297008, rs2733831, and rs13296454 are in an active transcription start site (TSS) state on foreskin melanocytes only (across the tissues tested with the 15-core chromatin states). These SNPs are suggestive of either association or genome-wide significance, and all three are within the 95% credible set (Data S1).
Multiple HERC2/OCA2 variants associated with eye color variation
By applying a Bayesian fine-mapping approach on the HERC2/OCA2 region, we identified five candidate causal signals (i.e. five 95% credible sets) associated with eye color. Within these signals, three SNPs had a PIP >0.98 (Table 1 and Figure 3A). These results suggest independent causality of various signals in the locus. One of the candidate SNPs within HERC2 is rs12913832, a known enhancer that regulates the expression of OCA2 (Visser et al., 2012). In addition, three other SNPs within HERC2 and one within OCA2 were nominated as candidate causal loci, all of which fall within introns. These results are similar to the conditional analysis with GCTA-COJO, in which the independent SNPs in the locus encompass both OCA2 and HERC2, but the selected SNPs do not fully overlap. Importantly, all Bayesian fine-mapped SNPs in the locus had genome-wide significant p values on each sample and the same direction of effect. We annotated the putative regulatory function of the SNPs in all five credible sets using diverse databases (e.g. ENCODE, Roadmap Epigenomics Project). Aside from the overlap of rs12913832 (HERC2) with an open chromatin region in foreskin melanocytes, only the SNP rs117007668 is located within an open chromatin region in foreskin melanocytes (Data S2).
Table 1.
Credible Set | rsid | Position in chr 15 | Gene | Beta | SE | p value (FE) | p value (RE2) | MAF | I2 | Cochran’s Q | Cochran’s Q p value | PIP | Log10BF |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | rs12913832 | 28365618 | HERC2 | −1.26 | 0.02 | 0 | 0 | 0.23 | 96.25 | 26.66 | 2.43E-07 | 1.00 | 13.14 |
2 | rs117007668 | 28371422 | HERC2 | 0.99 | 0.08 | 5.43E-38 | 8.30E-38 | 0.01 | 81.03 | 5.27 | 0.02 | 1.00 | 5.59 |
3 | rs4778138 | 28335820 | OCA2 | 0.81 | 0.03 | 2.56E-162 | 1.51E-161 | 0.13 | 0.00 | 0.25 | 0.62 | 0.99 | 5.08 |
4 | rs117744568 | 28498692 | HERC2 | 0.87 | 0.06 | 3.95E-46 | 1.01E-45 | 0.02 | 73.99 | 3.85 | 0.05 | 0.89 | 4.04 |
4 | rs117743506 | 28510460 | HERC2 | 0.85 | 0.06 | 6.46E-46 | 1.96E-45 | 0.02 | 65.92 | 2.93 | 0.09 | 0.11 | 2.23 |
5 | rs71467328 | 28518229 | HERC2 | −0.48 | 0.05 | 5.60E-18 | 1.13E-17 | 0.04 | 27.37 | 1.38 | 0.24 | 0.63 | 3.36 |
5 | rs1597196 | 28294922 | OCA2 | 0.42 | 0.03 | 3.86E-55 | 1.27E-54 | 0.18 | 63.25 | 2.72 | 0.10 | 0.10 | 2.17 |
FE = fixed-effects model; RE2 = random-effects model; MAF = minor allele frequency; I2 and Cochran’s Q: meta-analysis heterogeneity indices; PIP = posterior inclusion probability; log10BF = log10 of Bayes Factor.
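The heterogeneity columns of Table 1 are related by simple formulas, which the following sketch illustrates. It assumes the standard definition I2 = max(0, (Q − df) / Q) × 100 and df = 1, because two studies (one per genotyping array) were combined; the function names are hypothetical, and small differences from the table are expected because the Q values quoted here are rounded.

```python
import math


def i_squared(q, df=1):
    """I2 (%) from Cochran's Q; df = number of studies - 1."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def cochran_q_p_value_df1(q):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2))


# Cochran's Q for three Table 1 variants (two studies, so df = 1).
table_q = [("rs12913832", 26.66), ("rs117007668", 5.27), ("rs4778138", 0.25)]
results = {rsid: (i_squared(q), cochran_q_p_value_df1(q)) for rsid, q in table_q}
for rsid, (i2, p) in results.items():
    print(f"{rsid}: I2 = {i2:.2f}%, Q p value = {p:.2e}")
```

For rs12913832 this gives an I2 of about 96% and a Q p value on the order of 2.4e-07, whereas for rs4778138 Q is below its degrees of freedom and I2 is truncated to zero.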
We then explored the LD patterns among the candidate causal variants (Table 1) in the credible sets using the CanPath genotypes (i.e. the same LD matrix used for fine-mapping) and considering the genotype probabilities, computed with LDStore v2.0 (Benner et al., 2017). Among the top candidate causal variants across the five credible sets (Table 1), most correlations are low (r2 ≤ 0.2), with the exception of rs117007668 and rs117744568, which are in high LD (r2 = 0.9) (Figure 3B). We further compared D′ values among these same SNPs computed on LDLink, using as a proxy the European populations of the 1000 Genomes Project (Machiela and Chanock, 2015). The D′ patterns, compared with r2 values, reflect the allele frequency differences among the SNPs (D′ > 0.6 in most cases) and suggest that the candidate causal SNPs are not in complete linkage equilibrium (See Figure S6).
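The contrast between r2 and D′ for SNPs with very different allele frequencies can be illustrated with a small calculation. The sketch below uses the textbook two-locus definitions; the two minor allele frequencies (0.23 and 0.01) are taken from Table 1, but the haplotype frequency is a hypothetical value chosen to represent a rare allele that occurs only alongside the minor allele of the common SNP. It is not an estimate from the CanPath or 1000 Genomes data.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (r2, |D'|) for two biallelic SNPs.

    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales D by the largest value it could take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


# MAFs from Table 1; p_ab = 0.01 is a hypothetical haplotype frequency.
r2, d_prime = ld_stats(p_a=0.23, p_b=0.01, p_ab=0.01)
print(f"r2 = {r2:.3f}, |D'| = {d_prime:.2f}")
```

In this hypothetical configuration |D′| is at its maximum while r2 stays well below 0.2, which is the kind of pattern that arises when a rare variant sits on the background of a common one.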
We explored with HaploReg (version 4) (Ward and Kellis, 2012, 2016) if other variants in the same LD block as our credible set SNPs in the OCA2/HERC2 locus harbor a putative regulatory function, to consider genetic variants that may have not been present in our dataset after imputation. By using as input the top SNP on each of the five credible sets (Table 1), we identified a nominally significant enrichment of enhancers (as defined by the 15-state core ChromHMM model) in foreskin melanocytes (binomial test compared with all 1KGP variants with MAF ≥5%; p value = 0.0169). The SNP rs117743506, which is in high LD with rs117744568 and in the same credible set, has an enhancer state in foreskin melanocytes. In addition, it may alter the motif of POU3F2, a transcription factor (TF) present in melanoma cell lines known to alter the expression of pigmentation genes (i.e. MITF, KITLG), although a recent study suggests that this TF does not have a role in normal skin melanocytes (Larue et al., 2010; Chitsazan et al., 2020).
Finally, we explored SNP-SNP interactions across the top GWAS SNPs that passed the genome-wide significant threshold using the program CASSI and included also the loci fine mapped in the OCA2/HERC2 region (SNPs in Figure 3). By applying a Bonferroni-corrected p value threshold = 0.0015 (i.e. 0.05/33 pairwise tests), we did not identify significant interactions. However, there were nominally significant interactions between SLC24A4 (index SNP: rs4144266) and HERC2/OCA2 (SNPs: rs12913832 and rs4778138) (see Table S7).
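The Bonferroni threshold quoted above follows directly from dividing the family-wise error rate by the number of tests. A minimal sketch (the function name is hypothetical):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls family-wise error at alpha."""
    return alpha / n_tests


# 33 pairwise interaction tests at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 33)
print(f"{threshold:.4f}")  # 0.0015
```

An interaction that is only nominally significant, that is, with a p value below 0.05 but above this threshold, does not survive the correction.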
Associations with eye color in recent studies
By investigating eye color variation in the OCA2 locus, Andersen et al. (2016) identified that two missense SNPs within OCA2 (rs121918166 and rs74653330) have a measurable effect on eye color variation, in which the alternative alleles decrease melanin levels, even in a heterozygote state. These two SNPs were rare in our sample (MAF <1%); therefore, we did not consider them in our GWAS analyses. By looking at the genotype-level data in these two rare SNPs, we identified 108 individuals heterozygous for either rs121918166 or rs74653330. Among these, 16 individuals were homozygous for the nonblue eye color rs12913832-AA genotype, but none self-reported having blue eye color (brown = 11; green = 1; hazel = 4). In addition, among heterozygous individuals for either one of the rare SNPs (i.e. rs121918166 or rs74653330), 12 individuals were also heterozygous for the nonblue eye color rs12913832-AG genotype and self-reported blue eye color. These results indicate that these two rare variants do not account for the incidences of blue eye color with an rs12913832-AA background in the CanPath, but they may influence the eye color phenotype under an rs12913832-AG background. Furthermore, we identified in our sample a subset of individuals (N = 904) harboring the rs12913832:GG genotype, 21 of whom self-reported brown eye color, 631 green eye color, and 252 hazel eye color, suggesting that the rs12913832:GG genotype does not exclusively yield a blue eye color.
We compared the OCA2/HERC2 haplotypes previously described for eye color in European ancestry populations, with the fine-mapped loci reported in the present study. Donnelly et al. (2012) identified three haplotypes in the region, two of which match the fine-mapped SNPs we have identified: rs12913832 (BEH2) and rs4778138 (BEH1). The third haplotype includes rs916977 and rs1667394, but these two SNPs are not highly correlated (i.e., r2 < 0.6) with any of our fine-mapped loci. The IrisPlex System is widely used to predict blue/brown eye color based on common genetic polymorphisms (Walsh et al., 2012). Currently, it includes two SNPs in the OCA2/HERC2 region (rs12913832 and rs1800407). The rs1800407 SNP is a missense SNP in OCA2, known to be involved in pigmentation variation (Donnelly et al., 2012; Pospiech et al., 2014; Kidd et al., 2020). This SNP did not pass the quality control in our study, but other SNPs in high LD are not within the fine-mapped 95% credible sets. In addition, other common SNPs across other genes are also included in the IrisPlex System, namely SLC24A4 (rs12896399), SLC45A2 (rs16891982), TYR (rs1393350), and IRF4 (rs12203592). Importantly, in our study there is a genome-wide significant association of TYRP1 with eye color, but this locus is not included in the IrisPlex System.
Adhikari et al. (2019) conducted conditional analyses of association of eye color (measured qualitatively and quantitatively) in Latin American individuals with mainly European and Native American ancestry and identified up to five independent signals in the HERC2/OCA2 locus (indexed by: rs4778219, rs1800407, rs1800404, rs12913832, and rs4778249). Aside from rs12913832, none of their index SNPs are within the candidate causal SNP sets in our sample, even though rs1800404 was genome-wide significant in our meta-analysis (p value = 1.18 × 10−11). In addition, they identified three novel loci associated with eye pigmentation: DSTYK (chromosome 1), WFDC5 (chromosome 20), and MPST (chromosome 22). We followed up each of the three index SNPs in our meta-analyses (rs3795556, rs17422688 and rs5756492), but failed to replicate the first two SNPs using a Bonferroni correction (p value threshold = 0.005), whereas rs5756492 was not present in our meta-analysis. These differences may be driven by population ancestry differences, given that the CANDELA cohort includes recently admixed individuals from Latin America, although the phenotyping approach may also be driving these differences.
Finally, the largest eye color GWAS to date conducted in populations of mainly European ancestry reported several novel loci, which had not been previously associated with eye color, and a subset of them has not been previously associated with any pigmentation trait (eye, hair, or skin pigmentation) (Simcoe et al., 2021). However, the ILRUN locus identified in the present study was not among their novel signals, nor were we able to replicate it. In addition, they conducted conditional analyses of association to identify secondary signals on the significant loci, in which they identified a total of 115 independent signals, including three signals in chromosome X. Notably, they identified 33 independent signals in the HERC2/OCA2 region and two signals in the nearby gene GABRB3. We followed up their signals (Table S1 from Simcoe et al., 2021) in our meta-analyses and identified 50 SNPs nominally significant in our meta-analysis (p value ≤ 0.05), all with a consistent direction of effect between both studies (see Table S5). This set of SNPs includes novel associations with eye color and/or pigmentation traits included in their study (i.e. DTL, MITF, PDCD6/AHRR, ADRB2, GCNT2, and SIK1). After considering a Bonferroni correction (0.05/112; p value ≤ 4.46e-04), 16 SNPs remained significant, all of which overlap known pigmentation genes (IRF4, TYRP1, TYR, OCA2, and HERC2).
Lastly, we checked if the independent SNPs associated with eye color identified with GCTA-COJO by Simcoe et al. (2021) overlap with our eye color SNPs highlighted by GCTA-COJO (See Table S3). We identified an overlap of three SNPs: rs12203592 (IRF4), rs1126809 (TYR), and rs1129038 (HERC2). Interestingly, even though Simcoe et al. identified several independent loci in the OCA2/HERC2 region, the known rs12913832 SNP is not among them, likely because another SNP in perfect LD has a lower p value, which is similar to what we observe in our GCTA-COJO analysis (i.e. rs1129038). We also compared the same independent SNPs identified by Simcoe et al. with our candidate causal loci as defined by FINEMAP and found three overlapped SNPs: rs12203592 (IRF4), rs1126809 (TYR), and rs13297008 (TYRP1).
Colocalization with expression and methylation QTLs from cultured melanocytes
We performed colocalization analyses with hyprcoloc (Foley et al., 2021) of the eye color meta-analysis with melanocyte gene expression and methylation cis-QTLs (eQTLs, meQTLs, respectively) to explore if there were shared causal signals (see STAR Methods for details). Through the colocalization of GWAS with eQTLs, we identified a region overlapping OCA2 and AC090696.2 (the latter being a transcript that partially overlaps OCA2) (Table 2), in which the candidate marker is rs12913832. We also colocalized meQTLs with GWAS hits (Table 2) overlapping the gene body of HERC2 (tagged by cg25622125, cg27374167, and cg05271345) using a posterior probability threshold of 0.8. Notably, we did not find colocalized eQTLs on the TYRP1 locus that passed the probability cutoff.
Table 2.
Chromosome | Candidate SNP | Posterior probability | Regional probability | Posterior explained by SNP | Gene/Methylation annotation | QTL |
---|---|---|---|---|---|---|
15 | rs12913832 | 0.9999 | 1 | 1 | OCA2 | eQTL |
15 | rs12913832 | 0.9864 | 0.9891 | 1 | AC090696.2 | eQTL |
15 | rs12913832 | 0.9991 | 0.9995 | 1 | HERC2 (Body); OpenSea | meQTL |
15 | rs12913832 | 0.9855 | 0.9872 | 1 | HERC2 (Body); S_Shelf | meQTL |
15 | rs12913832 | 0.9807 | 0.9831 | 1 | HERC2 (Body); S_Shore | meQTL |
The methylation annotation indicates the location with respect to the nearest gene (TSS = transcription start site), as well as the location of the tagged CpG marker within the CpG island.
Transcriptome-wide association studies
We conducted TWAS using a subset of the CanPath cohort as LD reference and the expression weights from cultured melanocytes to predict the gene expression profile with FUSION (Gusev et al., 2016). Our results highlighted the expression of three genes as significantly associated with eye color: OCA2, SLC24A4, and RIN3 (See Table S6; See Figure S7). The gene RIN3 is located near SLC24A4, and by conducting conditional TWAS we have shown that these two genes are not independent from each other (See Figure S8).
Genetic correlations
We calculated the genetic correlation between eye and hair color using the data from the two CanPath genotyping arrays for which we had full phenotype data (Lona-Durazo et al., 2021). Using a linear scale for both traits and the same covariates as used in the GWAS (See STAR Methods for details), there is a genetic correlation (rg) of 55% (SE = 0.12; p value = 7.33e-6) and 69% (SE = 0.21; p value = 0.001) on the UK Biobank and GSA arrays, respectively. Similar to the approach used in a previous study (Lin et al., 2016), we then calculated the genetic correlation without controlling for the effect of significant principal components and obtained genetic correlation values of 63% (SE = 0.08; p value = 3.41e-15) and 79% (SE = 0.15; p value = 1.39e-07) on the UK Biobank and GSA arrays, respectively. These results are in line with the genetic correlations previously reported (Lin et al., 2016), in which they found a lower correlation when including principal components as covariates, due to the correlations between ancestry captured by the PCs and eye or hair color. Nevertheless, by not controlling for significant PCs, we may as well be capturing ancestry in the genetic correlation estimation.
Discussion
In this paper, we present the results of our genome-wide association studies of eye color, as measured categorically through self-reports, from 5,641 participants of the Canadian Partnership for Tomorrow’s Health (CanPath). We did not identify new loci associated with eye color that were successfully replicated, and we focused on performing downstream analysis to pinpoint candidate causal SNPs, specifically on those loci for which a functional variant has not yet been identified or in which there is evidence of more than one independent signal. We found that fine-mapping provides evidence for multiple independent SNPs within the HERC2/OCA2 region, whereas other loci likely have a single causal signal. Furthermore, we characterized our GWAS signals by using colocalization analyses with expression and methylation QTLs of cultured melanocytes and conducted TWAS, in which we identified the expression of SLC24A4/RIN3 and OCA2 as significantly associated with eye color. Lastly, we explored the genetic correlations between hair and eye color in the CanPath cohort.
One of the caveats of this study is that we relied on eye color categories self-reported by participants of the CanPath cohorts, and categorical classifications do not capture iris color variation as well as quantitative measures do (Liu et al., 2010; Norton et al., 2015; Edwards et al., 2016). In our sample we have identified significant differences in self-reporting of eye color between sexes, and although these may be a reflection of true sex differences, as has been previously reported in other studies reporting self-assessed categorical eye color (Sulem et al., 2007), we cannot discard the possibility of a self-reporting bias. This limitation is counter-balanced by the relatively large sample size, in comparison to the majority of previous studies (Kayser et al., 2008; Candille et al., 2012; Beleza et al., 2013), with the exception of the largest recent GWAS of eye color (Simcoe et al., 2021). Furthermore, significantly associated loci from self-reported eye color are useful in forensics, for predicting eye color categories, which the human eye can easily distinguish. Indeed, we have here identified most signals associated with eye color that are used in the IrisPlex eye color prediction system (Walsh et al., 2014, 2017; Chaitanya et al., 2018). One of the loci not included currently in the IrisPlex system is TYRP1, a locus that could potentially improve the eye color prediction system. However, it is important to point out that structural features of the iris (i.e. contraction furrows, Wolfflin nodules, heterochromia) also contribute to color perceptions, but we are not able to distinguish them using the current dataset.
Through our GWAS meta-analysis we identified five known loci associated with eye color, encompassing the genes SLC24A4, IRF4, TYRP1, TYR, and HERC2/OCA2, similar to what was identified in a recent large GWAS of both categorical and quantitative eye color loci in a Latin American (CANDELA) cohort (Adhikari et al., 2019). The only two known pigmentation loci that our GWAS failed to identify as significantly associated with eye color encompass the genes SLC24A5 on chromosome 15 and SLC45A2 on chromosome 5, in which missense variants (rs1426654 and rs16891982) are known to alter pigmentation traits (Lamason et al., 2005). The missense SNP on SLC24A5 was rare in our sample (MAF <1%) hence excluded, in line with frequencies observed in the 1000 Genomes Project European populations, in which the alternative allele is nearly fixed. In the case of the SLC45A2 locus, the missense SNP did not reach genome-wide significance (p value = 1.71e-6).
Our fine-mapping analyses identified known causal pigmentation loci in the credible sets, such as rs12203592 on IRF4, rs1126809 on TYR, and rs12913832 on HERC2. Contrary to what has been observed for hair pigmentation (Adhikari et al., 2019), we identified here one independent SNP in the TYR locus associated with eye color, even though there is at least another independent missense variant (rs1042602) known to alter melanin synthesis within the same gene, in addition to candidate regulatory variants in the upstream GRM5 gene associated with skin pigmentation (Stokowski et al., 2007; Sulem et al., 2007; Liu et al., 2010, 2015; Beleza et al., 2013; Jacobs et al., 2013; Adhikari et al., 2019; Lona-Durazo et al., 2019). This result is also in line with what was reported in the CANDELA study (Adhikari et al., 2019), in which, compared with hair color, they identified a single candidate SNP in the TYR locus associated with eye color. This exemplifies the importance of characterizing the genetic architecture of different pigmentation traits independently and opens up new questions to investigate the different mechanisms involved in melanin synthesis between cutaneous versus iris melanocytes.
We conducted TWAS and colocalization analysis with expression and methylation QTLs to further explore the shared causal signals among these phenotypes. Through colocalization with melanocyte eQTLs, we found colocalization in the OCA2 region, likely regulated by the SNP rs12913832 in the nearby HERC2 gene. Similarly, colocalization with meQTLs highlighted a signal in the HERC2 locus. The shared signals between meQTLs, eQTLs, and eye color GWAS hits in the HERC2 region may suggest that DNA methylation could play a role in the differential expression of OCA2, thus influencing the eye color phenotype, although this cannot be confirmed with the current evidence. Further analyses, such as Mendelian randomization, will be useful to evaluate causal associations among these traits (e.g. Bonilla et al., 2020).
We did not find colocalization of GWAS SNPs with eQTLs on the TYRP1 locus, even though our GWAS and fine-mapping results suggest a regulatory role of the candidate causal variants in this locus due to (1) the location ∼11 kb upstream of the gene and (2) the overlap of a SNP (rs13297008) with open chromatin regions in foreskin melanocytes. In addition, this gene was absent from the TWAS expression weights dataset, suggesting that the gene expression in the current dataset is not sufficiently heritable (i.e. heritability p > 0.01). Therefore, we are not able to provide evidence of the mechanism by which the variants in the locus affect pigmentation variation, nor are we able to nominate a single causal SNP.
The cultured melanocyte expression and methylation QTLs we used for colocalization and TWAS come from newborn foreskin melanocytes (Zhang et al., 2018, 2021). Similarly, the regulatory annotations from the ENCODE and Roadmap Epigenomics projects (Dunham et al., 2012; Roadmap Epigenomics Consortium et al., 2015) also come from melanocytes, keratinocytes, and fibroblasts from foreskin tissue. The melanocytes from skin and iris have several similarities and the same embryological origin, but there are also significant differences between them. For instance, the melanosomes within the iris melanocytes are retained in the cytoplasm, and they are not transferred through dendrite-like structures to adjacent keratinocytes, as is the case in the skin and hair melanocytes (Sturm and Frudakis, 2004). Moreover, the iris melanocytes are not reactive to the alpha melanocyte stimulating hormone (α-MSH) (Li et al., 2006), and instead, alternative signaling cascades trigger and regulate melanogenesis (Zhou et al., 2018). Therefore, future QTL efforts using a more precise tissue type, such as uveal melanocytes, may aid in characterizing the regulatory differences between cutaneous and iris melanocytes.
The HERC2/OCA2 region on chromosome 15 has the strongest effect on eye color variation, such that blue eye color was initially considered a Mendelian trait (Davenport and Davenport, 1907; Sturm and Frudakis, 2004). The most significant variant associated with blue versus brown eye color is rs12913832, an enhancer of the expression of OCA2 (Visser et al., 2012), whereas the same polymorphism only causes a mild decrease of hair and skin eumelanin content, suggesting that the effect of this locus is different between dermal and iris melanocytes (Visser et al., 2014). Our findings suggest that it is likely that other SNPs in the locus also have an effect on the expression of OCA2 in the iris. For instance, a subset of participants harbored the rs12913832 homozygous genotype associated with blue eye color (i.e. GG), but they self-reported non-blue eye color. In addition, there may be rare genetic variants within the OCA2/HERC2 region (i.e. rs121918166 and rs74653330) accounting for the blue eye color individuals under the heterozygous rs12913832 genotype, but larger sample size studies are needed to statistically test this hypothesis. Therefore, the expression of OCA2 might be induced by other regulatory variants in the locus, counteracting the effect of rs12913832, as has been previously proposed (Andersen et al., 2016). An alternative explanation could be that a subset of participants self-reported their eye color inaccurately, a hypothesis that we are not able to discard.
In addition, it is possible that genetic interactions between IRF4 and OCA2 also play a role. For instance, it is known that individuals may have blue eye color when harboring one or two rs12913832 A-alleles (HERC2), associated with nonblue eye color, along with one or two rs12203592 T-alleles, associated with light eye color (Laino et al., 2018). Similarly, in the CanPath cohort there is a significant increase in the number of nonbrown eye color individuals with the rs12913832-AG genotype as the number of rs12203592-T alleles increases (Fisher exact test p value = 2.2e-16) (see Tables S8 and S9 and Figure S9). Finally, it has been recently suggested that SNPs in the genes TYR (rs1126809), TYRP1 (rs35866166, rs62538956), and SLC24A4 (rs1289469) may be responsible for the brown eye color in individuals of European ancestry with an rs12913832 homozygous G-allele background (Meyer et al., 2020). However, our formal interaction analyses using CASSI did not identify significant interactions (after Bonferroni correction) between any pair of markers analyzed, including the polymorphisms rs12913832 (HERC2) and rs12203592 (IRF4).
Genetic correlations between hair and eye color in the CanPath cohort are high, in line with what has been previously reported (Lin et al., 2016) and consistent with the effect that most pigmentation genes have on both phenotypes (e.g. SLC24A4, IRF4, OCA2). However, we have demonstrated that certain genetic differences come to light when investigating candidate causal variants across the genome. Within the CanPath cohort, we observed that red hair color is driven mainly by multiple candidate causal signals in the MC1R locus and that variants within the same gene also have a significant effect upon blonde hair color. In contrast, variants within MC1R and its antagonist, ASIP, are not associated with eye color, which may be explained by the fact that MC1R is not expressed in iris melanocytes (Li et al., 2006). In addition, this may explain why iris melanocytes do not respond to UV radiation as opposed to skin melanocytes. HERC2/OCA2 is the most significant locus in our analysis of blonde versus black and brown versus black hair color (although as described earlier, MC1R is the most important locus determining red hair color). HERC2/OCA2 is also the most significant locus for eye color. However, the signal from hair color is primarily driven by rs12913832, whereas there are several independent signals in HERC2/OCA2 associated with eye color. Lastly, even though IRF4, a transcription factor that upregulates tyrosinase, has a large effect on both blonde hair and blue eye color, the direction of effect of the causal SNP rs12203592 is opposite between the two traits: the derived T-allele is associated with blue eye color, whereas the same allele is associated with the presence of brown hair color (Praetorius et al., 2013).
Limitations of the study
The main caveat of the study is the self-reported eye color from CanPath participants.
STAR★Methods
Key resources table
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact: Frida Lona-Durazo (frida.lona-durazo@mail.utoronto.ca).
Materials availability
This study did not generate new unique reagents.
Experimental model and subject details
This study was approved by the University of Toronto Ethics Committee (Human Research Protocol # 36429) and data access was granted by the Canadian Partnership for Tomorrow’s Health (Application number DAO-034431). The samples in this study correspond to a subset of 5,675 individuals from the Canadian Partnership for Tomorrow’s Health (CanPath), who were sampled in different provinces: Alberta (N = 926; 16.4%), Atlantic Coast Provinces (i.e. New Brunswick, Newfoundland, Nova Scotia and Prince Edward Island) (N = 385; 6.8%), British Columbia (N = 965; 17.1%), Ontario (N = 934; 16.5%) and Quebec (N = 2,434; 43.1%). We selected the individuals who self-reported having European-related ancestry and for whom self-reported eye colour was available (N = 5,641), of whom 58.78% were females. The average age across participants was 55 years old (SE ± 0.11).
Method details
Genotyping of participants and quality control
Individuals who self-reported as having European-related ancestry were genotyped between 2012 and 2018 using two different genotyping array chips: Axiom 2.0 UK Biobank (Affymetrix) (N = 3,212) and the Global Screening Array (GSA) 24v1+MDP (N = 2,429) by the Canadian Partnership for Tomorrow’s Health (CanPath). The number of single nucleotide polymorphisms (SNPs) of these chip arrays ranges between 658,296 and 813,168 SNPs.
We performed genotype quality control for each array chip separately by first filtering out variants that deviated in minor allele frequency >0.2 from the 1000 Genomes Project Phase 3 European sample (1KGP-EUR), GC/TA variants with minor allele frequency >0.4 in the 1KGP-EUR and flipping alleles according to the 1KGP-EUR, using a Perl script (version 4.2) (Rayner, 2019). Afterwards, we used PLINK (version 1.9) (Purcell et al., 2007; Chang et al., 2015) to filter out variants with minor allele frequency <1%, high missing genotyping rate (--geno 0.05), high missing individual rate (--mind 0.05) or variants that significantly deviated from the Hardy-Weinberg Equilibrium (--hwe 1e-06). Then, we also identified second-degree relatives (--genome, PI_HAT >0.2) using a pruned set of variants in linkage disequilibrium (LD) (--indep-pairwise 100 10 0.1), and filtered out, from each pair, the individual with the lowest genotyping rate. Finally, we performed a Principal Components Analysis (PCA) of a pruned set of common variants of our study samples projected on the 1KGP Phase 3 samples on PLINK (version 1.9) (Purcell et al., 2007; Chang et al., 2015), and removed individual outliers that did not cluster within the European sample of the 1KGP by inspecting the first three principal components (total PCA outliers across genotyping arrays = 81). Amongst the outliers, 63 individuals are from Quebec, 8 from British Columbia, 5 from the Atlantic Provinces, 5 from Alberta and none from Ontario.
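The relatedness filter described above can be illustrated with a minimal Python sketch. It assumes that pairs exceeding the PI_HAT threshold have already been extracted (for example from the PLINK `--genome` output) and that a per-individual genotyping rate is available. The function name, sample identifiers, and call rates are hypothetical, and skipping pairs already resolved by an earlier removal is an assumption of this sketch rather than a documented detail of the original pipeline.

```python
def select_relatives_to_remove(related_pairs, call_rate):
    """From each related pair, flag the individual with the lower genotyping rate."""
    removed = set()
    for id_a, id_b in related_pairs:
        # Assumption: a pair is already resolved if one member was removed earlier.
        if id_a in removed or id_b in removed:
            continue
        # On a tie, the second member of the pair is removed.
        worse = id_a if call_rate[id_a] < call_rate[id_b] else id_b
        removed.add(worse)
    return removed


# Hypothetical pairs and call rates, for illustration only.
pairs = [("S1", "S2"), ("S2", "S3"), ("S4", "S5")]
rates = {"S1": 0.999, "S2": 0.990, "S3": 0.995, "S4": 0.980, "S5": 0.997}
to_remove = select_relatives_to_remove(pairs, rates)
print(sorted(to_remove))
```

In this toy input, S2 is dropped once and thereby resolves both of the pairs it belongs to, so fewer individuals are lost than if every pair were processed independently.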
Imputation of genotypes
Each genotyping array was first phased with EAGLE2 (version 2.0.5) (Loh et al., 2016) using the Sanger Imputation Server (McCarthy et al., 2016). After phasing, samples on each genotyping array were imputed on the Sanger Imputation Server using the positional Burrows-Wheeler transform (PBWT) algorithm (Durbin, 2014) and the Haplotype Reference Consortium (HRC) release 1.1 dataset as reference (McCarthy et al., 2016). The HRC includes ∼64,000 haplotypes and ∼40,000,000 autosomal SNPs of ∼32,000 individuals predominantly of European ancestry, which makes it ideal for the imputation of our datasets, which are of European-related ancestry. After imputation, we used PLINK (version 2) (Purcell et al., 2007; Chang et al., 2015) to filter out variants with minor allele frequency <1%, high missing genotyping rate (--geno 0.05), imputation score (INFO) < 0.3, or variants that significantly deviated from the Hardy-Weinberg Equilibrium (--hwe 1e-06).
Phenotyping
Participants of the CanPath answered a questionnaire that included self-report on eye colour using the following discrete categories: grey, blue, green, amber, hazel or brown eye colour. These categories were then transformed into a linear scale using R (version 3.5.1) (R Core Team, 2019) to build a linear model with the following levels: 1 = grey or blue, 2 = green, 3 = hazel, 4 = brown. Table S2 shows the number of individuals on each eye colour category by genotyping array. In addition, participants also reported their age and sex. We excluded the individuals who reported amber eye colour, due to the low sample count.
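The recoding of the questionnaire categories can be sketched as follows. The original transformation was done in R; this Python version is only an illustration of the same mapping, and the function name and example responses are hypothetical.

```python
# Mapping from self-reported category to the 1-4 scale described in the text.
EYE_COLOUR_SCALE = {"grey": 1, "blue": 1, "green": 2, "hazel": 3, "brown": 4}


def code_eye_colour(category):
    """Return the 1-4 score, or None for categories that are excluded (e.g. amber)."""
    return EYE_COLOUR_SCALE.get(category.strip().lower())


# Hypothetical questionnaire responses, for illustration only.
reported = ["Blue", "amber", "hazel", "Brown", "grey"]
coded = [code_eye_colour(c) for c in reported]
kept = [score for score in coded if score is not None]
print(coded, kept)
```

Returning `None` for amber keeps the exclusion explicit, so excluded individuals can be counted before they are dropped.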
Genome-wide association studies (GWAS) and meta-analyses
Genome-wide association studies of eye colour were performed for each genotyping array with a linear mixed model on Genome-Wide Complex Trait Analysis (GCTA-MLMA) 1.26.0 (Yang et al., 2011, 2014), using an additive genetic model (i.e. the effect size is a linear function of the number of effect alleles). We performed a PCA of a pruned set of genotyped variants for each genotyping array after quality control, keeping only SNPs with MAF >0.05 and excluding regions of high LD, using PLINK (version 1.9) (Purcell et al., 2007; Chang et al., 2015). We included in the model sex, age and the first ten PCs as fixed effects, and a genetic relationship matrix (GRM) of genotyped SNPs computed on GCTA 1.26.0 (Yang et al., 2011, 2014) as random effects, to control for more subtle population structure. To evaluate potential residual population substructure, we compared the expected vs. observed p-values using Q-Q plots on R (version 3.5.1) (R Core Team, 2019), and ran LD Score regression with LDSC, in which an LD Score intercept considerably higher than 1 may indicate remaining confounding bias (Bulik-Sullivan et al., 2015).
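The expected versus observed comparison behind a Q-Q plot can be sketched in a few lines of Python. This is an illustration only: the plots in the study were produced in R, the function name is hypothetical, and the plotting positions (i − 0.5)/n are one common convention that may differ from the one used by a given plotting package.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10 p-value pairs under a uniform null."""
    observed = sorted(p_values)
    n = len(observed)
    # Assumed convention: the i-th smallest p-value is compared with (i - 0.5) / n.
    expected = [(i - 0.5) / n for i in range(1, n + 1)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]


# Hypothetical p-values, for illustration only.
points = qq_points([0.5, 0.1, 0.9, 0.3])
for exp_val, obs_val in points:
    print(round(exp_val, 3), round(obs_val, 3))
```

Points falling systematically above the diagonal across the whole distribution, rather than only in the tail, would be the pattern that suggests residual stratification.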
We performed a meta-analysis of eye colour using the beta coefficient and standard error (SE) of each study on the software METASOFT (version 2.0.1) (Han and Eskin, 2011). METASOFT conducts a meta-analysis using a fixed effects model (FE), which works well when there is no evidence of heterogeneity (i.e. assumes same effect size across studies), and an optimized random effects model (RE2), which works well when there is evidence of heterogeneity among studies (Han and Eskin, 2011). Additionally, METASOFT computes two estimates of statistical heterogeneity, Cochran’s Q statistic and I2, as well as a Bayesian posterior probability that an effect exists on each individual study (M) (Han and Eskin, 2012).
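For a single variant, the fixed effects model and the two heterogeneity statistics can be sketched with standard inverse-variance weighting. This is a generic textbook formulation written in Python for illustration; it is not METASOFT code, it does not implement the RE2 model or the M-values, and the effect sizes and standard errors in the example are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed effects estimate with Cochran's Q and I2 (%)."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2: share of the variability attributed to heterogeneity, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return beta, se, q, i2


# Hypothetical per-array estimates for one SNP, for illustration only.
beta, se, q, i2 = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
print(beta, se, q, i2)
```

With only two studies, Q has a single degree of freedom, so I2 is a coarse indicator and is best read alongside the Q p-value.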
For the meta-analyses results, we generated Manhattan and Q-Q plots using the qqman (Turner, 2018) and ggplot2 (Wickham et al., 2019) R packages. In addition, we visualized the significant loci with regional plots using the web-based program LocusZoom (Pruim et al., 2011), with the 1KGP Phase 3 European sample as reference LD. We focused our results on the fixed effects model, but we also report the RE2 on the summary statistics of the top signals as Data S1, and compared the statistical significance between both models when there was evidence of heterogeneity based on Cochran’s Q p-value and I2 statistics.
Annotation of significant loci
We used the web-based program SNPNexus (Ullah et al., 2012; Ullah et al., 2018) to annotate the genome-wide significant signals (p value < 1e-08) from the meta-analysis. Specifically, gene and variant type annotation were done using the University of California Santa Cruz (UCSC) and Ensembl databases (human genome version hg19); assessment of the predictive effect of non-synonymous coding variants on protein function was done with SIFT and PolyPhen scores. Both SIFT and PolyPhen output qualitative prediction scores (i.e. probably damaging/deleterious, possibly damaging/deleterious-low confidence, tolerated/benign). Non-coding variation scoring was assessed using CADD score, which is based on ranking the deleteriousness of a variant relative to all possible substitutions of the human genome. For instance, a score ≥ 20 indicates that the variant is predicted to be in the top 1% most deleterious variants in the genome (Ullah et al., 2018). In addition, we explored the effect of significant loci on RNA and protein expression using the GTEx database (Consortium et al., 2017) and the effect of significant genes using the Protein Atlas (Uhlén et al., 2015).
Approximate conditional analyses of association
To identify whether the genome-wide significant loci of our original meta-analyses were driven by one or more independent variants, we conducted approximate conditional and joint analyses of association (COJO) with GCTA (Yang et al., 2012). We performed the analysis (--cojo-slct) using as input the summary statistics of our eye colour meta-analysis (fixed effects, FE) and the weighted average effect allele frequency from all studies. In addition, the program requires a reference sample for computing LD correlations and, in the case of a meta-analysis, it is suggested to use one of the study’s large samples (Yang et al., 2012). Therefore, we ran the analysis twice: 1) using as a reference the sample genotyped with the Axiom UKBB array, and 2) using as a reference the sample genotyped with the GSA 24v1+MDP. We assumed that variants farther than 10 Mb apart are in complete linkage equilibrium and used a p-value threshold of 5e-08.
Statistical fine-mapping of significant loci
We used the program FINEMAP (version 1.4) (Benner et al., 2016) to identify candidate causal variants in the genome-wide associated loci across the genome for eye colour. FINEMAP is based on a Bayesian framework, which uses summary statistics and LD correlations among variants to compute the posterior probabilities of causal variants, with a shotgun stochastic search algorithm (Benner et al., 2016). Compared to other methods, FINEMAP allows a maximum of 20 causal variants per locus. To run the program, we used as input the meta-analysis summary statistics, including the weighted average MAF across all studies, and an LD correlation matrix from one of the large samples in our study (Axiom UKBB array, N = 4,745). The LD correlation matrix was computed using LDStore (version 2.0), which considers genotype probabilities (Benner et al., 2017). We defined regions for fine-mapping as ± 500 kb regions flanking the lead SNP, based on the genome-wide and suggestive signals of association from the meta-analyses, and setting the maximum number of causal SNPs to 10 for each locus (i.e. a maximum of 10 credible sets). A credible set is comprised of SNPs that cumulatively reach a probability of at least 95%. The SNPs within a credible set are referred to as candidate causal variants and each of them has a corresponding posterior inclusion probability (PIP).
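The definition of a 95% credible set can be illustrated with a short Python sketch. It assumes a single causal signal, for which the SNPs are ranked by posterior probability and added until the cumulative probability reaches the coverage level; FINEMAP builds its credible sets conditional on a causal configuration, so this is a simplification and not the program's algorithm. The function name, SNP labels, and probabilities are hypothetical.

```python
def credible_set(posteriors, coverage=0.95):
    """Smallest set of top-ranked SNPs whose cumulative posterior reaches `coverage`."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen, cumulative


# Hypothetical posterior probabilities for one signal, for illustration only.
snps, total = credible_set({"snpA": 0.70, "snpB": 0.20, "snpC": 0.06, "snpD": 0.04})
print(snps, round(total, 2))
```

A credible set with few SNPs indicates a well-resolved signal, whereas a large set reflects strong LD among candidates that the association data cannot separate.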
We filtered FINEMAP results by removing candidate causal variants with a log10BF < 2 from each of the 95% credible sets, where a log10BF ≥ 2 indicates considerable evidence of causality. We annotated the remaining SNPs using SNPnexus (Ullah et al., 2018) to obtain information about the overlapping/nearest genes, overlapping regulatory elements and CADD scores. Annotation of gene expression on ENCODE, Roadmap Epigenomics and Ensembl Regulatory Build was restricted to melanocytes and fibroblasts, which are the relevant cell types involved in eye colour. Based on the combined evidence of fine-mapping and posterior annotation, we defined the candidate causal variants with strong evidence of causality (based on their log10BF and annotation) as the most likely candidate causal variants. We computed LD correlations between the candidate causal SNP(s) on each locus (i.e. configuration with highest posterior probability and k number of SNPs) and the other variants on each locus using LDStore (version 2.0) and plotted the Posterior Inclusion Probability (PIP) results on R (version 3.5.1) (R Core Team, 2019) using ggplot2 (Wickham et al., 2019).
Given that there may be candidate functional SNPs that we did not genotype or did not impute with high accuracy, we explored if markers in the same LD blocks of the credible sets have functional annotations using HaploReg (version 4) (Ward and Kellis, 2012, 2016). We used as input the most likely candidate causal SNP on each credible set, the LD from the 1KGP European population with a threshold of r2 ≥ 0.8 and the core chromatin 15-state model, which is based on several histone marks associated with promoters, enhancers, insulators and heterochromatin.
SNP-SNP interactions
We computed statistical interactions using the program CASSI (Howey, 2017) across the index GWAS SNPs that reached the genome-wide significance threshold (IRF4: rs12203592; TYRP1: rs1326779; TYR: rs1126809; SLC24A4: rs4144266), together with the five fine-mapped SNPs in the OCA2/HERC2 region (rs4778138, rs12913832, rs117007668, rs117744568, rs71467328), using a linear regression model and the same covariates as in the GWAS. We used a Bonferroni-corrected p-value threshold to account for all the tests performed (0.05/33 tests = 0.0015).
Colocalization with expression and methylation QTLs from cultured melanocytes
We conducted colocalization analyses of our GWAS meta-analysis results with gene expression and methylation cis-QTL data from primary cultures of melanocytes isolated from the foreskin of 106 newborn males (Zhang et al., 2018, 2021). Cis-QTLs were assessed for variants in the ± 1 Mb region of each gene or CpG. We used the program hyprcoloc (Foley et al., 2021) to obtain the posterior probability of a variant being shared between the eye colour GWAS signals and the expression or methylation QTLs. We tested all the significant eQTL genes or meQTL probes within ± 250 kb regions flanking the most significant GWAS SNP on each of the genome-wide significant regions (p-value ≤ 5e-8) from the meta-analysis summary statistics (five different loci). We used as LD reference the matrix obtained from the CanPath’s Axiom UKBB Array (INFO score >0.3), computed on PLINK (version 1.9; --r square) (Purcell et al., 2007; Chang et al., 2015). We kept colocalized regions that reached a posterior probability ≥0.8, indicating high confidence of shared causality.
Transcriptome-wide association studies
We performed a transcriptome-wide association study (TWAS) by imputing the expression profile of the CanPath cohort using GWAS summary statistics and melanocyte RNA-seq expression data (Zhang et al., 2018). Using the program FUSION (Gusev et al., 2016), we used as LD reference the CanPath’s Axiom UKBB genotyping array computed in binary PLINK format (version 1.9; --make-bed) (Purcell et al., 2007; Chang et al., 2015). As recommended by FUSION, we used the LDSC munge_sumstats.py script to check the GWAS summary statistics (Bulik-Sullivan et al., 2015). Before running the script, we filtered out SNPs with MAF <0.01, SNPs with a genotyping missing rate >0.01 and SNPs that failed the Hardy-Weinberg test at a significance threshold of 1 × 10−7 using PLINK (version 1.9; --maf 0.01, --geno 0.01, --hwe 10e-7) (Purcell et al., 2007; Chang et al., 2015). We computed functional weights from our melanocyte RNA-seq data one gene at a time. Genes that failed quality control during a heritability check (using a minimum heritability p-value of 0.01) were excluded from further analyses, yielding a total of 3,998 genes. We restricted the locus to 500 kb on either side of the gene boundary. We applied a significance cut-off to the final TWAS result of 1.25e-5 (i.e. 0.05/3,998 genes tested). Finally, we performed conditional analysis on FUSION (FUSION_post.process.R script) if more than one gene in a locus was significant, to identify if these were independent signals.
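The Bonferroni cut-offs quoted in the Methods follow from dividing the nominal alpha by the number of tests. The short Python check below reproduces the TWAS threshold and the SNP-SNP interaction threshold from the numbers of tests stated in the text; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Numbers of tests taken from the Methods text.
twas_cutoff = bonferroni_threshold(0.05, 3998)        # genes tested in the TWAS
interaction_cutoff = bonferroni_threshold(0.05, 33)   # SNP-SNP interaction tests
print(f"{twas_cutoff:.3g}", f"{interaction_cutoff:.2g}")
```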
Genetic correlations
We used a bivariate restricted maximum likelihood (REML) approach to test for genome-wide pleiotropy between hair and eye colour using GCTA (--reml-bivar option) (Yang et al., 2012), by taking advantage of the hair colour meta-analysis dataset from CanPath (Lona-Durazo et al., 2021). To consider the whole spectrum of colour in both traits, and thus maximize the number of loci, we coded both traits on a linear scale (excluding red hair colour). Hair colour was coded as 1 = blonde, 2 = light brown, 3 = dark brown, and 4 = black, whereas eye colour was coded as 1 = grey/blue, 2 = green, 3 = hazel, and 4 = brown. Given that the program requires genotype-level data, we computed the analysis twice, using the two largest samples for which eye colour data was available: Axiom UKBB array (N = 3,212) and GSA 24v1+MDP (N = 2,429). Significance of the genetic correlations was computed with a likelihood ratio test on R (version 3.5.1) (R Core Team, 2019).
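A likelihood ratio test of this kind can be sketched as follows. The sketch assumes that the null model fixes the genetic correlation at zero and that the test statistic is compared against a chi-square distribution with one degree of freedom; both are assumptions of this illustration, as is the function name, and the log-likelihood values are hypothetical. The original test was computed in R.

```python
import math


def lrt_p_value(loglik_full, loglik_null):
    """Likelihood ratio statistic and p-value from a chi-square with 1 df (assumed)."""
    stat = max(2.0 * (loglik_full - loglik_null), 0.0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical REML log-likelihoods, for illustration only.
stat, p_value = lrt_p_value(-100.0, -102.0)
print(stat, round(p_value, 4))
```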
We first included in the model sex, age and the significant PCs as covariates, and restricted the analysis to SNPs with high INFO score (i.e. INFO >0.8) and MAF >1%. We then explored the correlations between each phenotype (i.e. hair and eye colour) and the eigenvectors of the principal components analysis. If the eigenvectors are correlated with the ancestry (i.e. geography) of the individuals, setting them as covariates may hinder the true genetic correlation between both traits, given that hair and eye colour are themselves correlated with ancestry. Therefore, we ran the genetic correlation a second time using as covariates only the non-significant principal components. In the case of the Axiom UKBB array we used PC3 and PC5, and in the case of GSA 24v1+MDP we used PC4, PC5 and PC8.
Quantification and statistical analysis
The quantitative and statistical analyses are described in the relevant sections of the Method details or in the table and figure legends.
Acknowledgments
The data used in this research were made available by CanPath (Canadian Partnership for Tomorrow’s Health, formerly CPTP), CARTaGENE, Alberta’s Tomorrow Project, Ontario Health Study, BC Generations Project, and Atlantic PATH. The authors would like to thank all the participants of the Canadian Partnership for Tomorrow’s Health. FLD was supported by the National Council for Science and Technology (CONACYT) in Mexico. EJP received funding from the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant). RT, KF, MAK, JC, TZ, and KMB are supported by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics (https://dceg.cancer.gov/); the content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. Computations were performed on the GPC supercomputer at the SciNet HPC Consortium, Canada and at the UTM High Performance Computing server at Mississauga, ON, Canada. This work also utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov). SciNet is funded by the Canada Foundation for Innovation under the auspices of Compute Canada; the Government of Ontario; Ontario Research Fund-Research Excellence; and the University of Toronto. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Graphical abstract created with Biorender.com.
Author contributions
EJP and FLD designed the study. FLD, RT, and EPC performed statistical analyses. FLD wrote the draft of the manuscript. FLD, EJP, EPC, KF, TZ, MAK, JC, IJ, and KMB aided in the interpretation of the results and in the preparation of the final version of the manuscript.
Declaration of interests
The authors declare no competing interests.
Published: June 17, 2022
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.isci.2022.104485.
Contributor Information
Frida Lona-Durazo, Email: frida.lonadurazo@mail.utoronto.ca.
Esteban J. Parra, Email: esteban.parra@utoronto.ca.
Data and code availability
The datasets supporting this manuscript are included as supplemental information. We provide the genome-wide (p ≤ 5e-8) and suggestive (p ≤ 1e-6) signals identified in the eye colour GWAS meta-analysis as a supplemental information File (Data S1). Further information and requests for data published here should be directed to CanPath, which regulates the access to the data and biological materials (https://canpath.ca/). Melanocyte genotype data, RNA-seq expression data, and all meQTL association results have been deposited in Genotypes and Phenotypes (dbGaP) under accession dbGaP: phs001500.v1.p1. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
References
- Adhikari K., Mendoza-Revilla J., Sohail A., Fuentes-Guajardo M., Lampert J., Chacón-Duque J.C., Hurtado M., Villegas V., Granja V., Acuña-Alonzo V., et al. A GWAS in Latin Americans highlights the convergent evolution of lighter skin pigmentation in Eurasia. Nat. Commun. 2019;10:358. doi: 10.1038/s41467-018-08147-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Andersen J.D., Pietroni C., Johansen P., Andersen M.M., Pereira V., Børsting C., Morling N. Importance of nonsynonymous <scp>OCA</scp> 2 variants in human eye color prediction. Mol. Genet. Genom. Med. 2016;4:420–430. doi: 10.1002/mgg3.213. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Beleza S., Johnson N.A., Candille S.I., Absher D.M., Coram M.A., Lopes J., Campos J., Araújo I.I., Anderson T.M., Vilhjálmsson B.J., et al. Genetic architecture of skin and eye color in an African-European admixed population. PLoS Genet. 2013;9:e1003372. doi: 10.1371/journal.pgen.1003372. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Benner C., Spencer C.C., Havulinna A.S., Salomaa V., Ripatti S., Pirinen M. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics. 2016;32:1493–1501. doi: 10.1093/bioinformatics/btw018.
- Benner C., Havulinna A.S., Järvelin M.R., Salomaa V., Ripatti S., Pirinen M. Prospects of fine-mapping trait-associated genomic regions by using summary statistics from genome-wide association studies. Am. J. Hum. Genet. 2017;101:539–551. doi: 10.1016/j.ajhg.2017.08.012.
- Bherer C., Labuda D., Roy-Gagnon M.H., Houde L., Tremblay M., Vézina H. Admixed ancestry and stratification of Quebec regional populations. Am. J. Phys. Anthropol. 2011;144:432–441. doi: 10.1002/ajpa.21424.
- Bito L.Z., Matheny A., Cruickshanks K.J., Nondahl D.M., Carino O.B. Eye color changes past early childhood: the Louisville twin study. JAMA Ophthalmol. 1997;115:659–663. doi: 10.1001/archopht.1997.01100150661017.
- Bonilla C., Bertoni B., Min J.L., Hemani G., Elliott H.R. Investigating DNA methylation as a potential mediator between pigmentation genes, pigmentary traits and skin cancer. Pigm. Cell Melanoma Res. 2020;34:892–904. doi: 10.1111/pcmr.12948.
- Bulik-Sullivan B.K., Loh P.R., Finucane H.K., Ripke S., Yang J., Patterson N., Daly M.J., Price A.L., Neale B.M. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 2015;47:291–295. doi: 10.1038/ng.3211.
- Candille S.I., Absher D.M., Beleza S., Bauchet M., McEvoy B., Garrison N.A., Li J.Z., Myers R.M., Barsh G.S., Tang H., Shriver M.D. Genome-wide association studies of quantitatively measured skin, hair, and eye pigmentation in four European populations. PLoS One. 2012;7:e48294. doi: 10.1371/journal.pone.0048294.
- Chaitanya L., Breslin K., Zuñiga S., Wirken L., Pośpiech E., Kukla-Bartoszek M., Sijen T., Knijff P., Liu F., Branicki W., et al. The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation. Forensic Sci. Int. Genet. 2018;35:123–135. doi: 10.1016/j.fsigen.2018.04.004.
- Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M., Lee J.J. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
- Chitsazan A., Lambie D., Ferguson B., Handoko H.Y., Gabrielli B., Walker G.J., Boyle G.M. Unexpected high levels of BRN2/POU3F2 expression in human dermal melanocytic Nevi. J. Invest. Dermatol. 2020;140:1299–1302.e4. doi: 10.1016/j.jid.2019.12.007.
- GTEx Consortium. Battle A., Brown C.D., Engelhardt B.E., Montgomery S.B. Genetic effects on gene expression across human tissues. Nature. 2017;550:204–213. doi: 10.1038/nature24277.
- Davenport G.C., Davenport C.B. Heredity of eye-color in man. Science. 1907;26:589–592. doi: 10.1126/science.26.670.589-b.
- Donnelly M.P., Paschou P., Grigorenko E., Gurwitz D., Barta C., Lu R.B., Zhukova O.V., Kim J.J., Siniscalco M., New M., et al. A global view of the OCA2-HERC2 region and pigmentation. Hum. Genet. 2012;131:683–696. doi: 10.1007/s00439-011-1110-x.
- Dunham I., Kundaje A., Aldred S.F., Collins P.J., Davis C.A., Doyle F., Epstein C.B., Frietze S., Harrow J., Kaul R., et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
- Durbin R. Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Bioinformatics. 2014;30:1266–1272. doi: 10.1093/bioinformatics/btu014.
- Edwards M., Cha D., Krithika S., Johnson M., Cook G., Parra E.J. Iris pigmentation as a quantitative trait: variation in populations of European, East Asian and South Asian ancestry and association with candidate gene polymorphisms. Pigm. Cell Melanoma Res. 2016;29:141–162. doi: 10.1111/pcmr.12435.
- Eriksson N., Macpherson J.M., Tung J.Y., Hon L.S., Naughton B., Saxonov S., Avey L., Wojcicki A., Pe'er I., Mountain J. Web-based, participant-driven studies yield novel genetic associations for common traits. PLoS Genet. 2010;6:e1000993. doi: 10.1371/journal.pgen.1000993.
- Foley C.N., Staley J.R., Breen P.G., Sun B.B., Kirk P.D.W., Burgess S., Howson J.M.M. A fast and efficient colocalization algorithm for identifying shared genetic risk factors across multiple traits. Nat. Commun. 2021;12:764. doi: 10.1038/s41467-020-20885-8.
- Galván-Femenía I., Obón-Santacana M., Piñeyro D., Guindo-Martinez M., Duran X., Carreras A., Pluvinet R., Velasco J., Ramos L., Aussó S., et al. Multitrait genome association analysis identifies new susceptibility genes for human anthropometric variation in the GCAT cohort. J. Med. Genet. 2018;55:765–778. doi: 10.1136/jmedgenet-2018-105437.
- Gusev A., Ko A., Shi H., Bhatia G., Chung W., Penninx B.W.J.H., Jansen R., de Geus E.J.C., Boomsma D.I., Wright F.A., et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 2016;48:245–252. doi: 10.1038/ng.3506.
- Han B., Eskin E. Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. Am. J. Hum. Genet. 2011;88:586–598. doi: 10.1016/j.ajhg.2011.04.014.
- Han B., Eskin E. Interpreting meta-analyses of genome-wide association studies. PLoS Genet. 2012;8:e1002555. doi: 10.1371/journal.pgen.1002555.
- Howey R. GNU General Public License; 2017. CASSI: Genome-wide Interaction Analysis Software. http://www.staff.ncl.ac.uk/richardhowey/cassi/
- Jacobs L.C., Wollstein A., Lao O., Hofman A., Klaver C.C., Uitterlinden A.G., Nijsten T., Kayser M., Liu F. Comprehensive candidate gene study highlights UGT1A and BNC2 as new genes determining continuous skin color variation in Europeans. Hum. Genet. 2013;132:147–158. doi: 10.1007/s00439-012-1232-9.
- Katsara M.A., Nothnagel M. True colors: a literature review on the spatial distribution of eye and hair pigmentation. Forensic Sci. Int. Genet. 2019;39:109–118. doi: 10.1016/j.fsigen.2019.01.001.
- Kayser M., Liu F., Janssens A.C.J., Rivadeneira F., Lao O., van Duijn K., Vermeulen M., Arp P., Jhamai M.M., van IJcken W.F., et al. Three genome-wide association studies and a linkage analysis identify HERC2 as a human Iris color gene. Am. J. Hum. Genet. 2008;82:411–423. doi: 10.1016/j.ajhg.2007.10.003.
- Kidd K.K., Pakstis A.J., Donnelly M.P., Bulbul O., Cherni L., Gurkan C., Kang L., Li H., Yun L., Paschou P., et al. The distinctive geographic patterns of common pigmentation variants at the OCA2 gene. Sci. Rep. 2020;10:15433. doi: 10.1038/s41598-020-72262-6.
- Laino A.M., Berry E., Jagirdar K., Lee K., Duffy D., Soyer H., Sturm R. Iris pigmented lesions as a marker of cutaneous melanoma risk: an Australian case-control study. Br. J. Dermatol. 2018;178:1119–1127. doi: 10.1111/bjd.16323.
- Lamason R.L., Mohideen M.A.P.K., Mest J.R., Wong A.C., Norton H.L., Aros M.C., Jurynec M.J., Mao X., Humphreville V.R., Humbert J.E., et al. SLC24A5, a putative cation exchanger, affects pigmentation in Zebrafish and humans. Science. 2005;310:1782–1786. doi: 10.1126/science.1116238.
- Landi M.T., Bishop D.T., MacGregor S., Machiela M.J., Stratigos A.J., Ghiorzo P., Brossard M., Calista D., Choi J., Fargnoli M.C., et al. Genome-wide association meta-analyses combining multiple risk phenotypes provide insights into the genetic architecture of cutaneous melanoma susceptibility. Nat. Genet. 2020;52:494–504. doi: 10.1038/s41588-020-0611-8.
- Larsson M., Pedersen N.L., Stattin H. Importance of genetic effects for characteristics of the human iris. Twin Res. 2003;6:192–200. doi: 10.1375/136905203765693843.
- Kobi D., Steunou A.L., Dembélé D., Legras S., Larue L., Nieto L., Davidson I. Genome-wide analysis of POU3F2/BRN2 promoter occupancy in human melanoma cells reveals Kitl as a novel regulated target gene. Pigm. Cell Melanoma Res. 2010;23:404–418. doi: 10.1111/j.1755-148X.2010.00697.x.
- Li L., Hu D.N., Zhao H., McCormick S.A., Nordlund J.J., Boissy R.E. Uveal melanocytes do not respond to or express receptors for alpha-melanocyte-stimulating hormone. Invest. Ophthalmol. Vis. Sci. 2006;47:4507–4512. doi: 10.1167/iovs.06-0391.
- Lin B.D., Willemsen G., Abdellaoui A., Bartels M., Ehli E.A., Davies G.E., Boomsma D.I., Hottenga J.J. The genetic overlap between hair and eye color. Twin Res. Hum. Genet. 2016;19:595–599. doi: 10.1017/thg.2016.85.
- Lin J.Y., Fisher D.E. Melanocyte biology and skin pigmentation. Nature. 2007;445:843–850. doi: 10.1038/nature05660.
- Liu F., Wollstein A., Hysi P.G., Ankra-Badu G.A., Spector T.D., Park D., Zhu G., Larsson M., Duffy D.L., Montgomery G.W., et al. Digital quantification of human eye color highlights genetic association of three new loci. PLoS Genet. 2010;6:e1000934. doi: 10.1371/journal.pgen.1000934.
- Liu F., Visser M., Duffy D.L., Hysi P.G., Jacobs L.C., Lao O., Zhong K., Walsh S., Chaitanya L., Wollstein A., et al. Genetics of skin color variation in Europeans: genome-wide association studies with functional follow-up. Hum. Genet. 2015;134:823–835. doi: 10.1007/s00439-015-1559-0.
- Lloyd-Jones L.R., Robinson M.R., Moser G., Zeng J., Beleza S., Barsh G.S., Tang H., Visscher P.M. Inference on the genetic basis of eye and skin color in an admixed population via Bayesian linear mixed models. Genetics. 2016;206:1113–1126. doi: 10.1534/genetics.116.193383.
- Loh P.R., Danecek P., Palamara P.F., Fuchsberger C., Reshef Y.A., Finucane H.K., Schoenherr S., Forer L., McCarthy S., Abecasis G.R., et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 2016;48:1443–1448. doi: 10.1038/ng.3679.
- Lona-Durazo F., Hernandez-Pacheco N., Fan S., Zhang T., Choi J., Kovacs M.A., Loftus S.K., Le P., Edwards M., Fortes-Lima C.A., et al. Meta-analysis of GWA studies provides new insights on the genetic architecture of skin pigmentation in recently admixed populations. BMC Genet. 2019;20:1–16. doi: 10.1186/s12863-019-0765-5.
- Lona-Durazo F., Mendes M., Thakur R., Funderburk K., Zhang T., Kovacs M.A., Choi J., Brown K.M., Parra E.J. A large Canadian cohort provides insights into the genetic architecture of human hair colour. Commun. Biol. 2021;4:1253. doi: 10.1038/s42003-021-02764-0.
- Machiela M.J., Chanock S.J. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31:3555–3557. doi: 10.1093/bioinformatics/btv402.
- McCarthy S., Das S., Kretzschmar W., Delaneau O., Wood A.R., Teumer A., Kang H.M., Fuchsberger C., Danecek P., Sharp K., et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 2016;48:1279–1283. doi: 10.1038/ng.3643.
- Meyer O.S., Lunn M.M.B., Garcia S.L., Kjærbye A.B., Morling N., Børsting C., Andersen J.D. Association between brown eye colour in rs12913832:GG individuals and SNPs in TYR, TYRP1, and SLC24A4. PLoS One. 2020;15:e0239131. doi: 10.1371/journal.pone.0239131.
- Norton H.L., Edwards M., Krithika S., Johnson M., Werren E.A., Parra E.J. Quantitative assessment of skin, hair, and iris variation in a diverse sample of individuals and associated genetic variation. Am. J. Phys. Anthropol. 2015;160:570–581. doi: 10.1002/ajpa.22861.
- Palstra R.J.T.S. Close encounters of the 3C kind: long-range chromatin interactions and transcriptional regulation. Briefings Funct. Genomics Proteomics. 2009;8:297–309. doi: 10.1093/bfgp/elp016.
- Parra E.J. Human pigmentation variation: evolution, genetic basis, and implications for public Health. Yearbk. Phys. Anthropol. 2007;134:85–105. doi: 10.1002/ajpa.20727.
- Pośpiech E., Wojas-Pelc A., Walsh S., Liu F., Maeda H., Ishikawa T., Skowron M., Kayser M., Branicki W. The common occurrence of epistasis in the determination of human pigmentation and its impact on DNA-based pigmentation phenotype prediction. Forensic Sci. Int. Genet. 2014;11:64–72. doi: 10.1016/j.fsigen.2014.01.012.
- Praetorius C., Grill C., Stacey S., Metcalf A., Gorkin D., Robinson K., Van Otterloo E., Kim R., Bergsteinsdottir K., Ogmundsdottir M., et al. A polymorphism in IRF4 affects human pigmentation through a tyrosinase-dependent MITF/TFAP2A pathway. Cell. 2013;155:1022–1033. doi: 10.1016/j.cell.2013.10.022.
- Pruim R.J., Welch R.P., Sanna S., Teslovich T.M., Chines P.S., Gliedt T.P., Boehnke M., Abecasis G.R., Willer C.J. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics. 2011;26:2336–2337. doi: 10.1093/bioinformatics/btq419.
- Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M.A., Bender D., Maller J., Sklar P., de Bakker P.I., Daly M.J., Sham P.C. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575. doi: 10.1086/519795.
- R Core Team. R Foundation for Statistical Computing; 2019. R: A Language and Environment for Statistical Computing. http://www.r-project.org/
- Rayner W. McCarthy group tools. 2019. https://www.well.ox.ac.uk/~wrayner/tools/index.html#Checking
- Rees J.L. Genetics of hair and skin color. Annu. Rev. Genet. 2003;37:67–90. doi: 10.1146/annurev.genet.37.110801.143233.
- Roadmap Epigenomics Consortium. Kundaje A., Meuleman W., Ernst J., Bilenky M., Yen A., Heravi-Moussavi A., Kheradpour P., Zhang Z., Wang J., Ziller M.J., et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248.
- Simcoe M., Valdes A., Liu F., Furlotte N.A., Evans D.M., Hemani G., Ring S.M., Smith G.D., Duffy D.L., Zhu G., et al. 23andMe Research Team. International Visible Trait Genetics Consortium. Genome-wide association study in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color. Sci. Adv. 2021;7:eabd1239. doi: 10.1126/sciadv.abd1239.
- Stokowski R.P., Pant P.K., Dadd T., Fereday A., Hinds D.A., Jarman C., Filsell W., Ginger R.S., Green M.R., van der Ouderaa F.J., Cox D.R. A genomewide association study of skin pigmentation in a south Asian population. Am. J. Hum. Genet. 2007;81:1119–1132. doi: 10.1086/522235.
- Sturm R.A., Frudakis T.N. Eye colour: portals into pigmentation genes and ancestry. Trends Genet. 2004;20:327–332. doi: 10.1016/j.tig.2004.06.010.
- Sulem P., Gudbjartsson D.F., Stacey S.N., Helgason A., Rafnar T., Magnusson K.P., Manolescu A., Karason A., Palsson A., Thorleifsson G., et al. Genetic determinants of hair, eye and skin pigmentation in Europeans. Nat. Genet. 2007;39:1443–1452. doi: 10.1038/ng.2007.13.
- Sulem P., Gudbjartsson D.F., Stacey S.N., Helgason A., Rafnar T., Jakobsdottir M., Steinberg S., Gudjonsson S.A., Palsson A., Thorleifsson G., et al. Two newly identified genetic determinants of pigmentation in Europeans. Nat. Genet. 2008;40:835–837. doi: 10.1038/ng.160.
- Turner S.D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. J. Open Source Softw. 2018;3:731. doi: 10.21105/joss.00731.
- Uhlén M., Fagerberg L., Hallström B.M., Lindskog C., Oksvold P., Mardinoglu A., Sivertsson Å., Kampf C., Sjöstedt E., Asplund A., et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347:1260419. doi: 10.1126/science.1260419.
- Ullah A.Z.D., Oscanoa J., Wang J., Nagano A., Lemoine N.R., Chelala C. SNPnexus: assessing the functional relevance of genetic variation to facilitate the promise of precision medicine. Nucleic Acids Res. 2018;46:109–113. doi: 10.1093/nar/gky399.
- Ullah A.Z.D., Lemoine N.R., Chelala C. SNPnexus: a web server for functional annotation of novel and publicly known genetic variants (2012 update). Nucleic Acids Res. 2012;40:65–70. doi: 10.1093/nar/gks364.
- Visser M., Kayser M., Grosveld F., Palstra R.J. Genetic variation in regulatory DNA elements: the case of OCA2 transcriptional regulation. Pigm. Cell Melanoma Res. 2014;27:169–177. doi: 10.1111/pcmr.12210.
- Visser M., Kayser M., Palstra R.J. HERC2 rs12913832 modulates human pigmentation by attenuating chromatin-loop formation between a long-range enhancer and the OCA2 promoter. Genome Res. 2012;22:446–455. doi: 10.1101/gr.128652.111.
- Wakamatsu K., Hu D.N., McCormick S.A., Ito S. Characterization of melanin in human iridal and choroidal melanocytes from eyes with various colored irides. Pigm. Cell Melanoma Res. 2007;21:97–105. doi: 10.1111/j.1755-148X.2007.00415.x.
- Walsh S., Wollstein A., Liu F., Chakravarthy U., Rahu M., Seland J.H., Soubrane G., Tomazzoli L., Topouzis F., Vingerling J.R., et al. DNA-based eye colour prediction across Europe with the IrisPlex system. Forensic Sci. Int. Genet. 2012;6:330–340. doi: 10.1016/j.fsigen.2011.07.009.
- Walsh S., Chaitanya L., Clarisse L., Wirken L., Draus-Barini J., Kovatsi L., Maeda H., Ishikawa T., Sijen T., de Knijff P., et al. Developmental validation of the HIrisPlex system: DNA-based eye and hair colour prediction for forensic and anthropological usage. Forensic Sci. Int. Genet. 2014;9:150–161. doi: 10.1016/j.fsigen.2013.12.006.
- Walsh S., Chaitanya L., Breslin K., Muralidharan C., Bronikowska A., Pospiech E., Koller J., Kovatsi L., Wollstein A., Branicki W., et al. Erratum to: global skin colour prediction from DNA. Hum. Genet. 2017;136:865–866. doi: 10.1007/s00439-017-1817-4.
- Ward L.D., Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40:930–934. doi: 10.1093/nar/gkr917.
- Ward L.D., Kellis M. HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res. 2016;44:D877–D881. doi: 10.1093/nar/gkv1340.
- Wickham H., Averick M., Bryan J., Chang W., McGowan L.D., François R., Grolemund G., Hayes A., Henry L., Hester J., et al. Welcome to the tidyverse. J. Open Source Softw. 2019;4:1686. doi: 10.21105/joss.01686.
- Yang J., Lee S.H., Goddard M.E., Visscher P.M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 2011;88:76–82. doi: 10.1016/j.ajhg.2010.11.011.
- Yang J., Ferreira T., Morris A.P., Medland S.E., Madden P.A.F., Heath A.C., Martin N.G., Montgomery G.W., Weedon M.N., Loos R.J., et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 2012;44:369–375. doi: 10.1038/ng.2213.
- Yang J., Zaitlen N.A., Goddard M.E., Visscher P.M., Price A.L. Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 2014;46:100–106. doi: 10.1038/ng.2876.
- Zhang M., Song F., Liang L., Nan H., Zhang J., Liu H., Wang L.E., Wei Q., Lee J.E., Amos C.I., et al. Genome-wide association studies identify several new loci associated with pigmentation traits and skin cancer risk in European Americans. Hum. Mol. Genet. 2013;22:2948–2959. doi: 10.1093/hmg/ddt142.
- Zhang T., Choi J., Kovacs M.A., Shi J., Xu M., Goldstein A.M., Trower A.J., Bishop D.T., Iles M.M., Duffy D.L., et al. Cell-type specific eQTL of primary melanocytes facilitates identification of melanoma susceptibility genes. Genome Res. 2018;28:1621–1635. doi: 10.1101/gr.233304.117.
- Zhang T., Choi J., Dilshat R., Einarsdóttir B.Ó., Kovacs M.A., Xu M., Malasky M., Chowdhury S., Jones K., Bishop D.T., et al. Cell-type-specific meQTLs extend melanoma GWAS annotation beyond eQTLs and inform melanocyte gene-regulatory mechanisms. Am. J. Hum. Genet. 2021;108:1631–1646. doi: 10.1016/j.ajhg.2021.06.018.
- Zhou D., Ota K., Nardin C., Feldman M., Widman A., Wind O., Simon A., Reilly M., Levin L.R., Buck J., et al. Mammalian pigmentation is regulated by a distinct cAMP-dependent mechanism that controls melanosome pH. Sci. Signal. 2018;11:eaau7987. doi: 10.1126/scisignal.aau7987.
Data Availability Statement
The datasets supporting this manuscript are included as supplemental information. We provide the genome-wide significant (p ≤ 5e-8) and suggestive (p ≤ 1e-6) signals identified in the eye color GWAS meta-analysis as a supplemental information file (Data S1). Further information and requests for data published here should be directed to CanPath, which regulates access to the data and biological materials (https://canpath.ca/). Melanocyte genotype data, RNA-seq expression data, and all meQTL association results have been deposited in the database of Genotypes and Phenotypes (dbGaP) under accession phs001500.v1.p1. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.