European Journal of Human Genetics. 2011 Aug 17;20(1):97–101. doi: 10.1038/ejhg.2011.156

A systematic eQTL study of cis–trans epistasis in 210 HapMap individuals

Jessica Becker 1,2,4, Jens R Wendland 3,4, Britta Haenisch 1,2, Markus M Nöthen 1,2, Johannes Schumacher 2,*
PMCID: PMC3234520  PMID: 21847142

Abstract

We aimed to identify transcripts whose expression is regulated by a SNP–SNP interaction. Out of 47 294 expression phenotypes we used 3107 transcripts that survived an extensive quality control and 86 613 linkage disequilibrium-pruned SNP markers that have been genotyped in 210 individuals. For each transcript we defined cis-SNPs, tested them for epistasis with all trans-SNPs, and corrected all observed cis–trans-regulated expression effects for multiple testing. We determined that the expression of about 15% of all included transcripts is regulated by a significant two-locus interaction, which is more than expected (P=2.86 × 10^−144). Our findings suggest further that cis-markers with so-called ‘marginal effects’ are more likely to be involved in two-locus gene regulation than expected (P=8.27 × 10^−05), although the majority of interacting cis-markers showed no one-locus regulation. Furthermore, we found evidence that gene-mediated trans-effects are not a major source of epistasis, as no enrichment of genes has been found in close vicinity of trans-SNPs. In addition, our data support the notion that neither chromosomal regions nor cellular processes are enriched in epistatic interactions. Finally, some of the cis–trans-regulated genes have been found in genome-wide association studies, which might be interesting for follow-up studies of the corresponding disorders. In summary, our results provide novel insights into the complex genome-transcriptome regulation.

Keywords: eQTLs, epistasis, interaction, cis-regulation, trans-regulation

Introduction

Mapping studies of gene expression phenotypes have successfully led to the identification of regulatory variants and networks across the genome.1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 In these expression quantitative trait locus (eQTL) analyses, genes have been identified whose expression is regulated by SNP markers that are either in close proximity to the gene locus (cis-acting SNPs) or at greater distances from it (trans-acting SNPs).12 Whereas cis-regulation is influenced by factors such as 5′ promoter or 3′ transcript variants, the mechanisms involved in trans-regulation include gene-mediated effects (eg, transcription factors) and steric interactions such as ‘chromosome cross-talk’.13, 14, 15, 16 However, at many gene loci it must be assumed that both cis- and trans-effects are involved simultaneously in the regulation of expression. Furthermore, it is possible that expression at certain gene loci is regulated by a more complex process that involves epistasis (eg, cis–trans interaction). Unfortunately, these regulatory effects are not detected in one-locus eQTL studies, in which genetic variants are examined in isolation. There are two main reasons why two-locus or interaction eQTL mappings have not been applied to existing data. First, potential two-locus effects are difficult to identify and interpret, as substantial correction for multiple testing is required if interactions are analyzed in a genome-wide fashion. In a genome-wide 100K SNP set, for example, the P-value of an observed interaction would have to be in the range of P=5 × 10^−12 per transcript before being considered significant. Second, systematic two-locus eQTL mappings require substantial computational resources, although this limitation has recently been overcome by the introduction of novel biostatistical methods.17, 18, 19

In the present study we tried to circumvent some of the limitations associated with interaction scans and performed a systematic two-locus eQTL study for epistasis. Out of three possible two-locus interaction models (ie, cis–cis, cis–trans, trans–trans), we restricted our analysis to cis–trans epistasis. We used the expression data of 3107 high-quality transcripts and 86 613 linkage disequilibrium (LD)-pruned SNP markers obtained from 210 HapMap founders. For each transcript, we tested whether expression levels showed statistical epistasis between a locus-specific cis-SNP and an interacting trans-SNP located elsewhere in the genome. Although other interaction effects may be involved in gene regulation, cis–trans interaction effects were investigated as these may be easier to interpret. For example, it is difficult to control for intermarker LD in cis–cis interaction studies or for multiple testing in trans–trans interaction studies. A further aim of the study was to characterize the identified cis–trans interaction effects, for example, to determine whether SNP markers involved in epistatic gene regulation also represent significant one-locus eQTLs.

Materials and methods

Expression data and study sample

For our genome-transcriptome eQTL analysis we used the expression phenotypes that have been generated by The Wellcome Trust Sanger Institute Cambridge (GENEVAR, ftp://ftp.sanger.ac.uk/pub/genevar/) from human lymphoblastoid cell lines (LCLs) of all 210 founders in the four International HapMap II populations (http://snp.cshl.org/).8, 9 The sample includes 60 Caucasian individuals (CEU, of northern and western European ancestry), 90 Asian individuals (45 Han Chinese, CHB; and 45 Japanese, JPT), as well as 60 African individuals (YRI, from Nigeria). Although this strategy cannot detect interaction effects on gene regulation that are restricted to one particular population, use of the combined sample provides improved statistical power for the detection of epistasis and has been successfully used in previous one-locus eQTL studies.8, 9 In this sample, we used only expression phenotypes for transcripts that passed a detailed and extensive quality control. Of the 47 294 transcripts analyzed using Illumina's human whole genome expression (WG-6 version 1) array (Illumina Inc., San Diego, CA, USA), only those probes that showed an Illumina detection score of >0.99 in each of the four hybridization experiments conducted across all 210 HapMap individuals were used. These scores were obtained from the Sanger Institute website (‘gene_profile-files’ at ftp://ftp.sanger.ac.uk/pub/genevar/), and this filter reduced the number of transcripts included in the present study to 7978 probes. The respective transcripts could be expected to be robustly expressed in human LCLs. In a subsequent step, probes containing SNPs in their hybridization sequence were excluded using the web-based program ReMOAT (version March 2009, http://www.compbio.group.cam.ac.uk/Resources/Annotation/index.html)20 and the dbSNP 126 database (http://www.ncbi.nlm.nih.gov/projects/SNP/). 
Although there is a current debate in the field as to whether this step is necessary and other studies have included SNP-containing probes, we decided to exclude them as they might distort the measured expression quantity. However, the removal of probes with known coding SNPs did not substantially reduce the number of included transcripts, leaving 6226 probes. Furthermore, we used ReMOAT to include only probes that are located on autosomes and map over their full length (50 bp) to a contiguous genomic location (ie, no intron-spanning probes). We decided to use exon-specific probes only in order to avoid inaccurate expression signals, which could be caused by insufficient hybridization to different isoforms of the gene (eg, due to exon-skipping or -incorporation). This step reduced the number of included probes to 5237. Next, the uniqueness of genomic hits for each probe was determined using nuID (https://prod.bioinformatics.northwestern.edu/nuID/), which represents a probe identifier for microarray experiments. This reduced the number of included probes further to 4418, all showing a nuID uniqueness score of 100. Only these probes could be specifically mapped to a single Entrez GeneID. Entrez Gene is a repository from the National Center for Biotechnology Information (NCBI) for gene-specific information. In the final steps, we filtered for probes whose corresponding transcripts were annotated as ‘reviewed’ or ‘validated’ RefSeq NM_ transcripts (N=3124). The RefSeq database provides a collection of annotated sequences including transcripts. When multiple probes hybridized to the same RefSeq NM_ transcript, only one randomly selected probe was included in the analyses. In the final filtering step, the UCSC Browser version HG18 (http://genome.ucsc.edu/cgi-bin/hgGateway) was used to identify probes with defined transcription start and end sites. Exact matches were found for a total of 3107 transcripts, and these were included in the two-locus eQTL analysis. 
The expression data for each of these 3107 probes were subjected to inverse quantile normalization according to the procedure described by Veyrieras et al10 and the normalized data were saved as PLINK21 alternate phenotype files. PLINK is the program that was used for the interaction analysis (see below).
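The normalization step can be illustrated with a rank-based inverse normal transform. This is a minimal sketch, not the authors' pipeline: the function name, the (rank − 0.5)/n offset and the average-rank handling of ties are assumptions of this example, and the exact procedure of Veyrieras et al may differ in these details.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Map expression values to standard-normal quantiles by rank.

    Tied values receive the average of their ranks. The (rank - 0.5) / n
    offset is an assumption of this sketch, not a detail from the paper.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Hypothetical expression values for one probe in five individuals.
expression = [7.2, 9.8, 8.1, 8.1, 12.5]
transformed = inverse_normal_transform(expression)
```

The transform keeps the ordering of individuals but replaces the raw intensities with normal quantiles, which limits the influence of outlying expression values on the subsequent regression-based tests.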

Genotyping data

SNP genotypes of each of the 210 founder individuals were obtained from HapMap release 23 using PLINK.21 A total of 3.95 million SNPs were available for each individual after exclusion of SNPs with Mendel errors. The Mendel check was performed in the 30 CEU and 30 YRI trios analyzed in the HapMap Project. Next, we selected only SNPs that were located on autosomes, showed no deviation from Hardy–Weinberg equilibrium (HWE; P>0.05), had allele frequencies between 0.2 and 0.8, and met a per-SNP genotyping missingness cutoff of 0.02. Although this filtering procedure was done in each of the four populations separately, the LD-pruning step was restricted to the YRI sample, acknowledging that LD is lowest in this population. Here, a pairwise SNP–SNP r^2 of 0.8 was used as the pruning criterion. The filtering process resulted in N=86 613 SNPs, which were saved as a PLINK binary file for inclusion in the analyses.
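The per-SNP filters described above can be sketched as a simple predicate. The record layout, the function names and the example rsIDs are hypothetical; the inclusive handling of the cutoffs and the requirement that a SNP pass in every population are assumptions of this sketch, since the text states only that filtering was done in each population separately. LD pruning is not shown.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics within one population (hypothetical layout)."""
    rsid: str
    chromosome: str      # "1".."22", "X", "Y" or "MT"
    allele_freq: float   # frequency of one designated allele
    hwe_p: float         # P-value of the Hardy-Weinberg equilibrium test
    missing_rate: float  # fraction of individuals without a genotype call


AUTOSOMES = {str(c) for c in range(1, 23)}


def passes_qc(snp, min_freq=0.2, max_freq=0.8, hwe_cutoff=0.05, max_missing=0.02):
    """Apply the autosome, HWE, allele-frequency and missingness filters."""
    return (
        snp.chromosome in AUTOSOMES
        and snp.hwe_p > hwe_cutoff
        and min_freq <= snp.allele_freq <= max_freq
        and snp.missing_rate <= max_missing
    )


def snps_passing_all_populations(stats_by_population):
    """Return rsIDs that pass QC in every population (assumed combination rule)."""
    passing_sets = [
        {snp.rsid for snp in snps if passes_qc(snp)}
        for snps in stats_by_population.values()
    ]
    return set.intersection(*passing_sets) if passing_sets else set()


# Made-up records for two populations; the rsIDs are placeholders.
example = {
    "CEU": [
        SnpStats("rs_demo_1", "7", 0.35, 0.40, 0.00),
        SnpStats("rs_demo_2", "X", 0.50, 0.90, 0.00),
        SnpStats("rs_demo_3", "2", 0.45, 0.60, 0.01),
    ],
    "YRI": [
        SnpStats("rs_demo_1", "7", 0.55, 0.70, 0.01),
        SnpStats("rs_demo_2", "X", 0.50, 0.90, 0.00),
        SnpStats("rs_demo_3", "2", 0.10, 0.60, 0.01),
    ],
}
kept = snps_passing_all_populations(example)
```

In this toy input only the first SNP survives: the second lies on a sex chromosome and the third falls below the allele-frequency range in one population.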

Interaction analysis

The two-locus interaction eQTL analysis was performed using the PLINK --epistasis command. For every transcript that corresponded to an included probe, cis-SNPs were defined as variants located within the transcript or <1 Mb from the transcription start or end site. Each cis-SNP of a transcript was then tested for epistasis with all remaining SNPs, which were defined as trans-SNPs (ie, 86 613 SNPs minus the number of cis-SNPs per transcript). For the interaction eQTL mapping, the four different HapMap populations were used as categorical co-variates. To determine the significance of our findings, we corrected all cis–trans interaction results of each transcript for the number of tests performed, that is, the number of analyzed cis-variants multiplied by the number of included trans-SNPs. This resulted in transcript-wise Bonferroni-adjusted significance thresholds between 5.77 × 10^−07 (1 cis-SNP and 86 612 trans-SNPs for DNAJA2, NETO2 and ORC6L) and 2.84 × 10^−09 (204 cis-SNPs and 86 409 trans-SNPs for CHD8 and SUPT16H). Under the null hypothesis of no enrichment for transcripts showing cis–trans interactions, 0.05 × 3107 ≈ 155 transcripts would be expected to have at least one significant cis–trans interaction following a transcript-wise Bonferroni correction. The applied correction procedure is also given in detail in Supplementary Table 1.
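The cis definition and the transcript-wise correction can be written out as a short sketch. The function names and the coordinate convention (start ≤ end on the same chromosome) are assumptions of this example; only the 1 Mb window, the total of 86 613 SNPs and the 0.05 significance level come from the text.

```python
TOTAL_SNPS = 86_613        # LD-pruned SNPs used in the study
CIS_WINDOW_BP = 1_000_000  # cis window on either side of the transcript
ALPHA = 0.05


def is_cis(snp_chrom, snp_pos, tx_chrom, tx_start, tx_end, window=CIS_WINDOW_BP):
    """True if the SNP lies within the transcript or <1 Mb from its start or end."""
    if snp_chrom != tx_chrom:
        return False
    return tx_start - window < snp_pos < tx_end + window


def transcript_threshold(n_cis, total_snps=TOTAL_SNPS, alpha=ALPHA):
    """Number of cis-trans tests and the Bonferroni threshold for one transcript."""
    n_trans = total_snps - n_cis  # every non-cis SNP counts as a trans-SNP
    n_tests = n_cis * n_trans
    return n_tests, alpha / n_tests


for n_cis in (1, 9, 204):
    n_tests, threshold = transcript_threshold(n_cis)
    print(f"{n_cis:>3} cis-SNPs: {n_tests:>10,} tests, threshold {threshold:.2e}")
```

With 1, 9 and 204 cis-SNPs this yields thresholds of about 5.77 × 10^−07, 6.41 × 10^−08 and 2.84 × 10^−09, matching the two extremes quoted above and the TRIM4 row of Table 1. Because the threshold depends only on the number of cis-SNPs, transcripts in SNP-dense regions face a stricter cutoff.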

Results

Of all 3107 included probes, we identified 440 transcripts whose expression was regulated by a cis–trans interaction after transcript-wise Bonferroni adjustment (Supplementary Table 2). The significant two-locus eQTL P-values ranged between 4.69 × 10^−08 and 2.82 × 10^−12. The observed interactions showed a significant (P=2.86 × 10^−144) and almost threefold enrichment compared with the number of transcripts expected under the null hypothesis, ie, 5% of all probes (N=155) would show a significant interaction by chance. Table 1 lists the top-16 interaction findings, which were all associated with P-values of <10^−10. Importantly, as an LD-pruning step was applied, all of the 440 cis–trans SNP combinations were independent and not the result of LD between cis- or trans-markers.

Table 1. Column 1 lists the top-16 cis–trans interacting transcripts; column 2 shows the number of tested cis-SNPs for each transcript; column 3 shows the number of cis–trans tests; column 4 shows the Bonferroni-adjusted P-value necessary for a ‘significant’ finding; column 5 shows the uncorrected P-value per transcript obtained in the two-locus interaction analysis; the remaining columns provide information about the cis- and trans-SNPs including their eQTL effects under a one-locus model.

Transcript(a) | No. of tested cis-SNPs | No. of epistasis tests | Bonferroni P-value | Top two-locus P-value | Top cis-acting SNP: rs | Chr | Position | One-locus P-value | Top trans-acting SNP: rs | Chr | Position | One-locus P-value | RefSeq genes
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
TRIM4 | 9 | 779 436 | 6.41E-08 | 2.82E-12 | rs1121592 | 7 | 99 361 567 | 2.40E-06 | rs457414 | 3 | 10 177 884 | 1.47E-01 | VHL, IRAK2
PNPLA6 | 77 | 6 663 272 | 7.50E-09 | 5.99E-12 | rs608773 | 19 | 7 743 306 | 8.73E-01 | rs1794066 | 2 | 113 602 821 | 7.19E-01 | IL1RN
ARNT | 27 | 2 337 822 | 2.14E-08 | 8.26E-12 | rs7532008 | 1 | 149 226 974 | 8.68E-01 | rs2937504 | 5 | 11 015 227 | 5.14E-01 | CTNND2, DAP
MANBA | 51 | 4 414 662 | 1.13E-08 | 1.70E-11 | rs4698863 | 4 | 103 764 896 | 1.81E-04 | rs13171027 | 5 | 4 031 902 | 5.97E-01 | IRX1
PHF11 | 46 | 3 982 082 | 1.26E-08 | 2.08E-11 | rs2181539 | 13 | 48 569 216 | 6.52E-01 | rs7571794 | 2 | 67 969 620 | 7.46E-01 | ETAA1
C17orf70 | 47 | 4 068 602 | 1.23E-08 | 5.10E-11 | rs7207933 | 17 | 77 131 682 | 3.75E-07 | rs35060330 | 5 | 150 818 278 | 8.38E-01 | SLC36A1
UEVLD | 58 | 5 020 190 | 9.96E-09 | 5.66E-11 | rs6483561 | 11 | 18 966 071 | 7.05E-01 | rs5743404 | 8 | 6 724 531 | 4.68E-01 | DEFB1
GMDS | 138 | 11 933 550 | 4.19E-09 | 6.06E-11 | rs932409 | 6 | 1 396 521 | 6.63E-01 | rs2143980 | 14 | 32 277 657 | 3.21E-01 | AKAP6
CCDC28A | 65 | 5 625 620 | 8.89E-09 | 6.74E-11 | rs12190319 | 6 | 138 316 778 | 2.29E-01 | rs1391285 | 1 | 215 628 091 | 4.90E-01 | ESRRG, GPATCH2
UBTD2 | 92 | 7 959 932 | 6.28E-09 | 6.76E-11 | rs17074786 | 5 | 171 791 185 | 3.32E-01 | rs4776794 | 15 | 64 659 320 | 8.57E-02 | LCTL, SMAD6
RNF40 | 15 | 1 298 970 | 3.85E-08 | 6.77E-11 | rs4788213 | 16 | 29 942 025 | 1.23E-01 | rs638286 | 19 | 55 397 668 | 2.20E-01 | MYH14
CCDC88C | 83 | 7 181 990 | 6.96E-09 | 7.26E-11 | rs2430363 | 14 | 91 434 804 | 2.78E-01 | rs2748992 | 6 | 52 704 534 | 3.12E-01 |
GEMIN5 | 98 | 8 478 470 | 5.90E-09 | 7.35E-11 | rs7732085 | 5 | 153 693 955 | 2.72E-01 | rs1562797 | 16 | 52 900 570 | 4.55E-01 | IRX3
EZH2 | 85 | 7 354 880 | 6.80E-09 | 7.82E-11 | rs851704 | 7 | 147 169 364 | 3.97E-01 | rs1957190 | 14 | 45 567 885 | 5.58E-01 | RPL10L
TGDS | 70 | 6 058 010 | 8.25E-09 | 9.00E-11 | rs7993213 | 13 | 94 886 853 | 6.40E-01 | rs13392004 | 2 | 48 495 333 | 2.92E-01 | FOXN2, CCDC128
CEBPZ | 87 | 7 527 762 | 6.64E-09 | 9.16E-11 | rs12052952 | 2 | 36 842 683 | 4.59E-02 | rs807018 | 10 | 102 763 001 | 5.41E-01 | PDZD7

(a) Illumina probe IDs are available upon request.

To elucidate the nature of the epistasis, an analysis was performed to determine whether SNPs that are involved in gene regulation via one-locus eQTL effects mainly contributed to the interactions. At present there is no consensus on whether SNPs with so-called ‘marginal effects’ are more likely to be involved in epistasis and should be prioritized for SNP–SNP interaction scans. An analysis was therefore performed to determine whether the 440 cis- and trans-SNPs involved in epistasis also have regulatory effects on gene expression without their interacting markers, that is, in a one-locus fashion. This proved to be true for the cis-markers: a total of 40 of the 440 cis-SNPs (9.09%) also showed regulatory effects in the one-locus analysis at an uncorrected significance level of P≤0.05. This was significantly more than the expected number of SNPs with marginal effects (N=22, P=8.27 × 10^−05) (Supplementary Table 3). However, it is notable that the majority of cis-markers (>90%) were not involved in gene regulation at the one-locus level.

In contrast, only 16 of the 440 two-locus trans-SNPs (3.63%) were involved in gene regulation on the one-locus level. This was not significant compared with the number of expected markers (N=22, P=0.187, Supplementary Table 3) and points to more independent mechanisms involved in the one- and two-locus regulation.
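The comparison of observed and expected marginal-effect counts can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. This section does not name the test the authors used, so the choice of test and the function name are assumptions of this sketch; the counts (40 and 16 of 440) and the 5% null rate are taken from the text.

```python
import math


def chi_square_gof_1df(observed, total, null_rate=0.05):
    """Chi-square goodness-of-fit test for a count against a fixed null rate."""
    expected_hit = total * null_rate
    expected_miss = total - expected_hit
    observed_miss = total - observed
    chi2 = (
        (observed - expected_hit) ** 2 / expected_hit
        + (observed_miss - expected_miss) ** 2 / expected_miss
    )
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


cis_chi2, cis_p = chi_square_gof_1df(observed=40, total=440)
trans_chi2, trans_p = chi_square_gof_1df(observed=16, total=440)
print(f"cis:   chi2 = {cis_chi2:.2f}, P = {cis_p:.2e}")
print(f"trans: chi2 = {trans_chi2:.2f}, P = {trans_p:.3f}")
```

Under this assumption the cis-SNP comparison gives a P-value on the order of 8 × 10^−05 and the trans-SNP comparison a P-value of roughly 0.19, close to the values reported above; small differences would be expected if the authors used a different test or a continuity correction.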

As the mechanisms involved in trans-regulation and -epistasis are complex and not well understood, we tried to characterize them in more detail. We analyzed whether the trans-epistasis is gene or pathway mediated rather than the result of other regulatory mechanisms and tested at each trans-locus if there are more genes in close vicinity to the marker than expected. Of all 440 trans-markers, 198 SNPs (45.10%) were closely located to at least one gene according to the program SNPper (http://snpper.chip.org/bio/snpper-enter), that is, the SNP is located within a distance of ≤10 kb to a corresponding gene (Supplementary Table 2). However, the number of observed genes involved in trans-epistasis was not significantly increased compared with the number of all potentially involved genes tagged by all included trans-SNPs using SNPper (N=35 731, 41.35%, P=0.112).

Previous one-locus eQTL studies have reported an enrichment of certain chromosomal regions involved in the regulation of gene expression. We adapted the approach of Morley et al6 and analyzed our data for evidence of so-called ‘master regulator’ SNP regions on a two-locus interaction level. Master regulator regions are chromosomal regions that contain more SNPs involved in epistasis than expected by chance. All 86 613 SNPs were used, and the entire autosomal genome was divided into 444 non-overlapping bins, each containing 200 neighboring SNPs. We estimated that a bin that comprises more than 4 of the 440 trans-SNPs would be a master regulator region. However, after correcting by a factor of 444, which corresponds to the number of analyzed bins, more than six trans-SNPs per bin are necessary to define a significant master regulator region. Only for bins at the end of chromosomes did we adapt our approach to account for the number of SNPs within these regions. For example, if 100 neighboring SNPs were located within the last bin of a chromosome, more than three trans-SNPs were necessary to fulfill the criterion of a significant master regulator region. Although we found 8 out of the 444 bins harboring four trans-SNPs, which is nominally significant (P=0.019), no bin fulfilled the criterion of a significant master regulator region after the correction procedure. In addition, we analyzed whether certain chromosomal ‘hotspot’ regions harbor more regulated transcripts than expected, and our data provide no evidence for superordinate mechanisms involved in epistasis. We used all 3107 transcripts, divided the autosomal genome into 321 bins, each containing 10 neighboring transcripts, and estimated that a bin with more than 6 of the 440 identified transcripts would be a significant hit. 
After a correction for the number of analyzed bins (factor 321) no hotspot could be identified, although one bin harbored six transcripts and 12 further bins harbored four transcripts (uncorrected P=0.001 and P=0.041, respectively).
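The bin thresholds can be approximated with a binomial upper-tail calculation. The authors do not state which distribution they used, so the binomial model, the per-item hit probabilities (440/86 613 for SNPs, 440/3107 for transcripts) and the function names are assumptions of this sketch; the bin sizes and bin counts come from the text.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_count_for_significance(n, p, alpha):
    """Smallest count k in a bin of n items with P(X >= k) < alpha."""
    for k in range(n + 1):
        if binomial_upper_tail(k, n, p) < alpha:
            return k
    return None


# Master regulator bins: 200 SNPs per bin, 440 trans-SNPs among 86 613 SNPs.
p_snp = 440 / 86_613
master_p4 = binomial_upper_tail(4, 200, p_snp)
master_k = min_count_for_significance(200, p_snp, 0.05 / 444)

# Hotspot bins: 10 transcripts per bin, 440 regulated among 3107 transcripts.
p_tx = 440 / 3107
hotspot_p4 = binomial_upper_tail(4, 10, p_tx)
hotspot_p6 = binomial_upper_tail(6, 10, p_tx)
hotspot_k = min_count_for_significance(10, p_tx, 0.05 / 321)
```

Under these assumptions both corrected thresholds come out at seven hits per bin, consistent with the ‘more than six’ criterion stated above, and the uncorrected tail probabilities for four trans-SNPs, four transcripts and six transcripts per bin are close to the reported values of 0.019, 0.041 and 0.001.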

On the functional level, we tested whether certain cellular processes are particularly regulated by epistatic effects. We used all 440 genes that were identified as being cis–trans regulated and performed an analysis for enriched cellular functions using Ingenuity Pathways Analysis (IPA, version 8.6, http://www.ingenuity.com). IPA is a web-based interface that provides computational algorithms to identify biological processes and networks on the basis of functional annotation and molecular interactions. The top biological category was ‘gene expression’, including 69 transcripts. However, the most enriched subcategory ‘transcription of chromosome components’ (P=0.046 after Benjamini–Hochberg correction) was defined by only 4 of all 440 included transcripts (CREBBP, EP300, SRC and TBP). Finally, an analysis was performed to determine whether any of the two-locus regulated genes are implicated in complex disorders. Complex disorders were considered, as genome-wide association studies (GWAS) of a number of diseases have failed to identify any one-locus variants that are associated with a strong genetic effect size. Two-locus regulation may therefore have an impact on the respective phenotypes. Furthermore, the functional consequence of many top GWAS-SNPs is unknown, which suggests that expression differences may be disease-relevant mechanisms. In total, we identified 25 cis–trans regulated genes that have been implicated in complex disorders using the web tool GWAS Catalog (http://genome.gov/26525384). For example (Table 2), we identified a two-locus interaction between a trans-SNP 5.9 kb upstream of CCL4 (MIM 182284) and a cis-SNP of BLK (MIM 191305) influencing its expression. 
BLK is one of the strongest risk genes for rheumatoid arthritis and systemic lupus erythematosus and CCL4 encodes a chemokine ligand involved in immune activation.22, 23, 24, 25, 26 However, the connection between BLK and CCL4 remains speculative, as it is unclear whether the close proximity of the trans-SNP to CCL4 reflects a gene- or pathway-mediated mechanism, or whether other interaction mechanisms that do not involve CCL4 exist. Unfortunately, we could not test the effect of the trans-SNP on the expression of CCL4 because no probe for CCL4 has been included in our analysis. Another interesting finding concerns STAT2 (MIM 600556). Its expression was found to be cis–trans regulated, and the corresponding trans-SNP is located 31.1 kb upstream of IL23R (MIM 607562) (Table 2). Again, we could not test whether this SNP is involved in the expression of IL23R due to a missing probe, but it is noteworthy that both genes have an important role in the innate immune system and have been implicated in the development of psoriasis in a recently published GWAS.27, 28, 29

Table 2. Column 1 lists the 25 cis–trans interacting transcripts listed in the GWAS catalog; column 2 lists the associated disease or trait; column 3 lists the observed two-locus P-values; the remaining columns provide information concerning the cis- and trans-SNPs.

Transcript(a) | Disease (GWAS catalog) | Two-locus P-value | Top cis-acting SNP: rs | Chr | Position | Top trans-acting SNP: rs | Chr | Position | RefSeq genes
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
ELMO1 | QT interval26 | 1.09E-10 | rs10259008 | 7 | 36 799 785 | rs776692 | 15 | 40 134 818 | PLA2G4D, PLA2G4E
NIN | Cognitive performance30 | 1.13E-10 | rs11850904 | 14 | 51 130 033 | rs6836445 | 4 | 29 296 175 |
ZFP64 | Amyotrophic lateral sclerosis31 | 1.90E-09 | rs4811201 | 20 | 49 629 361 | rs6561342 | 13 | 46 458 623 | HTR2A, GNG5P5
GORASP2 | Cognitive performance32 | 2.15E-09 | rs10930438 | 2 | 171 315 117 | rs17081840 | 4 | 55 718 936 | KDR
VRK2 | Schizophrenia33 | 2.16E-09 | rs10178765 | 2 | 58 363 538 | rs4950076 | 1 | 95 349 885 | ALG14, TMEM56
SYNE1 | Blood pressure34 | 2.23E-09 | rs1856057 | 6 | 152 109 562 | rs6445296 | 3 | 62 678 612 | CADPS
C6orf106 | Height35, 36 | 2.52E-09 | rs3800341 | 6 | 33 972 976 | rs17105347 | 14 | 36 335 202 | SLC25A21
JAK2 | Inflammatory bowel disease37, 38 | 3.40E-09 | rs10974793 | 9 | 4 793 651 | rs12475354 | 2 | 77 441 949 | LRRTM4
WDR1 | Serum urate (cardiovascular disease)39, 40 | 3.53E-09 | rs7660895 | 4 | 9 594 543 | rs10085762 | 7 | 135 220 728 |
CXXC1 | Chronic lymphocytic leukemia41 | 3.63E-09 | rs1705521 | 18 | 45 955 763 | rs11836262 | 12 | 8 772 935 | FAM80B
AP1B1 | Carotid atherosclerosis42 | 3.83E-09 | rs4822998 | 22 | 27 690 297 | rs2753596 | 14 | 38 712 591 | TRAPPC6B
ST6GAL1 | Drug-induced liver injury43 | 4.11E-09 | rs3872724 | 3 | 188 223 915 | rs1959205 | 14 | 43 877 663 | YWHAZP1
PEX1 | Height44 | 4.66E-09 | rs2285504 | 7 | 92 825 257 | rs7034789 | 9 | 6 935 423 | JMJD2C
EXT1 | Height45 | 4.95E-09 | rs7006088 | 8 | 119 720 982 | rs6696976 | 1 | 97 701 564 | DPYD
BLK | Systemic lupus erythematosus22, 24, 25, rheumatoid arthritis23 | 5.30E-09 | rs1293320 | 8 | 11 729 348 | rs1634506 | 17 | 31 449 476 | CCL3, CCL4
WDR36 | Plasma eosinophil count (asthma)46 | 5.47E-09 | rs27409 | 5 | 111 459 912 | rs9504183 | 6 | 4 605 997 |
FNTB | Mean corpuscular volume47 | 5.96E-09 | rs1679880 | 14 | 64 723 379 | rs7165654 | 15 | 56 627 331 | LIPC
TSR1 | Aortic root size48 | 6.70E-09 | rs1109303 | 17 | 1 350 227 | rs1334751 | 10 | 29 057 579 | BAMBI
PRDM1 | Systemic lupus erythematosus24 | 7.23E-09 | rs1891720 | 6 | 107 259 564 | rs2993312 | 13 | 112 731 466 | MCF2L
MBD1 | Chronic lymphocytic leukemia41 | 8.44E-09 | rs1705521 | 18 | 45 955 763 | rs11836262 | 12 | 8 772 935 | FAM80B
METTL1 | Multiple sclerosis49 | 8.68E-09 | rs1908536 | 12 | 57 124 955 | rs4833611 | 4 | 120 366 908 | USP53
LSP1 | Breast cancer50 | 8.91E-09 | rs2301160 | 11 | 1 053 767 | rs10930873 | 2 | 152 549 752 | CACNB4
LDLR | Myocardial infarction51, LDL cholesterol52, 53, 54 | 9.08E-09 | rs11085720 | 19 | 10 178 763 | rs6445704 | 3 | 54 614 308 | CACNA2D3
STAT2 | Psoriasis27 | 1.20E-08 | rs4495925 | 12 | 55 554 383 | rs10489631 | 1 | 67 373 703 | IL23R
UBE2L3 | Systemic lupus erythematosus24 | 4.29E-08 | rs165846 | 22 | 19 254 028 | rs5751963 | 22 | 23 462 498 | PIWIL3

(a) Illumina probe IDs are available upon request.

Discussion

Genes function through a complex mechanism that involves multiple genetic factors. These effects are missed if genetic factors are examined in isolation without taking potential interactions with other genetic factors into account. The aim of the present study was to elucidate the genetic architecture of gene expression through the performance of a systematic cis–trans interaction analysis. Out of 47 294 expression phenotypes, we used 3107 transcripts that survived a stringent quality control procedure and 86 613 LD-pruned SNP markers, which were in linkage equilibrium and have been genotyped in the 210 HapMap founder individuals. Using a conservative correction procedure, we found that the expression of about 15% of all included transcripts (N=440) is regulated by a two-locus interaction, which is far more than expected by chance (P=2.86 × 10^−144). The results of the present study confirm that epistasis has an important role in the genetic architecture of complex phenotypes and imply that this approach may be of relevance to other eQTL and GWAS data sets. Such studies could also benefit from samples that are ethnically more homogeneous. Although we have used four different populations as categorical co-variates, we cannot completely rule out that our results are to a certain degree inflated by the heterogeneity of the present sample.

The present findings also indicate that regulatory one-locus cis-markers are more likely to be involved in two-locus gene regulation than would be expected by chance alone (P=8.27 × 10^−05). This suggests that there is a correlation between the mechanisms that underlie one- and two-locus gene regulation. However, as the majority of cis-markers involved in epistasis showed no ‘marginal effects’, our findings imply that most epistasis effects would be missed if interaction studies were focused on cis-markers with marginal effects only.

Furthermore, the present results indicate that gene- or pathway-mediated trans-effects were not the major source of epistasis, as trans-SNPs were not more likely to be located in or in close proximity to an annotated gene or transcript (P=0.112). Therefore, other regulatory mechanisms, such as non-coding sequence-mediated effects (eg, RNA) and intra- or interchromosomal cross-talk, seem to be of equal importance in trans-epistatic regulation.

Our analyses as to whether particular chromosomal regions are involved in epistasis produced negative results (P>0.05 for master regulators and hotspots). This implies that cis–trans epistasis is not ‘topographically’ organized throughout the genome. In addition, the IPA analysis revealed that only one functional category (involving only four transcripts) was enriched for epistatic effects (P=0.046 for the subcategory ‘transcription of chromosome components’ within the high-level category ‘gene expression’). This suggests that multiple cellular processes are regulated by two-locus interactions rather than specific ones. Furthermore, 25 of all cis–trans-regulated genes have been found to be associated with complex diseases through GWAS. The trans-markers and -genes identified in the present study may therefore represent interesting candidates for epistatic tests in the respective GWAS data.

In conclusion, the present cis–trans interaction approach identified transcripts that are potentially influenced by two-locus epistasis, and revealed certain characteristics of the complex process of genome-transcriptome regulation. Furthermore, the approach may represent a solution for overcoming the problem of multiple testing in interaction scans, and it may thus be worthwhile to apply this approach to other eQTL data. A limitation of this approach, however, is that it is only able to detect cis–trans epistasis and cannot be used to detect other regulation mechanisms such as cis–cis, trans–trans or higher-order interactions.

Acknowledgments

JS was supported by an NIH/DFG Research Career Transition Award, and MMN was supported by the Alfried Krupp von Bohlen und Halbach-Stiftung. We are grateful to all of the scientists at The Wellcome Trust Sanger Institute in Cambridge who were involved in generating the expression data, and to all of the scientists from the HapMap Consortium who were involved in generating the genotypic data used in the present study.

The authors declare no conflict of interest.

Footnotes

Supplementary Information accompanies the paper on European Journal of Human Genetics website (http://www.nature.com/ejhg)

Supplementary Material

Supplementary Tables 1–3

References

  1. Dimas AS, Deutsch S, Stranger BE, et al. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science. 2009;325:1246–1250. doi: 10.1126/science.1174148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Dixon AL, Liang L, Moffatt MF, et al. A genome-wide association study of global gene expression. Nat Genet. 2007;39:1202–1207. doi: 10.1038/ng2109. [DOI] [PubMed] [Google Scholar]
  3. Dombroski BA, Nayak RR, Ewens KG, Ankener W, Cheung VG, Spielman RS. Gene expression and genetic variation in response to endoplasmic reticulum stress in human cells. Am J Hum Genet. 2010;86:719–729. doi: 10.1016/j.ajhg.2010.03.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Goring HH, Curran JE, Johnson MP, et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat Genet. 2007;39:1208–1216. doi: 10.1038/ng2119. [DOI] [PubMed] [Google Scholar]
  5. Idaghdour Y, Czika W, Shianna KV, et al. Geographical genomics of human leukocyte gene expression variation in southern Morocco. Nat Genet. 2010;42:62–67. doi: 10.1038/ng.495. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Morley M, Molony CM, Weber TM, et al. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004;430:743–747. doi: 10.1038/nature02797. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Myers AJ, Gibbs JR, Webster JA, et al. A survey of genetic human cortical gene expression. Nat Genet. 2007;39:1494–1499. doi: 10.1038/ng.2007.16. [DOI] [PubMed] [Google Scholar]
  8. Stranger BE, Forrest MS, Dunning M, et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007;315:848–853. doi: 10.1126/science.1136678. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Stranger BE, Nica AC, Forrest MS, et al. Population genomics of human gene expression. Nat Genet. 2007;39:1217–1224. doi: 10.1038/ng2142. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Veyrieras JB, Kudaravalli S, Kim SY, et al. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 2008;4:e1000214. doi: 10.1371/journal.pgen.1000214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Cheung VG, Nayak RR, Wang IX, et al. Polymorphic cis- and trans-regulation of human gene expression. PLoS Bio. 2010;8 doi: 10.1371/journal.pbio.1000480. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Cheung VG, Spielman RS. Genetics of human gene expression: mapping DNA variants that influence gene expression. Nat rev. 2009;10:595–604. doi: 10.1038/nrg2630. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Fraser P, Bickmore W. Nuclear organization of the genome and the potential for gene regulation. Nature. 2007;447:413–417. doi: 10.1038/nature05916.
  14. Gondor A, Ohlsson R. Chromosome crosstalk in three dimensions. Nature. 2009;461:212–217. doi: 10.1038/nature08453.
  15. Spilianakis CG, Lalioti MD, Town T, Lee GR, Flavell RA. Interchromosomal associations between alternatively expressed loci. Nature. 2005;435:637–645. doi: 10.1038/nature03574.
  16. Williams A, Spilianakis CG, Flavell RA. Interchromosomal association and gene regulation in trans. Trends Genet. 2010;26:188–197. doi: 10.1016/j.tig.2010.01.007.
  17. Herold C, Steffens M, Brockschmidt FF, Baur MP, Becker T. INTERSNP: genome-wide interaction analysis guided by a priori information. Bioinformatics. 2009;25:3275–3281. doi: 10.1093/bioinformatics/btp596.
  18. Steffens M, Becker T, Sander T, et al. Feasible and successful: genome-wide interaction analysis involving all 1.9 × 10(11) pair-wise interaction tests. Hum Hered. 2010;69:268–284. doi: 10.1159/000295896.
  19. Wan X, Yang C, Yang Q, et al. BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies. Am J Hum Genet. 2010;87:325–340. doi: 10.1016/j.ajhg.2010.07.021.
  20. Barbosa-Morais NL, Dunning MJ, Samarajiwa SA, et al. A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data. Nucleic Acids Res. 2010;38:e17. doi: 10.1093/nar/gkp942.
  21. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–575. doi: 10.1086/519795.
  22. Graham RR, Cotsapas C, Davies L, et al. Genetic variants near TNFAIP3 on 6q23 are associated with systemic lupus erythematosus. Nat Genet. 2008;40:1059–1061. doi: 10.1038/ng.200.
  23. Gregersen PK, Amos CI, Lee AT, et al. REL, encoding a member of the NF-kappaB family of transcription factors, is a newly defined risk locus for rheumatoid arthritis. Nat Genet. 2009;41:820–823. doi: 10.1038/ng.395.
  24. Han JW, Zheng HF, Cui Y, et al. Genome-wide association study in a Chinese Han population identifies nine new susceptibility loci for systemic lupus erythematosus. Nat Genet. 2009;41:1234–1237. doi: 10.1038/ng.472.
  25. Hom G, Graham RR, Modrek B, et al. Association of systemic lupus erythematosus with C8orf13-BLK and ITGAM-ITGAX. N Engl J Med. 2008;358:900–909. doi: 10.1056/NEJMoa0707865.
  26. Marroni F, Pfeufer A, Aulchenko YS, et al. A genome-wide association scan of RR and QT interval duration in 3 European genetically isolated populations: the EUROSPAN project. Circ Cardiovasc Genet. 2009;2:322–328. doi: 10.1161/CIRCGENETICS.108.833806.
  27. Nair RP, Duffin KC, Helms C, et al. Genome-wide scan reveals association of psoriasis with IL-23 and NF-kappaB pathways. Nat Genet. 2009;41:199–204. doi: 10.1038/ng.311.
  28. Zhernakova A, van Diemen CC, Wijmenga C. Detecting shared pathogenesis from the shared genetics of immune-related diseases. Nat Rev Genet. 2009;10:43–55. doi: 10.1038/nrg2489.
  29. Cordell HJ. Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet. 2009;10:392–404. doi: 10.1038/nrg2579.
  30. Cirulli ET, Kasperaviciute D, Attix DK, et al. Common genetic variation and performance on standardized cognitive tests. Eur J Hum Genet. 2010;18:815–820. doi: 10.1038/ejhg.2010.2.
  31. Schymick JC, Scholz SW, Fung HC, et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol. 2007;6:322–328. doi: 10.1016/S1474-4422(07)70037-6.
  32. Need AC, Attix DK, McEvoy JM, et al. A genome-wide study of common SNPs and CNVs in cognitive performance in the CANTAB. Hum Mol Genet. 2009;18:4650–4661. doi: 10.1093/hmg/ddp413.
  33. Stefansson H, Ophoff RA, Steinberg S, et al. Common variants conferring risk of schizophrenia. Nature. 2009;460:744–747. doi: 10.1038/nature08186.
  34. Levy D, Larson MG, Benjamin EJ, et al. Framingham Heart Study 100K Project: genome-wide associations for blood pressure and arterial stiffness. BMC Med Genet. 2007;8(Suppl 1):S3. doi: 10.1186/1471-2350-8-S1-S3.
  35. Soranzo N, Rivadeneira F, Chinappen-Horsley U, et al. Meta-analysis of genome-wide scans for human adult stature identifies novel loci and associations with measures of skeletal frame size. PLoS Genet. 2009;5:e1000445. doi: 10.1371/journal.pgen.1000445.
  36. Weedon MN, Lango H, Lindgren CM, et al. Genome-wide association analysis identifies 20 loci that influence adult height. Nat Genet. 2008;40:575–583. doi: 10.1038/ng.121.
  37. Asano K, Matsushita T, Umeno J, et al. A genome-wide association study identifies three new susceptibility loci for ulcerative colitis in the Japanese population. Nat Genet. 2009;41:1325–1329. doi: 10.1038/ng.482.
  38. Barrett JC, Hansoul S, Nicolae DL, et al. Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease. Nat Genet. 2008;40:955–962. doi: 10.1038/ng.175.
  39. McArdle PF, Parsa A, Chang YP, et al. Association of a common nonsynonymous variant in GLUT9 with serum uric acid levels in Old Order Amish. Arthritis Rheum. 2008;58:2874–2881. doi: 10.1002/art.23752.
  40. Wallace C, Newhouse SJ, Braund P, et al. Genome-wide association study identifies genes for biomarkers of cardiovascular disease: serum urate and dyslipidemia. Am J Hum Genet. 2008;82:139–149. doi: 10.1016/j.ajhg.2007.11.001.
  41. Crowther-Swanepoel D, Broderick P, Di Bernardo MC, et al. Common variants at 2q37.3, 8q24.21, 15q21.3 and 16q24.1 influence chronic lymphocytic leukemia risk. Nat Genet. 2010;42:132–136. doi: 10.1038/ng.510.
  42. Shrestha S, Irvin MR, Taylor KD, et al. A genome-wide association study of carotid atherosclerosis in HIV-infected men. AIDS. 2010;24:583–592. doi: 10.1097/QAD.0b013e3283353c9e.
  43. Daly AK, Donaldson PT, Bhatnagar P, et al. HLA-B*5701 genotype is a major determinant of drug-induced liver injury due to flucloxacillin. Nat Genet. 2009;41:816–819. doi: 10.1038/ng.379.
  44. Gudbjartsson DF, Walters GB, Thorleifsson G, et al. Many sequence variants affecting diversity of adult human height. Nat Genet. 2008;40:609–615. doi: 10.1038/ng.122.
  45. Kim JJ, Lee HI, Park T, et al. Identification of 15 loci influencing height in a Korean population. J Hum Genet. 2010;55:27–31. doi: 10.1038/jhg.2009.116.
  46. Gudbjartsson DF, Bjornsdottir US, Halapi E, et al. Sequence variants affecting eosinophil numbers associate with asthma and myocardial infarction. Nat Genet. 2009;41:342–347. doi: 10.1038/ng.323.
  47. Ganesh SK, Zakai NA, van Rooij FJ, et al. Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. Nat Genet. 2009;41:1191–1198. doi: 10.1038/ng.466.
  48. Vasan RS, Glazer NL, Felix JF, et al. Genetic variants associated with cardiac structure and function: a meta-analysis and replication of genome-wide association data. JAMA. 2009;302:168–178. doi: 10.1001/jama.2009.978-a.
  49. MS-consortium. Genome-wide association study identifies new multiple sclerosis susceptibility loci on chromosomes 12 and 20. Nat Genet. 2009;41:824–828. doi: 10.1038/ng.396.
  50. Easton DF, Pooley KA, Dunning AM, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447:1087–1093. doi: 10.1038/nature05887.
  51. Kathiresan S, Voight BF, Purcell S, et al. Genome-wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nat Genet. 2009;41:334–341. doi: 10.1038/ng.327.
  52. Kathiresan S, Melander O, Guiducci C, et al. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat Genet. 2008;40:189–197. doi: 10.1038/ng.75.
  53. Sabatti C, Service SK, Hartikainen AL, et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat Genet. 2009;41:35–46. doi: 10.1038/ng.271.
  54. Willer CJ, Sanna S, Jackson AU, et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat Genet. 2008;40:161–169. doi: 10.1038/ng.76.

Associated Data

Supplementary Materials

Supplementary Tables 1–3

Articles from European Journal of Human Genetics are provided here courtesy of Nature Publishing Group
