Skip to main content
Nature Communications logoLink to Nature Communications
. 2022 Aug 17;13:4840. doi: 10.1038/s41467-022-29931-z

GWAS meta-analysis of intrahepatic cholestasis of pregnancy implicates multiple hepatic genes and regulatory elements

Peter H Dixon 1,#, Adam P Levine 2,3,#, Inês Cebola 4,#, Melanie M Y Chan 2, Aliya S Amin 1, Anshul Aich 2, Monika Mozere 2, Hannah Maude 4, Alice L Mitchell 1, Jun Zhang 2,5; NIHR BioResource6; Genomics England Research Consortium Collaborators7, Jenny Chambers 8,9, Argyro Syngelaki 10, Jennifer Donnelly 11, Sharon Cooley 11, Michael Geary 11, Kypros Nicolaides 10, Malin Thorsell 12, William M Hague 13, Maria Cecilia Estiu 14, Hanns-Ulrich Marschall 15, Daniel P Gale 2,#, Catherine Williamson 1,✉,#
PMCID: PMC9385867  PMID: 35977952

Abstract

Intrahepatic cholestasis of pregnancy (ICP) is a pregnancy-specific liver disorder affecting 0.5–2% of pregnancies. The majority of cases present in the third trimester with pruritus, elevated serum bile acids and abnormal serum liver tests. ICP is associated with an increased risk of adverse outcomes, including spontaneous preterm birth and stillbirth. Whilst rare mutations affecting hepatobiliary transporters contribute to the aetiology of ICP, the role of common genetic variation in ICP has not been systematically characterised to date. Here, we perform genome-wide association studies (GWAS) and meta-analyses for ICP across three studies including 1138 cases and 153,642 controls. Eleven loci achieve genome-wide significance and have been further investigated and fine-mapped using functional genomics approaches. Our results pinpoint common sequence variation in liver-enriched genes and liver-specific cis-regulatory elements as contributing mechanisms to ICP susceptibility.

Subject terms: Genome-wide association studies, Gene expression profiling, Liver diseases


Investigation of variation in three cohorts has identified multiple genetic signals associated with the pregnancy-specific liver disorder, intrahepatic cholestasis of pregnancy, giving insight into the disease which can cause preterm birth and stillbirth.

Introduction

Intrahepatic cholestasis of pregnancy (ICP) is a pregnancy-specific liver disorder affecting ~0.5–2% of pregnant women. The majority of cases present in the third trimester with pruritus. The diagnosis is confirmed by elevated maternal total serum bile acids (TSBA) and abnormal serum liver tests1. Elevation of maternal TSBA ≥40 µmol/L is associated with an increased incidence of adverse pregnancy outcomes2,3, and a large meta-analysis of ICP cases identified 100 µmol/L as the threshold at which the risk of stillbirth increases4.

ICP has a complex etiology, with genetic, endocrine, and environmental contributions. A variable geographical prevalence has also been observed with this disease. This may be explained by genetic variation in women of different ancestry, or could be the result of altered environmental factors, e.g., vitamin D or selenium deficiency5. The cholestatic effect of gestational elevations of estrogen and progesterone metabolites unmasks the disease in genetically susceptible individuals6,7. Rare mutations in the hepatobiliary transporter genes ABCB4 and ABCB11 (encoding the phosphatidyl choline floppase and the bile salt export pump, respectively) have been reported, with the most recent study identifying pathogenic or likely pathogenic heterozygous mutations in either of these two genes in up to 25% of cases of severe, early-onset ICP8. A limited role has been proposed for rare genetic variants in other known cholestatic loci9. In contrast with rare variants, the contribution of common genetic variation to the heritability of ICP has not been systematically addressed thus far. Candidate gene analyses have been reported10 but to date the only genome-wide study of ICP has employed admixture mapping of genetically distinct populations11. Here, we utilized the recent whole-genome sequencing study by the NIHR BioResource Rare Disease collaboration (NIHR-RD) and the 100,000 Genomes Project (100KGP)12 along with data from the FinnGen Consortium to systematically identify common variants that contribute to ICP susceptibility. We have performed the first GWAS and meta-analyses for ICP, examining >7 million common sequence variants in 1138 cases of ICP compared to 153,642 non-ICP controls. This resulted in the identification of 11 genome-wide significant loci associated with ICP. The application of a functional prioritization pipeline further enabled the identification of liver-enriched genes and liver-specific cis-regulatory elements as likely effectors of ICP risk.

Results and discussion

GWAS and GWAS meta-analyses for ICP

GWAS were first performed using the NIHR-RD (https://bioresource.nihr.ac.uk/using-our-bioresource/our-cohorts/rare-diseases/) and 100KGP (https://www.genomicsengland.co.uk/initiatives/100000-genomes-project) data separately (Supplementary Fig. 1). In the NIHR-RD, 303 patients with ICP (severe, early-onset disease with symptoms by 33 weeks’ gestation and maternal TSBA ≥ 40 µmol/L) were recruited. For women that used hormonal contraception, 15/254 (5.9%) reported itching when used and 39/256 reported cyclical itching. In addition, 15/247 patients (6.1%) reported drug-induced itching (other than for contraception). In the 100KGP, 225 individuals with an International Classification of Diseases (ICD) code for ICP were identified across all 100KGP study cohorts. For each of these datasets, common (minor allele frequency (MAF) ≥0.01) high-quality variants were extracted from the whole-genome sequence data. An unrelated subset of individuals of genetically defined European (EUR) ancestry (Supplementary Fig. 2), was identified by kinship coefficient estimation and principal component analysis. After quality control, the NIHR-RD dataset comprised 216 cases and 8436 controls with 8,291,828 variants, and the 100KGP dataset comprised 182 cases and 45,585 controls with 9,429,238 variants. A GWAS was performed in each cohort separately using a generalized mixed model association test with saddle-point approximation to adjust for case–control imbalance using SAIGE13 with sex and ten principal components (PCs) as covariates. The resulting summary statistics for variants with matching alleles and allele frequencies between the two datasets were then meta-analyzed using METAL14, weighting the effect size estimates using the inverse of the standard errors. Variants showing heterogeneity of effect between the two datasets (P < 1 × 10−5) were excluded. In total, 8,199,999 variants were meta-analyzed and four loci were identified as achieving genome-wide significance (P < 5 × 10−8) (Supplementary Figs. 1, 3a). The genomic inflation (lambda) was 1.01 (Supplementary Fig. 3c), indicating no significant population stratification.

To enhance the power for genetic discovery, we meta-analyzed the NIHR-RD/100KGP (UK) ICP GWAS meta-analysis summary statistics with GWAS data from FinnGen (Release 4) (https://www.finngen.fi/en). FinnGen is a public-private partnership combining digital health record data from Finnish health registries with genotyping data from Finnish Biobanks. Release 4 includes association data at 16,962,023 variants for 2444 endpoints in 176,899 Finnish individuals. As with the 100KGP, cases of ICP were identified based on ICD codes. There were 740 cases and 99,621 controls following FinnGen phenotype evaluation. GWAS was performed by FinnGen using SAIGE13 with sex, age, 10 PCs, and genotyping batch as covariates. Analysis of the summary statistics for the FinnGen ICP GWAS revealed seven loci achieving genome-wide significance (Supplementary Figs. 1, 3b, d), including the four loci identified in the UK meta-analysis. After allele matching, allele frequency matching, and removal of heterogeneous effects, 7,715,762 variants were available for meta-analysis across all three cohorts. In total, 11 loci achieved the genome-wide significance threshold (P < 5 × 10−8), (Fig. 1, Table 1, and Supplementary Fig. 1) and the genomic inflation (lambda) for the overall meta-analysis was 1.027 (Supplementary Fig. 3e). Summary statistics for the lead variants in each of the separate GWAS and meta-analyses are provided in Supplementary Data 1.

Fig. 1. Manhattan plot for the genome-wide association study meta-analysis of intrahepatic cholestasis of pregnancy (ICP).

Fig. 1

Data shown are from the combined meta-analysis including 1138 cases and 153,642 controls, all of European ancestry. The chromosomes are ordered on the x axis; the y axis shows the −log10(P) values for the association tests. Eleven loci achieved genome-wide significance (P < 5 × 10−8, indicated by the red line). The prioritized gene at each locus is shown. The effector gene functional prioritization strategy followed in this study is presented in Fig. 2. Association testing was performed using a generalized logistic mixed model to account for population stratification and saddle-point approximation to control for type 1 error rates due to unbalanced case–control ratios, followed by meta-analysis weighting the effect size estimates using the inverse of the standard errors.

Table 1.

Association statistics of the eleven loci achieving genome-wide significance (P < 5 × 10−8) in the genome-wide association study meta-analysis of intrahepatic cholestasis of pregnancy (ICP).

Chr Position Variant (rsID) Non-risk allele Risk allele UK Freq Finn Freq OR 95% CI P
2 27508073 rs1260326 T C 0.598 0.650 1.43 1.35–1.52 6.84E-16
2 43844604 rs4148211 G A 0.608 0.556 1.70 1.61–1.78 1.72E-33
2 169074756 rs66927685 TGAACCTTAGAGACTGAAGAA T 0.455 0.430 1.42 1.33-1.5 7.83E–16
4 76490987 rs13146355 G A 0.451 0.456 1.27 1.19–1.36 2.00E–08
7 87500928 rs5885586 TG T 0.821 0.833 1.63 1.51–1.74 7.83E–17
8 58480178 rs10107182 T C 0.341 0.377 1.50 1.41–1.58 1.95E–19
14 94378610 rs28929474 C T 0.021 0.020 7.09 6.74–7.43 1.04E–28
17 79732281 rs34491636 A G 0.217 0.231 1.36 1.26–1.46 2.62E–09
19 35552195 rs2251250 C T 0.379 0.321 1.34 1.25–1.42 1.41E–10
19 47867143 rs296384 G T 0.837 0.847 1.46 1.34–1.58 2.50E–10
20 44413724 rs1800961 C T 0.030 0.045 2.06 1.84–2.29 3.42E–10

Chromosome (Chr), position (Build 38), and rsID are given for the lead variant (variant with the lowest P value) at each locus. For each variant, the risk allele has been indicated (the allele that is more common in cases than controls) with the corresponding odds ratios (OR). The frequencies of the risk alleles in control individuals in the UK (UK Freq) and Finnish (Finn Freq) datasets are shown. Association testing was performed using a generalized logistic mixed model to account for population stratification and saddle-point approximation to control for type 1 error rates due to unbalanced case–control ratios, followed by meta-analysis weighting the effect size estimates using the inverse of the standard errors.

Conditional and joint (GCTA-COJO)15 analyses in each of the genome-wide significant loci demonstrated a single signal underlying each association. Testing for epistatic interactions was performed across all 55 possible pairs of the 11 risk variants in NIHR-RD and 100KGP (for which full genotype data were available) separately with PLINK16 and the resulting P values were combined using Fisher’s method. No significant interactions were identified after correcting for multiple testing based on 55 comparisons (P < 0.0009). Post hoc power calculations based on the size of the cohort utilized for the combined meta-analysis demonstrated 80% power to detect effects with an odds ratio (OR) > 1.3 at MAF = 0.5 and OR > 1.7 at MAF ≥ 0.05 (Supplementary Fig. 4).

Functional fine-mapping of causal variants at ICP susceptibility loci

The liver is a central organ in the development of ICP, as evidenced by the fact that most known ICP causal mutations affect liver-specific genes8,9 (Supplementary Fig. 5a). Moreover, unbiased analysis of the 11 ICP association signals from our GWAS highlighted that the expression of genes within these loci is highly liver-specific (Supplementary Fig. 5b) and relate to pathways involved in bile acid and lipid metabolism (Supplementary Fig. 6). We, therefore, reasoned that both coding and noncoding ICP risk variants likely affect liver-specific genes and/or cis-regulatory elements (CREs). In order to gain insights into the genetic mechanisms driving ICP susceptibility and identify likely causal variants, we carried out functional fine-mapping of the 11 association signals detected in the meta-analysis using PAINTOR17 to calculate the probability of a variant being causal at a given locus considering not only its strength of association but also functional annotation data (Fig. 2). We included adult liver tissue active chromatin features in the datasets used to prioritize causal variants, including accessible chromatin regions and those enriched in H3K27ac (see “Methods”). This analysis enabled the prioritization of nine variants with a high functional posterior probability (PPfunc > 0.8) across eight ICP susceptibility loci (Supplementary Data 2). For the remaining three loci, which did not include any protein-altering variant, we assumed that variants residing within active liver CREs were more likely functional. We thus defined a high-confidence set of human liver accessible chromatin sites using ATAC-seq profiles from four whole liver samples (data from ENCODE18, see “Methods”), which allowed us to further prioritize one likely causal variant at each remaining locus (Table 2).

Fig. 2. Flowchart describing the variant prioritization strategy followed for the eleven loci reaching genome-wide significance in the GWAS meta-analysis of ICP.

Fig. 2

Sequential steps used to prioritize the variants are shown with the different approaches for nonsynonymous, synonymous, and noncoding variants indicated.

Table 2.

Prioritized variants and genes within each of the eleven ICP genome-wide significant loci.

Lead variant Functional prioritization
Chr Position rsID Gene Variant (rsID) Same as lead Non-risk Risk Functional impact
2 27508073 rs1260326 GCKR rs1260326 Yes T C Coding, missense
2 43844604 rs4148211 ABCG5/8 rs4148211 Yes G A CRE
2 169074756 rs66927685 ABCB11 rs66927685 Yes TGAACCTTAGAGACTGAAGAA T CRE
4 76490987 rs13146355 SCARB2 rs4859682 No C A CRE
7 87500928 rs5885586 ABCB1/4 rs55747905 No C T CRE, creation of TCF7L2-binding site
8 58480178 rs10107182 CYP7A1 rs10504255 No A G CRE, disruption of DMRTA1-binding site, and decreased gene expression
14 94378610 rs28929474 SERPINA1 rs28929474 Yes C T Coding, missense
17 79732281 rs34491636 ENPP7 rs9916601 No C T CRE, increased gene expression
19 35552195 rs2251250 TMEM147 rs4806173 No C G CRE, creation of NC3R1-binding site and decreased gene expression
19 47867143 rs296384 SULT2A1 rs296361 No G A CRE, creation of SOX-D-binding site and decreased gene expression
20 44413724 rs1800961 HNF4A rs1800961 Yes C T Coding, missense

The chromosome (Chr), position (Build 38), and rsID are given for the lead variant (variant with the lowest P value) in each locus are shown along with the gene, rsID, risk, and non-risk alleles and functional impact following in silico functional prioritization.

ICP coding variant analysis

Three ICP signals (SERPINA1, GCKR, and HNF4A) hosted a single missense coding variant with high likelihood of being causal, with functional posterior probabilities above 99% and a pathogenicity score CADD >1019 (Fig. 3). From these three signals, the one with the strongest association with ICP susceptibility was rs28929474 in the SERPINA1 gene, which encodes alpha-1-antitrypsin, a major plasma serine protease inhibitor (Fig. 3a). This variant has been extensively reported in the literature as the SERPINA1 Z allele, an amino acid substitution of Glu342Lys that causes alpha-1-antitrypsin deficiency20 and is associated with cystic fibrosis liver disease21. The two other missense variants alter the sequences of the GCKR (glucokinase regulatory protein) and HNF4A (hepatic nuclear factor 4 alpha) (Fig. 3b, c), genes that are essential for hepatic metabolic homeostasis and when mutated may lead to monogenic diabetes22. We examined the potential functional impact of these coding variants through the querying of the ClinVar variant interpretation database and Metadome analysis23. At GCKR, we identified an ICP risk variant, rs1260326 (Leu446Pro), which has been previously associated with altered serum fasting plasma glucose and triglyceride concentrations24,25 and nonalcoholic fatty liver disease (NAFLD)26. Previous functional studies of this variant have indicated that the risk allele in this locus is associated with increased glucokinase activity in the liver27. At HNF4A, a single variant was associated with ICP susceptibility, rs1800961 (Thr139Ile), which Metadome analysis23 predicted to be deleterious for protein function (Supplementary Fig. 7a).

Fig. 3. Analysis of intrahepatic cholestasis of pregnancy (ICP) association signals uncovers three coding risk variants affecting the SERPINA1, GCKR, and HNF4A genes.

Fig. 3

ac Regional association and linkage disequilibrium (LD) plots for the ICP signals at SERPINA1, GCKR, and HNF4A, respectively. The y axis represents the −log10(P) values for the association test from the GWAS meta-analysis of ICP (1138 cases and 153,642 controls) and the x axis the chromosomal positions (GRCh38/hg38). The SNP with the lowest p-value in the locus is indicated by a purple diamond. The remaining SNPs in the region are colored according to their r2 with the lead SNP (r2 calculated using 1000 Genomes European (EUR) reference). The dotted gray line represents the genome-wide significance threshold of 5 × 10−8. Plots were generated using LocusZoom61. Association testing was performed as described in Fig. 1. Zoomed insets reveal the prioritized coding variant at each locus, with functional posterior probability (PPfunc) above 0.99. PPfunc were calculated using PAINTOR17. All three variants showed Combined Annotation Dependent Depletion (CADD) scores19,84 above 10, which is usually considered the threshold for pathogenic variants. Further details on coding variant analysis are provided in Supplementary Data 3. 95% PP ICP variants, variants within the 95% genetic credible set in the ICP meta-analysis. LD ICP variants, variants with EUR r2 > 0.8 with the meta-analysis lead variant.

Functional prioritization of noncoding variants at lipid and bile acid homeostasis loci

At the remaining loci identified in the GWAS meta-analysis for ICP, we observed that seven of the eleven loci contained noncoding and synonymous variants (examining the 95% genetic credible set, as well as the lead variant and those in linkage equilibrium at r2 > 0.8 in EUR individuals). This observation is in line with previous reports suggesting that common sequence variation predominantly contributes to human disease via disruption of noncoding cis-regulatory elements28,29. We therefore took advantage of the functional fine-mapping strategy described above, along with the integration of adult liver epigenomic datasets and pathogenicity measures, to further dissect the causal mechanisms at play in each ICP locus (Fig. 2, see “Methods”).

In the liver, the family of ABC transporters play central roles in numerous physiological processes, including the export of cholesterol, bile salts, bilirubin, and drug conjugates. Three ABC transporter gene loci (ABCG5/8, ABCB1/4, and ABCB11) contained ICP GWAS signals (Fig. 4). ABCG5 and ABCG8 together encode a heterodimeric cholesterol transporter, expressed in hepatocytes and intestinal cells30. A missense polymorphism in ABCG8, Tyr54Cys (rs4148211), was identified as the lead variant at this locus (Fig. 4a). However, Metadome analysis23 indicated that this protein sequence change is expected to be tolerated (Supplementary Fig. 7b and Supplementary Data 3), suggesting that this or other variants in linkage disequilibrium (LD) have a different type of functional impact at this locus. Combined analysis of adult liver epigenomic datasets and a high-throughput reporter assay dataset from liver cells (Survey of Regulatory Elements, SuRE31), indicated that the lead variant rs4148211 (PPfunc > 0.99) resides within a hepatic CRE and its risk allele is associated with allele-dependant transcriptional activity in liver cells (P = 9.1 × 10−4), where the risk allele is associated with transcriptional repression (Fig. 4b and Supplementary Fig. 7c). Analysis of the SuRE assay dataset from liver cells revealed seven additional ICP-associated variants in the ABCG8 locus for which the risk allele associates with transcriptional repression (Supplementary Fig. 7c). However, only one overlapped a hepatic CRE (rs4148204) and none of them was prioritized as likely causal by PAINTOR (Supplementary Data 2).

Fig. 4. Three ABC transporter gene loci are associated with intrahepatic cholestasis of pregnancy (ICP).

Fig. 4

a, c, e LocusZoom61 plots showing the association signals at ABCG8, ABCB1/4, and DHRS9, respectively. The LocusZoom plots are as per Fig. 3 and association testing was performed as described in Fig. 1. At each locus, a single noncoding variant was prioritized as likely causal following the strategy outlined in Fig. 2. All epigenomic and transcriptomic datasets shown were retrieved from ENCODE18. Adult liver ATAC-seq peaks track (green regions) corresponds to regions accessible in at least two out of four ENCODE adult liver ATAC-seq (see Methods). 95% PP ICP variants, variants within the 95% genetic credible set in the ICP meta-analysis. LD ICP variants, variants with EUR r2 > 0.8 with the meta-analysis lead variant. PPfunc, functional posterior probability calculated using PAINTOR17. CADD, combined annotation-dependent depletion. a Zoomed inset shows a coding variant at ABCG8 overlapping a hepatic transcriptional enhancer, as revealed by strong H3K27ac enrichment and accessible chromatin in adult female liver tissue. Metadome analysis did not suggest that this variant affects protein function (see Supplementary Fig. 7). b Allele-specific effect of the prioritized risk variant rs4148211, assessed by Survey of Regulatory Elements (SuRE) in HepG2 cells31. The results suggest that the risk variant associates with decreased transcriptional activity. Mean signal and two-sided Wilcoxon rank-sum test. Previously determined 5% false-discovery rate threshold: P < 0.0017312131. Source data are provided as a Source Data file. c, e Zoomed insets show two prioritized regulatory risk variants for ICP: one affecting a hepatic enhancer between ABCB1 and ABCB4 (c), and one affecting a hepatic enhancer in an intron of DHRS9 (d). We note however that DHRS9 is not expressed in liver tissue, as shown in the strand-specific RNA-seq tracks. In contrast, ABCB11 is highly expressed in this tissue making it a likely effector transcript in this locus.

The ICP association signal on chromosome 7 spanned a genomic segment including the genes ABCB1 and ABCB4 (Fig. 4c), which encode for the Multidrug Resistance Proteins 1 and 3 (MDR1 and MDR3), respectively. Whilst the lead variant in this locus did not overlap a coding sequence or active hepatic CRE, we prioritized the variant rs55747905 as likely functional (PPfunc = 0.94), which overlapped an active enhancer element (accessible chromatin enriched in H3K27ac) (Fig. 4c). Transcription factor binding motif disruption analysis further revealed that the risk allele of rs55747905 (C) disrupts a recognition sequence for the Wnt signaling transcription factor TCF7L2 (Fig. 4d).

At the third ABC transporter locus, we prioritized ABCB11 as the gene mediating the ICP susceptibility process. In this case, functional fine-mapping with PAINTOR enabled the prioritization of a single variant residing in an intronic liver enhancer in the DHRS9 gene (Fig. 4e). However, analysis of ENCODE18 RNA-seq data revealed that DHRS9 has very weak expression in adult liver, in contrast with ABCB11, which encodes the bile salt export pump (BSEP).

The ICP association signal on chromosome 8 is located between the UBXN2A and CYP7A1 genes (Fig. 5a). UBXN2A encodes a ubiquitin-like protein involved in proteasomal degradation32 and CYP7A1 encodes the enzyme (cholesterol 7 alpha-hydroxylase) responsible for the first and rate-limiting step in bile acid biosynthesis in hepatocytes33. At this locus, only one regulatory variant was prioritized, rs10504255, which resides in an active liver enhancer. Unlike the lead variant at this locus (rs10107182), the prioritized variant reached a high functional posterior probability (PPfunc > 0.99 vs. <0.01) and had a high pathogenicity score (CADD = 12.9 vs. 5.8). Furthermore, analysis of GTEx34 eQTL data revealed that rs10504255 is associated with allele-dependant expression of UBXN2B in human liver (Fig. 5b), where the risk allele (G) is associated with weaker transcriptional activity (P = 6.18 × 10−6). Even though this variant was not a statistically significant eQTL for CYP7A1 in the GTEx database (P = 0.15), we observed a similar trend on the direction of effect of the risk allele (Fig. 5b) and previous chromatin conformation capture and genome editing studies in human hepatocytes have linked this CRE with the modulation of CYP7A1 expression35. These previous studies, combined with the functional significance of CYP7A1 in bile acid homeostasis, led us to prioritize CYP7A1 as the major effector transcript in this locus. Motif analysis further revealed that the risk allele disrupts a binding site for the transcription factor DMRTA1 (Fig. 5c).

Fig. 5. GWAS meta-analysis of ICP reveals association signals affecting the bile acid homeostasis genes CYP7A1 and SULT2A1.

Fig. 5

a, d LocusZoom61 plots showing the association signals at CYP7A1 and SULT2A1, respectively. The LocusZoom plots are as per Fig. 3 and association testing was performed as described in Fig. 1. All epigenomic and transcriptomic datasets shown were retrieved from ENCODE18. Adult liver ATAC-seq peaks track (green regions) corresponds to regions accessible in at least two out of four ENCODE adult liver ATAC-seq (see “Methods”). 95% PP ICP variants, variants within the 95% genetic credible set in the ICP meta-analysis. LD ICP variants, variants with EUR r2 > 0.8 with the meta-analysis lead variant. PPfunc, functional posterior probability calculated using PAINTOR. CADD, combined annotation-dependent depletion. a Functional fine-mapping with PAINTOR17 identified a single variant (rs10504255) with a posterior probability over 0.99 overlapping an active liver enhancer located between the CYP7A1 gene, which encodes the rate-limiting enzyme for bile acid synthesis, and UBXN2B. b, e GTEx34 [GTEx Analysis Release V8 (dbGaP Accession phs000424.v8.p2)] human liver eQTLs identify the top prioritized variant in the CYP7A1 locus (rs10504255) as an eQTL associated with the eGene UBXN2B (b) and show that the SULT2A1 promoter ICP variant (rs296361) is an eQTL for both SULT2A1 and its downstream lncRNA LINC01595 (e). Violin plots represent the density distribution of the samples in each genotype (n for each genotype is indicated below in blue). Box plots show normalized gene expression in median (white line), first and third quartiles. Gene-level adjusted P values calculated with FastQTL85 are shown. c The risk allele of the ICP-associated variant rs10504255 (G) disrupts a DMRTA1-binding motif. We note that this allele also associates with weaker transactivation in liver tissue, as observed by eQTL analysis (b). d Overlap of ICP risk variants with adult liver accessible chromatin sites revealed one variant in the promoter of SULT2A1 (rs296361). f Transcription factor motif analysis revealed that the ICP variant rs296361 affects a motif for the SOX-D transcriptional repressor family.

In addition to CYP7A1, the ICP GWAS meta-analysis identified another important locus for bile acid homeostasis, SULT2A1, which encodes the enzyme Sulfotransferase 2A1, highly expressed in the liver and capable of phase II metabolism of bile acids36 (Fig. 5d). Here, a variant in the promoter of SULT2A1 was prioritized (rs296361), with the risk allele conferring lower SULT2A1 expression in the liver (P = 1.31 × 10−7) (Fig. 5e), likely through the creation of a binding site for the family of transcriptional repressors SOX-D (Fig. 5f). We note that the prioritized variant also associated with allele-dependent effects on the expression of a nearby long noncoding RNA (LINC01595, P = 1.02 × 10−50), albeit in the opposite direction. This discrepancy may reflect an indirect effect of the expression of SULT2A1 on LINC01595.

Functional prioritization of previously uncharacterized ICP genes

We next explored the remaining three ICP-associated signals where the closest genes were not directly involved in lipid and bile acid homeostasis. On chromosome 19, even though it was not possible to employ PAINTOR for functional fine-mapping due to the presence of ambiguous alleles (A/T or G/C), we prioritized the variant rs4806173 based on its overlap with a hepatic CRE (Fig. 6a). This variant resides in an intron of the testis-specific gene GAPDHS37. We thus queried eQTL GTEx34 data to identify potential targets of this regulatory variant in liver tissue. This analysis led us to pinpoint TMEM147 (transmembrane protein 147) as the likely effector transcript of this association signal, with the risk allele associating with lower hepatic gene expression (1.52 × 10−40) (Fig. 6b). Using transcription factor motif disruption analysis, we detected that the risk allele of rs4806173 creates a recognition sequence for the glucocorticoid receptor (Fig. 6c), which has been previously reported to act as either a gene expression activator or repressor in a context-dependent manner38,39. Recent studies in HeLa cells have linked loss of TMEM147 with increased cholesterol uptake40 supporting the role of this gene as an effector of ICP genetic susceptibility.

Fig. 6. Analysis human liver eQTLs uncovers TMEM147 and ENPP7 as effector transcripts of ICP susceptibility.

Fig. 6

a, d LocusZoom86 plots showing the association signals at TMEM147 and ENPP7, respectively. The LocusZoom plots are as per Fig. 3 and association testing was performed as described in Fig. 1. All epigenomic and transcriptomic datasets shown were retrieved from ENCODE18. Adult liver ATAC-seq peaks track (green regions) corresponds to regions accessible in at least two out of four ENCODE adult liver ATAC-seq (see “Methods”). 95% PP ICP variants, variants within the 95% genetic credible set in the ICP meta-analysis. LD ICP variants, variants with EUR r2 > 0.8 with the meta-analysis lead variant. PPfunc, functional posterior probability calculated using PAINTOR17. CADD, combined annotation-dependent depletion. Overlap of ICP-associated variants in these two loci with adult liver accessible chromatin sites enabled the prioritization of two ICP regulatory variants in intronic regions of GAPDHS (rs4806173) (a) and ENPP7 (rs9916601) (d). b, e The ICP variants prioritized with functional liver chromatin annotations are eQTLs in liver tissue and pinpoint TMEM147 and ENPP7 as effector transcripts in these loci. Violin plots represent the density distribution of the samples in each genotype (n for each genotype is indicated below in blue). Box plots show normalized gene expression in median (white line), first and third quartiles. Gene-level adjusted P values calculated with FastQTL86 are shown. Liver eQTL plots were retrieved from GTEx34 [GTEx Analysis Release V8 (dbGaP Accession phs000424.v8.p2)]. We note that GAPDHS is a testis-specific gene and its interrogation as an eGene for rs4806173 was therefore not possible (b). c The ICP risk allele of rs4806173 (G), which associates with less transcription of TMEM147, is predicted to create a binding site for glucocorticoid receptors, a family of context-dependent transcriptional regulators that can act as either activators or repressors87.

Similarly, we prioritized ENPP7 as the likely effector transcript driving the association signal in chromosome 17 (Fig. 6d, e), with the risk variant (T) associating with increased ENPP7 expression (Fig. 5e). ENPP7 encodes the alkaline sphingomyelinase (Alk-SMase) that is present in the intestinal tract and bile, and responsible for the digestion of sphingomyelins in a bile salt-dependent manner41. Previous studies have suggested that Alk-SMase promotes intestinal cholesterol absorption42.

The final signal encompassed intronic variants in the SHROOM3 gene, including one variant in a hepatic CRE (rs4859682) (Fig. 7a). However, analysis of liver eQTLs did not assist in the identification of potential effector transcripts in the locus. Topologically associating domains (TADs) correspond to self-interacting genomic domains that contain genes and regulatory elements that interact more frequently with each other than with other regions43. We thus carried out a systematic analysis of all genes located in the TAD that contains this ICP association signal, interrogating their expression in liver tissue (Fig. 7b) and their liver-specificity (Fig. 7c). Given the liver-specificity of most ICP genes (Supplementary Fig. 5b), we decided to further investigate genes whose expression is both high and specific in liver tissue. Two genes met these criteria: STBD1 and SCARB2 (Fig. 7c). We then investigated human liver Hi-C maps in order to find out whether there were long-range 3D chromatin contacts between the prioritized causal variant rs4859682, SHROOM3 (closest gene), STBD1, and SCARB2, in this tissue. Only long-range interactions stemming from SCARB2 were detected (Fig. 7d), which led us to prioritize this gene as the likely effector of the association signal in this locus. SCARB2 encodes for the lysosomal integral membrane protein type 2 (LIMP-2), which has been reported to play a role in cholesterol transport in lysosomes44.

Fig. 7. Tissue-specificity and 3D chromatin conformation analyses prioritize SCARB2 as an ICP susceptibility gene.

Fig. 7

a LocusZoom86 plot showing the association signal at SHROOM3. The LocusZoom plots are as per Fig. 3 and association testing was performed as described in Fig. 1. All epigenomic and transcriptomic datasets shown were retrieved from ENCODE18. Adult liver ATAC-seq peaks track (green regions) corresponds to regions accessible in at least two out of four ENCODE adult liver ATAC-seq (see “Methods”). 95% PP ICP variants, variants within the 95% genetic credible set in the ICP meta-analysis. LD ICP variants, variants with EUR r2 > 0.8 with the meta-analysis lead variant. CADD, combined annotation-dependent depletion. Overlap of ICP risk variants in this locus with adult liver accessible chromatin sites enabled the prioritization of a single ICP-regulatory variant in an intronic region of the SHROOM3 gene (rs4859682). None of the other ICP variants (EUR r2 > 0.8 with lead variant or in the 95% PP credible set) overlapped accessible chromatin sites. b Expression of all genes within the TAD containing the ICP lead variant rs13146355 in the human liver. Expression data was retrieved from GTEx34 [GTEx Analysis Release V8 (dbGaP Accession phs000424.v8.p2)]. Violin plots represent the density distribution of expression of the indicated genes in human liver tissue (n = 226 donors). Box plots show gene expression in median (white line), first and third quartiles. TPM, transcripts per million. Note that the gene harboring the ICP risk variant, SHROOM3, is lowly expressed in the liver. c Scatter plot showing the expression of all genes within the TAD containing the ICP lead variant rs13146355 in the human liver versus their liver-specificity. Tissue-specificity scores (SPECS) were precalculated by Everaert et al.81. Source data are provided as a Source Data file. This plot highlights two highly expressed and liver-specific genes, STBD1, and SCARB2. d, Analysis of long-range chromatin interactions in human liver detected by promoter capture Hi-C79, showing that the genomic region containing the ICP risk variant rs4859682 interacts with SCARB2, but not SHROOM3. Long-range chromatin interactions stemming from the promoter of STBD1 were not detected in liver tissue.

Overlap of risk loci for ICP and other traits

All but one of the loci identified in this study (rs34491636, ENPP7) have been previously associated (at r2 > 0.6) with diseases or traits in the GWAS Catalog45, many of which have phenotypic overlap with ICP (Supplementary Fig. 8 and Supplementary Data 4). Notably, six of the loci have previously been associated with gallstone disease, six with LDL cholesterol concentrations, and four with liver enzyme concentrations (Supplementary Fig. 8). Likewise, we observed a strong enrichment for ICP susceptibility genes (identified in an unbiased manner from the GWAS loci) being also associated with gallstone disease (P = 2.5 × 10−15) and LDL cholesterol (P = 2.67 × 10−07) (Supplementary Fig. 9).

The elevated serum bile acids that form the core phenotype are also associated with maternal and fetal dyslipidemia4648, and increased rates of gallstone formation49. Variation at ABCG8 has previously been linked to gallstone disease and other lipid-related phenotypes50,51. However, the specific haplotype associated with ICP differs from the main reported association with gallstone disease. CYP7A1 variation is associated with blood lipid traits52 in addition to gallstone formation53. Heterozygous ABCB4 mutations were the first genetic link identified to ICP54,55 and common variation at this locus has been previously described in ICP10. In the context of pregnancy, raised gestational hormones cause reduced FXR-mediated induction of hepatic Bsep, Shp, and Mdr36, and enterocyte Shp and Fgf15/1956 thereby causing abnormal bile acid homeostasis. This is likely to exacerbate susceptibility to hypercholanemia in genetically predisposed women with the variants reported in this manuscript.

Given the strong overlap between genes associated with ICP and cholelithiasis (Supplementary Fig. 9), we directly examined the evidence of association of the 11 ICP loci with cholelithiasis using the FinnGen cholelithiasis GWAS (https://r4.finngen.fi/pheno/K11_CHOLELITH), which includes 15,683 cases and 158,425 controls. Six of the loci identified in our ICP GWAS meta-analysis achieve genome-wide significance for cholelithiasis as well (P < 5 × 10−08) (Supplementary Fig. 8). Two additional loci demonstrate nominal evidence of association. Despite 100% power (assuming a similar-sized effect on the risk of ICP and cholelithiasis) to detect association, three ICP loci (ABCB11, SCARB2, and ENPP7) showed no evidence of association with cholelithiasis indicating that they are specific to ICP and not shared with cholelithiasis.

We also directly examined evidence of association at the 29 loci with AF > 0.01 (the MAF threshold in our study) previously associated with cholelithiasis in the largest meta-analysis published to date (comprising 27,174 cholelithiasis cases and 736,838 controls57). Five of these loci achieve genome-wide significance (P < 5 × 10−08) in the ICP meta-analysis (SERPINA1, CYP7A1, GCKR, ABCB4, and HNF4A) and a further five show evidence of association with ICP after correcting for examining 29 loci (P < 0.05/29 = 0.0017); two of these are genes reported as genome-wide significant in the present ICP GWAS (ABCG8 and SULT2A1) and three are not (FADS2, JMJD1C, and TTC39B). Assuming the same effect size of these associations between cholelithiasis and ICP, the power to detect association at P = 0.0017 was only present for one locus (at which significance was achieved) so we examined whether a composite effect of the known cholelithiasis genetic risk variants was detectable in ICP patients by calculating polygenic risk scores (PRS). Of the 29 cholelithiasis loci, genotype data were available in the NIHR-RD dataset for 27. PRS calculated using the OR from the previously published cholelithiasis GWAS demonstrated ICP patients have a significantly higher burden of cholelithiasis genetic risk variants compared with controls (Wilcoxon P = 1.4 × 10−05). However, when this analysis was repeated with the above loci that showed association in the ICP meta-analysis removed (leaving 18 variants for which data were available), there was no difference in PRS (P = 0.198) between ICP and controls, despite 92% power to detect a difference at P < 0.05. This suggests that not all the genetic risk factors for cholelithiasis are risk factors for ICP. Together, these findings indicate that cholelithiasis and ICP have distinct but overlapping sets of genetic risk factors.

Given the overlapping genetic architecture between ICP and gallstone disease, and the broad spectrum of disease severity in this group, these loci represent possible modifier genes in a number of other cholestatic diseases such as the progressive familial cholestasis syndromes58.

In conclusion, this study has uncovered common variation that influences ICP susceptibility, identifying key contributing genes. These genes highlight an underlying shared mechanism of variation at liver-specific genes and cis-regulatory elements, providing insights into the pathogenesis of this disease that warrant further investigation.

Methods

Study cohorts and ethics

An extensive description of the cohorts included in this study and details of the ethics approvals can be found in the Supplementary Note in Methods (Supplementary Information).

NIHR-RD and 100KGP whole-genome sequencing variant calling, quality control, ancestry, and relatedness estimation

Detailed methods describing the processing of whole-genome sequence data for the NIHR-RD and 100KGP cohorts can be found in the Supplementary Note in Methods (Supplementary Information).

NIHR-RD and 100KGP genome-wide association analyses

For genome-wide association analysis (GWAS) in both the NIHR-RD and 100KGP cohorts separately, variants passing the following criteria were retained: MAF ≥ 0.01, minimum minor allele count (MAC) ≥ 20, missingness <1%, Hardy–Weinberg equilibrium P value >1 × 10−6 and differential (case/control) missingness P value >1 × 10−5. The final NIHR-RD and 100KGP datasets comprised 8,337,027 and 9,545,879 variants, respectively. GWAS was performed in each dataset separately using the R package SAIGE13 (versions 0.42.1 and 0.44.2) (https://github.com/weizhouUMICH/SAIGE). SAIGE implements a generalized logistic mixed model to account for population stratification and a saddle-point approximation to control type 1 error rates when case–control ratios are unbalanced. SAIGE was run using the default parameters. Null logistic mixed models were fitted using the subsets of linkage disequilibrium (LD)-free high-quality common SNVs. Sex and the top ten principal components were used as fixed effects. The genomic inflation factors (lambda), calculated based on the 50th percentile, were 1.009 and 1.001 in NIHR-RD and 100KGP, respectively, indicating no evidence of confounding by population structure. The summary statistics from the NIHR-RD were lifted over from build GRCh37 to build GRCh38 of the human genome using CrossMap59 (version 0.5.2) (http://crossmap.sourceforge.net/). In all, 12,756 variants failed to lift over. Due to data sharing restrictions, it was not possible to combine the raw genome-wide genotype data from the NIHR-RD and 100KGP datasets.

NIHR-RD and 100KGP meta-analysis

Meta-analysis of the NIHR-RD and 100KGP GWAS summary statistics was performed using METAL (version 2011-03-25) (https://genome.sph.umich.edu/wiki/METAL)14. Multi-allelic variants were excluded such that there were 8,291,828 and 9,545,879 variants in NIHR-RD and 100KGP, respectively. Of these, 8,207,819 variants were shared and 8,201,630 variants had matching alleles. Meta-analysis was performed weighting the effect size estimates using the inverse of the standard errors. Variants showing heterogeneity of effect between the two datasets (P < 1 × 10−5) and those in which the minimum/maximum allele frequencies differed by >0.05 were excluded leaving 8,199,999 variants. A Manhattan plot was produced using the R package qqman (version 0.1.8) (https://github.com/stephenturner/qqman)60 (Supplementary Figure 3a). Four loci achieved genome-wide significance (P < 5 × 10−8). A QQ plot was generated using the observed and expected –log10 (P values) (Supplementary Fig. 3c). The genomic inflation factor (lambda), calculated based on the 50th percentile, was 1.01 indicating no significant population stratification.

FinnGen genome-wide association analysis

FinnGen is a public-private partnership combining digital health record data from Finnish health registries with genotyping data from Finnish Biobanks. Release 4 (https://finngen.gitbook.io/documentation/v/r4/) includes association data at 16,962,023 variants for 2444 endpoints in 176,899 Finnish individuals. A detailed description of the study design and analytical methods are available in the online documentation (URL above). In brief, individuals were genotyped with Illumina and Affymetrix chip arrays. QC was performed to remove samples and variants of poor quality. Genome-wide imputation was performed using reference Finnish whole-genome sequence data. Disease endpoints were defined using nationwide registries based on ICD (and other) codes. A subset of unrelated individuals of genetically confirmed Finnish ancestry was identified. GWAS was performed using SAIGE (version 0.35.8.8) for all variants with MAC > 5. Sex, age, 10 PCs, and genotyping batch were included as covariates in the analysis.

For the ICP GWAS (https://r4.finngen.fi/pheno/O15_ICP), cases (n = 740) were identified as those with the ICD-10 code of O26.6 (“Liver disorders in pregnancy, childbirth and the puerperium” including “cholestasis (intrahepatic) in pregnancy” and “obstetric cholestasis”) or an ICD-9 code of 646.7 (“Liver and biliary tract disorders in pregnancy”). All females without these ICD codes were included as controls (n = 99,621). Seven loci achieved genome-wide significance (P < 5 × 10−8) (Supplementary Fig. 3b). The genomic inflation factor (lambda), calculated based on the 50th percentile, was 1.046 (Supplementary Fig. 3d).

Combined UK (NIHR-RD and 100KGP) and FinnGen meta-analysis

A meta-analysis was subsequently performed of the combined NIHR-RD and 100KGP data (“UK meta-analysis”) and the FinnGen summary statistics. This included 154,780 individuals of whom 1138 were cases and 153,642 were controls. Meta-analysis was performed using the same methodology as described above. A total of 7,721,597 variants were shared between the final UK meta-analysis dataset and the FinnGen dataset and 7,715,997 variants had matching alleles. After performing the meta-analysis, variants showing heterogeneity of effect between the two datasets (P < 1 × 10−5) and those in which the minimum/maximum allele frequencies differed by >0.25 were excluded leaving 7,715,762 variants. Eleven loci achieved genome-wide significance (P < 5 × 10−8) (Fig. 1). The genomic inflation factor (lambda), calculated based on the 50th percentile, was 1.027 (Supplementary Fig. 3e). Summary statistics for each of the lead variants in each of the separate GWAS and meta-analyses are provided in Supplementary Data 1.

LocusZoom

Regional high-resolution association plots showing the LD between markers in the prioritized loci (as per European reference data) were generated using LocusZoom (version 0.13.3) (https://my.locuszoom.org/)61. Credible sets were calculated by LocusZoom using Bayes factors based on the P values (https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf).

Epistasis

Testing for epistatic interactions was performed across all possible 55 pairs of the 11 prioritized lead risk variants in NIHR-RD and 100KGP separately with PLINK16 (version 1.9) (https://www.cog-genomics.org/plink/). The resulting P values were combined using Fisher’s method. No significant interactions were identified after correcting for multiple testing based on 55 comparisons (P < 0.0009).

Conditional analysis

Each of the prioritized loci was examined using “conditional and joint analyses” (GCTA-COJO)15 to assess whether there was more than one independent signal. COJO was run from GCTA (version 1.93.2) (https://cnsgenomics.com/software/gcta/#COJO) using default parameters (examining a 10-Mb window surrounding genome-wide significant variants). The NIHR-RD dataset was utilized to provide the reference LD structure of the variants.

Power calculations

Post hoc power calculations were performed to assess the range of odds ratio (OR) and allele frequencies for which associations could be detected at genome-wide significance (P < 5 × 10−8) given the number of cases and controls in the final meta-analysis. The R package genpwr62 (version 1.0.4) (https://cran.r-project.org/web/packages/genpwr/index.html) was employed using a logistic model under genetic additivity. Power to detect specific previously described associations was calculated with genpwr using the OR of the described associations, reference European AF and cohort sizes. For cholelithiasis, the OR were taken from the supplementary data of a published meta-analysis57 (Supplementary Data 1). For rs708686 the OR from the additive model described in the manuscript was utilized.

Polygenic risk score calculation

A polygenic risk score (PRS) for cholelithiasis in the NIHR-RD data was calculated using Mangrove (version 1.21) (https://cran.r-project.org/web/packages/Mangrove/index.html)63 with the previously published cholelithiasis OR57 and reference European AF, as described above. The PRS was calculated assuming additivity within and between loci. Statistical comparison of the PRS in cases and controls was performed using an unpaired two-sample two-sided Wilcoxon rank-sum test.

Power calculations for the PRS analyses were performed through simulating genotype data for case and control datasets using custom code, written in R (available on request). Genotypes were simulated for each variant in each individual with the probabilities of each genotype status (homozygous reference, heterozygous, and homozygous alternate) determined by the Hardy–Weinberg equilibrium formulae based on the AF and assuming complete linkage equilibrium between the variants. For controls, the reference European AF were used. For cases, the AF of the risk allele were calculated from the reference AF and OR. A total of 100 repetitions of the simulation process were performed in which genotypes for the target variants were generated for the required number of cases and controls (matching the numbers available in the ICP analysis). For each repetition, PRS were generated for each individual using Mangrove, and the statistical significance of the difference between cases and controls was calculated using Wilcoxon rank-sum test. The power to detect a difference was then determined as the percentage of simulations in which a statistically significant difference between cases and controls was seen at P < 0.05.

Unbiased gene prioritization and expression/pathway/GWAS Catalog analyses

An unbiased assessment of gene expression and functional pathway enrichment of the genes within the 11 prioritized risk loci was undertaken as follows. All 655 variants achieving P < 5 × 10−8 in the final meta-analysis were identified. Of these, 629 were successfully mapped to an rsID. All variants in LD with these variants at r2 > 0.8 in European individuals using 1000 Genomes data were identified using LDlinkR (version 1.1.2) (https://cran.r-project.org/web/packages/LDlinkR/vignettes/LDlinkR.html)64. This resulted in the identification of 871 unique variants. These variants were subsequently analyzed using the Ensembl Variant Effect Predictor (VEP)65 (version 104) (https://www.ensembl.org/info/docs/tools/vep/index.html) to identify the 39 unique genes impacted. These genes were interrogated using the GENE2FUNC tool in FUMA (version 1.3.6b) (https://fuma.ctglab.nl/gene2func)66. 35 genes were successfully mapped and analyzed by GENE2FUNC (ABCB1, ABCB11, ABCB4, ABCG5, ABCG8, ATP4A, BABAM2, C2orf16, CCDC121, CRX, CYP7A1, DHRS9, EIF2B4, ENPP7, GAPDHS, GCKR, GPN1, HNF4A, LINC01595, MRPL33, NSMAF, PPIAP85, PPP4R4, RBKS, SBSN, SDCBP, SERPINA1, SERPINA10, SERPINA2, SERPINA6, SHROOM3, SNX17, SULT2A1, TMEM147, TMEM147-AS1, TPRX2P, UBXN2B, ZNF512, and ZNF513). The results of tissue-specificity analysis (as per GTEx34 data (https://www.gtexportal.org)), gene sets/pathway analysis (as per MsigDB67 (https://www.gsea-msigdb.org/gsea/msigdb)) and overlap of the genes with GWAS Catalog45 reported associations (https://www.ebi.ac.uk/gwas/) were interrogated. Further details regarding the methodology employed by GENE2FUNC are available on the FUMA website (https://fuma.ctglab.nl/tutorial#g2fOutputs). Statistical significance was assessed using hypergeometric tests with Bonferroni correction for multiple testing.

Overlap with previous GWAS findings

Previously described GWAS findings (as per the GWAS Catalog (https://www.ebi.ac.uk/gwas)45 accessed on July 23, 2021) were searched for overlap with the lead variants at each of the 11 genome-wide significant ICP loci. This was undertaken using the LDtrait Tool from LDlink (https://ldlink.nci.nih.gov/?tab=ldtrait)68. All associations achieving P < 5 × 10−6 that were in LD with the 11 identified lead variants at r2 > 0.6 in European populations (using 1000 Genomes data) were included. Phenotype labels were manually amended such that they were consistent and duplicates were removed (Supplementary Fig. 8).

Identification of adult human liver replicated accessible regions

In order to functionally annotate ICP risk variants, we retrieved from the ENCODE portal18 (https://www.encodeproject.org/) the IDR ranked peaks for a set of high-quality ATAC-seq experiments carried out on four adult human liver samples. The file identifiers were ENCFF948WBQ, ENCFF012SCX, ENCFF658LHI, and ENCFF658LHI, and corresponded to liver samples from two males and two females with ages ranging from 40 to 61 years old (mean 47 years old). Using these files, we identified the set of genomic regions that were accessible in at least two of the four donors, which we used to prioritize and identify functional ICP regulatory variants. These regions are provided in Supplementary Data 5 (coordinates from genome assembly GRCh38/hg38).

Variant functional prioritization

The overall prioritization strategy to identify the most likely risk causal variants and genes at each of the ICP-associated loci is presented in Fig. 2. Details of each of the steps taken to prioritize causal variants and genes are described in the next sections.

Statistical fine-mapping

We applied PAINTOR (version 3.1) (https://bogdan.dgsom.ucla.edu/pages/paintor/)17 (a Bayesian fine-mapping approach) which uses an empirical Bayes prior to integrate functional annotation data, LD patterns and strength of association to estimate the posterior probability (PP) of a variant being causal. Variants within a 100-kb window centering on the lead variant at each of the 11 genome-wide significant loci in the meta-analysis were included. LD matrices of pairwise correlation coefficients were derived using European 1000 Genomes (Phase 3) imputed data69, excluding variants with ambiguous alleles (A/T or G/C) where reference/alternate alleles could not be reliably matched. Each variant was intersected with the following functional annotations to generate a PP of being causal (PPfunc): PhastCons elements (phastConsElements100way, updated August 5, 2015), ENCODE18 ATAC-seq replicated peaks generated from the adult liver (see above), and Roadmap E066 (adult liver) ChIP-seq narrow peaks for H3K27ac and H3K4me1 histone modifications. Variants with PPfunc > 0.8 were prioritized for further investigation.

For the loci in which PAINTOR did not identify at least one variant with a high posterior probability of being causal (PPfunc > 0.8), we then proceeded to intersect the coordinates of all ICP variants in the locus with human adult liver accessible chromatin sites (see details above) using pybedtools version 0.8.070 (https://daler.github.io/pybedtools/). This approach was taken because none of the remaining loci contained coding variants. To carry out this analysis, we first defined ICP variants as either (a) all variants within the 95% credible set defined in the ICP GWAS meta-analysis by LocusZoom, or (b) all variants in high LD (r2 > 0.8) with the meta-analysis lead variant in European (EUR) individuals. High LD variants were identified using LDlink (https://ldlink.nci.nih.gov/?tab=ldproxy).

Variant functional annotation of coding variants

ICP risk variants were first annotated in relation to their potential to disrupt protein-coding sequences. We identified four ICP-associated loci in which there was at least one coding non-synonymous risk variant. To investigate their potential pathogenicity, we first interrogated them in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/)71. For the ICP coding variants not reported as pathogenic in ClinVar, we performed further pathogenicity analysis using MetaDome23 (version 1.0.1), which is based on the concept of protein domain homology in the human genome. Briefly, homologous Pfam protein domains were aggregated into meta-domains. Then, population variation from the Exome Aggregation Consortium (ExAC) (https://exac.broadinstitute.org/) and pathogenic mutations from the Human Gene Mutation Database (HGMD)72 (http://www.hgmd.cf.ac.uk/ac/index.php) were used to create genetic tolerance profiles across human meta-domains at amino acid resolution. MetaDome genetic tolerance profiles were previously derived using 56,319 transcripts, 71,419 protein domains, 12,164,292 genetic variants from gnomAD73, and 34,076 pathogenic mutations from ClinVar71 and queried via the online portal at https://stuart.radboudumc.nl/metadome. Interrogation of the coding variant in the GCKR gene was not possible due to a lack of data availability. Finally, we retrieved the CADD scores of pathogenicity (release version 1.6, GRCh38/hg38) for all coding variants from https://cadd.gs.washington.edu/download. The pathogenicity analysis results are presented in Supplementary Data 3.

Analysis of the survey of regulatory elements data

Survey of Regulatory Elements was previously performed in HepG2 cells31. In that study, the authors interrogated 5.9 million human biallelic SNPs for their ability to confer allele-dependent regulatory activity. We retrieved this dataset (SuRE_SNP_table_LP190708.txt.gz) from https://osf.io/pjxm4/ and queried all ICP variants (see details above) at each of the 11 ICP risk loci. Data are presented as non-risk versus risk allele. P values were obtained by a two-sided Wilcoxon rank-sum test, comparing for each SNP the set of SuRE values of all fragments containing the reference allele versus the set of SuRE values of all fragments containing the alternative allele. Following the original report31, P < 0.00173121 were considered as passing a 5% false-discovery rate threshold. Data mining was performed using Python (version 3.7) and figures were plotted using seaborn74 version 0.11.2 (https://seaborn.pydata.org/).

Transcription factor binding motif analysis

The prioritized ICP-associated variants (see details above) were investigated for their potential effect on TF binding affinity using Motifbreak R version 2.2.0 (https://bioconductor.org/packages/release/bioc/html/motifbreakR.html). To query motifs, we imported MotifDb (https://bioconductor.org/packages/release/bioc/html/MotifDb.html) (version 1.30.0), which contains >4200 TF motifs, including >2800 motifs from studies in human samples and includes the databases HOCOMOCO75, HOMER76, and ENCODE-motifs18. For visual representation, motif logos were downloaded from JASPAR 202077 (http://jaspar.genereg.net/).

Analysis of human liver eQTLs

Human liver eQTL data shown in this study were retrieved from the Genotype-Tissue Expression (GTEx) Portal34 on May 12, 2021, which at the time of accession contained data for 208 liver samples. [GTEx Analysis Release version 8 (dbGaP Accession phs000424.v8.p2)]. At each ICP association signal, we queried the lead variant and, if different, the variant that was functionally prioritized. The analysis was restricted to liver eQTLs and eGenes contained in the same TAD as the lead variant (human liver TAD coordinates were retrieved from http://3dgenome.fsm.northwestern.edu/publications.html). Figures were directly downloaded from the GTEx Portal.

Analysis of adult liver 3D chromatin conformation

Gene expression and eQTL analyses were restricted to the topologically associating domains (TADs) containing the ICP-associated variants identified in our meta-analysis. Human liver TAD coordinates (GRCh38/hg38)78 were retrieved from http://3dgenome.fsm.northwestern.edu/downloads/hg38.TADs.zip.

For specific investigation of long-range chromatin interactions in liver tissue, we interrogated two complementary datasets from adult liver: Hi-C78 and capture Hi-C79. These datasets were visualized and analyzed using the 3D-genome interaction viewer and database (http://www.3div.kr/).

Visualization of human adult liver datasets

Human adult female liver (right lobe) epigenomic and transcriptomic datasets we downloaded from the ENCODE portal18 (https://www.encodeproject.org/) in bigwig format and visualized using the UCSC Genome Browser80 (GRCh38/hg38). All data corresponds to fold change over control for the shown assay. The visualized files had the following identifiers: ENCFF232QBB (ATAC-seq), ENCFF280QYJ (H3K4me3 ChIP-seq) and ENCFF853WPX (H3K27ac ChIP-seq). Adult liver 3D chromatin interaction datasets were visualized and analyzed using the 3D-genome interaction viewer and database (http://www.3div.kr/).

Tissue-specificity analysis

In order to investigate whether the ICP risk variants at SHROOM3 could be in the vicinity of or interact with liver-specific genes, we used SPECS scores81, which were pre-computed for adult liver for all Ensembl (GRCh38.v85) genes using all GTEx34 samples. In brief, SPECS is a non-parametric tissue-specificity score that is compatible with unequal sample group sizes. SPECS uses all individual data points available and enables the detection of features that are specifically present or absent in one or more tissue types. The SPECS score has been shown to outperform other tissue-specificity scores81, including z-score82 and JSD83. Data mining was performed using Python (v3.7) and figures were plotted using seaborn74 version 0.11.2 (https://seaborn.pydata.org/)18.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Supplementary information

Peer Review File (1,000.2KB, pdf)
41467_2022_29931_MOESM3_ESM.pdf (394.4KB, pdf)

Description of Additional Supplementary Files

Reporting Summary (333.2KB, pdf)

Acknowledgements

We thank NIHR BioResource volunteers for their participation, and gratefully acknowledge NIHR BioResource Centres, NHS Trusts and staff for their contribution. We thank the National Institute for Health Research, NHS Blood and Transplant, and Health Data Research UK as part of the Digital Innovation Hub Programme. P.H.D. and C.W. are supported by the National Institute for Health Research (NIHR) Biomedical Research Centre at Guy’s and St Thomas’ Foundation Trust and King’s College London. A.P.L. is supported by an NIHR Academic Clinical Lectureship. C.W. is funded by an NIHR Senior Investigator award. The views expressed are those of the authors and not necessarily those of the NHS, NIHR, or Department of Health and Social Care. I.C. is an Academy of Medical Sciences Springboard Fellow (SBF005\1050). The Scheme is supported by the British Heart Foundation, Diabetes UK, the Global Challenges Research Fund, the Government Department for Business, Energy and Industrial Strategy and the Wellcome Trust. I.C. was also supported by the Foundation of National Institutes of Health (FNIH) and the Accelerated Medicines Partnerships Type 2 Diabetes (AMP T2D) initiative (RFP16) and by the National Institute for Health Research (NIHR) Biomedical Research Centre at Imperial College Healthcare NHS Trust. We also thank the Imperial College London High-Performance Computing Service. MMYC is supported by a Kidney Research UK Clinical Training Fellowship. D.P.G. is supported by the St Peter’s Trust. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from the GTEx Portal on May 12, 2021. This research was also made possible through access to the data and findings generated by the 100,000 Genomes Project. The 100,000 Genomes Project is managed by Genomics England Limited (a wholly owned company of the Department of Health and Social Care). The 100,000 Genomes Project is funded by the National Institute for Health Research and NHS England. The Wellcome Trust, Cancer Research UK, and the Medical Research Council have also funded research infrastructure. The 100,000 Genomes Project uses data provided by patients and collected by the National Health Service as part of their care and support. The authors gratefully acknowledge the participation of the patients and their families recruited to the 100,000 Genomes Project. We want to acknowledge the participants and investigators of the FinnGen study.

Source data

Source Data (13.8KB, xlsx)

Author contributions

Study conception and design: P.H.D., A.P.L., I.C., M.C., D.P.G., and C.W. Laboratory work: A.S.A., A.A., A.L.M., and J.Z. Bioinformatics: P.H.D., A.P.L., I.C., M.C., M.M., D.P.G., and H.M. Patient recruitment and phenotyping: J.C., A.S., J.D., S.C., M.G., K.N., M.T., W.M.H., M.C.E., and H.U.M. All authors critically reviewed the manuscript and approved the final version.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Data availability

The genome-wide summary statistics for the final combined meta-analysis carried out in this study have been deposited in the GWAS Catalog under accession code GCP000309, available at https://www.ebi.ac.uk/gwas/studies/GCST90095084. The lead variants’ GWAS and GWAS meta-analysis summary statistics generated in this study are provided in Supplementary Data 1. Genotype and phenotype data from the NIHR-RD participants are available from several sources. 4,835 of the NIHR-RD participants were part of the 100,000 Genomes Project—Rare Diseases Pilot. These data can be accessed by application to Genomics England Limited following the procedure outlined at https://www.genomicsengland.co.uk/about-gecip/joining-researchcommunity/. The genotype and phenotype data from the remaining 7348 NIHR-RD participants can be accessed by application to the NIHR BioResource Data Access Committee at dac@bioresource.nihr.ac.uk. Subject to ethical consent, the genotype data of a subset of 6,939 NIHR-RD participants are also available from the European Genome-phenome Archive (EGA) at the EMBL European Bioinformatics Institute [https://ega-archive.org/dacs/EGAC00001000259]. This includes data from 305 ICP cases [https://ega-archive.org/datasets/EGAD00001004515]. Genomic and phenotype data from the 100KGP participants can be accessed by application to Genomics England Limited following the procedure outlined at https://www.genomicsengland.co.uk/about-gecip/joining-researchcommunity/. Genotype data for the 764 UK Biobank samples are available through the UK Biobank [https://www.ukbiobank.ac.uk/]. The FinnGenn GWAS summary statistics are publicly accessible following registration [https://www.finngen.fi/en/access_results]. The adult human liver replicated accessible regions generated in this study are provided in Supplementary Data 5. The variant pathogenicity data used in this study are available in the ClinVar database [https://www.ncbi.nlm.nih.gov/clinvar/] and through MetaDome v1.0.1 [https://stuart.radboudumc.nl/metadome]. The CADD scores (v1.6) used in this study are available on the CADD website [https://cadd.gs.washington.edu/download]. The liver epigenomic and transcriptomic data used in this study are available in the ENCODE portal18 [https://www.encodeproject.org/]. The SuRE data31 used in this study are available in the OFS data repository [https://osf.io/pjxm4/]. The transcription factor motif logos used in this study are available in the JASPAR database [http://jaspar.genereg.net/]. The eQTL data used in this study are available in the Genotype-Tissue Expression (GTEx) Portal34 under dbGaP accession phs000424.v8.p2 [https://gtexportal.org/home/]. The tissue-specificity SPECS81 scores for liver tissue used in this study are available in the SPECS web browser [https://specs.cmgg.be]. The adult liver TAD coordinates used in this study are available in the 3D Genome Browser [http://3dgenome.fsm.northwestern.edu/index.html].

Code availability

Custom code used in this study is available upon request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Peter H. Dixon, Adam P. Levine, Inês Cebola, Daniel P. Gale, Catherine Williamson.

Lists of authors and their affiliations appear at the end of the paper.

Change history

10/12/2022

In the original version of this article, the given and family names of Anton Vonk Noordegraaf were incorrectly structured. The name was displayed correctly in all versions at the time of publication. The original article has been corrected.

Contributor Information

Catherine Williamson, Email: catherine.williamson@kcl.ac.uk.

NIHR BioResource:

Julian Adlard, Munaza Ahmed, Tim Aitman, Hana Alachkar, David Allsup, Jeff Almeida-King, Philip Ancliff, Richard Antrobus, Ruth Armstrong, Gavin Arno, Sofie Ashford, William Astle, Anthony Attwood, Chris Babbs, Tamam Bakchoul, Tadbir Bariana, Julian Barwell, David Bennett, David Bentley, Agnieszka Bierzynska, Tina Biss, Marta Bleda, Harm Bogaard, Christian Bourne, Sara Boyce, John Bradley, Gerome Breen, Paul Brennan, Carole Brewer, Matthew Brown, Michael Browning, Rachel Buchan, Matthew Buckland, Teofila Bueser, Siobhan Burns, Oliver Burren, Paul Calleja, Gerald Carr-White, Keren Carss, Ruth Casey, Mark Caulfield, John Chambers, Jennifer Chambers, Floria Cheng, Patrick F. Chinnery, Martin Christian, Colin Church, Naomi Clements Brod, Gerry Coghlan, Elizabeth Colby, Trevor Cole, Janine Collins, Peter Collins, Camilla Colombo, Robin Condliffe, Stuart Cook, Terry Cook, Nichola Cooper, Paul Corris, Abigail Crisp-Hihn, Nicola Curry, Cesare Danesino, Matthew Daniels, Louise Daugherty, John Davis, Sri V. V. Deevi, Timothy Dent, Eleanor Dewhurst, Peter Dixon, Kate Downes, Anna Drazyk, Elizabeth Drewe, Tina Dutt, David Edgar, Karen Edwards, William Egner, Wendy Erber, Marie Erwood, Maria C. Estiu, Gillian Evans, Dafydd Gareth Evans, Tamara Everington, Mélanie Eyries, Remi Favier, Debra Fletcher, James Fox, Amy Frary, Courtney French, Kathleen Freson, Mattia Frontini, Daniel Gale, Henning Gall, Claire Geoghegan, Terry Gerighty, Stefano Ghio, Hossein-Ardeschir Ghofrani, Simon Gibbs, Kimberley Gilmour, Barbara Girerd, Sarah Goddard, Keith Gomez, Pavels Gordins, David Gosal, Stefan Gräf, Luigi Grassi, Daniel Greene, Lynn Greenhalgh, Andreas Greinacher, Paolo Gresele, Philip Griffiths, Sofia Grigoriadou, Russell Grocock, Detelina Grozeva, Scott Hackett, Charaka Hadinnapola, William Hague, Matthias Haimel, Matthew Hall, Helen Hanson, Kirsty Harkness, Andrew Harper, Claire Harris, Daniel Hart, Ahamad Hassan, Grant Hayman, Alex Henderson, Jonathan Hoffmann, Rita Horvath, Arjan Houweling, Luke Howard, Fengyuan Hu, Gavin Hudson, Joseph Hughes, Aarnoud Huissoon, Marc Humbert, Sean Humphray, Sarah Hunter, Matthew Hurles, Louise Izatt, Roger James, Sally Johnson, Stephen Jolles, Jennifer Jolley, Neringa Jurkute, Mary Kasanicki, Hanadi Kazkaz, Rashid Kazmi, Peter Kelleher, David Kiely, Nathalie Kingston, Robert Klima, Myrto Kostadima, Gabor Kovacs, Ania Koziell, Roman Kreuzhuber, Taco Kuijpers, Ajith Kumar, Dinakantha Kumararatne, Manju Kuria, Michael Laffa, Fiona Lalloo, Michele Lamber, Hana Lango Alle, Allan Lawrie, Mark Layton, Claire Lentaigne, Adam Levine, Rachel Linger, Hilary Longhurst, Eleni Louka, Robert MacKenzie Ross, Bella Madan, Eamonn Maher, Jesmeen Maimaris, Sarah Mangles, Rutendo Mapeta, Kevin Marchbank, Stephen Marks, Hugh S. Markus, Andrew Marshall, Jennifer Martin, Mary Mathias, Emma Matthews, Heather Maxwell, Paul McAlinden, Mark McCarthy, Stuart Meacham, Adam Mead, Karyn Megy, Sarju Mehta, Michel Michaelides, Carolyn Millar, Shahin Moledina, David Montani, Tony Moor, Nicholas Morrell, Keith Muir, Andrew Mumford, Michael Newnham, Jennifer O’Sullivan, Samya Obaji, Steven Okoli, Andrea Olschewski, Horst Olschewski, Kai Ren Ong, Elizabeth Ormondroy, Willem Ouwehan, Sofia Papadi, Soo-Mi Park, David Parry, Joan Paterson, Andrew Peacock, John Peden, Kathelijne Peerlinck, Christopher Penkett, Joanna Pepke-Zaba, Romina Petersen, Angela Pyle, Stuart Rankin, Anupama Rao, F. Lucy Raymond, Paula Rayner-Matthew, Christine Rees, Augusto Rendon, Tara Renton, Andrew Rice, Sylvia Richardson, Alex Richter, Irene Roberts, Catherine Roughley, Noemi Roy, Omid Sadeghi-Alavijeh, Moin Saleem, Nilesh Samani, Alba Sanchis-Juan, Ravishankar Sargur, Simon Satchell, Sinisa Savic, Laura Scelsi, Sol Schulman, Marie Scully, Claire Searle, Werner Seeger, Carrock Sewell, Denis Seyres, Susie Shapiro, Olga Sharmardina, Rakefet Shtoyerman, Keith Sibson, Lucy Side, Ilenia Simeoni, Michael Simpson, Suthesh Sivapalaratnam, Anne-Bine Skytte, Katherine Smith, Kenneth G. C. Smith, Katie Snape, Florent Soubrier, Simon Staines, Emily Staples, Hannah Stark, Jonathan Stephens, Kathleen Stirrups, Sophie Stock, Jay Suntharalingam, Emilia Swietlik, R. Campbell Tait, Kate Talks, Rhea Tan, James Thaventhiran, Andreas Themistocleous, Moira Thomas, Kate Thomson, Adrian Thrasher, Chantal Thys, Marc Tischkowitz, Catherine Titterton, Cheng-Hock Toh, Mark Toshner, Matthew Traylor, Carmen Treacy, Richard Trembath, Salih Tuna, Wojciech Turek, Ernest Turro, Tom Vale, Chris Van Geet, Natalie Van Zuydam, Marta Vazquez-Lopez, Julie von Ziegenweidt, Anton Vonk Noordegraaf, Quintin Waisfisz, Suellen Walker, James Ware, Hugh Watkins, Christopher Watt, Andrew Webster, Wei Wei, Steven Welch, Julie Wessels, Sarah Westbury, John-Paul Westwood, John Wharton, Deborah Whitehorn, James Whitworth, Martin R. Wilkins, Edwin Wong, Nicholas Wood, Yvette Wood, Geoff Woods, Emma Woodward, Stephen Wort, Austen Worth, Katherine Yates, Patrick Yong, Tim Young, Ping Yu, and Patrick Yu-Wai-Man

Genomics England Research Consortium Collaborators:

J. C. Ambrose, P. Arumugam, R. Bevers, M. Bleda, F. Boardman-Pretty, C. R. Boustred, H. Brittain, M. A. Brown, M. J. Caulfield, G. C. Chan, T. Fowler, A. Giess, A. Hamblin, S. Henderson, T. J. P. Hubbard, R. Jackson, L. J. Jones, D. Kasperaviciute, M. Kayikci, A. Kousathanas, L. Lahnstein, S. E. A. Leigh, I. U. S. Leong, F. J. Lopez, F. Maleady-Crowe, M. McEntagart, F. Minneci, L. Moutsianas, M. Mueller, N. Murugaesu, A. C. Need, P. O’Donovan, C. A. Odhams, C. Patch, D. Perez-Gil, M. B. Pereira, J. Pullinger, T. Rahim, A. Rendon, T. Rogers, K. Savage, K. Sawant, R. H. Scott, A. Siddiq, A. Sieghart, S. C. Smith, A. Sosinsky, A. Stuckey, M. Tanguy, A. L. Taylor Tavares, E. R. A. Thomas, S. R. Thompson, A. Tucci, M. J. Welland, E. Williams, K. Witkowska, and S. M. Wood

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-022-29931-z.

References

  • 1.Ovadia C, Williamson C. Intrahepatic cholestasis of pregnancy: recent advances. Clin. Dermatol. 2016;34:327–334. doi: 10.1016/j.clindermatol.2016.02.004. [DOI] [PubMed] [Google Scholar]
  • 2.Glantz A, Marschall HU, Mattsson LA. Intrahepatic cholestasis of pregnancy: relationships between bile acid levels and fetal complication rates. Hepatology. 2004;40:467–474. doi: 10.1002/hep.20336. [DOI] [PubMed] [Google Scholar]
  • 3.Geenes V, et al. Association of severe intrahepatic cholestasis of pregnancy with adverse pregnancy outcomes: a prospective population-based case-control study. Hepatology. 2014;59:1482–1491. doi: 10.1002/hep.26617. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Ovadia C, et al. Association of adverse perinatal outcomes of intrahepatic cholestasis of pregnancy with biochemical markers: results of aggregate and individual patient data meta-analyses. Lancet. 2019;393:899–909. doi: 10.1016/S0140-6736(18)31877-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Reyes H, et al. Selenium, zinc and copper plasma levels in intrahepatic cholestasis of pregnancy, in normal pregnancies and in healthy individuals, in Chile. J. Hepatol. 2000;32:542–549. doi: 10.1016/S0168-8278(00)80214-7. [DOI] [PubMed] [Google Scholar]
  • 6.Abu-Hayyeh S, et al. Intrahepatic cholestasis of pregnancy levels of sulfated progesterone metabolites inhibit farnesoid X receptor resulting in a cholestatic phenotype. Hepatology. 2013;57:716–726. doi: 10.1002/hep.26055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Abu-Hayyeh S, et al. Prognostic and mechanistic potential of progesterone sulfates in intrahepatic cholestasis of pregnancy and pruritus gravidarum. Hepatology. 2016;63:1287–1298. doi: 10.1002/hep.28265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Turro E, et al. Whole-genome sequencing of patients with rare diseases in a national health system. Nature. 2020;583:96–102. doi: 10.1038/s41586-020-2434-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Dixon PH, et al. An expanded role for heterozygous mutations of ABCB4, ABCB11, ATP8B1, ABCC2 and TJP2 in intrahepatic cholestasis of pregnancy. Sci. Rep. 2017;7:11823. doi: 10.1038/s41598-017-11626-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Dixon PH, et al. A comprehensive analysis of common genetic variation around six candidate loci for intrahepatic cholestasis of pregnancy. Am. J. Gastroenterol. 2014;109:76–84. doi: 10.1038/ajg.2013.406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Bull LN, et al. Intrahepatic cholestasis of pregnancy (ICP) in U.S. Latinas and Chileans: clinical features, ancestry analysis, and admixture mapping. PLoS ONE. 2015;10:e0131211. doi: 10.1371/journal.pone.0131211. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Turnbull C, et al. The 100 000 Genomes Project: bringing whole genome sequencing to the NHS. BMJ. 2018;361:k1687. doi: 10.1136/bmj.k1687. [DOI] [PubMed] [Google Scholar]
  • 13.Zhou W, et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 2018;50:1335–1341. doi: 10.1038/s41588-018-0184-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Yang J, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 2012;44:369–375. doi: 10.1038/ng.2213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Chang CC, et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7. doi: 10.1186/s13742-015-0047-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Kichaev G, et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 2014;10:e1004722. doi: 10.1371/journal.pgen.1004722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Davis CA, et al. The encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 2018;46:D794–D801. doi: 10.1093/nar/gkx1081. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Kircher M, et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 2014;46:310–315. doi: 10.1038/ng.2892. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Greene CM, et al. alpha1-Antitrypsin deficiency. Nat. Rev. Dis. Primers. 2016;2:16051. doi: 10.1038/nrdp.2016.51. [DOI] [PubMed] [Google Scholar]
  • 21.Boelle PY, Debray D, Guillot L, Corvol H, French CFMGSI. SERPINA1 Z allele is associated with cystic fibrosis liver disease. Genet. Med. 2019;21:2151–2155. doi: 10.1038/s41436-019-0449-6. [DOI] [PubMed] [Google Scholar]
  • 22.Hattersley AT, Patel KA. Precision diabetes: learning from monogenic diabetes. Diabetologia. 2017;60:769–777. doi: 10.1007/s00125-017-4226-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Wiel L, et al. MetaDome: pathogenicity analysis of genetic variants through aggregation of homologous human protein domains. Hum. Mutat. 2019;40:1030–1038. doi: 10.1002/humu.23798. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Orho-Melander M, et al. Common missense variant in the glucokinase regulatory protein gene is associated with increased plasma triglyceride and C-reactive protein but lower fasting glucose concentrations. Diabetes. 2008;57:3112–3121. doi: 10.2337/db08-0516. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Vaxillaire M, et al. The common P446L polymorphism in GCKR inversely modulates fasting glucose and triglyceride levels and reduces type 2 diabetes risk in the DESIR prospective general French population. Diabetes. 2008;57:2253–2257. doi: 10.2337/db07-1807. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Li, J. et al. Contribution of Rs780094 and Rs1260326 polymorphisms in GCKR gene to nonalcoholic fatty liver disease: a meta-analysis involving 26,552 participants. Endocr Metab Immune Disord Drug Targets21, 1696–1708 (2020). [DOI] [PubMed]
  • 27.Beer NL, et al. The P446L variant in GCKR associated with fasting plasma glucose and triglyceride levels exerts its effect through increased glucokinase activity in liver. Hum. Mol. Genet. 2009;18:4081–4088. doi: 10.1093/hmg/ddp357. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Maurano MT, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337:1190–1195. doi: 10.1126/science.1222794. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Roadmap Epigenomics C, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Patel SB, Graf GA, Temel RE. ABCG5 and ABCG8: more than a defense against xenosterols. J. Lipid Res. 2018;59:1103–1113. doi: 10.1194/jlr.R084244. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.van Arensbergen J, et al. High-throughput identification of human SNPs affecting regulatory element activity. Nat. Genet. 2019;51:1160–1169. doi: 10.1038/s41588-019-0455-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Sane S, et al. UBXN2A enhances CHIP-mediated proteasomal degradation of oncoprotein mortalin-2 in cancer cells. Mol. Oncol. 2018;12:1753–1777. doi: 10.1002/1878-0261.12372. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Chiang, J. Y. Bile acids: regulation of synthesis. J. Lipid Res. 50, 1955–1966 (2009). [DOI] [PMC free article] [PubMed]
  • 34.Consortium GT. The genotype-tissue expression (GTEx) project. Nat. Genet. 2013;45:580–585. doi: 10.1038/ng.2653. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Wang D, Hartmann K, Seweryn M, Sadee W. Interactions between regulatory variants in CYP7A1 (cholesterol 7alpha-hydroxylase) promoter and enhancer regions regulate CYP7A1 expression. Circ. Genom. Precis. Med. 2018;11:e002082. doi: 10.1161/CIRCGEN.118.002082. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Huang J, et al. Kinetic analysis of bile acid sulfation by stably expressed human sulfotransferase 2A1 (SULT2A1) Xenobiotica. 2010;40:184–194. doi: 10.3109/00498250903514607. [DOI] [PubMed] [Google Scholar]
  • 37.Margaryan H, et al. Characterization and possible function of glyceraldehyde-3-phosphate dehydrogenase-spermatogenic protein GAPDHS in mammalian sperm. Reprod. Biol. Endocrinol. 2015;13:15. doi: 10.1186/s12958-015-0008-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Uhlenhaut NH, et al. Insights into negative regulation by the glucocorticoid receptor from genome-wide profiling of inflammatory cistromes. Mol. Cell. 2013;49:158–171. doi: 10.1016/j.molcel.2012.10.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Oakley RH, Cidlowski JA. The biology of the glucocorticoid receptor: new signaling mechanisms in health and disease. J. Allergy Clin. Immunol. 2013;132:1033–1044. doi: 10.1016/j.jaci.2013.09.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Christodoulou, A. et al. TMEM147 interacts with lamin B receptor, regulates its localization and levels, and affects cholesterol homeostasis. J. Cell Sci.133, jcs245357 (2020). [DOI] [PubMed]
  • 41.Duan RD. Alkaline sphingomyelinase: an old enzyme with novel implications. Biochim Biophys Acta. 2006;1761:281–291. doi: 10.1016/j.bbalip.2006.03.007. [DOI] [PubMed] [Google Scholar]
  • 42.Zhang P, et al. Alkaline sphingomyelinase (NPP7) promotes cholesterol absorption by affecting sphingomyelin levels in the gut: a study with NPP7 knockout mice. Am. J. Physiol. Gastrointest Liver Physiol. 2014;306:G903–G908. doi: 10.1152/ajpgi.00319.2013. [DOI] [PubMed] [Google Scholar]
  • 43.Dixon JR, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380. doi: 10.1038/nature11082. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Heybrock S, et al. Lysosomal integral membrane protein-2 (LIMP-2/SCARB2) is involved in lysosomal cholesterol export. Nat. Commun. 2019;10:3521. doi: 10.1038/s41467-019-11425-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Buniello A, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–D1012. doi: 10.1093/nar/gky1120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Martineau MG, et al. The metabolic profile of intrahepatic cholestasis of pregnancy is associated with impaired glucose tolerance, dyslipidemia, and increased fetal growth. Diabetes Care. 2015;38:243–248. doi: 10.2337/dc14-2143. [DOI] [PubMed] [Google Scholar]
  • 47.Dann AT, et al. Plasma lipid profiles of women with intrahepatic cholestasis of pregnancy. Obstet. Gynecol. 2006;107:106–114. doi: 10.1097/01.AOG.0000189096.94874.9c. [DOI] [PubMed] [Google Scholar]
  • 48.Borges Manna L, et al. Ursodeoxycholic acid improves feto-placental and offspring metabolic outcomes in hypercholanemic pregnancy. Sci. Rep. 2020;10:10361. doi: 10.1038/s41598-020-67301-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Floreani A, Gervasi MT. New insights on intrahepatic cholestasis of pregnancy. Clin. Liver Dis. 2016;20:177–189. doi: 10.1016/j.cld.2015.08.010. [DOI] [PubMed] [Google Scholar]
  • 50.Buch S, et al. A genome-wide association scan identifies the hepatic cholesterol transporter ABCG8 as a susceptibility factor for human gallstone disease. Nat. Genet. 2007;39:995–999. doi: 10.1038/ng2101. [DOI] [PubMed] [Google Scholar]
  • 51.Gellert-Kristensen H, et al. Identification and replication of six loci associated with gallstone disease. Hepatology. 2019;70:597–609. doi: 10.1002/hep.30313. [DOI] [PubMed] [Google Scholar]
  • 52.Teslovich TM, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010;466:707–713. doi: 10.1038/nature09270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Joshi AD, et al. Four susceptibility loci for gallstone disease identified in a meta-analysis of genome-wide association studies. Gastroenterology. 2016;151:351–363 e28. doi: 10.1053/j.gastro.2016.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Jacquemin E, Cresteil D, Manouvrier S, Boute O, Hadchouel M. Heterozygous non-sense mutation of the MDR3 gene in familial intrahepatic cholestasis of pregnancy. Lancet. 1999;353:210–211. doi: 10.1016/S0140-6736(05)77221-4. [DOI] [PubMed] [Google Scholar]
  • 55.Dixon PH, et al. Heterozygous MDR3 missense mutation associated with intrahepatic cholestasis of pregnancy: evidence for a defect in protein trafficking. Hum. Mol. Genet. 2000;9:1209–1217. doi: 10.1093/hmg/9.8.1209. [DOI] [PubMed] [Google Scholar]
  • 56.Ovadia C, et al. Ursodeoxycholic acid enriches intestinal bile salt hydrolase-expressing Bacteroidetes in cholestatic pregnancy. Sci. Rep. 2020;10:3895. doi: 10.1038/s41598-020-60821-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Ferkingstad E, et al. Genome-wide association meta-analysis yields 20 loci associated with gallstone disease. Nat. Commun. 2018;9:5101. doi: 10.1038/s41467-018-07460-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Bull LN, Thompson RJ. Progressive familial intrahepatic cholestasis. Clin. Liver Dis. 2018;22:657–669. doi: 10.1016/j.cld.2018.06.003. [DOI] [PubMed] [Google Scholar]
  • 59.Zhao H, et al. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics. 2014;30:1006–1007. doi: 10.1093/bioinformatics/btt730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Turner S. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. J. Open Source Softw. 2018;3:1731. doi: 10.21105/joss.00731. [DOI] [Google Scholar]
  • 61.Boughton, A. P. et al. LocusZoom.js: Interactive and embeddable visualization of genetic association study results. Bioinformatics37, 3017–3018 (2021). [DOI] [PMC free article] [PubMed]
  • 62.Moore CM, Jacobson SA, Fingerlin TE. Power and sample size calculations for genetic association studies in the presence of genetic model misspecification. Hum. Hered. 2019;84:256–271. doi: 10.1159/000508558. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Jostins L, Levine AP, Barrett JC. Using genetic prediction from known complex disease Loci to guide the design of next-generation sequencing experiments. PLoS ONE. 2013;8:e76328. doi: 10.1371/journal.pone.0076328. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Myers TA, Chanock SJ, Machiela MJ. LDlinkR: an R package for rapidly calculating linkage disequilibrium statistics in diverse populations. Front. Genet. 2020;11:157. doi: 10.3389/fgene.2020.00157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.McLaren W, et al. The Ensembl variant effect predictor. Genome Biol. 2016;17:122. doi: 10.1186/s13059-016-0974-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 2017;8:1826. doi: 10.1038/s41467-017-01261-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Liberzon A, et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Lin SH, Brown DW, Machiela MJ. LDtrait: an online tool for identifying published phenotype associations in linkage disequilibrium. Cancer Res. 2020;80:3443–3446. doi: 10.1158/0008-5472.CAN-20-0985. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Genomes Project C, et al. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Dale RK, Pedersen BS, Quinlan AR. Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics. 2011;27:3423–3424. doi: 10.1093/bioinformatics/btr539. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Landrum MJ, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018;46:D1062–D1067. doi: 10.1093/nar/gkx1153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Stenson PD, et al. Human Gene Mutation Database (HGMD): 2003 update. Hum. Mutat. 2003;21:577–581. doi: 10.1002/humu.10212. [DOI] [PubMed] [Google Scholar]
  • 73.Karczewski KJ, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–443. doi: 10.1038/s41586-020-2308-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Waskom, M. L. seaborn: statistical data visualization. Journal of Open Source Software6, 3021 (2021).
  • 75.Kulakovskiy IV, et al. HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Res. 2013;41:D195–D202. doi: 10.1093/nar/gks1089. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Heinz S, et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Fornes, O. et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res.48, 87–92 (2020). [DOI] [PMC free article] [PubMed]
  • 78.Leung D, et al. Integrative analysis of haplotype-resolved epigenomes across human tissues. Nature. 2015;518:350–354. doi: 10.1038/nature14217. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Jung I, et al. A compendium of promoter-centered long-range chromatin interactions in the human genome. Nat. Genet. 2019;51:1442–1449. doi: 10.1038/s41588-019-0494-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Kent WJ, et al. The human genome browser at UCSC. Genome Res. 2002;12:996–1006. doi: 10.1101/gr.229102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Everaert C, Volders PJ, Morlion A, Thas O, Mestdagh P. SPECS: a non-parametric method to identify tissue-specific molecular features for unbalanced sample groups. BMC Bioinform. 2020;21:58. doi: 10.1186/s12859-020-3407-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Vandenbon A, Nakai K. Modeling tissue-specific structural patterns in human and mouse promoters. Nucleic Acids Res. 2010;38:17–25. doi: 10.1093/nar/gkp866. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Cabili MN, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011;25:1915–1927. doi: 10.1101/gad.17446611. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Rentzsch P, Schubach M, Shendure J, Kircher M. CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 2021;13:31. doi: 10.1186/s13073-021-00835-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Ongen H, Buil A, Brown AA, Dermitzakis ET, Delaneau O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics. 2016;32:1479–1485. doi: 10.1093/bioinformatics/btv722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Pruim RJ, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics. 2010;26:2336–2337. doi: 10.1093/bioinformatics/btq419. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Surjit M, et al. Widespread negative response elements mediate direct repression by agonist-liganded glucocorticoid receptor. Cell. 2011;145:224–241. doi: 10.1016/j.cell.2011.03.027. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Peer Review File (1,000.2KB, pdf)
41467_2022_29931_MOESM3_ESM.pdf (394.4KB, pdf)

Description of Additional Supplementary Files

Reporting Summary (333.2KB, pdf)

Data Availability Statement

The genome-wide summary statistics for the final combined meta-analysis carried out in this study have been deposited in the GWAS Catalog under accession code GCP000309, available at https://www.ebi.ac.uk/gwas/studies/GCST90095084. The lead variants’ GWAS and GWAS meta-analysis summary statistics generated in this study are provided in Supplementary Data 1. Genotype and phenotype data from the NIHR-RD participants are available from several sources. 4,835 of the NIHR-RD participants were part of the 100,000 Genomes Project—Rare Diseases Pilot. These data can be accessed by application to Genomics England Limited following the procedure outlined at https://www.genomicsengland.co.uk/about-gecip/joining-researchcommunity/. The genotype and phenotype data from the remaining 7348 NIHR-RD participants can be accessed by application to the NIHR BioResource Data Access Committee at dac@bioresource.nihr.ac.uk. Subject to ethical consent, the genotype data of a subset of 6,939 NIHR-RD participants are also available from the European Genome-phenome Archive (EGA) at the EMBL European Bioinformatics Institute [https://ega-archive.org/dacs/EGAC00001000259]. This includes data from 305 ICP cases [https://ega-archive.org/datasets/EGAD00001004515]. Genomic and phenotype data from the 100KGP participants can be accessed by application to Genomics England Limited following the procedure outlined at https://www.genomicsengland.co.uk/about-gecip/joining-researchcommunity/. Genotype data for the 764 UK Biobank samples are available through the UK Biobank [https://www.ukbiobank.ac.uk/]. The FinnGenn GWAS summary statistics are publicly accessible following registration [https://www.finngen.fi/en/access_results]. The adult human liver replicated accessible regions generated in this study are provided in Supplementary Data 5. The variant pathogenicity data used in this study are available in the ClinVar database [https://www.ncbi.nlm.nih.gov/clinvar/] and through MetaDome v1.0.1 [https://stuart.radboudumc.nl/metadome]. The CADD scores (v1.6) used in this study are available on the CADD website [https://cadd.gs.washington.edu/download]. The liver epigenomic and transcriptomic data used in this study are available in the ENCODE portal18 [https://www.encodeproject.org/]. The SuRE data31 used in this study are available in the OFS data repository [https://osf.io/pjxm4/]. The transcription factor motif logos used in this study are available in the JASPAR database [http://jaspar.genereg.net/]. The eQTL data used in this study are available in the Genotype-Tissue Expression (GTEx) Portal34 under dbGaP accession phs000424.v8.p2 [https://gtexportal.org/home/]. The tissue-specificity SPECS81 scores for liver tissue used in this study are available in the SPECS web browser [https://specs.cmgg.be]. The adult liver TAD coordinates used in this study are available in the 3D Genome Browser [http://3dgenome.fsm.northwestern.edu/index.html].

Custom code used in this study is available upon request.


Articles from Nature Communications are provided here courtesy of Nature Publishing Group

RESOURCES