Author manuscript; available in PMC: 2025 Jul 1.
Published in final edited form as: Arthritis Rheumatol. 2024 Mar 26;76(7):1071–1084. doi: 10.1002/art.42829

A multilayered post-GWAS analysis pipeline defines functional variants and target genes for systemic lupus erythematosus (SLE)

Mehdi Fazel-Najafabadi 1,*, Loren L Looger 2,3,*, Harikrishna Reddy Rallabandi 1, Swapan K Nath 1
PMCID: PMC11213670  NIHMSID: NIHMS1971575  PMID: 38369936

Abstract

Objectives:

Systemic lupus erythematosus (SLE), an autoimmune disease with incompletely understood etiology, has a strong genetic component. Although genome-wide association studies (GWAS) have revealed multiple SLE susceptibility loci and associated single nucleotide polymorphisms (SNPs), the precise causal variants, target genes, cell types, tissues, and mechanisms of action remain largely unknown.

Methods:

Here, we report a comprehensive post-GWAS analysis using extensive bioinformatics, molecular modeling, and integrative functional genomic and epigenomic analyses to optimize fine-mapping. We compile and cross-reference immune cell-specific expression quantitative trait loci (cis- and trans-eQTLs) with promoter-capture Hi-C, allele-specific chromatin accessibility, and massively parallel reporter assay data to define predisposing variants and target genes. We experimentally validate a predicted locus using CRISPR/Cas9 genome editing, qPCR, and Western blot.

Results:

Anchoring on 452 index SNPs, we selected 9,931 high-linkage disequilibrium (r2>0.8) SNPs and defined 182 independent non-HLA SLE loci. The 3,746 SNPs from 143 loci were identified as regulating 564 unique genes. Target genes are enriched in lupus-related tissues and associated with other autoimmune diseases. Of these, 329 SNPs (106 loci) showed significant allele-specific chromatin accessibility and/or enhancer activity, indicating regulatory potential. Using CRISPR/Cas9, we validated rs57668933 as a functional variant regulating multiple targets, including SLE risk gene ELF1, in B-cells.

Conclusion:

We demonstrate and validate post-GWAS strategies for utilizing multi-dimensional data to prioritize likely causal variants with cognate gene targets underlying SLE pathogenesis. Our results provide a catalog of significantly SLE-associated SNPs and loci, target genes, and likely biochemical mechanisms, to guide experimental characterization.

INTRODUCTION

Systemic lupus erythematosus (SLE, lupus) is a complex autoimmune disease with substantial genetic underpinnings, e.g., strong familial aggregation(1), large twin concordance (monozygotic>dizygotic)(2), and high sibling recurrence risk ratio (λs~30)(3). Upwards of 50 candidate gene studies and genome-wide association studies (GWAS) have identified >100 SLE risk loci (p-value<5 × 10^−8) across multiple ethnicities(4–7). However, these loci explain only ~30% of SLE heritability (h2)(5, 8).

In addition to incomplete knowledge of the precise risk loci and alleles underlying GWAS peaks, determining how such alleles mechanistically contribute to disease remains notoriously challenging. For a given locus, GWAS often reports a sole (“index”) single nucleotide polymorphism (SNP), which may or may not itself be functional, but is likely in linkage disequilibrium (LD) with disease-predisposing SNPs(9). As in other complex diseases, >90% of reported SLE index SNPs are non-coding (intronic and intergenic).

Post-GWAS analyses face the challenge of accurately determining associations in clinical genomics. Combining multiple GWAS signals using LD structure and epigenetic data is common in post-GWAS analyses(10, 11). Integrating additional data sources and annotations assists in prioritizing potentially functional SNPs(12). SNPs can influence gene regulation by modulating transcription factor binding and chromatin structure, often revealed through cis- and trans-expression quantitative trait locus (eQTL) analyses(13). Together, annotating open, active chromatin from DNase I hypersensitivity and Assay for Transposase-Accessible Chromatin (ATAC-seq) peaks(14), alongside individual genomic regulatory elements (promoters, enhancers, silencers, etc.) using histone marks, chromatin modifiers, and transcription factors, and by in silico bioinformatics(15), yields a powerful framework for testing GWAS hypotheses.

We collated information from multiple databases of histone marks(16, 17), chromatin accessibility(16), chromatin conformation(17, 18), chromatin accessibility QTLs (caQTLs), and Massively Parallel Reporter Assays (MPRA)(19) and cross-referenced them to predict locus/SNP functionality. When possible, data-derived annotations were taken solely from immune cells(15, 20–23), identifying associations relevant to pathogenesis. This approach has revealed target genes and cells associated with rheumatoid arthritis(24) and breast cancer(25), among others.

As a proof of concept, we employed CRISPR-Cas9 genome editing, qPCR, and Western blot techniques to validate the allelic effects of a candidate SNP on the SLE risk gene ELF1. Our study provides a comprehensive and integrated approach to reassessing SLE-GWAS association signals, shedding light on potential functional variants and their roles in immune cell regulation.

MATERIALS AND METHODS

Study design.

Our workflow and study design are shown in Figure 1. We 1) collated, from qualified studies, all reported and replicated index and correlated SNPs to define statistically-independent SLE susceptibility loci, 2) predicted SNP effects in statistically-independent loci and annotated them in regulatory tiers, 3) performed molecular modeling on missense SNPs, 4) leveraged cell type-specific cis- and trans-eQTLs and promoter-capture Hi-C (PCHiC) data to define “enhancer” and “promoter” SNPs and target genes, 5) estimated locus overrepresentation in molecular pathways and gene ontology categories and identified cell type-specific SNP enrichment in epigenetic features, 6) used caQTLs and MPRA data to identify allele-specific effects, and 7) experimentally validated a functional variant using CRISPR/Cas9-based activation/silencing in B-cells.

Figure 1.


Study framework and summary. Tier1: PCHiC and QTL with at least one same target gene, Tier2: PCHiC and QTL with different targets, Tier3a: Only PCHiC target, Tier3b: Only QTL target, Tier4: No targets.
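The tier definitions in the Figure 1 caption can be expressed as a small decision rule. The following is an illustrative sketch only, not the code used in this study; the function name `assign_tier` and the gene labels are hypothetical.

```python
def assign_tier(eqtl_targets, pchic_targets):
    """Return the tier for one SNP from its eQTL and PCHiC target-gene sets."""
    eqtl = set(eqtl_targets)
    pchic = set(pchic_targets)
    if eqtl and pchic:
        # Both data types present: Tier1 if they share at least one gene.
        return "Tier1" if eqtl & pchic else "Tier2"
    if pchic:
        return "Tier3a"  # only a PCHiC target
    if eqtl:
        return "Tier3b"  # only a QTL target
    return "Tier4"       # no targets


# Hypothetical target sets, for illustration only.
print(assign_tier({"GENE_A", "GENE_B"}, {"GENE_B"}))  # shared target
print(assign_tier({"GENE_A"}, {"GENE_C"}))            # different targets
```

Working with sets makes the "at least one same target gene" criterion a simple intersection test.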

Collating index SNPs.

We thoroughly reviewed SLE association studies up to September 2021, encompassing GWAS and candidate-gene studies with sample sizes >2,000. We selected genome-wide significant index SNPs (P<5×10^−8) (Supplementary Table 1, Supplementary Notes).

Ancestral populations.

Of the 452 index SNPs, 116 were derived from European (EUR) populations, 239 from East Asian (EAS) populations, and 93 from both (Supplementary Figure 1, Supplementary Table 1). An additional four index SNPs derived from Hispanic American (AMR), African (AFR), and South Asian (SAS) populations.

Linked SNPs.

To identify all correlated SNPs within these specific populations, we employed 1000Genomes data, focusing on EUR, EAS, AMR, and AFR. We expanded each locus to its LD region, treating loci as independent if separated by >250 kb with low LD (r2<0.2) (Table 1). Locus boundaries refer to the positions of the first and last proxy of each independent signal. We excluded the HLA region (hg19, Chr6: 28,477,797 – 33,448,354). In total, our analysis revealed 9,931 correlated SNPs based on the initial set of 452 SNPs.
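One way to read the independence rule (separation >250 kb together with r2<0.2) is as a merge of neighbouring LD regions on the same chromosome. The sketch below is one interpretation under stated assumptions, not the procedure actually implemented: `merge_into_loci` and the `max_ld` callback (assumed to return the highest r2 between SNPs of two regions) are hypothetical names.

```python
def merge_into_loci(regions, max_ld, gap_bp=250_000, ld_cutoff=0.2):
    """Merge (start, end) LD regions on one chromosome into independent loci.

    Two neighbouring regions stay separate only if they lie more than
    gap_bp apart AND their maximum pairwise r2 is below ld_cutoff.
    """
    loci = []
    for region in sorted(regions):
        if loci:
            prev = loci[-1]
            gap = region[0] - prev[1]
            if gap <= gap_bp or max_ld(prev, region) >= ld_cutoff:
                # Not independent: extend the previous locus.
                loci[-1] = (prev[0], max(prev[1], region[1]))
                continue
        loci.append(region)
    return loci


# Hypothetical regions (bp) and a stand-in LD lookup that reports no LD.
regions = [(100_000, 150_000), (300_000, 320_000), (900_000, 950_000)]
print(merge_into_loci(regions, max_ld=lambda a, b: 0.0))
```

With these inputs the first two regions are merged because they lie within 250 kb, while the third remains its own locus.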

Table 1.

Summary of all SLE loci.

Chr Locus Locus boundaries[a] Best guess target gene(s)[b] Index/LD[c] Likely causal SNP

1 LOC_1 1,156,655 – 1,191,870 C1QTNF12 1/104 rs6697886
1 LOC_2 8,431,607 – 8,505,058 RERE 1/10 rs301807
1 LOC_3 24,518,206 – 24,519,920 IFNLR1 1/1
1 LOC_4 38,258,007 – 38,379,018 MTF1 1/43 rs28469609
1 LOC_5 67,787,691 – 67,891,029 IL12RB2 2/55 rs11209064
1 LOC_6 114,303,808 – 114,377,568 PTPN22 2/0
1 LOC_7 117,040,622 – 117,104,215 CD58; NAP1L4P1 2/40 rs10924104
1 LOC_8 157,108,159 – 157,119,915 ETV3 1/2 rs116785379
1 LOC_9 157,486,336 – 157,538,786 FCRL5 1/90 rs34273689
1 LOC_10 161,469,054 – 161,596,283 FCGR2A; FCGR2C 3/30
1 LOC_11 173,177,392 – 173,376,184 AL645568.1 13/157 rs6664517
1 LOC_12 174,396,030 – 174,923,045 RABGAP1L 1/420 rs72717613
1 LOC_13 183,225,237 – 183,591,098 NCF2 7/39 rs10911363
1 LOC_14 184,636,486 – 184,723,135 EDEM3 1/39
1 LOC_15 192,513,661 – 192,544,795 AL390957.1 1/34 rs2984920
1 LOC_16 198,543,027 – 198,670,469 PTPRC 2/116
1 LOC_17 201,977,073 – 201,986,311 ELF3 1/0
1 LOC_18 206,642,539 – 206,647,450 IKBKE 2/7
1 LOC_19 206,939,904 – 206,955,041 IL10; IL19 1/3 rs3024493
1 LOC_20 235,890,096 – 236,041,129 LYST 1/27 rs4660117
1 LOC_21 246,434,447 – 246,444,082 SMYD3 1/3
2 LOC_22 7,573,079 – 7,584,668 AC013460.1 1/24
2 LOC_23 30,442,402 – 30,492,116 LBH 3/48 rs906866
2 LOC_24 33,701,890 – 33,702,203 RASGRP3 2/0 rs13385731
2 LOC_25 61,040,651 – 61,173,382 LINC01185 1/11
2 LOC_26 65,559,027 – 65,667,272 SPRED2 3/58 rs1876518
2 LOC_27 74,200,833 – 74,219,948 TET3 3/24
2 LOC_28 111,868,604 – 111,940,585 BCL2L11 1/20 rs12613243
2 LOC_29 136,555,659 – 136,761,853 LCT; MCM6 3/75 rs2278682
2 LOC_30 144,013,184 – 144,028,568 ARHGAP15 1/28 rs10153706
2 LOC_31 163,025,929 – 163,211,491 FAP; IFIH1 5/244
2 LOC_32 191,399,581 – 191,434,502 AC108047.1 1/0
2 LOC_33 191,900,449 – 191,973,034 STAT4 8/38 rs7574865
2 LOC_34 198,492,316 – 198,954,774 PLCL1 2/216 rs13034353
2 LOC_35 204,690,355 – 204,738,919 CTLA4 1/52
2 LOC_36 213,585,035 – 213,593,970 AC093865.1 1/3
2 LOC_37 213,862,922 – 213,890,232 IKZF2 1/9
3 LOC_38 28,068,394 – 28,079,260 LINC01967 1/15 rs1813375
3 LOC_39 58,261,741 – 58,473,899 PXK 4/56
3 LOC_40 72,200,387 – 72,256,927 LINC00870 1/0 rs7637844
3 LOC_41 119,111,870 – 119,272,391 TIMMDC1; CD80 7/22 rs9877891
3 LOC_42 159,625,393 – 159,748,367 IL12A-AS1 3/62 rs2936303
3 LOC_43 169,476,991 – 169,528,523 LRRC34 3/47 rs3821383
3 LOC_44 188,451,078 – 188,472,383 LPP 1/11 rs1568669
4 LOC_45 953,193 – 983,809 DGKQ 3/16 rs11248061
4 LOC_46 2,540,146 – 2,760,732 FAM193A 2/126 rs4690053
4 LOC_47 8,558,199 – 8,568,191 GPR78 1/11
4 LOC_48 40,301,264 – 40,308,368 LINC02265 1/4 rs13136820
4 LOC_49 55,547,533 – 55,553,801 KIT 1/10
4 LOC_50 79,626,160 – 79,679,733 AC112253.1 1/31
4 LOC_51 84,141,253 – 84,161,920 AC114781.2 1/51 rs4693592
4 LOC_52 87,888,054 – 87,976,055 AFF1 2/50 rs340643
4 LOC_53 102,712,542 – 102,762,581 BANK1 6/70 rs6811141
4 LOC_54 108,968,701 – 109,090,112 LEF1 1/0
4 LOC_55 123,073,009 – 123,551,032 ADAD1; IL21 2/76
4 LOC_56 184,603,297 – 184,618,470 TRAPPC11 1/5
5 LOC_57 1,282,319 – 1,286,516 TERT 2/6
5 LOC_58 35,850,149 – 35,916,174 IL7R; CAPSL 1/10
5 LOC_59 100,084,878 – 100,291,657 ST8SIA4 3/216 rs10060686
5 LOC_60 127,733,961 – 127,853,142 FBN2 1/15
5 LOC_61 130,665,788 – 131,259,361 FNIP1 1/4
5 LOC_62 131,812,897 – 131,835,395 IRF1 1/60 rs61175929
5 LOC_63 133,418,739 – 133,433,641 AC008608.1 3/21
5 LOC_64 150,386,395 – 150,462,638 GPX3; TNIP1 5/26 rs10036748
5 LOC_65 158,883,027 – 158,944,457 LINC01845 1/6
5 LOC_66 159,879,978 – 159,887,336 MIR3142HG 2/0 rs2431697
6 LOC_67 238,790 – 259,719 AL035696.1 2/24
6 LOC_68 16,299,343 – 16,761,722 ATXN1 1/0
6 LOC_69 25,184,408 – 26,339,131 CARMIL1; H2BC6 4/93 rs17598658
6 LOC_70 27,498,217 – 27,665,920 CD83P1 1/20 rs10807029
6 LOC_71 34,549,107 – 35,356,143 PPARD 8/794 rs6934662
6 LOC_72 36,695,519 – 36,722,789 CPNE5 1/26 rs236469
6 LOC_73 90,936,894 – 91,002,494 BACH2 1/13 rs614120
6 LOC_74 106,564,236 – 106,598,933 ATG5 3/15
6 LOC_75 116,690,849 – 116,694,120 DSE 2/0
6 LOC_76 137,959,235 – 138,243,739 TNFAIP3 9/49 rs200820567
6 LOC_77 154,562,302 – 154,579,861 AL357075.4 1/15 rs2141289
7 LOC_78 28,142,088 – 28,209,953 JAZF1 3/15 rs702814
7 LOC_79 50,227,828 – 50,348,043 IKZF1 6/26 rs876039
7 LOC_80 67,014,434 – 67,084,823 MTATP6P21 1/36
7 LOC_81 73,434,106 – 74,193,642 GTF2IRD1 5/37
7 LOC_82 75,167,934 – 75,209,951 HIP1 6/21
7 LOC_83 128,563,721 – 128,764,737 IRF5; TNPO3 14/114 rs3778752
8 LOC_84 8,088,230 – 8,155,475 ALG1L13P 2/44 rs2945248
8 LOC_85 8,622,877 – 8,649,881 MFHAS1 1/30 rs2428
8 LOC_86 10,712,945 – 10,802,146 XKR6 5/55 rs6985109
8 LOC_87 11,270,993 – 11,402,063 AF131216.5; BLK 13/84 rs67934857
8 LOC_88 42,128,820 – 42,189,978 IKBKB 1/0
8 LOC_89 56,835,673 – 57,044,066 LYN; RPS20 2/198 rs189658553
8 LOC_90 71,017,438 – 71,330,166 NCOA2 2/95 rs71517442
8 LOC_91 72,891,748 – 72,913,114 MSC-AS1 1/14 rs9298192
8 LOC_92 79,555,186 – 79,657,666 ZC2HC1A; IL7 2/65 rs3808619
8 LOC_93 128,192,981 – 128,197,856 CASC19 1/11 rs2456452
8 LOC_94 129,324,232 – 129,465,024 LINC00824 2/50
9 LOC_95 4,981,602 – 4,984,530 JAK2 1/1
9 LOC_96 21,171,267 – 21,320,324 IFNA22P 2/147 rs10757201
9 LOC_97 102,337,143 – 102,605,963 NR4A3 2/64 rs1405209
10 LOC_98 5,894,714 – 5,914,581 ANKRD16 1/12
10 LOC_99 50,014,917 – 50,122,181 WDFY4 5/121 rs7086101
10 LOC_100 63,785,089 – 63,825,807 ARID5B 2/21 rs56140430
10 LOC_101 64,399,617 – 64,443,139 AC067752.1 2/24 rs2393909
10 LOC_102 73,466,709 – 73,506,129 CDH23 2/34 rs3802712
10 LOC_103 104,973,061 – 105,175,131 NT5C2; INA 1/25
10 LOC_104 105,671,683 – 105,700,775 STN1 1/5
10 LOC_105 112,633,671 – 112,799,757 BBIP1 1/27 rs73343848
11 LOC_106 551,235 – 635,569 IRF7; CDHR5 5/87 rs59115876
11 LOC_107 3,875,757 – 4,114,440 STIM1 1/0
11 LOC_108 18,303,597 – 18,362,382 HPS5; GTF2H1 1/1
11 LOC_109 35,070,068 – 35,123,574 PDHX 5/51 rs2785201
11 LOC_110 65,378,028 – 65,564,926 AP5B1; OVOL1 4/31 rs10791824
11 LOC_111 68,814,887 – 68,869,034 TPCN2 2/35 rs7942690
11 LOC_112 71,132,868 – 71,225,082 NADSYN1 1/66 rs11606611
11 LOC_113 72,499,768 – 72,895,102 FCHSD2 3/6
11 LOC_114 118,480,115 – 118,735,476 DDX6 4/42 rs2508573
11 LOC_115 128,297,318 – 128,504,173 ETS1 6/17 rs12576753
12 LOC_116 4,134,873 – 4,152,163 AC084375.1 1/7
12 LOC_117 12,760,658 – 12,874,462 CDKN1B 5/44 rs12811932
12 LOC_118 43,130,547 – 43,200,941 LINC02450 1/9
12 LOC_119 102,271,358 – 102,405,908 DRAM1 1/0
12 LOC_120 103,912,112 – 103,965,115 AC084364.3 1/0
12 LOC_121 111,826,477 – 112,059,557 ATXN2 4/8
12 LOC_122 121,099,302 – 121,378,566 CABP1 2/141 rs904628
12 LOC_123 129,276,658 – 129,307,699 SLC15A4 7/71 rs35907548
12 LOC_124 133,038,182 – 133,042,182 AC079031.2 1/0
13 LOC_125 41,529,773 – 41,588,832 ELF1 2/15 rs57668933
13 LOC_126 50,143,361 – 50,192,528 RCBTB1 1/20
13 LOC_127 100,084,039 – 100,104,407 TM9SF2 1/13 rs749114
14 LOC_128 35,831,811 – 35,832,666 AL133163.2 1/1
14 LOC_129 68,728,425 – 68,760,141 RAD51B 2/14 rs3784099
14 LOC_130 88,370,343 – 88,383,035 GALC 1/13 rs28626750
14 LOC_131 103,238,582 – 103,290,221 TRAF3 1/62 rs12880641
14 LOC_132 105,386,039 – 105,416,010 PLD4; AHNAK2 3/51 rs2819426
15 LOC_133 38,728,250 – 38,927,386 RASGRP1 4/10 rs7173565
15 LOC_134 75,079,474 – 75,392,795 CSK; SCAMP5 3/23 rs34180494
15 LOC_135 77,824,646 – 77,830,430 AC046168.1 1/19 rs1317320
15 LOC_136 97,595,545 – 97,626,101 AC055873.1 1/9
15 LOC_137 101,529,012 – 101,550,214 LRRK1 1/1
16 LOC_138 11,038,360 – 11,291,722 CLEC16A 7/82 rs2041670
16 LOC_139 23,871,457 – 23,901,376 PRKCB 2/1
16 LOC_140 30,584,430 – 30,827,205 PRR14; RNF40 2/88 rs3812999
16 LOC_141 31,260,235 – 31,369,803 ITGAM; ITGAX 7/100 rs4632147
16 LOC_142 50,068,422 – 50,139,799 HEATR3 1/67
16 LOC_143 57,352,124 – 57,403,500 CCL22 3/13 rs9921681
16 LOC_144 58,247,523 – 58,268,561 CCDC113 1/70 rs2731741
16 LOC_145 68,551,277 – 68,663,156 ZFP90; RNU4-36P 3/197 rs28537207
16 LOC_146 79,739,978 – 79,755,446 MAFTRR 1/25
16 LOC_147 85,966,683 – 86,020,039 AC092723.4 6/42 rs8052690
16 LOC_148 87,390,630 – 87,443,734 MAP1LC3B 1/26 rs10431963
17 LOC_149 4,706,123 – 4,712,617 PLD2 1/4
17 LOC_150 7,208,373 – 7,240,391 ACAP1 2/8
17 LOC_151 16,839,901 – 16,845,467 TNFRSF13B 2/1
17 LOC_152 37,885,383 – 38,088,150 MIEN1; IKZF3 7/245 rs34758895
17 LOC_153 43,422,855 – 43,457,886 RNA5SP443 1/5
17 LOC_154 47,448,102 – 47,554,350 AC091180.5 1/0 rs2671655
17 LOC_155 73,304,710 – 73,417,662 GRB2 3/158 rs8072449
17 LOC_156 76,372,972 – 76,393,736 PGS1 1/5
18 LOC_157 67,518,031 – 67,562,657 CD226 3/36 rs1788103
18 LOC_158 77,377,925 – 77,386,912 AC068473.4 1/7 rs118075465
19 LOC_159 936,297 – 952,429 ARID3A 1/16 rs2238580
19 LOC_160 2,131,148 – 2,208,859 DOT1L 1/49 rs2864419
19 LOC_161 6,689,065 – 6,699,330 C3 1/31
19 LOC_162 10,392,638 – 10,481,532 TYK2 5/20 rs2569693
19 LOC_163 16,438,661 – 16,443,718 KLF2 1/9 rs11086029
19 LOC_164 18,383,794 – 18,637,194 IQCN; SSBP4 3/100 rs28375303
19 LOC_165 33,035,097 – 33,106,621 PDCD5 2/21
19 LOC_166 49,788,205 – 49,918,814 SLC6A16; TEAD2 2/45 rs7257053
19 LOC_167 50,162,909 – 50,182,697 IRF3 1/2
19 LOC_168 52,021,247 – 52,127,053 SIGLEC6 2/30
19 LOC_169 55,730,976 – 55,739,813 TMEM86B 2/15
20 LOC_170 1,507,507 – 1,558,508 AL049634.1 1/27
20 LOC_171 44,730,245 – 44,749,251 CD40 1/11
20 LOC_172 48,429,020 – 48,605,930 RNF114 1/92 rs117447227
22 LOC_173 18,648,861 – 18,654,105 USP18 1/19
22 LOC_174 21,798,351 – 21,985,094 UBE2L3; YDJC 7/110 rs1034329
22 LOC_175 39,739,187 – 39,756,650 SYNGR1 2/9 rs2069235
22 LOC_176 40,291,139 – 40,317,126 GRAP2 1/13
X LOC_177 12,839,152 – 12,907,658 PRPS2 3/3
X LOC_178 30,572,729 – 30,577,846 CXorf21 1/5
X LOC_179 53,081,414 – 53,111,428 GPR173 1/37
X LOC_180 56,295,245 – 57,406,814 KLF8; NBDY 2/1146 rs5913948
X LOC_181 149,663,590 – 149,673,253 MAMLD1 1/5
X LOC_182 153,189,819 – 153,378,375 IRAK1; MECP2 11/95 rs3027878
[a] The boundaries are defined as the base-pair positions of the first and last LD SNP in a locus. If the locus contains only one SNP, the boundaries are defined as the positional gene boundaries based on the GENCODE definition.

[b] Target gene(s) with the most supporting evidence from eQTL, PCHiC, caQTL, and MPRA if there is a likely causal variant; otherwise, the gene nearest to the locus.

[c] Number of index SNPs / number of LD (proxy) SNPs, not including index SNPs.

Index SNPs per locus.

Among the 182 defined loci, 89 had only one index SNP (Table 1). For the other 93 loci, we calculated the pairwise LD (r2) between SNPs within the same locus (Supplementary Table 1) for both major populations, wherever data was available in 1000G Phase III. The intra-locus LD for index SNPs ranged from <0.001 (e.g., rs11085727 and rs55882956 in LOC_162) to 1.0 (e.g., rs9913957, rs8076347, rs143123127, and rs8079075 in LOC_152).
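For reference, pairwise r2 between two biallelic SNPs can be computed from phased haplotypes using the standard definition r2 = D^2 / [pA(1−pA)pB(1−pB)]. The sketch below is illustrative and assumes 0/1-coded phased haplotypes; it is not the tool used to obtain the values reported here, and `ld_r2` is a hypothetical name.

```python
def ld_r2(haplotypes):
    """Compute r2 from phased haplotypes given as (allele_at_A, allele_at_B) pairs.

    Alleles are coded 0/1. D is the difference between the observed
    frequency of the 1-1 haplotype and its expectation under independence.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Toy haplotypes, for illustration only.
perfect_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(perfect_ld), ld_r2(no_ld))
```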

Rare SNPs.

We defined a rare variant as one with minor allele frequency (MAF) <1%. We used 1000G Phase III data for all 9,931 SNPs (Supplementary Table 2). In EUR, there were 69 rare SNPs (three index SNPs and 66 SNPs in LD) with MAF <1%, originating from five loci (LOC_13, LOC_31, LOC_42, LOC_81, and LOC_138). In EAS, there were 70 rare SNPs (five index SNPs and 65 SNPs in LD) with MAF <1% from two loci (LOC_83 and LOC_148). In the other populations, no SNPs had MAF <1%.

Regulatory region annotation.

We employed diverse bioinformatics tools and databases to evaluate the regulatory implications of each SNP (Supplementary Notes). To identify allele-specific enhancers, we leveraged MPRA data encompassing over 3,000 SNPs with both alleles present(26). Additionally, we incorporated caQTL data(16, 20) to refine the fine-mapping and annotation of SNP-specific regulatory elements.

Target genes.

We employed two methods to determine SNP targets (Supplementary Notes). First, we annotated SNPs with cis- and/or trans-eQTLs and splicing QTLs (sQTLs) using multiple databases. Second, to identify SNPs interacting with enhancers and promoters through chromatin interactions, we overlapped associated SNPs with chromatin interaction anchors from available immune-cell PCHiC data(15, 22).

SNP/gene-set enrichment analysis.

Target genes of functional SNPs were tested for enrichment in Gene Ontology (GO) categories, biochemical pathway membership, and disease association. Enrichment analysis was carried out using FUMA(11) and epiCOLOC(27) on different SNP sets and their target genes identified through target-type annotations.

Protein models.

Protein models were taken from AlphaFold2 and illustrated with PyMOL.

RNA folding.

RNA secondary structure prediction was performed with CentroidFold (http://rtools.cbrc.jp/centroidfold/).

Transcription factor binding.

Binding sites were annotated from UCSC Genome Browser GRCh37/hg19 JASPAR core 22.

Transcription factor binding site (TFBS) position-weight matrices (PWMs).

Human PWMs were downloaded from CIS-BP (http://cisbp.ccbr.utoronto.ca/). The PWM for the STAT1:STAT2 complex was taken from UCSC Genome Browser. Frequency differences were converted to ΔΔG (kcal/mol) with the Boltzmann equation at 37°C.
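The conversion from PWM frequencies to ΔΔG can be sketched as follows. This is a minimal illustration assuming the common form ΔΔG = −RT ln(f_alt/f_ref), with R = 1.987 × 10^−3 kcal/(mol·K) and T = 310.15 K (37°C); the sign convention and the function name `delta_delta_g` are assumptions rather than details stated in the Methods.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 310.15      # 37 degrees C


def delta_delta_g(f_ref, f_alt, temperature=T_KELVIN):
    """Approximate binding free-energy change (kcal/mol) between two alleles.

    f_ref and f_alt are the PWM frequencies of the reference and alternate
    base at the SNP position. Positive values mean the alternate allele is
    predicted to bind less favourably (assumed convention).
    """
    if f_ref <= 0 or f_alt <= 0:
        raise ValueError("frequencies must be positive (add a pseudocount)")
    return -R_KCAL * temperature * math.log(f_alt / f_ref)


# Hypothetical frequencies: the alternate base is tenfold rarer in the PWM.
print(round(delta_delta_g(0.5, 0.05), 2))
```

Under these assumptions a tenfold frequency difference corresponds to roughly 1.4 kcal/mol at 37°C.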

CRISPR-based functional validation.

We utilized CRISPR/Cas9 activation/inhibition (CRISPRa/i) to introduce activating or silencing domains to rs57668933; details are reported elsewhere(28). We employed the plasmids SP-dCas9-TET1 and SP-dCas9-LSD1 for activation and inhibition, respectively. A lymphoblastoid cell line (LCL; NA18566) with the TT genotype was thawed and cultured in T25 culture flasks until reaching a density of 0.5 – 0.7 × 10^6 cells/mL. Subsequently, we co-transfected two pools of ELF1-sgRNA plasmids along with dCas9-based activation and inhibition plasmids into the LCL cells via electroporation using Neon. RNA harvesting was performed 48 hours post-transfection, with cells transfected solely with sgRNA plasmids serving as the control group.

qPCR.

To assess the impact of the SNP on target gene expression, we used qRT-PCR on wild-type (WT), CRISPRa, and CRISPRi cells, as described elsewhere(28). RNA was isolated from WT and CRISPRa/i cells using an RNA Mini kit (Zymo Research) and reverse transcribed using iScript Reverse Transcription Supermix cDNA synthesis kit (Bio-Rad). We measured ELF1 expression and analyzed results for significance using Prism V.7 (GraphPad).

Western blot.

Cells were collected 72 hours after transfection and lysed in RIPA buffer supplemented with a protease and phosphatase inhibitor cocktail (outlined in Supplementary Notes). Blots were visualized using an Azure ChemiBlot machine. Expression levels were quantified using ImageJ, and densitometry values were graphed using GraphPad Prism.

RESULTS

Defining independent SLE candidate loci.

Overall, we identified 452 reported genome-wide significant (P<5×10^−8) non-HLA index SNPs from 76 different GWAS and candidate gene studies (Supplementary Table 1). Most index SNPs derived from EAS and EUR ancestry studies (Supplementary Figure 1). Among these, 242 (53.5%) index SNPs lay within 145 gene bodies, with 44 SNPs in exonic regions, 198 SNPs in intronic regions, and 210 SNPs (46.5%) in intergenic regions connected to the target genes through eQTL or PCHiC (Supplementary Table 2).

We then collected SNPs in high LD (r2>0.8) with index SNPs, finding 9,479, for a total of 9,931 SNPs for study (Methods). We binned these into 182 statistically-independent loci (Table 1, Supplementary Table 2), with median locus size of 57.7 kb [range 314 bp – 1.15 Mb]. Of the 182 loci, 89 contained single index SNPs; the rest had 2–14 (median 2; Supplementary Figure 1, Supplementary Table 3). Total linked SNPs per locus ranged from 1–1,148 (median 26; Supplementary Table 3). Correlated SNPs per index SNP ranged from 1–146 (median 24). Fifteen loci had single index SNPs and no LD-SNPs; conversely, LOC_180 had two index SNPs and 1,146 LD-SNPs. The physical distance between index SNPs and LD-SNPs varied from 1 bp – 499 kb (median 14 kb).
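Locus sizes follow directly from the boundaries in Table 1. As a small worked example, the sketch below computes sizes for three loci copied from Table 1, assuming inclusive base-pair counting (end − start + 1); the helper name `locus_size_bp` is hypothetical.

```python
from statistics import median

# Boundaries (hg19, bp) copied from Table 1 for three loci.
boundaries = {
    "LOC_24": (33_701_890, 33_702_203),
    "LOC_69": (25_184_408, 26_339_131),
    "LOC_125": (41_529_773, 41_588_832),
}


def locus_size_bp(start, end):
    """Size of a locus in bp, counting both boundary positions (assumption)."""
    return end - start + 1


sizes = {name: locus_size_bp(*bounds) for name, bounds in boundaries.items()}
for name, size in sizes.items():
    print(name, size)
print("median of these three:", median(sizes.values()))
```

LOC_24 and LOC_69 give 314 bp and about 1.15 Mb, respectively, corresponding to the two ends of the reported size range.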

The 182 independent loci overlap with 426 gene bodies. Out of 9,931 total SNPs, 4,672 (47.0%) are intronic, 88 (0.9%) synonymous, 89 (0.9%) missense, and 5,082 (51.2%) intergenic (Figure 1, Supplementary Table 2). We annotated all SNPs for cis- and trans-regulatory effects. The 89 missense SNPs, which have the potential to modify protein structure, function, or expression, underwent molecular modeling.

Annotation pipeline.

To annotate and prioritize these 9,931 SNPs at 182 loci/426 genes, we established a multi-step pipeline: 1) compiling eQTLs and PCHiC data from immune cells (Supplementary Table 4), 2) integrating these datasets to initially classify SNPs, 3) incorporating histone mark and MPRA data, 4) refining GWAS peaks with caQTLs, and 5) experimentally testing prioritized SNPs. Notably, caQTLs appear much narrower than many other GWAS signals(16); however, they are currently only available for B-cells among immune cells. As such, we placed them late in our pipeline, so that the initial prioritization covers all cell types. As caQTL data becomes available for other cell types, placing this step earlier in the pipeline could more rapidly narrow GWAS peaks.

eQTLs.

We initially annotated all SNPs with cis- and trans-eQTLs (having target genes >1 Mb away or on another chromosome) and associated target genes, using only immune cell-specific data. Most SNPs (9,052) have ≥1 significant cis-eQTL [range 0–31; 856 SNPs have single cis-eQTLs and 5,539 have ≤5 cis-eQTLs] (Supplementary Table 2). The cis-eQTL targets are enriched in immune-related genes, with many being known SLE risk loci. In LOC_13, rs17849501 (Neutrophil cytosol factor 2, NCF2) is an eQTL of several genes in multiple immune cell types. We experimentally demonstrated strong, allele-dependent enhancer activity of this SNP(29). In LOC_66, rs2431697 (intergenic) affects expression of multiple genes across cell types. This SNP has been experimentally shown to physically associate with the promoter of miRNA-146a, a potent immune regulator(30) and SLE biomarker(31). In LOC_76, SLE risk SNP rs2230926 (Tumor necrosis factor alpha-induced protein 3, TNFAIP3) greatly increases neutrophil extracellular traps and citrullinated epitopes in SLE patients(32). In LOC_83, rs13239597 (intergenic) is an experimentally validated allele-specific enhancer of Interferon regulatory factor 5 (IRF5), a key SLE risk gene.

We identified 75 trans-eQTLs that target 272 unique genes across 22 loci (ranging from 1 to 149 genes per locus) with a false-discovery rate (FDR) <1e-5 (Supplementary Tables 5–7). Among these trans-eQTLs, 13 target genes were distal and 259 were located on different chromosomes. Notably, 73 of the 75 trans-eQTL SNPs were also identified as cis-eQTLs. At LOC_121 (SH2B3, ATXN2), all eight trans-eQTL SNPs showed >100 target genes, demonstrating substantial interactions across the genome. SH2B3 (a.k.a. lymphocyte adaptor protein, LNK) links numerous immune signaling pathways to inflammation(33) and is a major immune regulator. Within LOC_79 (IKZF1), a single trans-eQTL (rs4917014) was associated with 50 target genes, many of which were immune-related and often associated with SLE. rs1990760, a coding SNP at IFIH1 (LOC_31), has been implicated in lupus susceptibility(34). This SNP is also a trans-eQTL targeting nine genes (MX1, IFI44L/IFI44, HERC5, IFIT1/IFI6, OAS3/OAS2, HERC6) significantly enriched in type I/II interferon signaling genes. Interestingly, seven (MX1, IFI44L/IFI44, HERC5, IFIT1/IFI6, OAS3) and four (MX1, IFI44L/IFI44, HERC5) target genes were also targeted by LOC_97 and LOC_99, respectively (Supplementary Table 5), suggesting potential co-regulation among core genes, further amplifying trans-effects in an omnigenic model(35).
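The cis/trans distinction used here (trans targets lying >1 Mb away or on another chromosome) can be sketched as a simple rule. This is an illustrative helper only; measuring distance from the SNP to the gene's transcription start site is an assumption, and `classify_eqtl` and the example coordinates are hypothetical.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_tss, window_bp=1_000_000):
    """Label a SNP-gene eQTL pair as cis, distal trans, or interchromosomal trans."""
    if snp_chrom != gene_chrom:
        return "trans (different chromosome)"
    if abs(snp_pos - gene_tss) > window_bp:
        return "trans (distal)"
    return "cis"


# Hypothetical coordinates, for illustration only.
print(classify_eqtl("2", 163_100_000, "2", 163_175_000))
print(classify_eqtl("2", 163_100_000, "2", 170_000_000))
print(classify_eqtl("2", 163_100_000, "21", 42_800_000))
```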

Chromatin interactions.

We independently analyzed PCHiC data on immune cells(15, 22). In total, 6,198 SNPs had ≥1 PCHiC connection (762 SNPs had 1; 3,322 SNPs had ≤5; maximum 93). Combining eQTL and PCHiC datasets, our SNPs target 3,504 unique genes (Figure 1, Supplementary Tables 2, 8).

SNP categorization.

Concordance between eQTL and PCHiC annotations guided the establishment of SNP tiers (Figure 1, Supplementary Table 2). Tier1 includes SNPs annotated by both methods with non-zero target gene overlap (3,746 from 143 loci). These SNPs have strong evidence of controlling expression of specific target genes. Tier2 (1,906 SNPs from 17 loci) were annotated by both methods but targeting different genes in existing datasets. Tiers 3a and 3b (546 SNPs, 11 loci; 3,400 SNPs, 6 loci) showed either PCHiC or eQTL activity, respectively, but not both. Finally, 333 SNPs (Tier4) exhibited neither activity. Of 9,052 cis-eQTL SNPs, 3,746, 1,906, and 3,400 were categorized as Tier1, Tier2, and Tier3b, respectively. Of 75 trans-eQTL SNPs, 50, 11, and 14 were Tier1, Tier2, and Tier3b, respectively. SNPs with trans-eQTL activity were significantly more likely to be Tier1 than those with cis-eQTL activity (Chi-squared test; p=1.7e-5, Supplementary Table 9).
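The comparison of Tier1 proportions between trans- and cis-eQTL SNPs can be reproduced approximately from the counts given above. The sketch below uses a 2×2 chi-squared test with Yates' continuity correction, implemented with the standard library only; whether a continuity correction was applied in the original analysis is an assumption, and the two SNP sets are treated as separate groups for simplicity even though most trans-eQTL SNPs are also cis-eQTL SNPs.

```python
import math


def chi2_2x2_yates(a, b, c, d):
    """Chi-squared test (1 df, Yates-corrected) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e
               for o, e in zip(observed, expected))
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the text: Tier1 vs. other tiers among trans- and cis-eQTL SNPs.
trans_tier1, trans_other = 50, 75 - 50
cis_tier1, cis_other = 3746, 9052 - 3746
chi2, p_value = chi2_2x2_yates(trans_tier1, trans_other, cis_tier1, cis_other)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

Under these assumptions the p-value is on the order of 10^−5, in line with the reported value.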

Subsequently, to identify shared target genes within specific immune cells, we analyzed eQTL and PCHiC targets across the seven major immune cell types present in both databases. B-cells, T-cells, and neutrophils shared 34–47% of target genes identified by both methods. Monocytes, macrophages, and LCLs exhibited much lower overlap, 5–14% of target genes (Supplementary Table 10).

Regulatory elements.

Linked SNPs were closely associated with transcriptional regulatory regions annotated by GenoSTAN and other databases; 4,332 (43.6%) lie in annotated promoter, enhancer, and/or silencer regions (Supplementary Table 2). Of 9,059 eQTL SNPs, 3,457 (38.1%) lie in enhancers, 625 (6.9%) in promoters, 485 (5.4%) in both, and 670 (7.3%) in silencers. We observed a median of 13 transcriptional element-associated SNPs per locus (four loci had no such SNPs; LOC_71 had 360). The bulk were Tier1/Tier2 SNPs, indicating a relationship between transcriptional regulatory elements and eQTL/PCHiC activity (Supplementary Table 11). Tier1 SNPs made up a significantly larger proportion (p=1.7e-5) of trans-eQTL SNPs (66.7%, or 50 out of 75) than of cis-eQTL SNPs (41.4%, or 3,746 out of 9,052) (Supplementary Table 9). Enhancer SNPs that are also eQTL SNPs had a median distance of 47.2 kb to their target genes’ transcription start sites (TSSs); for Tier1 SNPs, this distance was 45.0 kb. Enhancer SNPs that are also PCHiC SNPs had a median distance of 214.5 kb to their target genes’ TSS; for Tier1 SNPs, this distance was 193.3 kb (Supplementary Table 8). Tier1 enhancer SNPs thus lie somewhat closer to their target genes’ TSSs than enhancer SNPs overall.

Of all regulatory element-associated SNPs, 117 (from 32 loci) were Tier1 SNPs with 1–4 common target genes, leading to 58 unique genes targeted in both eQTLs and PCHiC. Of enhancer SNPs, 93 (26 loci) were Tier1, together targeting 44 unique genes (Supplementary Table 2). These SNPs (which are in annotated enhancers, are involved in chromatin interactions, and transcriptionally regulate specific target genes) represent highly prioritized candidates and are given further attention below.

Massively parallel reporter assays.

As an independent measure of SNP effects on transcription, we mined MPRA datasets, which provide high-throughput characterization of enhancers(36). We examined MPRA data from B-cells (GM12878) containing 3,073 SNPs, of which 51 were found to have statistically-significant allele-specific enhancer activity (ASE)(26). Of the 2,614 SNPs shared between the available data(26) and our dataset (24 missense, 20 synonymous, 1,098 intronic, 1,472 intergenic), 42 show ASE (FDR<0.01; Supplementary Tables 12, 13). MPRA-ASE SNPs were almost exclusively non-coding: 50 intergenic, 46 intronic, 1 synonymous, 1 missense.

Deleteriousness scores.

We annotated SNPs with pre-computed deleteriousness scores (predictSNP2). Of exonic SNPs (177 in 61 loci, 91 unique protein-coding genes; 89 missense from 43 loci, 57 unique genes), the algorithms identified 11, 26, and 37 deleterious SNPs, respectively. For missense SNPs, 9, 17, and 37, respectively, were labeled deleterious. For non-coding SNPs, 516 (55% intronic, 45% intergenic) were deemed deleterious by ≥1 algorithm (Supplementary Table 14).

Chromatin accessibility.

SNP annotation based on chromatin accessibility in whole blood revealed Tier1 with the strongest signals, followed by Tier2 and 3a (Supplementary Figure 2). Tiers 3b and 4 showed essentially zero enrichment.

caQTL SNPs.

To identify SNPs with allele-specific chromatin accessibility, we searched a caQTL database from LCLs from ten ethnicities(16). The caQTL peaks are quite narrow(37); however, the method is new and, among immune cells, has so far been applied only to LCLs. Thus, although the technique dramatically reduces SNP numbers, the results here are specific to B-cells. SLE, of course, manifests through numerous cell types, so this analysis covers only a subset of associated SNPs. As caQTLs are determined in more cell types, this analysis can be extended.

Our SNP set includes 295 caQTL SNPs across diverse ancestral backgrounds. Among the 182 loci, 100 had ≥1 caQTL SNP (range 1–16); 73 loci had ≥1 Tier1 caQTL SNP (Figure 1). All but one caQTL SNP were also eQTL SNPs. Of 295 caQTL SNPs, 194 are Tier1, 46 Tier2, 6 Tier3a, 48 Tier3b, and 1 Tier4. Thus, caQTL SNPs are heavily enriched in high-tier SNPs, underscoring that SNP-driven changes in chromatin accessibility contribute strongly to chromatin interactions and target gene expression. Of 295 caQTL SNPs, 235 (79.7%) lie in enhancers, 91 (30.8%) in promoters, 63 (21.4%) in both, and 19 (6.4%) in silencers. This is consistent with eQTL and MPRA data, emphasizing the strong enrichment of caQTL SNPs within regulatory elements, particularly enhancers (Supplementary Table 15).
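The tier breakdown of caQTL SNPs can be expressed as shares and as a rough fold enrichment. The tier counts below are those reported in the paragraph above. The background fraction is an assumption made for this sketch: it takes the 3,746 Tier1 SNPs reported in this article against the 9,931 high-LD SNPs given in the abstract, which may not be the denominator the authors would use.

```python
# Tier counts for the 295 caQTL SNPs, as reported in the text.
tier_counts = {"Tier1": 194, "Tier2": 46, "Tier3a": 6, "Tier3b": 48, "Tier4": 1}
total_caqtl = sum(tier_counts.values())

# Percentage of caQTL SNPs falling in each tier.
shares = {tier: round(100 * n / total_caqtl, 1) for tier, n in tier_counts.items()}

# Assumed background: 3,746 Tier1 SNPs among 9,931 high-LD SNPs.
background_tier1 = 3746 / 9931

# Ratio of the Tier1 share among caQTL SNPs to the assumed background share.
fold_enrichment = (tier_counts["Tier1"] / total_caqtl) / background_tier1
```

A fold enrichment above 1 under this assumed background is in line with the statement that caQTL SNPs are concentrated in the highest tier; it is not a substitute for a formal test.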

Transcription factor binding.

Next, we independently annotated transcription factor (TF) binding sites using epiCOLOC(27). Tier1 SNPs showed by far the most TFs (89) with binding site enrichment (Supplementary Figure 3), with Tier2 next. Tier3a showed low enrichment, and Tiers 3b and 4 were negligible. TFs highly represented in Tier1/Tier2 SNPs include Brachyury/TBXT, TCF4, MYB, and NFKB1, all critical immune-linked proteins involved in SLE pathogenesis. Altogether, TFBS enrichment strongly correlates with eQTL/PCHiC activity, and enriched TFs were immune-linked and SLE-associated.

Tissue enrichment.

Next, for our collected loci, we tabulated expression in diverse tissues using FUMA GENE2FUNC. Tier1 loci target genes were significantly enriched (FDR <0.001) in whole blood and lymphocytes (Supplementary Figure 4). As before, lower tiers were much less enriched in these tissues and demonstrated less tissue enrichment overall.

Disease and pathway association.

We examined disease GWAS association catalogs; Tier1 loci target genes were significantly overrepresented in 154 of 310 traits/diseases. SLE, rheumatoid arthritis (RA), and inflammatory bowel disease (IBD) were particularly enriched in GWAS hitting these loci. Conversely, lower tiers showed minimal association with disease GWAS. Moreover, Tier1 loci target genes exhibited substantial enrichment in KEGG pathways (36 of 68) and gene ontology (GO) classifications (464 of 1,374), unlike lower tiers. These pathways encompass immune system regulation, cytokine production, phosphorus metabolism, and interferon signaling regulation (Supplementary Table 16). Further studies are required to flesh out precise pathways and mechanisms by which these highly associated SNPs contribute to dyshomeostasis and SLE progression. These findings will guide subsequent experimental investigations.
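Over-representation of target genes in a pathway or trait gene set is commonly assessed with a hypergeometric test. The section does not state which test underlies the reported enrichments, so the sketch below is an assumption, and all counts in it are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K are in the set."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)

# Hypothetical numbers: 20,000 background genes, a 200-gene pathway,
# and 500 target genes, 25 of which fall in the pathway.
p_value = hypergeom_upper_tail(k=25, K=200, n=500, N=20000)
```

Exact integer arithmetic via `math.comb` avoids the rounding problems of multiplying many small probabilities, at the cost of speed for very large gene sets. In practice the resulting p-values would still need multiple-testing correction across all pathways tested.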

Missense SNPs.

We highlight several missense SNPs predicted to dramatically disrupt protein function. rs78555129 mutates a universally conserved arginine in adipolin (CTRP12/FAM132A/C1QTNF12) to cysteine, perturbing protein folding and presumably interactions with its (currently unknown) receptor (Supplementary Figure 5a). Adipolin is an anti-inflammatory adipokine implicated in diabetes, arthritis, and obesity(38). In B-cell scaffold protein with ankyrin repeats 1 (BANK1), rs10516487 destabilizes the protein (Supplementary Figure 5b), likely interfering with its interactions with TRAF6 and MyD88 in innate immune signaling(39). rs201802880 in Neutrophil Cytosolic Factor 1 (NCF1/p47phox) mutates a universally conserved residue (Supplementary Figure 5c), leading to protein destabilization. NCF1 is a subunit of NADPH oxidase, critical for phagocytic immune responses(40). rs2230926 in TNFα-Induced Protein 3 (TNFAIP3) mutates a universally conserved residue important for protein stability (Supplementary Figure 5d). TNFAIP3 is indispensable to TNF signaling and immune activation and is an SLE risk gene(41).

Final SNP selection.

We identified 3,746 Tier1 SNPs with substantial predicted contribution to SLE. Among the 182 loci, 106 were flagged by ≥3 independent experimental methods, and 6 loci were flagged by all available experimental methods. These SNPs are predicted to be highly associated with SLE, primarily exerting their effects via enhancer-driven modifications in target gene expression, particularly within B-cells.
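The "flagged by ≥3 methods" criterion can be sketched as a count of independent evidence types per locus. The method labels loosely follow the subsection headings above, but exactly which methods the authors counted is an assumption here, and the locus names and flags are hypothetical.

```python
# Evidence types assumed for this sketch.
METHODS = ("eQTL", "PCHiC", "caQTL", "MPRA", "TFBS")

# Hypothetical evidence flags per locus.
locus_evidence = {
    "LOC_A": {"eQTL", "PCHiC", "caQTL", "MPRA", "TFBS"},
    "LOC_B": {"eQTL", "PCHiC", "caQTL"},
    "LOC_C": {"eQTL"},
}

def loci_with_min_support(evidence, minimum):
    """Loci supported by at least `minimum` distinct methods."""
    return sorted(
        locus for locus, methods in evidence.items() if len(methods) >= minimum
    )

at_least_three = loci_with_min_support(locus_evidence, 3)
all_methods = loci_with_min_support(locus_evidence, len(METHODS))
```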

All six SNPs have extensive experimental evidence linking them to SLE (Table 2, Figure 3, Supplementary Figures 6a–e). Most disrupt highly conserved binding sites of critical immune transcription factors (Supplementary Table 17, Supplementary Figure 7). TFBS position-weight matrices show that binding is essentially abolished in many cases (Supplementary Figure 7). Target genes and disrupted transcription factors are known autoimmune risk genes, implicated in multiple diseases (Supplementary Table 16). Many selected SNPs are far from index SNPs and do not appear in the literature, highlighting the pipeline’s ability to localize signals in large GWAS peaks.
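How a position-weight matrix can indicate loss of binding at a SNP can be sketched by scoring the reference and alternate sequences and comparing log-odds scores. The 4-position matrix, the uniform background, and both sequences are hypothetical; they do not correspond to any transcription factor or SNP in this study.

```python
from math import log2

# Hypothetical PWM: one dict of base probabilities per motif position.
PWM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # assumed uniform base frequency

def pwm_log_odds(sequence, pwm=PWM, background=BACKGROUND):
    """Sum of per-position log2 odds of the motif versus background."""
    return sum(
        log2(column[base] / background) for column, base in zip(pwm, sequence)
    )

ref_score = pwm_log_odds("AGGA")  # matches the preferred base at every position
alt_score = pwm_log_odds("AGTA")  # G>T change at the third position
delta = alt_score - ref_score
```

A strongly negative `delta` at a highly constrained position is the kind of signal summarized as binding being "essentially abolished"; real analyses would use curated matrices and empirically derived background frequencies.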

Table 2.

Six most significant SNPs and their target genes.

SNP          Chr   Risk/Non-Risk   Closest Gene   Common target genes
rs13385731   2     T/C             RASGRP3        RASGRP3, FAM98A
rs2936303    3     G/A             IL12A-AS1      IL12A, TRIM59
rs10036748   5     T/C             TNIP1          TNIP1, ANXA6, GPX3
rs2431697    5     T/C             MIR3142HG      PTTG1, SLU7
rs57668933   13    C/T             ELF1           ELF1
rs2069235    22    A/G             SYNGR1         PDGFB, MGAT3, RPL3

Figure 3.

Analysis of LOC_125. This locus has two index SNPs and 15 high-LD SNPs. Of these, rs57668933 is a caQTL-SNP with allele-specific enhancer activity. a) Visualized connections of rs57668933 and neighboring regions based on PCHiC, alongside various histone marks. b) The SNP lies in ELF1, a significant eQTL gene. Significant genotype-specific gene expression in follicular T-helper cells (Tfh), CD16+ monocytes, and unstimulated and stimulated B-cells. c) Chromatin accessibility of alleles. Reference allele T has higher chromatin accessibility. d) MPRA data shows T has higher enhancer activity. e) CRISPR-dCas9-based targeting of rs57668933-containing region by two activators (dCas9-VPR and dCas9-TET1) and two suppressors (dCas9-MECP2, dCas9-LSD1) and ELF1 (target gene) mRNA expression. f) Western blot demonstrating the differential expression of ELF1 protein in response to the CRISPR-dCas9-based activation (TET1) and inhibition (LSD1) systems. Densitometry plot illustrating representative Western blot results for ELF1 protein expression. Higher expression is observed in the presence of dCas9-TET1 (activator, lane 2), while reduced expression is observed in the presence of dCas9-LSD1 (inhibitor, lane 3) compared to the control (lane 1). Significance values (**p<0.005, ***p<0.0005) indicate statistically significant differences.

CRISPR-based validation of rs57668933.

We experimentally validated rs57668933, selected at random from our six most significant SNPs. rs57668933 (intron of lymphoid cell transcription factor E74-like factor 1, ELF1), at LOC_125 (Chr 13), controls ELF1 expression (Figures 3a–b). The protective T allele correlates with higher ELF1 expression in T-cells, B-cells, and monocytes in healthy controls (Supplementary Figure 8) and shows high allele-specific chromatin accessibility (Figure 3c) and enhancer activity (Figure 3d). ELF1 has been previously reported as an SLE risk gene (lead SNP rs7329174(42)); we show that rs57668933 is instead the likely causal SNP, with the risk allele yielding lower chromatin accessibility and ELF1 expression. ELF1 represses FcRγ expression(43); SLE patients’ T-cells express essentially no ELF1 but high levels of FcRγ, which activates immune reactivity and promotes nephritis(44). ELF1 also regulates antibody heavy chain production in B-cells. This SNP disrupts universally conserved binding sites for the tumor suppressors p63 and p73 (Supplementary Figure 7a), both with strong immune contributions.

We employed CRISPRa/i-based activation and inhibition targeting rs57668933 (Figure 3e). Both activation domains doubled ELF1 transcript levels, while both suppressor domains halved them, as confirmed by Western blot (Figure 3f). The CRISPR-dCas9-based activation and inhibition system revealed distinct alterations in ELF1 protein expression. Compared to the control group transfected with sgRNA only, the dCas9-TET1 activation plasmid significantly increased ELF1 protein expression (to ~1.8x), while the dCas9-LSD1 inhibition system reduced it (to ~0.8x). These findings strongly support the notion that the SNP region plays a pivotal role in regulating ELF1 expression, consistent with our other findings.
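Relative transcript levels from qPCR are often computed with the 2^(-ΔΔCt) method. This section does not state how ELF1 transcript levels were quantified, so the sketch below assumes that method, and the Ct values and the unnamed reference gene are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each dCt is the target-gene Ct minus the reference-gene Ct; ddCt is
    the treated dCt minus the control dCt.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)

# Hypothetical Ct values (target gene, then reference gene).
# Target crosses threshold one cycle earlier than in the control.
activated = fold_change_ddct(24.0, 18.0, 25.0, 18.0)
# Target crosses threshold one cycle later than in the control.
repressed = fold_change_ddct(26.0, 18.0, 25.0, 18.0)
```

Under this method a one-cycle shift corresponds to a two-fold change, so a doubling or halving of transcript levels maps to a ΔΔCt of about -1 or +1, assuming near-100% amplification efficiency.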

DISCUSSION

We have established a state-of-the-art SNP and locus analysis pipeline for assimilating data regarding gene expression, chromatin accessibility and interactions, histone marks, transcription factor binding, tissue expression, and disease association. Our pipeline efficiently reduces large sets of associated SNPs to a handful for experimental validation, making it valuable for various genetic association studies.

We compiled high-quality SLE GWAS and candidate gene studies up to September 2021, ultimately defining 182 statistically-independent, non-HLA loci, totaling approximately 10,000 SNPs. Our pipeline tiered SNPs based on target gene expression (eQTL) and chromatin interactions (PCHiC). Of the 182 loci, 106 were flagged by ≥3 independent experimental methods, and six loci were flagged by all available experimental methods. These six SNPs were deemed to be the most significantly associated variants in our study.

Beyond the six most highly selected SNPs, our Tier1 hits and associated targets were very strongly enriched in immune-related genes. High-tier SNPs were also greatly enriched in SNPs flagged as deleterious by other methods. Overall, putative risk loci and target genes were overwhelmingly enriched in immune genes, with many being known risk genes for SLE, rheumatoid arthritis, systemic sclerosis, Crohn’s disease, Sjögren’s syndrome, primary biliary cholangitis, and particularly inflammatory bowel disease. We experimentally validated a high-priority SNP with CRISPR-Cas9 gene activation/silencing, confirming that this site indeed has dramatic enhancer activity, likely underlying SLE association. This experimental support for SNPs and loci prioritized by our analysis supports its utility in selecting likely underlying SNPs from GWAS peaks.

Our study provides valuable insights into the functional variants and target genes associated with SLE, but has two major limitations. Firstly, the sparse MPRA data utilized in our analysis may result in some loci having unflagged causal variants, potentially leading to missed associations. Secondly, the sparse caQTL data restricts the strongest conclusions to B-cells, limiting the generalizability of our findings to other cell types. To address these limitations, it is crucial to generate more MPRA data and caQTL data in diverse cell types. This would refine the existing loci, identify additional loci, and enhance the applicability of our pipeline to a broader range of diseases. Furthermore, future studies should focus on verifying causality and elucidating underlying biochemical mechanisms, utilizing our SLE dataset as a roadmap.

Another limitation of our study applies to all genetics projects: limited power to resolve rare variants. Various groups have shown that SLE risk loci, including our risk locus BANK1(45), are enriched in rare variants (sometimes strongly) associated with disease. Increasing sample sizes, and performing meta-analyses such as we do, increase power for resolving such associations, although follow-up candidate-gene experiments are required to analyze and validate rare variants. It is likely that some of our loci manifest at least somewhat through rare SNPs.

In conclusion, we demonstrate and validate a comprehensive analysis pipeline useful for diverse post-GWAS studies. We anticipate that the SLE dataset we generated will serve as a roadmap for future studies verifying causal relationships and uncovering the biochemical mechanisms underlying SLE and related diseases.

Supplementary Material

Supinfo1
Supinfo2
Supinfo3
Supinfo4
Supinfo5

Figure 2.

Distribution of 182 non-HLA SLE loci across the human genome. Loci colored by highest SNP tier. Tier1 names are common target genes from both eQTL and PCHiC data; other tiers are named by the closest positional gene. Loci with double dots have ≥1 significant experimentally validated (caQTL or MPRA) allele-specific SNP. Single dots mean no experimentally validated SNPs are yet known.

Funding.

Research reported in this publication was supported by National Institutes of Health grants R01AI172255 and R21AI168943. The content is solely the responsibility of the authors and does not necessarily reflect the official views of the National Institutes of Health.

Footnotes

Authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

REFERENCES

1. Kuo CF, Grainge MJ, Valdes AM, See LC, Luo SF, Yu KH, et al. Familial Aggregation of Systemic Lupus Erythematosus and Coaggregation of Autoimmune Diseases in Affected Families. JAMA Intern Med. 2015;175(9):1518–26. Epub 2015/07/21. doi: 10.1001/jamainternmed.2015.3528.
2. Deapen D, Escalante A, Weinrib L, Horwitz D, Bachman B, Roy-Burman P, et al. A revised estimate of twin concordance in systemic lupus erythematosus. Arthritis Rheum. 1992;35(3):311–8.
3. Lawrence JS, Martins CL, Drake GL. A family survey of lupus erythematosus. 1. Heritability. J Rheumatol. 1987;14(5):913–21.
4. Bentham J, Morris DL, Cunninghame Graham DS, Pinder CL, Tombleson P, Behrens TW, et al. Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nat Genet. 2015;47(12):1457–64. doi: 10.1038/ng.3434.
5. Molineros JE, Yang W, Zhou XJ, Sun C, Okada Y, Zhang H, et al. Confirmation of five novel susceptibility loci for systemic lupus erythematosus (SLE) and integrated network analysis of 82 SLE susceptibility loci. Hum Mol Genet. 2017;26(6):1205–16. doi: 10.1093/hmg/ddx026.
6. Sun C, Molineros JE, Looger LL, Zhou XJ, Kim K, Okada Y, et al. High-density genotyping of immune-related loci identifies new SLE risk variants in individuals with Asian ancestry. Nat Genet. 2016. doi: 10.1038/ng.3496.
7. Yin X, Kim K, Suetsugu H, Bang SY, Wen L, Koido M, et al. Meta-analysis of 208370 East Asians identifies 113 susceptibility loci for systemic lupus erythematosus. Ann Rheum Dis. 2020. Epub 2020/12/05. doi: 10.1136/annrheumdis-2020-219209.
8. Langefeld CD, Ainsworth HC, Cunninghame Graham DS, Kelly JA, Comeau ME, Marion MC, et al. Transancestral mapping and genetic load in systemic lupus erythematosus. Nat Commun. 2017;8:16021. doi: 10.1038/ncomms16021.
9. MacArthur DG, Manolio TA, Dimmock DP, Rehm HL, Shendure J, Abecasis GR, et al. Guidelines for investigating causality of sequence variants in human disease. Nature. 2014;508(7497):469–76. doi: 10.1038/nature13127.
10. Gallagher MD, Chen-Plotkin AS. The Post-GWAS Era: From Association to Function. Am J Hum Genet. 2018;102(5):717–30. doi: 10.1016/j.ajhg.2018.04.002.
11. Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun. 2017;8(1):1826. doi: 10.1038/s41467-017-01261-5.
12. Cano-Gamez E, Trynka G. From GWAS to Function: Using Functional Genomics to Identify the Mechanisms Underlying Complex Diseases. Front Genet. 2020;11:424. Epub 20200513. doi: 10.3389/fgene.2020.00424.
13. Hormozdiari F, van de Bunt M, Segre AV, Li X, Joo JWJ, Bilow M, et al. Colocalization of GWAS and eQTL Signals Detects Target Genes. Am J Hum Genet. 2016;99(6):1245–60. doi: 10.1016/j.ajhg.2016.10.003. Epub 2016 Nov 17.
14. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 2013;10(12):1213–8. doi: 10.1038/nmeth.2688. Epub 2013 Oct 6.
15. Javierre BM, Burren OS, Wilder SP, Kreuzhuber R, Hill SM, Sewitz S, et al. Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters. Cell. 2016;167(5):1369–84 e19. doi: 10.1016/j.cell.2016.09.037.
16. Tehranchi A, Hie B, Dacre M, Kaplow I, Pettie K, Combs P, et al. Fine-mapping cis-regulatory variants in diverse human populations. Elife. 2019;8. Epub 20190116. doi: 10.7554/eLife.39595.
17. Cavalli G, Misteli T. Functional implications of genome topology. Nat Struct Mol Biol. 2013;20(3):290–9. Epub 2013/03/07. doi: 10.1038/nsmb.2474.
18. Rao SS, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159(7):1665–80. doi: 10.1016/j.cell.2014.11.021. Epub 2014 Dec 11.
19. Inoue F, Ahituv N. Decoding enhancers using massively parallel reporter assays. Genomics. 2015;106(3):159–64. Epub 20150610. doi: 10.1016/j.ygeno.2015.06.005.
20. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020;369(6509):1318. doi: 10.1126/science.aaz1776.
21. Chandra V, Bhattacharyya S, Schmiedel BJ, Madrigal A, Gonzalez-Colin C, Fotsing S, et al. Promoter-interacting expression quantitative trait loci are enriched for functional genetic variants. Nature Genetics. 2021;53(1):110–9. doi: 10.1038/s41588-020-00745-3.
22. Mifsud B, Tavares-Cadete F, Young AN, Sugar R, Schoenfelder S, Ferreira L, et al. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat Genet. 2015. doi: 10.1038/ng.3286.
23. Munz M, Wohlers I, Simon E, Reinberger T, Busch H, Schaefer AS, et al. Qtlizer: comprehensive QTL annotation of GWAS results. Scientific Reports. 2020;10(1):20417. doi: 10.1038/s41598-020-75770-7.
24. Fonseka CY, Rao DA, Raychaudhuri S. Leveraging blood and tissue CD4+ T cell heterogeneity at the single cell level to identify mechanisms of disease in rheumatoid arthritis. Current opinion in immunology. 2017;49:27–36. doi: 10.1016/j.coi.2017.08.005.
25. Baxter JS, Leavy OC, Dryden NH, Maguire S, Johnson N, Fedele V, et al. Capture Hi-C identifies putative target genes at 33 breast cancer risk loci. Nat Commun. 2018;9(1):1028. doi: 10.1038/s41467-018-03411-9.
26. Lu X, Chen X, Forney C, Donmez O, Miller D, Parameswaran S, et al. Global discovery of lupus genetic risk variant allelic enhancer activity. Nat Commun. 2021;12(1):1611. Epub 2021/03/14. doi: 10.1038/s41467-021-21854-5.
27. Zhou Y, Sun Y, Huang D, Li MJ. epiCOLOC: Integrating Large-Scale and Context-Dependent Epigenomics Features for Comprehensive Colocalization Analysis. Frontiers in Genetics. 2020;11. doi: 10.3389/fgene.2020.00053.
28. Singh B, Maiti GP, Zhou X, Fazel-Najafabadi M, Bae SC, Sun C, et al. Lupus Susceptibility Region Containing CDKN1B rs34330 Mechanistically Influences Expression and Function of Multiple Target Genes, Also Linked to Proliferation and Apoptosis. Arthritis Rheumatol. 2021;73(12):2303–13. Epub 2021/05/14. doi: 10.1002/art.41799.
29. Kim-Howard X, Sun C, Molineros JE, Maiti AK, Chandru H, Adler A, et al. Allelic heterogeneity in NCF2 associated with systemic lupus erythematosus (SLE) susceptibility across four ethnic populations. Hum Mol Genet. 2014;23(6):1656–68. doi: 10.1093/hmg/ddt532.
30. Rusca N, Monticelli S. MiR-146a in Immunity and Disease. Mol Biol Int. 2011;2011:437301. Epub 20110407. doi: 10.4061/2011/437301.
31. Wang G, Tam LS, Li EK, Kwan BC, Chow KM, Luk CC, et al. Serum and urinary cell-free MiR-146a and MiR-155 in patients with systemic lupus erythematosus. J Rheumatol. 2010;37(12):2516–22. Epub 20101015. doi: 10.3899/jrheum.100308.
32. Odqvist L, Jevnikar Z, Riise R, Öberg L, Rhedin M, Leonard D, et al. Genetic variations in A20 DUB domain provide a genetic link to citrullination and neutrophil extracellular traps in systemic lupus erythematosus. Ann Rheum Dis. 2019;78(10):1363–70. Epub 20190712. doi: 10.1136/annrheumdis-2019-215434.
33. Devallière J, Charreau B. The adaptor Lnk (SH2B3): an emerging regulator in vascular cells and a link between immune and inflammatory signaling. Biochem Pharmacol. 2011;82(10):1391–402. Epub 20110624. doi: 10.1016/j.bcp.2011.06.023.
34. Molineros JE, Maiti AK, Sun C, Looger LL, Han S, Kim-Howard X, et al. Admixture mapping in lupus identifies multiple functional variants within IFIH1 associated with apoptosis, inflammation, and autoantibody production. PLoS Genet. 2013;9(2):e1003222. doi: 10.1371/journal.pgen.1003222.
35. Liu X, Li YI, Pritchard JK. Trans Effects on Gene Expression Can Drive Omnigenic Inheritance. Cell. 2019;177(4):1022–34.e6. doi: 10.1016/j.cell.2019.04.014.
36. Melnikov A, Murugan A, Zhang X, Tesileanu T, Wang L, Rogov P, et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nature Biotechnology. 2012;30(3):271–7. doi: 10.1038/nbt.2137.
37. Kumasaka N, Knights AJ, Gaffney DJ. Fine-mapping cellular QTLs with RASQUAL and ATAC-seq. Nature Genetics. 2016;48(2):206–13. doi: 10.1038/ng.3467.
38. Enomoto T, Ohashi K, Shibata R, Higuchi A, Maruyama S, Izumiya Y, et al. Adipolin/C1qdc2/CTRP12 Protein Functions as an Adipokine That Improves Glucose Metabolism. Journal of Biological Chemistry. 2011;286(40):34552–8. doi: 10.1074/jbc.M111.277319.
39. Georg I, Díaz-Barreiro A, Morell M, Pey AL, Alarcón-Riquelme ME. BANK1 interacts with TRAF6 and MyD88 in innate immune signaling in B cells. Cellular & Molecular Immunology. 2020;17(9):954–65. doi: 10.1038/s41423-019-0254-9.
40. Olsson LM, Johansson ÅC, Gullstrand B, Jönsen A, Saevarsdottir S, Rönnblom L, et al. A single nucleotide polymorphism in the NCF1 gene leading to reduced oxidative burst is associated with systemic lupus erythematosus. Annals of the Rheumatic Diseases. 2017;76(9):1607–13. doi: 10.1136/annrheumdis-2017-211287.
41. Graham RR, Kozyrev SV, Baechler EC, Reddy MV, Plenge RM, Bauer JW, et al. A common haplotype of interferon regulatory factor 5 (IRF5) regulates splicing and expression and is associated with increased risk of systemic lupus erythematosus. Nat Genet. 2006;38(5):550–5. doi: 10.1038/ng1782.
42. Lessard CJ, Adrianto I, Ice JA, Wiley GB, Kelly JA, Glenn SB, et al. Identification of IRF8, TMEM39A, and IKZF3-ZPBP2 as susceptibility loci for systemic lupus erythematosus in a large-scale multiracial replication study. Am J Hum Genet. 2012;90(4):648–60. doi: 10.1016/j.ajhg.2012.02.023.
43. Juang YT, Sumibcay L, Tolnay M, Wang Y, Kyttaris VC, Tsokos GC. Elf-1 binds to GGAA elements on the FcRgamma promoter and represses its expression. J Immunol. 2007;179(7):4884–9. doi: 10.4049/jimmunol.179.7.4884.
44. Bergtold A, Gavhane A, D'Agati V, Madaio M, Clynes R. FcR-bearing myeloid cells are responsible for triggering murine lupus nephritis. J Immunol. 2006;177(10):7287–95. doi: 10.4049/jimmunol.177.10.7287.
45. Jiang SH, Athanasopoulos V, Ellyard JI, Chuah A, Cappello J, Cook A, et al. Functional rare and low frequency variants in BLK and BANK1 contribute to human lupus. Nat Commun. 2019;10(1):2201. Epub 2019/05/19. doi: 10.1038/s41467-019-10242-9.
