Keywords: GWAS, Alport syndrome, IgA nephropathy, collagen type IV, hematuria, United Kingdom
Abstract
Background and objectives
Glomerular hematuria has varied causes but can have a genetic basis, including Alport syndrome and IgA nephropathy.
Design, setting, participants, & measurements
We used summary statistics to identify genetic variants associated with hematuria in White British UK Biobank participants. Individuals with glomerular hematuria were enriched by excluding participants with genitourinary conditions. A strongly associated locus on chromosome 2 (COL4A4-COL4A3) was identified. The region was reimputed using the Trans-Omics for Precision Medicine Program followed by sequential rounds of regional conditional analysis, conditioning on previous genetic signals. Similarly, we applied conditional analysis to identify independent variants in the MHC region on chromosome 6 using imputed HLA haplotypes.
Results
In total, 16,866 hematuria cases and 391,420 controls were included. Cases had a higher urinary albumin-creatinine ratio compared with controls (women: 13.01 mg/g [8.05–21.33] versus 12.12 mg/g [7.61–19.29]; P<0.001; men: 8.85 mg/g [5.66–16.19] versus 7.52 mg/g [5.04–12.39]; P<0.001) and lower eGFR (women: 88±14 versus 91±13 ml/min per 1.73 m2; P<0.001; men: 87±15 versus 90±13 ml/min per 1.73 m2; P<0.001), supporting enrichment of glomerular hematuria. Variants at six loci (PDPN, COL4A4-COL4A3, HLA-B, SORL1, PLLP, and TGFB1) met genome-wide significance (P<5E-8). At chromosome 2, COL4A4 p.Ser969X (rs35138315; minor allele frequency=0.00035; P=7.95E-35; odds ratio, 87.3; 95% confidence interval, 47.9 to 159.0) had the most significant association, and two variants in the locus remained associated with hematuria after conditioning on this variant: COL4A3 p.Gly695Arg (rs200287952; minor allele frequency=0.00021; P=2.16E-7; odds ratio, 45.5; 95% confidence interval, 11.8 to 168.0) and a common COL4A4 intron 25 variant (not previously reported; rs58261427; minor allele frequency=0.214; P=2.00E-9; odds ratio, 1.09; 95% confidence interval, 1.06 to 1.12). Of the HLA haplotypes, HLA-B (*0801; minor allele frequency=0.14; P=4.41E-24; odds ratio, 0.84; 95% confidence interval, 0.82 to 0.88) displayed the most statistically significant association. For remaining loci, we identified three novel associations, which were replicated in the deCODE dataset for dipstick hematuria (nearest genes: PDPN, SORL1, and PLLP).
Conclusions
Our study identifies six loci associated with hematuria, including independent variants in COL4A4-COL4A3 and HLA-B. Additionally, three novel loci are reported, including an association with an intronic variant in PDPN expressed in the podocyte.
Podcast
This article contains a podcast at https://www.asn-online.org/media/podcast/CJASN/2022_04_26_CJN13711021.mp3
Introduction
Hematuria can originate from any location throughout the genitourinary tract. Glomerular hematuria has varied causes but is characteristic of Alport syndrome, IgA nephropathy, C3 nephropathy, and postinfectious GN (1). The genetic basis of Alport syndrome is well described, attributed to pathogenic variants in the type IV collagen genes, COL4A3, COL4A4, and COL4A5, encoding major glomerular basement membrane proteins (2,3). By contrast, the genetic basis of IgA nephropathy is more complex, with genome-wide association studies (GWAS) identifying numerous loci with small to moderate effects including in the MHC region of chromosome 6 (4). Recent GWAS using the UK Biobank and deCODE datasets (large genetically characterized adult cohorts intended to be a sample of the general population in the United Kingdom and Iceland, respectively) report COL4A3 and COL4A4 variants (which are adjacent) as statistically significant associated signals with hematuria or albuminuria (5). For Alport syndrome, the prevalence has been estimated to be approximately one in 50,000, accounting for the most severe cases that come to specialist attention (5). By contrast, population-based studies using deCODE and the UK Biobank data have illuminated the breadth of phenotypes in Alport syndrome (5,6). Concurrently, several family studies demonstrate that variants in COL4A3/4/5 masquerade as FSGS, once thought to be pathologically distinct from Alport syndrome (7,8).
A GWAS of dipstick hematuria by deCODE identified three loci, including a region encompassing COL4A4-COL4A3 and the MHC locus (9). Although no dedicated GWAS of hematuria has been reported in the UK Biobank, four groups have documented different variants in or near COL4A4-COL4A3 as strongly associated with albuminuria (10–13). For these reports, differing results despite using the same cohort could be accounted for by methodologic differences, including varying minor allele frequency (MAF) thresholds, use of directly genotyped data or differing reference datasets for imputation, and inclusion of albuminuria values below the limit of detection. Given the similar odds ratios (ORs), MAFs, and P values among the associated variants, we postulate that the different associations in COL4A4-COL4A3 were likely in strong linkage disequilibrium, tagging one variant, COL4A4 p.Ser969x (rs35138315), reported in multiple family studies of Alport syndrome.
To explore this hypothesis, we performed GWAS and conditional regional association analysis restricted to variants in a 2-Mb window around COL4A3 and COL4A4 with hematuria in the UK Biobank (13). To identify independent (i.e., not in linkage disequilibrium) single-nucleotide polymorphisms (SNPs), analysis was sequentially repeated using significant association signals from the previous analysis as covariates (14,15). A similar approach was subsequently applied to the MHC region, which was a strongly associated locus from our broader GWAS.
Materials and Methods
Study Design
The study was approved by the Toronto General Hospital Research Ethics Board (21–5361.0). Summary statistics from PheWeb were used to identify variants associated with hematuria in White British participants from the UK Biobank. The first four principal components of genetic ancestry were derived in cases and controls, which were then included in the model as covariates to adjust for population stratification. Cases and controls show evidence of overlapping distributions in terms of principal component space (genomic control λ=1.048) (Supplemental Figure 1). Subsequently, conditional regional association analysis restricting to neighboring genes on chromosome 2q36.3, COL4A4-COL4A3, chromosome 6 encompassing the MHC locus, and all other genome-wide significant loci was performed sequentially to identify independent variants in these regions.
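The genomic control statistic quoted above can be illustrated with a minimal sketch. It assumes the usual definition, the median of the observed 1-degree-of-freedom chi-square association statistics divided by the expected median under the null (about 0.455). The function name and the toy statistics are hypothetical and are not taken from the study data.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom,
# i.e. the expected median test statistic under the null hypothesis.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats):
    """Return lambda_GC = median(observed chi-square statistics) / expected median."""
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Toy statistics (made up for illustration, not UK Biobank results).
toy_stats = [0.10, 0.30, 0.4549364, 0.90, 2.50]
print(genomic_control_lambda(toy_stats))
```

A value close to 1 suggests little inflation of test statistics from residual population stratification or relatedness.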
UK Biobank Setting
Briefly, the UK Biobank is a prospective cohort study involving approximately 500,000 UK adults between the ages of 40 and 69 years at the time of recruitment, in whom genetic and phenotypic data are collected (16). Centrally collected and analyzed clinical/genetic data are available to investigators who apply for access, and summary statistics for thousands of traits are available through various resources, including PheWeb (https://pheweb.sph.umich.edu/). UK Biobank procedures include collection of phenotypic data and biologic samples for all participants at baseline, through questionnaires and linkage to health records.
Whole-Genome Genotyping and Centralized Analysis
The UK Biobank performed genotyping at the Affymetrix Research Services Laboratory, and array information has been previously described (16). The Haplotype Reference Consortium (HRC) data, consisting of primarily European genetic ancestry individuals and 39,235,157 SNPs, served as the main imputation reference panel. Imputation was also performed using the merged UK10K and the 1000 Genomes phase 3 reference panels (17,18). Imputed data were combined when an SNP was present in both panels.
For our analysis, we used the genome-wide summary statistics from PheWeb for hematuria on the basis of the HRC-imputed dataset. We supplemented the analysis with conditional analyses using individual-level data using Trans-Omics for Precision Medicine Program (TOPMed)–imputed genotypes for the chromosome 2 signal as well as UK Biobank–provided HLA haplotypes for the HLA chromosome 6 signal (described below).
Trans-Omics for Precision Medicine Program Imputation of the Chromosome 2:226000000–228000000 Region (GRCh38) of the UK Biobank
We used the TOPMed imputation server (https://imputation.biodatacatalyst.nhlbi.nih.gov/#!) to impute genetic variants in the chr2:226000000–228000000 region from the UK Biobank array data in batches of 20,000 individuals at a time given that the individual-level TOPMed-imputed UK Biobank dataset is not currently available (12). The batches were subsequently merged.
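The batching step can be sketched as follows. This is only an illustration of splitting a sample list into batches of 20,000 and merging the results afterward; the function names and sample identifiers are hypothetical, and the actual submission to the imputation server is not shown.

```python
from typing import Iterable, Iterator, List, Sequence

BATCH_SIZE = 20_000  # batch size stated in the text


def make_batches(sample_ids: Sequence[str], batch_size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `batch_size` sample identifiers."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(sample_ids), batch_size):
        yield list(sample_ids[start:start + batch_size])


def merge_batches(batches: Iterable[List[str]]) -> List[str]:
    """Concatenate batches, refusing to merge if a sample appears twice."""
    merged: List[str] = []
    seen = set()
    for batch in batches:
        for sample_id in batch:
            if sample_id in seen:
                raise ValueError(f"duplicate sample across batches: {sample_id}")
            seen.add(sample_id)
            merged.append(sample_id)
    return merged


# Hypothetical identifiers, for illustration only.
ids = [f"S{i}" for i in range(45_000)]
batches = list(make_batches(ids))
print([len(b) for b in batches])
```

Checking for duplicates at merge time is a cheap guard against the same individual being imputed in two batches.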
Imputed Histocompatibility Leukocyte Antigen Haplotypes
The HLA region on chromosome 6 was further examined using the provided HLA-imputed haplotypes field 22182 (16). HLA imputation was conducted by the UK Biobank using HLA*IMP:02 with modified settings using a set of genetically diverse reference datasets (19). The imputation procedure and quality control steps have been described (Motyer et al., unpublished data). In brief, for each locus, only the individuals who had laboratory-based HLA types for that locus and only the SNPs that were polymorphic and were typed in at least 98% of that set of individuals were included.
Phenotype Definition for the Case-Control Study
The R package PheWAS (https://github.com/PheWAS/PheWAS/blob/master/inst/doc/PheWAS-package.pdf) was used to map ICD9 and ICD10 codes (https://phewascatalog.org/phecodes; https://phewascatalog.org/phecodes_icd10cm) to phecode 593 “hematuria” (20). Phecodes are a high-throughput phenotyping tool on the basis of ICD codes that have been previously curated to rapidly define the status of thousands of clinically meaningful diseases and conditions.
Cases had the following ICD9 or ICD10 codes: “hematuria,” 599.7 (ICD9) and R31 (ICD10); “hematuria unspecified,” 599.70 (ICD9) and R31.9 (ICD10); and “gross hematuria,” 599.71 (ICD9) and R31.0 (ICD10).
To enrich for glomerular hematuria, individuals with any genitourinary phecode were excluded (that is, individuals with a phecode in the range 590–593.99; the ICD9 and ICD10 codes underlying each phecode in this range were extracted to obtain the full list). Individuals who were not designated as a case and who did not meet the exclusion codes were used as controls.
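The case, exclusion, and control logic can be sketched as below. The sketch assumes the common PheWAS convention in which case status is assigned first and the exclusion range is then applied to the remaining individuals; the function name and the example inputs are hypothetical, and the real mapping was done with the PheWAS R package rather than with this code.

```python
# ICD9 and ICD10 codes listed in the text as defining hematuria cases.
HEMATURIA_ICD = {"599.7", "599.70", "599.71", "R31", "R31.9", "R31.0"}

# Genitourinary phecode exclusion range given in the text.
EXCLUSION_RANGE = (590.0, 593.99)


def classify(icd_codes, phecodes):
    """Return 'case', 'excluded', or 'control' for one participant.

    icd_codes: iterable of ICD9/ICD10 code strings recorded for the participant.
    phecodes:  iterable of phecode strings already mapped for the participant.
    Assumption: case status takes precedence over the exclusion range.
    """
    if HEMATURIA_ICD & set(icd_codes):
        return "case"
    low, high = EXCLUSION_RANGE
    if any(low <= float(code) <= high for code in phecodes):
        return "excluded"
    return "control"


# Toy participants (made up for illustration).
print(classify(["R31.9"], ["593"]))   # hematuria code present
print(classify([], ["591"]))          # other genitourinary phecode only
print(classify(["I10"], ["401.1"]))   # no hematuria or genitourinary code
```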
Clinical characteristics for cases and controls were determined. Urine albumin-creatinine ratios were computed as (microalbumin in milligrams per liter)/(creatinine in micromoles per liter per 1000). Individuals marked as having a microalbumin value below 6.7 mg/L (field 30505) were assigned a microalbumin value of 6.7 mg/L. eGFR was calculated in two ways: using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation and as 100/cystatin C (milligrams per liter) (21).
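A minimal sketch of the two simpler calculations follows. It applies the ratio exactly as worded above, with no further unit conversion, and substitutes 6.7 mg/L when a measurement is flagged as below the detection limit. The function names and the `below_detection` flag are hypothetical, and the CKD-EPI equation is not reproduced here.

```python
MICROALBUMIN_FLOOR_MG_L = 6.7  # value assigned to measurements flagged as below 6.7 mg/L


def albumin_creatinine_ratio(microalbumin_mg_l, creatinine_umol_l, below_detection=False):
    """Ratio as worded in the text: microalbumin (mg/L) / (creatinine (umol/L) / 1000)."""
    if creatinine_umol_l <= 0:
        raise ValueError("creatinine must be positive")
    if below_detection or microalbumin_mg_l is None:
        albumin = MICROALBUMIN_FLOOR_MG_L
    else:
        albumin = microalbumin_mg_l
    return albumin / (creatinine_umol_l / 1000)


def egfr_from_cystatin_c(cystatin_c_mg_l):
    """Simple cystatin C-based estimate used in the text: 100 / cystatin C (mg/L)."""
    if cystatin_c_mg_l <= 0:
        raise ValueError("cystatin C must be positive")
    return 100 / cystatin_c_mg_l


# Toy values (made up for illustration).
print(albumin_creatinine_ratio(20.0, 10_000.0))
print(albumin_creatinine_ratio(None, 6_700.0, below_detection=True))
print(egfr_from_cystatin_c(0.9))
```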
Statistical Methods
SAIGE (version 0.44.5) was used to test variants for association with hematuria in the White British subset of the UK Biobank, where the effect allele is the nonreference allele (22).
Variables
The following covariates were included in analysis: sex, birth year, and the first four genetic ancestry principal components computed on the White British subset. Because SAIGE uses a genetic relationship matrix to properly account for relatedness, we retained related individuals. SAIGE performs QR transformation (decomposition of the covariate matrix to solve the linear least squares problem) of the covariate matrix by default, which adjusts for categorical covariates, such as birth year (23). For sex-stratified analyses on chromosome X, sex was not included as a covariate.
LocusZoom plots were generated for loci of interest (24).
deCODE Replication
Methods used in hematuria GWAS using the deCODE dataset have been previously described (9). In total, 151,677 Icelanders have been genotyped using Illumina SNP chips and genotype probabilities for untyped first- and second-degree relatives of chip-typed individuals calculated on the basis of the Icelandic genealogy database. Whole-genome sequence is also available for 15,220 Icelanders, and this was used for imputation. Hematuria phenotyping was performed with urine dipstick tests obtained from two laboratories in Iceland. Categorical analyses were performed, and cases were defined as (1) mild (at least one measurement of + or greater versus negative controls) or (2) moderate/severe (at least one measurement of ++ or greater versus negative controls).
Results
Participants
We assessed 16,866 (7074 women + 9792 men) hematuria cases and 391,420 (213,407 women + 178,013 men) controls, which includes related individuals. Individuals with any genitourinary phecodes were excluded, enriching for glomerular hematuria cases. The descriptive characteristics of the study population by genetic sex are presented in Table 1. There were significantly more men (9792) than women (7074) among cases (P<0.001). The overall rate of hematuria was 4% (16,866 of 408,286) in the White British subset. The mean ages at the baseline visit were 61±7 years (women) and 60±7 years (men) for cases and 57±8 years (women) and 57±8 years (men) for controls. All comparisons were significantly different. The urinary albumin-creatinine ratio was significantly higher in cases (women: 13.01 mg/g [8.05–21.33]; men: 8.85 mg/g [5.66–16.19]) compared with controls (women: 12.12 mg/g [7.61–19.29]; men: 7.52 mg/g [5.04–12.39]) (P<0.001). The eGFR by CKD-EPI was significantly lower in cases (women: 88±14 ml/min per 1.73 m2; men: 87±15 ml/min per 1.73 m2) compared with controls (women: 91±13 ml/min per 1.73 m2; men: 90±13 ml/min per 1.73 m2), with similar results for 100/serum cystatin C. Systolic BP was statistically significantly higher in cases compared with controls, although the difference was small in magnitude. The use of antihypertensive agents and body mass index were higher in cases than controls, separately by sex. Hearing aid use and self-reported hearing difficulty were more frequent in cases than controls, also separately by sex. Nonsex-specific characteristics of cases and controls are shown in Supplemental Table 1.
Table 1.
Variable | Women Only | Men Only | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Cases with Nonmissing Data | Controls with Nonmissing Data | Cases | Controls | t Test P Value | Cases with Nonmissing Data | Controls with Nonmissing Data | Cases | Controls | t Test P Value | |
Total | 7074 | 213,407 | 7074 | 213,407 | — | 9792 | 178,013 | 9792 | 178,013 | — |
Age at baseline, yr, mean ± SD | 7074 | 213,407 | 61±7 | 57±8 | <0.001 | 9792 | 178,013 | 60±7 | 57±8 | <0.001 |
No. of individuals on 2 arrays | 7074 | 213,407 | UK BiLEVE: 791; UK Biobank Axiom: 6283 | UK BiLEVE: 21,448; UK Biobank Axiom: 191,959 | 0.002 (chi square) | 9792 | 178,013 | UK BiLEVE: 1252; UK Biobank Axiom: 8540 | UK BiLEVE: 21,094; UK Biobank Axiom: 156,919 | 0.005 (chi square) |
PC1, mean ± SD | 7074 | 213,407 | −12.4±1.60 | −12.36±1.61 | 0.14 | 9792 | 178,013 | −12.3±1.62 | −12.36±1.61 | 0.42 |
PC2, mean ± SD | 7074 | 213,407 | 3.78±1.49 | 3.78±1.51 | 0.93 | 9792 | 178,013 | 3.78±1.50 | 3.78±1.50 | 0.95 |
Pulse rate, automated reading at baseline, bpm, mean ± SD | 6565 | 199,182 | 71±11 | 71±11 | <0.001 | 9166 | 166,507 | 69±13 | 68±12 | <0.001 |
Systolic BP, automated reading at baseline, mm Hg, mean ± SD | 6565 | 199,180 | 140±21 | 138±20.2 | <0.001 | 9166 | 166,502 | 145±19.1 | 14±18 | <0.001 |
Diastolic BP, automated reading at baseline, mm Hg, mean ± SD | 6565 | 199,182 | 81±11 | 81±10.5 | 0.09 | 9166 | 166,507 | 84±11 | 84±11 | 0.63 |
Taking BP medication at baseline | 7061 | 213,030 | 1010 | 21,438 | <0.001 (chi square) | 9781 | 177,726 | 1326 | 17,454 | <0.001 (chi square) |
Standing height at baseline, cm, mean ± SD | 7046 | 213,014 | 162±6 | 163±6 | <0.001 | 9752 | 177,618 | 175±7 | 176±7 | <0.001 |
Weight at baseline, kg, mean ± SD | 7041 | 212,861 | 72±14 | 72±14 | <0.001 | 9740 | 177,510 | 88±15 | 86±14 | <0.001 |
BMI at baseline, mean ± SD | 7038 | 212,800 | 27.5±5.34 | 27.0±5.12 | <0.001 | 9735 | 177,433 | 28.5±4.56 | 27.8±4.21 | <0.001 |
Urinary albumin-creatinine ratio, mg/g, median (1st to 3rd quartile)a | 6866 | 206,796 | 13.01 (8.05–21.33) | 12.12 (7.61–19.29) | <0.001 (Mann–Whitney U test) | 9533 | 173,444 | 8.85 (5.66–16.19) | 7.52 (5.04–12.39) | <0.001 (Mann–Whitney U test) |
eGFR, ml/min per 1.73 m2, mean ± SDb | 7074 | 213,407 | 88±14 | 91±13 | <0.001 | 9792 | 178,013 | 87±15 | 90±13 | <0.001 |
eGFR computed as 100/serum cystatin C (mg/L) at baseline, mean ± SD | 6771 | 203,329 | 112±19 | 117±18 | <0.001 | 9351 | 169,805 | 104±17 | 109±16 | <0.001 |
Hearing aid user (yes; no) | 3964 | 120,298 | 267; 3697 | 5367; 114,931 | <0.001 (chi square) | 6462 | 115,555 | 542; 5920 | 6509; 109,046 | <0.001 (chi square) |
Hearing difficulty/problems via touchscreen (yes; no) | 6755 | 203,992 | 1757; 4998 | 43,786; 160,206 | <0.001 (chi square) | 9464 | 171,974 | 3448; 6016 | 53,920; 118,054 | <0.001 (chi square) |
PC1, 1st principal component; PC2, 2nd principal component; BMI, body mass index.
Individuals marked as having a microalbumin value below 6.7 mg/L (field 30505) were assigned as having a microalbumin value of 6.7 mg/L. Then, the ratio was computed as (microalbumin in milligrams per liter)/(creatinine in micromoles per liter per 1000).
eGFR was calculated using the Chronic Kidney Disease Epidemiology Collaboration equation or as 100/serum cystatin C (milligrams per liter) (21).
Case-Control Study
Summary statistics from PheWeb, in which centrally HRC-imputed UK Biobank data were used, identified variants associated with hematuria in the White British UK Biobank subset. GWAS identified six loci at genome-wide significance (P<5E-8) (Figure 1, Table 2, Supplemental Figure 2) (22,25). The following genes were closest to the strongest associated variant at each locus: PDPN, IRS1, HLA-B, SORL1, PLLP, and TGFB1 (Figure 2). Of these, three have not been previously reported (closest genes: PDPN, SORL1, and PLLP).
Table 2.
Chromosome:Position | Reference/ Alternate Allele | rs Identification | Nearest Gene(s) and Location | Minor Allele Frequency (Minor Allele) | P Value | Odds Ratio (95% Confidence Interval) | Imputation Quality |
---|---|---|---|---|---|---|---|
GRCh37 | |||||||
1:13,921,934 | A/G | rs2885134 | PDPN (intron 1) | 0.029 (G) | 5.90E-9 | 1.22 (1.14 to 1.31) | 0.958 |
HLA-B*0801 | — | — | — | 0.14 | 4.41E-19 | 0.84 (0.82 to 0.88) | N/A |
11:121,584,573 | T/A | rs618048 | SORL1 (downstream 3′) | 0.22 (T) | 1.30E-11 | 0.91 (0.88 to 0.94) | 0.999 |
19:41,826,020 | A/C | rs56254331 | CCDC97 (intron 3), TGFB1 (downstream 3′) | 0.17 (C) | 8.80E-16 | 0.89 (0.86 to 0.91) | 0.999 |
16:57,349,346 | G/T | rs948705 | PLLP (intergenic with CCL22) | 0.15 (G) | 2.30E-9 | 1.10 (1.07 to 1.14) | 0.976 |
GRCh38 | |||||||
chr2:227052367 (most significant variant in locus) | G/C | rs35138315 | COL4A4 (p.Ser969x, exon 32) | 0.00035 (C) | 7.95E-35 | 87.3 (47.9 to 159.0) | 0.567 |
chr2:227077364 (results conditioned on rs35138315) | T/C | rs58261427 | COL4A4 (intron 25) | 0.214 (C) | 2.00E-9 | 1.09 (1.06 to 1.12) | 0.987 |
chr2:227277511 (results conditioned on rs35138315) | G/A | rs200287952 | COL4A3 (p.Gly695Arg, exon 28) | 0.00021 (A) | 2.16E-7 | 45.5 (11.8 to 168.0) | 0.268 |
For GRCh37, a genome-wide association study was performed using centrally Haplotype Reference Consortium–imputed UK Biobank data, in which variants are reported in reference to GRCh37. After conditioning on the most significant variant at each locus, all other associations in the region became nonsignificant. For GRCh38, reimputation of a 2-Mb region encompassing COL4A4-COL4A3 was performed using the Trans-Omics for Precision Medicine Program (reference GRCh38) followed by conditional regional analysis to identify independent associated variants. Reported odds ratios refer to the alternate allele.
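As a reading aid for Table 2, the sketch below shows two standard manipulations of a reported odds ratio. The first recovers an approximate Wald z statistic and two-sided P value from an odds ratio and its 95% confidence interval. The second re-expresses an odds ratio for the other allele by taking reciprocals. Both are generic textbook calculations, not the SAIGE test used in the study, and because published values are rounded the recovered P value is expected to agree with the table only roughly. Function names are hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_or_ci(odds_ratio, ci_low, ci_high):
    """Approximate (standard error, z, two-sided P) from an OR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


def flip_allele(odds_ratio, ci_low, ci_high):
    """Re-express an OR (and CI) for the other allele by taking reciprocals."""
    return 1 / odds_ratio, 1 / ci_high, 1 / ci_low


# Inputs taken from Table 2: rs56254331 (TGFB1) and rs618048 (SORL1).
se, z, p = wald_from_or_ci(0.89, 0.86, 0.91)
print(f"z={z:.2f}, P={p:.1e}")
print(tuple(round(x, 2) for x in flip_allele(0.91, 0.88, 0.94)))
```

The reciprocal step matters when comparing tables that report effects for different alleles, such as the alternate allele in one table and the minor allele in another.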
The strongest association was observed for rs71431010 (MAF=0.00055; P<0.001; OR, 105; 95% confidence interval [95% CI], 52 to 231), which is intergenic, and the closest gene is IRS1. This variant was also associated with albuminuria in a GWAS using the UK Biobank (13). The variant is 1.4 Mb from the COL4A4-COL4A3 region.
The associated region on chromosome 2q36.3, where COL4A4-COL4A3 resides, was examined more closely. COL4A4 p.Ser969x (rs35138315) is directly genotyped on the UK Biobank arrays but is not retained in the HRC-imputed dataset. This variant is important because it is associated with albuminuria in the UK Biobank, and it is reported in multiple cases of autosomal recessive Alport syndrome and autosomal dominant FSGS (11,26–29). Thus, we utilized TOPMed for regional imputation of this region of chromosome 2, enabling prediction of variants with lower frequency, including putative loss-of-function alleles (17,30).
We assessed the 2 Mb surrounding COL4A4-COL4A3 using TOPMed imputation in order to capture the rare variants in the locus, which identified 13 variants (P<0.001). The strongest associated variant was COL4A4 p.Ser969x (rs35138315; MAF=0.00035; P<0.001; OR, 87.3; 95% CI, 47.9 to 159.0) (Table 2). We then performed conditional regional association analysis conditioned on rs35138315, after which most variants were no longer significantly associated; this indicates strong linkage disequilibrium among them. Two additional variants, however, remained significant (P<0.001) and were not previously reported in UK Biobank studies (Figure 2, Table 2). These include COL4A3 p.Gly695Arg (rs200287952; MAF=0.00021; P<0.001; OR, 45.5; 95% CI, 11.8 to 168.0) and a common variant in intron 25 of COL4A4 (rs58261427; MAF=0.214; P<0.001; OR, 1.09; 95% CI, 1.06 to 1.12), which is ∼500 base pairs from the nearest exon (exon 25). Using pheweb.org, we interrogated results to examine the association of the three COL4A4-COL4A3 variants with other traits (Supplemental Table 2). Sex-stratified analysis of COL4A5 variants did not identify any statistically significant associations.
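The sequential conditioning procedure can be outlined with a short sketch. It assumes a caller-supplied function, here called `run_association`, that reruns the regional association test with a given set of variants as covariates and returns a P value per remaining variant. That function, the significance threshold, and the toy variant names are all hypothetical; the threshold actually applied to the conditional results is not restated here.

```python
from typing import Callable, Dict, List, Sequence


def sequential_conditional_analysis(
    run_association: Callable[[Sequence[str]], Dict[str, float]],
    threshold: float,
    max_rounds: int = 10,
) -> List[str]:
    """Return lead variants found by repeatedly conditioning on previous signals."""
    conditioned: List[str] = []
    for _ in range(max_rounds):
        pvalues = run_association(tuple(conditioned))
        candidates = {v: p for v, p in pvalues.items() if v not in conditioned}
        if not candidates:
            break
        lead, lead_p = min(candidates.items(), key=lambda item: item[1])
        if lead_p >= threshold:
            break  # nothing left that is significant after conditioning
        conditioned.append(lead)  # condition on this signal in the next round
    return conditioned


def toy_run_association(conditioned: Sequence[str]) -> Dict[str, float]:
    """Made-up results: 'proxy' only tags 'lead'; 'second' is an independent signal."""
    if not conditioned:
        return {"lead": 1e-30, "proxy": 1e-20, "second": 1e-9}
    if tuple(conditioned) == ("lead",):
        return {"proxy": 0.4, "second": 1e-9}
    return {"proxy": 0.5}


print(sequential_conditional_analysis(toy_run_association, threshold=5e-8))
```

In the toy run, the proxy variant loses its signal once the lead variant is included as a covariate, which is the behavior expected of variants in strong linkage disequilibrium with the lead.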
Of the other hematuria GWAS signals, the second strongest association was in HLA-B. We further performed conditional regional analysis using UK Biobank–provided imputed HLA haplotypes. Of 362 haplotypes, there were six associated at P<0.001, but none remained significant after conditional analysis with HLA-B*08:01 (MAF=0.14; P<0.001; OR, 0.84; 95% CI, 0.82 to 0.88) (Supplemental Table 3). HLA haplotypes are not single-nucleotide sequence variants, but rather, they are several changes in a haplotype and are denoted numerically (31). In the only other reported hematuria GWAS using deCODE data, association for a variant in the immediate neighboring gene on chromosome 6, HLA-C (6:31271845C>A; hg38; the authors stated that this is p.Asp33Tyr, but it is residue 9; MAF[C allele]=0.4275; P<0.001; OR, 0.92; 95% CI, 0.90 to 0.94), was reported (Table 3). The variant identified by deCODE is not available in either the HRC-imputed or HLA-imputed data in the UK Biobank. We replicated three loci (two of which are the same variant) in the deCODE hematuria GWAS (Table 3).
Table 3.
Gene or Loci | deCODE | UK Biobank | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Variant | Minor Allele | Minor Allele Frequency | P Value | Odds Ratio | 95% Confidence Interval | Phenotype | Variant | Minor Allele | Minor Allele Frequency | P Value | Odds Ratio | 95% Confidence Interval | |
Previously published by deCODE (9) | |||||||||||||
COL4A3 | Gly695Arg | A | 0.0003 | <0.001 | 5.46 | 2.94 to 10.16 | ≥++ versus − | Gly695Arg | A | 0.00021 | <0.001 | 45.5 | 11.8 to 168 |
COL4A3 | 2.5-kb deletion (Gly289_Lys330del) | — | 0.0006 | <0.001 | 11.78 | 7.26 to 19.11 | ≥++ versus − | Not examined | — | — | — | — | — |
COL4A3 | rs760545501 | T | 0.0017 | <0.001 | 2.36 | 1.80 to 3.09 | ≥++ versus − | Not examined | — | — | — | — | — |
TGFB1 | rs56254331 | C | 0.18 | <0.001 | 0.92 | 0.89 to 0.94 | ≥+ versus − | rs56254331 | C | 0.17 | <0.001 | 0.89 | 0.86 to 0.91 |
MHC | Chr 6:31271845(hg38) HLA-C | C | 0.43 | <0.001 | 0.92 | 0.90 to 0.94 | ≥+ versus − | HLA-B*08:01 | — | 0.14 | <0.001 | 0.84 | 0.82 to 0.88 |
Not previously reported by deCODE | |||||||||||||
SORL1 | rs618048 | T | 0.15 | <0.001 | 1.08 | 1.05 to 1.11 | ≥+ versus − | rs618048 | T | 0.22 | <0.001 | 1.10 | 1.08 to 1.11 |
PDPN | rs2885134 | G | 0.03 | 0.006 | 1.09 | 1.02 to 1.15 | ≥++ versus − | rs2885134 | G | 0.029 | <0.001 | 1.22 | 1.14 to 1.31 |
PLLP | rs948705 | G | 0.14 | <0.001 | 0.93 | 0.90 to 0.96 | ≥+ versus − | rs948705 | G | 0.15 | <0.001 | 0.91 | 0.89 to 0.92 |
Variants, minor allele frequencies, odds ratios, and 95% confidence intervals at associated loci are shown. Reported odds ratios refer to the minor allele. Hematuria is defined in deCODE by dipstick and quantitated as ≥+ or ≥++. —, minor allele (allele with frequency <50%).
For the remaining four loci, the same conditional regional approach was performed using the HRC-imputed White British UK Biobank subset with the strongest associated variant as a covariate (Table 2). After conditioning on the top variant at the locus, all other associations became nonsignificant (P<0.001).
Of these, three were associations with hematuria that have not been previously reported. Thus, these associations were analyzed in the deCODE dataset and were replicated with the same direction of effect, although only nominally for the PDPN variant (Table 3, Supplemental Table 4). Cell- and tissue-specific expression along with associated phenotypes from PheWeb are shown in Supplemental Table 5.
Discussion
Here, we present the first dedicated GWAS of hematuria using the UK Biobank data with several novel findings. We identify six loci associated with hematuria at genome-wide significance, including three that have not been previously reported (nearest genes: PDPN, SORL1, and PLLP). For the known locus on chromosome 2q36.3 encompassing COL4A4-COL4A3, regional reimputation and sequential association analysis were performed, leading to the identification of three independent SNPs: COL4A4 p.Ser969x (rs35138315), COL4A3 p.Gly695Arg (rs200287952), and a common variant not previously reported in intron 25 of COL4A4 (rs58261427). Similar fine-mapping analysis was performed for chromosome 6 encompassing the MHC region, identifying one previously unreported common haplotype, HLA-B*08:01, as driving the association with hematuria. Of the four remaining associated loci, three have not been previously reported, but we show that they are replicated in the deCODE dataset.
Hematuria is a typical manifestation observed with glomerular basement membrane defects as evidenced in Alport syndrome but variably in primary podocyte disorders. Of the three novel loci, one is an intronic variant in PDPN (rs2885134) (Table 2), which encodes podoplanin, reported to anchor to the podocyte cytoskeleton through binding with ezrin (32). PDPN is also reported to be expressed in podocytes by a single-cell sequencing atlas (http://humphreyslab.com/SingleCell/displaycharts.php). In Genotype-Tissue Expression (GTEx) V8, this SNP is a cis-eQTL for PDPN in aorta (P<0.001; with the minor allele G having lower expression; https://gtexportal.org/home/snp/rs2885134) (33). This finding thus potentially informs filtration barrier biology, highlighting a role of podocytes in protecting against glomerular hematuria.
The variant with the closest gene PLLP (rs948705) (Table 2) is a cis-eQTL for CX3CL1 (P<0.001; with the minor allele G having higher expression) in adrenal tissue from GTEx (https://gtexportal.org/home/snp/rs948705) (33). It is also in strong linkage disequilibrium with a pQTL for CX3CL1 in plasma (34).
For chromosome 2, the strongest associated variant was intergenic, closest to IRS1 with the HRC-imputed dataset, but reimputing the region with TOPMed reveals that the association is actually driven by COL4A4 p.Ser969x (rs35138315). Previous GWAS of albuminuria in the UK Biobank identified different SNPs in the region, with two studies reporting COL4A4 p.Ser969x (rs35138315) as a strongly associated signal (11,12). For these studies, direct genotyping or imputed data on the basis of TOPMed were used. The available HRC-imputed dataset that we used in our GWAS loses rs35138315, despite the fact that it was directly genotyped on the array. Our fine-mapping strategy demonstrates that the 13 associated variants on chromosome 2 are in linkage disequilibrium, revealing that only three are actually independently associated with hematuria. These three variants are observed to have an association with kidney disease–relevant or Alport syndrome traits (Supplemental Table 2).
The UK Biobank is not a random sample, with only a 6% participation rate, and the true prevalence of diseases could be higher (16,35). The COL4A4 p.Ser969x (rs35138315) has been reported as causing autosomal dominant FSGS and autosomal recessive Alport syndrome (26–29). In one report, rs35138315 was the most common pathogenic variant detected in a series of autosomal recessive Alport syndrome, accounting for 23% of cases (27). Some of these patients were homozygous for the variant, whereas others were compound heterozygotes with another COL4A4 variant. Five of six patients with this variant were recorded to have sensorineural hearing loss, and the ages of kidney failure, where known, were 18, 22, and 28 years.
Another independently associated variant is COL4A3 p.Gly695Arg (rs200287952), which has been implicated in thin basement membrane nephropathy and Alport syndrome by several groups (6,29,36,37). Recently, this variant has been shown to have a higher frequency in hematuria cases (0.13%) than controls (0.048%) from the 100,000 Genomes Project (6). Additionally, it is reported in a GWAS of dipstick hematuria using the deCODE dataset (MAF=0.00021; P<0.001; OR, 5.46; 95% CI, 2.94 to 10.16) (9). The third independent signal is a common COL4A4 intron 25 variant, which has not been previously associated with kidney phenotypes. Analysis of this variant in GTEx (https://gtexportal.org/home/snp/rs58261427) suggests that it is producing an alternative COL4A4 transcript from the fifth last exon (exon 44) in thyroid tissue. However, there is no independent evidence for this. Furthermore, it is also possible that this variant could be in linkage disequilibrium with a pathogenic coding sequence variant not genotyped or imputed, which can occur for a number of reasons, including if it is a structural variant.
The second most strongly associated locus in our hematuria GWAS was within the MHC region on chromosome 6 (HLA-B*0801; MAF=0.14; P<0.001; OR, 0.84; 95% CI, 0.82 to 0.88). At this locus, we also performed conditional analysis using imputed HLA haplotypes, which reduced six associated variants to only one (as a result of linkage disequilibrium) once the most statistically significant haplotype was conditioned on (38). In the only other reported hematuria GWAS, which used deCODE data, an association with a variant in a neighboring gene on chromosome 6, HLA-C, was described (Table 2) (9). This variant is at chromosome 6:31271845 (hg38); the reference allele, C, is also the minor allele, and the alternate allele is A. The variant identified by deCODE is not available in either the HRC-imputed or HLA-imputed data in the UK Biobank. Instead, the most significant association in the UK Biobank is with HLA-B*0801 in the immediately neighboring gene, which is approximately 80 kb away. The reported stratified analysis revealed nominally significant sex differences in effect sizes with OR of 0.92 (95% CI, 0.90 to 0.94; P<0.001) for men and OR of 0.90 (95% CI, 0.87 to 0.92; P<0.001) for women. Additionally, three GWAS of IgA nephropathy have revealed associations with variants in the chromosome 6 MHC region (39–41). These papers report associations with HLA-DQA1*0101, HLA-DQA1*0102, HLA-DQB1*0201, and HLA-DQB1*0301 (41,42). In the UK Biobank White British subset, there is no evidence for association of HLA-DQA1*0101 and HLA-DQA1*0102 with hematuria (Supplemental Table 3). The initial associations for HLA-DQB1*0201 (OR, 0.88; 95% CI, 0.86 to 0.91) and HLA-DQB1*0301 (OR, 1.07; 95% CI, 1.04 to 1.10) are lost after conditioning on HLA-B*0801, suggesting that they are due to linkage disequilibrium (Supplemental Table 3). The r2, D’, and haplotype frequencies for pairs of these haplotypes are shown in Supplemental Tables 6–10. HLA-B*0801 and HLA-DQB1*0201 are positively correlated, whereas HLA-B*0801 and HLA-DQB1*0301 are negatively correlated, and, as expected, HLA-DQB1*0201 and HLA-DQB1*0301 are negatively correlated. Identifying the causal variant(s) at MHC will benefit from higher-resolution imputation and/or direct HLA sequencing (43).
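The pairwise linkage disequilibrium measures mentioned above (r2 and D’) can be illustrated with a minimal Python sketch. It assumes each HLA haplotype is coded as present or absent, and that the joint haplotype frequency is already estimated. The function name `pairwise_ld` and the input frequencies are hypothetical toy values chosen for illustration; they are not taken from Supplemental Tables 6–10.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, signed D', r2) for two present/absent markers.

    p_ab: frequency of chromosomes carrying both alleles A and B
    p_a, p_b: marginal frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("marginal frequencies must lie strictly between 0 and 1")
    # D is the excess of the joint frequency over the independence expectation.
    d = p_ab - p_a * p_b
    # D' scales D by its largest attainable magnitude given the marginals.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = d / d_max  # sign is kept: positive or negative correlation
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2


# Toy inputs (hypothetical, for illustration only).
d_pos, dprime_pos, r2_pos = pairwise_ld(p_ab=0.15, p_a=0.20, p_b=0.30)
d_neg, dprime_neg, r2_neg = pairwise_ld(p_ab=0.00, p_a=0.20, p_b=0.30)
print(f"positive LD: D={d_pos:.3f}, D'={dprime_pos:.3f}, r2={r2_pos:.3f}")
print(f"negative LD: D={d_neg:.3f}, D'={dprime_neg:.3f}, r2={r2_neg:.3f}")
```

In this sketch, a positive D corresponds to haplotypes that tend to occur together, and a negative D to haplotypes that rarely or never share a chromosome, as expected for two alleles of the same gene.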
We replicated three loci in the deCODE hematuria GWAS. One locus is at the MHC region on chromosome 6 (Table 2). The other two involve the same variants reported by deCODE: an intronic variant in TGFB1 (rs56254331; MAF[C]=0.1833; P<0.001; OR, 0.92; 95% CI, 0.89 to 0.94), with the same direction of effect, and COL4A3 p.Gly695Arg (rs200287952; MAF=0.00021; P<0.001; OR, 3.41; 95% CI, 1.18 to 9.89) (Table 2) (9). TGFB1 rs56254331 has previously been associated with urinary albumin-creatinine ratio in the UK Biobank (44).
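As a rough consistency check on effect estimates reported as an OR with a 95% CI, the standard error of the log-OR and an approximate Wald statistic can be recovered from the interval. The Python sketch below applies this to the TGFB1 rs56254331 estimate quoted above (OR, 0.92; 95% CI, 0.89 to 0.94). The function name `wald_from_ci` is hypothetical, and the calculation assumes a symmetric Wald interval on the log scale. It is therefore only an approximation and does not reproduce the test used to generate the reported P values.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate log-OR, its standard error, Wald z, and two-sided P.

    Assumes the 95% CI was built as exp(log_or +/- z_crit * se).
    """
    log_or = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p


log_or, se, z, p = wald_from_ci(0.92, 0.89, 0.94)
print(f"log(OR)={log_or:.4f}, SE={se:.4f}, z={z:.2f}, approx P={p:.2e}")
```

Because published ORs and CIs are rounded, the recovered values are indicative only.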
A difference between the UK Biobank and the deCODE hematuria GWAS is the method of phenotyping. For deCODE, hematuria was defined by dipstick measurements (quantitated as ≥+ or ≥++), whereas for the UK Biobank, it was defined by ICD9/10 codes. In the UK Biobank, the hematuria rate was 4% (16,866 of 408,286). In deCODE, the rate was substantially higher, with 21% (18,839 of 87,742) having mild hematuria and 42% (49,212 of 118,115) having moderate or severe hematuria. Notably, the ORs at replicated loci were higher in the UK Biobank than in deCODE, likely reflecting the greater specificity of ICD9/10 codes compared with dipstick measurements for phenotyping.
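The hematuria rates quoted above follow directly from the case counts and cohort totals. A minimal Python sketch of that arithmetic is given here; the dictionary labels and the helper name `rate_percent` are illustrative assumptions, and the counts are copied from the text.

```python
# (cases, total) pairs as quoted in the text; labels are illustrative.
cohorts = {
    "UK Biobank, ICD9/10 codes": (16_866, 408_286),
    "deCODE, mild dipstick hematuria": (18_839, 87_742),
    "deCODE, moderate or severe dipstick hematuria": (49_212, 118_115),
}


def rate_percent(cases, total):
    """Return the proportion of cases as a percentage of the cohort total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * cases / total


for label, (cases, total) in cohorts.items():
    print(f"{label}: {rate_percent(cases, total):.1f}% ({cases:,} of {total:,})")
```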
The clinical characteristics in Table 1 also support that the approach of using ICD9/10 codes to enrich for glomerular hematuria cases is specific, with cases compared with controls having higher urinary albumin-creatinine ratios and lower eGFR measurements.
In the future, sequencing, as opposed to genotyping and imputation, in the UK Biobank will provide more resolution at associated regions, addressing a limitation of this study. To maximize power, we only included data for the largest genetic ancestry group (White British), which limits generalizability to other ancestral groups.
In summary, the UK Biobank is a genetically characterized population-based cohort that has facilitated a range of large-scale GWAS. We present results of an analysis focused on glomerular-enriched hematuria, revealing novel loci and insights into glomerular filtration barrier biology. Additionally, we provide independent associated SNPs, including novel ones, within the strongest associated regions, the region that encompasses COL4A4-COL4A3 and the MHC region. This work is important to understand how to prioritize which variants to study, a successful approach in cystic fibrosis where different classes of therapies are dictated by the pathogenic variant and purported mechanism of action (45).
This manuscript was written to conform to STREGA recommendations (46).
Disclosures
M. Barua reports ownership interest in AstraZeneca, honoraria from Natera, serving on the editorial board of Glomerular Diseases, and speakers bureau for Sanofi. D.F. Gudbjartsson, K. Stefansson, P. Sulem, and G. Sveinbjornsson report employment by deCODE Genetics, which is fully owned by Amgen. All remaining authors have nothing to disclose.
Funding
M. Barua and A.D. Paterson were coprincipal investigators on a grant funded by the Alport Syndrome Foundation in 2020 and are supported by the Canadian Institutes of Health Research Project Grants 437528 and 427165. Funding has also been donated by the Toronto General Hospital Foundation. This work was also supported by FRQS.
Acknowledgments
We thank the UK Biobank, TOPMed, deCODE, and the study participants. The UK Biobank resource was used under accounts 48839 (to M. Barua) and 66222 (to S.A. Gagliano Taliun). The GTEx Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health and by the National Cancer Institute; National Human Genome Research Institute; National Heart, Lung, and Blood Institute; National Institute on Drug Abuse; National Institute of Mental Health; and National Institute of Neurological Disorders and Stroke.
S.A. Gagliano Taliun is funded by Fonds de Recherche du Québec – Santé Junior 1 Award and by operational funds from the Institut de valorisation des données (IVADO).
The data used for the analyses described in the manuscript were obtained from the web links included in the text on July 6, 2021.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Footnotes
Published online ahead of print. Publication date available at www.cjasn.org.
Author Contributions
A.D. Paterson conceptualized the study; M. Barua, S.A. Gagliano Taliun, D.F. Gudbjartsson, K. Stefansson, P. Sulem, and G. Sveinbjornsson were responsible for data curation; M. Barua, S.A. Gagliano Taliun, D.F. Gudbjartsson, K. Stefansson, P. Sulem, and G. Sveinbjornsson were responsible for investigation; M. Barua and S.A. Gagliano Taliun were responsible for formal analysis; M. Barua, S.A. Gagliano Taliun, and A.D. Paterson were responsible for visualization; M. Barua wrote the original draft; and S.A. Gagliano Taliun and A.D. Paterson reviewed and edited the manuscript.
Data Sharing Statement
All data used in this study are available in this article.
Supplemental Material
This article contains the following supplemental material online at http://cjasn.asnjournals.org/lookup/suppl/doi:10.2215/CJN.13711021/-/DCSupplemental.
Supplemental Figure 1. PCA1-4 in cases and controls in the UK Biobank (available at UK Biobank data field 22009).
Supplemental Figure 2. Q-Q plot for centrally HRC-imputed hematuria GWAS.
Supplemental Table 1. Descriptive characteristics of the UK Biobank White British subset used in the hematuria analyses.
Supplemental Table 2. Association of three independent COL4A4-COL4A3 variants with other kidney disease–relevant or Alport syndrome traits using the TOPMed-imputed UK Biobank dataset.
Supplemental Table 3. GWAS via SAIGE using HLA haplotypes provided by the UK Biobank on White British for ICD-based case-control hematuria.
Supplemental Table 4. Novel hematuria loci in the UK Biobank are replicated in the deCODE dataset, although association with variant and nearest gene PDPN is nominal.
Supplemental Table 5. Cell- and tissue-specific expression of novel loci in the hematuria GWAS.
Supplemental Table 6. Estimated pairwise linkage disequilibrium between UK Biobank–imputed haplotypes HLA-B*0801, HLA-DQB1*0201, and HLA-DQB1*0301 for the White British hematuria cases and controls.
Supplemental Table 7. HLA haplotype frequencies.
Supplemental Table 8. Estimated haplotype frequencies for HLA-B*0801 and HLA-DQB1*0201 in White British in the UK Biobank.
Supplemental Table 9. Estimated haplotype frequencies for HLA-B*0801 and HLA-DQB1*0301 in White British in the UK Biobank.
Supplemental Table 10. Estimated haplotype frequencies for HLA-DQB1*0201 and HLA-DQB1*0301 in White British in the UK Biobank.
References
- 1. Ingelfinger JR: Hematuria in adults. N Engl J Med 385: 153–163, 2021
- 2. Lemmink HH, Mochizuki T, van den Heuvel LP, Schröder CH, Barrientos A, Monnens LA, van Oost BA, Brunner HG, Reeders ST, Smeets HJ: Mutations in the type IV collagen alpha 3 (COL4A3) gene in autosomal recessive Alport syndrome. Hum Mol Genet 3: 1269–1273, 1994
- 3. Mochizuki T, Lemmink HH, Mariyama M, Antignac C, Gubler MC, Pirson Y, Verellen-Dumoulin C, Chan B, Schröder CH, Smeets HJ, Reeders ST: Identification of mutations in the alpha 3(IV) and alpha 4(IV) collagen genes in autosomal recessive Alport syndrome. Nat Genet 8: 77–81, 1994
- 4. Sanchez-Rodriguez E, Southard CT, Kiryluk K: GWAS-based discoveries in IgA nephropathy, membranous nephropathy, and steroid-sensitive nephrotic syndrome. Clin J Am Soc Nephrol 16: 458–466, 2021
- 5. Barua M, Paterson AD: Population-based studies reveal an additive role of type IV collagen variants in hematuria and albuminuria. Pediatr Nephrol 37: 253–262, 2022
- 6. Gibson J, Fieldhouse R, Chan MMY, Sadeghi-Alavijeh O, Burnett L, Izzi V, Persikov AV, Gale DP, Storey H, Savige J; Genomics England Research Consortium: Prevalence estimates of predicted pathogenic COL4A3-COL4A5 variants in a population sequencing database and their implications for Alport syndrome. J Am Soc Nephrol 32: 2273–2290, 2021
- 7. Yao T, Udwan K, John R, Rana A, Haghighi A, Xu L, Hack S, Reich HN, Hladunewich MA, Cattran DC, Paterson AD, Pei Y, Barua M: Integration of genetic testing and pathology for the diagnosis of adults with FSGS. Clin J Am Soc Nephrol 14: 213–223, 2019
- 8. Voskarides K, Damianou L, Neocleous V, Zouvani I, Christodoulidou S, Hadjiconstantinou V, Ioannou K, Athanasiou Y, Patsias C, Alexopoulos E, Pierides A, Kyriacou K, Deltas C: COL4A3/COL4A4 mutations producing focal segmental glomerulosclerosis and renal failure in thin basement membrane nephropathy. J Am Soc Nephrol 18: 3004–3016, 2007
- 9. Benonisdottir S, Kristjansson RP, Oddsson A, Steinthorsdottir V, Mikaelsdottir E, Kehr B, Jensson BO, Arnadottir GA, Sulem G, Sveinbjornsson G, Kristmundsdottir S, Ivarsdottir EV, Tragante V, Gunnarsson B, Runolfsdottir HL, Arthur JG, Deaton AM, Eyjolfsson GI, Davidsson OB, Asselbergs FW, Hreidarsson AB, Rafnar T, Thorleifsson G, Edvardsson V, Sigurdsson G, Helgadottir A, Halldorsson BV, Masson G, Holm H, Onundarson PT, Indridason OS, Benediktsson R, Palsson R, Gudbjartsson DF, Olafsson I, Thorsteinsdottir U, Sulem P, Stefansson K: Sequence variants associating with urinary biomarkers. Hum Mol Genet 28: 1199–1211, 2019
- 10. Haas ME, Aragam KG, Emdin CA, Bick AG, Hemani G, Davey Smith G, Kathiresan S; International Consortium for Blood Pressure: Genetic association of albuminuria with cardiometabolic disease and blood pressure. Am J Hum Genet 103: 461–473, 2018
- 11. Sinnott-Armstrong N, Tanigawa Y, Amar D, Mars N, Benner C, Aguirre M, Venkataraman GR, Wainberg M, Ollila HM, Kiiskinen T, Havulinna AS, Pirruccello JP, Qian J, Shcherbina A, Rodriguez F, Assimes TL, Agarwala V, Tibshirani R, Hastie T, Ripatti S, Pritchard JK, Daly MJ, Rivas MA; FinnGen: Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat Genet 53: 185–194, 2021
- 12. Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM, Kang HM, Pitsillides AN, LeFaive J, Lee SB, Tian X, Browning BL, Das S, Emde AK, Clarke WE, Loesch DP, Shetty AC, Blackwell TW, Smith AV, Wong Q, Liu X, Conomos MP, Bobo DM, Aguet F, Albert C, Alonso A, Ardlie KG, Arking DE, Aslibekyan S, Auer PL, Barnard J, Barr RG, Barwick L, Becker LC, Beer RL, Benjamin EJ, Bielak LF, Blangero J, Boehnke M, Bowden DW, Brody JA, Burchard EG, Cade BE, Casella JF, Chalazan B, Chasman DI, Chen YI, Cho MH, Choi SH, Chung MK, Clish CB, Correa A, Curran JE, Custer B, Darbar D, Daya M, de Andrade M, DeMeo DL, Dutcher SK, Ellinor PT, Emery LS, Eng C, Fatkin D, Fingerlin T, Forer L, Fornage M, Franceschini N, Fuchsberger C, Fullerton SM, Germer S, Gladwin MT, Gottlieb DJ, Guo X, Hall ME, He J, Heard-Costa NL, Heckbert SR, Irvin MR, Johnsen JM, Johnson AD, Kaplan R, Kardia SLR, Kelly T, Kelly S, Kenny EE, Kiel DP, Klemmer R, Konkle BA, Kooperberg C, Köttgen A, Lange LA, Lasky-Su J, Levy D, Lin X, Lin KH, Liu C, Loos RJF, Garman L, Gerszten R, Lubitz SA, Lunetta KL, Mak ACY, Manichaikul A, Manning AK, Mathias RA, McManus DD, McGarvey ST, Meigs JB, Meyers DA, Mikulla JL, Minear MA, Mitchell BD, Mohanty S, Montasser ME, Montgomery C, Morrison AC, Murabito JM, Natale A, Natarajan P, Nelson SC, North KE, O’Connell JR, Palmer ND, Pankratz N, Peloso GM, Peyser PA, Pleiness J, Post WS, Psaty BM, Rao DC, Redline S, Reiner AP, Roden D, Rotter JI, Ruczinski I, Sarnowski C, Schoenherr S, Schwartz DA, Seo JS, Seshadri S, Sheehan VA, Sheu WH, Shoemaker MB, Smith NL, Smith JA, Sotoodehnia N, Stilp AM, Tang W, Taylor KD, Telen M, Thornton TA, Tracy RP, Van Den Berg DJ, Vasan RS, Viaud-Martinez KA, Vrieze S, Weeks DE, Weir BS, Weiss ST, Weng LC, Willer CJ, Zhang Y, Zhao X, Arnett DK, Ashley-Koch AE, Barnes KC, Boerwinkle E, Gabriel S, Gibbs R, Rice KM, Rich SS, Silverman EK, Qasba P, Gan W, Papanicolaou GJ, Nickerson DA, Browning SR, Zody MC, Zöllner S, Wilson JG, Cupples LA, Laurie CC, Jaquish CE, Hernandez RD, O’Connor TD, Abecasis GR; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium: Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590: 290–299, 2021
- 13. Zanetti D, Rao A, Gustafsson S, Assimes TL, Montgomery SB, Ingelsson E: Identification of 22 novel loci associated with urinary biomarkers of albumin, sodium, and potassium excretion. Kidney Int 95: 1197–1208, 2019
- 14. Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM, Koseki M, Pirruccello JP, Ripatti S, Chasman DI, Willer CJ, Johansen CT, Fouchier SW, Isaacs A, Peloso GM, Barbalic M, Ricketts SL, Bis JC, Aulchenko YS, Thorleifsson G, Feitosa MF, Chambers J, Orho-Melander M, Melander O, Johnson T, Li X, Guo X, Li M, Shin Cho Y, Jin Go M, Jin Kim Y, Lee JY, Park T, Kim K, Sim X, Twee-Hee Ong R, Croteau-Chonka DC, Lange LA, Smith JD, Song K, Hua Zhao J, Yuan X, Luan J, Lamina C, Ziegler A, Zhang W, Zee RY, Wright AF, Witteman JC, Wilson JF, Willemsen G, Wichmann HE, Whitfield JB, Waterworth DM, Wareham NJ, Waeber G, Vollenweider P, Voight BF, Vitart V, Uitterlinden AG, Uda M, Tuomilehto J, Thompson JR, Tanaka T, Surakka I, Stringham HM, Spector TD, Soranzo N, Smit JH, Sinisalo J, Silander K, Sijbrands EJ, Scuteri A, Scott J, Schlessinger D, Sanna S, Salomaa V, Saharinen J, Sabatti C, Ruokonen A, Rudan I, Rose LM, Roberts R, Rieder M, Psaty BM, Pramstaller PP, Pichler I, Perola M, Penninx BW, Pedersen NL, Pattaro C, Parker AN, Pare G, Oostra BA, O’Donnell CJ, Nieminen MS, Nickerson DA, Montgomery GW, Meitinger T, McPherson R, McCarthy MI, McArdle W, Masson D, Martin NG, Marroni F, Mangino M, Magnusson PK, Lucas G, Luben R, Loos RJ, Lokki ML, Lettre G, Langenberg C, Launer LJ, Lakatta EG, Laaksonen R, Kyvik KO, Kronenberg F, König IR, Khaw KT, Kaprio J, Kaplan LM, Johansson A, Jarvelin MR, Janssens AC, Ingelsson E, Igl W, Kees Hovingh G, Hottenga JJ, Hofman A, Hicks AA, Hengstenberg C, Heid IM, Hayward C, Havulinna AS, Hastie ND, Harris TB, Haritunians T, Hall AS, Gyllensten U, Guiducci C, Groop LC, Gonzalez E, Gieger C, Freimer NB, Ferrucci L, Erdmann J, Elliott P, Ejebe KG, Döring A, Dominiczak AF, Demissie S, Deloukas P, de Geus EJ, de Faire U, Crawford G, Collins FS, Chen YD, Caulfield MJ, Campbell H, Burtt NP, Bonnycastle LL, Boomsma DI, Boekholdt SM, Bergman RN, Barroso I, Bandinelli S, Ballantyne CM, Assimes TL, Quertermous T, Altshuler D, Seielstad M, Wong TY, Tai ES, Feranil AB, Kuzawa CW, Adair LS, Taylor HA Jr., Borecki IB, Gabriel SB, Wilson JG, Holm H, Thorsteinsdottir U, Gudnason V, Krauss RM, Mohlke KL, Ordovas JM, Munroe PB, Kooner JS, Tall AR, Hegele RA, Kastelein JJ, Schadt EE, Rotter JI, Boerwinkle E, Strachan DP, Mooser V, Stefansson K, Reilly MP, Samani NJ, Schunkert H, Cupples LA, Sandhu MS, Ridker PM, Rader DJ, van Duijn CM, Peltonen L, Abecasis GR, Boehnke M, Kathiresan S: Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466: 707–713, 2010
- 15. Onengut-Gumuscu S, Chen WM, Burren O, Cooper NJ, Quinlan AR, Mychaleckyj JC, Farber E, Bonnie JK, Szpak M, Schofield E, Achuthan P, Guo H, Fortune MD, Stevens H, Walker NM, Ward LD, Kundaje A, Kellis M, Daly MJ, Barrett JC, Cooper JD, Deloukas P, Todd JA, Wallace C, Concannon P, Rich SS; Type 1 Diabetes Genetics Consortium: Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat Genet 47: 381–386, 2015
- 16. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, Motyer A, Vukcevic D, Delaneau O, O’Connell J, Cortes A, Welsh S, Young A, Effingham M, McVean G, Leslie S, Allen N, Donnelly P, Marchini J: The UK Biobank resource with deep phenotyping and genomic data. Nature 562: 203–209, 2018
- 17. McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, Kang HM, Fuchsberger C, Danecek P, Sharp K, Luo Y, Sidore C, Kwong A, Timpson N, Koskinen S, Vrieze S, Scott LJ, Zhang H, Mahajan A, Veldink J, Peters U, Pato C, van Duijn CM, Gillies CE, Gandin I, Mezzavilla M, Gilly A, Cocca M, Traglia M, Angius A, Barrett JC, Boomsma D, Branham K, Breen G, Brummett CM, Busonero F, Campbell H, Chan A, Chen S, Chew E, Collins FS, Corbin LJ, Smith GD, Dedoussis G, Dorr M, Farmaki AE, Ferrucci L, Forer L, Fraser RM, Gabriel S, Levy S, Groop L, Harrison T, Hattersley A, Holmen OL, Hveem K, Kretzler M, Lee JC, McGue M, Meitinger T, Melzer D, Min JL, Mohlke KL, Vincent JB, Nauck M, Nickerson D, Palotie A, Pato M, Pirastu N, McInnis M, Richards JB, Sala C, Salomaa V, Schlessinger D, Schoenherr S, Slagboom PE, Small K, Spector T, Stambolian D, Tuke M, Tuomilehto J, Van den Berg LH, Van Rheenen W, Volker U, Wijmenga C, Toniolo D, Zeggini E, Gasparini P, Sampson MG, Wilson JF, Frayling T, de Bakker PI, Swertz MA, McCarroll S, Kooperberg C, Dekker A, Altshuler D, Willer C, Iacono W, Ripatti S, Soranzo N, Walter K, Swaroop A, Cucca F, Anderson CA, Myers RM, Boehnke M, McCarthy MI, Durbin R; Haplotype Reference Consortium: A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet 48: 1279–1283, 2016
- 18. Huang J, Howie B, McCarthy S, Memari Y, Walter K, Min JL, Danecek P, Malerba G, Trabetti E, Zheng HF, Gambaro G, Richards JB, Durbin R, Timpson NJ, Marchini J, Soranzo N; UK10K Consortium: Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel. Nat Commun 6: 8111, 2015
- 19. Dilthey A, Leslie S, Moutsianas L, Shen J, Cox C, Nelson MR, McVean G: Multi-population classical HLA type imputation. PLoS Comput Biol 9: e1002877, 2013
- 20. Carroll RJ, Bastarache L, Denny JC: R PheWAS: Data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics 30: 2375–2376, 2014
- 21. Bikbov B: R open source programming code for calculation of the kidney donor profile index and kidney donor risk index. Kidney Dis 4: 269–272, 2018
- 22. Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, LeFaive J, VandeHaar P, Gagliano SA, Gifford A, Bastarache LA, Wei WQ, Denny JC, Lin M, Hveem K, Kang HM, Abecasis GR, Willer CJ, Lee S: Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet 50: 1335–1341, 2018
- 23. Francis JGF: The QR transformation a unitary analogue to the LR transformation - part 1. Comput J 4: 265–271, 1961
- 24. Boughton AP, Welch RP, Flickinger M, VandeHaar P, Taliun D, Abecasis GR, Boehnke M: LocusZoom.js: Interactive and embeddable visualization of genetic association study results. Bioinformatics 37: 3017–3018, 2021
- 25. Gagliano Taliun SA, VandeHaar P, Boughton AP, Welch RP, Taliun D, Schmidt EM, Zhou W, Nielsen JB, Willer CJ, Lee S, Fritsche LG, Boehnke M, Abecasis GR: Exploring and visualizing large-scale genetic associations by using PheWeb. Nat Genet 52: 550–552, 2020
- 26. Dagher H, Yan Wang Y, Fassett R, Savige J: Three novel COL4A4 mutations resulting in stop codons and their clinical effects in autosomal recessive Alport syndrome. Hum Mutat 20: 321–322, 2002
- 27. Storey H, Savige J, Sivakumar V, Abbs S, Flinter FA: COL4A3/COL4A4 mutations and features in individuals with autosomal recessive Alport syndrome. J Am Soc Nephrol 24: 1945–1954, 2013
- 28. Gast C, Pengelly RJ, Lyon M, Bunyan DJ, Seaby EG, Graham N, Venkat-Raman G, Ennis S: Collagen (COL4A) mutations are the most frequent mutations underlying adult focal segmental glomerulosclerosis. Nephrol Dial Transplant 31: 961–970, 2016
- 29. Malone AF, Phelan PJ, Hall G, Cetincelik U, Homstad A, Alonso AS, Jiang R, Lindsey TB, Wu G, Sparks MA, Smith SR, Webb NJ, Kalra PA, Adeyemo AA, Shaw AS, Conlon PJ, Jennette JC, Howell DN, Winn MP, Gbadegesin RA: Rare hereditary COL4A3/COL4A4 variants may be mistaken for familial focal segmental glomerulosclerosis. Kidney Int 86: 1253–1259, 2014
- 30. Li Y, Brodsky B, Baum J: NMR shows hydrophobic interactions replace glycine packing in the triple helix at a natural break in the (Gly-X-Y)n repeat. J Biol Chem 282: 22699–22706, 2007
- 31. Marsh SGE, Parham P, Barber LD: The HLA Factsbook, San Diego, CA, Academic Press, 2000
- 32. Suzuki K, Fukusumi Y, Yamazaki M, Kaneko H, Tsuruga K, Tanaka H, Ito E, Matsui K, Kawachi H: Alteration in the podoplanin-ezrin-cytoskeleton linkage is an important initiation event of the podocyte injury in puromycin aminonucleoside nephropathy, a mimic of minimal change nephrotic syndrome. Cell Tissue Res 362: 201–213, 2015
- 33. GTEx Consortium: Genetic effects on gene expression across human tissues [published correction appears in Nature 553: 530, 2018 10.1038/nature25160]. Nature 550: 204–213, 2017
- 34. Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, Burgess S, Jiang T, Paige E, Surendran P, Oliver-Williams C, Kamat MA, Prins BP, Wilcox SK, Zimmerman ES, Chi A, Bansal N, Spain SL, Wood AM, Morrell NW, Bradley JR, Janjic N, Roberts DJ, Ouwehand WH, Todd JA, Soranzo N, Suhre K, Paul DS, Fox CS, Plenge RM, Danesh J, Runz H, Butterworth AS: Genomic atlas of the human plasma proteome. Nature 558: 73–79, 2018
- 35. Fry A, Littlejohns TJ, Sudlow C, Doherty N, Adamska L, Sprosen T, Collins R, Allen NE: Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am J Epidemiol 186: 1026–1034, 2017
- 36. Daga S, Baldassarri M, Lo Rizzo C, Fallerini C, Imperatore V, Longo I, Frullanti E, Landucci E, Massella L, Pecoraro C, Garosi G, Ariani F, Mencarelli MA, Mari F, Renieri A, Pinto AM: Urine-derived podocytes-lineage cells: A promising tool for precision medicine in Alport Syndrome. Hum Mutat 39: 302–314, 2018
- 37. Wang YY, Rana K, Tonna S, Lin T, Sin L, Savige J: COL4A3 mutations and their clinical consequences in thin basement membrane nephropathy (TBMN). Kidney Int 65: 786–790, 2004
- 38. Dand N, Duckworth M, Baudry D, Russell A, Curtis CJ, Lee SH, Evans I, Mason KJ, Alsharqi A, Becher G, Burden AD, Goodwin RG, McKenna K, Murphy R, Perera GK, Rotarescu R, Wahie S, Wright A, Reynolds NJ, Warren RB, Griffiths CEM, Smith CH, Simpson MA, Barker JN; BADBIR Study Group; BSTOP Study Group; PSORT Consortium: HLA-C*06:02 genotype is a predictive biomarker of biologic treatment response in psoriasis. J Allergy Clin Immunol 143: 2120–2130, 2019
- 39. Feehally J, Farrall M, Boland A, Gale DP, Gut I, Heath S, Kumar A, Peden JF, Maxwell PH, Morris DL, Padmanabhan S, Vyse TJ, Zawadzka A, Rees AJ, Lathrop M, Ratcliffe PJ: HLA has strongest association with IgA nephropathy in genome-wide analysis. J Am Soc Nephrol 21: 1791–1797, 2010
- 40. Gharavi AG, Kiryluk K, Choi M, Li Y, Hou P, Xie J, Sanna-Cherchi S, Men CJ, Julian BA, Wyatt RJ, Novak J, He JC, Wang H, Lv J, Zhu L, Wang W, Wang Z, Yasuno K, Gunel M, Mane S, Umlauf S, Tikhonova I, Beerman I, Savoldi S, Magistroni R, Ghiggeri GM, Bodria M, Lugani F, Ravani P, Ponticelli C, Allegri L, Boscutti G, Frasca G, Amore A, Peruzzi L, Coppo R, Izzi C, Viola BF, Prati E, Salvadori M, Mignani R, Gesualdo L, Bertinetto F, Mesiano P, Amoroso A, Scolari F, Chen N, Zhang H, Lifton RP: Genome-wide association study identifies susceptibility loci for IgA nephropathy. Nat Genet 43: 321–327, 2011
- 41. Kiryluk K, Li Y, Scolari F, Sanna-Cherchi S, Choi M, Verbitsky M, Fasel D, Lata S, Prakash S, Shapiro S, Fischman C, Snyder HJ, Appel G, Izzi C, Viola BF, Dallera N, Del Vecchio L, Barlassina C, Salvi E, Bertinetto FE, Amoroso A, Savoldi S, Rocchietti M, Amore A, Peruzzi L, Coppo R, Salvadori M, Ravani P, Magistroni R, Ghiggeri GM, Caridi G, Bodria M, Lugani F, Allegri L, Delsante M, Maiorana M, Magnano A, Frasca G, Boer E, Boscutti G, Ponticelli C, Mignani R, Marcantoni C, Di Landro D, Santoro D, Pani A, Polci R, Feriozzi S, Chicca S, Galliani M, Gigante M, Gesualdo L, Zamboli P, Battaglia GG, Garozzo M, Maixnerová D, Tesar V, Eitner F, Rauen T, Floege J, Kovacs T, Nagy J, Mucha K, Pączek L, Zaniew M, Mizerska-Wasiak M, Roszkowska-Blaim M, Pawlaczyk K, Gale D, Barratt J, Thibaudin L, Berthoux F, Canaud G, Boland A, Metzger M, Panzer U, Suzuki H, Goto S, Narita I, Caliskan Y, Xie J, Hou P, Chen N, Zhang H, Wyatt RJ, Novak J, Julian BA, Feehally J, Stengel B, Cusi D, Lifton RP, Gharavi AG: Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens. Nat Genet 46: 1187–1196, 2014
- 42. Sukcharoen K, Sharp SA, Thomas NJ, Kimmitt RA, Harrison J, Bingham C, Mozere M, Weedon MN, Tyrrell J, Barratt J, Gale DP, Oram RA: IgA nephropathy genetic risk score to estimate the prevalence of IgA nephropathy in UK Biobank. Kidney Int Rep 5: 1643–1650, 2020
- 43. Luo Y, Kanai M, Choi W, Li X, Sakaue S, Yamamoto K, Ogawa K, Gutierrez-Arcelus M, Gregersen PK, Stuart PE, Elder JT, Forer L, Schönherr S, Fuchsberger C, Smith AV, Fellay J, Carrington M, Haas DW, Guo X, Palmer ND, Chen YI, Rotter JI, Taylor KD, Rich SS, Correa A, Wilson JG, Kathiresan S, Cho MH, Metspalu A, Esko T, Okada Y, Han B, McLaren PJ, Raychaudhuri S; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium: A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nat Genet 53: 1504–1516, 2021
- 44. Casanova F, Tyrrell J, Beaumont RN, Ji Y, Jones SE, Hattersley AT, Weedon MN, Murray A, Shore AC, Frayling TM, Wood AR: A genome-wide association study implicates multiple mechanisms influencing raised urinary albumin-creatinine ratio. Hum Mol Genet 28: 4197–4207, 2019
- 45. Collins FS: Realizing the dream of molecularly targeted therapies for cystic fibrosis. N Engl J Med 381: 1863–1865, 2019
- 46. Little J, Higgins JP, Ioannidis JP, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart A, Birkett N; STrengthening the REporting of Genetic Association Studies: STrengthening the REporting of Genetic Association Studies (STREGA): An extension of the STROBE statement. PLoS Med 6: e22, 2009