Significance Statement
Studies have identified common CKD-associated gene variants, but the contribution of rare variants has not been systematically examined. The authors used exome sequencing and rare-variant collapsing analyses to compare rare genetic variants in 3150 cases (representing broad clinical CKD subtypes) with 9563 controls. For five known CKD-associated genes, they detected a significant enrichment of rare variants in PKD1, PKD2, and COL4A5, and found suggestive evidence for rare COL4A3 and COL4A4 variants. They also found evidence for four other genes not previously implicated in CKD. By demonstrating that rare-variant collapsing analyses can validate known genes and identify candidate genes and modifiers for nephropathy, these findings provide a rationale for larger-scale investigation of the contribution of rare variants to disease risk across major clinical CKD categories.
Keywords: genetic renal disease, human genetics, molecular genetics
Abstract
Background
Studies have identified many common genetic associations that influence renal function and all-cause CKD, but these explain only a small fraction of variance in these traits. The contribution of rare variants has not been systematically examined.
Methods
We performed exome sequencing of 3150 individuals, who collectively encompassed diverse CKD subtypes, and 9563 controls. To detect causal genes and evaluate the contribution of rare variants we used collapsing analysis, in which we compared the proportion of cases and controls carrying rare variants per gene.
Results
The analyses captured five established monogenic causes of CKD: variants in PKD1, PKD2, and COL4A5 achieved study-wide significance, and we observed suggestive case enrichment for COL4A4 and COL4A3. Beyond known disease-associated genes, collapsing analyses incorporating regional variant intolerance identified suggestive dominant signals in CPT2 and several other candidate genes. Biallelic mutations in CPT2 cause carnitine palmitoyltransferase II deficiency, sometimes associated with rhabdomyolysis and acute renal injury. Genetic modifier analysis among cases with APOL1 risk genotypes identified a suggestive signal in AHDC1, implicated in Xia–Gibbs syndrome, which involves intellectual disability and other features. On the basis of the observed distribution of rare variants, we estimate that a two- to three-fold larger cohort would provide 80% power to implicate new genes for all-cause CKD.
Conclusions
This study demonstrates that rare-variant collapsing analyses can validate known genes and identify candidate genes and modifiers for kidney disease. In so doing, these findings provide a motivation for larger-scale investigation of rare-variant risk contributions across major clinical CKD categories.
CKD affects approximately one in ten adults worldwide.1–3 Although Mendelian disorders are estimated to be responsible for 10% of ESRD cases,4 15%–30% of patients, including those with presumptively acquired forms of disease such as diabetic or hypertensive nephropathy, report a family history of renal disease, indicating a contribution of hereditary factors across clinical categories.5,6 Moreover, affected individuals within the same family frequently carry different clinical diagnoses, suggesting that there may be shared genetic contributions across CKD subtypes.5,6 Recent studies support this hypothesis.7 For example, two variants in APOL1 (Online Mendelian Inheritance in Man reference number [OMIM]: 603743) significantly increase the risk for multiple forms of CKD, including FSGS, hypertension-attributed nephropathy, and HIV-1–associated nephropathy.8–11 Similarly, mutations in the type 4 collagen genes (COL4A3 [OMIM: 120070], COL4A4 [OMIM: 120131], and COL4A5 [OMIM: 303630]) traditionally associated with Alport syndrome also frequently manifest as FSGS.12–15 Collectively, these findings highlight the potential of genetic analyses to identify shared biology between disease causes traditionally considered to be distinct, enabling more precise classification of CKD on the basis of underlying molecular pathophysiology.
Although genome-wide association studies (GWAS) have identified multiple common genetic variants influencing all-cause CKD and kidney function, these explain a small fraction of the variance in these traits.16–19 To date, no studies have systematically examined the contribution of rare genetic variation (<1% minor allele frequency [MAF]) with a presumed larger effect on the risk of CKD. Detection of rare independent variants clustered in a single gene can provide significant insight into disease biology and influence clinical therapy in several ways, even if pathogenic variants in a particular gene only explain a small proportion of all cases. First, in a population with both acquired and inherited multifactorial disease, accurate estimates of the proportion of cases caused by known genes can inform the use of existing therapies and diagnostic tests. Second, rare mutations can lead to the identification of widely applicable therapeutic drug targets, such as the discovery of PCSK9 mutations leading to the development of treatment for general forms of hypercholesterolemia.20,21 Third, it is increasingly recognized that the validation of such drug targets in genetic studies of human populations improves the probability of the success of drug development in clinical trials.22
In our rare-variant collapsing analyses, we first define a class of variation with properties that are presumed to enrich for biologic impact. We refer to variants that satisfy the imposed criteria as qualifying variants (QVs). Because single-variant association tests for rare variants lack power, the collapsing method aggregates QVs within each protein-coding gene and searches for a statistically significant difference in the rate of QV carriers between cases and controls (individuals with more than one QV in a gene are only counted once). These analyses have allowed gene discovery even when applied to heterogeneous, complex diseases with high allelic and locus heterogeneity; moreover, as long as the effect sizes are high and the causal variants are rare in the wider population, the approach can potentially require relatively small sample sizes.23–25 For example, this approach has successfully detected new and known genes in general epilepsy, kidney malformations, polycystic liver disease, amyotrophic lateral sclerosis, and idiopathic pulmonary fibrosis.26–32 Here, we apply a collapsing analysis to evaluate the contribution of rare variants to CKD risk in a large, multiethnic population drawn from two independent patient cohorts.
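The carrier-counting step described above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in the study; the record layout and the names `count_qv_carriers` and `toy_calls` are assumptions made for the example.

```python
from collections import defaultdict


def count_qv_carriers(qv_calls):
    """Collapse qualifying variants (QVs) into per-gene carrier counts.

    qv_calls: iterable of (sample_id, gene, variant_id) tuples
              (hypothetical layout, not the study's data format).
    Returns {gene: number of distinct samples carrying at least one QV}.
    """
    carriers = defaultdict(set)
    for sample_id, gene, _variant_id in qv_calls:
        # A set records each sample once per gene, however many QVs it carries.
        carriers[gene].add(sample_id)
    return {gene: len(samples) for gene, samples in carriers.items()}


# Toy data: sample s1 carries two QVs in PKD1 but is counted once.
toy_calls = [
    ("s1", "PKD1", "v1"),
    ("s1", "PKD1", "v2"),
    ("s2", "PKD1", "v3"),
    ("s3", "COL4A5", "v4"),
]
carrier_counts = count_qv_carriers(toy_calls)
```

Running the same function separately on cases and on controls yields the two carrier counts per gene that enter the case-control comparison.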
Methods
Study Cohort
We studied samples drawn from an initial combined cohort of 3315 individuals, reflecting two independent CKD cohorts (Supplemental Table 1).33 After quality filtering (see Sequencing and Bioinformatic Protocols) and genetic matching to 9563 controls, the final case cohort comprised 3150 individuals. The Columbia University Medical Center (CUMC) cohort comprised 2041 patients with renal disease who were evaluated in CUMC nephrology clinics and who consented for genetics research. The second cohort comprised 1109 individuals from the AstraZeneca A Study to Evaluate the Use of Rosuvastatin in Subjects On Regular Hemodialysis: An Assessment of Survival and Cardiovascular Events trial (AURORA). AURORA recruited 2776 individuals aged between 50 and 80 years who were undergoing hemodialysis,34 and 1109 participants were included in this study, comprising those who had consented to genetic research and for whom DNA was available and successfully sequenced. The recruitment and exclusion criteria for the AURORA cohort have been previously described.34 Reasons for which DNA was not available for all patients in the AURORA study include lack of consent (collection was optional for patients), consent withdrawal, or country regulations not allowing for DNA collection. The clinical diagnosis for samples from both cohorts was classified into one of the following categories: (1) Mendelian CKD, including congenital or cystic renal disease; (2) glomerulopathy; (3) diabetic nephropathy; (4) hypertensive nephropathy; (5) tubulointerstitial disease; and (6) other/unknown, in cases where the primary disease did not belong to one of the above categories (“other”), or was altogether uncertain (“unknown”). We compared the genetic variation in these 3150 cases with that found among 9563 unrelated, healthy individuals and individuals ascertained for studies unrelated to kidney or urological disease in the Columbia Institute for Genomic Medicine (IGM) database (Supplemental Table 2). 
CUMC patients were more diverse in age and race/ancestry than AURORA patients (Supplemental Table 1). Reflecting different patient ascertainment, the AURORA and CUMC cohorts differed in the representation of clinical diagnostic subgroups, but collectively the two cohorts captured the major clinical categories of CKD (Table 1).3
Table 1.
Clinical Diagnosis | AURORA, N | AURORA, % | CUMC, N | CUMC, % | Total, N | Total, % |
---|---|---|---|---|---|---|
Diabetic nephropathy | 178 | 16.1 | 176 | 8.6 | 354 | 11.2 |
Glomerulopathy | 228 | 20.6 | 1042 | 51.1 | 1270 | 40.3 |
Hypertensive nephropathy | 189 | 17.0 | 118 | 5.8 | 307 | 9.7 |
Mendelian CKD or congenital genitourinary anomalies | 156 | 14.1 | 406 | 19.9 | 562 | 17.8 |
Other/unknown | 148 | 13.3 | 274 | 13.4 | 422 | 13.4 |
Tubulointerstitial disease | 210 | 18.9 | 25 | 1.2 | 235 | 7.5 |
Total | 1109 | | 2041 | | 3150 | |
Ethics
Patients recruited for the CUMC cohort provided informed consent for the use of DNA in genetic research, and the study was approved by the Columbia University Institutional Review Board. The AURORA study was performed in accordance with the provisions of the Declaration of Helsinki as defined by the International Conference on Harmonization, Good Clinical Practice, and applicable regulatory requirements, as well as the policy on bioethics and human biologic samples of the trial sponsor, AstraZeneca. For the sequenced AURORA patients, written, informed consent for the use of DNA for genetic research was obtained from patients during the original AURORA study, before initiation of this study. Deidentified data were provided to CUMC investigators for clinical sequence interpretation and collapsing analysis.
Sequencing and Bioinformatic Protocols
Samples were sequenced at the IGM at Columbia University. Exome enrichment was performed using Roche and IDT xGen Exome Research Panel V1.0 capture kits. Sequencing was performed on the Illumina HiSeq2500 and NovaSeq platforms. FASTQ data were processed and aligned to the hg19/GRCh37 reference genome using BWA-MEM,35 and variants were called using the Genome Analysis Toolkit (GATK) v3.6 best practices. All samples were processed using the same bioinformatic pipeline in the IGM. Analyses were performed using the in-house Analysis Tool for Annotated Variants36 for retrieving, annotating, and filtering variants in large cohorts.
All case (n=3150) and control (n=9563) samples included in the collapsing analysis achieved >10-fold read depth in >80% of the consensus coding sequence (CCDS release 20) and achieved average autosomal read depth of >30-fold. All samples were also confirmed to have concordance between genetic and self-declared sex, had contamination levels <4% according to VerifyBamID,37 and were unrelated up to the second degree according to KING38 (Supplemental Methods). Genetic ancestry was also defined for the test cohort using a panel of 12,840 common exome markers; for all analyses, genetic outliers were removed (Supplemental Methods). For the recessive analysis, which used a higher MAF threshold, test samples were restricted to those whose genetic variation placed them with high confidence among individuals of European genetic ancestry (probability >0.95).
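The sample-level inclusion criteria listed above amount to a simple predicate. The sketch below only restates the stated thresholds; the `SampleQC` class and its field names are hypothetical, and the inputs are assumed to have been computed upstream (for example by VerifyBamID and KING).

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    # Field names are hypothetical; thresholds below are those stated in the text.
    frac_ccds_over_10x: float        # fraction of CCDS bases with >10-fold read depth
    mean_autosomal_depth: float      # average autosomal read depth
    contamination: float             # estimated contamination fraction
    sex_concordant: bool             # genetic sex matches self-declared sex
    related_within_2nd_degree: bool  # related to another retained sample


def passes_sample_qc(s: SampleQC) -> bool:
    """Return True if the sample meets every stated inclusion criterion."""
    return (
        s.frac_ccds_over_10x > 0.80
        and s.mean_autosomal_depth > 30
        and s.contamination < 0.04
        and s.sex_concordant
        and not s.related_within_2nd_degree
    )


# Toy samples: the second fails only on contamination (6% > 4%).
good = SampleQC(0.95, 62.0, 0.01, True, False)
contaminated = SampleQC(0.95, 62.0, 0.06, True, False)
```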
For clinical sequence interpretation, we prioritized rare (population frequency ≤1%) nonsynonymous variants and canonical splice site variants in renal Mendelian disorders, identified using the OMIM database. We classified variants using the American College of Medical Genetics (ACMG) guidelines for clinical sequence interpretation.39
Variant Quality Criteria
Collapsing analyses were applied using similar protocols to those described previously (Supplemental Methods).28–30,40 To qualify for inclusion in the analysis, a variant was required to have a minimum read depth of ten and be located within the CCDS or the 2 bp canonical splice sites within the introns. All QVs achieved a “pass” or “intermediate” status according to GATK’s Variant Quality Score Recalibration and passed several GATK quality score thresholds (Supplemental Methods). To reduce bias caused by differential coverage between case and control samples at QV sites, we excluded variant sites where the proportions of cases and controls with at least ten-fold read depth differed significantly (binomial exact test P<0.05). In addition to meeting these quality metrics, and as defined in our previous studies, QVs also fulfill a predefined set of filtering criteria (Supplemental Table 3). A Fisher exact test was then applied to determine whether the rate of case carriers differed significantly from the rate of control carriers under identical QV criteria.
Collapsing Analysis Overview
For discovering new CKD genes, we considered 12 distinct, nonsynonymous models for selecting QVs, comprising 11 dominant models and one recessive model incorporating gene-based and regional intolerance–based collapsing procedures (Supplemental Table 3). The recessive model screened for both homozygous genotypes and proxy compound heterozygous genotypes, whereby two QV observations in the same gene in the same patient were considered as a putative compound heterozygous genotype. In addition to running collapsing analyses on the combined case cohort of 3150 cases, we investigated the CUMC (n=2041) and AURORA (n=1109) case collections individually. These analyses were followed by six additional analyses focusing on the major subphenotype groups among the 3150 cases, for a total of nine clinical groupings. All 11 dominant models were run for the nine clinical groups, whereas the single recessive model was only run on the combined case cohort (Supplemental Table 3). Accounting for the 11 dominant runs across the nine different clinical subgroups, plus a single, recessive, nonsynonymous run on the combined case cohort, a total of 100 nonsynonymous models were analyzed.
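The model count and the recessive genotype rule described above can be checked with a short sketch. The genotype encoding (1 for a heterozygous QV, 2 for a homozygous QV) and the function name `is_recessive_qualifying` are assumptions made for illustration.

```python
N_DOMINANT_MODELS = 11
N_CLINICAL_GROUPS = 9   # combined cohort, CUMC, AURORA, and six subphenotype groups
N_RECESSIVE_RUNS = 1    # recessive model run on the combined cohort only

total_models = N_DOMINANT_MODELS * N_CLINICAL_GROUPS + N_RECESSIVE_RUNS


def is_recessive_qualifying(qv_genotypes):
    """Decide whether one sample qualifies under the recessive model for one gene.

    qv_genotypes: alternate-allele counts for that sample's QVs in the gene
                  (assumed encoding: 1 = heterozygous, 2 = homozygous).
    Qualifies with one homozygous QV, or with two or more QVs in the same gene
    (a proxy compound heterozygote; phase is not established).
    """
    has_homozygous = any(g == 2 for g in qv_genotypes)
    n_qvs = sum(1 for g in qv_genotypes if g >= 1)
    return has_homozygous or n_qvs >= 2
```

Because phase is unknown, two heterozygous QVs may lie on the same haplotype, which is why the text describes these genotypes as putative.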
Regionally Informed Collapsing Analysis
Recent studies have shown that accounting for regional mutation intolerance in coding regions can increase the power of rare-variant association studies.41–43 Our analyses also included six models that included missense variants only if they fell within subgenic regions that are intolerant to missense variants,41–45 according to certain thresholds determined by two measures of missense intolerance (MTR50 and localized intolerance model using Bayesian regression [LIMBR], Supplemental Method 3). We apply these filters to all missense-inclusive, nonsynonymous models and provide supporting data to show that these filters optimize the signal detection among known CKD genes. In addition, to determine whether the regional missense intolerance filters improve our missense signal detection, we use our top-ranked CKD gene signals from the primary analyses and compare the signal strength in the missense-only QV model before and after application of the intolerance filters (Supplemental Table 4).
Modifier Analysis
We also performed three secondary analyses, investigating genes that could potentially modify the risk of CKD in individuals harboring (1) the APOL1 risk genotypes, (2) diagnostic mutations in polycystin 1 (PKD1) and polycystin 2 (PKD2), and (3) diagnostic mutations in COL4A3, COL4A4, or COL4A5.
The APOL1 G1 risk allele is represented by both rs73885319 and rs60910145 missense variants, and the G2 risk allele is a 6 bp deletion (rs71785313, formerly rs143830837).8 APOL1-associated CKD is thought to follow a recessive model, with disease risk significantly higher among patients who carry two copies of the risk alleles (i.e., G1/G1, G1/G2, or G2/G2).8,46 We identified 137 cases (4.3%) and 206 controls (2.2%) carrying two APOL1 risk alleles across the full case-control cohort. Note that despite the high frequency of cases carrying these known risk alleles, the G1 or G2 alleles were not expected to drive a signal for APOL1 in the general collapsing analyses because their global MAF (1%–2%) is above the maximum MAF thresholds for all collapsing analysis models used, meaning they will not be classed as QVs in any category used in this study. Genetic ancestry predictions using principal component analysis clustered these high-risk carriers with in-house samples of black or Afro-Caribbean ancestry. We therefore used the top six ancestry principal components to identify 2209 ancestry-matched individuals from the full control group for case-control comparisons (Supplemental Figure 1). We also adopted this ancestry approach to match the dual-risk APOL1 case samples with the remaining cases that did not harbor two risk variants as a negative control group. The cases for the PKD1/PKD2 and COL4A3/COL4A4/COL4A5 modifier analysis were selected on the basis of the presence of a diagnostic mutation in these genes, as classified by the ACMG criteria.33,39 The modifier analysis for the PKD1/PKD2 and COL4A3/COL4A4/COL4A5 groups was conducted by comparing cases with diagnostic mutations in these genes with the broader set of 9563 multiethnic controls, because the cases with diagnostic mutations were also multiethnic and we have previously shown that ultrarare variants are not expected to be individually strongly associated with ancestry structure.23
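As a small illustration of the recessive APOL1 definition above, the sketch below flags two-risk-allele genotypes and recomputes the reported carrier fractions. The allele labels passed to `is_high_risk`, including "G0" for the non-risk allele, are an assumed encoding; calling G1 and G2 from the underlying variants is not shown.

```python
RISK_ALLELES = {"G1", "G2"}  # "G0" is used here for the non-risk allele (assumed label)


def is_high_risk(allele_1, allele_2):
    """True for G1/G1, G1/G2, or G2/G2, i.e. two risk alleles."""
    return allele_1 in RISK_ALLELES and allele_2 in RISK_ALLELES


# Carrier fractions from the counts reported in the text.
case_fraction = 137 / 3150
control_fraction = 206 / 9563
print(f"cases: {case_fraction:.1%}, controls: {control_fraction:.1%}")
```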
Statistical Analyses
For the collapsing analysis and the power calculation, a two-sided Fisher exact test was used to obtain P values and odds ratios (ORs), with Bonferroni correction for multiple testing. Calculations were performed in R software (version 3.4.0, 2017-04-21).47 We defined a study-wide Bonferroni multiplicity-adjusted significance threshold as P<2.7×10−8, determined by (0.05/[18650 CCDS genes×100 nonsynonymous models]). This is conservative because of the correlation among the various models being considered. We also ran two additional synonymous QV models (one each for dominant and recessive) to act as negative experimental controls. We refer to suggestive genes in this article as genes found to be highly ranked in the collapsing analyses, with the understanding that highly ranked genes that do not achieve study-wide significance are a mixture of true (underpowered) biologic signals and observations from the tail of the null distribution. Thus, all suggestive genes depend on additional replication studies or orthogonal evidence to support clinical relevance. For our power calculation, we varied the sample sizes in our all-cause CKD cohort (n=3150), while maintaining the observed case and control carrier frequencies. We then investigated what case and control cohort sizes provided 80% power to achieve study-wide significance for a dominantly acting gene in which, first, the collapsed QVs confer a minimum combined OR of 5.5 and, second, QVs are present in at least 0.1% of cases.
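The study-wide threshold follows directly from the numbers given above. A minimal sketch of the arithmetic, with the helper name `is_study_wide_significant` introduced only for illustration:

```python
ALPHA = 0.05
N_CCDS_GENES = 18650
N_MODELS = 100

# Bonferroni threshold: family-wise alpha divided by the number of gene-by-model tests.
study_wide_threshold = ALPHA / (N_CCDS_GENES * N_MODELS)


def is_study_wide_significant(p_value):
    """True if a P value passes the Bonferroni-adjusted threshold."""
    return p_value < study_wide_threshold


print(f"threshold = {study_wide_threshold:.2e}")
```

Applied to P values reported in this article, 1.6×10−55 (PKD1) passes the threshold, whereas 4.0×10−7 (CPT2) does not and is therefore described as suggestive.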
We also performed a look-up of a recent analysis of UK Biobank data. We selected all kidney-associated traits from the UK Biobank data and downloaded their GWAS P values for genotyped and imputed variants from GeneATLAS (http://geneatlas.roslin.ed.ac.uk).48 Quality control for imputed variants included ensuring an imputation score ≥0.9, a MAF of ≥0.001, and a Hardy–Weinberg equilibrium P value of ≥10−10. We next asked whether the top suggestive genes in the collapsing analysis were enriched for common variants for CKD in GeneATLAS, using a threshold P value of <10−4 for any of the kidney traits, and we included variants 50 kb up- and downstream of each gene.
Results
Gene-Based Collapsing Analyses
We analyzed exome sequencing data from 3150 individuals collectively representing the major clinical subcategories of CKD (Table 1). Three genes reached study-wide significance: PKD1 and PKD2, associated with autosomal dominant polycystic kidney disease (ADPKD); and collagen type 4 α5 chain (COL4A5), associated with X-linked Alport syndrome (Table 2). The signals for these genes were detectable among the all-cause CKD cohort and across multiple other models (Figure 1, Supplemental Table 5). As each of these genes is associated with a Mendelian form of nephropathy, we surmised that the PKD1, PKD2, and COL4A5 signals were driven by the presence of patients with ADPKD and X-linked Alport syndrome, rather than by rare variants in these genes contributing to all-cause CKD. Consistent with this expectation, the case enrichment was much stronger among the 562 cases comprising the “Mendelian or congenital CKD” subgroup (Figure 1, Table 2) and explained the signal in the overall cohort. In addition to these three genes, COL4A3 and COL4A4, which are also causal for Mendelian forms of CKD, were highly ranked across multiple analysis models (Figure 1, Supplemental Table 5, Table 2). We highlight these two genes because each of their association signals was distributed across the overall CKD cohort, indicating a contribution of pathogenic COL4A3 and COL4A4 variants across multiple clinical diagnostic categories (Supplemental Table 6). Synonymous models showed no genes at significant or near-significant enrichment (Supplemental Figure 2).
Table 2.
Gene Name | OMIM Phenotype No. | Known Mendelian Disease Inheritance | Best Model (Lowest FET P): Clinical Group/Inheritance Model/Collapsing Model | P Value (Fisher Exact Test) | OR [95% CI] |
---|---|---|---|---|---|
Known Mendelian nephropathy genes | |||||
PKD1 | 173900 | Dominant/multiple | Mendelian or congenital CKD/dominant/ultrarare, deleterious predicted | 1.6×10−55 | 29.45 [19.5 to 44.8] |
PKD2 | 613095 | Dominant/multiple | Mendelian or congenital CKD/dominant/rare, protein-truncating | 5.6×10−26 | >352 [56 to >13164] |
COL4A5 | 301050 | X-linked dominant/multiple | Mendelian or congenital CKD/dominant/rare, protein-truncating | 4.7×10−12 | >155 [21 to >6512] |
COL4A4 | 203780/141200 | Autosomal dominant and recessive | All cases/dominant/ultrarare non-benign/LIMBR50 | 1.0×10−5 | 8.53 [3.1 to 23.7] |
COL4A3 | 104200/203780/141200 | Autosomal dominant and recessive | All cases/dominant/ultrarare deleterious predicted | 3.1×10−5 | 7.92 [2.8 to 22.2] |
No prior associations with Mendelian nephropathy | |||||
SLC17A1 | NA | Not previously reported | AURORA only/dominant/rare missense-only | 3.15×10−6 | 4.8 [2.5 to 9.0] |
CPT2 | 255110/600649/608836 | Autosomal recessive | All cases/dominant/rare missense-only/MTR50 | 4.0×10−7 | 4.50 [2.4 to 8.5] |
SCLT1 | NA | Not previously reported | CUMC only/dominant/rare, nonbenign/MTR50 | 2.0×10−6 | 8.25 [3.5 to 19.7] |
Where a zero cell count left the OR and 95% CI undefined, we added a single count to the zero cell and report the OR and the undefined 95% CI bound as greater than (>) the resulting values. Gene names are italicized. FET, Fisher Exact Test.
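The zero-cell convention in the table footnote can be illustrated as follows. This sketch uses the simple sample odds ratio and toy counts that are not taken from the study; it handles only the case of zero control carriers, and the function name is hypothetical.

```python
def odds_ratio_with_zero_cell(case_carriers, n_cases, control_carriers, n_controls):
    """Sample odds ratio, forcing one count into a zero control-carrier cell.

    Returns (odds_ratio, is_lower_bound). When is_lower_bound is True the value
    should be reported as '>' the returned odds ratio.
    """
    a, b = case_carriers, n_cases - case_carriers
    c, d = control_carriers, n_controls - control_carriers
    is_lower_bound = False
    if c == 0:
        # Move one control from non-carrier to carrier so the ratio is defined.
        c, d = 1, d - 1
        is_lower_bound = True
    return (a * d) / (b * c), is_lower_bound


# Toy counts (illustrative only): 5 of 100 cases and 0 of 1000 controls carry a QV.
toy_or, toy_is_bound = odds_ratio_with_zero_cell(5, 100, 0, 1000)
prefix = ">" if toy_is_bound else ""
print(f"OR {prefix}{toy_or:.1f}")
```

Note that an exact-test routine may report a conditional maximum-likelihood odds ratio, which can differ slightly from the sample odds ratio used here.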
Diagnostic exome analysis using the ACMG guidelines for clinical sequence interpretation33 detected Mendelian causes of CKD in 293 of the 3150 (9.3%) individuals. After exclusion of these 293 cases, there was no residual signal in COL4A5, COL4A4, or COL4A3, suggesting that the diagnostic analysis identified all or most of the pathogenic variants. However, PKD1 was still significantly enriched in the Mendelian group (ultrarare model, P=7.1×10−11; OR, 7.8; 95% confidence interval [95% CI], 4.4 to 13.3) and PKD2 was suggestively enriched in the AURORA group (ultrarare model, P=8.5×10−4; OR, 20.2; 95% CI, 2.9 to 223.4), suggesting there are residual pathogenic variants in these genes that do not meet ACMG diagnostic criteria. Because most of these variants were in AURORA patients, for whom limited clinical information is available, it is possible that some of these residual variants would be classified as diagnostic using these criteria with more clinical information.
Because the signal from known disease genes such as COL4A3 and COL4A4 did not reach significance, it is possible that novel genes contributing to CKD in our cohort also did not reach our threshold of significance (Table 3). Beyond the known genes, the top-ranked signal belonged to SLC17A1 (OMIM: 182308; Table 2), a gene encoding a urate transporter that has been previously associated with gout and serum uric acid levels by GWAS.49,50 This signal originated primarily from the AURORA cases and was distributed across all clinical categories (rare, missense-only model; 1.53% AURORA cases versus 0.32% IGM controls; P=3.2×10−6; OR, 4.8; 95% CI, 2.5 to 9.0). In addition to SLC17A1, we noted a suggestive signal in the sodium channel and clathrin linker 1 gene (SCLT1; OMIM: 611399), which was enriched for QVs among CUMC cases across different clinical categories (rare, nonbenign model; P=5.2×10−6; OR, 5.0; 95% CI, 2.4 to 10.3). SCLT1 encodes a ciliary protein, and inactivation of its murine ortholog causes cystic kidney disease.51
Table 3.
Gene Name | Clinical Group | Best Model (Lowest FET P): Inheritance Model/Collapsing Model | P Value (Fisher Exact Test) | OR [95% CI] |
---|---|---|---|---|
PKD1 | Mendelian or congenital CKD | Dominant/ultrarare, deleterious-predicted | 1.6×10−55 | 29.5 [19.5 to 44.8] |
PKD2 | Mendelian or congenital CKD | Dominant/rare, LOF | 5.6×10−26 | 352.3 [56.1 to 13163] |
COL4A5 | Mendelian or congenital CKD | Dominant/rare, LOF | 4.7×10−12 | 155.4 [21.5 to 6512] |
CPT2 | All cases | Dominant/rare, missense/MTR50 | 4.0×10−7 | 4.5 [2.4 to 8.5] |
AHDC1 | APOL1 cases | Dominant/rare | 5.91×10−7 | 14 [5 to 37.9] |
SCLT1 | CUMC only | Dominant/rare/MTR50 | 2.0×10−6 | 8.2 [3.2 to 22.7] |
SLC17A1 | AURORA only | Dominant/rare, missense | 3.15×10−6 | 4.8 [2.5 to 9] |
TNFAIP6 | Glomerular neph. | Dominant/rare | 4.23×10−6 | 10.8 [3.7 to 33.6] |
SLC9A2 | Glomerular neph. | Dominant/rare, missense/LIMBR50 | 5.04×10−6 | 4.4 [2.3 to 8.2] |
ALDH1L2 | Glomerular neph. | Dominant/rare/LIMBR50 | 5.09×10−6 | 6.6 [2.9 to 14.9] |
COL4A4 | All cases | Dominant/ultrarare/LIMBR50 | 1.01×10−5 | 8.5 [2.9 to 30.3] |
PAX7 | Unknown or other | Dominant/rare, missense/LIMBR50 | 1.04×10−5 | 5.5 [2.6 to 10.5] |
EFEMP2 | CUMC only | Dominant/rare, missense/MTR50 | 1.52×10−5 | 6.8 [2.7 to 18.1] |
BAG3 | Hypertensive neph. | Dominant/rare/MTR50 | 1.73×10−5 | 8.5 [3.3 to 19.2] |
RPS6KL1 | Glomerular neph. | Dominant/rare, missense/MTR50 | 2.06×10−5 | 6.1 [2.6 to 13.9] |
KRTAP1-5 | COL4A3/4/5 cases | Dominant/rare, missense | 2.15×10−5 | 30.6 [7.2 to 98.8] |
PDIA4 | All cases | Dominant/rare/MTR50 | 2.43×10−5 | 4.4 [2.1 to 9.3] |
JPH1 | AURORA only | Dominant/rare, missense | 2.67×10−5 | 4.1 [2.1 to 7.7] |
GNPNAT1 | All cases | Dominant/rare, missense | 2.71×10−5 | 27.4 [3.8 to 1193.7] |
CREBRF | Unknown or other | Dominant/rare/LIMBR50 | 2.78×10−5 | 22.9 [5.2 to 99.7] |
ANXA2 | CUMC only | Dominant/ultrarare, deleterious-predicted | 2.91×10−5 | 18.8 [3.7 to 181] |
TUBA1C | Hypertensive neph. | Dominant/ultrarare, deleterious-predicted | 2.98×10−5 | 94.4 [7.5 to 4770] |
GEMIN2 | PKD1/2 cases | Dominant/rare/LIMBR50 | 3.04×10−5 | 78.4 [11.3 to 469] |
COL4A3 | All cases | Dominant/ultrarare, deleterious-predicted | 3.05×10−5 | 7.9 [2.6 to 28.4] |
TMEFF1 | Tubulo-interstitial | Dominant/rare | 3.72×10−5 | 17.3 [4.7 to 53.2] |
Where a zero cell count left the OR and 95% CI undefined, we added a single count to the zero cell and report the OR and the undefined 95% CI bound as greater than (>) the resulting values. Three genes that had high numbers of likely artifactual variants have been excluded. FET, Fisher Exact Test; LOF, loss of function; neph, nephropathy.
Regional Missense Intolerance Informed Collapsing Analyses
Accounting for regional mutation intolerance in coding regions can increase the power of rare-variant association studies. Because some coding segments are more tolerant to missense variation, they are more likely to contain missense variants that do not cause disease, which can add noise when gene-based collapsing analyses are used alone. Hence, we included six models that included rare missense variants only if they fell within subgenic regions that are intolerant to missense variants, as determined by two measures of missense intolerance (Supplemental Method 3).42,44,45 We first evaluated the performance of the LIMBR45 and missense tolerance ratio (MTR)44 filters by analyzing known gene signals. Both methods increased enrichment of QVs in cases over controls in PKD1, with the MTR50 filter increasing significance by almost two orders of magnitude, demonstrating the utility of this approach (Table 4). The full comparison of the LIMBR- or MTR-filtered missense-only models across the five known genes (PKD1/2 and COL4A3/4/5) is shown in Supplemental Table 4.
Table 4.
Gene Name | Subgroup | Model Name | Exome-Wide Rank | Qualified Case Frequency | Qualified Control Frequency | Enriched Direction | OR | P Value | 95% CI Lower | 95% CI Upper |
---|---|---|---|---|---|---|---|---|---|---|
PKD1 | AURORA only | (Dominant) rare missense | 6 | 0.0965 | 0.0668 | Case | 1.49 | 4.73×10−04 | 1.20 | 1.85 |
PKD1 | AURORA only | (Dominant) rare missense LIMBR50 | 1 | 0.0902 | 0.0601 | Case | 1.55 | 1.95×10−04 | 1.24 | 1.93 |
PKD1 | AURORA only | (Dominant) rare missense MTR50 | 2 | 0.0631 | 0.032 | Case | 2.04 | 1.06×10−06 | 1.56 | 2.66 |
PKD1 | All samples | (Dominant) rare missense | 309 | 0.0797 | 0.0669 | Case | 1.21 | 0.0158 | 1.04 | 1.41 |
PKD1 | All samples | (Dominant) rare missense LIMBR50 | 102 | 0.0737 | 0.0602 | Case | 1.24 | 0.009 | 1.06 | 1.45 |
PKD1 | All samples | (Dominant) rare missense MTR50 | 14 | 0.0454 | 0.0318 | Case | 1.45 | 4.35×10−04 | 1.18 | 1.77 |
PKD1 | CUMC only | (Dominant) rare missense | 9321 | 0.0701 | 0.0669 | Case | 1.05 | 0.5932 | 0.87 | 1.27 |
PKD1 | CUMC only | (Dominant) rare missense LIMBR50 | 5600 | 0.0642 | 0.0602 | Case | 1.07 | 0.5075 | 0.88 | 1.30 |
PKD1 | CUMC only | (Dominant) rare missense MTR50 | 5997 | 0.0358 | 0.0323 | Case | 1.11 | 0.4129 | 0.86 | 1.44 |
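As a consistency check on Table 4, the odds ratios can be approximately recovered from the qualified case and control frequencies. The sketch below uses the AURORA-only rows; it computes a simple sample odds ratio from rounded frequencies, so small differences from the tabulated values are expected. The dictionary layout and function name are assumptions for the example.

```python
def odds_ratio_from_frequencies(case_freq, control_freq):
    """Odds ratio from the carrier frequencies in cases and in controls."""
    case_odds = case_freq / (1 - case_freq)
    control_odds = control_freq / (1 - control_freq)
    return case_odds / control_odds


# (case frequency, control frequency, OR as tabulated) for PKD1, AURORA only.
table4_aurora = {
    "rare missense": (0.0965, 0.0668, 1.49),
    "rare missense LIMBR50": (0.0902, 0.0601, 1.55),
    "rare missense MTR50": (0.0631, 0.032, 2.04),
}

recomputed = {
    model: odds_ratio_from_frequencies(case_f, control_f)
    for model, (case_f, control_f, _reported) in table4_aurora.items()
}
for model, value in recomputed.items():
    print(f"{model}: OR = {value:.2f}")
```

The intolerance filters lower the carrier frequency in both groups but remove proportionally more control carriers, which is what raises the odds ratio.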
Next, we applied the regional intolerance filters on a genome-wide basis (Supplemental Method 3), using predefined models that included missense variants only or both protein-truncating and missense variants (Supplemental Table 3). Although no additional gene achieved study-wide significance, there was now suggestive evidence for one gene, carnitine palmitoyltransferase 2 (CPT2; OMIM: 600650), after adopting the regional missense intolerance filters (Supplemental Figure 3). Compared with the standard missense-only model, the MTR50 filter augmented the significance of the signal by over three orders of magnitude in the combined CUMC-AURORA cohort (rare missense-only model, P=4.7×10−4; OR, 2.1; 95% CI, 1.4 to 3.3, versus rare missense-only MTR50 model; P=4.0×10−7; OR, 4.5; 95% CI, 2.4 to 8.5). Because mutations in CPT2 cause a recessive syndrome that can include AKI and cardiomyopathy (carnitine palmitoyltransferase II [CPT II] deficiency, OMIM phenotypes 255110, 600649, and 608836), all CPT2 variant carriers were further investigated for a second, potentially recessive allele. None of the QV carrier cases in our cohort carried a second CPT2 allele of any protein-altering class with a gnomAD MAF <1%, nor did qualifying individuals carry the Ser113Leu SNP (rs74315294), the most commonly reported variant linked to familial CPT II deficiency.52,53 The CPT2 signal also persisted after removing the cases with diagnostic mutations in known nephropathy genes. These findings indicate that the observed enrichment is not attributable to clinically undiagnosed cases of CPT II deficiency.
Genetic Modifiers Analysis
We next investigated genes that could potentially modify the risk of nephropathy in individuals with pathogenic variants in established disease-associated loci. We did not detect any noteworthy signals in subgroups with diagnostic mutations in PKD1/PKD2 (n=94) or in COL4A3/COL4A4/COL4A5 (n=87). Ancestry structure within each modifier group is shown in Supplemental Table 7. However, comparing the 137 CKD cases with the APOL1 risk genotypes to 2209 ancestry-matched controls, we identified a suggestive signal in the AT-Hook DNA Binding Motif Containing 1 gene (AHDC1; OMIM:615790) in the rare, nonbenign model (P=5.9×10−7; OR, 14.0; 95% CI, 5.0 to 37.9; Figure 1). This association was driven by nonsynonymous and indel variants distributed across the gene (nine QVs in 137 cases versus 11 QVs in 2209 controls) and was not detected among cases who carried only one or no APOL1 risk alleles (Supplemental Table 8). Moreover, of 206 controls with the APOL1 high-risk genotypes, only two carried a QV in AHDC1, further supporting the association (P=0.008).
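The AHDC1 effect size can be checked from the reported counts. The following is a minimal sketch, assuming that each of the nine case QVs and 11 control QVs sits in a distinct carrier and that a two-sided Fisher's exact test is an acceptable stand-in for the test behind the reported P value (the test is not named in this section). The function names are illustrative and not part of the study pipeline. Under these assumptions, the odds ratio evaluates to approximately 14.0, in line with the reported estimate.

```python
from math import comb


def odds_ratio(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """Sample odds ratio for carrying a qualifying variant (cases vs controls)."""
    case_non = n_cases - case_carriers
    ctrl_non = n_ctrls - ctrl_carriers
    return (case_carriers * ctrl_non) / (case_non * ctrl_carriers)


def fisher_exact_two_sided(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """Two-sided Fisher's exact P value from the hypergeometric distribution."""
    total = n_cases + n_ctrls
    carriers = case_carriers + ctrl_carriers
    denom = comb(total, n_cases)

    def prob(k):
        # Probability that exactly k of the carriers fall among the cases.
        return comb(carriers, k) * comb(total - carriers, n_cases - k) / denom

    p_obs = prob(case_carriers)
    k_min = max(0, n_cases - (total - carriers))
    k_max = min(carriers, n_cases)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-7))


# Counts from the text; one QV per carrier is an assumption.
AHDC1_OR = odds_ratio(9, 137, 11, 2209)
AHDC1_P = fisher_exact_two_sided(9, 137, 11, 2209)
print(f"OR = {AHDC1_OR:.1f}, two-sided Fisher P = {AHDC1_P:.1e}")
```

The exact hypergeometric sum is used rather than a chi-squared approximation because the expected number of carriers among cases is close to one, where large-sample approximations are unreliable.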
Look-Up of Top Genes in the UK Biobank Data
We next asked whether common variants in our top genes were associated with CKD traits in a recent analysis of the UK Biobank data.48 Overall, 45% of the 18,852 genes in this UK Biobank analysis were reported to contain, or lie in close proximity to (within 50 kb upstream/downstream of transcript boundaries), at least one SNP that is nominally associated (P<0.001) with a kidney trait. In comparison, all nine genes of interest, comprising those that achieved P<3.0×10−06 in our collapsing analyses (PKD1, PKD2, CPT2, AHDC1, SCLT1, COL4A5, and SLC17A1; specific phenotypes and lowest P values shown in Supplemental Table 9) and known genes with high diagnostic yield (COL4A3 and COL4A4), had at least one nominal association (P<0.001) with a nephropathy. The probability of detecting nine out of nine genetic associations by chance is low (probability, 7.6×10−4), which is consistent with an enrichment for true signals in the collapsing analysis. In addition, even after excluding known genes for kidney disorders (PKD1, PKD2, COL4A3, COL4A4, and COL4A5), the probability remains low that all four of our novel suggestive genes would have nominal associations with renal phenotypes (P=0.04), when compared with other genes that do not lead to primary renal disorders.
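The chance probabilities quoted above are consistent with a simple product of the 45% background rate across genes. The sketch below assumes that the genes behave independently and that the same background rate applies to every gene; the function name is illustrative, and the calculation is not confirmed here as the exact procedure used for the reported values.

```python
# Fraction of UK Biobank genes with a nominal kidney-trait association (from the text).
background_rate = 0.45


def prob_all_hit(n_genes, rate=background_rate):
    """Chance that all n_genes show a nominal association, assuming independence."""
    return rate ** n_genes


p_nine = prob_all_hit(9)  # all nine genes of interest
p_four = prob_all_hit(4)  # the four novel suggestive genes only
print(f"{p_nine:.1e}, {p_four:.2f}")  # prints "7.6e-04, 0.04"
```

If the background rate differed between known and novel genes, the second value would shift accordingly, so the agreement with the reported P=0.04 should be read as approximate.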
Implications for Future Sequencing Studies of CKD
The observed distribution of case and control carrier frequencies provides an opportunity to estimate the sample sizes that could implicate novel CKD genes at study-wide significance with 80% power (Supplemental Figure 4). Using the dominant ultrarare deleterious predicted model (Supplemental Table 10), we estimated achievable P values and accompanying confidence intervals for increased sample sizes under the assumption that our current carrier frequency estimates reflect the true frequencies and that any future sampling will reflect an identical case composition as the current case sample. Under these stringent assumptions, the data indicate that over a dozen susceptibility genes for all-cause CKD could be implicated within the range of effect sizes detected in this study, and a three-fold increase in case and control sample sizes would provide 80% power to detect them with collapsing analyses (Supplemental Table 10, Table 2).
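The dependence of power on cohort size can be illustrated with a rough calculation. The sketch below is not the procedure used for Supplemental Figure 4; it swaps in a normal approximation to a two-sided two-proportion test, and the carrier frequencies and significance threshold are hypothetical placeholders chosen only for illustration. Only the cohort sizes (3150 cases, 9563 controls) are taken from the study.

```python
from statistics import NormalDist


def approx_power(p_case, p_ctrl, n_case, n_ctrl, alpha):
    """Approximate power of a two-sided two-proportion z-test (normal approximation)."""
    se = (p_case * (1 - p_case) / n_case + p_ctrl * (1 - p_ctrl) / n_ctrl) ** 0.5
    z_effect = abs(p_case - p_ctrl) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The negligible contribution of the opposite tail is ignored.
    return NormalDist().cdf(z_effect - z_crit)


# Hypothetical inputs for illustration only (not values from this study).
P_CASE, P_CTRL = 0.004, 0.001
ALPHA = 1e-6

# Cohort sizes from the study, scaled one-, two-, and three-fold.
N_CASE, N_CTRL = 3150, 9563
powers = {
    scale: approx_power(P_CASE, P_CTRL, scale * N_CASE, scale * N_CTRL, ALPHA)
    for scale in (1, 2, 3)
}
for scale, power in powers.items():
    print(f"{scale}x cohort: approximate power = {power:.2f}")
```

Because carrier counts for rare variants are small, an exact or simulation-based calculation would be preferable in practice; the approximation is used here only to show how power rises as cases and controls are scaled together.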
Discussion
We present the largest whole-exome association analysis of CKD to date, comprising 3150 cases representing all major clinical phenotypes. We observe three study-wide significant genes, PKD1, PKD2, and COL4A5, and enrichment of rare variants in two others, COL4A3 and COL4A4, all of which are established causes of Mendelian nephropathy. The highly significant signals in known genes confirmed the strength of our analytic approach in detecting cases with hereditary nephropathy from a cohort composed of different subtypes of CKD. The signals for some significant genes, such as PKD1 and PKD2, were detected almost entirely in the Mendelian CKD category, indicating it is unlikely that variants in these genes contribute to all-cause CKD. On the other hand, it is noteworthy that the signals in COL4A5, COL4A3, and COL4A4 emerged from the combined cohort, spanning major clinical disease categories. Mutations in these genes cause X-linked and autosomal forms of Alport syndrome, but recent studies report pathogenic variants in patients diagnosed with other forms of CKD, such as FSGS.12–14 Hence, the association of these genes with CKD across combined clinical categories provides additional and more formal quantitative evidence that rare variation in these genes contributes to many clinical forms of nephropathy.
The AURORA and CUMC cohorts differed in ascertainment and recruitment strategies, likely explaining variation in signal strength for some genes between these two cohorts. The AURORA cohort was composed of older persons (50–80 years old) on maintenance dialysis, and the known relative survival advantage of patients with ADPKD on hemodialysis may explain the stronger PKD1/PKD2 representation in this cohort. CUMC’s enrichment for glomerular diseases may explain the stronger COL4A3/COL4A4/COL4A5 representation in that cohort. It should also be noted that we expect to observe some sampling variability, especially in the context of sparse observation rates in relatively small individual cohorts.
We detected three interesting suggestive signals (SLC17A1, CPT2, and SCLT1). Although a recent analysis of UK Biobank data provides supportive evidence, replication in larger cohorts will be required. SLC17A1 encodes a renal urate transporter, and common variants in this gene have been associated with uric acid levels in large GWAS analyses.49,50 A number of genetic and epidemiologic studies have indicated a link between serum uric acid levels and CKD, hypertension, and cardiovascular disease. Elevated uric acid levels have been independently associated with hypertension in most studies, but their association with cardiovascular disease is not consistent. Hence, if the SLC17A1 signal is confirmed, it may clarify the causal relationship between uric acid levels, CKD, and cardiovascular disease.
The second gene, CPT2, encodes the CPT II protein, which is located in the mitochondrial membrane. CPT II deficiency (OMIM: 255110, 600649, and 608836) is a recessive disease of fatty acid metabolism that can cause severe cardiac and hepatic failure in its neonatal form, whereas its later-onset, myopathic form can produce acute tubular necrosis from rhabdomyolysis.53 Although heterozygous carriers of CPT2 missense variants may experience mild metabolic defects,54,55 to our knowledge, renal or cardiac dysfunction has not been reported except in individuals with biallelic, pathogenic mutations.
The third suggestive signal was in SCLT1, encoding a ciliary protein that has been nominated as a candidate gene for human nephropathy56 because inactivation of its ortholog in mice causes cystic kidney disease.51 Although mitochondrial dysfunction and ciliary disorders usually cause pediatric forms of CKD,56–59 the SCLT1 and CPT2 signals in this adult cohort suggest that dysfunction in these pathways may also contribute to later-onset CKD.
The APOL1 G1 and G2 alleles have risen to high MAF among populations of African ancestry because they confer protection against trypanosomal infection8; however, individuals who harbor two risk variants have a greater risk of several types of nephropathy.10,60,61 In addition, APOL1 risk genotypes may independently predispose to cardiovascular disease.62 Because only a minority develop nephropathy, genetic or environmental modifiers have been suspected and a recent study has implicated common variants in UBD as genetic modifiers.63 We did not detect any association with rare variants in UBD, but found suggestive evidence that rare, nonsynonymous variants in AHDC1 increase the risk of CKD in patients with the APOL1 dual-risk G1/G2 alleles. AHDC1 encodes a nuclear protein with a predicted DNA binding domain that is ubiquitously expressed. De novo mutations cause Xia–Gibbs syndrome, an intellectual disability syndrome that also features sleep apnea and various dysmorphologies.64 The function of AHDC1 is not completely understood, but this suggestive signal encourages further investigation to confirm this association, and, if replicated, to examine the potential links to APOL1 biology.
A major challenge in rare-variant association studies is distinguishing deleterious variation, particularly missense substitutions, from neutral variation. Algorithms such as PolyPhen-2 that are based on evolutionary sequence conservation65 have proven helpful, but they do not leverage the large amount of population genetic data now available,66 which enables detection of coding regions that are intolerant to missense variation. We incorporated two filtering methods using regional missense intolerance, MTR and LIMBR, and both showed utility for optimizing missense variant signals in known genes (e.g., PKD1) and also detected a suggestive signal in the CPT2 gene. Systematic examination of MTR/LIMBR filters in exome-wide association studies, and extension of these methods, will better delineate opportunities for application of this approach.
Understanding the genetic basis of nephropathy can clarify the relationship between CKD and its comorbidities. For example, CKD is an established risk factor for cardiovascular mortality, but some genetic nephropathies are associated with specific cardiovascular complications such as cerebral aneurysms (e.g., polycystic kidney disease), cardiac structural defects (e.g., coloboma, heart defects, atresia choanae, growth retardation, genital abnormalities, and ear abnormalities [CHARGE], Noonan, Bardet–Biedl, and Alagille syndromes), cardiomyopathy (e.g., Fabry disease or hereditary amyloidosis), or premature atherosclerosis (e.g., maturity onset diabetes of the young). Hence, exome sequencing can help differentiate cardiovascular complications that are linked to CKD as a manifestation of the same genetic defect from those that arise as a consequence of renal impairment per se. Such genetic information can also improve risk stratification and adjudication of outcomes in interventional studies such as AURORA, which were undertaken to modify the risk of cardiovascular disease in renal failure.
In summary, this study explored the potential of exome sequencing and collapsing analysis to detect causal genes for nephropathy in a large, ancestrally diverse cohort representing broad clinical subcategories of CKD. We detected study-wide signals for known nephropathy genes and suggestive signals for several candidate genes among the all-cause CKD cohort. These data suggest that genetic risk factors may contribute to all-cause CKD, but their detection would require a larger sample size. For example, we estimate that a doubling of the case size would have yielded a study-wide signal for COL4A4 and other suggestive signals of similar magnitude. With the declining cost of whole-exome sequencing, such studies are increasingly feasible and may enable more precise patient classification on the basis of molecular subtypes. Our findings thus encourage more systematic investigation of the contribution of rare variants to disease risk in larger cohorts and within each CKD disease category.
Disclosures
This study was sponsored by the Innovative Medicines and Early Development (IMED) Biotech Unit, AstraZeneca. Authors Cameron-Christie, Fleckner, March, Haefliger, Platt, and Petrovski are employees of AstraZeneca, and Goldstein is an adviser for AstraZeneca. Participants were recruited for “A Study to Evaluate the Use of Rosuvastatin in Subjects on Regular Hemodialysis: An Assessment of Survival and Cardiovascular Events” (AURORA), National Institutes of Health code NCT00240331. Sample collection for the Columbia University Irving Medical Center (CUIMC) cohort was partially supported by National Institutes of Health National Institute of Diabetes and Digestive and Kidney Diseases grant R01 DK0800899. The sequencing of the CUIMC cohort was supported by the CUIMC Precision Medicine Initiative.
Acknowledgments
We thank CUMC and AURORA participants in this study, and all investigators involved in the AURORA study.
Dr. Gharavi, Dr. Goldstein, and Dr. Platt jointly designed and supervised the project and contributed equally to this article. Dr. Platt led the AstraZeneca scientific collaboration. Dr. Gharavi led the clinical and genetic concepts, Dr. Goldstein led the statistical and methodological concepts, and Dr. Gharavi led the translational concepts and the collaboration. Dr. March provided leadership and discussion on the project’s design and implementation. Dr. Haefliger and Dr. Fleckner led the collaboration with the original AURORA study, including selection, collection, and interpretation of AURORA samples and data, and provided additional clinical expertise. Dr. Fellström was the leader of the original AURORA study and provided feedback and access to study data. Dr. Groopman performed molecular diagnoses of cases under clinical genetic guidelines and provided insight into renal genetics for the results in this paper. Dr. Petrovski helped with the design, execution, and interpretation of all statistical analyses and performed the power calculations. Dr. Gelfman designed and helped interpret the regional collapsing analyses, which were performed by Dr. Wolock. Dr. Kamalakaran helped manage the pipeline for the sequencing and bioinformatic analysis. Dr. Zhang and Dr. Allen provided analytical advice and performed exploratory analyses. Dr. Marasa contributed to patient recruitment and clinical data collection. Dr. Li contributed to sample procurement and preparation. Dr. Sanna-Cherchi and Dr. Kiryluk contributed to data analysis and interpretation. Dr. Wolock provided training for the collapsing analysis for Dr. Cameron-Christie, and Dr. Cameron-Christie and Dr. Wolock performed the published collapsing analyses. Dr. Cameron-Christie led the preparation and writing of the manuscript, and Dr. Wolock, Dr. Gharavi, Dr. Goldstein, and Dr. Petrovski contributed to preparation and writing. All authors provided feedback on results and the final manuscript.
The study was supported by R01-MD009223 (to Dr. Gharavi) and the Institute of Genomic Medicine and the Columbia Precision Medicine Initiative. Dr. Groopman is supported by the National Institutes of Health grant 1F30DK116473. Dr. Cameron-Christie is a fellow of the AstraZeneca postdoctorate program, and reports other from Innovative Medicines and Early Development (IMED) Biotech Unit, AstraZeneca, during the conduct of the study; other from IMED Biotech Unit, AstraZeneca, outside the submitted work. Dr. Groopman reports grants from National Institutes of Health (grant number: 1F30DK116473-01), during the conduct of the study. Dr. Petrovski reports personal fees from AstraZeneca, during the conduct of the study. Dr. Povysil reports grants from AstraZeneca, during the conduct of the study; grants from AstraZeneca, outside the submitted work. Dr. Fleckner reports personal fees from AstraZeneca, personal fees from Novo Nordisk, outside the submitted work. Dr. March reports personal fees and other from AstraZeneca, outside the submitted work. Dr. Allen reports personal fees from AstraZeneca, during the conduct of the study; personal fees from AstraZeneca, outside the submitted work. Dr. Fellström reports grants and personal fees from Pharmalink/Calliditas, personal fees from Alexion, grants from Bristol Myers Squibb, other from Sandoz Pharmaceutical, outside the submitted work. Dr. Haefliger reports personal fees from AstraZeneca, during the conduct of the study; personal fees from AstraZeneca, outside the submitted work. Dr. Platt reports personal fees from AstraZeneca, during the conduct of the study; personal fees from AstraZeneca, outside the submitted work. Dr. Goldstein reports personal fees from AstraZeneca, during the conduct of the study; other from Pairnomix, other from Praxis Therapeutics, other from Apostle Inc., outside the submitted work. Dr. Gharavi reports receiving research grants from the National Institutes of Health and the Renal Research Institute.
Data availability: Individual-level data for 1393 CUMC participants are available via the National Center for Biotechnology Information (NCBI) dbGaP database (accession number pending). All diagnostic variants discovered in this cohort have been submitted to the NCBI ClinVar database under accession numbers SCV000809114-SCV000809473. For the remaining CUMC participants, summary-level genetic data can be provided upon request to the corresponding authors. Deidentified aggregate data from the AURORA cohort can be made available upon request from the AstraZeneca research portal (https://astrazenecagroup-dt.pharmacm.com/DT/Home/Index/).
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
Supplemental Material
This article contains the following supplemental material online at http://jasn.asnjournals.org/lookup/suppl/doi:10.1681/ASN.2018090909/-/DCSupplemental.
Supplemental Method 1. IGM bioinformatics processing.
Supplemental Method 2. Quality metrics.
Supplemental Method 3. Missense intolerance scores.
Supplemental Method 4. UK Biobank lookup.
Supplemental Figure 1. APOL1 PCA plots.
Supplemental Figure 2. QQ plots of synonymous variants.
Supplemental Figure 3. CPT2 enrichment, MTR filter.
Supplemental Figure 4. Power curve.
Supplemental Table 1. Cohort demographics.
Supplemental Table 2. Control cohort phenotypes.
Supplemental Table 3. Qualifying variant parameters and subgroups.
Supplemental Table 4. LIMBR and MTR comparison.
Supplemental Table 5. All genes across all models where P<0.05.
Supplemental Table 6. Clinical phenotypes in known genes.
Supplemental Table 7. Ancestry prediction in modifier analysis.
Supplemental Table 8. AHDC1 rare QV rates in APOL1.
Supplemental Table 9. Look-up of UK Biobank data.
Supplemental Table 10. Predicting P value signals.
References
- 1.GBD 2015 Mortality and Causes of Death Collaborators : Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980-2015: A systematic analysis for the Global Burden of Disease Study 2015. Lancet 388: 1459–1544, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Webster AC, Nagler EV, Morton RL, Masson P: Chronic kidney disease. Lancet 389: 1238–1252, 2017 [DOI] [PubMed] [Google Scholar]
- 3.United States Renal Data System : USRDS Annual Data Report: Epidemiology of Kidney Disease in the United States, Bethesda, MD, National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, 2017 [Google Scholar]
- 4.Mallett A, Patel C, Salisbury A, Wang Z, Healy H, Hoy W: The prevalence and epidemiology of genetic renal disease amongst adults with chronic kidney disease in Australia. Orphanet J Rare Dis 9: 98, 2014 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Freedman BI, Soucie JM, McClellan WM: Family history of end-stage renal disease among incident dialysis patients. J Am Soc Nephrol 8: 1942–1945, 1997 [DOI] [PubMed] [Google Scholar]
- 6.Gumprecht J, Zychma MJ, Moczulski DK, Gosek K, Grzeszczak W: Family history of end-stage renal disease among hemodialyzed patients in Poland. J Nephrol 16: 511–515, 2003 [PubMed] [Google Scholar]
- 7.Connaughton DM, Bukhari S, Conlon P, Cassidy E, O’Toole M, Mohamad M, et al.: The Irish kidney gene project--prevalence of family history in patients with kidney disease in Ireland. Nephron 130: 293–301, 2015 [DOI] [PubMed] [Google Scholar]
- 8.Genovese G, Friedman DJ, Ross MD, Lecordier L, Uzureau P, Freedman BI, et al.: Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329: 841–845, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Friedman DJ, Pollak MR: Apolipoprotein L1 and kidney disease in African Americans. Trends Endocrinol Metab 27: 204–215, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Foster MC, Coresh J, Fornage M, Astor BC, Grams M, Franceschini N, et al.: APOL1 variants associate with increased risk of CKD among African Americans. J Am Soc Nephrol 24: 1484–1491, 2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Papeta N, Kiryluk K, Patel A, Sterken R, Kacak N, Snyder HJ, et al.: APOL1 variants increase risk for FSGS and HIVAN but not IgA nephropathy. J Am Soc Nephrol 22: 1991–1996, 2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Gast C, Pengelly RJ, Lyon M, Bunyan DJ, Seaby EG, Graham N, et al.: Collagen (COL4A) mutations are the most frequent mutations underlying adult focal segmental glomerulosclerosis. Nephrol Dial Transplant 31: 961–970, 2016 [DOI] [PubMed] [Google Scholar]
- 13.Malone AF, Phelan PJ, Hall G, Cetincelik U, Homstad A, Alonso AS, et al.: Rare hereditary COL4A3/COL4A4 variants may be mistaken for familial focal segmental glomerulosclerosis. Kidney Int 86: 1253–1259, 2014 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Lata S, Marasa M, Li Y, Fasel DA, Groopman E, Jobanputra V, et al.: Whole-exome sequencing in adults with chronic kidney disease: A pilot study. Ann Intern Med 168: 100–109, 2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Wuttke M, Seidl M, Malinoc A, Prischl FC, Kuehn EW, Walz G, et al.: A COL4A5 mutation with glomerular disease and signs of chronic thrombotic microangiopathy. Clin Kidney J 8: 690–694, 2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Köttgen A, Glazer NL, Dehghan A, Hwang SJ, Katz R, Li M, et al.: Multiple loci associated with indices of renal function and chronic kidney disease. Nat Genet 41: 712–717, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Gorski M, van der Most PJ, Teumer A, Chu AY, Li M, Mijatovic V, et al.: 1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function. Sci Rep 7: 45040, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Köttgen A, Pattaro C, Böger CA, Fuchsberger C, Olden M, Glazer NL, et al.: New loci associated with kidney function and chronic kidney disease. Nat Genet 42: 376–384, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Pattaro C, Teumer A, Gorski M, Chu AY, Li M, Mijatovic V, et al.: ICBP Consortium; AGEN Consortium; CARDIOGRAM; CHARGe-Heart Failure Group; ECHOGen Consortium : Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function. Nat Commun 7: 10023, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Abifadel M, Varret M, Rabès JP, Allard D, Ouguerram K, Devillers M, et al.: Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat Genet 34: 154–156, 2003 [DOI] [PubMed] [Google Scholar]
- 21.Obradovic M, Zaric B, Sudar-Milovanovic E, Ilincic B, Perovic M, Stokic E, et al.: PCSK9 and hypercholesterolemia: Therapeutic approach. Curr Drug Targets 19: 1058–1067, 2018 [DOI] [PubMed] [Google Scholar]
- 22.Morgan P, Brown DG, Lennard S, Anderton MJ, Barrett JC, Eriksson U, et al.: Impact of a five-dimensional framework on R&D productivity at AstraZeneca. Nat Rev Drug Discov 17: 167–181, 2018 [DOI] [PubMed] [Google Scholar]
- 23.Zhu X, Padmanabhan R, Copeland B, Bridgers J, Ren Z, Kamalakaran S, et al.: A case-control collapsing analysis identifies epilepsy genes implicated in trio sequencing studies focused on de novo mutations. PLoS Genet 13: e1007104, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Lee S, Abecasis GR, Boehnke M, Lin X: Rare-variant association analysis: Study designs and statistical tests. Am J Hum Genet 95: 5–23, 2014 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Guo MH, Dauber A, Lippincott MF, Chan YM, Salem RM, Hirschhorn JN: Determinants of power in gene-based burden testing for monogenic disorders. Am J Hum Genet 99: 527–539, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Sanna-Cherchi S, Khan K, Westland R, Krithivasan P, Fievet L, Rasouly HM, et al.: Exome-wide association study identifies GREB1L mutations in congenital kidney malformations. Am J Hum Genet 101: 789–802, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.De Tomasi L, David P, Humbert C, Silbermann F, Arrondel C, Tores F, et al.: Mutations in GREB1L cause bilateral kidney agenesis in humans and mice. Am J Hum Genet 101: 803–814, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Cirulli ET, Lasseigne BN, Petrovski S, Sapp PC, Dion PA, Leblond CS, et al.: FALS Sequencing Consortium : Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways. Science 347: 1436–1441, 2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Bagnall RD, Crompton DE, Petrovski S, Lam L, Cutmore C, Garry SI, et al.: Exome-based analysis of cardiac arrhythmia, respiratory control, and epilepsy genes in sudden unexpected death in epilepsy. Ann Neurol 79: 522–534, 2016 [DOI] [PubMed] [Google Scholar]
- 30.Petrovski S, Todd JL, Durheim MT, Wang Q, Chien JW, Kelly FL, et al.: An exome sequencing study to assess the role of rare genetic variation in pulmonary fibrosis. Am J Respir Crit Care Med 196: 82–93, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Besse W, Dong K, Choi J, Punia S, Fedeles SV, Choi M, et al.: Isolated polycystic liver disease genes define effectors of polycystin-1 function. J Clin Invest 127: 3558, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Stuart BD, Choi J, Zaidi S, Xing C, Holohan B, Chen R, et al.: Exome sequencing links mutations in PARN and RTEL1 with familial pulmonary fibrosis and telomere shortening. Nat Genet 47: 512–517, 2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Groopman EE, Marasa M, Cameron-Christie S, Petrovski S, Aggarwal VS, Milo-Rasouly H, et al.: Diagnostic utility of exome sequencing for kidney disease. N Engl J Med 380: 142–151, 2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Fellström BC, Jardine AG, Schmieder RE, Holdaas H, Bannister K, Beutler J, et al.: AURORA Study Group : Rosuvastatin and cardiovascular events in patients undergoing hemodialysis. N Engl J Med 360: 1395–1407, 2009 [DOI] [PubMed] [Google Scholar]
- 35.Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Ren Z, Petrovski S, Cirulli E, Wang Q, Copeland B, Bridgers J, et al. : Analysis tool for annotated variants—a comprehensive platform for population-scale genomic analyses. Presented at the Biological Data Analysis Meeting, Cold Spring Harbor, NY, 2016 [Google Scholar]
- 37.Jun G, Flickinger M, Hetrick KN, Romm JM, Doheny KF, Abecasis GR, et al.: Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am J Hum Genet 91: 839–848, 2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen W-M: Robust relationship inference in genome-wide association studies. Bioinformatics 26: 2867–2873, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. : Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17: 405–424, 2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Epi4K Consortium; Epilepsy Phenome/Genome Project : Ultra-rare genetic variation in common epilepsies: A case-control sequencing study. Lancet Neurol 16: 135–143, 2017 [DOI] [PubMed] [Google Scholar]
- 41.Petrovski S, Wang Q, Heinzen EL, Allen AS, Goldstein DB: Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet 9: e1003709, 2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Gussow AB, Petrovski S, Wang Q, Allen AS, Goldstein DB: The intolerance to functional genetic variation of protein domains predicts the localization of pathogenic mutations within genes. Genome Biol 17: 9, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Gelfman S, Dugger SA, Moreno CAM, Ren Z, Wolock CJ, Shneider NA, et al. : Regional collapsing of rare variation implicates specific genic regions in ALS. bioRxiv 375774, 2018 [Google Scholar]
- 44.Traynelis J, Silk M, Wang Q, Berkovic SF, Liu L, Ascher DB, et al.: Optimizing genomic medicine in epilepsy through a gene-customized approach to missense variant interpretation. Genome Res 27: 1715–1729, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Hayeck TJ, Stong N, Wolock CJ, Copeland B, Kamalakaran S, Goldstein DB, et al.: Improved pathogenic variant localization via a hierarchical model of sub-regional intolerance. Am J Hum Genet 104: 299–309, 2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Tzur S, Rosset S, Shemer R, Yudkovsky G, Selig S, Tarekegn A, et al.: Missense mutations in the APOL1 gene are highly associated with end stage kidney disease risk previously attributed to the MYH9 gene. Hum Genet 128: 345–350, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.R Core Team : R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available at: https://www.R-project.org/. Accessed April 21, 2017
- 48.Canela-Xandri O, Rawlik K, Tenesa A: An atlas of genetic associations in UK Biobank. Nat Genet 50: 1593–1599, 2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Yang Q, Köttgen A, Dehghan A, Smith AV, Glazer NL, Chen MH, et al.: Multiple genetic loci influence serum urate levels and their relationship with gout and cardiovascular disease risk factors. Circ Cardiovasc Genet 3: 523–530, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Kolz M, Johnson T, Sanna S, Teumer A, Vitart V, Perola M, et al.: EUROSPAN Consortium; ENGAGE Consortium; PROCARDIS Consortium; KORA Study; WTCCC : Meta-analysis of 28,141 individuals identifies common variants within five new loci that influence uric acid concentrations. PLoS Genet 5: e1000504, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Li J, Lu D, Liu H, Williams BO, Overbeek PA, Lee B, et al.: Sclt1 deficiency causes cystic kidney by activating ERK and STAT3 signaling. Human molecular genetics, 26: 2949–2960, 2017 [DOI] [PubMed] [Google Scholar]
- 52.Taroni F, Verderio E, Dworzak F, Willems PJ, Cavadini P, DiDonato S: Identification of a common mutation in the carnitine palmitoyltransferase II gene in familial recurrent myoglobinuria patients. Nat Genet 4: 314–320, 1993 [DOI] [PubMed] [Google Scholar]
- 53.Deschauer M, Wieser T, Zierz S: Muscle carnitine palmitoyltransferase II deficiency: Clinical and molecular genetic features and diagnostic aspects. Arch Neurol 62: 37–41, 2005 [DOI] [PubMed] [Google Scholar]
- 54.Ørngreen MC, Dunø M, Ejstrup R, Christensen E, Schwartz M, Sacchetti M, et al.: Fuel utilization in subjects with carnitine palmitoyltransferase 2 gene mutations. Ann Neurol 57: 60–66, 2005 [DOI] [PubMed] [Google Scholar]
- 55.Taggart RT, Smail D, Apolito C, Vladutiu GD: Novel mutations associated with carnitine palmitoyltransferase II deficiency. Hum Mutat 13: 210–220, 1999 [DOI] [PubMed] [Google Scholar]
- 56.Vivante A, Hildebrandt F: Exploring the genetic basis of early-onset chronic kidney disease. Nat Rev Nephrol 12: 133–146, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Gritti AP, Boletta A: The ciliopathies--polycystic kidney disease. G Ital Nefrol 32: 59–65, 2015. [PubMed] [Google Scholar]
- 58.Van Driest SL, McGregor TL, Velez Edwards DR, Saville BR, Kitchner TE, Hebbring SJ, et al.: Genome-wide association study of serum creatinine levels during vancomycin therapy. PLoS One 10: e0127791, 2015
- 59.Adly N, Alhashem A, Ammari A, Alkuraya FS: Ciliary genes TBC1D32/C6orf170 and SCLT1 are mutated in patients with OFD type IX. Hum Mutat 35: 36–40, 2014
- 60.Dummer PD, Limou S, Rosenberg AZ, Heymann J, Nelson G, Winkler CA, et al.: APOL1 kidney disease risk variants: An evolving landscape. Semin Nephrol 35: 222–236, 2015
- 61.Kasembeli AN, Duarte R, Ramsay M, Mosiane P, Dickens C, Dix-Peek T, et al.: APOL1 risk variants are strongly associated with HIV-associated nephropathy in black South Africans. J Am Soc Nephrol 26: 2882–2890, 2015
- 62.Gutiérrez OM, Irvin MR, Chaudhary NS, Cushman M, Zakai NA, David VA, et al.: APOL1 nephropathy risk variants and incident cardiovascular disease events in community-dwelling black adults. Circ Genom Precis Med 11: e002098, 2018
- 63.Zhang JY, Wang M, Tian L, Genovese G, Yan P, Wilson JG, et al.: UBD modifies APOL1-induced kidney disease risk. Proc Natl Acad Sci U S A 115: 3446–3451, 2018
- 64.Xia F, Bainbridge MN, Tan TY, Wangler MF, Scheuerle AE, Zackai EH, et al.: De novo truncating mutations in AHDC1 in individuals with syndromic expressive language delay, hypotonia, and sleep apnea. Am J Hum Genet 94: 784–789, 2014
- 65.Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al.: A method and server for predicting damaging missense mutations. Nat Methods 7: 248–249, 2010
- 66.Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, et al.: Exome Aggregation Consortium: Analysis of protein-coding genetic variation in 60,706 humans. Nature 536: 285–291, 2016
Associated Data