In the current issue of JASN, Robinson-Cohen et al. present the latest genome-wide association study (GWAS) evaluating the impact of common genetic variants on kidney function decline using repeated measurements of eGFR in more than 100,000 participants of the Million Veteran Program and Vanderbilt BioVU biobank.1 Participants had CKD defined by at least two eGFR measurements of <60 ml/min per 1.73 m2 at enrollment. The largest population-based cross-sectional GWAS of eGFR including >1.5 million participants identified 878 loci,2 and the largest longitudinal eGFR decline GWAS including 343,000 participants in 62 cohorts identified 12 loci.3 Why were only four loci identified in the current study?
In earlier eGFR GWAS iterations, meta-analysis of cohorts with cross-sectional eGFR on the basis of a single measurement allowed for including the largest number of cohorts and participants. These cohorts included participants with kidney function in the normal range and were collected in population-based studies or recruited for other phenotypes. Very large sample sizes are critical to increase statistical power and overcome sampling and measurement error given the relatively small effects of common variants in GWAS. For an individual, cross-sectional eGFR represents the combination of maximally attained lifetime eGFR (nephron endowment) as well as the subsequent loss of eGFR that occurs throughout life with aging or disease. Testing for statistical interaction between genetic effects and age at measurement is one way to test for an association with decline in cross-sectional data (e.g., genetic effects that become larger in those with older age). Is longitudinal eGFR measurement a better way to assess the genetic architecture of eGFR decline rather than nephron endowment?
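The age-interaction test described above can be written as a single regression. The following is a minimal sketch under assumed notation that does not appear in the studies discussed: G_i is the allele count of a variant in participant i, A_i is age at measurement, and the beta coefficients are hypothetical labels.

```latex
\mathrm{eGFR}_i = \beta_0 + \beta_G G_i + \beta_A A_i + \beta_{GA}\, G_i A_i + \varepsilon_i
```

Under this sketch, \beta_G captures a genetic effect that is constant across ages (consistent with nephron endowment), whereas a nonzero \beta_{GA} indicates a genetic effect that changes with age, which is the cross-sectional signature of an association with decline.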
The simplest way to evaluate eGFR decline in longitudinal data is to take the difference between the first and last eGFR measurement and divide it by the length of follow-up time (Figure 1). This simple approach is easy to apply to many cohorts, even if few eGFR measurements have been made. However, many phenotype adjustments remain possible. Do you create a dichotomous phenotype suggestive of rapid kidney loss (e.g., loss of more than 3 ml/min per 1.73 m2 per year, named Rapid3) or incident CKD development (e.g., loss of more than 25% of GFR and final eGFR <60 ml/min per 1.73 m2, named CKDi25)?4 How do we best model the assumption that an eGFR decline from 30 to 25 ml/min per 1.73 m2 is more impactful than from 90 to 85 ml/min per 1.73 m2? Should eGFR be modeled as % decline rather than absolute decline per year? In the example above, a 5 ml/min per 1.73 m2 per year decline would be a 16.7% decline from a baseline of 30 ml/min per 1.73 m2 but only a 5.6% decline from a baseline of 90 ml/min per 1.73 m2. A slightly more complicated approach is to fit a simple linear regression within each participant, using multiple eGFR measurements to estimate the rate of decline. Although increasing the complexity of the phenotype definition may reduce measurement error, it may also reduce the number of participants and cohorts available in the analysis.
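The phenotype definitions discussed above can be illustrated with a short Python sketch. The `Visit` class and all function names are hypothetical and introduced only for illustration, and the thresholds follow the simplified wording in this editorial rather than the full cohort definitions in the cited studies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Visit:
    years: float  # time since the first measurement, in years
    egfr: float   # eGFR in ml/min per 1.73 m2


def two_point_slope(visits: List[Visit]) -> float:
    """(last - first) / follow-up time; negative values mean decline."""
    first, last = visits[0], visits[-1]
    return (last.egfr - first.egfr) / (last.years - first.years)


def regression_slope(visits: List[Visit]) -> float:
    """Ordinary least-squares slope of eGFR on time within one participant."""
    n = len(visits)
    mean_t = sum(v.years for v in visits) / n
    mean_y = sum(v.egfr for v in visits) / n
    sxx = sum((v.years - mean_t) ** 2 for v in visits)
    sxy = sum((v.years - mean_t) * (v.egfr - mean_y) for v in visits)
    return sxy / sxx


def percent_decline_per_year(visits: List[Visit]) -> float:
    """Annual decline as a percentage of the first eGFR measurement."""
    first, last = visits[0], visits[-1]
    lost_fraction = (first.egfr - last.egfr) / first.egfr
    return 100.0 * lost_fraction / (last.years - first.years)


def is_rapid3(visits: List[Visit]) -> bool:
    """Simplified Rapid3: loss of more than 3 ml/min per 1.73 m2 per year."""
    return two_point_slope(visits) < -3.0


def is_ckdi25(visits: List[Visit]) -> bool:
    """Simplified CKDi25: more than 25% loss and final eGFR below 60."""
    first, last = visits[0], visits[-1]
    return last.egfr < 60.0 and (first.egfr - last.egfr) / first.egfr > 0.25


# The same absolute loss of 5 ml/min per 1.73 m2 over one year,
# starting from a low and a high baseline.
low = [Visit(0.0, 30.0), Visit(1.0, 25.0)]
high = [Visit(0.0, 90.0), Visit(1.0, 85.0)]
print(round(percent_decline_per_year(low), 1))   # 16.7
print(round(percent_decline_per_year(high), 1))  # 5.6
```

Both example participants meet the simplified Rapid3 threshold, yet their relative losses differ threefold, which is the reason the choice between absolute and percentage decline matters.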
The analysis of Robinson-Cohen et al. used two cohorts with access to primary data rather than meta-analysis of many cohorts, allowing for the next step forward in eGFR decline modeling using linear mixed models. The method models the rate of eGFR decline incorporating the repeated eGFR measurements in each individual, allowing for random effects to impact eGFR slopes and intercepts in each individual while estimating the fixed effects of each variant, age at baseline, sex, and ancestry across all samples and repeated for every variant in the GWAS. Potential biases that could affect linear mixed model analysis are variable censoring (e.g., those with more rapid kidney function decline may die or develop kidney failure reducing the number of eGFR measurements they contribute to the model) and variable rates of kidney function assessment (e.g., those with more eGFR assessments may represent people with more rapid decline). Theoretically, linear mixed models are the most accurate way to model eGFR loss with multiple measurements, and the authors should be commended for integrating this complexity into the analysis.
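One common way to write a linear mixed model of this kind is sketched below. This is an illustrative formulation under assumed notation, not necessarily the exact specification used by Robinson-Cohen et al.: y_{ij} is the j-th eGFR measurement (or a transformation of it) in participant i, t_{ij} is time since baseline, G_i is the allele count of the tested variant, and \mathbf{x}_i collects covariates such as age at baseline, sex, and ancestry components.

```latex
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij}
       + \beta_G G_i + \beta_{Gt}\, G_i t_{ij}
       + \boldsymbol{\gamma}^{\top} \mathbf{x}_i + \varepsilon_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

In this sketch, b_{0i} and b_{1i} are the participant-specific random intercept and random slope, and the fixed-effect coefficient \beta_{Gt} is the per-allele difference in the rate of eGFR change. The model is refit for every variant in the GWAS.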
Was the tradeoff between sophisticated trait modeling versus sample size worth it? Beyond disease gene identification, summary statistics from GWAS are commonly used for development of polygenic scores and in multiomic investigations. High-quality unbiased estimates of variant effect sizes are important even for variants that do not reach the Bonferroni-corrected GWAS significance threshold of P < 5×10−8. A larger effect was observed for the lead variant in the UMOD locus (rs77924615, −0.30% per year in Robinson-Cohen et al. compared with −0.07 ml/min per 1.73 m2 baseline-unadjusted per year in the analysis by Gorski et al.3), as well as larger effects for nine of the other 11 previously reported eGFR decline variants despite ten of 11 not being GWAS significant. All 12 variants had concordant direction of effects between the two studies. The larger observed effect with the linear mixed models suggests that greater power might be achieved if sample sizes similar to those of GWAS with simpler eGFR decline definitions could be obtained. The relatively fewer GWAS significant loci and the lack of GWAS significant replication for previous loci in the Robinson-Cohen et al. analysis are likely the result of the relatively smaller sample size and lower power.
Furthermore, Robinson-Cohen et al. tested the association of variants in four harmonized ancestry and race/ethnicity groups and presented the results of the multiancestry meta-analysis. This optimal multiancestry design presents the opportunity for identifying genetic risk variants specific to an ancestral group (such as APOL1) and improves the fine-mapping resolution of GWAS loci. However, because of differences in linkage disequilibrium and background allele frequency across ancestries, multiancestry analysis results in a relative reduction in power compared with a similarly sized single-ancestry study.
Whether the genetic architecture of eGFR decline could vary between those with normal kidney function or early disease (eGFR over 60 ml/min per 1.73 m2) compared with those with overt CKD also remains an open question. Robinson-Cohen et al. studied two cohorts with baseline CKD in an attempt to examine this, but similar to adjusting for baseline eGFR, this stratification on baseline CKD can induce collider bias and lead to spurious associations and enlarged effect sizes, a limitation to recognize but that is difficult to overcome.5 Further examination of differences in eGFR decline between those with and without proteinuria, or with and without diabetes, will also be questions for future analyses.
In summary, the work of Robinson-Cohen et al. represents an advancement in modeling the eGFR decline phenotype using linear mixed models in two large multiancestry cohorts. The saga of eGFR GWAS is not over yet.
Acknowledgments
Special thanks to the comments and mentoring provided by Dr. Catherine Clase, Dr. Peter Margetts, and Dr. Guillaume Paré.
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
See related article, “Genome-Wide Association Study of Chronic Kidney Disease Progression,” on pages 1547–1559.
Disclosures
M.B. Lanktree reports Research Funding: grant funding from the Canadian Institutes of Health Research, Hamilton Academic Health Sciences Organization (HAHSO), Hamilton Health Sciences, and the Canadian Kidney Foundation; Honoraria: Bayer, Otsuka, Reata, and Sanofi; Advisory or Leadership Role: M.B. Lanktree received compensation for participating in advisory and consultancy boards with Bayer, Otsuka, Reata, and Sanofi; and Speakers Bureau: Bayer, Otsuka, Reata, and Sanofi.
Funding
M.B. Lanktree acknowledges funding from the Canadian Institutes of Health Research (application 427810) and holds a McMaster University Department of Medicine Early Career Research Award.
Author Contributions
Conceptualization: Matthew B. Lanktree.
Writing – original draft: Matthew B. Lanktree.
Writing – review & editing: Matthew B. Lanktree.
References
- 1. Robinson-Cohen C, Triozzi J, Rowan B, et al. Genome-wide association study of chronic kidney disease progression. J Am Soc Nephrol. 2023;34(9):1547–1559. doi: 10.1681/ASN.0000000000000170
- 2. Liu H, Doke T, Guo D, et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease. Nat Genet. 2022;54(7):950–962. doi: 10.1038/s41588-022-01097-w
- 3. Gorski M, Rasheed H, Teumer A, et al. Genetic loci and prioritization of genes for kidney function decline derived from a meta-analysis of 62 longitudinal genome-wide association studies. Kidney Int. 2022;102(3):624–639. doi: 10.1016/j.kint.2022.05.021
- 4. Gorski M, Jung B, Li Y, et al. Meta-analysis uncovers genome-wide significant variants for rapid kidney function decline. Kidney Int. 2021;99(4):926–939. doi: 10.1016/j.kint.2020.09.030
- 5. Khan A, Kiryluk K. Kidney disease progression and collider bias in GWAS. Kidney Int. 2022;102(3):476–478. doi: 10.1016/j.kint.2022.06.018