Abstract
Genome-wide association studies have identified hundreds of loci associated with kidney-related traits such as glomerular filtration rate, albuminuria, hypertension, electrolyte and metabolite levels. Nearly all of these loci are located in the non-coding region of the genome, therefore their target genes, affected cell types and regulatory mechanisms remain unknown. Genome-scale approaches that attempt to identify the associations between DNA sequence variants and disease risk, changes in gene expression levels (quantified through bulk and single-cell methods), regulation of gene expression and other molecular quantitative trait studies, such as chromatin accessibility, DNA methylation, protein expression, metabolite levels, can be used to deliver robust mechanistic inference for translational exploitation. Understanding the genetic basis of common kidney diseases means having a comprehensive picture of the genes that have a causal role in the etiology and progression of the disease, of the cells, tissues and organs in which these genes act to affect the disease, of the cellular pathways and mechanisms that drive disease and of potential targets for disease prevention, detection and therapy.
Introduction
In the past decade, advances in genetics have provided paradigm-shifting diagnostic and mechanistic insight into the development of kidney disease. Family-based studies, later coupled with exome sequencing, have identified a large number of gene mutations in the coding regions of DNA that explain the development of cystic kidney disease, nephrotic syndrome, hypertension and rare kidney diseases.1,2 These mutations are exceedingly rare in the general population, but they have a substantial impact in affected individuals, and explain the recurrence of disease in specific families. Such genetic disorders usually manifest earlier in life, follow a clear dominant or recessive pattern of inheritance and are known as Mendelian diseases. 3 In contrast to these disorders, most common kidney pathologies that affect the adult population, such as diabetic or hypertensive kidney disease, follow a more complex inheritance pattern.4 These polygenic conditions are controlled by nucleotide variants that are commonly observed in the general population, (that is, the allele frequency or variant incidence is higher than 5%)5 and each variant only has a small effect size on the risk of disease.6
Genome-wide association studies (GWAS), developed in the early 2000s, enabled a systematic interrogation of the association between common sequence variants (also known as single nucleotide polymorphisms (SNPs)) and different disease states and traits in hundreds of thousands of individuals.7 GWAS have since reported hundreds of genetic loci that are associated with kidney function, including loci that correlate with estimated glomerular filtration rate (eGFR), albuminuria and hypertension, as well as levels of metabolites and electrolytes present in blood and urine.8–14 More than 95% of disease-associated variants discovered thus far are in the non-coding region of the genome.15,16
GWAS have been exceedingly successful in identifying mostly non-coding sequence variants associated with traits and diseases, but this genomic information has not translated into improved diagnostics and therapeutics in nephrology, as our understanding of how non-coding variants cause kidney disease remains incomplete. Causal variants, defined as variants responsible for the observed complex trait,17 are often difficult to identify within a genetic locus, as DNA sequences that are in close proximity are inherited together.18 Furthermore, owing to secondary chromatin structure [G], the gene nearest to the significant sequence variant is not always the causal gene.19 The relatively few loci for which mechanistic studies have been performed suggest that most disease-causing variants localize to gene regulatory regions, which are usually found in open chromatin and to which transcription factors can bind and regulate gene expression.20–22 As gene regulation is often cell type-specific, the effect of genotype on gene regulation and expression is also cell type-specific,22 which might explain organ-specific disease development.
In this Review we consider different methods that can be used to prioritize causal variants, causal genes and regulatory mechanisms for kidney disease, including fine mapping, epigenome annotation and analysis of molQTLs, as well as the use of single-cell sequencing analyses to identify target cell types. Finally, we will discuss studies that analyzed kidney disease risk loci and identified causal genes, cell types and underlying disease mechanisms.
Causal variant prioritization
A crucial problem in GWAS analysis is that each genetic locus of interest includes a relatively large number of genetic variants with genome-wide significance. Humans are a relatively young species, which means that only a limited number of recombination events have occurred in the genome. Nucleotide variants observed at specific genomic regions are therefore highly correlated17. GWAS take advantage of this local genetic correlation and usually directly genotype only a limited number of SNPs; the remaining SNPs are inferred by imputation [G] based on data from prior whole-genome sequencing studies23.
Although this local genetic correlation was crucial for the feasibility and early success of GWAS, as not all variants need to be measured, it also complicates the identification of causal variants because hundreds or even thousands of variants in a specific region usually correlate significantly with the observed traits.24
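The local correlation described above is usually summarized as linkage disequilibrium (r²) between pairs of variants. The following minimal sketch computes r² as the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of the alternative allele), which is a common approximation when phased haplotypes are not available. The function name and the toy genotype vectors are illustrative assumptions and do not come from any study discussed here.

```python
from statistics import mean


def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two genotype dosage vectors.

    Each vector holds one dosage (0, 1 or 2) per individual, in the same order.
    This is an approximation of LD computed from unphased genotypes.
    """
    if len(geno_a) != len(geno_b):
        raise ValueError("genotype vectors must have the same length")
    mean_a, mean_b = mean(geno_a), mean(geno_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return cov * cov / (var_a * var_b)


# Toy dosages for eight individuals (hypothetical data).
snp1 = [0, 1, 2, 1, 0, 2, 1, 0]
snp2 = [0, 1, 2, 1, 0, 2, 1, 1]  # differs from snp1 in one individual
snp3 = [2, 0, 1, 1, 0, 1, 2, 0]  # largely unrelated to snp1

print(round(ld_r2(snp1, snp2), 3), round(ld_r2(snp1, snp3), 3))
```

Variants such as snp1 and snp2 in this toy example would produce nearly identical association signals, which is why association statistics alone rarely single out the causal variant.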
Owing to these limitations, causal variant identification is exceedingly difficult. Identification of causal variants usually involves additional statistical methods such as fine mapping, annotation of regulatory regions and experimental validation.25 So far only a few disease-causing variants have been identified.
Fine mapping of GWAS variants
Several fine mapping [G] approaches have been developed to refine lists of disease-associated SNPs identified through GWAS.17,26,27 Genetic variants that are physically close to each other are usually inherited together, which increases the number of candidate variants at each locus; however, increasing the sample size by direct measurement of a larger number of individuals, or by imputation, can help to identify variants that are more likely to be disease-causing.28 Analyzing different populations is also important, because alleles of SNPs encoded in close proximity within a chromosome are usually co-inherited, and genetically distinct populations therefore tend to have divergent haplotype [G] structures and frequencies.28 The analysis of large multi-ethnic cohorts and the use of whole-genome sequencing complement these fine-mapping approaches, and should offer powerful new insight into the identification of causal disease variants. In addition to larger sample sizes and more diverse populations, better disease phenotype mapping, such as deep phenotyping [G] and a focus on endophenotypes, appears to be important. Endophenotypes are heritable intermediate phenotypes associated with disease. For example, IgA1 glycosylation defects are inherited and constitute a heritable risk factor for IgA-associated kidney disease.29
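One widely used family of statistical fine-mapping approaches converts GWAS summary statistics into approximate Bayes factors and then into posterior probabilities that each variant is causal. The sketch below illustrates this idea under strong simplifying assumptions that are ours and not those of any specific study cited here: exactly one causal variant per locus, equal prior probability for every variant, and a normal prior on the effect size with a standard deviation (prior_sd) of 0.15. The SNP names and summary statistics are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor (association versus no association).

    beta and se are the GWAS effect estimate and its standard error;
    prior_sd is an assumed prior standard deviation of the true effect.
    """
    v = se * se
    w = prior_sd * prior_sd
    r = w / (v + w)
    z = beta / se
    return 0.5 * math.log(1.0 - r) + 0.5 * z * z * r


def posterior_probabilities(log_abfs):
    """Posterior probability per variant, assuming one causal variant and flat priors."""
    largest = max(log_abfs)
    weights = [math.exp(x - largest) for x in log_abfs]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(snp_ids, probs, coverage=0.95):
    """Smallest set of variants whose cumulative posterior probability reaches coverage."""
    ranked = sorted(zip(snp_ids, probs), key=lambda pair: pair[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical locus: (effect estimate, standard error) per variant.
summary_stats = {
    "snpA": (0.050, 0.010),
    "snpB": (0.048, 0.010),
    "snpC": (0.020, 0.010),
    "snpD": (0.010, 0.012),
}
ids = list(summary_stats)
pips = posterior_probabilities([log_abf(b, s) for b, s in summary_stats.values()])
print(credible_set(ids, pips))
```

In this toy locus the two variants with similar association statistics both remain in the credible set, which mirrors the situation in which statistical fine mapping narrows, but does not fully resolve, the list of candidate variants.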
Epigenome annotation
In our current working model, a causal variant(s) results in quantitative changes in gene expression. Therefore, the methods most commonly used for causal variant prioritization include the annotation of regulatory regions to identify SNPs with regulatory function. Epigenomic annotation, including the identification of regions of open chromatin and histone tail modifications, as well as cytosine methylation maps, can narrow down areas in the genome that are associated with the disease state of interest.
The DNA in the nucleus is densely packaged to form chromatin, which is a DNA–protein complex. However, genomic regions that are actively involved in gene regulation must be accessible to transcription factors and are therefore located in areas of open chromatin. For a genetic variant to influence gene expression, it must be located in a gene regulatory region.15 Several methods have been developed to identify areas of open chromatin (FIG. 1a). Careful titration of DNase I, coupled with next-generation sequencing (DNase-seq), has provided crucial insight into cell-specific and tissue-specific gene regulatory regions, as open chromatin regions are more sensitive to DNase I digestion than areas that are tightly packed.30 This DNase I hypersensitivity (DHS) analysis is a highly sensitive method to identify open chromatin, but is difficult to perform. Transposases preferentially insert into areas of open chromatin, and the assay for transposase-accessible chromatin using sequencing (ATAC-seq) can therefore also be used for genome-wide annotation of open chromatin (FIG. 1a).31 Although ATAC-seq might be slightly less comprehensive than DHS analysis, it has gained popularity as a very robust and reproducible method that is fairly easy to perform; currently, ATAC-seq is probably more widely used than DHS analysis.
Figure 1. Epigenome annotation methods for causal variant prioritization.
Genetic effects on the epigenome are an important component of the genetic risk of a disease. a) DNase I hypersensitivity (DHS) analysis144 identifies DNA that is not tightly wound around histones, that is, DNA in an area of open chromatin. These DNA elements are more sensitive to enzymatic digestion by DNase I.145 In contrast, DNA that is tightly wrapped in nucleosomes is more resistant to digestion.146 The DNase-digested ends are then enriched and sequenced.146
b) ATAC-seq uses a hyperactive Tn5 transposase, an enzyme that cuts accessible DNA and inserts sequencing adapters into open regions of the genome during a process called tagmentation.31 The tagged DNA fragments are then purified, PCR-amplified and sequenced using next-generation sequencing.
c) Modifications to histone proteins, such as methylation, acetylation and trimethylation, can be detected using chromatin immunoprecipitation (ChIP)-based methods.35 These modifications can then be used as indicators of the chromatin state, which is associated with gene activation or repression.35
d) Analysis of chromatin conformation. Chromatin conformation experiments examine which pairs of DNA loci are in contact with each other, genome wide, and allow long-distance contacts to be elucidated. The process involves DNA digestion, biotin labelling and formation of ligation products, which are then sequenced.147,148
These methods of epigenome annotation identify cis-regulatory elements (CREs), which can contain causal variants and associate with gene expression.
Nucleosomes are structural units of chromatin that comprise DNA coiled around histones, which are basic proteins. Gene regulatory regions have specific histone tail modifications that can be used to locate them32. Promoter [G], enhancer [G] and insulator [G] regions can all be identified by specific histone tail modifications33,34. The most commonly used method for the systematic characterization of histone tail modifications is chromatin immunoprecipitation followed by sequencing (ChIP-seq), which uses an antibody to enrich for histone tails and analyzes DNA sequences associated with specific histone tail modifications (FIG. 1b).35 Specific combinations of histone tail markers can identify, for example, regions of repressed transcription (trimethylated histone 3 lysine 9 (H3K9me3) and H3K27me3), transcriptional start sites of actively transcribed genes (H3K4me3 and acetylated H3K27 (H3K27ac)), areas of active transcription (H3K36me3) and active enhancers (monomethylated H3K4 (H3K4me1) and H3K27ac).35
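The mark combinations listed above can be read as a simple lookup from the set of histone modifications detected at a region to a putative regulatory state. The sketch below encodes exactly those combinations as fixed rules. It is a deliberately simplified illustration: the function name, the rule order and the "unclassified" fallback are our assumptions, and real chromatin-state annotation typically relies on probabilistic segmentation of quantitative ChIP-seq signal rather than on presence or absence of peaks.

```python
def classify_region(marks):
    """Assign a putative regulatory state from the histone marks present at a region.

    marks: iterable of mark names, for example {"H3K4me1", "H3K27ac"}.
    Rules follow the combinations described in the text; order matters when
    several rules match, and that ordering is an assumption of this sketch.
    """
    marks = set(marks)
    if {"H3K4me3", "H3K27ac"} <= marks:
        return "active transcriptional start site"
    if {"H3K4me1", "H3K27ac"} <= marks:
        return "active enhancer"
    if "H3K36me3" in marks:
        return "actively transcribed region"
    if marks & {"H3K9me3", "H3K27me3"}:
        return "repressed region"
    return "unclassified"


# Hypothetical regions overlapping GWAS variants.
regions = {
    "region_1": {"H3K4me1", "H3K27ac"},
    "region_2": {"H3K27me3"},
    "region_3": {"H3K4me3", "H3K27ac"},
}
for name, marks in regions.items():
    print(name, classify_region(marks))
```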
Methylation of the 5th carbon of cytosine, which generates 5-methylcytosine (5mC), is a well-studied epigenetic mark36. The main role of cytosine methylation is transcriptional repression, mostly for transposable elements [G]. Most cytosines are therefore highly methylated in the genome but active gene regulatory regions are usually characterized by low cytosine methylation.37 Analysis of genome-wide cytosine methylation therefore enables the identification of gene regulatory regions such as active promoters and enhancers38,37. DNA methylation can be analyzed in archived human tissue samples and can thus be more readily applied to clinical samples than histone-ChIP-Seq or ATAC-seq analysis39.
Publicly available resources that allow analysis of epigenomic data include the Encyclopedia of DNA Elements (ENCODE) and the International Human Epigenome Consortium (IHEC). These projects have generated a substantial number of tissue-specific and cell type-specific epigenomic maps, which are important reference sources.40,41 Unfortunately, these datasets do not include a large number of human kidney samples. Epigenetic studies have highlighted the substantial enrichment of GWAS signals in regulatory regions such as enhancers in disease-relevant tissues15.
Analysis of transcription factor binding
Our working hypothesis is that disease-causing variants alter the binding strength of transcription factors, which leads to quantitative changes in gene expression. Classic analyses of transcription factor binding include electrophoretic mobility shift analysis (EMSA), which is used to study protein–DNA interactions42. EMSA can determine if a protein or mixture of proteins is capable of binding to a given sequence of DNA or RNA, and can sometimes indicate if multiple protein molecules are involved in the binding complex. Transcription factor ChIP (TF-ChIP) experiments can analyze transcription factor binding at a genome-wide level43 and are important for causal variant identification because they can directly quantify differences in transcription factor binding that are determined by the underlying sequence variants43. This approach can be used to generate genome-wide maps of transcription factor binding regions, also known as cis-regulatory elements [G].44 Importantly, TF-ChIP requires an antibody for detection and this antibody must be both highly sensitive and specific to ensure that the antibody–transcription factor complex is enriched in the genomic areas where transcription factor binding occurs without significant confounding due to nonspecific antibody binding45. Several computational methods have also been developed that predict transcription factor binding based on DNA nucleotide sequences, such as the SNP effect matrix pipeline (SEMpl)46, which estimates the affinity of transcription factor binding based on the combination of ChIP-seq data and DNA sequences mapped to areas of open chromatin. These novel methods can potentially predict alterations in transcriptional outcomes based on the identified nucleotide variations,46 but they require experimental validation, which can be performed through the CRISPR-Cas9 mutagenesis system.47
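A simple way to illustrate sequence-based prediction of altered transcription factor binding is to score the reference and alternative alleles against a position weight matrix and compare the two scores. The sketch below does this for a hypothetical four-base motif. It is not SEMpl, which additionally draws on ChIP-seq and open chromatin data; the count matrix, pseudocount, uniform background frequency and sequences are all illustrative assumptions.

```python
import math

# Hypothetical count matrix for a 4-bp motif with consensus ATGC
# (one dict of base counts per motif position).
COUNTS = [
    {"A": 18, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 1, "G": 1, "T": 18},
    {"A": 1, "C": 0, "G": 19, "T": 0},
    {"A": 0, "C": 17, "G": 2, "T": 1},
]


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert base counts to log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def motif_score(matrix, sequence):
    """Sum of per-position log-odds scores for a sequence of motif length."""
    if len(sequence) != len(matrix):
        raise ValueError("sequence length must equal motif length")
    return sum(column[base] for column, base in zip(matrix, sequence.upper()))


def allele_effect(matrix, ref_sequence, alt_sequence):
    """Score difference (alternative minus reference); negative suggests weaker binding."""
    return motif_score(matrix, alt_sequence) - motif_score(matrix, ref_sequence)


PWM = log_odds_matrix(COUNTS)
# A G>T change at the third motif position (hypothetical variant).
print(round(allele_effect(PWM, "ATGC", "ATTC"), 2))
```

A negative score difference only suggests reduced predicted affinity; as noted above, such predictions still require experimental validation.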
Chromatin conformation
Another challenge in the analysis of gene regulatory regions is the ability to link regulatory elements in the genome to their target genes. This analysis is particularly problematic because causal variants are enriched in enhancer regions. Promoters are always located at the 5’ end of their target genes but enhancers can be located in intronic regions, as well as upstream or downstream of the target gene.48 Enhancer regions can be as far as 250 kilobases (kb) away from their target gene and, due to chromatin looping, the gene nearest to the enhancer is not always the target or causal gene.49
Chromatin conformation capture experiments can be used to measure these loop interactions (FIG. 1c).50 These methods identify 3D interactions between genomic loci that cannot be detected when the genome is studied in a single dimension, and can uncover biologically relevant promoter–enhancer interactions51. One of the most comprehensive methods of chromatin conformation capture is Hi-C, which uses high-throughput sequencing to characterize all 3D genomic interactions.52
Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) is another technique used to determine long-range 3D chromatin interactions. ChIA-PET incorporates ChIP-based enrichment for histone tail modifications and can therefore also provide information on the type of regulatory element (for example, promoter or enhancer) and not just the interaction53,54 (FIG. 1c). Understanding long-range genomic interactions, especially enhancer–promoter units that regulate gene expression,55 is important for linking GWAS variants to their target genes.
Experimental validation of regulatory elements
Genome-wide mapping and sequencing approaches such as DNase-seq and ATAC-seq for open chromatin, ChIP-seq for annotation of histone tail modifications and Hi-C for 3D chromatin interactions are excellent tools for identifying regulatory regions, but the effect of the genetic variant on regulatory activity is still difficult to predict. Experimental validation of regulatory activity detected by high-throughput techniques is therefore crucial. Self-transcribing active regulatory region sequencing (STARR-seq) is a popular massively parallel reporter assay [G] (MPRA) method56 that enables genome-wide quantitative measurement of enhancer activity.55 In STARR-seq, following fragmentation of genomic DNA, potential regulatory regions are placed downstream of a minimal promoter in a reporter construct.57 Strong regulatory regions can self-transcribe and their abundance therefore increases, which is measured in a high-throughput manner and allows the simultaneous testing of the activity of hundreds of enhancer regions to identify active enhancers.57–59
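Because active fragments in a self-transcribing reporter assay become over-represented among transcripts relative to the input library, a natural read-out is the ratio of RNA to DNA abundance per fragment. The sketch below computes a log2 ratio of library-size-normalized counts. It is a generic normalization written for illustration, not the analysis pipeline of the cited STARR-seq studies; the fragment names, counts and pseudocount are assumptions.

```python
import math


def enhancer_activity(rna_counts, dna_counts, pseudocount=1.0):
    """log2(RNA/DNA) per fragment after scaling each library to counts per million.

    rna_counts, dna_counts: dicts mapping fragment id to read count.
    Fragments absent from the RNA library are treated as having zero reads.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    activity = {}
    for fragment, dna in dna_counts.items():
        rna = rna_counts.get(fragment, 0)
        rna_cpm = (rna + pseudocount) / rna_total * 1e6
        dna_cpm = (dna + pseudocount) / dna_total * 1e6
        activity[fragment] = math.log2(rna_cpm / dna_cpm)
    return activity


# Hypothetical counts: equal input representation, unequal transcript output.
dna = {"frag1": 100, "frag2": 100, "frag3": 100}
rna = {"frag1": 400, "frag2": 100, "frag3": 25}
scores = enhancer_activity(rna, dna)
print({name: round(value, 2) for name, value in scores.items()})
```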
The best and most direct approach for causal variant testing is the use of CRISPR-Cas9-mediated single nucleotide genome editing followed by read-outs of gene expression (FIG. 2).60 Although CRISPR-Cas9-based methods require substantial optimization, as they can currently only test a few variants at a time and can cause off-target effects,61 they can provide the most direct evidence for the functional importance of a specific SNP. In the past few years, several groups have developed large-scale editing methods with single-nucleotide resolution that enable the screening of a large number of SNPs at a specific genomic region, and evaluation of their role in the regulation of gene expression in a cell type-specific manner.62 A limitation of such analyses is that they use cultured cells, which often lack expression of the target gene regulated by the SNP of interest. Tissue-specific organoids can also be used for such genome editing studies but organoid cells remain poorly differentiated (FIG. 2),63,64 which can make the extrapolation of results to fully differentiated human tissues challenging. Further complications arise from the likely presence of more than one causal variant in any given region, as well as the potential existence of causal variants that are unique to specific ethnicities and cannot be generalised.65 Overall, experimental validation of computationally-prioritized causal variants remains crucial. However, these studies are difficult to perform and have low throughput; prioritization efforts are therefore essential.
Figure 2. Functional validation of causal variants.
Several methods can be used to validate causal variants and elucidate the underlying mechanisms that link the sequence variant to disease risk, an approach known as variant-to-function analysis. Modelling risk variants using the CRISPR-Cas9 system in single cells, in 3D structures (organoids) or in animal models (zebrafish and mice) can provide information on the effect of a sequence variant on gene expression levels.
Analyses of healthy tissue, which can be microdissected into glomeruli and tubules to generate kidney tissue compartment-specific data, or analysed by single-cell technologies, can provide information on homeostatic gene regulation, including information on the cell types that express genes whose expression associates with sequence variants. Further validation of the potential mechanisms underlying the modulation of disease risk can be obtained by analysing samples obtained from patients with kidney disease.
Target gene identification
For biologists, the most important element of investigating how complex traits contribute to disease is the identification of the target gene. Traditionally, GWAS loci were annotated by gene names based on the proximity of the signal to those genes, or according to subjective author preference. Although the precision of the heuristic approach for target gene identification is still fiercely debated, for several loci, the target gene of the causal variant is clearly located at a distant location. 66 In nephrology, the most widely known example is the chromosome 22 locus, which was initially thought to be associated with MYH9, although it later became clear that APOL1 was the correct target gene.67 This example highlights the need to develop novel datasets that can be used to determine the effect of genetic variation on gene expression — such datasets should ideally be cell type-specific. Although small cell type-specific datasets have been developed for isolated blood cells68, at present most available datasets are based on bulk tissue. Cellular heterogeneity and changes in cell proportions remain an important confounder in such bulk analyses. 69
Molecular quantitative trait loci
Molecular quantitative trait loci [G] (molQTLs) are genetic variants that are associated with molecular traits and are classed according to the quantitative trait being analysed. Examples of molQTLs include expression QTLs (eQTLs), splicing QTLs (sQTLs), chromatin accessibility QTLs (caQTLs), methylation QTLs (meQTLs), protein QTLs (pQTLs) and metabolic QTLs (mQTLs).70
Expression QTLs
eQTL analysis focuses on the statistical relationship between SNPs and gene expression71 and involves high-throughput genome-wide measurement of gene expression and its associated genetic variants, which is usually performed through the use of genotyping arrays.71 Most eQTL analyses are limited to cis-eQTLs, meaning that the genetic variants are within 1 megabase (Mb) of the expressed genes.72 Cis-eQTLs directly alter the expression of a gene through local promoter–enhancer interactions, whereas trans-eQTLs affect the expression of a regulator, usually a transcription factor, which alters the expression of a large number of target genes by directly binding to their local promoter region.73,74 The focus on cis-eQTLs reflects the fact that current eQTL datasets are relatively small and underpowered for trans-eQTL analysis, in which the variant is further away from the target gene and the mode of regulation is likely indirect.73 Analysis of trans-eQTLs might be highly informative for target gene identification and to highlight critical disease-causing pathways, as it can reveal which genes are co-regulated by the same disease-associated transcription factor, and are thus likely to be involved in the pathogenesis of the disease75. At present, eQTL datasets that are large enough for trans-eQTL identification are only available for blood samples.76,77
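At its core, a cis-eQTL test for one SNP–gene pair is a regression of expression on genotype dosage across individuals. The sketch below fits that regression by ordinary least squares and reports the slope (the per-allele effect), its standard error and the t statistic. It omits the covariates (for example, ancestry components, sex or batch) and the multiple-testing correction that real eQTL analyses require, and the dosages and expression values are hypothetical.

```python
import math


def eqtl_regression(dosages, expression):
    """Ordinary least squares of expression on genotype dosage (0, 1, 2).

    Returns (slope, standard_error, t_statistic) for the per-allele effect.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched vectors with at least three individuals")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(dosages, expression)
    )
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, standard_error, slope / standard_error


# Hypothetical data: nine individuals, normalized expression of one gene.
dosages = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expression = [10.1, 9.8, 10.3, 8.9, 9.2, 9.0, 8.1, 7.8, 8.0]
slope, se, t_stat = eqtl_regression(dosages, expression)
print(round(slope, 3), round(se, 3), round(t_stat, 1))
```

In this toy example each additional copy of the alternative allele is associated with lower expression, which is the pattern a cis-eQTL analysis is designed to detect.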
The Genotype-Tissue Expression (GTEx) consortium has generated eQTL datasets for ~50 tissue types from ~1,000 samples.78 The systematic multi-tissue cis-eQTL analysis in GTEx showed that ~70% of cis-eQTLs are observed in multiple cell and tissue types (that is, they are not tissue-specific), whereas some cis-eQTLs show tissue or cell-type specificity.79 By contrast, trans-eQTLs might show greater cell-type specificity.79 However, due to the low number of genes and variants that have been validated experimentally, the true positives or negatives in these analyses remain uncertain and further validation is required.
The human genome contains two copies (alleles) of each autosomal gene, and each copy is expected to contribute 50% of the total transcript level. However, allelic imbalance, whereby two alleles of a gene are expressed at different levels instead of the expected 50:50, can also be observed, and the analysis of this allele-specific expression (ASE) can substantially boost the power of eQTL analyses80. Allelic imbalance is caused by sequence variations that lead to differences in the strength of transcription factor binding between the two alleles81,82. Compared with classic eQTL analysis, which is based on an association analysis between different individuals, ASE analysis investigates the genotype effect on gene expression within the same sample, when the individual is heterozygous for the specific allele.80 ASE analysis focuses on two alleles from the same individual, which can reciprocally serve as internal controls, thereby minimizing the effects of potential confounding factors and environmental biases.80 Although ASE analysis is associated with multiple technical difficulties, such as bias in the mapping of RNA sequencing (RNA-seq) reads towards the reference allele83, its ability to quantify the effect of genotype on gene expression within the same sample is a key advantage.80
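For a single heterozygous site, allelic imbalance can be tested by asking whether the RNA-seq read counts for the two alleles are compatible with the expected 50:50 ratio. The sketch below implements an exact two-sided binomial test using only the standard library. The expected_ref_fraction parameter is our assumption: it allows the null ratio to be shifted away from 0.5 as a crude way of accounting for reference mapping bias. Real ASE methods typically also model overdispersion across reads, which this sketch ignores, and the read counts are hypothetical.

```python
from math import comb


def ase_binomial_p(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Sums the probabilities of all outcomes that are no more likely than the
    observed reference read count under the null allelic ratio. Intended for
    modest read depths (up to a few hundred reads) because it enumerates
    every possible outcome.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    p = expected_ref_fraction
    probabilities = [
        comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(n + 1)
    ]
    observed = probabilities[ref_reads]
    # Small relative tolerance so that ties are counted despite rounding error.
    p_value = sum(q for q in probabilities if q <= observed * (1.0 + 1e-7))
    return min(1.0, p_value)


# Hypothetical read counts at two heterozygous sites.
print(ase_binomial_p(52, 48))   # close to the expected 50:50 ratio
print(ase_binomial_p(80, 20))   # strong imbalance towards the reference allele
```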
Splicing QTLs
Another critical consideration for the identification of target transcripts is the analysis of alternative splicing [G]. Splicing is an important factor in the diversification of gene expression and each gene has, on average, at least 3–4 isoforms.84 Although early reports did not suggest a key role of alternative splicing in disease development, subsequent datasets indicate that ~25% of genotype-mediated regulation of gene expression is mediated through effects on gene splicing.85 sQTL analyses use data on transcript isoforms and can identify genotype-driven changes in isoform expression. An sQTL dataset generated for the prefrontal cortex region of the human brain indicated that schizophrenia GWAS variants were associated with changes in the abundance of specific splice isoforms.86 This finding supports the concept that alternative splicing might be a mechanism whereby genetic risk variants lead to phenotype development.
Chromatin accessibility QTLs
caQTLs can be identified when genotype assessment and quantitative measurement of open chromatin are performed in the same sample87. The integration of caQTL data with GWAS can be extremely useful for causal variant prioritization.87 The GWAS identifies the SNPs that are associated with disease risk, whereas the caQTL analysis highlights which of those SNPs are likely to be non-coding regulators of gene expression, by identifying SNPs that are not only located in areas of open chromatin but whose genotype also correlates with quantitative changes in chromatin accessibility. These SNPs are therefore likely causal variants that drive the increased or decreased risk of disease. In 2018, researchers generated a caQTL dataset for unstimulated and stimulated human CD4 T cells.87 This study reported that genotype significantly affected chromatin accessibility and that these areas of open chromatin overlapped with previously reported autoimmune disease GWAS risk variants, which could be mapped and prioritized based on the caQTL data.87
Computational integration of GWAS and molQTL datasets
Given that disease-causing variants are mostly unknown, analysing single SNP–eQTL or SNP–epigenome datasets might lead to misleading conclusions. Several methods have been developed to integrate genotype, disease risk, gene expression and epigenome datasets, and the development of novel computational methods for high-dimensional data integration remains an active area of research.
Integration of GWAS and QTL datasets is the current gold standard for the prioritization of target genes (FIG. 3). Although QTL–GWAS methods do not perform causal inference, they are useful in prioritizing GWAS genes for experimental follow-up. The goal of these methods is to estimate the properties of the genetic relationship between a molecular trait and the GWAS trait (FIG. 3a). Methods such as transcriptome-wide association studies (TWAS)88, summary data-based Mendelian randomization [G] (SMR)89, and PrediXcan and MetaXcan90, for example, test the strength of the genetic correlation between gene expression and the GWAS trait at specific genomic regions.
Figure 3. Quantitative trait analysis for causal gene prioritization.
Expression quantitative trait analysis determines the association between genetic variations and gene expression in the same tissue samples.
a| In this example, GWAS identified an association between the sequence variant T/T and chronic kidney disease. This variant is located in an open area of chromatin within a region in which the lysine (K) at position 27 of histone 3 (H3) is acetylated (H3K27ac), which marks an active enhancer and therefore suggests active gene transcription. The T/T sequence variant affects binding of the transcription factor to the enhancer and results in a decrease in the expression of DAB adaptor protein 2 (DAB2) compared with the T/A and A/A variants. In turn, this decrease in DAB2 expression correlates with a decrease in estimated glomerular filtration rate.
b| Genetic correlation can result from direct causality, in which a variant has a direct effect on a quantitative trait, but it can also result from pleiotropy, in which a variant affects multiple traits, or linkage, which refers to the existence of two variants that are associated by linkage disequilibrium (that is, a non-random association of alleles at different loci) and have independent effects on different phenotypes. These different underlying causes of genetic correlation complicate the identification of causal variants.
c| Integration of GWAS–molecular QTLs (molQTLs). Overlap between a GWAS variant, an eQTL and a chromatin accessibility QTL (caQTL) indicates that the sequence variant not only correlates with a disease risk trait (for example, eGFR) but also with gene expression and chromatin accessibility. This overlap potentially indicates that the sequence variant is located in a gene regulatory region (that is, an area of open chromatin) where it can modulate expression of a causal gene involved in disease risk.
TWAS use eQTL data to ‘impute’ the total expression of a given gene across a large cohort of genotyped individuals, followed by a test of association with disease risk. In TWAS, direct causation is not easily distinguished from genetic correlation due to pleiotropy, as a single SNP might affect multiple molecular phenotypes, including the expression of more than one gene, or differentially affect the expression of the same gene in different tissues, as well as regulate multiple processes, including gene expression and DNA methylation (FIG. 3b). 70 Many molecular processes, such as gene expression, DNA methylation and histone variation, are controlled by the same cis-regulatory elements. 91,92 Therefore, TWAS serves as a first step in the prioritization of putative target genes but experimental validation is still required to establish causality88. Discriminating these multiple potential phenotypes can be facilitated by the integration of orthogonal datasets [G], which is a process known as triangulation. 93
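The two TWAS steps described above, imputation of expression from genotype followed by a test of association with the trait, can be sketched for a single gene using individual-level data. In the example below, the per-SNP weights stand in for an expression prediction model trained on an eQTL reference panel. The SNP names, weights, genotypes and eGFR values are hypothetical, and the simple correlation test omits the covariates and the summary-statistic formulations used by published TWAS software.

```python
import math


def impute_expression(dosages, weights):
    """Genetically predicted expression for one individual.

    dosages: dict mapping SNP id to dosage (0, 1, 2).
    weights: dict mapping SNP id to its eQTL-derived weight.
    """
    return sum(weight * dosages[snp] for snp, weight in weights.items())


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def twas_association(genotypes, weights, trait):
    """Return predicted expression, its correlation with the trait, and a t statistic."""
    predicted = [impute_expression(person, weights) for person in genotypes]
    r = pearson_r(predicted, trait)
    n = len(trait)
    if abs(r) >= 1.0:
        t_stat = math.copysign(math.inf, r)
    else:
        t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    return predicted, r, t_stat


# Hypothetical weights for one gene and genotypes for six individuals.
weights = {"rs_a": -0.6, "rs_b": 0.2}
genotypes = [
    {"rs_a": 0, "rs_b": 0}, {"rs_a": 1, "rs_b": 0}, {"rs_a": 2, "rs_b": 1},
    {"rs_a": 0, "rs_b": 2}, {"rs_a": 1, "rs_b": 1}, {"rs_a": 2, "rs_b": 0},
]
egfr = [92, 85, 78, 95, 88, 74]
predicted, r, t_stat = twas_association(genotypes, weights, egfr)
print(round(r, 3), round(t_stat, 1))
```

A strong association of this kind prioritizes the gene but, for the reasons given above, does not by itself distinguish causation from pleiotropy or linkage.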
Other methods, such as Coloc and Moloc, estimate the posterior probability [G] of colocalization, which is defined as the presence of at least one shared causal variant between the eQTL datasets and GWAS (FIG. 3c).94 These methods require prior probabilities to be defined at a SNP level.94 The output of colocalization consists of the posterior probabilities of variants that affect both traits, such as kidney disease (GWAS) and gene expression (eQTL), or of SNPs that independently associate with each trait. The posterior probability values have an upper limit of 1.0 and do not give an indication of the strength of the correlation between the SNPs and the traits. However, TWAS measures the strength of the association(s)88 and can therefore complement a colocalization analysis.
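The role of the SNP-level priors and the meaning of the output can be illustrated with the five-hypothesis framework used in colocalization analysis of two traits: no association (H0), association with only the first trait (H1), only the second trait (H2), both traits through distinct variants (H3), or both traits through a shared variant (H4). The sketch below combines per-SNP approximate Bayes factors for the two traits into posterior probabilities for these hypotheses, assuming at most one causal variant per trait in the region. The default priors follow commonly used values, and the Bayes factors are hypothetical.

```python
def coloc_posteriors(abf_gwas, abf_eqtl, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0 to H4 for one region.

    abf_gwas, abf_eqtl: per-SNP approximate Bayes factors (same SNP order).
    p1, p2: prior probability that a SNP is causal for trait 1 or trait 2 only.
    p12: prior probability that a SNP is causal for both traits.
    """
    if len(abf_gwas) != len(abf_eqtl):
        raise ValueError("both traits must be scored on the same SNPs")
    sum1 = sum(abf_gwas)
    sum2 = sum(abf_eqtl)
    sum12 = sum(a * b for a, b in zip(abf_gwas, abf_eqtl))
    support = {
        "H0": 1.0,
        "H1": p1 * sum1,
        "H2": p2 * sum2,
        "H3": p1 * p2 * (sum1 * sum2 - sum12),  # two distinct causal SNPs
        "H4": p12 * sum12,                      # one shared causal SNP
    }
    total = sum(support.values())
    return {name: value / total for name, value in support.items()}


# Hypothetical region with four SNPs.
gwas = [1.0, 5e4, 2.0, 1.0]
eqtl_same_peak = [1.0, 8e4, 1.0, 3.0]    # strongest signal at the same SNP
eqtl_other_peak = [8e4, 1.0, 1.0, 3.0]   # strongest signal at a different SNP

print(round(coloc_posteriors(gwas, eqtl_same_peak)["H4"], 3))
print(round(coloc_posteriors(gwas, eqtl_other_peak)["H3"], 3))
```

When the two signals peak at the same SNP, most of the posterior mass falls on H4; when they peak at different SNPs, it shifts towards H3, even though both traits are strongly associated with the region in each case.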
SMR is another tool that assesses causality of an association between a risk factor and a clinically relevant outcome.95 In contrast to observational studies, with SMR, genetic variants are used as an instrumental variable [G] and the effect of these variants on a trait, disease or gene expression is investigated.96–99 The use of genetic variants helps to circumvent problems with confounding or reverse causation.96 The variants must fit three assumptions: they must be relevant (that is, they must associate with the risk factor of interest); they must not be associated with unmeasured confounders; and they must affect the outcome only through their effect on the risk factor of interest (that is, they must be exclusive).96 The first assumption can be tested directly, whereas the other two can only be assessed through sensitivity analyses.100
Various modifications have been made to SMR to facilitate the analysis of GWAS data. Two-sample Mendelian randomization only requires GWAS summary statistics, which is useful if the exposure genetic variant or SNP and the outcome SNP are assessed in different datasets. 101 Genetic pleiotropy (FIG. 3c) can be addressed by using multivariate SMR analysis, which reduces bias by simultaneously estimating the effect of all exposures on the outcome.102,103 A 2019 study reported the development of transcriptome-wide Mendelian randomization (TWMR), which is a multivariate Mendelian randomization method that uses gene expression levels as the exposure.92 This method was applied to summary data from a publicly available GWAS dataset combined with an eQTL meta-analysis performed in blood to investigate causal associations between gene expression and 43 complex traits.92 The method detected 3,913 causal gene–trait associations, 36% of which did not have a nearby genome-wide significant SNP and would therefore have been more difficult to detect.92 These findings suggest that the TWMR method might help to integrate GWAS and eQTL data, and detect associations that would otherwise have been missed owing to the lack of power in GWAS.92
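The way two-sample Mendelian randomization combines summary statistics from separate datasets can be illustrated with a fixed-effect inverse-variance-weighted (IVW) estimate. This is a minimal sketch under the assumption that the SNPs are independent and are valid instruments; the effect sizes below are invented, and the function name `ivw_estimate` is hypothetical rather than taken from any package.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-SNP summary statistics.

    beta_exposure: SNP effects on the exposure (from the exposure GWAS or eQTL study).
    beta_outcome:  effects of the same SNPs on the outcome (from a second dataset).
    se_outcome:    standard errors of the outcome effects.
    """
    numerator = sum(
        bx * by / s ** 2
        for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / s ** 2 for bx, s in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error

# Toy summary statistics for three independent SNPs (illustrative values only).
snp_on_exposure = [0.10, 0.20, 0.15]
snp_on_outcome = [0.05, 0.10, 0.075]
outcome_se = [0.01, 0.02, 0.015]

effect, se = ivw_estimate(snp_on_exposure, snp_on_outcome, outcome_se)
print(f"IVW causal estimate: {effect:.3f} (SE {se:.3f})")
```

In this toy example every SNP implies the same ratio of outcome effect to exposure effect, so the pooled estimate equals that ratio; with real data, heterogeneity between the per-SNP ratios is one signal of possible pleiotropy.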
SMR has been used, for example, to examine the relationship between serum urate levels and CKD104. Human observational data suggested that serum uric acid levels might have a causal role in the development of CKD. Mendelian randomization studies used SNPs that are known to influence serum urate levels (for example, rs734533) as the instrumental variable105. Whereas the analysis of a single SNP associated with urate levels showed an association with CKD progression in a small cohort study 104, a similar analysis using a collection of SNPs in a cohort with 110,347 participants107 found no causal association between serum uric acid levels and eGFR or CKD. 20 SMR can also identify epigenetic changes, such as changes in methylation108 and chromatin accessibility109, that mediate the effect of a genotype on gene expression.
Defining disease-causing cell types
The identification of causal cell types is strongly linked to the discovery of causal variants and causal genes. As previously discussed, causal variants have a disease-specific enrichment pattern 110, which is likely due to the fact that the variant is only active when localized to open chromatin regions and that patterns of chromatin accessibility are cell type-specific. 20 One approach to determining the involvement of cell types present in different kidney tissues is to microdissect kidney tissue into glomerular and tubular compartments, which can then be used for sequencing analyses (FIG. 2); an alternative to these bulk sequencing approaches is the use of single-cell technologies.
Remarkable advances in single-cell analytical methods currently allow the analysis of gene expression levels, open chromatin, surface protein levels and spatial organization of the genome on an unprecedented scale. This methodological revolution occurred around 2015, and included advances in barcoding methods and the development of microfluidic devices for cellular encapsulation [G]. 111 Currently, barcoded droplet-based RNA-seq costs ~US$0.01–0.02 per cell, which makes it an affordable approach for obtaining comprehensive information on the expression of >1,000 genes.112 These new methods allow a transcript-based definition of cell types, which provides more in-depth cellular characterization than morphology-based annotation21, and have provided crucial information for the identification of disease-causing cell types and regulatory regions. 21
In addition to expression analysis, open chromatin annotation at the single-cell level through single-nucleus ATAC-seq (snATAC-seq) analysis will likely further revolutionize GWAS signal annotation. Bulk tissue approaches usually only detect signals from predominant cell types but snATAC-seq enables the identification of rare cell types and subpopulations of common cell types. 113,114 snATAC-seq analysis therefore allows the efficient mapping of GWAS regions into narrow cell type-specific open chromatin signals.113 Unfortunately, at present, combining snATAC-seq and snRNA-seq analyses is difficult, although several new methods have been developed.115 Single-cell annotation and analytical methods will be crucial for further GWAS annotation.
Variant-to-function analyses for kidney-related traits
As previously discussed, GWAS have defined the genetic architecture of a variety of kidney-related traits.8–10,116 These studies include hypertension GWAS performed in a variety of large cohorts14, eGFR GWAS in ~1 million Europeans and multi-ethnic cohorts, 8–10 as well as GWAS of electrolyte and protein levels in urine, and of serum and urinary metabolite concentrations.8–10,116,117 The large sample size in these studies, up to 1 million participants, 8–10 has enabled the identification of hundreds of DNA regions with very strong association with a variety of renal traits, such as eGFR, serum urea level, blood pressure and albuminuria. Some genetic loci have been identified in multiple GWAS in which different traits were mapped, such as blood pressure, eGFR8,10,116 and albuminuria. 13,118
Unfortunately, very few kidney-specific epigenome datasets are currently available for the interrogation of genetic loci (FIG. 2). The Encyclopedia of DNA Elements (ENCODE), the Roadmap Epigenomics project and the International Human Epigenome Consortium (IHEC), which characterized functional (gene regulatory) elements in multiple tissues and cell types, contain a relatively small number of human kidney tissue samples for epigenomic analysis. These datasets include some DHS analyses of fetal human kidney samples and genome-wide histone ChIP-seq data for adult human kidney tissue samples. The human kidney is also very sparsely represented in the Genotype–Tissue Expression (GTEx) database: the 2019 version (V8) of this dataset includes data from 73 whole human kidney samples.
Below we discuss various kidney studies that have created novel resources for variant-to-function analyses in nephrology; some of these studies have also identified causal genes and the potential underlying mechanisms that link these genes to kidney disease.
In a 2019 study, researchers generated DHS, chromatin conformation and gene expression profiles from cultures of primary human kidney glomerular and cortical cells119. These new datasets allowed the prioritization of GWAS variants. This approach identified 42 kidney GWAS loci, which were physically and functionally connected to 46 potential target genes. Although this is an important initial dataset, its use of cultured cells instead of human tissue remains a key limitation as cell culture is likely to alter the gene expression pattern and the phenotype of the cells.
In 2017, we reported one of the first human kidney eQTL datasets, in which we analysed ~100 whole human kidney cortex samples from European donors from The Cancer Genome Atlas (TCGA) database120. In this study, we identified 1,886 genes whose expression levels correlated with sequence variants, also known as eGenes120. To identify potential genes involved in CKD, we sought to identify genetic variants associated with kidney function that uniquely influenced gene expression in the kidney by performing a colocalization analysis using our kidney eQTL data and data from GWAS studies. This approach identified lysosomal beta A mannosidase (MANBA) as a potential gene linked to CKD development. We found statistically significant colocalization between genetic variants associated with MANBA expression in the kidney and CKD GWAS variants, confirming the potential role of MANBA as a CKD target gene. 120
In a 2018 study, researchers analyzed gene expression data from 280 whole kidney tissue samples: 180 samples from the Transcriptome of renal human tissue (TRANSLATE) study121 and 100 TCGA samples.120 This study reported 3,786 eGenes (approximately 17.2% of all kidney genes); 35 of these genes had an eQTL that overlapped directly with CKD GWAS loci. However, as discussed earlier, the direct overlap method has substantial limitations, as the causal variant at many loci remains unknown. Mendelian randomization analysis supported a causal role of NAT8B, CASP9 and MUC1 in the development of kidney disease.121
The Nephrotic Syndrome Study Network (NEPTUNE) analyzed the correlation between genotype and gene expression in ~150 samples from individuals with nephrotic syndrome, using microdissected kidney tissue122. The use of microdissection enabled the authors to distinguish between genes regulated in tubular or glomerular compartments. However, gene expression was quantified with microarrays and the analysed cohort included individuals of mixed ethnicities and with varying degrees of disease severity.122 This analysis reported only 894 glomerular eGenes and 1,767 eGenes in the microdissected tubulointerstitium,122 which is likely due to the heterogeneity of the samples used for the analysis. Furthermore, only 53% of glomerular eQTLs and 66% of tubular eQTLs could be replicated in the GTEx database. 122 Key differences between the GTEx and NEPTUNE studies include, respectively, the use of samples from healthy individuals versus individuals with nephrotic syndrome, the use of whole kidney samples versus microdissected tubules and glomeruli, and the use of RNA-seq versus gene microarrays to quantify gene expression. 78
In 2018, we reported an eQTL dataset generated from 120 microdissected glomerular and tubule samples obtained from individuals of European descent with normal kidney function (eGFR >60 ml/min/1.73 m2) and structure (such as <10% fibrosis and sclerosis on histological analysis)22. Furthermore, using an in silico cellular deconvolution [G] method, we only included samples with similar cellular composition, to minimize confounding due to cellular heterogeneity. This dataset identified more than 4,000 glomerular and tubular eGenes with a 70–80% replication rate in other tissue samples (using the GTEx database). The high replication rate supports the identified eGenes as true eQTLs, as it has been shown that most cis-eQTLs are shared across multiple tissues79. Computational integration of the GWAS and eQTL datasets using Coloc prioritized 27 genes, for which association with kidney function and changes in gene expression originated from the same genetic variants. These genes are likely to be causal genes for CKD. 22
To understand where these putative causal genes are expressed, we integrated the GWAS target genes with mouse single-cell transcriptome maps. We found that 10 of the CKD GWAS target genes showed an almost exclusively proximal tubule-specific expression pattern. 22 We further analyzed the region in chromosome 5 that contains complement C9 (C9) and DAB adaptor protein 2 (DAB2), as this area showed significant association with eGFR in GWAS and the same area showed association with DAB2 expression in microdissected human kidney samples 22. Bayesian colocalization indicated significant colocalization between tubule-specific DAB2 levels and SNPs identified in an eGFR GWAS12. Part of the GWAS locus also overlapped with an enhancer region, identified by the presence of H3K27ac as detected by ChIP-seq22. This genetic region showed markers of gene regulatory elements in the kidney, but not in any other tissue samples.22 Finally, we generated mice with tubule-specific heterozygous deletion of DAB2 to show the functional role of DAB2 in kidney tubules. The risk allele was associated with higher DAB2 levels and, accordingly, animals with only one functional copy of tubular DAB2 were protected from kidney disease induced by folic acid.123
In addition to DAB2 and MANBA, earlier studies also suggested a role for uromodulin (UMOD) in CKD.12,124,125 Although UMOD has not been identified as an eGene in the above-mentioned eQTL studies, 10 direct analysis of risk genotype and UMOD expression indicates that UMOD levels are higher in individuals with a CKD risk variant than in those who do not carry a risk variant.125 Transgenic mice that expressed increased levels of tagged UMOD developed salt-sensitive hypertension and kidney lesions126, features that are also observed in individuals who are homozygous for UMOD promoter CKD risk variants.125 Increased UMOD expression resulted in the activation of the sodium cotransporter solute carrier family 12 member 1 (encoded by Slc12a1, also known as NKCC2).125 Moreover, inhibition of this sodium cotransporter with furosemide lowered blood pressure most effectively in patients with hypertension who were homozygous for UMOD promoter risk variants.125
GWAS also identified a strong association between kidney function and genetic variants in the region that encodes SHROOM3.12,127 Although SHROOM3 was also not identified by GWAS–eQTL integration, studies with mouse models have shown that disruptions in this gene can affect glomerular function. 127,128 SHROOM3 is expressed in podocytes and is necessary for the development and/or maintenance of the complex podocyte architecture.129 Shroom3−/− mice developed marked glomerular abnormalities, including cystic and collapsing and/or degenerating glomeruli, as well as marked disruptions in podocyte arrangement and morphology.128 These podocyte-specific abnormalities might result from altered Rho–kinase–myosin II signaling and the loss of apically distributed actin.128
In a different study127, the risk allele A of the SNP rs17319721 correlated with increased SHROOM3 expression in kidney allografts. Analysis of the SNP indicated that the intronic region of SHROOM3 contained a transcription factor 7–like 2 (TCF7L2)-dependent enhancer element that increased SHROOM3 transcription.127 Unlike in the previous study, which focused on podocytes129, SHROOM3 was shown to localize to kidney tubules127 and treatment of renal tubular cells with TGFβ1 upregulated SHROOM3 expression in a β-catenin and TCF7L2–mediated manner. SHROOM3 overexpression facilitated canonical TGFβ1 signaling and increased expression of α1 collagen (COL1A1)127. Inducible and tubular cell–specific knockdown of Shroom3 markedly abrogated interstitial fibrosis in mice with unilateral ureteric obstruction.127
In summary, several key datasets have been generated for the functionalization of GWAS variants, including epigenome and molQTL datasets. These datasets have been instrumental in functionalizing a few genetic loci, including UMOD, MANBA, DAB2 and SHROOM3. A key bottleneck remains in the experimental validation of the prioritized genes and variants.
Conclusions
Human genetic studies have led to paradigm-shifting observations in relatively rare monogenic forms of CKD, including focal segmental glomerulosclerosis (FSGS), and in polygenic forms of kidney disease such as membranous nephropathy.130 In the case of FSGS, this work determined that the podocyte was a key causal cell and identified a limited set of pathogenic pathways for therapeutics development.131 In the case of membranous nephropathy, these studies enabled precision medicine approaches and the discovery of novel diagnostic, prognostic and therapeutic markers.132
Worldwide, hypertensive and diabetic kidney disease account for >75% of cases of CKD and end-stage kidney disease.133 Understanding the pathogenesis of these conditions is fundamental to improving treatment strategies and patient outcomes. Although some exome sequencing studies indicate that coding region mutations might contribute to ~9% of CKD risk,134 most studies consistently indicate that the majority of the genetic variation that explains the heritability of common diseases occurs in non-coding regions of the genome. 15,16 Therefore, identifying kidney disease causal genes, cell types and pathways by defining the biological pathways highlighted by GWAS is crucial. These studies will not only improve our understanding of the pathogenesis of kidney disease but might also identify key targetable pathways.
GWAS causal variants influence the regulation of target genes and are thus usually enriched in gene regulatory regions of open chromatin. These causal variants often alter transcription factor binding strength and quantitatively change the expression of target genes in a cell type-specific manner; we and others have catalogued genotype-driven variation in gene expression in the glomerular and tubule compartments of human kidneys.22,135 Integrating various datasets, such as GWAS, molQTL, single-cell gene expression and epigenome data, and analysing them with robust computational methods, which are continuously being developed, is of crucial importance and has been successful in identifying putative disease-causing genes. We strongly believe that such target genes must also be validated in appropriate model systems. Moreover, the use of kidney single-cell gene expression analyses, for example, enables the identification of the kidney compartment in which these disease-associated genes are enriched.21 Genetically modified mouse models, such as those used for Umod123, Dab222 and Shroom3126,127, can then be used to study the effect of these disease-associated genes.
Genetics is not the only factor that explains CKD development, but an analysis that involved the pharmaceutical industry indicated that the likelihood of successful drug development is substantially increased when a causal gene and/or its pathway are targeted.136 This observation is further supported by studies in cardiology, where genetic studies have identified key disease-driving genes such as PCSK9, which led to the development of a new FDA-approved drug for lowering cholesterol levels.113 Defining the target genes, cell types and underlying mechanisms involved in kidney disease will be crucial for the development of new translational approaches and drugs in nephrology, which are desperately needed.
Table 1.
Techniques applied in variant-to-function annotation
| Research objective | Techniques | Refs |
|---|---|---|
| Identification of areas of open chromatin | DNase hypersensitivity analysis; ATAC-seq | 144,145; 31 |
| Assessment of chromatin conformation | Hi-C^a; ChIA-PET; HiChIP^b | 52; 53; 149 |
| Annotation of histone tail modifications | ChIP-seq | 150 |
| Assessment of DNA methylation | Methylation arrays (Illumina 450K and 850K); whole-genome bisulfite sequencing; reduced representation bisulfite sequencing; isoschizomer-based analysis (HELP) | 151,152; 153; 154; 155 |
| Measurement of enhancer activity | Massively parallel reporter assays; STARR sequencing | 156; 58,59 |
| Single-cell analysis of molecular traits | scRNA-seq; scATAC-seq; scM&T-seq; scNMT-seq; CITE-seq; REAP-seq | 21; 113; 157; 158; 159; 160 |
| Molecular QTL studies | eQTL; caQTL; sQTL; hQTL; meQTL; pQTL; mQTL | 71; 87; 86; 161; 162; 163; 164 |

Techniques and references within each row are listed in the same order.
^a Hi-C is a chromatin conformation capture method that includes the use of high-throughput sequencing.
^b HiChIP is another method of chromatin conformation capture, in which contacts between different DNA sequences are formed before lysis and chromatin immunoprecipitation is subsequently performed on these contacts. 149
In this method, chromatin immunoprecipitation is performed with an antibody that recognizes 5-methyl cytosine (MeDIP). ATAC-seq, assay for transposase-accessible chromatin using sequencing; ChIA-PET, chromatin interaction analysis by paired-end tag sequencing; ChIP-seq, chromatin immunoprecipitation followed by sequencing; caQTL, chromatin accessibility QTL; CITE-seq, cellular indexing of transcriptomes and epitopes by sequencing; eQTL, expression quantitative trait loci; HELP, HpaII tiny fragment enrichment by ligation-mediated PCR; hQTL, histone QTL; MBD-ChIP, methylcytosine DNA-binding domain ChIP; mQTL, metabolic QTL; meQTL, methylation QTL; pQTL, protein QTL; REAP-seq, RNA expression and protein sequencing assay; scM&T-seq, single-cell methylome and transcriptome sequencing; scNMT-seq, single-cell nucleosome, methylome and transcriptome sequencing; scRNA-seq, single-cell RNA sequencing; sQTL, splicing QTL; STARR-seq, self-transcribing active regulatory region sequencing.
Table 2.
Techniques combinations used in human kidney studies to delineate GWAS findings
| Techniques | Human studies | Refs |
|---|---|---|
| Epigenome analysis: DNase hypersensitivity analysis, RNA-seq, isoschizomer-based analysis (HELP), Illumina 450K | Fetal kidney; adult kidney (14 healthy controls and 12 CKD patients)165; adult blood samples (181 Pima Indian subjects)166 | ENCODE accession number GSM530655; 165; 166 |
| DNase-seq, RNA-seq, PolyA-seq, ChIP-seq, RRBS, FAIRE-seq | 68 human embryonic kidney samples (encodeproject.org) | 167 |
| GWAS, eQTL, scRNA-seq | Susztak lab, 151 human kidney samples microdissected into 121 tubules and 119 glomeruli; GTEx database, 73 whole human kidney samples; NEPH QTL (http://nephqtl.org), 181 microdissected human kidney samples from patients with nephrotic syndrome; TRANSLATE study, 280 human kidney samples from unilateral non-invasive renal cancer | 22; 122; 121 |
ChIP-seq, chromatin immunoprecipitation followed by sequencing; eQTL, expression quantitative trait loci; FAIRE–seq, formaldehyde-assisted isolation of regulatory elements followed by high-throughput sequencing; GTEx, genotype–tissue expression; GWAS, genome-wide association studies; H3K4me1, monomethylated H3K4; H3K9ac, acetylated H3K9; H3K27me3, trimethylated histone 3 lysine 27; RRBS, reduced representation bisulfite sequencing; scRNA-seq, single-cell RNA sequencing; TRANSLATE, Transcriptome of renal human tissue.
Key Points.
Genome-wide association studies (GWAS) have identified a large number of nucleotide variations that show strong and reproducible association with kidney-related traits such as serum and urine metabolites, estimated glomerular filtration rate, albuminuria and hypertension.
More than 95% of disease-associated GWAS signals are located in non-coding regions of the genome, which can be identified by analysing chromatin accessibility and conformation, DNA methylation and transcription factor binding.
Disease-associated genetic variants seem to be located in cell-type specific gene regulatory regions, where they modulate disease risk by quantitatively altering gene transcript levels.
The use of advanced computational approaches to combine datasets orthogonal to GWAS data, including molecular quantitative trait studies, single cell transcriptomics and epigenetic information, is necessary for the prioritization of GWAS variants.
Box 1. Role of rare variants in common kidney disease.
Over the last few years, a large collection of whole-exome sequencing datasets has been generated, initially for use in family studies of disease inheritance and later for large biobanks.137 These large-scale whole-exome sequencing studies have identified novel genes that, after experimental validation, could yield potential therapeutic targets138; for example, in nephronophthisis, RBM48, FAM186B, PIAS1, INCENP and RCOR1 have been identified as novel ciliopathy genes. 139
However, the low number of identified targets also indicates that the contribution of rare genetic variants to the development of common diseases such as diabetes mellitus140 and hyperlipidemia141 is relatively small. Unbiased exome sequencing studies indicate that the contribution of coding variants to the development of chronic kidney disease (CKD) might be larger than that previously observed for other traits134; however, these studies must be replicated in larger cohorts. 142
Nonetheless, these studies have demonstrated that a clear convergence of genes, cell types and underlying mechanisms can often be observed for a specific disease condition. 21 For example, we observed that monogenic coding variants associated with nephrotic syndrome are enriched for podocyte-specific expression.21 Similarly, a GWAS performed for albuminuria in the context of diabetic kidney disease showed enrichment for podocyte-specific signals.143 By contrast, coding mutations that are associated with either low or high blood pressure were enriched in genes expressed in the distal convoluted tubule.21 Of note, although genes identified in a blood pressure GWAS were different from those implicated in monogenic blood pressure syndromes, they were also enriched for expression in the distal convoluted tubule.116 These studies are encouraging and potentially indicate that the genes and genetic signals involved in the pathogenesis of a disease will coalesce around a limited number of molecular and cellular processes.
Acknowledgements
K. S. is supported by NIH National Institute of Diabetes and Digestive and Kidney Diseases grants R01DK076077, R01 DK087635 and DP3 DK108220.
Glossary terms
- Secondary chromatin structure
The structure formed by the folding of primary chromatin
- Imputation
The inclusion of genotypes that are not directly measured by estimating the missing data based on reference genome datasets and SNP linkage
- Fine-mapping
The process by which the variant(s) most likely to be causal for a complex trait are identified within an associated locus
- Haplotype
A set of genetic polymorphisms that are inherited together
- Deep phenotyping
- Promoter
The DNA sequences on which transcription is initiated
- Enhancer
A DNA sequence that is bound by transcription factors to increase the transcription of a gene
- Insulator
A DNA sequence that limits chromatin activation
- Transposable elements
- Cis-regulatory elements
Regions of non-coding DNA that affect the expression of nearby genes
- Massively parallel reporter assays
High-throughput experiments, in which the transcriptional activities of many regulatory sequences can be obtained
- Quantitative trait loci (QTL)
Genetic loci that are associated with variation in a phenotype
- Alternative splicing
Process by which multiple transcripts with different functions are produced from a single gene
- Mendelian randomization
- Orthogonal datasets
Independent and uncorrelated data sets
- Posterior probability
The probability of an event or hypothesis after the observed data have been taken into account, obtained by updating a prior probability
- Instrumental variable
- Cellular encapsulation
A technique whereby cells are entrapped in a spherical semi-permeable polymeric membrane
- Cellular deconvolution
The estimation of gene expression data for individual cell types within a bulk expression dataset
Footnotes
Competing interests
The authors declare no competing interests.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Encyclopedia of DNA Elements (ENCODE) - https://www.encodeproject.org
NIH Roadmap Epigenomics project - http://www.roadmapepigenomics.org
Genotype–Tissue Expression (GTEx) portal - https://gtexportal.org/home/
Susztak lab human kidney eQTL atlas - http://susztaklab.com/eqtl/
International Human Epigenome Consortium (IHEC) - http://ihec-epigenomes.org
REFERENCES
- 1.Warejko JK, Tan W, Daga A, et al. : Whole Exome Sequencing of Patients with Steroid-Resistant Nephrotic Syndrome. Clin J Am Soc Nephrol 13:53–62, 2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Mallawaarachchi AC, Hort Y, Cowley MJ, et al. : Whole-genome sequencing overcomes pseudogene homology to diagnose autosomal dominant polycystic kidney disease. Eur J Hum Genet 24:1584–1590, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Boucher C, Sandford R: Autosomal dominant polycystic kidney disease (ADPKD, MIM 173900, PKD1 and PKD2 genes, protein products known as polycystin-1 and polycystin-2). Eur J Hum Genet 12:347–54, 2004 [DOI] [PubMed] [Google Scholar]
- 4.Dempfle A, Scherag A, Hein R, et al. : Gene-environment interactions for complex traits: definitions, methodological requirements and challenges. Eur J Hum Genet 16:1164–72, 2008 [DOI] [PubMed] [Google Scholar]
- 5.Bomba L, Walter K, Soranzo N: The impact of rare and low-frequency genetic variants in common disease. Genome Biol 18:77, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Stringer S, Wray NR, Kahn RS, et al. : Underestimated effect sizes in GWAS: fundamental limitations of single SNP analysis for dichotomous phenotypes. PLoS One 6:e27964, 2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Visscher PM, Brown MA, McCarthy MI, et al. : Five years of GWAS discovery. Am J Hum Genet 90:7–24, 2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Wuttke M, Li Y, Li M, et al. : A catalog of genetic loci associated with kidney function from analyses of a million individuals. Nat Genet 51:957–972, 2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Morris AP, Le TH, Wu H, et al. : Trans-ethnic kidney function association study reveals putative causal genes and effects on kidney-specific disease aetiologies. Nat Commun 10:29, 2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Hellwege JN, Velez Edwards DR, Giri A, et al. : Mapping eGFR loci to the renal transcriptome and phenome in the VA Million Veteran Program. Nat Commun 10:3842, 2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Kanai M, Akiyama M, Takahashi A, et al. : Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat Genet 50:390–400, 2018 [DOI] [PubMed] [Google Scholar]
- 12.Kottgen A, Glazer NL, Dehghan A, et al. : Multiple loci associated with indices of renal function and chronic kidney disease. Nat Genet 41:712–7, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Teumer A, Tin A, Sorice R, et al. : Genome-wide Association Studies Identify Genetic Loci Associated With Albuminuria in Diabetes. Diabetes 65:803–17, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Ehret GB, Ferreira T, Chasman DI, et al. : The genetics of blood pressure regulation and its target organs from association studies in 342,415 individuals. Nat Genet 48:1171–1184, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Maurano MT, Humbert R, Rynes E, et al. : Systematic localization of common disease-associated variation in regulatory DNA. Science 337:1190–5, 2012This study shows that disease causing variants localize to gene regulatory regions
- 16.Schaub MA, Boyle AP, Kundaje A, et al. : Linking disease associations with regulatory information in the human genome. Genome Res 22:1748–59, 2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Schaid DJ, Chen W, Larson NB: From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet 19:491–504, 2018This study highlights the method of fine mapping to identify causal variants.
- 18.Dawn Teare M, Barrett JH: Genetic linkage studies. Lancet 366:1036–44, 2005 [DOI] [PubMed] [Google Scholar]
- 19.Brown CR, Mao C, Falkovskaia E, et al. : Linking stochastic fluctuations in chromatin structure and gene expression. PLoS Biol 11:e1001621, 2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Pattaro C, Teumer A, Gorski M, et al. : Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function. Nat Commun 7:10023, 2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Park J, Shrestha R, Qiu C, et al. : Single-cell transcriptomics of the mouse kidney reveals potential cellular targets of kidney disease. Science 360:758–763, 2018First study in the kidney to eludicate the kidney disease target genes on a cellular level
- 22.Qiu C, Huang S, Park J, et al. : Renal compartment-specific genetic variation analyses identify new pathways in chronic kidney disease. Nat Med 24:1721–1731, 2018Human kidney compartment specific eQTL study, which identifies target genes which are then modelled.
- 23. Uh HW, Deelen J, Beekman M, et al.: How to deal with the early GWAS data when imputing and combining different arrays is necessary. Eur J Hum Genet 20:572–6, 2012
- 24. Flister MJ, Tsaih SW, O’Meara CC, et al.: Identifying multiple causative genes at a single GWAS locus. Genome Res 23:1996–2002, 2013
- 25. Liu B, Montgomery SB: Identifying causal variants and genes using functional genomics in specialized cell types and contexts. Hum Genet 139:95–102, 2020
- 26. Sebastiani P, Timofeev N, Dworkis DA, et al.: Genome-wide association studies and the genetic dissection of complex traits. Am J Hematol 84:504–15, 2009
- 27. Pare G, Mao S, Deng WQ: A machine-learning heuristic to improve gene score prediction of polygenic traits. Sci Rep 7:12665, 2017
- 28. Visscher PM, Wray NR, Zhang Q, et al.: 10 Years of GWAS Discovery: Biology, Function, and Translation. Am J Hum Genet 101:5–22, 2017
- 29. Kiryluk K, Li Y, Moldoveanu Z, et al.: GWAS for serum galactose-deficient IgA1 implicates critical genes of the O-glycosylation pathway. PLoS Genet 13:e1006609, 2017
- 30. Zhou W, Sherwood B, Ji Z, et al.: Genome-wide prediction of DNase I hypersensitivity using gene expression. Nat Commun 8:1038, 2017
- 31. Buenrostro JD, Giresi PG, Zaba LC, et al.: Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–8, 2013
- 32. Beckerman P, Ko YA, Susztak K: Epigenetics: a new way to look at kidney diseases. Nephrol Dial Transplant 29:1821–7, 2014
- 33. Heintzman ND, Stuart RK, Hon G, et al.: Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39:311–8, 2007
- 34. Heintzman ND, Hon GC, Hawkins RD, et al.: Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459:108–12, 2009
- 35. Kimura H: Histone modifications for human epigenome analysis. J Hum Genet 58:439–45, 2013
- 36. Jones PA: Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet 13:484–92, 2012
- 37. Luo C, Hajkova P, Ecker JR: Dynamic DNA methylation: In the right place at the right time. Science 361:1336–1340, 2018
- 38. Wain LV, Vaez A, Jansen R, et al.: Novel Blood Pressure Locus and Gene Discovery Using Genome-Wide Association Study and Expression Data Sets From Blood and the Kidney. Hypertension, 2017
- 39. Moran S, Esteller M: Infinium DNA Methylation Microarrays on Formalin-Fixed, Paraffin-Embedded Samples. Methods Mol Biol 1766:83–107, 2018
- 40. Consortium EP: An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74, 2012
- 41. Bujold D, Morais DAL, Gauthier C, et al.: The International Human Epigenome Consortium Data Portal. Cell Syst 3:496–499 e2, 2016
- 42. Gaudreault M, Gingras ME, Lessard M, et al.: Electrophoretic mobility shift assays for the analysis of DNA-protein interactions. Methods Mol Biol 543:15–35, 2009
- 43. Perna A, Alberi LA: TF-ChIP Method for Tissue-Specific Gene Targets. Front Cell Neurosci 13:95, 2019
- 44. Vorontsov IE, Fedorova AD, Yevshin IS, et al.: Genome-wide map of human and mouse transcription factor binding sites aggregated from ChIP-Seq data. BMC Res Notes 11:756, 2018
- 45. Kidder BL, Hu G, Zhao K: ChIP-Seq: technical considerations for obtaining high-quality data. Nat Immunol 12:918–22, 2011
- 46. Nishizaki SS, Ng N, Dong S, et al.: Predicting the effects of SNPs on transcription factor binding affinity. Bioinformatics 36:364–372, 2020
- 47. Behera V, Evans P, Face CJ, et al.: Exploiting genetic variation to uncover rules of transcription factor binding and chromatin accessibility. Nat Commun 9:782, 2018
- 48. Pennacchio LA, Bickmore W, Dean A, et al.: Enhancers: five essential questions. Nat Rev Genet 14:288–95, 2013
- 49. Wu C, Pan W: Integration of Enhancer-Promoter Interactions with GWAS Summary Results Identifies Novel Schizophrenia-Associated Genes and Pathways. Genetics 209:699–709, 2018
- 50. Dekker J, Rippe K, Dekker M, et al.: Capturing chromosome conformation. Science 295:1306–11, 2002
- 51. Li G, Ruan X, Auerbach RK, et al.: Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148:84–98, 2012
- 52. Burton JN, Adey A, Patwardhan RP, et al.: Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol 31:1119–25, 2013
- 53. Li G, Cai L, Chang H, et al.: Chromatin Interaction Analysis with Paired-End Tag (ChIA-PET) sequencing technology and application. BMC Genomics 15 Suppl 12:S11, 2014
- 54. Li G, Fullwood MJ, Xu H, et al.: ChIA-PET tool for comprehensive chromatin interaction analysis with paired-end tag sequencing. Genome Biol 11:R22, 2010
- 55. Vanhille L, Griffon A, Maqbool MA, et al.: High-throughput and quantitative assessment of enhancer activity in mammals by CapStarr-seq. Nat Commun 6:6905, 2015
- 56. Muerdter F, Boryn LM, Arnold CD: STARR-seq - principles and applications. Genomics 106:145–150, 2015
- 57. Arnold CD, Gerlach D, Stelzer C, et al.: Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339:1074–7, 2013
- 58. Inoue F, Ahituv N: Decoding enhancers using massively parallel reporter assays. Genomics 106:159–164, 2015
- 59. Zabidi MA, Arnold CD, Schernhuber K, et al.: Enhancer-core-promoter specificity separates developmental and housekeeping gene regulation. Nature 518:556–9, 2015
- 60. Ran FA, Hsu PD, Wright J, et al.: Genome engineering using the CRISPR-Cas9 system. Nat Protoc 8:2281–2308, 2013
- 61. Wu S, Zhang M, Yang X, et al.: Genome-wide association studies and CRISPR/Cas9-mediated gene editing identify regulatory variants influencing eyebrow thickness in humans. PLoS Genet 14:e1007640, 2018
- 62. Gasperini M, Hill AJ, McFaline-Figueroa JL, et al.: A Genome-wide Framework for Mapping Gene Regulation via Cellular Genetic Screens. Cell 176:1516, 2019
- 63. Schwank G, Koo BK, Sasselli V, et al.: Functional repair of CFTR by CRISPR/Cas9 in intestinal stem cell organoids of cystic fibrosis patients. Cell Stem Cell 13:653–8, 2013
- 64. Roper J, Yilmaz OH: Breakthrough Moments: Genome Editing and Organoids. Cell Stem Cell 24:841–842, 2019
- 65. Replication DIG, Meta-analysis C, Asian Genetic Epidemiology Network Type 2 Diabetes C, et al.: Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat Genet 46:234–44, 2014
- 66. Soldner F, Stelzer Y, Shivalila CS, et al.: Parkinson-associated risk variant in distal enhancer of alpha-synuclein modulates target gene expression. Nature 533:95–9, 2016
- 67. Genovese G, Friedman DJ, Ross MD, et al.: Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329:841–5, 2010
- 68. van der Wijst MGP, Brugge H, de Vries DH, et al.: Single-cell RNA sequencing identifies celltype-specific cis-eQTLs and co-expression QTLs. Nat Genet 50:493–497, 2018
- 69. Wang X, Park J, Susztak K, et al.: Bulk tissue cell type deconvolution with multi-subject single-cell expression reference. Nat Commun 10:380, 2019
- 70. Neumeyer S, Hemani G, Zeggini E: Strengthening Causal Inference for Complex Disease Using Molecular Quantitative Trait Loci. Trends Mol Med, 2019
- 71. Franke L, Jansen RC: eQTL analysis in humans. Methods Mol Biol 573:311–28, 2009
- 72. Guo X, Lin W, Bao J, et al.: A Comprehensive cis-eQTL Analysis Revealed Target Genes in Breast Cancer Susceptibility Loci Identified in Genome-wide Association Studies. Am J Hum Genet 102:890–903, 2018
- 73. Yao C, Joehanes R, Johnson AD, et al.: Dynamic Role of trans Regulation of Gene Expression in Relation to Complex Traits. Am J Hum Genet 100:985–986, 2017
- 74. Bryois J, Buil A, Evans DM, et al.: Cis and trans effects of human genomic variants on gene expression. PLoS Genet 10:e1004461, 2014
- 75. Westra HJ, Peters MJ, Esko T, et al.: Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet 45:1238–1243, 2013
- 76. Võsa U, Claringbould A, Westra HJ, et al.: Unraveling the polygenic architecture of complex traits using blood eQTL meta-analysis. bioRxiv, 2018
- 77. Liu X, Li YI, Pritchard JK: Trans Effects on Gene Expression Can Drive Omnigenic Inheritance. Cell 177:1022–1034 e6, 2019
- 78. Consortium GT: Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348:648–60, 2015
- 79. Consortium GT, Laboratory DA, Coordinating Center -Analysis Working G, et al.: Genetic effects on gene expression across human tissues. Nature 550:204–213, 2017
- 80. Khansefid M, Pryce JE, Bolormaa S, et al.: Comparing allele specific expression and local expression quantitative trait loci and the influence of gene expression on complex trait variation in cattle. BMC Genomics 19:793, 2018
- 81. Pastinen T, Hudson TJ: Cis-acting regulatory variation in the human genome. Science 306:647–50, 2004
- 82. Barlow DP, Bartolomei MS: Genomic imprinting in mammals. Cold Spring Harb Perspect Biol 6, 2014
- 83. Degner JF, Marioni JC, Pai AA, et al.: Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25:3207–12, 2009
- 84. Baralle FE, Giudice J: Alternative splicing as a regulator of development and tissue identity. Nat Rev Mol Cell Biol 18:437–451, 2017
- 85. Li YI, van de Geijn B, Raj A, et al.: RNA splicing is a primary link between genetic variation and disease. Science 352:600–4, 2016
- 86. Takata A, Matsumoto N, Kato T: Genome-wide identification of splicing QTLs in the human brain and their enrichment among schizophrenia-associated loci. Nat Commun 8:14519, 2017
- 87. Gate RE, Cheng CS, Aiden AP, et al.: Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50:1140–1150, 2018
- 88. Wainberg M, Sinnott-Armstrong N, Mancuso N, et al.: Opportunities and challenges for transcriptome-wide association studies. Nat Genet 51:592–599, 2019
- 89. Zhu Z, Zhang F, Hu H, et al.: Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet 48:481–7, 2016. Using ATAC-seq and RNA-seq data, the authors showed that human risk variants are detected within accessible chromatin in activated T cells (ac QTLs) and that these localize with RNA-seq data, affecting gene expression.
- 90. Gamazon ER, Wheeler HE, Shah KP, et al.: A gene-based association method for mapping traits using reference transcriptome data. Nat Genet 47:1091–8, 2015
- 91. Grubert F, Zaugg JB, Kasowski M, et al.: Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions. Cell 162:1051–65, 2015
- 92. Porcu E, Rueger S, Lepik K, et al.: Mendelian randomization integrating GWAS and eQTL data reveals genetic determinants of complex and clinical traits. Nat Commun 10:3300, 2019
- 93. Lawlor DA, Tilling K, Davey Smith G: Triangulation in aetiological epidemiology. Int J Epidemiol 45:1866–1886, 2016
- 94. Giambartolomei C, Vukcevic D, Schadt EE, et al.: Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet 10:e1004383, 2014
- 95. Sekula P, Del Greco MF, Pattaro C, et al.: Mendelian Randomization as an Approach to Assess Causality Using Observational Data. J Am Soc Nephrol 27:3253–3265, 2016
- 96. Davies NM, Holmes MV, Davey Smith G: Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ 362:k601, 2018
- 97. Lawlor DA, Harbord RM, Sterne JA, et al.: Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med 27:1133–63, 2008
- 98. Smith GD, Ebrahim S: ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32:1–22, 2003
- 99. Del Greco MF, Foco L, Pichler I, et al.: Serum iron level and kidney function: a Mendelian randomization study. Nephrol Dial Transplant 32:273–278, 2017
- 100. Burgess S, Bowden J, Fall T, et al.: Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants. Epidemiology 28:30–42, 2017
- 101. Hemani G, Zheng J, Elsworth B, et al.: The MR-Base platform supports systematic causal inference across the human phenome. Elife 7, 2018
- 102. Burgess S, Dudbridge F, Thompson SG: Re: “Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects”. Am J Epidemiol 181:290–1, 2015
- 103. Burgess S, Thompson SG: Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol 181:251–60, 2015
- 104. Testa A, Mallamaci F, Spoto B, et al.: Association of a polymorphism in a gene encoding a urate transporter with CKD progression. Clin J Am Soc Nephrol 9:1059–65, 2014
- 105. Kolz M, Johnson T, Sanna S, et al.: Meta-analysis of 28,141 individuals identifies common variants within five new loci that influence uric acid concentrations. PLoS Genet 5:e1000504, 2009
- 106. Jordan DM, Choi HK, Verbanck M, et al.: No causal effects of serum urate levels on the risk of chronic kidney disease: A Mendelian randomization study. PLoS Med 16:e1002725, 2019
- 107. Kottgen A, Albrecht E, Teumer A, et al.: Genome-wide association analyses identify 18 new loci associated with serum urate concentrations. Nat Genet 45:145–54, 2013
- 108. Richardson TG, Haycock PC, Zheng J, et al.: Systematic Mendelian randomization framework elucidates hundreds of CpG sites which may mediate the influence of genetic variants on disease. Hum Mol Genet 27:3293–3304, 2018
- 109. Liu B, Pjanic M, Wang T, et al.: Genetic Regulatory Mechanisms of Smooth Muscle Cells Map to Coronary Artery Disease Risk Loci. Am J Hum Genet 103:377–388, 2018
- 110. Gallagher MD, Chen-Plotkin AS: The Post-GWAS Era: From Association to Function. Am J Hum Genet 102:717–730, 2018
- 111. Lan F, Haliburton JR, Yuan A, et al.: Droplet barcoding for massively parallel single-molecule deep sequencing. Nat Commun 7:11784, 2016
- 112. Zhang X, Li T, Liu F, et al.: Comparative Analysis of Droplet-Based Ultra-High-Throughput Single-Cell RNA-Seq Systems. Mol Cell 73:130–142 e5, 2019
- 113. Buenrostro JD, Wu B, Litzenburger UM, et al.: Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523:486–90, 2015. Using DNase-seq and RNA-seq, this study found that disease-causing GWAS variants localize to gene regulatory elements and affect gene expression.
- 114. Cusanovich DA, Daza R, Adey A, et al.: Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348:910–4, 2015. First study in CKD patients to examine methylation changes; it shows that these occur in enhancer regions and affect fibrosis-related genes.
- 115. Jia G, Preussner J, Chen X, et al.: Single cell RNA-seq and ATAC-seq analysis of cardiac progenitor cell transition states and lineage settlement. Nat Commun 9:4877, 2018
- 116. Giri A, Hellwege JN, Keaton JM, et al.: Trans-ethnic association study of blood pressure determinants in over 750,000 individuals. Nat Genet 51:51–62, 2019
- 117. Nakatochi M, Kanai M, Nakayama A, et al.: Genome-wide meta-analysis identifies multiple novel loci associated with serum uric acid levels in Japanese individuals. Commun Biol 2:115, 2019
- 118. Teumer A, Li Y, Ghasemi S, et al.: Genome-wide association meta-analyses and fine-mapping elucidate pathways influencing albuminuria. Nat Commun 10:4130, 2019
- 119. Sieber KB, Batorsky A, Siebenthall K, et al.: Integrated Functional Genomic Analysis Enables Annotation of Kidney Genome-Wide Association Study Loci. J Am Soc Nephrol, 2019
- 120. Ko YA, Yi H, Qiu C, et al.: Genetic-Variation-Driven Gene-Expression Changes Highlight Genes with Important Functions for Kidney Disease. Am J Hum Genet 100:940–953, 2017
- 121. Xu X, Eales JM, Akbarov A, et al.: Molecular insights into genome-wide association studies of chronic kidney disease-defining traits. Nat Commun 9:4800, 2018
- 122. Gillies CE, Putler R, Menon R, et al.: An eQTL Landscape of Kidney Tissue in Human Nephrotic Syndrome. Am J Hum Genet 103:232–244, 2018. Human study identifying eQTLs and potential target genes in the nephrotic syndrome population.
- 123. Beckerman P, Qiu C, Park J, et al.: Human Kidney Tubule-Specific Gene Expression Based Dissection of Chronic Kidney Disease Traits. EBioMedicine 24:267–276, 2017
- 124. Kottgen A, Hwang SJ, Larson MG, et al.: Uromodulin levels associate with a common UMOD variant and risk for incident CKD. J Am Soc Nephrol 21:337–44, 2010
- 125. Trudu M, Janas S, Lanzani C, et al.: Common noncoding UMOD gene variants induce salt-sensitive hypertension and kidney damage by increasing uromodulin expression. Nat Med 19:1655–60, 2013
- 126. Bernascone I, Janas S, Ikehata M, et al.: A transgenic mouse model for uromodulin-associated kidney diseases shows specific tubulo-interstitial damage, urinary concentrating defect and renal failure. Hum Mol Genet 19:2998–3010, 2010
- 127. Menon MC, Chuang PY, Li Z, et al.: Intronic locus determines SHROOM3 expression and potentiates renal allograft fibrosis. J Clin Invest 125:208–21, 2015
- 128. Khalili H, Sull A, Sarin S, et al.: Developmental Origins for Kidney Disease Due to Shroom3 Deficiency. J Am Soc Nephrol 27:2965–2973, 2016
- 129. Yeo NC, O’Meara CC, Bonomo JA, et al.: Shroom3 contributes to the maintenance of the glomerular filtration barrier integrity. Genome Res 25:57–65, 2015
- 130. Lovric S, Ashraf S, Tan W, et al.: Genetic testing in steroid-resistant nephrotic syndrome: when and how? Nephrol Dial Transplant 31:1802–1813, 2016
- 131. Bryer JS, Susztak K: Screening Drugs for Kidney Disease: Targeting the Podocyte. Cell Chem Biol 25:126–127, 2018
- 132. Beck LH Jr., Bonegio RG, Lambeau G, et al.: M-type phospholipase A2 receptor as target antigen in idiopathic membranous nephropathy. N Engl J Med 361:11–21, 2009
- 133. Foley RN, Collins AJ: The USRDS: what you need to know about what it can and can’t tell us about ESRD. Clin J Am Soc Nephrol 8:845–51, 2013
- 134. Groopman EE, Marasa M, Cameron-Christie S, et al.: Diagnostic Utility of Exome Sequencing for Kidney Disease. N Engl J Med 380:142–151, 2019
- 135. Susztak K: Understanding the epigenetic syntax for the genetic alphabet in the kidney. J Am Soc Nephrol 25:10–7, 2014
- 136. Nelson MR, Tipney H, Painter JL, et al.: The support of human genetic evidence for approved drug indications. Nat Genet 47:856–60, 2015
- 137. Lek M, Karczewski KJ, Minikel EV, et al.: Analysis of protein-coding genetic variation in 60,706 humans. Nature 536:285–91, 2016
- 138. Brown TL, Meloche TM: Exome sequencing a review of new strategies for rare genomic disease research. Genomics 108:109–114, 2016
- 139. Braun DA, Schueler M, Halbritter J, et al.: Whole exome sequencing identifies causative mutations in the majority of consanguineous or familial cases with childhood-onset increased renal echogenicity. Kidney Int 89:468–475, 2016
- 140. Flannick J, Mercader JM, Fuchsberger C, et al.: Exome sequencing of 20,791 cases of type 2 diabetes and 24,440 controls. Nature 570:71–76, 2019
- 141. Futema M, Plagnol V, Li K, et al.: Whole exome sequencing of familial hypercholesterolaemia patients negative for LDLR/APOB/PCSK9 mutations. J Med Genet 51:537–44, 2014
- 142. Lata S, Marasa M, Li Y, et al.: Whole-Exome Sequencing in Adults With Chronic Kidney Disease: A Pilot Study. Ann Intern Med 168:100–109, 2018
- 143. Salem RM, Todd JN, Sandholm N, et al.: Genome-Wide Association Study of Diabetic Kidney Disease Highlights Biology Involved in Glomerular Basement Membrane Collagen. J Am Soc Nephrol 30:2000–2016, 2019
- 144. Ling G, Sugathan A, Mazor T, et al.: Unbiased, genome-wide in vivo mapping of transcriptional regulatory elements reveals sex differences in chromatin structure associated with sex-specific liver gene expression. Mol Cell Biol 30:5531–44, 2010
- 145. Lu Q, Richardson B: DNaseI hypersensitivity analysis of chromatin structure. Methods Mol Biol 287:77–86, 2004
- 146. Song L, Crawford GE: DNase-seq: a high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells. Cold Spring Harb Protoc 2010:pdb prot5384, 2010
- 147. Kempfer R, Pombo A: Methods for mapping 3D chromosome architecture. Nat Rev Genet 21:207–226, 2020
- 148. Lieberman-Aiden E, van Berkum NL, Williams L, et al.: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326:289–93, 2009
- 149. Mumbach MR, Rubin AJ, Flynn RA, et al.: HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat Methods 13:919–922, 2016
- 150. O’Geen H, Echipare L, Farnham PJ: Using ChIP-seq technology to generate high-resolution profiles of histone modifications. Methods Mol Biol 791:265–86, 2011
- 151. Moran S, Arribas C, Esteller M: Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics 8:389–99, 2016
- 152. Sandoval J, Heyn H, Moran S, et al.: Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome. Epigenetics 6:692–702, 2011
- 153. Li Q, Hermanson PJ, Springer NM: Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing. Methods Mol Biol 1676:185–196, 2018
- 154. Gu H, Smith ZD, Bock C, et al.: Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat Protoc 6:468–81, 2011
- 155. Khulan B, Thompson RF, Ye K, et al.: Comparative isoschizomer profiling of cytosine methylation: the HELP assay. Genome Res 16:1046–55, 2006
- 156. Andersson R, Sandelin A: Determinants of enhancer and promoter activities of regulatory elements. Nat Rev Genet, 2019
- 157. Angermueller C, Clark SJ, Lee HJ, et al.: Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13:229–232, 2016
- 158. Clark SJ, Argelaguet R, Kapourani CA, et al.: scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat Commun 9:781, 2018
- 159. Stoeckius M, Hafemeister C, Stephenson W, et al.: Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14:865–868, 2017
- 160. Peterson VM, Zhang KX, Kumar N, et al.: Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol 35:936–939, 2017
- 161. Pelikan RC, Kelly JA, Fu Y, et al.: Enhancer histone-QTLs are enriched on autoimmune risk haplotypes and influence gene expression within chromatin networks. Nat Commun 9:2905, 2018
- 162. McRae AF, Marioni RE, Shah S, et al.: Identification of 55,000 Replicated DNA Methylation QTL. Sci Rep 8:17605, 2018
- 163. Yao C, Chen G, Song C, et al.: Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease. Nat Commun 9:3268, 2018
- 164. Nicholson G, Rantalainen M, Li JV, et al.: A genome-wide metabolic QTL analysis in Europeans implicates two loci shaped by recent positive selection. PLoS Genet 7:e1002270, 2011
- 165. Ko YA, Mohtat D, Suzuki M, et al.: Cytosine methylation changes in enhancer regions of core pro-fibrotic genes characterize kidney fibrosis development. Genome Biol 14:R108, 2013
- 166. Qiu C, Hanson RL, Fufaa G, et al.: Cytosine methylation predicts renal function decline in American Indians. Kidney Int 93:1417–1431, 2018
- 167. Simon JM, Giresi PG, Davis IJ, et al.: A detailed protocol for formaldehyde-assisted isolation of regulatory elements (FAIRE). Curr Protoc Mol Biol Chapter 21:Unit21 26, 2013



