2021 May 17;13:83. doi: 10.1186/s13073-021-00904-z

Fig. 1.


An improved method for finding shared genetic architecture of human traits. a The overall framework of the iCPAGdb pipeline. GWAS summary statistics (from published GWAS datasets or from user-uploaded GWAS) undergo LD clumping to obtain a lead variant for each signal below a specified p value threshold. These SNPs are queried against an LD proxy database generated from 1000 Genomes African, Asian, or European populations to identify cross-phenotype associations through direct overlap or LD proxy at R² > 0.4. The significance of overlap for each trait pair is calculated using Fisher’s exact test. Outputs can be visualized and downloaded from the iCPAGdb web browser. b Comparison of the number of shared SNPs for each NHGRI-EBI GWAS catalog trait pair identified through direct overlap vs. both direct and indirect (LD-proxy) overlap. c iCPAGdb detected more significant cross-phenotype associations than CPAG1 at FDR < 0.1. Expansion of the NHGRI-EBI GWAS catalog and improvements in capturing overlap by LD proxy in iCPAGdb fueled a large increase in detected cross-phenotype associations across human traits. Comparisons between CPAG1 and iCPAGdb on the same 2013 dataset are in Additional file 5: Figure S3. d Circle plot of cross-phenotype associations detected by iCPAGdb in the NHGRI-EBI GWAS catalog. After excluding compound phenotypes (phenotypes described by the NHGRI-EBI GWAS catalog as > 1 comma-separated phenotype in their ontology), 1709 traits involved in 53,314 cross-phenotype associations remained. These were categorized into 17 EFO Parental groups. Inner ribbons link phenotypes connected by cross-phenotype associations, with the width of each ribbon corresponding to the number of cross-phenotype associations. The axis outside the circle represents the cumulative number of associations for each group vs. all other groups.
e Comparison of genetic correlation from LD score regression (LDSC) with the Chao-Sorensen similarity index implemented in iCPAG shows that the two measures are significantly correlated. The genetic correlations (rg) of 24 diseases/traits were obtained from [23]. Since Chao-Sorensen values are bounded from 0 to 1 and rg ranges from −1 to 1, we used the absolute value of rg here. A colored * indicates a trait pair significant for LDSC, iCPAGdb, or both at a false discovery rate of 0.1. f A model demonstrating how SNPs regulate uric acid levels to impact the development of kidney stones and gout. g Riverplot of gout cross-phenotype associations generated from iCPAGdb output shows mapped genes associated with gout by GWAS (left) connected with NHGRI-EBI GWAS phenotypes grouped into EFO categories (right; colors indicate different categories). Cross-phenotype associations include causal connections (such as uric acid levels), comorbid outcomes (such as kidney stones), and regulators of disease (such as alpha-1-antitrypsin levels).
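
The overlap test described in panel a can be illustrated with a minimal Python sketch. It is not the iCPAGdb implementation: the function names, the toy rs identifiers, the proxy dictionary layout, and in particular the background size `n_total` (the number of independent signals a lead SNP could fall into) are all assumptions made for illustration. The sketch counts lead SNPs of one trait that match a second trait either directly or through an LD proxy, then computes a one-sided Fisher's exact p value from the hypergeometric tail.

```python
from math import comb


def shared_signal_count(lead_a, lead_b, proxies):
    """Count lead SNPs of trait A that match trait B directly or via an LD proxy.

    `proxies` maps a SNP to the set of SNPs in LD with it above the chosen
    threshold (hypothetical layout; the caption mentions R^2 > 0.4).
    """
    b = set(lead_b)
    shared = 0
    for snp in lead_a:
        candidates = {snp} | set(proxies.get(snp, ()))
        if candidates & b:
            shared += 1
    return shared


def fisher_exact_greater(shared, n_a, n_b, n_total):
    """One-sided Fisher's exact test: P(overlap >= shared).

    n_a, n_b : number of lead signals for trait A and trait B
    n_total  : assumed size of the background set of independent signals
    """
    max_k = min(n_a, n_b)
    denom = comb(n_total, n_b)
    # Sum the hypergeometric probabilities of every overlap at least as
    # large as the observed one.
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(shared, max_k + 1)
    )
    return tail / denom


# Toy example with made-up identifiers.
lead_a = ["rs1", "rs2", "rs3"]
lead_b = ["rs2", "rs9"]
proxies = {"rs1": {"rs9"}}  # rs1 is assumed to be in LD with rs9

k = shared_signal_count(lead_a, lead_b, proxies)
p = fisher_exact_greater(k, len(lead_a), len(lead_b), n_total=10)
print(k, p)
```

In this toy case two of the three trait A signals are shared (one directly, one by proxy). The p value depends strongly on the assumed background size, so `n_total` should be chosen to reflect the set of signals actually tested.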