Figure 5.
Co-expression and co-citation profiling. (A) The PDZ domain-containing proteins and the literature-confirmed PDZ ligand gene set are co-regulated, as shown by nearest-neighborhood analysis (Mootha et al. 2003). Briefly, the PDZ neighborhood index N(R = 250) is defined as the number of unique ligands previously reported in the literature to bind PDZ proteins that occur within the nearest 250 genes ranked by Euclidean distance to a PDZ gene. The distribution is plotted for random sets of ligands and for the literature-curated list of ligands (n = 192). For comparison, a set of random ligands equal in size to the literature ligand set was generated and the distribution was calculated 1000 times, with the average value shown (P < 0.001; permutation testing). (B) No significant co-regulation was detected by neighborhood analysis between known PDZ ligands and either a set of 230 WD40 domain-encoding genes or a set of 120 DNA repair genes. (C) Determination of the cutoff for the pairwise Pearson’s correlation coefficient (PCC); ρ > 0.40 suggests a potential direct or indirect interaction between the protein products of the PDZ gene and the PDZCBM ligand. Two distributions were examined: the PCCs of the 270 binary PDZ–ligand interactions documented in the literature survey, and the PCCs of PDZ genes with a random set of ligands. The random distribution was calculated by generating 1000 permutations of random ligands of the PDZCBM gene set size (n = 899). The cutoff for a significant pairwise correlation indicating a potential PDZ–ligand interaction was chosen because <5% of the random distribution had ρ > 0.40. (D) Hierarchical clustering of correlation values (ρ) between PDZ genes and PDZCBM (n = 107 × 777) showing modules of co-expression between PDZ and PDZCBM ligands in immune and neuronal tissues.
(E) Five hundred and ninety-three PDZCBM genes clustered by similarity of their PubMed literature co-citation profiles with 43 terms describing the location, functionality, and interacting proteins of known PDZ ligands and complexes. (Note: 306/899 did not have citations with any of these 43 terms.) The tick marks on the side indicate those PDZCBM genes that are co-cited with the term PDZ. (F) Flow of information used to identify the CARD11–SH2D3C interaction, combining PDZCBM comparative genomics with co-expression profiling and co-citation profiling.
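The neighborhood index and permutation test described for panel (A) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the layout of the expression matrix (genes in rows, samples in columns), the use of the mean index across PDZ genes as the test statistic, and the choice to draw random ligand sets from all genes in the matrix are assumptions made here for illustration.

```python
import numpy as np


def neighborhood_index(expr, query, ligand_idx, radius=250):
    """Count ligand genes among the `radius` genes nearest to `query`.

    expr: 2-D array, one row per gene (assumed layout).
    query: row index of the PDZ gene.
    ligand_idx: row indices of the ligand genes.
    """
    # Euclidean distance from the query gene to every gene.
    dists = np.linalg.norm(expr - expr[query], axis=1)
    order = np.argsort(dists, kind="stable")
    # Drop the query gene itself, then keep the nearest `radius` genes.
    nearest = order[order != query][:radius]
    return int(np.isin(nearest, ligand_idx).sum())


def permutation_p_value(expr, pdz_idx, ligand_idx, radius=250,
                        n_perm=1000, seed=0):
    """Compare the mean index of the real ligand set with random sets.

    Random ligand sets have the same size as `ligand_idx` and are drawn
    from all rows of `expr` (an assumption; the background is not
    specified in the caption).
    """
    rng = np.random.default_rng(seed)
    ligand_idx = np.asarray(ligand_idx)
    n_genes = expr.shape[0]

    def mean_index(ligands):
        return float(np.mean(
            [neighborhood_index(expr, q, ligands, radius) for q in pdz_idx]
        ))

    observed = mean_index(ligand_idx)
    null = np.array([
        mean_index(rng.choice(n_genes, size=ligand_idx.size, replace=False))
        for _ in range(n_perm)
    ])
    # Add-one correction so the estimated P value is never exactly zero.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, float(p_value)


# Toy data: six genes measured in a single sample.
toy_expr = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
toy_observed, toy_p = permutation_p_value(
    toy_expr, pdz_idx=[0], ligand_idx=[1, 2], radius=2, n_perm=200
)
```

With 1000 permutations and the add-one correction, the smallest P value this sketch can return is 1/1001, which is compatible with the P < 0.001 reported for panel (A).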