Fig. (6). Step-wise approach to pharmacogenetic association studies.
Illustrated are the relationships between the amount of genotype data collected (blue rectangles), the amount of information generated from statistical and computational analysis (green rectangles) and the amount of knowledge about genetic architecture that is generated from interpreting data analysis results (yellow rectangles) for 1) a gene-centric approach that focuses on one or several candidate genes selected on the basis of their biochemical properties, 2) a pathway-based approach that looks at candidate genes in a particular biochemical pathway and 3) a genome-wide approach that considers a dense map of single-nucleotide polymorphisms (SNPs) that capture most of the variability in the genome. (A) Here, genome-wide association studies carried out independently of gene-centric and pathway-based results are considered agnostic to prior biological and analytical knowledge. In this paradigm, the amount of knowledge gained from a genome-wide association study is very small in proportion to the amount of data and information generated. This is due to the high level of noise inherent in data where the number of variables greatly exceeds the sample size. (B) In this paradigm, knowledge gained from gene-centric studies is used to help pick the pathways and the genes that will be considered in a pathway-based approach. Further, the knowledge gained from pathway-based studies is used to help interpret genome-wide data analysis results. Here, the amount of knowledge gained from the genome-wide association study is greater than that provided by the purely agnostic approach outlined in (A). (C) The genome-wide association study is more expensive and more time-consuming than either of the other two approaches. This is especially true of the greatly increased time needed to carry out quality control, data management, data analysis and results interpretation.
Candidate gene studies therefore provide greater value, defined here as knowledge gained relative to the data generated. We propose that the analysis and interpretation of a genome-wide association study will be most successful when carried out after the gene-centric and pathway-based approaches have been fully explored. This will ultimately increase the value of the genome-wide association study.