Proceedings of the American Thoracic Society. 2006 Aug;3(6):472. doi: 10.1513/pats.200603-033MS

Microarray Data–Based Prioritization of Chronic Obstructive Pulmonary Disease Susceptibility Genes

Soumyaroop Bhattacharya, Sorachai Srisuma, Dawn L DeMeo, John J Reilly, Raphael Bueno, Edwin K Silverman, Thomas J Mariani
PMCID: PMC2647634  PMID: 16921109

One limitation of genome-wide linkage screens aiming to identify disease-susceptibility genes is the resolution at which they define individual candidates. Fine mapping of linked loci can often be laborious, protracted, and poorly directed. We reasoned that an integrative genomics approach could help to expedite candidate gene identification. We tested this hypothesis by prioritizing genes within three linkage regions (chromosomes 2q33-36, 8pter-22, and 12p13-12) previously identified in the Boston Early-Onset COPD cohort (1), using two independent microarray data sets. Genes within the previously defined linkage regions were identified using Golden Path (University of California Santa Cruz, http://genome.ucsc.edu), and associated probes were assigned using NetAffx (Affymetrix, Santa Clara, CA). Probe sets were filtered by sequence verification using The Lung Transcriptome (http://lungtranscriptome.bwh.harvard.edu). The first microarray data set consists of whole lung tissue samples derived from 20 patients with severe emphysema who had undergone lung volume reduction surgery and from 14 control subjects with mild to moderate obstruction (2). The second data set, which has not been previously described, consists of whole lung tissue samples (uninvolved margin) derived from 31 patients undergoing surgery for solitary pulmonary nodules, with varying degrees of airflow obstruction (FEV1% predicted: mean, 72; range, 10–133). Signal intensities were derived using both the MAS5 and RMA algorithms. Unsupervised clustering with the nonparametric bootstrap was applied to check for undesirable and unanticipated structure or associations among the samples. Pearson and Spearman correlations were used to test for significant associations between gene expression and continuous phenotypic variables (e.g., TLC, FEV1, FVC, FEV1/FVC, DlCO, FEF25–75, smoking history). All analysis methods were repeated exhaustively for each gene/probe set and each data set.
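The correlation screen described above can be illustrated with a minimal sketch. It is not the authors' code. The expression and FEV1 values are invented for illustration, the function names are hypothetical, and the permutation test is an assumption standing in for whatever significance test was actually used, which the abstract does not specify.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p(x, y, stat, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a correlation statistic.

    Assumption: a permutation test is used here only for illustration.
    """
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(stat(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Invented values for one probe set across eight subjects.
expression = [7.1, 6.8, 6.5, 6.9, 6.1, 5.8, 6.0, 5.5]  # log2 signal
fev1_pct = [35, 42, 55, 48, 70, 88, 76, 95]            # FEV1 % predicted

r = pearson(expression, fev1_pct)
rho = spearman(expression, fev1_pct)
p_rho = permutation_p(expression, fev1_pct, spearman)
```

In a full analysis this test would be repeated for every probe set, phenotype, signal algorithm (MAS5, RMA), and data set, giving one result per combination.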
For each gene/probe set, analysis results were summarized where data implicated an association between gene expression and the disease variable. Finally, genes were rank-prioritized based upon their frequency of significant association. A number of genes within each locus were repeatedly associated with multiple phenotypic variables in each data set. As has been previously noted, there was limited consistency between data sets, which in this case may be due to distinctions in patient populations and/or technical limitations. However, this approach consistently identified a small number of genes, including the recently implicated candidate susceptibility gene SERPINE2, as targets for further study. We believe this provides a rational approach that is applicable to many diseases and model systems for which data may already exist. However, further investigation is necessary to assert causation between prioritized candidates and the phenotype being investigated.
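The rank-prioritization step described above can be sketched as a simple tally of significant tests per gene. This is a hedged illustration, not the authors' implementation. The record layout, the gene names (`GENE_A` and so on), the p-values, and the 0.05 threshold are all assumptions made for the example.

```python
from collections import defaultdict

ALPHA = 0.05  # assumed threshold; the abstract does not state one

def prioritize(results, alpha=ALPHA):
    """Rank genes by how often they show a significant association.

    Each record is a dict with hypothetical keys: gene, dataset,
    phenotype, method, and p. Returns (gene, count, data sets) tuples,
    most frequently associated gene first, ties broken by gene name.
    """
    counts = defaultdict(int)
    datasets = defaultdict(set)
    for rec in results:
        counts[rec["gene"]] += 0  # keep genes with no significant test
        if rec["p"] < alpha:
            counts[rec["gene"]] += 1
            datasets[rec["gene"]].add(rec["dataset"])
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return [(g, counts[g], sorted(datasets[g])) for g in ranked]

# Invented test results for three placeholder genes.
results = [
    {"gene": "GENE_A", "dataset": "set1", "phenotype": "FEV1", "method": "pearson", "p": 0.004},
    {"gene": "GENE_A", "dataset": "set1", "phenotype": "FVC", "method": "spearman", "p": 0.030},
    {"gene": "GENE_A", "dataset": "set2", "phenotype": "FEV1", "method": "pearson", "p": 0.012},
    {"gene": "GENE_B", "dataset": "set1", "phenotype": "FEV1", "method": "pearson", "p": 0.040},
    {"gene": "GENE_B", "dataset": "set2", "phenotype": "FEV1", "method": "pearson", "p": 0.300},
    {"gene": "GENE_C", "dataset": "set2", "phenotype": "TLC", "method": "spearman", "p": 0.700},
]

ranking = prioritize(results)
# Genes with significant associations in more than one data set.
consistent = [gene for gene, _, sets in ranking if len(sets) > 1]
```

Recording which data sets contributed to each count makes it easy to separate genes supported by both data sets from those supported by only one, which matters given the limited consistency between data sets noted above.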

Acknowledgments

The authors acknowledge Craig P. Hersh for providing additional linkage information.

These studies were supported by NIH grants HL72303 (J.J.R.) and HL71885 (T.J.M.).

Conflict of Interest Statement: S.B. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. S.S. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. D.L.D. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.J.R. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. R.B. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. E.K.S. received grant support, consulting fees, and honoraria from GlaxoSmithKline for studies of COPD genetics. He also received a speaker fee from Wyeth for a talk on COPD genetics. He also received honoraria from Bayer. T.J.M. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript.

References

  1. Silverman EK, Palmer LJ, Mosley JD, Barth M, Senter JM, Brown A, Drazen JM, Kwiatkowski DJ, Chapman HA, Campbell EJ, et al. Genomewide linkage analysis of quantitative spirometric phenotypes in severe early-onset chronic obstructive pulmonary disease. Am J Hum Genet 2002;70:1229–1239.
  2. Spira A, Beane J, Pinto-Plata V, Kadar A, Liu G, Shah V, Celli B, Brody JS. Gene expression profiling of human lung tissue from smokers with severe emphysema. Am J Respir Cell Mol Biol 2004;31:601–610.

Articles from Proceedings of the American Thoracic Society are provided here courtesy of American Thoracic Society
