Abstract
Forced vital capacity (FVC), a spirometric measure of pulmonary function, reflects lung volume and is used to diagnose and monitor lung diseases. We performed a genome-wide association study meta-analysis of FVC in 52,253 individuals from 26 studies and followed up the top associations in 32,917 additional individuals of European ancestry. We found six new regions associated at genome-wide significance (P < 5 × 10−8) with FVC in or near EFEMP1, BMP6, MIR-129-2/HSD17B12, PRDM11, WWOX, and KCNJ2. Two loci previously associated with spirometric measures (GSTCD and PTCH1) were related to FVC. Newly implicated regions were followed up in samples of African American, Korean, Chinese, and Hispanic individuals. We detected transcripts for all six newly implicated genes in human lung tissue. The new loci may inform mechanisms involved in lung development and pathogenesis of restrictive lung disease.
Introduction
Pulmonary function is a heritable trait that can be reliably measured by spirometry and reflects the physiological state of the lungs and airways 1. Forced vital capacity (FVC), one of the most widely used pulmonary function measures, approximates vital capacity. In conjunction with forced expiratory volume in 1 second (FEV1), FVC is used to diagnose various respiratory diseases. A reduced ratio of FEV1 to FVC (FEV1/FVC) indicates airflow obstruction when FEV1 is reduced disproportionately relative to FVC. In contrast, a decreased FVC in the face of a normal to elevated FEV1/FVC suggests a restrictive ventilatory defect. In clinical practice, FVC is often used as a surrogate measure of disease progression in patients with established restrictive lung disorders, such as idiopathic pulmonary fibrosis 2,3. Reduced FVC is a strong predictor of mortality in the general population, independently of FEV1 and standard risk factors such as age and cigarette smoking 4–8.
Pulmonary function measures show familial aggregation, with evidence for genetic effects in twin and family studies 9,10. We previously reported associations between FEV1 or FEV1/FVC and at least 27 genetic loci using large-scale meta-analyses of genome-wide association studies (GWAS) 11–14. To date, the genetic determinants of FVC have not been studied using GWAS methods. We conducted a comprehensive GWAS meta-analysis across two large consortia of European ancestry, the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) and SpiroMeta, to identify common genetic variants associated with cross-sectional measures of FVC in 52,253 subjects of European descent. For each of the six new loci, we confirmed expression of the gene nearest the novel variant in lung tissue and performed expression quantitative trait locus (eQTL) analyses in 762 whole blood samples. We evaluated the new loci in 6,070 African Americans in the National Heart, Lung and Blood Institute (NHLBI)-sponsored Candidate gene Association Resource (CARe) Project, in a Chinese subset (n = 563) of the Multi-Ethnic Study of Atherosclerosis (MESA), in a Hispanic subset (n = 849) of MESA, and in Koreans (n = 8,074) from two cohort studies: Healthy Twin 15–17 and Korea Association Resource 3 (KARE3) 18.
Results
The study consisted of two stages. Stage 1 was a meta-analysis of study-specific genome-wide analyses of FVC conducted in 26 studies with a total of 52,253 individuals of European ancestry. The study characteristics are shown in Supplementary Table 1. Individual cohorts performed a GWAS analysis using linear regression models with FVC (in milliliters) as the outcome, stratified by never/ever smoking. Adjustment factors included age, age2, sex, height, height2 and weight. If applicable, cohorts also adjusted for center, cohort or principal components to account for population stratification. Stage 1 results are in Supplementary Data Set 1. In stage 2, we followed up SNPs showing association with FVC (P < 5 × 10−7) and meta-analyzed beta estimates and standard errors across stages 1 and 2 (Figure 1). Stage 2 encompassed 32,917 subjects of European ancestry from 9 independent cohorts. The study characteristics are shown in Supplementary Table 2. The follow-up studies used the same models. Effect estimates for each study were corrected using genomic control 19 separately within smoking strata. Study-specific lambda estimates are shown in Supplementary Table 1. The test statistic inflation (λGC) before applying genomic control at the meta-analysis level was 1.12 (Supplementary Figure 1). The test statistic inflation standardized for a sample size of 1,000 individuals was 1.002.
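The standardized inflation value quoted above can be reproduced with a short calculation. The sketch below assumes the commonly used linear rescaling for a quantitative trait, λ1000 = 1 + (λGC − 1) × 1000/N; the text does not state which formula was used, so this is an illustration rather than the authors' procedure, and the function name `lambda_1000` is hypothetical.

```python
def lambda_1000(lambda_gc: float, n: int) -> float:
    """Rescale a genomic-control inflation factor to a sample of 1,000.

    Assumes the linear rescaling 1 + (lambda - 1) * 1000 / n for a
    quantitative trait (an assumption, not stated in the text).
    """
    return 1.0 + (lambda_gc - 1.0) * (1000.0 / n)


# Values reported in the text: lambda_GC = 1.12 in 52,253 stage 1 individuals.
scaled = lambda_1000(1.12, 52_253)
print(round(scaled, 3))  # 1.002
```

Under this assumption the result agrees with the reported value of 1.002.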
Figure 1. Study design.
Overview of our staged analysis to identify new variants influencing FVC. After a large-scale GWAS meta-analysis of stage 1 cohorts (n = 52,253), we followed up a total of 7 SNPs showing evidence of association with FVC (P < 5 × 10−7) in stage 2. The studies included in stage 2, encompassing a total sample size of n = 32,917, undertook in silico testing of the 7 loci, which had not previously been associated with any pulmonary phenotype. See Supplementary Tables 1 and 2 for definitions of all study abbreviations.
Regions around SNPs reaching genome-wide significance after the meta-analysis of the two stages were followed up in diverse ancestry samples of African Americans, Koreans, Chinese and Hispanics. The characteristics for these multi-ethnic follow-up studies are shown in Supplementary Table 2. Following the meta-analysis, we investigated mRNA expression of the nearest gene for each of the new SNPs in human lung tissue, human airway smooth muscle cells (HASM), human bronchial epithelial cells (HBEC) and peripheral blood mononuclear cells (PBMC) (Supplementary Note). We assessed whether these SNPs were associated with gene expression in whole blood cells (Supplementary Note, Supplementary Table 3) and queried databases to assess whether these SNPs are located within known or predicted regulatory regions.
Stage 1 and 2 results
There were 9 regions containing at least one SNP associated with FVC at P < 5 × 10−7 in stage 1. Of these, two (GSTCD and PTCH1) had previously been reported in GWAS 12,13 of spirometric traits (FEV1 and FEV1/FVC) and thus were not evaluated further, leaving 7 SNPs in 7 loci for follow-up in stage 2.
Six loci reached genome-wide significance (P < 5 × 10−8) for FVC in the meta-analysis of stages 1 and 2 (Table 1, Figure 2). The loci were in or near the following genes: BMP6 (rs6923462, 6p24, intronic), EFEMP1 (rs1430193, 2p16.1, intronic), MIR-129-2/HSD17B12 (rs4237643, 11p11.2, 54 kb upstream), PRDM11 (rs2863171, 11p11.2, 3 kb downstream), WWOX (rs1079572, 16q23.1, intronic) and KCNJ2 (rs6501431, 17q24.3, 800 kb downstream) (Supplementary Figures 2a–g). Effect sizes were generally consistent across studies (Supplementary Figures 3 a–g and 4 a–g). Three of these regions (BMP6, EFEMP1 and PRDM11) also showed independent replication in stage 2 European ancestry samples, with P values below a Bonferroni corrected threshold for 7 tests (P < 7.14 × 10−3) (Table 1). The lowest P value (5.89 × 10−13) for the meta-analyzed effect estimate across stage 1 and 2 was found for SNP rs6923462 (intronic SNP in BMP6).
Table 1.
Main results from stage 1, stage 2 and the meta-analysis of stage 1 and 2 for loci associated with forced vital capacity
| SNP ID | Chr. | NCBI 36 Position | Nearest gene | Coded Allele | Analysis Stage | β (ml) | S.E.gc | P value | Coded Allele freq. | N effective |
|---|---|---|---|---|---|---|---|---|---|---|
| rs1430193 | 2 | 55974357 | EFEMP1 (intronic) | T | Stage 1 | −23.75 | 4.022 | 3.52×10−9 | 0.370 | 45,852 |
| | | | | | Stage 2 | −17.839 | 4.500 | 7.36×10−5 | 0.361 | 28,103 |
| | | | | | Joint meta-analysis | −21.125 | 2.999 | 1.86×10−12 | | |
| rs1942055 | 2 | 135215394 | TMEM163 (50 kb upstream) | G | Stage 1 | −19.943 | 3.919 | 3.60×10−7 | 0.458 | 46,365 |
| | | | | | Stage 2 | −3.13 | 4.447 | 0.482 | 0.470 | 24,984 |
| | | | | | Joint meta-analysis | −12.594 | 2.940 | 1.84×10−5 | | |
| rs6923462 | 6 | 7746111 | BMP6 (intronic) | T | Stage 1 | 28.828 | 5.208 | 3.11×10−8 | 0.843 | 48,680 |
| | | | | | Stage 2 | 35.204 | 7.552 | 3.14×10−6 | 0.846 | 17,271 |
| | | | | | Joint meta-analysis | 30.883 | 4.288 | 5.89×10−13 | | |
| rs4237643 | 11 | 43604944 | HSD17B12 (54 kb upstream) | T | Stage 1 | −21.366 | 3.957 | 6.66×10−8 | 0.311 | 51,977 |
| | | | | | Stage 2 | −10.073 | 4.686 | 0.032 | 0.305 | 30,119 |
| | | | | | Joint meta-analysis | −16.666 | 3.023 | 3.53×10−8 | | |
| rs2863171 | 11 | 45207308 | PRDM11 (3 kb downstream) | C | Stage 1 | 25.343 | 5.015 | 4.33×10−7 | 0.158 | 51,758 |
| | | | | | Stage 2 | 21.755 | 6.23 | 4.79×10−3 | 0.160 | 25,121 |
| | | | | | Joint meta-analysis | 23.924 | 3.906 | 8.97×10−10 | | |
| rs1079572 | 16 | 76744639 | WWOX (intronic) | G | Stage 1 | 20.539 | 3.733 | 3.76×10−8 | 0.417 | 51,049 |
| | | | | | Stage 2 | 10.41 | 4.364 | 0.017 | 0.419 | 28,103 |
| | | | | | Joint meta-analysis | 16.258 | 2.837 | 9.95×10−9 | | |
| rs6501431 | 17 | 66488010 | KCNJ2 (800 kb downstream) | T | Stage 1 | 26.729 | 4.751 | 1.84×10−8 | 0.798 | 47,576 |
| | | | | | Stage 2 | 15.641 | 6.746 | 0.020 | 0.789 | 17,694 |
| | | | | | Joint meta-analysis | 23.053 | 3.884 | 2.94×10−9 | | |
Shown are FVC results for the leading SNPs, ordered by chromosome and position, for each independent locus followed up for association with FVC in a joint analysis of up to 85,170 individuals of European ancestry from the CHARGE-SpiroMeta GWAS (stage 1) and follow-up (stage 2). Two-sided P values are given for stage 1, stage 2 and the joint meta-analysis of all stages. P values reaching genome-wide significance (P < 5 × 10−8) in the joint meta-analysis of all stages are indicated in bold. SNPs reaching independent replication in stage 2 (P < 0.05/7 = 7.14 × 10−3) are indicated with their stage 2 P value in bold. The sample sizes (N) shown are the effective sample sizes. The effective sample size (N effective) is the product of sample size and the imputation quality summed across studies. The joint meta-analysis includes data from stage 1 and stage 2. β values reflect effect-size estimates in milliliters (ml).
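The stage 2 replication criterion can be checked directly against the P values in Table 1. The following is a minimal sketch of that check; the variable names are illustrative, and the P values are copied from the stage 2 rows of the table.

```python
ALPHA = 0.05
N_TESTS = 7  # number of SNPs taken forward to stage 2
threshold = ALPHA / N_TESTS  # Bonferroni-corrected threshold, about 7.14e-3

# Stage 2 P values from Table 1.
stage2_p = {
    "rs1430193": 7.36e-5,   # EFEMP1
    "rs1942055": 0.482,     # TMEM163
    "rs6923462": 3.14e-6,   # BMP6
    "rs4237643": 0.032,     # HSD17B12
    "rs2863171": 4.79e-3,   # PRDM11
    "rs1079572": 0.017,     # WWOX
    "rs6501431": 0.020,     # KCNJ2
}

# SNPs whose stage 2 P value falls below the corrected threshold.
replicated = sorted(snp for snp, p in stage2_p.items() if p < threshold)
print(replicated)  # ['rs1430193', 'rs2863171', 'rs6923462']
```

The three SNPs returned correspond to EFEMP1, PRDM11 and BMP6, matching the loci described in the text as independently replicated.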
Figure 2. Manhattan plot.
Manhattan plot for the association results for FVC. The plot shows all the loci analyzed in stage 1, where the two loci previously associated with either FEV1 or FEV1/FVC are indicated in grey. The previously unassociated loci were in or near the presented adjacent gene. The loci reaching genome-wide significance, after the combined analysis of stage 1 and stage 2, are marked with an asterisk.
We evaluated the effect of the six new loci separately in ever-smokers and in never-smokers, and the effect sizes were consistent across smoking strata for all the variants (Supplementary Table 4).
Multi-ethnic follow-up
To examine the portability of the identified loci to other ethnic groups, we evaluated the regions of our newly identified SNPs in African Americans, Hispanics, Chinese, and Koreans. For three of these samples (African Americans, Hispanics, and Chinese), we looked up the SNPs with minor allele frequency (MAF) ≥ 0.05 within 200 kb in either direction from the sentinel SNP in Europeans in the 1000 Genomes (1000G) Project all ancestries imputation panel 20. For Koreans, we looked up the sentinel SNP +/− 200 kb for SNPs with MAF ≥ 0.05 according to imputation to HapMap and the Korean panel. To determine the appropriate Bonferroni-corrected P value threshold for declaring statistical significance in each ethnic group, we used the Nyholt method to calculate the effective number of independent variants, based on pairwise linkage disequilibrium among the follow-up SNPs 21,22.
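The Nyholt calculation can be sketched as follows. The estimate is Meff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the M × M pairwise LD correlation matrix. Because the eigenvalues of a correlation matrix sum to M and their squares sum to the sum of squared matrix entries, the sample variance can be computed without an eigendecomposition. This is an illustrative sketch, not the software used in the study, and the function names are hypothetical.

```python
def nyholt_meff(corr):
    """Effective number of independent tests (Nyholt's estimate).

    `corr` is a symmetric M x M correlation matrix given as a list of lists.
    Uses Var(lambda) = (sum of squared entries - M) / (M - 1), which equals
    the sample variance of the eigenvalues of a correlation matrix.
    """
    m = len(corr)
    if m == 1:
        return 1.0
    sum_sq = sum(r * r for row in corr for r in row)
    var_lambda = (sum_sq - m) / (m - 1)
    return 1.0 + (m - 1) * (1.0 - var_lambda / m)


def bonferroni_threshold(n_independent, alpha=0.05):
    """Bonferroni-corrected P value threshold for a given number of tests."""
    return alpha / n_independent


# Two SNPs in moderate LD (r = 0.5) count as 1.75 effective tests.
print(nyholt_meff([[1.0, 0.5], [0.5, 1.0]]))  # 1.75

# 1,132 independent tests (African American follow-up) give about 4.42e-5.
print(f"{bonferroni_threshold(1132):.2e}")  # 4.42e-05
```

Uncorrelated SNPs return Meff = M and perfectly correlated SNPs return Meff = 1, which are the two limiting cases of the method.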
African Americans were participants of the Candidate gene Association Resource consortium (CARe) 23. Baseline characteristics of the 6,070 African Americans in CARe are shown by cohort in Supplementary Table 2b. We performed regional meta-analyses of the 7,470 SNPs (MAF ≥ 0.05) within +/− 200 kb of the sentinel SNPs in Europeans using 1000G imputed data. Using the P value threshold of 4.42 × 10−5 (based on 1,132 independent tests), 78 SNPs in the region of EFEMP1 were significantly associated with FVC (lowest P = 1.63 × 10−7) in African Americans. Our top hit at the EFEMP1 locus in European samples (SNP rs1430193, T allele frequency 0.37, Table 1) had a very different allele frequency in our replication African American samples (T allele frequency 0.71) and was not statistically significant (P = 0.13). As has been recently noted regarding extension of GWAS SNPs discovered in Europeans to African-American populations 24, the effect size was in the same direction but attenuated in our African-Americans (β = −15.79 ml versus −23.75 ml). The top hit in African Americans for the EFEMP1 region (rs62164511, P = 1.63 × 10−7) showed a decrease in FVC of 84.7 ml per each copy of the A-allele (A-allele frequency: 0.9) (Supplementary Table 5a). The r2 between the most significant SNP (rs62164511) in African Americans and the most significant SNP (rs1430193) in Europeans was low (0.16), further supporting evidence for allelic heterogeneity at this locus (Supplementary Figures 5a–b).
Baseline characteristics of the 563 Chinese subjects from MESA are shown in Supplementary Table 2c. We performed regional analyses (+/− 200 kb, SNPs with MAF ≥ 0.05) around the sentinel SNPs in Europeans using 1000 Genomes imputation. The P value threshold was determined at 4.41 × 10−5 (based on 1,133 independent tests). None of the 7,436 investigated SNPs reached a statistical significance level below the predefined threshold.
Baseline characteristics of the 849 Hispanics from MESA are shown in Supplementary Table 2d. We performed regional analyses of SNPs (MAF ≥ 0.05) within +/− 200 kb of the sentinel SNPs in Europeans using 1000 Genomes imputation. None of the 7,473 investigated SNPs reached a statistical significance level below the predefined threshold (4.41 × 10−5, based on 1,133 independent tests).
Baseline characteristics of the 8,074 Koreans from the Healthy Twin 15–17 and KARE3 18 studies are shown in Supplementary Table 2e. In this sample, only HapMap imputed (HapMap3 Phase 2 and Korean HapMap) data were available. There were 72 SNPs (MAF ≥ 0.05) within +/− 200 kb of the sentinel SNPs in Europeans. Using the threshold P value of 1.52 × 10−3 (based on 26 independent tests), two SNPs (rs12449659 and rs4793331, both located approximately 700 kb upstream of KCNJ2) were associated with FVC in Koreans with a decrease of 32.5 ml (rs12449659, per T allele, P = 7.92 × 10−4) and 22.5 ml (rs4793331, per A allele, P = 1.22 × 10−3). These SNPs did not show a significant association with FVC in Europeans (P = 0.17 and P = 0.34, Supplementary Table 5b).
Gene set enrichment analysis
To identify plausible pathways associated with FVC, we broadened our focus beyond genome-wide significant variants by performing gene set enrichment analysis 25 on the entire meta-analyzed GWAS. We queried approximately 2,000 gene sets including canonical pathways and Gene Ontology functional categories. Using a false discovery rate (FDR) < 0.01, we identified 65 enriched pathways (Supplementary Table 6). While these over-represented gene sets encompassed diverse functions, many involved processes critical to organ development and tissue remodeling including epithelial morphogenesis, cell proliferation, extracellular matrix, Notch signaling, and cell adhesion. Other prominent pathways included acetylcholine binding and channel activity, smooth muscle contraction, glutamate receptor activity, immunity, and transcriptional/DNA repair processes.
Gene expression
Expression profiles of genes from the six loci that were significant in the meta-analysis of stages 1 and 2 (EFEMP1, BMP6, WWOX, KCNJ2, PRDM11 and HSD17B12), and the housekeeping gene GAPDH, were measured in human lung tissue and primary cell samples using RT-PCR. We detected transcripts for all six newly implicated genes in lung tissue, human bronchial epithelial cells (HBEC) and human airway smooth muscle (HASM). Transcripts for five of the 6 genes (excluding EFEMP1) were present in peripheral blood mononuclear cells (PBMCs) (Table 2 and Supplementary Figures 6 and 7).
Table 2.
Expression profiling of candidate genes in the lung and periphery
| | | | | Tissue | | | |
|---|---|---|---|---|---|---|---|
| Sentinel SNP (relationship to gene) | Chr. | Gene | Putative function of encoded protein | Lung | HASM | HBEC | PBMC |
| rs1430193 (intron) | 2 | EFEMP1 | Binds EGFR, the EGF receptor, inducing EGFR autophosphorylation and the activation of downstream signaling pathways. May play a role in cell adhesion and migration. May function as a negative regulator of chondrocyte differentiation. | + | + | + | − |
| rs6923462 (intron) | 6 | BMP6 | The bone morphogenetic proteins (BMPs) are a family of secreted signaling molecules that can induce ectopic bone growth. Many BMPs are part of the transforming growth factor-beta (TGFβ) superfamily. | + | + | + | + |
| rs4237643 (intergenic) | 11 | MIR129-2 (downstream)/HSD17B12 | This gene encodes a very important 17beta-hydroxysteroid dehydrogenase (17β-HSD) | + | + | + | + |
| rs2863171 (downstream) | 11 | PRDM11 | PR domain-containing protein 11 | + | + | + | + |
| rs1079572 (intron) | 16 | WWOX | WW domain-containing proteins are found in all eukaryotes and play an important role in the regulation of a wide variety of cellular functions such as protein degradation, transcription, and RNA splicing. This gene encodes a protein which contains 2 WW domains and a short-chain dehydrogenase/reductase domain (SRD). | + | + | + | + |
| rs6501431 (downstream) | 17 | KCNJ2 | The protein encoded by this gene is an integral membrane protein and inward-rectifier type potassium channel. The encoded protein, which has a greater tendency to allow potassium to flow into a cell rather than out of a cell, probably participates in establishing action potential waveform and excitability of neuronal and muscle tissues. | + | + | + | + |
+ indicates that the gene is expressed in the cell type used, and − indicates that we did not detect gene expression at the mRNA level following 70 cycles of PCR. Amplification was followed in real-time by gene specific TaqMan probes and final PCR products were visualized by gel electrophoresis. We used GAPDH (encoding glyceraldehyde-3-phosphate dehydrogenase) as a positive control for the complementary DNA, and this gene was expressed in all tissues. Chr., chromosome; HASM, human airway smooth muscle; HBEC, human bronchial epithelial cells; PBMC, peripheral blood mononuclear cells.
Expression quantitative trait locus analysis in peripheral blood cells
We investigated whether the top SNPs or their proxies (r2 ≥ 0.7) in the six newly implicated FVC loci were associated with gene expression using expression Quantitative Trait Loci (eQTL) data as described in the methods. Multiple SNPs in or near HSD17B12 showed significant cis-eQTL associations (P < 10−4) in peripheral blood, with the strongest association represented by rs11037676 (a proxy of rs4237643, r2 = 0.7) at a P value of 8.42 × 10−81 (Supplementary Table 3). The sentinel SNP associated with FVC in this region (rs4237643) also exhibited a strong cis-effect on HSD17B12 (P = 1.82 × 10−35) (Supplementary Figure 8). Furthermore, this SNP showed a significant cis-eQTL association in lymphoblastoid cell lines (P = 6.7 × 10−11) 26 and brain tissue (P = 1.2 × 10−8) 27. Another FVC-associated SNP located in the intronic region of EFEMP1 (rs1430189) demonstrated local effects on this gene’s expression based on eQTL data from human fibroblasts (P = 4.8 × 10−6) 28. We did not find statistically significant cis-eQTLs for the other FVC-associated variants and loci.
eQTL analysis in lung tissue
To better assess the relevance of cis-eQTLs to lung biology, we queried a publicly available database that included lung tissue (The Genotype-Tissue Expression project, GTEx) 29 to further investigate the top SNPs and their proxies. Multiple SNPs mapped to HSD17B12, including rs11037676 and the sentinel SNP, rs4237643, demonstrated highly significant cis-eQTLs in human lung samples (P = 2.8 × 10−26 and P = 7.2 × 10−14, respectively).
Fetal lung mRNA expression for genes associated with FVC
Studies were performed to investigate whether or not the genes we identified were differentially expressed during normal human fetal lung development (Supplementary Table 7). There was strong evidence (P controlling for false discovery rate = 6.7 × 10−6) for differential expression of PRDM11, suggesting this gene may play an important role in utero in lung development. One probe for WWOX also showed correlation between lung expression and fetal age although this was not seen with other probes for the gene.
Putative regulatory variants
We queried the RegulomeDB 30 database to assess whether any of the newly identified FVC-associated SNPs (P < 10−7, n = 150 SNPs) were located within known or predicted regulatory elements, including regions of DNase hypersensitivity, binding sites of transcription factors, and promoter regions that have been biochemically characterized to regulate transcription. Five SNPs received high likelihood scores (based on the amount of supporting data) for mapping to regulatory regions and affecting gene expression; these were 4 variants upstream of HSD17B12 (rs9783304, rs2862996, rs10768966, and rs6485443) and one variant downstream of EFEMP1 (rs1430189). These variants showed evidence for eQTL, transcription factor binding and/or DNase peak.
Discussion
In a two-stage meta-analysis across 35 cohorts encompassing 85,170 individuals of European ancestry, we found 6 new loci associated with FVC that had not been identified in previous GWAS of spirometric measures of airflow obstruction (FEV1 or FEV1/FVC). The six new loci showed consistent associations across the European ancestry studies in discovery and replication stages. The meta-analysis effect estimates range from 13 to 31 ml per allele, which is similar to the annual rate of decline of FVC ranging from 12 to 47 ml in the general population 31. Expression analyses showed that all the top candidate genes at these loci were expressed in lung tissue and primary lung cells (HBEC and HASM). Two additional loci associated with FVC at genome-wide significance in the stage 1 analysis were previously associated with FEV1/FVC (PTCH1) or FEV1 (GSTCD) 12–14 at genome-wide levels of significance in GWAS.
The six new associations found in this analysis explain only a modest proportion of the additive polygenic variance of FVC (0.74%). Stage 2 effect size estimates were used to calculate the proportion of the variance explained by the six new loci, to avoid the effect of winner’s curse bias. When we take the other known loci for pulmonary function into account, the proportion of the additive polygenic variance explained is 1.78%, a finding that is comparable to many other complex traits 32. Unexplained heritability has become a well-known phenomenon in genetic epidemiology 33 and possible explanations include multiple effects of common variants, rare variants, gene-by-environment interactions, gene-gene interactions and epigenetic regulation - mechanisms that are not captured by existing GWAS platforms.
We, and others, previously identified 27 regions associated at genome-wide significance with FEV1, FEV1/FVC or both 11–14. Although FEV1 and FVC are statistically correlated (r = 0.83 in the Rotterdam Study, adjusted for age, sex, height and height2), these are clinically different entities. FVC is used for the evaluation of restrictive ventilatory defects and is a predictor of mortality independent of FEV1, standard risk factors, and even prior cardiovascular disease 5,6. In contrast, FVC and FEV1/FVC have a very low correlation (r = −0.08 in the Rotterdam Study, adjusted for age, sex, height and height2). In this analysis, we were able to identify 6 new loci that are associated with FVC at genome-wide significance. Only two of the loci that were previously associated with FEV1 or FEV1/FVC showed genome-wide significant association with FVC in our study (GSTCD and PTCH1) 12,13. The sentinel SNPs at each of the six novel loci showed consistent directions of effect on both FEV1 and FVC (Supplementary Table 8). Among our FVC-associated SNPs, the smallest P value for FEV1 (9.43 × 10−7) was observed for rs1079572 (WWOX). An intergenic SNP (rs11654749) in the region between KCNJ2 and SOX9 was associated with FEV1 at genome-wide significance in our previous meta-analysis of SNP and SNP-by-smoking effects 34. To assess whether the variant (rs11654749) identified in that analysis of FEV1 and the sentinel SNP (rs6501431) from our current FVC analysis are pointing to the same signal, we fitted both variants together in the model using the software GCTA 35. Their effect sizes increased slightly when fitted together, as expected given that the marginal correlation (r = 0.03) of their alleles was positive and the effects of these alleles on lung function were in opposite directions. Thus, these SNPs appear to be independent signals.
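The direction of the change seen when both variants are fitted together follows from standard two-predictor regression. The sketch below is a simplified illustration for two standardized predictors with correlation r, where the joint coefficients are (b1 − r·b2)/(1 − r²) and (b2 − r·b1)/(1 − r²). It is not GCTA's implementation, and the marginal effects of +1 and −1 are hypothetical values chosen only to show the sign pattern.

```python
def joint_effects(b1, b2, r):
    """Joint regression coefficients for two standardized predictors.

    b1, b2: marginal (single-predictor) standardized effects.
    r: correlation between the two predictors.
    """
    denom = 1.0 - r * r
    return (b1 - r * b2) / denom, (b2 - r * b1) / denom


# Hypothetical marginal effects in opposite directions, with r = 0.03
# (the allele correlation reported in the text).
j1, j2 = joint_effects(1.0, -1.0, 0.03)
print(round(j1, 4), round(j2, 4))  # 1.0309 -1.0309
```

With a positive correlation and opposing effects, each marginal estimate is slightly masked by the other variant, so both joint estimates are larger in magnitude, consistent with the pattern described above.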
Two of the top loci for FVC (BMP6 and EFEMP1) have been associated with height in a previous GWAS 36. The FVC-associated SNPs in or near these two genes show a modest to weak correlation with the top SNPs from the height GWAS 36. For EFEMP1, the r2 between rs1430193 (for FVC) and rs3791675 (for height) was 0.45. For BMP6, the correlation between rs6923462 (for FVC) and two SNPs associated with height (rs3812163 and rs1219896) was low. For rs6923462 and rs3812163, the r2 was 0.01 and for rs6923462 and rs1219896 the r2 was 0.02. Since we adjusted for height in our analysis, our findings are likely to be independent of height, but may reflect genetic effects on body or organ size.
To assess whether our identified loci are associated with FVC across populations of different ancestries, we assessed the associations of our main findings in African Americans, Koreans, Hispanics, and Chinese subjects. Despite the limited size of these samples, 78 SNPs in the region of EFEMP1 reached the significance threshold of P < 4.42 × 10−5 in African Americans. These results support the involvement of this locus in lung function in individuals of both European and African descent, although there was evidence for allelic heterogeneity. In the Korean dataset, we found two SNPs in the region of our locus near KCNJ2 (LOC101928165) to be significantly associated with FVC. In the smaller samples of Chinese and Hispanic participants, none of the investigated SNPs were significantly associated with FVC, which may not be surprising given the greatly reduced power. In summary, despite the smaller sample size, we were able to show significant evidence of association with FVC for the EFEMP1 locus in African Americans and for the locus downstream of KCNJ2 in Koreans.
A literature review identified candidate genes within the newly identified associated loci plausibly involved in lung growth and pathogenesis. For example, BMP6 is a member of the bone morphogenetic protein (BMP) family, which represents a key canonical signaling pathway in the regulation of lung development, repair and response to injury 37. BMP6 expression in bronchial epithelial cells has been reported to increase during experimental models of allergic airway inflammation 38. EFEMP1 is part of the fibulin family of extracellular matrix glycoproteins and encodes fibulin-3. Targeted disruption of a member of this family, fibulin-4, has been shown to cause reduced elasticity and emphysematous morphology in the lungs of mice 39. Expression of other members of the fibulin family (fibulin-1 and fibulin-5) seems to be influenced by transforming growth factor beta 1 (TGF-β1), a gene previously linked to inflammation in COPD patients 40,41. WWOX may influence protein-induced apoptosis and behaves as a tumor suppressor gene in various types of neoplasms, including small cell lung cancer 42. Our analysis in fetal lung data showed strong evidence for differential expression of PRDM11, suggesting a role in in utero development. In addition, nuclear expression of PRDM11 has been shown in respiratory epithelial cells of the human bronchus in 3 subjects in the Human Protein Atlas 43. Interestingly, our eQTL analysis revealed highly significant cis-effects of FVC-associated variants on HSD17B12 expression in multiple tissues including lung. Furthermore, several of these SNPs were located within regulatory sites upstream of HSD17B12, suggesting that these variants may play a regulatory role in gene function. Exploratory pathway analysis using all SNPs in the meta-analysis of FVC implicated multiple processes, including several involved in tissue development and remodeling.
These findings suggest that distinct and identifiable biological pathways underlie the genetic basis of the lung’s vital capacity in the general population.
Our analysis has some limitations. With cross-sectional measures of FVC, we cannot determine whether the identified signals are due to influence on lung growth or age-related lung function decline 44. The primary analyses were not adjusted for pack-years to avoid attrition in sample size, but within the CHARGE cohorts, estimates from meta-analysis with and without adjustment for pack-years were very similar (results not shown).
An important strength of our study is its considerable sample size of European ancestry individuals. Our application of genomic control at the three stages is likely to be overly conservative because it has recently been shown that in large meta-analyses, test statistics are expected to be elevated under polygenic inheritance even when there is no population structure. Genomic inflation estimates increase with sample size, as has been shown for other traits 14,36,45,46. Following the two-stage meta-analysis, we were able to test the SNPs and their regions for tissue-specific expression.
In conclusion, using a large-scale staged meta-analysis, we report six new loci associated with FVC and show that all are expressed in lung tissue and primary lung cells. Our findings point to previously unexplored pathways and mechanisms underlying lung function. Improvement of the understanding of the role that these genes play in normal lung development and disease pathogenesis could lead to novel therapeutic targets for lung diseases.
Online Methods
Study design
The study consisted of two stages. Stage 1 was a meta-analysis of study-specific genome-wide analyses of FVC conducted in 26 studies with a total of 52,253 individuals of European ancestry. The study characteristics are shown in Supplementary Table 1. In stage 2, we followed up SNPs showing association with FVC (P < 5 × 10−7) and meta-analyzed betas and standard errors across stages 1 and 2. Stage 2 encompassed 32,917 subjects of European ancestry from 9 independent cohorts.
Cohorts included in stage 1
Stage 1 included a total of 26 studies, 15 from the SpiroMeta consortium and 11 from the CHARGE consortium: AGES, ARIC, B58C T1DGC, B58C WTCCC, CARDIA, CHS, ECRHS, EPIC (obese cases and population-based studies), the EUROSPAN studies (CROATIA-Korcula, ORCADES, CROATIA-Vis and NSPHS), FHS, FTC (incorporating the FinnTwin16 and Finnish Twin Study on Aging), Health 2000, Health ABC, HCS, KORA S3, MESA, NFBC 1966, RS-I, RS-II, RS-III, SHIP and Twins UK-I. The genotyping platforms and quality-control criteria implemented by each study are described in Supplementary Table 9.
Cohorts included in stage 2
A total of 9 studies were included in our stage 2 follow-up: BHS 1 & 2, CROATIA-Split, Generation Scotland, KORA F4, LBC1936, LifeLines, LLFS, Pivus, Twins UK-II & III. Study descriptions can be found in the Supplementary Note.
Imputation
Imputation to the HapMap CEU panel was conducted using either MACH 47, IMPUTE 48, Beagle 49,50 or BIMBAM 51 with filters and quality control parameters as shown in Supplementary Table 9. SNPs were excluded on a cohort basis if the imputation score, assessed using r2.hat (MACH), .info (IMPUTE) or OEvar (BIMBAM), was < 0.3. In total, 2,762,059 SNPs were analyzed.
Statistical analysis
Individual studies performed a GWAS analysis using linear regression models with FVC (in milliliters) as the outcome, stratified by never/ever smoking. Adjustment factors were age, age², sex and height (plus height² and weight for CHARGE and replication cohorts). Where applicable, cohorts also adjusted for center, cohort or principal components to account for population stratification. The follow-up studies used the same models. Effect estimates for each study were corrected using genomic control 19 separately within smoking strata. Study-specific lambda estimates are shown in Supplementary Table 1.
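As an illustration of the genomic control step, the sketch below estimates the inflation factor lambda as the median observed chi-square statistic divided by the median of a chi-square distribution with 1 degree of freedom (about 0.4549), then scales the standard errors by the square root of lambda. The function name, the toy inputs and the choice to leave estimates untouched when lambda is at most 1 are assumptions made for illustration; they are not details taken from the individual study pipelines.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(betas, ses):
    """Return (lambda_gc, corrected_ses) for one smoking stratum of one study."""
    # Per-SNP 1-df chi-square statistics from effect estimates and standard errors.
    chi2 = [(b / s) ** 2 for b, s in zip(betas, ses)]
    lam = statistics.median(chi2) / CHI2_1DF_MEDIAN
    # Assumption: standard errors are inflated only when lambda exceeds 1.
    scale = math.sqrt(lam) if lam > 1.0 else 1.0
    return lam, [s * scale for s in ses]


# Toy effect estimates (mL per allele) and standard errors; values are made up.
lam, ses_gc = genomic_control([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(round(lam, 3), [round(s, 3) for s in ses_gc])
```

Scaling the standard errors, rather than the test statistics, keeps the corrected results in a form that can be passed directly to an inverse variance weighted meta-analysis.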
Meta-analysis of stage 1 data
Variants with imputation quality below 0.3 or minor allele frequency below 0.03 were excluded from each dataset before the meta-analysis. Study-specific effect estimates and standard errors for ever-smokers and never-smokers were combined using METAL 52 with fixed-effects inverse variance weighted meta-analysis, which takes directionality into account by aligning study results to the same effect allele. Genomic control was applied to the resulting combined (ever-smokers and never-smokers) effect estimates for each study. These combined effect estimates were then combined across studies, again using fixed-effects inverse variance weighted meta-analysis with METAL, and genomic control was applied once more to the final meta-analysis estimates. Manhattan plots, quantile-quantile plots, forest plots, gene annotation, and additional statistics were produced using R version 2.9.2 53. Stage 1 results are in Supplementary Data Set 1.
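The fixed-effects inverse variance weighted estimator used at each combining step can be sketched as follows. Each estimate is weighted by the reciprocal of its squared standard error, and a two-sided P value is obtained from the normal distribution. This is a minimal illustration with made-up numbers and a hypothetical function name, assuming all estimates are already aligned to the same effect allele; it is not METAL's implementation.

```python
import math


def ivw_fixed_effects(betas, ses):
    """Combine effect estimates with inverse variance weights.

    Returns the pooled beta, its standard error and a two-sided P value.
    """
    weights = [1.0 / (s * s) for s in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Two toy strata (for example ever-smokers and never-smokers); values are made up.
beta, se, p = ivw_fixed_effects([1.0, 3.0], [1.0, 1.0])
print(beta, round(se, 4), round(p, 5))
```

Because the weights depend only on the standard errors, the same function applies whether the inputs are smoking strata within a study or whole studies within a stage.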
Selection of SNPs for stage 2 and stage 2 meta-analysis
For every region containing at least one SNP showing evidence for association with FVC (P < 5 × 10−7), we followed up the SNP with the smallest P value that also had an effective sample size (N effective, the product of sample size and imputation quality summed across studies) of at least 80% of the total stage 1 sample size. Follow-up was performed in a second stage using in silico data from 9 cohorts (Figure 1, Supplementary Table 2a). In total, 7 SNPs were followed up. Results for variants with imputation quality (Supplementary Table 10) below 0.3 in a given study were excluded from the meta-analysis. Results across stage 2 studies were meta-analyzed using fixed-effects inverse variance weighted meta-analysis with METAL 52.
Regions were defined as independent if the leading SNP from one region was > 500 kilobases (kb) from the leading SNP of any other region. We excluded from follow-up two regions (GSTCD and PTCH1) that were previously associated with FEV1 or FEV1/FVC 12–14.
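The N effective criterion can be illustrated with a short sketch. For each SNP, the per-study sample size is multiplied by the per-study imputation quality and summed across studies; the SNP qualifies if this sum reaches 80% of the total sample size. The record layout, function names and numbers below are hypothetical.

```python
def effective_n(per_study):
    """Sum of sample size times imputation quality across studies.

    per_study: list of (sample_size, imputation_quality) tuples.
    """
    return sum(n * q for n, q in per_study)


def pick_lead_snp(region_snps, total_n, min_fraction=0.8):
    """Return the SNP with the smallest P value among those passing the N effective filter."""
    eligible = [
        snp for snp in region_snps
        if effective_n(snp["studies"]) >= min_fraction * total_n
    ]
    return min(eligible, key=lambda snp: snp["p"]) if eligible else None


# Toy region with two SNPs measured in two studies (1,000 and 2,000 subjects).
region = [
    {"id": "snpA", "p": 1e-9, "studies": [(1000, 0.5), (2000, 0.6)]},
    {"id": "snpB", "p": 1e-8, "studies": [(1000, 0.95), (2000, 0.9)]},
]
lead = pick_lead_snp(region, total_n=3000)
print(lead["id"])
```

In this toy region the SNP with the smaller P value is poorly imputed, so the better-imputed SNP is carried forward instead.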
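One way to apply the 500 kb independence rule is a greedy pass over SNPs ordered by P value, keeping a SNP as a new lead only if it lies more than 500 kb from every lead already kept on the same chromosome. The greedy ordering, the record layout and the toy positions are assumptions for illustration; the text above specifies only the distance criterion.

```python
def independent_leads(snps, min_distance_bp=500_000):
    """Greedily select lead SNPs that are more than min_distance_bp apart."""
    leads = []
    for snp in sorted(snps, key=lambda s: s["p"]):
        too_close = any(
            lead["chrom"] == snp["chrom"]
            and abs(lead["pos"] - snp["pos"]) <= min_distance_bp
            for lead in leads
        )
        if not too_close:
            leads.append(snp)
    return leads


# Toy associations; identifiers and positions are made up.
hits = [
    {"id": "snp1", "chrom": 2, "pos": 1_000_000, "p": 1e-9},
    {"id": "snp2", "chrom": 2, "pos": 1_300_000, "p": 1e-8},   # within 500 kb of snp1
    {"id": "snp3", "chrom": 2, "pos": 1_600_000, "p": 2e-8},   # more than 500 kb from snp1
    {"id": "snp4", "chrom": 6, "pos": 1_100_000, "p": 3e-8},   # different chromosome
]
print([s["id"] for s in independent_leads(hits)])
```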
Combined analysis of stage 1 and stage 2
We performed an inverse variance weighted fixed effects meta-analysis across stages 1 and 2 using METAL 52 and obtained two-sided P values for the resulting effect estimates.
Follow-up in other ethnicities
To evaluate these loci across ethnicities, we studied association with FVC in four samples of non-European ancestry. We used data from the National Heart, Lung, and Blood Institute (NHLBI)-sponsored Candidate gene Association Resource (CARe) Project 23,54, which genotyped African Americans in ARIC, MESA, CHS, CARDIA, the Jackson Heart Study and the Cleveland Family Studies. The analyses in CARe were carried out per cohort and meta-analyzed using METAL 52. Furthermore, we assessed the SNPs in Hispanic and Chinese participants from the MESA study. Lastly, we investigated the loci in Korean participants from the Healthy Twin 15–17 and KARE3 18 studies. For the Korean studies, the analyses were carried out per cohort and meta-analyzed. Individual studies performed a GWAS analysis using linear regression models with FVC (in milliliters) as the outcome. Adjustment factors were age, age², sex, height, height², ever/never smoking and weight. We assessed the sentinel SNPs from the HapMap CEU reference panel that were available, together with SNPs located within ± 200 kb of the sentinel SNPs. Only SNPs with a MAF ≥ 0.05 were included. The estimated number of independent tests per population sample and the corresponding Bonferroni P values are shown in Supplementary Table 11. The effective number of independent variants tested in each replication population was estimated from the linkage disequilibrium between SNPs using the technique of Li and Ji 22, which is a modification of the technique originally proposed by Cheverud 55 and implemented by Nyholt 21. This calculation was performed using “matSpD” 21 based on the linkage disequilibrium structures of the 1000 Genomes “all ancestries” sample.
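The Li and Ji estimate of the effective number of independent tests can be sketched from the eigenvalues of the SNP correlation matrix. Each eigenvalue contributes 1 if its absolute value is at least 1, plus its fractional part. The sketch below takes the eigenvalues as input (computing them from a linkage disequilibrium matrix is left to a numerical library), and the significance level of 0.05 used for the Bonferroni threshold is an assumption; function names are hypothetical and this is not the matSpD code.

```python
import math


def li_ji_meff(eigenvalues):
    """Effective number of independent tests from correlation-matrix eigenvalues."""
    meff = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        # Indicator term for eigenvalues of at least 1, plus the fractional part.
        meff += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return meff


def bonferroni_threshold(meff, alpha=0.05):
    """Per-test significance threshold; alpha = 0.05 is an assumed level."""
    return alpha / meff


# For two SNPs with correlation r, the eigenvalues are 1 + r and 1 - r.
print(li_ji_meff([2.0, 0.0]))   # r = 1: the two SNPs behave as one test
print(li_ji_meff([1.0, 1.0]))   # r = 0: two independent tests
print(bonferroni_threshold(li_ji_meff([1.0, 1.0])))
```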
Gene set enrichment analysis
We applied an algorithm known as improved gene set enrichment analysis for GWAS (i-GSEA4GWAS) to place variants associated with FVC within curated pathways and functional categories 25. SNPs from the stage 1 and 2 meta-analyzed GWAS were mapped to genes if within a 100 kb distance (upstream or downstream). For a given SNP, if multiple genes were located within this range, the closest gene was selected and assigned the association P value. Since multiple SNPs can map to the same gene, a SNP label permutation was used to reduce biases caused by larger loci having a disproportionately higher number of SNPs. Log-transformed association P values were used to rank order the resulting gene list (18,454 genes) and calculate gene set enrichment scores. Approximately 2,000 gene sets were used. These were limited to curated pathways derived from multiple resources such as KEGG, BioCarta and REACTOME, and to functional annotations extracted from the Gene Ontology database. A modified version of the GSEA procedure was performed and adjusted for multiple testing using the false discovery rate (FDR). Significant enrichment of gene sets was set at FDR < 0.01.
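The SNP-to-gene mapping rule described above (closest gene within 100 kb upstream or downstream) can be illustrated as follows. The gene records, names and coordinates are hypothetical, a SNP inside a gene is given distance 0, and a distance of exactly 100 kb is treated as within range; these choices are assumptions and may differ from the i-GSEA4GWAS implementation.

```python
def distance_to_gene(pos, start, end):
    """Distance in base pairs from a position to a gene interval (0 if inside)."""
    if start <= pos <= end:
        return 0
    return start - pos if pos < start else pos - end


def map_snp_to_gene(snp, genes, window_bp=100_000):
    """Return the name of the closest gene within window_bp, or None."""
    best = None
    for gene in genes:
        if gene["chrom"] != snp["chrom"]:
            continue
        d = distance_to_gene(snp["pos"], gene["start"], gene["end"])
        if d <= window_bp and (best is None or d < best[0]):
            best = (d, gene["name"])
    return None if best is None else best[1]


# Hypothetical gene annotations on one chromosome.
genes = [
    {"name": "GENE_A", "chrom": 1, "start": 100_000, "end": 150_000},
    {"name": "GENE_B", "chrom": 1, "start": 220_000, "end": 300_000},
]
print(map_snp_to_gene({"chrom": 1, "pos": 200_000}, genes))  # 20 kb from GENE_B, 50 kb from GENE_A
print(map_snp_to_gene({"chrom": 1, "pos": 500_000}, genes))  # no gene within 100 kb
```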
Gene expression analysis
Following the meta-analysis, we investigated mRNA expression of the nearest gene for each of the new SNPs in human lung tissue, human airway smooth muscle cells (HASM), human bronchial epithelial cells (HBEC) and peripheral blood mononuclear cells (PBMC). Lung resection specimens were obtained from patients diagnosed with solitary pulmonary tumors at Ghent University Hospital (Ghent, Belgium). Primary HBEC and HASM were prepared from lung resection specimens obtained from anonymous donors during surgery for lung cancer at the Leiden University Medical Center (LUMC, Leiden, The Netherlands). All assays were done at the Ghent University Hospital. PBMCs were isolated from whole blood using Ficoll gradients. Written informed consent was obtained from all subjects according to protocols approved by the local ethics committees. Total RNA was extracted from samples using the miRNeasy Mini kit (Qiagen) and cDNA was prepared from 1 µg RNA template using the Transcriptor Universal cDNA Master kit (Roche) following the manufacturer’s instructions. Expression of the candidate genes and the housekeeping gene GAPDH was analyzed using TaqMan Gene Expression Assays (Applied Biosystems, Foster City, CA, USA; Assay ID numbers are given in Supplementary Table 12 in the Supplementary Note).
Expression Quantitative Trait Loci
We assessed whether top SNPs or their proxies, based on an r² > 0.7, in the six new regions were associated with gene expression in whole blood cells in a sample of 762 individuals from the Rotterdam Study III (RS-III) 56. Expression was assessed using Illumina Whole-Genome Expression Beadchips (HumanHT-12 v4). For eQTL analysis, we used the eQTL mapping pipeline called MegaQTL 57. eQTLs were deemed cis when the distance between the SNP chromosomal position and the probe midpoint was less than 250 kb. eQTLs were mapped using Spearman’s rank correlation, using the imputation dosage values as genotypes. Resultant correlations were then converted to P values and their respective z-scores weighted with the square root of the sample size. The model was adjusted for the first 40 eigenvectors of the principal component analysis. We corrected for multiple testing by using Bonferroni correction: associations with P < 10−4 were considered statistically significant.
Finally, we also queried publicly available eQTL databases derived from multiple cell and tissue types (lymphoblastoid cell lines, brain tissue and human fibroblasts) 26–29. We corrected for multiple testing by using Bonferroni correction: associations with P < 10−4 were considered statistically significant.
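The cis definition and the rank correlation used in the RS-III eQTL analysis can be sketched in a few lines. The example checks whether a SNP lies within 250 kb of a probe midpoint and computes Spearman's rank correlation between imputed dosages and expression values, with tied values given average ranks. Function names and data are made up for illustration; this is not the MegaQTL pipeline, and the adjustment for principal components is omitted.

```python
import math


def is_cis(snp_pos, probe_start, probe_end, max_distance_bp=250_000):
    """True if the SNP is less than max_distance_bp from the probe midpoint."""
    midpoint = (probe_start + probe_end) / 2
    return abs(snp_pos - midpoint) < max_distance_bp


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Toy imputed dosages (0 to 2) and probe expression levels for five individuals.
dosages = [0.0, 0.1, 1.0, 1.2, 2.0]
expression = [5.0, 5.5, 6.1, 6.0, 7.2]
rho = spearman(dosages, expression)
print(is_cis(1_000_000, 1_200_000, 1_200_050), round(rho, 3))
```

Using ranks rather than raw values makes the test robust to the skewed distributions that are common in expression data.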
Expression in fetal lung
We used methods previously described 58 to mine publicly available data 59,60 to identify whether differential expression of relevant genes occurs during normal human lung development. In the original studies, human fetal lung tissues were obtained from National Institute of Child Health and Human Development tissue databases, and microarray profiles were used to investigate expression across different gestational ages. RNA samples from 38 subjects (estimated gestational age 7–22 weeks or 53–154 days post conception) representing the pseudoglandular (gestational age, 7–16 weeks) and canalicular (17–26 weeks) stages of lung development were included within the dataset. These data are available at NCBI Gene Expression Omnibus.
Acknowledgments
Funding and Acknowledgments
For all studies, information on funding and acknowledgements can be found in the Supplementary Note.
Footnotes
URLs
Korean HapMap, http://www.khapmap.org/; RegulomeDB, http://regulome.stanford.edu/; Human Protein Atlas, http://www.proteinatlas.org; METAL, http://www.sph.umich.edu/csg/abecasis/metal/; matSpD, http://gump.qimr.edu.au/general/daleN/matSpD/; Gene Expression Omnibus, http://www.ncbi.nlm.nih.gov/geo.
Accession numbers
NCBI Gene Expression Omnibus: GSE14334.
Contributions
Project conception, design and management. Stage 1 GWAs. Ages: G.E., M.G., V.G., T.B.H., L.J.L. ARIC: S.J.L., N.F., D.C., D.B.H., B.R.J., A.C.M., K.E.N. B58C T1DGC: D.P.S. B58C WTCCC: D.P.S. CARDIA: A.S., L.J.S. CHS: S.A.G., S.R.H., B.M.P. CROATIA-Korcula: O.P., A.F.W. CROATIA-Vis: I.R. ECRHS: D.J., E.O., I.P., M.W. EPIC: R.A.S., N.J.W., J.H.Z. FHS: J.B.W., G.T.O., J.D. FTC: J.K., K.H.P., T.R., A.V. Health ABC: P.A.C., T.B.H., S.B.K., Y.L. Health 2000: M.H., M.K. HCS: J.R.A., R.J.S. KORA S3: C.G., J.H. MESA-Lung: R.G.B. NFBC1966: M-R.J., A.P. NSPHS: U.G. ORCADES: H.C., J.F.W., S.H.W. RS I, II & III: A.H., B.H.S, G.G.B SHIP: S.G., B.K., H.V. Twins UK-I: C.J.H., T.D.S. Stage 2 follow-up studies. BHS 1&2: A.L.J., A.B.M., J.B. CROATIA-Split: N.D.H, C.H. Generation Scotland: D.J.P., B.H. Smith. KORA F4: H.S. LBC1936: I.J.D., J.M.S. LifeLines: H.M.B., D.S.P., J.M.V., C. W. LLFS: A.N., B.T., M. Wojczynski, R.L.M. PIVUS: E.I., L.L., Twins UK-II&III: C.J.H., T.D.S. Multi-ethnic follow-up studies. CARe: R.G.B., K.M.B., D.J.L., R.K., L.J.S., J.B.W., N.H., M.F.P., K.M.B., S.R., E.G.B., G.T.O, L.R.L., W.B.W. KARE3 and Healthy Twin Study: J.S., W.K., Y.O. Gene expression analyses. Rt-PCR: K.R.B., G.G.B. eQTLs: J.B.J.M., A.G.U. Fetal lung expression: I.P.H., I.S., E.M.
Phenotype collection and data management. Stage 1 GWAs. Ages: T.A. ARIC: D.C., N.F., A.C.M., K.E.N. B58C T1DGC: A.R.R., D.P.S. B58C WTCCC: A.R.R., D.P.S. CARDIA: L.J.S., O.D.W. CHS: S.A.G., S.R.H., B.M.P., T.L. CROATIA-Korcula: O.P., L.Z. CROATIA-Vis: S.C., I.K. ECRHS: D.L.J., E.O., I.P, M.W. EPIC: J.H.Z. FHS: J.B.W., G.T.O., J.D. FTC: J.K., K.H.P., T.R., A.V. Health ABC: P.A.C., W.T. Health 2000: M.H., M.K. HCS: J.R.A. KORA S3: J.H. MESA-Lung: R.G.B. NFBC1966: M-R.J., J.P., A.P. NSPHS: A.J. S.E. U.G. ORCADES: S.H.W., J.F.W. RS I, II & III: G.G.B, L. Lahousse, D.W.L., B.H.S. SHIP: S.G., B.K., H.V. Twins UK-I: P.G.H., A. Viňuela. Stage 2 follow-up studies. BHS 1&2: A.L.J., A.B.M., J.B. CROATIA-Split: C.H., T.Z. Generation Scotland: D.J.P., B.H. Smith KORA F4: R.H., S.K., H.S. LBC1936: I.J.D., J.M.S. LifeLines: D.S.P., J.M.V. LLFS: A.N., B.T., R.M. PIVUS: E.I., L.L. Twins UK-II&III: P.G.H., A. Viňuela Multi-ethnic follow-up studies. CARe: R.G.B., T.P., K.M.B., D.J.L., R.K., L.J.S., J.B.W., N.H., M.F.P., K.M.B., S. Ripatti, E.G.B., G.T.O, L.R.L., W.B.W. KARE3 and Healthy Twin Study: J.S., W.K., Y.O. Gene expression analyses. Rt-PCR: K.R.B., G.G.B, F.M.V., P.S.H eQTLs: J.B.J.M., M.J.P. Fetal lung expression: I.P.H., I. Sayers, E.M.
Genotyping Stage 1 GWAs. B58C T1DGC: W.L.M. B58C WTCCC: W.L.M. CARDIA: M.F., X.G. CHS: J.I.R., B.M.P. CROATIA-Korcula: J.E.H. CROATIA-Vis: S.C. ECRHS: M.W. EPIC: J.H.Z. FTC: J.K. Health ABC: Y.L., K.L Health 2000: S. Ripatti, I. Surakka. HCS: R.J.S. KORA S3: H.G. MESA-Lung: S.S.R. NFBC1966: M-R.J. NSPHS: A.J. S.E. U.G. Orcades: H.C., J.F.W. RS I, II & III: F.R., A.G.U. SHIP: S.G., B.K., A.T., H.V. Twins UK-I: C.J.H, T.D.S. Stage 2 follow-up studies. BHS 1&2: J. Hui, J.B. CROATIA-Split: C.H., P.N., T.Z. Generation Scotland: D.J.P., B.H. Smith, H.T. LBC1936: G.D. LifeLines: C.W. LLFS: A.N., B.T. PIVUS: E.I., A.P.M. Twins UK-II&III: C.J.H, T.D.S.
Data analysis. Stage 1 GWAs. Ages: A.V.S. ARIC: N.F., D.B.H. B58C T1DGC: A.R.R., D.P.S. B58C WTCCC: A.R.R., D.P.S. CARDIA: X.G. CHS: S.A.G., G.L., S.R.H., T.L. CROATIA-Korcula: J.E.H. CROATIA-Vis: V.V. ECRHS: D.L.J., A.R. EPIC: J.H.Z. FHS: J.B.W., J.D., W.G. Health ABC: P.A.C., Y.L., K.L., W.T. Health 2000: M.K., S. Ripatti, I. Surakka. HCS: C.O., E.G.H. KORA S3: E.A. MESA-Lung: A.M., S.S.R. NFBC1966: A.C.A. NSPHS: S.E. ORCADES: P.K.J. RS I, II & III: L. Lahousse, D.W.L. SHIP: A.T. Twins UK-I: P.G.H. Stage 2 follow-up studies. KORA F4: C.F., R.H. LBC1936: L.M.L. LifeLines: K.J., H.M.B. LLFS: M. Wojczynski, B.T. PIVUS: T.F. Twins UK-II&III: P.G.H. Multi-ethnic follow-up studies. N.C.G. CARe: T.P., Q.D., L.A.L., X.Q.W. KARE3 and Healthy Twin Study: M.K.L. Gene expression analyses. Rt-PCR: K.R.B., F.M.V. eQTLs: M.J.P. Fetal lung expression: I.P.H., I. Sayers, E.M.
Analysis group: CHARGE consortium: D.W.L., S.A.G., S.J.L., J.D., N.F., A.V.S. CARe: T.P., Q.D. SpiroMeta consortium: M.S.A., L.V.W., B.K., I.P.H., M.D.T.
Writing group: CHARGE consortium: S.J.L., D.W.L., S.A.G., N.F., J.D., G.G.B., A.S. SpiroMeta consortium: I.P.H., M.S.A., M.D.T., L.V.W., C.H.
Conflict of interest
JBW is employed by Pfizer, Inc. None of the other authors have declared a possible conflict of interest.
References
- 1. Wilk JB, et al. Evidence for major genes influencing pulmonary function in the NHLBI family heart study. Genet Epidemiol. 2000;19:81–94. doi: 10.1002/1098-2272(200007)19:1<81::AID-GEPI6>3.0.CO;2-8.
- 2. Zappala CJ, et al. Marginal decline in forced vital capacity is associated with a poor outcome in idiopathic pulmonary fibrosis. Eur Respir J. 2010;35:830–836. doi: 10.1183/09031936.00155108.
- 3. Raghu G, et al. An official ATS/ERS/JRS/ALAT statement: idiopathic pulmonary fibrosis: evidence-based guidelines for diagnosis and management. Am J Respir Crit Care Med. 2011;183:788–824. doi: 10.1164/rccm.2009-040GL.
- 4. Ashley F, Kannel WB, Sorlie PD, Masson R. Pulmonary function: relation to aging, cigarette habit, and mortality. Ann Intern Med. 1975;82:739–745. doi: 10.7326/0003-4819-82-6-739.
- 5. Burney PG, Hooper R. Forced vital capacity, airway obstruction and survival in a general population sample from the USA. Thorax. 2011;66:49–54. doi: 10.1136/thx.2010.147041.
- 6. Lee HM, Le H, Lee BT, Lopez VA, Wong ND. Forced vital capacity paired with Framingham Risk Score for prediction of all-cause mortality. Eur Respir J. 2010;36:1002–1006. doi: 10.1183/09031936.00042410.
- 7. Lange P, Nyboe J, Appleyard M, Jensen G, Schnohr P. Spirometric findings and mortality in never-smokers. J Clin Epidemiol. 1990;43:867–873. doi: 10.1016/0895-4356(90)90070-6.
- 8. Kannel WB, Hubert H, Lew EA. Vital capacity as a predictor of cardiovascular disease: the Framingham study. Am Heart J. 1983;105:311–315. doi: 10.1016/0002-8703(83)90532-x.
- 9. Palmer LJ, et al. Familial aggregation and heritability of adult lung function: results from the Busselton Health Study. Eur Respir J. 2001;17:696–702. doi: 10.1183/09031936.01.17406960.
- 10. van Putte-Katier N, et al. Relationship between parental lung function and their children's lung function early in life. Eur Respir J. 2011;38:664–671. doi: 10.1183/09031936.00034210.
- 11. Wilk JB, et al. A genome-wide association study of pulmonary function measures in the Framingham Heart Study. PLoS Genet. 2009;5:e1000429. doi: 10.1371/journal.pgen.1000429.
- 12. Repapi E, et al. Genome-wide association study identifies five loci associated with lung function. Nat Genet. 2010;42:36–44. doi: 10.1038/ng.501.
- 13. Hancock DB, et al. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function. Nat Genet. 2010;42:45–52. doi: 10.1038/ng.500.
- 14. Soler Artigas M, et al. Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nat Genet. 2011;43:1082–1090. doi: 10.1038/ng.941.
- 15. Sung J, et al. The Korean Twin Registry--methods, current stage, and interim results. Twin Res. 2002;5:394–400. doi: 10.1375/136905202320906165.
- 16. Sung J, et al. Healthy Twin: a twin-family study of Korea--protocols and current status. Twin Res Hum Genet. 2006;9:844–848. doi: 10.1375/183242706779462822.
- 17. Gombojav B, et al. The Healthy Twin Study, Korea updates: resources for omics and genome epidemiology studies. Twin Res Hum Genet. 2013;16:241–245. doi: 10.1017/thg.2012.130.
- 18. Cho YS, et al. A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nat Genet. 2009;41:527–534. doi: 10.1038/ng.357.
- 19. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55:997–1004. doi: 10.1111/j.0006-341x.1999.00997.x.
- 20. The 1000 Genomes Project Consortium et al. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073. doi: 10.1038/nature09534.
- 21. Nyholt DR. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet. 2004;74:765–769. doi: 10.1086/383251.
- 22. Li J, Ji L. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity (Edinb). 2005;95:221–227. doi: 10.1038/sj.hdy.6800717.
- 23. Musunuru K, et al. Candidate gene association resource (CARe): design, methods, and proof of concept. Circ Cardiovasc Genet. 2010;3:267–275. doi: 10.1161/CIRCGENETICS.109.882696.
- 24. Carlson CS, et al. Generalization and dilution of association results from European GWAS in populations of non-European ancestry: the PAGE study. PLoS Biol. 2013;11:e1001661. doi: 10.1371/journal.pbio.1001661.
- 25. Zhang K, Cui S, Chang S, Zhang L, Wang J. i-GSEA4GWAS: a web server for identification of pathways/gene sets associated with traits by applying an improved gene set enrichment analysis to genome-wide association study. Nucleic Acids Res. 2010;38:W90–W95. doi: 10.1093/nar/gkq324.
- 26. Dixon AL, et al. A genome-wide association study of global gene expression. Nat Genet. 2007;39:1202–1207. doi: 10.1038/ng2109.
- 27. Gibbs JR, et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet. 2010;6:e1000952. doi: 10.1371/journal.pgen.1000952.
- 28. Dimas AS, et al. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science. 2009;325:1246–1250. doi: 10.1126/science.1174148.
- 29. The Genotype-Tissue Expression Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013;45:580–585. doi: 10.1038/ng.2653.
- 30. Boyle AP, et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012;22:1790–1797. doi: 10.1101/gr.137323.112.
- 31. Gottlieb DJ, et al. Heritability of longitudinal change in lung function. The Framingham study. Am J Respir Crit Care Med. 2001;164:1655–1659. doi: 10.1164/ajrccm.164.9.2010122.
- 32. Park JH, et al. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat Genet. 2010;42:570–575. doi: 10.1038/ng.610.
- 33. Eichler EE, et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet. 2010;11:446–450. doi: 10.1038/nrg2809.
- 34. Hancock DB, et al. Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function. PLoS Genet. 2012;8:e1003098. doi: 10.1371/journal.pgen.1003098.
- 35. Yang J, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2012;44:369–375. S1–S3. doi: 10.1038/ng.2213.
- 36. Lango Allen H, et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature. 2010;467:832–838. doi: 10.1038/nature09410.
- 37. Sountoulidis A, et al. Activation of the canonical Bone Morphogenetic Protein (BMP) pathway during lung morphogenesis and adult lung tissue repair. PLoS One. 2012;7:e41460. doi: 10.1371/journal.pone.0041460.
- 38. Rosendahl A, et al. Activation of bone morphogenetic protein/Smad signaling in bronchial epithelial cells during airway inflammation. Am J Respir Cell Mol Biol. 2002;27:160–169. doi: 10.1165/ajrcmb.27.2.4779.
- 39. McLaughlin PJ, et al. Targeted disruption of fibulin-4 abolishes elastogenesis and causes perinatal lethality in mice. Mol Cell Biol. 2006;26:1700–1709. doi: 10.1128/MCB.26.5.1700-1709.2006.
- 40. de Boer WI, et al. Transforming growth factor beta1 and recruitment of macrophages and mast cells in airways in chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 1998;158:1951–1957. doi: 10.1164/ajrccm.158.6.9803053.
- 41. Takizawa H, et al. Increased expression of transforming growth factor-beta1 in small airway epithelium from tobacco smokers and patients with chronic obstructive pulmonary disease (COPD). Am J Respir Crit Care Med. 2001;163:1476–1483. doi: 10.1164/ajrccm.163.6.9908135.
- 42. Kimura M, et al. Bmi1 regulates cell fate via tumor suppressor WWOX repression in small-cell lung cancer cells. Cancer Sci. 2011;102:983–990. doi: 10.1111/j.1349-7006.2011.01891.x.
- 43. Uhlen M, et al. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 2010;28:1248–1250. doi: 10.1038/nbt1210-1248.
- 44. Weiss ST. Lung function and airway diseases. Nat Genet. 2010;42:14–16. doi: 10.1038/ng0110-14.
- 45. Elks CE, et al. Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies. Nat Genet. 2010;42:1077–1085. doi: 10.1038/ng.714.
- 46. Lindgren CM, et al. Genome-wide association scan meta-analysis identifies three Loci influencing adiposity and fat distribution. PLoS Genet. 2009;5:e1000508. doi: 10.1371/journal.pgen.1000508.
- 47. Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol. 2010;34:816–834. doi: 10.1002/gepi.20533.
- 48. Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat Genet. 2007;39:906–913. doi: 10.1038/ng2088.
- 49. Browning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Hum Genet. 2009;84:210–223. doi: 10.1016/j.ajhg.2009.01.005.
- 50. Browning SR, Browning BL. High-resolution detection of identity by descent in unrelated individuals. Am J Hum Genet. 2010;86:526–539. doi: 10.1016/j.ajhg.2010.02.021.
- 51. Servin B, Stephens M. Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet. 2007;3:e114. doi: 10.1371/journal.pgen.0030114.
- 52. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
- 53. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2009.
- 54. Lettre G, et al. Genome-wide association study of coronary heart disease and its risk factors in 8,090 African Americans: the NHLBI CARe Project. PLoS Genet. 2011;7:e1001300. doi: 10.1371/journal.pgen.1001300.
- 55. Cheverud JM. A simple correction for multiple comparisons in interval mapping genome scans. Heredity (Edinb). 2001;87:52–58. doi: 10.1046/j.1365-2540.2001.00901.x.
- 56. Hofman A, et al. The Rotterdam Study: 2012 objectives and design update. Eur J Epidemiol. 2011;26:657–686. doi: 10.1007/s10654-011-9610-5.
- 57. Westra HJ, et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet. 2013;45:1238–1243. doi: 10.1038/ng.2756.
- 58. Hodge E, et al. HTR4 gene structure and altered expression in the developing lung. Respir Res. 2013;14:77. doi: 10.1186/1465-9921-14-77.
- 59. Melen E, et al. Expression analysis of asthma candidate genes during human and murine lung development. Respir Res. 2011;12:86. doi: 10.1186/1465-9921-12-86.
- 60. Kho AT, et al. Transcriptomic analysis of human lung development. Am J Respir Crit Care Med. 2010;181:54–63. doi: 10.1164/rccm.200907-1063OC.