Author manuscript; available in PMC: 2024 Jun 25.
Published in final edited form as: Lancet Respir Med. 2022 Apr 12;10(5):485–496. doi: 10.1016/S2213-2600(21)00510-5

The Genetics of COPD

Michael H Cho 1,2,3, Brian D Hobbs 1,2,3, Edwin K Silverman 1,2,3
PMCID: PMC11197974  NIHMSID: NIHMS1994089  PMID: 35427534

Abstract

Chronic obstructive pulmonary disease (COPD) is a deadly and highly morbid disease. COPD susceptibility and heterogeneity are incompletely explained by environmental factors such as cigarette smoking. Studies in families and populations have demonstrated that a substantial proportion of COPD risk is related to genetic variation. Genetic association studies have identified hundreds of genetic variants that affect risk of COPD, lung function, and other COPD-related traits. These genetic variants are associated with other pulmonary and non-pulmonary traits, demonstrate a genetic basis for at least part of COPD heterogeneity, have a substantial impact on COPD risk in aggregate, implicate early life events in COPD pathogenesis, and often involve genes not previously suspected to have a role in COPD. Additional progress will require larger genetic studies with more ancestral diversity, improved profiling of rare variants, and better statistical methods. Through integration of genetic data with other omics data and comprehensive COPD phenotypes as well as functional description of genetic risk variant causal mechanisms, COPD genetics will continue to inform novel approaches to understanding the pathobiology, management, and treatment of COPD.

Introduction

Chronic obstructive pulmonary disease (COPD) is a leading and rising cause of morbidity and mortality worldwide (WHO). While usually diagnosed in the setting of exposure to noxious particles or gases, these exposures incompletely explain disease susceptibility. Cigarette smoking, the most common exposure worldwide, accounts for only a portion of the observed risk(1,2). While the health risks of cigarette smoking are incontrovertible, and mitigation of other environmental exposures such as outdoor pollution and indoor biomass fuel use should also be a top priority for public health, other factors are likely involved in COPD susceptibility. In fact, genetics explains a significant proportion of the phenotypic variability of COPD(3,4). Identifying these genetic factors may help explain COPD heterogeneity, estimate individual susceptibility, assess prognosis, and identify novel and personalized therapeutics.

In this review, we provide background on studies of COPD genetics and an overview of the past several decades of research. We review genome-wide association analyses and other genetic studies, and what these studies tell us about the genetic risk of COPD. We discuss the role of phenotypes and disease heterogeneity, the ability of genetics to predict COPD risk, the potential role of early life events in COPD pathogenesis, and specific molecular pathways and targets implicated by genetic studies. We also discuss the state of clinical translation and key elements needed to further the impact of genetic studies in COPD.

Genetic epidemiology and COPD

Human DNA consists of 23 pairs of chromosomes, totaling approximately two copies of 3 billion base pairs. The differences between any two individuals are small – on the order of 1 difference per 1000 base pairs, with most of this variation being due to single nucleotide variants (SNVs). This genetic variation has an impact on many human diseases and traits(5). Technological advances have made assessment of most SNVs in the human genome straightforward and inexpensive, available from a cheek swab or a blood sample, in an assay that only needs to be obtained once in a person’s lifetime. In contrast to other biomarkers such as proteins or expressed genes, which may be consequences and not causes of the disease process, genetic variants are randomly assigned at the time of conception and thus allow an investigation of potentially causal disease pathobiology. For drug development, human genetics support increases the likelihood of successful clinical trials(6).

Substantial evidence indicates that genetics are important in COPD susceptibility. Dozens of studies have shown significant heritability for lung function and COPD. For example, in one of the largest studies including > 50,000 twin pairs, the heritability of COPD was estimated at > 50%(7). In the COPDGene study of unrelated individuals, heritability was estimated at ~40%(8). Together, these studies strongly indicate that genetic factors are an important and substantial contributor to COPD susceptibility.

Approaches to Identify Genetic Variants in COPD

Initial studies in COPD tested only a few SNPs in or near candidate genes thought to play a role in COPD pathogenesis. To a large extent, however, these findings were not well replicated(9). Genome-wide association studies (GWAS), using larger sample sizes with greater statistical power (currently able to detect variants with odds ratios of < 1.2 using strict criteria for genome-wide significance), have failed to confirm nearly all of these previous candidates, and instead have found dozens to hundreds of susceptibility variants near genes that largely had no previously suspected role in COPD pathogenesis. While there may still be a role for studying candidate genes with substantial prior evidence, publicly available genome-wide association studies that cover the variants of interest can often show that these candidate associations do not replicate in larger cohorts. The first GWAS of COPD, defined using post-bronchodilator spirometry, identified a locus near the CHRNA3/CHRNA5/IREB2 genes, as well as a region near HHIP(10). Subsequent larger studies identified genomic regions near FAM13A, RIN3, CYP2A6, and DSP. The most recent GWAS of COPD, including 35,735 cases and 222,076 controls, identified 82 loci (defined using a distance around the top variant) at genome-wide significance. Of these, 50 had evidence for a second independent association, 60 had clear evidence of replication, and the remainder had either nominal or directionally consistent associations in population-based studies of lung function(11). As the largest COPD studies to date have relied on the Global Initiative for Chronic Obstructive Lung Disease (GOLD) definition of COPD based on lung function, analysis of spirometry, a quantitative trait available in large numbers of subjects, may result in greater power to detect variants associated with COPD. Indeed, large-scale studies of quantitative spirometry identified 279 replicated loci for forced expiratory volume in 1 second (FEV1), forced vital capacity (FVC), and FEV1/FVC ratio, most of which have some effect on COPD risk(12) (Figure 1).
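
To make the link between sample size and detectable effect size concrete, the following is a minimal sketch of a single-variant case-control test. It is not the method used in the cited studies, which rely on regression models with covariate adjustment. The threshold of 5e-8 is the conventional genome-wide significance cutoff and is an assumption here, since the text does not state a number; the allele counts are hypothetical and only sized to resemble the 35,735 cases and 222,076 controls mentioned above.

```python
import math

# Conventional genome-wide significance threshold (an assumption here;
# the text above does not state a numeric cutoff).
GENOME_WIDE_ALPHA = 5e-8


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, p_value) for a 2x2 table of allele counts.

    Uses a Wald test on the log odds ratio with a normal approximation.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    standard_error = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref
    )
    z = math.log(odds_ratio) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, p_value


# Hypothetical allele counts sized like 35,735 cases and 222,076 controls
# (two alleles per person), with an effect of roughly OR = 1.1.
large_or, large_p = allelic_association(22900, 48570, 133246, 310906)

# Nearly the same allele frequencies in a study about 100 times smaller.
small_or, small_p = allelic_association(229, 485, 1332, 3110)

print(f"large study: OR={large_or:.2f}, significant={large_p < GENOME_WIDE_ALPHA}")
print(f"small study: OR={small_or:.2f}, significant={small_p < GENOME_WIDE_ALPHA}")
```

Under these assumed counts, an odds ratio near 1.1 clears the threshold only in the larger study, which illustrates why small candidate-gene studies were underpowered for effects of this size.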

Figure 1:

Karyogram of GWAS associations for COPD and related lung function phenotypes (FEV1 and FEV1/FVC)(11,12,105). Regions within 1 megabase were combined, and one gene name chosen to represent the region.

While the genetic architecture of COPD is still being learned, studies of heritability of traits such as height, and diseases such as schizophrenia and diabetes(13–15) as well as COPD(3) support a model where the largest contribution to the phenotype arises not from rare variants with large effect sizes, but from the combined effect of many common variants of small effect size. This common disease–common variant model lends strong support to the GWAS approach.

Rare variants, typically not found by standard GWAS approaches, clearly have a contribution to COPD risk, as demonstrated by severe alpha-1 antitrypsin (AAT) deficiency. The SERPINA1 gene encodes the AAT protein. The most common form of severe AAT deficiency is due to homozygosity at rs28929474, also known as the SERPINA1 Z allele, which occurs in approximately 1 in 3,000 individuals(16). Other Mendelian syndromes include severe emphysema as a prominent feature, such as cutis laxa(17), Ehlers-Danlos syndrome(18), and telomeropathies(19). However, identification of other rare variants, especially outside of known GWAS loci, has been challenging(20), though some promising candidates, such as TBC1D10A and PTPN6, have been identified through genome-wide unrelated or family-based approaches(21–24).

COPD Phenotypes and Heterogeneity

In addition to discovering new genomic regions associated with COPD, some important lessons have been learned from GWAS. One important lesson relates to phenotypes of COPD. While post-bronchodilator lung function and cigarette smoke exposure are considered key factors for the diagnosis of COPD, most of the genetic variants identified do not seem to be substantially affected by the use of pre- or post-bronchodilator lung function, or by the inclusion of adult asthmatics or non-smokers(11,25). These data suggest that genetic factors leading to chronic airflow limitation and COPD in asthmatics and non-smokers are shared with those of smokers, and help explain why there is a high degree of overlap between discoveries using only smokers with clinical diagnoses of COPD, and population-based studies of spirometry. Similarly, despite a strong rationale for gene-by-environment interaction, evidence for interaction of SNVs with cigarette smoking is generally weak, in part due to power that is orders of magnitude lower than that to detect genetic main effects, measurement error, use of cross-sectional rather than longitudinal lung function, and likely other issues. Exceptions where differential effects by smoking status have been identified include the CHRNA5, CYP2A6, SERPINA1, and MECOM loci; however, weak interaction effects of other variants may be detectable in aggregate(26–29).

Studies of phenotypes related to COPD may provide additional insight into disease pathogenesis. Emphysema, airway wall thickening, and expiratory gas trapping assessed by chest CT may reflect different pathophysiologic processes leading to airflow obstruction and thus may be important endotypes of COPD. Results from these studies are mixed, with most studies identifying the same signals from COPD case-control studies (e.g., AGER and HHIP) with some exceptions (e.g. DLC1, SERPINA1, MAN2B1)(30,31). Similar findings have also been demonstrated in other COPD phenotypes such as chronic bronchitis and pulmonary artery enlargement(32,33). While new signals are intriguing, most of them have not been well replicated.

An alternative approach to genetic studies of novel phenotypes is to instead examine the phenotypic associations for a given genetic marker. One of the first examples of this approach was at the FAM13A locus, where genome-wide associations were found for both COPD and for pulmonary fibrosis; here, the FAM13A genotype associated with increased risk for COPD was associated with decreased risk for pulmonary fibrosis(25). Subsequent studies identified several additional overlapping and opposite risk loci for COPD and pulmonary fibrosis. Similar approaches identified loci shared between asthma and COPD, though most of the asthma and COPD risk variants have a concordant direction of effect(11). Causal modeling and mediation approaches can identify intermediate phenotypes, such as structural features of the lung that may mediate the relationship between genetics and airflow limitation(34,35). In addition to respiratory diseases, COPD and lung function risk loci are shared with autoimmune diseases, including inflammatory bowel disease and celiac disease, as well as with height, cardiovascular disease, and eosinophil counts(11,12). This phenomenon, called pleiotropy, appears to be common in GWAS and may lead to further insights about the functional impact of COPD-associated genetic variants (Figure 2).

Figure 2:

Multiple phenotypic effects identified at COPD and lung function genetic loci. A) Association of 279 lung function variants with 2,411 traits in UK Biobank(12), demonstrating extensive pleiotropy. B) Hierarchical clustering of COPD genetic loci with chest imaging phenotypes in COPDGene(11). Variants clustered by association with airway features or emphysema features, suggesting that different groups of genetic variants may explain imaging heterogeneity in COPD.

These studies also highlight the relationship between genetics and COPD heterogeneity. Subjects with COPD vary not only in the degree of airflow limitation but also in the extent of emphysema and other imaging and pathologic characteristics, frequency and type of exacerbations, and presence of co-morbid conditions. Genetic variants may explain COPD heterogeneity. Severe alpha-1 antitrypsin deficiency is a paradigm of a COPD genetic subtype with specific phenotypic characteristics such as emphysema and liver disease that are a functional consequence of the underlying genetic mutation. Other individual rare variants of large effect, or groups of common variants, are likely to also have an impact on heterogeneity commonly observed in COPD(36,37) (Figure 2).

The impact of genetic risk

A second important insight from COPD genetic studies is the effect of genetics on risk. For Mendelian diseases such as severe alpha-1 antitrypsin deficiency and cystic fibrosis, a single genetic variant is responsible for a markedly elevated risk of disease. Genome-wide association studies in COPD have identified risk variants with odds ratios of ~1.3 or less, and thus individual variants are quite poor for risk prediction. Combining multiple genome-wide significant variants, and more recently, applying approaches that can leverage all variants and combine them into a single score per individual, demonstrates a substantial improvement in risk prediction. In COPD risk prediction, these advances have resulted in an increase in area under the curve (AUC) from 0.58 to 0.68, and an increase in COPD risk in the top versus bottom decile of the risk score from 3.7, to 4.7, to 8.0 with more comprehensive risk scores(11,12,38–40) (Figure 3). These genetic risk scores (also termed polygenic risk scores, or PRS) also associate with important COPD characteristics, such as quantitative computed tomography emphysema and airway features, and lung growth patterns in a cohort of childhood asthmatics followed through to adulthood(40,41), raising the possibility that polygenic risk scores may not only predict overall risk, but may reflect risk of specific features of COPD.
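
The idea of combining many variants into a single score per individual can be sketched as a weighted sum of risk-allele counts, followed by a comparison of the top and bottom score deciles. This is a simplified illustration under stated assumptions, not the scoring method of the cited studies: the use of log odds ratios as weights is a common convention assumed here, and the function names, weights, genotypes, and case labels are all hypothetical.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def top_vs_bottom_decile_odds_ratio(scores, is_case):
    """Odds of disease in the top score decile divided by odds in the bottom.

    Assumes each decile contains at least one case and one control.
    """
    ranked = sorted(zip(scores, is_case))
    decile_size = len(ranked) // 10
    bottom, top = ranked[:decile_size], ranked[-decile_size:]

    def odds(group):
        cases = sum(1 for _, case in group if case)
        return cases / (len(group) - cases)

    return odds(top) / odds(bottom)


# Hypothetical per-variant weights: log odds ratios of 1.3, 1.1, and 1.05.
weights = [math.log(1.3), math.log(1.1), math.log(1.05)]
# One hypothetical person carrying 2, 1, and 0 risk alleles.
score = polygenic_risk_score([2, 1, 0], weights)

# Toy cohort of 40 people with distinct scores; cases are concentrated
# at the high end (1 case in the bottom decile, 3 in the top decile).
toy_scores = list(range(40))
toy_cases = [i in {0, 12, 25, 37, 38, 39} for i in range(40)]
decile_or = top_vs_bottom_decile_odds_ratio(toy_scores, toy_cases)

print(f"score={score:.3f}, top vs bottom decile OR={decile_or:.1f}")
```

The decile comparison mirrors the way risk is summarized in the paragraph above; the toy cohort is far too small to be meaningful and only demonstrates the calculation.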

Figure 3:

Odds ratios for COPD by decile of a polygenic risk score(40).

Early life and genetic risk

The association of a COPD polygenic risk score with lung growth patterns raises the question of when genetic risk for COPD arises. Early efforts at identifying COPD genetic susceptibility in the candidate gene era focused on the idea that individuals started with similar degrees of normal lung function and had different rates of lung function decline due to varied responses to cigarette exposure, for example through detoxifying enzymes such as EPHX1 and GSTM1, or genes related to inflammatory pathways such as TNF. However, COPD genome-wide association studies have generally not found enrichment in either inflammatory or detoxifying enzyme pathways. Instead, several lines of evidence suggest that genetic susceptibility begins early in life. COPD-associated genetic loci are largely contained within loci discovered in general populations; several of these loci are shared with height, and COPD and lung function loci are significantly more likely than chance to overlap with regions of the genome that are regulatory in fetal lung, and lung developmental genes(11,39,42). A second piece of evidence comes from looking at studies of childhood lung function or lung function decline. Overall, studies of lung function loci identified in adults appear to have similar effects in children, but most may not have an effect on lung function decline in adults(39,43). Supporting these data are reports of the decreasing heritability of lung function and other phenotypes with age(44). Together, these data are consistent with recent reports in longitudinal cohorts demonstrating that a subset of individuals who have fixed airflow obstruction as adults have reduced lung function in earlier life(45,46). Thus, some genetic risk factors for COPD, and similarly, the currently developed polygenic risk score may reflect – at least in part – the effects of respiratory system development and other events that result in lower lung function that precede later adulthood.

While genetic studies strengthen the case for the importance of early life events in COPD, these findings do not mean that these genetic risk factors cannot have influences later in life. In one of the most extensively studied genetic associations, Zhou et al described how a COPD-associated region could affect expression of the HHIP gene, and how haploinsufficiency of this gene in a murine model led to age-related emphysema that was attenuated with antioxidant treatment(47,48). Thus, HHIP, despite its roles in lung development, may also have a role in regulating adult lung function.

While lung function decline appears to have a genetic component(49), identifying the specific genetic factors associated with decline has been challenging, with no genome-wide significant and independently replicated findings(50,51). Reasons for these findings include a smaller relative effect of genetics on decline versus environmental factors; decreased sample size relative to cross-sectional cohorts; and variability and relatively short-term follow-up available in longitudinal lung function studies. Addressing these issues could identify novel variants and clarify the role of currently identified variants on the cycle of chronic inflammation and progressive lung function decline that can persist years after smoking cessation in COPD patients(52).

Functional Genetics

The HHIP example also illustrates one of the main misconceptions regarding genome-wide association studies. Arguably, the main goal of genetic association studies is to discover new disease genes. For therapeutics, having genetic support for a drug target increases the likelihood of success in clinical trials and subsequent drug approval(6). However, GWAS themselves do not implicate a specific gene, and are better understood as implicating a specific region of the genome. Although the functions of most GWAS loci are unknown, by convention loci are often named for their closest gene. This may lead to the incorrect assumption that the named closest gene is causal. In fact, the effector gene for a GWAS risk locus is usually not known, and may be several hundred thousand base pairs distant(53). Similarly, the specific causal variant(s), cell type(s), and environment(s) leading to disease are generally unknown. Identifying a COPD-associated variant that is also an expression quantitative trait locus (eQTL), i.e., a variant that also affects gene expression, can identify a causal gene, but most GWAS loci are likely not explained by existing eQTLs, particularly when the pervasiveness of eQTLs is considered(54). Previous studies have used multiple methods to identify the potential causal gene at a given locus(11,12), but of the dozens to hundreds of loci in COPD, only a small fraction have been investigated for functional effects(55–60). Despite these challenges, these investigations have often led to important discoveries regarding disease mechanisms (Table 1). For example, FAM13A appears to promote β-catenin degradation, inhibiting Wnt activation relevant for alveolar type 2 cell repair and regeneration, as well as modulating airway TGF-β1 signaling(24,29,61). AGER and SFTPD both harbor coding variants that increase susceptibility to COPD and encode proteins that are biomarkers for COPD(62–65). In some cases, finding a COPD risk variant that also alters a molecular phenotype such as gene expression or alternative splicing can implicate a causal gene, such as in TGFB2 and FBXO38(66,67). For other regions, genes in or near the associated region offer intriguing hypotheses that require follow-up testing: ADGRG6 (previously GPR126), a G protein-coupled receptor which may have roles in airway remodeling(24); CHRM3, a cholinergic muscarinic receptor; genes in the MAPK pathway; IL17RD, which can regulate pathways of IL-17; CHIA, which encodes a protein that degrades chitin; and several genes involved in extracellular matrix, cell adhesion, cell-cell interactions, and elastin-associated microfibrils, including ITGA1, NPNT, MFAP2, and ADAMTSL3(11,25).

Table 1:

Selected genetic associations with evidence linking variants to genes and supportive functional data. eQTL = expression quantitative trait locus, pQTL = protein quantitative trait locus

Gene name | Evidence for gene | Implications
AGER | Nonsynonymous / coding variant (pQTL, rs2070600) | sRAGE, encoded by AGER, associated with emphysema and lung function decline(69)
HHIP | Gene expression, chromosomal conformation capture, functional variants and eQTL (e.g. rs6537296, rs1542725), murine model | Increased susceptibility in haploinsufficient mouse to cigarette smoke, age-related emphysema, and lymphocytic inflammation(47,48,100)
FAM13A | Gene expression, functional variants and eQTL (e.g. rs2013701), murine model | Effects on Wnt/β-catenin pathway, induces reactive oxygen species(61)
SFTPD | Nonsynonymous / coding variant (pQTL, rs721917) | Surfactant protein with immunomodulatory role; biomarker for COPD(25,62,64,65)
TGFB2 | Gene expression, functional variant identification in fibroblasts and eQTL (rs1690789) | With other associations, implicating members of the TGFB pathway in COPD(101)
SERPINA1 | Nonsynonymous variant, protein levels (pQTL rs28929474), familial segregation | Discovery in 1960’s led to protease-antiprotease hypothesis in COPD, AAT augmentation therapy(16,102)
TERT | Rare variants affect telomere length (e.g., rs372511089), segregation in families | Telomere pathway mutations can predispose to COPD and pulmonary fibrosis(103,104)

Clinical and translational implications

What are the clinical and translational implications of the reported findings in COPD genetics? Simply understanding that a substantial portion of COPD susceptibility comes from genetic factors and early life events may help mitigate the detrimental effects of self-blame common in COPD(68). Our understanding of the genetic impact on specific phenotypes has also changed. Most identified COPD genetic susceptibility loci overlap with genetic susceptibility to reduced pre-bronchodilator spirometry in the general population. Analyses of other phenotypes or subsets of COPD have to date also found a high degree of overlap; whether discovery of novel loci not related to population-based lung function is due to a lack of sufficient sample size or specificity for these other phenotypes is not known. Genetic risk factors are complex and the genetic risk for COPD partially overlaps with asthma, pulmonary fibrosis, and many other phenotypes. Genetic susceptibility to COPD explained by currently available GWAS is due to many common variants of small effect. Thus, while pulmonologists traditionally consider genetic risk in settings such as alpha-1 antitrypsin deficiency, from a population standpoint a larger number of subjects at markedly elevated genetic risk are likely to be identified by aggregation of these common variants using approaches such as polygenic risk scores. High genetic risk individuals identified by common variant aggregation could be targeted for counseling on avoidance of harmful occupational exposures and smoking; or could be identified for clinical trials. Perhaps most importantly, the discovery of a large number of genetic loci relevant for human disease offers opportunities to investigate new pathobiology, link existing biological hypotheses to human disease, or develop biomarkers, such as sRAGE and SFTPD(29,62,69).

Future directions in COPD Genetics

Despite tremendous advances in COPD genetics over the last several decades, the field is still in its relative infancy. The first genome-wide association study was reported just over a decade ago, and whole genome sequencing data has only recently become available. The fraction of genetic risk (or heritability) that is explained by replicated COPD genetic loci is only ~10% in those of European ancestry and even less in others. This ‘missing heritability’ is likely due to other, mostly common variants but also rare variants yet to be discovered(15,52). While a major goal of genetics is to develop therapeutics, this translational impact is most immediately apparent when the effects of the drug target are already well understood, and genetics is used to provide the link to human disease. Of the COPD loci identified by GWAS, the underlying pathobiology is understood partially for a handful, and not at all for the vast majority. A limited number of rare variants have been associated with COPD, including alpha-1 antitrypsin deficiency and telomeropathies. Experience with rare variants in other complex diseases(70–73) suggests that additional rare variants in COPD may be very rare, have more modest effect sizes, or be associated with specific phenotypes, and that finding them will thus require sample sizes at least as large as those for traditional genome-wide association studies to reach the same statistical rigor. While most studies have focused on the COPD case-control phenotype defined by spirometry, COPD is a heterogeneous disease, and phenotypes are much more nuanced than the presence or absence of airflow limitation. There is an increasing appreciation for the trajectory in early life and young adulthood that leads to COPD(74,75). While genetic factors solely affecting developmental processes may not be amenable to therapeutic intervention, some genetic loci associated with growth and development likely overlap with processes involved in injury, repair, and inflammation. For those with disease, there are marked differences in lung parenchyma, vasculature, and structure(76,77); and comorbidities such as asthma, cardiovascular disease, and lung cancer share some of the same genetic risk factors(11,78). Genetics also likely underlie differences in rates of lung function decline, emphysema progression, and rates and types of exacerbations(49,79,80).

The field of COPD genetics is rising to meet these challenges. Studies of more diverse ancestries and rarer variants are essential both for discovery of new genetic risk factors and for understanding the biology of discovered loci(81,82). Phenotypes continue to be refined, with attention to longitudinal studies(83,84) and to more advanced imaging algorithms such as vascular segmentation and application of deep learning(85). Studies of the pleiotropic effects of individual associated variants, and of specific phenotypes and subtypes, will answer questions about the role of genetics in disease heterogeneity. Recently, loss of function variants in IL33 initially associated with asthma were also found to be associated with COPD, leading to a clinical trial of an IL33 inhibitor in COPD(24). Environmental exposures are clearly critical to COPD, and some genetic risk factors may be involved in gene-by-environment interactions(26,27), though power to detect such interactions is an important limitation. The growth of large epidemiologic studies and health systems in biobanks, and the decreasing cost of genotyping and DNA sequencing, in the UK Biobank, Kadoorie, and the Trans-Omics in Precision Medicine (TOPMed) program(23,86,87), will enable larger studies to increase the number of associated loci, leading to synergistic understanding of disease; refine understanding of associated phenotypes and disease risk in different ancestries; and lead to better and more useful polygenic risk prediction and stratification, including identification of disease subtypes. Most studies examine germline genetic variation, which is present in every cell, versus somatic mutations, which occur after conception, particularly in tumors; recently, somatic mutations related to clonal hematopoiesis in blood cells were associated with COPD (Qiao et al, accepted). Genetic variants cause their effect through other molecules, and do not act alone. Biochemical alterations to DNA, and molecules that bind to DNA, can affect the expression of genes. These epigenetic alterations are also associated with COPD susceptibility, severity, and mortality(88–92). Integrative omics and networks and systems approaches pair genetic variation with other omics data and examine relationships between genes and omics types. Combining genetic variation with gene expression; methylation and other epigenetic measures (such as microRNA); proteomics; and metabolomics from blood, lung, and respiratory and immune cell types from samples of different ages and clinical conditions can identify new effector genes, biomarkers, and disease pathobiology(92–97). The process of identifying the effect of a given associated locus, functional genetics, is perhaps the most challenging and immediate problem in human complex disease genetics. In addition to integrative omics methods, advances in high throughput functional studies such as massively parallel reporter assays (MPRA) and CRISPR editing techniques, combined, for example, with single-cell methods(98,99), allow high-throughput testing to identify causal variants. Resources such as TOPMed, LungMAP, ENCODE, and the Human Cell Atlas provide data to identify relevant genetic regions and cell types.

COPD genetics has, to date, identified dozens of genetic loci, leading to an understanding of novel disease pathobiology. It has also confirmed a strong component of genetic risk for disease, and identified an effect early in life, and can quantify risk in an individual. Future studies will continue to advance our understanding of disease pathobiology; allow better risk stratification, disease classification, and subtyping across different ancestries; and ultimately result in information to better manage patients and develop new therapies for this deadly disease.

Figure 4:

Genetic association with COPD or a COPD-related phenotype, ideally with replication, can be established through standard genome-wide association studies for common variants, or through sequencing and rare variant-based studies.

Once an association is identified, efforts should be made to identify the likely causal variant(s) through a procedure called ‘fine mapping’, since linkage disequilibrium results in extensive correlation of variants through the genome. For rare protein altering variants, there may be no other likely causal variants; for intragenic regions with high linkage disequilibrium, there may be hundreds or more potential causal variants.

Identifying a causal gene for regulatory variants (the majority of variants found by GWAS) is challenging: variant (or SNP) to gene strategies include molecular QTL (e.g. expression QTL, or eQTL) analysis, ideally with colocalization, and identification of open chromatin regions that correlate with gene expression, usually in specific cell types. ‘Functional genetics’ encompasses a set of experimental approaches to test the perturbation of an individual variant, regulatory region, or gene, in a specific cell type, organoid, or organism model, consistent with a hypothesis generated from downstream genetic analyses.

GWAS or association studies alone are sufficient to generate genetic risk scores, or in the case of variant(s) of large effect or affecting one pathway, a disease subtype (e.g. alpha-1 antitrypsin deficiency), though fine mapping and other methods may help improve these scores. To identify novel drug pathways and therapeutic targets, there should be an understanding of the basic effector gene and cell types, ideally after functional genetics assays.
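
The fine-mapping difficulty described in this legend comes from linkage disequilibrium: variants that are strongly correlated with the top variant are hard to separate statistically. A minimal sketch of measuring that correlation is given below. It uses the squared Pearson correlation of unphased genotype dosages as an approximation to haplotype-based r², which is an assumption made for simplicity; the function name and genotype vectors are hypothetical.

```python
def ld_r_squared(geno_a, geno_b):
    """Squared Pearson correlation between two variants' genotype dosages.

    Each list holds one dosage (0, 1, or 2 copies of an allele) per person.
    """
    if len(geno_a) != len(geno_b):
        raise ValueError("genotype vectors must have the same length")
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r-squared is undefined for a monomorphic variant")
    return cov * cov / (var_a * var_b)


# Hypothetical dosages for eight people at a lead variant and two neighbors.
lead = [0, 1, 2, 1, 0, 2, 1, 0]
perfect_proxy = [0, 1, 2, 1, 0, 2, 1, 0]   # always inherited with the lead variant
weakly_linked = [1, 0, 1, 2, 1, 1, 0, 2]   # largely independent of the lead variant

print(f"proxy r2={ld_r_squared(lead, perfect_proxy):.2f}")
print(f"weakly linked r2={ld_r_squared(lead, weakly_linked):.2f}")
```

A variant with r² near 1 to the lead variant cannot be distinguished from it by association statistics alone, which is why functional data are needed to narrow the candidate set.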

Table 2:

Ongoing and Future Approaches in COPD Genetics

Approach | Description
Biobanks | Large population- or healthcare system-based collections with a range of phenotyping, enabling an increase in sample size and examination of pleiotropic or genotype-first effects
Populations of diverse ancestry | Initiatives will enable and enhance discoveries and mitigate health inequalities
Phenomics | Comprehensive and detailed measures of phenotypes to identify and elucidate genetic mechanisms and disease heterogeneity
Whole-genome sequencing | Comprehensive analysis of the genome allowing identification and discovery of rare variants associated with disease
Integrative genomics | Incorporation of genetics with gene expression, epigenetics, proteins, and metabolites to identify genetic effects; systems biology and network methods to understand gene regulation
Single-cell sequencing | Identification of genetics, gene regulation, and gene expression in individual lung cells, rather than heterogeneous bulk tissue
High-throughput functional genomics | Methods to identify functional variants and effector genes at large scale; to be followed by detailed experimental approaches

Key messages.

  • Genetics accounts for a substantial fraction of risk to develop COPD

  • Genetic variants, assigned at birth, can lead to important insight into causal pathobiology

  • While alpha-1 antitrypsin deficiency remains the most important genetic risk factor for COPD, most genetic risk appears to come from common variants, dozens to hundreds of which have been identified by GWAS

  • Each of these associations can lead to new discoveries about disease pathobiology

  • Though the key gene and functional variants in most COPD GWAS are unknown, at least some appear to be related to lung growth and development

  • A score composed of many genetic variants can identify a group at high risk for COPD

  • Larger studies of more diverse phenotypes, ethnicities, rare variants, and integrative and network approaches incorporating other omics data are key future directions to understanding COPD genetics

Funding:

No funding was provided specifically for this Series paper.

The investigators are supported by R01 HL137927, R01 HL152728, and P01 HL114501 (EKS); R01 HL089856, R01 HL133135, P01 HL132825, and R01 HL147148 (EKS and MHC); R01 HL149861 and R01 HL135142 (MHC); and K08 HL136928 (BDH).

The funders had no role in study design, data collection, data analysis, data interpretation, writing, or the decision to publish. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute or the National Institutes of Health.

Conflicts of interest:

Edwin K. Silverman and Michael H. Cho have received grant support from GlaxoSmithKline and Bayer. Michael H. Cho has received consulting and speaking fees from Illumina and AstraZeneca.

Glossary

Single nucleotide polymorphism / single nucleotide variant

The majority of human genetic variation consists of single nucleotide polymorphisms (SNPs), though the term single nucleotide variant (SNV) is more inclusive of the increasingly rare variation being discovered. The term ‘genetic variation’ includes these and other types of variants, such as insertions/deletions (indels) and larger copy number and structural variants.

Genome-wide association study (GWAS)

A GWAS is a method to test for an association of a genetic variant with disease or a specific trait. In contrast to a candidate gene study, which tests only a few variants, a GWAS tests millions of SNVs across the genome, and includes extensive quality control, accounts for population substructure, adjusts for multiple testing, and usually includes replication.
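The multiple-testing adjustment mentioned above can be illustrated with a minimal sketch. It assumes a Bonferroni correction with a family-wise error rate of 0.05 over roughly one million independent tests, which gives the commonly used genome-wide threshold of 5 × 10⁻⁸; actual studies may use other corrections. The function names, variant labels, and p-values are hypothetical.

```python
from typing import Dict


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def genome_wide_significant(
    p_values: Dict[str, float],
    alpha: float = 0.05,          # assumed family-wise error rate
    n_tests: int = 1_000_000,     # assumed number of independent tests
) -> Dict[str, float]:
    """Return only the variants whose p-value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return {variant: p for variant, p in p_values.items() if p < threshold}


# Hypothetical association results (variant label -> p-value).
example_p_values = {"variant_a": 3e-9, "variant_b": 1e-6}
hits = genome_wide_significant(example_p_values)
```

In this example only `variant_a` passes the corrected threshold; `variant_b` would be nominally significant at 0.05 but not after adjustment.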

Heritability

Heritability is a measure of the degree of variation in a trait due to genetic factors versus other (e.g., environmental, stochastic) factors, and can range from 0–100%. Heritability estimates can vary depending on the relative contribution of genetics to the phenotype in the population studied.
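As a rough formalization of this definition, heritability in the broad sense is often written as the ratio of genetic variance to total phenotypic variance. The symbols below are not from this paper, and the decomposition assumes, for simplicity, no gene-environment covariance or interaction.

```latex
H^2 = \frac{\sigma_G^2}{\sigma_P^2},
\qquad
\sigma_P^2 = \sigma_G^2 + \sigma_E^2
```

Here $\sigma_G^2$ is the variance attributable to genetic factors, $\sigma_E^2$ the variance attributable to all other factors, and $\sigma_P^2$ the total phenotypic variance, so $H^2$ ranges from 0 to 1 (0–100%).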

Genetic architecture

A concept encompassing the number, allele frequency, and magnitude of genetic variants (and their interactions with other genetic variants and environmental factors) associated with a given trait.

Pleiotropy

The ability of one gene (or genetic variant) to influence two or more seemingly unrelated traits.

Polygenic Risk Score

A numeric value generally created by aggregating the contribution of thousands or more genetic variants from genome-wide association studies. The resulting score can be used to assess the relative risk of disease.
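The aggregation described above can be sketched as a weighted sum of risk-allele counts. This is a minimal illustration, not the scoring method of any particular study: it assumes an additive model in which each variant's weight is its GWAS effect size, and the variant labels, weights, and dosages are hypothetical.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Sum of (effect-size weight x risk-allele dosage) over shared variants.

    dosages: risk-allele count per variant for one individual (0, 1, or 2,
             or a fractional value for imputed genotypes).
    weights: per-variant effect size, e.g. a log odds ratio from a GWAS.
    Variants missing from the individual's genotype data are skipped.
    """
    return sum(
        weight * dosages[variant]
        for variant, weight in weights.items()
        if variant in dosages
    )


# Hypothetical weights and one individual's genotype dosages.
example_weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.20}
example_dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_risk_score(example_dosages, example_weights)
```

A raw score like this is usually interpreted relative to a reference population, for example by ranking individuals into percentiles, rather than as an absolute probability of disease.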

Epigenetics

The study of mitotically heritable or other stable effects that alter the transcriptional potential of the cell not immediately due to the underlying DNA sequence.

Search strategy and selection criteria

We searched for COPD [MeSH] and “genetic association study” [MeSH] through October 9, 2021, and “genetic*” [MeSH] in PubMed from Jan 1, 2015 through October 9, 2021.

References