Abstract
Background:
Identification of COPD disease-causing genes is an important tool for understanding why COPD develops, who is at highest COPD risk, and how new COPD treatments can be developed. Previous COPD genetic studies have identified a highly significant genetic association near nephronectin (NPNT), a gene involved in tissue repair, but the biological mechanisms underlying this association are unknown.
Methods:
Splicing quantitative trait locus analysis (sQTL) was performed to identify common genetic variants that alter RNA splicing in lung tissues. These lung sQTL signals were compared to COPD genetic association results near the NPNT gene using colocalization analysis to determine whether genetic risk for COPD in this region may act through altered splicing. Long read sequencing characterized COPD-associated splicing events at isoform-level resolution, and in silico protein structural analysis identified likely functional effects of this alternative splicing.
Results:
An established COPD genetic risk variant, rs34712979_A, creates a cryptic splice acceptor site that causes four separate splicing changes in NPNT. Of these splicing changes, only one, involving a cassette exon (exon 3), was associated with COPD phenotypes. Long read RNA sequencing demonstrated that the COPD risk allele causes a shift in isoform usage away from the dominant NPNT Isoform B precursor, which excludes exon 3, to the Isoform A precursor, which splices in exon 3. AlphaFold protein structural analysis reveals that inclusion of this exon disrupts an EGF-like functional domain in NPNT.
Conclusion:
Genetic variants in the nephronectin (NPNT) gene increase COPD risk by changing RNA splicing of NPNT in the lung.
Shareable abstract:
Common genetic differences play an important role in the development of COPD, and identification of COPD disease-causing genes is an important tool for understanding COPD pathogenesis, identifying individuals at highest risk for COPD, and developing more effective treatments. This paper demonstrates that genetic variants in the nephronectin (NPNT) gene increase COPD risk by changing RNA splicing of NPNT in the lung.
Introduction
The development of chronic obstructive pulmonary disease (COPD) is influenced by genetic susceptibility factors, and the identification of COPD disease-causing genes is an important tool for understanding why COPD develops, who is at highest risk for COPD, and how new COPD treatments can be developed. Recent genome-wide association studies (GWAS) have identified over 80 distinct genetic loci that influence susceptibility to COPD[1]; however, the responsible genetic variants and effector genes are unknown for most of these associations.
These genetic effects can be discovered by combining GWAS with functional characterization of genetic effects on gene expression (eQTL analysis) or RNA splicing (sQTL analysis), which can identify the specific causal genetic variant and the mechanism by which it alters gene function. In COPD, this approach has led to the identification of HHIP[2], FAM13A[3], TGFB2[4], and ACVR1B[5] as COPD-causing genes. However, eQTL studies do not capture all potentially relevant functional mechanisms. The use of sQTLs to capture the effects of genetic variants on splicing has identified additional disease-causing mechanisms[6, 7], and recent studies have shown that genetic effects on splicing are an important cause of COPD[8, 9].
In this study, we hypothesized that a COPD GWAS signal near nephronectin (NPNT) alters COPD risk through effects on splicing of NPNT. NPNT participates in tissue injury, inflammation and repair[10–13], and an Npnt knockout mouse model shows impaired resolution of inflammation and injury[14]. NPNT splicing has also been associated with COVID-19 disease severity[15, 16], and a specific genetic variant in NPNT (rs34712979) has been identified as the likely COPD causal variant in this genomic region[1, 17–19]. Using colocalization analysis, we confirmed that the COPD GWAS and sQTL signals near NPNT show almost perfect overlap. Analysis of short and long-read RNA sequencing (RNA-seq) from human lung tissues confirmed the effect of rs34712979 on splicing, indicating that the disease-related event is likely related to the inclusion of an exon that disrupts an EGF-like functional domain in NPNT. This study demonstrates the functional splicing mechanism underlying a genetic risk factor for COPD, providing further evidence of a link between individual differences in the wound healing response and risk for COPD.
Methods
Genetic association analysis, splicing QTL, and colocalization results
Summary GWAS statistics were obtained from our previous study of COPD and two lung function phenotypes, FEV1 and FEV1/FVC (http://ldsc.broadinstitute.org/ldhub/)[1, 18]. Generation of splicing QTLs (sQTLs) in LTRC has been described previously[9]. Significant GTEx sQTL results[20] were obtained for all tissues from the GTEx Portal (https://www.gtexportal.org/home/datasets), and complete lung sQTL results were obtained from the AnVIL GTEx Terra workspace. Multiple colocalization for GWAS and sQTL results was performed using the moloc R package (https://github.com/clagiamba/moloc); details are available in the supplemental methods. See Supplementary Figure 1 for details of subject selection.
Human tracheobronchial epithelial cell (HTBE) air:liquid interface (ALI) samples and RNA-seq
Human lung specimens were obtained under the auspices of the University of North Carolina (UNC) Institutional Review Board protocol # 03–1396. All donors or their authorized representatives provided informed consent for research use of organs. These specimens were from previously normal (no history of prior chronic lung disease) organ donors whose lungs were not used for transplantation. Tracheobronchial epithelial cells were obtained and cultured as described in detail previously[21]. Details of genotyping and RNA-seq are provided in the supplemental methods.
Long read RNA-seq analysis in human lung samples from the LTRC
We conducted targeted Oxford Nanopore Technologies (ONT) long read sequencing on RNA from 10 human lung samples from the LTRC which were selected to include five samples from each homozygous class (i.e., GG genotype, AA genotype) of rs34712979. For each of these ten samples, 100–200ng of total RNA was used to generate full-length cDNA using a modified Smart-seq2 protocol[22]. The enrichment and library generation procedures are described in detail in the Supplemental Methods.
For NPNT isoform quantification (i.e., usage analysis), the proportion of isoform usage for each sample was calculated by dividing the number of reads for each isoform by the total number of isoform reads aligning to the NPNT locus. Differences in isoform usage between genotype classes were identified using the Mann-Whitney test.
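The usage calculation and genotype comparison described above can be illustrated with a minimal Python sketch. The read counts, function names, and the exact (permutation-enumerated) form of the Mann-Whitney test shown here are assumptions for illustration only and do not reproduce the analysis code used in this study.

```python
from itertools import combinations


def isoform_usage(read_counts):
    """Proportion of NPNT-aligned isoform reads assigned to each isoform."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no isoform reads at this locus")
    return {isoform: n / total for isoform, n in read_counts.items()}


def mann_whitney_u(x, y):
    """U statistic for x: pairs with x_i > y_j, counting ties as one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def mann_whitney_exact_p(x, y):
    """Two-sided exact p-value by enumerating every split of the pooled values."""
    pooled = list(x) + list(y)
    n_x = len(x)
    mean_u = n_x * len(y) / 2
    observed = abs(mann_whitney_u(x, y) - mean_u)
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for combo in combinations(indices, n_x):
        chosen = set(combo)
        group_x = [pooled[i] for i in combo]
        group_y = [pooled[i] for i in indices if i not in chosen]
        total += 1
        # Count splits at least as far from the null mean as the observed one.
        if abs(mann_whitney_u(group_x, group_y) - mean_u) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With five samples per genotype class, the enumeration covers only 252 possible splits, so an exact test is inexpensive; other implementations may use a normal approximation instead and will return slightly different p-values.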
NPNT protein sequence analysis
Protein sequence analysis was performed using UniProt [23] to identify NPNT protein domains. The Chou and Fasman Secondary Structure Prediction server [24] was used to characterize the impact of sequence changes on NPNT structure. AlphaFold protein structures with exon 3 (UniProt ID: D6RH31) and without exon 3 (UniProt ID: Q6UX19) were downloaded from UniProt and visualized with PyMOL. The structure of the first EGF-like domain in Q6UX19 has a high per-residue model confidence score, as reported by AlphaFold. The structure of the disrupted EGF-like domain in D6RH31 has a medium per-residue model confidence score, indicating moderately accurate structure prediction.
Plasmid generation, transfection and NPNT protein detection
Plasmids encoding the NPNT sequence with and without the serine insertion were transfected into 16HBE14o- cells (Sigma). NPNT levels were measured in whole cell lysates and supernatants by Western blotting with anti-FLAG antibodies. Full details are available in the Supplemental Methods.
Results
rs34712979 is associated with respiratory phenotypes and NPNT splicing
Genome-wide significant association signals near NPNT have been identified for COPD and various measures of lung function, and rs34712979 is the most likely causal variant[1, 17–19]. The minor A allele of rs34712979 is associated with lower measures of lung function and increased risk of COPD (odds ratio=1.18). Previous colocalization studies showed strong overlap between COPD GWAS and sQTL signals for NPNT, while there is no colocalization with NPNT eQTL results[9]. To provide further evidence of the link between rs34712979, altered splicing of NPNT, and lung disease, we performed colocalization analysis with the moloc method [25] to compare the genetic association results in the region near NPNT from GWAS studies of COPD, FEV1, and FEV1/FVC as well as lung sQTL results from two large studies (Figure 1, Supplementary Figure 1). This analysis confirmed that the genetic signals for COPD, lung function, and NPNT sQTLs are nearly identical (Figure 2) and that rs34712979 is very likely (94% estimated probability) to be the causal genetic variant responsible for these associations with splicing and lung disease. Notably, the eQTL signal for NPNT does not closely resemble the GWAS signal, suggesting that the COPD-related genetic effect is likely due to altered splicing rather than to effects on the overall NPNT expression level.
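To illustrate how summary statistics can be converted into a per-variant probability of causality, the following Python sketch uses Wakefield's approximate Bayes factor under a single-causal-variant assumption with equal prior weight on every variant. This is a simplified, single-trait illustration and not the moloc implementation, which jointly models multiple traits; the prior standard deviation of 0.15, the function names, and any variant labels or effect sizes passed to these functions are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor (association vs. null) from beta and SE.

    prior_sd is an assumed prior standard deviation of the true effect size.
    """
    v = se ** 2            # sampling variance of the effect estimate
    w = prior_sd ** 2      # prior variance of the true effect
    r = w / (v + w)        # shrinkage factor
    z = beta / se
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def single_causal_posteriors(log_abfs):
    """Posterior probability that each variant is causal.

    Assumes exactly one causal variant in the region and equal priors.
    """
    largest = max(log_abfs.values())
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = {k: math.exp(v - largest) for k, v in log_abfs.items()}
    total = sum(weights.values())
    return {k: wgt / total for k, wgt in weights.items()}
```

Colocalization methods build on the same per-variant Bayes factors but additionally ask whether two or more traits share the variant with the highest support.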
Figure 1: Subjects and datasets included in the study.
GTEx = The Genotype-Tissue Expression (GTEx) Project. LTRC = the Lung Tissue Research Consortium.
Figure 2: rs34712979 is associated with COPD, lung function and NPNT splicing.
Local association plots for genetic associations near NPNT. The lead variant, rs34712979, is highlighted in purple, and is used as the reference SNP for calculating linkage disequilibrium. The color of the other points represents the correlation (r²) between the variant shown and the lead variant. The X axis shows the position of the variant in relation to the NPNT gene, and the Y axis shows the -log10 p-value. Plots are shown for a) COPD (Sakornsakolpat 2019), b) FEV1/FVC, c) FEV1 (Shrine 2019), d) LTRC sQTL, e) GTEx sQTL, and f) LTRC eQTL.
rs34712979 is associated with a new splice acceptor and alternative inclusion of multiple exons in NPNT
To better characterize the link between rs34712979 and RNA splicing, we examined sQTL results from the Lung Tissue Research Consortium (LTRC). We observed that rs34712979 was associated with multiple splice sites in NPNT (Table 1) that would result in inclusion or exclusion of exons 3, 5, and 11. This observation was confirmed in a separate analysis of sQTL data from GTEx lung tissue (Supplementary Table 2). To better visualize the splicing of exons 3, 5, and 11, we identified all subjects with the less common rs34712979 AA genotype in LTRC and GTEx (n=86 and 33, respectively), and we compared the splicing of these exons to the same number of randomly selected subjects with the more common AG and GG genotypes. This confirmed that subjects with the AA genotype have higher rates of inclusion of the 3rd, 5th and 11th exons (Figure 3). We also observed that the A allele creates a novel splice acceptor for exon 2, producing a 3-bp extension of that exon (insertion of a TAG sequence) that codes for a novel serine residue in the NPNT protein (see Supplementary Results).
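One common way to quantify junction-level events such as those described above is to express each junction's spliced-read count as a fraction of all spliced reads in its cluster, and to summarize a cassette exon by its percent spliced in (PSI). The Python sketch below illustrates both quantities; the read counts, junction labels, and function names are hypothetical, and the normalization used in the LTRC and GTEx sQTL analyses may differ.

```python
def junction_ratios(cluster_counts):
    """Fraction of a cluster's spliced reads supporting each junction."""
    total = sum(cluster_counts.values())
    if total == 0:
        raise ValueError("cluster has no spliced reads")
    return {junction: n / total for junction, n in cluster_counts.items()}


def percent_spliced_in(upstream_inclusion, downstream_inclusion, skipping):
    """PSI for a cassette exon from its three junction read counts.

    The two inclusion junctions are averaged so that inclusion and
    skipping are each represented by a single junction's worth of reads.
    """
    inclusion = (upstream_inclusion + downstream_inclusion) / 2
    denominator = inclusion + skipping
    if denominator == 0:
        raise ValueError("no reads span this exon")
    return inclusion / denominator


# Hypothetical counts for a cluster containing a cassette exon (exon 3).
cluster_1 = {"exon2-exon3": 30, "exon3-exon4": 30, "exon2-exon4": 60}
ratios = junction_ratios(cluster_1)
psi_exon3 = percent_spliced_in(30, 30, 60)
```

Under this definition, a higher ratio for the two inclusion junctions and a lower ratio for the skipping junction both indicate increased inclusion of the cassette exon.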
Table 1:
Association between NPNT splicing in LTRC lung tissue and clinical variables
| Cluster | | Position | Event | ¹Clinical Variables | | | |
|---|---|---|---|---|---|---|---|
| | | | | COPD | FEV1pp | FEV1/FVC | Perc15 |
| 4:105898001:105912188 | 1 | Exon 2–3 | + exon 3 | 5.33E-03 | 0.023 | 1.39E-04 (β= −0.026) | 2.75E-04 (β= −6.9) |
| 4:105898001:105927336 | 1 | Exon 2–4 | − exon 3 | 8.66E-03 | 0.127 | 1.69E-03 (β= 0.020) | 0.125 |
| 4:105912238:105927336 | 1 | Exon 3–4 | + exon 3 | 0.065 | 0.371 | 0.089 | 0.378 |
| 4:105927428:105931515 | 2 | Unannotated | | 0.330 | 0.272 | 0.822 | 8.40E-03 |
| 4:105927428:105932614 | 2 | Exon 4–5 | + exon 5 | 0.049 | 0.021 | 0.046 | 0.897 |
| 4:105927428:105937009 | 2 | Exon 4–6 | − exon 5 | 0.013 | 0.013 | 0.058 | 0.758 |
| 4:105932703:105937009 | 2 | Exon 5–6 | + exon 5 | 3.33E-03 | 0.029 | 0.080 | 0.514 |
| 4:105942702:105959028 | 3 | Exon 10–12 | − exon 11 | 0.0149 | 0.132 | 0.183 | 0.983 |
| 4:105895723:105897898 | 4* | Exon 1–2 with serine | + serine | 0.176 | 0.016 | 0.032 | 0.933 |
| 4:105895723:105897901 | 4* | Exon 1–2 without serine | − serine | 0.183 | 0.018 | 0.027 | 0.825 |
Eight exon-exon junctions were found to be associated with the COPD risk variant rs34712979. P-values are shown for association testing between splice ratios for the NPNT junctions of interest and COPD related phenotypes. The threshold for significance is based on correction for multiple comparisons (4 variables tested x 4 clusters for significant threshold p<0.004). Effect sizes (beta coefficients from linear regression) are shown for significant associations.
¹Clinical variables: COPD = GOLD 2–4, FEV1pp = forced expiratory volume in 1 second percent of predicted, FEV1/FVC = ratio of FEV1 over forced vital capacity, Perc15 = 15th percentile point of Hounsfield Units, i.e., severity of CT-quantified emphysema calculated using the 15th percentile of the lung density histogram technique (lower Perc15 values indicate more CT-quantified emphysema).
*Cluster 4 was not tested in the sQTL analysis; this splice site was only discovered after detailed investigation into NPNT splicing.
Figure 3. rs34712979 is associated with increased inclusion of NPNT exon 3.
a) Gene model showing all annotated exons of NPNT. Exons 3, 5, and 11 are known to be alternatively spliced. b) Violin plots showing exon inclusion by rs34712979 genotype in LTRC and GTEx. Subjects were selected such that all available subjects with the rs34712979 AA genotype were included (n=86 and 33 for LTRC and GTEx, respectively) and compared to an equal number of randomly selected subjects with the AG and GG genotype. The X axis shows the genotype, and the Y axis shows normalized exon inclusion proportions. c) Sashimi plot showing the mean number of reads spanning each exon-exon junction in NPNT. Each arched line indicates a junction, and the number indicates the number of spliced reads. Colored peaks indicate RNA-seq read coverage for each exon.
NPNT splicing ratios are associated with pulmonary phenotypes in LTRC
To better understand which alternative splicing events in NPNT contribute to pulmonary phenotypes, we tested for a relationship between alternative splicing and clinical variables in LTRC. We found that the inclusion of exon 3 was significantly associated with reduced FEV1/FVC and reduced Perc15 (15th percentile point of Hounsfield Units, indicating increased emphysema). No significant association with pulmonary phenotypes was observed for exon 5, exon 11, or the novel TAG/serine insertion (Table 1). We also tested for association between NPNT splicing and demographic variables and found that exons 3 and 11 were not associated with any demographic variable, while exon 5 inclusion was associated with sex and current smoking (Supplementary Table 3).
Table 2:
NPNT isoforms identified through long read sequencing analysis
| RNA | Exons* | Associated Transcript | Protein | Amino acids** | Isoform Proportion (%) | | p-value |
|---|---|---|---|---|---|---|---|
| | | | | | AA | GG | |
| Isoform 1 | 11,13 | NM_001033047 ENST00000379987.7 | Isoform B precursor | 565 aa | 58.6 | 66.8 | 0.06 |
| Isoform 2 | 11 | Unannotated | | 536 aa | 0.5 | 1.1 | 0.11 |
| Isoform 3 | 13 | NM_001184692.2 ENST00000514622.5 | Isoform D precursor | | 3.4 | 5.7 | 0.30 |
| Isoform 4 | 3,11,13 | NM_001184690.2 ENST00000453617.6 | Isoform A precursor | 582 aa | 15.0 | 8.1 | 0.037 |
| Isoform 5 | 5,11,13 | NM_001184691.2 ENST00000427316.6 | Isoform C precursor | 596 aa | 16.8 | 14.9 | 0.037 |
| Isoform 6 | 3,5,11,13 | XM_047449987.1 ENST00000514837.1 | | 453 aa | 5.7 | 3.5 | 0.25 |
*All observed isoforms include exons 1, 2, 4, 6–10, 12 and 14.
**The number of amino acids excludes the serine residue, which is only present with the A allele. We collapsed isoforms regardless of 5’UTR length, transcriptional start site, and novel serine inclusion.
Long read sequencing demonstrates that rs34712979 is associated with increased usage of isoforms containing exons 3, 5 and 11
To determine the effect of rs34712979 on full-length NPNT isoforms, we selected 10 LTRC lung RNA samples with RIN > 7 and the rs34712979 AA or GG genotype (5 each), and we enriched the RNA for NPNT transcripts and performed ONT long-read sequencing. The experiment yielded 24,747 reads mapping to NPNT. The most highly expressed isoform in both genotype classes was the full-length isoform (RefSeq NM_001033047.3, Isoform B precursor) that does not include exons 3 and 5 but does include exon 11. Focusing on exons 3, 5, and 11, we found that each exon’s usage (i.e., inclusion in full-length isoforms) was higher in AA vs GG subjects (p=5.68 × 10⁻¹¹, 0.0058 and 0.007, respectively; Figure 4). To analyze these data at the isoform level, we collapsed the isoforms according to their unique exon inclusion pattern, which considered only the four exons that varied across NPNT isoforms (exons 3, 5, 11, and 13). Since the serine insertion was not associated with any COPD-related phenotypes in LTRC, we also collapsed isoforms without regard to this variation. We found that the AA subjects had a higher proportion of the NPNT Isoform A precursor containing exons 3, 11, and 13 (NM_001184690.2/transcript isoform 4, p=0.037; shown in Table 2) and the Isoform C precursor containing exons 5, 11, and 13 (NM_001184691.2/transcript isoform 5, p=0.037).
Figure 4. Long read RNA-seq in 10 LTRC lung tissues confirms increased usage of exon 3-containing isoforms in rs34712979-A.
a) Isoforms detected by long read sequencing in 10 human lung tissues from LTRC (5 GG and 5 AA). b) Isoforms containing exons 3, 5 and 11 are more highly represented in the AA vs GG genotype (using Chi squared test). c) The isoform usage profile differs between rs34712979 GG and AA homozygotes. Of the 6 identified isoforms, 4 have corresponding curated RefSeq IDs, which are as follows: Isoform 1 = NM_001033047/Isoform B precursor, Isoform 3 = NM_001184692.2/Isoform D precursor, Isoform 4 = NM_001184690.2/Isoform A precursor, Isoform 5 = NM_001184691.2/Isoform C precursor.
Alternative isoforms of NPNT differ in functional domains
We then performed in silico modeling of the NPNT transcript isoforms associated with the rs34712979 COPD risk allele to identify any predicted differences between these isoforms in known NPNT functional domains. NPNT is composed of three primary components: five epidermal growth factor-like (EGF-like) domains, a linker region containing an integrin binding motif and integrin-binding enhancer, and a C-terminal MAM domain. AlphaFold protein structure analysis demonstrated that the inclusion of exon 3 in the NPNT Isoform A precursor disrupts the first EGF-like domain in NPNT (Figure 5), likely leading to functional consequences related to this domain, such as epidermal growth factor signaling or binding affinity to ECM components.
Figure 5: Inclusion of exon 3 results in alteration of an EGF-like domain.
AlphaFold structural analysis of the exon 3 alternative splicing event associated with the rs34712979-A COPD risk allele. A) The predominant transcript expressed in lungs (NM_001033047) contains 5 EGF domains near the N-terminus. The first EGF domain spans exons 2 and 4. B) In addition to including a serine residue, the alternative A allele increases the inclusion of exon 3, which disrupts the first EGF domain. C) AlphaFold model of the N-terminus of NPNT with exon 3 excluded and the intact EGF domain colored in green. D) AlphaFold model of the NPNT N-terminus with exon 3 spliced in (outlined in orange), which is predicted to alter folding of the first EGF domain.
Discussion
Our results confirm that rs34712979 is a functional GWAS variant that acts through alternative splicing of NPNT. Our analyses led to three main findings: 1) genetic association and sQTL studies link NPNT to increased risk of COPD and decreased lung function through alternative splicing, 2) rs34712979 is the functional genetic variant responsible for COPD-associated splicing changes in NPNT, and 3) the splicing change most likely to increase risk for COPD is the increased inclusion of exon 3, which is predicted to disrupt the first EGF-like domain in NPNT. Together these results provide strong evidence linking a COPD GWAS signal to a specific gene and functional mechanism, and the involvement of NPNT supports an emerging trend from COPD GWAS that links altered wound healing and epithelial-ECM interactions to COPD pathogenesis[26]. The genetic association patterns reflect the cumulative data of hundreds of thousands of subjects, and the splicing effects were demonstrated in lung RNA from over 1,000 subjects in two different human cohorts.
NPNT is an extracellular protein involved in tissue development, remodeling, and repair[27]. Full-length NPNT protein includes five epidermal growth factor (EGF)-like functional domains followed by an Arg-Gly-Asp (RGD) integrin-binding domain, a synergy site for high-affinity binding to α8β1, and a meprin A-5 protein-receptor tyrosine phosphatase mu (MAM) domain[27]. It was first identified as the high affinity ligand for integrin α8β1 in kidney[28, 29]. Subsequent studies showed that NPNT-α8β1 interactions were essential for normal kidney development in the mouse[30]. NPNT is secreted into basement membranes, where it is presumed to play a role in ECM-mediated cell-signaling that may involve binding to ECM constituents, activation of cell-surface receptors such as the EGF receptor, or integrin-mediated signaling activity through the RGD domain[31]. Although NPNT is highly expressed in the lung[32], its role in pulmonary biology is not fully understood. Global deletion of NPNT in the mouse results in lung lobar fusion[33], and recent evidence from an Npnt conditional knockout mouse model suggests that NPNT plays a role in resolution of inflammation and injury[14]. NPNT has also been linked to osteoblast differentiation and bone remodeling[34], invasiveness of breast cancer[35], and pulmonary silicosis[36].
The NPNT genetic association lies in a region first described in association with lung function[37, 38]. Subsequent studies using eQTL signals confirmed an association with COPD and identified two independent signals at this locus[39]. Several genes in this region have previously been hypothesized to be the effector genes, including the nearby genes GSTCD and INTS12 [40] in addition to NPNT[1, 41], indicating that this genomic region may contain multiple independent COPD genetic risk variants. Our study elucidates the effect of a causal genetic variant in this region on alternative splicing of NPNT, providing new mechanistic detail that elucidates the COPD-related biology in this genomic region. Interestingly, a recent Mendelian randomization and colocalization analysis discovered that the NPNT sQTL signal also colocalizes with the GWAS signal for COVID-19 severity, and the rs34712979 A allele that increases COPD risk is associated with decreased severity of COVID-19 infection, suggesting that NPNT plays a nuanced role in acute and chronic lung injury. It has also been shown that COVID-19 GWAS data does not colocalize with NPNT eQTLs, further supporting our findings that the GWAS association of rs34712979 is driven by splicing and not total gene expression level. Yoshi et al., 2023 observed an effect of rs34712979 on total NPNT protein levels, but they speculated that this may be caused by unrecognized isoform-specific binding of the SomaScan aptamers used to quantify NPNT in this study[16].
We identified four distinct splicing changes in NPNT that were significantly associated with rs34712979, including increased inclusion of the third exon of NPNT and the creation of an alternate splice acceptor by the COPD risk allele rs34712979-A, which results in the inclusion of a novel serine residue near the signal peptide sequence of NPNT. However, when we tested each of these splicing events for association with COPD and related phenotypes, the serine insertion showed no association with pulmonary phenotypes, while the exon 3 inclusion event was significantly associated with decreased FEV1/FVC and the amount of pulmonary emphysema in the LTRC. In silico functional analysis also supported a larger biological effect of the exon 3 event, which is predicted to disrupt the first EGF-like binding domain, providing compelling evidence for future studies of the functional effects of this isoform.
Long read sequencing revealed for the first time the full-length NPNT isoforms present in lung, indicating that the most common isoform (Isoform B precursor) does not include exon 3 and has an intact stretch of EGF-like domains. However, subjects with the AA genotype had significantly more expression of a minor isoform (Isoform A precursor) that includes exon 3, leading to the hypothesis that this isoform results in a functionally altered version of NPNT that may have measurable effects on lung biology, including wound healing and repair. Validation and quantitative assessment of these NPNT proteoforms in future work will be necessary to confirm this hypothesis, as will functional studies of the role of various NPNT isoforms in lung matrix composition, wound healing, inflammatory responses, and lung development.
These observations for NPNT contribute to an emerging literature on the role of COPD-causing genes that are involved in ECM-epithelial interactions[26]. NPNT is also one of a growing number of genes that are implicated in COPD pathogenesis via alternative splicing[8, 9]. Since COPD is a polygenic disorder, NPNT is one of dozens or hundreds of genes that influence COPD susceptibility, but as the functional picture around each of these genes emerges, the variety of therapeutic possibilities for this debilitating and morbid condition will be greatly expanded.
The strengths of this study are that the GWAS and RNA-seq findings are based on the analysis of a very large number of human samples, and the integrated analysis of short and long-read RNA-seq data provides a high level of resolution to observe changes at the isoform level. To our knowledge, this is one of the first applications of long-read sequencing to human tissue samples to demonstrate splicing-related effects of a GWAS-identified genetic variant. While there is strong statistical support for our findings, these are nonetheless correlative findings from large human cohorts. A limitation of this study is that we do not yet know the functional role of NPNT in the context of COPD and lung function, therefore further work is required to define the underlying molecular mechanisms and provide further experimental evidence to demonstrate how these mechanisms may alter COPD risk. An additional limitation is that the gene expression data is from bulk sequencing of tissues, and therefore we cannot determine the cell specificity of the splicing events observed.
In summary, these analyses demonstrate that the pulmonary disease GWAS association near NPNT is very likely to be mediated by a common genetic variant that alters transcript isoform ratios. Given the known function of NPNT in extracellular matrix biology and the high expression of NPNT in lung tissue and pneumocytes[41], further investigation of the functional consequences of this splicing variant, including its effects on NPNT protein structure and function, is likely to further elucidate causal mechanisms of COPD pathogenesis.
Supplementary Material
Funding/Acknowledgements:
This work was funded by K01HL157613, R01HL153248, R01 HL124233, R01 HL147326, R01 HL111527, U01 HL089897, U01 HL089856, R01HL125583, R01HL130512, P01HL114501. Research reported in this publication was supported by the NHLBI and FDA Center for Tobacco Products (CTP). The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The UNC Marsico Lung Institute Tissue Procurement and Cell Culture Core Biorepository is supported by NIH grant DK065988 and Cystic Fibrosis Foundation grant BOUCHE19R0. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the Food and Drug Administration. The data used for the analyses described in this manuscript were obtained from: the GTEx Portal between January and August of 2020 and via the GTEx Terra Workspace during the same time interval.
Molecular data for the Trans-Omics in Precision Medicine (TOPMed) program was supported by the National Heart, Lung and Blood Institute (NHLBI). Whole Genome Sequencing and RNA-seq for “NHLBI TOPMed: The Lung Tissue Research Consortium (phs001662)” was performed at Northwest Genome Center (NWGC, HHSN268201600032I, RNA-seq) and Broad Genomics (HHSN268201600034I, WGS). Core support including centralized genomic read mapping and genotype calling, along with variant quality metrics and filtering, was provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Core support including phenotype harmonization, data management, sample-identity QC, and general program coordination was provided by the TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract HHSN268201800001I). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed.
Conflict of interest statement:
AS, ANW, ZX, TL, XZ, CLW, LMS, SHR, SBVR, AL, and CV have no conflict of interest to disclose. GMS has received reimbursement for travel, accommodation, and conference fees to speak at events organized by PacBio. CPH has received grant funding from the Alpha-1 Foundation, Bayer, Boehringer-Ingelheim and Vertex, and consulting fees from Chiesi, Sanofi, and Takeda. MHC has received grant funding from Bayer. Edwin K. Silverman received grant support from Bayer and Northpond Laboratories. PJC has received consulting fees from Verona Pharmaceuticals and grant support from Bayer and Sanofi.
Data and Code Availability
LTRC genotyping data are available on dbGaP with accession number phs001662.v1.p1