Abstract
Rationale
Genome-wide association studies have identified common variants of lung cancer. However, the contribution of rare exome-wide variants, especially protein-coding variants, to cancers remains largely unexplored.
Objectives
To evaluate the role of human exomes in genetic predisposition to lung cancer.
Methods
We performed exome-wide association studies to detect the association of exomes with lung cancer in 30,312 patients and 652,902 control subjects. A scalable and accurate implementation of a generalized mixed model was used to detect the association signals for loss-of-function, missense, and synonymous variants and gene-level sets. Furthermore, we performed association and Bayesian colocalization analyses to evaluate their relationships with intermediate exposures.
Measurements and Main Results
We systematically analyzed 216,739 single-nucleotide variants in the human exome. The loss-of-function variants exhibited the most notable effects on lung cancer risk. We identified four novel variants, including two missense variants (rs202197044TET3 [Pmeta (P values of meta-analysis) = 3.60 × 10−8] and rs202187871POT1 [Pmeta = 2.21 × 10−8]) and two synonymous variants (rs7447927TMEM173 [Pmeta = 1.32 × 10−9] and rs140624366ATRN [Pmeta = 2.97 × 10−9]). rs202197044TET3 was significantly associated with emphysema (odds ratio, 3.55; Pfdr (FDR corrected P value) = 0.015), whereas rs202187871POT1 was strongly associated with telomere length (β = 1.08; Pfdr = 3.76 × 10−53). Functional evidence from expression quantitative trait loci, splicing quantitative trait loci, and isoform expression was found for the four novel genes. Gene-level association tests identified several novel genes, including POT1 (protection of telomeres 1), RTEL1, BSG, and ZNF232.
Conclusions
Our findings provide insights into the genetic architecture of human exomes and their role in lung cancer predisposition.
Keywords: lung cancer, germline mutation, exome-wide association study, exome sequencing, trans-omics
At a Glance Commentary
Scientific Knowledge on the Subject
Previous studies have identified several rare germline mutations associated with the risk of lung cancer, including those in BRCA2, CHEK2, and ATM. However, most of these are based on SNP arrays, which have a limited scope and accuracy for genetic variant detection. The UK Biobank provides detailed cancer follow-up information linked to whole-exome sequencing for approximately 450,000 participants, offering an unprecedented opportunity to evaluate the effects of germline mutations. In addition, high-quality sequencing reference panels (for example, TOPMed, with 97,256 deeply sequenced genomes) improve the imputation accuracy of SNP array-based variants. Thus, it is necessary to harmonize the large-scale sequencing population and case–control design genome-wide association study populations to detect rare germline variants.
What This Study Adds to the Field
We investigated the association between exome genetic variants and lung cancer in five cohorts including 30,312 patients and 652,902 control subjects. We systematically analyzed 216,739 single-nucleotide variants in the human exome and evaluated their effects according to the variant type (loss of function, missense, and synonymous). We identified four novel sentinel variants, including two missense variants (rs202197044TET3 and rs202187871POT1) and two synonymous variants (rs7447927TMEM173 and rs140624366ATRN), as well as four known variants. Most of these are linked to specific lung cancer–related intermediate traits such as telomere length and lung function. The identified germline variants may have clinical value in identifying individuals who would benefit the most from screening programs and in suggesting therapeutic targets. Identifying novel lung cancer–related germline mutations would significantly advance our understanding of cancer etiology.
Lung cancer is a leading cause of cancer-related deaths and a critical barrier to increasing life expectancy worldwide (1, 2). It is a multifactorial malignant disease driven by environmental exposure, genetic polymorphisms, and somatic and germline mutations (3). Genome-wide association studies (GWASs) have identified numerous risk loci at genome-wide significance (4, 5), and clinical sequencing projects have identified potential driver mutation events (6, 7). However, germline mutations in lung cancer remain incompletely understood owing to the lack of large-scale sequencing populations and the limited coverage of SNP arrays (8). Exome-wide association studies (ExWAS) have indicated that rare germline mutations tend to have larger phenotypic effects than common SNPs and contribute to heritability (9, 10). Because the effect allele frequency is generally low in ExWAS, the sample size should be large (for example, n > 100,000) to ensure statistical power, especially for rare coding variants (11, 12). Moreover, most studies have focused on coding variants that alter the amino acid sequence of the protein, whereas the role of synonymous mutations remains largely unknown. Notably, synonymous mutations were recently found to exhibit strongly nonneutral effects (13, 14).
The UK Biobank (UKB) is a powerful resource for evaluating the associations between coding variants and human diseases because of its large sample size with high-quality whole-exome sequencing (WES) data (n ≈ 450,000) (10, 15). However, the number of incident lung cancer cases in the UKB is low (n ≈ 4,000) and provides insufficient power to assess the effects of rare variants compared with the existing case–control studies based on SNP array. A UKB whole-genome sequencing study found that >65% of the variants at a frequency level of 0.001–0.002% (representing three to five allele carriers) can be reliably imputed from the SNP array (16). Thus, it is essential to leverage a high-quality imputation panel to accurately impute these array-based variants to the whole-genome scale, thereby powering association analysis (17, 18). Compared with some small-scale panels (for example, 1000G, the Haplotype Reference Consortium, and UK10K), the Trans-Omics for Precision Medicine (TOPMed) reference panel was built from 97,256 TOPMed samples containing 300 million variants for which high-quality whole-genome sequence data were available, which improves the accuracy and scope of the imputed variants (19, 20).
In the present study, we investigated the association of exome genetic variants with lung cancer in five cohorts, including 30,312 patients and 652,902 control subjects from the UKB WES project, the PLCO (Prostate, Lung, Colorectal, and Ovarian) cancer screening trial, the ILCCO-OncoArray (International Lung Cancer OncoArray Consortium), the TRICL (Transdisciplinary Research in Cancer of the Lung) research team, and FinnGen. Here, we describe the landscape of exome variants associated with lung cancer risk and identify novel germline mutations linked to specific biological functions.
Methods
Study Population
In this multicenter study, we used the UK Biobank Exome Sequencing Project as the discovery set and the PLCO, ILCCO-OncoArray, TRICL, and FinnGen SNP array–based projects as replication sets. Only participants of European descent were included in this study. Details of the sample characteristics and quality control criteria for sequencing and the SNP array are described in the online supplement.
Imputation Based on TOPMed Imputation Server
To estimate missing genotype information, we performed an imputation on the TOPMed online imputation server, which phased haplotypes with Eagle v2.4 (21) using TOPMed data (version r2) as a reference panel that included 97,256 reference samples and 308,107,085 genetic variants (20). The server performed imputations for PLCO, ILCCO-OncoArray, and TRICL using Minimac (version 4) software.
All genetic variants were lifted to GRCh38 (Genome Reference Consortium Human Build 38) coordinates to maintain consistency with the UKB WES project. Poorly imputed single-nucleotide variants (SNVs) with imputation quality score R2 < 0.4 and SNVs on sex chromosomes were excluded from the analyses.
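The post-imputation filter described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the record fields (`chrom`, `pos`, `rsq`) and the R2 values in the example are hypothetical, and only the two stated rules (R2 < 0.4 excluded, sex chromosomes excluded) come from the text.

```python
from dataclasses import dataclass

RSQ_MIN = 0.4                              # imputation quality threshold stated in the text
SEX_CHROMS = {"X", "Y", "chrX", "chrY"}    # labels assumed for illustration


@dataclass(frozen=True)
class ImputedVariant:
    chrom: str    # chromosome label (hypothetical field name)
    pos: int      # GRCh38 position
    rsq: float    # imputation quality score (R2)


def passes_qc(v: ImputedVariant) -> bool:
    """Keep autosomal variants whose imputation R2 is at least RSQ_MIN."""
    return v.chrom not in SEX_CHROMS and v.rsq >= RSQ_MIN


# Illustrative records; the R2 values are made up.
variants = [
    ImputedVariant("7", 124858989, 0.91),
    ImputedVariant("2", 74046345, 0.35),     # dropped: R2 < 0.4
    ImputedVariant("chrX", 1000000, 0.99),   # dropped: sex chromosome
]
kept = [v for v in variants if passes_qc(v)]
```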
Association Analysis for Single Variant and Gene Sets
Single-variant and gene-based association analyses were performed using a scalable and accurate implementation of generalized mixed model (SAIGE v1.1.6) (22, 23). The variant-level association tests included high-quality and reliable variants with total minor allele count (MAC) ⩾ 10 and MACcase ⩾ 3 in the discovery set. Ultrarare variants (MAC < 10) were included in gene-based association analyses.
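The allele-count rules above can be expressed as a small routing function. This is only a sketch: the function name and return labels are hypothetical, and the text does not say how variants with MAC ⩾ 10 but fewer than three case alleles were handled, so that branch is labeled as unspecified.

```python
MIN_MAC_TOTAL = 10   # total minor allele count threshold from the text
MIN_MAC_CASE = 3     # minor allele count among cases from the text


def analysis_tier(mac_total: int, mac_case: int) -> str:
    """Decide which association test a discovery-set variant qualifies for."""
    if mac_total >= MIN_MAC_TOTAL and mac_case >= MIN_MAC_CASE:
        return "single-variant"
    if mac_total < MIN_MAC_TOTAL:
        # Ultrarare variants contribute only through gene-based tests.
        return "gene-based only"
    # MAC >= 10 but too few case alleles: handling not stated in the text.
    return "unspecified"
```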
Association studies were performed using the discovery set and three replication sets (PLCO, ILCCO-OncoArray, and TRICL) for overall lung cancer, lung adenocarcinoma, lung squamous cell carcinoma, small-cell lung cancer (SCC), never-smokers, and smokers (including current and former smokers). In all association analyses, we adjusted for covariates including age, sex, smoking status (excluding subgroups of never-smokers and smokers), and the top 10 principal components. A meta-analysis was then performed to summarize the results between the discovery and four replication sets for single variants using the METAL software (24).
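To make the pooling step concrete, the sketch below implements a fixed-effect inverse-variance-weighted meta-analysis, which is one of the schemes METAL offers. Which METAL scheme was used is not stated here, so this is an assumption, and the per-cohort estimates in the example are hypothetical.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted pooling of per-cohort log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal P value
    return beta, se, p


# Hypothetical per-cohort estimates (log OR, standard error).
beta, se, p = ivw_meta([0.10, 0.14, 0.12], [0.04, 0.05, 0.03])
odds_ratio = math.exp(beta)
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why the large discovery and population cohorts carry most of the weight for common variants.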
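Because the genomic inflation factor (lambda, λ) increases with sample size, we rescaled the observed lambda value (λobs) to the adjusted one (λ1000), reflecting a standardized sample size of 1,000 patients and 1,000 control subjects, based on the following formula (5):
λ1000 = 1 + (λobs − 1) × (1/ncase + 1/ncontrol) / (1/1,000 + 1/1,000),
where ncase and ncontrol are the numbers of patients and control subjects in the analysis.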
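As a rough numerical sketch of this rescaling, the function below uses the commonly used form λ1000 = 1 + (λobs − 1) × (1/ncase + 1/ncontrol) / (1/1,000 + 1/1,000). That form and the observed value of 1.10 are assumptions for illustration; the sample sizes are those of the present study.

```python
def lambda_1000(lambda_obs: float, n_cases: int, n_controls: int) -> float:
    """Rescale an observed genomic inflation factor to 1,000 cases and 1,000 controls."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lambda_obs - 1.0) * scale


# Sample sizes from the text; the observed lambda of 1.10 is a made-up input.
adjusted = lambda_1000(1.10, n_cases=30_312, n_controls=652_902)
```

With tens of thousands of cases, even a visibly inflated λobs shrinks to a value very close to 1 after rescaling.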
We reported an association as significant when the variant met all three of the following criteria: 1) P ⩽ 10−4 in the discovery set; 2) P ⩽ 0.05 in all the replication sets; and 3) genome-wide significance (P ⩽ 5 × 10−8) in the meta-analysis.
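The three reporting criteria can be written as a single predicate. The function name is hypothetical; the example uses the P values quoted later in the Results for rs202187871POT1 (UKB discovery, then PLCO, ILCCO-OncoArray, and TRICL, plus the meta-analysis value).

```python
def is_reported(p_discovery, p_replications, p_meta):
    """Apply the three criteria: discovery, every replication set, meta-analysis."""
    return (
        p_discovery <= 1e-4
        and all(p <= 0.05 for p in p_replications)
        and p_meta <= 5e-8
    )


# rs202187871 (POT1): P values as quoted in the Results.
pot1 = is_reported(9.08e-5, [0.042, 0.016, 8.71e-4], 2.21e-8)
```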
The variants were functionally annotated using the Variant Effect Predictor (25) and ClinVar (26), which are provided by the gnomAD v3.1.2 and dbSNP Build 155 databases.
Data Collection of Intermediate Exposures
The known exposures associated with lung cancer were collected to assess the potential pathogenic pathways in UKB, including smoking behavior (data field 20160), smoking pack-years (data field 20161), telomere length (data field 22192), C-reactive protein (data field 30710), lung function including FEV1 (data field 3063), FVC (data field 3062), and FEV1/FVC ratio (data field 20258). All the quantitative variables were transformed into z-scores.
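The z-score transformation of a quantitative exposure can be sketched as below. The text does not say whether the sample or population SD was used or whether any other normalization was applied first, so the sample SD here is an assumption, and the input values are hypothetical.

```python
from statistics import mean, stdev


def to_z_scores(values):
    """Standardize a quantitative trait to mean 0 and sample SD 1."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


z = to_z_scores([2.1, 2.9, 3.4, 3.8, 4.3])   # hypothetical FEV1 values in litres
```

Putting all exposures on this scale makes the β coefficients comparable across traits, because each one is expressed per SD of the exposure.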
In addition, data on five chronic respiratory diseases genetically associated with lung cancer were collected, namely emphysema, chronic obstructive pulmonary disease, asthma, fibrosis, and pneumonia. The genetic correlation between each disease pair was evaluated through LD (linkage disequilibrium) Score Regression (27) using imputed genotype data from the UKB (data field 22828) (see Table E1 in the online supplement).
Bayesian colocalization analysis was performed using the R package coloc to test the hypothetical causal pathway, SNV-exposure-lung cancer (28).
All statistical analyses were performed using R software (version 4.2.1, The R Foundation). P values were two-sided, and statistical significance was set at P < 0.05 (or Pfdr < 0.05, with multiple comparisons).
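The text reports FDR-corrected P values (Pfdr) without naming the procedure. Assuming the Benjamini-Hochberg method, which is the usual default, the adjustment can be sketched as follows; the input P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p_fdr = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.60])
n_significant = sum(p < 0.05 for p in p_fdr)
```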
Results
Landscape of the Exome Variants and Their Contribution to Lung Cancer
We included five cohorts with available exome sequencing and SNP array data. The UKB WES project was used as the discovery set, and PLCO, ILCCO-OncoArray, TRICL, and FinnGen were used as the replication sets (Figure 1). The demographic and lung cancer characteristics of the European descent population are presented in Table 1. No population stratification was observed in principal component analysis plots (Figure E1). Overall, the age of lung cancer cases (age at cancer diagnosis) was 64.55 ± 10.14 years, whereas the age of participants without lung cancer (age at participant enrollment and blood collection) was 57.84 ± 8.03 years.
Figure 1.
Study design and workflow. In this multicenter study, we used the UK Biobank Exome Sequencing Project as the discovery set and the PLCO (Prostate, Lung, Colorectal, and Ovarian), ILCCO-OncoArray (International Lung Cancer OncoArray Consortium), TRICL (Transdisciplinary Research in Cancer of the Lung), and FinnGen SNP array-based projects as replication sets. Exome-wide association studies were performed and combined using meta-analysis. TOPMed = Trans-Omics for Precision Medicine.
Table 1.
Demographic and Clinical Characteristics of Participants
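| Characteristic | UK Biobank (discovery set), incident LC | UK Biobank (discovery set), non-LC | PLCO (replication set*), incident LC | PLCO (replication set*), non-LC | ILCCO-OncoArray (replication set*), LC cases | ILCCO-OncoArray (replication set*), controls | TRICL (replication set*), LC cases | TRICL (replication set*), controls |
|---|---|---|---|---|---|---|---|---|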
| Sample size† | 4,083 | 334,643 | 2,455 | 96,196 | 15,827 | 12,783 | 4,886 | 5,259 |
| Age, yr,‡ mean ± SD | 61.65 ± 5.89 | 55.89 ± 8.07 | 63.92 ± 5.01 | 62.27 ± 5.28 | 63.45 ± 10.56 | 62.15 ± 10.92 | 60.78 ± 10.08 | 58.39 ± 9.39 |
| Sex, male | 2,093 (51.3) | 152,165 (45.5) | 1,493 (60.8) | 46,140 (48.0) | 9,836 (62.1) | 7,671 (60.0) | 2,657 (54.4) | 2,816 (53.5) |
| Smoking status | ||||||||
| Never | 649 (15.9) | 184,204 (55) | 225 (9.2) | 47,041 (48.9) | 1,497 (9.5) | 4,186 (32.7) | 500 (10.2) | 1,656 (31.5) |
| Current | 1,824 (44.7) | 115,030 (34.4) | 944 (38.5) | 7,822 (8.1) | 5,473 (34.6) | 4,249 (33.2) | 1,719 (35.2) | 1,807 (34.4) |
| Former | 1,577 (38.6) | 34,318 (10.3) | 1,286 (52.4) | 41,319 (43) | 8,499 (53.7) | 4,035 (31.6) | 2,667 (54.6) | 1,796 (34.2) |
| Unknown | 33 (0.8) | 1,091 (0.3) | 0 | 14 (0.0) | 358 (2.3) | 313 (2.4) | 0 | 0 |
| Histology | ||||||||
| LUAD | 1,359 (33.3) | — | 776 (31.6) | — | 6,622 (41.8) | — | 1,786 (36.6) | — |
| LUSC | 694 (17) | — | 371 (15.1) | — | 3,944 (24.9) | — | 955 (19.5) | — |
| SCC | 282 (6.9) | — | 252 (10.3) | — | 1,616 (10.2) | — | 463 (9.5) | — |
| Other | 451 (11.0) | — | 431 (17.6) | — | 3,645 (23.1) | — | 1,682 (34.4) | — |
| Unknown | 1,297 (31.8) | — | 625 (25.4) | — | 0 | — | 0 | — |
Definition of abbreviations: ILCCO-OncoArray = International Lung Cancer OncoArray Consortium; LC = lung cancer; LUAD = lung adenocarcinoma; LUSC = lung squamous cell carcinoma; PLCO = Prostate, Lung, Colorectal, and Ovarian cancer screening trial; SCC = small-cell lung cancer; TRICL = Transdisciplinary Research in Cancer of the Lung.
Data are presented as n (%) unless otherwise noted.
*In FinnGen, 3,061 incident lung cancer cases and 204,021 cancer-free controls were included in the analysis. Lung cancer included 929 (30.3%) LUADs, 696 (22.7%) LUSCs, and 346 (11.3%) SCCs. Detailed baseline characteristics of age, sex, and smoking status in FinnGen have not been released.
†All the participants are of European ancestry.
‡Age of lung cancer cases is the age at cancer diagnosis (ILCCO-OncoArray and TRICL) or age at enrollment (UK Biobank and PLCO trial), whereas the age of controls is the age at participant enrollment and blood collection.
We performed ExWAS on overall lung cancer, lung adenocarcinoma, lung squamous cell carcinoma, SCC, never-smokers, and smokers. We systematically analyzed 216,739 SNVs in the human exome that passed quality control, including 6,676 (3.1%) predicted loss-of-function (LoF) variants, 123,813 (57.1%) missense variants, and 86,250 (39.8%) synonymous variants (Figure 2A). The SNP array–derived variants had good imputation quality, with an average Rsq of >0.6 even when the minor allele frequency (MAF) was <0.0001 (Figure E2). METAL software was used to summarize the results, and the genomic inflation factor values (λ1000) suggested no population stratification (Figure E3). The MAF distribution differed significantly among the three variant types (χ2 test, P < 2.2 × 10−16): LoF variants tended to have lower allele frequencies, and synonymous variants tended to have higher MAF (Figure 2B). The LoF variants had the largest effects on lung cancer risk (the absolute values of β effects in all association tests, median [interquartile range (IQR)], 0.20 [0.07–0.54]), the missense variants had moderate effects (median [IQR], 0.14 [0.05–0.41]), and the synonymous variants had the smallest effects (median [IQR], 0.10 [0.04–0.31]), with significant differences (ANOVA, P < 2.2 × 10−16) (Figure 2C).
Figure 2.
Landscape of the exome variants and their contribution to lung cancer. (A) Proportion of LoF, missense, and synonymous variants among the 216,739 single-nucleotide variants. (B) Proportion of different MAF thresholds (<0.0001, 0.0001–0.001, 0.001–0.01, 0.01–0.1, and 0.1–0.5) for the three variant types. (C) The median (IQR) effects of the three variant types in overall lung cancer (LC) and lung cancer subgroups. (D) The mean effects of the three variant types in different MAF thresholds. *P ⩽ 0.05 and **P ⩽ 5 × 10−8. IQR = interquartile range; LoF = loss-of-function; LUAD = lung adenocarcinoma; LUSC = lung squamous cell carcinoma; MAF = minor allele frequency; ns = not significant; SCC = small-cell lung cancer.
The effect sizes were also compared using different MAF thresholds (<0.0001, 0.0001–0.001, 0.001–0.01, 0.01–0.1, and 0.1–0.5). The low-frequency LoF variants had significantly larger effects than missense and synonymous variants (subgroup with MAF < 0.0001: P = 8.89 × 10−15; subgroup with MAF = 0.0001–0.001: P = 5.51 × 10−8; subgroup with MAF = 0.001–0.01: P = 0.0014). However, for variants with a higher frequency (MAF > 0.01), no differences were observed (Figure 2D).
Discovery of Exome Variants Associated with Lung Cancer Risk
We identified eight sentinel variants that passed the genome-wide significance level in overall lung cancer or in smokers (P < 5 × 10−8) (Table 2 and Figures 3 and E4). The identified variants exhibited good imputation quality across the three imputed GWAS datasets (Table E2). Two variants (rs7775397MHC and rs16969968CHRNA5) were associated with smoking status, whereas the remaining six variants were independent of smoking status (Table E3). Four chromosomal regions have been reported in previous GWASs: TERT (5p15.33), MHC (6p21.32), CHRNA5 (15q25.1), and CYP2A6 (19q13.2).
Table 2.
The Association between Sentinel Variants Representing Each Lung Cancer Locus and Lung Cancer Risk
| Category | Stratum | Locus | SNV* | Allele† | rsID | Gene | Type | EAF (Case/Control)‡ | HGVSp | OR (95% CI)§ | P Value§ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Novel | LC | 2p13.1 | 2:74046345:A:G | A/G | rs202197044 | TET3 | Missense | 0.0010/0.00018 | p.His143Arg | 154.1 (26.8–883.9) | 3.60 × 10−8 |
| Novel | LC | 7q31.33 | 7:124858989:C:T | C/T | rs202187871 | POT1 | Missense | 0.0011/0.00022 | p.Asp224Asn | 10.60 (4.63–24.22) | 2.21 × 10−8 |
| Novel | LC | 5q31.2 | 5:139481561:C:G | C/G | rs7447927 | TMEM173 | Synonymous | 0.723/0.741 | p.Val48Val | 0.93 (0.91–0.95) | 1.32 × 10−9 |
| Novel | Smoker | 20p13 | 20:3545768:C:T | C/T | rs140624366 | ATRN | Synonymous | 0.00029/0.000024 | p.Leu205Leu | 9.28 (2.80–30.72) | 2.97 × 10−9 |
| Known | LC | 5p15.33 | 5:1293971:C:T | C/T | rs2736098 | TERT | Synonymous | 0.305/0.277 | p.Ala305Ala | 1.13 (1.10–1.16) | 6.67 × 10−16 |
| Known | Smoker | 6p21.32 | 6:32293475:T:G | T/G | rs7775397 | MHC | Missense | 0.141/0.129 | p.Lys400Gln | 1.15 (1.11–1.19) | 1.01 × 10−8 |
| Known | LC | 15q25.1 | 15:78590583:G:A | G/A | rs16969968 | CHRNA5 | Missense | 0.373/0.330 | p.Asp398Asn | 1.25 (1.24–1.27) | 2.00 × 10−54 |
| Known | LC | 19q13.2 | 19:40844710:G:A | G/A | rs28399462 | CYP2A6 | Synonymous | 0.015/0.023 | p.Pro408Pro | 0.77 (0.71–0.83) | 2.99 × 10−10 |
Definition of abbreviations: CI = confidence interval; EAF = effect allele frequency; HGVSp = Human Genome Variation Society protein nomenclature; LC = overall lung cancer; OR = odds ratio; rsID = reference SNP ID number; SNV = single-nucleotide variant.
*Genome position was based on GRCh38 (Genome Reference Consortium human build 38) coordinates.
†Reference allele/effect allele.
‡We report the EAF using the discovery set (UK Biobank Whole-Exome Sequencing Project).
§The ORs and P values were calculated from the meta-analysis of the discovery and replication sets.
Figure 3.
Manhattan plot of the exome-wide association studies. (A) Overall lung cancer. (B) Smokers. METAL software was used to summarize the results between discovery and four replication sets in the meta-analysis. Red identifies novel genes. LC = overall lung cancer.
Importantly, two novel missense variants were identified: rs202197044TET3 (p.His143Arg on 2p13.1; odds ratio [OR] [95% confidence interval (CI)]: 154.11 [26.88–883.91]; P = 3.60 × 10−8) and rs202187871POT1 (p.Asp224Asn on 7q31.33; OR [95% CI]: 10.60 [4.63–24.22]; P = 2.21 × 10−8). In addition, two novel synonymous variants were identified: rs7447927TMEM173 (5q31.2; OR [95% CI]: 0.93 [0.91–0.95]; P = 1.32 × 10−9) and rs140624366ATRN (20p13; OR [95% CI]: 9.28 [2.80–30.72]; P = 2.97 × 10−9). All exome variants that reached P < 1 × 10−4 in each stratum are listed in Table E4.
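The reported odds ratios and confidence intervals can be related to the underlying log odds ratio (β) and its standard error. The sketch below assumes a Wald-type 95% CI that is symmetric on the log scale, which is an assumption about how the intervals were formed; the helper names are hypothetical.

```python
import math

Z_95 = 1.959964   # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_point, ci_low, ci_high):
    """Recover the log odds ratio and its standard error from an OR and 95% CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return beta, se


def or_with_ci(beta, se):
    """Convert a log odds ratio and standard error back to OR and 95% CI."""
    return tuple(math.exp(x) for x in (beta, beta - Z_95 * se, beta + Z_95 * se))


# rs202187871 (POT1) as reported: OR 10.60 (95% CI 4.63-24.22)
beta, se = log_or_and_se(10.60, 4.63, 24.22)
point, low, high = or_with_ci(beta, se)
wald_p = math.erfc(abs(beta / se) / math.sqrt(2.0))   # two-sided Wald P value
```

Under this assumption, the Wald P value implied by the POT1 interval is on the order of 2 × 10−8, in line with the reported meta-analysis value. The very wide interval for the TET3 variant reflects the small number of carriers.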
We collected lung cancer WES data from TRICL Exome Plus Targeted Sequencing, which contains 1,045 lung cancer cases and 885 controls. Of the five common SNVs with sufficient allele counts, three were successfully replicated (Table E5). Nevertheless, the effect sizes of all five SNVs were highly consistent between the two analyses (Pearson’s r = 0.96; P = 0.011) (Figure E5).
Furthermore, we performed an additional association meta-analysis for the ±500-kb regions around the identified variants using the data imputed from the TOPMed reference panel. Coding and noncoding variants that had LD r2 ⩾ 0.01 with the identified variants were included. Seven of the identified variants showed the strongest association with lung cancer risk in their regions; the exception was 5p15.33, where rs401681CLPTM1L showed a stronger signal (Table E6). We also investigated LD between the identified variants and other variants in the same regions (Figures E6 and E7). We observed LD signals in 5q31.2TMEM173, 5p15.33TERT, 6p21.32MHC, 15q25.1CHRNA5, and 19q13.2CYP2A6. However, no notable LD signals were observed among the rare variants with an MAF < 0.01.
Rare Deleterious Variants in TET3 and POT1 Are Associated with Lung Cancer
Two novel missense variants identified in the meta-analysis were annotated as deleterious by Variant Effect Predictor. The first SNV was p.His143Arg in TET3 (tet methylcytosine dioxygenase 3). This variant is rare in the European population, with an MAF of 0.0058% in gnomAD and 0.0095% in dbSNP. In our combined datasets, the MAF was 0.076% for lung cancer cases and 0.015% for controls. We observed 10 lung cancer carriers of this variant (8 in UKB [P = 5.18 × 10−6], 2 in PLCO [P = 0.0013]) and 131 control carriers; the case carriers were evenly distributed across histology (Fisher’s exact test P = 0.999) and smoking status (P = 0.051) (Table E7). The association remained significant in all strata (P < 0.05).
Another missense variant was p.Asp224Asn in POT1 (protection of telomeres 1). This variant is rare in the European population, with an MAF of 0.020% in gnomAD and 0.025% in dbSNP. In our combined datasets, the MAF was 0.044% for lung cancer cases and 0.018% for controls. We observed 23 LC carriers of this variant (8 in UKB [P = 9.08 × 10−5], 2 in PLCO [P = 0.042], 10 in ILCCO-OncoArray [P = 0.016], and 3 in TRICL [P = 8.71 × 10−4]), which were evenly distributed in different histology (P = 0.352) but enriched in never-smokers (P = 0.001) (Table E8).
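Carrier counts translate into allele frequencies by simple arithmetic. The sketch below assumes that every carrier is heterozygous and that all 4,083 UK Biobank cases in Table 1 contributed genotypes; both are assumptions, and the function name is hypothetical.

```python
def allele_frequency(n_carriers: int, n_individuals: int) -> float:
    """Allele frequency assuming every carrier holds exactly one copy."""
    return n_carriers / (2 * n_individuals)


# 8 UK Biobank case carriers among 4,083 incident cases (counts from the text and Table 1).
af_ukb_cases = allele_frequency(8, 4_083)
af_percent = 100 * af_ukb_cases
```

Under these assumptions the frequency is close to 0.1%, which agrees with the case EAF of 0.0010 listed for the TET3 variant in the discovery set (Table 2).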
Two Synonymous Variants in TMEM173 and ATRN Are Associated with Lung Cancer
Two novel synonymous variants were identified. One is located in STING1, also known as TMEM173 (stimulator of the interferon response, cGAMP interactor). It is a common variant, with an effect allele frequency of 72.3% in lung cancer cases and 74.1% in controls. The signal was stable in all strata except the SCC (Table E9). Another SNV in ATRN (attractin) was exceedingly rare, with an MAF of 0.029% for lung cancer cases and 0.0042% for controls, whereas the MAF in gnomAD was 0.0044%. It was enriched in smoking lung cancer cases only (13/13 carriers) and passed the genome-wide significance level in smokers while maintaining nominal significance in overall LC (P = 0.0012) (Table E10).
ExWAS Based on Gene Level
In addition to variant-level analyses, gene-based association studies were performed to capture the effects of ultrarare variants (online supplement). After Bonferroni correction, eight unique genes were considered significant (P < 2.5 × 10−6) (Figure E8 and Table E11). Four genes have been previously identified: BRCA2, CHRNA5, CYP2A6, and ATM. The remaining four genes were novel: POT1 (lung cancer, P = 4.77 × 10−7), RTEL1 (lung cancer, P = 1.19 × 10−6), BSG (never-smokers, P = 4.80 × 10−7), and ZNF232 (smokers, P = 1.97 × 10−6).
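The gene-level threshold can be illustrated with a short calculation. The number of genes tested is not given in this section; 20,000 is assumed here because 0.05 / 20,000 equals the stated threshold of 2.5 × 10−6. The gene P values are those reported above.

```python
ALPHA = 0.05


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(20_000)   # 20,000 genes is an assumption
gene_p = {"POT1": 4.77e-7, "RTEL1": 1.19e-6, "BSG": 4.80e-7, "ZNF232": 1.97e-6}
significant = sorted(g for g, p in gene_p.items() if p < threshold)
```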
Replication of Previously Identified Variants
In addition, we attempted to replicate eight germline mutations previously reported in lung cancer (Table E12) (29–31). Of the six variants that had enough minor allele counts in lung cancer cases (MAC ⩾ 3), none reached nominal significance (P < 0.05) in the UKB WES project, which may be due to the limited number of lung cancer cases in such a prospective population cohort. However, when the four additional cohorts were combined, four variants were successfully replicated (Pmeta (P values of meta-analysis) < 0.05): rs11571833BRCA2 (OR [95% CI]: 1.38 [1.23–1.54]; Pmeta = 5.96 × 10−5), rs17879961CHEK2 (OR [95% CI]: 0.76 [0.66–0.88]; Pmeta = 0.010), rs56009889ATM (OR [95% CI]: 2.17 [1.56–3.03]; Pmeta = 0.004), and rs150665432KIAA0930 (OR [95% CI]: 1.28 [1.09–1.51]; Pmeta = 0.041). Thus, our findings support the previously reported associations for these four variants.
Association Analysis with Potential Intermediate Exposures
The association of the eight identified sentinel variants with potential intermediate exposures was analyzed. rs202197044TET3 was significantly associated with emphysema (OR, 3.55; Pfdr (FDR corrected P value) = 0.015), whereas rs202187871POT1 was strongly associated with telomere length (β = 1.08; Pfdr = 3.75 × 10−53) (Figure E9). Of the two synonymous variants, rs7447927TMEM173 was associated with telomere length (β = −0.008; Pfdr = 0.001), FVC (β = 0.008; Pfdr = 0.003), and FEV1/FVC (β = −0.005; Pfdr = 0.016), and rs140624366ATRN showed a marginal association with lung fibrosis (β = 2.48; Pfdr = 0.058). For the known loci, we observed abundant signals for variants in CHRNA5, CYP2A6, MHC, and TERT (Figure 4).
Figure 4.
Association results of potential intermediate exposures with identified single-nucleotide variants. Blue indicates that the effect allele of the single-nucleotide variant is positively associated with exposure, whereas red indicates negative effects. *Pfdr (FDR corrected P value) ⩽ 0.05. COPD = chronic obstructive pulmonary disease.
Furthermore, Bayesian colocalization analysis was performed to test whether the intermediate exposures functioned causally in the SNV-exposure-lung cancer pathways (online supplement). We identified 13 of the 28 potential pleiotropic loci (46.4%) with PP.H4 > 0.7 (Table E13).
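For readers unfamiliar with PP.H4, the sketch below shows how approximate-Bayes-factor colocalization combines per-SNV log Bayes factors for two traits into posterior probabilities for the five hypotheses H0 to H4. It follows the general scheme used by the coloc package but is not that package's code; the prior values are coloc's documented defaults, and whether they were used in this study is not stated. The input log Bayes factors are hypothetical.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)), valid for a > b."""
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNV log approximate Bayes factors."""
    l1, l2 = logsumexp(labf1), logsumexp(labf2)
    l12 = logsumexp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                      # H0: no association
        math.log(p1) + l1,                                        # H1: exposure only
        math.log(p2) + l2,                                        # H2: lung cancer only
        math.log(p1) + math.log(p2) + logdiffexp(l1 + l2, l12),   # H3: distinct causal SNVs
        math.log(p12) + l12,                                      # H4: one shared causal SNV
    ]
    total = logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]


# Hypothetical region of three SNVs in which the first is strongly supported for both traits.
pp = coloc_posteriors([10.0, 0.0, 0.0], [10.0, 0.0, 0.0])
shared_signal = pp[4] > 0.7   # threshold used in the text
```

A locus is counted as colocalized when the posterior for H4 exceeds 0.7, as for 13 of the 28 loci reported above.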
Functional Analysis of the Identified Variants
We analyzed tissue-specific expression quantitative trait loci (eQTL) in lung tissues from the GTEx database (V8 release). Four variants (rs202197044TET3, rs202187871POT1, rs140624366ATRN, and rs2736098TERT) were not detected in the GTEx samples. The remaining four variants exhibited significant eQTL relationships with gene expression (Table E14). In addition, we investigated splicing QTL (sQTL) in TCGA and GTEx lung tissues using the CancerSplicingQTL and GTEx databases. Three variants (rs7447927, rs7775397, and rs16969968) had significant sQTLs in TCGA and GTEx (Table E15).
To investigate the isoform expression of the newly identified genes, we collected lung tissue RNA sequencing data from TCGA and GTEx (Figure E10 and Table E16). TET3 has four isoforms, and ENST00000409262 is the MANE (Matched Annotation from the NCBI [National Center for Biotechnology Information] and EMBL-EBI [European Molecular Biology Laboratory-European Bioinformatics Institute]) Select transcript. It was expressed at significantly higher levels in tumor tissues than in adjacent normal tissues (Pfdr = 1.67 × 10−17) and healthy normal tissues (Pfdr = 0.050). For POT1, the representative isoform ENST00000357628 was also significantly upregulated in tumor tissues compared with healthy normal tissues (Pfdr = 4.15 × 10−4), as was ENST00000262919 in ATRN (Pfdr = 1.63 × 10−17). An inverse trend was observed for ENST00000330794 in TMEM173, which was more highly expressed in adjacent normal tissues (Pfdr = 2.28 × 10−76) and healthy normal tissues (Pfdr = 4.41 × 10−23) than in tumor tissues.
Discussion
In this study, we comprehensively evaluated genetic variants of the human exome and lung cancer predisposition in 683,214 participants of European ancestry. As genotyping costs decrease, global biobanks can genotype a large number of participants, enabling GWASs for numerous traits and diseases. With long-term follow-up using electronic health records or surveys, thousands of incident lung cancer cases accrue, which adds statistical power to traditional case–control GWASs and enables the evaluation of lung cancer risk stratification at the population level. However, the low incidence of lung cancer in prospective cohorts easily leads to unbalanced case–control ratios (<1:10) and varying proportions of cancer cases across the cohorts in a meta-analysis, which may inflate type I errors. To compensate for this, we used logistic mixed models incorporating the saddlepoint approximation to calibrate the score tests for unbalanced case–control ratios. By conducting a meta-analysis of a sufficient number of lung cancer cases, novel germline variants could be identified. Thus, this large-scale ExWAS harmonized the UKB sequencing population and four independent cohorts with SNP array data to gain statistical power from an adequate number of lung cancer cases (Figure E11). Moreover, establishing independent discovery and replication sets allows the identified signals to be replicated externally to ensure robustness, especially for rare variants.
Our first major finding concerns the effects of LoF, missense, and synonymous variants on predisposition to lung cancer, which we analyzed systematically. LoF variants tended to have lower allele frequencies and larger effects in overall lung cancer and in lung cancer subgroups, whereas missense variants had moderate effects. The differences in effect size among the three variant types disappeared at higher allele frequencies (MAF > 0.01). However, we did not identify any LoF variant that passed P ⩽ 5 × 10−8 in the ExWAS, which may be due to two reasons: 1) the proportion of LoF variants is low (approximately 3%), and the multiple-testing burden is heavy; 2) the MAFs of LoF variants tend to be low, leaving insufficient statistical power to detect the signals.
Our second major finding is that exome-wide signals are associated with lung cancer. POT1 and RTEL1 are important genes that encode proteins involved in telomere maintenance. The p.Asp224Asn variant (also known as c.670G>A), located in coding exon 5 of the POT1 gene, results from a G-to-A substitution at nucleotide position 670. This alteration was also identified in three siblings and their mother, all affected by Hodgkin’s lymphoma, and was shown to inhibit telomere binding in vitro and to increase telomere length and fragility (32). Moreover, this variant is associated with various hematological malignancies, such as chronic myeloid leukemia (33). Another POT1 variant, p.V326A, has been associated with lung cancer risk in the Japanese population (34). RTEL1 encodes a DNA helicase that functions in the stability, protection, and elongation of telomeres (35). Thus, alterations in these genes may provide insights into the mechanisms by which telomere dysfunction influences the risk of lung cancer.
TET3 belongs to the ten-eleven translocation (TET) gene family and plays a role in DNA methylation. It is mainly involved in epigenetic modifications, such as 5-hydroxymethylcytosine (5hmC), which mediates active DNA demethylation and is considered an important cancer therapeutic target (36). Furthermore, TET proteins (Tet2 and Tet3) play essential roles in maintaining Treg molecular features, including TGFβ-induced iTreg cell differentiation and IL-2 responsiveness in iTreg cells (36).
The other three novel genes identified were TMEM173, ATRN, and BSG, which have similar biological functions in immune response. TMEM173 (STING1) is widely expressed in innate and adaptive immune cells and induces the production of type I IFN through activation of the cGAS-STING pathway; it also activates the innate immune system, regulates the cancer-immunity cycle, facilitates the release of cancer cell antigens, and promotes the trafficking and infiltration of T cells to tumors (37, 38). ATRN is associated with initial immune cell clustering during inflammatory responses and may regulate the chemotactic activity of chemokines (39, 40). BSG, also known as CD147, is a member of the immunoglobulin superfamily and is widely expressed in the epithelial, cancer, and T cells of the immune system (41).
Our third major finding linked the signals to intermediate exposures for lung cancer and explained the potential causal pathogenic pathways, including lung cancer–related chronic diseases and well-known risk factors. We identified significant exposures for each germline variant; some variants, such as those in CHRNA5, MHC, and CYP2A6, showed strong relationships with multiple exposures. However, owing to a lack of information, the biological functions of some SNVs need further investigation, for example, DNA methylation aberrations for TET3 and immune biomarkers (such as tumor necrosis factors and interleukins) for ATRN/TMEM173.
Our study has several strengths. First, we comprehensively evaluated the exome-wide genetic variants of lung cancer in five independent cohorts. We leveraged a large-scale sequencing population to discover reliable mutation signals, which were replicated in two retrospective case–control datasets and two prospective population cohorts to gain statistical power and make the signals robust. Second, the SNP array-based data were imputed using the largest reference population, which has been shown to provide higher accuracy than traditional reference panels such as 1000G and the Haplotype Reference Consortium (17, 18). Third, we explored the relationship between the identified genes and lung cancer at the multi-omics level, including genomics and transcriptomics. Trans-omics analysis indicated that the identified signals were functional.
We acknowledge the limitations of this study. First, we focused only on individuals of European ancestry. Therefore, it is essential to evaluate the associations of these variants in non-European populations. Second, although we demonstrated the associations of potential candidates with lung cancer risk, it is challenging to detect tissue-specific eQTL effects for these rare or ultra-rare candidates, owing to their low frequencies and weak LD between rare and common variants. Future large-scale multi-omics studies are expected to explain the QTL relationships of the identified rare variants. Third, independent large-scale sequencing studies are warranted to validate the identified signals, particularly for less common histological subtypes.
In conclusion, our study provides novel insights into human exomes and germline mutations through comprehensive analyses of the genetic predisposition to lung cancer and subsequent target analyses of specific exposures and genes.
Acknowledgments
The authors thank the participants and investigators of UK Biobank, PLCO (Prostate, Lung, Colorectal, and Ovarian) cancer screening trial, ILCCO-OncoArray (International Lung Cancer OncoArray Consortium), TRICL (Transdisciplinary Research in Cancer of the Lung), and FinnGen.
Footnotes
Supported by National Natural Science Foundation of China grants 82220108002 (F.C.), 82103946 (S.S.), 82173620 (Y.Z.), 81820108028 (H.S.), Natural Science Foundation of the Jiangsu Higher Education Institutions of China grant 21KJB330004 (S.S.), and NIH (National Cancer Institute) grant U01CA209414 (D.C.C.).
Author Contributions: S.S., Y.Z., D.C.C., and F.C. contributed to the study design. S.S., D.W., L.H., and D.S. contributed to data collection. S.S., L.Z., and J.Y. performed statistical analyses and interpretation. S.S. drafted the manuscript. E.M., S.H., and H.Z. reviewed and revised the manuscript. All authors approved the final version of the manuscript.
Data availability: UK Biobank data are available from https://www.ukbiobank.ac.uk/. ILCCO-OncoArray data are available from https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001273.v3.p2. TRICL data are available from https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001681.v1.p1. FinnGen R6 release data are available from https://www.finngen.fi/fi. PLCO data are available from https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001286.v2.p2. TCGA data are available from https://portal.gdc.cancer.gov/. GTEx data are available from https://www.gtexportal.org/home/. Online resources: TOPMed imputation server: https://imputation.biodatacatalyst.nhlbi.nih.gov/. gnomAD: http://gnomad.broadinstitute.org/. dbSNP: https://www.ncbi.nlm.nih.gov/snp/. Code availability: The R software codes that support our findings are available from the corresponding author by request.
This article has an online supplement, which is accessible from this issue’s table of contents at www.atsjournals.org.
Originally Published in Press as DOI: 10.1164/rccm.202212-2199OC on May 11, 2023
Author disclosures are available with the text of this article at www.atsjournals.org.
References
- 1. Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA Cancer J Clin. 2022;72:7–33. doi: 10.3322/caac.21708.
- 2. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71:209–249. doi: 10.3322/caac.21660.
- 3. Malhotra J, Malvezzi M, Negri E, La Vecchia C, Boffetta P. Risk factors for lung cancer worldwide. Eur Respir J. 2016;48:889–902. doi: 10.1183/13993003.00359-2016.
- 4. McKay JD, Hung RJ, Han Y, Zong X, Carreras-Torres R, Christiani DC, et al.; SpiroMeta Consortium. Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. Nat Genet. 2017;49:1126–1132. doi: 10.1038/ng.3892.
- 5. Byun J, Han Y, Li Y, Xia J, Long E, Choi J, et al. Cross-ancestry genome-wide meta-analysis of 61,047 cases and 947,237 controls identifies new susceptibility loci contributing to lung cancer. Nat Genet. 2022;54:1167–1177. doi: 10.1038/s41588-022-01115-x.
- 6. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543–550. doi: 10.1038/nature13385.
- 7. Castellanos E, Feld E, Horn L. Driven by mutations: the predictive value of mutation subtype in EGFR-mutated non-small cell lung cancer. J Thorac Oncol. 2017;12:612–623. doi: 10.1016/j.jtho.2016.12.014.
- 8. Tam V, Patel N, Turcotte M, Bossé Y, Paré G, Meyre D. Benefits and limitations of genome-wide association studies. Nat Rev Genet. 2019;20:467–484. doi: 10.1038/s41576-019-0127-1.
- 9. Wang Q, Dhindsa RS, Carss K, Harper AR, Nag A, Tachmazidou I, et al.; AstraZeneca Genomics Initiative. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature. 2021;597:527–532. doi: 10.1038/s41586-021-03855-y.
- 10. Backman JD, Li AH, Marcketta A, Sun D, Mbatchou J, Kessler MD, et al.; Regeneron Genetics Center; DiscovEHR. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature. 2021;599:628–634. doi: 10.1038/s41586-021-04103-z.
- 11. Van Hout CV, Tachmazidou I, Backman JD, Hoffman JD, Liu D, Pandey AK, et al.; Geisinger-Regeneron DiscovEHR Collaboration; Regeneron Genetics Center. Exome sequencing and characterization of 49,960 individuals in the UK Biobank. Nature. 2020;586:749–756. doi: 10.1038/s41586-020-2853-0.
- 12. Cirulli ET, White S, Read RW, Elhanan G, Metcalf WJ, Tanudjaja F, et al. Genome-wide rare variant analysis for thousands of phenotypes in over 70,000 exomes from two cohorts. Nat Commun. 2020;11:542. doi: 10.1038/s41467-020-14288-y.
- 13. Shen X, Song S, Li C, Zhang J. Synonymous mutations in representative yeast genes are mostly strongly non-neutral. Nature. 2022;606:725–731. doi: 10.1038/s41586-022-04823-w.
- 14. Sharp N. Mutations matter even if proteins stay the same. Nature. 2022;606:657–659. doi: 10.1038/d41586-022-01091-6.
- 15. Sun BB, Kurki MI, Foley CN, Mechakra A, Chen CY, Marshall E, et al.; Biogen Biobank Team; FinnGen. Genetic associations of protein-coding variants in human disease. Nature. 2022;603:95–102. doi: 10.1038/s41586-022-04394-w.
- 16. Halldorsson BV, Eggertsson HP, Moore KHS, Hauswedell H, Eiriksson O, Ulfarsson MO, et al.; DBDS Genetic Consortium. The sequences of 150,119 genomes in the UK Biobank. Nature. 2022;607:732–740. doi: 10.1038/s41586-022-04965-x.
- 17. Barton AR, Sherman MA, Mukamel RE, Loh PR. Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses. Nat Genet. 2021;53:1260–1269. doi: 10.1038/s41588-021-00892-1.
- 18. Hanks SC, Forer L, Schönherr S, LeFaive J, Martins T, Welch R, et al.; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. Extent to which array genotyping and imputation with large reference panels approximate deep whole-genome sequencing. Am J Hum Genet. 2022;109:1653–1666. doi: 10.1016/j.ajhg.2022.07.012.
- 19. Sun Q, Liu W, Rosen JD, Huang L, Pace RG, Dang H, et al.; Cystic Fibrosis Genome Project. Leveraging TOPMed imputation server and constructing a cohort-specific imputation reference panel to enhance genotype imputation among cystic fibrosis patients. HGG Adv. 2022;3:100090. doi: 10.1016/j.xhgg.2022.100090.
- 20. Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, et al.; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature. 2021;590:290–299. doi: 10.1038/s41586-021-03205-y.
- 21. Loh P-R, Danecek P, Palamara PF, Fuchsberger C, Reshef YA, Finucane HK, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat Genet. 2016;48:1443–1448. doi: 10.1038/ng.3679.
- 22. Zhou W, Zhao Z, Nielsen JB, Fritsche LG, LeFaive J, Gagliano Taliun SA, et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat Genet. 2020;52:634–639. doi: 10.1038/s41588-020-0621-6.
- 23. Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet. 2018;50:1335–1341. doi: 10.1038/s41588-018-0184-y.
- 24. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
- 25. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, et al. The Ensembl Variant Effect Predictor. Genome Biol. 2016;17:122. doi: 10.1186/s13059-016-0974-4.
- 26. Landrum MJ, Chitipiralla S, Brown GR, Chen C, Gu B, Hart J, et al. ClinVar: improvements to accessing data. Nucleic Acids Res. 2020;48:D835–D844. doi: 10.1093/nar/gkz972.
- 27. Bulik-Sullivan BK, Loh PR, Finucane HK, Ripke S, Yang J, Patterson N, et al.; Schizophrenia Working Group of the Psychiatric Genomics Consortium. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47:291–295. doi: 10.1038/ng.3211.
- 28. Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hingorani AD, Wallace C, et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383.
- 29. Liu Y, Xia J, McKay J, Tsavachidis S, Xiao X, Spitz MR, et al. Rare deleterious germline variants and risk of lung cancer. NPJ Precis Oncol. 2021;5:12. doi: 10.1038/s41698-021-00146-7.
- 30. Ji X, Mukherjee S, Landi MT, Bosse Y, Joubert P, Zhu D, et al. Protein-altering germline mutations implicate novel genes related to lung cancer development. Nat Commun. 2020;11:2220. doi: 10.1038/s41467-020-15905-6.
- 31. Wang Y, McKay JD, Rafnar T, Wang Z, Timofeeva MN, Broderick P, et al. Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nat Genet. 2014;46:736–741. doi: 10.1038/ng.3002.
- 32. McMaster ML, Sun C, Landi MT, Savage SA, Rotunno M, Yang XR, et al. Germline mutations in Protection of Telomeres 1 in two families with Hodgkin lymphoma. Br J Haematol. 2018;181:372–377. doi: 10.1111/bjh.15203.
- 33. Nathan V, Johansson PA, Palmer JM, Hamilton HR, Howlie M, Brooks KM, et al. A rare missense variant in protection of telomeres 1 (POT1) predisposes to a range of haematological malignancies. Br J Haematol. 2021;192:e57–e60. doi: 10.1111/bjh.17218.
- 34. Ishigaki K, Akiyama M, Kanai M, Takahashi A, Kawakami E, Sugishita H, et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat Genet. 2020;52:669–679. doi: 10.1038/s41588-020-0640-3.
- 35. Ghisays F, Garzia A, Wang H, Canasto-Chibuque C, Hohl M, Savage SA, et al. RTEL1 influences the abundance and localization of TERRA RNA. Nat Commun. 2021;12:3016. doi: 10.1038/s41467-021-23299-2.
- 36. Yue X, Samaniego-Castruita D, González-Avalos E, Li X, Barwick BG, Rao A. Whole-genome analysis of TET dioxygenase function in regulatory T cells. EMBO Rep. 2021;22:e52716. doi: 10.15252/embr.202152716.
- 37. Zhu Y, An X, Zhang X, Qiao Y, Zheng T, Li X. STING: a master regulator in the cancer-immunity cycle. Mol Cancer. 2019;18:152. doi: 10.1186/s12943-019-1087-y.
- 38. Wang Y, Luo J, Alu A, Han X, Wei Y, Wei X. cGAS-STING pathway in cancer biotherapy. Mol Cancer. 2020;19:136. doi: 10.1186/s12943-020-01247-w.
- 39. Matarese G, La Cava A. The intricate interface between immune system and metabolism. Trends Immunol. 2004;25:193–200. doi: 10.1016/j.it.2004.02.009.
- 40. Duke-Cohan JS, Gu J, McLaughlin DF, Xu Y, Freeman GJ, Schlossman SF. Attractin (DPPT-L), a member of the CUB family of cell adhesion and guidance proteins, is secreted by activated human T lymphocytes and modulates immune cell interactions. Proc Natl Acad Sci USA. 1998;95:11336–11341. doi: 10.1073/pnas.95.19.11336.
- 41. Chen Y, Xu J, Wu X, Yao H, Yan Z, Guo T, et al. CD147 regulates antitumor CD8+ T-cell responses to facilitate tumor-immune escape. Cell Mol Immunol. 2021;18:1995–2009. doi: 10.1038/s41423-020-00570-y.




