Graphical abstract
Overview of the study. LC: liquid chromatography; MS: mass spectrometry; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; CLC: Charcot–Leyden crystal.
Abstract
Background
Severe asthma is a heterogeneous airway inflammatory disease presenting with varying clinicophysiological characteristics and response to treatments. The objectives of the present study were to determine the clinical phenotypes of the Chinese C-BIOPRED cohort and their link to the sputum proteome.
Methods
Partition-around-medoids clustering was applied to a training set of 362 nonsmoking, smoking or ex-smoking severe asthma patients, and nonsmoking mild–moderate asthma patients using eight clinicophysiological variables, with validation performed in the remaining 181.
Results
Three stable clusters were defined, with Cluster T1 composed of predominantly female patients with severe nonsmoking asthma experiencing frequent exacerbations with moderate airflow obstruction, and Cluster T3 of elderly male patients with smoking/ex-smoking late-onset severe asthma and severe airflow obstruction and a moderate number of exacerbations. Cluster T2 was composed of nonsmokers with a mild–moderate airflow obstruction and no previous exacerbations. Validation clusters (V1, V2 and V3) were similar to the training set clusters. Differentially expressed proteins in sputum supernatants measured by liquid chromatography with tandem mass spectrometry pointed to differences in the complement and coagulation cascade pathway between Cluster 1 (T1 and V1) and Cluster 3 (T3 and V3), as well as between Cluster 2 (T2 and V2) and Cluster 3. Galectin 10 was upregulated in Cluster 1 compared with Cluster 2, and correlated with exacerbations, fractional exhaled nitric oxide, blood and sputum eosinophil count and oral corticosteroid dose in Cluster 1.
Conclusion
The clinical clusters were differentiated by smoking status, degree of airflow obstruction and exacerbation history, and by sputum complement and coagulation pathways, and galectin 10 levels.
Shareable abstract
Three clinical phenotypes of severe asthma differentiated by exacerbations, airflow obstruction and smoking status can be differentiated by sputum proteomic expression of complement and coagulation pathways and galectin 10 https://bit.ly/4aFrAf3
Introduction
Asthma is a major public health challenge in China, where it affects 45.7 million adults with an estimated prevalence of 4.2% [1]. It is a heterogeneous syndrome characterised by bronchial hyperresponsiveness and reversible airflow obstruction with varying degrees of airway inflammation, severity and variable response to treatments. Patients with severe asthma experience more frequent airflow obstruction and impairment of quality of life despite use of higher levels of controller therapy [2]. The underlying molecular mechanisms or endotypes may vary within individuals [3]. Identification of these phenotypes may contribute to a more individualised treatment approach, and may lead to the identification of strategies for preventing the progression of disease severity. A number of asthma clusters have already been described in predominantly white cohorts such as the US Severe Asthma Research Program (SARP) and the European U-BIOPRED. Four stable and reproducible clusters were generated using clinicophysiological parameters in U-BIOPRED [4], with clusters of late-onset severe asthma with chronic airflow obstruction and of obese female patients with uncontrolled severe asthma with increased exacerbations but with normal lung function.
Phenotypes described in Asian cohorts have distinct differences from those described in white cohorts. For example, the obese female phenotype is uncommon in Asia [5, 6] and a recent report of the C-BIOPRED Chinese severe asthma cohort has revealed a higher proportion of people with severe eosinophilic asthma with particularly high levels of eosinophils in sputum [7]. Therefore, we set out to perform a cluster analysis of the clinical phenotypes of asthma using clinicophysiological parameters collected in the C-BIOPRED cohort of patients with severe asthma defined according to the European Respiratory Society (ERS)/American Thoracic Society (ATS) criteria [2] and mild–moderate asthma patients recruited from asthma clinics across China. A novel approach in this cohort was to compare the sputum proteome between the clusters defined in this analysis, which has helped to understand the potential mechanistic differences between them.
Methods
Participants
C-BIOPRED is a multicentre prospective study that recruited patients with severe asthma from 15 provinces in China between 2015 and 2018. The study comprises 545 patients with asthma (both mild–moderate and severe, including nonsmokers, ex-smokers and current smokers) and 100 healthy nonsmoking controls. The patients were aged from 18 to 75 years. Only the baseline data of the C-BIOPRED asthma participants were used in this analysis. More detail about the asthma patients and the inclusion criteria can be found in the supplementary materials; the protocol and assessments have been described previously [7]. The cohort was randomly split into a training set and a validation set (2:1) balanced in terms of asthma severity, age and sex. The study was approved by the ethics committee of each participating clinical institution. All participants gave written informed consent.
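One way to obtain a 2:1 split that remains balanced for severity, age and sex is to shuffle participants within strata and assign about two-thirds of each stratum to the training set. The sketch below illustrates that idea only; it is not the procedure used in the study, and the field names, age bands and random seed are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(participants, train_fraction=2 / 3, seed=0):
    """Split participants into training and validation sets within strata.

    Strata are (severity, sex, age band). The age bands are hypothetical
    and chosen only for illustration.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in participants:
        if p["age"] < 40:
            age_band = "<40"
        elif p["age"] < 60:
            age_band = "40-59"
        else:
            age_band = ">=60"
        strata[(p["severity"], p["sex"], age_band)].append(p)

    train, validation = [], []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        # Send roughly two-thirds of each stratum to the training set.
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        validation.extend(members[n_train:])
    return train, validation
```

Splitting within strata, rather than over the whole cohort at once, keeps the proportions of each severity, sex and age group close to 2:1 in both sets.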
Clinical variables and data preprocessing
The cluster analysis concentrated on pivotal variables that are easily obtainable by primary care physicians, reflecting significant historical, clinical and physiological data for each asthmatic participant. The parameters include age of onset of asthma symptoms, pack-years of cigarette smoking, body mass index (BMI), forced expiratory volume in 1 s (FEV1) as a percentage of predicted (% pred) value, FEV1/forced vital capacity (FVC) ratio, the average score of the first five questions of the Asthma Control Questionnaire, the self-reported number of exacerbations in the previous year and the daily dose of oral prednisolone or equivalent. These clinicophysiological data were transformed to a normal distribution using Box–Cox power transformation [8] through the powerTransform function of the R package [9].
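As an illustration of the Box–Cox step, the sketch below applies the power transform and chooses the exponent by maximising the profile log-likelihood over a coarse grid. It is a minimal stand-alone example, not the powerTransform implementation; the grid search and the function names are assumptions, and variables that can take the value zero (such as exacerbation counts) would first need a positive shift, which is not shown.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox power transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Profile log-likelihood of lambda under a normal model."""
    n = len(values)
    transformed = box_cox(values, lam)
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


def estimate_lambda(values, grid=None):
    """Pick lambda by a coarse grid search (an illustrative shortcut)."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))
```

For log-normally distributed data the estimated exponent should be close to zero, which corresponds to a log transform.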
Cluster analysis
The Euclidean distance, which measures dissimilarity as the ordinary straight-line distance between two points, was used to quantify dissimilarity between participants. The partition-around-medoids algorithm, a more robust generalisation of the k-means method [10], was used for clustering the data. Consensus clustering was performed by randomly removing 10% of the data and repeating the clustering 1000 times to assess cluster stability [11]. Stability was assessed by studying the cumulative distribution function (CDF), which describes the proportion of pairs of participants clustered together in a given percentage of resampling iterations. Clusters were considered stable if the curve was flat, for example between values of 0.2 and 0.8, meaning that no pairs of participants were clustered together in between 20% and 80% of the iterations (that is, pairs were either rarely clustered together (<20%) or almost always clustered together (>80%)).
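The stability check described above can be sketched as follows. This is a simplified illustration rather than the software used in the study: the PAM routine starts from random medoids and applies greedy swaps (the standard algorithm has a dedicated build phase), and all function names, the seed and the default parameters are assumptions.

```python
import math
import random
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)


def pam(points, k, rng):
    """Simplified PAM: random initial medoids, then greedy cost-lowering swaps."""
    medoids = rng.sample(range(len(points)), k)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Label each point with the position of its nearest medoid.
    return [min(range(k), key=lambda j: euclidean(p, points[medoids[j]]))
            for p in points]


def consensus_matrix(points, k, n_iter=1000, drop=0.10, seed=0):
    """Consensus index per pair: share of resamples containing both
    members in which they fall in the same cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = {pair: 0 for pair in combinations(range(n), 2)}
    sampled = dict(together)
    keep = n - int(round(drop * n))
    for _ in range(n_iter):
        idx = sorted(rng.sample(range(n), keep))
        labels = pam([points[i] for i in idx], k, rng)
        for (a, la), (b, lb) in combinations(zip(idx, labels), 2):
            sampled[(a, b)] += 1
            together[(a, b)] += la == lb
    return {pair: together[pair] / sampled[pair]
            for pair in together if sampled[pair]}


def consensus_cdf(consensus, threshold):
    """Empirical CDF: share of pairs with consensus index <= threshold."""
    values = list(consensus.values())
    return sum(v <= threshold for v in values) / len(values)
```

With this sketch, `consensus_cdf(c, 0.8) - consensus_cdf(c, 0.2)` gives the share of ambiguous pairs; a value near zero corresponds to the flat middle section of the CDF that is taken to indicate stable clusters.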
Sputum collection and proteomic analysis
Sputum was induced by the inhalation of hypertonic saline aerosols at increasing concentrations (3%, 4% and 5%). First, the expectorate was collected into a sterile container and processed within 2 h. Second, sputum was isolated from expectorate and treated with four volumes of dithiothreitol for 15 min, followed by four volumes of Dulbecco's PBS solution. Third, the resulting suspension was filtered and centrifuged at 4°C. Finally, the supernatant was aspirated and stored in Eppendorf tubes at −80°C for later proteomic assay. Additionally, cell smears from the sediment were fixed in neutral formalin and stained with haematoxylin & eosin. Differential cell counts were determined by counting 400 inflammatory cells, while also assessing the quality of the samples. Sample viability and a cut-off of <40% squamous cells were the default for samples being made available for analysis.
Protein extraction for LC–MS/MS
The sputum supernatant samples were sent on dry ice to Jingjie PTM BioLab (Hangzhou) Co. Ltd. for mass spectrometry analysis. Liquid chromatography with tandem mass spectrometry (LC–MS/MS) was used to identify and quantify proteins in the sputum supernatant (see supplementary materials for detailed methods). The resulting MS/MS data were processed using the MaxQuant search engine (v.1.6.15.0). MS/MS spectra were searched against the human SwissProt database (20 422 entries) concatenated with a reverse decoy database. Trypsin/P was specified as the cleavage enzyme, allowing up to two missed cleavages. The mass tolerance for precursor ions was set at 20 ppm in the first search and 5 ppm in the main search, and the mass tolerance for fragment ions was set at 0.02 Da. Carbamidomethylation of Cys was specified as the fixed modification, and acetylation of the protein N terminus and oxidation of Met were specified as variable modifications. The false discovery rate (FDR) was adjusted to <1%.
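To make the search settings concrete, the sketch below shows how a tolerance in parts per million translates into an m/z window, and how a target-decoy search yields a simple FDR estimate (decoy hits divided by target hits above a score threshold). It is an illustration only: the function names are hypothetical, and the procedure implemented in MaxQuant is more involved than this.

```python
def ppm_window(mz, tolerance_ppm):
    """Return the (low, high) m/z bounds for a tolerance in parts per million."""
    delta = mz * tolerance_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm):
    """True if the observed m/z lies inside the ppm window around the theoretical m/z."""
    low, high = ppm_window(theoretical_mz, tolerance_ppm)
    return low <= observed_mz <= high


def target_decoy_fdr(matches, score_threshold):
    """Estimate FDR as decoy hits / target hits at or above the threshold.

    `matches` is a list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= score_threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= score_threshold and is_decoy)
    return decoys / targets if targets else 0.0
```

For a precursor at m/z 1000, a 20 ppm tolerance corresponds to a window of about ±0.02, and a 5 ppm tolerance to about ±0.005.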
GO and KEGG analyses
For the Gene Ontology (GO) biological process analysis, we first downloaded the gene set collection c5.go.bp.v7.4.symbols.gmt from the Molecular Signatures Database [12] as the background. For the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we obtained the latest KEGG pathway gene annotation from the KEGG REST API [13], which was used as the background. We then mapped the differentially expressed proteins (DEPs) between the clusters onto the background set and used the R package clusterProfiler (v.3.14.3) to perform the enrichment analysis. The Benjamini–Hochberg procedure was used for multiple-testing correction to control the FDR.
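Over-representation analysis of this kind is commonly based on a hypergeometric test followed by Benjamini–Hochberg adjustment. The sketch below shows both steps in plain Python as an illustration of the general technique; it does not reproduce clusterProfiler, and the function and argument names are assumptions.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of at least k pathway members among n DEPs,
    given K pathway members in a background of N proteins."""
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each pathway receives one hypergeometric p-value, and the adjustment is then applied across all pathways tested for a given cluster comparison.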
Statistical analysis
Statistical analyses of clinical variables were conducted using R software (v.3.1.2). The Shapiro–Wilk test was employed to assess the normality of the data distribution. Group comparisons of normally distributed variables were performed with ANOVA, adjusted for age and sex after log2 transformation. For non-normally distributed or ordinal variables, the Kruskal–Wallis test was applied, and qualitative variables were analysed with the chi-squared test. A Spearman correlation was used to assess associations between clinical variables and DEPs. Statistical significance was defined as p<0.05.
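For reference, a Spearman correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a minimal stand-alone illustration with hypothetical function names; it does not compute p-values and is not the R code used for the analyses.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            result[order[j]] = mean_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Because only ranks are used, any monotonic relationship, linear or not, gives a coefficient of 1 or −1.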
Results
Participants
543 of 545 patients with asthma (supplementary table S1) with a complete set of data for the eight variables used for the clustering were available for analysis. The distribution of the eight variables was not statistically different between the training (n=362) and validation sets (n=181) (table 1).
TABLE 1.
Comparison of training and validation sets of asthma participants
| | Training set | Validation set | p-value |
|---|---|---|---|
| Subjects with information | 362 | 181 | |
| Age at diagnosis, years | 42.7 (1.0–52.4) | 44.8 (1.6–53.3) | NS |
| Smoking status | | | NS |
| Nonsmoker | 288 (79.6) | 146 (80.7) | |
| Ex-smoker | 45 (12.4) | 20 (11.0) | |
| Current smoker | 29 (8.0) | 15 (8.3) | |
| Nasal polyps | 44 (12.2) | 29 (16.0) | NS |
| History of rhinosinusitis | 8 (2.2) | 6 (3.3) | NS |
| BMI, kg·m−2 | 24.2 (16.9–39.3) | 24.2 (16.4–33.3) | NS |
| FEV1, % pred | 64.5 (22.2–134.8) | 69.1 (21.4–122.6) | NS |
| FEV1/FVC, % | 60.5 (25–94.3) | 60.4 (28–88.3) | NS |
| Asthma Control Questionnaire | 1.6 (0–5.2) | 1.4 (0–4.4) | NS |
| Exacerbations in past 12 months | | | NS |
| No exacerbations | 149 (41.2) | 78 (43.1) | |
| With exacerbations | 213 (58.8) | 103 (56.9) | |
| OCS use | | | NS |
| On OCS | 23 (6.4) | 10 (5.5) | |
| Not on OCS | 339 (93.6) | 171 (94.5) | |
Data are presented as median (minimum–maximum) or n (%). NS: not significant; BMI: body mass index; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity; % pred: percentage of predicted value; OCS: oral corticosteroid. p-values were calculated using a Mann–Whitney U-test for continuous variables and Fisher's test for categorical variables.
Training set clusters
Consensus clustering on the training set was run to assess stability for a number of potential cluster numbers varying from two to 10. This resulted in the separation of three stable groups after resampling, as defined by a flat middle part of the consensus CDF (figure 1a), well-defined squares within the consensus matrix (figure 1b) and Δ(K) curves exhibiting an “elbow” (figure 1c). Figure 1d represents a heat map of distances between the participants in the three clusters.
FIGURE 1.
Clustering using the partition-around-medoids (PAM) algorithm on the training set. a) Consensus cumulative distribution function (CDF) of b) the consensus matrix. c) The relative change in area under the CDF curve with k=2 or 3 for optimal number of clusters. d) Heat map of pairwise distance between participants in the three clusters.
Three clusters of training set (T1–T3)
The three clusters are described in table 2. Cluster T1 was composed of predominantly nonsmoking female patients with severe asthma (90%) experiencing frequent exacerbations and a moderate degree of airflow obstruction, despite receiving high-dose inhaled corticosteroids (ICSs). Cluster T2 was similar to Cluster T1 in terms of the predominance of nonsmoking females, age and degree of airflow obstruction, but its members had experienced no exacerbations, included a lower percentage with severe asthma and were receiving moderate-dose ICSs. Cluster T3 was composed almost entirely of elderly male smokers or ex-smokers, all of whom had severe asthma of later onset, with severe airflow obstruction (mean FEV1, 59.2% pred) and an exacerbation history in 68%. There were no significant differences in terms of BMI, atopic status, total serum IgE, use of oral corticosteroid therapy, fractional exhaled nitric oxide (FENO), blood and sputum neutrophil and eosinophil count among the three clusters. However, the mean levels of blood eosinophil count, sputum eosinophil count and FENO, all markers of type 2 inflammation, were high in each of the three clusters. The mean levels of these markers for T1 were 0.3×10⁹·L−1, 26.7% and 42.7 ppb, respectively.
TABLE 2.
Comparison of clinical features of training Clusters T1, T2 and T3
| | T1 | T2 | T3 | p-value |
|---|---|---|---|---|
| Subjects with information | 161 | 125 | 76 | |
| Sex | | | | <0.001 |
| Female | 107 (66.5) | 81 (64.8) | 1 (1.3) | |
| Male | 54 (33.5) | 44 (35.2) | 75 (98.7) | |
| Age, years | 53.5 (19.6–73.6) | 53.7 (18.8–73) | 59.9 (26.3–74.3) | <0.001 |
| Age at diagnosis, years | 40.6 (0.4–65.5) | 41.5 (1.9–68) | 50.5 (0.1–64.9) | <0.001 |
| Smoking status | | | | <0.001 |
| Nonsmoker | 161 (100.0) | 124 (99.2) | 4 (5.3) | |
| Smoker/ex-smoker | 0 (0.0) | 1 (0.8) | 72 (94.7) | |
| BMI, kg·m−2 | 24.0 (17.2–36.7) | 23.8 (16.9–39.3) | 25.1 (17.8–34.4) | NS |
| FEV1, % pred | 64.0 (22.2–134.8) | 69.4 (27–131) | 57.0 (22.3–108.7) | 0.0071 |
| FEV1/FVC, % | 61.3 (31.7–94.3) | 61.3 (31.9–88.7) | 55.4 (25–82.6) | 0.024 |
| Asthma Control Questionnaire | 1.6 (0–5.2) | 1.4 (0–4) | 2 (0–5) | 0.016 |
| Exacerbations in past 12 months | | | | <0.001 |
| Zero | 0 (0.0) | 125 (100.0) | 24 (31.6) | |
| Once | 65 (40.4) | 0 (0) | 19 (25.0) | |
| Twice | 54 (33.5) | 0 (0) | 20 (26.3) | |
| ≥three times | 42 (26.1) | 0 (0) | 13 (17.1) | |
| Maintenance medications | | | | |
| OCS use | 14 (8.7) | 3 (2.4) | 6 (7.9) | NS |
| ICS dose, BDP, μg·24 h−1 | 1000.0 (200.0–2000.0) | 800.0 (100.0–2000.0) | 1000.0 (200.0–2000.0) | 0.002 |
| ICS/LABA | 159 (98.8) | 115 (92.0) | 76 (100.0) | NS |
| LAMA | 28 (17.4) | 14 (11.2) | 24 (31.6) | 0.001 |
| Singulair/LTRA | 50 (31.1) | 42 (33.6) | 26 (34.3) | NS |
| Xanthines | 14 (8.7) | 16 (12.8) | 11 (14.5) | NS |
| Chinese traditional medicine | 14 (8.7) | 15 (12.0) | 8 (10.5) | NS |
| Asthma subgroup | | | | <0.001 |
| Mild–moderate nonsmoking asthma | 16 (10.0) | 44 (35.2) | 0 (0.0) | |
| Severe nonsmoking asthma | 144 (89.4) | 80 (64.0) | 4 (5.3) | |
| Smokers and ex-smokers with severe asthma | 1 (0.6) | 1 (0.8) | 72 (94.7) | |
| Subjects with FENO | 155 | 102 | 72 | |
| FENO, ppb | 28.0 (5.0–300.0) | 34.0 (5.0–238.0) | 35.0 (7.0–168.0) | NS |
| Subjects with blood results | 161 | 125 | 76 | |
| Serum IgE total, kU·L−1 | 144.0 (6.5–4986.0) | 156.0 (2.1–3314.0) | 211.5 (6.2–5000.0) | NS |
| Blood neutrophil count, ×10⁹·L−1 | 3.7 (1.5–9.5) | 3.9 (1.6–44.1) | 4.2 (1.6–50.8) | NS |
| Blood eosinophil count, ×10⁹·L−1 | 0.2 (0.0–2.3) | 0.2 (0.0–1.6) | 0.2 (0.0–1.6) | NS |
| ECP, μg·L−1 | 6.8 (2–62.5) | 7.8 (2–164) | 7.1 (2.1–132) | NS |
| Subjects with sputum results | 69 | 70 | 40 | |
| Sputum neutrophils, % | 57.8 (0.3–97.3) | 60.8 (6.6–96.7) | 59.8 (0–97.3) | NS |
| Sputum eosinophils, % | 11.2 (0.3–95.7) | 11.4 (0–82.2) | 9.5 (0.7–99.2) | NS |
| Subjects with atopy | 157 | 122 | 76 | |
| Atopy | | | | 0.027 |
| Present | 77 (49.0) | 71 (58.2) | 51 (67.1) | |
| Absent | 80 (51.0) | 51 (41.8) | 25 (32.9) | |
Data are presented as median (minimum–maximum) or n (%). BMI: body mass index; NS: not significant; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity; % pred: percentage of predicted value; OCS: oral corticosteroid; ICS: inhaled corticosteroid; BDP: beclomethasone dipropionate or equivalent dose; LAMA: long-acting muscarinic antagonist; LABA: long-acting β-agonist; LTRA: leukotriene receptor antagonist; FENO: fractional exhaled nitric oxide; ppb: parts per billion; IgE: immunoglobulin E; ECP: eosinophilic cationic protein. p-values were calculated using the Kruskal–Wallis rank-sum test for continuous variables and Fisher's test for categorical variables.
Validation set clusters (V1–V3)
The validation analysis yielded three relatively stable clusters after resampling (denoted V1, V2 and V3 to align with the training set). When the training and validation clusters were matched on the basis of the fewest statistical differences in clinical variables, Cluster V1 was found to be similar to Cluster T1, Cluster V2 to Cluster T2, and Cluster V3 to Cluster T3 (table 3 and supplementary table S2). For ease of recall, Clusters T1 and V1 are referred to as Cluster 1, Clusters T2 and V2 as Cluster 2 and Clusters T3 and V3 as Cluster 3. The distribution of the main clinical characteristics of the training and validation clusters was similar (figures 2 and 3).
TABLE 3.
Comparison of clinical features of validation Clusters V1, V2 and V3
| | V1 | V2 | V3 | p-value |
|---|---|---|---|---|
| Subjects with information | 79 | 66 | 36 | |
| Sex | | | | <0.001 |
| Female | 61 (77.2) | 42 (63.6) | 2 (5.6) | |
| Male | 18 (22.8) | 24 (36.4) | 34 (94.4) | |
| Age, years | 57.4 (26.1–73.3) | 53.9 (27.2–72.1) | 57.5 (23–75.9) | 0.046 |
| Age at diagnosis, years | 46.4 (1.9–67.7) | 41.4 (1.6–65.6) | 46.3 (4.5–75.6) | NS |
| Smoking status | | | | <0.001 |
| Nonsmoker | 79 (100.0) | 66 (100.0) | 1 (2.8) | |
| Smoker | 0 (0) | 0 (0) | 35 (97.2) | |
| BMI, kg·m−2 | 23.8 (16.4–32.8) | 25 (16.9–33.3) | 24.4 (16.6–32.7) | NS |
| FEV1, % pred | 68 (21.4–107.3) | 74.4 (30.7–122.6) | 57.6 (31.5–121.7) | 0.012 |
| FEV1/FVC, % | 59.5 (28–86.6) | 67.4 (36.1–88.3) | 57.8 (37.5–83.8) | 0.002 |
| Asthma Control Questionnaire | 1.4 (0–4.4) | 1.1 (0–4.4) | 1.9 (0–4) | NS |
| Exacerbations in past 12 months | | | | <0.001 |
| 0 | 3 (3.8) | 62 (94.0) | 13 (36.1) | |
| 1 | 41 (51.9) | 2 (3.0) | 12 (33.3) | |
| 2 | 20 (25.3) | 2 (3.0) | 7 (19.5) | |
| ≥3 | 15 (19.0) | 0 (0) | 4 (11.1) | |
| Maintenance medications | | | | |
| OCS use | 5 (6.3) | 1 (1.5) | 4 (11.1) | NS |
| ICS dose, BDP, μg·24 h−1 | 1000.0 (200.0–2000.0) | 800.0 (200.0–4000.0) | 800.0 (400.0–2000.0) | 0.001 |
| ICS/LABA | 71 (89.9) | 58 (87.9) | 36 (100.0) | NS |
| LAMA | 7 (8.9) | 5 (7.6) | 5 (13.8) | 0.001 |
| Singulair/LTRA | 20 (25.3) | 29 (43.9) | 20 (55.5) | 0.005 |
| Xanthines | 10 (12.7) | 9 (13.6) | 5 (13.8) | NS |
| Chinese traditional medicine | 3 (3.8) | 11 (16.7) | 5 (13.8) | 0.026 |
| Asthma subgroup | | | | <0.001 |
| Mild–moderate nonsmoking asthma | 10 (12.6) | 22 (33.3) | 0 (0) | |
| Severe nonsmoking asthma | 69 (87.4) | 44 (66.7) | 1 (2.8) | |
| Smokers and ex-smokers with severe asthma | 0 (0) | 0 (0) | 35 (97.2) | |
| Subjects with FENO | 75 | 63 | 35 | |
| FENO, ppb | 30.0 (8–218) | 30.0 (7–169) | 22.5 (7–124) | NS |
| Subjects with blood results | 76 | 63 | 31 | |
| Serum IgE total, kU·L−1 | 160.0 (3.4–2005.0) | 161.0 (4.2–2130.0) | 223.0 (9.1–1828.0) | NS |
| Blood neutrophil count, ×10⁹·L−1 | 3.7 (1.7–8.9) | 3.6 (1–8.8) | 4.1 (2.2–8.7) | 0.034 |
| Blood eosinophil count, ×10⁹·L−1 | 0.2 (0.0–1.7) | 0.2 (0.0–3.7) | 0.3 (0.0–1.6) | NS |
| ECP, μg·L−1 | 6.8 (2.1–90.8) | 7.8 (2–69.1) | 9 (2–30.2) | NS |
| Subjects with sputum results | 40 | 38 | 21 | |
| Sputum neutrophils, % | 37.4 (2.3–100) | 61.4 (1.7–96.1) | 56.5 (4.2–96.5) | NS |
| Sputum eosinophils, % | 22.7 (0.0–94.6) | 5.7 (0.2–86.4) | 9.4 (1–89.6) | NS |
| Subjects with atopy | 79 | 65 | 36 | |
| Atopy | | | | NS |
| Present | 35 (44.3) | 33 (50.8) | 24 (66.7) | |
| Absent | 44 (55.7) | 32 (49.2) | 12 (33.3) | |
Data are presented as median (minimum–maximum) or n (%). NS: not significant; BMI: body mass index; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity; % pred: percentage of predicted value; OCS: oral corticosteroid; ICS: inhaled corticosteroid; BDP: beclomethasone dipropionate or equivalent dose; LABA: long-acting β-agonist; LAMA: long-acting muscarinic antagonist; LTRA: leukotriene receptor antagonist; FENO: exhaled nitric oxide; ppb: parts per billion; IgE: immunoglobulin E; ECP: eosinophilic cationic protein. p-values were calculated using Kruskal–Wallis rank-sum test for continuous variables and Fisher's test for categorical variables.
FIGURE 2.
Box and dot plots of the eight clinicophysiological variables used in the clustering of the training and validation sets. a) Age at asthma diagnosis. b) Pack-years of cigarette smoking. c) Body mass index (BMI). d) Forced expiratory volume in 1 s (FEV1) as a percentage of predicted value (FEV1 % pred). e) FEV1/forced vital capacity (FVC) ratio. f) The average score of the first five questions of the Asthma Control Questionnaire (ACQ5). g) The self-reported number of exacerbations in the past 12 months. h) The daily dose of oral prednisolone or equivalent. Data are shown as median (interquartile range). OCS: oral corticosteroid; ppb: parts per billion.
FIGURE 3.
Box and dot plots of the main biomarkers in the training and validation sets of Clusters 1, 2 and 3. a) Blood neutrophil count. b) Blood eosinophil count. c) Serum eosinophil cationic protein (ECP). d) Serum total immunoglobulin E (TIgE). e) Fractional exhaled nitric oxide (FENO). f) Sputum neutrophils. g) Sputum eosinophils. Data are shown as median (interquartile range).
Differential protein abundance in sputum supernatants
We analysed the differential protein abundance of Clusters 1, 2 and 3. Sputum supernatant proteomics data were available for 335 participants (136 in Cluster 1, 119 in Cluster 2 and 80 in Cluster 3). 43 DEPs were found between Cluster 1 and Cluster 2 (25 downregulated and 18 upregulated) (figure 4a). GO analysis showed that the DEPs were mainly involved in biological processes such as small molecule metabolic process (CD74, CA6, TARS1, VARS1, DLST, APOBR, AIMP1, KPNB1, KARS1, DECR1, LPCAT2, MGLL, DERA) and immune effector process (IGHV3–73, CD74, CEACAM6, CLC, C1QBP, AIMP1, KPNB1, KARS1, CYFIP1, HVCN1, DERA) (figure 4d). KEGG analysis indicated that these DEPs were not involved in any specific pathway. In Cluster 1, galectin 10 (CLC) was positively correlated with exacerbation rates (r=0.26, p<0.01), FENO (r=0.46, p<0.001), blood eosinophil count (r=0.51, p<0.001), sputum eosinophil % (r=0.70, p<0.001) and daily oral corticosteroid (OCS) dose (r=0.21, p<0.05), but negatively correlated with sputum neutrophil % (r=−0.52, p<0.001) (figure 5a). Similarly, in Cluster 2, galectin 10 was positively correlated with FENO (r=0.27, p<0.01), blood eosinophil count (r=0.52, p<0.001) and sputum eosinophil % (r=0.74, p<0.001), and negatively correlated with Asthma Quality of Life Questionnaire score (r=−0.19, p<0.05) and sputum neutrophil % (r=−0.37, p<0.001) (figure 5b).
FIGURE 4.
Proteomic analysis of sputum supernatants. Volcano plots for differentially expressed proteins (DEPs) between a) Clusters 1 and 2, b) Clusters 1 and 3, and c) Clusters 2 and 3. Green triangles represent downregulated proteins (fold change ≤0.66) and red triangles represent upregulated proteins (fold change ≥1.5). d–f) Gene Ontology (GO) biological process enrichment of DEPs in comparison of the three clusters. The size of the circle represents the number of DEPs and the colour of the circle represents the p-value. g, h) Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment of DEPs in the three clusters. The size of the circle represents the number of DEPs and the colour of the circle represents the p-value.
FIGURE 5.
Heat map of the correlations between differentially expressed proteins (DEPs) and clinicophysiological and biomarker parameters in a) Cluster 1 and b) Cluster 2. The colour represents the correlation coefficient (blue to red indicates −1 to 1), the horizontal axis shows the clinicophysiological and biomarker parameters, the vertical axis shows the DEPs and asterisks indicate a statistically significant correlation between DEPs and clinicophysiological and biomarker parameters. ACQ5: Asthma Control Questionnaire 5; AQLQ: Asthma Quality of Life Questionnaire; BNC: blood neutrophil count; BEC: blood eosinophil count; SNC: sputum neutrophil count; SEC: sputum eosinophil count; SMC: sputum macrophage count; SLC, sputum lymphocyte count; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity; OCS: oral corticosteroid. *: p<0.05; **: p<0.01; ***: p<0.001.
133 DEPs were obtained between Cluster 1 and Cluster 3, with 65 downregulated and 68 upregulated (figure 4b). GO analysis indicated that the DEPs were mainly involved in defence response (36 DEPs) and immune effector process (33 DEPs) (figure 4e). KEGG analysis showed that the DEPs were mainly involved in complement and coagulation cascades (F2, SERPINA1, A2M, C1QC, F11, KLKB1, SERPINB2, SERPINA5, C8A, C8B, C8G, SERPINF2, CPB2) (figure 4g). Most of the DEPs involved in complement and coagulation cascades were positively correlated with all markers of type 2 inflammation and asthma severity (supplementary figure S1a,b).
100 DEPs were found between Cluster 2 and Cluster 3, with 47 downregulated and 53 upregulated (figure 4c). GO analysis again identified immune effector process (30 DEPs) and defence response (26 DEPs) as the main biological processes involved (figure 4f). By KEGG analysis, these DEPs were involved in complement and coagulation cascades (C8G, C1QC, A2M, MBL2, CPB2) and cholesterol metabolism (APOA1, APOE, APOA2, APOB) (figure 4h). Most of these DEPs were also positively correlated with all markers of type 2 inflammation (supplementary figure S1c,d).
Discussion
Using partition-around-medoids analysis of the C-BIOPRED cohort of predominantly severe asthma, we identified three clinical phenotypes of asthma that were distinguished by smoking status (nonsmokers versus current/ex-smokers), the degree of airflow obstruction and the previous history of exacerbations. Thus, one distinct phenotype consisted almost exclusively of male current/ex-smokers with the greatest airflow obstruction, most of whom had had at least one exacerbation in the previous year (Cluster T3). The other two phenotypes were similar in terms of the female to male ratio (2:1), the degree of asthma control and the presence of airflow obstruction, but differed in that all participants in Cluster T1 had experienced exacerbations whereas none in Cluster T2 had, and in that Cluster T1 had a higher % of severe nonsmoking asthma (89.4% versus 64.0%) and a higher % of patients on OCS therapy (8.7% versus 2.4%). Interestingly, these three phenotypes were not distinguished by the biomarkers of serum IgE, blood and sputum neutrophil and eosinophil counts, and exhaled nitric oxide levels. Indeed, the biomarkers indicative of type 2-high inflammation, particularly exhaled nitric oxide, serum IgE and sputum eosinophil counts, were in the high range in all three clusters; this was particularly so for sputum eosinophil counts, with median values ranging from 9.5% to 11.4% across the three phenotypes, indicating the high prevalence of eosinophilia in these phenotypes. These three phenotypes were also reproduced in the validation set. This suggests that current clinical markers may not be the most appropriate markers for these groups.
We analysed the proteomic constitution of the sputum supernatants of each of the three defined clusters by LC–MS/MS. The differences between Cluster 1 and Cluster 2, which differed in terms of severity of asthma in nonsmoking people with asthma, were related to proteins involved in metabolic and immune effector processes, of which galectin 10/Charcot–Leyden crystal (CLC) was the only one that correlated positively with biomarkers of type 2 inflammation, namely FENO, blood eosinophil count and sputum eosinophil count, in both Cluster 1 and Cluster 2. Interestingly, in Cluster 1, the number of exacerbations in the previous year was also positively correlated with galectin 10. Thus, differences in galectin 10/CLC underlie the severity of asthma between Cluster 1 and Cluster 2, with differences mainly attributable to the exacerbations in Cluster 1 and their absence in Cluster 2. Although Cluster 1 could be considered as the severe eosinophilic asthma that has been described previously [14], Cluster 2 has a similar range of sputum eosinophils as Cluster 1 and would also fall within the premise of severe eosinophilic asthma; however, the higher levels of sputum galectin 10 represent evidence of more-activated eosinophils in Cluster 1 compared with Cluster 2. Galectin 10 represents 7–10% of the cytoplasmic content of the eosinophil [15] and has been implicated in the pathogenesis of severe eosinophilic asthma, as CLCs formed from galectin 10, when injected with ovalbumin into mice, resulted in dendritic cell uptake and type 2 inflammatory responses [16]. In addition, galectin 10/CLC has been reported previously to be higher in sputum samples of patients with eosinophilic asthma [17], as we have shown here, but was particularly linked to patients with exacerbations.
The DEPs in sputum between Cluster 1 and Cluster 3 are most interesting because these two clusters represent similar degrees of airflow obstruction and exacerbations but with only smokers/ex-smokers in Cluster 3 and only nonsmokers in Cluster 1. These DEPs mainly reflected an overexpression of proteins of the complement and coagulation cascade in Cluster 1 compared with Cluster 3, such as the serpins and complement C1 and C8. In the comparison of the DEPs between Cluster 2, which is like Cluster 1 but with no experience of exacerbations, and Cluster 3, the complement and coagulation cascade was also overexpressed in Cluster 2, together with an overexpression of proteins associated with cholesterol metabolism. In both Clusters 1 and 2, there was a strong positive correlation between the proteins of the complement and coagulation pathways and FENO, blood and sputum eosinophil count, and to some extent asthma control, exacerbations and daily OCS dose, particularly in Cluster 1. Activation of both coagulation and complement pathways has been described in asthma [18, 19]. Complement C3a and C5a levels are increased in bronchoalveolar lavage fluid after segmental airway allergen challenge of people with asthma [20], while C1q, part of the C1 complex, is a regulatory dendritic cell marker that has been shown to be a potent inhibitor of allergic inflammation [21].
The higher levels of the serpins A1 (α1-antitrypsin), F2 (α2-antiplasmin), B2 (plasminogen activator inhibitor 2; PAI-2) and E1 (plasminogen activator inhibitor 1; PAI-1), and also of α2-macroglobulin (A2M), in Cluster 1 compared with Cluster 3 raise the possibility of their involvement in the degradation of fibrin by plasmin into fibrin degradation products. Smoking has been shown to be associated with impairment of the capacity of the endothelium to release tissue-type plasminogen activator (tPA) [22], without an effect on PAI-1 production [23], thus increasing the risk of thrombus formation in smokers. However, the fact that there were greater PAI-1 levels in nonsmoking severe asthma (Cluster 1) versus smoking/ex-smoking severe asthma (Cluster 3) suggests that those in Cluster 3 would be less protected against thrombosis. These coagulation and complement proteins show a good correlation with the inflammatory markers of type 2 inflammation in both clusters. Recent studies on asthma phenotypes have yielded similar results [24, 25], suggesting that these proteins may be involved in type 2 inflammatory pathways and the pathogenesis of asthma, warranting further investigation. The DEPs between Cluster 2 and Cluster 3 involve a group of apolipoproteins (APOA1, APOA2, APOE and APOB), which have immunomodulatory roles with protective effects [26]. For example, APOE may have preventive effects in inhibiting Th17 cells and the release of interleukin-33 from epithelial cells, while at the same time reducing Th2 cytokines and IgE levels [27]. The significance of the upregulation of these apolipoproteins in Cluster 2 versus Cluster 3 remains unclear, but suggests that smoking may inhibit their protective immunomodulatory properties in severe asthma.
This study represents the first clinical clustering of patients with severe asthma in a Chinese population. The phenotypes obtained in this C-BIOPRED cohort have some similarities to those reported from the European U-BIOPRED [4] and US SARP [28] cohorts. With the recruiting protocol and clustering analysis used in C-BIOPRED, we identified a cluster of predominantly female patients with a high prevalence of exacerbations, in contrast to the U-BIOPRED cluster of similar patients but without exacerbations. In both studies, there was a cluster that included the current/ex-smoker group with late-onset asthma and severe airflow obstruction. In addition, in comparison with the SARP cohort [28], there were also two clusters of severe airflow obstruction with high use of daily OCS therapy (Clusters 4 and 5 of SARP). The differences in the characteristics of the respective clusters in C-BIOPRED compared with U-BIOPRED and SARP may reflect differences in the characteristics of people with severe asthma in these populations. Thus, Chinese people with severe asthma experienced fewer exacerbations, were less obese, had mostly late-onset asthma, had higher levels of sputum eosinophilia and made less use of daily OCS therapy, with the current smoker/ex-smoker group consisting mostly of male current smokers [7, 29]. The low number of patients treated with OCS and the low frequency of asthma exacerbations are noteworthy. The definition of asthma exacerbations, particularly those defined by a course of OCS, may be closely related to healthcare organisation and physician prescribing behaviour, which differ between countries [30, 31]. Such international variations likely reflect heterogeneity in asthma management practices and resource allocation among nations, underscoring the importance of considering region-specific factors when devising asthma treatment strategies globally.
Although this study has several strengths, including an internal validation subset, we are aware of several limitations, including the lack of validation of the clinical phenotypes in an external cohort. However, as discussed above, some aspects of the clinical phenotypes obtained in C-BIOPRED were reflected in both the U-BIOPRED and SARP cohorts. We did not have external validation of the proteomics analysis, and we did not find significant overlap in the proteins separating the clinical phenotypes between C-BIOPRED and U-BIOPRED [32]. However, plasma levels of C1, C9, CF1 and C8-γ-chain were elevated in the U-BIOPRED severe asthma patients compared with the mild–moderate and healthy control subjects [33]. On the other hand, our analysis has strengthened the evidence for a type 2 inflammatory association in this Chinese cohort of severe asthma, namely the link of type 2 biomarkers, particularly sputum eosinophils and FENO, with galectin 10 and the complement and coagulation pathways. While the mechanisms linking galectin 10 to asthma severity and exacerbations have been subject to intense investigation, those involving the complement and coagulation pathways, particularly their potential link to type 2 inflammation, deserve further attention. Why these pathways differentiate the three clinical phenotypes of C-BIOPRED is an area for future investigation.
Acknowledgements
We are grateful to all the members of the C-BIOPRED Study Group (listed in the supplementary material) who took part in the design and in the recruitment of participants into the study. We thank Yicheng Fan, Yujing Liu, Assam Nkouibert Prysewley Assam, Zhaohui Zhou, Wei Jiang and Jia Wang of AstraZeneca for their help in data and statistical analyses, and in the preparation of the figures and tables. We thank all the participants across China who willingly took part in the C-BIOPRED study.
Footnotes
Provenance: Submitted article, peer reviewed.
Ethics statement: C-BIOPRED is a multicentre prospective study that recruited patients with severe asthma from 15 provinces in China between 2015 and 2018. The study was approved by the ethics committee for each participating clinical institution. All participants gave written and signed informed consent.
Conflict of interest: K.F. Chung reports personal fees from attending advisory board meetings with GSK, AZ, Novartis, Roche, Merck, Trevi, Rickett-Beckinson, Nocion and Shionogi; is a scientific adviser to The Clean Breathing Institute supported by Haleon; reports personal fees for speaking at meetings supported by GSK, Sanofi, Novartis and AZ; and, through his institution, has received research funding from Merck and GSK. The other authors have no relevant conflicts of interest.
Support statement: This study was supported by AstraZeneca (China), the National Natural Science Foundation of China (82070026), Zhongnanshan Medical Foundation of Guangdong Province (ZNSA-2020013, ZNSXS-20220083 and ZNSXS-20240005), Guangzhou Science and Technology Plan Project and Zhongnanshan Medical Foundation of Guangdong Province (202102010355 and ZNSA-2020003), Clinical and Epidemiological Research Project of State Key Laboratory of Respiratory Disease (SKLRD-L-202404), and Major Clinical Research Project of the Guangzhou Medical University Research Ability Enhancement Program (GMUCR2024–01010). Funding information for this article has been deposited with the Crossref Funder Registry.
References
- 1. Huang K, Yang T, Xu J, et al. Prevalence, risk factors, and management of asthma in China: a national cross-sectional study. Lancet 2019; 394: 407–418. doi: 10.1016/S0140-6736(19)31147-X
- 2. Chung KF, Wenzel SE, Brozek JL, et al. International ERS/ATS guidelines on definition, evaluation and treatment of severe asthma. Eur Respir J 2014; 43: 343–373. doi: 10.1183/09031936.00202013
- 3. Chung KF, Adcock IM. Precision medicine for the discovery of treatable mechanisms in severe asthma. Allergy 2019; 74: 1649–1659. doi: 10.1111/all.13771
- 4. Lefaudeux D, De Meulder B, Loza MJ, et al. U-BIOPRED clinical adult asthma clusters linked to a subset of sputum omics. J Allergy Clin Immunol 2017; 139: 1797–1807. doi: 10.1016/j.jaci.2016.08.048
- 5. Kim TB, Jang AS, Kwon HS, et al. Identification of asthma clusters in two independent Korean adult asthma cohorts. Eur Respir J 2013; 41: 1308–1314. doi: 10.1183/09031936.00100811
- 6. Park SY, Fowler S, Shaw DE, et al. Comparison of asthma phenotypes in severe asthma cohorts (SARP, U-BIOPRED, ProAR and COREA) from 4 continents. Allergy Asthma Immunol Res 2024; 16: 338–352. doi: 10.4168/aair.2024.16.4.338
- 7. Zhang Q, Fu X, Wang C, et al. Severe eosinophilic asthma in Chinese C-BIOPRED asthma cohort. Clin Transl Med 2022; 12: e710. doi: 10.1002/ctm2.710
- 8. Box GE, Cox DR. An analysis of transformations. J R Stat Soc Ser B Method 1964; 26: 211–243. doi: 10.1111/j.2517-6161.1964.tb00553.x
- 9. Fox J, Weisberg S. An R Companion to Applied Regression. Sage Publications, 2018.
- 10. Kaufman L, Rousseeuw P. Clustering by means of medoids. In: Dodge Y, ed. Statistical Data Analysis Based on the L1–Norm and Related Methods. North-Holland, 1987; pp. 405–416.
- 11. Monti S, Tamayo P, Mesirov J, et al. Consensus clustering: a resampling-based method for class discovery and visualisation of gene expression microarray data. Mach Learn 2003; 52: 91–118. doi: 10.1023/A:1023949509487
- 12. Liberzon A, Subramanian A, Pinchback R, et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011; 27: 1739–1740. doi: 10.1093/bioinformatics/btr260
- 13. Kyoto Encyclopedia of Genes and Genomes. KEGG REST API. Date last accessed: 28 September 2023. Date last updated: 1 March 2023. https://www.kegg.jp/kegg/rest/keggapi.html
- 14. Buhl R, Humbert M, Bjermer L, et al. Severe eosinophilic asthma: a roadmap to consensus. Eur Respir J 2017; 49: 1700634. doi: 10.1183/13993003.00634-2017
- 15. Weller PF, Bach DS, Austen KF. Biochemical characterization of human eosinophil Charcot-Leyden crystal protein (lysophospholipase). J Biol Chem 1984; 259: 15100–15105. doi: 10.1016/S0021-9258(17)42520-8
- 16. Persson EK, Verstraete K, Heyndrickx I, et al. Protein crystallization promotes type 2 immunity and is reversible by antibody treatment. Science 2019; 364: eaaw4295. doi: 10.1126/science.aaw4295
- 17. Nyenhuis SM, Alumkal P, Du J, et al. Charcot-Leyden crystal protein/galectin-10 is a surrogate biomarker of eosinophilic airway inflammation in asthma. Biomark Med 2019; 13: 715–724. doi: 10.2217/bmm-2018-0280
- 18. de Boer JD, Majoor CJ, van 't Veer C, et al. Asthma and coagulation. Blood 2012; 119: 3236–3244.
- 19. Zhang X, Köhl J. A complex role for complement in allergic asthma. Expert Rev Clin Immunol 2010; 6: 269–277. doi: 10.1586/eci.09.84
- 20. Krug N, Tschernig T, Erpenbeck VJ, et al. Complement factors C3a and C5a are increased in bronchoalveolar lavage fluid after segmental allergen provocation in subjects with asthma. Am J Respir Crit Care Med 2001; 164: 1841–1843. doi: 10.1164/ajrccm.164.10.2010096
- 21. Mascarell L, Airouche S, Berjont N, et al. The regulatory dendritic cell marker C1q is a potent inhibitor of allergic inflammation. Mucosal Immunol 2017; 10: 695–704. doi: 10.1038/mi.2016.87
- 22. Newby DE, Wright RA, Labinjoh C, et al. Endothelial dysfunction, impaired endogenous fibrinolysis, and cigarette smoking: a mechanism for arterial thrombosis and myocardial infarction. Circulation 1999; 99: 1411–1415. doi: 10.1161/01.CIR.99.11.1411
- 23. Barua RS, Ambrose JA, Saha DC, et al. Smoking is associated with altered endothelial-derived fibrinolytic and antithrombotic factors: an in vitro demonstration. Circulation 2002; 106: 905–908. doi: 10.1161/01.CIR.0000029091.61707.6B
- 24. Zahraei HN, Schleich F, Gerday S, et al. A clustering analysis of eosinophilic asthmatics: two clusters with sharp differences in atopic status and disease severity. Clin Exp Allergy 2023; 53: 672–678. doi: 10.1111/cea.14312
- 25. Zahraei HN, Schleich F, Louis G, et al. Evidence for 2 clusters among patients with noneosinophilic asthma. Ann Allergy Asthma Immunol 2024; 133: 57–5p. doi: 10.1016/j.anai.2024.03.012
- 26. Ghosh S, Rihan M, Ahmed S, et al. Immunomodulatory potential of apolipoproteins and their mimetic peptides in asthma: current perspective. Respir Med 2022; 204: 107007. doi: 10.1016/j.rmed.2022.107007
- 27. Yao X, Gordon EM, Figueroa DM, et al. Emerging roles of apolipoprotein E and apolipoprotein A-I in the pathogenesis and treatment of lung disease. Am J Respir Cell Mol Biol 2016; 55: 159–169. doi: 10.1165/rcmb.2016-0060TR
- 28. Moore WC, Meyers DA, Wenzel SE, et al. Identification of asthma phenotypes using cluster analysis in the Severe Asthma Research Program. Am J Respir Crit Care Med 2010; 181: 315–323. doi: 10.1164/rccm.200906-0896OC
- 29. Dong C, Yang X, Luo W, et al. Influence of sex, cigarette smoking and airway inflammation on treatable traits in CBIOPRED severe asthma. Clin Transl Allergy 2022; 12: e12189. doi: 10.1002/clt2.12189
- 30. Lee TY, Price D, Yadav CP, et al. International variation in severe exacerbation rates in patients with severe asthma. Chest 2024; 166: 28–38. doi: 10.1016/j.chest.2024.02.029
- 31. Van Ganse E, Louis R. International severe asthma registry: closer to the full picture of asthma care and outcomes? Chest 2024; 166: 3–4. doi: 10.1016/j.chest.2024.04.010
- 32. Schofield JPR, Burg D, Nicholas B, et al. Stratification of asthma phenotypes by airway proteomic signatures. J Allergy Clin Immunol 2019; 144: 70–82. doi: 10.1016/j.jaci.2019.03.013
- 33. Sparreman Mikus M, Kolmert J, Andersson LI, et al. Plasma proteins elevated in severe asthma despite oral steroid use and unrelated to type-2 inflammation. Eur Respir J 2022; 59: 2100142. doi: 10.1183/13993003.00142-2021






