Abstract
Background
The current interstitial lung disease (ILD) classification has overlapping clinical presentations and outcomes. Cluster analysis modeling is a valuable tool in identifying distinct clinical phenotypes in heterogeneous diseases. However, this approach has yet to be implemented in ILD.
Methods
Using cluster analysis, novel ILD phenotypes were identified among subjects from a longitudinal ILD cohort, and outcomes were stratified according to phenotypic clusters compared with subgroups according to current American Thoracic Society/European Respiratory Society ILD classification criteria.
Results
Among subjects with complete data for baseline variables (N = 770), four clusters were identified. Cluster 1 (ie, younger white obese female subjects) had the highest baseline FVC and diffusion capacity of the lung for carbon monoxide (Dlco). Cluster 2 (ie, younger African-American female subjects with elevated antinuclear antibody titers) had the lowest baseline FVC. Cluster 3 (ie, elderly white male smokers with coexistent emphysema) had intermediate FVC and Dlco. Cluster 4 (ie, elderly white male smokers with severe honeycombing) had the lowest baseline Dlco. Compared with classification according to ILD subgroup, stratification according to phenotypic clusters was associated with significant differences in monthly FVC decline (Cluster 4, –0.30% vs Cluster 2, 0.01%; P < .0001). Stratification by using clusters also independently predicted progression-free survival (P < .001) and transplant-free survival (P < .001).
Conclusions
Among adults with diverse chronic ILDs, cluster analysis using baseline characteristics identified four distinct clinical phenotypes that might better predict meaningful clinical outcomes than current ILD diagnostic criteria.
Key Words: cluster, interstitial lung disease, mortality, phenotype, pulmonary fibrosis
Abbreviations: ANA, antinuclear antibody; ATS, American Thoracic Society; CHP, chronic hypersensitivity pneumonitis; CTD, connective tissue disease; CTD-ILD, connective tissue disease-associated interstitial lung disease; Dlco, diffusing capacity of the lungs for carbon monoxide; ERS, European Respiratory Society; GAP-ILD, gender, age, physiology-interstitial lung disease; HR, hazard ratio; HRCT, high-resolution CT; ILD, interstitial lung disease; IPAF, interstitial pneumonia with autoimmune features; IPF, idiopathic pulmonary fibrosis; NSIP, nonspecific interstitial pneumonia; PA, pulmonary artery; PAM, partitioning around medoids; PFS, progression-free survival; PFT, pulmonary function test; SLB, surgical lung biopsy; TFS, transplant-free survival
Interstitial lung diseases (ILDs) are a heterogeneous group of pulmonary disorders characterized by architectural distortion and lung function impairment.1 The approach to ILD has evolved over time, attempting to improve diagnostic precision while decreasing the need for invasive procedures.2 Based on observed disease behavior, the most recent 2013 update of the American Thoracic Society (ATS)/European Respiratory Society (ERS) guidelines classifies ILDs into diagnostic subgroups that often overlap in their presentation and prognosis, making the clinical application of this recommended diagnostic algorithm challenging.1, 2
A confident diagnosis is often limited by the inability of patients to undergo surgical lung biopsy or the presence of discordant radiologic and histopathologic patterns.3, 4 Such realities preclude the ability to diagnose patients with a specific ILD, leaving them without a clear prognosis or treatment options. In fact, some of these individuals are deemed to have “unclassifiable” ILD.1, 2 In addition, a significant subset of patients with ILDs exhibits serologic and clinical features suggestive of an underlying autoimmune process but do not meet defined criteria for a connective tissue disease (CTD). Various terminologies with subtly different criteria have been used to describe this subset of patients, including undifferentiated CTD-associated ILD (CTD-ILD), lung-dominant CTD, and autoimmune-featured ILD.5, 6, 7, 8 Recent ATS/ERS guidelines designate these patients as having interstitial pneumonia with autoimmune features (IPAF).9 These diagnostic difficulties, along with tremendous variability in disease course within and between different ILDs, limit the utility of the current classification in stratifying patients into clinically meaningful subgroups with uniform outcomes over time.10, 11, 12, 13
Statistical cluster analysis techniques have proven valuable in identifying homogeneous clusters of patients with shared clinical characteristics within pulmonary diseases such as COPD, asthma, and bronchiectasis.14, 15, 16 Because ILDs may benefit from a similar approach, we conducted an innovative cluster analysis in a large longitudinal cohort of patients with chronic ILDs to identify unique ILD phenotypes based on clinical characteristics, serologic data, lung function, and radiographic features. We hypothesized that application of a cluster analysis approach would identify more homogeneous ILD phenotypes than the current ILD classification system regarding clinically meaningful outcomes, and it could provide a foundation for improved understanding of ILD pathogenesis, disease progression, and optimizing approach to management.
Methods
Study Design and Patient Selection
Subjects in the present analysis are from the University of Chicago ILD Registry, a longitudinal ILD cohort in which data are collected prospectively. The University of Chicago Institutional Review Board approved this investigation (institutional review board protocols #14163-A; #16-1062), and all patients signed informed consent forms.
Patients followed up at our institution between 2006 and 2015 with a multidisciplinary diagnosis of chronic ILD according to ATS/ERS criteria2, 9, 17, 18, 19 were screened. Subjects with idiopathic pulmonary fibrosis (IPF), IPAF, CTD-ILD, chronic hypersensitivity pneumonitis (CHP), and unclassifiable idiopathic interstitial pneumonias were identified and eligible for study inclusion. Multidisciplinary diagnosis of ILD at our institution is performed in a rigorous fashion in conjunction with rheumatologists, dedicated chest radiologists, and a thoracic pathologist. Patients with CTD-ILD were required to have a multidisciplinary diagnosis of CTD-ILD for inclusion in the study. The vast majority of the CHP cohort (> 90%) had either consistent histopathologic findings on lung biopsy or an identifiable environmental antigen. However, because multidisciplinary diagnosis remains the current gold standard for a diagnosis of CHP, the minimal criterion for a diagnosis of CHP was a multidisciplinary review of the clinical, pulmonary function test (PFT), radiographic, and pathologic characteristics that were most consistent with a diagnosis of CHP, after exclusion of all other possible etiologies. Patients with IPF, and those with unclassifiable idiopathic interstitial pneumonia, were required to meet 2011 ATS/ERS criteria for inclusion in the study. The minimal criteria for classification as IPAF were that patients must have an interstitial pneumonia (according to high-resolution CT [HRCT] scan or surgical lung biopsy) with exclusion of alternative etiologies, incomplete features of a defined CTD, and at least one feature from at least two IPAF domains (clinical, radiographic, and morphologic) as proposed by the initial IPAF research statement.9
Data Collection
The electronic medical record was retrospectively reviewed to extract pertinent data, and 24 baseline variables from each patient’s initial clinic visit were selected for inclusion in the cluster analysis model on the basis of substantial clinical relevance in previous literature. These variables were as follows: demographic information (age, race/ethnicity, and sex), patient-reported historical information (tobacco use and other environmental exposure [organic or inorganic]), comorbid disease conditions (gastroesophageal reflux and hypothyroidism), physical examination findings (BMI, SpO2:FiO2 ratio, clubbing, and crackles), laboratory studies (antinuclear antibody [ANA] titer, positive rheumatoid factor [> 2× the upper limit of normal], other positive autoantibodies [anti-cyclic citrullinated peptide, anti-double-stranded DNA, anti-Ro(SS-A), anti-La(SS-B), anti-ribonucleoprotein, anti-Smith, anti-topoisomerase(Scl-70), anti-tRNA synthetase], and gamma [protein] gap), PFTs (FVC, FEV1, FEV1/FVC, and diffusion capacity of the lung for carbon monoxide [Dlco]), and HRCT imaging findings (honeycombing, emphysema, pulmonary artery [PA] diameter, and aortic diameter). Measurements of the PA and aorta were performed by thoracic radiologists (S. M. M. and J. H. C.) blinded to hemodynamic and clinical information. HRCT scans were analyzed by using Philips iSite Enterprise Software (Koninklijke Philips N.V.). The PA and aorta were measured at the level of the PA bifurcation, and measurements were taken from the same HRCT image as previously reported.20 Patients were excluded if they did not have the complete set of 24 baseline variables required for cluster analysis.
Follow-up and End Point of the Study
The primary outcome of the study was longitudinal pulmonary function. PFTs were grouped into 3-month intervals to allow for time course alignment and were analyzed by using a mixed effects model. Secondary outcomes included progression-free survival (PFS) and transplant-free survival (TFS). PFS time was defined as time from multidisciplinary ILD diagnosis to first occurrence of the following: a decrease of ≥ 10% in percent predicted FVC, a decrease of ≥ 50 m (164 feet) in 6-min walk distance, death, lung transplantation, loss to follow-up, or end of study period.21 PFS was evaluated over the first 52 weeks following the ILD diagnosis. TFS time was defined as time from multidisciplinary ILD diagnosis to first occurrence of the following: death, transplantation, loss to follow-up, or end of study period. TFS was evaluated over the first 10 years following the ILD diagnosis. Vital status was determined from review of medical records, Social Security Death Index, and telephone communication per our usual clinical practice. Follow-up time was censored on December 31, 2015.
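The PFS definition above can be sketched in code as a first-event rule. This is a minimal illustration, not the authors' implementation: the `FollowUp` record, its field names, and the use of days as the time unit are hypothetical, and the sketch assumes that loss to follow-up and end of study act as censoring rather than as progression events.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PFS_WINDOW_DAYS = 52 * 7  # PFS was evaluated over the first 52 weeks


@dataclass
class FollowUp:
    """Days from multidisciplinary ILD diagnosis to each occurrence.

    Hypothetical record layout; None means the occurrence was never observed.
    """
    fvc_decline_10pct: Optional[float] = None  # >= 10% drop in percent predicted FVC
    walk_decline_50m: Optional[float] = None   # >= 50 m drop in 6-min walk distance
    death: Optional[float] = None
    transplant: Optional[float] = None
    last_contact: float = 0.0                  # loss to follow-up or end of study


def pfs_time(f: FollowUp, window: float = PFS_WINDOW_DAYS) -> Tuple[float, bool]:
    """Return (time in days, progressed?) for one subject.

    Assumption: a subject with no qualifying event before last contact or
    before the end of the window is censored at whichever comes first.
    """
    observed = [t for t in (f.fvc_decline_10pct, f.walk_decline_50m,
                            f.death, f.transplant) if t is not None]
    first_event = min(observed) if observed else None
    censor_at = min(f.last_contact, window)
    if first_event is not None and first_event <= censor_at:
        return first_event, True
    return censor_at, False
```

TFS could be derived the same way by dropping the two functional-decline fields and widening the window to 10 years.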
Statistical Analysis
The partitioning around medoids (PAM) clustering algorithm22 was used to cluster ILD subjects into groups with similar clinical phenotypes based on the 24 baseline variables. Cluster analysis groups subjects on the basis of prespecified variables to maximize homogeneity within each cluster and to differentiate clusters from one another. The PAM cluster analysis minimizes the dissimilarity of members of each cluster and uses medoids, which are subjects in the dataset representative of each cluster.23 This method is similar to the commonly used k-means clustering algorithm but more robust to outliers because each cluster is centered on a medoid (an actual subject) rather than on a mean. Because both continuous and categorical variables were included in the algorithm, dissimilarity between subjects was computed by using Gower’s distance,24 which scales each variable to a range of 0 to 1 prior to clustering. To determine the optimal number of clusters, the silhouette width was used, which is a measure of how similar a patient is to his or her assigned cluster compared with neighboring clusters.25 PAM cluster analysis was performed by using the “cluster” package in R (R Foundation for Statistical Computing).
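The three ingredients named above (Gower's distance, a medoid-based partition, and the silhouette width) can be illustrated with a small pure-Python sketch. It is not the R "cluster" package and does not reproduce its `pam` routine: the swap search below is a simplified greedy variant that starts from the first k subjects, and the six toy subjects and three variables are hypothetical.

```python
from itertools import combinations


def gower_matrix(rows, numeric_idx):
    """Pairwise Gower dissimilarity: range-scaled absolute difference for
    numeric columns, 0/1 mismatch for categorical ones, averaged over columns."""
    n, p = len(rows), len(rows[0])
    ranges = {}
    for j in numeric_idx:
        col = [r[j] for r in rows]
        ranges[j] = (max(col) - min(col)) or 1.0  # guard against a constant column
    d = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        total = 0.0
        for j in range(p):
            if j in ranges:
                total += abs(rows[a][j] - rows[b][j]) / ranges[j]
            else:
                total += 0.0 if rows[a][j] == rows[b][j] else 1.0
        d[a][b] = d[b][a] = total / p
    return d


def total_cost(d, medoids):
    """Sum of each subject's dissimilarity to its nearest medoid."""
    return sum(min(d[i][m] for m in medoids) for i in range(len(d)))


def assign(d, medoids):
    """Label each subject with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]


def medoid_swap(d, k):
    """Simplified swap search (not the full PAM build phase): accept any
    medoid/non-medoid swap that lowers total cost until none does."""
    medoids = list(range(k))
    best = total_cost(d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(d)):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(d, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    return medoids, assign(d, medoids)


def mean_silhouette(d, labels):
    """Average silhouette width; requires at least two clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(d[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(d[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        widths.append((b - a) / (max(a, b) or 1.0))
    return sum(widths) / len(labels)


# Hypothetical toy subjects: (age, sex, smoking status)
rows = [(45, "F", "never"), (48, "F", "never"), (50, "F", "never"),
        (70, "M", "ever"), (72, "M", "ever"), (75, "M", "ever")]
d = gower_matrix(rows, numeric_idx={0})
for k in (2, 3):
    medoids, labels = medoid_swap(d, k)
    print(k, medoids, round(mean_silhouette(d, labels), 3))
```

Choosing the k with the largest mean silhouette width mirrors the selection rule described in the text; on real data a library implementation with a proper initialization step would be preferable.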
Continuous variables are reported as means ± SDs, and categorical variables are reported as counts and percentages. Demographic and clinical differences among identified clusters were examined by using χ2 or analysis of variance rank tests, as appropriate. Survival was assessed by using unadjusted log-rank testing along with univariate and multivariable Cox proportional hazards regression. Survival curves were plotted by using the Kaplan-Meier survival estimator. Survival time was censored on December 31, 2015, or when a patient underwent lung transplantation or was lost to follow-up. The gender, age, physiology-ILD (GAP-ILD) index is an established predictor of survival in subjects with ILD.10 Multivariable Cox regression analyses accounted for the GAP-ILD score to independently determine predictive value of models assessed for secondary outcomes. Statistical analyses following determination of the clusters were performed by using Stata release 14 (2015; StataCorp).
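For readers unfamiliar with the Kaplan-Meier estimator mentioned above, the following is a minimal sketch of the product-limit calculation. The function name and the six toy follow-up times are hypothetical and are not taken from the study data; the study itself used Stata for these analyses.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time per subject
    events: True if the event occurred at that time, False if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(leaving):
        if event_counts[t]:
            # Product-limit step: multiply by the fraction surviving time t
            surv *= 1.0 - event_counts[t] / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up in months; False marks a censored subject
toy_times = [6, 12, 12, 24, 30, 36]
toy_events = [True, True, False, True, False, False]
curve = kaplan_meier(toy_times, toy_events)
```

Censored subjects contribute to the risk set up to their censoring time but never lower the curve themselves, which is why the censoring rules stated in the text matter for the resulting estimates.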
Results
Subject Demographic Characteristics
Of the 1,127 initial patients screened, 770 (IPF, n = 286; CTD-ILD, n = 173; IPAF, n = 156; CHP, n = 119; unclassifiable idiopathic interstitial pneumonias, n = 36) met inclusion and exclusion criteria (e-Fig 1). Variables chosen for cluster analysis are presented in e-Table 1. Mean age of included patients was 65 years, and 48% were female. When assessing baseline PFTs of the cohort, mean FVC was 64.5% predicted, and mean Dlco was 49.5% predicted; 56.6% had a positive ANA titer (≥ 1:320), and the mean gamma gap was 3.4 g/dL. HRCT honeycombing was present in 35% of the cohort, and 23% had emphysema.
Cluster Analysis
PAM cluster analysis determined that four clusters were optimal (Fig 1, Table 1, e-Table 1).
Table 1.
Variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
---|---|---|---|---|
CTD-ILD (n = 173) | 22 | 59 | 20 | 3 |
CHP (n = 119) | 28 | 2 | 9 | 19 |
IPF (n = 286) | 32 | 7 | 50 | 42 |
IPAF (n = 156) | 14 | 29 | 16 | 30 |
Unclassifiable (n = 36) | 4 | 3 | 5 | 6 |
Data are presented as % of subjects within each cluster (each column sums to 100%). CHP = chronic hypersensitivity pneumonitis; CTD-ILD = connective tissue disease-associated interstitial lung disease; ILD = interstitial lung disease; IPAF = interstitial pneumonia with autoimmune features; IPF = idiopathic pulmonary fibrosis.
Subjects grouped into Cluster 1 (n = 210 [27%]) were relatively younger, white, predominantly female, and obese with more modest lung function impairment at baseline. Exposure to organic environmental antigens, hypothyroidism, and gastroesophageal reflux was highly prevalent within this cluster, while the prevalence of emphysema was low. Cluster 2 (n = 114 [15%]) comprised the youngest patients in the entire cohort. They were predominantly African-American female subjects, had elevated ANA titers, and had a lower prevalence of CT honeycombing and emphysema, with the greatest reduction in baseline FVC and FEV1. Subjects in Cluster 3 (n = 276 [36%]) were characteristically older male subjects, white, had a substantial smoking history, and had the highest prevalence of coexistent emphysema with significantly increased PA and aortic diameters. Cluster 4 (n = 170 [22%]) was predominantly composed of white male subjects with the highest number of smoking pack-years, the greatest prevalence of CT honeycombing, and the lowest baseline Dlco.
Multidimensional Disease Severity Indices
To assess baseline multidimensional indices of disease severity among the clusters, we evaluated the GAP-ILD score and the composite physiologic index. The median GAP-ILD score was similar between Cluster 1 (GAP-ILD = 2; range, –2 to 5) and Cluster 2 (GAP-ILD = 2; range, –2 to 6) (P = .427) and between Cluster 3 (GAP-ILD = 3; range, –2 to 6) and Cluster 4 (GAP-ILD = 3; range, –2 to 7) (P = .952). The mean composite physiologic index was significantly different across all four clusters: Cluster 1, 44.8 ± 17.0; Cluster 2, 53.6 ± 15.3; Cluster 3, 51.2 ± 15.7; and Cluster 4, 54.5 ± 14.3 (P < .0001).
Outcomes
Longitudinal pulmonary function
When evaluating longitudinal PFTs according to initial ILD classification, all pairwise comparisons were analyzed (Figs 2 and 3). Monthly FVC decline was similar between patients with CHP and IPF (P = .215) and between patients with CTD-ILD and IPAF (P = .953). In contrast, when classifying patients according to phenotypic clusters, although Clusters 1 and 3 had similar monthly FVC decline (P = .852), FVC decline was significantly worse in Cluster 4 than in Cluster 2 (P < .0001) (e-Table 2).
Similarly, although monthly Dlco decline did not differ according to initial ILD classification, patients in Cluster 4 had a significantly worse Dlco decline than those in Cluster 2 (P < .0001).
Survival
Kaplan-Meier analysis suggested better delineation of PFS in the first year according to phenotypic clusters compared with initial ILD classification (χ2 = 28.54, log-rank P < .0001 vs χ2 = 17.52, log-rank P = .0006, respectively) (Fig 4) and similar delineation of TFS over 10 years (χ2 = 74.40, log-rank P < .0001 vs χ2 = 62.60, log-rank P < .0001) (Fig 5).
PFS in the first year was greatest in Cluster 2 (88.6%) (Fig 4). When evaluating PFS according to univariate Cox regression analysis, predictors of mortality risk included phenotypic clusters (Cluster 1, hazard ratio [HR] of 2.78, P = .001; Cluster 3, HR of 2.25, P = .011; Cluster 4, HR of 4.31, P < .001), initial ILD classification (CHP, HR of 1.59, P = .089; IPAF, HR of 1.96, P = .009; IPF, HR of 2.47, P < .001), and GAP-ILD score (HR of 1.23, P < .001). Multivariable analysis suggested that phenotypic clusters (Cluster 1, HR of 3.06, P = .001; Cluster 3, HR of 2.18, P = .017; Cluster 4, HR of 3.64, P < .001) and GAP-ILD score (HR of 1.22, P < .001) independently predicted PFS (Table 2).
Table 2.
Characteristic | PFS (Within 1st y of Diagnosis): HR | PFS: 95% CI | PFS: P Value | TFS (Within 10 y of Diagnosis): HR | TFS: 95% CI | TFS: P Value |
---|---|---|---|---|---|---|
Univariate Cox regression | ||||||
Phenotypic clusterᵃ | ||||||
Cluster 1 | 2.78 | 1.48-5.21 | .001 | 1.57 | 0.97-2.56 | .067 |
Cluster 3 | 2.25 | 1.20-4.20 | .011 | 3.17 | 2.05-4.88 | < .001 |
Cluster 4 | 4.31 | 2.31-8.03 | < .001 | 4.31 | 2.73-6.82 | < .001 |
Diagnosis categoryᵇ | ||||||
CHP | 1.59 | 0.93-2.72 | .089 | 1.32 | 0.81-2.14 | .259 |
IPAF | 1.96 | 1.19-3.24 | .009 | 2.32 | 1.53-3.51 | < .001 |
IPF | 2.47 | 1.57-3.87 | < .001 | 3.80 | 2.65-5.46 | < .001 |
GAP-ILD score | 1.23 | 1.13-1.33 | < .001 | 1.62 | 1.50-1.74 | < .001 |
Multivariable Cox regressionᶜ | ||||||
Phenotypic clusterᵃ | ||||||
Cluster 1 | 3.06 | 1.63-5.76 | .001 | 1.88 | 1.16-3.06 | .011 |
Cluster 3 | 2.18 | 1.15-4.12 | .017 | 2.61 | 1.68-4.08 | < .001 |
Cluster 4 | 3.64 | 1.94-6.84 | < .001 | 3.14 | 1.97-4.99 | < .001 |
GAP-ILD score | 1.22 | 1.12-1.33 | < .001 | 1.57 | 1.46-1.69 | < .001 |
GAP = gender-age-physiology; HR = hazard ratio; PFS = progression-free survival; TFS = transplant-free survival. See Table 1 legend for expansion of other abbreviations.
ᵃ Reference category: Cluster 2.
ᵇ Reference category: CTD-ILD.
ᶜ n = 727; adjusted for phenotypic cluster, sex, age, FVC, diffusing capacity of the lungs for carbon monoxide, ILD subtype, and immunosuppressive therapy.
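As a reading aid for the hazard ratios in Table 2, the sketch below recovers the log-scale standard error and an approximate two-sided P value from a reported HR and its 95% CI. It assumes the intervals are Wald intervals that are symmetric on the log scale, which is common for Cox regression output but is not stated in the text; the helper name `hr_summary` is hypothetical.

```python
import math


def hr_summary(hr, lo, hi, z_crit=1.96):
    """Recover log-scale SE, Wald z, and two-sided P from an HR and 95% CI.

    Assumes ln(lo) and ln(hi) sit z_crit standard errors either side of ln(HR).
    """
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return {"se_log_hr": se, "z": z, "p": p, "geo_mid": math.sqrt(lo * hi)}


# Univariate rows of Table 2 (reference category: Cluster 2)
pfs_cluster4 = hr_summary(4.31, 2.31, 8.03)  # reported P < .001
tfs_cluster1 = hr_summary(1.57, 0.97, 2.56)  # reported P = .067
```

Under this assumption the geometric midpoint of each interval should land close to the reported HR, which offers a quick consistency check on transcribed table values.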
Similarly, TFS over 10 years was greatest in Cluster 2 (Fig 5). When evaluating TFS by using univariate Cox regression analysis, predictors of mortality risk included phenotypic clusters (Cluster 1, HR of 1.57, P = .067; Cluster 3, HR of 3.17, P < .001; Cluster 4, HR of 4.31, P < .001), initial ILD classification (CHP, HR of 1.32, P = .259; IPAF, HR of 2.32, P < .001; IPF, HR of 3.80, P < .001), and GAP-ILD score (HR of 1.62, P < .001). Multivariable analysis also suggested that phenotypic clusters (Cluster 1, HR of 1.88, P = .011; Cluster 3, HR of 2.61, P < .001; Cluster 4, HR of 3.14, P < .001) and GAP-ILD score (HR of 1.57, P < .001) independently predicted TFS (Tables 2 and 3).
Table 3.
Variable | Cluster 1: Younger White Female With Obesity and Organic Environmental Exposure | Cluster 2: Younger Black Female With High Gamma Gap and High ANA Titer | Cluster 3: Elderly White Male Smoker With Emphysema | Cluster 4: Elderly White Male Smoker With Honeycombing and Environmental Exposure |
---|---|---|---|---|
Clinical characteristics | ||||
Age | Younger | Younger | Elderly | Elderly |
Sex | Female | Female | Male | Male |
Race/ethnicity | White | Black | White | White |
BMI | Obese | Overweight | Overweight | Overweight |
Tobacco pack-years | > 10 | < 10 | > 10 | > 10 |
Crackles | + | + | + | + |
Clubbing | +/– | +/– | +/– | +/– |
FVC percent predicted | > 60% | < 60% | > 60% | > 60% |
FEV1 percent predicted | > 70% | < 70% | > 70% | > 70% |
Dlco | > 50% | < 50% | < 50% | < 50% |
S:F ratio | > 410 | > 410 | ∼ 410 | < 410 |
Gamma gap | < 3.5 | > 3.5 | < 3.5 | ∼ 3.5 |
ANA titer | < 1:320 | > 1:320 | < 1:320 | ∼ 1:320 |
Honeycombing | +/– | +/– | +/– | + |
Emphysema | +/– | +/– | + | +/– |
Hypothyroidism | +/– | +/– | +/– | +/– |
GERD | + | +/– | +/– | + |
Organic environmental exposure | + | +/– | +/– | + |
Inorganic environmental exposure | +/– | – | +/– | +/– |
PA diameter | < 30 mm | ∼ 30 mm | ∼ 30 mm | ∼ 30 mm |
Aortic diameter | < 34 mm | < 34 mm | > 34 mm | > 34 mm |
Disease severity indices | ||||
GAP-ILD score, median | 2 | 2 | 3 | 3 |
CPI score, mean | 44.8 | 53.6 | 51.2 | 54.5 |
Annual FVC decline | ∼ 2% | < 2% | ∼ 2% | ∼ 4% |
Annual Dlco decline | ∼ 3% | < 2% | ∼ 2% | ∼ 6% |
Survival outcomes | ||||
Median survival time, mo | 109 | > 120 | 52 | 40 |
Predicted mortality rate | ||||
1-y | 5.7% | 2.6% | 10.9% | 15.9% |
2-y | 9.5% | 7.9% | 21.0% | 25.9% |
3-y | 12.4% | 13.2% | 29.3% | 32.9% |
5-y | 18.6% | 15.8% | 37.7% | 41.8% |
10-y | 22.9% | 21.9% | 42.8% | 44.7% |
ANA = antinuclear antibody; CPI = composite physiologic index; Dlco = diffusing capacity of the lungs for carbon monoxide; GERD = gastroesophageal reflux disease; – = usually absent (absent in majority of cohort); PA = pulmonary artery; + = usually present (present in majority of cohort); +/– = may be present or absent; S:F = SpO2:FiO2 ratio. See Table 1 and 2 legends for expansion of other abbreviations.
Discussion
In this large cohort study of > 700 patients with a multidisciplinary diagnosis of ILD, cluster analysis identified a discrete subgroup of patients with the worst lung function decline. The results from this study also suggest that the distinct phenotypic subgroups identified by using cluster analysis differ substantially in PFS within the first year of diagnosis and overall mortality within the first 10 years of diagnosis.
Although the most recent ATS/ERS consensus guidelines recommend diagnostic algorithms for ILD, a practical shortcoming frequently encountered is the substantial heterogeneity and low interobserver agreement in classification of major ILD subtypes.26, 27, 28 Furthermore, 10% to 15% of patients with ILD remain “unclassifiable” even after undertaking the risk of a surgical lung biopsy.2, 29, 30 This novel cluster-based approach classifies patients with ILD, including those currently deemed unclassifiable, into clinically meaningful subgroups. We highlight the efficacy of this technique in stratifying patients with ILD into distinct phenotypic subgroups that are predictive of disease progression and short- and long-term survival. Our application of cluster analysis to ILD is similar to its application in other heterogeneous pulmonary diseases such as COPD, asthma, and bronchiectasis, in which discrete patient clusters have been identified and subsequently related to differing outcomes.14, 15, 16
In this study cohort, cluster analysis identified four unique phenotypes of patients with differing patient characteristics. The first cluster was composed predominantly of younger white obese female subjects; the second included mostly younger African-American female subjects with elevated ANA titers. Elderly white male subjects were predominantly grouped into the latter two clusters, with coexistent emphysema characterizing the third cluster and severe honeycombing characterizing the fourth. Although there is a dominant ILD diagnostic subgroup within each cluster, all clusters consist of patients with various ILDs based on current ATS/ERS diagnostic guidelines.
Increasingly, the need to refine the accuracy of disease prognosis in ILD has led to recent exploration of clinical indices, gene variants, and molecular markers that reflect outcomes among heterogeneous patient populations with ILD.31, 32, 33, 34, 35 Although the GAP-ILD index is commonly used as a clinical predictor of mortality in major ILD subtypes, it has not been shown to predict FVC decline or PFS. Furthermore, patients with IPAF are not accounted for in the current GAP-ILD model. Among diagnostic subclasses of ILD, annual FVC decline was similar between those with features of autoimmunity (CTD-ILD and IPAF) and those with no features of autoimmunity (IPF and CHP). However, stratification according to phenotypic clusters identified a population (Cluster 4) with greater decline in annual FVC and Dlco, as well as worse survival. This approach could enhance early identification of patients at the highest risk of clinical deterioration for whom treatment might be of benefit and inclusion in clinical trials.21 In addition, application of this cluster-based classification to patient enrollment for therapeutic trials could address the barrier of heterogeneity in diagnostic criteria, which may influence study outcomes.
We evaluated PFS as a secondary outcome in our mortality analyses because of the low rate of death within the first year among patients with ILD and its frequent selection as a study end point in clinical trials.21, 36 Assessment of TFS as a secondary outcome also permitted evaluation of long-term mortality. In our cohort, phenotypic clusters predicted FVC decline in the first year while independently providing a reliable measure of the risk of death within the first year of diagnosis and over the following 10 years, even after adjusting for disease severity, age, sex, ILD diagnosis, and immunosuppressive therapy. This highly significant finding is crucial for early identification of subgroups with the greatest mortality risk across the various ILD subtypes.
In this real-world cohort of > 700 patients with diverse ILDs, phenotypic clusters predicted lung function decline, as well as short- and long-term survival independent of original ILD classification. Importantly, these clusters also differ in multidimensional indices of clinical function such as the composite physiological index and the GAP-ILD score, hence simultaneously providing measures of disease severity and anticipated clinical course (Table 3). Our findings suggest for the pulmonary practitioner that the current ILD diagnostic paradigm may not be the only approach to clinical care, and they identify additional factors such as race, BMI, cigarette smoking, and PA diameter, in addition to previously recognized patient and ILD characteristics (age, sex, lung function, and honeycomb fibrosis), that may affect ILD diagnoses and outcomes. We recognize that this novel shift in approach requires external validation before it can be used with confidence in clinical practice.
This study has several strengths. The large longitudinal ILD cohort with prospectively acquired data enhanced the ability to include a significant set of clinically relevant variables from multiple domains (clinical, serologic, radiologic, and functional) into the cluster analysis. Next, only patients with a multidisciplinary ILD diagnosis were included. Sensitivity analyses were performed to determine if our findings remained consistent after substratification of the entire ILD population into IPF and non-IPF cohorts. These analyses suggest that the cluster analysis approach remains robust in the prediction of outcomes even after adjustment for IPF vs non-IPF diagnoses (e-Fig 2, e-Table 3). These findings also suggest that the potential utility of a cluster-based approach to the prediction of outcomes might be of value in all cases of ILD. Finally, all baseline measurements for variables included were obtained at the time of initial ILD evaluation to minimize interference by treatment.
Our study also has certain limitations. First, this investigation was conducted at a single tertiary referral center and included only patients with a complete dataset, thus limiting its generalizability. It is also possible that the ILD cohort could be sicker than the general population because we are a referral center. To address this possibility, further sensitivity analyses were performed in which we excluded all out-of-state referrals (n = 176). These analyses yielded similar cluster patterns consistent with that of the entire cohort (e-Fig 3). Second, some patients received immunosuppressive therapy before referral to our institution. This therapy was most often in the form of corticosteroids, which may have biased our results. However, our statistical models adjusted for exposure to immunosuppressive therapy at baseline ILD evaluation to minimize the effect on outcomes. Third, our study was designed to exclude from the cluster analysis model patients with very rare forms of ILD due to their limited sample size. Fourth, this novel study is a proof-of-concept approach to disease classification in ILD. Future studies at other large ILD centers are needed for external validation of this concept. Fifth, it is possible that some patients may have received trial therapies prior to arrival at our center, which may have altered their phenotypic characteristics (e.g., BMI or Dlco), potentially influencing the cluster of allocation.
Conclusions
Cluster analysis, applied to a large cohort of patients with ILD at baseline evaluation, identified four discrete subgroups with unique clinical characteristics and distinctly different outcomes. Classification of ILD patients into clinically relevant phenotypes with prognostic value could improve the efficacy of therapeutic interventions in future clinical trials. Further studies are needed to elucidate the underlying clinical and biologic pathways linking these clusters to differential outcomes in ILD.
Acknowledgments
Author contributions: A. A., J. M. O., M. E. S., and M. C. were responsible for conception and design; A. A., J. M. O., J. H. C., S. M. M., C. L., L. J. W., D. S., R. B., L. C., S. H., A. N. H., I. N., R. V., M. E. S., and M. C. were responsible for acquisition of data for the study; A. A., J. M. O., M. E. S., and M. C. were responsible for analysis and interpretation; and A. A., J. M. O., J. H. C., S. M. M., C. L., L. J. W., D. S., R. B., L. C., S. H., A. N. H., I. N., R. V., M. E. S., and M. C. drafted the manuscript for important intellectual content. All authors critically revised the manuscript for important intellectual content, and all authors gave final approval of the submitted manuscript and are accountable for all aspects of the study.
Financial/nonfinancial disclosures: The authors have reported to CHEST the following: J. M. O. has received speaking fees and honoraria for advisory boards with Genentech and Boehringer Ingelheim related to IPF within the last 12 months. R. V. has received a grant from Genentech to study the genomics of autoimmune interstitial lung diseases. I. N. has received honoraria for advisory boards with Boehringer Ingelheim, InterMune, and Anthera within the last 12 months related to IPF; he has also received speaking honoraria from GlaxoSmithKline and receives consulting fees for ImmuneWorks. I. N. also has study contracts with the National Institutes of Health, Stromedix, Sanofi, and BI for the conduct of clinical trials in IPF. M. E. S. has received institutional funding for ILD research from the National Institutes of Health, Genentech, Gilead, and MedImmune. M. C. is supported by a career development award from the National Heart, Lung and Blood Institute (K08 HL121080), has received honoraria from Chest for invited speaking engagements, and also has a patent pending (ARCD. P0535US.P2) for risk stratification algorithms for hospitalized patients. None declared (A. A., J. H. C., S. M. M., C. L., L. J. W., D. S., R. S. B., A. N. H., S. H., L. W. C.).
Role of sponsors: The sponsor had no role in the design of the study, the collection and analysis of the data, or the preparation of the manuscript.
Other contributions: Our profound appreciation goes to our patients with ILD who generously consented to participation in this study.
Additional information: The e-Figures and e-Tables can be found in the Supplemental Materials section of the online article.
Footnotes
Drs Strek and Churpek contributed equally to this manuscript.
FUNDING/SUPPORT: This investigation was supported by a National Institutes of Health T32 training grant [Grant T32-HL007605].