In the 18th century, Carl Linne, the Swedish physician and botanist commonly referred to as Linnaeus, developed a rational system of classifying plants. This process of classifying organisms, initially on the basis of their physical attributes and now underpinned by scientifically robust studies of their phenotype and genotype, was of great interest to the Founding Fathers of the United States and has been a powerful tool for understanding disease. It was a logical step to use one or more of these defining features to evaluate the severity of a condition, as has been done in the case of patients with chronic obstructive pulmonary disease (COPD). Here spirometry has been the objective characteristic studied, and several arbitrary cut points based on the observed reduction in FEV1 were selected to define the extent to which impaired lung mechanics and hence structural lung damage are present (1). These broad groupings define people with different clinical outcomes, especially for mortality, where FEV1 percent predicted is still the strongest predictive variable (2, 3). For the last 25 years, the most widely adopted approach has been that of the Global Initiative for Chronic Obstructive Lung Disease (GOLD), where COPD is defined by the presence of an FEV1/FVC ratio below 0.7 (less than 70%) and severity (or GOLD stage) is based on the FEV1 percent predicted: GOLD 1 is an FEV1 of 80% predicted or greater; GOLD 2, 50–79% predicted; GOLD 3, 30–49% predicted; and GOLD 4, below 30% predicted (4).
Recently, Bhatt and colleagues suggested an alternative approach that uses the defining characteristic of airflow obstruction, the FEV1/FVC ratio itself, to grade severity in their Staging of Airflow Obstruction by Ratio (STAR) system (5). Again, four grades were created, STAR 1 being a ratio of 60% to <70%; STAR 2, 50% to <60%; STAR 3, 40% to <50%; and STAR 4, <40%. They validated this system in the large COPDGene (Genetic Epidemiology of COPD Study) observational cohort and two COPD cohorts from Pittsburgh, comprising a total of 12,149 individuals with detailed clinical data, and they compared the new system with the conventional GOLD grading. The two systems produced similar outcomes for grades 2–4 despite differences in the composition of each group. However, STAR 1 subjects had significantly higher mortality and worse outcomes than nonobstructed people, which was not true for those classified as having GOLD 1 COPD. In an accompanying editorial, I suggested that data from a wider range of people with more ethnic diversity would be helpful (6). That request has now been met in a full paper (pp. 1308–1316) and two research letters (pp. 1374–1376 and pp. 1376–1379) in this issue of the Journal (7–9). These data not only provide confirmation of the original STAR proposal but also give us more insight into how these differences between the groups might arise and inform how we might use this system.
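The two grading schemes described above can be written down as a short Python sketch. This is an illustration only, not code from GOLD or from Bhatt and colleagues: the function names are hypothetical, ratios are passed as fractions (0.65, not 65), and the handling of values that fall exactly on a cut point (for example, treating a ratio of exactly 0.60 as STAR 1) is an assumption made for the sketch.

```python
from typing import Optional


def is_obstructed(fev1_fvc: float) -> bool:
    """Fixed-ratio definition of airflow obstruction: FEV1/FVC below 0.70."""
    return fev1_fvc < 0.70


def gold_grade(fev1_fvc: float, fev1_pct_pred: float) -> Optional[int]:
    """GOLD grade 1-4 from FEV1 percent predicted; None if not obstructed."""
    if not is_obstructed(fev1_fvc):
        return None
    if fev1_pct_pred >= 80:
        return 1
    if fev1_pct_pred >= 50:
        return 2
    if fev1_pct_pred >= 30:
        return 3
    return 4


def star_grade(fev1_fvc: float) -> Optional[int]:
    """STAR grade 1-4 from the FEV1/FVC ratio alone; None if not obstructed."""
    if not is_obstructed(fev1_fvc):
        return None
    if fev1_fvc >= 0.60:  # boundary handling is an assumption of this sketch
        return 1
    if fev1_fvc >= 0.50:
        return 2
    if fev1_fvc >= 0.40:
        return 3
    return 4


# Hypothetical person: ratio 0.65 with an FEV1 of 70% predicted.
example = (gold_grade(0.65, 70), star_grade(0.65))
```

For this hypothetical person the sketch returns GOLD 2 but STAR 1, which is the kind of movement between grades that matters when the two systems are compared.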
Backman and colleagues report an analysis of the National Health and Nutrition Examination Survey representative general population sample from 2007 through 2012, in which 14,123 people aged 18–80 years (mean age, 45 yr; 52% never smoked) underwent prebronchodilator spirometry, and those over 40 completed a Medical Research Council breathlessness questionnaire (7). Bronchodilator testing was performed in 997 of 1,521 people with airflow obstruction, with the highest subsequent FEV1 being reported. In view of the low numbers of grade 4 people, grades 3 and 4 were combined. The presence of a modified Medical Research Council dyspnea scale score and the risk of mortality in the period to the end of 2019 were evaluated. The distributions of GOLD and STAR grades are shown in Figure 1. There was good agreement between the systems, with a Kendall Tau B for concordance of 0.46 and a C-statistic for mortality prediction of 0.81. This is not surprising, given the clear relationship between the FEV1/FVC and FEV1 percent predicted below 60% seen in Figure 1 of Backman’s paper, because this is where most of the mortality signal lies. As in the original report by Bhatt and colleagues, GOLD 1 stage did not distinguish the risk of dying from that in people without obstruction, whereas STAR 1 grading did. The same was true for identifying the report of some breathlessness, most clearly in the unadjusted data. The STAR system performed as well as GOLD among people of different ethnicities, although the subgroup numbers were too small to permit analysis by individual grades. Similarly, analysis using the lower limit of normal (LLN) FEV1 as the reference value and including or excluding people with PRISm (preserved ratio impaired spirometry) did not influence the results.
Two other shorter reports add further to these findings. In the Rotterdam study reported by Bertels and colleagues, prebronchodilator FEV1 was measured in 5,459 adults with follow-up data from 2009 to 2022 (98% White; mean age, 69.2 yr; 14.7% with COPD) and mortality established (8). Almost 75% of participants were not smoking when studied. The distributions of GOLD and STAR stages are shown in Figure 1. Again, there was good overall agreement between the systems, and again differences in mortality risk were seen in STAR 1 but not GOLD 1 relative to the nonobstructed comparator population. The current use of tobacco did not influence the outcome of this analysis. Bhatt and colleagues complement these data with a new analysis of their original COPDGene dataset based on the recommended severity classification of the European Respiratory Society/American Thoracic Society for the interpretation of lung function (9). They studied 10,134 people (mean age, 59.5 yr; 32% African American) with reproducible post-bronchodilator spirometry who were followed for vital status for a median of 9.3 years. This analysis compared the four STAR grades with a modification of the Global Lung Function Initiative boundaries for the LLN of FEV1, expressed as z-scores: grade 1 being obstructed people with an FEV1 z-score greater than −1.65; grade 2, −1.65 to −2.5; grade 3, −2.5 to −4.0; and grade 4, less than −4.0. Obstruction was defined as an FEV1/FVC below 0.7 in the primary analysis and below the LLN in an additional sensitivity analysis. Once more, the overall level of agreement in terms of mortality risk was high, but there was a difference between STAR 1 and GOLD 1 groups, with the latter just failing to reach statistical significance in the adjusted model compared with the healthy comparator group. Substituting the LLN of the ratio for the fixed ratio did not affect these findings.
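The z-score grading used as the comparator in that analysis can be sketched in the same way. This is a minimal illustration under stated assumptions, not the authors' code: it reads the cut points in the text as FEV1 z-scores (grade 1 above −1.65), the function name and the optional `ratio_lln` argument are hypothetical, and the assignment of values falling exactly on −1.65, −2.5, or −4.0 is an assumption. Computing the z-score and the LLN themselves requires reference equations, which are outside the scope of this sketch.

```python
from typing import Optional


def zscore_grade(
    fev1_fvc: float,
    fev1_z: float,
    ratio_lln: Optional[float] = None,
) -> Optional[int]:
    """Grade 1-4 from the FEV1 z-score; None if the person is not obstructed.

    Obstruction is FEV1/FVC below 0.70 by default (primary analysis) or
    below a supplied lower limit of normal (sensitivity analysis).
    """
    threshold = 0.70 if ratio_lln is None else ratio_lln
    if fev1_fvc >= threshold:
        return None
    if fev1_z > -1.65:
        return 1
    if fev1_z >= -2.5:  # boundary handling is an assumption of this sketch
        return 2
    if fev1_z >= -4.0:
        return 3
    return 4


# Hypothetical person: ratio 0.65, FEV1 z-score of -2.0.
fixed_ratio_grade = zscore_grade(0.65, -2.0)
# Same person judged against a hypothetical LLN of 0.62 for the ratio.
lln_grade = zscore_grade(0.65, -2.0, ratio_lln=0.62)
```

The hypothetical person is grade 2 under the fixed ratio but is not classed as obstructed when a lower LLN is used, which shows why the choice of obstruction definition was tested in a sensitivity analysis.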
Each of these reports has strengths and limitations. The National Health and Nutrition Examination Survey and Rotterdam data sets are population samples with somewhat different age structures and risk of COPD, although similar percentages of obstructed people were identified in each. By design, COPDGene included more people with COPD and a predominantly smoking study population. COPDGene also reported post-bronchodilator data, unlike the other studies, which is important when considering the GOLD 1 group. Despite these differences, the overall findings are in good agreement, with the STAR system showing definite monotonic differences in the mortality experience of each group, unlike the GOLD approach. STAR is easy to apply, is physiologically rational, avoids biases in the way normative data are collected, and is reproducible between clinical settings, regardless of the ethnicity or smoking status of those studied. Thus, STAR would appear to be in the ascendant.
Yet, there is still one area we need to understand better before recommending a wholesale change in the severity classification, and this is reflected in the title of this editorial. An everyday metaphor for a futile task, at least in the United Kingdom, is “to rearrange the deckchairs on the Titanic,” where the resulting pattern is pleasing but wholly unrelated to the important issues at hand. One of the striking features seen when comparing the systems is the reclassification of subjects from GOLD grades 2 and 3 to STAR grade 1 (see Figure 1). This is likely to explain the differing mortality and predictive power of STAR 1 versus GOLD 1 because there are more people with a lower FEV1 in the STAR 1 group. A subsidiary analysis by Bertels and colleagues (8) is informative in this regard. Among the 244 STAR 1 people with an FEV1 at or below 80% predicted (the boundary for GOLD 1), the hazard ratio for mortality was 1.9 compared with 1.1 in the 365 people above 80% predicted. In the 17 people who moved from GOLD 2 to STAR 1, the hazard ratio was 1.9. These data should be examined in the other data sets to confirm whether the same is true and whether we are simply shuffling the deckchairs and potentially disguising a group of GOLD 1 subjects at low risk of experiencing an adverse event.
Fortunately, there is no need for immediate action. Both systems have utility, and, as Backman and colleagues note, current decisions about treatment are based on symptoms and exacerbation risk rather than severity of airflow limitation (4). What is also clear from all these analyses, whether conducted using statistically robust methods of classifying abnormal lung function or the more familiar categories based on the degree to which FEV1 falls below that expected, is that obstruction and its accompanying loss of FEV1 are at the heart of COPD prognostication and are likely to remain so until some other index of structural lung damage supersedes them.
Footnotes
Originally Published in Press as DOI: 10.1164/rccm.202405-0987ED on July 1, 2024
Author disclosures are available with the text of this article at www.atsjournals.org.
References
- 1. Standards for the diagnosis and care of patients with chronic obstructive pulmonary disease (COPD) and asthma. This official statement of the American Thoracic Society was adopted by the ATS Board of Directors, November 1986. Am Rev Respir Dis. 1987;136:225–244. doi: 10.1164/ajrccm/136.1.225.
- 2. Agusti A, Edwards LD, Celli B, Macnee W, Calverley PM, Müllerova H, et al.; ECLIPSE Investigators. Characteristics, stability and outcomes of the 2011 GOLD COPD groups in the ECLIPSE cohort. Eur Respir J. 2013;42:636–646. doi: 10.1183/09031936.00195212.
- 3. Bikov A, Lange P, Anderson JA, Brook RD, Calverley PMA, Celli BR, et al. FEV1 is a stronger mortality predictor than FVC in patients with moderate COPD and with an increased risk for cardiovascular disease. Int J Chron Obstruct Pulmon Dis. 2020;15:1135–1142. doi: 10.2147/COPD.S242809.
- 4. Agustí A, Celli BR, Criner GJ, Halpin D, Anzueto A, Barnes P, et al. Global Initiative for Chronic Obstructive Lung Disease 2023 report: GOLD executive summary. Am J Respir Crit Care Med. 2023;207:819–837. doi: 10.1164/rccm.202301-0106PP.
- 5. Bhatt SP, Nakhmani A, Fortis S, Strand MJ, Silverman EK, Sciurba FC, et al. FEV1/FVC severity stages for chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2023;208:676–684. doi: 10.1164/rccm.202303-0450OC.
- 6. Calverley PMA. A STAR is born: a new approach to assessing chronic obstructive pulmonary disease severity [editorial]. Am J Respir Crit Care Med. 2023;208:647–648. doi: 10.1164/rccm.202306-1106ED.
- 7. Backman H, Vanfleteren LEGW, Mannino DM, Ekström M. Severity of airflow obstruction based on FEV1/FVC vs FEV1 percent predicted in the general U.S. population. Am J Respir Crit Care Med. 2024;210:1308–1316. doi: 10.1164/rccm.202310-1773OC.
- 8. Bertels X, Riemann S, Vauterin D, Lahousse L, Brusselle GG. All-cause mortality of staging of airflow obstruction by ratio-categorized patients with chronic obstructive pulmonary disease among the general population. Am J Respir Crit Care Med. 2024;210:1374–1376. doi: 10.1164/rccm.202311-2144LE.
- 9. Bhatt SP, Nakhmani A, Fortis S, Strand MJ, Silverman EK, Wilson CG, et al. STAR has better discrimination for mortality than ERS/ATS chronic obstructive pulmonary disease severity classification. Am J Respir Crit Care Med. 2024;210:1376–1379. doi: 10.1164/rccm.202311-2172LE.