American Journal of Respiratory and Critical Care Medicine. 2023 Jul 21;208(6):655–656. doi: 10.1164/rccm.202306-0995VP

Type 1 Error on Type 2 Inflammation: Circular Analysis in Asthma Clustering

Brian J Patchett 1,2, Edward S Schulman 1
PMCID: PMC10515571  PMID: 37478329

Over the past decade and a half, the field of severe asthma research has begun to utilize a form of machine learning called cluster analysis to “learn” latent or hidden phenotypes present in large datasets of clinically relevant variables. The goal of these analyses is to understand the clinical presentation of the patient with moderate and/or severe asthma, with the hope that treatment can then be personalized to the individual. The approach that has been largely used in the seminal papers of the Severe Asthma Research Program and Haldar and colleagues (1, 2), as well as in many other asthma-phenotyping papers that followed (3–8), has been to first identify clusters and then compare the clusters using a classical statistical test for differences in means between outcomes, rates, levels of biomarkers, and so forth. This represents, however, a form of statistical error called circular analysis (Figure 1), also called “double dipping,” whereby a hypothesis is generated on the basis of a dataset and then the test for the hypothesis occurs on the same dataset (9). As Gao and colleagues have emphasized in reference to clustering (10), this leads to an extremely inflated Type 1 error rate and broadly invalidates the statistical inference.

Figure 1. In circular analysis, data gathering precedes hypothesis generation.

Type 1 error refers to the probability of incorrectly rejecting a “true” null hypothesis. In this context, it is falsely concluding that two clusters differ in a certain variable when, in actuality, they do not. It is common practice to set the acceptable probability of Type 1 error a priori (also known as the α-value cutoff) to 0.05. In the case of a simple experiment (e.g., comparing the FEV1 between a treatment and control group), interpreting P values is straightforward. A P value less than the cutoff for Type 1 error set a priori represents evidence of statistical significance. However, if we begin our analysis by grouping similar observations (keeping dissimilar observations apart) and subsequently define our hypotheses on the basis of these pregrouped data (e.g., Does the mean of one cluster differ from the mean in another?), the standard P value interpretation is no longer valid. Therefore, we must consider that any estimated P value in our cluster comparison is conditional on the process used to identify the clusters in the first place. In other words, the P value becomes the probability of detecting a difference in means as extreme or more extreme than the one observed completely by chance, given the manner chosen to first cluster the data. Therefore, if the same data are used to both generate and evaluate hypotheses without accounting for the double use of data, this represents circular analysis. Put another way, if the same variables used to derive clusters are subsequently compared using a classical statistical test for a difference in means, the resulting P value is invalid. Indeed, in circular analysis, classical statistical testing between clusters will find the means to be highly different and produce small P values, even if the difference is due to random sampling variation. Hence, P values will be invalid. It may be tempting to consider sample splitting as a solution, yet this does not address the underlying circular reasoning (10).
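The inflation described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the method of any cited paper: every observation is pure noise drawn from a single normal distribution, so no true clusters exist; the data are split with a tiny one-dimensional 2-means routine; and P values use a normal approximation to the Welch statistic. All function names (`two_means_cut`, `welch_p`, `rejection_rates`) and parameter values are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist, fmean, variance


def two_means_cut(values, iters=50):
    """Tiny 1-D k-means (k=2): return the threshold separating the two clusters."""
    lo, hi = min(values), max(values)  # initial centers at the extremes
    cut = (lo + hi) / 2
    for _ in range(iters):
        left = [v for v in values if v <= cut]
        right = [v for v in values if v > cut]
        lo, hi = fmean(left), fmean(right)  # update centers to cluster means
        new_cut = (lo + hi) / 2
        if new_cut == cut:  # assignments can no longer change
            break
        cut = new_cut
    return cut


def welch_p(a, b):
    """Two-sided P value for a difference in means.

    Assumption: the Welch statistic is referred to a standard normal
    distribution rather than a t distribution (adequate for large groups).
    """
    if len(a) < 2 or len(b) < 2:
        return 1.0
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        return 1.0
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rates(n_sims=200, n=100, alpha=0.05, seed=0):
    """Return (circular, non-circular) rejection rates when the null is true."""
    rng = random.Random(seed)
    circular = non_circular = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(n)]  # variable used for clustering
        y = [rng.gauss(0, 1) for _ in range(n)]  # independent, not used for clustering
        cut = two_means_cut(x)
        x_lo = [xi for xi in x if xi <= cut]
        x_hi = [xi for xi in x if xi > cut]
        y_lo = [yi for xi, yi in zip(x, y) if xi <= cut]
        y_hi = [yi for xi, yi in zip(x, y) if xi > cut]
        circular += welch_p(x_lo, x_hi) < alpha      # double dipping
        non_circular += welch_p(y_lo, y_hi) < alpha  # variable outside clustering
    return circular / n_sims, non_circular / n_sims


if __name__ == "__main__":
    circ, non_circ = rejection_rates()
    print(f"rejection rate, clustering variable: {circ:.2f}")
    print(f"rejection rate, independent variable: {non_circ:.2f}")
```

Under these assumptions, the test on the clustering variable should reject in nearly every simulated dataset even though no difference exists, whereas the test on the variable that played no role in clustering should reject at a rate close to the nominal α.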

We considered eight influential papers on asthma clustering (1–8). In all eight papers, the circular analysis described here was used, at least in part. Not all of the hypothesis testing was double dipping, however. For instance, Denton and colleagues (6) had only 3/26 hypotheses that fell into the trap; the remaining hypotheses compared the clusters for variables that were not involved in the clustering process, which avoids the circular analysis trap. However, our review of the foundational studies of the Severe Asthma Research Program (1) and of Haldar and colleagues (2), when taken together, shows that over 50% of the hypotheses were a result of circular reasoning. In total, in the eight papers considered, we found 150 of the 330 (45.5%) statistical tests to involve double dipping, excluding supplementary materials. Of these 150 tests, a mere 15 (10.0%) of the null hypotheses were accepted at the α = 0.05 level. In comparison, of the remaining 180 tests, which did not involve double dipping, 78 (43.3%) of the null hypotheses were accepted. This suggests that there is more overlap in clusters than reported, and where that overlap occurs is unclear (e.g., Does a young allergic cluster truly differ from a young hypereosinophilic asthmatic cluster?). Although our analysis does not include every published clustering study, we feel that it represents an ample sample to highlight the pervasiveness of this fallacy in efforts to phenotype asthma.

A reader may wonder about applying a Bonferroni correction or a similar multiple-testing P value correction procedure to control the Type 1 error rate. More broadly, a natural question that arises is: How well-founded are the clusters identified that fell into the circular analysis trap? First, we do not dispute the methodology that first led to the identification of the clusters: Cluster analysis is a powerful form of machine learning that should still be instrumental for identifying phenotypes and endotypes of asthma. We also do not dispute the findings that did not arise from double dipping. Also, we note that some aspects of the clusters appear to be replicated across independent analyses. However, because the tests comparing the clusters are invalid, we cannot conclude on the basis of the analyses conducted that the individual clusters differ significantly in any variable; nor would a P value correction be of any use, as the P values obtained are invalid to begin with.
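The point about P value correction can also be sketched in simulation. The example below is illustrative only and rests on several assumptions: the data are independent standard normal noise in `m` variables, clusters come from a minimal Lloyd's 2-means routine written for this sketch, P values use a normal approximation to the Welch statistic, and the Bonferroni adjustment multiplies each P value by `m`. The names `two_means`, `welch_p`, and `familywise_rate` are hypothetical.

```python
import random
from statistics import NormalDist, fmean, variance


def dist2(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def two_means(rows, iters=50):
    """Minimal Lloyd's k-means with k=2; returns a 0/1 label for each row."""
    centers = [rows[0], rows[1]]  # assumption: first two rows as starting centers
    labels = None
    for _ in range(iters):
        new = [0 if dist2(r, centers[0]) <= dist2(r, centers[1]) else 1 for r in rows]
        if new == labels:  # converged
            break
        labels = new
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [fmean(col) for col in zip(*members)]
    return labels


def welch_p(a, b):
    """Two-sided P value, Welch statistic with a normal approximation."""
    if len(a) < 2 or len(b) < 2:
        return 1.0
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        return 1.0
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def familywise_rate(n_sims=100, n=120, m=3, alpha=0.05, seed=1):
    """Share of null datasets with at least one Bonferroni-significant variable."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        rows = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(n)]
        labels = two_means(rows)
        adjusted = []
        for j in range(m):
            a = [r[j] for r, lab in zip(rows, labels) if lab == 0]
            b = [r[j] for r, lab in zip(rows, labels) if lab == 1]
            adjusted.append(min(1.0, m * welch_p(a, b)))  # Bonferroni adjustment
        hits += min(adjusted) < alpha
    return hits / n_sims


if __name__ == "__main__":
    print(f"family-wise rejection rate after Bonferroni: {familywise_rate():.2f}")
```

If the correction repaired the inference, this rate would sit at or below α. In this sketch it should instead remain far above α, because the unadjusted P values are already invalid before any adjustment is applied.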

When data gathering precedes hypothesis generation, performing valid inference is considerably more challenging. Nonetheless, performing valid inference is essential for reproducible medical research. Reanalysis of the clustering results of prior publications using the new methods laid out by Gao and colleagues (10), importantly combined with external validation, may elucidate previously unknown overlaps between clusters. In doing so, authors may be able to identify both more precise phenotypes of asthma and the variables that truly differ between clusters.

Acknowledgments

We thank Dr. Daniel Vader at the Drexel Biostatistics Consulting Center for helpful review of this article.

Footnotes

Supported by the Margaret Wolf Pulmonary Research Memorial Fund.

Originally Published in Press as DOI: 10.1164/rccm.202306-0995VP on July 21, 2023

Author disclosures are available with the text of this article at www.atsjournals.org.

References

  • 1. Moore WC, Meyers DA, Wenzel SE, Teague WG, Li H, Li X, et al.; National Heart, Lung, and Blood Institute’s Severe Asthma Research Program. Identification of asthma phenotypes using cluster analysis in the Severe Asthma Research Program. Am J Respir Crit Care Med. 2010;181:315–323. doi: 10.1164/rccm.200906-0896OC.
  • 2. Haldar P, Pavord ID, Shaw DE, Berry MA, Thomas M, Brightling CE, et al. Cluster analysis and clinical asthma phenotypes. Am J Respir Crit Care Med. 2008;178:218–224. doi: 10.1164/rccm.200711-1754OC.
  • 3. Ilmarinen P, Tuomisto LE, Niemelä O, Tommola M, Haanpää J, Kankaanranta H. Cluster analysis on longitudinal data of patients with adult-onset asthma. J Allergy Clin Immunol Pract. 2017;5:967–978.e3. doi: 10.1016/j.jaip.2017.01.027.
  • 4. Sutherland ER, Goleva E, King TS, Lehman E, Stevens AD, Jackson LP, et al.; Asthma Clinical Research Network. Cluster analysis of obesity and asthma phenotypes. PLoS One. 2012;7:e36631. doi: 10.1371/journal.pone.0036631.
  • 5. Fitzpatrick AM, Teague WG, Meyers DA, Peters SP, Li X, Li H, et al.; National Institutes of Health/National Heart, Lung, and Blood Institute Severe Asthma Research Program. Heterogeneity of severe asthma in childhood: confirmation by cluster analysis of children in the National Institutes of Health/National Heart, Lung, and Blood Institute Severe Asthma Research Program. J Allergy Clin Immunol. 2011;127:382–389.e1, 13. doi: 10.1016/j.jaci.2010.11.015.
  • 6. Denton E, Price DB, Tran TN, Canonica GW, Menzies-Gow A, FitzGerald JM, et al. Cluster analysis of inflammatory biomarker expression in the International Severe Asthma Registry. J Allergy Clin Immunol Pract. 2021;9:2680–2688.e7. doi: 10.1016/j.jaip.2021.02.059.
  • 7. Konno S, Taniguchi N, Makita H, Nakamaru Y, Shimizu K, Shijubo N, et al.; HiCARAT Investigators. Distinct phenotypes of smokers with fixed airflow limitation identified by cluster analysis of severe asthma. Ann Am Thorac Soc. 2018;15:33–41. doi: 10.1513/AnnalsATS.201701-065OC.
  • 8. Wu W, Bang S, Bleecker ER, Castro M, Denlinger L, Erzurum SC, et al. Multiview cluster analysis identifies variable corticosteroid response phenotypes in severe asthma. Am J Respir Crit Care Med. 2019;199:1358–1367. doi: 10.1164/rccm.201808-1543OC.
  • 9. Kriegeskorte N, Simmons WK, Bellgowan PS, Baker CI. Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci. 2009;12:535–540. doi: 10.1038/nn.2303.
  • 10. Gao LL, Bien J, Witten D. Selective inference for hierarchical clustering. J Am Stat Assoc. 2022. doi: 10.1080/01621459.2022.2116331.

