American Journal of Respiratory and Critical Care Medicine
Letter. 2023 May 9;208(1):111. doi: 10.1164/rccm.202304-0718LE

Models That Link Physiology with Outcomes

Marcos Valiente Fernández
PMCID: PMC10870841  PMID: 37159945

To the Editor:

I appreciate the contribution made by Dianti and colleagues in their article in the Journal (1). However, I would like to emphasize and expand on the following point: “The basic challenge in modeling is to build an appropriate model. […] The model needs to fit the data appropriately, the variables included in the model need to be correctly selected and measured, and the assumptions underlying each model cannot be violated.” Although I agree with this statement, I believe the topics of modeling and dataset selection deserve further elaboration.

Modeling

The philosophical foundation of modeling is that nature shows patterns in its behavior. Traditional mathematical models simplify reality in order to reflect it, a process known as epistemic reduction. This simplification rests on imposing linear mathematical constraints on the data, which makes it possible to reason about them. Logistic regression models, for example, allow us to derive clinical scores.
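As a minimal sketch of how a logistic regression model can yield a clinical score, the example below turns a linear predictor into a predicted risk and into integer points. The intercept, the coefficients, the risk factor names, and the scaling unit are all hypothetical and chosen only for illustration; they do not come from any fitted model or from the article under discussion.

```python
import math

# Hypothetical values for illustration only (not from a fitted model).
INTERCEPT = -3.0
COEFFICIENTS = {"risk_factor_a": 0.8, "risk_factor_b": 1.2, "risk_factor_c": 0.7}


def predicted_risk(patient):
    """Logistic model: risk = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = INTERCEPT + sum(
        coef * patient.get(name, 0) for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def score_points(patient, unit=0.4):
    """Integer points: each coefficient divided by an assumed unit and rounded."""
    return sum(
        round(coef / unit) * patient.get(name, 0)
        for name, coef in COEFFICIENTS.items()
    )


# A hypothetical patient with all three binary risk factors present.
patient = {"risk_factor_a": 1, "risk_factor_b": 1, "risk_factor_c": 1}
print(f"predicted risk: {predicted_risk(patient):.3f}")
print(f"score points:   {score_points(patient)}")
```

Rounding coefficients to points is one common simplification; it trades some accuracy for a score that can be added up at the bedside, which is the kind of linear constraint described above.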

However, on the methodological side, the authors do not delve into the other major family of models (2). Machine learning–based models are nonlinear, less restrictive models that seek to optimize a solution for a given dataset, often through different algorithms. These models are especially powerful with large datasets and excel in complex problems with high dimensionality and multiple uncontrollable influences.

Thus, the type of problem we want to solve and the dataset we have must guide the type of model we use (traditional or machine learning) (3).

Data

The dataset influences the result through the variables entered and the selection of model metrics. As the authors point out, the selection of variables is crucial because it determines whether the model can support causal conclusions. If we introduce variables that are not causally related to the outcome, we get what is usually called “garbage in, garbage out”: if we put garbage into the model, we will get garbage conclusions.

To select the best model parameters, we use metrics. The choice of metric depends on factors such as the dataset and the frequency of the event of interest. If the event of interest is infrequent (i.e., an imbalanced dataset), precision-recall curves are recommended over receiver operating characteristic curves (4). Using an inadequate metric could lead to suboptimal model parameter selection.
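The point about rare events can be illustrated with a small sketch. The dataset below is synthetic and deterministic (10 events among 1,000 cases, with scores constructed by hand), and the two helper functions are plain-Python assumptions written for this example rather than any library's implementation. The area under the receiver operating characteristic curve is computed as the fraction of event/non-event pairs ranked correctly, and the precision-recall curve is summarized by average precision.

```python
def roc_auc(labels, scores):
    """Fraction of (event, non-event) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each event.

    Assumes at least one event and no tied scores.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    events_seen = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            events_seen += 1
            precision_sum += events_seen / rank
    return precision_sum / events_seen


# Synthetic data: 990 non-events with scores spread over [0, 1),
# and 10 events with scores between 0.905 and 0.995.
neg_scores = [i / 990 for i in range(990)]
pos_scores = [(905 + 10 * k) / 1000 for k in range(10)]
labels = [0] * len(neg_scores) + [1] * len(pos_scores)
scores = neg_scores + pos_scores

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)

print(f"ROC AUC:           {auc:.3f}")
print(f"average precision: {ap:.3f}")
print(f"event prevalence:  {prevalence:.3f}")
```

On this constructed example the ROC AUC is about 0.95, whereas average precision is about 0.11 against a prevalence of 0.01. Most cases flagged at high scores are still non-events, which the ROC summary alone does not make visible.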

As the authors state, “The basic challenge in modeling is to build an appropriate model.” However, as we have seen, this is not a simple problem. To truly link physiology with outcomes, we must invest time in clearly defining the problem to be solved, knowing our dataset, selecting the most suitable modeling approach, and guiding ourselves with an appropriate metric.

Nature offers us many problems to solve, but it does not tell us with which formulas to solve them. It is our job to reflect on the most appropriate model for our problem (5).

Footnotes

Originally Published in Press as DOI: 10.1164/rccm.202304-0718LE on May 9, 2023

Author disclosures are available with the text of this letter at www.atsjournals.org.

References

  • 1. Dianti J, Morris IS, Urner M, Schmidt M, Tomlinson G, Amato MBP, et al. Linking acute physiology to outcomes in the ICU: challenges and solutions for research. Am J Respir Crit Care Med. 2023;207:1441–1450. doi: 10.1164/rccm.202206-1216CI.
  • 2. Gutierrez G. Artificial intelligence in the intensive care unit. Crit Care. 2020;24:101. doi: 10.1186/s13054-020-2785-y.
  • 3. Gravesteijn BY, Steyerberg EW, Lingsma HF. Modern learning from big data in critical care: primum non nocere. Neurocrit Care. 2022;37:174–184. doi: 10.1007/s12028-022-01510-6.
  • 4. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10:e0118432. doi: 10.1371/journal.pone.0118432.
  • 5. Shu X, Ye Y. Knowledge discovery: methods from data mining and machine learning. Soc Sci Res. 2023;110:102817. doi: 10.1016/j.ssresearch.2022.102817.

Articles from American Journal of Respiratory and Critical Care Medicine are provided here courtesy of American Thoracic Society
