Abstract
General outcome prediction models developed for use with large, multicenter databases of critically ill patients may not correctly estimate mortality if applied to a particular group of patients that was under-represented in the original database. The development of new diagnostic weights has been proposed as a method of adapting the general model – the Acute Physiology and Chronic Health Evaluation (APACHE) II in this case – to a new group of patients. Such customization must be empirically tested, because the original model may not contain an appropriate set of predictive variables for the particular group. In this issue of Critical Care, Arabi and co-workers present the results of the validation of a modified model of the APACHE II system for patients receiving orthotopic liver transplants. Their conclusions are uncertain for two reasons: the database was highly heterogeneous and did not take all important variables into account, and the sample was too small for the Hosmer–Lemeshow goodness-of-fit test to be used appropriately.
Keywords: APACHE II, liver transplantation, mortality, scoring systems, outcome prediction
Introduction
In this issue of Critical Care, Arabi and co-workers present the results of the validation of a modified model of the Acute Physiology and Chronic Health Evaluation (APACHE) II for patients receiving orthotopic liver transplants [1]. They retrospectively used data from 174 patients admitted to two hospitals (King Fahad National Guard Hospital in Riyadh, Saudi Arabia, and the University of Wisconsin, Madison, WI, USA) to validate the modification of the APACHE II prognostic model described by Derek Angus and colleagues [2]. Is the approach of Arabi and co-workers correct? Can the results and the approach be generalized to other settings?
The APACHE prognostic systems
Described in 1985 [3], the APACHE II prognostic system is one of the most widely used general outcome models. Developed for use with unselected groups of critically ill adults, the system uses three types of data to give the user a probability of death at hospital discharge: the Acute Physiology Score (APS), based on the most deranged physiological and laboratory values during the first 24 hours in the intensive care unit (ICU); the premorbid status, based on a list of chronic diseases and conditions apparent at admission to hospital; and the diagnostic category, based on a list of 29 medical and 24 surgical diagnoses.
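As a rough illustration of how these inputs are combined into a probability, the sketch below uses the logistic form and the coefficients as we recall them from the original APACHE II publication [3] (intercept −3.517, 0.146 per point of the total score, 0.603 after emergency surgery); they should be checked against that paper before any use. The total score is the APS plus the age and chronic health points. The function name and the diagnostic weight in the example are hypothetical.

```python
import math


def apache2_death_probability(apache2_score, diagnostic_weight,
                              emergency_surgery=False):
    """Predicted probability of hospital death from a total APACHE II score.

    Coefficients are quoted from memory of Knaus et al. [3]; verify them
    against the original publication before relying on the result.
    """
    logit = -3.517 + 0.146 * apache2_score + diagnostic_weight
    if emergency_surgery:
        logit += 0.603
    # Logistic transformation from the logit to a probability.
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient: total score of 20 and an illustrative
# (not published) diagnostic category weight of -0.5.
risk = apache2_death_probability(20, diagnostic_weight=-0.5)
print(f"predicted hospital mortality: {risk:.3f}")
```

Because the diagnostic weight enters the logit additively, changing it shifts the predicted risk of every patient in that category without altering the contribution of the score itself.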
Because the system was developed in the early 1980s, several diseases and conditions were not well represented in the original database. This fact, together with substantial changes in the outcome of major diseases and the need to incorporate other variables, led the authors to undertake a major update, the APACHE III prognostic system, published in 1991 [4]. This updated system, being commercial, has not had the impact of its free predecessor. Its calibration is better, which probably reflects the updated database more than any major change in the statistical construct of the model: it was found to be quite well calibrated for the USA [5], except in diagnostic groups for which the therapeutic approach has changed substantially, such as acute myocardial infarction. In other settings, such as Spain, calibration problems remained, prompting a major recalibration or customization of the APACHE III system [6].
The customization of an outcome prediction model
Customization – that is, modification of the equations that transform a score (or the directly measured variables) to a probability of mortality – has been suggested as a possible approach when there is evidence that a given model is not fully appropriate and an unbiased estimation of mortality is needed. Preliminary work [7,8] showed that slight modifications of the logistic regression equations would suffice. Later, Zhu et al., working with computer simulations [9], and groups using independent databases [10,11] showed that customization was feasible and would improve the calibration of the model but that some problems would remain, so that there would still be a need for independent validation of the customized model.
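One simple form of such a modification can be sketched as follows: the logit of the original prediction is kept as the only covariate, and a new intercept and slope are refitted on local data. This is a minimal sketch under our own assumptions, not the procedure of any of the cited studies; the function names and the toy data are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recalibrate(original_probs, died, iterations=25):
    """Fit died ~ sigmoid(a + b * logit(original_prob)) by Newton-Raphson.

    Returns (a, b); a = 0 and b = 1 would leave the original model unchanged.
    """
    x = [logit(p) for p in original_probs]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        # Gradient (g) and negative Hessian (h) of the log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, died):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2x2 system h * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Toy data (hypothetical): original predictions and observed deaths.
original_probs = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.5, 0.5, 0.7, 0.7, 0.9, 0.9]
died = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0]

a, b = recalibrate(original_probs, died)
customized = [sigmoid(a + b * logit(p)) for p in original_probs]
```

By construction, the customized predictions sum to the observed number of deaths in the sample used for fitting, which is why good calibration in that same sample says little about performance elsewhere.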
This need for validation applies to the work by Angus and colleagues [2] on the development of new coefficients for the APACHE II system to adapt it to patients after liver transplantation. Those authors' approach, which was to develop a new diagnostic weighting for this category of patients, is attractive, because it is simple. However, it assumes that the APACHE II model incorporates the most important prognostic variables in the setting of liver transplantation, and this assumption needs to be justified.
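To make the idea of a single new diagnostic weight concrete, the sketch below holds the rest of the equation fixed and estimates one additive term for the new category. The source does not state how Angus and colleagues derived their weight, so this is only one plausible way of doing it; the base coefficients are the APACHE II values as we recall them from [3], and the function name, scores and outcomes are hypothetical.

```python
import math


def fit_diagnostic_weight(base_logits, died, lo=-20.0, hi=20.0, tol=1e-9):
    """Maximum-likelihood estimate of one additive weight, all else fixed.

    For a single additive term, the likelihood equation reduces to
    'expected deaths = observed deaths', solved here by bisection.
    """
    observed = sum(died)
    if not 0 < observed < len(died):
        raise ValueError("need at least one death and one survivor")

    def expected(weight):
        return sum(1.0 / (1.0 + math.exp(-(z + weight))) for z in base_logits)

    # expected() increases with the weight, so bisection is safe.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected(mid) < observed:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical transplant patients: total APACHE II scores and outcomes.
scores = [8, 10, 12, 15, 18, 20, 24, 30]
died = [0, 0, 0, 0, 1, 0, 0, 1]

# Logit from the general model without any diagnostic term
# (coefficients recalled from [3]; verify before use).
base_logits = [-3.517 + 0.146 * s for s in scores]

weight = fit_diagnostic_weight(base_logits, died)
fitted = [1.0 / (1.0 + math.exp(-(z + weight))) for z in base_logits]
```

A single weight of this kind can only shift all predictions for the category up or down; it cannot compensate for prognostic variables that the score does not capture.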
Does the paper by Arabi and colleagues answer our questions?
It does not. The work done by Arabi and his co-workers was based on a highly heterogeneous database, and patients were treated in two very different institutions. Differences in the prevalence of chronic conditions and the degree of physiologic disorder as well as differences in the procedures followed during the liver transplantation (liver nutrition solutions, cold ischemia time, etc.) could have influenced the outcome for these patients. Moreover, the small number of patients in the sample analyzed makes the Hosmer–Lemeshow goodness-of-fit test underpowered to reveal potential differences between the predicted and the actual mortality. The better calibration of the customized model is promising, but it should be empirically tested in a larger database, constructed to reflect the case mix of liver transplantation patients.
For the moment, therefore, it remains to be shown whether the approach used – to derive a new coefficient for the APACHE II system to be applied to a specific group of patients – is potentially useful and will perform better than its predecessor.
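The sample-size concern about the Hosmer–Lemeshow test can be made concrete with a short sketch of the statistic. With 174 patients and the usual ten groups of ascending predicted risk, each group holds only 17 or 18 patients, so the expected number of deaths per group is small. The code below is an illustrative implementation under our own assumptions: the function names are hypothetical, the degrees of freedom to use (commonly the number of groups for an external sample, two fewer for the development sample) should be checked against a statistics reference, and the p-value helper is valid for even degrees of freedom only.

```python
import math


def hosmer_lemeshow(probs, died, groups=10):
    """Hosmer-Lemeshow C statistic over groups of ascending predicted risk."""
    # Sort by predicted probability only, so ties are not ordered by outcome.
    pairs = sorted(zip(probs, died), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / len(chunk)
        denom = len(chunk) * mean_p * (1.0 - mean_p)
        if denom > 0.0:
            stat += (observed - expected) ** 2 / denom
    return stat


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Toy check (hypothetical data): every pair of patients has one death and
# a predicted risk of 0.5, so observed equals expected in each group.
stat = hosmer_lemeshow([0.5] * 20, [0, 1] * 10, groups=10)
p_value = chi2_sf_even(stat, df=10)
```

A non-significant result from such a test in a small sample is therefore weak evidence of good calibration, which is the reason for asking for a larger validation database.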
Competing interests
None declared.
Abbreviations
ICU = intensive care unit.
See related Research article: http://ccforum.com/content/6/3/245
References
- Arabi Y, Abbasi A, Goraj R, Al-Abdulkareem A, Al Shimemeri A, Kalayoglu M, Wood K. External validation of a modified model of Acute Physiology and Chronic Health Evaluation (APACHE) II for orthotopic liver transplant patients. Crit Care. 2002;6:245–250. doi: 10.1186/cc1497.
- Angus DC, Clermont G, Kramer DJ, Linde-Zwirble WT, Pinsky MR. Short-term and long-term outcome prediction with the Acute Physiology and Chronic Health Evaluation II System after orthotopic liver transplantation. Crit Care Med. 2000;28:150–156. doi: 10.1097/00003246-200001000-00025.
- Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818–829.
- Knaus WA, Wagner DP, Draper EA, Zimmerman JE, Bergner M, Bastos PG, Sirio CA, Murphy DJ, Lotring T, Damiano A. The APACHE III prognostic system. Risk prediction of hospital mortality for critically ill hospitalized adults. Chest. 1991;100:1619–1636. doi: 10.1378/chest.100.6.1619.
- Zimmerman JE, Wagner DP, Draper EA, Wright L, Alzola C, Knaus WA. Evaluation of acute physiology and chronic health evaluation III predictions of hospital mortality in an independent database. Crit Care Med. 1998;26:1317–1326. doi: 10.1097/00003246-199808000-00012.
- Rivera-Fernandez R, Vazquez-Mata G, Bravo M, Aguayo-Hoyos E, Zimmerman J, Wagner D, Knaus W. The Apache III prognostic system: customized mortality predictions for Spanish ICU patients. Intensive Care Med. 1998;24:574–581. doi: 10.1007/s001340050618.
- Le Gall J-R, Lemeshow S, Leleu G, Klar J, Huillard J, Rué M, Teres D, Artigas A. Customized probability models for early severe sepsis in adult intensive care patients. JAMA. 1995;273:644–650.
- Apolone G, D'Amico R, Bertolini G, Iapichino G, Cattaneo A, De Salvo G, Melotti R. The performance of SAPS II in a cohort of patients admitted in 99 Italian ICUs: results from the GiViTI. Intensive Care Med. 1996;22:1368–1378. doi: 10.1007/s001340050266.
- Zhu B-P, Lemeshow S, Hosmer DW, Klar J, Avrunin J, Teres D. Factors affecting the performance of the models in the mortality probability model and strategies of customization: a simulation study. Crit Care Med. 1996;24:57–63. doi: 10.1097/00003246-199601000-00011.
- Moreno R, Apolone G. The impact of different customization strategies in the performance of a general severity score. Crit Care Med. 1997;25:2001–2008. doi: 10.1097/00003246-199712000-00017.
- Metnitz PG, Valentin A, Vesely H, Alberti C, Lang T, Lenz K, Steltzer H, Hiesmayr M. Prognostic performance and customization of the SAPS II: results of a multicenter Austrian study. Intensive Care Med. 1999;25:192–197. doi: 10.1007/s001340050815.