PLOS ONE. 2020 Dec 31;15(12):e0244989. doi: 10.1371/journal.pone.0244989

Validation of a simplified risk prediction model using a cloud based critical care registry in a lower-middle income country

Bharath Kumar Tirupakuzhi Vijayaraghavan 1,‡,*,#, Dilanthi Priyadarshini 2,‡,#, Aasiyah Rashan 2, Abi Beane 3, Ramesh Venkataraman 1, Nagarajan Ramakrishnan 1, Rashan Haniffa 3; the Indian Registry of IntenSive care (IRIS) collaborators
Editor: Aleksandar R Zivkovic
PMCID: PMC7775074  PMID: 33382834

Abstract

Background

The use of severity of illness scoring systems such as the Acute Physiology and Chronic Health Evaluation in lower-middle income settings comes with important limitations, primarily due to data burden, missingness of key variables and lack of resources. To overcome these challenges, a simplified model for Asia, designated e-TropICS, was previously developed. We sought to externally validate this model using data from a multi-centre critical care registry in India.

Methods

Seven ICUs from the Indian Registry of IntenSive care (IRIS) contributed data to this study. Patients > 18 years of age with an ICU length of stay > 6 hours were included. Data including age, gender, co-morbidity, diagnostic category, type of admission, vital signs, laboratory measurements and outcomes were collected for all admissions. e-TropICS was calculated as per the original methods. The area under the receiver operator characteristic curve (AUROC) was used to express the model's power to discriminate between survivors and non-survivors. For all tests of significance, a 2-sided P ≤ 0.05 was considered significant. AUROC values were considered poor when ≤ 0.70, adequate from 0.71 to 0.80, good from 0.81 to 0.90, and excellent at 0.91 or higher. Calibration was assessed using the Hosmer-Lemeshow C-statistic.

Results

We included data from 2062 consecutive patient episodes. The median age of the cohort was 60 years, and patients were predominantly male (n = 1350, 65.47%). Mechanical ventilation and vasopressors were administered at admission in 504 (24.44%) and 423 (20.51%) patients, respectively. Overall, mortality at ICU discharge was 10.28% (n = 212). Discrimination (AUC) for the e-TropICS model was 0.83 (95% CI 0.812–0.839), with a Hosmer-Lemeshow C-statistic p value of < 0.05. The best sensitivity and specificity (84% and 72%, respectively) were achieved at an optimal cut-off probability of 0.29.

Conclusion

e-TropICS showed good discriminative capacity and has utility in the care of critically unwell patients in the South Asia region. Further refinement of calibration in larger datasets from India and across the South-East Asia region will help improve model performance.

Introduction

Severity of illness scoring systems such as the Simple Acute Physiology Score (SAPS) [1] and the Acute Physiology and Chronic Health Evaluation (APACHE) [2] help in risk prediction, benchmarking, quality improvement and patient selection for research. Over the past three decades, several iterations of these models have been developed and validated based on changes in the epidemiology of critical illness and substantial improvements in survival [3, 4]. Most of these models have been evaluated in the context of high-income countries (HICs). There are several limitations to the use of these models in middle income and lower-middle income countries (LMICs) such as differences in epidemiology of critical illness, including the high burden of tropical infections, the lack of resources for data collection, burden of data collection due to the large number of variables, missing variables and the absence of electronic health records that would otherwise facilitate seamless data flow [5]. To overcome these problems, researchers have developed and tested newer simplified models in LMICs [6, 7].

Recently, in Asia, one such simplified model, e-TropICS (described as 'model 1' in the original manuscript), was developed and validated [8]. Cognizant of the limited availability of variables, and of the consequent high proportion of missingness that has prevented intensive care units (ICUs) in resource-limited settings from utilising existing prognostic scores, the researchers aimed to develop a comparable score based on a more universally obtainable set of variables. However, this score has not been externally validated. We therefore aimed to validate the performance of the e-TropICS model on a multi-centre Indian dataset from the recently established Indian Registry of IntenSive care (IRIS) [9].

Methods

Study setting

Seven ICUs, located in six private and one not-for-profit institution and all part of the Indian Registry of IntenSive care (IRIS), contributed data to this study. Of these, five were general (mixed medical-surgical) ICUs and two were medical ICUs. None of the participating ICUs contributed data to the original model development; they were involved only in this external validation exercise.

The Indian Registry of IntenSive care, a cloud-based registry of critical care units, was established in January 2019 [9]. Details of the implementation and preliminary results of the case-mix program have been previously published [9].

Patients

All patients reported to the registry between January 2019 and May 2019 were considered. Patients > 18 years of age with an ICU length of stay > 6 hours were included in the study. Patients with missing outcomes and those not meeting the inclusion criteria were excluded.

Data collection

This retrospective study used data collected as part of the IRIS dataset. Age, gender, pre-existing co-morbidity, diagnostic category, type of admission (planned, unplanned, medical or surgical), physiological vital signs and laboratory measurements were collected as per the definitions described for e-TropICS (Table 1) for all consecutive admissions. ICU outcomes rather than hospital outcomes were collected because of well-described logistical challenges in such settings [5, 8]. Data were collected daily by either nursing staff or data collectors appointed to the registry network, all of whom had been trained in the process of data acquisition. Daily telephone reminders encouraging data input, and checks for consistency of the number of admissions, discharges and outcomes from each ICU, were undertaken by staff from the central coordinating centre. In-built measures in the data entry portal, such as mandatory fields, range validations, and drop-down menus and checkboxes as opposed to free-text entries, were employed to promote fidelity of data recording.

Table 1. Patient characteristics at the time of ICU presentation.

| Characteristic | All (n = 2062) | Dead (n = 212, 10.3%) | Alive (n = 1850, 89.7%) | Dead, original model (n = 1031) | Alive, original model (n = 2590) |
|---|---|---|---|---|---|
| Planned admission, n (%) | 383 (18.57) | 31 (14.62) | 352 (19.03) | – | – |
| Gender male, n (%) | 1350 (65.47) | 136 (64.15) | 1214 (65.62) | – | – |
| LoS, mean (SD) | 3.41 (4.20) | 3.22 (3.91) | 5.34 (6.34) | – | – |
| Age, median (range) | 60 (18–110) | 64 (22–92) | 60 (18–110) | 54 (16–102) | 56 (16–103) |
| Heart rate, mean (SD) | 94.35 (22.95) | 104.50 (26.59) | 93.18 (22.21) | 109 (24) | 100 (24) |
| GCS, median (range) | 14 (3–14) | 10 (3–14) | 14 (3–14) | 9 (2–15) | 15 (2–15) |
| Respiratory rate, mean (SD) | 22.83 (5.58) | 25.70 (8.06) | 22.50 (5.12) | 24 (8) | 23 (6) |
| Systolic BP, mean (SD) | 129.20 (25.91) | 120.53 (29.14) | 130.19 (25.33) | 132 (35) | 139 (29) |
| Blood urea, median (range) | 35 (0.8–400) | 58.5 (11–239) | 33 (0.8–400) | 59 (0.6–411) | 32 (0.9–672) |
| Haemoglobin, g/dL, mean (SD) | 11.46 (2.77) | 10.34 (2.62) | 11.59 (2.76) | 10.8 (2.8) | 11.6 (2.4) |
| Vasopressor used on admission, n (%) | 423 (20.51) | 124 (58.49) | 299 (16.16) | 317 (35.7) | 177 (8.7) |
| Mechanical ventilation on admission, n (%) | 504 (24.44) | 122 (57.55) | 382 (20.65) | 899 (87) | 2054 (79) |

Original model columns report the development cohort from Haniffa et al. [8]; – = not reported.

Ethics and consent

The study was approved by the Institutional Ethics Committee centrally at the study coordinating centre (Institutional Ethics Committee, Apollo Main Hospital- AMH-021/07-19). The informed consent model used in the registry has been described and published previously [9]. Briefly, participating sites in the registry either modified their general intensive care consent to include a clause on deidentified data collection or had a waiver of the individual consent requirement.

Statistical analysis

Availability of physiological and laboratory measurements was described using descriptive statistics. e-TropICS was calculated as per the authors' original methods [8]. The area under the receiver operator characteristic curve (AUROC) was used to express the model's power to discriminate between survivors and non-survivors. For all tests of significance, a 2-sided P ≤ 0.05 was considered significant. AUROC values were considered poor when ≤ 0.70, adequate from 0.71 to 0.80, good from 0.81 to 0.90, and excellent at 0.91 or higher [10]. Calibration for the model was assessed using the Hosmer-Lemeshow C-statistic; higher values indicate poorer calibration.
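For readers unfamiliar with these two measures, the following is a minimal illustrative sketch in Python (numpy only, on synthetic data; the function names are ours, and this is not the study's Stata code). It computes the AUROC via the Mann-Whitney formulation and the Hosmer-Lemeshow C-statistic over deciles of predicted risk:

```python
import numpy as np

def auroc(y, p):
    """AUROC as the Mann-Whitney probability that a randomly chosen
    non-survivor has a higher predicted probability than a survivor."""
    pos, neg = p[y == 1], p[y == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

def hosmer_lemeshow(y, p, n_groups=10):
    """Hosmer-Lemeshow C-statistic: split patients into risk deciles and
    compare observed vs expected deaths within each decile."""
    order = np.argsort(p)
    y_sorted, p_sorted = y[order], p[order]
    stat = 0.0
    for idx in np.array_split(np.arange(y.size), n_groups):
        n_g = idx.size
        observed = y_sorted[idx].sum()
        expected = p_sorted[idx].sum()
        p_bar = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1 - p_bar))
    return stat  # compared against a chi-square with n_groups - 2 df

# Synthetic illustration: risk rises with a single severity variable.
rng = np.random.default_rng(42)
severity = rng.normal(size=2000)
p_pred = 1.0 / (1.0 + np.exp(-(severity - 1.5)))
y_obs = rng.binomial(1, p_pred).astype(float)

auc = auroc(y_obs, p_pred)
hl = hosmer_lemeshow(y_obs, p_pred)
```

Because the synthetic outcomes here are drawn from the predicted probabilities themselves, the sketch will show good discrimination and calibration; in the study's external validation, discrimination held up while calibration did not.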

All analysis was performed using Stata software version 13.1 [11].

Handling of missing data and analysis

When faced with high proportions of missing data, one approach is to assume normality for a variable when it is not measured or unavailable, resulting in a score of "0" in weighted scoring systems. Such an approach may not be justified in LMICs, where measurements may be unavailable because of resource constraints or differing approaches to decision-making in critical illness, and assumptions of normality made in this way can adversely impact model performance by underestimating severity scores. In this study, multiple imputation (MI) with chained equations was employed to handle missing data, under the assumption that the missingness of a variable depends on some of the other observed variables, i.e. missing at random (MAR). Chained equations are a multivariate approach that allows different types of data to be modelled flexibly within the same model, with the imputation rule chosen according to the type of data (predictive mean matching for continuous variables, logit for categorical variables, and so on). The number of imputations (M) was set at 20 and "k-nearest neighbours" (kNN#, Stata syntax) was set at 10. MI generates several values reflecting the uncertainty in the estimation of the imputed value. The scores (and their mortality probabilities) were calculated individually for each of the 20 imputed datasets; the mean of the 20 probabilities was then calculated and used as the MI mortality prediction. As a secondary method, we also performed a complete case analysis and report its AUROC.
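The chained-equations idea can be sketched as follows. This is a deliberately simplified numpy illustration on synthetic data, not the study's Stata procedure (Stata's `mi impute chained` selects an appropriate regression type per variable and uses predictive mean matching with kNN donors; here every variable is treated as continuous and a randomly drawn observed residual stands in for the donor step):

```python
import numpy as np

def mice_impute(X, n_cycles=5, rng=None):
    """One simplified chained-equations imputation of a numeric matrix
    (NaN marks missing). Each incomplete variable is regressed on the
    others; missing entries get the linear prediction plus a randomly
    drawn observed residual (a crude stand-in for predictive mean
    matching)."""
    rng = rng or np.random.default_rng()
    X = X.copy()
    missing = np.isnan(X)
    # Initialise missing cells with column means.
    for j in range(X.shape[1]):
        X[missing[:, j], j] = np.nanmean(X[:, j])
    # Cycle through the incomplete variables a few times.
    for _ in range(n_cycles):
        for j in range(X.shape[1]):
            if not missing[:, j].any():
                continue
            obs = ~missing[:, j]
            A = np.column_stack([np.ones(X.shape[0]), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            resid = X[obs, j] - A[obs] @ beta
            pred = A[~obs] @ beta
            X[~obs, j] = pred + rng.choice(resid, size=pred.size)
    return X

# Synthetic demo: two correlated variables, ~20% of one set to missing.
rng = np.random.default_rng(7)
a = rng.normal(size=500)
b = 0.8 * a + rng.normal(scale=0.5, size=500)
X = np.column_stack([a, b])
X[rng.random(500) < 0.2, 1] = np.nan

# M imputed datasets; the final prediction would average the score's
# mortality probabilities across them (M = 20 in the study).
M = 5
imputed = [mice_impute(X, rng=np.random.default_rng(m)) for m in range(M)]
```

Running the imputation M times with different random draws yields M complete datasets, and averaging the model's predicted probabilities across them propagates the imputation uncertainty into the final prediction.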

Results

Characteristics of population and availability of variables for the e-TropICS model

During the evaluation period, 2094 consecutive patient episodes were reported to the IRIS registry from the seven participating centres. Thirty-two patients were excluded (19 were less than 18 years of age and 13 had no outcome information), leaving 2062 patients for the final analysis.

The characteristics, demographics and outcomes for these episodes are described in Table 1. The median age of the cohort was 60 years, and patients were predominantly male (n = 1350, 65.47%). Planned admissions accounted for 383 (18.57%) episodes. Mechanical ventilation and vasopressors were administered at admission in 504 (24.44%) and 423 (20.51%) patients, respectively. Overall, mortality at ICU discharge was 10.28% (n = 212). S1 Fig presents the commonest APACHE II diagnostic categories and the corresponding ICU outcome information.

Availability of the variables for the e-TropICS model is described in Table 2. Availability was lowest for blood urea (88.60%) and highest for gender and admission type (100%). For all other variables, availability ranged from 95% to 100%.

Table 2. Availability of the parameters of the e-TropICS model.

| Parameter | Availability out of 2062, n (%) |
|---|---|
| Eye opening | 2056 (99.71) |
| Motor response | 2057 (99.76) |
| Verbal response | 2056 (99.71) |
| GCS | 2056 (99.71) |
| Mechanically ventilated (yes/no at admission) | 2052 (99.52) |
| Vasoactive drugs (yes/no at admission) | 2050 (99.42) |
| Systolic BP | 2057 (99.76) |
| Respiratory rate | 2056 (99.71) |
| Haemoglobin | 2013 (97.62) |
| Blood urea | 1827 (88.60) |

Ability of the model to predict mortality

Discrimination (AUC) for the e-TropICS model was 0.83 (95% CI 0.81–0.84) (Table 3, Fig 1), with a Hosmer-Lemeshow C-statistic p value of < 0.05 suggesting poor model fit. The best sensitivity and specificity (84% and 72%, respectively) were achieved at an optimal cut-off probability of 0.29. Fig 2 provides the calibration plot of the expected probability vs. the observed probability.

Table 3. Performance of the e-TropICS model with multiple imputation.

| Performance item | MI model (N = 2062) |
|---|---|
| Probability, mean (SD) | 0.29 (0.003) |
| Optimal cut-off probability | 0.23 |
| Sensitivity (at optimal cut-off) | 0.84 |
| Specificity (at optimal cut-off) | 0.72 |
| AUC (95% CI) | 0.83 (0.812–0.839) |
| H/L C-statistic (p) | 340.87 (0.00) |
| Brier score (95% CI) | 0.12 (0.120–0.127) |
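The "optimal cut-off" and Brier score reported in Table 3 can be illustrated with a short numpy sketch on synthetic data (the study does not state which criterion it used for the cut-off; Youden's J, shown here, is one common choice and the helper names are our own):

```python
import numpy as np

def youden_cutoff(y, p):
    """Scan every distinct predicted probability as a candidate cut-off
    and return the one maximising Youden's J = sensitivity + specificity - 1,
    with the sensitivity and specificity achieved there."""
    best = (None, -1.0, None, None)
    for c in np.unique(p):
        pred_pos = p >= c
        tp = np.sum(pred_pos & (y == 1))
        fn = np.sum(~pred_pos & (y == 1))
        tn = np.sum(~pred_pos & (y == 0))
        fp = np.sum(pred_pos & (y == 0))
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1.0
        if j > best[1]:
            best = (c, j, sens, spec)
    return best

def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome;
    lower is better, and it reflects both discrimination and calibration."""
    return np.mean((p - y) ** 2)

# Synthetic illustration (not the study data).
rng = np.random.default_rng(3)
p_pred = rng.uniform(0.01, 0.99, size=1000)
y_obs = rng.binomial(1, p_pred).astype(float)

cutoff, j, sens, spec = youden_cutoff(y_obs, p_pred)
brier = brier_score(y_obs, p_pred)
```

S1 Table, which tabulates sensitivity and specificity at different cut-points, corresponds to the scan inside `youden_cutoff`.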

Fig 1. ROC curve for imputed model.


Fig 2. Calibration plot of the expected probability (deciles) versus the observed probability.


The results of the complete case analysis are presented in S2 Table and S2 Fig.

Discussion

Our study validates the performance of a simplified prognostic model designed for use in critical care units where the information needed to calculate prognostic models from HICs may be absent or burdensome to collect. In this multi-centre cohort from the recently established IRIS critical care registry, e-TropICS had good ability to discriminate death but poor calibration. This suggests that, in this cohort, the model can identify those patients at greatest risk of death but has less ability to differentiate between degrees of severity of illness.

Several prognostic models for critically ill adults are validated and in use in HIC healthcare systems. However, their applicability remains limited in settings with limited point-of-care testing, scarce data collection resources and, perhaps, more judicious use of laboratory tests. Whilst missingness can be managed for the purposes of performance assessment and validation, a score that is not easily calculated has limited application in the clinical setting. The e-TropICS model, developed from a South Asian dataset, attempts to overcome several of these challenges by limiting model covariates to clinical information that would likely be requested by clinicians and that is likely available in all health systems. Availability of data for validation was much higher than reported in previous literature from similar middle-income settings and ranged from 95–100%; only blood urea had an availability below 95%. Of note, the decision to perform laboratory tests is influenced by clinician preference, in addition to access to equipment, disposables, costs and expertise.

The e-TropICS model had good discriminative ability (AUC 0.83) but poor calibration. This is not surprising, as several well-established prediction models have shown poor calibration when validated externally [12]. Several reasons could explain this, including limitations of the HL test itself, such as its high sensitivity to sample size [13, 14], and differences in case-mix [15]. Another potential explanation is the lower mortality in our dataset compared with the dataset from which e-TropICS was developed (10.3% versus 28.4%); previous research has shown that even small differences in mortality can affect the calibration of a model [16]. Whilst both good discrimination and good calibration are desirable before a prognostic score is deemed suitable for clinical practice, in reality, once an appropriate and implementable model is selected for use in a clinical setting, it can, and perhaps should, be regularly recalibrated for the population to which it is being applied. Our group is undertaking model refinement in IRIS and in the recently established nine-country critical care registry collaboration (Crit Care Asia) in South and Southeast Asia.
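Recalibrating an existing score for a local population, as suggested above, need not require rebuilding the model. One common approach (illustrated here as a sketch on synthetic data; the paper does not commit to a specific method) is logistic recalibration, which refits only an intercept and slope on the logit of the predicted probabilities:

```python
import numpy as np

def logit(p):
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return np.log(p / (1.0 - p))

def fit_recalibration(y, p, n_iter=25):
    """Fit outcome ~ intercept + slope * logit(p) by Newton-Raphson.
    An intercept near 0 and slope near 1 would indicate the original
    probabilities are already well calibrated for this population."""
    X = np.column_stack([np.ones(p.size), logit(p)])
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def recalibrated(p, beta):
    """Map original predictions onto the recalibrated scale."""
    return 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * logit(p))))

# Synthetic miscalibrated model: true risk is shifted and stretched on
# the logit scale relative to the predicted probabilities.
rng = np.random.default_rng(11)
p_pred = rng.uniform(0.05, 0.95, size=5000)
true_p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * logit(p_pred))))
y_obs = rng.binomial(1, true_p).astype(float)

beta = fit_recalibration(y_obs, p_pred)
p_new = recalibrated(p_pred, beta)
```

Because the refit touches only two parameters, this kind of update can be repeated periodically on local registry data without altering the score's variables or weights.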

Our study has also demonstrated the feasibility of using registry data in a resource-limited setting to validate a locally relevant model. Investment in systems that enable routine data collection in LMICs remains limited; consequently, efforts to benchmark services and evaluate interventions to improve care remain hampered by low availability of information. The greater availability reported in this validation compared with earlier studies in the region suggests that investment in health system strengthening, including the implementation of critical care registries, can improve the availability of information during critical care admission.

Conclusion

e-TropICS has utility in the care of critically unwell patients in the South Asia region, offering a simplified prediction model with a low data collection burden for resource-limited settings. Further refinement of its calibration in larger datasets from India and across the South-East Asia region will help improve model performance.

Supporting information

S1 Fig. Commonest APACHE II diagnostic categories and ICU outcomes.

(TIF)

S2 Fig. ROC curve for complete case analysis.

(TIF)

S1 Table. Variation of sensitivity and specificity at different cutpoints.

(DOCX)

S2 Table. Performance of the e-TropICS model (complete case analysis).

(DOCX)

S3 Table. Number of patients included from each site.

(DOCX)

Acknowledgments

IRIS collaborators:

  1. Dr Devachandran Jayakumar and Dr Pratheema Ramachandran: Apollo Specialty Hospital, OMR, Chennai, India

  2. Dr Deedipiya Devaprasad and Dr Vijay Chakravarthy: Apollo Specialty Hospital, Teynampet, Chennai, India

  3. Dr Ashwin Mani and Dr Meghena Mathew: Apollo First Med Hospital, Kilpauk, Chennai, India

  4. Dr Ebenezer Rabindrarajan and Dr Usha Rani: Apollo Specialty Hospital, Vanagaram, Chennai, India

  5. Dr Niyaz Channanath Ashraf: IQRAA Hospital, Calicut, India

  6. Dr Jaganathan Selva: Mehta Hospital, Chennai, India

Data Availability

Pooled data from IRIS are available from the IRIS Dashboard at https://nicst.com/picu-iris-public/. The IRIS collaboration supports and welcomes data sharing. Our agreement with participating sites in the registry is only for the sharing of deidentified data between them and the registry coordinating centre for the purposes of audit, quality improvement and specific research questions. We are not allowed to post data on a repository or any other public database. Raw data will be made available to qualified researchers who provide a detailed and methodologically sound proposal with specific aims that are clearly outlined. Such proposals will be screened by the IRIS Steering Committee for approval. Data sharing will be for the purposes of medical research and under the auspices of the consent under which the data were originally gathered. To gain access, qualified researchers will need to sign a data sharing and access agreement and will need to confirm that data will only be used for the agreed-upon purpose for which data access was granted. Researchers can contact the corresponding author through electronic mail (bharath@icuconsultants.com) for such access; alternatively, IRIS can be contacted at info@irisicuregistry.org and joinus@irisicuregistry.org.

Funding Statement

Funding: This work is partially supported by the Wellcome Trust (https://wellcome.ac.uk/what-we-do/our-work/innovations-flagships) and the Mahidol Oxford Tropical Research Unit (https://www.tropmedres.ac/). Authors RH and AB were co-applicants on the Wellcome innovations grant WT215522/Z/19/Z. Role of funder: The funder had no role in the design, conduct, analysis or decision to submit for publication.

References

  1. Le Gall JR, Loirat P, Alperovitch A, Glaser P, Granthil C, Mathieu D, et al. A simplified acute physiology score for ICU patients. Crit Care Med 1984;12(11):975–7. doi: 10.1097/00003246-198411000-00012
  2. Knaus WA, Zimmerman JE, Wagner DP, Draper EA, Lawrence DE. APACHE: acute physiology and chronic health evaluation: a physiologically based classification system. Crit Care Med 1981;9(8):591–7. doi: 10.1097/00003246-198108000-00008
  3. Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study. JAMA 1993. doi: 10.1001/jama.270.24.2957
  4. Zimmerman JE, Kramer AA, McNair DS, Malila FM. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today's critically ill patients. Crit Care Med 2006. doi: 10.1097/01.CCM.0000215112.84523.F0
  5. Haniffa R, Isaam I, De Silva AP, Dondorp AM, De Keizer NF. Performance of critical care prognostic scoring systems in low and middle-income countries: a systematic review.
  6. Riviello ED, Kiviri W, Fowler RA, Mueller A, Novack V, Banner-Goodspeed VM, et al. Predicting mortality in low-income country ICUs: the Rwanda Mortality Probability Model (MPM). PLOS ONE 2016. doi: 10.1371/journal.pone.0155858
  7. Houweling TA, van Klaveren D, Das S, Azad K, Tripathy P, Manandhar D, et al. A prediction model for neonatal mortality in low- and middle-income countries: an analysis of data from population surveillance sites in India, Nepal and Bangladesh. Int J Epidemiol 2019;48(1):186–98. doi: 10.1093/ije/dyy194
  8. Haniffa R, Mukaka M, Munasinghe SB, De Silva AP, Saroj K, Jayasinghe A, et al. Simplified prognostic model for critically ill patients in resource limited settings in South Asia. Crit Care 2017;21:250. doi: 10.1186/s13054-017-1843-6
  9. Adhikari NKJ, Arali R, Attanayake U, Balasundaram S, Beane A, Chakravarthy V, et al. Implementing an intensive care registry in India: preliminary results of the case-mix program and an opportunity for quality improvement and research. Wellcome Open Res 2020;5:182. doi: 10.12688/wellcomeopenres.16152.2
  10. Hosmer DW, Lemeshow S. Applied Logistic Regression, 2nd ed. Chapter 5. New York, NY: John Wiley and Sons; 2000. pp. 160–164.
  11. StataCorp. Stata Statistical Software: Release 13. College Station, TX: StataCorp LP; 2013.
  12. Nassar AP Jr, Mocelin AO, Baptiston Nunes AL, Giannini FP, Brauer L, Andrade FM, et al. Caution when using prognostic models: a prospective comparison of 3 recent prognostic models. J Crit Care 2012;27:423.e1–423.e7. doi: 10.1016/j.jcrc.2011.08.016
  13. Kramer AA, Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: the Hosmer-Lemeshow test revisited. Crit Care Med 2007;35(9):2052–6. doi: 10.1097/01.CCM.0000275267.64078.B0
  14. Serrano N. Calibration strategies to validate predictive models: is new always better? Intensive Care Med 2012;38:1246–48. doi: 10.1007/s00134-012-2579-z
  15. Glance LG, Osler T, Shinozaki T. Effect of varying the case mix on the standardized mortality ratio and W statistic: a simulation study. Chest 2000;117:1112–7. doi: 10.1378/chest.117.4.1112
  16. Glance LG, Osler TM, Papadakos P. Effect of mortality rate on the performance of the Acute Physiology and Chronic Health Evaluation II: a simulation study. Crit Care Med 2000;28:3424–8. doi: 10.1097/00003246-200010000-00008

Decision Letter 0

Aleksandar R Zivkovic

14 Oct 2020

PONE-D-20-27684

Validation of a simplified risk prediction model using a cloud based critical care registry in a lower-middle income country

PLOS ONE

Dear Dr. Tirupakuzhi Vijayaraghavan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 28 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Aleksandar R. Zivkovic

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that there appears to be a discrepancy in the number of participants, variously given as n=2094 and n=2062.

3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

4.We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

5.Thank you for stating the following in your Competing Interests section: 

[None ].

Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now

 This information should be included in your cover letter; we will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

6. One of the noted authors is a group or consortium [IRIS collaborators.]. In addition to naming the author group, please list the individual authors and affiliations within this group in the acknowledgments section of your manuscript. Please also indicate clearly a lead author for this group along with a contact email address.

7. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please delete it from any other section.

8. Please ensure that you refer to Figure 1 in your text as, if accepted, production will need this reference to link the reader to the figure.


Reviewers' comments:

Reviewer #1: Congratulations for your research. This is an extensive work and the authors should be commended for this. This article will interest many readers. However, there are some minor comments in order to improve the presentation of this article.

- As exclusion criteria of the study you report only age and length of stay in the ICU. Do you mean that your dataset includes pregnant or burned patients, and patients with a diagnosis of brain death or end-stage malignancy? The above case mix is excluded from the majority of studies and scoring systems. It is not surprising then that the model has poor calibration. The fact that the above patients are not excluded from the initial dataset is a disadvantage of the method. You should add a discussion about this in the manuscript.

- As you described in detail the AUROC values that indicate discrimination, you should likewise describe the values that indicate whether the calibration is good or poor. (line 158)

-Line 224. Replace "By" with "by". Also in lines 223-227 there is a long sentence without any comma.

- Of course you need a further investigation in larger dataset in order to generalize the scoring system, but next time it would be better if you perform a prospective study, in order to minimize the missing data and to increase the accuracy of the validation.

Reviewer #2: This paper presents the validation of a new risk prediction model, termed e-TropICS, in a new data set. The results indicated that while the model discriminated well, the calibration was poor.

The methods used in the paper were valid and reasonable, and the results potentially useful to users of e-TropICS; however, the paper lacked clarity in some areas and appeared to be missing information in others. The paper would benefit from additional analyses and more detailed consideration, as detailed below.

Major comments:

It is not clear from the paper whether or not any of the institutions used in developing the model were included in the validation data. This is potentially important since case mixes may be different in different institutions. This important information should be carefully described for this data set (rather than stating as having been published elsewhere). The number of participants from each institution in this validation set should be included. If the validation set comprises a combination of institutions used and not used in the model building exercise, analysis containing each subset could be included in sensitivity analysis to show whether those subsets differ in terms of discrimination and calibration.

It would be useful if Table 1 contained summary information on the cohort from e-tropICS in an additional column - then readers can see where the lack of calibration may be arising from. Rather than use of all of the footnote symbols why not just say, for example: Mean heart rate (SD). Some abbreviations are missing definitions which should be included in the footnotes for the table. Why does Table 1 not include all of the e-tropICS variables too? Please update the table to include these. How do we interpret the summaries from Table 1? Obviously they show the difference between those who survived and otherwise but clinical interpretation would be nice here too - how sick is this cohort?

The lack of calibration could be an issue if it is in a region of the predictive distribution where it is important to calibrate well. It is important to know where the model is over or under-estimating the risk. As such, a calibration curve should be included in the paper to help to show readers where the lack of calibration is exhibited.

Sensitivity analysis should be used to show the discrimination and calibration of the data without imputation (i.e. complete case analysis). This will show how much imputation is impacting on the results.

The statistical analysis section describes application to multiple models ("each of the models'..." line 154), t-tests and chi-square tests. Other than the use of chi-squared in the HL C test, where are these tests performed? I don't see any results. It may be interesting to use such tests to compare the validation set with the e-tropICS though.

How to interpret the Brier score should be included in the statistical methods section and I note that the Brier score is not a particularly good measure of accuracy (https://diagnprognres.biomedcentral.com/articles/10.1186/s41512-017-0020-3).

In the multiple imputation section, predictive mean matching should not be used for categorical variables since it can generate non-integer results. For categorical variables one of the logit options should be used as described here: https://www.stata.com/manuals13/mimiimpute.pdf#mimiimpute. Table 2 should include all of the e-tropICS variables and the missingness for all of them. Categorical variables should show the levels so that we can see whether missingness relates to particular levels of a categorical variable. Line 202 of the results includes results not presented in Table 2. It is not immediately evident from the presentation of Table 2 what sorts of variables (categorical or numerical) these are.

Table 2 does not seem to contain all of the parameters described in the e-tropICS model 2 of Haniffa et al. (2017) and contains ones that should not be in the model. This is quite confusing and raises the question as to which model of Haniffa et al. was actually used.

Table 3 appears to be showing results that do not match the rest of the paper. Regarding sensitivity and specificity and cutpoints, it would be better to see a range of probability cutpoints together with the sensitivity so that the reader can see how different choices impact on results.

The ROC curve and estimate do not match the results and abstract.

The discussion sentence on lines 215-217 regarding the ability of "the model [to] identify those patients at greatest risk of death, but has less ability to differentiate between degrees of severity of illness." isn't shown by the results in their current state. Speculation about the case mix is similar. It would be good if you could show in the results that case mix differs and that there is an issue with calibration by severity of illness. The calibration curve and comparisons between the e-tropICS data and the validation data may help with that.

Minor comments:

Spaces are needed before the bracketing of abbreviations throughout, there is no need for repeating abbreviations, and some abbreviations are not defined where they first appear (e.g., AUROC and HL C in the abstract).

Why is the model described as Model 1 (line 109)? Are there other models that were meant to be presented also?

The e-tropICS model should be described as being internally validated rather than just "validated".

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Dec 31;15(12):e0244989. doi: 10.1371/journal.pone.0244989.r002

Author response to Decision Letter 0


16 Dec 2020

We would like to express our sincere gratitude to the Editorial team and Reviewers for their feedback and insights. We respond below:

Journal and Academic Editor comments:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.

Response: Changes made.

2. Please note that there appears to be a discrepancy in the number of participants, variously given as n=2094 and n=2062.

Response: Apologies for the lack of clarity. We excluded 32 patients who did not meet inclusion criteria from 2094. This has been specified under the first paragraph of the ‘results’ and under the ‘results’ section of the abstract as well.

3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

Response: This current study is a registry-based analysis. Registries would essentially be untenable if individual patient consent were required. Each participating hospital and ICU in the registry decided, in conjunction with its ethics committee, what the consent model would be at its site. At some participating hospitals this involved a waiver, and at others a common written consent form at ICU admission authorizing the collection of deidentified data. We have now added these details to the Methods section (under ethics and consent).

4. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Response: Our agreement with participating sites in the registry is only for the sharing of deidentified data between them and the registry coordinating centre for the purposes of audit, quality improvement and specific research questions. We are not allowed to post data on a repository or any other public database.

As stated under 'availability of study data', we are committed to data sharing. Data will be made available to qualified researchers who provide a detailed and methodologically sound proposal with specific aims that are clearly outlined. Such proposals will be screened by the registry steering committee for approval. Data sharing will be for the purposes of medical research and under the auspices of the consent under which the data were originally gathered.

To gain access, qualified researchers will need to sign a data sharing and access agreement and will need to confirm that data will only be used for the agreed upon purpose for which data access was granted. Researchers can contact the corresponding author (bharath@icuconsultants.com) through electronic mail for such access.

5. Thank you for stating the following in your Competing Interests section:

[None].

Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now

This information should be included in your cover letter; we will change the online submission form on your behalf.

Response: Done

6. One of the noted authors is a group or consortium [IRIS collaborators.]. In addition to naming the author group, please list the individual authors and affiliations within this group in the acknowledgments section of your manuscript. Please also indicate clearly a lead author for this group along with a contact email address.

Response: Updated- highlighted under author information section.

Lead (joint first) authors are Bharath Kumar Tirupakuzhi Vijayaraghavan and Dilanthi Priyadarshini.

The corresponding author is Bharath Kumar Tirupakuzhi Vijayaraghavan; the email address is bharath@icuconsultants.com.

7. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please delete it from any other section.

Response: Done.

8. Please ensure that you refer to Figure 1 in your text as, if accepted, production will need this reference to link the reader to the figure.

Response: Change made.

Response to reviewer comments: PLoS ONE

Reviewer 1:

Congratulations on your research. This is an extensive work and the authors should be commended for it. This article will interest many readers. However, there are some minor comments to improve the presentation of this article.

Response: Thank you.

As exclusion criteria of the study you report only the age and the length of stay in the ICU. Do you mean that your dataset includes pregnant or burned patients, and patients with a diagnosis of brain death or end-stage malignancy? These case mixes are excluded from the majority of studies and scoring systems. It is not surprising, then, that the model has a poor calibration. The fact that these patients are not excluded from the initial dataset is a disadvantage of the method. You should add a discussion about this in the manuscript.

Response: Yes, our dataset includes pregnant patients and patients with malignancy (any stage). Only 6 patients coded as 'burns' are part of the dataset. Patients who are admitted with catastrophic brain injury and go on to become brain dead, or are declared brain dead in the ICU, are also part of the dataset. However, the overall proportion of all these patient groups is only 0.3%, and in our view this is unlikely to have impacted calibration.

As you described in detail the AUROC values that indicate the discrimination, you should also describe the values that indicate whether the calibration is good or poor. (line 158)

Response: Thank you. This has been added to lines 165 and 166 and now reads: "Higher values of the Hosmer-Lemeshow C-statistic indicate poorer calibration."
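For readers who want the mechanics behind that interpretation, the C statistic can be sketched in a few lines of numpy. This is an illustrative sketch under our own assumptions (function and variable names are ours), not the implementation used in the study:

```python
import numpy as np

def hosmer_lemeshow_c(y_true, p_pred, g=10):
    """Hosmer-Lemeshow C statistic for a set of predicted probabilities.

    Subjects are split into g groups by ranked predicted risk, and the
    squared difference between observed and expected deaths is summed
    across groups, scaled by the binomial variance. Larger values mean
    a larger observed/expected divergence, i.e. poorer calibration; the
    p-value would come from a chi-square with g - 2 degrees of freedom.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    order = np.argsort(p, kind="stable")  # rank subjects by predicted risk
    stat = 0.0
    for idx in np.array_split(order, g):
        obs = y[idx].sum()   # observed deaths in the group
        exp = p[idx].sum()   # expected deaths = sum of predicted risks
        n = idx.size
        pbar = exp / n
        stat += (obs - exp) ** 2 / (n * pbar * (1 - pbar))
    return stat
```

A perfectly calibrated model yields a statistic near zero; the significant HL p-value reported in the paper corresponds to a large statistic.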

Line 224. Replace "By" with "by". Also, in lines 223-227 there is a long sentence without any comma.

Response: Done and corrections made.

Of course you need further investigation in a larger dataset in order to generalize the scoring system, but next time it would be better to perform a prospective study, in order to minimize the missing data and increase the accuracy of the validation.

Response: We thank the reviewer for this feedback and we will certainly incorporate this in our future study designs.

Reviewer 2:

This paper presents the validation of a new risk prediction model, termed e-TropICS, in a new data set. The results indicated that while the model discriminated well, the calibration was poor.

The methods used in the paper were valid and reasonable, and the results potentially useful to users of e-TropICS; however, the paper lacked clarity in some areas and appeared to be missing information in others. The paper would benefit from additional analyses and more detailed consideration, as detailed below.

Major comments:

It is not clear from the paper whether or not any of the institutions used in developing the model were included in the validation data. This is potentially important since case mixes may be different in different institutions. This important information should be carefully described for this data set (rather than stating as having been published elsewhere). The number of participants from each institution in this validation set should be included. If the validation set comprises a combination of institutions used and not used in the model building exercise, analysis containing each subset could be included in sensitivity analysis to show whether those subsets differ in terms of discrimination and calibration.

Response: We thank the reviewer for raising these important questions. None of the institutions involved in the development of the model were involved in validation. We have now clarified this in the Methods (section on study setting, lines 124-126).

The information on the number of participants from each institution has now been added (Supplementary Table 3).

The model was developed on a completely different dataset and there is no overlap with the patients in the validation set. The validation dataset includes patients from 7 ICUs across India.

It would be useful if Table 1 contained summary information on the cohort from e-tropICS in an additional column - then readers can see where the lack of calibration may be arising from. Rather than use of all of the footnote symbols why not just say, for example: Mean heart rate (SD). Some abbreviations are missing definitions which should be included in the footnotes for the table. Why does Table 1 not include all of the e-tropICS variables too? Please update the table to include these. How do we interpret the summaries from Table 1? Obviously they show the difference between those who survived and otherwise but clinical interpretation would be nice here too - how sick is this cohort?

Response: Our understanding is that the reviewer is seeking information on the baseline characteristics of the cohort from the original manuscript by Haniffa et al. We have now included this in Table 1. There are no missing variables in the table; e-TropICS includes the variables published under 'model 1' of the original Haniffa manuscript. We apologize for the lack of clarity.

Changes made to abbreviations as suggested.

Table 1 presents the baseline characteristics of the entire cohort and by survival status. In terms of illness severity, approximately one quarter of the cohort was ventilated at admission and approximately one fifth was on vasopressors at admission.

In addition, we have added Supplementary Figure 1, which presents the commonest APACHE II diagnostic categories and the corresponding outcomes.

The lack of calibration could be an issue if it is in a region of the predictive distribution where it is important to calibrate well. It is important to know where the model is over or under-estimating the risk. As such, a calibration curve should be included in the paper to help to show readers where the lack of calibration is exhibited.

Response: We have now added this (Figure 2).
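For context, the points on a decile calibration curve are simply per-decile means of predicted and observed mortality. A minimal numpy sketch (our illustration, not the code used to produce the figure):

```python
import numpy as np

def calibration_deciles(y_true, p_pred):
    """Return (mean predicted risk, observed mortality) for each decile
    of predicted risk -- the points plotted on a decile calibration
    curve. Points above the diagonal indicate under-prediction of risk;
    points below it indicate over-prediction."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    order = np.argsort(p, kind="stable")  # rank subjects by predicted risk
    groups = np.array_split(order, 10)    # ten near-equal-sized groups
    return [(p[g].mean(), y[g].mean()) for g in groups]
```

Plotting these ten pairs against the identity line shows in which risk range a model over- or under-estimates mortality.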

Sensitivity analysis should be used to show the discrimination and calibration of the data without imputation (i.e. complete case analysis). This will show how much imputation is impacting on the results.

Response: We now provide the results of the complete case analysis (CCA) in the Results and Appendix sections (Supplementary Table 2 and Supplementary Figure 2). However, given that the data are unlikely to be missing completely at random (MCAR), we have chosen the imputation model as the primary analysis.

The statistical analysis section describes application to multiple models ("each of the models'..." line 154), t-tests and chi-square tests. Other than the use of chi-squared in the HL C test, where are these tests performed? I don't see any results. It may be interesting to use such tests to compare the validation set with the e-tropICS though.

Response: Thank you. This has been corrected to refer to a single model.

How to interpret the Brier score should be included in the statistical methods section and I note that the Brier score is not a particularly good measure of accuracy (https://diagnprognres.biomedcentral.com/articles/10.1186/s41512-017-0020-3).

Response: We accept the reviewer feedback and have deleted this metric.

In the multiple imputation section, predictive mean matching should not be used for categorical variables since it can generate non-integer results. For categorical variables one of the logit options should be used as described here: https://www.stata.com/manuals13/mimiimpute.pdf#mimiimpute.

Response: We apologise for the lack of clarity. The method we used was sequential imputation using chained equations, a multivariate approach that allows the flexibility of modelling different types of data within the same model, with different rules chosen based on the type of data (PMM for continuous variables, logit for categorical variables, and so on). https://www.stata.com/manuals13/mimiimputechained.pdf#mimiimputechained
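To illustrate the distinction being discussed: predictive mean matching imputes by copying values from observed "donor" rows whose regression predictions are closest, so imputed values are always values that actually occur in the data. A single-predictor numpy sketch (our own illustration, not Stata's mi impute chained implementation):

```python
import numpy as np

def pmm_impute(x_obs, y_obs, x_mis, k=5, seed=0):
    """Predictive mean matching for one continuous variable.

    Fit y ~ x by least squares on the complete cases, predict y for
    both observed and missing rows, and for each missing row copy the
    observed y of a donor drawn at random from the k observed rows with
    the closest predictions. Imputed values are always observed values.
    """
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones_like(x_obs), x_obs])
    coef, *_ = np.linalg.lstsq(A, y_obs, rcond=None)  # fit on complete cases
    pred_obs = A @ coef
    pred_mis = coef[0] + coef[1] * x_mis
    imputed = np.empty_like(x_mis)
    for i, pm in enumerate(pred_mis):
        donors = np.argsort(np.abs(pred_obs - pm))[:k]  # k nearest predictions
        imputed[i] = y_obs[rng.choice(donors)]          # copy a donor's value
    return imputed
```

Chained-equation imputation cycles a rule like this (or a logit model for categorical variables) across all incomplete variables over several iterations.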

Table 2 should include all of the e-tropICS variables and the missingness for all of them. Categorical variables should show the levels so that we can see whether missingness relates to particular levels of a categorical variable. Line 202 of the results includes results not presented in Table 2. It is not immediately evident from the presentation of Table 2 what sorts of variables (categorical or numerical) these are.

Response: We have made this change now.

Table 2 does not seem to contain all of the parameters described in the e-tropICS model 2 of Haniffa et al. (2017) and contains ones that should not be in the model. This is quite confusing and raises the question as to which model of Haniffa et al. was actually used.

Response: The model used is ‘model 1’ from Haniffa et al’s paper. We apologize for the lack of clarity.

Table 3 appears to be showing results that do not match the rest of the paper. Regarding sensitivity and specificity and cutpoints, it would be better to see a range of probability cutpoints together with the sensitivity so that the reader can see how different choices impact on results.

Response: We have addressed this now by adding a supplementary table (Suppl. Table 1) which shows a range of probability cutpoints with sensitivity and specificity.

The ROC curve and estimate do not match the results and abstract.

Response: Corrected in both abstract and results- we apologise for this oversight.

The discussion sentence on lines 215-217 regarding the ability of "the model [to] identify those patients at greatest risk of death, but has less ability to differentiate between degrees of severity of illness." isn't shown by the results in their current state. Speculation about the case mix is similar. It would be good if you could show in the results that case mix differs and that there is an issue with calibration by severity of illness. The calibration curve and comparisons between the e-tropICS data and the validation data may help with that.

Response: To address this, we have now included summary information from Haniffa et al.’s original manuscript into Table 1. We have also included the calibration curve.

Minor comments:

Spaces are needed before the bracketing of abbreviations throughout, there is no need for repeating abbreviations, and some abbreviations are not defined where they first appear (e.g., AUROC and HL C in the abstract).

Response: Changes made.

Why is the model described as Model 1 (line 109)? Are there other models that were meant to be presented also?

Response: In Haniffa et al.'s previous work, the variables included in e-TropICS were described as Model 1. This is the model we have validated. Haniffa et al. also developed a prognostic model called TropICS, which we have not validated.

The e-tropICS model should be described as being internally validated rather than just "validated".

Response: Our work is external validation (on an entirely different dataset) and not internal validation.

Figure 2: Calibration plot of the expected probability (deciles) versus the observed probability.

Supplementary Figure 1: Commonest APACHE II diagnostic categories and ICU outcomes

Supplementary Table 1: Variation of sensitivity and specificity at different cutpoints

Cutpoint   Sensitivity   Specificity   Correctly Classified

>=0.1      99.87%        4.01%         13.61%
>=0.2      88.60%        60.78%        63.57%
>=0.3      66.97%        74.83%        74.05%
>=0.4      59.97%        81.11%        78.99%
>=0.5      52.85%        85.71%        82.42%
>=0.6      48.70%        90.14%        85.99%
>=0.7      40.67%        94.10%        88.74%
>=0.8      23.19%        96.78%        89.41%
>=0.9      4.79%         98.79%        89.37%
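A table like this can be regenerated from predicted probabilities and observed outcomes. A minimal Python sketch (illustrative names and data, not the study code or dataset):

```python
import numpy as np

def cutpoint_table(y_true, p_pred, cutpoints):
    """For each probability cutpoint c, classify p >= c as a predicted
    death and report (cutpoint, sensitivity, specificity, proportion
    correctly classified)."""
    y = np.asarray(y_true, dtype=bool)
    p = np.asarray(p_pred, dtype=float)
    rows = []
    for c in cutpoints:
        pred = p >= c
        tp = np.sum(pred & y)      # true positives
        tn = np.sum(~pred & ~y)    # true negatives
        sens = tp / y.sum()
        spec = tn / (~y).sum()
        acc = (tp + tn) / y.size
        rows.append((c, sens, spec, acc))
    return rows
```

Sweeping the cutpoint from 0.1 to 0.9 trades sensitivity for specificity, which is what the table shows.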

Attachment

Submitted filename: Response to reviewer comments final.docx

Decision Letter 1

Aleksandar R Zivkovic

21 Dec 2020

Validation of a simplified risk prediction model using a cloud based critical care registry in a lower-middle income country

PONE-D-20-27684R1

Dear Dr. Tirupakuzhi Vijayaraghavan,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Aleksandar R. Zivkovic

Academic Editor

PLOS ONE

Acceptance letter

Aleksandar R Zivkovic

23 Dec 2020

PONE-D-20-27684R1

Validation of a simplified risk prediction model using a cloud based critical care registry in a lower-middle income country

Dear Dr. Tirupakuzhi Vijayaraghavan:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Aleksandar R. Zivkovic

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Commonest APACHE II diagnostic categories and ICU outcomes.

    (TIF)

    S2 Fig. ROC curve for complete case analysis.

    (TIF)

    S1 Table. Variation of sensitivity and specificity at different cutpoints.

    (DOCX)

    S2 Table. Performance of the e-TropICS model (complete case analysis).

    (DOCX)

    S3 Table. Number of patients included from each site.

    (DOCX)

    Attachment

    Submitted filename: Response to reviewer comments final.docx

    Data Availability Statement

Pooled data from IRIS are available from the IRIS Dashboard at https://nicst.com/picu-iris-public/. The IRIS collaboration supports and welcomes data sharing. Our agreement with participating sites in the registry is only for the sharing of deidentified data between them and the registry coordinating centre for the purposes of audit, quality improvement and specific research questions. We are not allowed to post data on a repository or any other public database. Raw data will be made available to qualified researchers who provide a detailed and methodologically sound proposal with specific aims that are clearly outlined. Such proposals will be screened by the IRIS Steering committee for approval. Data sharing will be for the purposes of medical research and under the auspices of the consent under which the data were originally gathered. To gain access, qualified researchers will need to sign a data sharing and access agreement and will need to confirm that data will only be used for the agreed upon purpose for which data access was granted. Researchers can contact the corresponding author through electronic mail (bharath@icuconsultants.com) for such access; alternatively, IRIS can be contacted at info@irisicuregistry.org and joinus@irisicuregistry.org.

