PLOS One. 2022 Sep 23;17(9):e0275116. doi: 10.1371/journal.pone.0275116

Prediction models for the prediction of unplanned hospital admissions in community-dwelling older adults: A systematic review

Jet H Klunder 1,2,*, Sofie L Panneman 1, Emma Wallace 3, Ralph de Vries 4, Karlijn J Joling 2,5, Otto R Maarsingh 1,2, Hein P J van Hout 1,2,5
Editor: Dong Keon Yon 6
PMCID: PMC9506609  PMID: 36149932

Abstract

Background

Identification of community-dwelling older adults at risk of unplanned hospitalizations is important for facilitating preventive interventions. Our objective was to review and appraise the methodological quality and predictive performance of prediction models for predicting unplanned hospitalizations in community-dwelling older adults.

Methods and findings

We searched MEDLINE, EMBASE and CINAHL from August 2013 to January 2021. Additionally, we checked references of the identified articles for the inclusion of relevant publications and added studies from two previous reviews that fulfilled the eligibility criteria. We included prospective and retrospective studies with any follow-up period that recruited adults aged 65 and over and developed a prediction model predicting unplanned hospitalizations. We included models with at least one (internal or external) validation cohort. The models had to be intended to be used in a primary care setting. Two authors independently assessed studies for inclusion and undertook data extraction following recommendations of the CHARMS checklist, while quality assessment was performed using the PROBAST tool. A total of 19 studies met the inclusion criteria. Prediction horizon ranged from 4.5 months to 4 years. Most frequently included variables were specific medical diagnoses (n = 11), previous hospital admission (n = 11), age (n = 11), and sex or gender (n = 8). Predictive performance in terms of area under the curve ranged from 0.61 to 0.78. Models developed to predict potentially preventable hospitalizations tended to have better predictive performance than models predicting hospitalizations in general. Overall, risk of bias was high, predominantly in the analysis domain.

Conclusions

Models developed to predict preventable hospitalizations tended to have better predictive performance than models to predict all-cause hospitalizations. There is however substantial room for improvement on the reporting and analysis of studies. We recommend better adherence to the TRIPOD guidelines.

Background

In the Netherlands, approximately one in five older adults is admitted to hospital each year [1]. Moreover, hospital admission rates in ED patients aged 65 years and older are twice as high as those in ED patients aged <65 years [2]. When hospitalized, older adults are at high risk of experiencing adverse events such as hospital-associated infections and delirium, causing lengthy hospital stays [3, 4]. In addition, hospitalizations pose a significant risk to the functional ability of older adults: 30% of older patients experience loss of independence in activities of daily living (ADL) after hospital admission [5].

Older adults account for a large proportion of hospitalized adults, which is likely to increase with the aging population, causing overcrowding of emergency departments (EDs) and hospital wards [6, 7]. Overcrowded EDs have been described as a global health problem having negative effects on patients (e.g. treatment delay), healthcare staff (e.g. stress) and the healthcare system (e.g. increased length of stay in ED as well as in hospital wards) [8]. Taking into account that a large proportion of hospitalizations and ED visits in older adults is considered preventable [9], it seems crucial to identify older adults at risk of hospitalization in a timely manner so that possible preventive measures can be assessed. This would not only improve patients’ health and quality of life, but also relieve pressure on secondary and tertiary care, resulting in a decrease in overall health care costs [10].

Prediction models can be used to identify community-dwelling older adults at risk for unplanned hospital admissions. By defining and combining important predictors of future emergency care use, preventive interventions can be targeted at high-risk individuals [11]. Several prediction models for the prediction of unplanned hospitalizations have been developed and two systematic reviews on this topic have previously been published. However, these reviews included studies in adults of all ages or only included easy-to-apply case-finding instruments [12, 13]. Furthermore, these reviews were published over seven years ago. In an era of personalized and precision medicine, interest in and the number of prediction models have grown rapidly [14, 15]. Moreover, with the emergence of big data, attention has grown towards modelling techniques other than traditional regression methods, such as machine learning (ML). Despite guidelines such as the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) [16], the quality of methodology and reporting of clinical prediction model studies is often insufficient [17, 18].

We carried out a systematic review of validated prediction models for predicting unplanned hospital admissions in community-dwelling older adults (≥65 years). Our objective was to describe characteristics of the models’ development, the predictors included in the final models, the predictive performance, and to appraise methodological quality of the included models.

Methods

This review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement [19]. The study protocol has been registered on the International Prospective Register of Systematic Reviews (PROSPERO, registration number: CRD42020207305).

Search strategy, study selection and data-extraction

We conducted systematic searches in the bibliographic databases PubMed, Embase.com and CINAHL (Ebsco) in January 2021, in collaboration with a medical information specialist. The following terms were used (including synonyms and closely related words) as index terms or free-text words: "Hospital admission", "Patient admission", "Unplanned", "Aged", "Older adults", "Prediction". We applied a validated search filter for finding clinical prediction model studies [20]. The full search strategies are provided in S1 File.

As previously mentioned, two systematic reviews on this topic have been published. Wallace et al. carried out a systematic literature search in February 2014 on risk prediction models to predict emergency admissions in community-dwelling adults [13]. O’Caoimh et al. reviewed short case-finding instruments, published up until November 2014, for community-dwelling older adults (> 50 years) at risk for multiple adverse outcomes, of which hospitalization was one [12]. To provide a complete overview of available prediction models, our search was restricted to August 2013 through January 2021 and we added the models described in the previous reviews that fulfilled the eligibility criteria of this systematic review.

The references of the identified articles were searched for relevant publications. Duplicate articles were excluded.

Studies were included if they met the following criteria:

  1. Population: community-dwelling older adults, aged 65 years and over

  2. Intervention: prognostic prediction models derived from retrospective or prospective cohort studies and containing at least one validation cohort

  3. Comparator: not applicable

  4. Outcome: one or more unplanned hospitalizations (defined as unplanned overnight stay in hospital). Studies that had admission to the ED as part of their outcome of interest (i.e. combined endpoints) were also included

  5. Timing: admission to hospital within any time period

  6. Setting: prediction models intended to be used in primary care

We excluded studies if the prediction models:

  1. were contingent on an index hospital admission or ED visit (i.e. readmission models)

  2. studied hospitalizations for specific conditions (e.g. falls or congestive heart failure) as primary outcome

  3. were intended to be used in the ED

  4. were developed in specific populations (e.g. patients in palliative care or with psychiatric conditions), with the exception of participants with sensory impairments, because of high prevalence in the older population [21]

Studies that assessed risk factors only and did not build a prediction model, studies that were not developed to specifically predict unplanned hospitalizations, such as models that identify frailty, and studies published in languages other than English, Dutch, German, French, Italian and Spanish were also excluded.

All records were deduplicated in EndNote v9.1, and subsequently exported to the Rayyan web app for title and abstract screening and study selection [22]. After study selection, data extraction was performed using a standardized form following the recommendations of the Checklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS; S2 File) [23]. Both selection and data extraction phases were independently conducted by two reviewers (JK and SP). Any disagreements were resolved through a consensus procedure or by third review (OM, KJ, HvH). Additional data were sought from authors, when necessary.

Due to heterogeneity of the prediction models, meta-analysis was not possible. We therefore narratively summarized each unique prediction model on study population, predictors, number of outcomes and predictive performance. For clarity, regression models and machine learning models were presented separately. Predictive performance was assessed as model discrimination using the area under the ROC curve (AUC) with 95% confidence intervals. Higher AUC values indicate better discriminatory ability. An AUC of 0.7–0.8 reflects fair discrimination, whereas a model with AUC ≥ 0.8 represents good discrimination [24].
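As an illustration of the discrimination measure described above, the following minimal Python sketch computes the AUC as the proportion of (event, non-event) pairs in which the event case received the higher predicted risk, with ties counted as one half. The function names, the toy risk scores and the label "below fair" for AUC < 0.7 are assumptions made for this example; they are not taken from the included studies or from reference [24].

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Concordance-based AUC: share of (event, non-event) pairs ranked correctly.

    labels: 1 = unplanned hospitalization occurred, 0 = no event.
    Ties in predicted risk count as half a correctly ranked pair.
    """
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def discrimination_label(value: float) -> str:
    """Thresholds as stated in the text; 'below fair' is a label assumed here."""
    if value >= 0.8:
        return "good"
    if value >= 0.7:
        return "fair"
    return "below fair"


# Hypothetical predicted risks for six persons (three with an event).
toy_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)  # 8 of 9 pairs are ranked correctly
```

In the toy data, eight of the nine event/non-event pairs are ranked correctly, so the sketch yields an AUC of 8/9, which the thresholds above would classify as good discrimination.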

Methodological quality assessment

The Prediction model Risk of Bias ASsessment Tool (PROBAST; S2 File) was used to assess risk of bias and applicability, of which the latter addresses whether the primary study matches the review question [25]. PROBAST rates study methodology and applicability to the review question as being at “high”, “low” or “unclear” risk of bias based on a predetermined set of questions and scoring guide [26].

In addition, we calculated the number of events per variable (EPV) for each model. The number of EPV is the number of outcome events divided by the number of candidate predictors assessed in the multivariable modelling [27]. Studies with an EPV <10 are generally subject to overfitting; an EPV of >20 is therefore recommended. Prediction models developed using ML techniques often require higher EPVs (often >200) to minimize overfitting [26].
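The EPV calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the label "borderline" for EPVs between 10 and 20, and the number of candidate predictors in the example (20) are assumptions made here. Only the event count of 131 is taken from Table 1 (derivation cohort of Shelton et al. [36]).

```python
def events_per_variable(n_events: int, n_candidate_predictors: int) -> float:
    """EPV = outcome events / candidate predictors assessed in modelling."""
    if n_candidate_predictors <= 0:
        raise ValueError("number of candidate predictors must be positive")
    return n_events / n_candidate_predictors


def epv_flag(epv: float, machine_learning: bool = False) -> str:
    """Apply the thresholds named in the text.

    Regression: EPV < 10 suggests overfitting, EPV > 20 is recommended.
    ML models: EPV > 200 is often needed.
    The 'borderline' label for 10 <= EPV <= 20 is an assumption of this sketch.
    """
    if machine_learning:
        return "adequate" if epv > 200 else "risk of overfitting"
    if epv < 10:
        return "risk of overfitting"
    if epv > 20:
        return "adequate"
    return "borderline"


# 131 events (Table 1, Shelton derivation cohort) and a HYPOTHETICAL
# count of 20 candidate predictors, chosen only to illustrate the arithmetic.
example_epv = events_per_variable(131, 20)
example_flag = epv_flag(example_epv)
```

With these illustrative inputs the EPV is 131 / 20 = 6.55, which falls below the threshold of 10 and would therefore be flagged as at risk of overfitting.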

Results

Study selection

The literature searches yielded a total of 16,098 citations (Fig 1). After removing duplicates 8,820 references remained. Additionally, twenty-three articles were identified by checking the reference lists of relevant studies. Full texts were retrieved for 170 studies of which ten met all inclusion criteria. Additionally, a total of nine studies were included from the previously published systematic reviews (Tables 1 and 2).

Fig 1. PRISMA flow diagram of included risk prediction models.

Table 1. Prediction models developed using regression methods.

First author + reference Acronym Modelling method Population + Setting Derivation, n Validation, n Data used for final model Outcome Number of outcome events, n (%) AUC (95% CI) Predictors in final model
Boult [28] Pra (Probability of repeated admissions) Logistic regression Non-institutionalized patients aged ≥70, United States, 1984–1990 2942 2934 (split sample) Predictors = Self-report data from longitudinal study of aging
Outcome = Medicare program records
≥2 hospital admissions in 4 years Internal validation = 669 (22.7%) Internal validation = 0.61 1. Age
2. Sex
3. Self-rated health
4. Availability of informal caregiver
5. Diagnosis of coronary artery disease
6. Diagnosis of diabetes
7. Hospital admission in previous year
8. ≥6 doctor visits in previous year
Deardorff [29] LASSO
Logistic regression
Community dwelling Medicare beneficiaries with hearing and/or vision impairment aged ≥65, United States, 1999–2006 15,999 N/A bootstrap validation in full cohort Predictors = Medicare Current Beneficiary Survey data
Outcome = Claims data
Hospital admission in 12 months Derivation = 2567 (16.0%)
Validation = N/A
Derivation = 0.72
Validation = 0.72
1. Number of inpatient admissions in previous year
2. Number of ED visits in previous year
3. ADL difficulty level
4. Poor self-rated health
5. History of myocardial infarction
6. History of stroke
7. History of non-skin cancer
Freedman [30] Logistic regression Patients with a Kaiser Permanente health plan aged ≥81, United States, 1993 1873 1872 (split sample) Predictors = Self-administered questionnaire
Outcome = Health plans computerized records system
Hospital admission in 4.5 months Derivation = NR
Validation = NR
Derivation = 0.69
Validation = 0.63
1. Heart trouble
2. Limited physical independence
3. Interaction of 1. and 2.
4. Need help preparing meals
5. Diabetes
Inouye [31] Logistic regression Patients aged ≥70 in 2 primary care clinics, United States, 2003–2006 1932 1987 (split sample) Predictors and outcome = Administrative data Unplanned hospital admission within 12 months Derivation = 299 (15%)
Validation = 328 (17%)
Derivation = 0.72
Validation = 0.73
1. CCI ≥2¹
2. Hospitalization in previous year
3. Primary care visits ≥6 in previous year
4. Age ≥85
5. Unmarried
Kan [32]² Full model approach Patients enrolled in a local Medicare Advantage Health Maintenance Organization plan, United States, 2011–2013 16,705 NR
Temporal validation
Three models
1. Based on claims data
2. Based on EHR-structured data
3. Based on EHR-structured and EHR-unstructured data
≥1 hospitalization within 12 months Predictive model = 3174 (19.0%) Predictive model =
Model 1 = 0.70
Model 2 = 0.70
Model 3 = 0.71
1. Age
2. Sex
3. Race
4. Number of major ADGs²ᵃ
5. Number of hospital dominant conditions²ᵇ
6. Number of frailty risk factors²ᶜ
Kim [46] Logistic regression Insured adults aged ≥ 65, South Korea, 2011–2012 Total sample: 113,612 NR (split sample + bootstrap) Predictors and outcome = Routinely collected claims data Potentially avoidable hospitalization within 12 months Total sample = 2856 (2.5%) Derivation = 0.77 (0.76–0.79)
Validation = 0.78 (0.77–0.80)
1. Age
2. Living area
3. Insurance
4. No. chronic conditions
5. Polypharmacy
6. Disability
7. Hospitalization in past year
8. Total health expenditures in past year
Kurichi [33] Logistic regression Medicare Beneficiaries aged ≥65, United States, 2001–2007 15,606 7801 (split sample) Predictors = routinely collected survey data
Outcome = claims files
Two models due to collinearity:
1. ADL limitation
2. IADL limitation
Hospital admission within 3 years NR Development =
ADL limitation: 0.67
iADL limitation: 0.67
Validation =
ADL limitation: 0.67
iADL limitation: 0.67
22/23 variables:³
1. Sociodemographics (n = 6)
2. Self-reported health conditions (n = 13)
3. Vision impairment
4. Smoking
5. (I)ADL stage
6. Proxy responded (IADL model only)
Lin [45] Logistic regression Subjects aged ≥65 with at least 1 outpatient visit in 2008, Taiwan, 2008–2009 133,726 44,560 (split sample) Predictors and outcome = claims files from national health insurance institute Hospital admission within 1 year Derivation = 25,541 (19.1%)
Validation = 8511 (19.1%)
Development = 0.64 (0.64–0.65)
Validation = 0.64 (0.63–0.65)
1. Age
2. Education
3. COPD
4. Heart disease
5. Diabetes
6. Cancer (with or without metastases)
7. Chronic kidney disease
8. ED visit in past year
9. Received home care in past year
López-Aguilà [44] Logistic regression Patients in primary care aged ≥65, Spain, 2006–2009 28,430 NR Predictors = Clinical records of primary care centers, pharmacy database, and hospital discharge records
Outcome = Hospital discharge records
Unplanned hospital admission in 12 months Derivation = 2103 (7.3%)
Validation = NR
Derivation = 0.78
Validation = 0.76
1. Sex
2. Age
3. COPD
4. Heart failure
5. 5 or more concurrent diagnoses
6. 4 or more prescribed drugs
7. 2 or more emergency admissions⁴
8. 2 or more planned admissions⁴
9. 9 or more days of cumulative stay⁴
Lyon [40] EARLI (Emergency Admission Risk Likelihood Index) Logistic regression Patients in general practices aged ≥75, England, 2002–2003 3032 500 (split sample + bootstrap) Predictors = Questionnaire
Outcome = Hospital Episodes Statistics data
Unplanned hospital admission in 12 months Derivation = 696 (23.0%)
Validation = NR
Derivation = 0.70 (0.67–0.72)
Validation =
• Bootstrap validation = 0.69
• Split sample validation = 0.67 (0.63–0.71)
1. Heart problems
2. Leg ulcers
3. Get out of the house without help
4. Problems with memory or get confused
5. Emergency hospital admission in last 12 months
6. Overall state of health
Marcusson [43] Logistic regression / LASSO Patients in primary care aged ≥75, Sweden, 2015–2017 20,364 Internal validation = 20,364 (split sample)
External validation:
1) 51,104 (sample with ages 65–74)
2) 38,121 (different time period)
Predictors and outcome = computerized information system of the County Council of Östergötland. Unplanned hospital admission within 12 months Derivation = 4130 (20.3%)
Validation = Split sample: 4037 (19.8%)
External validation: NR
Derivation: NR
Internal validation: 0.69 (0.68–0.70)
External validation:
1) 0.68 (0.67–0.69)
2) 0.68 (0.67–0.69)
38 predictors =
1. Sex
2. Age
3. Number of non-physician visits
4. Number of physician visits
5. Number of previous in-ward hospital stays
6. Number of ED visits
7. Signs/symptoms and medical diagnoses (n = 32)
Mazzaglia [38] Logistic regression Persons in primary care aged ≥65, Italy, 2003–2004 2470 2926 (external validation) Predictors = Questionnaire answered by primary care physician, registries of the regional health system of Tuscany
Outcome = Registries of the regional health system of Tuscany
Hospitalization in 15 months Derivation = 445 (18.0%)
Validation = 504 (17.2%)
Derivation = 0.68 (0.66–0.71)
Validation = 0.67 (0.65–0.70)
1. Number of positive responses to screening test⁵
2. Age
3. Sex
4. Hospitalization in previous 6 months
5. ≥5 prescriptions
Mishra⁶ [34] Mixed effects logistic regression
Full model approach
Residents at an Aging-in-Place facility, United States, 2011–2019 N/A 150 participants, 4495 individual assessments Predictors and outcome = routinely collected assessments in EMR every 6 months ED visit or hospital admission within 6 months NR 0.72 (0.65–0.79) Geriatric assessments:
1. ADL⁶ᵃ
2. IADL⁶ᵇ
3. Depressive symptoms⁶ᶜ
4. Cognition⁶ᵈ
5. Mental health⁶ᵉ
6. Physical health⁶ᵉ
O’Caoimh [41] RISC (Risk Instrument for Screening in the Community) Iterative process of item generation and reduction using literature searches and focus groups with public health nurses (PHN) Community-dwelling adults ≥65 under follow-up by PHN, Ireland, 2012–2013 N/A 801 Predictors = PHN review and additional GP information
Hospitalization = Data from hospital enquiries
Acute admission to an acute hospital within 12 months Validation = 142 (17.7%) Validation = 0.61 (0.55–0.66) 1. Age
2. Gender
3. Living arrangement
4. Presence and magnitude of concern for PHN across 3 domains:
• Mental state
• ADL
• Medical/physical state
5. Ability of caregiver to manage (according to PHN)
Reuben [35] Logistic regression Medicare beneficiaries aged ≥65, United States, 1988–1992 2569 2569 (split sample)
10-fold cross-validation
Predictors = Interviews, physical examination, and laboratory tests
Three models developed =
1. Self-reported prior hospitalizations only
2. Self-report variables
3. Self-report, physical examination, and laboratory variables
Outcome = claims data
High utilization (≥11 hospital days in 3 years) Full cohort = 1243 (24.2%) Full cohort (after cross-validation) =
1. 0.60
2. 0.68
3. 0.69
Self-reported predictors:
1. Any hospitalization in previous year
2. Any hospitalization in year before that
3. Male gender
4. Fair or poor health
5. Not currently working
6. Little participation at religious services
7. Need help with bathing
8. Unable to walk a mile
9. Diabetes, sugar in urine, or high blood sugar
10. Taking loop diuretics
Laboratory results:
11. Serum albumin
12. Serum iron
Roos [42] Logistic regression Insured participants aged ≥65 years, Canada 1970–1973 1518 1518 (split sample) Predictors = Three models were compared
1. Administrative data only
2. Interview data
3. Administrative and interview data
Outcome = claims data
Hospital admission within 24 months NR NR Interview questions:
1. Self-rated health
2. Reported conditions of arthritis, diabetes, chest
3. Reported undergoing ≥1 treatment
4. Amount of time spent in hospital in last year
Administrative data:
5. Living with spouse
6. Prior hospital utilization in last year
7. Prior ambulatory utilization in last year
Shelton [36] CARS (Community Assessment Risk Screen) Logistic regression Medicare patients with ≥1 specified characteristic and ≥65 years, United States, 1993–1995 411 1054 (external validation) Predictors = telephone interviews, mailed questionnaires
Outcome = claims files (hospitalization) and self-report (ED visit)
Hospitalization or ED visit in 12 months Derivation = 131 (31.9%)
Validation = 304 (28.8%)
Derivation = 0.74
Validation = 0.67
1. Any of the following conditions: heart disease, diabetes, myocardial infarction, stroke, COPD, cancer
2. 5 or more prescription drugs
3. ED visit or unplanned hospital admission in past 6 months
Wu [37] Logistic regression
Full model approach
Medicare beneficiaries aged ≥65 in longitudinal aging study, United States, 2010–2012 4457 Leave-one-out cross validation Predictors =
1. survey based (S)
2. claims based (C)
3. survey and claims based (S+C)
Outcome = claims data
1. Any hospital admission within 12 months
2. Preventable hospital admission within 12 months
1. Any hospital admission = 1046 (21.0%)
2. Preventable hospital admission = 245 (4.5%)
1. Any hospital admission:
• Survey based = 0.67
• Claims based = 0.71
• Combined = 0.72
2. Preventable hospital admission:
• Survey based = 0.72
• Claims based = 0.76
• Combined = 0.78
1. Frailty status (S)⁷
2. Number of major ADGs²ᵃ (C, S+C)
3. Number of geriatric risk factors²ᶜ (C, S+C)

ADL: activities of daily living, ADG: Aggregated Diagnostic Group, AMTS: Abbreviated Mental Test Score, C: claims based model, CCI: Charlson Comorbidity Index, ED: emergency department, IADL: instrumental activities of daily living, MMSE: Mini-Mental State Examination, NR: not reported, PHN: public health nurse, S: survey assessment based model, S+C: combined survey and claims based model

1 The Charlson Comorbidity Index incorporates 17 weighted comorbidity conditions. A score of ≥2 is a commonly used cut-point to indicate high comorbidity.

2 An inclusion criterion for age was not specified. Mean age of the sampled population was 76.1 ± 7.3. (a) Major ADGs refers to 8 major aggregated diagnostic groups assigned by the Johns Hopkins ACG System, which have very high expected resource use. (b) Hospital dominant conditions were based on diagnoses that are associated with markedly higher probability of future hospitalization. (c) The geriatric risk index was based on the presence of 1 or ≥2 of the 10 geriatric risk factors (i.e. falls, walking difficulty, severe issues with bladder control, absence of fecal control, weight loss, malnutrition, vision impairment, dementia/cognitive impairment, presence of decubitus/pressure ulcers, lack of social support).

3 Due to multicollinearity between the ADL and IADL limitation variable, two models were developed. In the model with IADL limitation, proxy response was added as predictor. All other variables were identical.

4 These three variables were separately assessed as number of events in the year before index date and number of events in the year before that.

5 The screening test was a seven item questionnaire answered by the primary care physician and contained information on limitations in ADLs and IADLs, poor vision, poor hearing, recent unintentional weight loss, use of homecare services, and inadequacy of income.

6 One of the study participants was aged 62 at inclusion. The geriatric assessment was composed of (a) the Short Form ADL, RAI MDS 2.0 for ADL, (b) the Lawton IADL scale for IADL, (c) the Geriatric Depression Scale for depression, (d) the Mental State Examination for cognition and (e) the mental component score and physical component score of the Short Form-12, a 12-item Health Survey.

7 Frailty status was categorized as robust, pre-frail and frail, and was based on the five criteria of the Fried frailty phenotype.

Table 2. Prediction model developed using machine learning techniques.

First author + reference Acronym Compared algorithms¹ Population + Setting Derivation, n Validation, n Data used for final model Outcome Number of outcome events, n (%) AUC (95% CI) of best performing algorithm Features in final model
Tarekegn [39] SVM
ANN
RF
DT
LR
GP
Patients in primary care aged ≥65 years, Italy, 2016–2017 1) Urgent hospitalization = 1,095,613
2) Preventable hospitalization = 1,095,613
N/A
10-fold cross-validation procedure
Features and outcome = data from administrative and health databases in the Piedmontese Longitudinal Study 1) urgent hospitalization
2) preventable hospitalization²
Horizon = 12 months
Derivation =
1) Urgent hospitalization = 38,918 (3.55%)
2) Preventable hospitalization = 19,072 (1.74%)
Validation = N/A
1) Urgent hospitalization = 0.75
(SVM)
2) Preventable hospitalization = 0.74
(ANN, SVM and LR)
1) Urgent hospitalization = 34 features²ᵃ
2) Preventable hospitalization = 33 features²ᵃ
Variable categories: Sociodemographic, medical history, medication, healthcare utilization, functional status

AUC: area under the curve, CI: confidence interval, DT: decision tree, GP: genetic programming, LR: logistic regression, ML: machine learning, ANN: artificial neural network, RF: random forests; SVM: support vector machine

1 Algorithms used for feature selection and performance measures, unless stated otherwise.

2 A definition of preventable hospitalizations was not reported. (a) Ten most important features (equal for urgent and preventable hospitalizations): age, mental disease, poly prescriptions, diseases of the respiratory system, citizenship, non-urgent visit (white code), arthropathy, diseases of the circulatory system, glaucoma. NB These variables were not further specified.

Description of included studies

Of the 19 studies included, the majority were developed in the United States (n = 10) [28–37] and two in Italy [38, 39]. The other studies were developed in the United Kingdom [40], Ireland [41], Canada [42], Sweden [43], Spain [44], Taiwan [45], and South Korea [46]. Twelve studies included participants aged ≥65 years [29, 33, 35–39, 41, 42, 44–46]; the remaining studies used a higher age as inclusion criterion, with 81 years [30] as the highest minimum age for inclusion. Total sample sizes ranged from 150 [34] to 1,095,613 [39] participants. Two studies were developed in patients receiving home or community care [34, 41], and one study developed a prediction model in older adults with a vision and/or hearing impairment [29].

Eight studies developed their model using administrative or electronic medical record data [31, 32, 34, 39, 43–46]. Eight studies used survey data to develop their model [28–30, 33, 36, 38, 40, 41], and three models were developed using both [35, 37, 42].

Various outcomes were assessed in the development of the prediction models. Two studies validated their models for more than one outcome (i.e. unplanned hospitalizations and potentially preventable hospitalizations, separately) [37, 39]. Two models predicted a combined endpoint of any hospitalization or ED visit [34, 36]. Fourteen studies assessed unplanned hospitalizations as single endpoint [29–33, 37–45], two studies predicted multiple hospitalizations within a specific time period [28, 35], and three studies presented a model for potentially preventable hospitalizations [37, 39, 46]. Two out of these three studies defined admissions as potentially preventable based on the principal diagnosis on admission [37, 46]. The third study did not report its definition for preventable admissions [39]. The prediction horizon ranged from 4.5 months [30] to 4 years [28]. The majority of studies (n = 12) were developed to predict the outcome within 12 months [29, 31, 32, 36, 37, 39–41, 43–46].

Variables used in prediction models

The number of predictors included in the final model ranged from 3 [36] to 38 [43]. The variables most frequently included in the final models were previous hospital admission (n = 11) [28, 29, 31, 35, 36, 38, 40, 42–44, 46], age (n = 11) [28, 31–33, 37–39, 43–46] and sex or gender (n = 8) [28, 32, 33, 35, 37, 38, 43, 44] (Table 3). Twelve studies included one or more specific diseases in the final model, of which cardiovascular diseases (e.g. coronary artery disease, heart failure, or hypertension) were most frequently included (n = 11) [28–30, 33, 36, 39, 40, 42–45]. The most frequently included cardiovascular predictor was ischemic heart disease (n = 7) [28, 29, 33, 39, 42–44]. Diabetes was included in seven models [28, 30, 33, 35, 42–45]. Other frequently included medical diagnoses were cancer (n = 4) [29, 33, 43, 45] and COPD or respiratory problems (n = 4) [33, 39, 44, 45]. Six studies included a multimorbidity measure, either defined as the Charlson Comorbidity Index or a disease count, in the final model [31, 32, 36, 37, 44, 46]. Living arrangement (mostly defined as living alone) was considered for inclusion in seven models [28, 29, 33, 35, 36, 40, 42], and was retained in one model [42]. This model defined living arrangement as living with a spouse.

Table 3. Variables included in and excluded from the models.

Category Variable Included in final model, N, (%) Excluded after evaluation, N (%)
Demographics Age 11 (73%) [28, 31–33, 37–39, 43–46] 4 (27%) [29, 35, 36, 42]
Sex 8 (62%) [28, 32, 33, 35, 37, 38, 43, 44] 5 (38%) [29, 31, 36, 42, 46]
Education 2 (33%) [33, 45] 4 (67%) [28, 29, 35, 36]
Race/ethnicity 2 (40%) [33, 37] 3 (60%) [28, 29, 31]
Income/SES 1 (20%) [38] 4 (80%) [28, 29, 35, 46]
Residential area 3 (100%) [33, 39, 46] 0
Marital status 1 (33%) [31] 2 (67%) [36, 45]
Insurance coverage 2 (50%) [33, 46] 2 (50%) [29, 31]
Employment 1 (100%) [35] 0
Health status Self-rated health 5 (63%) [28, 29, 35, 40, 42] 3 (37%) [28, 30, 36]
Mental health 2 (50%) [34, 41] 2 (50%) [35, 36]
Physical health 2 (67%) [34, 41] 1 (33%) [36]
Use of alcohol or tobacco 1 (50%) [33] 1 (50%) [35]
Medical history Specific medical diagnoses 12 (63%) [28–30, 33, 35, 36, 39, 40, 42–45] 7 (37%) [28, 29, 33, 35, 40, 44, 45]
Multimorbidity 6 (86%) [31, 32, 36, 37, 44, 46] 1 (14%) [35]
Sensory impairment 4 (50%) [32, 33, 37, 38] 4 (50%) [28, 33, 35, 40]
Cognitive impairment 5 (83%) [32–34, 37, 40] 1 (17%) [28]
Health care utilization Prior hospitalization 11 (73%) [28, 29, 31, 35, 36, 38, 40, 42–44, 46] 4 (27%) [28, 30, 42, 45]
Prior ED visit 3 (60%) [29, 43, 45] 2 (20%) [30, 46]
Prior outpatient visits 2 (40%) [28, 43] 3 (60%) [28, 30, 42]
Primary care visits 1 (100%) [31] 0
Continuity of care 0 1 (100%) [46]
Receiving homecare 2 (67%) [38, 45] 1 (33%) [28]
Previously in LCF 0 3 (100%) [30, 31, 35]
Receiving treatment for specific condition 1 (50%) [42] 1 (50%) [31]
Laboratory results 1 (33%) [35] 2 (67%) [31, 35]
Barrier to receiving care 0 1 (100%) [29]
Satisfaction with received health care 0 1 (100%) [29]
Medication Number of prescription medication 5 (71%) [36, 38, 39, 44, 46] 2 (29%) [30, 40]
Use of a specific medication 2 (67%) [35, 39] 1 (33%) [35]
Social status Caregiver availability 3 (67%) [28, 41] 1 (33%) [40]
Lack of social support 2 (67%) [32, 37] 1 (33%) [35]
Living arrangement 1 (14%) [42] 6 (86%) [28, 29, 33, 35, 36, 40]
Functional status ADL 6 (75%) [29, 3335, 38, 41] 2 (25%) [30, 40]
IADL 3 (50%) [30, 34, 38] 3 (50%) [29, 30, 35]
Urinary or fecal incontinence 3 (43%) [32, 33, 37] 4 (57%) [28, 30, 35, 40]
History of falls 2 (40%) [32, 37] 3 (60%) [28, 30, 40]
Mobility 6 (86%) [30, 35, 39, 40, 42, 46] 1 (14%) [28]
Malnutrition or weight loss 3 (100%) [32, 37, 38] 0
Other Recent stressful event 0 2 (100%) [30, 40]
Need help to complete survey 1 (33%) [33] 2 (67%) [30, 33]
Participation at religious events 1(100%) [35] 0
State of home 0 1 (100%) [42]

ADL: activities of daily living, ED: emergency department, IADL: instrumental activities of daily living, LCF: long-term care facility, SES: socio-economic status. This table is limited to the information provided in the publications.

Predictive accuracy of the models

Two studies analyzed predictive performance of the same prediction model for two different outcomes [37, 39]. One study did not report its predictive performance [42].

Eighteen studies reported an AUC, ranging from 0.61 to 0.78 after validation. The models published after 2014 tended to perform better: their median AUC was 0.72 (range 0.64–0.78) (n = 9), whereas the median AUC of the models in the previous reviews was 0.67 (range 0.61–0.76) (n = 9). Models developed using survey data had a median AUC of 0.67 (range 0.61–0.72) (n = 8), whereas models developed with administrative data had a median AUC of 0.73 (range 0.64–0.78) (n = 8). Studies that used both data sources are not included in this count.

The models developed for a specific type of hospitalization (i.e. preventable hospitalization or fall with hospitalization) (n = 3) tended to perform better than the models for all-cause hospitalization (n = 17), with a median AUC of 0.78 (range 0.74–0.78) versus 0.69 (range 0.61–0.76), respectively. The two models that assessed AUCs for both outcomes (i.e. Tarekegn et al. and Wu et al. [37, 39]) were included in the calculation of both medians with their corresponding AUCs and were thus counted twice.
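For readers less familiar with the AUC, the following minimal Python sketch shows one way it can be computed: as the proportion of admitted/non-admitted pairs in which the admitted person received the higher predicted risk. The predicted risks are invented for illustration and do not come from any included study, and the function name auc is our own choice.

```python
from itertools import product


def auc(risks_admitted, risks_not_admitted):
    """Concordance (c-statistic): share of admitted/non-admitted pairs in which
    the admitted person has the higher predicted risk; ties count as half."""
    pairs = list(product(risks_admitted, risks_not_admitted))
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs)
    return score / len(pairs)


# Hypothetical predicted risks, invented for illustration only.
admitted = [0.62, 0.35, 0.48, 0.15]
not_admitted = [0.10, 0.22, 0.35, 0.41, 0.08, 0.55]

print(round(auc(admitted, not_admitted), 2))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which may help to place the reported range of 0.61 to 0.78 in context.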

Methodological quality

Overall, the methodological quality of included studies was low (Table 4). Risk of bias was either high or unclear in all studies, predominantly due to bias or insufficient reporting in the analysis domain. More specifically, the handling of missing data was not reported or performed inappropriately in ten studies [29, 31, 33, 36, 37, 40–42, 44, 45], eight studies selected predictors based on univariable analyses [30, 31, 33, 35, 40, 42, 43, 45], and five studies relied solely on a split-sample procedure for internal validation [28, 30, 31, 33, 45]. Whereas almost all studies (except one [42]) reported model performance in terms of discrimination, only five sufficiently evaluated calibration [28–30, 38, 45]. Four studies reported the results of the Hosmer-Lemeshow test as their only calibration measure [31, 35, 40, 44].

Table 4. Methodological quality assessment of included prediction models according to the recommendations of PROBAST.

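First author | ROB: Participants | ROB: Predictors | ROB: Outcome | ROB: Analysis | EPV¹ | Applicability: Participants | Applicability: Predictors | Applicability: Outcome | Overall: ROB | Overall: Applicability
Boult | - | - | - | + | 48 | - | - | - | + | -
Deardorff | - | - | - | ? | 103 | + | - | - | ? | +
Freedman | - | - | - | + | NI | - | - | - | + | -
Inouye | - | - | - | + | 60 | - | - | - | + | -
Kan | + | - | - | ? | 358 | - | - | - | + | -
Kim | - | - | - | + | 168 | - | - | - | + | -
Kurichi | - | - | - | + | NI | - | - | - | + | -
Lin | + | - | - | + | 2003 | + | - | - | + | +
López-Aguilà | - | - | - | + | 54 | - | - | - | + | -
Lyon | - | - | - | + | 44 | - | - | - | + | -
Marcusson | - | - | - | + | 87 | - | - | - | + | -
Mazzaglia | - | - | - | + | 64 | - | - | - | + | -
Mishra | - | - | - | ? | NI | + | - | - | ? | +
O’Caoimh | - | - | - | ? | 12 | - | - | - | ? | -
Reuben | - | - | - | + | 36 | - | - | - | + | -
Roos | - | - | - | + | NI | - | - | - | + | -
Shelton | - | - | - | + | 8 | - | - | - | + | -
Tarekegn | - | - | -/?² | ? | 129 | - | ? | - | ? | ?
Wu | - | - | -/+³ | ? | 27 | - | - | - | ?/+³ | -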

+: high risk of bias/concern for applicability, -: low risk of bias/concern for applicability, ?: unclear risk of bias/concern for applicability. EPV: events per variable, ROB: risk of bias, NI: no information (i.e. either number of events or number of candidate predictors was not reported).

1 For studies where multiple outcomes were assessed, only the lowest number of events per variable per study is reported.

2 For the outcome preventable hospitalization, no definition was reported, ROB was therefore evaluated as unclear. For the outcome acute hospital admission, ROB in this domain was low.

3 ROB was low for the outcome any inpatient hospital admission. ROB was high for the outcome preventable hospital admissions, since predictors were included in the outcome definition. Overall ROB was therefore unclear and high, respectively.

The median EPV was 60 and ranged from 8 [36] to 2003 [45] (n = 15). Two studies reported an EPV <20 [36, 41]. In four studies the EPV could not be computed because data on the number of events or the number of candidate predictors were not reported [30, 33, 34, 42]. The models published after 2014 had a higher EPV (median = 129 (range 27–2003)) than the older models (median = 46 (range 8–64)).
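The summary figures above can be reproduced from the EPV column of Table 4, as the short Python sketch below illustrates. The split into newer and older models is an assumption inferred from the ranges reported in the text rather than a grouping stated for each study, and the events_per_variable helper and its example numbers are hypothetical.

```python
from statistics import median


def events_per_variable(n_events, n_candidate_predictors):
    """EPV: number of outcome events divided by number of candidate predictors."""
    return n_events / n_candidate_predictors


# EPV values as reported in Table 4 (studies marked "NI" are left out).
# Assumption: this newer/older split is inferred from the reported ranges.
epv_newer = {"Deardorff": 103, "Kan": 358, "Kim": 168, "Lin": 2003,
             "Marcusson": 87, "Tarekegn": 129, "Wu": 27}
epv_older = {"Boult": 48, "Inouye": 60, "Lopez-Aguila": 54, "Lyon": 44,
             "Mazzaglia": 64, "O'Caoimh": 12, "Reuben": 36, "Shelton": 8}

all_epv = list(epv_newer.values()) + list(epv_older.values())

print(len(all_epv), median(all_epv))          # studies with an EPV, overall median
print(median(epv_newer.values()), median(epv_older.values()))

# Hypothetical example: 240 events and 12 candidate predictors.
print(events_per_variable(240, 12))
```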

Concern for applicability was high in three studies, because the study population or study outcome did not fully match the review question: one study only included older adults with a sensory impairment [29], one study excluded older adults with a hospital admission <6 months prior to the index date [45], and one study evaluated preventable hospital admissions as the only outcome [46].

Discussion

This systematic review identified 19 prediction models to predict unplanned hospital admissions in community-dwelling older adults. Our search strategy built on a review by Wallace et al. on the same topic, but restricted the study population to adults aged 65 years and over. Of the 19 models, 10 were new and had not been included in the previous reviews. The new models had higher predictive accuracy than the older models. This might be explained by the fact that the new models had larger development cohorts and higher EPVs than the older models. Both are recommended by the TRIPOD guidelines, published in 2015 [16], to improve predictive accuracy and methodological quality. Moreover, the new models more often used administrative or clinical record data for model development. Consistent with Wallace et al., we found that models developed using administrative or clinical record data had higher predictive accuracy than those developed using self-report data. Of the 10 new prediction models, eight used administrative data for model development.

To potentially improve predictive accuracy, Wallace et al. suggested considering nonmedical factors (e.g. social support and functional status) [13]. Despite this recommendation, these variables were rarely evaluated for inclusion in the latest studies. We found that the predictors most frequently included in the final models were medical diagnoses (specifically heart disease), prior hospitalizations, age, and sex, which is in line with the findings of Wallace et al. These risk factors seem to have more impact on the prediction of unplanned admissions than nonmedical factors, considering the relatively high beta-coefficients of these variables in most models (data not shown). Also, chronic diseases and health care use variables are probably more readily available in large routine care data, whereas nonmedical factors are rarely assessed in a systematic way.

Overall, reporting of methodology and findings was often inappropriate or lacked relevant information; risk of bias was therefore either unclear or high in all models. Moreover, despite the publication of the TRIPOD guidelines in 2015, only one [29] of the seven studies published after 2015 reported their study according to the TRIPOD checklist. The majority of studies showed high risk of bias in the analysis domain, mainly because of univariable analyses as the predictor selection method or inappropriate handling of missing data.

Strengths and limitations

The aging population across the globe and the increasing interest in personalized medicine make this review topical. We added a substantial number of prediction models to the previous systematic reviews on this topic. Furthermore, we conducted a thorough search using a validated search filter and assessed data using tools specifically designed for systematic reviews of prognostic studies. However, there are some limitations. First, care must be taken when directly comparing the prediction models because of heterogeneity in study characteristics (e.g. study populations and selection of candidate predictors) and study outcomes. Since models perform differently in different populations, predictive performance can only be compared when the models are validated in the same sample. Further, by limiting our inclusion criteria to participants aged 65 and over, we excluded potentially relevant prediction models developed in younger participants. For example, the DIVERT scale, a tool to predict emergency department visits, was developed in home care clients aged ≥50 years. Even though reported AUCs are a little over 0.6 after geographical validation, targeted application of the risk score has shown clinical added value for cardiorespiratory management and reduction of hospitalizations in home care recipients [47]. Last, while CHARMS and PROBAST are in principle relevant for prediction model studies using ML, they predominantly focus on regression-based modelling, and some unique aspects of ML methods are not captured [48]. This complicated the critical appraisal of the ML study, and its risk of bias was therefore rated as unclear. The need for guidelines for reporting and critical appraisal of prediction model studies using ML has been recognized, and PROBAST-ML (as well as TRIPOD-ML) has been announced [48]. Until then, it is recommended to use TRIPOD, CHARMS and PROBAST as benchmarks for prediction model studies rather than none [49].

Implications for future research

Our findings provide a solid basis for further research on prediction models for hospitalizations in older people. Knowing that prediction models often perform worse in new populations, external validation studies are needed to assess generalizability across different countries and healthcare systems. Moreover, models that underperform in external samples should not be discarded; instead, studies should assess the possibility of updating existing models by recalibrating, adjusting weights or considering additional predictors [50, 51]. This way, data from the original development study are not wasted. However, updating a prediction model is only recommended provided that the initial model was appropriately developed and demonstrated promising accuracy [51]. Most prediction models in this review are poorly reported and all are at either high or unclear risk of bias, which complicates updating of the existing models; we therefore cannot recommend one specific model.

Moreover, while recalibration and adjusting weights only affect a model’s calibration, adding (previously missed) important predictors should be considered to improve a model’s discrimination [51]. As mentioned above, nonmedical factors remain under-researched in the prediction of hospital admissions in older adults. Taking into account the influence of nonmedical factors on unscheduled secondary care use [52, 53], these variables may contribute to a better discriminative ability of the model.
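The distinction between calibration and discrimination can be illustrated with a simple intercept update, one of several possible updating methods. The Python sketch below, which uses hypothetical predicted risks and a hypothetical observed event rate, shifts all predictions on the logit scale until the mean predicted risk matches the observed rate. Because the shift is applied equally to everyone, the ranking of individuals, and hence the AUC, stays the same. The function names and the bisection approach are our own assumptions and are not taken from the cited methods papers.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_intercept(risks, observed_rate, tol=1e-10):
    """Find the logit-scale shift (by bisection) that makes the mean predicted
    risk equal the observed event rate; return the shift and updated risks."""
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(logit(p) + mid) for p in risks) / len(risks)
        if mean_risk < observed_rate:
            lo = mid
        else:
            hi = mid
    shift = (lo + hi) / 2.0
    return shift, [expit(logit(p) + shift) for p in risks]


def ranks(values):
    """Indices of the values ordered from lowest to highest."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical predictions from an existing model and a hypothetical
# observed admission rate in a new population.
original = [0.05, 0.10, 0.20, 0.30, 0.60]
shift, updated = recalibrate_intercept(original, observed_rate=0.35)

print(round(sum(updated) / len(updated), 3))   # mean risk after the update
print(ranks(original) == ranks(updated))       # ordering of individuals unchanged
```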

Last, for both development and validation studies, we advise reporting all modelling steps and analyses in sufficient detail according to the TRIPOD guidelines [16]. The TRIPOD guidelines have been developed to improve the reporting of studies developing, validating, and updating prognostic models and to maximize transparency and reproducibility. For example, predictive performance should be evaluated not only in terms of discrimination, but also in terms of calibration. Regarding calibration, it is recommended to include a calibration plot or table in addition to the p-value of the Hosmer-Lemeshow test. Furthermore, variables or participants with missing data should not simply be omitted; multiple imputation is recommended as the preferred method for handling missing data to decrease bias [26].
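As a minimal sketch of what a calibration table could look like, the Python example below sorts individuals by predicted risk, splits them into equal-sized groups, and compares the mean predicted risk with the observed proportion of admissions in each group. The data are invented for illustration, and the number of groups and the function name calibration_table are assumptions rather than recommendations from TRIPOD or PROBAST.

```python
def calibration_table(risks, outcomes, n_groups=5):
    """Sort by predicted risk, split into equal-sized groups, and return
    (group size, mean predicted risk, observed event proportion) per group."""
    ordered = sorted(zip(risks, outcomes))
    n = len(ordered)
    rows = []
    for g in range(n_groups):
        chunk = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # skip empty groups when there are fewer people than groups
        mean_pred = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((len(chunk), mean_pred, observed))
    return rows


# Hypothetical predicted risks and outcomes (1 = admitted, 0 = not admitted).
risks = [0.05, 0.08, 0.12, 0.15, 0.22, 0.30, 0.41, 0.55, 0.63, 0.80]
outcomes = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

rows = calibration_table(risks, outcomes)
for size, mean_pred, observed in rows:
    print(f"n={size}  predicted={mean_pred:.2f}  observed={observed:.2f}")
```

In a well-calibrated model the predicted and observed columns are close in every group; plotting the two columns against each other gives the corresponding calibration plot.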

Implications for future practice

Our study found that the models to predict preventable hospitalizations tended to have better predictive ability than models for all-cause hospitalizations. Preventable admissions are admissions for conditions that could have been managed with timely and effective treatment in outpatient primary care (e.g. pneumonia, congestive heart failure, and COPD), often referred to as ambulatory care sensitive conditions (ACSCs) [54]. Interventions targeted at older adults with ACSCs provide a window of opportunity for prevention of admissions, possibly even more so if targeted at persons with additional important risk factors (e.g. recent hospitalization, polypharmacy and/or multimorbidity). Consequently, reducing the incidence of preventable admissions could substantially lower healthcare costs and improve health outcomes and older adults’ quality of life [11].

There is, however, limited evidence for effective preventive interventions to reduce preventable admissions in general [54]. High continuity of care with a general practitioner is associated with lower rates of hospital admissions [55]. Furthermore, several targeted interventions have been shown to be effective in patients with specific diseases, such as self-management in patients with COPD and heart failure, and telemedicine in patients with heart failure [11]. Focusing on these targeted interventions may have a beneficial impact on the reduction of hospital admissions in community-dwelling older adults [54].

Conclusion

The prediction models developed to predict preventable hospitalizations tended to perform better than models predicting all-cause hospitalizations. Focusing on enhancing primary care management of conditions related to these preventable admissions may have a beneficial effect on health care quality.

To improve the predictive accuracy of prediction models, the use of administrative data sources is recommended, as well as the incorporation of important variables, i.e. age, prior hospitalization and multimorbidity. The impact of nonmedical factors remains under-researched. Moreover, we recommend that future researchers follow the TRIPOD guidelines for prediction model studies, as the methodological quality of reporting and analyses of the included studies was low.

Supporting information

S1 Checklist. PRISMA checklist.

(PDF)

S1 File. Full search strategies.

(PDF)

S2 File. CHARMS and PROBAST forms.

(PDF)

Acknowledgments

The authors thank Stichting Preventie, Vroegdiagnostiek en eHealth for their support.

Data Availability

All relevant data are within the paper and its Supporting Information files.

Funding Statement

The authors received no specific funding for this work.

References

  • 1. Netherlands CBfS. Health and Healthcare utilization 2021. [cited 2022 April 30]. Available from: https://opendata.cbs.nl/#/CBS/nl/dataset/83005NED/table?dl=672AF.
  • 2. Berben S, Bloemhoff A, Habets K, Liefers J, Hensens C, van Grunsven P, et al. [Care contacts of elderly patients in the emergency care pathway: a retrospective cohort study]. Ned Tijdschr Geneeskd. 2019;163:D3523. Epub 2019/07/31.
  • 3. Long SJ, Brown KF, Ames D, Vincent C. What is known about adverse events in older medical hospital inpatients? A systematic review of the literature. Int J Qual Health Care. 2013;25(5):542–54. Epub 2013/08/09. doi: 10.1093/intqhc/mzt056
  • 4. Hoogervorst-Schilp J, Langelaan M, Spreeuwenberg P, de Bruijne MC, Wagner C. Excess length of stay and economic consequences of adverse events in Dutch hospital patients. BMC Health Serv Res. 2015;15:531. Epub 2015/12/03. doi: 10.1186/s12913-015-1205-5
  • 5. Loyd C, Markland A, Zhang Y, Fowler M, Harper S, Wright N, et al. Prevalence of Hospital-Associated Disability in Older Adults: A Meta-analysis. J Am Med Dir Assoc. 2020;21(4):455–61.e5. Epub 2019/11/18. doi: 10.1016/j.jamda.2019.09.015
  • 6. Legramante JM, Morciano L, Lucaroni F, Gilardi F, Caredda E, Pesaresi A, et al. Frequent Use of Emergency Departments by the Elderly Population When Continuing Care Is Not Well Established. PloS one. 2016;11(12):e0165939. Epub 2016/12/16. doi: 10.1371/journal.pone.0165939; PubMed Central PMCID: PMC5156362.
  • 7. Šteinmiller J, Routasalo P, Suominen T. Older people in the emergency department: a literature review. Int J Older People Nurs. 2015;10(4):284–305. Epub 2015/07/18. doi: 10.1111/opn.12090
  • 8. Morley C, Unwin M, Peterson G, Stankovich J, Kinsman L. Emergency department crowding: A systematic review of causes, consequences and solutions. PloS one. 2018;13(8):e0203316. Epub 2018/08/31. doi: 10.1371/journal.pone.0203316
  • 9. Kozak L, Hall M, Owings M. Trends in avoidable hospitalizations, 1980–1998. Health Aff (Millwood). 2001;20(2):225–32. Epub 2001/03/23. doi: 10.1377/hlthaff.20.2.225
  • 10. Starfield B, Shi L, Macinko J. Contribution of primary care to health systems and health. The Milbank quarterly. 2005;83(3):457–502. Epub 2005/10/06. doi: 10.1111/j.1468-0009.2005.00409.x
  • 11. Purdy S, Paranjothy S, Huntley A, Thomas R, Mann M, Huws D, et al. Interventions to reduce unplanned hospital admission: a series of systematic reviews. Final Report. 2012.
  • 12. O’Caoimh R, Cornally N, Weathers E, O’Sullivan R, Fitzgerald C, Orfila F, et al. Risk prediction in the community: A systematic review of case-finding instruments that predict adverse healthcare outcomes in community-dwelling older adults. Maturitas. 2015;82(1):3–21. Epub 2015/04/14. doi: 10.1016/j.maturitas.2015.03.009
  • 13. Wallace E, Stuart E, Vaughan N, Bennett K, Fahey T, Smith S. Risk prediction models to predict emergency hospital admission in community-dwelling adults: a systematic review. Medical care. 2014;52(8):751–65. Epub 2014/07/16. doi: 10.1097/MLR.0000000000000171
  • 14. Hemingway H. Prognosis research: why is Dr. Lydgate still waiting? Journal of clinical epidemiology. 2006;59(12):1229–38. Epub 2006/11/14. doi: 10.1016/j.jclinepi.2006.02.005
  • 15. Riley R, Ridley G, Williams K, Altman D, Hayden J, de Vet H. Prognosis research: toward evidence-based results and a Cochrane methods group. Journal of clinical epidemiology. 2007;60(8):863–5. Epub 2007/07/04. doi: 10.1016/j.jclinepi.2007.02.004
  • 16. Moons K, Altman D, Reitsma J, Ioannidis J, Macaskill P, Steyerberg E, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Annals of internal medicine. 2015;162(1):W1–73. Epub 2015/01/07. doi: 10.7326/M14-0698
  • 17. Bouwmeester W, Zuithoff N, Mallett S, Geerlings M, Vergouwe Y, Steyerberg E, et al. Reporting and Methods in Clinical Prediction Research: A Systematic Review. PLoS medicine. 2012;9(5):e1001221. doi: 10.1371/journal.pmed.1001221
  • 18. Heus P, Damen J, Pajouheshnia R, Scholten R, Reitsma J, Collins G, et al. Poor reporting of multivariable prediction model studies: towards a targeted implementation strategy of the TRIPOD statement. BMC Medicine. 2018;16(1):120. doi: 10.1186/s12916-018-1099-2
  • 19. Lee SW, Koo MJ. PRISMA 2020 statement and guidelines for systematic review and meta-analysis articles, and their underlying mathematics: Life Cycle Committee Recommendations. Life Cycle. 2022;2:e9. doi: 10.54724/lc.2022.e9
  • 20. Geersing G, Bouwmeester W, Zuithoff P, Spijker R, Leeflang M, Moons K. Search filters for finding prognostic and diagnostic prediction studies in Medline to enhance systematic reviews. PloS one. 2012;7(2):e32844. Epub 2012/03/07. doi: 10.1371/journal.pone.0032844
  • 21. Swenor B, Ramulu P, Willis J, Friedman D, Lin F. The prevalence of concurrent hearing and vision impairment in the United States. JAMA internal medicine. 2013;173(4):312–3. Epub 2013/01/23. doi: 10.1001/jamainternmed.2013.1880
  • 22. Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan-a web and mobile app for systematic reviews. Systematic reviews. 2016;5(1):210. Epub 2016/12/07. doi: 10.1186/s13643-016-0384-4
  • 23. Moons K, de Groot J, Bouwmeester W, Vergouwe Y, Mallett S, Altman D, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS medicine. 2014;11(10):e1001744. Epub 2014/10/15. doi: 10.1371/journal.pmed.1001744
  • 24. Iezzoni L. Risk adjustment for measuring health care outcomes: Health Administration Press; Chicago; 2003.
  • 25. Wolff R, Moons K, Riley R, Whiting P, Westwood M, Collins G, et al. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Annals of internal medicine. 2019;170(1):51–8. Epub 2019/01/01. doi: 10.7326/M18-1376
  • 26. Moons K, Wolff R, Riley R, Whiting P, Westwood M, Collins G, et al. PROBAST: A Tool to Assess Risk of Bias and Applicability of Prediction Model Studies: Explanation and Elaboration. Annals of internal medicine. 2019;170(1):W1–W33. Epub 2019/01/01. doi: 10.7326/M18-1377
  • 27. Austin P, Steyerberg E. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Statistical methods in medical research. 2017;26(2):796–808. Epub 2014/11/21. doi: 10.1177/0962280214558972
  • 28. Boult C, Dowd B, McCaffrey D, Boult L, Hernandez R, Krulewitch H. Screening elders for risk of hospital admission. Journal of the American Geriatrics Society. 1993;41(8):811–7. Epub 1993/08/01. doi: 10.1111/j.1532-5415.1993.tb06175.x
  • 29. Deardorff W, Sloane R, Pavon J, Hastings S, Whitson H. Hospitalization Risk Among Older Adults with Sensory Impairments: Development of a Prognostic Model. Journal of the American Geriatrics Society. 2020;68(11):2650–5. doi: 10.1111/jgs.16800
  • 30. Freedman J, Beck A, Robertson B, Calonge B, Gade G. Using a mailed survey to predict hospital admission among patients older than 80. Journal of the American Geriatrics Society. 1996;44(6):689–92. Epub 1996/06/01. doi: 10.1111/j.1532-5415.1996.tb01833.x
  • 31. Inouye SK, Zhang Y, Jones RN, Shi P, Cupples LA, Calderon HN, et al. Risk factors for hospitalization among community-dwelling primary care older patients: development and validation of a predictive model. Medical care. 2008;46(7):726–31. Epub 2008/06/27. doi: 10.1097/MLR.0b013e3181649426
  • 32. Kan H, Kharrazi H, Leff B, Boyd C, Davison A, Chang H, et al. Defining and Assessing Geriatric Risk Factors and Associated Health Care Utilization Among Older Adults Using Claims and Electronic Health Records. Medical care. 2018;56(3):233–9. doi: 10.1097/MLR.0000000000000865
  • 33. Kurichi J, Bogner H, Streim J, Xie D, Kwong P, Saliba D, et al. Predicting 3-year mortality and admission to acute-care hospitals, skilled nursing facilities, and long-term care facilities in Medicare beneficiaries. Arch Gerontol Geriatr. 2017;73:248–56. doi: 10.1016/j.archger.2017.08.005
  • 34. Mishra A, Skubic M, Popescu M, Lane K, Rantz M, Despins L, et al. Tracking personalized functional health in older adults using geriatric assessments. BMC Med Inform Decis Mak. 2020;20(1):270. doi: 10.1186/s12911-020-01283-y
  • 35. Reuben D, Keeler E, Seeman T, Sewall A, Hirsch S, Guralnik J. Development of a method to identify seniors at high risk for high hospital utilization. Medical care. 2002;40(9):782–93. Epub 2002/09/10. doi: 10.1097/00005650-200209000-00008
  • 36. Shelton P, Sager MA, Schraeder C. The community assessment risk screen (CARS): identifying elderly persons at risk for hospitalization or emergency department visit. Am J Manag Care. 2000;6(8):925–33. Epub 2001/02/24.
  • 37. Wu S, Mulcahy J, Kasper J, Kan H, Weiner J. Comparing Survey-Based Frailty Assessment to Medicare Claims in Predicting Health Outcomes and Utilization in Medicare Beneficiaries. J Aging Health. 2020;32(7):764–77. doi: 10.1177/0898264319851995
  • 38. Mazzaglia G, Roti L, Corsini G, Colombini A, Maciocco G, Marchionni N, et al. Screening of older community-dwelling people at risk for death and hospitalization: the Assistenza Socio-Sanitaria in Italia project. Journal of the American Geriatrics Society. 2007;55(12):1955–60. Epub 2007/10/20. doi: 10.1111/j.1532-5415.2007.01446.x
  • 39. Tarekegn A, Ricceri F, Costa G, Ferracin E, Giacobini M. Predictive Modeling for Frailty Conditions in Elderly People: Machine Learning Approaches. JMIR Med Inform. 2020;8(6):e16678. doi: 10.2196/16678
  • 40. Lyon D, Lancaster GA, Taylor S, Dowrick C, Chellaswamy H. Predicting the likelihood of emergency admission to hospital of older people: development and validation of the Emergency Admission Risk Likelihood Index (EARLI). Fam Pract. 2007;24(2):158–67. Epub 2007/01/11. doi: 10.1093/fampra/cml069
  • 41. O’Caoimh R, Gao Y, Svendrovski A, Healy E, O’Connell E, O’Keeffe G, et al. The Risk Instrument for Screening in the Community (RISC): a new instrument for predicting risk of adverse outcomes in community dwelling older adults. BMC Geriatr. 2015;15:92. doi: 10.1186/s12877-015-0095-z
  • 42. Roos N, Roos L, Mossey J, Havens B. Using administrative data to predict important health outcomes. Entry to hospital, nursing home, and death. Medical care. 1988;26(3):221–39. Epub 1988/03/01. doi: 10.1097/00005650-198803000-00001
  • 43. Marcusson J, Nord M, Dong H, Lyth J. Clinically useful prediction of hospital admissions in an older population. BMC Geriatr. 2020;20(1):95. doi: 10.1186/s12877-020-1475-6
  • 44. López-Aguilà S, Contel JC, Farré J, Campuzano JL, Rajmil L. Predictive model for emergency hospital admission and 6-month readmission. Am J Manag Care. 2011;17(9):e348–57. Epub 2011/09/10.
  • 45. Lin K, Chen P, Huang L, Mao H, Chan D. Predicting Inpatient Readmission and Outpatient Admission in Elderly: A Population-Based Cohort Study. Medicine (Baltimore). 2016;95(16):e3484. doi: 10.1097/MD.0000000000003484
  • 46. Kim H, Jung Y, Kim K, Park J. Prospectively identifying older adults at risk for potentially avoidable hospitalizations in Korea using population-based data. Int J Qual Health Care. 2019;31(8):620–6. doi: 10.1093/intqhc/mzy225
  • 47. Costa A, Schumacher C, Jones A, Dash D, Campbell G, Junek M, et al. DIVERT-Collaboration Action Research and Evaluation (CARE) Trial Protocol: a multiprovincial pragmatic cluster randomised trial of cardiorespiratory management in home care. BMJ open. 2019;9(12):e030301. Epub 2019/12/18. doi: 10.1136/bmjopen-2019-030301
  • 48. Andaur Navarro C, Damen J, Takada T, Nijman S, Dhiman P, Ma J, et al. Protocol for a systematic review on the methodological and reporting quality of prediction model studies using machine learning techniques. BMJ open. 2020;10(11):e038832. Epub 2020/11/13. doi: 10.1136/bmjopen-2020-038832
  • 49. Andaur Navarro C, Damen J, Takada T, Nijman S, Dhiman P, Ma J, et al. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ. 2021;375:n2281. doi: 10.1136/bmj.n2281
  • 50. Janssen K, Moons K, Kalkman C, Grobbee D, Vergouwe Y. Updating methods improved the performance of a clinical prediction model in new patients. Journal of clinical epidemiology. 2008;61(1):76–86. Epub 2007/12/18. doi: 10.1016/j.jclinepi.2007.04.018
  • 51. Janssen K, Vergouwe Y, Kalkman C, Grobbee D, Moons K. A simple method to adjust clinical prediction models to local circumstances. Can J Anaesth. 2009;56(3):194–201. Epub 2009/02/28. doi: 10.1007/s12630-009-9041-x
  • 52. Huntley A, Lasserson D, Wye L, Morris R, Checkland K, England H, et al. Which features of primary care affect unscheduled secondary care use? A systematic review. BMJ open. 2014;4(5):e004746. Epub 2014/05/27. doi: 10.1136/bmjopen-2013-004746
  • 53. Longman J, Johnston J, Ewald D, Gilliland A, Burke M, Mutonga T, et al. What could prevent chronic condition admissions assessed as preventable in rural and metropolitan contexts? An analysis of clinicians’ perspectives from the DaPPHne study. PloS one. 2021;16(1):e0244313. Epub 2021/01/08. doi: 10.1371/journal.pone.0244313
  • 54. Purdy S, Huntley A. Predicting and preventing avoidable hospital admissions: a review. The journal of the Royal College of Physicians of Edinburgh. 2013;43(4):340–4. Epub 2013/12/19. doi: 10.4997/jrcpe.2013.415
  • 55. Barker I, Steventon A, Deeny S. Association between continuity of care in general practice and hospital admissions for ambulatory care sensitive conditions: cross sectional study of routinely collected, person level data. BMJ. 2017;356:j84. Epub 2017/02/06. doi: 10.1136/bmj.j84

Decision Letter 0

Andrea Gruneir

2 Mar 2022

PONE-D-21-38410

Risk prediction models for the prediction of unplanned hospital admissions or emergency department visits in community-dwelling older adults: a systematic review.

PLOS ONE

Dear Dr. Klunder,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

ACADEMIC EDITOR:

We now have 3 reviews available of your manuscript. You will see that the reviewers were quite positive about this research but did raise a number of suggestions to improve the clarity of the manuscript, some related to the structuring of the manuscript and others focused on providing more detail on certain decisions. You will also see that the reviewers raised some concerns about the discussion of machine learning techniques vs regression and are generally looking for a more nuanced take on this issue.

Please submit your revised manuscript by Apr 16 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Andrea Gruneir

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: N/A

Reviewer #3: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: An interesting systematic review on risk prediction models for hospitalizations among community-dwelling older adults. The review looks to be well-conducted and the conclusions soundly based in the results. I appreciate the focus on the methodological quality of the studies in addition to their performance. I have a few suggestions that I believe will aid in the clarity of their manuscript.

Abstract

1. Line 25: In the Abstract the inclusion criteria is that models had to be intended to be used in a “primary care setting” whereas in the Methods section it is “general practice or community care”. This may be a jurisdictional difference, but I would not typically consider community care to be part of primary care so do not see these statements as equivalent. Can the authors align the definitions?

2. Line 27: Do the authors mean “quality” assessment?

3. Line 31: I believe that “sex or gender” would be more accurate according to the data in Table 2

Introduction

4. Lines 60-63. I might be misunderstanding this sentence, but it sounds like the authors are saying that regression models are more at risk of overfitting than machine learning models, which is not typically true. Machine learning models are much more complex than regression models. This complexity leads to both a theoretical benefit for predictive performance as well as an increased risk of overfitting. Some of the papers that the authors have cited [13 – for example] discuss this.

Methods

5. Line 108. I’m confused about the scope and justification of this exclusion. I can understand wanting to exclude studies done in disease-specific populations. But I am not grasping why studies among patients with cognitive impairments were specifically included when studies done in patients with heart failure, for example, presumably were excluded. Could the authors provide more detail and justification?

6. Line 132: Could the authors briefly list the domains of high heterogeneity in this sentence?

Results

7. Line 203: “sex or gender” is likely more accurate

Table 3

8. Presentation of the data in an N(%) format would be more informative than listing the reference numbers

Discussion

9. Feature/variable selection is a controversial and complex topic and I think the authors could benefit from more nuance in their discussion. For example, backwards selection is clearly superior to univariable screening, but it still comes with its own challenges. The use of any automated selection model (as noted in [14 and 15]) bears risks and p-value based approaches in particular lack justification. Could the authors expand this section, comment on other available methods, e.g. LASSO and other methods as detailed in https://doi.org/10.1016/j.jclinepi.2015.10.002 and https://doi.org/10.1016/j.ijmedinf.2018.05.006, and comment that some ML methods have feature/variable selection incorporated into their algorithms?

Reviewer #2: The authors have presented an updated systematic review in this paper. The study is interesting and the approach is adequately robust. My specific comments are given below.

1. Is there a specific rationale for the chosen time frame for searching literature (2013-2021)?

2. Also, is there a specific reason for restricting the participants to adults > 65 years of age?

3. Time frame ranges from 7 days to 4 years. Was there any discernible temporal decay in the predictive performance across the models over this relatively longer time window? In other words, did those models predicting a shorter time span perform better than those predicting longer time spans?

4. The study has found that models developed to predict preventable hospitalizations had better predictive performance than models predicting hospitalizations in general. I think the authors should elaborate further on the clinical implications of this important finding.

5. Machine learning/deep learning-based models are quite different from traditional statistical models in a number of ways. The inclusion of a large number of variables in ML/DL models is, in fact, not an issue and modelling in a high-dimensional space is permissible with ML/DL. Therefore, using the guidelines (TRIPOD/CHARMS/PROBAST) geared to assessing classic predictive models for ML/DL models may not be ideal. Authors should discuss the implications of this and likely limitations.

6. Authors have presented the different variables included in each model. What predictors were actually found to be important in these predictive models? What predictors were statistically significant in classic predictive models and what variables emerged as important in ML models? For instance, ML uses techniques such as variable importance metrics and Shapley additives to gauge predictor importance.

7. Suggested to include eligibility criteria in a standard PICOTS table.

8. It would be important to describe in detail what additional and novel findings emerged from this SR, compared to the two previous SR on the same domain.

9. Both split-sample validation and cross-validation have their limitations. External validation is a great way to assess the robustness and generalizability of predictive models. How many of these models were externally validated?

Reviewer #3: First of all I would like to thank and congratulate the authors on their work. The topic of this systematic review is very interesting and important. The manuscript lacks however, structure and does not always read well. See my suggestions and questions below to improve this work.

BACKGROUND

Use of “risk” prediction modeling is a confusing and uncommon terminology. Recommend to use solely prediction model. This will probably improve the readability of the manuscript.

A very large proportion of the introduction is being used to describe prediction models, big data and machine learning in general. This does not read well, and does not add very much value to the topic of this systematic review. I would recommend to explain more about the “burden” of older patients at the emergency department. For example, how many times are patients admitted to an ED? What are the reasons that they visit the ED? Are these reasons preventable? This will highlight the importance of this research. Additionally, I would also recommend to focus on the effects of ED admission and hospitalization on the elderly, such as the loss of functionality, risk of delirium during admission, psychological effects etc. Additionally, the authors described that with an effective primary care intervention, healthcare costs will decrease. I suggest to add details about how identification of these elderly can improve the work of physicians on how they can deliver more qualitative and effective healthcare.

METHODS

My major concern is the following: in the introduction and methods it is explained that previous reviews included studies focusing on ED admission and case-finding instruments. However, inclusion for this study was limited to studies from 2013 onwards, despite focusing on ED admission and unplanned hospitalization. The reason why the authors chose this specific inclusion year is confusing, as the previously performed reviews do not fully cover the research question of this systematic review. Could the authors explain how and why this decision was made?

Inclusion criteria one, two and four seem obvious. However I think inclusion criteria three and five need more explanation. In general a question to authors: why were only validated prediction models included in this study? With the PROBAST tool, the models are also scored on “validation”. I do not see why development studies are not included.

In regards to inclusion criteria five: how do prediction models being used at the ED differ from the ones being used at a primary care facility?

Textual comments:

Methods section reads cloudy and could be more straightforward. For example: [1] Since these systematic reviews identified the same risk prediction models, we decided to limit publication dates from August 2013 through January 2021, which has some overlap with the searches of these reviews. To give a complete overview, we will also include the studies found in the previous reviews. The references of the identified articles were searched for relevant publications. I would suggest to rephrase to: [1] To provide a complete overview of available prediction models our search was restricted to August 2013 through January 2021. The models described in the previous reviews were also included in this systematic review.

Textual comments:

[2] After extraction of data, the Prediction model Risk of Bias Assessment Tool (PROBAST; see Appendix B) was used to assess risk of bias and applicability of the predictive models. Concern for applicability addresses whether the primary study matches the review question. I would suggest to rephrase to: [2] The Prediction model Risk of Bias Assessment Tool was used to assess risk of bias and applicability, of which the latter addresses whether the primary study matches the review question.

I would suggest to change the structure of the methods section and shorten it. Combine sections search strategy, study selection and data extraction. Make a new subheading with model performance including the explanation about AUC and EPV and lastly discuss the PROBAST tool. The PROBAST tool is explained very extensively. I would recommend to move details to supplements or just refer to the original PROBAST article. Also describe that regression models and machine learning models will be described separately.

RESULTS

A question for authors: Did all of the included studies focus on developing only 1 prediction model? Because in these kinds of studies sometimes multiple models are developed/validated and compared to each other. Is 22 studies equivalent to 22 unique prediction models?

Textual comments

Same as the methods. Results can be pointed out more straightforward. See examples below.

Line 162/168: “A flow diagram of the search strategy and selection process is presented in Figure 1” can be removed. Data extracted from the studies can be found in Table 1 and Table 2. Suggest to change it to: The literature searches yielded a total of 16,098 citations (Figure 1). Tables and figures do not need a notification; referring to them is the standard.

Line 164. Additionally, twenty-three articles were identified through other sources. What are these other sources? This was not mentioned in the methods.

Line 165: In addition to 10 studies included in the previously published systematic reviews, 12 new studies met all inclusion criteria, which makes a total of 22 unique risk prediction models. Rephrase: Full texts were retrieved for 170 studies of which 12 met all inclusion criteria. Additionally, a total of 10 studies were included from the previously published systematic reviews.

I would suggest to refer to “prediction model” instead of “study” in the results sections. For example line 171: Thirteen studies included participants aged ≥65 years[29, 33, 34, 36-40, 44, 46-49], the remaining studies used a higher age as inclusion criterion with. Rephrase to: Thirteen prediction models included……..

When describing results, try to hold on to the structure in the methods. The EPV can be described in the “predictive accuracy” section.

Avoid using question marks in tables and figures. I would suggest to use NA or a color scheme, for example, high risk= red, low risk= green, unclear= purple.

DISCUSSION

I do not understand why the difference between machine learning and logistic regression models is discussed prominently in this article. In order to say whether one technique is superior to the other, you should validate both models in the identical population. In line 317 the authors state that machine learning techniques are not superior and in line 328 the authors state that a fair comparison is not possible. Please be consistent in conclusions.

The discussion includes a lot of repetition of the results. The conclusion is solely based on the development of more prediction models. In the introduction the authors describe that the eventual goal is to develop a care management program to avoid these admissions. The authors should highlight how they could use these models to develop such a program. For example: are the variables in these models standard measurements in a primary care facility?

If the conclusion is only developing more models, the authors should describe how to accomplish this. Which data source to use, which variables should be included, which modeling technique etc.

Textual comments:

Line 293: Twelve risk models were added to the existing evidence. This does not read well. Suggest to remove this sentence.

Line 293-295: The recommendation of using nonmedical factors is never mentioned in the manuscript. This conclusion comes a bit out of the blue. I would recommend to make a more general conclusion, on the results that the authors did find (e.g. quality of models, performance of models etc.).

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Peer review.docx

PLoS One. 2022 Sep 23;17(9):e0275116. doi: 10.1371/journal.pone.0275116.r002

Author response to Decision Letter 0


29 May 2022

Amsterdam, May 1st 2022

Dear editor,

We would like to thank you and the reviewers for thoroughly reviewing our manuscript. We have rigorously revised our manuscript and believe its quality has improved. We have addressed each comment point by point below. We have marked important changes in the revised manuscript with track changes and have referred to the page and line numbers in the commentary below.  

Reviewer #1: An interesting systematic review on risk prediction models for hospitalizations among community-dwelling older adults. The review looks to be well-conducted and the conclusions soundly based in the results. I appreciate the focus on the methodological quality of the studies in addition to their performance. I have a few suggestions that I believe will aid in the clarity of their manuscript.

Abstract

1. Line 25: In the Abstract the inclusion criteria is that models had to be intended to be used in a “primary care setting” whereas in the Methods section it is “general practice or community care”. This may be a jurisdictional difference, but I would not typically consider community care to be part of primary care so do not see these statements as equivalent. Can the authors align the definitions?

Response: We agree these terms may be confusing and we thank the reviewer for highlighting this contradiction. Primary care focuses on physical, mental and social health issues; it comprises care mainly performed in general practice and home care. Community care addresses a wider aspect of wellbeing and also focuses on social problems people may experience, such as housing problems. The aim of this review is to find risk prediction models that can be used by either general practitioners or home care professionals; therefore, primary care is the most suitable definition. We have changed the definition in both abstract and methods accordingly. (page 2, line 24; page 4, line 104)

2. Line 27: Do the authors mean “quality” assessment?

Response: We do indeed. We’ve changed this. (page 2, line 26)

3. Line 31: I believe that “sex or gender” would be more accurate according to the data in Table 2

Response: We assume all studies refer to the biological characteristics of males and females, which means sex would be the appropriate term. However, some included studies use the term gender and do not describe their definition; therefore, we changed ‘gender’ into ‘sex or gender’ according to the reviewer’s suggestion. (page 2, line 29)

Introduction

4. Lines 60-63. I might be misunderstanding this sentence, but it sounds like the authors are saying that regression models are more at risk of overfitting than machine learning models, which is not typically true. Machine learning models are much more complex than regression models. This complexity leads to both a theoretical benefit for predictive performance as well as an increased risk of overfitting. Some of the papers that the authors have cited [13 – for example] discuss this.

Response: We thank the reviewer for their comment. We could indeed write a whole article on the benefits and disadvantages of regression and ML models. In light of the reviewer’s comment and the exclusion of some studies after this peer review, we have decided to put less emphasis on the differences between regression and ML in our review, and we have therefore removed this paragraph from our introduction.

Methods

5. Line 108. I’m confused about the scope and justification of this exclusion. I can understand wanting to exclude studies done in disease-specific populations. But I am not grasping why studies among patients with cognitive impairments were specifically included when studies done in patients with heart failure, for example, presumably were excluded. Could the authors provide more detail and justification?

Response: We agree with the reviewer that the argument to exclude prediction models developed in specific populations except for community-dwelling older adults with cognitive impairments is disputable. We have therefore adjusted this criterion in our methods (page 5, lines 111-113) and excluded the two corresponding studies (i.e. Tsang et al.[1] and Maust et al.[2]) from this review.

6. Line 132: Could the authors briefly list the domains of high heterogeneity in this sentence?

Response: This sentence refers to the impossibility of performing a meta-analysis due to high heterogeneity among the included studies. Our review describes prediction model development studies, in which the models are developed using different statistical methodologies (i.e. different regression analysis methods and ML). A meta-analysis summarizes the estimates of model discrimination and calibration; however, due to the wide variation in, among others, outcomes (i.e. (preventable) hospital admissions or both) and prediction horizons (days to years), a quantified average performance in terms of discrimination or calibration would be uninterpretable.

Results

7. Line 203: “sex or gender” is likely more accurate

Response: We have changed this according to their suggestion. (page 8, line 178)

Table 3

8. Presentation of the data in an N(%) format would be more informative than listing the reference numbers

Response: We thank the reviewer for their suggestion. We agree with the reviewer that presentation of percentages in this table is of added value, and we have therefore added these to Table 3 (page 17).

Discussion

9. Feature/variable selection is a controversial and complex topic and I think the authors could benefit from more nuance in their discussion. For example, backwards selection is clearly superior to univariable screening, but it still comes with its own challenges. The use of any automated selection model (as noted in [14 and 15]) bears risks and p-value based approaches in particular lack justification. Could the authors expand this section, comment on other available methods, e.g. LASSO and other methods as detailed in https://doi.org/10.1016/j.jclinepi.2015.10.002 and https://doi.org/10.1016/j.ijmedinf.2018.05.006, and comment that some ML methods have feature/variable selection incorporated into their algorithms?

Response: We agree with the reviewer that this section is somewhat short-sighted and we have nuanced this section (page 21, lines 337-341). As stated by the reviewer, variable selection is complex and controversial and many methods for variable selection exist. However, for this review we strictly followed the recommendations of the TRIPOD.[3] The TRIPOD is currently considered the state-of-the-art reporting guideline for prediction model studies and systematic reviews on this topic.[4, 5] According to the TRIPOD, backward elimination is generally preferred if automated predictor selection procedures are used. While some of TRIPOD’s recommendations may offer room for in-depth discussion about the different predictor selection methods, we believe that this is beyond the scope of this paper, especially considering the exclusion of two of the three ML models after revision. If the editor would like us to elaborate on this issue further, we would like to know and we will gladly accommodate their request.

Reviewer #2: The authors have presented an updated systematic review in this paper. The study is interesting and the approach is adequately robust. My specific comments are given below.

1. Is there a specific rationale for the chosen time frame for searching literature (2013-2021)?

Response: The aim of this review was to update the systematic review of Wallace et al. and, in addition, to focus on prediction models in adults aged 65 and over. Wallace et al. performed a thorough literature search with inclusion criteria identical to those of our study (except for the age criterion). Since Wallace’s last updated search was in February 2014, we decided to overlap our search by 6 months, in case any records were added retrospectively (e.g. to correct an error in indexing) and were not detected by Wallace’s search.

2. Also, is there a specific reason for restricting the participants to adults > 65 years of age?

Response: Older adults have greater vulnerability to acute stress than younger individuals due to age-related diminution of physiologic reserves. Moreover, older adults tend to have more comorbid illnesses and disability. Older adults are more at risk for hospitalization and require more support after discharge than adults in middle age (45 to 64 years).[6] Prevention of hospital admissions in the older age group therefore seems more important on a community level. There is, however, no clear age that defines an older adult. We have chosen the most conventional definition of older adults: people aged 65 and over.[7]

We have revised our introduction, putting more focus on the importance of identifying older adults at risk for hospitalizations.

3. Time frame ranges from 7 days to 4 years. Was there any discernible temporal decay in the predictive performance across the models over this relatively longer time window? In other words, did those models predicting a shorter time span perform better than those predicting longer time spans?

Response: Most studies used a prediction horizon of 12 months (n=12, 63%). The predictive performance of these models, excluding the ones that predicted preventable hospitalizations or fall with hospitalization, ranged between AUC 0.61 – 0.76 (median AUC = 0.70). Two models were developed to predict a shorter time span, i.e. 6 months (AUC 0.72) and 4.5 months (AUC 0.63). So there was no remarkable superiority of the prediction models with a shorter prediction horizon in terms of predictive performance. On the other hand, four studies used a prediction horizon of more than 12 months (i.e. 15 months to 4 years). These models showed worse predictive ability (median AUC = 0.67, range 0.61 – 0.69) than the models predicting a time span of 12 months. This suggests predictions over a time span of more than 12 months become less accurate. In addition, it is questionable whether predictions of hospital admissions within an interval of multiple years are of added value to clinicians. However, their low accuracy could also be due to the fact that these studies used survey data, whereas models using administrative data tended to perform better. We therefore did not incorporate this finding in the description of our results.
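For readers less familiar with the AUC values compared above, the following is a minimal sketch of what the statistic measures: the probability that a randomly chosen person who was admitted received a higher predicted risk than a randomly chosen person who was not. The function name `auc` and all risk values are hypothetical illustrations and are not taken from any included study.

```python
from itertools import product


def auc(scores_admitted, scores_not_admitted):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Counts the share of admitted/not-admitted pairs in which the admitted
    person has the higher predicted risk; ties count as half a win.
    """
    if not scores_admitted or not scores_not_admitted:
        raise ValueError("need at least one admitted and one non-admitted score")
    wins = 0.0
    for pos, neg in product(scores_admitted, scores_not_admitted):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(scores_admitted) * len(scores_not_admitted))


# Hypothetical predicted risks (assumption: illustrative toy values only).
admitted = [0.9, 0.6, 0.4]
not_admitted = [0.5, 0.3, 0.2, 0.1]
print(round(auc(admitted, not_admitted), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the ranges quoted in the response should be read.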

4. The study has found that models developed to predict preventable hospitalizations had better predictive performance than models predicting hospitalizations in general. I think the authors should elaborate further on the clinical implications of this important finding.

Response: We thank the reviewer for their suggestion and have elaborated on this finding and its clinical implications (page 21-22, lines 353-370).

5. Machine learning/deep learning-based models are quite different from traditional statistical models in a number of ways. The inclusion of a large number of variables in ML/DL models is, in fact, not an issue and modelling in a high-dimensional space is permissible with ML/DL. Therefore, using the guidelines (TRIPOD/CHARMS/PROBAST) geared to assessing classic predictive models for ML/DL models may not be ideal. Authors should discuss the implications of this and likely limitations.

Response: On page 20, in line 321 and further, we discuss the fact that TRIPOD, CHARMS and PROBAST were originally designed for all types of prediction modelling studies, although their focus is indeed on regression-based prediction models. Nonetheless, all TRIPOD items are applicable to ML models. For PROBAST, however, two signaling questions might be less relevant (i.e. selection of predictors based on univariable analysis and reporting of weighted estimates in the final model) and more signaling questions, e.g. related to data generation and feature selection, might be necessary. Hence, risk of bias for the ML model in our review was rated as unclear. ML versions of these checklists are under development; until then, it is recommended to use TRIPOD, CHARMS and PROBAST as a benchmark rather than none. We have added the latter in the manuscript (see underlined sentence below):

Necessity for guidelines for reporting and critical appraisal of prediction model studies using ML has been addressed and PROBAST-ML (as well as TRIPOD-ML) has been announced.[8] Until then, it is recommended to use TRIPOD, CHARMS and PROBAST as benchmark for the development of prediction model studies rather than none.[9]

6. Authors have presented the different variables included in each model. What predictors were actually found to be important in these predictive models? What predictors were statistically significant in classic predictive models and what variables emerged as important in ML models? For instance, ML uses techniques such as variable importance metrics and Shapley additives to gauge predictor importance.

• Thanks to the reviewer for their suggestion. We have presented all variables that were statistically significant and thus included in the final models in Table 3. Because of other comments on the difference between classic regression models and ML models, we decided to put less emphasis on the differences between these models. To answer the first question: because every study handled predictors differently (e.g. categorical or continuous analysis of the variable age) and used different numbers of predictors and different predictor selection methods, no quantified conclusions can be drawn on which predictors were most important. In general, previous admissions, high age, multimorbidity, polypharmacy and heart disease were most frequently included in the final models, in most cases with high beta-coefficients. We added this to our conclusions. (page 19, line 278-282; page 21, line 359-361)

7. Suggested to include eligibility criteria in a standard PICOTS table.

• We have adapted the inclusion criteria in the manuscript according to the PICOTS format as suggested by Debray et al.[4] (page 4, line 95-104)

8. It would be important to describe in detail what additional and novel findings emerged from this SR, compared to the two previous SR on the same domain.

• We thank the reviewer for their suggestion. We have now described our results in the discussion in light of the findings by the previous reviews. (page 19, lines 276-297)

9. Both split-sample validation and cross-validation have their limitations. External validation is a great way to assess the robustness and generalizability of predictive models. How many of these models were externally validated?

• The models that were externally validated within the same study are reported in Table 1 (n=3, i.e. Marcusson, Mazzaglia, and Shelton). Unfortunately, very few models were externally validated by other researchers. We have not incorporated these external validations in our review, since our primary aim was to summarize predictive models that have been developed to predict unplanned hospital admissions to date.

Reviewer #3: First of all I would like to thank and congratulate the authors on their work. The topic of this systematic review is very interesting and important. The manuscript lacks however, structure and does not always read well. See my suggestions and questions below to improve this work.

BACKGROUND

Use of “risk” prediction modeling is a confusing and uncommon terminology. Recommend to use solely prediction model. This will probably improve the readability of the manuscript.

• We thank the reviewer for their suggestion and we have changed this accordingly in the manuscript.

A very large proportion of the introduction is being used to describe prediction models, big data and machine learning in general. This does not read well, and does not add very much value to the topic of this systematic review. I would recommend to explain more about the “burden” of older patients at the emergency department. For example, how many times are patients admitted to an ED? What are the reasons that they visit the ED? Are these reasons preventable? This will highlight the importance of this research. Additionally, I would also recommend to focus on the effects of ED admission and hospitalization on the elderly, such as the loss of functionality, risk of delirium during admission, psychological effects etc. Additionally, the authors described that with an effective primary care intervention, healthcare costs will decrease. I suggest adding details about how identification of these elderly can help physicians deliver higher-quality and more effective healthcare.

• We thank the reviewer for their suggestion. We have revised the introduction accordingly by omitting a large part of the paragraph on machine learning and describing the relevance of the topic of our review in more detail. (page 3, lines 40-55)

METHODS

My major concern is the following: in the introduction and methods it is explained that previous reviews included studies focusing on ED admission and case-finding instruments. However, inclusion for this study was limited to studies from 2013 onwards, despite focusing on ED admission and unplanned hospitalization. The reason why the authors chose this specific inclusion year is confusing, as the previously performed reviews do not fully cover the research question of this systematic review. Could the authors explain how and why this decision was made?

• We understand the reviewer’s concern. With this review we build on the review of Wallace et al., who performed a thorough review in 2014 (20,666 records). In the present review we focused on older adults, for the reasons mentioned in the introduction. Wallace et al. included risk prediction models for hospital admissions or combined endpoints such as hospital admission or ED visits, but did not include prediction models predicting ED visits as a single outcome. We agree that this is a gap in our search strategy. We therefore amended our inclusion criterion to hospital admissions only or a combined endpoint of hospital admissions and ED visits. (page 4, line 100-103)

This way our inclusion criteria are identical to Wallace et al., except for the restriction to the older population.

In consequence, the modification of this inclusion criterion led to the exclusion of one study (i.e. Veyron et al.[10]) in the original version of this review.

Inclusion criteria one, two and four seem obvious. However I think inclusion criteria three and five need more explanation. In general a question to authors: why were only validated prediction models included in this study? With the PROBAST tool, the models are also scored on “validation”. I do not see why development studies are not included.

• We thank the reviewer for their comment. Internal validation is considered a basic procedure in prediction model development studies (in general, but especially for studies with small sample size and/or low EPV). According to TRIPOD, development studies are defined as development of a prediction model without validation in other participant data, but with inclusion of some form of resampling technique (in other words, including internal validation). Development studies without any form of validation are at high risk of overfitting and thus, according to PROBAST, in principle at high risk of bias. To describe studies that might be useful in daily practice, we only included studies in which overfitting was already accounted for through the execution of an internal validation procedure.

With regard to inclusion criterion five: how do prediction models being used at the ED differ from the ones being used at a primary care facility?

• Risk assessment does differ between these settings. At the time of risk assessment, patients already at the ED have a higher a priori probability of hospital admission than patients at home. Therefore, predictive performance of prediction models in these two settings cannot be compared. Consequently, other variables are selected for inclusion in the models. For instance, the APOP screener [11] includes a variable on whether the patient arrived by ambulance. This question obviously cannot be answered when the model is used in a primary care setting.

Textual comments:

Methods section reads cloudy and could be more straightforward. For example: [1] Since these systematic reviews identified the same risk prediction models, we decided to limit publication dates from August 2013 through January 2021, which has some overlap with the searches of these reviews. To give a complete overview, we will also include the studies found in the previous reviews. The references of the identified articles were searched for relevant publications. I would suggest to rephrase to: [1] To provide a complete overview of available prediction models our search was restricted to August 2013 through January 2021. The models described in the previous reviews were also included in this systematic review.

• We thank the reviewer for their suggestion and have changed this section accordingly. (page 4, lines 88-91)

Textual comments:

[2] After extraction of data, the Prediction model Risk of Bias Assessment Tool (PROBAST; see Appendix B) was used to assess risk of bias and applicability of the predictive models. Concern for applicability addresses whether the primary study matches the review question. I would suggest to rephrase to: [2] The Prediction model Risk of Bias Assessment Tool was used to assess risk of bias and applicability, of which the latter addresses whether the primary study matches the review question.

• We have changed the sentence according to the reviewer’s suggestion. (page 5, lines 135-137)

I would suggest to change the structure of the methods section and shorten it. Combine sections search strategy, study selection and data extraction. Make a new subheading with model performance including the explanation about AUC and EPV and lastly discuss the PROBAST tool. The PROBAST tool is explained very extensively. I would recommend to move details to supplements or just refer to the original PROBAST article. Also describe that regression models and machine learning models will be described separately.

• We thank the reviewer for their suggestion. We have shortened the methods section and removed the elaboration on the PROBAST and referred to the original PROBAST article (page 5, lines 135-139). We have also added a sentence about the separate description of regression models and ML models (page 5, lines 128-129).

RESULTS

A question for authors: Did all of the included studies focus on developing only 1 prediction model? Because in these kinds of studies sometimes multiple models are developed/validated and compared to each other. Is 22 studies equivalent to 22 unique prediction models?

• The majority of studies developed one prediction model, but some indeed presented more models. We presented data of all reported models in Tables 1 and 2. In general we counted one model per study, the only exception being the comparison of predictive performance between all-cause hospitalizations and preventable hospitalizations. We have added a sentence in the results section to clarify this. (page 8, lines 191-195) We have amended the use of the words study and model where necessary.

The way we handled the different presentation of models is described in detail below:

• In case multiple validation procedures were performed (i.e. Lyon and Marcusson) we only reported the highest AUC in the description of our results. In Table 1 all AUCs are reported.

• Three studies (i.e. Kan, Reuben, and Wu) used (and combined) multiple data sources to develop multiple models (e.g. survey data and electronic record data). For clarity reasons, we evaluated and discussed the studies as one model, because study characteristics and variables in the final model were identical. For the description of predictive accuracies we only counted the model with the highest AUC. All AUCs are reported in Table 1.

• Three studies (i.e. Mishra, Tarekegn, and Wu) assessed multiple outcomes (e.g. all-cause hospitalization and preventable hospitalization) and presented the AUC per outcome. We evaluated and discussed the studies as one model, because study characteristics, methodology and the variables in the final model were identical. However, for the comparison of the predictive accuracy of preventable vs all-cause admissions, we counted these models separately. In other counts we only included the reported AUC for all-cause admission.

• The study of Kurichi et al. presented two different models because of collinearity between the ADL and IADL variable. Predictive performance of the models was equal (AUC = 0.67). The variables included in the final model were equal as well, with the minor difference that the IADL model had an extra variable (i.e. proxy responded). Because of these minor differences, we assessed these two models as one.
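
The counting rule described in the bullets above (one headline AUC per study, taken as the highest value reported) could be sketched as follows. This is only an illustration: the study labels and AUC values are hypothetical placeholders and are not taken from Table 1, and `headline_auc` is an assumed helper name.

```python
# Hypothetical AUCs per study. A study may report several values because of
# multiple validation procedures or data sources. These are NOT real data.
reported_aucs = {
    "study_A": [0.66, 0.70],
    "study_B": [0.72],
    "study_C": [0.64, 0.69, 0.68],
}


def headline_auc(aucs_by_study):
    """Return one AUC per study: the highest value that study reported."""
    return {study: max(aucs) for study, aucs in aucs_by_study.items()}


headline = headline_auc(reported_aucs)
print(headline)
```

All reported values stay available in the input mapping, mirroring the statement that every AUC remains listed in Table 1 while only the highest one is used in the summary.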

Textual comments

Same as the methods. Results can be pointed out more straightforward. See examples below.

Line 162/168: A flow diagram of the search strategy and selection process is presented in Figure 1, can be removed. Data extracted from the studies can be found in Table 1 and Table 2. Suggest to change it to: The literature searches yielded a total of 16,098 citations (Figure 1.). Tables and figures do not need a notification; referring is the standard.

• We thank the reviewer for their suggestion and have amended accordingly. (page 7, lines 148, 151-152,

Line 164. Additionally, twenty-three articles were identified through other sources. What are these other sources? This was not mentioned in the methods.

• The other sources were reference checking; this is mentioned in the methods section and we have now clarified this in the referred line. (page 7, lines 149-150)

Line 165: In addition to 10 studies included in the previously published systematic reviews, 12 new studies met all inclusion criteria, which makes a total of 22 unique risk prediction models. Rephrase: Full texts were retrieved for 170 studies of which 12 met all inclusion criteria. Additionally, a total of 10 studies were included from the previously published systematic reviews.

• We thank the reviewer for their suggestion and we have rephrased accordingly. (page 7, lines 150-152)

I would suggest to refer to “prediction model” instead of “study” in the results sections. For example line 171: Thirteen studies included participants aged ≥65 years[29, 33, 34, 36-40, 44, 46-49], the remaining studies used a higher age as inclusion criterion with. Rephrase to: Thirteen prediction models included……..

• We thank the reviewer for their suggestion. However, since some studies developed multiple models with the same data source, as we pointed out earlier, we believe referring to studies is more comprehensible. We did check whether the use of ‘models’ and ‘studies’ is consistent and adjusted it where necessary.

When describing results, try to hold on to the structure in the methods. The EPV can be described in the “predictive accuracy” section.

• The EPV is not a measure to assess performance of the model, but is rather used as a sample size criterion to minimize overfitting of a prediction model. Therefore, it is better gathered under the methodological quality section than the predictive accuracy section.
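
To make the distinction concrete, the sketch below computes the events per variable (EPV) as the size of the smaller outcome group divided by the number of candidate predictor parameters. It is a minimal illustration under stated assumptions: the function name `events_per_variable` and the cohort numbers are hypothetical and do not come from any included study.

```python
def events_per_variable(n_events, n_nonevents, n_candidate_predictors):
    """EPV: smaller outcome group divided by the number of candidate
    predictor parameters considered during model development."""
    if n_candidate_predictors <= 0:
        raise ValueError("need at least one candidate predictor")
    return min(n_events, n_nonevents) / n_candidate_predictors


# Hypothetical cohort: 300 admissions among 2,000 participants,
# with 25 candidate predictor parameters.
epv = events_per_variable(n_events=300, n_nonevents=1700,
                          n_candidate_predictors=25)
print(epv)  # 12.0
```

The calculation uses only sample size and the number of candidate predictors, not any model output, which is why it describes the risk of overfitting rather than predictive accuracy.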

Avoid using question marks in tables and figures. I would suggest to use NA or a color scheme, for example, high risk= red, low risk= green, unclear= purple.

• We thank the reviewer for their suggestion. However, we presented the methodological quality assessment according to the tabular presentation suggested by the PROBAST study group, which includes the use of question marks. (see Table 12, Moons et al. [12])

DISCUSSION

I do not understand why the difference between machine learning and logistic regression models is discussed prominently in this article. In order to say whether one technique is superior to the other, you should validate both models in the identical population. In line 317 the authors state the machine learning techniques are not superior and in line 328 the authors state that a fair comparison is not possible. Please be consistent in conclusions.

• We thank the reviewer for their comment. We agree that the discussion about the differences between machine learning and logistic regression models was presented quite superficially, whereas it is a controversial and complicated topic. We have decided to omit most of this discussion and put more focus on other results of this review.

The discussion includes a lot of repetition of the results. The conclusion is solely based on the development of more prediction models. In the introduction the authors describe that the eventual goal is to develop a care management program to avoid these admissions. The authors should highlight how they could use these models to develop such a program. For example: are the variables in these models standard measurements in a primary care facility?

If the conclusion is only developing more models, the authors should describe how to accomplish this. Which data source to use, which variables should be included, which modeling technique etc.

• We thank the reviewer for their suggestion. We have thoroughly amended the discussion in light of the comments of the reviewers. We presented suggestions for further improvement of predictive performance and methodological quality of prediction model studies (page 19, lines 295-298; page 20-21, lines 330-352). Furthermore, we reported clinical implications of the finding that models for preventable admissions tended to have better predictive performance. (page 21-22, line 354-371)

Textual comments:

Line 293: Twelve risk models were added to the existing evidence. This does not read well. Suggest to remove this sentence.

• We thank the reviewer for their comment and have removed the sentence.

Line 293-295: The recommendation of using nonmedical factors is never mentioned in the manuscript. This conclusion comes a bit out of the blue. I would recommend to make a more general conclusion, on the results that the authors did find (e.g. quality of models, performance of models etc.).

• We thank the reviewer for their comment. As mentioned before, we built on the systematic review performed by Wallace et al. One of the recommendations in this review was to consider nonmedical factors for improvement of predictive accuracy. We have made this conclusion clearer in the manuscript. (page 19, lines 277 and further)

References:

1. Tsang G, Zhou SM, Xie X. Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records. IEEE J Transl Eng Health Med. 2021;9:3000113. PubMed PMID: rayyan-128526376.

2. Maust DT, Kim HM, Chiang C, Langa KM, Kales HC. Predicting Risk of Potentially Preventable Hospitalization in Older Adults with Dementia. Journal of the American Geriatrics Society. 2019;67(10):2077-84. PubMed PMID: rayyan-128520734.

3. Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Annals of internal medicine. 2015;162(1):W1-73. Epub 2015/01/07. doi: 10.7326/m14-0698. PubMed PMID: 25560730.

4. Debray TPA, Damen JAAG, Snell KIE, Ensor J, Hooft L, Reitsma JB, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460. doi: 10.1136/bmj.i6460.

5. Group CPM. Cochrane Systematic Review of Prognosis Studies [cited 2022 April 23]. Available from: https://methods.cochrane.org/prognosis/tools.

6. Tian W. An All-Payer View of Hospital Discharge to Postacute Care, 2013: Statistical Brief #205. Healthcare Cost and Utilization Project (HCUP) Statistical Briefs. Rockville (MD): Agency for Healthcare Research and Quality (US); 2006.

7. Kowal P, Dowd J. Definition of an older person. Proposed working definition of an older person in Africa for the MDS Project; 2001.

8. Andaur Navarro CL, Damen J, Takada T, Nijman SWJ, Dhiman P, Ma J, et al. Protocol for a systematic review on the methodological and reporting quality of prediction model studies using machine learning techniques. BMJ open. 2020;10(11):e038832. Epub 2020/11/13. doi: 10.1136/bmjopen-2020-038832. PubMed PMID: 33177137; PubMed Central PMCID: PMCPMC7661369.

9. Andaur Navarro CL, Damen JAA, Takada T, Nijman SWJ, Dhiman P, Ma J, et al. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ. 2021;375:n2281. doi: 10.1136/bmj.n2281.

10. Veyron JH, Friocourt P, Jeanjean O, Luquel L, Bonifas N, Denis F, et al. Home care aides' observations and machine learning algorithms for the prediction of visits to emergency departments by older community-dwelling individuals receiving home care assistance: A proof of concept study. PloS one. 2019;14(8):e0220002. PubMed PMID: rayyan-128526748.

11. de Gelder J, Lucke JA, de Groot B, Fogteloo AJ, Anten S, Mesri K, et al. Predicting adverse health outcomes in older emergency department patients: the APOP study. Neth J Med. 2016;74(8):342-52. Epub 2016/10/21. PubMed PMID: 27762216.

12. Moons KGM, Wolff RF, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A Tool to Assess Risk of Bias and Applicability of Prediction Model Studies: Explanation and Elaboration. Annals of internal medicine. 2019;170(1):W1-w33. Epub 2019/01/01. doi: 10.7326/m18-1377. PubMed PMID: 30596876.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Dong Keon Yon

18 Jul 2022

PONE-D-21-38410R1

Prediction models for the prediction of unplanned hospital admissions in community-dwelling older adults: a systematic review.

PLOS ONE

Dear Dr. Klunder,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 01 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Dong Keon Yon, MD, FACAAI

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

Please address excellent comments of the reviewers.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: (No Response)

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #3: I would like to thank the authors for the revised manuscript. The manuscript has significantly improved and is well structured. The introduction contains all relevant information and emphasizes the relevance of this manuscript. However, I still have minor (predominantly textual) comments that can further improve this paper.

Line 193-196

Very long and hard to follow sentence. Suggest to rephrase.

Line 216

“additionally one study assessed fall with hospitalizations as outcome”.

As mentioned in the exclusion criteria, models developed for specific disease groups were excluded. I would suggest to remove this sentence in order to avoid confusion as the model also looked at ED visit and hospital admission.

Line 276-277

Replace “and” by “an”

line 281-284

suggest to rephrase sentence.

Line 338-339

Instead of “narrowing” I would suggest the term “focusing”.

Discussion

Line 343-344

Suggest to remove the result of AUC>0.8 for fall related hospital admission as suggested earlier. This model performance is namely for fall related hospitalizations.

The start of your discussion includes a lot of comparison with Wallace et al. I would suggest to narrow this part and only highlight the most important difference with an explanation. For example, in your results and discussion you mention that the predictive accuracy of the current models has significantly improved compared to the models in Wallace et al. Is there any explanation to this?

You also mention the further implications for future research. Should we indeed develop more models? And what about the nonmedical factors mentioned by Wallace et al.? Could the authors maybe elaborate more on this topic in their discussion?

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Sep 23;17(9):e0275116. doi: 10.1371/journal.pone.0275116.r004

Author response to Decision Letter 1


1 Sep 2022

Reviewer #1: (No Response)

Reviewer #3: I would like to thank the authors for the revised manuscript. The manuscript has significantly improved and is well structured. The introduction contains all relevant information and emphasizes the relevance of this manuscript. However, I still have minor (predominantly textual) comments that can further improve this paper.

Line 193-196

Very long and hard to follow sentence. Suggest to rephrase.

• We assume the reviewer refers to the following sentence: “For calculations of the median predictive performance we only included the AUC for the outcome all-cause admission, except for the calculation of the median predictive performance per outcome (i.e. preventable and all-cause hospitalization), then AUCs for both outcomes were included.”

We agree this sentence causes confusion and have rephrased it:

“The models developed for a specific type of hospitalization (i.e. preventable hospitalization or fall with hospitalization) (n=3), tended to perform better than the models for all-cause hospitalization (n=17), with a median AUC of 0.78 (range 0.74-0.78) versus 0.69 (range 0.61 - 0.76), respectively. The two models that assessed AUCs for both outcomes (i.e. Tarekegn et al. and Wu et al.[37, 39]) were included in calculations of both medians with its corresponding AUC and were thus counted twice.”
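
The double counting described in the rephrased sentence can be illustrated with a short sketch: a model that reports an AUC for both outcomes contributes one value to each median. The rows below are hypothetical placeholders rather than values from Table 1, and `median_auc_by_outcome` is an assumed helper name.

```python
from statistics import median

# Hypothetical (model, outcome, AUC) rows. These are NOT the review's data.
rows = [
    ("model_A", "all-cause", 0.65),
    ("model_B", "all-cause", 0.69),
    ("model_C", "all-cause", 0.72),
    ("model_C", "preventable", 0.78),  # same model, second outcome
    ("model_D", "preventable", 0.74),
    ("model_E", "preventable", 0.76),
]


def median_auc_by_outcome(rows):
    """Group AUCs by outcome and return the median AUC for each outcome."""
    grouped = {}
    for _model, outcome, auc in rows:
        grouped.setdefault(outcome, []).append(auc)
    return {outcome: median(aucs) for outcome, aucs in grouped.items()}


medians = median_auc_by_outcome(rows)
print(medians)
```

Here `model_C` appears in both groups, so the two group sizes add up to more than the number of distinct models, which is the situation the sentence describes as being "counted twice".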

Line 216

“additionally one study assessed fall with hospitalizations as outcome”.

As mentioned in the exclusion criteria, models developed for specific disease groups were excluded. I would suggest to remove this sentence in order to avoid confusion as the model also looked at ED visit and hospital admission.

• Thank you for your comment. We have removed any reference to the fall-related hospital admissions outcome of this model from the rest of the manuscript, as it is indeed an exclusion criterion mentioned in the methods section.

Line 276-277

Replace “and” by “an”

• Thank you for noticing the typo.

line 281-284

suggest to rephrase sentence.

• We have rephrased the sentence accordingly:

“Concern for applicability was high in three studies, because the study population or study outcome did not fully match the review question: one study only included older adults with a sensory impairment[29], one study excluded older adults with a hospital admission <6 months prior to the index date[45], and one study evaluated preventable hospital admissions as only outcome[46].”

Line 338-339

Instead of “narrowing” I would suggest the term “focusing”.

• Thank you, we have followed your suggestion.

Discussion

Line 343-344

Suggest to remove the result of AUC>0.8 for fall related hospital admission as suggested earlier. This model performance is namely for fall related hospitalizations.

• We have removed this sentence in the discussion.

The start of your discussion includes a lot of comparison with Wallace et al. I would suggest to narrow this part and only highlight the most important difference with an explanation. For example, in your results and discussion you mention that the predictive accuracy of the current models has significantly improved compared to the models in Wallace et al. Is there any explanation to this?

• Thank you for your suggestion. We have narrowed this paragraph and included a possible explanation to the increased predictive accuracy in the newer models.

“The new models had higher predictive accuracy than the older models. This might be explained by the fact that new models had larger sample sizes of the development cohort and also higher EPVs. Both are recommended by the TRIPOD, published in 2015 [16], to improve predictive accuracy and methodological quality. Moreover, the new models used administrative or clinical record data more often for the development of their model. Consistent with Wallace et al., we found that models developed using administrative or clinical record data had higher predictive accuracy than those developed using self-report data. Of the 10 new prediction models, eight used administrative data for development of their model.”

You also mention the further implications for future research. Should we indeed develop more models? And what about the nonmedical factors mentioned by Wallace et al.? Could the authors maybe elaborate more on this topic in their discussion?

• Thank you. We have followed your suggestions and have rephrased the implications for future research paragraph. We put more emphasis on external validation and updating prediction models instead of developing new prediction models. Regarding updating of the models, we suggested the addition of nonmedical factors may contribute to an improved accuracy of the model. (Manuscript, pages 19-20, lines 320-344)

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 2

Dong Keon Yon

8 Sep 2022

PONE-D-21-38410R2

Prediction models for the prediction of unplanned hospital admissions in community-dwelling older adults: a systematic review.

PLOS ONE

Dear Dr. Klunder,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 23 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Dong Keon Yon, MD, FACAAI

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

This is an excellent paper. Finally, please replace reference number 19 (PRISMA guideline 2009) with the following recent paper (PRISMA guideline 2020).

DOI: https://doi.org/10.54724/lc.2022.e9

Congratulations!

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #3: I would like to thank the authors for their revised manuscript. All comments are addressed. Readability and structure have improved, making the manuscript ready for publication.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Sep 23;17(9):e0275116. doi: 10.1371/journal.pone.0275116.r006

Author response to Decision Letter 2


8 Sep 2022

Thank you for the quick and thorough review and the opportunity to publish in PLOS One. We have replaced reference number 19 with the suggested recent paper. I have uploaded the revised manuscript file. I did not include a Manuscript with Track Changes file, since MS Word does not track any changes in the reference list. If you need any additional information/files, please let me know.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 3

Dong Keon Yon

11 Sep 2022

Prediction models for the prediction of unplanned hospital admissions in community-dwelling older adults: a systematic review.

PONE-D-21-38410R3

Dear Dr. Klunder,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Dong Keon Yon, MD, FACAAI

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

This is an excellent and mesmerizing paper.

Reviewers' comments:

Acceptance letter

Dong Keon Yon

15 Sep 2022

PONE-D-21-38410R3

Prediction models for the prediction of unplanned hospital admissions in community-dwelling older adults: a systematic review.

Dear Dr. Klunder:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Dong Keon Yon

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Checklist. PRISMA checklist.

    (PDF)

    S1 File. Full search strategies.

    (PDF)

    S2 File. CHARMS and PROBAST forms.

    (PDF)

    Attachment

    Submitted filename: Peer review.docx

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    All relevant data are within the paper and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
