Journal of the American Medical Informatics Association (JAMIA)
2022 Jul 11;29(9):1497–1507. doi: 10.1093/jamia/ocac111

Falls prediction using the nursing home minimum dataset

Richard D Boyce 1, Olga V Kravchenko 2, Subashan Perera 3, Jordan F Karp 4, Sandra L Kane-Gill 5, Charles F Reynolds 6, Steven M Albert 7, Steven M Handler 8,9
PMCID: PMC9382393  PMID: 35818288

Abstract

Objective

The purpose of the study was to develop and validate a model to predict the risk of experiencing a fall for nursing home residents utilizing data that are electronically available at the more than 15 000 facilities in the United States.

Materials and Methods

The fall prediction model was built and tested using 2 extracts of data (2011 through 2013 and 2016 through 2018) from the Long-term Care Minimum Dataset (MDS) combined with drug data from 5 skilled nursing facilities. The model was created using a hybrid Classification and Regression Tree (CART)-logistic approach.

Results

The combined dataset consisted of 3985 residents with mean age of 77 years and 64% female. The model’s area under the ROC curve was 0.668 (95% confidence interval: 0.643–0.693) on the validation subsample of the merged data.

Discussion

Inspection of the model showed that antidepressant medications have a significant protective association when the resident has a fall history prior to admission, requires assistance to balance while walking, and has some functional range of motion impairment in the lower body; the association persists even if the resident exhibits behavioral issues, unstable behaviors, and/or is exposed to multiple psychotropic drugs.

Conclusion

The novel hybrid CART-logit algorithm is an advance over the 22 fall risk assessment tools previously evaluated in the nursing home setting because it has better performance characteristics for the fall prediction window of ≤90 days and it is the only model designed to use features that are easily obtainable at nearly every facility in the United States.

Keywords: falls, skilled nursing facilities, long-term care minimum dataset, fall prevention intervention

INTRODUCTION

Falls are among the most common and dangerous adverse events that occur in nursing homes. The Centers for Medicare and Medicaid Services (CMS) reported that the yearly rate of noninjurious falls is 11% and the rate of falls resulting in injury is 5.3%.1 A 2014 report from the Office of Inspector General examined the incidence of adverse events occurring in skilled nursing facilities and found that fall-related injuries are often preventable.2 The incidence during resident stays of fall events resulting in some level of harm was 8% due to resident care issues and 9% due to medication therapy issues.2 This report led CMS in 2015 to add the reduction of preventable falls as one of the primary goals for the effective Quality Assurance and Performance Improvement (QAPI) program.3 Unfortunately, although more than 6 years have elapsed, data from the current National Healthcare Quality and Disparities Report show that the benchmark goal has not been achieved for the “Long-stay nursing home patients experiencing one or more falls with major injury” quality measure.4

Many fall prevention interventions have been designed over the past 2 decades. A systematic review and meta-analysis by Gulka et al,5 published in 2020, examined 35 randomized controlled trials of fall interventions of at least 6-month duration conducted between 2013 and 2019. The review found that the most effective interventions were multifactorial, with an average risk ratio of 0.65 (95% confidence interval [CI]: 0.45–0.94). A multifactorial fall-prevention intervention involves selecting from multiple possible interventions, such as exercise or medication review, a combination that is matched to a patient’s risk-of-falls profile.

The use of a fall-prevention intervention in the nursing home depends on the accurate identification of residents who are at risk for a fall. A review conducted by Nunan et al6 and published in 2017 found that 16 fall risk assessment tools had been studied for fall prediction performance in the nursing home setting between January 1980 and October 2015. A separate review by Park et al identified another 3 fall risk assessment tools evaluated in the nursing home setting in the early 2000s.7 Our own search of PubMed found 3 additional tools also evaluated in the nursing home setting.8–10

Of the 22 fall risk assessment tools evaluated in the nursing home setting, only 5 were evaluated for fall prediction within ≤90 days of administering the tools.8,9,11–13 Of the 5, only one was designed to use electronic data that are routinely collected in the nursing home.9 The model, called FINDER, was developed using nursing home data from patients in the Netherlands and was shown to predict falls 5 days ahead with an area under the ROC curve (AUROC) of 0.603 (95% CI: 0.565–0.641) with a sensitivity of 83.41% (95% CI: 79.44–86.76%) and a specificity of 27.25% (95% CI: 23.11–31.81%).

Objective

Effective monitoring requires regularly scheduled assessments of nursing home patients to identify who is at sufficient fall risk to require intervention. Moreover, the expanded collection of electronic health data makes automated fall risk monitoring feasible, with the potential to reduce the time required of clinical staff while improving the identification of residents in need of fall prevention interventions. Thus, we sought to develop and validate a novel falls prediction model that predicts falls ≤90 days in the future using the Long-Term Care Minimum Dataset14 and drug therapy data (dispensing and administration), data that are readily available in electronic form to every nursing home in the United States.

METHODS

Overview

The data for this study come from the Minimum Data Set (MDS) 3.0 for Nursing Homes and Swing Bed providers, which is a standardized assessment required by regulation to be completed for all residents admitted to Medicare-certified nursing homes in the United States.14 The model was built and tested using 2 time-separated extracts of MDS data, both from the University of Pittsburgh Medical Center Senior Communities nursing homes. The first dataset consisted of all residents from 2011 through 2013, while the second dataset consisted of all residents from 2016 through 2018. The gap between the datasets was due to changes in funding for the project and in the systems used to extract MDS data for research. The first dataset was extracted from a medical records archive, while the second dataset was extracted directly by querying the systems used in the care system. The medical records archive stored drug dispensing and MDS data from the nursing homes (ie, data from 2011 through 2013). We were able to obtain drug administration and MDS data from the systems used in the care system (ie, data from 2016 through 2018). All data were deidentified to HIPAA limited dataset requirements prior to analysis. The University of Pittsburgh Institutional Review Board approved the study.

Data preparation for analysis

Our methods for MDS data extraction, transformation, and assuring high quality have been described elsewhere.15 Extraction from the source system was necessary because our institution is not the same as the health system, and so all data are filtered through a managed request process. The data were then transformed to the common data model provided by the Observational Medical Outcomes Partnership (OMOP), now maintained by the Observational Health Data Sciences and Informatics (OHDSI) community.16 The transformation was done to leverage useful data quality tools provided by the OHDSI community. As it is a recommended data management practice to confirm that transformed data are complete, correct, and free of nonpredictive variability, the transformed data were then subjected to a number of data characterization procedures and concordance tests. These included identifying errors using the Achilles data characterization tool17 and comparing publicly available rates of specific quality measures from the CMS Nursing Home Compare program with quality measures we generated from the new database to check concordance.

As described in our article about the dataset preparation,15 we addressed problems identified in the database that would potentially affect the fall prediction algorithm development. These included, among other steps,15 dropping data for patients with missing demographic data or with drug dates outside of the expected period that the patient was in the nursing home (based on billing census data). We also dropped data from certain sites over quarterly time periods where we could not establish concordance with Nursing Home Compare quality measures using the methods described in our prior work.15 Briefly, using the data extracted for research, we generated 7 quality measures used by Nursing Home Compare (CMS IDs: N011.01, N031.02, N030.01, N013.01, N014.02, N001.01, N024.01). Each quality measure was generated for each site over every quarter of data we had available from the site. We then used goodness-of-fit tests to compare each quality measure, for all quarters of data available for each site, with data publicly available through Nursing Home Compare. Data from each site were used for model development only if there was no statistically significant difference in the quality measures between the extracted data and publicly available data.
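The following is a minimal sketch of how such a per-site, per-quarter concordance screen could be implemented; the column names and the specific chi-square goodness-of-fit test are our assumptions, since the text above states only that goodness-of-fit tests compared locally derived quality measures with publicly reported Nursing Home Compare values.

```python
# Hypothetical concordance screen: flag site-quarters whose locally computed
# quality-measure rate is statistically consistent with the public rate.
import pandas as pd
from scipy.stats import chisquare

def concordant_quarters(local: pd.DataFrame, public: pd.DataFrame, alpha: float = 0.05) -> pd.DataFrame:
    """`local` columns (assumed): site, quarter, measure_id, numerator, denominator.
    `public` columns (assumed): site, quarter, measure_id, reported_rate (0-1)."""
    merged = local.merge(public, on=["site", "quarter", "measure_id"])
    records = []
    for _, row in merged.iterrows():
        num, den, p = row["numerator"], row["denominator"], row["reported_rate"]
        observed = [num, den - num]
        # Expected counts if the extracted data reproduced the publicly reported rate.
        expected = [den * p, den * (1 - p)]
        stat, pval = chisquare(f_obs=observed, f_exp=expected)
        rec = row.to_dict()
        rec.update({"p_value": pval, "concordant": pval >= alpha})
        records.append(rec)
    return pd.DataFrame(records)

# Site-quarters with any non-concordant measure would be excluded from model training.
```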

Because MDS reports taken upon a resident’s entry to the facility contained only demographic data, we only used MDS reports that included the full set of variables (ie, Federal Omnibus Budget Reconciliation Act [OBRA] Admission, Quarterly, Annual, or Significant Change; prospective payment system [PPS] 5 days, 14 days, 30 days, 60 days, 90 days, OMRA, significant or clinical change, or significant correction assessment).

Outcome measure

We chose J1800 (fall since admission or the prior MDS assessment) as the outcome measure because we had reported previously that a comparison of falls reported by MDS J1800 with those reported in the health system’s risk management system found that J1800 was more complete.15 Moreover, a recent comparison of MDS and Medicare Claims data by Sanghavi et al18 found that J1800 captures a greater percentage of injurious falls than J1900C (falls with major injury since admission or the prior MDS assessment). The study found that J1800 captured 67.8% (white patients) and 62.6% (non-white patients) of major injury falls for CMS short-stay residents (ie, ≤100 days with no gap of residence >30 days), and 82.8% (white) and 76.1% (non-white) of major injury falls for CMS long-stay residents.

Statistical analysis

The machine learning approach was a hybrid Classification and Regression Tree (CART)-logistic model, which harnesses the strengths of both CART and traditional logistic regression techniques.19 Briefly, CART is a computation-intensive recursive partitioning method that searches through every possible split of each predictor to find the one that would result in the maximal separation of fallers and nonfallers, yielding 2 nodes from the sample. Then, within each node, the process is repeated until fallers and nonfallers are completely separated and a fully grown tree structure is obtained. To guard against overfitting, the tree is then pruned using the misclassification-complexity criterion. For the purpose of fall prediction, the strengths of CART models are that they are generally easily understood by clinical experts; they enable the use of surrogate splits in case of missing data rather than the complete exclusion from analysis of observations with at least 1 missing variable; they handle multicollinearity, higher-order interaction terms, and nonlinear associations without subjective decision making by the analyst; and they produce results in the form of a simple flowchart that can be readily used in a clinical setting.

One disadvantage of tree models is that the method is locally optimal: once a split is made, perhaps an unreasonable one simply due to random chance, there is no mechanism to alter it later, resulting in some instability of the model. Another disadvantage of tree models is that trees tend to become overly complex when the association of a variable with the predicted class is a gradient. Therefore, to reap the advantages of both modeling modalities, we employed a hybrid CART approach, in which the terminal nodes from the final tree (as a categorical variable) and any gradient-associated variables were included in a logistic regression model.

We first trained a CART decision tree20 to predict the risk of a patient falling before the next MDS report is completed by nursing home staff (Figure 1). Specifically, given characteristics and information on health status, fall history, chronic conditions, and medications available in prior MDS assessments and merged medication data, the algorithm predicted the answer to MDS question J1800 on the subsequent MDS.
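As a rough illustration of the hybrid CART-logistic idea (not the authors' exact implementation), the sketch below fits a pruned decision tree, converts its terminal nodes into a categorical variable, and then fits a logistic regression on those node indicators plus gradient-associated variables; the feature grouping and the pruning parameter are assumptions.

```python
# Minimal hybrid CART-logit sketch using scikit-learn.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

def fit_hybrid_cart_logit(X_tree, X_gradient, y, ccp_alpha=0.001):
    """X_tree: 2D array of threshold-style predictors; X_gradient: 2D array of
    gradient-associated variables (eg, age, number of psychotropics); y: fall outcome."""
    # Stage 1: CART with cost-complexity pruning to guard against overfitting.
    tree = DecisionTreeClassifier(ccp_alpha=ccp_alpha, random_state=0).fit(X_tree, y)

    # Each assessment is assigned to the terminal (leaf) node it falls into;
    # the leaf identifier is treated as a categorical variable.
    leaves = tree.apply(X_tree).reshape(-1, 1)
    encoder = OneHotEncoder(handle_unknown="ignore").fit(leaves)

    # Stage 2: logistic regression on leaf indicators plus gradient variables.
    design = np.hstack([encoder.transform(leaves).toarray(), X_gradient])
    logit = LogisticRegression(max_iter=1000).fit(design, y)
    return tree, encoder, logit

def predict_risk(model, X_tree, X_gradient):
    tree, encoder, logit = model
    leaves = tree.apply(X_tree).reshape(-1, 1)
    design = np.hstack([encoder.transform(leaves).toarray(), X_gradient])
    return logit.predict_proba(design)[:, 1]
```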

Figure 1.

Example timelines for the MDS data used for developing the fall risk prediction model. The model predicts a fall between the time of each assessment and the next assessment. Operationally, a true positive prediction would involve a high-risk prediction from 1 MDS assessment shown in the timeline where question J1800 (fall since the prior MDS assessment or admission) is true in the subsequent MDS report. As is shown in the figure, this prediction time window can vary from a few days (eg, timeline C) to 90 days (eg, OBRA Quarterly to OBRA Quarterly or OBRA Annual). The variety shown in the timelines reflects the variety of the data used to train the model. Bolded assessment labels indicate the first and last MDS assessment that the model was trained on.

The CART-logistic algorithm development workflow is illustrated in Figure 2. For the model training set, we used a random sample of data for 70% of the residents from the 2011 to 2013 dataset. Continuous variables that appeared to have a gradient association (rather than one based on thresholds) with the fall outcome were excluded from the CART tree and included in a separate logistic regression model along with the terminal nodes from each path of the CART decision tree. We used the area under the receiver operating characteristic curve (AUROC) to evaluate predictive accuracy.
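A minimal sketch of a resident-level 70/30 split and AUROC evaluation of the kind described above follows; the resident identifier column name is an assumption.

```python
# Assign whole residents (not individual assessments) to training or validation,
# then summarize predictive accuracy on the validation residents with AUROC.
import numpy as np
from sklearn.metrics import roc_auc_score

def split_by_resident(df, train_frac=0.7, seed=0):
    """df is a pandas DataFrame with one row per MDS assessment and a 'resident_id' column."""
    rng = np.random.default_rng(seed)
    residents = df["resident_id"].unique()
    rng.shuffle(residents)
    train_ids = set(residents[: int(train_frac * len(residents))])
    is_train = df["resident_id"].isin(train_ids)
    return df[is_train], df[~is_train]

# After fitting on the training residents:
# auroc = roc_auc_score(y_validation, predicted_risk)
```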

Figure 2.

The general outline of the methods used to build a hybrid CART-logistic model for predicting the individual risk of fall before the next Minimum Dataset (MDS) survey using MDS and drug dispensing or administration data. Data from 2011 through 2013 were used to design the hybrid CART-logistic model. The model was revised based on the error analysis, and new coefficients were trained and tested on the data from 2016 through 2018. A combined dataset was created that dropped records prior to March 31, 2012 to address the change in MDS drug administration to questions N0410A-G and N0410Z that happened April 1, 2012 (see Methods section). This combined dataset was split into a 70% training set to arrive at the final coefficients, which were tested on the 30% test set to arrive at the final performance characteristics.

During this phase, several CART models were created using the development data, which led to the identification of gradient-associated variables by the presence of repetitious decision nodes in different branches of the CART decision tree. For example, one early version repeated decision nodes for the number of psychotropics a patient was taking at 2 different branches of the decision tree, suggesting that weighting the variable as a covariate in a logistic model might be a better approach. Much thought was also put into avoiding overfitting at this phase, and complex CART trees with very high levels of predictive accuracy on the development data were avoided in favor of simpler models, even if they had lower accuracy. Performance of the model on the test set was assessed only after arriving at a model structure that was simple, had high face validity, and had acceptable predictive performance.

Error analysis of the initial model and revision

In the next phase of the project, an error analysis was performed using electronic medical records. A student pharmacist reviewed the charts of 3 groups of patients representing 3 different paths through the decision tree component of the model. All patients selected for chart review experienced a fall according to patient charts, regardless of whether the model predicted it using MDS and drug data. The pharmacist was blinded to which path each patient was placed in while conducting the chart review. The pharmacist abstracted the data to determine the primary reason for the fall. We then analyzed whether the reasons varied depending on which risk pathway the patients belonged to and whether there were any possible improvements that could be made to the model.

Informed by the results of the error analysis, the structure of the model was modified and applied to the second, newer, dataset. We used a simple approach in which the structure of the model was retained but the coefficients were reweighted using a 70% random sample from the newer dataset. The reweighted model was then tested on the remaining 30% sample to determine whether it had similar performance on the newer dataset. Upon observing this, we combined the older and newer datasets and split the data into 70% and 30% samples. We used the Welch 2-sample t-test to compare age and the χ2 test to compare categorical variables between the 2 data samples. Differences were considered statistically significant when P < 0.05. We then performed one more round of reweighting on a 70% data sample, producing the final set of coefficients for the model, which we tested on the remaining 30% data sample to obtain the final performance characteristics.
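The sample-comparison tests can be illustrated with a short sketch: Welch's 2-sample t-test for age and the chi-square test for categorical variables. The arrays below are placeholders rather than study data.

```python
from scipy.stats import ttest_ind, chi2_contingency

train_age = [81, 76, 69, 90, 74]   # placeholder ages, 70% training sample
test_age = [78, 83, 72, 88]        # placeholder ages, 30% test sample
t_stat, p_age = ttest_ind(train_age, test_age, equal_var=False)  # Welch's t-test

# 2x2 contingency table: fallers vs nonfallers by sample (placeholder counts).
table = [[653, 2137],   # training sample: fallers, nonfallers
         [263, 932]]    # test sample: fallers, nonfallers
chi2, p_fall, dof, expected = chi2_contingency(table)

# A difference is reported as statistically significant when p < 0.05.
print(f"age: p={p_age:.2f}; fall history: p={p_fall:.2f}")
```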

RESULTS

The exact dates of coverage of the final datasets after running data quality checks and validating that we could reproduce Nursing Home Compare measures were as follows:

  • Older dataset used to train the model structure (Supplementary Table S1): all 5 facilities, March 31, 2011–December 31, 2013. This dataset used drug dispensing data to fill in 7-day MDS drug exposure data that were missing prior to April 1, 2012. This was necessary because the extract we received was missing the fields N0400A-G and N0400Z, which were used in the MDS until the start of Q2 2012, when they were switched to N0410A-G and N0410Z (present in the dataset).

  • Newer dataset used for testing model revisions (Supplementary Table S2): 3 of the 5 facilities, with one covering September 1, 2016–December 31, 2018, another January 1, 2017–June 30, 2018, and the third January 1, 2017–September 30, 2018. Two of the sites were entirely excluded because we could not reproduce the Nursing Home Compare Short Stay Pain measure across several quarters of data. Similar issues led to dropping Quarter 3 2016 from 2 of the included sites, Quarters 3 and 4 2018 from one of the included sites, and Quarter 4 2018 from another included site.

  • Combined dataset used for the final model coefficients: removed data prior to March 31, 2012 to address the change in MDS drug administration to questions N0410A-G and N0410Z that happened April 1, 2012 (mentioned above).

The combined dataset consisted of 3985 residents with a mean age of 77 years and 64% female. Of the 3985 residents, 934 (23.4%) qualified as long-stay residents according to the CMS definition (one or more stays spanning a period of >100 days with gaps of ≤30 days). The median period between MDS reports used to predict falls and MDS reports used to determine if a fall occurred (using survey question J1800) was 18 days. The dataset had more than 110 predictor variables used for analysis, representing demographics, location from which a resident entered the nursing home (eg, hospital or home), chronic conditions, and medications (see Supplementary Table S3). Table 1 provides descriptive statistics and comparisons of a selection of variables for the 70% training and 30% validation samples from the combined dataset.

Table 1.

Split data from the final dataset used to test the model

Combined 2012, 2013 and 2016, 2017, and 2018 data
Variable Training Test P-value
Total number of patients 2790 1195
% of patients with fall records since prior assessment 23.4 22.0 0.52
% of patients with a fall with major injury 1.1 2.0 0.07
Avg. age of patients 77 77 0.55
% Female patients 64.3 63.9 0.95
% White 80.8 79.4 0.76
% African American 16 17.5 0.34
% Patients with Parkinson disease 4.5 5.6 0.17
% Patients with non-Alzheimer dementia 23.6 21.7 0.32
% Patients with Alzheimer disease 11.5 10.0 0.25
% Patients with impaired mobility 80.0 79.3 0.83
% Patients receiving an antipsychotic within 7 days prior to an MDS report 16.1 15.2 0.61
% Patients dispensed/administered ≥3 CNS affecting drugs 16.2 15.8 0.81
% Patients dispensed/administered ≥3 psychotropic drugs 9.4 9.7 0.82

Note: The dataset combined data from 2012 through 2013 with data from 2016 through 2018. The data were from the same setting, but the earlier portion had drug dispensing data while the later portion had drug administration data. Note that only data from the earlier dataset were used to design the structure of the hybrid CART-logistic model.

In the original model (Figure 3), the regression equation adds age and the number of distinct psychotropic drugs that a patient is exposed to at the time of the MDS assessment taken from either drug dispensing or administration data. The error analysis included 16 fallers and found that the original model did not accurately identify behavioral instability using the presence of any behaviors in the current MDS as indicated by questions “Behavioral Symptom—Presence & Frequency” (E0200A, E0200B, E0200C), “Rejection of Care—Presence & Frequency” (E0800), or “Wandering—Presence & Frequency” (E0900), or severe impairment of cognitive skills for daily decision making according to C1000. To address this issue, the final model (Figure 4) employs a count variable that indicates the number of sequential MDS records prior to and including the current one where there has been a change in the patient’s behavioral symptoms according to E0200A, E0200B, E0200C, E0800, E0900. It also restructures the tree so that the node representing the combination of a prior history of falls, some level of functional impairment in the lower body, and behavioral symptoms towards others can co-occur with one of the remaining terminal nodes.
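A possible operationalization of the behavior-change count variable is sketched below; the exact definition of a "change" (any difference in E0200A-C, E0800, or E0900 relative to the previous assessment, counted over the consecutive run of assessments ending at the current one) is our assumption based on the description above.

```python
# Count consecutive prior-and-current MDS assessments showing a behavioral-symptom change.
BEHAVIOR_ITEMS = ["E0200A", "E0200B", "E0200C", "E0800", "E0900"]

def behavior_change_count(assessments):
    """`assessments` is a chronologically ordered list of dicts of MDS behavior item values."""
    changes = []
    for prev, curr in zip(assessments, assessments[1:]):
        changed = any(prev.get(item) != curr.get(item) for item in BEHAVIOR_ITEMS)
        changes.append(changed)
    # Length of the run of consecutive changes ending at the current assessment.
    count = 0
    for changed in reversed(changes):
        if not changed:
            break
        count += 1
    return count
```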

Figure 3.

The original CART-logistic algorithm prior to error analysis. The labels are simplified for readability; the actual MDS variables are shown in Supplementary Table S3 using italics. To use either algorithm, the CART tree portion of the algorithm is applied to an MDS assessment (Federal Omnibus Budget Reconciliation Act [OBRA] Quarterly or Annual, or prospective payment system [PPS] 5, 14, 30, 60, or 90 days). The decision nodes from the tree are used to assign each resident to a terminal node. The resulting terminal nodes are then used in the logistic regression as a categorical variable, multiplied by the corresponding weight coefficient.

Figure 4.

The final CART-logistic algorithm after making changes based on an error analysis. Ti represents a categorical value for the end of a particular path through the tree. Wi represents the logit weight assigned to Ti. Wj represents the logit weight assigned to categorical value Tj and is only used if T3 is 1 at the same time as one of T4–T7. T7 has a weight of 0.0 because it is the reference node that enables the model to assign weights to the other nodes. MDS: Nursing Home Minimum Data Set; ROM: Range of Motion.

Figure 4 also shows the coefficients for the final model. The final model produced an AUROC of 0.668 (95% CI: 0.643–0.693) on the validation subset of the combined dataset, as shown in Figure 5. The risk of a fall between MDS assessments is estimated by the equation shown in Figure 4, which produces a probabilistic score between 0 and 1. The optimal threshold cutoff for the algorithm was 0.181 according to the Youden method, which maximizes the distance to the diagonal line. At this threshold, the model predicts falls with a sensitivity of 0.57, specificity of 0.69, positive predictive value of 0.25, and balanced F measure (F1) of 0.35 (Table 2). For comparison, Table 2 also shows the model’s performance at 2 other thresholds. The 0.10 threshold yields a much higher sensitivity (0.85) and a lower positive predictive value (0.19) at the cost of specificity (0.33), which may align with the goals of an automated screening tool rather than definitive identification with low false positives. Table 3 presents the statistical coefficients for the final fall risk prediction algorithm.
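For readers who want to reproduce the threshold selection, the following sketch shows one way to compute a Youden-style optimal threshold and the operating-point metrics reported in Table 2; it is an illustration, not the authors' code.

```python
# Youden-style threshold selection and operating-point metrics from predicted risks.
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_score):
    """Return the threshold maximizing sensitivity + specificity - 1 (furthest from the diagonal)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    j = tpr - fpr  # Youden's J statistic at each candidate threshold
    return thresholds[np.argmax(j)]

def operating_point(y_true, y_score, threshold):
    y_true = np.asarray(y_true)
    pred = np.asarray(y_score) >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    tn = np.sum(~pred & (y_true == 0))
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    f1 = 2 * ppv * sens / (ppv + sens) if (ppv + sens) else float("nan")
    return {"sensitivity": sens, "specificity": spec, "ppv": ppv, "f1": f1}
```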

Figure 5.

ROC curve for the validation subset using the final fall prediction model. The area under the ROC curve was 0.668 (95% CI: 0.643–0.693).

Table 2.

Key performance metrics for the hybrid CART-logit model using coefficients trained from the combined dataset at 3 thresholds

Optimal High specificity High sensitivity
Positive predictive value (precision) 0.25 0.31 0.19
Sensitivity (recall) 0.57 0.14 0.85
Specificity 0.69 0.94 0.33
Balanced F-measure 0.35 0.19 0.32
Threshold 0.18 0.30 0.10

Note: The model’s AUROC on the validation subset was 0.668 (95% CI: 0.643–0.693). The “optimal” threshold maximizes sensitivity and specificity by finding the point on the ROC curve that is furthest from the diagonal line. The high specificity threshold targets identification of the highest risk patients but has low sensitivity. The high sensitivity threshold targets avoidance of falls but has relatively low specificity. Balanced F-measure is the harmonic mean of positive predictive value (precision, P) and sensitivity (recall, R), calculated as 2PR/(P + R). It is a measure of model performance that is helpful when precision and recall are of equal importance.
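As a worked check against the values in Table 2, the F1 at the optimal threshold is 2 × 0.25 × 0.57 / (0.25 + 0.57) ≈ 0.35.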

Table 3.

Final fall risk prediction algorithm coefficients

Coefficients Estimate Std. error Z-value Pr(>|z|)
Intercept −2.591 0.475 −5.454 4.9×10−8
Age 0.021 0.003 7.319 2.5×10−13
Psychotropic exposure 0.298 0.030 10.101 <2.0×10−16
Number of behavior changes 0.146 0.061 2.40 0.016
T1—NO fall history prior to admission −1.454 0.417 −3.483 0.005
T2—Fall history prior to admission but NO functional range of motion issues in the lower body −0.688 0.414 −1.661 0.097
T3—Fall history prior to admission, functional range of motion issues in the lower body, and behavioral symptoms towards others 0.883 0.148 5.975 2.3×10−9
T4—Fall history prior to admission, functional range of motion issues in the lower body, but NO assistance needed while walking (includes no walking activity) −1.647 0.422 −3.90 9.7×10−5
T5—Fall history prior to admission, functional range of motion issues in the lower body, assistance needed while walking, and the administration of an antidepressant in the prior 7 days −1.040 0.425 −2.449 0.014
T6—Fall history prior to admission, functional range of motion issues in the lower body, assistance needed while walking, NO administration of an antidepressant in the prior 7 days, and entry to facility from a place other than another nursing home or the community −1.288 0.423 −3.046 0.002
T7 (reference)—Fall history prior to admission, functional range of motion issues in the lower body, assistance needed while walking, NO administration of an antidepressant in the prior 7 days, and entry to facility from another nursing home or the community 0.0

In an effort to make the model’s risk calculations more accessible, Tables 4 and 5 present risk tables focusing on situations where the probability of a fall is greater than 0.30 (see Table 2 for performance characteristics at this threshold). Table 4 shows fall risks for when MDS data indicate that the patient has not received an antidepressant medication in the past 7 days, while Table 5 shows risks for when the patient has received an antidepressant.

Table 4.

Risk table focusing on situations where the probability of a fall is greater than 0.30 (see Table 2 for performance characteristics at this threshold) where MDS data do not indicate administration of an antidepressant in the past 7 days

Age Number of psychotropics Change in behavioral symptoms since last MDS Fall history prior to admission Assistance to balance while walking Functional ROM impairment lower body MDS behavioral symptoms towards others Antidepressant past 7 days Risk of fall before next MDS
95 4 Yes No N/A N/A N/A N/A 0.32
65 1 Yes Yes Yes Yes No No 0.31
65 2 No Yes Yes Yes No No 0.34
65 2 Yes Yes Yes Yes No No 0.38
65 3 No Yes Yes Yes No No 0.41
65 3 Yes Yes Yes Yes No No 0.45
Risk increases by ∼0.035 for every additional psychotropic and also when there has been a change in behavioral symptoms
65 0 No Yes Yes Yes Yes No 0.41
65 0 Yes Yes Yes Yes Yes No 0.45
65 1 No Yes Yes Yes Yes No 0.49
65 1 Yes Yes Yes Yes Yes No 0.52
65 2 No Yes Yes Yes Yes No 0.56
65 Risk increases by ∼0.035 for every additional psychotropic, change in behavioral symptoms, and MDS behavioral symptom towards others
75 0 No Yes Yes Yes Yes No 0.46
75 0 Yes Yes Yes Yes Yes No 0.50
75 1 No Yes Yes Yes Yes No 0.54
75 1 Yes Yes Yes Yes Yes No 0.57
75 2 No Yes Yes Yes Yes No 0.61
75 Risk increases by ∼0.035 for every additional psychotropic, change in behavioral symptoms, and MDS behavioral symptom towards others
85 0 No Yes Yes Yes Yes No 0.52
85 0 Yes Yes Yes Yes Yes No 0.55
85 1 No Yes Yes Yes Yes No 0.59
85 1 Yes Yes Yes Yes Yes No 0.62
85 2 No Yes Yes Yes Yes No 0.66
85 Risk increases by ∼0.035 for every additional psychotropic, change in behavioral symptoms, and MDS behavioral symptom towards others
95 0 No Yes Yes Yes Yes No 0.57
95 0 Yes Yes Yes Yes Yes No 0.60
95 1 No Yes Yes Yes Yes No 0.64
95 1 Yes Yes Yes Yes Yes No 0.67
95 2 No Yes Yes Yes Yes No 0.70
95 Risk increases by ∼0.035 for every additional psychotropic, change in behavioral symptoms, and MDS behavioral symptom towards others

Note: Bold indicates a potentially addressable risk factor.

ROM: Range of Motion.

Table 5.

Risk table focusing on situations where the risk of a fall is greater than 0.30 (see Table 2 for performance characteristics at this threshold) where MDS data indicate that the patient has received an antidepressant medication in the past 7 days

Age Number of psychotropics Change in behavioral symptoms since last MDS Fall history prior to admission Assistance to balance while walking Functional ROM impairment lower body MDS behavioral symptoms towards others Antidepressant past 7 daysa Risk of fall before next MDS
75 4 Yes Yes Yes Yes No Yes 0.32
85 3 Yes Yes Yes Yes No Yes 0.30
85 4 No Yes Yes Yes No Yes 0.34
85 4 Yes Yes Yes Yes No Yes 0.37
95 3 No Yes Yes Yes No Yes 0.32
95 3 Yes Yes Yes Yes No Yes 0.35
65 2 No Yes Yes Yes Yes Yes 0.31
65 2 Yes Yes Yes Yes Yes Yes 0.34
65 Risk increases by ∼0.035 for every additional psychotropic or change in behavioral symptoms
75 1b Yes Yes Yes Yes Yes Yes 0.32
75 2 No Yes Yes Yes Yes Yes 0.36
75 2 Yes Yes Yes Yes Yes Yes 0.39
Risk increases by ∼0.035 for every additional psychotropic or change in behavioral symptoms
85 1b No Yes Yes Yes Yes Yes 0.34
85 1b Yes Yes Yes Yes Yes Yes 0.37
85 2 No Yes Yes Yes Yes Yes 0.40
85 2 Yes Yes Yes Yes Yes Yes 0.44
85 Risk increases by ∼0.035 for every additional psychotropic or change in behavioral symptoms
95 1b No Yes Yes Yes Yes Yes 0.38
95 1b Yes Yes Yes Yes Yes Yes 0.42
95 2 No Yes Yes Yes Yes Yes 0.46
95 2 Yes Yes Yes Yes Yes Yes 0.49
95 Risk increases by ∼0.035 for every additional psychotropic or change in behavioral symptoms

Note: Bold indicates a potentially addressable risk factor.

ROM: Range of Motion.

a

The antidepressant covariate is not shown in bold because it has a protective association in the model.

b

The psychotropic is the antidepressant, which is likely protective.

To illustrate how the model could be used to assess the fall risk for an individual person, consider a hypothetical nursing home resident who is 89 years old and who fell 7 days prior to their admission from the community. The first full MDS assessment indicates that the resident has a long history of behavioral dyscontrol that affected others (but no recent changes), has been taking 1 antidepressant medication, and relies on a walker. Following the model in Figure 4, the fall risk would be calculated in the following way:

  • Age 89 years (logistic coefficient 0.021);

  • An antidepressant is a psychotropic drug (logistic coefficient 0.298);

  • Fall History Prior to Admission?—YES (decision node, no coefficient);

  • Functional ROM Impairment—YES (decision node, no coefficient);

  • Behavioral Symptoms Towards Others?—YES (leaf node, coefficient = 0.883);

  • Assistance While Walking?—YES (decision node, no coefficient);

  • Antidepressants Med Administration in the past 7 days?—YES (leaf node, coefficient = −1.040).

To calculate the individual risk of a fall p for this resident, we would apply the logistic regression model in the following way to determine that the patient’s risk of fall is approximately 0.36:

xβ = −2.591 + 0.021 × 89 + 0.298 × 1 + 0.883 × 1 − 1.040 × 1 = −0.581
p = e^−0.581 / (1 + e^−0.581) = 0.359
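A minimal sketch of this calculation follows, using the coefficients in Table 3 and the node-weighting convention from Figure 4 (the T3 weight is added whenever behavioral symptoms towards others co-occur with one of T4–T7); the function name and interface are illustrative only.

```python
import math

# Logistic coefficients from Table 3 (final model); shorthand keys introduced here.
INTERCEPT = -2.591
COEF = {"age": 0.021, "psychotropics": 0.298, "behavior_changes": 0.146}
NODE_WEIGHT = {"T1": -1.454, "T2": -0.688, "T3": 0.883, "T4": -1.647,
               "T5": -1.040, "T6": -1.288, "T7": 0.0}

def fall_risk(age, n_psychotropics, n_behavior_changes, nodes):
    """Estimate fall risk before the next MDS assessment.

    `nodes` lists the active terminal nodes, eg ["T3", "T5"] when behavioral
    symptoms towards others co-occur with the antidepressant pathway (Figure 4)."""
    xb = (INTERCEPT
          + COEF["age"] * age
          + COEF["psychotropics"] * n_psychotropics
          + COEF["behavior_changes"] * n_behavior_changes
          + sum(NODE_WEIGHT[n] for n in nodes))
    return math.exp(xb) / (1 + math.exp(xb))

# Worked example from the text: 89-year-old, 1 psychotropic (an antidepressant),
# no recent behavior changes, nodes T3 and T5 -> risk of approximately 0.36.
print(round(fall_risk(89, 1, 0, ["T3", "T5"]), 3))  # 0.359
```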

One of the key observations about the model is that antidepressant medications have a protective association with fall risk when the patient has a fall history prior to admission, requires assistance to balance while walking, and has some functional range of motion impairment in the lower body. In the combined dataset, 29% of patients met these conditions at the time that at least 1 MDS report was collected. The protective association remains a factor even for patients who exhibit behavioral issues, unstable behaviors, and/or are exposed to multiple psychotropic drugs.

DISCUSSION

We developed and validated a novel falls prediction model that predicts falls ≤90 days in the future using data that are readily available in electronic form to every one of the more than 15 000 nursing home (NH) facilities in the United States. The novel hybrid CART-logit algorithm is an advance over the 22 fall risk assessment tools previously evaluated in the NH setting because it has better performance characteristics for the fall prediction window of ≤90 days and it is the only model designed to use features that are easily obtainable from the Long-Term Care MDS survey14 and drug therapy data (dispensing and administration). The final model could be used as a tool to actively monitor the risk of a resident falling between complete MDS assessments due to an issue that could potentially be prevented via a drug regimen and/or behavioral intervention.

One potentially effective way to implement the use of the model would be for nurses to generate a report based on the model at the time they submit a full MDS report. Such a report could also be used during routine nursing team discussions that happen after the MDS report and before the next report is created. The risk for a given patient could trigger a review for intervenable risks.

Another approach could be for consultant pharmacists to incorporate the model into the workflow for the federally mandated monthly medication regimen reviews. Even though the performance characteristics of the model are modest, the model showed robustness when applied to datasets from 2 different time periods and then to the combined data. While this proposed intervention approach has not been tested, it might be desirable from the perspective of skilled nursing facilities because it would create an opportunity to integrate the pharmacy services provider into a fall risk quality intervention. In our experience, there is rarely a pharmacist present at nursing patient review meetings, and nurses may benefit from additional input on potential medication safety concerns.

As it is a recommended data management practice to confirm that transformed data, such as those used in this study, are complete, correct, and free of nonpredictive variability,21 we conducted rigorous data characterization activities and concordance tests. We were fortunate to have a reference source (the Nursing Home Compare program) that generated federally mandated quality measures from nontransformed data submitted directly by the facilities to CMS. Thus, when our tests found discordance in certain quarters of data, we could infer that it was due to errors either in how the MDS data were extracted for research or in our data transformation or standardization process. A concern that could be raised about our data preparation process is that, by dropping discordant quarters of data before training, the final model no longer predicts from “real world” MDS data. Rather, the data that remained after dropping discordant data are actually closer to the ideal “real world” data that will be available in nursing homes than if we had kept those quarters in. In other words, our model was trained on data that were empirically tested to be in concordance with the actual source data, as determined by validated quality measures generated on the source data by an independent party.

A potential limitation of the model is that the data on which we trained and tested it originate from 5 facilities in a single health system. Testing of the model’s performance is recommended prior to deploying it at any new sites. If performance is lower than reported, it might be addressable by retraining the coefficients of the model using a reference set of fallers and nonfallers derived from the target health system. It should also be mentioned that the model was intentionally developed using data that are readily available in electronic form to every nursing home in the United States. This meant that we did not use data from wearable sensors and devices that, in recent years, have shown significant promise for real-time fall prediction.22,23

One of the most interesting findings is that the model assigns a protective role to antidepressant medications when the patient has a fall history prior to admission, requires assistance to balance while walking, and has some functional range of motion impairment in the lower body. An example of the practical implication of this feature is that a 65-year-old with all of the risks mentioned above, a continuation of poor behavioral symptoms (as indicated by no change in symptoms since the last MDS and current exhibition of behaviors towards others), and taking 2 psychotropics, including one that is an antidepressant (Table 5, seventh row), has the same risk (0.31) as a 65-year-old with all of the risks mentioned above, a recent improvement in behavioral symptoms (as indicated by a change in symptoms since the last MDS but no current behaviors towards others), and taking 1 psychotropic that is not an antidepressant (Table 4, second row).

While interesting and useful for fall prediction, this finding does not necessarily mean that antidepressants are in fact protective against falls. Previous studies focusing on antidepressants, the most widely prescribed psychotropic class, have found a nearly 2-fold increase in the risk of falls for NH residents24 and a significantly greater risk of multiple falls (OR: 4.66, 95% CI: 1.23–17.59).25 Indeed, we found a similar positive association of antidepressants with falls in a post hoc multivariable regression using this study’s training and test datasets with fall as the dependent variable but including only age, MDS antidepressant administration (N0400C), and behavioral changes as independent variables. So, while the novel CART-logit model assigns a protective role to antidepressants within the patient context mentioned above, in reality the antidepressant might be a marker of certain mood symptoms or other unmeasured variables that, if present, are protective against falls. Since depression itself is a known fall risk factor, adequately treating depression may perhaps contribute to reduced falls. Rigorous prospective studies are needed to determine whether antidepressants indeed have a protective role against falls in certain patient contexts within the nursing home.

The model also shows a strong association between the number of psychotropics (other than antidepressants) and fall risk regardless of other patient risk factors. This is consistent with previous studies.26 Psychotropic medication use is very prevalent in the NH setting and has been consistently associated with an increased risk of falls among older adults.27,28 Nearly half of all NH residents are prescribed at least 1 antidepressant, antipsychotic, or sedative/hypnotic (46%, 26%, and 13%, respectively).29,30 A meta-analysis of 40 studies found that taking at least 1 drug from these 3 psychotropic classes increased the odds of falling for individuals over 60 by 73% (95% CI: 52–97%).28 A secondary analysis of longitudinal NH data found a trend toward a greater risk of falls for NH residents prescribed any member of the sedative/hypnotic class (OR: 1.19, 95% CI: 0.94–1.50).31 A systematic literature review of fall determinants in older long-term care residents with dementia conducted by Kröpelin et al32 found, based on the reported hazard ratios, that psychotropic drugs are associated with an increased risk of falls. Finally, a prospective study involving nearly 19 000 NH residents found a 21% increase in the risk of falls for residents taking antipsychotic drugs.33

CONCLUSION

The study resulted in a novel and easily interpretable fall prediction model that requires only MDS and drug dispensing/administration data. The model can guide clinicians and nursing home staff in identifying level of fall risk within 90 days of completing an MDS assessment for individual nursing home residents, which could then lead to appropriate interventions. Further studies are warranted to test the model’s performance over data from a different health system and to validate how it would be best integrated into the NH clinical workflow.

FUNDING

This research was funded by the United States National Institute on Aging (grant number K01AG044433), NIMH (grant number P30 MH90333), Pittsburgh Claude D. Pepper Older Americans Independence Center (grant number NIA P30 AG024827), the UPMC Endowment in Geriatric Psychiatry, the Pharmacy Quality Alliance—CVS Health Foundation Scholars Program, and the Pittsburgh Health Data Alliance through the Center for Commercializable Applications.

AUTHOR CONTRIBUTIONS

RDB is the lead author of the work. RDB conceived of the project during his early career development and led development of the project, acquisition of the data, the analysis, quality assurance, and authorship of the manuscript. OVK contributed to the analysis by assisting with validation of the newer dataset and rerunning the fall prediction algorithm model learning workflow. She also assisted with quality assurance and development of the manuscript. SP was the lead statistician who came up with the idea for the hybrid CART-logit and the initial model structure. He also assisted with quality assurance of the older dataset and development of the manuscript. JFK, SLK-G, CFR, SMA, and SMH were all content mentors who provided significant feedback during the course of the project. This included many suggestions that led to improvements in the final fall prediction model. They all contributed to the manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.


ACKNOWLEDGMENTS

The authors thank Katrina Romagnoli, PhD, for assistance with the error analysis.

CONFLICT OF INTEREST STATEMENT

None declared.

Contributor Information

Richard D Boyce, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Olga V Kravchenko, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Subashan Perera, Aging Institute, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA.

Jordan F Karp, Department of Psychiatry, College of Medicine, University of Arizona, Tucson, Arizona, USA.

Sandra L Kane-Gill, Department of Pharmacy and Therapeutics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Charles F Reynolds, Aging Institute, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA.

Steven M Albert, Department of Behavioral and Community Health Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Steven M Handler, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA; Aging Institute, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA.

Data Availability

The data underlying this article cannot be shared publicly due to patient privacy laws. The data will be shared upon reasonable request to the corresponding author and submission of a data use agreement request to the institution from which the data were extracted.

REFERENCES

  • 1. Centers for Medicare and Medicaid Services. Nursing Home Data Compendium. 2015 ed. Washington, DC: Centers for Medicare and Medicaid Services; 2015. https://www.cms.gov/Medicare/Provider-Enrollment-and-Certification/CertificationandComplianc/Downloads/nursinghomedatacompendium_508-2015.pdf. Accessed June 16, 2015.
  • 2. Levinson D. Adverse Events in Skilled Nursing Facilities: National Incidence among Medicare Beneficiaries. Washington, DC: DHHS Office of the Inspector General; 2014.
  • 3. Centers for Medicare and Medicaid Services. Potentially preventable adverse events. QAPI News Brief 2016; 2: 1–3.
  • 4. Agency for Healthcare Research and Quality. NHQDR Web Site—National Nursing Home Benchmark Details. 2021. https://nhqrnet.ahrq.gov/inhqrdr/National/benchmark/table/Setting_of_Care/Nursing_Home#close. Accessed November 24, 2021.
  • 5. Gulka HJ, Patel V, Arora T, et al. Efficacy and generalizability of falls prevention interventions in nursing homes: a systematic review and meta-analysis. J Am Med Dir Assoc 2020; 21 (8): 1024–1035.e4.
  • 6. Nunan S, Brown Wilson C, Henwood T, et al. Fall risk assessment tools for use among older adults in long-term care settings: a systematic review of the literature. Australas J Ageing 2018; 37 (1): 23–33.
  • 7. Park S-H. Tools for assessing fall risk in the elderly: a systematic review and meta-analysis. Aging Clin Exp Res 2018; 30 (1): 1–16.
  • 8. Norman KJ, Hirdes JP. Evaluation of the predictive accuracy of the interRAI falls clinical assessment protocol, Scott fall risk screen, and a supplementary falls risk assessment tool used in residential long-term care: a retrospective cohort study. Can J Aging 2020; 39 (4): 521–32.
  • 9. Milosevic V, Linkens A, Winkens B, et al. Fall incidents in nursing home residents: development of a predictive clinical rule (FINDER). BMJ Open 2021; 11 (5): e042941.
  • 10. Marier A, Olsho LEW, Rhodes W, et al. Improving prediction of fall risk among nursing home residents using electronic medical records. J Am Med Inform Assoc 2016; 23 (2): 276–82.
  • 11. Jakovljevic M. Predictive validity of a modified fall assessment tool in nursing homes: experience from Slovenia. Nurs Health Sci 2009; 11 (4): 430–5.
  • 12. Wijnia JW, Ooms ME, van Balen R. Validity of the STRATIFY risk score of falls in nursing homes. Prev Med 2006; 42 (2): 154–7.
  • 13. Rosendahl E, Lundin-Olsson L, Kallin K, et al. Prediction of falls among older people in residential care facilities by the Downton index. Aging Clin Exp Res 2003; 15 (2): 142–7.
  • 14. Saliba D, Buchanan J. Development and Validation of a Revised Nursing Home Assessment Tool: MDS 3.0. RAND; 2008. http://www.med-pass.com/Docs/Products/Samples/MDS30FinalReportAppendix.pdf. Accessed April 19, 2012.
  • 15. Boyce RD, Handler SM, Karp JF, et al. Preparing nursing home data from multiple sites for clinical research—a case study using observational health data sciences and informatics. EGEMS (Wash DC) 2016; 4 (1): 1252.
  • 16. Reich C, Ryan P; OHDSI CDM and Vocabulary Development Working Group. OMOP Common Data Model V5.1.0. 2017. http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm#common_data_model. Accessed June 30, 2017.
  • 17. OHDSI. ACHILLES for Data Characterization. 2015. http://www.ohdsi.org/analytic-tools/achilles-for-data-characterization/. Accessed May 15, 2015.
  • 18. Sanghavi P, Pan S, Caudry D. Assessment of nursing home reporting of major injury falls for quality measurement on nursing home compare. Health Serv Res 2020; 55 (2): 201–10.
  • 19. Steinberg D, Cardell NS. The Hybrid CART-Logit Model in Classification and Data Mining. Salford Systems White Paper; 1998: 42.
  • 20. Breiman L, Friedman J, Stone CJ, et al. Classification and Regression Trees. 1st ed. Oxfordshire, England, UK: Routledge; 1984.
  • 21. Dhindsa K, Bhandari M, Sonnadara RR. What’s holding up the big data revolution in healthcare? BMJ 2018; 363: k5357.
  • 22. Bet P, Castro PC, Ponti MA. Fall detection and fall risk assessment in older person using wearable sensors: a systematic review. Int J Med Inform 2019; 130: 103946.
  • 23. Rajagopalan R, Litvan I, Jung T-P. Fall prediction and prevention systems: recent trends, challenges, and future research directions. Sensors (Basel) 2017; 17 (11): 2509.
  • 24. Thapa PB, Gideon P, Cost TW, et al. Antidepressants and the risk of falls among nursing home residents. N Engl J Med 1998; 339 (13): 875–82.
  • 25. Kallin K, Lundin-Olsson L, Jensen J, et al. Predisposing and precipitating factors for falls among older people in residential care. Public Health 2002; 116 (5): 263–71.
  • 26. Sterke CS, van Beeck EF, van der Velde N, et al. New insights: dose-response relationship between psychotropic drugs and falls: a study in nursing home residents with dementia. J Clin Pharmacol 2012; 52 (6): 947–55.
  • 27. Hartikainen S, Lönnroos E, Louhivuori K. Medication as a risk factor for falls: critical systematic review. J Gerontol A Biol Sci Med Sci 2007; 62 (10): 1172–81.
  • 28. Leipzig RM, Cumming RG, Tinetti ME. Drugs and falls in older people: a systematic review and meta-analysis: I. Psychotropic drugs. J Am Geriatr Soc 1999; 47 (1): 30–9.
  • 29. Stevenson DG, Decker SL, Dwyer LL, et al. Antipsychotic and benzodiazepine use among nursing home residents: findings from the 2004 National Nursing Home Survey. Am J Geriatr Psychiatry 2010; 18 (12): 1078–92.
  • 30. Karkare SU, Bhattacharjee S, Kamble P, et al. Prevalence and predictors of antidepressant prescribing in nursing home residents in the United States. Am J Geriatr Pharmacother 2011; 9 (2): 109–19.
  • 31. Hien LTT, Cumming RG, Cameron ID, et al. Atypical antipsychotic medications and risk of falls in residents of aged care facilities. J Am Geriatr Soc 2005; 53 (8): 1290–5.
  • 32. Kröpelin TF, Neyens JCL, Halfens RJG, Kempen GIJM, Hamers JPH. Fall determinants in older long-term care residents with dementia: a systematic review. Int Psychogeriatr 2013; 25 (4): 549–63.
  • 33. Kiely DK, Kiel DP, Burrows AB, et al. Identifying nursing home residents at risk for falling. J Am Geriatr Soc 1998; 46 (5): 551–5.


