Journal of the American Medical Informatics Association (JAMIA). 2023 Mar 25;30(6):1103–1113. doi: 10.1093/jamia/ocad042

Integrating economic considerations into cutpoint selection may help align clinical decision support toward value-based healthcare

Rex Parsons, Robin Blythe, Susanna M Cramb, Steven M McPhail
PMCID: PMC10198528; PMID: 36970849

Abstract

Objective

Clinical prediction models providing binary categorizations for clinical decision support require the selection of a probability threshold, or “cutpoint,” to classify individuals. Existing cutpoint selection approaches typically optimize test-specific metrics, including sensitivity and specificity, but overlook the consequences of correct or incorrect classification. We introduce a new cutpoint selection approach considering downstream consequences using net monetary benefit (NMB) and through simulations compared it with alternative approaches in 2 use-cases: (i) preventing intensive care unit readmission and (ii) preventing inpatient falls.

Materials and methods

Parameter estimates for costs and effectiveness from prior studies were included in Monte Carlo simulations. For each use-case, we simulated the expected NMB resulting from the model-guided decision using a range of cutpoint selection approaches, including our new value-optimizing approach. Sensitivity analyses considered alternative event rates, model discrimination, and calibration performance.

Results

The proposed approach that considered expected downstream consequences was frequently NMB-maximizing compared with other methods. Sensitivity analysis demonstrated that it was, or closely tracked, the optimal strategy under a range of scenarios. Under scenarios of relatively low event rates and discrimination that may be considered realistic for intensive care (prevalence = 0.025, area under the receiver operating characteristic curve [AUC] = 0.70) and falls (prevalence = 0.036, AUC = 0.70), our proposed cutpoint method was either the best or similar to the best of the compared methods regarding NMB, and was robust to model miscalibration.

Discussion

Our results highlight the potential value of conditioning cutpoints on the implementation setting, particularly for rare and costly events, which are often the target of prediction model development research.

Conclusions

This study proposes a cutpoint selection method that may optimize clinical decision support systems toward value-based care.

Keywords: net monetary benefit, clinical prediction model, probability threshold, cutpoint, value-based care

BACKGROUND AND SIGNIFICANCE

The increasing use of computerized decision support and the digitization of hospital systems, including electronic medical records, are providing new opportunities to develop clinical prediction models. Computerized decision support has the potential to improve quality, safety, and efficiency of care, as well as patient and staff experiences. However, prediction of future health-related events alone is without merit unless it precipitates improved care and better outcomes. By integrating the likely downstream effects of decision support system recommendations when developing threshold indicators for decision support prediction models, including the implications of false positives and false negatives, there may be opportunity to optimize the value of decision support recommendations derived from prediction models.

From a healthcare implementation perspective, decision-makers are confronted with a wide array of potential models to use, few of which are externally validated.1 When clinical prediction models are developed, the receiver operating characteristic (ROC) curve is frequently used to assess model discrimination.2 Several methods are then used to determine a probability threshold, or cutpoint, that converts predicted probabilities into predicted classes. The Youden index, calculated as sensitivity + specificity − 1, is a common approach used to obtain a cutpoint for classification.3 Alternatives to the Youden index include the closest-to-(0,1) corner approach,4 the concordance probability method,5 and the index of union.6
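For concreteness, below is a minimal sketch of obtaining a Youden-index cutpoint with the cutpointr R package (the package used later in this study); the data frame and its column names are purely illustrative:

```r
library(cutpointr)

set.seed(1)
# Purely illustrative data: predicted probabilities that carry real signal.
p <- plogis(rnorm(500, mean = -2, sd = 1.5))
df <- data.frame(prob = p, outcome = rbinom(500, 1, p))

# Select the cutpoint maximizing the Youden index
# (sensitivity + specificity - 1).
cp <- cutpointr(df, prob, outcome, pos_class = 1,
                method = maximize_metric, metric = youden)
cp$optimal_cutpoint
```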

These approaches are calculated using the sensitivity, specificity, and/or area under the ROC curve (AUC). However, the use of model sensitivity and specificity as metrics to assess the clinical utility of prediction models has been heavily criticized.7–9 A variety of methods for assessing clinical prediction models have been developed that partly address these critiques. These include the predictive summary index, which uses positive and negative predictive values to determine net gain in certainty of disease,10 and the number needed to evaluate, which reflects the clinical burden of responding to false positives to determine appropriate classification thresholds.9

Decision curve analysis (DCA), an approach that factors the impact of false positives and negatives into the threshold selection process, is rapidly growing in usage.11 The recent interest in more patient-centered measures demonstrates an encouraging trend in valuing the impact of model-informed decisions over concerns solely of predictive optimization. From the health system perspective, model evaluations using sensitivity and specificity are typically agnostic to costs and quality of life outcomes. However, misclassification can present significant problems for patients, and this can be accounted for in DCA.12

Even when models are externally validated, a model with high discrimination may not necessarily lead to care that improves patient outcomes and reduces healthcare costs, which requires testing through the use of decision-analytic models.13 While the strength of DCA is the use of a weighting factor for patients to help make complex care decisions, it cannot replace decision-analytic models as health-economic outcomes of cost and effectiveness are not included in threshold selection parameters.14 Situations in which the patient benefit is uncertain, preferences are weak, or 2 choices of action seem clinically equivocal may not be resolved by DCA. In these cases, a care provider may defer to standard practice or choose either option if there is sufficient justification.

There has been recent work in assessing the cost-effectiveness of prediction models15,16 but relatively little work to integrate downstream costs of patient risk assessments into model development processes. In our recent scoping review of inpatient fall prediction models,17 we were unable to find a study that incorporated cost and effectiveness estimates for cutpoint selection. One of the included studies18 did incorporate relative costs of misclassification, but these were not informed by the costs associated with falls and the available intervention, nor by the effectiveness of the intervention. In studies reviewed by van Giessen and colleagues, costs were incorporated into the model development process through a brute-force search over many cutpoints to select the most cost-effective.19–21 Prior research has demonstrated the concept of selecting a cutpoint that minimizes costs and maximizes benefits using expected values or model-based approaches.22,23 The formula that Wynants and colleagues presented as cost-minimizing (Equation 1) was an important advancement in the field but relies on the model being well calibrated.22 However, many published clinical prediction models are not well calibrated or do not report on calibration.24 In this equation, the threshold (t) is calculated from the relative costs (C) of the possible classifications (TP: true positive; FP: false positive; TN: true negative; FN: false negative):

t = (C_{FP} − C_{TN}) / (C_{FP} + C_{FN} − C_{TP} − C_{TN}). (1)
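A minimal sketch of Equation (1) in R; the example costs are illustrative and not taken from either use-case:

```r
# Cost-minimizing threshold (Equation 1), following Wynants et al.
# Costs may be in any consistent unit; inputs are per-classification costs.
cost_minimizing_threshold <- function(c_tp, c_fp, c_tn, c_fn) {
  (c_fp - c_tn) / (c_fp + c_fn - c_tp - c_tn)
}

# Example: correct classifications cost nothing, a false positive costs 100,
# and a false negative costs 1000; treat above a predicted risk of ~9%.
cost_minimizing_threshold(c_tp = 0, c_fp = 100, c_tn = 0, c_fn = 1000)
#> [1] 0.09090909
```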

In this article, we build on this approach by providing an easily adaptable health-economic method for cutpoint selection that maximizes total net monetary benefit (NMB)25 in the given dataset. The proposed "value-optimizing" method selects the cutpoint that maximizes the total NMB (Equation 2), calculated as the sum, over each possible classification, of the number of samples (n) multiplied by the associated NMB:

NMB_{TOTAL} = n_{TP} × NMB_{TP} + n_{FP} × NMB_{FP} + n_{TN} × NMB_{TN} + n_{FN} × NMB_{FN}. (2)
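A minimal sketch of this value-optimizing search in base R, assuming predicted probabilities, a binary outcome vector, and a named vector of per-class NMB values as inputs; the function names are ours:

```r
# Total NMB (Equation 2) at a single candidate cutpoint.
# `probs` are predicted probabilities, `outcome` is 0/1, and `nmb` is a
# named vector of NMB per classification, e.g. c(TP = ..., FP = ..., ...).
total_nmb <- function(cutpoint, probs, outcome, nmb) {
  pred <- as.integer(probs >= cutpoint)
  sum(nmb["TP"] * sum(pred == 1 & outcome == 1),
      nmb["FP"] * sum(pred == 1 & outcome == 0),
      nmb["TN"] * sum(pred == 0 & outcome == 0),
      nmb["FN"] * sum(pred == 0 & outcome == 1))
}

# Score every candidate cutpoint (including treat-all at 0 and
# treat-none at 1) and keep the NMB-maximizing one.
value_optimizing_cutpoint <- function(probs, outcome, nmb) {
  candidates <- sort(unique(c(0, probs, 1)))
  scores <- vapply(candidates, total_nmb, numeric(1),
                   probs = probs, outcome = outcome, nmb = nmb)
  candidates[which.max(scores)]
}
```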

Our study compares the impact of using prediction models to guide the use of an intervention in a given population, drawing measures of treatment costs, treatment effectiveness, and utility decrements from prior literature. We use sensitivity analyses to explore how the cutpoint selection methods compare in different scenarios that may represent different clinical settings, including differences in model AUC and the rate of the event being predicted. We also compare the proposed value-optimizing method to the cost-minimizing cutpoint selected using the approach in Equation (1), with the model either calibrated or miscalibrated.

OBJECTIVE

Our objective was to determine whether models with a cutpoint selected by optimizing NMB produced better outcomes for the health system than models optimized using alternative methods that do not consider the downstream costs associated with model-based recommendations or the effectiveness of the intervention. To examine the utility of our approach, we chose 2 case studies that may be encountered in hospitals: a model for determining readiness for intensive care unit (ICU) discharge and a model for assigning a hospital fall-prevention strategy.

MATERIALS AND METHODS

Design and purpose

We designed and implemented a simulation study to prospectively evaluate the modeled consequences of clinical actions following prediction under several cutpoint selection methods. We used 2 case studies: (1) identifying patients ready for ICU discharge and (2) allocating inpatients to receive a fall-prevention, education-based intervention.

For the ICU discharge model, we simulated the impact of a model that predicts patient readiness for discharge from the ICU every 24 h. The clinical action following prediction was to discharge the patient for a positive result (predicted probability greater than the specified threshold) and hold them for another 24 h for a negative result (predicted probability less than the specified threshold). The selected event rates and discrimination performance of our simulated models were based on those reported in a recent systematic review of ICU readmission models.26

For the falls model, we simulated the impact of an education program delivered in addition to usual-care falls prevention. Our simulated intervention was based on the Safe Recovery inpatient fall-prevention education program, which has demonstrated effectiveness in a clinical trial and includes multimedia education materials and health professional-delivered follow-up using adult-focused learning and behavior change principles.27 The event rate for the primary model was based on that reported by Morello et al28 (3.6%), and the discrimination of our simulated models was based on the models we previously reviewed17 (given the wide range of performance among externally validated models, we used a conservative AUC estimate of 0.70).

Our primary analysis for both use-cases was based on combinations of event rate and model discrimination (AUC) reported in prior literature. For ICU readmission, we used an event rate of 0.025 and an AUC of 0.70.29 For inpatient falls, we used an event rate of 0.036 and an AUC of 0.70. As sensitivity analyses, to examine the relative performance of our approach for different potential models and settings, we repeated our simulation with varying rates of the outcome event and varying model discrimination. For the ICU discharge model, the simulation was repeated for every combination of event rates (0.01, 0.025, and 0.1) and AUC values (0.55, 0.70, and 0.85). For the inpatient falls use-case, we performed the same comparison with event rates of 0.01, 0.036, and 0.1, again including the primary-analysis value.
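A sketch of the resulting scenario grid for the ICU use-case (the falls grid substitutes event rates of 0.01, 0.036, and 0.1):

```r
# Every combination of event rate and AUC for the ICU sensitivity analyses;
# each row would be run through the full simulation described below.
scenarios <- expand.grid(
  event_rate = c(0.01, 0.025, 0.10),
  auc        = c(0.55, 0.70, 0.85)
)
nrow(scenarios)  # 9 scenarios
```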

For both use-cases, we present the incremental net monetary benefit (INB), with a reference strategy of treat-none for ICU discharge and treat-all for falls education. These represent the most likely current practice.

Assigning parameter values and patient states

Our model parameters were costs and health utilities, measured in quality-adjusted life years (QALYs). QALYs represent the benefits associated with reductions in morbidity and mortality over 1 year.30 In the context of this study, we defined health utility as the time-independent unit of preference anchored with health states of death (0.00) and perfect health (1.00). In this context, a health utility of 1.00 maintained for 1 year represents 1 QALY, while a utility of 0.50 maintained for 1 year represents 0.5 QALYs. We categorized all predictions into the following classes: true positive, false positive, true negative, and false negative. True positives were defined as patients with a predicted probability above the cutpoint who would have experienced the event of interest if no intervention was given. True negatives were defined as the inverse (predicted probability below cutpoint and no event). False positives were defined as patients who were predicted to be positive but would not experience the event of interest, with false negatives being the inverse (predicted negative and event occurred). While this study comprises hypothetical scenario simulations, parameter estimates and uncertainties were informed by existing empirical studies reported in prior literature. For both case studies, we selected a conservative estimate of an Australian willingness to pay (WTP) threshold of $28 033 per QALY gained.31 All costs were updated to 2022 $AUD using a 3% inflation rate.32
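To make the arithmetic concrete, a hedged sketch of how a per-state NMB could be assembled from the WTP threshold, a QALY change, and an inflated cost; the pairing of example values is purely illustrative:

```r
# Willingness-to-pay per QALY gained (Edney et al, as in the text).
wtp <- 28033

# Inflate a historical cost to 2022 $AUD at the 3% rate used in the paper.
inflate <- function(cost, years, rate = 0.03) cost * (1 + rate)^years

# NMB of a patient state: QALY change valued at WTP, minus associated cost.
nmb_state <- function(qalys, cost) wtp * qalys - cost

# Illustration only: a 0.42 QALY gain against a 2015 cost of $6669
# inflated over 7 years (this pairing of numbers is not meaningful).
nmb_state(qalys = 0.42, cost = inflate(6669, years = 7))
#> approximately 3572
```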

For the ICU discharge case study, true positives were defined as the patient being appropriately held in the ICU for an additional 24 h, preventing an event leading to ICU readmission. False positives were defined as patients who could have been safely discharged being inappropriately held in the ICU for an additional 24 h, which carries an opportunity cost. False negatives were defined as patients being discharged from ICU who went on to suffer ICU readmission. True negatives were defined as patients successfully being discharged from the ICU without readmission.

For the falls case study, true positives were patients who were appropriately allocated to receive the patient education intervention. The costs and outcomes for these patients were based on the outcomes associated with falls multiplied by a probability-weighted factor accounting for the reduced cost of falls following the intervention, plus the cost of providing the intervention. False positives were patients who were unnecessarily allocated to the intervention, with costs equal to the cost of the intervention. True negatives were appropriately not given the intervention and had no associated intervention or fall-related costs. False negatives were not given the intervention when they would likely have benefited from it; their costs were equal to the unadjusted outcomes associated with falls.

The costs and outcomes used in each case study are described in Table 1. Hospital care costs were selected from the literature and transformed into gamma distributions using the R package fitdistrplus.33 We selected gamma distributions as hospital costs are positive and frequently right-skewed, and gamma distributions avoid the bias introduced by retransforming estimates.34 The outcomes associated with ICU readmission were calculated by multiplying the daily cost of ICU care with the expected duration in days of an ICU readmission, taken from Chen et al35 and transformed into a gamma distribution. Health utility increments and decrements were transformed into beta distributions, as beta distributions are flexible and bounded between 0 and 1.33
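The gamma and beta parameters in Table 1 below are consistent with a method-of-moments conversion from the reported means and SDs; a sketch of that conversion, which approximately reproduces the tabulated values (fitdistrplus, as used by the authors, fits the same families to raw data):

```r
# Convert a reported mean/SD to gamma parameters (shape, rate).
gamma_from_moments <- function(m, s) c(shape = m^2 / s^2, rate = m / s^2)

# Convert a reported mean/SD to beta parameters (shape1, shape2);
# only valid for quantities bounded between 0 and 1.
beta_from_moments <- function(m, s) {
  nu <- m * (1 - m) / s^2 - 1
  c(shape1 = m * nu, shape2 = (1 - m) * nu)
}

gamma_from_moments(7.8, 13.4)   # ~Gamma(0.339, 0.043), cf. ICU readmission LOS
beta_from_moments(0.42, 0.083)  # ~Beta(14.4, 19.9), cf. QALY gain
```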

Table 1.

Parameters used to populate model

Description | Source | Source parameters | Transformed distribution

Applies to all
 WTP | Edney et al (2018)31 | Value = 28 033 | N/A

ICU readmission prediction
 ICU opportunity cost | Page et al (2017)36 | Cost = 436 | N/A
 ICU readmission LOS (days) | Chen et al (1998)35 | Mean = 7.8 (SD = 13.4) | Gamma(0.3385, 0.0433)
 QALY gain | de Vos et al (2022)37 | Mean = 0.42 (SD = 0.083) | Beta(14.3927, 19.8741)
 ICU occupancy cost | Hicks et al (2019)38 | Mean = 4375 (SD = 1157) | Gamma(14.2911, 0.0033)

In-hospital fall prediction
 Treatment cost | Hill et al (2015)39 | Cost = 77.30 | N/A
 Falls cost | Morello et al (2015)40 | Mean = 6669 [95% CI: 3888–9450] | Gamma(22.0516, 0.0033)
 Treatment effect | Haines et al (2011)41 | Adjusted HR = 0.43 [95% CI: 0.24–0.78] | Exp(Normal(−0.8440, 0.3038))
 QALY loss | Latimer et al (2013)42 | Utility decrement (n): no injury −0.02 (40); minor injury −0.04 (31); moderate injury −0.06 (18); major injury −0.11 (9) | Beta(2.4253, 55.4053)

Note: HR: hazard ratio; LOS: length of stay.

The weighting factor for the falls intervention effect was taken from a previous multi-site randomized controlled trial.41 We log-transformed the reported hazard ratio of 0.43 (95% confidence interval [CI], 0.24–0.78) and sampled from a normal distribution fitted using these parameters. The sampled value (log-hazard ratio) was exponentiated and used as the weighting factor. The health utility decrement resulting from a fall was taken from Latimer et al.42 The authors did not publish uncertainty estimates, but did report transition probabilities and utility scores, so we simulated a population of patients experiencing a fall and fit a beta distribution to the expected utility decrement from falling. The costs of the patient education intervention were obtained by contacting the authors, as they were not included in the published manuscript.39
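A sketch of this sampling step, assuming the standard back-calculation of the log-hazard-ratio standard error from the reported 95% CI; it approximately recovers the Exp(Normal(−0.8440, 0.3038)) distribution in Table 1:

```r
# Reported treatment effect for the falls education program (Haines et al).
hr <- 0.43
ci <- c(0.24, 0.78)

# Back-calculate the SE of the log-HR from the CI width on the log scale.
log_hr_se <- (log(ci[2]) - log(ci[1])) / (2 * qnorm(0.975))  # ~0.30

set.seed(1)
# Sample log-HRs, then exponentiate to get the weighting factor.
weighting_factor <- exp(rnorm(5000, mean = log(hr), sd = log_hr_se))
summary(weighting_factor)  # centred near 0.43
```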

To determine the cost and effectiveness of the prediction model, we assigned values to each square of a confusion matrix comparing test prediction (positive or negative) with the occurrence of an event (Supplementary Table S1).

Simulating patient population

Values for a single predictor variable for the 2 classes of patients (positive and negative for the event occurrence) were taken from separate distributions. Distribution parameters were calculated by converting the simulated model AUC to a Cohen's d value.43 The negative and positive cases were sampled from normal distributions with a standard deviation of 1; the distribution for negative cases had a mean of 0 and that for positive cases a mean equal to Cohen's d. The event rate and sample size were used to sample the number of positive and negative events from a binomial distribution. The prediction model was fit using logistic regression with an intercept term and this single predictor variable. In each iteration of the simulation, a training set was sampled and used for fitting the prediction model and obtaining cutpoints. The sample size for this training set was determined using the pmsampsize R package44 and the specified discrimination (AUC) and event rate. If the expected number of events was not obtained in the sample, outcome values were sampled one at a time from a binomial distribution at the specified event rate and added to the dataset until the specified minimum number of events was achieved.
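A minimal sketch of this data-generating process, using the identity AUC = Φ(d/√2) for two unit-variance normals; the fixed sample size is a placeholder for the pmsampsize calculation described above:

```r
# Simulate a single predictor whose class separation (Cohen's d) yields the
# target AUC: for two unit-variance normals, AUC = pnorm(d / sqrt(2)).
simulate_population <- function(n, event_rate, auc) {
  d <- sqrt(2) * qnorm(auc)                     # AUC -> Cohen's d
  y <- rbinom(n, size = 1, prob = event_rate)   # event indicator
  x <- rnorm(n, mean = ifelse(y == 1, d, 0), sd = 1)
  data.frame(x = x, y = y)
}

train <- simulate_population(1000, event_rate = 0.036, auc = 0.70)
model <- glm(y ~ x, data = train, family = binomial)
```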

A validation sample (n = 10 000) was taken from the same distributions, and the previously fit prediction model was applied to generate predicted probabilities for each sample. The cutpoints estimated using the training set were applied to the validation set to assign samples to predicted classes.
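Continuing the sketch, the fitted model and a training-set cutpoint are applied to a fresh validation sample; the per-class NMB values are illustrative placeholders, and value_optimizing_cutpoint is the helper sketched under Equation (2):

```r
# Purely illustrative per-class NMB values (not the study's parameters).
nmb <- c(TP = -4100, FP = -77, TN = 0, FN = -9300)

# Choose the cutpoint on the training data.
train$prob <- predict(model, type = "response")
cut <- value_optimizing_cutpoint(train$prob, train$y, nmb)

# Apply model and cutpoint to a fresh validation sample (n = 10 000).
validation <- simulate_population(10000, event_rate = 0.036, auc = 0.70)
validation$prob  <- predict(model, newdata = validation, type = "response")
validation$class <- as.integer(validation$prob >= cut)
```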

Threshold selection

We applied several methods to select the cutpoint above which a predicted probability was classified as a positive case. Our strategies included treating all patients (cutpoint = 0), treating no patients (cutpoint = 1), the Youden index, the closest-to-(0,1) corner approach, the concordance probability method, and the index of union. For more information on these approaches, see Unal (2017).6 Our approach selected the cutpoint that maximized the NMB under the expected patient state costs. We used the expected values for the inputs (treatment cost, treatment effect, and utility decrement) required to compute the NMB associated with each patient state. The cutpointr R package was used for threshold estimation under each approach.45 We then evaluated each cutpoint while incorporating uncertainty in the inputs used to estimate NMB, resampling each input from its underlying distribution (Table 1) and calculating the NMB associated with each predicted class within each iteration of the simulation (n = 5000).
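One way to express the value-optimizing approach within cutpointr is as a custom metric, following the package's documented pattern for user-defined metrics; the default NMB values below are illustrative assumptions, not the study's parameters:

```r
library(cutpointr)

# Custom cutpointr metric implementing Equation (2): total NMB across the
# confusion matrix counts at each candidate cutpoint. Per-class NMB
# defaults are purely illustrative.
total_nmb_metric <- function(tp, fp, tn, fn,
                             nmb_tp = -4100, nmb_fp = -77,
                             nmb_tn = 0, nmb_fn = -9300, ...) {
  out <- matrix(tp * nmb_tp + fp * nmb_fp + tn * nmb_tn + fn * nmb_fn,
                ncol = 1)
  colnames(out) <- "total_nmb"
  out
}

# Maximize total NMB over cutpoints on the training data from above.
cp <- cutpointr(train, prob, y, pos_class = 1, direction = ">=",
                method = maximize_metric, metric = total_nmb_metric)
cp$optimal_cutpoint
```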

Comparison to cost-minimizing threshold

To compare our approach with the cost-minimizing threshold (Equation 1) for both poorly and well-calibrated models, we assessed performance in our primary analyses under varying amounts of miscalibration. Miscalibration was introduced by adding an offset to the model intercept after the model was fit within each simulation.
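A sketch of this miscalibration step: shifting the linear predictor by a fixed offset distorts calibration while leaving the ranking of patients, and hence the AUC, unchanged:

```r
# Apply an intercept offset on the link (log-odds) scale, then transform
# back to probabilities; offset > 0 inflates all predicted risks,
# offset < 0 deflates them.
miscalibrated_probs <- function(model, newdata, offset) {
  plogis(predict(model, newdata = newdata, type = "link") + offset)
}

validation$prob_miscal <- miscalibrated_probs(model, validation, offset = 1)
```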

RESULTS

The findings from the primary analyses of each cutpoint method's effect on INB for both case studies are presented in Figure 1 and Table 2. The sensitivity scenario analyses are visualized in Supplementary Figure S1 for the ICU case study and Supplementary Figure S2 for the falls case study. Overall, the method that most commonly maximized NMB was the value-optimizing method proposed in this study, which considered downstream costs and effects, in comparison to other approaches to determining model thresholds. Figure 2 visualizes the differences in cutpoint selection on a simulated dataset using the value-optimizing method, the treat-all and treat-none approaches, and the commonly used Youden index.

Figure 1.

Primary analyses of incremental NMB associated with each cutpoint selection method (A and B for ICU readmission and inpatient falls, respectively) and distribution of selected cutpoints (C and D for ICU readmission and inpatient falls, respectively).

Table 2.

Primary analyses of incremental NMB associated with each cutpoint method for ICU discharge and inpatient falls

Cutpoint method | Incremental NMB (median [95% interval]) | Best performing [n (%)]

ICU readmission (reference group: treat-none)
 Treat-all | −4616.69 [−7570.67, −2365.82] | 0 (0%)
 Value-optimizing | 0 [−7.20, 0] | 4276 (86%)
 Closest to (0,1) | −1412.16 [−2820.33, −281.78] | 35 (1%)
 Youden | −1415.91 [−3516.53, −168.37] | 27 (1%)
 Sens-Spec product | −1415.53 [−2977.96, −251.93] | 1 (0%)
 Index of union | −1477.87 [−2753.65, −340.32] | 17 (0%)
 Cost-minimizing | 0 [0, 0] | 644 (13%)

Inpatient falls (reference group: treat-all)
 Treat-none | −32.41 [−175.35, 37.40] | 35 (1%)
 Value-optimizing | 15.22 [−37.33, 38.73] | 1258 (25%)
 Closest to (0,1) | 15.31 [−45.65, 42.40] | 747 (15%)
 Youden | 15.00 [−54.47, 41.61] | 166 (3%)
 Sens-Spec product | 15.32 [−47.18, 42.06] | 31 (1%)
 Index of union | 15.89 [−40.26, 42.30] | 778 (16%)
 Cost-minimizing | 16.09 [−19.42, 37.40] | 1985 (40%)

Note: The bolded values are the best performing within the group of cutpoint methods for that given metric (column).

Figure 2.

Example selected cutpoints on (A) distribution of predicted probabilities and (B) ROC curve. Example data were sampled using the method detailed for all simulations (AUC = 0.85, event rate = 0.15, sample size = 1000) and used the same inputs for NMB values as the falls use-case.

Primary analyses

For the ICU discharge model, the optimal strategies, with similar NMB, were the proposed value-optimizing approach, the cost-minimizing threshold, and the treat-none approach. As treat-none was the reference strategy for calculating INB, the primary analyses (Figure 1A) show the value-optimizing and cost-minimizing approaches performing similarly to the reference strategy. The median INB for the value-optimizing approach was $0 [95% interval: −7.2, 0]. The value-optimizing approach was the best performer in 86% of simulations, whereas the cost-minimizing approach was best in 13% of simulations. The next best cutpoint selection method was closest to (0,1), with a median INB of −$1412.16 [95% interval: −2820.33, −281.78].

For the inpatient falls model, where the reference strategy was treat-all, the findings were similar across all cutpoint methods. The treat-none approach was the worst regarding INB (−$32.41 [−175.35, 37.40]), but all other cutpoint methods appeared comparable, with median INBs between $15.00 and $16.09. The cost-minimizing approach had the highest median INB ($16.09 [−19.42, 37.40]) and was the best performer in 40% of simulations. The next best approach was the value-optimizing method, which was best in 25% of simulations.

Sensitivity analyses

For both models, the value-optimizing approach outperformed other cutpoint selection methods most of the time in the sensitivity analysis scenarios. The relative improvement of this method was greatest when the event rate was rare and the AUC was lower (Supplementary Figures S1 and S2 and Tables 3 and 4). Treating all patients was always the least favorable approach for both ICU readmission and inpatient falls.

Table 3.

ICU readmission INBs (presented as median [95% interval])

Rate | Model AUC | Treat-all | Value-optimizing | Closest to (0,1) | Youden | Sens-Spec product | Index of union | Cost-minimizing
0.010 | 0.55 | −4623.10 [−7594.12, −2268.19] | 0 [0, 0] | −2081.63 [−3574.01, −945.53] | −2020.24 [−4373.57, −590.08] | −2074.33 [−3561.08, −940.46] | −2116.13 [−3503.59, −969.11] | 0 [0, 0]
0.010 | 0.70 | −4616.69 [−7570.67, −2365.82] | 0 [−7.20, 0] | −1412.16 [−2820.33, −281.78] | −1415.91 [−3516.53, −168.37] | −1415.53 [−2977.96, −251.93] | −1477.87 [−2753.65, −340.32] | 0 [0, 0]
0.010 | 0.85 | −4583.71 [−7466.05, −2345.28] | 0 [−36.02, 90.13] | −652.12 [−2107.69, 457.12] | −700.95 [−2605.15, 436.32] | −693.05 [−2326.28, 436.29] | −707.01 [−1995.68, 477.73] | 0 [−14.55, 56.23]
0.025 | 0.55 | −4290.48 [−7346.75, −26.12] | 0 [−1.03, 0] | −1923.62 [−3411.43, 270.89] | −1825.63 [−4211.70, 261.45] | −1922.78 [−3417.62, 270.89] | −1951.48 [−3405.49, 299.60] | 0 [0, 0]
0.025 | 0.70 | −4260.48 [−7320.42, 416.36] | 0 [−24.07, 25.11] | −1243.78 [−2767.40, 1558.89] | −1212.82 [−3457.86, 1553.76] | −1233.36 [−2915.04, 1558.77] | −1320.39 [−2662.36, 1681.96] | 0 [−1.54, 1.33]
0.025 | 0.85 | −4248.23 [−7347.53, −88.34] | 0 [−114.37, 554.47] | −524.45 [−2042.98, 2295.34] | −575.20 [−2509.51, 2238.64] | −559.01 [−2229.50, 2320.03] | −577.71 [−1909.39, 2383.83] | 0 [−68.01, 400.43]
0.100 | 0.55 | −3501.48 [−6859.66, 16 129.35] | 0 [−8.37, 10.39] | −1552.11 [−3245.17, 8826.46] | −1380.05 [−3904.36, 8894.15] | −1549.44 [−3277.62, 8747.87] | −1583.10 [−3225.75, 8964.11] | 0 [0, 0]
0.100 | 0.70 | −3442.81 [−7002.28, 16 807.74] | 0 [−233.39, 1320.33] | −924.30 [−2815.20, 11 641.08] | −816.40 [−3370.60, 11 870.13] | −887.15 [−2912.05, 11 870.13] | −1003.92 [−2698.71, 11 719.73] | 0 [−87.84, 531.80]
0.100 | 0.85 | −3380.71 [−6968.28, 16 753.46] | 27.57 [−496.45, 5777.93] | −411.02 [−1990.98, 14 566.43] | −410.58 [−2280.99, 14 692.09] | −408.44 [−2134.12, 14 567.33] | −422.37 [−1934.61, 14 742.27] | 52.44 [−366.72, 5548.12]

Note: The bolded values are the best performing within the group of cutpoint methods for that given metric (column).

Table 4.

Inpatient falls INBs (presented as median [95% interval])

Rate | Model AUC | Treat-none | Value-optimizing | Closest to (0,1) | Youden | Sens-Spec product | Index of union | Cost-minimizing
0.010 | 0.55 | 63.79 [22.65, 82.89] | 63.56 [20.01, 82.64] | 37.06 [17.33, 48.71] | 35.62 [12.03, 60.36] | 36.99 [17.36, 49.24] | 36.95 [17.19, 46.55] | 63.78 [22.65, 82.89]
0.010 | 0.70 | 64.36 [23.32, 83.40] | 64.67 [28.73, 81.50] | 52.43 [34.78, 67.83] | 50.80 [27.27, 71.42] | 51.86 [33.99, 69.32] | 52.22 [35.38, 64.41] | 65.47 [29.85, 81.82]
0.010 | 0.85 | 63.88 [22.41, 84.05] | 72.48 [45.60, 84.62] | 69.64 [51.37, 82.45] | 68.40 [45.01, 82.47] | 68.78 [49.21, 82.36] | 69.30 [53.59, 81.87] | 73.90 [50.57, 84.88]
0.036 | 0.55 | −33.42 [−174.59, 36.14] | 0.20 [−18.45, 9.11] | −8.75 [−78.03, 24.28] | −7.88 [−85.10, 25.17] | −8.71 [−78.02, 24.26] | −8.64 [−75.76, 24.24] | 0.16 [−9.50, 6.14]
0.036 | 0.70 | −32.41 [−175.35, 37.40] | 15.22 [−37.33, 38.73] | 15.31 [−45.65, 42.40] | 15.00 [−54.47, 41.61] | 15.32 [−47.18, 42.06] | 15.89 [−40.26, 42.30] | 16.09 [−19.42, 37.40]
0.036 | 0.85 | −33.67 [−178.19, 35.89] | 39.75 [−22.00, 58.87] | 40.00 [−28.11, 60.43] | 39.88 [−29.47, 59.61] | 40.36 [−26.98, 60.09] | 40.79 [−16.70, 60.60] | 41.71 [6.15, 58.76]
0.100 | 0.55 | −267.41 [−653.52, −80.94] | −0.39 [−6.14, 0.33] | −119.98 [−300.44, −30.93] | −114.87 [−351.94, −21.05] | −120.08 [−299.92, −30.90] | −118.38 [−296.99, −31.19] | 0 [0, 0]
0.100 | 0.70 | −273.28 [−639.59, −79.67] | −0.88 [−69.94, 8.09] | −76.50 [−250.21, −1.65] | −70.89 [−289.66, 3.15] | −75.69 [−261.31, 0.08] | −74.92 [−231.94, −1.82] | 0.34 [−25.28, 6.36]
0.100 | 0.85 | −273.43 [−651.14, −79.63] | 15.18 [−67.59, 32.45] | −22.82 [−163.39, 30.21] | −20.14 [−181.89, 31.24] | −21.76 [−175.62, 31.09] | −21.99 [−146.81, 29.94] | 17.55 [15.33, 31.78]

Note: The bolded values are the best performing within the group of cutpoint methods for that given metric (column).

Performance under poor model calibration

When the model was calibrated, the INB using the cost-minimizing cutpoint (the method described in Equation 1) was equivalent to that of our proposed value-optimizing cutpoint selection approach. However, for the inpatient falls case study, when the model was poorly calibrated, the INB for the cost-minimizing cutpoint worsened while our value-optimizing cutpoint was unaffected. This difference between the approaches was not observed for ICU readmission, likely because both the cost-minimizing and value-optimizing approaches selected relatively high, albeit different, cutpoints with relatively few predicted probabilities between them, meaning that few samples were classified differently by these 2 methods. This comparison between the value-optimizing and cost-minimizing approaches is visualized for the inpatient falls model in Figure 3, and the full comparisons between all cutpoint selection methods are shown in Supplementary Figures S3 and S4 for the ICU and falls use-cases, respectively.

Figure 3.

Cutpoint performance under varying levels of model calibration for the ICU discharge model (A) and falls prevention model (B).

DISCUSSION

In this article, we applied a range of methods for selecting cutpoint thresholds and evaluated them on their ability to optimize downstream costs and outcomes, indicated by NMB, for differing intervention and target population scenarios. The overarching finding was that the proposed value-optimizing method, which incorporated cost and intervention effectiveness considerations into the threshold selection procedure, has potential to improve the value of model recommendations in comparison to competing threshold selection methods. The sensitivity scenario analyses indicated that the extent of the benefit from considering intervention costs and effects when selecting thresholds for decision support recommendations is likely to depend on local patient and event rate contexts, predictive model discrimination, and the outcomes associated with intervention implementation or nonimplementation. While it was expected that our approach, which optimized NMB, would typically lead to the best outcome, our findings were also robust to uncertainty in model inputs and to miscalibration. It is noteworthy that the value-optimizing threshold selection approach performed favorably against other approaches across all sensitivity analysis scenarios in the 2 use-cases presented in this study.

This new value-optimizing approach may be particularly beneficial in clinical settings with time and resource constraints, where the downstream impacts of computerized clinical decision support system recommendations are important pragmatic considerations. Additionally, this approach can be used to assess the estimated economic benefit of a clinical prediction model prior to implementation. The comparison to treat-all and treat-none included in the present study demonstrated a potential approach to estimating the theoretical expected benefit (or detriment). This can be weighed against implementation costs of using the model-informed intervention for policy and practice decisions regarding decision support implementation.

Although this study presents a new method for improving clinical decision support systems, findings from the 2 use-cases are consistent with prior research indicating that threshold selection can, and perhaps should, be a dynamic process tailored to the clinical context, capable of incorporating both costs and outcomes.22 However, costs and outcomes are often stochastic measures subject to significant variability, even in tightly controlled experiments.46 It was encouraging to observe that the value-optimizing method proposed in this study was robust to uncertainty in these costs and outcomes. Trends in prediction model development and implementation highlight the need to assess the potential usefulness of models in terms of discrimination, calibration, and the consequences of interventions resulting from model classification before implementation.47 Accordingly, our proposed method can be used to better align prediction model implementation with the latter goal, supporting transitions toward value-based care.

A broad challenge relevant to threshold selection is that published models require external validation and are likely to perform best when tailored to a target population, ideally with patient-level data. With the increasing digitization of hospital systems including electronic medical records, these data may be increasingly available at the hospital level and can inform continuous quality improvement. Findings from this study add weight to the importance of local calibration with local factors, including event rate and context-specific model performance, and their likely substantial impact on the operational value of specific decision support solutions like those in the included case studies.

It is important to acknowledge that at a conceptual level, there are also potential ethical considerations associated with the exclusion or inclusion of downstream outcomes and costs during cutpoint development. It is plausible that cutpoint selection for decision support systems may bias treatment decisions away from groups who may gain less benefit from a decision support recommendation, or cost more to treat, when based on published evidence alone. Each patient is different, and the benefit they experience may not conform to the inputs and expected outcomes indicated from approaches like that proposed in this study. The method we have proposed in this study attempts to incorporate some of this nuance, but in practice this must be carefully considered by the treating clinical team, ideally in partnership with the patient, including through shared decision making. Binary clinical decision support system recommendations do not equate to automated treatment decisions. The analyses performed in the present study focused on comparing different methods for cutpoint selection for prediction models, but did not aim to consider this in comparison to a clinically informed protocol for intervention assignment, or incorporate clinical reasoning to guide intervention assignment on a case-by-case basis. Clinical decision support systems are typically intended to provide healthcare providers with information to guide decisions, but not necessarily to have recommendations followed in all cases, as healthcare providers assign interventions to individual patients based on their understanding of the wider clinical context to best meet the needs of the patient within their prevailing local context and circumstances.

While there is merit in considering ROC-based methods for treatment decisions, these metrics alone lack context without the real-world implications of the treatments they recommend. For example, in clinical contexts where false negatives could be associated with substantial patient detriment and greater downstream healthcare requirements, the ethical choice may be to favor sensitivity over specificity to ensure fewer patients are missed. In contrast, specificity may be preferred to sensitivity when treatment can lead to significant and unwanted patient burden or healthcare waste. There is risk, however, in focusing only on either patient outcomes or healthcare resource use, without also considering the other. Informed ethical choice is likely to require consideration of a variety of factors, including consideration of patient outcomes and healthcare resources required to achieve those outcomes. If cutpoint selection was to focus solely on maximizing patient outcomes without considering resource use and cost implications in a clinical environment with constrained resources, clinical teams may be overwhelmed by the consequences of a highly sensitive cutpoint selection that diverted resources, including healthcare provider time, away from other high value clinical care activities. Including both patient outcomes and costs associated with decision support recommendations via NMB using the proposed method offers a potential informatics-informed solution for considering both positive and negative consequences in a constructive way during cutpoint selection for decision support systems.

In general, across both use-cases, the benefit of value-optimizing threshold selection relative to other cutpoint selection methods was greatest when events were rarer and model discrimination was worse; when the opposite was true, the performance of the value-optimizing method was similar to other methods regarding INB. Although the cost-minimizing approach performed quite similarly to the value-optimizing approach, our primary simulations created only calibrated models, and calibration is an assumption required for the cost-minimizing approach proposed by Wynants et al22 to perform as expected. When simulating poorly calibrated models, the value-optimizing approach was unaffected, in contrast to the cost-minimizing thresholds, which had a reduced INB. Unfortunately, published models that are reported to be calibrated in their development setting may not be well calibrated when used elsewhere. The cost-minimizing threshold selects cutpoints based only on the relative costs of correctly classifying a positive versus negative case (Equation 1) and does not incorporate any of the observed data. It therefore requires that the predicted probabilities be well calibrated to perform as expected. In contrast, the proposed value-optimizing cutpoint method finds the threshold that maximizes the total NMB (Equation 2) given the number of, and NMB associated with, each possible classification (true positive, false positive, true negative, and false negative) in the dataset. It does not require that the predicted probabilities be well calibrated to work as expected, but it does require that the validation dataset share the same level of calibration as the data used to obtain the cutpoint.

When adopting an existing model into a healthcare setting, findings from the present study highlight the potential benefit of estimating cutpoints based on local costs, events, and predicted probabilities rather than assuming that the same cutpoint will be appropriate or that the model will remain calibrated in an external setting. Although it would be optimal for clinical informaticians deploying externally developed models to evaluate their calibration locally and adjust the model as necessary, this may not always occur. To aid the digital health community, we include a Shiny app (https://github.com/RWParsons/cost-effective-cpms-app) that allows the user to upload or use simulated data and associated NMB values to estimate a cutpoint using the proposed value-optimizing method and the compared methods included in this study.

Limitations

The proposed method is sensitive to the choice of input parameters and their uncertainty. This can be a significant limitation of the value-optimizing threshold selection approach if inputs (treatment effect, treatment costs, and utility decrement) are not available, or are not well suited to the local target population or intervention. However, by identifying appropriate input parameters through directly analyzing hospital and patient characteristics, for example, the individual cost of an ICU bed day and the payment penalty for an in-hospital adverse event, estimates of NMB and the selection of an optimal cutpoint can be well informed by local data.

An important limitation of any modeling work of this nature is that it does not simulate true clinical judgment that would occur in the absence of an applicable prediction model, nor the level of divergence from model recommendations in real-world clinical practice. The decision to use a cutpoint and assign predicted classes based on the model assumes that the user is being provided these predicted classes rather than a predicted probability. Although the former is common within clinical decision support systems, the latter may allow the user to make a better-informed decision while considering the specific patient context alongside predicted and competing risks. Clinicians may choose to diverge from model recommendations based on evidence-informed clinical reasoning. The default strategies of treating all or treating none are improbable in hospital settings where clinicians must make well-considered or instinctive judgments based on clinical knowledge and experience.48 In many cases, for example, when anticipating clinical deterioration, these judgments are likely to be sound.49 Accordingly, there must be sufficient added utility from the implementation of both prediction models and the treatments they recommend to justify their implementation.

CONCLUSION

In this study, we proposed a cutpoint selection method that maximizes the NMB when used to guide the allocation of patients to treatment strategies. To do this, we incorporate the cost and effectiveness of the intervention, alongside the cost of the outcome, and use an existing cutpoint selection framework to maximize a function that evaluates the NMB. By simulation, we show that it is often value-maximizing compared with existing cutpoint methods and that it is robust to poor model calibration.


Contributor Information

Rex Parsons, Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Australia.

Robin Blythe, Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Australia.

Susanna M Cramb, Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Australia; Jamieson Trauma Institute, Royal Brisbane and Women’s Hospital, Metro North Health, Herston, Australia.

Steven M McPhail, Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Australia; Digital Health and Informatics, Metro South Health, Woolloongabba, Australia.

AUTHOR CONTRIBUTIONS

RP, RB, and SMM conceived the idea for this study. RP and RB designed the study and performed the analyses. All authors contributed to the writing, finalized the manuscript, and reviewed and approved the final manuscript.

FUNDING

This work was supported by the Digital Health Cooperative Research Centre (“DHCRC”). DHCRC is funded under the Commonwealth’s Cooperative Research Centres (CRC) Program. SMM and SMC are supported by NHMRC-administered fellowships (#1181138 and #2008313, respectively). No patients or members of the public were included in this study.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

CONFLICT OF INTEREST STATEMENT

None declared.

DATA AVAILABILITY

The data used in this study were simulated. The code used to perform these simulations is available at https://github.com/RWParsons/cost-effective-cpms. A Shiny app to simulate data using the same approach and user-selected inputs and visualize selected thresholds under all included cutpoint selection methods is available at https://github.com/RWParsons/cost-effective-cpms-app.

REFERENCES

1. Adibi A, Sadatsafavi M, Ioannidis JPA. Validation and utility testing of clinical prediction models: time to change the approach. JAMA 2020; 324 (3): 235–6.
2. Grunkemeier GL, Jin R. Receiver operating characteristic curve analysis of clinical risk models. Ann Thorac Surg 2001; 72 (2): 323–6.
3. Fluss R, Faraggi D, Reiser B. Estimation of the Youden index and its associated cutoff point. Biom J 2005; 47 (4): 458–72.
4. Rota M, Antolini L, Valsecchi MG. Optimal cut-point definition in biomarkers: the case of censored failure time outcome. BMC Med Res Methodol 2015; 15: 24.
5. Liu X. Classification accuracy and cut point selection. Stat Med 2012; 31 (23): 2676–86.
6. Unal I. Defining an optimal cut-point value in ROC analysis: an alternative approach. Comput Math Methods Med 2017; 2017: 3762651.
7. Moons KG, Harrell FE. Sensitivity and specificity should be de-emphasized in diagnostic accuracy studies. Acad Radiol 2003; 10 (6): 670–2.
8. Moons KG, van Es G-A, Deckers JW, Habbema JDF, Grobbee DE. Limitations of sensitivity, specificity, likelihood ratio, and Bayes' theorem in assessing diagnostic probabilities: a clinical example. Epidemiology 1997; 8 (1): 12–7.
9. Romero-Brufau S, Huddleston JM, Escobar GJ, Liebow M. Why the C-statistic is not informative to evaluate early warning scores and what metrics to use. Crit Care 2015; 19 (1): 285.
10. Linn S, Grunau PD. New patient-oriented summary measure of net total gain in certainty for dichotomous diagnostic tests. Epidemiol Perspect Innov 2006; 3: 11.
11. Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res 2019; 3: 18.
12. Fitzgerald M, Saville BR, Lewis RJ. Decision curve analysis. JAMA 2015; 313 (4): 409–10.
13. Vickers AJ, Cronin AM. Traditional statistical methods for evaluating prediction models are uninformative as to clinical value: towards a decision analytic framework. Semin Oncol 2010; 37 (1): 31–8.
14. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 2006; 26 (6): 565–74.
15. Areia M, Mori Y, Correale L, et al. Cost-effectiveness of artificial intelligence for screening colonoscopy: a modelling study. Lancet Digit Health 2022; 4 (6): e436–44.
16. van Giessen A, Peters J, Wilcher B, et al. Systematic review of health economic impact evaluations of risk prediction models: stop developing, start evaluating. Value Health 2017; 20 (4): 718–26.
17. Parsons R, Blythe RD, Cramb SM, McPhail SM. Inpatient fall prediction models: a scoping review. Gerontology 2022; 69 (1): 14–29.
18. Marschollek M, Gövercin M, Rust S, et al. Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups. BMC Med Inform Decis Making 2012; 12: 19.
19. Dinh TA, Rosner BI, Atwood JC, et al. Health benefits and cost-effectiveness of primary genetic screening for Lynch syndrome in the general population. Cancer Prev Res 2011; 4 (1): 9–22.
20. Lippuner K, Johansson H, Borgström F, Kanis JA, Rizzoli R. Cost-effective intervention thresholds against osteoporotic fractures based on FRAX® in Switzerland. Osteoporos Int 2012; 23 (11): 2579–89.
21. Ström O, Jönsson B, Kanis JA. Intervention thresholds for denosumab in the UK using a FRAX®-based cost-effectiveness analysis. Osteoporos Int 2013; 24 (4): 1491–502.
22. Wynants L, van Smeden M, McLernon DJ, et al. Three myths about risk thresholds for prediction models. BMC Med 2019; 17 (1): 192.
23. Le P, Martinez KA, Pappas MA, Rothberg MB. A decision model to estimate a risk threshold for venous thromboembolism prophylaxis in hospitalized medical patients. J Thromb Haemost 2017; 15 (6): 1132–41.
24. Collins GS, de Groot JA, Dutton S, et al. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol 2014; 14 (1): 40.
25. Stinnett AA, Mullahy J. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decis Making 1998; 18 (2 Suppl): S68–80.
26. Markazi-Moghaddam N, Fathi M, Ramezankhani A. Risk prediction models for intensive care unit readmission: a systematic review of methodology and applicability. Aust Crit Care 2020; 33 (4): 367–74.
27. Haines TP, Hill AM, Hill KD, et al. Cost effectiveness of patient education for the prevention of falls in hospital: economic evaluation from a randomized controlled trial. BMC Med 2013; 11: 135.
28. Morello RT, Barker AL, Watts JJ, et al. The extra resource burden of in-hospital falls: a cost of falls study. Med J Aust 2015; 203 (9): 367.
29. Badawi O, Breslow MJ. Readmissions and death after ICU discharge: development and validation of two predictive models. PLoS ONE 2012; 7 (11): e48758.
30. Drummond MF, Sculpher MJ, Claxton K, Stoddart GL, Torrance GW. Methods for the Economic Evaluation of Health Care Programmes. Oxford University Press; 2015.
31. Edney LC, Haji Ali Afzali H, Cheng TC, Karnon J. Estimating the reference incremental cost-effectiveness ratio for the Australian health system. Pharmacoeconomics 2018; 36 (2): 239–52.
32. Gold MR, Siegel JE, Russell LB, Weinstein MC, eds. Cost-Effectiveness in Health and Medicine. USA: Oxford University Press; 1996.
33. Delignette-Muller ML, Dutang C. fitdistrplus: an R package for fitting distributions. J Stat Softw 2015; 64 (4): 1–34.
34. Manning WG, Basu A, Mullahy J. Generalized modeling approaches to risk adjustment of skewed outcomes data. J Health Econ 2005; 24 (3): 465–88.
35. Chen LM, Martin CM, Keenan SP, Sibbald WJ. Patients readmitted to the intensive care unit during the same hospitalization: clinical features and outcomes. Crit Care Med 1998; 26 (11): 1834–41.
36. Page K, Barnett AG, Graves N. What is a hospital bed day worth? A contingent valuation study of hospital Chief Executive Officers. BMC Health Serv Res 2017; 17 (1): 137.
37. de Vos J, Visser LA, de Beer AA, et al. The potential cost-effectiveness of a machine learning tool that can prevent untimely intensive care unit discharge. Value Health 2022; 25 (3): 359–67.
38. Hicks P, Huckson S, Fenney E, Leggett I, Pilcher D, Litton E. The financial cost of intensive care in Australia: a multicentre registry study. Med J Aust 2019; 211 (7): 324–5.
39. Hill A-M, McPhail SM, Waldron N, et al. Fall rates in hospital rehabilitation units after individualised patient and staff education programmes: a pragmatic, stepped-wedge, cluster-randomised controlled trial. Lancet 2015; 385 (9987): 2592–9.
40. Morello RT, Barker AL, Watts JJ, et al. The extra resource burden of in-hospital falls: a cost of falls study. Med J Aust 2015; 203 (9): 367.
41. Haines TP, Hill AM, Hill KD, et al. Patient education to prevent falls among older hospital inpatients: a randomized controlled trial. Arch Intern Med 2011; 171 (6): 516–24.
42. Latimer N, Dixon S, Drahota AK, Severs M. Cost–utility analysis of a shock-absorbing floor intervention to prevent injuries from falls in hospital wards for older people. Age Ageing 2013; 42 (5): 641–5.
43. Salgado JF. Transforming the area under the normal curve (AUC) into Cohen's d, Pearson's rpb, odds-ratio, and natural log odds-ratio: two conversion tables. Eur J Psychol Appl Legal Context 2018; 10 (1): 35–47.
44. Ensor J, Martin EC, Riley RD. pmsampsize: calculates the minimum sample size required for developing a multivariable prediction model. R package version 1.1.2; 2022.
45. Thiele C, Hirschfeld G. cutpointr: improved estimation and validation of optimal cutpoints in R. J Stat Softw 2021; 98 (11): 1–27.
46. Briggs AH. Handling uncertainty in cost-effectiveness models. Pharmacoeconomics 2000; 17 (5): 479–500.
47. Hendriksen JM, Geersing GJ, Moons KG, de Groot JA. Diagnostic and prognostic prediction models. J Thromb Haemost 2013; 11 (Suppl 1): 129–41.
48. Kappen TH, van Klei WA, van Wolfswinkel L, Kalkman CJ, Vergouwe Y, Moons KGM. Evaluating the impact of prediction models: lessons learned, challenges, and recommendations. Diagn Progn Res 2018; 2: 11.
49. Winter MC, Kubis S, Bonafide CP. Beyond reporting early warning score sensitivity: the temporal relationship and clinical relevance of "true positive" alerts that precede critical deterioration. J Hosp Med 2019; 14 (3): 138–43.
