Abstract
Clinical predictive models use a patient’s baseline demographic and clinical data to make predictions about patient outcomes and have the potential to aid clinical decision making. The extent to which such models have been developed for equine patients is unknown. Using PubMed and Google Scholar, we systematically reviewed the predictive models currently described for use in equine patients. Models were eligible for inclusion if they were published in a peer-reviewed article as a multivariable model used to predict a clinical/laboratory/imaging outcome in an individual horse or herd. The agreement of at least two authors was required for model inclusion. We summarised the patient populations, model development methods, performance metric reporting, and validation efforts, and, using PROBAST, assessed the risk of bias and applicability concerns for these models. In addition, we summarised the index conditions for which models were developed and provided detailed information on included models. A total of 90 predictive models and 9 external validation studies were included in the final systematic review. A plurality of models (41%) was developed to predict outcomes associated with colic, e.g. need for surgery or survival to discharge. All included models were at high risk of bias, defined as failing one or more PROBAST signalling questions, primarily for analysis-related reasons. Importantly, a high risk of bias does not necessarily mean that models are unusable, but that they require more careful consideration prior to clinical use. Concerns for applicability were low for the majority of models. Systematic reviews such as this can serve to increase veterinarians’ awareness of predictive models, including evaluation of their performance and their use in different patient populations.
Keywords: horse, prediction, predictive model, prognostic model, diagnostic model, colic
Introduction
Predictive models are often used in human medicine for enhanced clinical decision making and patient support. These statistical models typically utilise several sociodemographic and clinical parameters to categorise patients as well as predict the likelihood of various clinical outcomes, such as mortality within 30 days of a myocardial infarction or the need for haemodialysis or transplantation in those with chronic kidney disease.1,2 Despite a proliferation of these models in the last ~20 years, only a small percentage of published models actually find use in clinical practice and enhance shared decision making between the patient and their provider.3,4
Spurred by a desire to make clinical decisions less subjective,5 multivariable clinical predictive models first appeared in the equine medicine literature in the 1980s.6,7 Although many predictive models have been developed since, very few seem to have gained any traction in the clinic. While some of this is likely due to methodological issues and clinicians’ preference for their own intuition in clinical decision making, we believe a review of the diversity and types of predictive models in the equine literature may allow clinicians to add to their clinical decision-making toolset. No systematic review of clinical predictive models in equids exists.
The goal of this systematic review was to identify clinical predictive models in equine medicine and provide their population characteristics, methods, predictors, and outcomes. In addition, we assessed the quality of predictive model performance metric reporting, risk of bias, and applicability concerns.
Predictive model background
Predictive, or prediction, models are ostensibly developed as aids to clinical decision making or as tools for more refined patient prognostication. These models use baseline demographic, clinical, laboratory, and/or imaging characteristics to predict the probability of an outcome (see clinical vignette in Box 1). Whether these models actually see use in the clinic, however, is dependent on several factors – model performance, model validation, and risk of bias or applicability concerns – in addition to clinician preferences.3,4
Box 1: Clinical vignette of predictive model29 use in an equine patient.
A 22-year-old Quarter Horse mare presented to a clinic for signs of acute colic.
The mare displayed intermittent signs of pain (pawing and flank staring), had weakened digital pulses and abnormal rectal palpation findings (possible dilated small intestinal loops), and decreased but not entirely absent GI sounds.
The clinician, who had to decide whether to send the horse to the operating room for an exploratory laparotomy, used a predictive model to clarify this mare’s predicted risk of needing surgery.
The predictive model this clinician used29 included age, sex, breed, frequency of pain, rectal palpation findings, peripheral pulse quality, and borborygmi as predictor variables.
After inputting these baseline characteristics into the model’s equation, the clinician found that the mare’s probability of needing surgery was 85%.
This result, in combination with the clinician’s a priori knowledge, led to the horse being sent to the operating room with a suspected strangulating mesenteric lipoma.
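To illustrate how a model like the one in Box 1 turns baseline characteristics into a probability, the minimal sketch below applies the logistic function to a linear predictor. The coefficients, predictor coding, and resulting probability are invented for illustration only and are not those of the published model.29

```python
import math

# Hypothetical coefficients for illustration only -- not the published model.
INTERCEPT = -3.2
COEFFICIENTS = {
    "age_years": 0.04,           # per year of age
    "pain_frequent": 1.1,        # 1 if pain is frequent or continuous, else 0
    "rectal_abnormal": 1.6,      # 1 if rectal palpation findings are abnormal
    "pulse_weak": 0.9,           # 1 if peripheral pulse quality is reduced
    "borborygmi_decreased": 0.8  # 1 if gut sounds are decreased or absent
}

def predicted_probability(predictors):
    """Convert predictor values into a probability via the logistic function."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in predictors.items()
    )
    return 1 / (1 + math.exp(-linear_predictor))

# Mare from the vignette, coded with illustrative values only.
mare = {"age_years": 22, "pain_frequent": 1, "rectal_abnormal": 1,
        "pulse_weak": 1, "borborygmi_decreased": 1}
print(f"Predicted probability of needing surgery: {predicted_probability(mare):.0%}")
```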
Model performance
There are several different qualities to assess when measuring the performance of predictive models. These include calibration, discrimination, accuracy, sensitivity, specificity, and positive and negative likelihood ratios.
Calibration refers to the extent of agreement between the estimated and observed number of events, and how well the model predicts overall risk. This is commonly assessed using a Hosmer-Lemeshow test, although other methods such as calibration plots and calibration slopes are preferable.8 Discrimination refers to the ability of a model to differentiate between individuals who have the outcome and those who do not. Discrimination is typically expressed as the area under a receiver operating characteristic curve (AUC) or an equivalent value, the C-statistic, both of which range from 0.5 to 1. This can be thought of as “the probability that for a randomly selected pair of patients with and without the disease/condition, the patient with the disease/condition has a result indicating greater suspicion.”9 When interpreting these values, an AUC of 0.5 indicates that the model discriminates no better than a coin flip, while an AUC of 0.7–0.8 is considered acceptable, 0.8–0.9 excellent, and greater than 0.9 outstanding.9
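As a minimal sketch of how these two properties might be computed for a set of model predictions, the example below uses invented outcomes and predicted probabilities and assumes Python with scikit-learn is available; it is an illustration of the concepts rather than any included model’s method.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.calibration import calibration_curve

# Observed outcomes (1 = event) and model-predicted probabilities for ten horses (invented).
y_true = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 1])
y_prob = np.array([0.05, 0.20, 0.10, 0.80, 0.60, 0.65, 0.90, 0.15, 0.55, 0.70])

# Discrimination: probability that a randomly chosen event case is ranked
# above a randomly chosen non-event case.
auc = roc_auc_score(y_true, y_prob)

# Calibration: compare observed event rates with mean predicted risk within risk groups;
# plotting these pairs (a calibration plot) is preferable to a Hosmer-Lemeshow test.
obs_rate, mean_pred = calibration_curve(y_true, y_prob, n_bins=3)

print(f"AUC (C-statistic): {auc:.2f}")
for observed, predicted in zip(obs_rate, mean_pred):
    print(f"mean predicted risk {predicted:.2f} vs observed event rate {observed:.2f}")
```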
Other performance parameters such as accuracy, sensitivity, specificity, and positive/negative likelihood ratios are determined according to a particular test cut-off value (e.g. probability of outcome > 0.5 is a positive test). Accuracy is the number of correct predictions divided by the total number of predictions. Sensitivity is the number of true positives divided by the number of true positives and false negatives. Specificity is the number of true negatives divided by the number of true negatives and false positives. The positive likelihood ratio (PLR) is the ratio of the true positive rate (sensitivity) to the false positive rate (1 − specificity), and the negative likelihood ratio (NLR) is the ratio of the false negative rate (1 − sensitivity) to the true negative rate (specificity). Unlike positive and negative predictive values, these likelihood ratios are prevalence-independent, so they can be used to calculate post-test probabilities in different populations (Methods S1).
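A minimal sketch of these calculations, using an invented 2×2 table, is shown below; the final step converts a pre-test probability to a post-test probability through odds, following the logic described in Methods S1.

```python
def classification_metrics(tp, fp, fn, tn):
    """Metrics derived from a 2x2 table at a chosen probability cut-off."""
    sensitivity = tp / (tp + fn)           # true positive rate
    specificity = tn / (tn + fp)           # true negative rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    plr = sensitivity / (1 - specificity)  # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity  # negative likelihood ratio
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "PLR": plr, "NLR": nlr}

def post_test_probability(pre_test_prob, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Invented 2x2 table: 40 true positives, 10 false positives, 5 false negatives, 45 true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(metrics)

# In a hypothetical clinic where 20% of cases have the outcome,
# a positive result updates the probability as follows:
print(f"{post_test_probability(0.20, metrics['PLR']):.0%}")
```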
Model validation
Validation of predictive models is necessary to understand how well the model performs in data that were not used to develop the model. Validation can be classified broadly as internal or external. Internal validation uses the same cohort on which the model was developed; this can be accomplished by randomly splitting the dataset into model training and test datasets, k-fold cross-validation, or bootstrap methods. External validation assesses the performance of the model in a different geographic, temporal, or care setting from the cohort on which the model was developed.
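The sketch below illustrates internal validation by k-fold cross-validation on a simulated development cohort (invented data, assuming Python with scikit-learn); it also shows why “apparent” performance measured on the same data used to fit the model tends to be optimistic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Simulated development cohort: 200 horses, 3 baseline predictors, binary outcome.
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

# Apparent performance: AUC measured on the same data used to fit the model (optimistic).
model = LogisticRegression().fit(X, y)
apparent_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

# Internal validation by 5-fold cross-validation: each fold is predicted by a model
# fitted on the remaining folds, giving a less optimistic estimate of discrimination.
cv_auc = cross_val_score(LogisticRegression(), X, y, cv=5, scoring="roc_auc").mean()

print(f"Apparent AUC: {apparent_auc:.2f}, cross-validated AUC: {cv_auc:.2f}")
```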
Risk of bias and applicability concerns
Bias in predictive models occurs when predicted results systematically deviate from the truth. Bias can result from the population on which the model was developed, the predictors used, outcomes assessed, or the analytic choices made. Many models are at high risk for bias due to suboptimal analysis methods, such as exclusion of individuals with one or more missing variables and dichotomisation of continuous predictor variables, among others, which can sometimes be unavoidable given the available data.10
Applicability concerns arise when it is unclear how well the model translates to a clinical setting. These concerns can be due to the population on which the model was originally developed, which may be a very specific population from which predictions are not easily generalised. The predictors and outcomes assessed can also raise applicability concerns, such as when there is a mismatch between the timing of predictor assessment and the optimal time for model use, e.g. using an outside laboratory diagnostic in a model that predicts the need for emergent surgery. Lastly, poor reporting can result in an inability to apply the model, such as when cut-off values are not reported in models reporting sensitivity and specificity.
Methods
Search strategy
Two data sources – PubMed and Google Scholar – were searched using the following search strings: [“veterinary” AND “clinical” AND (“predictive model” OR “predictive analytics” OR “prediction model”)] on 26 October 2021 and [“veterinary” AND “clinical” AND (“receiver operating curve” OR “scoring system” OR “area under curve”)] on 22 November 2021. Freeware software, Publish or Perish,11 was used to generate a citation list, while Rayyan,12 another freeware platform, was used to screen articles and extract data from those included.
Initial screening of article titles and abstracts was performed to identify and exclude those articles that were duplicates, not published in English, or unrelated to clinical predictive modelling in veterinary medicine. Articles identified for full-text review were classified according to subject area: equine, small animal, food and fibre animal, and exotic/wildlife/zoo/lab animal. Full-text review was performed for articles that met our inclusion criteria.
Criteria for inclusion in this review were any peer-reviewed article that reported a multivariable model explicitly intended to predict a clinical/laboratory/imaging outcome in an individual equid or herd. Models were also included if, after a multivariable model selection process, only one predictor variable was retained or if they assessed a scoring system with several constituent variables. Included models could be diagnostic, predicting the likelihood of having a particular disease, or prognostic, predicting future risk of an outcome such as death/euthanasia, recurrence of clinical signs, or incident disease. Models that did not report an equation for predicting the outcome or at least one measure of predictive performance – discrimination, accuracy, sensitivity/specificity, positive or negative likelihood ratios/predictive values – were considered association studies and excluded. Other exclusion criteria were any models that reported a nonclinical outcome (e.g. racing performance), models used to predict organ or tumour size, or pharmacokinetic models unless used to predict concentrations of a drug commonly monitored in clinical practice, such as digoxin.
References for articles deemed to meet the inclusion and exclusion criteria were then searched for additional articles that may have met our pre-specified inclusion criteria.
Study, population, predictor, and outcome characteristics
General study characteristics were extracted, including publication year, retrospective or prospective nature of the study population, country in which the model was developed, care setting (ambulatory primary care/private referral/university), index condition (e.g. colic), sub-condition (e.g. small intestinal strangulating lesion), and time point (e.g. after physical exam and point-of-care bloodwork) for model use. For articles reporting multiple nested multivariable models that would be used at the same time point, only the best performing model was reported. Population characteristics extracted included the number of individuals the model was developed on, as well as any age and breed restrictions of those individuals. In models with survival as the outcome, data on how euthanised animals were included or excluded were also extracted. The final predictors, outcome, outcome assessment method, and type of model (e.g. logistic regression) were also extracted.
Model performance and validation
Model performance metrics for calibration and discrimination were extracted, as were accuracy, sensitivity, specificity, and overall performance metrics (e.g. R2). Positive/negative likelihood ratios were extracted or calculated; these were reported preferentially over positive/negative predictive values because likelihood ratios are prevalence-independent and thus applicable to new populations. When available, model performance metrics of the validation data were reported instead of the development data. Data on model validation were also extracted, including the type of validation, internal vs. external, and number of individuals in the validation cohort.
Risk of bias assessment
The Prediction model Risk Of Bias ASsessment Tool (PROBAST), in both short form and long form, is a tool used to qualitatively evaluate the risk of bias in predictive models.10,13 While the long form PROBAST tool uses 20 signalling questions across 4 domains (participants, predictors, outcome, and analysis) to determine risk of bias (ROB), the short form tool condenses down to just 6 signalling questions – 1 outcome-related and 5 analysis-related – to assess whether a predictive model suffers from a high ROB.10,13 The analysis questions pertain to how continuous variables and missing data were handled, whether univariate screening was performed, whether there were sufficient events per candidate predictor, and whether there was any correction for overfitting or optimism of the model. In both the short form and long form PROBAST, if the answer to any single signalling question indicates that suboptimal methods were employed, the model is categorised as high ROB. This stringency means that even well-designed predictive models can be at high risk of bias if one PROBAST criterion is not satisfied. To evaluate the models included in this study, a single author applied the short form PROBAST tool to classify models as being at high, low, or unclear ROB. Classification by a single author, as opposed to multiple, had the potential to introduce bias, but this was thought to be unlikely as the threshold for a model to be considered at low risk of bias is exceedingly difficult to reach. For any models with low or unclear ROB using the short form PROBAST, the long form PROBAST was used to clarify ROB.13 The long form PROBAST was also used to assess applicability concerns.13
Results
General study characteristics
A total of 63 peer-reviewed articles containing 90 predictive models and 9 external validation studies were included after initial screening, full-text review, and reference searching (Figure 1). These articles were published between 1983 and 2021, with 9 models or validations (9%) published prior to 1990, 15 (15%) from 1990 to 2000, 19 (19%) from 2000 to 2010, 48 (48%) from 2010 to 2020, and 8 (8%) from 2020 to 2021. The vast majority of models and validations were from North America (65%) and Europe (33%), with 3 models from South America (3%) and only one model each reported from Australia and Asia. No models were reported from Africa. Most models were developed at least partially in university settings (80%).
Figure 1:

Results from the systematic review of equine clinical predictive models, presented as a PRISMA-style flow diagram.80
The index conditions were colic (37/90, 41%), neonatal illness (11/90, 12%), chronic pulmonary diseases (9/90, 10%), infectious diseases (8/90, 9%), lameness/ataxia (8/90, 9%), non-infectious, non-pulmonary, inflammatory diseases (5/90, 6%), general illness (3/90, 3%), pregnancy/fertility (3/90, 3%), exogenous toxins (2/90, 2%), and one model each for equine grass sickness, equine metabolic syndrome, sarcoids, and sudden death. The most common types of predictive models were logistic regression (59/90, 66%), followed by discriminant analysis (8/90, 9%), partial least squares regression (6/90, 7%), and classification and regression trees (5/90, 6%). Stepwise variable selection was most frequently used in those models developed with logistic regression and discriminant analysis.
Performance metric reporting
Model calibration metrics were reported in 41/99 models and validations (41%). Most (37/41) were reported as Hosmer-Lemeshow tests, and only two of those indicated poor model fit. The remaining calibration metrics were reported as root mean square error of cross-validation. Model discrimination was reported as AUC in 38/99 models (38%). Fifteen (39%) of those models had an AUC > 0.9, 11 were between 0.8 and 0.9 (29%), 6 were between 0.7 and 0.8 (16%), and 6 were between 0.6 and 0.7 (16%).
Accuracy was reported in 36/99 models and validations (36%) with a range of 43–100%. Sensitivity and specificity were reported in 55/99 models and validations (56%), although in one model only a qualitative assessment of “low” was reported. Positive/negative likelihood ratios or predictive values were reported as the sole performance metric in 5/99 models and validations (5%). For the model metrics that required a cut-off for determining a positive or negative test – accuracy, sensitivity, specificity, positive/negative likelihood ratios or predictive values – over 71% (51/71) of the cut-offs were reported. Nine studies were solely external validations of previously published models. Of the remaining 90 models, only 25 (28%) incorporated a validation attempt. Of those, 13/25 (52%) were internal only, 8/25 (32%) were external only, 2/25 (8%) were both internal and external, and 2/25 had unclear validation types.
Risk of bias and applicability concerns
There was a high ROB in all included models. This bias stemmed primarily from analysis-related factors. These included dichotomisation of continuous variables, inappropriate methods to address missing data, such as excluding any individual with an incomplete set of predictor variables (i.e. complete case analysis), univariate screening of candidate predictors prior to inclusion in a multivariable model, too few individuals with the outcome of interest for the number of candidate predictors selected, lack of correction for model overfitting or optimism, and a lack of validation attempts.
Applicability concerns were low in 53/90 models (59%) and high in the remainder. These concerns were primarily due to a lack of reported cut-offs for which sensitivity/specificity and accuracy were determined, no provision of an equation to calculate the probability of the outcome (or actual outcome value, in the case of linear regression), or use of predictors not available at the time outside of a research context. Individual model details including number of individuals, final predictors, outcome, performance metrics, validation efforts, ROB and applicability concerns are reported in supporting information (Tables S1–3).
Colic models
Of the 37 colic models (excluding 2 independent model validations, Table S1),14 20 had survival to hospital discharge as the primary study outcome.6,7,15–27 The top ten most commonly included predictors of short-term survival are shown in Figure 2 with the top three being heart rate, packed cell volume (PCV), and mucous membrane characteristics (colour and capillary refill time). Four models (4/37, 11%) predicted the need for exploratory laparotomy; rectal palpation findings and peritoneal fluid colour were included as predictors in 3 of 4 of these models.21,28–30 Three models (3/37, 8%) predicted ischaemic bowel or need for resection and anastomosis with peritoneal fluid characteristics, gastric reflux, and small intestine motility on ultrasound included as predictors.21,31,32 For each outcome, the models differed in the available demographic and clinical predictor variables. Other outcomes that were examined in the literature included survival at 6 months, diagnosis of primary sand colic, good quality anaesthetic recovery following exploratory laparotomy, and diagnosis of colic vs. healthy.20,26,33,34
Figure 2:

The top ten most commonly included variables in 20 equine colic predictive models with a short-term survival outcome. Predictor variables are shown in proportion to their occurrence. Not all models had the same slate of candidate variables available.
Neonatal illness models
Of the 11 neonatal illness models (excluding 7 independent validations, Table S2),35–37 5 had short-term survival outcomes, namely survival to the time of hospital discharge or survival to 10 days post-partum.27,38–41 Foals included in these models ranged in age from less than 4 to less than 14 days at the time of hospital admission. Predictors included in multiple models were absolute white blood cell or neutrophil counts (3/6, 50%), serum levels of IgG (2/6, 33%), and anion gap findings (2/6, 33%). The remaining 6 models predicted the probability of sepsis occurring. Predictors included in at least 50% of sepsis models are shown in Figure 3. For each outcome, not all models had the same candidate variables available for assessment.
Figure 3:

Variables included in at least 50% of 6 equine neonatal sepsis predictive models. Predictor variables are shown in proportion to their occurrence. Not all models had the same slate of candidate variables available.
Models for other conditions
Of the remaining 42 models (Table S3), 6 (14%) had a survival outcome for acute idiopathic colitis, equine atypical myopathy, Palestine viper (Daboia palaestinae) envenomation, emergent admission, or all-cause admission to the hospital (2 models).42–46 Twenty-four (57%) were diagnostic models designed to classify individuals as having a particular type of chronic pulmonary disease (recurrent airway obstruction or inflammatory airway disease), insect bite hypersensitivity, West Nile fever, spinal ataxia, hindlimb lameness, equine grass sickness, hyperinsulinaemia, or sarcoids.47–61 Four models (10%) were used to predict risk of incident outcomes, including race-associated sudden death, laminitis, catastrophic proximal sesamoid fracture, or poor quality anaesthetic recovery following arthroscopic or tenoscopic surgery.62–65 Two models (5%) were used to classify fertility level or predict fertility of stallions.66 One model was used to predict seropositivity for Leishmania spp.,67 and one model was used to predict Streptococcus equi subsp. equi antibody titres ≥ 3200 so as to avoid inducing purpura haemorrhagica through vaccination.68 One model was used to predict the number of days pre-partum in Standardbred mares.69
Discussion
This systematic review identified clinical predictive models used in equine medicine, reported their characteristics, and assessed the quality of model development, performance, and validation, as well as overall ROB. Not surprisingly, most models were developed for colic and neonatal illness, both extremely common conditions affecting equids. Most models were developed using logistic regression analysis in academic settings, consistent with the resources available at these institutions for research. Anecdotally, the authors could only find evidence of two models, both predicting neonatal sepsis/survival, receiving widespread use in clinical practice.41,70 This discrepancy between the number of models produced and the number used by clinicians is apparent in human medicine as well.3,4
Lack of model translation to clinical practice
There are a number of reasons why these predictive models fail to translate to use in the clinical setting.4,71 These can be due to concerns with the practical applicability of these models, such as insufficient reporting of model performance metrics and the equation used to predict the outcome, the inclusion of predictors not commonly available to clinicians, or outcomes that are not of primary interest. Even with optimal predictors, outcomes, and reporting, the available models may not be used because probabilistic knowledge of the outcome does not necessarily translate easily to decision making, or because a clinician’s decision-making is intuitive rather than analytical.4 Lastly, a lack of awareness that potentially useful predictive models exist, the absence of a manageable tool for their application, and poor performance metrics may all contribute to a lack of clinical use.
Performance metric reporting
In response to widespread poor reporting of predictive models, guidelines (“TRIPOD” guidelines) for reporting multivariable predictive models were developed in human medicine to allow for the assessment of ROB and clinical utility.72 When applying these guidelines in this review, many predictive models did not meet those criteria, especially with regard to the reporting of performance metrics or equations for prediction in individuals. Many models reporting classification metrics (sensitivity/specificity/accuracy/likelihood ratios) did not report cut-off values, limiting their utility in clinical practice, where such values can guide and inform therapy. Among the models reporting a discrimination metric, only 22/37 (60%) reported a calibration metric as well. In addition, these calibration metrics were predominantly Hosmer-Lemeshow tests, a test not recommended due to its low statistical power and its lack of informativeness regarding the type (i.e. overprediction or underprediction of risk) and extent of miscalibration. This is of concern because models with good discriminatory ability (i.e. those with a high AUC) that are miscalibrated can produce misleading predictions.
To illustrate this concept, consider a hypothetical model predicting the need for surgery among colic patients. This model’s AUC is 0.9, meaning that a random surgical horse has a 90% chance of having a higher predicted probability of needing surgery than a random non-surgical horse. Despite this good discriminatory ability, if the model is poorly calibrated and underpredicts risk, many surgical horses could still have low predicted need for surgery, resulting in delayed surgical intervention and poor outcomes. For example, in a model with good discrimination but generalised underprediction of risk, the prediction (i.e. probability of needing surgery) for a random surgical horse might be as low as 10% in a very poorly calibrated model, whereas the prediction for a random non-surgical horse might be 6%. On the other hand, if the risk of requiring surgery is overestimated, many horses may undergo expensive and unnecessary surgery, suffer from anaesthetic complications, or need to be euthanised. For example, in a model with good discrimination but generalised overprediction of risk, the prediction (i.e. probability of needing surgery) for a random surgical horse might be 95%, whereas the prediction for a random non-surgical horse needing surgery might also be extremely high at 80%. This emphasises the importance of assessing multiple model performance measures, including both discrimination and calibration.
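A small simulated example (invented data, not drawn from any included model) makes the same point: because the AUC depends only on the ranking of predictions, a model that systematically underpredicts risk can show identical discrimination to a well-calibrated one while its predicted risks sit far below the observed event rate.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Simulate "true" surgical risks and observed outcomes for 1000 hypothetical colic cases.
true_risk = rng.uniform(0.05, 0.95, size=1000)
needs_surgery = rng.binomial(1, true_risk)

# A miscalibrated model that underpredicts risk but preserves the ranking of cases.
underpredicted = true_risk / 4

print(f"AUC, well calibrated:  {roc_auc_score(needs_surgery, true_risk):.2f}")
print(f"AUC, underpredicting:  {roc_auc_score(needs_surgery, underpredicted):.2f}")  # identical
print(f"Observed surgery rate: {needs_surgery.mean():.2f}")
print(f"Mean predicted risk:   {underpredicted.mean():.2f}")  # far below the observed rate
```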
Lack of validation
Most models included in this review (70%) had no reported validation, indicating that model performance metrics are almost certainly overoptimistic when applied to new populations.72 Incomplete performance assessments like this could lead to inaccurate predictions. This could have negative consequences if such a model were to be used clinically as clinicians may have an inappropriately high sense of certainty regarding the probability of an outcome. One model that is widely used, the modified equine neonatal sepsis score,70 has been externally validated several times with variable degrees of performance.35–37 In these external validations, model performance was much worse than in the development cohort. By altering the cut-off point and essentially recalibrating the model, however, model performance was improved. This illustrates the need for both model validation and updating, through recalibration or other updating methodology, prior to clinical use in populations that differ from the development cohort to ensure accuracy of the model’s predictions.
Risk of bias
The finding of high ROB for all models is consistent with findings in systematic reviews of predictive models in human medicine.73–75 In fact, models often had several reasons for having a high ROB. While we did not quantify all of them, risk of bias was largely attributable to analysis methods, which account for most of the items on the PROBAST short form. In most models, there were too few events for the number of predictors screened, risking over- or underfitting of the data.76 Likewise, missing data were often excluded without justification, instead of multiply imputed, potentially introducing bias and reducing statistical power.76 Lastly, variables were often screened in a univariate fashion, which can result in omission of important predictors that are only significantly associated with the outcome in the context of other variables.76 Other reasons for having a high ROB, including participant, predictor, and outcome-related reasons, were also present, albeit less commonly. As addressed previously, a lack of validation is also consistent with a high ROB. Notably, many of the included models were published before the establishment of best practice standards.13,76 Importantly, a high ROB does not necessarily mean that models are unusable, but that they require more careful consideration prior to clinical use. Accordingly, investigators should consult PROBAST and TRIPOD guidelines prior to developing new predictive models in order to minimise bias.
Applicability concerns
Several predictive models included in this review used predictors not readily available to the average clinician. These included allergen microarrays, force plates for the analysis of lameness vs. ataxia, and measurement of serum acylcarnitine and other uncommon metabolite concentrations.43,47,52,57,58 Although these models were interesting, the lack of predictor availability limits use in practice. For some models, the timing of predictor collection/assessment was discordant with the time of model use. For example, the use of pelvic flexure biopsy scores to predict short-term survival following large colon volvulus necessitates survival until histopathology is reported (often ≥48 hours post-surgery), making the model less practical.24,25 Still other models included predictors that were not independent of how the outcome was assessed. In other words, some models used clinical parameters to make a determination on whether patients had the outcome, but also used some of those same clinical parameters as predictors in the model. Lastly, some models reported outcomes that were not of primary clinical interest. For example, in colic cases, the first outcome of interest is the indication for surgery, yet multiple models predicted intestinal ischaemia or the need for resection and anastomosis.21,31,32 These are interesting outcomes but are unlikely to influence clinical practice since they are downstream of the decision to proceed to surgery and do not predict the severity of bowel ischaemia.
Overcoming barriers to predictive model implementation
With the rise of predictive analytics in human medicine, there has been great interest in understanding the barriers to implementation of predictive models in the clinic.4,77,78 Beyond the aforementioned poor reporting of models, one barrier found in multiple studies was clinical utility, in that clinicians questioned the usefulness of or need for certain models; this could be due to some models predicting an outcome that was not of primary interest or other models that lacked actionability.4,77 In addition, the likelihood of a clinician acting in response to a model’s output can be reduced by the inclusion of predictors that are not intuitively associated with risk of the outcome.78 Another barrier to implementation can be a lack of understanding of model performance metrics (especially in machine learning models), which makes the quality of a model’s predictions unclear.77 Lastly, even high-performance predictive models can see limited use when there is no readily accessible tool with which to use them, e.g. a mobile application or a calculator embedded within electronic medical record software.3,79
In the future, these barriers might be prospectively overcome by following TRIPOD reporting guidelines for predictive models, including clinician input in the model development process, and providing a simple tool at time of publication for model application, like a web-based calculator or a formulated spreadsheet. For the moment, though, there are a couple of steps that clinicians can take to evaluate and potentially apply the already published predictive models in their own practice. One first step might be for clinicians within a practice to discuss the pros and cons of implementing predictive models. This might include a discussion of clinical scenarios in which decision making is particularly difficult and identification of potentially useful models. These models can then be evaluated based on their population (“is the clinic population similar enough to the population on which the model was developed and/or validated?”), the predictors (“are all predictor variables routinely collected? could they be?”), the outcomes (“is the outcome of primary interest?”), as well as performance (“do we believe this model could perform better than our current paradigm?”). If the answer to these questions is yes, then a next step would be to create a tool for model application. In many cases, this could be a simple formulated spreadsheet in which predictor variables are the input and the probability of the outcome is the output.79 Other tools, like mobile phone applications or simple laminated cards with sum scores and corresponding outcome probabilities, may facilitate the use of predictive models in field settings.
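As a rough sketch of such a tool, the example below shows how a hypothetical sum-score model could be turned into a calculator; the findings, points, and score-to-probability bands are invented for illustration and would in practice be taken from the published model. The same logic could sit in a formulated spreadsheet or on a laminated card.

```python
# Hypothetical sum-score tool: points per finding and score-to-probability bands
# are invented for illustration only.
POINTS = {"heart_rate_over_60": 2, "abnormal_mucous_membranes": 2,
          "pcv_over_45": 1, "absent_gut_sounds": 1}

PROBABILITY_BANDS = [  # (minimum total score, predicted probability of the outcome)
    (5, 0.85),
    (3, 0.60),
    (1, 0.30),
    (0, 0.10),
]

def outcome_probability(findings):
    """Sum the points for the recorded findings and look up the matching probability band."""
    total = sum(POINTS[f] for f in findings)
    for minimum_score, probability in PROBABILITY_BANDS:
        if total >= minimum_score:
            return probability
    return PROBABILITY_BANDS[-1][1]

# A case with three findings scores 2 + 2 + 1 = 5 points, falling in the top band (0.85).
print(outcome_probability({"heart_rate_over_60", "abnormal_mucous_membranes", "pcv_over_45"}))
```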
Conclusion
This systematic review is the first to comprehensively identify and evaluate the reporting of equine predictive models. The majority of models were developed using logistic regression analysis in university settings in North America or Europe and focused largely on colic and neonatal illness. All models were at high risk of bias due to methodological deficiencies and lack of validation. Many models had applicability concerns due to insufficient reporting of performance metrics/cut-off points/equations, use of predictors not readily available or those used to assess outcome, and mismatch between timing of predictors and optimal time for model use. Nevertheless, with further validation and updating, many models included in this review may prove useful to clinicians as decision aids.
Supplementary Material
Methods S1: Method to convert sensitivity and specificity to positive and negative likelihood ratios and convert likelihood ratios into post-test probabilities.
Table S1: Predictive models for use in equine colic. Performance metrics include AUC = area under the curve of a receiver operating characteristic curve, Cal = calibration metric (e.g. a significant or nonsignificant Hosmer-Lemeshow test (HL)), Acc = accuracy, Sn = sensitivity, Sp = specificity, PPV = positive predictive value, +LR = positive likelihood ratio, −LR = negative likelihood ratio. ROB refers to a risk of bias assessment.
Table S2: Predictive models for use in equine neonatal illness. Performance metrics include AUC = area under the curve of a receiver operating characteristic curve, Cal = calibration metric (e.g. a significant or nonsignificant Hosmer-Lemeshow test (HL)), Acc = accuracy, Sn = sensitivity, Sp = specificity, +LR = positive likelihood ratio, −LR = negative likelihood ratio. ROB refers to a risk of bias assessment.
Table S3: Predictive models for use in various other equine index conditions. Performance metrics include AUC = area under the curve of a receiver operating characteristic curve, Cal = calibration metric (e.g. a significant or nonsignificant Hosmer-Lemeshow test (HL)), Acc = accuracy, Sn = sensitivity, Sp = specificity, +LR = positive likelihood ratio, −LR = negative likelihood ratio. ROB refers to a risk of bias assessment.
Acknowledgements
We thank David M. Kent, MD, CM, MS and Jason Nelson MPH for their help in guiding this review.
Source of Funding
This work was supported by the National Center for Advancing Translational Sciences, National Institutes of Health, Award Number TL1TR002546 (Cummings, Price) and the Office of Research Infrastructure Programs, National Institutes of Health, Award Number T32OD011121 (Krucik).
Footnotes
Competing Interests
No competing interests have been declared.
Data availability statement
Data sharing is not applicable to this article as no new data were created or analysed in this study.
References
- 1. Tangri N, Stevens LA, Griffith J, Tighiouart H, Djurdjev O, Naimark D, Levin A, Levey AS. A predictive model for progression of chronic kidney disease to kidney failure. JAMA 2011;305:1553–1559.
- 2. Hadanny A, Shouval R, Wu J, Shlomo N, Unger R, Zahger D, Matetzky S, Goldenberg I, Beigel R, Gale C, Iakobishvili Z. Predicting 30-day mortality after ST elevation myocardial infarction: machine learning-based random forest and its external validation using two independent nationwide datasets. J Cardiol 2021;78:439–446.
- 3. Wyatt JC, Altman DG. Commentary: Prognostic models: clinically useful or quickly forgotten? BMJ 1995;311:1539–1541.
- 4. Kappen TH, van Loon K, Kappen MAM, van Wolfswinkel L, Vergouwe Y, van Klei WA, Moons KGM, Kalkman CJ. Barriers and facilitators perceived by physicians when using prediction models in practice. J Clin Epidemiol 2016;70:136–145.
- 5. Reeves MJ, Curtis CR. ‘By the seat of your pants’ or multivariable predictive modelling. Equine Vet J 1989;21:83–84.
- 6. Parry BW, Anderson GA, Gay CC. Prognosis in equine colic: a comparative study of variables used to assess individual cases. Equine Vet J 1983;15:211–215.
- 7. Reeves MJ, Curtis CR, Salman MD, Hilbert BJ. Prognosis in equine colic patients using multivariable analysis. Can J Vet Res 1989;53:87–94.
- 8. Van Calster B, McLernon DJ, van Smeden M, et al. Calibration: the Achilles heel of predictive analytics. BMC Med 2019;17:230.
- 9. Mandrekar JN. Receiver operating characteristic curve in diagnostic test assessment. J Thorac Oncol 2010;5:1315–1316.
- 10. Venema E, Wessler BS, Paulus JK, et al. Large-scale validation of the prediction model risk of bias assessment tool (PROBAST) using a short form: high risk of bias models show poorer discrimination. J Clin Epidemiol 2021;138:32–39.
- 11. Publish or Perish. Version 8. Harzing AW; 2021.
- 12. Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan—a web and mobile app for systematic reviews. Syst Rev 2016;5.
- 13. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, Reitsma JB, Kleijnen J, Mallett S, PROBAST Group. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med 2019;170:51–58.
- 14. Reeves MJ, Curtis CR, Salman MD, Stashak TS, Reif JS. Validation of logistic regression models used in the assessment of prognosis and the need for surgery in equine colic patients. Prev Vet Med 1992;13:155–172.
- 15. Puotunen-Reinert A. Study of variables commonly used in examination of equine colic cases to assess prognostic value. Equine Vet J 1986;18:275–277.
- 16. Orsini JA, Elser AH, Galligan DT, Donawick WJ, Kronfeld DS. Prognostic index for acute abdominal crisis (colic) in horses. Am J Vet Res 1988;49:1969–1971.
- 17. Reeves MJ, Curtis CR, Salman MD, Reif JS, Stashak TS. A multivariable prognostic model for equine colic patients. Prev Vet Med 1990;9:241–257.
- 18. Pascoe PJ, Ducharme NG, Ducharme GR, Lumsden JH. A computer-derived protocol using recursive partitioning to aid in estimating prognosis of horses with abdominal pain in referral hospitals. Can J Vet Res 1990;54:373–378.
- 19. Furr MO, Lessard P, White NA II. Development of a colic severity score for predicting the outcome of equine colic. Vet Surg 1995;24:97–101.
- 20. Sandholm M, Vidovic A, Puotunen-Reinert A, Sankari S, Nyholm K, Rita H. D-dimer improves the prognostic value of combined clinical and laboratory data in equine gastrointestinal colic. Acta Vet Scand 1995;36:255–272.
- 21. Freden GO, Provost PJ, Rand WM. Reliability of using results of abdominal fluid analysis to determine treatment and predict lesion type and outcome for horses with colic: 218 cases (1991–1994). J Am Vet Med Assoc 1998;213:1012–1015.
- 22. Thoefner MB, Ersbøll AK, Hesselholt M. Prognostic indicators in a Danish hospital-based population of colic horses. Equine Vet J 2000;32:11–18.
- 23. Ihler CF, Venger JL, Skjerve E. Evaluation of clinical and laboratory variables as prognostic indicators in hospitalised gastrointestinal colic horses. Acta Vet Scand 2004;45:109–118.
- 24. Levi O, Affolter VK, Benak J, Kass PH, Le Jeune SS. Use of pelvic flexure biopsy scores to predict short-term survival after large colon volvulus: pelvic flexure biopsy scores. Vet Surg 2012;41:582–588.
- 25. Gonzalez LM, Fogle CA, Baker WT, Hughes FE, Law JM, Motsinger-Reif AA, Blikslager AT. Operative factors associated with short-term outcome in horses with large colon volvulus: 47 cases from 2006 to 2013. Equine Vet J 2015;47:279–284.
- 26. McConachie E, Giguère S, Barton MH. Scoring system for multiple organ dysfunction in adult horses with acute surgical gastrointestinal disease. J Vet Intern Med 2016;30:1276–1283.
- 27. Farrell A, Kersh K, Liepman R, Dembek KA. Development of a colic scoring system to predict outcome in horses. Front Vet Sci 2021;8:1166.
- 28. Ducharme NG, Pascoe PJ, Lumsden JH, Ducharme GR. A computer-derived protocol to aid in selecting medical versus surgical treatment of horses with abdominal pain. Equine Vet J 1989;21:447–450.
- 29. Reeves MJ, Curtis CR, Salman MD, Stashak TS, Reif JS. Multivariable prediction model for the need for surgery in horses with colic. Am J Vet Res 1991;52:1903–1907.
- 30. Thoefner MB, Ersbøll BK, Jansson N, Hesselholt M. Diagnostic decision rule for support in clinical assessment of the need for surgical intervention in horses with acute abdominal pain. Can J Vet Res 2003;67:20–29.
- 31. Latson KM, Nieto JE, Beldomenico PM, Snyder JR. Evaluation of peritoneal fluid lactate as a marker of intestinal ischaemia in equine colic. Equine Vet J 2005;37:342–346.
- 32. Pye J, Espinosa-Mur P, Roca R, Kilcoyne I, Nieto J, Dechant J. Preoperative factors associated with resection and anastomosis in horses presenting with strangulating lesions of the small intestine. Vet Surg 2019;48:786–794.
- 33. Keppie NJ, Rosenstein DS, Holcombe SJ, Schott HC 2nd. Objective radiographic assessment of abdominal sand accumulation in horses. Vet Radiol Ultrasound 2008;49:122–128.
- 34. Louro LF, Robson K, Hughes J, Loomes K, Senior M. Head and tail rope-assisted recovery improves quality of recovery from general anaesthesia in horses undergoing emergency exploratory laparotomy. Equine Vet J 2022;54(5):875–884.
- 35. Corley KTT, Furr MO. Evaluation of a score designed to predict sepsis in foals: evaluation of sepsis score. J Vet Emerg Crit Care 2003;13:149–155.
- 36. Wong DM, Ruby RE, Dembek KA, Barr BS, Reus SM, Magdesian KG, Olsen E, Burns T, Slovis NM, Wilkins PA. Evaluation of updated sepsis scoring systems and systemic inflammatory response syndrome criteria and their association with sepsis in equine neonates. J Vet Intern Med 2018;32:1185–1193.
- 37. Weber EJ, Sanchez LC, Giguère S. Re-evaluation of the sepsis score in equine neonates: sepsis scoring in foals. Equine Vet J 2015;47:275–278.
- 38. Hoffman AM, Staempfli HR, Willan A. Prognostic variables for survival of neonatal foals under intensive care. J Vet Intern Med 1992;6:89–95.
- 39. Haas SD, Bristol F, Card CE. Risk factors associated with the incidence of foal mortality in an extensively managed mare herd. Can Vet J 1996;37:91–95.
- 40. Furr M, Tinker MK, Edens L. Prognosis for neonatal foals in an intensive care unit. J Vet Intern Med 1997;11:183–188.
- 41. Rohrbach BW, Buchanan BR, Drake JM, Andrews FM, Bain FT, Byars DT, Bernard WV, Furr MO, Paradis MR, Lawler J, Giguère S, Dunkel B. Use of a multivariable model to estimate the probability of discharge in hospitalized foals that are 7 days of age or less. J Am Vet Med Assoc 2006;228:1748–1756.
- 42. Staempfli HR, Townsend HGG, Prescott JF. Prognostic features and clinical presentation of acute idiopathic enterocolitis in horses. Can Vet J 1991;32:232–237.
- 43. Boemer F, Detilleux J, Cello C, Amory H, Marcillaud-Pitel C, Richard E, van Galen G, van Loon G, Lefère L, Votion D-M. Acylcarnitines profile best predicts survival in horses with atypical myopathy. PLoS One 2017;12:e0182761.
- 44. Tirosh-Levy S, Solomovich R, Comte J, Sutton GA, Steinman A. Daboia (Vipera) palaestinae envenomation in horses: clinical and hematological signs, risk factors for mortality and construction of a novel severity scoring system. Toxicon 2017;137:58–64.
- 45. Roy M-F, Kwong GPS, Lambert J, Massie S, Lockhart S. Prognostic value and development of a scoring system in horses with systemic inflammatory response syndrome. J Vet Intern Med 2017;31:582–592.
- 46. de Barros A de MC, Silva AFR, Zibordi M, Spagnolo JD, Corrêa RR, Belli CB, de Camargo MM. Equine simplified acute physiology score: personalised medicine for the equine emergency patient. Vet Rec 2021;189:e136.
- 47. Ishihara A, Reed SM, Rajala-Schultz PJ, Robertson JT, Bertone AL. Use of kinetic gait analysis for detection, quantification, and differentiation of hind limb lameness and spinal ataxia in horses. J Am Vet Med Assoc 2009;234:644–651.
- 48. Kutasi O, Balogh N, Lajos Z, Nagy K, Szenci O. Diagnostic approaches for the assessment of equine chronic pulmonary disorders. J Equine Vet Sci 2011;31:400–410.
- 49. Porter RS, Leblond A, Lecollinet S, Tritz P, Cantile C, Kutasi O, Zientara S, Pradier S, van Galen G, Speybroek N, Saegerman C. Clinical diagnosis of West Nile fever in equids by classification and regression tree (CART) analysis and comparative study of clinical appearance in three European countries. Transbound Emerg Dis 2011;58:197–205.
- 50. Tilley P, Sales Luis JP, Branco Ferreira M. Correlation and discriminant analysis between clinical, endoscopic, thoracic X-ray and bronchoalveolar lavage fluid cytology scores, for staging horses with recurrent airway obstruction (RAO). Res Vet Sci 2012;93:1006–1014.
- 51. Saegerman C, Alba-Casals A, García-Bocanegra I, Dal Pozzo F, van Galen G. Clinical sentinel surveillance of equine West Nile fever, Spain. Transbound Emerg Dis 2016;63:184–193.
- 52. Marti E, Wang X, Jambari NN, Rhyner C. Novel in vitro diagnosis of equine allergies using a protein array and mathematical modelling approach: a proof of concept using insect bite hypersensitivity. Vet Immunol Immunopathol 2015;167:171–177.
- 53. Haltmayer E, Reiser S, Schramel JP, van den Hoven R. Breathing pattern and thoracoabdominal asynchrony in horses with chronic obstructive and inflammatory lung disease. Res Vet Sci 2013;95:654–659.
- 54. Bullone M, Hélie P, Joubert P, Lavoie J-P. Development of a semiquantitative histological score for the diagnosis of heaves using endobronchial biopsy specimens in horses. J Vet Intern Med 2016;30:1739–1746.
- 55. Miller JE, Mann S, Fettelschoss-Gabriel A, Wagner B. Comparison of three clinical scoring systems for Culicoides hypersensitivity in a herd of Icelandic horses. Vet Dermatol 2019;30:536.
- 56. Randleff-Rasmussen PK, Leblond A, Cappelle J, Bontemps J, Belluco S, Popoff MR, Marcillaud-Pitel C, Tapprest J, Tritz P, Desjardins I. Development of a clinical prediction score for detection of suspected cases of equine grass sickness (dysautonomia) in France. Vet Res Commun 2018;42:19–27.
- 57. White SJ, Moore-Colyer M, Marti E, Hannant D, Gerber V, Coüetil L, Richard EA, Alcocer M. Antigen array for serological diagnosis and novel allergen identification in severe equine asthma. Sci Rep 2019;9:15170.
- 58. Delarocque J, Frers F, Feige K, Huber K, Jung K, Warnken T. Metabolic changes induced by oral glucose tests in horses and their diagnostic use. J Vet Intern Med 2021;35:597–605.
- 59. Haspeslagh M, Gerber V, Knottenbelt DC, Schüpbach G, Martens A, Koch C. The clinical diagnosis of equine sarcoids—part 2: assessment of case features typical of equine sarcoids and validation of a diagnostic protocol to guide equine clinicians in the diagnosis of equine sarcoids. Vet J 2018;240:14–18.
- 60. Pfau T, Robilliard JJ, Weller R, Jespers K, Eliashar E, Wilson AM. Assessment of mild hindlimb lameness during over ground locomotion using linear discriminant analysis of inertial sensor data. Equine Vet J 2007;39:407–413.
- 61. Church EE, Walker AM, Wilson AM, Pfau T. Evaluation of discriminant analysis based on dorsoventral symmetry indices to quantify hindlimb lameness during over ground locomotion in the horse. Equine Vet J 2009;41:304–308.
- 62. Luethy D, Feldman R, Stefanovski D, Aitken MR. Risk factors for laminitis and nonsurvival in acute colitis: retrospective study of 85 hospitalized horses (2011–2019). J Vet Intern Med 2021;35:2019–2025.
- 63. Cresswell EN, McDonough SP, Palmer SE, Hernandez CJ, Reesink HL. Can quantitative computed tomography detect bone morphological changes associated with catastrophic proximal sesamoid bone fracture in Thoroughbred racehorses? Equine Vet J 2019;51:123–130.
- 64. Louro LF, Milner PI, Bardell D. Epidural administration of opioid analgesics improves quality of recovery in horses anaesthetised for treatment of hindlimb synovial sepsis. Equine Vet J 2021;53:682–689.
- 65. Lyle CH, Blissitt KJ, Kennedy RN, McGorum BC, Newton JR, Parkin TDH, Stirk A, Boden LA. Risk factors for race-associated sudden death in Thoroughbred racehorses in the UK (2000–2007). Equine Vet J 2012;44:459–465.
- 66. Barrier Battut I, Kempfer A, Becker J, Lebailly L, Camugli S, Chevrier L. Development of a new fertility prediction model for stallion semen, including flow cytometry. Theriogenology 2016;86:1111–1131.
- 67. Biral NV, Azevedo Santos H, Senne NA, Paulino PG, Camilo TA, Tassinari WdS, Silva VL, Santos FN, Angelo. A cross-sectional study of Leishmania spp. in draft horses from the Distrito Federal, Brazil: seroprevalence, spatial distribution, and associated factors. Prev Vet Med 2021;195:105467.
- 68. Boyle AG, Smith MA, Boston RC, Stefanovski D. A case-control study developing a model for predicting risk factors for high SeM-specific antibody titers after natural outbreaks of Streptococcus equi subsp equi infection in horses. J Am Vet Med Assoc 2017;250:1432–1439.
- 69. Agnew ME, Slack J, Stefanovski D, Linton JK, Sertich PL. Sonographic appearance of the late gestation equine fetal intestine. Theriogenology 2019;138:121–126.
- 70. Brewer BD, Koterba AM, Carter RL, Rowe ED. Comparison of empirically developed sepsis score with a computer generated and weighted scoring system for the identification of sepsis in the equine neonate. Equine Vet J 1988;20:23–24.
- 71. Kattan MW, Hess KR, Amin MB, et al. American Joint Committee on Cancer acceptance criteria for inclusion of risk models for individualized prognosis in the practice of precision medicine. CA Cancer J Clin 2016;66:370–374.
- 72. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med 2015;13:1.
- 73. Wynants L, Van Calster B, Collins GS, et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ 2020;369:m1328.
- 74. Navarro CLA, Damen JAA, Takada T, et al. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ 2021;375:n2281.
- 75. Daines L, McLean S, Buelo A, et al. Systematic review of clinical prediction models to support the diagnosis of asthma in primary care. NPJ Prim Care Respir Med 2019;29:19.
- 76. Moons KGM, Wolff RF, Riley RD, et al. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med 2019;170:W1.
- 77. Watson J, Hutyra CA, Clancy SM, Chandiramani A, Bedoya A, Ilangovan K, Nderitu N, Poon EG. Overcoming barriers to the adoption and implementation of predictive modeling and machine learning in clinical care: what can we learn from US academic medical centers? JAMIA Open 2020;3:167–172.
- 78. Brown LA, Benhamou K, May AM, Mu W, Berk R. Machine learning algorithms in suicide prevention: clinician interpretations as barriers to implementation. J Clin Psychiatry 2020;81:10951.
- 79. Cummings CO. Letter to the editor: A tool for calculating VetCOT score. J Vet Emerg Crit Care 2022:vec.13194.
- 80. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71.
- 81. Fischer AT. Corrections for prognostic index equation. Am J Vet Res 1989;50:1429.
- 82. Brewer BD, Koterba AM. Development of a scoring system for the early diagnosis of equine neonatal sepsis. Equine Vet J 1988;20:18–22.
- 83. Stewart AJ, Hinchcliff KW, Saville WJA, Jose-Cunilleras E, Hardy J, Kohn VW, Reed SM, Kowalski JJ. Actinobacillus sp. bacteremia in foals: clinical signs and prognosis. J Vet Intern Med 2002;16(4):464–471.
- 84. Dembek KA, Hurcombe SD, Frazer ML, Morresey PR, Toribio RE. Development of a likelihood of survival scoring system for hospitalized equine neonates using generalized boosted regression modeling. PLoS One 2014;9:e109212.