2021 Aug 4;11(8):2303. doi: 10.3390/ani11082303

Table 2.

Statistical methods for validation of the Broken/Unbroken Test.

| Reliability and Validity | Type of Analysis | Description | Statistical Test |
| --- | --- | --- | --- |
| - | Frequency distribution | Distribution of AHT, HT, and BUT scores | Descriptive statistics |
| Reliability | Inter-observer | Agreement among the four blinded observers¹ | Fleiss’ kappa and ICC |
| | Intra-observer | Agreement between scores assigned by the same observer to videos viewed twice¹ | Kendall tau-b correlation coefficient, concordance rate, and ICC |
| | Test–retest | Agreement between results of tests conducted on the same horse at two different times¹ | |
| | Internal consistency and item-total correlation | Agreement between individual items of the scale¹ and between each item and the total score | Spearman’s rank-order coefficient² |
| Validity | Construct | Degree to which the BUT score correlates with other measures to which it is theoretically related¹,³ | Spearman’s rank-order coefficient and ordinal logistic regressions |
| | Criterion | Strength of the relationship between the BUT score and the ‘gold standard’ criterion⁴,⁵ | Binary logistic regression⁵, receiver operating characteristic (ROC) analysis, and Cohen’s kappa⁵ |

AHT = Approaching and Haltering Test; BUT = Broken/Unbroken Test; CI = Confidence Interval; HT = Handling Test; ICC = intraclass correlation coefficient; ROC = receiver operating characteristic. BUT score = sum of scores assigned to AHT and HT tests.
¹ Modified by Meagher [36].
² Spearman’s coefficient was chosen as Cronbach’s coefficient alpha is inappropriate for two-item scales [39].
³ Convergent validity.
⁴ Modified by Boateng [37] (concurrent criterion validity).
⁵ Expert’s judgment used as criterion measure.
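
As an illustration of two of the statistics named in the table, the following is a minimal sketch in plain Python of Fleiss' kappa (inter-observer agreement among a fixed number of raters) and Spearman's rank-order coefficient with tie-averaged ranks (inter-item and item-total correlation). It is not the authors' analysis code. The function names, the score values, and the numbers of videos and horses are all hypothetical and chosen only to make the arithmetic easy to follow. The only details taken from the table are the use of four observers and the definition of the BUT score as the sum of the AHT and HT scores.

```python
import math
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] lists the categorical scores given to subject i."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this subject.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    observed = agreement_sum / n_subjects
    total = n_subjects * n_raters
    # Chance agreement from the overall share of each category.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient computed as Pearson's r on tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: four observers (columns) scoring four videos (rows).
observer_scores = [[0, 0, 0, 0], [1, 1, 1, 0], [2, 2, 2, 2], [1, 1, 2, 2]]
# Hypothetical AHT and HT scores for five horses.
aht = [0, 1, 1, 2, 3]
ht = [0, 1, 2, 2, 3]
but = [a + h for a, h in zip(aht, ht)]  # BUT score = AHT + HT, as in the table note

print("Fleiss' kappa:", fleiss_kappa(observer_scores))
print("Inter-item rho (AHT vs HT):", spearman_rho(aht, ht))
print("Item-total rho (AHT vs BUT):", spearman_rho(aht, but))
```

In practice, an established statistics package would normally be preferred over hand-written functions, since it would also supply confidence intervals and significance tests.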