NPJ Digit Med. 2020 Jul 9;3:93. doi: 10.1038/s41746-020-0303-x

Table 3.

Development of (sub-)classification tools for LBP using AI/ML, compared with the STarT Back and McKenzie approaches.

Tool | Classification accuracy^a | Internal consistency^b | Test–retest reliability^c | Intra- or inter-rater reliability^d | Construct validity^e | Discriminative validity^f | Prognosis: pain^g | Prognosis: disability^g | Treatment: pain^h | Treatment: disability^h | Treatment: costs^h
AI/ML | 20/25 (80%) | — | — | — | — | — | — | — | — | — | —
STarT Back | NA | 6/9 (67%) | 9/9 (100%) | — | 5/11 (45%) | 8/8 (100%) | 2/6 (33%) | 6/8 (75%) | 1/4 (25%) | 3/4 (75%) | 0/2 (0%)
McKenzie | NA | — | — | 4/10 (40%) | 1/2 (50%) | — | — | — | 5/11 (45%) | 4/11 (36%) | 0/1 (0%)

Values are reported as the number of studies meeting each criterion out of the number assessed, with the percentage in parentheses.

AI/ML, artificial intelligence and machine learning; —, no studies available or unable to be measured; NA, not assessed in this systematic review.

^a Number of AI/ML studies reporting ≥80% accuracy of classification into ‘low-back pain’ versus ‘healthy’.

^b Internal consistency was considered acceptable if Cronbach’s α was ≥0.7 (ref. 146).

^c Test–retest reliability was considered acceptable at an intraclass correlation coefficient (ICC) ≥0.7 (refs. 146,163).

^d Kappa scores for intra-rater and inter-rater reliability were considered good if ≥0.61 (ref. 122).

^e Construct validity ≥0.6 was considered acceptable (refs. 146,164).

^f Discriminative validity ≥0.7 was considered acceptable discrimination (ref. 13). A computational sketch of criteria a–f follows these notes.

^g Prognosis prediction was considered ‘adequate’ when the classification approach resulted in statistically significant prediction of outcome after adjusting for baseline pain or disability in multivariate models (refs. 147–150).

^h Treatment effect was considered ‘adequate’ when the classification approach resulted in statistically significant improvements in patient outcomes for pain, disability, or healthcare costs in randomised or non-randomised clinical trials.
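For readers who want to see how the threshold criteria in footnotes a–f operate, the following is a minimal Python sketch on synthetic data, not data from the reviewed studies. It assumes conventional operationalisations that the reviewed studies may not have used uniformly: ICC(3,1) for test–retest reliability, Pearson correlation for construct validity, and area under the ROC curve for discriminative validity. Only the cutoff values themselves come from the footnotes.

```python
# Synthetic-data sketch of the threshold criteria in footnotes a-f.
# Requires numpy and scikit-learn; all data below are illustrative only.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score

rng = np.random.default_rng(0)

# (a) Classification accuracy, 'low-back pain' (1) vs 'healthy' (0); criterion >= 80%.
y_true = rng.integers(0, 2, size=100)
y_pred = np.where(rng.random(100) < 0.9, y_true, 1 - y_true)  # ~90% correct labels
acc = accuracy_score(y_true, y_pred)
print(f"(a) accuracy = {acc:.2f}, adequate: {acc >= 0.80}")

# (b) Internal consistency: Cronbach's alpha over questionnaire items; criterion >= 0.7.
def cronbach_alpha(items):
    """items: (n_subjects, n_items) score matrix."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

latent = rng.normal(size=(50, 1))
items = latent + 0.5 * rng.normal(size=(50, 9))  # nine correlated items
alpha = cronbach_alpha(items)
print(f"(b) alpha = {alpha:.2f}, acceptable: {alpha >= 0.7}")

# (c) Test-retest reliability: ICC(3,1), two-way mixed, single measurement; criterion >= 0.7.
def icc_3_1(ratings):
    """ratings: (n_subjects, n_sessions). ICC(3,1) = (MSB - MSE) / (MSB + (k-1)*MSE)."""
    n, k = ratings.shape
    grand = ratings.mean()
    msb = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)
    resid = ratings - ratings.mean(axis=1, keepdims=True) - ratings.mean(axis=0) + grand
    mse = (resid ** 2).sum() / ((n - 1) * (k - 1))
    return (msb - mse) / (msb + (k - 1) * mse)

test = rng.normal(size=60)
retest = test + 0.3 * rng.normal(size=60)
icc = icc_3_1(np.column_stack([test, retest]))
print(f"(c) ICC = {icc:.2f}, acceptable: {icc >= 0.7}")

# (d) Intra-/inter-rater reliability: Cohen's kappa between two raters; criterion >= 0.61.
rater_a = rng.integers(0, 3, size=80)  # three hypothetical subgroups
rater_b = np.where(rng.random(80) < 0.85, rater_a, (rater_a + 1) % 3)
kappa = cohen_kappa_score(rater_a, rater_b)
print(f"(d) kappa = {kappa:.2f}, good: {kappa >= 0.61}")

# (e) Construct validity: correlation with a reference measure; criterion >= 0.6.
tool_score = rng.normal(size=70)
reference = tool_score + 0.8 * rng.normal(size=70)
r = np.corrcoef(tool_score, reference)[0, 1]
print(f"(e) r = {r:.2f}, acceptable: {r >= 0.6}")

# (f) Discriminative validity, read here as area under the ROC curve; criterion >= 0.7.
risk_score = y_true + rng.normal(scale=0.7, size=100)  # noisy continuous risk score
auc = roc_auc_score(y_true, risk_score)
print(f"(f) AUC = {auc:.2f}, acceptable: {auc >= 0.7}")
```

Each print line reports one metric alongside whether it clears the footnoted cutoff; applying the checks to real study data only requires replacing the synthetic arrays.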