Table 2.
Tools | COSMIN | Terwee’s criteria | Attributes and criteria | Economic evaluation | Guidance for industry | Fitzpatrick’s criteria | ICF ICFCY | EMPRO | SCI criteria | Andresen’s tool | CanChild outcomes | OMERACT | Testing standards |
Development | Delphi | Author criteria | Expert panel | Literature | Consensus | Literature | Expert panel | Expert panel, literature | Literature | Expert panel | Expert panel, Delphi | Consensus | |
Sponsor/s | COSMIN initiative | Author | SACMOT working group | Standing group of health technology | FDA staff | Standing group of health technology | WHO member states | IRYSS committee | SCIRE working group | Author | CanChild centre staff | OMERACT initiative | AERA, APA, NCME |
Approval updates | 2010, 2018 | 2007 | 1996, 2002, 2013 | 1999, 2017 | 2006, 2009 | 1998 | 2001, 2019* | 2008 | 2008, 2016 | 2000 | 1987†, 2004 | 1992, 1998, 2007, 2014, 2019 | 1954, 1966, 1974, 1985, 1999, 2014 |
Items (scoring) | 5–18 items/box (+/−/?) | 8–9 items total (+/−/?) | Not item structured (no scoring) | Not item structured (no scoring) | Not item structured (no scoring) | Not item structured (no scoring) | Not item structured (no scoring) | 39 items (strongly agree, agree, disagree, strongly disagree) | 3–5 items/box (++++/+++/++/+) | 11 items total (A, B, C) | 2–6 items/box (excellent, adequate, poor) | 2–5 items/box (green, amber, red, white) | Not item structured (no scoring) |
Measurement properties Validity | Content construct (Int. structure cross-cultural hypotheses test) Criterion (Gold standard) Responsiveness | Content construct (Hypotheses test) Criterion (Gold standard) Floor/Ceiling Responsiveness | Conceptual and measurement model Content construct (Hypotheses test) Criterion (Gold standard) Responsiveness | Descriptive (Content Face Construct) Preference-based valuation Empirical (Criterion) | Conceptual model Content construct (Hypothesis test, discriminant, convergent, known groups) Responsiveness | Use Content/face construct (convergent, discriminant, int. structure) Criterion (Predictive) Cut-score precision Responsiveness | Content | Conceptual and measurement model Content construct (Hypotheses test) Criterion Responsiveness | Content criterion (concurrent predictive ‘discriminant’) Clinical utility (consequential validity) Floor/Ceiling Responsiveness | Conceptual and measurement model Instrument bias Int. structure convergent discriminant Responsiveness | Use scale construction Content construct (Hypotheses test) Criterion (Gold standard) Responsiveness | Content, face construct (Convergent, divergent) Criterion (Accuracy) Discrimination (Sensitivity over time and over treatment) | Content response process Int. structure (Dimensions, DIF) Relations to other variables (Hypotheses test, Convergent, Discriminant, criterion, responsiveness) Consequences |
Reliability | Int. consistency measurement error (Test retest, agreement) | Int. consistency reproducibility (Agreement, relative measurement error) | Int. consistency reproducibility (Test retest, inter-rater) | Test retest Inter-rater | Test retest Inter-rater Int. consistency | Int. consistency reproducibility (Test retest) | Int. consistency reproducibility (Test retest, inter-rater) | Int. consistency test retest | Int. consistency test retest | Int. consistency intra/inter-rater test retest | Reproducibility test retest | Int. consistency test retest alternate forms scorers and decision consistency/accuracy | |
Fairness | Equivalence of accommodations | ||||||||||||
Other characteristics | Norms | Norms, standard values | Norms standardisation | Scales, norms, Score comparability | |||||||||
Interpretability | Interpretability | Interpretability | Interpretability | Interpretability | Interpretability | Test development and revision | |||||||
Burden | Burden | Acceptability (Burden) | Burden | Burden | Burden | ||||||||
Administration accessible forms | Administration accessible forms | Administration | Administration accessible forms | Administration accessible forms | |||||||||
Feasibility | Cultural adaptations | Practicality | Feasibility cultural adaptations | Cultural adaptations | Applicability cultural adaptations | Cultural adaptations | Clinical utility (Feasibility) | Feasibility | |||||
Frequency of use (%) | 61 (30.4) | 45 (22.4) | 33 (16.4) | 17 (8.4) | 14 (6.9) | 14 (6.9) | 7 (3.4) | 4 (2.0) | 2 (1.0) | 2 (1.0) | 1 (0.5) | 1 (0.5) | 0 |
*Updated version at website.
†Reference at 2004.
AERA, American Educational Research Association; APA, American Psychological Association; COSMIN, Consensus-based Standards for the selection of health Measurement Instruments; DIF, differential item functioning; EMPRO, Evaluating Measures of Patient Reported Outcomes; FDA, Food and Drug Administration; ICF, International Classification of Functioning; ICFCY, International Classification of Functioning for Children and Youth; IRYSS, Investigation Network for Health and Health Service Outcomes Research; NCME, National Council on Measurement in Education; OMERACT, Outcomes Measures in Rheumatology Clinical Trials; SACMOT, Scientific Advisory Committee Medical Outcomes Trust; SCI, spinal cord injury; SCIRE, Spinal Cord Injury Rehabilitation Evidence.