Supplementary File 1.
COSMIN Study Design checklist for patient-reported outcome measurement instruments
 | Yes | No | ? |
---|---|---|---|
Internal consistency | |||
1. Were there any important flaws in the design or methods of the study? | ✓ | ||
2. Design requirements | ✓ | ||
3. Was the percentage of missing items given? | ✓ | ||
4. Was there a description of how missing items were handled? | ✓ | ||
5. Was the sample size included in the internal consistency analysis adequate? | ✓ | ||
6. Was the unidimensionality of the scale checked? i.e. was factor analysis or IRT model applied? | ✓ | ||
7. Was the sample size included in the unidimensionality analysis adequate? | ✓ | ||
8. Was an internal consistency statistic calculated for each (unidimensional) (sub)scale separately? | ✓ | ||
9. Were there any important flaws in the design or methods of the study? | ✓ | ||
Statistical methods | Yes | No | NA |
1. for Classical Test Theory (CTT): Was Cronbach’s alpha calculated? | ✓ | ||
2. for dichotomous scores: Was Cronbach’s alpha or KR-20 calculated? | ✓ | ||
3. for IRT: Was a goodness of fit statistic at a global level calculated? e.g. χ², reliability coefficient of estimated latent trait value (index of (subject or item) separation) | ✓ | ||
Reliability: relative measures (including test-retest reliability, inter-rater reliability and intra-rater reliability) | Yes | No | NA/? |
Design requirements | |||
Was the percentage of missing items given? | ✓ | ||
Was there a description of how missing items were handled? | ✓ | ||
Was the sample size included in the analysis adequate? | ✓ | ||
Were at least two measurements available? | ✓ | ||
Were the administrations independent? | ✓ | ||
Was the time interval stated? | ✓ | ||
Were patients stable in the interim period on the construct to be measured? | ✓ | ||
Was the time interval appropriate? | ✓ | ||
Were the test conditions similar for both measurements? e.g. type of administration, environment, instructions | ✓ | ||
Were there any important flaws in the design or methods of the study? | ✓ | ||
Statistical methods | |||
for continuous scores: Was an intraclass correlation coefficient (ICC) calculated? | ✓ | ||
for dichotomous/nominal/ordinal scores: Was kappa calculated? | ✓ | ||
for ordinal scores: Was a weighted kappa calculated? | ✓ | ||
for ordinal scores: Was the weighting scheme described? e.g. linear, quadratic | ✓ | ||
Measurement error: absolute measures | |||
Design requirements | |||
Was the percentage of missing items given? | ✓ | ||
Was there a description of how missing items were handled? | ✓ | ||
Was the sample size included in the analysis adequate? | ✓ | ||
Were at least two measurements available? | ✓ | ||
Were the administrations independent? | ✓ | ||
Was the time interval stated? | ✓ | ||
Were patients stable in the interim period on the construct to be measured? | ✓ | ||
Was the time interval appropriate? | ✓ | ||
Were the test conditions similar for both measurements? e.g. type of administration, environment, instructions | ✓ | ||
Were there any important flaws in the design or methods of the study? | ✓ | ||
Statistical methods | |||
for CTT: Was the Standard Error of Measurement (SEM), Smallest Detectable Change (SDC) or Limits of Agreement (LoA) calculated? | ✓ | ||
Hypotheses testing | Yes | No | ? |
Design requirements | ✓ | ||
Was the percentage of missing items given? | ✓ | ||
Was there a description of how missing items were handled? | ✓ | ||
Was the sample size included in the analysis adequate? | ✓ | ||
Were hypotheses regarding correlations or mean differences formulated a priori (i.e. before data collection)? | ✓ | ||
 | Yes | No | NA |
Was the expected direction of correlations or mean differences included in the hypotheses? | ✓ | ||
Was the expected absolute or relative magnitude of correlations or mean differences included in the hypotheses? | ✓ | ||
for convergent validity: Was an adequate description provided of the comparator instrument(s)? | ✓ | ||
for convergent validity: Were the measurement properties of the comparator instrument(s) adequately described? | ✓ | ||
Were there any important flaws in the design or methods of the study? | ✓ | ||
Statistical methods | Yes | No | NA |
Were design and statistical methods adequate for the hypotheses to be tested? | ✓ | ||
Interpretability | Yes | No | NA |
Was the percentage of missing items given? | ✓ | ||
Was there a description of how missing items were handled? | ✓ | ||
Was the sample size included in the analysis adequate? | ✓ | ||
Was the distribution of the (total) scores in the study sample described? | ✓ | ||
Was the percentage of the respondents who had the lowest possible (total) score described? | ✓ | ||
Was the percentage of the respondents who had the highest possible (total) score described? | ✓ | ||
Were scores and change scores (i.e. means and SD) presented for relevant (sub) groups? e.g. for normative groups, subgroups of patients, or the general population | ✓ | ||
Was the minimal important change (MIC) or the minimal important difference (MID) determined? | ✓ | ||
Were there any important flaws in the design or methods of the study? | ✓ |
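
For readers unfamiliar with the statistics named in the checklist, the following is a minimal sketch of how Cronbach's alpha, the SEM and the SDC are commonly computed. It is not taken from the study being appraised. The function names and the example data are hypothetical, and the SEM and SDC formulas shown (SEM = SD × √(1 − ICC); SDC = 1.96 × √2 × SEM) are the usual textbook forms, which may differ from the variant used in any particular study.

```python
import math
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of columns, one list of scores per item, with
    respondents in the same order in every column (hypothetical layout).
    """
    k = len(items)
    # Total score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def sem_from_icc(sd, icc):
    """Standard Error of Measurement from a score SD and an ICC."""
    return sd * math.sqrt(1 - icc)


def sdc_from_sem(sem):
    """Smallest Detectable Change at the 95% level for an individual."""
    return 1.96 * math.sqrt(2) * sem


# Illustrative data only: 3 items answered by 4 respondents.
example_items = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]]
alpha = cronbach_alpha(example_items)
sem = sem_from_icc(sd=10.0, icc=0.84)
sdc = sdc_from_sem(sem)
print(f"alpha={alpha:.3f}, SEM={sem:.2f}, SDC={sdc:.2f}")
```

The sample variance (n − 1 denominator) is used throughout. Alpha depends only on the ratio of variances, so the choice of denominator does not affect it as long as it is applied consistently.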