2019 Dec 17;45(3):340–357. doi: 10.1093/jpepsy/jsz096

Table II.

Modified Criteria Used to Evaluate Quality of Evidence-Based Assessments of Pediatric Self-Management

For each criterion, Hunsley and Mash’s (2018) criteria for evidence-based assessment are listed first, followed by the modified criteria (or "Not modified").
Internal Consistency (note 1)
  • Adequate: Majority of alphas between .70 and .79

  • Good: Majority of alphas between .80 and .89

  • Excellent: Majority of alphas ≥.90

Not modified
Interrater Reliability (note 2)
  • Adequate: Majority of kappa (κ) values between .60 and .74, or Pearson/intraclass correlations between .70 and .79

  • Good: Majority of kappa (κ) values between .75 and .84, or Pearson/intraclass correlations between .80 and .89

  • Excellent: Majority of kappa (κ) values ≥.85, or Pearson/intraclass correlations ≥.90

Not modified
Test-Retest Reliability (note 3)
  • Adequate: Majority of correlations ≥.70 over several days to several weeks

  • Good: Majority of correlations ≥.70 over several months

  • Excellent: Majority of correlations ≥.70 over at least one year

Not modified
Content Validity
  • Adequate: Domain of the construct assessed was clearly defined and items were representative of the domain

  • Good: In addition to criteria for an adequate rating, measure was evaluated by judges (e.g., experts, participants)

  • Excellent: In addition to criteria for a good rating, measure development involved multiple groups of judges and quantitative ratings

Not modified
Construct Validity
  • Adequate: Replicated evidence of various types of construct validity (e.g., predictive, concurrent, convergent, discriminant) by independent research groups

  • Good: Majority of evidence by independent research groups across multiple types of construct validity

  • Excellent: In addition to criteria for a good rating, there is evidence of incremental validity with respect to clinical data

Modified:

  • Adequate: Two or more independent research groups assessed construct validity with respect to one health-related outcome (note a), sociodemographic factors (i.e., distinguishing differences between groups), or other measures of adherence/self-management

  • Good: Two or more independent research groups assessed construct validity across two or more health-related outcomes, sociodemographic factors, or other measures of adherence/self-management

  • Excellent: Two or more independent research groups assessed construct validity across two or more health-related outcomes, sociodemographic factors, or other measures of adherence/self-management and demonstrated incremental validity (e.g., iterative versions of the measure showing improvement over time or adaptations of the same measure comparing versions in the same sample)

Validity Generalization
  • Adequate: Some evidence supports using the measure with either more than one specific group or in multiple contexts

  • Good: Majority of evidence supports using the measure with more than one specific group or in multiple contexts

  • Excellent: Majority of evidence supports using the measure with more than one specific group and in multiple contexts

Modified:

  • Adequate: One or two independent research groups demonstrated use of the measure in more than one demographic group (note b) or in multiple contexts (note c)

  • Good: Three or more independent research groups demonstrated use of the measure in more than one demographic group or in multiple contexts

  • Excellent: Three or more independent research groups demonstrated use of the measure in more than one demographic group and in multiple contexts

Treatment Sensitivity
  • Adequate: Some evidence of sensitivity to change pre–post treatment

  • Good: Majority of evidence by independent research groups demonstrates sensitivity to change pre–post treatment

  • Excellent: In addition to criteria for a good rating, evidence of sensitivity to change across different treatments

Modified:

  • Adequate: One study that demonstrates sensitivity to change pre–post treatment

  • Good: Two studies from independent research groups that demonstrate sensitivity to change pre–post treatment

  • Excellent: More than two studies by independent research groups that demonstrate sensitivity to change pre–post treatment

Clinical Utility
  • Adequate: Measure data are clinically useful while taking into account factors including cost, ease/availability of administration and scoring instructions, etc.

  • Good: In addition to criteria for an adequate rating, there is published evidence that using the measure results in clinical benefits

  • Excellent: In addition to criteria for an adequate rating, there is evidence by independent research groups that using the measure results in clinical benefits

Not modified

Notes. A rating of "below adequate" was given when published evidence for a criterion was found but did not reach an adequate rating. A rating of "N/A" was given when no published evidence for a criterion was found. These two additional rating categories were developed for this systematic review and were not included in Hunsley and Mash’s original ratings.

(a) Health outcomes included clinical outcomes, health-related quality of life, health care utilization, psychological functioning, and alternative measures of adherence (e.g., medication possession ratios, electronic monitoring, daily diaries).

(b) Demographic characteristics included age, gender, and race/ethnicity.

(c) Contexts included setting (e.g., outpatient, inpatient) and type of administration (interview, paper–pencil, web-based).

(1) Internal consistency refers to correlations between items on the same measure to examine whether items meant to measure the same construct produce similar ratings.

(2) Interrater reliability refers to the consistency of ratings on a measure by multiple respondents (e.g., youth-report and parent-report).

(3) Test-retest reliability refers to the closeness of the agreement between results of the same measure used at multiple time points under the same conditions.
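
As an illustration only, the internal consistency thresholds in the table could be applied along the following lines. This is a minimal sketch, not the procedure used in the review. The function name `rate_internal_consistency` is hypothetical, and reading "majority" as more than half of the reported alphas meeting or exceeding a band's lower bound is an assumption, since the table does not say how mixed sets of alphas should be resolved.

```python
from typing import Sequence

# Lower bounds from the Internal Consistency row of the table,
# ordered from the highest rating to the lowest.
ALPHA_BANDS = [("Excellent", 0.90), ("Good", 0.80), ("Adequate", 0.70)]


def rate_internal_consistency(alphas: Sequence[float]) -> str:
    """Rate a set of published alphas (hypothetical helper).

    Assumption: "majority" means more than half of the alphas are at or
    above the lower bound of a band; the highest such band is returned.
    """
    if not alphas:
        # No published evidence for this criterion.
        return "N/A"
    for label, lower_bound in ALPHA_BANDS:
        meeting = sum(1 for alpha in alphas if alpha >= lower_bound)
        if meeting * 2 > len(alphas):
            return label
    # Evidence exists, but it does not reach the adequate band.
    return "Below adequate"


if __name__ == "__main__":
    print(rate_internal_consistency([0.75, 0.85, 0.95]))
    print(rate_internal_consistency([0.60, 0.65, 0.72]))
    print(rate_internal_consistency([]))
```

Under this cumulative reading, a measure with alphas of .75, .85, and .95 would be rated Good, because two of the three values are at least .80 but only one is at least .90. A stricter reading, in which the majority must fall inside a single band, would leave such a measure unrated, so the interpretation should be settled before the rule is applied.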