Table 2. Sources of validity evidence, definitions, and examples
| Source of evidence | Definition | Examples of evidence |
|---|---|---|
| Content | “The relationship between the content of a test and the construct it is intended to measure” [24] | Procedures for item sampling, development, and scoring (e.g., expert panel, previously described instrument, test blueprint, and pilot testing and revision) |
| Internal structure | Relationship among individual items within the assessment and how these relate to the overarching construct | Internal consistency reliability; interrater reliability; factor analysis; test item statistics (see the sketch following the table) |
| Relationships with other variables | “Degree to which these relationships are consistent with the construct underlying the proposed test score interpretations” [24] | Correlation with tests measuring similar constructs; correlation (or lack thereof) with tests measuring different constructs; expert-novice comparisons |
| Response process | “The fit between the construct and the detailed nature of performance . . . actually engaged in” [24] | Analysis of examinees’ or raters’ thoughts or actions during assessment (e.g., think-aloud protocol); assessment security (e.g., prevention of cheating); quality control (e.g., video capture); rater training |
| Consequences | “The impact, beneficial or harmful and intended or unintended, of assessment” [27] | Impact on examinee performance (e.g., downstream effects on board scores, graduation rates, clinical performance, patient safety); other examinee effects (e.g., test preparation, length of training, stress, anxiety); definition of the pass/fail standard |
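Several of the internal-structure examples above are straightforward to quantify. As a minimal sketch of how internal consistency reliability might be computed (assuming item-level scores are available as an examinees-by-items matrix; the function name and data below are hypothetical and for illustration only, not drawn from the source):

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinees' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 6 examinees rated on 4 checklist items (1-5 scale)
scores = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 2, 3, 3],
])
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")
```

The same score matrix supports the relationships-with-other-variables row: `np.corrcoef(scores.sum(axis=1), other_test_scores)` (where `other_test_scores` is an assumed external measure) yields the correlation between total scores and a test of a similar or different construct, corresponding to convergent and discriminant evidence, respectively.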