
Table 3. Validation inferences in the validity framework

| Validity inference | Definition (assumptions)ᵃ | Examples of evidence |
| --- | --- | --- |
| Scoring | The score or written narrative from a given observation adequately captures key aspects of performance | Procedures for creating and empirically evaluating item wording, response options, scoring options; rater selection and training |
| Generalization | The total score or synthesis of narratives reflects performance across the test domain | Sampling strategy (e.g., test blueprint) and sample size; internal consistency reliability; interrater reliability |
| Extrapolation | The total score or synthesis in a test setting reflects meaningful performance in a real-life setting | Authenticity of context; correlation with tests measuring similar constructs, especially in real-life context; correlation (or lack thereof) with tests measuring different constructs; expert-novice comparisons; factor analysis |
| Implications/decisions | Measured performance constitutes a rational basis for meaningful decisions and actions | See Table 2, "Consequences" |

See Kane [10] and Cook et al. [12] for further details and examples

ᵃEach of the inferences reflects assumptions about the creation and use of assessment results
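To illustrate two of the reliability coefficients listed as evidence for the generalization inference, the sketch below computes internal consistency (Cronbach's alpha) and a simple two-rater correlation. The six learners, the 4-item checklist, the score values, and the use of a Pearson correlation as a stand-in for inter-rater reliability are assumptions made for illustration only; they are not drawn from the article, and in practice intraclass correlations or generalizability studies are more typical for rater data.

```python
# Illustrative sketch only (hypothetical data, not from the article):
# internal consistency (Cronbach's alpha) and a simple inter-rater
# correlation as examples of generalization evidence.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for a (learners x items) score matrix."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]                          # number of items
    item_vars = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 6 learners rated on a 4-item checklist (0-5 scale).
scores = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
])
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")

# Hypothetical inter-rater evidence: total scores for the same learners
# from a second, independent rater; a high correlation would support
# generalizing across raters.
rater_a = scores.sum(axis=1)
rater_b = np.array([18, 9, 20, 11, 7, 16])
r = np.corrcoef(rater_a, rater_b)[0, 1]
print(f"Inter-rater correlation: r = {r:.2f}")
```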