Table 1.
Specific evidence to support validity arguments.
| | Scoring | Generalization | Extrapolation | Implication |
|---|---|---|---|---|
| Kane's definition (26) | Rule is appropriate; rule is applied as specified; scoring is free of bias; data fit the scaling model | Sample is representative of the universe of possible observations; sample is large enough to control for random error | Observed score is related to the target score; no systematic errors likely to undermine the extrapolation | Implications (interpretations) are appropriate; properties of scores support the implications (interpretations) associated with the label |
| Operational sources of evidence | Development of scoring dimensions/development of selection items; evidence of independence of scoring dimensions; distinguishability of scored items; evidence of scoring reliability; quality control of scoring | Internal consistency reliability across projects; assessment of sources of measurement error; sampling of observations (number of items or sites, breadth of content); sample size | Relationships with other variables/measures (correlation with other scores); development of items to reflect the full breadth of real-life tasks; retesting performance | Impact on the trainee (i.e., viewing the act of assessment as an intervention); impact on the project; accurate classification of individuals; standard setting process |