TABLE 1.
Aspect | Characteristic | Conceptual definition and strategies | What to observe |
Validity | Dimensional validity | This refers to the correspondence that should exist between the instrument's internal structure and the structure theorized for the phenomenon being evaluated. For example, if the instrument aims to measure mental disorders and includes depression and anxiety as its two dimensions of interest, a statistical analysis of the instrument should recover those two dimensions. | Results of exploratory and confirmatory factor analyses, demonstrating the correspondence between the postulated structure for the phenomenon and the loading of the instrument items on their respective dimensions. |
Returning to the example, a factor analysis of the instrument for common mental disorders should demonstrate that the questions regarding anxiety are grouped in the dimension that concerns them (anxiety) and the questions about depression are associated with their underlying factor (depression). | |||
Construct validity | The instrument's ability to measure what it intends to assess when no other tool is considered the "gold standard" for measuring the phenomenon of interest. Construct validity can be determined by several methods, including: | When the instrument confirms the hypothesis that one group has the characteristic of interest and the other does not, the comparison of extreme groups provides evidence of the instrument's validity. | |
• Extreme groups: the instrument is applied to two groups, one presumed to have the characteristic of interest and the other presumed not to have it. | In the convergent validity example, the results from both instruments are expected to point in the same direction (that is, to be positively correlated with each other). | ||
• Convergent validity: comparison between the assessments obtained with the instrument of interest and those resulting from another scale used for measuring the same phenomenon. | When evaluating discriminant validity, the correlation between the results of the two instruments should be weak or close to zero. | ||
• Discriminant or divergent validity: it can be assessed by testing the correlation between the results of an instrument and those of another one used for measuring a different construct. | |||
Criterion-related validity | Ability of the instrument to measure what it proposes to measure when a "gold standard" instrument exists. Verifying this type of validity involves applying two instruments, the one intended for use and the one considered the reference, and examining the correlation between their results. Criterion validity is typically divided into two subtypes: | In both cases, the correlation between the instrument of interest and the "gold standard" supports the validity argument for the former. | |
• Concurrent or simultaneous validity: tests the correlation of the instrument of interest with a "gold standard" after applying both simultaneously. | |||
• Predictive validity: determined by the ability of the instrument to predict a future event, which is established by a subsequent application of the reference instrument. | |||
Reliability | Internal consistency | As an illustration, if several items (questions) are used to measure the functional capacity of individuals, those items should correlate highly with one another. Measures used to assess internal consistency include Cronbach's alpha coefficient and the Kuder-Richardson coefficient, among others. In all cases, internal consistency can be estimated from a single application of the instrument to the sample under evaluation. | The minimum acceptable value for these coefficients is 0.8. |
Temporal stability | Stability may be assessed in different ways, including: | The minimum acceptable value for these coefficients is 0.5. | |
• The degree of agreement between different observers, using the same instrument (inter-observer reliability). | |||
• The consistency of the observations made by the same examiner at different moments in time (intra-observer reliability or test-retest). |
This table was designed based on data published by Reichenheim & Moraes10 and Streiner & Norman9, which readers should consult if they wish to explore these topics further.
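As a rough illustration of two quantities mentioned in the table, the sketch below computes Cronbach's alpha (internal consistency) and a Pearson correlation (the kind of coefficient typically inspected for convergent, discriminant, and criterion-related validity). It is a minimal sketch and not a procedure taken from the cited sources: the function names `cronbach_alpha` and `pearson_r`, the use of sample variances, and the toy scores are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data: five respondents answering three items.
    toy_responses = [[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1], [4, 5, 4]]
    print("alpha:", round(cronbach_alpha(toy_responses), 3))

    # Hypothetical total scores on the new instrument and on a reference scale.
    new_scale = [10, 7, 14, 4, 13]
    reference_scale = [11, 8, 13, 5, 12]
    print("r:", round(pearson_r(new_scale, reference_scale), 3))
```

Under these assumptions, the alpha value would be compared against the minimum stated in the table, while the sign and size of `r` would be read according to the type of validity being examined: positive for convergent and criterion-related validity, weak for discriminant validity.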