Abstract
Two types of interobserver reliability values may be needed in treatment studies in which observers constitute the primary data-acquisition system: trial reliability and the reliability of the composite unit or score that is subsequently analyzed (e.g., daily or weekly session totals). Two approaches to determining interobserver reliability are described: percentage agreement and “correlational” measures of reliability. The interpretation of these estimates, factors affecting their magnitude, and the advantages and limitations of each approach are presented.
Keywords: observational technology, reliability, validity, statistics, recording and measurement techniques, Cohen's kappa, generalizability theory, measurement theory, Spearman-Brown prophecy formula, correlational measures
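As a rough illustration of the two approaches named in the abstract, the sketch below computes trial-level percentage agreement and Cohen's kappa for two observers, plus a Pearson correlation between session totals and the Spearman-Brown projection. It is not taken from the article: the function names are hypothetical, the observer records are invented for the example, and Pearson's r is assumed here as one common "correlational" measure.

```python
from math import sqrt


def percent_agreement(a, b):
    """Percentage of trials on which two observers' records match."""
    if len(a) != len(b) or not a:
        raise ValueError("records must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def cohens_kappa(a, b):
    """Chance-corrected agreement between two observers (Cohen's kappa)."""
    if len(a) != len(b) or not a:
        raise ValueError("records must be non-empty and of equal length")
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if the observers scored independently,
    # given each observer's own base rate for every category.
    categories = set(a) | set(b)
    p_expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


def pearson_r(x, y):
    """Pearson correlation, e.g. between two observers' session totals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def spearman_brown(r, k):
    """Projected reliability of a composite k times as long as the unit with reliability r."""
    return k * r / (1.0 + (k - 1.0) * r)


# Invented trial-by-trial records (1 = behavior scored, 0 = not scored).
observer_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
observer_b = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]

print(percent_agreement(observer_a, observer_b))  # 80.0
print(cohens_kappa(observer_a, observer_b))       # approximately 0.6
```

With these invented records the observers agree on 8 of 10 trials, yet kappa is only about 0.6 because half of that agreement would be expected by chance given each observer's base rate. This is one way to see why the two kinds of estimate can diverge.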