Abstract
This dataset comes from a large-scale randomized controlled trial (RCT) in educational research targeting English learner students and their teachers' instructional capacity. The dataset includes ratings from classroom observations of 45-minute English as a Second Language (ESL) blocks. Each coder rated 60 recorded video segments collected from each teacher. During each 20-second segment, ratings of seven domains of teachers' instruction (i.e., ESL Strategies, Group, Activity Structure, Mode, Language Content, Language of Teacher, and Language of Student) were collected. The dataset is organized by teacher, by coder, and by domain, so that researchers can analyze inter-rater reliability among coders within and/or across domains. This data article is related to the research article by Tong et al. [3], "The determination of appropriate coefficient indices for inter-rater reliability: using classroom observation instruments as fidelity measures in large-scale randomized research".
Keywords: Inter-rater reliability, Classroom observation, Fidelity of implementation, Bilingual/ESL education
Specifications Table
Subject | Social Sciences |
Specific subject area | Education; bilingual/English as a second language (ESL) education |
Type of data | Table |
How data were acquired | Virtually recorded classroom videos were coded via an online coding platform http://tbop.teachbilingual.com/ |
Data format | Cleaned and labeled raw data |
Parameters for data collection | Recordings were collected during a 45-minute block of ESL instruction. Research staff and graduate students of the project rated the recordings using a multi-dimension-multi-response (MDMR) instrument, i.e., the Transitional Bilingual Observation Protocol (TBOP; [1,2]). The coding was based on the following dimensions: ESL Strategies, Group, Activity Structure, Mode, Language Content, Language of Teacher, and Language of Student. The coders received training provided by the research team, who developed the observation instrument. |
Description of data collection | Classroom instruction was recorded virtually. Videos were then coded by trained personnel. A value representing the category in each dimension was recorded to indicate the presence of a certain pedagogical behavior. |
Data source location | College Station, Texas, the United States |
Data accessibility | With the article or Mendeley Data link https://data.mendeley.com/datasets/479hwxdfwb/draft?a=10b48a38-dc39-46b3-8402-4ffbe51548f3 |
Related research article | Tong, F., Tang, S., Irby, B. J., Lara-Alecio, R., & Guerrero, C., (2020). The Determination of Appropriate Coefficient Indices for Inter-Rater Reliability: Using Classroom Observation Instruments as Fidelity Measures in Large-Scale Randomized Research. International Journal of Educational Research [3]. |
Value of the Data
1. Data description
The dataset contains seven observation dimensions: ESL Strategies, Group, Activity Structure, Mode, Language Content, Language of Instruction/Teacher, and Language of Student, compiled in one workbook (see Table 1). Each sheet (tab) is named after the domain of observation. The top row contains variable names: teacher (1–3), segment (1–60), and coder (1–6). The first column indexes the three teachers whose lessons were recorded and coded. The second column indexes the coded segments; each video was divided into 60 segments, with each segment lasting 20 seconds. Columns 3–8 contain the coders' ratings of each segment. For example, column 3 holds Coder 1's ratings of the 60 20-second segments of each of the three videos. The value in each cell of columns 3–8 is a unique code corresponding to an instructional activity within that domain observed by the coders during that segment. The codes and coding protocol are explained in Lara-Alecio and Parker [1] and Tong et al. [2]. These data are therefore nominal in nature and can be used to calculate different indices of inter-rater agreement, as reported in Tong et al. [3]. For future analyses, researchers need to select inter-rater indices appropriate for nominal data in educational research.
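As a minimal sketch of this layout (using pandas, with a hypothetical subset of rows and hypothetical rating values), one domain sheet can be represented as a segments-by-coders matrix:

```python
import pandas as pd

# Minimal sketch of one domain sheet's layout (values are hypothetical):
# one row per coded segment, with the teacher index, the segment index,
# and the six coders' nominal codes in columns 3-8.
sheet = pd.DataFrame({
    "teacher": [1, 1, 2],
    "segment": [1, 2, 1],
    "coder1": [4, 2, 4], "coder2": [4, 2, 4], "coder3": [4, 2, 3],
    "coder4": [4, 2, 4], "coder5": [4, 1, 4], "coder6": [4, 2, 4],
})
# each row of this matrix is one segment's six nominal ratings
ratings = sheet.filter(like="coder").to_numpy()
print(ratings.shape)  # (3, 6); a full sheet would be (180, 6)
```

Agreement indices for a domain can then be computed directly from this segments-by-coders matrix.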
Table 1.
Descriptive statistics of the data used to calculate inter-rater reliability.
# of teachers | # of domains (worksheets) | # of coded 20-second segments | # of coders
---|---|---|---
3 | 7 | 180 | 6 |
2. Experimental design, materials, and methods
Data in this paper were part of a larger database of classroom observations from a randomized project. We randomly selected 10% of the data to ensure sample representativeness. The purpose of obtaining such a sample was to calculate inter-rater reliability among the coders. We recommended 10% in consideration of the total number of raters involved in this process (between 5 and 8; see Tong et al. [2]). After receiving intensive training on the observation instrument, each coder was assigned three recorded lessons to code individually with the intent of reaching inter-rater reliability. In the coding process, the coders watched the first five minutes of the recorded video to obtain a general sense of the lesson. In the next five minutes, the rater coded 15 segments, each lasting 20 seconds. The coders repeated this five-minute cycle until they had coded all 60 20-second segments of a 45-minute ESL lesson. For each 20-second segment, a coder rated the seven above-mentioned domains. The data presented in this paper were cleaned and organized by domain in the order of the 20-second segments coded for each teacher.
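Assuming coding proceeds continuously after the five-minute orientation (the procedure above does not state whether pauses occur between the five-minute cycles), the timing of the scheme can be sketched as follows: each coded segment maps to a 20-second window within the lesson.

```python
def segment_window(segment, orientation_s=300, seg_len_s=20):
    """Start/end time (seconds from lesson start) of a coded segment,
    assuming coding runs continuously after the orientation period."""
    if not 1 <= segment <= 60:
        raise ValueError("segment must be between 1 and 60")
    start = orientation_s + (segment - 1) * seg_len_s
    return start, start + seg_len_s

# segment 1 starts right after the 5-minute orientation
print(segment_window(1))   # (300, 320)
# segment 60 ends 25 minutes into the 45-minute block
print(segment_window(60))  # (1480, 1500)
```

Under this assumption, the four coding cycles cover minutes 5 through 25 of the 45-minute ESL block (15 segments x 20 seconds = 5 minutes per cycle).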
Acknowledgments
This dataset was part of Project English Language and Literacy Acquisition-Validation (ELLA-V), supported by the Office of Innovation and Improvement, United States Department of Education, #U411B120047.
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.dib.2020.105303.
Conflict of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendix A. Supplementary data
The following are the Supplementary data to this article:
References
- 1.Lara-Alecio R., Parker R.I. A pedagogical model for transitional English bilingual classrooms. Biling. Res. J. 1994;18:119–133. [Google Scholar]
- 2.Tong F., Tang S., Irby B.J., Lara-Alecio R., Guerrero C., Lopez T. A process for establishing and maintaining inter-rater reliability for two observation instruments as a fidelity of implementation measure: a large-scale randomized controlled trial perspective. Stud. Educ. Eval. 2019;62:18–29. [Google Scholar]
- 3.Tong F., Tang S., Irby B.J., Lara-Alecio R., Guerrero C. The determination of appropriate coefficient indices for inter-rater reliability: using classroom observation instruments as fidelity measures in large-scale randomized research. Int. J. Educ. Res. 2020;99 [Google Scholar]
- 4.Nelson M.C., Cordray D.S., Hulleman C.S., Darrow C.L., Sommer E.C. A procedure for assessing intervention fidelity in experiments testing educational and behavioral interventions. J. Behav. Health Serv. Res. 2012;39(4):374–396. doi: 10.1007/s11414-012-9295-x. [DOI] [PubMed] [Google Scholar]
- 5.Noell G.H. Empirical and pragmatic issues in assessing and supporting intervention implementation in school. In: Peackock G.G., Ervin R.A., Daly E.J., Merrell K.W., editors. Practical Handbook in School Psychology. Guilford; New York, NY: 2010. pp. 513–530. [Google Scholar]
- 6.Smith S.W., Daunic A.P., Taylor G.G. Treatment fidelity in applied educational research: expanding the adoption and application of measures to ensure evidence-based practice. Educ. Treat. Child. 2007;30(4):121–134. [Google Scholar]
- 7.Lee O., Penfield R., Maerten-Rivera J. Effects of fidelity of implementation on science achievement gains among English language learners. J. Res. Sci. Teach. 2009;46(7):836–859. [Google Scholar]
- 8.Missett T.C., Foster L.H. Searching for evidence-based practice: a survey of empirical studies on curricular interventions measuring and reporting fidelity of implementation published during 2004-2013. J. Adv. Acad. 2015;26(2):96–111. [Google Scholar]