PLOS ONE. 2022 Sep 2;17(9):e0273988. doi: 10.1371/journal.pone.0273988

Using authentic representations of practice in teacher education: Do direct instructional and problem-based approaches really produce different effects?

Jürgen Schneider 1,*, Marc Kleinknecht 2, Thorsten Bohl 1, Sebastian Kuntze 3, Markus Rehm 4, Marcus Syring 1
Editor: Micah B. Goldwater

Abstract

This paper investigates the effects of different instructional approaches (problem-based vs. direct instructional) on student teachers' analysis of practice when using authentic representations of practice in teacher education. We assigned 638 student teachers from 21 equivalent teacher education courses to one of the two conditions. Students' analyses of practice were evaluated on selective attention, reflective thought, and theory-practice integrations in a pre-post design. In line with inconsistent findings from prior research, we produced evidence for equivalent effects of the instructional approaches on all dependent variables using Bayesian data analyses. As called for in a review on the topic, we additionally explored the role of the instructors administering the field-study interventions. Findings revealed that a positive attitude toward the instructional approach the instructors administered was related to more theory-practice integrations in the students' analyses.

1. Introduction

Learning from practice is a key element of professionalization in teacher education [1]. Over the past 15 years, approaches that enable student teachers to learn from practice by using representations of practice (particularly via video) have steadily increased. Representations of practice can be described as a window into practice that enables students to experience and understand teaching [2, 3]. They are considered to enable students to approximate practice in a controlled environment and thus prepare them for professional action [4]. Representations of practice can only realize their potential for professional development if they are paired with substantive reflection that is not limited to surface features [5–8]. However, it is challenging for student teachers to engage in deep reflection without further support [9]. Novices tend to focus overly on the surface features of classroom interactions [10] and on themselves rather than on student learning [11]. To address this challenge, researchers aim to identify key processes of reflection to make it tangible in teacher education contexts [12]. Based on seminal literature on reflection by Dewey [13], research in teacher education programs typically emphasizes processes of selective attention, reflective thought, and theory-practice integration. A variety of programs have demonstrated that these skills can be fostered through interventions in teacher education [14, 15]. An open question concerns the comparison of different intervention programs and their effectiveness: Under which boundary conditions do interventions effectively promote reflective engagement with representations of practice? In the current study, we compare different instructional approaches and their effects on selective attention, reflective thought, and theory-practice integration.

1.1 Selective attention, reflective thought, and theory-practice integration with authentic representations of practice

Selective attention, as defined by Sherin and van Es [6], describes the ability to perceive and be aware of relevant, in-depth features of instruction. What is considered relevant depends on the curricula of the teacher education programs and thus on the respective definition of teacher professionalism. In our case, we address classroom management, which is a broadly recognized dimension of teaching quality [16].

The rationale behind the need to train selective attention is that teachers cannot consciously reflect on (or respond to) relevant aspects of instruction if they do not notice them [17]. Comparisons between experienced teachers and student teachers reveal that student teachers tend to have less selective attention to relevant instructional features [18, 19]. At the same time, teaching experience is not a sufficient condition for the development of selective attention to relevant features of instruction [20]. Even experienced in-service teachers may attend to surface features when viewing representations of practice if they have not been trained in selective attention [21]. However, robust evidence exists that selective attention can be trained in teachers [22] and student teachers [23] through the use of representations of practice. In their syntheses of research, Marsh and Mitchell [24], as well as Gaudin and Chaliès [14] found that the use of video-based representations of practice has the potential to promote selective attention among pre-service and in-service teachers.

Selective attention creates the possibility for subsequent reflective thought. From a theoretical stance on reflection, reasoning about different possible options for the selected situation constitutes reflective thought [6, 13, 25]. In the context of learning from representations of practice, reflective thinking involves describing a situation, exploring options for the situation, anticipating consequences, and making a decision based on these processes. These processes constitute the core of reflection in the teaching profession, and learning to apply them to practice is a necessary prerequisite for professional development [26]. Based on their research synthesis, Marsh and Mitchell [24] argue that collaborative learning with video-based representations of practice is particularly effective in promoting reflective thought due to its discursive nature that scaffolds the learning process. Likewise, Gaudin and Chaliès [14] identified a series of empirical studies that corroborate the effectiveness of video-based learning arrangements in this regard.

Topic-specific selective attention and subsequent reflective thinking on representations of practice in teacher education create opportunities to integrate theory and practice. Integrating theory and practice is pivotal to the professional development of teachers [27]. On the one hand, it helps to avoid a purely theory-based education that leads to practice shock [28]; on the other hand, it prevents over-simplified adoptions of practice routines that are disconnected from scientific knowledge about teaching [8]. Representations of practice provide a possibility to foster relations between theory and practice through contextualization and abstraction during the process of reflective thinking [3, 29].

1.2 Problem-based and direct instruction

The instructional approach in which authentic representations are leveraged plays a pivotal role in the effectiveness of the learning outcome [30]. These instructional approaches may have differential effects on how learners perceive and reason about practice [31]. Even though most learning arrangements using authentic representations in teacher education follow a problem-based (PB) approach, it is still unclear which differential effects direct instruction (DI) will evoke, due to the lack of empirical studies that systematically compare these two approaches.

The use of authentic representations is an inherent part of a PB approach. It opens a problem space that initiates the learning process [32]. Learners explore the problem and apply problem-solving strategies, such as hypothesizing and the self-regulated research and application of knowledge [33]. PB approaches involve small groups, whose members divide tasks, share knowledge, and engage in discursive reasoning to reach a conclusion [34]. Since reasoning constitutes an essential part of PB learning, this instructional approach has the potential to promote learners' reflective thought through practice [35]. In contrast, DI describes a teacher-centered approach in which phases of modeling are typically followed by phases of guided and individual practice [36]. Within DI, authentic representations are used to exemplify a solution presented previously by the instructor [37]. Learners may subsequently work on similar authentic representations to replicate the solution process, and thus do not necessarily reason about different solutions. From a theoretical perspective, DI would therefore be suitable for training participants' selective attention. Moreover, DI as a structured approach may be particularly helpful for novice learners, who may otherwise suffer from cognitive overload [38].

1.3 Empirical findings on effects of different instructional approaches

Research over the past several decades on different instructional approaches suggests that DI is successful in promoting content knowledge and transfer [36], particularly in novices [39]. On the other hand, there is evidence that PB instruction promotes skill rather than knowledge acquisition [40] and may foster other factors relevant to the learning process, such as motivation or attitudes [41]. In our study, we compare these two instructional approaches and their effects on student teachers’ selective attention, reflective thought, and integration of theoretical knowledge with representations of practice. To our knowledge, there are only two studies directly comparing different instructional approaches with respect to these variables.

Seidel, Blomberg, and Renkl [42] compared two different instructional approaches that share characteristics with DI and PB—rule-example versus example-rule. In the rule-example strategy, definitions of the topic (goal clarity, scaffolding, and learning climate) were given and subsequently exemplified in a classroom video. The students were then asked to practice the demonstrated observation on further video representations. The example-rule strategy used group discussions stimulated by observations on classroom videos. These discussions and a subsequent moderated collection of ideas were intended to lead to an understanding of the topic. Aggregated scores on noticing and evaluating classroom situations showed that the rule-example group outperformed the example-rule group. Unfortunately, the paper did not report separate scores for noticing and evaluation; thus, it is challenging to draw more precise inferences from the effects. In a test on factual knowledge, the rule-example group outperformed the example-rule group. However, when asked to plan a lesson, the example-rule group more frequently elaborated on theoretical ideas that included situational references rather than being merely general.

Barth et al. [25] compared two collaborative instructional approaches using authentic representations of practice (vignettes) that included either direct instructional or traditional problem-based elements. Interventions were implemented in professional development courses for student teachers on the topic of classroom management. In the DI condition, students received lecture-like instruction on theoretical aspects of classroom management, then analyzed several vignettes guided by a three-step worksheet covering selective attention, reflective thought, and theory-practice integration. In the PB condition, students first observed one problematic vignette, then independently read theoretical literature on classroom management in a self-study phase. Afterward, they proceeded to analyze the given vignette using the same worksheet as the DI condition. The intervention, therefore, focused more on different approaches to the acquisition of theoretical knowledge and less on different approaches to the analysis process. Measures included selective attention and theory-practice integrations in written analyses of video-based classroom vignettes. A posttest comparison of the groups in the first study yielded no differences in measures of selective attention and a small to medium effect on theory-practice integration favoring the DI group. In a second study, Barth et al. [25] focused solely on the DI group, revealing no differences in selective attention and a strong effect on theory-practice integrations from pretest to posttest.

1.4 Empirical findings on the effects of instructors’ attitudes

In a systematic review, Baecher et al. [43] highlighted the role of instructors administering the interventions in field studies, which is rarely clarified in scholarly publications. By guiding discussions and structuring participants' analyses, the instructor plays an important part in the success of the learning process [44, 45].

One important aspect when implementing and studying the effects of teaching approaches in field studies is the attitude of the instructors administering the treatment conditions. Instructors of teacher education courses may differ in their attitudes about teaching styles and learning scenarios [46]. These attitudes can be consistent or inconsistent with features of the treatment conditions and influence the way instructors practice their teaching. Even when facing obstacles (e.g., being told to teach differently), instructors may try to maintain consistency between their attitudes and their practice [47]. Given that field studies involve instructors in the delivery of treatments, their attitudes toward the treatment can be an important source of information and predict outcomes. Several studies revealed that a positive attitude of instructors toward the treatment may have an effect on subjects' performance on critical measures (e.g., seminal first findings by Rosenthal and Fode [48]). Since there has been little research in this regard in teacher education field studies, this part of our study is exploratory.

Based on the findings described above, we conducted an experimental field study comparing different instructional approaches and their effects on selective attention, reflective thought, and integration of theoretical knowledge with authentic representations of practice. We also address the role of the instructor, focusing on their attitude toward the treatment.

There exists a large body of exploratory research on learning arrangements utilizing video-based representations of practice in teacher education. However, to accumulate evidence on differential effects and boundary conditions of video-based approaches, we need studies that conduct experimental variation and comparison.

1.5 Research questions and hypotheses

In this study, we investigate how different instructional approaches (PB vs. DI) and the instructors' attitudes are related to student teachers' selective attention, reflective thought, and integration of theoretical knowledge with authentic representations of practice. One of the hypotheses on reflective thinking (H2_1) was based on strong assumptions derived from theory. The remaining hypotheses were labeled as exploratory, since robust research is lacking.

RQ1 on selective attention: To what extent are different instructional approaches (DI and PB) and instructors’ attitudes on these approaches related to the selective attention of student teachers?

Based on theoretical considerations and first empirical findings, we hypothesize that DI and PB instructional approaches both foster selective attention (no difference between groups), with instructors' positive attitudes about the instructional approach positively related to selective attention (H1_1: β_treat = 0 & β_IA > 0).

RQ2 on reflective thought: To what extent are different instructional approaches (DI and PB) and instructors’ attitudes on these approaches related to the reflective thought of student teachers?

Based on established research findings, we hypothesize that a PB instructional approach leads to more elaborate reflective thought compared with DI, with instructors' positive attitudes about the instructional approach positively related to reflective thought (H2_1: β_treat > 0 & β_IA > 0).

RQ3 on theory-practice integrations: To what extent are different instructional approaches (DI and PB) and instructors’ attitudes on these approaches related to theory-practice integrations of student teachers?

We hypothesize that DI and PB instructional approaches both lead to similar amounts of theory-practice integrations (no difference between groups), with instructors' positive attitudes about the instructional approach positively related to theory-practice integrations (H3_1: β_treat = 0 & β_IA > 0).

Consistent with our hypotheses, we use Bayesian inferential statistics in our data analysis. In contrast to frequentist null hypothesis testing, this allows us to make statements about the equivalence of groups, as used in H1_1 and H3_1 [49]. We tested the hypotheses of the two predictors (instructional approaches, instructors' attitudes) within each dependent variable simultaneously to increase rigor by making the predictions as precise as possible.

2. Method

Written approval was obtained from the Ethics Committee of the Faculty of Economics and Social Sciences at the University of Tübingen (without approval number). The participants were informed about the study's research interest one month before the start of the study. Participants completed a written informed consent form that included a privacy statement. Participants were aware that participation was voluntary and that non-participation had no consequences. We also informed participants that we would anonymize the dataset after data collection and would not share it with the instructors.

2.1 Participants

In total, 638 student teachers participated in the study, recruited from 21 introductory courses for secondary teachers. The 21 courses covered the same content on teaching, learning, and instruction over one semester (15 weekly 90-minute sessions). They all took place in the same academic semester and teacher education program, allowing for comparable conditions. The treatment was a regular part of the course; measurements were administered through an online pretest and posttest survey. Participants were M_age = 21.01 (SD_age = 2.36) years old on average and predominantly female (65.0% female, 33.2% male, and 1.8% "other" or no response). Students usually attend this course in their second of 10 semesters; hence, 72.3% of the participants were studying in their second semester.

2.2 Design

The study was conducted in a regular teacher education program in Germany, allowing for field study conditions. In each of the 21 courses included in the study, one of the two interventions was implemented. For the interventions, we redesigned part of the courses (two sessions and an inter-session assignment) using authentic representations of practice. For this purpose, we focused on sessions in which classroom management was on the curriculum. These were sessions six and seven of 15 sessions. The other 13 sessions before and after the treatment remained as planned by the instructors ("business as usual"). Courses were realized either as problem-based (n_PB = 11) or direct instructional (n_DI = 10). Thus, the interventions were implemented at the course level. Students were assigned to the courses via a university-administered system we had no influence on. We allocated courses to the different interventions such that each intervention was uniformly distributed across days (Monday to Friday), time slots (8am to 8pm), and the six instructors. Each instructor taught DI and PB courses; the conditions were balanced within instructors (each taught the same number of courses in both conditions, except when teaching an odd number of courses). These instructors were the same as those teaching throughout the semester, further strengthening field study conditions. We used the R package 'BayesFactor' [50] to test whether conditions were equivalent concerning several potentially confounding context variables. The Bayes factor is a measure of relative evidence comparing two hypotheses, one of which can be specified as a null hypothesis. As opposed to (classical frequentist) null hypothesis significance testing, Bayes factors allow us to gather relative evidence for a null hypothesis and, therefore, to test for equivalence. In our Bayes factor analyses for two independent samples with default priors of the BayesFactor package [50], we tested the conditions for equivalence (H0) or differences (H1) of the groups concerning gender (BF_10 = .109), teaching experience (BF_10 = .116), experience with video-based analyses (BF_10 = .099), topic-related literature read (BF_10 = .114), and prior knowledge on classroom management theories (BF_10 = .101). All Bayes factors pointed toward evidence of equivalence between the groups prior to the treatment.
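As a minimal sketch, such an equivalence check can be run with the BayesFactor package; the data frame and column names below (students, condition, prior_knowledge) are hypothetical stand-ins, not the study's actual variables.

```r
library(BayesFactor)

# Hypothetical student-level data: condition (DI vs. PB) and one
# pre-treatment covariate
set.seed(1)
students <- data.frame(
  condition = factor(rep(c("DI", "PB"), each = 100)),
  prior_knowledge = rnorm(200)
)

# Bayes factor t-test for two independent samples with default priors
bf <- ttestBF(formula = prior_knowledge ~ condition, data = students)

extractBF(bf)$bf      # BF_10: evidence for a difference between conditions
1 / extractBF(bf)$bf  # BF_01: evidence for equivalence (> 1 favors the null)
```

Because the Bayes factor is a ratio of evidence, inverting BF_10 directly yields the evidence for the null, which is what reported values around .1 express.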

2.3 Treatments, materials, and procedure

The procedure of the treatment is shown in Table 1. To ensure that the instructors were well trained in conducting the intervention sessions, they taught these sessions one semester before the study and received feedback from two researchers who videotaped and observed the sessions. In multiple subsequent meetings with all instructors, we were able to further optimize treatment compliance. From a pool of 14 normal-practice classroom videos, we selected four vignettes suitable as authentic representations of classroom management, each approximately 5 minutes long. These vignettes were deemed appropriate because strategies of classroom management were particularly visible in them (thus providing an authentic representation) and because they matched the students' grade level (secondary education). To offer text-based vignettes, we transcribed the video vignettes and added nonverbal information. Text and video vignettes were evenly distributed between the two treatment groups, representing another factor (2 x 2 design) not considered as a predictor in this paper (for description and results see [51]). One vignette was used in the first session, another as an assignment (to be completed by the second session), and two in the second session.

Table 1. Procedure for measurement and intervention.

Before intervention (40 minutes), both conditions: (online) pretest (analysis of practice, demographics, and covariates).

Session 1 (90 minutes), both conditions: instructor lectures on classroom management theories and on the steps for analyzing classroom vignettes. Then:
• PB: students analyze a video or text-based vignette in small groups, focusing on classroom management, and participate in collaborative discussions about interesting situations.
• DI: instructor analyzes a video or text-based vignette on classroom management, step by step; students replicate the analysis with a new situation.

Assignment (60 minutes), both conditions: analysis of a video or text-based vignette at home.

Session 2 (90 minutes):
• PB: students analyze a video or text-based vignette in small groups, focusing on classroom management, and participate in a collaborative discussion about interesting situations.
• DI: instructor analyzes a video or text-based vignette on classroom management, step by step; students replicate the analysis with a new situation.

After intervention (40 minutes), both conditions: (online) posttest (analysis of practice, covariates).

The first session had two parts. In the first part, the instructor introduced students to the topic of classroom management and its theoretical approaches [52–54]. Instructors gave definitions and clarified strategies concerning classroom management in a short exercise. Subsequently, students were made familiar with the steps of how to analyze the authentic representations of practice (i.e., vignettes): "Describe the problem/situation, describe the teacher's action, reason about alternative courses of action, anticipate reasons and consequences of these actions, decide on one of the alternatives." This first part was taught identically in all courses and in both conditions. In contrast, the second part varied between the two conditions.

2.3.1 Direct instructional treatment

In the second part, instructors introduced students to the lesson represented in the vignette, giving background information on the topic, the lesson structure, the class level, and the specific sequence of the lesson they were about to witness. The vignette was then shown to the entire course without interruption so that students could form an overall impression of the classroom activities. In a second round, the instructor stopped at certain situations and demonstrated a step-by-step analysis that integrated theoretical aspects of classroom management. After a situation was sufficiently analyzed, the instructor continued by repeating the analysis with several more situations. After this, students individually analyzed further situations chosen by the instructor; these analyses were discussed as a dialogue, with the instructor and other students contributing. In the second session, the same procedure of demonstration and exercise was repeated for two additional vignettes.

2.3.2 Problem-based treatment

Problem-based courses also started with the instructor giving background information on the subsequent vignette. After that, the vignettes were handed out either as text vignettes on paper or as video vignettes viewed on laptops. Students came together in groups of four to five members. They observed several situations and discussed the analysis in their group. The students, not the instructor, selected which situations they analyzed. Students were free to determine what they considered to be a problem (selection of situations) and what was and was not part of the problem. Students were also free in how they inquired into these situations. To guide the analysis, students received key questions that targeted the steps of the analysis of practice. These key questions merely served as a guide in case students needed support with their analysis; they did not serve as step-by-step instructions and were not introduced as such. In a final discussion, the whole course talked about two or three of these situations. Students were asked to describe situations that stood out to them and to analyze them. The instructor acted as a moderator and tried to include different student suggestions about the situation and to promote a discursive discussion. In the second session, the instructor repeated the same procedure of small-group discussion and final course-group discussion with two more vignettes.

Treatment checks were administered by one of three trained raters who visited the treatment sessions. They judged the implementation of the treatment on eight items that measured the degree of DI of the instructor on a 6-point Likert scale (from "doesn't apply at all" to "applies completely"). A (reversed) example item is as follows: "Students chose which situations to have a closer look at while working with the vignettes." Raters also recorded the time students were effectively able to work on the vignettes. For both measures, the inter-rater reliability showed good intraclass correlations (ICC = .96–.99). Internal consistency of the treatment check scale was good (Cronbach's α = .96 for both sessions). In both sessions, we found evidence of equivalence between the groups for the time students worked on the vignettes (1st session: BF_10 = .514; 2nd session: BF_10 = .545). The evidence in both sessions is not very strong; however, both point in the same direction. In contrast, we were able to provide evidence of differences in the degree of direct instruction between the treatments (1st session: BF_10 = 3.76 · 10^6; 2nd session: BF_10 = 2.34 · 10^5) using Bayes factor analyses for two independent samples with default priors of the BayesFactor package [50]. These findings corroborate that the instructors realized the different instructional methods as planned, with comparable times on task.
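For illustration, intraclass correlations of this kind could be computed with the psych package; the small ratings matrix below is a hypothetical stand-in for the double-coded treatment-check scores.

```r
library(psych)

# Hypothetical treatment-check scores: rows = observed sessions, columns = raters
ratings <- cbind(
  rater1 = c(5.1, 2.0, 4.8, 1.5, 5.5, 2.3),
  rater2 = c(5.0, 2.2, 4.9, 1.4, 5.4, 2.5)
)

ICC(ratings)  # reports the common ICC variants, e.g., ICC(2,1) and ICC(3,1)
```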

2.4 Measures

Dependent variables and covariates were assessed via an online survey that the students were asked to complete as part of the course, before (pretest) and after (posttest) the two treatment sessions. As stated in the research questions, we examine the dependent variables of selective attention, reflective thought, and integration of theoretical knowledge with authentic representations of practice. These three variables were assessed through written comments that students gave on classroom vignettes presented in the survey. Students were instructed to comment on situations they perceived as relevant to the topic of classroom management. Since students might have been familiar with the idea but not with the term classroom management in the pretest, we asked them to observe the lesson planning, control of behavior, and shaping of relationships witnessed in the vignettes, the three dimensions of classroom management taught in the treatment sessions. They were instructed to write and save each analysis separately using a "save comment" button below the text box (Fig 1). After saving an analysis, they were free to continue the observation and write further analyses.

Fig 1. Screenshot of the web-based survey: One of three vignettes to be analyzed by students.


The pretest and posttest each presented three vignettes of a classroom video that added up to ten minutes. To avoid a test effect, different vignettes from the same videotaped classroom lesson were used in the pretest and posttest. We investigated which pairs of similar vignettes could be drawn from a classroom video; the members of each pair were then split between the pretest and the posttest. Three experts (in the fields of practice-oriented teacher education and classroom management) rated the vignettes on 14 dimensions (e.g., complex, interesting, or classroom management) with a mean (standard deviation) agreement of r_WG = .79 (.31). We used these ratings to conduct cluster and graphical analyses to find three pairs of similar vignettes, which were then separately allocated to either the pretest or the posttest.
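A minimal sketch of such a matching step, assuming the expert ratings are aggregated into a vignettes-by-dimensions matrix (all names and values hypothetical):

```r
# Hypothetical mean expert ratings: 6 vignettes rated on 14 dimensions
set.seed(1)
ratings <- matrix(runif(6 * 14, min = 1, max = 6), nrow = 6,
                  dimnames = list(paste0("vignette", 1:6), paste0("dim", 1:14)))

# Hierarchical clustering on the distances between rating profiles;
# vignettes that merge early in the dendrogram are candidates for
# pretest/posttest pairs
hc <- hclust(dist(ratings), method = "ward.D2")
plot(hc, main = "Similarity of vignettes across rated dimensions")
```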

2.4.1 Number of selected situations as selective attention

In teacher education programs, what is defined as relevant may depend on the current learning goal of the course or learning arrangement (e.g., teacher guidance and support [55]). In our study, we focus on classroom management, the topic of the two course sessions in which our treatment took place.

Thus, we operationalize selective attention as the situations the participants select to discuss concerning a relevant topic, where "relevant" means that the selected situations are instances of the focused topic of our study, classroom management. We use the number of selected situations in the web-based survey that students commented on (i.e., the number of saved analyses). Participants were instructed to comment on every situation they perceived as relevant concerning classroom management in the vignette. Therefore, they were able to save as many analyses as they pleased. A trained coder rated whether the analyses focused on classroom management or were off-topic. Counts were summed for each of the three vignettes in the pretest and posttest, thus constituting the number of selected (and analyzed) situations for each vignette.

2.4.2 Realized inquiry steps representing reflective thought

We operationalized reflective thought as students' ability to apply the inquiry steps to the situations they selected from the vignettes in the online survey. On the survey, students were reminded of the inquiry steps in the item question (see Fig 1). Thus, we measured whether students could transfer and apply these steps to the practice situations observed rather than whether they remembered the steps. The comments written and saved by the students were coded as to whether they contained the individual inquiry steps (dichotomous). The scale from 0 to 3 categorizes whether we detected none, one, two, or all three of these steps in each of the student's written comments (Table 2).

Table 2. Levels of reflective thought based on coded inquiry steps.

Level 3 (description, alternatives, and consequences): "The teacher collects the notes from the students. It becomes increasingly loud in the classroom as the teacher hangs notes on the blackboard. The teacher gives the students a warning. It gets quieter. Overall, it would be better to include the students more in hanging up the notes. That would enable a quieter learning situation."

Level 2 (description and alternatives): "Students insult each other, but the teacher tries to ignore it by continuing with the lesson. She should really try to stop the insults before recommencing."

Level 2 (description and consequences): "While checking the results of the assignment, the teacher praises the students a lot. That'll motivate them."

Level 2 (alternatives and consequences): "The class is loud as some students begin to present their results. The teacher should not let them begin to present until the rest of the class is quiet. Thus, half the information wouldn't be lost."

Level 1 (description): "There was quite a bit of noise in the classroom, then the teacher placed her finger on the lips."

Level 1 (alternatives): "The teacher should react to the student wandering around the room."

Level 1 (consequences): "The students don't seem to heed the teacher's actions."

Level 0 (none): "Everything seems normal."

Inter-rater reliabilities for all codings were computed based on a randomly selected 20% of the approximately 7,600 comments written by the participants in the pretest and posttest. Cohen's kappa scores of the two trained raters were satisfactory (κ = .64–.77), and disagreements were resolved through discussion. The raters' scores were tested for unidimensionality per vignette using confirmatory factor analysis (CFA) with individual comments used as indicators and the robust maximum likelihood estimator (MLR) obtained by full information maximum likelihood [56]. The data showed a good fit for all six vignettes, p(χ²) = .13–.99, CFI = .95–1, RMSEA ≤ .001–.03, p(RMSEA < .05) = .86–1. Furthermore, model comparisons indicated that we were able to assume strict measurement invariance for all vignettes between treatment groups. In addition, reliability between the comments revealed good internal consistency for all vignettes, McDonald's ω = .70–.80 [57]. Thus, we computed mean scores for each vignette in the pretest and posttest, reflecting the average quality of reflective thought.
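A minimal sketch of these checks with lavaan and semTools, using simulated data and hypothetical names (indicators c1 to c4 for coded comments, grouping variable treat):

```r
library(lavaan)
library(semTools)

# Simulated stand-in data: four comment scores loading on one factor
set.seed(1)
f <- rnorm(100)
dat <- data.frame(c1 = f + rnorm(100), c2 = f + rnorm(100),
                  c3 = f + rnorm(100), c4 = f + rnorm(100),
                  treat = rep(c("DI", "PB"), each = 50))

# One-factor model; MLR with full-information maximum likelihood
model <- 'rt =~ c1 + c2 + c3 + c4'
fit <- cfa(model, data = dat, estimator = "MLR", missing = "fiml")
fitMeasures(fit, c("pvalue", "cfi.robust", "rmsea.robust"))

# Strict measurement invariance across groups: equal loadings,
# intercepts, and residual variances, compared against the free model
fit.config <- cfa(model, data = dat, group = "treat",
                  estimator = "MLR", missing = "fiml")
fit.strict <- cfa(model, data = dat, group = "treat",
                  estimator = "MLR", missing = "fiml",
                  group.equal = c("loadings", "intercepts", "residuals"))
anova(fit.config, fit.strict)

# Internal consistency, including McDonald's omega
# (newer semTools versions provide compRelSEM() as the successor)
reliability(fit)
```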

2.4.3 Theory–practice integration

We assessed students' theory–practice integration regarding classroom management principles by coding their written analyses for terms and principles from the classroom management literature. When such a principle was detected, coders evaluated whether it fit the situation to which the comment referred. We decided against using a sample solution (a predetermined list of theoretical principles associated with specific situations) because it would not do justice to the complexity of classroom situations: coding only for predetermined principles excludes the participants' perspective, so there can be no certainty about what students were truly referring to within these situations. Written analyses that referred to theoretical principles of classroom management fitting the situation described were coded dichotomously with 1 if they met these criteria (e.g., "the teacher does not manage to show withitness by maintaining eye contact and keeping students under control or quiet") or 0 if they did not.

Inter-rater-reliability was satisfactory (κ = .74). We computed mean scores for each vignette in the pretest and posttest, reflecting the percentage of comments that included theoretical principles of classroom management.
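For illustration, Cohen's kappa for such dichotomous double-coding could be obtained with the psych package; the two rater vectors below are hypothetical.

```r
library(psych)

# Hypothetical dichotomous codes from two raters for the same ten comments
rater1 <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
rater2 <- c(1, 0, 1, 0, 0, 0, 1, 0, 1, 1)

cohen.kappa(cbind(rater1, rater2))  # unweighted kappa with confidence bounds
```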

2.4.4 Theoretical literature on classroom management

We assume that the students were unfamiliar with the topic of classroom management, since the teacher education program's curriculum had not covered it up to this point (second semester). For the two sessions on classroom management, three recommended readings [52–54] were provided for download on the university's content management system used for this course. As theory–practice integration was part of the treatment and a central dependent variable, we needed to control for different preconditions of theoretical knowledge caused by sources outside the treatment sessions. Students were thus asked which of the articles they had read. Assessing the literature read (as well as attendance and instructors' positive attitude toward the treatment, see below) based on self-report carries the risk of social desirability. To counteract this bias, we repeatedly emphasized the anonymity of the pretest and posttest. We further pointed out that the data would be used exclusively for research purposes and would not be shared with the instructors. The number of students who reported not having read any or only one of the texts can be interpreted as an indication of a minor role of social desirability: After the two treatment sessions, 41% of the students indicated they had read none, and 28% had read all three texts (median = 1). The theoretical literature read was used as a control variable.

2.4.5 Attendance

As the treatment was part of two sessions of a regular course, student attendance varied and influenced the efficacy of the treatment. The more sessions students attended, the more the treatment could influence their cognition and behavior. Student attendance is therefore an aspect of treatment feasibility [58]. In a field-based setting, we could not determine or standardize student attendance; consequently, we measured it. Accordingly, we asked the students how many of the two treatment sessions they had attended (none, one, or two). Both sessions were attended by 79% of the students, whereas 20% attended one session and 1% attended neither. Again, these numbers are indicative (but not evidence) of low social desirability. Attendance was used as a control variable.

2.4.6 Instructors’ positive attitude toward the treatment

During the yearlong training the instructors received for the intervention, we noticed their divergent attitudes toward the two instructional approaches. Attitudes largely reflect the extent to which the instructional approach is consistent with the values and practice routines of the instructors [59]. Particularly in field settings, measures of attitudes are associated with values and practice routines; consequently, the interrelationships of these constructs must be taken into account when interpreting results. To assess their attitudes, we asked separately for both treatments they conducted: "Think about the concept of the PB [DI] course: I like the way learning with classroom vignettes is handled here." Neither treatment was consistently more popular across instructors: On the 6-point Likert scale (1 = disagree strongly to 6 = agree strongly), instructors' attitudes ranged from 2 to 6 for the problem-based treatment (M = 4.77; SD = 1.42) and from 1 to 6 for the direct instructional treatment (M = 3.80; SD = 1.84). We ran a Bayes factor t-test for dependent samples, which showed considerable evidence for a difference between the attitudes toward the two treatments (BF_10 = 1.424 · 10^7). This indicates that instructors held divergent, but not uniformly one-sided, attitudes toward each treatment; thus, we used the variable as a covariate of the treatment. We matched the attitude of the instructor toward a specific course to the students' data from exactly that course. This way, we were able to predict student-level data with the respective course-level information.
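As a sketch, such a dependent-samples Bayes factor t-test can be run with the BayesFactor package; the paired attitude vectors below are hypothetical.

```r
library(BayesFactor)

# Hypothetical paired attitude ratings (PB vs. DI), one pair per observation
attitude_pb <- c(5, 6, 4, 5, 2, 6, 5, 4)
attitude_di <- c(3, 2, 6, 4, 1, 5, 3, 2)

bf <- ttestBF(x = attitude_pb, y = attitude_di, paired = TRUE)
extractBF(bf)$bf  # BF_10: evidence for a difference between the paired ratings
```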

2.5 Statistical analysis

The collected data contained 18% missing values overall. Therefore, in a first step, we imputed the data via chained equations with the R package 'mice' [60]. Second, with the complete datasets, we computed separate full structural equation models for each of the three dependent variables (selective attention, reflective thought, and theory–practice integration) with the predictors treatment (0: DI, 1: PB), attendance, theoretical literature on classroom management, and instructors' attitudes about the treatment (Fig 2). Note that the three vignettes we used to measure the dependent variables in the pretest and posttest were matched with great effort, yet they were not exactly the same. As a result, comparisons between the pretest and posttest on absolute scores must be interpreted carefully for all three dependent variables. Hence, we preferred to use the pretest scores as predictors of the posttest, accounting for differences before the treatment. The other predictors can then be interpreted as increasing or decreasing the posttest score under control of the pretest scores. Reflecting our study design, we obtained clustered data. Given our research interest, we are not interested in modeling these clustered data but consider them a nuisance [61]. Accordingly, we used robust standard errors and an adjusted χ² that take the clustered structure into account.
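A minimal sketch of this pipeline with mice and lavaan is shown below. For the cluster adjustment, one option is the design-based correction from the lavaan.survey add-on; whether this matches the authors' exact implementation is an assumption, and all data and variable names are simulated stand-ins. For brevity, the model is fitted to a single imputed data set; a full analysis would pool across all imputations.

```r
library(mice)
library(lavaan)
library(lavaan.survey)  # assumption: one way to obtain cluster-robust SEs

# Simulated student-level data: latent pre/post selective attention
# indicators, covariates, course membership, and some missing values
set.seed(1)
n <- 200
f1 <- rnorm(n); f2 <- 0.5 * f1 + rnorm(n)
dat <- data.frame(
  sa1_v1 = f1 + rnorm(n), sa1_v2 = f1 + rnorm(n), sa1_v3 = f1 + rnorm(n),
  sa2_v1 = f2 + rnorm(n), sa2_v2 = f2 + rnorm(n), sa2_v3 = f2 + rnorm(n),
  treat = rep(0:1, each = n / 2), att = sample(0:2, n, replace = TRUE),
  lit = sample(0:3, n, replace = TRUE), IA = rnorm(n),
  course = rep(1:20, each = 10)
)
dat$sa1_v1[sample(n, 20)] <- NA  # missingness for the imputation step

# Step 1: multiple imputation by chained equations
imp <- mice(dat, m = 5, printFlag = FALSE, seed = 1)
completed <- complete(imp, action = 1)

# Step 2: structural model, pretest predicting posttest
model <- '
  SA1 =~ sa1_v1 + sa1_v2 + sa1_v3   # pretest selective attention
  SA2 =~ sa2_v1 + sa2_v2 + sa2_v3   # posttest selective attention
  SA2 ~ SA1 + treat + att + lit + IA
'
fit <- sem(model, data = completed, estimator = "MLR")

# Step 3: adjust standard errors and chi-square for clustering in courses
design <- svydesign(ids = ~course, data = completed)  # equal weights assumed
fit_adj <- lavaan.survey(fit, design)
summary(fit_adj)
```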

Fig 2. Computed structure model for one dependent variable (i.e., selective attention, models on reflective thought, and theory–practice integration structured accordingly).

Fig 2

SA: Latent variable “selective attention” in pretest (SA1) as a control variable and posttest (SA2) as the dependent variable; sa: Manifest variables of “selective attention” representing the three vignettes within each measurement point; att: Attendance; lit: Theoretical literature on classroom management; treat: Treatment; IA: Latent variable of instructors’ attitudes about the treatment; ia: Manifest variables of instructors’ attitudes.

Fit indices of the three models were good. Taking the N = 638 students into account, it is not surprising that the χ²-test shows a significant result (χ²(40) = 56.743–67.988, p = .008–.053). Furthermore, the CFI = .962–.981, TLI = .950–.975, and RMSEA = .033–.035 with CI_95% = [.014, .042] indicated good fit (lowest and highest values, respectively).

As mentioned in the hypothesis section, a Bayesian approach is needed to test the formulated hypotheses. Accordingly, with the results from the models, we applied a Bayesian informative hypothesis approach using the R package 'bain' [62]. As opposed to the commonly used frequentist null hypothesis significance testing, this allows us to quantify and compare the relative evidence of hypotheses (including a null hypothesis). The hypotheses to be compared were derived from the hypotheses formulated above. We tested the hypotheses of the two predictors (instructional approaches, instructors' attitudes) within each dependent variable simultaneously to increase rigor by making the predictions as precise as possible. They are reported directly in the respective results sections to reduce complexity (all analyses and results can be examined in the supplemental material).
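For illustration, an informative-hypothesis test of this kind could look as follows with bain. Here a simple linear regression stands in for the structural model (the paper evaluates SEM estimates instead), and the data are simulated.

```r
library(bain)

# Simulated stand-in: posttest score predicted by treatment and
# instructor attitude (IA)
set.seed(1)
d <- data.frame(treat = rep(0:1, each = 100), IA = rnorm(200))
d$post <- 0.2 * d$IA + rnorm(200)

fit <- lm(post ~ treat + IA, data = d)

# Competing informative hypotheses for RQ1, separated by semicolons;
# bain also reports each hypothesis' complement and the unconstrained
# alternative, so Bayes factors such as BF_2c or BF_20 can be derived
result <- bain(fit, "treat = 0 & IA > 0; treat = 0 & IA < 0; treat = 0 & IA = 0")
print(result)
```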

3. Results

3.1 Selective attention

The number of selected situations declined slightly from pretest to posttest in both groups (Table 3). This might be an unexpected result; however, recall that a direct comparison of pretest and posttest scores should be interpreted with caution.

Table 3. Means (and standard deviations) of the three dependent variables.

        Selective attention        Reflective thought        Theory-practice integration
        Pretest      Posttest      Pretest      Posttest      Pretest      Posttest
DI      2.44 (1.71)  2.19 (1.41)   2.18 (.58)   2.16 (.59)    .05 (.14)    .23 (.33)
PB      2.42 (1.81)  2.15 (1.44)   2.17 (.56)   2.22 (.60)    .04 (.11)    .27 (.34)

Note: Selective attention: number of analyzed situations per test; reflective thought: realized inquiry steps in the analysis process [0–3]; theory–practice integration: relative frequency of analyses including theoretical target aspects. Standard deviations in parentheses.

We expected the DI and PB approaches to foster selective attention (no difference between groups) and positive attitudes of the instructors about the instructional approach to be positively related to selective attention. Therefore, the statistical hypothesis to test can be formulated as H1_1: β_treat = 0 & β_IA > 0, with the dichotomous treatment variable coded as DI = 0 and PB = 1 (see Fig 2). This hypothesis is tested against H1_2: β_treat = 0 & β_IA < 0, as one may also assume that instructors with positive attitudes toward a treatment made students analyze situations in greater detail; students would, in consequence, have selected fewer situations to analyze. These hypotheses are further compared with a null hypothesis, H1_0: β_treat = 0 & β_IA = 0, and an unrestricted hypothesis that will have the greatest probability in case all our formulated hypotheses are implausible, H1_u: β_treat and β_IA unconstrained.

To describe the results, we indicate which hypothesis received the greatest posterior probability, then report the Bayes factor against its complement (the opposite of the hypothesis) and the Bayes factors of the hypothesis against the other hypotheses tested (see supplement for detailed results). Evidence pointed toward the exploratory hypothesis H1_2 as well as the null hypothesis H1_0. We found solid evidence for these hypotheses against their complements (BF_2c = 27.32; BF_0c = 25.59) and against H1_1 (BF_21 = 50.97; BF_01 = 47.73). Comparing the two hypotheses H1_2 and H1_0 against each other yielded no clear result (BF_20 = 1.07). We conclude that there is strong evidence that the two instructional approaches have an equivalent effect on selective attention. In addition, instructors' positive attitudes toward the treatment had either no relation or a negative relation to the number of selected situations; regarding the instructors' attitude, we cannot make a conclusive statement.

3.2 Reflective thought

Students' reflective thought (as measured by realized inquiry steps in the analyses) was already well developed before they entered the treatment sessions and showed little change afterward.

We expected the PB approach to foster reflective thought (compared with DI) and instructors' positive attitude to show a positive relation, H2_1: β_treat > 0 & β_IA > 0. Two exploratory hypotheses tested whether only one of the effects holds, H2_2: β_treat = 0 & β_IA > 0 and H2_3: β_treat > 0 & β_IA = 0. As before, we included a null (H2_0) and an unrestricted hypothesis (H2_u).

As one may expect from the descriptive results, we found substantial evidence for the null hypothesis against the other hypotheses and its complement (BF_0u = 82.00; BF_0c = 82.00; BF_01 = 28.22; BF_02 = 4.09; BF_03 = 6.26). Therefore, we conclude that the effect on students' reflective thought is equivalent between the instructional approaches, and the instructors' attitude is not related to students' reflective thought.

3.3 Theory–practice integration

Students' theory–practice integrations when analyzing classroom situations changed considerably from the pretest to the posttest (ΔM_DI = 18%; ΔM_PB = 23%). Even though the pretest and posttest vignettes are not identical, the difference in scores is noteworthy. As opposed to reflective thought, students showed considerable room for improvement in their theory–practice integrations in the pretest: A mere 2–8% of analyses of the pretest vignettes contained theory–practice integrations, although the theoretical literature on classroom management had already been provided before the test. To obtain the effect of the treatment independently of the amount of literature read by the students, we measured and controlled for this variable (see Fig 2).

We expected the PB approach to generate a similar amount of theory–practice integration as DI and the instructors' attitudes to be positively related, H3_1: β_treat = 0 & β_IA > 0. As before, we explored further hypotheses on whether PB instruction shows a positive effect, H3_2: β_treat > 0 & β_IA > 0, or the attitudes make no difference, H3_3: β_treat > 0 & β_IA = 0, and included a null (H3_0) along with an unrestricted hypothesis (H3_u). Evidence indicates strong support for the hypothesis H3_1 (BF_1u = 34.06; BF_1c = 34.06; BF_12 = 12.24; BF_13 = 90.95; BF_10 = 41.09). From these results, we infer that both instructional approaches lead to an equivalent effect on students' theory–practice integrations. What is more, instructors' positive attitudes toward the treatment are related to an increase in theory–practice integrations (see Fig 3).

Fig 3. Two-dimensional density plot of change from pretest to posttest in theory-practice integrations and the instructors’ positive attitudes.


4. Discussion

4.1 Interpretation of results

The goal of our field study was to compare different learning scenarios using authentic representations of practice and their differential effects on how students analyze classroom situations. More specifically, we considered selective attention, reflective thought, and theory–practice integration. Our data did not reveal differences between the DI and PB approaches in the selective attention of students. At first glance, the treatment's low impact does not necessarily contradict Seidel et al.'s [42] results because students in the DI courses might have been able to notice more critical situations but, if given a choice, did not make use of that skill due to a negative attitude or a lack of motivation. To test this possible explanation, we included the students' willingness for effort and their attitude on readiness for reflection in the structural model. These variables had no significant relation with selective attention and thus cannot resolve this issue. Furthermore, we found that the number of selected situations decreased from the pretest to the posttest in both conditions. We cannot conclusively elucidate this phenomenon with our data, but we offer some tentative interpretations. A first intuitive explanation is that the analyses became fewer because students wrote longer analyses; however, we did not observe an increase in the length of the analyses in our data. A second explanation could be that the vignettes to be analyzed in the pretest and the posttest offer different numbers of situations that can be analyzed. Although we matched the vignettes to the pretest and posttest with great effort, we cannot exclude this option. A third explanatory approach relates to test fatigue. The students may have put more effort into the pretest because analyzing classroom videos was a novel format for them (novelty effect). After they went through the pretest and analyzed several instructional videos again in the treatment sessions, the novelty effect may have worn off and their willingness to reflect may have decreased in the posttest. A slight decrease in scales of readiness to reflect was indeed observed in our data. Concerning the relation of the instructors' attitude toward the instructional approaches with the selective attention of students, we could not draw a firm conclusion, as the results were inconclusive.

We revealed evidence that PB and DI have equivalent effects on the reflective thought of students. This contradicts our assumptions formulated before data collection. As the strongest predictor in the model was the pretest score on reflective thought and our intervention was rather short (two 90-minute sessions), this might support the assumption that reflective thought is a rather stable skill, and thus, more challenging to influence. Further analysis shows a negative relationship with the number of selected situations (r = -.371, p < .001) and a positive correlation with theory–practice integration (r = .278, p < .001). The motto for high performers in inquiry seems to be less quantity, more quality. Overall, the means imply consistently high scores on the 0–3 scale from pretest to posttest for both groups. Moreover, instructors' attitudes toward the treatment did not play a role in students' reflective thought.

Students' theory–practice integration scores in analyzing classroom situations improved greatly from the pretest to the posttest. However, this result should be taken with a grain of salt because the pretest and posttest are not equivalent, even though we matched them with great effort. Both instructional approaches appear to have equivalent positive effects, as attendance in either was a significant predictor. Thus, these results do not support Dochy et al.'s [40] and Seidel et al.'s [42] findings. Interestingly, the strongest predictor (stronger than reading literature on the topic) was the instructors' attitude toward the instructional approach. What is more, their attitudes were rather heterogeneous: The instructors favored different instructional approaches, with no approach being universally preferred.

4.2 Limitations and further research

Regarding implications for practice, it is important to keep in mind that while we captured performance in the analysis of practice, we did not capture whether this analysis had an impact on student teachers' classroom practice. With the data and design of the current study, we cannot draw inferences on this relation. However, there is initial empirical evidence that the development of analysis skills has a positive impact on classroom practice [63, 64].

Further, in our measurement tool, we operationalized selective attention as the number of comments students gave on the classroom vignettes. This conceptualization makes it difficult to compare the data with further studies in the field, such as those on professional vision [65]. With our performance-based operationalization, differences in the number of selected situations may be interpreted as differences in the ability to notice critical situations or as differences in motivation or attitude. We tried to address this limitation by including related variables (willingness for effort, readiness for reflection), but this did not improve the model or influence correlations between latent variables.

Our results underscore the significance of Baecher et al.’s [43] claim that more attention should be paid to the role of instructors. However, it remains unclear how the instructors’ attitudes about learning scenarios affect the students’ performance in applying the analysis of practice. Further insight and research are needed on the instructors’ and students’ sides to clarify the path of effects and interactions: How does the instructors’ attitude influence their teaching performance and how does the teaching performance influence the students’ beliefs and performances? Lastly, the treatment was short compared with, for example, video clubs [21]; this was due to the field study character of the teacher education program. An artificially prolonged treatment that exceeded the courses’ two sessions on classroom management could have had a different effect, but this would have reduced the external validity in our case.

4.3 Conclusion

With the limitations in mind, we draw two major conclusions from our data. First, we refer to the question posed in the title: Do direct instructional and problem-based approaches really produce different effects? Based on our data, the answer is no. In our study, we produced evidence that short-term interventions using either DI or PB approaches yield equivalent effects on students' selective attention, reflective thought, and theory–practice integrations. In particular, students' reflective thought proved to be a stable skill and would therefore need to be addressed with longer interventions (e.g., over one semester).

Second, encouragingly, both instructional approaches can foster students' theory–practice integrations, with the instructors playing a key role. These results contribute to further uncovering approaches to increase theory–practice integration, which has already been labeled a "highly relevant endeavor" [66]. They also underscore the importance of examining the role of instructors in future field-based research. Based on our findings, we would not necessarily recommend forcing instructors to use certain (allegedly effective) learning methods, but rather recommend drawing on those about which they have positive views. We consider this a vital insight because it is relevant for related field studies and for the interpretation of results in laboratory studies (e.g., where researchers function as instructors). Coming from a field study design perspective, these results are applicable to teacher education practice, and thus, highly relevant.

Supporting information

S1 File

(HTML)

Data Availability

https://doi.org/10.4232/1.13468.

Funding Statement

This work was supported by the Ministry of Science, Research and Arts of the state of Baden-Württemberg, Germany. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Zeichner KM, Liston DP. Reflective teaching: an introduction. 2nd ed. New York: Routledge; 2014.
2. Grossman P, Compton C, Igra D, Ronfeldt M, Shahan E, Williamson P. Teaching practice: A cross-professional perspective. Teachers College Record. 2009;111(9):2055–100.
3. Orland-Barak L, Maskit D. Methodologies of Mediation in Professional Learning. Springer International Publishing; 2017. doi: 10.1007/978-3-319-49906-2
4. Grossman P, Pupik Dean CG. Negotiating a common language and shared understanding about core practices: The case of discussion. Teach Teach Educ. 2019;80:157–166. doi: 10.1016/j.tate.2019.01.009
5. Beisiegel M, Mitchell R, Hill HC. The Design of Video-Based Professional Development: An Exploratory Experiment Intended to Identify Effective Features. J Teach Educ. 2018;69(1):69–89. doi: 10.1177/0022487117705096
6. Sherin MG, van Es EA. Effects of Video Club Participation on Teachers' Professional Vision. J Teach Educ. 2009;60(1):20–37. doi: 10.1177/0022487108328155
7. Shulman LS. Those Who Understand: Knowledge Growth in Teaching. Educ Res. 1986;15(2):4–14.
8. Zeichner KM. Reflective teaching and field-based experience in teacher education. Interchange. 1981;12(4):1–22. doi: 10.1007/BF01807805
9. Jay JK, Johnson KL. Capturing complexity: a typology of reflective practice for teacher education. Teaching and Teacher Education. 2002;18(1):73–85.
10. Castro A, Clark K, Jacobs J, Givvin KB. Response to theory & practice question: Using video to support teacher learning. AMTE Connections. 2005;14(3):8–12.
11. Colestock A, Sherin MG. Teachers' Sense-Making Strategies While Watching Video of Mathematics Instruction. Journal of Technology and Teacher Education. 2009;17(1):7–29.
12. Ward JR, McCotter SS. Reflection as a visible outcome for preservice teachers. Teaching and Teacher Education. 2004;20(3):243–57.
13. Dewey J. How we think. Dover Publications; 1910.
14. Gaudin C, Chaliès S. Video viewing in teacher education and professional development: A literature review. Educ Res Rev. 2015;16:41–67. doi: 10.1016/j.edurev.2015.06.001
15. Tripp TR, Rich PJ. The influence of video analysis on the process of teacher change. Teach Teach Educ. 2012;28(5):728–739.
16. Praetorius AK, Klieme E, Herbert B, Pinger P. Generic dimensions of teaching quality: the German framework of Three Basic Dimensions. ZDM Mathematics Education. 2018;50(3):407–26.
17. Rosaen CL, Lundeberg M, Cooper M, Fritzen A, Terpstra M. Noticing Noticing: How Does Investigation of Video Records Change How Teachers Reflect on Their Experiences? Journal of Teacher Education. 2008;59(4):347–60.
18. Plöger W, Krepf M, Scholl D, Seifert A. Looking in the Heads of Experienced Teachers: Do they use the Wide Range of Principles of Effective Teaching when Analysing Lessons? AJTE. 2019;21–35.
19. van den Bogert N, van Bruggen J, Kostons D, Jochems W. First steps into understanding teachers' visual perception of classroom events. Teaching and Teacher Education. 2014;37:208–16.
20. Jacobs VR, Lamb LLC, Philipp RA. Professional noticing of children's mathematical thinking. Journal for Research in Mathematics Education. 2010;41(2):169–202.
21. van Es EA, Sherin MG. Mathematics teachers' "learning to notice" in the context of a video club. Teach Teach Educ. 2008;24(2):244–276. doi: 10.1016/j.tate.2006.11.005
  • 22.Watkins J, Portsmore M. Designing for Framing in Online Teacher Education: Supporting Teachers’ Attending to Student Thinking in Video Discussions of Classroom Engineering. Journal of Teacher Education. 2021;002248712110565. [Google Scholar]
  • 23.Gold B, Pfirrmann C, Holodynski M. Promoting Professional Vision of Classroom Management Through Different Analytic Perspectives in Video-Based Learning Environments. Journal of Teacher Education. 2021;72(4):431–47. [Google Scholar]
  • 24.Marsh B, Mitchell N. The role of video in teacher professional development. Teacher Development. 2014;18(3):403–417. doi: 10.1080/13664530.2014.938106 [DOI] [Google Scholar]
  • 25.Barth VL, Piwowar V, Kumschick IR, Ophardt D, Thiel F. The impact of direct instruction in a problem-based learning setting. Effects of a video-based training program to foster preservice teachers’ professional vision of critical incidents in the classroom. Int J Educ Res. 2019;95:1–12. doi: 10.1016/j.ijer.2019.03.002 [DOI] [Google Scholar]
  • 26.Day C. Reflection: A necessary but not sufficient condition for professional development. Br Educ Res J. 1993;19(1):83–93. [Google Scholar]
  • 27.Korthagen FAJ. The Relationship Between Theory and Practice in Teacher Education. In: Peterson P, Baker E, McGaw B, editors. International Encyclopedia of Education. Elsevier; 2010. p. 669–75. [Google Scholar]
  • 28.Stokking K, Leenders F, Jong J, van Tartwijk J. From student to teacher: reducing practice shock and early dropout in the teaching profession. European Journal of Teacher Education. 2003;26(3):329–50. [Google Scholar]
  • 29.Marsh B, Mitchell N, Adamczyk P. Interactive video technology: Enhancing professional learning in initial teacher education. Comput Educ. 2010;54(3):742–748. doi: 10.1016/j.compedu.2009.09.011 [DOI] [Google Scholar]
  • 30.Hatch T, Grossman P. Learning to Look beyond the Boundaries of Representation: Using Technology to Examine Teaching (Overview for a Digital Exhibition: Learning from the Practice of Teaching). J Teach Educ. 2009;60(1):70–85. [Google Scholar]
  • 31.Borko H, Jacobs J, Eiteljorg E, Pittman M. Video as a tool for fostering productive discussions in mathematics professional development. Teach Teach Educ. 2008;24(2):417–436. doi: 10.1016/j.tate.2006.11.012 [DOI] [Google Scholar]
  • 32.Fogarty R. Problem-based learning and other curriculum models for the multiple intelligences classroom. Arlington Heights (IL): IRI/Skylight Training and Publishing; 1997. [Google Scholar]
  • 33.Duch BJ, Groh SE, Allen DE. Why Problem-Based Learning? A Case Study of Institutional Change in Undergraduate Education. In: The power of problem-based learning: A practical “how to” for teaching undergraduate courses in any discipline. Sterling (VA): Stylus Pub; 2001. p. 3–11. [Google Scholar]
  • 34.Savery JR. Overview of Problem-Based Learning: Definitions and Distinctions. Interdiscip J Probl Based Learn. 2006;1(1):9–20. doi: 10.7771/1541-5015.1002 [DOI] [Google Scholar]
  • 35.Hmelo-Silver CE. Problem-Based Learning: What and How Do Students Learn? Educ Psychol Rev. 2004;16(3):235–266. doi: 10.1023/B:EDPR.0000034022.16470.f3 [DOI] [Google Scholar]
  • 36.Stockard J, Wood TW, Coughlin C, Rasplica Khoury C. The Effectiveness of Direct Instruction Curricula: A Meta-Analysis of a Half Century of Research. Review of Educational Research. 2018;88(4):479–507. doi: 10.3102/0034654317751919 [DOI] [Google Scholar]
  • 37.Atkinson RK, Derry SJ, Renkl A, Wortham D. Learning from Examples: Instructional Principles from the Worked Examples Research. Rev Educ Res. 2000;70(2):181–214. doi: 10.3102/00346543070002181 [DOI] [Google Scholar]
  • 38.Syring M, Kleinknecht M, Bohl T, Kuntze S, Rehm M, Schneider J. How problem-based or direct instructional case-based learning environments influence pre-service teachers’ cognitive load, motivation and emotions: A quasi-experimental intervention study in teacher education. Journal of Education and Human Development. 2016;4(4):115–29. [Google Scholar]
  • 39.Clark RE, Kirschner PA, Sweller J. Putting Students on the Path to Learning: The Case for Fully Guided Instruction. American Educator. 2012;36(1):6–11. [Google Scholar]
  • 40.Dochy F, Segers M, van den Bossche P, Gijbels D. Effects of problem-based learning: A meta-analysis. Learn Instr. 2003;13:533–568. [Google Scholar]
  • 41.Demirel M, Dağyar M. Effects of Problem-Based Learning on Attitude: A Meta-analysis Study. Eurasia Journal of Mathematics, Science & Technology Education. 2016;12(8). doi: 10.12973/eurasia.2016.1293a [DOI] [Google Scholar]
  • 42.Seidel T, Blomberg G, Renkl A. Instructional strategies for using video in teacher education. Teach Teach Educ. 2013;34:56–65. [Google Scholar]
  • 43.Baecher L, Kung SC, Ward SL, Kern K. Facilitating Video Analysis for Teacher Development: A Systematic Review of the Research. Journal of Technology and Teacher Education. 2018;26(2):185–216. [Google Scholar]
  • 44.Gröschner A, Seidel T, Pehmer A-K, Kiemer K. Facilitating collaborative teacher learning: The role of “mindfulness” in video-based teacher professional development programs. Gruppendynamik Und Organisationsberatung. 2014;45(3):273–290. doi: 10.1007/s11612-014-0248-0 [DOI] [Google Scholar]
  • 45.Weber KE, Gold B, Prilop CN, Kleinknecht M. Promoting pre-service teachers’ professional vision of classroom management during practical school training: Effects of a structured online- and video-based self-reflection and feedback intervention. Teaching and Teacher Education. 2018;76:39–49. [Google Scholar]
  • 46.Courtland MC, Leslie L. Beliefs and Practices of Three Literacy Instructors in Elementary Teacher Education. Alberta Journal of Educational Research. 2010;56(1):19–30. [Google Scholar]
  • 47.Hallett F. Do we practice what we preach? An examination of the pedagogical beliefs of teacher educators. Teaching in Higher Education. 2010;15(4):435–448. [Google Scholar]
  • 48.Rosenthal R, Fode KL. The Effect of Experimenter Bias on the Performance of the Albino Rat. Behav Sci. 1963;8(3):183. [Google Scholar]
  • 49.Quintana DS, Williams DR. Bayesian alternatives for common null-hypothesis significance tests in psychiatry: A non-technical guide using JASP. BMC Psychiatry. 2018;18(1):178. doi: 10.1186/s12888-018-1761-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Morey RD, Rouder JN. BayesFactor: Computation of bayes factors for common designs. Version 0.9.12–4.2 [software]. 2018. Available from: https://CRAN.R-project.org/package=BayesFactor [Google Scholar]
  • 51.Schneider J, Bohl T, Kleinknecht M, Rehm M, Kuntze S, Syring M. Unterricht analysieren und reflektieren mit unterschiedlichen Fallmedien: Ist Video wirklich besser als Text? Unterrichtswissenschaft. 2016;44(4):474–91. [Google Scholar]
  • 52.Evertson CM, editor. Handbook of classroom management: Research practice and contemporary issues. Mahwah (NJ): Lawrence Erlbaum Associates; 2006. [Google Scholar]
  • 53.Kounin JS. Discipline and group management in classrooms. Holt, Rinehart and Winston; 1970. [Google Scholar]
  • 54.Mayr J. Klassen stimmig führen. Pädagogik. 2009;61(2):34–37. German. [Google Scholar]
  • 55.Seidel T, Stürmer K. Modeling and Measuring the Structure of Professional Vision in Preservice Teachers. Am Educ Res J. 2014;51(4):739–771. [Google Scholar]
  • 56.Praetorius AK, Lenske G, Helmke A. Observer ratings of instructional quality: Do they fulfill what they promise? Learning and Instruction. 2012;22(6):387–400. [Google Scholar]
  • 57.McDonald RP. Test theory: A unified treatment. L. Erlbaum Associates; 1999. [Google Scholar]
  • 58.Wilczynski SM. Treatment Feasibility and Social Validity. In: A Practical Guide to Finding Treatments That Work for People with Autism. Elsevier; 2017. p. 47–57. doi: 10.1016/B978-0-12-809480-8.00008-X [DOI] [Google Scholar]
  • 59.Ronis DL, Yates JF, Kirscht JP. Attitudes, decisions, and habits as determinants of repeated behavior. Attitude structure and function. 1989;213:39. [Google Scholar]
  • 60.van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. J Stat Softw. 2011;45(3). doi: 10.18637/jss.v045.i03 [DOI] [Google Scholar]
  • 61.Snijders TAB, Bosker RJ. Multilevel analysis: An introduction to basic and advanced multilevel modeling. Washington (DC): Sage; 2012. [Google Scholar]
  • 62.Hoijtink H, Gu X, Mulder J, Rosseel Y. Computing Bayes factors from data with missing values. Psychol Methods. 2019;24(2):253–268. doi: 10.1037/met0000187 [DOI] [PubMed] [Google Scholar]
  • 63.Sun J, van Es EA. An Exploratory Study of the Influence That Analyzing Teaching Has on Preservice Teachers’ Classroom Practice. Journal of Teacher Education. 2015;66(3):201–214. doi: 10.1177/0022487115574103 [DOI] [Google Scholar]
  • 64.van Es EA, Hand V, Mercado J. Making Visible the Relationship Between Teachers’ Noticing for Equity and Equitable Teaching Practice. In: Schack EO, Fisher MH, Wilhelm JA, editors. Teacher Noticing: Bridging and Broadening Perspectives, Contexts, and Frameworks. Research in Mathematics Education. Springer International Publishing; 2017. p. 251–270. doi: 10.1007/978-3-319-46753-5_15 [DOI] [Google Scholar]
  • 65.Sherin MG. The development of teachers’ professional vision in video clubs. In: Goldman R, editor. Video research in the learning sciences. Lawrence Erlbaum Associates; 2007. p. 383–395. [Google Scholar]
  • 66.Brouwer N, Korthagen FAJ. Can Teacher Education Make a Difference? Am Educ Res J. 2005;42(1):153–224. doi: 10.3102/00028312042001153 [DOI] [Google Scholar]

Decision Letter 0

Micah B Goldwater

21 Dec 2021

PONE-D-21-30051
Using authentic representations of practice in teacher education: Do direct instructional and problem-based approaches really produce different effects?
PLOS ONE

Dear Dr. Schneider,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I agree with the reviewers' view that this was a well-conducted study with an impressively sized sample, but that there are also several places where the work can be improved. Please carefully consider and address the two reviewers' comments below in your next submission; I do not repeat them here. I note that Reviewer 1 put their comments in an attached doc; very little is directly in the letter. In addition, I will point out that even though you included a link to the data set in your submission information, both reviewers indicated you did not include access to the raw data. So, in the body of the manuscript, please make it quite clear how to access the relevant data.

Please submit your revised manuscript by Feb 04 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Micah B. Goldwater, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please consider changing the title so as to meet our title format requirement (https://journals.plos.org/plosone/s/submission-guidelines). In particular, the title should be "Specific, descriptive, concise, and comprehensible to readers outside the field" and in this case it is not informative and specific about your study's scope and methodology

3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If the need for consent was waived by the ethics committee, please include this information

4. Peer review at PLOS ONE is not double-blinded (https://journals.plos.org/plosone/s/editorial-and-peer-review-process). For this reason, authors should include in the revised manuscript all the information removed for blind review.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for letting me review this manuscript.

I congratulate you on a sound study with a large sample size. Yet I would like to see the manuscript developed further. It would improve greatly through more precise language.

I also tried to reproduce your results but I was missing the dataset "delete.Rdata". I liked that you provided a markdown file with all your code.

Many more comments attached :)

Greetings

Reviewer #2: The authors addressed an interesting topic of comparing students’ skill development of selective attention, reflective thought and theory-practice integration between two different learning approaches – Direct Instruction (DI) and Problem-Based Learning (PBL). I appreciate your effort in this study. However, there are a few areas that I would like the authors to consider before publishing the paper.

1. The authors may need to restructure the manuscript and add a literature review on PBL and DI in teacher education, including, for example, the definition of PBL. What does other research already know about PBL and DI? Why are the skills of selective attention, reflective thought, and theory-practice integration important to students in teacher education? Are they difficult to develop? Why do the authors anticipate that PBL or DI could help students develop those three skills? Moreover, some contents in the Method section should be moved to the literature review section. For instance, the first part of 3.4.1 (line 403 to 420) was the literature review about “Selective Attention”. Similarly, the first paragraphs of 3.4.2, 3.4.3, and 3.4.4 should move to the literature review section.

2. There are no research questions, only hypotheses, in this study. However, the authors might need to provide the rationale or evidence for the predictions based on previous research.

3. The sentence in Line 314 is not clear.

4. LINE 281: What is the reason to redesign sessions 6 and 7, rather than other sessions?

5. Line 343: It is not clear whether students analysed the situations in groups or individually.

6. In Line 352, the authors mentioned that “To guide the analysis, students received key questions that targeted the analysis of practice steps”. This means students received guidance on the analysis procedure, step by step. But one of the essential characteristics of PBL is that “The problem simulations used in problem-based learning must be ill-structured and allow for free inquiry” (p. 13, Savery, J.R., 2006). It seems the treatment in the PBL group was not ill-structured and did not allow for free inquiry.

Savery, J. R. (2015). Overview of problem-based learning: Definitions and distinctions. Essential readings in problem-based learning: Exploring and extending the legacy of Howard S. Barrows, 9(2), 5-15.

7. Line 379: I am not sure where the research questions of this study are.

8. In section 4.1, I am keen to know why the number of analysed situations decreased in both groups.

9. In section 4.2, the authors mentioned that “Students’ reflective thought (as measured by realized inquiry steps in the analyses) was already well developed before they entered the treatment sessions”. If this is the case, why did the authors measure their skills of reflective thought? The authors already knew students had this skill before the intervention.

10. Line 607: The authors mentioned that “we conclude that the effect on students’ reflective thought is equivalent between learning approaches”. The authors had explained that because students had already developed the skill of reflective thought before the interventions, there was no significant difference between the pre- and post-tests. Then the authors conclude that the effect on students’ reflective thought is equivalent between learning approaches. Please indicate what evidence supports this conclusion.

11. There is no discussion in the Discussion section.

I hope the authors find these comments useful.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Review_PlosOne_December2021.docx

PLoS One. 2022 Sep 2;17(9):e0273988. doi: 10.1371/journal.pone.0273988.r002

Author response to Decision Letter 0


30 May 2022

Reviewer #1

I also tried to reproduce your results but I was missing the data set "delete.Rdata". I liked that you provided a markdown file with all your code.

Our Response

Thank you for bringing this problem to our attention. We had unnecessarily blinded the document with the analyses. The links to both data sets are now in the RMarkdown file. We also put the “delete.Rdata” data set on GitHub in the “data_public” folder and renamed it to “ts.Rdata”. Instructions for downloading the data can be found under the heading “Import data” in the RMarkdown file.
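
For readers who want to follow along, the import step can be sketched in R as follows; the repository URL is a placeholder, not the project’s actual address, which is given in the RMarkdown file.

```r
# Minimal sketch of the download step described above. The URL below is a
# placeholder; substitute the real path from the "Import data" section of
# the RMarkdown file.
url  <- "https://raw.githubusercontent.com/USER/REPO/main/data_public/ts.Rdata"
file <- tempfile(fileext = ".Rdata")
download.file(url, destfile = file, mode = "wb")
load(file)   # restores the object(s) stored in ts.Rdata into the workspace
ls()         # inspect which objects were loaded
```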

I think the summary of the reviews could be more concise, more to the point. There is no advantage of citing that researchers are "reinventing the wheel" (ll125f).

Our Response

We agree that the literature review would benefit from being restructured and rewritten for clarity. Therefore, we restructured and rewrote the introductory section, synthesizing the results of the reviews and tailoring the literature review to the research questions.

We have deleted the statement on researchers "reinventing the wheel".

You claim to compare Problem Based Learning and Direct Instruction and your description of the methods you used fits these terms. However, in my particular understanding the definition of DI from Kirschner (2006) that you bring, would only define a “darstellende Stoffvermittlung”. In the current agreement of educational science, Direct Instruction names a specific instruction method that includes multiple phases.

Our Response

We agree that the definition by Kirschner (2006) is too narrow for our conceptualization of direct instruction. We changed this statement to:

“In contrast, DI describes a teacher-centered approach in which phases of modeling are typically followed by phases of guided and individual practice (Stockard et al., 2018).”

The hypotheses are clear and could be tackled with this data. It would help the understanding of the results if you already made clear here that you always test the sub-hypotheses in combination (and give a reason for it).

Our Response

We now included a statement below the description of hypotheses: “We tested the hypotheses of the two predictors (instructional approaches, instructors’ attitudes) within each dependent variable simultaneously to increase rigor by making the predictions as precise as possible.”
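
To illustrate what a simultaneous test of the two predictors can look like, here is a minimal R sketch using the BayesFactor package with simulated data and invented variable names; it is not the study’s actual analysis code.

```r
library(BayesFactor)
set.seed(5)
d <- data.frame(
  tpi_change = rnorm(600),                             # change in theory-practice integration
  condition  = factor(rep(c("DI", "PB"), each = 300)), # instructional approach
  attitude   = rnorm(600)                              # instructor attitude (standardized)
)
bf_full <- lmBF(tpi_change ~ condition + attitude, data = d)  # both predictors vs intercept-only
bf_att  <- lmBF(tpi_change ~ attitude, data = d)              # attitude only vs intercept-only
bf_full / bf_att   # what the condition adds beyond the attitude covariate
```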

You use the number of attended sessions, the number of articles read for these sessions and instructor’s attitude towards the instruction method as proxies/control variables. However, it remains unclear, what these proxies are really measuring. A more thorough discussion would be indicated.

Our Response

We agree that a more thorough discussion of these measures would be indicated. Therefore, we added paragraphs on the social desirability and interpretation of these measures in the sections 2.4.4, 2.4.5 and 2.4.6.

From a pedagogical point of view I do not really find the task described in lines 329-331 as promoting learning. A good task (Arbeitsauftrag) should be structured more and be more elaborate.

Our Response

We explicitly refrained from strongly structuring the task, especially against the background of the PB instruction. The instructors implemented the task in the sense of their instructional approach.

I wondered whether each instructor had courses in both conditions. Please specify.

Our Response

We agree that this information was missing in the manuscript. We now added a sentence to the “Design” section:

“Each instructor taught DI and PB courses; the conditions were balanced within the instructors (teaching the same number of both conditions, except when teaching an odd number of courses).”

In section 3.4.1 there appeared again a short literature overview. I would shift this to the introduction above.

Our Response

We agree, thank you for bringing this to our attention. These contents have now been moved to the literature review section and integrated accordingly.

In section 3.4.2 I wondered why you tested for unidimensionality of the vignettes? Do you mean the raters score of reflective thought per vignette? Why did you only test for unidimensionality of reflective thought and not of selective attention as well?

Our Response

We clarified this section by changing one sentence to “The raters’ scores were tested for one-dimensionality per vignette”. When coding qualitative data, it is useful to check for dimensionality. One-dimensionality may indicate that the coding does not covary with other factors (e.g., the part of the vignette to which the analysis being coded refers). Selective attention was not tested for one-dimensionality as the score comprises the sum (count) of analyses regarding classroom management. Therefore, a calculation of the dimensionality is not possible. Theory–practice integration was not tested for one-dimensionality for the same reason.
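
As an illustration, a one-dimensionality check of this kind could be run with the psych package; the sketch below uses simulated scores and is not the code used in the study.

```r
library(psych)
set.seed(1)
scores <- data.frame(replicate(4, rnorm(150)))  # stand-in for raters' per-vignette codings
fa.parallel(scores, fa = "fa")   # suggests how many factors the data support
fit <- fa(scores, nfactors = 1)  # fit a single-factor solution
fit$loadings                     # uniformly high loadings are consistent with one dimension
```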

When you report interrater reliability, please also specify how many cases have been rated by two (or even three) raters. Please also state what you did in cases in which the raters had different scores.

Our Response

We agree that this is essential information for assessing interrater reliability. We have updated the sentence accordingly:

“Inter-rater reliabilities for all codings were computed based on a randomly selected 20% of the approximately 7,600 comments written by the participants in the pretest and posttest. Cohen’s Kappa scores of the two trained raters were satisfactory (κ = .64–.77), and disagreements were resolved through discussion.”

When reading the sentence in ll. 395, it was unclear to me what you meant by it. After reading part 3.5, I understood that you aimed at matching similar vignettes to the pre- and posttest. Please clarify.

Our Response

We concede that the description might have been unclear. We restructured this section and added the sentence: “We investigated the extent to which pairs of similar vignettes could be found from a classroom video, each of which was then split between the pretest and the posttest.”

Please make clearer, that you test the two sub-hypotheses of each dependent variable in combination. Currently, the indices of the hypotheses do not match and it is difficult to match the hypotheses and results. I would number the three main hypotheses with numbers from 1-3 and the different cases that you test (hypothesized direction, opposite direction, null hypothesis, unrestricted hypothesis) with letters from a-d for example.

Our Response

We agree that the consistency can be increased by numbering the hypotheses and sub-hypotheses. Numbers and indices of hypotheses in the section “Research Questions and Hypotheses” now match those from the “Results” section. Further, we numbered hypotheses on selective attention as H1 (H11, H12, H10, H1u), hypotheses on reflective thought as H2 (H21, H22, H23, H20, H2u) and hypotheses on theory-practice integration as H3 (H31, H32, H33, H30, H3u). The hypotheses we formulated in the section “Research Questions and Hypotheses” are each assigned the index 1 (H11, H21, H31). Also, we now mention that the sub-hypotheses are tested simultaneously in the sections “Research Questions and Hypotheses” as well as in “Statistical Analyses”.
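
As a hedged illustration of how such sub-hypotheses can be compared, the BayesFactor package [50] allows directional, opposite-direction, and null hypotheses to be tested against each other with default priors; the data below are simulated, not the study’s.

```r
library(BayesFactor)
set.seed(2)
di <- rnorm(300, mean = 0.1)   # simulated outcome, DI condition
pb <- rnorm(300, mean = 0.0)   # simulated outcome, PB condition
bf <- ttestBF(x = di, y = pb, nullInterval = c(0, Inf))
bf             # bf[1]: effect in (0, Inf) vs the null; bf[2]: the complement vs the null
bf[1] / bf[2]  # directional hypothesis (e.g., H11) vs its opposite (H12)
```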

Figure 2: Would it be possible to depict the raw data instead of the 12 data points here? It is not really interesting how the theory-practice integration differs by the vignettes, but more how it differs by DI vs PBL and how strongly students vary in that. Raw data per student would be very interesting.

Our Response

We assume you refer to Fig 3: A similar graph with raw data would be an interesting option but would unfortunately be very cluttered. Therefore, we calculated the change scores for each person on the two variables and created a two-dimensional density plot of change, differentiated by treatment group. We updated the figure and caption accordingly.
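
For illustration, a figure of this kind can be produced with ggplot2 along the following lines; the data frame and column names are invented stand-ins, not the study’s data.

```r
library(ggplot2)
set.seed(4)
chg <- data.frame(
  condition     = rep(c("DI", "PB"), each = 300),
  d_reflection  = rnorm(600),             # posttest minus pretest, per student
  d_integration = rnorm(600, mean = 0.5)
)
ggplot(chg, aes(x = d_reflection, y = d_integration)) +
  geom_density_2d() +                     # two-dimensional density of change scores
  facet_wrap(~ condition) +               # one panel per treatment group
  labs(x = "Change in reflective thought",
       y = "Change in theory-practice integration")
```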

You state that you included the students’ willingness for effort and their attitude on readiness for reflection, but I did not see any results regarding this. Is this reported in the supplementary material?

Our Response

In the supplementary material we only included analyses directly mirrored in the manuscript. The inclusion of further exploratory analyses would overload the document.

You state “Students’ theory–practice integrations when analyzing classroom situations greatly increased from pretest to posttest.” Beforehand you ask the reader to be careful when comparing the scores between pre- and posttest, as the vignettes differ. I thus find this conclusion a bit far-fetched. It could just be due to the different vignettes.

Our Response

We agree that this result should be interpreted with greater caution. We rewrote the sentence as follows: “Students' theory-practice integration scores in analyzing classroom situations improved greatly from the pretest to the posttest. However, this result should be taken with a grain of salt because the pretest and posttest are not equivalent, even though we matched them with great effort.”

I very much liked that the supplementary material contains a knitted R Markdown file containing all the code and results. I wondered why the authors put the data on the GESIS server in a proprietary .sav format (maybe add a .txt or .csv file as well). Please also upload the data “delete.Rdata”, as this is vital to reproduce your code.

Our Response

We agree that a proprietary file format (such as .sav) is less accessible than open file formats (such as .Rdata). At the recommendation of GESIS, we uploaded a SAV file (and not a CSV file) to the repository. As opposed to CSV, SAV files include the item labels and levels – but so does the .Rdata file format. This is why we also uploaded the data sets as .Rdata. The “delete.Rdata” data set is now on GitHub in the “data_public” folder and we renamed it to “ts.Rdata”. A guide on how to download the data can be found under the heading “Import data” in the RMarkdown file. The links to both relevant data sets are now in the RMarkdown file.

I suggest a sound proofreading focusing on the clarity and precision of the language. Also, please make use of the past tense, and use it consistently. You may also consider asking the writing experts at your institution. In my experience, this always greatly improves a manuscript.

Our Response

We have thoroughly proofread the manuscript and revised it accordingly. For this we also involved an external writing expert.

Use key nouns consistently: for example, the manuscript uses “DI and PB methods”/“PB instruction”/“PB learning”/“PBL” → choose one. This also happens for “subject”/“student”/“teacher student” and “count”/“number”/“quantity” and “second semester”/“2nd semester”/“teachers grade level (second)”. Please be very precise and consistent in the words you use.

Our Response

We agree that consistency will foster the clarity of the manuscript. We decided on “DI/PB approach”, “student” (however, we kept “student teachers” when it contributed to understanding), “number of selected situations”, and “second semester”, and updated the manuscript accordingly.

I would not number the overview of current research as “2.”, but put it as sub-sections of the Introduction.

Our Response

We updated the manuscript accordingly.

Reviewer #2

The authors may need to restructure the manuscript and add a literature review on PBL and DI in teacher education. For example, the definition of PBL. What does other research already know about PBL and DI? Why are the skills of selective attention, reflective thought and theory-practice integration important to students in teacher education? Are they difficult to be developed? Why do the authors anticipate PBL or DI could help students develop those three skills.

Our Response

We have rewritten the introductory section and added remarks on the relevance of reflection and related constructs in teacher education. For each of the sections on selective attention, reflective thought, and theory-practice integration, we elaborated their relevance to teacher professionalism and its development (section “Selective attention, reflective thought and theory-practice integration with authentic representations of practice”). Further, you will now find theoretical and empirical elaborations on the efficacy of PBL and DI for developing these skills in the “Problem-based and direct instruction” section.

Moreover, some contents in the Method section should be moved to the literature review session. For instance, the first part of 3.4.1 (line 403 to 420) was the literature review about “Selective Attention”. Similarly, the first paragraph of 3.4.2, 3.4.3 and 3.4.4 should move to the literature review section.

Our Response

We agree, thank you for bringing this to our attention. These contents have now been moved to the literature review section and integrated accordingly.

There are no research questions, only hypotheses in this study. However, the authors might need to provide the relational or evidence of the predictions based on previous research.

Our Response

We now included research questions as well as hypotheses. Also, we have updated the introductory section to include literature that leads more directly to the predictions of the hypotheses.

The sentence in Line 314 is not clear.

Our Response

We deleted the sentence “In the two sessions, students learned about the classroom management strategies of Kounin (1970), Evertson (2006), and Mayr (2009).” and described the contents of the course in the design section.

LINE 281: What is the reason to redesign sessions 6 and 7, rather than other sessions?

Our Response

We focused on redesigning sessions where classroom management was on the curriculum. This was the case for sessions six and seven. We updated the sentences “For the interventions, we redesigned two of the courses’ weekly 90-minute sessions (6th and 7th of 15 sessions) and an assignment between these two sessions using authentic representations of practice. The topic of these two sessions and the assignment was classroom management.” to “For the interventions, we redesigned part of the courses (two sessions and an inter-session assignment) using authentic representations of practice. For this purpose, we focused on sessions in which classroom management was on the curriculum. These were sessions six and seven of 15 sessions.”

Line 343: It is not clear that students analysed the situations in groups or individually.

Our Response

Thank you for bringing this to our attention. We changed the sentence to “After this, students individually analyzed some more situations”

In Line 352, the authors mentioned that “To guide the analysis, students received key questions that targeted the analysis of practice steps”. It means students were received the guidance of the analysis procedure, step by step. Based on one of the essential characteristics of PBL, “ The problem simulations used in problem-based learning must be ill-structured and allow for free inquiry (p.13, Savery, J.R., 2006)”. It seems the treatment in the PBL group was not ill-structured and didn’t allow for free inquiry.

Our Response

Our description in the manuscript may have been somewhat unclear: The problem is ill-structured, as students determine what they consider to be a problem (selection of situations) and what is and is not part of the problem. The students were free to inquire into solutions to these situations. The key questions merely served as a guide in case students needed support with their inquiry. The questions did not serve as step-by-step instructions and were not introduced as such. We have now clarified this in the manuscript.

Line 379: Not sure where are the research questions of this study?

Our Response

We now included research questions.

In section 4.1, I am keen to know why the number of analysed situations decreased in both groups.

Our Response

This is indeed an interesting phenomenon that needs further investigation. However, our data unfortunately do not allow us to answer this question conclusively. We now offer several interpretations of this in the discussion section: “Furthermore, we found that the number of selected situations decreased from the pretest to the posttest in both conditions. We cannot conclusively elucidate this phenomenon with our data, but we offer some tentative interpretations. A first intuitive explanation is that the analyses became fewer because students wrote longer analyses. However, we did not observe an increase in the length of the analyses in our data. A second explanation could be that the vignettes to be analyzed in the pretest and the posttest offer different numbers of situations that can be analyzed. Although we matched the vignettes to the pretest and posttest with great effort, we cannot exclude this option. A third explanatory approach relates to test fatigue. The students may have put more effort into the pretest because analyzing classroom videos was a novel format for them (novelty effect). After they went through the pretest and analyzed several instructional videos again in the treatment sessions, the novelty effect may have worn off and their willingness to reflect may have decreased in the posttest. A slight decrease in scales of readiness to reflect was indeed observed in our data.”

In section 4.2, the authors mentioned that “Students’ reflective thought (as measured by realized inquiry steps in the analyses) was already well developed before they entered the treatment sessions”. If this is the case, why did the authors measure their skills of reflective thoughts? The authors already know students have this skill before the intervention.

Our Response

The time period that the students were given to complete the pretest extended until directly before the first treatment session. Accordingly, we were unfortunately not able to analyze the data before the treatment.

Line 607: The authors mentioned that “ we conclude that the effect on students’ reflective thought is equivalent between learning approaches”. The authors had explained because students had already developed the skill of reflective thoughts before the interventions; therefore, there was no significant difference between the pre-and post-tests. Then the authors conclude that the effect on students’ reflective thoughts is equivalent between learning approaches. Please indicate what evidence to make this conclusion.

Our Response

The conclusion “that the effect on students’ reflective thought is equivalent between learning approaches” is not related to our assumption that students had already developed the skill of reflective thoughts before the interventions. We derived this conclusion directly from our data: The inferential statistical comparison of the formulated hypotheses generated evidence for the null hypothesis. The null hypothesis states that there is no difference between the two groups. We therefore generated evidence that the effect is equivalent, which is possible with Bayes Factor hypothesis testing.
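
For illustration, quantifying evidence for the null with a default-prior Bayes factor looks as follows in R; the data are simulated with genuinely equal group effects and are not the study’s data.

```r
library(BayesFactor)
set.seed(3)
di <- rnorm(300)                 # simulated outcome, DI condition
pb <- rnorm(300)                 # simulated outcome, PB condition
bf10 <- ttestBF(x = di, y = pb)  # alternative (a group difference) vs null
1 / bf10                         # null vs alternative: values > 1 favor equivalence
```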

There is no discussion in the Discussion section.

Our Response

We added a discussion on the decrease of the number of selected situations. Further, we added a paragraph on the relation of the analysis of practice and classroom practice itself.

Attachment

Submitted filename: Response to Reviewers.pdf

Decision Letter 1

Micah B Goldwater

6 Jul 2022

PONE-D-21-30051R1
Using authentic representations of practice in teacher education: Do direct instructional and problem-based approaches really produce different effects?
PLOS ONE

Dear Dr. Schneider,

Thank you for submitting your manuscript to PLOS ONE. The reviewer and I agree that you have improved the manuscript from its first submission, but it still does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The reviewer has gone above and beyond what reviewers typically do in how carefully they considered your analyses and manuscript. I hope you are as grateful as I am for such an effort to improve your paper. Please carefully address each of their suggestions in your revision, if you decide to submit one.

Please submit your revised manuscript by Aug 20 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Micah B. Goldwater, Ph.D

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Hey,

I saw that you integrated all of the comments and added essential information to the manuscript. It has become much better now :)

Thanks also for providing the necessary data to reproduce the analyses. I really appreciate that you make your code and data openly accessible; that's a great example of open science, which I also promote myself!

I could replicate your results in terms of BayesFactors. I did not reproduce your core results, as the multiple imputation did not end in a reasonable amount of time on my computer ;)

But I had a look at the variables with which you constructed the instructor's attitudes, and here I see the major problem:

First, the variable labels are unclear (doz_gef & doz_pass) and even led me to suspect that these variables did not measure the attitude towards problem-based learning and direct instruction. Why did you choose these labels, and which instructional approach do they refer to?

Secondly, the dataset "rating-treatmentcheck.sav" contains the real names of the instructors --> anonymise and give them a code or number.

Thirdly, and this is the major problem: the instructors were not consistent in their ratings about their attitudes. For example, the instructor of seminars 10, 11, and 12 gave the following answers on the attitudes: 3-4, 1-2, 4-4.

Thus, depending on which group s/he taught, s/he had different attitudes toward the instructional approach!

This has many theoretical & practical implications for your research:

-the attitudes were not reliably measured; they varied, based on the seminar that was taught

-If the seminar groups vary in size, this might produce a bias in the multiple imputations (the answers from larger seminars will likely be weighed more?)

-If instructors themselves were not sure about their attitude, what does it mean for the whole study?!

I hope I did not misunderstand something here and am on the wrong track. If so, let me know. But I think you should delve into this issue and compare the instructors' attitudes across treatments. Maybe you can come up with an average per instructor across seminars, but it will be needed to be discussed thoroughly.

Now to the manuscript...

Importantly, the line numbers refer to the document with track changes!

line 698 ff. That is related to the instructors' attitudes again: I am not convinced by the argument that the broad range of attitudes indicates the absence of any bias of the instructors.

Please state the mean attitude per treatment & SDs, and maybe consider a Bayesian t-test for dependent samples to check, whether the attitudes are comparable (after you resolved the issue that they vary across seminars).

I am also not convinced to model these two distinct attitudes as one latent variable. It's not a truly reflective construct, is it? Wouldn't it be much more meaningful to see how attitudes towards DI predict theory-practice integration in the DI, whereas attitudes towards PB predict theory-practice integration in PB-approaches? Or maybe a difference score instead of a sum score? Not yet sure, but please explain in further detail why you modeled the two attitudes as indicators of one underlying general attitude variable.

A second major aspect concerns Figure 3. To me, Fig 3 makes no sense; in the hypotheses you do not test whether theory-practice integration relates to realized inquiry steps, so why plot it? For which research question does it depict meaningful information? I would put instructors' attitude on the x-axis and then add a Figure 4 to plot the change in theory-practice integration (again predicted by instructors' attitude, for example).

Minor aspects and typos:

l. 262: remove full stop before reference

l. 268: the reference seems to be in a smaller font size

ll. 365: As not all hypotheses are exploratory, I would start like that: "One of the hypotheses on reflective thinking (H21) was based on strong assumptions derived from theory. The rest of the hypotheses were labelled as exploratory, since robust research is lacking."

l. 426: according to APA, a sentence cannot start with a number in numeric form. You could circumvent this by saying "In total, 638..."

l. 437: "The study was conducted within our institute's teacher education program" -> you are from 4 different institutes :) I think it does not need to be stated that it was from any institute; just that it was in a regular teacher education program in Germany

l. 451: "(8am to 8pm)" (delete space)

ll. 451: "Each instructor taught DI and PB courses, the conditions" ('d' missing, use semicolon to separate these sentences)

ll. 461: As no 'standard' for priors exist, please replace "standard priors" by "default priors of the BayesFactor package (REF)"

l. 538: Bayes Factors of .5 are very weak/anecdotal evidence. Please put that into perspective

l. 662: give references to Kounin, Evertson, and Mayr or leave out

ll. 859: you state a substantial correlation of reflective thought and theory-practice integration. But did you report this correlation somewhere? In Figure 3 there seems to be no relation.

ll. 886: if your operationalization of selective attention makes it difficult to compare to other literature (as you just counted the comments), then why did you choose it? What could future research improve?

In general, I found it a bit funny that it's single-blind peer review (I can see your names), but the references have been blinded. I guess that's unnecessary.

I know that I expect a lot again, and I am fine if the authors provide good reasons why some things cannot be changed.

I hope my comments contribute to improve this work.

Best regards

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Christian M. Thurn

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Sep 2;17(9):e0273988. doi: 10.1371/journal.pone.0273988.r004

Author response to Decision Letter 1


19 Jul 2022

Reviewer comment:

First, the variable labels are unclear (doz_gef & doz_pass) and even lead to the suspicion that these variables did not measure the attitude towards problem-based learning and direct instruction. Why did you choose these labels, and which instructional approach do they refer to?

Our response:

“doz” is short for “Dozierende”, which means “instructor” in German. As the data set consists mostly of data from students, we wanted to make this clear in the variable label. The suffixes “_gef” and “_pass” refer to the two items measuring their attitudes: perceived “pleasure” and perceived “fit”.

Reviewer comment:

Secondly, the dataset "rating-treatmentcheck.sav" contains the real names of the instructors --> anonymise and give them a code or number.

Our response:

Thank you for this very important remark. We followed your suggested procedure.

Reviewer comment:

Thirdly, and this is the major problem: the instructors were not consistent in their ratings about their attitudes. For example, the instructor of seminars 10, 11, and 12 gave the following answers on the attitudes: 3-4, 1-2, 4-4.

Our response:

We see how disagreement within instructors may seem like a substantial problem. However, the ratings in the example you pointed out come from within one instructor but across different treatments. Hence, we would expect agreement (= similar ratings) within the same type of courses (= same treatment), but not necessarily between courses from different treatments within each instructor. This was the case here and is the case for all instructors. In the Word document “Response to Reviewers” we have attached two figures showing the agreement/disagreement for the two items.
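A minimal sketch of this agreement check, using hypothetical course-level ratings and illustrative column names (not the study's actual variable labels): agreement is only expected within the same treatment, so the spread of ratings should be inspected per instructor-by-treatment cell rather than per instructor overall.

```python
import pandas as pd

# Hypothetical ratings: one row per course, with the instructor's two
# attitude items; names and values are illustrative stand-ins.
ratings = pd.DataFrame({
    "instructor": ["A", "A", "A", "B", "B"],
    "treatment": ["DI", "PB", "DI", "DI", "PB"],
    "pleasure": [3, 1, 4, 2, 4],
    "fit": [4, 2, 4, 2, 3],
})

# Spread within each instructor-by-treatment cell; a small std within a
# cell indicates the expected within-treatment consistency.
print(ratings.groupby(["instructor", "treatment"])[["pleasure", "fit"]]
      .agg(["mean", "std"]))
```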

Reviewer comment:

line 698 ff. That is related to the instructors' attitudes again: I am not convinced by the argument that the broad range of attitudes indicates the absence of any bias of the instructors.

Our response:

Our wording “bias” might have been somewhat unclear. We wanted to illustrate that attitudes varied considerably within the treatments, signifying that no treatment experienced only “disagree” or only “agree” ratings, i.e., that instructors were not one-sided throughout. We therefore changed the sentence to: “This indicates that instructors had divergent but not necessarily one-sided attitudes throughout each treatment; thus, we used the variable as a covariate of the treatment.”
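A minimal sketch of the covariate-versus-moderator distinction discussed here, on simulated stand-in data with illustrative variable names (not the study's codebook): as a covariate, attitude enters the model additively; as a moderator, it enters through an interaction with the treatment.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data; names and effect sizes are purely illustrative.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "treatment": rng.choice(["DI", "PB"], size=200),
    "attitude": rng.normal(3, 1, size=200),
})
df["tpi_change"] = 0.4 * df["attitude"] + rng.normal(0, 1, size=200)

# Covariate: attitude shifts the outcome; the treatment effect is constant.
covariate = smf.ols("tpi_change ~ treatment + attitude", data=df).fit()
# Moderator: the treatment effect is allowed to depend on attitude.
moderator = smf.ols("tpi_change ~ treatment * attitude", data=df).fit()
print(covariate.params, moderator.params, sep="\n\n")
```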

Reviewer comment:

Please state the mean attitude per treatment and the SDs, and maybe consider a Bayesian t-test for dependent samples to check whether the attitudes are comparable (after you have resolved the issue that they vary across seminars).

Our response:

We did not expect instructors to be similar in their attitudes toward the treatments. For this reason, we included the variable as a predictor. The decision to be made was solely between using the variable as a moderator or as a covariate. We added the Bayes factor test and deleted the part on the mediation, as it did not add to the manuscript and might be unclear.
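For illustration, a minimal sketch of such a Bayes factor t-test in Python using pingouin, on simulated stand-in ratings; the manuscript's analyses used the BayesFactor R package, so this is an analogous computation with the same default Cauchy(0, 0.707) prior on effect size, not the authors' actual code.

```python
import numpy as np
import pingouin as pg

# Hypothetical data: one attitude rating per instructor for each condition;
# values are illustrative, not the study's actual ratings.
rng = np.random.default_rng(42)
attitude_di = rng.integers(1, 5, size=9).astype(float)  # direct instruction
attitude_pb = rng.integers(1, 5, size=9).astype(float)  # problem-based

# Paired t-test; pingouin reports BF10 alongside the frequentist results,
# using the default Cauchy(0, 0.707) prior on effect size.
res = pg.ttest(attitude_di, attitude_pb, paired=True, r=0.707)
print(res[["T", "dof", "p-val", "BF10"]])
```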

Reviewer comment:

I am also not convinced by modelling these two distinct attitudes as one latent variable. It's not a truly reflective construct, is it? Wouldn't it be much more meaningful to see how attitudes towards DI predict theory-practice integration in the DI condition, whereas attitudes towards PB predict theory-practice integration in the PB condition? Or maybe a difference score instead of a sum score? I am not yet sure, but please explain in further detail why you modeled the two attitudes as indicators of one underlying general attitude variable.

Our response:

Our description might have been somewhat unclear. First, the attitude toward the treatment was measured via two items, both of which measure how positive the instructor's attitude toward a treatment was. As instructors taught several courses, they indicated their attitude toward the treatment for each course separately. In the data set, the attitude of the instructor toward a specific course was then matched to the students’ data from exactly that course. This way, we achieved exactly what you were suggesting: “attitudes towards DI predict theory-practice integration in the DI, whereas attitudes towards PB predict theory-practice integration in PB-approaches”. To clarify this, we added the following paragraph: “We matched the attitude of the instructor toward a specific course to students’ data from exactly that course. This way we were able to predict student-level data with the respective course-level information.”
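A minimal sketch of this matching step, assuming hypothetical tables and column names: a left join carries each course's instructor attitude onto every student record from that course.

```python
import pandas as pd

# Hypothetical tables; column names are illustrative, not the study's codebook.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "course_id": [10, 10, 11, 11],
    "tpi_post": [2.0, 3.0, 1.0, 4.0],  # theory-practice integrations
})
courses = pd.DataFrame({
    "course_id": [10, 11],
    "treatment": ["DI", "PB"],
    "instructor_attitude": [3.5, 2.0],  # attitude toward that specific course
})

# Left-join the course-level attitude onto each student record, so every
# student carries the attitude their instructor reported for that course.
merged = students.merge(courses, on="course_id", how="left")
print(merged)
```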

Reviewer comment:

A second major aspect concerns Figure 3. To me, Fig 3 makes no sense; in the hypotheses you do not test whether theory-practice integration relates to realized inquiry steps, so why plot it? For which research question does it depict meaningful information? I would put instructors' attitude on the x-axis and then add a Figure 4 to plot the change in theory-practice integration (again predicted by instructors' attitude, for example).

Our response:

We agree that the figure did not directly fit any of the hypotheses. We have therefore implemented your suggestion and plotted the change in theory-practice integration on the y-axis and the instructors' attitudes on the x-axis. We used a 2D density plot because scatter or point plots would be too cluttered. Further, we added density distributions for each of the two variables.
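A minimal sketch of such a figure, with simulated stand-in data and illustrative variable names: seaborn's jointplot with kind="kde" draws the 2D density in the centre plus the marginal density of each variable on the axes, as described above.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Simulated stand-in data; variable names are illustrative only.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "instructor_attitude": rng.normal(3, 0.8, 300),
    "tpi_change": rng.normal(0.5, 1.0, 300),
})

# 2D density plot with marginal density distributions on both axes.
g = sns.jointplot(data=df, x="instructor_attitude", y="tpi_change",
                  kind="kde", fill=True)
g.set_axis_labels("Instructors' attitude toward treatment",
                  "Change in theory-practice integration")
plt.show()
```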

Minor aspects and typos

We corrected all minor aspects and typos according to your suggestions.

Attachment

Submitted filename: Response to Reviewers.pdf

Decision Letter 2

Micah B Goldwater

22 Aug 2022

Using authentic representations of practice in teacher education: Do direct instructional and problem-based approaches really produce different effects?

PONE-D-21-30051R2

Dear Dr. Schneider,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Micah B. Goldwater, Ph.D

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Christian M. Thurn

**********

Acceptance letter

Micah B Goldwater

26 Aug 2022

PONE-D-21-30051R2

Using authentic representations of practice in teacher education: Do direct instructional and problem-based approaches really produce different effects?

Dear Dr. Schneider:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Micah B. Goldwater

Academic Editor

PLOS ONE

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 File

(HTML)

Attachment

Submitted filename: Review_PlosOne_December2021.docx

Attachment

Submitted filename: Response to Reviewers.pdf

Attachment

Submitted filename: Response to Reviewers.pdf

Data Availability Statement

https://doi.org/10.4232/1.13468

