Abstract
Many national reports have called for undergraduate biology education to incorporate research and analytical thinking into the curriculum. In response, interventions have been developed and tested. CREATE (Consider, Read, Elucidate the hypotheses, Analyze and interpret the data, and Think of the next Experiment) is an instructional strategy designed to engage students in learning core concepts and competencies through careful, scaffolded reading of primary literature. CREATE has been implemented successfully by many instructors across diverse institutional contexts and has been shown to help students develop in the affective, cognitive, and epistemological domains, consistent with broader meta-analyses demonstrating the effectiveness of active learning. Nonetheless, some studies on CREATE have reported discrepant results, raising important questions about effectiveness in relation to the fidelity and integrity of implementation. Here, we describe an upper-division genetics course that incorporates a modified version of CREATE. Like the original CREATE instructional strategy, our intervention was designed around established learning principles. Using existing concept inventories and validated survey instruments, we found that our modified CREATE intervention promoted greater affective and cognitive gains in students than three comparison groups. We also found that students in the modified CREATE intervention tended to underpredict their learning and performance, while students in some comparison groups showed the opposite trend. Together, our results contribute to the expanding literature on how and why different implementations of the same active-learning strategy contribute to student outcomes.
INTRODUCTION
In the past few decades, many national reports have called for the transformation of undergraduate biology education. Following broad calls in Science for All Americans (1) and Reinventing Undergraduate Education (2), recommendations specifically for the biological sciences began to emerge: engage students in the excitement of discoveries (3); help students develop core concepts and competencies (4, 5); promote active learning in the classroom and incorporate research into the curriculum (6), especially for students across diverse undergraduate educational contexts (7, 8). These recommendations were echoed and paralleled by The Next Generation Science Standards in K–12 education (9). As a result, many educational interventions have been developed across science, technology, engineering, and mathematics (STEM) disciplines and their efficacies examined (10–13).
One instructional strategy developed in the biological sciences to achieve these recommendations is CREATE (Consider, Read, Elucidate the hypotheses, Analyze and interpret the data, and Think of the next Experiment) (14). CREATE builds upon research on how people learn and incorporates pedagogical tools that facilitate conceptual learning (15–20). Students develop core concepts and competencies in the biological sciences by reading and discussing primary literature and by systematically decoding figures, tables, and narratives in a scaffolded fashion (14, 21–25). CREATE has been successfully implemented by faculty across many institutional contexts (26) and has been shown to promote affective, cognitive, and epistemological development among diverse students at two- and four-year institutions (27–31).
A number of other instructional strategies have been developed to engage students in reading and understanding primary literature in the biological sciences. For example, Research Deconstruction engages students in dissecting standard research presentations given by faculty and has been shown to improve first- and second-year students’ self-efficacy in research skills at a large research university (32). Figure Facts assists students in understanding experimental data presented in papers and in identifying conclusions from these data; students in an upper-division classroom course at a liberal arts college improved in interpreting data and reported decreased frustration when examining primary literature figures (33). Another intervention guides students through a series of three papers by gradually reducing scaffolds related to experimental design, data analysis, and scientific argumentation; students tested on quiz questions targeting the different cognitive processes defined by Bloom’s taxonomy improved longitudinally across multiple upper-division laboratory courses at a large research university (34). Furthermore, Science in the Classroom has curated and annotated a number of papers with the potential to help students read and understand primary literature (35). While many of these instructional strategies have been shown to be efficacious, few have been studied as extensively as CREATE in terms of student outcomes and diversity of institutional contexts.
Despite many successful CREATE implementations, one recent study reported that students showed no difference in cognitive gains in critical thinking between CREATE and a more traditional method of analyzing primary literature (36). There are potential reasons for the lack of efficacy in this instance, such as a partial implementation of CREATE and the amount of active learning present in the comparison group (37). Nonetheless, this study raises questions about the effectiveness of CREATE in relation to the fidelity and integrity of implementation, as well as broader challenges of successfully propagating active learning to diverse educational contexts across STEM disciplines (38–41).
Here, we describe an upper-division genetics course that incorporates a modified CREATE intervention, and we examine its efficacy relative to comparison courses that involved some level of active learning. Specifically, our research question is: Does our modified CREATE intervention improve student cognitive and affective outcomes? We hypothesized that a modified version of an established intervention such as CREATE should remain effective, as long as the intervention is aligned with fundamental principles of how people learn. We examined cognitive outcomes using pre- and post-course concept inventories and compared these results with students’ perceived learning gains; we also examined affective outcomes using validated survey instruments.
METHODS
Study context
This study was conducted at a large, public, doctoral university (highest research activity) with an undergraduate profile that is four-year, full-time, more selective, and higher transfer-in (42), with approval from the Institutional Review Board (IRB) at University of California San Diego and as part of a larger study on CREATE (with IRB approval at The City College of New York). At the study institution, genetics is typically the first upper-division course that students majoring in biological sciences take. The course thus represents a logical point in the curriculum for an educational intervention, especially considering the higher transfer-in percentage.
Modified CREATE intervention
The intervention used CREATE for only part of the course and deviated from the original CREATE instructional strategy in three ways. First, the course included a combination of CREATE-only classes, other interactive classes with lectures involving clicker questions, and mixed classes with both CREATE and clicker questions. Second, only a subset of the CREATE steps was used in the course, specifically those related to understanding research methods, analyzing data, and constructing scientific arguments. Scientific argumentation involves using evidence and reasoning to draw conclusions (43), and engaging in scientific argumentation has been shown to improve students’ skills in reading primary literature (44). Finally, instead of thoroughly reading three or four related papers from the same research group, we selected a larger number of papers from many different research groups and focused on only one or two data figures or tables, along with their associated methods, from each paper. The background knowledge for the papers and fundamental genetics concepts were introduced through pre-lecture reading assignments (with online questions) and in the interactive clicker lectures (Appendix 1). These modifications allowed the CREATE intervention to follow the standard genetics curriculum across multiple sections at the study institution.
Because of these modifications, we were careful to ensure that the intervention was designed around evidence-based course structures already implicit in the original CREATE instructional strategy. First, course concepts in the intervention were interleaved into four major stories with societal implications (Table 1), emphasizing the connections among genes, phenotypes, and evolution. Interleaving mimics the interconnectedness of concepts as they naturally appear in primary literature papers, and it has been shown to improve student learning in both controlled cognitive psychology laboratory settings (45) and natural classroom contexts (46) by providing repeated deliberate practice and enhancing metacognitive discrimination (47–50). Second, in contrast to the traditional lecture-and-practice format, course materials in the intervention were structured according to the 5E learning cycle, in which students explore complex problems before receiving supporting explanations (51). The 5E learning cycle mimics the productive failures that students are likely to encounter in the original CREATE instructional strategy as they explore complex data and language in primary literature papers. Productive failure promotes student learning (52, 53) through activation of existing knowledge, explanation and elaboration of concepts, and organization of these concepts into new knowledge (54–56), all of which are supported by scaffolds in the original CREATE instructional strategy (14).
TABLE 1.
Modified CREATE intervention.
Module | Genetics Topics | Papers
---|---|---
DNA forensics: How can we use genetics to find elephant poachers and reconstruct the family tree of King Tut? | Molecular markers, alleles, polymorphisms, meiosis, Mendelian inheritance, pedigrees, population genetics, probability and statistics | Science (2015) 349: 84–87; Nature (2011) 472: 404–406; JAMA (2010) 303: 638–647
Human diseases: Why do deleterious diseases such as sickle cell anemia continue to persist in populations? | Genes, alleles, mutations, phenotypes, meiosis, non-Mendelian inheritance, selection, gene regulation, pleiotropy, expressivity and penetrance | Br Med J (1954) 1: 290–294; NEJM (1994) 330: 1639–1644; PNAS (2011) 108: 20113–20118
Biodiversity: How do new forms and functions evolve in skeletal structures and coat colors in fish and mice? | Epistasis, gene regulation, linkage, QTL, population genetics, complementation, forward and reverse genetics, necessary vs. sufficient | PNAS (2003) 100: 5268–5273; PLOS Biol (2007) 5: e219; Nat Genet (2006) 38: 107–111; Nature (2004) 428: 717–723
Human genetics: How do complex traits such as human eye and skin color evolve and continue to evolve? | GWAS, polymorphisms, linkage, haplotypes, mutations, selection, gene regulation, epigenetics, correlation vs. causation, probability and statistics | Hum Mol Genet (2009) 18: 9–17; PLOS Genet (2013) 9: e1003372
Each module spans two to three weeks of the academic term and focuses on a central question. Some genetics topics are interleaved across multiple modules. Only one or two data figures or tables, along with their associated methods, are used from each of the papers listed.
QTL = quantitative trait locus; GWAS = genome-wide association study.
Two iterations of the modified CREATE intervention (n = 48 students for each iteration) were implemented, and outcome data (described below) were combined, as they were not statistically different. One iteration of the modified CREATE intervention was recorded and analyzed using Decibel Analysis for Research in Teaching (DART), a machine-learning algorithm that examines classroom sound to identify time spent on single voice (e.g., lecture) vs. multiple voices (e.g., peer discussions) (57). The automated DART outputs were annotated for different classroom activities.
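DART itself is a trained sound classifier described in reference (57); the downstream step of turning annotated DART output into summary statistics (such as percent interactive time) reduces to simple bookkeeping over labeled time segments. The following minimal Python sketch illustrates only that tabulation step; the segment format and labels are hypothetical and do not represent DART’s actual output schema:

```python
# Hypothetical annotated segments for one class session:
# (label, duration in seconds). "multiple" marks multi-voice
# (interactive) time; "single" marks single-voice (lecture) time.
# This is an illustrative tabulation, not the DART algorithm itself.
segments = [("single", 600), ("multiple", 300), ("single", 900),
            ("multiple", 420), ("single", 780)]

total = sum(sec for _, sec in segments)
interactive = sum(sec for label, sec in segments if label == "multiple")
print(f"Interactive time: {100 * interactive / total:.1f}%")  # 24.0%
```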
Comparison groups
Multiple sections of the genetics course are typically offered within the same academic term at the study institution, and these sections served as quasi-experimental comparison groups. Specifically, three other sections (n = 48, 148, and 356 students, respectively) were offered at the same time as the second iteration of the modified CREATE intervention course (Table 2). All three comparison sections had the traditional lecture-and-practice structure with blocked content, while employing clicker questions as an instructional strategy. One comparison section had its enrollment capped at the same number as the intervention course, and the other two sections had larger enrollments. Students self-selected to enroll in the CREATE intervention and the different comparison courses.
TABLE 2.
Intervention and comparison groups.
Course | CREATE (intervention) | Small (comparison) | Medium (comparison) | Large (comparison)
---|---|---|---|---
Enrollment | 48 × 2 = 96 | 48 | 148 | 356
Response rate: CI | 73% | 75% | 66% | 67%
Response rate: TOSLS | 90% | n/a | n/a | n/a
Response rate: Survey | 85% | 71% | 74% | 73%
Response rate: Evaluation | 87% | 90% | 59% | 58%
Our modified CREATE intervention course had a limited enrollment of 48, equal to the enrollment of the small comparison course. Response rates are reported separately for CI, TOSLS (intervention only), affective survey, and course evaluation.
CI = concept inventory items; TOSLS = test of scientific literacy skills.
Cognitive outcomes
Cognitive learning of genetics concepts was measured using selected items from existing concept inventories customized for the course material, given to students at the beginning (pre) and end (post) of the academic term in the intervention and comparison courses. Sixteen items were chosen from established concept inventories in the literature (58, 59) by a different instructor who typically teaches the same course in another academic term and was not involved in the intervention or comparison courses (Appendix 2). In our dataset, the 16 items have a Cronbach’s alpha of 0.74. (In this paper, we will refer to these 16 selected concept inventory items collectively as “CI.”) While there are no universal guidelines for interpreting Cronbach’s alpha values, this value falls within the adequate-to-good range of reliability (60).
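Cronbach’s alpha can be computed directly from the item-response matrix. The following is a minimal sketch in Python; the data are simulated placeholders and the variable names are illustrative, not from the study:

```python
import numpy as np

def cronbach_alpha(responses: np.ndarray) -> float:
    """Cronbach's alpha for an (n_students x k_items) score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)      # per-item sample variance
    total_var = responses.sum(axis=1).var(ddof=1)  # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 100 students x 16 dichotomously scored CI items
rng = np.random.default_rng(0)
scores = (rng.random((100, 16)) < 0.6).astype(int)
print(round(cronbach_alpha(scores), 2))
```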
The Test of Scientific Literacy Skills (TOSLS) was given pre- and post-course in the intervention course to measure potential changes in students’ ability to evaluate scientific information and arguments (61). TOSLS was given only in the intervention course, as the administration of the test would have taken too much class time in the comparison courses. In our dataset, the 28 TOSLS items have a Cronbach’s alpha of 0.94, indicating high or excellent reliability (60).
Pre- and post-course CI were analyzed across the intervention and comparison courses using a standard two-way analysis of variance (ANOVA) with a post-hoc Tukey’s honestly significant difference (HSD) test for multiple comparisons. For clarity, p values are only reported for comparisons between the intervention course and each of the other courses to establish baseline comparisons and for pre- versus post-course CI scores to determine statistical outcomes for learning gains. TOSLS scores were analyzed using a standard Student’s t-test. Effect sizes were calculated using Cohen’s d, which is defined as the difference between the pre- and post-course means normalized to the standard deviation from the pre-course data. All statistical analyses in this study were performed in Microsoft Excel 2016 and JMP Pro version 13.0.
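As an illustration of this analysis pipeline, here is a hedged sketch in Python using statsmodels; the group means, sample sizes, and factor coding are assumptions chosen for demonstration and do not reproduce the study’s data or exact model specification:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)

# Simulated placeholder CI scores (0-16 scale) in long format:
# one row per student per administration (pre or post)
rows = []
for course, n, pre_mu, post_mu in [("CREATE", 96, 5.9, 9.5),
                                   ("small", 36, 5.9, 8.0)]:
    for time, mu in [("pre", pre_mu), ("post", post_mu)]:
        for s in rng.normal(mu, 2.8, n).clip(0, 16):
            rows.append({"course": course, "time": time, "score": s})
df = pd.DataFrame(rows)

# Two-way ANOVA: course x time (pre/post)
model = ols("score ~ C(course) * C(time)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

# Post-hoc Tukey HSD across the course-by-time cells
cells = df["course"] + ":" + df["time"]
print(pairwise_tukeyhsd(df["score"], cells))

# Effect size as defined in the text: mean gain / pre-course SD
pre = df.loc[(df.course == "CREATE") & (df.time == "pre"), "score"]
post = df.loc[(df.course == "CREATE") & (df.time == "post"), "score"]
print("Cohen's d:", (post.mean() - pre.mean()) / pre.std(ddof=1))
```

A paired pre/post comparison (e.g., scipy.stats.ttest_rel) would apply if responses were matched per student; the sketch above treats the administrations as unmatched groups.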
Perceived learning and course outcomes
To compare cognitive outcomes (measured by CI scores) with students’ perception of their own learning, we used the institution’s course evaluation data. Students rated the statement “I learned a great deal from this course” on a standard five-point Likert scale (from “strongly disagree” to “strongly agree”). Differences in perceived learning were calculated across the intervention and comparison courses using a standard two-way ANOVA with a post-hoc Tukey’s HSD test.
To further corroborate the potential difference between measured and perceived learning, we compared the students’ actual and expected course grades, the latter of which were also reported in the institution’s course evaluation. While students at the study institution were not required to complete course evaluations, individual instructors typically encouraged students to participate, and there is an institutional culture of completing these evaluations. The response rates for the intervention and comparison courses were within the typical range for the genetics course at the study institution (over 12 academic years from 2007 to 2019, average = 58%, standard deviation = 20%) (Table 2).
We used Fisher’s exact test to determine whether there was a difference in the distribution of actual versus expected grades in each of the intervention and comparison courses. Fisher’s exact test was used instead of a standard chi-square test of independence because of the low numbers in certain grade categories, such as D and F, which would have rendered the chi-square approximations inaccurate.
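A minimal sketch of this test in Python with hypothetical counts follows. Note that scipy’s fisher_exact handles only 2×2 tables, so the paper’s full five-category (A–F) grade comparison would require an R×C exact test (e.g., as implemented in R’s fisher.test) or a permutation approximation:

```python
from scipy import stats

# Hypothetical counts, collapsed to a 2x2 table because
# scipy.stats.fisher_exact supports only 2x2 contingency tables.
#                 grade >= B   grade < B
table = [[62, 34],   # actual grades
         [48, 48]]   # expected (self-reported) grades
odds_ratio, p = stats.fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p:.4f}")
```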
Affective outcomes
Student affect in the intervention and comparison courses was examined using the constructivist learning environments survey (CLES) and the affective dimension from the classroom community inventory (CCI). These instruments measure student affect in relation to the course learning environment and community; thus, this survey was administered only post-course. CLES has 30 items spanning five dimensions: personal relevance of the course material, comfort with the uncertainty of science, and students’ critical voice, shared control, and peer negotiation in the learning process; the items were on a five-point, Likert-like scale from “almost never” to “almost always” (62). The CCI has two dimensions; only the affective support dimension (with five items) was used in our survey, as the other dimension deals with cognitive aspects of student experiences in a course (63). The items were on a standard five-point Likert scale. In our dataset, the 35-item survey has a Cronbach’s alpha of 0.81, indicating good or high reliability (60). Differences in affective outcomes were calculated across the intervention and comparison courses using a standard two-way ANOVA with a post-hoc Tukey’s HSD test.
RESULTS
Characterization of the intervention
Classroom recordings of the modified CREATE intervention were analyzed using DART to quantify the amount of time with multiple voices (Table 3). Overall, about 25% of classroom time involved some form of interactive peer discussions. CREATE-only classes and mixed classes (with CREATE and clicker questions) had close to 30% of interactive time on average, whereas classes with interactive clicker lectures had under 15% of interactive time on average.
TABLE 3.
Interactive time in the modified CREATE intervention.
Class Type | Number | % Interactive Time |
---|---|---|
CREATE only | 2 | 28.7 ± 0.6 |
Interactive clicker lecture | 5 | 14.9 ± 6.5 |
Mixed (CREATE and clicker) | 9 | 29.8 ± 6.9 |
All | 16 | 25.0 ± 9.3 |
Percentage of class time with multiple voices from the DART profiles is tabulated across the three types of classes in the intervention: CREATE only, interactive clicker lecture, and mixed.
The DART output files were further annotated for the different kinds of activities in the classroom (Fig. 1). For example, in one CREATE-only class, two extended CREATE time blocks were flanked by lectures (Fig. 1A). The first lecture time block was used to introduce the general background and core genetics concepts behind the paper. Then, students engaged in CREATE to diagram the methods in a collaborative fashion, followed by a short summary lecture that provided a synthesis of students’ work and broader explanations. Finally, students engaged in CREATE again, this time to annotate a figure and construct a scientific argument; this interactive time block was also followed by a short summary lecture.
FIGURE 1.
Implementation of the modified CREATE intervention. Annotated DART profiles for: A) a CREATE-only class, B) an interactive clicker lecture, and C) a mixed class with both CREATE and clicker questions.
CQ = clicker question; DART = Decibel Analysis for Research in Teaching.
In one interactive lecture with no CREATE, fundamental genetics concepts were introduced and discussed through a series of four clicker questions (indicated as CQ, Fig. 1B). These clicker questions were implemented as iterative peer discussions, with instructor feedback in between (64), resulting in a total of eight CQ time blocks in the DART profile. The decision to hold a second round of peer discussion on the same clicker question was made in real time in the classroom, following general guidelines from peer instruction (64). Some clicker questions in other classes did not have a second CQ time block of peer discussion.
In one mixed class with CREATE and clicker questions (Fig. 1C), students were engaged in a clicker question on review materials at the beginning of class. The background and related genetics concepts were then introduced through an interactive clicker lecture. Subsequently, students diagrammed the methods of a paper in a CREATE time block, followed by a guided analysis of the data figures through an interactive clicker lecture. In some other instances, clicker questions were used to discuss the methods, and CREATE was used to analyze the data figures, or only the methods or data figures were examined in one class period.
Cognitive outcomes
To establish baselines for learning gains, we compared the pre-course CI scores between the modified CREATE intervention and each of the comparison courses. Students in the small and large comparison courses had cognitive baselines (measured by pre-course CI scores) statistically equivalent to that of students in the modified CREATE intervention (ANOVA, p > 0.9999 and p > 0.95, respectively), whereas students in the medium comparison course had a higher average pre-course CI score (ANOVA, p < 0.01). While baseline test scores alone do not eliminate all possible biases from students self-selecting into the CREATE intervention and the different comparison courses, they are the best predictors of student outcomes in K–12 science education research (65). Therefore, the small comparison course serves as a suitable statistically equivalent comparison group (66), given its equivalent course size and pre-course CI scores. The other two comparison courses, while not fully statistically equivalent, are included for additional data-source triangulation.
To compare cognitive learning outcomes across the intervention and comparison courses, we examined pre- and post-course CI scores (Fig. 2A, left y-axis). For the intervention, CI scores were higher (ANOVA, p < 0.0001) at the end of the course (average ± standard deviation = 9.54 ± 3.31) than at the beginning of the course (5.93 ± 2.40), with an effect size of 1.51, which is considered very large (67). In contrast, the three comparison courses (small, medium, and large) showed statistically significant gains with effect sizes of 0.87, 0.56, and 0.46, respectively, all lower than that of the intervention course. Furthermore, we used TOSLS to examine changes in students’ ability to evaluate scientific information and arguments in the intervention course (Fig. 2B). TOSLS scores were higher (t-test, p < 0.0001) at the end of the course (19.7 ± 4.1) than at the beginning of the course (17.3 ± 3.0), with an effect size of 0.78, which is considered large (67).
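To make the effect-size arithmetic explicit, Cohen’s d as defined in the Methods follows directly from the reported summary statistics:

$$
d = \frac{\bar{x}_{\text{post}} - \bar{x}_{\text{pre}}}{s_{\text{pre}}} = \frac{9.54 - 5.93}{2.40} \approx 1.50,
$$

which matches the reported 1.51 up to rounding of the summary statistics.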
FIGURE 2.
Cognitive outcomes. (A) Pre- and post-course CI scores are plotted on the left y-axis and students’ perceived learning on the right y-axis. Two-way ANOVA indicates that the perceived learning score in CREATE was lower than in each of the three comparison courses (all p < 0.01), whereas the comparison courses were not statistically different from one another. For CI scores, error bars indicate standard deviation; effect sizes (ES) are calculated by Cohen’s d, and p values are determined by two-way ANOVA. (B) Pre- and post-course TOSLS scores for the modified CREATE intervention are compared by t-test (p < 0.001). Error bars indicate standard deviation, and ES is calculated by Cohen’s d. (C) Distributions of students’ actual and expected grades (legend: A, B, C, D, and F) are plotted as outer and inner rings, respectively, in the donut graphs and compared using Fisher’s exact test. CI = concept inventory items.
We examined students’ perceived learning in relation to the cognitive outcomes measured by CI scores (Fig. 2A, right versus left y-axis). Students reported a lower perceived learning score in the intervention course (3.66 ± 1.19) versus the three comparison courses (small, medium, and large), which had perceived learning scores of 4.28 ± 0.91 (ANOVA, p < 0.01), 4.27 ± 1.00 (ANOVA, p < 0.001), and 4.23 ± 0.73 (ANOVA, p < 0.0001), respectively. In contrast to the comparison courses, students in the intervention course reported statistically lower perceived learning, even though the effect size in CI scores showed much higher cognitive learning gains. In addition, we compared students’ expected versus actual grades (Fig. 2C). Consistent with the perceived learning and CI results, we found that students in the intervention course underpredicted how well they had done (predicted average grade point out of 4.00 = 3.12, actual = 3.36; Fisher’s exact test, p < 0.05), whereas students in the comparison courses either made accurate predictions (small and medium comparison courses) or overpredicted their course performance (large comparison course; predicted = 3.77, actual = 3.15; Fisher’s exact test, p < 0.0001).
Affective outcomes
We compared affective outcomes across the intervention and comparison courses (Fig. 3). In all six affective dimensions measured, students in the intervention had statistically higher outcomes than students in at least two of the three comparison courses. Compared with the small and large comparison courses, students in the intervention reported higher personal relevance of the subject matter (ANOVA, p < 0.05 and p < 0.0001, respectively) and more comfort with the uncertainty of science (ANOVA, p < 0.01 and p < 0.0001, respectively), a core aspect of the nature of science (5, 62). Students in the intervention also reported having more critical voice and shared control in the classroom, as well as more peer negotiation and affective support in the learning process, than students in all three comparison courses (ANOVA, all p < 0.05).
FIGURE 3.
Affective outcomes. Results on the six affective dimensions from our survey are plotted: (A) personal relevance, (B) uncertainty of science, (C) critical voice, (D) shared control, (E) peer negotiation, and (F) affective support. Error bars indicate standard deviation, and statistical differences (by two-way ANOVA) are indicated by brackets and the following notation: * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001.
DISCUSSION
In this study, we showed that a modified CREATE intervention in an upper-division genetics course led to improved student outcomes. Triangulation across multiple sources of data (CI, survey, and course evaluation) provides potential explanations and mechanisms for why we observed these results. For example, a higher survey score on peer negotiation (in the modified CREATE intervention versus the comparison courses) indicates that students perceived more interactive discussions, which are critical for learning (16, 20); correspondingly, we observed higher learning gains as measured by pre- and post-course CI scores. Somewhat ironically, the higher score on peer negotiation could also explain why students in the modified CREATE intervention tended to underpredict their learning and performance. Students typically hold a combination of teacher-centered and knowledge-centered conceptions of teaching and learning, where the instructors are expected to clearly present the correct knowledge through examples, in contrast to a student-centered conception of teaching and learning, where students actively participate in the learning process by providing feedback to one another (68). It is conceivable that peer negotiation is inconsistent with what students may perceive to be effective teaching and learning, thus resulting in a lower predicted level of learning.
In our modified CREATE intervention, students reported more comfort with the uncertainty of science, which is a key aspect of the nature of science (5, 62). Our results are consistent with a previous study showing significant changes in students’ self-assessed understanding of the nature of science from an upper-division course taught using the original CREATE instructional strategy (29). In addition, higher personal relevance, critical voice, shared control, and affective support would likely result in increased persistence in biological sciences majors, especially for minoritized students, such as underrepresented minority or first-generation college students, by potentially counteracting classroom microaggressions (69), stereotype threat (70), and demotivation based on faculty mindset (71). However, persistence in the major was not measured, as the study was designed as a single-course intervention.
There are some additional limitations to the current study. Even though the modified CREATE course was offered twice, the comparison courses were offered only once, in the same academic term as the second iteration of the intervention. However, having three concurrent comparison courses provides additional information. More importantly, one of the comparison courses was the same size as the intervention, and students in the intervention course and this comparison course had equivalent pre-course CI scores. We were also not able to fully disentangle the effect of course size on student outcomes in the comparison courses; however, the main goal of the study was to compare the modified CREATE intervention with other active-learning strategies. Despite these limitations, our study highlights important considerations that are of interest to the growing field of biology education research and practice.
First, we observed that students in the modified CREATE intervention on average underpredicted their learning and course performance, whereas students in at least one of the comparison courses on average overpredicted theirs. This pattern is consistent with the existing literature on interleaved versus blocked learning (72) and productive failure (55) and can be explained by a lack of metacognitive awareness of the effectiveness of different learning strategies (73). Thus, our results argue, as do previous studies in the literature, that it is important to measure both students’ self-reported perceived learning and their actual cognitive learning gains to triangulate findings in a research study.
Second, the use of different outcome measurements can lead to varying results across studies, making comparisons difficult (13). Even though a previous study observed no difference in cognitive gains in critical thinking between CREATE and a comparison group (36), we do not see our results as contradicting that study, in part because the two studies used different outcome measurements; we did not measure critical thinking. Our comparison courses also did not explicitly engage students in primary literature in an extensive fashion, thus potentially widening the difference between the modified CREATE intervention and the comparisons. When situated in the context of other existing studies, our work contributes to the expanding literature on how and why different implementations of the same active-learning strategy contribute to student outcomes.
Finally, recent studies have shown that instructors who report using active-learning strategies often do not use them as suggested or originally designed by researchers (74, 75), which may lead to different student outcomes (76). Therefore, it is critical to align the intended curriculum (designed by the researchers based on learning principles) and the enacted curriculum (implemented in different educational contexts by practitioners). More broadly speaking, the intended and enacted curricula, as well as what students learn, can often be substantially different (77, 78). In this study, our modified CREATE intervention (intended curriculum) was designed with a course structure that is supported by existing literature on how people learn, and we observed the implementation (enacted curriculum) using DART. Implementing established active-learning strategies with integrity based on fundamental learning principles can have an important positive impact on student outcomes; this is in contrast to directly copying the instructional strategy (high fidelity) without adapting it to the local educational context (39). The increasing calls for widespread and large-scale implementation of active learning across STEM disciplines (11) raise important questions for how best to support the professional development of current and future faculty (79) and what mechanisms of propagation would be best for ensuring the integrity of implementation (40, 80).
SUPPLEMENTARY MATERIALS
Appendix 1: pre-lecture reading assignments with online questions. Appendix 2: selected concept inventory items.
ACKNOWLEDGMENTS
We are grateful to the many course instructors and students for engaging in this study and to L. McDonnell for contributing to the design of the research instruments. We thank M. Owens and K. Tanner for discussions on DART, as well as S. Hoskins and K. Kenyon for thoughtful feedback on how to implement CREATE in a modified fashion. This work was supported in part by the Division of Biological Sciences Teaching Professor Summer Research Fellowship Program and the Faculty Career Development Program at University of California San Diego (SML), and JT was an undergraduate researcher in the Faculty Mentor Program at University of California San Diego. This material is based upon work supported by the National Science Foundation under Grant No. 1524779. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors declare that there are no conflicts of interest.
Footnotes
Supplemental materials available at http://asmscience.org/jmbe
REFERENCES
1. American Association for the Advancement of Science. Science for all Americans. Oxford University Press, New York, NY; 1989.
2. Boyer Commission. Reinventing undergraduate education: a blueprint for America’s research universities. Commission on Educating Undergraduates in the Research University, Stony Brook, NY; 1998. https://eric.ed.gov/?id=ED424840.
3. National Research Council. BIO2010: transforming undergraduate education for future research biologists. The National Academies Press, Washington, DC; 2003.
4. Association of American Medical Colleges, Howard Hughes Medical Institute. Scientific foundations for future physicians. Washington, DC; 2009.
5. American Association for the Advancement of Science. Vision and change in undergraduate biology education: a call to action: a summary of recommendations made at a national conference organized by the American Association for the Advancement of Science; July 15–17, 2009; Washington, DC; 2011.
6. President’s Council of Advisors on Science and Technology (PCAST). Engage to excel: producing one million additional college graduates with degrees in science, technology, engineering, and mathematics. Executive Office of the President, Washington, DC; 2012.
7. National Academy of Sciences, National Academy of Engineering, and Institute of Medicine. Expanding underrepresented minority participation: America’s science and technology talent at the crossroads. The National Academies Press, Washington, DC; 2011.
8. National Academies of Sciences, Engineering, and Medicine. Barriers and opportunities for 2-year and 4-year STEM degrees: systemic change to support students’ diverse pathways. The National Academies Press, Washington, DC; 2016.
9. National Research Council. A framework for K–12 science education: practices, crosscutting concepts, and core ideas. The National Academies Press, Washington, DC; 2012.
10. Bowen CW. A quantitative literature review of cooperative learning effects on high school and college chemistry achievement. J Chem Educ. 2000;77:116–119. doi:10.1021/ed077p116.
11. Freeman S, Eddy SL, McDonough M, Smith MK, Okoroafor N, Jordt H, Wenderoth MP. Active learning increases student performance in science, engineering, and mathematics. Proc Natl Acad Sci. 2014;111:8410–8415. doi:10.1073/pnas.1319030111.
12. Prince M. Does active learning work? A review of the research. J Engineer Educ. 2004;93:223–231. doi:10.1002/j.2168-9830.2004.tb00809.x.
13. Ruiz-Primo MA, Briggs D, Iverson H, Talbot R, Shepard LA. Impact of undergraduate science course innovations on learning. Science. 2011;331:1269–1270. doi:10.1126/science.1198976.
14. Hoskins SG, Stevens LM, Nehm RH. Selective use of primary literature transforms the classroom into a virtual laboratory. Genetics. 2007;176:1381–1389. doi:10.1534/genetics.107.071183.
15. Chi MT. Active-constructive-interactive: a conceptual framework for differentiating learning activities. Topics Cogn Sci. 2009;1:73–105. doi:10.1111/j.1756-8765.2008.01005.x.
16. Chi MT, Wylie R. The ICAP framework: linking cognitive engagement to active learning outcomes. Educ Psychol. 2014;49:219–243. doi:10.1080/00461520.2014.965823.
17. National Research Council. How people learn: brain, mind, experience, and school, expanded ed. The National Academies Press, Washington, DC; 2000.
18. National Academies of Sciences, Engineering, and Medicine. How people learn II: learners, contexts, and cultures. The National Academies Press, Washington, DC; 2018.
19. Piaget J. To understand is to invent: the future of education. Grossman Publishers, New York, NY; 1973.
20. Vygotsky LS. Mind in society: the development of higher mental process. Harvard University Press, Cambridge, MA; 1978.
21. Hoskins SG. Using a paradigm shift to teach neurobiology and the nature of science: a CREATE-based approach. J Undergrad Neurosci Educ. 2008;6:A40.
22. Hoskins SG, Stevens LM. Learning our LIMITS: less is more in teaching science. Adv Physiol Educ. 2009;33:17–20. doi:10.1152/advan.90184.2008.
23. Hoskins SG. Teaching science for understanding: focusing on who, what, and why. In: Meinwald J, Hildebrand JG, editors. Science and the educated American: a core component of liberal education. American Academy of Arts and Sciences, Cambridge, MA; 2010. p. 151–179.
24. Hoskins SG. “But if it’s in the newspaper, doesn’t that mean it’s true?” Developing critical reading and analysis skills by evaluating newspaper science with CREATE. Am Biol Teach. 2010;72:415–420. doi:10.1525/abt.2010.72.7.5.
25. Hoskins SG, Krufka A. The CREATE strategy benefits students and is a natural fit for faculty. Microbe. 2015;10:108–112.
26. Hoskins SG, Gottesman AJ, Kenyon KL. CREATE two-year/four-year faculty workshops: a focus on practice, reflection, and novel curricular design leads to diverse gains for faculty at two-year and four-year institutions. J Microbiol Biol Educ. 2017;18:18.3.65. doi:10.1128/jmbe.v18i3.1365.
27. Kenyon KL, Onorato ME, Gottesman AJ, Hoque J, Hoskins SG. Testing CREATE at community colleges: an examination of faculty perspectives and diverse student gains. CBE Life Sci Educ. 2016;15:ar8. doi:10.1187/cbe.15-07-0146.
28. Gottesman AJ, Hoskins SG. CREATE cornerstone: introduction to scientific thinking, a new course for STEM-interested freshmen, demystifies scientific thinking through analysis of scientific literature. CBE Life Sci Educ. 2013;12:59–72. doi:10.1187/cbe.12-11-0201.
29. Hoskins SG, Lopatto D, Stevens LM. The CREATE approach to primary literature shifts undergraduates’ self-assessed ability to read and analyze journal articles, attitudes about science, and epistemological beliefs. CBE Life Sci Educ. 2011;10:368–378. doi:10.1187/cbe.11-03-0027.
30. Hoskins SG, Gottesman AJ. Investigating undergraduates’ perceptions of science in courses taught using the CREATE strategy. J Microbiol Biol Educ. 2018;19:19.1.6. doi:10.1128/jmbe.v19i1.1440.
31. Stevens LM, Hoskins SG. The CREATE strategy for intensive analysis of primary literature can be used effectively by newly trained faculty to produce multiple gains in diverse students. CBE Life Sci Educ. 2014;13:224–242. doi:10.1187/cbe.13-12-0239.
32. Clark IE, Romero-Calderón R, Olson JM, Jaworski L, Lopatto D, Banerjee U. “Deconstructing” scientific research: a practical and scalable pedagogical tool to provide evidence-based science instruction. PLOS Biol. 2009;7:e1000264. doi:10.1371/journal.pbio.1000264.
33. Round JE, Campbell AM. Figure facts: encouraging undergraduates to take a data-centered approach to reading primary literature. CBE Life Sci Educ. 2013;12:39–46. doi:10.1187/cbe.11-07-0057.
34. Sato BK, Kadandale P, He W, Murata PM, Latif Y, Warschauer M. Practice makes pretty good: assessment of primary literature reading abilities across multiple large-enrollment biology laboratory courses. CBE Life Sci Educ. 2014;13:677–686. doi:10.1187/cbe.14-02-0025.
35. McCartney M, Childers C, Baiduc RR, Barnicle K. Annotated primary literature: a professional development opportunity in science communication for graduate students and postdocs. J Microbiol Biol Educ. 2018;19:19.1.24. doi:10.1128/jmbe.v19i1.1439.
36. Segura-Totten M, Dalman NE. The CREATE method does not result in greater gains in critical thinking than a more traditional method of analyzing the primary literature. J Microbiol Biol Educ. 2013;14:166–175. doi:10.1128/jmbe.v14i2.506.
37. Hoskins SG, Kenyon KL. Letter to the editor. J Microbiol Biol Educ. 2014;15:3–4. doi:10.1128/jmbe.v15i1.725.
38. Borrego M, Cutler S, Prince M, Henderson C, Froyd JE. Fidelity of implementation of research-based instructional strategies (RBIS) in engineering science courses. J Engineer Educ. 2013;102:394–425. doi:10.1002/jee.20020.
39. Bryk AS, Gomez LM, Grunow A, LeMahieu PG. Learning to improve: how America’s schools can get better at getting better. Harvard Education Press, Cambridge, MA; 2015.
40. Froyd JE, Henderson C, Cole RS, Friedrichsen D, Khatri R, Stanford C. From dissemination to propagation: a new paradigm for education developers. Change Mag Higher Learn. 2017;49:35–42. doi:10.1080/00091383.2017.1357098.
41. Stains M, Vickrey T. Fidelity of implementation: an overlooked yet critical construct to establish effectiveness of evidence-based instructional practices. CBE Life Sci Educ. 2017;16:rm1. doi:10.1187/cbe.16-03-0113.
42. McCormick AC, Zhao CM. Rethinking and reframing the Carnegie classification. Change Mag Higher Learn. 2005;37:51–57. doi:10.3200/CHNG.37.5.51-57.
43. Sampson V, Clark DB. Assessment of the ways students generate arguments in science education: current perspectives and recommendations for future directions. Sci Educ. 2008;92:447–472. doi:10.1002/sce.20276.
44. Lacum EB, Ossevoort MA, Goedhart MJ. A teaching strategy with a focus on argumentation to improve undergraduate students’ ability to read research articles. CBE Life Sci Educ. 2014;13:253–264. doi:10.1187/cbe.13-06-0110.
45. Rohrer D, Taylor K. The shuffling of mathematics problems improves learning. Instruct Sci. 2007;35:481–498. doi:10.1007/s11251-007-9015-8.
46. Rohrer D, Dedrick RF, Burgess K. The benefit of interleaved mathematics practice is not limited to superficially similar kinds of problems. Psychonomic Bull Rev. 2014;21:1323–1330. doi:10.3758/s13423-014-0588-3.
47. Birnbaum MS, Kornell N, Bjork EL, Bjork RA. Why interleaving enhances inductive learning: the roles of discrimination and retrieval. Mem Cogn. 2013;41:392–402. doi:10.3758/s13421-012-0272-7.
48. Kang SH, Pashler H. Learning painting styles: spacing is advantageous when it promotes discriminative contrast. Appl Cogn Psychol. 2012;26:97–103. doi:10.1002/acp.1801.
49. Rohrer D. Interleaving helps students distinguish among similar concepts. Educ Psychol Rev. 2012;24:355–367. doi:10.1007/s10648-012-9201-3.
50. Taylor K, Rohrer D. The effects of interleaved practice. Appl Cogn Psychol. 2010;24:837–848. doi:10.1002/acp.1598.
51. Bybee RW. The BSCS 5E instructional model: personal reflections and contemporary implications. Sci Children. 2014;51:10–13. doi:10.2505/4/sc14_051_08_10.
52. Kapur M. Productive failure in mathematical problem solving. Instr Sci. 2010;38:523–550. doi:10.1007/s11251-009-9093-x.
53. Kapur M. Productive failure in learning the concept of variance. Instr Sci. 2012;40:651–672. doi:10.1007/s11251-012-9209-6.
54. Kapur M. Productive failure. Cogn Instr. 2008;26:379–424. doi:10.1080/07370000802212669.
55. Kapur M. A further study of productive failure in mathematical problem solving: unpacking the design components. Instr Sci. 2011;39:561–579. doi:10.1007/s11251-010-9144-3.
56. Kapur M, Bielaczyc K. Designing for productive failure. J Learn Sci. 2012;21:45–83. doi:10.1080/10508406.2011.591717.
57. Owens MT, Seidel SB, Wong M, Bejines TE, Lietz S, Perez JR, Sit S, Subedar ZS, Acker GN, Akana SF, Balukjian B, Benton HP, Blair JR, Boaz SM, Boyer KE, Bram JB, Burrus LW, Byrd DT, Caporale N, Carpenter EJ, Chan YHM, Chen L, Chovnick A, Chu DS, Clarkson BK, Cooper SE, Creech C, Crow KD, de la Torre JR, Denetclaw WF, Duncan KE, Edwards AS, Erickson KL, Fuse M, Gorga JJ, Govindan B, Green LJ, Hankamp PZ, Harris HE, He ZH, Ingalls S, Ingmire PD, Jacobs JR, Kamakea M, Kimpo RR, Knight JD, Krause SK, Krueger LE, Light TL, Lund L, Márquez-Magaña LM, McCarthy BK, McPheron LJ, Miller-Sims VC, Moffatt CA, Muick PC, Nagami PH, Nusse GL, Okimura KM, Pasion SG, Patterson R, Pennings PS, Riggs B, Romeo J, Roy SW, Russo-Tait T, Schultheis LM, Sengupta L, Small R, Spicer GS, Stillman JH, Swei A, Wade JM, Waters SB, Weinstein SL, Willsie JK, Wright DW, Harrison CD, Kelley LA, Trujillo G, Domingo CR, Schinske JN, Tanner KD. Classroom sound can be used to classify teaching practices in college science courses. Proc Natl Acad Sci. 2017;114:3085–3090. doi:10.1073/pnas.1618693114.
58. Smith MK, Wood WB, Knight JK. The genetics concept assessment: a new concept inventory for gauging student understanding of genetics. CBE Life Sci Educ. 2008;7:422–430. doi:10.1187/cbe.08-08-0045.
59. Shi J, Wood WB, Martin JM, Guild NA, Vicens Q, Knight JK. A diagnostic assessment for introductory molecular and cell biology. CBE Life Sci Educ. 2010;9:453–461. doi:10.1187/cbe.10-04-0055.
60. Taber KS. The use of Cronbach’s alpha when developing and reporting research instruments in science education. Res Sci Educ. 2018;48:1273–1296. doi:10.1007/s11165-016-9602-2.
61. Gormally C, Brickman P, Lutz M. Developing a test of scientific literacy skills (TOSLS): measuring undergraduates’ evaluation of scientific information and arguments. CBE Life Sci Educ. 2012;11:364–377. doi:10.1187/cbe.12-03-0026.
62. Nix RK, Fraser BJ, Ledbetter CE. Evaluating an integrated science learning environment using the Constructivist Learning Environment Survey. Learn Environ Res. 2005;8:109–133. doi:10.1007/s10984-005-7251-x.
63. Rovai AP, Wighting MJ, Lucking R. The classroom and school community inventory: development, refinement, and validation of a self-report measure for educational research. Internet Higher Educ. 2004;7:263–280. doi:10.1016/j.iheduc.2004.09.001.
64. Smith MK, Wood WB, Krauter K, Knight JK. Combining peer discussion with instructor explanation increases student learning from in-class concept questions. CBE Life Sci Educ. 2011;10:55–63. doi:10.1187/cbe.10-08-0101.
65. Westine CD, Spybrook J, Taylor JA. An empirical investigation of variance design parameters for planning cluster-randomized trials of science achievement. Eval Rev. 2013;37:490–519. doi:10.1177/0193841X14531584.
66. National Research Council. On evaluating curricular effectiveness: judging the quality of K–12 mathematics evaluations. The National Academies Press, Washington, DC; 2004.
67. Maher JM, Markey JC, Ebert-May D. The other half of the story: effect size analysis in quantitative research. CBE Life Sci Educ. 2013;12:345–351. doi:10.1187/cbe.13-04-0082.
68. Virtanen V, Lindblom-Ylänne S. University students’ and teachers’ conceptions of teaching and learning in the biosciences. Instruct Sci. 2010;38:355–370. doi:10.1007/s11251-008-9088-z.
69. Suárez-Orozco C, Casanova S, Martin M, Katsiaficas D, Cuellar V, Smith NA, Dias SI. Toxic rain in class: classroom interpersonal microaggressions. Educ Res. 2015;44:151–160. doi:10.3102/0013189X15580314.
70. Walton GM, Cohen GL. A brief social-belonging intervention improves academic and health outcomes of minority students. Science. 2011;331:1447–1451. doi:10.1126/science.1198364.
71. Canning EA, Muenks K, Green DJ, Murphy MC. STEM faculty who believe ability is fixed have larger racial achievement gaps and inspire less student motivation in their classes. Sci Adv. 2019;5:eaau4734. doi:10.1126/sciadv.aau4734.
72. Yan VX, Bjork EL, Bjork RA. On the difficulty of mending metacognitive illusions: a priori theories, fluency effects, and misattributions of the interleaving benefit. J Experiment Psychol Gen. 2016;145:918–933. doi:10.1037/xge0000177.
73. Karpicke JD, Butler AC, Roediger HL III. Metacognitive strategies in student learning: do students practise retrieval when they study on their own? Memory. 2009;17:471–479. doi:10.1080/09658210802647009.
74. Ebert-May D, Derting TL, Hodder J, Momsen JL, Long TM, Jardeleza SE. What we say is not what we do: effective evaluation of faculty professional development programs. BioScience. 2011;61:550–558. doi:10.1525/bio.2011.61.7.9.
75. Henderson C, Dancy MH. Impact of physics education research on the teaching of introductory quantitative physics in the United States. Phys Rev Spec Top Phys Educ Res. 2009;5:020107. doi:10.1103/PhysRevSTPER.5.020107.
76. Andrews TM, Leonard MJ, Colgrove CA, Kalinowski ST. Active learning not associated with student learning in a random sample of college biology courses. CBE Life Sci Educ. 2011;10:394–405. doi:10.1187/cbe.11-07-0061.
77. Bussey TJ, Orgill M, Crippen KJ. Variation theory: a theory of learning and a useful theoretical framework for chemical education research. Chem Educ Res Pract. 2013;14:9–22. doi:10.1039/C2RP20145C.
78. Lloyd GM, Cai J, Tarr JE. Issues in curriculum studies: evidence-based insights and future directions. In: Cai J, editor. Compendium for research in mathematics education. National Council of Teachers of Mathematics, Reston, VA; 2017. p. 824–852.
79. Amundsen C, Wilson M. Are we asking the right questions? A conceptual review of the educational development literature in higher education. Rev Educ Res. 2012;82:90–126. doi:10.3102/0034654312438409.
80. Henderson C, Beach A, Finkelstein N. Facilitating change in undergraduate STEM instructional practices: an analytic review of the literature. J Res Sci Teach. 2011;48:952–984. doi:10.1002/tea.20439.