Abstract
Background
Program evaluation is important for assessing the effectiveness of the residency curriculum. Limited resources are available, however, and curriculum evaluation processes must be sustainable and well integrated into program improvement efforts.
Intervention
We describe the pediatric Clinical Skills Fair, an innovative method for evaluating the effectiveness of residency curriculum through assessment of trainees in 2 domains: medical knowledge/patient care and procedure. Each year from 2008 to 2011, interns completed the Clinical Skills Fair at the beginning of the year, as rising interns in postgraduate year (PGY)-1 (R1s), and again at the end of the year, as rising residents in PGY-2 (R2s). This pre-post assessment of each cohort measured how well the curriculum prepared trainees to meet the intern goals and objectives.
Results
Participants were 48 R1s and 47 R2s. In the medical knowledge/patient care domain, intern scores improved from 48% to 65% correct (P < .001). Significant improvement was demonstrated in the following subdomains: jaundice (41% to 65% correct; P < .001), fever (67% to 94% correct; P < .001), and asthma (43% to 62% correct; P = .002). No significant change was noted within the arrhythmia subdomain. There was significant improvement in the procedure domain for all interns (χ2 = 32.82, P < .001).
Conclusions
The Clinical Skills Fair is a readily implemented and sustainable method for assessing our residency program curriculum. Its feasibility may allow other programs to assess their curriculum and track the impact of programmatic changes; it may be particularly useful for program evaluation committees.
What was known
Program evaluation committees must document formal, systematic evaluation of the curriculum at least annually.
What is new
A Clinical Skills Fair assesses program effectiveness through resident assessment in medical knowledge/patient care and procedure domains.
Limitations
The single-institution design limits generalizability; the intervention requires modest investment of faculty/staff time and equipment.
Bottom line
The Clinical Skills Fair is an effective and sustainable approach to program evaluation in a continuous improvement process.
Editor's Note: The online version of this article contains sample intern medical knowledge/patient care questions used in this study and examples of scoring instructions for the sample questions.
Introduction
The Accreditation Council for Graduate Medical Education (ACGME) requires that training programs “must document formal, systematic evaluation of the curriculum at least annually” through program evaluation committees (PECs).1 Curriculum evaluation is consistent with sound educational principles2,3 and is increasingly important in the ACGME's Next Accreditation System.4
Despite the availability of program evaluation models, there is a need for consistent curriculum evaluation processes that are sustainable and well integrated into program evaluation.5 One such program evaluation model, Musick's model,6 includes needs assessment, focused methodology, and presentation and documentation of results. Other evaluation models have incorporated similar principles.2,7
We present a Clinical Skills Fair as a rigorous curriculum evaluation method that assesses clinical skills based on level-specific learning objectives, thereby providing meaningful educational outcomes. This method helps programs identify curricular strengths and deficiencies, effectively informing program improvement.
Methods
This study was a quasi-experimental, pre-post design using a purposeful sample of Duke University pediatrics and combined internal medicine-pediatrics residents (interns) in postgraduate year (PGY)-1 from 2008 to 2011.
In 2006, the Duke Pediatric Residency Program formed a curriculum committee (CC) comprising program leadership and staff who represented a range of expertise, including primary care, education, hospital medicine, critical care, cardiology, infectious diseases, child abuse, and newborn medicine. The CC was created to supervise the development, implementation, and evaluation of the curriculum and identified a need for more rigorous evaluation. This led the CC to create the Clinical Skills Fair to assess how well the curriculum prepared trainees to meet level-specific goals and objectives. Using a modified Delphi technique,8 the CC reviewed the curriculum goals and objectives to identify the highest-yield areas, which included asthma, fever, jaundice, arrhythmias, and lumbar puncture. Corresponding goals and objectives were organized into 2 domains: medical knowledge/patient care (MK/PC) and procedure.
For MK/PC, the CC developed a written test based on intern-level goals and objectives, using the blueprinting process.9 Fourteen questions were organized into 4 subdomains: asthma, fever, jaundice, and arrhythmias. All questions were case-based and either multiple choice or short answer; 21 points were possible. Revisions were made based on pilot-testing with nonparticipating senior residents. Questions were entered into an electronic survey system (provided as online supplemental material). Criterion-referenced standards were not identified for this study; the intent was to demonstrate improvement in scores over time. The CC developed written scoring instructions based on consensus (provided as online supplemental material).
For the procedure domain, interns performed a lumbar puncture using a low-fidelity simulator. They were scored by a trained faculty member using scoring instructions based on procedural skills training modules and expert opinion.10 The test was criterion-referenced, and 10 of 13 points was set as the minimum standard, as determined by CC consensus to meet acceptable clinical performance. Copies of the intern curriculum and all test and scoring instructions are available from the authors.
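The criterion-referenced standard described above (at least 10 of 13 points) can be illustrated with a minimal Python sketch. The function names and the sample scores are hypothetical and are not taken from the study data; only the 13-point maximum and the 10-point minimum standard come from the text.

```python
MAX_POINTS = 13      # total points on the lumbar puncture checklist (from the text)
PASSING_POINTS = 10  # minimum standard set by curriculum committee consensus (from the text)


def meets_standard(score):
    """Return True if a single lumbar puncture score meets the minimum standard."""
    if not 0 <= score <= MAX_POINTS:
        raise ValueError(f"score must be between 0 and {MAX_POINTS}")
    return score >= PASSING_POINTS


def pass_rate(scores):
    """Return the proportion of scores that meet the minimum standard."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(meets_standard(s) for s in scores) / len(scores)


# Hypothetical scores for one cohort at the start and end of the intern year.
start_of_year = [6, 9, 10, 7, 8]
end_of_year = [11, 13, 10, 9, 12]

start_rate = pass_rate(start_of_year)
end_rate = pass_rate(end_of_year)
print(f"Meeting standard: {start_rate:.0%} at start, {end_rate:.0%} at end")
```

Reporting the proportion of interns who meet the standard, rather than individual scores, matches the aggregate, curriculum-level intent of the assessment.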
We examined the construct validity of the Clinical Skills Fair using multiple sources of evidence. Content evidence was demonstrated through systematic test development as previously described. Two CC members scored subsets of the tests at separate points using written scoring instructions. Scores were compared between the 2 CC members and for each CC member separately over time. We calculated the interrater and intrarater reliability using Cohen's kappa. Once it was deemed reliable, the tool was used and scored by 1 faculty member for subsequent years.
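As an illustration of the reliability calculation, the following Python sketch computes an unweighted Cohen's kappa for two sets of item scores. The function name and the example ratings are hypothetical; the study does not report its software or whether a weighted variant was used, so this shows only the standard unweighted form.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal score frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical points (0, 1, or 2) awarded by two raters to the same 10 answers.
rater_1 = [2, 1, 0, 2, 1, 1, 0, 2, 2, 1]
rater_2 = [2, 1, 0, 2, 1, 0, 0, 2, 2, 1]

kappa = cohens_kappa(rater_1, rater_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

The same function applies to intrarater reliability if the two lists hold one rater's scores for the same answers at two time points.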
The outcome measures included changes in the MK/PC and procedure test scores from beginning to end of the intern year. We also measured feasibility outcomes, including resident/faculty/staff time, equipment costs, information technology resources, staff training, and space.
A paired t test was performed annually on the aggregate data for each cohort. Clinical Skills Fair scores from the rising interns in PGY-1 (R1s) were compared to scores from the rising residents in PGY-2 (R2s). For example, scores from the 2008 R1s were compared with scores from the 2009 R2s, thereby comparing the same class of interns. We also compared all R1s to all R2s.
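The comparison described above can be sketched as a paired t statistic on matched beginning-of-year and end-of-year scores. This is a minimal illustration using only the Python standard library; the function name and the scores (out of 21 points) are hypothetical, and the P value would still need to be obtained from a t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(pre_scores, post_scores):
    """Return (t statistic, degrees of freedom) for matched pre/post scores."""
    if len(pre_scores) != len(post_scores) or len(pre_scores) < 2:
        raise ValueError("need at least 2 matched pairs of equal length")
    # Work on within-person differences (post minus pre).
    diffs = [post - pre for pre, post in zip(pre_scores, post_scores)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t statistic is undefined when all differences are equal")
    t_stat = mean(diffs) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical written test scores (out of 21) for the same 6 trainees.
r1_scores = [10, 9, 11, 8, 12, 10]   # start of intern year
r2_scores = [14, 13, 13, 12, 15, 14]  # end of intern year

t_stat, df = paired_t_statistic(r1_scores, r2_scores)
print(f"t = {t_stat:.2f}, df = {df}")
```

A paired test requires that each beginning-of-year score be matched to the same trainee's end-of-year score, which is why the comparison follows each class from R1 to R2.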
Data analyses were conducted in aggregate because the intent of the Clinical Skills Fair was to assess the curriculum, not the individual. Anonymity codes were used for all participants and were maintained by the research assistant. To enable participation in the intervention, interns were excused from clinical duties, which were covered by senior residents. They were not provided with individual results from the Clinical Skills Fair as part of the study design.
This study was deemed exempt by Duke University Medical Center's Institutional Review Board.
Results
A total of 48 R1s and 47 R2s participated from 2008 to 2011. Table 1 provides participation rates per year. Interrater and intrarater reliability for the written test, measured with Cohen's kappa, ranged from 0.89 to 0.95. Overall MK/PC scores increased from 48% correct among R1s to 65% correct among R2s (Table 1). There was also significant improvement in Clinical Skills Fair MK/PC scores for each cohort per year.
TABLE 1.
Clinical Skills Fair Medical Knowledge/Patient Care Domain Results

Subset analyses were performed for each subdomain (Table 1). We demonstrated significant improvements in jaundice, fever, and asthma. There was no significant improvement in the arrhythmia subdomain. Some variability in significance was noted per year. The Figure compares the changes in scores between R1s and R2s across the subdomains. There was significant improvement in overall lumbar puncture scores for 2 of 3 years (Table 2).
FIGURE.

Comparison of Rising Pediatrics Intern (R1) and Rising Second-Year Resident (R2) Changes in Means Across the Subdomains
TABLE 2.
Clinical Skills Fair Procedure (Lumbar Puncture) Domain Results

We also measured feasibility outcomes. Moderate faculty/staff investment was required but was part of their existing responsibilities. The Clinical Skills Fair also required moderate time commitments and collaboration within the department. Table 3 enumerates the resources needed.
TABLE 3.
Clinical Skills Fair Resources Used for Each Domain

Discussion
The Clinical Skills Fair allowed for effective program evaluation and may serve as a model for PECs. The MK/PC tests consistently identified strengths in asthma, fever, and jaundice, whereas the procedure test demonstrated mastery of lumbar puncture skills. Therefore, the CC believed the curriculum was effective in these subdomains and no significant interventions were necessary. In contrast, the CC focused on deficiencies in arrhythmias, which were present even though all interns had completed Pediatric Advanced Life Support training before their intern year. These results were shared with program leadership and faculty, who identified recent system changes that likely affected learning. Changes included a new cardiac intensive care unit that did not include residents. Previously, cardiac patients were seen by residents in the general and neonatal intensive care units. Until the Clinical Skills Fair was implemented, these unintended consequences went unrecognized. We collaborated with cardiology and nursery faculty to incorporate cardiac teaching into the newborn rotation. We are planning additional electrocardiogram and arrhythmia teaching in cardiology electives and noon conferences. Performance in subsequent Clinical Skills Fair assessments should determine whether these changes were effective.
The feasibility outcomes suggested that the Clinical Skills Fair required moderate investments, particularly in faculty/staff time. Leadership commitment was required to support the program. Faculty development models can support faculty in such roles without financial compensation.11 Programs may need to consider using existing committees, such as PECs, to develop a Clinical Skills Fair similar to the approach described here.
The Clinical Skills Fair required a moderate initial cost investment in equipment. The same equipment has been used for multiple years and represented a reasonable investment for the returns. Programs may need to identify funds to implement this type of program evaluation or leverage existing resources.
Our study adds to prior work. Other studies have shown that objective structured examinations can provide useful assessment of clinical competence.12 Our results supported these findings. Moreover, we demonstrated that systematically developed tests can be sustainable, long-term components of program evaluation. The Clinical Skills Fair was replicated annually for 4 years. It correlated well with other measures of performance supported by the literature to assess knowledge, such as in-training examination (ITE) scores.13 The ITE scores from our same cohort of residents similarly identified the need for improvement in cardiology knowledge. However, our study provided additional information beyond the ITE. Through its pre-post design, the Clinical Skills Fair measured growth and tracked improvements with observable outcomes tailored to program needs.
This study has limitations. Data were analyzed in aggregate for curriculum evaluation rather than to assess individual resident competency. A Clinical Skills Fair can be used for individual resident tracking as part of the overall assessment process but not as the sole tool for high-stakes assessments. Further validity testing should be done before such use.
Although low Clinical Skills Fair scores may be related to the curriculum, other factors must be considered, including issues related to the test, such as phrasing, scoring, delivery, and environment. As with other assessment tools, the data must be considered in context and alternative explanations explored. We attempted to address this limitation through systematic test development.9
Next steps in the development of the Clinical Skills Fair approach may include adaptation for individual resident assessment. We analyzed data in aggregate using codes to preserve resident anonymity while the Clinical Skills Fair was tested and to avoid potentially adverse impressions of the resident based on pilot-test performance. We showed that the Clinical Skills Fair is a reasonable approach; therefore, it may be used as a component of individual assessment.
Competence as measured by the Pediatrics Milestones is ideally based on multiple assessment methods.14 Development of a single valid test to assess overall competence is difficult given the variability of performance across settings.9,14 Using multiple assessment methods provides a broader sampling per resident and a more meaningful holistic assessment.9 The Clinical Skills Fair offers 1 measure of how well residents meet goals and objectives using a standardized approach.
Conclusion
The Clinical Skills Fair provides focused curriculum and program evaluation in a structured and replicable manner, is implemented relatively easily, and is sustainable. It can serve as a model for PECs to refine curricula and track outcomes as continuous process improvement.
Footnotes
All authors are at Duke University. Aditee P. Narayan, MD, MPH, is Associate Professor, Department of Pediatrics, and Associate Program Director, Pediatrics Residency Program; Shari A. Whicker, EdD, MEd, is Director of Education Development, Office of Pediatric Education; Betty B. Staples, MD, is Associate Professor, Department of Pediatrics, and Program Director, Pediatrics Residency Program; Jack Bookman, MAT, PhD, is Professor of the Practice Emeritus, Department of Mathematics; Kathleen W. Bartlett, MD, is Associate Professor, Department of Pediatrics, and Associate Program Director, Pediatrics Residency Program; and Kathleen A. McGann, MD, is Professor, Department of Pediatrics, and Vice Chair of Education, Office of Pediatric Education.
Funding: This study was funded by the Duke Graduate Medical Education Innovations Fund.
Portions of these results were presented at the Annual Spring Meeting of the Association of Pediatric Program Directors, San Antonio, TX, March 28–31, 2012.
The authors would like to thank the curriculum committee members, Duke pediatrics and combined internal medicine-pediatrics residents, and J. Marc Majure, MD.
References
- 1. Accreditation Council for Graduate Medical Education. ACGME Program Requirements for Graduate Medical Education in Pediatrics. http://www.acgme.org/acgmeweb/Portals/0/PFAssets/2013-PR-FAQ-PIF/320_pediatrics_07012013.pdf. Accessed December 9, 2013.
- 2. Kern DE, Thomas PA, Hughes MT. Curriculum Development for Medical Education: A Six-Step Approach. 2nd ed. Baltimore, MD: Johns Hopkins University Press; 2009.
- 3. Morrison GR, Ross SM, Kemp JE, Kalman H. Designing Effective Instruction. 6th ed. Hoboken, NJ: John Wiley & Sons Inc; 2010.
- 4. Nasca TJ, Philibert I, Brigham T, Flynn TC. The next GME accreditation system—rationale and benefits. N Engl J Med. 2012;366(11):1051–1056. doi: 10.1056/NEJMsr1200117.
- 5. Reed D. Nimble approaches to curriculum evaluation in graduate medical education. J Grad Med Educ. 2011;3(2):264–266. doi: 10.4300/JGME-D-11-00081.1.
- 6. Musick DW. A conceptual model for program evaluation in graduate medical education. Acad Med. 2006;81(8):759–765. doi: 10.1097/00001888-200608000-00015.
- 7. Branson RK, Rayner GT, Cox JL, Furman JP, King FJ, Hannum WH. Interservice Procedures for Instructional Systems Development. 5 vols. Ft Monroe, VA: US Army Training and Doctrine Command; 1975. TRADOC Pam 350-30 NAVEDTRA 106A. NTIS No. ADA 019 486 through ADA 019 490.
- 8. Dalkey N, Helmer O. An experimental application of the Delphi method to the use of experts. Manage Sci. 1963;9(3):458–467.
- 9. Wass V, Van der Vleuten C, Shatzer J, Jones R. Assessment of clinical competence. Lancet. 2001;357(9260):945–949. doi: 10.1016/S0140-6736(00)04221-5.
- 10. Ellenby MS, Tegtmeyer K, Lai S, Braner DA. Videos in clinical medicine. Lumbar puncture. N Engl J Med. 2006;355(13):e12. doi: 10.1056/NEJMvcm054952.
- 11. Narayan AP, Whicker SA, McGann KA. An innovative process for faculty development in residency training. Teach Learn Med. 2012;24(3):248–256. doi: 10.1080/10401334.2012.692280.
- 12. Carraccio C, Englander R. The objective structured clinical examination: a step in the direction of competency-based evaluation. Arch Pediatr Adolesc Med. 2000;154(7):736–741. doi: 10.1001/archpedi.154.7.736.
- 13. Althouse LA, McGuinness GA. The in-training examination: an analysis of its predictive value on performance on the general pediatrics certification examination. J Pediatr. 2008;153(3):425–428. doi: 10.1016/j.jpeds.2008.03.012.
- 14. Hicks PJ, Englander R, Schumacher DJ, Burke A, Benson BJ, Guralnick S, et al. Pediatrics milestone project: next steps toward meaningful outcomes assessment. J Grad Med Educ. 2010;2(4):577–584. doi: 10.4300/JGME-D-10-00157.1.
