Journal of Graduate Medical Education. 2018 Jun;10(3):331–335. doi: 10.4300/JGME-D-17-00377.1

Developing an Assessment Framework for Essential Internal Medicine Subspecialty Topics

Natasha Chida, Christopher Brown, Jyoti Mathad, Kelly Carpenter, George Nelson, Marcos C Schechter, Paulina A Rebolledo, Valeria Fabre, Diana Silva Cantillo, Sarah Longworth, Valerianna Amorosa, Christian Petrauskis, Catherine Boulanger, Natalie Cain, Amita Gupta, Jane McKenzie-White, Robert Bollinger, Michael Melia
PMCID: PMC6008041  PMID: 29946392

ABSTRACT

Background 

Assessing residents by direct observation is the preferred assessment method for infrequently encountered subspecialty topics, but this is logistically challenging.

Objective 

We developed an assessment framework for internal medicine (IM) residents in subspecialty topics, using tuberculosis diagnosis for proof of concept.

Methods 

We used a 4-step process at 8 academic medical centers that entailed (1) creating a 10-item knowledge assessment tool; (2) pilot testing it with 129 IM resident and infectious disease fellow volunteers to evaluate validity evidence; (3) implementing the final tool among 886 resident volunteers; and (4) assessing outcomes via retrospective chart review. Outcomes included tool score, item performance, and rates of obtaining recommended diagnostics.

Results 

Following tool development, 10 infectious disease experts provided content validity. Pilot testing showed higher mean scores for fellows compared with residents (7 [SD = 1.8] versus 3.8 [SD = 1.7], respectively, P < .001) and a satisfactory Kuder-Richardson Formula 20 (0.72). Implementation of the tool revealed a 14-minute (SD = 2.0) mean completion time, 61% (541 of 886) response rate, 4.4 (SD = 1.6) mean score, and ≤ 57% correct response rate for 9 of 10 items. On chart review (n = 343), the rate of obtaining each recommended test was ≤ 43% (113 of 261), except for chest x-rays (96%, 328 of 343).

Conclusions 

Our assessment framework revealed knowledge and practice gaps in tuberculosis diagnosis in IM residents. Adopting this approach may help ensure assessment is not limited to frequently encountered topics.

Introduction

Assessment using the Accreditation Council for Graduate Medical Education (ACGME) competencies is critical to determining resident readiness for unsupervised practice.1,2 Unfortunately, commonly used assessment methods may lack validity evidence and have other limitations.1–7 The American Board of Internal Medicine (ABIM) Certification Examination and the American College of Physicians Internal Medicine In-Training Examination (IM-ITE) have validity evidence, but they occur annually and may not provide for timely identification of deficits. Direct observation is optimal but can be logistically challenging, particularly for core internal medicine (IM) subspecialty topics, with which residents have few encounters.1,4,8 As internists should be well-versed in these topics, additional assessment strategies are needed.

The ABIM Certification Examination blueprint identifies tuberculosis (TB) as a core infectious disease (ID) subject, yet both the ABIM examination and the IM-ITE contain few TB-focused questions.9,10 We reviewed PubMed and MedEdPORTAL and could not locate assessment tools for TB diagnosis. Because patients with TB often present to internists, and 1 untreated patient may infect up to 15 people, a missed diagnosis can have significant consequences.11–13 Using pulmonary TB diagnosis as a proof of concept, we created and tested an assessment framework for IM subspecialty topics.

Methods

We developed our framework through a 4-step process: We (1) created a TB diagnosis knowledge assessment tool; (2) evaluated the tool's evidence of validity; (3) distributed the tool to IM residents; and (4) assessed resident practice by reviewing charts of inpatients evaluated for TB. We included 972 residents from 8 IM residency programs. The study was conducted between May and September 2015.

Development of Assessment Tool and Evaluation of Validity Evidence

Using established question-writing guidelines, we created 10 multiple-choice items based on the Centers for Disease Control and Prevention (CDC) TB Core Curriculum (Table 1).14,15 Ten ID attending physicians provided content validity through question review. Using Qualtrics software (Provo, Utah), we pilot tested the tool with 86 residents, drawn equally from each postgraduate year (PGY), and a convenience sample of 43 ID fellows. Participation was voluntary, and responses were anonymous. Respondents received a $10 gift card. We used the American Association for Public Opinion Research type 1 response rate, which includes only fully completed tools.16
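For reference, the American Association for Public Opinion Research type 1 response rate (RR1) counts only fully completed instruments in the numerator. Its standard definition (summarized here from the AAPOR standard rather than reproduced from this article) is

\[ \mathrm{RR1} = \frac{I}{(I + P) + (R + NC + O) + (UH + UO)} \]

where I denotes complete responses, P partial responses, R refusals, NC non-contacts, O other nonrespondents, and UH and UO cases of unknown eligibility. For an anonymous electronic tool sent to a known roster, this reduces to the number of fully completed tools divided by the number of invitees, which matches the rates reported below (eg, 74 of 129, or 57%, in the pilot).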

Table 1.

Performance by Item on Knowledge Assessment Tool Among Sites and Respondentsa


We expressed site scores as means of individual results, compared resident and fellow scores using Student's t tests, and evaluated institutional differences using 1-way analysis of variance. Hypothesis testing used 2-tailed tests, with significance set at P < .05. We performed item difficulty and discrimination analyses and rewrote items with correct response percentages < 25% or > 95% or with point-biserial indices < 0.20.17 We assessed internal consistency via the Kuder-Richardson Formula 20.18 Analyses were done using Stata version 13.0 (StataCorp LLC, College Station, Texas).
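For readers less familiar with these item statistics, the point-biserial index and the Kuder-Richardson Formula 20 take the following standard forms (textbook definitions, provided for context rather than taken from the article):

\[ r_{pb} = \frac{\bar{X}_{+} - \bar{X}_{-}}{s_X}\sqrt{p\,q}, \qquad \mathrm{KR\text{-}20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{s_X^{2}}\right) \]

where, for a given item, p is the proportion of respondents answering correctly and q = 1 - p; \bar{X}_{+} and \bar{X}_{-} are the mean total scores of respondents who answered that item correctly and incorrectly; s_X is the standard deviation (and s_X^2 the variance) of total scores; and k is the number of items (here, 10). By convention, items with r_{pb} < 0.20 are considered to discriminate poorly between high and low scorers, and KR-20 values of roughly 0.70 or higher are generally taken as acceptable internal consistency for formative tools.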

Tool Distribution

We assessed the remaining 886 residents in the 7 programs using the refined tool and the same methodology as the pilot. We captured resident demographics and self-reported experience caring for patients with TB, and we used 1-way analysis of variance to compare mean scores by PGY level and TB experience.

Chart Review of IM Resident Practice

To assess resident practice beyond knowledge, we retrospectively reviewed the charts of 343 patients randomly selected from 2136 inpatients assessed for pulmonary TB across the sites in 2014. A uniform abstraction instrument was used to record whether residents obtained the tests the CDC recommends for every patient evaluated for pulmonary TB14: 3 sputum specimens, a nucleic acid amplification test, a chest x-ray, and a latent TB infection test. The sample size was based on the assumption that each test would be performed in 70% of patients, with a 95% confidence interval and a ±5% margin of error. The number of charts reviewed per site was proportional to the site's contribution to the 2136 patients. We included adults admitted to an IM or ID resident team who had a respiratory sample sent for acid-fast bacilli smear and culture. Exclusion criteria were death during hospitalization and evaluation for nontuberculous mycobacteria. Resident and fellow study team members performed the chart reviews to determine the feasibility of trainees completing this task.
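The article does not state the exact sample size calculation, but the reported assumptions are consistent with the standard single-proportion formula; a worked sketch under those assumptions is

\[ n = \frac{z_{\alpha/2}^{2}\, p(1-p)}{d^{2}} = \frac{1.96^{2} \times 0.70 \times 0.30}{0.05^{2}} \approx 323 \]

where p = 0.70 is the assumed proportion of patients receiving each test, d = 0.05 is the margin of error, and z_{\alpha/2} = 1.96 corresponds to 95% confidence; the 343 charts reviewed exceed this minimum.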

The study was approved by each site's Institutional Review Board.

Results

Development of Assessment Tool and Evaluation of Validity Evidence

Question development required less than 10 hours, and each expert review required less than 1 hour. The response rate for the pilot was 57% (74 of 129), with 53% (46 of 86) of residents and 65% (28 of 43) of fellows responding. Mean tool completion time was 14 minutes (SD = 2.0). Fellows scored higher than residents, supporting criterion validity (7.0 of 10 [SD = 1.8], median 7 [interquartile range {IQR} 6–8], versus 3.8 of 10 [SD = 1.7], median 4 [IQR 2.5–5]; P < .001). Scores did not differ by site. Item difficulty and discrimination analyses identified 1 item with a point-biserial index of 0.01 and another with a correct response percentage of 14% (18 of 129); both were revised. The Kuder-Richardson Formula 20 coefficient was 0.72, indicating satisfactory reliability.

Tool Distribution

The response rate for the final assessment was 61% (541 of 886), with no difference by PGY level (F3,537 = 0.23, P = .90; Table 2). Mean tool completion time was 14 minutes (SD = 3.0), and the overall mean score was 4.4 (SD = 1.6). Site 5 performed better than the other sites, with a mean score of 4.8 (F6,534 = 2.46, P = .021). There was no difference in mean score among the other sites (F5,483 = 2.04, P = .07), by PGY level (F3,485 = 0.20, P = .90), or by TB experience (F3,537 = 1.10, P = .35). The correct response percentage was ≤ 57% (311 of 541) for 9 of the 10 items (Table 1).

Table 2.

Characteristics, Tuberculosis Experiences, and Performance Among Knowledge Assessment Tool Respondentsa


Chart Review of IM Resident Practice

Data abstraction required approximately 20 minutes per chart. Eighteen percent (62 of 343) of patients had a nucleic acid amplification test, 35% (121 of 343) had latent TB infection testing, and 96% (328 of 343) had a chest x-ray. Sputa were obtained from 261 patients; 3 specimens were submitted for 43% (113 of 261), and only 1 specimen for 22% (57 of 261). The remaining 82 patients underwent bronchoscopy. Therefore, the rate of obtaining each recommended test was ≤ 43%, except for chest x-rays.

Discussion

We developed an assessment framework for an infrequently encountered IM subspecialty topic. Implementation revealed concordance between our knowledge assessment tool and chart review, indicating gaps in resident knowledge and practice in TB diagnosis.

Prior work has identified IM resident knowledge gaps in commonly encountered subspecialty topics such as chronic kidney disease and asthma.19,20 To our knowledge, this is the first framework designed to assess an infrequently encountered subspecialty topic. Programs may be able to use our framework to identify opportunities for improvement at the individual and program levels. For example, programs could use our framework to explore deficiencies consistently identified by the IM-ITE, while residents could perform the chart review on their own patients under the purview of quality improvement–oriented faculty; the latter would help residents analyze their practice, as recommended by the ACGME.21

A limitation of our study was the retrospective nature of our chart review. The small number of TB evaluations made it impractical to link individual residents' tool performance to their patients, so we used chart review as a surrogate measure of practice. Another limitation was the time required for framework development, given that program faculty often have many demands on their time.3,22,23 While it likely is not feasible for individual programs to generate this type of framework for every subspecialty topic, programs may build frameworks for the topic(s) with which their residents have the most difficulty, and the resulting tools could be shared via open-access journals such as MedEdPORTAL.

Future research should investigate the feasibility of implementing our framework for other subspecialty topics and its impact on resident knowledge and skills.

Conclusion

We developed an assessment framework that revealed resident knowledge and practice gaps in TB diagnosis. Incorporating our framework into existing assessment methods can help ensure that timely individualized assessment is not limited to frequently encountered topics.

References

