Clinical Orthopaedics and Related Research. 2021 Jan 4;479(6):1386–1394. doi: 10.1097/CORR.0000000000001623

Surgical Skill Can be Objectively Measured From Fluoroscopic Images Using a Novel Image-based Decision Error Analysis (IDEA) Score

Steven Long 1,2,3, Geb W Thomas 1,2,3, Matthew D Karam 1,2,3, J Lawrence Marsh 1,2,3, Donald D Anderson 1,2,3
PMCID: PMC8133282  PMID: 33399401

Abstract

Background

To advance orthopaedic surgical skills training and assessment, more rigorous and objective performance measures are needed. In hip fracture repair, the tip-apex distance is a commonly used summative performance metric with clear clinical relevance, but it does not capture the skill exercised during the process of achieving the final implant position. This study introduces and evaluates a novel Image-based Decision Error Analysis (IDEA) score that better captures performance during fluoroscopically assisted wire navigation.

Questions/purposes

(1) Can wire navigation skill be objectively measured from a sequence of fluoroscopic images? (2) Are skill behaviors observed in a simulated environment also exhibited in the operating room? Additionally, we sought to define an objective skill metric that demonstrates improvement associated with accumulated surgical experience.

Methods

Performance was evaluated both on a hip fracture wire navigation simulator and in the operating room during actual fracture surgery. After examining fluoroscopic image sequences from 176 consecutive simulator trials (performed by 58 first-year orthopaedic residents) and 21 consecutive surgical procedures (performed by 19 different orthopaedic residents and one attending orthopaedic surgeon), three main categories of erroneous skill behavior were identified: off-target wire adjustments, out-of-plane wire adjustments, and off-target drilling. As part of our new IDEA scoring methodology, skill behaviors were measured by comparing the wire adjustments made between consecutive images against the goal of targeting the apex of the femoral head. Decision error metrics (frequency, magnitude) were correlated with other measures (image count and tip-apex distance) to characterize factors related to surgical performance both on the simulator and in the operating room. An IDEA composite score was then created, integrating the decision errors (off-target wire adjustments, out-of-plane wire adjustments, and off-target drilling) and the final tip-apex distance into a single metric of overall performance, and was compared with the number of hip wire navigation cases each surgeon had previously completed (a measure of surgical experience).

Results

The IDEA methodology objectively analyzed 37,000 images from the simulator and 688 images from the operating room. The number of decision errors (7 ± 5 in the operating room and 4 ± 3 on the simulator) correlated with fluoroscopic image count (33 ± 14 in the operating room and 20 ± 11 on the simulator) in both environments (R2 = 0.76; p < 0.001 for the simulator and R2 = 0.71; p < 0.001 for the operating room). Decision error counts did not correlate with the tip-apex distance (16 ± 4 mm in the operating room and 12 ± 5 mm on the simulator) in either environment (R2 = 0.08; p = 0.15 and R2 = 0.03; p = 0.47, respectively), suggesting that the tip-apex distance is independent of decision errors. The IDEA composite score correlated with surgical experience (R2 = 0.66; p < 0.001).

Conclusion

The fluoroscopic images obtained in the course of placing a guide wire contain rich information related to surgical skill. This points the way to an objective measure of skill that also has potential as an educational tool for residents. Future studies should expand this analysis to the wide variety of procedures that rely on fluoroscopic images.

Clinical Relevance

This study has shown how resident skill development can be objectively assessed from fluoroscopic image sequences. IDEA scoring provides a basis for evaluating the competence of a resident. The score can be used to assess skill at key timepoints throughout residency, such as when rotating onto or off of a surgical service and before performing certain procedures in the operating room, or as a tool for debriefing and providing feedback after a procedure is completed.

Introduction

The need for more rigorous assessment of surgical skills during training cannot be overstated [2]. Conventional subjective assessment of orthopaedic resident performance typically lacks the sensitivity, accuracy, and reliability needed to detect performance improvement and guide remediation [3]. The lack of objective assessment methods poses a fundamental challenge to the validation and adoption of simulators in orthopaedics, because without a reliable, sensitive skill metric, it is difficult to demonstrate an increase in skill after a specific intervention. Not having an appropriate metric also creates ambiguity in determining when a resident is properly prepared to perform an operation. The work described here focuses on defining a quantitative assessment metric that is more sensitive and reliable than previous metrics, can be used both with simulators and in the operating room, and relies on behavioral observations that can yield specific, performance-related feedback to help residents improve skill when learning a new procedure. The technique extends the observation that for certain skills, such as fluoroscopically navigating a wire in treating intertrochanteric hip fractures, the final surgical result can be objectively assessed with a measure such as the tip-apex distance [4].
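
For reference, the tip-apex distance as defined by Baumgaertner et al. [4] is the sum of the tip-to-apex distances measured on the AP and lateral radiographs, each corrected for radiographic magnification using the known diameter of the implant:

TAD = Xap × (Dtrue / Dap) + Xlat × (Dtrue / Dlat)

where Xap and Xlat are the distances from the implant tip to the femoral head apex measured on the AP and lateral views, Dtrue is the true diameter of the lag screw, and Dap and Dlat are its apparent diameters in each view.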

The tip-apex distance is not only a quantitative, unambiguous, and reliable measure, but also clinically relevant as a strong predictor of implant cutout risk [4]. However, the tip-apex distance has limitations as a metric for assessing the skill of the surgeon placing the wire. It provides a final measure of the procedure outcome but yields little insight into the rest of the wire navigation process. Also, particularly in the operating room, surgeons typically make multiple wire passes until they achieve an acceptable tip-apex distance, limiting its utility as a performance metric. The final tip-apex distance does not account for the overall time to task completion, the number of fluoroscopic images used, or how many erroneous wire passes were made. In addition, when less-skilled surgeons make errors while judging and correcting the wire start point and path, they must use more fluoroscopy. Consequently, the tip-apex distance may be better regarded as a surgical objective than as a performance measure. Surgical faculty are well aware that even when two residents both achieve a satisfactory tip-apex distance, their execution of the task may be very different.

Therefore, the purpose of this study was to develop a methodology that objectively measures these differences in performance through a more detailed assessment of how the task is accomplished. Such a measure could be important, for example, in determining proficiency levels for advancement to the operating room, and in helping to remediate and expedite the learning of orthopaedic surgical residents.

We designed a study to answer two questions and to pursue a third objective: (1) Can wire navigation skill be objectively measured from a sequence of fluoroscopic images? (2) Are skill behaviors observed in a simulated environment also exhibited in the operating room? (3) Define an objective skill metric that demonstrates improvement associated with accumulated surgical experience.

Materials and Methods

Study Overview and Outcomes of Interest

This study was designed to examine surgical skills that rely upon complementary use of fluoroscopy, which provides a visual record of progress. To tie together simulator and operating room performance, virtual (simulator) and actual (operating room) fluoroscopic images were analyzed using the same objective methods developed to capture important behaviors identified in the course of review.

Our primary study goal was to gain new insight into objective skill behaviors in wire navigation performance. To achieve this, we examined fluoroscopic image sequences to identify common behavioral errors that could be objectively measured and scored. We call this novel assessment methodology Image-based Decision Error Analysis (IDEA) scoring.

Our secondary study goal was to examine the relationship between skill behaviors exhibited on the simulator and those exhibited in the operating room. We compared how the newly defined skill behaviors, measured in both environments, correlated with other metrics of performance such as the number of images used and the tip-apex distance.

Our final study goal was to examine how the new objective metric of performance (the IDEA composite score) correlated with increased surgical experience. We examined the correlation between IDEA composite scores collected in the operating room and surgical case log data to see how experience influenced the scores.

Defining Decision Errors

IDEA is a new objective method that we developed for measuring wire navigation skill; it uses images to analyze decision errors, capturing important elements of surgical performance not previously measured, such as errant wire adjustments made during the procedure. After examining hundreds of fluoroscopic image sequences, the authors observed patterns of behavior that seemed inconsistent with achieving the surgical goal. We attempted to categorize these erroneous patterns in ways that could be quantitatively defined and readily explained to and accepted by a learner. Six types of decision errors were initially defined: off-target wire adjustments, out-of-plane wire movement, off-target drilling, premature view switching, gratuitous imaging, and inappropriate redirection (Appendix 1; Supplemental Digital Content 1, http://links.lww.com/CORR/A489). Only the first three, off-target wire adjustments (Supplemental Digital Content 1, http://links.lww.com/CORR/A489), out-of-plane wire movement (Supplemental Digital Content 2, http://links.lww.com/CORR/A490), and off-target drilling (Supplemental Digital Content 3, http://links.lww.com/CORR/A491), were subsequently found to be independent predictive measures of skill and were therefore included in the IDEA composite score.

Off-target wire adjustments occur when a surgeon adjusts the wire orientation in a direction away from the intended target; in hip wire navigation, for example, away from the femoral head apex. The mean angular deviation of the off-target adjustments was also measured and used to assess the severity of these errors during a procedure. Out-of-plane wire movement occurs when a surgeon adjusts the wire trajectory in a direction parallel to the imaging direction, so the adjustment is not visible in the current image; one example is making an adjustment that would affect the wire position in the AP view while actually taking sequential images in the lateral view. If the out-of-plane movement caused the wire trajectory to angle further away from the apex target, this was counted as a single (+1) out-of-plane error. Off-target drilling occurs when a surgeon advances the wire along an incorrect trajectory, for instance, away from the femoral head apex in the case of hip wire navigation. If a surgeon advanced the wire along this off-target trajectory after breaching the lateral cortex of the femur, this was counted as a single (+1) off-target drilling error.
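
To make the off-target adjustment logic concrete, the following is a minimal sketch of how a wire adjustment between two consecutive images might be classified. It is illustrative only, not the authors' MATLAB implementation: it assumes the wire tip, wire direction, and femoral head apex have already been identified as 2D coordinates in a given view, and the tolerance threshold tol_deg is a hypothetical parameter.

```python
import numpy as np

def angle_to_apex(tip, direction, apex):
    """Angle (in degrees) between the wire's trajectory and the
    straight line from the wire tip to the femoral head apex."""
    d = np.asarray(direction, dtype=float)
    to_apex = np.asarray(apex, dtype=float) - np.asarray(tip, dtype=float)
    cos_theta = np.dot(d, to_apex) / (np.linalg.norm(d) * np.linalg.norm(to_apex))
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def classify_adjustment(prev, curr, apex, tol_deg=1.0):
    """Compare two consecutive images (dicts with 'tip' and 'dir' keys).
    If the wire was re-aimed so that its trajectory deviates further from
    the apex, flag an off-target wire adjustment and record its angular
    magnitude; tol_deg is a hypothetical noise threshold."""
    before = angle_to_apex(prev["tip"], prev["dir"], apex)
    after = angle_to_apex(curr["tip"], curr["dir"], apex)
    if after > before + tol_deg:
        return {"error": "off-target adjustment", "magnitude_deg": after - before}
    return None
```

Summing such classifications over every consecutive image pair would yield a per-procedure error count, and averaging the recorded magnitudes would give the mean angular deviation used to grade error severity.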

Custom image analysis software was developed in MATLAB to detect and quantify the skill behaviors described above. We analyzed the wire trajectory changes between sequentially requested image pairs to count the number of times each decision error occurred. Images from the operating room sequences were analyzed by two investigators (SL and one other who is not an author on this paper), blinded to surgeon experience level, using DICOM viewing software (OsiriX; https://www.osirix-viewer.com) to locate the wire tip, femur apex, and wire trajectory in bone using a standard validated method [8]. This method has previously been shown to be highly reliable for locating these image landmarks, with a Cronbach alpha of 0.97 across 10 different raters [16]. We analyzed these operating room and simulator data to systematically quantify the decisions made by residents during the wire navigation phase of the surgery. The same error logic was implemented in both the simulator and operating room environments. For each procedure and training exercise, we also recorded the number of fluoroscopic images and the final tip-apex distance.

Simulator and Operating Room Data Collection

Data to assess wire navigation decisions were gathered from two environments: a wire navigation simulator (Fig. 1) and the operating room. We used the IDEA scoring methodology to analyze 37,000 images from 176 wire navigation trials gathered during a previous simulator study that included 58 first-year orthopaedic residents from four training programs (University of Iowa, Iowa City, IA, USA; Mayo Clinic, Rochester, MN, USA; University of Nebraska, Omaha, NE, USA; University of Minnesota, Minneapolis, MN, USA). The wire navigation simulator uses hybrid reality to facilitate safe resident practice while avoiding radiation exposure [9, 10]. A camera system measures the position of a laser-etched K-wire relative to a Sawbones model (a femur in the hip wire exercise; model 1130-21-33, Sawbones) mounted on a fixed mast. A laptop computer renders fluoroscope-like images that trainees use to guide them as they drill a surgical wire into the Sawbones femur. The simulator automatically records the wire tip location and orientation vector every time a trainee requests a new fluoroscopic image. Residents placed wires during multiple trials with the simulator in the course of a training session (Fig. 2A-B). Residents were instructed to minimize their tip-apex distance while balancing their use of fluoroscopic images and time. A new Sawbones model was used on the simulator for each training session. In the early training sessions, feedback was provided to residents to aid their wire placement; in the final session, no feedback was provided. All trials performed by the residents were included in this study.

Fig. 1.

The wire navigation simulator (left) allows trainees to drive a surgical wire into a plastic femur mounted on the mast and hidden by the soft tissue cover. Simulated fluoroscopic images (right) allow the trainee to practice guiding the wire to the apex of the femoral head.

Fig. 2.

A-B In these images, each dot represents the surgical wire tip location at the moment an image was requested on the simulator. Both residents achieved a tip-apex distance of 19 mm. (A) Resident A needed more images to find the starting position, had to withdraw and reinsert the wire several times after advancing into the femoral head, and took many fluoroscopic images without making meaningful progress with the wire. (B) Resident B's wire path was much more straightforward, with fewer adjustments needed to achieve the same final wire position.

In the operating room, we used the IDEA scoring methodology to collect and analyze 688 images from 21 patient procedures performed by 20 different surgeons (19 residents, one of whom completed two procedures as a PGY3, and one attending). Surgeons had completed a mean of 4 years of residency training (one PGY2, 13 PGY3, one PGY4, four PGY5, and one attending). Data from consecutive operating room procedures were collected between August 2015 and February 2016 (30 total compression hip screw procedures, 15 of which were excluded because of incomplete image sets) and again between August 2018 and February 2019 (19 total compression hip screw procedures, 13 of which were excluded because of incomplete image sets), during which residents and faculty at the University of Iowa were instructed to save complete fluoroscopic image sequences from procedures that involved placing a center-center guide wire for the treatment of intertrochanteric hip fractures. Whenever the entire sequence of fluoroscopic images was not saved, which we confirmed by directly asking the surgeon involved, the procedure was excluded from the study; the second data collection period was required to augment the 2015 to 2016 dataset for this reason. The mean number of comparable procedures previously logged was 7 ± 5, ranging from 1 to 24 (the attending had completed 24 procedures). Patients treated with compression hip screws were included in this study, whereas those treated with cephalomedullary nails were not.

Measuring Surgical Experience

The number of hip wire navigation procedures previously completed was used as the measure of experience when examining how different metrics correlated with surgeon experience. Case log data have previously been shown to be a better metric of surgical experience than resident postgraduate year [15]. Resident case logs were therefore queried to determine how many hip wire navigation procedures (specifically, compression hip screw procedures) each surgeon had completed at the time of the surgery being assessed. Only cases in which residents were listed as the primary surgeon (categorized as a "1"), and not the secondary surgeon (categorized as a "2"), were included in the case log count. Case logs were cross-referenced with electronic medical records using Epic medical record software to ensure an accurate count of the compression hip screw cases previously completed. If the Epic search did not match the self-reported resident case logs, our team contacted the surgeon to verify their participation and obtain the most accurate possible count of completed hip wire navigation procedures.

First, we individually correlated the IDEA constituent elements (the number of decision errors and the mean angle of wire movement errors) and the tip-apex distance with the number of hip wire navigation procedures a resident had previously logged in the operating room. We fit a logarithmic curve for each correlation because learning curve theory suggests that learning consistently follows a log-shaped pattern [14]. Next, we constructed an IDEA composite score to summarize a resident's overall performance on the wire navigation task, combining the measured tip-apex distance and the decision error behaviors. Each metric was first independently normalized based on overall population means and SDs, and the normalized metrics were then summed to create the composite score. A higher composite score indicates better performance; a score of zero indicates performance exactly matching the mean across all resident participants. We calculated the correlation between this IDEA composite score and surgeon experience (the number of procedures previously completed) to further investigate this relationship.
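
As an illustration of this construction, the sketch below z-normalizes each metric against population statistics and sums the results, with signs flipped so that higher scores indicate better performance, and then fits the logarithmic learning curve. This is a sketch under our reading of the text, not the authors' code: equal weighting is assumed, the population means and SDs shown are the operating room values from Table 1, and the function names are illustrative.

```python
import numpy as np

def idea_composite(tad_mm, n_errors, mean_angle_deg):
    """Normalize each metric by population mean/SD and sum. Signs are
    flipped so that smaller raw values (better performance) give a
    higher score; a score of zero corresponds to mean performance."""
    metrics = np.array([tad_mm, n_errors, mean_angle_deg], dtype=float)
    pop_mean = np.array([16.0, 7.0, 3.0])  # operating room means (Table 1)
    pop_sd = np.array([4.0, 5.0, 2.0])     # operating room SDs (Table 1)
    return float(-np.sum((metrics - pop_mean) / pop_sd))

def fit_log_learning_curve(cases_logged, scores):
    """Least-squares fit of score = a*ln(cases) + b; cases_logged must
    be >= 1, which holds for the 1 to 24 case range in this study."""
    x = np.log(np.asarray(cases_logged, dtype=float))
    a, b = np.polyfit(x, np.asarray(scores, dtype=float), 1)
    return a, b
```

Under this convention, a surgeon with a below-average tip-apex distance and fewer-than-average decision errors receives a positive composite score, consistent with the description of zero as mean performance.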

Ethical Approval

Ethical approval for this study was obtained from the University of Iowa (IRB ID# 201409755).

Results

Objectively Evaluating Skill Behaviors from Fluoroscopic Image Sequences

The IDEA methodology objectively analyzed 37,000 images from the simulator and 688 images from the operating room. A wide range of performance was observed. In the simulator trials, residents achieved a mean tip-apex distance of 12 ± 5 mm, needed 20 ± 11 fluoroscopic images, made 4 ± 3 decision errors, and had a mean error of 3° ± 3° when making off-target adjustments (Table 1). Residents made 6 ± 4 decision errors on their first trial with the simulator; after the third trial, they made 2 ± 2 decision errors. We observed similar trends in the operating room, where surgeons (19 different residents and one attending) had a mean tip-apex distance of 16 ± 4 mm, needed 33 ± 14 fluoroscopic images, made 7 ± 5 decision errors, and had a mean error of 3° ± 2° when making off-target adjustments (Table 1). Some residents made as many as 18 or 19 decision errors, and in those cases more fluoroscopic images were acquired to correct for the errors (51 and 64 images, respectively). Other residents, often those with more hip wire navigation procedures logged, made fewer errors, some making none and others only two. In those cases, the surgeons placed the wire with a tip-apex distance of less than 25 mm while using relatively few images (17 and 20 images, respectively).

Table 1.

Summative performance metrics (mean ± SD)

Environment | Procedures analyzed, n | Participants, n | Tip-apex distance (mm) | Images, n | Decision errors, n | Angle of reverse trajectory correction (°)
Hybrid-reality simulator-based wire navigation | 176 | 58 | 12 ± 5 | 20 ± 11 | 4 ± 3 | 3 ± 3
Operating room wire navigation (actual patient procedures) | 21 | 20 | 16 ± 4 | 33 ± 14 | 7 ± 5 | 3 ± 2

Data presented as mean ± SD unless otherwise indicated.

Examining the Relationship Between Simulator and Operating Room Behaviors

Skill behaviors on the simulator and in the operating room were in many ways comparable. The most commonly observed error included in the IDEA score on the simulator was out-of-plane wire adjustment (Table 2), with a mean occurrence of 2 ± 2 times per trial. The second most common error was off-target wire adjustment, with a mean occurrence of 1.3 ± 1.6 times per trial. To assess the value of the IDEA, we studied correlations between the number of decision errors detected and other metrics of wire navigation performance. Overall, there was a strong positive correlation (R2 = 0.76; p < 0.001) between the number of decision-making errors and the number of fluoroscopic images requested (Fig. 3). Out-of-plane adjustment and off-target wire adjustment errors were both more highly correlated with the total number of images (R2 = 0.51; p < 0.001 and R2 = 0.48; p < 0.001, respectively) than was off-target drilling (R2 = 0.19; p = 0.13) (Table 2). Neither the number of decision errors nor the mean angle of wire movement error correlated with the tip-apex distance (R2 = 0.08; p = 0.15 and R2 = 0.06; p = 0.11, respectively).

Table 2.

Simulator-based decision error frequency

Error type | Total number | Mean errors per trial ± SD | Correlation (R2) to image count | p value
Out-of-plane adjustment | 407 | 2 ± 2 | 0.51 | < 0.001
Off-target wire adjustment | 229 | 1 ± 2 | 0.48 | < 0.001
Off-target drilling | 75 | 0.5 ± 1 | 0.19 | 0.13

Fig. 3.

This graph shows the relationship between the total number of decision errors made on the simulator and total number of images requested (R2 = 0.76; p < 0.001).

In the operating room, the most common error included in the IDEA score was off-target wire adjustment, with a mean occurrence of 5 ± 4 times per case (Table 3). As with the simulator, there was a strong correlation between the total number of decision errors made and the number of fluoroscopy images requested in the operating room (R2 = 0.71; p < 0.001) (Fig. 4). Off-target wire adjustment errors were more strongly correlated with the total number of images (R2 = 0.66; p < 0.001) than either off-target drilling or out-of-plane adjustment errors (Table 3). Neither the total number of decision-making errors nor the mean angle of wire movement error correlated with the tip-apex distance (R2 = 0.03; p = 0.47 and R2 = 0.002; p = 0.85, respectively). The decision errors were also correlated with the number of hip wire navigation procedures completed. Off-target wire adjustment errors were more strongly correlated with cases completed (R2 = 0.21; p = 0.04) than either off-target drilling or out-of-plane adjustment errors (R2 = 0.07; p = 0.26 and R2 = 0.03; p = 0.42, respectively) (Table 3).

Table 3.

Operating room (actual patients)–based decision error frequency

Error type | Total number | Mean errors per case ± SD | Correlation (R2) to image count | p value | Correlation (R2) to cases logged | p value
Off-target wire adjustment | 113 | 5 ± 4 | 0.66 | < 0.001 | 0.21 | 0.04
Off-target drilling | 23 | 1 ± 1 | 0.19 | 0.05 | 0.07 | 0.26
Out-of-plane adjustment | 22 | 1 ± 1 | 0.37 | < 0.01 | 0.03 | 0.42

Fig. 4.

This graph shows the relationship between the total number of decision errors made in the operating room and total number of images requested (R2 = 0.71; p < 0.001).

Documenting Improvement Associated with Surgical Experience

The IDEA composite score demonstrated improvement associated with accumulated surgical experience. The decision error count, one component of the IDEA score, was moderately correlated with the number of hip wire navigation procedures previously completed (R2 = 0.25; p = 0.04). Among the decision error categories, off-target wire adjustments had the highest correlation with surgical experience (R2 = 0.21; p = 0.04) (Table 3). The mean angle of decision errors, another component of the IDEA score, was more weakly correlated with the number of hip wire navigation cases completed (R2 = 0.18; p = 0.04). Combining the two elements of the IDEA score (decision errors and mean angles of errors) yielded a stronger correlation with the number of hip wire navigation surgeries completed (R2 = 0.50; p = 0.001). The tip-apex distance alone had a somewhat lower correlation with the number of hip wire navigation cases completed (R2 = 0.31; p = 0.01). Lastly, the composite IDEA score, which combines the measured tip-apex distance and the decision error behaviors, had the strongest correlation with the number of hip wire navigation operations logged by a resident (R2 = 0.66; p < 0.001) (Fig. 5).

Fig. 5.

This graph shows how a resident’s overall composite IDEA score improves as they complete more cases and gain more experience (R2 = 0.66; p < 0.001). The score is a metric that combines the tip-apex distance, number of decision errors, and the mean angle of decision errors.

Discussion

Establishing that surgical performance can be objectively assessed in the operating room is an important step toward a system of competency-based education. Current assessment metrics such as the Objective Structured Assessment of Technical Skills (OSATS) score or the O-Score rely on expert ratings of resident performance across categories of assessment such as tool use, respect for soft tissue, or surgeon independence [7, 11, 17]. These metrics have been shown to distinguish between novice and expert performance and have been used to validate various simulation tools [18]. However, it has also been shown that these metrics do not relate to the quality of the surgery in terms of the mechanical factors that can influence patient outcomes [1]. Additionally, there is concern that some residents feel intimidated when asking for evaluation from different faculty members, which may introduce selection bias [7]. This study aimed to add new methods for the objective assessment of technical skill in wire navigation-based procedures, focusing on data contained within the fluoroscopic images acquired during a procedure that provide insight into the skill behaviors of the surgeon.

Limitations

This study has several limitations. The first is the small sample size from the operating room. With only 21 cases collected, all from one institution, it is possible that resident performance at other institutions may differ from the performance measured in this study. That said, the relationships measured between decision errors, tip-apex distance, and number of fluoroscopic images on the simulator, which drew on multiple institutions and a larger sample, closely mirrored the relationships measured in the operating room. Another limitation is that this new methodology assumes that the resident placing the guide wire acts independently, without direction from a supervising surgeon. In practice, a supervising surgeon, often a more-senior resident, is likely to offer tips and direction to help the resident accurately place the guide wire and ensure that the patient has a satisfactory result. The degree to which a supervising surgeon intervened was not quantified in this study, but when it was previously quantified, it did not correlate with surgeon case log or tip-apex distance [15]. A third limitation lies in how some of the errors are measured. For instance, when measuring out-of-plane movement in the operating room, adjustments in the C-arm position may have confounded the detection of those errors. That said, this error type was less common in the operating room, so if C-arm position did influence this metric, it likely did not have a major impact on other metrics captured, such as the composite score.

Objectively Evaluating Skill Behaviors from Fluoroscopic Image Sequences

Decision-making skill is often used to assess physician competence [6, 13]. The decisions targeted in such studies are typically high-level, such as whether to perform a surgery or whether to use a bag to extract a gallbladder [5]. The decisions in the present work are much more granular; they concern the change in guide wire position between consecutive fluoroscopic images. Similar principles may still apply; for example, the correlation demonstrated between decision-making errors and technical errors in junior residents [12] is likely to be evident in navigation decisions.

Examining the Relationship Between Simulator and Operating Room Behaviors

Examining intraoperative images taken during a surgery provides an opportunity to gain insight into the decision-making and technical abilities of surgeons using information that is already routinely available. It is important to have metrics that can be readily measured both in a simulated environment and in the operating room, so that performance on a simulator can be linked to operating room performance. In this study, the analysis of decision-making was successfully implemented with data from both the simulator and the operating room. The analysis showed that the relationships between the summative decision-making metric and other metrics of wire navigation were similar in both environments. Also, the number of decision errors appears to be independent of a surgeon's tip-apex distance both on the simulator and in the operating room.

Given that the number of decision errors and the mean angle of the wire movement errors were independent of the tip-apex distance, we chose to combine these metrics into a composite score to better assess surgical skill. Taylor et al. [15] implemented a similar metric when examining hip wire navigation performance in the operating room and found that combining the tip-apex distance, number of fluoroscopic images, overall time, and amount of intervention needed from a supervising surgeon had a moderate correlation (R2 = 0.43; p = 0.01) with the number of hip wire navigation cases logged by a resident. When we combined the tip-apex distance, number of decision errors, and mean angle of decision errors into a composite metric, we observed a stronger correlation with the number of hip wire navigation cases logged by a resident (Fig. 5), suggesting that this new metric captures elements of surgical skill not previously measured. This new composite score appears promising and worthy of additional study in other settings.

Documenting Improvement Associated with Surgical Experience

The new score also provides a basis for studying the learning curve of residents in the operating room. The learning curve measured in this study describes how residents currently increase their skill while following a standard residency program curriculum. Future studies will be able to compare the composite scores of residents who have trained with new methods against the scores measured here. If those residents demonstrate a higher composite score earlier in their learning curve, this would be a clear demonstration of transfer of skill to the operating room. Additionally, this curve may be helpful in setting proficiency benchmarks for simulation training. If residents can perform on a simulator at the same level as residents with 9 or 10 procedures logged (that is, a composite score of approximately 0.5), they may demonstrate better performance when they enter the operating room for the first time.

Conclusion

This study has shown how resident skill development can be objectively assessed from fluoroscopic image sequences by analyzing decision-making throughout the process of wire navigation. The IDEA scoring methodology may be a promising approach for assessing skill at key timepoints throughout residency, such as when rotating onto or off of a surgical service and before performing certain procedures in the operating room. The decision-making analysis approach also has the potential to be beneficial for training and feedback, as the intraoperative fluoroscopic image sequence provides a frame-by-frame summary of a surgical performance, not unlike a video. These images provide a basis for debriefing residents after a surgery and for guiding focused practice in areas where a resident may have difficulty. Finally, although this study focused solely on decision-making in hip wire navigation, future work is planned to determine whether this analysis approach can be used to evaluate performance in a variety of wire navigation procedures, such as placing a wire for an iliosacral screw or pinning a pediatric supracondylar humerus fracture.

Supplementary Material

abjs-479-1386-s001.docx (62.9KB, docx)

Acknowledgment

We thank Roshan Abid, BS, for his contribution to this work. We also thank all the residents who participated in this study.

Footnotes

The institution of one or more of the authors (SL, GWT, MDK, JLM, DDA) has received, during the study period, funding from the Agency for Healthcare Research and Quality (R18 HS022077 and R18 HS025353), the Orthopaedic Trauma Association (18832000), and the American Board of Orthopaedic Surgery.

Four authors (SL, GWT, MDK, DDA) are co-owners of Iowa Simulation Solutions LLC, a company that manufactures the simulator mentioned in this paper.

All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.

Clinical Orthopaedics and Related Research® neither advocates nor endorses the use of any treatment, drug, or device. Readers are encouraged to always seek additional information, including FDA approval status, of any drug or device before clinical use.

Ethical approval for this study was obtained from the University of Iowa (IRB ID# 201409755).

Contributor Information

Steven Long, Email: steven-long@uiowa.edu.

Geb W. Thomas, Email: geb-thomas@uiowa.edu.

Matthew D. Karam, Email: matthew-karam@uiowa.edu.

J. Lawrence Marsh, Email: j-marsh@uiowa.edu.

References

1. Anderson DD, Long S, Thomas GW, Putnam MD, Bechtold JE, Karam MD. Objective Structured Assessments of Technical Skills (OSATS) does not assess the quality of the surgical result effectively. Clin Orthop Relat Res. 2016;474:874-881.
2. Atesok K, Hurwitz S, Anderson DD, et al. Advancing simulation-based orthopaedic surgical skills training: an analysis of the challenges to implementation. Adv Orthop. 2019;2019:2586034.
3. Atesok K, MacDonald P, Leiter J, et al. Orthopaedic education in the era of surgical simulation: still at the crawling stage. World J Orthop. 2017;8:290-294.
4. Baumgaertner MR, Curtin SL, Lindskog DM, Keggi JM. The value of the tip-apex distance in predicting failure of fixation of peritrochanteric fractures of the hip. J Bone Joint Surg Am. 1995;77:1058-1064.
5. Cristancho SM, Vanstone M, Lingard L, LeBel ME, Ott M. When surgeons face intraoperative challenges: a naturalistic model of surgical decision making. Am J Surg. 2013;205:156-162.
6. Francis DM. Surgical decision making. ANZ J Surg. 2009;79:886-891.
7. Gofton WT, Dudek NL, Wood TJ, Balaa F, Hamstra SJ. The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE): a tool to assess surgical competence. Acad Med. 2012;87:1401-1407.
8. Johnson LJ, Cope MR, Shahrokhi S, Tamblyn P. Measuring tip-apex distance using a picture archiving and communication system (PACS). Injury. 2008;39:786-790.
9. Long S, Thomas GW, Anderson DD. An extensible orthopedic wire navigation simulation platform. J Med Devices. 2019;13:031001.
10. Long S, Thomas GW, Anderson DD. Designing an affordable wire navigation surgical simulator. J Med Devices. 2016;10:030921.
11. Martin JA, Regehr G, Reznick R, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg. 1997;84:273-278.
12. Nathwani JN, Fiers RM, Ray RD, et al. Relationship between technical errors and decision-making skills in the junior resident. J Surg Educ. 2016;73:e84-e90.
13. Pugh CM, DaRosa DA, Santacaterina S, Clark RE. Faculty evaluation of simulation-based modules for assessment of intraoperative decision making. Surgery. 2011;149:534-542.
14. Pusic MV, Boutis K, McGaghie WC. Role of scientific theory in simulation education research. Simul Healthc. 2018;13(3S suppl 1):S7-S14.
15. Taylor LK, Thomas GW, Karam MD, Kreiter CD, Anderson DD. Developing an objective assessment of surgical performance from operating room video and surgical imagery. IISE Trans Healthc Syst Eng. 2018;8:110-116.
16. Taylor LK, Thomas GW, Karam MD, Kreiter CD, Anderson DD. Assessing wire navigation performance in the operating room. J Surg Educ. 2016;73:780-787.
17. Van Heest AE, Agel J, Ames SE, et al. Resident surgical skills web-based evaluation: a comparison of 2 assessment tools. J Bone Joint Surg Am. 2019;101:e18.
18. VanHeest A, Kuzel B, Agel J, Putnam M, Kalliainen L, Fletcher J. Objective structured assessment of technical skill in upper extremity surgery. J Hand Surg Am. 2012;37:332-337.

