editorial
Am J Surg. 2021 Feb 3;221(4):764–767. doi: 10.1016/j.amjsurg.2021.01.040

The development of a virtual pilot for the American Board of Surgery Certifying examination

Andrew T Jones 1, Carol L Barry 1, Beatriz Ibáñez 1, Michelle LaPlante 1, Jo Buyske 1
PMCID: PMC9746259  PMID: 33563463

The COVID-19 pandemic has had substantial effects on the ability of testing organizations to deliver assessments. Organizations like the College Board, for example, have cancelled delivery of the SATs.1 Many medical certification organizations have also cancelled or delayed exams and are still in the process of developing contingency plans on how to safely test candidates during the pandemic.2,3 In the field of surgery, the pandemic has disrupted the normal processes of assessment and subsequent board certification. Historically, the final step in the certification process for general surgeons is to take and pass the oral certifying examination (CE), traditionally administered in person several times a year at a hotel in Philadelphia. Under normal circumstances, around 370 candidates, examiners, and administrators would convene in Philadelphia for each exam, with candidates moving from room to room and being examined by pairs of examiners over 90 minutes.

As the pandemic began to spread across the country, the American Board of Surgery (ABS) determined that having surgeons travel long distances and gather in one location would directly endanger the surgeons involved while also greatly increasing the risk of spreading the virus to the broader public. The ABS acted quickly to cancel its scheduled April in-person oral examination. As the pandemic evolved, the ABS made sequential real-time decisions to cancel the scheduled May in-person oral examination for vascular surgery as well as the June examination for general surgery. It became apparent that large gatherings were unlikely to occur reliably in the near future. The ABS therefore decided to pilot virtual administrations of the CE.

Principles

The ABS began the development of the virtual CE pilot by adopting a set of principles. The first principle was that the ABS would not require candidates, examiners, or staff to travel, as that might put them at risk. Individuals were expected to test only in a location that would minimize both their risk of exposure to COVID-19 and their risk of spreading the virus. This meant that the ABS would allow candidates to test from home or in an office setting, as long as it was a quiet environment with no other individuals present.

The next principle was that the ABS wanted to maximize the integrity of the exam delivery. The ABS serves the public by only certifying individuals who have met its defined standard for general surgery or surgical specialties. Having someone take the test in a candidate’s place, sharing exam content to aid others, or otherwise cheating before, during, or after the exam threatens the integrity of these certification decisions and therefore the public trust in the value of certification. It is critical that the individual who completes the exam is who they purport to be, that they are not able to gain assistance from someone else while testing, and that the risk of obtaining exam material for future unauthorized distribution is minimized. In the traditional in-person hotel setting, the ABS has more control over the environment and can better ensure the integrity of the exam and the certification decisions made thereafter. In the virtual setting, these factors become harder to control, and the ABS therefore needed to take as many precautions as possible to help mitigate the risks of a security breach.

The final major principle was that the ABS needed to have some tolerance for risk, understanding that not all exams would be successfully administered. The ABS realized that switching to a virtual format would introduce technical issues outside the control of the organization (e.g., power failures) that could interrupt the exam. A “definitions of success” document was generated and included successful delivery of 90% of the scheduled exams as a metric. This was based on the oft-quoted but unscientific assertion of a 15% industry-wide failure rate for virtual exams.

Design structure

With these principles in mind, the ABS began to design the first virtual pilot administration of the exam. The first decision concerned the overall structure of the exam. The ABS determined that the virtual administration should be highly similar to the in-person format, with candidates being examined by three pairs of examiners within a similar period of time (the in-person format normally has appointments that last 1.5 hours). The next decision was how many candidates to examine in the first virtual administration. Since this was the first administration of its kind, the ABS decided to start with a small group of 18 candidates, consistent with the surgical saw “small holes, small problems; big holes, big problems”. One candidate did not test, so the actual number of candidates was 17. Table 1 shows the planned schedule for the delivery on the first day of the pilot.

Table 1.

Day 1 schedule of exam deliveries (All times eastern).

| Day | Session | Time | Group 1, Team 1a | Group 1, Team 1b | Group 1, Team 1c | Group 2, Team 2a | Group 2, Team 2b | Group 2, Team 2c |
|---|---|---|---|---|---|---|---|---|
| Candidate Check-In (7:30 a.m.) | | | | | | | | |
| 1 | 1 (8 a.m.) | 8:00–8:30 a.m. | Cand1 | Cand2 | Cand3 | Cand4 | Cand5 | Cand6 |
| 10-Minute Transition/Break (or testing time for late start) | | | | | | | | |
| | | 8:40–9:10 a.m. | Cand3 | Cand1 | Cand2 | Cand6 | Cand4 | Cand5 |
| 10-Minute Transition/Break (or testing time for late start) | | | | | | | | |
| | | 9:20–9:50 a.m. | Cand2 | Cand3 | Cand1 | Cand5 | Cand6 | Cand4 |
| Candidate Check-In (10:30 a.m.) | | | | | | | | |
| 1 | 2 (11 a.m.) | 11:00–11:30 a.m. | Cand7 | Cand8 | Cand9 | Cand10 | Cand11 | Cand12 |
| 10-Minute Transition/Break (or testing time for late start) | | | | | | | | |
| | | 11:40 a.m.–12:10 p.m. | Cand9 | Cand7 | Cand8 | Cand12 | Cand10 | Cand11 |
| 10-Minute Transition/Break (or testing time for late start) | | | | | | | | |
| | | 12:20–12:50 p.m. | Cand8 | Cand9 | Cand7 | Cand11 | Cand12 | Cand10 |
| Debrief with Examiners (1:30 p.m.) | | | | | | | | |

Abbreviations: Cand = Candidate.

This schedule differed from the in-person administration in two ways: candidates needed to arrive 30 minutes before the start of their exam for an individual check-in with ABS staff, and there were 10-minute breaks for transitioning between pairs of examiners. Normally, candidates would check in as a large group at the hotel before starting their exam and would have 2 minutes to move between hotel rooms. The 10-minute breaks allowed appointments to run long if the virtual delivery was interrupted during the exam, which helped to minimize the risk of small technology issues leading to an invalid exam result. Another key difference was that the ABS minimized the number of sessions in a day. Normally, the in-person exam would have four 1.5-hour exam blocks per day. In the virtual pilot format, the ABS reduced this number (Table 1 shows two sessions on the first day) to scale down the scope of the pilot delivery.
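
The rotation in Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only, not the ABS's scheduling tool: the function names `build_session` and `fmt` are hypothetical, and the rotation rule (in slot s, the team at index t examines the candidate at index (t - s) mod 3) is inferred from the table rather than stated by the authors. The 30-minute slot and 10-minute break lengths are taken from Table 1.

```python
SLOT_MINUTES = 30    # length of one examination slot (from Table 1)
BREAK_MINUTES = 10   # transition/break between slots (from Table 1)


def build_session(start_minute, candidates, teams):
    """Return a list of (start, end, {team: candidate}) rotation slots.

    Times are minutes after midnight. Each candidate meets each team
    exactly once: in slot s, the team at index t examines the candidate
    at index (t - s) mod n. This rule is inferred from Table 1.
    """
    n = len(teams)
    if len(candidates) != n:
        raise ValueError("need exactly one candidate per examiner team")
    slots = []
    slot_start = start_minute
    for s in range(n):
        slot_end = slot_start + SLOT_MINUTES
        assignment = {teams[t]: candidates[(t - s) % n] for t in range(n)}
        slots.append((slot_start, slot_end, assignment))
        # The next slot begins after the transition/break.
        slot_start = slot_end + BREAK_MINUTES
    return slots


def fmt(minute):
    """Format minutes after midnight as 24-hour HH:MM."""
    return f"{minute // 60:02d}:{minute % 60:02d}"


if __name__ == "__main__":
    # Session 1, Examiner Group 1 from Table 1 (8:00 a.m. start).
    session = build_session(
        8 * 60,
        ["Cand1", "Cand2", "Cand3"],
        ["Team 1a", "Team 1b", "Team 1c"],
    )
    for start, end, assignment in session:
        print(fmt(start), "-", fmt(end), assignment)
```

Running the same function with a start of 11 * 60 and candidates Cand7 to Cand9 gives the second session for Examiner Group 1. Because the break is added after every slot, a late start can absorb up to 10 minutes without shifting the next appointment.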

Delivery platform/process

The ABS currently uses the Google product G-Suite for its email and other IT infrastructure needs. Given the short time frame between the cancellation of the in-person exams and the need to deliver a virtual pilot, the ABS decided to conduct the virtual administrations with a tool that would be familiar to the staff (i.e., Google Meet). This required setting up either examiner-centered appointments (as in the traditional in-person exams) or candidate-centered appointments; in either case, the candidate or the examiner would have to move in and out of appointments to adhere to the schedule. The ABS decided to make the appointments candidate-centered (one appointment per candidate) and have the examiners rotate between the candidate virtual rooms. This process allowed the ABS to have a proctor continuously monitor the candidate during the exam to provide support and minimize any security risks. Fig. 1 illustrates the inversion of the normal in-person flow on exam day to the flow for the virtual exam.

Fig. 1.


Comparison of in-person versus online candidate/examiner flow.

In addition to using Google Meet for virtual appointments, the ABS had to develop a secure method for sharing exam content with examiners. At an in-person exam, examiners have paper copies of exam books that are tracked and returned to the ABS at the end of each administration. To ensure content integrity in a virtual format, the ABS shared electronic versions of the exams with examiners in a way that would prevent printing and downloading, and that would ensure that the content was no longer accessible after the exam. The format of the content was re-arranged so that it would display appropriately in an electronic format.

When exams are administered in the in-person format, scores are documented on paper and scanned for reporting. These scores also go through a rigorous quality control process to ensure that they are accurately recorded. This process does not translate well to an online delivery and needed to be modified. The ABS therefore developed an online score collection form which also had quality control measures in place for accurate score collection.

In sum, the virtual meeting room, the exam content delivery, and the scoring process were each critical components. They were independent of each other and needed to be integrated into an intuitive process for examiners and candidates to follow on exam day. As all of these steps were new to the examiners, candidates, and proctors, it was important that the ABS orient them to how the technology would function on exam day. Prior to the exam, the ABS conducted systems checks for all candidates and examiners to ensure that they could access and use Google Meet on exam day. All examiners participated in training sessions prior to the administration that explained each part of the process and allowed the examiners to see how the technology would function. Finally, all proctors were trained on their roles in ensuring the integrity of the exam and helping to troubleshoot any technology issues.

As noted above, the ABS was prepared for some degree of technical failure during the exam administration. However, the ABS also took multiple measures to mitigate the impact of technical issues on exam delivery. For example, backup proctors were available in the event that the primary proctor had a connectivity issue. Furthermore, the ABS had backup examiners/observers who were available in the event that one of the primary examiners had a technical failure. These backup examiners could step in and deliver questions to the candidate if needed. Additionally, ABS staff used a group messaging chat to communicate internally and texts to communicate rapidly with examiners. IT staff also monitored connections live to try to pre-emptively identify examiners or candidates who were likely to lose their connections. Finally, all sessions were recorded so that if a primary examiner had a technical failure, they could go back and review anything they missed and still provide scores for a candidate. These recordings were destroyed shortly after exam delivery.
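
The examiner failover described above can be expressed as a small decision rule. The Python sketch below is a hypothetical illustration, not ABS software: the `Examiner` class and the `plan_case_delivery` function are invented names, and the sketch assumes each examiner's connection status is known at the moment a case begins.

```python
from dataclasses import dataclass


@dataclass
class Examiner:
    name: str
    connected: bool = True


def plan_case_delivery(primary_pair, backup=None):
    """Decide who delivers a case and who must score from the recording.

    Returns (deliverers, review_recording):
      deliverers       - examiners who are online and deliver the case
      review_recording - primary examiners who dropped and will score
                         later from the session recording
    """
    online = [e for e in primary_pair if e.connected]
    dropped = [e for e in primary_pair if not e.connected]
    deliverers = list(online)
    # A backup examiner/observer steps in only when a primary examiner drops.
    if dropped and backup is not None and backup.connected:
        deliverers.append(backup)
    if not deliverers:
        raise RuntimeError("no connected examiner available for this case")
    return deliverers, dropped


if __name__ == "__main__":
    pair = [Examiner("Examiner A"), Examiner("Examiner B", connected=False)]
    deliverers, reviewers = plan_case_delivery(pair, Examiner("Backup"))
    print("Deliver:", [e.name for e in deliverers])
    print("Score from recording:", [e.name for e in reviewers])
```

Passing `backup=None` models a delivery with no backup examiner: the remaining primary examiner continues alone, and the examiner who dropped still scores from the recording.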

Security measures

Given that ensuring the integrity of the exam was of utmost importance, the ABS took numerous security precautions. First, as noted previously, each candidate was assigned a proctor for the duration of their exam. This proctor was responsible for the candidate check-in process, for monitoring the candidate throughout the exam, and for helping to troubleshoot any technical issues that arose. During the check-in process, the proctor verified that the candidate had a state-issued photo identification that matched the individual on camera and that this individual was the one scheduled to be examined at that time. The proctor then had the candidate complete additional security measures, which included scanning the room with a camera, emptying their pockets, and turning off their mobile phone while the proctor observed. Prior to starting the exam, the candidate shared their desktop view with the proctor and opened their task manager to show that all programs other than the Google Meet virtual exam room were closed. Additionally, one of the security measures that Google Meet allowed was that anyone outside of the organization (e.g., candidates, examiners) had to be admitted into the appointment by the proctor. This minimized the risk of someone hacking into or interrupting the exam.

Outcomes

Prior to the administration of the exam, the ABS defined several criteria for a successful pilot. These included, but were not limited to, all candidates, proctors, and examiners arriving at the correct virtual appointments; candidates completing their exam (with an acknowledgment that not all candidates would necessarily be able to); Google Meet functioning well with few connectivity drops; scores being accurately recorded; and no known security breaches occurring during the exam. Minor connectivity issues did occur (e.g., brief audio or video glitches), and one examiner lost connectivity for several minutes. When this occurred, the backup examiner delivered the case, and the examiner who lost connectivity reviewed a recording of the session later in the day for scoring purposes. The ABS was able to meet all criteria, and every candidate completed their exam. During an in-person administration, candidates are allowed to request to invalidate their CE results immediately after testing if they feel that any incident occurred (e.g., someone knocked on the hotel room door) that caused a distraction or resulted in an unfair exam. The ABS applied this same rule to the virtual delivery; none of the candidates felt that any issues occurred that would invalidate the results of their exam. Figs. 2 and 3 show some of the results from the candidate post-exam survey. As the results show, the vast majority of candidates thought that the technology functioned well and were satisfied overall with their exam experience.

Cost savings to candidates could not be measured but were significant: candidates avoided the costs of flights, hotels, transportation, meals, and childcare, as well as valuable time away from practice. Costs to the ABS were substantially similar to those of in-person exams, with expenses shifted from airfare and hotel rooms to a substantial increase in the staff resources needed to deliver the exams, technology platforms, and proctors. Last, cost savings and some revenue preservation for examiners can be assumed, primarily because of the travel time saved. For in-person exams, examiners travelled the day before the exam and again immediately after it ended, spending a minimum of 4 days away from work and home.

Fig. 2.


Candidate perceptions of the technology during the exam.

Fig. 3.


Candidate overall satisfaction with delivery of the exam.

Lessons learned

After the first, smaller pilot with 17 candidates, the ABS repeated the process with a larger group of 54 candidates. Both of these pilots were successful, and all candidates completed their exams with minimal issues. These pilots were the first examples of remote administrations of oral exams from a member board of the American Board of Medical Specialties (ABMS). While the ABS has viewed these pilots as largely successful, it also learned lessons that informed plans to scale up to a virtual solution to examine over 2000 candidates between October 2020 and June 2021. While each major component of the solution was functional (Google Meet, exam content delivery, and score collection), the combination was clearly not optimal. For example, examiners had to switch between programs to complete each part of the process.

A better solution would integrate all, or at least some, of these components into a more streamlined application. Additionally, the logistics of an all-volunteer surgeon examiner pool mean that the ABS is not able to provide a third backup examiner/observer at the scale of exams that need to be delivered. Subsequent exam deliveries therefore have two examiners. If one of the examiners has a technology issue, the other can continue to deliver the exam, and the examiner with the connectivity issue can review the recording of the exam after the administration and provide scores for the candidate. Moreover, the ABS depended upon staff across the entire organization to act as proctors for the pilot administrations. The necessary and routine work of the ABS still had to be done, which, given lean ABS staffing, meant that staff proctoring was not scalable for subsequent large-scale deliveries. The ABS therefore recruited and trained outside proctors to help deliver subsequent exams. The ABS also shifted the delivery schedule to start later in the day, which better accommodates West Coast examiners. Finally, virtual oral exam results will be analyzed to evaluate the comparability of the scores from the virtual administrations to the scores from the in-person exams, to ensure that candidates are still meeting the equivalent standard expected of a board-certified surgeon.

Looking to the future

The switch to a virtual delivery model may enable innovations beyond the current changes that could help to maximize objectivity in scoring. For example, the ABS could potentially implement image blurring to mask the identity of the candidates, which may help to reduce bias. Voice baffling may add a further layer of de-identification that can minimize potential issues with examiner bias. The ABS may also be able to add more raters to the scoring process by having one set of examiners deliver the exam and a completely different set score the exams using recordings or transcripts, so that the raters are completely blinded to the examiners and candidates. Furthermore, some combination of transcripts and natural language processing may help to further enhance the scoring of the exam. Training for new examiners should become easier, with novices able to participate in live training without having to travel to an event. All of these concepts would be significant changes to the exam and would require evaluation before being implemented in an actual exam setting. However, they have the potential to enhance the assessment of surgical judgment and improve fairness in making decisions for board certification.



Articles from American Journal of Surgery are provided here courtesy of Elsevier
