JAMA Surg. 2020 May 6;155(7):590–598. doi: 10.1001/jamasurg.2020.1004

Association of Surgical Skill Assessment With Clinical Outcomes in Cancer Surgery

Nathan J Curtis 1,2, Jake D Foster 1,2, Danilo Miskovic 3, Chris S B Brown 4, Peter J Hewett 5, Sarah Abbott 6, George B Hanna 1, Andrew R L Stevenson 7,8, Nader K Francis 2,9
PMCID: PMC7203671  PMID: 32374371

Key Points

Question

Is surgical skill associated with outcome differences following cancer operations?

Findings

In this cohort study, the intraoperative performance of credentialed surgeons within 2 multicenter laparoscopic rectal cancer randomized trials was analyzed using a bespoke objective assessment tool shown to be reliable and valid for the specialist level. Substantial variation in measured skill was present with large differences between upper and lower quartile surgeons (mesorectal fascial plane, 93% vs 59%; 30-day morbidity, 23% vs 50%).

Meaning

Surgical skill is highly associated with histopathological and clinical outcomes and requires consideration in trial design and interpretation.

Abstract

Importance

Complex surgical interventions are inherently prone to variation yet they are not objectively measured. The reasons for outcome differences following cancer surgery are unclear.

Objective

To quantify surgical skill within advanced laparoscopic procedures and its association with histopathological and clinical outcomes.

Design, Setting, and Participants

This analysis of data and video from the Australasian Laparoscopic Cancer of the Rectum (ALaCaRT) and 2-dimensional/3-dimensional (2D3D) multicenter randomized laparoscopic total mesorectal excision trials, which were conducted at 28 centers in Australia, the United Kingdom, and New Zealand, was performed from 2018 to 2019 and included 176 patients with clinical T1 to T3 rectal adenocarcinoma 15 cm or less from the anal verge. Case videos underwent blinded objective analysis using a bespoke performance assessment tool developed through a Delphi exercise with 62 international experts and workshop, interview, and pilot phases.

Interventions

Laparoscopic total mesorectal excision undertaken with curative intent by 34 credentialed surgeons.

Main Outcomes and Measures

Histopathological (plane of mesorectal dissection, ALaCaRT composite end point success [mesorectal fascial plane, circumferential margin, ≥1 mm; distal margin, ≥1 mm]) and 30-day morbidity. End points were analyzed using surgeon quartiles defined by tool scores.

Results

The laparoscopic total mesorectal excision performance tool was produced and shown to be reliable and valid for the specialist level (intraclass correlation coefficient, 0.889; 95% CI, 0.832-0.926; P < .001). A substantial variation in tool scores was recorded (range, 25-48). Scores were associated with the number of intraoperative errors, plane of mesorectal dissection, and short-term patient morbidity, including the number and severity of complications. Upper quartile–scoring surgeons obtained excellent results compared with the lower quartile (mesorectal fascial plane, 93% vs 59%; number needed to treat [NNT], 2.9; P = .002; ALaCaRT end point success, 83% vs 58%; NNT, 4; P = .03; 30-day morbidity, 23% vs 50%; NNT, 3.7; P = .03).

Conclusions and Relevance

Intraoperative surgical skill can be objectively and reliably measured in complex cancer interventions. Substantial variation in technical performance among credentialed surgeons is seen and significantly associated with clinical and pathological outcomes.


This cohort study examines surgical skill within advanced laparoscopic procedures and its association with histopathological and clinical outcomes.

Introduction

In the treatment of many gastrointestinal cancers, outcomes are surgeon dependent. Considerable variation is seen in results from randomized clinical trials (RCTs) and routine surgical practice.1,2,3,4

For the 700 000 patients annually who receive a rectal cancer diagnosis, total mesorectal excision (TME) forms the mainstay of curative treatment pathways.5 Oncological outcomes following TME are strongly associated with the quality of the tumor specimen, highlighting the need for proficient surgery.6,7,8,9

Over the past 2 decades there has been an increasing uptake in minimally invasive TME based on reported short-term patient benefits.2,10,11 Debate persists regarding the role of laparoscopic TME, as 2 major RCTs did not establish noninferiority vs laparotomy.12,13 Amplified by conflicting meta-analysis reports, the lack of consensus has led to widespread variation in the use of laparoscopy.2,11,14,15

It is a widely held assumption that surgical skill is associated with procedural delivery and subsequent outcomes. In the few available reports, the technical skill of specialist surgeons, as measured through peer observation of gastric bypass videos, was closely associated with clinical outcomes.16 There is a compelling argument to investigate cancer surgery as complex interventions may be subject to wider variation with potentially larger effects.4,17 As, to our knowledge, the intraoperative period has not been robustly measured, the reasons behind suboptimal outcomes are unclear.2,11,18

We hypothesized that surgical performance is associated with outcomes following laparoscopic TME.1,2,15 We previously showed that objective assessment of laparoscopic colon cancer surgery at the specialist level is reliable and clinically valid.19,20,21 However, there are no comparable tools that can appraise the technical performance of laparoscopic TME.22 Therefore, we aimed to measure surgical skill within randomized trials and investigate its association with clinical and pathological outcomes through first developing a reliable and valid objective laparoscopic TME assessment tool.

Methods

An international collaborative approach underpinned the design and delivery of this project, which developed and validated a bespoke laparoscopic TME assessment tool (LapTMEpt) and applied it to measure surgical performance within 2 multicenter laparoscopic TME RCTs (the Australasian Laparoscopic Cancer of the Rectum trial [ALaCaRT] and the 2-dimensional/3-dimensional [2D3D] trial). Both trial protocols included the capture of unedited laparoscopic case video with informed patient consent. Respective reports describing full methods, ethical approvals (ALaCaRT: Sydney Local Health District human research ethics committee; 2D3D: UK National Health Service South Central–Berkshire B research ethics committee), and results are available.12,23,24

Tool Development

A structured, mixed-methods approach was overseen by a steering group holding expertise in surgical assessment, tool development, and laparoscopic rectal cancer surgery. A 2-round Delphi exercise with 62 international laparoscopic TME experts from 5 continents defined task areas for assessment by deconstructing the procedure into constituent steps as part of the laparoscopic TME technique standardization project.25 A detailed description of procedural steps was generated and further refined at an interactive workshop. Abdominal procedural phases were not included as they could be assessed using previously reported competency tools.19,21

To develop the performance assessment metrics, experts identified through laparoscopic TME experience, peer recommendation, and involvement in laparoscopic TME RCTs were invited to participate in semistructured interviews. An open question interview framework was applied to determine technical performance indicators, allowing freedom to express thoughts and explore ideas while also enabling the interviewer to cover necessary information.26 For each task, 2 indicative TME video clips were shown allowing reflection on the displayed performance. Interviews were transcribed verbatim and underwent coding and thematic categorization. Saturation was achieved after 8 interviews. Descriptors of proficient and poor performance were collated from the transcripts and triangulated onto specific procedural tasks.

In addition to measuring task performance, an errors domain was incorporated and shaped by commonly observed technical error mechanisms obtained through application of the observational clinical human reliability analysis (OCHRA) technique. Prospectively recorded specialist-performed laparoscopic TME resections were analyzed with approval from the UK southwest research ethics committee, with all patients providing written informed consent.27

A LapTMEpt draft was generated incorporating the 4 procedural tasks described in the expert consensus (Figure 1). A 4-point ordinal scale described the quality of technical performance for each domain within each task area, with objective descriptors developed from the interviews, error analysis, and steering group refinement. Transcripts showed the experts identified a spectrum of TME case difficulty. A 3-point scale was applied for the assessor to stratify case complexity (1: wide pelvis, no scarring/edema, and no gross obesity; 2: moderate width pelvis, minimal scarring/edema in tissue planes, and moderate bulk to tumor/mesorectum; 3: narrow pelvis, significant scarring/edema/reaction to neoadjuvant therapy, bulky tumor, and obesity). To avoid overcluttering, an instruction manual contained guidance on each task and performance level (eAppendix in the Supplement). An initial utility and feasibility study comprising 12 prospectively recorded laparoscopic TME cases performed by 4 consultant surgeons and 6 expert assessors was successfully performed (eMethods 1, eTables 1 and 2, and eFigure 1 in the Supplement).

Figure 1. The Laparoscopic Total Mesorectal Excision (TME) Performance Tool (LapTMEpt).


The accompanying manual is provided in the eAppendix in the Supplement. The tool consists of 4 vertical columns representing task areas and 4 horizontal rows representing the performance domains, creating 16 separate items that are each scored on a scale of 1 to 4, in which a higher score indicates a more proficient technical performance and a total score of 64 indicates a perfect and proficient performance. Nv indicates neurovascular.

LapTMEpt Reliability Assessments

The ALaCaRT and 2D3D trials both routinely video-captured laparoscopic cases, which were analyzed by a blinded surgical researcher (with 1500 hours of colorectal video analysis experience) who was not involved in the tool development. Test-retest reliability was investigated through repeated analysis of the ALaCaRT series after a 12-month delay. A second trained independent assessor also applied the LapTMEpt to explore interrater reliability.

Concurrent Validity

LapTMEpt scores were compared with errors identified with observational clinical human reliability analysis (OCHRA) performed in keeping with previously described applications to laparoscopic TME surgery, including RCT cases.27,28,29 The validated OCHRA technique assesses the interface between humans and complex systems. The system is described in constituent tasks, which are analyzed to identify and categorize error events. An error was defined as something done that was not intended by the actor, was not desired by a set of rules or an external observer, or led the task outside acceptable limits.27,28,30

We focused on evaluating mesorectal dissection as defined by the international laparoscopic TME standardization hierarchical task analysis that was also applied to prior OCHRA rectal cancer studies.24,25,27 This was aligned with the ALaCaRT protocol that required capture of pelvic dissection tasks for the specimen-based trial end point.12

Clinical Validity

To compare performance levels, surgeon quartiles were calculated based on individual LapTMEpt mean scores. Clinical end points comprised 30-day morbidity (graded with the Clavien-Dindo classification), reoperation, anastomotic leak, length of hospital stay, and readmission. Surgical complications formed a predefined subgroup analysis. The 2D3D trial captured surgeon-reported case difficulty using a visual analog scale (0 mm, easiest; 100 mm, hardest) that was compared with the LapTMEpt case complexity grade. Histopathological outcomes were mesorectal dissection plane (as graded by masked pathologists),6,9 the ALaCaRT composite primary end point (mesorectal fascial plane, circumferential margin of ≥1 mm, and distal margin of ≥1 mm), and lymph node yield. Medium-term ALaCaRT overall survival and recurrence data were studied.23
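The quartile grouping described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `surgeon_quartiles`, the inclusive quantile method, the handling of boundary values, and the example scores are all assumptions, as the source does not specify how quartile cut points were computed.

```python
from collections import defaultdict
from statistics import mean, quantiles


def surgeon_quartiles(case_scores):
    """Group surgeons by the quartile of their individual mean score.

    case_scores: iterable of (surgeon_id, score) pairs, one per case.
    Returns a dict mapping surgeon_id to "lower", "interquartile", or "upper".
    Assumption: surgeons at or below Q1 are "lower", at or above Q3 "upper".
    """
    by_surgeon = defaultdict(list)
    for surgeon_id, score in case_scores:
        by_surgeon[surgeon_id].append(score)

    # One mean score per surgeon, as in "individual LapTMEpt mean scores".
    means = {sid: mean(scores) for sid, scores in by_surgeon.items()}

    # Quartile cut points of the surgeon-level means (method is an assumption).
    q1, _median, q3 = quantiles(means.values(), n=4, method="inclusive")

    groups = {}
    for sid, m in means.items():
        if m <= q1:
            groups[sid] = "lower"
        elif m >= q3:
            groups[sid] = "upper"
        else:
            groups[sid] = "interquartile"
    return groups


# Hypothetical scores for five surgeons (illustrative only, not trial data).
example = [
    ("A", 28), ("A", 32),
    ("B", 35),
    ("C", 40),
    ("D", 45),
    ("E", 50),
]
groups = surgeon_quartiles(example)
```

Outcomes can then be compared across the three groups, or between the upper and lower groups only, as reported in Table 2.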

Statistical Analysis and Data Handling

LapTMEpt scores were handled as a continuous variable with the assumption that all items carried equal weight. To ensure homogeneity, presented analyses represent the sum of the 3 TME dissection task columns with possible totals between 12 and 48. Full LapTMEpt score analyses are provided in eMethods 2, eTable 3, and eFigure 2 in the Supplement. Following an exploration for normality, nonparametric tests were applied. Intraclass correlation coefficients (ICCs) were calculated using a 2-way random-effects model. The internal consistency of each task domain was determined with Cronbach α. Case complexity grade reliability used crosstabulation and the Cohen κ coefficient. Clinical validity comparisons were performed using Mann-Whitney U, Kruskal-Wallis, and Spearman ρ correlation testing as appropriate. The numbers needed to treat (NNT) were calculated as the inverse of the absolute risk reduction (upper quartile outcome % − lower quartile outcome %). Unless otherwise stated, figures represent medians (interquartile range [IQR]) throughout. Analyses were performed using SPSS (version 25.0; IBM) with P < .05 considered significant.
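As a worked illustration of the NNT definition above, the following sketch applies the inverse of the absolute risk reduction to the rounded upper vs lower quartile percentages reported in the Abstract. The function name is hypothetical, and the analyses in the study were run in SPSS, not with this code.

```python
def number_needed_to_treat(upper_rate: float, lower_rate: float) -> float:
    """Return 1 / absolute risk reduction between two outcome proportions.

    Rates are proportions between 0 and 1. The absolute difference is used so
    the same function works for favorable outcomes (higher is better) and
    adverse outcomes such as morbidity (lower is better).
    """
    absolute_risk_reduction = abs(upper_rate - lower_rate)
    if absolute_risk_reduction == 0:
        raise ValueError("NNT is undefined when the two rates are equal")
    return 1 / absolute_risk_reduction


# Rounded upper vs lower quartile rates taken from the Abstract.
nnt_plane = number_needed_to_treat(0.93, 0.59)      # mesorectal fascial plane
nnt_composite = number_needed_to_treat(0.83, 0.58)  # ALaCaRT composite end point
nnt_morbidity = number_needed_to_treat(0.23, 0.50)  # 30-day morbidity
```

Rounded to 1 decimal place, these reproduce the NNT values of 2.9, 4, and 3.7 given in the Abstract.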

Results

The LapTMEpt

The LapTMEpt and accompanying instruction manual are presented (Figure 1; eAppendix in the Supplement). Four task areas were defined: posterior, anterior, and lateral mesorectal dissection and resection and anastomosis.25 Interview transcripts contained 11 themes that defined 4 overarching skill domains: retraction and exposure, task execution, errors, and end product (eMethods 3 and eTable 4 in the Supplement). All 16 items were scored using a descriptive 4-level scale representing technical performance from 4 (optimal) to 1 (poor). When an assessor feels unable to comment or a step is not performed, no score is assigned. Tool piloting confirmed feasibility and utility (eMethods 1, eTables 1 and 2, and eFigure 1 in the Supplement). Median LapTMEpt completion time was 6 minutes (IQR, 3-13 minutes).
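The scoring structure described above (4 task areas by 4 skill domains, each item scored 1 to 4) can be sketched as follows. The identifiers and the choice to return no total when any contributing item is unscored are assumptions made for illustration; the source does not state how unscored items were handled in totals.

```python
from typing import Dict, Optional

TASKS = ("posterior", "anterior", "lateral", "resection_anastomosis")
DOMAINS = ("retraction_exposure", "task_execution", "errors", "end_product")
DISSECTION_TASKS = TASKS[:3]  # the 3 TME dissection task columns (TMEdt)

# scores[task][domain] is 1-4, or None when no score was assigned.
Scores = Dict[str, Dict[str, Optional[int]]]


def column_total(scores: Scores, task: str) -> Optional[int]:
    """Sum the 4 domain items of one task column (range 4-16)."""
    items = [scores[task][domain] for domain in DOMAINS]
    if any(item is None for item in items):
        return None  # assumption: an incomplete column yields no total
    for item in items:
        if item not in (1, 2, 3, 4):
            raise ValueError(f"item score out of range: {item}")
    return sum(items)


def tmedt_total(scores: Scores) -> Optional[int]:
    """Sum of the 3 TME dissection columns (possible totals 12-48)."""
    columns = [column_total(scores, task) for task in DISSECTION_TASKS]
    return None if None in columns else sum(columns)


def full_total(scores: Scores) -> Optional[int]:
    """Sum of all 16 items (possible totals 16-64)."""
    columns = [column_total(scores, task) for task in TASKS]
    return None if None in columns else sum(columns)


# Hypothetical case: optimal dissection, resection and anastomosis not on video.
case = {task: {domain: 4 for domain in DOMAINS} for task in DISSECTION_TASKS}
case["resection_anastomosis"] = {domain: None for domain in DOMAINS}
```

For this hypothetical case, `tmedt_total(case)` gives the maximum dissection score while `full_total(case)` is left undefined, which mirrors the situation noted in the Limitations where some videos did not contain the resection and anastomosis.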

LapTMEpt Evaluation

A total of 385 hours of unedited video from 176 laparoscopic TME cases were analyzed (99 ALaCaRT participants [46% of all patients who underwent laparoscopic TME] and 77 2D3D participants). Median (IQR) age, body mass index (calculated as weight in kilograms divided by height in meters squared), and tumor height from the anal verge were 66 years (58-75 years), 27 (24-30), and 8 cm (6-10 cm), respectively. There were 105 men (60%) and 68 (39%) received neoadjuvant chemoradiotherapy.

Reliability Assessments

High test-retest reliability was shown for TME dissection task column scores (ICC, 0.878; 95% CI, 0.819-0.918; P < .001) and case complexity grading (85% absolute agreement; κ = 0.74; 95% CI, 0.61-0.86; P < .001; eTable 5 in the Supplement). Good internal consistency was shown for all TME dissection task areas (Cronbach α, posterior TME, 0.844; anterior TME, 0.772; lateral TME, 0.880).

High interrater reliability was observed (ICC, 0.889; 95% CI, 0.832-0.926; P < .001) with good case complexity grading agreement (κ = 0.69; 95% CI, 0.56-0.83; P < .001). Good internal consistency was observed for each task area (Cronbach α, posterior TME, 0.831; anterior TME, 0.807; lateral TME, 0.814).
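For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of the standard Cronbach α formula, α = k/(k − 1) × (1 − Σ item variances / variance of case totals), where k is the number of items. The function name and the example data are hypothetical; the study computed this statistic in SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for k items scored on the same n cases.

    items: list of k lists, each holding one item's scores across the cases
    (all lists aligned by case and of equal length).
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least 2 items are required")
    n = len(items[0])
    if n < 2 or any(len(item) != n for item in items):
        raise ValueError("each item needs the same number (>= 2) of cases")

    # Total score per case across the k items.
    totals = [sum(case_scores) for case_scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when case totals do not vary")

    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical scores for the 4 domains of one task area across 5 cases.
example_items = [
    [4, 3, 2, 4, 1],
    [4, 3, 3, 4, 2],
    [3, 3, 2, 4, 1],
    [4, 2, 2, 3, 1],
]
alpha = cronbach_alpha(example_items)
```

Values closer to 1 indicate that the items within a task area move together across cases.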

Concurrent Validity

The OCHRA analysis identified 1115 pelvic errors (median [IQR], 5 [3-8] per case; mode, 3; range, 0-31). A moderate negative correlation was seen with tool scores (rs = −0.515, P < .001; Figure 2A).
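The rank correlation used for this comparison can be sketched as below: Spearman ρ is the Pearson correlation of the ranks, with tied values given their average rank. The function names and the per-case data are hypothetical and are not taken from the trials.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-case data: more errors paired with lower tool scores.
error_counts = [0, 3, 5, 8, 31]
tool_scores = [48, 44, 40, 36, 25]
rho = spearman_rho(error_counts, tool_scores)
```

A perfectly monotone decreasing relationship, as in this toy data, gives ρ of −1; the reported value of −0.515 indicates a moderate negative association.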

Figure 2. Laparoscopic Total Mesorectal Excision (TME) Performance Tool Score Analyses.


A, Scattergraph displaying number of error events identified from observational clinical human reliability analysis (OCHRA) review with line of best fit and 95% CI. A moderate negative correlation is observed (rs = −0.515; P < .001) and is comparable with the previously reported laparoscopic colonic competency assessment tool concurrent validity.19 Each additional error event was associated with a 2-point drop in tool scores. B, Bar graph displaying the distribution of tool scores from the 176 cases. Substantial variation is observed despite both randomized clinical trials using surgeon-credentialing policies. C, Box-whisker plot comparing scores between the 3 case complexity grades. Lines represent the median and interquartile range with whiskers depicting the 95% CI. A significant decrease is observed with grade increase (43 [95% CI, 40-46] vs 39 [95% CI, 36-42] vs 36 [95% CI, 32-38]; P < .001).

Clinical Validity

The performance of 34 credentialed surgeons was analyzed with no median score difference between the trials (40 [IQR, 36-42] vs 41 [36-44]; P = .19). Substantial performance variation was seen (median, 40 [IQR, 36-43]; σ², 25.7; range, 25-48; Table 1 and Figure 2B). When surgeon quartiles were applied, the upper quartile contained lower, more advanced rectal cancers that received more neoadjuvant treatment. Despite this, their cases were performed faster with less blood loss and fewer enacted errors (6 vs 6 vs 3; P < .001; Table 2).

Table 1. Raw LapTMEpt Item, Column, and Task Score Data.

| Characteristic, mean (SD) | TME dissection: posterior | TME dissection: anterior | TME dissection: lateral | Resection and anastomosis | Total |
| --- | --- | --- | --- | --- | --- |
| No. | 176 | 176 | 176 | 91 | 91 |
| Retraction + exposure | 3.41 (0.69) | 3.24 (0.69) | 3.30 (0.7) | 3.42 (0.82) | 11.76 (2.64) |
| Task performance | 3.38 (0.66) | 3.07 (0.76) | 3.29 (0.69) | 2.92 (1.13) | 11.26 (2.34) |
| Errors | 3.11 (0.82) | 3.29 (0.8) | 3.39 (0.71) | 3.08 (1.11) | 11.38 (2.29) |
| End product | 3.38 (0.69) | 3.05 (0.94) | 3.49 (0.67) | 3.03 (1.15) | 11.48 (2.61) |
| Column total | 13.28 (1.93) | 12.65 (2.16) | 13.49 (1.99) | 12.45 (3.36) | |

TMEdt: No., 176; mean (SD), 39.43 (5.07).
Laparoscopic TMEpt: No., 91; mean (SD), 51.95 (7.24).

Abbreviations: TME, total mesorectal excision; TMEdt, total mesorectal excision dissection task.

Table 2. LapTMEpt Clinical Validity Assessmentᵃ.

| Characteristic | Lower quartile | Interquartile | Upper quartile | P value, 3 groups | P value, lower-upper |
| --- | --- | --- | --- | --- | --- |
| Performance measurement | | | | | |
| Laparoscopic TME performance tool score (TMEdt), median (IQR) | 36 (32-39) | 40 (36-43) | 44 (42-46) | <.001 | <.001 |
| Pelvic error count (OCHRA), median (IQR) | 6 (5-11) | 6 (4-8) | 3 (2-4) | <.001 | <.001 |
| Demographics | | | | | |
| Age, y, median (IQR) | 68 (61-77) | 67 (60-75) | 63 (55-70) | .13 | .07 |
| BMI, median (IQR) | 27 (25-31) | 27 (24-30) | 25 (22-28) | .17 | .09 |
| Tumor height, cm, median (IQR) | 8.0 (7.0-10.0) | 8.0 (6.0-11.0) | 7.0 (5.0-8.6) | .12 | .04 |
| Sex, No. (%) | | | | .27 | .52 |
| Women | 16 (51.6) | 41 (36.0) | 13 (43.3) | | |
| Men | 15 (48.4) | 73 (64.0) | 17 (56.7) | | |
| Neoadjuvant treatment, No. (%) | 11 (35.5) | 38 (33.3) | 19 (63.3) | .01 | .03 |
| Tumor stage (histopathologically defined), No. (%) | | | | .11 | .04 |
| PathCR | 0 (0) | 2 (1.8) | 0 (0) | | |
| I | 12 (38.7) | 29 (25.4) | 8 (26.7) | | |
| II | 9 (29.0) | 32 (28.1) | 3 (10.0) | | |
| III | 10 (32.3) | 48 (42.1) | 18 (60.0) | | |
| IV | 0 (0) | 3 (2.6) | 1 (3.3) | | |
| Operative data | | | | | |
| Operative duration, min, median (IQR) | 290 (230-380) | 255 (204-297) | 178 (155-210) | <.001 | <.001 |
| Estimated blood loss, mL, median (IQR) | 100 (50-200) | 100 (50-200) | 40 (15-60) | <.001 | <.001 |
| Histopathological outcomes | | | | | |
| Plane of mesorectal dissection, No. (%) | | | | <.001 | <.002 |
| Mesorectal fascia | 17 (58.6) | 96 (88.9) | 28 (93.3) | | |
| Intramesorectal | 8 (27.6) | 8 (7.4) | 1 (3.3) | | |
| Muscularis propria | 4 (13.3) | 4 (3.7) | 1 (3.3) | | |
| ALaCaRT composite end point success, No. (%) | 18 (58.1) | 101 (88.6) | 25 (83.3) | <.001 | .03 |
| Lymph node yield, median (IQR) | 16 (11-27) | 17 (13-22) | 17 (12-21) | .91 | .67 |
| Clinical outcomes | | | | | |
| Any 30-d morbidity event, No. (%) | 15 (50.0) | 63 (55.3) | 7 (23.3) | .01 | .03 |
| Surgical morbidity,ᵇ % | 28.1 | 27.2 | 3.3 | .02 | .01 |
| No. of 30-d morbidity events, No. (%) | | | | .01 | .03 |
| None | 16 (50.0) | 51 (44.7) | 23 (76.7) | | |
| 1 | 5 (15.6) | 28 (23.7) | 3 (10.0) | | |
| 2 | 8 (25.0) | 16 (14.9) | 2 (6.7) | | |
| 3 | 1 (6.3) | 11 (10.5) | 2 (6.7) | | |
| 4 | 1 (3.1) | 6 (4.4) | 0 (0) | | |
| 5 | 0 (0) | 2 (1.8) | 0 (0) | | |
| Highest grade morbidity experienced (Clavien-Dindo classification), No. (%) | | | | .01 | .02 |
| None | 16 (50.0) | 51 (44.7) | 23 (76.7) | | |
| 1 | 2 (9.4) | 12 (12.3) | 1 (10.0) | | |
| 2 | 9 (28.1) | 33 (31.6) | 3 (10.0) | | |
| 3 | 4 (12.5) | 13 (7.0) | 2 (3.3) | | |
| 4 | 0 (0) | 5 (4.0) | 0 (0) | | |
| Unplanned reoperation, No. (%) | 2 (6.3) | 8 (7.0) | 1 (3.3) | .76 | .60 |
| Anastomotic leak, No. (%) | 2 (6.3) | 9 (7.9) | 0 (0) | .29 | .17 |
| Length of stay, d, median (IQR) | 9 (5-14) | 7 (6-13) | 7 (5-11) | .46 | .20 |
| Hospital readmission,ᶜ No. (%) | 6 (33.3) | 10 (18.2) | 0 (0) | .23 | .19 |

Abbreviations: ALaCaRT, the Australasian Laparoscopic Cancer of the Rectum trial; BMI, body mass index (calculated as weight in kilograms divided by height in meters squared); IQR, interquartile range; NA, not applicable; OCHRA, observational clinical human reliability analysis; PathCR, pathological complete response; TME, total mesorectal excision; TMEdt, total mesorectal excision dissection task column total.

ᵃ Surgeons were grouped in quartiles based on their individual mean score. Three-group and upper vs lower quartile comparisons were made. Variation is evident between these credentialed specialist surgeons. Although their cases carried significantly more of the demographic and tumor factors typically considered to increase case difficulty, the top-quartile surgeons achieved better perioperative, histopathological, and morbidity outcomes. Inadequate outcomes are observed in the lower quartile, particularly in the rate of nonmesorectal fascia plane surgery, which is associated with increased locoregional and distant recurrence.

ᵇ Surgical morbidity comprises any complication regarding the anastomosis, surgical wound, and stoma as well as any iatrogenic organ injury, bleeding, bowel obstruction, and reoperation (any indication).

ᶜ Readmission data are from the 2D3D trial patients only as this was not routinely captured in the ALaCaRT study.

Case complexity grades were 1, 61 (34.7%); 2, 85 (48.3%); and 3, 30 (17%). With each increase in complexity grade, LapTMEpt scores were significantly lower (median, 43 [IQR, 40-46] vs 39 [IQR, 36-42] vs 36 [IQR, 32-38]; P < .001; Figure 2C) and surgeon-reported difficulty was significantly higher (median, 21 mm [IQR, 15-29 mm] vs 30 mm [IQR, 19-54 mm] vs 64 mm [IQR, 36-71 mm]; P < .001). Surgeons outside the lower quartile achieved significantly higher rates of mesorectal fascial dissection (93.3% vs 88.9% vs 58.6%; NNT, 2.9; P = .002) and ALaCaRT composite end point success (83.3% vs 88.6% vs 58.1%; NNT, 3.3; P = .03). There was no difference in lymph node yield.

Eighty-six patients (48.9%) developed morbidity within 30 days of surgery. There were no deaths. Patients who underwent operations performed by top-quartile surgeons developed significantly less 30-day morbidity (23.3% vs 55.3% vs 50%; NNT, 3.7; P = .008) as well as fewer and less serious events (Table 2). Several clinically relevant reductions in anastomotic leak (6.3% vs 0%; P = .17), median length of stay (9 days [IQR, 5-14 days] vs 7 days [IQR, 5-11 days]; P = .20), and hospital readmission (33% vs 0%; P = .19) were seen between the upper and lower quartiles but none were statistically significant.

The median follow-up was 2.9 years (IQR, 1.7-4.1 years). Locoregional recurrence was 4%. There were no differences in medium-term locoregional or distant recurrence or survival data between the quartiles (Table 3). Although not statistically significant, clinically important findings were observed in patients who underwent operations performed by upper-quartile surgeons: no locoregional recurrences, a median time from surgery to first recurrence more than 2 years longer than in the lower quartile, and a 96.6% overall survival rate (Table 3).

Table 3. Medium-Term Oncological Outcomesᵃ.

| Characteristic | Lower quartile | Interquartile | Upper quartile | P value, 3 groups | P value, lower-upper |
| --- | --- | --- | --- | --- | --- |
| Follow-up, y, median (IQR) | 2.28 (0.86-4.05) | 2.6 (1.71-4.07) | 4.12 (3.09-5.03) | <.001 | .003 |
| Recurrence | | | | | |
| Disease-free survival, % | 74.2 | 78.1 | 70 | .63 | .72 |
| Time from surgery to first recurrence, y, median (IQR) | 1.67 (1.39-4.49) | 2.92 (1.58-4.14) | 3.98 (1.44-4.53) | .67 | .51 |
| Recurrence, % | | | | | |
| Locoregional | 6.3 | 4.5 | 0 | .44 | .18 |
| Distant | 21.9 | 13.4 | 24.1 | .27 | .84 |
| Overall survival, % | 87.5 | 87.5 | 96.6 | .46 | .2 |

Abbreviations: IQR, interquartile range; NA, not applicable.

ᵃ No differences in disease-free survival or local or metastatic recurrence were seen between the quartiles. Clinically important but not statistically significant improvements in locoregional recurrence and overall survival were seen in patients operated on by the top-quartile surgeons.

Discussion

To our knowledge, this study is the first to report a direct association between objectively measured surgical skill and cancer surgery outcomes. Within 2 randomized rectal cancer surgery trials, substantial performance variation was demonstrated, with strong associations with key histopathological determinants of oncological outcomes.6,7,8,9 Despite potential multifactorial associations, we observed that performance was also associated with short-term morbidity as well as the number and severity of events. The efficacy of high surgical performance was demonstrated by the very low NNT to reduce suboptimal pathology and morbidity results. The scope and need for improvement in specialist-level laparoscopic TME practice is evident.

In the context of present debates on the oncological safety of minimal access techniques for rectal cancer, our data show acceptable results, comparable with open TME, that were obtained by surgeons above the lower quartile. Top-performing surgeons operated on lower, more advanced cancers but obtained excellent results. All reported TME randomized trials investigated the generalizability of laparoscopy but in light of our findings it is important to consider who is performing the procedure and their demonstrable skill level in addition to the surgical approach.

This study’s findings have implications for future clinical studies containing surgical interventions. Procedures are often grouped for analysis, which oversimplifies the inherent variability.4 Credentialing based on case experience alone appears insufficient, as heterogeneity is shown to persist. Subjective assessment of laparoscopic videos by steering committee members is occasionally used for credentialing or quality assurance purposes.31 Objective demonstration of pretrial procedural competency would strengthen enrollment criteria and quality assurance by confirming the standardized delivery of the intended intervention and defining intraoperative protocol deviations.1,31 Performance evaluation aided by tools could detect and quantify performance bias and facilitate comparison between surgeons and units.16 As surgeons are shown to constitute an important outcome factor, performance data aid the subsequent interpretation of trial results and provide insight on the association between surgical proficiency and procedural efficacy.

These issues are not confined to trials, as the need for improvement within routine practice remains.15 Tool data could assist accreditation and benchmarking within established initiatives, such as the US National Accreditation Program for Rectal Cancer.3 Surgeon-specific LapTMEpt data could shape targeted training and quality improvement efforts. It is unknown if meaningful improvement in specialist surgical skill is achievable, although musicians and athletes of all levels train and receive coaching believing they can progress. We are unaware of any reason why this should not also apply to surgeons.

Individual case factors require consideration, as we observed a correlation between scores and case complexity grades. It is unclear whether a well-performed operation makes the procedure appear less complex or if a less complex procedure allows better performance. Presently, to our knowledge, no attempt is made to define a high-stakes assessment threshold. All cases were performed by specialist surgeons, meaning the wider applicability of the tool is presently unknown. It may hold applications in formative training and summative competency assessments. A dedicated study is underway.

Limitations

The TME dissections were uniformly analyzed and captured all tasks responsible for specimen quality; however, some videos did not contain the resection and anastomosis, preventing full LapTMEpt completion, and several ALaCaRT cases were never recorded, presenting a theoretical selection bias. The tool is designed to facilitate categorical qualitative appraisal of skill in 16 areas with the assumption that each item is equally important. Further work is required to define the relative importance of each. Our study design means no comment on LapTMEpt predictive validity can be made and longitudinal data are now required. Video-based analyses are a time- and labor-intensive technique, potentially limiting their broader applicability and use outside the research setting. As analyzing 1 hour of surgery was seen to take approximately 90 minutes, a correlation with in-theater tool completion data is now required. A potential risk of the Hawthorne effect exists, although this is yet to be investigated in surgical practice.32 The tool is restricted to assessing technical performance within the pelvis and does not consider nontechnical factors that might be associated with procedural delivery.33

Conclusions

Surgical skill can be objectively and reliably measured in complex cancer interventions. Within 2 randomized trials, substantial variation in technical performance among credentialed surgeons exists and has significant associations with clinical and pathological outcomes. This finding holds implications for the design and interpretation of surgical trials containing cancer interventions.

Supplement.

eMethods 1. The LapTMEpt pilot study

eMethods 2. Full laparoscopic TMEpt application

eMethods 3. Expert interviews and thematic analysis

eTable 1. Patient demography, tumor properties and case scores for the LapTMEpt pilot

eTable 2. Outcomes from the questionnaire completed by the assessors to evaluate utility of the tool

eTable 3. LapTMEpt surgeon quartiles

eTable 4. Grouping of themes generated from interview thematic analysis into skill domains

eTable 5. LapTMEpt reliability assessments

eFigure 1. Scatter graph plotting the different assessors’ total scores for each pilot case

eFigure 2. LapTMEpt score distribution

eAppendix. L-TMEpt instruction manual


Articles from JAMA Surgery are provided here courtesy of American Medical Association
