Canadian Urological Association Journal. 2017 Oct;11(10):331–336. doi: 10.5489/cuaj.4442

Feasibility of expert and crowd-sourced review of intraoperative video for quality improvement of intracorporeal urinary diversion during robotic radical cystectomy

Mitchell G Goldenberg 1, Jamal Nabhani 2, Christopher JD Wallis 1, Sameer Chopra 2, Andrew J Hung 2, Anne Schuckman 2, Hooman Djaladat 2, Siamak Daneshmand 2, Mihir M Desai 2, Monish Aron 2, Inderbir S Gill 2, Raj Satkunasivam 1
PMCID: PMC5963445  PMID: 29382445

Abstract

Introduction

Development of uretero-ileal stricture (UIS) after robotic-assisted radical cystectomy (RARC) may be dependent on surgical technique. Video review of intraoperative technique is an emerging paradigm for surgical quality improvement. We examined whether surgeon-perceived risk of UIS or crowd-sourced assessment of robotic skill is associated with the development of UIS.

Methods

We conducted a case-control study comparing the operative technique of uretero-ileal anastomoses resulting in clinically significant UIS with the contralateral anastomosis for the same patient. De-identified videos were analyzed by 1) five high-volume surgeons; and 2) crowd workers (Crowd-Sourced Assessment of Technical Skill, C-SATS) to determine Global Evaluative Assessment of Robotic Skill (GEARS) score. Mantel-Haenszel common odds ratio (OR) estimates were calculated to assess the association between surgeon performance and the development of UIS. Logistic regression models were used to examine the association between GEARS scores and the development of UIS.

Results

A total of 10 UIS videos were compared to eight control videos by five surgeons and 2142 crowd workers. Although expert surgeons systematically evaluated intraoperative footage, no association between the expert mode response and UIS was identified (OR 0.42; 95% confidence interval [CI] 0.05–3.45; p=0.91). Crowd-sourced assessment was not predictive of UIS (p=0.62).

Conclusions

We used video review to systematically analyze procedure-specific content and technique. The inability of surgeons to predict UIS may reflect the questionnaire, uncontrolled patient factors, or a lack of power. Crowd-sourced GEARS score was unsuccessful in predicting UIS after RARC.

Introduction

An understanding of the relationship between surgical technique and adverse operative outcomes is critical to ensure optimal patient outcomes. Recognition that technical skills, as assessed from intraoperative video, are associated with postoperative complications1,2 has motivated efforts to analyze video footage for training and accreditation.3–5 Recently, evaluation of video footage by laypeople trained in the use of validated assessment metrics, termed “crowd-sourcing,” has been shown to provide efficient and reliable feedback that correlates well with expert ratings.6,7 This methodology has been translated to differentiating surgeon skill8 in robotic prostatectomy.

In order to improve care provided to patients, one aspect of surgical quality improvement seeks to identify intraoperative steps that contribute to clinically relevant outcomes. In addition, feasible methods of obtaining robust assessment of these steps from content experts are necessary for improvement of surgical skill. This approach may have particular value during novel technique development and provide data for continuous quality improvement. Retrospective video review has been used in open9 and laparoscopic radical prostatectomy10 to identify surgical steps correlated with potency and technical errors leading to positive surgical margin, respectively. Robot-assisted radical cystectomy (RARC) is increasingly used and is under close scrutiny with regard to complications,11–13 such as strictures at the uretero-ileal anastomosis. These may require further endoscopic or surgical intervention and represent a potentially preventable complication of RARC.14–16 Risk factors for benign strictures, such as radiation and poor tissue quality, have been reported, although the impact of surgical technique remains unclear.12,13,17,18 There are accepted standards of surgical technique, including maintaining tissue vascularity, minimizing ureteral handling, and performing a tension-free anastomosis, which underpin the good technique believed to minimize strictures.17,19,20 Careful intraoperative video review may allow for refinement of the robotic surgical technique involved in uretero-ileal anastomosis.

In this proof of principle study, we sought to assess the utility of video review for quality improvement in RARC. We selected uretero-ileal stricture (UIS) as an outcome as it represents an objective, clinically significant endpoint that occurs relatively early following surgery. Secondarily, we sought to assess whether expert or crowd-sourced video review could predict UIS.

Methods

Study design and subjects

We conducted a retrospective, case-control study examining the operative technique leading to UIS after RARC compared to the same patient’s contralateral anastomosis. All patients underwent robotic radical cystectomy, extended pelvic lymph node dissection, and either intracorporeal ileal conduit or ileal orthotopic neobladder, as previously described.11,21 All surgeries were performed by one of three high-volume robotic surgeons (MA, MMD, ISG). Consenting patients at the University of Southern California (USC) are prospectively followed in an institutional review board-approved database. This database is maintained by a single dedicated database manager with followup and reliable capture of complications by direct communication with primary care providers and referring physicians. We identified 102 patients from July 2010 to December 2013 who underwent intracorporeal urinary diversion. Detailed chart review was performed to identify patients who developed clinically significant UIS, defined as those requiring percutaneous (e.g., nephrostomy tube), endoscopic (e.g., laser incision), or surgical (e.g., re-implantation) intervention. Any strictures secondary to malignancy (including carcinoma in situ [CIS]) were excluded. We identified a total of 12 patients with UIS, of whom nine (10 strictures) had intraoperative video available. The control group consisted of the contralateral ureter that did not develop a stricture among the same patients (n=8).

Expert raters

Five content experts from our institution were asked to participate in this study as video raters. Three are experts in open radical cystectomy (AS, HD, SD) and two are experts in RARC (MA, ISG).

Outcome measures and statistical analysis

A consensus-based approach was used to identify intraoperative steps of interest. Three broad categories were identified as having the greatest impact on ureteral stricture outcomes: ureteral mobilization, ureteral preparation for anastomosis, and ureteral anastomosis. As it was impractical to review the complete case footage, a standardized video synopsis of each case was produced, lasting 60 seconds in total. Each video clip showed ureteral mobilization starting at the common iliac for approximately 10 seconds and included any segment where the ureter was on tension. Twenty seconds was used to show ureteral preparation, displaying the quality of the ureteral adventitia, vascularity, ureteral handling, and adequacy of spatulation. Lastly, 30 seconds focused on the uretero-ileal anastomosis, including the apical stitch, three to four five-second clips representative of ureteral and adventitial handling, and the running anastomosis. Any use of electrocautery around the ureter or direct handling/grasping of the ureter was included in the video.

The expert raters completed a five-part questionnaire (Appendix 1) that addressed the ureteral mobilization, ureteral preparation, ureteral anastomosis, ureteral suturing, and overall perceived risk of ureteral stricture (whether increased or not). Since this study instrument and video methodology were not validated, we performed a pilot evaluation. First, in order to test whether watching the edited surgical clip captures an assessment equivalent to that from the entire raw footage, the questionnaire outcome was compared between full-length and edited segments for the same patients. This was done in random order for three patients, with good agreement between viewing a full-length video and an edited video. Secondly, we assessed the inter-rater reliability of the questionnaire. Each participating surgeon viewed edited surgical clips in random order, blinded to the identity of the surgeon performing the procedure and the stricture outcome. All surgeons performed the uretero-ileal anastomosis using the same technique,22 limiting the raters’ ability to discern between different surgeons. Trainees did not participate in the aspects of the operation captured in the video clips. The same technique was used regardless of whether an ileal conduit or orthotopic neobladder was created. All clips were viewed in the same setting and questionnaires were completed immediately after an individual clip was viewed.

Appendix 1.

Surgeon questionnaire for peer-review

Video Clip No:_______

a. Ureteral handling during mobilization: Are you concerned regarding excessive tension, excessive cautery use, and/or insufficient preservation of adventitia?
  1. YES
  2. NO

b. Ureteral viability: Are you concerned regarding the vascularity, adequacy of spatulation of the distal ureter prior to anastomosis?
  1. YES
  2. NO

c. Ureteral handling during anastomosis: Are you concerned regarding excessive traction, excessive ureteral length, excessive handling of ureteral mucosa rather than adventitia during anastomosis?
  1. YES
  2. NO

d. Ureteral suturing: Are you concerned about interrupted vs. running suturing and excessive tightening of suture as regards to potential for stricturing?
  1. YES
  2. NO

e. CONCLUSION: Do you believe that this ureter is at increased risk for a stricture based on the observed surgical technique?
  1. YES
  2. NO

If YES to e: Specify which factor led to this conclusion (can circle more than one):
  1. a
  2. b
  3. c
  4. d

The Mantel-Haenszel common odds ratio estimate was used to determine if there was an association between the components of the case assessed in the video clips and the development of clinically significant UIS, with results expressed as odds ratios (OR) and 95% confidence interval (CI). In order to account for our binary questionnaire assessment across the five experts, we used the mode response of “yes” or “no” for each question across all videos. Additionally, we calculated the OR for the final question and overall impression for each of the raters.
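The two steps described above, collapsing five yes/no answers into a mode response and pooling 2×2 tables into a Mantel-Haenszel common odds ratio, can be sketched in plain Python. This is an illustrative sketch only, not the authors' SAS analysis: the function names, the choice of strata, and the example counts are assumptions made for demonstration and are not study data.

```python
from collections import Counter


def mode_response(votes):
    """Return the most frequent answer among an odd number of raters."""
    return Counter(votes).most_common(1)[0][0]


def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across 2x2 tables.

    Each stratum is a tuple (a, b, c, d):
      a = rated "yes" and stricture,  b = rated "yes" and no stricture,
      c = rated "no" and stricture,   d = rated "no" and no stricture.
    Returns None when the pooled denominator is zero, in which case
    the odds ratio cannot be calculated.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical answers from five raters for one video clip.
clip_votes = ["yes", "no", "no", "yes", "no"]
clip_mode = mode_response(clip_votes)

# Hypothetical strata (for example, one table per surgeon group).
example_strata = [(2, 1, 1, 2), (3, 1, 1, 3)]
pooled_or = mantel_haenszel_or(example_strata)
```

Returning `None` for a zero denominator mirrors the situation flagged in the Table 3 footnote, where an odds ratio could not be calculated.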

Crowd-sourcing

The same video clips were evaluated through crowd-sourcing by the Crowd-Sourced Assessment of Technical Skills (C-SATS) group (Seattle, WA, U.S.). C-SATS uses laypeople to perform video analysis of clinical activities. These assessments have been demonstrated to correlate with the assessments of content experts.8 Crowd workers are paid a nominal fee for each video clip they rate. The raters enlisted by C-SATS are trained in use of the Global Evaluative Assessment of Robotic Skill (GEARS) assessment tool, which consists of Likert scale scoring across six domains of robotic skill.23 One domain, autonomy, was not assessed, as it is not applicable on retrospective video review. Thus, the total possible GEARS score was 25 (five domains, maximum five points each). The C-SATS team confirmed de-identification of videos prior to submitting them to the crowd for assessment. We used logistic regression to quantify the relationship between mean GEARS scores and development of UIS.
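As a rough illustration of the logistic regression step, the sketch below fits a single-predictor model by Newton-Raphson in plain Python. It is not the authors' SAS procedure; the function name `fit_logistic` and the example values are assumptions for demonstration and are not study data.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    x: predictor values (for example, centred mean GEARS scores).
    y: binary outcomes (1 = stricture, 0 = control).
    Returns the intercept b0 and slope b1; exp(b1) is the odds ratio
    for a one-unit increase in x.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0:
            break  # no variation in x, slope is not identifiable
        # Newton step: solve the 2x2 system by explicit inversion.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical centred GEARS scores and outcomes, for illustration only.
scores = [-0.6, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, 0.5]
outcomes = [1, 0, 1, 0, 0, 1, 1, 0]
intercept, slope = fit_logistic(scores, outcomes)
odds_ratio_per_point = math.exp(slope)
```

Centring the scores before fitting keeps the Newton steps numerically stable when raw totals cluster around 20 out of 25. With very small samples or perfectly separated outcomes the estimates can diverge, so a statistical package that reports standard errors and convergence warnings is preferable for real analyses.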

Statistical significance was set at p<0.05 based on a two-tailed comparison. Statistical analyses were performed using SAS 9.3 (SAS Institute Inc., Cary, NC, U.S.).

Results

Study subjects

The median age of patients with clinically significant UIS was 70.1 years (interquartile range [IQR] 67.2–73.6), and all nine were male (Table 1). Five patients received neoadjuvant chemotherapy (cisplatin-based), while none had a history of pelvic radiation. Six strictures were right-sided, two left-sided, and one patient had bilateral strictures. The median time to stricture diagnosis was 105 days (IQR 92–130) following cystectomy. Management of strictures varied, with four treated endoscopically (three balloon dilations and one laser incision), three percutaneously (nephrostomy tubes), and one surgically (open re-implantation).

Table 1.

Patient demographics

n (%)
Number of patients 9*
 Unilateral stricture 7 (78)
 Bilateral stricture 1 (11)
 Control only 1 (11)
Stricture location
 Left 2 (20)
 Right 6 (80)
Gender
 Male 9 (100)
 Female 0 (0)
Median age (IQR) 70.09 (67.15–83.60)
Median time (days) to stricture diagnosis (IQR) 105 (91.75–130.25)
Management of stricture
 Balloon dilation 3 (37.5)
 Nephrostomy tube 3 (37.5)
 Laser 1 (12.5)
 Redo anastomosis 1 (12.5)
Charlson Comorbidity Index (CCI)
 0 2 (22)
 1 3 (33)
 2 3 (33)
 3 1 (11)
Diabetic
 Yes 1 (11)
 No 8 (88)
Previous abdominal surgery
 Yes 5 (55)
 No 4 (33)
BMI (kg/m2)
 18.5–25 4 (44)
 25–30 3 (33)
 30–40 2 (22)
Pathological staging
 Organ-confined 5 (55)
 Extravesical 4 (44)
Neoadjuvant chemotherapy
 Yes 2 (22)
 No 7 (77)
Neoadjuvant radiation therapy
 Yes 0 (0)
 No 9 (100)
Adjuvant chemotherapy
 Yes 4 (44)
 No 5 (55)
Adjuvant radiation therapy
 Yes 0 (0)
 No 9 (100)
*8 patients with strictures (self-controlled, left vs. right), 1 patient bilateral control only. BMI: body mass index; IQR: interquartile range.

Inter-rater agreement

Among the content expert raters, there was little agreement on whether a given video clip would result in a clinically significant stricture (Table 2). Among the five raters, intra-class correlations (ICC) were significant for ureteral handling during mobilization (κ=0.146; p=0.03) and ureteral handling during the anastomosis (κ=0.107; p=0.008), but not for ureteral viability/adequacy of spatulation (κ=0.098; p=0.10), ureteral suturing (κ=−0.138; p=0.96), or the overall technique (κ=−0.067; p=0.81).

Table 2.

Peer-review inter-rater agreement overall and by rater subspecialty

Question Kappa-statistic p
Overall agreement (n=5)
 A 0.15 0.03
 B 0.10 0.10
 C 0.11 0.08
 D −0.14 0.96
 E −0.07 0.81
Agreement by surgeon type
Robotic surgeons (n=2)
 A 0.00 N/A
 B 0.00 0.5
 C 0.56 0.008
 D * *
 E −0.11 0.79
Open surgeons (n=3)
 A 0.31 0.03
 B 0.11 0.22
 C −0.03 0.60
 D −0.18 0.90
 E 0.20 0.07
*Too few categories to analyze inter-rater agreement.

We then analyzed the agreement of robotic experts and open surgeons separately (Table 2). The robotic experts agreed only on the assessment of ureteral handling during the anastomosis (κ=0.56; p=0.008). Robotic surgeons did not agree on whether the overall technique observed would result in a clinically significant UIS (κ=−0.108; p=0.79). A weak, but statistically significant ICC for ureteral handling during mobilization (κ=0.306; p=0.03) was observed among open surgeons. Robotic surgeons failed to agree on any of the remaining components assessed.
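For readers unfamiliar with multi-rater agreement statistics, the sketch below computes Fleiss' kappa for several raters giving yes/no answers. The paper does not state which kappa variant was used, so Fleiss' kappa is an assumption here, and the example ratings are hypothetical rather than study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a fixed number of raters per subject.

    counts[i][j] is the number of raters who assigned subject i
    (a video clip) to category j (for example, 0 = "no", 1 = "yes").
    Returns None when every rating falls in a single category, since
    chance agreement is then 1 and kappa is undefined.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement for each subject, then averaged.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if 1.0 - p_expected == 0:
        return None
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings: five raters, four clips, columns = [no, yes].
example_counts = [[3, 2], [4, 1], [2, 3], [5, 0]]
example_kappa = fleiss_kappa(example_counts)
```

The `None` return corresponds to the case noted in Table 2 where there were too few categories to analyze agreement, for example when every rater answered "No". Values near zero or below indicate agreement no better than chance.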

Surgeon rating and stricture outcome

In order to determine whether any of the components of the anastomosis presented were associated with stricture formation, we calculated ORs for the mode response of each question (Table 3). For each question, the mode response was not significantly associated with stricture development. There were no significant associations identified after stratification into open and robotic surgeons.

Table 3.

Univariate analysis – Association between risk of UIS and surgeon rating

Question Odds ratio Mantel-Haenszel 95% confidence interval p
Overall majority vote (n=5)
 A 3.00 0.25–36.33 0.39
 B 3.00 0.25–36.33 0.39
 C 0.17 0.02–1.44 0.10
 D * * *
 E 0.42 0.05–3.45 0.42
Robotic surgeon #3
 A ** ** 0.48
 B 1.75 0.13–23.70 0.67
 C 0.42 0.051–3.43 0.42
 D * * *
 E 0.89 0.13–6.31 0.91
Robotic surgeon #4
 A ** ** 0.48
 B 1.75 0.13–23.70 0.67
 C 0.42 0.05–3.44 0.41
 D * * *
 E 0.89 0.13–6.31 0.91
Majority vote open surgeons (n=3)
 A 2.50 0.32–19.52 0.38
 B 2.00 0.26–15.38 0.50
 C 2.10 0.25–17.59 0.49
 D 0.40 0.06–2.70 0.35
 E 0.60 0.08–4.76 0.63

Judge Odds ratio Mantel-Haenszel 95% confidence interval p
Conclusion (E) by judge
 1 0.33 0.04–2.52 0.29
 2 2.00 0.28–14.19 0.49
 3 2.68 N/A 0.56
 4 1.11 0.16–7.51 0.91
 5 1.13 0.16–7.99 0.91
*No association statistics calculated, as all answers ‘No’; **odds ratio not calculated, denominator of zero.

Crowd-sourcing and stricture outcome

C-SATS crowd workers analyzed the same video clips, completing a total of 2142 assessments with the GEARS metric. The overall mean score was 20.73 (out of 25). There was no predictive relationship between mean GEARS score and clinically significant UIS (p=0.62). Across all five GEARS domains assessed, the crowd was unable to identify anastomoses resulting in clinically significant UIS (Table 4).

Table 4.

Crowd-sourced assessment of technical skill (C-SATS) mean GEARS scores (maximum 5 points per domain). Based on 2142 crowd-sourced assessments.

Domain Stricture (SD) Controls (SD) Difference (t-test) p
Depth perception 4.02 (0.23) 4.08 (0.15) 0.51
Efficiency of movement 4.13 (0.19) 4.17 (0.10) 0.54
Force of movement 3.91 (0.25) 3.90 (0.24) 0.91
Robotic control 4.36 (0.14) 4.40 (0.10) 0.97
Bimanual dexterity 4.21 (0.24) 4.30 (0.15) 0.36
Overall GEARS/25 20.64 (0.93) 20.80 (0.50) 0.65

Note: ‘Autonomy’ domain not assessed by C-SATS. SD: standard deviation.

Discussion

Intraoperative video review by expert surgeons and crowd-sourcing may be a useful platform for ongoing study of technique-outcome relationships with a view towards quality improvement. Despite using expert raters to identify key operative steps and to assess video footage of uretero-ileal anastomoses, our small study did not find any correlation between intraoperative technique and clinically significant UIS.

As all surgical subspecialties try to better understand the factors that contribute to patient safety and outcomes, there has been a recent shift toward identifying important intraoperative factors.24 In a seminal article, Birkmeyer et al1 assessed the association between peer-reviewed technical skill and 30-day perioperative complications. Among 24 bariatric surgeons, a large discrepancy in skill assessment scores was identified and these scores were associated with perioperative complications. Thus, we sought to assess the feasibility of systematically producing intraoperative video clips, assessed by expert surgeons, in order to facilitate surgical quality improvement and technical innovation. The approach employed in this study is novel; to our knowledge, a video review strategy has not previously been employed in robotic cystectomy quality improvement. Additionally, unlike prostatectomy, wherein early outcomes are often subjective and patient-reported, we specifically chose to study strictures since they represent an early, identifiable, objective outcome with significant clinical relevance. We developed a procedure-specific questionnaire by expert consensus in order to assess relevant individual steps of the anastomosis, from ureteral dissection to suturing. We used the contralateral anastomosis as a control in most cases to allow for adjudication of the effect of surgical technique on the development of strictures while limiting the impact of confounding due to patient factors.

Our data suggest that both expert surgeons and crowd-sourcing may be unable to reliably determine whether a given patient is at risk for UIS based on observing the surgical technique of the ureteral dissection and anastomosis. While the questionnaire derived by expert consensus has face validity, its external validity and reliability have not been assessed. The inter-rater reliability of the tool was poor, and this may explain the heterogeneity in our results, as well as the lack of association between the expert assessment and the primary outcome. Although the steps of the dissection and anastomosis to be analyzed were identified through expert consensus, the video clips shown to both experts and the crowd in this study were not internally validated. In addition, there may be unmeasured patient factors that we have been unable to assess. Finally, while crowd-sourced GEARS assessment is a valid method of assessing global robotic surgical technical skill in urology,8,25 in our study, crowd-sourced reviewers could not differentiate between UIS and control videos. While platforms like C-SATS have a role in generating high-volume assessment and feedback, crowd-sourcing has yet to demonstrate the ability to discern between clinically relevant patient outcomes, unlike peer-review.1,26

Future efforts using intraoperative video review for quality improvement initiatives should strive to include an adequate number of surgical cases for appropriate study power. Additionally, the lack of agreement across our content expert raters may have been due in part to the use of a questionnaire that has not been validated. We feel that despite the inability of our raters to predict clinical outcomes based on review of intraoperative video footage, these types of endeavours are worthwhile for both self and peer-review of surgical technique, and possibly the future creation of formal systems of high-stakes evaluation or accreditation, which incorporate surgical technical skill.

Conclusion

We used expert surgeon and crowd evaluation of distilled surgical video footage to explore the effect of technical skill on the development of UIS. Although we were unable to identify an association between expert or crowd-sourced video ratings and UIS, this platform has the potential for quality improvement and, therefore, warrants further study.

Footnotes

See related commentary on page 337

Competing interests: Dr. Daneshmand has been an advisor for and received honoraria from Photocure and TARIS; and has participated in clinical trials unrelated to this subject matter supported by Photocure and TARIS. Dr. Aron has received honoraria from Intuitive Surgical. Dr. Gill has been an advisor/consultant for EDAP and Mimic; and holds investments in Hansen Medical. The remaining authors report no competing personal or financial interests.

This paper has been peer-reviewed.


Articles from Canadian Urological Association Journal are provided here courtesy of Canadian Urological Association
