Review of research on detection and tracking of minimally invasive surgical tools based on deep learning

Yuying Liu; Zijian Zhao

2019 Oct;36(5):870–878. [Article in Chinese] doi: 10.7507/1001-5515.201904061

PMCID: PMC9935135 PMID: 31631638

Abstract

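The application of deep-learning-based detection and tracking of minimally invasive surgical tools in minimally invasive surgery is currently a research hotspot. This paper first systematically describes the techniques involved in detecting and tracking minimally invasive surgical tools, focusing on the advantages of deep learning algorithms. It then outlines surgical tool detection and tracking algorithms based on fully supervised deep neural networks as well as the emerging algorithms based on weakly supervised deep neural networks, and summarizes several typical algorithm frameworks built on deep convolutional neural networks and recurrent neural networks together with their flow charts. The aim is to give researchers in related fields a more systematic view of current progress and to give minimally invasive surgeons a reference when choosing navigation technology. Finally, the paper suggests general directions for further research on deep-learning-based detection and tracking of minimally invasive surgical tools.

Keywords: deep learning, convolutional neural network, fully supervised, weakly supervised, surgical tool detection and tracking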

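Introduction

Compared with traditional surgery, minimally invasive surgery (MIS) offers many advantages, such as less trauma, less pain and shorter postoperative recovery, and it has become a common surgical choice^[1]. However, MIS relies on indirect observation and manipulation, which complicates depth perception, narrows the surgical field of view and the working space, and reduces the surgeon's hand-eye coordination, so organs or tissue may be damaged during the operation. Additional information is therefore needed to monitor the surgical tools moving inside the body, and detection and tracking algorithms for minimally invasive surgical tools can supply exactly this information for navigation in MIS. These algorithms determine the position and spatial pose of the tools and provide accurate, real-time navigation for the clinician or robot performing the operation, making the procedure smoother and safer.

Minimally invasive surgical tools can be detected and tracked with either hardware-based or vision-based approaches. Because the vision-based approach is simple and needs no additional equipment, it has become the main research direction. With suitable software, a vision-based approach extracts features of the surgical tools directly from the image and uses them to detect and track the tools' position in the image, thereby providing navigation for tool manipulation.

Notably, detection and tracking technology for minimally invasive surgical tools is steadily maturing. It has moved from traditional hand-designed features and classifiers to end-to-end deep learning algorithms and, within deep learning, from detect-then-track pipelines to algorithms that integrate detection and tracking. This paper mainly introduces detection and tracking algorithms for minimally invasive surgical tools based on fully supervised and weakly supervised deep neural networks, so that researchers in related fields can gain a more systematic view of current progress and minimally invasive surgeons have a reference when choosing a suitable navigation technology. Finally, the paper suggests general directions for further research on deep-learning-based detection and tracking of minimally invasive surgical tools.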

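1. Deep learning theory

Current detection and tracking algorithms for minimally invasive surgical tools fall into two broad categories: generative models and discriminative models^[2]. Generative models mainly include methods based on feature matching, Bayesian tracking and motion detection^[3-4]. Discriminative models use pattern classification methods, such as the support vector machine (SVM), random forests and most deep-learning-based methods^[4-5]. Because discriminative models can clearly separate background from foreground information, they have gradually become dominant in this field.

All of the generative methods mentioned above, as well as the SVM and random forest discriminative methods, usually have considerable limitations: they require hand-crafted features, which is imprecise and labor-intensive; they struggle to build high-level semantic information; and they cannot be applied to complex scenes. Deep learning models, by contrast, use artificial neural networks to learn representations from data and are characterized by strong learning capacity, efficient feature representation and higher-level semantic features. Since 2013, researchers have therefore paid widespread attention to deep-learning-based detection and tracking of minimally invasive surgical tools^[6].

Deep-learning-based detection and tracking of minimally invasive surgical tools differs from deep-learning-based vehicle detection and tracking: targets are lost more easily, datasets are fewer and scenes are more complex. In addition, occlusion, specular reflection, and bleeding or smoke in the surgical scene cause information loss in the datasets, which in turn degrades test performance^[2]. Over the years, researchers have tackled these difficulties either by building new deep neural networks or by extending networks originally designed for medical image segmentation or recognition to this task^[7]. As a result, the technology has continued to develop, and a variety of deep-learning-based detection and tracking algorithms have now been applied in MIS.

Two kinds of deep neural network models currently dominate deep-learning-based detection and tracking of minimally invasive surgical tools: fully supervised models and weakly supervised models. The following sections introduce the fully supervised and weakly supervised algorithms that have achieved good test results in recent years.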

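2. Detection and tracking algorithms for minimally invasive surgical tools based on fully supervised deep neural networks

2.1. Classical algorithms with a convolutional neural network backbone

2.1.1. Convolutional neural network combined with line segment detection

A deeper neural network is not always better: an overly deep network brings a series of problems, such as loss of image detail, difficult weight updates and more complicated localization. A network of moderate depth combined with a classical algorithm may perform better. For example, Chen et al.^[8] proposed a detection and tracking algorithm for minimally invasive surgical tools that combines a convolutional neural network (CNN) with traditional line segment detection (LSD). The study used CNN detection together with a spatio-temporal context (STC) learning tracker to detect and track tools across video frames. Before detection, LSD is used to detect and mark the line segments of the tool's appearance; the marked images then serve as the dataset for training the CNN, and the positions of these line segments help to detect the tool tip quickly and accurately. Finally, the STC learning algorithm tracks the tool frame by frame; the key to STC is the efficient use of the fast Fourier transform (FFT) and the inverse fast Fourier transform (IFFT). The experimental results show that the method of Chen et al.^[8] achieves state-of-the-art two-dimensional tracking performance, with an accuracy of 93.2% and a processing speed of 25 frames/s, so it suits MIS in which the real-time requirement is moderate but the accuracy requirement is high. Figure 1 shows the flow chart of the algorithm.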

Figure 1. Minimally invasive surgical tool detection and tracking algorithm combining CNN with LSD
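The role of the FFT and IFFT in a correlation-style tracker can be illustrated with a small, self-contained sketch. It is not the STC model of Chen et al.: it only shows, on a one-dimensional toy signal, how a target's displacement can be recovered by multiplying spectra and transforming back. The function names and the toy data are assumptions made for illustration.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT computed with the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def estimate_shift(template, signal):
    """Return the circular shift that best aligns `template` with `signal`.

    Cross-correlation is computed in the frequency domain:
    corr = IFFT(FFT(signal) * conj(FFT(template))).
    """
    t_spec, s_spec = fft(template), fft(signal)
    corr = ifft([s * t.conjugate() for s, t in zip(s_spec, t_spec)])
    scores = [c.real for c in corr]
    return scores.index(max(scores))


# Hypothetical 1D "appearance" of a tool and the same pattern moved 3 samples.
template = [0, 0, 1, 3, 1, 0, 0, 0]
signal = template[-3:] + template[:-3]
print(estimate_shift(template, signal))
```

The same idea extends to two-dimensional image patches, where the frequency-domain product replaces a much more expensive sliding-window search; a production tracker would rely on an optimized FFT library rather than this recursive version.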

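2.1.2. Convolutional neural network combined with support vector machine and hidden Markov model

In 2017, Sahu et al.^[9] proposed detecting minimally invasive surgical tools with a modified AlexNet combined with a random forest, while Twinanda et al.^[4] adapted AlexNet to the tool detection and tracking problem. The pipeline of Twinanda et al.^[4] is shown in Figure 2: after the training images are input, EndoNet, an extension of AlexNet, is first trained by fine-tuning. The trained EndoNet is then used for the detection and tracking task. For tool presence detection, the confidence output by EndoNet is used directly; for tool tracking, EndoNet extracts visual features of the tools from the image, and these features are passed to an SVM and a hierarchical hidden Markov model (HMM) to complete detection and tracking. Although the results reported by Twinanda et al.^[4] do not meet the current real-time standard (more than 20 frames/s), Twinanda et al.^[4] and the University Hospital of Strasbourg (HUS) built the Cholec80 cholecystectomy dataset (http://camma.u-strasbg.fr/datasets), which contains videos of 80 cholecystectomy procedures performed by 13 surgeons. Because datasets usable for training are very scarce in deep-learning-based surgical tool detection and tracking, this work has greatly facilitated subsequent research.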
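To make the idea of combining per-frame confidences with an HMM more concrete, the sketch below smooths a sequence of tool-presence confidences with a plain two-state HMM decoded by the Viterbi algorithm. This is a simplified stand-in, not the SVM plus hierarchical HMM pipeline of Twinanda et al.; the transition probability `stay_prob` and the confidence values are hypothetical.

```python
import math


def viterbi_presence(confidences, stay_prob=0.9):
    """Decode the most likely absent(0)/present(1) sequence for one tool.

    The emission likelihood of state 1 is the confidence itself and that of
    state 0 is 1 - confidence; `stay_prob` is the (assumed) probability of
    keeping the same state between consecutive frames.
    """
    eps = 1e-9
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)

    def emit(c):
        return (math.log(max(1.0 - c, eps)), math.log(max(c, eps)))

    score = list(emit(confidences[0]))  # uniform prior (constant dropped)
    back = []
    for c in confidences[1:]:
        e = emit(c)
        new_score, pointers = [], []
        for state in (0, 1):
            candidates = [
                score[prev] + (log_stay if prev == state else log_switch)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best_prev)
            new_score.append(candidates[best_prev] + e[state])
        score = new_score
        back.append(pointers)

    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical per-frame confidences; frame 3 dips just below 0.5.
raw = [0.9, 0.8, 0.45, 0.85, 0.9, 0.1, 0.05, 0.1]
print([int(c > 0.5) for c in raw])   # frame-by-frame thresholding
print(viterbi_presence(raw))         # temporally smoothed decision
```

Thresholding each frame independently flips the decision whenever a single confidence dips, whereas the temporal model only changes state when the evidence outweighs the cost of a transition.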

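2.1.3. Fully convolutional network combined with optical flow tracking

After the fully convolutional network (FCN) was introduced, García-Peraza-Herrera et al.^[10] built on it and proposed a detection and tracking algorithm for minimally invasive surgical tools that combines an FCN with optical flow (OF) tracking. The method first used a fine-tuned FCN to verify that an FCN is feasible for this problem. The fine-tuned FCN did not run in real time but reached an accuracy of up to 83.7%, so the group went on to use OF tracking to increase speed. The combined FCN and OF tracking algorithm finally achieved 30 frames/s at an accuracy of 88.3%, which makes it suitable for slower MIS video datasets. Figure 3 shows the flow chart of the algorithm.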

Figure 3. Minimally invasive surgical tool detection and tracking algorithm combining FCN with OF tracking
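The speed-up obtained by pairing a slow segmentation network with a fast tracker can be sketched as a scheduling pattern: segment only selected keyframes and propagate the mask with estimated motion in between. The sketch below assumes a single global translation per frame and a hypothetical `segment` callable; the review does not describe the actual propagation model of García-Peraza-Herrera et al., so this is an illustration of the general idea only.

```python
def propagate_mask(mask, flow):
    """Shift a set of (row, col) tool pixels by a global (dy, dx) flow vector."""
    dy, dx = flow
    return {(r + dy, c + dx) for r, c in mask}


def track(frames, flows, segment, keyframe_every=3):
    """Segment every `keyframe_every`-th frame and propagate in between.

    flows[i] is the assumed (dy, dx) motion from frame i-1 to frame i;
    `segment` stands in for the slow, accurate segmentation network.
    """
    masks = []
    for i, frame in enumerate(frames):
        if i % keyframe_every == 0:
            masks.append(segment(frame))                       # slow path
        else:
            masks.append(propagate_mask(masks[-1], flows[i]))  # fast path
    return masks


# Toy data: each "frame" is already the set of tool pixels, and the stand-in
# segmenter simply returns it while counting how often it is called.
calls = []


def toy_segment(frame):
    calls.append(1)
    return set(frame)


frames = [{(0, 0)}, {(0, 1)}, {(0, 2)}, {(1, 2)}]
flows = [(0, 0), (0, 1), (0, 1), (1, 0)]
masks = track(frames, flows, toy_segment, keyframe_every=3)
print(masks, len(calls))
```

In this toy run the segmenter is invoked for only two of the four frames, which is where the gain in frame rate comes from; the cost is that propagation errors accumulate until the next keyframe.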

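2.2. Algorithms based on coarse-to-fine cascaded convolutional neural networks

Zhao et al.^[11] proposed a detection and tracking algorithm for minimally invasive surgical tools based on a coarse-to-fine cascade of CNNs, in which two cascaded CNNs perform coarse and fine localization of the tool. The coarse CNN is a classification network and the fine CNN is a regression network for the tool-tip region; updating the STC lets the two CNNs work together to detect and track the tool. The algorithm performs well in MIS scenes with smoke and occlusion and suits complex MIS scenes, but its speed still needs improvement. Figure 4 shows the flow chart of the algorithm.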

Figure 4. Minimally invasive surgical tool detection and tracking algorithm based on a coarse-to-fine CNN cascade
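The coarse-to-fine idea can be sketched without any neural network: a coarse stage selects a candidate window, and a fine stage estimates the tip position only inside that window. In the sketch below, summed brightness stands in for the coarse classification CNN and an intensity-weighted centroid stands in for the fine regression CNN; both stand-ins and the toy image are assumptions for illustration.

```python
def coarse_stage(image, win):
    """Stand-in for the coarse classifier: pick the brightest window."""
    best, best_score = None, float("-inf")
    for top in range(0, len(image) - win + 1, win):
        for left in range(0, len(image[0]) - win + 1, win):
            score = sum(
                image[r][c]
                for r in range(top, top + win)
                for c in range(left, left + win)
            )
            if score > best_score:
                best, best_score = (top, left), score
    return best


def fine_stage(image, top, left, win):
    """Stand-in for the fine regressor: weighted centroid inside the window."""
    total = weighted_row = weighted_col = 0.0
    for r in range(top, top + win):
        for c in range(left, left + win):
            value = image[r][c]
            total += value
            weighted_row += value * r
            weighted_col += value * c
    return (weighted_row / total, weighted_col / total)


def locate_tip(image, win=4):
    """Run the coarse stage first, then refine within the selected window."""
    top, left = coarse_stage(image, win)
    return fine_stage(image, top, left, win)


# Toy 8x8 image with a two-pixel bright "tip" in the lower-right quadrant.
image = [[0] * 8 for _ in range(8)]
image[5][5] = 4
image[5][6] = 4
print(locate_tip(image))
```

The fine stage inspects only `win * win` pixels instead of the whole image, which mirrors how the cascade narrows the search range of the second network.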

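Zhao et al.^[12] proposed a coarse-to-fine cascaded CNN detection and tracking algorithm based on AlexNet, the Visual Geometry Group (VGG) network and GoogLeNet. It includes a spatial transformer network (STN) for precisely locating the tool and suits MIS that demands high accuracy in complex scenes; Figure 5 shows its flow chart. Because passing through a coarse CNN first narrows the search range of the fine CNN and therefore speeds up tool-tip detection, researchers can adopt this idea when designing network structures to accelerate tool-tip localization.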

Figure 5. Minimally invasive surgical tool detection and tracking algorithm based on a coarse CNN to fine STN cascade

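2.3. Two-stage convolutional neural network algorithms based on region proposals

In 2017, Sarikaya et al.^[13] proposed a detection and tracking algorithm for minimally invasive surgical tools built on a modified faster region-based convolutional neural network (Faster R-CNN). The algorithm uses a region proposal network (RPN) and a multimodal two-stream CNN to fuse image and temporal motion cues for prediction, and thereby detects and tracks the tools. Faster R-CNN is a typical two-stage CNN: it is more accurate than one-stage CNNs but slow, falling far short of the real-time standard. The modified Faster R-CNN achieved an average precision of 91% at 10 frames/s. Notably, Sarikaya et al.^[13] released MIS videos of 10 surgeons from Roswell Park Cancer Institute (RPCI), Buffalo, New York, USA, performing 6 different tasks on the da Vinci Surgical System (dVSS). Every frame is annotated with the tools, so the dataset can be used directly without extra processing, which is a great convenience for researchers. Figure 6 shows the flow chart of the multimodal two-stream detection and tracking algorithm based on Faster R-CNN.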

Figure 6. Multimodal two-stream minimally invasive surgical tool detection and tracking algorithm based on Faster R-CNN

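In 2018, Jin et al.^[14] proposed a detection and tracking algorithm that is also based on the two-stage Faster R-CNN framework and is trained end to end: video frames are input and the spatial coordinates of the bounding boxes around the 7 tools used in MIS are output directly. The flow chart is shown in Figure 7. The results show that the accuracy reached only 81.8% and the speed only 5 frames/s, but Jin et al.^[14] introduced an MIS assessment module that uses the tools' temporal and motion trajectories to evaluate MIS performance automatically, and the algorithm also surpassed EndoNet on the tool presence detection task.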
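Bounding-box detectors such as those above are commonly scored by comparing each predicted box with an annotated box through intersection over union (IoU). The review does not state which overlap criterion the cited studies used, so the sketch below only illustrates the measure itself; the box format and the example coordinates are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)


# Hypothetical predicted and annotated boxes for one tool in one frame.
predicted = (0, 0, 2, 2)
annotated = (1, 1, 3, 3)
print(iou(predicted, annotated))
```

A detection is usually counted as correct when its IoU with the annotation exceeds a chosen threshold, and precision figures are then aggregated over all frames and tool classes.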

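2.4. One-stage convolutional neural network algorithms based on regression

The two-stage CNN algorithms based on region proposals described above are accurate but far from real time, so Choi et al.^[15] proposed a regression-based one-stage CNN algorithm for detecting and tracking minimally invasive surgical tools. In 2017, Choi et al.^[15] added a fully connected layer after the last layer of the you only look once (YOLO) network proposed by Redmon et al.^[16], reduced the number of parameters through downsampling, treated detection and tracking as a simple regression problem, and designed a CNN algorithm that can detect and track tools in real time. Dropout is applied in the added fully connected layer, which connects all output values of the previous layer, to prevent overfitting to the training data. The average precision of the algorithm was 72.26%, and small tools are prone to false or missed detections, but the detection speed reached 48.9 frames/s, reinforcing the efficiency advantage of one-stage CNNs. With its high real-time performance, the algorithm suits the detection and tracking of larger tools. Figure 8 shows the modified YOLO network structure.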

Figure 8. Modified YOLO network structure
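Treating detection as regression means the network outputs box coordinates directly for each grid cell. The sketch below decodes one such prediction into pixel coordinates following the parameterization of the original YOLO paper (centre offsets relative to the cell, width and height relative to the image). The output layout of the modified network of Choi et al. is not described in this review, so the grid size, image size and prediction values here are assumptions.

```python
def decode_cell(row, col, pred, grid, img_w, img_h):
    """Convert one grid-cell prediction into a pixel-space bounding box.

    `pred` is (x, y, w, h): x and y are offsets inside the cell in [0, 1],
    w and h are fractions of the image width and height.
    Returns (x_min, y_min, x_max, y_max).
    """
    x, y, w, h = pred
    center_x = (col + x) / grid * img_w
    center_y = (row + y) / grid * img_h
    box_w = w * img_w
    box_h = h * img_h
    return (
        center_x - box_w / 2,
        center_y - box_h / 2,
        center_x + box_w / 2,
        center_y + box_h / 2,
    )


# Hypothetical prediction from the centre cell of a 7x7 grid on a 448x448 frame.
box = decode_cell(row=3, col=3, pred=(0.5, 0.5, 0.2, 0.4),
                  grid=7, img_w=448, img_h=448)
print(box)
```

Because every cell is decoded in a single forward pass, no separate proposal stage is needed, which is the source of the one-stage speed advantage; the coarse grid is also one reason small tools are harder to localize.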

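2.5. Convolutional neural network algorithms based on a shrink-before-expand path

Inspired by U-Net, Kurmann et al.^[17] proposed a novel CNN algorithm based on a shrink-before-expand (SBE) path. Its network has 5 downsampling stages (the number of features doubles) and 5 upsampling stages (the number of features halves); each stage consists of a convolution layer, an activation layer and a sampling layer, and a fully connected layer finally extends the lowest layer. The algorithm needs no sliding window at test time, which reduces computation, and it can detect and track multiple tools simultaneously. The algorithm proposed by Gao et al.^[18] in 2019 is similar to that of Kurmann et al.^[17]. Du et al.^[19], inspired by Twinanda et al.^[4] and U-Net, proposed an improved algorithm based on an SBE path and a two-branch CNN. Its network consists of an FCN detection network and a regression network and incorporates residual-network shortcut connections to refine the output accuracy; it is driven purely by deep learning and uses no direct kinematic information from the robot. Experiments with this algorithm show an accuracy above 90%, some generality to new tools, good robustness under simulated smoke and the ability to handle real MIS scenes; its drawback is that the processing speed does not reach the real-time standard. Figure 9 shows the flow chart of the algorithm.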

Figure 9. Minimally invasive surgical tool detection and tracking algorithm based on an SBE path and a two-branch CNN
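The shape bookkeeping of a shrink-before-expand path with 5 downsampling and 5 upsampling stages can be written out as a short sketch. It assumes that every sampling stage changes the spatial resolution by a factor of 2 and uses a hypothetical input size and base feature count; it models only the doubling and halving of features described above, not the convolution, activation or fully connected layers.

```python
def sbe_schedule(input_size, base_channels, stages=5):
    """List (stage name, spatial size, feature channels) along an SBE path.

    Assumes each downsampling stage halves the resolution and doubles the
    features, and each upsampling stage does the reverse.
    """
    size, channels = input_size, base_channels
    plan = [("input", size, channels)]
    for i in range(stages):
        size //= 2
        channels *= 2
        plan.append((f"down{i + 1}", size, channels))
    for i in range(stages):
        size *= 2
        channels //= 2
        plan.append((f"up{i + 1}", size, channels))
    return plan


# Hypothetical 256x256 input with 16 feature channels after the first layer.
for name, size, channels in sbe_schedule(256, 16):
    print(f"{name:>6}: {size:>3} x {size:<3} features={channels}")
```

The output has the same resolution as the input, which is why such a network can produce a dense prediction for the whole frame in one pass instead of scanning it with a sliding window.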

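Laina et al.^[20] proposed a new algorithm that uses an SBE path. By exploiting the interdependence between tool segmentation and tool localization, it reformulates two-dimensional (2D) pose estimation of the tool as a heatmap regression problem, so that deep learning performs robust, concurrent regression of the segmentation and localization tasks to accomplish detection and tracking. Experiments show that heatmap regression is superior to regressing the tool position directly. The SBE-path CNN is adapted from the fully convolutional residual network of He et al.^[21]; the segmentation and localization outputs are concatenated and trained directly end to end, relying only on contextual information without any post-processing. The method reached an accuracy of 92.6% with a processing time of 56 ms per frame, combining high accuracy with relatively high speed. Figure 10 shows the flow chart of the algorithm.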

Figure 10. Minimally invasive surgical tool detection and tracking algorithm based on an SBE-path CNN
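Heatmap regression replaces a direct coordinate target with a dense map that peaks at the landmark, and the position is read back from the location of the maximum. The sketch below builds such a target with a Gaussian bump, a common choice that is assumed here rather than taken from Laina et al., and then decodes it; the map size, landmark position and `sigma` are hypothetical.

```python
import math


def gaussian_heatmap(height, width, center, sigma=1.5):
    """Build a regression target that peaks (value 1.0) at `center` = (row, col)."""
    center_row, center_col = center
    return [
        [
            math.exp(-((r - center_row) ** 2 + (c - center_col) ** 2)
                     / (2 * sigma ** 2))
            for c in range(width)
        ]
        for r in range(height)
    ]


def argmax_2d(heatmap):
    """Recover the landmark position as the location of the strongest response."""
    _, row, col = max(
        (value, r, c)
        for r, line in enumerate(heatmap)
        for c, value in enumerate(line)
    )
    return row, col


# Hypothetical tool-tip landmark at row 5, column 12 of a 16x20 map.
target = gaussian_heatmap(16, 20, (5, 12))
print(argmax_2d(target))
```

Pixels near the landmark still receive a non-zero target, so a prediction that is off by one pixel is penalized only mildly, whereas direct coordinate regression gives the network no such spatial structure to exploit.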

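2.6. Algorithms combining convolutional and recurrent neural networks

Because an MIS procedure unfolds over time, Mishra et al.^[22] proposed in 2017 a relatively novel algorithm for the detection part of the tool detection and tracking task. It uses a residual network (ResNet) to extract high-level visual features from tool images and a long short-term memory (LSTM) network to encode temporal information, capturing the temporal connections among the high-level visual features extracted by ResNet-50 and thereby improving prediction accuracy. LSTM is the most widely used type of recurrent neural network (RNN). The CNN combined with LSTM achieved an average accuracy of 88.75% with a processing time of 2.45 ms per frame, and suits MIS with modest accuracy requirements but high real-time requirements. Notably, Mishra et al.^[22] not only offered later researchers the idea of combining a CNN with an LSTM but also released open-source code for reference.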
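The speeds quoted in this review are given either in frames/s or in milliseconds per frame, which makes them hard to compare at a glance. The sketch below converts the per-frame times to frame rates and checks each figure against the real-time criterion quoted in section 2.1.2 (more than 20 frames/s). The numbers are those reported in the text; the helper names are assumptions.

```python
REAL_TIME_FPS = 20.0  # criterion quoted in this review: more than 20 frames/s


def ms_per_frame_to_fps(ms_per_frame):
    """Convert a per-frame processing time in milliseconds to frames/s."""
    return 1000.0 / ms_per_frame


def is_real_time(fps, threshold=REAL_TIME_FPS):
    """Apply the review's criterion (strictly more than the threshold)."""
    return fps > threshold


# Speeds as reported in the text, all expressed in frames/s.
reported = {
    "Chen et al. [8]": 25.0,
    "Garcia-Peraza-Herrera et al. [10]": 30.0,
    "Sarikaya et al. [13]": 10.0,
    "Jin et al. [14]": 5.0,
    "Choi et al. [15]": 48.9,
    "Laina et al. [20]": ms_per_frame_to_fps(56.0),
    "Mishra et al. [22]": ms_per_frame_to_fps(2.45),
}

for name, fps in reported.items():
    print(f"{name}: {fps:.1f} frames/s, real time: {is_real_time(fps)}")
```

For reference, 56 ms per frame corresponds to roughly 17.9 frames/s and 2.45 ms per frame to roughly 408 frames/s. The studies used different hardware and datasets, so these figures indicate orders of magnitude rather than a strict ranking.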

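The computational complexity of combined CNN and RNN algorithms prevents the whole structure from being trained end to end, so the quality of one step affects the next and hence the overall training result. Fortunately, although many minimally invasive surgical tools look very similar to one another, they can usually be distinguished by events at earlier times, so the CNN should be trained with temporal context to extract the most useful visual features. Hajj et al.^[23] proposed a new boosting strategy that progressively adds trained weak classifiers to enrich the CNN and RNN parts of the system simultaneously. It improves the average precision of tool detection and tracking, with the highest accuracy reaching 97.89%. In its experiments, the study compared and evaluated the detection and tracking results of 18 CNNs on 28 kinds of tools, providing a reference for later researchers.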

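3. Detection and tracking algorithms for minimally invasive surgical tools based on weakly supervised deep neural networks

All of the algorithms reviewed above are based on fully supervised deep neural networks. Fully supervised models, however, require training data with complete spatial annotations, and standard databases usable as training and test sets in this field are very scarce. Apart from the Cholec80 dataset mentioned above and the public dataset released for the 2016 Modeling and Monitoring of Computer Assisted Interventions challenge (m2cai16-tool) (http://camma.u-strasbg.fr/m2cai2016/index.php/tool-presence-detection-challenge-results/), which can be used for training directly, other datasets require complete manual annotation, which may suffer from non-expert annotators or wasted labor. Fully supervised models therefore limit both the size of usable datasets and the generalization of detection and tracking algorithms. To avoid the need for complete manual annotation, researchers have turned to weakly supervised models, which are trained only on image-level labels and require no spatial annotation. This offers a new opportunity for a field that lacks fully annotated datasets. Weak supervision has also been applied to tasks such as detecting cancerous regions in medical images^[24-25]. Regrettably, weakly supervised models emerged later than fully supervised ones, the related literature is still limited, the approach is not yet mature, and it does not yet meet real-time requirements.

Building on the weakly supervised object detection literature^[26-27], Vardazaryan et al.^[28] constructed an FCN architecture, a modified ResNet-18 in which the fully connected layers are removed to preserve spatial information, which allows tools to be detected and tracked on datasets without explicit spatial annotations. It is trained end to end, so errors in one step do not propagate to the next, and the average precision on the test set was 88.8%.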
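The reason that keeping spatial information enables localization from image-level labels can be sketched as follows: each tool class gets a spatial activation map, pooling the map yields a single presence score that can be trained against the image-level label, and the position of the peak serves as the localization. The sketch uses plain global max pooling as a simple stand-in; the pooling actually used by Vardazaryan et al. is not described in this review, and the class names, map values and threshold are hypothetical.

```python
def pool_class_map(class_map):
    """Global max pooling: return (score, (row, col)) of the strongest cell."""
    score, row, col = max(
        (value, r, c)
        for r, line in enumerate(class_map)
        for c, value in enumerate(line)
    )
    return score, (row, col)


def detect_tools(class_maps, threshold=0.5):
    """Report each tool whose pooled score passes the threshold, with its peak."""
    detections = {}
    for tool, class_map in class_maps.items():
        score, peak = pool_class_map(class_map)
        if score >= threshold:
            detections[tool] = peak
    return detections


# Hypothetical 3x3 activation maps for two tool classes in one frame.
class_maps = {
    "grasper": [[0.1, 0.2, 0.1], [0.2, 0.9, 0.3], [0.1, 0.2, 0.1]],
    "hook": [[0.0, 0.1, 0.0], [0.1, 0.2, 0.1], [0.0, 0.1, 0.0]],
}
print(detect_tools(class_maps))
```

Only the pooled score is compared with a label during training, so no bounding boxes or masks are required; the localization emerges as a by-product of the map that produced the score.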

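Nwoye et al.^[29] proposed a weakly supervised detection and tracking algorithm that combines a CNN with a convolutional long short-term memory (ConvLSTM) network. The algorithm is weakly supervised only on binary tool-presence labels; it uses the ConvLSTM network to model the temporal dependencies in tool motion and exploits the network's spatio-temporal capability to smooth the class peak activations in the localization heat maps (Lhmaps). The algorithm introduced the idea of using a ConvLSTM network and raised the average precision on the test set to 92.9%. It suits MIS with high accuracy requirements and demonstrates that weakly supervised deep neural networks are adequate for deep-learning-based tool detection and tracking. Figure 11 shows the flow chart of the algorithm.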

Figure 11. Minimally invasive surgical tool detection and tracking algorithm combining a weakly supervised CNN with ConvLSTM

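4. Summary and outlook

This review summarizes a variety of deep-learning-based detection and tracking algorithms for minimally invasive surgical tools. At present, CNN algorithms based on the SBE path are relatively mature in MIS applications, with high accuracy and relatively high speed. Algorithms that combine a CNN with an RNN still have considerable room for improvement in accuracy and need further exploration. Weakly supervised algorithms, which address the problem of complete dataset annotation, are a relatively new research direction with fairly high accuracy, but they do not meet real-time requirements. Figure 12 summarizes the deep-learning-based detection and tracking algorithms. The review compares the test results, scope of application, strengths and weaknesses of these algorithms, so that researchers in related fields can gain a more systematic view of current progress and minimally invasive surgeons have a reference when choosing a suitable navigation technology.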

Figure 12. Summary of deep-learning-based detection and tracking algorithms for minimally invasive surgical tools

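Existing deep-learning-based detection and tracking algorithms for minimally invasive surgical tools still need overall improvement in accuracy and real-time performance, so future algorithms may develop in the following directions:

(1) Effective feature fusion. Effective features are the basis for solving the tool detection and tracking problem, so multiple deep learning features can be fused into the network structures to make the test results more accurate.

(2) Long-term tracking. A better target model update scheme that re-finds the tool being tracked, or a suitable combination of neural networks, is needed to solve the problem of losing the target during tracking.

(3) Moving from full supervision to weak or even unsupervised deep learning. How to use generative adversarial networks, autoencoders and similar techniques to build unsupervised models that improve real-time performance may become both a difficulty and a focus of research in this field.

Conflict of interest statement: All authors declare that they have no conflict of interest.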

Funding Statement

National Natural Science Foundation of China (61273277, 81401543)

References

1. Zhao Zijian, Voros S, Weng Ying, et al. Tracking-by-detection of surgical instruments in minimally invasive surgery via the convolutional neural network deep learning-based method. Computer Assisted Surgery. 2017;22(1):26–35. doi: 10.1080/24699322.2017.1378777.
2. Bouget D, Allan M, Stoyanov D, et al. Vision-based and marker-less surgical tool detection and tracking: a review of the literature. Med Image Anal. 2017;35:633–654. doi: 10.1016/j.media.2016.09.003.
3. Du Xiaofei, Allan M, Dore A, et al. Combined 2D and 3D tracking of surgical instruments for minimally invasive and robotic-assisted surgery. Int J Comput Assist Radiol Surg. 2016;11(6):1109–1119. doi: 10.1007/s11548-016-1393-4.
4. Twinanda A P, Shehata S M, Mutter D, et al. EndoNet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Trans Med Imaging. 2017;36(1):86–97. doi: 10.1109/TMI.2016.2593957.
5. Rieke N, Tan D J, Tombari F, et al. Real-time online adaption for robust instrument tracking and pose estimation. Medical Image Computing and Computer Assisted Intervention (MICCAI 2016), Springer. 2016:422–430.
6. Sahu M, Moerman D, Mewes P, et al. Instrument state recognition and tracking for effective control of robotized laparoscopic systems. International Journal of Mechanical Engineering and Robotics Research. 2016;5(1):33–38.
7. García-Peraza-Herrera L C, Li Wenqi, Fidon L, et al. ToolNet: holistically-nested real-time segmentation of robotic surgical tools//IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2017: 5717-5722.
8. Chen Zhaorui, Zhao Zijian, Cheng Xiaolin. Surgical instruments tracking based on deep learning with lines detection and spatio-temporal context//Chinese Automation Congress (CAC 2017), IEEE, 2018. doi: 10.1109/CAC.2017.8243236.
9. Sahu M, Mukhopadhyay A, Szengel A, et al. Addressing multi-label imbalance problem of surgical tool detection using CNN. Int J Comput Assist Radiol Surg. 2017;12(6):1013–1020. doi: 10.1007/s11548-017-1565-x.
10. García-Peraza-Herrera L C, Li Wenqi, Gruijthuijsen C, et al. Real-time segmentation of non-rigid surgical tools based on deep learning and tracking//Computer-Assisted and Robotic Endoscopy (CARE 2016), Springer, 2016: 84-95.
11. Zhao Z, Voros S, Chen Z, et al. Surgical tool tracking based on two CNNs: from coarse to fine. The Journal of Engineering. 2019;(14):467–472.
12. Zhao Zijian, Chen Zhaorui, Voros S, et al. Real-time tracking of surgical instruments based on spatio-temporal context and deep learning. Computer Assisted Surgery. 2019:1–10. doi: 10.1080/24699322.2018.1560097.
13. Sarikaya D, Corso J J, Guru K A. Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection. IEEE Trans Med Imaging. 2017;36(7):1542–1549. doi: 10.1109/TMI.2017.2665671.
14. Jin A, Yeung S, Jopling J, et al. Tool detection and operative skill assessment in surgical videos using region-based convolutional neural networks//31st Conference on Neural Information Processing Systems (NIPS 2017), 2018. arXiv: 1802.08774v1.
15. Choi B, Jo K, Choi S, et al. Surgical-tools detection based on convolutional neural network in laparoscopic robot-assisted surgery//39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017. doi: 10.1109/embc.2017.8037183.
16. Redmon J, Divvala S K, Girshick R B, et al. You only look once: unified, real-time object detection//Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016. arXiv: 1506.02640v5.
17. Kurmann T, Neila P M, Du Xiaofei, et al. Simultaneous recognition and pose estimation of instruments in minimally invasive surgery//Medical Image Computing and Computer-Assisted Intervention (MICCAI 2017), Springer, 2017: 505-513.
18. Gao Cong, Unberath M, Taylor R H, et al. Localizing dexterous surgical tools in X-ray for image-based navigation. arXiv: Computer Vision and Pattern Recognition, 2019. arXiv: 1901.06672v2.
19. Du X, Kurmann T, Chang P L, et al. Articulated Multi-Instrument 2D pose estimation using fully convolutional networks. IEEE Trans Med Imaging. 2018;37(5):1276–1287. doi: 10.1109/TMI.2017.2787672.
20. Laina I, Rieke N, Rupprecht C, et al. Concurrent segmentation and localization for tracking of surgical instruments//Medical Image Computing and Computer-Assisted Intervention (MICCAI 2017), Springer, 2017: 664-672.
21. He Kaiming, Zhang Xiangyu, Ren Shaoqing, et al. Deep residual learning for image recognition//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016. arXiv: 1512.03385v1.
22. Mishra K, Sathish R, Sheet D. Learning latent temporal connectionism of deep residual visual abstractions for identifying surgical tools in laparoscopy procedures//2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017: 2233-2240.
23. Hajj H A, Lamard M, Conze P, et al. Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks. Med Image Anal. 2018;47:203–218. doi: 10.1016/j.media.2018.05.001.
24. Hwang S, Kim H E. Self-transfer learning for weakly supervised lesion localization//Medical Image Computing and Computer-Assisted Intervention (MICCAI 2016), 2016: 239-246.
25. Jia Zhipeng, Huang Xingyi, Chang E I, et al. Constrained deep weak supervision for histopathology image segmentation. IEEE Trans Med Imaging. 2017;36(11):2376–2388. doi: 10.1109/TMI.2017.2724070.
26. Singh K K, Lee Y J. Hide-and-seek: forcing a network to be meticulous for weakly-supervised object and action localization//IEEE International Conference on Computer Vision (ICCV), 2017: 3524-3533.
27. Kim D, Cho D, Yoo D, et al. Two-phase learning for weakly supervised object localization//IEEE Computer Society and the Computer Vision Foundation (CVF), 2017: 3554-3563. arXiv: 1708.02108v3.
28. Vardazaryan A, Mutter D, Marescaux J, et al. Weakly-supervised learning for tool localization in laparoscopic videos//LABELS 2018, CVII 2018, STENT 2018: Intravascular Imaging and Computer Assisted Stenting and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, 2018: 169-179. arXiv: 1806.05573v2.
29. Nwoye C I, Mutter D, Marescaux J A Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos. Int J Comput Assist Radiol Surg. 2019;14(6):1059–1067. doi: 10.1007/s11548-019-01958-6.
