Lung parenchyma segmentation based on double scale parallel attention network

Kaili FENG; Lili REN; Yanlin WU; Yan LI; Hongrui WANG; Guanglei WANG

2022 Aug 25;39(4):721–729. [Article in Chinese] doi: 10.7507/1001-5515.202108005


PMCID: PMC10957358 PMID: 36008336

Abstract

Automatic and accurate segmentation of the lung parenchyma is essential for the assisted diagnosis of lung cancer. In recent years, deep learning researchers have proposed a number of improved lung parenchyma segmentation methods based on U-Net. However, existing methods ignore the complementary fusion of semantic information between feature maps at different layers and fail to distinguish the importance of different spatial positions and channels in a feature map. To solve this problem, this paper proposes the double scale parallel attention (DSPA) network (DSPA-Net), which introduces a DSPA module and an atrous spatial pyramid pooling (ASPP) module into the encoder-decoder structure. The DSPA module aggregates the semantic information of feature maps at different levels while obtaining accurate spatial and channel information with the help of collaborative attention (CA). The ASPP module uses multiple parallel convolution kernels with different dilation rates to obtain feature maps that contain multi-scale information under different receptive fields. The two modules address multi-scale information processing across feature maps of different levels and within feature maps of the same level, respectively. Experiments were conducted on a Kaggle competition dataset. The results show that the architecture has clear advantages over current mainstream segmentation networks: the dice similarity coefficient (DSC) and intersection over union (IoU) reached 0.972 ± 0.002 and 0.945 ± 0.004, respectively. This paper achieves automatic and accurate segmentation of the lung parenchyma and provides a reference for the application of attention mechanisms and multi-scale information in lung parenchyma segmentation.

Keywords: Lung parenchymal segmentation, Deep learning, Collaborative attention, Multi-scale information

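Introduction

Lung cancer is one of the serious diseases that endanger human life. Studies have shown that early detection and treatment effectively improve the survival rate of patients with lung cancer. Computed tomography (CT) of the lungs is the main tool for early clinical diagnosis, and segmentation of the lung parenchyma is essential for assisted diagnosis. However, because other tissues, air and similar factors in lung CT images interfere with the task, fast and accurate segmentation of the lung parenchyma remains both a focus and a difficulty in research on the assisted diagnosis of lung cancer.

The convolutional neural network (CNN) is a common network model based on natural visual cognition that can directly apply visual regularities to represent images effectively. Classical CNN architectures include AlexNet, the visual geometry group (VGG) network, the Inception-based deep neural network, the residual network (ResNet), U-Net and the segmentation network (SegNet)^[1-6]. For lung parenchyma segmentation with CNNs, Shaziya et al.^[7] used a U-Net-based network to segment the lung parenchyma automatically in thoracic CT scans. Gu et al.^[8] designed a multi-scale prediction network and applied it to lung region segmentation in chest CT images. Gholamiankhah et al.^[9] used residual models in the network to segment lung regions automatically. In recent years, a growing number of attention mechanisms have been proposed and applied to various network structures^[10-11], for example by adopting different spatial attention mechanisms or designing advanced attention blocks^[12-15], or by using non-local mechanisms to capture different types of spatial information^[16-21]. Meanwhile, incorporating multi-scale information into networks has also attracted attention^[22-24]; the main idea is to improve the feature representation ability of the model by fusing high-level and low-level features, or by extracting multi-scale information with convolution kernels of different receptive field sizes^[25-27]. Although these methods improve segmentation accuracy, they commonly share the following problems: ① they ignore the complementary fusion of semantic information between feature maps at different levels and cannot distinguish the importance of different spatial positions and channels in a feature map; ② the attention mechanisms are computationally expensive and carry redundant parameters; ③ they do not extract information at different scales from the same feature map.

To address these problems, this paper proposes a U-Net-based double scale parallel attention (DSPA) network (DSPA-Net). The DSPA module enables the network to distinguish the importance of different spatial positions and channels in a feature map and aggregates the semantic information of feature maps at different levels. The collaborative attention (CA)^[28] that it introduces captures spatial and channel dependencies through convolution operations, which effectively reduces computation and parameter count compared with fully connected activation operations. An ASPP module is added at the bottleneck layer; it uses a group of atrous convolutions with different sampling rates to obtain feature maps that contain multi-scale information under different receptive fields at the same level. With these three improvements, the network segments the lung parenchyma accurately and efficiently, and provides a reference for applying and improving attention mechanisms and multi-scale information in lung parenchyma segmentation.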

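1. Methods

1.1. Network architecture

This paper uses an improved U-Net for lung parenchyma segmentation. The new network structure includes the DSPA module and the ASPP module, as shown in Fig. 1. To fuse the spatial information in low-level feature maps effectively with the semantic information in high-level feature maps, a DSPA module is added at each skip connection. During complementary feature fusion, CA first obtains precise position information and then encodes the channel relationships and long-range dependencies of the feature maps at the two scales. The ASPP module extracts and fuses semantic information under receptive fields of different scales; the richer the semantic information, the larger the gain in network performance. Because the feature map output by the U-Net encoder is rich in semantic information, the ASPP module is added after the bottleneck layer to further strengthen the feature representation ability of the network.

1.2. DSPA module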

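To handle multi-scale information across feature maps at different levels, this paper proposes the DSPA module, shown in Fig. 2, which extracts features in parallel at two scales, high-dimensional and low-dimensional. Taking the low-level feature map $X_{low} \in \mathbb{R}^{C \times H \times W}$ as an example (C, H and W denote the number of channels, the height and the width of the feature map), the feature map is average-pooled along the horizontal and the vertical direction to obtain $Z^{h}_{low}$ and $Z^{w}_{low}$, as in Eqs. (1)–(2):

$$Z^{h}_{low}(c, h) = \frac{1}{W} \sum_{0 \le i < W} X_{low}(c, h, i) \tag{1}$$

$$Z^{w}_{low}(c, w) = \frac{1}{H} \sum_{0 \le j < H} X_{low}(c, j, w) \tag{2}$$

where X_low is the input feature map; $Z^{h}_{low}$ is the output feature map obtained by average pooling of the input along the horizontal direction; $Z^{w}_{low}$ is the output feature map obtained by average pooling of the input along the vertical direction; c, h and w are coordinate values in the feature map and give the position of a feature value.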
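The directional pooling of Eqs. (1)–(2) can be sketched in a few lines of NumPy. This is only an illustration under assumed names (`directional_pool`, `z_h`, `z_w` are hypothetical and not taken from the paper); it operates on a single feature map of shape (C, H, W).

```python
import numpy as np

def directional_pool(x):
    """Average-pool a (C, H, W) feature map along each spatial direction."""
    z_h = x.mean(axis=2, keepdims=True)  # (C, H, 1): each row averaged over the width
    z_w = x.mean(axis=1, keepdims=True)  # (C, 1, W): each column averaged over the height
    return z_h, z_w

# Toy feature map with 2 channels, height 3 and width 4.
x_low = np.arange(24, dtype=float).reshape(2, 3, 4)
z_h, z_w = directional_pool(x_low)
print(z_h.shape, z_w.shape)  # (2, 3, 1) (2, 1, 4)
```

Each entry of `z_h` summarises one row and each entry of `z_w` one column, so the two descriptors keep position along one axis while aggregating along the other.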

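After $Z^{h}_{low}$ is transposed, it is concatenated with $Z^{w}_{low}$ along the spatial dimension to give the tensor $F_{low}$, as in Eq. (3):

$$F_{low} = \mathrm{Concat}\left( (Z^{h}_{low})^{T},\ Z^{w}_{low} \right) \tag{3}$$

where $(Z^{h}_{low})^{T}$ denotes the transpose of $Z^{h}_{low}$ in the spatial dimensions. F_low then passes through a 1 × 1 convolution, batch normalization and activation to give the tensor $F'_{low}$, as in Eq. (4):

$$F'_{low} = \Delta\left( \delta\left( \mathrm{Conv}(F_{low}) \right) \right) \tag{4}$$

where Conv(·) denotes convolution, as in Eqs. (5)–(6); δ(·) denotes batch normalization, as in Eqs. (7)–(10); Δ(·) denotes the sigmoid activation function, as in Eq. (11).

$$F_j(x, y) = \sum_{k} \sum_{u} \sum_{v} F^{k}_{low}(x + u, y + v)\, W_k(u, v) \tag{5}$$

$$W_k = \left[\, w_{11} \,\right] \tag{6}$$

where F_j is the j-th feature map of the output set x produced by the convolution (0 ≤ j < C); (x, y) are the coordinates of a pixel of the input feature map (x = 0, 0 ≤ y < H + W); (u, v) are the coordinates of the kernel centre (u = 0, 0 ≤ v < H + W); F_low(x + u, y + v) is the pixel value at the corresponding coordinate, taken in the k-th input channel; W_k(u, v) is the weight of the k-th layer of the 1 × 1 kernel at the corresponding coordinate; w₁₁ is the weight of the kernel W_k.

$$\mu = \frac{1}{m} \sum_{i=1}^{m} x_i \tag{7}$$

$$\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu)^2 \tag{8}$$

$$\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \varepsilon}} \tag{9}$$

$$F_{\delta} = \gamma \hat{x} + \beta \tag{10}$$

$$\Delta(x) = \frac{1}{1 + e^{-x}} \tag{11}$$

where x is the set of input maps before batch normalization; m is the number of samples in the current input set; μ is the mean of x; σ² is the variance of x; $\hat{x}$ is the feature map obtained by standardizing x; ε is a small positive number that prevents a zero denominator; γ and β are two learnable parameters; F_δ is the set of output maps produced by batch normalization.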
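A rough NumPy sketch of the concatenation, 1 × 1 convolution, batch normalization and sigmoid steps referenced as Eqs. (3)–(11) follows. All names (`encode_directions`, `weight`, `gamma`, `beta`) are hypothetical, and one simplification is assumed: with a single sample, the normalization statistics are computed per channel over the spatial positions rather than over a mini-batch.

```python
import numpy as np

def encode_directions(z_h, z_w, weight, gamma=1.0, beta=0.0, eps=1e-5):
    """z_h: (C, H, 1), z_w: (C, 1, W), weight: (C_out, C) for the 1x1 convolution."""
    # Transpose the row descriptor and join both descriptors along the spatial axis.
    f = np.concatenate([z_h.transpose(0, 2, 1), z_w], axis=2)   # (C, 1, H + W)
    # A 1x1 convolution mixes channels independently at every position.
    f = np.einsum("oc,chw->ohw", weight, f)                     # (C_out, 1, H + W)
    # Normalise each channel (assumption: statistics over positions), then scale and shift.
    mu = f.mean(axis=(1, 2), keepdims=True)
    var = f.var(axis=(1, 2), keepdims=True)
    f = gamma * (f - mu) / np.sqrt(var + eps) + beta
    # Sigmoid activation.
    return 1.0 / (1.0 + np.exp(-f))

rng = np.random.default_rng(0)
z_h = rng.normal(size=(4, 6, 1))
z_w = rng.normal(size=(4, 1, 5))
weight = rng.normal(size=(2, 4))
encoded = encode_directions(z_h, z_w, weight)
print(encoded.shape)  # (2, 1, 11)
```

Because the two descriptors share one convolution after concatenation, information from the horizontal and vertical directions is mixed before the tensor is split again.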

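The tensor $F'_{low}$ from Eq. (4) is split along the spatial dimension and transposed to give the tensors $F^{h}_{low}$ and $F^{w}_{low}$. A 1 × 1 convolution applied to each gives the double scale parallel feature maps in the two spatial directions, $G^{h}_{low}$ and $G^{w}_{low}$. Multiplying $G^{h}_{low}$ and $G^{w}_{low}$ with the original low-level feature map X_low yields the feature map $L'$, which has the same size as X_low, as in Eq. (12):

$$L' = X_{low} \otimes G^{h}_{low} \otimes G^{w}_{low} \tag{12}$$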
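The split-and-reweight step of Eq. (12) might look as follows in NumPy. The sketch assumes that the product is an element-wise product with broadcasting, as in coordinate attention^[28]; that reading, and the names `reweight`, `w_h` and `w_w`, are assumptions rather than details confirmed by the text.

```python
import numpy as np

def reweight(x, encoded, w_h, w_w):
    """x: (C, H, W); encoded: (C_mid, 1, H + W); w_h, w_w: (C, C_mid) 1x1 kernels."""
    height = x.shape[1]
    # Split the joint tensor back into its horizontal and vertical parts.
    f_h, f_w = encoded[:, :, :height], encoded[:, :, height:]
    # One 1x1 convolution per direction; the horizontal part is transposed back.
    g_h = np.einsum("oc,chw->ohw", w_h, f_h).transpose(0, 2, 1)  # (C, H, 1)
    g_w = np.einsum("oc,chw->ohw", w_w, f_w)                     # (C, 1, W)
    # Broadcasting applies the row weight and the column weight to every pixel.
    return x * g_h * g_w

x = np.arange(24, dtype=float).reshape(2, 3, 4)
encoded = np.full((2, 1, 7), 0.5)   # toy attention values
identity = np.eye(2)                # toy 1x1 kernels that leave channels unchanged
out = reweight(x, encoded, identity, identity)
print(out.shape)  # (2, 3, 4)
```

With constant weights of 0.5 in both directions, every pixel is scaled by 0.25, which makes the broadcasting easy to check by hand.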

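Similarly, for the high-level feature map $X_{high}$, the same operations as in Eqs. (1)–(12) give $\hat{X}_{high}$, which then passes through a 3 × 3 convolution and upsampling to give $H'$, as in Eq. (13):

$$H' = F_{bli}\left( \mathrm{Conv}(\hat{X}_{high}) \right) \tag{13}$$

where the weight matrix W_k of the convolution Conv(·) is given by Eq. (14), and w₁₁–w₃₃ are the weights of W_k:

$$W_k = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix} \tag{14}$$

In Eq. (13), F_bli(·) denotes bilinear interpolation. For a destination pixel, the inverse coordinate transform gives floating-point coordinates (x + u, y + v), where x and y are non-negative integers and u and v are floating-point numbers in [0, 1). The value of this pixel is determined by the four neighbouring pixels of the original image at (x, y), (x + 1, y), (x, y + 1) and (x + 1, y + 1), as in Eq. (15):

$$f(x + u, y + v) = (1 - u)(1 - v) f(x, y) + u (1 - v) f(x + 1, y) + (1 - u) v f(x, y + 1) + u v f(x + 1, y + 1) \tag{15}$$

where f(x, y) is the pixel value at (x, y) of the input feature map before bilinear interpolation.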
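The four-neighbour weighting of bilinear interpolation referenced as Eq. (15) can be worked through on a 2 × 2 patch. The helper name `bilinear` and the toy values are illustrative only; the patch is indexed as `f[x][y]` to mirror the notation f(x, y).

```python
def bilinear(f, x, y, u, v):
    """Interpolate f at (x + u, y + v) with 0 <= u, v < 1 from its four neighbours."""
    return ((1 - u) * (1 - v) * f[x][y]
            + u * (1 - v) * f[x + 1][y]
            + (1 - u) * v * f[x][y + 1]
            + u * v * f[x + 1][y + 1])

# f(0,0)=0, f(0,1)=1, f(1,0)=2, f(1,1)=3
patch = [[0.0, 1.0],
         [2.0, 3.0]]
centre = bilinear(patch, 0, 0, 0.5, 0.5)   # equal weights of 0.25 on all four pixels
edge = bilinear(patch, 0, 0, 0.5, 0.0)     # halfway between f(0,0) and f(1,0)
print(centre, edge)  # 1.5 1.0
```

The four weights always sum to one, so interpolation never pushes a value outside the range of its neighbours.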

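The final output $F_{out}$ is obtained by multiplying $L'$ and H′ pixel by pixel.

The high-level feature map X_high carries more semantic information and less spatial information, whereas the low-level feature map X_low is the opposite; combining the two after CA processing lets each compensate for the other, as shown in Fig. 2. Inside the DSPA module, CA applies horizontal and vertical pooling to the feature map and aggregates features along the two spatial directions separately, which overcomes the single pooling scheme of conventional channel attention. The DSPA module captures long-range dependencies along the horizontal direction while preserving precise position information along the vertical direction. The resulting direction- and position-sensitive attention maps are fused with the input feature map to strengthen the feature representation ability of the model. With the DSPA module in the network, the feature map F_out fed to the decoder contains less noise than X_low, the contour of the lung parenchyma is clearer, intra-class differences are markedly smaller, and contrast with other tissues is markedly higher.

To verify the effectiveness of the DSPA module, gradient-weighted class activation mapping (Grad-CAM)^[29] was applied to the outputs of the U-Net with and without the DSPA module to obtain comparative heat maps. As shown in Fig. 3, the heat maps in the same row are generated by feeding the same original dataset image into the different networks. During backpropagation the Grad-CAM heat map is computed and overlaid on the input image to give the Grad-CAM localization map, in which red marks the regions the network attends to. Comparing the heat maps of the two networks shows that red outside the lung parenchyma decreases markedly after the DSPA module is added, which indicates that the network focuses more on the target region and therefore localizes and segments the lung parenchyma better. The DSPA module thus improves network performance.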
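For readers unfamiliar with Grad-CAM^[29], the core computation is small enough to sketch in NumPy. The sketch assumes that the activations of the chosen layer and the gradients of the target score with respect to them have already been extracted from the network; the function name `grad_cam` and the toy arrays are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """activations, gradients: (K, H, W) arrays for K feature maps of one layer."""
    # Channel weights: global average of the gradients of each feature map.
    alpha = gradients.mean(axis=(1, 2))                 # (K,)
    # Weighted sum of the feature maps, keeping only positive evidence.
    cam = np.tensordot(alpha, activations, axes=1)      # (H, W)
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam              # scaled to [0, 1] for display

activations = np.array([[[1.0, 2.0], [3.0, 4.0]],
                        [[1.0, 1.0], [1.0, 1.0]]])
gradients = np.array([[[1.0, 1.0], [1.0, 1.0]],
                      [[-2.0, -2.0], [-2.0, -2.0]]])
heat = grad_cam(activations, gradients)
print(heat)
```

The normalized map is what would be colour-mapped and overlaid on the input image; the overlay itself is omitted here.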

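1.3. ASPP module

To handle multi-scale information within feature maps at the same level, this paper adopts the ASPP module from DeepLabv3+^[30], which uses atrous convolution to enlarge the receptive field of the kernel and builds a pyramid structure from convolutions with different dilation rates. As shown in Fig. 4, the ASPP module consists of one 1 × 1 convolution, three 3 × 3 convolutions (dilation rates 6, 12 and 18) and a global average pooling operation, with batch normalization after each parallel convolution layer. The output feature map of an atrous convolution is defined in Eq. (16):

$$y(i, j) = \sum_{u=0}^{H-1} \sum_{v=0}^{W-1} x(i + ar \cdot u,\ j + ar \cdot v)\, W(u, v) \tag{16}$$

where H and W are the height and width of the input image (or of the feature map from the previous layer); x(i, j) is the pixel value at position (i, j) of the input image; ar is the dilation rate; y(i, j) is the output of the atrous convolution applied to the input image; (u, v) are the coordinates of the kernel centre (0 ≤ u < H, 0 ≤ v < W); W(u, v) is the kernel weight at the corresponding coordinate.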
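A single-channel atrous convolution can be written directly from the idea behind Eq. (16): the kernel taps are spaced `rate` pixels apart. The sketch below assumes zero padding so that the output keeps the input size and centres the kernel on each pixel; the function name `atrous_conv2d` is hypothetical.

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """Single-channel atrous convolution with zero padding; x: (H, W), kernel: (kh, kw)."""
    kh, kw = kernel.shape
    pad_h, pad_w = rate * (kh // 2), rate * (kw // 2)
    padded = np.pad(x, ((pad_h, pad_h), (pad_w, pad_w)))
    height, width = x.shape
    out = np.zeros((height, width), dtype=float)
    for u in range(kh):
        for v in range(kw):
            # Tap (u, v) reads the input shifted by rate * (u, v) pixels.
            out += kernel[u, v] * padded[u * rate:u * rate + height,
                                         v * rate:v * rate + width]
    return out

# A single bright pixel shows where the nine taps of a 3x3 kernel land.
x = np.zeros((7, 7))
x[3, 3] = 1.0
out = atrous_conv2d(x, np.ones((3, 3)), rate=2)
print(np.argwhere(out > 0).tolist())
```

A 3 × 3 kernel with dilation rate r spans r × 2 + 1 pixels per side, so the rates 6, 12 and 18 cover 13, 25 and 37 pixels with the same nine weights.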

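The output of global average pooling is defined in Eq. (17):

$$\bar{X}(c, 1, 1) = \frac{1}{H \times W} \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} X(c, i, j) \tag{17}$$

where X(c, i, j) is the pixel value at the corresponding coordinate in the c-th channel of the input feature map X, and $\bar{X}(c, 1, 1)$ is the output feature map.

The five paths of the ASPP module provide receptive fields of different sizes: convolution layers with large dilation rates supply more global contextual features, and those with small dilation rates supply detail. The feature maps obtained under the different receptive fields constitute the multi-scale information of the feature map at that level. The first four convolutions leave the spatial dimensions unchanged and change only the channel dimension, giving X₁, X₂, X₃ and X₄; the feature map $\bar{X}$ produced by global average pooling is upsampled by bilinear interpolation back to the spatial size of the input feature map, giving X₅. The feature maps from all branches are concatenated along the channel dimension, as in Eq. (18):

$$X_{cat} = \mathrm{Concat}(X_1, X_2, X_3, X_4, X_5), \quad X_i \in \mathbb{R}^{C \times H \times W} \tag{18}$$

where C, H and W denote the number of channels, the height and the width of a feature map. After concatenation, a 1 × 1 convolution compresses the feature map to the specified dimension and, along the channel dimension, lets the multi-scale information of the same-level feature map interact.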
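The merge step of ASPP (global average pooling, upsampling, channel concatenation and the final 1 × 1 convolution) can be sketched as below. Several things are assumptions made for brevity: the four convolution branches are passed in as precomputed arrays, no extra convolution is applied to the pooled branch, and the names `aspp_merge` and `proj` are hypothetical. Bilinear upsampling of a 1 × 1 map produces a constant map, so it is written as a broadcast.

```python
import numpy as np

def aspp_merge(branches, x, proj):
    """branches: list of (C_i, H, W) arrays; x: (C, H, W) input; proj: (C_out, C_total)."""
    channels, height, width = x.shape
    # Global average pooling gives one value per channel.
    pooled = x.mean(axis=(1, 2), keepdims=True)                 # (C, 1, 1)
    # Upsampling a 1x1 map bilinearly yields a constant map of the target size.
    x5 = np.broadcast_to(pooled, (channels, height, width))
    # Concatenate all branches along the channel axis.
    cat = np.concatenate(list(branches) + [x5], axis=0)
    # The final 1x1 convolution compresses the channels and mixes the scales.
    return np.einsum("oc,chw->ohw", proj, cat)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 4))
branches = [rng.normal(size=(2, 4, 4)) for _ in range(4)]   # stand-ins for X1..X4
proj = rng.normal(size=(3, 10))                             # 4 * 2 + 2 = 10 input channels
merged = aspp_merge(branches, x, proj)
print(merged.shape)  # (3, 4, 4)
```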

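Fig. 5 compares the features extracted from the last encoder layer by the ASPP module and by an ordinary convolution. When features are extracted from the encoder output, the ordinary convolution yields feature maps that retain most low-level features of the original feature map, such as shape, contour and colour. The ASPP module instead further extracts the high-level features in the deep feature maps and yields feature maps rich in semantic information, which strengthens the feature extraction ability of the network and supplies more abstract, high-level features to the decoding path.

2. Experiments and results

2.1. Data source

Experiments were carried out on the Kaggle competition dataset Finding and Measuring Lungs in CT Data (FML-CT) (https://www.kaggle.com/kmader/finding-lungs-in-ct-data). Of its images, 200 were assigned to the training set, 37 to the validation set and 30 to the test set. To improve generalization and robustness, the dataset was augmented with horizontal flips, vertical flips and rotations, and during training the input images were centre-cropped to 368 × 368 pixels to speed up training. The operating system was Ubuntu 18.04 (open source, USA) and the graphics processing unit (GPU) was a Quadro RTX 8000 (NVIDIA, USA). The optimizer was adaptive moment estimation (Adam) with an initial learning rate of 0.0001, halved every three training epochs; the maximum number of training epochs was 40, and early stopping was used to prevent overfitting. The deep learning framework was PyTorch 1.8.1 (Facebook, USA).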
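The learning-rate schedule and the centre crop described above can be made concrete with two small helpers. This is a sketch under assumptions: epochs are counted from zero, the halving takes effect at epochs 3, 6, 9 and so on, and the 512 × 512 input size in the example is hypothetical. The function names are illustrative.

```python
def learning_rate(epoch, base_lr=1e-4, step=3, factor=0.5):
    """Step decay: the rate is multiplied by `factor` every `step` epochs."""
    return base_lr * factor ** (epoch // step)

def center_crop_box(height, width, size=368):
    """Return (top, left, bottom, right) of a centred size x size crop."""
    top = (height - size) // 2
    left = (width - size) // 2
    return top, left, top + size, left + size

for epoch in (0, 2, 3, 6, 39):
    print(epoch, learning_rate(epoch))

crop = center_crop_box(512, 512)   # assumed input size, for illustration only
print(crop)
```

Over at most 40 epochs the rate is halved 13 times under these assumptions, so it ends roughly four orders of magnitude below its starting value.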

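2.2. Metrics

To evaluate DSPA-Net, this paper uses the dice similarity coefficient (DSC), intersection over union (IoU), volumetric overlap error (VOE), relative volume difference (RVD), specificity (SP), sensitivity (SE) and precision (PC) as segmentation metrics, computed as in Eqs. (19)–(25).

Here A is the lung parenchyma segmentation predicted by the network, B is the true lung parenchyma region, and the intersection of A and B is the correctly segmented lung parenchyma. True negative (TN) is the number of pixels correctly classified as non-parenchyma, true positive (TP) the number correctly classified as parenchyma, false negative (FN) the number incorrectly classified as non-parenchyma and false positive (FP) the number incorrectly classified as parenchyma. DSC ∈ (0, 1) is a measure commonly used in medical image segmentation to evaluate the similarity between the segmentation and the ground truth; IoU ∈ (0, 1) is the overlap ratio between the predicted label image and the ground-truth label image. The closer DSC and IoU are to 1, the better the segmentation; the closer VOE and RVD are to 0, the better; SP, SE and PC ∈ (0, 1), and the closer they are to 1, the better.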
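Five of the seven metrics have standard definitions in terms of TP, TN, FP and FN, and can be computed from a pair of binary masks as sketched below. VOE and RVD are left out because their exact definitions are not reproduced in this text. The function name `overlap_metrics` and the toy masks are illustrative, and the sketch assumes that neither mask is empty (otherwise a denominator may be zero).

```python
import numpy as np

def overlap_metrics(pred, truth):
    """Standard overlap metrics for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))     # parenchyma predicted as parenchyma
    tn = int(np.sum(~pred & ~truth))   # background predicted as background
    fp = int(np.sum(pred & ~truth))    # background predicted as parenchyma
    fn = int(np.sum(~pred & truth))    # parenchyma predicted as background
    return {
        "DSC": 2 * tp / (2 * tp + fp + fn),
        "IoU": tp / (tp + fp + fn),
        "SP": tn / (tn + fp),
        "SE": tp / (tp + fn),
        "PC": tp / (tp + fp),
    }

pred = [[1, 1, 0, 0],
        [1, 0, 0, 0]]
truth = [[1, 1, 1, 0],
         [0, 0, 0, 0]]
metrics = overlap_metrics(pred, truth)
print(metrics)
```

DSC and IoU are tied by DSC = 2·IoU / (1 + IoU); for instance an IoU of 0.945 corresponds to a DSC of about 0.972, which matches the pair of values reported for DSPA-Net.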

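2.3. Ablation experiments

Under identical experimental conditions, U-Net was taken as the baseline network, and the DSPA and ASPP modules were added in turn for the ablation experiments. As shown in Table 1, adding the DSPA module raised IoU and DSC by 4.9 and 2.8 percentage points over the baseline. Adding the ASPP module raised IoU and DSC by 5.4 and 3.0 percentage points. These results demonstrate that both modules are effective. DSPA-Net reached an IoU of 94.5% and a DSC of 97.2%, which are 6.6 and 3.7 percentage points above the baseline and show that DSPA-Net effectively improves segmentation accuracy.

Table 1. Ablation experiment (mean ± standard deviation)

Method	DSC	IoU	VOE	RVD	SP	SE	PC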
U-Net	0.935 ± 0.007	0.879 ± 0.012	0.065 ± 0.015	0.067 ± 0.017	0.972 ± 0.010	0.957 ± 0.017	0.916 ± 0.030
U-Net+DSPA	0.963 ± 0.005	0.928 ± 0.009	0.038 ± 0.020	0.039 ± 0.021	0.982 ± 0.005	0.981 ± 0.005	0.945 ± 0.014
U-Net+ASPP	0.965 ± 0.005	0.933 ± 0.010	0.017 ± 0.009	0.018 ± 0.009	0.986 ± 0.003	0.974 ± 0.006	0.957 ± 0.008
DSPA-Net	0.972 ± 0.002	0.945 ± 0.004	0.012 ± 0.002	0.012 ± 0.002	0.989 ± 0.001	0.977 ± 0.001	0.966 ± 0.003
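The gains quoted for the ablation can be re-derived from the mean DSC and IoU values in Table 1. The short check below copies those means and expresses each difference from U-Net in percentage points; the variable names are illustrative.

```python
# Mean (DSC, IoU) values copied from Table 1.
scores = {
    "U-Net": (0.935, 0.879),
    "U-Net+DSPA": (0.963, 0.928),
    "U-Net+ASPP": (0.965, 0.933),
    "DSPA-Net": (0.972, 0.945),
}

base_dsc, base_iou = scores["U-Net"]
gains = {
    name: (round((dsc - base_dsc) * 100, 1), round((iou - base_iou) * 100, 1))
    for name, (dsc, iou) in scores.items()
    if name != "U-Net"
}
for name, (dsc_gain, iou_gain) in gains.items():
    print(f"{name}: DSC +{dsc_gain} points, IoU +{iou_gain} points")
```

The differences are absolute (percentage points), not relative improvements over the baseline score.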


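2.4. Comparison experiments

To verify the advantage of the proposed network, DSPA-Net was compared with commonly used semantic segmentation networks: U-Net, SegNet, U-Net++, U-Net3+ and DeepLabv3+. The results are shown in Table 2.

Table 2. Comparison of segmentation results evaluation of different models (mean ± standard deviation)

Method	DSC	IoU	VOE	RVD	SP	SE	PC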
U-Net	0.935 ± 0.007	0.879 ± 0.012	0.065 ± 0.015	0.067 ± 0.017	0.972 ± 0.010	0.957 ± 0.017	0.916 ± 0.030
SegNet	0.909 ± 0.009	0.834 ± 0.016	0.123 ± 0.056	0.134 ± 0.061	0.948 ± 0.014	0.969 ± 0.019	0.858 ± 0.032
U-Net++	0.930 ± 0.012	0.870 ± 0.021	0.057 ± 0.039	0.060 ± 0.043	0.968 ± 0.011	0.958 ± 0.007	0.905 ± 0.028
U-Net3+	0.912 ± 0.013	0.838 ± 0.021	0.132 ± 0.028	0.334 ± 0.316	0.947 ± 0.010	0.976 ± 0.003	0.856 ± 0.023
DeepLabv3+	0.937 ± 0.003	0.881 ± 0.006	0.060 ± 0.017	0.063 ± 0.019	0.969 ± 0.004	0.966 ± 0.006	0.910 ± 0.010
DSPA-Net	0.972 ± 0.002	0.945 ± 0.004	0.012 ± 0.002	0.012 ± 0.002	0.989 ± 0.001	0.977 ± 0.001	0.966 ± 0.003


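Table 2 shows that DSPA-Net outperforms the other networks on all seven metrics. In terms of parameters, U-Net, the most classical network in medical image segmentation, has 17.27 M parameters, whereas DSPA-Net and DeepLabv3+ have 28.83 M and 54.94 M, respectively. DSPA-Net therefore improves performance while introducing fewer parameters than DeepLabv3+, which again shows its advantage in lung parenchyma segmentation.

In addition, the segmentation results of the six methods on six lung CT images (images 1–6) are shown in Fig. 6. Each row corresponds to one of the six lung CT images; the columns show, in order, the preprocessed image, the ground truth, and the results of U-Net, SegNet, U-Net++, U-Net3+, DeepLabv3+ and the proposed DSPA-Net. The preprocessed images were obtained by converting the original 16-bit images to 24-bit images and applying data augmentation such as flipping and rotation. In the segmentation results, red marks under-segmented regions, where true lung parenchyma was wrongly classified as non-parenchyma, and green marks over-segmented regions, where non-parenchyma was wrongly classified as parenchyma. As Fig. 6 shows, DSPA-Net has markedly smaller green over-segmented areas and red under-segmented areas around the lung parenchyma, and markedly smaller green mis-segmented areas in the background, than the other advanced methods, so its results are more precise. The proposed method therefore further improves the accuracy of automatic lung parenchyma segmentation relative to the other methods.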
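The red and green colour coding used for Fig. 6 can be reproduced from a predicted mask and a ground-truth mask as sketched below. The colours for the two error types follow the description above; drawing correctly segmented parenchyma in white and background in black is an assumption, as is the function name `error_overlay`.

```python
import numpy as np

def error_overlay(pred, truth):
    """Colour-code a binary prediction against the ground truth as an RGB image."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    rgb = np.zeros(pred.shape + (3,), dtype=np.uint8)   # background stays black
    rgb[pred & truth] = (255, 255, 255)                 # correct parenchyma (assumed white)
    rgb[~pred & truth] = (255, 0, 0)                    # under-segmentation: red
    rgb[pred & ~truth] = (0, 255, 0)                    # over-segmentation: green
    return rgb

pred = [[1, 1, 0, 0],
        [1, 0, 0, 0]]
truth = [[1, 1, 1, 0],
         [0, 0, 0, 0]]
overlay = error_overlay(pred, truth)
print(overlay.shape)  # (2, 4, 3)
```

Counting the red and green pixels of such an overlay gives FN and FP directly, which links the visual comparison to the metrics of Section 2.2.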

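3. Discussion

This study adds the DSPA module to the basic framework of the conventional U-Net and extracts features in parallel at two scales, high-dimensional and low-dimensional, which addresses multi-scale information processing across feature maps of different levels. In the initial design, only the CA module was used to process the feature maps in the skip connections. Compared with other attention mechanisms, CA has the advantage of embedding position information into channel attention, which helps the network locate the target of interest more accurately. Experiments showed, however, that adding CA alone did not perform well. After further analysis of the structure and ideas of related feature fusion networks, the multi-scale information processing mechanism was combined with CA to form the DSPA module, so that the network fuses the semantic information of feature maps at different levels in a complementary way and distinguishes the importance of different spatial positions and channels in a feature map. The experimental results show that adding the multi-scale information processing mechanism and the parallel attention mechanism to the skip connections effectively strengthens inter-class inconsistency and intra-class consistency, so that the lung parenchyma is separated more clearly from other tissue structures, over-segmentation is avoided and mis-segmentation inside the lung parenchyma is reduced. In addition, the ASPP module introduced after the bottleneck layer of U-Net gives the network the ability to process multi-scale information within feature maps of the same level. The effective combination of the DSPA and ASPP modules improves the segmentation performance of the network.

4. Conclusion

This paper designs a new network architecture, DSPA-Net, and applies it to the segmentation of the lung parenchyma in lung CT images. The network adds the DSPA module to the skip connections of U-Net, which addresses multi-scale information processing across feature maps of different levels, and the CA module it introduces uses convolution and activation operations that greatly reduce the parameter count and computation of conventional attention mechanisms. An ASPP module added at the bottleneck layer of U-Net uses a group of atrous convolutions with different sampling rates to obtain feature maps containing multi-scale information under different receptive fields, which addresses multi-scale information processing within feature maps of the same level. Experiments on the FML-CT dataset demonstrate the advantage of the proposed network in lung parenchyma segmentation, and its segmentation results can serve as a reference in the diagnosis and treatment of lung cancer. DSPA-Net also has limitations. This paper only segments the lung parenchyma; in follow-up work the group will study how DSPA-Net performs on lung nodule segmentation once the parenchyma contour has been segmented, to exploit the potential of DSPA-Net as far as possible. Moreover, DSPA-Net does not use the 3D information in CT images, which is also essential for segmenting the lung parenchyma and lung nodules. In future research the group will consider adding the 3D information of CT images to DSPA-Net and explore its application to more medical image segmentation tasks.

Declarations

Conflict of interest: all authors declare that they have no conflict of interest.

Author contributions: Kaili FENG was responsible for data processing and analysis, algorithm design and development, and writing and revising the paper; Yanlin WU contributed to the figures and tables; Guanglei WANG provided experimental guidance and reviewed and revised the paper; Lili REN, Yan LI and Hongrui WANG provided guidance on the paper.

Ethics statement: all experimental data used in this paper come from public databases and involve no ethical issues.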

Funding Statement

National Natural Science Foundation of China (61473112); Key Project of the Natural Science Foundation of Hebei Province (F2017201222)

References

1. Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012;25:1097–1105.
2. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv. 2014:1409.1556.
3. Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions//2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 1-9.
4. He Kaiming, Zhang Xiangyu, Ren Shaoqing, et al. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016;90:770–778.
5. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation//International Conference on Medical image computing and computer-assisted intervention, Springer Cham, 2015: 234-241.
6. Badrinarayanan V, Kendall A, Cipolla R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 2017;39(12):2481–2495. doi: 10.1109/TPAMI.2016.2644615.
7. Shaziya H, Shyamala K, Zaheer R. Automatic lung segmentation on thoracic CT scans using U-Net convolutional network//2018 International Conference on Communication and Signal Processing (ICCSP), 2018: 0643-0647.
8. Gu Yuchong, Lai Yaoming, Xie Peiliang, et al. Multi-scale prediction network for lung segmentation//2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), IEEE, 2019: 438-442.
9. Gholamiankhah F, Mostafapour S, Goushbolagh N A, et al. Automated lung segmentation from CT images of normal and COVID-19 pneumonia patients. arXiv preprint arXiv. 2021:2104.02042. doi: 10.30476/IJMS.2022.90791.2178.
10. Hu Jie, Shen Li, Sun Gang. Squeeze-and-excitation networks//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7132-7141.
11. Woo S, Park J, Lee J Y, et al. Cbam: convolutional block attention module//Proceedings of the European conference on computer vision (ECCV), 2018: 3-19.
12. Hu Jie, Shen Li, Albanie S, et al. Gather-excite: exploiting feature context in convolutional neural networks. NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018:9423–9433.
13. Linsley D, Scheibler D, Eberhardt S, et al. Global-and-local attention networks for visual recognition. Benefits. 2018;64:01.
14. Tay C P, Roy S, Yap K H. AANet: attribute attention network for person re-identifications//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: 7127-7136.
15. Misra D, Nalamada T, Arasanipalai A U, et al. Rotate to attend: convolutional triplet attention module//2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021: 3138-3147.
16. Wang Xiaolong, Girshick R, Gupta A, et al. Non-local Neural Networks//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7794-7803.
17. Cao Yue, Xu Jiarui, Lin S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond//2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019: 1971-1980.
18. Chen Y, Kalantidis Y, Li J, et al. A2-nets: Double attention networks. Advances in neural information processing systems. 2018:350–359.
19. Liu Jiangjiang, Hou Qibin, Cheng Mingming, et al. Improving convolutional networks with self-calibrated convolutions//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 10093-10102.
20. Gao Zilin, Xie Jiangtao, Wang Qilong, et al. Global second-order pooling convolutional networks//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: 3019-3028.
21. Huang Zilong, Wang Xinggang, Huang Lichao, et al. CCNet: criss-cross attention for semantic segmentation//2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 603-612.
22. Fang X, Yan P. Multi-organ segmentation over partially labeled datasets with multi-scale feature abstraction. IEEE Trans Med Imaging. 2020;39(11):3619–3629. doi: 10.1109/TMI.2020.3001036.
23. Fu Huazhu, Cheng Jun, Xu Yanwu, et al. Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Trans Med Imaging. 2018;37(7):1597–1605. doi: 10.1109/TMI.2018.2791488.
24. Gu Zaiwang, Cheng Jun, Fu Huazhu, et al. Ce-net: context encoder network for 2D medical image segmentation. IEEE Trans Med Imaging. 2019;38(10):2281–2292. doi: 10.1109/TMI.2019.2903562.
25. Yu Changqian, Wang Jingbo, Peng Chao, et al. Learning a discriminative feature network for semantic segmentation//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 1857-1866.
26. Zhang Z, Zhang X, Peng C, et al. Exfuse: enhancing feature fusion for semantic segmentation//Proceedings of the European Conference on Computer Vision (ECCV), 2018: 269-284.
27. Zhang Pingping, Liu Wei, Lei Yinjie, et al. Cascaded context pyramid for full-resolution 3D semantic scene completion//2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 7800-7809.
28. Hou Qibin, Zhou Daquan, Feng Jiashi. Coordinate attention for efficient mobile network design//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021: 13708-13717.
29. Selvaraju R R, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization//2017 IEEE International Conference on Computer Vision (ICCV), 2017: 618-626.
30. Chen L C, Papandreou G, Schroff F, et al. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv. 2017:1706.05587.
