Abstract
Deep convolutional neural networks (DCNNs) are vulnerable to examples with small perturbations. Improving the robustness of DCNNs is of great significance to safety-critical applications such as autonomous driving and industrial automation. Inspired by the principal way that human eyes recognize objects, i.e., largely relying on shape features, this paper first designs four learnable edge detectors as layer kernels and proposes a binary edge feature branch (BEFB) to learn binary edge features, which can be easily integrated into any popular backbone. The four edge detectors learn horizontal, vertical, positive diagonal, and negative diagonal edge features, respectively, and the branch is built by stacking multiple Sobel layers (which use the learnable edge detectors as kernels) and one threshold layer. The binary edge features learned by the branch, concatenated with the texture features learned by the backbone, are fed into the fully connected layers for classification. We integrate the proposed branch into VGG16 and ResNet34, respectively, and conduct experiments on multiple datasets. Experimental results demonstrate that BEFB is lightweight and has no side effects on training, and that the accuracy of BEFB-integrated models is better than that of the original ones under both white-box and black-box attacks. Moreover, BEFB-integrated models equipped with robustness enhancement techniques achieve better classification accuracy than the original models so equipped. The work in this paper shows for the first time that it is feasible to enhance the robustness of DCNNs by combining shape-like features and texture features.
Introduction
It is well known that deep convolutional neural networks (DCNNs) can be fooled by examples with small perturbations [1–6], which poses potential hazards for safety-critical applications, e.g., autonomous driving, airport security, and industrial automation. It is therefore of great significance to improve the robustness of DCNNs. Figs 1 and 2 show a clean example and its adversarial counterpart. The adversarial example (AE for short) in Fig 2 carries small noise that can induce DCNNs to make wrong decisions, yet this noise is easily filtered out by human eyes. Research shows that humans recognize objects primarily by relying on shape features [7,8], which is why they are not susceptible to the noise imposed in Fig 2. Inspired by this, it is natural to ask: is it possible to make DCNNs more robust by learning shape-like features?
Fig 1. Clean example.

Fig 2. Adversarial example.

Figs 3 and 4 show the thresholded edge images of Figs 1 and 2, respectively. It can be observed that the two thresholded edge images are almost the same, which indicates that thresholded shape-like features may be immune to small noise interference. However, the representational capability of thresholded shape-like features is limited; only by combining them with traditional texture features is it possible to improve the robustness of DCNNs without reducing the model's classification accuracy. Therefore, in this paper, we propose a novel neural network architecture that combines thresholded shape-like features and texture features for image classification, aiming to enhance model robustness. Traditional edge detectors, e.g., Sobel [9], DoG [10], Marr-Hildreth [11], and Canny [12], are not learnable. Considering the learnability of shape-like features, four learnable Sobel-like edge detectors are designed in this paper and taken as layer kernels, which extract horizontal, vertical, positive diagonal, and negative diagonal edge features, respectively [13,14]. Note that in [15] the definitions of four difference convolution kernels are similar to the learnable Sobel-like edge detectors in this paper; both can enhance the details learned by DCNNs, but the former is employed for dehazing images whereas the latter is leveraged for improving the robustness of DCNNs. Based on the four learnable edge detectors, a binary edge feature branch (BEFB) is proposed, which is built by stacking multiple Sobel layers and one threshold layer. The Sobel layers employ the edge detectors as kernels, and the threshold layer turns the output of the last Sobel layer into binary features. The binary features, concatenated with the texture features learned by any popular backbone, are then fed into fully connected layers for classification. To deal with the zero gradient of the threshold layer, which prevents the weights in BEFB from being updated through the chain rule, the Straight-Through Estimator (STE) [16–18] is employed. In addition, to account for the obfuscated gradient effect raised in [19–21], the zero gradient and the gradient of a center-translated sigmoid activation function are also used when generating AEs. In the experiments, we integrate BEFB into VGG16 [22] and ResNet34 [23,24], respectively, and conduct experiments on multiple datasets. The results demonstrate that BEFB has no side effects on model training, and that BEFB-integrated models achieve better accuracy than the original ones under both white-box and black-box attacks. Furthermore, we combine the BEFB-integrated models with state-of-the-art (SOTA) robustness enhancement techniques, and find that the BEFB-integrated models again achieve better accuracy than the original models.
Fig 3. Thresholded edge image of Fig 1.

Fig 4. Thresholded edge image of Fig 2.

The contributions of this paper can be summarized as follows:
Inspired by the principal way that human beings recognize objects, we design four learnable edge detectors and propose BEFB to make DCNNs more robust.
BEFB can be easily integrated into any popular backbone and has no side effects on model training. Extensive experiments on multiple datasets show BEFB-integrated models achieve better accuracy than the original models under both white-box and black-box attacks.
For the first time, we show it is feasible to make DCNNs more robust by combining shape-like features and texture features.
The organization of the paper is as follows. Section Related Work briefly reviews related work, and section Proposed Approach describes the details of BEFB. Section Experiments reports the experiments on multiple datasets, and section Discussions discusses the results. Finally, section Conclusions gives the concluding remarks.
Related work
Robustness Enhancement with Adversarial Training (AT).
AT improves the robustness of DCNNs by generating AEs as training samples during optimization. Tsipras et al. [25] found there exists a tradeoff between the robustness and standard accuracy of models produced by AT, because robust classifiers learn a different feature representation than clean classifiers. Madry et al. [26] proposed an approximate solution framework for the optimization problems of AT, and found PGD-based AT can produce models that defend themselves against first-order adversaries. Kannan et al. [27] investigated the effectiveness of AT at the scale of ImageNet [28], and proposed a logit-pairing AT method to address the tradeoff between robust accuracy and clean accuracy. Wong et al. [29] accelerated the training process by using the FGSM attack with random initialization instead of the PGD attack [30], at significantly lower cost. Xu et al. [31] proposed a novel attack method that imposes a stronger perturbation on the input images, so that models adversarially trained with this attack become more robust. Li et al. [32] revealed a link between fast-growing gradients of examples and catastrophic overfitting during robust training, and proposed a subspace AT method to mitigate the overfitting and increase robustness. Dabouei et al. [33] found the gradient norm in AT is higher than in natural training, which hinders the convergence of the outer optimization of AT, and proposed a gradient regularization method to improve the performance of AT. Zhou et al. [34] proposed an AT framework that provides detailed and quantifiable explanations along the dimensions of objects, parts, scenes, materials, textures, and colors. Eleftheriadis et al. [35] proposed a general AT framework that simultaneously considers several distance metrics and adversarial attack strategies.
Robustness Enhancement without AT. Non-AT robustness enhancement techniques can be categorized into part-based models [36,37], feature vector clustering [38,39], adversarial margin maximization [40,41], etc. Li et al. [36] argued that one reason DCNNs are vulnerable to attacks is that they are trained only on category labels, not on the part-based knowledge that humans use. They proposed an object recognition model that first segments the parts of objects, scores the segmentations based on human knowledge, and finally outputs the classification results based on the scores. This part-based model shows better robustness than classic recognition models across various attack settings. Sitawarin et al. [37] likewise argued that richer annotation information can help learn more robust features. They proposed a part segmentation model with a head classifier trained end-to-end; the model first segments objects into parts and then makes predictions based on the parts. Mustafa et al. [38] stated that making feature vectors within a class closer together and the centroids of different classes more separable can enhance the robustness of DCNNs. They added a prototype conformity loss (PCL) to the conventional loss function and designed an auxiliary function to reduce the dimensions of convolutional feature maps. Seo et al. [39] proposed a training methodology that enhances the robustness of DCNNs through a constraint that applies a class-specific differentiation to the feature space. The resulting feature representations have small intra-class variance and large inter-class variance, and the method improves adversarial robustness notably. Yan et al. [40] proposed an adversarial margin maximization method to improve DCNNs' generalization ability, employing the DeepFool attack [42] to compute the distance between an image sample and its decision boundary; this learning-based regularization enhances the robustness of DCNNs as well. Yuan et al. [43] proposed using alpha-stable noise as an alternative to Gaussian noise for data augmentation to improve the robustness of DCNNs. Amerehi et al. [44] proposed a label augmentation method to enhance the robustness of DCNNs against both common and intentional perturbations. Dong et al. [45] explored the relationship among adversarial robustness, the Lipschitz constant, and architecture parameters, and found that an appropriate constraint on architecture parameters can reduce the Lipschitz constant and thereby improve the robustness of DCNNs.
Note that the BEFB-integrated models proposed in this paper can be easily combined with the above-mentioned robustness enhancement techniques, such as AT [26] and PCL [38], and achieve better classification accuracy than the original models.
Proposed approach
In this section, we first introduce four learnable edge detectors, and then illustrate a binary edge feature branch (BEFB).
Learnable edge detectors
Inspired by the observation that humans primarily rely on shape features to recognize objects, here, we design four learnable edge detectors which can extract horizontal edge features, vertical edge features, positive diagonal edge features, and negative diagonal edge features, respectively. These four learnable edge detectors can be taken as layer kernels and are shown in Figs 5–8.
Fig 5. Horizontal edge detector.

Fig 8. Negative diagonal edge detector.

For the horizontal edge detector in Fig 5, we have
(1)
For the vertical edge detector in Fig 6, we have
Fig 6. Vertical edge detector.

(2)
For the positive diagonal edge detector in Fig 7, we have
Fig 7. Positive diagonal edge detector.

(3)
For the negative diagonal edge detector in Fig 8, we have
(4)
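The exact parameterization in Eqs (1)–(4) is determined by the kernels shown in Figs 5–8; the following is only a minimal sketch of one plausible form, assuming TensorFlow (which the experiments use): a horizontal Sobel-like kernel whose sign pattern is fixed while the two magnitudes a and b are learnable. The class name, initialization, and channel handling are illustrative assumptions, not the authors' exact definition.

```python
import tensorflow as tf

class LearnableHorizontalSobel(tf.keras.layers.Layer):
    """Hypothetical horizontal Sobel-like detector: fixed sign pattern, learnable magnitudes."""
    def __init__(self, in_channels, **kwargs):
        super().__init__(**kwargs)
        self.in_channels = in_channels
        # Classic Sobel values a=1, b=2 used as the (assumed) initialization.
        self.a = self.add_weight(name="a", shape=(), initializer=tf.constant_initializer(1.0))
        self.b = self.add_weight(name="b", shape=(), initializer=tf.constant_initializer(2.0))

    def call(self, x):
        # Assemble the 3x3 kernel [[a, b, a], [0, 0, 0], [-a, -b, -a]].
        row = tf.stack([self.a, self.b, self.a])
        kernel = tf.stack([row, tf.zeros(3), -row])              # (3, 3)
        kernel = tf.reshape(kernel, (3, 3, 1, 1))
        kernel = tf.tile(kernel, [1, 1, self.in_channels, 1])    # same pattern for every input channel
        return tf.nn.conv2d(x, kernel, strides=1, padding="SAME")
```

The vertical and diagonal detectors would be obtained in the same way by rotating or transposing the sign pattern.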
Binary edge feature branch
Based on the four learnable edge detectors, BEFB is proposed to extract the binary edge features of images. BEFB is built by stacking multiple Sobel layers followed by one threshold layer, and it can be integrated into any popular backbone. The architecture of a BEFB-integrated DCNN model is shown in Fig 9.
Fig 9. The architecture of a BEFB-integrated DCNN model.
Sobel Layer. A Sobel layer consists of four parallel convolutional layers whose outputs are fused by addition. The four parallel convolutional layers take the horizontal, vertical, positive diagonal, and negative diagonal edge detectors as their kernels, respectively. Fig 9 shows a BEFB-integrated model in which each Sobel layer contains four parallel convolutional layers.
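As a rough sketch of this structure (TensorFlow assumed), a Sobel layer can be written as four parallel convolutions fused by addition. Plain 3x3 Conv2D layers stand in for the four constrained learnable edge detectors described above; in the paper the kernels are those detectors rather than free filters.

```python
import tensorflow as tf

class SobelLayer(tf.keras.layers.Layer):
    """Four parallel directional convolutions fused by element-wise addition."""
    def __init__(self, filters, **kwargs):
        super().__init__(**kwargs)
        # One branch per direction: horizontal, vertical, positive diagonal, negative diagonal.
        self.branches = [tf.keras.layers.Conv2D(filters, 3, padding="same")
                         for _ in range(4)]

    def call(self, x):
        # Addition-based fusion of the four directional responses.
        return tf.add_n([branch(x) for branch in self.branches])
```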
Threshold Layer. The threshold layer is essentially an activation layer whose activation function can be written as follows:
\[
b_{i,k,j} =
\begin{cases}
1, & a_{i,k,j} \ge t \cdot \max(A_i)\\
0, & \text{otherwise}
\end{cases}
\qquad (5)
\]
In Eq (5), a_{i,k,j} is an element of the tensor A, which represents the feature map obtained by the last Sobel layer, and b_{i,k,j} is an element of the tensor B, which represents the binary feature map output by the threshold layer. N is the number of channels of the feature map, and P and Q are the width and height of a channel; i = 1, 2, ..., N, k = 1, 2, ..., P, and j = 1, 2, ..., Q. t is a proportional coefficient in [0, 1], and max(A_i) is the maximum value of channel i. Clearly, the larger t is, the fewer binary features are retained; the smaller t is, the more binary features are retained.
The threshold layer turns the output of the last Sobel layer into binary edge features, which are concatenated with the texture features learned by the backbone and fed into the fully connected layers for classification. Note that the activation function in Eq (5) has a zero gradient, so the STE technique [16–18] is used to update the weights in BEFB.
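A minimal sketch of the threshold layer with an STE backward pass is given below (TensorFlow assumed): the forward pass binarizes each channel against t times its maximum, following Eq (5), while the backward pass lets the gradient flow through unchanged so the Sobel-layer weights can still be updated.

```python
import tensorflow as tf

@tf.custom_gradient
def binarize_ste(a, threshold):
    """Forward: per-element binarization as in Eq (5); backward: straight-through."""
    b = tf.cast(a >= threshold, a.dtype)
    def grad(db):
        # Pass the incoming gradient to the pre-threshold activations unchanged;
        # no gradient is propagated to the threshold itself.
        return db, tf.zeros_like(threshold)
    return b, grad

class ThresholdLayer(tf.keras.layers.Layer):
    def __init__(self, t=0.8, **kwargs):
        super().__init__(**kwargs)
        self.t = t  # proportional coefficient t in [0, 1]

    def call(self, a):
        # Per-channel maximum over the spatial dimensions (NHWC layout assumed).
        channel_max = tf.reduce_max(a, axis=[1, 2], keepdims=True)
        return binarize_ste(a, self.t * channel_max)
```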
Experiments
In this section, we integrate BEFB into VGG16 [22] and ResNet34 [23,24], respectively, and conduct experiments on the CIFAR-10 [46], MNIST [47], SVHN [48], and TinyImageNet (TinyIN for short) [49] datasets. We examine the effects of BEFB on model training, analyze the obfuscated gradient effect [19–21] by using different gradient strategies for the threshold layer when generating AEs, and compare the accuracy of BEFB-integrated models with that of the original ones under white-box attacks (FGSM [1], PGD [30], AA [50], A3 [51], and C&W [52]) and a black-box attack (One Pixel Attack [53]). Furthermore, we combine the BEFB-integrated models with SOTA robustness enhancement techniques to evaluate their classification accuracy. All experiments are implemented in TensorFlow on a single TITAN XP GPU.
Experimental settings
In the experiments, the settings of the number of Sobel layers l and the proportional coefficient t of the threshold layer are shown in Table 1. The settings of ϵ, the number of steps, and the step size of FGSM and PGD are shown in Table 2. For AA and A3, the perturbation limit is set to 20 on the MNIST dataset and 2 on the other three datasets. For the C&W attack, the constant c is set to 1 for MNIST and 1e-3 for the other three datasets. For the black-box attack, up to three pixels may be modified. Each model is run five times, and the average value is reported. The number of training epochs is set to 50000. The optimizer is SGD with a momentum of 0.9 and a weight decay coefficient of 5e-4. The initial learning rate is set to 0.01, and the training batch size is set to 128.
Table 1. The settings of the number of Sobel layers and proportional coefficient of Threshold layer.
| CIFAR-10 | MNIST | SVHN | TinyIN | |||||
|---|---|---|---|---|---|---|---|---|
| l | t | l | t | l | t | l | t | |
| VGG16-BEFB | 2 | 0.8 | 2 | 0.8 | 2 | 0.8 | 3 | 0.6 |
| ResNet34-BEFB | 2 | 0.6 | 2 | 0.6 | 2 | 0.6 | 3 | 0.6 |
Table 2. Parameters of FGSM and PGD.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| ϵ | 8 | 80 | 8 | 8 |
| steps | 8 | 8 | 8 | 8 |
| stepsize | 2 | 20 | 2 | 2 |
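For concreteness, the following is a hedged sketch of the PGD attack with the CIFAR-10 settings in Table 2 (ϵ=8, 8 steps, step size 2), here rescaled by 1/255 under the assumption that images lie in [0, 1]; FGSM corresponds to a single step of size ϵ without random start. The loss and model signatures are illustrative, not the exact evaluation code.

```python
import tensorflow as tf

def pgd_attack(model, x, y, eps=8/255., steps=8, step_size=2/255.):
    """L-infinity PGD with random start; FGSM is the single-step special case."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    x_adv = x + tf.random.uniform(tf.shape(x), -eps, eps)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y, model(x_adv, training=False))
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + step_size * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x - eps, x + eps)  # stay inside the eps-ball
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv
```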
Effects of BEFB on model training
We examine the effects of BEFB on model training by comparing three metrics between the BEFB-integrated models and the original ones, i.e., training accuracy, test accuracy, and training time per epoch. The loss function, optimizer, batch size, and number of epochs are kept the same. Table 3 compares the training performance of the BEFB-integrated models and the original ones, and Table 4 compares the standard deviation of test accuracy. From both tables, it is clear that the BEFB-integrated models perform comparably with the original models, which indicates BEFB is lightweight and has no side effects on model training. Figs 10 and 11 show the training profiles of VGG16 and VGG16-BEFB on the CIFAR-10 dataset, respectively, and Figs 12 and 13 show those of ResNet34 and ResNet34-BEFB. From the figures, we can see the BEFB-integrated models exhibit similar training dynamics to the original ones. Note that the number of training epochs for all four models in Figs 10–13 is 50000; to show the comparison of convergence speed and accuracy more clearly, Figs 10–13 plot the training loss and accuracy for the first 60 epochs only.
Table 3. Comparison of training performance between BEFB-integrated models and the original ones.
| CIFAR-10 | MNIST | SVHN | TinyIN | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tr.Acc. | Te.Acc. | Ti.PE | Tr.Acc. | Te.Acc. | Ti.PE | Tr.Acc. | Te.Acc. | Ti.PE | Tr.Acc. | Te.Acc. | Ti.PE | |
| VGG16 | 99.19% | 83.56% | 18s | 99.65% | 99.56% | 17s | 99.39% | 94.46% | 18s | 94.42% | 33.76% | 83s |
| VGG16-BEFB | 99.65% | 83.08% | 18s | 99.66% | 99.56% | 17s | 99.53% | 94.74% | 18s | 92.12% | 30.16% | 93s |
| ResNet34 | 99.23% | 79.23% | 47s | 99.73% | 99.51% | 42s | 99.70% | 93.18% | 51s | 97.48% | 30.75% | 243s |
| ResNet34-BEFB | 99.51% | 74.65% | 46s | 99.45% | 99.36% | 41s | 99.81% | 93.38% | 50s | 98.01% | 26.34% | 248s |
Tr.Acc. stands for training accuracy. Te.Acc. stands for test accuracy. Ti.PE stands for training time per epoch.
Table 4. Comparison of standard deviation of test accuracy between BEFB-integrated models and original ones.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| VGG16 | 0.0011 | 0.0007 | 0.0010 | 0.0012 |
| VGG16-BEFB | 0.0009 | 0.0008 | 0.00011 | 0.0010 |
| ResNet34 | 0.0013 | 0.0011 | 0.0012 | 0.0014 |
| ResNet34-BEFB | 0.0010 | 0.0009 | 0.0013 | 0.0015 |
Fig 10. Training profile of VGG16.
Fig 11. Training profile of VGG16-BEFB.
Fig 12. Training profile of ResNet34.
Fig 13. Training profile of ResNet34-BEFB.
Analysis of obfuscated gradient effect
The activation function of the threshold layer is given by Eq (5), whose gradient is zero. To update the weights in BEFB, the STE technique [16–18] is adopted. With respect to generating AEs, however, STE may cause the obfuscated gradient effect [19–21], meaning the resulting AEs are not powerful enough to deceive the models. Therefore, in addition to STE, the zero gradient and the gradient of a center-translated sigmoid activation function are also tested for the threshold layer. Table 5 compares the classification accuracy on AEs generated with these three gradient computing strategies under FGSM attacks. Eq (6) defines the center-translated sigmoid function, where x0 is, in general, the per-channel threshold value used in Eq (5).
Table 5. Comparison of classification accuracy of AEs generated by three gradient computing strategies under FGSM attacks.
| CIFAR-10 | MNIST | SVHN | TinyIN | ||
|---|---|---|---|---|---|
| VGG16-BEFB | STE | 64.24% | 81.65% | 71.02% | 43.21% |
| Sigmoid | 44.93% | 68.25% | 59.75% | 33.06% | |
| Zero Gradient | 44.40% | 67.42% | 59.28% | 31.97% | |
| ResNet34-BEFB | STE | 23.35% | 17.55% | 14.36% | 35.87% |
| Sigmoid | 14.87% | 13.84% | 12.71% | 5.82% | |
| Zero Gradient | 14.88% | 14.53% | 12.17% | 5.39% | |
\[
\sigma(x) = \frac{1}{1 + e^{-(x - x_0)}}
\qquad (6)
\]
From Table 5, it is clear that the classification accuracy under STE is much higher than under the zero gradient or the gradient of the center-translated sigmoid function, confirming that using STE to generate AEs indeed causes the obfuscated gradient effect. In the following experiments, we therefore employ the zero gradient of the threshold layer to generate AEs.
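The three gradient strategies can be sketched as interchangeable custom gradients for the binarization step (TensorFlow assumed; the factory function below is illustrative, not the authors' code): straight-through passes the gradient unchanged, zero gradient blocks it, and the center-translated sigmoid strategy uses the derivative of Eq (6) evaluated at the pre-threshold activations.

```python
import tensorflow as tf

def make_binarize(strategy="zero"):
    @tf.custom_gradient
    def binarize(a, x0):                       # x0 = t times the per-channel maximum
        b = tf.cast(a >= x0, a.dtype)
        def grad(db):
            if strategy == "ste":              # straight-through estimator
                da = db
            elif strategy == "zero":           # true (zero) gradient
                da = tf.zeros_like(db)
            else:                              # center-translated sigmoid surrogate (Eq (6))
                s = tf.sigmoid(a - x0)
                da = db * s * (1.0 - s)
            return da, tf.zeros_like(x0)
        return b, grad
    return binarize
```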
Performance comparison between BEFB-integrated models and original models under attacks
Tables 6 and 7 compare the classification accuracy of the BEFB-integrated models and the original models under white-box attacks, employing FGSM [1], PGD [30], AA [50], A3 [51], and C&W [52]. Table 8 compares the classification accuracy of the BEFB-integrated models and the original models under the black-box attack [53].
Table 6. Comparison of classification accuracy between BEFB-integrated models and original ones under white-box attacks.
| CIFAR-10 | MNIST | SVHN | TinyIN | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | FGSM | PGD | AA | A3 | FGSM | PGD | AA | A3 | FGSM | PGD | AA | A3 | FGSM | PGD | AA | A3 |
| VGG16 | 28.59% | 1.93% | 7.39% | 7.16% | 56.54% | 3.63% | 38.77% | 38.86% | 57.77% | 16.84% | 48.36% | 48.35% | 27.39% | 4.03% | 2.47% | 2.41% |
| VGG16-BEFB | 45.60% | 8.05% | 14.57% | 11.84% | 67.42% | 15.53% | 76.04% | 76.13% | 59.28% | 19.48% | 53.85% | 53.77% | 31.93% | 9.18% | 2.85% | 2.79% |
| ResNet34 | 7.82% | 0.47% | 15.16% | 14.82% | 12.66% | 0.76% | 25.77% | 25.75% | 24.14% | 2.49% | 49.36% | 49.30% | 5.80% | 0.00% | 5.14% | 5.07% |
| ResNet34-BEFB | 14.88% | 1.01% | 15.83% | 15.62% | 14.53% | 2.95% | 72.43% | 72.33% | 25.17% | 2.98% | 51.35% | 50.97% | 5.95% | 0.95% | 5.08% | 5.09% |
Table 7. Comparison of classification accuracy between BEFB-integrated models and original ones under C&W attack.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| VGG16 | 56.94% | 71.41% | 82.16% | 18.32% |
| VGG16-BEFB | 59.13% | 74.12% | 83.01% | 19.44% |
| ResNet34 | 57.74% | 68.24% | 81.23% | 21.64% |
| ResNet34-BEFB | 60.49% | 69.37% | 81.60% | 22.30% |
Table 8. Comparison of classification accuracy between BEFB-integrated models and original ones under black-box attack.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| VGG16 | 16.20% | 94.40% | 37.80% | 54.40% |
| VGG16-BEFB | 29.40% | 95.80% | 40.60% | 55.00% |
| ResNet34 | 40.60% | 69.20% | 35.60% | 66.60% |
| ResNet34-BEFB | 41.20% | 78.60% | 41.00% | 66.80% |
From the tables, it is clear that the BEFB-integrated models are more robust than the original models under both white-box and black-box attacks. For the white-box attacks, on the CIFAR-10 dataset the VGG16-BEFB model achieves 17%, 6%, 7%, and 4% higher accuracy than the original model under the FGSM, PGD, AA, and A3 attacks, respectively; on the MNIST dataset it achieves 11%, 12%, 37%, and 37% higher accuracy, respectively. For the black-box attack, the VGG16-BEFB model achieves 13% higher accuracy than the original model on CIFAR-10, and the ResNet34-BEFB model achieves 9% higher accuracy on MNIST.
Table 9 reports results for BEFB integrated with the WRN34-10 [54] backbone: the classification accuracy of WRN34-10-BEFB is better than that of the original WRN34-10 under the FGSM, AA, and A3 attacks on the CIFAR-10 dataset (both models drop to 0.00% under PGD).
Table 9. Comparison of classification accuracy between WRN34-10-BEFB model and original one on CIFAR-10 dataset.
|  | FGSM | PGD | AA | A3 |
|---|---|---|---|---|
| WRN34-10 | 4.13% | 0.00% | 9.25% | 9.12% |
| WRN34-10-BEFB | 5.01% | 0.00% | 9.42% | 9.33% |
Figs 14–17 depict how BEFB helps improve the robustness of the VGG16 model. In these figures, the gray features are extracted from the backbone branch, and the binary features are extracted from BEFB. Figs 14 and 15 examine the features extracted from both clean examples and AEs by the original VGG16 model and the VGG16-BEFB model on the TinyIN dataset, respectively. In Fig 14, selected TinyIN images under an FGSM attack with ϵ=16 are misclassified by the original VGG16 model, and there is a clear difference between the texture features extracted from the clean examples and those extracted from the AEs. In Fig 15, the same images under the same attack are classified correctly by the VGG16-BEFB model, and the difference between the texture features of the clean examples and the AEs is smaller than in the original model. More notably, the binary edge features of the clean examples and the AEs are almost identical; e.g., for the “ladybug” and “sock” images, the number of differing pixels is zero in both cases. This indicates the binary edge features play an important role in improving the classification accuracy on AEs. We then increase the perturbation strength and observe the extracted features and predictions of the VGG16-BEFB model. In Fig 16, the FGSM budget is raised to ϵ=20. The difference in texture features between clean examples and AEs is slightly larger than under ϵ=16; e.g., the RMSE (root mean square error) for the “jellyfish” image grows from 0.15 to 0.18, and for the “mushroom” image from 0.48 to 0.52. The difference in binary edge features is nearly unchanged; e.g., the number of differing pixels for both the “jellyfish” and “mushroom” images remains 1 under ϵ=16 and ϵ=20, and the predictions under ϵ=20 remain correct. In Fig 17, the budget is raised further to ϵ=24, and the predictions become wrong. The difference in texture features under ϵ=24 is slightly larger than under ϵ=20; e.g., the RMSE of the “jellyfish” image grows from 0.18 to 0.19, and that of the “mushroom” image from 0.52 to 0.57. The number of differing pixels in the binary edge features stays very close between ϵ=20 and ϵ=24; e.g., it is unchanged for the “mushroom” image and increases by one for the “jellyfish” image. Figs 16 and 17 both demonstrate that the binary edge features are less susceptible to small perturbations, and that exploring new ways of combining texture features and binary edge features is a key issue for further improving classification accuracy on AEs.
Fig 14. Texture features extracted by the original VGG16 model and its predictions under FGSM of ϵ=16 attack.
Fig 17. Binary edge features and the texture features extracted by the VGG16-BEFB model and its predictions under FGSM of ϵ=24 attack.
Fig 15. Binary edge features and the texture features extracted by the VGG16-BEFB model and its predictions under FGSM of ϵ=16 attack.
Fig 16. Binary edge features and the texture features extracted by the VGG16-BEFB model and its predictions under FGSM of ϵ=20 attack.
Similarly, Figs 18–21 depict how BEFB helps improve the robustness of the ResNet34 model. Figs 18 and 19 show that the original ResNet34 model misclassifies selected CIFAR-10 images under an FGSM attack with ϵ=8, while the ResNet34-BEFB model classifies them correctly because the extracted binary edge features of the clean examples and the AEs are almost the same. For example, for the “dog”, “airplane”, “automobile”, “ship”, and “frog” images, the number of differing pixels is zero. In Figs 20 and 21, the perturbation budget is raised to ϵ=12 and ϵ=16, respectively: the predictions remain correct under ϵ=12 but become wrong under ϵ=16. Looking at the binary edge features under these two attacks, the changes are zero in most cases, e.g., for the “dog”, “truck”, and “ship” images, again demonstrating that the binary edge features are less susceptible to small perturbations.
Fig 18. Texture features extracted by the original ResNet34 model and its predictions under FGSM of ϵ=8 attack.
Fig 21. Binary edge features and the texture features extracted by the ResNet34-BEFB model and its predictions under FGSM of ϵ=16 attack.
Fig 19. Binary edge features and the texture features extracted by the ResNet34-BEFB model and its predictions under FGSM of ϵ=8 attack.
Fig 20. Binary edge features and the texture features extracted by the ResNet34-BEFB model and its predictions under FGSM of ϵ=12 attack.
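For reference, the two difference measures quoted above (the RMSE between texture feature maps and the number of differing pixels between binary edge feature maps) can be computed as in the following sketch; the array names and shapes are assumptions.

```python
import numpy as np

def feature_rmse(clean_feat, adv_feat):
    # Root mean square error between texture feature maps of a clean example and its AE.
    return float(np.sqrt(np.mean((np.asarray(clean_feat) - np.asarray(adv_feat)) ** 2)))

def differing_pixels(clean_bin, adv_bin):
    # Number of positions where the binary edge feature maps disagree.
    return int(np.sum(np.asarray(clean_bin) != np.asarray(adv_bin)))
```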
We also compare the classification performance of the original models and the BEFB-integrated models on images perturbed by Gaussian noise. Figs 22 and 23 illustrate Gaussian-noise-perturbed images from the CIFAR-10 and MNIST datasets. Table 10 shows that the classification accuracy of the BEFB-integrated models is higher than that of the original models on both datasets; e.g., the ResNet34-BEFB model achieves 15% and 10% higher accuracy than the original model on CIFAR-10 and MNIST, respectively.
Fig 22. Gaussian noise with zero mean and 0.08 standard deviation on CIFAR-10 dataset.

Fig 23. Gaussian noise with zero mean and 0.35 standard deviation on MNIST dataset.

Table 10. Comparison of classification accuracy between the original models and BEFB-integrated models on Gaussian-noise-perturbed CIFAR-10 and MNIST datasets.
| CIFAR-10 | MNIST | |
|---|---|---|
| VGG16 | 61.35% | 85.17% |
| VGG16-BEFB | 64.17% | 88.75% |
| ResNet34 | 75.63% | 76.01% |
| ResNet34-BEFB | 91.62% | 86.11% |
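A minimal sketch of the perturbation used for Figs 22 and 23 is given below, assuming images scaled to [0, 1]; the standard deviation is 0.08 for CIFAR-10 and 0.35 for MNIST.

```python
import numpy as np

def add_gaussian_noise(images, std):
    # Zero-mean Gaussian noise; std = 0.08 for CIFAR-10 and 0.35 for MNIST (Figs 22 and 23).
    noisy = images + np.random.normal(0.0, std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)  # images assumed scaled to [0, 1]
```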
Combining BEFB-integrated models with SOTA robustness enhancement methods
We combine BEFB-integrated models with four popular robustness enhancement techniques, i.e., AT [26], PCL [38], DDPM [55], and BORT [56], and compare them with the original models equipped with the same techniques. We denote VGG16-BEFB models trained with AT as VGG16-BEFB-AT and VGG16-BEFB models trained with PCL as VGG16-BEFB-PCL; the same notation applies to the ResNet34-BEFB models and the original models. Fig 24 compares the classification accuracy of the BEFB-AT models with the AT-enhanced original models under PGD attack. Figs 25 and 26 compare the classification accuracy of the BEFB-PCL models with the PCL-enhanced original models under FGSM and PGD attacks.
Fig 24. Comparing AT enhanced BEFB-integrated models with original models under PGD attack.
Fig 25. PCL enhanced BEFB-integrated models and original models under FGSM attack.
Fig 26. PCL enhanced BEFB-integrated models and original models under PGD attack.
From the figures, we can see that the AT- and PCL-enhanced BEFB-integrated models achieve better classification accuracy than the AT- and PCL-enhanced original models.
In Table 11, DDPM and BORT are each combined with the WRN28-10 [54] and WRN28-10-BEFB models for robustness enhancement. The results show the DDPM/BORT-enhanced WRN28-10-BEFB model achieves higher classification accuracy than the correspondingly enhanced original WRN28-10 model under the FGSM, PGD, AA, and A3 attacks.
Table 11. Comparison of classification accuracy between DDPM/BORT enhanced WRN28-10-BEFB model and original one under FGSM, PGD, AA, and A3 attacks.
|  | FGSM | PGD | AA | A3 |
|---|---|---|---|---|
| WRN28-10-DDPM | 75.78% | 71.34% | 69.07% | 68.99% |
| WRN28-10-BEFB-DDPM | 76.49% | 71.81% | 69.55% | 69.47% |
| WRN28-10-BORT | 58.29% | 52.95% | 50.29% | 50.20% |
| WRN28-10-BEFB-BORT | 58.71% | 53.39% | 50.54% | 50.53% |
Ablation study
We first compare BEFB-integrated models using the learnable edge detectors with BEFB-integrated models using the traditional (fixed) Sobel edge detector, denoted BEFB-fixed. Table 12 shows the comparison of classification accuracy between the two. From the table, it can be seen that the average classification accuracy decreases when the learnable edge detectors are replaced with the traditional Sobel edge detector.
Table 12. Comparison of replacing learnable edge detectors with traditional Sobel edge detector.
| CIFAR-10 | SVHN | |||
|---|---|---|---|---|
| FGSM | PGD | FGSM | PGD | |
| VGG16-BEFB-fixed | 42.90% | 3.98% | 60.72% | 17.83% |
| VGG16-BEFB | 45.60% | 8.05% | 59.28% | 19.48% |
| ResNet34-BEFB-fixed | 11.37% | 1.09% | 23.12% | 3.21% |
| ResNet34-BEFB | 14.88% | 1.01% | 25.17% | 2.98% |
Considering that BEFB has two main components, i.e., Sobel layers and a threshold layer, we then compare the original BEFB-integrated models with variants in which the threshold layer or the Sobel layers are removed. The threshold-layer-removed models are denoted BEFB-tlre, and the Sobel-layers-removed models are denoted BEFB-slre. Table 13 shows the comparison of classification accuracy. From the table, it is clear that removing either the threshold layer or the Sobel layers weakens the robustness of the resulting BEFB-tlre and BEFB-slre models.
Table 13. Comparison of removing threshold layer and Sobel layers respectively.
| CIFAR-10 | SVHN | |||
|---|---|---|---|---|
| FGSM | PGD | FGSM | PGD | |
| VGG16-BEFB-tlre | 40.55% | 6.73% | 48.70% | 10.46% |
| VGG16-BEFB-slre | 33.73% | 2.90% | 55.22% | 15.89% |
| VGG16-BEFB | 45.60% | 8.05% | 59.28% | 19.48% |
| ResNet34-BEFB-tlre | 13.58% | 0.09% | 24.22% | 2.44% |
| ResNet34-BEFB-slre | 9.11% | 0.86% | 16.42% | 2.18% |
| ResNet34-BEFB | 14.88% | 1.01% | 25.17% | 2.98% |
Discussions
The experimental results in the preceding sections show that BEFB has no side effects on model training and that BEFB-integrated models are more robust than the original models under attacks. When combined with SOTA robustness enhancement techniques, BEFB-integrated models also achieve better classification accuracy than the original models. It can also be seen that the robustness gain brought by BEFB, while consistent, is not dramatic. One possible reason is that in BEFB the binary edge features are combined with the texture features by simple concatenation. We believe it is worthwhile to explore other ways of combining binary edge features and texture features, which could improve the robustness of DCNNs more substantially.
Conclusions
Enhancing the robustness of DCNNs is of great significance for safety-critical applications in the real world. Inspired by the principal way that human eyes recognize objects, in this paper we design four learnable edge detectors and propose a binary edge feature branch (BEFB), which can be easily integrated into any popular backbone. Experiments on multiple datasets show BEFB has no side effects on model training, and BEFB-integrated models are more robust than the original models. The work in this paper shows for the first time that it is feasible to combine shape-like features and texture features to make DCNNs more robust. In future work, we will explore other effective and efficient ways of combining binary edge features and texture features, and design an optimization framework for parameter search to yield models with good performance under attacks.
Acknowledgment
The authors wish to thank Dr. Huo-Ping Yi for the insights on evaluating the robustness of DCNNs.
Data Availability
All relevant data files are available at https://doi.org/10.6084/m9.figshare.28911614.v1.
Funding Statement
This work was supported by National Key Research and Development Program of China(2024YFB2409200), Open Foundation of the State Key Laboratory of Fluid Power and Mechatronic Systems(GZKF-202329), and National Natural Science Foundation of China(52293424, 52477065). The funders played no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. In: International conference on learning representations; 2015.
- 2. Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I. Intriguing properties of neural networks. In: International conference on learning representations; 2014.
- 3. Al-Maliki S, Bouanani FE, Ahmad K, Abdallah M, Hoang DT, Niyato D, et al. Toward improved reliability of deep learning based systems through online relabeling of potential adversarial attacks. IEEE Trans Rel. 2023;72(4):1367–82. doi: 10.1109/tr.2023.3298685
- 4. Qi P, Jiang T, Wang L, Yuan X, Li Z. Detection tolerant black-box adversarial attack against automatic modulation classification with deep learning. IEEE Trans Rel. 2022;71(2):674–86. doi: 10.1109/tr.2022.3161138
- 5. Bai T, Luo J, Zhao J, Wen B, Wang Q. Recent advances in adversarial training for adversarial robustness. In: Proceedings of the thirtieth international joint conference on artificial intelligence; 2021. p. 4312–21. doi: 10.24963/ijcai.2021/591
- 6. Wen Z, Jiang T, Huang Z, Cui X. Section sparse attack: A more powerful sparse attack method. Comput Sci. 2025:1–12.
- 7. Davitt LI, Cristino F, Wong AC-N, Leek EC. Shape information mediating basic- and subordinate-level object recognition revealed by analyses of eye movements. J Exp Psychol Hum Percept Perform. 2014;40(2):451–6. doi: 10.1037/a0034983
- 8. Sigurdardottir HM, Michalak SM, Sheinberg DL. Shape beyond recognition: Form-derived directionality and its effects on visual attention and motion perception. J Exp Psychol Gen. 2014;143(1):434–54. doi: 10.1037/a0032353
- 9. Kittler J. On the accuracy of the Sobel edge detector. Image Vision Comput. 1983;1(1):37–42. doi: 10.1016/0262-8856(83)90006-9
- 10. Berzins V. Accuracy of Laplacian edge detectors. Comput Vision Graph Image Process. 1984;27(2):195–210.
- 11. Forshaw MRB. Speeding up the Marr-Hildreth edge operator. Comput Vision Graph Image Process. 1988;41(2):172–85. doi: 10.1016/0734-189x(88)90018-7
- 12. Ding L, Goshtasby A. On the Canny edge detector. Pattern Recogn. 2001;34(3):721–5. doi: 10.1016/s0031-3203(00)00023-6
- 13. Xiao H, Xiao S, Ma G, Li C. Image Sobel edge extraction algorithm accelerated by OpenCL. J Supercomput. 2022;78(14):16236–65. doi: 10.1007/s11227-022-04404-8
- 14. Yasir M, Hossain MS, Nazir S, Khan S, Thapa R. Object identification using manipulated edge detection techniques. Science. 2022;3(1):1–6.
- 15. Chen Z, He Z, Lu Z-M. DEA-Net: Single image dehazing based on detail-enhanced convolution and content-guided attention. IEEE Trans Image Process. 2024;33:1002–15. doi: 10.1109/TIP.2024.3354108
- 16. Bengio Y, Léonard N, Courville A. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint; 2013. doi: 10.48550/arXiv.1308.3432
- 17. Yang Z, Lee J, Park C. Injecting logical constraints into neural networks via straight-through estimators. In: Proceedings of international conference on machine learning; 2022. p. 25096–122.
- 18. Vanderschueren A, De Vleeschouwer C. Are straight-through gradients and soft-thresholding all you need for sparse training? In: Proceedings of IEEE winter conference on applications of computer vision; 2023. p. 3808–17.
- 19. Athalye A, Carlini N, Wagner D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In: Proceedings of international conference on machine learning; 2018. p. 274–83.
- 20. Yue K, Jin R, Wong CW, Baron D, Dai H. Gradient obfuscation gives a false sense of security in federated learning. In: 32nd USENIX security symposium (USENIX Security); 2023. p. 6381–98.
- 21. Huang Y, Yu Y, Zhang H, Ma Y, Yao Y. Adversarial robustness of stabilized neural ODE might be from obfuscated gradients. In: Mathematical and scientific machine learning. PMLR; 2022. p. 497–515.
- 22. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations; 2015.
- 23. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2016. p. 770–8.
- 24. He F, Liu T, Tao D. Why ResNet works? Residuals generalize. IEEE Trans Neural Netw Learn Syst. 2020;31(12):5349–62. doi: 10.1109/TNNLS.2020.2966319
- 25. Tsipras D, Santurkar S, Engstrom L, Turner A, Madry A. Robustness may be at odds with accuracy. In: International conference on learning representations; 2019.
- 26. Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. In: International conference on learning representations; 2018.
- 27. Kannan H, Kurakin A, Goodfellow I. Adversarial logit pairing. arXiv preprint; 2018. doi: 10.48550/arXiv.1803.06373
- 28. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2009. p. 248–55.
- 29. Wong E, Rice L, Kolter JZ. Fast is better than free: Revisiting adversarial training. In: International conference on learning representations; 2020.
- 30. Kurakin A, Goodfellow I, Bengio S. Adversarial machine learning at scale. In: International conference on learning representations; 2017.
- 31. Xu Z, Zhu G, Meng C, Ying Z, Wang W, Gu M, et al. A2: Efficient automated attacker for boosting adversarial training. Adv Neural Inform Process Syst. 2022;35:22844–55.
- 32. Li T, Wu Y, Chen S, Fang K, Huang X. Subspace adversarial training. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2022. p. 13409–18.
- 33. Dabouei A, Taherkhani F, Soleymani S, Nasrabadi NM. Revisiting outer optimization in adversarial training. In: Proceedings of European conference on computer vision. Springer; 2022. p. 244–61.
- 34. Zhou D, Song Z, Chen Z, Huang X, Ji C, Kumari S, et al. Advancing explainability of adversarial trained Convolutional Neural Networks for robust engineering applications. Eng Appl Artif Intell. 2025;140:109681. doi: 10.1016/j.engappai.2024.109681
- 35. Eleftheriadis C, Symeonidis A, Katsaros P. Adversarial robustness improvement for deep neural networks. Mach Vision Appl. 2024;35(3). doi: 10.1007/s00138-024-01519-1
- 36. Li X, Wang Z, Zhang B, Sun F, Hu X. Recognizing object by components with human prior knowledge enhances adversarial robustness of deep neural networks. IEEE Trans Pattern Anal Mach Intell. 2023;45(7):8861–73. doi: 10.1109/TPAMI.2023.3237935
- 37. Sitawarin C, Pongmala K, Chen Y, Carlini N, Wagner D. Part-based models improve adversarial robustness. In: International conference on learning representations; 2023.
- 38. Mustafa A, Khan SH, Hayat M, Goecke R, Shen J, Shao L. Deeply supervised discriminative learning for adversarial defense. IEEE Trans Pattern Anal Mach Intell. 2021;43(9):3154–66. doi: 10.1109/TPAMI.2020.2978474
- 39. Seo S, Lee Y, Kang P. Cost-free adversarial defense: Distance-based optimization for model robustness without adversarial training. Comput Vision Image Understand. 2023;227:103599. doi: 10.1016/j.cviu.2022.103599
- 40. Yan Z, Guo Y, Zhang C. Adversarial margin maximization networks. IEEE Trans Pattern Anal Mach Intell. 2021;43(4):1129–39. doi: 10.1109/TPAMI.2019.2948348
- 41. Guo Y, Zhang C. Recent advances in large margin learning. IEEE Trans Pattern Anal Mach Intell. 2022;44(10):7167–74. doi: 10.1109/TPAMI.2021.3091717
- 42. Moosavi-Dezfooli SM, Fawzi A, Frossard P. DeepFool: A simple and accurate method to fool deep neural networks. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2016. p. 2574–82.
- 43. Yuan X, Li J, Kuruoglu EE. Robustness enhancement in neural networks with alpha-stable training noise. Digital Signal Process. 2025;156:104778. doi: 10.1016/j.dsp.2024.104778
- 44. Amerehi F, Healy P. Label augmentation for neural networks robustness. arXiv preprint arXiv:2408.01977; 2024.
- 45. Dong M, Li Y, Wang Y, Xu C. Adversarially robust neural architectures. IEEE Trans Pattern Anal Mach Intell. 2025;47(5):4183–97. doi: 10.1109/TPAMI.2025.3542350
- 46. Krizhevsky A, Hinton G. Learning multiple layers of features from tiny images; 2009.
- 47. Deng L. The MNIST database of handwritten digit images for machine learning research [Best of the Web]. IEEE Signal Process Mag. 2012;29(6):141–2. doi: 10.1109/msp.2012.2211477
- 48. Netzer Y. Reading digits in natural images with unsupervised feature learning. In: Proceedings of the NIPS workshop on deep learning and unsupervised feature learning; 2011. p. 4.
- 49. Le Y, Yang X. Tiny ImageNet visual recognition challenge. CS 231N. 2015;7(7):3.
- 50. Croce F, Hein M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In: International conference on machine learning. PMLR; 2020. p. 2206–16.
- 51. Liu Y, Cheng Y, Gao L, Liu X, Zhang Q, Song J. Practical evaluation of adversarial robustness via adaptive auto attack. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2022. p. 15105–14.
- 52. Carlini N, Wagner D. Towards evaluating the robustness of neural networks. In: 2017 IEEE symposium on security and privacy (SP); 2017. p. 39–57. doi: 10.1109/sp.2017.49
- 53. Su J, Vargas DV, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Trans Evol Computat. 2019;23(5):828–41. doi: 10.1109/tevc.2019.2890858
- 54. Zagoruyko S, Komodakis N. Wide residual networks. arXiv preprint arXiv:1605.07146; 2016.
- 55. Wang Z, Pang T, Du C, Lin M, Liu W, Yan S. Better diffusion models further improve adversarial training. In: International conference on machine learning. PMLR; 2023. p. 36246–63.
- 56. Huang D, Bu Q, Qing Y, Pi H, Wang S, Cui H. Two heads are better than one: Robust learning meets multi-branch models. arXiv preprint; 2022. doi: 10.48550/arXiv.2208.08083
















