Abstract
Deep convolutional neural networks (DCNNs) are vulnerable to examples with small perturbations. Improving the robustness of DCNNs is of great significance to safety-critical applications such as autonomous driving and industrial automation. Inspired by the principal way that human eyes recognize objects, i.e., largely relying on shape features, this paper first designs four learnable edge detectors as layer kernels and proposes a binary edge feature branch (BEFB) to learn binary edge features, which can be easily integrated into any popular backbone. The four edge detectors learn horizontal, vertical, positive diagonal, and negative diagonal edge features, respectively, and the branch is built by stacking multiple Sobel layers (which use the learnable edge detectors as kernels) and one threshold layer. The binary edge features learned by the branch, concatenated with the texture features learned by the backbone, are fed into the fully connected layers for classification. We integrate the proposed branch into VGG16 and ResNet34, respectively, and conduct experiments on multiple datasets. Experimental results demonstrate that BEFB is lightweight and has no side effects on training, and that the accuracy of BEFB-integrated models is better than that of the original ones under both white-box and black-box attacks. Moreover, BEFB-integrated models equipped with robustness enhancement techniques achieve better classification accuracy than the original models so equipped. The work in this paper shows for the first time that it is feasible to enhance the robustness of DCNNs by combining shape-like features and texture features.
Introduction
It is well known that deep convolutional neural networks (DCNNs) can be fooled by examples with small perturbations [1–6], which poses potential hazards for safety-critical applications, e.g., autonomous driving, airport security, and industrial automation. It is therefore of great significance to improve the robustness of DCNNs. Figs 1 and 2 show a clean example and its adversarial counterpart. The adversarial example (AE for short) in Fig 2 carries small noise that can induce DCNNs to make wrong decisions, yet this noise is easily filtered out by human eyes. Research shows that humans recognize objects primarily by relying on shape features [7,8], which is why they are not susceptible to the noise imposed in Fig 2. Inspired by this, it is natural to ask: is it possible to make DCNNs more robust by learning shape-like features?
Fig 1. Clean example.

Fig 2. Adversarial example.

Figs 3 and 4 show the thresholded edge images of Figs 1 and 2, respectively. It can be observed that the two thresholded edge images are almost the same, which indicates that thresholded shape-like features may be immune to small noise interference. However, the representational capability of thresholded shape-like features is limited; only by combining them with traditional texture features is it possible to improve the robustness of DCNNs without reducing the model's classification accuracy. Therefore, in this paper, we propose a novel neural network architecture that combines thresholded shape-like features and texture features for image classification, aiming to enhance model robustness. Traditional edge detectors, e.g., Sobel [9], DoG [10], Marr-Hildreth [11], and Canny [12], are not learnable. Considering the learnability of shape-like features, four learnable Sobel-like edge detectors are designed in this paper and taken as layer kernels, which extract horizontal, vertical, positive diagonal, and negative diagonal edge features, respectively [13,14]. Note that in [15] the definitions of four difference convolution kernels are similar to the learnable Sobel-like edge detectors in this paper; both can enhance the details learned by DCNNs, but the former is employed for dehazing images whereas the latter is leveraged for improving the robustness of DCNNs. Based on the four learnable edge detectors, a binary edge feature branch (BEFB) is proposed, which is built by stacking multiple Sobel layers and one threshold layer. The Sobel layers employ the edge detectors as kernels, and the threshold layer turns the output of the last Sobel layer into binary features. The binary features, concatenated with the texture features learned by any popular backbone, are then fed into fully connected layers for classification. To deal with the zero gradient of the threshold layer, which prevents the weights in BEFB from being updated through the chain rule, the Straight-Through Estimator (STE) [16–18] is employed. In addition, to account for the obfuscated gradient effect raised in [19–21], the zero gradient and the gradient of a center-translated sigmoid activation function are also used when generating AEs. In the experiments, we integrate BEFB into VGG16 [22] and ResNet34 [23,24], respectively, and conduct experiments on multiple datasets. The results demonstrate that BEFB has no side effects on model training, and that BEFB-integrated models achieve better accuracy than the original ones under both white-box and black-box attacks. Furthermore, we combine the BEFB-integrated models with state-of-the-art (SOTA) robustness enhancement techniques, and find that the BEFB-integrated models again achieve better accuracy than the original models.
Fig 3. Thresholded edge image of Fig 1.

Fig 4. Thresholded edge image of Fig 2.

The contributions of this paper can be summarized as follows:
Inspired by the principal way that human beings recognize objects, we design four learnable edge detectors and propose BEFB to make DCNNs more robust.
BEFB can be easily integrated into any popular backbone and has no side effects on model training. Extensive experiments on multiple datasets show BEFB-integrated models achieve better accuracy than the original models under both white-box and black-box attacks.
For the first time, we show it is feasible to make DCNNs more robust by combining shape-like features and texture features.
The organization of the paper is as follows. Section Related Work briefly reviews related work, and section Proposed Approach describes the details of BEFB. Section Experiments reports the experiments on multiple datasets, and section Discussions discusses the results. Finally, section Conclusions gives the concluding remarks.
Related work
Robustness Enhancement with Adversarial Training (AT).
AT improves the robustness of DCNNs by generating AEs as training samples during optimization. Tsipras et al. [25] found there exists a tradeoff between the robustness and standard accuracy of models produced by AT, because robust classifiers learn a different feature representation than clean classifiers. Madry et al. [26] proposed an approximate solution framework for the optimization problems of AT, and found PGD-based AT can produce models that defend themselves against first-order adversaries. Kannan et al. [27] investigated the effectiveness of AT at the scale of ImageNet [28], and proposed a logit-pairing AT method to address the tradeoff between robust accuracy and clean accuracy. Wong et al. [29] accelerated the training process by using the FGSM attack with random initialization instead of the PGD attack [30], at significantly lower cost. Xu et al. [31] proposed a novel attack method that imposes a stronger perturbation on the input images, so that models adversarially trained with this attack become more robust. Li et al. [32] revealed a link between fast-growing gradients of examples and catastrophic overfitting during robust training, and proposed a subspace AT method to mitigate the overfitting and increase robustness. Dabouei et al. [33] found the gradient norm in AT is higher than in natural training, which hinders the convergence of the outer optimization of AT, and proposed a gradient regularization method to improve the performance of AT. Zhou et al. [34] proposed an AT framework that provides detailed and quantifiable explanations along the dimensions of objects, parts, scenes, materials, textures, and colors. Eleftheriadis et al. [35] proposed a general AT framework that simultaneously considers several distance metrics and adversarial attack strategies.
Robustness Enhancement without AT. Non-AT robustness enhancement techniques can be categorized into part-based models [36,37], feature vector clustering [38,39], adversarial margin maximization [40,41], etc. Li et al. [36] argued that one reason DCNNs are vulnerable to attacks is that they are trained only on category labels, not on the part-based knowledge that humans use. They proposed an object recognition model that first segments the parts of objects, scores the segmentations based on human knowledge, and finally outputs the classification results based on the scores. This part-based model shows better robustness than classic recognition models across various attack settings. Sitawarin et al. [37] likewise argued that richer annotation information can help learn more robust features. They proposed a part segmentation model with a head classifier trained end-to-end; the model first segments objects into parts and then makes predictions based on the parts. Mustafa et al. [38] stated that making feature vectors within a class closer together and the centroids of different classes more separable can enhance the robustness of DCNNs. They added a prototype conformity loss (PCL) to the conventional loss function and designed an auxiliary function to reduce the dimensions of convolutional feature maps. Seo et al. [39] proposed a training methodology that enhances the robustness of DCNNs through a constraint that applies a class-specific differentiation to the feature space. The resulting feature representations have small intra-class variance and large inter-class variance, and the method improves adversarial robustness notably. Yan et al. [40] proposed an adversarial margin maximization method to improve DCNNs' generalization ability, employing the DeepFool attack [42] to compute the distance between an image sample and its decision boundary; this learning-based regularization enhances the robustness of DCNNs as well. Yuan et al. [43] proposed using alpha-stable noise as an alternative to Gaussian noise for data augmentation to improve the robustness of DCNNs. Amerehi et al. [44] proposed a label augmentation method to enhance the robustness of DCNNs against both common and intentional perturbations. Dong et al. [45] explored the relationship among adversarial robustness, the Lipschitz constant, and architecture parameters, and found that an appropriate constraint on architecture parameters can reduce the Lipschitz constant and thereby improve the robustness of DCNNs.
Note that the BEFB-integrated models proposed in this paper can be easily combined with the above-mentioned robustness enhancement techniques, such as AT [26] and PCL [38], and achieve better classification accuracy than the original models.
Proposed approach
In this section, we first introduce four learnable edge detectors, and then illustrate a binary edge feature branch (BEFB).
Learnable edge detectors
Inspired by the observation that humans primarily rely on shape features to recognize objects, here, we design four learnable edge detectors which can extract horizontal edge features, vertical edge features, positive diagonal edge features, and negative diagonal edge features, respectively. These four learnable edge detectors can be taken as layer kernels and are shown in Figs 5–8.
Fig 5. Horizontal edge detector.

Fig 8. Negative diagonal edge detector.

For the horizontal edge detector in Fig 5, we have
(1)
For the vertical edge detector in Fig 6, we have
Fig 6. Vertical edge detector.

(2)
For the positive diagonal edge detector in Fig 7, we have
Fig 7. Positive diagonal edge detector.

(3)
For the negative diagonal edge detector in Fig 8, we have
(4)
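The exact parameterization in Eqs (1)–(4) is determined by the kernels shown in Figs 5–8; the following is only a minimal sketch of one plausible form, assuming TensorFlow (which the experiments use): a horizontal Sobel-like kernel whose sign pattern is fixed while the two magnitudes a and b are learnable. The class name, initialization, and channel handling are illustrative assumptions, not the authors' exact definition.

```python
import tensorflow as tf

class LearnableHorizontalSobel(tf.keras.layers.Layer):
    """Hypothetical horizontal Sobel-like detector: fixed sign pattern, learnable magnitudes."""
    def __init__(self, in_channels, **kwargs):
        super().__init__(**kwargs)
        self.in_channels = in_channels
        # Classic Sobel values a=1, b=2 used as the (assumed) initialization.
        self.a = self.add_weight(name="a", shape=(), initializer=tf.constant_initializer(1.0))
        self.b = self.add_weight(name="b", shape=(), initializer=tf.constant_initializer(2.0))

    def call(self, x):
        # Assemble the 3x3 kernel [[a, b, a], [0, 0, 0], [-a, -b, -a]].
        row = tf.stack([self.a, self.b, self.a])
        kernel = tf.stack([row, tf.zeros(3), -row])              # (3, 3)
        kernel = tf.reshape(kernel, (3, 3, 1, 1))
        kernel = tf.tile(kernel, [1, 1, self.in_channels, 1])    # same pattern for every input channel
        return tf.nn.conv2d(x, kernel, strides=1, padding="SAME")
```

The vertical and diagonal detectors would be obtained in the same way by rotating or transposing the sign pattern.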
Binary edge feature branch
Based on the four learnable edge detectors, BEFB is proposed to extract the binary edge features of images. BEFB is built by stacking multiple Sobel layers followed by one threshold layer, and it can be integrated into any popular backbone. The architecture of a BEFB-integrated DCNN model is shown in Fig 9.
Fig 9. The architecture of a BEFB-integrated DCNN model.
Sobel Layer. A Sobel layer consists of four parallel convolutional layers whose outputs are fused by addition. The four parallel convolutional layers take the horizontal, vertical, positive diagonal, and negative diagonal edge detectors as their kernels, respectively. Fig 9 shows a BEFB-integrated model in which each Sobel layer contains four parallel convolutional layers.
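As a rough sketch of this structure (TensorFlow assumed), a Sobel layer can be written as four parallel convolutions fused by addition. Plain 3x3 Conv2D layers stand in for the four constrained learnable edge detectors described above; in the paper the kernels are those detectors rather than free filters.

```python
import tensorflow as tf

class SobelLayer(tf.keras.layers.Layer):
    """Four parallel directional convolutions fused by element-wise addition."""
    def __init__(self, filters, **kwargs):
        super().__init__(**kwargs)
        # One branch per direction: horizontal, vertical, positive diagonal, negative diagonal.
        self.branches = [tf.keras.layers.Conv2D(filters, 3, padding="same")
                         for _ in range(4)]

    def call(self, x):
        # Addition-based fusion of the four directional responses.
        return tf.add_n([branch(x) for branch in self.branches])
```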
Threshold Layer. The threshold layer is essentially an activation layer whose activation function can be written as follows:
\[
b_{i,k,j} =
\begin{cases}
1, & a_{i,k,j} \ge t \cdot \max(A_i)\\
0, & \text{otherwise}
\end{cases}
\qquad (5)
\]
In Eq (5), a_{i,k,j} is an element of the tensor A, which represents the feature map obtained by the last Sobel layer, and b_{i,k,j} is an element of the tensor B, which represents the binary feature map output by the threshold layer. N is the number of channels of the feature map, and P and Q are the width and height of a channel; i = 1, 2, ..., N, k = 1, 2, ..., P, and j = 1, 2, ..., Q. t is a proportional coefficient in [0, 1], and max(A_i) is the maximum value of channel i. Clearly, the larger t is, the fewer binary features are retained; the smaller t is, the more binary features are retained.
The threshold layer turns the output of the last Sobel layer into binary edge features, which are concatenated with the texture features learned by the backbone and fed into the fully connected layers for classification. Note that the activation function in Eq (5) has a zero gradient, so the STE technique [16–18] is used to update the weights in BEFB.
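A minimal sketch of the threshold layer with an STE backward pass is given below (TensorFlow assumed): the forward pass binarizes each channel against t times its maximum, following Eq (5), while the backward pass lets the gradient flow through unchanged so the Sobel-layer weights can still be updated.

```python
import tensorflow as tf

@tf.custom_gradient
def binarize_ste(a, threshold):
    """Forward: per-element binarization as in Eq (5); backward: straight-through."""
    b = tf.cast(a >= threshold, a.dtype)
    def grad(db):
        # Pass the incoming gradient to the pre-threshold activations unchanged;
        # no gradient is propagated to the threshold itself.
        return db, tf.zeros_like(threshold)
    return b, grad

class ThresholdLayer(tf.keras.layers.Layer):
    def __init__(self, t=0.8, **kwargs):
        super().__init__(**kwargs)
        self.t = t  # proportional coefficient t in [0, 1]

    def call(self, a):
        # Per-channel maximum over the spatial dimensions (NHWC layout assumed).
        channel_max = tf.reduce_max(a, axis=[1, 2], keepdims=True)
        return binarize_ste(a, self.t * channel_max)
```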
Experiments
In this section, we integrate BEFB into VGG16 [22] and ResNet34 [23,24], respectively, and conduct experiments on the CIFAR-10 [46], MNIST [47], SVHN [48], and TinyImageNet (TinyIN for short) [49] datasets. We examine the effects of BEFB on model training, analyze the obfuscated gradient effect [19–21] by using different gradient strategies for the threshold layer when generating AEs, and compare the accuracy of BEFB-integrated models with that of the original ones under white-box attacks (FGSM [1], PGD [30], AA [50], A3 [51], and C&W [52]) and a black-box attack (One Pixel Attack [53]). Furthermore, we combine the BEFB-integrated models with SOTA robustness enhancement techniques to evaluate their classification accuracy. All experiments are implemented in TensorFlow on a single TITAN XP GPU.
Experimental settings
In the experiments, the settings of the number of Sobel layers l and the proportional coefficient t of the threshold layer are shown in Table 1. The settings of ϵ, the number of steps, and the step size of FGSM and PGD are shown in Table 2. For AA and A3, the perturbation limit is set to 20 on the MNIST dataset and 2 on the other three datasets. For the C&W attack, the constant c is set to 1 for MNIST and 1e-3 for the other three datasets. For the black-box attack, up to three pixels may be modified. Each model is run five times, and the average value is reported. The number of training epochs is set to 50000. The optimizer is SGD with a momentum of 0.9 and a weight decay coefficient of 5e-4. The initial learning rate is set to 0.01, and the training batch size is set to 128.
Table 1. The settings of the number of Sobel layers and proportional coefficient of Threshold layer.
| CIFAR-10 | MNIST | SVHN | TinyIN | |||||
|---|---|---|---|---|---|---|---|---|
| l | t | l | t | l | t | l | t | |
| VGG16-BEFB | 2 | 0.8 | 2 | 0.8 | 2 | 0.8 | 3 | 0.6 |
| ResNet34-BEFB | 2 | 0.6 | 2 | 0.6 | 2 | 0.6 | 3 | 0.6 |
Table 2. Parameters of FGSM and PGD.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| ϵ | 8 | 80 | 8 | 8 |
| steps | 8 | 8 | 8 | 8 |
| stepsize | 2 | 20 | 2 | 2 |
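For concreteness, the following is a hedged sketch of the PGD attack with the CIFAR-10 settings in Table 2 (ϵ=8, 8 steps, step size 2), here rescaled by 1/255 under the assumption that images lie in [0, 1]; FGSM corresponds to a single step of size ϵ without random start. The loss and model signatures are illustrative, not the exact evaluation code.

```python
import tensorflow as tf

def pgd_attack(model, x, y, eps=8/255., steps=8, step_size=2/255.):
    """L-infinity PGD with random start; FGSM is the single-step special case."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    x_adv = x + tf.random.uniform(tf.shape(x), -eps, eps)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y, model(x_adv, training=False))
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + step_size * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x - eps, x + eps)  # stay inside the eps-ball
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv
```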
Effects of BEFB on model training
We examine the effects of BEFB on model training by comparing three metrics between the BEFB-integrated models and the original ones, i.e., training accuracy, test accuracy, and training time per epoch. The loss function, optimizer, batch size, and number of epochs are kept the same. Table 3 compares the training performance of the BEFB-integrated models and the original ones, and Table 4 compares the standard deviation of test accuracy. From both tables, it is clear that the BEFB-integrated models perform comparably with the original models, which indicates BEFB is lightweight and has no side effects on model training. Figs 10 and 11 show the training profiles of VGG16 and VGG16-BEFB on the CIFAR-10 dataset, respectively, and Figs 12 and 13 show those of ResNet34 and ResNet34-BEFB. From the figures, we can see the BEFB-integrated models exhibit similar training dynamics to the original ones. Note that the number of training epochs for all four models in Figs 10–13 is 50000; to show the comparison of convergence speed and accuracy more clearly, Figs 10–13 plot the training loss and accuracy for the first 60 epochs only.
Table 3. Comparison of training performance between BEFB-integrated models and the original ones.
| CIFAR-10 | MNIST | SVHN | TinyIN | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tr.Acc. | Te.Acc. | Ti.PE | Tr.Acc. | Te.Acc. | Ti.PE | Tr.Acc. | Te.Acc. | Ti.PE | Tr.Acc. | Te.Acc. | Ti.PE | |
| VGG16 | 99.19% | 83.56% | 18s | 99.65% | 99.56% | 17s | 99.39% | 94.46% | 18s | 94.42% | 33.76% | 83s |
| VGG16-BEFB | 99.65% | 83.08% | 18s | 99.66% | 99.56% | 17s | 99.53% | 94.74% | 18s | 92.12% | 30.16% | 93s |
| ResNet34 | 99.23% | 79.23% | 47s | 99.73% | 99.51% | 42s | 99.70% | 93.18% | 51s | 97.48% | 30.75% | 243s |
| ResNet34-BEFB | 99.51% | 74.65% | 46s | 99.45% | 99.36% | 41s | 99.81% | 93.38% | 50s | 98.01% | 26.34% | 248s |
Tr.Acc. stands for training accuracy. Te.Acc. stands for test accuracy. Ti.PE stands for training time per epoch.
Table 4. Comparison of standard deviation of test accuracy between BEFB-integrated models and original ones.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| VGG16 | 0.0011 | 0.0007 | 0.0010 | 0.0012 |
| VGG16-BEFB | 0.0009 | 0.0008 | 0.00011 | 0.0010 |
| ResNet34 | 0.0013 | 0.0011 | 0.0012 | 0.0014 |
| ResNet34-BEFB | 0.0010 | 0.0009 | 0.0013 | 0.0015 |
Fig 10. Training profile of VGG16.
Fig 11. Training profile of VGG16-BEFB.
Fig 12. Training profile of ResNet34.
Fig 13. Training profile of ResNet34-BEFB.
Analysis of obfuscated gradient effect
The activation function of the threshold layer is given by Eq (5), whose gradient is zero. To update the weights in BEFB, the STE technique [16–18] is adopted. With respect to generating AEs, however, STE may cause the obfuscated gradient effect [19–21], meaning the resulting AEs are not powerful enough to deceive the models. Therefore, in addition to STE, the zero gradient and the gradient of a center-translated sigmoid activation function are also tested for the threshold layer. Table 5 compares the classification accuracy on AEs generated with these three gradient computing strategies under FGSM attacks. Eq (6) defines the center-translated sigmoid function, where x0 is, in general, the per-channel threshold value used in Eq (5).
Table 5. Comparison of classification accuracy of AEs generated by three gradient computing strategies under FGSM attacks.
| CIFAR-10 | MNIST | SVHN | TinyIN | ||
|---|---|---|---|---|---|
| VGG16-BEFB | STE | 64.24% | 81.65% | 71.02% | 43.21% |
| Sigmoid | 44.93% | 68.25% | 59.75% | 33.06% | |
| Zero Gradient | 44.40% | 67.42% | 59.28% | 31.97% | |
| ResNet34-BEFB | STE | 23.35% | 17.55% | 14.36% | 35.87% |
| Sigmoid | 14.87% | 13.84% | 12.71% | 5.82% | |
| Zero Gradient | 14.88% | 14.53% | 12.17% | 5.39% | |
\[
\sigma(x) = \frac{1}{1 + e^{-(x - x_0)}}
\qquad (6)
\]
From Table 5, it is clear that the classification accuracy under STE is much higher than under the zero gradient or the gradient of the center-translated sigmoid function, confirming that using STE to generate AEs indeed causes the obfuscated gradient effect. In the following experiments, we therefore employ the zero gradient of the threshold layer to generate AEs.
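The three gradient strategies can be sketched as interchangeable custom gradients for the binarization step (TensorFlow assumed; the factory function below is illustrative, not the authors' code): straight-through passes the gradient unchanged, zero gradient blocks it, and the center-translated sigmoid strategy uses the derivative of Eq (6) evaluated at the pre-threshold activations.

```python
import tensorflow as tf

def make_binarize(strategy="zero"):
    @tf.custom_gradient
    def binarize(a, x0):                       # x0 = t times the per-channel maximum
        b = tf.cast(a >= x0, a.dtype)
        def grad(db):
            if strategy == "ste":              # straight-through estimator
                da = db
            elif strategy == "zero":           # true (zero) gradient
                da = tf.zeros_like(db)
            else:                              # center-translated sigmoid surrogate (Eq (6))
                s = tf.sigmoid(a - x0)
                da = db * s * (1.0 - s)
            return da, tf.zeros_like(x0)
        return b, grad
    return binarize
```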
Performance comparison between BEFB-integrated models and original models under attacks
Tables 6 and 7 compare the classification accuracy of the BEFB-integrated models and the original models under white-box attacks, employing FGSM [1], PGD [30], AA [50], A3 [51], and C&W [52]. Table 8 compares the classification accuracy of the BEFB-integrated models and the original models under the black-box attack [53].
Table 6. Comparison of classification accuracy between BEFB-integrated models and original ones under white-box attacks.
| CIFAR-10 | MNIST | SVHN | TinyIN | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | FGSM | PGD | AA | A3 | FGSM | PGD | AA | A3 | FGSM | PGD | AA | A3 | FGSM | PGD | AA | A3 |
| VGG16 | 28.59% | 1.93% | 7.39% | 7.16% | 56.54% | 3.63% | 38.77% | 38.86% | 57.77% | 16.84% | 48.36% | 48.35% | 27.39% | 4.03% | 2.47% | 2.41% |
| VGG16-BEFB | 45.60% | 8.05% | 14.57% | 11.84% | 67.42% | 15.53% | 76.04% | 76.13% | 59.28% | 19.48% | 53.85% | 53.77% | 31.93% | 9.18% | 2.85% | 2.79% |
| ResNet34 | 7.82% | 0.47% | 15.16% | 14.82% | 12.66% | 0.76% | 25.77% | 25.75% | 24.14% | 2.49% | 49.36% | 49.30% | 5.80% | 0.00% | 5.14% | 5.07% |
| ResNet34-BEFB | 14.88% | 1.01% | 15.83% | 15.62% | 14.53% | 2.95% | 72.43% | 72.33% | 25.17% | 2.98% | 51.35% | 50.97% | 5.95% | 0.95% | 5.08% | 5.09% |
Table 7. Comparison of classification accuracy between BEFB-integrated models and original ones under C&W attack.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| VGG16 | 56.94% | 71.41% | 82.16% | 18.32% |
| VGG16-BEFB | 59.13% | 74.12% | 83.01% | 19.44% |
| ResNet34 | 57.74% | 68.24% | 81.23% | 21.64% |
| ResNet34-BEFB | 60.49% | 69.37% | 81.60% | 22.30% |
Table 8. Comparison of classification accuracy between BEFB-integrated models and original ones under black-box attack.
| CIFAR-10 | MNIST | SVHN | TinyIN | |
|---|---|---|---|---|
| VGG16 | 16.20% | 94.40% | 37.80% | 54.40% |
| VGG16-BEFB | 29.40% | 95.80% | 40.60% | 55.00% |
| ResNet34 | 40.60% | 69.20% | 35.60% | 66.60% |
| ResNet34-BEFB | 41.20% | 78.60% | 41.00% | 66.80% |
From the tables, it is clear that the BEFB-integrated models are more robust than the original models under both white-box and black-box attacks. For the white-box attacks, on the CIFAR-10 dataset the VGG16-BEFB model achieves 17%, 6%, 7%, and 4% higher accuracy than the original model under the FGSM, PGD, AA, and A3 attacks, respectively; on the MNIST dataset it achieves 11%, 12%, 37%, and 37% higher accuracy, respectively. For the black-box attack, the VGG16-BEFB model achieves 13% higher accuracy than the original model on CIFAR-10, and the ResNet34-BEFB model achieves 9% higher accuracy on MNIST.
Table 9 reports results for BEFB integrated with the WRN34-10 [54] backbone: the classification accuracy of WRN34-10-BEFB is better than that of the original WRN34-10 under the FGSM, AA, and A3 attacks on the CIFAR-10 dataset (both models drop to 0.00% under PGD).
Table 9. Comparison of classification accuracy between WRN34-10-BEFB model and original one on CIFAR-10 dataset.
|  | FGSM | PGD | AA | A3 |
|---|---|---|---|---|
| WRN34-10 | 4.13% | 0.00% | 9.25% | 9.12% |
| WRN34-10-BEFB | 5.01% | 0.00% | 9.42% | 9.33% |
Figs 14–17 depict how BEFB helps improve the robustness of the VGG16 model. In these figures, the gray features are extracted from the backbone branch, and the binary features are extracted from BEFB. Figs 14 and 15 examine the features extracted from both clean examples and AEs by the original VGG16 model and the VGG16-BEFB model on the TinyIN dataset, respectively. In Fig 14, selected TinyIN images under an FGSM attack with ϵ=16 are misclassified by the original VGG16 model, and there is a clear difference between the texture features extracted from the clean examples and those extracted from the AEs. In Fig 15, the same images under the same attack are classified correctly by the VGG16-BEFB model, and the difference between the texture features of the clean examples and the AEs is smaller than in the original model. More notably, the binary edge features of the clean examples and the AEs are almost identical; e.g., for the “ladybug” and “sock” images, the number of differing pixels is zero in both cases. This indicates the binary edge features play an important role in improving the classification accuracy on AEs. We then increase the perturbation strength and observe the extracted features and predictions of the VGG16-BEFB model. In Fig 16, the FGSM budget is raised to ϵ=20. The difference in texture features between clean examples and AEs is slightly larger than under ϵ=16; e.g., the RMSE (root mean square error) for the “jellyfish” image grows from 0.15 to 0.18, and for the “mushroom” image from 0.48 to 0.52. The difference in binary edge features is nearly unchanged; e.g., the number of differing pixels for both the “jellyfish” and “mushroom” images remains 1 under ϵ=16 and ϵ=20, and the predictions under ϵ=20 remain correct. In Fig 17, the budget is raised further to ϵ=24, and the predictions become wrong. The difference in texture features under ϵ=24 is slightly larger than under ϵ=20; e.g., the RMSE of the “jellyfish” image grows from 0.18 to 0.19, and that of the “mushroom” image from 0.52 to 0.57. The number of differing pixels in the binary edge features stays very close between ϵ=20 and ϵ=24; e.g., it is unchanged for the “mushroom” image and increases by one for the “jellyfish” image. Figs 16 and 17 both demonstrate that the binary edge features are less susceptible to small perturbations, and that exploring new ways of combining texture features and binary edge features is a key issue for further improving classification accuracy on AEs.
Fig 14. Texture features extracted by the original VGG16 model and its predictions under FGSM of ϵ=16 attack.
Fig 17. Binary edge features and the texture features extracted by the VGG16-BEFB model and its predictions under FGSM of ϵ=24 attack.
Fig 15. Binary edge features and the texture features extracted by the VGG16-BEFB model and its predictions under FGSM of ϵ=16 attack.
Fig 16. Binary edge features and the texture features extracted by the VGG16-BEFB model and its predictions under FGSM of ϵ=20 attack.
Similarly, Figs 18–21 depict how BEFB helps improve the robustness of the ResNet34 model. Figs 18 and 19 show that the original ResNet34 model misclassifies selected CIFAR-10 images under an FGSM attack with ϵ=8, while the ResNet34-BEFB model classifies them correctly because the extracted binary edge features of the clean examples and the AEs are almost the same. For example, for the “dog”, “airplane”, “automobile”, “ship”, and “frog” images, the number of differing pixels is zero. In Figs 20 and 21, the perturbation budget is raised to ϵ=12 and ϵ=16, respectively: the predictions remain correct under ϵ=12 but become wrong under ϵ=16. Looking at the binary edge features under these two attacks, the changes are zero in most cases, e.g., for the “dog”, “truck”, and “ship” images, again demonstrating that the binary edge features are less susceptible to small perturbations.
Fig 18. Texture features extracted by the original ResNet34 model and its predictions under FGSM of ϵ=8 attack.
Fig 21. Binary edge features and the texture features extracted by the ResNet34-BEFB model and its predictions under FGSM of ϵ=16 attack.
Fig 19. Binary edge features and the texture features extracted by the ResNet34-BEFB model and its predictions under FGSM of ϵ=8 attack.
Fig 20. Binary edge features and the texture features extracted by the ResNet34-BEFB model and its predictions under FGSM of ϵ=12 attack.
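For reference, the two difference measures quoted above (the RMSE between texture feature maps and the number of differing pixels between binary edge feature maps) can be computed as in the following sketch; the array names and shapes are assumptions.

```python
import numpy as np

def feature_rmse(clean_feat, adv_feat):
    # Root mean square error between texture feature maps of a clean example and its AE.
    return float(np.sqrt(np.mean((np.asarray(clean_feat) - np.asarray(adv_feat)) ** 2)))

def differing_pixels(clean_bin, adv_bin):
    # Number of positions where the binary edge feature maps disagree.
    return int(np.sum(np.asarray(clean_bin) != np.asarray(adv_bin)))
```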
We also compare the classification performance of the original models and the BEFB-integrated models on images perturbed by Gaussian noise. Figs 22 and 23 illustrate Gaussian-noise-perturbed images from the CIFAR-10 and MNIST datasets. Table 10 shows that the classification accuracy of the BEFB-integrated models is higher than that of the original models on both datasets; e.g., the ResNet34-BEFB model achieves 15% and 10% higher accuracy than the original model on CIFAR-10 and MNIST, respectively.
Fig 22. Gaussian noise with zero mean and 0.08 standard deviation on CIFAR-10 dataset.

Fig 23. Gaussian noise with zero mean and 0.35 standard deviation on MNIST dataset.

Table 10. Comparison of classification accuracy between the original models and BEFB-integrated models on Gaussian-noise-perturbed CIFAR-10 and MNIST datasets.
| CIFAR-10 | MNIST | |
|---|---|---|
| VGG16 | 61.35% | 85.17% |
| VGG16-BEFB | 64.17% | 88.75% |
| ResNet34 | 75.63% | 76.01% |
| ResNet34-BEFB | 91.62% | 86.11% |
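A minimal sketch of the perturbation used for Figs 22 and 23 is given below, assuming images scaled to [0, 1]; the standard deviation is 0.08 for CIFAR-10 and 0.35 for MNIST.

```python
import numpy as np

def add_gaussian_noise(images, std):
    # Zero-mean Gaussian noise; std = 0.08 for CIFAR-10 and 0.35 for MNIST (Figs 22 and 23).
    noisy = images + np.random.normal(0.0, std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)  # images assumed scaled to [0, 1]
```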
Combining BEFB-integrated models with SOTA robustness enhancement methods
We combine BEFB-integrated models with four popular robustness enhancement techniques, i.e., AT [26], PCL [38], DDPM [55], and BORT [56], and compare them with the original models equipped with the same techniques. We denote VGG16-BEFB models trained with AT as VGG16-BEFB-AT and VGG16-BEFB models trained with PCL as VGG16-BEFB-PCL; the same notation applies to the ResNet34-BEFB models and the original models. Fig 24 compares the classification accuracy of the BEFB-AT models with the AT-enhanced original models under PGD attack. Figs 25 and 26 compare the classification accuracy of the BEFB-PCL models with the PCL-enhanced original models under FGSM and PGD attacks.
Fig 24. Comparing AT enhanced BEFB-integrated models with original models under PGD attack.
Fig 25. PCL enhanced BEFB-integrated models and original models under FGSM attack.
Fig 26. PCL enhanced BEFB-integrated models and original models under PGD attack.
From the figures, we can see that the AT- and PCL-enhanced BEFB-integrated models achieve better classification accuracy than the AT- and PCL-enhanced original models.
In Table 11, DDPM and BORT are each combined with the WRN28-10 [54] and WRN28-10-BEFB models for robustness enhancement. The results show the DDPM/BORT-enhanced WRN28-10-BEFB model achieves higher classification accuracy than the correspondingly enhanced original WRN28-10 model under the FGSM, PGD, AA, and A3 attacks.
Table 11. Comparison of classification accuracy between DDPM/BORT enhanced WRN28-10-BEFB model and original one under FGSM, PGD, AA, and A3 attacks.
|  | FGSM | PGD | AA | A3 |
|---|---|---|---|---|
| WRN28-10-DDPM | 75.78% | 71.34% | 69.07% | 68.99% |
| WRN28-10-BEFB-DDPM | 76.49% | 71.81% | 69.55% | 69.47% |
| WRN28-10-BORT | 58.29% | 52.95% | 50.29% | 50.20% |
| WRN28-10-BEFB-BORT | 58.71% | 53.39% | 50.54% | 50.53% |
Ablation study
We first compare BEFB-integrated models using the learnable edge detectors with BEFB-integrated models using the traditional (fixed) Sobel edge detector, denoted BEFB-fixed. Table 12 shows the comparison of classification accuracy between the two. From the table, it can be seen that the average classification accuracy decreases when the learnable edge detectors are replaced with the traditional Sobel edge detector.
Table 12. Comparison of replacing learnable edge detectors with traditional Sobel edge detector.
| CIFAR-10 | SVHN | |||
|---|---|---|---|---|
| FGSM | PGD | FGSM | PGD | |
| VGG16-BEFB-fixed | 42.90% | 3.98% | 60.72% | 17.83% |
| VGG16-BEFB | 45.60% | 8.05% | 59.28% | 19.48% |
| ResNet34-BEFB-fixed | 11.37% | 1.09% | 23.12% | 3.21% |
| ResNet34-BEFB | 14.88% | 1.01% | 25.17% | 2.98% |
Considering that BEFB has two main components, i.e., Sobel layers and a threshold layer, we then compare the original BEFB-integrated models with variants in which the threshold layer or the Sobel layers are removed. The threshold-layer-removed models are denoted BEFB-tlre, and the Sobel-layers-removed models are denoted BEFB-slre. Table 13 shows the comparison of classification accuracy. From the table, it is clear that removing either the threshold layer or the Sobel layers weakens the robustness of the resulting BEFB-tlre and BEFB-slre models.
Table 13. Comparison of removing threshold layer and Sobel layers respectively.
| CIFAR-10 | SVHN | |||
|---|---|---|---|---|
| FGSM | PGD | FGSM | PGD | |
| VGG16-BEFB-tlre | 40.55% | 6.73% | 48.70% | 10.46% |
| VGG16-BEFB-slre | 33.73% | 2.90% | 55.22% | 15.89% |
| VGG16-BEFB | 45.60% | 8.05% | 59.28% | 19.48% |
| ResNet34-BEFB-tlre | 13.58% | 0.09% | 24.22% | 2.44% |
| ResNet34-BEFB-slre | 9.11% | 0.86% | 16.42% | 2.18% |
| ResNet34-BEFB | 14.88% | 1.01% | 25.17% | 2.98% |
Discussions
The experimental results in the preceding sections show that BEFB has no side effects on model training and that BEFB-integrated models are more robust than the original models under attacks. When combined with SOTA robustness enhancement techniques, BEFB-integrated models also achieve better classification accuracy than the original models. It can also be seen that the robustness gain brought by BEFB, while consistent, is not dramatic. One possible reason is that in BEFB the binary edge features are combined with the texture features by simple concatenation. We believe it is worthwhile to explore other ways of combining binary edge features and texture features, which could improve the robustness of DCNNs more substantially.
Conclusions
Enhancing the robustness of DCNNs is of great significance for safety-critical applications in the real world. Inspired by the principal way that human eyes recognize objects, in this paper we design four learnable edge detectors and propose a binary edge feature branch (BEFB), which can be easily integrated into any popular backbone. Experiments on multiple datasets show BEFB has no side effects on model training, and BEFB-integrated models are more robust than the original models. The work in this paper shows for the first time that it is feasible to combine shape-like features and texture features to make DCNNs more robust. In future work, we will explore other effective and efficient ways of combining binary edge features and texture features, and design an optimization framework for parameter search to yield models with good performance under attacks.
Acknowledgment
The authors wish to thank Dr. Huo-Ping Yi for the insights on evaluating the robustness of DCNNs.
Data Availability
All relevant data files are available at https://doi.org/10.6084/m9.figshare.28911614.v1.
Funding Statement
This work was supported by National Key Research and Development Program of China(2024YFB2409200), Open Foundation of the State Key Laboratory of Fluid Power and Mechatronic Systems(GZKF-202329), and National Natural Science Foundation of China(52293424, 52477065). The funders played no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. In: International conference on learning representations; 2015.
- 2. Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I. Intriguing properties of neural networks. In: International conference on learning representations; 2014.
- 3. Al-Maliki S, Bouanani FE, Ahmad K, Abdallah M, Hoang DT, Niyato D, et al. Toward improved reliability of deep learning based systems through online relabeling of potential adversarial attacks. IEEE Trans Rel. 2023;72(4):1367–82. doi: 10.1109/tr.2023.3298685
- 4. Qi P, Jiang T, Wang L, Yuan X, Li Z. Detection tolerant black-box adversarial attack against automatic modulation classification with deep learning. IEEE Trans Rel. 2022;71(2):674–86. doi: 10.1109/tr.2022.3161138
- 5. Bai T, Luo J, Zhao J, Wen B, Wang Q. Recent advances in adversarial training for adversarial robustness. In: Proceedings of the thirtieth international joint conference on artificial intelligence; 2021. p. 4312–21. doi: 10.24963/ijcai.2021/591
- 6. Wen Z, Jiang T, Huang Z, Cui X. Section sparse attack: A more powerful sparse attack method. Comput Sci. 2025:1–12.
- 7. Davitt LI, Cristino F, Wong AC-N, Leek EC. Shape information mediating basic- and subordinate-level object recognition revealed by analyses of eye movements. J Exp Psychol Hum Percept Perform. 2014;40(2):451–6. doi: 10.1037/a0034983
- 8. Sigurdardottir HM, Michalak SM, Sheinberg DL. Shape beyond recognition: Form-derived directionality and its effects on visual attention and motion perception. J Exp Psychol Gen. 2014;143(1):434–54. doi: 10.1037/a0032353
- 9. Kittler J. On the accuracy of the Sobel edge detector. Image Vision Comput. 1983;1(1):37–42. doi: 10.1016/0262-8856(83)90006-9
- 10. Berzins V. Accuracy of Laplacian edge detectors. Comput Vision Graph Image Process. 1984;27(2):195–210.
- 11. Forshaw MRB. Speeding up the Marr-Hildreth edge operator. Comput Vision Graph Image Process. 1988;41(2):172–85. doi: 10.1016/0734-189x(88)90018-7
- 12. Ding L, Goshtasby A. On the Canny edge detector. Pattern Recogn. 2001;34(3):721–5. doi: 10.1016/s0031-3203(00)00023-6
- 13. Xiao H, Xiao S, Ma G, Li C. Image Sobel edge extraction algorithm accelerated by OpenCL. J Supercomput. 2022;78(14):16236–65. doi: 10.1007/s11227-022-04404-8
- 14. Yasir M, Hossain MS, Nazir S, Khan S, Thapa R. Object identification using manipulated edge detection techniques. Science. 2022;3(1):1–6.
- 15. Chen Z, He Z, Lu Z-M. DEA-Net: Single image dehazing based on detail-enhanced convolution and content-guided attention. IEEE Trans Image Process. 2024;33:1002–15. doi: 10.1109/TIP.2024.3354108
- 16. Bengio Y, Léonard N, Courville A. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint; 2013. doi: 10.48550/arXiv.1308.3432
- 17. Yang Z, Lee J, Park C. Injecting logical constraints into neural networks via straight-through estimators. In: Proceedings of international conference on machine learning; 2022. p. 25096–122.
- 18. Vanderschueren A, De Vleeschouwer C. Are straight-through gradients and soft-thresholding all you need for sparse training? In: Proceedings of IEEE winter conference on applications of computer vision; 2023. p. 3808–17.
- 19. Athalye A, Carlini N, Wagner D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In: Proceedings of international conference on machine learning; 2018. p. 274–83.
- 20. Yue K, Jin R, Wong CW, Baron D, Dai H. Gradient obfuscation gives a false sense of security in federated learning. In: 32nd USENIX security symposium (USENIX Security); 2023. p. 6381–98.
- 21. Huang Y, Yu Y, Zhang H, Ma Y, Yao Y. Adversarial robustness of stabilized neural ODE might be from obfuscated gradients. In: Mathematical and scientific machine learning. PMLR; 2022. p. 497–515.
- 22. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations; 2015.
- 23. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2016. p. 770–8.
- 24. He F, Liu T, Tao D. Why ResNet works? Residuals generalize. IEEE Trans Neural Netw Learn Syst. 2020;31(12):5349–62. doi: 10.1109/TNNLS.2020.2966319
- 25. Tsipras D, Santurkar S, Engstrom L, Turner A, Madry A. Robustness may be at odds with accuracy. In: International conference on learning representations; 2019.
- 26. Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. In: International conference on learning representations; 2018.
- 27. Kannan H, Kurakin A, Goodfellow I. Adversarial logit pairing. arXiv preprint; 2018. doi: 10.48550/arXiv.1803.06373
- 28. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2009. p. 248–55.
- 29. Wong E, Rice L, Kolter JZ. Fast is better than free: Revisiting adversarial training. In: International conference on learning representations; 2020.
- 30. Kurakin A, Goodfellow I, Bengio S. Adversarial machine learning at scale. In: International conference on learning representations; 2017.
- 31. Xu Z, Zhu G, Meng C, Ying Z, Wang W, Gu M, et al. A2: Efficient automated attacker for boosting adversarial training. Adv Neural Inform Process Syst. 2022;35:22844–55.
- 32. Li T, Wu Y, Chen S, Fang K, Huang X. Subspace adversarial training. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2022. p. 13409–18.
- 33. Dabouei A, Taherkhani F, Soleymani S, Nasrabadi NM. Revisiting outer optimization in adversarial training. In: Proceedings of European conference on computer vision. Springer; 2022. p. 244–61.
- 34. Zhou D, Song Z, Chen Z, Huang X, Ji C, Kumari S, et al. Advancing explainability of adversarial trained Convolutional Neural Networks for robust engineering applications. Eng Appl Artif Intell. 2025;140:109681. doi: 10.1016/j.engappai.2024.109681
- 35. Eleftheriadis C, Symeonidis A, Katsaros P. Adversarial robustness improvement for deep neural networks. Mach Vision Appl. 2024;35(3). doi: 10.1007/s00138-024-01519-1
- 36. Li X, Wang Z, Zhang B, Sun F, Hu X. Recognizing object by components with human prior knowledge enhances adversarial robustness of deep neural networks. IEEE Trans Pattern Anal Mach Intell. 2023;45(7):8861–73. doi: 10.1109/TPAMI.2023.3237935
- 37. Sitawarin C, Pongmala K, Chen Y, Carlini N, Wagner D. Part-based models improve adversarial robustness. In: International conference on learning representations; 2023.
- 38. Mustafa A, Khan SH, Hayat M, Goecke R, Shen J, Shao L. Deeply supervised discriminative learning for adversarial defense. IEEE Trans Pattern Anal Mach Intell. 2021;43(9):3154–66. doi: 10.1109/TPAMI.2020.2978474
- 39. Seo S, Lee Y, Kang P. Cost-free adversarial defense: Distance-based optimization for model robustness without adversarial training. Comput Vision Image Understand. 2023;227:103599. doi: 10.1016/j.cviu.2022.103599
- 40. Yan Z, Guo Y, Zhang C. Adversarial margin maximization networks. IEEE Trans Pattern Anal Mach Intell. 2021;43(4):1129–39. doi: 10.1109/TPAMI.2019.2948348
- 41. Guo Y, Zhang C. Recent advances in large margin learning. IEEE Trans Pattern Anal Mach Intell. 2022;44(10):7167–74. doi: 10.1109/TPAMI.2021.3091717
- 42. Moosavi-Dezfooli SM, Fawzi A, Frossard P. DeepFool: A simple and accurate method to fool deep neural networks. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2016. p. 2574–82.
- 43. Yuan X, Li J, Kuruoglu EE. Robustness enhancement in neural networks with alpha-stable training noise. Digital Signal Process. 2025;156:104778. doi: 10.1016/j.dsp.2024.104778
- 44. Amerehi F, Healy P. Label augmentation for neural networks robustness. arXiv preprint arXiv:2408.01977; 2024.
- 45. Dong M, Li Y, Wang Y, Xu C. Adversarially robust neural architectures. IEEE Trans Pattern Anal Mach Intell. 2025;47(5):4183–97. doi: 10.1109/TPAMI.2025.3542350
- 46. Krizhevsky A, Hinton G. Learning multiple layers of features from tiny images; 2009.
- 47. Deng L. The MNIST database of handwritten digit images for machine learning research [Best of the Web]. IEEE Signal Process Mag. 2012;29(6):141–2. doi: 10.1109/msp.2012.2211477
- 48. Netzer Y. Reading digits in natural images with unsupervised feature learning. In: Proceedings of the NIPS workshop on deep learning and unsupervised feature learning; 2011. p. 4.
- 49. Le Y, Yang X. Tiny ImageNet visual recognition challenge. CS 231N. 2015;7(7):3.
- 50. Croce F, Hein M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In: International conference on machine learning. PMLR; 2020. p. 2206–16.
- 51. Liu Y, Cheng Y, Gao L, Liu X, Zhang Q, Song J. Practical evaluation of adversarial robustness via adaptive auto attack. In: Proceedings of IEEE conference on computer vision and pattern recognition; 2022. p. 15105–14.
- 52. Carlini N, Wagner D. Towards evaluating the robustness of neural networks. In: 2017 IEEE symposium on security and privacy (SP); 2017. p. 39–57. doi: 10.1109/sp.2017.49
- 53. Su J, Vargas DV, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Trans Evol Computat. 2019;23(5):828–41. doi: 10.1109/tevc.2019.2890858
- 54. Zagoruyko S, Komodakis N. Wide residual networks. arXiv preprint arXiv:1605.07146; 2016.
- 55. Wang Z, Pang T, Du C, Lin M, Liu W, Yan S. Better diffusion models further improve adversarial training. In: International conference on machine learning. PMLR; 2023. p. 36246–63.
- 56. Huang D, Bu Q, Qing Y, Pi H, Wang S, Cui H. Two heads are better than one: Robust learning meets multi-branch models. arXiv preprint; 2022. doi: 10.48550/arXiv.2208.08083
















