Abstract
Machine learning models have difficulty generalizing to data outside of the distribution they were trained on. In particular, vision models are usually vulnerable to adversarial attacks or common corruptions, to which the human visual system is robust. Recent studies have found that regularizing machine learning models to favor brain-like representations can improve model robustness, but it is unclear why. We hypothesize that the increased model robustness is partly due to the low spatial frequency preference inherited from the neural representation. We tested this simple hypothesis with several frequency-oriented analyses, including the design and use of hybrid images to probe model frequency sensitivity directly. We also examined many other publicly available robust models that were trained on adversarial images or with data augmentation, and found that all these robust models showed a greater preference for low spatial frequency information. We show that preprocessing by blurring can serve as a defense mechanism against both adversarial attacks and common corruptions, further confirming our hypothesis and demonstrating the utility of low spatial frequency information in robust object recognition.
Author summary
Though artificial intelligence has achieved high performance on various vision tasks, its ability to generalize to out-of-distribution data is limited. Most remarkably, machine learning models are extremely sensitive to input perturbations such as adversarial attacks and common corruptions. Previous studies have observed that imposing an inductive bias towards brain-like representations can improve the robustness of models, but the reasons underlying this benefit were left unknown. In this work, we propose and test the hypothesis that the robustness of brain-like models can be accounted for by a low-frequency feature preference inherited from the neural representation. We designed a novel machine learning task to probe the frequency bias of different models and observed a strong correlation between this bias and model robustness. We believe this work serves as a first step towards understanding why biological visual systems generalize well to out-of-distribution data and provides an explanation for the robustness of state-of-the-art machine learning models trained with various methods. It also opens the door to applying computational principles of the brain in artificial intelligence, hence helping to overcome the fundamental difficulties faced by current AI methods.
Introduction
Currently, deep neural networks are the state-of-the-art models for numerous computer vision tasks such as object detection [1], image recognition [2], and semantic segmentation [3]. However, these models are often not robust, as demonstrated by their inability to generalize to new data distributions. For instance, current neural networks are not able to generalize to common corruption noise such as ImageNet-C [4], where network performance is stress tested against 15 different kinds of image corruption applied to the ImageNet dataset. Furthermore, these models are extremely sensitive to targeted noise such as adversarial attacks [5]. In contrast, the human visual system does not seem to suffer from such problems: in particular, recognition of object identity is little affected by common corruptions [6], and adversarial perturbations that break machine learning models are imperceptible to humans [5]. This difference in behavior between brains and deep learning algorithms might be explained by differences in inductive bias, i.e. which features they learn from data. An inductive bias is a bias towards which networks are learned by the optimization algorithm from a class of models given the training data. Natural vision appears to have an inductive bias towards robust features, that is, features insensitive to perturbations that do not change perceptually relevant latent variables such as object identity. How can we instill the brain's inductive bias towards robustness to targeted and random noise distortions into deep learning algorithms? Recent work has shown that machine learning models that are encouraged to learn brain-like representations, a paradigm known as neural regularization, are also more robust to adversarial attacks and to certain common corruptions such as Gaussian noise [7, 8]. Furthermore, other work has shown that models that explain more variation in primate primary visual cortex (V1) responses also tend to be more robust to adversarial attacks [9].
Much recent parallel work has also attempted to produce models that are robust to common corruptions (ANT [10], SIN [11], DeepAugment, AugMix [4]) and to adversarial attacks (PGD training [12], TRADES [13]). These models achieve significant improvements over baseline models by employing several methods, including data augmentation, adversarial training, and anti-aliasing [14]. AutoAugment, a data augmentation method for common corruption robustness, has produced improvements mainly for high-frequency corruptions [15]. Consistent with this work, we hypothesize that one reason for the success of current robustness methods, for both common corruptions and adversarial attacks, can be explained by a simple computational principle: models that are biased to rely on low spatial frequency information for object recognition are more robust. Here, we tested this hypothesis with several frequency-oriented analyses. Finally, we introduced a simple preprocessing step based on this principle that produced robust models with performance comparable to these more sophisticated methods.
Results
Neural regularization boosts model robustness
There are many ways to bias a machine learning model toward brain-like computations, such as the choice of architecture, learning rules, training dataset, or the task's objective functions [16]. Among them, methods that use auxiliary loss functions to bias models towards brain-like representations are called neural regularization. We will focus on two particular forms of neural regularization: neural similarity regularization [7, 17] and neural response regularization [8]. Both methods directly encourage a model to learn a brain-like representation of the visual stimuli.
Neural responses to natural images lie on a low-dimensional manifold in response space [18]. Similarly, for any machine learning model, the features of image stimuli form their own manifold embedded in the model's feature space. Neural similarity regularization adjusts the manifold in the model's hidden layers so that it is biased towards the same geometry as the manifold in the neural response space (Fig 1), while the model performs benchmark tasks like image classification at the same time. Li et al. [7] showed that a ResNet18 model regularized so that its representational similarity [19], i.e. the pairwise cosine similarity between the representations of a set of images, matches that measured in mouse V1 is more robust against Gaussian noise and adversarial attacks. Another way to induce brain-like representations is to require the model to predict neural responses to the visual input as an auxiliary task. Safarani et al. [8] showed that a VGG19 model co-trained with monkey V1 data, where the VGG19 core was used to predict neural responses directly, is more robust against common corruptions on Tiny ImageNet [21]. Since the neural readout module is shallow, the VGG19 core has to encode neural features for the co-training to work.
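To make the similarity-based variant concrete, below is a minimal PyTorch sketch of how such a regularizer could be combined with a classification loss. The function and argument names (similarity_matrix, regularized_loss, neural_similarity, lambda_reg) are our own illustrative choices, and the actual training code of Li et al. [7] differs in its details.

```python
import torch.nn.functional as F

def similarity_matrix(features):
    # Pairwise cosine similarity between the representations of a batch of images.
    z = F.normalize(features.flatten(start_dim=1), dim=1)
    return z @ z.t()

def regularized_loss(logits, labels, features, neural_similarity, lambda_reg=1.0):
    # Standard classification objective.
    task_loss = F.cross_entropy(logits, labels)
    # Penalty for the mismatch between the model's representational similarity matrix
    # and the similarity matrix measured from neural responses to the same images.
    sim_loss = F.mse_loss(similarity_matrix(features), neural_similarity)
    return task_loss + lambda_reg * sim_loss
```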
In this study, we extended the evaluation of both forms of neural regularization, one using the mouse representational similarity matrix and the other using monkey responses directly, by testing model robustness against various common corruptions and adversarial attacks.
Since neurally regularized models are trained on grayscale images, we used grayscale versions of the CIFAR10-C and TinyImageNet-C datasets for evaluation. Models regularized with either mouse or monkey V1 representations have higher accuracy at different severity levels of common corruptions compared to baseline models (Fig 2a and 2e), though their performance on clean images is slightly worse. Targeted gradient-based boundary attacks [20] were performed to find the minimum perturbations needed to change model predictions to wrong categories. The use of strong attacks is critical in evaluating adversarial robustness, as weak attacks such as FGSM only provide a loose upper bound on the minimum adversarial perturbation size. We verified the effectiveness of boundary attacks by extensive comparison with many off-the-shelf methods, and are confident that they characterize the adversarial robustness of models. Evaluation on the testing set showed that neurally regularized models need larger adversarial perturbations on average (Fig 2b and 2f), i.e. they are more robust against adversarial attacks. The attack size that gives a 50% success rate is ϵ = 1.25/255 for the baseline ResNet and ϵ = 2.89/255 for the mouse regularized one. Looking at the distribution of the minimum perturbation size needed for each image, the mean and standard deviation are ϵ = (1.34 ± 0.70)/255 for the baseline ResNet and ϵ = (3.09 ± 1.61)/255 for the mouse regularized one. Similarly, the attack size that gives a 50% success rate is ϵ = 1.84/255 for the baseline VGG and ϵ = 2.46/255 for the monkey regularized one, and the minimum perturbation size across images is ϵ = (1.93 ± 0.89)/255 for the baseline VGG and ϵ = (2.64 ± 1.26)/255 for the monkey regularized one. In the following text, we use the mean of the minimum perturbation sizes to characterize the adversarial robustness of a model.
A natural question is: what does neural regularization do? First, we explored the structure of the neural similarity matrix of mouse V1 responses with a simple decomposition analysis. We calculated the eigenvalues and eigenvectors of the similarity matrix and computed the linear approximation of each principal component with respect to the image. We observed that the geometry of the mouse V1 representation is well approximated by a small number of principal components, and their spatial tuning reveals low-frequency structure (Fig D in S1 Appendix). This led us to ask whether the bias towards low-frequency features is inherited by neurally regularized models. We divided the 15 common corruptions in the CIFAR10-C and TinyImageNet-C datasets [22] into three categories based on their frequency spectra (Table A in S1 Appendix). Low-frequency corruptions: ‘snow’, ‘frost’, ‘fog’, ‘brightness’, ‘contrast’; medium-frequency corruptions: ‘motion_blur’, ‘zoom_blur’, ‘defocus_blur’, ‘glass_blur’, ‘elastic_transform’, ‘jpeg_compression’, ‘pixelate’; high-frequency corruptions: ‘gaussian_noise’, ‘shot_noise’, ‘impulse_noise’. Fourier spectra of the different corruptions are shown in Fig A in S1 Appendix. The biggest boost in model classification accuracy comes from the category of high-frequency corruptions (Fig 2c and 2g).
We also compared the adversarial perturbations for baseline models and neurally regularized ones. We performed a Fourier analysis on the minimal adversarial perturbations found through the boundary attack [20] and calculated the average frequency spectrum for different models (insets of Fig 2d and 2h). We observed that the perturbations for the mouse and monkey regularized models contain relatively more low-frequency components than those for the baseline models. To quantify this frequency shift, we characterized the frequency preference by a radial profile of the spectrum of the adversarial perturbation, i.e. the fraction of Fourier power at frequencies below a given value. We found that the radial profile of the adversarial perturbation spectrum for the neurally regularized models is shifted toward lower frequencies (Fig 2d and 2h). Furthermore, we quantified this shift by the half-power frequency f0.5, the frequency below which half of the Fourier power lies (marked by the dashed lines in Fig 2d and 2h). f0.5 = 0.316 ± 0.015 for the mouse regularized ResNet and f0.5 = 0.442 ± 0.009 for the baseline, with mean and standard deviation estimated from 1000 images (Fig 2d). Similarly, f0.5 = 0.306 ± 0.020 for the monkey regularized VGG and f0.5 = 0.316 ± 0.034 for the baseline (Fig 2h). The half-power frequency f0.5 is smaller for the neurally regularized models compared with their baselines, though the effect on the monkey regularized model is much weaker than on the mouse regularized one.
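For concreteness, the radial profile and half-power frequency can be computed as in the following NumPy sketch, where frequencies are expressed in cycles per pixel and the function name half_power_frequency is our own.

```python
import numpy as np

def half_power_frequency(perturbation):
    """Half-power frequency f_0.5 of a 2D (grayscale) perturbation, in cycles/pixel."""
    power = np.abs(np.fft.fft2(perturbation)) ** 2
    fy = np.fft.fftfreq(perturbation.shape[0])[:, None]
    fx = np.fft.fftfreq(perturbation.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)              # radial frequency of each Fourier component
    order = np.argsort(radius, axis=None)
    profile = np.cumsum(power.flatten()[order])      # radial profile: power below each frequency
    profile /= profile[-1]
    return radius.flatten()[order][np.searchsorted(profile, 0.5)]
```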
Hybrid image experiment
To directly probe the frequency bias of models, we next designed a new dataset of hybrid images constructed by mixing the low-frequency components and high-frequency components from two different images [23]. We select two images belonging to two different categories, and examine whether the model prediction on the mixed image is consistent with either component. For a given mixing frequency fmix, we combine the Fourier components of an image whose frequencies are smaller than fmix with Fourier components of another image whose frequencies are larger than fmix, and then use an inverse Fourier transformation to get a hybrid image (Fig 3a, see Methods for more details).
We denote the probability that a hybrid image is classified as the class of the low- or high-frequency seed image as plow and phigh respectively, and calculate the probability difference plow − phigh at different mixing frequency values. When the mixing frequency is very small, the hybrid image is close to the high-frequency seed image, so phigh is high for a properly trained model. Likewise, when the mixing frequency is large, plow is high. We are interested in the mixing frequency at which plow = phigh, as it characterizes the frequency bias of a model. We term this frequency the reversal frequency frev. The results show that frev is smaller for the mouse regularized model (0.371 ± 0.0006, mean ± standard deviation estimated from 4 randomly permuted hybrid image datasets) than for the baseline model (0.528 ± 0.0009), indicating that the neurally regularized model is more likely to classify the hybrid image as the class of the low-frequency component (Fig 3b). This provides strong evidence that the robustness gained through mouse neural regularization could be due to a bias towards low-frequency features. The same experiments and analysis were done for the monkey regularized models as well, but the low-frequency bias is smaller compared to mouse. frev is 0.376 ± 0.003 and 0.426 ± 0.002 for the monkey regularized VGG19 and its baseline model, respectively (Fig B in S1 Appendix).
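As a sketch (not our exact analysis code), frev can be estimated by sweeping the mixing frequency and interpolating the zero crossing of plow − phigh. The helper make_hybrid stands for the construction described in Methods, and all names here are illustrative.

```python
import numpy as np
import torch

def reversal_frequency(model, pairs, mix_freqs, make_hybrid):
    """Estimate f_rev, the mixing frequency at which p_low equals p_high.

    pairs: list of ((img_low, label_low), (img_high, label_high)) seed-image pairs.
    make_hybrid(img_low, img_high, f_mix): returns the hybrid image (see Methods).
    """
    diffs = []
    for f_mix in mix_freqs:                      # mix_freqs assumed sorted ascending
        p_low = p_high = 0.0
        for (img_low, y_low), (img_high, y_high) in pairs:
            hybrid = torch.as_tensor(make_hybrid(img_low, img_high, f_mix),
                                     dtype=torch.float32)
            with torch.no_grad():
                pred = model(hybrid.unsqueeze(0)).argmax(dim=1).item()
            p_low += (pred == y_low) / len(pairs)
            p_high += (pred == y_high) / len(pairs)
        diffs.append(p_low - p_high)
    # p_low - p_high increases with f_mix; f_rev is where it crosses zero.
    return float(np.interp(0.0, diffs, mix_freqs))
```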
Since the neurally induced robustness appears to be largely accounted for by a change in sensitivity to different spatial frequencies, it is straightforward to implement this principle via simple image preprocessing. We propose two such filtering methods and describe them in the next section.
Frequency analysis on robust models
We then asked if other models that are not regularized with neural data, but engineered to be robust to common corruptions and adversarial attacks also have this low frequency bias. To this end, we analyzed and compared several robust models trained on CIFAR10, most of which were downloaded from RobustBench [24]. Some of the models are trained for adversarial robustness and some are trained for common corruption robustness. A full description of models is listed in Table B in S1 Appendix. Since the mouse regularized model in Fig 2 was trained on grayscale CIFAR10, it is not included in this comparison.
Two additional models were trained and included in this comparison. Both models were constructed by attaching a preprocessing layer before a ResNet18 model. The parameters of the preprocessing are not trainable but fixed beforehand. One is called the ‘blur’ model, as the preprocessing is a convolution with a Gaussian kernel of standard deviation σ. The other is called the ‘PCA’ model, as the preprocessing keeps only the first K principal components (PCs) calculated from all training images. σ and K are chosen such that the classification accuracy on CIFAR10 of both models is around 90%, similar to the other robust models. More details are included in the Methods section.
The ‘blur’ model directly biases the model towards low-frequency features, because high-frequency features are attenuated during the preprocessing step. Filtering features based on their spatial frequency can be treated as a special form of reweighting of input principal components, since principal component analysis on natural images with translation invariance recovers the Fourier basis, and the variance explained by a component usually decreases monotonically as frequency increases. Hence a ‘PCA’ model that explicitly projects the original image onto the low-dimensional space spanned by the first PCs is also included in the comparison. The hypothesis behind the ‘PCA’ model is that directions parallel to the data manifold are robust features, while perturbations orthogonal to the manifold can be safely removed. Since natural images have the greatest variance at low frequencies, the distance from any data point to a class boundary is largest along low-frequency dimensions, and thus a model that mainly uses these low frequencies should be more robust. Previous work also found that this kind of PCA preprocessing increases robustness to adversarial attacks [25].
Similar to the analysis done on neurally regularized models, we calculated the Fourier spectrum of the minimum adversarial perturbations for all robust models. A fixed set of 1000 images was selected from the testing set, and the incorrect target class for each image was also fixed. The adversarial perturbations for robust models, including the ones not specifically trained for adversarial robustness, contained more low spatial frequency components, while for the baseline models, adversarial perturbations were dominated by high spatial frequency components (Fig 4a). To better visualize differences in the shape of the adversarial spectra, we plotted the radial spectrum for each model. The portions of the spectral power within different spatial frequencies are compared (Fig 4b), and the half-power frequency f0.5 is marked by the dashed line. Compared with the baseline model, all robust models have smaller f0.5, indicating that their minimum adversarial perturbations have lower frequencies. The average minimal perturbation size is plotted against f0.5 in Fig 4c, revealing a negative correlation between adversarial robustness and the frequency bias of the model.
We next tested model predictions on hybrid images. Hybrid images for RGB CIFAR10 are constructed in the same way as in Fig 3a. The reversal frequencies frev for all models are plotted against classification accuracy on the CIFAR10-C dataset (averaged over all corruptions and severity levels) in Fig 5b. The results show that all models are more robust to common corruptions than the baseline, even those trained for adversarial robustness. They all have a low-frequency bias as characterized by frev, which suggests that low-frequency information is more important to these models. However, one interesting observation is that, unlike adversarial robustness, common corruption robustness seems to peak at a specific frev. For example, the adversarially robust models (blue) have lower corruption accuracy compared to the other robust models. This seems to indicate a trade-off, where relying on too few frequencies ends up hurting performance. We explored this by training ‘blur’ models with increasingly wide blurring kernels. In Fig 5b, we observe that most models, except the ones trained for common corruption robustness, lie close to the curve (light green) fit to the ‘blur’ models (light green crosses). This indicates that the decrease in performance as a function of frev can be partly explained by the frequency preference of the models.
Next, we explored the relationship between robustness and frequency bias in models trained for ImageNet classification. In addition to a baseline model, we analyzed two adversarially trained models, six models trained for robustness to common corruptions, and a model with a blurring preprocessing layer (Table C in S1 Appendix). In Fig 6a, we plot the minimum adversarial perturbation size ϵ against the half-power frequency f0.5 of the adversarial perturbation spectrum. The minimum perturbation size is negatively correlated with the half-power frequency, indicating that more adversarially robust models are attacked by perturbations with more energy at low frequencies.
Furthermore, Fig 6b shows corruption accuracy versus the reversal frequency frev for the ImageNet-C dataset. Here we observe that the common corruption robust models are more robust than the baseline and have smaller frev, consistent with our findings on CIFAR10 models. However, the adversarially robust models are less robust than the baseline even though they have smaller frev. This is probably because the clean performance of the adversarially robust models on this dataset is too low for a fair comparison to the baseline (Table C in S1 Appendix).
Similar to the analysis done for CIFAR10 models, we trained a series of ‘blur’ models with different blurring σ (light green crosses in Fig 6b). We observe the same ‘Goldilocks’ behavior in Fig 6b: either too much or too little blurring decreases model performance on corrupted images. Other models deviate from this ‘blur’ model manifold more than in the CIFAR10 results (Fig 5b), suggesting that a simple low-/high-frequency view of robustness is less accurate for models trained on larger images; for example, some robust features in such a dataset may not be entirely low-frequency. While the ‘blur’ models for CIFAR10 are trained from scratch, the ‘blur’ models for ImageNet are initialized from a pre-trained baseline ResNet50 model (‘base’ in Fig 6) and fine-tuned after the fixed blurring layer is added. This might bias the ‘blur’ models towards the baseline model and affect their frequency tuning properties.
Discussion
Recent studies have linked brain-like representations to the robustness of deep network models [7–9]. By introducing a novel hybrid image task, we were able to study the frequency preference of neural networks and show that these brain-like models are more robust mainly through a preference for low spatial frequency features. This work follows in spirit the approach of Geirhos et al. [11], where the authors combined the shape of one input with the texture of another to produce an input whose features carry “conflicting information” about different classes. Such cue-conflict tasks appear useful for understanding why a model prefers certain features over others.
Furthermore, we have shown that this bias is present in an extensive group of machine learning models trained for robustness: robustness to common corruptions and to adversarial examples are both correlated with this low-frequency bias. However, there seems to be a difference between robust models on CIFAR10 and on ImageNet. Adversarially robust models are more robust to common corruptions on CIFAR10 but not on ImageNet. Previous work has also shown that adversarially robust models are not robust to common corruptions [26], but attributed this behavior to the two types of perturbation having qualitatively different properties. In our case, we argue that adversarial robustness on ImageNet is more difficult to achieve, so the models we analyzed cannot handle perturbations as large as those in common corruptions, unlike CIFAR10-trained models. All of this seems to indicate that robustness on different datasets might be achieved through different methods, but a low-frequency preference seems to be a key component of current robust models.
The mouse regularized model shows a clear bias towards low spatial frequencies compared with the baseline (Figs 2d and 3b), which must be inherited from the neural representation of mouse V1. We therefore decomposed the mouse V1 manifold and looked at the spatial tuning of its major components. We found that the dominant dimensions of the neural population responses can be approximately described by linear receptive fields with low spatial frequency (Fig D in S1 Appendix), and that this is the origin of the low-frequency preference of the mouse regularized model. Although we observe a statistically significant low-frequency bias in the monkey regularized model, it is much weaker than in the mouse regularized one. We speculate that this is because the monkey V1 representation encodes other robust features that are not low spatial frequency. The analysis in Safarani et al. [8] shows that the monkey regularized model emphasizes the salient regions of images more than control models, which suggests that its robustness is due to more than just a change in frequency sensitivity.
Previous physiological experiments [27–29] have measured the tuning properties of neurons in mouse and monkey visual cortices. Here we take the preferred spatial frequencies from Table 1 of Van den Bergh et al. [29], which are 0.04 cycles/deg for mouse V1 and 2.43 cycles/deg for monkey V1. We can safely assume each natural image used in the neural regularization studies [7, 8] contains one object, and treat the image size as a universal length unit. In the mouse regularization work, each image used in the experiment spans approximately 67.5 visual degrees in height, so the preferred frequency of a V1 neuron is about 2.7 cycles per image. In the monkey regularization work, each image spans approximately 6.7 visual degrees, so the preferred frequency of a V1 neuron is about 16.3 cycles per image. The difference in normalized preferred spatial frequencies is consistent with the observation that the mouse regularized model shows a stronger low-frequency bias and greater robustness than its monkey regularized counterpart.
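In units of cycles per image, these preferred frequencies work out to

$$0.04\ \tfrac{\text{cycles}}{\text{deg}} \times 67.5\ \tfrac{\text{deg}}{\text{image}} \approx 2.7\ \tfrac{\text{cycles}}{\text{image}}\ \text{(mouse)}, \qquad 2.43\ \tfrac{\text{cycles}}{\text{deg}} \times 6.7\ \tfrac{\text{deg}}{\text{image}} \approx 16.3\ \tfrac{\text{cycles}}{\text{image}}\ \text{(monkey)}.$$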
In addition, we were able to enforce a bias towards low spatial frequencies with a simple preprocessing layer and produce robustness on par with this group of robust machine learning methods. Previous work has also tried to use frequency information to produce robust models for ImageNet [14, 30, 31]. For example, anti-aliasing has been shown to improve robustness to common corruptions when introduced before or after the nonlinearities of different models [14, 31]. However, this method seems to behave differently from our preprocessing models. For example, anti-aliased models decrease in performance on high-frequency corruptions, similar to baseline models, in contrast to the other robust models, which perform better on high-frequency corruptions than on low-frequency ones. As the authors explained, the anti-aliasing methods focus on producing shift-invariant models while preserving as much high-frequency information as possible. This might be why these models behave differently from the ‘blur’ or ‘PCA’ models, which explicitly attenuate high-frequency information in the input, and why our models behave more similarly to neurally regularized and data-augmented models. This indicates that there can be different approaches to producing models robust to common corruptions, focusing on different properties of the frequency spectrum.
However, as previous work has shown, a low-frequency bias has its limitations. For example, Yin et al. [15] showed that training a model on the “Fog” corruption using only its frequency content, but not its spatial structure, is insufficient for generalizing to the same “Fog” corruption at test time. This makes sense given that the corruption has a very specific spatial structure that does not depend only on frequency. In addition, as we observed in Figs 5b and 6b, the bias towards low frequencies can be too strong, such that performance deteriorates. This suggests that to reduce the gap between humans and machine learning models in out-of-distribution generalization, we must move beyond the low-frequency preference found in current robust machine learning models and find a better and more principled inductive bias.
One possibility for achieving this grand goal is to use ideas from neuroscience. As our work and previous work suggest, neuroscience has provided inspiration for machine learning, but specific paradigms for directly translating biological insights and data into performance improvements in machine learning are largely absent. The neural regularization approach is a demonstration that this direction can help engineer more intelligent algorithms and usher in the new field of NeuroAI. Furthermore, given that current machine learning models are still not robust to adversarial attacks and common corruptions, this work is just the start of bridging the gap between natural and artificial intelligence.
Besides making model representations more brain-like, recent research has also introduced other neural features into machine learning models to improve robustness, for example using biologically constrained Gabor filters as the first layer and adding stochasticity to mimic the spiking responses of neural systems [9]. The biologically constrained Gabor filter front end functions much like our blurring preprocessing layer, since both reweight Fourier components to enhance features within a certain spatial frequency interval. We did not investigate stochasticity, as there is evidence that such defense mechanisms often cause gradient masking in the form of obfuscated gradients [32] and therefore overestimate model robustness. When properly attacked, for example by computing the expectation over instantiations of the randomness, the apparent adversarial robustness disappears [32]. Moreover, the evaluation of stochastic models does not follow the same protocol as that of deterministic ones [24]; we therefore focus only on deterministic models, which gain robustness by using representations different from those of baseline models.
One recent study [33] measured the preferred spatial frequency of individual model units via in-silico electrophysiology experiments and showed that the representations of robust models are more aligned with macaque V1 neurons in terms of the distribution of preferred spatial frequencies than those of non-robust models. Based on the eigenspectrum analysis, they suggested that the robustness is due to a smaller proportion of high-frequency-tuned units in the robust models. Our work complements their results, as we analyze model frequency preference at the behavioral level rather than at the level of individual units.
Methods
Hybrid images
We randomly select two images Ilow and Ihigh from two different categories as seed images for the low-frequency and high-frequency components respectively. The 2D Fourier transforms of these two images are denoted as Ĩlow and Ĩhigh, and we define a binary mask M in the frequency domain as

$$M(f_x, f_y) = \begin{cases} 1, & \sqrt{f_x^2 + f_y^2} < f_{\mathrm{mix}} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

The Fourier transform of the hybrid image is thus the combination of Ĩlow and Ĩhigh through M, according to Ĩhybrid = M ⊙ Ĩlow + (1 − M) ⊙ Ĩhigh, where ⊙ denotes an element-wise product. The hybrid image Ihybrid is calculated from Ĩhybrid using an inverse Fourier transform.
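A minimal NumPy sketch of this construction for 2D grayscale images is shown below; the function name make_hybrid is ours, and details such as color-channel handling in our released code may differ.

```python
import numpy as np

def make_hybrid(img_low, img_high, f_mix):
    """Combine frequencies below f_mix from img_low with those above from img_high."""
    F_low, F_high = np.fft.fft2(img_low), np.fft.fft2(img_high)
    fy = np.fft.fftfreq(img_low.shape[0])[:, None]
    fx = np.fft.fftfreq(img_low.shape[1])[None, :]
    M = np.sqrt(fx ** 2 + fy ** 2) < f_mix           # binary mask of Eq (1)
    F_hybrid = M * F_low + (1 - M) * F_high          # combination in the frequency domain
    return np.real(np.fft.ifft2(F_hybrid))
```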
Blur model
We designed a blur model to explicitly implement a frequency bias. Our blur model is simply a ResNet model prepended with a blurring layer. The blurring layer is a linear convolutional layer whose kernel weight is fixed as
$$w(x, y) = \frac{1}{Z} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right) \qquad (2)$$
in which Z is the normalization factor. The blurring layer can pass a gradient back to the input image, so gradient-based adversarial attacks can be performed. Clean performance of the ‘blur’ model decreases as σ increases because more information is discarded. We choose the value of σ such that the trained model has a classification accuracy on the CIFAR10 testing set similar to other robust methods; σ = 1.5 pixels is selected.
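Below is a sketch of how such a fixed blurring layer can be prepended to a ResNet in PyTorch. The kernel size, the use of a depthwise convolution, and the torchvision ResNet18 (which differs from the CIFAR-scale ResNet used in our experiments) are illustrative choices, and the released code may differ.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def gaussian_blur_layer(sigma, kernel_size=7, channels=3):
    # Fixed (non-trainable) depthwise convolution with the Gaussian kernel of Eq (2).
    ax = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
    kernel_1d = torch.exp(-ax ** 2 / (2 * sigma ** 2))
    kernel_2d = torch.outer(kernel_1d, kernel_1d)
    kernel_2d /= kernel_2d.sum()                        # normalization factor Z
    conv = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2,
                     groups=channels, bias=False)
    conv.weight.data = kernel_2d.expand(channels, 1, -1, -1).clone()
    conv.weight.requires_grad_(False)                   # fixed weights still pass gradients to the input
    return conv

# 'Blur' model: fixed blurring layer followed by a trainable ResNet18 classifier.
blur_model = nn.Sequential(gaussian_blur_layer(sigma=1.5), resnet18(num_classes=10))
```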
PCA model
Similar to the blurring model, the PCA model is also a ResNet model with a non-trainable preprocessing layer. Treating the input image as a vector x whose dimension N is the number of pixels, we first performed principal component analysis (PCA) on the training set and obtained the eigenvectors vi (1 ≤ i ≤ N) sorted by their eigenvalues. We took the first K eigenvectors, denoted by a K × N matrix WK, and constructed the filtering matrix
$$W = W_K^{\top} W_K \qquad (3)$$
The PCA preprocessing layer is a linear layer whose input and output dimensions are both N, with its weights given by the matrix W. This layer projects the original image onto the subspace spanned by the first K eigenvectors, and effectively denoises the image. Similarly to the blur model, the PCA model allows gradient backward passes to the input image, and can be attacked with gradient-based adversarial attacks. The value of K is chosen so that the trained model has a similar clean performance compared with other models, and K = 512 was selected for the CIFAR10 dataset.
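A sketch of this preprocessing layer in PyTorch is given below. The mean-centering and the SVD route to the eigenvectors are our illustrative choices, and the layer output would be reshaped back to image dimensions before entering the ResNet.

```python
import torch
import torch.nn as nn

def pca_projection_layer(train_images, K):
    """Fixed linear layer projecting flattened images onto the span of the first K PCs.

    train_images: tensor of shape (num_images, N), with N the number of pixels.
    """
    X = train_images - train_images.mean(dim=0, keepdim=True)
    _, _, Vt = torch.linalg.svd(X, full_matrices=False)   # rows of Vt are eigenvectors v_i
    W_K = Vt[:K]                                           # K x N matrix of leading eigenvectors
    W = W_K.t() @ W_K                                      # filtering matrix of Eq (3), symmetric
    layer = nn.Linear(X.shape[1], X.shape[1], bias=False)
    layer.weight.data = W                                  # W symmetric, so x @ W.T equals x @ W
    layer.weight.requires_grad_(False)
    return layer
```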
Model training and evaluation
The details about training mouse regularized models can be found in Li et al. [7]. The details about training monkey regularized models can be found in Safarani et al. [8]. Models for standard CIFAR10 and ImageNet tasks are fetched through RobustBench [24] except for the preprocessing ones.
To evaluate the adversarial robustness of models, we used Foolbox [34, 35] to perform targeted attacks, using both gradient-based boundary attacks and projected gradient descent (PGD), on a fixed subset of testing images. For each image, the attack target class is predetermined randomly. We gather all candidates proposed by the different attack algorithms and random seeds for each image, and pick the one with the minimum perturbation size as the final adversarial example.
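For reference, a minimal sketch of this evaluation is shown below, assuming Foolbox 3's L2BrendelBethgeAttack and TargetedMisclassification interfaces with default hyperparameters; in practice we also pool candidates from PGD runs and multiple random seeds, which is omitted here.

```python
import torch
import foolbox as fb

def minimum_perturbations(model, images, target_labels):
    """Smallest targeted L2 perturbation found per image (inf where the attack fails)."""
    fmodel = fb.PyTorchModel(model.eval(), bounds=(0, 1))
    criterion = fb.criteria.TargetedMisclassification(target_labels)
    attack = fb.attacks.L2BrendelBethgeAttack()         # gradient-based boundary attack [20]
    _, advs, success = attack(fmodel, images, criterion, epsilons=None)
    sizes = (advs - images).flatten(start_dim=1).norm(dim=1)
    return torch.where(success, sizes, torch.full_like(sizes, float("inf")))
```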
Code for training the ‘blur’ and ‘PCA’ models and for the robustness evaluation of all models can be found at https://github.com/lizhe07/blur-net.
Disclaimer
The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DoI/IBC, or the U.S. Government.
Supporting information
Data Availability
All data and computational codes are released in the public repository https://github.com/lizhe07/blur-net.
Funding Statement
This work is supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DoI/IBC) (D16PC00003 to AST and XP), National Eye Institute of the National Institutes of Health (R01EY026927 to AST), NEI/NIH Core Grant for Vision Research (EY-002520-37) and NeuroNex (1707400 to XP and AST). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1.Girshick R. Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision; 2015. p. 1440–1448.
- 2.He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 770–778.
- 3.He K, Gkioxari G, Dollár P, Girshick R. Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision; 2017. p. 2961–2969.
- 4.Hendrycks D, Mu N, Cubuk ED, Zoph B, Gilmer J, Lakshminarayanan B. AugMix: A Simple Method to Improve Robustness and Uncertainty under Data Shift. In: International Conference on Learning Representations; 2020. Available from: https://openreview.net/forum?id=S1gmrxHFvB.
- 5.Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, et al. Intriguing properties of neural networks. arXiv preprint arXiv:13126199. 2013;.
- 6.Geirhos R, Janssen DH, Schütt HH, Rauber J, Bethge M, Wichmann FA. Comparing deep neural networks against humans: object recognition when the signal gets weaker. arXiv e-prints. 2017;.
- 7. Li Z, Brendel W, Walker E, Cobos E, Muhammad T, Reimer J, et al. Learning from brains how to regularize machines. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R, editors. Advances in Neural Information Processing Systems 32. Curran Associates, Inc.; 2019. p. 9525–9535.
- 8. Safarani S, Nix A, Willeke K, Cadena SA, Restivo K, Denfield G, et al. Towards robust vision by multi-task learning on monkey visual cortex. In: Advances in Neural Information Processing Systems 34. Curran Associates, Inc.; 2021.
- 9. Dapello J, Marques T, Schrimpf M, Geiger F, Cox D, DiCarlo JJ. Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, editors. Advances in Neural Information Processing Systems 33. vol. 33. Curran Associates, Inc.; 2020. p. 13073–13087.
- 10. Rusak E, Schott L, Zimmermann RS, Bitterwolf J, Bringmann O, Bethge M, et al. A Simple Way to Make Neural Networks Robust Against Diverse Image Corruptions. In: Vedaldi A, Bischof H, Brox T, Frahm JM, editors. Computer Vision – ECCV 2020. Cham: Springer International Publishing; 2020. p. 53–69.
- 11.Geirhos R, Rubisch P, Michaelis C, Bethge M, Wichmann FA, Brendel W. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In: International Conference on Learning Representations; 2019.
- 12.Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. arXiv e-prints. 2017;.
- 13.Zhang H, Yu Y, Jiao J, Xing E, El Ghaoui L, Jordan M. Theoretically principled trade-off between robustness and accuracy. In: International Conference on Machine Learning. PMLR; 2019. p. 7472–7482.
- 14.Zhang R. Making convolutional networks shift-invariant again. In: International conference on machine learning. PMLR; 2019. p. 7324–7334.
- 15. Yin D, Gontijo Lopes R, Shlens J, Cubuk ED, Gilmer J. A Fourier Perspective on Model Robustness in Computer Vision. In: Advances in Neural Information Processing Systems 32. vol. 32. Curran Associates, Inc.; 2019. p. 13276–13286.
- 16. Sinz FH, Pitkow X, Reimer J, Bethge M, Tolias AS. Engineering a less artificial intelligence. Neuron. 2019;103(6):967–979. doi: 10.1016/j.neuron.2019.08.034
- 17. Federer C, Xu H, Fyshe A, Zylberberg J. Improved object recognition using neural networks trained to mimic the brain’s statistical properties. Neural Networks. 2020;131:103–114. doi: 10.1016/j.neunet.2020.07.013
- 18. Stringer C, Pachitariu M, Steinmetz N, Carandini M, Harris KD. High-dimensional geometry of population responses in visual cortex. Nature. 2019;571(7765):361–365. doi: 10.1038/s41586-019-1346-5
- 19. Kriegeskorte N, Mur M, Bandettini P. Representational similarity analysis—connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. 2008;2:4. doi: 10.3389/neuro.06.004.2008
- 20. Brendel W, Rauber J, Kümmerer M, Ustyuzhaninov I, Bethge M. Accurate, reliable and fast robustness evaluation. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R, editors. Advances in Neural Information Processing Systems 32. Curran Associates, Inc.; 2019. p. 12861–12871.
- 21. Le Y, Yang X. Tiny ImageNet visual recognition challenge. CS 231N. 2015;7(7):3.
- 22.Hendrycks D, Dietterich T. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations. In: International Conference on Learning Representations; 2019. Available from: https://openreview.net/forum?id=HJz6tiCqYm.
- 23. Oliva A, Torralba A, Schyns PG. Hybrid Images. ACM Trans Graph. 2006;25(3):527–532. doi: 10.1145/1141911.1141919
- 24.Croce F, Andriushchenko M, Sehwag V, Flammarion N, Chiang M, Mittal P, et al. RobustBench: a standardized adversarial robustness benchmark. arXiv e-prints. 2020;.
- 25.Bhagoji AN, Cullina D, Sitawarin C, Mittal P. Enhancing robustness of machine learning systems via data transformations. In: 2018 52nd Annual Conference on Information Sciences and Systems (CISS). IEEE; 2018. p. 1–5.
- 26.Laugros A, Caplier A, Ospici M. Are adversarial robustness and common perturbation robustness independant attributes? In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops; 2019. p. 0–0.
- 27. Ringach DL, Hawken MJ, Shapley R. Receptive field structure of neurons in monkey primary visual cortex revealed by stimulation with natural image sequences. Journal of Vision. 2002;2(1):2–2. doi: 10.1167/2.1.2
- 28. Niell CM, Stryker MP. Highly Selective Receptive Fields in Mouse Visual Cortex. Journal of Neuroscience. 2008;28(30):7520–7536. doi: 10.1523/JNEUROSCI.0623-08.2008
- 29. Van den Bergh G, Zhang B, Arckens L, Chino YM. Receptive-field properties of V1 and V2 neurons in mice and macaque monkeys. The Journal of comparative neurology. 2010;518:2051–70. doi: 10.1002/cne.22321
- 30.Caro JO, Ju Y, Pyle R, Dey S, Brendel W, Anselmi F, et al. Local convolutions cause an implicit bias towards high frequency adversarial examples. arXiv preprint arXiv:200611440. 2020;.
- 31.Vasconcelos C, Larochelle H, Dumoulin V, Roux NL, Goroshin R. An effective anti-aliasing approach for residual networks. arXiv e-prints. 2020;.
- 32. Athalye A, Carlini N, Wagner D. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. vol. 80 of Proceedings of Machine Learning Research. Stockholmsmässan, Stockholm Sweden: PMLR; 2018. p. 274–283. Available from: http://proceedings.mlr.press/v80/athalye18a.html.
- 33. Kong NCL, Margalit E, Gardner JL, Norcia AM. Increasing neural network robustness improves match to macaque V1 eigenspectrum, spatial frequency preference and predictivity. PLOS Computational Biology. 2022;18(1):1–25. doi: 10.1371/journal.pcbi.1009739
- 34. Rauber J, Zimmermann R, Bethge M, Brendel W. Foolbox Native: Fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX. Journal of Open Source Software. 2020;5(53):2607. doi: 10.21105/joss.02607
- 35.Rauber J, Brendel W, Bethge M. Foolbox: A Python toolbox to benchmark the robustness of machine learning models. In: Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning; 2017. Available from: http://arxiv.org/abs/1707.04131.