Abstract
Methods which utilize the outputs or feature representations of predictive models have emerged as promising approaches for out-of-distribution (ood) detection of image inputs. However, these methods struggle to detect ood inputs that share nuisance values (e.g. background) with in-distribution inputs. The detection of shared-nuisance out-of-distribution (sn-ood) inputs is particularly relevant in real-world applications, as anomalies and in-distribution inputs tend to be captured in the same settings during deployment. In this work, we provide a possible explanation for sn-ood detection failures and propose nuisance-aware ood detection to address them. Nuisance-aware ood detection substitutes a classifier trained via Empirical Risk Minimization (erm) and cross-entropy loss with one that 1. is trained under a distribution where the nuisance-label relationship is broken and 2. yields representations that are independent of the nuisance under this distribution, both marginally and conditioned on the label. We can train a classifier to achieve these objectives using Nuisance-Randomized Distillation (NURD), an algorithm developed for ood generalization under spurious correlations. Output- and feature-based nuisance-aware ood detection perform substantially better than their original counterparts, succeeding even when detection based on domain generalization algorithms fails to improve performance.
1. Introduction
Out-of-distribution (ood) detection is the task of identifying inputs that fall outside the training distribution. A natural approach is to estimate the training distribution via a generative model and flag low-density inputs as ood (Bishop 1994), but such an approach has been shown to perform worse than random chance on several image tasks (Nalisnick et al. 2019), likely due to model estimation error (Zhang, Goldstein, and Ranganath 2021). Instead, many detection methods utilize either the outputs or feature representations of a learned classifier, yielding results much better than those of deep generative models on many tasks (Lee et al. 2018; Salehi et al. 2021).
However, classifier-based ood detection has been shown to struggle when ood and in-distribution (id) inputs share the same values of a nuisance variable that is of no inherent interest to the semantic task, e.g. the background of an image (Ming, Yin, and Li 2022). We call such ood inputs shared-nuisance ood (sn-ood) examples. For example, in the Waterbirds dataset (Sagawa et al. 2020), the image background (water or land) is a nuisance in the task of classifying bird type (waterbird vs. landbird), and an image of a boat on water is an sn-ood input, given the familiar background nuisance value but novel object label. Detection of sn-ood images is worse than detection of ood images with novel nuisance values. Moreover, the stronger the correlation between the nuisance and label in the training distribution, the worse the detection of sn-ood inputs. Even when classifiers are trained via domain generalization algorithms intended for generalizing to new test domains, detection does not improve (Ming, Yin, and Li 2022).
This failure mode is far from a rare edge case, given the relevance of sn-ood inputs in real-world applications. While an instance can be ood with respect to labels (e.g. new object) or nuisances (e.g. new background), in most cases, the goal is semantic ood detection, or detecting out-of-scope inputs (Yang et al. 2021). For instance, manufacturing plants are interested in product defects on the factory floor, not working products in new settings.
In this work, we introduce nuisance-aware ood detection to address sn-ood detection failures. Our contributions:
We present explanations for output- and feature-based sn-ood detection failures based on the role of nuisances in the learned predictor and its representations (Section 3).
We illustrate why models that are robust to spurious correlations can yield better output-based sn-ood detection and identify a predictor with such robustness guarantees to use for ood detection (Section 4).
We explain why removing nuisance information from representations can improve feature-based sn-ood detection and propose a joint independence constraint to achieve this end (Section 4).
We describe how domain generalization algorithms can fail to improve sn-ood detection, providing insight into the empirical failures seen previously (Section 5).
We show empirically that nuisance-aware ood detection improves detection on sn-ood inputs while maintaining performance on non-sn-ood ones (Section 7).
2. Background
Most methods which employ predictive models for out-of-distribution detection can be categorized as either output-based or feature-based. Output-based methods utilize some function of the logits as an anomaly score, while feature-based methods utilize internal representations of the learned model.
Output-based Out-of-distribution Detection
Let f : ℝ^D → ℝ^K be the learned function mapping a D-dimensional input x to K logits for the K id classes. Output-based methods utilize some function of the logits as an anomaly score. Letting σ denote the softmax function, relevant methods include maximum softmax probability (MSP), or max_k σ(f(x))_k (Hendrycks and Gimpel 2017); max logit, or max_k f(x)_k (Hendrycks et al. 2022); the energy score, or −T log Σ_k exp(f(x)_k / T) (Liu et al. 2020); and out-of-distribution detector for neural networks (odin), or max_k σ(f(x̃) / T)_k, where T is a learned temperature parameter and x̃ is an input perturbed in the direction of the gradient of the maximum softmax probability (Liang, Li, and Srikant 2018).
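As a concrete illustration, all of these scores are simple functions of a logit matrix. The sketch below (NumPy; the logit values are arbitrary placeholders, not outputs of the paper's models, and the ODIN perturbation step is omitted) computes the three unperturbed scores:

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the class axis
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def msp_score(logits):
    # maximum softmax probability; larger values suggest in-distribution
    return softmax(logits).max(axis=1)

def max_logit_score(logits):
    # largest raw logit across classes
    return logits.max(axis=1)

def energy_score(logits, T=1.0):
    # energy E(x) = -T log sum_k exp(f(x)_k / T);
    # id inputs tend to have lower energy (Liu et al. 2020)
    return -T * np.log(np.exp(logits / T).sum(axis=1))
```

A confident logit vector like [2, 0] yields an MSP near 0.88, while a flat vector yields 1/K, so thresholding these scores separates peaked from unpeaked outputs.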
Feature-based Out-of-distribution Detection
Feature-based methods utilize internal representations of the learned classifier. Following prior work (Kamoi and Kobayashi 2020; Ren et al. 2021; Fort, Ren, and Lakshminarayanan 2021), we consider the penultimate feature activations h(x). The most widely used feature-based method is the Mahalanobis distance (MD) (Lee et al. 2018), which models the feature representations of in-distribution data as class-conditional Gaussians with means μ_k and shared covariance Σ. At test time, the anomaly score is the minimum Mahalanobis distance from a new input’s feature representation to each of these class distributions, min_k (h(x) − μ_k)^⊤ Σ^{−1} (h(x) − μ_k). Assuming minimal overlap in probability mass across each class-conditional Gaussian (e.g. tight and separated clusters), this method can approximate detection based on density estimation on the representations, since the negative log-density of the fitted mixture is then dominated by the nearest class’s Mahalanobis distance. Other functions can also be computed on top of the representations (Sastry and Oore 2020), and we can generalize feature-based methods to anomaly scores of the form s(x) = g(h(x)) for some function g of the penultimate representations.
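A minimal sketch of the MD procedure (NumPy; `feats` and `labels` stand in for penultimate activations and id class labels, not the paper's actual pipeline):

```python
import numpy as np

def fit_gaussians(feats, labels):
    """Fit per-class means and a shared (pooled) covariance to id features."""
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    centered = np.concatenate(
        [feats[labels == c] - means[i] for i, c in enumerate(classes)])
    cov = centered.T @ centered / len(feats)
    # pseudo-inverse for numerical safety with near-singular covariances
    return means, np.linalg.pinv(cov)

def md_score(h, means, precision):
    """Anomaly score: minimum squared Mahalanobis distance to any class mean."""
    diffs = means - h
    return min(float(d @ precision @ d) for d in diffs)
```

Inputs near a class mean receive a score near zero, while inputs far from every class cluster receive large scores and are flagged as ood.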
Failures in Shared-Nuisance ood Detection
Ming, Yin, and Li (2022) find that output- and feature-based detection show worse performance on shared-nuisance ood inputs than non-sn-ood inputs, and that the performance of output-based methods degrades as the strength of the correlation between nuisance and label in the training data increases. Figure 1 corroborates and extends their findings, illustrating that across several datasets, performance is generally worse on shared-nuisance inputs. Output-based detection of such inputs degrades under stronger spurious correlations and is sometimes comparable to or worse than random chance (AUROC < 50). Feature-based detection tends to be more stable across varying spurious correlations but can perform worse than output-based detection even under strong spurious correlations (e.g. Waterbirds). Our absolute numbers differ from those of Ming, Yin, and Li (2022) for the following reasons: First, our CelebA results are based on the blond/non-blond in-distribution task in (Sagawa et al. 2020) rather than gray/non-gray. Next, the Waterbirds results are sensitive to the data generation seed (see Figure 10 in Appendix for details). Finally, our feature-based results use the penultimate activations while Ming, Yin, and Li (2022) aggregate features over all layers, which requires additional validation ood data. Even so, our results show the same trends.
Figure 1:

Output- and feature-based detection methods perform worse on shared-nuisance out-of-distribution inputs (see Section 7 for task details). The performance of output-based shared-nuisance ood detection (MSP, top) degrades under increasing spurious correlation strength (x-axis). Figure 7 shows the same trend for other output-based methods. The performance of feature-based ood detection (MD, bottom) is more stable but poor. See Appendix for plots of other common ood detection methods, which follow a similar trend.
3. Understanding Shared-Nuisance Out-of-Distribution Detection Failures
To understand sn-ood failures, first note that the success of a classifier for detection depends on the classifier assigning different feature representations or outputs to ood and id inputs. Poor detection of sn-ood inputs relative to non-sn-ood inputs implies that the classifier fails to map sn-ood inputs to outputs or representations sufficiently distinct from those expected of id inputs. Given that both sn-ood and non-sn-ood inputs have unseen semantics, we hypothesize that the difference in performance can be largely explained by how the learned classifier utilizes nuisances. We walk through an output-based and feature-based example below.
To establish notation, let z and y be the nuisance and label which together generate input x. Let Z_tr, Y_tr be the values of z and y which appear during training. In semantic ood detection, an input is out-of-distribution if its semantic label was not seen in the training data, i.e. y ∉ Y_tr. Then, the difference between shared-nuisance out-of-distribution inputs and non-shared-nuisance out-of-distribution inputs lies in z: sn-ood inputs have nuisances z ∈ Z_tr, while non-sn-ood inputs do not. For instance, for the Waterbirds dataset, a non-bird image over a water background would be a shared-nuisance ood input, while a non-bird image taken indoors would be a non-sn-ood input.
Explaining Poor MSP Performance.
Perfect MSP performance requires that all ood inputs have less confident output probabilities than id inputs. Worse detection on sn-ood over non-sn-ood inputs suggests that the former get more peaked confidences in their outputs, making them more similar to id inputs. Since the difference between sn-ood and non-sn-ood inputs is whether z ∈ Z_tr, this worse performance can be attributed to the model’s behavior on z ∈ Z_tr vs. z ∉ Z_tr. Poor sn-ood results suggest that predictive models assign peaked output probabilities to inputs where z ∈ Z_tr even if y ∉ Y_tr. Such a phenomenon is possible if the learned function is primarily a function of the nuisance, e.g. f(x) ≈ f̃(z) (Figure 2, top left). Then, sn-ood and id inputs would yield similar outputs, whereas non-sn-ood inputs could yield different outputs and still be detected.
Figure 2:

Top: Output-based ood detection will perform poorly on shared-nuisance out-of-distribution inputs when the prediction output relies on nuisance (left) rather than semantics (right), even if non-sn-ood inputs can be detected well in either case. Bottom: Feature-based ood detection is easier when representations focus on semantics (right) rather than both semantics and nuisance (left). Nuisance-aware ood detection encourages the top and bottom right scenarios to improve sn-ood detection.
Explaining Poor Mahalanobis Distance Performance.
Mahalanobis distance performs well when ood inputs have representations that are sufficiently different from id ones, enough so to be assigned low density under a model estimating id representations via class-conditional Gaussians. Detection is worse for sn-ood inputs than non-sn-ood ones, suggesting that ood representations are assigned higher density when inputs have nuisance values z ∈ Z_tr. Such a scenario can only occur if nuisance information is present in the representations; otherwise, detection performance of sn-ood and non-sn-ood inputs should be similar, assuming their semantics are similarly different from id semantics.
It is worth noting that, if all the semantic information needed to distinguish id and ood is present in the representations, then a detection method based on perfect density estimation of the id representation distribution would successfully detect ood inputs, regardless of whether the representations additionally contain nuisance information or not. However, in the absence of perfect estimation, representations with more dimensions related to nuisance can be more sensitive to estimation error since they have fewer dimensions dedicated to semantics where id and sn-ood are non-overlapping; for instance, when representations only differ over one dimension, accurate detection requires very accurate modeling of the single relevant dimension, unknown a priori (Figure 2, bottom left). In contrast, representations where id and sn-ood inputs differ over more dimensions are more robust to misestimation over any one dimension (Figure 2, bottom right).
4. Nuisance-Aware ood Detection
We summarize the above observations and explanations below:
-
Observation: Output-based ood detection is worse on sn-ood inputs and degrades with increasing correlation between the nuisance and label in the training distribution.
Explanation: The learned predictor adjusts its output based on the nuisance, particularly when there is a strong spurious correlation between nuisance and label in the training data. The result is that sn-ood outputs look like id outputs.
-
Observation: Feature-based ood detection is worse on sn-ood inputs even though its performance is fairly stable across different correlations.
Explanation: Regardless of the nuisance-label correlation, the learned representations contain information about the nuisance in addition to semantics, making sn-ood representations look more similar to id ones.
To address both issues, we propose nuisance-aware out-of-distribution detection, utilizing knowledge of nuisances to improve detection. Concretely, to improve output-based detection, we propose substituting a classifier trained via empirical risk minimization with one that is robust to spurious correlations, defined by good classification performance on all distributions that differ from the training distribution in nuisance-label relationship only. Then, to improve feature-based detection, we train a classifier such that its penultimate representation cannot predict the nuisance by itself or conditioned on the label. We motivate and describe our approach below.
Addressing Spurious Correlations via Reweighting
To improve output-based sn-ood detection, we recall that poor output-based sn-ood detection can occur when the learned function f can be approximated by a function of only the nuisance, i.e. f(x) ≈ f̃(z). Is there a way to avoid learning such functions given only in-distribution data?
First, if a predictor behaves like a function of the nuisance in order to predict the label well on a given data distribution, then it can perform arbitrarily poorly on a new distribution where the relationship between the nuisance and label has changed. Given a data distribution p, let F be a family of distributions that differ from p only in the nuisance-label relationship: p_D(x | y, z) = p(x | y, z) for all p_D ∈ F. We call the nuisance-label relationship a spurious correlation because it changes across the relevant distributions. A predictor that performs well across all distributions in F, i.e. is robust to the spurious correlation between nuisance and label, cannot rely on a function of only the nuisance to make its prediction and thus is more likely to succeed at sn-ood detection. In other words, models that are robust to spurious correlations are also likely to be better for output-based sn-ood detection.
Theoretical Motivation.
We propose to improve output-based detection by training models that are robust to spurious correlations. Let p_⫫ be a distribution in F where the label is independent of the nuisance: y ⫫ z under p_⫫. Puli et al. (2022a) prove that for all representation functions r such that y ⫫ z | r(x) under p_⫫, the predictor p_⫫(y | r(x)) is guaranteed to perform as well as marginal label prediction on any distribution in F, a guarantee that does not always hold for representations outside this set. Moreover, when the identity function is in this set, i.e. y ⫫ z | x under p_⫫, then the predictor p_⫫(y | x) yields simultaneously optimal performance across all distributions in F relative to any representation and is minimax optimal for a sufficiently diverse F among all predictors p_D(y | r(x)). In other words, p_⫫(y | x) enjoys substantial performance guarantees when y ⫫ z | x under p_⫫.
When the input determines the nuisance (e.g., looking at an image tells you its background), then y ⫫ z | x holds trivially under any distribution, including p_⫫. Consequently, p_⫫(y | x) has the robustness guarantees summarized above.
Method: Reweighting.
We can estimate p_⫫ by reweighting: given p, we can construct p_⫫ as follows (Puli et al. 2022a):
p_⫫(x, y, z) = [p(y) / p(y | z)] · p(x, y, z)    (1)
We reweight both the training and validation loss using Equation (1) and perform model selection based on the best reweighted validation loss.
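For discrete labels and nuisances, the reweighting in Equation (1) reduces to per-example weights p̂(y)/p̂(y | z) estimated from group counts. A sketch (NumPy; variable names illustrative):

```python
import numpy as np

def reweight(y, z):
    """Per-example weights w_i = p(y_i) / p(y_i | z_i), estimated by counting.

    Weighting the loss by w approximates training under a distribution
    where the label is independent of the nuisance."""
    y, z = np.asarray(y), np.asarray(z)
    p_y = {v: np.mean(y == v) for v in np.unique(y)}
    p_y_given_z = {(v, w): np.mean(y[z == w] == v)
                   for v in np.unique(y) for w in np.unique(z)}
    return np.array([p_y[yi] / p_y_given_z[(yi, zi)] for yi, zi in zip(y, z)])
```

The weights always sum to the dataset size; majority groups, where the nuisance predicts the label well, are downweighted, while minority groups are upweighted.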
Reweighting when Group Labels are Unavailable.
When nuisance values are present as metadata, reweighting based on Equation (1) is straightforward. When we do not have access to exact group labels, following Puli et al. (2022b), we can use functions of the input as nuisance values, e.g. via masking. For instance, for images with centered objects, the outer border of the image can be used as a proxy for the background. Then, the reweighting mechanism is the same, where a well-calibrated classifier predicting the label from the masked input can approximate p(y | z).
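A minimal sketch of such a border-based proxy (assuming a 2-D grayscale array; the exact masking used in the experiments may differ):

```python
import numpy as np

def border_nuisance(image, border=4):
    """Proxy nuisance for centered-object images: zero out the center,
    keeping only a `border`-pixel frame, which mostly contains background."""
    z = image.copy()
    z[border:-border, border:-border] = 0
    return z
```

The masked image z can then be fed to a separate classifier to estimate p(y | z) for the reweighting.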
Addressing Nuisance Features via Independence Constraints
Can we improve feature-based methods on sn-ood inputs, even if their performance is stable regardless of spurious correlation strength? We hypothesize that removing nuisance information from the learned representations makes id and sn-ood inputs easier to distinguish. First, shared nuisance information is by definition not helpful for distinguishing id and sn-ood inputs, so removing it should not hurt detection performance. Furthermore, when representations contain this information, sn-ood inputs can more easily go undetected, e.g. by looking like id inputs over more of the principal components of variation in the representation. More generally, nuisance information can be an additional modeling burden for a downstream feature-based method by introducing additional entropy; for instance, given a discrete representation r(x) that is independent of a discrete nuisance z, a representation which additionally includes the nuisance (i.e. the pair (r(x), z)) has strictly higher entropy whenever z is non-constant: H(r(x), z) = H(r(x)) + H(z) > H(r(x)).
Theoretical Motivation.
To remove nuisance information, we propose enforcing r(x) ⫫ z and r(x) ⫫ z | y under p_⫫. The former ensures that the representations cannot predict the nuisance on their own, while the latter ensures that within each label class, the representations do not provide information about the nuisance. Without the latter condition, the representations can be a function of nuisance within the id classes such that marginal independence is enforced but sn-ood representations overlap with id ones (see Appendix A for an example). To avoid this situation and encourage disjoint representations, we enforce marginal and conditional independence, which together (since y ⫫ z under p_⫫) are equivalent to the joint independence z ⫫ (r(x), y).
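The gap between marginal and conditional independence can be seen with a small counting toy (our illustration, not the example in Appendix A): let y and z be independent fair coins and let the representation be r = y XOR z. Then r is marginally independent of z, yet conditioned on y, r determines z exactly:

```python
# All four (y, z) outcomes are equally likely; r = y XOR z.
outcomes = [(y, z, y ^ z) for y in (0, 1) for z in (0, 1)]

def p(pred, given=lambda y, z, r: True):
    # probability of `pred`, optionally conditioned on `given`
    sub = [o for o in outcomes if given(*o)]
    return sum(pred(*o) for o in sub) / len(sub)

# Marginally, r and z are independent: P(r=0, z=0) = P(r=0) P(z=0).
marg_factored = (p(lambda y, z, r: r == 0 and z == 0)
                 == p(lambda y, z, r: r == 0) * p(lambda y, z, r: z == 0))

# Conditioned on y = 0, r equals z, so they are perfectly dependent.
given_y0 = lambda y, z, r: y == 0
cond_factored = (p(lambda y, z, r: r == 0 and z == 0, given_y0)
                 == p(lambda y, z, r: r == 0, given_y0)
                   * p(lambda y, z, r: z == 0, given_y0))
```

Here `marg_factored` holds but `cond_factored` does not, so a penalty on marginal independence alone would accept this representation even though it encodes the nuisance within each class.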
Method: Joint Independence.
To enforce joint independence, we penalize the estimated mutual information between the nuisance z and the combined representations r(x) and label y. When z is high-dimensional, e.g. a masked image, we estimate the mutual information via the density-ratio estimation trick (Sugiyama, Suzuki, and Kanamori 2012), following Puli et al. (2022a). Concretely, we use a binary classifier distinguishing between samples from p_⫫(z, r(x), y) and p_⫫(z) p_⫫(r(x), y) to estimate the ratio p_⫫(z, r(x), y) / [p_⫫(z) p_⫫(r(x), y)]. When z is low-dimensional, we estimate the mutual information by training a model to predict z from (r(x), y) under the reweighted distribution p_⫫:
I_{p_⫫}(z; r(x), y) = E_{p_⫫}[ log p_⫫(z | r(x), y) − log p_⫫(z) ]    (2)

Î(z; r(x), y) = E_{p_⫫}[ log q_φ̂(z | r(x), y) − log p̂_⫫(z) ],  where φ̂ = argmax_φ E_{p_⫫}[ log q_φ(z | r(x), y) ]    (3)

min_θ E_{p_⫫}[ −log p_θ(y | x) ] + λ · Î(z; r_θ(x), y)    (4)
Other neural network-based mutual information estimators can also be used (Belghazi et al. 2018; Poole et al. 2019). Zero mutual information implies independence; otherwise, there is still dependence, and we add the mutual information as a penalty to the loss when training the main classifier.
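As a sketch of what the penalty measures, when z, r(x), and y are all discrete the mutual information I(z; (r(x), y)) can be estimated by plug-in counts, with no critic at all (NumPy; toy samples, not the paper's setup, where the estimate is taken under the reweighted p_⫫ and a fitted model replaces the counts for continuous variables):

```python
import numpy as np
from collections import Counter

def plug_in_mi(z, r, y):
    """Plug-in estimate of I(z; (r, y)) for discrete samples (in nats)."""
    n = len(z)
    pairs = list(zip(r, y))
    joint = Counter(zip(z, pairs))
    p_z = Counter(z)
    p_pair = Counter(pairs)
    mi = 0.0
    for (zv, pv), count in joint.items():
        p_joint = count / n
        mi += p_joint * np.log(p_joint / ((p_z[zv] / n) * (p_pair[pv] / n)))
    return mi  # zero iff z is independent of (r, y) in the empirical counts
```

The estimate is zero exactly when the empirical joint factorizes, and grows toward log 2 nats when a binary z is fully determined by (r, y).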
Why Naive Independence Doesn’t Work.
Why must we ensure independence of r(x) and z under p_⫫ instead of under the original training distribution p? In cases where the nuisance and label are strongly correlated, forcing independence of the penultimate representations and the nuisance under p will force the representation to ignore information that is predictive of the label, simply because it is also predictive of the nuisance. At one extreme, if the nuisance and label are nearly perfectly correlated under p, then enforcing r(x) ⫫ z under p will force r(x) to contain almost no information which could predict y. This situation is avoided when label and nuisance are independent.
Summarizing Nuisance-Aware ood Detection
We propose the following for nuisance-aware ood detection:
When performing output-based detection, train a classifier with reweighting: min_θ E_{p_⫫}[ −log p_θ(y | x) ].
When performing feature-based detection, train a classifier with reweighting and a joint independence penalty: min_θ E_{p_⫫}[ −log p_θ(y | x) ] + λ · Î(z; r_θ(x), y).
Reweighting is performed based on Equation (1) by estimating p(y) and p(y | z) from the data. When z is a discrete label, both terms can be estimated by counting the sizes of groups defined by y and z. When z is continuous or high-dimensional, as in a masked image, an additional reweighting model can be estimated prior to training the main model.
To implement the joint independence penalty, Equation (2) is estimated after every gradient step of the classifier. As described in the above section, one can fit a critic model when z is low-dimensional or employ the density-ratio trick when z is high-dimensional. We re-estimate the critic model after each step of the main model training.
5. Why Domain Generalization Methods Fail
Domain generalization algorithms utilize multiple training environments in order to generalize to new test environments. Using knowledge of nuisance to create environments, Ming, Yin, and Li (2022) do not see improved ood detection when training a classifier using domain generalization algorithms implemented in Gulrajani and Lopez-Paz (2021). Here, we explain how, despite taking into account nuisance information, these algorithms can fail to improve output-based sn-ood detection because the resulting models can fail to be robust to spurious correlations.
First, algorithms that rely on enforcing constraints across multiple environments (e.g. Invariant Risk Minimization (irm) (Arjovsky et al. 2019), Risk Extrapolation (rex) (Krueger et al. 2021)) can fail to achieve robustness to spurious correlations if the environments do not have common support, which is typically the case if environments are denoted by nuisance values. In such scenarios, predicting well in one environment does not restrict the model in another environment. The setup in Ming, Yin, and Li (2022) is an example of this scenario, which has also been discussed in Guo et al. (2021) for irm. Instead, to address spurious correlations, these algorithms rely on environments defined by different nuisance-label relationships. However, even then, an irm solution can still fail to generalize to relevant test environments if the training environments available are not sufficiently diverse, numerous, and overlapping (Rosenfeld, Ravikumar, and Risteski 2021). In contrast, nuisance-aware ood detection enables robustness across the family of distributions given only one member of the family. Multi-environment objectives can also struggle with difficulty in optimization (Zhang, Lopez-Paz, and Bottou 2022), an additional challenge beyond that of training data availability.
Next, algorithms that enforce invariant representations (e.g. Domain-Adversarial Neural Networks (dann) (Ajakan et al. 2014), Conditional Domain Adversarial Neural Networks (cdann) (Li et al. 2018)) can hurt performance if the constraints imposed are too stringent. For instance, when environments are denoted by nuisance, the constraint posed by dann is equivalent to the naive independence constraint discussed earlier, which can actively remove semantic information helpful for prediction just because it is correlated with z. cdann enforces the distribution of representations to be invariant across environments conditional on the class label. In Appendix B, we show that a predictor based on the cdann constraint is worse than the nuisance-aware predictor for certain distribution families F. For such F, a cdann predictor is less robust to spurious correlations as measured by performance across the distribution family. Such a predictor will also be worse for sn-ood detection, which relies on strong performance over all regions of the in-distribution support in order for id and sn-ood outputs to be as distinct as possible.

Group Distributionally Robust Optimization (gdro) (Sagawa et al. 2020) aims to minimize the worst-case loss over some uncertainty set of distributions, defined by mixtures of pre-defined groups. When the label and nuisance combined define the groups, their mixtures can be seen as distributions with varying spurious correlation strengths. However, other ways of defining groups do not yield the same interpretation; for instance, when the groups are defined by nuisance only, the uncertainty set of distributions is one of varying mixtures of nuisance values while the nuisance-label relationship is held fixed. In other words, such a setup does not address robustness across varying spurious correlations. Even under an appropriate setup for robustness to spurious correlations, gdro may still not reach optimal performance.
Concretely, as the loss only depends on the outputs of the worst-performing group, gdro does not favor solutions that continue to optimize performance on other groups if the loss for the worst-performing group is at a local optimum. As intuition, such a situation could occur when the best achievable losses differ drastically across groups. For an output-based detection method that relies on in-distribution outputs being as large or peaked as possible in order to separate them from ood ones, gdro’s consideration of only worst-group performance can hinder detection performance, especially relative to the proposed method, which considers all inputs in the reweighting.
6. Related Work
Our work is most closely related to Ming, Yin, and Li (2022) and Puli et al. (2022a). Ming, Yin, and Li (2022) first notice that ood detection is harder on shared-nuisance ood inputs; we expand on their analysis and provide a solution that makes progress on the issue where other approaches (e.g. domain generalization algorithms) do not. Our proposed solution for improving feature-based detection with high-dimensional nuisances, i.e. reweighting and enforcing joint independence, matches the algorithm proposed in Puli et al. (2022a) for ood generalization, though the motivations for each component of the solution are different. Our joint independence algorithm is different for low-dimensional nuisances.
Fort, Ren, and Lakshminarayanan (2021) also demonstrate that classifier choice matters for detection, but they focus on large pretrained transformers, while we consider models that utilize domain knowledge of nuisances. The idea of removing non-discriminative features in the representations of id and ood inputs has also been explored in Kamoi and Kobayashi (2020); Ren et al. (2021), who modify the Mahalanobis distance to consider only a subset of the eigenvectors of the computed covariance matrix. This partial Mahalanobis score focuses on non-sn-ood benchmarks and chooses principal components based on explained variation, whereas we remove shared-nuisance information to address sn-ood failures. More broadly, nuisance-aware ood detection changes the classifier used for detection and can be used alongside other detection methods which employ classifiers.
7. Experiments
We consider the following three tasks:
CMNIST. Label y: binary digit group; nuisance z: digit color.
Waterbirds. Label y: bird type (waterbird vs. landbird); nuisance z: background (water vs. land).
CelebA. Label y: blond vs. non-blond hair; nuisance z: gender.
The open-source datasets from which these tasks are derived all have nuisance metadata available for all examples (e.g. environment label for Waterbirds, attributes for CelebA (Liu et al. 2015)) as well as a way to identify or construct sn-ood examples, enabling us to control the strength of the spurious correlation in the training set while also testing ood detection. We ensure that the dataset sizes are the same across correlation strengths. For non-sn-ood datasets, we use blue MNIST digits (Deng 2012) for CMNIST and SVHN (Netzer et al. 2011) for Waterbirds and CelebA. For model architecture, we utilize a 4-layer convolutional network for CMNIST and a ResNet-18 pretrained on ImageNet for Waterbirds and CelebA. Unless otherwise noted, we average all results over 5 random seeds, and error bars denote one standard error. We use AUROC as the metric for all detection results, following previous literature (Hendrycks and Gimpel 2017). All code, including the hyperparameter configurations for experiments, is available at https://github.com/rajesh-lab/nuisance-aware-ood-detection. See Appendix for more details.
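AUROC has a convenient rank interpretation: it is the probability (scaled to 0–100) that a randomly chosen ood input receives a higher anomaly score than a randomly chosen id input, with 50 corresponding to chance. A self-contained sketch (illustrative scores only):

```python
def auroc(id_scores, ood_scores):
    """Probability a random ood score exceeds a random id score,
    with ties counting half; 50 = chance, 100 = perfect."""
    pairs = [(i, o) for i in id_scores for o in ood_scores]
    wins = sum(1.0 if o > i else 0.5 if o == i else 0.0 for i, o in pairs)
    return 100.0 * wins / len(pairs)
```

This pairwise form is equivalent to integrating the ROC curve and avoids choosing a detection threshold.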
Main Results.
Reweighting substantially improves output-based sn-ood detection (Table 2 in Appendix), providing empirical evidence that increased robustness to spurious correlations (estimated by performance on a new distribution) correlates with improved output-based detection. Reweighting plus joint independence improves feature-based detection with statistically significant results, with the exception of CMNIST, likely due to the task construction where digit is only partially predictive of class (see Appendix D for details), and CelebA at one correlation strength, where joint independence results have high variance. Reversing the pairing of training strategies and anomaly methods does not yield the same consistent positive benefit, highlighting the importance of selecting the right nuisance-aware strategy for a given detection method (see Figure 8).
While Ming, Yin, and Li (2022) do not see success in combining nuisance information with domain generalization algorithms over their baseline erm solution, Table 1 shows that nuisance-aware ood detection succeeds over nuisance-unaware erm. On non-sn-ood inputs, reweighting yields consistent or better output-based detection performance, and reweighting plus joint independence generally yields comparable or better feature-based detection (Table 3 in Appendix).
Table 1:
On Waterbirds, nuisance-aware ood detection (RW) yields a statistically significant improvement over nuisance-unaware detection using erm (energy score, mean ± standard error over 4 seeds). In contrast, Ming, Yin, and Li (2022) do not see benefit from using nuisance information with domain generalization algorithms.
| Method | AUROC ↑ |
|---|---|
| ERM (Ming) | 80.98 ± 2.22 |
| IRM | 81.29 ± 2.62 |
| GDRO | 82.94 ± 2.29 |
| REx | 81.25 ± 2.49 |
| DANN | 81.11 ± 3.10 |
| CDANN | 82.13 ± 1.76 |
| ERM (Ours) | 82.14 ± 1.55 |
| RW | 86.86 ± 1.59 |
Exact vs. User-generated Nuisances.
Unless otherwise specified, experiments use exact nuisances. Even so, Figure 4 shows that ood detection and balanced id classification results are comparable for Waterbirds whether the nuisance is the exact metadata label or denoted by the outer border of the image. These results illustrate the applicability of nuisance-aware ood detection even when nuisance values are not provided as metadata. Other creative specifications of z are possible, e.g. an image with shuffled pixels if low-level pixel statistics are expected to be a nuisance (Puli et al. 2022b), or the representations from a randomly initialized neural network if such representations are expected to cluster based on nuisance (see Badgeley et al. (2019) for an example).
Figure 4:

On Waterbirds, using the outer patch as nuisance (patch) yields comparable results to utilizing the exact environment label (exact), which requires external information beyond the image.
Other results.
We also consider alternatives to reweighting and joint independence, namely undersampling and marginal independence, respectively. We find that undersampling achieves comparable output-based detection but worse feature-based detection and id classification accuracy, and that marginal independence yields worse performance than joint independence (Figure 5). We also find that the independence penalty, at the coefficient we use, yields representations that are less predictive of the nuisance but not completely independent of it (Table 4 in Appendix), and we suspect that removing nuisance information further can improve feature-based sn-ood detection results.
Figure 5: On Waterbirds, undersampling achieves comparable detection results to reweighting but worse balanced accuracy (left). Marginal independence performs worse than joint independence (right).
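The reweighting-versus-undersampling comparison can be made concrete for discrete labels and nuisances. The sketch below, with hypothetical helper names, computes weights proportional to p(y)p(z)/p(y,z), under which y and z are exactly independent, and contrasts them with subsampling every (y, z) group to the smallest group size, which discards data.

```python
import numpy as np

def nuisance_randomized_weights(y: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Per-example weights w(y, z) ∝ p(y) p(z) / p(y, z) for discrete
    labels y and nuisances z; training under these weights mimics a
    distribution where y and z are independent."""
    p_y = {v: np.mean(y == v) for v in np.unique(y)}
    p_z = {v: np.mean(z == v) for v in np.unique(z)}
    p_yz = {(vy, vz): np.mean((y == vy) & (z == vz))
            for vy in np.unique(y) for vz in np.unique(z)}
    w = np.array([p_y[yi] * p_z[zi] / p_yz[(yi, zi)] for yi, zi in zip(y, z)])
    return w / w.mean()  # normalize so weights average to 1

def undersample_indices(y, z, rng=np.random.default_rng(0)):
    """Alternative: subsample each (y, z) group down to the smallest
    group size, discarding data instead of reweighting it."""
    groups = {}
    for i, (yi, zi) in enumerate(zip(y, z)):
        groups.setdefault((yi, zi), []).append(i)
    m = min(len(ix) for ix in groups.values())
    return np.concatenate([rng.choice(ix, m, replace=False)
                           for ix in groups.values()])

# Example: y and z are strongly correlated in the training data.
y = np.array([0] * 8 + [1] * 2 + [0] * 2 + [1] * 8)
z = np.array([0] * 10 + [1] * 10)
w = nuisance_randomized_weights(y, z)
```

Under the weights `w`, the reweighted joint distribution of (y, z) factorizes exactly, while undersampling here keeps only 8 of the 20 examples.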
8. Discussion
Out-of-distribution detection based on predictive models can suffer from poor performance on shared-nuisance ood inputs when the classifier relies on a spurious correlation or its feature representations encode nuisance information. To address these failures, we present nuisance-aware ood detection: to improve output-based detection, we train a classifier via reweighting to estimate the nuisance-randomized distribution, which is theoretically guaranteed to be robust to spurious correlations; to improve feature-based detection, we utilize reweighting together with a joint independence constraint that encourages representations to be uninformative of the nuisance both marginally and conditioned on the semantic class label. Nuisance-aware ood detection yields sn-ood performance benefits for a wide array of existing detection methods, while maintaining performance on non-sn-ood inputs.
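Nuisance-aware detection changes how the classifier is trained, not the detection score itself. For instance, the energy score of Liu et al. (2020), used in our Waterbirds comparison, is computed from the logits of whichever classifier is plugged in; a minimal sketch:

```python
import numpy as np

def energy_score(logits: np.ndarray, T: float = 1.0) -> np.ndarray:
    """Energy score of Liu et al. (2020): E(x) = -T * logsumexp(f(x) / T).
    Higher energy indicates a more likely ood input; the classifier
    producing `logits` is the only model the detector needs."""
    a = logits / T
    m = a.max(axis=-1, keepdims=True)  # stabilize the logsumexp
    return -T * (m.squeeze(-1) + np.log(np.exp(a - m).sum(axis=-1)))

# A confident prediction yields lower energy than a flat one.
confident = np.array([[10.0, 0.0]])
uncertain = np.array([[0.0, 0.0]])
assert energy_score(confident)[0] < energy_score(uncertain)[0]
```

Swapping the erm classifier for the reweighted one leaves this scoring step, and any other output- or feature-based score, untouched.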
However, nuisance-aware ood detection is not without limitations. First, it requires a priori knowledge of nuisances and a way to specify them for a given task, though there are techniques that handle some missing values (Goldstein et al. 2022). In addition, implementing the joint independence penalty requires training a critic model for each gradient step of the main classifier, an expensive bi-level optimization. Fortunately, once the classifier is trained, prediction (and thus detection) time is no different from that of any other classifier, regardless of how it was trained. We also do not consider feature-based ood detection methods that utilize all layers of a trained network, as such methods typically require validation ood data.
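To illustrate the bi-level structure of the independence penalty, the sketch below replaces the neural critic with a few gradient steps of a linear least-squares critic; all names and hyperparameters are illustrative, and the returned explained variance stands in for the penalty used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def critic_penalty(feats, z, y, steps=5, lr=0.5):
    """One evaluation of an (illustrative) joint-independence penalty:
    fit a small linear critic to predict the nuisance z from the
    features concatenated with the label y, and return how much better
    it does than chance. The real penalty retrains a neural critic at
    every classifier step, which is what makes the optimization bi-level."""
    X = np.column_stack([feats, y])
    w = np.zeros(X.shape[1])
    for _ in range(steps):                       # inner loop: train critic
        grad = X.T @ (X @ w - z) / len(z)
        w -= lr * grad
    resid = z - X @ w
    return z.var() - resid.var()                 # explained variance of z

# Features that encode z incur a large penalty; independent ones do not.
z = rng.standard_normal(500)
y = rng.integers(0, 2, 500).astype(float)
leaky = np.column_stack([z, rng.standard_normal(500)])
clean = rng.standard_normal((500, 2))
assert critic_penalty(leaky, z, y) > critic_penalty(clean, z, y)
```

The outer loop (not shown) would add this penalty to the reweighted classification loss and update the feature extractor, then refresh the critic, at every step.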
Our work draws a connection between ood generalization and ood detection: classifiers that generalize well across spurious correlations also yield good output-based detection of sn-ood inputs. We believe that future work exploring further connections between ood generalization and detection could be fruitful.
Supplementary Material
Figure 3: Reweighting (RW) improves output-based sn-ood detection (top), while reweighting and joint independence (RWJI) generally improves feature-based sn-ood detection (bottom). The varying parameter is the strength of the spurious correlation. Results on other methods follow this trend (Appendix Figure 9).
9. Acknowledgements
This work was generously funded by NIH/NHLBI Award R01HL148248, NSF Award 1922658 NRT-HDR: FUTURE Foundations, Translation, and Responsibility for Data Science, and NSF CAREER Award 2145542. We thank Aahlad Puli and the anonymous AAAI reviewers for their helpful comments and suggestions.
Ethical Statement
ood detection is an important capability for reliable machine learning, spanning applications from robotics and transportation (e.g. novel object identification) to ecology and public health (e.g. novel species detection). sn-ood detection focuses on a particularly difficult type of ood input, and improving detection of such inputs can enhance the capabilities of systems across these applications. However, improved ood detection also makes it easier for bad actors to deploy systems which detect anything that strays from the norm they define. Working towards intended applications while keeping in mind potential misuse can help usher in a future of more reliable machine learning systems with positive impact.
Footnotes
- We call a method feature-based if it cannot be represented as an output-based method.
- This is enforced by the code in Gulrajani and Lopez-Paz (2021), which requires that all classes are present in each environment.
References
- Ajakan H; Germain P; Larochelle H; Laviolette F; and Marchand M. 2014. Domain-Adversarial Neural Networks. arXiv preprint arXiv:1412.4446.
- Arjovsky M; Bottou L; Gulrajani I; and Lopez-Paz D. 2019. Invariant Risk Minimization. arXiv preprint arXiv:1907.02893.
- Badgeley MA; Zech JR; Oakden-Rayner L; Glicksberg BS; Liu M; Gale W; McConnell MV; Percha BL; Snyder TM; and Dudley JT. 2019. Deep learning predicts hip fracture using confounding patient and healthcare variables. NPJ Digital Medicine, 2.
- Belghazi MI; Baratin A; Rajeswar S; Ozair S; Bengio Y; Hjelm RD; and Courville AC. 2018. Mutual Information Neural Estimation. In ICML.
- Bishop CM. 1994. Novelty detection and neural network validation. IEE Proceedings - Vision, Image and Signal Processing, 141(4): 217–222.
- Deng L. 2012. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6): 141–142.
- Fort S; Ren J; and Lakshminarayanan B. 2021. Exploring the limits of out-of-distribution detection. In NeurIPS.
- Goldstein M; Jacobsen J-H; Chau O; Saporta A; Puli AM; Ranganath R; and Miller A. 2022. Learning invariant representations with missing data. In Conference on Causal Learning and Reasoning, 290–301. PMLR.
- Gulrajani I; and Lopez-Paz D. 2021. In Search of Lost Domain Generalization. In ICLR.
- Guo R; Zhang P; Liu H; and Kıcıman E. 2021. Out-of-distribution Prediction with Invariant Risk Minimization: The Limitation and An Effective Fix. arXiv preprint arXiv:2101.07732.
- He K; Zhang X; Ren S; and Sun J. 2016. Deep Residual Learning for Image Recognition. In CVPR, 770–778.
- Hendrycks D; Basart S; Mazeika M; Mostajabi M; Steinhardt J; and Song DX. 2022. Scaling Out-of-Distribution Detection for Real-World Settings. In ICML.
- Hendrycks D; and Gimpel K. 2017. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks. In ICLR.
- Kamoi R; and Kobayashi K. 2020. Why is the Mahalanobis Distance Effective for Anomaly Detection? arXiv preprint arXiv:2003.00402.
- Krueger D; Caballero E; Jacobsen J-H; Zhang A; Binas J; Priol RL; and Courville AC. 2021. Out-of-Distribution Generalization via Risk Extrapolation (REx). In ICML.
- Lee K; Lee K; Lee H; and Shin J. 2018. A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks. In NeurIPS.
- Li Y; Tian X; Gong M; Liu Y; Liu T; Zhang K; and Tao D. 2018. Deep Domain Generalization via Conditional Invariant Adversarial Networks. In ECCV.
- Liang S; Li Y; and Srikant R. 2018. Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks. In ICLR.
- Liu W; Wang X; Owens JD; and Li Y. 2020. Energy-based Out-of-distribution Detection. In NeurIPS.
- Liu Z; Luo P; Wang X; and Tang X. 2015. Deep Learning Face Attributes in the Wild. In ICCV.
- Ming Y; Yin H; and Li Y. 2022. On the Impact of Spurious Correlation for Out-of-distribution Detection. In AAAI.
- Nalisnick E; Matsukawa A; Teh Y; Görür D; and Lakshminarayanan B. 2019. Do Deep Generative Models Know What They Don't Know? In ICLR.
- Netzer Y; Wang T; Coates A; Bissacco A; Wu B; and Ng AY. 2011. Reading Digits in Natural Images with Unsupervised Feature Learning. In NeurIPS Workshop on Deep Learning and Unsupervised Feature Learning.
- Poole B; Ozair S; van den Oord A; Alemi AA; and Tucker G. 2019. On Variational Bounds of Mutual Information. In ICML.
- Puli A; and Ranganath R. 2020. General Control Functions for Causal Effect Estimation from Instrumental Variables. In NeurIPS.
- Puli A; Zhang LH; Oermann EK; and Ranganath R. 2022a. Out-of-Distribution Generalization in the Presence of Nuisance-induced Spurious Correlations. In ICLR.
- Puli AM; Joshi N; He HY; and Ranganath R. 2022b. Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation. arXiv preprint arXiv:2210.01302.
- Ren J; Fort S; Liu J; Roy AG; Padhy S; and Lakshminarayanan B. 2021. A simple fix to Mahalanobis distance for improving near-ood detection. arXiv preprint arXiv:2106.09022.
- Rosenfeld E; Ravikumar PK; and Risteski A. 2021. The Risks of Invariant Risk Minimization. In ICLR.
- Sagawa S; Koh PW; Hashimoto TB; and Liang P. 2020. Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization. In ICLR.
- Salehi M; Mirzaei H; Hendrycks D; Li Y; Rohban MH; and Sabokrou M. 2021. A unified survey on anomaly, novelty, open-set, and out-of-distribution detection: Solutions and future challenges. arXiv preprint arXiv:2110.14051.
- Sastry CS; and Oore S. 2020. Detecting Out-of-Distribution Examples with In-distribution Examples and Gram Matrices. In ICML.
- Sudarshan M; Puli AM; Tansey W; and Ranganath R. 2022. DIET: Conditional independence testing with marginal dependence measures of residual information. arXiv preprint arXiv:2208.08579.
- Sugiyama M; Suzuki T; and Kanamori T. 2012. Density Ratio Estimation in Machine Learning. Cambridge University Press.
- Yang J; Zhou K; Li Y; and Liu Z. 2021. Generalized Out-of-Distribution Detection: A Survey. arXiv preprint arXiv:2110.11334.
- Zhang J; Lopez-Paz D; and Bottou L. 2022. Rich Feature Construction for the Optimization-Generalization Dilemma. In ICML.
- Zhang LH; Goldstein M; and Ranganath R. 2021. Understanding Failures in Out-of-distribution Detection with Deep Generative Models. In ICML.
- Zhou B; Lapedriza À; Khosla A; Oliva A; and Torralba A. 2018. Places: A 10 Million Image Database for Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40: 1452–1464.