NIHPA Author Manuscript; available in PMC: 2025 Oct 3.
Published in final edited form as: IEEE Trans Vis Comput Graph. 2025 Oct;31(10):7436–7447. doi: 10.1109/TVCG.2025.3546644

AttributionScanner: A Visual Analytics System for Model Validation with Metadata-Free Slice Finding

Xiwei Xuan 1, Jorge Piazentin Ono 2, Liang Gou 3, Kwan-Liu Ma 1, Liu Ren 2
PMCID: PMC12490708  NIHMSID: NIHMS2109394  PMID: 40031557

Abstract

Data slice finding is an emerging technique for validating machine learning (ML) models by identifying and analyzing subgroups in a dataset that exhibit poor performance, often characterized by distinct feature sets or descriptive metadata. However, in the context of validating vision models involving unstructured image data, this approach faces significant challenges, including the laborious and costly requirement for additional metadata and the complex task of interpreting the root causes of underperformance. To address these challenges, we introduce AttributionScanner, an innovative human-in-the-loop Visual Analytics (VA) system, designed for metadata-free data slice finding. Our system identifies interpretable data slices that involve common model behaviors and visualizes these patterns through an Attribution Mosaic design. Our interactive interface provides straightforward guidance for users to detect, interpret, and annotate predominant model issues, such as spurious correlations (model biases) and mislabeled data, with minimal effort. Additionally, it employs a cutting-edge model regularization technique to mitigate the detected issues and enhance the model’s performance. The efficacy of AttributionScanner is demonstrated through use cases involving two benchmark datasets, with qualitative and quantitative evaluations showcasing its substantial effectiveness in vision model validation, ultimately leading to more reliable and accurate models.

Keywords: Model validation, data slicing, data-centric AI, human-assisted AI, visual analytics

I. Introduction

As machine learning (ML) technologies continue to prevail in various domains, the importance of model interpretation and validation has grown significantly. Detecting a model’s failure and understanding the underlying reasons are crucial in promoting greater model accuracy, transparency, and accountability. A prevalent and harmful issue in ML models is “spurious correlation” [1], where a model erroneously utilizes irrelevant image features for classification. For instance, in hair color classification, a model should make decisions according to hair features, termed as “core correlation”, while the use of other features (e.g., background or facial attributes) denotes “spurious correlation.” Such spurious correlations, regardless of model accuracy, pose substantial challenges including poor generalization in production [2] and AI fairness issues [3].

However, models often perform inconsistently across different subsets of data, making it challenging to validate a model comprehensively to uncover hidden issues [4]–[8]. Thus, the crucial demands of identifying and characterizing such problematic data subsets have fueled data slice finding techniques [4]–[6], [8]–[10]. The term “data slice” refers to data subgroups with coherent features, often defined by metadata, model representations, or other descriptive information. By detecting, understanding, and resolving issues in data slices, ML experts can improve model performance and ensure safer and more reliable deployments in high-stakes domains [11]–[14]. To enhance the trustworthiness, eXplainable Artificial Intelligence (XAI) techniques [15] are also utilized to reveal potential factors adversely impacting the model and apply targeted model regularization [16].

Despite the promising benefits of these techniques, several hurdles remain unaddressed for a thorough model validation.

Challenges of Data Slice Finding.

State-of-the-art methods for data slice finding demand extensive metadata associated with each single instance, which are often obtained from manual annotations [17]–[19] or generated by pre-trained vision-language models [14]. However, it is expensive to collect manually annotated metadata, and vision-language models can produce erroneous results [20], especially in domain-specific scenarios. When neither metadata nor suitable pre-trained models are available, obtaining data slices becomes challenging, requiring either manual examination of numerous instances or computationally expensive model training [21], [22].

Challenges of XAI for Vision Models.

Explanation methods for vision models typically operate at the instance level [23]–[26], producing explanations for individual data instances and leading to less-actionable insights: when a problematic model behavior is observed, it remains unclear whether it is a rare or common case, and whether it reflects an actual model issue. Reaching a sound conclusion would require a thorough examination of all instances, which is practically infeasible.

Our approach.

We present AttributionScanner, a Visual Analytics (VA) system designed for efficient vision model validation with slice finding, eliminating the costly requirements of metadata or other descriptive information. Our innovative workflow employs attribution-based explanation techniques to create interpretable feature representations for slice finding, enabling discriminative slice construction and summary to streamline user exploration. Additionally, the methods provided by AttributionScanner are applicable to various computer vision models, collaboratively forming a human-in-the-loop approach for model validation and enhancement. Our main contributions include:

  • AttributionScanner, a novel framework and VA system to validate vision models with data slice finding. Unlike existing methods, AttributionScanner removes the resource-intensive need for metadata to generate interpretable data slices.

  • A design for slice summarization, Attribution Mosaic, illustrating the significant pattern of model behaviors in data slices, which reduces the effort in conventional approaches of examining a large volume of individual data samples.

  • An iterative workflow supporting model validation and enhancement, which ensures generated insights are actionable, enabling users to swiftly detect, understand, and fix model issues related to spurious correlations and mislabeled data.

II. Related Work

A. Data Slice Finding

Data slice finding is a critical technique for identifying coherent subsets of data where ML models may perform inconsistently. These subsets, or slices, are often defined by similar metadata, model representations, data statistics, or other descriptive information [9], [17], [18], [27]. By analyzing problematic data slices, ML experts can design targeted solutions to address model issues in edge cases, improving the model robustness and reliability [4], [7]. However, existing approaches often rely on expensive metadata to obtain meaningful subgroups, limiting their applicability. To address this, recent work has focused on leveraging feature vectors in the latent space and clustering them to produce data slices [21], [22], [28], [29]. For example, Domino [14] uses a pre-trained vision-language model (VLM) to generate textual descriptions of slices. However, this approach can backfire if there is a significant domain mismatch, yielding less meaningful slices. Our work fills this gap by introducing a system that provides explainable slices without requiring metadata or pre-trained VLMs, making it more broadly applicable and reliable.

B. Visual Analytics for Model Explanation and Validation

Visual analytics has emerged as a powerful tool for both model explanation [30]–[36] and validation [37]–[42]. For instance, Kahng et al. [43] use pairwise feature combinations to produce data slices and assess model performance. FairVis [44] enables domain users to manually slice data to identify model biases. Similarly, Errudite [45] develops a domain-specific language for slicing textual documents. While manual slicing is useful, it is not scalable due to its labor-intensive nature. To address this, SliceTeller [10] combines an automatic slice-finding algorithm with a visual analytics tool, allowing for iterative refinement of models. However, it still relies on structured metadata to produce meaningful slices. Our approach, AttributionScanner, eliminates the need for expensive metadata while incorporating effective human participation, making it a scalable and accessible solution for explainable model validation.

C. XAI Techniques for Neural Networks

Explainable AI (XAI) techniques have become essential for interpreting neural networks, often using visualizations to provide insights into model behavior. Attribution-based methods [23]–[26], [46], such as GradCAM [25], generate heatmaps or decision boundary explanations for individual instances. While these methods offer intuitive and straightforward results, they are instance-based and fail to capture broader patterns in model behavior. Another line of work utilizes optimization to produce synthetic images for model explanation, such as Feature Visualization [47], [48] and Feature Inversion [49], which create visualizations that maximally activate specific neurons/layers, or reconstruct images from feature vectors. These methods provide insights into the learned patterns of a model but are often limited to single instances or randomly aggregated groups. In AttributionScanner, we bridge this gap by leveraging Feature Inversion to highlight shared patterns across data slices, providing visual summaries of model behavior at the slice level. Additionally, we integrate GradCAM to explain individual instances, offering a comprehensive view of model behavior at both the micro and macro levels.

III. AttributionScanner

A. Design Requirements

Slice finding has garnered increased attention due to its potential to facilitate a thorough ML model evaluation. However, this task is particularly challenging in the realm of vision models considering the inherent unstructured nature of vision data. A pivotal shortcoming we observed in existing approaches is their dependency on metadata, including pre-collected or vision-language model-generated ones, to compute meaningful data slices. However, the acquisition of metadata is resource-intensive, demanding substantial human effort and financial investment. Additionally, the vision-language models are commonly trained on general-purpose datasets, potentially leading to inaccurate or biased results in specific domains. Stemming from these observations, we delineated the subsequent requirements for a system capable of conducting metadata-free, slice-driven model validation:

R1. Metadata-free. Our method should yield meaningful data slices without metadata, i.e., neither using metadata as input nor generating or extracting metadata.

R2. Interpretability. We should provide sufficient and intuitive information to enhance model transparency, supporting interpretations of a data slice’s shared attributes for streamlined exploration and individual instances for insight verification as needed.

R3. Slice Overview. Mandatory manual inspection of individual images is impractical for model validation, especially with a high volume of data. To enhance efficiency, our system should provide a visual summary of data slices, depicting each data slice’s key features and attributions.

R4. Actionable insights. The system should ensure the obtained insights are actionable, which requires careful design of visual guidance, issue annotation, and corresponding model or data regularization.

B. System Workflow

AttributionScanner is a human-in-the-loop system, taking as input an image dataset and a trained CNN, such as ResNet [50] for image classification, and yielding interpretable data slices to assist experts in efficient model validation.

A workflow of our approach is presented in Fig. 1, where we foster explainable slice finding without metadata (R1, R2) by leveraging GradCAM [25] and Feature Inversion [48]. The first “explainable data slice finding” phase interpolates model attributions into the latent space to craft attribution-weighted feature vectors, forming an attribution representation space with locally consistent model attributes (R1). Then we conduct clustering over this space to generate data slices. Although the produced slices maintain consistent attributions, they still demand significant human effort to identify and understand their shared patterns. To streamline slice exploration and issue detection, the “slice summarization and annotation” phase introduces Attribution Mosaic, which visually elucidates each slice’s dominant attributions (R3), synchronized with instance-level heatmaps for insight verification (R2). Upon user annotation of an issue like “spurious correlations”, our Spuriousness Propagation method automatically estimates a spuriousness score for each slice to help users uncover other hidden model biases (R4). Afterward, our workflow enables the mitigation of detected issues in data or model, corresponding to the “slice error mitigation” phase (R4).

Fig. 1.

The AttributionScanner workflow involves three phases: Explainable Data Slice Finding, Slice Summarization and Annotation, and Slice Error Mitigation. In the first phase, GradCAM assists the generation of feature vectors, from which data slices are obtained. In the second phase, users identify and annotate slice error types such as core/spurious correlations and noisy labels with the help of the Attribution Mosaic and Spuriousness Propagation. In the third phase, the annotations and user-verified spuriousness are used on the ML side to mitigate slice errors.

C. System Interface

AttributionScanner is comprised of four main components. Fig. 2 depicts a use case where our system is used to validate a hair color classification model.

Fig. 2.

AttributionScanner applied to the model validation of a hair color classifier trained on the CelebA dataset. (A) System Menu, enabling the selection of dataset, model, and visualization options (Layout, Confusion Matrix, Scatter Plot/Attribution Mosaic, and Contour Visibility). (B) Slice Table, showing slice metrics. (C) Attribution Mosaic, showing a visual overview of all data slices, which can also be displayed as a confusion matrix view (E). (D) Slice Detail View, showing individual images or Attribution Heatmaps belonging to a selected data slice.

The System Menu (Fig. 2 A) provides options for dataset and model selection, visualization layout configuration, and color encoding choice. Two visualization layouts are available: Combination view and Confusion Matrix view, allowing for an overall view of data slices or a more detailed inspection categorized by error types (R3). The provided color encoding choices include Slice Name, Slice Accuracy, Slice Confidence, and Spuriousness Probability, adaptable to user requirements.

The Slice Table (Fig. 2 B) presents various slice performance metrics to facilitate dataset navigation. Available metrics include Accuracy, Confidence, and Spuriousness Probability, which is produced by the Spuriousness Propagation method upon user annotation (Sec. III-E). This table assists in detecting intriguing patterns, like high accuracy with wrong attributions possibly indicative of model biases. On the other hand, low accuracy with correct attributions indicates the potential existence of mislabeled data (R3, R4).

Our system’s third component is the Attribution Mosaic, rendering the dominant visual patterns of each slice with respect to the model’s decision (R2, R3), either in a unified form (Fig. 2 C) or segregated by the model’s confusion matrix (Fig. 2 E). The design method for this view is detailed in Sec. III-E. Moreover, Attribution Mosaic incorporates visualization of user-specified metrics through the coloration of mosaic boundaries, providing a vital guide in pinpointing problematic slices (R3). Annotation of a slice is facilitated by a double-click action on a mosaic tile, allowing for effortless insight collection (R4).

The Slice Detail View (Fig. 2 D) showcases individual image samples of the selected data slice, rendering either images or attribution heatmaps. This view enables an in-depth examination of each slice (R2).

D. Explainable Data Slice Finding

In this section, we delineate our Explainable Data Slicing method. Our approach presumes a typical CNN model architecture, comprising a sequence of convolutional layers succeeded by a fully connected (FC) layer, a commonality across most CNN models [25], [50], [51].

  1. Attribution-Weighted Feature Vector Generation: In this section, we detail the generation of attribution-weighted feature vectors (R1, R2). Notably, existing approaches already employ attribution-weighted rather than original feature vectors, aiming to provide better model explanations [49], [52], [53] or to conduct explainable model enhancement [7], [16]. Because attribution-weighted feature vectors preserve both data features and model attributions [7], [16], [53], we adopt this design choice to support our explainable data slice construction. Specifically, we leverage GradCAM to obtain model attributions in the latent space, due to its demonstrated efficacy via sanity checks [54] and prevalent usage [16], [28], [55]–[59]. In our approach, GradCAM is interchangeable with other XAI techniques such as CAM [24] or RISE [26] (R2).

    As shown in Fig. 3, an input image is processed by the model to extract the feature vector $F$, and GradCAM is performed to produce attributions in the latent space, $W$, which can be linearly interpolated to the image size to form the attribution heatmap. $W \in \mathbb{R}^{m \times n}$ and is normalized such that $\sum_{i=1,j=1}^{i=m,j=n} W_{ij} = 1$ to compute the weighted average of $F$. The resulting attribution-weighted feature vector $F_W \in \mathbb{R}^{1 \times 1 \times d}$ is:
    $$F_W = F \odot W = \sum_{i=1,j=1}^{i=m,j=n} F_{ij} W_{ij}. \tag{1}$$
  2. Data Slice Identification: Representation Space Construction. To derive meaningful data slices (R1, R2), we need to construct a representation space with neighbor consistency in terms of both model attributions and data features. We first conduct quantitative and qualitative experiments to validate whether attribution-weighted feature vectors fulfill this requirement, comparing the feature representation space (derived from the original feature vectors) with the attribution representation space (derived from the attribution-weighted feature vectors). Following established practices [4], [5], [14] and considering time efficiency [60], [61], we use dimensionality reduction with UMAP [60] to produce both spaces. In each space, three points are randomly sampled and their nearest neighbors are obtained. We qualitatively examine the consistency of their corresponding attribution heatmaps, as shown in Fig. 4. Neighboring instances' model attributions in the original space are dissimilar, while our constructed space exhibits greater local consistency of model attributions.
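Eq. (1) above reduces to an attribution-weighted average over the spatial grid of the feature map. A minimal NumPy sketch (the function and variable names here are ours for illustration, not part of the system):

```python
import numpy as np

def attribution_weighted_vector(F, W):
    """Compute the attribution-weighted feature vector F_W (Eq. 1).

    F: (m, n, d) feature map from the last convolutional layer.
    W: (m, n) attribution map (e.g., from GradCAM), not yet normalized.
    Returns a (d,) vector: the attribution-weighted average of F.
    """
    W = W / W.sum()                      # normalize so sum_ij W_ij = 1
    return np.einsum('ijd,ij->d', F, W)  # sum_ij F_ij * W_ij

# Toy example: with a uniform attribution map, the result reduces
# to plain global average pooling over the spatial grid.
F = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
W = np.ones((2, 2))
fw = attribution_weighted_vector(F, W)
```

A peaked GradCAM map instead emphasizes the spatial features the model actually attended to, which is what makes the resulting vectors useful for attribution-consistent clustering.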

Fig. 3.

Attribution-weighted feature vector generation. An image is forwarded through CNN, where the corresponding feature vector F and weight matrix W can be extracted to calculate the attribution-weighted feature vector FW.

Fig. 4.

Comparison of representation spaces. (a) Feature representation space computed on original feature vectors. (b) Attribution representation space computed on attribution-weighted feature vectors.

In addition, we quantitatively compute the neighbor consistency with respect to data features and model attributions and report the results in Fig. 4. For each point, we compute the average cosine similarity between it and its top 10 neighbors using their original feature vectors, representing data feature consistency; we also compute the average Wasserstein similarities of their attribution masks, where higher values of both scores indicate better consistency. Lastly, we average these scores to quantify the overall neighbor consistency of data features and model attributions for each space. As shown in Fig. 4, compared to the original feature space, our constructed space has comparable data feature similarity (0.8987) and significantly higher attribution similarity (1.0659), indicating that attribution-weighted feature vectors help construct a space with satisfactory local consistency in terms of both data features and model attributions, providing a solid basis for computing interpretable data slices (R2).
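The feature-consistency half of this measurement can be sketched as follows. This is a simplified version under our own assumptions: brute-force neighbor search in the 2D embedding and cosine similarity only, omitting the Wasserstein term over attribution masks:

```python
import numpy as np

def neighbor_feature_consistency(features, embedding, k=10):
    """Average cosine similarity between each point's original feature
    vector and those of its k nearest neighbors in the 2D embedding.

    features:  (N, d) original feature vectors.
    embedding: (N, 2) projected coordinates (e.g., from UMAP).
    """
    # Pairwise squared distances in the embedding define neighborhoods.
    d2 = ((embedding[:, None, :] - embedding[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :k]

    # Cosine similarity against each neighbor's original feature vector.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = np.einsum('nd,nkd->nk', unit, unit[neighbors])
    return sims.mean()

# Two tight groups whose embedding neighbors share identical features,
# so the consistency score should be exactly 1.
feats = np.array([[1., 0.], [1., 0.], [1., 0.],
                  [0., 1.], [0., 1.], [0., 1.]])
emb = np.array([[0., 0.], [0.1, 0.], [0., 0.1],
                [10., 0.], [10.1, 0.], [10., 0.1]])
score = neighbor_feature_consistency(feats, emb, k=2)
```

The same loop, evaluated on both the feature and attribution representation spaces, yields the kind of comparison reported in Fig. 4.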

Slice Identification.

On top of the attribution representation space, we apply K-Means clustering to obtain data slices. Similarly to Domino [14], we apply over-clustering by incrementing the number of clusters and monitoring the model attribution consistency within each slice, until a coherent grouping of samples is attained. Specifically, the attribution consistency of a data slice is measured as the average cosine similarity of each instance’s attribution-weighted feature vector with the slice centroid. We set a threshold at 0.8 to determine slice coherence in our case studies. Note that we allow ML experts to adjust this threshold based on the specific requirements of their datasets and tasks. For rare slices with a small number of data samples featuring abnormal model attributions, this iterative process, along with our slice mosaic design detailed in Sec. III-E, ensures such slices are not overlooked. By combining automated clustering with user-driven refinement, our approach provides a comprehensive and efficient solution for identifying both common and rare slices.
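The over-clustering loop can be sketched with scikit-learn's K-Means. The stopping rule follows the coherence criterion above (average cosine similarity of each member to its slice centroid, with the 0.8 threshold from our case studies), while the synthetic data and parameter names are ours:

```python
import numpy as np
from sklearn.cluster import KMeans

def slice_coherence(X, labels, centers):
    """Per-slice average cosine similarity of members to the slice centroid."""
    scores = []
    for c in range(centers.shape[0]):
        members = X[labels == c]
        sims = (members @ centers[c]) / (
            np.linalg.norm(members, axis=1) * np.linalg.norm(centers[c]))
        scores.append(sims.mean())
    return np.array(scores)

def overcluster(X, k0=2, threshold=0.8, k_max=50, seed=0):
    """Increment the number of clusters until every slice's attribution
    consistency meets the coherence threshold."""
    for k in range(k0, k_max + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        if slice_coherence(X, km.labels_, km.cluster_centers_).min() >= threshold:
            return km
    return km  # fall back to the largest k tried

# Three well-separated blobs: k=2 must merge two of them (low coherence),
# so the loop stops only at k=3.
rng = np.random.default_rng(0)
blobs = np.array([[10., 0.], [0., 10.], [-10., 0.]])
X = np.vstack([c + 0.1 * rng.standard_normal((30, 2)) for c in blobs])
km = overcluster(X)
```

In the real system, X holds the attribution-weighted feature vectors and the threshold is user-adjustable, as noted above.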

E. Slice Summarization and Annotation

1). Attribution Mosaic:

After obtaining slices with internally consistent model attributions that cover the entire dataset, we design Attribution Mosaic, a novel visualization technique, to visually summarize data slices in a mosaic depiction (R3). As shown in Fig. 5, we first compute the convex hull of each slice based on the projected points in 2D space; we then distill each slice's predominant visual patterns via Feature Inversion, integrating the shape constraint of the convex hull. The final landscape featuring the slices' attribution summaries is presented as the Attribution Mosaic, allowing users to easily identify slices with rare or uncommon model attributions through visual inspection and comparison.

Fig. 5.

Attribution Mosaic generation. We compute data slices after acquiring attribution-weighted feature vectors (FW). Then Feature Inversion is conducted according to FW and the mosaic boundary of each slice to visualize their common patterns, forming Attribution Mosaic.

Mosaic Boundary Generation.

Given that the convolutional modules take rectangular-shaped input, conventional Feature Inversion produces rectangular visualizations; merely displaying these results over the projected cluster centroids leads to overlapping visualizations, hindering effective user exploration. Since K-Means is a centroid-based algorithm that partitions the space into strict non-overlapping Voronoi cells, the convex hulls [62] of our data slices are guaranteed not to overlap. We therefore compute the convex hull of each slice in the attribution representation space, which provides a mosaic layout for our slice visualization.
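Computing each slice's boundary only requires a standard 2D convex-hull routine over the slice's projected points. A self-contained sketch using Andrew's monotone chain (in practice a library routine such as SciPy's ConvexHull would serve equally well):

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull of 2D projected points.

    points: iterable of (x, y) tuples (one slice's instances in the
    attribution representation space). Returns hull vertices in
    counter-clockwise order; these define the slice's mosaic boundary.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):  # z-component of (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                 # build the lower hull left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):       # build the upper hull right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # drop duplicated endpoints

# A unit square plus an interior point: the hull is the four corners.
hull = convex_hull([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)])
```

Because the K-Means cells are convex and disjoint, each hull stays inside its own cell, which is what guarantees the tiles tessellate without overlap.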

Mosaic Drawing with Feature Inversion.

We deploy the Feature Inversion technique [63]–[65], an optimization-based method for visualizing shared features across a group of feature vectors, to depict the significant pattern in each slice (R3). Specifically, we first create an image I of size W×H, the minimal rectangle encompassing the mosaic boundary M of a specific slice. The optimization updates only the pixels of I that lie inside M, synthesizing a mosaic-shaped image that visualizes the dominant pattern of the feature vectors. To enhance the quality of the synthesized images, we initialize them with the average of the original images instead of Gaussian noise [64] and parameterize them in Fourier space [65], making them better aligned with natural images rather than less-interpretable artificial patterns. The optimization process is defined by:

$$x^{*} = \operatorname*{arg\,max}_{x \in M} \; \ell\left(\phi(x), \phi_{0}\right) + \lambda R(x), \tag{2}$$

where $x^{*}$ denotes the pixels in I bounded by the mosaic shape M, the convex hull of the slice; the values of $x^{*}$ are updated iteratively through the optimization process to maximize the target function. $\phi: \mathbb{R}^{W \times H \times C} \to \mathbb{R}^{d}$ is the representation function, i.e., the model's convolutional modules. $\phi_{0}$ is a target feature activation, which in our case is the average of the attribution-weighted feature vectors of a data slice. $\ell(\cdot, \cdot)$ is the loss function, where the Euclidean loss is used here, and $R(x)$ is a Total Variation (TV) regularization term that improves the quality of the results [49]. We thus obtain a mosaic-shaped image visualizing the shared attribution patterns within a data slice. Lastly, the visualizations of all data slices are exhibited at their respective positions within the Attribution Mosaic.
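To make the optimization concrete, here is a deliberately tiny, framework-free sketch of Eq. (2)'s structure: drive a representation function toward a target activation under TV regularization, updating only pixels inside the mosaic mask. It uses finite-difference gradients and a trivial stand-in for $\phi$ (a global mean), and is written as minimization of the Euclidean loss; the real system instead backpropagates through the CNN's convolutional modules and parameterizes the image in Fourier space:

```python
import numpy as np

def tv(x):
    """Total variation of a 2D image, the regularizer R(x) in Eq. (2)."""
    return np.abs(np.diff(x, axis=0)).sum() + np.abs(np.diff(x, axis=1)).sum()

def invert(phi, phi0, mask, shape, steps=200, lr=1.0, lam=1e-3, eps=1e-4):
    """Toy feature inversion: optimize the masked pixels of an image so
    that phi(x) matches the target activation phi0 (Euclidean loss),
    with TV regularization. Gradients are taken numerically, so phi can
    be any black-box feature extractor."""
    x = np.zeros(shape)
    idx = np.argwhere(mask)                    # only pixels inside M move
    for _ in range(steps):
        for i, j in idx:
            base = np.sum((phi(x) - phi0) ** 2) + lam * tv(x)
            x[i, j] += eps
            pert = np.sum((phi(x) - phi0) ** 2) + lam * tv(x)
            x[i, j] -= eps
            x[i, j] -= lr * (pert - base) / eps  # finite-difference step
    return x

# phi: global average of the image (a stand-in for conv features).
phi = lambda x: np.array([x.mean()])
mask = np.ones((4, 4), dtype=bool)             # the "mosaic" region M
x_star = invert(phi, np.array([1.0]), mask, (4, 4))
```

With a mask covering a convex-hull region rather than the full rectangle, the same loop yields the mosaic-shaped tiles composing the Attribution Mosaic.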

2). Slice Annotation and Spuriousness Propagation:

The Attribution Mosaic facilitates rapid comprehension of the main content of different slices, supporting the identification of potential spurious correlation issues (R4).

To streamline the detection of spuriousness, we introduce a metric termed the Spuriousness probability, ranging from 0 to 1 and indicating the likelihood that a spurious correlation exists. We employ the Label Spreading algorithm [66] to propagate user-annotated spurious/core information to other slices. Specifically, a graph is constructed where nodes represent data points and edge weights encode similarities between points; in our case, we fit the graph to our 2D attribution representation space. Based on this graph structure, the Label Spreading algorithm iteratively updates the labels of unannotated slices, ensuring local and global label consistency: similar slices should exhibit similar spuriousness scores, and the propagated scores should respect the annotated spurious/core labels. While only one annotation of "core" or "spurious" is required to initialize the propagation process, the spuriousness scores are dynamically updated as users provide additional annotations. The iteratively refined scores are visualized in the interface to guide further annotation.
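A minimal sketch of this propagation step using scikit-learn's LabelSpreading; the toy coordinates stand in for slice positions in the 2D attribution representation space, and -1 marks unannotated slices:

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

# 2D positions of slice centroids in the attribution representation space.
coords = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],   # left group
                   [5.0, 5.0], [5.2, 5.1], [5.1, 4.9]])  # right group

# One slice annotated "core" (0), one "spurious" (1); -1 = unannotated.
labels = np.array([0, -1, -1, 1, -1, -1])

model = LabelSpreading(kernel='rbf', gamma=1.0)
model.fit(coords, labels)

# The class-1 column of the label distribution acts as a per-slice
# spuriousness probability in [0, 1].
spuriousness = model.label_distributions_[:, 1]
```

Each time the user adds or overrides an annotation, refitting on the updated label vector yields the dynamically refined scores described above.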

The Spuriousness probabilities obtained through the Attribution Mosaic offer two benefits. First, they streamline slice exploration by highlighting potential spurious correlations, supporting the identification and assessment of problematic slices. Second, after user verification, these probabilities inform the model-side strategies for issue mitigation (Sec. III-G) (R4). Since the propagated scores are visualized alongside the Attribution Mosaic, users can easily validate and override incorrect propagations, ensuring human decisions take precedence over automated computations.

F. System Design Considerations

In designing AttributionScanner, various combinations of visual components have been evaluated to ensure the system fulfills our design requirements discussed in Sec. III-A, as outlined in Table I. Our system should be capable of providing four facets of information: (1) slice performance indicator (R3), (2) slice attribution overview (R2, R3), (3) instance performance indicator (R2), and (4) instance attribution inspection (R2, R4). The notation in "Design Components" is consistent with Fig. 2.

TABLE I.

System Design Considerations.

Design Components: Slice Table, Slice Detail View, Scatter Plot, Attribution Mosaic.
System Capabilities: Slice Performance, Slice Attribution, Instance Performance, Instance Attribution.

In Table I, the initial configuration, "Slice Table & Slice Detail View," is capable of presenting slice performance indicators and instance attributions. However, numerical matrices of performance indicators are not sufficient to guide users in uncovering hidden model issues, such as spurious correlations, which can occur regardless of high or low performance. A subsequent iteration included a scatter plot showing the attribution representation space. This configuration enables the system's instance performance indicator capability but still misses the slice attribution overview. To the best of our knowledge, few approaches visually elucidate potential model spuriousness at scale; our third configuration, "Slice Table & Slice Detail View & Attribution Mosaic," fills this gap and comprehensively satisfies all the essential system capabilities, fulfilling our design requirements.

Attribution Mosaic, alongside the Slice Table and Slice Detail View, enriched the system’s ability to provide granular insight into both slice and instance-level attributions and performance indicators (R2, R3). The coordinated Slice Table and Attribution Mosaic can guide users to pinpoint hidden issues, and the Slice Detail View enables users to further verify their insights (R4). This robust configuration underscores our meticulous design process in assuring that AttributionScanner is not only capable of furnishing critical insights but also does so in an intuitive and user-friendly manner.

The tabulated design considerations and the ensuing choice of system component configuration aim at maximizing the system’s utility and user-centric functionality. With our finalized configuration (the last row of Table I), users are empowered with sufficient support for problem detection, interpretation, and mitigation in the model validation.

G. Slice Error Mitigation for Model Enhancement

One effective approach to mitigating spurious correlations in ML models is through model re-training, which can improve the model’s robustness without changing its architecture. We leverage the Core Risk Minimization (CoRM) method introduced in [2]. CoRM corrupts non-core image regions with random Gaussian noise and retrains the model using the noise-corrupted data, which has been shown to be effective in mitigating a model’s reliance on spurious features.

We first export slices with high Spuriousness probabilities, indicating undesired correlations. For the images in these slices, the model attribution masks (e.g., GradCAM masks) highlight spurious regions, and we utilize such masks to add random Gaussian noise to those regions. For a single image, this process can be represented by $x' = x + m \odot z$, where $x$ is the input image, $m$ is the attribution mask, and $z$ is the generated Gaussian noise matrix. All three variables are of the same size as the input image, and $\odot$ denotes the Hadamard product. Fig. 6 shows some examples of this operation, with exaggerated noise for presentation purposes. After replacing the original data with these noise-corrupted slices, we retrain the model and evaluate whether the spurious correlation has been reduced (R4). In Sec. V, we explain the evaluation metrics used to quantify the effectiveness of AttributionScanner in mitigating spurious correlations.
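The corruption step $x' = x + m \odot z$ is a one-liner in NumPy; the helper below is our own illustrative wrapper, not code from the paper:

```python
import numpy as np

def corrupt_spurious_regions(x, m, sigma=1.0, rng=None):
    """CoRM-style corruption x' = x + m * z: add Gaussian noise only
    where the attribution mask flags spurious regions.

    x: (H, W, C) image; m: (H, W, C) mask in [0, 1] highlighting
    spurious regions; z: Gaussian noise of the same shape.
    """
    rng = np.random.default_rng(rng)
    z = rng.normal(0.0, sigma, size=x.shape)
    return x + m * z  # the elementwise (Hadamard) product confines noise to the mask

# Toy example: only the top half of the image is marked spurious,
# so only those pixels receive noise.
img = np.zeros((4, 4, 3))
mask = np.zeros((4, 4, 3))
mask[:2] = 1.0
noisy = corrupt_spurious_regions(img, mask, rng=0)
```

Retraining on such corrupted copies discourages the model from relying on the masked (spurious) regions while leaving core regions untouched.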

Fig. 6.

Example of spurious correlation mitigation with CoRM for hair color classification: ❶ The Attribution Mosaic indicates a problem with a spurious feature: mouth. ❷ An example of how noise is applied by CoRM: original image (left), GradCAM activations (middle), noise added to the spurious regions of the image (right).

IV. Case Studies

In this section, we present two case studies with publicly available vision datasets to benchmark and evaluate the capabilities of AttributionScanner. The primary objective of these case studies is to demonstrate how AttributionScanner empowers ML experts and practitioners to detect, evaluate, and interpret potential issues in vision models. Fig. 8 illustrates common usage patterns of our system: users can begin the analysis with either a rank-driven evaluation (A) or a visual-driven evaluation (B). Once a slice of interest is identified, users can delve deeper into individual samples in the Slice Detail View (C). Finally, the user can annotate the data slices and continue their analysis (D).

Fig. 8.

Usage patterns for AttributionScanner derived from case studies. The analysis can start with A a rank-driven exploration (ranking the slices by A1 score (e.g., model accuracy) or A2 spuriousness), or B a visual-driven exploration. After inspecting a slice summary in the Attribution Mosaic (B), users can inspect individual samples in the Slice Detail View (C). Finally, they can annotate their insights (D).

A. Hair Color Classifier Validation - Finding Edge Cases

  • Overview. This case study involves the Large-scale CelebFaces Attributes (CelebA) dataset [67], comprising 202,599 face images. Each image is labeled as one of {not gray hair, gray hair}, corresponding to labels {0, 1}, respectively. With train, validation, and test splits of 8 : 1 : 1, we adopt transfer learning to train a ResNet50 [50] binary image classifier. After iteratively fine-tuning hyper-parameters, we obtain a model with 98.03% classification accuracy. The ML experts then explain and troubleshoot this hair color classifier with AttributionScanner under these settings: n_neighbors = 5, min_dist = 0.01, and n_components = 2 for the UMAP algorithm, and n_clusters = 50 for K-Means.

  • Does the model behave correctly on well-performing slices? One basic expected behavior of a hair color classification model is to capture hair features. At first glance of the Attribution Mosaic (Fig. 2 C), ML experts notice slice_22 and slice_5 on the left, which lie apart from the other slices in the Attribution Mosaic. Their mosaic visualizations indicate gray hair patterns, which means the model uses the correct features, i.e., core features. By verifying the corresponding attribution heatmaps (Fig. 2 D), the experts confirm this insight and annotate both slices as core feature with the description Core: gray hair. Once the experts save the annotation, AttributionScanner automatically propagates it and provides a Spuriousness probability for each slice. With this guidance, the experts observe that many slices on the right of the Attribution Mosaic have higher Spuriousness probabilities (Fig. 2 C E). Their mosaic visualizations do not show any hair patterns, raising valid doubts about whether the model behaves correctly on these slices. Moreover, the model's predictions on these slices are correct (100% prediction accuracy), which makes the experts worry that the model is largely biased by spurious features. Through investigation, they find that the model mistakenly uses mouth and eye features to predict hair color in top-performing slices such as slice_25, slice_14, and slice_2 (Fig. 9).

  • What underlying factors contribute to unexpected behaviors? By sorting the Slice Table in descending order of propagated Spuriousness, ML experts notice more slices with high Spuriousness, such as slice_35, and want to understand why such unexpected model behaviors happen. They switch the Attribution Mosaic into its confusion matrix form (Fig. 2 E) and click the name "slice_35" to highlight this slice across the four sub-views and inspect the explanations. They notice that images in the "FN" group of this slice have wrong labels — they should be labeled "not gray hair" rather than "gray hair". They mark this issue as wrong labels and investigate its neighboring slices in the Attribution Mosaic, finding that slice_2_FN also features wrong labels.

  • Why do slices underperform? After fully exploring the well-performing but biased slices, the experts wonder whether the underperforming slices are indeed issue-free or what contributes to their low performance. By sorting the Slice Table (Fig. 2 B) in ascending order of accuracy, they notice the model achieves only 72.41% accuracy on slice_44 and find that the slice's visualization shows only colorful patterns without any recognizable content. After checking the attribution heatmaps, they find that the model mistakenly uses image backgrounds to make hair color predictions (Fig. 9). This spurious correlation issue stands out, and they annotate the slice as "spurious" with the description "Spurious: backgrounds". Similar issues also occur in its neighboring slices in the Attribution Mosaic, and AttributionScanner automatically propagates higher Spuriousness to them.

  • Insights summary. Fig. 9 presents examples of the issues detected in the current model, which include wrong attributions in underperforming slices, model biases in well-performing slices, and noisy labels. Through the aforementioned procedure, this case study demonstrates how AttributionScanner supports users in uncovering and interpreting potential model issues with visual summaries and other guidance. Based on the annotations from ML experts, we employ the CoRM framework to mitigate the detected errors, which is evaluated in Sec. V.
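The slice-finding configuration used in this case study (a 2-D projection of image embeddings followed by K-Means with n_clusters = 50) can be sketched as below. The embeddings are synthetic, and PCA stands in for the paper's UMAP step (n_neighbors = 5, min_dist = 0.01, n_components = 2) so the sketch runs without the umap-learn dependency; swapping in `umap.UMAP` with those parameters would match the described pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical embeddings: one row per image (e.g., penultimate-layer features).
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(500, 128))

# Project to 2-D. The paper uses UMAP here; PCA is a stand-in for this sketch.
coords = PCA(n_components=2, random_state=42).fit_transform(embeddings)

# Partition the projected data into candidate slices, as with n_clusters = 50.
slice_ids = KMeans(n_clusters=50, random_state=42, n_init=10).fit_predict(coords)
```

Each resulting cluster is a candidate data slice whose per-slice accuracy and attribution pattern can then be inspected.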

Fig. 9.

Examples of findings when ML experts validate a hair color classifier, where experts discover slices corresponding to the model’s core/spurious correlations and wrong labels.

B. Bird Category Classifier Validation - Detecting Bias

  • Overview. To study whether AttributionScanner can help ML experts and practitioners find potential model biases and discrimination, we design this case study with a biased dataset called Waterbirds [4], constructed by cropping birds out of images in the Caltech-UCSD Birds-200-2011 (CUB) dataset [68] and transferring them onto backgrounds from the Places dataset [69]. Each image is labeled as one of {waterbird, landbird}, and its background is one of {water background, land background}. The training set is skewed by placing 95% of waterbirds (landbirds) against a water (land) background and the remaining 5% against a land (water) background. Following the same data splitting as [4], the train, validation, and test sets include 4795, 1199, and 5794 images, respectively. After training and fine-tuning hyper-parameters, the waterbird/landbird classification model achieves 85.74% classification accuracy. For data slice finding, we set n_neighbors = 20, min_dist = 0.05, and n_components = 2 for the UMAP algorithm, and n_clusters = 46 for K-Means.

    In this study, the ML experts are aware that the model is likely biased by backgrounds — using the water (land) background to classify waterbirds (landbirds). However, such a priori knowledge is hard to establish in real-world applications because well-labeled metadata is scarce. Hence, the experts assume this information is unknown and want to validate whether AttributionScanner can make the potential model biases stand out using only the original images and the trained model.

  • Does the model exhibit bias? To answer this key question, the experts start by investigating the underperforming slices. In the Slice Table (Fig. 7 A), they sort slices in ascending order of accuracy and select the worst-performing slice_38. The coordinated information provided by the Attribution Mosaic and the model attribution heatmaps highlights a spurious correlation problem — the model uses water backgrounds to classify birds (Fig. 10). The experts annotate this slice as "spurious", and AttributionScanner automatically propagates the annotation. They verify the propagation's correctness on neighboring slices and annotate slice_41 as "spurious" with "water backgrounds" (Fig. 10). Moreover, the experts identify an underperforming slice_32 that is not clustered with the current ones and is assigned a high Spuriousness probability. They investigate it and verify that the model uses the "Spurious feature: land backgrounds" to predict bird classes. Through the Attribution Mosaic in its confusion matrix form (Fig. 7 D), they find that such spurious correlations result in many false negatives (Fig. 7 D1), where the model uses land backgrounds to mistakenly predict many "waterbirds" as "landbirds".

  • Is the detected bias prevalent across all slices? Why or why not? The ML experts are interested in determining whether the detected bias is pervasive throughout the dataset. By analyzing slices that are distant from the annotated ones in the Attribution Mosaic and are assigned low Spuriousness probabilities, they discover that the farthest neighbors, slice_34 and slice_5, correspond to core features. This suggests that the model correctly captures bird regions in these slices (Fig. 7 B C), which raises a follow-up "why" question. To understand the circumstances in which the model fails, they browse the original images from slice_32 (spurious feature) and slice_34 (core feature), as shown in Fig. 7 D and C. They find that slice_32 has very similar land backgrounds and very different birds, while the birds' appearance in slice_34 is very consistent. This finding explains why the biased model can successfully capture the core features of slice_34 but fails on slice_32 — greater similarities in the representation space indicate stronger features. In other words, the core features in slice_34 are strong enough to support the model's robustness against bias. Such insights are helpful for improving model robustness and have been further studied by ML experts [5].

  • Insights summary. A summary of insights obtained from this case study is provided in Fig. 10, where ML experts validate the existence of model biases and extract slices corresponding to different biases. In the following Sec. V, we evaluate whether AttributionScanner can mitigate model errors by incorporating human feedback.

Fig. 7.

AttributionScanner applied to the validation of a bird category classification model trained on the Waterbirds dataset. The user finds the correct model behavior (bird patterns) in the well-performing Slice 34 and annotates it as "core: birds" B and C. By switching the Attribution Mosaic to the confusion matrix view D and investigating the underperforming slices with accuracy sorting A, the user identifies a problematic Slice 32 with many false negatives D1, which turns out to use the spurious feature of land backgrounds (BG) to predict landbirds.

Fig. 10.

Examples of findings when ML experts validate a bird category classifier, where experts identify problematic slices where the model uses backgrounds (BG) to classify birds.

V. Evaluation

A. Quantitative Evaluation

The quantitative evaluation involves four metrics introduced in [2]: clean accuracy, core accuracy, spurious accuracy, and relative core sensitivity (RCS). We revisit their definitions as follows:

  • Clean Accuracy. The original model accuracy, where larger values are indicative of better overall accuracy.

  • Core Accuracy acc(C). The model accuracy when spurious regions are corrupted with Gaussian noise; larger values indicate greater reliance on core regions.

  • Spurious Accuracy acc(S). The model accuracy when core regions are corrupted with Gaussian noise; larger values indicate greater reliance on spurious regions.

  • RCS. This metric quantifies the model's reliance on core features while controlling for general noise robustness. It is defined as the ratio of the gap between core and spurious accuracy to the total possible gap for any model: RCS = (acc(C) − acc(S)) / (2 × min(α, 1 − α)), where α = (acc(C) + acc(S)) / 2. RCS ranges from 0 to 1, with higher values indicating better model performance.
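The RCS definition above can be computed directly; the function name and example accuracies below are illustrative, not from the paper.

```python
def relative_core_sensitivity(core_acc, spur_acc):
    """RCS = (acc(C) - acc(S)) / (2 * min(alpha, 1 - alpha)),
    with alpha = (acc(C) + acc(S)) / 2; accuracies given as fractions in [0, 1]."""
    alpha = (core_acc + spur_acc) / 2.0
    denom = 2.0 * min(alpha, 1.0 - alpha)
    return (core_acc - spur_acc) / denom if denom > 0 else 0.0

# Illustrative values: core accuracy 98%, spurious accuracy 96%.
rcs = relative_core_sensitivity(0.98, 0.96)
```

Note the denominator is the largest gap any model with mean accuracy α could exhibit, so RCS rewards a large core-vs-spurious gap relative to what is achievable.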

In the two case studies discussed in Sec. IV, ML experts annotated five spurious slices in the CelebA dataset, {slice_25, slice_14, slice_2, slice_44, slice_26}, and six spurious slices in the Waterbirds dataset, {slice_38, slice_41, slice_32, slice_29, slice_4, slice_10}, respectively. For each case study, AttributionScanner automatically ran the Label Propagation algorithm and exported both the users’ annotation records and the propagated Spuriousness for further investigation.
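The paper does not detail its Label Propagation implementation; the sketch below illustrates the idea with scikit-learn's LabelPropagation over hypothetical per-slice representations, where a few expert annotations seed the propagated Spuriousness probabilities. All names and values here are assumptions for illustration.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Hypothetical slice representations (e.g., the mean embedding of each slice),
# drawn as two well-separated clusters of 25 slices each.
rng = np.random.default_rng(0)
slice_repr = np.vstack([
    rng.normal(0.0, 0.3, (25, 16)),   # core-like slices
    rng.normal(3.0, 0.3, (25, 16)),   # spurious-like slices
])

# -1 marks unannotated slices; two expert annotations seed the propagation.
labels = np.full(50, -1)
labels[0] = 0    # annotated "core"
labels[30] = 1   # annotated "spurious"

model = LabelPropagation(kernel="rbf", gamma=0.5).fit(slice_repr, labels)
spuriousness = model.predict_proba(slice_repr)[:, 1]   # propagated Spuriousness probability
```

Slices near annotated "spurious" ones in the representation space receive high Spuriousness, which is how a handful of annotations can cover the whole slice set.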

To thoroughly evaluate our introduced method, we validate three models for each case. The models marked as "baseline" are the original trained models obtained at the beginning of each case study. The models marked as "AttributionScanner" are re-trained using the CoRM method after adding noise to "spurious" slices according to the results exported from AttributionScanner. In particular, "Annotation" indicates that only user-annotated spurious slices were corrupted with noise, while "Propagation" indicates that the propagated Spuriousness is used to identify the spurious slices to be corrupted.

Table II and Table III present the quantitative evaluation results for the two case studies, respectively. AttributionScanner significantly improves the vision model's overall performance while reducing spurious correlations. Furthermore, our Label Propagation largely reduces human effort by automating the annotation process and achieves the best performance in this quantitative evaluation. Overall, these results demonstrate that AttributionScanner is an effective approach for mitigating spurious correlations in machine learning models, and the Label Propagation algorithm is a valuable tool for automating the annotation process.

TABLE II.

Quantitative evaluation of the performance of the hair color classification models trained on the original and AttributionScanner (AS)-improved CelebA dataset.

Training Procedure Clean Acc (↑) Core Acc (↑) Spurious Acc (↓) RCS (↑)
Baseline 98.02688 97.21170 97.61917 0.067720
AS (Annotation) 98.02688 98.02185 96.92958 0.216351
AS (Propagation) 98.08225 98.16278 96.01852 0.368512

TABLE III.

Quantitative evaluation of the performance of the bird category classification models trained on the original and AttributionScanner (AS)-improved Waterbirds dataset.

Training Procedure Clean Acc (↑) Core Acc (↑) Spurious Acc (↓) RCS (↑)
Baseline 85.73812 82.98582 82.82901 0.004878
AS (Annotation) 89.57465 86.40534 79.81651 0.195062
AS (Propagation) 90.40867 87.40617 77.89825 0.274038

B. Qualitative Evaluation

Results in Sec. V-A highlight the best performance of AttributionScanner equipped with Label Propagation in mitigating spurious correlations. For further evaluation, we visually compare the models' attributions via GradCAM in Fig. 11. As discussed in Sec. IV-A, the hair color classifier originally has dominant spurious correlations in {slice_44, slice_25, slice_14}, where the model uses image backgrounds, mouth, or eyes, respectively, to predict hair color. With the help of AttributionScanner, the model's misattribution is successfully fixed, directing its focus towards the correct hair regions (Fig. 11(a)). As for the bird category classification model, it originally focuses on spurious features — water/land backgrounds — to decide whether an input image contains waterbirds or landbirds {slice_38, slice_41, slice_32} (see Sec. IV-B). In Fig. 11(b), we observe that AttributionScanner mitigates these issues by helping the model focus on the core bird features.
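The GradCAM heatmaps compared here weight a convolutional layer's feature maps by gradient-derived channel importances [25]. The minimal sketch below shows only that combination step on toy arrays; it assumes the activations and gradients have already been extracted from a network, which is not shown.

```python
import numpy as np

def gradcam_map(activations, gradients):
    """Combine conv feature maps into a GradCAM heatmap.

    activations: (K, H, W) feature maps of the chosen conv layer.
    gradients:   (K, H, W) gradients of the target class score
                 w.r.t. those feature maps.
    """
    weights = gradients.mean(axis=(1, 2))            # global-average-pool the gradients
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0.0)                        # ReLU keeps positive evidence only
    return cam / cam.max() if cam.max() > 0 else cam  # normalize to [0, 1]

# Toy example: channel 0 fires at the top-left corner and has positive gradient.
acts = np.zeros((2, 4, 4)); acts[0, 0, 0] = 1.0
grads = np.zeros((2, 4, 4)); grads[0] = 1.0
heat = gradcam_map(acts, grads)   # heatmap peaks at the top-left corner
```

In practice the heatmap is upsampled to the input resolution and overlaid on the image, as in Fig. 11.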

Fig. 11.

Qualitative evaluation with GradCAM attribution heatmaps. We visually compare the original and the AttributionScanner-improved models for hair color and bird category classification, respectively. In each sub-figure, the first row refers to the original model (baseline), and the second row refers to the improved model (ours). We can observe that AttributionScanner suppresses the spurious correlations by making the model use the correct features for predictions.

C. Expert Feedback

We conduct two case studies (see Sec. IV-A and Sec. IV-B) with ten ML experts to evaluate AttributionScanner. None of them are co-authors of this paper, and they had not previously seen AttributionScanner. Five are male and five are female. Six of the experts have over five years of experience in ML, two have between three and five years, and two have between one and three years. All experts regularly conduct model validation in their research and have experience working with vision models. Our findings are drawn from their comments during the study, following the "think-aloud" protocol, with additional feedback gathered through an exit discussion and questionnaire.

Summary of Likert-Type Questions.

The exit questionnaire included nine Likert-type questions on a seven-point scale, aiming to evaluate AttributionScanner from various aspects. The feedback, as shown in Fig. 12, reflects a positive overall impression. Notably, in the usability evaluation, five experts strongly agreed and three agreed that our system is user-friendly and understandable. In terms of effectiveness, eight experts strongly agreed that our system aids in validating models. All experts expressed strong confidence in the insights provided. Considering that Feature Inversion (FI) results are synthesized images that might initially be hard to grasp, we included questions to assess this design aspect. Seven experts strongly agreed, and two agreed that FI is helpful for understanding slice patterns. While two experts expressed neutral or somewhat disagreeing opinions about understanding slice contents based solely on FI, eight strongly agreed that AttributionScanner’s coordination of FI and other views supports a comprehensive understanding of slice contents. Lastly, all experts expressed interest in using our system in the future and would recommend it as a valuable tool for model validation.

Fig. 12.

Expert perception of the system, according to eight Likert-Type Questions using a 7-point scale. (AS = AttributionScanner.)

Interpretable Model Validation.

All experts acknowledged that AttributionScanner "effectively helps ML model validation". They remarked that the complementary visual summaries and attribution heatmaps helped them "easily understand slice issues". "It's often hard to figure out what's going wrong with a model", one expert said. "This system guides me to find where I should look and explains the issues intuitively." Another expert noted, "The ability to browse what's happening in different slices at once greatly helped me understand the model." They commented that they are confident about their insights because our system "highlights slice errors and provides supportive evidence". Three of them are eager to see what issues AttributionScanner could uncover in the models they currently use.

Attribution Mosaic.

All participants showed particular interest in the Attribution Mosaic and asked in the interview how it was designed. Seven experts were impressed by its ability to summarize model attributions across data subgroups. Two experts noted that it "highlights model issues intuitively." Another described it as "impressive and novel," emphasizing that it "highlights and explains the model's failures." Another expert commented, "This view provides a new angle for me to identify and understand the data/model issues." All experts quickly adapted to the Spuriousness propagation and used the Spuriousness matrix to speed up their annotation. Specifically, five experts found the Spuriousness propagation "surprisingly helpful," and six remarked that it "really saved my effort".

VI. Conclusion

In this work, we present AttributionScanner, a novel human-in-the-loop model validation system empowered by metadata-free data slice finding and XAI techniques, which can support effective detection, explanation, and mitigation of potential errors and biases in vision models. AttributionScanner provides explainable data slices that reveal critical model problems, such as spurious correlations and mislabeled data. The effectiveness of this work is validated by the performance improvement and bias mitigation through both quantitative and qualitative evaluations. We hope this work can inspire future research in human-AI teaming to improve AI trustworthiness and accountability.

Supplementary Material

supp1-3546644

Acknowledgments

This research is supported in part by the National Science Foundation via grant No. IIS-2427770, the National Institute of Health under grant no. P41 EB032840, and the University of California (UC) Multicampus Research Programs and Initiatives (MRPI) grant and the UC Climate Action Initiative grant.

Biographies

Xiwei Xuan is a Ph.D. candidate in computer science at the University of California, Davis. Before UC Davis, she received her M.S. degree in electrical engineering in 2020 from Washington University in St. Louis. Her research spans computer vision, machine learning, and visual analytics, aiming to address the reliability, efficiency, and transparency of machine learning. She is particularly interested in improving data quality and the alignment between human and machine intelligence.

Jorge Piazentin Ono is a Senior Research Scientist at Bosch. He earned his PhD in Computer Science from New York University and a Master’s degree from the University of Sao Paulo. His research interests include Human-Computer Interaction, Visual Analytics, Explainable AI, and Model Validation. Specifically, his work focuses on leveraging human domain knowledge and insights to improve data quality and model performance.

Liang Gou is the Director of AI at Splunk, a part of Cisco. Previously, he served as a Senior Principal Research Scientist at Bosch, a Principal Research Scientist at Visa Research, and a Research Staff Member at IBM’s Almaden Research Center. Liang holds a Ph.D. in Information Science from Penn State University. His research interests focus on large language models (LLMs), agentic systems, and visual analytics.

Kwan-Liu Ma is a distinguished professor of computer science at the University of California, Davis. His research is in the intersection of data visualization, computer graphics, human-computer interaction, and high performance computing. For his significant research accomplishments, Ma received several recognitions including being elected as IEEE Fellow in 2012 and ACM Fellow in 2024, recipient of the IEEE VGTC Visualization Technical Achievement Award in 2013, and inducted to IEEE Visualization Academy in 2019.

Liu Ren is currently the Vice President and Chief Scientist of Salable and Assistive AI at Bosch Research North America and Bosch Center for AI (BCAI). He received his Ph.D. in Computer Science from the Computer Science Department at Carnegie Mellon University. His research focuses on AI, computer vision, and visual analytics, among other areas. He has been honored with multiple best paper honorable mention awards (2016,2022) and best paper awards (2018,2020) at IEEE Visualization conferences for his contributions to visual analytics and XAI.

References

  • [1] Simon HA, “Spurious correlation: A causal interpretation,” Journal of the American Statistical Association, vol. 49, no. 267, pp. 467–479, 1954.
  • [2] Singla S, Moayeri M, and Feizi S, “Core risk minimization using salient imagenet,” arXiv, 2022.
  • [3] Madaio MA, Stark L, Wortman Vaughan J, and Wallach H, “Co-designing checklists to understand organizational challenges and opportunities around fairness in AI,” in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 2020, pp. 1–14.
  • [4] Sagawa S, Koh PW, Hashimoto TB, and Liang P, “Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization,” arXiv preprint arXiv:1911.08731, 2019.
  • [5] Zhang M, Sohoni NS, Zhang HR, Finn C, and Ré C, “Correct-n-contrast: A contrastive approach for improving robustness to spurious correlations,” arXiv preprint arXiv:2203.01517, 2022.
  • [6] Liu EZ, Haghgoo B, Chen AS, Raghunathan A, Koh PW, Sagawa S, Liang P, and Finn C, “Just train twice: Improving group robustness without training group information,” in International Conference on Machine Learning. PMLR, 2021, pp. 6781–6792.
  • [7] Xuan X, Deng Z, Lin H-T, and Ma K-L, “SLIM: Spuriousness mitigation with minimal human annotations,” in Proceedings of the European Conference on Computer Vision (ECCV), September 2024.
  • [8] Kirichenko P, Izmailov P, and Wilson AG, “Last layer re-training is sufficient for robustness to spurious correlations,” in ICML 2022: Workshop on Spurious Correlations, Invariance and Stability.
  • [9] Pastor E, de Alfaro L, and Baralis E, “Looking for trouble: Analyzing classifier behavior via pattern divergence,” in Proceedings of the 2021 International Conference on Management of Data, 2021, pp. 1400–1412.
  • [10] Zhang X, Ono JP, Song H, Gou L, Ma K-L, and Ren L, “SliceTeller: A data slice-driven approach for machine learning model validation,” IEEE Transactions on Visualization and Computer Graphics, vol. 29, no. 1, pp. 842–852, 2022.
  • [11] Foster JC, Taylor JM, and Ruberg SJ, “Subgroup identification from randomized clinical trial data,” Statistics in Medicine, vol. 30, no. 24, pp. 2867–2880, 2011.
  • [12] Zhang M, Zhang Y, Zhang L, Liu C, and Khurshid S, “DeepRoad: GAN-based metamorphic testing and input validation framework for autonomous driving systems,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 132–142.
  • [13] Farchi E, Narayanam R, and Nagalapatti L, “Ranking data slices for ML model validation: A Shapley value approach,” in 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, 2021, pp. 1937–1942.
  • [14] Eyuboglu S, Varma M, Saab KK, Delbrouck J-B, Lee-Messer C, Dunnmon J, Zou J, and Re C, “Domino: Discovering systematic errors with cross-modal embeddings,” in International Conference on Learning Representations, 2022.
  • [15] Gunning D, “Explainable artificial intelligence (XAI),” Defense Advanced Research Projects Agency (DARPA), nd Web, vol. 2, no. 2, p. 1, 2017.
  • [16] Rieger L, Singh C, Murdoch W, and Yu B, “Interpretations are useful: Penalizing explanations to align neural networks with prior knowledge,” in International Conference on Machine Learning. PMLR, 2020, pp. 8116–8126.
  • [17] Chung Y, Kraska T, Polyzotis N, Tae KH, and Whang SE, “Slice Finder: Automated data slicing for model validation,” in 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 2019, pp. 1550–1553.
  • [18] Sagadeeva S and Boehm M, “SliceLine: Fast, linear-algebra-based slice finding for ML model debugging,” in Proceedings of the 2021 International Conference on Management of Data, 2021, pp. 2290–2299.
  • [19] Pastor E, Gavgavian A, Baralis E, and de Alfaro L, “How divergent is your data?” Proceedings of the VLDB Endowment, vol. 14, no. 12, pp. 2835–2838, 2021.
  • [20] Xuan X, Wang X, He W, Ono JP, Gou L, Ma K-L, and Ren L, “VISTA: A visual analytics framework to enhance foundation model-generated data labels,” IEEE Transactions on Visualization and Computer Graphics, 2025.
  • [21] Sohoni N, Dunnmon J, Angus G, Gu A, and Ré C, “No subclass left behind: Fine-grained robustness in coarse-grained classification problems,” Advances in Neural Information Processing Systems, vol. 33, pp. 19339–19352, 2020.
  • [22] d’Eon G, d’Eon J, Wright JR, and Leyton-Brown K, “The Spotlight: A general method for discovering systematic errors in deep learning models,” in 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, pp. 1962–1981.
  • [23] Ribeiro MT, Singh S, and Guestrin C, “‘Why should I trust you?’: Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
  • [24] Zhou B, Khosla A, Lapedriza A, Oliva A, and Torralba A, “Learning deep features for discriminative localization,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2921–2929.
  • [25] Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, and Batra D, “Grad-CAM: Visual explanations from deep networks via gradient-based localization,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626.
  • [26] Petsiuk V, Das A, and Saenko K, “RISE: Randomized input sampling for explanation of black-box models,” arXiv preprint arXiv:1806.07421, 2018.
  • [27] Barash G, Farchi E, Jayaraman I, Raz O, Tzoref-Brill R, and Zalmanovici M, “Bridging the gap between ML solutions and their business requirements using feature interactions,” in Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019, pp. 1048–1058.
  • [28] Krishnakumar A, Prabhu V, Sudhakar S, and Hoffman J, “UDIS: Unsupervised discovery of bias in deep visual recognition models,” in British Machine Vision Conference (BMVC), vol. 1, no. 2, 2021, p. 3.
  • [29] Lee S, Wang ZJ, Hoffman J, and Chau DHP, “VisCUIT: Visual auditor for bias in CNN image classifier,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 21475–21483.
  • [30] Xuan X, Zhang X, Kwon O-H, and Ma K-L, “VAC-CNN: A visual analytics system for comparative studies of deep convolutional neural networks,” IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 6, pp. 2326–2337, 2022.
  • [31] Lai S, Luan W, and Tao J, “Explore your network in minutes: A rapid prototyping toolkit for understanding neural networks with visual analytics,” IEEE Transactions on Visualization and Computer Graphics, 2023.
  • [32] Wang J, Liu S, and Zhang W, “Visual analytics for machine learning: A data perspective survey,” IEEE Transactions on Visualization and Computer Graphics, 2024.
  • [33] Prasad V, van Sloun RJ, van den Elzen S, Vilanova A, and Pezzotti N, “The transform-and-perform framework: Explainable deep learning beyond classification,” IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 2, pp. 1502–1515, 2022.
  • [34] Huang J, Mishra A, Kwon B-C, and Bryan C, “ConceptExplainer: Understanding the mental model of deep learning algorithms via interactive concept-based explanations,” 2022.
  • [35] Chen C, Yuan J, Lu Y, Liu Y, Su H, Yuan S, and Liu S, “OoD-Analyzer: Interactive analysis of out-of-distribution samples,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 7, pp. 3335–3349, 2020.
  • [36] Li Z, Wang X, Yang W, Wu J, Zhang Z, Liu Z, Sun M, Zhang H, and Liu S, “A unified understanding of deep NLP models for text classification,” IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 12, pp. 4980–4994, 2022.
  • [37] Robertson S, Wang ZJ, Moritz D, Kery MB, and Hohman F, “Angler: Helping machine translation practitioners prioritize model improvements,” in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–20.
  • [38] Chen C, Guo Y, Tian F, Liu S, Yang W, Wang Z, Wu J, Su H, Pfister H, and Liu S, “A unified interactive model evaluation for classification, object detection, and instance segmentation in computer vision,” IEEE Transactions on Visualization and Computer Graphics, 2023.
  • [39] Zhang X, Piazentin Ono JH, He W, Gou L, Sachan M, Ma K-L, and Ren L, “Slicing, chatting, and refining: A concept-based approach for machine learning model validation with ConceptSlicer,” in Proceedings of the 29th International Conference on Intelligent User Interfaces, 2024, pp. 274–287.
  • [40] Yang W, Liu M, Wang Z, and Liu S, “Foundation models meet visualizations: Challenges and opportunities,” Computational Visual Media, pp. 1–26, 2024.
  • [41] Jin Z, Wang X, Cheng F, Sun C, Liu Q, and Qu H, “ShortcutLens: A visual analytics approach for exploring shortcuts in natural language understanding dataset,” IEEE Transactions on Visualization and Computer Graphics, 2023.
  • [42] Zhang X, Xuan X, Dima A, Sexton T, and Ma K-L, “LabelVizier: Interactive validation and relabeling for technical text annotations,” in 2023 IEEE 16th Pacific Visualization Symposium (PacificVis). IEEE, 2023, pp. 167–176.
  • [43] Kahng M, Fang D, and Chau DH, “Visual exploration of machine learning results using data cube analysis,” in Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2016, pp. 1–6.
  • [44] Cabrera ÁA, Epperson W, Hohman F, Kahng M, Morgenstern J, and Chau DH, “FairVis: Visual analytics for discovering intersectional bias in machine learning,” in Proceedings of the 2019 IEEE Conference on Visual Analytics Science and Technology (VAST). IEEE, 2019, pp. 46–56.
  • [45] Wu T, Ribeiro MT, Heer J, and Weld D, “Errudite: Scalable, reproducible, and testable error analysis,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019.
  • [46] Xuan X, Deng Z, Lin H-T, Kong Z, and Ma K-L, “SUNY: A visual interpretation framework for convolutional neural networks from a necessary and sufficient perspective,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2024, pp. 8371–8376.
  • [47].Erhan D, Bengio Y, Courville A, and Vincent P, “Visualizing higher-layer features of a deep network,” University of Montreal, vol. 1341, no. 3, p. 1, 2009. [Google Scholar]
  • [48].Olah C, Mordvintsev A, and Schubert L, “Feature visualization,” Distill, 2017, https://distill.pub/2017/feature-visualization. [Google Scholar]
  • [49].Mahendran A and Vedaldi A, “Understanding deep image representations by inverting them,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 5188–5196. [Google Scholar]
  • [50].He K, Zhang X, Ren S, and Sun J, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778. [Google Scholar]
  • [51].Simonyan K and Zisserman A, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014. [Google Scholar]
  • [52].Sikdar S, Bhattacharya P, and Heese K, “Integrated directional gradients: Feature interaction attribution for neural nlp models,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021, pp. 865–878. [Google Scholar]
  • [53].Janizek JD, Sturmfels P, and Lee S-I, “Explaining explanations: Axiomatic feature interactions for deep networks,” Journal of Machine Learning Research, vol. 22, no. 104, pp. 1–54, 2021. [Google Scholar]
  • [54].Adebayo J, Gilmer J, Muelly M, Goodfellow I, Hardt M, and Kim B, “Sanity checks for saliency maps,” Advances in neural information processing systems, vol. 31, 2018. [Google Scholar]
  • [55].Mao C, Xia K, Wang J, Wang H, Yang J, Bareinboim E, and Vondrick C, “Causal transportability for visual recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 7521–7531. [Google Scholar]
  • [56].Zhuang J, Cai J, Wang R, Zhang J, and Zheng W, “Care: Class attention to regions of lesion for classification on imbalanced data,” in International Conference on Medical Imaging with Deep Learning. PMLR, 2019, pp. 588–597. [Google Scholar]
  • [57].Yang Y, Nushi B, Palangi H, and Mirzasoleiman B, “Mitigating spurious correlations in multi-modal models during fine-tuning,” arXiv preprint arXiv:2304.03916, 2023. [Google Scholar]
  • [58].Wu S, Yuksekgonul M, Zhang L, and Zou J, “Discover and cure: Concept-aware mitigation of spurious correlation,” arXiv preprint arXiv:2305.00650, 2023. [Google Scholar]
  • [59].Lee S, Payani A, and Chau DHP, “Towards mitigating spurious correlations in image classifiers with simple yes-no feedback.” [Google Scholar]
  • [60].McInnes L, Healy J, and Melville J, “Umap: Uniform manifold approximation and projection for dimension reduction,” arXiv preprint arXiv:1802.03426, 2018. [Google Scholar]
  • [61].Pal K and Sharma M, “Performance evaluation of non-linear techniques umap and t-sne for data in higher dimensional topological space,” in 2020 fourth international conference on I-SMAC (IoT in social, mobile, analytics and cloud)(I-SMAC). IEEE, 2020, pp. 1106–1110. [Google Scholar]
  • [62].Barber CB, Dobkin DP, and Huhdanpaa H, “The quickhull algorithm for convex hulls,” ACM Transactions on Mathematical Software (TOMS), vol. 22, no. 4, pp. 469–483, 1996. [Google Scholar]
  • [63].Carter S, Armstrong Z, Schubert L, Johnson I, and Olah C, “Activation atlas,” Distill, 2019, https://distill.pub/2019/activation-atlas. [Google Scholar]
  • [64].Nguyen A, Yosinski J, and Clune J, “Multifaceted feature visualization: Uncovering the different types of features learned by each neuron in deep neural networks,” in Proceedings of the International Conference on Machine Learning (ICML), 2016. [Google Scholar]
  • [65].Fel T, Boissin T, Boutin V, Picard A, Novello P, Colin J, Linsley D, Rousseau T, Cadène R, and Laurent Gardes TS, “Unlocking feature visualization for deeper networks with magnitude constrained optimization,” Advances in Neural Information Processing Systems (NeurIPS), 2023. [Google Scholar]
  • [66].Zhou D, Bousquet O, Lal T, Weston J, and Schölkopf B, “Learning with local and global consistency,” Advances in neural information processing systems, vol. 16, 2003. [Google Scholar]
  • [67].Liu Z, Luo P, Wang X, and Tang X, “Deep learning face attributes in the wild,” in Proceedings of International Conference on Computer Vision (ICCV), December 2015. [Google Scholar]
  • [68].Wah C, Branson S, Welinder P, Perona P, and Belongie S, “The caltech-ucsd birds-200–2011 dataset,” 2011. [Google Scholar]
  • [69].Zhou B, Lapedriza A, Khosla A, Oliva A, and Torralba A, “Places: A 10 million image database for scene recognition,” IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 6, pp. 1452–1464, 2017. [DOI] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

supp1-3546644