Assessing reliability of explanations in unbalanced datasets: a use-case on the occurrence of frost events

Vascotto, Blasone, Rodriguez et al.

The usage of eXplainable Artificial Intelligence (XAI) methods has become essential in practical applications, given the increasing deployment of Artificial Intelligence (AI) models and the legislative requirements put forward in the latest years. A fundamental but often underestimated aspect of the explanations is their robustness, a key property that should be satisfied in order to trust the explanations. In this study, we provide some preliminary insights on evaluating the reliability of explanations in the specific case of unbalanced datasets, which are very frequent in high-risk use-cases, but at the same time considerably challenging for both AI models and XAI methods. We propose a simple evaluation focused on the minority class (i.e. the less frequent one) that leverages on-manifold generation of neighbours, explanation aggregation and a metric to test explanation consistency. We present a use-case based on a tabular dataset with numerical features focusing on the occurrence of frost events.

Assessing Reliability of Explanations in Unbalanced Datasets: A Use-Case on the Occurrence of Frost Events

Basic Information

Paper ID: 2507.09545
Title: Assessing reliability of explanations in unbalanced datasets: a use-case on the occurrence of frost events
Authors: Ilaria Vascotto, Valentina Blasone, Alex Rodriguez, Alessandro Bonaita, Luca Bortolussi
Classification: cs.LG (Machine Learning)
Publication Venue: Late-breaking work, 3rd World Conference on eXplainable Artificial Intelligence (July 09–11, 2025, Istanbul, Turkey)
Paper Link: https://arxiv.org/abs/2507.09545

Abstract

The use of explainable artificial intelligence (XAI) methods has become essential in practical applications, driven by the increasing deployment of AI models and recent legislative requirements. A fundamental yet often underestimated aspect of explanations is their robustness, a key property that must hold for explanations to be trusted. This study provides preliminary insights into evaluating the reliability of explanations in the specific case of imbalanced datasets, which are very frequent in high-risk use cases but considerably challenging for both AI models and XAI methods. We propose a simple evaluation focused on the minority class (i.e., the less frequent one) that leverages on-manifold generation of neighbors, explanation aggregation, and a metric to test explanation consistency. We present a use case based on a tabular dataset with numerical features, focusing on the occurrence of frost events.

Research Background and Motivation

Problem Definition

The core problem this study addresses is how to assess the reliability of XAI explanations on imbalanced datasets. Specifically, when minority class samples are extremely scarce, traditional explanation methods may produce unreliable results for exactly those samples.

Importance Analysis

Legislative Requirements: Regulations such as GDPR and the AI Act impose transparency requirements on high-risk applications
Practical Needs: High-risk domains such as healthcare, climate science, and fraud detection frequently face imbalanced data problems
Trust Crisis: On imbalanced datasets, even if a model achieves 99% accuracy, it may simply be predicting the majority class

Limitations of Existing Methods

Poor Robustness: LIME and SHAP exhibit poor robustness on imbalanced datasets
Lack of Targeted Assessment: Existing methods primarily focus on overall performance while neglecting the special characteristics of the minority class
Unstable Explanations: Similar inputs may produce drastically different explanations

Research Motivation

The authors argue that assessing explanation reliability for the minority class in imbalanced datasets is particularly important because:

Accurately predicting rare events is crucial in high-risk applications
The majority class is easy to predict, and its explanations are not necessarily trustworthy
Specialized methods are needed to assess the robustness of minority class explanations

Core Contributions

Proposed an explanation reliability assessment framework for imbalanced datasets, focusing on minority class samples
Designed a manifold-based neighbor generation method, ensuring that perturbed samples lie on the data manifold
Introduced consistency metrics, assessing reliability by comparing original explanations with locally weighted averaged explanations
Validated the method's effectiveness on a real frost prediction task, which exhibits high imbalance (99:1)

Methodology Details

Task Definition

Given an imbalanced dataset $\mathcal{D} = (X,y)$ where $P(y=0) \gg P(y=1)$ (0 is the majority class, 1 is the minority class), train a neural network $f(\cdot)$ with the goal of assessing the reliability of explanation method $e$ on minority class samples.

Model Architecture

1. Neighborhood Generation

Employs manifold-based neighbor generation using k-medoids clustering:

Steps:

Apply k-medoids clustering to the validation set to obtain $k_{medoids}$ clusters
Average cluster size $n_k = 10$
Extract the medoid of each cluster as a representative point
For test samples, find the corresponding medoid and its $k_{nn}=5$ nearest neighbors

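Perturbation Formula: $\tilde{x}_j = (1-\bar{\lambda}) \cdot x_j + \bar{\lambda} \cdot x_{M_j}$ where $\bar{\lambda} \sim \mathrm{Beta}(\lambda \cdot 100, (1-\lambda) \cdot 100)$. Because this Beta distribution has mean $\lambda$ and $\bar{\lambda} \in [0,1]$, each perturbed value is a convex combination that moves, on average, a fraction $\lambda$ of the way from $x_j$ toward the medoid value $x_{M_j}$.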
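The steps and the perturbation formula above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: it assumes the medoids have already been computed by k-medoids, that the candidate pool is the closest medoid plus its $k_{nn}$ nearest medoids, and that a medoid and a value of $\bar{\lambda}$ are drawn independently for every feature $j$. The function names `candidate_medoids` and `generate_neighbors` are hypothetical.

```python
import math
import random


def candidate_medoids(x, medoids, k_nn=5):
    """Return the medoid closest to x followed by that medoid's k_nn nearest medoids."""
    closest = min(medoids, key=lambda m: math.dist(x, m))
    others = sorted(
        (m for m in medoids if m is not closest),
        key=lambda m: math.dist(closest, m),
    )
    return [closest] + others[:k_nn]


def generate_neighbors(x, medoids, k_nn=5, lam=0.05, n=100, seed=0):
    """Generate n neighbors of x by pulling each feature toward a nearby medoid.

    Assumption: a medoid M_j and a mixing weight lambda_bar are sampled
    independently per feature; the summary does not spell this out.
    """
    rng = random.Random(seed)
    pool = candidate_medoids(x, medoids, k_nn)
    neighbors = []
    for _ in range(n):
        neighbor = []
        for j, x_j in enumerate(x):
            medoid = rng.choice(pool)
            # Beta(100*lam, 100*(1-lam)) has mean lam, so the shift is small on average.
            lam_bar = rng.betavariate(lam * 100, (1 - lam) * 100)
            neighbor.append((1 - lam_bar) * x_j + lam_bar * medoid[j])
        neighbors.append(neighbor)
    return neighbors


if __name__ == "__main__":
    toy_medoids = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [5.0, 5.0]]
    sample = [0.2, 0.1]
    for point in generate_neighbors(sample, toy_medoids, k_nn=2, n=3):
        print(point)
```

Because every generated value is a convex combination of the sample and an observed medoid, the neighbors cannot leave the range spanned by real data points, which is the intuition behind calling the generation on-manifold.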

2. Local Averaging

Compute weighted average explanations for minority class samples: $\bar{e}(x) = \frac{\sum_{\tilde{x} \in \mathcal{N}} e(\tilde{x}) \cdot \pi(x,\tilde{x})}{\sum_{\tilde{x} \in \mathcal{N}} \pi(x,\tilde{x})}$ where the weight $\pi(x,\tilde{x}) = \frac{1}{dist(x,\tilde{x})}$
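The weighted average above can be written as a short function. The sketch below is illustrative only: `explain` stands for any attribution function that returns a list of feature scores, Euclidean distance is assumed for $dist$, and the small constant `eps` is an added safeguard against division by zero that is not part of the formula.

```python
import math


def local_average(x, neighbors, explain, eps=1e-12):
    """Distance-weighted average of the explanations of the neighbors of x.

    explain: callable mapping a sample to a list of attribution scores
             (hypothetical stand-in for an XAI method).
    eps:     assumption, avoids dividing by zero when a neighbor equals x.
    """
    if not neighbors:
        raise ValueError("neighborhood must not be empty")
    weighted_sum = None
    weight_total = 0.0
    for x_tilde in neighbors:
        weight = 1.0 / (math.dist(x, x_tilde) + eps)  # pi(x, x_tilde)
        scores = explain(x_tilde)
        if weighted_sum is None:
            weighted_sum = [0.0] * len(scores)
        for i, score in enumerate(scores):
            weighted_sum[i] += weight * score
        weight_total += weight
    return [value / weight_total for value in weighted_sum]


if __name__ == "__main__":
    # Toy attribution function used only for demonstration.
    toy_explain = lambda z: [2 * z[0], -z[1]]
    print(local_average([0.0, 0.0], [[1.0, 0.0], [0.0, 2.0]], toy_explain))
```

Closer neighbors receive larger weights, so the averaged explanation is dominated by the points that most resemble the original sample.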

3. Reliability Assessment

Define two evaluation metrics:

Local Robustness: $\hat{\mathcal{R}}(x) = \frac{1}{|\mathcal{N}|} \sum_{\tilde{x} \in \mathcal{N}} \rho(e(x), e(\tilde{x}))$

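Consistency: $\hat{\mathcal{C}}(x) = \rho(e(x), \bar{e}(x))$

In both metrics, $\rho$ is the Spearman rank correlation coefficient, $\mathcal{N}$ is the generated neighborhood of $x$, and $\bar{e}(x)$ is the locally averaged explanation.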
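Both metrics reduce to Spearman rank correlations between attribution vectors, which the following standard-library sketch makes concrete. It is an illustration under stated assumptions: ties receive average ranks, a constant vector yields `nan`, and the function names are hypothetical rather than taken from the paper.

```python
import math


def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(ra, rb))
    var_a = sum((p - mean_a) ** 2 for p in ra)
    var_b = sum((q - mean_b) ** 2 for q in rb)
    if var_a == 0 or var_b == 0:
        return float("nan")  # correlation is undefined for constant vectors
    return cov / math.sqrt(var_a * var_b)


def local_robustness(e_x, neighbor_explanations):
    """Mean correlation between e(x) and each neighbor's explanation."""
    scores = [spearman(e_x, e_n) for e_n in neighbor_explanations]
    return sum(scores) / len(scores)


def consistency(e_x, e_bar):
    """Correlation between e(x) and the locally averaged explanation."""
    return spearman(e_x, e_bar)


if __name__ == "__main__":
    e_x = [0.5, 0.1, 0.3]
    print(local_robustness(e_x, [[0.6, 0.2, 0.3], [0.1, 0.4, 0.2]]))
    print(consistency(e_x, [0.45, 0.15, 0.25]))
```

Because only ranks matter, both metrics measure whether the ordering of feature importances is preserved, not whether the attribution magnitudes match.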

Technical Innovations

Manifold-aware neighbor generation: Compared to random Gaussian noise, the medoid-based approach generates neighbors that better conform to the data distribution
Specialized assessment for minority class: Focuses on the most critical yet most fragile minority class samples
Introduction of consistency metrics: Assesses local consistency by comparing original explanations with aggregated explanations
Distance-weighted explanation aggregation: Averages neighbor explanations with weights inversely proportional to each neighbor's distance from the original sample

Experimental Setup

Dataset

Frost Prediction Dataset:

Source: ERA5 reanalysis data (ECMWF) + proprietary insurance company data
Time Span: 2009-2024 (15 years)
Geographic Coverage: Entire territory of Poland
Features: 8 numerical atmospheric variables (standardized)
Target: Binary classification (frost occurrence or not)
Imbalance Ratio: 99% vs 1% (highly imbalanced)
Data Split: Training set 75%, validation set 15%, test set 10% (stratified by region)

Evaluation Metrics

Model Performance: F1-score (suitable for imbalanced datasets)
Explanation Reliability: Local robustness $\hat{\mathcal{R}}(x)$ and consistency $\hat{\mathcal{C}}(x)$
Correlation Measure: Spearman rank correlation coefficient
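A small worked example shows why per-class F1 is preferred over accuracy at a 99:1 ratio. The data below is synthetic and chosen only to mirror the imbalance ratio; it is not the frost dataset, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1-score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Synthetic labels: 10 minority samples out of 1000 (99:1).
    y_true = [1] * 10 + [0] * 990
    # A trivial model that always predicts the majority class.
    y_pred = [0] * 1000

    print(accuracy(y_true, y_pred))         # 0.99
    print(f1_for_class(y_true, y_pred, 1))  # 0.0
    print(f1_for_class(y_true, y_pred, 0))  # about 0.995
```

The trivial classifier reaches 99% accuracy while its minority class F1 is zero, which is why the results table reports F1 separately for each class.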

Comparison Methods

Explanation Methods:

Integrated Gradients: Attribution method based on gradient integration
DeepLIFT: Method based on activation difference propagation
Layer-wise Relevance Propagation (LRP): Method that propagates relevance scores backward from the output to the input features, layer by layer
Ensemble Method: Weighted combination of the above three methods

Neighbor Generation Comparison:

Random Gaussian noise generation vs. manifold-based medoid generation
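For contrast with the medoid-based approach, the random baseline can be sketched as additive Gaussian noise on each (standardized) feature. The noise scale `sigma` is a hypothetical parameter, since this summary does not state the value used in the paper.

```python
import random


def gaussian_neighbors(x, sigma=0.05, n=100, seed=0):
    """Generate n neighbors of x by adding independent N(0, sigma^2) noise per feature.

    sigma is an assumed value; the actual noise scale is not given in the summary.
    """
    rng = random.Random(seed)
    return [[x_j + rng.gauss(0.0, sigma) for x_j in x] for _ in range(n)]


if __name__ == "__main__":
    for point in gaussian_neighbors([0.2, 0.1], n=3):
        print(point)
```

Unlike a convex combination with observed medoids, this noise is isotropic and ignores correlations between features, so the resulting points may fall in regions where no real data exists.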

Implementation Details

Model Architecture: 5-layer fully connected neural network, ReLU activation, sigmoid output
Loss Function: Focal Loss ( $\gamma=2.5, \alpha=0.75$ )
Optimizer: RAdam, learning rate 0.0001
Training Settings: 100 epochs, batch size 256
Neighbor Parameters: $k_{nn}=5, \lambda=0.05$ , neighborhood size $n=100$
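To make the loss configuration concrete, here is a per-sample sketch of the standard binary focal loss, $FL(p_t) = -\alpha_t (1-p_t)^{\gamma} \log(p_t)$, with the reported $\gamma=2.5$ and $\alpha=0.75$ as defaults. It assumes the common convention in which $\alpha$ weights the positive (minority) class and $1-\alpha$ the negative class; the summary does not confirm which convention the authors use, and `eps` is an added numerical safeguard.

```python
import math


def binary_focal_loss(p, y, gamma=2.5, alpha=0.75, eps=1e-12):
    """Focal loss for one sample.

    p: predicted probability of the positive class (sigmoid output).
    y: true label, 1 for the minority class and 0 for the majority class.
    Assumption: alpha weights the positive class, (1 - alpha) the negative class.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # clamp to keep log() finite
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A well-classified positive contributes far less than a misclassified one.
    print(binary_focal_loss(0.9, 1))
    print(binary_focal_loss(0.1, 1))
```

The factor $(1-p_t)^{\gamma}$ shrinks the contribution of easy, well-classified samples, which at a 99:1 ratio are mostly majority class examples, so training focuses on the hard and rare cases.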

Experimental Results

Main Results

Model Performance

Dataset	Majority Class F1	Minority Class F1	Minority Class Samples
Training Set	1.00	0.66	~2,500
Validation Set	1.00	0.50	~450
Test Set	1.00	0.51	~300

Explanation Method Performance Comparison

Method	Robustness $\hat{\mathcal{R}}(x)$	Consistency $\hat{\mathcal{C}}(x)$
Integrated Gradients	89.34% (±8.35%)	97.56% (±3.58%)
DeepLIFT	97.69% (±2.26%)	99.40% (±1.51%)
LRP	76.77% (±15.70%)	89.86% (±19.95%)
Ensemble	79.03% (±12.56%)	89.20% (±13.73%)

Key Findings

Importance of Neighbor Generation Method: The medoid-based method significantly outperforms random noise on minority class samples
DeepLIFT Achieves Optimal Performance: Achieves the highest scores and lowest standard deviations on both robustness and consistency metrics
Instability of LRP: Due to vanishing gradient problems, LRP exhibits the most unstable performance
Fragility of Minority Class: Minority class explanations are more susceptible to the choice of neighbor generation method than majority class explanations

Ablation Study

By comparing random neighbor generation with medoid-based neighbor generation, the study demonstrates:

Random methods produce larger distribution shifts on minority class samples
Medoid-based methods better preserve data manifold structure
Minority class is more sensitive to the choice of neighbor generation method

Related Work

XAI Robustness Research

Limitations of LIME and SHAP: Existing studies show these methods perform poorly under adversarial attacks
Explanation Stability: Existing work primarily focuses on explanation stability in general cases, lacking specialized research on imbalanced data

Imbalanced Learning

Traditional Methods: Resampling, cost-sensitive learning, etc.
Deep Learning Methods: Focal Loss and other loss functions specifically designed for imbalanced data
Evaluation Challenges: Traditional evaluation metrics fail on extremely imbalanced data

Contribution of This Work

Compared to existing work, which mostly studies explanation stability in general settings, this paper specifically investigates the reliability of XAI methods on imbalanced datasets and proposes an assessment framework focused on the minority class.

Conclusions and Discussion

Main Conclusions

Explanation reliability in imbalanced datasets is an important but overlooked problem
Minority class explanations require specialized assessment methods, as traditional methods may produce misleading results
Manifold-based neighbor generation can significantly improve assessment reliability
DeepLIFT performs best on the frost prediction task, exhibiting high robustness and consistency

Limitations

Method is still in preliminary stages: Requires validation on more datasets and scenarios
Only considers tabular data: Does not address other data types such as images and text
Limitations of evaluation metrics: Current metrics may not fully capture explanation quality
Computational Overhead: Generating numerous neighbors for each sample increases computational cost

Future Directions

Extension to Different Imbalance Ratios: Investigate method performance under varying degrees of imbalance
Multimodal Data: Extend the method to image, text, and other data types
Uncertainty Analysis: Incorporate uncertainty quantification to improve minority class assessment
Spatiotemporal Data: Consider the special properties of spatiotemporal dimensions

In-Depth Evaluation

Strengths

Problem Importance: Addresses an important yet overlooked problem in the XAI field
Method Innovation: Proposes a targeted assessment framework built on explicitly defined robustness and consistency metrics
Experimental Sufficiency: Validated on real-world scenarios with practical application value
Writing Clarity: Clear paper structure with detailed method description

Weaknesses

Limited Experimental Scale: Validation on only one dataset lacks generalizability proof
Insufficient Theoretical Analysis: Lacks in-depth analysis of the theoretical properties of the method
Limited Baseline Methods: No comparison with other XAI methods specifically designed for imbalanced data
Single Type of Evaluation Metric: Both robustness and consistency rely on Spearman rank correlation, which may not comprehensively reflect explanation quality

Impact

Academic Contribution: Provides new perspectives for XAI applications on imbalanced data
Practical Value: Provides guidance for XAI deployment in high-risk applications
Reproducibility: Open-source code facilitates reproduction and extension

Applicable Scenarios

High-Risk Applications: Medical diagnosis, financial risk control, weather forecasting, etc.
Highly Imbalanced Data: Fraud detection, anomaly detection, rare event prediction
Strictly Regulated Domains: Industries requiring explainable AI

References

The paper cites important works in the XAI field, including:

Classical methods such as LIME [3] and SHAP [4]
Neural network explanation methods such as Integrated Gradients [11], DeepLIFT [12], and LRP [13]
Imbalanced learning techniques such as Focal Loss [7]
Related robustness analysis work [5, 9, 10]

Overall Assessment: This is a preliminary research work addressing an important practical problem. While there is room for improvement in experimental scale and theoretical depth, it opens new research directions for assessing the reliability of XAI on imbalanced datasets and demonstrates good application prospects.