
Assessing reliability of explanations in unbalanced datasets: a use-case on the occurrence of frost events

Vascotto, Blasone, Rodriguez et al.
The usage of eXplainable Artificial Intelligence (XAI) methods has become essential in practical applications, given the increasing deployment of Artificial Intelligence (AI) models and the legislative requirements put forward in the latest years. A fundamental but often underestimated aspect of the explanations is their robustness, a key property that should be satisfied in order to trust the explanations. In this study, we provide some preliminary insights on evaluating the reliability of explanations in the specific case of unbalanced datasets, which are very frequent in high-risk use-cases, but at the same time considerably challenging for both AI models and XAI methods. We propose a simple evaluation focused on the minority class (i.e. the less frequent one) that leverages on-manifold generation of neighbours, explanation aggregation and a metric to test explanation consistency. We present a use-case based on a tabular dataset with numerical features focusing on the occurrence of frost events.

Assessing Reliability of Explanations in Unbalanced Datasets: A Use-Case on the Occurrence of Frost Events

Basic Information

  • Paper ID: 2507.09545
  • Title: Assessing reliability of explanations in unbalanced datasets: a use-case on the occurrence of frost events
  • Authors: Ilaria Vascotto, Valentina Blasone, Alex Rodriguez, Alessandro Bonaita, Luca Bortolussi
  • Classification: cs.LG (Machine Learning)
  • Publication Time/Conference: Late-breaking work, 3rd World Conference on eXplainable Artificial Intelligence (July 09–11, 2025, Istanbul, Turkey)
  • Paper Link: https://arxiv.org/abs/2507.09545

Abstract

The application of explainable artificial intelligence (XAI) methods has become increasingly critical in practical applications, driven by the widespread deployment of AI models and recent legislative requirements. The robustness of explanations is a fundamental yet often underestimated aspect and a key attribute that explanations must satisfy to warrant trust. This study provides preliminary insights into assessing the reliability of explanations in the specific context of imbalanced datasets. Imbalanced datasets are prevalent in high-risk use cases but simultaneously present considerable challenges for both AI models and XAI methods. We propose a simple evaluation approach focused on the minority class (i.e., the less frequent class), which leverages on-manifold generation of neighbors, explanation aggregation, and a metric for testing explanation consistency. We demonstrate this approach on a tabular dataset with numerical features, with frost event occurrence as a use case.

Research Background and Motivation

Problem Definition

The core problem this study addresses is how to assess the reliability of XAI explanations on imbalanced datasets. Specifically, when minority class samples are extremely scarce in a dataset, traditional explanation methods may produce unreliable results.

Importance Analysis

  1. Legislative Requirements: Regulations such as GDPR and the AI Act impose transparency requirements on high-risk applications
  2. Practical Needs: High-risk domains such as healthcare, climate science, and fraud detection frequently face imbalanced data problems
  3. Trust Crisis: On imbalanced datasets, even if a model achieves 99% accuracy, it may simply be predicting the majority class

Limitations of Existing Methods

  1. LIME and SHAP methods exhibit poor robustness on imbalanced datasets
  2. Lack of Targeted Assessment: Existing methods primarily focus on overall performance while neglecting the special characteristics of the minority class
  3. Unstable Explanations: Similar inputs may produce drastically different explanations

Research Motivation

The authors argue that assessing explanation reliability for the minority class in imbalanced datasets is particularly important because:

  • Accurately predicting rare events is crucial in high-risk applications
  • The majority class is easy to predict, and its explanations are not necessarily trustworthy
  • Specialized methods are needed to assess the robustness of minority class explanations

Core Contributions

  1. Proposed an explanation reliability assessment framework for imbalanced datasets, focusing on minority class samples
  2. Designed a manifold-based neighbor generation method, ensuring that perturbed samples lie on the data manifold
  3. Introduced consistency metrics, assessing reliability by comparing original explanations with locally weighted averaged explanations
  4. Validated the method's effectiveness on a real frost prediction task, which exhibits high imbalance (99:1)

Methodology Details

Task Definition

Given an imbalanced dataset $\mathcal{D} = (X, y)$ where $P(y=0) \gg P(y=1)$ (0 is the majority class, 1 is the minority class), train a neural network $f(\cdot)$ with the goal of assessing the reliability of an explanation method $e$ on minority class samples.

Model Architecture

1. Neighborhood Generation

Employs manifold-based neighbor generation using k-medoids clustering:

Steps:

  • Apply k-medoids clustering to the validation set to obtain $k_{medoids}$ clusters
  • Average cluster size $n_k = 10$
  • Extract the medoid of each cluster as a representative point
  • For test samples, find the corresponding medoid and its $k_{nn} = 5$ nearest neighbors

Perturbation Formula: $\tilde{x}_j = (1-\bar{\lambda}) \cdot x_j + \bar{\lambda} \cdot x_{M_j}$ where $\bar{\lambda} \sim Beta(\lambda \cdot 100, (1-\lambda) \cdot 100)$
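The perturbation step can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the medoids are taken as already computed (the k-medoids step is omitted), distances are Euclidean, the index $j$ is read as the index of the generated neighbor, and $x_{M_j}$ is drawn uniformly from the sample's own medoid and that medoid's $k_{nn}$ nearest medoids. The names `euclidean` and `generate_neighbors` are hypothetical.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def generate_neighbors(x, medoids, k_nn=5, lam=0.05, n=100, rng=None):
    """Interpolate x toward nearby medoids (assumed reading of the formula).

    x       : feature vector of the sample to explain
    medoids : list of medoid feature vectors (clustering done beforehand)
    """
    if rng is None:
        rng = random.Random(0)
    # Medoid closest to x, then that medoid plus its k_nn nearest medoids.
    own = min(medoids, key=lambda m: euclidean(x, m))
    candidates = sorted(medoids, key=lambda m: euclidean(own, m))[: k_nn + 1]
    neighbors = []
    for _ in range(n):
        target = rng.choice(candidates)  # plays the role of x_{M_j}
        # Beta(lam * 100, (1 - lam) * 100) has mean lam, so steps stay small.
        lam_bar = rng.betavariate(lam * 100, (1 - lam) * 100)
        neighbors.append([(1 - lam_bar) * xi + lam_bar * mi
                          for xi, mi in zip(x, target)])
    return neighbors
```

Because every neighbor is a convex combination of the sample and an observed data point, it stays between real observations instead of drifting in arbitrary directions.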

2. Local Averaging

Compute weighted average explanations for minority class samples: $\bar{e}(x) = \frac{\sum_{\tilde{x} \in \mathcal{N}} e(\tilde{x}) \cdot \pi(x,\tilde{x})}{\sum_{\tilde{x} \in \mathcal{N}} \pi(x,\tilde{x})}$ where the weight $\pi(x,\tilde{x}) = \frac{1}{dist(x,\tilde{x})}$
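A minimal sketch of this aggregation follows, assuming Euclidean distance for $dist$ and a small floor `eps` to avoid division by zero when a neighbor coincides with the sample. Both choices, and the function name `weighted_average_explanation`, are illustrative assumptions.

```python
import math

def weighted_average_explanation(x, neighbors, explanations, eps=1e-12):
    """Distance-weighted mean of neighbor explanations, with pi = 1 / dist.

    neighbors    : list of perturbed feature vectors
    explanations : list of attribution vectors, one per neighbor
    """
    weights = []
    for x_tilde in neighbors:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_tilde)))
        weights.append(1.0 / max(dist, eps))  # eps guards against dist == 0
    total = sum(weights)
    n_features = len(explanations[0])
    return [sum(w * e[j] for w, e in zip(weights, explanations)) / total
            for j in range(n_features)]
```

For example, with neighbors at distances 1 and 2 the weights are 1 and 0.5, so the closer neighbor contributes two thirds of the averaged explanation.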

3. Reliability Assessment

Define two evaluation metrics:

Local Robustness: $\hat{\mathcal{R}}(x) = \frac{1}{|\mathcal{N}|} \sum_{\tilde{x} \in \mathcal{N}} \rho(e(x), e(\tilde{x}))$

Consistency: $\hat{\mathcal{C}}(x) = \rho(e(x), \bar{e}(x))$ where $\rho$ is the Spearman rank correlation coefficient
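Both metrics reduce to Spearman correlations between attribution vectors, which can be sketched with the standard library alone. The helper names (`ranks`, `spearman`, `local_robustness`, `consistency`) are hypothetical, and the sketch assumes no attribution vector is constant, since the correlation is undefined in that case.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    """Spearman rho computed as the Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    var_a = sum((p - ma) ** 2 for p in ra)
    var_b = sum((q - mb) ** 2 for q in rb)
    return cov / (var_a * var_b) ** 0.5

def local_robustness(e_x, neighbor_explanations):
    """Mean rank correlation between e(x) and each neighbor's explanation."""
    scores = [spearman(e_x, e_n) for e_n in neighbor_explanations]
    return sum(scores) / len(scores)

def consistency(e_x, e_bar):
    """Rank correlation between e(x) and the locally averaged explanation."""
    return spearman(e_x, e_bar)
```

In practice `scipy.stats.spearmanr` computes the same coefficient; the hand-rolled version is used here only to keep the sketch dependency-free.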

Technical Innovations

  1. Manifold-aware neighbor generation: Compared to random Gaussian noise, the medoid-based approach generates neighbors that better conform to the data distribution
  2. Specialized assessment for minority class: Focuses on the most critical yet most fragile minority class samples
  3. Introduction of consistency metrics: Assesses local consistency by comparing original explanations with aggregated explanations
  4. Distance-weighted explanation aggregation: Averages neighbor explanations with weights inversely proportional to each neighbor's distance from the original sample

Experimental Setup

Dataset

Frost Prediction Dataset:

  • Source: ERA5 reanalysis data (ECMWF) + proprietary insurance company data
  • Time Span: 2009-2024 (15 years)
  • Geographic Coverage: Entire territory of Poland
  • Features: 8 numerical atmospheric variables (standardized)
  • Target: Binary classification (frost occurrence or not)
  • Imbalance Ratio: 99% vs 1% (highly imbalanced)
  • Data Split: Training set 75%, validation set 15%, test set 10% (stratified by region)

Evaluation Metrics

  • Model Performance: F1-score (suitable for imbalanced datasets)
  • Explanation Reliability: Local robustness $\hat{\mathcal{R}}(x)$ and consistency $\hat{\mathcal{C}}(x)$
  • Correlation Measure: Spearman rank correlation coefficient
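To illustrate why the F1-score is preferred over accuracy at a 99:1 ratio, the sketch below scores a degenerate classifier that always predicts the majority class. The labels are synthetic toy data, not the frost dataset, and `f1_score` is a hand-written helper for illustration.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the chosen positive class; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Synthetic 99:1 labels and a classifier that never predicts the minority class.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
minority_f1 = f1_score(y_true, y_pred, positive=1)
majority_f1 = f1_score(y_true, y_pred, positive=0)
```

Here accuracy is 99% and the majority-class F1 is close to 1, while the minority-class F1 is 0, which is why per-class F1 is the more informative figure.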

Comparison Methods

Explanation Methods:

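  1. Integrated Gradients: Attributes a prediction by integrating the model's gradients along a path from a baseline input to the actual input
  2. DeepLIFT: Propagates differences in activations relative to a reference input back to the input features
  3. Layer-wise Relevance Propagation (LRP): Propagates relevance scores backward from the output, layer by layer, to the input features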
  4. Ensemble Method: Weighted combination of the above three methods
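As an illustration of the first method in the list, the sketch below approximates Integrated Gradients for a toy logistic model with an analytic gradient. It is not the paper's five-layer network or any particular library's implementation; the model, weights, zero baseline, and midpoint-rule integration are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy logistic model f(x) = sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def gradient(x, w, b):
    """Analytic gradient of the logistic model with respect to x."""
    s = model(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint-rule approximation of the path integral from baseline to x."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        total = [t + g for t, g in zip(total, gradient(point, w, b))]
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]

w, b = [1.0, -2.0, 0.5], 0.1
x, baseline = [1.0, 0.5, -1.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum approximately to `model(x, w, b) - model(baseline, w, b)`.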

Neighbor Generation Comparison:

  • Random Gaussian noise generation vs. manifold-based medoid generation
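The difference between the two strategies can be seen on a toy two-dimensional manifold, the line $x_2 = x_1$. In this sketch Gaussian noise pushes neighbors off the line, while interpolation toward another point on the line stays on it. The noise scale `sigma`, the toy manifold, and the helper names are assumptions for illustration only.

```python
import random

def gaussian_neighbors(x, sigma=0.1, n=100, rng=None):
    """Baseline strategy: add independent Gaussian noise to every feature."""
    if rng is None:
        rng = random.Random(0)
    return [[xi + rng.gauss(0.0, sigma) for xi in x] for _ in range(n)]

def off_line_distance(point):
    """Distance of a 2-D point from the line x2 = x1 (the toy manifold)."""
    return abs(point[0] - point[1]) / 2 ** 0.5

x, medoid = [1.0, 1.0], [2.0, 2.0]  # both lie on the toy manifold

noisy = gaussian_neighbors(x)
interpolated = [[(1 - t) * a + t * b for a, b in zip(x, medoid)]
                for t in (0.02, 0.05, 0.08)]

max_noisy_shift = max(off_line_distance(p) for p in noisy)
max_interp_shift = max(off_line_distance(p) for p in interpolated)
```

Real data manifolds are not straight lines, so interpolation only approximates on-manifold sampling, but the toy case shows why medoid-based neighbors tend to be more plausible inputs.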

Implementation Details

  • Model Architecture: 5-layer fully connected neural network, ReLU activation, sigmoid output
  • Loss Function: Focal Loss ($\gamma=2.5$, $\alpha=0.75$)
  • Optimizer: RAdam, learning rate 0.0001
  • Training Settings: 100 epochs, batch size 256
  • Neighbor Parameters: $k_{nn}=5$, $\lambda=0.05$, neighborhood size $n=100$
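The focal loss setting above can be made concrete with a per-sample sketch. It follows the common binary formulation $-\alpha_t (1-p_t)^{\gamma} \log(p_t)$ and assumes that $\alpha$ weights the positive (minority) class and $1-\alpha$ the negative class; the summary does not state which convention the authors used, so treat that as an assumption.

```python
import math

def focal_loss(p, y, gamma=2.5, alpha=0.75, eps=1e-12):
    """Binary focal loss for one sample.

    p : predicted probability of the positive (minority) class
    y : true label, 1 for minority and 0 for majority
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)             # clamp to keep log finite
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_majority = focal_loss(0.01, 0)   # confident, correct majority prediction
hard_minority = focal_loss(0.10, 1)   # minority sample the model misses
```

The modulating factor $(1-p_t)^{\gamma}$ shrinks the loss of well-classified majority samples toward zero, so the gradient signal is dominated by hard minority samples.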

Experimental Results

Main Results

Model Performance

| Dataset | Majority Class F1 | Minority Class F1 | Minority Class Samples |
|---|---|---|---|
| Training Set | 1.00 | 0.66 | ~2,500 |
| Validation Set | 1.00 | 0.50 | ~450 |
| Test Set | 1.00 | 0.51 | ~300 |

Explanation Method Performance Comparison

| Method | Robustness $\hat{\mathcal{R}}(x)$ | Consistency $\hat{\mathcal{C}}(x)$ |
|---|---|---|
| Integrated Gradients | 89.34% (±8.35%) | 97.56% (±3.58%) |
| DeepLIFT | 97.69% (±2.26%) | 99.40% (±1.51%) |
| LRP | 76.77% (±15.70%) | 89.86% (±19.95%) |
| Ensemble | 79.03% (±12.56%) | 89.20% (±13.73%) |

Key Findings

  1. Importance of Neighbor Generation Method: The medoid-based method significantly outperforms random noise on minority class samples
  2. DeepLIFT Achieves Optimal Performance: Achieves the highest scores and lowest standard deviations on both robustness and consistency metrics
  3. Instability of LRP: Due to vanishing gradient problems, LRP exhibits the most unstable performance
  4. Fragility of Minority Class: Minority class explanations are more susceptible to the choice of neighbor generation method than majority class explanations

Ablation Study

By comparing random neighbor generation with medoid-based neighbor generation, the study demonstrates:

  • Random methods produce larger distribution shifts on minority class samples
  • Medoid-based methods better preserve data manifold structure
  • Minority class is more sensitive to the choice of neighbor generation method

Related Work

XAI Robustness Research

  • Limitations of LIME and SHAP: Existing studies show these methods perform poorly under adversarial attacks
  • Explanation Stability: Existing work primarily focuses on explanation stability in general cases, lacking specialized research on imbalanced data

Imbalanced Learning

  • Traditional Methods: Resampling, cost-sensitive learning, etc.
  • Deep Learning Methods: Focal Loss and other loss functions specifically designed for imbalanced data
  • Evaluation Challenges: Traditional evaluation metrics fail on extremely imbalanced data

Contribution of This Work

Compared to existing work, this paper offers preliminary insights into the reliability of XAI methods on imbalanced datasets and proposes an assessment framework focused on the minority class.

Conclusions and Discussion

Main Conclusions

  1. Explanation reliability in imbalanced datasets is an important but overlooked problem
  2. Minority class explanations require specialized assessment methods, as traditional methods may produce misleading results
  3. Manifold-based neighbor generation can significantly improve assessment reliability
  4. DeepLIFT performs best on the frost prediction task, exhibiting high robustness and consistency

Limitations

  1. Method is still in preliminary stages: Requires validation on more datasets and scenarios
  2. Only considers tabular data: Does not address other data types such as images and text
  3. Limitations of evaluation metrics: Current metrics may not fully capture explanation quality
  4. Computational Overhead: Generating numerous neighbors for each sample increases computational cost

Future Directions

  1. Extension to Different Imbalance Ratios: Investigate method performance under varying degrees of imbalance
  2. Multimodal Data: Extend the method to image, text, and other data types
  3. Uncertainty Analysis: Incorporate uncertainty quantification to improve minority class assessment
  4. Spatiotemporal Data: Consider the special properties of spatiotemporal dimensions

In-Depth Evaluation

Strengths

  1. Problem Importance: Addresses an important yet overlooked problem in the XAI field
  2. Method Innovation: Proposes a targeted assessment framework with theoretical foundation
  3. Experimental Sufficiency: Validated on real-world scenarios with practical application value
  4. Writing Clarity: Clear paper structure with detailed method description

Weaknesses

  1. Limited Experimental Scale: Validation on only one dataset lacks generalizability proof
  2. Insufficient Theoretical Analysis: Lacks in-depth analysis of the theoretical properties of the method
  3. Limited Baseline Methods: No comparison with other XAI methods specifically designed for imbalanced data
  4. Single Evaluation Metric: Primarily relies on correlation metrics, which may not comprehensively reflect explanation quality

Impact

  1. Academic Contribution: Provides new perspectives for XAI applications on imbalanced data
  2. Practical Value: Provides guidance for XAI deployment in high-risk applications
  3. Reproducibility: Open-source code facilitates reproduction and extension

Applicable Scenarios

  • High-Risk Applications: Medical diagnosis, financial risk control, weather forecasting, etc.
  • Highly Imbalanced Data: Fraud detection, anomaly detection, rare event prediction
  • Strictly Regulated Domains: Industries requiring explainable AI

References

The paper cites important works in the XAI field, including:

  • Classical methods such as LIME [3] and SHAP [4]
  • Neural network explanation methods such as Integrated Gradients [11], DeepLIFT [12], and LRP [13]
  • Imbalanced learning techniques such as Focal Loss [7]
  • Related robustness analysis work [5, 9, 10]

Overall Assessment: This is a preliminary research work addressing an important practical problem. While there is room for improvement in experimental scale and theoretical depth, it opens new research directions for assessing the reliability of XAI on imbalanced datasets and demonstrates good application prospects.