
Epistemic Errors of Imperfect Multitask Learners When Distributions Shift

Sloman, Caprio, Kaski

Uncertainty-aware machine learners, such as Bayesian neural networks, output a quantification of uncertainty instead of a point prediction. In this work, we provide uncertainty-aware learners with a principled framework to characterize, and identify ways to eliminate, errors that arise from reducible (epistemic) uncertainty. We introduce a principled definition of epistemic error, and provide a decompositional epistemic error bound which operates in the very general setting of imperfect multitask learning under distribution shift. In this setting, the training (source) data may arise from multiple tasks, the test (target) data may differ systematically from the source data tasks, and/or the learner may not arrive at an accurate characterization of the source data. Our bound separately attributes epistemic errors to each of multiple aspects of the learning procedure and environment. As corollaries of the general result, we provide epistemic error bounds specialized to the settings of Bayesian transfer learning and distribution shift within $\varepsilon$-neighborhoods. We additionally leverage the terms in our bound to provide a novel definition of negative transfer.


Basic Information

Paper ID: 2505.23496
Title: Epistemic Errors of Imperfect Multitask Learners When Distributions Shift
Authors: Sabina J. Sloman, Michele Caprio, Samuel Kaski
Categories: cs.LG, stat.ML
Published: October 13, 2025 (arXiv preprint)
Paper link: https://arxiv.org/abs/2505.23496

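Summary

The paper gives uncertainty-aware machine learning models (such as Bayesian neural networks) a principled framework for characterizing and eliminating errors that arise from reducible (epistemic) uncertainty. It introduces a principled definition of epistemic error and provides a decompositional epistemic error bound in the very general setting of imperfect multitask learning under distribution shift. In this setting, the training (source) data may come from multiple tasks, the test (target) data may differ systematically from the source tasks, and/or the learner may fail to characterize the source data accurately. The bound attributes epistemic error separately to several aspects of the learning procedure and the environment.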

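Background and Motivation

Problem Definition

The core question the work addresses is: how can uncertainty-aware learners be given a theoretical framework for characterizing and reducing epistemic error? Specifically:

Limitations of traditional learning theory: existing statistical learning theory focuses mainly on generalization error, but for learners that output a quantification of uncertainty, prediction error is an irrelevant, incomplete, or uninformative performance measure.
Conflation of uncertainty types: traditional approaches conflate reducible epistemic uncertainty with irreducible aleatoric uncertainty, and so cannot effectively guide model improvement.
Lack of theoretical support for complex learning scenarios: complex real-world settings such as multitask learning, distribution shift, and imperfect learning lack theoretical guidance.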

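Significance

Practical value: in high-stakes domains such as medicine, accurate uncertainty quantification is critical
Theoretical completeness: fills a gap in the theory of uncertainty-aware learning
Guidance for practice: provides a theoretical basis for model selection and optimization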

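Limitations of Existing Approaches

Traditional frameworks such as PAC learning theory cannot distinguish epistemic error from aleatoric error
There is no unified theoretical framework covering multitask learning and distribution shift
Existing bounds typically assume perfect learning or no distribution shift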

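Core Contributions

Epistemic error bounds as a concept: introduces the epistemic error bound as a new theoretical tool designed specifically for uncertainty-aware learners
Decompositional epistemic error bound: in the general setting of imperfect multitask learning under distribution shift, provides a bound that decomposes epistemic error into three components
Corollaries for special cases: provides specialized epistemic error bounds for Bayesian transfer learning and for distribution shift within $\varepsilon$-neighborhoods
A new definition of negative transfer: uses the terms of the bound to give a new theoretical characterization of negative transfer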

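Method in Detail

Task Definition

Epistemic error is defined as the degree to which the learner is wrong about the data-generating process (DGP), formalized as: $e := d_{TV}(\hat{P}, Q^t)$

where $\hat{P}$ is the learner's predictive distribution, $Q^t$ is the target task distribution, and $d_{TV}$ is the total variation distance.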
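As a minimal sketch of this definition, assuming a finite outcome space so that each distribution is a probability vector, the total variation distance is half the $\ell_1$ distance between the two vectors. The function name `tv_distance` and the example vectors are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance between two discrete distributions
    given as probability vectors over the same finite support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("distributions must share the same support")
    return 0.5 * np.abs(p - q).sum()

# Hypothetical predictive distribution and target task distribution.
p_hat = np.array([0.5, 0.3, 0.2])
q_t = np.array([0.4, 0.4, 0.2])

# Epistemic error e := d_TV(p_hat, q_t) for this toy example.
epistemic_error = tv_distance(p_hat, q_t)
```

For continuous outcome spaces the sum would become an integral over densities, which generally needs numerical approximation.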

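Core Theoretical Framework

Multitask Learning Setting

Task distribution: tasks are themselves sampled from a second-order task distribution $\mathcal{Q} \in \Delta(\Delta_X)$
Source tasks: the training data come from $n$ source tasks, each task $Q \sim \mathcal{Q}^S$
Target task: the test task $Q^t \sim \mathcal{Q}^T$
Distribution shift: occurs when $\mathcal{Q}^S \neq \mathcal{Q}^T$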
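The setting above can be simulated in a toy form. The sketch below assumes a three-element outcome space and uses Dirichlet distributions as the second-order task distributions; the Dirichlet choice, the concentration parameters, and the variable names are all illustrative assumptions rather than anything specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed second-order distributions over the probability simplex on 3 outcomes.
# Different concentration parameters give Q^S != Q^T, i.e. distribution shift.
alpha_source = np.array([5.0, 3.0, 2.0])
alpha_target = np.array([2.0, 3.0, 5.0])

n = 4  # number of source tasks

# Each row is one source task Q ~ Q^S: a probability vector over the outcomes.
source_tasks = rng.dirichlet(alpha_source, size=n)

# A single target task Q^t ~ Q^T.
target_task = rng.dirichlet(alpha_target)
```

Each sampled task is itself a distribution, so data for a task would be drawn in a second step, for example with `rng.choice(3, p=source_tasks[0])`.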

Key Definitions

Barycenter of a task distribution (Definition 1): $\bar{Q}(x) := \int_{\Delta_X} Q(x) q(Q) dQ = \mathbb{E}_{Q \sim \mathcal{Q}}[Q(x)]$
Variability of a task distribution (Definition 2): $V[\mathcal{Q}] := \sup_{x \in X} \int_{\Delta_X} [Q(x) - \bar{Q}(x)]^2 q(Q) dQ$
Approximation bias (Definition 7): $B := d_{TV}(P^*, \bar{Q}^S)$ where $P^* = \arg\min_{P \in \pi} d_{TV}(P, \bar{Q}^S)$
Lack of convergence (Definition 8): $C := d_{TV}(\hat{P}, P^*)$
Extent of distribution shift (Definition 9): $D := d_{TV}(\bar{Q}^S, \bar{Q}^T)$
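These quantities can be computed directly in a toy discrete case. The sketch below assumes that each second-order distribution puts weight on finitely many tasks (so the integrals become weighted sums) and that the model class $\pi$ is a small finite set of candidate distributions; every number and name here is hypothetical. Because $d_{TV}$ is a metric, the triangle inequality gives $d_{TV}(\hat{P}, \bar{Q}^T) \le C + B + D$, which the last lines check numerically. This is an elementary consequence of the definitions and is not presented as the paper's bound.

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance between discrete probability vectors."""
    return 0.5 * np.abs(np.asarray(p, float) - np.asarray(q, float)).sum()

def barycenter(tasks, weights):
    """Q_bar(x) = E_{Q ~ Q}[Q(x)] for a finitely supported task distribution."""
    return weights @ tasks

def variability(tasks, weights):
    """V[Q] = sup_x E_{Q ~ Q}[(Q(x) - Q_bar(x))^2]."""
    q_bar = barycenter(tasks, weights)
    return (weights @ (tasks - q_bar) ** 2).max()

# Hypothetical source and target task distributions (rows are tasks).
source_tasks = np.array([[0.6, 0.3, 0.1], [0.4, 0.4, 0.2]])
target_tasks = np.array([[0.2, 0.3, 0.5], [0.2, 0.5, 0.3]])
weights = np.array([0.5, 0.5])  # second-order probabilities of each task

q_bar_s = barycenter(source_tasks, weights)
q_bar_t = barycenter(target_tasks, weights)
v_source = variability(source_tasks, weights)

# Hypothetical finite model class pi; P* is its closest member to q_bar_s.
model_class = np.array([[0.50, 0.25, 0.25],
                        [0.25, 0.50, 0.25],
                        [0.25, 0.25, 0.50]])
p_star = min(model_class, key=lambda p: tv_distance(p, q_bar_s))
p_hat = model_class[1]  # a learner that stopped at a suboptimal member

B = tv_distance(p_star, q_bar_s)   # approximation bias
C = tv_distance(p_hat, p_star)     # lack of convergence
D = tv_distance(q_bar_s, q_bar_t)  # extent of distribution shift

# Triangle inequality: distance from p_hat to the target barycenter.
lhs = tv_distance(p_hat, q_bar_t)
holds = lhs <= B + C + D
```

In this example a richer model class would shrink `B`, further training would shrink `C`, and source data closer to the target would shrink `D`, which mirrors how the definitions separate the sources of error.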