2025-11-14T22:04:10.870857

Characterizing the Multiclass Learnability of Forgiving 0-1 Loss Functions

Trauger, Trauger, Tewari

In this paper we will give a characterization of the learnability of forgiving 0-1 loss functions in the finite label multiclass setting. To do this, we create a new combinatorial dimension that is based off of the Natarajan Dimension and we show that a hypothesis class is learnable in our setting if and only if this Generalized Natarajan Dimension is finite. We also show a connection to learning with set-valued feedback. Through our results we show that the learnability of a set learning problem is characterized by the Natarajan Dimension.

Basic Information

Paper ID: 2510.08382
Title: Characterizing the Multiclass Learnability of Forgiving 0-1 Loss Functions
Authors: Jacob Trauger (University of Michigan), Tyson Trauger (The Ohio State University), Ambuj Tewari (University of Michigan)
Categories: cs.LG (Machine Learning), stat.ML (Statistics - Machine Learning)
Published: October 2025 (arXiv preprint)
Paper link: https://arxiv.org/abs/2510.08382

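Abstract

The paper characterizes the learnability of forgiving 0-1 loss functions in the finite-label multiclass setting. To do so, the authors build a new combinatorial dimension on top of the Natarajan dimension and prove that a hypothesis class is learnable in this setting if and only if this generalized Natarajan dimension is finite. The paper also draws a connection to learning with set-valued feedback, and the results show that the learnability of a set learning problem is characterized by the Natarajan dimension.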

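Background and Motivation

Problem Background

Characterizing when a classification task is learnable is a central question in machine learning theory. For binary classification, the VC dimension fully characterizes PAC learnability; for multiclass classification with finitely many labels, the Natarajan dimension plays the analogous role. Both results, however, rest on the standard 0-1 loss, which satisfies the "identity of indiscernibles" property: the loss is 0 if and only if the two labels are equal.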

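Motivation

Practical applications often call for a more "forgiving" loss function, for example:

Sentence paraphrasing: several different sentences may all be correct paraphrases
Threshold-based metrics: any output within a given threshold is acceptable
Learning with set-valued feedback: a prediction only needs to fall inside a given set

In these scenarios, several different outputs can incur zero loss against the same true label, which breaks a basic assumption of the classical theory.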

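Limitations of Existing Approaches

Existing learnability theory (VC dimension, Natarajan dimension, and so on) implicitly ties label equality to the loss value. When the loss function does not satisfy the identity of indiscernibles, these results no longer apply, and a new theoretical framework is needed to characterize learnability.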

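Core Contributions

Generalized Natarajan dimension: a new combinatorial dimension, built on the Natarajan dimension, that applies to forgiving 0-1 loss functions
Complete characterization of learnability: a proof that a hypothesis class is PAC learnable under a forgiving 0-1 loss if and only if its generalized Natarajan dimension is finite
Resolution of the set learning problem: the first characterization of the learnability of learning with set-valued feedback in the batch setting
A theoretical framework: a systematic learning theory for loss functions that do not satisfy the identity of indiscernibles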

Method in Detail

Task Definition

Input space: $X$ (an arbitrary input space)
Output space: $Y = [k]$ (a finite label set, $|Y| = k$)
Hypothesis class: $H \subset Y^X$
Loss function: $\ell: Y \times Y \to \{0,1\}$, subject to the following constraints:

Binary-valued: $\forall y_1, y_2 \in Y, \ell(y_1, y_2) \in \{0,1\}$
Symmetry: $\forall y_1, y_2 \in Y, \ell(y_1, y_2) = \ell(y_2, y_1)$
Non-inclusion: $\forall y_1, y_2 \in Y, \sigma(y_1) \not\subset \sigma(y_2)$
Reflexivity: $\forall y \in Y, \ell(y, y) = 0$

Here $\sigma(y) = \{y' | \ell(y, y') = 0\}$ denotes the set of all labels that incur zero loss against $y$.
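The four constraints can be checked mechanically for any concrete loss on a finite label set. The sketch below is illustrative only: the function names, the two example losses, and the reading of $\not\subset$ as "not a proper subset" are assumptions made here, not taken from the paper.

```python
from itertools import product

def zero_loss_set(loss, y, labels):
    """sigma(y): every label y' with loss(y, y') == 0."""
    return frozenset(yp for yp in labels if loss(y, yp) == 0)

def check_forgiving_loss(loss, labels):
    """Report which of the four constraints hold for `loss` on `labels`."""
    sigma = {y: zero_loss_set(loss, y, labels) for y in labels}
    pairs = list(product(labels, repeat=2))
    return {
        "binary": all(loss(a, b) in (0, 1) for a, b in pairs),
        "symmetric": all(loss(a, b) == loss(b, a) for a, b in pairs),
        # Assumption: non-inclusion forbids *proper* subsets; `<` on
        # frozensets is Python's proper-subset test.
        "non_inclusion": all(not (sigma[a] < sigma[b]) for a, b in pairs),
        "reflexive": all(loss(y, y) == 0 for y in labels),
    }

LABELS = list(range(5))

def cyclic_loss(a, b):
    """Hypothetical threshold loss on a cycle: neighbours are forgiven."""
    d = abs(a - b)
    return 0 if min(d, len(LABELS) - d) <= 1 else 1

def line_loss(a, b):
    """Hypothetical threshold loss on a line: |a - b| <= 1 is forgiven."""
    return 0 if abs(a - b) <= 1 else 1

print(check_forgiving_loss(cyclic_loss, LABELS))
print(check_forgiving_loss(line_loss, LABELS))
```

Under this reading, the cyclic threshold loss satisfies all four constraints, while the line version fails non-inclusion because the zero-loss set of the endpoint label 0, $\{0, 1\}$, sits strictly inside that of label 1, $\{0, 1, 2\}$.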

Core Theoretical Construction

1. Generalized Natarajan Dimension

Definition 4 (Generalized Natarajan dimension): A hypothesis class $H$ and loss function $\ell$ generalized-Natarajan-shatter a set $S = \{s_1, \dots, s_n\}$ if there exist $h_1, h_2 \in H$ such that:

Separation condition: $\forall s_i \in S, \sigma(h_1(s_i)) \neq \sigma(h_2(s_i))$
Realization condition: $\forall S' \subseteq S$, there exists $h \in H$ such that:
- $\forall s \in S': \sigma(h(s)) = \sigma(h_1(s))$
- $\forall s \in S \setminus S': \sigma(h(s)) = \sigma(h_2(s))$
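For a finite hypothesis class over a finite domain, the two shattering conditions can be tested by brute force. The following is a minimal sketch under stated assumptions: hypotheses are represented as dicts from points to labels, the dimension is taken to be the size of the largest shattered set (the usual convention, not spelled out in this section), and all function names and the example `group_loss` are hypothetical.

```python
from itertools import combinations, product

def sigma(loss, y, labels):
    """Zero-loss set of y: all labels y' with loss(y, y') == 0."""
    return frozenset(yp for yp in labels if loss(y, yp) == 0)

def shatters(hypotheses, loss, labels, points):
    """True if some pair h1, h2 witnesses generalized Natarajan shattering."""
    sig = {y: sigma(loss, y, labels) for y in labels}
    # Each hypothesis induces a tuple of zero-loss sets on `points`.
    patterns = {tuple(sig[h[s]] for s in points) for h in hypotheses}
    for p1, p2 in product(patterns, repeat=2):
        # Separation: the zero-loss sets must differ at every point.
        if any(a == b for a, b in zip(p1, p2)):
            continue
        # Realization: every coordinate-wise mix of p1 and p2 (one mix
        # per subset S') must be induced by some hypothesis.
        if all(mix in patterns for mix in product(*zip(p1, p2))):
            return True
    return False

def generalized_natarajan_dim(hypotheses, loss, labels, domain):
    """Largest n such that some n-point subset of `domain` is shattered."""
    best = 0
    for n in range(1, len(domain) + 1):
        if any(shatters(hypotheses, loss, labels, pts)
               for pts in combinations(domain, n)):
            best = n
        else:
            break  # subsets of a shattered set are shattered, so stop here
    return best

LABELS = [0, 1, 2, 3]

def zero_one(a, b):
    return 0 if a == b else 1

def group_loss(a, b):
    """Hypothetical forgiving loss: labels {0,1} and {2,3} are interchangeable."""
    return 0 if a // 2 == b // 2 else 1

DOMAIN = ["a", "b"]
H_same = [{"a": u, "b": v} for u in (0, 1) for v in (0, 1)]
H_cross = [{"a": u, "b": v} for u in (0, 2) for v in (0, 2)]
```

In this toy setup, `H_same` shatters both points under the standard 0-1 loss, but under `group_loss` every hypothesis induces the same zero-loss sets, so the separation condition can never hold and nothing is shattered. `H_cross` uses labels from both groups and shatters both points under `group_loss`.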