2025-11-22T20:19:15.981080

Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL

Wu, Zhao, Chen et al.

Few-Shot Class-Incremental Learning (FSCIL) challenges models to sequentially learn new classes from minimal examples without forgetting prior knowledge, a task complicated by the stability-plasticity dilemma and data scarcity. Current FSCIL methods often struggle with generalization due to their reliance on limited datasets. While diffusion models offer a path for data augmentation, their direct application can lead to semantic misalignment or ineffective guidance. This paper introduces Diffusion-Classifier Synergy (DCS), a novel framework that establishes a mutual boosting loop between a diffusion model and an FSCIL classifier. DCS utilizes a reward-aligned learning strategy, where a dynamic, multi-faceted reward function derived from the classifier's state directs the diffusion model. This reward system operates at two levels: the feature level ensures semantic coherence and diversity using prototype-anchored maximum mean discrepancy and dimension-wise variance matching, while the logits level promotes exploratory image generation and enhances inter-class discriminability through confidence recalibration and cross-session confusion-aware mechanisms. This co-evolutionary process, where generated images refine the classifier and an improved classifier state yields better reward signals, demonstrably achieves state-of-the-art performance on FSCIL benchmarks, significantly enhancing both knowledge retention and new class learning.

Basic Information

Paper ID: 2510.03608
Title: Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
Authors: Ruitao Wu, Yifan Zhao, Guangyao Chen, Jia Li
Category: cs.CV
Venue: NeurIPS 2025
Paper link: https://arxiv.org/abs/2510.03608

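Abstract

Few-Shot Class-Incremental Learning (FSCIL) requires a model to learn new classes sequentially from very few samples without forgetting what it learned before. The stability-plasticity dilemma and data scarcity make this hard, and current FSCIL methods generalize poorly because they depend on limited datasets. Diffusion models offer a route to data augmentation, but applying them directly can produce semantic misalignment or ineffective guidance. The paper proposes the Diffusion-Classifier Synergy (DCS) framework, which sets up a mutual boosting loop between a diffusion model and an FSCIL classifier. DCS uses a reward-aligned learning strategy in which a dynamic, multi-faceted reward function derived from the classifier's state guides the diffusion model. The reward system works at two levels. At the feature level, prototype-anchored maximum mean discrepancy and dimension-wise variance matching ensure semantic coherence and diversity. At the logits level, confidence recalibration and a cross-session confusion-aware mechanism promote exploratory image generation and strengthen inter-class discriminability. In this co-evolutionary process, generated images refine the classifier and the improved classifier state yields better reward signals. The method achieves state-of-the-art performance on FSCIL benchmarks and significantly improves both knowledge retention and new-class learning.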

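Research Background and Motivation

Problem Definition

Few-Shot Class-Incremental Learning (FSCIL) is a highly challenging task that places three requirements on the model:

Sequential learning: learn new classes from a continuous data stream
Few-shot constraint: each new class has only a few training samples (typically 5-shot)
Forgetting avoidance: retain knowledge of previously learned classes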

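Core Challenges

Stability-plasticity dilemma: balancing the learning of new knowledge against the retention of old knowledge
Data scarcity: the very small number of samples per new class makes empirical risk minimization unreliable
Insufficient generalization: existing methods rely too heavily on the limited initial dataset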

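Limitations of Existing Methods

Existing FSCIL approaches, in particular those that apply diffusion models directly for augmentation, have two main problems:

Semantic misalignment and insufficient diversity: images produced by using a diffusion model directly may drift semantically from the target class or lack diversity
Missing feedback mechanism: the diffusion model has no way to adjust its output to the classifier's current state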

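Core Contributions

DCS framework: introduces a mutual boosting loop between a diffusion model and an FSCIL classifier, with reward-aligned generation realized through the DAS algorithm
Multi-level reward design: a multi-faceted reward function that operates at the feature level and the logits level
- Feature level: ensures semantic consistency and promotes intra-class diversity
- Logits level: guides the generation of exploratory, generalizable intra-class images and strengthens inter-class discriminability
State-of-the-art performance: achieves state-of-the-art results on FSCIL benchmark datasets, with significant gains in both old-class knowledge retention and new-class learning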

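Method Details

Task Definition

FSCIL involves learning sequentially from a continuous data stream $D_{train} = \{D^t_{train}\}^T_{t=0}$, where:

Each session $t$ introduces training samples $(x_i, y_i)$ from a new, disjoint class set $C_t$
The base session $(t=0)$ has ample data, while incremental sessions $(t>0)$ follow the N-way K-shot format
After training on $D^t_{train}$, the model is evaluated on all classes seen so far, $C^t_{seen} = \bigcup^t_{s=0} C_s$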
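The session structure above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `make_sessions` and `seen_classes` are hypothetical, and the split used in the example (100 classes, 60 base classes, 5-way sessions) is an assumed configuration chosen for illustration, not one stated in this summary.

```python
from typing import List


def make_sessions(num_classes: int, base_classes: int, way: int) -> List[List[int]]:
    """Split class ids 0..num_classes-1 into a base session and N-way incremental sessions."""
    if (num_classes - base_classes) % way != 0:
        raise ValueError("incremental classes must divide evenly into N-way sessions")
    sessions = [list(range(base_classes))]  # C_0: the base session
    for start in range(base_classes, num_classes, way):
        sessions.append(list(range(start, start + way)))  # C_t for t > 0
    return sessions


def seen_classes(sessions: List[List[int]], t: int) -> List[int]:
    """C^t_seen: union of C_0..C_t. The sets are disjoint, so concatenation suffices."""
    return [c for session in sessions[: t + 1] for c in session]


# Assumed illustrative split: 100 classes, 60 in the base session, 5-way increments.
sessions = make_sessions(num_classes=100, base_classes=60, way=5)
print(len(sessions) - 1)               # number of incremental sessions T -> 8
print(len(seen_classes(sessions, 3)))  # classes evaluated after session 3 -> 75
```

The evaluation set grows with every session, which is why forgetting of earlier classes shows up directly in the per-session accuracy.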

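Model Architecture

Mutual Boosting Loop Mechanism

The core idea of DCS is to establish bidirectional feedback between the diffusion model and the classifier:

Reward computation: compute several reward components $R_i$ from the output of the classifier $\sigma$ (parameters $\theta$) on a generated image $x$
Diffusion model optimization: $\phi^* = \arg\max_\phi \sum_i R_i(\sigma_\theta(D(x;\phi)))$
Classifier improvement: $\theta^* = \arg\min_\theta L_{cls}(\sigma_\theta; x \cup D(x;\phi^*), y)$

Here $D(x;\phi)$ denotes the output of the diffusion model with parameters $\phi$, not the training set $D_{train}$.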
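The three steps can be read as one round of an alternating procedure. The sketch below is a toy illustration under strong assumptions, not the authors' implementation: "images" are single floats, the classifier state $\theta$ is a single prototype value, and the $\arg\max$ over $\phi$ is replaced by a search over a small candidate list (the paper instead uses the DAS algorithm, which is not reproduced here). All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Reward = Callable[[float, List[float]], float]  # R_i(theta, generated batch)


def mutual_boosting_round(
    theta: float,
    real_x: List[float],
    candidate_phis: Sequence[float],
    rewards: Sequence[Reward],
    generate: Callable[[float], List[float]],
    fit: Callable[[List[float]], float],
) -> Tuple[float, float]:
    """One round: pick phi* by summed reward, then refit the classifier."""

    # Steps 1-2: score each candidate phi under the current classifier state
    # theta and keep the best one (a stand-in for the argmax over phi).
    def score(phi: float) -> float:
        batch = generate(phi)
        return sum(reward(theta, batch) for reward in rewards)

    phi_star = max(candidate_phis, key=score)
    # Step 3: refit the classifier on real plus generated samples.
    theta_new = fit(real_x + generate(phi_star))
    return phi_star, theta_new


# Toy, hypothetical components for a single class.
def generate(phi: float) -> List[float]:
    return [phi - 0.1, phi, phi + 0.1]  # "generator" centred at phi


def closeness(theta: float, batch: List[float]) -> float:
    return -sum((x - theta) ** 2 for x in batch) / len(batch)


def fit(xs: List[float]) -> float:
    return sum(xs) / len(xs)  # "classifier" = class prototype (mean)


real_x = [1.0, 1.2]
theta0 = fit(real_x)
phi_star, theta1 = mutual_boosting_round(
    theta0, real_x, [0.0, 1.0, 2.0], [closeness], generate, fit
)
print(phi_star, round(theta1, 2))  # 1.0 1.04
```

Calling `mutual_boosting_round` repeatedly with the updated `theta` mirrors the loop: each refit changes the reward landscape that the next generator update sees.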

Feature-Level Reward Design

1. Prototype-Anchored Maximum Mean Discrepancy reward ($R_{PAMMD}$)

$R_{PAMMD}(x_{gen}, I^{(c,N)}_{gen}) = -\alpha \frac{1}{N^2}\sum_{i=1}^N\sum_{j=1}^N k(z_i,z_j) + \beta \frac{1}{N}\sum_{i=1}^N k(z_i,\mu_c)$
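The formula can be evaluated directly once a kernel is chosen. The sketch below is a minimal illustration under stated assumptions: it takes $z_i$ to be feature vectors of the $N$ generated images and $\mu_c$ to be the class prototype, and it uses an RBF kernel with bandwidth parameter `gamma`, which is an assumption made here and not something this summary specifies. The names `rbf_kernel` and `pammd_reward` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """k(a, b) = exp(-gamma * ||a - b||^2). The kernel choice is an assumption."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def pammd_reward(
    z: Sequence[Vector],
    mu_c: Vector,
    alpha: float = 1.0,
    beta: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """-alpha * mean_{i,j} k(z_i, z_j) + beta * mean_i k(z_i, mu_c)."""
    n = len(z)
    # First term: average pairwise similarity among generated features.
    pairwise = sum(rbf_kernel(zi, zj, gamma) for zi in z for zj in z) / (n * n)
    # Second term: average similarity of each feature to the class prototype.
    anchor = sum(rbf_kernel(zi, mu_c, gamma) for zi in z) / n
    return -alpha * pairwise + beta * anchor


prototype = [0.0, 0.0]
spread = [[1.0, 0.0], [-1.0, 0.0]]    # same distance to prototype, diverse
collapsed = [[1.0, 0.0], [1.0, 0.0]]  # same distance to prototype, identical
print(pammd_reward(spread, prototype) > pammd_reward(collapsed, prototype))  # True
```

With the prototype term held equal, the spread-out set scores higher because the first term penalizes features that are similar to one another, while the second term keeps them anchored to the class prototype.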