
SAP: Corrective Machine Unlearning with Scaled Activation Projection for Label Noise Robustness

Kodge, Ravikumar, Saha et al.

Label corruption, where training samples are mislabeled due to non-expert annotation or adversarial attacks, significantly degrades model performance. Acquiring large, perfectly labeled datasets is costly, and retraining models from scratch is computationally expensive. To address this, we introduce Scaled Activation Projection (SAP), a novel SVD (Singular Value Decomposition)-based corrective machine unlearning algorithm. SAP mitigates label noise by identifying a small subset of trusted samples using cross-entropy loss and projecting model weights onto a clean activation space estimated using SVD on these trusted samples. This process suppresses the noise introduced in activations due to the mislabeled samples. In our experiments, we demonstrate SAP's effectiveness on synthetic noise with different settings and on real-world label noise. SAP applied to the CIFAR dataset with 25% synthetic corruption shows up to 6% generalization improvement. Additionally, SAP improves generalization over noise-robust training approaches on the CIFAR dataset by ~3.2% on average. Further, we observe a generalization improvement of 2.31% for a Vision Transformer model trained on naturally corrupted Clothing1M.



Basic Information

Paper ID: 2403.08618
Title: SAP: Corrective Machine Unlearning with Scaled Activation Projection for Label Noise Robustness
Authors: Sangamesh Kodge, Deepak Ravikumar, Gobinda Saha, Kaushik Roy (Purdue University)
Categories: cs.LG cs.AI stat.ML
Published: January 2, 2025 (arXiv v2)
Paper link: https://arxiv.org/abs/2403.08618
Code link: https://github.com/sangamesh-kodge/SAP.git

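Abstract

Label corruption is an important problem in deep learning: training samples mislabeled through non-expert annotation or adversarial attacks significantly degrade model performance. Acquiring large, perfectly labeled datasets is costly, and retraining models from scratch is computationally expensive. To address this, the paper proposes Scaled Activation Projection (SAP), a corrective machine unlearning algorithm based on Singular Value Decomposition (SVD). SAP mitigates label noise by using cross-entropy loss to identify a small set of trusted samples and by projecting the model weights onto a clean activation space estimated with SVD on those samples. Experiments show that SAP yields generalization improvements of up to 6% on the CIFAR dataset with 25% synthetic corruption, about 3.2% on average on top of noise-robust training methods, and 2.31% for a Vision Transformer model trained on the naturally corrupted Clothing1M dataset.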
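
The pipeline described in the abstract (pick low-loss samples, estimate an activation space with SVD, project the weights) can be illustrated with a minimal NumPy sketch for a single linear layer. This is not the authors' implementation: the function names, the `alpha` parameter, and the particular scaling formula are assumptions chosen for illustration, since this summary does not state how SAP scales the singular directions or how it handles multiple layers.

```python
import numpy as np


def select_trusted(losses, k):
    """Return indices of the k samples with the lowest cross-entropy loss.

    Assumption: "trusted" is approximated here by "lowest loss".
    """
    return np.argsort(losses)[:k]


def scaled_projection(acts, alpha=10.0):
    """Build a scaled projector onto the activation space of trusted samples.

    acts:  (n_samples, d_in) input activations of one linear layer.
    alpha: hypothetical scaling knob (not taken from the paper).
    """
    # Rows of vt are orthonormal directions in the input-activation space.
    _, s, vt = np.linalg.svd(acts, full_matrices=False)
    # Fraction of total activation energy carried by each direction.
    energy = s**2 / np.sum(s**2)
    # Assumed scaling: 0 for empty directions, approaching 1 for dominant ones.
    scale = alpha * energy / ((alpha - 1.0) * energy + 1.0)
    return vt.T @ np.diag(scale) @ vt


def project_weights(weight, projector):
    """Single-shot update of a (d_out, d_in) weight matrix.

    Input directions absent from the trusted activations are suppressed.
    """
    return weight @ projector
```

In use, one would compute per-sample losses with the trained model, pass the activations of the samples returned by `select_trusted` to `scaled_projection`, and replace each layer's weight with the output of `project_weights`. No retraining step is involved, which matches the single-update character described in the abstract.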

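Research Background and Motivation

Problem Definition

The label noise problem: label errors are common in large-scale datasets and can originate from:
- Human annotation errors
- Misjudgments by automatic labeling systems (such as large language models)
- Malicious data poisoning attacks
Limitations of existing solutions:
- Data cleaning methods: require retraining the model, which is computationally expensive
- Noise-robust training: improves robustness but cannot fully close the performance gap
- Conventional machine unlearning: requires explicitly distinguishing mislabeled samples from hard-to-learn samples, which is difficult in practice
Research motivation:
- Avoid the high computational cost of retraining from scratch
- Remove the need to explicitly identify mislabeled samples
- Achieve efficient noise mitigation through a single weight update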