
TARD: Test-time Domain Adaptation for Robust Fault Detection under Evolving Operating Conditions

Sun, Fink

Fault detection is essential in complex industrial systems to prevent failures and optimize performance by distinguishing abnormal from normal operating conditions. With the growing availability of condition monitoring data, data-driven approaches have been increasingly applied to detecting system faults. However, these methods typically require large, diverse, and representative training datasets that capture the full range of operating scenarios, an assumption rarely met in practice, particularly in the early stages of deployment. Industrial systems often operate under highly variable and evolving conditions, making it difficult to collect comprehensive training data. This variability results in a distribution shift between training and testing data, as future operating conditions may diverge from those previously observed. Such domain shifts hinder the generalization of traditional models, limiting their ability to transfer knowledge across time and system instances and ultimately leading to performance degradation in practical deployments. To address these challenges, we propose a novel method for continuous test-time domain adaptation, designed to support robust early-stage fault detection in the presence of domain shifts and limited representativeness of training data. Our proposed framework, Test-time domain Adaptation for Robust fault Detection (TARD), explicitly separates input features into system parameters and sensor measurements. It employs a dedicated domain adaptation module to adapt to each input type using different strategies, enabling more targeted and effective adaptation to evolving operating conditions. We validate our approach on two real-world case studies from multi-phase flow facilities, where it delivers substantial improvements in both fault detection accuracy and model robustness over existing domain adaptation methods under real-world variability.



Basic information

Paper ID: 2507.16354
Title: TARD: Test-time Domain Adaptation for Robust Fault Detection under Evolving Operating Conditions
Authors: Han Sun, Olga Fink (EPFL)
Category: stat.AP (Statistics - Applications)
Published: October 13, 2025 (arXiv v2)
Paper link: https://arxiv.org/abs/2507.16354

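Data scarcity: Industrial systems, especially newly deployed or refurbished equipment, lack comprehensive historical data, and fault data in particular is extremely scarce.
Domain shift: Operating conditions differ significantly between equipment units and, within the same system, over time, which violates the i.i.d. assumption of traditional machine learning.
Dynamic environments: Industrial systems operate in continuously evolving environments and therefore need continuous adaptation rather than discrete domain adaptation.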

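Research significance

Early fault detection is essential for optimizing system performance, minimizing maintenance costs, and reducing asset unavailability.
Under distribution shift, existing methods are prone to high false-alarm rates and reduced detection accuracy.
Fleet-level knowledge transfer is needed so that experience from data-rich systems can be carried over to new, data-scarce systems.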

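Limitations of existing methods

Traditional domain adaptation methods: Require large amounts of source-domain and target-domain data, and usually labeled fault data as well.
Static adaptation: Most methods assume discrete, static domain characteristics and cannot handle continuously evolving operating conditions.
Test-time adaptation risk: Existing TTA methods may mistakenly adapt to fault patterns and treat them as normal behavior.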

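Core contributions

TARD framework: A continuous test-time domain adaptation framework designed for unsupervised fault detection that requires no labeled fault data.
Feature separation strategy: Explicitly splits the input variables into control parameters and sensor measurements and applies a dedicated adaptation strategy to each group.
Practical framework: Needs only a small number of normal samples from the target system, which suits early deployment and fleet-level knowledge transfer.
Empirical validation: Validated on two real-world case studies from multiphase flow facilities.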

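Method details

Task definition

Given:

Abundant healthy training data from the source system: $X^s = [x^s_1, \cdots, x^s_n]$
Limited normal data from the target domain: $X^t = [x^t_1, \cdots, x^t_m]$

Goal: achieve robust fault detection in the target domain $t$, given that:

Neither domain provides fault data for training
Target-domain data availability is limited
The distribution shifts continuously during inference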

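System variable categories

The input data is split into two groups: $X = [x, w]$

Control variables $w$: system operating conditions set by the operator or the control system
Sensor measurements $x$: sensor signals that monitor system components and reflect the real-time system state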
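The split $X = [x, w]$ can be illustrated with a minimal sketch. The column names and the helper `split_inputs` are hypothetical and chosen only for illustration; the source states only that control variables are set by the operator or control system and that the remaining inputs are sensor signals.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical column names (an assumption for this example, not from the paper).
CONTROL_COLUMNS = ("air_flow_setpoint", "water_flow_setpoint")


def split_inputs(
    rows: Sequence[Dict[str, float]],
    control_columns: Sequence[str] = CONTROL_COLUMNS,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Split each record into sensor measurements x and control variables w."""
    if not rows:
        return [], []
    # Every column that is not a control variable is treated as a sensor signal.
    sensor_columns = [c for c in rows[0] if c not in control_columns]
    x = [[row[c] for c in sensor_columns] for row in rows]
    w = [[row[c] for c in control_columns] for row in rows]
    return x, w


rows = [
    {"air_flow_setpoint": 120.0, "water_flow_setpoint": 2.0, "pressure": 1.3, "level": 0.52},
    {"air_flow_setpoint": 150.0, "water_flow_setpoint": 3.5, "pressure": 1.6, "level": 0.48},
]
x, w = split_inputs(rows)
```

Keeping the two groups in separate arrays makes it straightforward to route them to different adaptation strategies later on.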

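Input: control variables $w$ and the prediction of the pretrained autoencoder
Output: compensation term $\Delta x$
Design rationale: avoid adapting to a potentially faulty data distribution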

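3. Key technical features

Frozen main model: The pretrained autoencoder $f_\theta$ stays frozen during the adaptation phase.
AdaBN layer: An adaptive batch normalization layer is integrated into the adaptation module and updates its mean and variance from batch statistics.
Separated adaptation: Adaptation is applied only to the control variables, which preserves the anomaly detection capability on the sensor measurements.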
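The AdaBN idea in the list above, re-estimating normalization statistics from target batches, can be sketched in plain Python. This is an illustrative simplification under stated assumptions and not the authors' module: the class name `AdaBN1d` is hypothetical, there are no learnable scale or shift parameters, and the statistics are replaced per batch rather than tracked as a running average.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


class AdaBN1d:
    """Batch normalization whose statistics are re-estimated from target batches.

    Hypothetical helper for illustration only.
    """

    def __init__(self, num_features: int, eps: float = 1e-5) -> None:
        self.mean = [0.0] * num_features
        self.var = [1.0] * num_features
        self.eps = eps

    def adapt(self, batch: Sequence[Sequence[float]]) -> None:
        """Replace the stored statistics with those of the given batch."""
        columns = list(zip(*batch))
        self.mean = [fmean(col) for col in columns]
        # Population (biased) variance, as batch normalization uses.
        self.var = [pvariance(col) for col in columns]

    def __call__(self, batch: Sequence[Sequence[float]]) -> List[List[float]]:
        """Normalize each feature with the currently stored statistics."""
        return [
            [(v - m) / (s + self.eps) ** 0.5 for v, m, s in zip(row, self.mean, self.var)]
            for row in batch
        ]


# Control variables w from a shifted target operating regime (made-up values).
target_w = [[100.0, 2.0], [120.0, 3.0], [140.0, 4.0]]

layer = AdaBN1d(num_features=2)
layer.adapt(target_w)          # only the normalization statistics change
normalized = layer(target_w)   # zero-mean features for the adaptation module
```

Because only the statistics of the control-variable branch are updated, a frozen autoencoder in this setup would never be fitted to sensor readings that might already contain a fault.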

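Monitored variables: 24 process variables (pressure, flow rate, level, density, temperature, valve position)
Control variables: air and water flow-rate set points
Fault types: 6 (air line blockage, water line blockage, top separator input blockage, open direct bypass, slugging conditions, pressurization of the 2-inch line)
Sampling frequency: 1 Hz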

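2. PRONTO heterogeneous benchmark dataset

Monitored variables: 15 process variables
Operating conditions: 20 different combinations of air and water flow rates
Fault types: 3 (air leakage, air blockage, diverted flow)
Sampling frequency: 1 Hz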

Evaluation metrics

Accuracy: overall fraction of correct predictions
F1 score: harmonic mean of precision and recall
AUC: area under the ROC curve
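The three metrics above can be computed as in the following minimal sketch. It assumes binary labels with 1 for a fault and 0 for normal operation, and an anomaly score where higher means more likely faulty; the labels, scores, and threshold are made-up example values, and the function names are illustrative.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive (fault) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a fault outscores a normal sample (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]       # e.g. reconstruction errors
y_pred = [1 if s > 0.3 else 0 for s in scores]  # hypothetical threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and is threshold-free.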

Compared methods

Baseline: a model trained on the source domain only
AdaBN: adaptive batch normalization
MMD: maximum mean discrepancy
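For readers unfamiliar with the MMD baseline, the quantity it minimizes can be sketched as a biased estimate of the squared maximum mean discrepancy between source and target samples. The RBF kernel and the bandwidth `gamma` are assumptions made for this example; the source does not say which kernel the compared method uses.

```python
import math
from typing import Sequence

Sample = Sequence[float]


def rbf_kernel(a: Sample, b: Sample, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two feature vectors."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def mmd_squared(xs: Sequence[Sample], xt: Sequence[Sample], gamma: float = 1.0) -> float:
    """Biased estimate of squared MMD between source xs and target xt."""

    def mean_kernel(p: Sequence[Sample], q: Sequence[Sample]) -> float:
        return sum(rbf_kernel(a, b, gamma) for a in p for b in q) / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(xt, xt) - 2.0 * mean_kernel(xs, xt)


source = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
target = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]  # shifted operating regime (made up)

same = mmd_squared(source, source)
shifted = mmd_squared(source, target)
```

The estimate is zero when both sample sets coincide and grows as the two distributions move apart, which is why it is commonly used as an alignment loss in domain adaptation.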

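Implementation details

Optimizer: Adam, learning rate 1e-5
Batch size: 128
Training epochs: 500 for the autoencoder, 50 for the adaptation module
Architecture: encoder and decoder each with 3 fully connected layers, dimensions 50-50-10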
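The settings above can be collected in a small configuration object, together with the layer shapes they imply. Two things here are assumptions rather than facts from the source: that the decoder mirrors the encoder (10-50-50 back to the input size), and that the input dimension is 24, which is borrowed from the 24 monitored variables of the first case study. The names `TrainingConfig` and `layer_shapes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrainingConfig:
    optimizer: str = "adam"
    learning_rate: float = 1e-5
    batch_size: int = 128
    autoencoder_epochs: int = 500
    adaptation_epochs: int = 50
    encoder_dims: Tuple[int, ...] = (50, 50, 10)


def layer_shapes(input_dim: int, cfg: TrainingConfig) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) for every fully connected layer.

    Assumes a mirrored decoder, which the source does not state explicitly.
    """
    encoder = [input_dim, *cfg.encoder_dims]
    decoder = [*reversed(cfg.encoder_dims), input_dim]
    return list(zip(encoder, encoder[1:])) + list(zip(decoder, decoder[1:]))


cfg = TrainingConfig()
shapes = layer_shapes(input_dim=24, cfg=cfg)  # input_dim=24 is an assumption
# Weights plus biases of all six layers under these assumptions.
num_parameters = sum(n_in * n_out + n_out for n_in, n_out in shapes)
```

Under these assumptions the network has three encoder and three decoder layers and fewer than ten thousand parameters, which is consistent with a setting where only limited target data is available.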

Fault type	Baseline (F1)	AdaBN (F1)	MMD (F1)	TARD (F1)
Air line blockage	0.43	0.43	0.47	0.70
Water line blockage	0.67	0.62	0.69	0.76
Top separator blockage	0.63	0.65	0.64	0.79
Open direct bypass	0.53	0.60	0.56	0.69
Slugging conditions	0.85	0.88	0.89	0.92
Pressurization of the 2-inch line	0.94	0.98	1.00	1.00
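As a quick summary of the table above, the per-method F1 scores can be averaged over the six fault types. The unweighted mean is an illustrative choice made here and not a figure reported in the source.

```python
from statistics import fmean

# F1 scores copied from the table, in row order (six fault types).
f1_by_method = {
    "Baseline": [0.43, 0.67, 0.63, 0.53, 0.85, 0.94],
    "AdaBN": [0.43, 0.62, 0.65, 0.60, 0.88, 0.98],
    "MMD": [0.47, 0.69, 0.64, 0.56, 0.89, 1.00],
    "TARD": [0.70, 0.76, 0.79, 0.69, 0.92, 1.00],
}

# Unweighted mean F1 per method, rounded to three decimals.
mean_f1 = {method: round(fmean(values), 3) for method, values in f1_by_method.items()}

# Number of fault types on which TARD is at least as good as every other method.
tard_best_count = sum(
    all(f1_by_method["TARD"][i] >= f1_by_method[m][i] for m in f1_by_method)
    for i in range(6)
)
```

With the values as listed, the unweighted means are roughly 0.675 (Baseline), 0.693 (AdaBN), 0.708 (MMD), and 0.810 (TARD), and TARD matches or exceeds the other methods on all six fault types.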