2025-11-25T18:04:18.517311

COGNOS: Universal Enhancement for Time Series Anomaly Detection via Constrained Gaussian-Noise Optimization and Smoothing

Shang, Chang

Reconstruction-based methods are a dominant paradigm in time series anomaly detection (TSAD); however, their near-universal reliance on Mean Squared Error (MSE) loss results in statistically flawed reconstruction residuals. This fundamental weakness leads to noisy, unstable anomaly scores with a poor signal-to-noise ratio, hindering reliable detection. To address this, we propose Constrained Gaussian-Noise Optimization and Smoothing (COGNOS), a universal, model-agnostic enhancement framework that tackles this issue at its source. COGNOS introduces a novel Gaussian-White Noise Regularization strategy during training, which directly constrains the model's output residuals to conform to a Gaussian white noise distribution. This engineered statistical property creates the ideal precondition for our second contribution: a Kalman Smoothing Post-processor that provably operates as a statistically optimal estimator to denoise the raw anomaly scores. The synergy between these two components allows COGNOS to robustly separate the true anomaly signal from random fluctuations. Extensive experiments demonstrate that COGNOS is highly effective, delivering an average F-score uplift of 57.9% when applied to 12 diverse backbone models across multiple real-world benchmark datasets. Our work reveals that directly regularizing output statistics is a powerful and generalizable strategy for significantly improving anomaly detection systems.

Basic Information

Paper ID: 2511.06894
Title: COGNOS: Universal Enhancement for Time Series Anomaly Detection via Constrained Gaussian-Noise Optimization and Smoothing
Authors: Wenlong Shang, Peng Chang (Beijing University of Technology)
Categories: cs.LG cs.AI
Submission Date: November 10, 2025 (arXiv)
Paper Link: https://arxiv.org/abs/2511.06894

Abstract

This paper addresses a fundamental issue in reconstruction-based methods for time series anomaly detection (TSAD): statistically defective reconstruction residuals caused by MSE loss. We propose the COGNOS framework, which directly constrains model output residuals to follow a Gaussian white noise distribution through Gaussian white noise regularization (GWNR) during training, combined with a Kalman smoothing post-processor for optimal denoising. Across 12 different backbone models and multiple real-world datasets, COGNOS achieves an average F-score improvement of 57.9%, demonstrating that direct regularization of output statistical properties is a powerful and generalizable strategy.

Research Background and Motivation

1. Core Problem

Time series anomaly detection is critical in industrial manufacturing monitoring, financial system security, and IT infrastructure maintenance. Reconstruction-based self-supervised methods have become the mainstream paradigm but suffer from fundamental defects:

Statistically defective residuals: Reconstruction residuals from standard MSE training exhibit undesirable statistical properties (non-Gaussian, temporal correlations)
Low signal-to-noise ratio: Original anomaly scores are noisy and unstable, making it difficult to distinguish true anomalies from random fluctuations
Incomplete modeling: Models fail to fully separate deterministic patterns from random noise

2. Problem Significance

As shown in Figure 1, standard MSE-trained Transformers on the SWaT dataset exhibit three critical issues:

Anomaly scores are highly noisy with poor signal-to-noise ratio
Q-Q plots reveal strongly non-Gaussian residuals
Autocorrelation plots show significant temporal correlations in residuals

These statistical defects directly impact anomaly detection performance, resulting in high false positive and false negative rates.

3. Limitations of Existing Methods

Contrastive learning methods: While capable of learning more discriminative representations, they are typically coupled with specific architectures and do not directly address the statistical properties of final residuals
Filtering and regularization techniques:
- Methods integrating filters create new hybrid architectures lacking generality
- Latent space regularization (e.g., SVD, periodicity consistency) does not directly act on output residuals
Lack of theoretically optimal post-processing solutions

4. Research Motivation

This paper addresses the problem at its source: directly engineering the statistical properties of output residuals to create ideal preconditions for subsequent optimal denoising.

Core Contributions

Proposes a Gaussian White Noise Regularization (GWNR) strategy: the first to directly constrain reconstruction residuals to follow a Gaussian white noise distribution, which sets it apart from existing representation-focused contrastive methods
Designs Kalman smoothing post-processor: Works synergistically with GWNR to achieve theoretically optimal denoising by leveraging engineered residual properties, significantly improving anomaly score stability
Demonstrates model-agnostic effectiveness:
- Universal enhancement framework applicable to any reconstruction model
- Average F-score improvement of 57.9% across 12 different architectures (attention-based, time-frequency fusion, CNN-MLP)
- Validation on 4 real-world benchmark datasets (MSL, SMAP, SWaT, PSM)
Reveals a new improvement direction: Shows that directly regularizing output statistical properties is a powerful and generalizable strategy, complementary to architectural or representation-level improvements

Method Details

Task Definition

Input: Multivariate time series $\mathbf{x} \in \mathbb{R}^{L \times D}$ (length $L$, dimension $D$)
Training: Learn data manifold using only normal data
Output: Anomaly score for each time point to identify deviations from normal patterns
Objective: Generate high signal-to-noise ratio, statistically optimal anomaly scores

Model Architecture

COGNOS is a two-stage framework (Figure 2):

Stage 1: Training Phase - Gaussian White Noise Regularization (GWNR)

Overall objective function: $L_{Total} = L_{AWL}(L_{MSE}, L_{MMD}, L_{ACF})$

where the Automatic Weighted Loss (AWL) dynamically balances the three component losses.
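The summary does not spell out the AWL formula. As a rough sketch only, one widely used automatic weighting scheme gives each loss a learnable positive weight $w_i$ and adds a log penalty so the weights cannot grow without bound. The function name, the exact formula, and the numeric loss values below are assumptions made for illustration, not details confirmed by the paper.

```python
import math

def automatic_weighted_loss(losses, weights):
    """Combine losses with learnable positive weights (assumed form).

    total = sum_i [ L_i / (2 * w_i^2) + log(1 + w_i^2) ]
    The log term discourages the trivial solution of making every w_i huge.
    """
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per loss")
    return sum(loss / (2.0 * w * w) + math.log1p(w * w)
               for loss, w in zip(losses, weights))

# Hypothetical values standing in for L_MSE, L_MMD, L_ACF; all weights start at 1.
total = automatic_weighted_loss([0.20, 0.05, 0.01], [1.0, 1.0, 1.0])
```

In a training loop the weights would be optimized jointly with the model parameters, so a component whose loss is large or noisy is down-weighted automatically.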

1. Reconstruction Loss ($L_{MSE}$): $L_{MSE} = \frac{1}{|R|}\sum_{r \in R} r^2$, where $R = \mathbf{x} - \hat{\mathbf{x}}$ is the reconstruction residual and the sum runs over its individual entries $r$. This term ensures high-fidelity reconstruction.

2. Gaussianity Regularization ($L_{MMD}$): Uses Maximum Mean Discrepancy (MMD) to pull the residual distribution toward a target Gaussian distribution $\mathcal{N}(0, \sigma^{*2})$, with $S$ denoting a set of samples drawn from that target:

$L_{MMD} = \frac{1}{|R|^2}\sum_{p_i,p_j \in R}\kappa(p_i, p_j) + \frac{1}{|S|^2}\sum_{q_i,q_j \in S}\kappa(q_i, q_j) - \frac{2}{|R||S|}\sum_{p_i \in R, q_j \in S}\kappa(p_i, q_j)$

The kernel is a multi-bandwidth RBF: $\kappa(a,b) = \sum_{j=1}^M \exp\left(-\frac{\|a-b\|^2}{2(B_j\sigma^*)^2}\right)$

The bandwidth multipliers are $\{B_j\} = \{0.1, 0.5, 1.0, 2.0, 5.0\}$ (so $M = 5$), and $\sigma^* = e^\omega$, where $\omega$ is a learnable parameter; the exponential keeps $\sigma^*$ positive.

Innovation points:

Non-parametric method with strong robustness
Adaptively learns noise level
Penalizes systematic bias and complex structures
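To make the $L_{MMD}$ formula concrete, the following is a minimal pure-Python sketch of the biased MMD estimate with the multi-bandwidth RBF kernel, applied to scalar residuals. The function names, the sample size of 100, and the fixed $\sigma^* = 1$ are assumptions for illustration; in training, $\sigma^*$ would be learned and the computation would run on tensors.

```python
import math
import random

BANDWIDTH_MULTIPLIERS = (0.1, 0.5, 1.0, 2.0, 5.0)  # {B_j} from the text

def rbf_mix(a, b, sigma):
    """Multi-bandwidth RBF kernel: sum_j exp(-(a - b)^2 / (2 (B_j sigma)^2))."""
    d2 = (a - b) ** 2
    return sum(math.exp(-d2 / (2.0 * (m * sigma) ** 2))
               for m in BANDWIDTH_MULTIPLIERS)

def mmd_loss(residuals, target, sigma):
    """Biased squared-MMD estimate between residuals R and target samples S."""
    def mean_kernel(xs, ys):
        total = sum(rbf_mix(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(residuals, residuals)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(residuals, target))

rng = random.Random(0)
sigma = 1.0  # fixed here; learnable (sigma* = exp(omega)) in the framework
target = [rng.gauss(0.0, sigma) for _ in range(100)]        # S ~ N(0, sigma^2)
gaussian_res = [rng.gauss(0.0, sigma) for _ in range(100)]  # well-behaved residuals
biased_res = [r + 2.0 for r in gaussian_res]                # systematic bias added

loss_gaussian = mmd_loss(gaussian_res, target, sigma)
loss_biased = mmd_loss(biased_res, target, sigma)
```

Residuals carrying a constant offset produce a noticeably larger loss than zero-mean Gaussian residuals, which is the sense in which the term penalizes systematic bias.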

3. White Noise Regularization ($L_{ACF}$): Penalizes temporal correlations by summing squared autocorrelation coefficients for the first 10 lags:

$L_{ACF} = \sum_{k \in N_{lag}} \mathbb{E}_{b,d}[(\rho_{k,b,d})^2]$

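where the autocorrelation coefficient at lag $k$ for batch element $b$ and feature dimension $d$ is $\rho_{k,b,d} = \frac{\sum_{l=k+1}^L (r_{b,l,d} - \mu_{b,d})(r_{b,l-k,d} - \mu_{b,d})}{\sum_{l=1}^L (r_{b,l,d} - \mu_{b,d})^2}$, with $\mu_{b,d}$ the mean of that residual series over time.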

Design rationale: Empirical observation shows most significant correlations occur at early lags; $N_{lag}=\{1,...,10\}$ balances effectiveness and computational cost.
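The autocorrelation penalty can be sketched for a single residual series, that is, one $(b, d)$ slice; averaging the result over batch elements and feature dimensions would give the expectation in $L_{ACF}$. The function name and the AR(1) test series are illustrative assumptions.

```python
import random

def acf_loss(residuals, max_lag=10):
    """Sum of squared autocorrelation coefficients at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    if denom == 0.0:
        return 0.0  # constant series: treat every correlation as zero
    total = 0.0
    for k in range(1, max_lag + 1):
        # 0-indexed equivalent of the sum over l = k+1 .. L in the formula
        rho_k = sum(centered[l] * centered[l - k] for l in range(k, n)) / denom
        total += rho_k ** 2
    return total

rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]  # uncorrelated residuals
ar1 = [0.0]
for e in white[1:]:
    ar1.append(0.9 * ar1[-1] + e)                  # strongly correlated residuals

loss_white = acf_loss(white)
loss_ar1 = acf_loss(ar1)
```

White noise yields a loss near zero, while the correlated AR(1) series is penalized heavily, so minimizing this term pushes residuals toward temporal independence.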

Stage 2: Inference Phase - Kalman Smoothing Post-Processor

Theoretical foundation: The Kalman filter is the optimal linear estimator when the noise process is zero-mean and uncorrelated (white), and it is the optimal estimator overall when that noise is also Gaussian. GWNR is designed to push the residuals toward exactly these conditions.
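As a generic illustration of the denoising step, the sketch below runs a scalar Kalman filter followed by a Rauch-Tung-Striebel backward pass on a noisy score sequence. The random-walk state model, the values of `q` and `r`, and the function name are assumptions chosen for illustration and are not necessarily the state space model the paper specifies.

```python
import random

def kalman_smooth(scores, q=1e-3, r=1.0):
    """Scalar Kalman filter + RTS smoother under an assumed random-walk model.

    x_t = x_{t-1} + w_t,  w_t ~ N(0, q)   (latent score level)
    y_t = x_t + v_t,      v_t ~ N(0, r)   (observed raw score)
    """
    n = len(scores)
    if n == 0:
        return []
    x_f, p_f = [scores[0]], [r]   # filtered means and variances
    x_p, p_p = [scores[0]], [r]   # one-step predictions (index 0 unused)
    for y in scores[1:]:
        x_pred, p_pred = x_f[-1], p_f[-1] + q      # predict
        gain = p_pred / (p_pred + r)               # Kalman gain
        x_f.append(x_pred + gain * (y - x_pred))   # update mean
        p_f.append((1.0 - gain) * p_pred)          # update variance
        x_p.append(x_pred)
        p_p.append(p_pred)
    x_s = x_f[:]                                   # backward RTS pass
    for t in range(n - 2, -1, -1):
        c = p_f[t] / p_p[t + 1]
        x_s[t] = x_f[t] + c * (x_s[t + 1] - x_p[t + 1])
    return x_s

rng = random.Random(0)
raw_scores = [1.0 + rng.gauss(0.0, 1.0) for _ in range(300)]  # level 1 + white noise
smoothed = kalman_smooth(raw_scores, q=1e-3, r=1.0)
```

With white Gaussian observation noise, the smoothed sequence tracks the underlying level far more closely than the raw scores; correlated or heavy-tailed residuals would violate the model assumptions and weaken this guarantee.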

State space model: