
STaTS: Structure-Aware Temporal Sequence Summarization via Statistical Window Merging

Bhowmick, Ramanathan, Aakur

Time series data often contain latent temporal structure, such as transitions between locally stationary regimes, repeated motifs, and bursts of variability, that is rarely leveraged in standard representation learning pipelines. Existing models typically operate on raw or fixed-window sequences, treating all time steps as equally informative, which leads to inefficiencies, poor robustness, and limited scalability in long or noisy sequences. We propose STaTS, a lightweight, unsupervised framework for Structure-Aware Temporal Summarization that adaptively compresses both univariate and multivariate time series into compact, information-preserving token sequences. STaTS detects change points across multiple temporal resolutions using a BIC-based statistical divergence criterion, then summarizes each segment using simple functions like the mean or generative models such as GMMs. This process achieves up to 30x sequence compression while retaining core temporal dynamics. STaTS operates as a model-agnostic preprocessor and can be integrated with existing unsupervised time series encoders without retraining. Extensive experiments on 150+ datasets, including classification tasks on the UCR-85, UCR-128, and UEA-30 archives, and forecasting on ETTh1, ETTh2, ETTm1, and Electricity, demonstrate that STaTS enables 85-90% of the full-model performance while offering dramatic reductions in computational cost. Moreover, STaTS improves robustness under noise and preserves discriminative structure, outperforming uniform and clustering-based compression baselines. These results position STaTS as a principled, general-purpose solution for efficient, structure-aware time series modeling.


Basic Information

Paper ID: 2510.09593
Title: STaTS: Structure-Aware Temporal Sequence Summarization via Statistical Window Merging
Authors: Disharee Bhowmick, Ranjith Ramanathan, Sathyanarayanan N. Aakur
Categories: cs.LG (Machine Learning), cs.CV (Computer Vision)
Published: October 2025
Paper link: https://arxiv.org/abs/2510.09593

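Abstract

Time series data often contain latent temporal structure, such as transitions between locally stationary regimes, repeated motifs, and bursts of variability, but this structure is rarely exploited in standard representation learning pipelines. Existing models typically process raw or fixed-window sequences and treat every time step as equally important, which causes inefficiency, poor robustness, and limited scalability on long or noisy sequences. The paper proposes STaTS, a lightweight, unsupervised framework for structure-aware time series summarization that adaptively compresses univariate and multivariate time series into compact, information-preserving token sequences.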

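Background and Motivation

Problem Definition

Time series data are pervasive in finance, the Internet of Things, healthcare, and other domains. As sensing technology advances, recorded time series are growing rapidly in length and complexity, placing heavy computational demands on machine-learning frameworks for sequence understanding.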

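Limitations of Existing Methods

Classical methods: PAA (Piecewise Aggregate Approximation), SAX (Symbolic Aggregate approXimation), DTW (Dynamic Time Warping), and similar techniques achieve effective summarization but rely on uniform windowing or rigid symbolic encodings, ignoring dynamic changes in signal complexity
Deep learning methods: models such as TS2Vec and TS-TCC process full sequences or apply sliding windows without regard to semantic change, which leads to redundancy, computational overhead, and misalignment between the model's tokenization and the signal's true transitions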

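Research Motivation

Existing methods suffer from the following problems:

Fixed-window strategies can over-segment stable regions while under-segmenting complex ones
Under noisy conditions, uniformly processed inputs tend to amplify spurious patterns and reduce generalization
The lack of structure awareness leads to inefficiency and error propagation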

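Core Contributions

The STaTS framework: a structure-aware tokenization framework that uses a BIC-based change detection criterion to identify statistically coherent segments across multiple temporal scales
A modular, lightweight summarization pipeline: compresses time series by more than 30x while preserving salient patterns, enabling efficient downstream modeling
A model-agnostic, unsupervised method: requires no architectural changes or gradient-based tuning and is directly compatible with existing time series encoders such as TS2Vec
A unified interface: applies to classification, forecasting, and robustness tasks as a general-purpose summarization preprocessor for time series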

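BIC (the Bayesian Information Criterion) is used to assess the statistical similarity of adjacent time windows.
For adjacent windows $x_1, x_2 \in \mathbb{R}^{\delta \times d}$, compute: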

$\Delta BIC = -2(\ell_{joint} - \ell_{sep}) + k \log(2\delta)$

where:

$\ell_{sep} = -\frac{\delta}{2}(\log|\Sigma_1| + \log|\Sigma_2|)$
$\ell_{joint} = -\delta \log|\Sigma_{12}|$
$k = d + \frac{d(d+1)}{2}$ (the number of free parameters of a full-covariance model)
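
Substituting the two log-likelihood terms into the criterion gives $\Delta BIC = 2\delta \log|\Sigma_{12}| - \delta(\log|\Sigma_1| + \log|\Sigma_2|) + k \log(2\delta)$. The sketch below evaluates this for two equal-length windows with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names `log_det_cov` and `delta_bic` are hypothetical, the covariances are maximum-likelihood estimates, and the regularization $\epsilon I$ is added to every covariance before taking the log-determinant.

```python
import numpy as np

def log_det_cov(x, eps=1e-6):
    """Log-determinant of the regularized covariance of x, shape (n, d).

    Assumption: maximum-likelihood covariance (bias=True) plus eps * I.
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[1]
    # np.cov squeezes the d = 1 case to a scalar, so restore the (d, d) shape.
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d)
    _, logdet = np.linalg.slogdet(cov + eps * np.eye(d))
    return logdet

def delta_bic(x1, x2, eps=1e-6):
    """Delta-BIC between two adjacent windows, each of shape (delta, d)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    delta, d = x1.shape
    k = d + d * (d + 1) / 2  # free parameters of a full-covariance model
    l_sep = -0.5 * delta * (log_det_cov(x1, eps) + log_det_cov(x2, eps))
    l_joint = -delta * log_det_cov(np.vstack([x1, x2]), eps)
    return -2.0 * (l_joint - l_sep) + k * np.log(2 * delta)

# Synthetic windows: two from the same regime, one with a shifted mean.
rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(50, 2))
same_b = rng.normal(0.0, 1.0, size=(50, 2))
shifted = rng.normal(5.0, 1.0, size=(50, 2))
print(delta_bic(same_a, same_b), delta_bic(same_a, shifted))
```

A univariate series has to be passed with shape `(delta, 1)`. The window pair that straddles the mean shift should score much higher than the pair drawn from one regime, which is what makes the score usable as a change indicator.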

Global objective: $L_{BIC}(\{S_i\}) = \sum_{i=1}^{T'} \left(-\frac{|S_i|}{2}\log|\Sigma_i| + \frac{k}{2}\log|S_i|\right)$
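
To make the global objective concrete, the following sketch evaluates it term by term for a given set of boundaries, exactly as the formula is written here. The name `segmentation_bic`, the maximum-likelihood covariance estimate, and the $\epsilon I$ regularization are assumptions for illustration; how the objective is optimized, and how the sign of the penalty term should be read, are best checked against the paper.

```python
import numpy as np

def segmentation_bic(x, boundaries, eps=1e-6):
    """Evaluate L_BIC for segments of x (shape (T, d)) split at `boundaries`.

    `boundaries` is a sorted list of interior indices; segment i covers
    x[tau_{i-1}:tau_i] with tau_0 = 0 and the last tau equal to T.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    k = d + d * (d + 1) / 2
    edges = [0, *boundaries, n]
    total = 0.0
    for start, stop in zip(edges[:-1], edges[1:]):
        seg = x[start:stop]
        size = len(seg)
        # Assumption: ML covariance, regularized for numerical stability.
        cov = np.cov(seg, rowvar=False, bias=True).reshape(d, d)
        _, logdet = np.linalg.slogdet(cov + eps * np.eye(d))
        total += -0.5 * size * logdet + 0.5 * k * np.log(size)
    return total

# Synthetic univariate series with one level shift at index 100.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 100),
                    rng.normal(10.0, 1.0, 100)]).reshape(-1, 1)
for cuts in ([100], [50], []):
    print(cuts, segmentation_bic(x, cuts))
```

On this toy series the value is largest when the boundary sits at the true level shift, because a segment that mixes both regimes has a much larger covariance determinant.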

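Multi-scale evaluation:

Statistical coherence is evaluated at every value of $\delta$ in a predefined range
Candidate split points are identified with the adaptive threshold $\mu_\delta + \alpha \cdot \sigma_\delta$
Redundant detections are eliminated by non-maximum suppression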
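
The thresholding and suppression steps above can be sketched as follows. This is a minimal illustration, assuming the per-scale scores have already been computed and are indexed by time step; the helper names `threshold_candidates` and `non_max_suppression` are hypothetical, and the greedy highest-score-first suppression is one common variant rather than a confirmed detail of STaTS. The defaults $\alpha = 2$ and $s_{min} = 20$ follow the implementation details listed in this document.

```python
import numpy as np

def threshold_candidates(scores, alpha=2.0):
    """Return (position, score) pairs whose score exceeds mean + alpha * std.

    `scores` holds the change scores of one scale, indexed by time step.
    """
    scores = np.asarray(scores, dtype=float)
    threshold = scores.mean() + alpha * scores.std()
    return [(int(i), float(scores[i])) for i in np.flatnonzero(scores > threshold)]

def non_max_suppression(candidates, s_min=20):
    """Greedy NMS: keep the strongest candidates that are >= s_min apart."""
    kept = []
    for pos, score in sorted(candidates, key=lambda c: -c[1]):
        if all(abs(pos - p) >= s_min for p, _ in kept):
            kept.append((pos, score))
    return sorted(p for p, _ in kept)

# Toy scores for two scales; real scores would come from the BIC criterion.
scores_by_scale = {5: np.zeros(100), 10: np.zeros(100)}
scores_by_scale[5][[30, 70]] = [10.0, 9.0]
scores_by_scale[10][33] = 8.0

# Pool the candidates from every scale, then suppress near-duplicates.
pooled = [c for s in scores_by_scale.values() for c in threshold_candidates(s)]
change_points = non_max_suppression(pooled)
print(change_points)
```

Pooling across scales before suppression lets a boundary found at a coarse scale absorb a weaker, nearby detection from a finer scale.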

2. Summarization Stage

Summary function: $\phi(S_i) = \frac{1}{|S_i|} \sum_{t=\tau_{i-1}}^{\tau_i-1} x_t$

Mean pooling is the default summarization operation; it captures each segment's first-order statistics.
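
As a small worked example of the summary function, the sketch below averages each segment $x_{\tau_{i-1}}, \ldots, x_{\tau_i - 1}$ into one token. The name `summarize_segments` is hypothetical, and the change points are assumed to be sorted interior indices with $\tau_0 = 0$ and the final boundary equal to the series length.

```python
import numpy as np

def summarize_segments(x, change_points):
    """Mean-pool each segment of x (shape (T,) or (T, d)) into one token."""
    x = np.asarray(x, dtype=float)
    edges = [0, *change_points, len(x)]
    return np.array([x[a:b].mean(axis=0) for a, b in zip(edges[:-1], edges[1:])])

# Univariate: six time steps collapse into three tokens.
tokens = summarize_segments([1, 1, 1, 5, 5, 9], change_points=[3, 5])
print(tokens)

# Multivariate: each token keeps one mean per channel.
tokens_mv = summarize_segments([[0, 0], [2, 2], [10, 20]], change_points=[2])
print(tokens_mv.shape)
```

The output length equals the number of segments, so the compression ratio is the series length divided by the number of detected segments.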

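Technical Innovations

Adaptive segmentation: unlike fixed-window methods, STaTS adjusts segment boundaries dynamically according to local statistical change
Multivariate extension: the full covariance matrix extends the method naturally to multivariate time series
Multi-scale detection: changes are detected at different temporal resolutions, capturing both abrupt short-term shifts and gradual long-term drift
Statistical validity: under a multivariate Gaussian assumption, the segment mean is a sufficient statistic for the segment's mean parameter (given the covariance)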

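Experimental Setup

Datasets

Univariate classification: UCR-128 (128 datasets) and UCR-85 (85 datasets)
Multivariate classification: UEA-30 (30 datasets)
Multivariate forecasting: ETTh1, ETTh2, ETTm1, Electricity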

Evaluation Metrics

Classification: average accuracy and average rank
Forecasting: normalized mean squared error (nMSE)
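
For readers unfamiliar with the average-rank metric, the sketch below shows one common way to compute it from a datasets-by-methods accuracy matrix. The function name `average_ranks`, the tie rule (tied methods share the mean of the ranks they span), and the toy accuracy values are assumptions for illustration; none of the numbers are results from the paper.

```python
import numpy as np

def average_ranks(acc):
    """Average rank per method from acc of shape (n_datasets, n_methods).

    Rank 1 is the most accurate method on a dataset; ties share the mean rank.
    """
    acc = np.asarray(acc, dtype=float)
    # better[n, j] counts methods strictly more accurate than method j.
    better = (acc[:, None, :] > acc[:, :, None]).sum(axis=2)
    # tied[n, j] counts methods with the same accuracy as j, including j.
    tied = (acc[:, None, :] == acc[:, :, None]).sum(axis=2)
    ranks = better + (tied + 1) / 2
    return ranks.mean(axis=0)

# Toy accuracies: 2 datasets (rows) x 3 methods (columns), made up.
acc = np.array([[0.9, 0.8, 0.7],
                [0.6, 0.8, 0.8]])
mean_accuracy = acc.mean(axis=0)
mean_rank = average_ranks(acc)
print(mean_accuracy, mean_rank)
```

Average rank complements average accuracy because it is insensitive to how hard individual datasets are: a method is rewarded for being consistently near the top rather than for large margins on a few datasets.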

Baselines

Classification baselines: T-Loss, TNC, TS-TCC, TST, DTW, TS2Vec
Compression variants: TS2Vec (uniform), TS2Vec (GMM)
Forecasting baselines: Informer, TCN

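Implementation Details

Window size range: $\delta \in \{5, 10, \ldots, 500\}$
Threshold parameter: $\alpha = 2$
Minimum separation distance: $s_{min} = 20$
Numerical stability: covariance regularization $\epsilon = 10^{-6}$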

Model	UCR-85 Accuracy	UCR-85 Rank	UCR-128 Accuracy	UCR-128 Rank	Avg. Length
TS2Vec (ori)	0.829	1.99	0.829	2.02	424.4/534.5
TS2Vec (mean)	0.739	4.82	0.741	4.39	12.1/12.9
TS2Vec (uniform)	0.621	8.21	0.616	8.10	12.1/12.9
TS2Vec (GMM)	0.655	7.35	0.664	6.92	60.7/73.2