NAP: Attention-Based Late Fusion for Automatic Sleep Staging
Rossi, van der Meer, Schmidt et al.
Polysomnography signals are highly heterogeneous, varying in modality composition (e.g., EEG, EOG, ECG), channel availability (e.g., frontal, occipital EEG), and acquisition protocols across datasets and clinical sites. Most existing models that process polysomnography data rely on a fixed subset of modalities or channels and therefore fail to fully exploit its inherently multimodal nature. We address this limitation by introducing NAP (Neural Aggregator of Predictions), an attention-based model that learns to combine multiple prediction streams using a tri-axial attention mechanism that captures temporal, spatial, and predictor-level dependencies. NAP is trained to adapt to varying input dimensions. By aggregating outputs from frozen, pretrained single-channel models, NAP consistently outperforms individual predictors and simple ensembles, achieving state-of-the-art zero-shot generalization across multiple datasets. Although demonstrated here for automated sleep staging from polysomnography, the proposed approach could be extended to other multimodal physiological applications.
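To make the idea of attention along three axes concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the frozen single-channel models emit class probabilities arranged as a tensor of shape (predictors, channels, epochs, classes), applies single-head self-attention along the temporal, spatial, and predictor axes in turn, and mean-pools over predictors and channels. All names (`axis_attention`, `aggregate`, `init_params`), the five-class setting, the embedding width, the ordering of the axes, the pooling step, and the random untrained weights are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def axis_attention(x, axis, wq, wk, wv):
    """Single-head self-attention along one axis of x (last dim = features)."""
    x = np.moveaxis(x, axis, -2)                      # (..., n, d)
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])  # (..., n, n)
    out = softmax(scores, axis=-1) @ v                # (..., n, d)
    return np.moveaxis(x + out, -2, axis)             # residual, original layout

def init_params(k, d, seed=0):
    """Random, untrained weights (hypothetical; for shape illustration only)."""
    rng = np.random.default_rng(seed)
    p = {"embed": rng.normal(0, 0.1, (k, d)),
         "head": rng.normal(0, 0.1, (d, k))}
    for name in ("time", "space", "pred"):
        p[name] = tuple(rng.normal(0, 0.1, (d, d)) for _ in range(3))
    return p

def aggregate(probs, params):
    """probs: (P, C, T, K) probabilities from P predictors on C channels
    over T epochs and K classes. Returns fused (T, K) probabilities."""
    h = probs @ params["embed"]                       # (P, C, T, d)
    # Assumed order: temporal axis, then spatial axis, then predictor axis.
    for axis, name in ((2, "time"), (1, "space"), (0, "pred")):
        h = axis_attention(h, axis, *params[name])
    pooled = h.mean(axis=(0, 1))                      # (T, d)
    return softmax(pooled @ params["head"])           # (T, K)

# The same weights accept different numbers of predictors, channels, and epochs.
params = init_params(k=5, d=16)
rng = np.random.default_rng(1)
for shape in ((3, 2, 10, 5), (1, 4, 7, 5)):
    fused = aggregate(softmax(rng.normal(size=shape)), params)
    print(shape, "->", fused.shape)
```

Because the projection matrices act only on the feature dimension, the same parameters apply regardless of how many predictors, channels, or epochs are present. This is one simple way to obtain the adaptability to varying input dimensions described above; the actual NAP architecture may differ.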