2025-11-15T18:28:11.606243

S-Diff: An Anisotropic Diffusion Model for Collaborative Filtering in Spectral Domain

Xia, Cheng, Tang et al.

Recovering user preferences from user-item interaction matrices is a key challenge in recommender systems. While diffusion models can sample and reconstruct preferences from latent distributions, they often fail to capture similar users' collective preferences effectively. Additionally, latent variables degrade into pure Gaussian noise during the forward process, lowering the signal-to-noise ratio, which in turn degrades performance. To address this, we propose S-Diff, inspired by graph-based collaborative filtering, better to utilize low-frequency components in the graph spectral domain. S-Diff maps user interaction vectors into the spectral domain and parameterizes diffusion noise to align with graph frequency. This anisotropic diffusion retains significant low-frequency components, preserving a high signal-to-noise ratio. S-Diff further employs a conditional denoising network to encode user interactions, recovering true preferences from noisy data. This method achieves strong results across multiple datasets.

academic

S-Diff: An Anisotropic Diffusion Model for Collaborative Filtering in Spectral Domain

Basic Information

Paper ID: 2501.00384
Title: S-Diff: An Anisotropic Diffusion Model for Collaborative Filtering in Spectral Domain
Authors: Rui Xia, Yanhua Cheng, Yongxiang Tang, Xiaocheng Liu, Xialong Liu, Lisong Wang, Peng Jiang
Category: cs.IR (Information Retrieval)
Venue: WSDM '25 (The Eighteenth ACM International Conference on Web Search and Data Mining)
Paper Link: https://arxiv.org/abs/2501.00384

Abstract

Recovering user preferences from user-item interaction matrices in recommendation systems is a critical challenge. While diffusion models can sample and reconstruct preferences from latent distributions, they often fail to effectively capture the collective preferences of similar users. Furthermore, latent variables degrade to pure Gaussian noise during the forward process, reducing signal-to-noise ratio and degrading performance. To address these issues, this paper proposes S-Diff, inspired by graph-based collaborative filtering, to better exploit low-frequency components in the spectral domain. S-Diff maps user interaction vectors to the spectral domain and parameterizes diffusion noise to align with graph frequencies. This anisotropic diffusion preserves important low-frequency components while maintaining high signal-to-noise ratio. S-Diff further employs a conditional denoising network to encode user interactions and recover true preferences from noisy data. The method achieves strong results across multiple datasets.

Research Background and Motivation

Problem Definition

The core task of recommendation systems is to recover true user preferences from sparse user-item interaction matrices, which is essentially an inverse problem. Traditional collaborative filtering methods address this by mining similarities among users.

Limitations of Existing Methods

Insufficiencies of Traditional Diffusion Models:
- Primarily rely on individual user interaction vectors as conditional inputs, failing to fully leverage shared preference information among users in collaborative filtering
- Inject large amounts of Gaussian noise into high-dimensional historical interaction vectors, complicating the recovery process for denoising decoders
Encoding-Decoding Inconsistency:
- Some models explicitly use collaborative information as conditional guidance in the decoding network, but the forward process does not reflect collaborative signals
- Results in inconsistency between encoding and decoding processes
Signal-to-Noise Ratio Degradation:
- Latent variables degrade to pure Gaussian noise during the forward process, reducing signal-to-noise ratio
- Impacts overall model performance

Research Motivation

Inspired by the success of graph-based collaborative filtering and graph signal processing, the authors observe that the "over-smoothing" process in graph convolution is analogous to signal smoothing in diffusion processes. Based on this insight, they propose anisotropic diffusion in the graph spectral domain to better preserve low-frequency information (representing global preferences).

Core Contributions

Spectral Domain Forward Diffusion Process: Introduces a forward diffusion process defined in the graph spectral domain, effectively integrating global preference information across users
Anisotropic Noise Parameterization Method: Proposes a method to parameterize and modulate noise scales for different frequency components, with theoretical analysis and experimental results demonstrating advantages in signal-to-noise ratio
Element-wise Fusion Denoising Module: Designs a denoising module based on element-wise fusion in the reverse process, with extensive experiments validating the effectiveness of the proposed method
Theoretical Guarantees: Provides boundedness analysis of the spectral domain diffusion process, establishing theoretical soundness of the method

Methodology Details

Task Definition

Given a set of users U and items I, with user-item interaction matrix X ∈ {0,1}^{|U|×|I|}, where x_{u,i} = 1 indicates user u has interacted with item i. The goal is to predict a rating vector x̂ ∈ ℝ^{|I|}, generating latent preference scores for all items for a specified user.

Model Architecture

1. Graph Construction and Spectral Decomposition

Item Similarity Graph: Define normalized similarity adjacency matrix A = X̃^TX̃, where X̃ = D_U^{-1/2}X****D_I^{-1/2}
Laplacian Operator: L = I - A
Eigendecomposition: L = UΛU^T, where Λ contains eigenvalues and U contains eigenvectors

2. Graph-Guided Forward Diffusion

Traditional diffusion process: x_t = α_tx_0 + σ_tε_t

Improved graph-guided diffusion: x_t = C_tx_0 + σ_tε_t

where C_t = e^{-Lt} is a time-decay operator defined by the Laplacian matrix.

3. Spectral Domain Diffusion Framework

Transform the diffusion process to the spectral domain via spectral transformation v_t = U^Tx_t:

v_t = λ_t ⊙ v_0 + σtv{ε,t}

where:

v_0 = U^Tx_0 is the frequency response of x_0
λ_t = e^{-t·d_1}, e^{-t·d_2}, ..., e^{-t·d_{|I|}} is the eigenvalue vector
⊙ denotes element-wise multiplication

4. Anisotropic Noise Schedule

Adopts variance-preserving diffusion model:

α_t = λ_t
σ_t^2 = 1 - λ_t^2

Introduces boundary parameter control:

αt = (1 - α) · λt + α
σ_t = Min(√(1 - λt^2), σ)

5. Conditional Reverse Denoising

Uses neural network φ_θ for denoising with optimization objective:

L_t = E_{(v_0,v_t)~q_0(v_0)q_t(v_t|v_0)}||φ_θ(v_t, U^Tc, t) - v_0||^2

Technical Innovations

Spectral Domain Mapping: Transforms traditional spatial domain diffusion to graph spectral domain, leveraging spectral properties of graphs
Anisotropic Noise: Modulates noise levels for different frequency components according to eigenvalues, preserving low-frequency information
Boundedness Properties: Guarantees lower bounds on signal-to-noise ratio due to boundedness of Laplacian eigenvalues
FiLM Fusion: Uses Feature-wise Linear Modulation for element-wise conditional fusion

Experimental Setup

Datasets

Three public datasets are used:

MovieLens-1M: 5,949 users, 2,810 items, 571,531 interactions, sparsity 96.6%
Yelp: 54,574 users, 34,395 items, 1,402,736 interactions, sparsity 99.93%
Amazon-Book: 108,822 users, 94,949 items, 3,146,256 interactions, sparsity 99.97%

Data is split in 7:1:2 ratio into training, validation, and test sets.

Evaluation Metrics

Recall@K: Measures the proportion of relevant items in the top-K recommendation list
NDCG@K: Ranking-sensitive metric giving higher scores to relevant items at higher positions

Baseline Methods

Include traditional collaborative filtering, graph neural network, and diffusion model approaches:

MF, LightGCN, CDAE, MultiDAE/MultiVAE
CODIGEM, DiffRec (diffusion models)
LinkProp, BSPM, Giff (graph signal processing methods)

Implementation Details

Batch size: 100
Learning rate: 1e-4
Maximum training epochs: 1,000
Diffusion steps: T=5
Spectral decomposition dimension: 200

Experimental Results

Main Results

S-Diff significantly outperforms all baseline methods across all datasets and metrics:

Amazon-Book Dataset:

Recall@10: 0.1155 (vs. best baseline Giff: 0.1109)
NDCG@10: 0.0746 (vs. best baseline Giff: 0.0733)

Yelp Dataset:

Recall@10: 0.0635 (vs. best baseline Giff: 0.0639)
NDCG@20: 0.0561 (vs. best baseline Giff: 0.0520)

MovieLens-1M Dataset:

Recall@10: 0.1277 (vs. best baseline Giff: 0.1108)
NDCG@10: 0.0970 (vs. best baseline Giff: 0.0952)

Ablation Studies

Compares different noise scheduling strategies:

DDPM in Spectral: Traditional Gaussian noise in spectral domain
S-Diff-VE: Variance exploding diffusion
S-Diff-VP: Variance preserving diffusion (proposed method)

Results show S-Diff-VP achieves optimal signal-to-noise ratio and performance.

2. Denoising Network Component Analysis

Removing the FiLM layer significantly degrades performance, validating the importance of element-wise fusion.

Signal-to-Noise Ratio Analysis

Theoretical analysis and experiments demonstrate that spectral domain anisotropic diffusion achieves better signal-to-noise ratio lower bounds compared to traditional diffusion models:

SNR(t) = α_t^2/σ_t^2 ≥ (e^{-2τ})^2/(1-(e^{-2τ})^2)

Experiments show S-Diff maintains identifiable signal-to-noise ratio even after 1000 diffusion steps.

Parameter Sensitivity Analysis

Spectral Decomposition Dimension K: Optimal performance at K=200
Boundary Parameters: Best results when α_ ∈ 0, 0.1, σ_ ∈ 0.4, 0.5

Diffusion Models in Recommendation

CODIGEM: First application of DDPM to collaborative filtering
DiffRec: Improves diffusion models through latent space mapping and timestep guidance
CF-Diff: Precomputes multi-hop neighbor information as conditions
Giff: Uses graph propagation for signal smoothing and recovery

Graph Filtering Methods

LightGCN: Multi-layer linear aggregation of neighbor information
Poly-CF: Adaptive spectral graph filtering
SGFCF: Transforms collaborative filtering into adaptive filter design problem

Conclusions and Discussion

Main Conclusions

S-Diff successfully combines graph spectral theory with diffusion models, performing anisotropic diffusion in the spectral domain
By preserving low-frequency components and maintaining high signal-to-noise ratio, significantly improves recommendation performance
The method has solid theoretical foundations and experimental validation

Limitations

Computational Complexity: Requires spectral decomposition with time complexity O(K|I|m)
Parameter Tuning: Requires careful adjustment of boundary parameters α_ and σ_
Scalability: Applicability to ultra-large-scale datasets remains to be verified

Future Directions

Computational Efficiency Optimization: Research more efficient spectral decomposition and diffusion processes
Adaptive Parameters: Develop methods for automatic noise parameter adjustment
Multimodal Extension: Extend the method to multimodal recommendation scenarios

In-Depth Evaluation

Strengths

Theoretical Innovation: Cleverly combines graph signal processing with diffusion models, providing new theoretical perspectives
Advanced Techniques: Anisotropic noise scheduling and spectral domain diffusion are important technical contributions
Comprehensive Experiments: Conducts thorough comparisons and ablation studies across multiple datasets
Superior Performance: Achieves best performance across all evaluation metrics

Weaknesses

High Complexity: Spectral decomposition adds computational overhead, potentially limiting application on large-scale data
Parameter Sensitivity: The method involves multiple hyperparameters requiring careful tuning
Insufficient Theoretical Analysis: Lacks deeper theoretical explanation for why anisotropic diffusion is more effective

Impact

Academic Value: Provides new insights for applying diffusion models in recommendation systems
Practical Value: Method demonstrates good performance improvements with practical application potential
Reproducibility: Paper provides detailed implementation details and algorithm descriptions

Applicable Scenarios

Medium-scale recommendation systems
Scenarios with high recommendation quality requirements
Datasets with pronounced collaborative filtering characteristics
Environments with relatively abundant computational resources

References

The paper cites 52 relevant references covering multiple domains including diffusion models, collaborative filtering, and graph neural networks, providing solid theoretical foundations for this research.

Overall Assessment: This is a high-quality research paper demonstrating excellence in both theoretical innovation and experimental validation. The combination of graph spectral theory with diffusion models represents a valuable contribution, opening new research directions for the recommendation systems field. Despite some limitations, it is a noteworthy work deserving attention.