2025-11-15T18:28:11.606243

S-Diff: An Anisotropic Diffusion Model for Collaborative Filtering in Spectral Domain

Xia, Cheng, Tang et al.
Recovering user preferences from user-item interaction matrices is a key challenge in recommender systems. While diffusion models can sample and reconstruct preferences from latent distributions, they often fail to capture similar users' collective preferences effectively. Additionally, latent variables degrade into pure Gaussian noise during the forward process, lowering the signal-to-noise ratio, which in turn degrades performance. To address this, we propose S-Diff, inspired by graph-based collaborative filtering, better to utilize low-frequency components in the graph spectral domain. S-Diff maps user interaction vectors into the spectral domain and parameterizes diffusion noise to align with graph frequency. This anisotropic diffusion retains significant low-frequency components, preserving a high signal-to-noise ratio. S-Diff further employs a conditional denoising network to encode user interactions, recovering true preferences from noisy data. This method achieves strong results across multiple datasets.
academic

S-Diff: An Anisotropic Diffusion Model for Collaborative Filtering in Spectral Domain

Basic Information

  • Paper ID: 2501.00384
  • Title: S-Diff: An Anisotropic Diffusion Model for Collaborative Filtering in Spectral Domain
  • Authors: Rui Xia, Yanhua Cheng, Yongxiang Tang, Xiaocheng Liu, Xialong Liu, Lisong Wang, Peng Jiang
  • Category: cs.IR (Information Retrieval)
  • Venue: WSDM '25 (The Eighteenth ACM International Conference on Web Search and Data Mining)
  • Paper Link: https://arxiv.org/abs/2501.00384

Abstract

Recovering user preferences from user-item interaction matrices in recommendation systems is a critical challenge. While diffusion models can sample and reconstruct preferences from latent distributions, they often fail to effectively capture the collective preferences of similar users. Furthermore, latent variables degrade to pure Gaussian noise during the forward process, reducing signal-to-noise ratio and degrading performance. To address these issues, this paper proposes S-Diff, inspired by graph-based collaborative filtering, to better exploit low-frequency components in the spectral domain. S-Diff maps user interaction vectors to the spectral domain and parameterizes diffusion noise to align with graph frequencies. This anisotropic diffusion preserves important low-frequency components while maintaining high signal-to-noise ratio. S-Diff further employs a conditional denoising network to encode user interactions and recover true preferences from noisy data. The method achieves strong results across multiple datasets.

Research Background and Motivation

Problem Definition

The core task of recommendation systems is to recover true user preferences from sparse user-item interaction matrices, which is essentially an inverse problem. Traditional collaborative filtering methods address this by mining similarities among users.

Limitations of Existing Methods

  1. Insufficiencies of Traditional Diffusion Models:
    • Primarily rely on individual user interaction vectors as conditional inputs, failing to fully leverage shared preference information among users in collaborative filtering
    • Inject large amounts of Gaussian noise into high-dimensional historical interaction vectors, complicating the recovery process for denoising decoders
  2. Encoding-Decoding Inconsistency:
    • Some models explicitly use collaborative information as conditional guidance in the decoding network, but the forward process does not reflect collaborative signals
    • Results in inconsistency between encoding and decoding processes
  3. Signal-to-Noise Ratio Degradation:
    • Latent variables degrade to pure Gaussian noise during the forward process, reducing signal-to-noise ratio
    • Impacts overall model performance

Research Motivation

Inspired by the success of graph-based collaborative filtering and graph signal processing, the authors observe that the "over-smoothing" process in graph convolution is analogous to signal smoothing in diffusion processes. Based on this insight, they propose anisotropic diffusion in the graph spectral domain to better preserve low-frequency information (representing global preferences).

Core Contributions

  1. Spectral Domain Forward Diffusion Process: Introduces a forward diffusion process defined in the graph spectral domain, effectively integrating global preference information across users
  2. Anisotropic Noise Parameterization Method: Proposes a method to parameterize and modulate noise scales for different frequency components, with theoretical analysis and experimental results demonstrating advantages in signal-to-noise ratio
  3. Element-wise Fusion Denoising Module: Designs a denoising module based on element-wise fusion in the reverse process, with extensive experiments validating the effectiveness of the proposed method
  4. Theoretical Guarantees: Provides boundedness analysis of the spectral domain diffusion process, establishing theoretical soundness of the method

Methodology Details

Task Definition

Given a set of users U and items I, with user-item interaction matrix X ∈ {0,1}^{|U|×|I|}, where x_{u,i} = 1 indicates user u has interacted with item i. The goal is to predict a rating vector ∈ ℝ^{|I|}, generating latent preference scores for all items for a specified user.

Model Architecture

1. Graph Construction and Spectral Decomposition

  • Item Similarity Graph: Define normalized similarity adjacency matrix A = ^T, where = D_U^{-1/2}X****D_I^{-1/2}
  • Laplacian Operator: L = I - A
  • Eigendecomposition: L = UΛU^T, where Λ contains eigenvalues and U contains eigenvectors

2. Graph-Guided Forward Diffusion

Traditional diffusion process: x_t = α_tx_0 + σ_tε_t

Improved graph-guided diffusion: x_t = C_tx_0 + σ_tε_t

where C_t = e^{-Lt} is a time-decay operator defined by the Laplacian matrix.

3. Spectral Domain Diffusion Framework

Transform the diffusion process to the spectral domain via spectral transformation v_t = U^Tx_t:

v_t = λ_t ⊙ v_0 + σtv{ε,t}

where:

  • v_0 = U^Tx_0 is the frequency response of x_0
  • λ_t = e^{-t·d_1}, e^{-t·d_2}, ..., e^{-t·d_{|I|}} is the eigenvalue vector
  • ⊙ denotes element-wise multiplication

4. Anisotropic Noise Schedule

Adopts variance-preserving diffusion model:

  • α_t = λ_t
  • σ_t^2 = 1 - λ_t^2

Introduces boundary parameter control:

  • αt = (1 - α) · λt + α
  • σ_t = Min(√(1 - λt^2), σ)

5. Conditional Reverse Denoising

Uses neural network φ_θ for denoising with optimization objective:

L_t = E_{(v_0,v_t)~q_0(v_0)q_t(v_t|v_0)}||φ_θ(v_t, U^Tc, t) - v_0||^2

Technical Innovations

  1. Spectral Domain Mapping: Transforms traditional spatial domain diffusion to graph spectral domain, leveraging spectral properties of graphs
  2. Anisotropic Noise: Modulates noise levels for different frequency components according to eigenvalues, preserving low-frequency information
  3. Boundedness Properties: Guarantees lower bounds on signal-to-noise ratio due to boundedness of Laplacian eigenvalues
  4. FiLM Fusion: Uses Feature-wise Linear Modulation for element-wise conditional fusion

Experimental Setup

Datasets

Three public datasets are used:

  • MovieLens-1M: 5,949 users, 2,810 items, 571,531 interactions, sparsity 96.6%
  • Yelp: 54,574 users, 34,395 items, 1,402,736 interactions, sparsity 99.93%
  • Amazon-Book: 108,822 users, 94,949 items, 3,146,256 interactions, sparsity 99.97%

Data is split in 7:1:2 ratio into training, validation, and test sets.

Evaluation Metrics

  • Recall@K: Measures the proportion of relevant items in the top-K recommendation list
  • NDCG@K: Ranking-sensitive metric giving higher scores to relevant items at higher positions

Baseline Methods

Include traditional collaborative filtering, graph neural network, and diffusion model approaches:

  • MF, LightGCN, CDAE, MultiDAE/MultiVAE
  • CODIGEM, DiffRec (diffusion models)
  • LinkProp, BSPM, Giff (graph signal processing methods)

Implementation Details

  • Batch size: 100
  • Learning rate: 1e-4
  • Maximum training epochs: 1,000
  • Diffusion steps: T=5
  • Spectral decomposition dimension: 200

Experimental Results

Main Results

S-Diff significantly outperforms all baseline methods across all datasets and metrics:

Amazon-Book Dataset:

  • Recall@10: 0.1155 (vs. best baseline Giff: 0.1109)
  • NDCG@10: 0.0746 (vs. best baseline Giff: 0.0733)

Yelp Dataset:

  • Recall@10: 0.0635 (vs. best baseline Giff: 0.0639)
  • NDCG@20: 0.0561 (vs. best baseline Giff: 0.0520)

MovieLens-1M Dataset:

  • Recall@10: 0.1277 (vs. best baseline Giff: 0.1108)
  • NDCG@10: 0.0970 (vs. best baseline Giff: 0.0952)

Ablation Studies

Compares different noise scheduling strategies:

  • DDPM in Spectral: Traditional Gaussian noise in spectral domain
  • S-Diff-VE: Variance exploding diffusion
  • S-Diff-VP: Variance preserving diffusion (proposed method)

Results show S-Diff-VP achieves optimal signal-to-noise ratio and performance.

2. Denoising Network Component Analysis

Removing the FiLM layer significantly degrades performance, validating the importance of element-wise fusion.

Signal-to-Noise Ratio Analysis

Theoretical analysis and experiments demonstrate that spectral domain anisotropic diffusion achieves better signal-to-noise ratio lower bounds compared to traditional diffusion models:

SNR(t) = α_t^2/σ_t^2 ≥ (e^{-2τ})^2/(1-(e^{-2τ})^2)

Experiments show S-Diff maintains identifiable signal-to-noise ratio even after 1000 diffusion steps.

Parameter Sensitivity Analysis

  • Spectral Decomposition Dimension K: Optimal performance at K=200
  • Boundary Parameters: Best results when α_ ∈ 0, 0.1, σ_ ∈ 0.4, 0.5

Diffusion Models in Recommendation

  • CODIGEM: First application of DDPM to collaborative filtering
  • DiffRec: Improves diffusion models through latent space mapping and timestep guidance
  • CF-Diff: Precomputes multi-hop neighbor information as conditions
  • Giff: Uses graph propagation for signal smoothing and recovery

Graph Filtering Methods

  • LightGCN: Multi-layer linear aggregation of neighbor information
  • Poly-CF: Adaptive spectral graph filtering
  • SGFCF: Transforms collaborative filtering into adaptive filter design problem

Conclusions and Discussion

Main Conclusions

  1. S-Diff successfully combines graph spectral theory with diffusion models, performing anisotropic diffusion in the spectral domain
  2. By preserving low-frequency components and maintaining high signal-to-noise ratio, significantly improves recommendation performance
  3. The method has solid theoretical foundations and experimental validation

Limitations

  1. Computational Complexity: Requires spectral decomposition with time complexity O(K|I|m)
  2. Parameter Tuning: Requires careful adjustment of boundary parameters α_ and σ_
  3. Scalability: Applicability to ultra-large-scale datasets remains to be verified

Future Directions

  1. Computational Efficiency Optimization: Research more efficient spectral decomposition and diffusion processes
  2. Adaptive Parameters: Develop methods for automatic noise parameter adjustment
  3. Multimodal Extension: Extend the method to multimodal recommendation scenarios

In-Depth Evaluation

Strengths

  1. Theoretical Innovation: Cleverly combines graph signal processing with diffusion models, providing new theoretical perspectives
  2. Advanced Techniques: Anisotropic noise scheduling and spectral domain diffusion are important technical contributions
  3. Comprehensive Experiments: Conducts thorough comparisons and ablation studies across multiple datasets
  4. Superior Performance: Achieves best performance across all evaluation metrics

Weaknesses

  1. High Complexity: Spectral decomposition adds computational overhead, potentially limiting application on large-scale data
  2. Parameter Sensitivity: The method involves multiple hyperparameters requiring careful tuning
  3. Insufficient Theoretical Analysis: Lacks deeper theoretical explanation for why anisotropic diffusion is more effective

Impact

  1. Academic Value: Provides new insights for applying diffusion models in recommendation systems
  2. Practical Value: Method demonstrates good performance improvements with practical application potential
  3. Reproducibility: Paper provides detailed implementation details and algorithm descriptions

Applicable Scenarios

  • Medium-scale recommendation systems
  • Scenarios with high recommendation quality requirements
  • Datasets with pronounced collaborative filtering characteristics
  • Environments with relatively abundant computational resources

References

The paper cites 52 relevant references covering multiple domains including diffusion models, collaborative filtering, and graph neural networks, providing solid theoretical foundations for this research.


Overall Assessment: This is a high-quality research paper demonstrating excellence in both theoretical innovation and experimental validation. The combination of graph spectral theory with diffusion models represents a valuable contribution, opening new research directions for the recommendation systems field. Despite some limitations, it is a noteworthy work deserving attention.