2025-11-19T08:52:13.731098

Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization

Ahmadi-Asl, Leplat, Phan et al.
This paper introduces a novel collaborative neurodynamic model for computing nonnegative Canonical Polyadic Decomposition (CPD). The model relies on a system of recurrent neural networks to solve the underlying nonconvex optimization problem associated with nonnegative CPD. Additionally, a discrete-time version of the continuous neural network is developed. To enhance the chances of reaching a potential global minimum, the recurrent neural networks are allowed to communicate and exchange information through particle swarm optimization (PSO). Convergence and stability analyses of both the continuous and discrete neurodynamic models are thoroughly examined. Experimental evaluations are conducted on random and real-world datasets to demonstrate the effectiveness of the proposed approach.
academic

Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization

Basic Information

  • Paper ID: 2411.18127
  • Title: Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization
  • Authors: Salman Ahmadi-Asl, Valentin Leplat, Anh-Huy Phan, Andrzej Cichocki
  • Classification: math.NA cs.NA
  • Submission Date: January 1, 2025 (submitted to arXiv)
  • Paper Link: https://arxiv.org/abs/2411.18127

Abstract

This paper proposes a novel collaborative neurodynamic model for computing nonnegative Canonical Polyadic Decomposition (CPD). The model relies on recurrent neural network systems to solve the underlying nonconvex optimization problems associated with nonnegative CPD. Furthermore, a discrete-time version of the continuous neural network is developed. To enhance the chances of reaching potential global minima, the recurrent neural networks communicate and exchange information through Particle Swarm Optimization (PSO). Comprehensive convergence and stability analyses are conducted for both continuous and discrete neurodynamic models. Experimental evaluations on synthetic and real datasets demonstrate the effectiveness of the proposed method.

Research Background and Motivation

Problem Background

Tensor decomposition is an important tool in machine learning and data science, particularly Canonical Polyadic Decomposition (CPD), which decomposes high-order tensors into sums of the minimum number of rank-1 tensors. Nonnegative CPD is significant in many practical applications, including data compression, matrix completion, Hammerstein identification, and clustering.

Limitations of Existing Methods

  1. Local Optimum Problem: Traditional iterative algorithms such as Hierarchical Alternating Least Squares (HALS) and Alternating Least Squares (ALS) tend to get trapped in local optima
  2. Convergence Speed: For difficult tensors with highly collinear factor matrices, existing methods converge slowly
  3. Global Optimization Challenge: Nonnegative CPD is a nonconvex optimization problem, making the search for global optima challenging

Research Motivation

Although collaborative neurodynamic optimization has demonstrated strong capabilities in convex and nonconvex optimization problems, its application to tensor decomposition remains limited. This paper aims to fill this gap by proposing a nonnegative tensor decomposition method based on collaborative neurodynamics.

Core Contributions

  1. Proposed a collaborative neurodynamic model for CPD computation, representing the first comprehensive study extending collaborative neurodynamic optimization to tensor decomposition
  2. Developed a discrete-time projected neural network for nonnegative CPD, providing a practical discrete version of the continuous model
  3. Developed an accelerated version through Hessian preconditioning strategy, improving convergence speed of both continuous and discrete neurodynamic models
  4. Provided comprehensive convergence and stability theoretical analysis, proving global convergence of the algorithm
  5. Demonstrated superior performance on highly collinear data tensors, particularly suitable for handling difficult tensor decomposition problems

Methodology Details

Problem Formulation

Given an N-order tensor XRI1×I2××IN\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}, the nonnegative CPD problem is defined as:

minA(1)0,,A(N)0XA(1),A(2),,A(N)F2\min_{A^{(1)} \geq 0, \ldots, A^{(N)} \geq 0} \|\mathcal{X} - \llbracket A^{(1)}, A^{(2)}, \ldots, A^{(N)} \rrbracket\|_F^2

where A(n)RIn×RA^{(n)} \in \mathbb{R}^{I_n \times R} is the n-th factor matrix and RR is the tensor rank.

Model Architecture

1. Continuous-Time Neurodynamic Model

For third-order tensors, the continuous neurodynamic system is defined as:

ϵ1dAdt=A+[AAF(A,B,C)PA1]+\epsilon_1 \frac{dA}{dt} = -A + [A - \nabla_A F(A,B,C) P_A^{-1}]_+ϵ2dBdt=B+[BBF(A,B,C)PB1]+\epsilon_2 \frac{dB}{dt} = -B + [B - \nabla_B F(A,B,C) P_B^{-1}]_+ϵ3dCdt=C+[CCF(A,B,C)PC1]+\epsilon_3 \frac{dC}{dt} = -C + [C - \nabla_C F(A,B,C) P_C^{-1}]_+

where:

  • F(A,B,C)=12XA,B,CF2F(A,B,C) = \frac{1}{2}\|\mathcal{X} - \llbracket A,B,C \rrbracket\|_F^2 is the objective function
  • PA=(CTC)(BTB)P_A = (C^T C) * (B^T B) is the Hessian preconditioning matrix
  • []+[\cdot]_+ denotes the activation function projecting onto the nonnegative orthant

2. Discrete-Time Projected Neural Network (DTPNN)

The discrete-time version of the continuous model is:

Ak+1=Ak+λk(Ak+[A~k]+)A_{k+1} = A_k + \lambda_k(-A_k + [\tilde{A}_k]_+)Bk+1=Bk+λk(Bk+[B~k]+)B_{k+1} = B_k + \lambda_k(-B_k + [\tilde{B}_k]_+)Ck+1=Ck+λk(Ck+[C~k]+)C_{k+1} = C_k + \lambda_k(-C_k + [\tilde{C}_k]_+)

where A~k=AkAF(Ak,Bk,Ck)\tilde{A}_k = A_k - \nabla_A F(A_k, B_k, C_k).

3. Collaborative Mechanism

Multiple neural networks collaborate through Particle Swarm Optimization (PSO):

vn(k+1)=αvn(k)+β1γ1(pn(k)xn(k))+β2γ2(pbest(k)xn(k))v_n^{(k+1)} = \alpha v_n^{(k)} + \beta_1 \gamma_1 (p_n^{(k)} - x_n^{(k)}) + \beta_2 \gamma_2 (p_{best}^{(k)} - x_n^{(k)})xn(k+1)=xn(k)+vn(k+1)x_n^{(k+1)} = x_n^{(k)} + v_n^{(k+1)}

where pn(k)p_n^{(k)} is the best position of the n-th particle and pbest(k)p_{best}^{(k)} is the global best position.

Technical Innovations

  1. Multi-timescale Neurodynamics: Using different time constants ϵ1,ϵ2,ϵ3\epsilon_1, \epsilon_2, \epsilon_3 allows factor matrices to update at different rates
  2. Hessian Preconditioning: Accelerates convergence through preconditioning matrices such as PA1P_A^{-1}
  3. Wavelet Mutation Mechanism: Employs Gabor wavelet functions to enhance search capability when particle diversity is low
  4. Logarithmic Barrier Method: Provides an alternative approach to convert constrained optimization into unconstrained optimization

Experimental Setup

Datasets

  1. Synthetic Datasets:
    • Difficult tensors: 9×9×9, rank R=10-16
    • Highly collinear tensors: 20×20×20, rank R=10
    • Large-scale tensors: 70×70×70, rank R=75
  2. Real Datasets:
    • COIL20: 32×32×1440 image dataset
    • YALE: 32×32×165 face dataset
    • ORL: 32×32×400 face dataset
    • Hyperspectral Images: Cuprite (120×120×180), Urban (120×120×162), Jasper Ridge (100×100×198)

Evaluation Metrics

Relative error is defined as: Relative error=XA(1),A(2),A(3)FXF\text{Relative error} = \frac{\|\mathcal{X} - \llbracket A^{(1)}, A^{(2)}, A^{(3)} \rrbracket\|_F}{\|\mathcal{X}\|_F}

Comparison Methods

  • HALS (Hierarchical Alternating Least Squares)
  • MUR (Multiplicative Updating Rules)
  • CCG, CGP, BFGSP, GradP and other optimization methods
  • ANLS (Alternating Nonnegative Least Squares)
  • Proco-ALS

Implementation Details

  • PSO parameters: α=0.5, β₁=β₂=0.01
  • Diversity threshold δ for triggering wavelet mutation
  • Backtracking line search for step size determination
  • Population size q=5-30 (adjusted according to experimental needs)

Experimental Results

Main Results

1. Difficult Tensor Decomposition

On 9×9×9 tensors with rank R=10, CNO-CPD achieves relative error of 10⁻⁶ within 600 iterations, significantly outperforming all baseline methods. The ANLS method fails in multiple tests, while CNO-CPD demonstrates stable performance.

2. Highly Collinear Tensors

For tensors with highly collinear factor matrices (μ≥0.96):

  • Case I (one highly collinear factor matrix): CNO-CPD converges to 10⁻⁶ error within 200 iterations
  • Case II (two highly collinear factor matrices): CNO-CPD performs equally well, while baseline methods converge slowly

3. Large-Scale Tensors

On 70×70×70 tensors with rank R=75, BFGSP and Proco-ALS methods fail, while CNO-CPD converges successfully and outperforms other methods.

4. Real Dataset Performance

  • COIL20, YALE, ORL: CNO-CPD achieves lower objective function values on all datasets
  • Hyperspectral Images: On Cuprite, Urban, and Jasper Ridge datasets, CNO-CPD demonstrates faster convergence speed

Ablation Studies

  1. Population Size Impact: As population size increases from 5 to 30, relative error decreases significantly, validating the effectiveness of the collaborative mechanism
  2. Discrete vs Continuous: Semi-implicit discrete methods outperform fully explicit and cubic regularization methods
  3. Classical ODE vs Logarithmic Barrier: Classical ODE formulation outperforms the logarithmic barrier method

Case Analysis

t-SNE visualization shows that factor matrices extracted by CNO-CPD can effectively be used for clustering tasks, demonstrating good clustering structure on COIL20, ORL, and YALE datasets.

Computational Efficiency

Although CNO-CPD has higher per-iteration complexity (due to reinitialization and ODE solver), for highly collinear tensors, CNO-CPD achieves 10⁻⁴ precision within 20 seconds, while HALS requires 100 seconds to achieve 10⁻¹ precision.

Tensor Decomposition Methods

Traditional methods include HALS, ALS, and MUR, which are based on alternating optimization strategies but tend to get trapped in local optima.

Neurodynamic Optimization

Has been applied to LU decomposition, Cholesky decomposition, sparse signal recovery, nonnegative matrix factorization, etc., but applications to tensor decomposition remain limited.

Advantages of This Work

Compared to existing work, this paper is the first to systematically apply collaborative neurodynamics to tensor decomposition, providing complete theoretical analysis and experimental validation.

Conclusions and Discussion

Main Conclusions

  1. CNO-CPD effectively solves nonnegative tensor decomposition problems, particularly suitable for difficult tensors with high collinearity
  2. Theoretical analysis proves global convergence of the algorithm
  3. Experimental results validate superior performance across multiple datasets

Limitations

  1. Computational Complexity: For large-scale tensors, continuous dynamical systems require large time steps, resulting in high computational cost
  2. Parameter Sensitivity: PSO parameters and population size need adjustment for specific problems
  3. Memory Requirements: Hessian preconditioning requires O(R²) memory space

Future Directions

  1. Extend collaborative neurodynamics to other tensor decompositions (Tucker decomposition, tensor ring decomposition, etc.)
  2. Explore nonnegative tensor decomposition based on Kullback-Leibler divergence and Alpha-Beta divergence
  3. Develop parallelization strategies for handling ultra-large-scale tensors

In-Depth Evaluation

Strengths

  1. Theoretical Completeness: Provides complete convergence and stability analysis for both continuous and discrete models
  2. Method Novelty: First systematic application of collaborative neurodynamics to tensor decomposition
  3. Experimental Sufficiency: Comprehensive experimental validation on synthetic and real datasets
  4. Practical Value: Particularly suitable for handling difficult tensor decomposition problems with high collinearity

Weaknesses

  1. Computational Efficiency: Higher per-iteration computational complexity compared to traditional methods
  2. Parameter Tuning: Requires adjustment of multiple PSO parameters, increasing usage complexity
  3. Scalability: Handling capability for ultra-large-scale tensors requires further verification

Impact

  1. Academic Contribution: Introduces a new optimization paradigm to the tensor decomposition field
  2. Application Prospects: Broad application potential in machine learning, signal processing, and data mining
  3. Reproducibility: Authors provide code implementation, facilitating reproduction and further research

Applicable Scenarios

  1. Tensor decomposition with highly collinear factor matrices
  2. Nonnegative tensor decomposition problems requiring global optimization
  3. Application scenarios with high decomposition quality requirements (e.g., hyperspectral image processing, clustering analysis)

References

The paper cites 55 relevant references covering multiple domains including tensor decomposition, neurodynamic optimization, and particle swarm optimization, providing a solid theoretical foundation for this research.


Overall Assessment: This is a high-quality academic paper that demonstrates excellence in theoretical innovation, methodological completeness, and experimental validation. Although it has certain limitations in computational efficiency, its advantages in solving difficult tensor decomposition problems give it significant academic value and application prospects.