This paper introduces a novel collaborative neurodynamic model for computing nonnegative Canonical Polyadic Decomposition (CPD). The model relies on a system of recurrent neural networks to solve the underlying nonconvex optimization problem associated with nonnegative CPD. Additionally, a discrete-time version of the continuous neural network is developed. To enhance the chances of reaching a potential global minimum, the recurrent neural networks are allowed to communicate and exchange information through particle swarm optimization (PSO). Convergence and stability analyses of both the continuous and discrete neurodynamic models are thoroughly examined. Experimental evaluations are conducted on random and real-world datasets to demonstrate the effectiveness of the proposed approach.
- Paper ID: 2411.18127
- Title: Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization
- Authors: Salman Ahmadi-Asl, Valentin Leplat, Anh-Huy Phan, Andrzej Cichocki
- Classification: math.NA cs.NA
- Submission Date: January 1, 2025 (submitted to arXiv)
- Paper Link: https://arxiv.org/abs/2411.18127
This paper proposes a novel collaborative neurodynamic model for computing nonnegative Canonical Polyadic Decomposition (CPD). The model relies on recurrent neural network systems to solve the underlying nonconvex optimization problems associated with nonnegative CPD. Furthermore, a discrete-time version of the continuous neural network is developed. To enhance the chances of reaching potential global minima, the recurrent neural networks communicate and exchange information through Particle Swarm Optimization (PSO). Comprehensive convergence and stability analyses are conducted for both continuous and discrete neurodynamic models. Experimental evaluations on synthetic and real datasets demonstrate the effectiveness of the proposed method.
Tensor decomposition is an important tool in machine learning and data science, particularly Canonical Polyadic Decomposition (CPD), which decomposes high-order tensors into sums of the minimum number of rank-1 tensors. Nonnegative CPD is significant in many practical applications, including data compression, matrix completion, Hammerstein identification, and clustering.
- Local Optimum Problem: Traditional iterative algorithms such as Hierarchical Alternating Least Squares (HALS) and Alternating Least Squares (ALS) tend to get trapped in local optima
- Convergence Speed: For difficult tensors with highly collinear factor matrices, existing methods converge slowly
- Global Optimization Challenge: Nonnegative CPD is a nonconvex optimization problem, making the search for global optima challenging
Although collaborative neurodynamic optimization has demonstrated strong capabilities in convex and nonconvex optimization problems, its application to tensor decomposition remains limited. This paper aims to fill this gap by proposing a nonnegative tensor decomposition method based on collaborative neurodynamics.
- Proposed a collaborative neurodynamic model for CPD computation, representing the first comprehensive study extending collaborative neurodynamic optimization to tensor decomposition
- Developed a discrete-time projected neural network for nonnegative CPD, providing a practical discrete version of the continuous model
- Developed an accelerated version through Hessian preconditioning strategy, improving convergence speed of both continuous and discrete neurodynamic models
- Provided comprehensive convergence and stability theoretical analysis, proving global convergence of the algorithm
- Demonstrated superior performance on highly collinear data tensors, particularly suitable for handling difficult tensor decomposition problems
Given an N-order tensor X∈RI1×I2×⋯×IN, the nonnegative CPD problem is defined as:
minA(1)≥0,…,A(N)≥0∥X−[[A(1),A(2),…,A(N)]]∥F2
where A(n)∈RIn×R is the n-th factor matrix and R is the tensor rank.
For third-order tensors, the continuous neurodynamic system is defined as:
ϵ1dtdA=−A+[A−∇AF(A,B,C)PA−1]+ϵ2dtdB=−B+[B−∇BF(A,B,C)PB−1]+ϵ3dtdC=−C+[C−∇CF(A,B,C)PC−1]+
where:
- F(A,B,C)=21∥X−[[A,B,C]]∥F2 is the objective function
- PA=(CTC)∗(BTB) is the Hessian preconditioning matrix
- [⋅]+ denotes the activation function projecting onto the nonnegative orthant
The discrete-time version of the continuous model is:
Ak+1=Ak+λk(−Ak+[A~k]+)Bk+1=Bk+λk(−Bk+[B~k]+)Ck+1=Ck+λk(−Ck+[C~k]+)
where A~k=Ak−∇AF(Ak,Bk,Ck).
Multiple neural networks collaborate through Particle Swarm Optimization (PSO):
vn(k+1)=αvn(k)+β1γ1(pn(k)−xn(k))+β2γ2(pbest(k)−xn(k))xn(k+1)=xn(k)+vn(k+1)
where pn(k) is the best position of the n-th particle and pbest(k) is the global best position.
- Multi-timescale Neurodynamics: Using different time constants ϵ1,ϵ2,ϵ3 allows factor matrices to update at different rates
- Hessian Preconditioning: Accelerates convergence through preconditioning matrices such as PA−1
- Wavelet Mutation Mechanism: Employs Gabor wavelet functions to enhance search capability when particle diversity is low
- Logarithmic Barrier Method: Provides an alternative approach to convert constrained optimization into unconstrained optimization
- Synthetic Datasets:
- Difficult tensors: 9×9×9, rank R=10-16
- Highly collinear tensors: 20×20×20, rank R=10
- Large-scale tensors: 70×70×70, rank R=75
- Real Datasets:
- COIL20: 32×32×1440 image dataset
- YALE: 32×32×165 face dataset
- ORL: 32×32×400 face dataset
- Hyperspectral Images: Cuprite (120×120×180), Urban (120×120×162), Jasper Ridge (100×100×198)
Relative error is defined as:
Relative error=∥X∥F∥X−[[A(1),A(2),A(3)]]∥F
- HALS (Hierarchical Alternating Least Squares)
- MUR (Multiplicative Updating Rules)
- CCG, CGP, BFGSP, GradP and other optimization methods
- ANLS (Alternating Nonnegative Least Squares)
- Proco-ALS
- PSO parameters: α=0.5, β₁=β₂=0.01
- Diversity threshold δ for triggering wavelet mutation
- Backtracking line search for step size determination
- Population size q=5-30 (adjusted according to experimental needs)
On 9×9×9 tensors with rank R=10, CNO-CPD achieves relative error of 10⁻⁶ within 600 iterations, significantly outperforming all baseline methods. The ANLS method fails in multiple tests, while CNO-CPD demonstrates stable performance.
For tensors with highly collinear factor matrices (μ≥0.96):
- Case I (one highly collinear factor matrix): CNO-CPD converges to 10⁻⁶ error within 200 iterations
- Case II (two highly collinear factor matrices): CNO-CPD performs equally well, while baseline methods converge slowly
On 70×70×70 tensors with rank R=75, BFGSP and Proco-ALS methods fail, while CNO-CPD converges successfully and outperforms other methods.
- COIL20, YALE, ORL: CNO-CPD achieves lower objective function values on all datasets
- Hyperspectral Images: On Cuprite, Urban, and Jasper Ridge datasets, CNO-CPD demonstrates faster convergence speed
- Population Size Impact: As population size increases from 5 to 30, relative error decreases significantly, validating the effectiveness of the collaborative mechanism
- Discrete vs Continuous: Semi-implicit discrete methods outperform fully explicit and cubic regularization methods
- Classical ODE vs Logarithmic Barrier: Classical ODE formulation outperforms the logarithmic barrier method
t-SNE visualization shows that factor matrices extracted by CNO-CPD can effectively be used for clustering tasks, demonstrating good clustering structure on COIL20, ORL, and YALE datasets.
Although CNO-CPD has higher per-iteration complexity (due to reinitialization and ODE solver), for highly collinear tensors, CNO-CPD achieves 10⁻⁴ precision within 20 seconds, while HALS requires 100 seconds to achieve 10⁻¹ precision.
Traditional methods include HALS, ALS, and MUR, which are based on alternating optimization strategies but tend to get trapped in local optima.
Has been applied to LU decomposition, Cholesky decomposition, sparse signal recovery, nonnegative matrix factorization, etc., but applications to tensor decomposition remain limited.
Compared to existing work, this paper is the first to systematically apply collaborative neurodynamics to tensor decomposition, providing complete theoretical analysis and experimental validation.
- CNO-CPD effectively solves nonnegative tensor decomposition problems, particularly suitable for difficult tensors with high collinearity
- Theoretical analysis proves global convergence of the algorithm
- Experimental results validate superior performance across multiple datasets
- Computational Complexity: For large-scale tensors, continuous dynamical systems require large time steps, resulting in high computational cost
- Parameter Sensitivity: PSO parameters and population size need adjustment for specific problems
- Memory Requirements: Hessian preconditioning requires O(R²) memory space
- Extend collaborative neurodynamics to other tensor decompositions (Tucker decomposition, tensor ring decomposition, etc.)
- Explore nonnegative tensor decomposition based on Kullback-Leibler divergence and Alpha-Beta divergence
- Develop parallelization strategies for handling ultra-large-scale tensors
- Theoretical Completeness: Provides complete convergence and stability analysis for both continuous and discrete models
- Method Novelty: First systematic application of collaborative neurodynamics to tensor decomposition
- Experimental Sufficiency: Comprehensive experimental validation on synthetic and real datasets
- Practical Value: Particularly suitable for handling difficult tensor decomposition problems with high collinearity
- Computational Efficiency: Higher per-iteration computational complexity compared to traditional methods
- Parameter Tuning: Requires adjustment of multiple PSO parameters, increasing usage complexity
- Scalability: Handling capability for ultra-large-scale tensors requires further verification
- Academic Contribution: Introduces a new optimization paradigm to the tensor decomposition field
- Application Prospects: Broad application potential in machine learning, signal processing, and data mining
- Reproducibility: Authors provide code implementation, facilitating reproduction and further research
- Tensor decomposition with highly collinear factor matrices
- Nonnegative tensor decomposition problems requiring global optimization
- Application scenarios with high decomposition quality requirements (e.g., hyperspectral image processing, clustering analysis)
The paper cites 55 relevant references covering multiple domains including tensor decomposition, neurodynamic optimization, and particle swarm optimization, providing a solid theoretical foundation for this research.
Overall Assessment: This is a high-quality academic paper that demonstrates excellence in theoretical innovation, methodological completeness, and experimental validation. Although it has certain limitations in computational efficiency, its advantages in solving difficult tensor decomposition problems give it significant academic value and application prospects.