2025-11-19T08:52:13.731098

Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization

Ahmadi-Asl, Leplat, Phan et al.

This paper introduces a novel collaborative neurodynamic model for computing nonnegative Canonical Polyadic Decomposition (CPD). The model relies on a system of recurrent neural networks to solve the underlying nonconvex optimization problem associated with nonnegative CPD. Additionally, a discrete-time version of the continuous neural network is developed. To enhance the chances of reaching a potential global minimum, the recurrent neural networks are allowed to communicate and exchange information through particle swarm optimization (PSO). Convergence and stability analyses of both the continuous and discrete neurodynamic models are thoroughly examined. Experimental evaluations are conducted on random and real-world datasets to demonstrate the effectiveness of the proposed approach.

academic

Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization

Basic Information

Paper ID: 2411.18127
Title: Nonnegative Tensor Decomposition Via Collaborative Neurodynamic Optimization
Authors: Salman Ahmadi-Asl, Valentin Leplat, Anh-Huy Phan, Andrzej Cichocki
Classification: math.NA cs.NA
Submission Date: January 1, 2025 (submitted to arXiv)
Paper Link: https://arxiv.org/abs/2411.18127

Abstract

This paper proposes a novel collaborative neurodynamic model for computing nonnegative Canonical Polyadic Decomposition (CPD). The model relies on recurrent neural network systems to solve the underlying nonconvex optimization problems associated with nonnegative CPD. Furthermore, a discrete-time version of the continuous neural network is developed. To enhance the chances of reaching potential global minima, the recurrent neural networks communicate and exchange information through Particle Swarm Optimization (PSO). Comprehensive convergence and stability analyses are conducted for both continuous and discrete neurodynamic models. Experimental evaluations on synthetic and real datasets demonstrate the effectiveness of the proposed method.

Research Background and Motivation

Problem Background

Tensor decomposition is an important tool in machine learning and data science, particularly Canonical Polyadic Decomposition (CPD), which decomposes high-order tensors into sums of the minimum number of rank-1 tensors. Nonnegative CPD is significant in many practical applications, including data compression, matrix completion, Hammerstein identification, and clustering.

Limitations of Existing Methods

Local Optimum Problem: Traditional iterative algorithms such as Hierarchical Alternating Least Squares (HALS) and Alternating Least Squares (ALS) tend to get trapped in local optima
Convergence Speed: For difficult tensors with highly collinear factor matrices, existing methods converge slowly
Global Optimization Challenge: Nonnegative CPD is a nonconvex optimization problem, making the search for global optima challenging

Research Motivation

Although collaborative neurodynamic optimization has demonstrated strong capabilities in convex and nonconvex optimization problems, its application to tensor decomposition remains limited. This paper aims to fill this gap by proposing a nonnegative tensor decomposition method based on collaborative neurodynamics.

Core Contributions

Proposed a collaborative neurodynamic model for CPD computation, representing the first comprehensive study extending collaborative neurodynamic optimization to tensor decomposition
Developed a discrete-time projected neural network for nonnegative CPD, providing a practical discrete version of the continuous model
Developed an accelerated version through Hessian preconditioning strategy, improving convergence speed of both continuous and discrete neurodynamic models
Provided comprehensive convergence and stability theoretical analysis, proving global convergence of the algorithm
Demonstrated superior performance on highly collinear data tensors, particularly suitable for handling difficult tensor decomposition problems

Methodology Details

Problem Formulation

Given an N-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ , the nonnegative CPD problem is defined as:

$\min_{A^{(1)} \geq 0, \ldots, A^{(N)} \geq 0} \|\mathcal{X} - \llbracket A^{(1)}, A^{(2)}, \ldots, A^{(N)} \rrbracket\|_F^2$

where $A^{(n)} \in \mathbb{R}^{I_n \times R}$ is the n-th factor matrix and $R$ is the tensor rank.

Model Architecture

1. Continuous-Time Neurodynamic Model

For third-order tensors, the continuous neurodynamic system is defined as:

$\epsilon_1 \frac{dA}{dt} = -A + [A - \nabla_A F(A,B,C) P_A^{-1}]_+$ $\epsilon_2 \frac{dB}{dt} = -B + [B - \nabla_B F(A,B,C) P_B^{-1}]_+$ $\epsilon_3 \frac{dC}{dt} = -C + [C - \nabla_C F(A,B,C) P_C^{-1}]_+$

where:

$F(A,B,C) = \frac{1}{2}\|\mathcal{X} - \llbracket A,B,C \rrbracket\|_F^2$ is the objective function
$P_A = (C^T C) * (B^T B)$ is the Hessian preconditioning matrix
$[\cdot]_+$ denotes the activation function projecting onto the nonnegative orthant

2. Discrete-Time Projected Neural Network (DTPNN)

The discrete-time version of the continuous model is:

$A_{k+1} = A_k + \lambda_k(-A_k + [\tilde{A}_k]_+)$ $B_{k+1} = B_k + \lambda_k(-B_k + [\tilde{B}_k]_+)$ $C_{k+1} = C_k + \lambda_k(-C_k + [\tilde{C}_k]_+)$

where $\tilde{A}_k = A_k - \nabla_A F(A_k, B_k, C_k)$ .

3. Collaborative Mechanism

Multiple neural networks collaborate through Particle Swarm Optimization (PSO):

$v_n^{(k+1)} = \alpha v_n^{(k)} + \beta_1 \gamma_1 (p_n^{(k)} - x_n^{(k)}) + \beta_2 \gamma_2 (p_{best}^{(k)} - x_n^{(k)})$ $x_n^{(k+1)} = x_n^{(k)} + v_n^{(k+1)}$

where $p_n^{(k)}$ is the best position of the n-th particle and $p_{best}^{(k)}$ is the global best position.

Technical Innovations

Multi-timescale Neurodynamics: Using different time constants $\epsilon_1, \epsilon_2, \epsilon_3$ allows factor matrices to update at different rates
Hessian Preconditioning: Accelerates convergence through preconditioning matrices such as $P_A^{-1}$
Wavelet Mutation Mechanism: Employs Gabor wavelet functions to enhance search capability when particle diversity is low
Logarithmic Barrier Method: Provides an alternative approach to convert constrained optimization into unconstrained optimization

Experimental Setup

Datasets

Synthetic Datasets:
- Difficult tensors: 9×9×9, rank R=10-16
- Highly collinear tensors: 20×20×20, rank R=10
- Large-scale tensors: 70×70×70, rank R=75
Real Datasets:
- COIL20: 32×32×1440 image dataset
- YALE: 32×32×165 face dataset
- ORL: 32×32×400 face dataset
- Hyperspectral Images: Cuprite (120×120×180), Urban (120×120×162), Jasper Ridge (100×100×198)

Evaluation Metrics

Relative error is defined as: $\text{Relative error} = \frac{\|\mathcal{X} - \llbracket A^{(1)}, A^{(2)}, A^{(3)} \rrbracket\|_F}{\|\mathcal{X}\|_F}$

Comparison Methods

HALS (Hierarchical Alternating Least Squares)
MUR (Multiplicative Updating Rules)
CCG, CGP, BFGSP, GradP and other optimization methods
ANLS (Alternating Nonnegative Least Squares)
Proco-ALS

Implementation Details

PSO parameters: α=0.5, β₁=β₂=0.01
Diversity threshold δ for triggering wavelet mutation
Backtracking line search for step size determination
Population size q=5-30 (adjusted according to experimental needs)

Experimental Results

Main Results

1. Difficult Tensor Decomposition

On 9×9×9 tensors with rank R=10, CNO-CPD achieves relative error of 10⁻⁶ within 600 iterations, significantly outperforming all baseline methods. The ANLS method fails in multiple tests, while CNO-CPD demonstrates stable performance.

2. Highly Collinear Tensors

For tensors with highly collinear factor matrices (μ≥0.96):

Case I (one highly collinear factor matrix): CNO-CPD converges to 10⁻⁶ error within 200 iterations
Case II (two highly collinear factor matrices): CNO-CPD performs equally well, while baseline methods converge slowly

3. Large-Scale Tensors

On 70×70×70 tensors with rank R=75, BFGSP and Proco-ALS methods fail, while CNO-CPD converges successfully and outperforms other methods.

4. Real Dataset Performance

COIL20, YALE, ORL: CNO-CPD achieves lower objective function values on all datasets
Hyperspectral Images: On Cuprite, Urban, and Jasper Ridge datasets, CNO-CPD demonstrates faster convergence speed

Ablation Studies

Population Size Impact: As population size increases from 5 to 30, relative error decreases significantly, validating the effectiveness of the collaborative mechanism
Discrete vs Continuous: Semi-implicit discrete methods outperform fully explicit and cubic regularization methods
Classical ODE vs Logarithmic Barrier: Classical ODE formulation outperforms the logarithmic barrier method

Case Analysis

t-SNE visualization shows that factor matrices extracted by CNO-CPD can effectively be used for clustering tasks, demonstrating good clustering structure on COIL20, ORL, and YALE datasets.

Computational Efficiency

Although CNO-CPD has higher per-iteration complexity (due to reinitialization and ODE solver), for highly collinear tensors, CNO-CPD achieves 10⁻⁴ precision within 20 seconds, while HALS requires 100 seconds to achieve 10⁻¹ precision.

Tensor Decomposition Methods

Traditional methods include HALS, ALS, and MUR, which are based on alternating optimization strategies but tend to get trapped in local optima.

Neurodynamic Optimization

Has been applied to LU decomposition, Cholesky decomposition, sparse signal recovery, nonnegative matrix factorization, etc., but applications to tensor decomposition remain limited.

Advantages of This Work

Compared to existing work, this paper is the first to systematically apply collaborative neurodynamics to tensor decomposition, providing complete theoretical analysis and experimental validation.

Conclusions and Discussion

Main Conclusions

CNO-CPD effectively solves nonnegative tensor decomposition problems, particularly suitable for difficult tensors with high collinearity
Theoretical analysis proves global convergence of the algorithm
Experimental results validate superior performance across multiple datasets

Limitations

Computational Complexity: For large-scale tensors, continuous dynamical systems require large time steps, resulting in high computational cost
Parameter Sensitivity: PSO parameters and population size need adjustment for specific problems
Memory Requirements: Hessian preconditioning requires O(R²) memory space

Future Directions

Extend collaborative neurodynamics to other tensor decompositions (Tucker decomposition, tensor ring decomposition, etc.)
Explore nonnegative tensor decomposition based on Kullback-Leibler divergence and Alpha-Beta divergence
Develop parallelization strategies for handling ultra-large-scale tensors

In-Depth Evaluation

Strengths

Theoretical Completeness: Provides complete convergence and stability analysis for both continuous and discrete models
Method Novelty: First systematic application of collaborative neurodynamics to tensor decomposition
Experimental Sufficiency: Comprehensive experimental validation on synthetic and real datasets
Practical Value: Particularly suitable for handling difficult tensor decomposition problems with high collinearity

Weaknesses

Computational Efficiency: Higher per-iteration computational complexity compared to traditional methods
Parameter Tuning: Requires adjustment of multiple PSO parameters, increasing usage complexity
Scalability: Handling capability for ultra-large-scale tensors requires further verification

Impact

Academic Contribution: Introduces a new optimization paradigm to the tensor decomposition field
Application Prospects: Broad application potential in machine learning, signal processing, and data mining
Reproducibility: Authors provide code implementation, facilitating reproduction and further research

Applicable Scenarios

Tensor decomposition with highly collinear factor matrices
Nonnegative tensor decomposition problems requiring global optimization
Application scenarios with high decomposition quality requirements (e.g., hyperspectral image processing, clustering analysis)

References

The paper cites 55 relevant references covering multiple domains including tensor decomposition, neurodynamic optimization, and particle swarm optimization, providing a solid theoretical foundation for this research.

Overall Assessment: This is a high-quality academic paper that demonstrates excellence in theoretical innovation, methodological completeness, and experimental validation. Although it has certain limitations in computational efficiency, its advantages in solving difficult tensor decomposition problems give it significant academic value and application prospects.