2025-11-10T02:43:53.338320

Fast Accelerated Proximal Gradient Method with New Extrapolation Term for Multiobjective Optimization

Huang

In this paper, we propose a novel extrapolation coefficient scheme within a new extrapolation term and develop an accelerated proximal gradient algorithm. We establish that the algorithm achieves a sublinear convergence rate. The proposed scheme only requires the Lipschitz constant estimate sequence to satisfy mild initial conditions, under which a key equality property can be derived to support the convergence analysis. Numerical experiments are provided to demonstrate the effectiveness and practical performance of the proposed method.

academic

Fast Accelerated Proximal Gradient Method with New Extrapolation Term for Multiobjective Optimization

Basic Information

Paper ID: 2507.06737
Title: Fast Accelerated Proximal Gradient Method with New Extrapolation Term for Multiobjective Optimization
Author: Huang Chengzhi
Classification: math.OC (Optimization and Control)
Publication Date: October 17, 2025
Paper Link: https://arxiv.org/abs/2507.06737

Abstract

This paper proposes a novel extrapolation coefficient scheme and extrapolation term, and develops an accelerated proximal gradient algorithm. The algorithm achieves sublinear convergence rates. The proposed scheme only requires the sequence of Lipschitz constant estimates to satisfy mild initial conditions, under which key equality properties can be derived to support convergence analysis. Numerical experiments validate the effectiveness and practical performance of the proposed method.

Research Background and Motivation

Problem to be Solved: Multiobjective optimization problems, particularly composite unconstrained multiobjective optimization problems: $\min_{x \in \mathbb{R}^n} F(x) \equiv (f_1(x) + g_1(x), \ldots, f_m(x) + g_m(x))^T$ where $f_i$ are smooth convex functions and $g_i$ are convex functions (possibly nonsmooth).
Problem Significance: Multiobjective optimization is ubiquitous in practical applications, such as image restoration and compressed sensing. Such problems typically do not have a single optimal solution but rather a solution set consisting of Pareto optimal solutions.
Limitations of Existing Methods:
- Tanabe et al. extended FISTA to multiobjective optimization, achieving $O(1/k^2)$ convergence rate
- Work by Sonntag et al. and Zhang et al. suffers from incomplete theoretical proofs, with convergence analysis depending on the nonnegativity of the auxiliary function $\sigma(z) = \min_{i=1,\ldots,m} F_i(x_k) - F_i(z)$ , which is difficult to guarantee
Research Motivation: To overcome defects in theoretical analysis of existing methods, propose methods with milder requirements on initial Lipschitz constant estimates, and avoid dependence on the nonnegativity of $\sigma$ through key equalities.

Core Contributions

Proposes a New Extrapolation Term Scheme: Employs the extrapolation form $y_k = x_k + \frac{k+\alpha-4}{k+\alpha-1}(x_k - x_{k-1})$ , where $\alpha \geq 3$
Establishes Mild Initial Conditions: Only requires the sequence of Lipschitz constant estimates to satisfy weaker initial conditions
Derives Key Equality Properties: Avoids dependence on the nonnegativity of auxiliary functions, perfecting the theoretical analysis
Proves Sublinear Convergence Rate: Achieves $O(1/k^2)$ convergence rate in the smooth case and $O(1/k)$ in the nonsmooth case
Extends to Nonsmooth Case: Handles completely nonsmooth multiobjective optimization problems through smoothing techniques

Method Details

Problem Formulation

Consider the composite unconstrained multiobjective optimization problem (MOP): $\min_{x \in \mathbb{R}^n} F(x) = (f_1(x) + g_1(x), \ldots, f_m(x) + g_m(x))^T$

where:

$f_i: \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable convex functions
$g_i: \mathbb{R}^n \to \mathbb{R}$ are convex functions (possibly nonsmooth)
The objective is to find weakly Pareto optimal solutions

Model Architecture

Algorithm for Smooth Case (Algorithm 1)

Core Subproblem: $\min_{z \in \mathbb{R}^n} \phi_{L(f)}(z; x, y) = \max_{i=1,\ldots,m}[\langle\nabla f_i(y), z-y\rangle + g_i(z) + f_i(y) - F_i(x)] + \frac{L(f)}{2}\|z-y\|^2$

Algorithm Steps:

Compute extrapolation point: $y_k = x_k + \frac{k+\alpha-4}{k+\alpha-1}(x_k - x_{k-1})$
Solve subproblem: $x_{k+1} = p_{s_k}(x_k, y_k)$
Update parameters: $s_{k+1} = \eta s_k$ , where $\eta = \frac{(k+\alpha-2)^2}{(k+\alpha-1)(k+\alpha-3)}$

Parameter Conditions:

When $\alpha > 3$ : $0 < \frac{\alpha-2}{\alpha-3}s_0 < \frac{1}{L(f)}$
When $\alpha = 3$ : $0 < s_0 < \frac{1}{L(f)}$

Algorithm for Nonsmooth Case (Algorithm 2)

Approximates nonsmooth functions $f_i(x)$ through smoothing functions $\tilde{f}_i(x, \mu)$ , where the smoothing function satisfies:

Continuous Differentiability: For fixed $\mu > 0$ , $\tilde{f}(\cdot, \mu)$ is continuously differentiable
Consistency: $\lim_{z \to x, \mu \downarrow 0} \tilde{f}(z, \mu) = f(x)$
Gradient Consistency: $\{\lim_{z \to x, \mu \downarrow 0} \nabla\tilde{f}(z, \mu)\} \subseteq \partial f(x)$

Technical Innovations

Novel Extrapolation Coefficient Design: Through specific parameter update scheme $\eta = \frac{(k+\alpha-2)^2}{(k+\alpha-1)(k+\alpha-3)}$ , ensures $s_k < \frac{1}{L(f)}$ holds consistently
Key Equality Derivation: Through clever algebraic manipulation and parameter selection, avoids dependence on the nonnegativity of $\sigma_k(z)$
Unified Framework: Degenerates to existing methods when $\alpha = 3$ , but provides more complete theoretical analysis

Experimental Setup

Datasets

The paper mentions numerical experiments on three tri-objective optimization problems:

BK1&ℓ1 problem
JOS1&ℓ1 problem
SP1&ℓ1 problem

Evaluation Metrics

Uses merit function $u_0(x) = \sup_{z \in \mathbb{R}^n} \min_{i=1,\ldots,m}[F_i(x) - F_i(z)]$ to evaluate algorithm performance, which satisfies:

$u_0(x) \geq 0$ for all $x$
$x$ is weakly Pareto optimal if and only if $u_0(x) = 0$

Implementation Details

Stopping criterion: $\|x_k - x_{k+1}\| < \varepsilon$
For nonsmooth case also requires $\mu_k < \varepsilon$
Parameter updates: $\mu_{k+1} = \frac{k+\alpha-2}{k+\alpha-1}\mu_k$ , $s_{k+1} = \frac{k+\alpha-2}{k+\alpha-3}s_k$

Experimental Results

Main Results

The paper presents Pareto frontier plots for three tri-objective optimization problems, though specific numerical results and detailed performance comparison data are incomplete in the provided document.

Convergence Theory Results

Smooth Case (Theorem 4.3): $u_0(x_k) \leq \frac{L(f)(\alpha-1)^2}{2(k+\alpha-1)^2}R$ Achieves $O(1/k^2)$ convergence rate.

Nonsmooth Case (Theorem 6.2): $u_0(x_{k+1}) \leq O\left(\frac{1}{k}\right)$ Achieves $O(1/k)$ convergence rate.

Multiobjective FISTA Extension: Tanabe et al. first extended FISTA to multiobjective optimization, achieving $O(1/k^2)$ convergence rate
Monotone Variants: Nishimura et al. proposed monotone variants of multiobjective FISTA
Generalized Framework: Tanabe et al. generalized the framework to single-objective cases through hyperparameter introduction
Nesterov-type Schemes: Sonntag et al. and Zhang et al. attempted to use more effective extrapolation terms, but with incomplete theoretical analysis
Nonsmooth Methods: Gebken et al. proposed subgradient descent algorithms for nonsmooth multiobjective optimization

Conclusions and Discussion

Main Conclusions

Proposes an accelerated proximal gradient method with novel extrapolation term applicable to smooth and nonsmooth multiobjective optimization
Establishes complete convergence theory, avoiding theoretical defects of existing methods
Achieves $O(1/k^2)$ convergence rate in smooth case and $O(1/k)$ in nonsmooth case

Limitations

Insufficient Experimental Validation: Numerical experiment results are incomplete, lacking detailed performance comparisons
Parameter Selection Constraints: Specific requirements on initial parameters $s_0$ and $\alpha$
Slower Convergence in Nonsmooth Case: Compared to the smooth case, the nonsmooth version has reduced convergence rate of $O(1/k)$

Future Directions

Explore better smoothing techniques to improve convergence rate in nonsmooth case
Investigate adaptive parameter selection strategies
Extend to constrained multiobjective optimization problems

In-Depth Evaluation

Strengths

Significant Theoretical Contribution: Resolves key defects in theoretical analysis of existing methods, providing complete convergence proofs
Clever Method Design: Through specific parameter update strategies ensures theoretical guarantees of the algorithm
Framework Unification: Incorporates both smooth and nonsmooth cases into a unified framework
Mathematical Rigor: Detailed proofs with clear logical structure

Weaknesses

Insufficient Experimental Verification: Numerical experiments are overly simplistic, lacking detailed comparisons with other advanced methods
Lack of Practical Analysis: Missing in-depth analysis of computational complexity and practical application scenarios
Parameter Sensitivity Not Discussed: No analysis of parameter selection impact on algorithm performance

Impact

High Theoretical Value: Provides more complete theoretical foundation for accelerated methods in multiobjective optimization
Practical Value Pending Verification: Requires more experimental validation of effectiveness in real-world problems
Good Reproducibility: Clear algorithm description and complete theoretical analysis

Applicable Scenarios

Composite structure multiobjective optimization problems
Application domains such as image processing and compressed sensing
Optimization scenarios requiring theoretical guarantees

References

The paper cites important literature in multiobjective optimization, including:

Pioneering work by Tanabe et al. on multiobjective FISTA
Related theory of Nesterov acceleration methods
Literature on smoothing techniques
Classical theoretical foundations of multiobjective optimization

Overall Assessment: This is a theoretically strong paper that successfully addresses theoretical defects in existing multiobjective accelerated proximal gradient methods, providing complete convergence analysis. However, the paper has room for improvement in experimental validation and practical analysis.