2025-11-10T02:43:53.338320

Fast Accelerated Proximal Gradient Method with New Extrapolation Term for Multiobjective Optimization

Huang
In this paper, we propose a novel extrapolation coefficient scheme within a new extrapolation term and develop an accelerated proximal gradient algorithm. We establish that the algorithm achieves a sublinear convergence rate. The proposed scheme only requires the Lipschitz constant estimate sequence to satisfy mild initial conditions, under which a key equality property can be derived to support the convergence analysis. Numerical experiments are provided to demonstrate the effectiveness and practical performance of the proposed method.
academic

Fast Accelerated Proximal Gradient Method with New Extrapolation Term for Multiobjective Optimization

Basic Information

  • Paper ID: 2507.06737
  • Title: Fast Accelerated Proximal Gradient Method with New Extrapolation Term for Multiobjective Optimization
  • Author: Huang Chengzhi
  • Classification: math.OC (Optimization and Control)
  • Publication Date: October 17, 2025
  • Paper Link: https://arxiv.org/abs/2507.06737

Abstract

This paper proposes a novel extrapolation coefficient scheme and extrapolation term, and develops an accelerated proximal gradient algorithm. The algorithm achieves sublinear convergence rates. The proposed scheme only requires the sequence of Lipschitz constant estimates to satisfy mild initial conditions, under which key equality properties can be derived to support convergence analysis. Numerical experiments validate the effectiveness and practical performance of the proposed method.

Research Background and Motivation

  1. Problem to be Solved: Multiobjective optimization problems, particularly composite unconstrained multiobjective optimization problems: minxRnF(x)(f1(x)+g1(x),,fm(x)+gm(x))T\min_{x \in \mathbb{R}^n} F(x) \equiv (f_1(x) + g_1(x), \ldots, f_m(x) + g_m(x))^T where fif_i are smooth convex functions and gig_i are convex functions (possibly nonsmooth).
  2. Problem Significance: Multiobjective optimization is ubiquitous in practical applications, such as image restoration and compressed sensing. Such problems typically do not have a single optimal solution but rather a solution set consisting of Pareto optimal solutions.
  3. Limitations of Existing Methods:
    • Tanabe et al. extended FISTA to multiobjective optimization, achieving O(1/k2)O(1/k^2) convergence rate
    • Work by Sonntag et al. and Zhang et al. suffers from incomplete theoretical proofs, with convergence analysis depending on the nonnegativity of the auxiliary function σ(z)=mini=1,,mFi(xk)Fi(z)\sigma(z) = \min_{i=1,\ldots,m} F_i(x_k) - F_i(z), which is difficult to guarantee
  4. Research Motivation: To overcome defects in theoretical analysis of existing methods, propose methods with milder requirements on initial Lipschitz constant estimates, and avoid dependence on the nonnegativity of σ\sigma through key equalities.

Core Contributions

  1. Proposes a New Extrapolation Term Scheme: Employs the extrapolation form yk=xk+k+α4k+α1(xkxk1)y_k = x_k + \frac{k+\alpha-4}{k+\alpha-1}(x_k - x_{k-1}), where α3\alpha \geq 3
  2. Establishes Mild Initial Conditions: Only requires the sequence of Lipschitz constant estimates to satisfy weaker initial conditions
  3. Derives Key Equality Properties: Avoids dependence on the nonnegativity of auxiliary functions, perfecting the theoretical analysis
  4. Proves Sublinear Convergence Rate: Achieves O(1/k2)O(1/k^2) convergence rate in the smooth case and O(1/k)O(1/k) in the nonsmooth case
  5. Extends to Nonsmooth Case: Handles completely nonsmooth multiobjective optimization problems through smoothing techniques

Method Details

Problem Formulation

Consider the composite unconstrained multiobjective optimization problem (MOP): minxRnF(x)=(f1(x)+g1(x),,fm(x)+gm(x))T\min_{x \in \mathbb{R}^n} F(x) = (f_1(x) + g_1(x), \ldots, f_m(x) + g_m(x))^T

where:

  • fi:RnRf_i: \mathbb{R}^n \to \mathbb{R} are continuously differentiable convex functions
  • gi:RnRg_i: \mathbb{R}^n \to \mathbb{R} are convex functions (possibly nonsmooth)
  • The objective is to find weakly Pareto optimal solutions

Model Architecture

Algorithm for Smooth Case (Algorithm 1)

Core Subproblem: minzRnϕL(f)(z;x,y)=maxi=1,,m[fi(y),zy+gi(z)+fi(y)Fi(x)]+L(f)2zy2\min_{z \in \mathbb{R}^n} \phi_{L(f)}(z; x, y) = \max_{i=1,\ldots,m}[\langle\nabla f_i(y), z-y\rangle + g_i(z) + f_i(y) - F_i(x)] + \frac{L(f)}{2}\|z-y\|^2

Algorithm Steps:

  1. Compute extrapolation point: yk=xk+k+α4k+α1(xkxk1)y_k = x_k + \frac{k+\alpha-4}{k+\alpha-1}(x_k - x_{k-1})
  2. Solve subproblem: xk+1=psk(xk,yk)x_{k+1} = p_{s_k}(x_k, y_k)
  3. Update parameters: sk+1=ηsks_{k+1} = \eta s_k, where η=(k+α2)2(k+α1)(k+α3)\eta = \frac{(k+\alpha-2)^2}{(k+\alpha-1)(k+\alpha-3)}

Parameter Conditions:

  • When α>3\alpha > 3: 0<α2α3s0<1L(f)0 < \frac{\alpha-2}{\alpha-3}s_0 < \frac{1}{L(f)}
  • When α=3\alpha = 3: 0<s0<1L(f)0 < s_0 < \frac{1}{L(f)}

Algorithm for Nonsmooth Case (Algorithm 2)

Approximates nonsmooth functions fi(x)f_i(x) through smoothing functions f~i(x,μ)\tilde{f}_i(x, \mu), where the smoothing function satisfies:

  • Continuous Differentiability: For fixed μ>0\mu > 0, f~(,μ)\tilde{f}(\cdot, \mu) is continuously differentiable
  • Consistency: limzx,μ0f~(z,μ)=f(x)\lim_{z \to x, \mu \downarrow 0} \tilde{f}(z, \mu) = f(x)
  • Gradient Consistency: {limzx,μ0f~(z,μ)}f(x)\{\lim_{z \to x, \mu \downarrow 0} \nabla\tilde{f}(z, \mu)\} \subseteq \partial f(x)

Technical Innovations

  1. Novel Extrapolation Coefficient Design: Through specific parameter update scheme η=(k+α2)2(k+α1)(k+α3)\eta = \frac{(k+\alpha-2)^2}{(k+\alpha-1)(k+\alpha-3)}, ensures sk<1L(f)s_k < \frac{1}{L(f)} holds consistently
  2. Key Equality Derivation: Through clever algebraic manipulation and parameter selection, avoids dependence on the nonnegativity of σk(z)\sigma_k(z)
  3. Unified Framework: Degenerates to existing methods when α=3\alpha = 3, but provides more complete theoretical analysis

Experimental Setup

Datasets

The paper mentions numerical experiments on three tri-objective optimization problems:

  • BK1&ℓ1 problem
  • JOS1&ℓ1 problem
  • SP1&ℓ1 problem

Evaluation Metrics

Uses merit function u0(x)=supzRnmini=1,,m[Fi(x)Fi(z)]u_0(x) = \sup_{z \in \mathbb{R}^n} \min_{i=1,\ldots,m}[F_i(x) - F_i(z)] to evaluate algorithm performance, which satisfies:

  • u0(x)0u_0(x) \geq 0 for all xx
  • xx is weakly Pareto optimal if and only if u0(x)=0u_0(x) = 0

Implementation Details

  • Stopping criterion: xkxk+1<ε\|x_k - x_{k+1}\| < \varepsilon
  • For nonsmooth case also requires μk<ε\mu_k < \varepsilon
  • Parameter updates: μk+1=k+α2k+α1μk\mu_{k+1} = \frac{k+\alpha-2}{k+\alpha-1}\mu_k, sk+1=k+α2k+α3sks_{k+1} = \frac{k+\alpha-2}{k+\alpha-3}s_k

Experimental Results

Main Results

The paper presents Pareto frontier plots for three tri-objective optimization problems, though specific numerical results and detailed performance comparison data are incomplete in the provided document.

Convergence Theory Results

Smooth Case (Theorem 4.3): u0(xk)L(f)(α1)22(k+α1)2Ru_0(x_k) \leq \frac{L(f)(\alpha-1)^2}{2(k+\alpha-1)^2}R Achieves O(1/k2)O(1/k^2) convergence rate.

Nonsmooth Case (Theorem 6.2): u0(xk+1)O(1k)u_0(x_{k+1}) \leq O\left(\frac{1}{k}\right) Achieves O(1/k)O(1/k) convergence rate.

  1. Multiobjective FISTA Extension: Tanabe et al. first extended FISTA to multiobjective optimization, achieving O(1/k2)O(1/k^2) convergence rate
  2. Monotone Variants: Nishimura et al. proposed monotone variants of multiobjective FISTA
  3. Generalized Framework: Tanabe et al. generalized the framework to single-objective cases through hyperparameter introduction
  4. Nesterov-type Schemes: Sonntag et al. and Zhang et al. attempted to use more effective extrapolation terms, but with incomplete theoretical analysis
  5. Nonsmooth Methods: Gebken et al. proposed subgradient descent algorithms for nonsmooth multiobjective optimization

Conclusions and Discussion

Main Conclusions

  1. Proposes an accelerated proximal gradient method with novel extrapolation term applicable to smooth and nonsmooth multiobjective optimization
  2. Establishes complete convergence theory, avoiding theoretical defects of existing methods
  3. Achieves O(1/k2)O(1/k^2) convergence rate in smooth case and O(1/k)O(1/k) in nonsmooth case

Limitations

  1. Insufficient Experimental Validation: Numerical experiment results are incomplete, lacking detailed performance comparisons
  2. Parameter Selection Constraints: Specific requirements on initial parameters s0s_0 and α\alpha
  3. Slower Convergence in Nonsmooth Case: Compared to the smooth case, the nonsmooth version has reduced convergence rate of O(1/k)O(1/k)

Future Directions

  1. Explore better smoothing techniques to improve convergence rate in nonsmooth case
  2. Investigate adaptive parameter selection strategies
  3. Extend to constrained multiobjective optimization problems

In-Depth Evaluation

Strengths

  1. Significant Theoretical Contribution: Resolves key defects in theoretical analysis of existing methods, providing complete convergence proofs
  2. Clever Method Design: Through specific parameter update strategies ensures theoretical guarantees of the algorithm
  3. Framework Unification: Incorporates both smooth and nonsmooth cases into a unified framework
  4. Mathematical Rigor: Detailed proofs with clear logical structure

Weaknesses

  1. Insufficient Experimental Verification: Numerical experiments are overly simplistic, lacking detailed comparisons with other advanced methods
  2. Lack of Practical Analysis: Missing in-depth analysis of computational complexity and practical application scenarios
  3. Parameter Sensitivity Not Discussed: No analysis of parameter selection impact on algorithm performance

Impact

  1. High Theoretical Value: Provides more complete theoretical foundation for accelerated methods in multiobjective optimization
  2. Practical Value Pending Verification: Requires more experimental validation of effectiveness in real-world problems
  3. Good Reproducibility: Clear algorithm description and complete theoretical analysis

Applicable Scenarios

  1. Composite structure multiobjective optimization problems
  2. Application domains such as image processing and compressed sensing
  3. Optimization scenarios requiring theoretical guarantees

References

The paper cites important literature in multiobjective optimization, including:

  • Pioneering work by Tanabe et al. on multiobjective FISTA
  • Related theory of Nesterov acceleration methods
  • Literature on smoothing techniques
  • Classical theoretical foundations of multiobjective optimization

Overall Assessment: This is a theoretically strong paper that successfully addresses theoretical defects in existing multiobjective accelerated proximal gradient methods, providing complete convergence analysis. However, the paper has room for improvement in experimental validation and practical analysis.