Revisit First-order Methods for Geodesically Convex Optimization
Shu, Jiang, Shi et al.
In a seminal work of Zhang and Sra, gradient descent methods for geodesically convex optimization were comprehensively studied. In particular, Zhang and Sra derived a comparison inequality that relates the iterative points in the optimization process. Since their seminal work, numerous follow-ups have studied different downstream usages of their comparison lemma.
In this work, we introduce the concept of quasilinearization to optimization, presenting a novel framework for analyzing geodesically convex optimization. By leveraging this technique, we establish state-of-the-art convergence rates -- for both deterministic and stochastic settings -- under weaker assumptions than previously required. The technique of quasilinearization may prove valuable for other non-Euclidean optimization problems.
academic
Revisit First-order Methods for Geodesically Convex Optimization
This paper revisits first-order methods for geodesically convex optimization. Zhang and Sra conducted a comprehensive study of gradient descent methods for geodesically convex optimization in their seminal work, particularly deriving comparison inequalities for iterative points in the optimization process. This paper introduces the concept of quasilinearization to the optimization domain and proposes a novel framework for analyzing geodesically convex optimization. By leveraging this technique, state-of-the-art convergence rates are established for both deterministic and stochastic settings under weaker assumptions than previously required. The quasilinearization technique may prove valuable for other non-Euclidean optimization problems.
Limitations of Existing Methods: The classical approach by Zhang and Sra relies on two strong assumptions:
(A1) Uniform lower bound on sectional curvature (CBB condition)
(A2) Prior upper bound on trajectory diameter
Practical Issues: Many important Hadamard manifolds do not satisfy the CBB condition, such as warped product manifolds whose curvature may tend to negative infinity.
Core Challenge: How to maintain state-of-the-art convergence rates while removing assumptions (A1) and (A2)?
Theoretical Breakthrough: Providing an affirmative answer to Question (Q), demonstrating that optimal convergence rates can be maintained under weaker assumptions
Theorem 1 (Deterministic Case): Let f be a geodesically convex function on Hadamard manifold M. The proximal gradient algorithm satisfies:
f(xt)−f(x∗)≤ηt∣x0x∗∣2
Theorem 2 (Stochastic Case): Under bounded variance assumptions, the stochastic proximal gradient algorithm with step size ηt=2Lt1 satisfies:
∑t=1Tαt1∑t=1Tαt(EF(xt)−F(x∗))≤2∑t=1Tαt∣x0x∗∣2+∑t=1Tαtσ2log(T+1)
The paper cites 61 related references, primarily including:
Berg & Nikolaev (2008): Original work on quasilinearization
Zhang & Sra (2016): Classical analysis of geodesically convex optimization
Bonnabel (2013): Stochastic gradient descent on Riemannian manifolds
Multiple recent works on Hadamard manifold optimization
Overall Assessment: This is a high-quality theoretical paper that successfully resolves an important open problem in geodesically convex optimization through the introduction of quasilinearization techniques. While lacking numerical experiments, its theoretical contributions are significant, providing new analytical frameworks and tools for non-Euclidean optimization.