Forward Euler for Wasserstein Gradient Flows: Breakdown and Regularization
Xu, Li
Wasserstein gradient flows have become a central tool for optimization problems over probability measures. A natural numerical approach is forward-Euler time discretization. We show, however, that even in the simple case where the energy functional is the Kullback-Leibler (KL) divergence against a smooth target density, forward-Euler can fail dramatically: the scheme does not converge to the gradient flow, despite the fact that the first variation $\nabla\frac{δF}{δÏ}$ remains formally well defined at every step. We identify the root cause as a loss of regularity induced by the discretization, and prove that a suitable regularization of the functional restores the necessary smoothness, making forward-Euler a viable solver that converges in discrete time to the global minimizer.
academic
Forward Euler for Wasserstein Gradient Flows: Breakdown and Regularization
Wasserstein gradient flows have become a fundamental tool for optimization problems on probability measures. Forward Euler time discretization is a natural numerical method. However, this paper proves that even in the simple case where the energy functional is the Kullback-Leibler (KL) divergence with respect to a smooth target density, the forward Euler method fails dramatically: the scheme does not converge to the gradient flow, despite the first variation ∇δρδF remaining formally well-defined at each step. The authors identify the root cause as regularity loss induced by discretization and prove that appropriate regularization of the functional can restore the necessary smoothness, making forward Euler a viable solver that converges to the global minimum in discrete time.
Optimization on Probability Measure Spaces: Minimizing functionals F[ρ] on probability measure spaces P(Ω) arises widely in machine learning and statistical physics
Wasserstein Gradient Flows: Analogous to gradient descent in Euclidean spaces, gradient flows under the Wasserstein metric provide a natural framework for probability measure optimization
Numerical Implementation Challenges: Numerical solution of gradient flow PDEs requires time discretization, with forward Euler being the most intuitive choice
Although forward Euler performs well for classical PDEs, does it remain effective for Wasserstein gradient flows? Particularly for fundamental functionals such as KL divergence.
Counterexample 1 (Non-injectivity): Target distribution ρ∗=e−U with U(x)=2x2+4x4, initial distribution as standard Gaussian. Non-injectivity of the pushforward map T(x)=x−hx3 leads to density discontinuities.
Counterexample 2 (Derivative Depletion): Piecewise initial distribution produces jump discontinuities after forward Euler step, with KL divergence bounded below by >0.019.
This paper cites 41 relevant references covering important works in optimal transport theory, Wasserstein gradient flows, numerical analysis and other fields, providing solid theoretical foundation for the research.
Technical Summary:
Central role of regularity in Wasserstein gradient flows
Structural limitations of forward Euler method
Effectiveness of Gaussian regularization
Convergence guarantees of projected gradient descent