2025-11-13T18:46:11.434221

Integration Matters for Learning PDEs with Backwards SDEs

Park, Tu

Backward stochastic differential equation (BSDE)-based deep learning methods provide an alternative to Physics-Informed Neural Networks (PINNs) for solving high-dimensional partial differential equations (PDEs), offering potential algorithmic advantages in settings such as stochastic optimal control, where the PDEs of interest are tied to an underlying dynamical system. However, standard BSDE-based solvers have empirically been shown to underperform relative to PINNs in the literature. In this paper, we identify the root cause of this performance gap as a discretization bias introduced by the standard Euler-Maruyama (EM) integration scheme applied to one-step self-consistency BSDE losses, which shifts the optimization landscape off target. We find that this bias cannot be satisfactorily addressed through finer step-sizes or multi-step self-consistency losses. To properly handle this issue, we propose a Stratonovich-based BSDE formulation, which we implement with stochastic Heun integration. We show that our proposed approach completely eliminates the bias issues faced by EM integration. Furthermore, our empirical results show that our Heun-based BSDE method consistently outperforms EM-based variants and achieves competitive results with PINNs across multiple high-dimensional benchmarks. Our findings highlight the critical role of integration schemes in BSDE-based PDE solvers, an algorithmic detail that has received little attention thus far in the literature.

academic

Integration Matters for Learning PDEs with Backwards SDEs

Basic Information

Paper ID: 2505.01078
Title: Integration Matters for Learning PDEs with Backwards SDEs
Authors: Sungje Park, Stephen Tu (University of Southern California)
Categories: cs.LG, cs.SY, eess.SY, math.OC, stat.ML
Published: first version May 5, 2025; revised November 13, 2025
Paper link: https://arxiv.org/abs/2505.01078

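Abstract

This paper studies deep learning methods based on backward stochastic differential equations (BSDEs) for solving high-dimensional partial differential equations (PDEs). Although BSDE methods offer algorithmic advantages in settings such as stochastic optimal control, they have empirically underperformed Physics-Informed Neural Networks (PINNs). The authors trace the gap to a root cause: the standard Euler-Maruyama (EM) integration scheme introduces a discretization bias into one-step self-consistency BSDE losses, and this bias cannot be satisfactorily addressed by finer step sizes or by multi-step self-consistency losses. To fix this, they propose a Stratonovich-based BSDE formulation implemented with stochastic Heun integration, which eliminates the bias of EM integration. Experiments show that the Heun-BSDE method consistently outperforms EM-based variants and is competitive with PINNs on several high-dimensional benchmarks.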

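Background and Motivation

Problem Definition

Numerically solving partial differential equations (PDEs) is fundamental to scientific and engineering modeling, but traditional numerical methods suffer from the curse of dimensionality and become computationally infeasible for high-dimensional PDEs. In recent years, deep learning has offered two main alternatives:

Physics-Informed Neural Networks (PINNs): directly minimize the PDE residual at randomly sampled collocation points
BSDE methods: recast the PDE as a forward-backward stochastic differential equation and, by simulating the stochastic process, minimize the discrepancy between the prediction and the terminal condition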

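Research Motivation

BSDE methods have advantages in the following settings:

High-dimensional problems with an underlying dynamical system (e.g., stochastic optimal control)
Problems where the PDE is not available explicitly but can be accessed through simulation (model-free optimal control)

However, prior work (e.g., Nüsken & Richter 2023) found that BSDE methods perform significantly worse than PINNs on benchmarks. That work proposes an interpolating loss to mitigate the problem, but the approach has two key shortcomings:

It does not explain the root cause of the performance gap
It introduces a hyperparameter that must be tuned (the horizon length), which adds training complexity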

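Core Insight

The paper identifies the choice of stochastic integration scheme as the key source of the performance gap. In one-step BSDE losses, the standard EM scheme introduces a discretization bias of the same order as the PDE residual term, so the bias does not vanish as the step size shrinks.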

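Core Contributions

Theoretical analysis: the first systematic analysis of the discretization bias of the EM and Heun stochastic integration schemes when applied to one-step self-consistency BSDE losses
- Proves that the EM scheme introduces a non-vanishing bias term of the same order as the PDE residual (Theorem 4.2)
- Proves that the Heun scheme eliminates this bias (Theorem 4.4)
Method: a Stratonovich-BSDE formulation combined with stochastic Heun integration
- Interprets the forward and backward SDEs as Stratonovich SDEs (rather than Itô SDEs)
- Integrates numerically with the stochastic Heun method, removing the bias of the one-step loss
Multi-step loss analysis: an in-depth analysis of the trade-offs of multi-step self-consistency losses (Section 5)
- Characterizes how the performance of EM varies with the horizon length k
- Proves that Heun remains consistent in both the one-step and multi-step cases
Empirical validation: experiments on several high-dimensional benchmarks (HJB, BSB, and BZ equations, in up to 100 dimensions)
- Heun-BSDE consistently outperforms EM-BSDE
- Heun-BSDE is competitive with PINNs, restoring performance parity
Implementation: an efficient batched subsampling algorithm that significantly reduces computational overhead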

Method Details

Task Definition

Consider the following nonlinear boundary value PDE:

$R[u](x,t) := \partial_t u(x,t) + \frac{1}{2}\text{tr}(H(x,t)\cdot\nabla^2 u(x,t)) + \langle f(x,t), \nabla u(x,t)\rangle - h[u](x,t) = 0$

where:

$x \in \Omega \subseteq \mathbb{R}^d$, $t \in [0,T]$
Terminal (boundary) condition: $u(x,T) = \phi(x)$
$H(x,t) = g(x,t)g(x,t)^T$ is a positive definite matrix

Review of Standard Methods

PINNs: $L_{\text{PINNs}}(\theta) = \mathbb{E}_{(x,t)\sim\mu}[(R[u_\theta](x,t))^2]$
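As a rough illustration of the PINN objective, the sketch below estimates the expected squared residual by Monte Carlo for a toy 1D instance. The toy PDE ($f = 0$, $h = 0$, constant $g$), the two candidate functions, the uniform sampling distribution, and the finite-difference derivatives are all assumptions made for this example; an actual PINN would differentiate a neural network $u_\theta$ with automatic differentiation.

```python
import random

G = 0.7   # assumed constant diffusion coefficient g, so H = G**2
T = 1.0   # assumed terminal time

def residual(u, x, t, eps=1e-3):
    """R[u](x,t) = u_t + 0.5*H*u_xx for the toy case f = 0, h = 0 (central differences)."""
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
    u_xx = (u(x + eps, t) - 2 * u(x, t) + u(x - eps, t)) / eps**2
    return u_t + 0.5 * G**2 * u_xx

def pinn_loss(u, n_points=2000, seed=0):
    """Monte Carlo estimate of E[(R[u](x,t))^2] with x ~ U(-1,1), t ~ U(0,T)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_points):
        x, t = rng.uniform(-1.0, 1.0), rng.uniform(0.0, T)
        total += residual(u, x, t) ** 2
    return total / n_points

# u_exact solves u_t + 0.5*G^2*u_xx = 0 with terminal condition u(x,T) = x^2.
u_exact = lambda x, t: x**2 + G**2 * (T - t)
# u_wrong matches the terminal condition but ignores the time dependence.
u_wrong = lambda x, t: x**2

loss_exact = pinn_loss(u_exact)   # close to zero, up to finite-difference rounding
loss_wrong = pinn_loss(u_wrong)   # residual is G^2 everywhere, so the loss is about G^4
```

The loss separates the two candidates even though both satisfy the terminal condition, which is the property the PINN objective relies on.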

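BSDE methods: based on the forward SDE $dX_t = f(X_t,t)dt + g(X_t,t)dB_t$ and the backward SDE $dY_t = h(X_t,t,Y_t,Z_t)dt + Z_t^T g(X_t,t)dB_t$

H-horizon self-consistency BSDE loss: $L_{\text{BSDE},H}(\theta) := \mathbb{E}_{x_0,B_t}\left[\frac{1}{NH^2}\sum_{n=0}^{N-1}\left(u_\theta(X_{t_{n+1}},t_{n+1}) - u_\theta(X_{t_n},t_n) - S_\theta(t_n,t_{n+1})\right)^2\right]$

Here the subscript $H$ denotes the horizon length and is distinct from the matrix $H(x,t)$, and $S_\theta(t_n,t_{n+1})$ is the increment of the backward SDE over $[t_n,t_{n+1}]$ computed with $u_\theta$.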

Analysis of Euler-Maruyama Integration

EM discretization, with step size $\tau$ and standard Gaussian noise $w_n$:
$\hat{X}_{n+1} = \hat{X}_n + \tau f(\hat{X}_n,t_n) + \sqrt{\tau}g(\hat{X}_n,t_n)w_n$
$\hat{Y}^\theta_{n+1} = \hat{Y}^\theta_n + \tau h_\theta(\hat{X}_n,t_n) + \sqrt{\tau}\nabla u_\theta(\hat{X}_n,t_n)^T g(\hat{X}_n,t_n)w_n$

Theorem 4.1 (pointwise EM loss): for a fixed point $(x,t)$, the pointwise EM loss satisfies: $\tau^{-2}\cdot\ell_{\text{EM},\tau}(\theta,x,t) = (R[u_\theta](x,t))^2 + \frac{1}{2}\text{tr}[(H(x,t)\cdot\nabla^2 u_\theta(x,t))^2] + O(\tau^{1/2})$

Theorem 4.2 (full EM-BSDE loss): $L_{\text{EM},\tau}(\theta) = \frac{1}{T}\int_0^T \mathbb{E}\left[(R[u_\theta](X_t,t))^2 + \frac{1}{2}\text{tr}[(H(X_t,t)\cdot\nabla^2 u_\theta(X_t,t))^2]\right]dt + O(\tau^{1/2})$

Key insight: the bias term $\frac{1}{2}\text{tr}[(H\cdot\nabla^2 u_\theta)^2]$ is of the same order as the PDE residual term and cannot be removed by decreasing the step size $\tau$.
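The pointwise statement of Theorem 4.1 can be checked numerically in a toy setting. The sketch below is an illustration under assumed choices (1D, $u(x) = x^2$, constant $f$, $g$, and $h$, and the specific numbers used), not the paper's experiment. It estimates $\tau^{-2}$ times the one-step EM loss by Monte Carlo and compares it with the squared residual plus the bias term.

```python
import math
import random

def em_one_step_loss(x, f, g, h, tau, n_samples=200_000, seed=0):
    """Estimate tau^-2 * E_w[(u(X_next) - Y_next)^2] for u(x) = x^2 under one EM step."""
    rng = random.Random(seed)
    u = lambda y: y * y          # assumed candidate solution (time-independent)
    du = lambda y: 2.0 * y       # its gradient
    s = math.sqrt(tau)
    total = 0.0
    for _ in range(n_samples):
        w = rng.gauss(0.0, 1.0)
        x_next = x + tau * f + s * g * w              # EM step for X
        y_next = u(x) + tau * h + s * du(x) * g * w   # EM step for Y, started at u(x)
        total += (u(x_next) - y_next) ** 2
    return total / (n_samples * tau**2)

x, f, g, h, tau = 1.0, 0.5, 1.0, 1.5, 1e-4
# R[u] = u_t + 0.5*H*u_xx + f*u_x - h with u_t = 0, u_xx = 2, u_x = 2x, H = g^2.
residual = 0.0 + 0.5 * g**2 * 2.0 + f * 2.0 * x - h
bias = 0.5 * (g**2 * 2.0) ** 2            # 0.5 * tr[(H * u_xx)^2] in 1D
predicted = residual**2 + bias            # 0.25 + 2.0
estimate = em_one_step_loss(x, f, g, h, tau)
```

In this example the bias (2.0) is eight times larger than the squared residual (0.25), and rerunning with a smaller `tau` leaves the estimate essentially unchanged, consistent with the claim that the bias does not shrink with the step size.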

Stratonovich-BSDE and Heun Integration

Stratonovich forward SDE: $dX_t^\circ = f(X_t^\circ,t)dt + g(X_t^\circ,t)\circ dB_t$

Modified backward SDE: by the Stratonovich chain rule, $du(X_t^\circ,t) = h^\circ[u](X_t^\circ,t)dt + \nabla u(X_t^\circ,t)^T g(X_t^\circ,t)\circ dB_t$, where $h^\circ[u](x,t) := h[u](x,t) - \frac{1}{2}\text{tr}(H(x,t)\nabla^2 u(x,t))$
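The correction term in $h^\circ$ can be recovered in a few lines. The derivation below is a sketch that assumes $u$ is smooth enough for both chain rules to apply and uses only the definitions of $R[u]$, $H$, and $h^\circ$ given in this note.

Itô's formula along the Itô forward SDE gives
$du(X_t,t) = \left(\partial_t u + \langle f, \nabla u\rangle + \frac{1}{2}\text{tr}(H\nabla^2 u)\right)dt + \nabla u^T g\, dB_t$
so when $R[u] = 0$ the drift equals $h[u]$, which is the Itô backward SDE.

The Stratonovich chain rule has no second-order term:
$du(X_t^\circ,t) = \left(\partial_t u + \langle f, \nabla u\rangle\right)dt + \nabla u^T g\circ dB_t$

Rearranging $R[u] = 0$ yields
$\partial_t u + \langle f, \nabla u\rangle = h[u] - \frac{1}{2}\text{tr}(H\nabla^2 u) = h^\circ[u]$
so the Stratonovich drift is exactly $h^\circ[u]$: the Hessian term moves from the chain rule into the drift.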

Stochastic Heun discretization:
Predictor: $\bar{Z}^\theta_{n+1} = \hat{Z}^\theta_n + \tau F_\theta(\hat{Z}^\theta_n,t_n) + \sqrt{\tau}G_\theta(\hat{Z}^\theta_n,t_n)w_n$
Corrector: $\hat{Z}^\theta_{n+1} = \hat{Z}^\theta_n + \frac{\tau}{2}(F_\theta(\hat{Z}^\theta_n,t_n) + F_\theta(\bar{Z}^\theta_{n+1},t_{n+1})) + \frac{\sqrt{\tau}}{2}(G_\theta(\hat{Z}^\theta_n,t_n) + G_\theta(\bar{Z}^\theta_{n+1},t_{n+1}))w_n$

where $Z^\theta_t = (X_t, Y_t^\theta)$ is the augmented process and $F_\theta$, $G_\theta$ are its drift and diffusion coefficients.

Theorem 4.3 (pointwise Heun loss): $\tau^{-2}\cdot\ell_{\text{Heun},\tau}(\theta,x,t) = (R[u_\theta](x,t))^2 + O(\tau^{1/2})$

Theorem 4.4 (full Heun-BSDE loss): $L_{\text{Heun},\tau}(\theta) = \frac{1}{T}\int_0^T \mathbb{E}(R[u_\theta](X_t^\circ,t))^2 dt + O(\tau^{1/2})$

Key result: the Heun scheme removes the bias term present under EM, so the leading term of the one-step loss is just the squared PDE residual.
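A toy check of the pointwise Heun statement (Theorem 4.3) is sketched below. Everything specific here is an assumption made for illustration: a 1D problem, $u(x,t) = x^2$, constant $f$, $g$, and $h$, and a drift for $Y$ that does not depend on $Y$ itself. The helper `heun_step` applies the predictor-corrector update to the augmented state $(x, y)$ with the Stratonovich drift $h^\circ = h - \frac{1}{2}H\,\partial_{xx}u$.

```python
import math
import random

def heun_step(x, y, t, tau, w, f, g, h_strat, du):
    """One stochastic Heun step for the augmented state (x, y) in 1D."""
    s = math.sqrt(tau)
    # Drift and diffusion of the augmented process at the current point.
    f0, g0 = f(x, t), g(x, t)
    fy0, gy0 = h_strat(x, t), du(x, t) * g0
    # Predictor: a plain Euler step for x (y does not enter the coefficients here).
    x_bar = x + tau * f0 + s * g0 * w
    f1, g1 = f(x_bar, t + tau), g(x_bar, t + tau)
    fy1, gy1 = h_strat(x_bar, t + tau), du(x_bar, t + tau) * g1
    # Corrector: average the coefficients at the start point and the predicted point.
    x_new = x + 0.5 * tau * (f0 + f1) + 0.5 * s * (g0 + g1) * w
    y_new = y + 0.5 * tau * (fy0 + fy1) + 0.5 * s * (gy0 + gy1) * w
    return x_new, y_new

def heun_one_step_loss(x, t, tau, f, g, h_strat, u, du, n_samples=20_000, seed=0):
    """Estimate tau^-2 * E_w[(u(X_new) - Y_new)^2] with Y started at u(x, t)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = rng.gauss(0.0, 1.0)
        x_new, y_new = heun_step(x, u(x, t), t, tau, w, f, g, h_strat, du)
        total += (u(x_new, t + tau) - y_new) ** 2
    return total / (n_samples * tau**2)

F_CONST, G_CONST, H_CONST = 0.5, 1.0, 1.5   # assumed constant f, g, h
u = lambda x, t: x * x
du = lambda x, t: 2.0 * x
f = lambda x, t: F_CONST
g = lambda x, t: G_CONST
# Stratonovich drift: h minus 0.5 * H * u_xx, with u_xx = 2 and H = g^2.
h_strat = lambda x, t: H_CONST - 0.5 * G_CONST**2 * 2.0

# R[u] at x = 1: u_t + 0.5*H*u_xx + f*u_x - h = 0 + 1 + 1 - 1.5
residual = 0.5 * G_CONST**2 * 2.0 + F_CONST * 2.0 * 1.0 - H_CONST
estimate = heun_one_step_loss(1.0, 0.0, 1e-4, f, g, h_strat, u, du)
```

With these assumed values the estimate lands near the squared residual alone; an EM step on the same toy problem would add a bias of $\frac{1}{2}(H\,\partial_{xx}u)^2 = 2$ on top of it.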

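Technical Innovations

Problem diagnosis: identifies, for the first time, that the BSDE performance gap stems from the integration scheme rather than from the design of the loss function
Theory: provides rigorous proofs that quantify the discretization bias of the EM and Heun schemes
Method design: uses the Stratonovich interpretation to remove the Hessian-dependent bias term
Practicality: although Heun is more expensive per step, batching and subsampling keep training efficient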

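Multi-Step Loss Analysis (Section 5)

Trade-offs for EM

For the k-step loss ($1 < k \leq N$):

Proposition E.3: at the SDE level, $L_{\text{BSDE},T}(\theta) \leq L_{\text{BSDE},\tau}(\theta) + O(\tau^{1/2})$

Proposition E.4: the full-horizon EM loss satisfies $L_{\text{EM}}^N(\theta) = L_{\text{BSDE},T}(\theta) + O(\tau^{1/2})$

Proposition E.5: the one-step EM loss satisfies $L_{\text{EM},\tau}(\theta) = L_{\text{BSDE},\tau}(\theta) + \text{Bias}(\theta) + O(\tau^{1/2})$

Key insights:

The full-horizon loss $L_{\text{EM}}^N$ removes the bias, but the SDE loss it approximates, $L_{\text{BSDE},T}$, is dominated by the stronger loss $L_{\text{BSDE},\tau}$
The one-step loss $L_{\text{EM},\tau}$ approximates the stronger loss but introduces a bias that cannot be eliminated
Intermediate multi-step losses try to balance this trade-off, which is the essence of the interpolating-loss approach

Consistency of Heun

Propositions E.8-E.10: for the Heun scheme, $L_{\text{Heun}}^N(\theta) \leq L_{\text{Heun},\tau}(\theta) + O(\tau^{1/2})$

Key conclusion: in the Heun setting, the one-step and full-horizon losses keep the same relationship at both the SDE level and the discretization level, which removes the need to choose the horizon k.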

Experimental Setup

Datasets and PDE Benchmarks

1. Hamilton-Jacobi-Bellman (HJB) equation (100 dimensions): $\partial_t u = -\text{Tr}[\nabla^2 u] + \|\nabla u\|^2$ Terminal condition: $u(x,T) = \ln(0.5(1+\|x\|^2))$

2. Black-Scholes-Barenblatt (BSB) equation (100 dimensions): $\partial_t u = -\frac{1}{2}\text{Tr}[\sigma^2\text{diag}(x^2)\nabla^2 u] + r(u - \nabla u^T x)$ Terminal condition: $u(x,T) = \|x\|^2$

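3. Bender & Zhang (BZ) fully coupled FBSDE (10 and 100 dimensions): the forward process depends on the backward process, which tests a more complex coupled setting

4. Pendulum swing-up optimal control problem: demonstrates the method on a nonlinear control problem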
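Reference values for the first two benchmarks can be obtained without a solver. The sketch below is an illustration, not the paper's evaluation code. For BSB, substituting $u(x,t) = e^{(r+\sigma^2)(T-t)}\|x\|^2$ into the equation above shows that it is a solution, and the code checks this by finite differences. For HJB, the transform $v = e^{-u}$ turns the equation into a backward heat equation, which gives $u(x,t) = -\ln \mathbb{E}[e^{-\phi(x + \sqrt{2(T-t)}\,\xi)}]$ with $\xi$ standard Gaussian and $\phi$ the terminal condition. The values $r = 0.05$, $\sigma = 0.4$, $T = 1$, the low dimension, and the sample count are assumptions chosen for the example.

```python
import math
import random

def bsb_exact(x, t, r=0.05, sigma=0.4, T=1.0):
    """Closed-form BSB solution u = exp((r + sigma^2)(T - t)) * ||x||^2."""
    return math.exp((r + sigma**2) * (T - t)) * sum(xi * xi for xi in x)

def bsb_residual(u, x, t, r=0.05, sigma=0.4, eps=1e-4):
    """u_t + 0.5*sigma^2*sum_i x_i^2 u_ii - r*(u - grad(u).x), by central differences."""
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
    u0 = u(x, t)
    diffusion, grad_dot_x = 0.0, 0.0
    for i in range(len(x)):
        xp = x[:i] + [x[i] + eps] + x[i + 1:]
        xm = x[:i] + [x[i] - eps] + x[i + 1:]
        up, um = u(xp, t), u(xm, t)
        grad_dot_x += (up - um) / (2 * eps) * x[i]
        diffusion += x[i] ** 2 * (up - 2 * u0 + um) / eps**2
    return u_t + 0.5 * sigma**2 * diffusion - r * (u0 - grad_dot_x)

def hjb_reference(x, t, T=1.0, n_samples=10_000, seed=0):
    """Monte Carlo estimate of u(x,t) = -ln E[exp(-phi(x + sqrt(2(T - t)) * xi))]."""
    rng = random.Random(seed)
    phi = lambda z: math.log(0.5 * (1.0 + sum(zi * zi for zi in z)))
    scale = math.sqrt(2.0 * (T - t))
    total = 0.0
    for _ in range(n_samples):
        z = [xi + scale * rng.gauss(0.0, 1.0) for xi in x]
        total += math.exp(-phi(z))
    return -math.log(total / n_samples)

x = [1.0, 0.5, -0.3]                        # a low-dimensional test point
bsb_res = bsb_residual(bsb_exact, x, 0.3)   # should be close to zero
hjb_at_T = hjb_reference(x, 1.0)            # reduces to the terminal condition at t = T
hjb_at_0 = hjb_reference(x, 0.0)            # Monte Carlo reference value at t = 0
```

The same functions accept 100-dimensional points; the dimension is kept small here only to keep the example fast.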

Evaluation Metric

Relative L2 error (RL2): $\text{RL2} := \sqrt{\frac{\sum_{i=0}^N (u_{\text{ref}}(X_{t_i},t_i) - u_{\text{pred}}(X_{t_i},t_i))^2}{\sum_{i=0}^N u_{\text{ref}}^2(X_{t_i},t_i)}}$
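The metric is straightforward to compute once reference and predicted values along a trajectory are available. The sketch below assumes both are given as equal-length sequences of floats; the function name `relative_l2` and the sample values are hypothetical.

```python
import math

def relative_l2(u_ref, u_pred):
    """Relative L2 error between reference and predicted values along a trajectory."""
    if len(u_ref) != len(u_pred):
        raise ValueError("u_ref and u_pred must have the same length")
    num = sum((r - p) ** 2 for r, p in zip(u_ref, u_pred))
    den = sum(r * r for r in u_ref)
    return math.sqrt(num / den)

err_perfect = relative_l2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # identical values
err_zero_pred = relative_l2([3.0, 4.0], [0.0, 0.0])           # predicting zero everywhere
```

A perfect prediction gives an error of 0, and predicting zero everywhere gives an error of 1, which makes the metric comparable across benchmarks whose solutions differ in scale.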