
On Minimum-Dispersion Control of Nonlinear Diffusion Processes

Chertovskih, Pogodaev, Staritsyn et al.

This work collects some methodological insights for numerical solution of a "minimum-dispersion" control problem for nonlinear stochastic differential equations, a particular relaxation of the covariance steering task. The main ingredient of our approach is the theoretical foundation called $\infty$-order variational analysis. This framework consists in establishing an exact representation of the increment ($\infty$-order variation) of the objective functional using the duality, implied by the transformation of the nonlinear stochastic control problem to a linear deterministic control of the Fokker-Planck equation. The resulting formula for the cost increment analytically represents a "law-feedback" control for the diffusion process. This control mechanism enables us to learn time-dependent coefficients for a predefined Markovian control structure using Monte Carlo simulations with a modest population of samples. Numerical experiments prove the vitality of our approach.

Basic Information

Paper ID: 2405.07676
Title: On Minimum-Dispersion Control of Nonlinear Diffusion Processes
Authors: Roman Chertovskih, Nikolay Pogodaev, Maxim Staritsyn, A. Pedro Aguiar
Category: math.OC (Optimization and Control)
Published: May 13, 2024
Paper link: https://arxiv.org/abs/2405.07676

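Summary

This study offers methodological insights into the numerical solution of the "minimum-dispersion" control problem for nonlinear stochastic differential equations, a particular relaxation of the covariance steering task. The approach rests on ∞-order variational analysis: by transforming the nonlinear stochastic control problem into a linear deterministic control problem for the Fokker-Planck equation, it establishes an exact representation of the increment of the objective functional. The resulting cost-increment formula gives an analytic expression for a "law-feedback" control of the diffusion process. This control mechanism makes it possible to learn the time-varying coefficients of a predefined Markovian control structure through Monte Carlo simulation with a small number of samples. Numerical experiments demonstrate the effectiveness of the method.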

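Research Background and Motivation

Core Problem

This study addresses a nonlinear extension of the Covariance Steering Problem (CSP). Given an initial Gaussian probability distribution, the CSP asks to steer the state of a stochastic process to a terminal state with a predefined mean and covariance matrix.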

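Importance of the Problem

Practical value: for example, landing an aircraft safely in a noisy environment requires completing the task within a designated "safe region" with reasonable probability
Theoretical significance: the CSP can be viewed as a stochastic optimal control problem under mass-transport constraints
Technical challenge: nonlinear dynamics destroy the Gaussian structure, so second-order statistics no longer suffice to characterize the shape of the probability distribution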

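Limitations of Existing Methods

Linear case: the CSP has a closed-form solution under a Gaussian initial distribution, linear dynamics, and a linear-quadratic cost function, obtained via Riccati equations
Nonlinear treatment: existing nonlinear methods mostly linearize the state dynamics and therefore still rely on reasoning from the linear case
Higher-order statistics: the nonlinear case requires higher-order moments, which existing methods handle only to a limited extent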

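Research Motivation

The paper proposes "minimum-dispersion control" as a relaxation of the CSP: the mean of the stochastic population is steered to a predefined target while a suitable higher-order statistical measure of dispersion around the mean is taken into account.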

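Core Contributions

∞-order variational analysis framework: a duality-based theory giving an exact representation of the increment of the objective functional
Law-feedback control mechanism: a descent control structure in analytic form, derived from the duality for the Fokker-Planck equation
Numerical algorithm: a practical numerical scheme combining Monte Carlo methods with the Krasovskii-Subbotin sampling scheme
Mitigating the curse of dimensionality: the probabilistic framework handles high-dimensional problems and avoids the computational complexity of traditional numerical PDE methods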

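Method Details

Task Definition

Consider a standard stochastic optimal control problem in Mayer form: $\min_{u \in U} I[u] = E[\ell(X_T[u])]$

where $X[u]$ is the strong solution of the nonlinear stochastic differential equation: $X_t = x_0 + \int_0^t f_s(X_s, u_s)ds + \int_0^t \sigma_s(X_s, u_s)dW_s$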
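A minimal sketch of how the Mayer cost $I[u]$ could be estimated by Monte Carlo with an Euler-Maruyama discretization of the SDE above. It is restricted to the scalar case with an open-loop control, and the function name `estimate_mayer_cost`, the drift $-x^3 + u$, the noise level 0.5, and the quadratic terminal cost are illustrative assumptions, not taken from the paper.

```python
import math
import random


def estimate_mayer_cost(f, sigma, u, ell, x0, T=1.0, n_steps=100, n_samples=2000, seed=0):
    """Monte Carlo estimate of I[u] = E[ell(X_T[u])] for a scalar SDE.

    f(t, x, v) and sigma(t, x, v) are the drift and diffusion coefficients,
    u(t) is an open-loop control, ell(x) is the terminal cost.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_samples):
        x = x0
        for k in range(n_steps):
            t = k * dt
            v = u(t)
            dw = rng.gauss(0.0, sqrt_dt)  # Brownian increment ~ N(0, dt)
            x = x + f(t, x, v) * dt + sigma(t, x, v) * dw  # Euler-Maruyama step
        total += ell(x)
    return total / n_samples


# Hypothetical example (not from the paper): dX = (-X^3 + u) dt + 0.5 dW.
cost = estimate_mayer_cost(
    f=lambda t, x, v: -x**3 + v,
    sigma=lambda t, x, v: 0.5,
    u=lambda t: 0.0,
    ell=lambda x: x**2,
    x0=1.0,
)
print(f"estimated I[u] = {cost:.3f}")
```

The estimate carries both a time-discretization bias and a sampling error of order $1/\sqrt{N}$ in the number of samples, so costs of different controls should only be compared with that noise level in mind.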

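Core Theoretical Framework

Transformation to Fokker-Planck Control

The nonlinear stochastic control problem is transformed into an equivalent state-linear deterministic optimization problem: $(RP) \quad \min_{u \in U} J[u] = \int_{\mathbb{R}^n} \ell d\mu_T[u]$ subject to $\partial_t \mu = L_t^*(u_t)\mu$, where $L_t^*(\upsilon)$ is the formal adjoint of the elliptic operator $L_t(\upsilon)$.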
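For orientation, in the standard Itô setting with drift $f$ and diffusion coefficient $\sigma$ as in the SDE above, the generator and its formal adjoint usually take the following form. The shorthand $a_t = \sigma_t \sigma_t^{\top}$ and the test function $\varphi$ are introduced here for illustration and may differ from the paper's notation.

$$
\begin{aligned}
L_t(\upsilon)\varphi &= f_t(x,\upsilon)\cdot\nabla\varphi + \tfrac{1}{2}\operatorname{tr}\big(a_t(x,\upsilon)\,\nabla^2\varphi\big), \\
L_t^*(\upsilon)\mu &= -\nabla\cdot\big(f_t(\cdot,\upsilon)\,\mu\big) + \tfrac{1}{2}\sum_{i,j}\partial_{x_i}\partial_{x_j}\big(a_t^{ij}(\cdot,\upsilon)\,\mu\big).
\end{aligned}
$$

The equation $\partial_t \mu = L_t^*(u_t)\mu$ is linear in $\mu$ for each fixed control, which is what makes the reformulated problem state-linear even though the underlying SDE is nonlinear in $X$.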

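∞-Order Variational Analysis

Duality yields an exact representation of the increment of the cost functional. Let $\bar{u}, u \in U$ be the reference and target controls, respectively. Then: $\Delta J = J[u] - J[\bar{u}] = \int_I \int_{\mathbb{R}^n} (\bar{H}_s(x, u_s) - \bar{H}_s(x, \bar{u}_s)) d\mu_s[u](x) ds$

where $I = [0, T]$ is the time horizon, $\mu_s[u]$ is the law generated by the target control, and $\bar{H}_s(x, \upsilon) = H_s(x, \nabla_x \bar{p}_s(x), \upsilon)$ is the contracted form of the Hamilton-Pontryagin function, evaluated along the adjoint state $\bar{p} = p[\bar{u}]$ associated with the reference control.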
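A short derivation sketch may help explain why this representation is exact rather than a first-order expansion. It rests on two assumptions that are not spelled out in this summary: the control enters only through the drift, so that $H_s(x, p, \upsilon) = p \cdot f_s(x, \upsilon)$, and $\bar{p}$ solves the backward equation dual to the Fokker-Planck equation under $\bar{u}$.

$$
\begin{aligned}
&\partial_s \bar{p}_s + L_s(\bar{u}_s)\bar{p}_s = 0, \qquad \bar{p}_T = \ell, \qquad\text{so that}\qquad J[\bar{u}] = \int_{\mathbb{R}^n} \bar{p}_0 \, d\mu_0, \\
&J[u] - J[\bar{u}] = \int_{\mathbb{R}^n} \bar{p}_T \, d\mu_T[u] - \int_{\mathbb{R}^n} \bar{p}_0 \, d\mu_0
 = \int_0^T\!\!\int_{\mathbb{R}^n} \big(\partial_s \bar{p}_s + L_s(u_s)\bar{p}_s\big)\, d\mu_s[u]\, ds \\
&\phantom{J[u] - J[\bar{u}]} = \int_0^T\!\!\int_{\mathbb{R}^n} \big(L_s(u_s) - L_s(\bar{u}_s)\big)\bar{p}_s \, d\mu_s[u]\, ds
 = \int_0^T\!\!\int_{\mathbb{R}^n} \nabla_x \bar{p}_s \cdot \big(f_s(x,u_s) - f_s(x,\bar{u}_s)\big)\, d\mu_s[u]\, ds.
\end{aligned}
$$

No remainder term appears because the adjoint is computed along the reference control while the measure is transported by the target control; the linearity of the Fokker-Planck equation in $\mu$ does the rest.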

Law-Feedback Control Design

Define the descent control: $\bar{v}_t[\mu] \in \arg\min_{\upsilon \in U} \int_{\mathbb{R}^n} \bar{H}_t(x, \upsilon) d\mu(x)$

This acts as a feedback control for the PDE and produces the nonlocal equation: $\partial_t \mu = L_t^*(\bar{v}_t[\mu])\mu$
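On a particle approximation of $\mu$, the integral in the descent control becomes a sample average, and the argmin can be approximated over a finite grid of control values. The sketch below illustrates this under those assumptions; the helper name `law_feedback`, the candidate grid, and the example Hamiltonian $2xv$ are hypothetical and not taken from the paper.

```python
def law_feedback(particles, reduced_H, candidates):
    """Return the candidate control value minimizing the sample mean of reduced_H.

    particles  : list of states sampled from the current law mu
    reduced_H  : callable (x, v) -> float, standing in for the contracted Hamiltonian
    candidates : finite list of admissible control values (a discretized control set)
    """
    def mean_H(v):
        # Empirical counterpart of the integral of H(x, v) against mu.
        return sum(reduced_H(x, v) for x in particles) / len(particles)

    return min(candidates, key=mean_H)


# Hypothetical reduced Hamiltonian: gradient of x^2 times a drift equal to v.
particles = [0.8, 1.1, 0.9, 1.2]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
v_star = law_feedback(particles, lambda x, v: 2.0 * x * v, grid)
print(v_star)  # -1.0
```

The control depends on the particles only through an average, so it reacts to the law of the process rather than to any individual trajectory; this is the sense in which the feedback is nonlocal.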

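Numerical Algorithm

Algorithm 1: Descent method

Input: initial guess ū ∈ U, tolerance ε > 0
Output: a sequence {u_k} such that I[u_{k+1}] < I[u_k]

1. Initialize: k ← 0, u_0 ← ū
2. Repeat:
   - Compute p_k ← p[u_k]
   - Obtain v_k[μ] from the minimization problem defining the descent control (problem (9) in the paper)
   - Update μ_{k+1} ← μ̂[v_k], u_{k+1} ← v_k[μ_{k+1}]
   - k ← k + 1
3. Until |I[u_{k-1}] - I[u_k]| < ε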
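The loop can be illustrated on a toy problem where the adjoint is available in closed form, which sidesteps the step "compute p_k" that in general requires solving a backward equation or approximating it. The sketch assumes the scalar model $dX = u\,dt + \sigma\,dW$ with control values in $[-1, 1]$ and terminal cost $\ell(x) = x^2$, for which $\nabla_x \bar{p}_t(x) = 2(x + \int_t^T \bar{u}_s ds)$. The model, the function names, and all parameter values are illustrative assumptions and not the paper's experiments.

```python
import math
import random


def tail_integrals(u, dt):
    """B[k] = integral of the piecewise-constant control u over [t_k, T]."""
    B = [0.0] * (len(u) + 1)
    for k in range(len(u) - 1, -1, -1):
        B[k] = B[k + 1] + u[k] * dt
    return B


def feedback_rollout(u_ref, x0, sigma, T, n_samples, rng):
    """Simulate particles under the law-feedback built from the reference control.

    Returns the realized open-loop control and the Monte Carlo terminal cost.
    """
    n_steps = len(u_ref)
    dt = T / n_steps
    B = tail_integrals(u_ref, dt)
    xs = [x0] * n_samples
    u_new = []
    for k in range(n_steps):
        m = sum(xs) / n_samples
        # Sample mean of grad p is 2 * (m + B[k]); minimize slope * v over v in [-1, 1].
        slope = 2.0 * (m + B[k])
        v = -1.0 if slope > 0 else (1.0 if slope < 0 else 0.0)
        u_new.append(v)
        xs = [x + v * dt + sigma * rng.gauss(0.0, math.sqrt(dt)) for x in xs]
    cost = sum(x * x for x in xs) / n_samples
    return u_new, cost


def descent(u0, x0=1.0, sigma=0.3, T=1.0, n_samples=500, eps=1e-3, max_iter=10, seed=0):
    """Repeat the feedback rollout until the cost change falls below eps."""
    rng = random.Random(seed)
    u, costs = list(u0), []
    for _ in range(max_iter):
        u, cost = feedback_rollout(u, x0, sigma, T, n_samples, rng)
        costs.append(cost)
        if len(costs) > 1 and abs(costs[-2] - costs[-1]) < eps:
            break
    return u, costs


u_final, costs = descent([0.0] * 50)
print([round(c, 3) for c in costs])
```

In this toy setting the zero control has exact cost $x_0^2 + \sigma^2 T = 1.09$, while the first feedback rollout already drives the mean close to zero, leaving roughly the irreducible variance $\sigma^2 T = 0.09$. With a finite particle population the stopping test is subject to sampling noise, so the loop may run until `max_iter`.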