
Optimal Control with Lyapunov Stability Guarantees for Space Applications

Abhijeet, Mohamed, Sharma et al.

This paper investigates the infinite horizon optimal control problem (OCP) for space applications characterized by nonlinear dynamics. The proposed approach divides the problem into a finite horizon OCP with a regularized terminal cost, guiding the system towards a terminal set, and an infinite horizon linear regulation phase within this set. This strategy guarantees global asymptotic stability under specific assumptions. Our method maintains the system's fully nonlinear dynamics until it reaches the terminal set, where the system dynamics is linearized. As the terminal set converges to the origin, the difference in optimal cost incurred reduces to zero, guaranteeing an efficient and stable solution. The approach is tested through simulations on three problems: spacecraft attitude control, rendezvous maneuver, and soft landing. In spacecraft attitude control, we focus on achieving precise orientation and stabilization. For rendezvous maneuvers, we address the navigation of a chaser to meet a target spacecraft. For the soft landing problem, we ensure a controlled descent and touchdown on a planetary surface. We provide numerical results confirming the effectiveness of the proposed method in managing these nonlinear dynamics problems, offering robust solutions essential for successful space missions.


Basic Information

Paper ID: 2510.08854
Title: Optimal Control with Lyapunov Stability Guarantees for Space Applications
Authors: Abhijeet, Mohamed Naveed Gul Mohamed, Aayushman Sharma, Suman Chakravorty (Texas A&M University)
Categories: math.OC (Optimization and Control), cs.SY (Systems and Control), eess.SY (Systems and Control)
Published: October 9, 2025
Paper link: https://arxiv.org/abs/2510.08854v1

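Summary

The paper studies the infinite horizon optimal control problem (OCP) for space applications with nonlinear dynamics. The proposed method splits the problem into two stages: a finite horizon OCP with a regularized terminal cost that steers the system into a terminal set, and an infinite horizon linear regulation stage inside that set. Under specific assumptions, this strategy guarantees global asymptotic stability. The method keeps the fully nonlinear dynamics until the system reaches the terminal set and linearizes the dynamics only after that point. As the terminal set shrinks to the origin, the difference in optimal cost incurred tends to zero, so the solution is both efficient and stable. The method is validated in simulation on three problems: spacecraft attitude control, rendezvous maneuver, and soft landing.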

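Research Background and Motivation

Problem Background

Control challenges in space missions: Space exploration needs advanced control strategies to ensure mission success. Tasks ranging from precise spacecraft pointing to delicate docking and landing maneuvers must all overcome the inherent challenges of the space environment.
Limitations of traditional methods:
- Shooting methods: effective for attitude control and trajectory optimization, but they adapt poorly and are sensitive to the initial guess
- Direct methods (SQP, interior point): can handle constraints, but cannot guarantee global asymptotic stability or provide feedback
- Reinforcement learning (RL): heavily data dependent, with inconsistent results
Need for long-term stability: Space missions require the system to reach a specific terminal state from an arbitrary initial state, which makes global asymptotic stability especially valuable.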

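Research Motivation

To address the limitations of existing methods for solving optimal control problems, and the need for long-term stability, the paper reformulates the problem as an infinite horizon OCP and solves it with a tractable method that provides feedback and guarantees global asymptotic stability.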

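Core Contributions

A new framework for solving infinite horizon nonlinear optimal control problems: the infinite horizon problem is split into a finite horizon nonlinear OCP and a linear regulation stage
Theoretical guarantees: the proposed method is shown to satisfy the Bellman equation and to yield a control Lyapunov function (CLF), which ensures global asymptotic stability
A practical algorithm: a hybrid method combining the iterative linear quadratic regulator (iLQR) with the linear quadratic regulator (LQR)
Validation: the method is tested on three key space applications: spacecraft attitude control, rendezvous maneuver, and soft landing
Convergence analysis: as the terminal set parameter M → 0, the cost of the alternate-construction OCP (AC-OCP) is shown to converge to the cost of the true infinite horizon OCP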

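Method Details

Task Definition

The infinite horizon optimal control problem is defined as:

$$J^*_\infty(x) = \min_{\{u_t\}} \sum_{t=0}^{\infty} c(x_t, u_t), \quad \text{given } x_0 = x,$$
$$\text{subject to: } x_{t+1} = f(x_t, u_t),$$

where:

$x_t \in \mathbb{R}^n$: system state vector
$u_t \in \mathbb{R}^p$: control input
$c(x_t, u_t)$: incremental cost function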
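
As a rough illustration of this definition, the sketch below evaluates a truncated version of the infinite sum along a closed-loop rollout. The scalar dynamics `f`, the feedback `policy`, and the weights are hypothetical placeholders, not taken from the paper; the quadratic stage cost matches the form listed under the evaluation metrics.

```python
import numpy as np

def stage_cost(x, u, Q, R):
    """Quadratic incremental cost c(x, u) = 0.5 * (x^T Q x + u^T R u)."""
    return 0.5 * (x @ Q @ x + u @ R @ u)

def rollout_cost(f, policy, x0, Q, R, horizon):
    """Sum c(x_t, u_t) for t = 0..horizon-1 along x_{t+1} = f(x_t, u_t)."""
    x, total = np.asarray(x0, dtype=float), 0.0
    for _ in range(horizon):
        u = policy(x)
        total += stage_cost(x, u, Q, R)
        x = f(x, u)
    return total, x

# Hypothetical scalar system: x_{t+1} = x_t + 0.1 * (sin(x_t) + u_t)
f = lambda x, u: x + 0.1 * (np.sin(x) + u)
policy = lambda x: -2.0 * x          # illustrative stabilizing feedback, not optimal
Q, R = np.eye(1), np.eye(1)

cost_100, x_100 = rollout_cost(f, policy, [1.0], Q, R, 100)
cost_200, x_200 = rollout_cost(f, policy, [1.0], Q, R, 200)
```

Because this feedback drives the state to the origin, the truncated sums stop changing once the horizon is long enough, which is what makes the infinite horizon cost finite for a stabilizing policy.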

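Model Architecture

1. Alternate-construction optimal control problem (AC-OCP)

The infinite horizon problem is converted into:

$$J^M_\infty(x) = \min_{\{u_t\}_{t=0}^{T-1},\, T} \left[ \sum_{t=0}^{T-1} c(x_t, u_t) + \max\big(\bar{J}_\infty(x_T),\, M\big) \right]$$
$$\text{subject to: } x_{t+1} = f(x_t, u_t), \quad x_T \in \Omega_M,$$

where $\Omega_M = \{x \mid \bar{J}_\infty(x) \le M\}$ is the terminal set.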
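
The terminal set test and the regularized terminal cost can be sketched as follows, assuming the quadratic form $\bar{J}_\infty(x) = x^T P_\infty x$ that the summary gives for the linear regulation stage. The matrix `P` and the level `M` below are made-up illustrative values.

```python
import numpy as np

def lqr_cost_to_go(x, P):
    """Quadratic cost-to-go estimate J_bar(x) = x^T P x."""
    return float(x @ P @ x)

def in_terminal_set(x, P, M):
    """Membership test for Omega_M = {x | J_bar(x) <= M}."""
    return lqr_cost_to_go(x, P) <= M

def regularized_terminal_cost(x, P, M):
    """Terminal cost max(J_bar(x), M) used in the AC-OCP."""
    return max(lqr_cost_to_go(x, P), M)

# Hypothetical values, for illustration only
P = np.diag([2.0, 1.0])
M = 0.5
x_inside = np.array([0.1, 0.2])    # J_bar = 0.06, inside the set
x_outside = np.array([1.0, 1.0])   # J_bar = 3.0, outside the set

inside_flag = in_terminal_set(x_inside, P, M)
outside_flag = in_terminal_set(x_outside, P, M)
cost_inside = regularized_terminal_cost(x_inside, P, M)
cost_outside = regularized_terminal_cost(x_outside, P, M)
```

For any state that satisfies the terminal constraint, the `max` evaluates to `M`, so the terminal term is constant over the whole terminal set.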

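2. Two-stage solution strategy

Stage 1: nonlinear finite horizon OCP

The finite horizon problem is solved with iLQR:

$$J^T_\infty(x) = \min_{\{u_t\}_{t=0}^{T-1}} \left[ \sum_{t=0}^{T-1} c(x_t, u_t) + \bar{J}_\infty(x_T) \right]$$

Stage 2: linear regulation

An LQR controller is used inside the terminal set $\Omega_M$
Linearized system: $\bar{J}_\infty(x) = x^T P_\infty x$, where $P_\infty$ is the solution of the steady-state Riccati equation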
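
One way to obtain $P_\infty$ and the corresponding LQR gain is fixed-point iteration on the discrete-time Riccati recursion, sketched below with NumPy only. The matrices `A` and `B` are a hypothetical discretized double integrator, not one of the paper's linearized models, and the sketch uses the standard recursion without the factor of ½; how the paper scales $P_\infty$ relative to its stage cost is not visible in this summary.

```python
import numpy as np

def lqr_gain(A, B, R, P):
    """Gain K for u = -K x, given a cost-to-go matrix P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def solve_dare_by_iteration(A, B, Q, R, tol=1e-10, max_iter=10000):
    """Iterate P <- Q + A^T P (A - B K) until the update is below tol."""
    P = Q.copy()
    for _ in range(max_iter):
        K = lqr_gain(A, B, R, P)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, lqr_gain(A, B, R, P_next)
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")

# Hypothetical linearized model: double integrator with step 0.1
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.eye(2), np.eye(1)

P_inf, K_inf = solve_dare_by_iteration(A, B, Q, R)
closed_loop_radius = np.max(np.abs(np.linalg.eigvals(A - B @ K_inf)))
```

A closed-loop spectral radius below one confirms that the linear feedback is stabilizing for the linearized model. In practice a library routine such as `scipy.linalg.solve_discrete_are` would usually replace the hand-written iteration.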

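3. iLQR algorithm implementation

Forward pass:

$$u^{k+1}_t = u^k_t + \alpha k_t + K_t \left( x^{k+1}_t - x^k_t \right)$$
$$x^{k+1}_{t+1} = f\left( x^{k+1}_t, u^{k+1}_t \right)$$

Backward pass: compute the partial derivatives of the Q-function and update the gains:

$$k_t = -Q_{u_t u_t}^{-1} Q_{u_t}$$
$$K_t = -Q_{u_t u_t}^{-1} Q_{u_t x_t}$$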
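
The two passes can be sketched as below. This is a minimal version under stated assumptions rather than the authors' implementation: the stage cost is the quadratic ½(xᵀQx + uᵀRu), the terminal cost is xᵀP_T x, the step size α is fixed instead of chosen by line search, and `linearize` is a hypothetical user-supplied function returning the Jacobians of `f`. The test system is a linear double integrator, chosen because a single full step is then exactly optimal.

```python
import numpy as np

def backward_pass(xs, us, linearize, Q, R, P_T):
    """Compute feedforward terms k_t and feedback gains K_t."""
    T = len(us)
    V_x, V_xx = 2.0 * P_T @ xs[T], 2.0 * P_T   # derivatives of x^T P_T x
    ks, Ks = [None] * T, [None] * T
    for t in reversed(range(T)):
        A, B = linearize(xs[t], us[t])
        Q_x = Q @ xs[t] + A.T @ V_x
        Q_u = R @ us[t] + B.T @ V_x
        Q_xx = Q + A.T @ V_xx @ A
        Q_uu = R + B.T @ V_xx @ B
        Q_ux = B.T @ V_xx @ A
        ks[t] = -np.linalg.solve(Q_uu, Q_u)     # k_t = -Q_uu^{-1} Q_u
        Ks[t] = -np.linalg.solve(Q_uu, Q_ux)    # K_t = -Q_uu^{-1} Q_ux
        V_x = Q_x + Ks[t].T @ Q_uu @ ks[t] + Ks[t].T @ Q_u + Q_ux.T @ ks[t]
        V_xx = Q_xx + Ks[t].T @ Q_uu @ Ks[t] + Ks[t].T @ Q_ux + Q_ux.T @ Ks[t]
    return ks, Ks

def forward_pass(f, xs, us, ks, Ks, alpha=1.0):
    """Roll out u_new = u + alpha*k + K (x_new - x) through the dynamics f."""
    xs_new, us_new = [xs[0]], []
    for t in range(len(us)):
        u = us[t] + alpha * ks[t] + Ks[t] @ (xs_new[t] - xs[t])
        us_new.append(u)
        xs_new.append(f(xs_new[t], u))
    return xs_new, us_new

def total_cost(xs, us, Q, R, P_T):
    run = sum(0.5 * (x @ Q @ x + u @ R @ u) for x, u in zip(xs[:-1], us))
    return run + xs[-1] @ P_T @ xs[-1]

# Hypothetical linear test system (double integrator, step 0.1)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
f = lambda x, u: A @ x + B @ u
linearize = lambda x, u: (A, B)
Q, R, P_T = np.eye(2), np.eye(1), np.eye(2)

T = 50
us0 = [np.zeros(1) for _ in range(T)]
xs0 = [np.array([1.0, 0.0])]
for u in us0:
    xs0.append(f(xs0[-1], u))

ks, Ks = backward_pass(xs0, us0, linearize, Q, R, P_T)
xs1, us1 = forward_pass(f, xs0, us0, ks, Ks)
ks_check, _ = backward_pass(xs1, us1, linearize, Q, R, P_T)
```

For nonlinear dynamics the same two functions would be called repeatedly, with `linearize` evaluated along the latest trajectory and α reduced whenever the cost fails to decrease.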

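Technical Innovations

Free terminal time optimization: optimizing the transfer time T ensures a smooth transition into the terminal set
Asymptotic optimality: the paper proves that $\lim_{M \to 0} J^M_\infty(x) = J^*_\infty(x)$
Stability guarantee: the AC-OCP cost function satisfies the Bellman equation and serves as a CLF, which ensures global asymptotic stability
Hybrid treatment of the dynamics: the fully nonlinear dynamics are kept outside the terminal set, and the dynamics are linearized inside it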

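Experimental Setup

Application Scenarios

The paper validates the method on three key space applications:

Spacecraft attitude control
Rendezvous maneuver
Soft landing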

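System Dynamics

1. Attitude control

State vector: $[\psi, \theta, \phi, \omega_1, \omega_2, \omega_3]^T$

Euler angle dynamics and angular velocity dynamics
Moment of inertia matrix: $J = \mathrm{diag}(4500, 2000, 7500)$
Time horizon: 200 s; discretization step: 0.1 s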
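
A minimal sketch of one discrete step of such an attitude model is given below. The inertia matrix and the 0.1 s step come from the summary; everything else is an assumption made for illustration: a 3-2-1 (yaw, pitch, roll) Euler angle convention, Euler's rigid-body equation with a body torque input `u`, and forward Euler integration. The paper's exact convention and discretization are not stated here.

```python
import numpy as np

J = np.diag([4500.0, 2000.0, 7500.0])   # inertia matrix from the text
DT = 0.1                                 # discretization step from the text (s)

def attitude_step(x, u, dt=DT):
    """One forward Euler step for x = [psi, theta, phi, w1, w2, w3]."""
    psi, theta, phi = x[:3]
    w = x[3:]
    w1, w2, w3 = w
    # Assumed 3-2-1 Euler angle kinematics (singular at theta = +/- 90 deg)
    s = w2 * np.sin(phi) + w3 * np.cos(phi)
    angle_rates = np.array([
        s / np.cos(theta),                      # psi_dot
        w2 * np.cos(phi) - w3 * np.sin(phi),    # theta_dot
        w1 + s * np.tan(theta),                 # phi_dot
    ])
    # Euler's equation: J w_dot = u - w x (J w)
    w_dot = np.linalg.solve(J, u - np.cross(w, J @ w))
    return x + dt * np.concatenate([angle_rates, w_dot])

# Torque-free spin about the third principal axis: w stays constant
x_spin = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.1])
for _ in range(10):
    x_spin = attitude_step(x_spin, np.zeros(3))

# A torque of 45 about the first axis from rest gives w1_dot = 45 / 4500
x_torque = attitude_step(np.zeros(6), np.array([45.0, 0.0, 0.0]))
```

With the 200 s horizon and 0.1 s step listed above, a full rollout of this model would take 2000 such steps.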

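2. Rendezvous maneuver

The state consists of the relative position error $e_r$, the relative velocity error $e_v$, and the mass $m$

Elliptical orbit dynamics
Time horizon: 6000 s; discretization step: 2 s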

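3. Soft landing

Combines attitude and position dynamics

Mars gravity: $g_{\mathrm{ref}} = [0, 0, -3.7114]^T$
Includes mass variation and thrust constraints
Time horizon: 30 s; discretization step: 0.2 s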

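Evaluation Metrics

Total cost: quadratic cost $c(x, u) = \tfrac{1}{2}\left( x^T Q x + u^T R u \right)$
Terminal state error
Smoothness of the control input
Convergence analysis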

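Experimental Results

Main Results

1. Attitude control

Effect of transfer time: as the transfer time increases from 10 s to 80 s, the total cost drops from 6.45×10^5 to 5.20×10^5
State convergence:
- 10 s transfer: terminal errors of 34.86°, -33.19°, -36.71°, 2.79°/s, 6.02°/s, 0.97°/s
- 80 s transfer: terminal errors of -0.77°, -0.15°, 0.55°, -0.05°/s, 0.02°/s, -0.05°/s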