
Strategy Templates for Almost-Sure and Positive Winning of Stochastic Parity Games towards Permissive and Resilient Control

Phalakarn, Pruekprasert, Hasuo

Stochastic games are fundamental in various applications, including the control of cyber-physical systems (CPS), where both controller and environment are modeled as players. Traditional algorithms typically aim to determine a single winning strategy to develop a controller. However, in CPS control and other domains, permissive controllers are essential, as they enable the system to adapt when additional constraints arise and remain resilient to runtime changes. This work generalizes the concept of (permissive winning) strategy templates, originally introduced by Anand et al. at TACAS and CAV 2023 for deterministic games, to incorporate stochastic games. These templates capture an infinite number of winning strategies, allowing for efficient strategy adaptation to system changes. We focus on two winning criteria (almost-sure and positive winning) and five winning objectives (safety, reachability, Büchi, co-Büchi, and parity). Our contributions include algorithms for constructing templates for each winning criterion and objective and a novel approach for extracting a winning strategy from a given template. Discussions on comparisons between templates and between strategy extraction methods are provided.


Basic Information

Paper ID: 2409.08607
Title: Strategy Templates for Almost-Sure and Positive Winning of Stochastic Parity Games towards Permissive and Resilient Control
Authors: Kittiphon Phalakarn, Sasinee Pruekprasert, Ichiro Hasuo
Categories: eess.SY cs.LO cs.SY
Published: September 2024 (arXiv v2: October 16, 2025)
Paper link: https://arxiv.org/abs/2409.08607

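Summary

Stochastic games are fundamental to many applications, in particular the control of cyber-physical systems (CPS), where the controller and the environment are modeled as players. Traditional algorithms usually aim to find a single winning strategy from which to build a controller. In CPS control and other domains, however, permissive controllers are essential, because they let the system adapt when additional constraints arise and stay resilient to runtime changes. This work generalizes the notion of strategy templates from deterministic games to stochastic games. Such templates capture infinitely many winning strategies and allow a strategy to be adapted efficiently when the system changes. The paper focuses on two winning criteria (almost-sure winning and positive winning) and five winning objectives (safety, reachability, Büchi, co-Büchi, and parity).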

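Research Background and Motivation

Problem Background

Limitations of traditional methods: traditional game-solving algorithms usually look for a single winning strategy and do not take the permissiveness of the strategy into account.
Practical needs: in the control of cyber-physical systems, permissive controllers are needed to accommodate additional constraints and runtime changes.
Need for resilient control: the system must remain robust in the face of faults or changes in the environment.

Research Motivation

The existing notion of strategy templates applies only to deterministic games and offers no support for stochastic games.
A framework that captures infinitely many winning strategies is needed to support fast strategy adaptation.
In practical applications such as CPS control, permissiveness and resilience are key requirements.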

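Core Contributions

Almost-sure winning strategy template algorithms: algorithms that construct almost-sure winning strategy templates for the five winning objectives (safety, reachability, Büchi, co-Büchi, parity).
Positive winning strategy templates: algorithms for constructing and combining strategy templates under the positive winning criterion.
Comparison of strategy templates: a discussion that compares templates by permissiveness and size.
Strategy extraction method: a new method for extracting a winning strategy from a given template that balances the winning objective against permissiveness.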

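Method Details

Task Definition

Stochastic game: a stochastic game is G = (V, E, (V□, V○, V△)), where:

V is the set of vertices and E is the set of edges
V□, V○, V△ are the vertices of the Even player, the Odd player, and the Random player, respectively
Such a game is called a "2.5-player" game: it has two main players and one random player

Strategy template: T = (P, L, C), where:

P ⊆ E□ is the set of unsafe (forbidden) edges
L ⊆ 2^E□ is the set of live groups
C ⊆ E□ is the set of co-live edges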
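
The two definitions above can be sketched as plain Python data structures. This is only an illustration: the names `Game`, `Template`, and `lasso_follows` are hypothetical and do not come from the paper. The reading of the three template components (unsafe edges are never taken, co-live edges are taken only finitely often, and a live group must be served infinitely often whenever one of its source vertices recurs) is the one used for deterministic-game templates by Anand et al.; it is assumed here to carry over unchanged.

```python
from dataclasses import dataclass

Edge = tuple[str, str]


@dataclass(frozen=True)
class Game:
    even: frozenset[str]    # V□, vertices of the Even player
    odd: frozenset[str]     # V○, vertices of the Odd player
    random: frozenset[str]  # V△, vertices of the Random player
    edges: frozenset[Edge]  # E

    def even_edges(self) -> frozenset[Edge]:
        """E□: the edges whose source belongs to the Even player."""
        return frozenset((u, v) for (u, v) in self.edges if u in self.even)


@dataclass(frozen=True)
class Template:
    unsafe: frozenset[Edge] = frozenset()          # P
    live_groups: tuple[frozenset[Edge], ...] = ()  # L
    colive: frozenset[Edge] = frozenset()          # C


def lasso_follows(template: Template, prefix: list[Edge], cycle: list[Edge]) -> bool:
    """Check the play 'prefix, then cycle repeated forever' against a template.

    Assumed semantics (taken from the deterministic setting):
      * no unsafe edge is ever taken,
      * no co-live edge is taken infinitely often,
      * if a source vertex of a live group recurs, some edge of the group recurs.
    """
    taken = set(prefix) | set(cycle)
    recurring = set(cycle)
    if taken & template.unsafe:
        return False
    if recurring & template.colive:
        return False
    recurring_sources = {u for (u, _) in recurring}
    for group in template.live_groups:
        group_sources = {u for (u, _) in group}
        if group_sources & recurring_sources and not (group & recurring):
            return False
    return True
```

Checking only lasso-shaped plays keeps the example finite; it says nothing about probabilities, which the stochastic setting of the paper has to handle in addition.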

Model Architecture

1. Constructing almost-sure winning strategy templates

Safety objective (G X):

SafetyTemplate(G, X):
1. W□ ← νY.(X ∩ (Pre□(Y) ∪ Pre(Y)))
2. P ← Edges□(W□, V \ W□)
3. return (P, ∅, ∅)
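
The greatest fixed point in the listing above can be computed by repeatedly shrinking a candidate set. The sketch below is a hedged illustration, not the authors' implementation. It assumes that Pre□(Y) contains the Even vertices with at least one edge into Y, that the second Pre term contains the Odd and Random vertices all of whose edges lead into Y, and that Edges□(S, T) denotes the Even-owned edges from S to T. The function name `safety_template` and its parameters are hypothetical.

```python
def safety_template(even, edges, safe):
    """Sketch of an almost-sure safety template under the assumptions above.

    even  : set of Even-player vertices (all other vertices are Odd or Random)
    edges : set of (source, target) pairs
    safe  : the set X that must never be left
    Returns (winning region, set of unsafe edges P).
    """
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)

    win = set(safe)  # the fixed point lies inside X, so iteration can start at X
    while True:
        keep = set()
        for u in win:
            targets = succ.get(u, set())
            if u in even:
                ok = bool(targets & win)  # Even needs one edge that stays inside
            else:
                # Odd/Random: every edge must stay inside (dead ends are rejected)
                ok = bool(targets) and targets <= win
            if ok:
                keep.add(u)
        if keep == win:
            break
        win = keep

    # P: Even-owned edges that leave the winning region
    unsafe = {(u, v) for (u, v) in edges if u in even and u in win and v not in win}
    return win, unsafe
```

For safety, Odd and Random vertices are treated alike in this sketch, because a single escaping random edge already gives a positive probability of leaving X.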

Reachability objective (F X):

ReachabilityTemplate(G, X):
1. A ← Attr'(X)
2. W□ ← Attr'□(A)
3. P ← Edges□(W□, V \ W□)
4. C ← Edges□(W□ \ A, W□ \ A)
5. return (P, ∅, C)
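
Steps 3 to 5 of the listing above are plain set operations once the regions W□ and A are known. The sketch below takes both regions as inputs, because this summary does not define the attractor operators Attr' and Attr'□. The helper `edges_even` encodes an assumed reading of Edges□(S, T) as the Even-owned edges from S to T; both function names are hypothetical.

```python
def edges_even(even, edges, sources, targets):
    """Assumed reading of Edges□(S, T): Even-owned edges from S into T."""
    return {(u, v) for (u, v) in edges
            if u in even and u in sources and v in targets}


def reachability_template_from_regions(vertices, even, edges, win, attr):
    """Assemble (P, L, C) from precomputed regions.

    win  : the region W□ from step 2 (given here, not computed)
    attr : the region A from step 1 (given here, not computed)
    """
    unsafe = edges_even(even, edges, win, vertices - win)  # step 3
    rest = win - attr
    colive = edges_even(even, edges, rest, rest)           # step 4
    return unsafe, [], colive                              # step 5: no live groups
```

In this reading, the co-live edges stop Even from circling forever inside W□ \ A, so a play that follows the template keeps making progress toward A.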

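Büchi objective (GF X): live groups are constructed with the LiveGroups function, which ensures that plays visit the target set infinitely often.

Parity objective:

Reduce the stochastic game to a deterministic game (using the Reduce algorithm)
Construct a strategy template for the deterministic game
Translate the template back into a template for the stochastic game

2. Constructing positive winning strategy templates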

PositiveTemplate(G, φ):
1. Compute W□, W○ and the almost-sure winning template T^(a)
2. W? ← V \ (W□ ∪ W○)
3. P^(p) ← P^(a) ∪ Edges□(W?, W○)
4. C^(p) ← C^(a) ∪ Edges□(W?, W?)
5. return T^(p) = (P^(p), L^(p), C^(p))
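
Steps 2 to 5 of PositiveTemplate can be sketched as set operations on top of a precomputed almost-sure template. Several points are assumptions: the function and parameter names are hypothetical, Edges□(S, T) is read as the Even-owned edges from S to T, and the live groups L^(p) are simply copied from L^(a), because the listing does not say how L^(p) is obtained.

```python
def positive_template(vertices, even, edges, win_even, win_odd, almost_sure):
    """Combine an almost-sure template into a positive-winning template.

    win_even, win_odd : the regions W□ and W○ from step 1 (given, not computed)
    almost_sure       : the triple (P^(a), L^(a), C^(a)) from step 1
    """
    unsafe_a, live_a, colive_a = almost_sure

    def edges_even(sources, targets):
        # assumed reading of Edges□(S, T)
        return {(u, v) for (u, v) in edges
                if u in even and u in sources and v in targets}

    undecided = vertices - (win_even | win_odd)                   # step 2: W?
    unsafe_p = set(unsafe_a) | edges_even(undecided, win_odd)     # step 3
    colive_p = set(colive_a) | edges_even(undecided, undecided)   # step 4
    live_p = list(live_a)  # ASSUMPTION: L^(p) = L^(a); not stated in the listing
    return unsafe_p, live_p, colive_p                             # step 5
```

Read this way, the template forbids Even from stepping from W? into W○ and discourages staying inside W? forever.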