2025-11-28T04:01:18.891206

Natural Strategic Ability in Stochastic Multi-Agent Systems

Berthon, Katoen, Mittelmann et al.
Strategies synthesized using formal methods can be complex and often require infinite memory, which does not correspond to the expected behavior when trying to model Multi-Agent Systems (MAS). To capture such behaviors, natural strategies are a recently proposed framework striking a balance between the ability of agents to strategize with memory and the model-checking complexity, but until now has been restricted to fully deterministic settings. For the first time, we consider the probabilistic temporal logics PATL and PATL* under natural strategies (NatPATL and NatPATL*, resp.). As main result we show that, in stochastic MAS, NatPATL model-checking is NP-complete when the active coalition is restricted to deterministic strategies. We also give a 2NEXPTIME complexity result for NatPATL* with the same restriction. In the unrestricted case, we give an EXPSPACE complexity for NatPATL and 3EXPSPACE complexity for NatPATL*.
academic

Natural Strategic Ability in Stochastic Multi-Agent Systems

Basic Information

  • Paper ID: 2401.12170
  • Title: Natural Strategic Ability in Stochastic Multi-Agent Systems
  • Authors: Raphaël Berthon, Joost-Pieter Katoen (RWTH Aachen University), Munyque Mittelmann, Aniello Murano (University of Naples Federico II)
  • Classification: cs.LO (Logic in Computer Science), cs.AI (Artificial Intelligence)
  • Publication Time/Venue: AAAI 2024 (Extended version, revised November 2025)
  • Paper Link: https://arxiv.org/abs/2401.12170

Abstract

This paper extends the natural strategies framework to stochastic multi-agent systems (MAS) for the first time, proposing variants of probabilistic alternating temporal logic PATL and PATL* under natural strategies (NatPATL and NatPATL*). The main results show that when coalitions are restricted to deterministic strategies, NatPATL model checking is ∆P₂-complete; NatPATL* has 2NEXPTIME complexity. In the unrestricted case (probabilistic strategies), NatPATL has EXPSPACE complexity, and NatPATL* has 3EXPSPACE. This work achieves a good balance between strategic ability verification for finite-memory agents and computational complexity.

Research Background and Motivation

1. Core Problem

Strategies synthesized by formal methods typically have high complexity and require infinite memory, which is unrealistic in practical multi-agent system modeling. Human agents and computationally resource-constrained artificial agents cannot execute such complex strategies.

2. Problem Significance

  • Practical Requirements: Real-world agents (human or constrained AI) need executable, understandable strategies
  • Uncertainty Modeling: MAS commonly face stochasticity (natural events, agent behavior, sensor errors, etc.)
  • Safety-Critical Applications: Electronic voting systems, access control, etc. require verification of strategic ability under uncertain environments

3. Limitations of Existing Approaches

  • PATL/PATL*: Require infinite-memory strategies; model checking complexity is in NP∩co-NP but impractical
  • PSL: Undecidable; even restricted to memoryless strategies requires 3EXPSPACE
  • Stochastic Game Logic: Memoryless deterministic strategies are PSPACE, memoryless probabilistic strategies are EXPSPACE, but the memoryless assumption is too restrictive
  • Existing Natural Strategy Research: Limited to fully deterministic environments, cannot handle stochasticity

4. Research Motivation

  • Extend natural strategies to stochastic environments, filling a theoretical gap
  • Achieve balance between bounded memory and reasonable complexity
  • Provide theoretical foundation for multi-agent POMDP extensions

Core Contributions

  1. Theoretical Extension: First extension of the natural strategies framework from deterministic environments to stochastic multi-agent systems, defining behavioral natural strategies
  2. New Logic Systems: Propose NatPATL and NatPATL* two logic systems, supporting both memoryless (r) and bounded recall (R) semantics
  3. Complexity Results (see Table 1 summary):
    • Deterministic Strategies: NatPATLr/R is ∆P₂-complete (NP-hard lower bound), NatPATL*r/R is 2NEXPTIME
    • Probabilistic Strategies: NatPATLr/R is EXPSPACE, NatPATL*r/R is 3EXPSPACE
  4. Expressiveness Analysis: Prove that NatPATL() and PATL() have incomparable distinguishing power and expressiveness
  5. Application Examples: Demonstrate practical value through electronic voting systems and access control systems

Methodology Details

Task Definition

Model Checking Problem: Given a stochastic concurrent game structure (CGS) G, state s, and NatPATL(*) formula φ, determine whether G, s ⊨ρ φ holds (ρ∈{r,R} denotes memoryless or bounded recall).

Input:

  • CGS G = (St, L, δ, ℓ): state set, legality function, stochastic transition function, labeling function
  • Initial state s ∈ St
  • NatPATL(*) formula φ, including complexity bound k

Output: Boolean value indicating whether the formula is satisfied

Core Concept: Behavioral Natural Strategies

1. Definition Structure

A behavioral natural strategy σₐ is an ordered list of condition-action pairs:

σₐ = [(r₁, d₁), (r₂, d₂), ..., (rₙ, dₙ)]

where:

  • rᵢ ∈ Reg(Bool(AP)): Regular expression conditions (based on propositional formula sequences)
  • dᵢ ∈ Dist(Ac): Probability distribution over actions
  • The last pair must be (⊤*, d) as the default action

2. Matching Mechanism

Given history h, the strategy selects the first matching rule:

match(h, σₐ) = min{i : h matches condᵢ(σₐ) and actᵢ(σₐ) is legal at last(h)}

History h matches regular expression r if and only if there exists a word b∈L(r) such that each state in h satisfies the corresponding boolean formula in b.

3. Complexity Measure

Strategy complexity c(σ) = Σ|r| (total number of symbols in all regular expressions)

4. Example (Electronic Voting System)

Voter's deterministic memoryless strategy:

1. (hasBallot_v ∧ ¬scanned_v, scanBallot)
2. (¬vot_v ∧ scanned_v, enterVote)
3. (¬vot_v ∧ entVote_{v,s} ∧ ¬(sigOk_s ∨ sigFail_s), checkSig_s)
4. (¬vot_v ∧ entVote_{v,s} ∧ sigFail_s, cnlVote)
5. (¬vot_v ∧ entVote_{v,s} ∧ sigOk_s, conf)
6. (vot_v ∧ rec_{v,r} ∧ ¬shreded_r, shred_r)
7. (⊤, noop)

Coercer's probabilistic recall strategy:

1. (⊤* · ⋀_v ¬coerced_v, d_V)  // Randomly select coercion target
2. (⊤* · coerced_v ∧ ¬requested_v, request_v)
3. (⊤* · ¬requested_v* · (requested_v ∧ ¬vot_v)* ∧ ¬punished_v, punish_v)
4. (⊤* · ¬coerced_v ∧ ¬coerced_{v'}, d_{v,v'})
5. (⊤*, noop)

Logic Syntax and Semantics

NatPATL* Syntax

φ ::= p | φ∨φ | ¬φ | Xφ | φUφ | ⟨⟨C⟩⟩_{▷◁d}^k φ

where ⟨⟨C⟩⟩_{▷◁d}^k φ denotes: coalition C has a strategy with complexity ≤k such that the probability of satisfying φ has relation ▷◁ with d.

NatPATL Syntax (Restricted)

φ ::= p | φ∨φ | ¬φ | ⟨⟨C⟩⟩_{▷◁d}^k (Xφ) | ⟨⟨C⟩⟩_{▷◁d}^k (φUφ)

Temporal operators must immediately follow coalition operators.

Probability Space Construction

  • Strategy profile σ and state s induce Markov chain M_{σ,s}
  • Transition probability: p(h, hs') = Σ_c σ(h)(c) × δ(last(h), c)(s')
  • Generate canonical probability measure out(σ, s)
  • Possible outcomes of coalition strategy σ_C: out_C(σ_C, s) = {out((σ_C, σ_{-C}), s) : σ_{-C}∈∏_{a∈Ag-C} Str^ρ_a}

Semantic Definition

Key coalition operator semantics:

G, π ⊨ρ ⟨⟨C⟩⟩_{▷◁d}^k φ ⟺ 
  ∃σ_C ∈ ∏_{a∈C} {α∈Str^ρ_a : c(α)≤k} such that
  ∀μ^{σ_C}_{π₀} ∈ out_C(σ_C, π₀), μ^{σ_C}_{π₀}({π' : G,π'⊨ρ φ}) ▷◁ d

Technical Innovations

  1. Probabilistic Strategy Extension: Extends originally deterministic natural strategies to probability distributions, more aligned with practical decision-making
  2. Regular Expression Conditions: Uses regular expressions rather than state histories, enabling imperfect information modeling
  3. Complexity Parameterization: Incorporates strategy complexity k as a formula parameter, enabling expression of properties like "no simple strategy exists"
  4. Behavioral Strategy Semantics: Adopts behavioral strategies (state-action probabilities) rather than mixed strategies (strategy selection probabilities), related to Kuhn equivalence in game theory
  5. Dual Adversariality: Nested structure of coalition strategy existential quantification and opponent strategy universal quantification

Experimental Setup

Note: This is a theoretical computer science paper providing complexity-theoretic results rather than experimental evaluation.

Theoretical Proof Methods

Upper Bound Proof Techniques

  1. ∆P₂ Algorithm (Theorem 1):
    • Guess polynomial-length witnesses (strategies for each subformula, state, and agent)
    • Utilize NP oracle polynomially many times
    • Verify formula bottom-up, converting to MDP reachability problems
  2. 2NEXPTIME Algorithm (Theorem 5):
    • Non-deterministically guess strategies
    • Each subformula verification requires 2EXPTIME (LTL model checking)
    • Polynomial number of oracle calls
  3. EXPSPACE/3EXPSPACE Algorithms (Theorems 7-8):
    • Convert to real arithmetic
    • Introduce variables r_{x,s,a,q} representing probability of strategy x selecting action a at state s and automaton state q
    • Automaton state count is exponential in k, leading to exponential variable count

Lower Bound Proof Techniques

  1. NP-hardness (Theorem 4): Reduction from POMDP almost-sure reachability
  2. 2EXPTIME-hardness (Theorem 6): Reduction from LTL model checking over MDPs

Case Systems

1. Access Control System (Example 3)

  • Structure: Grid-like tiles, randomly moving robot, coalition-controlled doors
  • Objective: Reach all targets with probability ≥0.7 infinitely often
  • Formula: ⟨⟨C⟩⟩^k_{≥0.7} G⋀_{t_j∈T,t_j≠t_i} Ft_j
  • Analysis Result: Demonstrates sufficiency of deterministic strategies in stochastic environments

2. Electronic Voting System (Examples 1, 2, 4)

  • Agents: Voters V, coercer C
  • Actions: Scan, vote, cancel, confirm, check signature, shred receipt, etc.
  • Stochasticity: Actions may fail (e.g., coercion may be unsuccessful)
  • Verification Properties:
    • Voter Verifiability: ⟨⟨v⟩⟩^k_{≥0.9} F(sigOk_s ∨ sigFail_s)
    • Receipt Freeness: ¬⟨⟨v⟩⟩^k_{≥0.5} F⋁r (receipt{v,r} ∧ ¬shreded_r)

Experimental Results

Main Complexity Results

LogicDeterministic StrategiesProbabilistic Strategies
NatPATLr∆P₂-completeEXPSPACE
NatPATL*r2NEXPTIME3EXPSPACE
NatPATLR∆P₂-completeEXPSPACE
NatPATL*R2NEXPTIME3EXPSPACE

Key Theorem Summary

Theorems 1-4 (NatPATL Deterministic Strategies)

  • Upper Bound: ∆P₂ = P^NP (polynomial time with NP oracle)
  • Lower Bound: NP-hard (reduction from POMDP)
  • Positive Fragment: NP-complete (Theorem 3)
  • Significance: Matches POMDP bounded-memory deterministic strategy complexity

Theorems 5-6 (NatPATL* Deterministic Strategies)

  • Upper Bound: 2NEXPTIME
  • Lower Bound: 2EXPTIME-hard
  • Gap: Exponential gap exists; whether it can be improved is an open question

Theorems 7-8 (Probabilistic Strategies)

  • NatPATL*R: 3EXPSPACE (matches PSL memoryless strategies)
  • NatPATLR: EXPSPACE (avoids LTL's double exponential blowup)
  • Technical Key: Leverages polynomial complexity of reachability/invariance

Expressiveness Results (Theorem 9)

Conclusion: NatPATL() and PATL() have incomparable distinguishing power and expressiveness

Proof Strategy:

  1. NatPATL ⊀_d PATL: Natural strategies can express "no strategy with complexity ≤k exists," which combined strategies cannot
  2. PATL ⊀_d NatPATL: Combined strategies can express certain properties requiring infinite memory that natural strategies cannot
  3. Derive expressiveness incomparability from distinguishing power

1. Stochastic MAS Verification

  • Huang & Luo (2013): Deterministic strategies + probabilistic knowledge ATL
  • Song et al. (2019): Probabilistic alternating temporal μ-calculus
  • Belardinelli et al. (2023): PATL with imperfect information and memoryless strategies
  • Chen et al. (2013): PATL with cumulative cost/reward

2. Finite-Memory Strategy Representation

  • Vester (2013): Input/output automata representation
  • Brázdil et al. (2015): Decision tree representation
  • Ågotnes & Walther (2009): Bounded-memory ATL
  • Deuser & Naumov (2020): Mealy machine representation, bounded recall impact on planning ability

3. Prior Natural Strategy Work

  • Jamroga et al. (2019a): Original definition, concurrent games in deterministic environments, Nash equilibrium, ATL model checking
  • Jamroga et al. (2019b): Extension to imperfect information ATL
  • Belardinelli et al. (2022): Extension to strategy logic SL
  • Infinite Memory: Büchi/reachability targets EXPTIME, parity targets undecidable
  • Bounded Memory (Junges et al. 2018):
    • Deterministic strategies: NP-complete
    • Probabilistic strategies: ETR-complete
  • This Paper's Contribution: Extends POMDP to multi-agent + bounded memory

Paper's Advantages

  1. First combination of probabilistic environments with natural strategies
  2. Complexity results match POMDP in deterministic case, comparable to PSL in probabilistic case
  3. Provides good balance between practicality and complexity

Conclusions and Discussion

Main Conclusions

  1. Theoretical Contribution: Successfully extends natural strategies to stochastic MAS, establishing a complete complexity landscape
  2. Practical Value:
    • ∆P₂ complexity of deterministic strategies has practical potential
    • Can capture bounded-memory POMDP without significant complexity loss
  3. Theoretical Insights:
    • Double exponential blowup from PATL to PATL* stems from LTL model checking
    • Exponential space blowup for probabilistic strategies originates from real arithmetic encoding
    • Complexity gap between deterministic and probabilistic strategies remains unresolved in related work

Limitations

  1. Complexity Gaps:
    • NatPATL* deterministic strategies: 2EXPTIME-hard vs 2NEXPTIME upper bound
    • Whether upper or lower bounds can be improved is an open question
  2. Practical Implementation:
    • 3EXPSPACE complexity may be infeasible in practice
    • Lacks experimental evaluation on actual systems
  3. Expressiveness Limitations:
    • Cannot express certain properties requiring infinite memory
    • Choice of complexity bound k requires domain knowledge
  4. Probabilistic Strategy Lower Bounds:
    • No evidence of complexity separation between probabilistic and deterministic strategies
    • May require new constructions from POMDP or Dec-POMDP

Future Directions

  1. Extension to PSL: Feasible but difficult to improve 3EXPSPACE complexity
  2. Qualitative Fragments: Consider qualitative PATL* or PSL with only >0 and =1 thresholds, potentially achieving better complexity
  3. Quantitative Techniques: Apply probabilistic model checking techniques (graph analysis, bisimulation minimization, symbolic techniques, partial order reduction)
  4. Epistemic Operators: Extend to epistemic logic framework, handling knowledge and belief
  5. Approximation Algorithms: Develop heuristic or approximate algorithms for practical applications
  6. Tool Implementation: Develop prototype tools and evaluate on practical cases

In-Depth Evaluation

Strengths

  1. Theoretical Rigor:
    • Complete complexity upper and lower bound proofs
    • Detailed semantic definitions (probability space construction)
    • Rigorous expressiveness analysis
  2. Problem Significance:
    • Fills theoretical gap in natural strategies under stochasticity
    • Addresses key requirements for practical MAS modeling
    • Connects multiple research areas (MAS, POMDP, game theory)
  3. Technical Contributions:
    • Elegant probabilistic extension of behavioral strategies
    • Innovative introduction of complexity parameterization k
    • Regular expression conditions enable imperfect information modeling
  4. Application-Oriented:
    • Electronic voting system case closely aligned with practice
    • Access control example clearly demonstrates stochasticity modeling
    • Formula examples have practical significance
  5. Writing Quality:
    • Clear structure, progressively developing from motivation to techniques
    • Rich examples aid understanding of abstract concepts
    • Comprehensive related work survey

Weaknesses

  1. Experimental Absence:
    • Purely theoretical research without practical system evaluation
    • No tool implementation or case studies provided
    • Cannot assess practical feasibility
  2. Complexity Gaps:
    • NatPATL* deterministic strategies have exponential gap
    • Probabilistic strategy lower bounds are not tight
    • Lacks deep analysis of gap sources
  3. Expressiveness Analysis:
    • Only proves incomparability, doesn't quantify differences
    • Lacks discussion of which practical properties are/aren't expressible
    • Relationship with other logics (e.g., SL) not deeply explored
  4. Strategy Complexity:
    • Complexity measure c(σ) is coarse (only symbol count)
    • Doesn't consider other factors like automaton state count
    • Lacks guidance on practical k selection
  5. Probabilistic Model:
    • Only considers finite-state CGS
    • Doesn't discuss continuous state/action spaces
    • Probability distributions limited to rationals

Impact

  1. Theoretical Impact:
    • Provides new tools for stochastic MAS verification
    • Advances natural strategy theory development
    • Bridges MAS and POMDP communities
  2. Practical Value:
    • Safety-critical applications: electronic voting, access control
    • Human-machine collaboration system design
    • Resource-constrained agent planning
  3. Reproducibility:
    • Detailed definitions and proofs
    • Complete proofs in technical appendix
    • But lacks code and datasets
  4. Follow-up Research:
    • Epistemic logic extensions
    • Approximation algorithm development
    • Tool implementation
    • Practical case studies

Applicable Scenarios

  1. Theoretical Research:
    • Multi-agent system formal verification
    • Game theory and logic intersection research
    • Complexity theory
  2. Safety-Critical Systems:
    • Electronic voting protocol verification
    • Access control policy analysis
    • Autonomous system safety verification
  3. Human-Machine Interaction:
    • Explainable AI strategy design
    • Human-understandable planning
    • Collaborative robotics
  4. Resource-Constrained Environments:
    • Embedded systems
    • IoT devices
    • Mobile robots

Inapplicable Scenarios:

  • Systems requiring precise numerical optimization
  • Continuous state/action spaces
  • Requiring fast online decision-making (complexity too high)

Selected References

  1. Jamroga, W., Malvone, V., & Murano, A. (2019). Natural strategic ability. Artificial Intelligence, 277, 103170.
    • Original definition of natural strategies
  2. Aminof, B., et al. (2019). Probabilistic Strategy Logic. IJCAI.
    • PSL's 3EXPSPACE complexity
  3. Chatterjee, K., Chmelik, M., & Davies, J. (2016). A Symbolic SAT-Based Algorithm for Almost-Sure Reachability with Small Strategies in POMDPs. AAAI.
    • NP-completeness of POMDP bounded-memory strategies
  4. Baier, C., et al. (2012). Stochastic game logic. Acta Informatica, 49(4), 203-224.
    • EXPSPACE complexity of stochastic game logic
  5. Alur, R., Henzinger, T., & Kupferman, O. (2002). Alternating-time temporal logic. J. ACM, 49(5), 672-713.
    • Seminal ATL paper

Overall Assessment: This is a high-quality theoretical computer science paper making significant contributions to stochastic multi-agent system verification. The theoretical results are rigorous and complete, problem motivation is well-established, and technical innovations are notable. Main shortcomings are the absence of experimental validation and tool implementation. For theoretical researchers and formal methods researchers, this is essential reading; for practitioners, implementation and case studies are needed. The paper's complexity results establish important theoretical benchmarks for the field.