2025-11-12T02:22:29.481811

PSN Game: Game-theoretic Prediction and Planning via a Player Selection Network

Qiu, Ouano, Palafox et al.
While game-theoretic planning frameworks are effective at modeling multi-agent interactions, they require solving large optimization problems where the number of variables increases with the number of agents, resulting in long computation times that limit their use in large-scale, real-time systems. To address this issue, we propose 1) PSN Game: a learning-based, game-theoretic prediction and planning framework that reduces runtime by learning a Player Selection Network (PSN); and 2) a Goal Inference Network (GIN) that makes it possible to use the PSN in incomplete information games where agents' intentions are unknown. A PSN outputs a player selection mask that distinguishes influential players from less relevant ones, enabling the ego player to solve a smaller, masked game involving only selected players. By reducing the number of players in the game, and therefore reducing the number of variables in the corresponding optimization problem, PSN directly lowers computation time. The PSN Game framework is more flexible than existing player selection methods as it 1) relies solely on observations of players' past trajectories, without requiring full state, action, or other game-specific information; and 2) requires no online parameter tuning. Experiments in both simulated scenarios and human trajectory datasets demonstrate that PSNs outperform baseline selection methods in 1) prediction accuracy; and 2) planning safety. PSNs also generalize effectively to real-world scenarios in which agents' objectives are unknown without fine-tuning. By selecting only the most relevant players for decision-making, PSN Game offers a general mechanism for reducing planning complexity that can be seamlessly integrated into existing multi-agent planning frameworks.
academic

PSN Game: Game-theoretic Prediction and Planning via a Player Selection Network

Basic Information

  • Paper ID: 2505.00213
  • Title: PSN Game: Game-theoretic Prediction and Planning via a Player Selection Network
  • Authors: Tianyu Qiu, Eric Ouano, Fernando Palafox, Christian Ellis, David Fridovich-Keil (University of Texas at Austin)
  • Categories: cs.RO (Robotics), math.OC (Optimization and Control)
  • Publication Date: 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2505.00213

Abstract

While game-theoretic planning frameworks are effective at modeling multi-agent interactions, they require solving large-scale optimization problems whose variable count increases with the number of agents, resulting in excessive computational time and limiting applicability in large-scale real-time systems. To address this challenge, this paper proposes: 1) PSN Game—a learning-based game-theoretic prediction and planning framework that reduces runtime by learning a Player Selection Network (PSN); 2) Goal Inference Network (GIN), enabling PSN to operate in incomplete information games where agent intentions are unknown. PSN outputs player selection masks that distinguish influential players from less relevant ones, allowing the ego agent to solve smaller-scale masked games involving only selected players. By reducing the number of players in the game and consequently the variables in the corresponding optimization problem, PSN directly decreases computational time.

Research Background and Motivation

Problem Definition

The core challenge faced by game-theoretic planning frameworks in multi-agent systems is that computational complexity grows cubically with the number of agents. As shown in Figure 2, using existing solvers, computation time grows as O(N³), where N is the number of players. This renders game-theoretic methods impractical for large-scale real-time systems.

Research Significance

  1. Real-time Requirements: Applications such as autonomous driving and robot navigation require frequent replanning, making computational efficiency critical
  2. Scalability Challenges: Real-world scenarios often involve large numbers of agents (e.g., dense traffic environments)
  3. Human Behavior Inspiration: Research shows that human drivers in dense traffic instinctively prioritize nearby threatening vehicles rather than monitoring all vehicles

Limitations of Existing Methods

Existing player selection methods suffer from the following issues:

  1. Strong Information Dependency: Require game-specific information such as control inputs and cost functions
  2. Complex Parameter Tuning: Necessitate environment-specific parameter adjustments
  3. Rigid Selection Strategies: Ranking methods based on simple heuristics (e.g., distance, gradient) lack adaptability

Core Contributions

  1. Proposed Unsupervised Player Selection Network (PSN): Trained using differentiable dynamic game solvers, supporting backpropagation through selection masks
  2. Developed Supervised Goal Inference Network (GIN): Infers agent objectives from historical trajectories, enabling PSN applicability in scenarios with unknown intentions
  3. Developed Receding Horizon Framework: Leverages PSN to efficiently identify equilibrium strategies by solving reduced-scale games
  4. Experimental Validation: Verified on multi-agent simulations and real human trajectory datasets; PSN Game effectively reduces game scale by 50%-75%, achieving significant acceleration

Methodology Details

Task Definition

Consider a finite-horizon discrete-time open-loop Nash game with N agents, where each agent i has state xkiRnx_k^i \in \mathbb{R}^n and control input ukiRmu_k^i \in \mathbb{R}^m. Agent state transitions follow the dynamics equation: xk+1i=fi(xki,uki)x_{k+1}^i = f^i(x_k^i, u_k^i)

Each agent's objective is to minimize cumulative cost: Ji(x,u;θi)=k=0Tcki(xk,uk;θi)J^i(x,u;\theta^i) = \sum_{k=0}^T c_k^i(x_k, u_k; \theta^i)

Model Architecture

1. Player Selection Network (PSN)

PSN is a neural network tasked with inferring mask MiM^i to balance performance and sparsity. Two variants are provided:

  • PSN-Full: Input is complete historical state of all agents x0:Kx_{0:K}
  • PSN-Partial: Input is partial observations {h(xk)}k=0K\{h(x_k)\}_{k=0}^K (e.g., position information only)

Network Structure:

  • GRU encoder (hidden dimension 64) processes K-step sequences for each agent
  • MLP layers: 256→128→32 (ReLU activation, dropout=0.3)
  • Sigmoid output layer produces continuous masks mji[0,1]m_j^i \in [0,1]

2. Masked Nash Game

Define player selection mask Mi=(mji){0,1}N1M^i = (m_j^i) \in \{0,1\}^{N-1}, where:

1, & \text{agent j is included in the game} \\ 0, & \text{agent j is excluded} \end{cases}$$ The masked game $\Gamma^i(\tilde{x}_0, \tilde{f}; \theta, M^i)$ retains only agent parameters and states most relevant to agent i. #### 3. Goal Inference Network (GIN) GIN is a data-driven network that infers agent objectives $p_g^i$ from partial trajectory observations: - Input: Historical trajectory $\{h(x_k)\}_{k=0}^K$ - Output: 2D goal position $p_g^i$ - Loss function: Mean squared error $L_{Goal} = \frac{1}{|D| \cdot N}\sum_{d \in D}\sum_{i \in [N]} \|p_{g,ref}^i - G_\phi(x_{0:K})\|$ ### Loss Function Design PSN training employs a composite loss function: **Prediction Task**: $$L_{Pred} = L_{Binary} + \sigma_1 L_{Sparsity} + \sigma_2 L_{Similarity}$$ **Planning Task**: $$L_{Plan} = L_{Binary} + \sigma_3 L_{Sparsity} + \sigma_4 L_{Cost}$$ Where: - $L_{Binary} = \frac{1}{N}\sum_{j=1}^N m_j^i(1-m_j^i)$: Encourages masks to converge to binary values - $L_{Sparsity} = \frac{\|M^i\|_1}{N}$: Encourages selection of fewer agents - $L_{Similarity}$: Encourages selection of agent subsets that recover observed trajectories - $L_{Cost}$: Encourages selection of appropriate agents to minimize game cost ### Receding Horizon Implementation The algorithm at each time step $k$: 1. Infers agent objectives through GIN 2. Obtains adaptive mask $M_k^i$ using PSN 3. Solves masked game to obtain OLNE policy 4. Executes first-step control input and updates state ## Experimental Setup ### Datasets **Simulation Scenarios**: - 4-agent scenario: 5m×5m environment - 10-agent scenario: 7m×7m environment - 20-agent scenario: 7m×7m environment (scalability testing) **Real Data**: - CITR pedestrian dataset: 7.5m×25.5m environment, 10 pedestrians **Game Settings**: - State space: 4-dimensional (position + velocity) - Dynamics: Double integrator model - Cost function: Includes goal tracking, velocity penalty, control cost, and collision avoidance terms ### Evaluation Metrics **Prediction Task**: - ADE (Average Displacement Error) - FDE (Final Displacement Error) - Consistency: Selection consistency **Planning Task**: - Navigation Cost - Collision Cost - Control Cost - Minimum Distance - Trajectory Smoothness - Trajectory Length ### Comparison Methods 1. **PSN Variants**: PSN-Threshold, PSN-Rank 2. **Distance Methods**: Distance, kNNs 3. **Gradient Methods**: Gradient, Hessian, Cost Evolution 4. **Constraint Methods**: BF (Barrier Function), CBF (Control Barrier Function) ## Experimental Results ### Main Results #### Prediction Performance (Table 2) In 4-agent scenario: - PSN-Full ADE: 0.1834m (optimal) - PSN-Partial ADE: 0.1876m - Best baseline (Cost Evolution) ADE: 0.1861m In 10-agent scenario: - PSN-Partial ADE: 0.2213m (optimal) - PSN-Full ADE: 0.2314m - Best baseline (kNNs) ADE: 0.2343m #### Planning Performance (Table 3) PSN demonstrates outstanding performance on safety metrics: - **Collision Cost**: Achieves optimality in both 4 and 10-agent scenarios - **Minimum Distance**: Ranks in top three in both scenarios - Safety performance degradation compared to complete game is minimal ### Incomplete Information Game Adaptability In scenarios with unknown objectives (Tables 4-5), PSN combined with GIN: - Prediction metrics maintain top-two ranking - Planning metrics rank in top three - Safety metrics remain optimal ### Scalability Verification PSN trained on 10-agent scenario applied to 20-agent scenario: - PSN-Full ADE: 0.3108m (optimal) - PSN-Partial ADE: 0.3152m (second) - Adapts to larger-scale scenarios without retraining ### Real Data Generalization On CITR pedestrian dataset: - PSN-Partial ADE: 0.4931m (optimal) - PSN-Full ADE: 0.4966m - Even outperforms solving the complete game ## Related Work ### Game-Theoretic Planning Existing methods primarily focus on small-scale environments (≤5 agents), using Newton-style solvers for iterative solution. Complexity grows cubically with player count, becoming a fundamental barrier for large-scale real-time systems. ### Player Selection Methods **Threshold Methods**: Select agents based on predefined distance thresholds, requiring extensive parameter tuning. **Ranking Methods**: Select top-k most influential agents, but fixed-count selection may be overly aggressive or conservative. **Advantages of This Work**: 1. Requires only historical trajectory observations, no privileged information 2. No online parameter tuning needed 3. Adaptively selects number of agents ## Conclusions and Discussion ### Main Conclusions 1. PSN Game successfully reduces game scale by 50%-75%, achieving significant computational acceleration 2. Outperforms existing baseline methods in prediction accuracy and planning safety 3. Demonstrates good generalization capability, adapting to larger-scale scenarios and real data ### Limitations 1. **Continuous Mask Approximation**: Uses continuous masks to support backpropagation; requires thresholding conversion at runtime 2. **Training Data Dependency**: Requires training data generated using complete games 3. **Dynamics Model Assumptions**: Experiments primarily based on double integrator model ### Future Directions 1. Extend to more complex dynamics models 2. Investigate other types of game parameter inference 3. Explore end-to-end training strategies ## In-Depth Evaluation ### Strengths 1. **Problem Importance**: Addresses core computational bottleneck in game-theoretic planning 2. **Methodological Innovation**: First to combine deep learning with game theory for player selection 3. **Experimental Comprehensiveness**: Covers simulations and real data, validating multiple hypotheses 4. **Practical Value**: Provides general mechanism directly integrable into existing frameworks ### Weaknesses 1. **Lack of Theoretical Analysis**: No theoretical guarantees on convergence or approximation error 2. **Computational Overhead**: While reducing game-solving time, adds network inference overhead 3. **Scenario Limitations**: Primarily validated on navigation tasks; applicability to other game types unknown ### Impact 1. **Academic Contribution**: Provides new solution paradigm for large-scale multi-agent systems 2. **Practical Applications**: Direct applicability in autonomous driving, robot swarms, and related domains 3. **Reproducibility**: Provides detailed implementation details and parameter settings ### Applicable Scenarios 1. **Dense Multi-Agent Environments**: Such as urban traffic and crowd navigation 2. **High Real-Time Requirement Systems**: Dynamic environments requiring frequent replanning 3. **Partially Observable Scenarios**: Incomplete information games with unknown agent intentions ## References The paper cites 35 relevant references, primarily covering: - Game-theoretic planning methods [8, 28, 9, 15] - Multi-agent interaction modeling [17, 20, 21, 24] - Player selection heuristics [2, 27, 30] - Attention mechanisms and neighbor selection [3, 6, 32] --- This paper makes significant contributions to addressing computational complexity in game-theoretic planning. Through learning-driven player selection, it substantially improves computational efficiency in multi-agent systems, paving the way for large-scale real-time applications.