Test-Time Alignment for Large Language Models via Textual Model Predictive Control
Wang, Chen, Hung et al.
Aligning Large Language Models (LLMs) with human preferences through finetuning is resource-intensive, motivating lightweight alternatives at test time. We address test-time alignment through the lens of sequential decision making, a perspective that reveals two fundamental challenges. When actions are defined at the token level, as in guided decoding, alignment suffers from the curse of horizon. Conversely, when actions are at the response level, as in traditional iterative refinement, the curse of dimensionality emerges. To resolve this trade-off, we draw inspiration from Model Predictive Control (MPC) in control theory to propose Textual Model Predictive Control (TMPC), a novel predictive planning framework adapted for aligning LLMs at inference time. A key limitation of standard MPC is its reliance on predefined, hard segment boundaries, which are often absent in text generation. TMPC overcomes this by introducing two principles inspired by hierarchical reinforcement learning: (1) Hindsight Subgoal Identification, where TMPC analyzes generation subgoals to retrospectively identify high-reward intermediate outputs as subgoals. This allows the framework to discover meaningful, task-specific planning steps (e.g., a sentence in machine translation or a bug fix in code generation). (2) Subgoal-Conditioned Re-Generation, where these identified subgoals are used to guide subsequent planning iterations. By conditioning on these proven, high-quality subgoals, TMPC ensures stable improvement by building upon previously validated successes. TMPC is evaluated on three tasks with distinct segmentation properties: discourse-level translation, long-form response generation, and program synthesis. The results demonstrate that TMPC consistently improves performance, highlighting its generality.
Aligning large language models with human preferences typically requires fine-tuning, which is resource-intensive. This paper addresses test-time alignment from a sequential decision-making perspective, revealing two fundamental challenges: when actions are defined at the token level (e.g., guided decoding), alignment faces the "curse of horizon"; when actions are defined at the response level (e.g., traditional iterative refinement), it faces the "curse of dimensionality." To address this trade-off, the authors draw inspiration from Model Predictive Control (MPC) in control theory and propose Textual Model Predictive Control (TMPC), a novel predictive planning framework for inference-time LLM alignment.
Importance of Alignment: While large language models demonstrate excellent performance on various NLP tasks, aligning their outputs with human preferences remains a critical challenge, particularly for smaller-scale LLMs (e.g., under 10B parameters).
Test-time alignment methods face fundamental trade-offs:
Token-level guided decoding suffers from the "curse of horizon"
Response-level iterative optimization suffers from the "curse of dimensionality"
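The trade-off above can be made concrete with a rough count. The sketch below assumes a response of T tokens over a vocabulary of size V and treats one action as a block of k tokens, so planning takes ceil(T/k) steps and each step chooses among V^k candidates. The function name `decision_profile` and the values T = 512, V = 32000, k = 32 are illustrative assumptions, not figures from the paper.

```python
import math


def decision_profile(total_tokens: int, vocab_size: int, tokens_per_action: int):
    """Return (horizon, log10 of the per-step action-space size).

    horizon: number of sequential decisions needed to emit the response.
    log10 action space: tokens_per_action * log10(vocab_size), i.e. log10(V^k).
    """
    horizon = math.ceil(total_tokens / tokens_per_action)
    log10_actions = tokens_per_action * math.log10(vocab_size)
    return horizon, log10_actions


# Illustrative values only (assumptions, not from the paper).
T, V = 512, 32000

token_level = decision_profile(T, V, 1)      # long horizon, small action space
response_level = decision_profile(T, V, T)   # single step, huge action space
segment_level = decision_profile(T, V, 32)   # intermediate granularity

for name, (h, log_a) in [
    ("token", token_level),
    ("segment", segment_level),
    ("response", response_level),
]:
    print(f"{name:>8}: horizon={h:4d}, log10(actions per step)={log_a:.1f}")
```

Token-level actions push the difficulty into the horizon, response-level actions push it into the action space, and an intermediate granularity sits between the two on both axes.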
Research Motivation: There is a need for a test-time alignment method that avoids expensive model retraining while effectively balancing temporal and search space complexity.
Novel Problem Formulation: First to model test-time alignment as a sequential decision-making problem, unifying existing methods and revealing their fundamental trade-offs.
TMPC Framework: Proposes a Textual Model Predictive Control framework that adapts control-theoretic concepts to language generation tasks.
Two Core Principles:
Hindsight Subgoal Identification: Discovering meaningful planning steps from rollouts
Subgoal-Conditioned Re-Generation: Iterative refinement based on verified subgoals
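The two principles can be sketched as a toy planning loop. This is a minimal illustration under strong assumptions, not the authors' implementation: a "segment" is an integer standing in for a piece of text, `toy_generate` is a hypothetical generator that keeps conditioned subgoals and resamples every other position, `toy_reward` is a hypothetical segment-level reward, and the threshold rule for accepting subgoals is an assumption made for this example.

```python
import random
from typing import Dict, List

Segment = int  # toy stand-in for a text segment (e.g., a sentence)


def toy_generate(subgoals: Dict[int, Segment], n_segments: int,
                 rng: random.Random) -> List[Segment]:
    """Hypothetical generator: keep conditioned subgoals, sample the rest."""
    return [subgoals.get(i, rng.randint(0, 9)) for i in range(n_segments)]


def toy_reward(segment: Segment) -> float:
    """Hypothetical segment-level reward in [0, 1]."""
    return segment / 9.0


def tmpc_style_loop(n_segments: int = 5, n_iters: int = 4, n_rollouts: int = 3,
                    threshold: float = 0.6, seed: int = 0):
    rng = random.Random(seed)
    subgoals: Dict[int, Segment] = {}  # position -> validated segment
    best, best_score = None, float("-inf")
    history = []  # best total reward seen after each iteration

    for _ in range(n_iters):
        # Subgoal-conditioned re-generation: every rollout in this iteration
        # is conditioned on the subgoals found in earlier iterations.
        rollouts = [toy_generate(subgoals, n_segments, rng)
                    for _ in range(n_rollouts)]

        for rollout in rollouts:
            rewards = [toy_reward(s) for s in rollout]
            score = sum(rewards)
            if score > best_score:
                best, best_score = rollout, score

            # Hindsight subgoal identification: after the rollout is scored,
            # keep segments whose reward clears the threshold and beats the
            # subgoal currently stored for that position.
            for i, (seg, r) in enumerate(zip(rollout, rewards)):
                current = toy_reward(subgoals[i]) if i in subgoals else float("-inf")
                if r >= threshold and r > current:
                    subgoals[i] = seg

        history.append(best_score)

    return best, best_score, subgoals, history


best, best_score, subgoals, history = tmpc_style_loop()
print("best rollout:", best)
print("best total reward per iteration:", history)
print("subgoals (position -> segment):", subgoals)
```

Because later rollouts reuse segments that already scored well, the search concentrates on the positions that still lack a good segment, which is the intuition behind building on previously validated successes.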
Comprehensive Experimental Validation: Validates the method's effectiveness and generality across three tasks with different characteristics.
TMPC is the first to systematically apply Model Predictive Control to preference alignment in language generation, filling a gap in the intersection of control theory and NLP.
Unified Framework: Successfully unifies test-time alignment as a sequential decision-making problem, revealing fundamental trade-offs in existing methods
Effective Balance: TMPC effectively balances the curse of horizon and curse of dimensionality
Broad Applicability: Achieves consistent improvements across three tasks with different characteristics
Significant Theoretical Contribution: First systematic analysis of fundamental challenges in test-time alignment, providing a unified theoretical framework
Strong Method Innovation: Successfully adapts MPC to text generation with clear principles and elegant design
Comprehensive Experiments: Validation across three tasks with different characteristics, including detailed ablation studies and robustness analysis
High Practical Value: No retraining required, computationally efficient, easy to deploy
Reinforcement learning theory (hierarchical RL, trajectory optimization, etc.)
Summary: This is a high-quality paper that contributes both a theoretical framing and a practical method. The authors adapt the MPC framework from control theory to preference alignment in language generation, propose TMPC, and validate it on three tasks with distinct segmentation properties. The work opens a new research direction for test-time alignment.