2025-11-15T12:40:11.869613

Formally Verified Certification of Unsolvability of Temporal Planning Problems

Wang, Abdulaziz

We present an approach to unsolvability certification of temporal planning. Our approach is based on encoding the planning problem into a network of timed automata, and then using an efficient model checker on the network followed by a certificate checker to certify the output of the model checker. Our approach prioritises trustworthiness of the certification: we formally verify our implementation of the encoding to timed automata using the theorem prover Isabelle/HOL and we use an existing certificate checker (also formally verified in Isabelle/HOL) to certify the model checking result.

academic

Formally Verified Certification of Unsolvability of Temporal Planning Problems

Basic Information

Paper ID: 2510.10189
Title: Formally Verified Certification of Unsolvability of Temporal Planning Problems
Authors: David Wang, Mohammad Abdulaziz (King's College London, United Kingdom)
Classification: cs.LO (Logic in Computer Science), cs.AI (Artificial Intelligence)
Publication Date: October 11, 2025 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2510.10189

Abstract

This paper proposes a method for certifying the unsolvability of temporal planning problems. The approach is based on encoding planning problems as networks of timed automata, then employing an efficient model checker on the network, followed by a certificate checker to authenticate the model checker's output. The method prioritizes the trustworthiness of certification: it uses the theorem prover Isabelle/HOL to formally verify the implementation of timed automata encoding, and employs an existing certificate checker (also formally verified in Isabelle/HOL) to authenticate model checking results.

Research Background and Motivation

Core Problem

The core problem addressed by this research is certification of unsolvability for temporal planning problems. Temporal planning is a richer form of planning that allows actions to have durations and concurrent execution, making it significantly more complex than classical planning.

Problem Significance

Trustworthiness Requirements: Existing planning systems are extremely complex at both algorithmic and implementation levels, making it crucial to enhance the credibility of their outputs
Certification Difficulty: Unlike solvable problems (which can be verified by executing a plan), claims of unsolvability or optimality are difficult to obtain compact certificates for
Exponential Complexity: In the worst case, such certificates may grow exponentially relative to the size of the planning task

Limitations of Existing Approaches

Lack of Formal Guarantees: Existing methods for detecting temporal planning unsolvability lack formal correctness guarantees
High Complexity: State-of-the-art temporal planning algorithms are technically complex, making it difficult to design practical unsolvability certification schemes
Verification Gap: Compared to existing formal verification work in classical planning, temporal planning lacks such verification efforts

Research Motivation

This paper adopts an encoding transformation approach: it encodes temporal planning problems into other computational problems with practical certification algorithms (timed automata), and formally verifies the entire encoding process and certificate checker using a theorem prover to ensure certification trustworthiness.

Core Contributions

Formally Verified Encoding Method: First formal verification of Heinz et al.'s encoding from temporal planning to timed automata
Engineering Implementation: Adaptation of the encoding to produce timed automata formats compatible with the Wimmer and von Mutius system
Simplified Semantic Design: Design of simpler temporal planning semantics than existing work, facilitating mathematical reasoning, with proof of equivalence to existing semantics
Complete Certification Framework: Construction of a complete trustworthy certification chain from temporal planning problems to timed automata model checking
Theoretical Correctness Guarantees: Proof of encoding correctness in Isabelle/HOL, providing strong theoretical assurance

Methodology Details

Task Definition

Input: Temporal planning problem P = ⟨P, A, I, G⟩

P: Set of propositions
A: Set of durative actions
I: Initial state
G: Goal conditions

Output: Formal certificate of unsolvability (if the problem is indeed unsolvable)

Constraints:

Actions have start and end snapshot actions
Support for invariant conditions and scheduling constraints
Satisfaction of mutual exclusion constraints and duration bounds

Model Architecture

1. Temporal Planning Formalization

The paper first defines formal semantics for temporal planning:

Snapshot Actions (Definition 1):

h ≡ ⟨pre(h), adds(h), dels(h)⟩

Durative Actions (Definition 2):

a ≡ ⟨a⊢, a⊣, over_all(a), L(a), U(a)⟩

where a⊢ and a⊣ are the start and end snapshot actions respectively.

2. Timed Automata Encoding

Variable Design (Definition 24):

For each proposition p: binary variable vp (encoding truth value) and lock counter lp (recording active actions requiring p to be true)
aa: Total number of active actions
ps: Planning state (0=not started, 1=planning, 2=completed)

Clock Design (Definition 25):

Each action a is allocated two clocks: ca⊢ (recording time after start) and ca⊣ (recording time after end)

Main Automaton (Definition 26): Controls state transitions throughout the planning process, including initialization and goal checking.

Action Automata (Definition 27): Each action corresponds to an automaton containing the following key transitions:

sea: Application of start action effects
se'a: Locking of invariant conditions
eea: Preparation before action end
ee'a: Application of end action effects
iea: Handling of instantaneous actions

3. Network Construction

Combination of the main automaton and all action automata into a timed automata network (Definition 28), with initial configuration setting all automata to inactive state.

Technical Innovations

Concurrent Execution Support: Unlike Heinz et al.'s use of global locks, this paper uses clock constraints to allow concurrent snapshot action execution
Instantaneous Action Handling: Support for zero-duration instantaneous actions through addition of iea transitions
Formal Verification: First to provide machine-checked correctness proofs for such encodings
Semantic Simplification: Design of temporal planning semantics more suitable for formal reasoning

Experimental Setup

Formal Verification Environment

Theorem Prover: Isabelle/HOL
Certificate Checker: Wimmer and von Mutius's verified certificate checker
Target Property: Reachability checking A ⊨ EF(loc(AM) = goal)

Code Scale Statistics

Component	Lines of Code
Abstract temporal planning semantics formalization	~7,200
Timed automata network definition using Munta	~800
Proof of Theorem 1 and related lemmas	~8,500
List-related lemmas	~1,500
Total	~18,000

Comparison Benchmarks

Scale comparison with related formal verification work:

Verified SAT-based planner: ~17,500 lines
Verified classical planning verifier: ~3,000 lines
Verified temporal PDDL planning verifier: ~6,500 lines

Experimental Results

Main Theoretical Results

Theorem 1 (Main Correctness Theorem): If plan π is valid and non-self-overlapping (valid(π) ∧ no_self_overlap(π)), then the assertion A ⊨ EF(loc(AM) = goal) holds.

Proof Structure:

Lemma 1: Construction of runs simulating parallel plans
Lemma 2: Reachability from initial configuration to encoded state
Lemma 3: Transition from final state to goal configuration

Formal Verification Achievements

Completeness Proof: Successfully proved the completeness of the encoding, i.e., every valid plan can find a valid run in the corresponding timed automata network
Machine Checking: All proofs pass machine checking in Isabelle/HOL, providing the highest level of trustworthiness guarantee
Modular Design: Proof structure is modular, facilitating understanding and maintenance

Engineering Implementation Verification

The encoding implementation has been adapted to be compatible with existing verified certificate checkers, forming a complete executable certification chain.

Classical Planning Certification

Eriksson et al. designed compact unsolvability certification schemes for classical planning
Abdulaziz and Lammich provided formally verified verifiers for classical planning

Temporal Planning and Model Checking

Dierks et al. implemented static reduction from PDDL subsets to timed automata
Heinz et al. defined explicit encoding from temporal planning problems to timed automata
Gigante et al. studied the complexity of temporal planning at the theoretical level

Formal Verification Methods

Abdulaziz and Kurz used similar methods to develop certified SAT-based planning systems
Kumar et al. and Leroy employed stepwise verification of encoding methods in compiler verification

Conclusions and Discussion

Main Conclusions

Feasibility Verification: Demonstrates that formal certification of temporal planning unsolvability is feasible
Theoretical Guarantees: Provides strong correctness guarantees through theorem provers
Engineering Implementation: Successfully constructs a complete certification tool chain

Limitations

Input Restrictions: Currently only accepts instantiated temporal planning problems as input
Semantic Subset: Supported semantics are a subset of verified planning verifiers
Executability: Certification mechanisms are not yet fully executable

Future Directions

Equivalence Proofs: Plan to formally prove equivalence between the paper's semantics and existing verifier semantics
Executable Implementation: Develop executable certificate checkers
Instantiation Verification: Formally verify instantiation algorithms, such as Helmert's datalog solver
Constraint Relaxation: Investigate possibilities for relaxing the non-self-overlap condition

In-Depth Evaluation

Strengths

Theoretical Rigor: First to provide machine-checked formal proofs for temporal planning unsolvability certification
Engineering Completeness: Not only provides theoretical framework but also implements integration with existing tools
Methodological Innovation:
- Encoding design supporting concurrent execution
- Elegant solution for handling instantaneous actions
- Simplified yet equivalent semantic design
Trustworthiness Guarantees: Provides the highest level of correctness assurance through theorem provers

Weaknesses

Practical Limitations:
- Requires pre-instantiated input
- Not yet fully executable
- Lacks performance evaluation
Coverage Scope: Only supports a subset of temporal planning, not full PDDL features
Scalability: The scale of 18,000 lines of formal verification work is substantial, with high maintenance costs

Impact

Academic Contribution:
- Fills the gap in formal verification of temporal planning
- Provides new theoretical foundation for unsolvability certification
- Demonstrates feasibility of formal verification for complex systems
Practical Value:
- Provides trustworthy planning system certification for critical applications
- Extensible to other planning forms and constraint satisfaction problems
- Provides reference for automated verification tool development
Reproducibility: Based on open-source Isabelle/HOL, theoretically fully reproducible

Applicable Scenarios

Critical Systems: Planning applications requiring high trustworthiness guarantees (e.g., aerospace, medical devices)
Research Tools: Provides formal verification infrastructure for temporal planning research
Educational Use: Serves as a case study for teaching formal methods and planning algorithms
Tool Development: Provides theoretical foundation for developing trustworthy planning tools

References

Key references include:

Abdulaziz & Koller (2022): Formal semantics and verifiers for temporal planning
Heinz et al. (2019): Encoding from temporal planning to timed automata
Wimmer & von Mutius (2020): Verified certificate checkers for timed automata
Abdulaziz & Kurz (2023): Verified SAT-based planning systems

This paper makes significant contributions to the field of formal verification of temporal planning, and while there is room for improvement in practical applicability, its theoretical rigor and methodological innovation provide a solid foundation for the development of this field.