2025-11-10T02:38:03.962319

Substitution Without Copy and Paste

Altenkirch, Burke, Wadler

Defining substitution for a language with binders like the simply typed $Î»$-calculus requires repetition, defining substitution and renaming separately. To verify the categorical properties of this calculus, we must repeat the same argument many times. We present a lightweight method that avoids repetition and that gives rise to a simply typed category with families (CwF) isomorphic to the initial simply typed CwF. Our paper is a literate Agda script.

academic

Substitution Without Copy and Paste

Basic Information

Paper ID: 2510.12304
Title: Substitution Without Copy and Paste
Authors: Thorsten Altenkirch (University of Nottingham), Nathaniel Burke (Imperial College London), Philip Wadler (University of Edinburgh and Input Output)
Classification: cs.LO (Logic in Computer Science)
Conference: LFMTP 2025 (International Workshop on Logical Frameworks and Meta-Languages: Theory and Practice)
Paper Link: https://arxiv.org/abs/2510.12304

Abstract

When defining substitution operations for languages with binders (such as simply-typed λ-calculus), it is customary to separately define substitution and renaming operations, resulting in substantial code duplication. To verify the categorical properties of the calculus, identical arguments must be repeated multiple times. This paper proposes a lightweight approach to avoid such duplication and constructs a simply-typed Contextual Families (CwF) isomorphic to the initial simply-typed CwF family. The paper is presented as an Agda literate programming script.

Research Background and Motivation

Core Problem

In mechanized proof development, defining substitution operations for languages with binders using traditional methods requires:

Separate definitions of variable renaming (∆ ⊩v Γ) and term substitution (∆ ⊩ Γ)
Repeated implementation of four distinct substitution operations: variable-to-variable, variable-to-term, term-to-variable, and term-to-term
Repeated proofs of all combinations of functor laws, resulting in 8 distinct proof cases

Research Motivation

Software engineering principles: Avoiding copy-paste code, which is particularly important in formal proof development
Theoretical significance: Providing a foundation for substitution definitions in dependent type theory
Practical applications: Interpreting consistency issues in dependent types within higher-order categories

Limitations of Existing Approaches

Code duplication: Similar operations must be defined separately for variables and terms
Proof duplication: Categorical law proofs must cover all combinations, leading to substantial repeated arguments
Maintenance difficulty: Modifications in one place require synchronized updates across multiple similar definitions

Core Contributions

Unified framework: Proposes a unified substitution operation based on Sort parameterization, merging variable and term handling into a single definition
Termination guarantee: Cleverly leverages Agda's structural recursion and lexicographic termination checking to ensure well-foundedness of definitions
Categorical verification: Proves that recursive substitution definitions satisfy all laws of simply-typed CwF
Initiality result: Establishes an isomorphism between recursive substitution syntax and the initial CwF
Normalization theorem: Substitution normal form of λ-terms corresponds to λ-terms without explicit substitutions

Methodology Details

Task Definition

Construct a unified substitution system such that:

Input: Any combination of terms/variables and substitutions/renamings
Output: Corresponding substitution results of appropriate types
Constraints: Maintain type safety and termination

Core Technique: Sort System

Sort Type Definition

data Sort : Set where
  V : Sort
  T>V : (s : Sort) → IsV s → Sort

data IsV : Sort → Set where
  isV : IsV V

pattern T = T>V V isV

This definition cleverly makes V structurally smaller than T, satisfying Agda's termination checking requirements.

Unified Term and Substitution Definition

data _ ⊢ [_]_ : Con → Sort → Ty → Set where
  zero : Γ ▷ A ⊢ [ V ] A
  suc : Γ ⊢ [ V ] A → (B : Ty) → Γ ▷ B ⊢ [ V ] A
  `_ : Γ ⊢ [ V ] A → Γ ⊢ [ T ] A
  _ ·_ : Γ ⊢ [ T ] A ⇒ B → Γ ⊢ [ T ] A → Γ ⊢ [ T ] B
  λ_ : Γ ▷ A ⊢ [ T ] B → Γ ⊢ [ T ] A ⇒ B

Where Γ ⊢ [ V ] A corresponds to variables and Γ ⊢ [ T ] A corresponds to terms.

Lattice Structure on Sort

data _ ⊑ _ : Sort → Sort → Set where
  rfl : s ⊑ s
  v⊑t : V ⊑ T

_⊔_ : Sort → Sort → Sort
V ⊔ r = r
T ⊔ r = T

Unified Substitution Operation

Core Substitution Function

_[_] : Γ ⊢ [ q ] A → ∆ ⊩ [ r ] Γ → ∆ ⊢ [ q ⊔ r ] A

Key insight: The result sort is the least upper bound of input sorts, ensuring that the result is a variable only when both inputs are variables/renamings.

Termination Handling

Resolves termination issues through Sort-polymorphic identity functions:

id-poly : Γ ⊩ [ q ] Γ
id : Γ ⊩ [ V ] Γ
id = id-poly {q = V}

Technical Innovations

Structured recursion: Leverages Sort's structural ordering and lexicographic measures to ensure termination
Parametric polymorphism: Unifies handling of different variable and term cases through Sort parameterization
Lattice theory application: Elegantly handles type lifting using the ⊔ operation
Rewrite rules: Utilizes Agda's REWRITE functionality to simplify equational reasoning

Theoretical Verification

Categorical Law Proofs

Identity Law

[id] : x [ id ] ≡ x

Proved by structural induction, with the key being a naturality lemma in the variable case.

Associativity Law

[◦] : x [ xs ◦ ys ] ≡ x [ xs ] [ ys ]

Requires mutually recursive proof with the left identity law, reflecting the intrinsic connection of categorical structure.

CwF Structure Verification

The paper proves that recursive substitution syntax satisfies all axioms of simply-typed CwF:

Categorical structure: Contexts and substitutions form a category
Presheaf structure: Each type corresponds to a presheaf
Terminal object: Empty context
Context extension: Structure analogous to categorical products

Initiality Theorem

Establishes bidirectional mappings:

Normalization norm : Γ ⊢I A → Γ ⊢ [ T ] A
Embedding ⌜_⌝ : Γ ⊢ [ q ] A → Γ ⊢I A

And proves they are mutual inverses:

Stability norm ⌜ t ⌝ ≡ t
Completeness ⌜ norm t ⌝ ≡ t

Implementation Details

Agda Features Utilized

Inductive-inductive types: Mutual definition of Sort and IsV
Lexicographic termination: Supports complex recursive patterns
Rewrite rules: Automates equational reasoning
Pattern synonyms: Simplifies usage of complex types

Termination Analysis

Proves termination through call graph analysis, with each function's measure:

_[_]: (r, t)
id: (r, Γ)
_+_: (r, σ)
suc[_]: (q)

In all cycles, either Sort strictly decreases or Sort is preserved while other parameters decrease.

Distinctions from Existing Approaches

Kit method 18,5: Abstracts common patterns of renaming and substitution through kits, but still requires repeated proofs
Monadic perspective 6,9: Encodes substitution as functions from variables to terms, but difficult to extend to dependent types
Automation tools 21,22: Autosubst library automatically generates substitution lemmas, but underlying duplication remains

Advantages Analysis

Simplicity: More straightforward in languages supporting lexicographic recursion
Uniformity: Single definition covers all cases
Extensibility: Provides foundation for handling dependent types

Conclusions and Discussion

Main Conclusions

Methodological contribution: Demonstrates that Sort parameterization elegantly unifies substitution operations
Theoretical contribution: Establishes initiality of recursive substitution syntax
Practical contribution: Provides concrete technical solutions for avoiding duplication

Limitations

Agda-dependent: Requires lexicographic termination checking support
Complexity shift: While avoiding duplication, increases complexity of the Sort system
Extension challenges: Extension to dependent types requires further research

Future Directions

Dependent type extension: Apply the method to full dependent type theory
Higher-order coherence: Applications within higher-order categories
Other proof assistants: Porting to Lean, Coq, and other systems

In-Depth Evaluation

Strengths

Technical innovation: The Sort system design cleverly resolves both termination and unification issues
Theoretical completeness: Complete theoretical development from basic definitions to initiality
Practical value: Provides practical solutions to common problems in formal verification
Clear presentation: As a literate programming script, code and explanation are well-integrated

Weaknesses

Platform dependency: Heavily dependent on Agda-specific features, limiting portability
Complexity trade-off: While avoiding duplication, introduces new conceptual complexity
Extensibility uncertainty: Extension to more complex type systems remains to be verified

Impact

Theoretical contribution: Provides new perspectives for mechanization of type theory
Practical guidance: Offers useful tools for formal verification practitioners
Research inspiration: Lays foundation for further research in dependent type theory

Applicable Scenarios

Formal verification: Language definitions requiring binder handling
Type theory research: Study of CwF and initial algebras
Programming language theory: Mechanization of λ-calculus and its extensions

References

The paper cites important works in the field, including:

De Bruijn's foundational work 12
McBride's kit method 18
Allais et al.'s type-safe approach 5
Autosubst series 21,22
Related research on relative monads 6

These citations reflect the authors' deep understanding of field development and thorough investigation of existing work.