2025-11-10T02:38:03.962319

Substitution Without Copy and Paste

Altenkirch, Burke, Wadler
Defining substitution for a language with binders like the simply typed $λ$-calculus requires repetition, defining substitution and renaming separately. To verify the categorical properties of this calculus, we must repeat the same argument many times. We present a lightweight method that avoids repetition and that gives rise to a simply typed category with families (CwF) isomorphic to the initial simply typed CwF. Our paper is a literate Agda script.
academic

Substitution Without Copy and Paste

Basic Information

  • Paper ID: 2510.12304
  • Title: Substitution Without Copy and Paste
  • Authors: Thorsten Altenkirch (University of Nottingham), Nathaniel Burke (Imperial College London), Philip Wadler (University of Edinburgh and Input Output)
  • Classification: cs.LO (Logic in Computer Science)
  • Conference: LFMTP 2025 (International Workshop on Logical Frameworks and Meta-Languages: Theory and Practice)
  • Paper Link: https://arxiv.org/abs/2510.12304

Abstract

When defining substitution operations for languages with binders (such as simply-typed λ-calculus), it is customary to separately define substitution and renaming operations, resulting in substantial code duplication. To verify the categorical properties of the calculus, identical arguments must be repeated multiple times. This paper proposes a lightweight approach to avoid such duplication and constructs a simply-typed Contextual Families (CwF) isomorphic to the initial simply-typed CwF family. The paper is presented as an Agda literate programming script.

Research Background and Motivation

Core Problem

In mechanized proof development, defining substitution operations for languages with binders using traditional methods requires:

  1. Separate definitions of variable renaming (∆ ⊩v Γ) and term substitution (∆ ⊩ Γ)
  2. Repeated implementation of four distinct substitution operations: variable-to-variable, variable-to-term, term-to-variable, and term-to-term
  3. Repeated proofs of all combinations of functor laws, resulting in 8 distinct proof cases

Research Motivation

  1. Software engineering principles: Avoiding copy-paste code, which is particularly important in formal proof development
  2. Theoretical significance: Providing a foundation for substitution definitions in dependent type theory
  3. Practical applications: Interpreting consistency issues in dependent types within higher-order categories

Limitations of Existing Approaches

  • Code duplication: Similar operations must be defined separately for variables and terms
  • Proof duplication: Categorical law proofs must cover all combinations, leading to substantial repeated arguments
  • Maintenance difficulty: Modifications in one place require synchronized updates across multiple similar definitions

Core Contributions

  1. Unified framework: Proposes a unified substitution operation based on Sort parameterization, merging variable and term handling into a single definition
  2. Termination guarantee: Cleverly leverages Agda's structural recursion and lexicographic termination checking to ensure well-foundedness of definitions
  3. Categorical verification: Proves that recursive substitution definitions satisfy all laws of simply-typed CwF
  4. Initiality result: Establishes an isomorphism between recursive substitution syntax and the initial CwF
  5. Normalization theorem: Substitution normal form of λ-terms corresponds to λ-terms without explicit substitutions

Methodology Details

Task Definition

Construct a unified substitution system such that:

  • Input: Any combination of terms/variables and substitutions/renamings
  • Output: Corresponding substitution results of appropriate types
  • Constraints: Maintain type safety and termination

Core Technique: Sort System

Sort Type Definition

data Sort : Set where
  V : Sort
  T>V : (s : Sort) → IsV s → Sort

data IsV : Sort → Set where
  isV : IsV V

pattern T = T>V V isV

This definition cleverly makes V structurally smaller than T, satisfying Agda's termination checking requirements.

Unified Term and Substitution Definition

data _ ⊢ [_]_ : Con → Sort → Ty → Set where
  zero : Γ ▷ A ⊢ [ V ] A
  suc : Γ ⊢ [ V ] A → (B : Ty) → Γ ▷ B ⊢ [ V ] A
  `_ : Γ ⊢ [ V ] A → Γ ⊢ [ T ] A
  _ ·_ : Γ ⊢ [ T ] A ⇒ B → Γ ⊢ [ T ] A → Γ ⊢ [ T ] B
  λ_ : Γ ▷ A ⊢ [ T ] B → Γ ⊢ [ T ] A ⇒ B

Where Γ ⊢ [ V ] A corresponds to variables and Γ ⊢ [ T ] A corresponds to terms.

Lattice Structure on Sort

data _ ⊑ _ : Sort → Sort → Set where
  rfl : s ⊑ s
  v⊑t : V ⊑ T

_⊔_ : Sort → Sort → Sort
V ⊔ r = r
T ⊔ r = T

Unified Substitution Operation

Core Substitution Function

_[_] : Γ ⊢ [ q ] A → ∆ ⊩ [ r ] Γ → ∆ ⊢ [ q ⊔ r ] A

Key insight: The result sort is the least upper bound of input sorts, ensuring that the result is a variable only when both inputs are variables/renamings.

Termination Handling

Resolves termination issues through Sort-polymorphic identity functions:

id-poly : Γ ⊩ [ q ] Γ
id : Γ ⊩ [ V ] Γ
id = id-poly {q = V}

Technical Innovations

  1. Structured recursion: Leverages Sort's structural ordering and lexicographic measures to ensure termination
  2. Parametric polymorphism: Unifies handling of different variable and term cases through Sort parameterization
  3. Lattice theory application: Elegantly handles type lifting using the ⊔ operation
  4. Rewrite rules: Utilizes Agda's REWRITE functionality to simplify equational reasoning

Theoretical Verification

Categorical Law Proofs

Identity Law

[id] : x [ id ] ≡ x

Proved by structural induction, with the key being a naturality lemma in the variable case.

Associativity Law

[◦] : x [ xs ◦ ys ] ≡ x [ xs ] [ ys ]

Requires mutually recursive proof with the left identity law, reflecting the intrinsic connection of categorical structure.

CwF Structure Verification

The paper proves that recursive substitution syntax satisfies all axioms of simply-typed CwF:

  • Categorical structure: Contexts and substitutions form a category
  • Presheaf structure: Each type corresponds to a presheaf
  • Terminal object: Empty context
  • Context extension: Structure analogous to categorical products

Initiality Theorem

Establishes bidirectional mappings:

  1. Normalization norm : Γ ⊢I A → Γ ⊢ [ T ] A
  2. Embedding ⌜_⌝ : Γ ⊢ [ q ] A → Γ ⊢I A

And proves they are mutual inverses:

  • Stability norm ⌜ t ⌝ ≡ t
  • Completeness ⌜ norm t ⌝ ≡ t

Implementation Details

Agda Features Utilized

  1. Inductive-inductive types: Mutual definition of Sort and IsV
  2. Lexicographic termination: Supports complex recursive patterns
  3. Rewrite rules: Automates equational reasoning
  4. Pattern synonyms: Simplifies usage of complex types

Termination Analysis

Proves termination through call graph analysis, with each function's measure:

  • _[_]: (r, t)
  • id: (r, Γ)
  • _+_: (r, σ)
  • suc[_]: (q)

In all cycles, either Sort strictly decreases or Sort is preserved while other parameters decrease.

Distinctions from Existing Approaches

  1. Kit method 18,5: Abstracts common patterns of renaming and substitution through kits, but still requires repeated proofs
  2. Monadic perspective 6,9: Encodes substitution as functions from variables to terms, but difficult to extend to dependent types
  3. Automation tools 21,22: Autosubst library automatically generates substitution lemmas, but underlying duplication remains

Advantages Analysis

  1. Simplicity: More straightforward in languages supporting lexicographic recursion
  2. Uniformity: Single definition covers all cases
  3. Extensibility: Provides foundation for handling dependent types

Conclusions and Discussion

Main Conclusions

  1. Methodological contribution: Demonstrates that Sort parameterization elegantly unifies substitution operations
  2. Theoretical contribution: Establishes initiality of recursive substitution syntax
  3. Practical contribution: Provides concrete technical solutions for avoiding duplication

Limitations

  1. Agda-dependent: Requires lexicographic termination checking support
  2. Complexity shift: While avoiding duplication, increases complexity of the Sort system
  3. Extension challenges: Extension to dependent types requires further research

Future Directions

  1. Dependent type extension: Apply the method to full dependent type theory
  2. Higher-order coherence: Applications within higher-order categories
  3. Other proof assistants: Porting to Lean, Coq, and other systems

In-Depth Evaluation

Strengths

  1. Technical innovation: The Sort system design cleverly resolves both termination and unification issues
  2. Theoretical completeness: Complete theoretical development from basic definitions to initiality
  3. Practical value: Provides practical solutions to common problems in formal verification
  4. Clear presentation: As a literate programming script, code and explanation are well-integrated

Weaknesses

  1. Platform dependency: Heavily dependent on Agda-specific features, limiting portability
  2. Complexity trade-off: While avoiding duplication, introduces new conceptual complexity
  3. Extensibility uncertainty: Extension to more complex type systems remains to be verified

Impact

  1. Theoretical contribution: Provides new perspectives for mechanization of type theory
  2. Practical guidance: Offers useful tools for formal verification practitioners
  3. Research inspiration: Lays foundation for further research in dependent type theory

Applicable Scenarios

  1. Formal verification: Language definitions requiring binder handling
  2. Type theory research: Study of CwF and initial algebras
  3. Programming language theory: Mechanization of λ-calculus and its extensions

References

The paper cites important works in the field, including:

  • De Bruijn's foundational work 12
  • McBride's kit method 18
  • Allais et al.'s type-safe approach 5
  • Autosubst series 21,22
  • Related research on relative monads 6

These citations reflect the authors' deep understanding of field development and thorough investigation of existing work.