Herb.jl: A Unifying Program Synthesis Library

Hinnerichs, Reid, de Jong et al.

Program synthesis -- the automatic generation of code given a specification -- is one of the most fundamental tasks in artificial intelligence (AI) and many programmers' dream. Numerous synthesizers have been developed to tackle program synthesis, manifesting different ideas to approach the exponentially growing program space. While numerous smart program synthesis tools exist, reusing and remixing previously developed methods is tedious and time-consuming. We propose Herb.jl, a unifying program synthesis library written in the Julia programming language, to address these issues. Since current methods rely on similar building blocks, we aim to modularize the underlying synthesis algorithm into communicating and fully extendable sub-compartments, allowing for straightforward reapplication of these modules. To demonstrate the benefits of using Herb.jl, we show three common use cases: 1. how to implement a simple problem and grammar, and how to solve it, 2. how to implement a previously developed synthesizer with just a few lines of code, and 3. how to run a synthesizer against a benchmark.

Basic Information

Paper ID: 2510.09726
Title: Herb.jl: A Unifying Program Synthesis Library
Authors: Tilman Hinnerichs, Reuben Gardos Reid, Jaap de Jong, Bart Swinkels, Pamela Wochner, Nicolae Filat, Tudor Magurescu, Issa Hanou, Sebastijan Dumancic (Technische Universiteit Delft)
Categories: cs.PL (Programming Languages), cs.AI (Artificial Intelligence), cs.SE (Software Engineering)
Published: Journal of Machine Learning Research 10 (2025) 1-48, Submitted 10/25
Paper link: https://arxiv.org/abs/2510.09726

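Domain specificity: synthesizer implementations are usually designed for one particular language and are hard to adapt to new grammars
Insufficient modularity: the same building blocks cannot easily be reused, so researchers end up re-implementing the same ideas
Difficult comparison: because engineering choices differ, comparing methods often degenerates into comparing implementation quality
Hard-to-reuse benchmarks: a benchmark's choice of grammar is often implicit, which undermines fair comparison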

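Research Motivation

Existing program synthesis methods perform well in their own domains but share the following limitations:

Implementations are overly specialized and not planned for reuse
There is no modular design that spans the different branches of program synthesis
Implicit assumptions and optimizations make methods hard to compare
Benchmark grammars are not defined in a uniform way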

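Core Contributions

Herb.jl: a new unifying program synthesis library written in Julia
Modular implementations: a demonstration of how existing synthesizers can be implemented easily with Herb.jl
Standardized benchmarks: standard benchmarks re-implemented in a human-readable and extensible format
Design principles: an outline of the guiding design principles behind Herb.jl, which can inform other synthesizer implementations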

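Method Details

Task Definition

A program synthesis problem is defined by two components:

Specification: describes the user's intent, usually expressed as input-output examples
Grammar: describes the target language and consists of context-free derivation rules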
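
To make the two components concrete, here is a minimal plain-Julia sketch. It does not use Herb.jl: the `Example` struct, the `grammar` vector of rule pairs, and the `satisfies` helper are hypothetical stand-ins chosen for illustration, and the four examples are illustrative data consistent with f(x) = 2x + 1.

```julia
# Hypothetical stand-in for an input-output example; not a Herb.jl type.
struct Example
    input::Dict{Symbol,Int}
    output::Int
end

# Specification: four input-output examples consistent with f(x) = 2x + 1.
spec = [Example(Dict(:x => x), 2x + 1) for x in 0:3]

# Grammar: context-free rules written as left-hand side => right-hand side.
grammar = [
    :Int => 1,
    :Int => 2,
    :Int => :x,
    :Int => :(Int + Int),
    :Int => :(Int * Int),
]

# A candidate satisfies the specification if it reproduces every expected output.
satisfies(candidate, spec) = all(candidate(ex.input[:x]) == ex.output for ex in spec)

satisfies(x -> 2 * x + 1, spec)  # true
satisfies(x -> x + 1, spec)      # false: fails on x = 1
```

A synthesizer searches the programs derivable from the grammar for one that satisfies the specification in this sense.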

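Architecture

Herb.jl uses a layered, modular architecture with the following components:

Core Modules

HerbCore.jl: defines the interfaces for grammars, programs, and constraints
HerbSpecification.jl: handles the definition of problem specifications
HerbGrammar.jl: defines the structure of program grammars
HerbInterpret.jl: handles program semantics and evaluation
HerbConstraints.jl: constraint formulation and propagation
HerbSearch.jl: search methods and enumeration techniques

Special Modules

Herb.jl: the umbrella package that wraps the modules above
HerbBenchmarks.jl: a collection of standard benchmarks
Garden.jl: a collection of implementations of known synthesizers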

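Technical Innovations

1. Separation of Syntax and Semantics

Herb.jl explicitly separates syntax from semantics:

Program enumeration is purely syntactic and proceeds by updating abstract syntax trees (ASTs)
Programs are converted into executable expressions to check them against the specification
Users only provide executable expressions and need no knowledge of the internal mechanics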
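
The separation can be sketched in plain Julia as follows. This is an illustration of the idea only, not Herb.jl's implementation: `Node`, `rules`, `to_expr`, and `run_program` are hypothetical names. The syntactic side manipulates trees of rule indices; the semantic side only ever sees an ordinary Julia expression.

```julia
# Hypothetical rule-indexed AST node; not Herb.jl's actual node type.
struct Node
    rule::Int                 # index into `rules`
    children::Vector{Node}
end
Node(rule::Int) = Node(rule, Node[])

# Right-hand sides of the grammar rules; the symbol :Int marks a position to fill.
rules = Any[1, 2, :x, :(Int + Int), :(Int * Int)]

# Syntax -> executable expression: substitute children for :Int, left to right.
function to_expr(node::Node, rules)
    children = copy(node.children)
    function substitute(rhs)
        if rhs === :Int
            return to_expr(popfirst!(children), rules)
        elseif rhs isa Expr
            return Expr(rhs.head, [substitute(arg) for arg in rhs.args]...)
        else
            return rhs
        end
    end
    return substitute(rules[node.rule])
end

# Semantics: bind the input variable and evaluate the expression.
function run_program(expr, x)
    return eval(quote
        let x = $x
            $expr
        end
    end)
end

# The tree for 2 * x + 1: rule 4 (+) over rule 5 (*) and rule 1 (the literal 1).
program = Node(4, [Node(5, [Node(2), Node(3)]), Node(1)])
expr = to_expr(program, rules)   # :(2 * x + 1)
run_program(expr, 3)             # 7
```

Because enumeration only touches `Node` values, the search code never needs to know what `+` or `*` mean; swapping in a different interpreter changes only the last step.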

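2. Uniform Tree Structure

Herb.jl introduces a custom data structure, the "uniform tree":

Operations of similar shape yield programs of similar shape
A uniform node describes a domain of operations with the same shape rather than a single operation
This significantly reduces memory usage and supports efficient constraint application and propagation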
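
A minimal sketch of the idea, under the assumption that a node stores the set of rule indices it may still take. `UniformNode` and `count_programs` are hypothetical names for illustration and do not reflect Herb.jl's actual data structure.

```julia
# Hypothetical sketch; not Herb.jl's actual uniform tree.
struct UniformNode
    domain::Vector{Int}             # rule indices this position may still take
    children::Vector{UniformNode}
end
UniformNode(domain::Vector{Int}) = UniformNode(domain, UniformNode[])

# Rules: 1 => 1, 2 => 2, 3 => x, 4 => Int + Int, 5 => Int * Int
# Rules 4 and 5 share a shape (two Int children); rules 1 to 3 share a shape (no children).
# One tree therefore stands for every program of the form (terminal op terminal).
tree = UniformNode([4, 5], [UniformNode([1, 2, 3]), UniformNode([1, 2, 3])])

# Number of concrete programs a tree represents: the product of its domain sizes.
count_programs(n::UniformNode) =
    length(n.domain) * mapreduce(count_programs, *, n.children; init = 1)

count_programs(tree)   # 2 * 3 * 3 = 18

# A constraint such as "rule 1 is forbidden in the left operand" becomes a single
# domain update instead of a filter over 18 separate trees.
filter!(!=(1), tree.children[1].domain)
count_programs(tree)   # 2 * 2 * 3 = 12
```

Three small domains stand in for 18 explicit trees here, which hints at where the memory savings and the cheap constraint propagation come from.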

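3. Enumeration Order

The program enumeration order is defined by two functions:

Priority function: defines the priority value of each element in the priority queue
Derivation heuristic: defines the order in which the domain of an element within a uniform tree is enumerated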
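
The interplay of the two functions can be sketched with a small best-first enumerator in plain Julia. Everything here is hypothetical and simplified: partial programs are plain expressions in which `:Int` marks a hole, a sorted vector stands in for a priority queue, and `enumerate_programs`, `size_of`, and `fill_leftmost` are illustrative names rather than Herb.jl functions.

```julia
rules = Any[1, 2, :x, :(Int + Int), :(Int * Int)]

# Helpers over partial programs, where the symbol :Int marks an unfilled hole.
has_hole(p) = p === :Int
has_hole(p::Expr) = any(has_hole, p.args)
size_of(p) = 1
size_of(p::Expr) = 1 + sum(size_of, p.args[2:end])   # operator node plus operands

# Replace the leftmost hole with `rhs`; returns (new program, whether a hole was found).
function fill_leftmost(p, rhs)
    p === :Int && return (rhs, true)
    p isa Expr || return (p, false)
    args = copy(p.args)
    for i in eachindex(args)
        new_arg, replaced = fill_leftmost(args[i], rhs)
        if replaced
            args[i] = new_arg
            return (Expr(p.head, args...), true)
        end
    end
    return (p, false)
end

# `priority` orders the queue; `heuristic` orders the rules tried for each hole.
function enumerate_programs(rules, priority, heuristic; limit)
    queue = Any[:Int]            # start from a single hole
    found = Any[]
    while !isempty(queue) && length(found) < limit
        sort!(queue; by = priority, alg = MergeSort)   # stable: ties keep insertion order
        program = popfirst!(queue)
        if !has_hole(program)
            push!(found, program)
            continue
        end
        for r in heuristic(eachindex(rules))
            child, _ = fill_leftmost(program, rules[r])
            push!(queue, child)
        end
    end
    return found
end

enumerate_programs(rules, size_of, identity; limit = 3)  # yields 1, 2, x
enumerate_programs(rules, size_of, reverse; limit = 3)   # yields x, 2, 1
```

With the same priority function, changing only the derivation heuristic reorders programs of equal priority; changing the priority function (for example to depth or to a learned cost) changes the search strategy as a whole.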

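Case 1: Defining and Solving a Simple Problem

using HerbGrammar, HerbSpecification, HerbSearch

# Define the input-output specification (the examples are consistent with 2x + 1)
problem = Problem([
    IOExample(Dict(:x => 0), 1),
    IOExample(Dict(:x => 1), 3),
    IOExample(Dict(:x => 2), 5),
    IOExample(Dict(:x => 3), 7)
])

# Define the grammar
grammar = @cfgrammar begin
    Int = 1 | 2 | x
    Int = Int + Int
    Int = Int * Int
end

# Run the search: breadth-first enumeration of programs up to depth 5
iterator = BFSIterator(grammar, :Int, max_depth=5)
solution, flag = synth(problem, iterator)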

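Case 2: Implementing the Probe Algorithm

This case shows how to re-implement the Probe synthesizer in a few lines of code:

Implement a most-likely-first search iterator (MLFSIterator)
Define the probability computation function
Implement the Probe loop logic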
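
As a rough illustration of the probability computation step, the sketch below assumes the update rule usually associated with Probe: each rule's initial probability is raised to the power (1 - fit), where fit is the best fraction of examples solved by any partial solution that uses the rule, and the results are renormalized. That rule is an assumption brought in for illustration and is not stated in this summary; `update_probabilities` is a hypothetical helper, not a function of Herb.jl or Garden.jl, and it treats the grammar as having a single nonterminal.

```julia
# Hypothetical helper; not part of Herb.jl or Garden.jl.
# `initial[r]`: starting probability of rule r (for example, uniform).
# `fit[r]`: best fraction of examples solved by any partial solution using rule r.
function update_probabilities(initial::Vector{Float64}, fit::Vector{Float64})
    weights = initial .^ (1 .- fit)   # a higher fit pushes the weight towards 1
    return weights ./ sum(weights)    # renormalize so the probabilities sum to 1
end

# Five rules, initially uniform; rules 3 and 5 appeared in programs that
# solved half of the examples.
initial = fill(0.2, 5)
fit = [0.0, 0.0, 0.5, 0.0, 0.5]
updated = update_probabilities(initial, fit)
```

After the update, rules 3 and 5 carry more probability mass than the others, so a most-likely-first iterator would reach programs containing them earlier in the next round of the loop.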

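Case 3: Running a Benchmark

using HerbBenchmarks

# Load every (problem, grammar) pair of the PBE SLIA 2019 track.
# The variable is not called `pairs`, to avoid shadowing Base.pairs.
problem_grammar_pairs = get_all_problem_grammar_pairs(PBE_SLIA_Track_2019)

# Count the problems for which `probe` (the synthesizer from Case 2) returns a solution
solved_problems = 0
for (problem, grammar) in problem_grammar_pairs
    solution = probe(grammar, :Start, problem; max_depth=5)
    if !isnothing(solution)
        # `global` is required when this loop runs at the top level of a script
        global solved_problems += 1
    end
end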