
Uncovering Singularities in Feynman Integrals via Machine Learning

Liu, Xu, Zhang

We introduce a machine-learning framework based on symbolic regression to extract the full symbol alphabet of multi-loop Feynman integrals. By targeting the analytic structure rather than reduction, the method is broadly applicable and interpretable across different families of integrals. It successfully reconstructs complete symbol alphabets in nontrivial examples, demonstrating both robustness and generality. Beyond accelerating computations case by case, it uncovers the analytic structure universally. This framework opens new avenues for multi-loop amplitude analysis and provides a versatile tool for exploring scattering amplitudes.

Basic Information

Paper ID: 2510.10099
Title: Uncovering Singularities in Feynman Integrals via Machine Learning
Authors: Yuanche Liu (USTC), Yingxuan Xu (KIT), Yang Zhang (USTC/PKU)
Classification: hep-ph cs.AI cs.LG hep-th
Publication Date: October 14, 2025
Paper Link: https://arxiv.org/abs/2510.10099

Abstract

This paper proposes a machine-learning framework based on symbolic regression to extract the complete symbol alphabet of multi-loop Feynman integrals. By targeting the analytic structure directly rather than the reduction process, the method is broadly applicable and interpretable across different integral families. It reconstructs the complete symbol alphabet in non-trivial examples, demonstrating robustness and generality. Beyond accelerating individual computations, the framework reveals analytic structure in a uniform way, opening new avenues for multi-loop amplitude analysis and providing a general tool for exploring scattering amplitudes.

Research Background and Motivation

Core Problems

High-precision scattering amplitude requirements: Future high-energy physics experiments (HL-LHC, CEPC, FCC-ee) and third-generation gravitational wave detectors demand unprecedented theoretical precision, particularly for multi-loop scattering amplitude calculations.
Difficulty of symbol alphabet extraction: The symbol alphabet is central to modern amplitude techniques because it encodes the algebraic structure of iterated integrals, but constructing it analytically is computationally very demanding.
Limitations of existing methods:
- HyperInt provides only a superset of the Landau singularities
- PLD.jl and SOFIA compute singularities but are not comprehensive
- Baikovletter reconstructs letters via the Baikov representation but has limited applicability

Research Significance

The symbol alphabet encodes the algebraic structure of iterated integrals and underpins modern amplitude techniques, including bootstrap methods for master integrals and for complete scattering amplitudes. Extracting it accurately is therefore crucial for understanding the analytic structure of multi-loop Feynman integrals.

Core Contributions

Methodological framework: Proposes a machine-learning approach based on symbolic regression that targets the analytic structure directly rather than the IBP reduction process
Broad applicability: Applies to different integral families without requiring prior knowledge of the singularities or expensive reduction steps
Complete alphabet reconstruction: Identifies all symbol letters, including those with square-root structure
Practical validation: Demonstrates the method on several non-trivial multi-loop examples, including three-loop four-point and two-loop three-point integrals

Methodology Details

Task Definition

Given a multi-loop Feynman integral family, reconstruct the analytic expressions of the canonical differential equation (CDE) matrices from their numerically computed values using symbolic regression, and from them extract the complete symbol alphabet.

Core Framework: Three-Layer Architecture

1. Pre-processing Layer

Performs IBP reduction on the given integral family, constructing CDE matrices at multiple numerical points
Uses the Kira tool for numerical IBP reduction
Truncates rational coefficients to 30 significant digits, balancing efficiency and precision
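
The truncation step can be illustrated with the Python standard library. This is a minimal sketch, not the authors' code: the helper name `truncate_rational` is hypothetical, and the sample coefficient 14/15 is borrowed from the Case 1 expression quoted later in this summary.

```python
from decimal import Context, Decimal, ROUND_DOWN
from fractions import Fraction


def truncate_rational(value: Fraction, digits: int = 30) -> Decimal:
    """Truncate (round toward zero) an exact rational to `digits` significant digits."""
    ctx = Context(prec=digits, rounding=ROUND_DOWN)
    # Decimal(int) is exact; only the division is subject to the 30-digit context.
    return ctx.divide(Decimal(value.numerator), Decimal(value.denominator))


coefficient = Fraction(14, 15)
truncated = truncate_rational(coefficient)

# Exact size of the truncation error, computed in rational arithmetic.
error = abs(Fraction(truncated) - coefficient)
```

Comparing in exact rational arithmetic makes it easy to confirm that the truncation error stays below the 10^{-30} level that the regression is later asked to reach.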

2. Regression Layer

Employs PySR for symbolic regression to reconstruct the analytic form of CDE matrices
Utilizes evolutionary algorithms to search candidate expressions
Enhances reliability through "evolution-simplification-optimization" cycles

3. Post-processing Layer

Exponentiates and factorizes the regressed symbolic expressions
Collects all candidate symbol letters and assembles the complete symbol alphabet
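
The post-processing idea can be sketched with SymPy. The choice of SymPy and the helper name `collect_letters` are assumptions made for illustration; the source does not say which computer algebra system is used. The sample entry is the Case 1 target expression quoted later in this summary.

```python
import sympy as sp

x, y = sp.symbols("x y")


def collect_letters(expr):
    """Return the irreducible polynomial factors of every log argument in expr."""
    letters = set()
    for log_term in expr.atoms(sp.log):
        # Reading off the argument of each log plays the role of exponentiation.
        numer, denom = sp.fraction(sp.together(log_term.args[0]))
        for poly in (numer, denom):
            # factor_list returns (constant, [(factor, multiplicity), ...]).
            _, factors = sp.factor_list(poly)
            letters.update(factor for factor, _ in factors)
    return letters


entry = (
    sp.Rational(14, 15) * sp.log(1 - x)
    - sp.Rational(2, 5) * sp.log((1 - x - y) / (1 - x))
    + sp.Rational(2, 5) * sp.log(y)
)
letters = collect_letters(entry)
```

Letters are only defined up to an overall sign and constant factor, so a factor such as `x - 1` represents the same letter as 1-x. A full alphabet would be the union of these sets over all CDE matrix entries.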

Technical Core: Symbolic Regression

PySR Framework Characteristics

High performance: Julia backend supporting JIT compilation and multi-core parallelization
Hybrid optimization: Combines discrete structure search with continuous parameter optimization
Pareto frontier: Balances accuracy and complexity, providing multiple candidate solutions
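
The Pareto frontier mentioned above can be illustrated in a few lines of plain Python. This is a sketch of the general concept rather than PySR's internal implementation, and the candidate expressions, complexities, and losses are made-up illustrative values, not results from the paper.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    expression: str
    complexity: int
    loss: float


def pareto_front(candidates):
    """Keep each candidate that has strictly lower loss than every simpler one."""
    front = []
    best_loss = float("inf")
    for cand in sorted(candidates, key=lambda c: (c.complexity, c.loss)):
        if cand.loss < best_loss:
            front.append(cand)
            best_loss = cand.loss
    return front


# Hypothetical candidates for a single matrix entry (illustrative numbers only).
candidates = [
    Candidate("log(1 - x)", 4, 3.1e-2),
    Candidate("1.33*log(1 - x)", 6, 8.0e-3),
    Candidate("x + log(1 - x)", 6, 2.0e-2),
    Candidate("(4/3)*log(1 - x) - (2/5)*log(1 - x - y) + (2/5)*log(y)", 20, 1.0e-30),
    Candidate("(4/3)*log(1 - x) - (2/5)*log(1 - x - y) + (2/5)*log(y) + 0*x", 24, 1.0e-30),
]
front = pareto_front(candidates)
```

Only candidates that improve the loss at a given complexity survive, so a needlessly longer expression with the same loss is dropped from the front.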

Mathematical Foundation

The symbolic regression problem is formalized as:

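(s*, θ*) = argmin_{s} { min_{θ} L_D(f_{s,θ}) + λ C(s,θ) }

where s is the discrete expression structure, θ denotes its continuous parameters, L_D is the data loss on the dataset D, C(s,θ) is the complexity penalty term, and λ sets the weight of that penalty.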
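
A toy version of this objective may help make the two levels of minimization concrete. The sketch below is an assumption-laden illustration, not PySR's search: it uses three hand-written structures with a single linear parameter each, a closed-form least-squares fit for the inner minimization, a complexity that depends only on the structure, and an arbitrary λ = 10^{-3}.

```python
import math

# Synthetic data from a known target, standing in for one CDE matrix entry.
xs = [0.05 * k for k in range(1, 19)]  # sample points in (0, 1)
ys = [(4 / 3) * math.log(1 - x) for x in xs]

# Hypothetical candidate structures s: (name, basis function g, complexity C).
structures = [
    ("theta * x", lambda x: x, 3),
    ("theta * log(1 - x)", lambda x: math.log(1 - x), 4),
    ("theta * sqrt(x)", math.sqrt, 4),
]


def fit_theta(g):
    """Inner minimization: closed-form least squares for y ~ theta * g(x)."""
    gs = [g(x) for x in xs]
    theta = sum(y * v for y, v in zip(ys, gs)) / sum(v * v for v in gs)
    loss = sum((y - theta * v) ** 2 for y, v in zip(ys, gs)) / len(xs)
    return theta, loss


def penalized_score(structure, lam=1e-3):
    """Minimized data loss for this structure plus lam times its complexity."""
    name, g, complexity = structure
    theta, loss = fit_theta(g)
    return loss + lam * complexity, name, theta


# Outer minimization over the discrete structures.
best_score, best_name, best_theta = min(penalized_score(s) for s in structures)
```

In a real symbolic regression run the set of structures is not fixed in advance but explored by the evolutionary search; the scoring logic is the part this sketch is meant to show.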

Key Innovations

Direct structural targeting: Independent of explicit integral representations or singularity analysis
Enforced overfitting: Drives the fit until the residual reaches the precision of the input data, so that the recovered symbolic expression is exact rather than approximate
Constraint design: Tailored to CDE characteristics, restricting functions to log and sqrt structures only
Multi-variable extension: Supports symbolic regression for multi-variable partial differential equations
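
The source does not describe how fitted floating-point constants are turned into exact coefficients. One common option, assumed here purely for illustration, is to snap each constant to the nearest rational with a bounded denominator. The helper name `snap_to_rational`, the bound of 1000, and the sample values are all hypothetical.

```python
from fractions import Fraction


def snap_to_rational(value: float, max_denominator: int = 1000) -> Fraction:
    """Return the closest fraction to `value` with denominator <= max_denominator."""
    return Fraction(value).limit_denominator(max_denominator)


# Hypothetical fitted constants, carrying typical floating-point noise.
fitted = [1.3333333333333333, -0.4000000000000001, 0.39999999999999997]
exact = [snap_to_rational(c) for c in fitted]
```

A snapped coefficient should only be accepted if re-evaluating the expression with it keeps the residual at the level of the input precision.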

Experimental Setup

Test Cases

Three-loop four-point single-mass integrals: 83 master integrals in the uniform-transcendentality (UT) basis of Ref. [40]
Non-planar two-loop three-point integrals: Including elliptic integrals and polylogarithms with square root letters

Implementation Details

Number of numerical points: 200 different kinematic points
Precision settings: 30 significant digits
Computing environment: Intel i9-13950HX CPU, 12-core parallelization
Convergence criterion: The fit is treated as converged once the error has fallen from about 10^{-2} to the 10^{-30} level

Evaluation Criteria

Completeness: Whether the complete symbol alphabet is reconstructed
Accuracy: Consistency with known results
Efficiency: Computational time and resource consumption

Experimental Results

Main Achievements

Case 1: Three-loop Four-point Single-mass Integrals

Target expression:

f(x,y) = (14/15)log(1-x) - (2/5)log((1-x-y)/(1-x)) + (2/5)log(y)

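Reconstruction result:

f₂ = (4/3)log(1-x) - (2/5)log(1-x-y) + (2/5)log(y)

This agrees with the target: expanding log((1-x-y)/(1-x)) moves (2/5)log(1-x) into the first term, and 14/15 + 2/5 = 4/3.

Symbol alphabet: {x, 1-x, y, 1-y, x+y, 1-x-y}
Verification: Fully consistent with Ref. [40]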
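
The target and the reconstructed expression can also be compared numerically. The sketch below is an independent sanity check written for this summary, not part of the paper's pipeline; it samples 200 random points in a region where all logarithm arguments are positive.

```python
import math
import random


def target(x, y):
    return (
        (14 / 15) * math.log(1 - x)
        - (2 / 5) * math.log((1 - x - y) / (1 - x))
        + (2 / 5) * math.log(y)
    )


def reconstructed(x, y):
    return (
        (4 / 3) * math.log(1 - x)
        - (2 / 5) * math.log(1 - x - y)
        + (2 / 5) * math.log(y)
    )


random.seed(0)
max_diff = 0.0
for _ in range(200):
    x = random.uniform(0.05, 0.45)
    y = random.uniform(0.05, 0.45)  # keeps 1 - x - y >= 0.1
    max_diff = max(max_diff, abs(target(x, y) - reconstructed(x, y)))
```

In double precision the two forms differ only by rounding noise; checking agreement at the 10^{-30} level quoted in the text would require arbitrary-precision arithmetic instead of `math`.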

Case 2: Non-planar Two-loop Three-point Integrals

The method identifies five symbol letters:

l₁ = √x
l₂ = (1/2)(√x + √(x+4))
l₃ = √(x+4)
l₄ = (1/2)(√x + √(x-4))
l₅ = √(x-4)

These match the results of Ref. [41].
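
The letter l₂ shows how a square-root letter corresponds to a simple d-log entry. A short calculation (done for this summary, not quoted from the paper) gives d log l₂ / dx = 1 / (2√(x(x+4))), and the conjugate (1/2)(√(x+4) - √x) is the inverse of l₂. The sketch below checks both statements numerically with a central finite difference; the function names are hypothetical.

```python
import math


def letter_l2(x):
    return 0.5 * (math.sqrt(x) + math.sqrt(x + 4))


def conjugate_l2(x):
    return 0.5 * (math.sqrt(x + 4) - math.sqrt(x))


def dlog_l2_closed_form(x):
    """Claimed derivative of log(l2) with respect to x."""
    return 1.0 / (2.0 * math.sqrt(x * (x + 4)))


def dlog_numeric(f, x, h=1e-6):
    """Central finite-difference derivative of log(f) at x."""
    return (math.log(f(x + h)) - math.log(f(x - h))) / (2 * h)


points = [0.5, 1.0, 2.5, 7.0]
max_err = max(
    abs(dlog_numeric(letter_l2, x) - dlog_l2_closed_form(x)) for x in points
)
max_unit_err = max(abs(letter_l2(x) * conjugate_l2(x) - 1.0) for x in points)
```

Because l₂ times its conjugate equals 1, the logarithm of the conjugate is just -log l₂, which is why only one of the two appears as an independent letter.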

Systematic Test Results

Loops\Integral Family	1-scale	2-scale	3-scale	5-scale	5+-scale
1-loop	✓	✓	✓	⚬	⚬
2-loop	✓	✓	✓	⚬	✗
3-loop	✓	✓	✓	⚬	——
4-loop	✓	——	——	——	——

Legend: ✓ complete reconstruction; ⚬ most letters obtained; ✗ some letters not found

Performance Metrics

Computational time: ~1 hour per CDE matrix element
Precision achieved: Final error ~10^{-30}, consistent with input precision
Success rate: Complete symbol alphabet reconstruction achieved in most tested integral families

Related Work

Traditional Methods

HyperInt: Based on reduction algorithms, but only provides a superset of Landau singularities
PLD.jl/SOFIA: Computes singularities but has limitations with complex structures
Baikovletter: Reconstructs via Baikov representation with limited applicability

Machine Learning Applications in Physics

Previous ML applications focused primarily on accelerating IBP reduction [15-17]
This work instead targets the analytic structure directly, which the authors present as a new application direction

Symbolic Regression Development

Evolution from simple genetic programming to modern multi-objective optimization
PySR represents the current state-of-the-art symbolic regression tool

Conclusions and Discussion

Main Conclusions

Method effectiveness: Reconstructs complete symbol alphabets in multiple non-trivial examples
Broad applicability: Applicable to integral families with different loop orders and external legs
Technical breakthrough: First direct extraction of symbolic structures from numerical CDEs

Limitations

High-scale constraints: For integrals with five or more scales, some complex letters still require manual construction
Computational complexity: Computational time increases significantly with integral complexity
Precision dependence: Method effectiveness depends on the precision of input numerical data

Future Directions

Extension to higher loops: Explore applications in more complex integrals
Bootstrap integration: Combine with bootstrap methods to accelerate analytic structure discovery
Increased automation: Enhance automation levels to reduce manual intervention

In-Depth Evaluation

Strengths

Technical Innovation

Paradigm shift: Transition from traditional reduction methods to direct structure analysis
Tool fusion: Skillfully combines symbolic regression with physical constraints
General framework: Provides an extensible methodological framework

Experimental Sufficiency

Diverse testing: Covers different types of integral families
Precision verification: Achieves precision consistent with input data
Systematic assessment: Provides detailed applicability analysis

Practical Value

Computational acceleration: Significantly reduces the effort of symbol alphabet extraction
Universal applicability: No prior knowledge required, broad applicability
Interpretability: Results have clear physical meaning

Shortcomings

Method Limitations

Scale dependence: Performance degrades for integral families with many kinematic scales (five or more)
Structural constraints: Currently primarily handles algebraic letters; extension to transcendental functions needs exploration
Computational cost: Complex cases still require substantial computational resources

Theoretical Analysis

Convergence guarantees: Lacks theoretical convergence analysis
Error propagation: Insufficient systematic analysis of numerical error impact on final results
Completeness: Cannot guarantee finding the complete alphabet in all cases

Impact Assessment

Academic Contributions

Interdisciplinary fusion: Demonstrates deep application potential of AI in theoretical physics
Methodological innovation: Provides new technical pathways for multi-loop calculations
Tool development: Provides practical computational tools for the community

Practical Applications

High-energy physics: Directly serves theoretical predictions for LHC experiments
Gravitational wave physics: Supports precise modeling of gravitational wave signals
Computational physics: Promotes integration of symbolic computation and numerical methods

Applicable Scenarios

Multi-loop integral analysis: Particularly suitable for complex integral families with 2-3 loops
Symbolic structure exploration: Preliminary structural analysis of unknown integral families
Verification tool: Independent verification and cross-checking of known results

Technical Details Supplement

PySR Configuration Optimization

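from pysr import TemplateExpressionSpec

# Single-variable case: the model output is the derivative of f, so PySR
# searches for an f whose derivative matches the fitted data.
# Passed to PySRRegressor(expression_spec=...).
expression_spec = TemplateExpressionSpec(
    expressions=["f"],
    variable_names=["x"],
    combine="df = D(f, 1); df(x)",  # D(f, 1): derivative w.r.t. the first argument
)

# Multi-variable case: restrict how log and sqrt may nest.
# Passed to PySRRegressor(nested_constraints=...).
nested_constraints = {
    "sqrt": {"sqrt": 0, "log": 0},  # no sqrt or log inside sqrt
    "log": {"sqrt": 1, "log": 0},   # at most one level of sqrt inside log, no log inside log
}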

Numerical Precision Control

IBP reduction coefficients truncated to 30 digits
Final error controlled at 10^{-30} level
Balances computational efficiency with precision requirements

References

The paper cites 42 important references spanning symbolic computation, differential equations, and machine learning, reflecting the interdisciplinary nature of the work and the solid theoretical foundation.

Overall Assessment: This is a groundbreaking interdisciplinary research work that successfully applies modern machine learning techniques to core computational problems in theoretical physics. The methodology is novel, experiments are comprehensive, and results are convincing. It opens new technical pathways for multi-loop Feynman integral calculations and possesses significant academic value and practical importance.