Uncovering Singularities in Feynman Integrals via Machine Learning

Liu, Xu, Zhang
We introduce a machine-learning framework based on symbolic regression to extract the full symbol alphabet of multi-loop Feynman integrals. By targeting the analytic structure rather than reduction, the method is broadly applicable and interpretable across different families of integrals. It successfully reconstructs complete symbol alphabets in nontrivial examples, demonstrating both robustness and generality. Beyond accelerating computations case by case, it uncovers the analytic structure universally. This framework opens new avenues for multi-loop amplitude analysis and provides a versatile tool for exploring scattering amplitudes.

Basic Information

  • Paper ID: 2510.10099
  • Title: Uncovering Singularities in Feynman Integrals via Machine Learning
  • Authors: Yuanche Liu (USTC), Yingxuan Xu (KIT), Yang Zhang (USTC/PKU)
  • Classification: hep-ph cs.AI cs.LG hep-th
  • Publication Date: October 14, 2025
  • Paper Link: https://arxiv.org/abs/2510.10099

Abstract

This paper proposes a machine learning framework based on symbolic regression to extract the complete symbol alphabet of multi-loop Feynman integrals. By targeting the analytic structure directly rather than the reduction process, the method is broadly applicable and interpretable across different integral families. It reconstructs the complete symbol alphabet in non-trivial examples, demonstrating robustness and generality. Beyond accelerating individual computations, the framework reveals analytic structure in a uniform way, opening new avenues for multi-loop amplitude analysis and providing a general tool for exploring scattering amplitudes.

Research Background and Motivation

Core Problems

  1. High-precision scattering amplitude requirements: Future high-energy physics experiments (HL-LHC, CEPC, FCC-ee) and third-generation gravitational wave detectors demand unprecedented theoretical precision, particularly for multi-loop scattering amplitude calculations.
  2. Difficulty in symbol alphabet extraction: The symbol alphabet is central to modern amplitude techniques because it encodes the algebraic structure of iterated integrals, but constructing it analytically is computationally very demanding.
  3. Limitations of existing methods:
    • HyperInt can only provide a superset of Landau singularities
    • PLD.jl and SOFIA compute singularities but lack comprehensiveness
    • Baikovletter reconstructs via Baikov representation but has limitations

Research Significance

The symbol alphabet encodes the algebraic structure of iterated integrals and underpins modern amplitude techniques, including bootstrap methods for master integrals and for complete scattering amplitudes. Extracting it accurately is therefore crucial for understanding the analytic structure of multi-loop Feynman integrals.

Core Contributions

  1. Innovative methodological framework: Proposes a machine learning approach based on symbolic regression, directly targeting analytic structure rather than IBP reduction processes
  2. Broad applicability: The method applies to different integral families without requiring prior singularity knowledge or expensive reduction steps
  3. Complete alphabet reconstruction: Successfully identifies all symbol letters, including those with square-root structure
  4. Practical validation: Validates the method's effectiveness on multiple non-trivial multi-loop examples, including three-loop four-point and two-loop three-point integrals

Methodology Details

Task Definition

Given a multi-loop Feynman integral family, use symbolic regression to reconstruct the analytic form of the canonical differential equation (CDE) matrices from their numerically computed values, and from that form extract the complete symbol alphabet.

Core Framework: Three-Layer Architecture

1. Pre-processing Layer

  • Performs IBP reduction on the given integral family, constructing CDE matrices at multiple numerical points
  • Uses the Kira tool for numerical IBP reduction
  • Truncates rational coefficients to 30 significant digits, balancing efficiency and precision
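
The truncation step can be illustrated with Python's standard library. This is only a sketch of how such a truncation might look: the helper name `truncate_rational` is hypothetical and is not taken from the paper or from Kira. It cuts an exact rational coefficient to 30 significant digits.

```python
from decimal import Context, Decimal, ROUND_DOWN
from fractions import Fraction


def truncate_rational(value: Fraction, digits: int = 30) -> Decimal:
    """Truncate an exact rational to `digits` significant digits (hypothetical helper)."""
    # ROUND_DOWN truncates toward zero instead of rounding to nearest.
    ctx = Context(prec=digits, rounding=ROUND_DOWN)
    return ctx.divide(Decimal(value.numerator), Decimal(value.denominator))


coefficient = Fraction(14, 15)          # example rational coefficient
truncated = truncate_rational(coefficient)
# The truncation error stays below one unit in the 30th decimal place.
error = abs(Fraction(truncated) - coefficient)
print(truncated, float(error))
```

Keeping the input exact until the final division avoids accumulating floating-point error before the cut is made.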

2. Regression Layer

  • Employs PySR for symbolic regression to reconstruct the analytic form of CDE matrices
  • Utilizes evolutionary algorithms to search candidate expressions
  • Enhances reliability through "evolution-simplification-optimization" cycles

3. Post-processing Layer

  • Exponentiates the reconstructed logarithmic expressions and factorizes the resulting arguments
  • Collects all candidate symbol letters and assembles the complete symbol alphabet
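
As a rough illustration of this step, the sketch below exponentiates a sum of logarithms into a product of letters and collects the net exponent of each letter. It assumes the argument of every logarithm has already been factorized into irreducible factors (in practice a computer algebra system would do this); the function name `collect_letters` and the string labels are hypothetical. The input is the target expression quoted under Case 1 later in this summary.

```python
from collections import defaultdict
from fractions import Fraction


def collect_letters(terms):
    """Turn sum_i c_i*log(arg_i) into prod_j letter_j**e_j and return {letter: e_j}.

    Each term is (c_i, factors), where `factors` maps an irreducible factor of
    arg_i to its integer power. The factorization is assumed to be given.
    """
    exponents = defaultdict(Fraction)
    for coeff, factors in terms:
        for letter, power in factors.items():
            exponents[letter] += coeff * power
    # Drop letters whose exponents cancel exactly.
    return {letter: e for letter, e in exponents.items() if e != 0}


# (14/15) log(1-x) - (2/5) log((1-x-y)/(1-x)) + (2/5) log(y), factorized by hand
terms = [
    (Fraction(14, 15), {"1-x": 1}),
    (Fraction(-2, 5), {"1-x-y": 1, "1-x": -1}),
    (Fraction(2, 5), {"y": 1}),
]
letters = collect_letters(terms)
print(letters)
```

The keys of the returned dictionary are the candidate letters contributed by this matrix element; the union over all matrix elements gives the candidate alphabet.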

Technical Core: Symbolic Regression

PySR Framework Characteristics

  • High performance: Julia backend supporting JIT compilation and multi-core parallelization
  • Hybrid optimization: Combines discrete structure search with continuous parameter optimization
  • Pareto frontier: Balances accuracy and complexity, providing multiple candidate solutions
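
The Pareto-frontier idea can be sketched independently of PySR. The code below is a minimal stdlib illustration, not PySR's implementation. The candidate list, its complexities and its losses are invented for the example. A candidate stays on the frontier only if no simpler candidate achieves an equal or lower loss.

```python
def pareto_front(candidates):
    """Return the candidates that are not dominated in (complexity, loss).

    Each candidate is a (label, complexity, loss) tuple; lower is better for both.
    """
    front = []
    # Sort by complexity (ties broken by loss), then keep strict loss improvements.
    for label, complexity, loss in sorted(candidates, key=lambda c: (c[1], c[2])):
        if not front or loss < front[-1][2]:
            front.append((label, complexity, loss))
    return front


# Hypothetical candidates, invented for illustration only.
candidates = [
    ("log(1 - x)", 4, 1e-2),
    ("log(1 - x) + y", 6, 5e-2),
    ("(4/3)*log(1 - x) + (2/5)*log(y)", 12, 1e-6),
    ("(4/3)*log(1 - x) - (2/5)*log(1 - x - y) + (2/5)*log(y)", 18, 1e-30),
]
front = pareto_front(candidates)
for label, complexity, loss in front:
    print(complexity, loss, label)
```

Here the second candidate is dropped because it is more complex than the first yet fits worse.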

Mathematical Foundation

The symbolic regression problem is formalized as:

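(s*, θ*) = argmin_{s} { min_{θ} L_D(f_{s,θ}) + λ C(s,θ) }

where s is the discrete expression structure, θ its continuous parameters, f_{s,θ} the candidate expression, L_D the data loss, C(s,θ) the complexity penalty term, and λ the weight that trades accuracy against complexity.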
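
A toy version of this objective is sketched below. It is a deliberately small stand-in for what a symbolic regression engine does, under several assumptions that are not from the paper: each structure has a single scale parameter θ, the inner minimization is a closed-form least-squares fit, the complexity is a hand-assigned number that depends on s only, and the data are synthetic.

```python
import math


def fit_scale(basis, xs, ys):
    """Inner minimization over theta for f(x) = theta * basis(x), by least squares."""
    b = [basis(x) for x in xs]
    theta = sum(bi * yi for bi, yi in zip(b, ys)) / sum(bi * bi for bi in b)
    loss = sum((theta * bi - yi) ** 2 for bi, yi in zip(b, ys)) / len(xs)
    return theta, loss


# Hypothetical structures s, each with a hand-assigned complexity C(s).
structures = {
    "theta*x": (lambda x: x, 3),
    "theta*sqrt(x)": (lambda x: math.sqrt(x), 4),
    "theta*log(1-x)": (lambda x: math.log(1 - x), 5),
}

xs = [0.1 * k for k in range(1, 9)]            # sample points in (0, 1)
ys = [(4 / 3) * math.log(1 - x) for x in xs]   # synthetic data, not from the paper
lam = 1e-6                                     # complexity weight lambda


def objective(name):
    """Data loss after the inner fit plus the weighted complexity penalty."""
    basis, complexity = structures[name]
    _, loss = fit_scale(basis, xs, ys)
    return loss + lam * complexity


best = min(structures, key=objective)          # outer minimization over structures
theta_best, _ = fit_scale(structures[best][0], xs, ys)
print(best, theta_best)
```

With a small λ the most accurate structure wins even though it is the most complex of the three; raising λ shifts the balance toward simpler expressions.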

Key Innovations

  1. Direct structural targeting: Independent of explicit integral representations or singularity analysis
  2. Enforced overfitting: Deliberately pushes the fit until the error reaches the precision of the input data, so that the recovered symbolic expression is exact rather than approximate
  3. Constraint design: Tailored to CDE characteristics, restricting functions to log and sqrt structures only
  4. Multi-variable extension: Supports symbolic regression for multi-variable partial differential equations

Experimental Setup

Test Cases

  1. Three-loop four-point single-mass integrals: 83 master integrals in the uniform-transcendentality (UT) basis of Ref. [40]
  2. Non-planar two-loop three-point integrals: Including elliptic integrals and polylogarithms with square root letters

Implementation Details

  • Number of numerical points: 200 different kinematic points
  • Precision settings: 30 significant digits
  • Computing environment: Intel i9-13950HX CPU, 12-core parallelization
  • Convergence criterion: Error reduction from 10^{-2} to 10^{-30}

Evaluation Criteria

  • Completeness: Whether the complete symbol alphabet is reconstructed
  • Accuracy: Consistency with known results
  • Efficiency: Computational time and resource consumption

Experimental Results

Main Achievements

Case 1: Three-loop Four-point Single-mass Integrals

  • Target expression:
f(x,y) = (14/15)log(1-x) - (2/5)log((1-x-y)/(1-x)) + (2/5)log(y)
  • Reconstruction result:
f₂ = (4/3)log(1-x) - (2/5)log(1-x-y) + (2/5)log(y)
  • Symbol alphabet: {x, 1-x, y, 1-y, x+y, 1-x-y}
  • Verification: Fully consistent with Ref. [40]
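
The target and the reconstructed expression above are the same function: splitting log((1-x-y)/(1-x)) moves (2/5)log(1-x) into the first term, and 14/15 + 2/5 = 4/3. The short check below confirms this numerically at a few sample points. The points are arbitrary choices satisfying x > 0, y > 0, x + y < 1, and are not taken from the paper.

```python
import math


def target(x, y):
    """Target expression as quoted above."""
    return (
        (14 / 15) * math.log(1 - x)
        - (2 / 5) * math.log((1 - x - y) / (1 - x))
        + (2 / 5) * math.log(y)
    )


def reconstructed(x, y):
    """Reconstructed expression f2 as quoted above."""
    return (
        (4 / 3) * math.log(1 - x)
        - (2 / 5) * math.log(1 - x - y)
        + (2 / 5) * math.log(y)
    )


# Arbitrary points inside the region where every logarithm is real.
points = [(0.1, 0.2), (0.3, 0.3), (0.05, 0.9), (0.6, 0.1)]
max_diff = max(abs(target(x, y) - reconstructed(x, y)) for x, y in points)
print(max_diff)
```

The difference is at the level of double-precision rounding, as expected for two forms of the same expression.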

Case 2: Non-planar Two-loop Three-point Integrals

Successfully identified 5 symbol letters:

l₁ = √x
l₂ = (1/2)(√x + √(x+4))
l₃ = √(x+4)
l₄ = (1/2)(√x + √(x-4))
l₅ = √(x-4)

This matches the results of Ref. [41] exactly.

Systematic Test Results

Loops \ Integral family    1-scale    2-scale    3-scale    5-scale    5+-scale
1-loop
2-loop
3-loop——
4-loop————————

Legend: ✓ complete reconstruction; ⚬ most letters obtained; ✗ some letters not found

Performance Metrics

  • Computational time: ~1 hour per CDE matrix element
  • Precision achieved: Final error ~10^{-30}, consistent with input precision
  • Success rate: Complete symbol alphabet reconstruction achieved in most tested integral families

Related Work

Traditional Methods

  1. HyperInt: Based on reduction algorithms, but only provides a superset of Landau singularities
  2. PLD.jl/SOFIA: Computes singularities but has limitations with complex structures
  3. Baikovletter: Reconstructs via Baikov representation with limited applicability

Machine Learning Applications in Physics

  • Previous ML applications primarily focused on accelerating IBP reduction [15-17]
  • This work is the first to directly target analytic structures, pioneering a new application direction

Symbolic Regression Development

  • Evolution from simple genetic programming to modern multi-objective optimization
  • PySR represents the current state-of-the-art symbolic regression tool

Conclusions and Discussion

Main Conclusions

  1. Method effectiveness: Successfully reconstructs complete symbol alphabets in multiple non-trivial examples
  2. Broad applicability: Applicable to integral families with different loop orders and external legs
  3. Technical breakthrough: First direct extraction of symbolic structures from numerical CDEs

Limitations

  1. High-scale constraints: For integrals with five or more scales, some complex letters still require manual construction
  2. Computational complexity: Computational time increases significantly with integral complexity
  3. Precision dependence: Method effectiveness depends on the precision of input numerical data

Future Directions

  1. Extension to higher loops: Explore applications in more complex integrals
  2. Bootstrap integration: Combine with bootstrap methods to accelerate analytic structure discovery
  3. Increased automation: Enhance automation levels to reduce manual intervention

In-Depth Evaluation

Strengths

Technical Innovation

  1. Paradigm shift: Transition from traditional reduction methods to direct structure analysis
  2. Tool fusion: Skillfully combines symbolic regression with physical constraints
  3. General framework: Provides an extensible methodological framework

Experimental Sufficiency

  1. Diverse testing: Covers different types of integral families
  2. Precision verification: Achieves precision consistent with input data
  3. Systematic assessment: Provides detailed applicability analysis

Practical Value

  1. Computational acceleration: Significantly reduces the effort of symbol alphabet extraction
  2. Universal applicability: No prior knowledge required, broad applicability
  3. Interpretability: Results have clear physical meaning

Shortcomings

Method Limitations

  1. Scale dependence: Performance degrades for high-scale cases
  2. Structural constraints: Currently primarily handles algebraic letters; extension to transcendental functions needs exploration
  3. Computational cost: Complex cases still require substantial computational resources

Theoretical Analysis

  1. Convergence guarantees: Lacks theoretical convergence analysis
  2. Error propagation: Insufficient systematic analysis of numerical error impact on final results
  3. Completeness: Cannot guarantee finding the complete alphabet in all cases

Impact Assessment

Academic Contributions

  1. Interdisciplinary fusion: Demonstrates deep application potential of AI in theoretical physics
  2. Methodological innovation: Provides new technical pathways for multi-loop calculations
  3. Tool development: Provides practical computational tools for the community

Practical Applications

  1. High-energy physics: Directly serves theoretical predictions for LHC experiments
  2. Gravitational wave physics: Supports precise modeling of gravitational wave signals
  3. Computational physics: Promotes integration of symbolic computation and numerical methods

Applicable Scenarios

  1. Multi-loop integral analysis: Particularly suitable for complex integral families with 2-3 loops
  2. Symbolic structure exploration: Preliminary structural analysis of unknown integral families
  3. Verification tool: Independent verification and cross-checking of known results

Technical Details Supplement

PySR Configuration Optimization

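# PySR import required by the configuration below
from pysr import TemplateExpressionSpec

# Single-variable case: the template returns df/dx, so f itself is what gets reconstructed
expression_spec = TemplateExpressionSpec(
    expressions=["f"],
    variable_names=["x"],
    combine="df = D(f, 1); df(x)",
)

# Multi-variable case
# Limits on how operators may nest inside one another
nested_constraints = {
    "sqrt": {"sqrt": 0, "log": 0},  # no sqrt or log inside a sqrt
    "log": {"sqrt": 1, "log": 0},   # at most one level of sqrt inside a log, no nested log
}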
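
To make the effect of these nesting limits concrete, the sketch below checks small expression trees against the same dictionary. It reflects one reading of the constraint semantics (the number bounds how many times the inner operator may be nested inside the outer one) and is not PySR code. The tuple encoding of expressions and the helper names `max_nesting` and `satisfies` are hypothetical.

```python
def max_nesting(expr, op):
    """Largest number of `op` nodes on any root-to-leaf path of `expr`."""
    if not isinstance(expr, tuple):          # leaf: variable name or constant
        return 0
    head, *args = expr
    deepest = max((max_nesting(a, op) for a in args), default=0)
    return deepest + (1 if head == op else 0)


def satisfies(expr, constraints):
    """Check every operator node against its allowed nesting of inner operators."""
    if not isinstance(expr, tuple):
        return True
    head, *args = expr
    for inner, limit in constraints.get(head, {}).items():
        if any(max_nesting(a, inner) > limit for a in args):
            return False
    return all(satisfies(a, constraints) for a in args)


nested_constraints = {
    "sqrt": {"sqrt": 0, "log": 0},
    "log": {"sqrt": 1, "log": 0},
}

# log(sqrt(x) + sqrt(x + 4)): one level of sqrt inside log, allowed
allowed = ("log", ("+", ("sqrt", "x"), ("sqrt", ("+", "x", 4))))
# log(sqrt(sqrt(x))): sqrt nested inside sqrt, rejected
rejected = ("log", ("sqrt", ("sqrt", "x")))

print(satisfies(allowed, nested_constraints), satisfies(rejected, nested_constraints))
```

Under this reading, the constraints admit exactly the kind of letters listed in Case 2 (logarithms of sums of single square roots) while excluding nested radicals and nested logarithms from the search space.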

Numerical Precision Control

  • IBP reduction coefficients truncated to 30 digits
  • Final error controlled at 10^{-30} level
  • Balances computational efficiency with precision requirements

References

The paper cites 42 important references spanning symbolic computation, differential equations, and machine learning, reflecting the interdisciplinary nature of the work and the solid theoretical foundation.


Overall Assessment: This is a groundbreaking interdisciplinary research work that successfully applies modern machine learning techniques to core computational problems in theoretical physics. The methodology is novel, experiments are comprehensive, and results are convincing. It opens new technical pathways for multi-loop Feynman integral calculations and possesses significant academic value and practical importance.