2025-11-22T21:28:16.108948

Forecasting Generative Amplification

Bahl, Diefenbacher, Elmer et al.

Generative networks are perfect tools to enhance the speed and precision of LHC simulations. It is important to understand their statistical precision, especially when generating events beyond the size of the training dataset. We present two complementary methods to estimate the amplification factor without large holdout datasets. Averaging amplification uses Bayesian networks or ensembling to estimate amplification from the precision of integrals over given phase-space volumes. Differential amplification uses hypothesis testing to quantify amplification without any resolution loss. Applied to state-of-the-art event generators, both methods indicate that amplification is possible in specific regions of phase space, but not yet across the entire distribution.

Basic Information

Paper ID: 2509.08048
Title: Forecasting Generative Amplification
Authors: Henning Bahl, Sascha Diefenbacher, Nina Elmer, Tilman Plehn, Jonas Spinner
Classification: hep-ph cs.LG
Submission Date: October 17, 2025 to SciPost Physics
Paper Link: https://arxiv.org/abs/2509.08048

Abstract

Generative networks are well suited to enhancing the speed and precision of LHC simulations. Understanding their statistical precision is crucial, particularly when generating more events than the training dataset contains. The paper proposes two complementary methods to estimate the amplification factor without requiring large held-out datasets. Average amplification uses Bayesian networks or ensembling to estimate amplification from the precision of integrals over given phase-space volumes. Differential amplification uses hypothesis testing to quantify amplification without loss of resolution. Applied to state-of-the-art event generators, both methods indicate that amplification is possible in specific phase-space regions, but not yet across the entire distribution.

Research Background and Motivation

Problem Background

Computational Challenges: The High-Luminosity LHC (HL-LHC) will increase data volume by an order of magnitude, requiring corresponding increases in simulation precision and quantity, yet computational budgets fall far short of requirements.
Concept of Generative Amplification: Generative amplification refers to the phenomenon where datasets sampled from generative networks can provide better descriptions of true distributions than training data. This phenomenon is based on the interpolation capability of generative networks for underlying densities.
Limitations of Existing Evaluation Methods:
- Dependence on known true distributions
- Requirement for large held-out datasets
- Impracticality in real physics applications

Research Motivation

Provide a systematic framework to quantify statistical amplification of generative networks without requiring large held-out datasets
Provide reliable uncertainty quantification for generative network applications in LHC physics
Address two core concerns: understanding how to use generative networks for simulation and providing lower bounds for statistical uncertainties in generated datasets

Core Contributions

Propose Two Complementary Amplification Factor Estimation Methods:
- Average amplification factor: estimation based on the precision of integrals over given phase-space volumes
- Differential amplification factor: estimation based on hypothesis testing without resolution loss
Evaluation Framework Without Large Held-Out Datasets: utilizing Bayesian networks or ensemble methods to estimate model uncertainty
Verification in Practical LHC Physics Applications: application to state-of-the-art event generators for top quark pair production
Systematic Theoretical Framework: provides mathematically rigorous definitions and evaluation methods for generative amplification

Methodology Details

Task Definition

Given a training dataset $D^{n_{train}}_{true} \sim p_{true}(x)$, the generative network learns a density $p_{gen}(x)$. The amplification factor is defined as

$$G = \frac{n_{equiv}}{n_{train}}$$

where $n_{equiv}$ is the number of true events that matches the generated sample in the limit of infinitely many generated events, as judged by a chosen measure $M$:

$$M[D^{n_{equiv}}_{true}, p_{true}] = \lim_{n_{gen} \to \infty} M[D^{n_{gen}}_{gen}, p_{true}]$$
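As a rough illustration of how this definition becomes computable, the sketch below takes $M$ to be the variance of a phase-space integral $I$ and uses the model-uncertainty formula from the Bayesian Network Implementation part of this summary. It rests on assumptions not spelled out in the summary: the statistical variance is taken to be the binomial counting variance $I(1-I)/n$, negative variance estimates are clipped to zero, and the function names `model_variance` and `amplification` are hypothetical.

```python
import statistics


def model_variance(integral_estimates, n_gen):
    """Ensemble estimate of sigma^2_model.

    integral_estimates: one integral estimate per ensemble member (or per
    sampled weight configuration theta), each obtained from n_gen events.
    Implements <I^2> - <I>^2 - <I>(1 - <I>) / n_gen.
    """
    mean = statistics.fmean(integral_estimates)
    total_var = statistics.pvariance(integral_estimates, mu=mean)
    # Assumption: counting noise of a region integral is binomial.
    stat_var = mean * (1.0 - mean) / n_gen
    # Assumption: clip at zero, since the difference can fluctuate negative.
    return max(total_var - stat_var, 0.0)


def amplification(integral, sigma2_model, n_train):
    """G = n_equiv / n_train, with n_equiv solving I(1-I)/n_equiv = sigma^2_model."""
    if sigma2_model <= 0.0:
        return float("inf")  # no resolvable model error in this region
    n_equiv = integral * (1.0 - integral) / sigma2_model
    return n_equiv / n_train


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    estimates = [0.100, 0.102, 0.098, 0.101, 0.099]
    s2 = model_variance(estimates, n_gen=10**7)
    print(amplification(statistics.fmean(estimates), s2, n_train=10**4))
```

In this picture $G > 1$ simply means that the spread between ensemble members is smaller than the counting noise of a true sample of size $n_{train}$ in the same region.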

### Average Amplification Factor Method

#### Uncertainty Decomposition

The total variance $\sigma^2$ of a phase-space integral estimated from $n_{gen}$ generated events contains a statistical part and, when the learned density deviates from the truth, an additional model part:

$$\sigma^2 = \begin{cases} \sigma^2_{stat}(n_{gen}) & \text{if } p_{gen} = p_{true} \\ \sigma^2_{stat}(n_{gen}) + \sigma^2_{model}(p_{gen}, p_{true}) & \text{if } p_{gen} \neq p_{true} \end{cases}$$

#### Bayesian Network Implementation

The model uncertainty is estimated using Bayesian generative networks:

$$\sigma^2_{model}(p_{gen}, p_{true}) = \langle \bar{I}^2 \rangle_\theta - \langle \bar{I} \rangle^2_\theta - \frac{\langle \bar{I} \rangle_\theta (1 - \langle \bar{I} \rangle_\theta)}{n_{gen}}$$

### Differential Amplification Factor Method

#### Kolmogorov-Smirnov Test

The method uses the KS test statistic:

$$M_{KS}[D_1, D_2] = \sup_y |F(y, D_1) - F(y, D_2)|$$

#### Asymptotic Behavior

For two datasets drawn from identical distributions, the rescaled KS statistic follows a known asymptotic distribution:

$$\sqrt{\frac{n_1 n_2}{n_1 + n_2}} M_{KS}[D_1, D_2] = K \sim p_K(K)$$

#### Likelihood Ratio Classifier

The output of a trained classifier serves as a one-dimensional summary statistic, which according to the Neyman-Pearson lemma is the most powerful summary statistic.

## Experimental Setup

### Toy Datasets

- **Gaussian Ring Distribution**: 2D and 4D with radial distribution $p_R(x) = \mathcal{N}(R; 1, 0.1^2)$
- **Network Architecture**: Autoregressive Transformer using Gaussian mixture parameterization for conditional probabilities

### Physics Application Datasets

- **Top Quark Pair Production**: Generated using MadGraph5_aMC@NLO 3.5.1
- **Two Datasets**:
  - $t\bar{t} + 0j$: training set ~5×10⁵, test set ~8×10⁶
  - $t\bar{t} + 4j$: training set ~2×10⁵, test set ~2×10⁵

### Generative Network Architecture

- **Conditional Flow Matching (CFM)** generator
- **Three Architectures**:
  - Standard Transformer
  - L-GATr (Lorentz-equivariant Geometric Algebra Transformer)
  - LLoCa Transformer (Lorentz Local Canonicalization)

## Experimental Results

### Toy Dataset Results

#### Average Amplification

- **2D Gaussian Ring**: $G = 2.6$ in region 2, $G = 7.0$ in combined regions
- **4D Gaussian Ring**: $G = 1.9$ in region 2, $G = 2.8$ in combined regions
- **Tail Regions**: amplification factor decreases significantly, $G = 0.9$ in 2D, $G = 0.03$ in 4D

#### Differential Amplification

- **Summary Statistic Sensitivity**: radial summary statistic $R$ shows higher amplification factor ($G \approx 22$), while likelihood ratio statistic shows no amplification
- **Dimensionality Effect**: amplification weakens in 4D case, reflecting challenges in high-dimensional learning

### Physics Application Results

#### $t\bar{t} + 0j$ Production

**Average Amplification**:
- Transformer: $G_{est} = 0.3$, $G_{truth} = 0.3$
- L-GATr: $G_{est} = 0.8$, $G_{truth} = 0.7$
- LLoCa-Tr: $G_{est} = 1.7$, $G_{truth} = 1.2$

**Differential Amplification**:
- Full phase space: $G \approx 0.01-0.1$ for all architectures
- High $m_{t\bar{t}}$ region: LLoCa Transformer achieves $G \approx 2$

#### $t\bar{t} + 4j$ Production

**Average Amplification** (high $m_{t\bar{t}}$ region):
- Transformer: $G_{est} = 2.3$
- L-GATr: $G_{est} = 10.9$
- LLoCa-Tr: $G_{est} = 12.0$

**Differential Amplification**:
- High $m_{t\bar{t}}$ region: $G \approx 5$ for all architectures

### Key Findings

1. **Advantages of Lorentz Equivariance**: L-GATr and LLoCa Transformer significantly outperform standard Transformer
2. **Region Dependence**: amplification is more readily achieved in specific phase space regions (e.g., high-mass tails)
3. **Method Complementarity**: average and differential methods provide different perspectives on amplification assessment

## Related Work

### Generative Amplification Research

- Early work primarily verified amplification effects in synthetic data and detector simulations
- Existing methods rely on known true distributions or large held-out datasets for verification

### LHC Event Generation

- Generative networks for phase space sampling, end-to-end event generation, hadronization, and detector simulation
- Learned smooth amplitude surrogates and classifier-based benchmarking

### Uncertainty Quantification

- Use of Bayesian neural networks and ensemble methods in physics applications
- Uncertainty quantification for generative networks as important component for reliable amplification

## Conclusions and Discussion

### Main Conclusions

1. **Feasibility Verification**: Modern generative networks can indeed achieve statistical amplification within specific phase space regions
2. **Method Validity**: Both proposed methods effectively estimate amplification factors without requiring large held-out datasets
3. **Architecture Importance**: Lorentz-equivariant architectures demonstrate superior performance in LHC event generation

### Limitations

1. **Regional Constraints**: amplification is primarily achieved in specific phase space regions, not yet covering the entire distribution
2. **Dimensionality Challenges**: amplification effects weaken in high-dimensional cases
3. **Method Differences**: the two methods yield slightly different amplification factors, reflecting different resolution sensitivities

### Future Directions

1. Extension to more complex LHC processes and higher dimensions
2. Improvement of generative network architectures to achieve broader amplification
3. Integration with other uncertainty quantification techniques

## In-Depth Evaluation

### Strengths

1. **Theoretical Rigor**: provides mathematically rigorous definitions and evaluation framework for generative amplification
2. **Practical Value**: addresses critical needs in real physics applications without requiring large held-out datasets
3. **Methodological Innovation**: two complementary methods each with distinct advantages; average method is intuitive and straightforward, differential method preserves resolution
4. **Comprehensive Verification**: systematic validation from simple toy models to complex physics processes

### Weaknesses

1. **Limited Amplification Range**: currently achieves amplification only in specific regions, with global amplification still distant
2. **Computational Overhead**: Bayesian networks and ensemble methods increase computational cost
3. **KS Test Limitations**: differential method restricted to univariate test statistics

### Impact

1. **Academic Contribution**: provides important theoretical foundation for generative network applications in high-energy physics
2. **Practical Value**: offers feasible solutions to computational challenges of HL-LHC
3. **Method Generalizability**: proposed methods extensible to other scientific computing domains

### Applicable Scenarios

1. **High-Energy Physics Simulation**: LHC event generation and detector simulation
2. **Scientific Computing**: physics problems requiring extensive Monte Carlo simulations
3. **Generative Model Evaluation**: any application requiring quantification of generation quality and statistical reliability

## References

The paper includes abundant references covering machine learning applications in LHC physics, generative networks, Bayesian methods, and uncertainty quantification. Particularly noteworthy are the authors' previous pioneering work on GANplification and recent research on Lorentz-equivariant network architectures.
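Returning to the differential method, the KS statistic and its rescaling by $\sqrt{n_1 n_2 / (n_1 + n_2)}$ can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the helper names are hypothetical, the survival function uses the standard series for the limiting Kolmogorov distribution, and the step from $K$ to an amplification factor is left out because this summary does not spell it out.

```python
import math
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: sup_y |F(y, A) - F(y, B)| over empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        y = min(a[i], b[j])
        # Advance both empirical CDFs past every point <= y (handles ties).
        while i < len(a) and a[i] <= y:
            i += 1
        while j < len(b) and b[j] <= y:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max


def scaled_ks(sample_a, sample_b):
    """K = sqrt(n1 * n2 / (n1 + n2)) * M_KS, the rescaled statistic."""
    n1, n2 = len(sample_a), len(sample_b)
    return math.sqrt(n1 * n2 / (n1 + n2)) * ks_statistic(sample_a, sample_b)


def kolmogorov_sf(k, terms=100):
    """P(K > k) for the limiting Kolmogorov distribution (alternating series)."""
    if k < 0.2:
        return 1.0  # the series converges slowly here; the true value is ~1
    series = sum(
        (-1) ** (m - 1) * math.exp(-2.0 * m * m * k * k)
        for m in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


if __name__ == "__main__":
    # Toy radial values drawn from N(1, 0.1^2), as in the Gaussian ring setup.
    rng = random.Random(0)
    d1 = [rng.gauss(1.0, 0.1) for _ in range(2000)]
    d2 = [rng.gauss(1.0, 0.1) for _ in range(3000)]
    k = scaled_ks(d1, d2)
    print(f"K = {k:.3f}, p-value = {kolmogorov_sf(k):.3f}")
```

Because the test acts on a one-dimensional quantity, a multi-dimensional event first has to be reduced to a summary statistic such as the radius $R$ or a classifier output, which is why the choice of summary statistic affects the result.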
