Optimised neural networks for online processing of ATLAS calorimeter data on FPGAs

Aad, Bertrand, Laatu et al.
A study of neural network architectures for the reconstruction of the energy deposited in the cells of the ATLAS liquid-argon calorimeters under high pile-up conditions expected at the HL-LHC is presented. These networks are designed to run on the FPGA-based readout hardware of the calorimeters under strict size and latency constraints. Several architectures, including Dense, Recurrent (RNN), and Convolutional (CNN) neural networks, are optimised using a Bayesian procedure that balances energy resolution against network size. The optimised Dense, CNN, and combined Dense+RNN architectures achieve a transverse energy resolution of approximately 80 MeV, outperforming both the optimal filtering (OF) method currently in use and RNNs of similar complexity. A detailed comparison across the full dynamic range shows that Dense, CNN, and Dense+RNN accurately reproduce the energy scale, while OF and RNNs underestimate the energy. Deep Evidential Regression is implemented within the Dense architecture to address the need for reliable per-event energy uncertainties. This approach provides predictive uncertainty estimates with minimal increase in network size. The predicted uncertainty is found to be consistent, on average, with the difference between the true deposited energy and the predicted energy.

Basic Information

  • Paper ID: 2510.11469
  • Title: Optimised neural networks for online processing of ATLAS calorimeter data on FPGAs
  • Authors: Georges Aad, Raphaël Bertrand, Lauri Laatu, Emmanuel Monnier, Arno Straessner, Nairit Sur, Johann C. Voigt
  • Classification: physics.ins-det (Physics - Instrumentation and Detectors)
  • Publication Date: October 13, 2025
  • Paper Link: https://arxiv.org/abs/2510.11469v1

Abstract

This study presents an in-depth investigation of neural network architectures for reconstructing energy deposits in ATLAS liquid argon calorimeter cells under the high-pileup conditions expected at the High-Luminosity Large Hadron Collider (HL-LHC). These networks are designed to operate on FPGA-based calorimeter readout hardware under strict size and latency constraints. Through Bayesian optimization, multiple architectures including Dense networks, Recurrent Neural Networks (RNNs), and Convolutional Neural Networks (CNNs) are optimized to balance energy resolution against network complexity. The optimized Dense, CNN, and Dense+RNN hybrid architectures achieve transverse energy resolution of approximately 80 MeV, significantly outperforming the currently employed Optimal Filtering (OF) method and RNNs of comparable complexity. Detailed comparisons across the full dynamic range demonstrate that Dense, CNN, and Dense+RNN architectures accurately reproduce energy scales, while OF and RNN systematically underestimate energies. Furthermore, Deep Evidential Regression (DER) is implemented within the Dense architecture to provide reliable per-event energy uncertainty estimates.

Research Background and Motivation

Problem Context

  1. High-Luminosity LHC Challenges: The HL-LHC upgrade (2026-2030) will produce up to 200 simultaneous proton-proton collisions per bunch crossing, so signals from successive and overlapping collisions pile up in each calorimeter cell
  2. Hardware Constraints: The ATLAS liquid argon calorimeter contains 182,468 cells, generating hundreds of terabytes of data per second, requiring specialized electronic boards for processing
  3. Latency Requirements: Energy reconstruction algorithms must complete within 125 ns to meet the fast response demands of the trigger system
  4. Limitations of Existing Methods: The currently employed Optimal Filtering (OF) algorithm exhibits significantly degraded performance under high-pileup conditions

Research Motivation

  • Advances in FPGA processing capabilities provide a unique opportunity to implement modern machine learning algorithms at early stages of the data processing pipeline
  • Development of new methods capable of operating under strict hardware constraints while outperforming OF algorithms
  • Implementation of per-event energy uncertainty estimation to enhance precision in subsequent data acquisition and reconstruction steps

Core Contributions

  1. Multi-Architecture Optimization: Proposes four neural network architectures (Dense, RNN, CNN, Dense+RNN) and tunes them with Bayesian optimization to balance energy resolution against network complexity
  2. Hardware-Constrained Objective Function: Designs a piecewise penalty objective function that accounts for the number of MAC units, effectively controlling network size
  3. Performance Enhancement: The optimized architectures achieve a transverse energy resolution of approximately 80 MeV, an ~8% improvement over the OF algorithm
  4. Uncertainty Quantification: First implementation of Deep Evidential Regression (DER) under FPGA constraints, providing per-event energy uncertainty estimates
  5. Full Dynamic Range Validation: Validates method effectiveness and energy scale accuracy across 0-130 GeV energy range

Methodology Details

Task Definition

Input: Digitized pulse sample sequences from calorimeter cells

  • 4 post-deposit samples (starting from the bunch crossing of target energy deposit)
  • Up to 28 pre-deposit samples (for correcting distortions from previous energy deposits)

Output: True transverse energy E_T^true at the specific bunch crossing

Constraints: Network size <500 MAC units, latency <125 ns
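The input layout described above can be sketched as a simple windowing step. This is a minimal illustration, not the authors' code: the helper `build_window`, the toy `stream`, and the assumption of one digitized sample per bunch crossing are all hypothetical.

```python
from typing import List, Sequence, Tuple

N_POST = 4      # post-deposit samples (from the task definition above)
N_PRE_MAX = 28  # maximum number of pre-deposit samples (from the task definition above)


def build_window(
    samples: Sequence[float], target_bc: int, n_pre: int = N_PRE_MAX
) -> Tuple[List[float], List[float]]:
    """Split a sample stream into (pre-deposit, post-deposit) lists.

    Assumes one sample per bunch crossing; `target_bc` is the index of the
    bunch crossing whose energy deposit is to be reconstructed.
    """
    if not 0 <= n_pre <= N_PRE_MAX:
        raise ValueError("n_pre must lie between 0 and 28")
    start = target_bc - n_pre
    stop = target_bc + N_POST
    if start < 0 or stop > len(samples):
        raise ValueError("not enough samples around the target bunch crossing")
    pre = list(samples[start:target_bc])   # samples before the deposit
    post = list(samples[target_bc:stop])   # samples starting at the deposit
    return pre, post


# Toy stream: the value at index i is simply float(i).
stream = [float(i) for i in range(40)]
pre, post = build_window(stream, target_bc=30)
print(len(pre), len(post), post[0])
```

The pre-deposit list carries the history needed to correct distortions from earlier deposits, while the post-deposit list covers the pulse of the target deposit itself.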

Model Architectures

1. CNN Architecture

  • Structure: Two convolutional layers + input/output layers
  • First Layer: 5 parallel 1D filters with kernel size 7, sliding over 25 input samples
  • Second Layer: 6 2D filters with kernel size 11×5, input 19×5
  • Output Layer: Single filter with kernel size 9×6
  • Advantage: Sliding window pattern enables reuse of previous computations, reducing latency
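The layer sizes quoted above can be checked with a short shape and MAC calculation. This sketch assumes stride 1, no padding, no bias terms in the count, and that only the newest output position of each convolutional layer has to be computed per bunch crossing (the sliding-window reuse mentioned above). Under those assumptions the total agrees with the MAC count listed for the CNN in the results table of this summary; whether the authors count MACs in exactly this way is not stated here.

```python
def valid_conv_length(n_in: int, kernel: int) -> int:
    """Output length of a stride-1 convolution without padding."""
    return n_in - kernel + 1


n_samples = 25                   # input samples seen by the first layer
l1_filters, l1_kernel = 5, 7     # first layer: 5 filters of size 7
l2_filters, l2_kernel = 6, 11    # second layer: 6 filters of size 11 x 5

l1_len = valid_conv_length(n_samples, l1_kernel)   # first-layer output is l1_len x 5
l2_len = valid_conv_length(l1_len, l2_kernel)      # second-layer output is l2_len x 6

# MACs needed for one new output position per layer (older positions are reused).
macs_l1 = l1_filters * l1_kernel                   # 5 filters x 7 taps
macs_l2 = l2_filters * l2_kernel * l1_filters      # 6 filters x (11 x 5) taps
macs_out = l2_len * l2_filters                     # single filter over l2_len x 6 values
total_macs = macs_l1 + macs_l2 + macs_out

print(l1_len, l2_len, macs_l1, macs_l2, macs_out, total_macs)
```

The intermediate lengths reproduce the "input 19×5" of the second layer and the "kernel size 9×6" of the output layer quoted in the list.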

2. RNN Architecture

  • Structure: Sequence of 5 RNN units + final dense layer
  • Units: Simple vanilla cells with dimension 8, ReLU activation
  • Characteristics: Computation synchronized with sample arrival, parameter sharing with limited reuse
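The same kind of arithmetic applies to the RNN. The sketch below assumes each vanilla cell receives one scalar sample per step, that bias additions are not counted as MACs, and that the final dense layer maps the 8-dimensional state to a single output. With those assumptions the total agrees with the MAC count listed for the RNN in the results table of this summary.

```python
n_steps = 5        # number of RNN units in the sequence
hidden_dim = 8     # dimension of each vanilla cell
input_dim = 1      # assumption: one digitized sample per step

# Per step: input-to-hidden plus hidden-to-hidden multiplications.
macs_per_step = input_dim * hidden_dim + hidden_dim * hidden_dim
macs_final_dense = hidden_dim * 1   # 8-dimensional state -> one energy value
rnn_total_macs = n_steps * macs_per_step + macs_final_dense

print(macs_per_step, rnn_total_macs)
```

The hidden-to-hidden product dominates each step, which illustrates why lengthening the sequence to cover many pre-deposit samples quickly becomes expensive for this architecture.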

3. Dense+RNN Architecture

  • Innovative Design: Dense layer processes pre-deposit samples to initialize RNN units
  • Advantage: Maintains RNN benefits while reducing computational cost for long sequences
  • Structure: Dense layer (pre-deposit) → RNN sequence (post-deposit) → final dense layer
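A forward pass through this structure might look like the following pure-Python sketch. It is an illustration only: the function `dense_rnn_forward`, the toy weights, the hidden size of 2, and the omission of bias terms are assumptions made for brevity and do not reflect the optimized network.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Sequence[float]) -> Vector:
    return [max(0.0, x) for x in v]


def matvec(w: Matrix, v: Sequence[float]) -> Vector:
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def dense_rnn_forward(
    pre: Sequence[float],   # pre-deposit samples
    post: Sequence[float],  # post-deposit samples, processed one per step
    w_pre: Matrix,          # dense layer: pre-deposit samples -> initial state
    w_in: Vector,           # input-to-hidden weights (scalar input per step)
    w_rec: Matrix,          # hidden-to-hidden weights
    w_out: Vector,          # final dense layer: state -> energy
) -> float:
    # The dense layer summarizes the pre-deposit history into the initial state.
    h = relu(matvec(w_pre, pre))
    # The RNN then steps over the post-deposit samples only.
    for x in post:
        rec = matvec(w_rec, h)
        h = relu([w_in[i] * x + rec[i] for i in range(len(h))])
    # The final dense layer reduces the state to a single value.
    return sum(wo * hi for wo, hi in zip(w_out, h))


# Toy weights chosen so the result is easy to follow by hand.
energy = dense_rnn_forward(
    pre=[1.0, 2.0, 3.0],
    post=[0.5, 1.0],
    w_pre=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    w_in=[1.0, 1.0],
    w_rec=[[1.0, 0.0], [0.0, 1.0]],
    w_out=[1.0, -1.0],
)
print(energy)
```

The recurrent loop runs only over the few post-deposit samples, while the longer pre-deposit history is absorbed by a single matrix-vector product.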

4. Staged Dense Architecture

  • Two-Stage Design:
    • Stage 1: Pre-deposit samples correct pulse distortions
    • Stage 2: Combine post-deposit samples to capture pulse shape
  • Latency Optimization: Stage 1 can be precomputed

Technical Innovations

1. Hardware-Constrained Objective Function

f(M,σ) = {
  σ̃                           if M ≤ 500
  σ̃ + 0.3(M̃ - 0.3)          if M ∈ ]500; 850]
  σ̃ + 0.3(M̃ - 0.3) + e^(M̃-0.65) - 1  otherwise
}
  • Piecewise penalty mechanism ensures network compliance with FPGA constraints
  • Balances energy resolution against computational complexity
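The piecewise function above translates directly into code. In this sketch the normalized quantities σ̃ and M̃ are passed in as arguments because the summary does not state how they are normalized. The helper `normalise_macs` used in the usage lines is purely hypothetical: it maps 500 MACs to 0.3 and 850 MACs to 0.65, a choice made here only because it makes the penalty continuous at both thresholds.

```python
import math


def objective(macs: int, sigma_norm: float, macs_norm: float) -> float:
    """Piecewise objective f(M, sigma) as written above.

    macs       -- raw MAC count M
    sigma_norm -- normalized resolution (sigma-tilde)
    macs_norm  -- normalized MAC count (M-tilde)
    """
    if macs <= 500:
        return sigma_norm                    # no penalty within the budget
    linear_penalty = 0.3 * (macs_norm - 0.3)
    if macs <= 850:
        return sigma_norm + linear_penalty   # mild linear penalty
    # Beyond 850 MACs an exponential term is added on top of the linear one.
    return sigma_norm + linear_penalty + math.exp(macs_norm - 0.65) - 1.0


def normalise_macs(macs: int) -> float:
    """Hypothetical normalization, NOT taken from the source."""
    return (macs - 200) / 1000.0


for m in (400, 600, 1000):
    print(m, round(objective(m, 0.5, normalise_macs(m)), 4))
```

Below the 500 MAC budget only the resolution matters, so the optimizer is free to trade size for accuracy there; above it, larger networks must buy a correspondingly better resolution to be preferred.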

2. Deep Evidential Regression (DER)

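  • NIG Distribution Parameterization: The network outputs the four parameters of a Normal-Inverse-Gamma (NIG) distribution: γ (expected value), ν (controls the epistemic variance), and α, β (determine the aleatoric variance)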
  • Uncertainty Decomposition: Aleatoric uncertainty + epistemic uncertainty
  • Implementation: Replace final dense layer with DenseNormalGamma layer
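In the standard Deep Evidential Regression formulation, the prediction and both uncertainty components follow in closed form from the four NIG parameters: the prediction is γ, the aleatoric variance is β/(α-1), and the epistemic variance is β/(ν(α-1)). The sketch below evaluates these expressions. The function name and the way the two components are combined into a total uncertainty (summing the variances) are assumptions for illustration; the summary does not state how the paper forms its per-event uncertainty.

```python
import math
from typing import Dict


def nig_uncertainties(gamma: float, nu: float, alpha: float, beta: float) -> Dict[str, float]:
    """Prediction and uncertainties from Normal-Inverse-Gamma parameters."""
    if nu <= 0.0 or alpha <= 1.0 or beta <= 0.0:
        raise ValueError("requires nu > 0, alpha > 1 and beta > 0")
    aleatoric_var = beta / (alpha - 1.0)          # expected noise variance
    epistemic_var = beta / (nu * (alpha - 1.0))   # variance of the predicted mean
    return {
        "prediction": gamma,
        "aleatoric": math.sqrt(aleatoric_var),
        "epistemic": math.sqrt(epistemic_var),
        # Assumption: total uncertainty as the quadrature sum of both parts.
        "total": math.sqrt(aleatoric_var + epistemic_var),
    }


result = nig_uncertainties(gamma=1.0, nu=2.0, alpha=3.0, beta=8.0)
print(result)
```

Because all quantities come from a single forward pass, the uncertainty estimate costs only a few extra output nodes rather than repeated network evaluations.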

Experimental Setup

Dataset

  • Simulation Tool: AREUS toolkit
  • Training Set: 1 million events
  • Validation Set: 1.5 million events
  • Test Set: 2.5 million events
  • Final Evaluation: 13 million independent events
  • Energy Range: 0-130 GeV uniform distribution (covering 80% of high-gain readout dynamic range)
  • Pileup Conditions: Average 200 simultaneous collisions (⟨μ⟩=200)

Evaluation Metrics

  • Primary Metric: Transverse energy resolution σ(E_T^pred - E_T^true)
  • Energy Scale: ⟨E_T^pred - E_T^true⟩ vs E_T^true
  • Uncertainty Assessment: Pull distribution (E_T^pred - E_T^true)/δ_pred
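These three metrics can be computed from arrays of predicted and true energies as sketched below. The function `evaluate` is hypothetical, and taking the resolution as the plain standard deviation of the residuals is an assumption; the paper may use a different estimator. For the energy scale, the same bias would be computed in bins of E_T^true rather than over the full sample.

```python
import statistics
from typing import Dict, Optional, Sequence


def evaluate(
    pred: Sequence[float],
    true: Sequence[float],
    delta: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Bias, resolution and (optionally) pull statistics of the residuals."""
    residuals = [p - t for p, t in zip(pred, true)]
    metrics = {
        "bias": statistics.fmean(residuals),         # <E_T^pred - E_T^true>
        "resolution": statistics.pstdev(residuals),  # sigma(E_T^pred - E_T^true)
    }
    if delta is not None:
        # Pull: residual divided by the predicted per-event uncertainty.
        pulls = [r / d for r, d in zip(residuals, delta)]
        metrics["pull_mean"] = statistics.fmean(pulls)
        metrics["pull_std"] = statistics.pstdev(pulls)
    return metrics


metrics = evaluate(pred=[1.0, 3.0], true=[2.0, 2.0], delta=[2.0, 2.0])
print(metrics)
```

A pull distribution with mean 0 and standard deviation 1 indicates well-calibrated uncertainties; a standard deviation below 1 means the predicted uncertainties are larger than the actual deviations.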

Comparison Methods

  • Baseline: Optimal Filtering (OF) algorithm
  • Network Comparisons: RNN, Dense, CNN, Dense+RNN

Implementation Details

  • Framework: TensorFlow Keras
  • Optimization: Bayesian optimization, 30-100 iterations
  • Surrogate Model: 5/2 Matérn kernel Gaussian process
  • Acquisition Function: Expected Improvement criterion
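The two ingredients named above have standard closed forms, sketched here with the Python standard library only. The function names, the length scale, and the exploration parameter `xi` are illustrative assumptions; the summary does not give the settings used in the paper, and a full Gaussian process fit is omitted.

```python
import math


def matern52(r: float, length_scale: float = 1.0) -> float:
    """Matern 5/2 kernel value for two points at distance r."""
    s = math.sqrt(5.0) * abs(r) / length_scale
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def expected_improvement(mu: float, sigma: float, best: float, xi: float = 0.0) -> float:
    """Expected Improvement for a minimization problem.

    mu, sigma -- surrogate mean and standard deviation at the candidate point
    best      -- lowest objective value observed so far
    """
    gain = best - mu - xi
    if sigma <= 0.0:
        return max(gain, 0.0)   # no uncertainty: improvement is deterministic
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + sigma * pdf


print(matern52(0.0), matern52(1.0))
print(expected_improvement(mu=1.0, sigma=0.5, best=1.2))
```

Expected Improvement rewards candidates that either have a low predicted objective or a large predictive uncertainty, which is what lets the search move from exploration to refinement as evaluations accumulate.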

Experimental Results

Main Results

Energy Resolution Comparison

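| Architecture | Energy Resolution (MeV) | MAC Units | Improvement Relative to OF |
| --- | --- | --- | --- |
| OF | ~90 | - | - |
| RNN | ~90 | 368 | 0% |
| Dense | ~80 | 240 | ~11% |
| CNN | ~80 | 419 | ~11% |
| Dense+RNN | ~80 | 392 | ~11% |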

Energy Scale Accuracy

  • Dense, CNN, Dense+RNN: Accurately reproduce energy scales with near-zero bias
  • OF: Systematic energy underestimation (expected by design, since OF does not account for the average contribution from simultaneous pileup)
  • RNN: Slight underestimation at low energies, increasing bias at high energies

Ablation Studies

Importance of Pre-Deposit Samples

  • All optimized networks (except RNN) utilize >20 pre-deposit samples
  • Demonstrates critical importance of capturing distortions from previous energy deposits
  • RNN constrained by high computational cost of long sequences

Network Size Optimization

Bayesian optimization process reveals:

  • Significant network size reduction after initial 10 random evaluations
  • Energy resolution recovery and network size stabilization after 20 evaluations
  • Marginal improvements in subsequent 100 evaluations

DER Uncertainty Analysis

Pull Distribution Characteristics

  • Mean: -0.06 (near zero; with the pull defined as (E_T^pred - E_T^true)/δ_pred, this corresponds to a slight tendency to underestimate the energy)
  • Standard Deviation: 0.75 (slight overestimation of uncertainty)
  • Overall uncertainty estimates consistent with true deviations

Uncertainty Decomposition

  • Epistemic Uncertainty: Dominant (72-79 MeV)
  • Aleatoric Uncertainty: Minor (30-42 MeV)
  • 99% of events within narrow band, indicating stable model predictions

Related Work

Neural Networks on FPGAs

  • Rapid growth in FPGA neural network applications in LHC experiments
  • Successful cases of trigger algorithm replacement
  • Emerging applications in raw detector data processing

Calorimeter Energy Reconstruction

  • Traditional OF algorithms show degraded performance under high pileup
  • Previous studies limited to 0-5 GeV range and simplified simulations
  • This work extends to larger dynamic range and more realistic simulations

Uncertainty Quantification

  • Bayesian neural networks computationally prohibitive
  • DER provides practical uncertainty estimation method
  • First application under FPGA constraints

Conclusions and Discussion

Main Conclusions

  1. Performance Enhancement: Dense and CNN architectures achieve an ~8% improvement in energy resolution relative to the OF algorithm
  2. Hardware Feasibility: All optimized networks <500 MAC units, satisfying FPGA constraints
  3. Energy Scale: Neural networks accurately reproduce energy scales across full dynamic range
  4. Uncertainty Estimation: DER successfully provides per-event uncertainty estimates

Limitations

  1. Single Cell: Study limited to individual calorimeter cells
  2. Ideal Triggering: Assumes perfect hard scattering event detection
  3. High Gain: Considers only high-gain readout configuration
  4. Anomaly Detection: Current uncertainty estimation struggles to identify reconstruction anomalies

Future Directions

  1. Multi-Cell Extension: Extend to joint processing of multiple calorimeter cells
  2. Trigger Integration: Combine with bunch crossing assignment functionality
  3. Anomaly Detection: Explore handling of noise bursts and non-uniform bunch structures
  4. Architecture Refinement: Larger training datasets and refined architectures

In-Depth Evaluation

Strengths

  1. High Practical Value: Directly addresses HL-LHC requirements with strict hardware constraints
  2. Comprehensive Methodology: Systematic comparison of multiple architectures with Bayesian optimization ensuring fair comparison
  3. Innovative Design: Dense+RNN architecture cleverly balances performance and computational cost
  4. Uncertainty Quantification: First DER implementation under FPGA constraints with significant practical value
  5. Thorough Validation: Full dynamic range validation with large-scale independent test sets

Limitations

  1. Scope Constraints: Limited to single specific calorimeter cell location
  2. Simplified Assumptions: Ideal triggering assumptions may diverge from practical applications
  3. Anomaly Handling: Limited capability for handling reconstruction anomalies
  4. Generalization: Generalization across different locations and conditions insufficiently verified

Impact

  1. Technical Contribution: Provides novel solutions for real-time data processing in high-energy physics experiments
  2. Methodology: Hardware-constrained optimization methods generalizable to other FPGA applications
  3. Practical Value: Directly serves ATLAS experiment upgrades with significant engineering value
  4. Interdisciplinary: Promotes deep integration of machine learning with high-energy physics instrumentation

Applicable Scenarios

  1. High-Energy Physics: Similar calorimeter energy reconstruction tasks
  2. Real-Time Systems: Signal processing applications requiring low latency and high precision
  3. FPGA Applications: Neural network deployment in resource-constrained environments
  4. Uncertainty Quantification: Engineering applications requiring real-time uncertainty estimation

References

This paper cites 28 important references covering ATLAS experiment design, LHC upgrade plans, FPGA neural network implementations, and Deep Evidential Regression theory, providing solid theoretical and technical foundations for the research.


Overall Assessment: This is a high-quality applied research paper achieving good balance between theoretical innovation and engineering practice. The research directly serves major scientific facility upgrade requirements with well-designed methodology and comprehensive experimental validation, offering significant value to both high-energy physics experiments and FPGA application domains.