Optimised neural networks for online processing of ATLAS calorimeter data on FPGAs

Aad, Bertrand, Laatu et al.

A study of neural network architectures for the reconstruction of the energy deposited in the cells of the ATLAS liquid-argon calorimeters under high pile-up conditions expected at the HL-LHC is presented. These networks are designed to run on the FPGA-based readout hardware of the calorimeters under strict size and latency constraints. Several architectures, including Dense, Recurrent (RNN), and Convolutional (CNN) neural networks, are optimised using a Bayesian procedure that balances energy resolution against network size. The optimised Dense, CNN, and combined Dense+RNN architectures achieve a transverse energy resolution of approximately 80 MeV, outperforming both the optimal filtering (OF) method currently in use and RNNs of similar complexity. A detailed comparison across the full dynamic range shows that Dense, CNN, and Dense+RNN accurately reproduce the energy scale, while OF and RNNs underestimate the energy. Deep Evidential Regression is implemented within the Dense architecture to address the need for reliable per-event energy uncertainties. This approach provides predictive uncertainty estimates with minimal increase in network size. The predicted uncertainty is found to be consistent, on average, with the difference between the true deposited energy and the predicted energy.

Basic Information

Paper ID: 2510.11469
Title: Optimised neural networks for online processing of ATLAS calorimeter data on FPGAs
Authors: Georges Aad, Raphaël Bertrand, Lauri Laatu, Emmanuel Monnier, Arno Straessner, Nairit Sur, Johann C. Voigt
Classification: physics.ins-det (Physics - Instrumentation and Detectors)
Publication Date: October 13, 2025
Paper Link: https://arxiv.org/abs/2510.11469v1

Abstract

This study presents an in-depth investigation of neural network architectures for reconstructing energy deposits in ATLAS liquid argon calorimeter cells under the high-pileup conditions expected at the High-Luminosity Large Hadron Collider (HL-LHC). These networks are designed to operate on FPGA-based calorimeter readout hardware under strict size and latency constraints. Through Bayesian optimization, multiple architectures including Dense networks, Recurrent Neural Networks (RNNs), and Convolutional Neural Networks (CNNs) are optimized to balance energy resolution against network complexity. The optimized Dense, CNN, and Dense+RNN hybrid architectures achieve transverse energy resolution of approximately 80 MeV, significantly outperforming the currently employed Optimal Filtering (OF) method and RNNs of comparable complexity. Detailed comparisons across the full dynamic range demonstrate that Dense, CNN, and Dense+RNN architectures accurately reproduce energy scales, while OF and RNN systematically underestimate energies. Furthermore, Deep Evidential Regression (DER) is implemented within the Dense architecture to provide reliable per-event energy uncertainty estimates.

Research Background and Motivation

Problem Context

High-Luminosity LHC Challenges: The HL-LHC upgrade (2026-2030) will produce up to 200 simultaneous proton-proton collisions per bunch crossing, resulting in severe signal pileup
Hardware Constraints: The ATLAS liquid argon calorimeter contains 182,468 cells, generating hundreds of terabytes of data per second, requiring specialized electronic boards for processing
Latency Requirements: Energy reconstruction algorithms must complete within 125 ns to meet the fast response demands of the trigger system
Limitations of Existing Methods: The currently employed Optimal Filtering (OF) algorithm exhibits significantly degraded performance under high-pileup conditions

Research Motivation

Advances in FPGA processing capabilities provide a unique opportunity to implement modern machine learning algorithms at early stages of the data processing pipeline
Development of new methods that operate under strict hardware constraints while outperforming the OF algorithm
Implementation of per-event energy uncertainty estimation to enhance precision in subsequent data acquisition and reconstruction steps

Core Contributions

Multi-Architecture Optimization: Proposes and optimizes four neural network architectures (Dense, RNN, CNN, Dense+RNN) through Bayesian optimization to achieve optimal balance between energy resolution and network complexity
Hardware-Constrained Objective Function: Designs a piecewise penalty objective function accounting for MAC unit counts, effectively controlling network size
Performance Enhancement: Optimal architectures achieve approximately 80 MeV transverse energy resolution, representing ~8% improvement over OF algorithm
Uncertainty Quantification: First implementation of Deep Evidential Regression (DER) under FPGA constraints, providing per-event energy uncertainty estimates
Full Dynamic Range Validation: Validates method effectiveness and energy scale accuracy across 0-130 GeV energy range

Methodology Details

Task Definition

Input: Digitized pulse sample sequences from calorimeter cells

4 post-deposit samples (starting from the bunch crossing of target energy deposit)
Up to 28 pre-deposit samples (for correcting distortions from previous energy deposits)

Output: Estimate of the true transverse energy $E_T^{true}$ at the specific bunch crossing

Constraints: Network size <500 MAC units, latency <125 ns
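The input layout described above can be sketched as a simple slicing step. This is only an illustration: the function name `split_window`, the list-based sample buffer, and the integer stand-in data are assumptions, not part of the paper.

```python
def split_window(samples, target_index, n_pre=28, n_post=4):
    """Return (pre, post) sample lists around the target bunch crossing.

    n_pre defaults to the maximum pre-deposit count and n_post to the
    post-deposit count quoted in the task definition. The helper itself
    is hypothetical.
    """
    if target_index - n_pre < 0 or target_index + n_post > len(samples):
        raise ValueError("not enough samples around the target bunch crossing")
    # Samples strictly before the target crossing carry earlier deposits.
    pre = samples[target_index - n_pre:target_index]
    # The post-deposit window starts at the target crossing itself.
    post = samples[target_index:target_index + n_post]
    return pre, post


samples = list(range(40))  # stand-in for digitised pulse samples
pre, post = split_window(samples, target_index=30)
print(len(pre), len(post))  # 28 4
print(post)                 # [30, 31, 32, 33]
```

A network that uses fewer pre-deposit samples would simply be called with a smaller `n_pre`.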

Model Architectures

1. CNN Architecture

Structure: Two convolutional layers + input/output layers
First Layer: 5 parallel 1D filters with kernel size 7, sliding over 25 input samples
Second Layer: 6 2D filters with kernel size 11×5, input 19×5
Output Layer: Single filter with kernel size 9×6
Advantage: Sliding window pattern enables reuse of previous computations, reducing latency
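The layer shapes above can be cross-checked with a few lines of arithmetic. The sketch below assumes stride-1 convolutions without padding, and assumes that MAC units are counted per newly arrived sample, so that each sliding layer only computes its newest position and reuses the rest. Bias additions and activations are not counted. The function names are hypothetical.

```python
def conv_output_length(n_in, kernel):
    """Output length of a stride-1 convolution without padding."""
    return n_in - kernel + 1


def cnn_shapes_and_macs(n_inputs=25, f1=5, k1=7, f2=6, k2=11):
    """Return (layer-1 shape, layer-2 shape, MACs per new sample)."""
    len1 = conv_output_length(n_inputs, k1)  # 25 - 7 + 1 = 19
    len2 = conv_output_length(len1, k2)      # 19 - 11 + 1 = 9
    # Assumed counting: only the newest sliding position is evaluated.
    macs_layer1 = f1 * k1                    # 5 filters x 7 taps
    macs_layer2 = f2 * (k2 * f1)             # 6 filters x (11 x 5) taps
    macs_output = len2 * f2                  # single 9 x 6 output filter
    total = macs_layer1 + macs_layer2 + macs_output
    return (len1, f1), (len2, f2), total


print(cnn_shapes_and_macs())  # ((19, 5), (9, 6), 419)
```

Under these assumptions the intermediate shapes reproduce the 19×5 and 9×6 sizes quoted above, and the total agrees with the MAC count listed for the CNN in the results table.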

2. RNN Architecture

Structure: Sequence of 5 RNN units + final dense layer
Units: Simple vanilla cells with dimension 8, ReLU activation
Characteristics: Computation synchronized with sample arrival, parameter sharing with limited reuse
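A rough MAC count for this structure can be written down directly. The sketch assumes one scalar input sample per cell, no counting of bias additions, and a single scalar output from the final dense layer. The function name is hypothetical.

```python
def vanilla_rnn_macs(n_cells, hidden_dim, input_dim=1, output_dim=1):
    """MACs for a chain of vanilla RNN cells followed by a dense layer."""
    # Each cell multiplies the input and the previous state by weight matrices.
    per_cell = hidden_dim * input_dim + hidden_dim * hidden_dim
    # The final dense layer maps the last state to the output.
    dense = hidden_dim * output_dim
    return n_cells * per_cell + dense


print(vanilla_rnn_macs(n_cells=5, hidden_dim=8))  # 368
```

With 5 cells of dimension 8 this gives 5 × (8 + 64) + 8 = 368, which agrees with the MAC count listed for the RNN in the results table. The quadratic `hidden_dim * hidden_dim` term per cell is why long input sequences are costly for this architecture.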

3. Dense+RNN Architecture

Innovative Design: Dense layer processes pre-deposit samples to initialize RNN units
Advantage: Maintains RNN benefits while reducing computational cost for long sequences
Structure: Dense layer (pre-deposit) → RNN sequence (post-deposit) → final dense layer
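The data flow of this hybrid can be sketched in plain Python. This is a minimal illustration of the structure described above, not the paper's implementation: the layer sizes, the random weights, the omission of biases, and all function names are assumptions.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def dense_rnn_forward(pre, post, w_init, w_in, w_rec, w_out):
    # Dense layer: compress all pre-deposit samples into the initial state.
    state = relu(matvec(w_init, pre))
    # Vanilla RNN cells: one step per post-deposit sample.
    for sample in post:
        recurrent = matvec(w_rec, state)
        state = relu([w_in[i] * sample + recurrent[i] for i in range(len(state))])
    # Final dense layer: map the last state to a single energy estimate.
    return sum(w * s for w, s in zip(w_out, state))


def rand_vector(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


random.seed(0)
hidden, n_pre, n_post = 8, 20, 4  # hypothetical sizes
w_init = [rand_vector(n_pre) for _ in range(hidden)]
w_in = rand_vector(hidden)
w_rec = [rand_vector(hidden) for _ in range(hidden)]
w_out = rand_vector(hidden)
energy = dense_rnn_forward(rand_vector(n_pre), rand_vector(n_post),
                           w_init, w_in, w_rec, w_out)
print(energy)
```

The point of the design is visible in the loop: the recurrent cost is paid only for the few post-deposit samples, while the long pre-deposit history is handled by a single matrix-vector product.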

4. Staged Dense Architecture

Two-Stage Design:
- Stage 1: Pre-deposit samples correct pulse distortions
- Stage 2: Combine post-deposit samples to capture pulse shape
Latency Optimization: Stage 1 can be precomputed

Technical Innovations

1. Hardware-Constrained Objective Function

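$$
f(M,\sigma) =
\begin{cases}
\tilde{\sigma} & \text{if } M \le 500 \\
\tilde{\sigma} + 0.3\,(\tilde{M} - 0.3) & \text{if } 500 < M \le 850 \\
\tilde{\sigma} + 0.3\,(\tilde{M} - 0.3) + e^{\tilde{M} - 0.65} - 1 & \text{otherwise}
\end{cases}
$$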

Piecewise penalty mechanism ensures network compliance with FPGA constraints
Balances energy resolution against computational complexity
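The piecewise objective can be written as a short function. The summary does not define the normalised quantities, so the sketch makes two loudly labelled assumptions: the normalised resolution is passed in directly, and the MAC count is min-max scaled so that 500 maps to 0.3 and 850 maps to 0.65 (here with hypothetical bounds of 200 and 1200). The paper may use a different normalisation.

```python
import math


def normalise_macs(m, m_min=200.0, m_max=1200.0):
    """ASSUMED min-max scaling, chosen so 500 -> 0.3 and 850 -> 0.65."""
    return (m - m_min) / (m_max - m_min)


def objective(m, sigma_norm):
    """Piecewise penalty on network size added to the normalised resolution."""
    if m <= 500:
        # Within budget: only the resolution matters.
        return sigma_norm
    m_norm = normalise_macs(m)
    linear_penalty = 0.3 * (m_norm - 0.3)
    if m <= 850:
        # Moderately oversized: linear penalty.
        return sigma_norm + linear_penalty
    # Strongly oversized: additional exponential penalty.
    return sigma_norm + linear_penalty + math.exp(m_norm - 0.65) - 1.0


for macs in (400, 700, 1000):
    print(macs, round(objective(macs, sigma_norm=0.5), 4))
```

With this scaling the three branches join continuously at 500 and 850 MAC units, so the optimiser sees a smooth push toward smaller networks rather than a hard cutoff.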

2. Deep Evidential Regression (DER)

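NIG Distribution Parameterization: The network outputs the four parameters of a Normal-Inverse-Gamma (NIG) distribution: γ (expected value), ν (controls the epistemic variance), and α, β (control the aleatoric variance)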
Uncertainty Decomposition: Aleatoric uncertainty + epistemic uncertainty
Implementation: Replace final dense layer with DenseNormalGamma layer
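In the standard formulation of Deep Evidential Regression, the prediction and both uncertainty components follow in closed form from the four NIG parameters. The sketch below uses those standard expressions. Whether the paper combines or rescales them in exactly this way is not stated in this summary, and the function name is hypothetical.

```python
def nig_summary(gamma, nu, alpha, beta):
    """Return (prediction, aleatoric variance, epistemic variance).

    Standard DER expressions:
      prediction         E[mu]      = gamma
      aleatoric variance E[sigma^2] = beta / (alpha - 1)
      epistemic variance Var[mu]    = beta / (nu * (alpha - 1))
    """
    if alpha <= 1.0 or nu <= 0.0 or beta <= 0.0:
        raise ValueError("requires alpha > 1, nu > 0 and beta > 0")
    aleatoric_var = beta / (alpha - 1.0)
    epistemic_var = beta / (nu * (alpha - 1.0))
    return gamma, aleatoric_var, epistemic_var


print(nig_summary(gamma=10.0, nu=2.0, alpha=3.0, beta=4.0))  # (10.0, 2.0, 1.0)
```

Because all three quantities come from a single forward pass, the extra cost over a plain regression head is limited to three additional output nodes, which fits the small increase in network size described above.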

Experimental Setup

Dataset

Simulation Tool: AREUS toolkit
Training Set: 1 million events
Validation Set: 1.5 million events
Test Set: 2.5 million events
Final Evaluation: 13 million independent events
Energy Range: 0-130 GeV uniform distribution (covering 80% of high-gain readout dynamic range)
Pileup Conditions: Average 200 simultaneous collisions (⟨μ⟩=200)

Evaluation Metrics

Primary Metric: Transverse energy resolution σ(E_T^pred - E_T^true)
Energy Scale: ⟨E_T^pred - E_T^true⟩ vs E_T^true
Uncertainty Assessment: Pull distribution (E_T^pred - E_T^true)/δ_pred
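These three metrics can be computed from lists of predictions, true values, and predicted uncertainties. The sketch takes the resolution to be the standard deviation of the residuals, which is an assumption (the paper may use a different estimator of the width), and the toy numbers are invented for illustration only.

```python
import statistics


def resolution_and_bias(pred, true):
    """Return (std of residuals, mean of residuals)."""
    residuals = [p - t for p, t in zip(pred, true)]
    return statistics.pstdev(residuals), statistics.fmean(residuals)


def pulls(pred, true, delta):
    """Residual divided by the predicted uncertainty, per event."""
    return [(p - t) / d for p, t, d in zip(pred, true, delta)]


# Invented toy values, in GeV.
true = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
delta = [0.1, 0.1, 0.2, 0.2]

resolution, bias = resolution_and_bias(pred, true)
pull_values = pulls(pred, true, delta)
print(round(resolution, 4), round(bias, 4))
print([round(v, 2) for v in pull_values])
```

For well-calibrated uncertainties the pull distribution should have a mean near 0 and a standard deviation near 1. A width below 1 indicates that the predicted uncertainties are too large on average.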

Comparison Methods

Baseline: Optimal Filtering (OF) algorithm
Network Comparisons: RNN, Dense, CNN, Dense+RNN

Implementation Details

Framework: TensorFlow Keras
Optimization: Bayesian optimization, 30-100 iterations
Surrogate Model: 5/2 Matérn kernel Gaussian process
Acquisition Function: Expected Improvement criterion
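The two ingredients named above have simple closed forms. The sketch below gives the Matérn 5/2 covariance as a function of distance and the Expected Improvement for a minimisation problem, given a Gaussian-process posterior mean and standard deviation. It is a standalone illustration with hypothetical function names, not the optimisation code used in the paper.

```python
import math


def matern52(r, length_scale=1.0):
    """Matern 5/2 kernel value for two points at distance r."""
    s = math.sqrt(5.0) * abs(r) / length_scale
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def expected_improvement(mu, sigma, best):
    """Expected Improvement over the best (lowest) objective value so far."""
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


print(matern52(0.0))                                   # 1.0
print(round(expected_improvement(0.5, 0.1, 0.5), 4))   # 0.0399
```

Expected Improvement grows both when the predicted objective is low and when the surrogate is uncertain, which is how the search trades off refining good architectures against exploring untested ones.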

Experimental Results

Main Results

Energy Resolution Comparison

Architecture	Energy Resolution (MeV)	MAC Units	Improvement over OF
OF	~90	-	-
RNN	~90	368	0%
Dense	~80	240	~11%
CNN	~80	419	~11%
Dense+RNN	~80	392	~11%

Energy Scale Accuracy

Dense, CNN, Dense+RNN: Accurately reproduce energy scales with near-zero bias
OF: Systematic energy underestimation (expected by design, because OF does not include the average contribution from simultaneous pileup)
RNN: Slight underestimation at low energies, increasing bias at high energies

Ablation Studies

Importance of Pre-Deposit Samples

All optimized networks (except RNN) utilize >20 pre-deposit samples
Demonstrates critical importance of capturing distortions from previous energy deposits
RNN constrained by high computational cost of long sequences

Network Size Optimization

Bayesian optimization process reveals:

Significant network size reduction after initial 10 random evaluations
Energy resolution recovery and network size stabilization after 20 evaluations
Marginal improvements in subsequent 100 evaluations

DER Uncertainty Analysis

Pull Distribution Characteristics

Mean: -0.06 (near zero; with the pull defined as (E_T^pred - E_T^true)/δ_pred, the negative sign indicates a slight tendency to underestimate the energy)
Standard Deviation: 0.75 (slight overestimation of uncertainty)
Overall uncertainty estimates consistent with true deviations

Uncertainty Decomposition

Epistemic Uncertainty: Dominant (72-79 MeV)
Aleatoric Uncertainty: Minor (30-42 MeV)
99% of events within narrow band, indicating stable model predictions

Related Work

Neural Networks on FPGAs

Rapid growth in FPGA neural network applications in LHC experiments
Successful cases of trigger algorithm replacement
Emerging applications in raw detector data processing

Calorimeter Energy Reconstruction

Traditional OF algorithms show degraded performance under high pileup
Previous studies limited to 0-5 GeV range and simplified simulations
This work extends to larger dynamic range and more realistic simulations

Uncertainty Quantification

Bayesian neural networks computationally prohibitive
DER provides practical uncertainty estimation method
First application under FPGA constraints

Conclusions and Discussion

Main Conclusions

Performance Enhancement: Dense and CNN architectures achieve ~8% energy resolution improvement
Hardware Feasibility: All optimized networks <500 MAC units, satisfying FPGA constraints
Energy Scale: Neural networks accurately reproduce energy scales across full dynamic range
Uncertainty Estimation: DER successfully provides per-event uncertainty estimates

Limitations

Single Cell: Study limited to individual calorimeter cells
Ideal Triggering: Assumes perfect hard scattering event detection
High Gain: Considers only high-gain readout configuration
Anomaly Detection: Current uncertainty estimation struggles to identify reconstruction anomalies

Future Directions

Multi-Cell Extension: Extend to joint processing of multiple calorimeter cells
Trigger Integration: Combine with bunch crossing assignment functionality
Anomaly Detection: Explore handling of noise bursts and non-uniform bunch structures
Architecture Refinement: Larger training datasets and refined architectures

In-Depth Evaluation

Strengths

High Practical Value: Directly addresses HL-LHC requirements with strict hardware constraints
Comprehensive Methodology: Systematic comparison of multiple architectures with Bayesian optimization ensuring fair comparison
Innovative Design: Dense+RNN architecture cleverly balances performance and computational cost
Uncertainty Quantification: First DER implementation under FPGA constraints with significant practical value
Thorough Validation: Full dynamic range validation with large-scale independent test sets

Limitations

Scope Constraints: Limited to single specific calorimeter cell location
Simplified Assumptions: Ideal triggering assumptions may diverge from practical applications
Anomaly Handling: Limited capability for handling reconstruction anomalies
Generalization: Generalization across different locations and conditions insufficiently verified

Impact

Technical Contribution: Provides novel solutions for real-time data processing in high-energy physics experiments
Methodology: Hardware-constrained optimization methods generalizable to other FPGA applications
Practical Value: Directly serves ATLAS experiment upgrades with significant engineering value
Interdisciplinary: Promotes deep integration of machine learning with high-energy physics instrumentation

Applicable Scenarios

High-Energy Physics: Similar calorimeter energy reconstruction tasks
Real-Time Systems: Signal processing applications requiring low latency and high precision
FPGA Applications: Neural network deployment in resource-constrained environments
Uncertainty Quantification: Engineering applications requiring real-time uncertainty estimation

References

This paper cites 28 important references covering ATLAS experiment design, LHC upgrade plans, FPGA neural network implementations, and Deep Evidential Regression theory, providing solid theoretical and technical foundations for the research.

Overall Assessment: This is a high-quality applied research paper achieving good balance between theoretical innovation and engineering practice. The research directly serves major scientific facility upgrade requirements with well-designed methodology and comprehensive experimental validation, offering significant value to both high-energy physics experiments and FPGA application domains.