An extraction of the Collins-Soper kernel from a joint analysis of experimental and lattice data

Avkhadiev, Bertone, Bissolotti et al.
We present a first joint extraction of the Collins-Soper kernel (CSK) combining experimental and lattice QCD data in the context of an analysis of transverse-momentum-dependent distributions (TMDs). Based on a neural-network parametrization, we perform a Bayesian reweighting of an existing fit of TMDs using lattice data, as well as a joint TMD fit to lattice and experimental data. We consistently find that the inclusion of lattice information shifts the central value of the CSK by approximately 10% and reduces its uncertainty by 40-50%, highlighting the potential of lattice inputs to improve TMD extractions.

Basic Information

  • Paper ID: 2510.26489
  • Title: An extraction of the Collins-Soper kernel from a joint analysis of experimental and lattice data
  • Authors: Artur Avkhadiev, Valerio Bertone, Chiara Bissolotti, Matteo Cerutti, Yang Fu, Simone Rodini, Phiala Shanahan, Michael Wagman, Yong Zhao
  • Institutions: MIT, Argonne National Laboratory, CEA Saclay, DESY, University of Pavia, Fermilab
  • Classification: hep-ph (High Energy Physics - Phenomenology), hep-ex (High Energy Physics - Experiment), hep-lat (High Energy Physics - Lattice)
  • Publication Date: October 30, 2025
  • Paper Link: https://arxiv.org/abs/2510.26489

Abstract

This paper presents the first joint analysis of experimental and lattice QCD data for extracting the Collins-Soper kernel (CSK) within the framework of transverse momentum dependent (TMD) distributions. Based on neural network parameterization methods, the researchers perform Bayesian reweighting of existing TMD fits and execute a joint fit. The study consistently finds that incorporating lattice information shifts the CSK central value by approximately 10%, while reducing uncertainties by 40-50%, highlighting the potential of lattice input to improve TMD extraction.

Research Background and Motivation

Research Problem

The Collins-Soper kernel (CSK) is a crucial non-perturbative quantity describing the evolution of transverse momentum dependent distributions (TMDs). It encodes information about the QCD vacuum rather than the internal structure of hadrons. The behavior of CSK at large transverse distances b (non-perturbative region) cannot be calculated through perturbative QCD and must be extracted from data.

Importance of the Problem

  1. TMD Precision Enhancement: Accurate determination of CSK is crucial for reducing model dependence in TMD evolution
  2. Multi-process Predictions: Improved CSK can enhance QCD prediction accuracy for various processes
  3. Factorization Tests: Can be used to test the validity of factorization in the non-perturbative region
  4. Absence of Direct Experimental Observation: No experimental observable is directly sensitive to CSK; lattice QCD becomes the only direct source of information

Limitations of Existing Methods

  1. Pure Phenomenological Extraction: Traditional methods extract CSK only indirectly from experimental data (e.g., Drell-Yan processes), with relatively large uncertainties
  2. Model Dependence: The non-perturbative part typically employs simple parameterizations (e.g., g_K(b) = 2g_2²b²), lacking first-principles constraints
  3. Underutilization of Lattice Data: Although lattice QCD can directly calculate CSK, it has not been systematically incorporated into phenomenological analyses

Research Motivation

Drawing on the successful experience of combining lattice and experimental data in parton distribution function (PDF) extraction, this paper applies this strategy to TMD extraction for the first time, particularly for CSK determination.

Core Contributions

  1. First Joint Analysis: Systematically incorporates lattice QCD data into TMD phenomenological extraction for the first time, achieving joint analysis of experimental and lattice data
  2. Dual Methodology Verification:
    • Bayesian reweighting of existing fits
    • Simultaneous fit of experimental and lattice data
    • High consistency between the two methods validates robustness
  3. Significant Precision Improvement:
    • CSK non-perturbative parameter g₂ central value shifts by ~10%
    • Uncertainty reduction of 40-50%
    • Improvement from g₂^Baseline = 0.186±0.033 to g₂^Fit = 0.167±0.015
  4. Tension-free Verification: Demonstrates absence of tension between lattice and experimental data, enabling self-consistent combination
  5. Methodological Innovation: Systematic approach to handling finite lattice spacing data, including simultaneous continuum limit extrapolation

Detailed Methodology

Task Definition

Inputs:

  • Experimental data: 482 Drell-Yan process data points (fixed-target, RHIC, Tevatron, LHCb, CMS, ATLAS)
  • Lattice data: 21 CSK data points from 3 different lattice spacings (a₁=0.15 fm, a₂=0.12 fm, a₃=0.09 fm)

Outputs:

  • CSK non-perturbative parameter g₂ and its uncertainty
  • Complete TMD distribution functions

Constraints:

  • Maintain quality of description for experimental data
  • Properly handle lattice spacing effects
  • Satisfy perturbative QCD constraints in the small-b region

Theoretical Framework of Collins-Soper Kernel

The CSK satisfies the renormalization group equation: \frac{\partial K(b,\mu)}{\partial \ln\mu} = -\gamma_K(\alpha_s(\mu))

where γ_K is the cusp anomalous dimension, known to four-loop precision. The solution is:

K(b,\mu) = K(b_*,\mu_{b_*}) - \int_{\mu_{b_*}}^{\mu} \frac{d\mu'}{\mu'}\gamma_K(\alpha_s(\mu')) - g_K(b)

Key Components:

  1. Perturbative Part: K(b_*, μ_{b_*}) and the integral term, calculable in perturbation theory
  2. Non-perturbative Part: g_K(b), parameterized in this work as g_K(b) = 2g_2^2 b^2
  3. b_* Prescription: Keeps the perturbative calculation in the region where it is reliable (μ_{b_*} = 2e^{-γ_E}/b_*)
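
To make the roles of these components concrete, the following minimal sketch evaluates the non-perturbative term g_K(b) = 2g_2^2 b^2 and the scale μ_{b_*} = 2e^{-γ_E}/b_*. The functional form of b_*(b) is not given in this summary: the choice b_* = b/sqrt(1 + b²/b_max²) with b_max = 1 GeV⁻¹ is an assumption made only for illustration, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant gamma_E


def b_star(b, b_max=1.0):
    """Assumed b_* form: close to b at small b, saturating at b_max (GeV^-1)."""
    return b / math.sqrt(1.0 + (b / b_max) ** 2)


def mu_b_star(b, b_max=1.0):
    """Scale mu_{b_*} = 2 exp(-gamma_E) / b_*, in GeV."""
    return 2.0 * math.exp(-EULER_GAMMA) / b_star(b, b_max)


def g_K(b, g2):
    """Non-perturbative term g_K(b) = 2 g2^2 b^2 as quoted in the text."""
    return 2.0 * g2 ** 2 * b ** 2


# Illustrative evaluation at a few transverse distances (GeV^-1).
for b in (0.5, 2.0, 4.0):
    print(b, round(b_star(b), 3), round(mu_b_star(b), 3), round(g_K(b, 0.167), 4))
```

With this assumed form, b_* never exceeds b_max, so μ_{b_*} stays above 2e^{-γ_E}/b_max however large b becomes, while g_K(b) grows quadratically and carries the large-b behavior.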

Baseline TMD Fit

Based on the framework of Ref. 32:

  • Theoretical Precision: N³LL (next-to-next-to-next-to-leading logarithm)
  • Parameterization: Neural network (NN) for non-perturbative part
  • Dataset: Comprehensive Drell-Yan data
  • Baseline Result: g₂^Baseline = 0.186±0.033
  • MC Samples: N=250 Monte Carlo replicas

Lattice Data Treatment

Two Processing Methods:

  1. Continuum Limit Extrapolated Data:
    • n=21 data points {K_j±σ_j}
    • No point-to-point correlations
    • χ² calculation: \chi_\alpha^2 = \sum_{j=1}^{n} \left(\frac{K_j - K^{(\alpha)}(b_j)}{\sigma_j}\right)^2
  2. Finite Lattice Spacing Data:
    • Three subsets (n₁=6, n₂=7, n₃=8 points)
    • Each subset has covariance matrix Σᵢ
    • Requires correction for lattice spacing effects: g_K(b) = 2g_2^2 b^2 - k_1\frac{a}{b} - k_2\frac{a^2}{b^2}
    • χ² calculation includes correlations: \chi_\alpha^2 = \sum_{i=1}^{3}\sum_{j_i,l_i=1}^{n_i}(K_{j_i}-K^{(\alpha)}(b_{j_i}))(\Sigma_i^{-1})_{j_i,l_i}(K_{l_i}-K^{(\alpha)}(b_{l_i}))
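
The two χ² definitions above can be sketched in a few lines of NumPy. The arrays below are hypothetical toy numbers, not the paper's lattice data, and the function names are illustrative.

```python
import numpy as np


def chi2_uncorrelated(K_data, sigma, K_theory):
    """Sum of squared, sigma-normalized residuals (continuum-extrapolated points)."""
    r = (np.asarray(K_data) - np.asarray(K_theory)) / np.asarray(sigma)
    return float(np.sum(r ** 2))


def chi2_correlated(K_data, cov, K_theory):
    """r^T Sigma^{-1} r for one lattice-spacing subset with covariance Sigma."""
    r = np.asarray(K_data) - np.asarray(K_theory)
    # Solve Sigma x = r rather than forming the explicit inverse.
    return float(r @ np.linalg.solve(np.asarray(cov), r))


# Hypothetical three-point subset (toy values only).
K_data = np.array([-0.40, -0.55, -0.70])
K_theory = np.array([-0.42, -0.50, -0.73])
sigma = np.array([0.02, 0.05, 0.03])

cov = np.diag(sigma ** 2)
cov[0, 1] = cov[1, 0] = 0.5 * sigma[0] * sigma[1]  # toy correlation between two points

chi2_diag = chi2_uncorrelated(K_data, sigma, K_theory)
chi2_full = chi2_correlated(K_data, cov, K_theory)
print(chi2_diag, chi2_full)
```

For the finite-spacing data, the total χ² is the sum of `chi2_correlated` over the three subsets, each with its own covariance matrix. With a diagonal covariance the correlated form reduces to the uncorrelated one.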

Bayesian Reweighting Method

Principle: Assign weights to MC replicas of existing fits according to the likelihood of lattice data

Weight Calculation: w_\alpha = \mathcal{N}\left(\chi_\alpha^2\right)^{n/2-1}e^{-\chi_\alpha^2/2}

where \mathcal{N} is the normalization constant such that \sum_\alpha w_\alpha = N

Weighted Statistics: \langle g_2\rangle = \sum_{\alpha=1}^{N}w_\alpha g_2^{(\alpha)}, \quad \sigma_{g_2}^2 = \sum_{\alpha=1}^{N}w_\alpha(g_2^{(\alpha)}-\langle g_2\rangle)^2

Effective Replica Number: N_eff ≈ 150 (original N=250), indicating successful reweighting
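
A minimal sketch of the reweighting procedure follows, under stated assumptions: the weights are normalized to sum to 1 (if they are normalized to N instead, the averages carry an extra factor 1/N), the effective replica number uses the Shannon-entropy definition from the NNPDF reweighting literature (this summary does not say which definition the paper uses), and the replica ensemble and χ² values are synthetic.

```python
import numpy as np


def reweight(chi2, n):
    """Weights w ∝ (chi2)^(n/2 - 1) exp(-chi2 / 2), computed in log space."""
    chi2 = np.asarray(chi2, dtype=float)
    log_w = (n / 2.0 - 1.0) * np.log(chi2) - chi2 / 2.0
    log_w -= log_w.max()  # guard against underflow for large chi2
    w = np.exp(log_w)
    return w / w.sum()  # assumption: normalize so the weights sum to 1


def weighted_stats(w, x):
    """Weighted mean and standard deviation for weights summing to 1."""
    mean = np.sum(w * x)
    std = np.sqrt(np.sum(w * (x - mean) ** 2))
    return float(mean), float(std)


def n_eff_shannon(w):
    """Effective replica number exp(Shannon entropy) for weights summing to 1."""
    w = np.asarray(w, dtype=float)
    w = w[w > 0]
    return float(np.exp(-np.sum(w * np.log(w))))


# Synthetic ensemble: hypothetical g2 replicas and a made-up chi2 against n = 21 points.
rng = np.random.default_rng(0)
g2 = rng.normal(0.186, 0.033, size=250)
chi2 = 21.0 + ((g2 - 0.152) / 0.027) ** 2

w = reweight(chi2, n=21)
mean_rw, std_rw = weighted_stats(w, g2)
print(f"reweighted g2 = {mean_rw:.3f} +/- {std_rw:.3f}, N_eff = {n_eff_shannon(w):.0f}")
```

If all replicas describe the new data equally well, the weights are uniform and N_eff equals N. An N_eff much smaller than N would signal that only a few replicas survive and that a full refit is preferable.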

Simultaneous Fit Method

Setup:

  • Total data points: 503 (482 DY + 21 lattice)
  • Training-validation split: 50%-50% for experimental data, all lattice data used for training
  • Identical NN parameterization and fitting framework
  • Lattice data correlations handled via covariance matrix
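
The split described above can be sketched as follows. How the paper draws the split (per point or per dataset, and with which random seed) is not stated in this summary, so a random per-point split with a hypothetical seed is assumed here.

```python
import numpy as np


def split_indices(n_exp, n_lat, rng):
    """Return (train, valid) indices over a concatenated [experimental | lattice] dataset."""
    perm = rng.permutation(n_exp)
    half = n_exp // 2
    exp_train, exp_valid = perm[:half], perm[half:]
    # Lattice points are stored after the experimental ones and all go to training.
    lat_all = np.arange(n_exp, n_exp + n_lat)
    train = np.concatenate([exp_train, lat_all])
    return np.sort(train), np.sort(exp_valid)


rng = np.random.default_rng(seed=1)  # hypothetical seed
train_idx, valid_idx = split_indices(482, 21, rng)
print(len(train_idx), len(valid_idx))
```

Keeping every lattice point in the training set is a natural choice when the lattice sample is small: with only 21 points, holding half of them out would weaken the constraint they provide.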

Fit Parameters:

  • g₂: CSK non-perturbative parameter
  • k₁, k₂: Lattice spacing correction parameters (priors: k₁=0.22±0.08, k₂=0±0.1)
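
One common way to impose Gaussian priors in a χ² fit is to add a quadratic penalty term per constrained parameter. Whether the paper implements the k₁, k₂ priors this way is not stated in this summary, so the sketch below is an assumption, as are the function names. It also assumes that a and b must be expressed in the same units for a/b to be dimensionless, so the lattice spacing is converted from fm to GeV⁻¹.

```python
HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV*fm, used to convert fm to GeV^-1


def fm_to_inv_gev(a_fm):
    """Convert a length in fm to GeV^-1."""
    return a_fm / HBARC_GEV_FM


def g_K_lattice(b, a, g2, k1, k2):
    """g_K with lattice-spacing corrections: 2 g2^2 b^2 - k1 a/b - k2 a^2/b^2."""
    return 2.0 * g2 ** 2 * b ** 2 - k1 * a / b - k2 * (a / b) ** 2


def prior_penalty(k1, k2, k1_mean=0.22, k1_sigma=0.08, k2_mean=0.0, k2_sigma=0.1):
    """Gaussian-prior penalty that would be added to the data chi2 (assumed form)."""
    return ((k1 - k1_mean) / k1_sigma) ** 2 + ((k2 - k2_mean) / k2_sigma) ** 2


a = fm_to_inv_gev(0.12)  # the a2 = 0.12 fm ensemble
print(g_K_lattice(b=2.0, a=a, g2=0.167, k1=0.22, k2=0.0), prior_penalty(0.22, 0.0))
```

Setting a = 0 recovers the continuum form 2g_2^2 b^2, which is how fitting the three spacings together performs the continuum extrapolation at the same time.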

Experimental Setup

Dataset Details

Experimental Data (482 points):

  • Fixed-target experiments: 233 points
  • RHIC: 7 points
  • Tevatron: 71 points
  • LHCb: 21 points
  • CMS: 78 points
  • ATLAS: 72 points

Lattice QCD Data (21 points):

  • Source: Refs. 53, 54, first CSK lattice calculation with complete systematic error control
  • Includes continuum limit extrapolation (a→0)
  • Three lattice spacings:
    • a₁ = 0.15 fm (6 points)
    • a₂ = 0.12 fm (7 points)
    • a₃ = 0.09 fm (8 points)
  • Transverse distance range: b ∈ [0, 5] GeV⁻¹
  • Renormalization scale: μ = 2 GeV

Evaluation Metrics

  1. χ² Statistic: Measures agreement between theoretical prediction and data
  2. Average χ²: ⟨χ²⟩ averaged over MC ensemble
  3. Parameter Uncertainty: Determined from MC replica distribution
  4. Effective Replica Number: N_eff, assesses reweighting effectiveness

Comparison Methods

  1. Baseline: Pure experimental data fit (Ref. 32)
  2. Reweighting-continuum: Reweighting using continuum limit extrapolated lattice data
  3. Reweighting-finite: Reweighting using finite lattice spacing data
  4. Simultaneous fit: Joint fit of experimental and lattice data

Implementation Details

  • Neural Network Architecture: Identical to Ref. 32
  • MC Sample Size: N=250 (baseline), 10⁴ (for k₁,k₂ sampling in reweighting)
  • Perturbative Order: N³LL precision
  • b_* Prescription: Standard choice ensuring perturbative reliability
  • Prior Distributions: k₁∼Normal(0.22, 0.08), k₂∼Normal(0, 0.1)

Experimental Results

Main Results

Comparison of g₂ Parameter Extraction Results:

Method                  | g₂ Value    | Central Value Shift | Uncertainty Reduction
Baseline (Experimental) | 0.186±0.033 | -                   | -
Lattice only            | 0.152±0.027 | -                   | -
Reweighting-continuum   | 0.164±0.020 | ~12%                | ~40%
Reweighting-finite      | 0.165±0.020 | ~11%                | ~40%
Simultaneous fit        | 0.167±0.015 | ~10%                | ~55%

Key Findings:

  1. The two reweighting methods yield nearly identical results (0.164 vs 0.165), validating method robustness
  2. Simultaneous fit gives a smaller uncertainty (0.015 vs 0.020)
  3. Lattice data systematically pulls g₂ toward smaller values
  4. Uncertainty significantly reduced: from ±0.033 to ±0.015 in the simultaneous fit, roughly a twofold gain in precision
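
The shift and reduction percentages quoted for g₂ follow directly from the central values and uncertainties. A short check, using a hypothetical helper name:

```python
def shift_and_reduction(baseline, result):
    """Relative central-value shift and uncertainty reduction, both in percent."""
    (c0, s0), (c1, s1) = baseline, result
    shift = abs(c1 - c0) / c0 * 100.0
    reduction = (1.0 - s1 / s0) * 100.0
    return shift, reduction


BASELINE = (0.186, 0.033)
RESULTS = {
    "Reweighting-continuum": (0.164, 0.020),
    "Reweighting-finite": (0.165, 0.020),
    "Simultaneous fit": (0.167, 0.015),
}

for name, res in RESULTS.items():
    shift, reduction = shift_and_reduction(BASELINE, res)
    print(f"{name}: shift {shift:.1f}%, uncertainty reduction {reduction:.1f}%")
```

This gives shifts of about 11.8%, 11.3%, and 10.2%, and reductions of about 39.4%, 39.4%, and 54.5%, matching the rounded figures quoted for the three methods.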

χ² Analysis

Table I: ⟨χ²⟩ values for various data subsets

Experiment/Lattice | N_dat | Baseline | Reweighting | Fit
Fixed-target       | 233   | 239.67   | 234.09      | 242.29
RHIC               | 7     | 7.06     | 7.11        | 7.62
Tevatron           | 71    | 61.33    | 59.91       | 65.31
LHCb               | 21    | 22.94    | 21.88       | 22.77
CMS                | 78    | 30.74    | 29.85       | 29.48
ATLAS              | 72    | 95.35    | 95.60       | 95.71
Lattice a₁         | 6     | 31.13    | 11.5        | 12.56
Lattice a₂         | 7     | 6.30     | 4.67        | 1.77
Lattice a₃         | 8     | 8.50     | 3.88        | 6.01

Important Observations:

  1. Experimental Data Stability: χ² for all DY data remains stable across three configurations, indicating absence of tension
  2. Lattice Data Improvement: χ² for lattice data significantly reduced from baseline to fit (a₁: 31.13→12.56, a₂: 6.30→1.77, a₃: 8.50→6.01)
  3. Constraint Power Demonstrated: Despite fewer lattice data points, they provide strong constraints on CSK

Visualization Analysis

Figure 1: MC Distribution of g₂

  • Red (Baseline): Broader distribution, centered at 0.186
  • Green (Reweighting): Narrower distribution, center shifted to 0.165
  • Blue (Fit): Narrowest distribution, centered at 0.167
  • Clearly demonstrates the constraining effect of lattice data

Figure 2: CSK Variation with b (μ=2 GeV)

  • Small-b region (b<2 GeV⁻¹): Three methods give essentially identical predictions, dominated by perturbative QCD
  • Large-b region (b>2.5 GeV⁻¹):
    • Baseline prediction has large uncertainty
    • Reweighting and fit results show significantly reduced uncertainty
    • Bottom inset shows ratio relative to baseline; lattice data reduces predictions by ~20-30%
  • Good agreement with lattice data points (black dots)
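
Because g_K(b) = 2g_2^2 b^2 is quadratic in g₂, a shift in g₂ maps onto a larger relative shift in the non-perturbative term, independently of b. The following worked ratio uses the g₂ central values quoted in this summary. It covers only g_K, not the full kernel K plotted in the figure (which also contains the perturbative part), so it gives an orientation rather than a reproduction of the plotted ratio.

```latex
\frac{g_K^{\mathrm{Fit}}(b)}{g_K^{\mathrm{Baseline}}(b)}
  = \left(\frac{g_2^{\mathrm{Fit}}}{g_2^{\mathrm{Baseline}}}\right)^{2}
  = \left(\frac{0.167}{0.186}\right)^{2} \approx 0.806,
\qquad
\frac{g_K^{\mathrm{Reweighting}}(b)}{g_K^{\mathrm{Baseline}}(b)}
  = \left(\frac{0.165}{0.186}\right)^{2} \approx 0.787
```

A shift of roughly 10% in g₂ therefore corresponds to a reduction of roughly 19-21% in the central value of g_K.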

Ablation Analysis

Although the paper does not explicitly perform ablation studies, comparison of different methods reveals contributions of various components:

  1. Independent Constraining Power of Lattice Data:
    • Pure lattice fit: g₂^Lattice = 0.152±0.027
    • ~18% difference from experimental baseline, but compatible within uncertainties
  2. Continuum Limit vs Finite Spacing:
    • Two treatment methods yield consistent results (0.164 vs 0.165)
    • Validates effectiveness of lattice spacing corrections
  3. Reweighting vs Simultaneous Fit:
    • Central values highly consistent (0.165 vs 0.167)
    • Simultaneous fit has slightly smaller uncertainty (0.015 vs 0.020)
    • Indicates direct fitting more fully exploits data information
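
The compatibility statement in item 1 can be quantified with a simple pull, and a naive inverse-variance average gives a rough expectation for a combined value. Both treat the baseline and lattice-only results as independent Gaussian measurements, which is an assumption made here for orientation only. It is not the paper's procedure, which uses reweighting and a full simultaneous fit.

```python
import math


def pull(x1, s1, x2, s2):
    """Difference of two results in units of their combined uncertainty."""
    return abs(x1 - x2) / math.hypot(s1, s2)


def inverse_variance_combine(x1, s1, x2, s2):
    """Naive weighted average of two independent Gaussian measurements."""
    w1, w2 = 1.0 / s1 ** 2, 1.0 / s2 ** 2
    mean = (w1 * x1 + w2 * x2) / (w1 + w2)
    sigma = 1.0 / math.sqrt(w1 + w2)
    return mean, sigma


baseline = (0.186, 0.033)      # experimental baseline g2
lattice_only = (0.152, 0.027)  # pure lattice fit g2

p = pull(*baseline, *lattice_only)
mean, sigma = inverse_variance_combine(*baseline, *lattice_only)
print(f"pull = {p:.2f} sigma, naive combination = {mean:.3f} +/- {sigma:.3f}")
```

The pull is about 0.8σ, consistent with the statement that the two determinations are compatible. The naive average, about 0.166 ± 0.021, falls in the same range as the reweighted and fitted values quoted above.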

TMD Distribution Stability

Important finding: Despite significant changes in g₂, the TMD PDFs themselves remain stable. This indicates:

  1. Changes in CSK primarily affect evolution behavior
  2. Impact on final physical observables is absorbed by other parameters
  3. Lattice data provides additional physical constraints rather than simple parameter adjustment

Precedents for Joint Lattice-Experimental Analysis

PDF Extraction:

  1. Unpolarized PDFs: Lin et al. (2018), Cichy et al. (2019), Del Debbio et al. (2020-2021)
  2. Polarized PDFs: Bringewatt et al. (2021), Barry et al. (2022), Hou et al. (2023)
  3. Transverse PDFs: Karpie et al. (2024)
  4. Generalized PDFs: Guo et al. (2023), Cichy et al. (2024)

These works demonstrate the value of lattice QCD data in improving PDF determination.

Lattice QCD Calculations of CSK

Theoretical Development:

  • Original proposals: Ji et al. (2015), Ebert et al. (2019), Ji et al. (2020-2022)
  • Quasi-TMD parton distribution approach: Shanahan et al. (2020-2021)
  • Quasi-TMD wave function approach: Zhang et al. (2020), Chu et al. (2022)
  • Coulomb gauge approach: Bollweg et al. (2024-2025)

Data Used in This Paper:

  • Avkhadiev et al. (2023-2024): First complete calculation with continuum limit extrapolation and full systematic error control
  • Three lattice spacings, complete covariance matrices

TMD Phenomenological Analysis

Recent Progress:

  • MAP Collaboration: Bacchetta et al. (2020-2024), multidimensional analysis framework
  • Moos et al. (2024-2025): Advanced TMD extraction
  • Aslan et al. (2024): Different g_K parameterizations
  • Camarda et al. (2024): High-precision Drell-Yan analysis

Innovation in This Paper: First systematic incorporation of lattice data into TMD phenomenological extraction

Relation to Ref. 17

Cridge et al. (2025) also explore similar directions, but this paper:

  1. First completes full joint extraction
  2. Uses neural network parameterization
  3. Provides two independent verification methods (reweighting and simultaneous fit)

Conclusions and Discussion

Main Conclusions

  1. Technical Feasibility: Successfully incorporates lattice QCD data into TMD phenomenological analysis for the first time; two independent methods (reweighting and simultaneous fit) yield consistent results
  2. Significant Precision Improvement:
    • g₂ central value shift ~10%: from 0.186 to 0.167
    • Uncertainty reduction 40-55%: from ±0.033 to ±0.020 (reweighting) or ±0.015 (simultaneous fit)
    • Substantially improved constraints on CSK in non-perturbative region
  3. Data Compatibility: Experimental and lattice data show no tension, enabling self-consistent combination
  4. Method Robustness:
    • Continuum limit and finite spacing data treatment yield consistent results
    • Reweighting (N_eff≈150) and simultaneous fit results consistent
    • Quality of DY data description remains stable
  5. Physical Insight: Lattice data provides direct constraints on CSK, addressing the indirectness of experimental data

Limitations

  1. Limited Lattice Data Volume:
    • Only 21 data points (vs 482 experimental points)
    • From single lattice collaboration calculation
    • Future independent lattice calculations needed for verification
  2. Parameterization Simplicity:
    • Employs simplest g_K(b)=2g₂²b² form
    • Paper states "no evidence for need of more complex form," but may limit flexibility
  3. Lattice Spacing Corrections:
    • k₁, k₂ parameter priors from lattice analysis
    • Ideally should be fully independently determined in joint analysis
  4. Physical Point Extrapolation:
    • Lattice calculations performed at unphysical pion masses
    • Continuum limit extrapolation systematic errors included, but residual effects possible
  5. Deeper Understanding of TMD Stability:
    • Why does 10% change in g₂ maintain stable TMD PDFs?
    • Requires more detailed theoretical analysis

Future Directions

  1. More Lattice Data:
    • Independent calculations from different lattice collaborations
    • Finer lattice spacings
    • Larger b range
    • Physical pion mass calculations
  2. Extension to Other TMDs:
    • Paper focuses on unpolarized TMD
    • Extendable to polarized and transverse TMDs
    • Gluon TMD lattice calculations
  3. More Complex Parameterizations:
    • Explore alternative g_K functional forms
    • Deeper machine learning applications
    • Reduce model dependence
  4. More Experimental Processes:
    • Include semi-inclusive deep inelastic scattering (SIDIS)
    • e⁺e⁻ collider data
    • Future EIC experimental data
  5. Theoretical Improvements:
    • Higher-order perturbative corrections (N⁴LL)
    • Systematic treatment of power corrections
    • Improved lattice-continuum matching
  6. Methodological Development:
    • More refined Bayesian inference applications
    • Deeper machine learning applications in TMD extraction
    • Improved uncertainty quantification

In-Depth Evaluation

Strengths

1. Pioneering Work

  • First systematic incorporation of lattice QCD data into TMD phenomenological analysis
  • Establishes important paradigm for future TMD research
  • Fills methodological gap in CSK extraction

2. Rigorous Methodology

  • Dual verification: two independent methods (reweighting and simultaneous fit)
  • Complete uncertainty treatment: statistical and systematic errors
  • Proper covariance matrix handling
  • Systematic consideration of lattice spacing effects

3. Significant Results

  • 40-55% uncertainty reduction, substantial improvement
  • No data tension, validates method self-consistency
  • Effective replica number N_eff≈150, successful reweighting

4. Technical Innovation

  • Neural network parameterization provides flexibility
  • Direct use of finite lattice spacing data
  • Simultaneous continuum limit extrapolation

5. Clear Presentation

  • Clear logical structure
  • Sufficient technical details
  • Informative figures and tables
  • Intuitive result presentation

Weaknesses

1. Data Volume Limitation

  • Only 21 lattice points, relatively small compared to 482 experimental points
  • Results from single lattice collaboration, lacks independent verification
  • May limit comprehensive systematic error assessment

2. Parameterization Simplicity

  • g_K(b) uses simplest quadratic form
  • While paper states "no evidence for need of more complex form," higher precision may require it
  • Lacks systematic comparison of alternative functional forms

3. Insufficient Physical Interpretation

  • Why does 10% g₂ change maintain stable TMD PDFs? Lacks deep discussion
  • CSK physical meaning (QCD vacuum structure) insufficiently elaborated
  • Physical picture at different b regions could be clearer

4. Lattice Systematic Errors

  • Continuum limit extrapolation depends on specific fitting form
  • Finite volume effects discussion insufficient
  • Unphysical pion mass effects not detailed

5. Limited Comparative Analysis

  • Lacks detailed comparison with latest results from other TMD collaborations (SV19, MAP, etc.)
  • Missing systematic study of different g_K parameterizations
  • 18% difference with pure lattice extraction (g₂^Lattice=0.152±0.027) insufficiently discussed

6. Practical Considerations

  • High cost of lattice data acquisition limits method universality
  • Predictive capability for future EIC experiments not fully demonstrated
  • Method reproducibility depends on specific lattice data availability

Impact

Contributions to the Field:

  1. Paradigm Shift: Establishes a new approach to TMD extraction that could become a standard method for future research
  2. Precision Enhancement: Provides a CSK determination with 40-55% smaller uncertainty than the experiment-only baseline
  3. Methodological Contribution: Provides template for extraction of other TMDs (polarized, transverse) and gluon TMD

Practical Value:

  1. Phenomenological Application: Improved CSK directly applicable to LHC, RHIC, and future EIC theoretical predictions
  2. Reduced Model Dependence: First-principles constraints reduce arbitrariness of non-perturbative models
  3. Experimental Guidance: Provides more reliable theoretical input for future experiment design

Reproducibility:

  • Clear method description, explicit theoretical framework
  • Depends on public lattice data (Refs. 53, 54)
  • Baseline fit (Ref. 32) published
  • Full reproduction requires substantial computational resources and expertise

Expected Impact:

  1. Short-term: Rapid adoption by TMD community, becomes standard method
  2. Mid-term: Stimulates more lattice QCD calculations of CSK and other TMD quantities
  3. Long-term: May transform overall TMD extraction methodology, lattice input becomes essential component

Applicable Scenarios

Direct Applications:

  1. Drell-Yan Processes: Precise predictions for W/Z boson production
  2. SIDIS Analysis: TMD extraction from semi-inclusive deep inelastic scattering
  3. Future EIC: Theoretical input for electron-ion collider
  4. Heavy-Ion Collisions: Small-x physics at RHIC and LHC

Potential Extensions:

  1. Gluon TMD: Method directly generalizable to gluon CSK
  2. Polarized TMD: Transverse spin, longitudinal spin TMD extraction
  3. Other Non-perturbative Quantities: Soft functions, jet functions, etc.
  4. Heavy Quark TMD: Charm and bottom quark TMD

Limitation Scenarios:

  1. Situations requiring ultra-high precision theory predictions (beyond current N³LL)
  2. Cases where lattice data unavailable or unreliable
  3. Extreme kinematic regions (very large or very small b)
  4. Applications requiring real-time rapid computation (lattice data acquisition slow)

References

Key Lattice QCD References:

  • [53, 54] A. Avkhadiev et al., Phys. Rev. D 108, 114505 (2023); Phys. Rev. Lett. 132, 231901 (2024) - Lattice data source for this paper

Baseline TMD Fit:

  • [32] A. Bacchetta et al. (MAP), Phys. Rev. Lett. 135, 021904 (2025) - Baseline fit for this paper

Reweighting Method:

  • [55] R. D. Ball et al. (NNPDF), Nucl. Phys. B 849, 112 (2011) - Original proposal of Bayesian reweighting method

Theoretical Foundation:

  • [18] I. Moult et al., JHEP 08, 280 (2022) - Four-loop cusp anomalous dimension
  • [19] C. Duhr et al., Phys. Rev. Lett. 129, 162001 (2022) - Four-loop CSK calculation

Related Joint Analyses:

  • [1-13] Various PDF lattice-experimental joint extraction works
  • [17] T. Cridge et al., arXiv:2506.13874 (2025) - Parallel work in similar direction

Summary

This paper represents a milestone work in the TMD field, successfully incorporating lattice QCD data systematically into phenomenological extraction of the Collins-Soper kernel for the first time. Through two independent methods—Bayesian reweighting and simultaneous fitting—the researchers demonstrate that lattice data can significantly improve CSK determination: central value shifts by ~10%, uncertainties reduced by 40-55%. This success establishes a new paradigm for future TMD research and provides a methodological template for extraction of other non-perturbative QCD quantities.

Despite limitations such as limited lattice data volume and relatively simple parameterization, the paper's core contribution—demonstrating that lattice and experimental data can be self-consistently combined with significant precision improvement—is unquestionable. As lattice QCD computational capabilities improve and more independent calculations emerge, this method is poised to become the standard paradigm for TMD extraction, with profound implications for understanding QCD vacuum structure and hadron internal dynamics.