Learning the Exact SABR Model

Rensi, Rossi, Bianchetti
The SABR model is a cornerstone of interest rate volatility modeling, but its practical application relies heavily on the analytical approximation by Hagan et al., whose accuracy deteriorates for high volatility, long maturities, and out-of-the-money options, admitting arbitrage. While machine learning approaches have been proposed to overcome these limitations, they have often been limited by simplified SABR dynamics or a lack of systematic validation against the full spectrum of market conditions. We develop a novel SABR DNN, a specialized Artificial Deep Neural Network (DNN) architecture that learns the true SABR stochastic dynamics using an unprecedented large training dataset (more than 200 million points) of interest rate Cap/Floor volatility surfaces, including very long maturities (30Y) and extreme strikes consistently with market quotations. Our dataset is obtained via high-precision unbiased Monte Carlo simulation of a special scaled shifted-SABR stochastic dynamics, which allows dimensional reduction without any loss of generality. Our SABR DNN provides arbitrage-free calibration of real market volatility surfaces and Caps/Floors prices for any maturity and strike with negligible computational effort and without retraining across business dates. Our results fully address the gaps in the previous machine learning SABR literature in a systematic and self-consistent way, and can be extended to cover any interest rate European options in different rate tenors and currencies, thus establishing a comprehensive functional SABR framework that can be adopted for daily trading and risk management activities.

Basic Information

  • Paper ID: 2510.10343
  • Title: Learning the Exact SABR Model
  • Authors: Giorgia Rensi, Pietro Rossi, Marco Bianchetti
  • Categories: q-fin.CP (Computational Finance), q-fin.PR (Pricing of Securities), q-fin.RM (Risk Management)
  • Publication Date: October 14, 2025
  • Paper Link: https://arxiv.org/abs/2510.10343

Abstract

The SABR model is a cornerstone of interest rate volatility modeling, yet its practical application heavily relies on the analytical approximation formula by Hagan et al., which deteriorates in accuracy under high volatility, long maturities, and out-of-the-money options, even creating arbitrage opportunities. While machine learning approaches have attempted to overcome these limitations, they are often constrained by simplified SABR dynamics or lack systematic validation across full market conditions. This research develops a novel SABR DNN architecture that learns the true SABR stochastic dynamics through a large-scale training dataset exceeding 200 million data points, covering maturities up to 30 years and extreme strike prices. The method provides arbitrage-free market volatility surface calibration with exceptional computational efficiency and requires no retraining.

Research Background and Motivation

Problem Background

  1. Importance of the SABR Model: The SABR (Stochastic Alpha Beta Rho) model is the most widely used volatility model for interest rate options in global financial markets. According to BIS data, interest rate options are the most actively traded option type (trading volume of $600 billion in the second half of 2024).
  2. Limitations of Hagan Approximation:
    • Severe accuracy deterioration under high volatility, long maturities, and out-of-the-money options
    • May produce negative probability densities, creating arbitrage opportunities
    • Cannot accurately price complex products dependent on volatility smile wings
  3. Shortcomings of Existing Machine Learning Approaches:
    • Most studies only consider simplified lognormal SABR (β=1)
    • Limited training dataset scale with incomplete market condition coverage
    • Lack of systematic validation on real market data
    • Insufficient utilization of complete shifted-SABR dynamics
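
The negative-density problem listed under the Hagan approximation can be detected numerically. By the Breeden-Litzenberger relation, the risk-neutral density is proportional to the second strike derivative of the undiscounted call price, so a negative butterfly value on a strike grid signals arbitrage. The sketch below assumes a uniform strike grid; the function name and the sample prices are hypothetical and not taken from the paper.

```python
def butterfly_values(strikes, calls):
    """Second finite differences of call prices on a uniform strike grid.

    Each value approximates the risk-neutral density at an interior strike
    (up to discounting); a negative value flags butterfly arbitrage.
    """
    h = strikes[1] - strikes[0]  # assumption: uniform strike spacing
    return [
        (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / h**2
        for i in range(1, len(calls) - 1)
    ]

# Hypothetical undiscounted call prices at strikes 0.9, 1.0, 1.1.
convex = butterfly_values([0.9, 1.0, 1.1], [0.50, 0.40, 0.35])
broken = butterfly_values([0.9, 1.0, 1.1], [0.50, 0.45, 0.35])
print(convex[0] > 0, broken[0] < 0)
```

The second price vector is not convex in strike, so its butterfly value is negative; a smile that produces such prices admits arbitrage.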

Research Motivation

To establish a deep neural network framework capable of learning the "exact" SABR model, overcoming the limitations of analytical approximations, and providing high-precision, efficient pricing tools for daily trading and risk management.

Core Contributions

  1. Construction of Ultra-Large-Scale Training Dataset: Generation of over 200 million data points of interest rate volatility surfaces, covering 30-year maturities and extreme strike prices (-1.5% to 10%)
  2. Development of Specialized SABR DNN Architecture: Design of three deep neural networks targeting short-term, medium-term, and long-term periods, capable of learning complete shifted-SABR stochastic dynamics
  3. Implementation of Dimensionality Reduction: Achievement of parameter space dimensionality reduction through scaled shifted-SABR model, improving training efficiency without loss of generality
  4. Provision of Arbitrage-Free Pricing: Implementation of arbitrage-free calibration to real market volatility surfaces, adaptable to different trading days without retraining
  5. Systematic Benchmark Testing: First comprehensive accuracy assessment of the latest version of Hagan et al.'s approximation formula, quantifying errors across different market regions

Methodology Details

Task Definition

  • Input: SABR model parameters θ_SABR = {α̂, β, ρ, ν} and contract parameters θ_CF = {T, K̂}
  • Output: shifted-Black implied volatility σ_DNN
  • Objective: learn the mapping (θ_SABR, θ_CF) → σ_MC, so that the DNN output σ_DNN approximates the Monte Carlo implied volatility σ_MC
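
The target quantity is an implied volatility, so a Monte Carlo price has to be inverted through the Black formula. The sketch below is a minimal illustration using only the standard library: `black_call` and `implied_vol` are hypothetical helper names, and bisection is chosen for simplicity rather than because the paper uses it. It assumes K̂ denotes the shifted strike divided by the shifted forward, which this summary does not spell out. Under that assumption, the degree-one homogeneity of the Black formula in forward and strike lets the inversion work with forward 1 and strike K̂.

```python
import math

def black_call(forward, strike, sigma, T):
    """Undiscounted Black (1976) call price; inputs may be shifted rates."""
    if sigma <= 0.0 or T <= 0.0:
        return max(forward - strike, 0.0)
    s = sigma * math.sqrt(T)
    d1 = (math.log(forward / strike) + 0.5 * s * s) / s
    d2 = d1 - s
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return forward * cdf(d1) - strike * cdf(d2)

def implied_vol(price, forward, strike, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Invert black_call by bisection; the price is increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(forward, strike, mid, T) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip in scaled coordinates: forward 1, scaled strike 1.2 (illustrative values).
price = black_call(1.0, 1.2, 0.35, 10.0)
print(implied_vol(price, 1.0, 1.2, 10.0))
```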

Model Architecture

1. Scaled Shifted-SABR Dynamics

To reduce parameter dimensionality, standardized processes are introduced:

X(t) = F̄(t)/F̄₀
dX(t) = σ̂(t) X(t)^β dW(t), X(0) = 1
dσ̂(t) = ν σ̂(t) dZ(t), σ̂(0) = α̂ = α F̄₀^(β-1)
d⟨W, Z⟩(t) = ρ dt
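
To make the scaled dynamics concrete, the following sketch simulates X(T) with NumPy. It is an illustration only: it uses a plain Euler step for X with absorption at zero and an exact lognormal step for the volatility, which is a biased scheme and not the unbiased Monte Carlo method the paper relies on. The function name, path count, step count, and parameter values are all hypothetical.

```python
import numpy as np

def simulate_scaled_sabr(alpha_hat, beta, rho, nu, T,
                         n_paths=20_000, n_steps=200, seed=0):
    """Terminal values X(T) of the scaled dynamics under an Euler scheme."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.ones(n_paths)                 # X(0) = 1
    sigma = np.full(n_paths, alpha_hat)  # scaled volatility starts at alpha_hat
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        z2 = rng.standard_normal(n_paths)
        dw = np.sqrt(dt) * z1
        # Second Brownian increment with correlation rho to the first.
        dz = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
        # Euler step for X, absorbed at zero (a choice made for this sketch).
        x = np.maximum(x + sigma * x**beta * dw, 0.0)
        # Exact lognormal step for the volatility process.
        sigma = sigma * np.exp(nu * dz - 0.5 * nu**2 * dt)
    return x

x_T = simulate_scaled_sabr(alpha_hat=0.2, beta=0.5, rho=-0.3, nu=0.4, T=1.0)
call = np.mean(np.maximum(x_T - 1.1, 0.0))  # undiscounted call, scaled strike 1.1
print(x_T.mean(), call)
```

Since X is driftless, the sample mean of X(T) should stay close to 1, which gives a quick sanity check on any simulation scheme.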

2. DNN Architecture

  • Input Layer: 6 nodes receiving {α̂, β, ρ, ν, T, K̂}
  • Hidden Layers: 5 layers with 64 nodes each, using ELU activation functions
  • Output Layer: 1 node outputting implied volatility with linear activation
  • Optimizer: ADAM with maximum 500 epochs and early stopping mechanism
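
The layer layout above can be sketched as a plain NumPy forward pass. This is not the authors' implementation: the weights are random and untrained, the initialization is an arbitrary choice, and the training framework is not stated in this summary. The helper names `init_mlp` and `forward` are hypothetical.

```python
import numpy as np

def elu(z):
    """ELU activation: z for z > 0, exp(z) - 1 otherwise."""
    return np.where(z > 0.0, z, np.expm1(np.minimum(z, 0.0)))

def init_mlp(sizes, seed=0):
    """Random (untrained) weights and zero biases for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        params.append((w, np.zeros(n_out)))
    return params

def forward(params, inputs):
    """ELU on the hidden layers, linear activation on the output layer."""
    h = inputs
    for w, b in params[:-1]:
        h = elu(h @ w + b)
    w, b = params[-1]
    return h @ w + b

sizes = [6] + [64] * 5 + [1]  # 6 inputs, 5 hidden layers of 64 nodes, 1 output
params = init_mlp(sizes)
n_params = sum(w.size + b.size for w, b in params)
# One illustrative input row: {alpha_hat, beta, rho, nu, T, K_hat}.
batch = np.array([[0.2, 0.5, -0.3, 0.4, 1.0, 1.1]])
print(n_params, forward(params, batch).shape)
```

With this layout each network has 17,153 trainable parameters (448 in the first layer, 4 × 4,160 between hidden layers, 65 in the output layer), which is small relative to the size of the training set.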

3. Hierarchical Training Strategy

Maturity domain divided into three subsets:

  • DNN 1: Short-term [0.25, 4 years)
  • DNN 2: Medium-term [4, 10.5 years)
  • DNN 3: Long-term [10.5, 30 years]
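
At pricing time, a maturity has to be routed to the network that covers it. A minimal sketch of that dispatch follows; the function name is hypothetical, and treating the 30-year endpoint as included is an assumption.

```python
def select_dnn(maturity_years):
    """Return the index (1, 2, or 3) of the network covering a maturity."""
    # Assumption: the trained range is [0.25, 30] years, both ends included.
    if not 0.25 <= maturity_years <= 30.0:
        raise ValueError("maturity outside the trained range [0.25, 30] years")
    if maturity_years < 4.0:
        return 1   # short-term network
    if maturity_years < 10.5:
        return 2   # medium-term network
    return 3       # long-term network

print([select_dnn(t) for t in (0.25, 4.0, 10.5, 30.0)])
```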

Technical Innovations

  1. Complete SABR Dynamics: β parameter not fixed, maintaining model completeness and flexibility
  2. High-Precision Monte Carlo: Unbiased Monte Carlo simulation used to generate benchmark data, avoiding analytical approximation errors
  3. Intelligent Data Sampling: Latin hypercube sampling employed to ensure comprehensive parameter space coverage
  4. Error Filtering Mechanism: DNN acts as a filter, extracting true information while discarding Monte Carlo noise

Experimental Setup

Dataset

  • Training Set: 1,572,864 random grid surfaces, totaling approximately 239 million volatility points
  • Validation Set: 20% of training set (approximately 47.7 million points)
  • Test Set: Independently generated 40,960 sample points
  • Parameter Ranges:
    • F₀: [0.25%, 5%]
    • α: [0.001, 0.2]
    • β: [0.05, 0.9]
    • ρ: [-0.8, 0.6]
    • ν: [0.05, 1.6]
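
Latin hypercube sampling over these ranges can be sketched in a few lines of NumPy. This is a hand-rolled illustration, not the authors' sampler: whether the paper samples the raw parameters or the scaled ones is not stated in this summary, and the names `RANGES` and `latin_hypercube` are hypothetical.

```python
import numpy as np

# Parameter ranges as (low, high) pairs; F0 is written as a decimal rate.
RANGES = {
    "F0": (0.0025, 0.05),
    "alpha": (0.001, 0.2),
    "beta": (0.05, 0.9),
    "rho": (-0.8, 0.6),
    "nu": (0.05, 1.6),
}

def latin_hypercube(n_samples, ranges, seed=0):
    """One point per stratum in every dimension, strata shuffled independently."""
    rng = np.random.default_rng(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        strata = rng.permutation(n_samples)               # stratum index per sample
        u = (strata + rng.random(n_samples)) / n_samples  # uniform within stratum
        columns[name] = low + (high - low) * u
    return columns

samples = latin_hypercube(8, RANGES)
print({name: values.round(4) for name, values in samples.items()})
```

Each parameter range is cut into as many equal strata as there are samples, and every stratum is hit exactly once, which spreads a fixed simulation budget more evenly than independent uniform draws.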

Evaluation Metrics

  • RMSE: Root Mean Square Error
  • Absolute Error: |Δσ| = |σ_DNN - σ_MC|
  • RMSD: Relative Root Mean Square Distance
  • ARD: Absolute Relative Difference
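
Two of these metrics follow directly from the quantities defined above and can be sketched as below. Only RMSE and the share of points whose |Δσ| exceeds a threshold are shown, because this summary does not give the exact formulas for RMSD and ARD. The function names and sample volatilities are hypothetical.

```python
import numpy as np

def rmse(sigma_dnn, sigma_mc):
    """Root mean square error between network and Monte Carlo volatilities."""
    diff = np.asarray(sigma_dnn) - np.asarray(sigma_mc)
    return float(np.sqrt(np.mean(diff**2)))

def exceedance_share(sigma_dnn, sigma_mc, threshold):
    """Fraction of points whose |delta sigma| exceeds the threshold."""
    diff = np.abs(np.asarray(sigma_dnn) - np.asarray(sigma_mc))
    return float(np.mean(diff > threshold))

# Illustrative volatilities in decimal units (0.01 = 1 volatility point).
sigma_mc = [0.200, 0.250, 0.300, 0.400]
sigma_dnn = [0.201, 0.248, 0.315, 0.400]
print(rmse(sigma_dnn, sigma_mc), exceedance_share(sigma_dnn, sigma_mc, 0.01))
```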

Comparison Methods

  • SABR Hagan: Latest version analytical approximation by Hagan et al.
  • MC SABR: High-precision Monte Carlo simulation as benchmark
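
For orientation, the analytical benchmark can be sketched with the original lognormal formula of Hagan et al. (2002). This is the 2002 expression, not the latest version that the paper benchmarks, and the function name is hypothetical. It requires positive (shifted) forward and strike; in scaled coordinates one would pass f = 1, k = K̂, and alpha = α̂.

```python
import math

def hagan_lognormal_vol(f, k, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) Black implied volatility for SABR (f, k > 0)."""
    one_b = 1.0 - beta
    log_fk = math.log(f / k)
    fk_pow = (f * k) ** (0.5 * one_b)
    # Ratio z / x(z), whose at-the-money limit equals 1.
    z = (nu / alpha) * fk_pow * log_fk
    if abs(z) < 1e-12:
        z_over_x = 1.0
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        z_over_x = z / x
    denom = fk_pow * (1.0 + one_b**2 / 24.0 * log_fk**2 + one_b**4 / 1920.0 * log_fk**4)
    # First-order correction in maturity.
    time_corr = 1.0 + T * (
        one_b**2 / 24.0 * alpha**2 / fk_pow**2
        + 0.25 * rho * beta * nu * alpha / fk_pow
        + (2.0 - 3.0 * rho**2) / 24.0 * nu**2
    )
    return alpha / denom * z_over_x * time_corr

# Illustrative smile in scaled coordinates (forward 1), beta = 1.
for k_hat in (0.8, 1.0, 1.2):
    print(k_hat, hagan_lognormal_vol(1.0, k_hat, 1.0, 0.2, 1.0, -0.3, 0.5))
```

The maturity correction is linear in T, which is one reason the approximation loses accuracy at long maturities and high volatility of volatility.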

Implementation Details

  • Computational Resources: 25,000-30,000 CPU hours with 256 CPUs in parallel
  • Training Time: Approximately 5 GPU hours per DNN (including hyperparameter tuning)
  • Monte Carlo Settings: 2^18 paths with time steps of 0.5-3 days
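
A quick back-of-envelope calculation puts these figures in perspective. It assumes perfect parallel scaling across the 256 CPUs, which is an idealization, and uses the standard result that Monte Carlo statistical error scales as one over the square root of the number of paths.

```python
cpu_hours = (25_000, 30_000)
n_cpus = 256

# Wall-clock time under the idealized assumption of perfect parallel scaling.
wall_clock_hours = tuple(h / n_cpus for h in cpu_hours)

n_paths = 2 ** 18
# Statistical error relative to a single-path standard deviation: 1 / sqrt(N).
relative_mc_error = n_paths ** -0.5

print(wall_clock_hours, n_paths, relative_mc_error)
```

Under these assumptions, data generation takes roughly 98 to 117 wall-clock hours (about 4 to 5 days), and 262,144 paths reduce the statistical error to 1/512 of the single-path standard deviation.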

Experimental Results

Main Results

1. DNN Training Performance

Metric       Training Set   Test Set
RMSE         0.28%          0.25%
|Δσ| > 1%    1%             -
|Δσ| > 5%    0.26%          -

2. Market Calibration Accuracy Comparison

Using EUR Cap/Floor market data from August 30, 2024 as example:

Short-term (1.5 years):

  • SABR DNN and MC SABR DNN nearly perfectly coincide
  • SABR Hagan and MC SABR Hagan show minor differences

Long-term (30 years):

  • SABR DNN maintains high accuracy with RMSD < 1%
  • SABR Hagan error significantly increases, RMSD > 5% at lowest strike prices

3. Accuracy Deterioration Analysis

Relative error of Hagan approximation varies with maturity and strike price:

  • Maturity Effect: 30-year options show approximately 10 times higher error than 1.5-year options
  • Strike Price Effect: Lowest strike prices (-1.5%) exhibit maximum error reaching 10%
  • SABR DNN: Maintains stable error < 2% across all regions

Ablation Studies

  1. Network Depth Impact: Reducing hidden layers causes performance degradation; increasing layers shows diminishing returns
  2. Dataset Scale: Larger datasets improve filtering capability for noisy data
  3. Parameter Range: Iteratively optimized parameter ranges ensure calibration stability

Computational Performance

  • Offline Phase: Data generation and training require substantial computational resources (one-time)
  • Online Phase: Calibrating a single smile takes less than 1 second
  • No Retraining Required: Same DNN handles market data from different trading days
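
The online phase amounts to a least-squares fit of the model parameters to one quoted smile, with the trained network acting as a fast pricing function. The sketch below illustrates that loop with SciPy's `least_squares`. Since no trained network is available here, `toy_surrogate` is a stand-in based on the small-maturity lognormal (β = 1) SABR expansion; it is not the SABR DNN. The quotes are synthetic, and the bounds and starting point are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def toy_surrogate(theta, k_hat):
    """Stand-in for a trained network: small-maturity lognormal SABR smile."""
    alpha_hat, rho, nu = theta
    x = np.log(k_hat)
    curvature = (2.0 - 3.0 * rho**2) * nu**2 / (12.0 * alpha_hat)
    return alpha_hat + 0.5 * rho * nu * x + curvature * x**2

def calibrate(k_hat, market_vols, theta0, bounds):
    """Least-squares fit of the surrogate parameters to one smile."""
    residuals = lambda theta: toy_surrogate(theta, k_hat) - market_vols
    return least_squares(residuals, theta0, bounds=bounds).x

k_hat = np.linspace(0.5, 1.5, 11)               # scaled strikes (illustrative)
true_theta = np.array([0.30, -0.30, 0.50])      # alpha_hat, rho, nu
market_vols = toy_surrogate(true_theta, k_hat)  # synthetic, noise-free quotes
bounds = ([0.01, -0.9, 0.01], [1.0, 0.9, 2.0])
fitted = calibrate(k_hat, market_vols, theta0=[0.2, 0.0, 0.3], bounds=bounds)
print(fitted)
```

Because each evaluation of the pricing function is a cheap forward pass, the optimizer can afford many iterations per smile, which is consistent with the sub-second calibration time reported above.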

Related Work

Traditional SABR Methods

  • Hagan et al. (2002): Original SABR model and analytical approximation
  • Hagan et al. (2016): shifted-SABR extension handling negative rates

Machine Learning SABR Methods

  • McGhee (2021): First application of neural networks to SABR, limited to β=1
  • Jeon et al. (2022): GPU-accelerated Monte Carlo for dataset generation
  • Funahashi (2023): Control variate methods improving training
  • Hoshisashi et al. (2024): Derivative-constrained neural networks ensuring arbitrage-free pricing

Advantages of This Work

  1. Completeness: Considers complete shifted-SABR dynamics without β simplification
  2. Scale: Training dataset scale surpasses previous studies by orders of magnitude
  3. Practicality: Directly targets real market data and trading practice
  4. Systematicity: Provides complete end-to-end solution

Conclusions and Discussion

Main Conclusions

  1. Technical Feasibility: Deep neural networks successfully learn complex SABR stochastic dynamics
  2. Accuracy Advantages: Significantly outperforms analytical approximations in long-term and extreme strike price regions
  3. Practical Value: Meets precision and efficiency requirements for daily trading and risk management
  4. Robustness: Single-training models adapt to different market environments

Limitations

  1. Computational Cost: Initial data generation and training require substantial computational resources
  2. Market Coverage: Currently limited to EUR Cap/Floor market; extension to other products needed
  3. Market Regime Changes: Major market structural changes may require retraining
  4. Model Risk: "Black-box" nature of neural networks may introduce model risk

Future Directions

  1. Product Extension: Expansion to Swaption cubes and overnight rate products
  2. Multi-Currency: Coverage of USD, GBP, and other major currency markets
  3. Network Optimization: Exploration of advanced network architectures and training strategies
  4. Risk Applications: Applications in historical VaR and stress testing

In-Depth Evaluation

Strengths

  1. Strong Innovation: First large-scale machine learning implementation of complete SABR model with novel technical approach
  2. High Practical Value: Directly addresses core pain points in financial practice with clear commercial application prospects
  3. Comprehensive Experiments: Ultra-large dataset and thorough benchmark testing ensure result credibility
  4. Clear Writing: Detailed technical exposition with strong reproducibility

Weaknesses

  1. Generalization Capability: Validation limited to EUR market; applicability to other markets remains uncertain
  2. Theoretical Analysis: Lacks theoretical analysis of neural network approximation error
  3. Extreme Cases: Insufficient robustness analysis under extreme market volatility
  4. Computational Barrier: High computational costs may limit adoption by smaller institutions

Impact

  1. Academic Contribution: Provides important example for computational finance and machine learning intersection
  2. Industry Impact: May change industry standard practices in interest rate derivatives pricing
  3. Methodology: Offers insights for machine learning applications to other complex financial models
  4. Practical Application: Directly applicable to trading and risk management workflows

Applicable Scenarios

  1. Large Investment Banks: Institutions with sufficient computational resources can directly apply
  2. Risk Management: High-precision pricing scenarios requiring accurate risk measurement
  3. Algorithmic Trading: High-frequency trading environments with extreme computational efficiency requirements
  4. Academic Research: Serves as benchmark model for further methodological research

References

  1. Hagan, P. et al. (2002). Managing Smile Risk. Wilmott Magazine.
  2. Hagan, P. et al. (2016). Universal Smiles. Wilmott.
  3. McGhee, W. A. (2021). An artificial neural network representation of the SABR stochastic volatility model. Journal of Computational Finance.
  4. Baschetti, F. et al. (2024). Deep calibration with random grids. Quantitative Finance.

Overall Assessment: This is a high-quality research paper with significant practical value in computational finance. The authors systematically address key technical challenges in SABR model applications, providing a complete end-to-end solution. Despite limitations such as high computational costs and pending generalization validation, its technical innovation and practical value make it an important contribution to the field.