
Learning the Exact SABR Model

Rensi, Rossi, Bianchetti

The SABR model is a cornerstone of interest rate volatility modeling, but its practical application relies heavily on the analytical approximation by Hagan et al., whose accuracy deteriorates for high volatility, long maturities, and out-of-the-money options, admitting arbitrage. While machine learning approaches have been proposed to overcome these limitations, they have often been limited by simplified SABR dynamics or a lack of systematic validation against the full spectrum of market conditions. We develop a novel SABR DNN, a specialized Artificial Deep Neural Network (DNN) architecture that learns the true SABR stochastic dynamics using an unprecedented large training dataset (more than 200 million points) of interest rate Cap/Floor volatility surfaces, including very long maturities (30Y) and extreme strikes consistently with market quotations. Our dataset is obtained via high-precision unbiased Monte Carlo simulation of a special scaled shifted-SABR stochastic dynamics, which allows dimensional reduction without any loss of generality. Our SABR DNN provides arbitrage-free calibration of real market volatility surfaces and Caps/Floors prices for any maturity and strike with negligible computational effort and without retraining across business dates. Our results fully address the gaps in the previous machine learning SABR literature in a systematic and self-consistent way, and can be extended to cover any interest rate European options in different rate tenors and currencies, thus establishing a comprehensive functional SABR framework that can be adopted for daily trading and risk management activities.



Basic Information

Paper ID: 2510.10343
Title: Learning the Exact SABR Model
Authors: Giorgia Rensi, Pietro Rossi, Marco Bianchetti
Categories: q-fin.CP (Computational Finance), q-fin.PR (Pricing of Securities), q-fin.RM (Risk Management)
Publication Date: October 14, 2025
Paper Link: https://arxiv.org/abs/2510.10343

Abstract

The SABR model is a cornerstone of interest rate volatility modeling, yet its practical application heavily relies on the analytical approximation formula by Hagan et al., which deteriorates in accuracy under high volatility, long maturities, and out-of-the-money options, even creating arbitrage opportunities. While machine learning approaches have attempted to overcome these limitations, they are often constrained by simplified SABR dynamics or lack systematic validation across full market conditions. This research develops a novel SABR DNN architecture that learns the true SABR stochastic dynamics through a large-scale training dataset exceeding 200 million data points, covering maturities up to 30 years and extreme strike prices. The method provides arbitrage-free market volatility surface calibration with exceptional computational efficiency and requires no retraining.

Research Background and Motivation

Problem Background

Importance of the SABR Model: The SABR (Stochastic Alpha Beta Rho) model is the most widely used interest rate volatility model in global financial markets, particularly dominant in interest rate option pricing. According to BIS data, interest rate options are the most actively traded option type in the market (trading volume of $600 billion in the second half of 2024).
Limitations of Hagan Approximation:
- Severe accuracy deterioration under high volatility, long maturities, and out-of-the-money options
- May produce negative probability densities, creating arbitrage opportunities
- Cannot accurately price complex products dependent on volatility smile wings
Shortcomings of Existing Machine Learning Approaches:
- Most studies only consider simplified lognormal SABR (β=1)
- Limited training dataset scale with incomplete market condition coverage
- Lack of systematic validation on real market data
- Insufficient utilization of complete shifted-SABR dynamics

Research Motivation

The goal is to establish a deep neural network framework that learns the "exact" SABR model, overcomes the limitations of analytical approximations, and provides accurate, efficient pricing tools for daily trading and risk management.

Core Contributions

Construction of Ultra-Large-Scale Training Dataset: Generation of over 200 million data points of interest rate volatility surfaces, covering 30-year maturities and extreme strike prices (-1.5% to 10%)
Development of Specialized SABR DNN Architecture: Design of three deep neural networks targeting short-term, medium-term, and long-term periods, capable of learning complete shifted-SABR stochastic dynamics
Implementation of Dimensionality Reduction: Achievement of parameter space dimensionality reduction through scaled shifted-SABR model, improving training efficiency without loss of generality
Provision of Arbitrage-Free Pricing: Implementation of arbitrage-free calibration to real market volatility surfaces, adaptable to different trading days without retraining
Systematic Benchmark Testing: First comprehensive accuracy assessment of the latest version of Hagan et al.'s approximation formula, quantifying errors across different market regions

Methodology Details

Task Definition

Input: SABR model parameters θ_SABR = {α̂, β, ρ, ν} and contract parameters θ_CF = {T, K̂}
Output: shifted-Black implied volatility σ_DNN
Objective: Learn the mapping (θ_SABR, θ_CF) → σ_MC such that the DNN output approximates the Monte Carlo simulation results
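Since the network's target is a shifted-Black implied volatility, it may help to see how a price in scaled units maps to such a volatility. The sketch below is illustrative only: it assumes an undiscounted call on the scaled forward X(0) = 1 with scaled strike K̂, and the function names (black_call, implied_vol) are hypothetical, not taken from the paper.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(x0: float, k: float, vol: float, t: float) -> float:
    """Undiscounted Black call on a positive (shifted, scaled) forward x0."""
    if vol <= 0.0 or t <= 0.0:
        return max(x0 - k, 0.0)
    s = vol * math.sqrt(t)
    d1 = (math.log(x0 / k) + 0.5 * s * s) / s
    return x0 * norm_cdf(d1) - k * norm_cdf(d1 - s)


def implied_vol(price: float, x0: float, k: float, t: float,
                lo: float = 1e-8, hi: float = 5.0, tol: float = 1e-12) -> float:
    """Invert black_call by bisection (the price is increasing in vol)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(x0, k, mid, t) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price a 10Y call at scaled strike 1.2, then recover the vol.
price = black_call(1.0, 1.2, 0.35, 10.0)
recovered = implied_vol(price, 1.0, 1.2, 10.0)
```

Bisection is used here for robustness rather than speed; a production inversion would more likely use a dedicated implied-volatility solver.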

Model Architecture

1. Scaled Shifted-SABR Dynamics

To reduce parameter dimensionality, standardized processes are introduced:

X(t) = F̄(t)/F̄₀, where F̄(t) is the shifted forward rate and F̄₀ = F̄(0)
dX(t) = σ̂(t) X(t)^β dW(t), X(0) = 1
dσ̂(t) = ν σ̂(t) dZ(t), σ̂(0) = α̂ = α F̄₀^(β-1)
dW(t) dZ(t) = ρ dt
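To make the scaled dynamics concrete, here is a minimal simulation sketch. It is not the authors' scheme: the paper uses a high-precision unbiased Monte Carlo method, whereas this example swaps in a plain Euler step for X (absorbed at zero) with an exact log-normal step for the volatility, and assumes the standard SABR correlation dW dZ = ρ dt. The function name and default path counts are hypothetical and far smaller than the 2^18 paths reported in the text.

```python
import math
import random


def simulate_scaled_sabr_call(alpha_hat, beta, rho, nu, t, k_hat,
                              n_paths=2000, n_steps=20, seed=0):
    """Euler estimate of E[max(X(T) - k_hat, 0)] and E[X(T)], with X(0) = 1."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqdt = math.sqrt(dt)
    rho_c = math.sqrt(1.0 - rho * rho)
    payoff_sum = 0.0
    x_sum = 0.0
    for _ in range(n_paths):
        x, sig = 1.0, alpha_hat
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            dw = sqdt * z1
            dz = sqdt * (rho * z1 + rho_c * z2)  # correlated with dw
            if x > 0.0:
                # Euler step for X, absorbed at zero once it gets there
                x = max(x + sig * x ** beta * dw, 0.0)
            # exact log-normal update of the volatility process
            sig *= math.exp(nu * dz - 0.5 * nu * nu * dt)
        payoff_sum += max(x - k_hat, 0.0)
        x_sum += x
    return payoff_sum / n_paths, x_sum / n_paths


call_atm, mean_x = simulate_scaled_sabr_call(0.2, 0.5, -0.3, 0.4, 1.0, 1.0)
```

Because X(0) = 1 by construction, the forward level F̄₀ never enters the simulation; only α̂, β, ρ, ν, T and K̂ do, which is the dimensional reduction the text describes. The sample mean of X(T) should stay close to 1, up to Monte Carlo and discretization error.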

2. DNN Architecture

Input Layer: 6 nodes receiving {α̂, β, ρ, ν, T, K̂}
Hidden Layers: 5 layers with 64 nodes each, using ELU activation functions
Output Layer: 1 node outputting implied volatility with linear activation
Optimizer: Adam, with a maximum of 500 epochs and early stopping
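The layer layout above can be sketched as a plain feed-forward pass. This is a dependency-free illustration with random, untrained weights; the initialization scale, the helper names, and the absence of any normalization layers are assumptions, and the actual training (Adam, early stopping) is not shown.

```python
import math
import random

# 6 inputs, 5 hidden layers of 64 nodes, 1 output
LAYER_SIZES = [6, 64, 64, 64, 64, 64, 1]


def elu(x: float) -> float:
    """ELU activation with unit slope parameter."""
    return x if x > 0.0 else math.exp(x) - 1.0


def init_params(sizes, seed=0):
    """Random (untrained) weights: one (matrix, bias) pair per layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(1.0 / n_in)  # illustrative choice of init scale
        w = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((w, b))
    return params


def forward(params, features):
    """ELU on the hidden layers, linear activation on the output node."""
    h = list(features)
    last = len(params) - 1
    for i, (w, b) in enumerate(params):
        h = [sum(wj * hj for wj, hj in zip(row, h)) + bi
             for row, bi in zip(w, b)]
        if i < last:
            h = [elu(v) for v in h]
    return h[0]


def count_params(params):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


params = init_params(LAYER_SIZES)
# Feature order assumed: alpha_hat, beta, rho, nu, T, K_hat
sigma = forward(params, [0.2, 0.5, -0.3, 0.4, 10.0, 1.2])
```

Under these assumptions the network has 17,153 weights and biases, which is small enough that a single evaluation is essentially free compared with a Monte Carlo run.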

3. Hierarchical Training Strategy

Maturity domain divided into three subsets:

DNN 1: Short-term [0.25, 4 years)
DNN 2: Medium-term [4, 10.5 years)
DNN 3: Long-term [10.5, 30 years]
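At pricing time the maturity decides which of the three networks is evaluated. A minimal dispatch sketch follows; the function name is hypothetical, and treating the 30-year upper bound as inclusive is an assumption.

```python
def select_dnn(maturity_years: float) -> int:
    """Return the index (1, 2 or 3) of the network covering this maturity."""
    if 0.25 <= maturity_years < 4.0:
        return 1  # short-term
    if 4.0 <= maturity_years < 10.5:
        return 2  # medium-term
    if 10.5 <= maturity_years <= 30.0:
        return 3  # long-term (upper bound assumed inclusive)
    raise ValueError(f"maturity {maturity_years}Y is outside the trained range")
```

For example, select_dnn(1.5) picks the short-term network and select_dnn(30.0) the long-term one; maturities outside the trained range are rejected rather than extrapolated.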

Technical Innovations

Complete SABR Dynamics: β parameter not fixed, maintaining model completeness and flexibility
High-Precision Monte Carlo: Unbiased Monte Carlo simulation used to generate benchmark data, avoiding analytical approximation errors
Intelligent Data Sampling: Latin hypercube sampling employed to ensure comprehensive parameter space coverage
Error Filtering Mechanism: DNN acts as a filter, extracting true information while discarding Monte Carlo noise

Experimental Setup

Dataset

Training Set: 1,572,864 random grid surfaces, totaling approximately 239 million volatility points
Validation Set: 20% of training set (approximately 47.7 million points)
Test Set: Independently generated 40,960 sample points
Parameter Ranges:
- F₀: [0.25%, 5%]
- α: [0.001, 0.2]
- β: [0.05, 0.9]
- ρ: [-0.8, 0.6]
- ν: [0.05, 1.6]
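The text names Latin hypercube sampling as the way the parameter space is covered. The sketch below shows the basic technique on the ranges listed above, assuming uniform marginals and independent shuffles per parameter; the paper's actual sampling design may differ, and the names PARAM_RANGES and latin_hypercube are hypothetical.

```python
import random

# Ranges as listed in the text (F0 expressed as a decimal rate)
PARAM_RANGES = {
    "F0": (0.0025, 0.05),
    "alpha": (0.001, 0.2),
    "beta": (0.05, 0.9),
    "rho": (-0.8, 0.6),
    "nu": (0.05, 1.6),
}


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples points so each parameter hits every stratum exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in ranges.items():
        # one uniform draw inside each of n equal-width strata, then shuffle
        u = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(u)
        columns[name] = [lo + (hi - lo) * ui for ui in u]
    return [{name: columns[name][i] for name in ranges}
            for i in range(n_samples)]


samples = latin_hypercube(PARAM_RANGES, 8)
```

Compared with independent uniform draws, this guarantees that no slice of any single parameter's range is left empty, which matters when the network must be reliable at the edges of the ranges.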

Evaluation Metrics

RMSE: Root Mean Square Error
Absolute Error: |Δσ| = |σ_DNN - σ_MC|
RMSD: Relative Root Mean Square Distance
ARD: Absolute Relative Difference
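As a small worked example of the first two metrics, the sketch below computes the RMSE and the share of points whose |Δσ| exceeds a threshold. The sample volatilities are made up for illustration, and the RMSD and ARD metrics are left out because their exact definitions are not given here.

```python
import math


def rmse(pred, ref):
    """Root mean square error between predicted and reference volatilities."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def share_above(pred, ref, threshold):
    """Fraction of points with |pred - ref| strictly above the threshold."""
    return sum(abs(p - r) > threshold for p, r in zip(pred, ref)) / len(ref)


# Illustrative values only (not from the paper)
sigma_dnn = [0.20, 0.30, 0.40, 0.50]
sigma_mc = [0.20, 0.31, 0.40, 0.56]

error = rmse(sigma_dnn, sigma_mc)
tail = share_above(sigma_dnn, sigma_mc, 0.05)
```

Here one point in four differs by more than 5 volatility points, so tail is 0.25, while the RMSE is about 3 volatility points.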

Comparison Methods

SABR Hagan: Latest version analytical approximation by Hagan et al.
MC SABR: High-precision Monte Carlo simulation as benchmark
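For orientation, here is a sketch of the original Hagan et al. (2002) lognormal implied-volatility formula. Note the swap: the benchmark in the text is the latest version of the approximation, which is not reproduced here. The inputs f and k are assumed to be shifted, strictly positive forward and strike, and the function name is hypothetical.

```python
import math


def hagan_2002_vol(f, k, t, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal implied-vol approximation."""
    one_b = 1.0 - beta
    fk_mid = (f * k) ** (0.5 * one_b)
    log_fk = math.log(f / k)
    # first-order correction in maturity
    time_term = 1.0 + t * (
        one_b ** 2 / 24.0 * alpha ** 2 / fk_mid ** 2
        + 0.25 * rho * beta * nu * alpha / fk_mid
        + (2.0 - 3.0 * rho ** 2) / 24.0 * nu ** 2
    )
    denom = fk_mid * (1.0 + one_b ** 2 / 24.0 * log_fk ** 2
                      + one_b ** 4 / 1920.0 * log_fk ** 4)
    z = nu / alpha * fk_mid * log_fk
    if abs(z) < 1e-8:
        z_over_x = 1.0  # at-the-money limit of z / x(z)
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho)
                     / (1.0 - rho))
        z_over_x = z / x
    return alpha / denom * z_over_x * time_term


# The formula is unchanged when (f, k, alpha) are replaced by the scaled
# quantities (1, k / f, alpha * f ** (beta - 1)).
f0, strike, alpha, beta, rho, nu, t = 0.05, 0.03, 0.02, 0.5, -0.3, 0.4, 5.0
vol_raw = hagan_2002_vol(f0, strike, t, alpha, beta, rho, nu)
vol_scaled = hagan_2002_vol(1.0, strike / f0, t,
                            alpha * f0 ** (beta - 1.0), beta, rho, nu)
```

The two calls return the same volatility up to rounding, which illustrates on the approximation the same scaling property that lets the scaled shifted-SABR dynamics drop the forward level as an input.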

Implementation Details

Computational Resources: 25,000-30,000 CPU hours with 256 CPUs in parallel
Training Time: Approximately 5 GPU hours per DNN (including hyperparameter tuning)
Monte Carlo Settings: 2^18 paths with time steps of 0.5-3 days

Experimental Results

Main Results

1. DNN Training Performance

Metric	Training Set	Test Set
RMSE	0.28%	0.25%
Share of points with |Δσ| > 1%	1%	-
Share of points with |Δσ| > 5%	0.26%	-

2. Market Calibration Accuracy Comparison

Using EUR Cap/Floor market data from August 30, 2024 as example:

Short-term (1.5 years):

SABR DNN and MC SABR DNN nearly perfectly coincide
SABR Hagan and MC SABR Hagan show minor differences

Long-term (30 years):

SABR DNN maintains high accuracy with RMSD < 1%
SABR Hagan error significantly increases, RMSD > 5% at lowest strike prices

3. Accuracy Deterioration Analysis

Relative error of Hagan approximation varies with maturity and strike price:

Maturity Effect: 30-year options show approximately 10 times higher error than 1.5-year options
Strike Price Effect: Lowest strike prices (-1.5%) exhibit maximum error reaching 10%
SABR DNN: Maintains stable error < 2% across all regions

Ablation Studies

Network Depth Impact: Reducing hidden layers causes performance degradation; increasing layers shows diminishing returns
Dataset Scale: Larger datasets improve filtering capability for noisy data
Parameter Range: Iteratively optimized parameter ranges ensure calibration stability

Computational Performance

Offline Phase: Data generation and training require substantial computational resources (one-time)
Online Phase: Calibrating a single smile takes less than 1 second
No Retraining Required: Same DNN handles market data from different trading days

Related Work

Traditional SABR Methods

Hagan et al. (2002): Original SABR model and analytical approximation
Hagan et al. (2016): shifted-SABR extension handling negative rates

Machine Learning SABR Methods

McGhee (2021): First application of neural networks to SABR, limited to β=1
Jeon et al. (2022): GPU-accelerated Monte Carlo for dataset generation
Funahashi (2023): Control variate methods improving training
Hoshisashi et al. (2024): Derivative-constrained neural networks ensuring arbitrage-free pricing

Advantages of This Work

Completeness: Considers complete shifted-SABR dynamics without β simplification
Scale: Training dataset scale surpasses previous studies by orders of magnitude
Practicality: Directly targets real market data and trading practice
Systematicity: Provides complete end-to-end solution

Conclusions and Discussion

Main Conclusions

Technical Feasibility: Deep neural networks successfully learn complex SABR stochastic dynamics
Accuracy Advantages: Significantly outperforms analytical approximations in long-term and extreme strike price regions
Practical Value: Meets precision and efficiency requirements for daily trading and risk management
Robustness: Single-training models adapt to different market environments

Limitations

Computational Cost: Initial data generation and training require substantial computational resources
Market Coverage: Currently limited to EUR Cap/Floor market; extension to other products needed
Market Regime Changes: Major market structural changes may require retraining
Model Risk: "Black-box" nature of neural networks may introduce model risk

Future Directions

Product Extension: Expansion to Swaption cubes and overnight rate products
Multi-Currency: Coverage of USD, GBP, and other major currency markets
Network Optimization: Exploration of advanced network architectures and training strategies
Risk Applications: Applications in historical VaR and stress testing

In-Depth Evaluation

Strengths

Strong Innovation: First large-scale machine learning implementation of complete SABR model with novel technical approach
High Practical Value: Directly addresses core pain points in financial practice with clear commercial application prospects
Comprehensive Experiments: Ultra-large dataset and thorough benchmark testing ensure result credibility
Clear Writing: Detailed technical exposition with strong reproducibility

Weaknesses

Generalization Capability: Validation limited to EUR market; applicability to other markets remains uncertain
Theoretical Analysis: Lacks theoretical analysis of neural network approximation error
Extreme Cases: Insufficient robustness analysis under extreme market volatility
Computational Barrier: High computational costs may limit adoption by smaller institutions

Impact

Academic Contribution: Provides important example for computational finance and machine learning intersection
Industry Impact: May change industry standard practices in interest rate derivatives pricing
Methodology: Offers insights for machine learning applications to other complex financial models
Practical Application: Directly applicable to trading and risk management workflows

Applicable Scenarios

Large Investment Banks: Institutions with sufficient computational resources can directly apply
Risk Management: High-precision pricing scenarios requiring accurate risk measurement
Algorithmic Trading: High-frequency trading environments with extreme computational efficiency requirements
Academic Research: Serves as benchmark model for further methodological research

References

Hagan, P. et al. (2002). Managing Smile Risk. Wilmott Magazine.
Hagan, P. et al. (2016). Universal Smiles. Wilmott.
McGhee, W. A. (2021). An artificial neural network representation of the SABR stochastic volatility model. Journal of Computational Finance.
Baschetti, F. et al. (2024). Deep calibration with random grids. Quantitative Finance.

Overall Assessment: This is a high-quality research paper with significant practical value in computational finance. The authors systematically address key technical challenges in SABR model applications, providing a complete end-to-end solution. Despite limitations such as high computational costs and pending generalization validation, its technical innovation and practical value make it an important contribution to the field.