
Towards Foundation Inference Models that Learn ODEs In-Context

Mauel, Hinz, Seifner et al.

Ordinary differential equations (ODEs) describe dynamical systems evolving deterministically in continuous time. Accurate data-driven modeling of systems as ODEs, a central problem across the natural sciences, remains challenging, especially if the data is sparse or noisy. We introduce FIM-ODE (Foundation Inference Model for ODEs), a pretrained neural model designed to estimate ODEs zero-shot (i.e., in context) from sparse and noisy observations. Trained on synthetic data, the model utilizes a flexible neural operator for robust ODE inference, even from corrupted data. We empirically verify that FIM-ODE provides accurate estimates, on par with a neural state-of-the-art method, and qualitatively compare the structure of their estimated vector fields.


Basic Information

Paper ID: 2510.12650
Title: Towards Foundation Inference Models that Learn ODEs In-Context
Authors: Maximilian Mauel, Manuel Hinz, Patrick Seifner, David Berghaus, Ramsés J. Sánchez
Classification: cs.LG (Machine Learning)
Publication Time/Conference: AI in Science (AIS), 2025, Copenhagen, Denmark
Paper Link: https://arxiv.org/abs/2510.12650

Abstract

Ordinary differential equations (ODEs) describe dynamical systems with deterministic evolution in continuous time. Accurately modeling ODE systems from a data-driven perspective is a fundamental problem in natural sciences, yet remains challenging with sparse or noisy data. This paper introduces FIM-ODE (Foundation Inference Model for ODEs), a pretrained neural model designed to estimate ODEs from sparse and noisy observations in a zero-shot (in-context) manner. The model is trained on synthetic data and leverages flexible neural operators for robust ODE inference, functioning effectively even on corrupted data. Experimental validation demonstrates that FIM-ODE provides accurate estimates with performance comparable to state-of-the-art neural network methods, and qualitatively compares their estimated vector fields.

Research Background and Motivation

Problem Definition

The core problem addressed in this research is the ODE system identification problem: estimating the ODE (i.e., vector field) that best describes a system based solely on time series observations. This has broad applications in natural sciences, from Newton's laws of motion to population dynamics in biological systems to atmospheric convection in meteorology.

Problem Significance

Broad Applicability: ODEs are fundamental modeling tools across multiple fields including physics, biology, and meteorology
Predictive Capability: Accurate ODE models can characterize latent phenomena (such as fixed points or limit cycles) and predict future states
Scientific Understanding: ODE models facilitate understanding of system mechanisms and dynamical properties

Limitations of Existing Methods

Traditional Methods: Non-parametric methods or symbolic regression-based approaches perform poorly with sparse, noisy data
ODEFormer Limitations: Although it is a state-of-the-art neural symbolic regression method, ODEFormer can only handle a single trajectory and may produce unreasonably complex patterns when predicting the global vector field

Research Motivation

Building on the Foundation Inference Models (FIMs) framework, which has demonstrated effectiveness in continuous-time Markov chain, stochastic differential equation, and point process inference, the authors propose FIM-ODE, a model specifically designed for ODE inference.

Core Contributions

Proposes FIM-ODE Model: The first pretrained ODE inference model based on the Foundation Inference Models framework
Neural Operator Architecture: Employs DeepONet neural operators for flexible vector field estimation
Multi-Trajectory Processing: Capable of simultaneously handling multiple trajectories of the same system, improving inference accuracy
Superior Performance: Outperforms ODEFormer on synthetic datasets with R² accuracy of 0.90 vs 0.65 (reconstruction task) and 0.26 vs 0.19 (generalization task)
More Reasonable Global Prediction: Provides simpler and more reasonable vector field predictions in regions far from observations compared to ODEFormer

Methodology Details

Task Definition

Given a collection of time series observations $\mathcal{D} = \{y_k\}_{k=1}^K$, where each sequence $y_k = [(t_{k1}, y_{k1}), \ldots, (t_{kL}, y_{kL})]$, the goal is to estimate a vector field $\hat{f}$ that describes the ODE system generating these observations:

$\frac{dx(t)}{dt} = f(t, x(t))$

Model Architecture

1. Synthetic Data Generation

Samples each component of the vector field from multivariate polynomial distributions (maximum degree 3)
Supports ODE systems up to 3 dimensions
Simulates systems on irregular grids and adds noise to generate training data
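
The three steps above can be illustrated with a minimal sketch. It is not the authors' data pipeline: the Gaussian coefficient prior, the `coef_scale` and `noise_std` values, the fixed-step RK4 integrator, and all function names are assumptions made for illustration only.

```python
import itertools
import numpy as np

def monomial_exponents(dim, max_degree=3):
    """All exponent tuples (a_1, ..., a_dim) with total degree <= max_degree."""
    return [e for e in itertools.product(range(max_degree + 1), repeat=dim)
            if sum(e) <= max_degree]

def sample_polynomial_field(dim, rng, max_degree=3, coef_scale=0.1):
    """Draw one random polynomial vector field (assumed Gaussian coefficients)."""
    exps = np.array(monomial_exponents(dim, max_degree))        # (M, dim)
    coefs = rng.normal(0.0, coef_scale, size=(dim, len(exps)))  # (dim, M)

    def f(x):
        monomials = np.prod(x[None, :] ** exps, axis=1)         # (M,)
        return coefs @ monomials                                # (dim,)
    return f

def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for the autonomous system dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate_noisy(f, x0, times, rng, noise_std=0.05, substeps=10):
    """Integrate between (possibly irregular) times, then add Gaussian noise."""
    x = np.asarray(x0, dtype=float)
    states = [x]
    for dt in np.diff(times):
        for _ in range(substeps):
            x = rk4_step(f, x, dt / substeps)
        states.append(x)
    clean = np.stack(states)                                    # (L, dim)
    return clean + rng.normal(0.0, noise_std, size=clean.shape)

rng = np.random.default_rng(0)
f = sample_polynomial_field(dim=2, rng=rng)
times = np.sort(rng.uniform(0.0, 1.0, size=20))                 # irregular grid
obs = simulate_noisy(f, x0=rng.normal(scale=0.5, size=2), times=times, rng=rng)
print(obs.shape)  # (20, 2)
```

Cubic vector fields can diverge in finite time, so a real generator would presumably also filter out trajectories that blow up; this sketch does not attempt that.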

2. Neural Operator Architecture (DeepONet)

FIM-ODE employs the DeepONet neural operator architecture with three main components:

Branch Network:

Uses a Transformer encoder
Encodes the observation data $\mathcal{D}$ into $K(L-1)$ representations of dimension $E$; the encoded set, which this summary again writes as $\mathcal{D}$, satisfies $\mathcal{D} \in \mathbb{R}^{E \times K(L-1)}$
Maintains independent encodings for nearly all observations

Trunk Network:

Linear mapping that encodes position $x \in \mathbb{R}^D$ into $h(x) \in \mathbb{R}^E$

Combination Network:

Sequence of residual attention layers similar to Transformer decoders
Uses $\mathcal{D}$ as keys and values, $h(x)$ as queries
Final linear projection yields vector field estimate $\hat{f}(x)$
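
To make the data flow between the three components concrete, here is a minimal NumPy sketch of the trunk and combination steps. Everything in it is an assumption for illustration: the weights are random and untrained, there is a single attention head in a single residual layer, the branch encodings are a random stand-in for the Transformer encoder output, and the encodings are stored as $(K(L-1), E)$ rather than $E \times K(L-1)$. The class name `ToyCombination` is hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class ToyCombination:
    """Linear trunk + one residual cross-attention layer + linear projection."""

    def __init__(self, dim, embed, rng):
        s = 1.0 / np.sqrt(embed)
        self.W_trunk = rng.normal(0.0, s, (dim, embed))  # trunk: x -> h(x)
        self.W_q = rng.normal(0.0, s, (embed, embed))
        self.W_k = rng.normal(0.0, s, (embed, embed))
        self.W_v = rng.normal(0.0, s, (embed, embed))
        self.W_out = rng.normal(0.0, s, (embed, dim))    # final projection

    def __call__(self, x, enc):
        # x:   (P, dim)   locations at which the vector field is queried
        # enc: (N, embed) branch encodings of the context, N = K * (L - 1)
        h = x @ self.W_trunk                             # (P, embed) queries
        q, k, v = h @ self.W_q, enc @ self.W_k, enc @ self.W_v
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (P, N) weights
        h = h + attn @ v                                 # residual connection
        return h @ self.W_out                            # (P, dim) estimate

rng = np.random.default_rng(0)
K, L, E, D = 9, 200, 16, 2
enc = rng.normal(size=(K * (L - 1), E))  # stand-in for the encoder output
model = ToyCombination(dim=D, embed=E, rng=rng)
out = model(rng.normal(size=(5, D)), enc)
print(out.shape)  # (5, 2)
```

Because the query location enters only through $h(x)$ and the context only through keys and values, the same encoded context can be reused for any number of query points.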

3. Training Objective

Employs a supervised learning objective: $L(x, \mathcal{D}, f) = \|\hat{f}(x) - f(x)\|^2$

Matches predicted and true vector fields at sampled points $x$ near observations.
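
As a rough illustration of this objective, the sketch below evaluates the squared error at points scattered around the observations. The sampling scheme (Gaussian jitter with an assumed `spread`), the averaging over points, and the function names are assumptions; the summary only states that the sampled points lie near the observations.

```python
import numpy as np

def sample_points_near(observations, rng, n_points=64, spread=0.1):
    """Pick random observations and jitter them (assumed sampling scheme)."""
    idx = rng.integers(0, len(observations), size=n_points)
    jitter = rng.normal(0.0, spread, size=(n_points, observations.shape[1]))
    return observations[idx] + jitter

def vector_field_loss(f_hat, f_true, points):
    """Mean over points of ||f_hat(x) - f_true(x)||^2."""
    pred = np.stack([f_hat(x) for x in points])
    target = np.stack([f_true(x) for x in points])
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

# Toy example: a harmonic oscillator and an estimate that is 10% too weak.
f_true = lambda x: np.array([x[1], -x[0]])
f_hat = lambda x: 0.9 * f_true(x)

rng = np.random.default_rng(0)
observations = np.array([[1.0, 0.0], [0.0, -1.0], [-1.0, 0.0]])
points = sample_points_near(observations, rng)
loss = vector_field_loss(f_hat, f_true, points)
```

This supervision is only possible because the training data is synthetic, so the ground-truth vector field $f$ is available at every sampled point.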

Technical Innovations

In-Context Learning Capability: Handles new ODE systems without further training or fine-tuning
Multi-Trajectory Fusion: Simultaneously processes multiple trajectories, effectively extracting and combining all available information
Flexible Function Approximation: Neural operators are more flexible than symbolic regression when handling sparse, noisy data
Local-Global Balance: Provides complex predictions near observations while offering simple, reasonable predictions far from observations

Experimental Setup

Dataset

Training Data: 600,000 synthetic ODEs; the FIM-ODE model trained on them has approximately 20 million parameters
Test Data: 4,000 polynomial vector field ODEs (maximum degree 3, up to 3 dimensions)
Trajectory Settings: Each ODE generates 9 trajectories with initial states sampled from $\mathcal{N}(0,1)$
Observation Settings: 200 observation points per trajectory on a regular grid with time interval $\Delta\tau = 0.05$
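
The sizes implied by these settings can be worked out in a few lines. This is only a bookkeeping sketch: starting the grid at time 0, using 3 dimensions, and passing all nine trajectories as context are assumptions, and the variable names are illustrative.

```python
import numpy as np

K, L, dt, dim = 9, 200, 0.05, 3          # trajectories, points, step, dimension

rng = np.random.default_rng(0)
tau = dt * np.arange(L)                  # regular grid, assumed to start at 0
x0 = rng.standard_normal((K, dim))       # initial states drawn from N(0, 1)

horizon = tau[-1]                        # (L - 1) * dt, roughly 9.95 time units
context_shape = (K, L, dim)              # layout of one test context
n_encodings = K * (L - 1)                # branch representations if all K are used
print(context_shape, n_encodings)        # (9, 200, 3) 1791
```

Under these assumptions, the branch network would produce $K(L-1) = 9 \times 199 = 1791$ representations per test system.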

Evaluation Metrics

Uses R² Accuracy: the fraction of test cases whose R² score exceeds 0.9 (reported on a 0 to 1 scale)
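
A minimal sketch of this metric follows. The R² definition is the standard coefficient of determination; pooling the residuals over all time steps and dimensions of a trajectory is an assumption, since the summary does not say how per-trajectory scores are aggregated.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination, pooled over time steps and dimensions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

def r2_accuracy(scores, threshold=0.9):
    """Fraction of R² scores strictly above the threshold."""
    return float(np.mean(np.asarray(scores) > threshold))

print(r2_accuracy([0.95, 0.99, 0.5, 0.91]))  # 0.75
```

Thresholding makes the metric robust to a few badly diverging predictions, whose R² can be arbitrarily negative and would otherwise dominate an average.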

Comparison Methods

ODEFormer: Pretrained neural symbolic regression method trained on 50 million equations with 86 million parameters

Experimental Tasks

Reconstruction Task: Measures reconstruction performance on in-context trajectories
Generalization Task: Measures reconstruction performance on held-out trajectories

Experimental Results

Main Results

Model	Reconstruction Task (R² accuracy)	Generalization Task (R² accuracy)
ODEFormer	0.65	0.19
FIM-ODE	0.90	0.26

Key Findings:

FIM-ODE outperforms ODEFormer on both tasks (0.90 vs 0.65 on reconstruction, 0.26 vs 0.19 on generalization)
Generalization task is more challenging than reconstruction, which aligns with intuition
Despite ODEFormer being trained on broader distributions with more parameters, FIM-ODE still achieves superior performance

Multi-Trajectory Context Analysis

Figure 1 shows FIM-ODE's vector field estimation with varying numbers of context trajectories:

Single Trajectory: Inaccurate estimation at positions far from observations
Multiple Trajectories: As trajectory count increases, FIM-ODE corrects these estimates, effectively covering larger spatial regions

Local vs. Global Prediction Comparison

Figure 2 compares vector field estimates between FIM-ODE and ODEFormer:

FIM-ODE:
- Local: Predicts complex patterns at observations to reconstruct trajectories
- Global: Predicts simpler patterns far from observations
ODEFormer: Predicts more complex vector fields, producing complex global patterns that a single simple trajectory does not support

Structural Difference Analysis

Differences between the two models stem from different vector field parameterizations:

ODEFormer: Restricted to (rational) polynomial symbolic equations, which may not default to simple expressions under sparse or noisy observations
FIM-ODE: Neural operators handle these cases more flexibly

Related Work

Traditional ODE Inference Methods

Non-parametric Methods: Such as Gaussian processes
Symbolic Regression Methods: Traditional symbolic regression based on genetic algorithms or other optimization methods

Foundation Inference Models Framework

FIM-CTMC: Continuous-time Markov chain inference
FIM-SDE: Stochastic differential equation inference
FIM-PP: Point process inference
FIM-ODE in this paper extends the framework to ODE inference

Neural Symbolic Regression

ODEFormer: Pretrained neural method converting time series observations into symbolic equations

Conclusions and Discussion

Main Conclusions

FIM-ODE successfully extends the Foundation Inference Models framework to ODE inference problems
On synthetic datasets, FIM-ODE outperforms the existing state-of-the-art method, ODEFormer, in R² accuracy
The flexibility of neural operators enables FIM-ODE to provide more reasonable global vector field predictions
Multi-trajectory processing capability is an important advantage of FIM-ODE over ODEFormer

Limitations

Data Distribution Constraints: Currently validated only on polynomial vector fields; real systems may be more complex
Dimensionality Constraints: Current experiments limited to 3-dimensional systems
Evaluation Scope: Requires validation on broader ODE systems
Computational Efficiency: Paper lacks detailed discussion of computational complexity and inference speed

Future Directions

ODEBench Evaluation: Compare the methods on ODEBench, a benchmark dataset containing 63 hand-selected ODEs
Latent Dynamics Discovery: Explore discovering latent dynamics using pretrained FIM-ODE
Application Extensions:
- Neural population dynamics
- Chemical reaction kinetics
- Natural language content evolution

In-Depth Evaluation

Strengths

Methodological Innovation: First application of FIM framework to ODE inference with well-designed architecture
Technical Advantages:
- Multi-trajectory processing capability
- Flexible neural operator architecture
- In-context learning ability
Experimental Sufficiency:
- Direct comparison with strong baseline
- Multi-perspective analysis (reconstruction vs. generalization, local vs. global)
- Visualization analysis enhances understanding
Result Convincingness: Outperforms the comparison method, ODEFormer, on both tasks under the reported R² accuracy metric

Weaknesses

Limited Experimental Scope:
- Validation only on synthetic polynomial data
- Lacks validation on real-world data
- Limited dimensionality and complexity
Insufficient Comparisons:
- Comparison only with ODEFormer, lacking traditional methods
- No computational efficiency comparison
Missing Theoretical Analysis:
- Lacks convergence or generalization guarantees
- No analysis of theoretical advantages
Insufficient Technical Details:
- Brief training details description
- Lack of hyperparameter selection explanation

Impact

Academic Contribution:
- Extends FIM framework application scope
- Provides new neural network method for ODE inference
Practical Value:
- Zero-shot inference capability has practical application potential
- Multi-trajectory processing more practical in real scenarios
Reproducibility:
- Based on existing FIM-SDE architecture with clear technical approach
- Lacks implementation details

Applicable Scenarios

Scientific Computing: Dynamical system modeling in physics, biology, chemistry and other fields
Engineering Applications: Control systems, signal processing and other scenarios requiring system identification
Data-Sparse Scenarios: Particularly suitable for limited or noisy observational data
Multi-Trajectory Data: Significant advantages when multiple observation trajectories of the same system are available

References

This paper primarily references the following key works:

d'Ascoli et al. (2024): Original ODEFormer paper
Seifner et al. (2025a): FIM-SDE framework
Lu et al. (2021): DeepONet neural operators
Berghaus et al. (2024): Foundation work of FIM framework

Overall Assessment: This is a technically solid paper that successfully extends the Foundation Inference Models framework to ODE inference problems. While experimental scope is limited, it demonstrates clear advantages within its defined settings. This work provides a valuable new approach for system identification problems in scientific computing with promising development prospects.