
Towards Foundation Inference Models that Learn ODEs In-Context

Mauel, Hinz, Seifner et al.
Ordinary differential equations (ODEs) describe dynamical systems evolving deterministically in continuous time. Accurate data-driven modeling of systems as ODEs, a central problem across the natural sciences, remains challenging, especially if the data is sparse or noisy. We introduce FIM-ODE (Foundation Inference Model for ODEs), a pretrained neural model designed to estimate ODEs zero-shot (i.e., in context) from sparse and noisy observations. Trained on synthetic data, the model utilizes a flexible neural operator for robust ODE inference, even from corrupted data. We empirically verify that FIM-ODE provides accurate estimates, on par with a neural state-of-the-art method, and qualitatively compare the structure of their estimated vector fields.

Basic Information

  • Paper ID: 2510.12650
  • Title: Towards Foundation Inference Models that Learn ODEs In-Context
  • Authors: Maximilian Mauel, Manuel Hinz, Patrick Seifner, David Berghaus, Ramsés J. Sánchez
  • Classification: cs.LG (Machine Learning)
  • Publication Time/Conference: AI in Science (AIS), 2025, Copenhagen, Denmark
  • Paper Link: https://arxiv.org/abs/2510.12650

Abstract

Ordinary differential equations (ODEs) describe dynamical systems with deterministic evolution in continuous time. Accurately modeling ODE systems from data is a fundamental problem in the natural sciences, yet it remains challenging when the data is sparse or noisy. This paper introduces FIM-ODE (Foundation Inference Model for ODEs), a pretrained neural model designed to estimate ODEs from sparse and noisy observations in a zero-shot (in-context) manner. The model is trained on synthetic data and uses a flexible neural operator for robust ODE inference, even on corrupted data. Experiments show that FIM-ODE provides accurate estimates, with performance comparable to a state-of-the-art neural method, and the paper qualitatively compares the vector fields estimated by the two models.

Research Background and Motivation

Problem Definition

The core problem addressed in this research is the ODE system identification problem: estimating the ODE (i.e., vector field) that best describes a system based solely on time series observations. This has broad applications in natural sciences, from Newton's laws of motion to population dynamics in biological systems to atmospheric convection in meteorology.

Problem Significance

  1. Broad Applicability: ODEs are fundamental modeling tools across multiple fields including physics, biology, and meteorology
  2. Predictive Capability: Accurate ODE models can characterize latent phenomena (such as fixed points or limit cycles) and predict future states
  3. Scientific Understanding: ODE models facilitate understanding of system mechanisms and dynamical properties

Limitations of Existing Methods

  1. Traditional Methods: Non-parametric methods or symbolic regression-based approaches perform poorly with sparse, noisy data
  2. ODEFormer Limitations: Although it is a state-of-the-art neural symbolic regression method, it can only handle a single trajectory and may produce unreasonably complex patterns in its global vector field predictions

Research Motivation

Building on the Foundation Inference Models (FIMs) framework, which has demonstrated effectiveness in continuous-time Markov chain, stochastic differential equation, and point process inference, the authors propose FIM-ODE, a model specifically designed for ODE inference.

Core Contributions

  1. Proposes FIM-ODE Model: The first pretrained ODE inference model based on the Foundation Inference Models framework
  2. Neural Operator Architecture: Employs DeepONet neural operators for flexible vector field estimation
  3. Multi-Trajectory Processing: Capable of simultaneously handling multiple trajectories of the same system, improving inference accuracy
  4. Superior Performance: Outperforms ODEFormer on synthetic datasets with R² accuracy of 0.90 vs 0.65 (reconstruction task) and 0.26 vs 0.19 (generalization task)
  5. More Reasonable Global Prediction: Provides simpler and more reasonable vector field predictions in regions far from observations compared to ODEFormer

Methodology Details

Task Definition

Given a collection of time series observations $\mathcal{D} = \{y_k\}_{k=1}^K$, where each sequence $y_k = [(t_{k1}, y_{k1}), \ldots, (t_{kL}, y_{kL})]$, the goal is to estimate a vector field $\hat{f}$ that describes the ODE system generating these observations:

$$\frac{dx(t)}{dt} = f(t, x(t))$$

Model Architecture

1. Synthetic Data Generation

  • Samples each component of the vector field from multivariate polynomial distributions (maximum degree 3)
  • Supports ODE systems up to 3 dimensions
  • Simulates systems on irregular grids and adds noise to generate training data
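The three generation steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' generator: the coefficient distribution (`coef_scale` times a standard normal), the noise level, the time horizon, and the fixed-step RK4 integrator are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import itertools
import numpy as np

def monomial_exponents(dim, max_degree=3):
    """All exponent tuples (a_1, ..., a_dim) with total degree <= max_degree."""
    return [e for e in itertools.product(range(max_degree + 1), repeat=dim)
            if sum(e) <= max_degree]

def sample_polynomial_field(dim, rng, max_degree=3, coef_scale=0.5):
    """Draw one random polynomial vector field f: R^dim -> R^dim.

    Assumption: coefficients are i.i.d. Gaussian; the paper's actual
    coefficient distribution is not given in this summary.
    """
    exps = np.array(monomial_exponents(dim, max_degree))        # (M, dim)
    coefs = coef_scale * rng.standard_normal((dim, len(exps)))  # (dim, M)

    def f(x):
        monomials = np.prod(np.asarray(x) ** exps, axis=1)      # (M,)
        return coefs @ monomials

    return f

def simulate_irregular(f, x0, times, substeps=20):
    """Integrate dx/dt = f(x) with RK4 and report the state at each entry of `times`."""
    x = np.asarray(x0, dtype=float)
    states = [x.copy()]
    for t0, t1 in zip(times[:-1], times[1:]):
        h = (t1 - t0) / substeps
        for _ in range(substeps):
            k1 = f(x)
            k2 = f(x + 0.5 * h * k1)
            k3 = f(x + 0.5 * h * k2)
            k4 = f(x + h * k3)
            x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        states.append(x.copy())
    return np.stack(states)

rng = np.random.default_rng(0)
dim = 2
f = sample_polynomial_field(dim, rng)
times = np.sort(rng.uniform(0.0, 1.0, size=30))                 # irregular grid
clean = simulate_irregular(f, rng.standard_normal(dim), times)
noisy = clean + 0.05 * rng.standard_normal(clean.shape)         # additive noise
```

Random cubic fields can diverge in finite time, so a practical generator would presumably reject or truncate unstable samples; that filtering step is omitted here.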

2. Neural Operator Architecture (DeepONet)

FIM-ODE employs the DeepONet neural operator architecture with three main components:

Branch Network:

  • Uses a Transformer encoder
  • Encodes observation data $\mathcal{D}$ into $K(L-1)$ representations of dimension $E$: $\mathcal{D} \in \mathbb{R}^{E \times K(L-1)}$
  • Keeps a separate encoding for nearly every observation: $K(L-1)$ encodings for the $KL$ observations

Trunk Network:

  • Linear mapping that encodes position $x \in \mathbb{R}^D$ into $h(x) \in \mathbb{R}^E$

Combination Network:

  • Sequence of residual attention layers similar to Transformer decoders
  • Uses $\mathcal{D}$ as keys and values, $h(x)$ as queries
  • Final linear projection yields vector field estimate $\hat{f}(x)$
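To make the data flow through the trunk and combination networks concrete, here is a small NumPy sketch. It is an illustration under several assumptions, not the FIM-ODE implementation: the weights are random and untrained, attention is single-head without layer normalization or feed-forward sublayers, and the branch output is replaced by a random array standing in for the Transformer encoder. The context is stored row-wise with shape (K(L-1), E), the transpose of the layout used in the text. All names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def combine(query, context, layer):
    """One residual cross-attention layer: the query attends over the context.

    query: (E,) encoding of the location x; context: (N, E) observation encodings.
    """
    q = layer["Wq"] @ query                      # (E,)
    k = context @ layer["Wk"].T                  # (N, E) keys
    v = context @ layer["Wv"].T                  # (N, E) values
    weights = softmax(k @ q / np.sqrt(q.size))   # (N,) attention weights
    return query + weights @ v                   # residual connection

def estimate_field(x, context, params):
    """Trunk (linear map) -> combination layers -> final linear projection."""
    h = params["W_trunk"] @ x + params["b_trunk"]
    for layer in params["layers"]:
        h = combine(h, context, layer)
    return params["W_out"] @ h + params["b_out"]

def init_params(dim, embed, n_layers, rng):
    """Random, untrained weights (assumption: scaled Gaussian initialization)."""
    def mat(rows, cols):
        return rng.standard_normal((rows, cols)) / np.sqrt(cols)
    return {
        "W_trunk": mat(embed, dim), "b_trunk": np.zeros(embed),
        "layers": [{"Wq": mat(embed, embed), "Wk": mat(embed, embed),
                    "Wv": mat(embed, embed)} for _ in range(n_layers)],
        "W_out": mat(dim, embed), "b_out": np.zeros(dim),
    }

rng = np.random.default_rng(0)
D, E, K, L = 2, 16, 3, 5
context = rng.standard_normal((K * (L - 1), E))  # stand-in for the branch network output
params = init_params(D, E, n_layers=2, rng=rng)
f_hat = estimate_field(np.array([0.5, -1.0]), context, params)
```

Because the query location enters only through the trunk encoding, the same context encodings can be reused to evaluate the estimate at any number of locations.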

3. Training Objective

Employs supervised learning objective: $L(x, \mathcal{D}, f) = \|\hat{f}(x) - f(x)\|^2$

Matches predicted and true vector fields at sampled points $x$ near observations.
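As a rough illustration of this objective, the sketch below averages the squared error over query points drawn near the observations. How the paper samples those points is not stated in this summary, so the Gaussian jitter around randomly chosen observations is an assumption, and both function names are hypothetical.

```python
import numpy as np

def sample_query_points(observations, rng, n_points=64, jitter=0.1):
    """Pick random observations and perturb them (assumed sampling scheme)."""
    idx = rng.integers(0, len(observations), size=n_points)
    noise = jitter * rng.standard_normal((n_points, observations.shape[1]))
    return observations[idx] + noise

def vector_field_loss(f_hat, f_true, points):
    """Mean over points of ||f_hat(x) - f_true(x)||^2."""
    errors = np.stack([f_hat(x) - f_true(x) for x in points])
    return float(np.mean(np.sum(errors ** 2, axis=1)))

rng = np.random.default_rng(0)
observations = rng.standard_normal((50, 2))   # toy observed states
points = sample_query_points(observations, rng, n_points=10)

f_true = lambda x: -x                         # toy ground-truth field
f_hat = lambda x: -x + 1.0                    # estimate off by 1 in each component
loss = vector_field_loss(f_hat, f_true, points)
```

In this toy case every error vector is (1, 1), so the loss equals 2 regardless of where the points fall.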

Technical Innovations

  1. In-Context Learning Capability: Handles new ODE systems without further training or fine-tuning
  2. Multi-Trajectory Fusion: Simultaneously processes multiple trajectories, effectively extracting and combining all available information
  3. Flexible Function Approximation: Neural operators are more flexible than symbolic regression when handling sparse, noisy data
  4. Local-Global Balance: Provides complex predictions near observations while offering simple, reasonable predictions far from observations

Experimental Setup

Dataset

  • Training Data: 600,000 synthetic ODEs; the FIM-ODE model itself has approximately 20 million parameters
  • Test Data: 4,000 polynomial vector field ODEs (maximum degree 3, up to 3 dimensions)
  • Trajectory Settings: Each ODE generates 9 trajectories with initial states sampled from $N(0,1)$
  • Observation Settings: 200 observation points per trajectory on a regular grid with time interval $\Delta\tau = 0.05$
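The trajectory and observation settings translate into array shapes as follows. The damped rotation used here is a hypothetical stand-in system, chosen only because it has a closed-form solution; it is not one of the paper's test ODEs.

```python
import numpy as np

NUM_TRAJECTORIES = 9
NUM_OBSERVATIONS = 200
DELTA_TAU = 0.05

def damped_rotation(x0, times, decay=0.1, omega=1.0):
    """Closed-form solution of dx/dt = A x with A = [[-decay, -omega], [omega, -decay]]."""
    c, s = np.cos(omega * times), np.sin(omega * times)
    envelope = np.exp(-decay * times)
    x = envelope * (c * x0[0] - s * x0[1])
    y = envelope * (s * x0[0] + c * x0[1])
    return np.stack([x, y], axis=-1)                          # (len(times), 2)

rng = np.random.default_rng(0)
times = DELTA_TAU * np.arange(NUM_OBSERVATIONS)               # regular grid on [0, 9.95]
initial_states = rng.standard_normal((NUM_TRAJECTORIES, 2))   # x0 ~ N(0, 1) per component
trajectories = np.stack([damped_rotation(x0, times) for x0 in initial_states])
# trajectories has shape (9, 200, 2): trajectory, time step, state dimension
```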

Evaluation Metrics

Uses R² accuracy: the fraction of test cases with an R² score greater than 0.9
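A minimal sketch of this metric follows. How R² is aggregated across state dimensions is not specified in this summary; pooling the residuals over all time steps and dimensions is an assumption, and the function names are hypothetical.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination, pooled over time steps and dimensions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

def r2_accuracy(scores, threshold=0.9):
    """Fraction of scores strictly above the threshold."""
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(scores > threshold))

scores = [0.95, 0.50, 0.91, 0.20]
accuracy = r2_accuracy(scores)   # two of the four scores exceed 0.9
```

A NaN score, for example from a diverged simulation, compares as false against the threshold and therefore counts as a failure.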

Comparison Methods

ODEFormer: Pretrained neural symbolic regression method trained on 50 million equations with 86 million parameters

Experimental Tasks

  1. Reconstruction Task: Measures reconstruction performance on in-context trajectories
  2. Generalization Task: Measures reconstruction performance on held-out trajectories

Experimental Results

Main Results

Model       Reconstruction Task   Generalization Task
ODEFormer   0.65                  0.19
FIM-ODE     0.90                  0.26

Key Findings:

  • FIM-ODE significantly outperforms ODEFormer on both tasks
  • Generalization task is more challenging than reconstruction, which aligns with intuition
  • Despite ODEFormer being trained on broader distributions with more parameters, FIM-ODE still achieves superior performance

Multi-Trajectory Context Analysis

Figure 1 shows FIM-ODE's vector field estimation with varying numbers of context trajectories:

  • Single Trajectory: Inaccurate estimation at positions far from observations
  • Multiple Trajectories: As trajectory count increases, FIM-ODE corrects these estimates, effectively covering larger spatial regions

Local vs. Global Prediction Comparison

Figure 2 compares vector field estimates between FIM-ODE and ODEFormer:

  • FIM-ODE:
    • Local: Predicts complex patterns at observations to reconstruct trajectories
    • Global: Predicts simpler patterns far from observations
  • ODEFormer: Predicts more complex vector fields, resulting in complex global patterns that a single simple trajectory does not support

Structural Difference Analysis

Differences between the two models stem from different vector field parameterizations:

  • ODEFormer: Restricted to (rational) polynomial symbolic equations, which may not default to simple expressions under sparse or noisy observations
  • FIM-ODE: Neural operators handle these cases more flexibly

Related Work

Traditional ODE Inference Methods

  1. Non-parametric Methods: Such as Gaussian processes
  2. Symbolic Regression Methods: Traditional symbolic regression based on genetic algorithms or other optimization methods

Foundation Inference Models Framework

  • FIM-CTMC: Continuous-time Markov chain inference
  • FIM-SDE: Stochastic differential equation inference
  • FIM-PP: Point process inference
  • FIM-ODE in this paper extends the framework to ODE inference

Neural Symbolic Regression

ODEFormer: Pretrained neural method converting time series observations into symbolic equations

Conclusions and Discussion

Main Conclusions

  1. FIM-ODE successfully extends the Foundation Inference Models framework to ODE inference problems
  2. On synthetic datasets, FIM-ODE significantly outperforms ODEFormer, the existing state-of-the-art method
  3. The flexibility of neural operators enables FIM-ODE to provide more reasonable global vector field predictions
  4. Multi-trajectory processing capability is an important advantage of FIM-ODE over ODEFormer

Limitations

  1. Data Distribution Constraints: Currently validated only on polynomial vector fields; real systems may be more complex
  2. Dimensionality Constraints: Current experiments limited to 3-dimensional systems
  3. Evaluation Scope: Requires validation on broader ODE systems
  4. Computational Efficiency: Paper lacks detailed discussion of computational complexity and inference speed

Future Directions

  1. ODEBench Evaluation: Compare methods on benchmark dataset containing 63 hand-selected ODEs
  2. Latent Dynamics Discovery: Explore discovering latent dynamics using pretrained FIM-ODE
  3. Application Extensions:
    • Neural population dynamics
    • Chemical reaction kinetics
    • Natural language content evolution

In-Depth Evaluation

Strengths

  1. Methodological Innovation: First application of FIM framework to ODE inference with well-designed architecture
  2. Technical Advantages:
    • Multi-trajectory processing capability
    • Flexible neural operator architecture
    • In-context learning ability
  3. Experimental Sufficiency:
    • Direct comparison with strong baseline
    • Multi-perspective analysis (reconstruction vs. generalization, local vs. global)
    • Visualization analysis enhances understanding
  4. Result Convincingness: Outperforms ODEFormer in R² accuracy on both the reconstruction and generalization tasks

Weaknesses

  1. Limited Experimental Scope:
    • Validation only on synthetic polynomial data
    • Lacks validation on real-world data
    • Limited dimensionality and complexity
  2. Insufficient Comparisons:
    • Comparison only with ODEFormer, lacking traditional methods
    • No computational efficiency comparison
  3. Missing Theoretical Analysis:
    • Lacks convergence or generalization guarantees
    • No analysis of theoretical advantages
  4. Insufficient Technical Details:
    • Brief training details description
    • Lack of hyperparameter selection explanation

Impact

  1. Academic Contribution:
    • Extends FIM framework application scope
    • Provides new neural network method for ODE inference
  2. Practical Value:
    • Zero-shot inference capability has practical application potential
    • Multi-trajectory processing more practical in real scenarios
  3. Reproducibility:
    • Based on existing FIM-SDE architecture with clear technical approach
    • Lacks detailed implementation details

Applicable Scenarios

  1. Scientific Computing: Dynamical system modeling in physics, biology, chemistry and other fields
  2. Engineering Applications: Control systems, signal processing and other scenarios requiring system identification
  3. Data-Sparse Scenarios: Particularly suitable for limited or noisy observational data
  4. Multi-Trajectory Data: Significant advantages when multiple observation trajectories of the same system are available

References

This paper primarily references the following key works:

  • d'Ascoli et al. (2024): Original ODEFormer paper
  • Seifner et al. (2025a): FIM-SDE framework
  • Lu et al. (2021): DeepONet neural operators
  • Berghaus et al. (2024): Foundation work of FIM framework

Overall Assessment: This is a technically solid paper that successfully extends the Foundation Inference Models framework to ODE inference problems. While experimental scope is limited, it demonstrates clear advantages within its defined settings. This work provides a valuable new approach for system identification problems in scientific computing with promising development prospects.