Asynchronous Event-Inertial Odometry using a Unified Gaussian Process Regression Framework

Li, Wang, Liu et al.

Recent works have combined monocular event camera and inertial measurement unit to estimate the $SE(3)$ trajectory. However, the asynchronicity of event cameras brings a great challenge to conventional fusion algorithms. In this paper, we present an asynchronous event-inertial odometry under a unified Gaussian Process (GP) regression framework to naturally fuse asynchronous data associations and inertial measurements. A GP latent variable model is leveraged to build data-driven motion prior and acquire the analytical integration capacity. Then, asynchronous event-based feature associations and integral pseudo measurements are tightly coupled using the same GP framework. Subsequently, this fusion estimation problem is solved by underlying factor graph in a sliding-window manner. With consideration of sparsity, those historical states are marginalized orderly. A twin system is also designed for comparison, where the traditional inertial preintegration scheme is embedded in the GP-based framework to replace the GP latent variable model. Evaluations on public event-inertial datasets demonstrate the validity of both systems. Comparison experiments show competitive precision compared to the state-of-the-art synchronous scheme.

Basic Information

Paper ID: 2412.03136
Title: Asynchronous Event-Inertial Odometry using a Unified Gaussian Process Regression Framework
Authors: Xudong Li, Zhixiang Wang, Zihao Liu, Yizhai Zhang, Fan Zhang, Xiuming Yao, Panfeng Huang
Category: cs.RO (Robotics)
Publication Date: December 4, 2024 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2412.03136

Abstract

This paper proposes an asynchronous event-inertial odometry method based on a unified Gaussian Process (GP) regression framework for naturally fusing asynchronous data association and inertial measurements. The method leverages GP latent variable models to construct data-driven motion priors and obtain analytical integration capabilities, then tightly couples asynchronous event feature association and integrated pseudo-measurements within the same GP framework. The fusion estimation problem is solved through a sliding window factor graph with ordered marginalization of historical states considering sparsity. The authors also design a comparative system that embeds traditional inertial preintegration schemes into the GP framework. Evaluation on public event-inertial datasets demonstrates the effectiveness of both systems, with comparative experiments showing accuracy comparable to state-of-the-art synchronous methods.

Research Background and Motivation

Problem Definition

Event cameras are bio-inspired visual sensors with an asynchronous triggering mechanism: each pixel independently records changes in light intensity. This gives event cameras significant advantages over traditional cameras: low power consumption, low latency, high dynamic range, and high temporal resolution. However, their asynchronous nature presents substantial challenges for traditional fusion algorithms.

Limitations of Existing Methods

Frame-based discrete-time schemes: Accumulate events into fixed temporal windows, losing temporal diversity of events, resulting in motion blur and requiring additional deblurring operations
Traditional IMU preintegration: Applied within discrete-time frameworks, discarding numerous inter-frame temporal measurements and losing fine-grained motion information
Computational efficiency: Existing GP methods typically employ full smoothing backends with high computational costs

Research Motivation

To fully exploit the high temporal resolution characteristics of event cameras, there is an urgent need to introduce new methods for fusing asynchronous and high temporal resolution event-inertial observations. This paper focuses on the asynchronous measurement fusion problem and proposes a solution based on a unified GP framework.

Core Contributions

Unified GP Framework: Proposes a unified Gaussian Process regression framework capable of naturally handling fusion of asynchronous event feature association and inertial measurements
GP Latent Variable Model: Introduces latent variable models into the GP regression framework for analytical integration of inertial measurements and implicitly inducing data-driven GP priors
Dual System Design: Implements comparative systems for two fusion approaches:
- CT-IMU: Sparse GP prior + IMU preintegration
- GP-IMU: GP regression preintegration
Efficient Sliding Window: Employs sliding window factor graph optimization with marginalization strategies to maintain computational efficiency
Fully Asynchronous Processing: Uses EKLT for event-driven feature detection and tracking, preserving the high temporal resolution characteristics of event cameras

Methodology Details

Task Definition

Input: Asynchronous event stream and IMU measurement data
Output: SE(3) trajectory estimation (including position, orientation, and velocity)
Constraints: Handle asynchronous data association while maintaining computational efficiency

Model Architecture

1. Sparse GP Prior

Employs a white noise acceleration (WNOA) motion prior for $SE(3)$ modeling:

$$\dot{T}_{wb}(t) = T_{wb}(t)\,\varpi^{b}_{wb}(t)^{\wedge}$$
$$\dot{\varpi}^{b}_{wb}(t) = w(t), \quad w(t) \sim \mathcal{GP}(0, Q_c\,\delta(t-t'))$$

where $\varpi^{b}_{wb}(t)$ is the velocity in body coordinates and $w(t)$ is a generalized acceleration vector modeled as a zero-mean white-noise GP.
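As a rough illustration of what the WNOA prior implies between two states, the sketch below uses a plain vector-space state (position stacked on velocity) rather than the $SE(3)$ formulation of the paper. The function names and the value of `Qc` are assumptions made for this example; the 0.05 s spacing is the state interval listed under Implementation Details.

```python
import numpy as np


def wnoa_transition(dt: float, dim: int) -> np.ndarray:
    """Transition matrix Phi(dt) for a stacked state [position; velocity]."""
    eye = np.eye(dim)
    return np.block([[eye, dt * eye], [np.zeros((dim, dim)), eye]])


def wnoa_process_noise(dt: float, Qc: np.ndarray) -> np.ndarray:
    """Covariance accumulated over dt when acceleration is white noise with power spectral density Qc."""
    return np.block([
        [dt**3 / 3.0 * Qc, dt**2 / 2.0 * Qc],
        [dt**2 / 2.0 * Qc, dt * Qc],
    ])


def wnoa_prior_residual(x_k: np.ndarray, x_k1: np.ndarray, dt: float) -> np.ndarray:
    """Deviation of x_{k+1} from the constant-velocity prediction made from x_k."""
    dim = x_k.size // 2
    return x_k1 - wnoa_transition(dt, dim) @ x_k


dt = 0.05                      # state interval from the implementation details
Qc = 0.1 * np.eye(3)           # assumed value, for illustration only
x_k = np.array([0.0, 0.0, 0.0, 1.0, 0.5, 0.0])
x_k1 = wnoa_transition(dt, 3) @ x_k          # a perfectly constant-velocity step
residual = wnoa_prior_residual(x_k, x_k1, dt)
Q = wnoa_process_noise(dt, Qc)
print(residual, Q.shape)
```

In a factor graph, such a residual would be weighted by the inverse of `Q`, so motion that departs from constant velocity is penalized more strongly over short intervals.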

2. GP Regression Preintegration

Models relative acceleration and rotational velocity as independent GPs:

$$\dot{r}^{b_k}_{b}(t) \sim \mathcal{GP}(0, k_r(t,t'))$$
$$a^{b}_{b_k}(t) \sim \mathcal{GP}(0, k_a(t,t'))$$

Obtains noisy observations of GP through latent states ρ̂ and α̂, then leverages GP inference capabilities to compute preintegrated velocity, position, and rotation increments.
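The idea of analytical integration can be sketched in one dimension: once a GP is fitted to noisy acceleration samples, the integral of its posterior mean is available in closed form because the kernel itself can be integrated. The example below is a simplification, not the paper's method. It regresses directly on the samples instead of using latent states, and it assumes a squared-exponential kernel with hand-picked hyperparameters (`sigma`, `ell`, `noise_var`), all of which are illustrative choices.

```python
import math

import numpy as np

_erf = np.vectorize(math.erf)


def rbf_kernel(t1, t2, sigma, ell):
    """Squared-exponential kernel k(t, t') = sigma^2 exp(-(t - t')^2 / (2 ell^2))."""
    d = t1[:, None] - t2[None, :]
    return sigma**2 * np.exp(-0.5 * (d / ell) ** 2)


def rbf_kernel_integral(t_query, t_train, sigma, ell, t0=0.0):
    """Closed-form integral of k(s, t_train) over s from t0 to each t_query."""
    scale = sigma**2 * ell * math.sqrt(math.pi / 2.0)
    upper = _erf((t_query[:, None] - t_train[None, :]) / (math.sqrt(2.0) * ell))
    lower = _erf((t0 - t_train[None, :]) / (math.sqrt(2.0) * ell))
    return scale * (upper - lower)


def gp_integrated_mean(t_train, y_train, t_query, sigma=1.0, ell=0.2, noise_var=1e-4):
    """Integral of the GP posterior mean from 0 to t_query (a velocity increment)."""
    K = rbf_kernel(t_train, t_train, sigma, ell) + noise_var * np.eye(t_train.size)
    weights = np.linalg.solve(K, y_train)
    return rbf_kernel_integral(t_query, t_train, sigma, ell) @ weights


t_train = np.linspace(0.0, 1.0, 40)
accel = np.sin(2.0 * np.pi * t_train)        # synthetic acceleration samples
delta_v = gp_integrated_mean(t_train, accel, np.array([0.5]))
print(delta_v[0], 1.0 / math.pi)             # the exact integral is 1/pi
```

Because the integral is a linear operation on the kernel, no numerical quadrature over the raw samples is needed, and the increment can be queried at any time, which is what makes this style of preintegration convenient for asynchronous measurements.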

3. System Architecture

The entire system comprises two parallel threads:

Asynchronous feature tracking frontend: Uses EKLT for event-driven feature detection and tracking
GP-based sliding window backend: Handles feature management, triangulation, and factor graph optimization

Technical Innovations

1. Unified Framework Design

Both methods operate within the same GP framework but handle IMU data differently:

CT-IMU: Queries states on continuous-time trajectory, separately fuses IMU measurements
GP-IMU: Relies on IMU measurements for state inference, reducing trajectory prior constraints

2. Interpolation Projection Factor

Obtains the pose $T_{wb_\tau}$ at measurement time $t_\tau$ through GP interpolation, with the visual residual defined as:

$$r_V(T_{wb_\tau}, l_i, \hat{z}_i) = \hat{z}_i - \frac{1}{d_i} K \left(T_{wb_\tau} T_{b_\tau c_\tau}\right)^{T} l_i$$
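To make the interpolation step concrete, the sketch below queries a state between two estimated states under a WNOA prior. It works in a plain vector space (position stacked on velocity); the $SE(3)$ case used in the paper is more involved and is not reproduced here. The helper names and the value of `Qc` are assumptions for illustration.

```python
import numpy as np


def phi(dt, dim):
    """WNOA transition matrix over an interval dt."""
    eye = np.eye(dim)
    return np.block([[eye, dt * eye], [np.zeros((dim, dim)), eye]])


def q(dt, Qc):
    """WNOA process noise covariance over an interval dt."""
    return np.block([
        [dt**3 / 3.0 * Qc, dt**2 / 2.0 * Qc],
        [dt**2 / 2.0 * Qc, dt * Qc],
    ])


def gp_interpolate(x_k, x_k1, t_k, t_k1, tau, Qc):
    """Posterior mean of the state at time tau, with t_k <= tau <= t_k1."""
    dim = Qc.shape[0]
    psi = q(tau - t_k, Qc) @ phi(t_k1 - tau, dim).T @ np.linalg.inv(q(t_k1 - t_k, Qc))
    lam = phi(tau - t_k, dim) - psi @ phi(t_k1 - t_k, dim)
    return lam @ x_k + psi @ x_k1


Qc = 0.1 * np.eye(3)                         # assumed value, for illustration only
t_k, t_k1, tau = 0.0, 0.05, 0.02             # tau plays the role of a measurement time
x_k = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0])
x_k1 = phi(t_k1 - t_k, 3) @ x_k              # constant-velocity motion
x_tau = gp_interpolate(x_k, x_k1, t_k, t_k1, tau, Qc)
print(x_tau)
```

The interpolated state depends only on the two neighboring states, so a measurement taken at an arbitrary time adds a factor touching those two states without adding a new state variable.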

3. Sliding Window Optimization

Employs a dynamic marginalization strategy in which historical states and their related landmarks are marginalized in order, so that the Hessian matrix stays sparse.
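Marginalization in sliding-window estimators is commonly carried out with a Schur complement on the linearized system, and the sketch below shows that generic operation. It is not taken from the paper's implementation: the function name, the variable ordering (marginalized variables first), and the random test system are all assumptions for illustration.

```python
import numpy as np


def marginalize(H, b, m):
    """Marginalize the first m variables of the linear system H dx = b."""
    H_mm, H_mr = H[:m, :m], H[:m, m:]
    H_rm, H_rr = H[m:, :m], H[m:, m:]
    H_mm_inv = np.linalg.inv(H_mm)
    # The Schur complement yields a prior on the remaining variables.
    H_prior = H_rr - H_rm @ H_mm_inv @ H_mr
    b_prior = b[m:] - H_rm @ H_mm_inv @ b[:m]
    return H_prior, b_prior


rng = np.random.default_rng(0)
A = rng.standard_normal((12, 6))
H = A.T @ A + 1e-3 * np.eye(6)               # symmetric positive definite toy Hessian
b = rng.standard_normal(6)
H_prior, b_prior = marginalize(H, b, m=2)

full_solution = np.linalg.solve(H, b)
reduced_solution = np.linalg.solve(H_prior, b_prior)
print(np.allclose(reduced_solution, full_solution[2:]))
```

The term subtracted from `H_rr` couples every remaining variable that shared a factor with the removed ones. This fill-in is why the choice and order of marginalized states matters for keeping the Hessian sparse.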

Experimental Setup

Datasets

DAVIS Dataset: Records aggressive motion data using DAVIS240C (240×180) across multiple scenes
MVSEC Dataset: Uses left event camera data (DAVIS 346B, 346×260)

Evaluation Metrics

RMS RTE: Root mean square relative trajectory error for accuracy assessment
Computation Time: Average time consumption of each module
Factor Graph Scale: Complexity indicator of optimization problem

Comparison Methods

Vidal et al. [3] (E+I configuration)
Guan & Lu [4] event-inertial method
Internal comparison of two proposed methods

Implementation Details

Parallax threshold: 8 pixels
Minimum feature track length: 4
GP-IMU latent states: 400
Sliding window minimum size: 40
State temporal interval: 0.05 seconds

Experimental Results

Main Results

Sequence	CT-IMU	GP-IMU	Guan & Lu [4]	Vidal et al. [3]
dynamic translation	0.030	0.060	0.056	0.037
dynamic 6dof	0.076	0.056	0.073	0.040
poster translation	0.087	0.082	0.242	0.087
poster 6dof	0.156	0.084	0.210	0.197
boxes 6dof	0.347	0.151	0.073	0.078
shapes 6dof	0.108	0.244	---	0.163

Performance Analysis

Accuracy Performance: Both methods demonstrate accuracy comparable to discrete optimization methods on most sequences. Both achieve lower error than the two reference methods on poster 6dof, while both trail them on boxes 6dof
Computational Efficiency: GP-IMU typically exhibits lower computational costs due to fewer variables
Robustness: GP-IMU is more sensitive to IMU noise because it relies on IMU-driven GP for constructing visual residuals

Time Consumption Analysis

Method	Frontend	Optimization	Marginalization	IMU Preintegration	Other
CT-IMU(s)	1273.97	247.834	3.951	0.177	0.743
GP-IMU(s)	1274.51	182.054	4.914	4.713	0.693

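The EKLT tracker is the most time-consuming component, accounting for roughly 83% (CT-IMU) to 87% (GP-IMU) of the total time in the table above. GP-IMU is faster in graph optimization (182.054 s vs. 247.834 s) but slower in IMU preintegration (4.713 s vs. 0.177 s), although preintegration remains a small fraction of the total in both systems.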
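The frontend share can be recomputed directly from the timing table. The snippet below only re-does that arithmetic; the dictionary keys are names chosen for this example.

```python
# Per-module times in seconds, copied from the table above.
timings = {
    "CT-IMU": {
        "frontend": 1273.97,
        "optimization": 247.834,
        "marginalization": 3.951,
        "imu_preintegration": 0.177,
        "other": 0.743,
    },
    "GP-IMU": {
        "frontend": 1274.51,
        "optimization": 182.054,
        "marginalization": 4.914,
        "imu_preintegration": 4.713,
        "other": 0.693,
    },
}


def frontend_share(parts):
    """Fraction of the total run time spent in the frontend."""
    return parts["frontend"] / sum(parts.values())


shares = {name: frontend_share(parts) for name, parts in timings.items()}
for name, parts in timings.items():
    total = sum(parts.values())
    print(f"{name}: total {total:.1f} s, frontend {100.0 * shares[name]:.1f}%")
```

The backend difference between the two systems (about 60 s in total) is small next to the frontend cost, so speeding up the tracker would matter more for overall run time than the choice between CT-IMU and GP-IMU.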

Event-Inertial Odometry Classification

Frame-based discrete-time schemes: Inherit traditional frame camera algorithms, performing data association on event accumulation
Event-driven continuous-time methods: Directly process event streams, employing continuous-time backends

Gaussian Process Applications in Robotics

GP continuous-time representations were first applied to trajectory inference for scanning LiDAR and other asynchronous sensors. Recent research applies GPs to monocular event-based visual odometry systems, though at high computational cost.

Conclusions and Discussion

Main Conclusions

Both proposed GP methods effectively handle asynchronous event-inertial fusion problems
GP-IMU achieves higher accuracy on most sequences but is more sensitive to IMU noise
The sliding window strategy effectively controls computational complexity
The method demonstrates competitive performance in complex motion scenarios

Limitations

Real-time Performance: The system cannot currently run in real-time due to retaining all frontend asynchronous measurements for optimization
Insufficient Robustness: Lacks outlier rejection or motion compensation mechanisms
IMU Quality Dependency: GP-IMU method requires high-quality IMU data
Aggressive Motion Constraints: Both methods may be affected by rapid acceleration changes

Future Directions

Information-theoretic graph sparsification strategies for real-time performance
Improved frontend to enhance system robustness
Algorithm optimization for low-quality IMU
Extension to more complex motion patterns

In-Depth Evaluation

Strengths

Theoretical Innovation: The unified GP framework elegantly addresses asynchronous fusion with solid theoretical foundations
Systematic Research: Dual system design provides comprehensive comparative analysis
Sufficient Experiments: Thorough evaluation on multiple public datasets
Engineering Implementation: GTSAM-based implementation ensures reproducibility

Weaknesses

Real-time Limitations: Current inability to meet real-time application requirements limits practical value
Frontend Dependency: Over-reliance on EKLT frontend with insufficient handling of exceptional cases
Limited Applicability: Certain constraints on IMU quality and motion types
Insufficient Theoretical Analysis: Lacks in-depth analysis of theoretical differences between the two methods

Impact

Academic Value: Provides new theoretical framework for event camera and inertial fusion
Practical Potential: After addressing real-time issues, promising applications in robot navigation
Extensibility: Framework demonstrates good extensibility for other sensor fusion scenarios

Applicable Scenarios

High-Dynamic Environments: Suitable for high-speed motion scenarios where traditional cameras struggle
Sufficient Computational Resources: Appropriate for applications with high accuracy requirements and relatively abundant computational resources
Research Platforms: Provides valuable benchmark methods for event camera research

References

This paper cites 26 relevant references covering important works in event camera surveys, IMU preintegration, continuous-time estimation, Gaussian process regression, and other key domains, with comprehensive and authoritative citations.

Overall Assessment: This is an innovative work in the event-inertial odometry field that proposes a unified GP framework providing new insights for handling asynchronous sensor fusion. Despite limitations such as real-time performance, it makes significant theoretical contributions with sufficient experimental evaluation, laying a solid foundation for subsequent research in this domain.